
Southern Methodist University SMU Scholar

Economics Theses and Dissertations Economics

Spring 5-19-2018

Essays in Development Economics

Manini Ojha Southern Methodist University, [email protected]

Follow this and additional works at: https://scholar.smu.edu/hum_sci_economics_etds

Part of the Econometrics Commons, Growth and Development Commons, and the Health Economics Commons

Recommended Citation Ojha, Manini, "Essays in Development Economics" (2018). Economics Theses and Dissertations. 4. https://scholar.smu.edu/hum_sci_economics_etds/4

This Dissertation is brought to you for free and open access by the Economics at SMU Scholar. It has been accepted for inclusion in Economics Theses and Dissertations by an authorized administrator of SMU Scholar. For more information, please visit http://digitalrepository.smu.edu.

EMPIRICAL ESSAYS IN DEVELOPMENT ECONOMICS

Approved by:

Dr. Daniel L. Millimet Professor of Economics

Dr. Thomas B. Fomby Professor of Economics

Dr. Elira Kuka Assistant Professor of Economics

Dr. Anil Kumar Senior Economist, Dallas Fed

EMPIRICAL ESSAYS IN

DEVELOPMENT ECONOMICS

A Dissertation Presented to the Graduate Faculty of the

Dedman College

Southern Methodist University

in

Partial Fulfillment of the Requirements

for the degree of

Doctor of Philosophy

with a

Major in Economics

by

Manini Ojha

B.A., Economics, University of Delhi, India

M.A., Economics, Jawaharlal Nehru University, New Delhi, India

M.A., Economics, Southern Methodist University, Dallas, TX

May 19, 2018

Copyright (2018)

Manini Ojha

All Rights Reserved

ACKNOWLEDGMENTS

First and foremost, I would like to thank my advisor, Dr. Daniel Millimet. Dr. Millimet, it has been an honor to be your Ph.D. student. This work would not have been accomplished without your wisdom, encouragement and constant guidance. Your direction not only helped conceptualize my ideas but also influenced me as a writer. I sincerely thank you for allowing me to grow as a researcher.

I am also grateful to Dr. Elira Kuka for being a constant source of motivation. Dr. Kuka, your outlook, invaluable suggestions and research greatly influenced my work. I thank Dr. Thomas Fomby and Dr. Anil Kumar, the other members of my dissertation committee, for offering their valuable feedback regarding my work.

I extend my heartfelt thanks to Dr. Ömer Özak for his support throughout my doctoral journey. Dr. Özak, your accessibility as a professor was an immense source of comfort. I cherish all my interactions with you through the years and greatly value our friendship.

I thank Dr. Santanu Roy for his guidance, support and valuable inputs, both on a professional and a personal front, throughout my career as a graduate student. I am also grateful to Dr. Klaus Desmet, Dr. Danila Serra, Dr. Tim Salmon, and Dr. James Lake for their advice and suggestions. Further, it would be remiss of me if I did not thank Margaux Montgomery and Stephanie Hall, who work tirelessly to ensure that our lives in the department are as comfortable as possible.

I owe a special thanks to my colleagues, friends and co-authors, Andres Giraldo and Priyanka Chakraborty. Andres and Priyanka, I have learnt a great deal from you and am deeply grateful to you for being a part of my intellectual as well as emotional journey. I also thank my classmates and colleagues Punarjit Roychowdhury, Erik Hille and Hao Li. I couldn’t have hoped for a better set of colleagues.

To my friends in Dallas, you became family. Six years back, I came here alone and today, I leave rich with friendships and memories that will last a lifetime. Ankita and Akshay, thank you for accepting me with arms wide open and introducing me to the most natural and easy friendships I have ever known. You have been a home away from home. Manali, Madhu, Aarushi, Amod, Sajid, Rohan and the whole crew, thank you for believing in me, walking beside me and for all the love. This journey would not have been the same without the presence of each one of you in my life. I cherish your friendship.

A special thanks to my family. Words cannot express how grateful I am to my parents, Neeraja and Rajani Ranjan Rashmi, for their unconditional love, for letting me follow my path, for encouraging me to be better and for all the sacrifices they made on my behalf. Ma and Papa, I owe everything to you and more. I also thank my beloved brother, Pratyay Ojha, for being my confidant, my sounding board, and for always cheering me up.

Finally, none of this would have been possible without the love and support of my husband, Apurv. Apurv, I am forever grateful to you for being so patient with me, for pulling me up every time I was down, for believing in me even when I didn’t, for steering me in the right direction and for the constant reminders about the world around me. Your work ethic, passion for what you believe in, and pragmatism have always been an inspiration. Your dogged confidence in us and our long, long-distance marriage has been a pillar of strength through this journey. With you, life is a series of remarkable events. Thank you for being my sankat-mochan.

Ojha, Manini

B.A., Economics, University of Delhi, India

M.A., Economics, Jawaharlal Nehru University, New Delhi, India

M.A., Economics, Southern Methodist University, Dallas, TX

Empirical Essays in Development Economics

Advisor: Dr. Daniel L. Millimet

Degree conferred May 19, 2018

Dissertation completed April 20, 2018

This dissertation consists of three empirical essays in development economics. In the first essay, I examine the impact of a health insurance scheme called the Rashtriya Swasthya Bima Yojana (RSBY), launched in India in 2008, on schooling decisions and gender differences in education. At the outset, it is not obvious whether health insurance would benefit education or have a detrimental impact. Healthier children could mean either greater future economic returns from schooling or greater value as child labour. More specifically, the questions I seek to answer are twofold: (1) Does access to a health insurance scheme designed for the poor have an impact on school expenditure decisions of households? (2) Does it affect school enrollment of boys and girls within the household? Employing difference-in-differences and triple differences approaches, I find that access to RSBY is beneficial for child education, as school expenditure increases by 20 to 28 percent after the treatment. Additionally, I find RSBY to be relatively more advantageous to girls, as it reduces the existing gender gap in school enrollment by one-third. From a policy perspective, it is interesting to see that a health insurance scheme has unintended positive consequences not only on household school expenditure but also on parental responses within the household in terms of the enrollment of boys versus girls. Such responses should ideally be considered when designing policies to remedy any disadvantages among children, since parents can eliminate these effects by aiming at equitable child human capital formation within the family.

In the second essay, I study the impact of India’s National Rural Employment Guarantee Act (MG-NREGA) on the pattern of household consumption behaviour. NREGA, passed in 2005, created the world’s largest public works programme under a statutory framework, legally guaranteeing one hundred days of employment. Guaranteeing such employment opportunities can directly affect intra-household decisions, not only through a change in total resources but also through the allocation of those resources. Using the phase-wise roll-out of NREGA to districts and employing a difference-in-differences approach, I find a shift in discretionary spending towards ‘wiser’ consumption choices like school expenditure and durable goods, away from ‘wasteful’ expenditure like entertainment. These effects are broadly suggestive of an increase in female bargaining power, since men and women are seen to have systematically different consumption preferences and spending patterns. I also find the shifts in consumption patterns to be amplified in regions with a higher share of women employed through NREGA; in states that guarantee employment at higher minimum wages; and in rice growing regions of India, where females are traditionally more intensively involved in production.

This dissertation also delves into the relationship between human capital formation and socio-economic conditions in developing countries. To this effect, in the third essay, I evaluate the impact of quality of education on violence and crime, using data from Colombia, a country with a long-standing history of violence and conflict. Over the long run, successful efforts to improve school quality would imply an extraordinary rate of return, and may be a tool for social mobility and development. I exploit geographic and time variation at the municipality level and use an instrumental variable approach to identify this effect. The instruments are based on transfers of funds from the central government to municipalities for investments in education. I find that better education quality, measured by student test scores on a mandatory school-exit examination, has a significant and negative impact on the intensity of crime. A 1 standard deviation increase in test scores leads to a decline of 6.2 standard deviations in property crimes. These effects are perhaps indicative of an ‘opportunity cost effect’ of education. I also find that better education quality reduces violent crimes as well as the presence of illegal armed groups, suggesting a ‘pacifying effect’ of education.

TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

CHAPTER

1. GENDER GAP IN SCHOOLING: IS THERE A ROLE FOR HEALTH INSURANCE?

1.1. Introduction

1.2. Literature

1.3. Background on RSBY

1.4. Empirics

1.4.1. Data

1.4.2. School expenditure - Estimation and identification

1.4.3. School enrollment - Estimation and identification

1.5. Results

1.5.1. School expenditure as a budget share

1.5.2. School enrollment

1.6. Robustness Checks

1.6.1. School expenditure - other estimation issues

1.6.1.1. School expenditure in levels

1.6.1.2. School expenditure - fractional logit estimation

1.6.1.3. School expenditure - panel analysis

1.6.2. School enrollment - other estimation issues

1.6.2.1. School enrollment - probit with correlated random effects

1.6.2.2. School enrollment - instrumental variable approach

1.7. Sensitivity Analysis

1.7.1. Variation in income distribution

1.7.2. Variation in treatment intensity

1.7.3. Variation in programme take-up by district

1.7.4. Sub sample analysis

1.8. Conclusion

2. INTRA-HOUSEHOLD CONSUMPTION DECISIONS: EVIDENCE FROM NREGA

2.1. Introduction

2.2. Background on NREGA

2.3. Literature review

2.4. Data

2.5. Empirics

2.6. Results

2.7. Sensitivity analysis

2.7.1. Women employment in NREGA jobs

2.7.2. State minimum wages

2.7.3. Crop regions

2.8. Robustness checks

2.8.1. Fractional logit estimation with correlated random effects

2.8.1.1. Baseline model

2.8.1.2. Heterogeneous effects

2.8.2. Consumption in levels

2.8.2.1. Baseline model

2.8.2.2. Heterogeneous effects

2.9. Conclusion

3. THE EFFECT OF QUALITY OF EDUCATION ON CRIME: EVIDENCE FROM COLOMBIA1

3.1. Introduction

3.2. Literature

3.3. Background

3.4. Data and Identification

3.4.1. Data

3.4.2. Selection Issues

3.4.3. Estimation

3.4.4. Identification Issues

3.4.5. Institutional Framework

3.5. Results

3.5.1. Crime Rate

3.5.2. Property Crimes

3.5.3. Violent Crimes

3.5.4. Conflict

3.6. Transmission Channel

3.7. Robustness

3.7.1. Sub-Sample Analysis

3.7.2. Other Government Transfers

3.7.3. Other Measures of Education Quality

3.8. Conclusion

APPENDIX

A. GENDER GAP IN SCHOOLING: IS THERE A ROLE FOR HEALTH INSURANCE?

B. INTRA-HOUSEHOLD CONSUMPTION DECISIONS: EVIDENCE FROM NREGA

C. THE EFFECT OF QUALITY OF EDUCATION ON CRIME: EVIDENCE FROM COLOMBIA

BIBLIOGRAPHY

1 With Andres Giraldo, Southern Methodist University and Pontificia Universidad Javeriana

LIST OF FIGURES

Figure

1.1 Pre-trends at district level

A.1 RSBY Coverage

B.1 Districts map of India

C.1 Crime Rate 2007

C.2 Education Quality 2007

C.3 Crime Rate 2013

C.4 Education Quality 2013

LIST OF TABLES

Table

1.1 Summary statistics - Household level

1.2 Summary statistics - Individual level

1.3 Impact of RSBY on household school expenditure

1.4 Impact of RSBY on child school enrollment

2.1 Impact of NREGA on expenditure shares - DID

2.2 Impact of NREGA on expenditure shares - DDD

2.3 Impact of NREGA on probability that household is female headed

2.4 Heterogeneous Impacts of NREGA on Expenditure Shares: Female Share of NREGA Employment

2.5 Heterogeneous Impacts of NREGA on Expenditure Shares: State Stipulated Minimum Wages

2.6 Heterogeneous Impacts of NREGA on Expenditure Shares: Crop Regions

3.1 Crime and Education Quality

3.2 Crime and Education Quality

3.3 Crime and Education Quality

3.4 Presence and Quality of Education

3.5 Lights and Education Quality

A.1 Robustness: Impact of RSBY on household school expenditure - Instrumental variable approach

A.2 Robustness: Impact of RSBY on household school expenditure - Fractional logit estimation

A.3 Robustness: Impact of RSBY on household school expenditure - Panel analysis

A.4 Robustness: Impact of RSBY on child school enrollment - Probit with correlated random effects

A.5 Robustness: Impact of RSBY on child school enrollment - Instrumental variable approach

A.6 Sensitivity analysis: Impact of RSBY on household school expenditure and child school enrollment - Variation in income categories

A.7 Sensitivity analysis: Impact of RSBY on household school expenditure - Variation by intensity of treatment

A.8 Sensitivity analysis: Impact of RSBY on child school enrollment - Variation by intensity of treatment

A.9 Sensitivity analysis: Impact of RSBY on household school expenditure - Variation in take-up by district

A.10 Sensitivity analysis: Impact of RSBY on child school enrollment - Variation in age groups

A.11 Sensitivity analysis: Impact of RSBY on household school expenditure and child school enrollment - Rural vs urban

A.12 Sensitivity analysis: Impact of RSBY on household school expenditure and child school enrollment - Variation by castes

B.1 Summary statistics

B.2 Impact of NREGA on Expenditure Shares - Fractional Logit Model with Correlated Random Effects Approach

B.3 Heterogeneous Impacts of NREGA on Expenditure Shares: Female Share of NREGA Employment - Fractional Logit Model with Correlated Random Effects Approach

B.4 Heterogeneous Impacts of NREGA on Expenditure Shares: State Stipulated Minimum Wages - Fractional Logit Model with Correlated Random Effects Approach

B.5 Heterogeneous Impacts of NREGA on Expenditure Shares: Crop Regions - Fractional Logit Model with Correlated Random Effects Approach

B.6 Impact of NREGA on Expenditure in Levels

B.7 Heterogeneous Impacts of NREGA on Expenditure in Levels: Female Share of NREGA Employment

B.8 Heterogeneous Impacts of NREGA on Expenditure in Levels: State Stipulated Minimum Wages

B.9 Heterogeneous Impacts of NREGA on Expenditure in Levels: Crop Regions

C.1 Summary statistics

C.2 Crime and Education Quality (Without Bogota)

C.3 Disaggregated Crime and Education Quality (Without Bogota)

C.4 Crime and Education Quality (Without State Capitals)

C.5 Disaggregated Crime and Education Quality (Without State Capitals)

C.6 Violence and Education Quality (With Population < 200,000 Inhabitants)

C.7 Disaggregated Crime and Education Quality (With Population < 200,000 Inhabitants)

C.8 Crime and Education Quality (Rural Areas)

C.9 Disaggregated Crime and Education Quality (Rural Areas)

C.10 Crime and Education Quality (Urban Areas)

C.11 Disaggregated Crime and Education Quality (Urban Areas)

C.12 Crime and Education Quality (Total Transfers as Instruments)

C.13 Disaggregated Crime and Education Quality (Total Transfers as Instruments)

C.14 Crime and Education Quality (Total Transfers as an Additional Regressor)

C.15 Disaggregated Crime and Education Quality (Total Transfers as an Additional Regressor)

C.16 Crime and Education Quality (Total Transfers instead of Total Expenditures)

C.17 Disaggregated Crime and Education Quality (Total Transfers instead of Total Expenditures)

C.18 Crime and Education Quality (Total Transfers Instrumented)

C.19 Disaggregated Crime and Education Quality (Total Transfers Instrumented)

C.20 Crime and Education Quality (Cognitive Areas)

C.21 Disaggregated Crime and Education Quality (Cognitive Areas)

C.22 Crime and Education Quality (Social Areas)

C.23 Disaggregated Crime and Education Quality (Social Areas)

C.24 Crime and Education Quality (Total Score)

C.25 Disaggregated Crime and Education Quality (Total Score)

To Casper, my forever-magnificent companion.

Chapter 1

GENDER GAP IN SCHOOLING: IS THERE A ROLE FOR HEALTH INSURANCE?

1.1. Introduction

Concerns about adequate healthcare and access to health insurance have grown profoundly among policymakers worldwide over the past few decades. The WHO states that 400 million people in the world have no access to essential health services and 6 percent of people in developing countries are pushed further into extreme poverty due to health spending (WHO [2015]). Health shocks can be particularly devastating for the poor in developing countries owing to a lack of affordable insurance (Hamoudi et al. [1999], Wagstaff et al. [2009], Wagstaff et al. [2009]).1 The absence of a formal, pervasive public insurance system means that large out-of-pocket expenditure is the main source of healthcare financing. As such, the burden of health shocks may be greater if their consequences are transferred to the human capital of future generations in families unable to access formal insurance markets (Currie and Moretti [2007]; Bhalotra and Rawlings [2011]; Flores et al. [2008], Morduch [1999], Sun and Yao [2010]).

Child human capital formation can potentially be affected through the following channels. First, if children are considered substitutes for adult labour in a family with an ailing parent, they are withdrawn from school and sent to work to smooth consumption (Fabre and Pallage [2015]).2 Second, if the child is the one who is ailing, they are withdrawn from school because their survival and health status assume more importance in such situations. Third, health shocks reduce a household’s ability to afford the upfront cost of schooling. In the absence of safety nets, and given poverty, households thus resort to financing healthcare expenditure through other costly measures, like a reduction in school expenditure or a delay in their children’s enrollment. However, it is not entirely obvious whether access to health insurance would have a positive or a negative impact on child education. On the one hand, the above mechanisms imply that insurance could protect children from being pushed into labour and reduce school dropouts in households affected by health shocks. On the other, better child health as a result of insurance could even mean more child labour for such families. The effect is therefore ambiguous and must be addressed empirically.

Moreover, the impact of health insurance on education may not be gender-neutral. That a gender gap in education exists in developing countries is well known.3 Researchers cite several reasons for this gap, like differential economic returns to education, parental preferences or biases, concerns over old-age support, and the family’s economic conditions, of which health spending is a key determinant. Given this context, it is worth examining whether a health insurance system designed for the poor impacts schooling decisions and gender differences in education.

Gender-specific roles within households invariably result in different time opportunity costs of schooling for boys and girls. Health insurance in such a scenario has the potential to impact not only the time opportunity cost of schooling but also its monetary costs. On the one hand, if resources are reallocated from schooling to cover health expenses, then households may reallocate first from the girls’ expenditures if preferences and/or returns to education favour boys.4,5 On the other hand, improved health due to health insurance may increase the returns to child labour more for boys than girls and thus reduce educational investments for boys relative to girls.6 Again, the direction of impact is ambiguous and merits more empirical work. As such, access to a health insurance system for the poor could perhaps ease some resource constraints, thus resulting in a change in the gender gap in enrollments.

This paper focuses on India’s cashless, paperless and portable health insurance scheme started in 2008, called the Rashtriya Swasthya Bima Yojana (RSBY), to investigate these issues.7 RSBY was implemented with the aim of protecting the poor, across rural and urban areas, from financial liabilities and increasing their access to quality healthcare. Given that 28 percent of India’s population is below the poverty line (BPL), health care expenditure is one of the most important reasons for indebtedness. Alarmingly, less than 15 percent of the 1.1 billion population have health coverage. Moreover, over 78 percent of all medical expenditure in India is privately financed, most of it out-of-pocket, a share that is amongst the highest in the world (Swarup and Jain [2011]).8 Initially targeted at below poverty line households, RSBY has since expanded to cover other unorganized workers and marginalized sections who enroll in the scheme. The beneficiaries of the scheme are provided with a biometric smart card that can be used to receive health services from hospitals empanelled under the scheme without any out-of-pocket expenditure, subject to certain conditions. RSBY therefore assumes importance as a policy measure not only to decrease the vulnerability of credit-constrained households, but also to potentially protect their children from adverse shocks.

While there exists literature on the impact of health insurance on health expenditure and health related outcomes in India, most papers focus on smaller insurance schemes concentrated in some states. A few recent empirical papers investigate the impact of RSBY on financial burden, health services and expenditures (Azam [2018], Karan et al. [2017], Ravi and Bergkvist [2015], Karan et al. [2014], Johnson and Krishnaswamy [2012]). However, thus far, no evidence exists for the spillover effects of health coverage, in general, and RSBY, in particular, on education. This is the first paper, to my knowledge, to investigate the role of a public health insurance scheme in India in determining school expenditure and enrollment decisions.

Using a nationally representative longitudinal survey, my empirical analysis employs two different identification strategies. First, I estimate the effect of RSBY on both school expenditure and enrollment using a difference-in-differences strategy. Second, I employ a triple differences model which exploits the fact that rich households are significantly less likely to be affected by the programme (due to the initial focus on BPL households). Using nationally representative household level data, I investigate the treatment impact of RSBY on household school expenditure. In addition, using nationally representative individual level data, I quantify a similar treatment effect of RSBY on school enrollment and the existing gender gap. I compare households in districts that are exposed to RSBY by the second wave of the survey to those that were not exposed to the scheme in the sample period, in order to obtain the intent-to-treat (ITT) impact of the programme.

The findings are interesting and ought to serve as a guide to future research and policy discussions. A key result is that access to health insurance is beneficial for child human capital formation, as school expenditure increases at the household level after the treatment. The estimates imply an increase in the budget share of school expenditure of 0.5 to 0.7 percentage points. This effect is statistically and economically significant given that school expenditure accounted for 2.5 percent of the budget for such households prior to RSBY. This amounts to an increase of 20 to 28 percent in its budget share after the treatment. Given that health insurance reduces uncertainty about occupational hazards, the availability of and access to RSBY mitigate costly choices a household may otherwise resort to, like reducing school expenditure. These results are robust to several alternative modeling choices.

Finding positive impacts on household school expenditure, the paper goes further to quantify the effects of RSBY on school enrollments of children within households. I find a clear reduction in the gender gap in school enrollment after implementation of RSBY. Absent the programme, school enrollment of boys is about 6 percentage points higher than that of girls. I find that the probability of enrollment is 0.8 percentage points higher for boys and 2.7 percentage points higher for girls after the programme went into effect. Thus, the gap in enrollment reduces by one-third. The triple differences approach confirms this result for relatively less well-off households.

The rest of the paper proceeds as follows. Section 1.2 presents a review of related literature. Section 1.3 provides the background and programme details of RSBY. Section 1.4 is divided into sub-sections: 1.4.1 describes the data, followed by the estimation and identification strategy for the analysis of school expenditure and school enrollment in subsections 1.4.2 and 1.4.3 respectively. Section 1.5 discusses the baseline results, followed by robustness of the baseline models in Section 1.6. Section 1.7 presents sensitivity analysis of school expenditure and school enrollment. The paper ends with the conclusion in Section 1.8.

1 Most private healthcare deliveries have low penetration due to lack of awareness and affordability. As a result, the government often fills this void in the market.

2 They may even be asked to look after the sick parent, reducing the time they can devote to school (Bratti and Mendola [2014]).

3 Girls tend to receive less schooling than boys (Burgess and Zhuang [2000], Schultz [2002], Colclough et al. [2000], Alderman et al. [1996], Alderman and King [1998]).

4 Girls invariably become the first victim of a health shock to the family without insurance (Garg and Morduch [1998]). In addition, resource constraints can exacerbate patterns of preferences within households as income changes (Hill and King [1995], Alderman and Gertler [1997]).

5 Basic education in developing countries is public but school attendance still requires out-of-pocket expenditures, sometimes large enough to keep children out of school. Although direct fees are unlikely to differ by gender, costs such as those of reaching school, learning materials, and uniforms may influence schooling decisions of girls more than boys.

6 Sons are valued more as they are considered labour assets and support during old age. Daughters, however, usually leave the natal family post marriage (Sen and Sengupta [1983], Bardhan [1985]; Rosenzweig and Schultz [1982], Duraisamy [1992], Garg and Morduch [1998], Kingdon [2005], Almond et al. [2010], Haddad et al. [1984]).

7 In recent times, many developing countries have subsidized health insurance for rural and informal sector workers and their families (Wagstaff et al. [2009]). China adopted a new health insurance system for the rural population called the New Cooperative Medical Scheme. On similar lines, Vietnam, Taiwan, Indonesia, and the Philippines are also striving to achieve universal health coverage.

8 External aid to the health sector accounts for a negligible 2 percent of the total health expenditure.
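The two headline magnitudes quoted in Section 1.1 follow from simple arithmetic on the reported point estimates. The sketch below recomputes them as a quick check. The function names are illustrative and not part of the original analysis, and the inputs are the rounded values stated in the text (a 2.5 percent baseline budget share and a 6 percentage point baseline enrollment gap), so the results are approximate.

```python
def relative_increase(pp_change: float, baseline_share_pct: float) -> float:
    """Percent increase in a budget share implied by a percentage-point change."""
    return 100.0 * pp_change / baseline_share_pct


def gap_reduction(baseline_gap_pp: float, boys_gain_pp: float, girls_gain_pp: float) -> float:
    """Fraction of the baseline boy-girl enrollment gap closed after the programme."""
    return (girls_gain_pp - boys_gain_pp) / baseline_gap_pp


if __name__ == "__main__":
    # School expenditure: 0.5 to 0.7 percentage points on a 2.5 percent baseline share.
    low = relative_increase(0.5, 2.5)
    high = relative_increase(0.7, 2.5)
    print(f"Budget share rises by roughly {low:.0f} to {high:.0f} percent")

    # Enrollment: boys +0.8 pp, girls +2.7 pp, against a baseline gap of about 6 pp.
    closed = gap_reduction(6.0, 0.8, 2.7)
    print(f"Share of the gender gap closed: {closed:.2f}")
```

The gap calculation takes the difference in gains (2.7 minus 0.8, or 1.9 percentage points) relative to the baseline gap, which is how a reduction of about one-third arises from the rounded figures.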

1.2. Literature

This paper contributes broadly to two bodies of literature. First, it contributes to the vast literature on the impact of public health insurance schemes. Effect of health coverage on uptake of treatment, out-of-pocket expenditures, in-patient and out-patient services in developing countries have been examined in Acharya et al.[2012], Wagstaff et al.[2009]. Currie and Gruber[1996], Chen and Jin[2012], Liu and Zhao[2014] study its impact on other health-related outcomes like health care disparity, health statuses of new born children, mothers and the elderly. In the Indian context, the impact of health insurance, particularly RSBY, on various outcomes are studied in Azam[2018], Karan et al.[2017], Raza et al.[2016], Devadasan et al.[2013], Das and Leino[2011], Palacios et al.[2011], Johnson and Krishnaswamy[2012],

5 Rajasekhar et al.[2011], Virk and Atun[2015], Ravi and Bergkvist[2015]. Using panel data from the India Human Development Survey (IHDS), Azam[2018] utilizes a difference- in-differences with propensity score matching approach to estimate the average treatment impact (ATT) of RSBY on the beneficiary households. The paper uses both the house- hold and the individual level data from the IHDS to investigate the impact on utilization of health services for short term and long term morbidity, total out-of-pocket expenditures, per capita in-patient and out-patient expenditures. Both Karan et al.[2017] and Johnson and Krishnaswamy[2012] use difference-in-differences with matching at household level to evaluate the ITT impact of RSBY using cross-section data from the national sample sur- vey (NSS). Karan et al.[2017] find marginal decline in in-patient, out-patient out-of-pocket expenditures and budget share of out-of-pocket expenditure. Johnson and Krishnaswamy [2012] find that the scheme has led to a small decrease in out-patient and total medical expenditure of target households and some limited evidence of increased hospital utilization rates. On similar lines, Ravi and Bergkvist[2015] also use data from NSS and implement difference-in-differences across insurance districts versus uncovered districts to study the ITT impact of publicly provided health insurance schemes in India on the likelihood of impover- ishment, catastrophic health expenditure, and the poverty gap index. Nandi et al.[2013] use district-wise official data on enrollment, and correlate those with district characteristics to find the determinants of participation in RSBY. Fewer studies have focused on the impact of health insurance on non health related outcomes. Among these papers, most have looked at the impact on household choices associated with health shocks (Kochar[1995], Liu[2016], Mohanan[2013]). 
Second, the paper adds to the strand of literature on gender gaps in treatment of children in South Asia. According to some papers, boys are favored over girls in terms of intra-household allocation of resources and nutrients, as found through indices like weight for age, mortality rates, and breastfeeding (Barcellos et al.[2014], Behrman[1988], Bardhan[1985], Sen and Sengupta[1983], Rosenzweig and Schultz[1982]). Other papers suggest household income, parental education and supply side factors like quantity and quality of schools are

explanations for low educational achievements and gender gaps in such countries (Behrman and Knowles[1999], Duraisamy[1992], Kambhampati and Pal[2001], Pal[2004], Dreze and Kingdon[2001]). More specifically, in the context of India, evidence of gender differences in child schooling exists for some states but very few studies are able to explain such differences (Pal[2004], Glick et al.[2016]).

1.3. Background on RSBY

Rashtriya Swasthya Bima Yojana (RSBY), or the national health insurance scheme, was launched by the government of India as a cashless, paperless and portable health insurance scheme in 2008. The scheme was initially designed to target the below poverty line (BPL) population in both rural and urban India but was later expanded to also cover unorganized workers such as construction workers, domestic help, street vendors, rickshaw pullers etc. RSBY aims to protect the poor from financial risk arising from out-of-pocket expenditures on hospitalizations and to improve access to quality healthcare. Unlike most central government schemes, implementation of RSBY did not follow a top-down approach. The government marketed the scheme and rolled it out in districts based on factors such as need for the scheme, ease of implementation and acceptance from local governments. By October 2013, approximately 36 million families out of a target of approximately 65 million were enrolled in the scheme. As of 2013, the scheme was implemented in 512 districts out of 640 districts in 29 states across India (Government of India[2013b]). Beneficiaries of the scheme are entitled to hospitalization coverage of up to INR 30,000 (approximately $460) for a family of five and transportation costs up to INR 1,000 (approximately $16). The scheme is jointly funded by the central and state governments with 75% of premium from the center and 25% from the state.9 State governments set up state agencies to prepare a list of identified households.10 Awareness campaigns are conducted through the

9In case of Jammu & Kashmir and North-eastern States, 90% of premium is from the central government and 10% from the state. 10These are referred to as the state ‘nodal’ agencies by the government.

Gram Panchayat and enrollment camps set up across districts.11 Insurance companies, selected through a competitive bidding process by the government, are responsible for reaching out to the beneficiaries for enrollments. Once a hospital is empanelled, a nationally unique hospital ID number is generated so that transactions can be tracked at each hospital. Beneficiaries pay a small amount of INR 30 (approximately $0.50) as a registration fee, which is aggregated at the state level and is used to take care of the administrative cost of the scheme. Households that choose to enroll in the scheme receive a bio-metric card with a nationally unique ID. Upon receiving the card, the beneficiary can visit any empanelled hospital across the country to get cashless treatment. Insurance companies are paid a fixed price per household enrolled and must settle all claims with the hospitals directly based on rates fixed by the central government. While all pre-existing diseases are covered, the scheme does not cover out-patient procedures. There is no age limit on the enrollment of beneficiaries.

1.4. Empirics

1.4.1. Data

I utilize two waves of the India Human Development Survey (IHDS), collected in 2004-05 and 2011-12, for the analysis.12 IHDS is a nationally representative multi-topic survey of approximately 40,000 households across 1,503 villages and 971 urban neighbourhoods of India. The surveys are collected from January to March.13

11The date and location of the enrollment camp are publicized in advance. Some mobile enrollment stations are also established at local centers like public schools in each village at least once a year. These stations are equipped by the insurer with the hardware to collect bio-metric information and photographs of the members of the household covered. 12IHDS I refers to the time period 2004-05 and IHDS II to 2011-12. 13IHDS is collected by the National Council of Applied Economic Research (NCAER), New Delhi and University of Maryland. The waves are publicly available to be downloaded from the Inter-University Consortium for Political and Social Research (ICPSR). IHDS-I surveyed 41,554 households and IHDS-II 42,152 households.

IHDS-II is mostly re-interviews of households interviewed for IHDS-I. I merge the two survey waves for my analysis at both the household and the individual level.14 The household sample is restricted to households with children in which the age of the head lies between 18 and 90 years. After imposing these restrictions, my sample consists of 29,381 households in the first survey wave and 25,226 in the second. The individual level sample is restricted to children in the age group 5 to 18 years. The sample consists of 48,571 children in the first survey wave and 41,576 in the second. I consider the individual level data as a repeated cross-section since it is difficult to track the same child over a period of 7 years between 2005 and 2012. Some children may have finished school while new children are enrolled. For consistency purposes, I also consider the household sample as a repeated cross-section.15 I merge both the household and the individual samples separately with data on implementation of RSBY at the district level. Information about the roll-out of the health insurance scheme is taken from the official ministry website.16 The final sample consists of 393 districts across India. No districts were treated in the first wave and 53 districts were not treated as of the second wave. In addition, I use three rounds of Household Consumption Expenditure Surveys from the National Sample Survey (NSS) of India for the years 2004-05, 2005-06 and 2006-07 for district level average monthly consumption expenditure prior to implementation of RSBY. I consider the budget share of school expenditure out of total monthly household expenditure as my outcome variable for the baseline analysis at the household level (see Eqns. 1.1 and 1.2). Child specific school enrollment within each household is my outcome variable for the analysis at the individual level (see Eqns. 1.3 and 1.4). Standard errors are clustered at the district level in all estimations.17

14The states of Andhra Pradesh, , Tamil Nadu have been dropped from my sample as these states already have state-funded health insurance schemes in place. 15I redo my analysis at household level treating the household data as a true household level panel for robustness purposes (refer to section 1.6.1.3). 16List of districts and phases of implementation can be found at http://www.rsby.gov.in/. 17This is true except when I estimate the treatment effect considering my sample as a panel with household fixed effects. I cluster the standard errors at household level in this case.

The set of controls for household level analysis (refer to Eqns. 1.1 and 1.2) includes household size, age of the head of the household, age squared, educational characteristics of male and female members of the household, number of years a family has stayed in one place, indicators for caste (Brahmins, Scheduled Tribes, Scheduled Castes, and Other Backward Class), indicators for religion (Hindu, Muslim, Sikh, Buddhist, Jain, other religion), dummy for urban areas, whether head of the household can converse in English, gender dummy for the head of the household, number of married males and females in the household, dummies indicating number of years of marriage, and whether the household has a bank account and a credit card. Control variables for the individual level analysis (refer to Eqns. 1.3 and 1.4) include household size, age of the child, age squared, mother's and father's education characteristics, indicators for caste, religion dummies, dummy for urban areas, school facilities and scholarships offered. In addition, I include an indicator for the relatively poorer households that takes value 1 if the household belongs to the bottom 70 percent of the income distribution in my sample and 0 otherwise (refer to Eqns. 1.2 and 1.4 in 1.4.2 and 1.4.3). The summary statistics for my control and treatment districts in the two time periods are presented in Tables 1.1 and 1.2.

Note that in all my models, I include household size as a regressor, which is likely endogenous. Excluding household size as a control while analyzing school expenditures and gender differences in education implicitly assumes that boys and girls live in families with similar characteristics, in terms of both observables and unobservables. However, this assumption is likely to bias the estimates if families have a preference for sons and follow male-biased stopping rules of childbearing (Barcellos et al.[2014]).
If fertility decisions are driven by a desire to have a certain number of boys, then girls end up in larger families on average. To address this concern, I instrument household size by gender of the first born child in the family under the assumption of no sex-selective abortion.18 Although this assumption is not without criticism, in such cases, gender of the first child is likely to be a good predictor of the number

18Ban on sex-selective abortion was enacted in India in 1971 and later amended in 2004 making prenatal sex-screening and sex-selective abortion punishable by law ([2017a])

of children in the household or family size and excludable from the second stage (Barcellos et al.[2014], Clark[2000], Clarke[2017]).
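The instrumenting idea can be sketched in its simplest form. The block below is a minimal illustration, assuming a single binary instrument and no further controls or fixed effects, in which case the IV estimate reduces to the Wald ratio of the reduced form to the first stage. The data, the function name `wald_iv` and the variable names are hypothetical and are not taken from the IHDS; the estimations in this chapter use the full control set.

```python
from statistics import mean

def wald_iv(y, x, z):
    """Wald (IV) estimate of the effect of x on y using a binary instrument z.

    Returns the IV estimate and the first-stage difference in means.
    """
    y1 = mean([yi for yi, zi in zip(y, z) if zi == 1])
    y0 = mean([yi for yi, zi in zip(y, z) if zi == 0])
    x1 = mean([xi for xi, zi in zip(x, z) if zi == 1])
    x0 = mean([xi for xi, zi in zip(x, z) if zi == 0])
    first_stage = x1 - x0    # effect of the instrument on the endogenous regressor
    reduced_form = y1 - y0   # effect of the instrument on the outcome
    if first_stage == 0:
        raise ValueError("instrument is unrelated to the endogenous regressor")
    return reduced_form / first_stage, first_stage

# Hypothetical data: first_girl = 1 if the first born child is a girl.
first_girl = [1, 1, 1, 0, 0, 0]
hh_size = [6, 5, 7, 4, 5, 3]                    # endogenous regressor
school_share = [2.0, 3.0, 1.0, 3.0, 2.0, 4.0]   # budget share of school expenditure, in percent

effect, first_stage = wald_iv(school_share, hh_size, first_girl)
print("first stage:", first_stage)
print("IV estimate of household size on school share:", effect)
```

With covariates and fixed effects the same logic is carried out by two-stage least squares rather than by a ratio of means.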

1.4.2. School expenditure - Estimation and identification

I use a difference-in-differences (DID) strategy to compare households in districts that are exposed to health insurance by the second wave of the survey to those that were never exposed to the scheme. All households in 2004-05 and some households in 2011-12 that are never exposed to RSBY form my control group, and the households in districts exposed to RSBY in 2011-12 form my treatment group. A simple comparison of households from districts that received the scheme to those that did not would likely lead to biased estimates. I include district fixed effects to address the concern of any time invariant district level characteristics that may be correlated with the treatment. Time fixed effects control for the time-varying characteristics that impact all districts equally. Identification relies on changes in household school expenditure at the district level after the phase-wise implementation of RSBY in 2008. I do not identify which households directly participated in the programme in my estimation. I use all the households in a treated district and estimate the effect of access to the programme. This is the intent-to-treat (ITT) effect of RSBY on school expenditure. Although IHDS data identifies the households that participate in RSBY, I choose to estimate the ITT impact instead of the treatment-on-the-treated (TOT) impact for two reasons. First, ITT is a more policy-relevant impact at the district level when the idea of a government scheme is to provide the option of having it available. Second, TOT would bring forth more complicated econometric problems, for instance, extra selection issues leading to an added level of endogeneity.19 Moreover, if the households participating in RSBY are not properly identified, we worry about measurement error in the participation variable. At the same time, with better outreach and awareness campaigns, take-up of the scheme can improve. However, how beneficial it is, is a separate

19Not only does the district have to have the programme, but the households in the districts have to decide to participate in it.

question, some of which has been addressed in Azam[2018]. I use the following DID specification to compare the households in districts over the two time periods, 2004-05 and 2011-12, before and after RSBY was rolled out:

$$y_{hdt} = \beta_0 + \beta_1 T_t + \beta_{DD} RSBY_{dt} + \gamma X_{hdt} + \mu_d + \epsilon_{hdt} \tag{1.1}$$

where $y_{hdt}$ is the budget share of school expenditure in household $h$ in district $d$ at time $t$. $T_t$ takes the value 1 for 2011-12 and 0 for 2004-05. $RSBY_{dt}$ is a treatment indicator which takes the value 1 if district $d$ is exposed to RSBY in time $t$ and 0 otherwise.20 $X_{hdt}$ is the set of household level controls and $\mu_d$ depicts district fixed effects. The disturbance term $\epsilon_{hdt}$ summarizes the influence of all other unobserved variables that vary across households, districts, and time. The baseline Eqn. 1.1 is estimated using an Instrumental Variable (IV) approach.21 The parameter of interest is $\beta_{DD}$, which provides the differential impact of RSBY on household's expenditure on school after its introduction. $\beta_1$ identifies the effect of any systematic changes that affected households in all districts between 2004-05 and 2011-12.

A primary concern with the identification strategy in a DID approach is that the districts may be trending differently prior to RSBY. Using three rounds of the Household Consumption Expenditure Survey of the NSS, I provide evidence in Figure 1.1 that there are no pre-existing differential trends between the control and the treated districts over 2004-05 to 2006-07. To further alleviate such concerns, I estimate a triple differences model where I refine the definition of my control and treatment groups. I include the indicator variable $LowInc_h$ for poorer households as described in 1.4.1. Households in the top 30 percent are now the controls for such differential trends in the districts. The assumption here is that the richer households are perhaps less affected by RSBY. This is reasonable since richer households are less likely to be resource constrained and in a position to insure themselves

20 Note that RSBYdt varies with both district and time and is equivalent to the usual treat × post that one finds in difference-in-differences analyses. 21Gender of the first born child in the family is used as an instrument for household size.

against unexpected shocks or have access to private health insurance.22 As such, the triple differences (DDD) estimator is more convincing as it looks at changes among poorer households in treated versus control districts and nets out any differential change in wealthy households across treated versus control districts. The main identification assumption in such a triple differences model is no longer that changes in treatment households should be uncorrelated with district level trends, but that these changes should be uncorrelated with district level trends that affect the rich and the poor differently. The assumption in this model is indeed weaker. This methodology helps take care of two potential confounding elements that are of concern in a DID model. First, the change in school expenditure of the poorer households in the treated districts is not a result of changes in school expenditure of such households across all districts. Second, it is not a result of changes in school expenditure of all households in the treatment districts (possibly due to other unobservables that affect all households). The second specification I estimate is the following triple differences model:

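$$\begin{aligned} y_{hdt} = {} & \beta_0 + \beta_1 T_t + \beta_2 LowInc_h + \beta_3 RSBY_{dt} + \beta_4 T_t \times LowInc_h + \beta_{DDD} RSBY_{dt} \times LowInc_h \\ & + \mu_d \times LowInc_h + \gamma X_{hdt} + \mu_d + \epsilon_{hdt} \end{aligned} \tag{1.2}$$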

where $RSBY_{dt}$ is the treatment dummy that varies with district and time. The new coefficient of interest is $\beta_{DDD}$, which is the difference-in-difference-in-differences estimator. $\beta_{DDD}$ captures the variation in school expenditure in poorer households (relative to the rich) in treated districts (relative to control districts) after implementation of RSBY. Similar to the DID model, the baseline triple differences model in Eqn. 1.2 is also estimated using an IV approach. The other controls are the same as in the baseline model. District fixed effects, time fixed effects, time by income fixed effects and district by income fixed effects are included. Standard errors

22It must be noted that the initial target population intended by the scheme was the bottom 30 percent of income distribution. However, the scheme was later extended to several unorganized workers over the years (Government of India[2013b]). At the outset, it is necessary to caution that the top 30 percent may not form a clean control. I test the robustness of my triple difference results by altering the income distribution categories for my control and treatment groups. These are discussed in the later sections.

are clustered at district by household income level.23
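The logic of the DID and DDD comparisons in Eqns. 1.1 and 1.2 can be illustrated with cell means. The sketch below is a simplification, assuming no controls, no instrumenting and equal cell sizes, so that the estimators reduce to differences of group averages. The records and the helper names `cell_mean`, `did` and `ddd` are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical household records:
# (treated district, post period, low income, budget share of school expenditure in percent).
# The treatment indicator in Eqn. 1.1 corresponds to treated * post.
rows = [
    (0, 0, 0, 2.9), (0, 0, 0, 3.1), (0, 1, 0, 3.1), (0, 1, 0, 3.3),
    (1, 0, 0, 2.9), (1, 0, 0, 3.1), (1, 1, 0, 3.2), (1, 1, 0, 3.4),
    (0, 0, 1, 1.9), (0, 0, 1, 2.1), (0, 1, 1, 2.1), (0, 1, 1, 2.3),
    (1, 0, 1, 1.9), (1, 0, 1, 2.1), (1, 1, 1, 2.9), (1, 1, 1, 3.1),
]

def cell_mean(rows, treated, post, low_inc=None):
    """Mean outcome in one treated-by-period cell, optionally within an income group."""
    values = [y for (d, t, p, y) in rows
              if d == treated and t == post and (low_inc is None or p == low_inc)]
    return mean(values)

def did(rows, low_inc=None):
    """(Treated post - treated pre) minus (control post - control pre)."""
    change_treated = cell_mean(rows, 1, 1, low_inc) - cell_mean(rows, 1, 0, low_inc)
    change_control = cell_mean(rows, 0, 1, low_inc) - cell_mean(rows, 0, 0, low_inc)
    return change_treated - change_control

def ddd(rows):
    """DID among poorer households minus DID among richer households."""
    return did(rows, low_inc=1) - did(rows, low_inc=0)

did_all = did(rows)
did_rich = did(rows, low_inc=0)
did_poor = did(rows, low_inc=1)
ddd_estimate = ddd(rows)
print("DID, all households:", round(did_all, 3))
print("DID, richer households:", round(did_rich, 3))
print("DID, poorer households:", round(did_poor, 3))
print("DDD:", round(ddd_estimate, 3))
```

In this toy example any change among richer households in treated districts is netted out of the change among poorer households, which is the role the top 30 percent play in the triple differences model.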

1.4.3. School enrollment - estimation and identification

Ideally, investigating within-household expenditure patterns on boys versus girls would help quantify the exact gender differences in parental investment. However, studies that have attempted to examine gender bias in schooling through household expenditure data have met with little success. Expenditure on individual members of a household is typically not observed in survey data, which makes it impossible to directly observe gender biases in allocation of expenditure. Most papers, therefore, resort to indirectly detecting differential treatment within households by examining changes in household expenditure with changes in gender composition. Reliability of this methodology, however, has been called into question because it generally fails to detect a gender bias (Deaton[1997]). Even in countries with known gender bias, researchers thus far find mixed evidence of significant effects of the child's gender on the composition of household spending (Bhalotra and Attfield[1998]). A similar lack of convincing expenditure data at the child level makes it impossible for me to quantify the treatment impact on gender differences in parental investments in educational expenditure. Instead, I use data at the individual level on school enrollments to get at the treatment effect on gender differences in boys' and girls' enrollments within households. I estimate the following linear probability model (LPM):

$$y_{ihdt} = \alpha_0 + \alpha_1 T_t + \alpha_2 RSBY_{dt} + \alpha_3 RSBY_{dt} \times Boy_i + \gamma X_{ihdt} + \mu_d + \epsilon_{ihdt} \tag{1.3}$$

where $y_{ihdt}$ is an indicator variable which takes value 1 if the child $i$ in household $h$ in district $d$ is enrolled in school in time period $t$. $Boy_i$ takes value 1 if the child is a boy and 0 if a girl. District and time fixed effects are included in the model and standard errors are clustered at district level. $\alpha_1$ identifies the effect of any systematic changes that affect the child between

23For comparison purposes, I also estimate both the DID and DDD models for the numerator and denominator of the budget share separately, that is, logarithm of school expenditure in levels and logarithm of total consumption expenditure in levels for the household. This is discussed in the robustness section 1.6.1.1.

the two time periods. $\alpha_2$ depicts the effect of the treatment on school enrollment of girls. $\alpha_2 + \alpha_3$ identifies the effect of the treatment on school enrollment of boys. The coefficient of interest is $\alpha_3$, which gives the change in the gender gap in school enrollment due to RSBY. I also control for the gender dummy of the child, the coefficient of which identifies the school enrollment of boys versus girls absent the treatment. All other relevant controls are included as described in section 1.4.1.

Similar to the school expenditure triple differences analysis, I also estimate an equivalent model for school enrollment. Incorporating the new treatment and control groups, the specification looks as follows:

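$$\begin{aligned} y_{ihdt} = {} & \alpha_0 + \alpha_1 T_t + \alpha_2 LowInc_h + \alpha_3 RSBY_{dt} + \alpha_4 T_t \times LowInc_h + \alpha_5 RSBY_{dt} \times LowInc_h \\ & + \alpha_6 RSBY_{dt} \times LowInc_h \times Boy_i + \mu_d \times LowInc_h + \gamma X_{ihdt} + \mu_d + \epsilon_{ihdt} \end{aligned} \tag{1.4}$$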

where $\alpha_5$ depicts the effect of RSBY on enrollment of girls and $\alpha_5 + \alpha_6$ depicts the effect of RSBY on enrollment of boys. The change in the gender gap in school enrollments as a result of RSBY for poorer households in the treated districts is thus given by $\alpha_6$. It captures the variation in boys' and girls' school enrollments within such households in the treatment districts, nets out the change in the average enrollments in the control districts and then nets out the change in the average enrollments in richer households in the treatment districts. As before, the model includes all controls, all relevant double interaction terms as well as district and time fixed effects.

1.5. Results

1.5.1. School expenditure as a budget share

I present the baseline school expenditure results in Table 1.3. Panel A presents the results for the DID specification 1.1. Column (1) shows that RSBY increases the budget share of school expenditure by 0.5 percentage points and the effect is statistically significant at the p<0.01 significance level. Access to health insurance has a positive spillover effect on school expenditure decisions of households. Panel B presents the results for the triple differences estimation of Eqn. 1.2. From column (4), notice that the triple differences analysis gives a treatment effect of the order of 0.7 percentage points on the budget share of school expenditure for the poor households relative to the rich in treatment districts relative to control.24 Summary statistics in Table 1.1 show that the average share of school expenditure out of total expenditure for such households in 2004-05 is about 2.5 percent. Both the DID and DDD effects are therefore economically significant and imply that the budget share of school expenditure increases by 20 to 28 percent after RSBY. To the extent that access to public health insurance helps reduce household's financial burden, RSBY benefits child human capital formation through an increase in expenditure on school. As such, RSBY perhaps helps eliminate costly smoothing mechanisms that households may resort to in the absence of such insurance coverage, like cutting down on school expenditure or delaying their children's enrollments.25

Note that several diagnostic tests have been performed to assess the efficiency and reliability of the instrument. The endogeneity test reports test statistics that are robust to various violations of conditional homoskedasticity. I reject exogeneity of household size.26 As far as underidentification is concerned, I report chi-squared p-values for the test where rejection of the null implies full rank and identification [Baum et al., 2007b]. This test tells us whether the excluded instrument is correlated with the endogenous regressor. The p-value based on the Kleibergen-Paap rk LM statistic allows me to clearly reject the null that the instrument is uncorrelated with the endogenous regressor and that the model is underidentified. From the weak identification test, rejection of the null represents absence of a weak-instrument

24Columns (2), (3), (5) and (6) present the impact of RSBY on the logarithm of school expenditure in levels and logarithm of total consumption expenditure in levels for DID and DDD models. These results are discussed in detail in section 1.6.1.1. 25Selling assets, exhausting savings, non-institutional borrowings and reducing consumption below critical levels are other examples of such costly measures (Morduch[1999], Sauerborn et al.[1996], Edmonds[2006]). 26Under conditional homoskedasticity, this endogeneity test is numerically equal to a Hausman test statistic.

problem. Since the specification has clustered standard errors at district level, the reported test statistic is based on the Kleibergen-Paap rk statistic, which indicates absence of a weak instrument problem, given that it is above 10 in the baseline specification of DID (column (1)).27

1.5.2. School enrollment

Given that RSBY has an impact on the budget share of school expenditure at the household level, it is worth examining its impact on the gender gap in school enrollments within households. I present the results for the baseline school enrollment analysis in Table 1.4. Panel A provides the DID results estimated using a linear probability model for specification 1.3. Column (1) presents the impact on enrollments without a gender differential whereas column (2) presents the impact when I introduce a gender differential. In this case, notice that absent the health coverage, a gender gap in school enrollment exists. More boys are enrolled in school. In fact, enrollment of boys is about 6 percentage points higher than that of girls. Average enrollment is 78.4 percent for boys and 72.4 percent for girls prior to the treatment. Differences in parental expected future returns from their children's schooling or parental preferences could be possible explanations for this, as found in the extant literature. If parents expect higher returns from boys than girls, it limits the amount of equality a household can afford. Column (2) shows that the treatment has a larger impact on girls. The probability of enrollment is 2.7 percentage points higher for girls after implementation of RSBY as compared to 0.8 percentage points higher for boys. The reduction in the gender gap as a result of the treatment is 1.9 percentage points and is statistically significant at the p<0.01 significance level. The triple differences results for specification 1.4 are presented in panel B. Column (4) shows a reduction (albeit smaller in comparison to DID) in the gender gap in enrollment of 0.9 percentage points, which is statistically significant at the p<0.05 significance level. This suggests that benefits of the health insurance scheme accrue

27The instrument becomes slightly weaker in the baseline of the triple differences model, perhaps owing to the larger number of controls and lower correlation.

more to girls insofar as school enrollment is concerned.

Gender specific roles in domestic chores and differential time opportunity cost of boys' and girls' schooling explain these results to some extent. As suggested, differential patterns of preferences within the household are exacerbated with changes in household income (Alderman and Gertler[1997]). Given that girls spend less time in school and more hours working to substitute for mothers' domestic duties, the greater impact on girls could perhaps be a result of RSBY reducing the degree of impact of a shock to mother's health on daughters.28 One could perhaps also say that the larger treatment effect on enrollment of girls is because the demand for girls' human capital is more income and price elastic than the demand for boys'. Moreover, although basic education in India is tuition-free, school attendance still entails the cost of reaching school, learning materials and uniforms, which are large enough out-of-pocket expenditures to keep more girls out of school. Access to a cashless health insurance system perhaps eases some resource constraints in the households, leading to a reduction in the gender gap in enrollments post the treatment.
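As a worked reading of the DID enrollment estimates reported in this subsection, the rounded point estimates can be mapped into the coefficients of specification 1.3. This is a back-of-the-envelope sketch that takes the rounded figures in the text at face value and assumes the effects in the linear probability model are additive; the implied post-treatment gap is an illustration, not a reported result.

$$\begin{aligned}
\hat{\alpha}_2 &= 0.027 && \text{(effect on girls)} \\
\hat{\alpha}_2 + \hat{\alpha}_3 &= 0.008 && \text{(effect on boys)} \\
\hat{\alpha}_3 &= 0.008 - 0.027 = -0.019 && \text{(change in the gender gap)} \\
\text{pre-treatment gap} &= 78.4 - 72.4 = 6.0 \text{ percentage points} \\
\text{implied gap after treatment} &\approx 6.0 - 1.9 = 4.1 \text{ percentage points}
\end{aligned}$$

Read this way, the DID estimate corresponds to closing roughly a third of the pre-treatment enrollment gap ($1.9 / 6.0 \approx 0.32$).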

1.6. Robustness Checks

There may be other potential concerns related to my baseline estimations. This section discusses the additional analyses I conduct to explore the robustness of my results to different modeling choices for both school expenditure and school enrollments. I start with a discussion of school expenditure models and then proceed to school enrollments.

1.6.1. School expenditure - other estimation issues

Taking the budget share of household school expenditure as my outcome variable would ideally require me to estimate a fractional response model.29 However, given that I am controlling for a large number of districts, a fractional response model with fixed effects

28With women receiving less healthcare, a shock to the mother's health would have a larger impact on the girls required to take over the mother's chores (Alam[2015], Hazarika and Sarangi[2008], Katz[1995], Skoufias[1993]). 29The budget share is a fraction and is bounded between 0 and 1.

becomes infeasible. I therefore compare the baseline IV results with those from two main alternative estimation approaches.

1.6.1.1. School expenditure in levels

My treatment effect could possibly be understated when I consider budget share of household school expenditure as the dependent variable. A direct positive income effect of RSBY could perhaps be translated to an increase in total household consumption expenditure itself given that health insurance relieves household's resource constraints. If total consumption expenditure of households rises, this would mean a lower effect on the budget share of school expenditure. Therefore, I first estimate a model where the outcome variable is the logarithm of household's school expenditure per month in levels, excluding total consumption expenditure from the specification. I also estimate the treatment effect on logarithm of total consumption expenditure in levels. This helps me tease out the treatment effect on both household school expenditure and total consumption expenditure separately.

Note that in the IHDS survey, some households report zero expenditure on goods. My dependent variable is in logarithms, which implies that the value of the corresponding outcome variable will be undefined if I include such households. One way to avoid this problem is simply to drop these households and run regressions based on the trimmed sample. However, this may result in sample selection bias. Rather, a more sophisticated way to circumvent this problem and include these households is to apply the inverse hyperbolic sine transformation of consumption expenditures (Burbidge et al.[1988]). The inverse hyperbolic sine transformation replaces the variable in question, say $z$, with $\log(z + \sqrt{z^2 + 1})$, which, unlike $\log z$, is defined even for $z = 0$.30 As such, in this paper I use the inverse hyperbolic sine transformation to deal with households reporting zero consumption expenditure.

The results for these are presented in Table 1.3. Columns (2) and (5) provide the DID and DDD effects on log of school expenditure in levels. I find that RSBY increases school

30According to Burbidge et al.[1988], except for very small values of z, the transformation is approximately equal to log(2zi) or log(2) + log(zi), and so it can be interpreted in exactly the same way as a standard logarithmic dependent variable.

expenditure by 30.2 to 42.2 percent approximately. The effect is greater in the DDD model for the poorer households in treated districts. Column (3) provides the DID effect on the log of total consumption expenditure, which increases by 7.7 percent. I find a positive impact on the log of total consumption expenditure in the DDD model as well, but the effect is not statistically significant (column (6)).

Second, I estimate the levels model while controlling for total consumption expenditure as a regressor. This accounts for any income effect of the scheme, as it holds the household's budget constraint constant. However, total consumption expenditure may itself be endogenous. To circumvent this problem, I instrument total monthly household consumption expenditure with the assets possessed by the household at the time of the survey. This serves as a valid instrument because assets held at the time of the survey do not directly affect monthly expenditure on school but are a good predictor of total household income or consumption. Monthly expenditures on commodities are usually made out of current earned income rather than out of assets or wealth.31

Panels A and B of Table A.1. present these results. Columns (1) and (3) repeat my baseline results from Table 1.3. Columns (2) and (4) present the results where I include total consumption expenditure and instrument it with total household assets. In this specification, I have two endogenous regressors and two instruments. From column (2), the treatment effect is an 8 percent increase in the level of school expenditure, holding the budget constraint of the household constant, and it is statistically significant. The triple differences model also shows a higher and statistically significant impact on the level of school expenditure of almost 18.7 percent for the poor households in the treated districts (see column (4)). Here, the total treatment impact from the triple differences model is 8.2 percent, which is approximately equivalent to the difference-in-differences result. As before, diagnostic tests have been performed to assess the efficiency and reliability of the instruments. The instruments fare broadly well on these specification tests.
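The specification with two endogenous regressors and two instruments can be illustrated with a minimal two-stage least squares sketch. Everything below is simulated: the instruments are generic stand-ins for the gender of the first child and household assets, the coefficient values are arbitrary, and the function name `two_stage_least_squares` is hypothetical. It is not the estimation code behind Table A.1 and it omits fixed effects and clustered standard errors.

```python
import numpy as np

def two_stage_least_squares(y, x_endog, x_exog, z):
    """2SLS: project the endogenous regressors on instruments, then run OLS."""
    w = np.column_stack([x_exog, z])                       # exogenous controls plus instruments
    first_stage = np.linalg.lstsq(w, x_endog, rcond=None)[0]
    x_hat = np.column_stack([x_exog, w @ first_stage])     # replace endogenous x by fitted values
    return np.linalg.lstsq(x_hat, y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=(n, 2))                # simulated instruments (stand-ins only)
u = rng.normal(size=n)                     # unobserved shock that creates endogeneity
x = z + 0.5 * u[:, None] + rng.normal(size=(n, 2))   # two endogenous regressors
const = np.ones((n, 1))
y = 1.0 + 2.0 * x[:, 0] - 1.0 * x[:, 1] + u           # true slopes: 2 and -1

beta_iv = two_stage_least_squares(y, x, const, z)
beta_ols = np.linalg.lstsq(np.column_stack([const, x]), y, rcond=None)[0]
print("2SLS slopes:", beta_iv[1:], "OLS slopes:", beta_ols[1:])
```

In this simulated setting the OLS slopes absorb part of the shock `u`, while the 2SLS slopes stay close to the values used to generate the data.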

31Land, however, could affect school expenditure to some extent, since land requires work and missing work would factor into the opportunity cost of school-related expenditure.

1.6.1.2. School expenditure - fractional logit estimation

Here, I return to budget share as my outcome but estimate a fractional response model with correlated random effects to account for district level characteristics since a fixed effects fractional response model is not feasible. I estimate specification 1.1 via a fractional logit model with correlated random effects (CRE). The advantage of using CRE fractional logit is that it places some structure on the nature of correlation between the unobserved effects and the covariates (Lake and Millimet[2016]). Formally, the structural model in the CRE fractional logit is given by

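$$E(y_{hdt} \mid X_{hdt}, \mu_d) = \Phi(X_{hdt}\beta + \mu_d) \tag{1.5}$$

where $X_{hdt}$ includes the full set of covariates in specifications 1.1 and 1.2 and $\Phi$ is the standard normal cumulative distribution function. The Mundlak[1978] version of the CRE probit model further assumes

$$\mu_d \mid X_{hdt} \sim N(\delta_0 + \bar{X}_h\delta_1, \sigma_\mu^2) \tag{1.6}$$

where $\bar{X}_h$ is the average of $X_{hdt}$ for each district and $\sigma_\mu^2$ is the variance of $\mu_d$. Under 1.5 and 1.6, we get

$$E(y_{hdt} \mid X_{hdt}, \mu_d) = \Phi\left[(\delta_0 + X_{hdt}\beta + \bar{X}_h\delta_1)\cdot(1 + \sigma_\mu^2)^{-1/2}\right] = \Phi\left[\delta_0^{\mu} + X_{hdt}\beta^{\mu} + \bar{X}_h\delta_1^{\mu}\right] \tag{1.7}$$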
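The role of the district means in 1.6 is easiest to see in the linear case, where adding group means of the covariates to a pooled regression reproduces the within (fixed effects) slope exactly. The sketch below checks this on simulated data. It is only a linear analogue under assumed data, not the fractional response model in 1.7, and all names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_districts, n_per_district = 50, 40
d = np.repeat(np.arange(n_districts), n_per_district)   # district id of each household
mu = rng.normal(size=n_districts)                        # unobserved district effect
x = mu[d] + rng.normal(size=d.size)                      # covariate correlated with mu
y = 0.5 * x + mu[d] + rng.normal(size=d.size)

counts = np.bincount(d)
x_bar = (np.bincount(d, weights=x) / counts)[d]          # district mean of x, per household
y_bar = (np.bincount(d, weights=y) / counts)[d]

# Mundlak-style regression: y on a constant, x and the district mean of x
X_cre = np.column_stack([np.ones_like(x), x, x_bar])
b_cre = np.linalg.lstsq(X_cre, y, rcond=None)[0][1]

# Within (fixed effects) estimator from district-demeaned data
x_within, y_within = x - x_bar, y - y_bar
b_fe = (x_within @ y_within) / (x_within @ x_within)

print(b_cre, b_fe)   # the two slopes coincide up to floating point error
```

In non-linear models such as 1.7 the equivalence is no longer exact, which is why the distributional assumption in 1.6 is needed there.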

To capture the district fixed effects in 1.7, the district-level means of all controls across time are included as additional controls in the DID model. Standard errors are clustered at the district level and time fixed effects are included. For my triple differences model, I include the means of all controls at the district by household income level as the correlated random effects. Here, the standard errors are clustered at the district by household income level.

Following Wooldridge et al.[2011], Wooldridge[2015], Baum et al.[2013], and Papke and Wooldridge[2008], I use a two-step control function approach to deal with the continuous endogenous regressor, household size, included in my model. In the control function approach, I first estimate household size as a function of my instrument, the gender of the first child in the household. This gives me residuals similar to the first stage of a 2SLS approach. I then use the residuals from this model as an additional regressor in the main model, which is estimated as a CRE fractional logit model.

I present the results in Table A.2. Panel A provides the DID results and panel B the triple differences results. Columns (1) and (3) repeat my baseline results from Table 1.3. Columns (2) and (4) present the IV results for the CRE fractional logit model. Since column (2) is the CRE fractional logit specification, I cannot interpret the coefficients directly and thus calculate the marginal effect of the treatment. I find a small positive marginal effect of RSBY, but it is not statistically different from zero. The CRE fractional logit specification of the triple differences model in column (4) shows a small but statistically significant difference in the marginal effects of RSBY for the poor and the rich households in the treated districts after RSBY. There is no effect on the rich households. The magnitude of this difference is 0.2 percentage points, which implies a difference of 8 percent in the budget shares of the poor and the rich in treated districts.
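The two-step control function procedure described above can be sketched as follows. This is a simplified illustration on simulated data: the budget share is generated without noise, the correlated random effects terms and clustered standard errors are left out, and the helper `fractional_logit` is a hypothetical Newton-Raphson implementation of the Bernoulli quasi-likelihood rather than the routine used for Table A.2.

```python
import numpy as np

def fractional_logit(y, X, n_iter=50):
    """Bernoulli quasi-MLE for an outcome in [0, 1], fitted by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (y - p)
        hessian = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

rng = np.random.default_rng(2)
n = 20_000
girl_first = rng.integers(0, 2, size=n).astype(float)   # simulated instrument
v = rng.normal(size=n)                                    # unobservable behind household size
hh_size = 5.0 + 1.0 * girl_first + v                      # endogenous regressor
index = -3.0 + 0.2 * hh_size + 0.5 * v                    # v also enters the outcome equation
share = 1.0 / (1.0 + np.exp(-index))                      # simulated budget share in (0, 1)

const = np.ones(n)
# Step 1: regress household size on the instrument and keep the residuals
Z = np.column_stack([const, girl_first])
v_hat = hh_size - Z @ np.linalg.lstsq(Z, hh_size, rcond=None)[0]
# Step 2: fractional logit with the first-stage residual as an extra regressor
b_cf = fractional_logit(share, np.column_stack([const, hh_size, v_hat]))
b_naive = fractional_logit(share, np.column_stack([const, hh_size]))
print("control function:", b_cf[1], "naive:", b_naive[1])
```

With the residual included, the coefficient on household size stays near the value of 0.2 used in the simulation, whereas the naive fit attributes the influence of `v` to household size.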

1.6.1.3. School expenditure - panel analysis

As an additional robustness check of my baseline school expenditure model, I estimate the treatment effect by treating the data as a panel, since IHDS-II re-interviews most of the IHDS-I households. The results are overall robust to this change. Table A.3. presents the results. Panel A provides the difference-in-differences results and panel B the triple differences results. Columns (1) and (3) repeat the baseline DID and DDD results from Table 1.3. Column (2) shows the effect of RSBY using the IV approach with household fixed effects for the panel data. The standard errors are clustered at the household level. The DID model suggests that RSBY increases the budget share of school expenditure by 0.3 percentage points. This implies a 12 percent increase in the budget share of school expenditure, given that the mean budget share was 2.5 percent (Table 1.1). From column (4), the triple differences estimator shows that RSBY leads to an increase of 0.4 percentage points in the budget share of school expenditure for the poorer households in the treated districts. This is equivalent to a 16 percent increase in the budget share of school spending for the poorer households.
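The conversion from a percentage-point effect to a percent change relative to the mean budget share, used in the paragraph above, is simple arithmetic. A small sketch, with a hypothetical helper name, makes the calculation explicit using the figures quoted in the text.

```python
def relative_change(effect_pp, baseline_pct):
    """Percent change implied by a percentage-point effect on a baseline share (in percent)."""
    return 100.0 * effect_pp / baseline_pct

did_change = relative_change(0.3, 2.5)   # DID estimate, mean budget share of 2.5 percent
ddd_change = relative_change(0.4, 2.5)   # DDD estimate for the poorer households
print(round(did_change, 1), round(ddd_change, 1))
```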

1.6.2. School enrollment - other estimation issues

1.6.2.1. School enrollment - probit with correlated random effects

A first potential concern with my school enrollment analysis is that the dependent variable is a binary outcome and should ideally be estimated with a non-linear model such as a probit or a logit. Linear probability models are likely to give biased and inconsistent estimates (Horrace and Oaxaca[2006]). Probit or logit models, however, use a proper functional form where the probability depends on $x$ through the index $x\beta$:

$$\Pr(y_i = 1 \mid x_i) = F(x_i\beta)$$

where the functional form $F(\cdot)$ maps into a response probability, $F : \mathbb{R} \to [0, 1]$, for which we consider CDFs as they map numbers from the entire real number line onto the unit interval. Given that the difference between a probit and a logit is small in practice, I use a probit model. However, as before, I have a fixed effects baseline model where I control for a large number of districts, making a simple probit estimation infeasible. Thus, I compare estimates from the baseline fixed effects linear probability model with two alternative estimation approaches to analyze the robustness of my results. The first is a linear probability model with correlated random effects. The second is an IV probit model with correlated random effects, for which a variant of 1.7 would look like

$$\Pr(y_{ihdt} = 1 \mid X_{ihdt}, \mu_d) = \Phi\left[(\delta_0 + X_{ihdt}\beta + \bar{X}_i\delta_1)\cdot(1 + \sigma_\mu^2)^{-1/2}\right] = \Phi\left[\delta_0^{\mu} + X_{ihdt}\beta^{\mu} + \bar{X}_i\delta_1^{\mu}\right] \tag{1.8}$$

Again, to capture the district fixed effects, the district-level means of all controls across time are included as additional controls in the estimation. All standard errors are clustered at the district level and time fixed effects are included.

The results are presented in Table A.4. Panel A presents the difference-in-differences results and panel B the triple differences results. One can compare the baseline results presented in columns (1) and (4) with those from a linear probability model with correlated random effects presented in columns (2) and (5), as well as with the CRE-IV probit model presented in columns (3) and (6). Notice that in both the linear probability model with fixed effects and the one with correlated random effects, the DID approach shows that, absent the treatment, the probability of enrollment of a boy is approximately 6 percentage points higher than that of a girl. After the treatment, I find a larger effect on the probability of girls' enrollment. A reduction in the gender gap in enrollment of 1.8 percentage points is seen from column (2). Since column (3) presents the probit model results, I cannot simply interpret the coefficients. Looking at the marginal effects of RSBY, I find consistent results. Notice that the marginal effect of the treatment on the probability of enrollment is statistically significantly higher for a girl than for a boy. The DID estimation shows that the reduction in the gender gap as a result of access to health insurance is 3.2 percentage points and statistically significant.

The triple differences results confirm a similar story. Results from the LPM with fixed effects and the LPM with CRE are quantitatively similar. The impact of RSBY on the probability of enrollment of girls is higher. A reduction in the enrollment gender gap of 0.9 percentage points is seen in both specifications. I also find a reduction in the gender gap in enrollment of 1.4 percentage points from the CRE-IV probit, as seen from the marginal effects of RSBY on a boy and a girl in column (6); however, the effects are not precisely estimated.
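The reason probit coefficients cannot be read directly as effects on the enrollment probability is that the same coefficient implies different probability changes at different baseline indices. The sketch below illustrates this with a hypothetical coefficient of 0.10 on a treatment dummy; the value is chosen for illustration only and is not an estimate from Table A.4.

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF, the link function of the probit model."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def discrete_effect(index_without, coef):
    """Change in Pr(y = 1) when a dummy with coefficient `coef` switches from 0 to 1."""
    return norm_cdf(index_without + coef) - norm_cdf(index_without)

coef = 0.10                                 # hypothetical probit coefficient
effect_mid = discrete_effect(0.0, coef)     # baseline enrollment probability of 0.50
effect_high = discrete_effect(1.5, coef)    # baseline enrollment probability of about 0.93
print(round(effect_mid, 4), round(effect_high, 4))
```

The same coefficient moves the probability by roughly four percentage points at a baseline of 0.50 but by much less when enrollment is already high, which is why marginal effects are reported instead.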

1.6.2.2. School enrollment - instrumental variable approach

The second concern is related to the instrument used for household size in my school enrollment analysis. I compare how my results change from the baseline LPM model where I instrument household size with gender of the first born in the family, with two alternative specifications. First, I include household size in the specification but do not use an instrument for it. Second, I exclude household size from the specification.

In Table A.5., I present the baseline LPM results in columns (1) and (4) and compare them with LPM specifications where household size is included but not instrumented for (columns (2) and (5)), as well as specifications where I omit household size as a regressor (columns (3) and (6)). Panel A presents the difference-in-differences results and panel B presents the triple differences results. Strikingly, the DID results from all three columns (1), (2) and (3) are qualitatively and quantitatively similar as well as statistically significant. I find a reduction in the gender gap in enrollment as a result of RSBY of 1.8 to 1.9 percentage points in all three estimation choices. The triple differences results also confirm a statistically significant reduction in the gender gap in enrollment of 0.8 to 0.9 percentage points as a result of RSBY in all three specifications (columns (4), (5) and (6)). These alternative specification choices thus support the validity of my baseline results.

1.7. Sensitivity Analysis

This section first discusses the sensitivity of my baseline results to variations in income distribution introduced in my models in 1.7.1. Second, I explore the heterogeneous effects of the treatment by intensity in section 1.7.2. Third, I exploit the variation in take-up of RSBY by district to estimate the heterogeneous treatment effect in 1.7.3. Lastly, I explore whether my baseline effects are different across sub samples varied by age groups for enrollment; by areas and by castes for both expenditure and enrollment in 1.7.4.

1.7.1. Variation in income distribution

I introduce variation in the income distribution categories used to define treatment and control groups in the triple differences model. To maintain symmetry with my baseline triple differences, I first restrict the sample to households in the top 30 percent and bottom 30 percent of the income distribution. For this, I redefine $LowInc_h$ in Eqn 1.2 such that the top 30 percent of households form the controls for my new treatment group, which is the bottom 30 percent. Observations in the middle 40 percent are dropped. Second, I drop observations from the bottom 30 percent and redefine $LowInc_h$ such that the top 30 percent are now controls for households in the middle 40 percent of the sample. Since RSBY was expanded to cover other unorganized and domestic workers, the expectation for this second variation in treatment group is that the effect is perhaps positive, but smaller.

I present these results in Table A.6. Panel I provides the results for the school expenditure analysis. Panel A repeats the baseline DID results. Panel B presents the results for the two variations in my triple differences model. Column (2) shows that RSBY has a treatment effect of a 0.5 percentage point increase in the budget share of school expenditure for the households that belong to the bottom 30 percent in the treated districts. This was the initial target group of the scheme. The average budget share of school expenditure for this target group in 2004-05 is approximately 1.8 percent. A treatment effect of a 0.5 percentage point increase thus implies an approximately 27 percent rise in their budget share of school spending. Contrary to the expectation, the triple differences estimator in column (3) shows a zero effect on the households in the middle 40 percent of the sample. This could perhaps be a result of differences in the take-up of the program as the sample for this specification changes. An equivalent school enrollment analysis is presented in panel II. RSBY leads to a small reduction in the gender gap in enrollment of 0.2 percentage points for the bottom 30 percent in the treated districts. However, I do not find any reduction in the gender gap for the middle 40 percent of households.
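The two sample restrictions described above amount to choosing which households to keep and how to code the low-income indicator. A minimal sketch follows, assuming a single household income variable and percentile cut-offs at 30 and 70. The function name and the toy income vector are hypothetical, and the actual construction in the data may differ.

```python
import numpy as np

def low_income_variants(income):
    """Return (keep_mask, LowInc) pairs for the two restricted-sample comparisons."""
    p30, p70 = np.percentile(income, [30, 70])
    bottom = income <= p30
    top = income > p70
    middle = ~bottom & ~top
    # Variation 1: bottom 30 percent (LowInc = 1) against top 30 percent; middle dropped
    variation_1 = (bottom | top, bottom.astype(int))
    # Variation 2: middle 40 percent (LowInc = 1) against top 30 percent; bottom dropped
    variation_2 = (middle | top, middle.astype(int))
    return variation_1, variation_2

income = np.arange(1, 101, dtype=float)      # toy incomes for 100 households
(keep_1, low_1), (keep_2, low_2) = low_income_variants(income)
print(keep_1.sum(), low_1[keep_1].sum(), keep_2.sum(), low_2[keep_2].sum())
```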

1.7.2. Variation in treatment intensity

Here, I exploit the variation in treatment intensity to estimate the heterogeneous effect of RSBY on household school expenditure as well as school enrollment. I define three dummy variables based on the duration a household has been exposed to RSBY. $\text{Intensity}_d^{1}$ takes value 1 if RSBY has been in effect in the district for one year by the second wave of the IHDS survey; $\text{Intensity}_d^{2}$ takes value 1 if RSBY has been in effect for two years; and $\text{Intensity}_d^{3}$ takes value 1 if it has been in effect for three years.32 One may expect the effect of RSBY to vary with time since implementation. To explore this, I use the following difference-in-differences and triple differences models for school expenditure:

$$y_{hdt} = \beta_0 + \beta_1 T_t + \beta_2 \text{RSBY}_{dt} + \sum_{j=1}^{3} \beta_3^{j}\, \text{Intensity}_d^{j} + \sum_{j=1}^{3} \beta_4^{j}\, \text{RSBY}_{dt} * \text{Intensity}_d^{j} + \gamma X_{hdt} + \mu_d + \varepsilon_{hdt} \tag{1.9}$$

$$\begin{aligned} y_{hdt} = {} & \beta_0 + \beta_1 T_t + \beta_2 \text{LowInc}_h + \sum_{j=1}^{3} \beta_3^{j}\, \text{Intensity}_d^{j} + \beta_4 \text{RSBY}_{dt} + \beta_5 T_t * \text{LowInc}_h \\ & + \beta_6 \text{RSBY}_{dt} * \text{LowInc}_h + \sum_{j=1}^{3} \beta_7^{j}\, \text{RSBY}_{dt} * \text{LowInc}_h * \text{Intensity}_d^{j} \\ & + \mu_d * \text{LowInc}_h + \gamma X_{hdt} + \mu_d + \varepsilon_{hdt} \end{aligned} \tag{1.10}$$

32By the second wave of the survey, the scheme was active for three years.

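The parameter of interest varies with time $t$ and district $d$, where the total impact of RSBY is given by $\beta_2 + \sum_{j=1}^{3} \beta_4^{j}\, \text{Intensity}_d^{j}$, depending upon the duration of exposure to RSBY in Eqn. 1.9. The heterogeneous effect of RSBY is captured by $\beta_4^{j}$. Similarly, the parameter that captures the heterogeneous effect of RSBY in the triple differences Eqn. 1.10 is $\beta_7^{j}$, and the total effect of RSBY for the poor households is given by $\beta_4 + \beta_6 + \sum_{j=1}^{3} \beta_7^{j}\, \text{Intensity}_d^{j}$. Similarly, I estimate the following DID and DDD models for school enrollment:

$$\begin{aligned} y_{ihdt} = {} & \alpha_0 + \alpha_1 T_t + \alpha_2 \text{Boy}_{ihdt} + \alpha_3 \text{RSBY}_{dt} + \sum_{j=1}^{3} \alpha_4^{j}\, \text{Intensity}_d^{j} + \alpha_5 \text{RSBY}_{dt} * \text{Boy}_{ihdt} \\ & + \sum_{j=1}^{3} \alpha_6^{j}\, \text{RSBY}_{dt} * \text{Intensity}_d^{j} + \sum_{j=1}^{3} \alpha_7^{j}\, \text{RSBY}_{dt} * \text{Intensity}_d^{j} * \text{Boy}_{ihdt} + \gamma X_{hdt} + \mu_d + \varepsilon_{ihdt} \end{aligned} \tag{1.11}$$

$$\begin{aligned} y_{ihdt} = {} & \alpha_0 + \alpha_1 T_t + \alpha_2 \text{Boy}_{ihdt} + \alpha_3 \text{LowInc}_h + \alpha_4 \text{RSBY}_{dt} + \sum_{j=1}^{3} \alpha_5^{j}\, \text{Intensity}_d^{j} + \alpha_6 \text{RSBY}_{dt} * \text{LowInc}_h \\ & + \alpha_7 \text{RSBY}_{dt} * \text{LowInc}_h * \text{Boy}_{ihdt} + \sum_{j=1}^{3} \alpha_8^{j}\, \text{RSBY}_{dt} * \text{LowInc}_h * \text{Intensity}_d^{j} \\ & + \sum_{j=1}^{3} \alpha_9^{j}\, \text{RSBY}_{dt} * \text{LowInc}_h * \text{Intensity}_d^{j} * \text{Boy}_{ihdt} + \alpha_{10} T_t * \text{LowInc}_h \\ & + \mu_d * \text{LowInc}_h + \gamma X_{ihdt} + \mu_d + \varepsilon_{ihdt} \end{aligned} \tag{1.12}$$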

In Eqn. 1.11, $\alpha_3 + \sum_{j=1}^{3} \alpha_6^{j}\, \text{Intensity}_d^{j}$ provides the heterogeneous effect of the treatment on enrollment of girls by intensity of treatment duration, whereas $(\alpha_3 + \alpha_5) + \sum_{j=1}^{3} (\alpha_6^{j} + \alpha_7^{j})\, \text{Intensity}_d^{j}$ captures the heterogeneous effect on enrollment of boys by intensity. The change in the gender gap due to RSBY is captured by $\alpha_5 + \sum_{j=1}^{3} \alpha_7^{j}\, \text{Intensity}_d^{j}$. Similarly, the change in the gender gap in enrollment due to RSBY for the poorer households in Eqn. 1.12 is given by $\alpha_7 + \sum_{j=1}^{3} \alpha_9^{j}\, \text{Intensity}_d^{j}$.

Results for the heterogeneous effects of RSBY by intensity of treatment are provided in Tables A.7. and A.8. Panels A and B provide the DID and DDD results respectively. Panels I and II provide the results for school expenditure and school enrollment respectively. Table A.7. shows a treatment impact of a 0.1 percentage point increase in the budget share of school expenditure for households in districts exposed to the scheme for one year, a 0.2 percentage point increase for households in districts exposed to RSBY for two years, and a 0.3 percentage point increase for households in districts exposed to RSBY for three years. This points to a weighted average equal to my baseline result found in column (1) of Table 1.3. The DDD results do not show a statistically significant effect in this model, but the effects point to a similar story.

Column (1) in Table A.8. shows that the reduction in the gender gap for individuals in districts that have been exposed to RSBY for one year is 1.6 percentage points but is not statistically significant. This effect is of the order of 6.2 and 3.2 percentage points in districts that have had RSBY for two and three years respectively by 2011-12, and both are statistically significant. The DDD analysis confirms this pattern. Panel B of Table A.8. shows that the reduction in the gender gap in enrollment between boys and girls in poor households in districts exposed to RSBY for one year is 2.6 percentage points. This result is statistically significant, and the gender gap consequently increases for households that have had longer access to the scheme.
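The way the intensity dummies enter the total effect in Eqn. 1.9 can be sketched in a few lines. The coefficient values below are hypothetical placeholders chosen for illustration, not the estimates reported in Table A.7, and the function names are assumptions of this sketch.

```python
def intensity_dummies(years_exposed):
    """Intensity^j = 1 if the district has had RSBY for exactly j years, for j = 1, 2, 3."""
    return [1 if years_exposed == j else 0 for j in (1, 2, 3)]

def total_effect(beta2, beta4, years_exposed):
    """Total effect beta2 + sum_j beta4[j] * Intensity^j from the DID model with intensity."""
    dummies = intensity_dummies(years_exposed)
    return beta2 + sum(b * dummy for b, dummy in zip(beta4, dummies))

# Hypothetical coefficients (NOT taken from Table A.7)
beta2 = 0.001
beta4 = [0.0, 0.001, 0.002]
effects = {years: total_effect(beta2, beta4, years) for years in (1, 2, 3)}
print(effects)
```

The same pattern extends to the gender gap expressions for Eqns. 1.11 and 1.12 by swapping in the corresponding base coefficient and interaction coefficients.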

1.7.3. Variation in programme take-up by district

Third, I exploit district variation in the take-up of the programme to estimate the effect of RSBY on household school expenditure. Administrative reports suggest that health insurance take-up reached approximately 50 percent by 2013. Considering this, one would expect the treatment effect to double if full take-up could be achieved. To explore this, I use the following difference-in-differences model:

$$y_{hdt} = \beta_0 + \beta_1 T_t + \beta_2 \text{RSBY}_{dt} + \beta_3 \text{DistrictTakeup}_d + \beta_4 \text{RSBY}_{dt} * \text{DistrictTakeup}_d + \gamma X_{hdt} + \mu_d + \varepsilon_{hdt} \tag{1.13}$$

The coefficient of interest is $\beta_2 + \beta_4$. Data for district-wise enrollment into the scheme is taken from the official RSBY website. RSBY enrollment data is available for districts from 15 states out of 29, either because some districts have not been exposed to the scheme or simply because of unavailability of data.

Table A.9. presents the results. Panel A, column (1) shows the simple difference-in-differences treatment results without differential take-up for the districts with available data. I find that RSBY leads to a 0.7 percentage point increase in the household's budget share of school expenditure for the districts enrolled into the scheme. This is equivalent to a 28 percent increase in their budget share of school spending, given that school expenditure comprised approximately 2.6 percent of total household expenditure for such households before RSBY. Column (2) in panel A provides the differential treatment effect of RSBY by take-up. If the treatment effect is extrapolated to 100 percent take-up, I find that the household's budget share of school expenditure increases by 0.9 percentage points, which is equivalent to an almost 35 percent increase in the budget share of school spending for the treated households. This is an economically large effect and is of interest since my treatment is the availability of the scheme and not household participation. However, a word of caution is warranted: this is only suggestive evidence of the treatment effect, since it is based on incomplete data and enrollment into the scheme is endogenous. In addition, my instrument for household size does not pass the specification tests in this model. The triple differences analysis does not show any statistically significant results in this case.
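The extrapolation to full take-up follows from the interaction term in Eqn. 1.13. A minimal sketch is given below, under the assumption that district take-up is measured as a share between 0 and 1, so that full take-up corresponds to $\beta_2 + \beta_4$. The coefficient values and the function name are hypothetical and are not the estimates in Table A.9.

```python
def effect_at_takeup(beta2, beta4, takeup_share):
    """Treatment effect beta2 + beta4 * takeup for a district with the given take-up share."""
    if not 0.0 <= takeup_share <= 1.0:
        raise ValueError("takeup_share must lie in [0, 1]")
    return beta2 + beta4 * takeup_share

# Hypothetical coefficients (NOT taken from Table A.9)
beta2, beta4 = 0.004, 0.005
effect_half = effect_at_takeup(beta2, beta4, 0.5)   # take-up of 50 percent
effect_full = effect_at_takeup(beta2, beta4, 1.0)   # extrapolation to 100 percent take-up
print(effect_half, effect_full)
```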

1.7.4. Sub sample analysis

I re-estimate my baseline school expenditure and school enrollment models in Eqns. 1.1, 1.2, 1.3 and 1.4 to find the treatment impact by changing the samples. First, I estimate three sub-sample regressions for the school enrollment analysis by varying age groups. Table A.10., panels A and B, presents the difference-in-differences and triple differences results. I restrict the sample to children in the age groups 5-9 years, 10-14 years and 15-17 years. I find results consistent with the baseline for the sub-samples of 5-9 years and 10-14 years. For both age groups, the gender gap in enrollment reduces by 1.9 percentage points as a result of access to health insurance. I do not find any impact of RSBY for the sub-sample of 15-17 years. This could possibly be explained by a lower marginal benefit of keeping older children in school than that of younger children.

Second, I conduct sub-sample regressions of both the school expenditure and enrollment models for rural and urban areas separately. Table A.11. presents the results. From Panel I, I find that RSBY has a larger treatment effect on the household's budget share of school expenditure in urban areas than in rural areas in the DID model. The treatment effect is found to be 0.7 percentage points in the urban areas compared to a 0.4 percentage point rise in the rural areas. Both effects are statistically significant. However, from the triple differences model, I do not find a statistically significant impact of RSBY in the urban areas. In contrast, RSBY has a treatment effect of a 0.8 percentage point increase in the budget share of school expenditure for the poor households in the treated districts of rural areas. For the school enrollment model (see panel II), RSBY reduces the gender gap in enrollment by 1.5 to 2.4 percentage points in the rural areas, perhaps owing to the low levels of girls' enrollment to begin with. No such impact is found in the urban areas from either the DID or the DDD models. This speaks to the effectiveness of having access to such health insurance in rural parts.

Table A.12. presents the sub-sample results for both the expenditure and enrollment models estimated by caste categories. Panel I suggests that the treatment has a positive effect on school expenditure for the general category and other backward castes. This can be seen from the DID models. The DDD model also suggests a positive impact on the poor households in the treated districts belonging to other caste categories, apart from those belonging to general castes. Panel II suggests that RSBY reduces the gender gap in enrollment to a large extent for the other backward castes (OBC) category. The magnitude of the reduction in the gender gap for OBCs is economically large. This is confirmed by both the DID and DDD models. The triple differences model also suggests a reduction in the gender gap in enrollment for the scheduled tribes and other castes. However, these results are not confirmed in the DID models.

1.8. Conclusion

The gender gap in schooling remains a concern for most policy makers and educationists. The UN Millennium Development Goals, first enunciated in 2000, emphasized reducing gender gaps in schooling that disadvantage girls (Grant and Behrman[2010], Nations[2015]). Such differences in education could potentially lead to further gender inequalities in income, work and social status. Given that women are significant contributors to the labour force in most developing countries, gender gaps can act as constraints on economic growth. In fact, investment in female education is widely regarded as essential by policymakers owing to the positive externalities associated with it, such as better child health, household welfare, and lower population growth (Song et al.[2006], Alderman and King[1998]). Appropriate policy responses to reduce the gender gap thus require an understanding of its determinants. Little evidence exists on the impact of health insurance on school expenditure in general, and on this gender gap in particular, in India, and my paper attempts to analyze this unexplored determinant.

Understanding the impact of shocks on the education decisions of vulnerable households and the channels of this impact could help in designing safety nets and other policies to insulate investments in education from health shocks (Glick et al.[2016]). An undesired consequence of negative health shocks may be taking children out of school, either to protect their health or to send them to work for additional income. These strategies can have undesired consequences in the long term for the human capital accumulation of future generations and for labor market opportunities. From a policy perspective, it is interesting to see not only whether a health insurance scheme has an unintended role to play in the school expenditure decisions of households but also how parents respond within the household in terms of enrollment of boys versus girls. At the outset, it is not entirely obvious whether health insurance would benefit children's education or have a detrimental impact. Healthier children could mean either greater future economic returns from schooling or greater value as child labour. Such responses need to be considered when designing policies to remedy any disadvantages among children, since parents can eliminate these effects by aiming at equitable child human capital formation within the family.

Although RSBY was implemented with the intention of reducing the financial burden for the poor, I find that it has unintended positive consequences for children. First, I find that household expenditure on school increases as a result of access to RSBY. Second, I find that access to a health insurance system provides additional resources to parents, in a society that depends largely on sons for support during old age, to not exclude their daughters from education opportunities. Robustness checks and sensitivity analyses support the validity of my results.

This is evidence that health insurance protects the poor and also helps such households keep their children in school in the face of health shocks. This unintended benefit could help push households out of the vicious cycle of poor health in childhood leading to less education and hence lower incomes and poorer health in adulthood. In addition, there is also a long-term positive effect of health insurance coverage on economic development, reinforced through the positive impact on school enrollment of girls.

Table 1.1. Summary statistics - Household level

| Variables | Control 2004-05 Mean | Control 2004-05 SD | Control 2011-12 Mean | Control 2011-12 SD | Treated 2004-05 Mean | Treated 2004-05 SD | Treated 2011-12 Mean | Treated 2011-12 SD |
|---|---|---|---|---|---|---|---|---|
| School expenditure | 59.55 | 129.67 | 137.08 | 272.57 | 104.13 | 218.84 | 162.60 | 288.19 |
| Total consumption expenditure | 3527.21 | 3315.30 | 7719.33 | 7977.05 | 4022.48 | 3917.97 | 7631.89 | 7318.71 |
| Age | 45.51 | 13.13 | 46.27 | 13.20 | 46.82 | 13.33 | 47.08 | 13.60 |
| Household size | 6.38 | 2.95 | 5.56 | 2.14 | 6.71 | 3.09 | 5.85 | 2.31 |
| Urban (1 = yes) | 0.30 | 0.46 | 0.31 | 0.46 | 0.27 | 0.44 | 0.29 | 0.45 |
| Other Backward Castes (1 = yes) | 0.42 | 0.49 | 0.21 | 0.41 | 0.40 | 0.49 | 0.21 | 0.41 |
| Scheduled Caste (1 = yes) | 0.21 | 0.41 | 0.43 | 0.49 | 0.22 | 0.42 | 0.41 | 0.49 |
| Scheduled Tribe (1 = yes) | 0.10 | 0.30 | 0.19 | 0.40 | 0.08 | 0.27 | 0.24 | 0.42 |
| Other castes (1 = yes) | 0.24 | 0.43 | 0.14 | 0.34 | 0.24 | 0.43 | 0.09 | 0.29 |
| Muslim (1 = yes) | 0.09 | 0.28 | 0.09 | 0.28 | 0.14 | 0.35 | 0.16 | 0.36 |
| Christian (1 = yes) | 0.02 | 0.12 | 0.03 | 0.17 | 0.03 | 0.16 | 0.02 | 0.14 |
| Sikh or Buddhist (1 = yes) | 0.03 | 0.16 | 0.02 | 0.14 | 0.04 | 0.19 | 0.03 | 0.18 |
| Other religion (1 = yes) | 0.01 | 0.09 | 0.00 | 0.06 | 0.01 | 0.11 | 0.01 | 0.08 |
| HH Head - literate (1 = yes) | 0.64 | 0.48 | 0.70 | 0.46 | 0.63 | 0.48 | 0.66 | 0.47 |
| HH Head - knows English (1 = yes) | 0.15 | 0.35 | 0.16 | 0.36 | 0.18 | 0.38 | 0.16 | 0.37 |
| HH Head - ever attended school (1 = yes) | 0.65 | 0.48 | 0.65 | 0.48 | 0.63 | 0.48 | 0.61 | 0.49 |
| Male with primary education (1 = yes) | 0.15 | 0.35 | 0.15 | 0.35 | 0.14 | 0.35 | 0.15 | 0.36 |
| Male with secondary education (1 = yes) | 0.28 | 0.45 | 0.38 | 0.49 | 0.26 | 0.44 | 0.37 | 0.48 |
| Male with senior sec. education (1 = yes) | 0.06 | 0.24 | 0.14 | 0.35 | 0.06 | 0.24 | 0.12 | 0.33 |
| Male with college education (1 = yes) | 0.04 | 0.20 | 0.13 | 0.33 | 0.05 | 0.22 | 0.11 | 0.31 |
| Female with primary education (1 = yes) | 0.17 | 0.37 | 0.16 | 0.36 | 0.16 | 0.37 | 0.15 | 0.36 |
| Female with secondary education (1 = yes) | 0.37 | 0.48 | 0.35 | 0.48 | 0.38 | 0.48 | 0.30 | 0.46 |
| Female with senior sec. education (1 = yes) | 0.12 | 0.33 | 0.10 | 0.30 | 0.10 | 0.30 | 0.09 | 0.29 |
| Female with college education (1 = yes) | 0.09 | 0.29 | 0.07 | 0.26 | 0.10 | 0.30 | 0.08 | 0.26 |
| Gender of the head (1 = male) | 0.94 | 0.23 | 0.91 | 0.29 | 0.92 | 0.28 | 0.87 | 0.34 |
| # of married males | 1.45 | 0.86 | 1.30 | 0.70 | 1.45 | 0.88 | 1.26 | 0.73 |
| # of married females | 1.48 | 0.86 | 1.34 | 0.71 | 1.53 | 0.90 | 1.38 | 0.73 |
| Proportion of children | 0.39 | 0.15 | 0.37 | 0.14 | 0.39 | 0.15 | 0.38 | 0.16 |
| HH has a bank account (1 = yes) | 0.34 | 0.48 | 0.34 | 0.47 | 0.36 | 0.48 | 0.34 | 0.47 |
| HH has a Kisan credit card (1 = yes) | 0.04 | 0.20 | 0.06 | 0.24 | 0.05 | 0.21 | 0.05 | 0.23 |
| HH has a credit card (1 = yes) | 0.01 | 0.11 | 0.02 | 0.15 | 0.01 | 0.12 | 0.03 | 0.16 |

Notes: Sample is restricted to households where the age of the head of the household is between 18 and 90 years. The table shows the summary statistics in the control districts and treatment districts in 2004-05 and 2011-12 at the household level. Dummy variables containing information about education levels, demography, bank information, caste and religion of the household are included. Muslim takes value 1 if the household is Muslim, 0 otherwise. Christian = 1 if the household is Christian, 0 otherwise. Sikh = 1 if the household is Sikh, 0 otherwise. Other religion = 1 if the household falls under any of the other categories like Jainism, Buddhism, Zoroastrianism, and others, 0 otherwise. ST = 1 if the household is scheduled tribe, 0 otherwise. SC = 1 if the household is scheduled caste, 0 otherwise. OBC = 1 if the household belongs to other backward castes, 0 otherwise. Other castes = 1 if the household belongs to other castes or the general category, 0 otherwise.

Table 1.2. Summary statistics - Individual level

| Variables | Control 2004-05 Mean | Control 2004-05 SD | Control 2011-12 Mean | Control 2011-12 SD | Treated 2004-05 Mean | Treated 2004-05 SD | Treated 2011-12 Mean | Treated 2011-12 SD |
|---|---|---|---|---|---|---|---|---|
| Enrolled (1 = yes) | 0.795 | 0.404 | 0.886 | 0.317 | 0.758 | 0.429 | 0.865 | 0.342 |
| Gender (1 = boy) | 0.522 | 0.500 | 0.535 | 0.499 | 0.525 | 0.499 | 0.525 | 0.499 |
| Household size | 6.919 | 3.213 | 7.153 | 3.219 | 7.100 | 3.149 | 7.377 | 3.347 |
| Scholarship from school (1 = yes) | 0.074 | 0.261 | 0.365 | 0.481 | 0.089 | 0.285 | 0.350 | 0.477 |
| Age | 10.968 | 3.683 | 11.469 | 3.609 | 10.876 | 3.712 | 11.321 | 3.634 |
| Benefits from school (1 = yes) | 0.487 | 0.500 | 0.547 | 0.498 | 0.344 | 0.475 | 0.434 | 0.496 |
| Urban (1 = yes) | 0.274 | 0.446 | 0.251 | 0.434 | 0.268 | 0.443 | 0.287 | 0.452 |
| OBC (1 = yes) | 0.355 | 0.479 | 0.257 | 0.437 | 0.384 | 0.486 | 0.228 | 0.420 |
| SC (1 = yes) | 0.170 | 0.376 | 0.374 | 0.484 | 0.234 | 0.423 | 0.411 | 0.492 |
| ST (1 = yes) | 0.159 | 0.366 | 0.162 | 0.369 | 0.054 | 0.226 | 0.244 | 0.429 |
| Other castes (1 = yes) | 0.272 | 0.445 | 0.162 | 0.369 | 0.268 | 0.443 | 0.062 | 0.241 |
| Muslim (1 = yes) | 0.116 | 0.320 | 0.102 | 0.303 | 0.176 | 0.380 | 0.181 | 0.385 |
| Christian (1 = yes) | 0.038 | 0.190 | 0.028 | 0.164 | 0.017 | 0.131 | 0.016 | 0.125 |
| Sikh & Buddhist (1 = yes) | 0.033 | 0.177 | 0.026 | 0.159 | 0.043 | 0.203 | 0.038 | 0.191 |
| Other religion (1 = yes) | 0.019 | 0.135 | 0.006 | 0.075 | 0.011 | 0.103 | 0.004 | 0.066 |
| Father with primary education (1 = yes) | 0.167 | 0.373 | 0.159 | 0.366 | 0.148 | 0.355 | 0.150 | 0.357 |
| Father with secondary education (1 = yes) | 0.251 | 0.434 | 0.372 | 0.483 | 0.224 | 0.417 | 0.353 | 0.478 |
| Father with senior sec. education (1 = yes) | 0.054 | 0.225 | 0.129 | 0.335 | 0.042 | 0.201 | 0.107 | 0.309 |
| Father with college education (1 = yes) | 0.032 | 0.175 | 0.107 | 0.309 | 0.036 | 0.185 | 0.089 | 0.285 |
| Mother with primary education (1 = yes) | 0.178 | 0.382 | 0.170 | 0.375 | 0.161 | 0.367 | 0.154 | 0.361 |
| Mother with secondary education (1 = yes) | 0.371 | 0.483 | 0.309 | 0.462 | 0.370 | 0.483 | 0.257 | 0.437 |
| Mother with senior sec. education (1 = yes) | 0.111 | 0.314 | 0.078 | 0.268 | 0.085 | 0.279 | 0.067 | 0.249 |
| Mother with college education (1 = yes) | 0.084 | 0.277 | 0.051 | 0.220 | 0.082 | 0.275 | 0.054 | 0.226 |

Notes: Sample is restricted to households with children in the age group 5 to 18 years. The table shows summary statistics in the control districts and treatment districts in 2004-05 and 2011-12 for the individual level data. Dummy variables containing information about gender, school facilities, parental education levels, age, caste and religion of the individuals are included. Muslim takes value 1 if individual is Muslim, 0 otherwise. Christian = 1 if individual is Christian, 0 otherwise. Sikh = 1 if individual is Sikh, 0 otherwise. Other religion = 1 if the individual falls under any of the other categories like Jainism, Buddhism, Zoroastrianism, and others, 0 otherwise. ST = 1 if individual's caste is scheduled tribe, 0 otherwise. SC = 1 if individual's caste is scheduled caste, 0 otherwise. OBC = 1 if individual belongs to other backward castes, 0 otherwise. Other castes = 1 if individual belongs to other castes or the general category, 0 otherwise.

Figure 1.1. Pre-trends at district level

Table 1.3. Impact of RSBY on household school expenditure

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Panel | A. DID | A. DID | A. DID | B. DDD | B. DDD | B. DDD |
| Dependent variable | School expd. Budget Share | Log School expd. Levels | Log Total consumption expd. Levels | School expd. Budget Share | Log School expd. Levels | Log Total consumption expd. Levels |
| RSBY*Post | 0.005*** | 0.302*** | 0.077*** | -0.003* | -0.120 | -0.015 |
| | (0.001) | (0.127) | (0.014) | (0.002) | (0.257) | (0.074) |
| Low Income (=1 for bottom 70%) | | | | -0.047** | -0.829* | -0.224*** |
| | | | | (0.024) | (0.499) | (0.084) |
| RSBY*Post*Low Income | | | | 0.007*** | 0.422*** | 0.098 |
| | | | | (0.001) | (0.188) | (0.086) |
| Underidentification test | p=0.000 | p=0.000 | p=0.000 | p=0.001 | p=0.001 | p=0.001 |
| Weak-identification test: Kleibergen-Paap rk Wald F statistic | 11.646 | 11.607 | 11.574 | 6.580 | 6.543 | 6.553 |
| Endogeneity test | p=0.010 | p=0.000 | p=0.025 | p=0.010 | p=0.002 | p=0.027 |
| Other Controls | Y | Y | Y | Y | Y | Y |
| District fixed effects | Y | Y | Y | Y | Y | Y |
| Time fixed effects | Y | Y | Y | Y | Y | Y |
| District*Income fixed effects | | | | Y | Y | Y |
| Time*Income fixed effects | | | | Y | Y | Y |
| N | 47421 | 47421 | 47421 | 47421 | 47421 | 47421 |

* p<0.10, ** p<0.05, *** p<0.01. The sample is restricted to HH with children and where the age of the head is between 18 and 90 years. Panels A and B provide the DID and DDD results respectively. Estimation uses an IV approach. Col. (1) and (4): dependent variable is the budget share of the household's school expenditure (school expenditure/total consumption expenditure). Col. (2) and (5): dependent variable is the inverse hyperbolic sine transformation of school expenditure in levels. Col. (3) and (6): dependent variable is the inverse hyperbolic sine transformation of total consumption expenditure in levels.
Additional controls include: RSBY = 1 if the district was exposed to RSBY & 0 otherwise, dummy for Low Income =1 if HH does not belong to top 30% and 0 otherwise (for DDD), HH size (instrumented by gender of the first child), highest education degrees of male and female members, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, number of married men in the HH, number of married women in the HH, proportion of children, teens and adults, indicator for if HOH is married, dummy for if the HH has a bank account, dummy for if the HH has a farmer credit card, district fixed effects, time fixed effects, district by income fixed effects (for DDD), time by income fixed effect (for DDD). Standard errors reported are clustered standard errors.

Table 1.4. Impact of RSBY on child school enrollment

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Panel | A. DID | A. DID | B. DDD | B. DDD |
| Dependent variable | Enrollment | Enrollment with gender differential | Enrollment | Enrollment with gender differential |
| RSBY*Post | 0.017*** | 0.027*** | -0.024 | -0.023 |
| | (0.005) | (0.006) | (0.017) | (0.017) |
| Boy | 0.053*** | 0.060*** | 0.054*** | 0.055*** |
| | (0.004) | (0.004) | (0.004) | (0.004) |
| Low Income (=1 for bottom 70%) | | | -3.328 | -3.576 |
| | | | (9.608) | (9.577) |
| RSBY*Post*Boy | | -0.019*** | | |
| | | (0.005) | | |
| RSBY*Post*Low Income | | | 0.042** | 0.046** |
| | | | (0.019) | (0.020) |
| RSBY*Post*Low Income*Boy | | | | -0.009*** |
| | | | | (0.001) |
| Underidentification test | p=0.000 | p=0.000 | p=0.000 | p=0.000 |
| Weak-identification test: Kleibergen-Paap rk Wald F statistic | 43.434 | 44.022 | 42.242 | 42.46 |
| Endogeneity test | p=0.509 | p=0.570 | p=0.401 | p=0.414 |
| Other Controls | Y | Y | Y | Y |
| District Fixed Effects | Y | Y | Y | Y |
| Time Fixed Effects | Y | Y | Y | Y |
| District*Income Fixed Effects | | | Y | Y |
| Time*Income Fixed Effects | | | Y | Y |
| N | 83221 | 83221 | 83221 | 83221 |

* p<0.10, ** p<0.05, *** p<0.01. The sample is restricted to children above the age of 5 and below the age of 18. Panels A and B provide the DID and DDD results respectively. Estimation uses an LPM. Dependent variable is school enrollment of a child in a household. Additional controls include RSBY = 1 if the district was exposed to RSBY & 0 otherwise, dummy for Low Income =1 if HH does not belong to top 30% and 0 otherwise (for DDD), a gender dummy = 1 for a boy and 0 for a girl, RSBY, HH size, parental education characteristics, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, school facilities and scholarships offered, district FE, time FE, district by income fixed effects (for DDD), time by income fixed effects (for DDD). HH size is instrumented by the gender of the first child. Standard errors reported are clustered standard errors.

Chapter 2

INTRA-HOUSEHOLD CONSUMPTION DECISIONS: EVIDENCE FROM NREGA

2.1. Introduction

Public works programmes are a popular tool used to address the issues of poverty and unemployment in developing countries. The Mahatma Gandhi National Rural Employment Guarantee Act (MG-NREGA), passed in 2005 in India, created the world’s largest public works programme under a statutory framework. The programme legally guarantees one hundred days of unskilled manual work to participants with the intention of alleviating rural poverty.1 Guaranteeing such employment opportunities can directly affect intra-household decisions through a change in total resources and the allocation of these resources. In this paper, I examine the impact of NREGA on the pattern of household consumption expenditure. Looking at changes in consumption patterns within households also gives some insight into the possible effects of NREGA on bargaining power, since men and women are seen to have systematically different consumption preferences and spending patterns (Kanbur and Haddad[1994], Quisumbing et al.[2000], Doepke and Tertilt[2016]).

NREGA represents a compelling policy change for several reasons. First, its annual cost is close to 1% of India’s GDP, and it generates around 2.35 billion person-days of employment and currently benefits more than 50 million households in rural India (Ministry of Rural Development[2016]). A primary contribution of the paper is thus to speak to the welfare effects of such a large scale public works programme. Any conclusions drawn on the basis of this pervasive scheme will therefore be of broad interest. Second, since NREGA was rolled out in a phase-wise manner starting with the most backward districts in 2006, eventually

1The programme was initially called National Rural Employment Guarantee Act (NREGA) but later was changed to MG-NREGA in 2009. I use NREGA to refer to this programme throughout the paper.

covering the entire country by mid 2008, the variation provides an opportunity to evaluate the impact of this programme. I use two rounds of cross-sectional data from the National Sample Survey (NSS) that span the final implementation of the programme. The data allow for a comparison of households in the districts before and after the programme to those in districts that have the programme in both survey waves. Lastly, the programme mandates that one-third of beneficiaries be women, providing an impetus to female autonomy.

Considerable literature exists on the impact of NREGA on labour market outcomes, agricultural wages, consumption, time-use and impact on children (Bose[2017], Imbert and Papp[2015], Ravi and Engler[2015], Deininger and Liu[2013], Diiro et al.[2014]). In contrast, this paper is distinct because it not only evaluates the impact on household consumption expenditure behaviour but also sheds light on traditionally overlooked outcomes, particularly the channels through which the bargaining power of women may be affected in households. In general, most papers evaluating the impact of income shocks to households find that a boost to income increases expenditure on all commodities that households spend on. However, my analysis shows a change in the pattern of spending, depicting a shift in discretionary expenditure towards some commodities more than others. This could be suggestive of greater involvement of women in household decisions given their preferences for welfare improving commodities.

A key result found in the paper is a shift in discretionary spending towards school expenditure as a result of NREGA. To the extent that women are the primary caregivers in the family and are concerned with their children’s well-being (Diiro et al.[2014], Jacoby[1995], Glick[2002]), this suggests a transformative shift in the pattern of resource allocation towards goods women care more about.
At the same time, a stark decline in the budget share of entertainment is seen, implying that the pattern has changed to what can be considered ‘wiser’ consumption choices. These shifts are accompanied by an increase in the expenditure share of durable goods. This result could potentially be driven by more resources being allocated to commodities that substitute for women’s chores in the household, given that they spend more time at NREGA work sites.

The paper goes further to see if the effects of the programme are magnified in situations where one would expect them to be stronger. For instance, a greater share of women employed through NREGA should lead to a greater impact on allocation towards goods women prefer. The paper finds that the pattern of consumption is similar to the baseline results but exhibits larger effects where the women-to-total employment ratio is higher. Moreover, guaranteed employment should induce larger impacts where higher minimum wages are provided as part of the programme. Analyzing the heterogeneous effects due to variations in state stipulated minimum wages, the paper finds the magnitude of impact to be greater where participants’ wages are subject to higher minimum wages. Another source of variation in the programme effect may arise due to differences in the degree of women’s involvement in agricultural processes employed for crop production. Considering this heterogeneity, it is found that households belonging to wheat and rice growing regions are affected differentially given differences in the status of women prior to the treatment. Lastly, the programme is found to marginally increase the probability of female headed households for the sample consisting of at least one male and female adult.

The rest of the paper proceeds as follows. Section 2.2 provides the background and programme details of NREGA. Section 2.3 presents a review of related literature. Section 2.4 describes the data, followed by the empirical strategy in Section 2.5. Section 2.6 discusses the baseline results, followed by sensitivity analysis in Section 2.7. The paper ends with robustness checks presented in Section 2.8, followed by the conclusion in Section 2.9.

2.2. Background on NREGA

The Mahatma Gandhi National Rural Employment Guarantee Act, 2005 is aimed at enhancing the livelihood of households in rural areas. In February 2006, the programme was introduced to 200 backward districts as the first phase of its implementation. The second phase was rolled out in April 2007 and extended to an additional 130 districts. By April 2008, 284 more districts were covered, exposing the whole of rural India to the programme. NREGA provides at least 100 days of guaranteed wage employment every financial year to households where adult members volunteer to undertake unskilled manual work. This is the first instance of a legally binding commitment made by the government to provide employment. In a short span of operation, NREGA has had a substantial impact in generating rural employment, affecting approximately 50 million households. A minimum statutory requirement of the policy is to have 33 percent women participation. Current statistics suggest that the actual participation is about 52 percent. This is particularly striking, given that women make up less than 30% of the total labor force (Ministry of Rural Development[2013]).

To obtain work, adult members of a household apply for a job card at the local Gram Panchayat.2,3 After due verification, the registered household is issued a job card within 15 days. The card is valid for at least five years, after which it can be renewed. Once the household obtains the job card, members can apply for a job at any time and are assigned work within 15 days, failing which they are eligible for unemployment compensation. Projects sanctioned under NREGA are employment projects decided by the intermediate administrative body between the Gram Panchayat and the district. These projects pertain to water conservation, irrigation, land development, construction of roads and ponds, building of canals, afforestation, leveling of fields, fisheries, rural sanitation and government relief works. Workers are paid either a piece rate or a daily wage subject to a minimum specified by the state and governed by a national minimum (Ministry of Rural Development[2013]).

2.3. Literature review

This paper contributes to two strands of literature on NREGA. One pertains to the evaluation of NREGA as a welfare programme. The impact of NREGA has been studied on labour market outcomes like participation in public works, private employment, wages and welfare outcomes (Bose[2017], Imbert and Papp[2015], Deininger and Liu[2013]). Imbert and Papp[2015] estimate the effect on private employment and wages and find that public sector low skilled manual work crowds out private sector work (similar to Zimmermann

2A household in this analysis is defined as the set of individuals who cook around one common stove.
3The lowest governing body at the village level.

[2014]) and increases private sector wages. Azam[2012] finds a positive impact on labor force participation which is driven by significant female participation. Similarly, Diiro et al.[2014] show that the presence of work opportunities in the villages increases average wages of casual workers, reduces the gender wage gap and increases the probability of female labor market participation. Ravi and Engler[2015] measure the welfare impact of NREGA and find significant impacts on rural poverty alleviation, increasing food security, and the probability of saving. Bose[2017] finds an increase in consumption for the marginalized caste group and that consumption patterns have in general shifted towards higher-calorie food.

The second strand of literature pertains to NREGA effects on outcomes impacting children. Afridi et al.[2016] specifically find that greater participation of mothers relative to fathers is associated with children spending more time in school and girls benefiting more from an increase in mothers' participation. Islam and Sivasankaran[2014], on the other hand, find that time spent on education for younger children increases but time spent working outside the household for older children also increases post NREGA. Li and Sehkri[2013] also find such unintentional perverse effects in terms of an increase in child labour. Despite the benefits of the programme, some papers advocate a roll-back owing to its high costs and corruption (Niehaus and Sukhtankar[2012]). Therefore, if NREGA does in fact alter consumption patterns, another benefit of the paper would be a contribution to an accurate cost and benefit analysis of the programme.

In addition, this paper is also an effort to contribute to the literature on unitary models of households versus bargaining models.
There is considerable evidence refuting models assuming common preferences (Becker[1974]) in favor of models where intra-household bargaining takes place (McElroy and Horney[1981], Manser and Brown[1980], Heath and Tan[2014], Lundberg and Pollak[1996], Chiappori[1988, 1992]). Extant literature finds that final consumption allocations are made on the basis of weights attached to the preferences of household members towards goods they especially care about. Such differences in consumption preferences between men and women are well documented across many settings (Lundberg and Pollak[1996], Anderson and Baland[2002], Basu[2006]). Mencher[1988],

Riley[1997], Desai and Jain[1994] suggest that a woman’s preferences are reflected in household decisions depending on her actual contribution to the household budget. Along similar lines, Anderson and Eswaran[2009] find that any contribution to an income generating activity potentially increases female autonomy. NREGA, as an income generating and employment guarantee policy, should therefore alter consumption patterns and have some effects on female bargaining power within households.

2.4. Data

The 64th and 68th rounds of repeated cross-section data from the employment and unemployment survey of the National Sample Survey Organization (NSSO) are used. The two waves pertain to 2007-08 and 2011-12. The survey is conducted from July to June to capture the full agricultural cycle and is stratified by urban and rural areas.4 Information on the roll-out of NREGA to districts across India is taken from the official NREGA website.5 Employment and women participation statistics at the district level, data on the consumer price index and state-wise minimum wages for NREGA workers as per the Minimum Wage Act, 1948 and NREGA Act, 2005, for the relevant years are taken from the Ministry of Labour and Employment, Government of India.6 Information on rice and wheat producing districts is obtained from the Ministry of Agriculture and Farmers Welfare, Government of India.

Urban areas from the survey sample have been dropped since NREGA is only applicable to households in rural areas. All districts of India are included except those from the state of Jammu and Kashmir, which is affected by persistent internal conflict and has missing data problems. Districts of Mumbai, New Delhi, Ladakh, Andaman & Nicobar islands and some other districts for which there is no information are also excluded. The sample is restricted to include only households with at least one adult male and female member to circumvent any issue related to the absence of a male in the household due to migration, ill-health or death.

4NSS Survey is stratified by urban and rural areas of each district and is further divided into four sub-rounds each lasting three months.
5List of districts and phases can be found at http://nrega.nic.in/MNREGA Dist.pdf.
6Provided as per the central Government notification for the relevant years upon request.

A basket of fourteen commodities - cereals and cereal products; pulses and pulses products; edible oil; fuel and light; meat, fish, milk and milk products; intoxicants and tobacco; entertainment; vegetables and fruits; spices, salt and condiments; personal items, toiletry and other miscellaneous products; school expenditure; durable goods; medical expenditure; and clothing, bedding and footwear - is considered as my set of outcome variables. Cases for which consumption expenditure has many zero values are dropped.7 NSS data uses a thirty day time frame for some commodities and a three hundred and sixty five day time frame for others. All expenditures are converted to the monthly time frame before estimation. The dependent variables are in the form of budget shares spent on fourteen separate commodity categories out of the total monthly spending by a household in a district at a particular point in time. The sample is further restricted to include only households with children for the model where my outcome variable is the budget share of school expenditure. Standard errors are clustered at the district level in all estimations. The set of controls includes household size, age of the head of the household, age squared, number of children, number of literate males and females, number of males and females with primary, middle, higher and technical education, and indicators for caste and religion (scheduled tribe, scheduled caste, other backward class, Hindu, Muslim, Christian, Sikh, and other religion).
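As an illustration of the construction of the dependent variables described above, the following is a minimal sketch rather than the procedure actually applied to the NSS data. It assumes that a 365-day recall amount is rescaled to a 30-day basis by the factor 30/365, which is one natural reading of "converted to the monthly time frame"; the function name, the commodity labels and all numbers are hypothetical.

```python
def monthly_budget_shares(spending, recall_days, days_per_month=30):
    """Rescale each item to a 30-day basis, then divide by total monthly spending.

    Assumption (not stated in the text): an amount reported over `recall_days`
    days is converted to a monthly amount by the factor days_per_month / recall_days.
    """
    monthly = {
        item: amount * days_per_month / recall_days[item]
        for item, amount in spending.items()
    }
    total = sum(monthly.values())
    return {item: value / total for item, value in monthly.items()}


# Hypothetical household: cereals reported over 30 days, the rest over 365 days.
spending = {"cereals": 900.0, "school": 3650.0, "durables": 7300.0}
recall_days = {"cereals": 30, "school": 365, "durables": 365}

shares = monthly_budget_shares(spending, recall_days)
for item, share in shares.items():
    print(f"{item}: {share:.3f}")
```

By construction the shares sum to one across the commodity categories of a household, which is why a fractional response model would be the natural estimator for outcomes of this kind.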

2.5. Empirics

The following difference-in-differences specification is used to compare phase 1 and 2 districts to phase 3 districts before and after NREGA is rolled out in its third phase:

$$y_{idt} = \beta_0 + \beta_1 T_t + \beta_{DID}\, NREGA_{dt} + \gamma X_{idt} + \mu_d + \epsilon_{idt} \qquad (2.1)$$

where $y_{idt}$ is the log of the budget share for a particular commodity for household $i$ in district $d$ at time $t$, $T_t$ takes the value 1 for 2011-12 and 0 for 2007-08, and the treatment $NREGA_{dt}$ takes the value 1 if the household belongs to district $d$ where NREGA has been implemented at time $t$. $X_{idt}$ is the set of controls; and $\mu_d$ depicts district fixed effects.

7Around 200 observations are dropped from approximately 90,000 observations.

The disturbance term $\epsilon_{idt}$ summarizes the influence of all other unobserved variables that vary across households, districts, and time. The baseline model is estimated via OLS with fixed effects. Taking the budget share of each commodity would ideally require me to estimate a fractional response model. However, given that I am controlling for 576 districts, a fractional response model with fixed effects becomes infeasible.

While a fixed effects fractional response model is not feasible, I compare the OLS results with those from two alternative estimation approaches. First, I estimate a correlated random effects fractional logit model (section 2.8.1). Second, I estimate the model using an instrumental variable approach where my outcome variable is the logarithm of consumption per month for each commodity category. The log of total consumption per month is then added as an explanatory variable in this model to hold the household budget constraint constant. Note that NREGA could affect consumption decisions by altering the household budget constraint or by affecting bargaining power through guaranteed employment. Given this, controlling for total consumption isolates the bargaining power effect of the programme. Land possessed by the household at the time of the survey is used as an instrument for total consumption in this specification since total consumption is likely endogenous. The details and results of these models are discussed in robustness checks (section 2.8.2).
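To make the logic of the difference-in-differences comparison concrete, the following is a minimal sketch on hypothetical toy data. It is not the estimation used in this paper: it omits the controls, the district fixed effects and the clustered standard errors, and simply computes the two-by-two difference in cell means, which is what the interaction coefficient reduces to in a model with only group and period dummies. Because the outcome is in logs, the sketch also converts the coefficient on a dummy into a percentage change as $100 \cdot \{\exp(\beta) - 1\}$. All names and numbers are illustrative assumptions.

```python
from math import exp
from statistics import mean

# Hypothetical toy data: y is the log budget share of one commodity.
# treated = 1 for late (phase 3) districts, 0 for early (phase 1 and 2) districts;
# post = 1 for 2011-12, 0 for 2007-08.
rows = [
    {"treated": 0, "post": 0, "y": -3.00}, {"treated": 0, "post": 0, "y": -3.10},
    {"treated": 0, "post": 1, "y": -2.95}, {"treated": 0, "post": 1, "y": -3.05},
    {"treated": 1, "post": 0, "y": -3.25}, {"treated": 1, "post": 0, "y": -3.15},
    {"treated": 1, "post": 1, "y": -3.05}, {"treated": 1, "post": 1, "y": -3.15},
]


def cell_mean(data, treated, post):
    """Average outcome in one group-by-period cell."""
    return mean(r["y"] for r in data if r["treated"] == treated and r["post"] == post)


def did_estimate(data):
    """Change over time in late districts minus change over time in early districts."""
    change_treated = cell_mean(data, 1, 1) - cell_mean(data, 1, 0)
    change_control = cell_mean(data, 0, 1) - cell_mean(data, 0, 0)
    return change_treated - change_control


def pct_change(beta):
    """Percentage effect implied by a dummy coefficient in a log-outcome model."""
    return 100.0 * (exp(beta) - 1.0)


beta_did = did_estimate(rows)
effect_pct = pct_change(beta_did)
print(f"DID estimate: {beta_did:.3f}, implied change: {effect_pct:.2f} percent")
```

For small coefficients the percentage change is close to 100 times the coefficient itself, so the distinction matters mainly for larger estimates.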

My coefficient of interest is $\beta_{DID}$, which captures the differential impact of NREGA introduced in phase 3 districts on the budget share of expenditure on the relevant commodity for household $i$ in district $d$.8 $\beta_1$ identifies the effect of any systematic changes that affected households in all districts between 2007-08 and 2011-12. My empirical strategy exploits the phased roll out of NREGA to different districts and compares households in districts that received the programme earlier to districts that received it later. Households in NREGA’s early implementation districts are my control group and late implementation districts are my treatment group. The phased roll-out of NREGA means

8Percentage change in the budget shares due to NREGA is given by $100 \cdot \{\exp(\beta_{DID}) - 1\}$ (see Halvorsen and Palmquist[1980], Thornton and Innes[1989] for further discussion on interpreting dummy variables in semi-logarithmic regressions).

that some districts remained uncovered in 2007-08. Identification therefore relies on changes in household consumption behaviour at the district level when NREGA is introduced in its third phase. Phase 3 of the programme comprised the largest part of the roll-out of NREGA, covering 284 districts of India, making it pertinent to examine. NSS data does not identify which households participated in the programme. Thus, I use all the households in a district and estimate the effect of access to the programme, which is the intent to treat (ITT) effect on consumption patterns. The empirical strategy employed in this paper is closest to the strategy used by Bose[2017] and Imbert and Papp[2015].

A word of caution warranted here is that the roll-out of the programme was not randomly determined. Phase 1 districts are the more ‘backward’ districts. A simple comparison of households from districts that received the programme earlier to those from districts that were covered later is thus biased. To address the concern of any time invariant district level characteristics that may be correlated with the treatment, I include district fixed effects. Time fixed effects control for the time-varying characteristics that impact all districts equally.

A primary concern with this identification strategy is that the districts that received the programme in different phases may be trending differently prior to NREGA. Ideally, two rounds of survey waves prior to the programme would aid in analyzing the pre-trends. However, extensive missing consumption data in the 61st round of the employment-unemployment survey of NSS restricts my analysis of pre-trends in consumption. Survey rounds prior to the 61st round do not conduct the consumption survey as part of the employment-unemployment survey.
Although nothing can be said about the trends in consumption outcomes for the control and treatment districts, other outcomes analyzed in several papers show that the districts that received NREGA in different phases are not trending differentially.9 To alleviate this concern further, I estimate a difference-in-difference-in-difference (DDD) model.

9Using data from 1999-00, Imbert and Papp[2015] show no differential increase in public employment in early districts relative to the late districts prior to NREGA. Similarly, Li and Sehkri[2013] conclude that growth in school enrollment in districts that received the programme in different phases is similar in the pre-treatment periods. Azam[2012] conducts a falsification test using data from 1999-00 and 2004-05 to suggest that overall labor force participation as well as male and female labor force participation in treatment and comparison districts were moving in tandem absent the program.

I introduce a dummy variable $sector$ which takes value 1 if the household belongs to a rural sector and 0 if urban, and modify Eqn. 2.1 to include a triple interaction term given by $sector \times NREGA_{dt}$. The DDD estimate calculates the changes in average consumption in the treatment districts in the rural sector while netting out the change in average consumption in the control districts in the rural sector and the change in average consumption in the treated districts in the urban sector. This methodology helps take care of two potential confounds and ensures that the change in average consumption in the treated districts in the rural sector is not a result of changes in consumption for all districts in the rural areas, nor a result of changes in consumption for all households in the treated districts.

A secondary concern with my strategy would be if NREGA changes the sample through rural to rural migration. However, migration from early implementation districts to late implementation districts is unlikely since rural to rural migration in India is limited. Only about 0.4 percent of the adult population report having migrated to different rural districts for employment (Imbert and Papp[2015]). Additionally, households are required to show proof of residence in the village to obtain job cards that will permit them to work under NREGA, which limits the concern of rural to rural migration to gain work under the scheme.

Another potential shortcoming of the baseline model is that it masks meaningful heterogeneous effects the programme may have across different households. To address this concern, I go beyond the baseline to consider whether the programme effects are amplified in situations where one would expect them to be stronger. First, I analyze whether a higher female employment share in NREGA leads to greater changes in consumption patterns.
Second, I examine whether guaranteed employment leads to greater bargaining power effects in areas where higher minimum wages are provided as part of NREGA. Third, I estimate whether different agricultural processes used for rice and wheat production in the country induce differential treatment effects conditional on the prevailing status of women in such crop areas. Various interactions to control for these heterogeneous treatment effects are used in my model specifications, the details of which are discussed in section 2.7.
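The triple-difference idea described above can be illustrated with a short sketch. This is the textbook version, the rural difference-in-differences minus the urban difference-in-differences, computed from hypothetical cell means; it is offered only as an assumption-laden illustration and may differ in detail from the regression specification estimated in this paper, which also includes controls and fixed effects. All numbers and names are invented for the example.

```python
# Hypothetical cell means of a consumption outcome, keyed by (rural, treated, post).
cell_means = {
    (1, 1, 0): 0.040, (1, 1, 1): 0.052,  # rural sector, treated districts
    (1, 0, 0): 0.045, (1, 0, 1): 0.050,  # rural sector, control districts
    (0, 1, 0): 0.060, (0, 1, 1): 0.064,  # urban sector, treated districts
    (0, 0, 0): 0.062, (0, 0, 1): 0.065,  # urban sector, control districts
}


def did(means, rural):
    """Difference-in-differences within one sector (rural = 1 or urban = 0)."""
    change_treated = means[(rural, 1, 1)] - means[(rural, 1, 0)]
    change_control = means[(rural, 0, 1)] - means[(rural, 0, 0)]
    return change_treated - change_control


def ddd(means):
    """Triple difference: rural DID net of the urban DID."""
    return did(means, 1) - did(means, 0)


ddd_estimate = ddd(cell_means)
print(f"rural DID: {did(cell_means, 1):.3f}")
print(f"urban DID: {did(cell_means, 0):.3f}")
print(f"DDD: {ddd_estimate:.3f}")
```

Since NREGA applies only to rural areas, the urban difference-in-differences serves as a placebo that absorbs district-specific changes over time common to both sectors.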

I also estimate the following model to capture the importance of bargaining power as a mechanism to explain the shifts in the pattern of consumption spending as a result of NREGA.

$$\mathit{DfemheadHH}_{idt} = \beta_0 + \beta_1 \tau_t + \beta_{DID}\, NREGA_d \times \tau_t + \gamma X_{idt} + \mu_d + \epsilon_{idt} \qquad (2.2)$$

where $\mathit{DfemheadHH}_{idt}$ takes the value 1 if household $i$ is headed by a woman in period $t$ in district $d$ and 0 otherwise.10 Note that female headed households will not simply pick up the absence of males in the household since my sample includes households with at least one male and female adult. The marginal effect of access to the programme on the expected probability that the household is female headed is given by the parameter $\beta_{DID}$.11

2.6. Results

Table 2.1 provides results for my baseline analysis where the outcome variable is the log of the budget share for each commodity group. Statistically significant increases of approximately 2.7 percent in the budget share of school expenditure, 2.2 percent in the budget share of durable goods and 0.5 percent in clothing, bedding and footwear are found. At the same time, there is a fall in the budget shares of entertainment; spices, salt and other condiments; meat and milk products; personal commodities; and fuel and light. The share of expenditure on spices and condiments falls by about 1.3 percent, fuel and light by 0.8 percent, milk products and poultry by 0.5 percent and personal commodities by 0.5 percent. The share of spending on entertainment shows a larger decline of 2.3 percent.12

10The model is estimated using a linear probability model (LPM). The merits of the LPM over Probit/Logit models in cases of Limited Dependent Variable (LDV) models are debatable. However, there are some advantages of the LPM despite its shortcomings, as MLE estimates are inconsistent in many cases. Additionally, given that I have fixed effects where I control for 576 districts, a probit specification becomes infeasible.
11This may be an imperfect indication of bargaining power as a self-reported ‘female-headed household’ in the survey may still be a male-headed family. But for purposes of policy and programme implementation, the term female headed household is a practical proxy for a whole range of family structures in which women are the primary providers (Buvinić and Gupta[1997]).
12Note that a systematic missing data problem could potentially bias the estimate for entertainment as the number of observations is much lower. The coefficient should be interpreted with caution. Also note that the total number of observations for each commodity changes due to missing data.

With guaranteed employment increasing women’s contribution to household income, there seems to be a shift towards expenditure on commodities women tend to care more about, such as investment in children’s education, durable goods and other household items like bedding and clothes. Moreover, higher school expenditure suggests a causal effect on children’s education of mothers’ relative control over household resources.13 A rise in the share of school expenditure and a fall in the share of entertainment expenditure make a compelling story for greater female bargaining power as a consequence of NREGA because household welfare-improving commodities are valued more highly by women (Hoddinott and Haddad[1995]). A plausible explanation for an increase in the budget share of durables could be that it reflects purchases designed to replace female chores in the household since women are now actively part of the labour force. This seems consistent with anecdotal references in Mann and Pande[2012] indicating that women exercise independence in spending NREGA wages, suggesting greater decision-making power.

The decline in the budget share of fuel and light is however somewhat surprising.14 There could be two reasons for this. With the majority of the rural population dependent on agriculture, access to fuel relies heavily on common property resources. NREGA, under its environment-conserving initiative, emphasizes natural resource regeneration and promotes a green economy through the creation of sustainable rural assets to reduce reliance on such resources (Mann and Pande[2012]). Moreover, more women engaged in NREGA through the day could potentially imply that fewer household resources are allocated to the use of fuel and light. The decline in the budget share of milk products, egg, fish, and meat could be attributed to NREGA providing impetus to create infrastructure that promotes livestock farming such

13Exposure to awareness programmes at NREGA work-sites may have contributed to parents’ motivation to invest in school expenditure. This could perhaps be a mechanism through which NREGA works regardless of whether the participants are male or female. Thus, we cannot rule out that such programmes could change the preferences of males rather than change the bargaining power of women.
14Household total consumption per month increases due to NREGA but the budget share of fuel and light declines. However, when estimating the treatment effect on the level of consumption expenditure (section 2.8.2) on fuel and light, I find the expenditure to decline while holding total consumption fixed.

49 as poultry, cattle ownership, and small fisheries Ministry of Rural Development[2013]. Ra- jasthan state governments under the initiative promotes individuals from low socio-economic strata to develop their own agricultural land under a sub-scheme called ‘Apana Khet, Apana Kam’.15 Similarly, the Madhya Pradesh government designed schemes that help job card holders build assets like small land, poultry, fisheries, and farm ponds Ministry of Rural De- velopment[2013]. Goods like salt, spices and condiments are typically considered essential goods for rural households and additional income invariably leads to decline in relative ex- penditure on these items.16 Table 2.2. provides the results for the difference-in-difference-in-difference analysis. No- tice that the results are largely similar to the results found in Table 2.1. which provides further evidence that this estimation controls for the potential counfounding elements that may arise from the trends in average consumption in control and treated districts. Note that the zero share cases for consumption cannot be dropped from this sample as they are large in number and would lead to sample selection bias. 17 The sample now consists of urban as well as rural areas, the total number of observations being approximately 195,000. A more sophisticated way to circumvent this problem and include these households in the analysis is to apply the inverse hyperbolic sine transformation to the variable. Inverse hyperbolic √ sine transformation requires simply to transform the variable, say, z as log(z2 + z2 + 1) which unlike log z, is defined even for z = 0. I use the inverse hyperbolic sine transfor- mation to deal with households reporting zero consumption expenditure (Burbidge et al. [1988],Friedline et al.[2015]). Table 2.3., specification (1) shows the marginal effect of NREGA on the probability that a household is female headed. 
It increases marginally by 0.3 percent and is statistically significant at 10 percent lending some support to a bargaining power effect of NREGA on women.

15Translates to ‘my land- my labour’. 16An effect similar to fuel and light is found in the case of spices and condiments as well. 17Unlike the baseline which consisted roughly of only 200 observations out of approximately 90,000.
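The inverse hyperbolic sine transformation described above can be illustrated with a minimal sketch. The expenditure values below are hypothetical and serve only to show that the transformation is defined at zero and behaves like log(2z) for large z; this is not the code used for the estimates in this chapter.

```python
import numpy as np

def ihs(z):
    """Inverse hyperbolic sine: log(z + sqrt(z^2 + 1)), defined for z = 0."""
    z = np.asarray(z, dtype=float)
    return np.log(z + np.sqrt(z**2 + 1.0))

# Hypothetical monthly expenditures, including a zero-expenditure household.
expenditure = np.array([0.0, 1.0, 50.0, 1200.0])
transformed = ihs(expenditure)

# For large z the transformation is close to log(2z), so coefficients can be
# read approximately like those from a log specification.
gap_at_large_z = transformed[-1] - np.log(2.0 * expenditure[-1])
```

The same values are returned by NumPy's built-in `np.arcsinh`, which may be preferable in practice.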

2.7. Sensitivity analysis

2.7.1. Women employment in NREGA jobs

If household consumption behaviour is in fact suggestive of higher female involvement, these effects should be larger where a higher share of women is employed by NREGA. I interact the programme with the variation in the share of women employed at the district level in the two time periods.18 I calculate this heterogeneous effect at two levels of the women employment share, 25 percent and 75 percent, with the idea that districts with a higher share of women employed by NREGA would exhibit these effects more prominently.

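$$y_{idt} = \beta_0 + \beta_1 T_t + \beta_{20}\,\mathit{NREGA}_{dt} + \beta_{21}\,\mathit{NREGA}_{dt} \times \mathit{ShareOfWomenEmployed}_{dt} + \beta_{22}\,\mathit{ShareOfWomenEmployed}_{dt} + \gamma X_{idt} + \mu_d + \epsilon_{idt} \tag{2.3}$$

$$\mathit{DFemheadHH}_{idt} = \beta_0 + \beta_1 T_t + \beta_{20}\,\mathit{NREGA}_{dt} + \beta_{21}\,\mathit{NREGA}_{dt} \times \mathit{ShareOfWomenEmployed}_{dt} + \beta_{22}\,\mathit{ShareOfWomenEmployed}_{dt} + \gamma X_{idt} + \mu_d + \epsilon_{idt} \tag{2.4}$$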

The parameter of interest varies with time t and district d, where the total impact of NREGA is given by $\beta_{20} + \beta_{21}\,\mathit{ShareOfWomenEmployed}_{dt}$. Results in Table 2.4. confirm that households in districts with a greater share of women employed by NREGA shift expenditure towards commodities like school, medical, durables and household items that may maximize general household welfare. Where the share of women in total NREGA employment is 75 percent, the effect of NREGA on the budget share of school expenditure rises to 1.2 percent and on medical expenditure to 1.6 percent, compared with the effects when the share is 25 percent. Similar to the baseline, a statistically significant rise of 1.7 percent and 1 percent is found on durables and clothing, bedding and footwear respectively.19 Table 2.3., specification (2) shows that the probability that a household is female headed increases with NREGA where a higher share of women is employed by the programme, but the impact is not precisely estimated.

18 Share of women-to-total NREGA employment is calculated from total person-days of employment generated by NREGA and women participation rates.
19 However, note that the effects on entertainment and intoxicants do not move in the expected direction for households with a higher share of women employed through NREGA.
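As a worked check of the total impact formula, the sketch below evaluates β20 + β21 × share at the 25 and 75 percent levels using the rounded point estimates reported in Table 2.4 for school expenditure and durable goods. The function name is illustrative, and because the inputs are already rounded to three decimals, other columns of the table may differ from this arithmetic in the last digit.

```python
def total_nrega_effect(beta20, beta21, share):
    """Total impact of NREGA at a given female share of NREGA employment."""
    return beta20 + beta21 * share

# (beta20, beta21) as reported in Table 2.4, rounded to three decimals there.
coefficients = {
    "school": (0.000, 0.016),
    "durables": (-0.005, 0.029),
}

effects = {
    name: {share: round(total_nrega_effect(b20, b21, share), 3)
           for share in (0.25, 0.75)}
    for name, (b20, b21) in coefficients.items()
}
```

The resulting values match the "Marginal Effects of NREGA" rows of Table 2.4 for these two outcomes.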

2.7.2. State minimum wages

Given the wide variation in the state-stipulated minimum wages provided under NREGA across different states, I assert that if the baseline effects are due to bargaining power, the effects should be amplified when NREGA employment pays more. To get at this, I exploit variation in minimum wages to see if the NREGA effects are larger in areas where higher minimum wages are provided. I use the following specifications, introducing an interaction between the treatment and the state-stipulated, median-standardized minimum wage for district d in time period t.20

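$$y_{idt} = \beta_0 + \beta_1 T_t + \beta_{20}\,\mathit{NREGA}_{dt} + \beta_{21}\,\mathit{NREGA}_{dt} \times \mathit{minW}_{dt} + \beta_{22}\,\mathit{minW}_{dt} + \gamma X_{idt} + \mu_d + \epsilon_{idt} \tag{2.5}$$

$$\mathit{DFemheadHH}_{idt} = \beta_0 + \beta_1 T_t + \beta_{20}\,\mathit{NREGA}_{dt} + \beta_{21}\,\mathit{NREGA}_{dt} \times \mathit{minW}_{dt} + \beta_{22}\,\mathit{minW}_{dt} + \gamma X_{idt} + \mu_d + \epsilon_{idt} \tag{2.6}$$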

The parameter of interest varies with time and district, where the total impact of NREGA is given by $\beta_{20} + \beta_{21}\,\mathit{minW}_{dt}$. Table 2.5. provides the results. As before, a noticeable increase of 4.2 percent in the share of school expenditure is estimated for households with higher minimum wages, making a case for women having a greater say in household decisions as they work for higher minimum wages. A rise of 2.1 percent in this share is found for households with lower minimum wages as well, but the magnitude of this impact is lower than the impact evaluated at the maximum limit. The difference between households that receive higher and lower minimum wages is statistically significant. This supports the assertion that if NREGA provides higher bargaining power to women, this bargaining power must be higher where higher minimum wages are provided. NREGA is also found to increase the budget shares of durables, and of clothing, bedding and footwear, for higher minimum wage households. The impact evaluated at the maximum limit shows a statistically significant rise of 1.5 and 2.1 percent in their budget shares respectively

20I create a standardized measure of minimum wages across states by dividing the minimum wages for each state by the median wage for the year in consideration.

whereas these impacts evaluated at the minimum limit do not show statistically significant results. To the extent that women who work for higher minimum wages may care more about durable commodities that help substitute for their chores, as well as clothing and household items, NREGA employment seems to produce a significant shift towards these items. Results also suggest that these households are substituting wheat and wheat products with more nutritious foods like vegetables and fruits. A statistically significant increase of approximately 1.6 percent in their monthly budget share of vegetables and fruits and a decline of 1.3 percent in the budget share of wheat products are found. Households with lower minimum wages show a statistically significant decline in the budget shares of entertainment and personal commodities after the treatment. These impacts, when evaluated at the maximum limit, also show a decline; however, the results are not precisely estimated. Table 2.3., specification (3) shows that the probability that a household is headed by a female increases with NREGA employment at higher minimum wages, but the impact is not precisely estimated.
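The median standardization described in footnote 20 can be sketched as follows. The state labels and the middle wage are hypothetical; only the lowest and highest values (Rs. 82.50 and Rs. 159.40 per day) correspond to the evaluation points reported in Table 2.5. The sketch also assumes that the median is taken across the state minimum wages within a single year, which is one reading of the footnote.

```python
import statistics

def standardize_by_median(min_wages):
    """Divide each state's minimum wage by the median wage across states."""
    median_wage = statistics.median(min_wages.values())
    return {state: wage / median_wage for state, wage in min_wages.items()}

# Hypothetical state minimum wages (Rs. per day) for a single year.
wages = {"State A": 82.50, "State B": 120.00, "State C": 159.40}
standardized = standardize_by_median(wages)
```

Under this standardization a value of 1 corresponds to the median state, so the interaction coefficient measures how the NREGA effect changes as the wage moves away from the median.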

2.7.3. Crop regions

Literature suggests that women have a comparative advantage in rice production relative to wheat farming (Flueckiger [1996], Bardhan [1974]).21 As a result, absent NREGA, bargaining power for women ought to be higher in rice regions. Since the ‘baseline’ level of bargaining power is different in rice regions compared to wheat, the effect of NREGA may differ across the two regions. However, it is not clear a priori where the effect should be larger. The effect may be larger in wheat regions because women’s bargaining power is initially lower. On the other hand, the effect may be larger in rice regions because the additional bargaining power conferred onto women from NREGA may ‘tip-the-scales’ in favor

21“Transplantation of paddy is an exclusively female job in paddy [rice] areas; besides, female labour plays a very important role in weeding, harvesting and threshing of paddy. By contrast, in dry cultivation and even in wheat cultivation, under irrigation, the work involves more muscle power and less tedious, often back-breaking, but delicate, operations...” [Bardhan, 1974, p. 1304]

of women within these households. Thus, while the effects are likely to be heterogeneous across regions, the direction is an empirical question. To estimate these heterogeneous impacts of NREGA, I estimate the following models:

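$$y_{idt} = \beta_0 + \beta_1 T_t + \beta_{20}\,\mathit{NREGA}_{dt} + \beta_{21}\,\mathit{NREGA}_{dt} \times \mathit{DRice}_d + \beta_{23}\,\mathit{NREGA}_{dt} \times \mathit{DBoth}_d + \beta_{24}\,\mathit{DRice}_d + \beta_{25}\,\mathit{DBoth}_d + \gamma X_{idt} + \mu_d + \epsilon_{idt} \tag{2.7}$$

$$\mathit{DFemheadHH}_{idt} = \beta_0 + \beta_1 T_t + \beta_{20}\,\mathit{NREGA}_{dt} + \beta_{21}\,\mathit{NREGA}_{dt} \times \mathit{DRice}_d + \beta_{23}\,\mathit{NREGA}_{dt} \times \mathit{DBoth}_d + \beta_{24}\,\mathit{DRice}_d + \beta_{25}\,\mathit{DBoth}_d + \gamma X_{idt} + \mu_d + \epsilon_{idt} \tag{2.8}$$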

DRice takes the value 1 for districts that belong to rice producing states and 0 otherwise. DBoth takes the value 1 if the districts belong to states producing both rice and wheat and 0 otherwise. Wheat growing districts are those with DRice = 0 and DBoth = 0.22 All the other variables remain the same as in my baseline. The parameter of interest now varies with crop districts; consequently, the impact of NREGA differs with the crop districts considered. Table 2.6. provides the results. At the outset, notice that although the patterns differ slightly across the two regions, most of the impacts are found to be larger for rice regions. Women in rice regions presumably have a greater say in household decisions absent the programme. The introduction of NREGA thus boosts their position further, tipping the balance of power to some extent. In wheat regions, by contrast, women have much lower decision-making power absent the programme, and NREGA alone is therefore insufficient to alter bargaining power fundamentally. Similar to the baseline, a statistically significant increase in the budget share of school expenditure is seen as a result of NREGA in both rice and wheat growing regions, but the impact is larger in rice regions. Other effects found in the rice regions are declines in the budget shares of entertainment and condiments. Rice regions also show a decline in the budget shares of meat and milk and personal commodities but an increase in clothing, footwear and

22 Regions that produce neither rice nor wheat are excluded because nothing can be said about the status of women in regions that grow other crops. Note that this causes the total number of observations to be lower than in the baseline, women employment share and minimum wage models.

household items. This is in line with the evidence provided earlier that NREGA helps households create their own infrastructure for livestock farming, reducing their reliance on purchasing such commodities from the market. The shift from spending on personal commodities towards goods that may increase overall household utility suggests a change in the pattern more in line with the preferences of women. Exposure to NREGA in wheat regions produces a shift from the budget shares of cereals and cereal products, fuel and electricity towards durables. Similar results were noticed in the baseline model, suggesting more resources being spent on durables which substitute for women’s chores in the house. The marginal impact of NREGA on the expected probability that a household is female headed is found to rise for the rice regions, but the impact is not statistically significant (Table 2.3., specification (4)). No such impact is seen for the wheat regions.

2.8. Robustness checks

2.8.1. Fractional logit estimation with correlated random effects

2.8.1.1. Baseline model

Given that my outcome variables are in the form of monthly budget shares spent on each commodity, a fractional logit model is more suitable for estimation. However, fractional logit is infeasible with fixed effects. I therefore estimate my model via a correlated random effects (CRE) fractional logit model. The advantage of using CRE fractional logit is that it places some structure on the nature of correlation between the unobserved effects and the covariates. To capture the district fixed effects, means of all controls at district level across time are included as additional controls in the estimation. All standard errors are clustered at the district level.23 The point estimates from Appendix table B.2. suggest that the results

23As additional robustness checks, I estimate the baseline via OLS without fixed effects and compare the results with a fractional logit without fixed effects. The results are generally similar and available upon request.

are robust. The estimations show similar results in terms of statistical significance and the magnitude of impact as the baseline.
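The correlated random effects approach described above can be sketched in two steps: append district-level means of the controls (a Mundlak-style device) and then fit a fractional logit by Bernoulli quasi-maximum likelihood. The sketch below uses simulated data with hypothetical names and coefficients, a single control, and a hand-written Newton solver; it does not compute clustered standard errors and is not the estimation code behind Appendix Table B.2.

```python
import numpy as np

def add_district_means(x, district):
    """Append to each row the district-level mean of every column of x."""
    x = np.asarray(x, dtype=float)
    _, idx = np.unique(district, return_inverse=True)
    sums = np.zeros((idx.max() + 1, x.shape[1]))
    np.add.at(sums, idx, x)                      # per-district column sums
    counts = np.bincount(idx)[:, None]
    return np.hstack([x, (sums / counts)[idx]])

def fit_fractional_logit(X, y, max_iter=100, tol=1e-10):
    """Bernoulli quasi-MLE for outcomes in [0, 1], solved by Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hessian, score)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated data: 40 hypothetical districts with 50 households each.
rng = np.random.default_rng(0)
district = np.repeat(np.arange(40), 50)
n = district.size
alpha = rng.normal(size=40)                                # unobserved district effect
x = (alpha[district] + rng.normal(size=n))[:, None]        # control correlated with alpha
treated = (rng.random(n) < 0.5).astype(float)              # stand-in for the NREGA term
index = -1.0 + 0.5 * treated + 0.3 * x[:, 0] + 0.5 * alpha[district]
share = np.clip(1.0 / (1.0 + np.exp(-index)) + rng.normal(scale=0.05, size=n), 0.0, 1.0)

# Columns: constant, treatment, control, district mean of the control.
X = np.column_stack([np.ones(n), treated, add_district_means(x, district)])
beta_hat = fit_fractional_logit(X, share)
```

The district-mean column plays the role that the district fixed effects play in the linear specification: it absorbs the part of the unobserved district effect that is correlated with the controls.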

2.8.1.2. Heterogeneous effects

I follow the same procedure and re-estimate a CRE fractional logit model to examine the heterogeneous impacts of NREGA (Appendix Tables B.3., B.4., and B.5.). For all three models capturing the heterogeneous impacts of NREGA, the marginal effects of NREGA are found to be broadly consistent with their baseline results. One surprising result is that the marginal effect of NREGA evaluated at the maximum of the stipulated minimum wages has a negative impact on school expenditure. However, this effect is imprecisely estimated. For the crop regions, the marginal effect of the treatment in rice regions is found to be higher than in wheat regions. The pattern of spending shifts in the wheat areas as well, but NREGA seems to be insufficient to change the balance of power in these households.

2.8.2. Consumption in levels

2.8.2.1. Baseline model

I alter my estimation by changing the outcome variable to the log of monthly consumption of each commodity. As a control for this model, I include the log of total monthly consumption of the household since the outcomes are no longer in the form of budget shares. However, total consumption is likely endogenous since it is the sum of consumption expenditures on each commodity. Using an instrumental variable approach therefore, I instrument total monthly consumption by land possessed by the household at the date of the survey to circumvent this problem. This serves as a valid instrument because land possessed makes up the assets held at the time of the survey and does not directly impact the monthly expenditure on each commodity. Theory suggests that monthly expenditures on commodities

are out of current earned income rather than out of household assets or wealth.24 Table B.6. provides the results. Several diagnostic tests have been performed to assess the efficiency and reliability of the model. The endogeneity test reports test statistics that are robust to various violations of conditional homoskedasticity. I reject exogeneity of the log of total consumption for most specifications.25 As far as underidentification is concerned, I report chi-squared p-values for the test, where rejection of the null implies full rank and identification Baum and Schaffer [2007]. This test tells us whether the excluded instrument is correlated with the endogenous regressor. In all the specifications, the p-value based on the Kleibergen-Paap rk LM statistic allows me to clearly reject the null that the instrument is uncorrelated with the endogenous regressor and that the model is underidentified. I also report the Cragg-Donald (1993) Wald F statistic. Rejection of the null here indicates the absence of a weak-instrument problem. The F-statistics are well above 10 across all estimations, indicating that none of the specifications suffers from a weak instrument problem. Since all the specifications have standard errors clustered at the district level, the reported test statistic is based on the Kleibergen-Paap rk statistic, which also indicates the absence of a weak instrument problem. Point estimates show that the results found for this model are broadly consistent with the baseline results. There is a 20 percent increase in expenditure on school and an approximately 18.7 percent rise in expenditure on durables. Household expenditure on spices and condiments has reduced by about 10 percent and on fuel and light by 4.8 percent. Expenditure on entertainment shows a large decline of 18 percent. The pattern of spending is thus consistent with commodities that women prefer and suggests a bargaining power effect.

24 Although, land could affect school expenditure to some extent since land requires work and missing work would factor into the opportunity cost of expenditure related to school. Moreover, it cannot be disregarded that land possessed could also possibly be correlated with commodities like meat, poultry as well as milk which require land for production.
25 Under conditional homoskedasticity, this endogeneity test is numerically equal to a Hausman test statistic.
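The logic of the instrumental variable strategy above can be illustrated with a minimal two-stage least squares sketch on simulated data. All variable names, coefficients and the data-generating process are hypothetical: `land` stands in for land possessed, `log_total` for the log of total consumption, and `log_item` for the log of expenditure on one commodity. The first-stage F statistic computed here assumes homoskedastic errors and is a simpler stand-in for the Kleibergen-Paap statistic reported in the text.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(1)
n = 20_000
land = rng.normal(size=n)                                # hypothetical instrument
shock = rng.normal(size=n)                               # unobserved demand shock
log_total = 1.0 * land + 0.8 * shock + rng.normal(size=n)  # endogenous regressor
log_item = 2.0 + 0.5 * log_total + shock                 # true coefficient is 0.5

const = np.ones(n)

# Naive OLS is biased upward because log_total is correlated with the shock.
beta_ols = ols(np.column_stack([const, log_total]), log_item)

# First stage: regress the endogenous regressor on the instrument.
Z = np.column_stack([const, land])
pi_hat = ols(Z, log_total)
fitted = Z @ pi_hat
resid = log_total - fitted
sigma2 = resid @ resid / (n - Z.shape[1])
var_pi = sigma2 * np.linalg.inv(Z.T @ Z)
first_stage_F = pi_hat[1] ** 2 / var_pi[1, 1]            # t^2 with one instrument

# Second stage: replace the endogenous regressor with its fitted values.
beta_iv = ols(np.column_stack([const, fitted]), log_item)
```

In this simulation the second-stage slope is close to the true value while the OLS slope is not; second-stage standard errors from a manual two-step procedure are not valid and would need the usual 2SLS correction.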

2.8.2.2. Heterogeneous effects

Results are found to be robust and the patterns of spending similar to the baseline when I estimate the impact of NREGA with variation in the share of women employed using the IV approach (Table B.7.). All specifications perform well on the diagnostic tests. Similarly, results consistent with the baseline are found for the model with state-stipulated minimum wages (Table B.8.). The programme effects for different crop regions are also found to yield results that are similar to the baseline crop regions model (Table B.9.).

2.9. Conclusion

This paper evaluates the world’s largest public works programme, NREGA, with an attempt to marry the literature on welfare programmes with the literature on intra-household resource allocation decisions. Such welfare programmes, despite their long-standing history, have been subject to constant debate regarding their necessity and efficacy. However, the enormous scope of NREGA ensured its highest-ever allocation of INR 480 billion in the government’s financial budget for 2017-18 India Union Budget [2017]. More importantly, NREGA generated approximately 2.35 billion total person-days of employment in 2015-16, of which approximately 55 percent were by women Ministry of Rural Development [2016]. Given this background, it is imperative to evaluate the impact of the programme. The paper addresses how the consumption patterns in rural households change as a result of NREGA and whether these effects are suggestive of higher bargaining power for women. I provide empirical evidence that an employment guarantee programme such as this leads to an apparent shift in the pattern of household consumption behaviour towards goods mostly preferred by women, consistent with a bargaining power effect of the programme. I estimate the causal impact of the phase-wise roll-out of NREGA on the pattern of monthly household consumption expenditure using two rounds of nationally representative survey data. Households belonging to phase 3 are in richer and more developed districts in general but, to my knowledge, the causal impacts of phase 3 of the programme on the pattern of consumption expenditure have not been studied. NREGA having any sort of impact on

backward districts, those covered in phases 1 and 2, seems like an expected conclusion, but any evidence of bargaining power shifts through changes in consumption patterns found for the rich districts speaks to the effectiveness of the programme even in the richer areas. One of the key policy-relevant impacts found is that NREGA increases the household monthly budget share of school expenditure by approximately 2.7 percent. This has important policy implications for developing countries considering employment schemes. I find that, in general, expenditure on durables and on clothing, bedding and footwear increases while the expenditure on entertainment declines. The results potentially imply that households in the more developed rural districts are now switching to purchases that substitute for women’s chores. Importantly, the effects documented are stronger where one would expect, lending further credence to the interpretation that NREGA is at least partially affecting consumption patterns via changes in female bargaining power. Specifically, the effects are larger in areas with a greater share of female participants, a higher minimum wage and specialization in rice production.

Table 2.1. Impact of NREGA on expenditure shares - DID

Outcome (budget share)   NREGA        Std. error   N
Cereals                  -0.001       (0.002)      80234
Pulses                   -0.002       (0.003)      79427
Edible Oil               -0.004       (0.002)      79610
Fuel & Light             -0.008***    (0.002)      79738
Intoxicants              -0.003       (0.004)      57931
Entertainment            -0.023***    (0.004)      37019
Veg & Fruits             0.001        (0.003)      80157
Condiments               -0.013***    (0.003)      80248
Meat & Milk              -0.005**     (0.003)      78466
Medical Expd             0.001        (0.007)      63887
School Expd              0.027***     (0.006)      52018
Personal                 -0.005*      (0.003)      80159
Clothing & bedding       0.005*       (0.003)      80082
Durable Goods            0.022***     (0.006)      79628

Other Controls: Yes in all specifications. District Fixed Effects: Yes in all specifications.

Notes: * p<0.10, ** p<0.05, *** p<0.01. Estimation is via OLS approach. The sample is restricted to include households with at least one adult female and male member. Dependent variables are in the form of budget shares spent on 14 separate commodity categories out of the total monthly spending by a household in a district at a particular point in time. Additional controls included in each specification - district fixed effects, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion. Standard errors are clustered at district level and reported in parentheses. Sample is restricted to include only households with at least 1 male and female adult member who have school going children for the model where outcome is school expenditure.

Table 2.2. Impact of NREGA on expenditure shares - DDD

Outcome (budget share)   NREGA               Sector              NREGA*Sector        Impact of NREGA on rural sector   N
Cereals                  0.001 (0.002)       0.015*** (0.001)    -0.003** (0.001)    -0.002 (0.002)                    182532
Pulses                   -0.005 (0.004)      0.006*** (0.002)    -0.002 (0.002)      -0.007** (0.003)                  181699
Edible Oil               -0.007*** (0.002)   0.004*** (0.001)    0.000 (0.001)       -0.007** (0.002)                  181956
Fuel & Light             -0.011*** (0.002)   -0.010*** (0.001)   0.004** (0.001)     -0.007** (0.002)                  182056
Intoxicants              0.062*** (0.010)    -0.015*** (0.004)   -0.067*** (0.005)   -0.005 (0.010)                    127809
Entertainment            -0.111*** (0.013)   -0.025*** (0.006)   0.051*** (0.008)    -0.060*** (0.012)                 152429
Veg & Fruits             0.002 (0.002)       -0.001 (0.001)      -0.002 (0.001)      0.000 (0.002)                     182400
Condiments               -0.018*** (0.003)   0.000 (0.001)       0.004** (0.002)     -0.013*** (0.003)                 182556
Meat & Milk              0.011*** (0.003)    0.003* (0.001)      -0.011*** (0.002)   0.001 (0.003)                     180667
Medical Expd             -0.004 (0.010)      -0.011*** (0.004)   0.003 (0.005)       -0.002 (0.010)                    156793
School Expd              0.070*** (0.007)    -0.048*** (0.004)   -0.049*** (0.005)   0.020** (0.009)                   164254
Personal                 -0.010*** (0.003)   -0.007*** (0.002)   0.003 (0.002)       -0.007** (0.002)                  182304
Clothing & bedding       0.008*** (0.002)    0.011*** (0.001)    -0.006*** (0.001)   0.003 (0.002)                     182374
Durable Goods            0.016*** (0.005)    -0.001 (0.002)      0.001 (0.003)       0.016** (0.005)                   181215

Other Controls: Yes in all specifications. District Fixed Effects: Yes in all specifications.

Notes: * p<0.10, ** p<0.05, *** p<0.01. Standard errors in parentheses next to each estimate. Estimation is via OLS approach. The sample is restricted to include households with at least one adult female and male member. Dependent variables are in the form of budget shares spent on 14 separate commodity categories out of the total monthly spending by a household in a district at a particular point in time. Additional controls included in each specification - Dummy for sector (rural=1, urban=2), interaction between sector and NREGA treatment (triple difference), district fixed effects, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion. Standard errors are clustered at district level. Sample is restricted to include only households with at least 1 male and female adult member who have school going children for the model where outcome is school expenditure.

Table 2.3. Impact of NREGA on probability that household is female headed

Variables                                              (1)        (2)        (3)        (4)
NREGA                                                  0.003*     -0.055     -0.030     -0.020
                                                       (0.002)    (0.094)    (0.025)    (0.013)
NREGA*female share of NREGA employment                            0.060
                                                                  (0.018)
NREGA*minW                                                                   0.032
                                                                             (0.026)
NREGA*Rice                                                                              0.052**
                                                                                        (0.023)
NREGA*Both                                                                              -0.011
                                                                                        (0.020)
H0: Female share of NREGA employment = 25%                        p=0.732
H0: Female share of NREGA employment = 75%                        p=0.786
H0: NREGA+NREGA*minW (at Rs. 82.50 per day) = 0                              p=0.367
H0: NREGA+NREGA*minW (at Rs. 159.40 per day) = 0                             p=0.349
H0: NREGA+NREGA*Rice = 0                                                                p=0.118
H0: NREGA+NREGA*Both = 0                                                                p=0.063
N                                                      80279      78471      80279      38164

Notes: * p<0.10, ** p<0.05, *** p<0.01. Estimation is via OLS approach. The sample is restricted to include households with at least one adult female and male member. The dependent variable in each specification is a binary variable indicating whether the household is headed by a female or not - takes value 1 if it is headed by a female and 0 otherwise. Specification (1) pertains to the baseline model. Specification (2) pertains to the model including ratio of women to total employment through NREGA jobs at district level. Specification (3) pertains to the model including state stipulated minimum wages. Specification (4) pertains to the model including rice producing areas, wheat producing areas and those that produce both. Controls included in specification (1) - district fixed effects, log (total consumption), household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion. Additional control included in specification (2) compared to (1) is share of women to total employment through NREGA jobs. Additional control included in specification (3) compared to (1) is state minimum wages. Additional controls included in specification (4) compared to (1) are dummy for rice producing areas and dummy for areas producing both rice and wheat. Standard errors are clustered at district level and reported in parentheses. A small fraction of households are female headed as compared to the total number of households - approximately 8% of households for the full sample and approximately 9% for the crop regions sample.

Table 2.4. Heterogeneous Impacts of NREGA on Expenditure Shares: Female Share of NREGA Employment

Outcome (budget share)   NREGA               NREGA*female share   Marginal effect,     Marginal effect,     N
                                             of NREGA employment  female share = 25%   female share = 75%
Cereals                  0.039*** (0.003)    -0.057*** (0.005)    0.025*** (0.003)     -0.004 (0.004)       78436
Pulses                   0.008* (0.004)      -0.017*** (0.006)    0.003 (0.004)        -0.006 (0.004)       77787
Edible Oil               0.014*** (0.003)    -0.031*** (0.004)    0.006** (0.003)      -0.010*** (0.003)    77919
Fuel & Light             0.006** (0.003)     -0.034*** (0.004)    -0.002 (0.002)       -0.019*** (0.003)    77971
Intoxicants              -0.066*** (0.006)   0.119*** (0.009)     -0.036*** (0.005)    0.024*** (0.006)     56617
Entertainment            -0.039*** (0.006)   0.029*** (0.006)     -0.031*** (0.005)    -0.017*** (0.005)    36164
Veg & Fruits             0.010*** (0.003)    -0.016*** (0.004)    0.006** (0.003)      -0.002*** (0.003)    78356
Condiments               -0.026*** (0.003)   0.018*** (0.005)     -0.021*** (0.003)    -0.013*** (0.003)    78448
Meat & Milk              -0.037*** (0.005)   0.017*** (0.007)     -0.033*** (0.004)    -0.024*** (0.005)    76716
Medical Expd             -0.045*** (0.007)   0.081*** (0.010)     -0.024*** (0.006)    0.016** (0.007)      62531
School Expd              0.000* (0.009)      0.016* (0.010)       0.004** (0.008)      0.012* (0.010)       65037
Personal                 -0.021*** (0.003)   0.025*** (0.005)     -0.015*** (0.003)    -0.002 (0.003)       78366
Clothing & bedding       0.003 (0.003)       0.010** (0.005)      0.005** (0.002)      0.010*** (0.003)     78287
Durable Goods            -0.005 (0.006)      0.029*** (0.006)     0.002 (0.005)        0.017*** (0.006)     77885

Other Controls: Yes in all specifications. District Fixed Effects: Yes in all specifications.

Notes: * p<0.10, ** p<0.05, *** p<0.01. Standard errors in parentheses next to each estimate. Estimation is via OLS approach. The sample is restricted to include households with at least one adult female and male member. Dependent variables are in the form of budget shares spent on 14 separate commodity categories out of the total monthly spending by a household in a district at a particular point in time. Additional controls included in each specification - district fixed effects, women to total employment ratio in NREGA jobs, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion. Standard errors are clustered at district level. Sample is restricted to households with at least 1 male and female adult who have school going children for the model where outcome is school expenditure.

Table 2.5. Heterogeneous Impacts of NREGA on Expenditure Shares: State Stipulated Minimum Wages

Outcome (budget share)   NREGA               NREGA*minW          Marginal effect, minimum    Marginal effect, minimum     N
                                                                 wage = Rs. 82.50 per day    wage = Rs. 159.40 per day
Cereals                  0.020** (0.009)     -0.021** (0.008)    0.003 (0.003)               -0.013** (0.005)             80234
Pulses                   -0.024** (0.011)    0.022** (0.010)     -0.006 (0.004)              0.011* (0.006)               79427
Edible Oil               -0.027*** (0.009)   0.023*** (0.009)    -0.007*** (0.003)           0.011* (0.006)               79610
Fuel & Light             -0.021** (0.009)    0.013 (0.009)       -0.010 (0.003)              -0.010*** (0.006)            79738
Intoxicants              -0.008 (0.015)      0.005 (0.015)       -0.004 (0.005)              -0.001 (0.010)               57931
Entertainment            -0.056*** (0.015)   0.033** (0.013)     -0.028*** (0.006)           -0.003 (0.008)               37019
Veg & Fruits             -0.026** (0.010)    0.026** (0.010)     -0.004 (0.003)              0.016** (0.007)              80157
Condiments               -0.042*** (0.010)   0.030*** (0.010)    -0.017*** (0.003)           0.006 (0.007)                80248
Meat & Milk              0.000 (0.010)       -0.005 (0.009)      -0.005 (0.003)              -0.009 (0.006)               78466
Medical Expd             0.012 (0.025)       -0.010 (0.024)      0.004 (0.008)               -0.003 (0.015)               63887
School Expd              0.009 (0.024)       0.019 (0.023)       0.021** (0.010)             0.042** (0.018)              52018
Personal                 -0.007 (0.011)      0.001 (0.010)       -0.006* (0.003)             -0.004 (0.006)               80159
Clothing & bedding       -0.024** (0.010)    0.028*** (0.010)    -0.001 (0.003)              0.021*** (0.007)             80082
Durable Goods            0.028 (0.020)       -0.008 (0.019)      0.021 (0.007)               0.015*** (0.012)             79628

Other Controls: Yes in all specifications. District Fixed Effects: Yes in all specifications.

Notes: * p<0.10, ** p<0.05, *** p<0.01. Standard errors in parentheses next to each estimate. Estimation is via OLS approach. The sample is restricted to include households with at least one adult female and male member. Dependent variables are in the form of budget shares spent on 14 separate commodity categories out of the total monthly spending by a household in a district at a particular point in time. Additional controls included in each specification - district fixed effects, minimum wages, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion. Standard errors are clustered at district level. Sample is restricted to households with at least 1 male and female adult who have school going children for the model where outcome is school expenditure.

Table 2.6. Heterogeneous Impacts of NREGA on Expenditure Shares: Crop Regions

Each row reports a separate regression with the budget share of the named category as the dependent variable. Standard errors are in parentheses.

Coefficient estimates
Budget share        NREGA              NREGA*Rice         NREGA*Both
Cereals             -0.013*** (0.005)  0.020*** (0.006)   0.000 (0.009)
Pulses              -0.001 (0.005)     -0.006 (0.008)     -0.007 (0.008)
Edible Oil          -0.003 (0.004)     -0.004 (0.005)     0.010** (0.005)
Fuel & Light        -0.013*** (0.004)  0.011* (0.005)     0.002 (0.005)
Intoxicants         -0.005 (0.007)     -0.002 (0.010)     0.038*** (0.013)
Entertainment       -0.012 (0.007)     -0.011 (0.009)     -0.017 (0.023)
Veg & Fruits        -0.001 (0.005)     -0.008 (0.006)     0.011* (0.007)
Condiments          0.001 (0.005)      -0.037*** (0.007)  0.001 (0.010)
Meat & Milk         0.001 (0.004)      -0.018*** (0.006)  0.002 (0.007)
Medical Expd        0.007 (0.015)      -0.022 (0.018)     0.044 (0.029)
School Expd         0.035*** (0.012)   0.004 (0.014)      -0.027* (0.016)
Personal            -0.007 (0.006)     -0.007 (0.007)     -0.003 (0.010)
Clothing & bedding  0.000 (0.008)      0.012 (0.009)      0.009 (0.010)
Durable Goods       0.031** (0.014)    -0.018 (0.015)     -0.020 (0.021)

Marginal Effects of NREGA
Budget share        Wheat Regions      Rice Regions       Regions producing both  N
Cereals             -0.013*** (0.005)  0.007 (0.005)      -0.013 (0.005)          38141
Pulses              -0.001 (0.005)     -0.007 (0.007)     -0.008 (0.006)          37536
Edible Oil          -0.003 (0.004)     -0.007 (0.005)     0.007* (0.005)          37722
Fuel & Light        -0.013*** (0.004)  -0.002 (0.005)     -0.011** (0.004)        37866
Intoxicants         -0.005 (0.007)     -0.007 (0.008)     0.034** (0.010)         28802
Entertainment       -0.012 (0.007)     -0.023** (0.009)   -0.029 (0.025)          16616
Veg & Fruits        -0.001 (0.005)     -0.008* (0.005)    0.010* (0.006)          38103
Condiments          0.001 (0.005)      -0.037*** (0.007)  0.002 (0.010)           38146
Meat & Milk         0.001 (0.004)      -0.017** (0.006)   0.003 (0.007)           37211
Medical Expd        0.007 (0.015)      -0.015 (0.015)     0.051* (0.020)          30407
School Expd         0.035*** (0.012)   0.039*** (0.011)   0.008 (0.014)           25213
Personal            -0.007 (0.006)     -0.014** (0.005)   -0.010 (0.008)          38112
Clothing & bedding  0.000 (0.008)      0.012** (0.006)    0.009 (0.008)           38081
Durable Goods       0.031** (0.014)    0.013 (0.010)      0.011 (0.018)           37814

Other Controls: Yes in all specifications. District Fixed Effects: Yes in all specifications.

Notes: * p<0.10, ** p<0.05, *** p<0.01. Estimation is via OLS. The sample is restricted to households with at least one adult female and one adult male member. The sample is further restricted to regions that are rice producing, wheat producing, or that produce both rice and wheat. DRice=1 for rice regions. If DRice=0, then DBoth is also equal to zero. Dependent variables are budget shares spent on 14 separate commodity categories out of the total monthly spending by a household in a district at a particular point in time. Additional controls included in each specification: district fixed effects, dummy for rice region, dummy for regions that produce both rice and wheat, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion. Standard errors are clustered at the district level and reported in parentheses. For the model where the outcome is school expenditure, the sample is restricted to households with at least one male and one female adult member who have school-going children.

Chapter 3

THE EFFECT OF QUALITY OF EDUCATION ON CRIME: EVIDENCE FROM COLOMBIA1

3.1. Introduction

World Health Organization reports suggest that 500,000 people are murdered around the world [World Health Organization, 2014]. Besides homicides, men and women are exposed to violence in some form or other, be it at home, at school, or on the streets, given its prevalence worldwide. According to the WHO, violence is preventable and its impact can be reduced, but the efforts made so far have not been enough to tackle it effectively. Krug et al. [2002] assert that this might be the result of an absence of sound decision-making, reduced feasibility of policy options, or lack of determination. Regardless of its causes, since violence is considered a form of crime, the actions taken to address it mainly involve investing in more police and army. In 1996, the World Health Organization (WHO) declared violence “...as a major and growing public health problem across the world” [Krug et al., 2002, p. XIX]. Treating violence as a public health problem instead may help in addressing it through investment in other kinds of policy interventions, such as better education systems and socio-economic conditions. As such, education policy may be a tool that countries use not only to contribute to the development of human capital but also to reduce violence and its impacts.

Education may affect violence through different channels. First, education may increase expectations of being absorbed into the labor market and of future returns, discouraging engagement in criminal activities. This is what is called the ‘opportunity cost effect’ of education. Second, investments in education may generate environments that are less violent,

1With Andres Giraldo, Southern Methodist University and Pontificia Universidad Javeriana

as well as promote social and political stability. It is a way in which the government may positively affect social development. In this sense, education may have what we refer to as a ‘pacifying effect’. Third, education may even be used as a means of indoctrination in regions with a strong presence of politics or religion. Strong ideological differences on account of political ideas could plausibly be fueled by education and lead to conflict between parties as well as against the government machinery. We call this the ‘indoctrination effect’. As a fourth channel, improvements in the quality of education will likely also raise enrollment and years of education in a country over time, which in turn has a direct impact on violence levels.2

The relationship between education and violence has garnered significant interest from researchers and policy-makers over the last few decades. In general, a violent environment is found to hurt economic development in the long run and to affect the human capital investment decisions of households [Rodríguez et al., 2009]. Determining the optimal public and private policies required to combat violence, specifically crime, is thus of utmost importance in such environments [Becker, 1968]. Apart from greater expenditure on defense, police, and an efficient judicial system, research suggests that these policies could also be extended to include expenditure on areas that generate better socio-economic conditions. Improving local educational systems is a primary way to achieve this goal [Lochner, 2004, 2010a,b].

The extant literature in this field focuses on the relationship between violence and quantity of education, measured by educational enrollment or attainment. Much less is known about the impact of quality of education on violence. Recent debates, however, emphasize the importance of looking at education quality rather than quantity as a reliable indicator of economic impacts for a country. The number of years a student stays in school may not be an adequate measure of a good education system or even of student achievement. Measures of individual cognitive skills that incorporate dimensions of test-score performance are found to provide better indicators of economic outcomes [Hanushek, 2005, Hanushek et al., 2016,

2 In this paper we explore neither the indoctrination effect nor the fourth channel. For a discussion of how school quality and school choice affect educational attainment and, in turn, crime, see Lochner [2010b].

Hanushek et al., 2017]. In line with this, we assert that assessing the impact of education quality is essential for researchers to understand the existence and persistence of violence and conflicts. Moreover, from a policy perspective, investment in better quality education may be a tool of social mobility and long-run development for the country. When students learn more in school, they become more skilled and effective participants in the country’s workforce. Over the long run, successful efforts to improve school quality would thus imply an extraordinary rate of return. Thus, quantity of education without quality may not matter. This paper therefore attempts to analyze the causal impact of quality of education on violence, specifically on different types of violent crimes and on the presence of conflict.

One limitation of the literature that tries to evaluate the impact of education on crime and conflict is the lack of an identification strategy that overcomes the traditional endogeneity problem [Barakat and Urdal, 2009, Collier et al., 2004, Hegre et al., 2009, Melander, 2005, Shayo, 2007]. Moreover, the existing papers are cross-country analyses, which increases the probability of omitted variable bias, as data and institutions are less comparable across countries at aggregate levels. Our paper, on the other hand, addresses the endogeneity issues and exploits geographic and time variation at a disaggregated level to study this relationship.

We examine the ‘opportunity cost’ and the ‘pacifying’ effects using Colombia as a case study.3 Our empirical analysis is at the municipality level and spans a period of six years from 2007 to 2013. We use results from a mandatory standardized examination for students at the last level of high school as the measure of quality of education. Test scores as a measure of quality are subject to selection issues, as they are conditional on taking the exam. Therefore, we correct for the self-selection problem in test scores to minimize measurement error in our estimates.

3 Although religion is an important aspect of Colombian society, given its Hispanic roots, Colombia is not considered a country in which religion is used as a way of indoctrinating people. In fact, religious education may be a source of social cohesion and stability, though we do not explore this channel in the paper.

We follow an instrumental variable approach for our estimation since education quality is endogenous. Quality of education depends on the funds allocated by the central government for education to each municipality. However, this allocation is likely endogenous as well, given that there are unobservables associated with the allocation process that could be correlated with violence in municipalities. We construct two instruments to address this endogeneity problem. The first is a spatial instrument constructed from spatially lagged transfers of funds from the center to the municipalities. More specifically, it is based on central government transfers to neighboring municipalities for investment in quality of education. The second instrument is based on a shift-share approach, which exploits variation in the size of the central budget but is not a function of current allocation decisions. We take the investment in education quality by the central government in municipalities in the year 2001 as fixed and multiply it by the yearly total central government budget for 2007 to 2012 to arrive at the investment figures.4 We use a one-period lag of both instruments, recognizing that investment in education may have a lagged effect. Both instruments are in per capita terms.

Our main findings show that quality of education has a significant and negative impact on crime at an aggregate level, as well as on more disaggregated measures of crime such as property crime and violent crime. More specifically, these include crimes like car theft, total kidnappings and non-political kidnappings. We categorize all types of theft as crimes on property (or property crimes), and the results point towards an opportunity cost effect of education quality. Violent crimes, on the other hand, include kidnappings and homicides. Our results are perhaps suggestive of a pacifying effect of better education quality in this case. Finally, we analyze the impact education quality has on the presence of illegal armed groups in municipalities as an additional outcome. Better quality of education in municipalities is found to reduce the probability of presence of such groups. This corroborates our results suggesting

4The year 2001 for investment allocation decision was considered due to data limitations. This is the only year for which central government investment data was available before our period of analysis.

a pacifying effect of better education quality, as it points to a general state of peace and stability. The results are robust to sample restrictions such as the exclusion of state capitals, of municipalities with fewer than 200,000 inhabitants, and of urban areas.

The rest of the paper is organized as follows. Section 3.2 presents a review of related literature on education and violence, which is followed by a brief background on violence in Colombia in Section 3.3. Section 3.4 consists of five subsections. Subsection 3.4.1 describes how we construct our data, followed by subsection 3.4.2 describing selection issues and subsection 3.4.3 describing our estimation strategy. Subsection 3.4.4 gives a detailed account of our identification strategy and subsection 3.4.5 provides the institutional framework for central government allocation of funds in Colombia. Section 3.5 discusses the baseline results, followed by robustness checks in Section 3.7. The paper ends with a conclusion and policy discussion in Section 3.8.

3.2. Literature

A considerable number of macro-level and cross-national studies exploring the correlation between levels of education and conflict find that countries with higher average levels of education have a lower risk of experiencing conflict. Most of the evidence focuses on education levels measured by some variant of secondary education enrollment or years of education. In particular, it is found that young male populations are more likely to increase the risk of conflict in societies where secondary education is low, especially in low and middle income countries. Increasing secondary male enrollment and the average schooling of the population thus reduces the risk of civil war and conflict [Collier et al., 2004, Melander, 2005, Shayo, 2007, Barakat and Urdal, 2009, Hegre et al., 2009].

Single-country papers studying the causal impacts of education levels on terrorism and on religious and ethno-communal violence find ambiguous results. Urdal [2008] suggests that literacy has no causal impact on armed conflict risk and a slightly positive effect on political violence. Mancini [2005] finds that, on average, inter-ethnic educational inequality is generally lower in peaceful districts of Indonesia. Krueger and Malečková [2003] show that

terrorists have slightly better average education than the population in general in Gaza. Other papers exploring the relation between quantity of education and violence for single countries are Berrebi [2007], Humphreys and Weinstein [2008] and Oyefusi [2008], but these papers report correlations. Buonanno and Leonida [2009] find a negative impact of education on crime using region fixed effects, year fixed effects and region-specific time trends together with an extensive set of control variables, in an attempt to address the endogeneity inherent in the relationship between education and crime.

Another strand of literature focuses on educational policies and violence. Brown [2011], in a theoretical paper, examines the ways in which education policies affect the dynamics of violent conflict. Moretti [2005] argues that reductions in violence and property crime are caused by increased schooling, although education increases the returns to white collar crime more than the returns to work. Lochner [2004] finds that arrest rates for white collar crimes increase when education levels rise. Rodríguez et al. [2009] explore in-prison behavior in Argentina to assess the effect of educational programs on violence and find that such programs significantly reduce property damage in prison.

Lochner [2010b], in his review of empirical work, recognizes that both school quality and the type of school students attend are important for determining the impact of quantity of education on crime. However, there are no studies estimating a direct impact of school quality on crime. Some causal papers investigate the impact of school choice on student outcomes, including delinquency and crime [Cullen et al., 2006, Deming, 2011, Guryan, 2004]. These point to the fact that school quality has an impact on enrollment and, through this channel, reduces crime.

Lastly, some papers investigate the reverse relationship, that is, the causal effect of violence on education and labor market outcomes. Rodríguez and Sanchez [2012] estimate the causal effect of armed conflict exposure on school drop-out and labor decisions of Colombian children and find that conflict affects children older than 11, inducing them to drop out of school and enter the labor market too early. Barrera and Ibáñez [2004] develop a dynamic theoretical model of the relationship between violence and education investments.

They find that violence affects the utility of households directly and modifies the consumption of education and the rates of return to education, thereby changing investment in education.

3.3. Background

The relationship between education and violence is of special interest in Colombia since the country has suffered a long-standing conflict. Following the assassination of the presidential candidate in 1948, Colombia was engulfed in a violent civil war known as La Violencia. Civil conflict among the main political parties in rural areas eventually ended with a political agreement known as Frente Nacional, under which the two parties agreed to alternate power as a sign of peace. Interpreted as a discriminatory policy by some factions of the liberal party, this motivated the creation of two left-wing guerrilla groups, FARC and ELN, that are still active today. The 1970s marked the onset of the drug phenomenon, which resulted in acute violence across the country and extended to the urban areas as well. According to the United Nations Office on Drugs and Crime (UNODC), Colombia was one of the most violent countries in the 1990s as measured by homicide rates. Although homicide rates have decreased significantly, it remains a country with severe levels of violence even today.

3.4. Data and Identification

3.4.1. Data

Our data for this analysis is taken from four different sources. First, we use municipality level panel data constructed by the Studies Center of Economic Development (CEDE by its acronym in Spanish). The panel contains information on 1122 municipalities and around 2000 variables from the last two decades. It consists of 5 sub-panels: general characteristics, land and agriculture, fiscal policy, conflict and violence and education.5 Second, we use the Colombian Institute for Evaluation of Education (acronym ICFES in Spanish) database for test scores at individual level within the municipalities. Third, we use the census information

5The CEDE collects information from different public and private institutions and is publicly available.

from the National Administrative Department of Statistics (acronym DANE in Spanish) administered by the Minnesota Population Center, University of Minnesota, IPUMS International [Minnesota Population Center, 2015]. The IPUMS sample contains information for approximately 4 million individuals, and the census was conducted between May 2005 and February 2006. Fourth, we use data from the National Planning Department (DNP) for information on investment in educational quality. Our final constructed data is at the municipality level and spans the years 2007 to 2013.

Our main outcome variables are different forms of crime in a particular municipality at a given point in time. These are homicides, kidnappings, and thefts. Theft is further divided into theft on persons, car theft, commerce theft and household theft. Kidnapping is divided into total, political and non-political kidnappings. Homicides are defined as the number of people killed. Kidnapping is defined as the abduction or illegal transportation of a person, and political kidnapping is a kidnapping committed by an illegal armed group.6 For ease of understanding and analysis, we first generate a measure of intensity of crime, which is the sum of all crime rates. We then group our crime measure into two categories: property crimes and violent crimes. We construct a measure of intensity of property crimes, which includes the different theft rates. Similarly, we create a measure of intensity of violent crimes, which includes non-political and political kidnappings and homicides. We also use all the disaggregated rates of crime discussed above as outcome variables. Throughout the analysis, each crime rate is the number of crimes divided by total population and multiplied by 100,000, that is, crimes per 100,000 inhabitants.

Another outcome of interest in our paper is the presence of illegal armed groups in a municipality at a given point in time. Presence of illegal armed groups is a dummy variable which takes the value 1 if either FARC, ELN or both are present in the municipality. This outcome is of special interest because it captures the impact of education quality on violence associated solely with conflict in Colombia.

6 Political kidnapping is perpetrated by guerrillas and paramilitaries, and non-political kidnapping by common delinquents, narco-traffickers and others.
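The construction of the crime rates and indices described above can be illustrated with a minimal sketch. The variable names and the counts below are hypothetical and serve only to show the arithmetic (rates per 100,000 inhabitants, summed into a property index, a violent index, and an overall index); they are not taken from the CEDE panel.

```python
# Illustrative only: category names and counts are hypothetical, not the CEDE variables.
PROPERTY = ("theft_persons", "car_theft", "commerce_theft", "household_theft")
VIOLENT = ("non_political_kidnapping", "political_kidnapping", "homicide")


def rate_per_100k(count, population):
    """Number of crimes per 100,000 inhabitants."""
    return count / population * 100_000


def crime_indices(counts, population):
    """Return disaggregated rates and the property, violent and overall indices."""
    rates = {name: rate_per_100k(n, population) for name, n in counts.items()}
    property_index = sum(rates[name] for name in PROPERTY)
    violent_index = sum(rates[name] for name in VIOLENT)
    return rates, property_index, violent_index, property_index + violent_index


# One hypothetical municipality-year with 50,000 inhabitants.
counts = {
    "theft_persons": 30, "car_theft": 5, "commerce_theft": 10, "household_theft": 15,
    "non_political_kidnapping": 1, "political_kidnapping": 1, "homicide": 8,
}
rates, prop_idx, viol_idx, total_idx = crime_indices(counts, population=50_000)
```

Total kidnappings are not added separately to the violent index in this sketch, since they are the sum of the political and non-political categories and would otherwise be counted twice.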

Our main variable of interest is quality of education at the municipality level, for which we consider student test scores on a standardized examination at the last level of high school. ICFES provides individual standardized test scores for mathematics, language, social sciences, philosophy, biology, chemistry, and physics. We construct a municipality-level measure of test scores that accounts for selection into the examination. Our preferred measure of quality is an average of the selection-corrected median scores in the subjects combined. We also consider test scores in only mathematics and language to ascertain performance in terms of cognitive ability, as well as in social sciences and philosophy to examine performance in the social area. These measures are an average of the selection-corrected median scores in mathematics and language, and in social sciences and philosophy, respectively. Additional measures of quality explored in this paper are the average z-score index of the selection-corrected median in seven subjects, the average individual total score, and, separately, the median scores in mathematics, language, social sciences, and philosophy.

Our control variables include a linear time trend; demographic and economic municipality-level controls such as total population, birth rate, infant mortality rate, a rurality index of the municipality as an indicator of inequality and development, and agricultural yield7; projected population to attend primary and secondary school, as measures of quantity of education or enrollment; and fiscal characteristics such as per-capita municipality expenditures and tax revenue, as measures of economic growth. Table C.1 summarizes the variables used in our analysis.8

7 Agricultural yield is the ratio of agricultural cultivation to agricultural production for all crops at municipality level.

8 General characteristics of the municipality (notaries, banks, churches, health centers, clinics, tax collection offices, electricity coverage), historical characteristics (history of violence, Spanish occupation of municipality, presence of indigenous population, presence of land conflict, presence of illegal crops, armed groups), geographical characteristics (area of municipality in squared km., height of municipality in squared km., linear distance to state capital in squared km) and the distribution of land and land owners in the municipality are not included, as we estimate a fixed effects model and these are time-invariant characteristics of the municipalities.

For illustration purposes, Figures (C.1)-(C.4) show the distribution of the crime rate9 and the average score in subjects across the country in 2007 and 2013. The correlation between the crime rate and the average score in subjects is 0.2359 in 2007 and 0.2360 in 2013. The initial positive correlation apparent from the figures is intriguing and speaks to the importance of further analyzing the causal link between quality of education and violence in Colombia.

3.4.2. Selection Issues

A potential issue with using test scores as a measure of education quality is that test scores suffer from self-selection. Since test scores are conditional on staying in school until grade 11 and taking the standardized exam, they do not represent the true quality of education in the municipality and would lead to measurement error in our estimates. We correct for this self-selection by using data from the 2005 IPUMS Census to estimate dropout rates at the municipality level. Not every municipality is separately identified in the IPUMS Census sample: IPUMS aggregates the municipalities with populations below 20,000 into one category for every state. To arrive at the final municipality-level dropout rates, we make two assumptions. First, we assume that the dropout rate is the same for every municipality that falls under the aggregated IPUMS category. We believe this is a valid assumption since these are smaller municipalities with similar population characteristics. Second, we assume that municipality-level dropout rates do not change significantly over time.

To estimate the dropout rates, we use the probability weights provided in the census data and calculate the total population in each state in 2005 for the age category of 16-18 years. This is the age group at which most students take the high school examination in Colombia.10 We then calculate the population of 16-18 year olds who never attended school, were not attending school in 2005, or had studied up to middle school but had not completed schooling in 2005. This gives the total number of dropouts for each municipality. Dividing the

9 We measure crime rate as the sum of the individual crime rates included in the analysis.

10 ICFES data shows that approximately 77% of the population that took this examination belonged to this age category in 2007.

total dropouts by the total population in this age group for each municipality gives us the weighted dropout rates for 2005.

Using individual-level test scores from ICFES, we arrive at the median score at the municipality level. Our aim is to treat the dropouts as students who would have scored below this median. We impute zeros for those who belong to the dropout category and then take the median score for each municipality; the particular value imputed is irrelevant as long as dropouts would have scored below the median. The assumption behind this imputation is that students who did not appear for the exam or dropped out would have scored below the median. This yields the selection-corrected median test score, which is our measure of education quality at the municipality level.11
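The two steps above (a weighted dropout rate, then a median taken after imputing zeros for dropouts) can be sketched as follows. This is a minimal illustration under an assumption that is ours and not stated in the text: the number of imputed dropouts is derived from the number of test takers as takers × rate / (1 − rate), that is, test takers are treated as the non-dropouts. Function names and numbers are hypothetical.

```python
from statistics import median


def weighted_dropout_rate(weights, is_dropout):
    """Share of 16-18 year olds classified as dropouts, using census probability weights."""
    total = sum(weights)
    dropped = sum(w for w, d in zip(weights, is_dropout) if d)
    return dropped / total


def selection_corrected_median(scores, dropout_rate):
    """Median test score after adding one zero score per imputed dropout.

    Assumption (ours): test takers are the non-dropouts, so the number of
    dropouts is takers * rate / (1 - rate), rounded to the nearest integer.
    """
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must lie in [0, 1)")
    n_dropouts = round(len(scores) * dropout_rate / (1 - dropout_rate))
    return median([0.0] * n_dropouts + list(scores))


# Hypothetical census records: person weights and dropout indicators.
rate = weighted_dropout_rate(weights=[10, 10, 20, 10],
                             is_dropout=[True, False, False, True])

# Hypothetical test scores for one municipality with a 20% dropout rate.
raw = median([40, 50, 60, 70])
corrected = selection_corrected_median([40, 50, 60, 70], dropout_rate=0.2)
```

In this toy case the imputation adds one zero to four observed scores, so the corrected median falls below the raw median; any imputed value below the median would give the same result.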

3.4.3. Estimation

We estimate the following model to identify the causal impact of quality of education on violence and crime measures

$$Y_{mt} = \beta_0 + \beta_1 \text{EducationQuality}_{mt} + \beta_2 X_{mt} + \mu_m + \text{trend}_t + \varepsilon_{mt} \tag{3.1}$$

where $Y_{mt}$ is first taken as the index of crimes in municipality $m$ at time period $t$, which is the sum of all individual crimes, then as the index of only property crimes, and finally as the index of only violent crimes in municipality $m$ at time period $t$. This is followed by a disaggregated analysis in which eight separate rates of crime are taken in municipality $m$ at time period $t$; $\text{EducationQuality}_{mt}$ is the municipality-level measure of test scores explained in the previous section; $X_{mt}$ is the set of covariates; $\mu_m$ are the municipality fixed effects; $\text{trend}_t$ captures the time trend of the outcome variable; and $\varepsilon_{mt}$ is the mean zero error term. Education quality is instrumented by two instruments, given its endogeneity. The instruments are discussed in the next subsection. The parameter of interest is $\beta_1$, giving

11Note that in the database, there are some missing values for the municipality of residence. We impute the municipality of residence with the municipality where the students took the examination. For 2007 there were 198, 2008: 207, 2009: 223, 2010: 535, 2011: 1171, 2012: 95 missing values in ICFES database.

us the causal impact of education quality on violence. We also estimate another model to identify the causal impact of quality of education on the presence of illegal armed groups:

$$\text{Presence}_{mt} = \beta_0 + \beta_{p,1} \text{EducationQuality}_{mt} + \beta_2 X_{mt} + \mu_m + \text{trend}_t + \varepsilon_{mt} \tag{3.2}$$

where the outcome variable $\text{Presence}_{mt} = 1$ if any of the illegal armed groups (ELN or FARC) is present in municipality $m$ at time $t$. We also decompose this outcome variable and estimate the model separately for presence of FARC and presence of ELN. Equation 3.2 is estimated by a correlated random effects (CRE) probit model. The advantage of the CRE probit model is that it places some structure on the nature of the correlation between the unobserved effects and the covariates. In order to capture the municipality fixed effects, we include the means of all the controls at the municipality level across time as additional controls in the model. We use instruments here as well to deal with the endogeneity of education quality, thereby estimating a CRE IV-probit model.12
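The step of adding municipality-level means of the controls (often called the Mundlak device) can be sketched as below. This is an illustrative sketch only: the function name, the record layout and the single control are hypothetical, and the probit estimation itself is not shown.

```python
from collections import defaultdict


def add_mundlak_means(rows, group_key, controls):
    """Return copies of rows with each control's within-group time mean appended."""
    sums = defaultdict(lambda: defaultdict(float))
    n_obs = defaultdict(int)
    for row in rows:
        g = row[group_key]
        n_obs[g] += 1
        for c in controls:
            sums[g][c] += row[c]

    augmented = []
    for row in rows:
        g = row[group_key]
        new_row = dict(row)  # leave the input records untouched
        for c in controls:
            new_row[c + "_mean"] = sums[g][c] / n_obs[g]
        augmented.append(new_row)
    return augmented


# Hypothetical two-municipality, two-year panel with one control.
panel = [
    {"municipality": "A", "year": 2007, "population": 100.0},
    {"municipality": "A", "year": 2008, "population": 110.0},
    {"municipality": "B", "year": 2007, "population": 40.0},
    {"municipality": "B", "year": 2008, "population": 60.0},
]
augmented = add_mundlak_means(panel, "municipality", ["population"])
```

The appended `*_mean` columns would then enter the probit alongside the time-varying controls, standing in for the municipality effects that a nonlinear model cannot difference out.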

3.4.4. Identification Issues

With reverse causality present from violence to education, a simple Ordinary Least Squares (OLS) estimation of our baseline model is not likely to yield unbiased or consistent estimates of the impact of education quality on violence measures. Moreover, education quality is likely endogenous even otherwise, since test scores are a noisy proxy for true education quality. We therefore employ an instrumental variable approach to identify the causal impact of education quality on violence. We use two instruments in our model.

Our first instrument is constructed from data on central government transfers to municipalities for investment in quality of education. Quality of education depends on the central government’s allocation of funds to municipalities. The transfer of funds for investment in quality of education to every municipality is based on three criteria, namely, the population projected

12 We run a linear probability model for this as well and find a negative impact of education quality on the likelihood that an illegal armed group is present in the municipality, but the estimates are not statistically different from zero and thus may be imprecisely estimated.

to attend school in the municipality, the population that attended school in the municipality, and a measure of equality between municipalities.13 Given this, transfers directly assigned to a municipality are likely endogenous, since there could be unobservables associated with this allocation process that are correlated with violence in the municipality. Thus, we do not use the central transfers made directly to municipality m, as these may affect violence in municipality m directly and would violate the exclusion restriction required for a valid instrument. Instead, our instrument is based on transfers allocated to the neighboring municipalities. The central government has a fixed budget for education in a state and distributes it among the municipalities within the state. Funds allocated to the neighboring municipalities thus affect the funds allocated to municipality m, which in turn affect the quality of education in m. We believe that such investments do not have a contemporaneous correlation with test scores. Additionally, such investments have a gestation period and take time to have an impact. Moreover, in constructing our instrument, we exclude the neighboring municipalities that share a common border with m, because government funds to these neighbors may still affect violence in m due to easy mobility across shared borders. To avoid such spillovers, we exclude the first ring of neighbors. Our instrument is therefore the average of the funds for investment in quality of education allocated to the neighboring municipalities of m, excluding the first ring of neighbors, in time period t − 1.

The second instrument is also based on central government investment in education quality in municipality m; however, it is constructed using a ‘shift-share’ formula. We take 2001 as the base year for investment in education and calculate the share of government

14 funds allocated to each municipality in 2001, sm,2001. This is municipality specific and

time invariant. The shift-share of investment is calculated by multiplying the share sm,2001 by the total central government budget in years 2007 to 2013. We use one period lag of the shift-share of investment as the instrument given the belief that education quality has a

13Details on the institutional framework are provided in the next section.
14The year 2001 is used because of data availability.

lagged impact on violence and crime. The instrument is posited to be exogenous because the shares of funds are fixed at their 2001 values, making them time invariant and unlikely to be correlated with violence or crime today. It can be argued that violence in Colombia is persistent, which could invalidate the exogeneity restriction. However, during this decade, violence at the aggregate level has been on a declining trend. Thus, the fixed share of government investment in education quality in 2001 is unlikely to influence or be influenced by violence rates today.15 Since our instruments are predictors of educational quality, for which we use student test scores, all our instruments are expressed in per capita terms.
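The construction of the two instruments can be illustrated with a minimal sketch. It is not the code used for the estimates: the function names, the toy adjacency map and all numbers are hypothetical, and two assumptions are made that the text does not state explicitly, namely that "excluding the first ring" means averaging over the second ring only (municipalities exactly two borders away from m) and that the per capita scaling uses population in t − 1.

```python
from statistics import mean

def second_ring(adjacency, m):
    """Municipalities two borders away from m (first ring and m itself removed)."""
    first = set(adjacency.get(m, ()))
    reachable = set()
    for n in first:
        reachable.update(adjacency.get(n, ()))
    return reachable - first - {m}

def spatial_instrument(adjacency, funds_pc, m, t):
    """Mean per capita quality funds in t - 1 over the second ring of m."""
    values = [funds_pc[(n, t - 1)]
              for n in sorted(second_ring(adjacency, m))
              if (n, t - 1) in funds_pc]
    return mean(values) if values else None

def shift_share_instrument(share_2001, budget, population, m, t):
    """Lagged per capita shift-share: s_{m,2001} * budget_{t-1} / population_{m,t-1}."""
    return share_2001[m] * budget[t - 1] / population[(m, t - 1)]

# Hypothetical toy data: four municipalities on a line, A - B - C - D.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
funds_pc = {("A", 2007): 20.0, ("B", 2007): 40.0,
            ("C", 2007): 10.0, ("D", 2007): 30.0}
share_2001 = {"A": 0.25, "B": 0.75}
budget = {2007: 1000.0}
population = {("A", 2007): 50, ("B", 2007): 100}

# For municipality A in 2008 the spatial instrument uses only C (B is first ring).
z_spatial = spatial_instrument(adjacency, funds_pc, "A", 2008)
z_shift = shift_share_instrument(share_2001, budget, population, "A", 2008)
print(z_spatial, z_shift)
```

Because the share is fixed at its 2001 value, all time variation in the shift-share instrument comes from the national budget and population, which is the source of the exogeneity argument made above.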

3.4.5. Institutional Framework

The political constitution of 1991 required the central government in Colombia to provide resources to states, special districts and municipalities with the aim of encouraging decentralization.16 The fraction of transfers to states and special districts was called Situado Fiscal (SF) and the fraction to municipalities was called municipal participation (PM by its acronym in Spanish). The SF and PM resources were calculated as a fraction of the current national revenue (ICN by its acronym in Spanish). Resources constituting the SF were to be spent on education and health, whereas the PM was to be spent on health, education, potable water, physical education, recreation, sports and investment. After the 1999 crisis, the initial system of allocation was reformed: the SF was eliminated and replaced by a General System of Participation (SGP by its acronym in Spanish). The resources allocated were to be invested in education (58.5%), health (24.5%) and general purposes (17%) in the states and municipalities. The criteria for transfers were extended over time to include population that attended school; population projected to attend school; equality

15One concern that could arise here is that, even though violence is on a declining trend, if there is positive serial correlation in the violence measures over time, then central government allocations in 2001 may still be correlated with violence today. However, in our analysis, we cluster the standard errors at the municipality level, which accounts for serial correlation in the idiosyncratic error term [Drukker, 2003, Wooldridge, 2010]. Moreover, we find no serial correlation in most of the violence measures from 2007 to 2013, except for homicide rates and rates of household theft.
16An excellent summary of how fiscal decentralization works in Colombia may be found in Bonet et al. [2014]. This section is mainly based on that document.

and administrative efficiency for health; and relative poverty, rural and urban population, and fiscal and administrative efficiency for general purposes. This system underwent further reform in 2007. The new law covered investment in education, health and general purposes as well as potable water and basic sanitation. The shares of resources to be allocated changed to 58.5% for education, 24.5% for health, 11.6% for potable water and basic sanitation and 5.4% for general purposes.17 By 2012, the SGP represented 4% of GDP, 30% of the ICN and 15.7% of total public expenditure [Bonet et al., 2014]. The share of education in the ICN changed from 23.17% in 2002 to 16.61% in 2012. This sector receives the largest portion of the national transfers. The reform in 2007 sought to include quantity and quality criteria in education. The main goal was to increase coverage to 100% of the territory and to improve scores on the standardized test that is used in this paper.18

3.5. Results

3.5.1. Crime Rate

Our measure of education quality is the average of the selection-corrected median scores in all subjects (see section 3.4.1). Table (3.1) shows the effects of test scores on the index of crime rate. Columns 1 to 5 present the OLS estimates of equation (3.1), and column 6 presents the reduced form of the same equation, with the instruments in place of our measure of education quality. The first column shows the simple correlation without controls, fixed effects, or trend. As shown in Figures (C.1-C.4), the correlation is positive. However, when we include both fixed effects and trend, the effect becomes negative. When we include the demographic and economic controls, the impact remains negative and significant. When the measures of quantity of education and the variables that capture economic growth are also included, the effect of test scores on crime rate is consistently

17The Congress and the central government follow a strategy to control and monitor how the resources are invested under this reform.
18We do not discuss whether the quality goal has been achieved.

negative and significant. This suggests that quality is more important than quantity of education. Finally, the reduced form in column 6 shows that the impact of the spatial instrument is negative, as expected. Column 7 shows the result of the instrumental variable estimation, which addresses the identification issues. Our model fares well on all specification tests. We report the p-value of the Kleibergen-Paap rk LM statistic for the underidentification test. The null here is that the model is underidentified, and we are able to safely reject the null for all six specifications, implying that our instruments are relevant and correlated with the endogenous regressor. We also report the Kleibergen-Paap F statistic for the weak-identification test. The F statistic is well above 10 across all specifications, suggesting the absence of a weak-instrument problem. Since we use two instruments, we report the Hansen J statistic for overidentification of our model. The null here is that the instruments are jointly valid, and we do not reject the null in our specifications (see Baum et al., 2007b). The point estimate from Table (3.2) suggests that a one standard deviation increase in the average median score leads to a decline of approximately 5.9 standard deviations in the crime index.
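The relation between the OLS, reduced-form and IV columns can be illustrated on simulated data. The sketch below is only an illustration under simplifying assumptions: it uses a single instrument, no controls or fixed effects, and a homoskedastic first-stage F statistic, whereas the estimates in this chapter use two instruments and cluster-robust Kleibergen-Paap statistics. All variable names and parameter values are hypothetical. With one instrument, the IV estimate equals the reduced-form coefficient divided by the first-stage coefficient.

```python
import random

def avg(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = avg(a), avg(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

def simulate(n=20000, beta=-0.5, seed=0):
    """x is endogenous: the unobserved u raises both x and y."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)                 # instrument, unrelated to u
        ui = rng.gauss(0, 1)                 # unobserved confounder
        xi = zi + ui + rng.gauss(0, 1)       # first stage: x depends on z and u
        yi = beta * xi + 2.0 * ui + rng.gauss(0, 1)
        z.append(zi); x.append(xi); y.append(yi)
    return z, x, y

z, x, y = simulate()

ols = cov(x, y) / cov(x, x)              # biased upward by the confounder
first_stage = cov(z, x) / cov(z, z)      # effect of the instrument on x
reduced_form = cov(z, y) / cov(z, z)     # effect of the instrument on y
iv = reduced_form / first_stage          # Wald / just-identified IV estimate

# Homoskedastic first-stage F statistic (squared t statistic of the instrument).
n = len(z)
mz, mx = avg(z), avg(x)
resid = [xi - mx - first_stage * (zi - mz) for zi, xi in zip(z, x)]
s2 = sum(r * r for r in resid) / (n - 2)
se = (s2 / (cov(z, z) * (n - 1))) ** 0.5
f_stat = (first_stage / se) ** 2

print(round(ols, 2), round(iv, 2), round(f_stat, 1))
```

In this simulation the OLS slope is positive although the true effect is negative, which mirrors the sign change between the naive correlation and the IV estimate discussed above.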

3.5.2. Property Crimes

Table (3.2) reports the impact of test scores on the crime index described in subsection 3.5.1, the index of property crime, and the index of violent crime. Our models fare well on all specification tests. The point estimate from Table (3.2) suggests that a one standard deviation increase in average median test scores leads to a decline of approximately 6.2 standard deviations in the index of economic crime. In accordance with this result, when we look at the disaggregated crime measures in Table (3.3), we find that test scores lead to a statistically significant decline in the rate of car theft. An increase in average median test scores in all subjects by one standard deviation results in a decline of 6.4 standard deviations in the rate of car theft. These results support the assertion that better quality

of education has an ‘opportunity cost effect’ on such property crimes. Better performance in the school-exit examination opens up better potential opportunities for students in the labor market, increasing their opportunity cost of engaging in criminal behavior.
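The estimates above are stated in standard deviation units. The following sketch, which uses hypothetical numbers and a simple bivariate OLS regression rather than the IV specification of this chapter, shows how a raw slope is converted to a standardized one: regressing z-scores on z-scores gives the raw slope multiplied by the ratio of the standard deviation of the regressor to that of the outcome.

```python
from statistics import mean, pstdev

def slope(x, y):
    """OLS slope of y on x in a bivariate regression with intercept."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

def zscore(v):
    """Rescale a variable to mean zero and unit standard deviation."""
    m, s = mean(v), pstdev(v)
    return [(a - m) / s for a in v]

# Hypothetical municipality-level data: test scores and a crime rate.
scores = [40.0, 45.0, 50.0, 55.0, 60.0, 65.0]
crime = [9.0, 8.5, 7.0, 7.5, 5.0, 4.0]

raw = slope(scores, crime)                    # crime units per test-score point
std = slope(zscore(scores), zscore(crime))    # SD of crime per SD of test scores
converted = raw * pstdev(scores) / pstdev(crime)

print(round(raw, 3), round(std, 3))
```

In the bivariate OLS case the standardized slope equals the correlation coefficient and is therefore bounded by one in absolute value. No such bound applies to IV estimates, which is why standardized IV coefficients can exceed one in magnitude.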

3.5.3. Violent Crimes

Column 3 of Table (3.2) shows the effect of test scores on the index of violent crimes, and columns 5-8 of Table (3.3) show disaggregated violent crimes: total, political, and non-political kidnapping rates, and homicide rates, respectively. As before, our models do well on the specification tests and our instruments are valid and strong. If the effect of education quality on these measures is negative, one could assert that a ‘pacifying effect’ of education is in play. The estimated impact of test scores on the z-score index of violent crime is positive; however, the effect is not precisely estimated (column 3, Table 3.2). Upon disaggregation, we find a statistically significant and negative impact of test scores on total kidnapping rates, as shown in column 5 of Table (3.3). An increase in average median test scores in all subjects by one standard deviation results in a decline of approximately 3.3 standard deviations in the total kidnapping rate and 4.6 standard deviations in the non-political kidnapping rate.

3.5.4. Conflict

Results for model (3.2) in Table (3.4) suggest that better quality of education lowers the likelihood of the presence of illegal armed groups in the municipalities. From the marginal effects, a one standard deviation increase in test scores leads to a decline in the probability that FARC is present in the municipality by 1.1 percent, and that either FARC or ELN is present by approximately 1 percent. These marginal effects are statistically significant.19

19We find that better education quality reduces the presence of coca crops, but the marginal effect is not significant.

Based on the models estimated above, we assert that the results are indicative of a ‘pacifying effect’, since we find a decline in the likelihood of the presence of illegal armed groups.

3.6. Transmission Channel

The results presented above indicate that both the ‘opportunity cost’ and the ‘pacifying’ effects explain the impact of quality of education on crime. To confirm whether the mechanism behind the negative effect of education on property crime is the ‘opportunity cost’ effect, we estimate a model similar to equation (3.1), but with an outcome variable that captures development. This is done to tease out the effect of quality of education from that of quantity. Better educational attainment is found to be correlated with higher economic growth or development. Recent literature has shown that one way to measure economic growth is through satellite data on night lights. The advantage of using light intensity is that the measure of economic activity can cover even areas that are typically difficult to access. Additionally, lights data have high spatial resolution and can be used at sub-national levels as well [Henderson et al., 2012, Donaldson and Storeygard, 2016]. Table (3.5) presents the results. We find that quality of education remains the more important factor in explaining the positive impact on development. In particular, the point estimate shows that a one standard deviation increase in test scores increases the per capita mean of light intensity by approximately 2 standard deviations. This result is consistent with development deterring involvement in criminal activities.

3.7. Robustness

3.7.1. Sub-Sample Analysis

We carry out sub-sample analyses to check the robustness of our models (see Appendix). First, we exclude Bogota from the full sample (Tables C.2 and C.3) as well as all capital

cities from the full sample (Tables C.4 and C.5), and the results are broadly similar to the baseline results in terms of the direction and magnitude of the impact. The results are statistically significant and the models perform well on all diagnostic tests. Second, we restrict our sample to municipalities with population less than 200,000 to give some indication of how results change for smaller and more rural areas (Tables C.6 and C.7). The results are robust and remain similar to those of the benchmark model. Lastly, we explore the rural-urban divide and select municipalities in which the rural population is more than half of the total municipality population to evaluate the effect for rural areas (Tables C.8 and C.9). We find statistically significant results similar to our benchmark case, although at a disaggregated level only non-political kidnappings exhibit significant results. However, when we restrict our sample to urban areas, where the rural population is less than half of the total, we find effects opposite to those found in rural areas. Specifically, test scores affect the index of violent crime and the homicide rate negatively. These results are statistically significant (Tables C.10 and C.11), although the models do not perform well on the underidentification test. This suggests that rural areas may be driving most of our baseline results.
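The sub-samples described above can be written as simple filters on municipality characteristics. The sketch below is illustrative only: the records, field names and values are hypothetical, and the thresholds are the ones stated in the text (population below 200,000; rural share above or below one half).

```python
# Hypothetical municipality records; names and values are for illustration only.
municipalities = [
    {"name": "Bogota", "population": 7_000_000, "rural_share": 0.01, "capital": True},
    {"name": "Capital B", "population": 400_000, "rural_share": 0.10, "capital": True},
    {"name": "Town C", "population": 150_000, "rural_share": 0.30, "capital": False},
    {"name": "Village D", "population": 20_000, "rural_share": 0.80, "capital": False},
]

# One predicate per robustness sub-sample.
SUBSAMPLES = {
    "no_bogota": lambda r: r["name"] != "Bogota",
    "no_capitals": lambda r: not r["capital"],
    "small": lambda r: r["population"] < 200_000,
    "rural": lambda r: r["rural_share"] > 0.5,
    "urban": lambda r: r["rural_share"] < 0.5,
}

def select(rows, rule):
    """Names of the municipalities kept under a given sub-sample rule."""
    keep = SUBSAMPLES[rule]
    return [r["name"] for r in rows if keep(r)]

for rule in SUBSAMPLES:
    print(rule, select(municipalities, rule))
```

Each sub-sample would then be passed to the same estimation routine as the full sample, so that differences in the estimates reflect only the change in the sample.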

3.7.2. Other Government Transfers

We run our baseline model using two different instruments for education quality, spatial as well as shift-share, based on central government transfers to municipalities for other purposes rather than investment in education quality. These transfers to municipalities are for education, health, food and general purposes (Tables C.12 and C.13). As in the baseline, we find that our models do well on the specification tests. The coefficients for the aggregated measures of crime maintain the expected sign, but only property crime is statistically significant. At disaggregated levels, the results remain the same but are not precisely estimated. This suggests that the transfers from central government for other purposes are good predictors of education quality but do not have a statistically significant

impact on our outcome variables. This perhaps implies that our baseline model does in fact capture the impact on violence, through test scores, of transfers from the central government made specifically for the purpose of improving quality of education. The same effects are not found through other kinds of transfers from the central government to the municipalities. Further, we include these transfers to municipalities for other purposes as an additional regressor in our models (Tables C.14 and C.15). We then replace per capita total expenditures by per capita total transfers (Tables C.16 and C.17). Finally, we instrument this variable by constructing spatial and shift-share instruments based on central government transfers for other purposes to neighboring municipalities (Tables C.18 and C.19). We instrument both education quality and central government transfers to municipalities for other purposes. We estimate this model to check that our results are not merely capturing state presence in terms of transfers of funds to municipalities. Our instruments become weak or invalid in most of the specifications. In those specifications in which the instruments are valid (Table C.14, column 3; Table C.15, columns 2-8), the results remain the same as the baseline. However, we find that the variable capturing transfers from the center to each municipality for general purposes has no economically meaningful impact on crime and violence measures. The coefficient associated with this regressor is close to zero. The effect of test scores changes slightly in magnitude, but its sign remains broadly consistent with the baseline. This suggests that we are perhaps capturing the effect of education quality and not just state presence in general.

3.7.3. Other Measures of Education Quality

We carry out our analysis using other measures of education quality to check whether our results change from the baseline. The other measures used are the average of median selection-corrected test scores in specific subjects, namely mathematics and language (capturing the cognitive ability of students) and philosophy and social sciences (capturing the social area), and the original test scores in the exam provided by ICFES without correcting for self-selection (Tables C.20-C.25).

We find our results to be robust when we consider the aggregated measure of crime and the measures of education quality used are the median scores in cognitive subjects, social areas, and the total score. The models perform well on the specification tests, and the instruments are valid and relevant based on the underidentification, weak-identification and overidentification tests. Our results are similar to those of the baseline model. However, when we use disaggregated levels of crime, the models perform well only when we use the average score in social areas, and the results are similar to those found in the baseline. The signs of the coefficients are as expected and the statistical significance remains robust.

3.8. Conclusion

This paper attempts to understand whether the inherent assumptions about the trade-offs associated with education, work and involvement in violent or criminal activity do in fact exist [Lochner, 2004]. Theoretically, education quality can have ambiguous impacts on crime. Better quality of education may have an opportunity cost effect that reduces the incentives to engage in criminal activities due to higher future labor market returns, or a pacifying effect on crime as a result of more political and social stability. Better education quality may even lead to organized violence or sometimes indoctrination of political ideas on account of ideological differences fueled through education systems. In this paper, we evaluate the first two hypotheses and, using an instrumental variable approach, gauge the causal impact of education quality on violence and crime. Although the paper uses Colombia as a case, the results could apply to a wider range of countries with a history of violence. Our measure of quality of education is the performance of students in a mandatory standardized examination at the last level of high school. We correct for selection bias in the test scores to minimize measurement error, since test scores are observed only for those who take the exam. We estimate the municipality-level dropout rates using the Census sample and impute zeros as the grades for those students who neither finished nor were enrolled in high school for this examination. We arrive at the selection-bias-corrected test scores and use the standardized average median scores across subjects to indicate education quality, as a more

accurate measure of central tendency. Crime outcomes are given by theft rates, kidnapping rates, and homicide rates. We instrument education quality by constructing spatial instruments based on central government transfers of funds for improving the quality of education to the neighboring municipalities of the municipality under consideration. We also use instruments based on central government investment in education quality in every municipality in 2001, constructing the shift-share of investment in each municipality for the period 2007-2013. Our results suggest that education quality could have differential impacts on different forms of crime. Improvement in the quality of education has a statistically significant and negative impact on an aggregate measure of crime and on property crimes one period later. Furthermore, a disaggregated analysis of economic crime rates shows that the higher the median scores in the exam, the lower the rates of car theft one period hence. This is in line with an opportunity cost effect that lowers the incentives to engage in such economic crimes. We also find that better education quality leads to a statistically significant but marginal decline in total and non-political kidnappings. In addition, we find that better education quality reduces the presence of illegal armed groups in municipalities, suggesting a pacifying effect. Our results speak to the importance of designing educational policies that focus not only on increasing the quantity of education in terms of higher enrollments, years of education or construction of more educational establishments, as suggested by previous works, but also on improving the quality of education, with a focus on better facilities, teacher quality and higher student performance.

Table 3.1. Crime and Education Quality

                                                         OLS       OLS       OLS       OLS       OLS       Reduced   IV
                                                                                                           Form
                                                         (1)       (2)       (3)       (4)       (5)       (6)       (7)
Average Score in Subjects                                0.21***   -0.30***  -0.17**   -0.15**   -0.16**             -5.85***
                                                         (0.02)    (0.07)    (0.07)    (0.07)    (0.06)              (2.06)
Total Population (log)                                                       -0.93*    -0.06     0.00      0.50      -5.78**
                                                                             (0.53)    (0.79)    (0.80)    (0.84)    (2.45)
Birth Rate                                                                   0.08***   0.09***   0.10***   0.10***   0.04
                                                                             (0.03)    (0.03)    (0.03)    (0.03)    (0.05)
Infant Mortality Rate                                                        -0.44***  -0.35***  -0.32***  -0.35***  0.13
                                                                             (0.08)    (0.10)    (0.11)    (0.11)    (0.20)
Rurality Index                                                               0.06      0.31      0.30      -0.23     2.42**
                                                                             (0.40)    (0.41)    (0.41)    (0.55)    (1.17)
Agricultural Yield                                                           -0.02     -0.02     -0.02     -0.02     -0.06
                                                                             (0.06)    (0.06)    (0.06)    (0.05)    (0.07)
Projected Population to Attend Primary School (log)                                    -0.56     -0.50     -0.26     1.08
                                                                                       (0.41)    (0.41)    (0.49)    (0.69)
Projected Population to Attend Secondary School (log)                                  -0.35     -0.41     -0.91*    3.28**
                                                                                       (0.47)    (0.48)    (0.54)    (1.66)
Total Expenditure                                                                                0.08*     0.20***   0.23***
                                                                                                 (0.05)    (0.06)    (0.06)
Total Tax Revenue                                                                                0.01      0.00      -0.01
                                                                                                 (0.03)    (0.03)    (0.03)
L.Per Capita Average Investment in Quality of Neighbors                                                    -0.12***
                                                                                                           (0.05)
L.Per Capita Shift Share of Investment on Quality                                                          -0.01
                                                                                                           (0.01)
Municipality FE                                          No        Yes       Yes       Yes       Yes       Yes       Yes
Trend                                                    No        Yes       Yes       Yes       Yes       Yes       Yes
Adjusted-R2                                              0.05      0.01      0.02      0.02      0.03      0.03      -2.72
Observations                                             5508      5508      5460      5460      5452      4489      4486
Underidentification                                                                                                  0.012
Weak Identification                                                                                                  22.412
Overidentification                                                                                                   0.503

Notes: Standardized coefficients from Ordinary Least Squares (OLS) and Instrumental Variable (IV) regressions. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; Endogeneity Test reports the p-value with null being variable is exogenous; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table 3.2. Crime and Education Quality

                           Crime Rate      Economic Crime  Violent Crime
                           (1)             (2)             (3)
Average Score in Subjects  -5.85***        -6.17***        0.26
                           (2.06)          (2.24)          (0.99)
Municipality FE            Yes             Yes             Yes
Controls                   Yes             Yes             Yes
Trend                      Yes             Yes             Yes
Adjusted-R2                -2.72           -3.00           -0.20
Observations               4486            4491            6134
Underidentification        0.012           0.011           0.001
Weak Identification        22.412          22.412          22.190
Overidentification         0.503           0.614           0.804

Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table 3.3. Crime and Education Quality

                           Car       Commerce  Household Person    Kidnap.   Pol.      Non Pol.  Homicide
                                                                             Kidnap.   Kidnap.
                           (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
Average Score in Subjects  -6.37***  0.69      0.01      -3.65     -3.32**   -0.27     -4.59**   0.72
                           (2.21)    (1.21)    (0.88)    (2.69)    (1.61)    (0.82)    (2.12)    (1.03)
Municipality FE            Yes       Yes       Yes       Yes       Yes       Yes       Yes       Yes
Controls                   Yes       Yes       Yes       Yes       Yes       Yes       Yes       Yes
Trend                      Yes       Yes       Yes       Yes       Yes       Yes       Yes       Yes
Adjusted-R2                -1.77     -0.21     -0.19     -1.51     -0.52     -0.19     -0.83     -0.22
Observations               4586      5962      6036      6130      6213      6213      6213      6134
Underidentification        0.013     0.001     0.001     0.001     0.001     0.001     0.001     0.001
Weak Identification        19.152    21.899    22.112    22.155    22.365    22.365    22.365    22.190
Overidentification         0.581     0.193     0.230     0.295     0.547     0.806     0.490     0.976

Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table 3.4. Presence and Quality of Education

                           FARC        ELN         Either
                           (1)         (2)         (3)
Average Score in Subjects  -0.072**    -0.002      -0.071**
                           (0.033)     (0.047)     (0.033)
Control                    Yes         Yes         Yes
Controls Mean              Yes         Yes         Yes
N                          6215        6215        6215

Note: Estimation is via Instrumental Variable approach. Dependent variables are rates of different forms of violence per 100000 inhabitants. Control variables include birth rate, death rate, infant mortality rate, years of establishment of municipality, rurality index, agricultural yield and fiscal characteristics. Clustered standard error estimates are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests.

Table 3.5. Lights and Education Quality

Dependent variable: Mean of Lights

                                                         OLS       OLS       OLS       OLS       OLS       Reduced   IV
                                                                                                           Form
                                                         (1)       (2)       (3)       (4)       (5)       (6)       (7)
Average Score in Subjects                                0.12***   0.11***   0.32***   0.37***   0.37***             1.92*
                                                         (0.02)    (0.04)    (0.04)    (0.04)    (0.04)              (1.05)
Total Population (log)                                                       0.21      3.07***   3.07***   3.20***   5.66***
                                                                             (0.64)    (0.69)    (0.69)    (1.02)    (1.60)
Birth Rate                                                                   -0.02     0.01      0.01      -0.01     0.03
                                                                             (0.02)    (0.02)    (0.02)    (0.02)    (0.03)
Infant Mortality Rate                                                        -0.65***  -0.32**   -0.32**   -0.21     -0.48**
                                                                             (0.12)    (0.14)    (0.14)    (0.16)    (0.19)
Rurality Index                                                               -1.73***  -0.91***  -0.90***  -1.54***  -2.46***
                                                                             (0.36)    (0.33)    (0.33)    (0.56)    (0.72)
Agricultural Yield                                                           0.08*     0.07*     0.07*     0.07      0.07
                                                                             (0.04)    (0.04)    (0.04)    (0.05)    (0.04)
Projected Population to Attend Primary School (log)                                    -1.40***  -1.39***  -1.69***  -2.41***
                                                                                       (0.43)    (0.43)    (0.57)    (0.64)
Projected Population to Attend Secondary School (log)                                  -1.59***  -1.59***  -1.32**   -3.01***
                                                                                       (0.46)    (0.46)    (0.53)    (1.06)
Per Capita Total Expenditure                                                                     -0.01     -0.05     -0.01
                                                                                                 (0.02)    (0.03)    (0.03)
Per Capita Total Tax Revenue                                                                     0.01      0.00      -0.01
                                                                                                 (0.01)    (0.02)    (0.02)
L.Per Capita Average Investment in Quality of Neighbors                                                    0.09**
                                                                                                           (0.05)
L.Per Capita Shift Share of Investment on Quality                                                          -0.01
                                                                                                           (0.01)
Municipality FE                                          No        Yes       Yes       Yes       Yes       Yes       Yes
Trend                                                    No        Yes       Yes       Yes       Yes       Yes       Yes
Adjusted-R2                                              0.01      0.00      0.07      0.09      0.09      0.08      -0.42
Observations                                             6654      6654      6555      6555      6529      5178      5176
Underidentification                                                                                                  0.003
Weak Identification                                                                                                  28.900
Overidentification                                                                                                   0.261

Notes: Standardized coefficients from Ordinary Least Squares (OLS) and Instrumental Variable (IV) regressions. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; Endogeneity Test reports the p-value with null being variable is exogenous; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Appendix A

GENDER GAP IN SCHOOLING: IS THERE A ROLE FOR HEALTH INSURANCE?

Figure A.1. RSBY Coverage

Source: www.rsby.gov.in

Table A.1. Robustness: Impact of RSBY on household school expenditure - Instrumental variable approach

                                        Panel A. DID                        Panel B. DDD
                                        (1)               (2)               (3)               (4)
                                                          Log School expd.                    Log School expd.
                                        Budget Share      Levels            Budget Share      Levels
RSBY*Post                               0.005***          0.080***          -0.003*           -0.105
                                        (0.001)           (0.014)           (0.002)           (0.169)
Low Income (=1 for bottom 70%)                                              -0.047**          -0.384**
                                                                            (0.024)           (0.169)
RSBY*Post*Low Income                                                        0.007***          0.187***
                                                                            (0.001)           (0.088)
Underidentification test                p=0.000           p=0.000           p=0.001           p=0.013
Weak-identification test
  Kleibergen-Paap rk Wald F statistic   11.646            17.143            6.580             12.379
Endogeneity test                        p=0.010           p=0.031           p=0.010           p=0.038
Other Controls                          Y                 Y                 Y                 Y
Total Consumption Expenditure           N                 Y                 N                 Y
District fixed effects                  Y                 Y                 Y                 Y
Time fixed effects                      Y                 Y                 Y                 Y
District*Income fixed effects                                               Y                 Y
Time*Income fixed effects                                                   Y                 Y
N                                       47421             47421             47421             47421

* p<0.10, ** p<0.05, *** p<0.01. The sample is restricted to HH with children and where age of the head is between 18 to 90 years. Panel A and B provide the DID and DDD results respectively. Col. (1) & (3) repeat the baseline IV result. Dependent variable is HH budget share of school expenditure. Col. (2) & (4) are estimated via IV approach. Dependent variable is the inverse hyperbolic sine transformation of HH expenditure on school in levels. Total consumption expenditure is added as a regressor and instrument used for it is HH assets. Additional controls in all regressions are RSBY = 1 if the district was exposed to RSBY & 0 otherwise, dummy for Low Income =1 if HH does not belong to top 30% and 0 otherwise (for DDD), HH size (instrumented by gender of the first child), highest education degrees of male and female members, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, number of married men in the HH, number of married women in the HH, proportion of children, teens and adults, indicator for if HOH is married, dummy for if the HH has a bank account, dummy for if the HH has a farmer credit card, district fixed effects, time fixed effects, district by income fixed effects (for DDD), time by income fixed effects (for DDD). Standard errors reported are clustered standard errors.

Table A.2. Robustness: Impact of RSBY on household school expenditure - Fractional logit estimation

                                        Panel A. DID                            Panel B. DDD
                                        (1)                 (2)                 (3)                 (4)
                                        IV with FE          CRE FracLogit       IV with FE          CRE FracLogit
                                                            (control function)                      (control function)
RSBY*Post                               0.005***            0.029               -0.003*             0.008
                                        (0.001)             (0.064)             (0.002)             (0.068)
Low Income (=1 for bottom 70%)                                                  -0.047**            0.069
                                                                                (0.024)             (0.091)
RSBY*Post*Low Income                                                            0.007***            0.067
                                                                                (0.001)             (0.043)
Marginal Effect of RSBY:                                    0.001
                                                            (0.001)
  Households that belong to bottom 70%                                                              0.002**
                                                                                                    (0.001)
  Households that belong to top 30%                                                                 0.000
                                                                                                    (0.002)
Underidentification test                p=0.000                                 p=0.001
Weak-identification test
  Kleibergen-Paap rk Wald F statistic   11.646                                  6.580
Endogeneity test                        p=0.010                                 p=0.010
Other Controls                          Y                   Y                   Y                   Y
District fixed effects                  Y                   N                   Y                   N
Correlated random effects               N                   Y                   N                   Y
Time fixed effects                      Y                   Y                   Y                   Y
Time*Income fixed effects                                                       Y                   Y
N                                       47421               47421               47421               47421

* p<0.10, ** p<0.05, *** p<0.01. The sample is restricted to HH with children and where age of the head is between 18 to 90 years. Panel A and B provide the DID and DDD results respectively. Col. (1) and (3) repeat the baseline IV with fixed effects results. Col. (2) and (4) are estimated via a fractional logit model with correlated random effects using a control function approach. Dependent variable in all specifications is household's budget share of school expenditure. Additional controls in all specifications include: RSBY = 1 if the district was exposed to RSBY & 0 otherwise, dummy for Low Income =1 if HH does not belong to top 30% and 0 otherwise (for DDD), HH size (instrumented by gender of the first child), highest education degrees of male and female members, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, number of married men in the HH, number of married women in the HH, proportion of children, teens and adults, indicator for if HOH is married, dummy for if the HH has a bank account, dummy for if the HH has a farmer credit card, district fixed effects, time fixed effects, district by income fixed effects (for DDD), time by income fixed effects (for DDD). Standard errors reported are clustered standard errors.

Table A.3. Robustness: Impact of RSBY on household school expenditure - Panel analysis

| | Panel A. DID, (1) Repeated Cross-Section | Panel A. DID, (2) Panel | Panel B. DDD, (3) Repeated Cross-Section | Panel B. DDD, (4) Panel |
|---|---|---|---|---|
| RSBY*Post | 0.005*** | 0.003*** | -0.003* | -0.001 |
| | (0.001) | (0.001) | (0.002) | (0.002) |
| Low Income (=1 for bottom 70%) | | | -0.047** | -0.776* |
| | | | (0.024) | (0.445) |
| RSBY*Post*Low Income | | | 0.007*** | 0.004** |
| | | | (0.001) | (0.002) |
| Other Controls | Y | Y | Y | Y |
| District Fixed Effects | Y | N | Y | N |
| Household Fixed Effects | N | Y | N | Y |
| Time Fixed Effects | Y | Y | Y | Y |
| District*Income Fixed Effects | | | Y | Y |
| Time*Income Fixed Effects | | | Y | Y |
| N | 47421 | 45676 | 47421 | 45676 |

* p<0.10, ** p<0.05, *** p<0.01. The sample is restricted to HH with children and where age of the head is between 18 and 90 years. Panel A and B provide the DID and DDD results respectively. Dependent variable in all specifications is budget share of household's school expenditure. Col. (1) and (3) repeat the baseline IV with FE results, where the data are treated as a repeated cross-section. Col. (2) and (4) are estimated treating the data as a panel, using IV with HH FE. Additional controls include: RSBY = 1 if the HH in the district was exposed to RSBY & 0 otherwise, dummy for Low Income =1 if HH does not belong to top 30% and 0 otherwise (for DDD), HH size (instrumented by the gender of the first child), highest education degrees of male and female members, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, number of married men in the HH, number of married women in the HH, proportion of children, teens and adults, indicator for if HOH is married, dummy for if the HH has a bank account, dummy for if the HH has a farmer credit card, district fixed effects, HH fixed effects and time fixed effects. Standard errors reported are clustered standard errors.

Table A.4. Robustness: Impact of RSBY on child school enrollment - Probit with correlated random effects

| | Panel A. DID, (1) LPM with FE | Panel A. DID, (2) LPM with CRE | Panel A. DID, (3) CRE Probit | Panel B. DDD, (4) LPM with FE | Panel B. DDD, (5) LPM with CRE | Panel B. DDD, (6) CRE Probit |
|---|---|---|---|---|---|---|
| RSBY*Post | 0.027*** | 0.017*** | 0.126** | -0.023 | -0.023 | 0.191*** |
| | (0.006) | (0.005) | (0.060) | (0.017) | (0.017) | (0.073) |
| Boy | 0.060*** | 0.059*** | 0.293*** | 0.055*** | 0.055*** | 0.253*** |
| | (0.004) | (0.004) | (0.021) | (0.004) | (0.004) | (0.045) |
| RSBY*Post*Boy | -0.019*** | -0.018*** | -0.032* | | | |
| | (0.005) | (0.005) | (0.014) | | | |
| Low Income (=1 for bottom 70%) | | | | -0.042 | -0.043 | -0.123 |
| | | | | (0.023) | (0.023) | (0.187) |
| RSBY*Post*Low Income | | | | 0.046** | 0.046** | -0.151 |
| | | | | (0.020) | (0.020) | (0.103) |
| RSBY*Post*Low Income*Boy | | | | -0.009*** | -0.009*** | 0.038 |
| | | | | (0.001) | (0.001) | (0.075) |
| Marginal Effects of RSBY: | | | | | | |
| Boy | | | 0.094* | | | 0.026 |
| | | | (0.055) | | | (0.068) |
| Girl | | | 0.126** | | | 0.040 |
| | | | (0.060) | | | (0.073) |
| Underidentification test | p=0.000 | p=0.000 | | p=0.000 | p=0.000 | |
| Weak-identification test: Kleibergen-Paap rk Wald F statistic | 44.022 | 48.396 | | 42.46 | 50.666 | |
| Endogeneity test | p=0.570 | p=0.546 | | p=0.414 | p=0.423 | |
| Other Controls | Y | Y | Y | Y | Y | Y |
| District Fixed Effects | Y | N | N | Y | N | N |
| Correlated Random Effects | N | Y | Y | Y | Y | Y |
| Time Fixed Effects | Y | Y | Y | N | Y | Y |
| District*Income Fixed Effects | | | | Y | Y | Y |
| Time*Income Fixed Effects | | | | Y | Y | Y |
| N | 83221 | 83221 | 83221 | 83221 | 83221 | 83221 |

* p<0.10, ** p<0.05, *** p<0.01. The sample is restricted to children above the age of 5 and below the age of 18. Panel A provides the DID results and Panel B provides the DDD results. Col. (1) and (4) are estimated via LPM with FE. Col. (2) and (5) are estimated via LPM with correlated random effects. Col. (3) and (6) are estimated via IV probit model with correlated random effects. Dependent variable in all specifications is school enrollment of a child in a household in a district at a particular point in time. Additional controls include: gender dummy = 1 for a boy and 0 for a girl, RSBY = 1 if the district was exposed to RSBY and 0 otherwise, Low Income dummy =1 if HH does not belong to top 30% and 0 otherwise (for DDD), HH size, parental education characteristics, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, school facilities and scholarships offered, district fixed effects, time fixed effects, district by income fixed effects (for DDD) and time by income fixed effects (for DDD). HH size is instrumented by the gender of the first child. Means of all controls at district level have been included in (2), (3), (5) and (6). Standard errors reported are clustered standard errors.

Table A.5. Robustness: Impact of RSBY on child school enrollment - Instrumental variable approach

| | Panel A. DID, (1) LPM with FE, IV | Panel A. DID, (2) LPM with FE | Panel A. DID, (3) LPM with FE | Panel B. DDD, (4) LPM with FE, IV | Panel B. DDD, (5) LPM with FE | Panel B. DDD, (6) LPM with FE |
|---|---|---|---|---|---|---|
| RSBY*Post | 0.027*** | 0.028** | 0.028** | -0.023 | -0.012 | -0.016 |
| | (0.006) | (0.011) | (0.011) | (0.017) | (0.018) | (0.019) |
| Boy | 0.060*** | 0.058*** | 0.059*** | 0.055*** | 0.053*** | 0.054*** |
| | (0.004) | (0.004) | (0.004) | (0.004) | (0.003) | (0.004) |
| RSBY*Post*Boy | -0.019*** | -0.018*** | -0.018*** | | | |
| | (0.005) | (0.006) | (0.006) | | | |
| Low Income (=1 for bottom 70%) | | | | -0.042 | -0.059*** | -0.052*** |
| | | | | (0.023) | (0.006) | (0.006) |
| RSBY*Post*Low Income | | | | 0.046** | 0.034 | 0.038* |
| | | | | (0.020) | (0.022) | (0.022) |
| RSBY*Post*Low Income*Boy | | | | -0.009*** | -0.008*** | -0.008*** |
| | | | | (0.001) | (0.001) | (0.001) |
| Other Controls | Y | Y | Y | Y | Y | Y |
| Household Size | Y | Y | N | Y | Y | N |
| District Fixed Effects | Y | Y | Y | Y | Y | Y |
| Time Fixed Effects | Y | Y | Y | Y | Y | Y |
| District*Income Fixed Effects | | | | Y | Y | Y |
| Time*Income Fixed Effects | | | | Y | Y | Y |
| N | 83221 | 83221 | 83221 | 83221 | 83221 | 83221 |

* p<0.10, ** p<0.05, *** p<0.01. The sample is restricted to children above the age of 5 and below the age of 18. Panel A provides the DID results and Panel B provides the DDD results. Col. (1) and (4) are estimated via LPM with FE. Col. (2) and (5) are estimated via LPM with FE including HH size as a regressor but not instrumenting for it. Col. (3) and (6) are estimated via LPM with FE excluding HH size as a regressor. Dependent variable is school enrollment of a child in a household in a district at a particular point in time. Additional controls included in each specification: gender dummy = 1 for a boy and 0 for a girl, RSBY = 1 if the district was exposed to RSBY and 0 otherwise, Low Income dummy =1 if HH does not belong to top 30% and 0 otherwise (for DDD), parental education characteristics, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, school facilities and scholarships offered, district fixed effects, time fixed effects, district by income fixed effects, time by income fixed effects. HH size is included as a regressor and instrumented by gender of the first child in the family. Standard errors reported are clustered standard errors.

Table A.6. Sensitivity analysis: Impact of RSBY on household school expenditure and child school enrollment - Variation in income categories

Panel A. DID Panel B. DDD (1) (2) (3) Baseline Top and Bottom 30% Mid 40 and Top 30% Panel I. School Expenditure RSBY*Post 0.005*** -0.001 -0.001 (0.001) (0.002) (0.002) Low Income 0.001 0.010 (0.002) (0.019) RSBY*Post*Low Income 0.005* 0.002 (0.003) (0.003) Other Controls Y Y Y District fixed effects Y Y Y Time fixed effects Y Y Y District*Income fixed effects Y Y Time*Income fixed effects Y Y N 47421 27592 32835 Panel II. School Enrollment RSBY*Post 0.027*** -0.024** -0.040*** (0.006) (0.012) (0.008) Boy 0.060*** 0.057*** 0.040*** (0.004) (0.006) (0.004) Low Income 0.056 -0.086 (0.158) (0.092) RSBY*Post*Boy -0.019*** (0.005) RSBY*Post*Low Income 0.063* 0.049*** (0.033) (0.011) RSBY*Post*Low Income*Boy -0.002* 0.001 (0.001) (0.006) Underidentification test p=0.000 p=0.001 p=0.000 Weak-identification test Kleigbergen Paap rk Wald F statistic 44.022 16.721 39.832 Endogeneity test p = 0.570 p=0.529 p = 0.947 Other Controls Y Y Y District Fixed Effects Y Y Y Time Fixed Effects Y Y Y District*Income Fixed Effects Y Y Time*Income Fixed Effects Y Y N 83221 47876 57884 * p<0.10, ** p<0.05, *** p<0.01. The sample is restricted to HH with children and where age of the head is between 18 to 90 years. Panel A and B provide the DID and DDD results respectively. Col (1) repeats the baseline IV results. Col (2) provides DDD results where sample is restricted to top and bottom 30% of income distribution. Middle 40% is dropped. Col. (3) provides the DDD results where sample is restricted to middle 40% and top 30% of income distribution. Bottom 30% is dropped. Dependent variable in all specifications is budget share of household's school expenditure. Additional controls include : RSBY = 1 if the district was exposed to RSBY & 0 otherwise, Low Income dummy =1 if HH belongs to bottom 30% and 0 if HH belongs to top 30% (for Col (2)), Low Income dummy =1 if HH belongs to middle 40% and 0 if belongs to top 30% (for Col. 
(3)), HH size (instrumented by gender of the first child), highest education degrees of male and female members, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, number of married men in the HH, number of married women in the HH, proportion of children, teens and adults, indicator for if HOH is married, dummy for if the HH has a bank account, dummy for if the HH has a farmer credit card, district fixed effects, time fixed effects, district by income fixed effects (for DDD), time by income fixed effects (for DDD). Standard errors reported are clustered standard errors.
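The DID and DDD columns in these tables come from regressions with fixed effects, instruments and a full set of controls. As a rough illustration only, the sketch below shows the cell-mean arithmetic that the RSBY*Post and RSBY*Post*Low Income coefficients generalize. The function names `did` and `ddd`, the synthetic data, and the effect sizes 0.004 and 0.007 are all hypothetical and are not taken from the thesis.

```python
import random
from statistics import fmean

def did(rows):
    """Difference-in-differences from the four treated/post cell means.

    rows: list of (y, treated, post) tuples with 0/1 indicators.
    """
    def cell(t, p):
        return fmean([y for y, tr, po in rows if tr == t and po == p])
    return (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))

def ddd(rows):
    """Triple difference: DID among low-income minus DID among the rest.

    rows: list of (y, treated, post, low_income) tuples with 0/1 indicators.
    """
    low = [(y, t, p) for y, t, p, l in rows if l == 1]
    high = [(y, t, p) for y, t, p, l in rows if l == 0]
    return did(low) - did(high)

# Synthetic, noise-free data (hypothetical): the triple interaction is 0.007.
rng = random.Random(0)
rows = []
for _ in range(4000):
    t, p, l = rng.randint(0, 1), rng.randint(0, 1), rng.randint(0, 1)
    y = 0.05 + 0.01 * t + 0.02 * p - 0.03 * l + 0.004 * t * p + 0.007 * t * p * l
    rows.append((y, t, p, l))

print("DDD estimate:", round(ddd(rows), 6))
```

On noise-free data the triple difference recovers the coefficient on the three-way interaction exactly. With real survey data, the regression form is what allows controls, fixed effects and clustered standard errors to be added.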

Table A.7. Sensitivity analysis: Impact of RSBY on household school expenditure - Variation by intensity of treatment

Panel A. DID Panel B. DDD (1) (2) School Expenditure RSBY*Post 0.000 0.001 (0.001) (0.001) RSBY*Post*Intensity1 0.001 (0.002) RSBY*Post*Intensity2 0.003 (0.002) RSBY*Post*Intensity3 0.003* (0.001) RSBY*Post*Low Income 0.002 (0.003) RSBY*Post*Low Income*Intensity1 0.005 (0.004) RSBY*Post*Low Income*Intensity2 0.005 (0.005) RSBY*Post*Low Income*Intensity3 0.008 (0.012) Underidentification test p=0.000 p=0.000 Weak-identification test Kleigbergen Paap rk Wald F statistic 12.292 12.213 Endogeneity test p=0.012 p=0.008 Other Controls Y Y District fixed effects Y Y Time fixed effects Y Y District*Income fixed effects Y Time*Income fixed effects Y N 37885 37885 p<0.10, ** p<0.05, *** p<0.01. The sample is restricted to HH where age of the head is between 18 to 90 years. Panel A and B provide the DID and DDD results respectively on school expenditure. Dependent variable is the budget share of household's school expenditure. Additional controls include in panel A: RSBY = 1 if the district was exposed to RSBY & 0 otherwise, Low Income dummy =1 if HH belongs to bottom 30% and 0 otherwise, discrete indicator variable for intensity depending on duration of treatment, relevant two way and three way interaction with intensity, HH size (instrumented by gender of the first child), highest education degrees of male and female members, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, number of married men in the HH, number of married women in the HH, proportion of children, teens and adults, indicator for if HOH is married, dummy for if the HH has a bank account, dummy for if the HH has a farmer credit card, district fixed effects, time fixed effects, district by income fixed effects (for DDD), time by income fixed effects (for DDD). Standard errors reported are clustered standard errors.

Table A.8. Sensitivity analysis: Impact of RSBY on child school enrollment - Variation by intensity of treatment

Panel A. DID Panel B. DDD (1) (2) School Enrollment Boy 0.060*** 0.056*** (0.004) (0.004) RSBY*Post 0.030*** -0.023 (0.006) (0.018) RSBY*Post*Boy -0.013*** (0.005) RSBY*Post*Intensity1 -0.027*** (0.010) RSBY*Post*Intensity2 0.016 (0.012) RSBY*Post*Intensity3 0.007 (0.025) RSBY*Post*Intensity1*Boy -0.003 (0.010) RSBY*Post*Intensity2*Boy -0.049*** (0.013) RSBY*Post*Intensity3*Boy -0.097*** (0.028) RSBY*Post*Low Income 0.050** (0.021) RSBY*Post*Low Income*Boy 0.001 (0.006) RSBY*Post*Low Income*Intensity1 -0.022* (0.013) RSBY*Post*Low Income*Intensity2 0.008 (0.014) RSBY*Post*Low Income*Intensity3 -0.017 (0.030) RSBY*Post*Low Income*Intensity1*Boy -0.027* (0.014) RSBY*Post*Low Income*Intensity2*Boy -0.052*** (0.017) RSBY*Post*Low Income*Intensity3*Boy -0.091** Underidentification test p=0.000 p=0.000 Weak-identification test Kleigbergen Paap rk Wald F statistic 44.977 44.589 Endogeneity test p=0.592 p=0.473 Other Controls Y Y District Fixed Effects Y Y Time Fixed Effects Y Y District*Income Fixed Effects Y Time*Income Fixed Effects Y N 83221 83221 p<0.10, ** p<0.05, *** p<0.01. The sample is restricted to HH with children. Panel A and B provide the DID and DDD results respectively on child school enrollment. Dependent variable is school enrollment of a child in a household in a district at a particular point in time. Additional controls include: RSBY = 1 if the district was exposed to RSBY & 0 otherwise, Low Income dummy =1 if HH belongs to bottom 30% and 0 otherwise, discrete indicator variable for intensity depending on duration of treatment, relevant two way and three way interaction with intensity, HH size (instrumented by gender of the first child), indicators for religion of HH, indicators for caste of HH, dummy for urban areas, parental education characteristics, school facilities and scholarships offered, district fixed effects, time fixed effects, district by income fixed effects (for DDD), time by income fixed effects (for DDD). 
Standard errors reported are clustered standard errors.
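The table notes state that standard errors are clustered. As a minimal sketch of what that adjustment does, the code below computes a cluster-robust standard error for the slope of a simple regression, summing the score within each cluster before squaring. It assumes clustering at the district level purely for illustration, applies no small-sample correction, and uses a hypothetical function name (`slope_with_cluster_se`) and synthetic data. It is not the estimator used for the tables, which involve instruments and fixed effects.

```python
import math
import random
from collections import defaultdict

def slope_with_cluster_se(x, y, cluster):
    """Regress y on x with an intercept.

    Returns (slope, cluster-robust standard error of the slope).
    No small-sample correction is applied.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    xd = [xi - xbar for xi in x]                  # demeaned regressor
    sxx = sum(v * v for v in xd)
    slope = sum(v * (yi - ybar) for v, yi in zip(xd, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Sum x-tilde * residual within each cluster, then square across clusters.
    score = defaultdict(float)
    for v, e, g in zip(xd, resid, cluster):
        score[g] += v * e
    var = sum(s * s for s in score.values()) / sxx ** 2
    return slope, math.sqrt(var)

# Synthetic example (hypothetical): 50 districts, 40 households each,
# with a common district-level shock that induces within-cluster correlation.
rng = random.Random(1)
x, y, district = [], [], []
for d in range(50):
    shock = rng.gauss(0, 1)
    for _ in range(40):
        xi = rng.gauss(0, 1)
        x.append(xi)
        y.append(1.0 + 0.5 * xi + shock + rng.gauss(0, 1))
        district.append(d)

slope, se = slope_with_cluster_se(x, y, district)
print(f"slope={slope:.3f}, cluster-robust SE={se:.3f}")
```

When every observation is its own cluster, this reduces to the usual heteroskedasticity-robust (HC0) standard error.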

Table A.9. Sensitivity analysis: Impact of RSBY on household school expenditure - Variation in take-up by district

Panel A Panel B (1) (2) (3) DID with district DID enrollment DDD with district enrollment RSBY*Post 0.007 0.003 -0.098 (0.014) (0.015) (0.177) RSBY*Post*DistrictEnrollment 0.006* (0.004) RSBY*Post*Low Income 0.106 (0.202) RSBY*Post*Low Income*DistrictEnrollment 0.001 (0.010) 103 Underidentification test p=0.285 p=0.143 p=0.124 Weak-identification test Kleigbergen Paap rk Wald F statistic 1.111 2.116 2.203 Endogeneity test p=0.322 p=0.355 p=0.313 Other Controls Y Y District fixed effects Y Y Time fixed effects Y Y District*Income fixed effects Y Time*Income fixed effects Y N 15265 15265 15265 p<0.10, ** p<0.05, *** p<0.01. The sample is restricted to HH with children and where age of the head is between 18 to 90 years. Panel A and B provide the DID and DDD results respectively. Dependent variable is the budget share of household's school expenditure. Additional controls include : RSBY = 1 if the district was exposed to RSBY & 0 otherwise, District enrollment rate(= enrolled targeted households/total eligible households), Low Income dummy =1 if HH belongs to bottom 30% and 0 otherwise, HH size (instrumented by gender of the first child), highest education degrees of male and female members, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, number of married men in the HH, number of married women in the HH, proportion of children, teens and adults, indicator for if HOH is married, dummy for if the HH has a bank account, dummy for if the HH has a farmer credit card, district fixed effects, time fixed effects, district by income fixed effects (for DDD), time by income fixed effects (for DDD). Standard errors reported are clustered standard errors. Table A.10. Sensitivity analysis: Impact of RSBY on child school enrollment - Variation in age groups

Panel A. DID Panel B. DDD 5-9 years 10-14 years 15-17 years 5-9 years 10-14 years 15-17 years RSBY*Post 0.031*** 0.035*** 0.017 -0.022 -0.041*** 0.070*** (0.009) (0.009) (0.014) (0.021) (0.011) (0.016) Boy 0.043*** 0.061*** 0.081*** 0.039*** 0.058*** -0.097*** (0.005) (0.008) (0.017) (0.004) (0.008) (0.022) RSBY*Post*Boy -0.019*** -0.019*** -0.012 (0.007) (0.007) (0.012) Low Income (=1 for bottom 70%) 0.044 -0.085 -0.154 (0.140) (0.124) (0.327) RSBY*Post*Low Income 0.037 0.069*** 0.111*** (0.029) (0.020) (0.032) 104 RSBY*Post*Low Income*Boy -0.009*** -0.016* 0.019 (0.001) (0.008) (0.014) Underidentification test p=0.000 p=0.000 p=0.000 p=0.000 p=0.000 p=0.000 Weak-identification test Kleigbergen Paap rk Wald F statistic 25.214 13.184 11.172 23.669 12.068 11.187 Endogeneity test p = 0.464 p=0.853 p = 0.293 p=0.417 p=0.824 p=0.239 Other Controls Y Y Y Y Y Y District Fixed Effects Y Y Y Y Y Y Time Fixed Effects Y Y Y Y Y Y District*Income Fixed effects Y Y Y Time*Income Fixed effects Y Y Y N 29411 32824 18650 29411 32824 18650 * p<0.10, ** p<0.05, *** p<0.01. Panel A and B provide the DID and DDD results respectively. Estimation is using LPM. The sample in (1) is restricted to children between the ages 5 to 9 years; in (2) is restricted to children between the ages 10 to 14 years; and in (3) is restricted to children in the ages 15 to 17 years. Dependent variable is school enrollment of a child in a household in a district at a particular point in time. 
Additional controls included in each specification - gender dummy = 1 for a boy and 0 for a girl, RSBY = 1 if the district was exposed to RSBY, treatment interactions with gender dummy, Low Income dummy =1 HH does not belong to top 30% and 0 otherwise (for DDD), HH size, parental education characteristics, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, school facilities and scholarships offered, district and time fixed effects, district by income fixed effects (for DDD) and time by income fixed effects (for DDD). HH size is instrumented by the gender of the first child. Standard errors reported are clustered standard errors. Table A.11. Sensitivity analysis: Impact of RSBY on household school expenditure and child school enrollment - Rural vs urban

Panel A. DID Panel B. DDD Panel I - School expenditure Baseline Urban Rural Baseline Urban Rural RSBY*Post 0.005*** 0.007** 0.004*** -0.003* -0.001 -0.004 (0.001) (0.003) (0.001) (0.002) (0.004) (0.003) Low Income (=1 for bottom 70%) -0.047** 0.000 0.001 (0.024) (0.006) (0.003) RSBY*Post*Low Income 0.007*** 0.002 0.008** (0.001) (0.005) (0.003) Other Controls Y Y Y Y Y Y District Fixed Effects Y Y Y Y Y Y Time Fixed Effects Y Y Y Y Y Y District*Income Fixed Effects Y Y Y Time*Income Fixed Effects Y Y Y N 47421 11219 28897 47421 11205 28897 Panel II. School enrollment 105 RSBY*Post 0.027*** -0.005 0.040*** -0.023 -0.043*** -0.027 (0.006) (0.011) (0.007) (0.017) (0.014) (0.018) Boy 0.060*** 0.023*** 0.072*** 0.055*** 0.022*** 0.068*** (0.004) (0.006) (0.006) (0.004) (0.005) (0.006) RSBY*Post*Boy -0.019*** -0.005 -0.024*** (0.005) (0.008) (0.006) Low Income (=1 for bottom 70%) -0.042 -0.176 -0.008 (0.023) (0.164) (0.105) RSBY*Post*Low Income 0.046** 0.043** 0.066*** (0.020) (0.020) (0.024) RSBY*Post*Low Income*Boy -0.009*** -0.007 -0.015** (0.001) (0.012) (0.007) Underidentification test p=0.000 p=0.000 p=0.000 p=0.000 p=0.000 p=0.000 Weak-identification test Kleigbergen Paap rk Wald F statistic 44.022 30.525 27.462 42.46 20.892 27.781 Endogeneity test p = 0.570 p=0.402 p=0.566 p=0.414 p=0.598 p = 0.516 Other Controls Y Y Y Y Y Y District Fixed Effects Y Y Y Y Y Y Time Fixed Effects Y Y Y Y Y Y District*Income Fixed Effects Y Y Y Time*Income Fixed Effects Y Y Y N 83221 22760 60461 83221 22760 60461 * p<0.10, ** p<0.05, *** p<0.01. Panel A and B provide the DID and DDD results respectively. Panel I provides the results estimated using IV approach. Dependent variable is budget share of household's school expenditure. The sample is restricted to HH with children and where age of the head is between 18 to 90 years. Panel II provides the results estimated using a LPM. Dependent variable is school enrollment of a child in a household. 
Individual sample is restricted to children above the age of 5 and below the age of 18. Additional controls included in Panel I include RSBY = 1 if the district was exposed to RSBY & 0 otherwise, dummy for Low Income =1 if HH does not belong to top 30% and 0 otherwise (for DDD), HH size (instrumented by gender of the first child), highest education degrees of male and female members, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, number of married men in the HH, number of married women in the HH, proportion of children, teens and adults, indicator for if HOH is married, dummy for if the HH has a bank account, dummy for if the HH has a farmer credit card, district fixed effects, time fixed effects, district by income fixed effects (for DDD), time by income fixed effects (for DDD). Controls in Panel II specification include a gender dummy = 1 for a boy and 0 for a girl, RSBY, dummy for Low Income, HH size, parental education characteristics, indicators for religion of HH, indicators for caste of HH, dummy for urban areas, school facilities and scholarships offered, district and time fixed effects, district by income fixed effects (for DDD), time by income fixed effects (for DDD). HH size is instrumented by the gender of the first child. Standard errors reported are clustered standard errors. Table A.12. Sensitivity analysis: Impact of RSBY on household school expenditure and child school enrollment - Variation by castes

Panel A. DID Panel B. DDD Panel I - School expenditure Baseline General OBC SC ST Other Baseline General OBC SC ST Other RSBY*Post 0.005*** 0.009*** 0.004* 0.000 0.004 0.003 -0.003* -0.009 -0.002 0.001 0.000 -0.005 (0.001) (0.002) (0.002) (0.002) (0.003) (0.002) (0.002) (0.008) (0.003) (0.003) (0.006) (0.004) Low Income (=1 for bottom 70%) -0.047** -0.006*** -0.007** -0.004 -0.008 -0.005 (0.024) (0.001) (0.002) (0.004) (0.008) (0.004) RSBY*Post*Low Income 0.007*** 0.019*** 0.004 0.002 0.006 0.006* (0.001) (0.006) (0.003) (0.003) (0.006) (0.003) Other Controls Y Y Y Y Y Y Y Y Y Y Y Y District Fixed Effects Y Y Y Y Y Y Y Y Y Y Y Y Time Fixed Effects Y Y Y Y Y Y Y Y Y Y Y Y District*Income Fixed Effects Y Y Y Y Y Y Time*Income Fixed Effects Y Y Y Y Y Y N 47421 2206 15091 14565 7025 8534 47421 2206 15085 14565 7025 8534 Panel II. School enrollment 106 RSBY*Post 0.027*** 0.024 0.062*** 0.046*** 0.054 -0.015 -0.023 0.011 -0.024 -0.024 -0.134*** 0.014 (0.006) (0.022) (0.012) (0.013) (0.044) (0.025) (0.017) (0.022) (0.034) (0.038) (0.035) (0.042) Boy 0.060*** 0.029*** 0.077*** 0.051*** 0.041*** 0.061*** 0.055*** 0.026*** 0.064*** 0.053*** 0.032*** 0.060*** (0.004) (0.010) (0.009) (0.008) (0.014) (0.009) (0.004) (0.010) (0.009) (0.007) (0.010) (0.008) RSBY*Post*Boy -0.019*** -0.011 -0.070*** 0.008 0.001 0.001 (0.005) (0.014) (0.008) (0.008) (0.014) (0.018) Low Income (=1 for bottom 70%) -0.042 -0.472* -0.014 -0.181* -0.111 0.237 (0.023) (0.250) (0.167) (0.101) (0.186) (0.179) RSBY*Post*Low Income 0.046** 0.034 0.072 0.070* 0.198*** -0.046 (0.020) (0.023) (0.047) (0.038) (0.052) (0.042) RSBY*Post*Low Income*Boy -0.009*** -0.007 -0.061*** 0.004 -0.021* -0.008*** (0.001) (0.020) (0.011) (0.009) (0.013) (0.001) Underidentification test p=0.000 p=0.000 p=0.000 p = 0.041 p = 0.040 p=0.000 p=0.000 p=0.000 p=0.000 p=0.056 p=0.003 p=0.000 Weak-identification test Kleigbergen Paap rk Wald F statistic 44.022 20.444 12.667 4.131 4.111 17.656 42.46 18.793 11.318 5.717 8.125 25.42 
Endogeneity test p = 0.570 p = 0.447 p = 0.530 p = 0.740 p = 0.605 p = 0.024 p=0.414 p = 0.433 p = 0.563 p = 0.849 p = 0.415 p = 0.010 Other Controls Y Y Y Y Y Y Y Y Y Y Y Y District Fixed Effects Y Y Y Y Y Y Y Y Y Y Y Y Time Fixed Effects Y Y Y Y Y Y Y Y Y Y Y Y District*Income Fixed Effects Y Y Y Y Y Y Time*Income Fixed Effects Y Y Y Y Y Y N 83221 4486 25892 25189 12291 15363 83221 4486 25892 25189 12291 15363 * p<0.10, ** p<0.05, *** p<0.01. Panel A and B provide the DID and DDD results respectively. Panel I provides the results estimated using IV approach. Dependent variable is budget share of household's school expenditure. The sample is restricted to HH with children and where age of the head is between 18 to 90 years. Panel II provides the results estimated using a LPM. Dependent variable is school enrollment of a child in a household. Individual sample is restricted to children above the age of 5 and below the age of 18. Additional controls included in Panel I specification include RSBY = 1 if the district was exposed to RSBY & 0 otherwise, dummy for Low Income =1 if HH does not belong to top 30% and 0 otherwise (for DDD), HH size (instrumented by gender of the first child), highest education degrees of male and female members, indicators for religion of HH, dummy for urban areas, number of married men in the HH, number of married women in the HH, proportion of children, teens and adults, indicator for if HOH is married, dummy for if the HH has a bank account, dummy for if the HH has a farmer credit card, district fixed effects, time fixed effects, district by income fixed effects (for DDD), time by income fixed effects (for DDD). 
Controls in Panel II include a gender dummy = 1 for a boy and 0 for a girl, RSBY, Low Income, HH size, parental education characteristics, indicators for religion of HH, dummy for urban areas, school facilities and scholarships offered, district and time fixed effects, district by income fixed effects (for DDD), time by income fixed effects (for DDD). HH size is instrumented by the gender of the first child. Standard errors reported are clustered standard errors. Appendix B

INTRA-HOUSEHOLD CONSUMPTION DECISIONS: EVIDENCE FROM NREGA

Figure B.1. Districts map of India

The map shows all rural districts of mainland India, colour-coded according to NREGA implementation phase. Phase 1 districts are shown in yellow, phase 2 in orange and phase 3 in brown (Source: Berg et al., 2012)

Table B.1. Summary statistics

Time period 2007-08

| Variable | N (Phase 1 & 2) | Mean (Phase 1 & 2) | SD (Phase 1 & 2) | Min (Phase 1 & 2) | Max (Phase 1 & 2) | N (Phase 3) | Mean (Phase 3) | SD (Phase 3) | Min (Phase 3) | Max (Phase 3) |
|---|---|---|---|---|---|---|---|---|---|---|
| Demographics | | | | | | | | | | |
| Age | 42356 | 47.078 | 12.948 | 8 | 109 | 25766 | 48.806 | 13.382 | 15 | 99 |
| Number of Adult Male members | 42356 | 1.545 | 0.821 | 1 | 11 | 25766 | 1.612 | 0.869 | 1 | 10 |
| Number of Adult Female members | 42356 | 1.544 | 0.821 | 1 | 13 | 25766 | 1.638 | 0.909 | 1 | 11 |
| Number of adult male & females in HH | 42356 | 3.089 | 1.402 | 2 | 19 | 25766 | 3.250 | 1.500 | 2 | 16 |
| Number of children | 42356 | 1.897 | 1.613 | 0 | 17 | 25766 | 1.817 | 1.623 | 0 | 15 |
| Number of adult males with education | 42356 | 2.235 | 1.188 | 1 | 14 | 25766 | 2.310 | 1.220 | 1 | 14 |
| Number of adult females with education | 42356 | 2.103 | 1.168 | 1 | 14 | 25766 | 2.217 | 1.236 | 1 | 12 |
| Number of members with education | 42356 | 2.872 | 1.991 | 0 | 27 | 25766 | 3.220 | 1.991 | 0 | 17 |
| Household Size | 42356 | 5.007 | 2.556 | 1 | 26 | 25766 | 5.112 | 2.664 | 1 | 24 |
| Land possessed | 42356 | 4.228 | 2.095 | 1 | 12 | 25766 | 4.430 | 2.311 | 1 | 12 |
| HH headed by females | 42356 | 0.061 | 0.239 | 0 | 1 | 25766 | 0.075 | 0.263 | 0 | 1 |
| HH males with primary and below schooling | 42356 | 0.282 | 0.450 | 0 | 1 | 25766 | 0.247 | 0.431 | 0 | 1 |
| HH males with middle and high school | 42356 | 0.295 | 0.456 | 0 | 1 | 25766 | 0.347 | 0.476 | 0 | 1 |
| HH males with higher education | 42356 | 0.144 | 0.351 | 0 | 1 | 25766 | 0.200 | 0.400 | 0 | 1 |
| HH males with technical education | 42356 | 0.014 | 0.119 | 0 | 1 | 25766 | 0.025 | 0.156 | 0 | 1 |
| HH females with primary and below schooling | 42356 | 0.245 | 0.430 | 0 | 1 | 25766 | 0.236 | 0.425 | 0 | 1 |
| HH females with middle and high school | 42356 | 0.184 | 0.388 | 0 | 1 | 25766 | 0.247 | 0.432 | 0 | 1 |
| HH females with higher education | 42356 | 0.061 | 0.240 | 0 | 1 | 25766 | 0.107 | 0.309 | 0 | 1 |
| HH females with technical education | 42356 | 0.004 | 0.061 | 0 | 1 | 25766 | 0.009 | 0.097 | 0 | 1 |
| Muslim | 42356 | 0.040 | 0.195 | 0 | 1 | 25766 | 0.040 | 0.197 | 0 | 1 |
| Christian | 42356 | 0.025 | 0.157 | 0 | 1 | 25766 | 0.031 | 0.174 | 0 | 1 |
| Sikh | 42356 | 0.005 | 0.068 | 0 | 1 | 25766 | 0.022 | 0.145 | 0 | 1 |
| Other religion | 42356 | 0.009 | 0.097 | 0 | 1 | 25766 | 0.011 | 0.105 | 0 | 1 |
| Scheduled Tribes | 42356 | 0.077 | 0.266 | 0 | 1 | 25766 | 0.050 | 0.217 | 0 | 1 |
| Consumption Variables | | | | | | | | | | |
| Cereals & cereal products | 42341 | 685.635 | 400.855 | 10 | 15000 | 25745 | 649.3091 | 407.1295 | 20 | 6000 |
| Pulses & pulses products | 41925 | 120.337 | 80.2904 | 4 | 2000 | 25499 | 135.2086 | 88.45742 | 4 | 3300 |
| Edible oil | 42196 | 153.113 | 89.422 | 4 | 4400 | 25421 | 172.8233 | 106.7603 | 3 | 2200 |
| Intoxicants, pan and tobacco | 34046 | 119.414 | 138.466 | 3 | 5000 | 18072 | 167.7373 | 202.2132 | 4 | 7050 |
| Fuel and light | 42105 | 324.292 | 172.493 | 9 | 4850 | 25599 | 394.7693 | 218.9147 | 4 | 5500 |
| Entertainment | 13447 | 86.3677 | 93.7579 | 4 | 4000 | 9239 | 120.4679 | 97.56365 | 5 | 2100 |
| Vegetable and fruits | 42264 | 268.937 | 171.505 | 6 | 4200 | 25719 | 299.7245 | 203.5115 | 10 | 8000 |
| Salt, spices, condiments and other food | 42349 | 210.592 | 160.082 | 4 | 6150 | 25764 | 286.5586 | 207.0537 | 15 | 9262 |
| Meat, milk and milk products | 40849 | 382.834 | 345.957 | 4 | 13000 | 25466 | 598.5484 | 549.7724 | 10 | 9000 |
| Medical expenditure | 31030 | 949.223 | 5024.39 | 2 | 250500 | 19020 | 1684.086 | 8925.278 | 3 | 375913 |
| School expenditure | 30879 | 1931.37 | 4116.53 | 2 | 125045 | 21209 | 2727.583 | 6094.921 | 2 | 215136 |
| Personal, toiletry and miscellaneous articles | 42219 | 135.892 | 110.527 | 4 | 9000 | 25682 | 175.9877 | 131.306 | 5 | 3000 |
| Clothing, bedding and footwear | 42291 | 2774.39 | 2175.91 | 17 | 100000 | 25716 | 3326.568 | 2692.659 | 50 | 100000 |
| Durable goods | 41556 | 1409.01 | 6257.68 | 3 | 818700 | 25511 | 2087.521 | 10767.01 | 3 | 605500 |

Notes: The table shows the differences in trends in the control districts (districts covered in phase 1 and 2) and the treatment districts (districts covered in phase 3) in 2007-08. Dummy variables containing information about education levels, caste and religion of the households are included. Dummy for households with female head = 1 if household is headed by female, otherwise 0. Muslim = 1 if household religion is Muslim, otherwise 0. Christian = 1 if household religion is Christian, otherwise 0. Sikh = 1 if household religion is Sikh, otherwise 0. Other religion = 1 if the household religion falls under any of the other categories like Jainism, Buddhism, Zoroastrianism, and others. Scheduled Tribes = 1 if household caste is scheduled tribe, otherwise all other castes (SCs, OBCs and general) take value 0 because for several districts no data was available for other castes.

Table B.2. Impact of NREGA on Expenditure Shares - Fractional Logit Model with Correlated Random Effects Approach

| Variables | Cereals | Pulses | Edible Oil | Fuel & Light | Intoxicants | Entertainment | Veg & Fruits | Condiments | Meat & Milk | Medical Expd | School Expd | Personal | Clothing & bedding | Durable Goods |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Coefficients | | | | | | | | | | | | | | |
| NREGA | 0.046*** | 0.008 | 0.009 | -0.016 | -0.098*** | -0.088*** | 0.008 | -0.060*** | -0.106*** | -0.032 | 0.066* | -0.033*** | 0.047** | 0.053** |
| | (0.012) | (0.012) | (0.011) | (0.010) | (0.019) | (0.017) | (0.012) | (0.011) | (0.016) | (0.025) | (0.036) | (0.010) | (0.020) | (0.027) |
| Marginal Effects of NREGA | | | | | | | | | | | | | | |
| NREGA | 0.009*** | 0.002 | 0.002 | -0.003 | -0.024*** | -0.022*** | 0.002 | -0.014*** | -0.021*** | -0.008 | 0.015* | -0.008*** | 0.008** | 0.010** |
| | (0.002) | (0.003) | (0.003) | (0.002) | (0.005) | (0.004) | (0.003) | (0.003) | (0.003) | (0.006) | (0.008) | (0.002) | (0.004) | (0.005) |
| Other Controls | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Land included | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| N | 81898 | 81049 | 81234 | 81373 | 59134 | 37835 | 81821 | 81915 | 80101 | 65201 | 67975 | 81823 | 81743 | 81276 |

Notes: * p<0.10, ** p<0.05, *** p<0.01. Estimation is via fractional logit model with correlated random effects. The sample is restricted to include households with at least one adult female and male member. Dependent variables are in the form of budget shares spent on 14 separate commodity categories out of the total monthly spending by a household in a district at a particular point in time. Additional controls included in each specification: district fixed effects, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion, and means of all controls at district level across time. Standard errors are clustered at district level and reported in parentheses. For the model where the outcome is school expenditure, the sample is restricted to include only households with at least 1 male and female adult member who have school going children.

Table B.3. Heterogeneous Impacts of NREGA on Expenditure Shares: Female Share of NREGA Employment - Fractional Logit Model with Correlated Random Effects Approach

Variables (columns, in order): Cereals; Pulses; Edible Oil; Fuel & Light; Intoxicants; Entertainment; Veg & Fruits; Condiments; Meat & Milk; Medical Expd; School Expd; Personal; Clothing & bedding; Durable Goods

Coefficients

NREGA 0.123*** 0.060*** 0.055*** 0.005 -0.221*** -0.090*** 0.035** -0.078*** -0.200*** -0.067** -0.040 -0.055*** 0.023 -0.021

(0.017) (0.033) (0.014) (0.013) (0.027) (0.021) (0.015) (0.016) (0.025) (0.031) (0.040) (0.013) (0.022) (0.029)

NREGA*Female share of NREGA employment -0.227*** -0.123*** -0.129*** -0.081*** 0.382*** -0.034 -0.034 0.027 0.318*** 0.102* 0.397*** 0.040 0.062 0.321***

(0.036) (0.512) (0.027) (0.027) (0.051) (0.033) (0.028) (0.028) (0.046) (0.059) (0.067) (0.025) (0.044) (0.052)

Marginal Effects of NREGA

Female share of NREGA employment = 25% 0.023*** 0.015*** 0.013*** 0.001 -0.055*** -0.022*** 0.008** -0.018*** -0.041*** 0.016** -0.009* -0.013*** 0.004 -0.004

(0.003) (0.004) (0.003) (0.003) (0.007) (0.005) (0.003) (0.004) (0.005) (0.007) (0.009) (0.003) (0.004) (0.006)

Female share of NREGA employment = 75% 0.024*** 0.015*** 0.013*** 0.001 -0.053*** -0.022*** 0.008** -0.018*** -0.038*** 0.016** 0.009* -0.013*** 0.004 -0.004

(0.004) (0.004) (0.003) (0.003) (0.006) (0.005) (0.003) (0.004) (0.005) (0.007) (0.009) (0.003) (0.004) (0.005)

Other Controls Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

Land included Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

N 80234 79427 79610 79738 57931 37019 80157 80248 78466 63887 52018 80159 80082 79628

Notes: * p<0.10, ** p<0.05, *** p<0.01. Estimation is via a fractional logit model with correlated random effects at district level. The sample is restricted to households with at least one adult female and one adult male member. Dependent variables are budget shares spent on 14 separate commodity categories out of the total monthly spending by a household in a district at a particular point in time. Additional controls included in each specification: district fixed effects, the ratio of women's NREGA jobs to total employment interacted with NREGA, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion, and means of all controls at district level across time. Standard errors are clustered at district level and reported in parentheses. For the model where the outcome is school expenditure, the sample is further restricted to households with at least one adult male and one adult female member that have school-going children.

Table B.4. Heterogeneous Impacts of NREGA on Expenditure Shares: State Stipulated Minimum Wages - Fractional Logit Model with Correlated Random Effects Approach

Variables (columns, in order): Cereals; Pulses; Edible Oil; Fuel & Light; Intoxicants; Entertainment; Veg & Fruits; Condiments; Meat & Milk; Medical Expd; School Expd; Personal; Clothing & bedding; Durable Goods

Coefficients

NREGA 0.267 -0.079* -0.108** -0.068 -0.100 -0.156** -0.035 -0.230*** -0.200** -0.056 0.225 -0.099** -0.130* -0.092

(0.041) (0.044) (0.046) (0.044) (0.075) (0.052) (0.045) (0.038) (0.070) (0.102) (0.144) (0.039) (0.072) (0.100)

NREGA*minW -0.221*** 0.088** 0.121** 0.054 0.002 0.070 0.043 0.174*** 0.091 0.029 -0.162 0.066* 0.173** 0.139

(0.046) (0.041) (0.044) (0.042) (0.071) (0.044) (0.045) (0.038) (0.065) (0.102) (0.132) (0.038) (0.072) (0.094)

Marginal Effects of NREGA

Minimum Wage = Rs. 82.50 per day 0.016*** -0.002 -0.002 -0.005* -0.024*** -0.024*** 0.000 -0.020*** -0.024*** -0.008 0.021** -0.011*** 0.002 0.004

(0.003) (0.004) (0.003) (0.003) (0.006) (0.005) (0.003) (0.003) (0.004) (0.007) (0.011) (0.003) (0.004) (0.006)

Minimum Wage = Rs. 159.40 per day -0.017** 0.015** 0.020** 0.004 -0.024** -0.011* 0.007 0.011* -0.011 -0.002 -0.007 0.001 0.027** 0.027**

(0.006) (0.006) (0.007) (0.005) (0.011) (0.006) (0.007) (0.006) (0.008) (0.015) (0.018) (0.006) (0.009) (0.012)

Other Controls Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

Land included Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

N 80234 79427 79610 79738 57931 37019 80157 80248 78466 63887 52018 80159 80082 79628

Notes: * p<0.10, ** p<0.05, *** p<0.01. Estimation is via a fractional logit model with correlated random effects at district level. The sample is restricted to households with at least one adult female and one adult male member. Dependent variables are budget shares spent on 14 separate commodity categories out of the total monthly spending by a household in a district at a particular point in time. Additional controls included in each specification: district fixed effects, minimum wages, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion, and means of all controls at district level across time. Standard errors are clustered at district level and reported in parentheses. For the model where the outcome is school expenditure, the sample is further restricted to households with at least one adult male and one adult female member that have school-going children.

Table B.5. Heterogeneous Impacts of NREGA on Expenditure Shares: Crop Regions - Fractional Logit Model with Correlated Random Effects Approach

Variables (columns, in order): Cereals; Pulses; Edible Oil; Fuel & Light; Intoxicants; Entertainment; Veg & Fruits; Condiments; Meat & Milk; Medical Expd; School Expd; Personal; Clothing & bedding; Durable Goods

Coefficients

NREGA -0.015* 0.036** -0.010 -0.05** -0.068** -0.049* -0.023 -0.013 -0.044 -0.030 0.104 -0.036* -0.054 0.138**

(0.024) (0.028) (0.021) (0.019) (0.032) (0.028) (0.020) (0.019) (0.034) (0.053) (0.078) (0.021) (0.034) (0.067)

NREGA*Rice 0.071** -0.043 -0.010 0.028 0.055 -0.048 0.010 -0.098*** -0.057 -0.075 -0.008 -0.030 0.142** -0.117*

(0.030) (0.036) (0.028) (0.026) (0.045) (0.034) (0.027) (0.026) (0.040) (0.071) (0.098) (0.026) (0.051) (0.071)

NREGA*Both 0.002 -0.070 -0.024 0.020 0.207** -0.019 0.004 0.012 0.115 0.068 -0.157 0.005 0.043 -0.032

(0.040) (0.041) (0.029) (0.033) (0.070) (0.069) (0.027) (0.036) (0.057) (0.098) (0.135) (0.033) (0.057) (0.085)

Marginal Effects of NREGA

Wheat Regions 0.005 0.000 -0.005 -0.006* -0.001 -0.020*** -0.004 -0.017*** -0.012** -0.015 0.018 -0.013*** 0.007 0.012

(0.004) (0.004) (0.004) (0.004) (0.007) (0.006) (0.004) (0.004) (0.004) (0.010) (0.014) (0.003) (0.006) (0.007)

Rice Regions 0.010** -0.002 -0.005 -0.005 -0.003 -0.024*** -0.003 -0.026*** -0.020*** -0.025* 0.022 -0.016*** 0.016* 0.004

(0.005) (0.006) (0.005) (0.005) (0.009) (0.007) (0.005) (0.006) (0.005) (0.014) (0.018) (0.004) (0.008) (0.009)

Regions producing both -0.003 -0.008 -0.008 -0.006 0.034** -0.017 -0.004 0.000 0.015 0.009 -0.012 -0.007 -0.002 0.020

(0.006) (0.008) (0.005) (0.006) (0.016) (0.016) (0.005) (0.008) (0.010) (0.021) (0.026) (0.006) (0.009) (0.012)

Other Controls Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

Land included Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

N 38141 37536 37722 37866 28802 16616 38103 38146 37211 30407 25213 38112 38081 37814

Notes: * p<0.10, ** p<0.05, *** p<0.01. Estimation is via a fractional logit model with correlated random effects at district level. The sample is restricted to households with at least one adult female and one adult male member. The sample is further restricted to regions that are rice producing, wheat producing, or that produce both rice and wheat. DRice=1 for rice regions. If DRice=0, then DBoth is also equal to zero. Dependent variables are budget shares spent on 14 separate commodity categories out of the total monthly spending by a household in a district at a particular point in time. Additional controls included in each specification: district fixed effects, dummy for rice regions, dummy for regions that produce both rice and wheat, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion, and means of all controls at district level across time. Standard errors are clustered at district level and reported in parentheses. For the model where the outcome is school expenditure, the sample is further restricted to households with at least one adult male and one adult female member that have school-going children.

Table B.6. Impact of NREGA on Expenditure in Levels

Variables (columns, in order): Cereals; Pulses; Edible Oil; Fuel & Light; Intoxicants; Entertainment; Veg & Fruits; Condiments; Meat & Milk; Medical Expd; School Expd; Personal; Clothing & bedding; Durable Goods

NREGA 0.016 0.013 -0.007 -0.050*** -0.021 -0.199*** 0.034 -0.107*** -0.038 0.001 0.182*** -0.024 0.008 0.172***

(0.021) (0.026) (0.020) (0.019) (0.035) (0.040) (0.024) (0.028) (0.025) (0.060) (0.052) (0.024) (0.024) (0.051)

Other Controls Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

District Fixed Effects Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

Underidentification Test p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000

Weak Identification Test:

Cragg-Donald Wald F statistic 2009.41 1982.264 1991.078 1988.93 1393.10 676.08 1372.140 2003.064 1873.19 1541.97 1013.90 1998.36 1963.02 1923.68

Kleibergen-Paap rk Wald F statistic 469.84 464.280 463.197 468.77 347.77 202.95 468.519 471.280 436.21 411.24 336.69 473.85 475.29 478.08

Endogeneity Test p = 0.001 p = 0.269 p = 0.297 p = 0.023 p = 0.051 p = 0.021 p = 0.689 p = 0.000 p = 0.000 p = 0.000 p = 0.017 p = 0.008 p = 0.000 p = 0.000

N 80234 79427 79610 79738 57931 37019 80157 80248 78466 63887 52018 80159 80083 79628

Notes: * p<0.10, ** p<0.05, *** p<0.01. Estimation is via an Instrumental Variable approach. The sample is restricted to households with at least one adult female and one adult male member. Dependent variables are in natural log form (log of monthly expenditure); thus, the effect implied by the NREGA coefficient β should be interpreted as e^(β)-1, and the impact in percentage terms is (e^(β)-1)*100. Additional controls included in each specification: district fixed effects, log of total consumption, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion. The instrument for total consumption is land possessed. Standard errors are clustered at district level and reported in parentheses. Underidentification Test reports the p-value of the Kleibergen-Paap (2006) rk statistic with rejection implying identification; Endogeneity Test reports the p-value with the null being that the variable is exogenous; Weak Identification Test reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic. For the model where the outcome is school expenditure, the sample is further restricted to households with at least one adult male and one adult female member that have school-going children.

Table B.7. Heterogeneous Impacts of NREGA on Expenditure in Levels: Female Share of NREGA Employment

Variable (columns, in order): Cereals; Pulses; Edible Oil; Fuel & Light; Intoxicants; Entertainment; Veg & Fruits; Condiments; Meat & Milk; Medical Expd; School Expd; Personal; Clothing & bedding; Durable Goods

NREGA -0.002 0.003 0.005 -0.071*** -0.084* -0.083 -0.030 -0.092** -0.014 0.187** -0.082 -0.036 -0.021 0.029

(0.029) (0.037) (0.028) (0.026) (0.049) (0.059) (0.037) (0.045) (0.036) (0.090) (0.093) (0.033) (0.035) (0.070)

NREGA*Female share of NREGA employment 0.029 0.020 -0.020 0.027 0.181** -0.253*** 0.176*** -0.044 -0.046 -0.391*** 0.729*** 0.010 0.079 0.393***

(0.061) (0.067) (0.056) (0.048) (0.086) (0.096) (0.059) (0.071) (0.074) (0.151) (0.172) (0.058) (0.069) (0.129)

Other Controls Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

District Fixed Effects Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

Marginal Effects of NREGA

Female share of NREGA employment = 25% 0.005 0.008 0.000 -0.065*** -0.038 -0.146*** 0.014 -0.103*** -0.025 0.090 0.100 -0.033 -0.001 0.127**

(0.022) (0.028) (0.021) (0.020) (0.038) (0.044) (0.027) (0.033) (0.026) (0.066) (0.073) (0.025) (0.025) (0.053)

Female share of NREGA employment = 75% 0.019 0.018 -0.010 -0.051* 0.052 -0.273*** 0.102*** -0.125*** -0.048 -0.106 0.464*** -0.028 0.039 0.324***

(0.034) (0.037) (0.029) (0.027) (0.046) (0.050) (0.029) (0.034) (0.040) (0.074) (0.099) (0.032) (0.036) (0.071)

Underidentification Test p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000

Weak Identification Test:

Cragg-Donald Wald F statistic 1962.11 1940.42 1950.32 1945.94 1369.64 657.48 1958.33 1960.52 1830.47 1515.05 1409.82 1953.46 1919.72 1879.50

Kleibergen-Paap rk Wald F statistic 459.59 455.18 454.34 459.39 336.05 207.54 459.02 461.94 427.32 404.55 378.68 464.23 466.22 469.56

Endogeneity Test p = 0.001 p = 0.305 p = 0.410 p = 0.017 p = 0.029 p = 0.063 p = 0.509 p = 0.000 p = 0.000 p = 0.000 p = 0.003 p = 0.014 p = 0.000 p = 0.000

N 78436 77787 77919 77971 56617 36164 78356 78448 76716 62531 65037 78366 78287 77885

Notes: * p<0.10, ** p<0.05, *** p<0.01. Estimation is via an Instrumental Variable approach in Diff-in-Diff. The sample is restricted to households with at least one adult female and one adult male member. Dependent variables are in natural log form (log of monthly expenditure); thus, the coefficient for the dummy variables should be interpreted as e^(β)-1, and the impact in percentage terms is (e^(β)-1)*100. Additional controls included in each specification: district fixed effects, log of total consumption, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion. The instrument for total consumption is land possessed. Standard errors are clustered at district level and reported in parentheses. Underidentification Test reports the p-value of the Kleibergen-Paap (2006) rk statistic with rejection implying identification; Endogeneity Test reports the p-value with the null being that the variable is exogenous; Weak Identification Test reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic. Joint significance tests report the statistical significance of the total impact of NREGA evaluated at the maximum of state stipulated standardized minimum wages as well as the minimum bound. For the model where the outcome is school expenditure, the sample is further restricted to households with at least one adult male and one adult female member that have school-going children.

Table B.8. Heterogeneous Impacts of NREGA on Expenditure in Levels: State Stipulated Minimum Wages

Variable (columns, in order): Cereals; Pulses; Edible Oil; Fuel & Light; Intoxicants; Entertainment; Veg & Fruits; Condiments; Meat & Milk; Medical Expd; School Expd; Personal; Clothing & bedding; Durable Goods

NREGA 0.179** -0.201** -0.208*** -0.143* -0.061 -0.498*** -0.219** -0.380*** -0.122 0.054 0.044 -0.097 -0.210** -0.018

(0.079) (0.101) (0.078) (0.081) (0.142) (0.136) (0.096) (0.091) (0.097) (0.225) (0.217) (0.093) (0.088) (0.183)

NREGA*minW -0.166** 0.210** 0.202*** 0.094 0.040 0.302*** 0.247*** 0.281*** 0.079 -0.037 0.140 0.07 0.210** 0.168

(0.072) (0.092) (0.077) (0.079) (0.141) (0.115) (0.095) (0.089) (0.096) (0.215) (0.213) (0.088) (0.084) (0.173)

Other Controls Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

District Fixed Effects Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

Marginal Effects of NREGA

NREGA*minW (at MinW = Rs. 82.50 per day) -0.097** 0.111* 0.080 -0.018 0.044 -0.035 0.208*** 0.023 -0.052 -0.029 0.289* 0.002 0.126 0.267**

(0.047) (0.061) (0.053) (0.05) (0.101) (0.076) (0.063) (0.067) (0.065) (0.155) (0.171) (0.059) (0.059) (0.115)

NREGA*minW (at MinW = Rs. 159.40 per day) 0.037 -0.022 -0.028 -0.058** 0.002 -0.255*** -0.015 -0.133*** -0.037 -0.016 0.136 -0.031 -0.015** 0.145**

(0.026) (0.034) (0.024) (0.024) (0.041) (0.055) (0.029) (0.032) (0.031) (0.074) (0.088) (0.030) (0.029) (0.061)

Underidentification Test p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000

Weak Identification Test:

Cragg-Donald Wald F statistic 2025.88 1998.674 2014.201 2005.73 1399.16 688.56 2019.681 2020.939 1889.69 1555.88 1017.063 2014.84 1979.88 1940.58

Kleibergen-Paap rk Wald F statistic 474.63 469.085 470.625 473.71 351.64 203.66 473.344 476.208 440.17 415.53 335.54 478.86 480.17 483.27

Endogeneity Test p = 0.001 p = 0.326 p = 0.347 p = 0.024 p = 0.050 p = 0.030 p = 0.570 p = 0.000 p = 0.000 p = 0.000 p = 0.016 p = 0.010 p = 0.000 p = 0.000

N 80234 79427 79610 79738 57931 37019 80157 80248 78466 63887 52018 80159 80083 79628

Notes: * p<0.10, ** p<0.05, *** p<0.01. Estimation is via an Instrumental Variable approach in Diff-in-Diff. The sample is restricted to households with at least one adult female and one adult male member. Dependent variables are in natural log form (log of monthly expenditure); thus, the coefficient for the dummy variables should be interpreted as e^(β)-1, and the impact in percentage terms is (e^(β)-1)*100. Additional controls included in each specification: district fixed effects, minimum wages, log of total consumption, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion. The instrument for total consumption is land possessed. Standard errors are clustered at district level and reported in parentheses. Underidentification Test reports the p-value of the Kleibergen-Paap (2006) rk statistic with rejection implying identification; Endogeneity Test reports the p-value with the null being that the variable is exogenous; Weak Identification Test reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic. Joint significance tests report the statistical significance of the total impact of NREGA evaluated at the maximum of state stipulated standardized minimum wages as well as the minimum bound. For the model where the outcome is school expenditure, the sample is further restricted to households with at least one adult male and one adult female member that have school-going children.

Table B.9. Heterogeneous Impacts of NREGA on Expenditure in Levels: Crop Regions

Variables (columns, in order): Cereals; Pulses; Edible Oil; Fuel & Light; Intoxicants; Entertainment; Veg & Fruits; Condiments; Meat & Milk; Medical Expd; School Expd; Personal; Clothing & bedding; Durable Goods

NREGA -0.076* 0.021 0.000 -0.085*** -0.026 -0.085 0.033 0.013 0.000 0.053 0.232** -0.039 -0.049 0.238**

(0.041) (0.049) (0.035) (0.033) (0.058) (0.066) (0.049) (0.044) (0.048) (0.132) (0.104) (0.052) (0.064) (0.121)

NREGA*Rice 0.141*** -0.063 -0.042 0.071 -0.03 -0.136 -0.105* -0.314*** -0.120** -0.182 0.054 -0.07 0.129* -0.104

(0.052) (0.071) (0.047) (0.045) (0.086) (0.086) (0.063) (0.061) (0.061) (0.166) (0.118) (0.059) (0.077) (0.132)

NREGA*Both -0.018 -0.084 0.067 0.004 0.328*** -0.144 0.071 0.002 0.03 0.405 -0.225 -0.047 0.105 -0.178

(0.081) (0.069) (0.041) (0.042) (0.111) (0.200) (0.061) (0.088) (0.071) (0.269) (0.143) (0.086) (0.081) (0.169)

Other Controls Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

District Fixed Effects Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

Marginal Effects of NREGA

Wheat Regions -0.076* 0.021 0.000 -0.085*** -0.026 -0.085 0.033 0.013 0.000 0.053 0.232** -0.039 -0.049 0.238**

(0.041) (0.049) (0.035) (0.033) (0.058) (0.066) (0.049) (0.044) (0.048) (0.132) (0.104) (0.052) (0.064) (0.121)

Rice Regions 0.065 -0.042 -0.042 -0.014 -0.056 -0.221** -0.072 -0.301*** -0.12** -0.129 0.286** -0.109** 0.080* 0.134

(0.043) (0.063) (0.045) (0.040) (0.072) (0.082) (0.054) (0.064) (0.057) (0.137) (0.095) (0.046) (0.052) (0.093)

Regions producing both -0.094 -0.063 0.067* -0.081** 0.302** -0.229 0.104** 0.015 0.030 0.458* 0.007 -0.086 0.056 0.060

(0.048) (0.047) (0.040) (0.033) (0.091) (0.216) (0.054) (0.080) (0.060) (0.193) (0.125) (0.068) (0.057) (0.172)

Underidentification Test p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000 p = 0.000

Weak Identification Test:

Cragg-Donald Wald F statistic 751.05 743.472 742.872 739.93 549.48 215.04 743.272 746.410 661.93 574.27 405.92 747.25 725.14 720.71

Kleibergen-Paap rk Wald F statistic 176.44 175.169 172.177 173.75 130.63 63.27 175.450 175.133 163.66 152.40 125.07 178.43 178.91 176.99

Endogeneity Test p = 0.067 p = 0.000 p = 0.070 p = 0.691 p = 0.000 p = 0.000 p = 0.000 p = 0.001 p = 0.000 p = 0.054 p=0.4018 p = 0.127 p = 0.384 p = 0.000

N 38141 37536 37722 37866 28802 16616 38103 38146 37211 30407 25213 38112 38082 37814

Notes: * p<0.10, ** p<0.05, *** p<0.01. Estimation is via an Instrumental Variable approach. The sample is restricted to households with at least one adult female and one adult male member. The sample is further restricted to regions that are rice producing, wheat producing, or that produce both. If Rice=0, then Both is also equal to zero. Dependent variables are in natural log form (log of monthly expenditure); thus, the coefficient for the dummy variables should be interpreted as e^(β)-1, and the impact in percentage terms is (e^(β)-1)*100. Additional controls included in each specification: district fixed effects, dummy variables for regions that produce rice and for regions that produce both rice and wheat, log of total consumption, household size, age of the head of the household, age squared, number of children, number of literate male and female members, number of male and female members with primary, middle, higher and technical education, Scheduled Tribe (ST), Scheduled Caste (SC), Other Backward Class (OBC), Hindu, Islam, Christianity, Sikhism, and other religion. The instrument for total consumption is land possessed. Standard errors are clustered at district level and reported in parentheses. Underidentification Test reports the p-value of the Kleibergen-Paap (2006) rk statistic with rejection implying identification; Endogeneity Test reports the p-value with the null being that the variable is exogenous; Weak Identification Test reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic. Joint significance tests report the statistical significance of the total impact of NREGA given the interaction of NREGA with rice producing regions and the interaction of NREGA with regions that produce both rice and wheat. For the model where the outcome is school expenditure, the sample is further restricted to households with at least one adult male and one adult female member that have school-going children.

Appendix C

THE EFFECT OF QUALITY OF EDUCATION ON CRIME: EVIDENCE FROM COLOMBIA

Figure C.1. Crime Rate 2007

Figure C.2. Education Quality 2007

Figure C.3. Crime Rate 2013

Figure C.4. Education Quality 2013

Table C.1: Summary statistics

Variable Mean Std. Dev. Min. Max. N

Car Theft Rate 5.666 11.21 0 124.681 5642

Commerce Theft Rate 16.133 25.982 0 405.077 7362

Thefts on Person Rate 44.353 73.77 0 632.972 7606

Household Theft Rate 22.441 38.738 0 471.884 7474

Total Kidnappings Rate 1.032 5.103 0 185.563 7851

Political Kidnapping Rate 0.457 3.77 0 185.563 7851

Non Political Kidnapping Rate 0.575 3.375 0 130.719 7851

Homicide Rate 31.69 38.232 0 485.794 7606

Average Score in Subjects 29.678 14.148 0 55.017 7765

Average Score in Cognitive Areas 29.943 14.38 0 62.78 7765

Language Median Score 31.055 14.849 0 57.32 7765

Math Median Score 28.83 14.139 0 69.010 7765

Average Score in Social Areas 28.079 13.673 0 52.455 7765

Social Sciences Median Score 29.469 14.094 0 58.775 7765

Philosophy Median Score 26.689 13.509 0 51.89 7765

Biology Median Score 30.687 14.549 0 53.19 7765

Total Score 47.6 3.089 32.434 62.587 7764

Language Score 47.703 2.624 23.389 64.807 7764

Math Score 48.11 2.594 30.703 69.078 7764

Philosophy Score 48.594 2.357 32.72 60.096 7764

Biology Score 48.197 2.485 34.167 61.19 7764

Social Sciences Score 48.176 2.712 27.883 68.369 7764

Total Median Score 32.621 15.251 0 61.354 7765

Subjects Median Z Score 0 0.988 -2.071 1.773 7765

Total Population (log) 9.545 1.126 5.509 15.853 7851

Birth Rate 13.182 4.722 0 52.217 7842

Infant Mortality Rate 21.987 9.543 6.507 91.97 7854

Rurality Index 0.574 0.244 0.001 1 7851

Agricultural Yield 7.294 11.711 0 136.535 7660

Projected Population to Attend Primary School (log) 7.29 1.127 3.497 13.349 7851

Projected Population to Attend Secondary School (log) 7.466 1.124 3.611 13.559 7851

Per Capita Total Expenditure 0.004 0.012 0 0.136 7687

Per Capita Total Tax Revenue 0.002 0.008 0 0.131 7689

Investment in Quality of Education (2005 constant million $) 759.7 2335.1 0 87388 7854

Per Capita Average Investment in Quality of Neighbors 2638.4 10870.2 0 156671.5 7770

Per Capita Shift Share of Investment on Quality (miles) 62.4 573.9 0.001 18369.8 7357

Source: DANE, CEDE, ICFES, DNP, IPUMS, National Police and author's calculations

Table C.2. Crime and Education Quality (Without Bogota)

Columns, in order: Crime; Property Crime; Violent Crime

(1) (2) (3)

Average Score in Subjects -5.73*** -6.05*** 0.24

(2.00) (2.17) (0.98)

Municipality FE Yes Yes Yes

Controls Yes Yes Yes

Trend Yes Yes Yes

Adjusted-R2 -2.62 -2.89 -0.19

Observations 4486 4491 6134

Underidentification 0.011 0.011 0.001

Weak Identification 24.198 24.125 22.439

Overidentification 0.472 0.575 0.816

Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.
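The significance stars in these tables follow from the ratio of each coefficient to its standard error. As a rough cross-check, the sketch below recomputes two-sided p-values from the rounded values printed in Table C.2. It assumes a standard normal reference distribution, which is an assumption made here for illustration; the source does not state which distribution the estimation software used, and results from rounded inputs may differ slightly from the original estimates. The function names `two_sided_p` and `stars` are hypothetical helpers.

```python
import math


def two_sided_p(coef: float, se: float) -> float:
    """Two-sided p-value for H0: coef = 0, using a standard normal approximation."""
    z = coef / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi
    return math.erfc(abs(z) / math.sqrt(2.0))


def stars(coef: float, se: float) -> str:
    """Map a p-value to the star convention used in the table notes."""
    p = two_sided_p(coef, se)
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


# (coefficient, standard error) pairs as printed in Table C.2
table_c2 = {
    "Crime": (-5.73, 2.00),
    "Property Crime": (-6.05, 2.17),
    "Violent Crime": (0.24, 0.98),
}

for name, (b, se) in table_c2.items():
    print(f"{name}: z = {b / se:.2f}, p = {two_sided_p(b, se):.3f} {stars(b, se)}")
```

Under this approximation the recomputed stars agree with those printed for the three columns of Table C.2.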

Table C.3. Disaggregated Crime and Education Quality (Without Bogota)

Columns, in order: Car; Commerce; Household; Person; Kidnap.; Pol. Kidnap.; Non Pol. Kidnap.; Homicide

(1) (2) (3) (4) (5) (6) (7) (8)

Average Score in Subjects -6.32*** 0.69 0.07 -3.64 -3.27** -0.23 -4.56** 0.69

(2.16) (1.21) (0.85) (2.68) (1.59) (0.82) (2.10) (1.02)

Municipality FE Yes Yes Yes Yes Yes Yes Yes Yes

Controls Yes Yes Yes Yes Yes Yes Yes Yes

Trend Yes Yes Yes Yes Yes Yes Yes Yes

Adjusted-R2 -1.75 -0.21 -0.19 -1.50 -0.51 -0.19 -0.83 -0.22

Observations 4586 5962 6036 6130 6213 6213 6213 6134

Underidentification 0.013 0.001 0.001 0.001 0.001 0.001 0.001 0.001

Weak Identification 20.330 22.125 22.376 22.590 22.610 22.610 22.610 22.439

Overidentification 0.568 0.190 0.210 0.291 0.531 0.821 0.483 0.987

Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.4. Crime and Education Quality (Without State Capitals)

Columns, in order: Crime; Property Crime; Violent Crime

(1) (2) (3)

Average Score in Subjects -3.60*** -3.87*** 0.32

(1.31) (1.48) (0.99)

Municipality FE Yes Yes Yes

Controls Yes Yes Yes

Trend Yes Yes Yes

Adjusted-R2 -0.94 -1.02 -0.20

Observations 4324 4329 5954

Underidentification 0.016 0.016 0.004

Weak Identification 15.755 15.708 18.554

Overidentification 0.210 0.260 0.875

Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.5. Disaggregated Crime and Education Quality (Without State Capitals)

Columns, in order: Car; Commerce; Household; Person; Kidnap.; Pol. Kidnap.; Non Pol. Kidnap.; Homicide

(1) (2) (3) (4) (5) (6) (7) (8)

Average Score in Subjects -5.33* -0.11 0.43 -0.73 -3.44* -0.29 -4.74** 0.80

(2.89) (1.09) (0.93) (0.86) (1.81) (0.92) (2.34) (1.03)

Municipality FE Yes Yes Yes Yes Yes Yes Yes Yes

Controls Yes Yes Yes Yes Yes Yes Yes Yes

Trend Yes Yes Yes Yes Yes Yes Yes Yes

Adjusted-R2 -1.18 -0.17 -0.20 -0.21 -0.54 -0.19 -0.87 -0.23

Observations 4424 5782 5856 5950 6033 6033 6033 5954

Underidentification 0.018 0.004 0.005 0.004 0.004 0.004 0.004 0.004

Weak Identification 13.057 18.201 17.686 18.296 18.667 18.667 18.667 18.554

Overidentification 0.461 0.228 0.187 0.128 0.523 0.872 0.479 0.955

Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.6. Violence and Education Quality (With Population <200,000 Inhabitants)

Columns, in order: Violence; Property Crime; Violent Crime

(1) (2) (3)

Average Score in Subjects -3.67*** -3.90*** 0.32

(1.27) (1.41) (0.99)

Municipality FE Yes Yes Yes

Controls Yes Yes Yes

Trend Yes Yes Yes

Adjusted-R2 -0.99 -1.07 -0.20

Observations 4336 4341 5984

Underidentification 0.015 0.015 0.004

Weak Identification 15.896 15.847 18.556

Overidentification 0.221 0.274 0.882

Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.7. Disaggregated Crime and Education Quality (With Population < 200,000 Inhabitants)

Columns, in order: Car; Commerce; Household; Person; Kidnap.; Pol. Kidnap.; Non Pol. Kidnap.; Homicide

(1) (2) (3) (4) (5) (6) (7) (8)

Average Score in Subjects -5.47* -0.14 0.31 -0.74 -3.42* -0.24 -4.76** 0.79

(2.97) (1.07) (0.88) (0.84) (1.80) (0.92) (2.34) (1.04)

Municipality FE Yes Yes Yes Yes Yes Yes Yes Yes

Controls Yes Yes Yes Yes Yes Yes Yes Yes

Trend Yes Yes Yes Yes Yes Yes Yes Yes

Adjusted-R2 -1.20 -0.17 -0.19 -0.21 -0.54 -0.19 -0.88 -0.23

Observations 4436 5812 5886 5980 6063 6063 6063 5984

Underidentification 0.017 0.004 0.005 0.004 0.004 0.004 0.004 0.004

Weak Identification 13.200 18.215 17.679 18.289 18.668 18.668 18.668 18.556

Overidentification 0.467 0.234 0.194 0.130 0.520 0.850 0.473 0.949

Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.8. Crime and Education Quality (Rural Areas)

                                  Violence        Property Crime  Violent Crime
                                  (1)             (2)             (3)
Average Score in Subjects         -4.16*          -5.48**         0.87
                                  (2.38)          (2.62)          (1.12)
Municipality FE                   Yes             Yes             Yes
Controls                          Yes             Yes             Yes
Trend                             Yes             Yes             Yes
Adjusted-R2                       -0.87           -1.34           -0.24
Observations                      2588            2588            3961
Underidentification               0.001           0.001           0.018
Weak Identification               10.409          10.409          11.827
Overidentification                0.288           0.454           0.932
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.9. Disaggregated Crime and Education Quality (Rural Areas)

                                  Car         Commerce    Household   Person      Kidnap.     Pol. Kidnap.   Non Pol. Kidnap.   Homid.
                                  (1)         (2)         (3)         (4)         (5)         (6)            (7)                (8)
Average Score in Subjects         -5.84       0.45        1.95        -0.80       -4.01       -0.68          -4.48*             1.43
                                  (3.93)      (1.52)      (1.35)      (1.53)      (2.53)      (1.67)         (2.67)             (1.17)
Municipality FE                   Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Controls                          Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Trend                             Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Adjusted-R2                       -1.07       -0.19       -0.33       -0.22       -0.65       -0.20          -0.76              -0.31
Observations                      2668        3805        3858        3946        4029        4029           4029               3961
Underidentification               0.002       0.018       0.011       0.018       0.017       0.017          0.017              0.018
Weak Identification               9.245       11.418      12.650      11.263      11.791      11.791         11.791             11.827
Overidentification                0.472       0.240       0.242       0.293       0.672       0.803          0.576              0.968
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.10. Crime and Education Quality (Urban Areas)

                                  Violence        Property Crime  Violent Crime
                                  (1)             (2)             (3)
Average Score in Subjects         -1.67           -1.10           -2.41***
                                  (1.45)          (1.41)          (0.72)
Municipality FE                   Yes             Yes             Yes
Controls                          Yes             Yes             Yes
Trend                             Yes             Yes             Yes
Adjusted-R2                       -0.50           -0.26           -0.92
Observations                      1893            1897            2165
Underidentification               0.500           0.501           0.319
Weak Identification               18.799          18.566          24.835
Overidentification                0.799           0.718           0.111
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.11. Disaggregated Crime and Education Quality (Urban Areas)

                                  Car         Commerce    Household   Person      Kidnap.     Pol. Kidnap.   Non Pol. Kidnap.   Homid.
                                  (1)         (2)         (3)         (4)         (5)         (6)            (7)                (8)
Average Score in Subjects         -2.03       -0.94       -1.06       -0.54       0.71        0.91           -0.39              -2.54***
                                  (1.43)      (0.63)      (0.97)      (1.43)      (1.27)      (0.63)         (2.64)             (0.76)
Municipality FE                   Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Controls                          Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Trend                             Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Adjusted-R2                       -0.57       -0.23       -0.28       -0.17       -0.23       -0.25          -0.20              -1.04
Observations                      1912        2148        2169        2175        2175        2175           2175               2165
Underidentification               0.497       0.333       0.458       0.318       0.318       0.318          0.318              0.319
Weak Identification               18.973      24.340      23.947      24.692      24.692      24.692         24.692             24.835
Overidentification                0.492       0.127       0.155       0.267       0.313       0.560          0.343              0.044
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.12. Crime and Education Quality (Total Transfers as Instruments)

                                                Crime Rate
                                  Crime           Property Crime  Violent Crime
                                  (1)             (2)             (3)
Average Score in Subjects         -2.19           -2.76**         1.05
                                  (1.38)          (1.41)          (1.16)
Municipality FE                   Yes             Yes             Yes
Controls                          Yes             Yes             Yes
Trend                             Yes             Yes             Yes
Adjusted-R2                       -0.49           -0.67           -0.26
Observations                      4642            4647            6390
Underidentification               0.001           0.001           0.001
Weak Identification               23.106          23.101          15.720
Overidentification                0.053           0.088           0.127
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.13. Disaggregated Crime and Education Quality (Total Transfers as Instruments)

                                                                            Crime Rate
                                  Car         Commerce    Household   Person      Kidnap.     Pol. Kidnap.   Non Pol. Kidnap.   Homid.
                                  (1)         (2)         (3)         (4)         (5)         (6)            (7)                (8)
Average Score in Subjects         -3.14       -0.27       0.03        -1.76       -2.84       0.18           -4.44              1.46
                                  (2.22)      (1.04)      (0.65)      (2.22)      (2.09)      (0.86)         (3.12)             (1.22)
Municipality FE                   Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Controls                          Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Trend                             Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Adjusted-R2                       -0.57       -0.17       -0.19       -0.46       -0.43       -0.19          -0.79              -0.32
Observations                      4754        6183        6274        6380        6469        6469           6469               6390
Underidentification               0.002       0.001       0.001       0.001       0.001       0.001          0.001              0.001
Weak Identification               13.591      16.427      23.803      16.888      15.978      15.978         15.978             15.720
Overidentification                0.082       0.912       0.106       0.208       0.279       0.232          0.588              0.089
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.14. Crime and Education Quality (Total Transfers as an Additional Regressor)

                                  Violence        Property Crime  Violent Crime
                                  (1)             (2)             (3)
Average Score in Subjects         -9.62**         -9.50**         -0.97
                                  (4.10)          (4.19)          (1.27)
Per Capita Total Transfers        0.21            0.18            0.09
                                  (0.14)          (0.13)          (0.06)
Municipality FE                   Yes             Yes             Yes
Controls                          Yes             Yes             Yes
Trend                             Yes             Yes             Yes
Adjusted-R2                       -7.27           -7.08           -0.25
Observations                      4486            4491            6134
Underidentification               0.056           0.056           0.016
Weak Identification               6.739           6.671           14.352
Overidentification                0.454           0.533           0.213
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.15. Disaggregated Crime and Education Quality (Total Transfers as an Additional Regressor)

                                  Car         Commerce    Household   Person      Kidnap.     Pol. Kidnap.   Non Pol. Kidnap.   Homid.
                                  (1)         (2)         (3)         (4)         (5)         (6)            (7)                (8)
Average Score in Subjects         -10.89**    1.18        -0.62       -5.42       -3.92**     -0.29          -5.46*             -0.53
                                  (4.55)      (1.57)      (1.70)      (3.46)      (1.97)      (0.85)         (2.92)             (1.31)
Per Capita Total Transfers        0.23        -0.03       0.07        0.14        0.05        0.00           0.07               0.09
                                  (0.16)      (0.05)      (0.06)      (0.10)      (0.10)      (0.04)         (0.14)             (0.06)
Municipality FE                   Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Controls                          Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Trend                             Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Adjusted-R2                       -4.84       -0.27       -0.21       -3.19       -0.65       -0.19          -1.09              -0.21
Observations                      4586        5962        6036        6130        6213        6213           6213               6134
Underidentification               0.070       0.016       0.014       0.014       0.016       0.016          0.016              0.016
Weak Identification               5.738       13.958      15.028      14.235      14.129      14.129         14.129             14.352
Overidentification                0.496       0.260       0.085       0.120       0.308       0.853          0.287              0.292
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.16. Crime and Education Quality (Total Transfers instead of Total Expenditures)

                                  Crime           Property Crime  Violent Crime
                                  (1)             (2)             (3)
Average Score in Subjects         -1.53           -1.89           0.71
                                  (1.29)          (1.29)          (1.51)
Per Capita Total Transfers        -0.05           -0.06           0.02
                                  (0.08)          (0.07)          (0.05)
Municipality FE                   Yes             Yes             Yes
Controls                          Yes             Yes             Yes
Trend                             Yes             Yes             Yes
Adjusted-R2                       -0.31           -0.38           -0.22
Observations                      4642            4647            6390
Underidentification               0.023           0.023           0.023
Weak Identification               4.837           4.811           6.638
Overidentification                0.042           0.075           0.118
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.17. Disaggregated Crime and Education Quality (Total Transfers instead of Total Expenditures)

                                  Car         Commerce    Household   Person      Kidnap.     Pol. Kidnap.   Non Pol. Kidnap.   Homid.
                                  (1)         (2)         (3)         (4)         (5)         (6)            (7)                (8)
Average Score in Subjects         -2.78       -0.08       -0.75       -1.68       -3.12       0.33           -5.03              1.07
                                  (2.25)      (1.23)      (0.69)      (1.94)      (2.66)      (1.27)         (4.19)             (1.49)
Per Capita Total Transfers        -0.03       -0.01       0.05*       -0.00       0.02        -0.01          0.04               0.02
                                  (0.09)      (0.05)      (0.03)      (0.08)      (0.10)      (0.05)         (0.14)             (0.05)
Municipality FE                   Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Controls                          Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Trend                             Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Adjusted-R2                       -0.49       -0.17       -0.21       -0.43       -0.48       -0.20          -0.95              -0.26
Observations                      4754        6183        6274        6380        6469        6469           6469               6390
Underidentification               0.038       0.018       0.011       0.021       0.024       0.024          0.024              0.023
Weak Identification               4.346       6.595       7.149       6.965       6.547       6.547          6.547              6.638
Overidentification                0.075       0.874       0.069       0.188       0.297       0.198          0.621              0.083
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.18. Crime and Education Quality (Total Transfers Instrumented)

                                  Crime           Property Crime  Violent Crime
                                  (1)             (2)             (3)
Average Score in Subjects         0.70            0.00            1.79
                                  (1.66)          (1.48)          (1.47)
Total Transfers                   0.34***         0.32***         0.15
                                  (0.06)          (0.06)          (0.19)
Municipality FE                   Yes             Yes             Yes
Controls                          Yes             Yes             Yes
Trend                             Yes             Yes             Yes
Adjusted-R2                       -0.23           -0.15           -0.39
Observations                      4486            4491            6117
Underidentification               0.140           0.140           0.077
Weak Identification               2.965           2.966           4.983
Overidentification                0.005           0.014           0.364
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.
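
The three diagnostics reported at the bottom of each table can be read mechanically, as the table notes describe. The sketch below is illustrative only: the class `IVDiagnostics`, the function `read_diagnostics`, the 5% test level, and the rule-of-thumb threshold of 10 for the first-stage F statistic are assumptions introduced here, not thresholds stated in the text.

```python
from dataclasses import dataclass


@dataclass
class IVDiagnostics:
    underid_p: float   # p-value of the Kleibergen-Paap rk statistic
    weak_id_f: float   # Kleibergen-Paap F statistic
    overid_p: float    # p-value of the Hansen J statistic


def read_diagnostics(d: IVDiagnostics, alpha: float = 0.05,
                     f_threshold: float = 10.0) -> dict:
    """Summarize the three diagnostics under assumed cutoffs."""
    return {
        # Rejecting the underidentification null implies identification.
        "identified": d.underid_p < alpha,
        # Assumed rule of thumb: F below the threshold suggests weak instruments.
        "weak_instruments": d.weak_id_f < f_threshold,
        # Rejecting the Hansen J null casts doubt on joint instrument validity.
        "overid_rejected": d.overid_p < alpha,
    }


if __name__ == "__main__":
    # Column (1) of the table above: 0.140, 2.965, 0.005.
    print(read_diagnostics(IVDiagnostics(0.140, 2.965, 0.005)))
```

Different cutoffs can be passed through `alpha` and `f_threshold`; the defaults are conventions chosen for this sketch rather than part of the reported analysis.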

Table C.19. Disaggregated Crime and Education Quality (Total Transfers Instrumented)

                                  Car         Commerce    Household   Person      Kidnap.     Pol. Kidnap.   Non Pol. Kidnap.   Homid.
                                  (1)         (2)         (3)         (4)         (5)         (6)            (7)                (8)
Average Score in Subjects         -0.79       0.84        1.40        0.06        -5.75**     -1.04          -7.34**            2.59
                                  (2.51)      (1.11)      (0.94)      (0.94)      (2.57)      (1.19)         (3.57)             (1.69)
Total Transfers                   0.26**      0.04        0.17*       0.38***     -0.33       -0.12          -0.36              0.20
                                  (0.11)      (0.18)      (0.09)      (0.13)      (0.33)      (0.14)         (0.40)             (0.22)
Municipality FE                   Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Controls                          Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Trend                             Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Adjusted-R2                       -0.22       -0.23       -0.30       -0.18       -1.19       -0.22          -1.81              -0.61
Observations                      4586        5945        6019        6113        6196        6196           6196               6117
Underidentification               0.139       0.073       0.050       0.076       0.071       0.071          0.071              0.077
Weak Identification               2.123       5.004       8.650       4.806       4.980       4.980          4.980              4.983
Overidentification                0.140       0.415       0.149       0.003       0.189       0.463          0.321              0.539
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.20. Crime and Education Quality (Cognitive Areas)

                                  Violence        Property Crime  Violent Crime
                                  (1)             (2)             (3)
Average Score in Cognitive Areas  -11.77***       -12.28***       0.30
                                  (3.44)          (3.70)          (1.24)
Municipality FE                   Yes             Yes             Yes
Controls                          Yes             Yes             Yes
Trend                             Yes             Yes             Yes
Adjusted-R2                       -13.45          -14.64          -0.20
Observations                      4486            4491            6134
Underidentification               0.032           0.031           0.019
Weak Identification               11.530          11.618          7.805
Overidentification                0.845           0.981           0.814
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.21. Disaggregated Crime and Education Quality (Cognitive Areas)

                                  Car         Commerce    Household   Person      Kidnap.     Pol. Kidnap.   Non Pol. Kidnap.   Homid.
                                  (1)         (2)         (3)         (4)         (5)         (6)            (7)                (8)
Average Score in Cognitive Areas  -12.79***   0.86        -0.05       -4.68       -4.14*      -0.29          -5.78**            0.87
                                  (4.67)      (1.53)      (1.00)      (3.81)      (2.22)      (1.05)         (2.94)             (1.28)
Municipality FE                   Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Controls                          Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Trend                             Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Adjusted-R2                       -8.20       -0.24       -0.19       -3.19       -0.88       -0.19          -1.51              -0.25
Observations                      4586        5962        6036        6130        6213        6213           6213               6134
Underidentification               0.037       0.020       0.015       0.018       0.016       0.016          0.016              0.019
Weak Identification               9.987       7.472       7.018       7.351       7.660       7.660          7.660              7.805
Overidentification                0.872       0.170       0.232       0.386       0.629       0.809          0.582              0.977
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.22. Crime and Education Quality (Social Areas)

                                  Violence        Property Crime  Violent Crime
                                  (1)             (2)             (3)
Average Score in Social Areas     -3.22***        -3.41***        0.15
                                  (1.01)          (1.11)          (0.60)
Municipality FE                   Yes             Yes             Yes
Controls                          Yes             Yes             Yes
Trend                             Yes             Yes             Yes
Adjusted-R2                       -1.73           -1.91           -0.20
Observations                      4486            4491            6134
Underidentification               0.013           0.013           0.001
Weak Identification               26.346          26.251          25.864
Overidentification                0.408           0.495           0.821
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.23. Disaggregated Crime and Education Quality (Social Areas)

                                  Car         Commerce    Household   Person      Kidnap.     Pol. Kidnap.   Non Pol. Kidnap.   Homid.
                                  (1)         (2)         (3)         (4)         (5)         (6)            (7)                (8)
Average Score in Social Areas     -3.58***    0.42        0.06        -2.21       -2.00**     -0.14          -2.79**            0.42
                                  (1.16)      (0.72)      (0.52)      (1.55)      (0.97)      (0.50)         (1.29)             (0.63)
Municipality FE                   Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Controls                          Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Trend                             Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Adjusted-R2                       -1.23       -0.21       -0.19       -1.18       -0.44       -0.19          -0.69              -0.22
Observations                      4586        5962        6036        6130        6213        6213           6213               6134
Underidentification               0.017       0.001       0.002       0.001       0.001       0.001          0.001              0.001
Weak Identification               21.354      25.684      26.029      26.213      26.032      26.032         26.032             25.864
Overidentification                0.499       0.195       0.204       0.265       0.504       0.829          0.457              0.998
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.24. Crime and Education Quality (Total Score)

                                  Violence        Property Crime  Violent Crime
                                  (1)             (2)             (3)
Total Score                       -0.60***        -0.64***        0.05
                                  (0.21)          (0.23)          (0.21)
Municipality FE                   Yes             Yes             Yes
Controls                          Yes             Yes             Yes
Trend                             Yes             Yes             Yes
Adjusted-R2                       -0.48           -0.51           -0.19
Observations                      4486            4491            6134
Underidentification               0.003           0.003           0.007
Weak Identification               18.026          18.031          9.135
Overidentification                0.189           0.223           0.811
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

Table C.25. Disaggregated Crime and Education Quality (Total Score)

                                  Car         Commerce    Household   Person      Kidnap.     Pol. Kidnap.   Non Pol. Kidnap.   Homid.
                                  (1)         (2)         (3)         (4)         (5)         (6)            (7)                (8)
Total Score                       -0.70**     0.13        0.00        -0.75       -0.69*      -0.05          -0.96*             0.14
                                  (0.27)      (0.23)      (0.15)      (0.52)      (0.35)      (0.17)         (0.50)             (0.21)
Municipality FE                   Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Controls                          Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Trend                             Yes         Yes         Yes         Yes         Yes         Yes            Yes                Yes
Adjusted-R2                       -0.41       -0.18       -0.19       -0.80       -0.33       -0.19          -0.48              -0.20
Observations                      4586        5962        6036        6130        6213        6213           6213               6134
Underidentification               0.007       0.006       0.003       0.007       0.007       0.007          0.007              0.007
Weak Identification               13.668      8.966       13.095      9.203       9.065       9.065          9.065              9.135
Overidentification                0.290       0.188       0.218       0.270       0.553       0.809          0.514              0.971
Notes: Standardized coefficients from Instrumental Variable (IV) regression. Heteroskedasticity robust standard error estimates clustered at municipality level are reported in parentheses; *** denotes statistical significance at the 1% level, ** at the 5% level, and * at the 10% level, all for two-sided hypothesis tests. Underidentification Test reports the p-value for the Kleibergen-Paap (2006) rk statistic with rejection implying identification; F-stat reports the Kleibergen-Paap F statistic and Cragg-Donald Wald F statistic for weak identification; Overidentification test reports the p-value for the Hansen J statistic with the null being that the instruments are jointly valid.

BIBLIOGRAPHY

Reuben Abraham. Mobile phones and economic development: Evidence from the fishing industry in India. In Information and Communication Technologies and Development, 2006. ICTD’06. International Conference on, pages 48–56. IEEE, 2006.
Arnab Acharya, Sukumar Vellakkal, Fiona Taylor, Edoardo Masset, Ambika Satija, Margaret Burke, and Shah Ebrahim. The impact of health insurance schemes for the informal sector in low- and middle-income countries: A systematic review. The Research Observer, 28(2):236–266, 2012.
Farzana Afridi, Abhiroop Mukhopadhyay, and Soham Sahoo. Female labor force participation and child education in India: Evidence from the National Rural Employment Guarantee Scheme. IZA Journal of Labor & Development, 5(1):1–27, 2016.
Anna Aizer and Flavio Cunha. The production of human capital: Endowments, investments and fertility. Technical report, National Bureau of Economic Research, 2012.
Richard Akresh and Damien De Walque. Armed conflict and schooling: Evidence from the 1994 Rwandan genocide. Technical Report 4606, World Bank Policy Research Working Paper, 2008.
Shamma Adeeb Alam. Parental health shocks, child labor and educational outcomes: Evidence from Tanzania. Journal of Health Economics, 44:161–175, 2015.
Harold Alderman and Paul Gertler. Family resources and gender differences in human capital investments: The demand for children’s medical care in Pakistan. Intrahousehold Resource Allocation in Developing Countries, pages 231–48, 1997.
Harold Alderman and Elizabeth M King. Gender differences in parental investment in education. Structural Change and Economic Dynamics, 9(4):453–468, 1998.
Harold Alderman, Jere R Behrman, David R Ross, and Richard Sabot. Decomposing the gender gap in cognitive skills in a poor rural economy. Journal of Human Resources, pages 229–254, 1996.
Harold Alderman, Jere R Behrman, Victor Lavy, and Rekha Menon. Child health and school enrollment: A longitudinal analysis. Journal of Human Resources, pages 185–205, 2001.
Douglas Almond, Hongbin Li, and Lingsheng Meng. Son preference and early childhood investments in China. Manuscript, Tsinghua University, 2010.
Siwan Anderson and Jean-Marie Baland. The economics of roscas and intrahousehold resource allocation. The Quarterly Journal of Economics, 117(3):963–995, 2002.
Siwan Anderson and Mukesh Eswaran. What determines female autonomy? Evidence from Bangladesh. Journal of Development Economics, 90(2):179–191, 2009.
Mehtabul Azam. The impact of Indian job guarantee scheme on labor market outcomes: Evidence from a natural experiment. IZA Discussion Paper No. 6548, May 2012.
Mehtabul Azam. Does social health insurance reduce financial burden? Panel data evidence from India. World Development, 102:1–17, 2018.
Ivan Balbuzanov. Searching for evidence of boy-girl discrimination in household expenditure data: Evidence from Gansu province, China. 2009.

Bilal Barakat and Henrik Urdal. Breaking the waves? Does education mediate the relationship between youth bulges and political violence? World Bank Policy Research Working Paper, (5114), 2009.
Silvia Helena Barcellos, Leandro S Carvalho, and Adriana Lleras-Muney. Child gender and parental investments in India: Are boys and girls treated differently? American Economic Journal: Applied Economics, 6(1):157–189, 2014.
Kalpana Bardhan. Women’s work, welfare and status: Forces of tradition and change in India. Economic and Political Weekly, pages 2207–2220, 1985.
. On life and death questions. Economic and Political Weekly, 9:1293–1304, 1974.
Felipe Barrera and Ana Maria Ibáñez. Does violence reduce investment in education?: A theoretical and empirical approach. Documentos CEDE, (27), 2004.
. Gender and say: A model of household behaviour with endogenously determined balance of power. The Economic Journal, 116(511):558–580, 2006.
Christopher F. Baum and Mark E. Schaffer. Enhanced routines for instrumental variables/generalized method of moments estimation and testing. The Stata Journal, 7(4):465–506, 2007.
Christopher F Baum, Mark E Schaffer, Steven Stillman, et al. Enhanced routines for instrumental variables/GMM estimation and testing. Stata Journal, 7(4):465–506, 2007a.
Christopher F Baum, Mark E Schaffer, Steven Stillman, et al. Enhanced routines for instrumental variables/GMM estimation and testing. Stata Journal, 7(4):465–506, 2007b.
Christopher F Baum et al. Implementing new econometric tools in Stata. In Mexican Stata Users’ Group Meetings, volume 9, 2013.
Gary S. Becker. Crime and Punishment: An Economic Approach. Journal of , 76(2):169–217, 1968.
Gary S Becker. A theory of social interactions. Journal of Political Economy, 82(6):1063–1093, 1974.
Kathleen Beegle, Rajeev H Dehejia, and Roberta Gatti. Child labor and agricultural shocks. Journal of Development Economics, 81(1):80–96, 2006.
Kathleen Beegle, Rajeev Dehejia, and Roberta Gatti. Why should we care about child labor? The education, labor market, and health consequences of child labor. Journal of Human Resources, 44(4):871–889, 2009.
Jere R Behrman. Intrahousehold allocation of nutrients in rural India: Are boys favored? Do parents exhibit inequality aversion? Oxford Economic Papers, 40(1):32–54, 1988.
Jere R Behrman. The impact of health and nutrition on education. The World Bank Research Observer, 11(1):23–37, 1996.
Jere R Behrman and James C Knowles. Household income and child schooling in Vietnam. The World Bank Economic Review, 13(2):211–256, 1999.

Eric Bellman. Rural India snaps up mobile phones. The Wall Street Journal, February 2009.
John Bellows and Edward Miguel. War and institutions: New evidence from Sierra Leone. American Economic Review, 96(2):394–399, 2006.
Erlend Berg, Sambit Bhattacharyya, Rajasekhar Durgam, Manjula Ramachandra, et al. Can rural public works affect agricultural wages? Evidence from India. Centre for the Study of African Economies, Oxford University. Working Paper WPS/2012-05, 2012.
Claude Berrebi. Evidence about the link between education, poverty and terrorism among Palestinians. Peace Economics, Peace Science and Public Policy, 13(1), 2007.
Sonia Bhalotra and Cliff Attfield. Intrahousehold resource allocation in rural Pakistan: A semiparametric analysis. Journal of Applied Econometrics, pages 463–480, 1998.
Sonia Bhalotra and Samantha B Rawlings. Intergenerational persistence in health in developing countries: The penalty of gender inequality? Journal of Public Economics, 95(3):286–299, 2011.
Subhes C Bhattacharyya. Energy access problem of the poor in India: Is rural electrification a remedy? Energy Policy, 34(18):3387–3397, 2006.
Jaime Bonet, G Pérez, and Jhorland Ayala. Contexto histórico y evolución del SGP en Colombia. Documentos de trabajo sobre economía regional, 205, 2014. Banco de la República.
Nayana Bose. Raising consumption through India’s National Rural Employment Guarantee Scheme. World Development, 96:245–263, 2017.
François Bourguignon and Pierre-Andre Chiappori. Collective models of household behavior: An introduction. European Economic Review, 36:355–364, April 1992.
Massimiliano Bratti and Mariapia Mendola. Parental health and child schooling. Journal of Health Economics, 35:94–108, 2014.
Graham K Brown. The influence of education on violent conflict and peace: Inequality, opportunity and the management of diversity. Prospects, 41(2):191–204, 2011.
Philip H Brown and Albert Park. Education and poverty in rural China. Economics of Education Review, 21(6):523–541, 2002.
Tom Bundervoet et al. War, health, and educational attainment: A panel of children during Burundi’s civil war. Technical report, Households in Conflict Network, 2012.
Paolo Buonanno and Leone Leonida. Non-market effects of education on crime: Evidence from Italian regions. Economics of Education Review, 28(1):11–17, 2009.
Paolo Buonanno, Daniel Montolio, and Paolo Vanin. Does social capital reduce crime? Journal of Law and Economics, 52(1):145–170, 2009.
John B Burbidge, Lonnie Magee, and A Leslie Robb. Alternative transformations to handle extreme values of the dependent variable. Journal of the American Statistical Association, 83(401):123–127, 1988.
Robin Burgess and Juzhong Zhuang. Modernisation and son preference. 2000.

146 Mayra Buvini´cand Geeta Rao Gupta. Female-headed households and female-maintained families: Are they worth targeting to reduce poverty in developing countries? Economic Development and Cultural Change, 45(2):259–280, 1997. Daniel Cerquera, Paula Jaramillo, and Natalia Salazar. La educaci´onen Colombia: evoluci´ony diagn´ostico. Boletines de divulgaci´onecon´omica, 6, 2000. Kerwin Kofi Charles and Nikolai Roussanov Erik Hurst. Conspicuous Consumption and Race. The Quarterly Journal of Economics, 124:425–467, 2009. Yuyu Chen and Ginger Zhe Jin. Does health insurance coverage lead to better health and educational outcomes? evidence from rural china. Journal of health economics, 31(1): 1–14, 2012. Pierre-Andre Chiappori. Nash-bargained households decisions: A comment. International Economic Review, 29(4):791–796, November 1988. Pierre-Andre Chiappori. Collective labor supply and welfare. Journal of Political Economy, 100(3):437–467, 1992. Barry R Chiswick. Differences in education and earnings across racial and ethnic groups: Tastes, discrimination, and investments in child quality. The Quarterly Journal of Economics, 103(3):571–597, 1988. Shelley Clark. Son preference and sex composition of children: Evidence from india. Demography, 37(1):95–108, 2000. Damian Clarke. Chidlren and their parents: A review of fertility and causality. Journal of Economic Surveys, 2017. Sarah R Cohodes, Daniel S Grossman, Samuel A Kleiner, and Michael F Lovenheim. The effect of child health insurance access on schooling: Evidence from public insurance expansions. Journal of Human Resources, 51(3):727–759, 2016. Christopher Colclough, Pauline Rose, and Mercy Tembon. Gender inequalities in primary schooling: The roles of poverty and adverse cultural practice. International Journal of Educational Development, 20(1):5–27, 2000. Paul Collier, Anke Hoeffler, and M˚ansS¨oderbom. On the duration of civil war. Journal of Peace Research, 41(3):253–273, 2004. 
Julie Berry Cullen, Brian A Jacob, and Steven Levitt. The effect of school choice on participants: Evidence from randomized lotteries. Econometrica, 74(5):1191–1230, 2006. Janet Currie and Jonathan Gruber. Health insurance eligibility, utilization of medical care, and child health. The Quarterly Journal of Economics, 111(2):431–466, 1996. Janet Currie and Brigitte C Madrian. Health, health insurance and the labor market. Handbook of labor economics, 3:3309–3416, 1999. Janet Currie and Enrico Moretti. Biology as destiny? short-and long-run determinants of intergenerational transmission of birth weight. Journal of Labor economics, 25(2): 231–264, 2007. David M Cutler and Adriana Lleras-Muney. Education and health: evaluating theories and evidence. Technical report, National bureau of economic research, 2006.

147 Andrew Dabalen and Saumik Paul. Estimating the causal effects of conflict on education in Cˆoted’Ivoire. Technical report, 2012. Jishnu Das and Jessica Leino. Evaluating the rsby: lessons from an experimental information campaign. Economic and Political Weekly, pages 85–93, 2011. Gaurav Datt and . Transfer benefits from public-works employment: Evidence for rural india. The Economic Journal, pages 1346–1369, 1994. Angus Deaton. The analysis of household surveys: a microeconometric approach to development policy. World Bank Publications, 1997. Klaus Deininger and Yanyan Liu. Welfare and Poverty Impacts of India s National Rural Employment Guarantee Scheme: Evidence from Andhra Pradesh, volume 1289. International Food Policy Research Institute, 2013. David James Deming. Better schools, less crime? Quarterly Journal of Economics, 126(4): 2063–2115, 2011. Sonalde Desai and . Maternal employment and changes in family dynamics: The social context of women’s work in rural south India. Population and Development Review, 20(1):115–136, March 1994. URL http://www.jstor.org/stable/2137632. Narayanan Devadasan, Tanya Seshadri, Mayur Trivedi, and Bart Criel. Promoting universal financial protection: evidence from the rashtriya swasthya bima yojana (rsby) in gujarat, india. Health Research Policy and Systems, 11(1):29, 2013. Gracious M Diiro, Abdoul G Sam, and David S Kraybill. Heterogeneous effects of maternal labor market participation on nutritional status of children: Empirical evidence from rural India. Available at SSRN 2445011, 2014. Ruth B. Dixon. Rural Women at Work: Strategies for Development in South Asia. The Johns Hopkins University Press, 1978. Matthias Doepke and Mich`eleTertilt. Families in . Handbook of Macroeconomics, 2:1789–1891, 2016. Dave Donaldson and Adam Storeygard. The view from above: Applications of satellite data in economics. Journal of Economic Perspectives, 30(4):171–98, November 2016. doi: 10.1257/jep.30.4.171. 
URL http://www.aeaweb.org/articles?id=10.1257/jep.30.4.171. Cheryl R. Doss. Testing among models of intrahousehold resource allocation. World Development, 24(10):1597–1609, 1996. Jean Dreze and Reetika Khera. The battle for employment guarantee. Frontline, 26(1): 3–16, 2009. Jean Dreze and Geeta Gandhi Kingdon. School participation in rural india. Review of Development Economics, 5(1):1–24, 2001. David M Drukker. Testing for serial correlation in linear panel-data models. Stata Journal, 3(2):168–177, 2003. Oeindrila Dube and Juan F Vargas. Commodity price shocks and civil conflict: Evidence from Colombia. The Review of Economic Studies, 80(4):1384–1421, 2013.

148 Esther Duflo and Christopher Udry. Intrahousehold resource allocation in cote d’ivoire: Social norms, separate accounts and consumption choices. National Bureau of Economic Research, (w10498), May 2004. Palanigounder Duraisamy. Gender, intrafamily allocation of resources and child schooling in south india. 1992. Suzanne Duryea, David Lam, and Deborah Levison. Effects of economic shocks on children’s employment and schooling in brazil. Journal of development economics, 84(1): 188–214, 2007. Puja Dutta, Rinku Murgai, Martin Ravallion, and Dominique van de Walle. Does India’s Employment Guarantee Scheme Guarantee Employment? The World Bank, Development Research Group, Policy Research Working Paper 6003, March 2012. Tim Dyson and Mick Moore. On kinship structure, female autonomy, and demographic behavior in india. Population and development review, pages 35–60, 1983. Eric V Edmonds. Child labor and schooling responses to anticipated income in . Journal of development Economics, 81(2):386–414, 2006. Tommi Ekholm, Volker Krey, Shonali Pachauri, and Keywan Riahi. Determinants of household energy consumption in india. Energy Policy, 38(10):5696–5707, 2010. Ted Enamorado, Luis F L´opez-Calva, and Carlos Rodr´ıguez-Castel´an.Crime and growth convergence: Evidence from Mexico. Economics Letters, 125(1):9–13, 2014. Frederick Engels. The Origin of the Family, Private Property, and the State. Pathfinder Press, New York, 1972, 1884. Mukesh Eswaran, Bharat Ramaswami, and Wilima Wadhwa. Status, Caste, and the Time Allocation of Women in Rural India. Economic Development and Cultural Change, 61 (2):311–333, January 2013. Alice Fabre and St´ephanePallage. Child labor, idiosyncratic shocks, and social policy. Journal of Macroeconomics, 45:394–411, 2015. Gabriela Flores, Jaya Krishnakumar, Owen O’donnell, and Eddy Van Doorslaer. Coping with health-care costs: implications for the measurement of catastrophic expenditures and poverty. Health economics, 17(12):1393–1412, 2008. 
Joyce B. Flueckiger. Gender and Genre in the Folklore of Middle India. Press, 1996. Terri Friedline, Rainier D Masa, and Gina AN Chowa. Transforming wealth: Using the inverse hyperbolic sine (ihs) and splines to predict youths math achievement. Social science research, 49:264–287, 2015. Tim Friehe and Mario Mechtel. Conspicuous Consumption and Communism: Evidence from East and West Germany. European Economic Review, 67:62–81, April 2014. Geeta Gandhi Kingdon. The gender gap in educational attainment in india: How much can be explained? Journal of Development Studies, 39(2):25–53, 2002. Shubhashis Gangopadhyay, Bharat Ramaswami, and Wilima Wadhwa. Reducing subsidies on household fuels in india: how will it affect the poor? Energy Policy, 33(18): 2326–2336, 2005.

149 Ashish Garg and Jonathan Morduch. Sibling rivalry and the gender gap: Evidence from child health outcomes in ghana. Journal of Population Economics, 11(4):471–493, 1998. Paul Gertler, David I Levine, and Minnie Ames. Schooling and parental death. The Review of Economics and Statistics, 86(1):211–225, 2004. Stephen Gibbons and Henry G. Overman. Mostly Pointless Spatial Econometrics? Journal of Regional Science, 52(2):172–191, 2012. ISSN 1467-9787. doi: 10.1111/j.1467-9787.2012.00760.x. URL http://dx.doi.org/10.1111/j.1467-9787.2012.00760.x. Steve Gibbons, Henry G Overman, and Eleonora Patacchini. Spatial Methods. In G. Duranton, J. V. Henderson, and W. Strange, editors, Handbook of Regional and Urban Economics, volume 5A, pages 115–168. Elsevier, 2015. Peter Glick. Women’s employment and its relation to children’s health and schooling in developing countries: Conceptual links, empirical evidence, and policies. Cornell Food and Nutrition Policy Program Working Paper, (131), 2002. Peter J Glick, David E Sahn, and Thomas F Walker. Household shocks and education investments in madagascar. Oxford Bulletin of Economics and Statistics, 78(6):792–813, 2016. Silvia C. G´omez-Soler. Educational achievement at schools: Assessing the effect of the civil conflict using a pseudo-panel of schools. International Journal of Educational Development, 49:91–106, 2016. GOI Government of India. Population census 2011, 2011a. URL http://www.census2011.co.in/. GOI Government of India. Population census 2011, 2011b. URL http://www.census2011.co.in/. GOI Government of India. Rashtriya swasthya bima yojana, 2013a. URL http://www.rsby.gov.in. GOI Government of India. Rashtriya swasthya bima yojana, 2013b. URL http://www.rsby.gov.in. Monica J Grant and Jere R Behrman. Gender gaps in educational attainment in less developed countries. Population and Development Review, 36(1):71–89, 2010. Jonathan Guryan. Desegregation and Black Dropout Rates. American Economic Review, pages 919–943, 2004. 
W Haddad, M Carnoy, and R Rinaldi. 0. regel. 1990.” education and development: Evidence for new priorities.”. World Bank Discussion Papers, 95, 1984. Robert Halvorsen and Raymond Palmquist. The interpretation of dummy variables in semilogarithmic equations. The American Economic Review, 70(3):474–475, June 1980. Amar A Hamoudi, Jeffrey D Sachs, et al. Economic consequences of health status: a review of the evidence. Technical report, Center for International Development at , 1999. Eric Hanushek et al. For long-term economic development, only skills matter. IZA World of Labor, pages 343–343, 2017.

150 Eric A. Hanushek. The economics of school quality. German Economic Review, 6(3): 269–286, 2005. Eric A. Hanushek, Jens Ruhose, and Ludger Woessman. Knowledge Capital and Aggregate Income Differences: Development Accounting for US States. American Economic Journal: Macroeconomics. Forthcoming. Eric A. Hanushek, Jens Ruhose, and Ludger Woessmann. It Pays to Improve School Quality. Education Next, 16(3):16–24, Summer 2016. Gautam Hazarika and Sudipta Sarangi. Household access to microcredit and child work in rural malawi. World Development, 36(5):843–859, 2008. Rachel Heath and Xu Tan. Intrahousehold bargaining, female autonomy, and labor supply: Theory and evidence from India. Unpublished manuscript, 2014. H˚avard Hegre, Gudrun Østby, and Clionadh Raleigh. Poverty and Civil War Events A Disaggregated Study of Liberia. Journal of Conflict Resolution, 53(4):598–623, 2009. J. Vernon Henderson, Adam Storeygard, and David N. Weil. Measuring Economic Growth from Outer Space. American Economic Review, 102(2):994–1028, April 2012. doi: 10.1257/aer.102.2.994. URL http://www.aeaweb.org/articles?id=10.1257/aer.102.2.994. M Anne Hill and Elizabeth King. Women’s education and economic well-being. , 1(2):21–46, 1995. Rozana Himaz. Intrahousehold allocation of education expenditure: the case of sri lanka. Economic Development and Cultural Change, 58(2):231–258, 2010. John Hoddinott and Lawrence Haddad. Does female income share influence household expenditures? Evidence from CˆoteD’Ivoire. Oxford Bulletin of Economics and Statistics, 57(1):77–96, 1995. William C Horrace and Ronald L Oaxaca. Results on the bias and inconsistency of ordinary least squares for the linear probability model. Economics Letters, 90(3):321–327, 2006. Macartan Humphreys and Jeremy M Weinstein. Who fights? The determinants of participation in civil war. American Journal of Political Science, 52(2):436–455, 2008. Clement Imbert and John Papp. 
Labor market effects of social programs: Evidence from India’s employment guarantee. American Economic Journal: Applied Economics, 7(2): 233–263, 2015. Government of India India Union Budget. Indian Union Budget 2015, 2015. URL http://indiabudget.nic.in/budget.asp. Government of India India Union Budget. Indian Union Budget 2017, 2017. URL http://indiabudget.nic.in/budget.asp. Indian Society of Labour Economics Institute for Human Development. India Labour and Employment Report 2014. Institute for Human Development, Academic Foundation, 2014. Mahnaz Islam and Anitha Sivasankaran. How does child labor respond to changes in adult work opportunities? Evidence from NREGA. Unpublished paper, Harvard University, April 2014.

151 Hanan G Jacoby. The economics of polygyny in sub-saharan Africa: Female productivity and the demand for wives in CˆoteD’Ivoire. Journal of Political Economy, 103(5): 938–971, 1995. Hanan G Jacoby and Emmanuel Skoufias. Risk, financial markets, and human capital in a developing country. The Review of Economic Studies, 64(3):311–335, 1997. Shireen J. Jejeebhoy and Zeba A. Sathar. Women’s autonomy in India and Pakistan: The influence of religion and region. Population and Development Review, 27(4):687–712, December 2001. Robert Jensen. Agricultural volatility and investments in children. The American Economic Review, 90(2):399–404, 2000. Douglas Johnson and Karuna Krishnaswamy. The impact of rsby on hospital utilization and out-of-pocket health expenditure. Washington DC: , 2012. Naila Kabeer. Resources, agency, achievements: Reflections on the measurement of women’s empowerment. Development and Change, 30(3):435–464, July 1999. Uma S Kambhampati and Sarmistha Pal. Role of parental literacy in explaining gender difference: Evidence from child schooling in india. The European Journal of Development Research, 13(2):97–119, 2001. and Lawrence Haddad. Are better off households more unequal or less unequal? Oxford Economic Papers, 46(3):445–458, 1994. Anuj Kapilashrami and Deepa Venkatachalam. Health insurance: Evaluating the impact on the right to health, 2013a. Anuj Kapilashrami and Deepa Venkatachalam. Health insurance: Evaluating the impact on the right to health, 2013b. Anup Karan, Sakthivel Selvaraj, and Ajay Mahal. Moving to universal coverage? trends in the burden of out-of-pocket payments for health care across social groups in india, 1999–2000 to 2011–12. PloS one, 9(8):e105162, 2014. Anup Karan, Winnie Yip, and Ajay Mahal. Extending health insurance to the poor in india: An impact evaluation of rashtriya swasthya bima yojana on out of pocket spending for healthcare. Social Science & Medicine, 181:83–92, 2017. Elizabeth G Katz. 
Gender and trade within the household: observations from rural guatemala. World Development, 23(2):327–342, 1995. Melanie Khamis, Nishith Prakash, and Zahra Siddique. Consumption and social identity: Evidence from India. Journal of Economic Behavior & Organization, 83:353–371, August 2012. Reetika Khera and Nandini Nayak. Women Workers and Perceptions of the National Rural Employment Guarantee Act. Economic and Political Weekly, 44(43):24–30, October 2009. Geeta Gandhi Kingdon. Where has all the bias gone? detecting gender bias in the intrahousehold allocation of educational expenditure. Economic Development and Cultural Change, 53(2):409–451, 2005. Anjini Kochar. Explaining household vulnerability to idiosyncratic income shocks. The American Economic Review, 85(2):159–164, 1995.

152 Feridoon Koohi-Kamali. Intrahousehold inequality and child gender bias in ethiopia. 2008. Alan B Krueger and Jitka Maleˇckov´a.Education, poverty and terrorism: Is there a causal connection? The Journal of Economic Perspectives, 17(4):119–144, 2003. Etienne G. Krug, Linda L. Dahlberg, James A. Mercy, Anthony B. Zwi, and Rafael Lozano, editors. World Report on Violence and Health. World Health Organization, 2002. Brian Lai and Clayton Thyne. The effect of civil war on education, 198097. Journal of Peace Research, 44(3):277–292, 2007. James Lake and Daniel L Millimet. An empirical analysis of trade-related redistribution and the political viability of free trade. Journal of International Economics, 99:156–178, 2016. Andreas Landmann and Markus Fr¨olich. Can health-insurance help prevent child labor? an impact evaluation from pakistan. Journal of health economics, 39:51–59, 2015. Jungmin Lee. Sibling size and investment in childrens education: An asian instrument. Journal of Population Economics, 21(4):855–875, 2008a. Yiu-fai Daniel Lee. Do families spend more on boys than on girls? empirical evidence from rural china. China Economic Review, 19(1):80–100, 2008b. Nancy E Levine. Differential child care in three tibetan communities: Beyond son preference. Population and development review, pages 281–304, 1987. Lixing Li and Xiaoyu Wu. Gender of children, bargaining power, and intrahousehold resource allocation in china. Journal of Human Resources, 46(2):295–316, 2011. Tianshu Li and Sheetal Sehkri. The unintended consequences of employment based safety net programs. Unpublished paper, Univirsity of Virginia, 2013. Jeremy Lise and Shannon Seitz. Consumption inequality and intra-household allocations. The Review of Economic Studies, 78(1):328–355, 2011. Hong Liu and Zhong Zhao. Does health insurance matter? evidence from china’s urban resident basic medical insurance. Journal of Comparative Economics, 42(4):1007–1020, 2014. Kai Liu. 
Insuring against health shocks: Health insurance and household choices. Journal of health economics, 46:16–32, 2016. Lance Lochner. Education, Work, and Crime: A Human Capital Approach. International Economic Review, 45(3):811–843, 2004. Lance Lochner. Education and Crime. In International Encyclopedia of Education, 3rd Edition. Elsevier, 2010a. Lance Lochner. Education policy and crime. In Controlling crime: strategies and tradeoffs, pages 465–515. University of Chicago Press, 2010b. Shelly Lundberg and Robert A. Pollak. Bargaining and distribution in marriage. The Journal of Economic Perspectives, 10(4):139–158, 1996. Luca Mancini. Horizontal inequality and communal violence: evidence from Indonesian districts. Centre for Research on Inequality, Human Security and Ethnicity, , 2005.

153 Neelakshi Mann and Varad Pande. MGNREGA Sameeksha: An Anthology of Research Studies on the Mahatma Gandhi National Rural Employment Guarantee Act, 2005, 2006–2012. Orient Blackswan Private Limited, 2012. Marilyn Manser and Murray Brown. Marriage and household decision making: A bargaining analysis. International Economic Review, 21(1):31–44, February 1980. Isaac M. Mbiti. Moving Women: Household composition, labor demand and crop choice. Unpublished paper, Southern Methodist University, September 2007. Marjorie B McElroy and Mary Jean Horney. Nash-bargained household decisions: Toward a generalization of the theory of demand. International Economic Review, 22(2):333–349, 1981. Erik Melander. Gender equality and intrastate armed conflict. International Studies Quarterly, 49(4):695–714, 2005. Ligia Melo. Impacto de la descentralizaci´onfiscal sobre la educaci´onp´ublica colombiana. Borradores de econom´ıa, 350, 2005. Banco de la Rep´ublica. Joan P. Mencher. Women’s Work and Poverty: Women’s Contribution to Household Maintenance in South India. Standford University Press, 1988. Ouarda Merrouche et al. The human capital cost of landmine contamination in Cambodia. Technical report, Households in conflict network, 2006. Ministry of Rural Development. Mahatma Gandhi National Rural Employment Guarantee Act - Report to the People 2013. Technical report, Government of India, February 2013. URL http://nrega.nic.in/netnrega/WriteReaddata/circulars/ Report to the people English2013.pdf. Ministry of Rural Development. Mahatma Gandhi National Rural Employment Guarantee Act 2005 - Performance, Initiatives and Strategies, June 2016. URL http://nrega.nic.in/ Circular Archive/archive/MGNREGA PerformanceReport27June2016.pdf. Government of India Ministry of Rural Development. Mgnrega districts. URL http://nrega.nic.in/mnrega dist.pdf. Government of India Ministry of Rural Development. Building sustainable livelihoods of the poor through MNREGA - user’s manual, 2013. 
Minnesota Population Center. Integrated Public Use Microdata Series, International: Version 6.4 [Machine-readable database]. University of Minnesota, 2015. URL https://international.ipums.org/international/index.shtml. Manoj Mohanan. Causal effects of health shocks on consumption and debt: quasi-experimental evidence from bus accident injuries. Review of Economics and Statistics, 95(2):673–681, 2013. Jonathan Morduch. Between the state and the market: Can informal insurance patch the safety net? The World Bank Research Observer, 14(2):187–207, 1999. Enrico Moretti. Does education reduce participation in criminal activities. In symposium on The Social Costs of Inadequate Education (Columbia University Teachers College), 2005.

154 Yair Mundlak. On the pooling of time series and cross section data. Econometrica: journal of the , pages 69–85, 1978. Arindam Nandi, Ashvin Ashok, and Ramanan Laxminarayan. The socioeconomic and institutional determinants of participation in India’s health insurance scheme for the poor. PloS one, 8(6):e66296, 2013. Sudha Narayanan and Upasak Das. Employment Guarantee for Women in India Evidence on Participation and Rationing in the MGNREGA from the National Sample Survey. Institute of Development Research Working Papers, June 2014. URL http://www.igidr.ac.in/pdf/publication/WP-2014-017.pdf. United Nations. The millenium development goals report 2015. Technical report, United Nations, 2015. Ha Nguyen and James Knowles. Demand for voluntary health insurance in developing countries: the case of vietnams school-age children and adolescent student health insurance program. Social science & medicine, 71(12):2074–2082, 2010. Paul Niehaus and Sandip Sukhtankar. The marginal rate of corruption in public programs. CEGA Working Papers, November 2012. URL http://escholarship.org/uc/item/7jh312k1. Josef Novotny, Jana Kubelkov, and Vanishree Joseph. A multi-dimensional analysis of the impacts of the Mahatma Gandhi National Rural Employment Guarantee Scheme: A tale from Tamil Nadu. Singapore Journal of Tropical Geography, 34:322–341, 2013. Don Operario, Lucie Cluver, Helen Rees, Catherine MacPhail, and Audrey Pettifor. Orphanhood and completion of compulsory school education among young people in south africa: Findings from a national representative survey. Journal of Research on Adolescence, 18(1):173–186, 2008. Gudrun Østby and Henrik Urdal. Education and Civil Conflict: A Review of the Quantitative, Empirical Literature. Background paper prepared for the Education for All Global Monitoring Report, 2011. Emily Oster. Does increased access increase equality? gender and child health investments in india. Journal of development Economics, 89(1):62–76, 2009. Aderoju Oyefusi. 
Oil and the probability of rebel participation among youths in the Niger Delta of Nigeria. Journal of Peace Research, 45(4):539–555, 2008. Sarmistha Pal. How much of the gender difference in child school enrolment can be explained? evidence from rural india. Bulletin of Economic Research, 56(2):133–158, 2004. Robert Palacios, Jishnu Das, and Changqing Sun. India’s health insurance scheme for the poor: Evidence from the early experience of the rashtriya swasthya bima yojana. New Delhi: Center for Policy Research, 2011. Debajit Palit, Arvind Garimella, Martand Shardul, and Saswata Chaudbury. Analysis of the electrification programme in india using the ‘energy plus’ framework and the key lessons. Technical report, The Energy and Resources Institute, India, July 2015. Leslie E. Papke and Jeffrey M. Wooldridge. Econometric Methods for Fractional Response Variables with an Application to 401(K) Plan Participation Rates. NBER Technical Working Paper No. 147, November 1993.

155 Leslie E Papke and Jeffrey M Wooldridge. Panel data methods for fractional response variables with an application to test pass rates. Journal of Econometrics, 145(1): 121–133, 2008. Mark M Pitt and Mark R Rosenzweig. Estimating the intrahousehold incidence of illness: Child health and gender-inequality in the allocation of time. International Economic Review, pages 969–989, 1990. Mark M Pitt, Mark R Rosenzweig, and Md Nazmul Hassan. Productivity, health, and inequality in the intrahousehold distribution of food in low-income countries. The American Economic Review, pages 1139–1156, 1990. Government of India Planning Commission. Annual report (2013-14) on the working of state power utilities & electricity departments. Technical report, Power and Energy Division, Planning Commission, February 2014. URL http://planningcommission.nic.in/. Agnes R Quisumbing, John A Maluccio, et al. Intrahousehold Allocation and Gender Relations: New Empirical Evidence from Four Developing Countries. International Food Policy Research Institute, 2000. Kalyani Raghunathan and Siddharth Hari. Providing more than just employment? Evidence from the NREGA in India. October 2014. D Rajasekhar, Erlend Berg, Maitreesh Ghatak, R Manjula, and Sanchari Roy. Implementing health insurance: the rollout of rashtriya swasthya bima yojana in karnataka. Economic and Political Weekly, pages 56–63, 2011. Anu Rammohan and Diane Dancer. Gender differences in intrahousehold schooling outcomes: the role of sibling characteristics and birth-order effects. Education Economics, 16(2):111–126, 2008. Shamika Ravi and Sofi Bergkvist. Are publicly financed health insurance schemes working in india? India Policy Forum, 11(1):158–192, 2015. Shamika Ravi and Monika Engler. Workfare as an effective way to fight poverty: The case of India’s NREGS. World Development, 67:57–71, March 2015. Shamika Ravi, Mudit Kapoor, and Rahul Ahluwalia. The Impact of NREGS on Urbanization in India. August 2012. 
Wameq Raza, Ellen van de Poel, and Pradeep Panda. Analyses of enrolment, dropout and effectiveness of rsby in northern rural india. 2016. Robert Repetto. Son preference and fertility behavior in developing countries. Studies in Family Planning, 3(4):70–76, 1972. Nancy E. Riley. Gender power and population change. Population Bulletin, 52(1):1–48, 1997. Catherine Rodr´ıguezand Fabio Sanchez. Armed conflict exposure, human capital investments, and child labor: evidence from Colombia. Defence and Peace Economics, 23 (2):161–184, 2012. Catherine Rodr´ıguez,Mar´ıaLaura Alz´ua,and Edgar Villa. The quality of life in prisons: do educational programs reduce in-prison conflicts? Documentos de Trabajo del CEDLAS, 2009.

156 Mark R Rosenzweig and T Paul Schultz. Market opportunities, genetic endowments, and intrafamily resource distribution: Child survival in rural india. The American Economic Review, 72(4):803–815, 1982. Anne Beeson Royalty and Jean M Abraham. Health insurance and labor market outcomes: Joint decision-making within households. Journal of Public Economics, 90(8):1561–1577, 2006. Mauricio Santamar´ıa,Jos´eF Arias, and Patricia Camacho. Exposici´onde motivos de la reforma a la Ley 60 de 1993. Sector educaci´ony sector salud. Archivos de econom´ıa, (173), 2001. Departamento Nacional de Planeaci´on. Rainer Sauerborn, Alayne Adams, and Maurice Hien. Household strategies to cope with the economic costs of illness. Social science & medicine, 43(3):291–301, 1996. Yasuyuki Sawada and Michael Lokshin. Household schooling decisions in rural pakistan. 1999. Pia Schneider. Why should the poor insure? theories of decision-making in the context of health insurance. Health policy and planning, 19(6):349–355, 2004. T Paul Schultz. Education investments and returns. Handbook of development economics, 1:543–630, 1988. T Paul Schultz. Why governments should invest more to educate girls. World Development, 30(2):207–225, 2002. and Sunil Sengupta. Malnutrition of rural children and the sex bias. Economic and political weekly, pages 855–864, 1983. Moses Shayo. Education, militarism and civil wars. Maurice Falk Institute for Economic Research in Israel, 2007. Emmanuel Skoufias. Labor market opportunities and intrafamily time allocation in rural households in south asia. Journal of Development Economics, 40(2):277–310, 1993. Lina Song, Simon Appleton, and John Knight. Why do girls in rural china have lower school enrollment? World Development, 34(9):1639–1653, 2006. John Strauss and Duncan Thomas. Human resources: Empirical modeling of household and family decisions. Handbook of development economics, 3:1883–2023, 1995. Ang Sun and Yang Yao. 
Health shocks and children’s school attainments in rural china. Economics of Education Review, 29(3):375–382, 2010. Daniel Suryadarma, Yus Medina Pakpahan, and Asep Suryahadi. The effects of parental death and chronic poverty on children’s education and health: Evidence from indonesia. 2009. Anil Swarup and Nishant Jain. Rashtriya swasthya bima yojana. Innovative, page 259, 2011. Eik Leong Swee et al. On war and schooling attainment: The case of Bosnia and Herzegovina. Technical report, Households in Conflict Network, 2009. The World Bank. The World Bank Indicators. http://data.worldbank.org/indicator/, 2017.

157 Duncan Thomas. Intra-Household Resource Allocation: An Inferential Approach. The Journal of Human Resources, 25(4):635–664, 1990. URL http://www.jstor.org/stable/145670. Robert J. Thornton and Jon T. Innes. Interpreting semilogarithmic regression coefficients in labor research. Journal of Labor Research, 10(4):443–447, 1989. UN United Nations. Gender-biased sex selection, 2017a. URL http://www.unfpa.org/gender-biased-sex-selection. UN United Nations. Gender-biased sex selection, 2017b. URL http://www.unfpa.org/gender-biased-sex-selection. Henrik Urdal. Population, Resources, and Political Violence A Subnational Study of India, 1956–2002. Journal of Conflict Resolution, 52(4):590–617, 2008. Juan F Vargas. The persistent Colombian conflict: subnational analysis of the duration of violence. Defence and Peace Economics, 23(2):203–223, 2012. GR Verma and Babu Bv. Son preference and desired family size in a rural community of west godavari district, andhra pradesh, india. Journal of Social Science, 15(1):59–64, 2007. AK Virk and R Atun. Towards universal health coverage in india: a historical examination of the genesis of rashtriya swasthya bima yojana–the health insurance scheme for low-income groups. Public health, 129(6):810–817, 2015. Adam Wagstaff, Magnus Lindelow, Gao Jun, Xu Ling, and Qian Juncheng. Extending health insurance to the rural population: an impact evaluation of china’s new cooperative medical scheme. Journal of health economics, 28(1):1–19, 2009. Wendy Wang. Son preference and educational opportunities of children in china-i wish you were a boy!. Gender Issues, 22(2):3–30, 2005. WHO. Tracking universal health coverage, first global monitoring report. Technical report, World Health Organization and The World Bank, 2015. Maame Esi Woode. Parental health shocks and schooling: The impact of mutual health insurance in rwanda. Social Science & Medicine, 173:35–47, 2017. Jeffrey M Wooldridge. Econometric analysis of cross section and panel data. 
MIT press, 2nd edition, 2010. Jeffrey M Wooldridge. Control function methods in applied econometrics. Journal of Human Resources, 50(2):420–445, 2015. Jeffrey M Wooldridge et al. Fractional response models with endogeneous explanatory variables and heterogeneity. In CHI11 Stata Conference, number 12. Stata Users Group, 2011. World Health Organization, editor. Global Status Report on Violence Prevention 2014. World Health Organization, 2014. Junjian Yi, James J Heckman, Junsen Zhang, and Gabriella Conti. Early health shocks, intra-household resource allocation and child outcomes. The Economic Journal, 125 (588):F347–F371, 2015.

158 Wei-hsin Yu and Kuo-hsien Su. Gender, sibship structure, and educational inequality in taiwan: Son preference revisited. Journal of Marriage and Family, 68(4):1057–1068, 2006. Laura Zimmermann. Public works programs in developing countries have the potential to reduce poverty. IZA World of Labor, May 2014. URL wol.iza.org.

159