
Ethnic and Religious Intergenerational Mobility in Africa∗

Alberto Alesina (Harvard University, CEPR and NBER)

Sebastian Hohmann (London Business School)

Stelios Michalopoulos (Brown University, CEPR and NBER)

Elias Papaioannou (London Business School and CEPR)

September 27, 2018

Abstract

We investigate the evolution of inequality and intergenerational mobility in educational attainment across ethnic and religious lines in Africa. Using census data covering more than 70 million people in 19 countries, we document the following regularities. (1) There are large differences in intergenerational mobility across cultural groups, both across and within countries. Most broadly, Christians are more mobile than Muslims, who are more mobile than people following traditional religions. (2) The average country-wide education level of the group in the generation of individuals’ parents is a strong predictor of group-level mobility, in that more mobile groups were also previously more educated. This holds both across religions and ethnicities, within ethnicities controlling for religion and vice versa, as well as for two individuals from different groups growing up in the same region within a country. (3) Considering a range of variables, we find some evidence that mobility correlates negatively with discrimination in the political arena post-independence, and that mobility is higher for groups that historically derived most of their subsistence from agriculture as opposed to pastoralism.

Keywords: Africa, Development, Education, Inequality, Intergenerational Mobility.

JEL Numbers. N00, N9, O10, O43, O55

∗Alberto Alesina: Harvard University and IGIER Bocconi. Sebastian Hohmann: London Business School. Stelios Michalopoulos: Brown University. Elias Papaioannou: London Business School. We thank Remi Jedwab and Adam Storeygard for sharing their data on colonial roads and railroads in Africa, Julia Cagé and Valeria Rueda for sharing their data on Protestant missions, and Nathan Nunn for sharing his data on Catholic and Protestant missions. We also thank conference participants at the University of Zurich for their comments and Oriana Bandiera for her insightful discussion.

1 Introduction

Many argue that Africa’s poor post-independence performance is due to the salience of ethnicity (Bates (2005), Dowden (2009)). A plethora of case studies suggest that African politics are characterized by ethnic patronage and favoritism (e.g., Posner (2005)). Many ethnic groups face repression and exclusion from national politics, and most of Africa’s civil wars have an explicit ethnic dimension (Wimmer, Cederman, and Min (2009)). Ethnic tensions also arise because small minorities exert significant influence on the economy and have sizable power in national politics (Chua (2004) and Robinson (2001)). In an influential study, Easterly and Levine (1997) argued that ethno-linguistic fractionalization can explain Africa’s miserable economic performance in its initial post-independence decades. At the same time, religious differences – which often follow ethnic lines – seem to be large, leading to widespread religious tensions. Politics in the Sahel (, , Sudan, , ) are centered around predominantly Muslim Northern regions and mostly Christian Southern areas. Religious differences also seem to play a role in North Africa (Egypt, ), as well as in Ethiopia and parts of East Africa.

In this paper, we move beyond case studies and anecdotal evidence and provide the first comprehensive analysis of ethnic and religious differences in education – a variable that captures both individual living standards and public goods. Using census data from 19 countries comprising close to 70 million individual records from all parts of the continent, we provide new statistics of intergenerational mobility in educational attainment. The analysis reveals large differences in social mobility across ethnic and religious lines, both across and within countries.
We then explore the correlates of the large gaps in educational opportunity across ethnic and religious groups, showing that initial differences in education correlate strongly with social mobility, though there is notable cross-country heterogeneity in inertia.

1.1 Results Overview

In the first part of the paper (Section 3), we use census data from 18 and 14 African countries – the two numbers refer to the number of countries for which we have religion and ethnicity data, respectively – and construct measures of educational opportunity across 171 ethnic groups and the main religions (Islam, traditional African religions, Catholic, Protestant and Orthodox Christians). Building on recent work on the United States (Chetty et al. (2014), Chetty and Hendren (2018a), Chetty and Hendren (2018b), Card, Domnisoru, and Taylor (2018)), we measure opportunity as the likelihood that children whose parents do not have any formal education manage to complete at least primary education (intergenerational mobility in education). We also draw distinctions by gender and rural-urban household status. The descriptive patterns reveal stark differences in social mobility across ethnic and religious groups. For example, in Ghana the likelihood that children of illiterate parents will manage to complete at least primary schooling is on average 62.5%; for the Akan (Ashante), who dominate national politics, the likelihood is 76.5%, while for the Gurma it is only 45.5%. In Sierra Leone, 60% of Krio (Creole) children, whose parents do not have any schooling,

manage to complete primary education (or higher), but the analogous likelihood for the Koranko is just 18%. Religious differences in educational opportunity also appear pervasive. In almost all countries, educational opportunity for Christians is higher than for Muslims and for Africans of traditional religions. The differences are large. For example, in Nigeria, Africa’s most populous country, the likelihood that children of parents without any education manage to complete at least primary schooling is around 75% for Christians and just 18% for those adhering to traditional religions. In , educational intergenerational mobility is 43% for Catholics and just 15.6% for Muslims.

In the second part (Section 4), we establish a strong positive association between the share of the old generation with at least primary schooling (we refer informally to these individuals as “literate” throughout the paper) and upward mobility across ethnic and religious lines. Children of relatively more educated ethnicities and religions have a higher likelihood of exiting family illiteracy. Moreover, children whose parents have completed primary (or higher) schooling face a lower likelihood of downward intergenerational mobility (not completing primary schooling) if they come from relatively more educated ethnic and religious groups. These patterns apply to both urban and rural households, though the nexus between ethnic-religious education and social mobility is stronger among rural households. These associations are also present for both boys and girls. There is, however, non-negligible variation in the link between ethnic (and religious) education (of the “old”) and social mobility (of the young). On the one hand, in Ethiopia, Nigeria, Burkina Faso and Mali, there is strong ethnic and religious inertia.
Children of illiterate parents from ethnic groups without much education face a much lower chance of completing primary schooling than children of illiterate parents who come from more educated ethnic and religious groups. On the other hand, in South Africa and in Botswana the association between ethnic-religious literacy of the old and the likelihood that children of parents without any education manage to complete primary schooling is much attenuated or non-existent.

In the third part (Section 5), we explore the correlates of educational intergenerational mobility across ethnicities in an effort to document some stylized facts. We start our correlational analysis with the Ethnic Power Relations database (Wimmer, Cederman, and Min (2009)), which provides proxies of ethnicities’ role in national politics based on experts’ subjective assessments. Educational mobility correlates significantly (negatively) with ethnic discrimination and appears higher for ethnicities that dominate national politics. However, the associations are often weak, suggesting that more quantitative work (like ours) needs to complement subjective proxies. We then examine the role of deeply rooted ethnic features that earlier works suggest may influence development (see Michalopoulos and Papaioannou (2018) for an overview of recent research on the historical origins of African development); among others, we examine the correlation of educational intergenerational mobility with polygamy (Fenske (2015)), political centralization (Gennaioli and Rainer (2007), Michalopoulos and Papaioannou (2013)), the subsistence level of precolonial economies (Michalopoulos, Putterman, and Weil (2016)), slave trades (Nunn (2008)), and ethnic partitioning (Michalopoulos and Papaioannou (2016)). The analysis shows that social mobility is higher for ethnicities that

derived a larger share of their subsistence from agriculture in the pre-colonial era and lower for mostly pastoralist groups. Interestingly, this result applies not only to rural but also to urban households. When we distinguish by gender, we also find that educational intergenerational mobility is significantly higher in patrilineal societies and in ethnicities that practice bride price.

1.2 Related Literature

Our paper mostly contributes to the voluminous literature in African political economy on ethnic discrimination and favoritism. This research agenda shows, with eloquent narratives and case studies, the dominant role of ethnicity in African politics and the economy. Wimmer, Cederman, and Min (2009) construct measures of ethnicities’ political power (dominant, marginal, excluded) and show that most African conflicts have an ethnic-specific dimension. Posner (2005) provides a comprehensive overview of political science works, while delving into the case of Zambia, where politics have been organized along ethnic divisions. Recently, economics research has added more quantitative evidence. Burgess et al. (2015) and Kramon and Posner (2016) look at ethnic favoritism in road infrastructure and education in Kenya. Amodio and Chiovelli (2016) study ethnic patronage among indigenous South African tribes post-Apartheid. Acemoglu, Reed, and Robinson (2014), de Kadt and Larreguy (2018) and Baldwin (2013) present evidence on the role of traditional ethnic leaders in vote-buying and local politics in Sierra Leone, South Africa, and Zambia, respectively.1

Given our pan-African focus, our paper is most closely related to the early contribution of Franck and Rainer (2012), who show that educational attainment increases and infant mortality falls for individuals from the leader’s ethnic group. In the same vein, De Luca et al. (2018) show that luminosity, which proxies regional income well, increases by around 10% in the country leader’s ancestral homelands during the leader’s tenure (see also Dickens (2018), who looks at ethnic groups partitioned by national borders). Our key contribution to this research agenda is to provide, for the first time, direct measures of ethnic differences in education and intergenerational mobility that directly capture ethnic favoritism/discrimination for a large number of African countries since independence using census data.
In particular, we provide ethnic measures of social mobility for the full population, for rural and urban households, and for both genders. We also provide similar mappings of social mobility across religious lines. Then we examine the correlates of the large differences in social mobility across African ethnic and religious groups in an effort to shed light on their origins and guide future research.

Our focus on ethno-linguistic and religious differences in educational opportunity connects our study to the large literature assessing the role of , fractionalization, and polarization in economic, political and social development and public goods provision (Easterly and Levine (1997), Alesina et al. (2003), Fearon (2003), Montalvo and Reynal-Querol (2005), Esteban, Mayoral, and Ray (2012), Desmet, Ortuño-Ortín, and Wacziarg (2012)). Since we look at differences in education – that reflect well-being as well as public goods provision –

1 See Logan (2013), Baldwin (2015) and Michalopoulos and Papaioannou (2014) on the de facto and the de jure power of ethnic chiefs in Africa.

of most relevance are the studies of Baldwin and Huber (2010) and Alesina, Michalopoulos, and Papaioannou (2016), who construct proxies of ethnic inequality across countries using Afrobarometer and luminosity across ethnic homelands. The use of census data allows for a much more detailed mapping of ethnic (and religious) differences in education, as well as for measuring social mobility. Moreover, the census data allow studying even tiny ethnic (religious) groups that often exert economic and political power disproportionate to their size, for example the Krio in Sierra Leone or Asian communities in East Africa (Chua (2004), Stewart (2002)).

From a methodological standpoint, our paper also relates to recent works on intergenerational mobility that mostly deal with the United States (Chetty et al. (2014), Chetty and Hendren (2018a), Chetty and Hendren (2018b)). Of most relevance from this research strand are the studies of Chetty et al. (2018), who construct race-specific measures of intergenerational mobility in income across the United States, and Card, Domnisoru, and Taylor (2018), who, using census data from the entire US population in 1940, map educational mobility across states and across races, looking at children residing with at least one parent (as we do). The large number of observations allows them to focus on individuals aged 14-18, when the overwhelming majority of children still reside with their parents and are at an age that allows them to meaningfully assess educational attainment. Given our focus, our paper also relates to studies of intergenerational mobility in education (see Solon (1999) and Black and Devereux (2011) for overviews).2 In contrast to most of the existing literature, which focuses on the United States and other industrial countries, we examine the African case. In a companion paper (Alesina et al.
(2018)), we provide mappings of the land of educational opportunity across 23 African countries and 2,440 regions and, using data from multi-children migrant households, show significant regional exposure effects. In this paper, we look at the ethnic and religious dimensions and show that there are similarly large differences in social mobility across ethnic and religious lines. We further show that ethnicity and religion matter even when we look at individuals residing in the same region (and even conditioning on region-urban/rural status), suggesting that ethnicity and religion play a role in intergenerational mobility that is distinct from regional differences.

2 Data and Methodology

2.1 Education Data

We map and study ethnic and religious differences in intergenerational mobility in education. Education correlates strongly with income/wealth across countries (e.g., Barro and Lee (2013)), across regions (Gennaioli et al. (2014)), and at the individual level (Card (1999), Krueger and Lindahl (2001)), with Mincerian returns to schooling being – if anything – larger in low-income countries (Psacharopoulos (1994), Caselli, Ponticelli, and Rossi (2014), Young (2012), Montenegro and Patrinos (2014)). As we demonstrate in a companion paper (Alesina et al. (2018)), education correlates strongly with various proxies of well-being across Africa: living conditions, as reflected by the DHS composite wealth index and the Afrobarometer, child mortality and fertility, hopes and aspirations, attitudes toward domestic violence, and proxies of political and civic engagement.

We use individual records from IPUMS (Integrated Public Use Microdata Series) International. This database, hosted at the University of Minnesota Population Centre, takes representative samples from national censuses (typically 10%), harmonizes the data series and makes them available in the public domain. In total, we extract data from 37 national censuses from 19 African countries from all parts of the continent. We have information on both ethnicity and religion for 13 countries: Botswana, Burkina Faso, Ethiopia, Ghana, Liberia, Malawi, Mali, Mozambique, Senegal, Sierra Leone, South Africa, Uganda, and Zambia. For 5 countries we have information only on religion (Cameroon, Egypt, Guinea, Nigeria, and Rwanda), while for Morocco we have information on ethnicity but not religion.3 Most censuses were conducted in the 1980s, 1990s, and 2000s. Table 1 gives information on sample coverage. Appendix Tables 18 and 19 give more detail on the numbers of observations for different sub-samples and on sample construction.

2 Early studies on intergenerational mobility in education include Spady (1967), Bowles (1972), and Blake (1985). More recently, Hertz et al. (2008) estimate country-level intergenerational mobility coefficients for various cohorts across 42 countries. Azam and Bhatt (2015) and Golley and Kong (2013) estimate mobility in education in India and China, respectively. However, these works do not map ethnic and religious differences in mobility, as we do.

Table 1: Number of observations, ethnic and religious groups by country

| country | N religion sample | N ethnicity sample | n religions | n ethnicities |
|---|---|---|---|---|
| Botswana | 370,428 | 370,428 | 4 | 8 |
| Burkina Faso | 2,498,870 | 1,417,824 | 6 | 13 |
| Cameroon | 1,772,359 | 0 | 8 | 0 |
| Egypt | 19,983,770 | 0 | 3 | 0 |
| Ethiopia | 15,882,990 | 12,478,684 | 6 | 11 |
| Ghana | 4,360,422 | 4,360,422 | 7 | 9 |
| Guinea | 1,186,908 | 0 | 7 | 0 |
| Liberia | 348,057 | 348,057 | 5 | 16 |
| Malawi | 2,333,370 | 1,341,977 | 4 | 11 |
| Mali | 1,451,856 | 3,228,570 | 5 | 13 |
| Morocco | 0 | 2,776,746 | 0 | 5 |
| Mozambique | 2,047,048 | 2,047,048 | 6 | 18 |
| Nigeria | 72,191 | 0 | 4 | 0 |
| Rwanda | 843,392 | 0 | 7 | 0 |
| Senegal | 1,694,761 | 1,694,761 | 5 | 10 |
| Sierra Leone | 494,298 | 494,298 | 7 | 13 |
| South Africa | 7,346,819 | 11,765,413 | 9 | 12 |
| Uganda | 4,045,909 | 4,045,909 | 8 | 21 |
| Zambia | 2,318,090 | 3,105,551 | 5 | 11 |
| total | 69,051,536 | 49,475,688 | 106 | 171 |

For the religion (ethnicity) analysis, we use information from 69,051,536 (49,475,688) individuals (14,936,410 and 10,597,161 households, respectively). After aggregating the records from small ethno-linguistic groups (those making up less than 1% of the country’s population), as well as groups belonging to the same ethnic family (ethnic clusters),4 we are left with 171 ethnic groups across 14 countries. For religion, we have three main categories. As Table 2 shows, Christians (Catholics, Protestants, and Orthodox) are about 47%, while Muslims (almost exclusively Sunnis) are around 46%. The remainder adhere to local traditional religions. As of 2015, the 18 (14) countries with religion (ethnicity) information were home to around 550 (150) million people, the difference reflecting mostly Nigeria and Egypt.

3 The data from Nigeria come from household surveys conducted in consecutive years over 2006-2010. To maximize coverage, we aggregate the yearly observations and count them as one census-year.
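The aggregation of ethnic groups described in this subsection can be illustrated with a minimal Python sketch. The function name `aggregate_groups`, the cluster lookup table, and the order of operations (cluster mapping first, then the 1% threshold) are assumptions made for illustration; the text does not say how the two steps are sequenced, and the records below are made up rather than taken from the census data.

```python
from collections import Counter

def aggregate_groups(records, clusters=None, min_share=0.01, small_label="Other small"):
    """Aggregate ethnic groups within each country.

    records  : list of (country, group) pairs, one per individual
    clusters : hypothetical lookup {(country, group): cluster_name}
    min_share: groups below this population share are pooled (assumed 1%)
    """
    clusters = clusters or {}
    # Step 1 (assumed to come first): map cluster members to the cluster name.
    mapped = [(country, clusters.get((country, group), group))
              for country, group in records]
    # Step 2: compute each group's share of its country's records.
    country_totals = Counter(country for country, _ in mapped)
    group_totals = Counter(mapped)
    result = []
    for country, group in mapped:
        share = group_totals[(country, group)] / country_totals[country]
        # Groups below the threshold are pooled into a residual category.
        result.append((country, group if share >= min_share else small_label))
    return result

# Toy data: 1,000 made-up records for one country.
records = ([("Ghana", "Asante")] * 600 + [("Ghana", "Fante")] * 300
           + [("Ghana", "Aowin")] * 5 + [("Ghana", "Ewe")] * 90
           + [("Ghana", "Hypothetical tiny group")] * 5)
akan_cluster = {("Ghana", g): "Akan" for g in ("Asante", "Fante", "Aowin")}
counts = Counter(group for _, group in aggregate_groups(records, akan_cluster))
print(dict(counts))
```

In this toy example the three Akan members are merged into one category, and the group with 0.5% of the records falls into the residual category.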

Table 2: Number of observations by major religion

| major religion | N per major religion |
|---|---|
| Catholic | 6,169,945 |
| Orthodox | 7,978,938 |
| Protestant | 10,900,517 |
| Pentecostal | 2,447,822 |
| Christian, other / unspecified | 4,940,305 |
| Muslim | 32,016,232 |
| Animist | 621,070 |
| Traditional | 978,581 |
| Other | 731,379 |
| No religion | 2,020,565 |
| Missing | 246,182 |
| total | 69,051,536 |

Besides religious and ethnic affiliation, IPUMS gives information on individuals’ residence, typically at the level of admin-2 divisions. This allows us to account for locational features and also to compare the ethnic and religious gaps in social mobility to regional differences, which are also large (as we show in Alesina et al. (2018)). Finally, throughout this paper, we also make use of information on individuals’ gender as well as on whether households were classified as urban or rural by the national statistical agencies running the census.

2.2 Intergenerational Mobility in Education

We construct measures of absolute intergenerational mobility that reflect the likelihood that young individuals (children) acquire higher educational attainment than individuals from the immediately older generation living in the same household.5 This transparent and simple approach follows Chetty et al. (2014), Chetty and Hendren (2018a), Chetty and Hendren (2018b), and Card, Domnisoru, and Taylor (2018), as well as our companion work on the

4 For example, the Asante and Fante groups in Ghana are sizeable groups in their own right but are, together with smaller groups such as the Aowin and Wasa, part of the larger Akan cluster, which is why all individuals from those groups are aggregated into the “Akan” category. In Ethiopia, the Kembatigna and Dawurogna are small groups, each less than 1% of the population in size, and hence are aggregated into an “other small” category. In Uganda, the , Karimojong, Dodoth, Tepeth and Suk all belong to the Karamojong cluster and so we aggregate them into a single “Karamojong” category.

5 We assign individuals to a generation inside a household according to the individual’s relation to the household head. The earlier literature has mostly relied on measures of relative intergenerational mobility (Black and Devereux (2011)). These are based on regressing the schooling of the young on the schooling of their parents, controlling for other features as needed. In the earlier draft of the paper, we also used relative IM measures. As the results are similar to the ones with absolute mobility, and educational mobility in Africa is primarily about the transition between zero schooling and primary education, we focus on absolute mobility.

land of opportunity across Africa.

For each religious and ethnic group in every country, we construct four-by-four matrices across the main educational attainment categories of the “young” (children) and the “old” generation (parents). The educational categories are less than completed primary (mostly no schooling), completed primary, completed secondary, and completed tertiary. Figure 1 (a) shows the Africa-wide transition matrix using all censuses across the 18 countries with religion data. Figures 1 (b) and (c) tabulate the transition matrices for Christian and Muslim households. Three-fourths of the “old” generation has not completed primary schooling.6 Because of the preponderance of illiteracy among the old, in the rest of the paper we mostly focus on the likelihood that children of parents without any schooling or with less than completed primary (whom we label as “illiterate”) manage to at least complete primary education (we label them as “literate”). To maximize coverage, we pool all censuses from a given country. We define the following dummy variables:

• $I^{illit}_{0,ibcert} = 1$ if the parent of individual $i$ born in birth-decade $b$ in country $c$, of ethnicity $e$ or religion $r$, and observed in census-year $t$ is illiterate (either no schooling or less than completed primary), and zero otherwise.

• $I^{lit,illit}_{1,ibcert} = 1$ if a child $i$ born to illiterate parents in birth-decade $b$ in country $c$, of ethnicity $e$ or religion $r$, and observed in census-year $t$ is literate, and zero otherwise. Again, we define literacy as having completed at least primary education.

Pooling observations across all censuses in a given country, we run the following regressions country by country:

$$I^{illit}_{0,ibcert} = [\alpha^{o}_{r}] + [\alpha^{o}_{e}] + [\gamma^{o}_{b} + \delta^{y}_{b} + \theta_{t}] + \epsilon_{ibcert} \tag{1}$$

$$I^{lit,illit}_{1,ibcert} = [\alpha^{y}_{r}] + [\alpha^{y}_{e}] + [\gamma^{o}_{b} + \delta^{y}_{b} + \theta_{t}] + \epsilon_{ibcert}, \tag{2}$$

where $I^{illit}_{0,ibcert}$ and $I^{lit,illit}_{1,ibcert}$ are the indicators for parental (superscript $o$) and child (superscript $y$) education. We run the first specification in the full sample of individuals for whom we observe previous-generation education, and we estimate the second specification for all children of “uneducated” parents (either with no schooling or without completed primary). To account for unobserved factors, we estimate specifications conditioning on birth-cohort fixed effects (for both the “young” ($\delta^{y}_{b}$) and the “old” ($\gamma^{o}_{b}$)), as well as census-year fixed effects ($\theta_{t}$).

6 To estimate transition likelihoods for the completion of schooling beyond secondary, we use a different sample restriction than the 14-year-old age cutoff we employ in the rest of the paper: first, we require individuals to be at least 18, and second, we require them to be at least 9 years older than their years of schooling. Since schooling usually begins at age 6, this gives a reasonable 3-year buffer to make sure that individuals’ educational attainment is not misclassified. We use these sample restrictions only for this initial inspection of the data and do not rely on them anywhere else. Since we drop younger individuals in censuses in the 1990s and 2000s so as to allow Africans to complete schooling, these statistics capture the sizable expansion of tertiary schooling over the past two decades only in a limited fashion.
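The child indicator and the second regression can be illustrated with a stripped-down sketch. If the cohort and census-year fixed effects are dropped, an OLS regression of the child indicator on a full set of group dummies simply returns group means, that is, the share of children of illiterate parents who complete at least primary schooling. The Python below computes these shares on made-up records; the function names and the toy data are hypothetical, and this is not the estimation code used in the paper.

```python
from collections import defaultdict

# Attainment categories in ascending order, as listed in the text.
LEVELS = ["less than primary", "primary", "secondary", "tertiary"]

def is_literate(attainment):
    """'Literate' in the paper's informal sense: at least completed primary."""
    return LEVELS.index(attainment) >= 1

def upward_mobility_by_group(pairs):
    """pairs: iterable of (group, parent_attainment, child_attainment).

    Returns {group: share of children of illiterate parents who are literate}.
    """
    n_children = defaultdict(int)
    n_upward = defaultdict(int)
    for group, parent, child in pairs:
        if is_literate(parent):
            continue  # only children of illiterate parents enter this sample
        n_children[group] += 1
        n_upward[group] += int(is_literate(child))
    return {g: n_upward[g] / n_children[g] for g in n_children}

# Toy records for two hypothetical groups, A and B.
pairs = [
    ("A", "less than primary", "primary"),
    ("A", "less than primary", "secondary"),
    ("A", "less than primary", "less than primary"),
    ("A", "less than primary", "primary"),
    ("A", "primary", "secondary"),            # literate parent: excluded
    ("B", "less than primary", "less than primary"),
    ("B", "less than primary", "primary"),
]
mobility = upward_mobility_by_group(pairs)
print(mobility)
```

Adding birth-cohort and census-year effects, as in the specifications above, would require a regression rather than simple group means.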

Figure 1: Visualization of transition likelihoods with religion data

[Figure omitted. Panels: (a) Africa, (b) Christian, (c) Muslim. In each panel, the horizontal axis shows the fraction of observations by parental attainment and the vertical axis shows the conditional likelihood of child attainment (0 to 1). Attainment categories: less than primary, primary completed, secondary completed, tertiary completed.]

2.3 Cohabitation Selection

We can only estimate mobility for individuals who reside with their parents. This raises concerns of sample selection if the transmission of education between parents and children who live apart systematically differs from that of co-resident individuals. By itself, this should push us to include only young children in the sample. The problem is, of course, that young children may not have completed their schooling, so the younger we make the sample, the greater the risk of misclassifying individuals as “less-than-primary” when in fact they would complete primary education one or two years after we observe them in the census. We deal with this tension between cohabitation selection and education misclassification in two ways.

To address cohabitation selection, we follow Card, Domnisoru, and Taylor (2018) and begin by estimating intergenerational mobility (IM) for the sample of individuals aged 14-18, for whom the in-sample co-residence rate is 87.5%. We then complement the estimates from the restricted 14-18 sample with estimates from the full sample of individuals aged 14 and above (co-residence rate 33%) and compare the estimates of IM from the two samples to gauge the extent to which our results may be driven by cohabitation selection.

To address the problem of education misclassification, we use census information on individual school enrolment to upward-correct the educational attainment of individuals close to completing primary education. Specifically, if individuals have four or five years of schooling at the time we observe them and have educational attainment of “less-than-primary”, we record their attainment as “completed primary” in a second “student-corrected” educational attainment variable. We then present results with and without this student correction. Tables 3 and 4 show co-residence rates by country.
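The student correction described above can be sketched in a few lines of Python. The function name, the category labels as strings, and in particular the `enrolled` flag are assumptions for illustration: the text says the correction uses school enrolment information but does not spell out the exact enrolment condition, so the flag is a guess at how it might enter.

```python
def student_corrected_attainment(attainment, years_of_schooling, enrolled):
    """Return the 'student-corrected' attainment category.

    attainment        : recorded category, e.g. "less-than-primary"
    years_of_schooling: completed years observed in the census
    enrolled          : True if the person is currently in school
                        (assumed condition, not stated explicitly in the text)
    """
    close_to_primary = years_of_schooling in (4, 5)
    if attainment == "less-than-primary" and enrolled and close_to_primary:
        # Treat students about to finish primary as having completed it.
        return "completed primary"
    return attainment

# A 14-year-old student with five years of schooling is upgraded,
# one with three years is not, and non-students are left unchanged.
print(student_corrected_attainment("less-than-primary", 5, True))
print(student_corrected_attainment("less-than-primary", 3, True))
print(student_corrected_attainment("less-than-primary", 4, False))
```

Keeping the corrected value in a separate variable, as the text describes, makes it straightforward to report results with and without the correction.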

Table 3: Co-residence rates ages 14-18 and 14-100 vs. age 8, education data observed for all, religion sample

| country | age 8 | age 14-18 | age 14-100 | $\Delta^{8}_{14-18}$ | $\Delta^{8}_{14-100}$ |
|---|---|---|---|---|---|
| Burkina Faso | 96.75 | 78.47 | 27.94 | -18.90 | -71.12 |
| Botswana | 90.14 | 80.71 | 35.28 | -10.46 | -60.86 |
| Cameroon | 95.61 | 80.10 | 33.38 | -16.23 | -65.09 |
| Egypt | 98.27 | 94.62 | 36.99 | -3.72 | -62.36 |
| Ethiopia | 98.02 | 83.24 | 27.79 | -15.08 | -71.65 |
| Ghana | 94.93 | 88.58 | 36.13 | -6.69 | -61.94 |
| Guinea | 94.38 | 77.71 | 29.11 | -17.67 | -69.16 |
| Liberia | 95.99 | 90.63 | 35.52 | -5.58 | -63.00 |
| Mali | 97.13 | 84.17 | 37.02 | -13.35 | -61.89 |
| Mozambique | 95.60 | 80.86 | 25.68 | -15.41 | -73.14 |
| Malawi | 99.95 | 86.85 | 23.07 | -13.11 | -76.92 |
| Nigeria | 92.91 | 89.27 | 33.91 | -3.92 | -63.50 |
| Rwanda | 94.53 | 86.57 | 32.30 | -8.42 | -65.83 |
| Senegal | 97.98 | 92.26 | 47.21 | -5.84 | -51.81 |
| Sierra Leone | 91.58 | 82.28 | 38.89 | -10.15 | -57.53 |
| Uganda | 97.68 | 80.70 | 25.81 | -17.39 | -73.58 |
| South Africa | 88.62 | 83.72 | 36.48 | -5.53 | -58.84 |
| Zambia | 95.77 | 84.14 | 34.75 | -12.14 | -63.72 |
| overall | 96.06 | 87.55 | 33.69 | -8.85 | -64.92 |

This table shows the number of individuals of different age ranges for whom previous-generation education as well as their own is observed, as a percentage of all individuals with data on their own education as well as their relationship to the household head. The latter does not exclude single-person households, since these individuals will be labelled “head”. The columns titled $\Delta^{8}_{m-n}$ show the proportionate reduction in the percentages relative to individuals aged 8.
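As a worked illustration of the last two columns, the proportionate reduction relative to age 8 can be computed as below. This is a minimal Python sketch; the function name is hypothetical, and the inputs are the rounded rates printed in the table, so the last digit can differ slightly from the published column for some countries.

```python
def proportionate_change(rate_age8, rate_other):
    """Percentage change of a co-residence rate relative to the age-8 rate."""
    return 100.0 * (rate_other - rate_age8) / rate_age8

# Botswana: 90.14% at age 8, 80.71% at ages 14-18, 35.28% at ages 14-100.
delta_14_18 = round(proportionate_change(90.14, 80.71), 2)
delta_14_100 = round(proportionate_change(90.14, 35.28), 2)
print(delta_14_18, delta_14_100)
```

For Botswana this reproduces the two entries reported in the table.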

Table 4: Co-residence rates ages 14-18 and 14-100 vs. age 8, education data observed for all, ethnicity sample

| country | age 8 | age 14-18 | age 14-100 | $\Delta^{8}_{14-18}$ | $\Delta^{8}_{14-100}$ |
|---|---|---|---|---|---|
| Burkina Faso | 96.13 | 77.34 | 25.74 | -19.55 | -73.23 |
| Botswana | 90.14 | 80.71 | 35.28 | -10.46 | -60.86 |
| Ethiopia | 98.19 | 85.20 | 30.51 | -13.23 | -68.92 |
| Ghana | 94.93 | 88.58 | 36.13 | -6.69 | -61.94 |
| Liberia | 95.99 | 90.63 | 35.52 | -5.58 | -63.00 |
| Morocco | 99.59 | 97.42 | 50.86 | -2.18 | -48.93 |
| Mali | 97.49 | 82.25 | 34.23 | -15.63 | -64.88 |
| Mozambique | 95.60 | 80.86 | 25.68 | -15.41 | -73.14 |
| Malawi | 99.98 | 89.51 | 23.62 | -10.47 | -76.37 |
| Senegal | 97.98 | 92.26 | 47.21 | -5.84 | -51.81 |
| Sierra Leone | 91.58 | 82.28 | 38.89 | -10.15 | -57.53 |
| Uganda | 97.68 | 80.70 | 25.81 | -17.39 | -73.58 |
| South Africa | 88.44 | 82.98 | 34.81 | -6.18 | -60.64 |
| Zambia | 96.79 | 86.48 | 35.63 | -10.64 | -63.19 |
| overall | 95.12 | 84.96 | 34.22 | -10.68 | -64.03 |

This table shows the number of individuals of different age ranges for whom previous-generation education as well as their own is observed, as a percentage of all individuals with data on their own education as well as their relationship to the household head. The latter does not exclude single-person households, since these individuals will be labelled “head”. The columns titled $\Delta^{8}_{m-n}$ show the proportionate reduction in the percentages relative to individuals aged 8.

Figure 2: Distribution of co-residence rates at different ages by religious and ethnic groups

[Figure omitted. Panels: (a) Religions, levels; (b) Religions, country FEs; (c) Ethnicities, levels; (d) Ethnicities, country FEs. Each panel plots the density of group-level co-residence rates (levels panels) or co-residence rate residuals (country FE panels), separately for age 8 and ages 14-18.]

Figure 2 plots histograms of co-residence rates by religious and ethnic groups for age 8 and ages 14-18. Since most of our analysis later will be conditional on country fixed effects, we also report the distribution of co-residence rates within country. While there is variation in co-residence among cultural groups, for almost no group is the rate at ages 14-18 more than 20 percentage points below the corresponding rate at age 8. Virtually no group has a co-residence rate that differs by more than 10 percent from the country mean. As we show below, our main results hold for groups with low and high co-residence rates, as well as when we directly control for the rate of co-residence.
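The within-country distributions referred to above amount to residualizing group-level co-residence rates on country fixed effects, which, with no other controls, is the same as subtracting the country mean. The Python sketch below illustrates this on made-up numbers; the function name is hypothetical, and the use of an unweighted mean across groups is an assumption (the paper may weight groups differently).

```python
from collections import defaultdict

def within_country_residuals(rates):
    """rates: list of (country, group, co_residence_rate).

    Returns the same list with each rate replaced by its deviation from
    the unweighted country mean (assumed weighting, for illustration).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for country, _, rate in rates:
        sums[country] += rate
        counts[country] += 1
    means = {c: sums[c] / counts[c] for c in sums}
    return [(c, g, rate - means[c]) for c, g, rate in rates]

# Made-up rates for two hypothetical countries.
rates = [("X", "group a", 0.80), ("X", "group b", 0.90), ("Y", "group c", 0.70)]
residuals = within_country_residuals(rates)
print(residuals)
```

A country with a single group always has a residual of zero, so only countries with several groups contribute to the spread of the residual distribution.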

3 Social Mobility across Religious and Ethnic Lines

3.1 Education across Ethnic and Religious Groups

Before presenting our estimates of IM across ethnic and religious groups, it is useful to provide some illustrations of the evolution of education. This helps in understanding the underlying data and visualizing the dynamics of educational inequality across ethnic and religious groups.

Figure 3: Average years of schooling over time across religious groups for 4 countries

[Figure 3 panels: (a) Botswana; (b) Ethiopia; (c) Ghana; (d) Malawi. Horizontal axis: birth decade (20-29 to 80-89); vertical axis: mean years of schooling. One line per religious group.]

Figure 4: Average years of schooling over time across ethnic groups for 4 countries

[Figure 4 panels: (a) Botswana; (b) Ethiopia; (c) Ghana; (d) Malawi. Horizontal axis: birth decade (20-29 to 80-89); vertical axis: mean years of schooling. One line per ethnic group.]

Figures 3 (a)-(d) portray the evolution of education across the main religious groups, while Figures 4 (a)-(d) illustrate the dynamics of schooling at the ethnicity level, in both cases for four sample countries. The graphs plot average years of schooling for ten-year birth cohorts, going back to the pre-independence period.

Panel (a) illustrates Botswana’s success: not only has education risen for the cohorts educated around independence in the mid-1960s, but religious and ethnic gaps have also narrowed from their notable levels at independence. Comparing the 1950s-born and the 1980s-born cohorts, the education of the Tswana tribes doubles. The education of the Sesarwa and the Sembukushu, which was close to zero at independence, has also risen considerably, though it remains lower than that of the Tswana and the non-African ethnicities.

The patterns for Ethiopia in panel (b) are different. Schooling was quite low for the 1950s-born cohort, around 1.2 years on average. Christian groups had close to 2 years of schooling, while mean education for Muslims (like the Somali residing in the Ogaden region close to Somalia) was below 0.5 years. Education has increased modestly in the country, but most of the gains have accrued to Christians. Hence, religious differences in education have increased. The ethnic patterns also reveal the absence of convergence. The Amhara, Hadiya, and Gurage groups have experienced larger increases in education than all other groups, but the Tigray saw a massive increase in education for the 1980s-born cohort. This likely reflects the key role of the Tigray in the defeat of the Marxist Derg regime in 1991 and their major role in Ethiopian politics since then. Mean years of schooling for Somalis born in the 1980s remain below one, illustrating their miserable conditions.

In panel (c) we plot the dynamics of education across religious and ethnic groups in Ghana, which is emerging as an African success story. Education has increased, but most of the gains have accrued to the dominant groups, the Akan, the Ewe and the Ga, which at independence had on average higher literacy. The mostly Muslim groups in the North, the Gurma, the Mole, the Grusi and the Mande, have seen only modest increases in education.

The visualization of education in Malawi in panel (d) shows that increases in education have been spread equally across ethnic and religious lines. As a consequence, religious and ethnic inequality in education has remained roughly constant.
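The cohort profiles described above are cell means of years of schooling by group and birth decade. The following is a minimal sketch of that aggregation, assuming individual records of the form (group, birth year, years of schooling); the function name and the toy records are hypothetical and are not taken from the census data.

```python
from collections import defaultdict


def mean_schooling_by_cohort(records):
    """Average years of schooling per (group, birth decade) cell.

    `records` is an iterable of (group, birth_year, years_of_schooling).
    """
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of years, count]
    for group, birth_year, years in records:
        decade = (birth_year // 10) * 10
        cell = totals[(group, decade)]
        cell[0] += years
        cell[1] += 1
    return {key: total / n for key, (total, n) in totals.items()}


# Made-up records for illustration only.
toy = [
    ("A", 1952, 2.0), ("A", 1958, 4.0), ("A", 1983, 8.0),
    ("B", 1955, 0.0), ("B", 1981, 3.0), ("B", 1989, 5.0),
]
means = mean_schooling_by_cohort(toy)
# Between-group gap in mean schooling for the 1950s and 1980s cohorts.
gap_1950s = means[("A", 1950)] - means[("B", 1950)]
gap_1980s = means[("A", 1980)] - means[("B", 1980)]
```

Comparing the gap across cohorts is one simple way to read convergence (a shrinking gap) or divergence (a widening gap) off such profiles.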

3.2 Intergenerational Mobility across Religious Groups

Table 5 gives the IM estimates for the country (ignoring religious affiliation), for the average religious group, as well as absolute IM estimates for the religious group with the highest and the lowest mobility.

Table 5: Country-level summary statistics of religion-level IM with student correction

country         country-    mean group-   stdev group-   min     min group      max     max group
                level IM    level IM      level IM
Botswana        0.864       0.810         0.101          0.667   Muslim         0.890   Christian, unspecified
Burkina Faso    0.159       0.178         0.090          0.058   Animist        0.277   Christian, Catholic
Cameroon        0.613       0.658         0.108          0.461   Muslim         0.742   Christian, Protestant
Egypt           0.630       0.639         0.012          0.629   Muslim         0.652   Christian, unspecified
Ethiopia        0.182       0.152         0.080          0.050   Traditional    0.226   Christian, Catholic
Ghana           0.638       0.612         0.143          0.341   Traditional    0.741   Christian, Protestant
Guinea          0.285       0.340         0.114          0.247   No religion    0.503   Christian, Catholic
Liberia         0.368       0.251         0.145          0.071   Other          0.434   Muslim
Malawi          0.490       0.400         0.108          0.253   No religion    0.511   Christian, unspecified
Mali            0.348       0.321         0.105          0.230   No religion    0.492   Christian, unspecified
Mozambique      0.147       0.151         0.041          0.100   Muslim         0.195   Christian, Protestant
Nigeria         0.668       0.491         0.265          0.289   Other          0.846   Christian, unspecified
Rwanda          0.323       0.279         0.107          0.137   No religion    0.443   Muslim
Senegal         0.290       0.421         0.105          0.280   Muslim         0.570   Christian, Catholic
Sierra Leone    0.418       0.366         0.156          0.095   Traditional    0.507   Christian, Protestant
South Africa    0.861       0.878         0.028          0.818   No religion    0.916   Muslim
Uganda          0.550       0.406         0.250          0.034   Traditional    0.651   Muslim
Zambia          0.569       0.521         0.109          0.334   No religion    0.600   Muslim

All estimates are unconditional and are for the sample of individuals aged 14-18 with student correction.
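The group-level statistics summarized in this table can be sketched as follows. This is an illustration under stated assumptions: IM is taken to be the share of children who reach the attainment threshold among children whose parents did not, the student correction and the fixed-effects netting are not implemented, and the population standard deviation is used (the text does not say which variant the table reports). Function names and the toy records are hypothetical.

```python
from statistics import mean, pstdev


def group_im(records):
    """Upward-mobility share per group.

    `records` is an iterable of (group, parent_attained, child_attained).
    Only children whose parents did not attain the threshold are counted.
    """
    counts = {}
    for group, parent_ok, child_ok in records:
        if parent_ok:
            continue  # IM conditions on the "old" generation lacking attainment
        n_up, n = counts.get(group, (0, 0))
        counts[group] = (n_up + int(child_ok), n + 1)
    return {g: n_up / n for g, (n_up, n) in counts.items()}


def summarize(im_by_group):
    """Mean, dispersion, and extreme groups, as in the summary table."""
    values = list(im_by_group.values())
    return {
        "mean": mean(values),
        "stdev": pstdev(values),  # assumption: population standard deviation
        "min_group": min(im_by_group, key=im_by_group.get),
        "max_group": max(im_by_group, key=im_by_group.get),
    }


# Made-up records for illustration only.
toy = (
    [("X", False, True)] * 3 + [("X", False, False)] * 1
    + [("Y", False, True)] * 1 + [("Y", False, False)] * 3
    + [("Y", True, True)]  # excluded: parent already attained the threshold
)
im = group_im(toy)
stats = summarize(im)
```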

Appendix Table 22 gives the IM estimates for each religious group in all 18 countries. Several noteworthy patterns emerge. First, there are sizable cross-country differences in social mobility, a pattern that we analyze in detail in our companion paper (Alesina et al. (2018)). The mean IM ranges from 0.81 in Botswana and 0.73 in South Africa to 0.12 in Ethiopia, 0.15 in Malawi and 0.16 in Burkina Faso and Rwanda. Cross-country differences in social mobility are considerable for both rural and urban households and for both boys and girls (Appendix Tables 20 and 21).

How much of the variation in intergenerational mobility across religious groups is explained by country-level factors? To answer this question, we regress individual mobility indicators, in the sample of children of illiterate old, on census and birth-decade fixed effects as well as a set of dummies of interest. These dummies are either country or religion dummies. For each regression, we record the marginal R-squared that results from including the additional dummies. Table 6 reports the results.

Table 6: Marginal R-squared country and group constants, religion sample

sample    country   religion   ∆
full      0.066     0.075      0.010
urban     0.019     0.025      0.007
rural     0.069     0.078      0.010
male      0.066     0.075      0.010
female    0.068     0.078      0.010

This table shows results of regressions of individual-level IM (dummy equal to one if the individual is literate and zero otherwise, in the sample with illiterate old) on census-year fixed effects and birth-decade (for young and old) fixed effects as well as one set of fixed effects of interest. The latter are either country or religion fixed effects, and the columns with those titles show the marginal R-squared of those fixed effects beyond the time and cohort dummies. The column titled ∆ shows the difference between the two.
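One way to read "marginal R-squared" is as the increase in R-squared when an extra set of dummies is added to a baseline set. The sketch below follows that reading; it is an assumption about the procedure, not the authors' code. It fits the dummy regressions by repeatedly sweeping out group means (alternating projections), which converges to the residual of the joint dummy regression. All names and the toy data are hypothetical.

```python
def residualize(y, factors, tol=1e-12, max_iter=1000):
    """Residual of y after regressing on a constant and all dummy sets.

    Each factor is a list of labels, one per observation. Group means are
    swept out factor by factor until the adjustments vanish.
    """
    factors = [[0] * len(y)] + list(factors)  # constant term plus dummy sets
    r = list(y)
    for _ in range(max_iter):
        biggest_shift = 0.0
        for labels in factors:
            sums, counts = {}, {}
            for value, label in zip(r, labels):
                sums[label] = sums.get(label, 0.0) + value
                counts[label] = counts.get(label, 0) + 1
            means = {label: sums[label] / counts[label] for label in sums}
            r = [value - means[label] for value, label in zip(r, labels)]
            biggest_shift = max(biggest_shift, max(abs(m) for m in means.values()))
        if biggest_shift < tol:
            break
    return r


def r_squared(y, factors):
    grand_mean = sum(y) / len(y)
    tss = sum((v - grand_mean) ** 2 for v in y)
    rss = sum(e ** 2 for e in residualize(y, factors))
    return 1.0 - rss / tss


def marginal_r_squared(y, baseline_factors, extra_factor):
    """Gain in R-squared from adding one more set of dummies."""
    with_extra = r_squared(y, list(baseline_factors) + [extra_factor])
    return with_extra - r_squared(y, baseline_factors)


# Made-up data: a literacy dummy, a birth cohort, a country, and a group label.
y = [1, 1, 0, 0, 1, 0, 1, 0]
cohort = ["c1", "c1", "c1", "c1", "c2", "c2", "c2", "c2"]
country = ["A", "A", "B", "B", "A", "B", "A", "B"]
group = ["g1", "g1", "g2", "g3", "g1", "g3", "g2", "g2"]

m_country = marginal_r_squared(y, [cohort], country)
m_group = marginal_r_squared(y, [cohort], group)
```

In the toy data the country labels predict the outcome perfectly, so `m_country` equals one while `m_group` is strictly smaller; with real data both numbers would be far below one, as in the table.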

Once country dummies, which have a marginal R-squared of around 6% (except in the urban sample, where they explain less), are accounted for, religion constants explain about 1 additional percent of the variation in intergenerational mobility.

Second, in most countries there are large within-country differences in educational IM. The most striking case is Nigeria, where IM is around 0.85 for Christians, 0.54 for Muslims, and just 0.28 for Nigerians who adhere to traditional religions. Uganda is another country with large religious differences in social mobility, with IM ranging from just 3% for traditional religions to 65% for Muslims. In Mozambique, the likelihood that the children of parents without any formal education will complete at least primary schooling is 0.16 and 0.20 for Christian Catholics and Protestants, respectively, and only 0.10 for Muslims. In Egypt and in Sierra Leone, religious differences in IM are not particularly large for the main religions. Figure 5 illustrates the cross-country and within-country differences in social mobility, zooming in on five West African countries. It clearly shows both the clustering of high mobility in certain countries (Cameroon and Ghana stand out as countries of relatively high mobility) and stark within-country differences in mobility across groups, with gaps of 3:1 to 5:1 between the highest- and the lowest-mobility religions.

Figure 5: Simple IM by group for five West-African countries, ages 14-18 with student correction

[Figure 5: horizontal bar charts of IM (scale 0.00 to 1.00) by religious group, one panel each for Burkina Faso, Cameroon, Ghana, Nigeria, and Sierra Leone.]

Third, while there are some exceptions, the analysis reveals a clear ranking of religions by IM. Social mobility is in most countries the highest for Christian groups, and especially for Protestants, followed by Catholics and Orthodox. It is the lowest for Africans who follow traditional religions; IM for African Muslims tends to be in the middle, higher than for those adhering to local religions but almost always lower than for Christians. The only cases where IM is higher for Muslim than for Christian groups are countries with tiny (less than 2%) Muslim communities (Rwanda, South Africa, and Zambia) plus Liberia and Uganda (where Muslims are 10% and 9.4% of the population, respectively).

Fourth, while social mobility is higher for boys and for urban households, the gender and rural-urban gaps differ across religions (in different countries). To gauge these gaps, we calculate the difference in IM across gender and across rural-urban status and divide by the pooled mean IM for each country-religion group. The gender gap is quite high in Guinea (median gap > 70%) and Senegal (median > 40%) and low in Botswana and South Africa, where the median gap is negative (women more mobile). The rural-urban gap is highest in Ethiopia and Burkina Faso (median gaps of 3:1 and 2:1) and lowest in Egypt and South Africa (median gaps negative and almost zero, respectively).
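The gap measure described above (a difference in IM divided by the pooled mean IM, then summarized by the country median) can be sketched as follows. The function name and the numbers are hypothetical and serve only to show the arithmetic.

```python
from statistics import median


def relative_gap(im_a: float, im_b: float, im_pooled: float) -> float:
    """Difference in IM between two subgroups, scaled by the pooled IM."""
    return (im_a - im_b) / im_pooled


# Made-up (male IM, female IM, pooled IM) triples for three religious groups
# in one country.
groups = [
    (0.50, 0.30, 0.40),
    (0.44, 0.36, 0.40),
    (0.38, 0.42, 0.40),  # negative gap: women more mobile
]
gender_gaps = [relative_gap(m, f, pooled) for m, f, pooled in groups]
median_gender_gap = median(gender_gaps)
```

The same function applies to the rural-urban gap by passing urban and rural IM in place of male and female IM.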

3.3 Intergenerational Mobility across Ethnic Groups

We now turn to our estimates of social mobility across ethnic groups. Appendix table 23 reports the absolute IM estimates, as well as the population share and literacy rates for all 171 ethnic groups in 14 countries. Table 7 summarizes the newly-constructed estimates, reporting for each country the ethnicity with the highest and the lowest IM estimate alongside the country mean.

Table 7: Country-level summary statistics of ethnicity-level IM with student correction

country         country-    mean group-   stdev group-   min     min group      max     max group
                level IM    level IM      level IM
Botswana        0.864       0.814         0.132          0.537   Sesarwa        0.932   Kalanga / Sekalaka
Burkina Faso    0.186       0.210         0.192          0.046   Peul           0.803   French
Ethiopia        0.176       0.160         0.064          0.040   Somali         0.264   Amhara
Ghana           0.640       0.624         0.088          0.456   Gurma          0.765   Akan
Liberia         0.368       0.407         0.070          0.261   Bassa          0.498   Mende
Malawi          0.473       0.524         0.078          0.421   Yao            0.624   Tumbuka
Mali            0.254       0.196         0.084          0.067   Maure          0.320   Bambara, Malinke, Dioula
Morocco         0.454       0.438         0.072          0.363   Tchalhit       0.544   Other small
Mozambique      0.150       0.160         0.095          0.049   Ciyao          0.332   Portuguese
Senegal         0.290       0.357         0.117          0.205   Fula           0.611   Jola
Sierra Leone    0.418       0.416         0.113          0.181   Koranko        0.603   Krio
South Africa    0.880       0.891         0.045          0.823   Other IPUMS    0.941   Tshivenda
Uganda          0.550       0.557         0.145          0.047   Karamojong     0.725   Bagisu
Zambia          0.559       0.565         0.039          0.491   Nyanja         0.628   Kaonde

All estimates are unconditional and are for the sample of individuals aged 14-18 with student correction.

Our approach identifies small ethnic groups with considerable political and economic power, what Chua (2003) terms “market-dominant” ethnic minorities. For example, the Krio (Creole), who have dominated politics in Sierra Leone, have the highest share of upward mobility, 0.6 (compared to a country mean of 0.42). The small French-speaking minority in Burkina Faso, just 1.6% of the population, has by far the highest IM, 0.80, compared to one of the lowest country means, 0.19. In a few countries the IM estimates are highest for large ethnic groups that, according to anecdotal evidence, play a dominant role in national politics; examples are the Amhara in Ethiopia (35% of the population), the Akan-Ashante in Ghana (47% of the population), and the Bambara in Mali (54% of the population).

As we have done for religious groups, we analyse the proportion of variance accounted for by country and group constants, respectively. The results are in Table 8. Two features are worth noting in comparison to the earlier religion analysis. First, country dummies explain a smaller share of the variance in the ethnicity sample (2 percent vs 6 percent, except for the urban sample, where it is 2 percent in both cases). Second, ethnicity constants are more important in absolute terms than country constants in the ethnicity sample, and relatively much more important in comparison to the role that religion plays in explaining IM.

Table 8: Marginal R-squared country and group constants, ethnicity sample

sample    country   ethnicity   ∆
full      0.022     0.049       0.027
urban     0.019     0.055       0.036
rural     0.020     0.047       0.027
male      0.028     0.054       0.026
female    0.018     0.050       0.033

This table shows results of regressions of individual-level IM (dummy equal to one if the individual is literate and zero otherwise, in the sample with illiterate old) on census-year fixed effects and birth-decade (for young and old) fixed effects as well as one set of fixed effects of interest. The latter are either country or ethnicity fixed effects, and the columns with those titles show the marginal R-squared of those fixed effects beyond the time and cohort dummies. The column titled ∆ shows the difference between the two.

At the same time, there is a lot of within-country variation. For example, social mobility for the Somalis in Ethiopia is just 2.6%, while for the Amhara it is 26%. In Ghana the IM for the Akan (incl. Ashante) is 77%, while for the Gurma it is considerably lower, 46%. In Uganda, approximately 70% of boys and girls whose parents do not have any education manage to complete at least primary education, while for the Karamojong the corresponding likelihood is less than 5%. Even in small Liberia, the variation in social mobility across tribes is considerable, with the IM estimates ranging from 26% for the Bassa (12% of the population) to 50% for the Mende (just 1.4% of the population), with 43.3% for the Grebo (11.6% of the population).

There are, moreover, considerable ethnic-specific differences in IM across gender and rural-urban status. Boys are most advantaged relative to girls with respect to social mobility in Morocco and Ethiopia, whereas gender gaps are negative (girls more mobile than boys) in Malawi, Botswana and South Africa. The urban-rural divides are starkest in Ethiopia and Burkina Faso and smallest in South Africa and Uganda. As in the religion-level analysis, urban-rural gaps are much larger (2.6:1 and 2:1 in Ethiopia and Burkina Faso) than male-female gaps.

4 Main Patterns

We now turn to the main data patterns looking in particular at the association between ethnic-religious literacy and social mobility.

4.1 Africa-wide Patterns

Given the high inertia of religious and ethnic education across cohorts, we examine the association between intergenerational mobility and mean education of the “old” generation across religious and ethnic groups in each country. We also explore the role of ethnic (religious) size in explaining differences in IM within countries.

4.1.1 Religion

Figures 6 (a)-(b) plot the likelihood that children of parents without formal education manage to complete at least primary school (on the vertical axis) against the share of literacy of the “old” generation (on the horizontal axis) for the main religious groups in the 18 countries.

Figure 6: Association between parental literacy and IM (scatters conditional on country FEs, with student correction), major religious groupings

[Figure 6 panels: (a) y + b FEs, ages 14-18; (b) y + b FEs, ages 14-100. Horizontal axis: share literate old; vertical axis: share literate kids of illiterate old. Markers: Christian, Muslim, Other.]

Figure 7: Association between parental literacy and IM (scatters conditional on country FEs, with student correction), finer religious groupings

[Figure 7 panels: (a) y + b FEs, ages 14-18; (b) y + b FEs, ages 14-100. Horizontal axis: share literate old; vertical axis: share literate kids of illiterate old. Markers: Animist; Christian, Catholic; Christian, Orthodox; Christian, Other; Christian, unspecified; Christian, Protestant; Christian, Pentecostal; Muslim; Traditional; Other; No religion.]

Panel (a) looks at individuals aged 14-18, for whom selection is not a major concern, while Panel (b) looks at the full sample, ages 14 and older. In both graphs, we net out country and birth-cohort fixed effects. Marker colors distinguish Christian, Muslim, and other religious groups. There is a clear positive association, suggesting that social mobility has been significantly higher for those groups that had on average somewhat higher literacy rates. The pattern is similar when we allow for finer religious disaggregation (Figure 7). Table 9, columns (1)-(3), gives the corresponding country fixed-effects regression estimates.

Table 9: Literacy and IM at the religion-level

                      (1)           (2)           (3)           (4)           (5)           (6)
                      religion IM   religion IM   religion IM   religion IM   religion IM   religion IM
share literate old    0.584∗∗∗      0.737∗∗∗      0.721∗∗∗      0.508∗∗∗      0.683∗∗∗      0.589∗∗∗
                      (7.32)        (13.83)       (5.13)        (6.64)        (11.95)       (4.21)
R-squared             0.930         0.975         0.925         0.936         0.977         0.935
within-R-squared      0.644         0.815         0.603         0.510         0.746         0.447
N                     106           106           106           106           106           106
country-FEs           yes           yes           yes           yes           yes           yes
major religion FEs    no            no            no            yes           yes           yes
age-range             14-18         14-100        14-18         14-18         14-100        14-18
student correction    no            no            yes           no            no            yes

The dependent variable is the religion-level share of literate kids of illiterate parents (estimated country-by-country, net of census-year and old and young birth-decade fixed effects). The independent variable is the religion-level share of literate parents (also estimated country-by-country, net of fixed effects). All correlations are net of country fixed effects. Specifications with major religion fixed effects include dummies for Christian and Muslim faiths. t-statistics based on standard errors clustered at the country level in parentheses. ∗ p < 0.1, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

The coefficient in the 14-18 age sample is 0.58-0.72, suggesting that the likelihood that children of parents without completed primary education will complete at least primary schooling is around 6.5 percentage points higher for religious groups whose literacy rate is 10 percentage points higher. In columns (4)-(6) we add broad religion constants (Christian and Muslim, with “other” the excluded category) to account for broad cultural differences; these specifications exploit differences in social mobility and literacy rates of the same broad religious group across countries. By doing so we account for the fact that Africans adhering to Islam and traditional religions have on average lower parental literacy rates and lower IM. The coefficients on literacy become slightly smaller but remain significant. This could have been expected from an inspection of Figure 6: major religions are clearly ranked by both literacy and IM, so part of the association operates at the level of major religions, but the association also holds within major groups.

In Table 10 we distinguish across gender and across rural-urban status. The association between the share of literacy for the “old” generation and social mobility across country-religious groups is significant across all samples. The correlation is stronger for girls than for boys (boys from religious groups with low human capital are much more likely than girls to escape family illiteracy) and stronger in rural than in urban areas.
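The country fixed-effects slope discussed above, and its reading in percentage points, can be sketched as follows. This is a minimal within-estimator illustration: the function name and the data are hypothetical, and clustered standard errors are not computed.

```python
from collections import defaultdict


def within_slope(rows):
    """OLS slope of y on x after removing country means (country fixed effects).

    `rows` is an iterable of (country, x, y), one row per group, where x is
    the share of literate old and y is the group's IM.
    """
    by_country = defaultdict(list)
    for country, x, y in rows:
        by_country[country].append((x, y))
    sxy = sxx = 0.0
    for pairs in by_country.values():
        x_bar = sum(x for x, _ in pairs) / len(pairs)
        y_bar = sum(y for _, y in pairs) / len(pairs)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sxx += sum((x - x_bar) ** 2 for x, _ in pairs)
    return sxy / sxx


# Made-up groups in two countries with different levels but a common slope.
toy = [
    ("P", 0.2, 0.1), ("P", 0.4, 0.2), ("P", 0.6, 0.3),
    ("Q", 0.5, 0.6), ("Q", 0.7, 0.7), ("Q", 0.9, 0.8),
]
slope = within_slope(toy)
# Implied difference in IM for a 10 percentage point difference in literacy.
implied_difference = slope * 0.10
```

Applied to the estimates in the text, a coefficient of 0.65 (roughly the midpoint of the reported 0.58-0.72 range) times a 0.10 difference in the literacy share gives 0.065, that is, about 6.5 percentage points.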

Table 10: Literacy and IM at the religion-level, rural/urban, female/male heterogeneity

                      (1)          (2)          (3)          (4)          (5)          (6)
                      IM           IM           IM           IM           IM           IM

                      Panel A: rural subsample               Panel C: female subsample
share literate old    0.418∗∗∗     0.611∗∗∗     0.580∗∗∗     0.638∗∗∗     0.818∗∗∗     0.846∗∗∗
                      (6.78)       (8.73)       (6.29)       (12.48)      (17.96)      (15.05)
R-squared             0.305        0.459        0.282        0.515        0.725        0.593
N                     102          102          102          106          106          106

                      Panel B: urban subsample               Panel D: male subsample
share literate old    0.244∗       0.484∗∗∗     0.555∗∗∗     0.509∗∗∗     0.677∗∗∗     0.640∗∗∗
                      (1.71)       (3.45)       (4.18)       (7.46)       (9.90)       (8.29)
R-squared             0.034        0.133        0.138        0.300        0.380        0.329
N                     102          102          102          106          106          106

age-range             14-18        14-100       14-18        14-18        14-100       14-18
student correction    no           no           yes          no           no           yes

The dependent variable is the religion-level share of literate kids of illiterate parents (estimated net of census-year and old and young birth-decade fixed effects). The independent variable is the religion-level share of literate parents (also estimated net of fixed effects). t-statistics based on country-clustered standard errors in parentheses. ∗ p < 0.1, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

4.1.2 Ethnicity

Figure 8: Association between parental literacy and IM (scatters conditional on country FEs, with student correction)

[Figure 8 panels: (a) y + b FEs, ages 14-18; (b) y + b FEs, ages 14-100. Horizontal axis: share literate old; vertical axis: share literate kids of illiterate old. Markers by country: Burkina-Faso, Botswana, Ethiopia, Ghana, Liberia, Morocco, Mali, Mozambique, Malawi, Senegal, Sierra-Leone, Uganda, South Africa, Zambia.]

Figures 8 (a)-(b) illustrate the association between ethnic-level IM and mean literacy rates of the “old” generation across ethnic groups. There is an evident positive association, suggesting that social mobility, as reflected by the likelihood that boys and girls of parents without formal education manage to complete at least primary schooling, is significantly higher for ethnic groups with a relatively more educated parental generation. This result suggests that differences in education across ethnic lines tend to persist, as, on average, Africans from relatively more educated groups have benefited more from the expansion of education. The regression estimate on the ethnic share of literate old, shown in Table 11, is around 0.5, slightly lower than at the coarser religion level.

Table 11: Literacy and IM at the ethnicity-level

                      (1)            (2)            (3)
                      ethnicity IM   ethnicity IM   ethnicity IM
share literate old    0.510∗∗∗       0.649∗∗∗       0.535∗∗∗
                      (5.86)         (7.49)         (5.20)
R-squared             0.876          0.928          0.900
within-R-squared      0.527          0.670          0.472
N                     171            171            171
country-FEs           yes            yes            yes
age-range             14-18          14-100         14-18
student correction    no             no             yes

The dependent variable is the ethnicity-level share of literate kids of illiterate parents (estimated country-by-country, net of census-year and old and young birth-decade fixed effects). The independent variable is the ethnicity-level share of literate parents (also estimated country-by-country, net of fixed effects). All correlations are net of country fixed effects. t-statistics based on standard errors clustered at the country level in parentheses. ∗ p < 0.1, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

The association is somewhat stronger in rural than in urban households. As with religion, low levels of literacy at the ethnicity level are especially detrimental for girls; the coefficient for boys aged 14-18 is 0.435 (0.595) and for girls 0.595 (0.82) without (with) the student correction. Moreover, the R-squared is considerably higher in the female sample (see Appendix Table 24).

4.1.3 Ethnic, Religious and Regional Differences

In Alesina et al. (2018) we uncover large regional differences in education and IM; at the same time, there is high ethnic-religious segregation across most of Africa (Alesina and Zhuravskaya (2011)). So the strong association between religious or ethnic literacy and IM may reflect regional differences in education and mobility. In Table 12 we examine this possibility. Columns (1)-(4) examine the stability of the religious literacy - IM association as we progressively account for regional features at finer levels, while columns (5)-(8) examine the strength of the ethnic literacy - IM correlation. Columns (1) and (5) reproduce the baseline association, netting just census-year and birth-cohort constants. The regression coefficient is 0.72 and 0.535 for the religion and ethnic group specifications, respectively. In columns (2) and (6) we also net admin-1 unit fixed effects (e.g., provinces in Nigeria and in South Africa) to account for broad regional differences. The coefficients drop slightly (to 0.66 and 0.466). In columns (3) and (7) we account for admin-2 fixed effects; as we have close to 2,000 admin-2 units across the 19 countries, these specifications account well for regional differences. This does not affect the estimates much. Finally, in columns (4) and (8) we interact the admin-2 unit constants with an urban-rural household indicator, so as to explore variability within region-by-urban/rural cells. The regression estimate remains intact in the religion specification (column (4)) and declines modestly in the ethnicity specification (to 0.375 in column (8)). These results suggest that the low literacy - low social mobility nexus across religious and ethnic lines is not driven by regional segregation and other regional features.

Table 12: Literacy and IM at the religion- and ethnicity-level, geographic fixed effects

Religion sample
                      (1)           (2)           (3)           (4)
                      religion IM   religion IM   religion IM   religion IM
share literate old    0.721∗∗∗      0.689∗∗∗      0.699∗∗∗      0.673∗∗∗
                      (5.13)        (4.88)        (5.58)        (4.58)
R-squared             0.925         0.896         0.891         0.926
within-R-squared      0.603         0.397         0.363         0.328
N                     106           106           106           106
country-FEs           yes           yes           yes           yes
age-range             14-18         14-18         14-18         14-18
student correction    yes           yes           yes           yes
fixed effects         t,b           t,b, admin1   t,b, admin2   t,b, admin2 × u/r

Ethnicity sample
                      (5)            (6)            (7)            (8)
                      ethnicity IM   ethnicity IM   ethnicity IM   ethnicity IM
share literate old    0.535∗∗∗       0.465∗∗∗       0.460∗∗∗       0.375∗∗∗
                      (5.20)         (4.89)         (4.51)         (3.70)
R-squared             0.900          0.870          0.881          0.932
within-R-squared      0.472          0.410          0.407          0.299
N                     171            171            171            158
country-FEs           yes            yes            yes            yes
age-range             14-18          14-18          14-18          14-18
student correction    yes            yes            yes            yes
fixed effects         t,b            t,b, admin1    t,b, admin2    t,b, admin2 × u/r

The dependent variable is the religion/ethnicity-level share of literate kids of illiterate parents. The independent variable is the religion/ethnicity-level share of literate parents. The row “fixed effects” indicates the fixed effects included in the estimation of the LHS and RHS variables. All correlations are net of country fixed effects. t-statistics based on standard errors clustered at the country level in parentheses. ∗ p < 0.1, ∗∗ p < 0.05, ∗∗∗ p < 0.01.

One may wonder whether the literacy - IM associations across religious and ethnic lines operate on top of each other, or whether the religion estimates reflect ethnic differences and vice versa. Interestingly, there are no mono-religious ethnic groups at all. For almost all groups for which we also have ethnicity information, almost all religions in the country have some members within the ethnicity. The mean number of religious groups per ethnicity is 5.95, with a standard deviation of 1.5. Figure 9, which shows a histogram of ethnicity-specific Herfindahl indices of religious affiliation, shows that while there are some groups where religious affiliation is very concentrated, other ethnicities are spread across a variety of religions.
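The Herfindahl index used for Figure 9 is the sum of squared shares of each religion within an ethnicity. A minimal sketch follows, assuming unweighted individual counts (the text does not say whether census weights are applied); the function name and the toy affiliations are hypothetical.

```python
from collections import Counter


def herfindahl(affiliations):
    """Sum of squared shares of each religion within one ethnicity.

    Equals 1 when every member has the same religion and 1/k when members
    are spread evenly over k religions.
    """
    counts = Counter(affiliations)
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())


# Made-up ethnicities for illustration only.
concentrated = ["Muslim"] * 8
diverse = ["Muslim", "Catholic", "Protestant", "Traditional"] * 2

h_concentrated = herfindahl(concentrated)
h_diverse = herfindahl(diverse)
```

Values near one correspond to the almost exclusively single-religion ethnicities mentioned in the text, while lower values indicate religiously diverse ethnicities.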

Figure 9: Histogram of Herfindahl indices of ethnicity-level religious affiliation

[Figure 9: histogram. Horizontal axis: ethnicity-level Herfindahl index of religious affiliation (0 to 1); vertical axis: density.]

There is geographic clustering in religious diversity by ethnicity. Diverse ethnicities are especially concentrated in Mozambique and South Africa, whereas West Africa, especially Mali, Burkina Faso, and Senegal, is home to groups that are almost exclusively Muslim. Figures 10 (a)-(b) plot the association between literacy and IM across religious groups with and without netting out ethnicity fixed effects, while Figures 11 (a)-(b) plot the analogous association at the ethnic group level, netting out religion fixed effects. While variance falls, the strength and significance of the association between literacy and social mobility across ethnic (religious) lines remain intact once we net out religion (ethnicity) fixed effects.

Figure 10: Religion-level association between parental literacy and IM (scatters conditional on country FEs, with student correction) with and without ethnicity fixed effects

[Figure 10 panels: (a) y + b FEs, ages 14-18; (b) y + b FEs, ages 14-100. Horizontal axis: share literate old; vertical axis: share literate kids of illiterate old. Series: baseline and conditional on ethnicity FEs.]

Figure 11: Ethnicity-level association between parental literacy and IM (scatters conditional on country FEs, with student correction) with and without religion fixed effects

[Figure 11 panels: (a) y + b FEs, ages 14-18; (b) y + b FEs, ages 14-100. Horizontal axis: share literate old; vertical axis: share literate kids of illiterate old. Series: baseline and conditional on religion FEs.]

4.1.4 Co-residence selection

As we have shown in Tables 3 and 4, the average co-residence rate for ages 14-18 is high in most countries. One could, however, still be concerned that different cultural norms across groups with respect to co-residence may contaminate the observed association between parental literacy and IM if the differential rates of sample selection induced by differences in co-residence are correlated with either variable. Here we show that this does not appear to be the case. For both the religion and ethnicity samples, column (1) of Table 13 reproduces the baseline result. Next, we split the sample in two at the median co-residence rate and estimate the relationship between literacy and IM separately above and below the cutoff. We do this once taking the overall median across all groups (columns (2) and (3)) and once taking the median co-residence rate within each country (columns (4) and (5)). Finally, in column (6), we directly control for the group-specific co-residence rate. For all ways of slicing the data, the strong correlation between parental literacy and IM survives.
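The two sample splits described above differ only in which median defines the cutoff. The sketch below illustrates the distinction; the field names are hypothetical, and placing groups exactly at the median in the high half is an assumption, since the text does not say how ties are treated.

```python
from collections import defaultdict
from statistics import median


def split_at_median(groups, by_country=False):
    """Split groups into (low, high) halves by co-residence rate.

    `groups` is a list of dicts with keys 'country', 'name', 'coresidence'.
    With by_country=True each group is compared with its own country's
    median; otherwise with the median across all groups.
    """
    if by_country:
        rates = defaultdict(list)
        for g in groups:
            rates[g["country"]].append(g["coresidence"])
        cutoff = {country: median(values) for country, values in rates.items()}
    else:
        overall = median([g["coresidence"] for g in groups])
        cutoff = defaultdict(lambda: overall)
    low = [g for g in groups if g["coresidence"] < cutoff[g["country"]]]
    high = [g for g in groups if g["coresidence"] >= cutoff[g["country"]]]
    return low, high


# Made-up groups for illustration only.
toy = [
    {"country": "P", "name": "P1", "coresidence": 0.9},
    {"country": "P", "name": "P2", "coresidence": 0.8},
    {"country": "Q", "name": "Q1", "coresidence": 0.7},
    {"country": "Q", "name": "Q2", "coresidence": 0.6},
]
low_all, high_all = split_at_median(toy)
low_cty, high_cty = split_at_median(toy, by_country=True)
```

With the overall median, the low half consists entirely of country Q; with within-country medians, each country contributes one group to each half. The literacy-IM slope would then be estimated separately on each half.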

Table 13: Association between religion-level IM and parental literacy – the role of co-residence

                        (1)        (2)        (3)        (4)        (5)        (6)
                        IM         IM         IM         IM         IM         IM

religion
share literate old      0.721***   0.714***   0.703***   0.730***   0.771***   0.732***
                        (5.13)     (5.56)     (5.05)     (5.65)     (6.00)     (6.38)
co-residence rate                                                              -0.650
                                                                               (-1.50)
R-squared               0.925      0.970      0.920      0.935      0.967      0.933
N                       106        48         52         58         47         105

ethnicity
share literate old      0.535***   0.545**    0.536***   0.606***   0.460***   0.536***
                        (5.20)     (3.14)     (5.69)     (4.86)     (3.69)     (5.02)
co-residence rate                                                              0.115
                                                                               (0.53)
R-squared               0.900      0.925      0.892      0.931      0.897      0.901
N                       171        83         84         90         81         171

co-residence rates      all        low,       high,      low,       high,      all
sub-sample                         overall    overall    by country by country

The dependent variable is the religion/ethnicity-level share of literate kids of illiterate parents. The independent variable is the religion/ethnicity-level share of literate parents. The row “co-residence rates sub-sample” indicates the sub-sample of groups (all, above, or below median co-residence rates) on which the estimation is run. The median is either that of all groups or the within-country median. All correlations net of country fixed effects. t-statistics based on standard errors clustered at the country level in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.

4.2 Country Heterogeneity

Next, we explore potential heterogeneity across countries in the strength of the correlation between literacy and social mobility across ethnic and religious lines. We do so by running the baseline specifications in each country separately for ethnic and religious groups. Table 14 reports the estimates. For brevity we report only estimates among individuals aged 14-18. A high coefficient implies strong inertia, while a low estimate implies convergence. The following patterns emerge. First, there is non-negligible variation in the coefficient estimate across countries. The coefficient in the religion specifications ranges from 0 in Botswana and 0.28 in South Africa to 1.5 in Ethiopia. The estimate on ethnic literacy ranges from 0 in South Africa and Liberia to 1.2-1.5 in Ethiopia, Mali, and Morocco. Second, inertia is highest in the Sahara and the Sahel (Burkina Faso, Nigeria, Mali, Ethiopia) and lowest in South Africa and Liberia, with Ghana, Cameroon, Rwanda, and Senegal close to the middle. These patterns suggest that the pan-African association revealed above does not apply uniformly. Religious and ethnic cleavages in social mobility differ across the 19 countries.

Table 14: Country-by-country heterogeneity in IM-literacy relationship

                number of    number of      religion-level   ethnicity-level
                religions    ethnicities    correlation      correlation

Burkina Faso    6            13             1.033***         0.865***
                                            (9.912)          (18.676)
Botswana        4            8              -0.334           0.412***
                                            (-0.697)         (4.149)
Cameroon        8            0              0.488***
                                            (5.185)
Egypt           3            0              0.512
                                            (5.768)
Ethiopia        6            11             2.211***         1.933***
                                            (7.313)          (3.657)
Ghana           7            9              0.623***         0.410***
                                            (5.464)          (5.281)
Guinea          7            0              0.857**
                                            (3.271)
Liberia         5            16             0.163            -0.170
                                            (0.221)          (-0.804)
Morocco         0            5                               1.336**
                                                             (3.800)
Mali            5            13             1.205***         1.482**
                                            (6.501)          (3.082)
Mozambique      6            18             0.573            0.668***
                                            (1.897)          (3.410)
Malawi          4            11             1.013*           0.597***
                                            (3.890)          (7.479)
Nigeria         4            0              1.121**
                                            (5.234)
Rwanda          7            0              0.978***
                                            (9.334)
Senegal         5            10             0.712            0.605**
                                            (1.802)          (2.892)
Sierra Leone    7            13             0.832**          0.308*
                                            (2.914)          (1.908)
Uganda          8            21             1.304***         0.906***
                                            (14.597)         (6.382)
South Africa    9            12             0.127**          -0.103
                                            (3.465)          (-0.966)
Zambia          5            11             1.007***         0.622*
                                            (10.207)         (2.121)

This table shows country-by-country regressions of IM on parental literacy at the religion and ethnicity levels (one regression per pair of lines in each column). The dependent variable is the religion/ethnicity share of literate kids of illiterate parents (estimated net of year and birth-decade fixed effects for young and old). The independent variable is the religion/ethnicity-level share of literate parents (estimated net of year and birth-decade fixed effects for young and old). t-statistics in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.

4.3 Group-size

In table 15 we report results of regressions of the literacy of the old and of IM on the population share of each group within its country. For religion, a non-linear relationship emerges between IM and size, with IM first increasing in size, then decreasing (the squared term is motivated by Francois, Rainer, and Trebbi (2015), who document a hump-shaped relationship between ethnic group size and cabinet representation in Africa). For ethnicity, there is no discernible association between group size and either literacy of the old or IM.
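To make the hump shape concrete, the size at which a quadratic fit peaks is -b1 / (2 * b2), where b1 and b2 are the coefficients on size and size squared. The sketch below applies this to the religion point estimates reported in Table 15. It is only a back-of-the-envelope reading of those estimates: it assumes size is measured as a share between 0 and 1, ignores sampling uncertainty, and the function names are hypothetical.

```python
def turning_point(beta_size, beta_size_sq):
    """Size at which a fitted quadratic b1*size + b2*size^2 reaches its extremum."""
    if beta_size_sq == 0:
        raise ValueError("no quadratic term, so there is no turning point")
    return -beta_size / (2.0 * beta_size_sq)


def is_hump_shaped(beta_size, beta_size_sq):
    """True when the fit first rises and then falls (b1 > 0 and b2 < 0)."""
    return beta_size > 0 and beta_size_sq < 0


# Religion point estimates from Table 15, columns (4) and (6).
peak_col4 = turning_point(0.595, -0.618)
peak_col6 = turning_point(0.402, -0.424)

# Ethnicity point estimates from Table 15, column (4): not hump-shaped.
ethnicity_hump = is_hump_shaped(-0.167, 0.359)
```

Under these assumptions the implied peak lies at a population share a little below one half in both religion specifications, so the downward-sloping part of the curve would apply only to groups close to a majority.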

Table 15: Group size, literacy, and IM at the religion- and ethnicity-level, individuals aged 14-18

                      (1)         (2)         (3)        (4)        (5)        (6)
                      share       share       IM 14-18   IM 14-18   IM 14-18   IM 14-18
                      literate    literate
                      old         old

religion
size                  0.0588      0.281       0.109*     0.595**    0.0677**   0.402***
                      (1.03)      (1.38)      (1.83)     (2.79)     (2.36)     (3.61)
size2                             -0.283                 -0.618**              -0.424***
                                  (-1.14)                (-2.55)               (-3.42)
share literate old                                                  0.706***   0.686***
                                                                    (5.18)     (5.60)
R-squared             0.751       0.754       0.821      0.835      0.929      0.935
N                     106         106         106        106        106        106

ethnicity
size                  0.0779      -0.0486     0.0429     -0.167     0.00122    -0.141
                      (0.95)      (-0.18)     (0.75)     (-1.00)    (0.03)     (-1.04)
size2                             0.217                  0.359                 0.244
                                  (0.66)                 (1.70)                (1.28)
share literate old                                                  0.535***   0.533***
                                                                    (5.17)     (5.11)
R-squared             0.713       0.714       0.812      0.813      0.900      0.901
N                     171         171         171        171        171        171

country-FEs           yes         yes         yes        yes        yes        yes
student correction    yes         yes         yes        yes        yes        yes

In the columns entitled “share literate old” the dependent variable is the ethnicity/religion share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the columns entitled “IM” it is the ethnicity/religion share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old). All correlations net of country fixed effects. t-statistics based on standard errors clustered at the country level in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.

5 Correlates of Ethnic Mobility

5.1 Specification and Data

We now examine the correlates of ethnic IM within countries. We do not aim to identify causal effects, but simply to uncover the main correlates of the large within-country differences in educational intergenerational mobility.

We run specifications linking the ethnic IM estimates (IMe,c) with proxies of ethnic power post-independence and deeply rooted ethnic features. The specification reads:

$$ IM_{e,c} = \theta_c + EPR_{e,c}\Phi + HIST_{e,c}\Gamma \; [+ \lambda E^{o}_{e,c}] + \zeta_{e,c}. $$

$EPR_{e,c}$ denotes variables reflecting ethnicities’ political power post-independence, as recorded in the Ethnic Power Relations (EPR) database. $HIST_{e,c}$ denotes ethnic-specific cultural, economic, and political features, as recorded by George Peter Murdock (1967). $\theta_c$ is a vector of country constants that account for the large cross-country differences in IM and other country-specific unobservables that may be correlated with IM and the covariates. Since the education level of the old generation, $E^{o}_{e,c}$, is a strong correlate of ethnic IM, we also report specifications controlling for it. As the specifications do not reflect causal mechanisms, we estimate the above regression equation adding the various contemporary and historical ethnic feature proxies one at a time; that is, each version of the specification has only one independent variable. Data Appendix E.1 provides definitions and sources for all variables and gives summary statistics. We report specifications for individuals aged 14-18 who co-reside with their parents, with the student correction, as sample selection concerns are attenuated in this sample. The Appendix reports analogous correlations for the full sample (individuals aged 14 and older).
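The "one regressor at a time, net of country constants" design can be sketched as follows. This is a simplified illustration rather than the authors' estimation code: it obtains the slope by demeaning both variables within country (which gives the same coefficient as including country dummies when there is a single regressor), it omits the optional control for the education of the old generation, and it does not compute clustered standard errors. All names and data are hypothetical.

```python
from collections import defaultdict


def demean_by_country(values, countries):
    """Subtract each country's mean, which absorbs the country constants."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, c in zip(values, countries):
        sums[c] += v
        counts[c] += 1
    return [v - sums[c] / counts[c] for v, c in zip(values, countries)]


def fe_slope(y, x, countries):
    """OLS slope of y on one regressor x, net of country fixed effects."""
    y_dm = demean_by_country(y, countries)
    x_dm = demean_by_country(x, countries)
    sxx = sum(a * a for a in x_dm)
    if sxx == 0:
        raise ValueError("regressor has no within-country variation")
    return sum(a * b for a, b in zip(x_dm, y_dm)) / sxx


def one_at_a_time(im, covariates, countries):
    """Run a separate fixed-effects regression of IM on each covariate."""
    return {name: fe_slope(im, x, countries) for name, x in covariates.items()}


# Made-up data: two countries with three ethnic groups each.
countries = ["A", "A", "A", "B", "B", "B"]
covariate = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]   # hypothetical ethnic feature
im = [1.0, 3.0, 5.0, 5.0, 7.0, 9.0]          # country effect plus 2 * covariate
slopes = one_at_a_time(im, {"X": covariate}, countries)
```

In the toy data the two countries differ in their level of IM by a constant, and the within-country slope recovers the common coefficient of 2 regardless of that level difference.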

5.2 Ethnic Power in National Politics

5.2.1 Data

A plethora of case studies and some econometric studies (e.g., Burgess et al. (2015), Amodio and Chiovelli (2016), De Luca et al. (2018), Franck and Rainer (2012)) relate public goods provision to political power at the national level. Since the ethnic-specific estimates of intergenerational mobility in education can be interpreted as ethnic proxies of social mobility, we examine the association between IM and variables capturing ethnicities’ role in national politics. We do so using information from the latest vintage of the Ethnic Power Relations database (Wimmer, Cederman, and Min (2009), Vogt et al. (2015)), which provides subjective assessments of access to political power at the national level for all politically relevant ethnic groups at yearly frequency since independence.

Using experts’ opinions and subjective assessments of ethnicities’ absolute access to national power, EPR gives quantitative measures proxying ethnicities’ role in national politics. There are three main categories (six including subcategories) [7]:

(1) Dominance-Monopoly. Elite members of the ethnic group hold monopoly or dominant power in the executive to the exclusion (or with very limited inclusion of “token” members) of members of other groups.

(2) Senior or Junior Power. Representatives of the ethnicity participate as senior or junior partners in a formal or informal power-sharing agreement. [Power sharing is any arrangement that divides executive power among leaders who claim to represent particular ethnic groups and who have real influence on political decision making].

(3) Powerless or Discriminated. Representatives hold no political power at the national executive and ethnic members are subjected to active, intentional, and targeted formal or informal discrimination from political power (excluding discrimination in the socioeconomic sphere).

EPR also identifies ethnicities that have played an active role in civil wars. [8]

[7] The following are direct quotations, with minor changes, from the EPR codebook, Vogt and Rüegger (2018).
[8] EPR also identifies ethnicities with regional autonomy. Since we have just two such groups (the Baganda in Uganda and the Lozi in Zambia), we omit this category.

We match the EPR ethnic classification to IPUMS and collapse the annual information over the period from independence to 1980. We then define variables counting the years in which an ethnic group has been (i) dominating and/or monopolizing national power; (ii) a senior or junior partner in the national political arena; (iii) explicitly discriminated against or powerless; (iv) participating in a civil war with an explicit ethnic dimension. We also define dummy variables for each category. Some caveats are in order before reporting the results. First, the EPR ethnic groups are quite coarse; for example, the Northerners group in Uganda blends the Langi, the Acholi, the Madi, the Alur, the , and the Lugbara. Thus error is introduced when we match to the much finer IPUMS ethnic ids. Second, there is not much annual variability. For example, the monopoly indicator is perfectly correlated in our sample with the number of years in which ethnicities monopolized power. Third, the classification is subjective.
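The collapsing step described above can be sketched as follows, under stated assumptions. The record layout (group, year, status), the status labels, and the group names are hypothetical and do not reproduce the EPR file format; defining the dummy as "ever in the status during the window" is also an assumption, since the text does not spell out how the indicators are constructed.

```python
from collections import defaultdict

# Hypothetical status labels, loosely following the variable names in Table 16.
STATUSES = ("MONOP", "DOMIN", "DISC", "PWLESS")


def collapse_status(records, start_year, end_year=1980):
    """Collapse yearly status records into years-in-status counts and dummies.

    records: iterable of (group, year, status) tuples.
    start_year: first year of the window (the independence year in the paper).
    Assumption: the dummy equals 1 if the group ever held the status
    within the window, and 0 otherwise.
    """
    years = defaultdict(lambda: defaultdict(int))
    for group, year, status in records:
        if start_year <= year <= end_year:
            years[group][status] += 1
    return {
        group: {
            s: {"years": counts.get(s, 0), "dummy": int(counts.get(s, 0) > 0)}
            for s in STATUSES
        }
        for group, counts in years.items()
    }


# Made-up yearly records for two groups in a country independent since 1964.
toy_records = [
    ("X", 1965, "DOMIN"),
    ("X", 1966, "DOMIN"),
    ("X", 1985, "DISC"),    # outside the window, so ignored
    ("Y", 1970, "PWLESS"),
]
collapsed = collapse_status(toy_records, start_year=1964)
```

When a status never changes within the window, as the second caveat notes for monopoly, the years count is the same for every group holding it and therefore carries no more information than the dummy.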

5.2.2 Results

Table 16 reports the regression estimates (appendix E.2.1 reports further results for individuals aged 14-100, as well as urban-rural and male-female heterogeneity). The specification in column (1) links the literacy share of the parental cohort with the EPR variables. Column (2) links the ethnic-specific IM estimate with the EPR variables, simply netting out country fixed effects, while column (3) also conditions on the literacy share of the “old” generation, which is a strong correlate of IM. Standard errors clustered at the country level are reported below the coefficients.

Monopoly. In row 1 we examine the role of ethnic monopolization of national power using the indicator variable and the number of years in which the ethnicity monopolized power. It turns out that all variation in monopoly stems from Mali, with all “black” ethnicities in the country classified as such by EPR at the expense of only the Maure and Tuareg. Overall, 10 groups fall under “monopoly” (see appendix table 23 for the exact list of groups). For what it is worth, the specifications in columns (1) and (4) show that ethnic groups that monopolize power have significantly higher literacy rates (of the “old” generation), by about five percent compared to other ethnicities. Ethnicities that monopolize power also have higher mobility rates. The estimates in columns (2) and (5) are highly significant with both the indicator and the frequency variable. Even when we condition on literacy rates of the “old” generation, the coefficient retains its economic and statistical significance. The estimates in columns (3) and (6) imply that groups that have monopolized the executive at the national stage have 14 percentage points higher IM, a sizable effect. The association between monopolization of power and IM (as well as literacy) applies to both rural and urban households, though it is especially strong for urban households (Appendix table 27). Likewise, the association, while significant for both genders, is especially strong for boys (Appendix table 29).

Dominance. We then examine the association between IM (and literacy of the old) and EPR’s classification of dominance in national politics. Examples of dominant ethnic groups are the Amhara in Ethiopia during the full 1946-1980 period and the Chewa in Malawi over 1964-1980. Overall, 15 groups in our sample are classified as “dominant”. Since EPR puts dominance “below” monopoly, the table also gives estimates merging the two categories. Ethnicities whose “elites hold dominant power in the executive” tend to have somewhat higher literacy rates; yet the estimate in column (1) does not pass standard significance thresholds. Likewise, IM is somewhat higher for these groups, though again the coefficients in columns (2) and (3) are never statistically significant.

Table 16: Ethnicity-level correlates of the share of literate old and IM, ages 14-18, with student correction, Ethnic Power Relations

            status indicator                                 number of years in status
            (1)         (2)         (3)                      (4)         (5)         (6)
variable    share       IM          IM controlling    N      share       IM          IM controlling    N
            literate                for share                literate                for share
            old                     literate old             old                     literate old

MONOP       0.046***    0.171***    0.145***          131    0.046***    0.171***    0.145***          131
            (0.000)     (0.000)     (0.005)                  (0.000)     (0.000)     (0.005)
DOMIN       0.022       -0.003      -0.016            131    0.005       0.007       0.004             131
            (0.053)     (0.036)     (0.023)                  (0.053)     (0.051)     (0.025)
MONDOM      0.034       0.034       0.015             131    0.019       0.061       0.050             131
            (0.055)     (0.055)     (0.045)                  (0.060)     (0.070)     (0.046)
DISC        -0.037***   -0.055      -0.033            131    -0.009      -0.060***   -0.055***         131
            (0.014)     (0.052)     (0.059)                  (0.014)     (0.021)     (0.014)
PWLESS      0.014       -0.005      -0.013            131    -0.077      -0.062      -0.018            131
            (0.114)     (0.088)     (0.057)                  (0.115)     (0.073)     (0.049)
DISCPWLS    -0.037      -0.097*     -0.076**          131    -0.060      -0.081      -0.046            131
            (0.101)     (0.056)     (0.034)                  (0.084)     (0.058)     (0.042)
ETWAR       0.039       0.007       -0.015            131    0.018**     -0.017      -0.028            131
            (0.027)     (0.048)     (0.042)                  (0.007)     (0.033)     (0.031)

This is not a normal regression table. In the columns entitled “share literate old” the dependent variable is the ethnicity share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the columns entitled “IM” it is the ethnicity share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old), which is also the LHS in the columns entitled “IM controlling for share literate old”. Each row shows the results of regressions of these LHS variables on one RHS variable (indicated in the rows) at a time (the exceptions are SIZE and SIZE2, which enter jointly). The regressions in the two columns “IM controlling for share literate old” additionally control for the share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old), that is, they include the LHS variable of the columns “share literate old” on the RHS. All RHS variables enter in the transformation indicated in the top row. All specifications include country fixed effects (not reported). Coefficients are standardized. MONOP = epr status of monopoly, DOMIN = epr status of dominant, MONDOM = epr status of monopoly or dominant, DISC = epr status of discriminated, PWLESS = epr status of powerless, DISCPWLS = epr status of discriminated or powerless, ETWAR = ethnic war. Standard errors clustered at the country level in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. lines indicate that variables remain significantly correlated with IM when we control for the share of literate parents.

Discrimination. In row 4 we examine the role of ethnic discrimination in national politics on IM and literacy. The specification in column (1) shows that literacy of the old generation is 4% lower for discriminated groups as compared to all other ethnic groups. This also applies in the different subsamples (male/female, urban/rural). IM is somewhat lower for ethnic groups that have faced discrimination, though the coefficient is only significant with the frequency variable. We have 22 groups in the sample that at some point faced discrimination. Examples of discriminated groups with below country-mean IM are the Oromo, the Afar, and the Somalis in Ethiopia, who faced discrimination throughout the 1946-1980 period. The not particularly strong association between discrimination and IM stems from the fact that in many countries educated minorities (with high IM) have faced discrimination; examples are the Coptic Christians in Egypt, Asians in Uganda, and even the Krio in Liberia in the 1980s.

Powerless. The association between IM (and literacy) and EPR’s powerless classification (we have 46 such groups) is weak and not significant. This pattern holds with both the dummy and the frequency transformation of the powerless variable, for both boys and girls, and for rural and urban households. Examples of powerless ethnicities with quite low IM are the Tuareg in Mali and the Shona in Mozambique. We also merged the powerless and the discriminated categories and repeated the estimation. The coefficients for the dummy version now become more precisely estimated. This weakly suggests that ethnic groups excluded from national power or facing explicit discrimination have lower levels of social mobility and lower literacy rates.

5.3 Historical Features

A growing strand of recent research shows that deeply-rooted, colonial and precolonial, ethnic features still affect contemporary outcomes (see Michalopoulos and Papaioannou (2018) for an overview). Due to African states’ weak state capacity, the limited colonial investments in transportation infrastructure and schools, and colonial era indirect rule, ethnic social norms and institutions have persisted. We thus explore the correlation between intergenerational mobility in education across ethnic groups and historical ethnic variables. Table 17 shows the baseline results for individuals aged 14-18. Appendix tables 31-35 show the analogous results for individuals aged 14-100 as well as urban-rural and male-female heterogeneity.

Table 17: Ethnicity-level correlates of the share of literate old and IM, ages 14-18, with student correction, historical features

            (1)         (2)         (3)
variable    share       IM          IM controlling    N
            literate                for share
            old                     literate old

slavery
SLAVERY     -0.009      0.009       0.014             129
            (0.081)     (0.116)     (0.089)

Murdock
SPLIT       -0.123*     -0.052      0.026             130
            (0.069)     (0.068)     (0.030)
PASTO       -0.045      -0.181**    -0.156***         135
            (0.085)     (0.079)     (0.040)
AGRI        0.012       0.128       0.122**           135
            (0.079)     (0.082)     (0.051)
POLYG       -0.064      0.004       0.043             132
            (0.056)     (0.043)     (0.046)
BRIDEPR     0.060       0.100*      0.066             135
            (0.083)     (0.054)     (0.051)
PATRI       0.121       0.195***    0.123             116
            (0.099)     (0.067)     (0.087)
PCCENT      0.006       -0.001      -0.004            129
            (0.032)     (0.033)     (0.031)

This is not a normal regression table. In the columns entitled “share literate old” the dependent variable is the ethnicity share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the columns entitled “IM” it is the ethnicity share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old), which is also the LHS in the columns entitled “IM controlling for share literate old”. Each row shows the results of regressions of these LHS variables on one RHS variable (indicated in the rows) at a time. The regressions in the two columns “IM controlling for share literate old” additionally control for the share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old), that is, they include the LHS variable of the columns “share literate old” on the RHS. All specifications include country fixed effects (not reported). Coefficients are standardized. SPLIT = ancestral homeland of ethnic group split between two modern states, SLAVERY = Nunn slavery measure, PASTO = pastoralism dominant pre-colonial mode of subsistence, AGRI = agriculture dominant pre-colonial mode of subsistence, POLYG = polygyny dummy, BRIDEPR = ethnic group practices bride price, PATRI = patrilineal group, PCCENT = levels of jurisdictional hierarchy beyond local community, PLOW = plow aboriginal. Standard errors clustered at the country level in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. lines indicate that variables remain significantly correlated with IM when we control for the share of literate parents.

Ethnic Partitioning. We begin our analysis of historical correlates by examining the association between IM (and education) and an indicator variable that identifies ethnic groups that were partitioned by the artificial colonial borders designed in European capitals during the Scramble for Africa and that survived independence (68 out of 130 groups fall into this category). Case studies as well as cross-country and within-country econometric studies show that ethnic partitioning and border artificiality are related to violence, conflict, and underdevelopment (see Herbst (2000), Alesina, Easterly, and Matuszeski (2011), and Michalopoulos and Papaioannou (2016)). The ethnic partitioning indicator enters with a significantly negative coefficient in the literacy (of the “old”) specification in column (1); ethnic groups split by a national border have on average lower education levels (this result adds to the evidence of Michalopoulos and Papaioannou (2016), who showed a similar negative association using DHS data). Partitioned ethnic groups tend to have lower levels of intergenerational mobility in education, but the estimate in column (2) is noisy and insignificant. The same applies when we condition on the literacy rates of the “old” generation (in column (3)). As we show in the Appendix, these patterns apply for both boys and girls, as well as for rural and urban households.

Economic Organization. Motivated by the recent findings of Michalopoulos, Putterman, and Weil (2016) (using DHS data) that living conditions and public goods are higher (lower) for ethnic groups specializing in agriculture (pastoralism), we start our analysis of Murdock’s ethnographic information by exploring the association between IM and variables that reflect ethnicities’ economic organization. We define dummy variables that take the value one when pastoralism (6 out of 135 groups) and agriculture (124 out of 135 groups) are the dominant precolonial modes of subsistence, and then examine their correlation with IM. The pastoralism indicator enters with a significantly negative coefficient (−0.18), while the agriculture specialization dummy enters with a positive estimate (0.128) in the unconditional specification in column (2). The estimates are not much affected when we control for the share of literacy among the old (in column (3)), which in itself is not much correlated with the pastoral and agricultural dummies (column (1)). These patterns apply for boys and girls, though quantitatively the estimates are stronger for boys (Appendix table 33). When we distinguish between rural and urban household status (Appendix table 32), we find that social mobility is higher (lower) for agriculture-specializing (pastoralist) ethnicities in both subsamples. This suggests that deeply rooted ethnic economic specialization affects mobility not only in the countryside but also in the urban centers. In the same vein, Michalopoulos, Putterman, and Weil (2016) find that wealth is higher for agriculture-specializing groups even when they look at cities and within non-agriculture-related occupations. 
This is an interesting finding as it suggests that precolonial traits (in this case the historical mode of subsistence) may manifest their influence on educational attainment not in the initial conditions but over time as they may influence the rate at which initial levels of human capital are transmitted intergenerationally.

Institutions, Political Centralization. We then associate IM (and literacy) with Murdock’s 0-4 index of jurisdictional hierarchy beyond the local community, which is often used to proxy for the degree of political complexity of African ethnicities at the time of colonization. Gennaioli and Rainer (2007), Michalopoulos and Papaioannou (2013), Michalopoulos and Papaioannou (2014), and other works show that contemporary African development is higher in ancestral homelands of politically centralized ethnic groups, which used to be organized as states or large chiefdoms, as compared to acephalous ethnicities without much political hierarchy. This variable is uncorrelated with the IM index (columns (2) and (3)) as well as with the share of literacy among the “old” generation (column (1)). The results are similar when we split by rural-urban status (see Appendix table 32); when we distinguish by gender there is some weak evidence that IM of girls is higher among politically centralized ethnicities (Appendix table 33).

Family Structure. Social norms are highly persistent (e.g., Fernández (2003), Giuliano and Nunn (2017)) and family structure has first-order effects on economic choice (see Alesina and Giuliano (2015) for a review). We therefore examine the correlation between IM and variables reflecting ethnicities’ family structure before European colonization. First, we examine the role of polygyny, which is still practiced in many parts of the continent (indeed, 112 out of 132 groups practice polygyny; see Fenske (2015) for an overview and patterns). In line with earlier works, we find that ethnic groups that practiced polygyny at the onset of colonization have lower literacy rates, on average by 6.4% (column (1)), though the estimate is not significant. There is, moreover, no evidence that polygyny correlates significantly with educational IM (columns (2) and (3)). The same applies when we look solely at girls; polygyny is weakly correlated with education but unrelated to IM. Second, we examine whether intergenerational mobility is higher (lower) for ethnic groups that practice bride price (104 out of 135 groups fall into this category; Ashraf et al. (2016) connect the practice of bride price to education). While there are no major differences in the education of the “old” generation, IM is higher for ethnic groups that practice bride price (column (2)), a result that is driven exclusively by rural households. The estimate is positive (though not always significant) for both boys and girls. Third, we explore whether social mobility is higher for patrilineal ethnic groups (96 out of 116 are classified as such). While the regression estimate conditional on literacy (column (3)) is insignificant in the full sample, it turns significantly positive in the boys subsample; educational mobility across generations is significantly higher in patrilineal ethnic groups.

Slave Trades. Nunn (2008) and subsequent works have established the lasting adverse legacy of Africa’s slave trades. Nunn and Wantchekon (2011) show that trust levels are significantly lower for Africans descending from ethnic groups that were severely affected by the slave trades, which weakened African societies and polities in the precolonial period. We thus correlate IM with a measure of how intensely an ethnic group was subject to the slave trade (and also with a dummy variable identifying ethnic groups that were directly affected by the slave trades). Neither IM nor literacy rates correlate with slave trades. This applies for both boys and girls and for both rural and urban households. We also associated IM with Murdock’s classification of the ethnic practice of slavery before the advent of Europeans, again finding weak and insignificant associations.

6 Conclusion

We have studied social mobility across religious and ethnic groups in Africa. Using data from 37 national censuses in 19 countries comprising close to 70 million individual records that cover the time period from the early 20th century to the early 1990s, we have computed new measures of upward intergenerational mobility in educational attainment, defined as the likelihood that co-residing children of parents with less than primary education complete at least primary school. Given the preponderance of illiteracy in Africa, the non-primary to primary transition is the crucial margin to measure mobility in education on the continent.

In the first part of the paper, we have documented large variation in educational mobility across cultural groups. This variation has a strong country-specific component in that high and low literacy and mobility groups tend to cluster together in the same countries. At the same time, there are sizeable differences in intergenerational mobility within countries across both ethnic and religious lines. Across as well as within countries, Christians exhibit greater intergenerational mobility than Muslims, who in turn are more mobile than people adhering to traditional faiths.

In the second part, we showed that initial group-level differences in educational attainment in the generation of individuals’ parents are a strong predictor of group-level differences in mobility, in the sense that initially more literate groups are also more mobile. This pattern holds for both religious and ethnic groups, it holds within ethnic groups controlling for religion (and vice versa), and it holds for different groups residing in the same geography. The latter is especially striking since it implies, in our most demanding specification, that two individuals who grew up as children of illiterate parents in the same urban or rural sub-division of the same admin-2 region have a greater or smaller chance of becoming literate depending on the overall level of literacy of their group in their parents’ generation in the country as a whole. The upshot is that economic differences across cultural groups that dominate many African societies can be expected to persist, since they are strongly present in the transmission of human capital from one generation to the next.

In the third part of the paper, we explored the correlates of ethnic and religious intergenerational mobility. We do not attempt to identify causal effects but simply show univariate correlations, as well as correlations conditional on parental literacy, between our measures of mobility and several variables capturing cultural origins and political representation since independence. The analysis has many caveats, chief among which is the fact that many variables do not exhibit much variation in our samples. With that in mind, there is some evidence that ethnic groups that have been discriminated against politically since independence exhibit lower mobility and that groups that historically relied more on agriculture for their subsistence are more mobile than pastoral groups.

References

Acemoglu, Daron, Tristan Reed, and James A. Robinson. 2014. “Chiefs: Economic Development and Elite Control of Civil Society in Sierra Leone.” Journal of Political Economy 122 (2):319–368.

Alesina, Alberto, Arnaud Devleeschauwer, William Easterly, Sergio Kurlat, and Romain Wacziarg. 2003. “Fractionalization.” Journal of Economic Growth 8 (2):155–194.

Alesina, Alberto, William Easterly, and Janina Matuszeski. 2011. “Artificial States.” Journal of the European Economic Association 9 (2):246–277.

Alesina, Alberto and Paola Giuliano. 2015. “Culture and Institutions.” Journal of Economic Literature 53 (4):898–944.

Alesina, Alberto, Sebastian Hohmann, Stelios Michalopoulos, and Elias Papaioannou. 2018. “Intergenerational Mobility in Africa.”

Alesina, Alberto, Stelios Michalopoulos, and Elias Papaioannou. 2016. “Ethnic Inequality.” Journal of Political Economy 124 (2):428–488.

Alesina, Alberto and Ekaterina Zhuravskaya. 2011. “Segregation and the Quality of Government in a Cross Section of Countries.” American Economic Review 101 (5):1872–1911.

Amodio, Francesco and Giorgio Chiovelli. 2016. “Ethnic Favoritism in Democracy: The Political Economy of Land and Labor in South Africa.”

Ashraf, Nava, Natalie Bau, Nathan Nunn, and Alessandra Voena. 2016. “Bride Price and Female Education.” Tech. Rep. 22417, National Bureau of Economic Research, Inc.

Azam, Mehtabul and Vipul Bhatt. 2015. “Like Father, Like Son? Intergenerational Educational Mobility in India.” Demography 52 (6):1929–1959.

Baldwin, Kate. 2013. “Why Vote with the Chief? Political Connections and Public Goods Provision in Zambia.” American Journal of Political Science 57 (4):794–809.

———. 2015. The Paradox of Traditional Chiefs in Democratic Africa. New York: Cambridge University Press.

Baldwin, Kate and John D. Huber. 2010. “Economic versus Cultural Differences: Forms of Ethnic Diversity and Public Goods Provision.” American Political Science Review 104 (4):644–662.

Barro, Robert J. and Jong Wha Lee. 2013. “A New Data Set of Educational Attainment in the World, 1950–2010.” Journal of Development Economics 104 (C):184–198.

Bates, Robert H. 2005. Markets and States in Tropical Africa: The Political Basis of Agricultural Policies: With a New Preface. University of California Press.

Black, Sandra E. and Paul J. Devereux. 2011. “Recent Developments in Intergenerational Mobility.” In Handbook of Labor Economics, vol. 4B. Elsevier, 1487–1541.

Blake, Judith. 1985. “Number of Siblings and Educational Mobility.” American Sociological Review 50 (1):84–94.

Bowles, Samuel. 1972. “Schooling and Inequality from Generation to Generation.” Journal of Political Economy 80 (3, Part 2):S219–S251.

Burgess, Robin, Remi Jedwab, Edward Miguel, Ameet Morjaria, and Gerard Padró i Miquel. 2015. “The Value of Democracy: Evidence from Road Building in Kenya.” American Economic Review 105 (6):1817–1851.

Card, David. 1999. “The Causal Effect of Education on Earnings.” In Handbook of Labor Economics, vol. 3, Part A. Elsevier, 1801–1863.

Card, David, Ciprian Domnisoru, and Lowell Taylor. 2018. “The Intergenerational Transmission of Human Capital: Evidence from the Golden Age of Upward Mobility.”

Caselli, Francesco, Jacopo Ponticelli, and Federico Rossi. 2014. “A New Dataset on Mincerian Returns.” Unpublished.

Chetty, Raj and Nathaniel Hendren. 2018a. “The Impacts of Neighborhoods on Intergenerational Mobility I: Childhood Exposure Effects.” The Quarterly Journal of Economics 133 (3):1107–1162.

———. 2018b. “The Impacts of Neighborhoods on Intergenerational Mobility II: County-Level Estimates.” The Quarterly Journal of Economics 133 (3):1163–1228.

Chetty, Raj, Nathaniel Hendren, Maggie R Jones, and Sonya R Porter. 2018. “Race and Economic Opportunity in the United States: An Intergenerational Perspective.” Working Paper 24441, National Bureau of Economic Research.

Chetty, Raj, Nathaniel Hendren, Patrick Kline, and Emmanuel Saez. 2014. “Where Is the Land of Opportunity? The Geography of Intergenerational Mobility in the United States.” The Quarterly Journal of Economics 129 (4):1553–1623.

Chua, Amy. 2004. World on Fire: How Exporting Free Market Democracy Breeds Ethnic Hatred and Global Instability. Random House.

de Kadt, Daniel and Horacio A. Larreguy. 2018. “Agents of the Regime? Traditional Leaders and Electoral Politics in South Africa.” The Journal of Politics 80 (2):382–399.

De Luca, Giacomo, Roland Hodler, Paul A. Raschky, and Michele Valsecchi. 2018. “Ethnic Favoritism: An Axiom of Politics?” Journal of Development Economics 132:115–129.

Desmet, Klaus, Ignacio Ortuño-Ortín, and Romain Wacziarg. 2012. “The Political Economy of Linguistic Cleavages.” Journal of Development Economics 97 (2):322–338.

Dickens, Andrew. 2018. “Ethnolinguistic Favoritism in African Politics.” American Economic Journal: Applied Economics 10 (3):370–402.

Dowden, Richard. 2009. Africa: Altered States, Ordinary Miracles. London: Portobello Books Ltd, 1st ed.

Easterly, William and Ross Levine. 1997. “Africa’s Growth Tragedy: Policies and Ethnic Divisions.” The Quarterly Journal of Economics 112 (4):1203–1250.

Esteban, Joan, Laura Mayoral, and Debraj Ray. 2012. “Ethnicity and Conflict: An Empirical Study.” American Economic Review 102 (4):1310–1342.

Fearon, James D. 2003. “Ethnic and Cultural Diversity by Country*.” Journal of Economic Growth 8 (2):195–222.

Fenske, James. 2015. “African Polygamy: Past and Present.” Journal of Development Economics 117 (C):58–73.

Fernández, Raquel. 2003. “Household Formation, Inequality, and the Macroeconomy.” Journal of the European Economic Association 1 (2-3):683–697.

Franck, Raphaël and Ilia Rainer. 2012. “Does the Leader’s Ethnicity Matter? Ethnic Favoritism, Education, and Health in Sub-Saharan Africa.” American Political Science Review 106 (2):294–325.

Francois, Patrick, Ilia Rainer, and Francesco Trebbi. 2015. “How Is Power Shared in Africa?” Econometrica 83:465–503.

Gennaioli, Nicola, Rafael La Porta, Florencio Lopez De Silanes, and Andrei Shleifer. 2014. “Growth in Regions.” Journal of Economic Growth 19 (3):259–309.

Gennaioli, Nicola and Ilia Rainer. 2007. “The Modern Impact of Precolonial Centralization in Africa.” Journal of Economic Growth 12 (3):185–234.

Giuliano, Paola and Nathan Nunn. 2017. “Understanding Cultural Persistence and Change.” Working Paper 23617, National Bureau of Economic Research.

Golley, Jane and Sherry Tao Kong. 2013. “Inequality in Intergenerational Mobility of Education in China.” China & World Economy 21 (2):15–37.

Herbst, Jeffrey. 2000. States and Power in Africa: Comparative Lessons in Authority and Control. Princeton, NJ: Princeton University Press.

Hertz, Tom, Tamara Jayasundera, Patrizio Piraino, Sibel Selcuk, Nicole Smith, and Alina Verashchagina. 2008. “The Inheritance of Educational Inequality: International Comparisons and Fifty-Year Trends.” The B.E. Journal of Economic Analysis & Policy 7 (2).

Kramon, Eric and Daniel N. Posner. 2016. “Ethnic Favoritism in Education in Kenya.” Quarterly Journal of Political Science 11 (1):1–58.

Krueger, Alan B. and Mikael Lindahl. 2001. “Education for Growth: Why and for Whom?” Journal of Economic Literature 39 (4):1101–1136.

Logan, Carolyn. 2013. “The Roots of Resilience: Exploring Popular Support for African Traditional Authorities.” African Affairs 112 (448):353–376.

Michalopoulos, Stelios and Elias Papaioannou. 2013. “Pre-Colonial Ethnic Institutions and Contemporary African Development.” Econometrica 81 (1):113–152.

———. 2014. “On the Ethnic Origins of African Development: Chiefs and Precolonial Political Centralization.” Academy of Management Perspectives 29 (1):32–71.

———. 2016. “The Long-Run Effects of the Scramble for Africa.” American Economic Review 106 (7):1802–1848.

———. 2018. “Historical Legacies and African Development.” Unpublished manuscript.

Michalopoulos, Stelios, Louis Putterman, and David N Weil. 2016. “The Influence of Ancestral Lifeways on Individual Economic Outcomes in Sub-Saharan Africa.” Working Paper 21907, National Bureau of Economic Research.

Montalvo, José G. and Marta Reynal-Querol. 2005. “Ethnic Polarization, Potential Conflict, and Civil Wars.” American Economic Review 95 (3):796–816.

Montenegro, Claudio E. and Harry Anthony Patrinos. 2014. “Comparable Estimates of Returns to Schooling around the World.” Tech. Rep. WPS7020, The World Bank.

Murdock, George P. 1959. Africa: Its Peoples and Their Cultural History. New York: McGraw-Hill.

Murdock, George Peter. 1967. “Ethnographic Atlas: A Summary.” Ethnology 6 (2):109–236.

Nunn, Nathan. 2008. “The Long-Term Effects of Africa’s Slave Trades.” The Quarterly Journal of Economics 123 (1):139–176.

Nunn, Nathan and Leonard Wantchekon. 2011. “The Slave Trade and the Origins of Mistrust in Africa.” American Economic Review 101 (7):3221–3252.

Posner, Daniel N. 2005. Institutions and Ethnic Politics in Africa. Cambridge University Press.

Psacharopoulos, George. 1994. “Returns to Investment in Education: A Global Update.” World Development 22 (9):1325–1343.

Robinson, James. 2001. “Social Identity, Inequality, and Conflict.” Economics of Governance 2:85–99.

Solon, Gary. 1999. “Intergenerational Mobility in the Labor Market.” In Handbook of Labor Economics, vol. 3, Part A. Elsevier, 1761–1800.

Spady, William G. 1967. “Educational Mobility and Access: Growth and Paradoxes.” American Journal of Sociology 73 (3):273–286.

Stewart, Frances. 2002. “Horizontal Inequalities: A Neglected Dimension of Development.” UNU World Institute for Development Economics Research Annual Lecture 5.

Vogt, Manuel, Nils-Christian Bormann, Seraina Rüegger, Lars-Erik Cederman, Philipp Hunziker, and Luc Girardin. 2015. “Integrating Data on Ethnicity, Geography, and Conflict: The Ethnic Power Relations Data Set Family.” Journal of Conflict Resolution 59 (7):1327–1342.

Vogt, Manuel and Seraina Rüegger. 2018. “The Ethnic Power Relations (EPR) Core Dataset 2018.”

Wimmer, Andreas, Lars-Erik Cederman, and Brian Min. 2009. “Ethnic Politics and Armed Conflict: A Configurational Analysis of a New Global Data Set.” American Sociological Review 74 (2):316–337.

Young, Alwyn. 2012. “The African Growth Miracle.” Journal of Political Economy 120 (4):696–739.

A Details on sample construction

A.1 Religion sample

Table 18: Number of observations in the religion sample

| country | fraction | year | N_all | N_age | N_owned | N_owned | N_olded (age≥14) | N_olded (14≤age≤18) | N_olded (olded=0) | N_olded (14≤age≤18, olded=0) | n_r | student data | urban/rural data |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Botswana | 10 | 2001 | 168,676 | 168,134 | 159,257 | 109,649 | 36,006 | 14,150 | 20,412 | 6,168 | 4 | yes | no |
| Botswana | 10 | 2011 | 201,752 | 201,235 | 190,212 | 138,375 | 40,463 | 12,794 | 16,040 | 2,786 | 4 | yes | no |
| Burkina Faso | 10 | 1996 | 1,081,046 | 1,075,824 | 803,264 | 552,402 | 156,495 | 77,238 | 147,669 | 71,934 | 6 | no | no |
| Burkina Faso | 10 | 2006 | 1,417,824 | 1,410,123 | 1,244,291 | 770,161 | 178,501 | 103,859 | 155,898 | 92,855 | 6 | yes | yes |
| Cameroon | 10 | 2005 | 1,772,359 | 1,772,359 | 1,542,200 | 1,018,632 | 311,011 | 138,181 | 141,740 | 56,272 | 8 | yes | yes |
| Egypt | 14.1 | 1986 | 6,799,093 | 6,794,386 | 5,418,332 | 4,262,426 | 1,489,558 | 649,695 | 1,187,454 | 517,672 | 3 | no | yes |
| Egypt | 10 | 1996 | 5,902,243 | 5,901,839 | 4,453,382 | 3,810,835 | 1,331,716 | 670,174 | 1,054,871 | 516,873 | 3 | no | yes |
| Egypt | 10 | 2006 | 7,282,434 | 7,282,434 | 5,739,722 | 5,096,618 | 1,916,007 | 753,720 | 1,239,841 | 443,837 | 3 | yes | yes |
| Ethiopia | 10 | 1984 | 3,404,306 | 3,398,027 | 2,733,575 | 1,800,650 | 379,412 | 204,664 | 368,396 | 197,342 | 6 | yes | yes |
| Ethiopia | 10 | 1994 | 5,044,598 | 5,044,597 | 4,201,616 | 2,833,214 | 793,791 | 451,168 | 749,892 | 423,495 | 6 | yes | yes |
| Ethiopia | 10 | 2007 | 7,434,086 | 7,434,086 | 1,097,614 | 744,744 | 211,838 | 121,605 | 189,209 | 107,719 | 6 | yes | yes |
| Ghana | 10 | 2000 | 1,894,133 | 1,894,133 | 1,730,902 | 1,152,128 | 310,913 | 129,369 | 178,820 | 63,859 | 7 | yes | yes |
| Ghana | 10 | 2010 | 2,466,289 | 2,466,289 | 2,262,894 | 1,575,528 | 499,171 | 200,837 | 245,681 | 87,612 | 7 | yes | yes |
| Guinea | 10 | 1983 | 457,837 | 457,778 | 364,805 | 275,065 | 44,403 | 22,662 | 41,557 | 20,967 | 6 | yes | yes |
| Guinea | 10 | 1996 | 729,071 | 727,246 | 551,619 | 397,137 | 113,872 | 44,673 | 100,445 | 37,986 | 5 | yes | yes |
| Liberia | 10 | 2008 | 348,057 | 348,057 | 294,517 | 210,111 | 59,015 | 25,494 | 35,139 | 13,253 | 5 | yes | yes |
| Malawi | 10 | 1998 | 991,393 | 991,393 | 826,197 | 582,694 | 109,301 | 64,674 | 78,746 | 45,320 | 4 | yes | yes |
| Malawi | 10 | 2008 | 1,341,977 | 1,341,046 | 1,155,840 | 730,242 | 150,447 | 89,098 | 96,800 | 57,397 | 4 | yes | yes |
| Mali | 10 | 2009 | 1,451,856 | 1,424,140 | 1,262,277 | 776,333 | 268,699 | 120,018 | 228,707 | 101,769 | 5 | yes | yes |
| Mozambique | 10 | 2007 | 2,047,048 | 2,047,048 | 1,616,853 | 1,103,596 | 262,286 | 133,824 | 219,556 | 110,169 | 6 | yes | yes |
| Nigeria | .05 | 2010 | 72,191 | 71,991 | 58,973 | 41,830 | 14,115 | 6,679 | 6,432 | 2,918 | 4 | yes | yes |
| Rwanda | 10 | 2002 | 843,392 | 843,392 | 645,489 | 472,153 | 142,765 | 81,951 | 112,107 | 62,684 | 7 | yes | yes |
| Senegal | 10 | 1988 | 700,199 | 699,981 | 527,462 | 378,289 | 103,599 | 42,459 | 90,035 | 35,464 | 4 | yes | no |
| Senegal | 10 | 2002 | 994,562 | 994,562 | 911,891 | 594,599 | 233,001 | 82,137 | 192,271 | 64,360 | 5 | yes | yes |
| Sierra Leone | 10 | 2004 | 494,298 | 492,922 | 395,788 | 291,916 | 94,108 | 38,245 | 71,242 | 27,146 | 7 | yes | yes |
| South Africa | 10 | 1996 | 3,621,164 | 3,578,019 | 3,055,995 | 2,328,067 | 753,838 | 283,482 | 330,368 | 104,373 | 9 | yes | yes |
| South Africa | 10 | 2001 | 3,725,655 | 3,725,655 | 3,353,684 | 2,598,672 | 880,011 | 320,148 | 381,315 | 114,613 | 9 | yes | yes |
| Uganda | 10 | 1991 | 1,548,460 | 1,547,604 | 1,242,885 | 855,537 | 183,396 | 97,908 | 135,688 | 65,958 | 7 | yes | yes |
| Uganda | 10 | 2002 | 2,497,449 | 2,497,449 | 2,042,838 | 1,355,857 | 304,094 | 183,083 | 170,612 | 95,258 | 8 | yes | yes |
| Zambia | 10 | 2000 | 996,117 | 996,117 | 825,110 | 570,022 | 192,384 | 93,412 | 85,575 | 35,003 | 5 | yes | yes |
| Zambia | 10 | 2010 | 1,321,973 | 1,321,973 | 1,028,628 | 704,471 | 227,855 | 117,903 | 67,399 | 29,460 | 5 | yes | no |
| total | | | 69,051,536 | 68,949,840 | 51,738,112 | 38,131,952 | 11,792,071 | 5,375,304 | 8,139,917 | 3,612,522 | | | |

A.2 Ethnicity sample

Table 19: Number of observations in the ethnicity sample

| country | fraction | year | N_all | N_age | N_owned | N_owned | N_olded (age≥14) | N_olded (14≤age≤18) | N_olded (olded=0) | N_olded (14≤age≤18, olded=0) | n_e | student data | urban/rural data |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Botswana | 10 | 2001 | 168,676 | 168,134 | 159,257 | 109,649 | 36,006 | 14,150 | 20,412 | 6,168 | 8 | yes | no |
| Botswana | 10 | 2011 | 201,752 | 201,235 | 190,212 | 138,375 | 40,463 | 12,794 | 16,040 | 2,786 | 8 | yes | no |
| Burkina Faso | 10 | 2006 | 1,417,824 | 1,410,123 | 1,244,291 | 770,161 | 178,501 | 103,859 | 155,898 | 92,855 | 13 | yes | yes |
| Ethiopia | 10 | 1994 | 5,044,598 | 5,044,597 | 4,201,616 | 2,833,214 | 793,791 | 451,168 | 749,892 | 423,495 | 11 | yes | yes |
| Ethiopia | 10 | 2007 | 7,434,086 | 7,434,086 | 1,097,614 | 744,744 | 211,838 | 121,605 | 189,209 | 107,719 | 11 | yes | yes |
| Ghana | 10 | 2000 | 1,894,133 | 1,894,133 | 1,730,902 | 1,152,128 | 310,913 | 129,369 | 178,820 | 63,859 | 9 | yes | yes |
| Ghana | 10 | 2010 | 2,466,289 | 2,466,289 | 2,262,894 | 1,575,528 | 499,171 | 200,837 | 245,681 | 87,612 | 9 | yes | yes |
| Liberia | 10 | 2008 | 348,057 | 348,057 | 294,517 | 210,111 | 59,015 | 25,494 | 35,139 | 13,253 | 16 | yes | yes |
| Malawi | 10 | 2008 | 1,341,977 | 1,341,046 | 1,155,840 | 730,242 | 150,447 | 89,098 | 96,800 | 57,397 | 11 | yes | yes |
| Mali | 10 | 1987 | 785,384 | 773,407 | 582,678 | 422,837 | 111,633 | 48,553 | 104,443 | 44,574 | 13 | no | no |
| Mali | 10 | 1998 | 991,330 | 986,822 | 734,156 | 519,001 | 155,183 | 68,901 | 143,926 | 63,136 | 13 | yes | yes |
| Mali | 10 | 2009 | 1,451,856 | 1,424,140 | 1,262,277 | 776,333 | 268,699 | 120,018 | 228,707 | 101,769 | 13 | yes | yes |
| Morocco | 5 | 1994 | 1,294,026 | 1,293,171 | 1,293,171 | 842,330 | 406,223 | 136,345 | 376,346 | 121,039 | 5 | no | no |
| Morocco | 5 | 2004 | 1,482,720 | 1,481,076 | 1,481,076 | 1,052,531 | 514,271 | 150,544 | 455,058 | 125,690 | 5 | no | no |
| Mozambique | 10 | 2007 | 2,047,048 | 2,047,048 | 1,616,853 | 1,103,596 | 262,286 | 133,824 | 219,556 | 110,169 | 18 | yes | yes |
| Senegal | 10 | 1988 | 700,199 | 699,981 | 527,462 | 378,289 | 103,599 | 42,459 | 90,035 | 35,464 | 10 | yes | no |
| Senegal | 10 | 2002 | 994,562 | 994,562 | 911,891 | 594,599 | 233,001 | 82,137 | 192,271 | 64,360 | 10 | yes | yes |
| Sierra Leone | 10 | 2004 | 494,298 | 492,922 | 395,788 | 291,916 | 94,108 | 38,245 | 71,242 | 27,146 | 13 | yes | yes |
| South Africa | 10 | 1996 | 3,621,164 | 3,578,019 | 3,055,995 | 2,328,067 | 753,838 | 283,482 | 330,368 | 104,373 | 12 | yes | yes |
| South Africa | 10 | 2001 | 3,725,655 | 3,725,655 | 3,353,684 | 2,598,672 | 880,011 | 320,148 | 381,315 | 114,613 | 12 | yes | yes |
| South Africa | 8.6 | 2011 | 4,418,594 | 4,418,594 | 3,845,633 | 3,101,908 | 919,915 | 302,412 | 273,167 | 63,522 | 12 | yes | yes |
| Uganda | 10 | 1991 | 1,548,460 | 1,547,604 | 1,242,885 | 855,537 | 183,396 | 97,908 | 135,688 | 65,958 | 21 | yes | yes |
| Uganda | 10 | 2002 | 2,497,449 | 2,497,449 | 2,042,838 | 1,355,857 | 304,094 | 183,083 | 170,612 | 95,258 | 21 | yes | yes |
| Zambia | 10 | 1990 | 787,461 | 787,461 | 664,239 | 460,486 | 142,016 | 75,070 | 90,770 | 42,762 | 11 | yes | yes |
| Zambia | 10 | 2000 | 996,117 | 996,117 | 825,110 | 570,022 | 192,384 | 93,412 | 85,575 | 35,003 | 11 | yes | yes |
| Zambia | 10 | 2010 | 1,321,973 | 1,321,973 | 1,028,628 | 704,471 | 227,855 | 117,903 | 67,399 | 29,460 | 11 | yes | no |
| total | | | 49,475,688 | 49,373,700 | 37,201,508 | 26,220,604 | 8,032,657 | 3,442,818 | 5,104,369 | 2,099,440 | | | |

B Further results on IM across religious groups

Table 20: Country-level summary statistics of religion-level IM, urban/rural

Urban sample

| country | country-level IM | mean group-level IM | stdev group-level IM | min | min group | max | max group |
|---|---|---|---|---|---|---|---|
| Burkina Faso | 0.159 | 0.490 | 0.135 | 0.304 | Animist | 0.639 | Christian, Protestant |
| Cameroon | 0.613 | 0.746 | 0.143 | 0.459 | Other | 0.860 | Christian, Catholic |
| Egypt | 0.630 | 0.436 | 0.300 | 0.091 | Other | 0.637 | Christian, unspecified |
| Ethiopia | 0.182 | 0.535 | 0.187 | 0.217 | Traditional | 0.753 | Christian, Orthodox |
| Ghana | 0.638 | 0.714 | 0.115 | 0.528 | Traditional | 0.817 | Christian, Protestant |
| Guinea | 0.285 | 0.494 | 0.149 | 0.255 | Animist | 0.679 | Christian, Catholic |
| Liberia | 0.368 | 0.318 | 0.215 | 0.000 | Other | 0.546 | Muslim |
| Malawi | 0.490 | 0.578 | 0.153 | 0.358 | No religion | 0.690 | Muslim |
| Mali | 0.348 | 0.541 | 0.165 | 0.333 | No religion | 0.734 | Christian, unspecified |
| Mozambique | 0.147 | 0.293 | 0.070 | 0.184 | Muslim | 0.360 | Other |
| Nigeria | 0.668 | 0.671 | 0.254 | 0.300 | Traditional | 0.875 | Christian, unspecified |
| Rwanda | 0.323 | 0.401 | 0.126 | 0.171 | No religion | 0.546 | Muslim |
| Senegal | 0.290 | 0.525 | 0.154 | 0.333 | Christian, Protestant | 0.692 | Christian, Other |
| Sierra Leone | 0.418 | 0.562 | 0.262 | 0.000 | No religion | 0.767 | Other |
| South Africa | 0.861 | 0.899 | 0.030 | 0.846 | No religion | 0.941 | Other |
| Uganda | 0.550 | 0.529 | 0.300 | 0.056 | Traditional | 0.750 | Christian, Orthodox |
| Zambia | 0.569 | 0.643 | 0.081 | 0.512 | No religion | 0.717 | Christian, Protestant |

Rural sample

| country | country-level IM | mean group-level IM | stdev group-level IM | min | min group | max | max group |
|---|---|---|---|---|---|---|---|
| Burkina Faso | 0.159 | 0.135 | 0.058 | 0.072 | Animist | 0.203 | Christian, Protestant |
| Cameroon | 0.613 | 0.575 | 0.162 | 0.298 | Other | 0.709 | Christian, Protestant |
| Egypt | 0.630 | 0.633 | 0.046 | 0.581 | Other | 0.660 | Christian, unspecified |
| Ethiopia | 0.182 | 0.121 | 0.068 | 0.050 | Traditional | 0.203 | Christian, Catholic |
| Ghana | 0.638 | 0.574 | 0.137 | 0.328 | Traditional | 0.701 | Christian, Protestant |
| Guinea | 0.285 | 0.294 | 0.117 | 0.149 | Muslim | 0.474 | Christian, Protestant |
| Liberia | 0.368 | 0.222 | 0.106 | 0.083 | Other | 0.349 | Muslim |
| Malawi | 0.490 | 0.390 | 0.104 | 0.249 | No religion | 0.501 | Christian, unspecified |
| Mali | 0.348 | 0.296 | 0.096 | 0.228 | No religion | 0.462 | Christian, unspecified |
| Mozambique | 0.147 | 0.095 | 0.025 | 0.061 | Muslim | 0.119 | Christian, Protestant |
| Nigeria | 0.668 | 0.488 | 0.257 | 0.300 | Other | 0.844 | Christian, unspecified |
| Rwanda | 0.323 | 0.282 | 0.076 | 0.135 | No religion | 0.360 | Muslim |
| Senegal | 0.290 | 0.394 | 0.127 | 0.193 | Muslim | 0.517 | Christian, Catholic |
| Sierra Leone | 0.418 | 0.297 | 0.123 | 0.053 | Traditional | 0.416 | Christian, Protestant |
| South Africa | 0.861 | 0.845 | 0.021 | 0.813 | No religion | 0.873 | Christian, Other |
| Uganda | 0.550 | 0.403 | 0.246 | 0.039 | Traditional | 0.643 | Muslim |
| Zambia | 0.569 | 0.438 | 0.088 | 0.285 | No religion | 0.500 | Muslim |

All estimates are unconditional (no birth-decade or census-year fixed effects) and are for the sample of individuals aged 14-18 without student correction.

Table 21: Country-level summary statistics of religion-level IM, male / female

Male sample

| country | country-level IM | mean group-level IM | stdev group-level IM | min | min group | max | max group |
|---|---|---|---|---|---|---|---|
| Botswana | 0.811 | 0.757 | 0.052 | 0.697 | Other | 0.800 | Muslim |
| Burkina Faso | 0.159 | 0.194 | 0.087 | 0.075 | Animist | 0.293 | Christian, Catholic |
| Cameroon | 0.514 | 0.563 | 0.142 | 0.319 | Other | 0.725 | Christian, Orthodox |
| Egypt | 0.630 | 0.604 | 0.122 | 0.464 | Other | 0.688 | Christian, unspecified |
| Ethiopia | 0.120 | 0.104 | 0.057 | 0.033 | Traditional | 0.159 | Christian, Protestant |
| Ghana | 0.540 | 0.523 | 0.130 | 0.281 | Traditional | 0.651 | Christian, Protestant |
| Guinea | 0.213 | 0.329 | 0.092 | 0.251 | Muslim | 0.505 | Christian, Catholic |
| Liberia | 0.221 | 0.138 | 0.121 | 0.000 | Other | 0.303 | Muslim |
| Malawi | 0.149 | 0.111 | 0.038 | 0.066 | No religion | 0.157 | Christian, unspecified |
| Mali | 0.263 | 0.280 | 0.091 | 0.170 | No religion | 0.415 | Christian, unspecified |
| Mozambique | 0.147 | 0.160 | 0.038 | 0.114 | Muslim | 0.203 | Christian, Protestant |
| Nigeria | 0.594 | 0.401 | 0.273 | 0.185 | Other | 0.752 | Christian, unspecified |
| Rwanda | 0.161 | 0.162 | 0.050 | 0.056 | No religion | 0.208 | Muslim |
| Senegal | 0.252 | 0.428 | 0.106 | 0.272 | Muslim | 0.548 | Christian, Catholic |
| Sierra Leone | 0.239 | 0.225 | 0.103 | 0.081 | No religion | 0.352 | Christian, Protestant |
| South Africa | 0.733 | 0.723 | 0.072 | 0.651 | Traditional | 0.853 | Other |
| Uganda | 0.328 | 0.242 | 0.153 | 0.026 | Traditional | 0.384 | Muslim |
| Zambia | 0.433 | 0.408 | 0.086 | 0.261 | No religion | 0.480 | Muslim |

Female sample

| country | country-level IM | mean group-level IM | stdev group-level IM | min | min group | max | max group |
|---|---|---|---|---|---|---|---|
| Botswana | 0.811 | 0.738 | 0.129 | 0.571 | Muslim | 0.873 | Christian, unspecified |
| Burkina Faso | 0.159 | 0.155 | 0.096 | 0.034 | Animist | 0.259 | Christian, Catholic |
| Cameroon | 0.514 | 0.470 | 0.161 | 0.214 | Other | 0.603 | Christian, Catholic |
| Egypt | 0.630 | 0.544 | 0.101 | 0.429 | Other | 0.611 | Christian, unspecified |
| Ethiopia | 0.120 | 0.080 | 0.057 | 0.010 | Traditional | 0.162 | Christian, Orthodox |
| Ghana | 0.540 | 0.497 | 0.146 | 0.234 | Traditional | 0.629 | Christian, Protestant |
| Guinea | 0.213 | 0.149 | 0.107 | 0.044 | Other | 0.323 | Christian, Protestant |
| Liberia | 0.221 | 0.139 | 0.096 | 0.000 | Other | 0.244 | Muslim |
| Malawi | 0.149 | 0.108 | 0.051 | 0.038 | No religion | 0.158 | Christian, unspecified |
| Mali | 0.263 | 0.204 | 0.096 | 0.093 | No religion | 0.339 | Christian, unspecified |
| Mozambique | 0.147 | 0.137 | 0.046 | 0.080 | No religion | 0.187 | Christian, Protestant |
| Nigeria | 0.594 | 0.396 | 0.292 | 0.174 | Other | 0.789 | Christian, unspecified |
| Rwanda | 0.161 | 0.152 | 0.077 | 0.040 | No religion | 0.277 | Muslim |
| Senegal | 0.252 | 0.226 | 0.121 | 0.122 | Christian, Other | 0.434 | Christian, Catholic |
| Sierra Leone | 0.239 | 0.202 | 0.072 | 0.111 | Other | 0.282 | Christian, Protestant |
| South Africa | 0.733 | 0.796 | 0.052 | 0.723 | No religion | 0.880 | Other |
| Uganda | 0.328 | 0.224 | 0.158 | 0.016 | Traditional | 0.436 | Muslim |
| Zambia | 0.433 | 0.393 | 0.106 | 0.211 | No religion | 0.487 | Muslim |

All estimates are unconditional (no birth-decade or census-year fixed effects) and are for the sample of individuals aged 14-18 without student correction.

Table 22: Summary statistics and IM at the country × religion level

| country | religion | population share | mean old years of schooling | share of literate old, kids 14-18 | share of literate old, kids 14-100 | IM, kids 14-18 | IM, kids 14-100 |
|---|---|---|---|---|---|---|---|
| Botswana | Christian, Unspecified | 0.760 | 5.313 | 0.703 | 0.552 | 0.890 | 0.799 |
| Botswana | Other | 0.005 | 7.278 | 0.754 | 0.630 | 0.871 | 0.717 |
| Botswana | No Religion | 0.193 | 3.719 | 0.544 | 0.413 | 0.814 | 0.706 |
| Botswana | Muslim | 0.006 | 8.732 | 0.915 | 0.814 | 0.667 | 0.750 |
| Botswana | mean | 0.241 | 6.261 | 0.729 | 0.602 | 0.810 | 0.743 |
| Botswana | stdev | 0.357 | 2.198 | 0.153 | 0.167 | 0.101 | 0.042 |
| Botswana | min | 0.005 | 3.719 | 0.544 | 0.413 | 0.667 | 0.706 |
| Botswana | max | 0.760 | 8.732 | 0.915 | 0.814 | 0.890 | 0.799 |
| Burkina Faso | Christian, Catholic | 0.201 | 1.806 | 0.210 | 0.215 | 0.277 | 0.276 |
| Burkina Faso | Christian, Protestant | 0.038 | 1.618 | 0.199 | 0.195 | 0.253 | 0.250 |
| Burkina Faso | Other | 0.004 | 1.766 | 0.172 | 0.193 | 0.230 | 0.242 |
| Burkina Faso | Muslim | 0.574 | 0.569 | 0.067 | 0.067 | 0.154 | 0.151 |
| Burkina Faso | No Religion | 0.004 | 0.547 | 0.055 | 0.051 | 0.093 | 0.090 |
| Burkina Faso | Animist | 0.179 | 0.101 | 0.012 | 0.010 | 0.058 | 0.049 |
| Burkina Faso | mean | 0.167 | 1.068 | 0.119 | 0.122 | 0.178 | 0.176 |
| Burkina Faso | stdev | 0.218 | 0.747 | 0.085 | 0.089 | 0.090 | 0.094 |
| Burkina Faso | min | 0.004 | 0.101 | 0.012 | 0.010 | 0.058 | 0.049 |
| Burkina Faso | max | 0.574 | 1.806 | 0.210 | 0.215 | 0.277 | 0.276 |
| Cameroon | Christian, Protestant | 0.279 | 5.103 | 0.651 | 0.587 | 0.742 | 0.692 |
| Cameroon | Christian, Orthodox | 0.005 | 4.866 | 0.614 | 0.569 | 0.735 | 0.660 |
| Cameroon | Christian, Catholic | 0.400 | 5.902 | 0.740 | 0.671 | 0.731 | 0.708 |
| Cameroon | Christian, Other | 0.039 | 6.041 | 0.739 | 0.681 | 0.727 | 0.694 |
| Cameroon | No Religion | 0.026 | 4.260 | 0.590 | 0.504 | 0.696 | 0.615 |
| Cameroon | Other | 0.009 | 5.682 | 0.706 | 0.636 | 0.650 | 0.634 |
| Cameroon | Animist | 0.045 | 1.752 | 0.226 | 0.204 | 0.523 | 0.400 |
| Cameroon | Muslim | 0.184 | 2.287 | 0.297 | 0.269 | 0.461 | 0.381 |
| Cameroon | mean | 0.123 | 4.487 | 0.570 | 0.515 | 0.658 | 0.598 |
| Cameroon | stdev | 0.148 | 1.636 | 0.199 | 0.182 | 0.108 | 0.132 |
| Cameroon | min | 0.005 | 1.752 | 0.226 | 0.204 | 0.461 | 0.381 |
| Cameroon | max | 0.400 | 6.041 | 0.740 | 0.681 | 0.742 | 0.708 |
| Egypt | Christian, Unspecified | 0.057 | 3.565 | 0.362 | 0.353 | 0.652 | 0.634 |
| Egypt | Other | 0.000 | 4.270 | 0.450 | 0.469 | 0.636 | 0.558 |
| Egypt | Muslim | 0.943 | 2.407 | 0.283 | 0.259 | 0.629 | 0.603 |
| Egypt | mean | 0.333 | 3.414 | 0.365 | 0.360 | 0.639 | 0.598 |
| Egypt | stdev | 0.529 | 0.941 | 0.084 | 0.105 | 0.012 | 0.038 |
| Egypt | min | 0.000 | 2.407 | 0.283 | 0.259 | 0.629 | 0.558 |
| Egypt | max | 0.943 | 4.270 | 0.450 | 0.469 | 0.652 | 0.634 |
| Ethiopia | Christian, Catholic | 0.010 | 0.960 | 0.095 | 0.090 | 0.226 | 0.251 |
| Ethiopia | Christian, Orthodox | 0.536 | 0.922 | 0.080 | 0.073 | 0.225 | 0.234 |
| Ethiopia | Christian, Protestant | 0.112 | 0.794 | 0.088 | 0.076 | 0.217 | 0.247 |
| Ethiopia | Muslim | 0.291 | 0.360 | 0.028 | 0.023 | 0.117 | 0.117 |
| Ethiopia | Other | 0.009 | 0.412 | 0.036 | 0.037 | 0.077 | 0.088 |
| Ethiopia | Traditional | 0.041 | 0.121 | 0.010 | 0.008 | 0.050 | 0.053 |
| Ethiopia | mean | 0.167 | 0.595 | 0.056 | 0.051 | 0.152 | 0.165 |
| Ethiopia | stdev | 0.210 | 0.344 | 0.036 | 0.033 | 0.080 | 0.089 |
| Ethiopia | min | 0.009 | 0.121 | 0.010 | 0.008 | 0.050 | 0.053 |
| Ethiopia | max | 0.536 | 0.960 | 0.095 | 0.090 | 0.226 | 0.251 |
| Ghana | Christian, Protestant | 0.454 | 6.410 | 0.683 | 0.618 | 0.741 | 0.665 |
| Ghana | Christian, Other | 0.107 | 5.482 | 0.640 | 0.565 | 0.722 | 0.631 |
| Ghana | Christian, Catholic | 0.147 | 5.071 | 0.554 | 0.494 | 0.692 | 0.602 |
| Ghana | Other | 0.007 | 5.427 | 0.597 | 0.539 | 0.673 | 0.569 |
| Ghana | Muslim | 0.179 | 1.951 | 0.265 | 0.214 | 0.603 | 0.463 |
| Ghana | No Religion | 0.044 | 2.883 | 0.362 | 0.318 | 0.512 | 0.438 |
| Ghana | Traditional | 0.062 | 0.821 | 0.110 | 0.091 | 0.341 | 0.232 |
| Ghana | mean | 0.143 | 4.006 | 0.459 | 0.406 | 0.612 | 0.514 |
| Ghana | stdev | 0.150 | 2.111 | 0.216 | 0.200 | 0.143 | 0.150 |
| Ghana | min | 0.007 | 0.821 | 0.110 | 0.091 | 0.341 | 0.232 |
| Ghana | max | 0.454 | 6.410 | 0.683 | 0.618 | 0.741 | 0.665 |
| Guinea | Christian, Catholic | 0.011 | 1.624 | 0.244 | 0.216 | 0.503 | 0.467 |
| Guinea | Christian, Protestant | 0.002 | 1.176 | 0.189 | 0.151 | 0.494 | 0.416 |
| Guinea | Christian, Unspecified | 0.053 | 2.332 | 0.334 | 0.250 | 0.348 | 0.292 |
| Guinea | Muslim | 0.857 | 0.939 | 0.116 | 0.097 | 0.281 | 0.228 |
| Guinea | Other | 0.004 | 0.920 | 0.164 | 0.099 | 0.257 | 0.217 |
| Guinea | Animist | 0.033 | 0.216 | 0.040 | 0.027 | 0.250 | 0.174 |
| Guinea | No Religion | 0.040 | 0.317 | 0.056 | 0.037 | 0.247 | 0.177 |
| Guinea | mean | 0.143 | 1.075 | 0.163 | 0.125 | 0.340 | 0.282 |
| Guinea | stdev | 0.315 | 0.735 | 0.104 | 0.085 | 0.114 | 0.117 |
| Guinea | min | 0.002 | 0.216 | 0.040 | 0.027 | 0.247 | 0.174 |
| Guinea | max | 0.857 | 2.332 | 0.334 | 0.250 | 0.503 | 0.467 |
| Liberia | Muslim | 0.103 | 2.139 | 0.298 | 0.252 | 0.434 | 0.385 |
| Liberia | Christian, Unspecified | 0.877 | 3.869 | 0.508 | 0.429 | 0.362 | 0.409 |
| Liberia | No Religion | 0.013 | 1.378 | 0.249 | 0.188 | 0.196 | 0.253 |
| Liberia | Traditional | 0.005 | 1.026 | 0.218 | 0.144 | 0.194 | 0.253 |
| Liberia | Other | 0.001 | 2.368 | 0.417 | 0.264 | 0.071 | 0.283 |
| Liberia | mean | 0.200 | 2.156 | 0.338 | 0.255 | 0.251 | 0.316 |
| Liberia | stdev | 0.381 | 1.102 | 0.121 | 0.108 | 0.145 | 0.075 |
| Liberia | min | 0.001 | 1.026 | 0.218 | 0.144 | 0.071 | 0.253 |
| Liberia | max | 0.877 | 3.869 | 0.508 | 0.429 | 0.434 | 0.409 |
| Malawi | Christian, Unspecified | 0.848 | 4.238 | 0.358 | 0.351 | 0.511 | 0.466 |
| Malawi | Muslim | 0.108 | 2.352 | 0.199 | 0.189 | 0.429 | 0.378 |
| Malawi | Other | 0.022 | 2.793 | 0.216 | 0.209 | 0.409 | 0.362 |
| Malawi | No Religion | 0.022 | 1.582 | 0.088 | 0.089 | 0.253 | 0.203 |
| Malawi | mean | 0.250 | 2.741 | 0.215 | 0.209 | 0.400 | 0.352 |
| Malawi | stdev | 0.401 | 1.116 | 0.111 | 0.108 | 0.108 | 0.110 |
| Malawi | min | 0.022 | 1.582 | 0.088 | 0.089 | 0.253 | 0.203 |
| Malawi | max | 0.848 | 4.238 | 0.358 | 0.351 | 0.511 | 0.466 |
| Mali | Christian, Unspecified | 0.026 | 2.355 | 0.237 | 0.242 | 0.492 | 0.379 |
| Mali | Muslim | 0.951 | 1.460 | 0.153 | 0.147 | 0.347 | 0.263 |
| Mali | Animist | 0.019 | 0.384 | 0.039 | 0.034 | 0.277 | 0.188 |
| Mali | Other | 0.000 | 1.253 | 0.088 | 0.146 | 0.258 | 0.184 |
| Mali | No Religion | 0.003 | 0.467 | 0.046 | 0.043 | 0.230 | 0.174 |
| Mali | mean | 0.200 | 1.184 | 0.112 | 0.122 | 0.321 | 0.238 |
| Mali | stdev | 0.420 | 0.807 | 0.083 | 0.086 | 0.105 | 0.086 |
| Mali | min | 0.000 | 0.384 | 0.039 | 0.034 | 0.230 | 0.174 |
| Mali | max | 0.951 | 2.355 | 0.237 | 0.242 | 0.492 | 0.379 |
| Mozambique | Christian, Protestant | 0.143 | 2.303 | 0.206 | 0.183 | 0.195 | 0.271 |
| Mozambique | Other | 0.072 | 2.546 | 0.216 | 0.207 | 0.186 | 0.270 |
| Mozambique | Christian, Catholic | 0.276 | 2.741 | 0.233 | 0.231 | 0.163 | 0.248 |
| Mozambique | Christian, Protestant Pentecostal | 0.190 | 1.703 | 0.130 | 0.104 | 0.159 | 0.191 |
| Mozambique | No Religion | 0.183 | 1.435 | 0.102 | 0.094 | 0.101 | 0.150 |
| Mozambique | Muslim | 0.130 | 1.968 | 0.171 | 0.162 | 0.100 | 0.149 |
| Mozambique | mean | 0.165 | 2.116 | 0.176 | 0.163 | 0.151 | 0.213 |
| Mozambique | stdev | 0.069 | 0.503 | 0.052 | 0.055 | 0.041 | 0.057 |
| Mozambique | min | 0.072 | 1.435 | 0.102 | 0.094 | 0.100 | 0.149 |
| Mozambique | max | 0.276 | 2.741 | 0.233 | 0.231 | 0.195 | 0.271 |
| Nigeria | Christian, Unspecified | 0.675 | 5.604 | 0.697 | 0.649 | 0.846 | 0.848 |
| Nigeria | Muslim | 0.315 | 2.773 | 0.345 | 0.330 | 0.539 | 0.529 |
| Nigeria | Other | 0.000 | 1.000 | 0.224 | -0.000 | 0.289 | 0.500 |
| Nigeria | Traditional | 0.007 | 1.290 | 0.224 | 0.181 | 0.289 | 0.419 |
| Nigeria | mean | 0.250 | 2.667 | 0.372 | 0.290 | 0.491 | 0.574 |
| Nigeria | stdev | 0.320 | 2.106 | 0.223 | 0.275 | 0.265 | 0.189 |
| Nigeria | min | 0.000 | 1.000 | 0.224 | -0.000 | 0.289 | 0.419 |
| Nigeria | max | 0.675 | 5.604 | 0.697 | 0.649 | 0.846 | 0.848 |
| Rwanda | Muslim | 0.015 | 3.106 | 0.396 | 0.352 | 0.443 | 0.465 |
| Rwanda | Christian, Catholic | 0.525 | 2.317 | 0.266 | 0.240 | 0.355 | 0.365 |
| Rwanda | Christian, Other | 0.044 | 2.162 | 0.249 | 0.225 | 0.312 | 0.338 |
| Rwanda | Christian, Protestant | 0.387 | 1.830 | 0.196 | 0.180 | 0.296 | 0.301 |
| Rwanda | Other | 0.005 | 1.873 | 0.196 | 0.184 | 0.251 | 0.278 |
| Rwanda | Traditional | 0.000 | 0.317 | 0.038 | 0.049 | 0.160 | 0.205 |
| Rwanda | No Religion | 0.020 | 1.109 | 0.112 | 0.098 | 0.137 | 0.130 |
| Rwanda | mean | 0.142 | 1.816 | 0.208 | 0.190 | 0.279 | 0.297 |
| Rwanda | stdev | 0.218 | 0.893 | 0.115 | 0.099 | 0.107 | 0.109 |
| Rwanda | min | 0.000 | 0.317 | 0.038 | 0.049 | 0.137 | 0.130 |
| Rwanda | max | 0.525 | 3.106 | 0.396 | 0.352 | 0.443 | 0.465 |
| Senegal | Christian, Catholic | 0.041 | 2.990 | 0.434 | 0.357 | 0.570 | 0.524 |
| Senegal | Other | 0.004 | 1.540 | 0.203 | 0.179 | 0.448 | 0.392 |
| Senegal | Christian, Protestant | 0.000 | 3.360 | 0.350 | 0.303 | 0.423 | 0.325 |
| Senegal | Christian, Other | 0.001 | 2.253 | 0.373 | 0.258 | 0.385 | 0.312 |
| Senegal | Muslim | 0.951 | 1.237 | 0.188 | 0.152 | 0.280 | 0.254 |
| Senegal | mean | 0.200 | 2.276 | 0.310 | 0.250 | 0.421 | 0.361 |
| Senegal | stdev | 0.421 | 0.909 | 0.108 | 0.085 | 0.105 | 0.103 |
| Senegal | min | 0.000 | 1.237 | 0.188 | 0.152 | 0.280 | 0.254 |
| Senegal | max | 0.951 | 3.360 | 0.434 | 0.357 | 0.570 | 0.524 |
| Sierra Leone | Christian, Protestant | 0.105 | 4.266 | 0.536 | 0.485 | 0.507 | 0.431 |
| Sierra Leone | Other | 0.009 | 2.064 | 0.298 | 0.240 | 0.492 | 0.392 |
| Sierra Leone | Christian, Catholic | 0.080 | 3.372 | 0.468 | 0.399 | 0.434 | 0.369 |
| Sierra Leone | Christian, Other | 0.051 | 2.798 | 0.396 | 0.337 | 0.423 | 0.348 |
| Sierra Leone | Muslim | 0.752 | 1.464 | 0.233 | 0.186 | 0.410 | 0.300 |
| Sierra Leone | No Religion | 0.002 | 0.905 | 0.169 | 0.141 | 0.203 | 0.157 |
| Sierra Leone | Traditional | 0.001 | 1.074 | 0.160 | 0.111 | 0.095 | 0.083 |
| Sierra Leone | mean | 0.143 | 2.278 | 0.323 | 0.271 | 0.366 | 0.297 |
| Sierra Leone | stdev | 0.272 | 1.256 | 0.148 | 0.140 | 0.156 | 0.129 |
| Sierra Leone | min | 0.001 | 0.905 | 0.160 | 0.111 | 0.095 | 0.083 |
| Sierra Leone | max | 0.752 | 4.266 | 0.536 | 0.485 | 0.507 | 0.431 |
| South Africa | Muslim | 0.017 | 7.937 | 0.932 | 0.854 | 0.916 | 0.871 |
| South Africa | Other | 0.019 | 7.836 | 0.941 | 0.820 | 0.906 | 0.894 |
| South Africa | Traditional | 0.004 | 3.767 | 0.522 | 0.411 | 0.897 | 0.742 |
| South Africa | Christian, Other | 0.096 | 5.555 | 0.670 | 0.593 | 0.882 | 0.765 |
| South Africa | Christian, Protestant | 0.303 | 6.497 | 0.753 | 0.678 | 0.872 | 0.762 |
| South Africa | Christian, Catholic | 0.080 | 5.990 | 0.706 | 0.633 | 0.872 | 0.761 |
| South Africa | Christian, Protestant Pentecostal | 0.296 | 4.284 | 0.553 | 0.470 | 0.871 | 0.751 |
| South Africa | Christian, Orthodox | 0.017 | 5.087 | 0.646 | 0.558 | 0.869 | 0.760 |
| South Africa | No Religion | 0.124 | 3.498 | 0.434 | 0.371 | 0.818 | 0.674 |
| South Africa | mean | 0.106 | 5.606 | 0.684 | 0.599 | 0.878 | 0.776 |
| South Africa | stdev | 0.117 | 1.628 | 0.173 | 0.168 | 0.028 | 0.067 |
| South Africa | min | 0.004 | 3.498 | 0.434 | 0.371 | 0.818 | 0.674 |
| South Africa | max | 0.303 | 7.937 | 0.941 | 0.854 | 0.916 | 0.894 |
| Uganda | Muslim | 0.094 | 3.796 | 0.484 | 0.423 | 0.651 | 0.572 |
| Uganda | Christian, Protestant | 0.424 | 3.585 | 0.449 | 0.399 | 0.598 | 0.540 |
| Uganda | Christian, Other | 0.010 | 3.417 | 0.424 | 0.383 | 0.593 | 0.535 |
| Uganda | Christian, Orthodox | 0.001 | 4.301 | 0.484 | 0.458 | 0.535 | 0.481 |
| Uganda | Christian, Catholic | 0.440 | 3.115 | 0.412 | 0.355 | 0.527 | 0.463 |
| Uganda | Other | 0.014 | 1.359 | 0.177 | 0.143 | 0.233 | 0.181 |
| Uganda | No Religion | 0.006 | 0.595 | 0.078 | 0.063 | 0.081 | 0.063 |
| Uganda | Traditional | 0.011 | 0.236 | 0.031 | 0.028 | 0.034 | 0.027 |
| Uganda | mean | 0.125 | 2.550 | 0.317 | 0.281 | 0.406 | 0.358 |
| Uganda | stdev | 0.192 | 1.575 | 0.190 | 0.174 | 0.250 | 0.228 |
| Uganda | min | 0.001 | 0.236 | 0.031 | 0.028 | 0.034 | 0.027 |
| Uganda | max | 0.440 | 4.301 | 0.484 | 0.458 | 0.651 | 0.572 |
| Zambia | Muslim | 0.005 | 6.249 | 0.742 | 0.683 | 0.600 | 0.610 |
| Zambia | Christian, Protestant | 0.702 | 5.714 | 0.706 | 0.646 | 0.584 | 0.559 |
| Zambia | Christian, Catholic | 0.222 | 5.742 | 0.699 | 0.645 | 0.571 | 0.552 |
| Zambia | Other | 0.045 | 5.127 | 0.665 | 0.589 | 0.515 | 0.503 |
| Zambia | No Religion | 0.027 | 3.628 | 0.466 | 0.423 | 0.334 | 0.355 |
| Zambia | mean | 0.200 | 5.292 | 0.656 | 0.597 | 0.521 | 0.516 |
| Zambia | stdev | 0.294 | 1.012 | 0.109 | 0.103 | 0.109 | 0.098 |
| Zambia | min | 0.005 | 3.628 | 0.466 | 0.423 | 0.334 | 0.355 |
| Zambia | max | 0.702 | 6.249 | 0.742 | 0.683 | 0.600 | 0.610 |

Summary statistics for each country are computed over the group-level variables shown, not over the underlying individual data.

C Further results on IM across ethnic groups
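The note to Table 22 states that the country-level summary rows are computed over the group-level values rather than over individuals. As a rough check of that reading, the sketch below recomputes the mean and standard deviation of IM (kids 14-18) for Botswana from the four rounded religion-level values printed in Table 22. The variable names are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption of this sketch; the paper does not state which formula it uses.

```python
import statistics

# IM, kids 14-18, for Botswana's four religious groups, copied from Table 22
# (Christian Unspecified, Other, No Religion, Muslim). Values are rounded.
botswana_im_14_18 = [0.890, 0.871, 0.814, 0.667]

# Unweighted statistics across groups (population shares are NOT used).
mean_im = statistics.mean(botswana_im_14_18)
sample_sd = statistics.stdev(botswana_im_14_18)       # divides by n - 1
population_sd = statistics.pstdev(botswana_im_14_18)  # divides by n

print(f"mean = {mean_im:.4f}")
print(f"sample stdev = {sample_sd:.4f}")
print(f"population stdev = {population_sd:.4f}")
```

With these rounded inputs, the unweighted mean is about 0.8105 and the sample standard deviation about 0.101, in line with the 0.810 and 0.101 reported in the Botswana summary rows, whereas the population formula gives about 0.087. This suggests, without confirming, that the stdev rows are unweighted sample standard deviations across groups.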

Table 23: Summary statistics and IM at the country × ethnicity level

ethnicity population mean old years share of literate share of literate IM, kids IM, kids share of schooling old, kids 14-18 old, kids 14-100 14-18 14-100

Botswana
Kalanga / Sekalaka         0.073       5.143      0.691      0.556      0.932     0.826
English                    0.017       11.777     0.976      0.918      0.923     0.933
Other Non-African          0.008       8.014      0.892      0.693      0.895     0.762
Setswana                   0.813       4.964      0.683      0.528      0.885     0.783
Sekgalagadi / Sengologa    0.029       3.282      0.521      0.370      0.839     0.713
Other African              0.030       4.901      0.592      0.488      0.761     0.699
Sembukushu                 0.016       1.256      0.213      0.138      0.744     0.621
Sesarwa                    0.015       0.925      0.146      0.096      0.537     0.424
mean                       0.125       5.033      0.589      0.473      0.814     0.720
stdev                      0.279       3.558      0.294      0.273      0.132     0.151
min                        0.008       0.925      0.146      0.096      0.537     0.424
max                        0.813       11.777     0.976      0.918      0.932     0.933

Burkina Faso
French                     0.016       9.348      0.834      0.811      0.803     0.874
Dioula                     0.085       2.540      0.301      0.314      0.339     0.372
Mossi                      0.533       0.968      0.099      0.116      0.214     0.238
Samo                       0.020       1.354      0.164      0.167      0.207     0.222
Bissa                      0.031       0.533      0.052      0.057      0.188     0.193
Gurunsi                    0.065       0.771      0.086      0.092      0.178     0.193
Bwa                        0.018       1.177      0.154      0.164      0.158     0.181
Bobo                       0.025       0.659      0.075      0.077      0.146     0.167
Other IPUMS                0.049       0.432      0.042      0.045      0.130     0.140
Senoufo                    0.011       0.395      0.045      0.046      0.111     0.112
Other small                0.022       0.507      0.054      0.062      0.107     0.114
Gurma                      0.050       0.356      0.038      0.043      0.096     0.103
Peul                       0.077       0.145      0.017      0.019      0.046     0.048
mean                       0.077       1.476      0.151      0.155      0.210     0.228
stdev                      0.139       2.445      0.219      0.212      0.192     0.210
min                        0.011       0.145      0.017      0.019      0.046     0.048
max                        0.533       9.348      0.834      0.811      0.803     0.874

Ethiopia
Amhara                     0.350       1.500      0.127      0.117      0.264     0.290
Gurage                     0.020       0.599      0.056      0.047      0.245     0.254
Hadiya                     0.022       0.687      0.068      0.057      0.208     0.262
Welayta                    0.026       0.629      0.082      0.068      0.183     0.229
Tigray                     0.066       0.515      0.049      0.043      0.179     0.190
Gamo                       0.014       0.352      0.038      0.030      0.135     0.140
Sidama                     0.031       0.582      0.075      0.064      0.133     0.164
Other small                0.100       0.438      0.043      0.037      0.129     0.145
Oromo                      0.324       0.480      0.041      0.033      0.127     0.148
Silte                      0.016       0.402      0.025      0.020      0.113     0.134
Somali                     0.030       0.208      0.029      0.026      0.040     0.050
mean                       0.091       0.581      0.058      0.049      0.160     0.182
stdev                      0.125       0.334      0.029      0.027      0.064     0.071
min                        0.014       0.208      0.025      0.020      0.040     0.050
max                        0.350       1.500      0.127      0.117      0.264     0.290

Ghana
Akan                       0.469       6.317      0.703      0.622      0.765     0.673
Ewe                        0.130       5.692      0.628      0.562      0.695     0.627
Ga-Dangme                  0.079       6.195      0.655      0.593      0.673     0.595
Mande                      0.012       1.748      0.250      0.200      0.643     0.501
Guan                       0.042       4.326      0.491      0.429      0.631     0.523
Other IPUMS                0.024       2.283      0.299      0.246      0.609     0.494
Grusi                      0.026       2.084      0.260      0.219      0.597     0.469
Mole-Dagbani               0.169       1.607      0.207      0.167      0.552     0.402
Gurma                      0.049       1.025      0.135      0.115      0.456     0.363
mean                       0.111       3.475      0.403      0.350      0.624     0.516
stdev                      0.144       2.149      0.217      0.201      0.088     0.102
min                        0.012       1.025      0.135      0.115      0.456     0.363
max                        0.469       6.317      0.703      0.622      0.765     0.673

Liberia
Mende                      0.014       3.991      0.389      0.401      0.498     0.452
Kissi                      0.049       2.635      0.354      0.301      0.476     0.426
Sapo                       0.016       3.732      0.529      0.434      0.475     0.521
Mandingo                   0.030       2.381      0.299      0.260      0.469     0.442
Grebo                      0.116       4.474      0.573      0.509      0.433     0.462
Krahn                      0.056       3.010      0.426      0.334      0.432     0.502
Lorma                      0.049       4.082      0.546      0.439      0.429     0.459
Other small                0.027       4.739      0.537      0.470      0.429     0.450
Gola                       0.042       3.366      0.426      0.362      0.425     0.405
Vai                        0.034       4.071      0.502      0.433      0.424     0.407
Gbandi                     0.030       3.196      0.403      0.339      0.410     0.406
Kru                        0.068       5.014      0.618      0.545      0.402     0.464
Mano                       0.093       3.683      0.549      0.433      0.339     0.427
Gio                        0.080       3.169      0.463      0.360      0.335     0.427
Kpelle                     0.177       3.143      0.428      0.355      0.274     0.305
Bassa                      0.121       3.708      0.488      0.416      0.261     0.310
mean                       0.063       3.650      0.471      0.399      0.407     0.429
stdev                      0.045       0.732      0.087      0.075      0.070     0.057
min                        0.014       2.381      0.299      0.260      0.261     0.305
max                        0.177       5.014      0.618      0.545      0.498     0.521

Malawi
Tumbuka                    0.097       6.260      0.589      0.568      0.624     0.576
Tonga                      0.022       5.524      0.533      0.495      0.617     0.546
Other small                0.009       5.813      0.528      0.509      0.605     0.542
Ngonde                     0.011       5.915      0.594      0.556      0.584     0.557
Lomwe                      0.150       4.224      0.345      0.344      0.533     0.484
Other IPUMS                0.029       4.881      0.411      0.412      0.533     0.500
Ngoni                      0.124       4.834      0.406      0.404      0.516     0.478
Nyanja                     0.055       3.800      0.316      0.309      0.485     0.448
Sena                       0.040       3.163      0.298      0.288      0.429     0.402
Chewa                      0.345       3.812      0.300      0.306      0.421     0.391
Yao                        0.118       3.278      0.270      0.267      0.421     0.386
mean                       0.091       4.682      0.417      0.405      0.524     0.483
stdev                      0.098       1.099      0.123      0.111      0.078     0.069
min                        0.009       3.163      0.270      0.267      0.421     0.386
max                        0.345       6.260      0.594      0.568      0.624     0.576

Mali
Bambara, Malinke, Dioula   0.541       1.365      0.156      0.145      0.320     0.264
Bobo, Dafing               0.022       0.977      0.113      0.108      0.286     0.235
Kassonge                   0.015       0.941      0.110      0.102      0.282     0.234
Senoufo                    0.026       0.613      0.066      0.058      0.265     0.211
Songhai, Djerma            0.059       1.074      0.146      0.118      0.224     0.179
Marka, Soninke             0.065       0.499      0.052      0.046      0.217     0.158
Minianka                   0.041       0.504      0.052      0.046      0.214     0.168
Dagon, Kado                0.062       0.497      0.058      0.053      0.196     0.144
Other small                0.016       0.869      0.103      0.087      0.191     0.153
Pulaar, Peulh, Fulbe       0.093       0.486      0.059      0.055      0.098     0.083
Tamasheq, Bellah           0.033       0.383      0.055      0.044      0.094     0.067
Bozo, Somono               0.016       0.362      0.047      0.041      0.091     0.070
Maure                      0.012       0.259      0.036      0.031      0.067     0.049
mean                       0.077       0.679      0.081      0.072      0.196     0.155
stdev                      0.142       0.332      0.040      0.036      0.084     0.071
min                        0.012       0.259      0.036      0.031      0.067     0.049
max                        0.541       1.365      0.156      0.145      0.320     0.264

Morocco
Other small                0.013       0.955      0.159      0.098      0.544     0.452
Arabic                     0.759       1.208      0.165      0.114      0.478     0.402
Tarifite                   0.043       0.513      0.072      0.047      0.411     0.281
Tmazight                   0.073       0.434      0.062      0.038      0.397     0.299
Tchalhit                   0.112       0.397      0.047      0.033      0.363     0.262
mean                       0.200       0.701      0.101      0.066      0.438     0.339
stdev                      0.315       0.361      0.056      0.037      0.072     0.083
min                        0.013       0.397      0.047      0.033      0.363     0.262
max                        0.759       1.208      0.165      0.114      0.544     0.452


Mozambique
Portuguese                 0.179       4.423      0.475      0.451      0.332     0.482
Bitonga                    0.015       2.038      0.152      0.124      0.326     0.354
Xirhonga                   0.024       2.481      0.215      0.161      0.323     0.366
Xitsonga                   0.164       1.913      0.147      0.111      0.231     0.253
Cichopi                    0.022       1.823      0.103      0.088      0.212     0.245
Xitswa                     0.049       1.405      0.096      0.074      0.199     0.210
Cinyungwe                  0.030       1.802      0.162      0.152      0.176     0.245
Chitewe                    0.021       1.811      0.165      0.131      0.165     0.230
Cishona                    0.014       2.122      0.200      0.166      0.162     0.245
Sena                       0.077       1.364      0.108      0.099      0.117     0.169
Echuwabo                   0.037       1.598      0.116      0.114      0.112     0.153
Cindau                     0.060       1.042      0.076      0.060      0.112     0.142
Other small                0.022       1.895      0.166      0.149      0.109     0.176
Emakhuwa                   0.152       1.717      0.125      0.116      0.075     0.107
Shimakonde                 0.019       1.318      0.139      0.097      0.069     0.110
Elomwe                     0.050       1.629      0.068      0.061      0.054     0.074
Cinyanja                   0.049       1.247      0.073      0.070      0.052     0.081
Ciyao                      0.016       1.209      0.100      0.093      0.049     0.077
mean                       0.056       1.824      0.149      0.129      0.160     0.207
stdev                      0.053       0.744      0.091      0.087      0.095     0.111
min                        0.014       1.042      0.068      0.060      0.049     0.074
max                        0.179       4.423      0.475      0.451      0.332     0.482

Senegal
Jola                       0.060       2.108      0.345      0.273      0.611     0.524
Lebou                      0.020       3.525      0.539      0.432      0.478     0.521
Other IPUMS                0.012       3.899      0.426      0.402      0.389     0.393
Soninke                    0.015       1.691      0.236      0.194      0.360     0.347
Serer                      0.155       1.149      0.189      0.141      0.347     0.288
Mandinka                   0.053       1.405      0.210      0.172      0.336     0.300
Toucouleur                 0.061       1.714      0.236      0.209      0.315     0.319
Other small                0.015       1.043      0.176      0.131      0.275     0.252
Wolof                      0.435       1.289      0.196      0.159      0.252     0.239
Fula                       0.173       0.627      0.097      0.076      0.205     0.167
mean                       0.100       1.845      0.265      0.219      0.357     0.335
stdev                      0.131       1.068      0.133      0.117      0.117     0.116
min                        0.012       0.627      0.097      0.076      0.205     0.167
max                        0.435       3.899      0.539      0.432      0.611     0.524

Sierra Leone
Krio                       0.018       8.124      0.857      0.847      0.603     0.626
Fullah                     0.035       1.971      0.272      0.243      0.554     0.476
Mandingo                   0.023       2.700      0.350      0.316      0.540     0.444
Temne                      0.309       1.608      0.257      0.203      0.449     0.324
Kono                       0.046       1.854      0.296      0.232      0.437     0.311
Kissi                      0.029       1.293      0.204      0.175      0.427     0.340
Mende                      0.318       2.351      0.349      0.286      0.426     0.317
Loko                       0.026       1.859      0.296      0.225      0.424     0.319
Limba                      0.090       1.533      0.247      0.194      0.370     0.303
Susu/Yalunka               0.037       1.386      0.202      0.170      0.366     0.295
Sherbro                    0.021       2.544      0.371      0.292      0.349     0.255
Other small                0.007       3.705      0.522      0.468      0.282     0.263
Koranko                    0.042       0.615      0.106      0.078      0.181     0.138
mean                       0.077       2.426      0.333      0.287      0.416     0.339
stdev                      0.107       1.874      0.186      0.192      0.113     0.119
min                        0.007       0.615      0.106      0.078      0.181     0.138
max                        0.318       8.124      0.857      0.847      0.603     0.626

South Africa
Tshivenda                  0.024       4.776      0.627      0.494      0.941     0.850
Ndebele                    0.020       4.256      0.592      0.451      0.937     0.823
English                    0.083       9.529      0.966      0.911      0.936     0.903
Sepedi                     0.097       4.864      0.612      0.507      0.927     0.814
Siswati                    0.027       4.414      0.558      0.462      0.910     0.818
Sesotho                    0.078       5.853      0.714      0.621      0.907     0.812
Xitsonga                   0.042       4.168      0.527      0.438      0.905     0.799
Zulu                       0.247       5.110      0.602      0.525      0.882     0.763
Afrikaans                  0.129       8.161      0.894      0.835      0.866     0.781
Setswana                   0.086       5.710      0.693      0.605      0.831     0.721
Xhosa                      0.163       5.493      0.651      0.592      0.826     0.721
Other IPUMS                0.005       7.664      0.803      0.741      0.823     0.786
mean                       0.083       5.833      0.687      0.598      0.891     0.799
stdev                      0.070       1.717      0.136      0.155      0.045     0.051
min                        0.005       4.168      0.527      0.438      0.823     0.721
max                        0.247       9.529      0.966      0.911      0.941     0.903

Uganda
Bagisu                     0.037       4.103      0.539      0.485      0.725     0.626
Baganda                    0.139       6.056      0.693      0.641      0.698     0.637
Basoga                     0.071       3.872      0.504      0.436      0.681     0.572
Langi                      0.054       3.356      0.479      0.433      0.664     0.602
Bakhonzo                   0.031       2.266      0.285      0.246      0.649     0.553
Jopadhola                  0.012       3.500      0.472      0.415      0.642     0.563
                           0.012       3.096      0.411      0.359      0.632     0.529
Iteso                      0.059       3.471      0.459      0.400      0.628     0.549
Banyoro                    0.032       4.215      0.537      0.460      0.617     0.538
Bagwere                    0.015       3.281      0.438      0.382      0.615     0.510
Banyakole                  0.115       2.551      0.326      0.272      0.560     0.510
Lugbara                    0.044       2.535      0.349      0.294      0.558     0.492
Acholi                     0.053       3.075      0.431      0.392      0.557     0.519
Other small                0.099       2.883      0.381      0.329      0.556     0.491
Batoro                     0.024       3.502      0.423      0.369      0.547     0.465
Madi                       0.018       2.882      0.395      0.337      0.519     0.473
Banyarwanda                0.014       2.085      0.242      0.207      0.490     0.408
Bakiga                     0.088       2.214      0.279      0.246      0.480     0.449
Bafumbir                   0.017       2.005      0.228      0.221      0.439     0.413
Alur                       0.024       2.638      0.364      0.312      0.404     0.362
Karamojong                 0.043       0.389      0.056      0.046      0.047     0.047
mean                       0.048       3.046      0.395      0.347      0.557     0.491
stdev                      0.036       1.097      0.135      0.123      0.145     0.124
min                        0.012       0.389      0.056      0.046      0.047     0.047
max                        0.139       6.056      0.693      0.641      0.725     0.637

Zambia
Kaonde                     0.031       4.696      0.623      0.543      0.628     0.533
Tonga                      0.167       5.111      0.661      0.591      0.622     0.567
Tumbuka                    0.051       5.242      0.647      0.591      0.585     0.555
Mambwe                     0.061       4.993      0.623      0.573      0.575     0.530
Bemba                      0.249       5.665      0.680      0.635      0.575     0.554
Other IPUMS                0.022       4.611      0.574      0.525      0.562     0.507
Lamba                      0.022       4.839      0.618      0.551      0.551     0.522
Lozi                       0.081       4.976      0.625      0.551      0.549     0.509
Lunda                      0.076       3.877      0.528      0.459      0.547     0.478
Lala                       0.056       4.427      0.578      0.512      0.530     0.483
Nyanja                     0.185       4.562      0.580      0.520      0.491     0.481
mean                       0.091       4.818      0.612      0.550      0.565     0.520
stdev                      0.075       0.469      0.044      0.047      0.039     0.031
min                        0.022       3.877      0.528      0.459      0.491     0.478
max                        0.249       5.665      0.680      0.635      0.628     0.567

There are several types of “Other” groups. (1) “Other IPUMS” denotes observations that did not have an ethnic or language identifier but were labelled by the census itself as “other”. (2) “Other small” denotes groups that individually did not account for 1% of the population and were therefore classified by us as “Other”. (3) “Other African” denotes nationals of other African countries without an ethnic identifier, such as “Liberian”. (4) “Other non-African” denotes nationals of non-African countries, such as “Indian”. Summary statistics for each country are computed for the computed variables, not the underlying individual data.

D Literacy and IM

Table 24: Literacy and IM at the ethnicity-level, rural/urban, female/male heterogeneity

                     (1)        (2)        (3)        (4)        (5)        (6)
                     IM         IM         IM         IM         IM         IM

                     Panel A: rural subsample         Panel C: female subsample
share literate old   0.505∗∗∗   0.639∗∗∗   0.814∗∗∗   0.595∗∗∗   0.699∗∗∗   0.818∗∗∗
                     (15.36)    (19.10)    (20.53)    (12.71)    (14.16)    (20.85)
R-squared            0.539      0.684      0.599      0.474      0.547      0.623
N                    158        158        158        171        171        171

                     Panel B: urban subsample         Panel D: male subsample
share literate old   0.442∗∗∗   0.733∗∗∗   0.721∗∗∗   0.435∗∗∗   0.601∗∗∗   0.626∗∗∗
                     (6.68)     (8.62)     (9.10)     (11.46)    (17.37)    (16.28)
R-squared            0.191      0.372      0.334      0.351      0.589      0.478
N                    158        158        158        171        171        171

age-range            14-18      14-100     14-18      14-18      14-100     14-18
student correction   no         no         yes        no         no         yes

The dependent variable is the ethnicity-level share of literate kids of illiterate parents (estimated net of census year and old and young birth decade fixed effects). The independent variable is the ethnicity-level share of literate parents (also estimated net of fixed effects). t-statistics based on country-clustered standard errors in parentheses. ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01.
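The two variables in Table 24 are group-level shares built from individual records. The following Python sketch shows one way to form them as raw shares. It is only an illustration under simplifying assumptions: it skips the netting-out of census-year and birth-decade fixed effects described in the note, and the function name group_mobility, the record layout, and the toy records are hypothetical.

```python
from collections import defaultdict


def group_mobility(records):
    """Compute raw group-level shares from individual records.

    records: iterable of (group, parent_literate, child_literate) tuples.
    Returns, per group, the share of literate parents and IM, defined here
    as the share of literate children among children of illiterate parents.
    Assumption: no fixed effects are removed, unlike in the paper.
    """
    n_all = defaultdict(int)
    n_parent_literate = defaultdict(int)
    n_parent_illiterate = defaultdict(int)
    n_child_upward = defaultdict(int)

    for group, parent_literate, child_literate in records:
        n_all[group] += 1
        if parent_literate:
            n_parent_literate[group] += 1
        else:
            n_parent_illiterate[group] += 1
            if child_literate:
                n_child_upward[group] += 1

    result = {}
    for group in n_all:
        illiterate = n_parent_illiterate[group]
        result[group] = {
            "share_literate_old": n_parent_literate[group] / n_all[group],
            # IM is undefined when no parent in the group is illiterate.
            "IM": n_child_upward[group] / illiterate if illiterate else None,
        }
    return result


# Toy records, invented for illustration only.
toy_records = [
    ("A", False, True),
    ("A", False, False),
    ("A", True, True),
    ("A", False, True),
    ("B", True, True),
    ("B", True, False),
]
print(group_mobility(toy_records))
```

Regressing the resulting IM values on share_literate_old across groups then gives the kind of slope reported in the table, up to the fixed-effects adjustment omitted here.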

E Correlates of ethnicity-specific IM

E.1 Variable definitions, sources, and summary statistics

EPR variables: Taken from Vogt et al. (2015). Variable definitions are given in the main text.

slavery: Taken from Nunn (2008). Defined as

\[
\text{slavery} = \ln\left(\frac{1 + s_{\text{atlantic}} + s_{\text{indian}}}{\text{land area}}\right),
\]

where $s_{\text{atlantic}}$ and $s_{\text{indian}}$ are the total number of slaves taken from the ethnic group in all years via the Atlantic and Indian Ocean slave trades, and land area is the land area of the group’s ancestral homeland obtained from the map in Murdock (1959).

split: Whether the ancestral homeland of an ethnic group is split between two modern states in such a way that at least 10 percent of the total land area falls on either side of the border. Obtained by intersecting the map in Murdock (1959) with modern national boundaries as, e.g., in Michalopoulos and Papaioannou (2016).

pastoralism: Taken from Murdock (1967), equal to one if pastoralism contributes most to pre-colonial subsistence economy, zero otherwise.

agriculture: Taken from Murdock (1967), equal to one if agriculture (either intensive, extensive, or type unknown) contributes most to pre-colonial subsistence economy, zero otherwise.

polygyny: Taken from Murdock (1967), equal to one if the variable “marital composition: monogamy and polygamy” is coded as either “Preferentially sororal, cowives in same dwelling”, “Preferentially sororal, cowives in separate dwellings”, “Non-sororal, cowives in separate dwellings”, or “Non-sororal, cowives in same dwelling”.

bride price: Taken from Murdock (1967), equal to one if the variable “mode of marriage, primary” is coded as “Bride Price or wealth, to bride’s family”.

patrilineal: Taken from Murdock (1967), equal to one if the variable “Descent: Major Type” is coded as “Patrilineal”. It is zero if coded as “Matrilineal” and set to missing otherwise.

pre-colonial centralization: Taken from Murdock (1967), equal to the variable “Jurisdictional Hierarchy Beyond Local Community”, which takes on categories 1: no political authority beyond local community, 2: petty chiefdoms, 3: larger chiefdoms, 4: states, 5: large states.
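The slavery and split definitions above can be sketched in a few lines of Python. This is an illustration only: the function names, the argument layout, and the example figures are hypothetical, the unit of land area is not stated in this section, and the split rule below is one reading of the 10 percent condition (the second-largest country share of the homeland is at least 10 percent).

```python
import math


def slavery_measure(s_atlantic, s_indian, land_area):
    """ln((1 + s_atlantic + s_indian) / land_area), as in the definition above.

    land_area must be positive; its unit is an assumption of the caller.
    """
    return math.log((1 + s_atlantic + s_indian) / land_area)


def is_split(area_by_country, threshold=0.10):
    """Return True if at least two countries each hold >= threshold of the homeland.

    area_by_country: mapping from country name to homeland area in that country.
    Assumption: 'split' means the second-largest share reaches the threshold.
    """
    total = sum(area_by_country.values())
    shares = sorted((a / total for a in area_by_country.values()), reverse=True)
    return len(shares) >= 2 and shares[1] >= threshold


# Invented example values, not taken from the data.
print(slavery_measure(0, 0, 1000.0))        # a group with no recorded exports
print(is_split({"X": 60.0, "Y": 40.0}))     # 40 percent on the smaller side
print(is_split({"X": 95.0, "Y": 5.0}))      # only 5 percent on the smaller side
```

Adding one inside the logarithm keeps the measure defined for groups with no recorded slave exports, for which it reduces to minus the log of land area.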

Table 25: Summary statistics for correlates

variable                     N     mean     min       max    standard deviation

EPR
monopoly                     131   0.076    0         1      0.267
dominance                    131   0.115    0         1      0.32
monopoly / dominance         131   0.191    0         1      0.394
discriminated                131   0.168    0         1      0.375
powerless                    131   0.351    0         1      0.479
discriminated / powerless    131   0.427    0         1      0.497
ethnic war                   131   0.198    0         1      0.4

History
slavery                      129   -5.161   -12.578   3.63   4.726
split ethnicity              130   0.523    0         1      0.501
pastoralism                  135   0.044    0         1      0.207
agriculture                  135   0.919    0         1      0.275
polygyny                     132   0.848    0         1      0.36
bride price                  135   0.77     0         1      0.422
patrilineal                  116   0.828    0         1      0.379
pre-colonial centralization  129   2.504    1         5      0.894

E.2 More results

E.2.1 EPR

Table 26: Ethnicity-level correlates of the share of literate old and IM, ages 14-100, with student correction, Ethnic Power Relations

Columns (1) and (4): share literate old. Columns (2) and (5): IM. Columns (3) and (6): IM controlling for share literate old.

          status indicator                      number of years in status
variable  (1)        (2)        (3)        N    (4)        (5)        (6)        N

MONOP     0.042∗∗∗   0.124∗∗∗   0.096∗∗∗   131  0.042∗∗∗   0.124∗∗∗   0.096∗∗∗   131
          (0.000)    (0.000)    (0.003)         (0.000)    (0.000)    (0.003)
DOMIN     0.013      0.013      0.005      131  0.000      0.005      0.005      131
          (0.047)    (0.036)    (0.017)         (0.046)    (0.052)    (0.025)
MONDOM    0.023      0.041      0.026      131  0.013      0.044      0.035      131
          (0.050)    (0.046)    (0.029)         (0.052)    (0.065)    (0.036)
DISC      -0.038∗∗∗  -0.042     -0.017     131  -0.009     -0.054∗∗   -0.049∗∗∗  131
          (0.001)    (0.045)    (0.046)         (0.014)    (0.023)    (0.015)
PWLESS    0.018      -0.008     -0.020     131  -0.069     -0.066     -0.019     131
          (0.101)    (0.082)    (0.033)         (0.101)    (0.078)    (0.033)
DISCPWLS  -0.031     -0.076     -0.055∗∗   131  -0.054     -0.079     -0.043     131
          (0.089)    (0.064)    (0.025)         (0.074)    (0.060)    (0.032)
ETWAR     0.030      0.029      0.009      131  0.018∗∗∗   0.006      -0.007     131
          (0.024)    (0.037)    (0.032)         (0.007)    (0.031)    (0.028)

This is not a normal regression table. In the columns entitled “share literate old” the dependent variable is the ethnicity share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the columns entitled “IM” it is the ethnicity share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old), which is also the LHS in the columns entitled “IM controlling for share literate old”. Each row shows the results of regressions of these LHS variables on one RHS variable (indicated in the rows) at a time (the exceptions are SIZE and SIZE2, which enter jointly). The regressions in the two columns “IM controlling for share literate old” additionally control for the share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old), that is, they include the LHS variable of the columns “share literate old” on the RHS. All RHS variables enter in the transformation indicated in the top row. All specifications include country fixed effects (not reported). Coefficients are standardized. MONOP = EPR status of monopoly, DOMIN = EPR status of dominant, MONDOM = EPR status of monopoly or dominant, DISC = EPR status of discriminated, PWLESS = EPR status of powerless, DISCPWLS = EPR status of discriminated or powerless, ETWAR = ethnic war. Standard errors clustered at the country-level in parentheses. ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. lines indicate that variables remain significantly correlated with IM when we control for the share of literate parents.
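The one-regressor-at-a-time layout described in the note to Table 26 can be sketched with NumPy as follows. This is an illustrative sketch, not the authors' code: it assumes that "standardized" means scaling every variable to unit sample variance, it absorbs country fixed effects by demeaning within country, and it uses a plain cluster-robust sandwich estimator without a small-sample correction. The names standardize, demean_by, and fe_regression are hypothetical, and the demo data are synthetic.

```python
import numpy as np


def standardize(v):
    """Scale to zero mean and unit sample variance (an assumption, see lead-in)."""
    v = np.asarray(v, dtype=float)
    return (v - v.mean()) / v.std(ddof=1)


def demean_by(v, groups):
    """Subtract group means; equivalent to including group fixed effects."""
    out = np.asarray(v, dtype=float).copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= out[mask].mean()
    return out


def fe_regression(y, x, country, control=None):
    """Regress y on x with country fixed effects, optionally adding one control.

    Returns (coefficient on x, country-clustered standard error).
    """
    columns = [standardize(x)]
    if control is not None:
        columns.append(standardize(control))
    X = np.column_stack([demean_by(c, country) for c in columns])
    y_within = demean_by(standardize(y), country)

    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y_within
    resid = y_within - X @ beta

    # Cluster-robust "meat": sum over countries of the outer product of scores.
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(country):
        mask = country == g
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)
    cov = xtx_inv @ meat @ xtx_inv
    return beta[0], float(np.sqrt(cov[0, 0]))


# Synthetic demo: 5 countries with 20 groups each (not the paper's data).
rng = np.random.default_rng(0)
country = np.repeat(np.arange(5), 20)
status = rng.normal(size=100)
share_old = rng.normal(size=100)
im = 0.3 * status + 0.5 * share_old + rng.normal(scale=0.5, size=100)

print(fe_regression(share_old, status, country))               # "share literate old"
print(fe_regression(im, status, country))                      # "IM"
print(fe_regression(im, status, country, control=share_old))   # "IM controlling for ..."
```

The three calls mirror the three column types of the table: the same right-hand-side variable is used each time, and only the dependent variable and the extra control change.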
Table 27: Ethnicity-level correlates of the share of literate old and IM, ages 14-18, with student correction, Ethnic Power Relations, urban-rural heterogeneity

Columns (1) and (4): share literate old. Columns (2) and (5): IM. Columns (3) and (6): IM controlling for share literate old.

          status indicator                      number of years in status
variable  (1)        (2)        (3)        N    (4)        (5)        (6)        N

urban
MONOP     0.181∗∗∗   0.214∗∗∗   0.161∗∗∗   121  0.181∗∗∗   0.214∗∗∗   0.161∗∗∗   121
          (0.000)    (0.000)    (0.021)         (0.000)    (0.000)    (0.021)
DOMIN     0.043      0.030      0.017      121  0.030      0.046      0.036      121
          (0.079)    (0.064)    (0.061)         (0.077)    (0.092)    (0.071)
MONDOM    0.086      0.080      0.054      121  0.097      0.125      0.097      121
          (0.088)    (0.083)    (0.071)         (0.095)    (0.108)    (0.082)
DISC      -0.007     -0.038     -0.036     121  -0.027     -0.035∗∗∗  -0.027∗∗∗  121
          (0.033)    (0.026)    (0.023)         (0.020)    (0.006)    (0.009)
PWLESS    0.111∗     -0.013     -0.048     121  0.031      -0.023     -0.033     121
          (0.066)    (0.057)    (0.043)         (0.073)    (0.073)    (0.052)
DISCPWLS  0.046      -0.063     -0.078     121  0.004      -0.038     -0.040     121
          (0.062)    (0.058)    (0.048)         (0.057)    (0.050)    (0.035)
ETWAR     0.045      0.037      0.023      121  0.034∗∗∗   0.037      0.026      121
          (0.035)    (0.035)    (0.031)         (0.007)    (0.034)    (0.033)

rural
MONOP     0.018∗∗∗   0.157∗∗∗   0.147∗∗∗   121  0.018∗∗∗   0.157∗∗∗   0.147∗∗∗   121
          (0.000)    (0.000)    (0.002)         (0.000)    (0.000)    (0.002)
DOMIN     0.017      -0.015     -0.024     121  -0.023     -0.036     -0.023     121
          (0.067)    (0.030)    (0.019)         (0.047)    (0.031)    (0.018)
MONDOM    0.022      0.021      0.008      121  -0.019     0.014      0.026      121
          (0.070)    (0.049)    (0.046)         (0.055)    (0.071)    (0.055)
DISC      -0.025     -0.056     -0.042     121  -0.007     -0.065∗∗∗  -0.061∗∗∗  121
          (0.035)    (0.054)    (0.067)         (0.019)    (0.021)    (0.015)
PWLESS    0.113∗∗    0.038      -0.029     121  0.047      0.007      -0.020     121
          (0.045)    (0.073)    (0.053)         (0.037)    (0.065)    (0.051)
DISCPWLS  0.050      -0.064     -0.094∗∗∗  121  0.027      -0.037     -0.053     121
          (0.050)    (0.046)    (0.031)         (0.036)    (0.056)    (0.040)
ETWAR     0.039      -0.002     -0.025     121  0.009      -0.029     -0.034     121
          (0.033)    (0.055)    (0.047)         (0.015)    (0.036)    (0.032)

This is not a normal regression table. In the columns entitled “share literate old” the dependent variable is the ethnicity share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the columns entitled “IM” it is the ethnicity share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old), which is also the LHS in the columns entitled “IM controlling for share literate old”. Each row shows the results of regressions of these LHS variables on one RHS variable (indicated in the rows) at a time (the exceptions are SIZE and SIZE2, which enter jointly). The regressions in the two columns “IM controlling for share literate old” additionally control for the share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old), that is, they include the LHS variable of the columns “share literate old” on the RHS. All RHS variables enter in the transformation indicated in the top row. All specifications include country fixed effects (not reported). Coefficients are standardized. MONOP = EPR status of monopoly, DOMIN = EPR status of dominant, MONDOM = EPR status of monopoly or dominant, DISC = EPR status of discriminated, PWLESS = EPR status of powerless, DISCPWLS = EPR status of discriminated or powerless, ETWAR = ethnic war. Standard errors clustered at the country-level in parentheses. ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. lines indicate that variables remain significantly correlated with IM when we control for the share of literate parents.

Table 28: Ethnicity-level correlates of the share of literate old and IM, ages 14-100, with student correction, Ethnic Power Relations, urban-rural heterogeneity

Columns (1) and (4): share literate old. Columns (2) and (5): IM. Columns (3) and (6): IM controlling for share literate old.

          status indicator                      number of years in status
variable  (1)        (2)        (3)        N    (4)        (5)        (6)        N

urban
MONOP     0.146∗∗∗   0.195∗∗∗   0.134∗∗∗   121  0.146∗∗∗   0.195∗∗∗   0.134∗∗∗   121
          (0.000)    (0.000)    (0.011)         (0.000)    (0.000)    (0.011)
DOMIN     0.042      0.048      0.030      121  0.033      0.033      0.019      121
          (0.068)    (0.054)    (0.042)         (0.065)    (0.083)    (0.057)
MONDOM    0.078      0.095      0.062      121  0.087      0.105      0.068      121
          (0.075)    (0.067)    (0.050)         (0.077)    (0.100)    (0.068)
DISC      -0.014     -0.040∗∗   -0.034∗∗∗  121  -0.024     -0.037∗∗∗  -0.027∗∗∗  121
          (0.030)    (0.020)    (0.011)         (0.018)    (0.007)    (0.002)
PWLESS    0.114∗     0.029      -0.021     121  0.025      -0.004     -0.015     121
          (0.060)    (0.052)    (0.039)         (0.059)    (0.072)    (0.049)
DISCPWLS  0.043      -0.029     -0.047     121  0.002      -0.027     -0.027     121
          (0.057)    (0.054)    (0.039)         (0.047)    (0.051)    (0.033)
ETWAR     0.036      0.035      0.020      121  0.033∗∗∗   0.041∗∗∗   0.027∗∗    121
          (0.032)    (0.029)    (0.021)         (0.007)    (0.012)    (0.012)

rural
MONOP     0.015∗∗∗   0.100∗∗∗   0.090∗∗∗   121  0.015∗∗∗   0.100∗∗∗   0.090∗∗∗   121
          (0.000)    (0.000)    (0.001)         (0.000)    (0.000)    (0.001)
DOMIN     0.010      -0.009     -0.016     121  -0.023     -0.032     -0.016∗    121
          (0.057)    (0.032)    (0.013)         (0.040)    (0.026)    (0.009)
MONDOM    0.014      0.013      0.004      121  -0.021     -0.001     0.013      121
          (0.060)    (0.039)    (0.029)         (0.048)    (0.053)    (0.035)
DISC      -0.026     -0.031     -0.014     121  -0.005     -0.057∗∗   -0.053∗∗∗  121
          (0.020)    (0.053)    (0.051)         (0.016)    (0.027)    (0.016)
PWLESS    0.101∗∗∗   0.052      -0.018     121  0.042      0.013      -0.016     121
          (0.039)    (0.051)    (0.032)         (0.033)    (0.045)    (0.029)
DISCPWLS  0.042      -0.022     -0.051∗    121  0.025      -0.028     -0.045     121
          (0.042)    (0.044)    (0.028)         (0.031)    (0.049)    (0.031)
ETWAR     0.030      0.026      0.006      121  0.010      -0.008     -0.015     121
          (0.027)    (0.047)    (0.040)         (0.013)    (0.039)    (0.033)

This is not a normal regression table. In the columns entitled “share literate old” the dependent variable is the ethnicity share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the columns entitled “IM” it is the ethnicity share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old), which is also the LHS in the columns entitled “IM controlling for share literate old”. Each row shows the results of regressions of these LHS variables on one RHS variable (indicated in the rows) at a time (the exceptions are SIZE and SIZE2, which enter jointly). The regressions in the two columns “IM controlling for share literate old” additionally control for the share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old), that is, they include the LHS variable of the columns “share literate old” on the RHS. All RHS variables enter in the transformation indicated in the top row. All specifications include country fixed effects (not reported). Coefficients are standardized. MONOP = EPR status of monopoly, DOMIN = EPR status of dominant, MONDOM = EPR status of monopoly or dominant, DISC = EPR status of discriminated, PWLESS = EPR status of powerless, DISCPWLS = EPR status of discriminated or powerless, ETWAR = ethnic war. Standard errors clustered at the country-level in parentheses. ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. lines indicate that variables remain significantly correlated with IM when we control for the share of literate parents.

Table 29: Ethnicity-level correlates of the share of literate old and IM, ages 14-18, with student correction, Ethnic Power Relations, male-female heterogeneity

Columns (1) and (4): share literate old. Columns (2) and (5): IM. Columns (3) and (6): IM controlling for share literate old.

          status indicator                      number of years in status
variable  (1)        (2)        (3)        N    (4)        (5)        (6)        N

male
MONOP     0.047∗∗∗   0.207∗∗∗   0.180∗∗∗   131  0.047∗∗∗   0.207∗∗∗   0.180∗∗∗   131
          (0.000)    (0.000)    (0.004)         (0.000)    (0.000)    (0.004)
DOMIN     0.022      -0.005     -0.018     131  0.003      -0.018     -0.019     131
          (0.048)    (0.035)    (0.037)         (0.048)    (0.038)    (0.020)
MONDOM    0.034      0.040      0.020      131  0.017      0.044      0.034      131
          (0.051)    (0.061)    (0.061)         (0.055)    (0.075)    (0.059)
DISC      -0.043∗∗∗  -0.095∗∗   -0.070     131  -0.012     -0.090∗∗∗  -0.084∗∗∗  131
          (0.014)    (0.047)    (0.052)         (0.011)    (0.012)    (0.007)
PWLESS    0.012      -0.022     -0.029     131  -0.080     -0.067     -0.020     131
          (0.111)    (0.086)    (0.047)         (0.111)    (0.084)    (0.053)
DISCPWLS  -0.038     -0.111∗    -0.088∗∗   131  -0.064     -0.102     -0.065     131
          (0.099)    (0.067)    (0.036)         (0.081)    (0.064)    (0.045)
ETWAR     0.037      0.015      -0.007     131  0.018∗∗    -0.005     -0.016     131
          (0.026)    (0.062)    (0.057)         (0.008)    (0.052)    (0.049)

female
MONOP     0.044∗∗∗   0.123∗∗∗   0.098∗∗∗   131  0.044∗∗∗   0.123∗∗∗   0.098∗∗∗   131
          (0.000)    (0.000)    (0.005)         (0.000)    (0.000)    (0.005)
DOMIN     0.021      -0.006     -0.018     131  0.007      0.028      0.024      131
          (0.056)    (0.051)    (0.028)         (0.056)    (0.063)    (0.034)
MONDOM    0.032      0.020      0.002      131  0.021      0.070      0.058      131
          (0.059)    (0.061)    (0.041)         (0.063)    (0.073)    (0.040)
DISC      -0.027∗∗   -0.008     0.007      131  -0.005     -0.027     -0.024     131
          (0.013)    (0.059)    (0.066)         (0.018)    (0.030)    (0.021)
PWLESS    0.016      0.012      0.002      131  -0.070     -0.058     -0.018     131
          (0.114)    (0.095)    (0.068)         (0.115)    (0.068)    (0.049)
DISCPWLS  -0.034     -0.079     -0.060∗    131  -0.053     -0.057     -0.027     131
          (0.099)    (0.050)    (0.036)         (0.084)    (0.057)    (0.043)
ETWAR     0.039      -0.002     -0.024     131  0.017∗∗∗   -0.033     -0.043∗    131
          (0.028)    (0.037)    (0.031)         (0.007)    (0.022)    (0.022)

This is not a normal regression table. In the columns entitled “share literate old” the dependent variable is the ethnicity share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the columns entitled “IM” it is the ethnicity share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old), which is also the LHS in the columns entitled “IM controlling for share literate old”. Each row shows the results of regressions of these LHS variables on one RHS variable (indicated in the rows) at a time (the exceptions are SIZE and SIZE2, which enter jointly). The regressions in the two columns “IM controlling for share literate old” additionally control for the share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old), that is, they include the LHS variable of the columns “share literate old” on the RHS. All RHS variables enter in the transformation indicated in the top row. All specifications include country fixed effects (not reported). Coefficients are standardized. MONOP = EPR status of monopoly, DOMIN = EPR status of dominant, MONDOM = EPR status of monopoly or dominant, DISC = EPR status of discriminated, PWLESS = EPR status of powerless, DISCPWLS = EPR status of discriminated or powerless, ETWAR = ethnic war. Standard errors clustered at the country-level in parentheses. ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. lines indicate that variables remain significantly correlated with IM when we control for the share of literate parents.

Table 30: Ethnicity-level correlates of the share of literate old and IM, ages 14-100, with student correction, Ethnic Power Relations, male-female heterogeneity

status indicator number of years in status (1) (2) (3) (4) (5) (6) variable share literate old IM IM controlling for share literate old N share literate old IM IM controlling for share literate old N

male MONOP 0.036∗∗∗ 0.143∗∗∗ 0.116∗∗∗ 131 0.036∗∗∗ 0.143∗∗∗ 0.116∗∗∗ 131 (0.000) (0.000) (0.003) (0.000) (0.000) (0.003) DOMIN 0.009 0.013 0.007 131 -0.002 -0.017 -0.016 131 (0.041) (0.034) (0.027) (0.039) (0.045) (0.022) MONDOM 0.017 0.045 0.032 131 0.009 0.025 0.019 131 (0.043) (0.046) (0.039) (0.045) (0.067) (0.044) DISC -0.039∗∗∗ -0.081∗ -0.053 131 -0.009 -0.083∗∗∗ -0.077∗∗∗ 131 (0.002) (0.047) (0.049) (0.011) (0.016) (0.009) PWLESS 0.016 -0.016 -0.028 131 -0.065 -0.059 -0.011 131 (0.090) (0.079) (0.028) (0.090) (0.083) (0.037) DISCPWLS -0.029 -0.086 -0.064∗∗ 131 -0.051 -0.092 -0.055 131 (0.080) (0.071) (0.032) (0.065) (0.064) (0.039) ETWAR 0.026 0.035 0.016 131 0.014∗∗ 0.015 0.005 131 (0.023) (0.051) (0.045) (0.007) (0.050) (0.047)

female MONOP 0.043∗∗∗ 0.094∗∗∗ 0.064∗∗∗ 131 0.043∗∗∗ 0.094∗∗∗ 0.064∗∗∗ 131 70 (0.000) (0.000) (0.003) (0.000) (0.000) (0.003) DOMIN 0.015 0.009 -0.002 131 0.004 0.028 0.025 131 (0.048) (0.050) (0.025) (0.048) (0.063) (0.032) MONDOM 0.026 0.029 0.011 131 0.018 0.060 0.048 131 (0.051) (0.056) (0.032) (0.054) (0.071) (0.036) DISC -0.030∗∗∗ 0.002 0.023 131 -0.006 -0.019 -0.015 131 (0.002) (0.042) (0.042) (0.016) (0.030) (0.019) PWLESS 0.019 0.003 -0.011 131 -0.065 -0.071 -0.025 131 (0.100) (0.090) (0.046) (0.100) (0.075) (0.033) DISCPWLS -0.029 -0.065 -0.045∗∗ 131 -0.049 -0.062 -0.027 131 (0.087) (0.059) (0.023) (0.073) (0.060) (0.030) ETWAR 0.030 0.021 0.000 131 0.019∗∗∗ -0.011 -0.024 131 (0.023) (0.028) (0.025) (0.006) (0.018) (0.019) This is not a normal regression table. In the columns entitled “share literate old” the dependent variable is the ethnicity share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the columns entitled “IM” it is the ethnicity share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old), which is also the LHS in the columns entitled “IM controlling for share literate old”. Each row shows the results of regressions of these variabes on the LHS on one RHS variable (indicated in the rows) at a time (the exception are SIZE and SIZE2, which enter jointly). The regressions in the two columns “IM controlling for share literate old” additionally control for the share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old), that is they include the LHS variable of the columns “share literate old” on the RHS. All RHS variable enter in the transformation indicated in the top row. All specifications include country fixed effects (not reported). Coefficients are standardized. 
MONOP = EPR status of monopoly, DOMIN = EPR status of dominant, MONDOM = EPR status of monopoly or dominant, DISC = EPR status of discriminated, PWLESS = EPR status of powerless, DISCPWLS = EPR status of discriminated or powerless, ETWAR = ethnic war. Standard errors clustered at the country level in parentheses. ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. lines indicate that variables remain significantly correlated with IM when we control for the share of literate parents.

E.2.2 Historical Features

Table 31: Ethnicity-level correlates of the share of literate old and IM, ages 14-100, with student correction, historical features

variable   (1)        (2)        (3)        N
Columns: (1) share literate old; (2) IM; (3) IM controlling for share literate old.

Slavery
SLAVERY   -0.005     -0.012     -0.009     129
          (0.072)    (0.091)    (0.062)

Murdock
SPLIT     -0.113∗    -0.055      0.021     130
          (0.061)    (0.053)    (0.019)
PASTO     -0.030     -0.139∗∗   -0.121∗∗∗  135
          (0.085)    (0.065)    (0.028)
AGRI      -0.001      0.096      0.096∗∗   135
          (0.077)    (0.069)    (0.039)
POLYG     -0.068     -0.012      0.031     132
          (0.061)    (0.038)    (0.034)
BRIDEPR    0.045      0.081∗     0.053     135
          (0.079)    (0.044)    (0.038)
PATRI      0.118      0.170∗∗∗   0.099     116
          (0.089)    (0.051)    (0.066)
PCCENT     0.005      0.001     -0.001     129
          (0.031)    (0.030)    (0.029)

This is not a normal regression table. In the column entitled “share literate old” the dependent variable is the ethnicity share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the column entitled “IM” it is the ethnicity share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old), which is also the LHS in the column entitled “IM controlling for share literate old”. Each row shows the results of regressing these LHS variables on one RHS variable (indicated in the rows) at a time. The regressions in the column “IM controlling for share literate old” additionally control for the share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old); that is, they include the LHS variable of the column “share literate old” on the RHS. All specifications include country fixed effects (not reported). Coefficients are standardized.
SPLIT = ancestral homeland of ethnic group split between two modern states, SLAVERY = Nunn slavery measure, PASTO = pastoralism dominant pre-colonial mode of subsistence, AGRI = agriculture dominant pre-colonial mode of subsistence, POLYG = polygyny dummy, BRIDEPR = ethnic group practices bride price, PATRI = patrilineal group, PCCENT = levels of jurisdictional hierarchy beyond local community, PLOW = plow aboriginal.
Standard errors clustered at the country level in parentheses. ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. lines indicate that variables remain significantly correlated with IM when we control for the share of literate parents.
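
The row-by-row procedure described in the table notes (one RHS variable at a time, country fixed effects, standardized coefficients, country-clustered standard errors, and an optional control for the share of literate old) can be illustrated with a minimal sketch. This is not the authors' code. The function names are hypothetical, standardization is assumed to mean z-scoring both sides before estimation, and the clustered variance uses no finite-sample correction, so the numbers would not match standard software output exactly.

```python
from collections import defaultdict
from math import sqrt


def zscore(values):
    """Standardize to mean 0 and (sample) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def demean_by_group(values, groups):
    """Subtract group means; this absorbs country fixed effects."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def residualize(y, x):
    """Residuals of y after a no-intercept regression on x (both demeaned)."""
    sxx = sum(a * a for a in x)
    b = sum(a * c for a, c in zip(x, y)) / sxx
    return [c - b * a for a, c in zip(x, y)]


def one_row(y, x, country, control=None):
    """Standardized slope of y on x and its country-clustered standard error.

    y, x, control: one value per ethnic group; country: cluster labels.
    Assumes x varies within at least one country.
    """
    y = demean_by_group(zscore(y), country)
    x = demean_by_group(zscore(x), country)
    if control is not None:
        # Frisch-Waugh step: partial the control out of both sides.
        c = demean_by_group(zscore(control), country)
        y, x = residualize(y, c), residualize(x, c)
    sxx = sum(a * a for a in x)
    beta = sum(a * b for a, b in zip(x, y)) / sxx
    resid = [b - beta * a for a, b in zip(x, y)]
    # Sum the scores x_i * e_i within each country, then apply the sandwich formula.
    score = defaultdict(float)
    for a, e, g in zip(x, resid, country):
        score[g] += a * e
    se = sqrt(sum(s * s for s in score.values())) / sxx
    return beta, se
```

Under these assumptions, a call such as `one_row(im, pasto, country)` would correspond to a column (2) entry, and `one_row(im, pasto, country, control=share_literate_old)` to a column (3) entry, where `im`, `pasto`, `country` and `share_literate_old` are placeholder lists with one element per ethnic group.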

Table 32: Ethnicity-level correlates of the share of literate old and IM, ages 14-18, with student correction, historical features, urban-rural heterogeneity

           rural                                 urban
variable   (1)        (2)        (3)        N    (4)        (5)        (6)        N
Columns: (1), (4) share literate old; (2), (5) IM; (3), (6) IM controlling for share literate old.

Slavery
SLAVERY    0.006      0.032      0.029     124  -0.014     -0.020     -0.015     124
          (0.066)    (0.089)    (0.076)         (0.111)    (0.090)    (0.081)

Murdock
SPLIT     -0.099     -0.049      0.012     125  -0.145∗    -0.021      0.031     125
          (0.065)    (0.059)    (0.026)         (0.080)    (0.040)    (0.026)
PASTO     -0.041     -0.167∗∗   -0.145∗∗∗  130  -0.126     -0.163∗∗∗  -0.125∗∗∗  130
          (0.072)    (0.070)    (0.036)         (0.082)    (0.050)    (0.028)
AGRI       0.010      0.134∗∗    0.128∗∗∗  130   0.096      0.115∗     0.084∗∗   130
          (0.063)    (0.066)    (0.038)         (0.091)    (0.063)    (0.042)
POLYG     -0.011      0.035      0.042     127  -0.053     -0.029     -0.010     127
          (0.029)    (0.029)    (0.038)         (0.051)    (0.066)    (0.056)
BRIDEPR    0.071      0.098∗∗    0.059     130  -0.027      0.017      0.027     130
          (0.069)    (0.046)    (0.045)         (0.077)    (0.076)    (0.071)
PATRI      0.097      0.140∗∗    0.084     112  -0.036      0.127∗∗    0.139∗∗   112
          (0.090)    (0.054)    (0.073)         (0.097)    (0.060)    (0.070)
PCCENT    -0.020     -0.034     -0.025     124  -0.035      0.037      0.046     124
          (0.027)    (0.027)    (0.025)         (0.047)    (0.031)    (0.036)

This is not a normal regression table. In the columns entitled “share literate old” the dependent variable is the ethnicity share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the columns entitled “IM” it is the ethnicity share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old), which is also the LHS in the columns entitled “IM controlling for share literate old”. Each row shows the results of regressing these LHS variables on one RHS variable (indicated in the rows) at a time. The regressions in the two columns “IM controlling for share literate old” additionally control for the share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old); that is, they include the LHS variable of the columns “share literate old” on the RHS. All specifications include country fixed effects (not reported). Coefficients are standardized.
SPLIT = ancestral homeland of ethnic group split between two modern states, SLAVERY = Nunn slavery measure, PASTO = pastoralism dominant pre-colonial mode of subsistence, AGRI = agriculture dominant pre-colonial mode of subsistence, POLYG = polygyny dummy, BRIDEPR = ethnic group practices bride price, PATRI = patrilineal group, PCCENT = levels of jurisdictional hierarchy beyond local community, PLOW = plow aboriginal. Standard errors clustered at the country level in parentheses. ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. lines indicate that variables remain significantly correlated with IM when we control for the share of literate parents.

Table 33: Ethnicity-level correlates of the share of literate old and IM, ages 14-18, with student correction, historical features, male-female heterogeneity

           female                                male
variable   (1)        (2)        (3)        N    (4)        (5)        (6)        N
Columns: (1), (4) share literate old; (2), (5) IM; (3), (6) IM controlling for share literate old.

Slavery
SLAVERY   -0.010      0.000      0.006     129  -0.008      0.016      0.021     129
          (0.082)    (0.111)    (0.075)         (0.077)    (0.123)    (0.104)

Murdock
SPLIT     -0.122∗    -0.069      0.012     130  -0.121∗    -0.035      0.037     130
          (0.073)    (0.081)    (0.033)         (0.064)    (0.054)    (0.029)
PASTO     -0.047     -0.155∗∗   -0.127∗∗∗  135  -0.043     -0.199∗∗   -0.177∗∗∗  135
          (0.085)    (0.074)    (0.036)         (0.083)    (0.080)    (0.044)
AGRI       0.011      0.103      0.096∗∗   135   0.012      0.148∗     0.142∗∗   135
          (0.080)    (0.075)    (0.044)         (0.078)    (0.085)    (0.057)
POLYG     -0.067     -0.018      0.024     132  -0.060      0.023      0.056     132
          (0.055)    (0.038)    (0.042)         (0.055)    (0.047)    (0.050)
BRIDEPR    0.057      0.078      0.044     135   0.060      0.119∗∗    0.088∗    135
          (0.084)    (0.062)    (0.055)         (0.079)    (0.048)    (0.046)
PATRI      0.122      0.170∗∗    0.093     116   0.117      0.214∗∗∗   0.149∗    116
          (0.100)    (0.084)    (0.096)         (0.095)    (0.054)    (0.080)
PCCENT    -0.000      0.048      0.048∗∗   129   0.011     -0.044     -0.049     129
          (0.032)    (0.031)    (0.022)         (0.032)    (0.044)    (0.045)

This is not a normal regression table. In the columns entitled “share literate old” the dependent variable is the ethnicity share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the columns entitled “IM” it is the ethnicity share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old), which is also the LHS in the columns entitled “IM controlling for share literate old”. Each row shows the results of regressing these LHS variables on one RHS variable (indicated in the rows) at a time. The regressions in the two columns “IM controlling for share literate old” additionally control for the share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old); that is, they include the LHS variable of the columns “share literate old” on the RHS. All specifications include country fixed effects (not reported). Coefficients are standardized.
SPLIT = ancestral homeland of ethnic group split between two modern states, SLAVERY = Nunn slavery measure, PASTO = pastoralism dominant pre-colonial mode of subsistence, AGRI = agriculture dominant pre-colonial mode of subsistence, POLYG = polygyny dummy, BRIDEPR = ethnic group practices bride price, PATRI = patrilineal group, PCCENT = levels of jurisdictional hierarchy beyond local community, PLOW = plow aboriginal. Standard errors clustered at the country level in parentheses. ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. lines indicate that variables remain significantly correlated with IM when we control for the share of literate parents.

Table 34: Ethnicity-level correlates of the share of literate old and IM, ages 14-100, with student correction, historical features, urban-rural heterogeneity

           rural                                 urban
variable   (1)        (2)        (3)        N    (4)        (5)        (6)        N
Columns: (1), (4) share literate old; (2), (5) IM; (3), (6) IM controlling for share literate old.

Slavery
SLAVERY    0.016      0.011      0.001     124  -0.027     -0.018     -0.006     124
          (0.056)    (0.068)    (0.051)         (0.104)    (0.077)    (0.056)

Murdock
SPLIT     -0.088     -0.048      0.011     125  -0.147∗∗   -0.042      0.022     125
          (0.058)    (0.048)    (0.015)         (0.074)    (0.038)    (0.020)
PASTO     -0.026     -0.132∗∗   -0.116∗∗∗  130  -0.111     -0.137∗∗∗  -0.094∗∗∗  130
          (0.069)    (0.058)    (0.030)         (0.089)    (0.039)    (0.007)
AGRI       0.002      0.102∗     0.101∗∗∗  130   0.078      0.096∗     0.064∗    130
          (0.057)    (0.055)    (0.030)         (0.092)    (0.056)    (0.034)
POLYG     -0.017      0.021      0.032     127  -0.058     -0.021      0.003     127
          (0.033)    (0.017)    (0.029)         (0.053)    (0.057)    (0.046)
BRIDEPR    0.052      0.087∗∗    0.055     130  -0.019      0.012      0.020     130
          (0.065)    (0.037)    (0.036)         (0.068)    (0.059)    (0.051)
PATRI      0.091      0.120∗∗∗   0.064     112  -0.037      0.072      0.087∗∗   112
          (0.082)    (0.043)    (0.059)         (0.094)    (0.045)    (0.043)
PCCENT    -0.014     -0.032     -0.024     124  -0.023      0.016      0.025     124
          (0.024)    (0.024)    (0.023)         (0.045)    (0.023)    (0.026)

This is not a normal regression table. In the columns entitled “share literate old” the dependent variable is the ethnicity share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the columns entitled “IM” it is the ethnicity share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old), which is also the LHS in the columns entitled “IM controlling for share literate old”. Each row shows the results of regressing these LHS variables on one RHS variable (indicated in the rows) at a time. The regressions in the two columns “IM controlling for share literate old” additionally control for the share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old); that is, they include the LHS variable of the columns “share literate old” on the RHS. All specifications include country fixed effects (not reported). Coefficients are standardized.
SPLIT = ancestral homeland of ethnic group split between two modern states, SLAVERY = Nunn slavery measure, PASTO = pastoralism dominant pre-colonial mode of subsistence, AGRI = agriculture dominant pre-colonial mode of subsistence, POLYG = polygyny dummy, BRIDEPR = ethnic group practices bride price, PATRI = patrilineal group, PCCENT = levels of jurisdictional hierarchy beyond local community, PLOW = plow aboriginal. Standard errors clustered at the country level in parentheses. ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. lines indicate that variables remain significantly correlated with IM when we control for the share of literate parents.

Table 35: Ethnicity-level correlates of the share of literate old and IM, ages 14-100, with student correction, historical features, male-female heterogeneity

           female                                male
variable   (1)        (2)        (3)        N    (4)        (5)        (6)        N
Columns: (1), (4) share literate old; (2), (5) IM; (3), (6) IM controlling for share literate old.

Slavery
SLAVERY   -0.005     -0.018     -0.014     129  -0.004     -0.008     -0.005     129
          (0.071)    (0.093)    (0.059)         (0.066)    (0.094)    (0.071)

Murdock
SPLIT     -0.111∗    -0.068      0.012     130  -0.103∗    -0.038      0.033     130
          (0.062)    (0.066)    (0.022)         (0.055)    (0.042)    (0.021)
PASTO     -0.032     -0.118∗    -0.096∗∗∗  135  -0.025     -0.154∗∗   -0.138∗∗∗  135
          (0.083)    (0.061)    (0.027)         (0.076)    (0.067)    (0.031)
AGRI       0.002      0.076      0.074∗∗   135  -0.004      0.109      0.112∗∗∗  135
          (0.075)    (0.063)    (0.035)         (0.069)    (0.072)    (0.042)
POLYG     -0.068     -0.034      0.013     132  -0.060      0.003      0.043     132
          (0.061)    (0.039)    (0.033)         (0.055)    (0.038)    (0.035)
BRIDEPR    0.038      0.056      0.031     135   0.043      0.103∗∗    0.076∗∗   135
          (0.078)    (0.052)    (0.045)         (0.070)    (0.041)    (0.036)
PATRI      0.113      0.147∗∗    0.072     116   0.107      0.195∗∗∗   0.129∗∗   116
          (0.089)    (0.068)    (0.079)         (0.077)    (0.043)    (0.062)
PCCENT     0.000      0.052∗     0.051∗∗∗  129   0.007     -0.037     -0.041     129
          (0.030)    (0.030)    (0.019)         (0.029)    (0.039)    (0.041)

This is not a normal regression table. In the columns entitled “share literate old” the dependent variable is the ethnicity share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old). In the columns entitled “IM” it is the ethnicity share of children of parents with less than primary who complete at least primary (estimated net of country-year and country-birth-decade fixed effects for young and old), which is also the LHS in the columns entitled “IM controlling for share literate old”. Each row shows the results of regressing these LHS variables on one RHS variable (indicated in the rows) at a time. The regressions in the two columns “IM controlling for share literate old” additionally control for the share of parents with at least primary schooling (estimated net of country-year and country-birth-decade fixed effects for young and old); that is, they include the LHS variable of the columns “share literate old” on the RHS. All specifications include country fixed effects (not reported). Coefficients are standardized.
SPLIT = ancestral homeland of ethnic group split between two modern states, SLAVERY = Nunn slavery measure, PASTO = pastoralism dominant pre-colonial mode of subsistence, AGRI = agriculture dominant pre-colonial mode of subsistence, POLYG = polygyny dummy, BRIDEPR = ethnic group practices bride price, PATRI = patrilineal group, PCCENT = levels of jurisdictional hierarchy beyond local community, PLOW = plow aboriginal. Standard errors clustered at the country level in parentheses. ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. lines indicate that variables remain significantly correlated with IM when we control for the share of literate parents.
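
For readers who want to see how the two dependent variables named in the table notes are built up from individual records, the following is a minimal sketch of the raw group-level shares. It is an illustration only: the record layout and function names are hypothetical, each record is assumed to pair one child with one parental education indicator, and the fixed-effects adjustment mentioned in the notes (country-year and country-birth-decade) is deliberately left out.

```python
from collections import defaultdict


def upward_im_by_group(records):
    """Share of children of parents with less than primary who complete primary.

    records: iterable of (group, parent_has_primary, child_has_primary).
    Groups with no child of a less-than-primary parent are omitted.
    """
    at_risk, moved = defaultdict(int), defaultdict(int)
    for group, parent_primary, child_primary in records:
        if not parent_primary:  # only children of parents below primary count
            at_risk[group] += 1
            moved[group] += int(child_primary)
    return {g: moved[g] / at_risk[g] for g in at_risk}


def share_literate_old_by_group(records):
    """Share of parents with at least primary schooling, by group."""
    total, literate = defaultdict(int), defaultdict(int)
    for group, parent_primary, _child_primary in records:
        total[group] += 1
        literate[group] += int(parent_primary)
    return {g: literate[g] / total[g] for g in total}
```

In this sketch a group's IM is undefined, and therefore absent from the result, when every sampled parent in the group has at least primary schooling.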