Instituto de Economía
Documento de Trabajo (Working Paper) 423, 2012
Marry for What? Caste and Mate Selection in Modern India
Abhijit Banerjee, Esther Duflo, Maitreesh Ghatak, Jeanne Lafortune.
www.economia.puc.cl • ISSN (print edition) 0716-7334 • ISSN (electronic edition) 0717-7593
PONTIFICIA UNIVERSIDAD CATOLICA DE CHILE INSTITUTO DE ECONOMIA
Oficina de Publicaciones Casilla 76, Correo 17, Santiago www.economia.puc.cl
Documento de Trabajo Nº 423
Santiago, Marzo 2012
INDEX
ABSTRACT
1. INTRODUCTION
2. MODEL
  2.1 Set-up
  2.2 An important caveat: preferences estimation with unobserved attributes
  2.3 Stable matching patterns
3. SETTING AND DATA
  3.1 Setting: the search process
  3.2 Sample and data collection
  3.3 Variable construction
  3.4 Summary statistics
4. ESTIMATING PREFERENCES
  4.1 Basic empirical strategy
  4.2 Results
  4.3 Heterogeneity in preferences
  4.4 Do these coefficients really reflect preferences?
    4.4.1 Strategic behavior
    4.4.2 What does caste signal?
  4.5 Do these preferences reflect dowry?
5. PREDICTING OBSERVED MATCHING PATTERNS
  5.1 Empirical strategy
  5.2 Results
    5.2.1 Who stays single?
    5.2.2 Who marries whom?
6. THE ROLE OF CASTE PREFERENCES IN EQUILIBRIUM
  6.1 Model Predictions
  6.2 Simulations
7. CONCLUSIONS
REFERENCES
Marry for What? Caste and Mate Selection in Modern India∗
Abhijit Banerjee, Esther Duflo, Maitreesh Ghatak and Jeanne Lafortune†
March 2012
Abstract This paper analyzes how preferences for a non-economic characteristic, such as caste, can affect equilibrium patterns of matching in the marriage market, and empirically evaluates this in the context of arranged marriages among middle-class Indians. We develop a model that demonstrates how the equilibrium consequences of caste depend on whether we observe a bias towards one’s own group or if there is a preference for “marrying up”. We then estimate actual preferences for caste, education, beauty, and other attributes using a unique data set on individuals who placed matrimonial advertisements in a major newspaper, the responses they received, and how they ranked them. Our key empirical finding is the presence of a strong preference for in-caste marriage. We find that in equilibrium, as predicted by our theoretical framework, these preferences do little to alter the matching patterns on non-caste attributes, and so people do not have to sacrifice much to marry within caste. This suggests a reason why caste remains a persistent feature of the Indian marriage market.
JEL classification: D10, J12, O12
Key words: Caste, Marriage, Stable matching
∗We thank the Anandabazar Patrika for their cooperation for this project, and Prasid Chakrabarty and the team of SRG investigators for conducting the survey. We thank Raquel Fernandez, Ali Hortaçsu, Patrick Bajari, George Mailath, Whitney Newey, Parag Pathak, Andrew Postlewaite, Debraj Ray, Alvin Roth, Ken Wolpin, many seminar participants and three anonymous referees for very helpful comments. Finally, we also thank Sanchari Roy and Tommy Wang for outstanding research assistance.
†The authors are from the departments of economics at MIT, MIT, LSE, and University of Maryland, College Park respectively.
1 Introduction
Marriage is, among other things, an important economic decision. Sorting in families has an impact on child outcomes and the accumulation of human capital, and consequently, on long term economic development and inequality (Fernandez and Rogerson 2001, Fernandez 2003). In developing countries, where many women do not work outside their homes, marriage is arguably the single most important determinant of a woman’s economic future.1 In India, the setting for this paper, several studies have shown that marriage is indeed taken as an economic decision, managed by parents more often than by the prospective spouses. For example, Rosenzweig and Stark (1989) show that parents marry their daughters into villages whose incomes co-vary less with those of their own village. Foster and Rosenzweig (2001) show that demand for healthy women in the marriage market influences investments in girls.
Yet “status”-like attributes, such as caste, continue to play a seemingly crucial role in determining marriage outcomes in India. In a recent opinion poll in India, 74 percent of respondents declared themselves opposed to inter-caste marriage.2 The institution is so prevalent that matrimonial advertisements (henceforth, ads) in Indian newspapers are classified under caste headings, making it immediately obvious where prospective brides or grooms can find someone from their own caste. It is well known that these types of non-meritocratic social preferences can impede economic efficiency, a point that is often made in the literature on discrimination (Becker 1957).
At the same time, there is also the view that economic forces will tend to undermine institutions or preferences that impose large economic costs on people.3 Indeed, we do see the role of caste changing with economic growth and the diversification of earnings opportunities in India: the correlation between caste and income in India is significantly lower now, and caste plays much less of a role in determining the job someone has (Munshi and Rosenzweig 2006).
This paper is an attempt to understand why the stated role of caste in marriage remains so strong.4 One possibility is that this is just something that people say, but they do not actually act
1Even in our sample of highly educated females and males, fewer than 25% of matched brides were working after marriage.
2We use the word caste in the sense of jati (community) as opposed to varna. The latter is a broad theoretical system of grouping by occupation (priests, nobility, merchants, and workers). The jati is the community within which one is required to be married, and which forms one’s social identity.
3In the context of the marriage market, for example, Cole et al. (1992) describe an “aristocratic equilibrium” characterized by low levels of productivity because of the weight people put on status. They go on to show that the aristocratic equilibrium may be broken by increased economic mobility because it leads to the emergence of low status men who are nevertheless high wealth, and who may be in a position to attract a high status, low wealth woman.
4This is related to the literature in the United States which has looked at how religious homogamy can persist despite the fact that some groups are clearly in the minority. Bisin et al. (2004), Bisin and Verdier (2001) and Bisin and Verdier (2000) argue that in this context, there is a strong preference for horizontal matching in order to socialize children within one’s faith. In addition, this homogamy may depend on the availability of partners of one’s own religion, and members of a minority such as Jews may actually exhibit higher rates of endogamy. However, these
on it: perhaps, if we were to observe their actual marital choices, we would see that caste is much less important than is claimed. The fact that many people do end up marrying in caste is not enough to reject this view, since it is well known that caste is correlated with many other attributes and those could be driving the observed choices.
To answer the question of whether caste actually matters in the choice of a spouse, we follow the methodology developed in Hitsch et al. (2009) and Fisman et al. (2008) for studying partner choice in the United States. Hitsch et al. (2009) use on-line dating data to estimate racial preferences in the US: they observe the set of partner profiles that a potential dater faces as well as which profiles they actually click on, which is what they interpret as the first act of choice. Since they observe all the attributes that the decision-maker observes, this allows them to identify the decision-maker’s actual preferences. Fisman et al. (2008) do something very similar, using the random assignment of people to partners in speed dating. Both find strong evidence of same-race preferences, the equivalent of same-caste preferences in our context.
To look at the strength of caste preferences in marriage, we apply this methodology to a data set that we collected based on interviews with 783 families who placed matrimonial ads in a major Bengali newspaper. We asked ad-placers to rank the letters they had received in response to their ad and to list the letters they were planning to follow up with, and we use these responses to estimate the “marginal” rate of substitution between caste and other attributes. We find evidence of very strong own-caste preferences: for example, our estimates suggest that the bride’s side would be willing to trade off the difference between no education and a master’s degree in the prospective husband to avoid marrying outside their caste.
For men seeking brides, the own caste effect is twice the effect of the difference between a self-described “very beautiful” woman and a self-described “decent-looking” one. This is despite the fact that the population in our sample is urban, relatively well off, and highly educated (for example, 85% have a college degree). Interestingly, this preference for caste seems much more horizontal than vertical: we see little interest in “marrying up” in the caste hierarchy among both men and women, but a strong preference for in-caste matches.
This is similar to the strong preference for same-race matches that the literature in the United States finds, though our context makes it even more striking: these are ads for arranged marriage in a relatively conservative society where the goal is clearly marriage and, as a result, the motives of the decision-makers are likely to be much more classically economic than those involved in online dating or speed dating. Dugar et al. (forthcoming), who study newspaper partner search in India using people’s choices among nine randomly manipulated profiles, find very similar results on the strength of own-caste preferences.5
papers do not include any characteristics of spouses other than religion, and thus do not address our notion of “cost” of marrying within one’s religion. Bisin et al. (2004) note that this is an important question left for future research in their conclusions.
5Another paper in this general area that is more or less contemporaneous with ours is Lee (2007), who uses data on online dating in Korea to study partner choice.
The central contribution of this paper is to suggest an explanation for the persistence of such a strong preference for caste-based matching. Specifically, we propose a model that explains why such a strong in-group preference might have survived despite the changing economic incentives, which should have made other characteristics (such as education, income, etc.) increasingly attractive. Our basic hypothesis is that this is because, as we saw, preferences for caste are primarily “horizontal” in the sense that people prefer to marry their own caste over marrying into any other caste. This goes against the traditional story about the caste system, which emphasizes its hierarchical structure, but is consistent with the sociological evidence on the nature of caste today (Fuller 1996).
The theoretical section of the paper demonstrates that, when caste preferences are horizontal and a particular condition we call balance holds, the matching patterns along non-caste dimensions are very similar to the ones that would obtain in the absence of any caste preferences. As a result, the equilibrium price of caste, which is the opportunity cost of the marriage option that one has to give up to marry in caste, tends to be quite low. One possible reason why caste persists, therefore, is that it actually does not cost very much to marry within caste.6
To check that this line of reasoning actually works in the data, we use our estimated preferences and the assumption of stable matching to predict matching patterns, both in the current scenario and in various counterfactual settings. We surveyed our original respondents after one year to obtain information on their outcome in the marriage market: whether they married, and whom. We thus directly look at whether the actual matching pattern is similar to that predicted by the estimated preferences.
Specifically, we use the Gale-Shapley (Gale and Shapley 1962) algorithm to generate the stable matches predicted by these preferences and compare them with the actual matches. Hitsch et al. (2009) perform the same exercise in their data, using an exchange of emails as the final outcome. Using metrics similar to the ones they use, we find that the predicted and observed matches more or less line up, and therefore conclude that stable matching is a reasonable way to model marriage market equilibrium.
This brings us to the central empirical exercise in our paper: we compute the set of stable matches that would arise in our population if preferences were exactly as estimated above except that all caste variables were ignored. Our results indicate that the percentage of intra-caste marriages drops dramatically. This implies that caste is not just a proxy for other characteristics households also care about and that there are several potential matches for each individual, both inside and outside his or her caste. At the same time, we also find that individuals are matched with spouses who are very similar on all non-caste characteristics to the mate they would have selected when caste was included within one’s preferences.
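For readers unfamiliar with the deferred-acceptance procedure of Gale and Shapley (1962), a minimal sketch follows. It is illustrative only: the function name `gale_shapley`, the dictionary-based preference lists, and the toy example are assumptions made for exposition, and the sketch does not reproduce how estimated preferences are turned into rankings, how ties are broken, or how singles are handled in the empirical exercise.

```python
from typing import Dict, List


def gale_shapley(proposer_prefs: Dict[str, List[str]],
                 receiver_prefs: Dict[str, List[str]]) -> Dict[str, str]:
    """Deferred acceptance: returns a stable matching {proposer: receiver}.

    Each preference list is ordered from most to least preferred and may omit
    unacceptable partners. Every receiver named in a proposer's list is
    assumed to have an entry in receiver_prefs.
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {r: {p: k for k, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next receiver to try
    engaged_to: Dict[str, str] = {}               # receiver -> current proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        prefs = proposer_prefs[p]
        if next_choice[p] >= len(prefs):
            continue  # p has exhausted the list and stays single
        r = prefs[next_choice[p]]
        next_choice[p] += 1
        if p not in rank[r]:
            free.append(p)                 # r finds p unacceptable; p tries again
        elif r not in engaged_to:
            engaged_to[r] = p              # r was unmatched and holds the offer
        elif rank[r][p] < rank[r][engaged_to[r]]:
            free.append(engaged_to[r])     # r trades up; the old partner is released
            engaged_to[r] = p
        else:
            free.append(p)                 # r rejects p; p tries again
    return {p: r for r, p in engaged_to.items()}


# Hypothetical toy market: both men rank w1 first, and w1 prefers m2.
men = {"m1": ["w1", "w2"], "m2": ["w1", "w2"]}
women = {"w1": ["m2", "m1"], "w2": ["m1", "m2"]}
print(gale_shapley(men, women))
```

In this toy market, m2 obtains w1 and m1 settles for w2; no man and woman would both prefer each other to their assigned partners, which is the stability property used throughout the paper.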
6This is in contrast with the results by Abramitzky et al. (2011) who estimate the impact of changing sex ratios in France after WWI on the propensity to marry across social class, clearly a vertical characteristic. In that setting, changing the supply of males has a large effect on the matching patterns because the price of marrying within one’s social class has been greatly increased for women.
Second, we estimate the “equilibrium price” of caste in terms of a variety of attributes, defined as the difference between the spouses of two observationally identical individuals, one who marries within caste and one who does not. This is done by regressing a spousal characteristic, such as education, on all observable characteristics of the individuals and a dummy for whether the match is “within caste” among the set of simulated matches. There is no characteristic for which this measure of price is significantly positive. To complete the argument we also estimate the equilibrium price for a vertical attribute, beauty, in terms of education. As our theory would predict, we see a non-zero price in this case.
A number of conclusions follow from our findings. First, there is no reason to expect that economic growth by itself will undermine caste-based preferences in marriage. Second, caste-based preferences in marriage are unlikely to be a major constraint on growth. Finally, one might worry that if caste becomes less important, inequality might increase along other dimensions as we will see more assortative matching. Given that the matching is already close to being assortative, this is probably not an important concern.7
While these conclusions are particularly important in the context of India, they are also more broadly relevant for any setting where we may observe strong in-group preferences in a matching context. Our theoretical conclusions, in particular, suggest that these preferences will have more impact on matching patterns in equilibrium whenever they display a “vertical” nature. Racial preferences for spouses, for example, may not have large equilibrium consequences if groups have a preference for marrying someone of their own race rather than hoping to marry a particularly favored racial group. On the other hand, preferences for social status (e.g., marrying into aristocracy) might be more vertical (Abramitzky et al. 2011, Almenberg and Dreber 2009).
While this is the main point of the paper, our data allows us to perform a number of other innovative exercises, which help bolster the claim that we are identifying true preferences rather than strategic behavior. Because we observe all the information seen by the ad-placer at the time they decide whether or not to reply to a particular letter, we do not have to worry about unobserved variables seen by the ad-placer and not seen by the econometrician.8 However, there are still a number of possible alternative explanations for the choices we observe other than a pure preference for caste. These are not unique to our paper: all the papers that use this kind of methodology for estimating preferences face the same problem. However, our data and setting allow us to explore these alternatives.
First, we may be concerned about signaling. Perhaps there is no real preference for marrying in caste; because no one actually does it in equilibrium, however, those who make proposals to
7An important caveat to these conclusions is that they were obtained in a particular sample of highly educated West Bengalis: while this population is anticipated to have weaker preferences for caste, thus potentially implying that our results are a lower bound for what one could expect in a different setting, this population is also potentially more “balanced”, which implies that assortative matching may be easier to achieve here than in other contexts.
8This is why, unlike Dugar et al. (forthcoming), we do not conduct an experiment: there is no econometric problem that an experiment can solve here.
non-caste members are treated with suspicion. We examine this by looking at the actual matches of those who make proposals out of caste and find that they are no different from those of others, suggesting that their underlying unobserved quality must be similar.
Second, we need to deal with the possibility of strategic responses, i.e. the fact that some candidates may choose their responses based on who they expect to respond back positively rather than on their true preferences. To get at this we compute an index of the quality of each ad and each letter, and show that the relative likelihood of responding to a “high quality” letter rather than a “low quality” letter is the same for “low quality” ad placers and “high quality” ad placers.9 The fact that the ad-placer’s ranking of the letters and the decision to reply to the letters give us very similar results also suggests that the respondents are not strategic in deciding whom to reply to.
The remainder of the paper proceeds as follows: Section 2 first sketches a model where caste and other attributes interact on the marriage market. Section 3 presents the data while Section 4 elaborates on the methodology and the results of preference estimation. Section 5 highlights the results of the stable matches and Section 6 uses these results to derive conclusions regarding the equilibrium. Finally, Section 7 concludes.
2 Model
In this section we develop a simple model of marriage. The model introduces caste-based preferences (see for example Anderson (2003)) into an otherwise standard model of marriage (Becker 1973). The novelty is that we allow two-sided matching, and both horizontal and vertical caste-based preferences (as opposed to just vertical preferences). Our goal is to derive how marriage market outcomes are affected by going from vertical to horizontal caste preferences, under the assumption that there is another “vertical” attribute (beauty, education, earnings, etc.) that the decision-makers also care about. We characterize conditions under which non-assortative matching will take place and, when it does, characterize the price of marrying within caste or outside caste (in terms of what the decision-maker gives up along the second, vertical dimension). These results will motivate our empirical analysis and help us interpret the main results.
2.1 Set up
Assume a population of men and women differentiated by “caste” where the caste of an individual is i ∈ {1, 2}. They are ranked in descending order: i = 1 is the higher caste, followed by i = 2. Men and women are also differentiated according to a “vertical” characteristic that we will refer to as quality, that affects their attractiveness to a potential partner. The quality of men will
9We find more evidence of strategic behavior at other steps of the process, in particular in deciding which ad to send a letter to.
be denoted by x ∈ {H, L} and the quality of women will be denoted by y ∈ {H, L}. We can think of these as education levels of men and women, or as income and beauty.
We denote the total number of women of type $y$ who belong to caste $i$ by $\omega_{yi} > 0$ and the number of men of type $x$ who belong to caste $i$ by $\mu_{xi} > 0$, where $x, y = H, L$ and $i = 1, 2$. We assume the following condition regarding the distribution of men and women:
Condition 1. A population is said to satisfy balance (B) if
$$\omega_{t1} + \omega_{t2} = \mu_{t1} + \mu_{t2} \quad \text{for each } t = H, L,$$
and
$$\omega_{Hi} + \omega_{Li} = \mu_{Hi} + \mu_{Li} \quad \text{for each } i = 1, 2.$$
In other words, there is a balanced sex ratio within each type and within each caste.10 If populations do not follow this assumption, non-assortative matching will follow trivially.
The payoffs of men and women are both governed by the quality of the match. We assume that in a union where the man’s quality is given by $x$ and the woman’s by $y$ the payoff function has two (multiplicatively) separable elements, one governed by the vertical characteristics, $f(x, y)$, and the other by caste, $A(i, j)$, where the latter is the payoff of someone who is of caste $i$ and who is matched with someone of caste $j$.
We assume that the function $f(x, y) > 0$ is increasing with respect to both arguments. Thus, other things constant, everyone prefers a higher attribute partner. Also, for ease of exposition, we assume $f(x, y)$ is symmetric, i.e., $f(H, L) = f(L, H)$.
In order to generate conditions that are easy to interpret we give the function $A(i, j)$ a specific form:
$$A(i, j) = 1 + \alpha\{\beta(2 - j) - \gamma(i - j)^2\}$$
where $\alpha \geq 0$, $\beta \geq 0$, $\gamma \geq 0$. It is readily verified that $A(i, j) > 0$ as long as $\alpha\gamma < 1$ (which we assume) and, as long as $\gamma > 0$, the function displays strict complementarity with respect to caste: $\frac{\partial^2 A(i,j)}{\partial i \, \partial j} > 0$.
This caste-based match quality function is flexible. It allows a vertical as well as a horizontal component to caste. For example, if $\beta = 0$ then caste is purely horizontal: people want to match within their caste. Otherwise, the higher the caste of the partner (lower is $j$) the higher the match specific gain to an individual of caste $i$. On the other hand, if $\gamma = 0$ then caste is purely vertical with everyone preferring a higher caste partner, as in Anderson (2003).
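The balance condition and the caste match-quality function defined above can be made concrete with a short sketch. The function names `is_balanced` and `caste_match_quality`, the dictionary layout, and all numerical values are hypothetical choices made for illustration; they are not taken from the data.

```python
def is_balanced(women: dict, men: dict) -> bool:
    """Check Condition 1 for counts keyed by (type, caste), e.g. ("H", 1)."""
    by_type = all(
        women[(t, 1)] + women[(t, 2)] == men[(t, 1)] + men[(t, 2)]
        for t in ("H", "L")
    )
    by_caste = all(
        women[("H", i)] + women[("L", i)] == men[("H", i)] + men[("L", i)]
        for i in (1, 2)
    )
    return by_type and by_caste


def caste_match_quality(i: int, j: int, alpha: float, beta: float, gamma: float) -> float:
    """A(i, j) = 1 + alpha * (beta * (2 - j) - gamma * (i - j)**2), for i, j in {1, 2}."""
    return 1.0 + alpha * (beta * (2 - j) - gamma * (i - j) ** 2)


# Hypothetical counts: balanced by type and by caste, though not cell by cell.
women = {("H", 1): 3, ("L", 1): 2, ("H", 2): 1, ("L", 2): 4}
men = {("H", 1): 2, ("L", 1): 3, ("H", 2): 2, ("L", 2): 3}
print(is_balanced(women, men))

castes = [(i, j) for i in (1, 2) for j in (1, 2)]
# Purely horizontal caste preferences (beta = 0): only out-of-caste matches are penalized.
horizontal = {c: caste_match_quality(*c, alpha=0.5, beta=0.0, gamma=0.4) for c in castes}
# Purely vertical caste preferences (gamma = 0): a caste-1 partner is valued by everyone.
vertical = {c: caste_match_quality(*c, alpha=0.5, beta=0.4, gamma=0.0) for c in castes}
print(horizontal)
print(vertical)
```

With these illustrative parameters, the horizontal case gives the same value to both in-caste pairings and a lower value to both out-of-caste pairings, whereas the vertical case gives a higher value to any match with a caste-1 partner regardless of one's own caste.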
We also assume that a number $\nu_{yi}$ ($y = H, L$; $i = 1, 2$), with $0 < \nu_{yi} < \omega_{yi}$, of women and a corresponding number $0 < \kappa_{xi} < \mu_{xi}$ of men have caste-neutral (CN) preferences, $\alpha = 0$. These individuals put no weight on the caste of a potential partner, i.e., for them $A(i, j) = 1$ for all $i = 1, 2$ and $j = 1, 2$. Those who are caste-conscious value a caste-neutral individual of caste $i$ ($i = 1, 2$) in the same way as they would a caste-conscious (CC) individual of caste $i$ ($i = 1, 2$). As we will see, the data clearly supports the idea that a fraction of individuals are caste-neutral.
10As is well-known, the sex-ratio in South Asia tends to make men more abundant. However, as Rao (1993) has shown, the gap between the normal age at marriage between men and women, combined with the fact that the population is growing, counteracts this effect and almost all men do manage to find spouses.
Given these two elements governing the quality of a match, we assume that the payoff of an individual of gender $G$, of caste $i$, who is matched with someone of caste $j$ in a union where the man’s quality is given by $x$ and that of the woman by $y$ is given by:
$$u^G(i, j, x, y) = A(i, j)\, f(x, y) \quad \text{for } G = M, W.$$
We have imposed a lot of symmetry here: for example, a man of type 1 of caste 1 marrying a woman of type 2 of caste 2 gets the same payoff that a woman of caste 1 of type 1 would get from marrying a man of caste 2 of type 2. This is convenient for stating the results in a more compact form, but is by no means essential.
We also assume that the utility of not being matched is zero. Because both f(x, y) and A(i, j) are positive, the utility of being matched with anyone is always better than that of remaining single. Since the total number of men and women are the same, everyone should match in equilibrium.
Finally, we assume that matching is governed by these preferences. In particular, there are no transfers, so that we have what in the literature is called non-transferable utility (NTU) matching (as in recent studies of the United States matching market by Hitsch et al. 2009, Fisman et al. 2006 and Fisman et al. 2008). This assumption is less common in the development economics literature on marriage than the alternative transferable utility (TU) assumption (e.g., Becker 1973, Lam 1988), where dowries are interpreted as the instrument of transfer. Demanding a dowry is both illegal and considered unethical in middle-class urban Bengali culture,11 and as a result, no one mentions dowries in the ads or the letters, unless it is to announce that they do not want a dowry. Our presumption is that some fraction of the population will eventually ask for a dowry, but a substantial fraction will not (given that they spend money to say that in their ads). Therefore we cannot assume that we are in either of the pure cases. Our strategy therefore is to go ahead as if we are in a pure NTU world but argue that we would get very similar results if we made the TU assumption.
However, the presence of dowry can potentially affect the interpretation of our estimated preferences from the NTU model: the next sub-section, while a slight detour in the exposition of the theory, deals with this important concern. We return briefly to the stable matching patterns under TU at the end of Section 2.3.
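To illustrate what stability means under NTU with the multiplicative payoff introduced in this subsection, the following sketch lists the blocking pairs of a candidate matching. It is a simplification under stated assumptions: everyone is treated as caste-conscious with the same parameters, the function names (`caste_term`, `payoff`, `blocking_pairs`) are hypothetical, and the values of f and of the parameters are invented for the example.

```python
from itertools import permutations


def caste_term(i, j, alpha, beta, gamma):
    """A(i, j) = 1 + alpha * (beta * (2 - j) - gamma * (i - j)**2)."""
    return 1.0 + alpha * (beta * (2 - j) - gamma * (i - j) ** 2)


def payoff(own_caste, partner_caste, man_quality, woman_quality, f, alpha, beta, gamma):
    """u(i, j, x, y) = A(i, j) * f(x, y), where x is the man's quality and y the woman's."""
    return caste_term(own_caste, partner_caste, alpha, beta, gamma) * f[(man_quality, woman_quality)]


def blocking_pairs(matching, f, alpha, beta, gamma):
    """Return index pairs (a, b): the man of couple a and the woman of couple b
    both strictly prefer each other to their current partners.

    matching is a list of (man, woman) couples; each person is (quality, caste).
    """
    blocks = []
    for a, b in permutations(range(len(matching)), 2):
        (mq, mc), (cur_wq, cur_wc) = matching[a]  # man a and his current wife
        (cur_mq, cur_mc), (wq, wc) = matching[b]  # woman b and her current husband
        man_gains = (payoff(mc, wc, mq, wq, f, alpha, beta, gamma)
                     > payoff(mc, cur_wc, mq, cur_wq, f, alpha, beta, gamma))
        woman_gains = (payoff(wc, mc, mq, wq, f, alpha, beta, gamma)
                       > payoff(wc, cur_mc, cur_mq, wq, f, alpha, beta, gamma))
        if man_gains and woman_gains:
            blocks.append((a, b))
    return blocks


# Hypothetical symmetric f, increasing in both arguments.
f = {("H", "H"): 2.0, ("H", "L"): 1.5, ("L", "H"): 1.5, ("L", "L"): 1.0}
params = dict(alpha=0.5, beta=0.0, gamma=0.4)  # purely horizontal caste preferences

assortative = [(("H", 1), ("H", 1)), (("L", 1), ("L", 1))]
crossed = [(("H", 1), ("L", 1)), (("L", 1), ("H", 1))]
print(blocking_pairs(assortative, f, **params))
print(blocking_pairs(crossed, f, **params))
```

In the crossed matching the H man of the first couple and the H woman of the second both gain by rematching with each other, so that matching is not stable; the assortative matching admits no such pair.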
11We have so far failed to locate a study on dowry in this population that would throw light on its extent. However, we note that while Kolkata has 12 percent of the population of the largest metropolitan cities in India, it has only 1.9 percent of the so-called “dowry deaths” in these cities (about 6,000 in a year, India-wide), which are episodes where a bride is killed or driven to commit suicide by her in-laws following a failed negotiation over the dowry. To the extent that the prevalence of dowry deaths partly reflects the prevalence of dowry, this suggests that dowries are less prevalent in Kolkata than in other major cities in India.
2.2 An important caveat: preferences estimation with unobserved attributes
Our empirical strategy relies on the fact that the econometrician observes everything that the decision-maker observes. However, unobserved attributes may still play a role, exactly as they would if the observed characteristics were randomly assigned, if the decision-makers take into account the correlation between observables and what they do not observe.
A key example of such an unobservable is the expected “ask”: some people will demand dowry and others will not. However, note that dowries, like many other unobservables, will get revealed in a future round of the marriage negotiations. Given that many people will not ask for a dowry, and that one can always reject later those who ask for too much (or offer too little), it makes sense to first short-list every prospect worth exploring, ignoring the possibility of their asking for a dowry or offering one, and to find out whether or not they want a dowry (or want to offer one) by contacting them. Decision-makers can then discard the ones who ask for too much or offer too little based on better information. Obviously this logic only works if the cost of contacting an additional person is small which, given the large number of contacts that people make, seems plausible.
It is straightforward to formalize this argument, and we do so in a separate online appendix.12 Assuming that the conditions of the proposition stated there hold (namely, that the exploration costs are not too high), it tells us that what we observe in the data is people’s true ordering between those whom they consider and those whom they reject, even if dowry and other still-to-be-revealed attributes will eventually be an important consideration in the decision. Based on this ranking we infer people’s preferences over a range of attributes. We will, however, come back to discuss some direct evidence that the estimated preferences are consistent with the assumption that people ignore dowry at this stage.
None of this helps us with the possibility that there are unobserved attributes that will never be observed, but may yet be driving the decisions because of their correlations (actual or hypothesized) with the observables. We do try to test some specific hypotheses of this class (e.g., is caste really a proxy for “culture”) using ancillary data, but at one level this is obviously an impossible quest.
2.3 Stable matching patterns
To start with, observe that if everyone were CN all H types would want to match with H types and since there are the same number of H type men and women, this is indeed what would happen – people would match assortatively. There may be out of caste matches, but
12The assumption here is that the unobserved attribute has a fixed value. It is more like attractiveness than like a demand for dowry, which is something that might adjust to exactly compensate for differences in other attributes. Nevertheless, as long as each set of candidates with the same observable characteristics contains a sufficiently large subset which is on average identical to the rest of the group in everything except for the fact that it will not accept a dowry, and as long as it is not possible to predict this in advance (dowry demands or offers are not made in writing), it makes sense to rank everyone as if no one wants a dowry, as long as the cost of search is not too large.
those who match out of caste will have the same quality of matches as those who marry in caste. We formalize this idea by introducing the concept of an average price of caste.
Definition The average price of caste (APC) for women (men) is the difference, in terms of average quality of the matches, of women (men) of the same quality who marry in or below caste, relative to the average quality of those who marry above or in caste averaged over all types of women.
The APC is zero if everyone matches assortatively, as in the case where everyone is CN. With caste preferences, there is a potential trade-off between marrying assortatively and marrying based on caste preferences, and therefore the APC need not be zero. For example, consider a configuration where the only out of caste match is between high types of caste 2 and low types of caste 1 for both men and women, and all other matches are assortative. The price of caste will be positive because those H types who match in caste get a higher quality match relative to those who match with a higher caste.
Define xic to be an x-type individual (x = H, L) from caste i (i = 1, 2) who has caste preference c ∈ {C, N}, where α(C) = α > 0 = α(N); that is, people can be either caste-conscious (C) or caste-neutral (N). Therefore, we have eight types of individuals for each gender: H1C, H1N, H2C, H2N, L1C, L1N, L2C, and L2N. Sometimes we will refer to just the type and caste of an individual (and not his/her caste preference): in that case we will refer to them as an xi type (where x = H, L and i = 1, 2). Furthermore, if X-Y are a match, X is the type and caste of the female and Y is that of the male.
Proposition 1 establishes that, if an additional condition which limits the fraction of CN people in the population holds, then pure assortative matching cannot be an equilibrium when the vertical dimension of preferences is strong enough.
Condition 2. Limited Caste Neutrality (LCN): The number of CN H1 men is less than the number of caste conscious H2 women, and the number of CN H1 women is less than the number of caste conscious H2 men.
Clearly this cannot hold unless CN people are a sufficiently small fraction of the population. Let

\[
\beta_0 \equiv \frac{1}{\alpha}\left[\frac{f(H,H)}{f(H,L)} + \alpha\gamma - 1\right]
\quad\text{and}\quad
\beta_1 \equiv \frac{1}{\alpha}\left[(1 - \alpha\gamma)\,\frac{f(H,H)}{f(H,L)} - 1\right].
\]

Below, we show that assortative matching is an equilibrium as long as the attraction of matching with the high caste (the vertical dimension) is not too strong:
Proposition 1. All equilibria will include only assortative matches if β ≤ β1. On the other hand, if β > β0 the following properties must hold: (i) all equilibria must have some non-assortative matching as long as condition LCN holds; (ii) if there is at least one non-assortative match there must be at least one out-of-caste non-assortative match; (iii) all out-of-caste non-assortative matches must involve an H type of caste 2 matching with an L type of caste 1.
Proof. Suppose an equilibrium with one non-assortative match exists. By our assumption of a balanced sex ratio by type, there must be at least one other such match. Given that (1, 1), (1, 2), (2, 1), and (2, 2) are the four possible matchings in terms of caste, if we treat identical matches with the gender roles reversed as the same match, then there are ten logical possibilities for pairs of non-assortative matches: (i) H1-L1 and L1-H1; (ii) H1-L1 and L1-H2; (iii) H1-L1 and L2-H1; (iv) H2-L2 and L1-H1; (v) H1-L2 and L2-H1; (vi) H1-L2 and L1-H2; (vii) H1-L2 and L2-H2; (viii) H2-L1 and L1-H2; (ix) H2-L1 and L2-H2; (x) H2-L2 and L2-H2. Of these, (i), (iii), (v), (vi), and (x) are clearly unstable since there is a rematch from these two pairs of matches that would make both parties better off in at least one match. Case (vii) is also unstable because the H1 man would always prefer at least matching with an L1 woman, who would accept such a match unless already matched with an H1 male. Thus, to be stable, this case would have to be combined with case (ii) or (iv). If it is combined with case (ii), the H1 man would be able to attract away the H2 woman paired with an L1 man. If combined with case (iv), the H2 man and woman would form a match. Thus, case (vii) is unstable. For cases (viii) and (ix) to be stable, the H2 types must prefer being matched with an L1 to being matched with each other. This will occur when
(1 + αβ − αγ) f (H, L) > f (H, H), which translates into β > β0. When β < β0 (which holds when β < β1 since β1 < β0), those matches are unstable. Both cases (ii) and (iv) will be unstable if H1 prefers matching with H2 to matching with L1. This occurs when (1 + αβ) f (H, L) < (1 − αγ) f (H, H), or when β < β1. Thus, if β < β1, all matches will be assortative.
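The count of ten cases used in this part of the proof can be reproduced mechanically. The following is a minimal Python sketch; the function name and the tuple encoding are illustrative assumptions of ours, not part of the paper. It enumerates every pair made of one H-woman/L-man match and one L-woman/H-man match, and treats two pairs as identical when one is obtained from the other by reversing gender roles.

```python
from itertools import product

CASTES = (1, 2)


def non_assortative_pairs():
    """Enumerate pairs of non-assortative matches, up to gender reversal.

    A match is written (female, male). With a balanced sex ratio by type,
    an H-woman/L-man match must be accompanied by an L-woman/H-man match.
    """
    seen = set()
    cases = []
    for i, j, k, l in product(CASTES, repeat=4):
        first = (f"H{i}", f"L{j}")   # H woman of caste i with L man of caste j
        second = (f"L{k}", f"H{l}")  # L woman of caste k with H man of caste l
        # Reversing gender roles maps this pair of matches to
        # (H woman of caste l, L man of caste k) and
        # (L woman of caste j, H man of caste i).
        key = min((i, j, k, l), (l, k, j, i))
        if key not in seen:
            seen.add(key)
            cases.append((first, second))
    return cases


if __name__ == "__main__":
    for first, second in non_assortative_pairs():
        print(f"{first[0]}-{first[1]} and {second[0]}-{second[1]}")
```

The sketch returns ten classes. The representative it picks for a class can differ from the one listed in the proof: for instance, H1-L1 with L2-H2 stands in for case (iv), which is the same configuration with gender roles reversed.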
On the other hand, if β > β0, then, starting from an assortative match (which exists given B), an H2C will always want to match with an L1 unless she is already matched with an H1, and any L1N would accept her offer. Therefore the only way there can be an assortative equilibrium is if all H2Cs are matched with H1s. But if there are two H1Cs who are each matched to an H2, they would want to deviate and match with each other. Therefore it must be the case that the number of H1N men (women) is at least as large as the number of H2C women (men). This cannot be true if condition LCN holds.
The next step is to observe that if there is at least one non-assortative match then there must be an out-of-caste non-assortative match. Suppose on the contrary that the population contains only assortative matches and non-assortative matches of the form H2-L2 or H1-L1 (or the reverse). Since we have a balanced sex ratio within type, we must have another non-assortative match. If those are of the same form (H2-L2 or H1-L1), then both H-types would match together and this is clearly unstable. Let us consider the possibility that the equilibrium includes only groups of pairs of the form H2-L2 and L1-H1. While H2 would clearly be willing to match with
H1, H1 will be unwilling to do so as β > β1. On the other hand, there is at least one L1N in the population (by our assumption) and this individual will be willing to match with H2. Now, L1N will be willing to do so as long as she is not already matched with an H-type. She cannot be matched with an H2 type since all non-assortative matches are within caste by assumption. Also, she cannot be matched with an H1 since in that case, the two non-assortatively matched H1s would pair together. Thus, there will always be at least one non-assortative out-of-caste match, which is a contradiction. To complete the proof, note that there are two possible out-of-caste non-assortative matches: H2-L1 or H1-L2. The latter is ruled out since all the cases involving it were ruled out at the beginning as unstable.

Of course, Proposition 1 does not guarantee that assortative matching is the only possible configuration when β < β0. Moreover, β < β1 is a fairly stringent condition. This multiplicity of equilibria and the corresponding need to impose strong conditions to be able to limit the set of possible equilibrium patterns is a direct result of introducing caste neutrality. As is well known, indifference introduces significant complications in matching problems (see, for example, Abdulkadiroğlu et al. 2007 and Erdil and Ergin 2006). However, indifference in our framework cannot be dismissed as a non-generic phenomenon. As we shall see, when we estimate preferences person by person, about 30% of the population show no caste preference of the type modeled here, which reflects the fact that their caste preferences are sufficiently weak that other factors dominate their decisions. Therefore, given realistic choice environments (say 50 letters to choose from), we will never see them acting on their caste preferences. This is what indifference is meant to capture.
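To see how stringent the condition β < β1 can be relative to β < β0, the thresholds can be computed directly from the two stability inequalities in the proof. The following is a minimal Python sketch under stated assumptions: the function names and the parameter values in the example are hypothetical and chosen only for illustration, and the sketch does nothing more than restate the definitions of β0 and β1.

```python
def caste_thresholds(alpha, gamma, f_hh, f_hl):
    """Return (beta0, beta1) as defined before Proposition 1.

    alpha : weight on caste for caste-conscious individuals (alpha > 0)
    gamma : strength of the horizontal (own-caste) preference
    f_hh  : f(H, H), the output of a match between two H types
    f_hl  : f(H, L), the output of a match between an H and an L type
    """
    if alpha <= 0 or f_hl <= 0:
        raise ValueError("alpha and f(H, L) must be strictly positive")
    ratio = f_hh / f_hl
    # (1 + alpha*beta - alpha*gamma) f(H, L) > f(H, H)  <=>  beta > beta0
    beta0 = (ratio + alpha * gamma - 1.0) / alpha
    # (1 + alpha*beta) f(H, L) < (1 - alpha*gamma) f(H, H)  <=>  beta < beta1
    beta1 = ((1.0 - alpha * gamma) * ratio - 1.0) / alpha
    return beta0, beta1


def proposition1_region(beta, alpha, gamma, f_hh, f_hl):
    """Say which part of Proposition 1 applies to a given beta."""
    beta0, beta1 = caste_thresholds(alpha, gamma, f_hh, f_hl)
    if beta <= beta1:
        return "only assortative matches"
    if beta > beta0:
        return "some non-assortative matching if LCN holds"
    return "not pinned down by Proposition 1"


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    print(caste_thresholds(alpha=0.5, gamma=0.5, f_hh=3.0, f_hl=2.0))
```

With the hypothetical values α = 0.5, γ = 0.5, f(H, H) = 3 and f(H, L) = 2, the sketch gives β0 = 1.5 and β1 = 0.25, so Proposition 1 is silent on a wide range of values of β.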
The next proposition provides a stronger characterization by adding the requirement that within a caste-type, the fraction of caste-neutral types is the same among men and women (see A.2 for the proof). For this we need to make a stronger assumption about the population distribution. We define
Condition 3. A population is said to satisfy strong balance (SB) if ωri = µri, and νri = κri where r = H, L and i = 1, 2.
Proposition 2. Suppose the population satisfies SB. If β < β0 then only assortative matchings are stable.
Conversely, when β > β0, all equilibria must have some non-assortative out-of-caste matching as long as condition LCN holds. Moreover, if there is non-assortative out-of-caste matching it must involve, in addition to assortative matches, combinations of m ≥ 0 L1-H2 and H2-L1 pairs and n ≥ 0 either H2-L2 and L1-H2 pairs or L2-H2 and H2-L1 pairs. Finally, the APC is zero when β < β0 and positive if β > β0.

It is useful to ask whether we would get very different matching patterns if we took the same population (i.e., one that satisfies SB), but used a TU framework. It should be clear that with TU matching, all CC H1 will match with each other and so will all CC L2 (the SB assumption makes this feasible) and all CN. Under TU, it is sufficient to look at the total surplus under a given match and compare it with the total surplus under alternative matches. Let v(xic, yjc) denote the total surplus when a man of type xic is matched with a woman of type yjc. Under our assumptions
\[
v(x_{ic}, y_{jc}) = \left[\, 2 + \alpha(c)\left\{ \beta\,\bigl(4 - (i + j)\bigr) - 2\gamma\,(i - j)^2 \right\} \right] f(x, y).
\]
Given SB, we can show that if β ≥ 2β0 − γ then an equilibrium with non-assortative matches is possible. Suppose not, and therefore start without loss of generality with an assortative matching equilibrium where individuals are matched to someone of the opposite sex identical to them in terms of type, caste and caste preference. Consider a match between an H2C and an L1N:
v(H2C, L1N) = [2 + α (β − γ)] f (H, L).
Since v(H2C, H2C) = 2 f (H, H), so long as β ≥ 2β0 − γ an H2C type is better off matching with an L1N type. Also, as f (H, H) > f (L, L), and v(L1N, L1N) = 2 f (L, L), by a similar argument an L1N type is better off matching with an H2C type. When β < 2β0 − γ, we can show that all pairs of non-assortative matches listed in the proof of Proposition 1 would not exist in equilibrium as they generate surpluses that are smaller than the ones obtained by assortatively matching the H-types. Thus, if β < 2β0 − γ and SB holds, all pairings would be assortative.

To sum up, our model suggests that the impact of caste preferences on equilibrium outcomes depends crucially on whether these preferences are vertical or horizontal. When preferences are mostly horizontal, out-of-caste matches will look like in-caste matches on non-caste attributes, i.e. they will be assortative, to the extent that demographics allow it. Furthermore, little would change in matching patterns on non-caste attributes if caste preferences were to be ignored and the “price of caste” will be zero. On the other hand, when preferences are strongly vertical, some fraction of out-of-caste matches would be non-assortative and we will see a positive “price of caste” in equilibrium. Given these theoretical predictions, the empirical sections that follow will focus on estimating the magnitude of the caste preferences in our sample and determining whether they are mostly horizontal or vertical. Then, using these estimates, we will explore empirically the equilibrium consequences that these caste preferences generate for marital pairing and highlight their resemblances to the theoretical predictions generated here.
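The transferable-utility comparison in the model section can be checked numerically. The following minimal Python sketch rests on an assumption that should be read with care: the individual payoff weight 1 + α{β(2 − j) − γ(i − j)²} for a person of caste i matched with caste j is inferred from the inequalities in the proof of Proposition 1 and from the surplus expressions above, and is not quoted from the model set-up. Function names and parameter values are hypothetical.

```python
def individual_weight(own_caste, partner_caste, alpha, beta, gamma):
    """Payoff weight on f(x, y) for one partner (inferred form, an assumption)."""
    vertical = beta * (2 - partner_caste)               # gain from a higher-caste partner
    horizontal = gamma * (own_caste - partner_caste) ** 2  # cost of leaving one's caste
    return 1.0 + alpha * (vertical - horizontal)


def total_surplus(caste_a, alpha_a, caste_b, alpha_b, beta, gamma, f_xy):
    """Total surplus of a match: sum of both partners' weights times f(x, y)."""
    w_a = individual_weight(caste_a, caste_b, alpha_a, beta, gamma)
    w_b = individual_weight(caste_b, caste_a, alpha_b, beta, gamma)
    return (w_a + w_b) * f_xy


def tu_threshold(alpha, gamma, f_hh, f_hl):
    """Return 2*beta0 - gamma, the cutoff used in the TU argument."""
    beta0 = (f_hh / f_hl + alpha * gamma - 1.0) / alpha
    return 2.0 * beta0 - gamma


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    alpha, gamma, f_hh, f_hl = 0.5, 0.5, 3.0, 2.0
    beta = tu_threshold(alpha, gamma, f_hh, f_hl)
    # H2C matched with L1N (alpha = 0 for the caste-neutral partner)
    print(total_surplus(2, alpha, 1, 0.0, beta, gamma, f_hl))
    # H2C matched with H2C
    print(total_surplus(2, alpha, 2, alpha, beta, gamma, f_hh))
```

At β = 2β0 − γ the two surpluses coincide, which is the sense in which this value is the cutoff: for larger β, v(H2C, L1N) exceeds v(H2C, H2C).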
3 Setting and data
3.1 Setting: the search process
Our starting point is the set of all matrimonial ads placed in the Sunday edition of the main Bengali newspaper, the Anandabazar Patrika (ABP), from October 2002 to March 2003. With a circulation of 1.2 million, ABP is the largest single-edition newspaper in India and it runs a popular special matrimonial section every Sunday. The search process works as follows. First, the parents or relatives of a prospective bride or groom place an ad in the newspaper. Each ad indicates a PO box (provided by the newspaper), and sometimes a phone number, for interested parties to reply. They then get responses over the next few months (by mail or by phone), and elect whether or not to follow up with a particular response. While ads are placed by both sides of the market, “groom wanted” ads represent almost 63 percent of all ads placed. One can both post an ad and reply to one. When both parties are interested, the two sets of parents meet, then the prospective bride and groom meet. The process takes time: within a year of placing an ad, 44 percent of the ad-placers whom we interviewed were married or engaged, although most had placed only a single ad. Of those who got married, 65 percent met through an ad; the rest met through relatives or, in 20 percent of the cases, on their own (these are referred to as “love marriages”).
3.2 Sample and data collection
We first coded the information in all the ads published in the Sunday edition over this time period. We excluded ads placed under the heading “Christian” or “Muslims” in the newspaper given our focus on caste, which is primarily (though not exclusively) a phenomenon among Hindus. The details on the information provided and the way it was coded are given below. We refer to this data set of 22,210 ads as the “ad-placer sample.” We further restricted our attention to ads that did not mention a phone number, and requested all responses to be sent to the newspaper PO Box or to a personal mailing address.13 This restriction was necessary to make sure that what the ad-placer knows about his/her respondents is fully captured by the letters. About 43 percent of the ad-placer sample included a phone number (sometimes in addition to a PO Box and sometimes as the only way to contact the ad-placer). We find little difference between the characteristics of the ads that included a phone
13Only a small fraction of ads included only a personal mailing address (namely, 4 percent of our interview-sample, and 8 percent of the ad placer sample).
number and those that did not, except in terms of geographical location: fewer ad-placers with phone numbers were from Kolkata. After excluding these ads from the ad-placer sample, we randomly sampled 784 ads. With ABP’s authorization, respondents were approached and asked whether they would agree to be interviewed when they came to collect the answers to their ads at the newspaper PO Box. Only one sampled respondent refused to be interviewed. The ads placed by the 783 individuals who completed the survey form the “interview sample.” The interview was conducted in the ad-placer’s home after a few days with the person in charge of the search, usually the parent, uncle or older brother of the prospective groom or bride. Detailed information was collected on the prospective groom or bride, his or her family and the search process for a marriage partner.14 In particular, ad-placers were asked whether they also replied to other ads and, when they did, to identify the ads they had responded to among the ads published in the past few weeks. Ad-placers were also asked how many letters they received in response to their ad (on average 83 for bride-wanted and 23 for groom-wanted ad-placers), and to identify the letters they were planning to follow up with (the “considered” letters). We then randomly sampled five letters from the set of “considered” letters (or took the entire set if they had fewer than five in this category), and ten (or all of them if they had fewer than ten in this category) from the set of the “non-considered” letters, and requested authorization to photocopy them. The information in these letters was subsequently coded, using the procedure outlined below. We refer to this data set as the “letter data set.” Finally, a year after the first visit, this original interview sample was re-interviewed, and we collected information regarding their current marital status and their choice of partner. Only 33 ad-placers out of the entire sample could not be contacted.
Out of those we reached, 346 were married or engaged, and 289 of those agreed to a follow-up interview and gave us detailed information regarding their selected spouse, the date of the marriage and their overall search process, including the number of ads posted and the way the match was made. Appendix Tables C.1 and C.2 compare ad-placers found and not found and those who agreed or refused to answer the follow-up questions. There appear to be few systematic differences between the two groups.
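The letter-sampling rule described in this subsection (up to five “considered” letters and up to ten “non-considered” letters per ad-placer) implies that letters from different strata are selected with different probabilities, which is why each letter is later weighted by the inverse of its selection probability. The following minimal Python sketch illustrates one way such a sample and its weights could be constructed; the function name, the letter identifiers and the use of Python's `random` module are our own illustrative assumptions and do not describe the authors' actual procedure.

```python
import random


def sample_letters(considered, not_considered, rng,
                   n_considered=5, n_not_considered=10):
    """Sample letters by stratum and attach inverse-probability weights.

    Returns a list of (letter, weight) pairs. Within a stratum of N letters
    from which k are drawn, each letter is selected with probability k / N,
    so its weight is N / k. If a stratum has no more letters than the quota,
    all of them are taken and the weight is 1.
    """
    sampled = []
    strata = ((considered, n_considered), (not_considered, n_not_considered))
    for letters, quota in strata:
        if not letters:
            continue
        k = min(quota, len(letters))
        weight = len(letters) / k
        for letter in rng.sample(letters, k):
            sampled.append((letter, weight))
    return sampled


if __name__ == "__main__":
    rng = random.Random(0)                          # fixed seed for reproducibility
    considered = [f"c{n}" for n in range(8)]        # hypothetical letter ids
    not_considered = [f"n{n}" for n in range(40)]   # hypothetical letter ids
    for letter, weight in sample_letters(considered, not_considered, rng):
        print(letter, weight)
```

In this hypothetical example the weights sum to the total number of letters received (48), so weighted statistics on the sampled letters are representative of the full set of letters for that ad-placer.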
3.3 Variable construction
Ads and letters provide very rich and mostly qualitative information. A data appendix describes the coding process. In this subsection, we mainly discuss the coding process for the caste information. If caste was explicitly mentioned in the ad or letter, we used that information as the caste of the person. Caste is often not explicitly mentioned in the ad because the ad is usually placed
14The questionnaire is available online at https://sites.google.com/site/jeannelafortune/research.
underneath a particular heading in the newspaper corresponding to a caste. If caste is not directly mentioned in the ad, the heading is used for this classification. The information on caste is readily available, directly or indirectly, in the overwhelming majority of ads (98 percent). In the letters, caste is explicitly mentioned in about 70 percent of the cases. As already mentioned, Hindu society is divided into a number of broad castes (varnas) but each of these castes, in turn, is divided into a number of sub-castes (jatis). Ad-placers or letter writers can be more or less specific in identifying themselves. Historically, there was a more or less clear hierarchy among the broad caste groups, but within each broad group, there was no clear ordering. We therefore grouped castes into eight ordered broad-caste groups, based on the classifications in Risley (1981) and Bose (1958), with Brahmin at the top (with the rank of 8) and various scheduled castes at the bottom (with the rank of 1). Appendix Table C.3 presents the classification. To determine whether a letter writer and an ad-placer are from the same caste, we attributed to each letter or ad the specific sub-caste mentioned in it. If the ad-placer or letter writer only mentioned a broad group, he or she is assumed to be from any of the specific sub-castes. For example, a self-identified Kulin Brahmin is considered to be from a different caste than a self-identified Nath Brahmin (though the vertical distance between them is set to zero), but is considered to be of the same caste as someone who simply identified himself as a Brahmin. Another relevant piece of information is the stated preference regarding caste. Among the sampled ads, more than 30 percent of individuals specify their preference for marrying within their caste (using phrases such as “Brahmin bride wanted”).
Another 20 to 30 percent explicitly specify their willingness to consider unions outside their own caste by the use of phrases such as “caste no bar.” The remaining 40 to 50 percent do not make any mention of preferences regarding caste. The remaining variables coded were: education (in 7 categories), earnings and occupation for men (we construct an occupational score, referred to as “wage” in what follows), family origin, physical characteristics, and some more rarely mentioned traits (astrological signs, blood types, etc.). The data appendix provides more details on the coding and Appendix Table C.4 shows the fraction of ads in which each characteristic is not mentioned.
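The rule used to decide whether two individuals are of the same caste can be summarized in a few lines of code. The following Python sketch is only an illustration of the rule as described in this subsection: the `Caste` record, the function names and the sign convention for the vertical distance are hypothetical choices of ours, and the only rank taken from the text is the rank of 8 for Brahmins.

```python
from typing import NamedTuple, Optional


class Caste(NamedTuple):
    broad_group: str                  # one of the eight ordered broad groups
    rank: int                         # 1 (bottom) to 8 (Brahmin)
    sub_caste: Optional[str] = None   # None if only the broad group is stated


def same_caste(a: Caste, b: Caste) -> bool:
    """Same caste if the broad groups agree and the sub-castes do not conflict.

    Someone who states only a broad group is treated as possibly belonging
    to any of its sub-castes.
    """
    if a.broad_group != b.broad_group:
        return False
    if a.sub_caste is None or b.sub_caste is None:
        return True
    return a.sub_caste == b.sub_caste


def vertical_distance(a: Caste, b: Caste) -> int:
    """Difference in broad-group rank (zero within a broad group)."""
    return a.rank - b.rank


if __name__ == "__main__":
    kulin = Caste("Brahmin", 8, "Kulin")
    nath = Caste("Brahmin", 8, "Nath")
    brahmin = Caste("Brahmin", 8)
    print(same_caste(kulin, nath), vertical_distance(kulin, nath))
    print(same_caste(kulin, brahmin))
```

The example reproduces the case discussed in the text: a Kulin Brahmin and a Nath Brahmin are coded as different castes with zero vertical distance, while a Kulin Brahmin and someone identified only as Brahmin are coded as the same caste.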
3.4 Summary statistics
Table 1 presents summary statistics for both our interview sample and the full set of ads. The two samples look quite similar, except that the interview sample is more likely to live in Kolkata (the Kolkata sample was less likely to provide a phone number). It is important to emphasize that our sample is not at all representative of India, or even West Bengal. It is drawn mostly from the Bengali (upper) middle class, as evidenced both by the prevalence of higher caste individuals (a quarter of the sample are Brahmin), and educational
achievement. Education levels are mentioned in the ad by 90 percent of women and 80 percent of men. Almost all men and women (90 percent) have at least a bachelor’s degree. Both men and women have occupational scores significantly higher than the median urban formal sector occupational score (from Bargain et al. 2007 and Glinskaya and Lokshin 2005). This group enters the marriage market after they have completed their education and (at least for men) found a job: the average age is 27 for women, and 32 for men. Around 50 percent of the sample lives or works in Kolkata and slightly less than half consider their family as originating from West Bengal. This paper is not meant to be a characterization of the marriage market in India, but a description of how this particular market works; it is quite striking that even in this very well educated and quite well off sample, caste remains so important. Physical characteristics clearly play an important role in the marriage market. Height is mentioned in the ad by 96 percent of the women and 90 percent of the men. A prospective bride’s skin tone and beauty are mentioned in 75 percent and 70 percent of the groom-wanted ads, respectively. There does not appear to be much boasting about physical appearance, however. More ads describe the bride as being “decent-looking” than either “beautiful” or “very beautiful.” Table 2 shows summary statistics for this sample, comparisons between the ad-placers and the letters they have received, as well as with their eventual spouses. In this table, as well as in the remainder of the paper, all differences are presented in terms of the difference between the characteristics of the man and the characteristics of the woman.15 Two-thirds of the letters that mention caste are from someone from the same caste as the ad-placer.
The fraction of within-caste marriages among actual matches is a little higher than the fraction of letters that come from within one’s caste: 72 percent of the prospective grooms and 68 percent of the prospective brides who are married after a year have married within their own narrow caste. This fraction increases to 76 percent and 72 percent respectively if we use the broad classification in terms of caste. Men who marry outside of caste tend to marry women from a lower caste while women who marry outside of caste tend to marry someone from a higher caste. Women tend to marry grooms who have either the same education (42 percent) or who are more educated than them (45 percent). Men are more likely to marry women who are as educated as or more educated than themselves. Between 72 and 75 percent of the brides and grooms are from the same family origin (i.e., West or East Bengal).
15Since the sampling was stratified with unequal weights, each letter is weighted by the inverse of its probability of selection.
4 Estimating preferences
Using this data, we now estimate the preferences over various characteristics, exploiting the choices made by ad-placers and people who replied to their ads. We first discuss our basic empirical strategy and present the results. We then empirically examine various concerns about why the coefficients we observe may not actually represent households’ preferences.
4.1 Basic empirical strategy
The first goal of this section is to estimate relative preferences for various attributes in a prospective spouse. We assume that the value of a spouse j to a particular individual i can be described by the following function:

\[
U(X_j, X_i) = \alpha X_j + \beta f(X_i, X_j) + \mu_i + \varepsilon_{ij} \tag{1}
\]

where α captures the effect of the characteristics of person j, β specifies how this effect might be different depending on person i’s own characteristics and µi represents ad-placer fixed effects.

We have in our data several indications of individuals’ revealed preference for one potential spouse over another that can allow us to estimate the parameters of equation (1). First, we know whether an ad-placer is following up with a particular letter writer or not. We thus have information that he preferred this letter to the letters he did not consider. Second, the ad-placers also provided us with their ranking of each letter we sampled. For this exercise, we asked them to give us their true preference ordering, regardless of whether they were considering responding to the letter. In addition, for ad-placers who have themselves replied to ads, we know which ads they decided to reply to (and we also know the universe of ads they could have replied to on that particular date). Furthermore, we know that a letter writer decided to reply to an ad. Finally, we also know how many responses an ad received.

We focus in what follows on the decision of the ad-placer to respond to a particular letter. The results using the ranking of letters provided by the respondent (provided in the appendix) are extremely similar. We prefer to consider the ad-placer’s responses to the letters he has received over the other choices we observe in the data for three reasons. First, we can be sure that the ad-placers have read all the letters they have received, so the set over which choices are made is well-defined.
Second, strategic behavior is a priori less likely in this sample since the letter writer has already expressed interest in the ad-placer. The results from these other strategies are presented in the appendix, and the relevant differences are discussed below. The regressions we estimate thus take the following form: