HEDG HEALTH, ECONOMETRICS AND DATA GROUP
WP 20/01
The Education-Health Nexus: A Meta-Analysis
Xindong Xue
January 2020
http://www.york.ac.uk/economics/postgrad/herc/hedg/wps/
The Education-Health Nexus: A Meta-Analysis
By
Xindong Xue1
School of Public Administration, Zhongnan University of Economics & Law, China
December 6, 2019
1 Correspondence: No.182, Nanhu Avenue, East Lake Hi-tech Development Zone, Wuhan 430073, China. Tel.: +86 27 8838 7901; fax: +86 27 8838 6936. E-mail address: [email protected].
The Education‐Health Nexus: A Meta‐Analysis
Xindong Xue
Abstract
Does education cause a better health? No consensus answer to this question has yet emerged. In this paper, I perform a meta-analysis of the extensive literature on the health effects of education. The final sample identifies 105 studies with 4,671 estimates. Overall, the health effect of education is not economically meaningful, although statistically significant. There is severe publication bias favoring the positive effect of education on health. Studies that do not control for endogeneity are prone to exaggerate the estimated effect. In addition, the effect becomes weaker for more recent studies. The results suggest that education may not be a feasible policy option for promoting population health.
JEL CODES: B49, C49, I10, I20, I31 KEYWORDS: Education, Health, Human capital, Meta-analysis, Research synthesis
1
1.Introduction
Since the seminal works of Schultz (1961), Becker (1964) and Grossman (1972), social
scientists have generally agreed that education and health play a fundamental role in economic
development and well-being. In 2015, United Nations listed education and health as important
sustainable development goals by 2030 (United Nations, 2015). Given their intrinsic value for human development, an increasingly large body of literature has been devoted to examining the complementary relationship between education on health. Does education affect health?
The empirical answer to this question has important implications. In theory, it can directly test the model of demand for health capital, which predicts that education improves health
(Grossman, 1972). In policy, if education does have a large, beneficial effect on health, then educational interventions might serve as a more cost-effective tool for promoting population health than merely increasing public health spending.
A large, positive correlation between education and health has been observed extensively in many countries, no matter how education and health are measured.1 However, there is a
broad disagreement on how large the effect is and whether this effect is causal. While some
studies report a significant education effect on health (e.g., Van Kipperslius, O’Donnell, & van
Doorslaer, 2011; Oreopoulos, 2006, 2007; Lleras-Muney, 2005), some other studies find small
or no effects (e.g., Meghir et al., 2018; Lager & Torssander, 2012; Clark & Royer ,2013;
Behrman et al.,2011; Arendt, 2005; Albouy & Lequien, 2009; Braakmann, 2011). Moreover,
some studies find mixed evidences across different health dimensions or sub-groups. Education
works for some health indicators or only for some specific groups (e.g., James, 2015; Kemptner
et al., 2011; Webbink et al., 2010).
Given the diversity of findings on the effects of education on health, the purpose of this
paper is to employ meta-analysis to quantitively review the empirical literature in this regard.
1 For a comprehensive review, see Grossman and Kaestner (1997), Grossman (2006), Cutler and Lleras-Muney (2008,2012). 2
Meta-analysis is a reliable and objective way to synthesize research findings and has been
extensively used in the field of economics, especially in the case that the empirical literature
lacks consensus (Neves et al., 2015; Stanley & Doucouliagos, 2012). It also employs
multivariate regression to reveal the factors underlying the heterogeneity of estimates, and to
establish whether there are any consistent and generalizable results which apply across contexts
and methods (Anderson et al., 2018).
To date, there are only two meta-analytic studies on the education effects on health.1
Furnee et al. (2008) is the first to conduct a meta-analysis on 88 estimates from 40 studies.
Their results show that the quality-adjusted life years (QALY) weight of a year of education is
0.036. However, their study only uses self-reported health as health measure. Since health is a multi-dimensional concept, self-reported health alone cannot delineate the overall picture of the health effects of education. Another study by Hamad et al. (2018) provide a review on the quasi-experimental studies of compulsory schooling laws. Their findings indicate that education has an effect on most health outcomes—most beneficial, some negative. However, they don’t make clear on what the effect size is. Since different studies use different health measures and estimation methodology, estimates may not be comparable in different studies.
Building on the previous studies, this paper seeks to answer three core questions. First, what is overall effect of education on health? Second, is there any publication bias in the current literature? Third, what are the factors explaining the heterogeneity of the estimates? This paper contributes to the existing meta-analytic literature in several ways. First, I carry out a systematic search and reporting procedure to extract standardized effect size estimates from different studies. The standardized effect size makes it possible to compare the different estimates in different studies. Second, the final sample consists of 4,671 estimates from 105 studies, representing the most comprehensive meta-analysis on the effects of education on
1 Another closely related literature is Galama et al. (2018). They reviewed the experimental and Quasi-experimental evidence on the effects of education on health and mortality. However, they didn’t conduct a formal meta-analysis. 3
health up to date. Third, I perform meta-regression analyses to reveal the factors behind the different estimated effects across studies. Fourth, as highlighted in some studies (e.g., Galama et al, 2018), heterogeneity may underlie the conflicting results on the effects of education on health. This study codes a detailed list of measures of health and education so as to explore the heterogeneous effects of education on health. Finally, I implement a variety of weighting procedures to calculate the mean effect, use clustered standard errors to eliminate serial correlations between estimates within studies, and conduct formal Funnel Asymmetry Tests
(FATs) and Precision Effect Estimate with Standard Error (PEESE) to detect publication bias.
I find that the overall effect of education on health is not economically meaningful, although statistically significant. There is substantial publication bias favoring a positive impact of education on health. The meta-regression analysis indicates that studies that do not address endogeneity are prone to exaggerate the effects of education on health. And the effect becomes weaker for more recent studies. The results cast doubt on the policy initiatives to improve population health through educational interventions.
The rest of the paper proceeds as follows. Section 2 introduces the conceptual framework.
Section 3 describes the meta-dataset. Section 4 conducts the preliminary analysis. Section 5 discusses publication bias. Section 6 explores the heterogeneity. Section 7 conducts robustness checks. Section 8 summarizes and concludes.
2. Conceptual Framework
In theory, there are several explanations why education may improve health. The first is the productive efficiency hypothesis, which states that people with higher education are more efficient producers of health because education increases the productive efficiency from the given quantities of health inputs (Grossman, 1972, 2006). The second is the allocative efficiency hypothesis, which argues that education can improve health through the optimal mix of health inputs. Better-educated people have more information on the deleterious effects of
4
smoking and bad habits, so that they are more likely to have healthy lifestyles (Rosenzweig &
Schulz, 1989). The third is that education improves health through channels such as better labor market opportunities, higher income, better living conditions, higher quality of care and living environment (Card, 1999; Cutler & Lleras-Muney, 2010).
In empirical literature, identifying the causal effects of education on health is plagued by endogeneity problems. The first originates from reverse causality. Healthier people usually have higher education (Behrman & Rosenzweig, 2004). Second, there may be omitted third variables including genetics, time preference and family background, which simultaneously cause both education and health to move in the same direction (Fuchs, 1982; Bijwaard et al.,
2015). For example, people with low time preference usually value future returns more highly, thus investing more in education and health concurrently.
Several identification strategies have been implemented to disentangle the causal relationships between education and health. However, the results of these studies are not consistent. The most widely used strategy is instrumental variable (IV). For example, Lleras-
Muney (2005), in her influential study, uses compulsory schooling laws as instruments for education and finds one additional year of compulsory schooling reduces 10-year mortality in the United States by as much as 6 percentage points. In contrast, Black et al. (2005), also using the compulsory schooling laws as instruments, find no effect of education on mortality among the US population. Furthermore, Buckles et al. (2016) exploits the exogeneous variation in years of college education caused by Vietnam draft and shows that education reduces mortality significantly. Finally, a recent study by Fletcher (2015) shows that the effects of education are not precisely estimated, though they appear to be large. It should also be noted that the variations in the results of instrumental variables studies may be partly due to the validity of instrument, as the exclusion condition of IV cannot be directly tested (Grossman, 2015).
The second approach to address endogeneity is Regression Discontinuity Design (RDD).
5
Compared to IV, RDD imposes weaker assumptions for identification. Oreopoulos (2006) employ compulsory schooling laws reform in UK as a discontinuity and finds a significant effect of education on both physical health and self-rated health. Clark and Royer (2013), however, use the same strategy and report no significant effect of education on mortality in
Britain. Albouy and Lequien (2009) also find that education does not have significant effect on the survival rate of French population. Meghir et al. (2018) examine the consequence of compulsory education reform in Sweden and find no significant effect of the reform on mortality.
The third alternative approach is twin fixed effects estimation. The logic behind this approach is that twins share almost the same characteristics such as genetic inheritance, gender, family characteristics and innate ability. It is plausible that differences in education between twins are exogenously determined, so differences in outcomes across twins can be seen as the outcomes from differences in their education. Using the Danish twin samples, Behrman et al.
(2011) find that education has no significant effect on mortality, while Van den Berg et al.
(2012) and Madsen et al. (2010) report mixed findings which depend on the sample studied.
On the basis of twin samples in Sweden, Lundborg et al. (2016) finds that education significantly reduces mortality. Furthermore, some studies show that education reduces the overweight for men (Webbink et al. 2010) but not for women (Webbink et al. 2010, Amin et al. 2013).
It can be seen from the prior literature that estimates on the effects of education on health vary greatly depending the methodology, health measures and sample used. In light of these conflicting findings, it becomes an empirical question to ascertain whether and to what extent education has an impact on health. Since there are large discrepancies in estimates across different studies, it is essential to synthesize these results and arrive at a general conclusion.
The meta-analysis can address this question.
6
3. The Meta-Data Set
3.1 Search Strategy and Selection Criteria
Following the MAER-Net guidelines proposed by Stanley et al. (2013), I carry out a systematic
search for the potential studies in the following online databases: EconLit, Web of Knowledge,
Google scholar, JSTOR, EBSCO, RePEc, IDEAS, SSRN, Scopus, ProQuest, NBER, IZA,
OECD Library and World Bank Publications. In line with Minasyan et al. (2019), reference
snowballing techniques was also employed to collect articles identified through the search
engine process. The search scope covers published journal articles, book chapters, conference
proceedings, working papers, master theses, doctoral dissertations, research reports, and other
technical reports. The combinations of the following key words are used: “education”,
“schooling”, “health”, “mortality”, “disease”, “obesity”, “BMI”, “morbidity”,
“depression”, “cognition”, “life expectancy” and “survival”. The search process was
completed at the end of December, 2018.
The initial search identified 474 studies in total. To make the analysis consistent, I adopt
a stepwise procedure to finalize the sample. The first step is to remove the duplicated studies,
including working papers reporting the same estimates with the final published version. Second,
I drop the studies which examined the role of health education, for example, the physical
education or oral health education, which was not directly related to our subject. Third, I delete
the studies that examine the effect of health on education. Forth, I restrict the sample to the
effect education on own health, excluding the studies investigating the intergenerational effect of education on health. Fifth, I discard studies lacking sufficient information to calculate t- statistics (that is, standard errors, t-statistics, p-value, or 95% confidence Intervals). Sixth, I exclude the theoretical studies or systematic reviews which did not report econometric estimates. Lastly, I eliminate studies which includes interaction terms and quadratic specifications of the education variable in the regression specification because it is difficult to
7
extract the partial estimated effect (Gunby, Jin & Reed, 2017). I collected 4,718 estimates from
107 studies. The detailed flowchart is shown in Appendix A. The studies are listed in Appendix
B.
Table 1 reports the source of selected studies. The majority of the estimates (73.52%) is
extracted from journal articles, followed by estimates from working papers (24.62%),
conference proceedings (1.54%) and books (0.32%). With regard to the common journal outlets,
the journal with highest percentage of estimates is Social Science & Medicine (18.49%), an
interdisciplinary journal devoted to the exploration of social science on health. The next
journals in terms of frequency are Journal of Health Economics (10.83%), Economics &
Human Biology (9 %), Social Science Research (7.69%), and Economics of Education Review
(6.29%). Following these are a list of economic journals (OECD Journal: Economic Studies,
Health Economics, China Economic Review, Applied Economics), population journal
(Demography) and public health journal (International Journal of Epidemiology). This proves
that education-health nexus is a broad, appealing subject in the field of multi-disciplines.
3.2 Exclusion of Outliers
Outliers may be a concern in the meta-analysis. Before proceeding to code the studies, I
remove a few implausibly influential estimates from the dataset. Following Gallet &
Doucouliagos (2017), I first estimate a FAT-PET and then remove observations with absolute
value of the standardized residual greater than 3.5. As a consequence, the 47 outliers are
removed from the dataset. The final sample consists of 105 studies with 4671 estimates. Note
that the main results remain quantitatively similar without excluding the outliers.
Figure 1 displays the histogram of t-statistics used in the final sample. The distribution is
right-skewed with a few large outliers, indicating many studies reporting positive estimates. Of
these, 2183 (46.74%) recorded a significant, positive relationship between education and health
at the 10% significance level or below. 2319 (49.65%) record no significant relationship. There
8
are 169 estimates (3.62%) reporting a significant, negative relationship.
3.3 Coding Procedure
For each study in the sample, I code the data to extract a range of study characteristics.
This includes: the study’s author(s), publication year, publication status (e.g., journal, working paper, books), journal name, data type (cross-sectional or panel; individual level or aggregate level), countries studied, names of health variables, names of education variables, number of observations, and sample type (whole population or subsamples). I also record the regression coefficient and its associated standard errors or confidence intervals so as to derive the partial correlation coefficient (PCC).
It is worth noting that there are a number of studies not reporting the t-statistics or the standard error but p-values. I use the TINV function in Excel to calculate the t-statistics using the p-values and the degrees of freedom (Stanley & Doucouliagos, 2012). Similarly, there are some studies only reporting the levels of statistical significance with asterisk mark ***, **and
*. I follow the rule of thumb to assume that p-value is 0.01 for ***, 0.05 for **, 0.1 for * and
0.5 for no asterisk mark. I then use these p-values and degrees of freedom to calculate the t- statistics.
In addition, some studies employ non-linear estimations (e.g.,Probit/logit, cox proportional hazard model) and report only the Odds Ratio (OR) with standard error or 95%