
Modeling Return to Education in Heterogeneous Populations


Angelo Mazza · Michele Battisti · Salvatore Ingrassia · Antonio Punzo


Keywords Indicators of return to education · Mincer’s earnings function · mixtures of regression models

JEL classification: C14, J24, J31

Abstract The Mincer earnings function is a regression model that relates an individual’s earnings to schooling and experience. It has been used to explain individual behavior with respect to educational choices and to indicate productivity in a large number of countries and across many different demographic groups. However, recent empirical studies have shown that the populations of interest often embed latent homogeneous subpopulations, with different returns to education across subpopulations, rendering a single Mincer’s regression inadequate. Moreover, whatever (concomitant) information is available about the nature of such heterogeneity should be incorporated in an appropriate manner. We propose a mixture of Mincer’s models with concomitant variables: it provides a flexible generalization of the Mincer model, a breakdown of the population into several homogeneous subpopulations, and an explanation of the unobserved heterogeneity. The proposal is motivated and illustrated via an application to data provided by the Bank of Italy’s Survey of Household Income and Wealth in 2012.

M. Battisti
Dipartimento di Scienze Giuridiche, della Società e dello Sport, Università di Palermo, Via Maqueda, 172, 90134 Palermo, Italy, E-mail: [email protected]

S. Ingrassia, A. Mazza and A. Punzo
Dipartimento di Economia e Impresa, Università di Catania, Corso Italia, 55, 95129 Catania, Italy, E-mail: [email protected], E-mail: [email protected], E-mail: [email protected]

1 Introduction

Earnings functions are used by social scientists to explain individual behavior with respect to educational choices and to indicate productivity (Oreopoulos and Petronijevic, 2013). They provide an indicator of returns to schooling, typically in the form of projected future wages, which helps individuals to decide how to invest in their own human capital (Patrinos, 2016). These indicators are also used as a basis for setting public policy with respect to investment in education (Davis and Noland, 2003).

Introduced by Jacob Mincer, a pioneer of the New Labor Economics, in his seminal work Schooling, Experience and Earnings, the “human capital earnings function” is arguably the most popular earnings function (Mincer, 1974; Chiswick, 2007; Machin, 2007). It is a single-equation model that explains the natural logarithm of earnings as a linear function of years of education, years of potential labor market experience, and the square of years of potential experience; in formula,

$$\ln(y) = \mu(\boldsymbol{x}; \boldsymbol{\beta}) + \varepsilon = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_2^2 + \varepsilon, \qquad (1)$$

where $y$ denotes earnings, $x_1$ and $x_2$ represent years of education and years of potential labor market experience, respectively¹, while $\varepsilon \sim N(0, \sigma)$, with $\sigma$ being the (conditional) standard deviation of $\ln(y)$. In (1), $\boldsymbol{x} = (x_1, x_2)'$ and $\boldsymbol{\beta} = (\beta_0, \beta_1, \beta_2, \beta_3)'$.

The Mincer equation owes its popularity to the straightforward interpretation of the coefficient $\beta_1$ as the approximate rate of return to education (see Björklund and Kjellström, 2002, for a critical discussion). It has been examined on many datasets, involving a large number of countries and many different demographic groups and, as stated by Lemieux (2006), it is “one of the most widely used models in empirical economics”. Within income inequality studies, it has been used to study wage differentials due to gender (Smith and Westergard-Nielsen, 1988) and to predict the wage that a self-employed worker in a certain sector of the economy would have received on average as a paid employee in the same sector (Atkinson, 1986, p. 163; Amarante, 2014). The literature on educational mismatches uses several different specifications of equation (1) to quantify the effect of educational mismatch on wages (Nieto and Ramos, 2016).

However, recent empirical studies (see, e.g., Nordin, 2008; Henderson et al, 2011; Kopf et al, 2013; Battisti, 2013) have shown the relevance of unobserved heterogeneity. That is, the populations of interest often consist of latent groups with different returns to education and different socio-demographic profiles, so that regression coefficients (and dispersion parameters) cannot be assumed to be the same for all observations, making the use of a single Mincer’s regression inadequate. Heterogeneity and segmentation have been shown for Italy by Cipollone (2001) and by Battisti (2013).
In fact, the Italian labor market has traditionally been characterized by rigid institutions, with employment protection legislation imposing strict rules and constraints on the ability to hire and fire workers. Reforms that started at the end of the ’90s have progressively introduced more flexibility, but these new rules mostly apply only to newly hired workers; this has led to the formation of a two-tier labor market (see, for instance, Boeri and Garibaldi, 2007).

¹ See Card (1999, 2001) for a discussion on the use of polynomial terms for experience.

Finite mixtures of linear regression models, introduced by Quandt and Ramsey (1978) in the general form of “switching regression”, constitute a reference framework of analysis when no information about group membership is available and the modeling aim is to find groups of observations with similar regression coefficients (Grün and Leisch, 2008a). Furthermore, whatever (concomitant) socio-demographic information is available about the nature of such heterogeneity should be incorporated in the model in an appropriate manner. To deal with these issues, based on Dayton and Macready (1988), in Section 2 we introduce a finite mixture of Mincer’s regression models with concomitant variables: the proposed model simultaneously provides a flexible generalization of the Mincer regression, a breakdown of the population into several homogeneous subpopulations, and an explanation of the unobserved heterogeneity also based on the considered concomitant variables. The expectation-maximization (EM) algorithm is used for parameter estimation and the Bayesian information criterion (BIC) is adopted to select the number of groups (or clusters or mixture components). In Section 3, the model is applied to disposable household income, as obtained from the Bank of Italy’s Survey of Household Income and Wealth (SHIW) in 2012. In addition to illustrating the use of the model, this real data application demonstrates, based on the BIC, that a single Mincer’s regression is inadequate for the data at hand.
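As a point of reference for the mixture model introduced next, the single Mincer regression (1) can be fitted by ordinary least squares. The following is only a minimal sketch on simulated data: the function names, the sample size and the "true" coefficients are assumptions made for illustration and are not taken from the SHIW application.

```python
import numpy as np

def mincer_design(educ, exper):
    """Design matrix [1, x1, x2, x2^2] of equation (1)."""
    educ = np.asarray(educ, dtype=float)
    exper = np.asarray(exper, dtype=float)
    return np.column_stack([np.ones_like(educ), educ, exper, exper ** 2])

def fit_mincer_ols(log_wage, educ, exper):
    """Least-squares estimates of (beta0, ..., beta3) and of sigma."""
    X = mincer_design(educ, exper)
    y = np.asarray(log_wage, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma = np.sqrt(resid @ resid / len(y))  # ML estimate (divides by n)
    return beta, sigma

# Simulated data: hypothetical values, not the SHIW sample.
rng = np.random.default_rng(0)
n = 5000
educ = rng.choice([5, 8, 11, 13, 16, 18, 21], size=n).astype(float)
exper = rng.uniform(0, 40, size=n)
true_beta = np.array([1.4, 0.05, 0.02, -0.0002])
log_wage = mincer_design(educ, exper) @ true_beta + rng.normal(0, 0.3, size=n)

beta_hat, sigma_hat = fit_mincer_ols(log_wage, educ, exper)
print("estimated return to education (beta1):", round(float(beta_hat[1]), 4))
print("estimated sigma:", round(float(sigma_hat), 4))
```

With a single homogeneous population, as simulated here, one regression recovers the return to education; the mixture model of Section 2 addresses the case in which this assumption fails.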

2 The model

Given a $d$-dimensional vector $\boldsymbol{W}$ of concomitant variables (individual characteristics), and based on Dayton and Macready (1988), we propose to generalize equation (1) via a finite mixture of $k$ Mincer’s regressions with concomitant variables; being a mixture, the proposed model can be defined from the conditional density of $\ln(y)$, given $\boldsymbol{x}$ and $\boldsymbol{w}$, in the following way:

$$p[\ln(y) \mid \boldsymbol{x}, \boldsymbol{w}; \boldsymbol{\vartheta}] = \sum_{j=1}^{k} \pi_j(\boldsymbol{w}; \boldsymbol{\alpha})\, \phi[\ln(y) \mid \boldsymbol{x}; \mu(\boldsymbol{x}; \boldsymbol{\beta}_j), \sigma_j], \qquad (2)$$

where $\pi_j(\boldsymbol{w}; \boldsymbol{\alpha})$, $j = 1, \ldots, k$, are positive weights (depending on the parameters $\boldsymbol{\alpha}$) summing to one for each $\boldsymbol{w}$, $\phi(\cdot; \mu, \sigma)$ denotes the density of a Gaussian random variable with mean $\mu$ and standard deviation $\sigma$, $\mu(\boldsymbol{x}; \boldsymbol{\beta}_j)$, $j = 1, \ldots, k$, is defined as in (1), and $\boldsymbol{\vartheta}$ contains all of the parameters of the model. The multinomial logit model

$$\pi_j(\boldsymbol{w}; \boldsymbol{\alpha}) = \frac{\exp(\alpha_{j0} + \boldsymbol{\alpha}_{j1}' \boldsymbol{w})}{\sum_{h=1}^{k} \exp(\alpha_{h0} + \boldsymbol{\alpha}_{h1}' \boldsymbol{w})} \qquad (3)$$

is assumed for the mixture weights in (2), where $\boldsymbol{\alpha}_{j1} = (\alpha_{j1}, \ldots, \alpha_{jd})'$, $\boldsymbol{\alpha}_j = (\alpha_{j0}, \boldsymbol{\alpha}_{j1}')' \in \mathbb{R}^{d+1}$, and $\boldsymbol{\alpha} = (\boldsymbol{\alpha}_1', \ldots, \boldsymbol{\alpha}_k')'$, with $\boldsymbol{\alpha}_1 \equiv \boldsymbol{0}$ for the sake of identifiability (see Grün and Leisch, 2008b, p. 4).

Being a mixture, model (2) may serve two different purposes (Titterington et al, 1985, pp. 2–3). First, it can be used as a semi-parametric competitor of nonparametric estimation techniques for the conditional density of $\ln(y)$ since, under regularity conditions, any density can be consistently estimated by a mixture of normal densities (Ghosal and van der Vaart, 2001). Second, and more importantly, model (2) can be used as a powerful device for clustering by assuming that each mixture component represents a group underlying the overall population (McLachlan and Basford, 1988). Advantageously, by means of the mixture weights in (3), the concomitant variables $\boldsymbol{w}$ can be used to explain the profiles of the different groups. Here, it is important to stress that, based on the mixture model (2), we do not specify the groups a priori, but we let the data identify homogeneous groups with respect to the relationship between $\ln(y)$ and $\boldsymbol{x}$.
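To make the roles of (2) and (3) concrete, the sketch below evaluates the multinomial logit weights and the resulting conditional density with NumPy. It is an illustration under assumptions: the function names and all parameter values are hypothetical (two components, one binary concomitant variable), and the first row of `alpha` is set to zero to mirror the identifiability constraint.

```python
import numpy as np

def mixture_weights(w, alpha):
    """Multinomial logit weights (3).

    w     : (n, d) concomitant variables
    alpha : (k, d + 1) rows (alpha_j0, alpha_j1'), first row fixed at zero
    """
    w = np.atleast_2d(w)
    eta = alpha[:, 0] + w @ alpha[:, 1:].T        # (n, k) linear predictors
    eta -= eta.max(axis=1, keepdims=True)         # stabilise the exponentials
    expo = np.exp(eta)
    return expo / expo.sum(axis=1, keepdims=True)

def normal_pdf(v, mean, sd):
    return np.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def mixture_density(log_y, x, w, beta, sigma, alpha):
    """Conditional density (2) of ln(y) given x = (x1, x2) and w."""
    x = np.atleast_2d(x)
    X = np.column_stack([np.ones(len(x)), x[:, 0], x[:, 1], x[:, 1] ** 2])
    mu = X @ beta.T                                # (n, k) component means
    comp = normal_pdf(np.asarray(log_y, dtype=float)[:, None], mu, sigma)
    return (mixture_weights(w, alpha) * comp).sum(axis=1)

# Hypothetical parameters for k = 2 components and d = 1 concomitant variable.
beta = np.array([[1.5, 0.04, 0.01, 0.0003],
                 [1.4, 0.065, 0.017, -0.0001]])
sigma = np.array([0.75, 0.28])
alpha = np.array([[0.0, 0.0],      # reference component: alpha_1 = 0
                  [3.0, -1.4]])

pi = mixture_weights(np.array([[0.0], [1.0]]), alpha)
print("weights for w = 0 and w = 1:\n", pi)

# The density of ln(y) for one profile should integrate to one.
grid = np.linspace(-10, 15, 20001)
x0 = np.tile([13.0, 10.0], (grid.size, 1))
w0 = np.tile([1.0], (grid.size, 1))
dens = mixture_density(grid, x0, w0, beta, sigma, alpha)
area = dens.sum() * (grid[1] - grid[0])
print("numerical integral of the density:", round(float(area), 4))
```

Subtracting the row maximum before exponentiating does not change the weights, since the common factor cancels in the ratio in (3).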

2.1 Maximum likelihood estimation: the EM algorithm

To find maximum likelihood (ML) estimates for $\boldsymbol{\vartheta}$ in (2), we adopt the EM algorithm of Dempster et al (1977), as implemented by the stepFlexmix() function of the flexmix package (Leisch, 2004; Grün and Leisch, 2008b) for R (R Core Team, 2013). In detail, given a random sample $(y_1, \boldsymbol{x}_1', \boldsymbol{w}_1')', \ldots, (y_n, \boldsymbol{x}_n', \boldsymbol{w}_n')'$ of $(Y, \boldsymbol{X}, \boldsymbol{W})$ from model (2), and once $k$ is assigned, the algorithm basically takes into account the complete-data log-likelihood

$$l_c(\boldsymbol{\vartheta}) = \sum_{i=1}^{n} \sum_{j=1}^{k} z_{ij} \ln[\pi_j(\boldsymbol{w}_i; \boldsymbol{\alpha})] + \sum_{i=1}^{n} \sum_{j=1}^{k} z_{ij} \ln p(y_i \mid \boldsymbol{x}_i; \boldsymbol{\beta}_j, \sigma_j), \qquad (4)$$

where $z_{ij} = 1$ if $(y_i, \boldsymbol{x}_i', \boldsymbol{w}_i')'$ comes from component $j$ and $z_{ij} = 0$ otherwise. The EM algorithm iterates between two steps, one E-step and one M-step, until convergence; their schematization, with respect to model (2), is given below (see Wedel and DeSarbo, 1995 and Wedel and Kamakura, 2000, pp. 120–124 for further details).

E-step: Given the current parameter estimates $\boldsymbol{\vartheta}^{(r)}$ of the $r$th iteration, each $z_{ij}$ is replaced by the estimated posterior probability

$$z_{ij}^{(r)} = \frac{\pi_j(\boldsymbol{w}_i; \boldsymbol{\alpha}^{(r)})\, \phi[\ln(y_i) \mid \boldsymbol{x}_i; \mu(\boldsymbol{x}_i; \boldsymbol{\beta}_j^{(r)}), \sigma_j^{(r)}]}{p[\ln(y_i) \mid \boldsymbol{x}_i, \boldsymbol{w}_i; \boldsymbol{\vartheta}^{(r)}]}. \qquad (5)$$

M-step: The obtained values of $z_{ij}^{(r)}$, which are functions of $\boldsymbol{\vartheta}^{(r)}$, are substituted for $z_{ij}$ in (4), leading to the expected complete-data log-likelihood, which is maximized with respect to $\boldsymbol{\vartheta}$, subject to the constraints on these parameters. Note that, as the two terms on the right-hand side of (4) have zero cross-derivatives, they can be maximized separately.
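The two steps can be sketched in a few lines of NumPy. This is not the flexmix implementation: it is a simplified illustration in which the regression term of (4) is maximized by weighted least squares, while the multinomial logit term is only improved by a few gradient-ascent steps (a generalized EM). The function names, the step-size rule and the simulated data are assumptions introduced here.

```python
import numpy as np

def softmax_weights(W1, alpha):
    """Mixture weights (3); W1 carries a leading column of ones."""
    eta = W1 @ alpha.T
    eta -= eta.max(axis=1, keepdims=True)          # numerical stability
    e = np.exp(eta)
    return e / e.sum(axis=1, keepdims=True)

def normal_pdf(v, mean, sd):
    return np.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def em_mincer_mixture(log_y, X, W1, k, n_iter=300, tol=1e-8, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    z = rng.dirichlet(np.ones(k), size=n)          # random soft start
    beta, sigma = np.zeros((k, p)), np.ones(k)
    alpha = np.zeros((k, W1.shape[1]))
    # Step size 1/L, with L an upper bound on the curvature of the logit term.
    step = 2.0 / np.linalg.eigvalsh(W1.T @ W1 / n).max()
    history = []
    for _ in range(n_iter):
        # M-step, second term of (4): weighted least squares per component.
        for j in range(k):
            sw = np.sqrt(z[:, j])
            beta[j] = np.linalg.lstsq(X * sw[:, None], log_y * sw, rcond=None)[0]
            resid = log_y - X @ beta[j]
            sigma[j] = np.sqrt((z[:, j] * resid ** 2).sum() / z[:, j].sum())
        # M-step, first term of (4): gradient ascent on the weighted logit.
        for _inner in range(25):
            grad = (z - softmax_weights(W1, alpha)).T @ W1 / n
            grad[0] = 0.0                          # alpha_1 = 0 (identifiability)
            alpha += step * grad
        # E-step: posterior probabilities (5) and observed-data log-likelihood.
        joint = softmax_weights(W1, alpha) * normal_pdf(log_y[:, None], X @ beta.T, sigma)
        mix = joint.sum(axis=1)
        z = joint / mix[:, None]
        history.append(np.log(mix).sum())
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
    return beta, sigma, alpha, z, history

# Simulated two-group data (hypothetical values, not the SHIW sample).
rng = np.random.default_rng(1)
n = 2000
educ = rng.choice([5, 8, 13, 18], size=n).astype(float)
exper = rng.uniform(0, 40, size=n)
female = rng.integers(0, 2, size=n).astype(float)
X = np.column_stack([np.ones(n), educ, exper, exper ** 2])
W1 = np.column_stack([np.ones(n), female])
true_beta = np.array([[1.0, 0.02, 0.02, -0.0003], [1.5, 0.08, 0.02, -0.0003]])
p2 = 1 / (1 + np.exp(-(1.0 - 1.5 * female)))       # P(component 2 | w)
comp = (rng.uniform(size=n) < p2).astype(int)
log_y = (X * true_beta[comp]).sum(axis=1) + rng.normal(0, 0.1, size=n)

beta, sigma, alpha, z, history = em_mincer_mixture(log_y, X, W1, k=2)
print("education coefficients by component:", np.sort(beta[:, 1]))
print("log-likelihood after", len(history), "iterations:", history[-1])
```

Because each M-step does not decrease the expected complete-data log-likelihood, the recorded observed-data log-likelihood should be non-decreasing across iterations, which is a useful sanity check for any implementation.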

Naturally, the EM algorithm described above needs to be initialized. Among the possible initialization strategies (see Biernacki et al, 2003, Karlis and Xekalaki, 2003, and Bagnato and Punzo, 2013 for further details), in the real data analysis of Section 3 a random initialization is repeated 10 times from different random positions (obtained via the argument nrep=10 of the stepFlexmix() function) and the solution maximizing the observed-data log-likelihood among these 10 runs is selected.

Once the model is fitted, we can estimate the posterior probabilities of group membership, say $\hat{z}_{ij}$, based on (5). Hence, each individual can be assigned to one of the $k$ groups via the maximum a posteriori probability operator $\mathrm{MAP}(\hat{z}_{ij})$, which takes value 1 if $\max_{h=1,\ldots,k}\{\hat{z}_{ih}\}$ occurs at component $j$ and 0 otherwise.
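The selection among repeated runs and the MAP assignment can be illustrated as follows. This is a small sketch, not flexmix code: the layout of `runs` (a list of dictionaries with keys `loglik` and `posterior`) and all numerical values are hypothetical.

```python
import numpy as np

def map_assign(z_hat):
    """MAP operator: one-hot matrix with a 1 at the largest posterior in each row."""
    z_hat = np.asarray(z_hat, dtype=float)
    labels = z_hat.argmax(axis=1)              # ties go to the lowest index
    one_hot = np.zeros_like(z_hat)
    one_hot[np.arange(len(z_hat)), labels] = 1.0
    return one_hot, labels + 1                 # groups numbered 1..k as in the text

def best_of_runs(runs):
    """Keep the run with the largest observed-data log-likelihood."""
    return max(runs, key=lambda run: run["loglik"])

# Toy posterior probabilities for three individuals and k = 3 groups.
z_hat = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.2, 0.5, 0.3]])
runs = [
    {"loglik": -2010.5, "posterior": z_hat},
    {"loglik": -1998.2, "posterior": z_hat},
    {"loglik": -2003.9, "posterior": z_hat},
]

best = best_of_runs(runs)
one_hot, labels = map_assign(best["posterior"])
print("selected log-likelihood:", best["loglik"])
print("MAP group labels:", labels)
```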

2.2 Choosing the number of mixture components

Model (2), in addition to $\boldsymbol{\vartheta}$, is also characterized by the number of mixture components $k$. Thus far, this quantity has been treated as a priori fixed; nevertheless, for practical purposes, its selection is usually required. The usual way to select $k$ is to compute a convenient (likelihood-based) model selection criterion over a reasonable range of values for $k$, and then to choose the value of $k$ associated with the best value of the adopted criterion. In the real data analysis of Section 3 we adopt the Bayesian information criterion (Schwarz, 1978), i.e., $\mathrm{BIC} = -2 l(\hat{\boldsymbol{\vartheta}}) + m \ln n$, where $m$ is the overall number of free parameters in the model and $l(\hat{\boldsymbol{\vartheta}})$ denotes the maximized observed-data log-likelihood; with this definition, lower values of the BIC are preferred.
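A short sketch of the selection rule is given below. The parameter count is derived here from the definition of model (2) and is an assumption of this example: each component has four regression coefficients and one standard deviation, and each non-reference component has $d + 1$ logit coefficients. The log-likelihood values are invented for illustration and are not results from the paper.

```python
import math

def n_free_parameters(k, n_regressors=3, d=4):
    """Free parameters of model (2): k * (intercept + regressors + sigma)
    plus (k - 1) * (d + 1) multinomial logit coefficients."""
    return k * (n_regressors + 2) + (k - 1) * (d + 1)

def bic(loglik, m, n):
    """BIC = -2 l + m ln(n); lower is better under this definition."""
    return -2.0 * loglik + m * math.log(n)

def select_k(logliks, n, **kwargs):
    """logliks maps k to the maximized log-likelihood; returns the best k and all scores."""
    scores = {k: bic(ll, n_free_parameters(k, **kwargs), n) for k, ll in logliks.items()}
    return min(scores, key=scores.get), scores

# Hypothetical maximized log-likelihoods for k = 1, ..., 4 with n = 3141.
logliks = {1: -2500.0, 2: -2100.0, 3: -2000.0, 4: -1990.0}
best_k, scores = select_k(logliks, n=3141)
print("parameters for k = 3:", n_free_parameters(3))
print("selected k:", best_k)
```

The example shows the usual trade-off: moving from three to four components raises the log-likelihood slightly, but not enough to offset the penalty for the ten additional parameters.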

3 Real data analysis

3.1 Data at hand

To illustrate the model, we adopt data provided by the Bank of Italy’s Survey of Household Income and Wealth (SHIW), reporting several socio-economic characteristics of Italian households for 2012. The SHIW is a biennial survey on the microeconomic behavior of Italian families, with a sample of approximately 8,000 households per year. It contains information both on households (family composition) and on individuals. Moreover, it provides detailed information on several characteristics of the worker, such as net yearly earnings, average weekly hours of work and number of months of employment per year², educational attainment (the highest completed school degree³), job experience, gender, marital status, sector of employment, composition of his/her family, parents’ background, region of residence, and town size.

² Our definition of the hourly wages is: yearly net earnings/(months worked × weekly hours worked × 4).

³ Standard and not actual years of formal schooling are recorded. Since students who fail to reach a standard have to repeat the year, the actual number of years is likely to be underestimated.

We consider adult heads of the family aged 19 or over, full-time and part-time employees, working either in the public or in the private sector⁴ and for whom information about earnings is available; this yields a total of 3,141 individuals. For each head of family we consider the following variables:

Y (Earnings): In its original formulation, the Mincer equation refers to the hourly price of labor as the correct measure of a worker’s earnings⁵. The SHIW contains annual earnings net of taxes and social security contributions, average number of hours worked per week, and number of months of employment per year. Based on these quantities, and as in most empirical studies⁶, hourly wages are defined as

$$Y = \frac{\text{yearly net earnings}}{\text{months worked} \times \text{weekly hours worked} \times 4}.$$

X1 (Education): Education is generally measured by the number of years spent at school. The SHIW does not contain information about this number, but only on the highest degree attained by the individual. Following a common approach in the literature (see, e.g., Vieira, 1999 and Brunello and Miniaci, 1999), we calculate the educational attainment of the individual by imputing the number of years required to complete her/his reported level of educational attainment⁷. More precisely, we consider that the (statutory) numbers of years required to obtain a primary and a junior school certificate are 5 and 8, respectively; for the upper secondary school the number of years ranges from 11 (vocational or technical school) to 13 (classical or scientific studies); finally, for tertiary education, we consider 16, 18 and 21 years for the university diploma, the college degree, and the postgraduate degree (e.g. Ph.D.), respectively.

X2 (Experience): Many empirical studies use age as a proxy for the (working) experience of individuals, but this choice can be severely biased, especially for young cohorts. Other authors use potential experience, defined as the difference between the current age and the age at labor market entry, but they ignore the possibility of unemployment or underemployment, again a crucial feature for young cohorts. Here we use as a proxy for experience the number of years for which social security contributions have been paid for a worker; this should reflect the effective years of on-the-job training and learning-by-doing activities.

W1 (Gender): gender, a dichotomous variable assuming values “Male” and “Female”. The former is chosen as reference category.

W2 (Area): area, a nominal variable with values “North”, “Center”, and “South”. The first is chosen as reference category.

W3 (Citizenship): citizenship, a dichotomous variable assuming values “Italian” and “Foreign”. The former is chosen as reference category.

⁴ We exclude the self-employed because of the low reliability of their declared earnings. As calculated by Brandolini and Cannari (1994), the Bank of Italy’s Survey of Household Income and Wealth (SHIW) seems to underestimate self-employed earnings by about 50 percentage points.

⁵ Monthly or annual wages would in addition capture the effect of the individual’s decisions on working hours. Given the only weak positive correlation between working time and educational attainment, it is reasonable to assume that the choice of hours worked reflects individual preferences rather than educational levels.

⁶ Notice that the hourly measure of earnings can be affected by measurement errors, since we calculate hourly wages as total earnings divided by hours of work.

⁷ Standard, not actual, years of formal schooling are recorded. Since students who fail to reach a standard have to repeat the year, the actual number of years is likely to be underestimated.
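The construction of the variables described above can be summarized in a short data-preparation sketch. The field names of the input record and the degree labels are hypothetical (they are not SHIW variable names); only the hourly-wage formula, the statutory years of schooling and the reference categories follow the text.

```python
import math

# Statutory years of schooling by highest degree (labels are hypothetical).
YEARS_OF_SCHOOLING = {
    "primary": 5,
    "junior_secondary": 8,
    "upper_secondary_vocational": 11,
    "upper_secondary_general": 13,
    "university_diploma": 16,
    "college_degree": 18,
    "postgraduate": 21,
}

def hourly_wage(yearly_net_earnings, months_worked, weekly_hours):
    """Y = yearly net earnings / (months worked x weekly hours worked x 4)."""
    return yearly_net_earnings / (months_worked * weekly_hours * 4)

def concomitant_dummies(gender, area, citizenship):
    """Dummy coding with Male, North and Italian as reference categories."""
    return [
        float(gender == "Female"),
        float(area == "Center"),
        float(area == "South"),
        float(citizenship == "Foreign"),
    ]

def prepare_record(rec):
    """Map one raw record to the response, covariates and concomitant variables."""
    wage = hourly_wage(rec["yearly_net_earnings"], rec["months_worked"], rec["weekly_hours"])
    return {
        "log_y": math.log(wage),
        "x1": YEARS_OF_SCHOOLING[rec["degree"]],
        "x2": rec["years_of_contributions"],
        "w": concomitant_dummies(rec["gender"], rec["area"], rec["citizenship"]),
    }

# Invented example record.
raw = {
    "yearly_net_earnings": 18000.0,
    "months_worked": 12,
    "weekly_hours": 40,
    "degree": "upper_secondary_general",
    "years_of_contributions": 15,
    "gender": "Female",
    "area": "South",
    "citizenship": "Italian",
}
record = prepare_record(raw)
print(record)
```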

3.2 Results

Model (2) is fitted to the data for values of $k$ ranging from 1 to 10; the model corresponding to the lowest BIC value has $k = 3$ components, and the corresponding estimated parameters are reported in Table 1.

Table 1 Parameter estimates for model (2) with k = 3. Bold style highlights regression coefficients significantly different from zero (significance level equal to 0.05). The relative size of the groups is also reported at the bottom of the table.

| | j = 1 | j = 2 | j = 3 |
|---|---|---|---|
| Covariates | | | |
| $\beta_{j0}$ (Intercept) | 1.49314 | 1.37273 | 1.56716 |
| $\beta_{j1}$ (Education) | 0.03844 | 0.06528 | 0.02752 |
| $\beta_{j2}$ (Experience) | 0.01180 | 0.01670 | 0.02161 |
| $\beta_{j3}$ (Experience²) | 0.00034 | -0.00012 | -0.00025 |
| $\sigma_j$ | 0.75445 | 0.27891 | 0.19186 |
| Concomitant variables | | | |
| $\alpha_{j0}$ (Intercept) | 0.00000 | 3.01686 | 2.58453 |
| $\alpha_{j1}$ (Gender: Female ref. Male) | 0.00000 | -1.37438 | -0.50159 |
| $\alpha_{j2}$ (Area: Center ref. North) | 0.00000 | -0.37758 | -0.36252 |
| $\alpha_{j3}$ (Area: South ref. North) | 0.00000 | -1.72404 | -1.35943 |
| $\alpha_{j4}$ (Citizenship: Foreign ref. Italian) | 0.00000 | -4.07112 | -1.31669 |
| Relative size of the groups | 0.05285 | 0.41006 | 0.53709 |

As the dependent variable is the log of earnings, the estimation results show that, on average, one additional year of education increases earnings by around 3.844% for individuals in the first group, 6.528% for individuals in the second group, and 2.752% for individuals in the third group. This means that only 41% of the workers receive a good return on their investment in education. Note that the average returns found by Brunello et al (2001) are 4–5%, so that we may see these previous results as a weighted average of our clusters’ estimates.

The use of concomitant variables in the mixture model allows us to characterize the profile of the groups. As an example, the negative and significant coefficient associated with the Female category of the Gender variable for the second group tells us that being a woman makes it less likely to belong to the second group than to the first (more precisely, the corresponding logit decreases by 1.37438).
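The interpretation of the estimates can be checked numerically. The sketch below uses the values reported in Table 1 to compare the approximate percentage returns ($100\beta_{j1}$) with the exact ones ($100[\exp(\beta_{j1}) - 1]$), to compute the size-weighted average return, and to evaluate the mixture weights (3) for selected profiles. The function name `prior_probs` and the choice of profiles are introduced here for illustration only.

```python
import numpy as np

beta_educ = np.array([0.03844, 0.06528, 0.02752])    # beta_j1, Table 1
group_size = np.array([0.05285, 0.41006, 0.53709])   # relative sizes, Table 1
# Rows j = 1, 2, 3; columns: intercept, Female, Center, South, Foreign (Table 1).
alpha = np.array([
    [0.00000, 0.00000, 0.00000, 0.00000, 0.00000],
    [3.01686, -1.37438, -0.37758, -1.72404, -4.07112],
    [2.58453, -0.50159, -0.36252, -1.35943, -1.31669],
])

approx_pct = 100 * beta_educ                  # log-point approximation
exact_pct = 100 * (np.exp(beta_educ) - 1)     # exact percentage change
weighted_avg_pct = 100 * group_size @ beta_educ

def prior_probs(female=0, center=0, south=0, foreign=0):
    """Mixture weights (3) for one profile of the concomitant variables."""
    w = np.array([1.0, female, center, south, foreign])
    eta = alpha @ w
    e = np.exp(eta - eta.max())
    return e / e.sum()

p_base = prior_probs()                        # Male, North, Italian
p_female = prior_probs(female=1)              # Female, North, Italian

print("approximate returns (%):", np.round(approx_pct, 3))
print("exact returns (%):", np.round(exact_pct, 3))
print("size-weighted average return (%):", round(float(weighted_avg_pct), 3))
print("weights, Male/North/Italian:", np.round(p_base, 3))
print("weights, Female/North/Italian:", np.round(p_female, 3))
```

The size-weighted average of the three education coefficients falls between 4% and 5%, which is in line with the comparison with Brunello et al (2001) made in the text.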

Using the maximum a posteriori probabilities, we can assign each subject to a unique group. Table 2 reports the conditional frequencies of membership to the groups given each of the categories of the concomitant variables, while Table 3 reports the conditional frequencies of the categories of the concomitant variables given the groups.

Table 2 Conditional frequencies of membership to the groups given the categories of the concomitant variables.

| Group | Gender: Male | Gender: Female | Area: North | Area: Center | Area: South | Citizenship: Italian | Citizenship: Foreign | Group relative size |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.03946 | 0.08440 | 0.03556 | 0.04444 | 0.08078 | 0.03841 | 0.18812 | 0.05285 |
| 2 | 0.50159 | 0.19444 | 0.44421 | 0.47302 | 0.32776 | 0.45349 | 0.00330 | 0.41006 |
| 3 | 0.45896 | 0.72115 | 0.52022 | 0.48254 | 0.59146 | 0.50810 | 0.80858 | 0.53709 |
| Total | 1.00000 | 1.00000 | 1.00000 | 1.00000 | 1.00000 | 1.00000 | 1.00000 | 1.00000 |

Table 3 Conditional frequencies of the categories of the concomitant variables given the groups.

| Group | Gender: Male | Gender: Female | Area: North | Area: Center | Area: South | Citizenship: Italian | Citizenship: Foreign |
|---|---|---|---|---|---|---|---|
| 1 | 0.52410 | 0.47590 | 0.30723 | 0.16867 | 0.52410 | 0.65663 | 0.34337 |
| 2 | 0.85870 | 0.14130 | 0.49457 | 0.23137 | 0.27407 | 0.99922 | 0.00078 |
| 3 | 0.59988 | 0.40012 | 0.44221 | 0.18020 | 0.37759 | 0.85477 | 0.14523 |
| Sample | 0.70201 | 0.29799 | 0.45654 | 0.20057 | 0.34288 | 0.90353 | 0.09646 |
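Tables 2 and 3 describe the same cross-classification from two directions, so they are linked by Bayes’ rule: the frequency of group $j$ given a category equals the frequency of the category given group $j$, times the relative size of group $j$, divided by the sample frequency of the category. The sketch below, which only uses the figures printed in the two tables, verifies this relation up to rounding; the variable names are introduced here for illustration.

```python
import numpy as np

categories = ["Male", "Female", "North", "Center", "South", "Italian", "Foreign"]
group_size = np.array([0.05285, 0.41006, 0.53709])

# Table 3: frequencies of the categories given the groups (rows: groups 1, 2, 3).
p_cat_given_group = np.array([
    [0.52410, 0.47590, 0.30723, 0.16867, 0.52410, 0.65663, 0.34337],
    [0.85870, 0.14130, 0.49457, 0.23137, 0.27407, 0.99922, 0.00078],
    [0.59988, 0.40012, 0.44221, 0.18020, 0.37759, 0.85477, 0.14523],
])
sample_row = np.array([0.70201, 0.29799, 0.45654, 0.20057, 0.34288, 0.90353, 0.09646])

# Table 2: frequencies of group membership given the categories.
p_group_given_cat = np.array([
    [0.03946, 0.08440, 0.03556, 0.04444, 0.08078, 0.03841, 0.18812],
    [0.50159, 0.19444, 0.44421, 0.47302, 0.32776, 0.45349, 0.00330],
    [0.45896, 0.72115, 0.52022, 0.48254, 0.59146, 0.50810, 0.80858],
])

joint = group_size[:, None] * p_cat_given_group   # P(group, category)
p_cat = joint.sum(axis=0)                         # implied sample frequencies
bayes = joint / p_cat                             # implied P(group | category)

max_gap = np.abs(bayes - p_group_given_cat).max()
print("largest gap between Table 2 and its Bayes reconstruction:", max_gap)
for name, value in zip(categories, p_cat):
    print(f"{name:8s} implied sample frequency: {value:.5f}")
```

For instance, the share of women assigned to the second group follows from 0.14130 × 0.41006 / 0.29799, which reproduces the 0.19444 of Table 2 up to rounding.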

The second group, which is the one with the highest return to education, is characterized by a disproportionately high presence of males; in fact, whereas 50.159% of all men belong to the second group, only 19.444% of women do (Table 2). On the other hand, women are overrepresented in the third group, the one with the lowest return to education: 72.115% of all women belong to it, compared with only 45.896% of men (Table 2). These gender differences are consistent with the findings of Flabbi (1999) and Fiaschi and Gabbriellini (2013). Apart from discriminatory attitudes, and specifically employers’ unwillingness to invest in training female workers, gender differences have been ascribed to the different labor participation patterns of men and women, arising from an intermittent female labor supply that may erode job skills (Mincer and Polachek, 1974).

Foreign workers, who make up about 9.6% of the sample, are practically absent from the second group, whereas they account for 14.523% of the third group (Table 3). Evidence for a lower return to education for foreign immigrants in Italy is found in Accetturo and Infante (2008). Using data on immigrants in Israel, Friedberg (2000) shows how, once in the new country, immigrants tend to accept any kind of job, even ones in which they cannot fully exploit their human capital and skills.

Workers residing in the South of Italy (34.288% of the sample) appear underrepresented in the second group (27.407%), while they account for more than half of the first group (52.410%; see Table 3). Returns to education thus appear slightly lower in the South of Italy; this is consistent with the findings of Fiaschi and Gabbriellini (2013).

4 Conclusions

The Mincer human capital earnings function provides the most popular indicator of return to education. However, empirical studies have shown that populations of interest often consist of latent groups with different returns to education and different socio-demographic profiles. In this paper, we propose a new mixture-based approach to make the Mincer earnings function more flexible with respect to unobserved heterogeneity. The proposed model allows one to estimate the density of the income distribution, to detect homogeneous subpopulations, and to analyze the position of individuals with specific characteristics. The method is illustrated using data provided by the Bank of Italy’s Survey of Household Income and Wealth in 2012. Our empirical results demonstrate that this method can be successfully used in practice.

Note that within specific applications, the Mincer function has been extended in different ways, notably to deal with the potential endogeneity of schooling (Trostel et al, 2002), non-linear education premiums (Ghosh, 2014), heterogeneity in human capital within educational levels or work experiences (Nieto and Ramos, 2016), and cohort effects (Henderson et al, 2011). We acknowledge the relevance of these issues, although here, to keep our exposition simple, we focused on the classical Mincer equation. However, when required by the context of the application, these extensions can be easily incorporated within the proposed framework.

Compliance with Ethical Standards

The authors declare that they have no conflict of interest.

References

Accetturo A, Infante L (2008) Immigrant earnings in the Italian labour market. Bank of Italy Temi di Discussione (Working Paper) No 695

Amarante V (2014) Income inequality in Latin America: Data challenges and availability. Social Indicators Research 119(3):1467–1483

Atkinson AB (1986) The economics of inequality. Cambridge University Press, Cambridge

Bagnato L, Punzo A (2013) Finite mixtures of unimodal beta and gamma densities and the k-bumps algorithm. Computational Statistics 28(4):1571–1597

Battisti M (2013) Reassessing segmentation in the labour market: An application for Italy 1995–2004. Bulletin of Economic Research 65(s1):s38–s55

Biernacki C, Celeux G, Govaert G (2003) Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Computational Statistics & Data Analysis 41(3–4):561–575

Björklund A, Kjellström C (2002) Estimating the return to investments in education: how useful is the standard Mincer equation? Economics of Education Review 21(3):195–210

Boeri T, Garibaldi P (2007) Two tier reforms of employment protection: A honeymoon effect? The Economic Journal 117(521):F357–F385

Brandolini A, Cannari L (1994) Methodological appendix: the Bank of Italy’s survey of household income and wealth. In: Ando A, Guiso L, Visco I (eds) Savings and the accumulation of wealth. Essays on Italian households and government saving behavior, Cambridge University Press, Cheltenham

Brunello G, Miniaci R (1999) The economic returns to schooling for Italian men. An evaluation based on instrumental variables. 6(4):509–519

Brunello G, Comi S, Lucifora C (2001) The returns to education in Italy: a new look at the evidence. In: Harmon C, Walker I, Nielsen NW (eds) The Returns to Education in Europe, Edward Elgar, Cheltenham

Card D (1999) The causal effect of education on earnings. Handbook of Labor Economics 3:1801–1863

Card D (2001) Estimating the return to schooling: Progress on some persistent econometric problems. Econometrica 69(5):1127–1160

Chiswick B (2007) In memoriam: Jacob Mincer (1922–2006). Journal of Population Economics 20(1):3–6

Cipollone P (2001) Is the Italian labour market segmented? Tech. rep., Bank of Italy, Economic Research and International Relations Area

Davis HD, Noland BE (2003) Understanding human capital through multiple disciplines: the educational needs index. Social Indicators Research 61(2):147–174

Dayton CM, Macready GB (1988) Concomitant-variable latent-class models. Journal of the American Statistical Association 83(401):173–178

Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society Series B (Methodological) 39(1):1–38

Fiaschi D, Gabbriellini C (2013) Wage functions and rates of return to education in Italy. In: Fifth meeting of the Society for the Study of (ECINEQ) Bari (Italy)

Flabbi L (1999) Returns to schooling in Italy, OLS, IV and gender differences. Working Paper: Serie di econometria ed economia applicata, Università Bocconi 1

Friedberg RM (2000) You can’t take it with you? Immigrant assimilation and the portability of human capital. Journal of Labor Economics 18(2):221–251

Ghosal S, van der Vaart AW (2001) Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. Annals of Statistics 29(5):1233–1263

Ghosh PK (2014) The contribution of human capital variables to changes in the wage distribution function. Labour Economics 28:58–69

Grün B, Leisch F (2008a) Finite Mixtures of Generalized Linear Regression Models, Physica-Verlag HD, Heidelberg, pp 205–230. DOI 10.1007/978-3-7908-2064-5_11, URL http://dx.doi.org/10.1007/978-3-7908-2064-5_11

Grün B, Leisch F (2008b) FlexMix version 2: Finite mixtures with concomitant variables and varying and constant parameters. Journal of Statistical Software 28(4):1–35

Henderson DJ, Polachek SW, Wang L (2011) Heterogeneity in schooling rates of return. Economics of Education Review 30(6):1202–1214

Karlis D, Xekalaki E (2003) Choosing initial values for the EM algorithm for finite mixtures. Computational Statistics & Data Analysis 41(3–4):577–590

Kopf J, Augustin T, Strobl C (2013) The potential of model-based recursive partitioning in the social sciences: Revisiting Ockham’s razor. In: McArdle JJ, Ritschard G (eds) Contemporary Issues in Exploratory Data Mining in the Behavioral Sciences, Quantitative Methodology Series, Routledge, London, pp 75–95

Leisch F (2004) FlexMix: A general framework for finite mixture models and latent class regression in R. Journal of Statistical Software 11(8):1–18

Lemieux T (2006) The “Mincer equation” thirty years after Schooling, Experience, and Earnings. In: Grossbard S (ed) Jacob Mincer: A Pioneer of Modern Labor Economics, Springer, pp 127–145

Machin S (2007) The new economics of education: methods, evidence and policy. Journal of Population Economics 21(1):1–19

McLachlan GJ, Basford KE (1988) Mixture Models: Inference and Applications to Clustering, Statistics Series, vol 84. Marcel Dekker, New York

Mincer J (1958) Investment in human capital and personal income distribution. Journal of Political Economy 66(4):281–302

Mincer J (1974) Schooling, Experience, and Earnings. National Bureau of Economic Research, New York

Mincer J, Polachek S (1974) Family investments in human capital: Earnings of women. Journal of Political Economy 82(2, Part 2):S76–S108

Nieto S, Ramos R (2016) Overeducation, skills and wage penalty: Evidence for Spain using PIAAC data. Social Indicators Research DOI 10.1007/s11205-016-1423-1

Nordin M (2008) Ability and rates of return to schooling-making use of the Swedish enlistment battery test. Journal of Population Economics 21(3):703–717

Oreopoulos P, Petronijevic U (2013) Making college worth it: A review of research on the returns to higher education. Tech. rep., National Bureau of Economic Research

Patrinos HA (2016) Estimating the return to schooling using the Mincer equation. IZA World of Labor 278

Quandt RE, Ramsey JB (1978) Estimating mixtures of normal distributions and switching regressions. Journal of the American Statistical Association 73(364):730–738

R Core Team (2013) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, URL http://www.R-project.org/

Schwarz G (1978) Estimating the dimension of a model. The Annals of Statistics 6(2):461–464

Smith N, Westergard-Nielsen N (1988) Wage differentials due to gender. Journal of Population Economics 1:115–130

Titterington DM, Smith AFM, Makov UE (1985) Statistical Analysis of Finite Mixture Distributions. John Wiley & Sons, New York

Trostel P, Walker I, Woolley P (2002) Estimates of the economic return to schooling for 28 countries. Labour Economics 9(1):1–16

Vieira JAC (1999) Returns to education in Portugal. Labour Economics 6(4):535–541

Wedel M, DeSarbo WS (1995) A mixture likelihood approach for generalized linear models. Journal of Classification 12(1):21–55

Wedel M, Kamakura W (2000) Market Segmentation: Conceptual and Methodological Foundations, 2nd edn. Kluwer Academic Publishers, Boston, MA, USA