Evidence of the relationship between education and in

Elder G. Sant'Anna¹, Marcelo Justus², Eloá S. Davanzo³, Gustavo C. Moreira⁴

Abstract

We investigate the role of education in an individual's decision to smoke and in the intensity of use once he or she becomes a smoker, controlling for socioeconomic and demographic characteristics, among other relevant variables. We present evidence from three estimations (a probit model, a sample selection model and a count model) using a sample of almost 36,000 people living in Brazil in 2008. Reinforcing findings from the international literature, we find that higher levels of education are associated with a lower probability of smoking and with fewer cigarettes smoked daily. Other secondary but relevant results were also found. Information about the risks of smoking and exposure to tobacco advertising are examples of variables with high statistical significance and the expected signs (negative and positive, respectively). These results are in line with the National Tobacco Control Program.

Keywords: smoking, tobacco, addiction, education, human capital, health
JEL Classification: I12, D12, C50
ANPEC Area: Area 12, Social Economics

Resumo

We investigate the role of education in smoking, both in the decision to smoke and in the intensity of the addiction, controlling for socioeconomic and demographic characteristics and other factors. We use a sample of almost 36,000 Brazilians to estimate probit, sample selection and count models. The evidence reinforces the conclusions of international studies, indicating that high levels of schooling are associated with a lower probability of being a smoker and with lower daily cigarette consumption. The variables that control for knowledge of the risks of tobacco use and for exposure to advertising of cigarettes and related products showed the expected signs (negative and positive, respectively) and were statistically significant. This result is in line with the proposal of the National Tobacco Control Program.

Keywords (Palavras-chave): smoking, cigarettes, addiction, education, human capital, health

1. Introduction

Smoking is a form of drug dependence characterized by physical and psychological addiction to nicotine, one of the 4,720 toxic substances contained in tobacco. Research associates smoking with at least fifty types of diseases, most of which are chronic and severe. It is estimated that the life expectancy of a smoker is at least 10 years shorter than that of a non-smoker.

In Brazil, tobacco use often takes the form of consumption of manufactured cigarettes. In 2013, there were 21.4 million smokers in Brazil, 18.5 million of whom were daily smokers. According to the Ministry of Health, about 200,000 deaths per year are related to tobacco consumption.

In economic terms, smoking increases health care expenditures and decreases productivity due to morbidity and premature death, substantially reducing the stock of human capital in society (World Bank, 1999).
According to World Bank (2015) estimates, when indirect costs are also taken into account, smoking accounts for losses of about US$ 1 trillion worldwide every year. The World Health Organization (WHO) considers smoking the main cause of preventable death in the world.

In view of the great social impact of smoking, this paper empirically investigates, based on Becker and Murphy's (1988) theory of rational addiction, the socioeconomic and demographic risk factors associated with tobacco consumption and with the intensity of tobacco use, emphasizing the hypothesized effect of education. What socioeconomic characteristics are associated with the decision to smoke and with the intensity of tobacco use? Since education is positively associated with an individual's health status and habits (Cutler and Lleras-Muney, 2006; Deaton and Paxson, 2004; Grossman, 2006; Lleras-Muney, 2005), is the same reasoning valid for the relationship between education and smoking behavior?

The theory of human capital and the relationship between education and health can explain the link between education and decisions related to smoking. Becker (1962) defines investment in human capital as one of the activities leading to a higher real income in the future for an individual. This investment includes schooling, professional training, health care and acquisition of information about the economic system. All investments of this kind in human beings could improve their physical and mental skills, making it possible to predict with greater certainty that their real income will be higher in the future. Workers can invest in anything that enhances their skills (schooling and/or training), thus improving their human capital and consequently raising their marginal productivity and earnings in the labor market.

¹ PhD candidate at the University of São Paulo, Brazil.
² Professor in the Institute of Economics at the University of Campinas, São Paulo, Brazil.
³ PhD candidate in the Institute of Economics at the University of Campinas, São Paulo, Brazil.
⁴ Professor at the University of São Carlos, São Paulo, Brazil.

For Schultz (1961), much of what we consider consumption is actually investment in human capital: spending on education, health care and internal migration in search of better job opportunities, among other expenditures. By investing in themselves, individuals can expand the set of choices available to them, increasing their well-being. Galama and van Kippersluis (2015) mention the importance of health as an element of human capital: it affects longevity, provides direct utility and increases the time that can be devoted to working.

Theory describes the persistent association between education and health in several ways. There is evidence that education increases access to information that leads people to consider health care when making decisions, i.e. education paves the way for different patterns of reasoning and decision-making. In general, more educated people respond to new information faster than less educated or uneducated individuals.
Education also affects health through changes in behaviors and opportunities, particularly in income opportunities (World Bank, 1999; Cutler and Lleras-Muney, 2006; de Walque, 2010; Feinstein, 2002; Grossman, 1972).

As education affects health-related decisions, it is also likely to affect decisions related to smoking. Several studies support this relationship. Their results suggest that education can affect the decision to quit smoking and that more education reduces smoking initiation and addiction to nicotine. Indirectly, if education makes people more patient, it reduces their propensity to indulge in short-term pleasures with long-term costs (de Walque, 2010; Kendler et al., 1999; Koning et al., 2015; Sander, 1995).

In this theoretical context, we test two hypotheses. The first is that "education reduces the probability of an individual becoming a smoker", and the second is that "education reduces the intensity of tobacco use among smokers". To test them, we used a large random sample of almost 36,000 people aged 15 and above from all over the country. We highlight that microdata of this nature are rarely found even in developed countries, where data for empirical research are more widely available.

This study is organized as follows. Section 2 presents Becker and Murphy's (1988) theory of rational addiction and the relevant empirical literature. Section 3 presents the database, the modelling strategy and the specification of the empirical models, and discusses our results. Section 5 concludes the paper.

2. Theoretical Background

Becker and Murphy (1988) developed a theory of rational addiction according to which rationality implies a consistent plan to maximize utility over time. Strong addiction requires that past consumption of a good have a large effect on its present consumption. This complementarity causes some stationary states to become unstable. According to the authors, the theory of rational choice would explain a large number of addictive behaviors. Even the most addicted individuals are usually rational in the sense that they are forward-looking and maximize their utility based on stable preferences. Here, addiction is not seen as restricted to alcohol, tobacco or cocaine, as it includes addiction to work, food, religion, music and so forth. A good is considered addictive if its consumption at different moments in time is complementary, and the greater the complementarity, the greater the degree of addiction.

The onset and continuity of harmful addictions, such as smoking and alcoholism, and of beneficial addictions, such as to religion, are often linked to stressful events or phases experienced throughout one's life cycle, such as adolescence, divorce and unemployment. Even people with the same utility function and the same income, facing the same price levels, may have different degrees of addiction according to their different experiences in life.

This study uses the economic theory of cigarette addiction (Suranovic et al., 1999), presented below, as the basis for specifying the empirical models and for discussing the results of the econometric estimations. The effects of smoking for a smoker at age $A$ are broken down into three additively separable components: current benefits ($B_A$), future losses ($L_A$) and consumption adjustment costs ($C_A$). Current benefits represent utility derived during the period in which one smokes.
It is assumed that this utility (or benefit) of smoking at age $A$, $B_A$, increases at a decreasing rate as the current consumption of cigarettes ($s$) increases, that is,

$$B_A = B_A(s), \qquad B_A \ge 0,\; B_A' \ge 0,\; B_A'' < 0, \quad \forall s \ge 0.$$

The current disutility of future losses caused by smoking is determined by expected losses to health. Individuals are assumed to expect to live to a certain age, as defined by their life expectancy. Each cigarette consumed is assumed to reduce that life expectancy, and the individual evaluates his or her future losses by calculating the discounted present value of the expected reduction in his or her lifetime. Therefore, the current perception of future losses tends to increase among young people and decrease among older people.

Suppose that an individual is currently $A$ years old and that the life expectancy (remaining years of life) of a non-smoker at age $A$ is $T(A)$, with $T'(A) < 0$ and $T''(A) = 0$. It is further assumed that each cigarette smoked reduces life expectancy by a fixed amount $\alpha$. Therefore, the life expectancy of a smoker at age $A$ can be represented as $T(A) - \alpha S_A$, where $S_A = \int_{t=0}^{A} s_t \, dt$ is the cumulative stock of previous cigarette consumption. The present value of the expected flow of future utility for a smoker at age $A$ can be expressed as

$$V(A, s) = \int_{t=A}^{T(A)+A-\alpha(S_A+s)} e^{-r(t-A)} W_t \, dt$$

where $r$ is the discount rate, $e^{-r(t-A)}$ is the discount factor for period $t$ and $W_t$ is the expected utility during period $t$. Since $s$ represents cigarette consumption over the current period, $\alpha s$ in the upper limit represents the years of life lost due to smoking during the current period and $\alpha S_A$ represents the years of life lost due to smoking over one's lifetime so far. The present value of the future losses expected by a smoker at age $A$ as a result of smoking during this period is

$$L_A(s) = V(A, 0) - V(A, s) = \int_{T(A)+A-\alpha(S_A+s)}^{T(A)+A-\alpha S_A} e^{-r(t-A)} W_t \, dt.$$

Future losses increase with smoking, as each cigarette smoked eliminates expected benefits in the final moments of one's life. Losses grow at an increasing rate because each additional cigarette shortens the smoker's life further, so the moments lost lie closer to the present and are discounted less heavily.

The model assumes that the adjustment cost for smokers at age $A$, $C_A(s)$, is a function of their smoking history, defined as $H_A = \{s_t\}\ \forall t \in [0, A]$, and of current consumption $s$ (footnote 5). Thus,

$$C_A = C(s; H_A) \quad \forall s \in [0, s_h] \qquad \text{and} \qquad C_A = 0 \quad \forall s \ge s_h. \tag{1}$$

It is assumed that any level of consumption equal to or greater than the habitual level, $s_h$, imposes no adjustment cost. As consumption is reduced below its usual level, this cost increases. The highest cost would result from quitting smoking altogether and immediately. It is assumed that $C_A' < 0$, i.e. that adjustment costs decrease as $s$ approaches the usual level. However, there is no reason to assign a sign to $C_A''$.
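The future-loss term $L_A(s)$ defined above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a constant per-period utility $W_t = W$, which gives the integral a closed form, and every parameter value (`T`, `alpha`, `r`) is hypothetical and chosen only for illustration.

```python
import math

def present_value(T, S, s, alpha, r, W=1.0):
    """V(A, s) under the simplifying assumption W_t = W (constant).

    T     : T(A), remaining life expectancy of a non-smoker at age A
    S     : S_A, cumulative stock of past cigarette consumption
    s     : cigarettes smoked in the current period
    alpha : years of life lost per cigarette (hypothetical value)
    r     : discount rate
    """
    horizon = T - alpha * (S + s)  # remaining years of life for the smoker
    if horizon <= 0:
        return 0.0
    # Closed form of the integral of W * exp(-r * u) for u from 0 to horizon.
    return W * (1.0 - math.exp(-r * horizon)) / r

def future_loss(T, S, s, alpha, r, W=1.0):
    """L_A(s) = V(A, 0) - V(A, s)."""
    return present_value(T, S, 0.0, alpha, r, W) - present_value(T, S, s, alpha, r, W)

# Hypothetical illustration: 40 remaining years, 5% discount rate.
loss_1000 = future_loss(T=40.0, S=0.0, s=1000.0, alpha=0.001, r=0.05)
loss_2000 = future_loss(T=40.0, S=0.0, s=2000.0, alpha=0.001, r=0.05)
print(loss_1000, loss_2000)
```

Under these assumptions the loss is zero when nothing is smoked, rises with $s$, and more than doubles when $s$ doubles, which is the "increasing at an increasing rate" property stated in the text.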

Footnote 5: The adjustment cost represents the discomfort (loss of concentration, extreme irritability, etc.) caused by quitting or cutting down on smoking.

The consumer problem can then be defined. It is assumed that an individual aged $A$ chooses a current amount of cigarettes $s$ and an amount of a composite good $y$, subject to a budget constraint. The objective of the consumer is to maximize the present value of current and expected net benefits, as defined above. The composite good generates utility only during the current period and has a utility function that is additive and separable from the utility of cigarettes. Define $U_A(s) = B_A(s) - L_A(s) - C_A(s)$. Thus, the individual's problem is

$$\max_{s,\,y} \; W_A = U_A(s) + \Gamma(y)$$

subject to

$$p_s s + p_y y = I_A,$$

where $\Gamma$ is the utility function of the composite good $y$; $W_A$ is the sum of the current utility at age $A$ of the consumption of cigarettes and of the composite good; $p_s$ and $p_y$ are the prices of cigarettes and of the composite good, respectively; and $I_A$ is income at age $A$. The Kuhn-Tucker first-order conditions are

$$\Gamma' - \mu p_y \ge 0$$

and

$$U_A' - \mu p_s \ge 0$$

where $\mu$ is the marginal utility of income. The second-order conditions for a maximum are

$$U_A'' < 0 \quad \text{and} \quad \Gamma'' < 0.$$

Suranovic et al. (1999) explain the early stages of tobacco addiction by analyzing the decision to start smoking, the point at which a smoker becomes addicted to tobacco and the point at which an addict decides to quit smoking.
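As a worked reading of the first-order conditions above, consider an interior optimum in which both goods are consumed, so that both conditions hold with equality. The starred symbols $s^*$ and $y^*$ are notation introduced here for the optimal quantities and do not appear in the original.

```latex
% Interior optimum: s^* > 0 and y^* > 0, so both first-order conditions bind.
\begin{align*}
U_A'(s^*) &= \mu p_s, \qquad \Gamma'(y^*) = \mu p_y \\[4pt]
\Rightarrow\quad \frac{U_A'(s^*)}{p_s} &= \frac{\Gamma'(y^*)}{p_y} = \mu, \\[4pt]
\text{with}\quad U_A'(s) &= B_A'(s) - L_A'(s) - C_A'(s).
\end{align*}
```

The last line follows from the definition $U_A(s) = B_A(s) - L_A(s) - C_A(s)$. Because $C_A' < 0$ below the habitual level $s_h$, the term $-C_A'$ is positive there: adjustment costs raise the marginal net benefit of a cigarette for a habitual smoker, while the marginal future loss $L_A'$ lowers it.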

2.1. Previous Empirical Studies

The empirical literature suggests that the effects of education on health occur through three channels: (i) economic factors such as income and employment; (ii) health-related behaviors; and (iii) psychosocial factors. In this paper, we analyze the effects on health-related behaviors. These behaviors include, among others, the habit of smoking, which is the subject matter of this study.

The strong relationship between education and health, even controlling for income, is considered robust in the social sciences and economics literature (Deaton and Paxson, 2004; Fuchs, 1982; Grossman, 2004; Lleras-Muney, 2005). The decision to smoke or not is a conscious choice that has a direct bearing on an individual's health status and mortality. Because smoking is the leading cause of preventable death in the world, the possibility that education plays a major role in preventing this habit suggests that traditional estimates of the returns to education, which often focus on labor market outcomes, may underestimate actual returns.

Since education affects health-related decisions, it is also likely to affect decisions related to smoking. Results show that education can affect the decision to quit smoking and that more schooling may be associated with lower rates of tobacco use initiation and of nicotine addiction. Wetter et al. (2005) observed a strong relationship between lower schooling and a greater likelihood of smoking, since less educated people tend to work in environments in which smoking is acceptable and where there is little incentive to quit. De Walque (2007) found that more educated people are less likely to smoke and, when they do smoke, are more likely to quit. Grimard and Parent (2007) also observed an effect of education on starting to smoke, but they found no evidence that more educated individuals are more likely to quit smoking.
For Brazil, there is some evidence that tobacco addiction is more prevalent among less educated and lower-income groups (IBGE, 2014; Pinto and Ugá, 2010). It should be highlighted that only a few studies did not detect effects of education on tobacco use (Tenn et al., 2010) or on the decision to start smoking (Koning et al., 2015), or found a nonlinear relationship between smoking and years of schooling (Zhu et al., 1996).

Government policies that affect smoking behavior also have major implications for public health, economic efficiency and government revenue goals. According to Keeler et al. (2001), the most important of these policies is tobacco taxation, through which two objectives can be achieved simultaneously: reducing tobacco consumption and increasing government revenues. However, many other studies analyze the effects of other government policies on smoking, such as anti-smoking regulations, educational campaigns and, less obviously, schooling, which has been found to reduce tobacco use and stimulate healthy behaviors. Table 1 shows some results of previous studies on the relationship between education, health and smoking.

Table 1: Selected studies on the effects of education on health and smoking

| Authors | Objective | Data/Sample | Results |
|---|---|---|---|
| Cutler and Lleras-Muney (2006) | To estimate the effects of education on health behavior and its quality. | National Health Interview Survey, conducted in the United States. Individuals aged 25 or over. | Education can affect health by improving one's reasoning and decision making. |
| De Walque (2010) | To test the hypothesis that education improves health and increases life expectancy. | 16 supplements about smoking of the National Health Interview Survey, between 1978 and 2000. | With warnings about the dangers of smoking, smoking prevalence among the most educated fell more quickly. |
| Feinstein (2002) | To estimate the effects of schooling on two important aspects of health: depression and obesity. | United Kingdom National Cohorts. People surveyed in 1999/2000. | Education can affect health both directly, by changing behaviors and/or preferences, and indirectly, through changes in opportunities, particularly in income opportunities. |
| Grossman (1972) | To build a model of positive demand for the health commodity. | - | Education increases access to important information for people to consider health care when making decisions. |
| Kendler et al. (1999) | To investigate the relationship between the risk factors for smoking initiation and dependence. | Personal interview with twin women from the Virginia Registry in 1898. | More years of schooling are associated with a reduction in smoking initiation and nicotine dependence. |
| Koning et al. (2015) | To analyze the effects of education on the decisions to start or quit smoking. | Longitudinal data for Australian twins (1980-1982 and 1988-1989). | Schooling does not affect the decision to start smoking, but has a significant effect on the decision to quit smoking. |
| Welte et al. (2000) | To investigate potential years of lost lifetime, direct medical costs and indirect costs of smoking. | Statutory Health Insurance Data; German Federal Statistical Office. | Smoking accounted for 23% of deaths among men and for 5% of all deaths among women, as well as for the potential loss of 1.5 million years of lifetime. |

Note: Authors' own elaboration.

3. Methodology

3.1. Data and sample

We use data from the Special Survey on Tobacco Addiction (PETab, in the Brazilian acronym), which was carried out jointly with the 2008 National Household Sampling Survey (PNAD, in the Brazilian acronym). The survey was conducted through a partnership between the Brazilian Institute for Geography and Statistics (IBGE, in the Brazilian acronym), the Ministry of Health, the National Cancer Institute (INCA, in the Brazilian acronym), the Health Surveillance Secretariat (SVS, in the Brazilian acronym) and the National Health Surveillance Agency (ANVISA, in the Brazilian acronym).

The data were collected from a sub-sample of households surveyed through the PNAD 2008, covering individuals aged 15 and above in about 51,000 Brazilian households. The individuals included in that sub-sample answered questions related to the use of tobacco products, to their attempts to quit smoking, to their exposure to smoke and to their access to awareness-raising campaigns and to information on the risks of smoking, among other issues related to the main topic. For other people interviewed through the PNAD 2008, information is only available on the habit of smoking, the type of tobacco product used and the amount consumed.

It should be noted that the PETab survey is carried out in Brazil as part of an initiative of the World Health Organization (WHO) and of the Centers for Disease Control and Prevention. This partnership was established to implement part of a survey conducted in 14 countries, including Brazil, entitled Global Adult Tobacco Survey (GATS) (footnote 6).

Table 2 shows the variables selected from the PETab survey to model the decision to smoke and the intensity of cigarette consumption. These characteristics were chosen based on the theoretical and empirical literature cited in Section 2.

Table 2: Definition and statistical summary of the variables used in the empirical models

| Variable | Definition | Mean | Std. Dev. |
|---|---|---|---|
| No schooling | 1 if the individual has no education and 0 otherwise | 0.1144 | 0.32 |
| Elementary or less | 1 if the individual did not complete primary education and 0 otherwise | 0.3534 | 0.48 |
| Primary Education | 1 if the individual completed elementary school and 0 otherwise | 0.1783 | 0.38 |
| High school | 1 if the individual completed high school and 0 otherwise | 0.2696 | 0.44 |
| Higher education | 1 if the individual is a college graduate and 0 otherwise | 0.0842 | 0.28 |
| Income (ln) | Logarithm of household income per capita | 6.02 | 1.02 |
| Works | 1 if the individual works and 0 otherwise | 0.6629 | 0.47 |
| Age | Age in years | 40.96 | 17.34 |
| Man | 1 if the individual is male and 0 otherwise | 0.4569 | 0.50 |
| Whites | 1 if the individual is Caucasian or Oriental and 0 otherwise | 0.4718 | 0.50 |
| Householder | 1 if the individual is the family head and 0 otherwise | 0.5186 | 0.50 |
| Health insurance | 1 if the individual has health insurance and 0 otherwise | 0.2708 | 0.44 |
| Very bad health | 1 if the individual is in very bad health and 0 otherwise | 0.0098 | 0.10 |
| Bad health | 1 if the individual is in bad health and 0 otherwise | 0.0392 | 0.19 |
| Average health | 1 if the individual is in average health and 0 otherwise | 0.2368 | 0.43 |
| Good health | 1 if the individual is in good health and 0 otherwise | 0.5247 | 0.50 |
| Very good health | 1 if the individual is in very good health and 0 otherwise | 0.1896 | 0.39 |
| Smokers | Proportion of smokers in the household | 0.0627 | 0.14 |
| Warning | 1 if the individual is aware of the risks of tobacco addiction and 0 otherwise | 0.9696 | 0.17 |
| Marketing | 1 if the individual saw cigarette ads and 0 otherwise | 0.3953 | 0.49 |
| Urban | 1 if the individual lives in an urban area and 0 otherwise | 0.8495 | 0.36 |
| Metropolis | 1 if the individual lives in a large city and 0 otherwise | 0.3761 | 0.48 |
| Cigarettes smoked | Number of cigarettes smoked in a day | 1.97 | 5.20 |
| Cigarette (if smoker) | 1 if the individual smokes manufactured cigarettes and 0 otherwise | 0.8362 | 0.37 |

Note: Prepared with data from the PETab survey (N = 35,999).

Footnote 6: That survey is intended to improve the capacity of countries to design, implement and evaluate tobacco control programs.

The individual characteristics that we used included indicators for being male, for race/color and for employment status, in addition to age and household per capita income. As for geographic characteristics, we included dummy variables for urban area and metropolitan region. We expect manufactured cigarette consumption to be higher in those areas, since in rural areas, for example, chewing tobacco or consuming cigarettes other than manufactured cigarettes is more common. We also included dummies to capture the diversity of Brazil's regions. In this case, consumption is expected to be higher in Brazil's South region, as this region accounts for about 98% of all the tobacco produced in Brazil.

Finally, it was also possible to build variables indicating whether the individual had access to information of some kind about the potential risks of smoking, whether he or she had been exposed to any cigarette ads and whether he or she actually believed in the health risks of smoking. It should be stressed that 100% of the respondents included in the sample used in the PETab survey believe that smoking causes serious diseases, strokes, heart attacks or lung cancer. Furthermore, since all PNAD respondents also answered a question about whether they used tobacco or not, we computed the proportion of smokers in the households (Smokers (%)).

It is noteworthy that we chose to measure human capital by education level rather than by years of schooling. Our objective was to check for any non-linearity in the relationship between education and smoking, such as that found by Zhu et al. (1996). More educated individuals are expected to be better prepared to evaluate the costs and benefits of smoking.

Table 3: Average characteristics of the smokers and non-smokers

| Characteristics | Non-smokers: Mean | Non-smokers: Std. dev. | Smokers: Mean | Smokers: Std. dev. | Mean dif. p-value |
|---|---|---|---|---|---|
| Income (ln) | 6.05 | 1.01 | 5.85 | 1.02 | 0.000 |
| Works | 0.6479 | 0.48 | 0.7374 | 0.44 | 0.000 |
| Age | 40.59 | 17.73 | 42.85 | 15.12 | 0.000 |
| Man | 0.4319 | 0.50 | 0.5816 | 0.49 | 0.000 |
| Whites | 0.4818 | 0.50 | 0.4217 | 0.49 | 0.000 |
| Householder | 0.4959 | 0.50 | 0.6317 | 0.48 | 0.000 |
| Health insurance | 0.2879 | 0.45 | 0.1855 | 0.39 | 0.000 |
| Smokers (%) | 0.0533 | 0.13 | 0.1095 | 0.17 | 0.000 |
| Warning | 0.9724 | 0.16 | 0.9557 | 0.21 | 0.000 |
| Marketing | 0.3857 | 0.49 | 0.4434 | 0.50 | 0.000 |
| Urban | 0.8577 | 0.35 | 0.8085 | 0.39 | 0.000 |
| Metropolis | 0.3804 | 0.49 | 0.3546 | 0.48 | 0.000 |
| Cigarettes smoked | | | 11.79 | 6.79 | |
| Cigarette (if smoker) | | | 0.8362 | 0.37 | |

Note: Prepared with data from the PETab survey (N = 35,999).

Table 3 presents a descriptive analysis for the group of smokers and for that of non-smokers. Based on this preliminary analysis, one can speculate on what the main socioeconomic determinants of smoking in Brazil are.

Regarding individual characteristics, we found that tobacco consumption is higher among men and non-Caucasians. The percentage of smokers is also higher among workers and heads of family, probably due to their greater responsibility and, therefore, to their more stressful lives. On average, smokers are two years older than non-smokers and their household per capita income is lower. Although the families of smokers and non-smokers are on average of the same size, the number of smokers is higher in the households of smokers. As expected, smokers are more exposed to cigarette ads than non-smokers. It is noteworthy that average consumption among smokers is approximately 12 cigarettes a day and that about 84% of the smokers consume manufactured cigarettes.

Finally, in Figure 1 we disaggregated the analysis between smokers and non-smokers for two specific variables: self-reported health status and education. Education is clearly higher in the population of non-smokers. This fact reinforces our hypothesis that education is a determinant of the decision to smoke or not to smoke.

It is also noticeable that smokers reported a good or very good health status less frequently. This last finding is only descriptive, as the relationship between health status and tobacco consumption is prone to reverse causality, i.e. individuals may be in good health because they do not smoke, or they may not smoke because they are in good health.

Figure 1: The relationship between smoking and health status or level of schooling (percentage of each group).

| Health status | Non-smokers (%) | Smokers (%) |
|---|---|---|
| Very bad | 0.89 | 1.42 |
| Bad | 3.66 | 5.23 |
| Average | 23.11 | 26.52 |
| Good | 52.69 | 51.38 |
| Very good | 19.66 | 15.45 |

| Level of schooling | Non-smokers (%) | Smokers (%) |
|---|---|---|
| No schooling | 10.23 | 17.50 |
| Elementary or less | 33.63 | 43.89 |
| Middle school | 18.29 | 15.55 |
| High school | 28.71 | 18.23 |
| Higher education | 9.14 | 4.83 |

Note: The result of the Kolmogorov-Smirnov test at the 1% significance level shows that the distribution of these two characteristics is different in the two groups.

3.2. Empirical procedures

Initially, to model the decision to smoke, we used the structure of an Additive Random Utility Model. From this perspective, we treated the decision to smoke as a choice between two alternatives: to smoke (1) or not to smoke (0). Thus, following Cameron and Trivedi (2005), the utility of each choice can be written as

$$U_0 = V_0 + \varepsilon_0$$

$$U_1 = V_1 + \varepsilon_1$$

where $U_0$ and $U_1$ are the utilities when an individual decides not to smoke or to smoke, respectively. $V_0$ and $V_1$ represent the deterministic components and $\varepsilon_0$ and $\varepsilon_1$ represent the random components of those utilities. The individual chooses the option providing the greater utility, so $y = 1$ is observed if $U_1 > U_0$. The probability of observing this result is given by

$$\begin{aligned}
\Pr[y = 1] &= \Pr[U_1 > U_0] \\
&= \Pr[V_1 + \varepsilon_1 > V_0 + \varepsilon_0] \\
&= \Pr[\varepsilon_0 - \varepsilon_1 < V_1 - V_0] \\
&= F(V_1 - V_0)
\end{aligned}$$

where $F$ is the cumulative distribution function of $\varepsilon_0 - \varepsilon_1$. It should be noted that if $V_0 = x'\beta_0$ and $V_1 = x'\beta_1$, it will only be possible to identify $(\beta_1 - \beta_0)$. This means that it is not possible to determine how a certain characteristic affects the utility of each choice, only how it affects the difference in utility between the choices.

To make their decisions, rational individuals usually weigh the costs and benefits of their options and, in this case, it can be inferred that people weigh the costs and benefits of smoking. Assuming that this cost-benefit analysis $F(V_1 - V_0)$ is a function of socioeconomic and demographic characteristics, we have $F(V_1 - V_0) = F(X'\beta)$, where $X$ is the vector of those socioeconomic and demographic characteristics. Under the assumption that $F(\cdot)$ is the standard normal cumulative distribution function $\Phi(\cdot)$, the probability of observing an individual who smokes ($y = 1$) can be written as follows

$$\Pr(y = 1 \mid X) = \Phi(X'\beta)$$
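The probit probability above, and the average effect of switching a dummy regressor from 0 to 1, can be sketched in a few lines. This is an illustration only, assuming the discrete-change definition of a marginal effect for dummy variables; the coefficient vector `beta` and the three example rows are hypothetical and are not estimates from the paper.

```python
import math

def std_normal_cdf(z):
    """Phi(z), the standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def smoking_probability(x, beta):
    """Pr(y = 1 | x) = Phi(x'beta) for one individual."""
    index = sum(xk * bk for xk, bk in zip(x, beta))
    return std_normal_cdf(index)

def dummy_effect(x, beta, k):
    """Change in Pr(y = 1 | x) when dummy regressor k goes from 0 to 1."""
    x_on, x_off = list(x), list(x)
    x_on[k], x_off[k] = 1.0, 0.0
    return smoking_probability(x_on, beta) - smoking_probability(x_off, beta)

def average_dummy_effect(rows, beta, k):
    """Average of the discrete change over all individuals in the sample."""
    return sum(dummy_effect(x, beta, k) for x in rows) / len(rows)

# Hypothetical coefficients: constant, higher-education dummy, male dummy.
beta = [-0.9, -0.4, 0.2]
# Hypothetical individuals: [constant, higher education, male].
rows = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [1.0, 0.0, 0.0]]

ame_education = average_dummy_effect(rows, beta, 1)
print(ame_education)
```

With a negative coefficient on the education dummy, the average effect is negative: the predicted probability of smoking falls when the dummy is switched on, whatever the other characteristics are.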

Having modeled the individual choice of smoking or not, we turn to the determinants of the intensity of addiction, i.e. the number of cigarettes smoked a day. Here it is essential to consider that the number of cigarettes consumed is in part determined by the individual's decision to smoke or not, meaning that a self-selection process is involved.

A common approach is to use a bivariate sample selection model in which a latent variable $y_{1i}^*$ determines the decision of the individual, $Y_{1i}$, and whether the amount of cigarettes consumed, $Y_{2i}$, is observed. The selection equation is given by

$$Y_{1i} = \begin{cases} 1 & \text{if } y_{1i}^* > 0 \\ 0 & \text{if } y_{1i}^* \le 0 \end{cases}$$

while the consumption equation is given by

$$Y_{2i} = \begin{cases} y_{2i}^* & \text{if } y_{1i}^* > 0 \\ - & \text{if } y_{1i}^* \le 0 \end{cases}$$

According to this approach, $Y_{2i}$ is only observed if $y_{1i}^* > 0$ and it does not assume any value when $y_{1i}^* \le 0$. Usually, a linear model with additive errors is specified for the latent variables

$$y_1^* = X_1'\beta_1 + \varepsilon_1$$

$$y_2^* = X_2'\beta_2 + \varepsilon_2$$

If the error terms of the two equations above were uncorrelated, applying OLS to the second equation would yield a consistent estimator. However, we cannot rule out that unobservable characteristics influence both the decision to smoke and the amount of cigarettes consumed. In this case, the usual procedure is to estimate by OLS the following model for the observed values of $Y_2$

$$y_{2i} = X_2'\beta_2 + \sigma_{12}\,\lambda(X_1'\hat{\beta}_1) + \upsilon_1$$

where $\upsilon_1$ is the error term, $\sigma_{12}$ is the covariance between the errors and $\lambda(X_1'\hat{\beta}_1) = \phi(X_1'\hat{\beta}_1)/\Phi(X_1'\hat{\beta}_1)$ is the inverse Mills ratio obtained by estimating a probit model of $y_1$ on $X_1$.

Alternatively, the number of cigarettes consumed can be modeled as a counting process. That is, we assume that the number of cigarettes consumed follows a Poisson distribution

$$\Pr[Y = y] = \frac{e^{-\mu}\mu^{y}}{y!}, \qquad y = 0, 1, 2, \ldots$$

where $\mu$ is the mean. However, this distribution is only appropriate under the equidispersion property, i.e. $E[Y] = V[Y] = \mu$. In the presence of overdispersion, the process is best modeled by a Negative Binomial distribution, where $E[y \mid \mu, \alpha] = \mu$ and $V[y \mid \mu, \alpha] = \mu(1 + \alpha\mu)$.

In addition, an excess of zeros must be considered in a counting process. In the case of the number of cigarettes, there are many zeros due to non-smokers. It is then advisable to model the decision process in two stages as

$$f(y) = \begin{cases} f_1(0) & \text{if } y = 0 \\[4pt] \dfrac{1 - f_1(0)}{1 - f_2(0)}\, f_2(y) & \text{if } y \ge 1 \end{cases}$$

In this approach, known as the Hurdle Model, the two parts are functionally independent. According to Cameron and Trivedi (2009), this hypothesis can be relaxed if it can be assumed that a zero can arise either from the decision process or from the counting process, as in the following distribution

$$f(y) = \begin{cases} f_1(0) + \{1 - f_1(0)\}\, f_2(0) & \text{if } y = 0 \\[4pt] \{1 - f_1(0)\}\, f_2(y) & \text{if } y \ge 1 \end{cases}$$

Therefore, given that the number of cigarettes $f(y)$ follows a process such as that described in the equation above, the effect of a particular characteristic of an individual, if there is equidispersion, is estimated through a zero-inflated Poisson regression. Otherwise, the most appropriate procedure is to assume that the decision process follows a normal distribution and that $f_2(\cdot)$ has a negative binomial distribution. In this case, the estimation is done using a zero-inflated negative binomial (ZINB) model.
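Two building blocks of the models just described can be written out directly: the inverse Mills ratio used in the second stage of the selection model, and the zero-inflated negative binomial probability function. The sketch below is illustrative only. It assumes the NB2 parameterization with mean $\mu$ and variance $\mu(1 + \alpha\mu)$ given in the text, and the values of `p_zero` (playing the role of $f_1(0)$), `mu` and `alpha` are hypothetical, not estimates from the paper.

```python
import math

def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(z):
    """lambda(z) = phi(z) / Phi(z), evaluated at the probit index z = X1'beta1."""
    return std_normal_pdf(z) / std_normal_cdf(z)

def neg_binomial_pmf(y, mu, alpha):
    """NB2 probability of count y, with E[y] = mu and V[y] = mu * (1 + alpha * mu)."""
    inv_alpha = 1.0 / alpha
    log_p = (math.lgamma(y + inv_alpha) - math.lgamma(y + 1) - math.lgamma(inv_alpha)
             + inv_alpha * math.log(inv_alpha / (inv_alpha + mu))
             + y * math.log(mu / (inv_alpha + mu)))
    return math.exp(log_p)

def zero_inflated_pmf(y, p_zero, mu, alpha):
    """f(y) with zeros coming from both the decision and the counting process."""
    base = neg_binomial_pmf(y, mu, alpha)
    if y == 0:
        return p_zero + (1.0 - p_zero) * base
    return (1.0 - p_zero) * base

# Hypothetical parameters: 80% structural zeros, mean of 12 for the count part.
p_zero, mu, alpha = 0.8, 12.0, 0.3
support = range(500)
total = sum(zero_inflated_pmf(y, p_zero, mu, alpha) for y in support)
mean = sum(y * zero_inflated_pmf(y, p_zero, mu, alpha) for y in support)
print(total, mean)
```

Under these assumptions the probabilities sum to one and the overall mean equals $(1 - f_1(0))\mu$, which shows how a large share of non-smokers pulls the unconditional mean far below the mean among smokers.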
Thus, both the probit model and the first stage of the sample selection model are used to investigate the decision to smoke, while the second stage of the sample selection model and the count model are used to analyze the intensity of cigarette consumption.

Regarding the sample selection model, the parameter $\rho$ was significant at 1%, i.e. we reject the hypothesis that the two equations are independent. In the count models, in turn, we reject the equidispersion hypothesis ($V[y \mid x] = E[y \mid x]$) at a significance level of 1%. Next, the Vuong test showed that zero inflation is indeed present and, therefore, that the ZINB models are preferable.

Table 4 shows the marginal effects estimated for each model. For the sample selection models, we computed the marginal effect of both the first and the second stage on the censored average (see footnote 7). The first two columns show the effect of the variables on the decision to smoke. The last two columns show the effect on the number of cigarettes consumed per day. Note that the estimated marginal effects of the regressors are very similar across modeling strategies, both in the analysis of the decision to smoke and in the analysis of the intensity of addiction. For the most part, marginal effects were statistically significant and showed the expected direction.
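The role of the selection term whose significance is reported above can be illustrated with a short Python sketch of the inverse Mills ratio, $\lambda(z) = \phi(z)/\Phi(z)$, and of the corrected conditional mean $X_2'\beta_2 + \sigma_{12}\lambda(X_1'\hat{\beta}_1)$. This is only an illustration under stated assumptions: the function names and the numerical values (`XB2`, the probit indices and the value 2.0 for $\sigma_{12}$) are hypothetical and are not taken from the estimates in this study.

```python
import math


def std_normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(index):
    """lambda(index) = phi(index) / Phi(index), evaluated at the probit index."""
    return std_normal_pdf(index) / std_normal_cdf(index)


def corrected_mean(xb2, sigma12, index):
    """Second-stage mean with the selection term: xb2 + sigma12 * lambda(index)."""
    return xb2 + sigma12 * inverse_mills_ratio(index)


# Hypothetical values, for illustration only (not estimates from the paper).
XB2 = 12.0
for index in (-1.0, 0.0, 1.0):
    print(
        index,
        round(inverse_mills_ratio(index), 4),
        round(corrected_mean(XB2, 0.0, index), 4),  # uncorrelated errors
        round(corrected_mean(XB2, 2.0, index), 4),  # correlated errors
    )
```

When $\sigma_{12} = 0$ the selection term vanishes and the second-stage mean reduces to $X_2'\beta_2$, which is the case in which plain OLS would be consistent. When $\sigma_{12} \neq 0$, the correction is larger for individuals with a lower probit index, since $\lambda$ is decreasing in its argument.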

7 For more details, see Hoffmann and Kassouf (2005).

Table 4: Average Marginal Effects of Socioeconomic Characteristics on Tobacco Addiction

Variables             Probit model          Selection, 1st stage  Selection, 2nd stage  ZINB model
Level of Education
  Elementary or less  −0.028*** (0.009)     −0.016** (0.008)      −0.209 (0.127)        −0.047 (0.113)
  Middle school       −0.075*** (0.010)     −0.059*** (0.009)     −0.821*** (0.151)     −0.600*** (0.132)
  High school         −0.090*** (0.010)     −0.075*** (0.009)     −1.117*** (0.146)     −0.863*** (0.129)
  Higher education    −0.094*** (0.012)     −0.079*** (0.013)     −1.193*** (0.190)     −1.014*** (0.154)
Income (ln)           −0.017*** (0.003)     −0.012*** (0.003)     −0.159*** (0.047)     −0.049 (0.040)
Works                 0.065*** (0.005)      0.058*** (0.005)      0.852*** (0.074)      0.752*** (0.066)
Age                   0.001*** (0.000)      0.001*** (0.000)      0.009*** (0.003)      0.007*** (0.002)
Man                   0.048*** (0.005)      0.052*** (0.005)      0.821*** (0.081)      0.753*** (0.068)
White                 −0.019*** (0.005)     −0.016*** (0.005)     −0.249*** (0.073)     −0.072 (0.062)
Householder           0.058*** (0.005)      0.055*** (0.005)      0.848*** (0.085)      0.734*** (0.073)
Health insurance      −0.022*** (0.006)     −0.024*** (0.006)     −0.315*** (0.088)     −0.280*** (0.077)
Health Status
  Bad                 0.003 (0.026)         0.005 (0.023)         0.060 (0.369)         −0.002 (0.345)
  Average             −0.024 (0.023)        −0.018 (0.020)        −0.301 (0.331)        −0.288 (0.311)
  Good                −0.029 (0.023)        −0.027 (0.020)        −0.430 (0.330)        −0.424 (0.311)
  Very good           −0.051** (0.023)      −0.048** (0.021)      −0.772** (0.336)      −0.701** (0.316)
Smokers               0.345*** (0.014)      0.317*** (0.013)      4.759*** (0.206)      4.155*** (0.184)
Warning               −0.066*** (0.016)     −0.057*** (0.015)     −0.824*** (0.242)     −0.610*** (0.204)
Marketing             0.042*** (0.005)      0.044*** (0.005)      0.682*** (0.077)      0.610*** (0.066)
Urban                 0.004 (0.006)         0.008 (0.006)         0.104 (0.087)         0.205*** (0.076)
Metropolis            0.006 (0.005)         0.009* (0.005)        0.116 (0.079)         0.194*** (0.069)
North                 −0.038*** (0.009)     −0.042*** (0.009)     −0.730*** (0.136)     −0.324*** (0.063)
Northeast             −0.049*** (0.007)     −0.050*** (0.007)     −0.811*** (0.113)     −0.230*** (0.047)
Southeast             −0.019*** (0.007)     −0.019*** (0.007)     −0.369*** (0.112)     −0.119*** (0.042)
Midwest               −0.033*** (0.009)     −0.029*** (0.009)     −0.513*** (0.137)     −0.069 (0.057)

Notes: N = 35,999; Standard errors in parentheses; * Significance at 10%; ** Significance at 5%; *** Significance at 1%.

Special mention should be made of the variable that captures self-declared health status. A negative relationship between health status and the decision to smoke or the intensity of cigarette consumption exists only for those who reported being in very good health. In addition, the variables indicating urban area and large city do not seem to influence the decision to smoke.
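To make the notion of an average marginal effect in Table 4 more concrete, the following Python sketch shows one common way of computing it for a binary regressor in a probit model: the difference $\Phi(x'\beta + \beta_d) - \Phi(x'\beta)$ is evaluated for each individual and then averaged. The text does not state that the estimates were computed exactly in this way, so this is an assumption; the probit indices in `BASE_INDICES` and the coefficient `COEF_DUMMY` are hypothetical values, not estimates from this study.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def average_marginal_effect_dummy(base_indices, coef):
    """Average over individuals of Phi(index + coef) - Phi(index).

    base_indices holds each individual's probit index with the dummy set to 0.
    """
    effects = [std_normal_cdf(b + coef) - std_normal_cdf(b) for b in base_indices]
    return sum(effects) / len(effects)


# Hypothetical probit indices and coefficient, for illustration only.
BASE_INDICES = [-1.2, -0.9, -0.7, -0.4]
COEF_DUMMY = -0.3

ame = average_marginal_effect_dummy(BASE_INDICES, COEF_DUMMY)
print(f"change in probability: {ame:.3f} ({100 * ame:.1f} percentage points)")
```

Because the result is a change in a probability, multiplying it by 100 expresses it in percentage points.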

4. Results

As seen in the previous subsection, the results were not very sensitive to the modeling strategy used. This holds both for the analysis of the determinants of the decision to smoke and for the analysis of the determinants of the amount of cigarettes smoked daily.

Based on these estimations, we sought to answer the following questions. What are the socioeconomic characteristics associated with the decision to smoke and with the intensity of cigarette consumption? What role does education play in relation to tobacco addiction? The hypotheses that guided this study, as listed in Section 1, are the following: "education reduces the probability of an individual being a smoker" and "education reduces the intensity of tobacco use if the individual is a smoker".

The estimates presented in Table 4 provide strong evidence that investing in human capital, represented here by one's schooling level, tends to reduce both the likelihood of smoking and the intensity of cigarette consumption. Thus, our results support the hypotheses stated above. More educated individuals are on average less likely to smoke, and when they do decide to smoke they tend to consume fewer cigarettes daily. Our estimates indicate, at the means, a reduction of about 0.09 in the probability of smoking (roughly 9 percentage points) for more educated individuals, who also tend to smoke about 1 cigarette less per day than uneducated individuals.

Our evidence corroborates the findings of other studies that we reviewed. Cutler and Lleras-Muney (2006), Kendler et al. (1999), Madden (2008) and Grimard and Parent (2007) also provide evidence that years of schooling have a negative effect on the likelihood of smoking. However, after controlling for endogeneity between years of schooling and smoking, Koning et al. (2015) found no statistically relevant relationship between education and the decision to smoke.
Regarding the amount of cigarettes smoked, both Cutler and Lleras-Muney (2006) and De Walque (2007) provide evidence similar to that presented here.

Therefore, it can be inferred that the effect of education on smoking behavior can be described as follows: (i) higher levels of education affect the way individuals think and make decisions (Cutler and Lleras-Muney, 2006), (ii) education prevents people from adopting behaviors that can be harmful to their health in the future (Koning et al., 2015) and (iii) education leads to a better understanding of the costs associated with smoking.

The specialized literature acknowledges the robust relationship between education and health. Our estimates are in line with those of other empirical studies, since the relationship between education and health, although not causal, is expressed here by the negative effect of the level of education on tobacco addiction.

As for the other controls, we observed that the higher the household income, the lower the probability of an individual smoking. Its negative effect on the intensity of tobacco addiction, in turn, was only captured in the sample selection model. On average, the older an individual, the higher the probability that he or she is a smoker, but the effect of age on the intensity of addiction is minimal. Being male or the head of the family are also factors that on average increase both the probability of smoking and the number of cigarettes consumed per day.

As the percentage of smokers in a household increases, the probability of an individual being a smoker increases by approximately 0.3 (about 30 percentage points), while the amount of cigarettes smoked increases by about 4 units. This result corroborates the findings of the previous literature. Analyses conducted with adolescents in the 10-20 age bracket and even with older smokers show that most of them have relatives who also smoke. This relationship is one of the most influential risk factors for the onset of smoking.
The effects can be even stronger for young people from low-income families and for young parents with low education, factors that can lead to persistent family cycles (or cycles of deprivation) throughout one's life as a smoker (see footnote 8).

Information on the risks of smoking and exposure to tobacco advertising are variables with high statistical significance and the expected sign. That is, the more information, the lower the probability of smoking and the lower the amount of cigarettes consumed. It is to be expected that rational individuals avoid risky behaviors. This effect can occur through two channels. The first is education, i.e. the more educated individuals are, the better they understand the health risks posed by tobacco use. The second is access to information, i.e. individuals who have access to information or who seek it are more likely to use it when deciding whether or not to smoke.

With regard to advertising, its estimated effect was as expected. The variable indicating exposure to any tobacco advertising strategy has a positive sign and high statistical significance both for the decision to smoke and for the number of cigarettes consumed per day.

8 For a more detailed discussion on the relationship between the family environment and smoking see, for example, Hill et al. (2005), Gilman et al. (2009), Williams and Covington (1997) and Avenevoli and Merikangas (2003).

Health-related characteristics also had the expected effect. Individuals with health insurance are less likely to smoke, and if they do smoke they tend to consume fewer cigarettes than those without health insurance. In addition, individuals who self-reported being in very good health are the least likely to smoke and, according to the estimates in Table 4, consume a smaller amount of cigarettes. It should be noted that these findings should not be seen as causal, since we did not control for endogeneity resulting from reverse causality between smoking and health status.

5. Conclusion

In this paper, we investigate the relationship between education and smoking in Brazil. For this purpose, we use a national database (PETab) that provides a wide range of information related to smoking behavior. The wealth of information used here is seldom found in other studies that investigated this relationship.

We conducted this analysis by modeling the decision to smoke and also the amount of cigarettes smoked per day. We therefore estimated a probit model, a sample selection model and a count model (ZINB). Despite the different modeling strategies adopted, the results were very similar.

Although our results cannot be interpreted as indicating a causal relationship, the evidence found in this study, besides being consistent with that found in the literature, suggests that more educated individuals are less likely to smoke and that when they do smoke they tend to consume fewer cigarettes per day. This suggests that education may be the right path for controlling and reducing smoking in Brazil. In addition to making a more informed decision about whether or not to smoke, more educated individuals tend to avoid risky behaviors and are better prepared to evaluate the costs and benefits of their actions.

Among the other regressors used in our analysis, we highlight the role of both information on the risks of smoking and exposure to cigarette ads. Perhaps for the first time, we describe the effect of these variables on the decision to smoke and on the amount of cigarettes consumed daily. The estimated and statistically significant marginal effects presented in this study suggest that policies for controlling and reducing smoking through the two above-mentioned channels are also very likely to be successful. These secondary results are extremely important because they are in line with the National Program of Tobacco Control.
Since 1980, the Ministry of Health, together with the National Cancer Institute, has developed programs to reduce the prevalence of tobacco use and the morbidity and mortality related to it. Among these programs, we highlight educational actions and legislation such as Law No. 10,167, sanctioned on December 27, 2000, which restricted the advertising of tobacco products to the interior of the establishments that sell them, in addition to prohibiting sponsorship of national events. Thus, it is possible to infer that the actions of the National Program of Tobacco Control are focused on two axes with great potential for success in controlling smoking among Brazilians.

Although these results provide sufficient inputs for policy makers, there is still room for further investigation. For example, we have not yet investigated the effect of schooling on cessation of addiction, on attempts to quit and on the duration of addiction.

Acknowledgment

The authors thank the National Council for Technological and Scientific Development (CNPq) for financial support to conduct this research (process number 442483/2014-7). Marcelo Justus thanks CNPq for his Productivity in Research Grant.

References

Avenevoli, S. and K. R. Merikangas (2003). Familial influences on adolescent smoking. Addiction 98 (s1), 1–20.
Becker, G. S. (1962). Investment in human capital: A theoretical analysis. Journal of Political Economy 70 (5, Part 2), 9–49.
Becker, G. S. and K. M. Murphy (1988, Aug.). The theory of rational addiction. Journal of Political Economy 96 (4), 675–700.
Cameron, A. C. and P. K. Trivedi (2005, July). Microeconometrics: Methods and Applications (1 ed.). New York: Cambridge University Press.
Cameron, A. C. and P. K. Trivedi (2009, March). Microeconometrics Using Stata (2 ed.). College Station: Stata Press.
Cutler, D. M. and A. Lleras-Muney (2006). Education and health: evaluating theories and evidence. Technical report, National Bureau of Economic Research.
De Walque, D. (2007). Does education affect smoking behaviors? Evidence using the Vietnam draft as an instrument for college education. Journal of Health Economics 26 (5), 877–895.
De Walque, D. (2010). Education, information, and smoking decisions: evidence from smoking histories in the United States, 1940–2000. Journal of Human Resources 45 (3), 682–717.
Deaton, A. S. and C. Paxson (2004). Mortality, income, and income inequality over time in Britain and the United States. In Perspectives on the Economics of Aging, pp. 247–286. University of Chicago Press.
Feinstein, L. (2002). Quantitative Estimates of the Social Benefits of Learning, 2: Health (Depression and Obesity). Wider Benefits of Learning Research Report. ERIC.
Fuchs, V. R. (1982). Introduction to "Economic aspects of health". In Economic Aspects of Health, pp. 1–12. University of Chicago Press.
Galama, T. J. and H. van Kippersluis (2015). A theory of education and health. Working Paper No. 2015-007, 1–78.
Gilman, S. E., R. Rende, J. Boergers, D. B. Abrams, S. L. Buka, M. A. Clark, S. M. Colby, B. Hitsman, A. N. Kazura, L. P. Lipsitt, et al. (2009). Parental smoking and adolescent smoking initiation: an intergenerational perspective on tobacco control. Pediatrics 123 (2), e274–e281.
Grimard, F. and D. Parent (2007). Education and smoking: Were Vietnam war draft avoiders also more likely to avoid smoking? Journal of Health Economics 26 (5), 896–926.
Grossman, M. (1972). On the concept of health capital and the demand for health. Journal of Political Economy 80 (2), 223–255.
Grossman, M. (2004). Handbook of the Economics of Education, Chapter Education and Personal Behavior. Forthcoming. Amsterdam: Elsevier Science.
Grossman, M. (2006). Handbook of the Economics of Education, Volume 1, Chapter Education and nonmarket outcomes, pp. 577–633. Elsevier.
Hill, K. G., J. D. Hawkins, R. F. Catalano, R. D. Abbott, and J. Guo (2005). Family influences on the risk of daily smoking initiation. Journal of Adolescent Health 37 (3), 202–210.
Hoffmann, R. and A. L. Kassouf (2005). Deriving conditional and unconditional marginal effects in log earnings equations estimated by Heckman's procedure. Applied Economics 37 (11), 1303–1311.
IBGE (2014). Pesquisa nacional de saúde 2013: percepção do estado de saúde, estilos de vida e doenças crônicas.
Keeler, T. E., T.-w. Hu, W. G. Manning, and H.-Y. Sung (2001). State tobacco taxation, education and smoking: controlling for the effects of omitted variables. National Tax Journal, 83–102.
Kendler, K. S., M. Neale, P. Sullivan, L. Corey, C. Gardner, and C. Prescott (1999). A population-based twin study in women of smoking initiation and nicotine dependence. Psychological Medicine 29 (02), 299–308.
Koning, P., D. Webbink, and N. G. Martin (2015). The effect of education on smoking behavior: new evidence from smoking durations of a sample of twins. Empirical Economics 48 (4), 1479–1497.
Lleras-Muney, A. (2005). The relationship between education and adult mortality in the United States. The Review of Economic Studies 72 (1), 189–221.
Madden, D. (2008). Sample selection versus two-part models revisited: The case of female smoking and drinking. Journal of Health Economics 27 (2), 300–307.
Pinto, M. and M. A. D. Ugá (2010). Os custos de doenças tabaco-relacionadas para o sistema único de saúde. Cad. Saúde Pública 26 (6), 1234–1245.
Sander, W. (1995). Schooling and quitting smoking. The Review of Economics and Statistics, 191–199.
Schultz, T. W. (1961). Investment in human capital. The American Economic Review, 1–17.
Suranovic, S. M., R. S. Goldfarb, and T. C. Leonard (1999). An economic theory of cigarette addiction. Journal of Health Economics 18 (1), 1–29.
Tenn, S., D. A. Herman, and B. Wendling (2010). The role of education in the production of health: An empirical analysis of smoking behavior. Journal of Health Economics 29 (3), 404–417.
Wetter, D. W., L. Cofta-Gunn, R. T. Fouladi, J. E. Irvin, P. Daza, C. Mazas, K. Wright, P. M. Cinciripini, and E. R. Gritz (2005). Understanding the associations among education, employment characteristics, and smoking. Addictive Behaviors 30 (5), 905–914.
Williams, J. G. and C. J. Covington (1997). Predictors of cigarette smoking among adolescents. Psychological Reports 80 (2), 481–482.
World Bank (1999). A epidemia do tabagismo: Os governos e os aspectos econômicos do controle do tabaco. The World Bank, agosto.
World Bank (2015). Taxes on tobacco can save a life every six seconds. Technical report, The World Bank.
Zhu, B.-P., G. A. Giovino, P. D. Mowery, and M. P. Eriksen (1996). The relationship between cigarette smoking and education revisited: implications for categorizing persons' educational status. American Journal of Public Health 86 (11), 1582–1589.
