Poverty and Inequality Mapping in Bulgaria
Final Report
Oleksiy Ivaschenko
January 2004

Acknowledgements: I am very grateful to the staff of the National Statistical Institute (NSI) of Bulgaria for their help in producing the poverty and inequality maps. I am especially thankful to Rositza Balakova, Dora Mircheva, Svetla Liamova, and Lilia Miteva. The NSI director, Alexander Hadjiiski, played an instrumental role in providing access to the Census data. I would also like to acknowledge the very useful comments provided by a referee (Mr. Petkov) on an earlier version of this report. My special thanks go to Gero Carletto and Peter Lanjouw from the World Bank for their guidance during all stages of this work.

Table of Contents:
I. Introduction
II. Zero Stage work: looking for a set of common variables in the Census and the BIHS
II. a) Explaining zero stage work
II. b) The results of zero stage work
III. First stage: estimating the model of household per capita consumption using BIHS data
IV. Second Stage: Predicting household per capita consumption using the Census data
V. Poverty and inequality estimates for various geographical units
VI. Conclusions
References
Appendix A. Establishing the comparability of variables in the Census and the BIHS
Appendix B. Modeling heteroscedasticity in the household component of the residual
Appendix C. Poverty and Inequality measures
Appendix D. Estimates of poverty and inequality for districts and municipalities
Appendix E. Poverty and Inequality Maps

I. Introduction

The lack of reliable information about the welfare of the population at the level of districts and municipalities in Bulgaria has been a constraining factor in the design, implementation and evaluation of economic and social programs targeted at regional development.
Although the Population and Housing Census (PHC), which was successfully conducted by the National Statistical Institute of Bulgaria (NSI) in 2001, provides comprehensive information on household socio-demographic conditions, dwelling conditions, and individual characteristics of household members (such as age, education, employment status, etc.), it does not contain the information necessary to construct a consumption or income aggregate. At the same time, the Bulgaria Integrated Household Survey (BIHS), conducted by Gallup International in April-May 2001 using the pre-census listing to draw a nationally representative sample of 2,500 households, made it possible to construct a reliable consumption-based welfare measure. However, owing to its relatively small sample size, this survey does not allow one to obtain reliable welfare estimates at a more disaggregated level than Sofia city (the capital), other urban areas, and rural areas.

The main objective of the poverty and inequality mapping in Bulgaria is to produce estimates of poverty and inequality that are representative at the level of 28 districts and 262 municipalities. The poverty mapping work is based on a methodology developed by C. Elbers, J. Lanjouw and P. Lanjouw (2002a), which allows one to obtain accurate estimates of consumption-based poverty and inequality at a disaggregated regional level by combining information from the Census and the household (consumption) survey.

This methodology involves three major steps. The main purpose of the zero stage work is to select a set of variables that are common to the Census and the BIHS. In the first stage, the subset of variables found to match (contain the same information) between the Census and the BIHS is used to estimate a regression model of per capita consumption using the BIHS data.
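To illustrate the first-stage regression, the following is a minimal sketch under simplifying assumptions, and not the model estimated in this report. It fits the logarithm of per capita consumption on two regressors by ordinary least squares, using synthetic data. The variable names (`hh_size`, `head_educ_years`), the coefficient values, the log transformation, and the use of plain OLS without survey weights or separate location and household error components are all assumptions made for illustration.

```python
import numpy as np

def fit_first_stage(X, log_pcc):
    """OLS of log per capita consumption on X.

    Returns the coefficient vector with the intercept first.
    """
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, log_pcc, rcond=None)
    return beta

# Synthetic "survey" of 2,500 households (hypothetical values throughout).
rng = np.random.default_rng(0)
n = 2500
hh_size = rng.integers(1, 7, size=n)           # household size, 1 to 6
head_educ_years = rng.integers(4, 17, size=n)  # head's schooling, 4 to 16 years
X = np.column_stack([hh_size, head_educ_years])

# Assumed data-generating coefficients: intercept, hh_size, head_educ_years.
true_beta = np.array([4.0, -0.15, 0.05])
log_pcc = true_beta[0] + X @ true_beta[1:] + rng.normal(0.0, 0.3, size=n)

beta_hat = fit_first_stage(X, log_pcc)
print(beta_hat)
```

With a sample of this size the estimated coefficients should lie close to the assumed ones; in practice the regressors would be restricted to variables that pass the zero stage comparability checks.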
In the second stage, the parameter estimates obtained from the consumption model are applied to the similarly defined variables in the Census to obtain predicted per capita consumption for each census household. Based on the predicted level of per capita consumption, estimates of poverty and inequality, as well as their standard errors, are calculated for any geographical unit that has a sufficient number of households to yield reliable estimates. As mentioned above, in this report we focus on estimates of welfare at the level of districts (with urban/rural disaggregation) and municipalities. However, we also provide estimates for Sofia city, other urban and rural strata (as well as for the whole country) in order to compare them with the estimates obtained directly from the BIHS.

The remainder of the report is structured as follows. Section II describes the zero stage work in greater detail and presents its findings. Section III is devoted to the estimation of household per capita consumption using the household survey data. Section IV discusses how the welfare estimates are calculated using the Census data. Section V presents the results of estimating poverty and inequality for the districts and municipalities of Bulgaria. Section VI concludes.

II. Zero Stage work: looking for a set of common variables in the Census and the BIHS

II. a) Explaining zero stage work

The zero stage work aims at identifying a set of comparable variables in the Census and the BIHS. A high degree of comparability of the selected variables is crucial for obtaining accurate estimates of welfare. Even when the household survey and the Census have what seem to be identically worded questions, subtle differences in the way the questions are asked, or a different ordering of questions, may cause the information content of the answers to differ between the two data sources.
The comparability assessment essentially involves determining whether the variables are statistically similarly distributed over the households in the Census and the household survey samples. We perform this comparability procedure at the national level, as well as separately for urban (including Sofia city) and rural areas, the levels at which the BIHS was designed to be representative of the population.

A set of about 30 candidate common variables covering household socio-demographic conditions, quality of dwelling, and the characteristics of the household head was initially identified by systematically comparing the Census and BIHS questionnaires, and studying the interviewer manuals for both surveys when necessary (see Table A1, Appendix A for the list of those variables). Initially, the following qualitative criteria were used to select a set of candidate variables:
(a) Are the questions identically worded?
(b) Are the criteria pertaining to the questions and answers identical (e.g., children are defined as those of age under 15 in both data sets)?
(c) Are the answer options identical?
(d) Are the interviewer instructions pertaining to the questions identical?

In those cases where the number of answer options differs between the Census and the BIHS (e.g., the construction material of the dwelling), we check whether several categories in one data source could be combined in a way that would make them comparable with a certain category in the other data source. Descriptive statistics (mean, standard deviation, minimum and maximum values) are then produced for the selected variables, or the variables constructed from them, that we expect to match between the Census and the BIHS.

The next step of the analysis involves comparing the descriptive statistics for the set of candidate variables in the Census and the BIHS data to investigate whether the initially selected variables are statistically similarly distributed over households in these two data sources.
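The comparison of descriptive statistics can be sketched as follows. This is an illustrative example only: the data are synthetic, the function names are hypothetical, and a variable is flagged as comparable when the Census mean falls inside a 95% confidence interval around the survey mean. The interval uses a normal approximation that treats the survey as a simple random sample, which is an assumption; a survey with a complex design would call for design-based standard errors.

```python
import numpy as np

def describe(values):
    """Mean, standard deviation, minimum and maximum of one variable."""
    values = np.asarray(values, dtype=float)
    return {
        "mean": values.mean(),
        "std": values.std(ddof=1),  # sample standard deviation
        "min": values.min(),
        "max": values.max(),
    }

def census_mean_within_survey_ci(survey_values, census_mean, z=1.96):
    """Check whether the census mean lies in the CI around the survey mean.

    Assumes simple random sampling and a normal approximation (z=1.96
    corresponds to a 95% interval).
    """
    stats = describe(survey_values)
    half_width = z * stats["std"] / np.sqrt(len(survey_values))
    lower = stats["mean"] - half_width
    upper = stats["mean"] + half_width
    return bool(lower <= census_mean <= upper), (lower, upper)

# Synthetic 0/1 indicator observed in a hypothetical survey of 1,000 households.
survey_indicator = np.array([1] * 300 + [0] * 700)

ok_close, ci = census_mean_within_survey_ci(survey_indicator, census_mean=0.31)
ok_far, _ = census_mean_within_survey_ci(survey_indicator, census_mean=0.35)
print(describe(survey_indicator))
print(ok_close, ok_far, ci)
```

In this example a census share of 0.31 falls inside the interval around the survey share of 0.30, while a census share of 0.35 does not, so the second case would be sent back for a closer look at definitions and answer categories.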
As the main criterion for the extent of the overlap between the two data sources, we test whether the Census mean for a variable lies within the 95% confidence interval around the BIHS mean for the same variable. In those cases where the Census mean is found to lie outside this 95% confidence interval, we make every possible effort to understand the sources of the discrepancy by going back to the data, verifying the definitions of the variables against the questionnaire instructions for both surveys, and talking to the NSI experts who could provide additional explanation of the content of the questions and answer options. In several cases, it was found that with some justifiable adjustments (e.g., re-grouping the answer categories for a question in the BIHS to get a better fit with a particular category in the Census) the comparability of the variable in the two data sources could be established.[1] Several variables were also identified that, owing to different wording (or meaning) of the questions in the two data sources, are clearly not comparable. Such variables are excluded from the subsequent analysis.

II. b) The results of zero stage work

The analysis of the comparability of the set of candidate variables between the Census and the BIHS suggests that there is a good match for most of them. The descriptive statistics for the list of common variables in both the Census and the BIHS at the national level, and separately for urban and rural areas, are presented in Tables A2, A3 and A4, Appendix A. For several variables we have found the comparability to be better at the urban level than at the rural one, and vice versa. The main problem identified during the analysis is the under-representation of one-person households in the BIHS as compared to the Census (17.7% vs. 22.7% of the total number of households, respectively).[2] When analyzing the age structure of one-person households, we have

[1] As an example of such an adjustment, it was found that the self-reported ethnicity in the BIHS matches the ethnicity status in the Census (which is also self-reported) better than the interviewer-verified ethnicity status available in the BIHS data.

[2] The main reason for this seems to be that in the BIHS the replacement procedure for households that could not be interviewed due to absence or refusal required that the replacement household come from a dwelling of the same size (such as a 1 BR apartment), but not necessarily have the same number of household members.