Ranking and Grouping of the Districts of Sri Lanka Based on the Expenses of Households: a Multivariate Analysis
Total Page:16
File Type:pdf, Size:1020Kb
RUHUNA JOURNAL OF SCIENCE Vol 11 (1): 83-97, June 2020 eISSN: 2536-8400 Faculty of Science DOI: http://doi.org/10.4038/rjs.v11i1.89 University of Ruhuna Ranking and grouping of the districts of Sri Lanka based on the expenses of households: A multivariate analysis S. R. Gnanapragasam Department of Mathematics, The Open University of Sri Lanka, Sri Lanka Correspondence: [email protected]; https://orcid.org/0000-0003-1411-4853 Received: 22nd March 2019, Revised: 09th October 2019, Accepted: 19th April 2020 Abstract. Sri Lanka has twenty-five districts administrated under nine provinces. The cost of living (CoL) diverges among the districts in Sri Lanka like in many parts of the world. Ranking and grouping based on household expenses is useful in decision making by stakeholders. Principal component analysis and cluster analysis are used for ranking and grouping the districts, respectively, based on the expenses of households in Sri Lanka. Sri Lankans spend more for non-food items than food items, particularly for housing and transport, and therefore non-food items are the most influencing factor to decide the CoL in the country. It is concluded that Colombo district has the highest CoL, followed by Gampaha and Kalutara, whereas Kilinochchi and Mullaitivu districts have the least CoL in Sri Lanka. Districts with moderately high CoL were also identified. These classifications will facilitate investors on decision making and also help people to decide which part of the country will be suitable to settle depending on their income and the CoL in districts. Moreover, this grouping will provide some information to policy makers when planning infrastructure development in the country, and it may also provide a direction to use a new index to measure CoL in Sri Lanka. Keywords: Cluster analysis, cost of living, principal component analysis. 1 Introduction Sri Lanka - also known as the pearl of the Indian Ocean is one of the top tourist destinations in the world. Twenty-five second-level administrative divisions called ‘districts’ are administrated under nine first-level administrative divisions or ‘provinces’ in Sri Lanka. As per the World Bank Report (2018), Sri Lanka is a lower middle-income country with 21.4 million of population. Further, it reports that, its economy grew at an average 5.8% from 2010 to 2017 and it has made significant progress in its socio-economic indicator with social indicators ranked among the highest in South Asia. In Sri Lanka, the cost of living (CoL) differs among the districts. The consumer price index (CPI) is the most widely used measure of CoL in many countries. It Faculty of Science, University of Ruhuna 83 Sri Lanka S. R. Gnanapragasam Ranking and grouping of the districts of Sri Lanka expresses the overall cost of the several goods and services bought by a typical consumer into a single index for measuring the general price level. The CPI is an accurate measure of the selected goods that make up the typical basket, but it is not a perfect measure of CoL. When CoL increases, on one hand housing becomes less affordable but, on the other hand, it often reflects positive developments at a higher level. Generally, Colombo consumer price index (CCPI), wholesale price index (WPI) and gross domestic product deflator (GDPD) are the three major indicators used to measure the changes in the prices in Sri Lanka. According to Kulatunge (2017), CCPI is the official CoL index in Sri Lanka. Furthermore, it says that, CCPI covers a large number of items and is heavily weighted to food items which include 41% of the index. Housing, water, electricity and gas weighted 24% while other fuels and transport items weighted 12% of the items in baskets. Although producing an ideal CoL index may be impossible, that concept directed and encouraged the researchers and relevant sectors to improve the CPI as mentioned by Abraham (2003). In addition, the study of Korale (2001) highlighted the need for a new index to serve as a general inflation indicator. It argued that a specially designed household expenditure survey is needed for a new CPI that should be undertaken primarily for designing weights for the CPI. Jacobs et al. (2014) also has pointed out that CPI is not an ideal measure of changes in CoL. Thus, alternative methods must be tried out to measure CoL in Sri Lanka too. An econometric approach to measure CoL was developed by Jorgenson and Slesnick (1990). Gnanapragasam (2016) has classified the districts of Sri Lanka based on CoL by the data obtained in the report of the household income and expenditure survey year 2012/13. In that study, particularly, classifications were done mainly based on the expenditure of essentials non-food items in Sri Lanka. However, districts grew and changed over time and therefore the classifications need to consider the temporal changes as well. This study considers the data, not only, non-food items, but also food items obtained in the report of the Household Income and Expenditure Survey of 2016. The rankings are useful tools for decision making by stakeholders. Principal component analysis (PCA) is widely used in various research or statistical sectors worldwide. Manage and Scariano (2013) demonstrated the use of PCA in ranking batsmen and bowlers who played in 2012 Indian Pemier League Cricket Tournament. Furthermore, in the studies of Muzamhindo et al. (2017) and Steiner (2006), PCA techniques were used to rank the world universities. Clustering is another common tool widely used in many fields of studies such as socio-economic measures (income, education, profession etc.), psychographic measures (interest, life style, motivation, etc.) and measures linked to the buying behavior (price range, buying intensity, buyer, etc.) to identify subgroups within the larger population who share similar pattern on a set of variables. Cluster analysis (CA) is used as a multivariate technique in statistics. The classification could facilitate investors to make their decisions on where to invest to gain more profit while satisfying the needs of customers. Further, this will help people to which part of the country is suitable to settle based on the CoL. Moreover, such grouping will provide some of the required information to policy makers when planning the infrastructure development in the country. Ruhuna Journal of Science 84 Vol 11 (1): 83-97, June 2020 S. R. Gnanapragasam Ranking and grouping of the districts of Sri Lanka The main aim of this study is to classify the 25 districts based on the expenses on major essential items in the households of Sri Lanka, through two objectives: firstly, to order the districts based on total CoL of a household in Sri Lanka using Principal Component Analysis (PCA), and secondly, to group the districts based on the expenses of a household in Sri Lanka on essential items using the Cluster Analysis (CA). 2 Materials and Methods 2.1 Data and variables The data for this study have been extracted from the most recent survey report of Household Income and Expenditure released by the Department of Census and Statistics of the Ministry of National Policies and Economic Affairs, Sri Lanka in January 2018. It contains data for all 25 districts in Sri Lanka considering two major categories as food items and non-food items in the survey. In food items category, 15 items have been considered whereas 13 items are taken into account under non-food category. Altogether data for 28 items, as variables, in the households in all 25 districts of Sri Lanka, as observations, are considered. In this study, the food items are taken as X variables and non-food items are taken as Y variables to handle the statistical software conveniently (Table 1). Table 1. List of variables used in estimating cost of living in districts of Sri Lanka. Food items Non- food items Variable Name Variable Name X1 Cereals Y1 Housing X2 Prepared food Y2 Fuel & Light X3 Pulses Y3 Personal care & Health expenses X4 Vegetables Y4 Transport X5 Meat Y5 Communication X6 Fish Y6 Education X7 Dried fish Y7 Cultural activities and entertainments X8 Eggs Y8 Household non-durable goods & services X9 Coconuts Y9 Clothing textiles & Footwear X10 Condiments Y10 Household durable goods X11 Milk & milk food Y11 Other miscellaneous expenses X12 Fats & oil Y12 Other adhoc (rarely) expenses X13 Sugar, Jaggery & Treacle Y13 Liquor, Drugs & Tobacco X14 Fruits X15 Other food items Ruhuna Journal of Science 85 Vol 11 (1): 83-97, June 2020 S. R. Gnanapragasam Ranking and grouping of the districts of Sri Lanka 2.2 Data analysis Descriptive statistics The descriptive statistics such as mean, standard deviation, minimum, maximum and total of all the items are summarized. District-wise average expenditures on food and non-food items are compared and, average expenditures on all 28 items for each district are reported. Correlation analysis The associations among the variables on food and non-food items, and the relationship between these items in both categories in terms of correlations were analysed. The p- value of Pearson’s correlation is obtained to observe the strength of the relationship among those 28 items, 5% level of significant is considered. Principle Component Analysis (PCA) Using the principal component as an index requires the determination of principal component scores. The principal component scores were obtained by substituting the observed values of the standardized variables into the following equation of the ith principal component (PCi). ′ 푃퐶 = 푒 푍 (1) ′ 푡ℎ where, 푒 =Transpose of 푖 eigenvector and Z= Matrix form of vectors of standardized variables. In PCA, the first few components (usually the first and the second or third) explain the greater percentage of the variance of original data. Therefore, in this study, only the first principle component score is used to rank the districts based on the total CoL.