Business Economics Paper No. : 8, Fundamentals of Econometrics Module No. : 15, Heteroscedasticity Detection

Subject: Business Economics
Paper: 8, Fundamentals of Econometrics
Module No. and Title: 15, Heteroscedasticity: Detection
Module Tag: BSE_P8_M15

TABLE OF CONTENTS
1. Learning Outcomes
2. Introduction
3. Different diagnostic tools to identify the problem of heteroscedasticity
4. Informal methods to identify the problem of heteroscedasticity
   4.1 Checking nature of the problem
   4.2 Graphical inspection of residuals
5. Formal methods to identify the problem of heteroscedasticity
   5.1 Park test
   5.2 Glejser test
   5.3 White's test
   5.4 Spearman's rank correlation test
   5.5 Goldfeld-Quandt test
   5.6 Breusch-Pagan test
6. Summary

1. Learning Outcomes

After studying this module, you shall be able to understand:
- Different diagnostic tools to detect the problem of heteroscedasticity
- Informal methods to identify the problem of heteroscedasticity
- Formal methods to identify the problem of heteroscedasticity

2. Introduction

In the previous module we saw that heteroscedasticity is a violation of one of the assumptions of the classical linear regression model. It occurs when the disturbance terms $u_i$ do not share a constant conditional variance across observations. It may result from specification errors or from data issues. For instance, in regression models with cross-sectional data, especially where the scale of the dependent variable varies across observations, heteroscedasticity is more likely to occur. It is also observed in highly volatile time series data.
In the presence of heteroscedasticity, the ordinary least squares (OLS) estimators and the forecasts based on them are still unbiased and consistent, but they are no longer BLUE (best linear unbiased estimators). The estimated variances and covariances of the estimators are biased and inconsistent, so hypothesis testing procedures and statistical inference are no longer valid. Therefore, if we continue to use the OLS method to estimate parameters and to test hypotheses on data suffering from heteroscedasticity, we are likely to reach misleading conclusions. This makes it necessary to have diagnostic tools that check for the presence of heteroscedasticity.

3. Diagnostic Tools to Identify the Problem of Heteroscedasticity

Informal methods:
- Checking nature of the problem
- Graphical inspection of residuals

Formal methods:
- Park test
- Glejser test
- Spearman's rank correlation test
- Goldfeld-Quandt test
- Breusch-Pagan test
- White's test

4. Informal Methods to Identify the Problem of Heteroscedasticity

4.1 Checking Nature of the Problem

Considering the nature of the problem is one of the simplest ways to detect the presence of heteroscedasticity. For instance, if we take cross-sectional data on households' consumption patterns and income levels in a locality, we are likely to find that the residual variance differs across observations. This is because cross-sectional data pool small-income, medium-income and large-income households together for the study. Thus, the possibility of heteroscedasticity is higher in the case of cross-sectional data.
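The household example in Section 4.1 can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are synthetic, the coefficients (5 and 0.8) and the rule that the error standard deviation is 10% of income are hypothetical choices, and `ols_simple` is a helper written for this sketch. It fits consumption on income by OLS and compares the residual variance across low-, middle- and high-income groups.

```python
import random
import statistics


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical households: income in thousands of rupees.
income = [rng.uniform(10, 100) for _ in range(600)]
# Assumed data-generating process: error s.d. is proportional to income.
consumption = [5 + 0.8 * inc + rng.gauss(0, 0.1 * inc) for inc in income]

a, b = ols_simple(income, consumption)
resid = [c - (a + b * inc) for inc, c in zip(income, consumption)]

# Order residuals by income and split them into three equal-sized groups.
ordered = [r for _, r in sorted(zip(income, resid))]
third = len(ordered) // 3
groups = {
    "low": ordered[:third],
    "middle": ordered[third:2 * third],
    "high": ordered[2 * third:],
}
group_var = {name: statistics.pvariance(g) for name, g in groups.items()}

for name, v in group_var.items():
    print(f"{name:>6} income: residual variance = {v:.2f}")
```

If the residual variance rises steadily from the low-income to the high-income group, as it should under the assumed process, pooling these households in one regression is likely to produce heteroscedastic errors.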
4.2 Graphical Inspection of Residuals

Before checking for heteroscedasticity by formal methods, a graphical examination of the residuals obtained by regressing the dependent variable on the explanatory variables can give a useful first idea of whether heteroscedasticity is present. We do this by creating a residual plot in which the squared residuals, $\hat{u}_i^2$, are on the y axis and are plotted against one or more explanatory variables or against the fitted values $\hat{Y}_i$.

Figure 1 shows different patterns of squared residuals, $\hat{u}_i^2$, plotted against the explanatory variable X. Fig (a) shows no systematic pattern between the squared residuals and the explanatory variable; thus, there is no heteroscedasticity problem. However, fig (b) to (e) show a systematic pattern between the squared residuals and the explanatory variable. For instance, fig (b) shows a linear relationship between the two. Similarly, fig (d) and (e) show a quadratic relationship between the two. Thus, fig (b) to (e) depict the possibility of heteroscedasticity.

If we have a multiple regression model with more than one explanatory variable, then instead of plotting the squared residuals against each explanatory variable we can simply plot them against $\hat{Y}_i$, the estimated Y. Since $\hat{Y}_i$ is a linear combination of all the explanatory variables, we get the same kinds of graph as above when we plot the squared residuals against $\hat{Y}_i$. This is shown in figure 2.

However, these graphical plots are only an indication of the problem of heteroscedasticity. We need formal methods to confirm the claim that heteroscedasticity is present.

[Figure 1: squared residuals $\hat{u}_i^2$ plotted against X, panels (a) to (e)]

[Figure 2: squared residuals $\hat{u}_i^2$ plotted against $\hat{Y}_i$]
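The plot data for the multiple-regression case described in Section 4.2 can be prepared as in the following sketch. It is a minimal illustration, not a prescribed procedure: the data are synthetic, the coefficients and the error rule (standard deviation equal to 0.3 times the sum of the two regressors) are hypothetical, and `solve`, `ols` and `pearson` are small helpers written for this sketch. It produces the pairs $(\hat{Y}_i, \hat{u}_i^2)$ that would go on the axes of a plot like figure 2, plus a simple correlation as a rough numerical companion to the picture.

```python
import random


def solve(a, b):
    """Solve the linear system a.x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """OLS coefficients from the normal equations (first column of rows is 1)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def pearson(u, v):
    """Sample Pearson correlation between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    ssu = sum((p - mu) ** 2 for p in u)
    ssv = sum((q - mv) ** 2 for q in v)
    return cov / (ssu * ssv) ** 0.5


rng = random.Random(1)  # fixed seed so the run is reproducible
rows, y = [], []
for _ in range(500):
    x2, x3 = rng.uniform(1, 10), rng.uniform(1, 10)
    rows.append([1.0, x2, x3])
    # Assumed process: error s.d. grows with the size of the regressors.
    y.append(2 + 1.5 * x2 + 0.5 * x3 + rng.gauss(0, 0.3 * (x2 + x3)))

beta = ols(rows, y)
fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
sq_resid = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]

# Pairs (fitted value, squared residual), ordered along the x axis of the plot.
plot_points = sorted(zip(fitted, sq_resid))
print("correlation of squared residuals with fitted Y:",
      round(pearson(fitted, sq_resid), 3))
```

The list `plot_points` can be handed to any plotting tool. A clearly positive correlation only hints at a pattern like fig (b); as the text notes, such evidence is informal.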
5. Formal Methods to Identify the Problem of Heteroscedasticity

5.1 Park Test

The Park test is based on the assumption that the heteroscedastic variance, $\sigma_i^2$, is some function of the explanatory variable $X_i$. Therefore, to regress $\sigma_i^2$ on $X_i$, the following functional form is adopted (see note 1):

$\ln \sigma_i^2 = \alpha + \beta \ln X_i + v_i, \quad i = 1, 2, \ldots, n$   [1]

Here, $\sigma_i^2$ is the population error variance. However, we cannot run this regression because the population error variance is unknown. We therefore use the squared residual $\hat{u}_i^2$ as a proxy for $\sigma_i^2$ and proceed by the following steps:

1. Run the original OLS regression and obtain the residuals, $\hat{u}_i$.
2. Square the residuals, take their logs, and run the following regression:

   $\ln \hat{u}_i^2 = \alpha + \beta \ln X_i + v_i, \quad i = 1, 2, \ldots, n$   [2]

3. If there is more than one explanatory variable, run the above regression for each explanatory variable (see note 2).
4. If $\beta$ turns out to be statistically significant, we have a problem of heteroscedasticity and need to correct it. If $\beta$ turns out to be insignificant, the variance can be treated as homoscedastic, and by equation [1] the intercept $\alpha$ can then be interpreted as the log of the common variance, $\ln \sigma^2$.

EXAMPLE 1

The following hypothetical example will help in understanding this test. Let us take data on wages (per hour, in rupees), education (years of schooling) and experience (years on the job) for 523 workers of an industry. Regressing wages, the dependent variable, on education and experience as explanatory variables, we get the following results:

$Wage_i = -4.524472 + 0.913018\,Edu_i + 0.096810\,Exp_i$
se = (1.239348)  (0.082190)  (0.017719)
t  = (-3.650687) (11.10868)  (5.463513)
p  = (0.003)     (0.0000)    (0.0000)
$R^2$ = 0.194953

The results show a positive relationship between wages and education, and also between wages and experience. The estimated coefficients of education and experience are statistically significant, as shown by their t values of about 11 and 5 respectively.
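Steps 1 and 2 of the Park test described above can be sketched for a single positive regressor as follows. This is an illustration under assumptions rather than a reference implementation: the data are synthetic (not the wage data of the example), the error standard deviation is assumed proportional to X so that the true slope in equation [2] is about 2, and `simple_ols` and `park_test` are helper names introduced here.

```python
import math
import random
import statistics


def simple_ols(x, y):
    """Return (intercept, slope, se of slope, t ratio of slope) for y on x."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, se_slope, slope / se_slope


def park_test(x, y):
    """Park test for one regressor; x must be strictly positive."""
    # Step 1: original regression and its residuals.
    a, b, _, _ = simple_ols(x, y)
    # Step 2: regress ln(u_hat^2) on ln(X), as in equation [2].
    # (A residual of exactly zero would make the log undefined.)
    log_u2 = [math.log((yi - a - b * xi) ** 2) for xi, yi in zip(x, y)]
    log_x = [math.log(xi) for xi in x]
    return simple_ols(log_x, log_u2)


rng = random.Random(42)  # fixed seed so the run is reproducible
x = [rng.uniform(1, 10) for _ in range(500)]
# Assumed process: error s.d. equals X, so the variance is proportional to X^2.
y = [3 + 2 * xi + rng.gauss(0, xi) for xi in x]

alpha, beta, se_beta, t_beta = park_test(x, y)
print(f"beta = {beta:.3f}, se = {se_beta:.3f}, t = {t_beta:.2f}")
```

A t ratio for $\beta$ well above 2 in absolute value points to heteroscedasticity (step 4). With several regressors, the same function would be called once per regressor, as step 3 describes.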
However, a problem arises because these are cross-sectional data, i.e. a sample of 523 workers with diverse backgrounds observed at a single point in time. Thus, the possibility of heteroscedasticity is higher here. When we plot the squared residuals against each explanatory variable (education and experience) or against the estimated value of wage, we get considerable variability in the plot, as depicted earlier in figures 1 and 2 [fig (b) to (e)].

A formal check for heteroscedasticity can now be done using the Park test. Here we regress the log of the squared residuals on the log of the estimated value of wage and get the following results:

$\ln \hat{u}_i^2 = -10.35965 + 3.467020\,\ln \widehat{Wage}_i$
se = (11.79490)  (1.255228)
t  = (-0.878316) (2.762063)
p  = (0.3802)    (0.0059)
$R^2$ = 0.014432

The coefficient of the estimated value of wage is statistically significant, as it has a very small p value. Thus, the Park test suggests the presence of heteroscedasticity.

NOTE OF CAUTION: The error term $v_i$ in equation [2] may itself not be homoscedastic. In that case we are back to the same problem.

Notes to Section 5.1

Note 1. The functional form adopted is just for simplicity. One can choose some other functional form, and the results will then be different. For instance, if the values of the explanatory variable are negative, one should not take the log of $X_i$ but simply regress $\ln \hat{u}_i^2$ on $X_i$.

Note 2. The other method is to regress $\ln \hat{u}_i^2$ on $\hat{Y}_i$ (the estimated Y).

5.2 Glejser Test

This test suggests that instead of taking the square of the residuals, we take the absolute value of the estimated residuals, $|\hat{u}_i|$, and regress it on the explanatory variable, X. He
