
Basic Econometrics in Transportation

Amir Samimi

Civil Engineering Department, Sharif University of Technology

Primary Source: Basic Econometrics (Gujarati)

Outline

• What is the nature of heteroscedasticity?
• What are its consequences?
• How does one detect it?
• What are the remedial measures?

Nature of Heteroscedasticity

• An important assumption in CLRM is that $E(u_i^2) = \sigma^2$.
• This is the assumption of equal (homo) spread (scedasticity).
• Example: higher-income families on average save more than lower-income families, but there is also more variability in their savings.

Possible Reasons

1. As people learn, their errors of behavior become smaller over time.
  o As the number of hours of typing practice increases, the average number of typing errors as well as their variance decreases.
2. As incomes grow, people have more choices about the disposition of their income.
  o Rich people have more choices about their savings behavior.
3. As data collecting techniques improve, $\sigma_i^2$ is likely to decrease.
  o Banks that have sophisticated data processing equipment are likely to commit fewer errors.

Possible Reasons (continued)

4. Heteroscedasticity can arise when there are outliers.
  o An outlier is an observation that is much different from the other observations in the sample.
5. Heteroscedasticity arises when the model is not correctly specified.
  o Very often what looks like heteroscedasticity may be due to the fact that some important variables are omitted from the model.
6. Skewness in the distribution of a regressor is another source.
  o Distribution of income and wealth in most societies is uneven, with the bulk of the income and wealth being owned by a few at the top.
7. Other sources of heteroscedasticity:
  o Incorrect data transformation (ratio or first difference transformations).
  o Incorrect functional form (linear versus log–linear models).

Cross-sectional and Time Series Data

• Heteroscedasticity is likely to be more common in cross-sectional than in time series data.
• In cross-sectional data, one usually deals with members of a population at a given point in time. These members may be of different sizes, income, etc.
• In time series data, the variables tend to be of similar orders of magnitude because one generally collects the data for the same entity over a period of time.

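OLS Estimation with Heteroscedasticity

• OLS estimators and their variances when $E(u_i^2) = \sigma_i^2$ (with $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$):
  $\hat{\beta}_2 = \dfrac{\sum x_i y_i}{\sum x_i^2}$
  $\mathrm{var}(\hat{\beta}_2) = \dfrac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}$
• Is it still BLUE when we drop only the homoscedasticity assumption?
• We can easily prove that it is still linear and unbiased.
• We can also show that it is a consistent estimator.
• It is no longer best, and the minimum variance is not given by the equation above.
• What is BLUE in the presence of heteroscedasticity?

Method of Generalized Least Squares

• Ideally, we would like to give less weight to the observations coming from populations with greater variability.
• Consider: $Y_i = \beta_1 + \beta_2 X_i + u_i = \beta_1 X_{0i} + \beta_2 X_i + u_i$, where $X_{0i} = 1$ for every $i$.
• Assume the heteroscedastic variances $\sigma_i^2$ are known and divide through by $\sigma_i$:
  $\dfrac{Y_i}{\sigma_i} = \beta_1 \dfrac{X_{0i}}{\sigma_i} + \beta_2 \dfrac{X_i}{\sigma_i} + \dfrac{u_i}{\sigma_i}$, that is, $Y_i^* = \beta_1^* X_{0i}^* + \beta_2^* X_i^* + u_i^*$
• Variance of the transformed disturbance term is now homoscedastic:
  $\mathrm{var}(u_i^*) = E\left(\dfrac{u_i}{\sigma_i}\right)^2 = \dfrac{1}{\sigma_i^2} E(u_i^2) = 1$
• Apply OLS to the transformed model and get BLUE estimators.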

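GLS Estimators

• Minimize, with weights $w_i = 1/\sigma_i^2$:
  $\sum w_i \hat{u}_i^2 = \sum w_i \left(Y_i - \hat{\beta}_1^* X_{0i} - \hat{\beta}_2^* X_i\right)^2$
• Following the standard calculus techniques, we have:
  $\hat{\beta}_2^* = \dfrac{\left(\sum w_i\right)\left(\sum w_i X_i Y_i\right) - \left(\sum w_i X_i\right)\left(\sum w_i Y_i\right)}{\left(\sum w_i\right)\left(\sum w_i X_i^2\right) - \left(\sum w_i X_i\right)^2}$
  $\mathrm{var}(\hat{\beta}_2^*) = \dfrac{\sum w_i}{\left(\sum w_i\right)\left(\sum w_i X_i^2\right) - \left(\sum w_i X_i\right)^2}$

Consequences of Using OLS

• The usual OLS estimator of the variance is a biased estimator.
  o It overestimates or underestimates, on average.
  o We cannot tell whether the bias is positive or negative.
  o We can no longer rely on confidence intervals or on t and F tests.
• If we persist in using the usual testing procedures despite heteroscedasticity, whatever conclusions we draw may be very misleading.
• Heteroscedasticity is potentially a serious problem and the researcher needs to know whether it is present in a given situation.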
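The weighted least squares formulas on the GLS Estimators slide can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function name `wls_two_variable`, the simulated data, and the choice of error standard deviation proportional to X are hypothetical and do not come from the lecture.

```python
import random


def wls_two_variable(x, y, sigma2):
    """Weighted least squares for Y = b1 + b2*X with known error variances.

    Uses weights w_i = 1 / sigma_i^2 and returns (b1, b2, var_b2).
    """
    w = [1.0 / s2 for s2 in sigma2]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    denom = sw * swxx - swx ** 2
    b2 = (sw * swxy - swx * swy) / denom   # weighted slope
    b1 = (swy - b2 * swx) / sw             # weighted intercept
    var_b2 = sw / denom                    # variance of the weighted slope
    return b1, b2, var_b2


# Hypothetical data: error standard deviation assumed proportional to X.
rng = random.Random(0)
x = [float(i) for i in range(1, 51)]
sigma2 = [(0.5 * xi) ** 2 for xi in x]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, s2 ** 0.5) for xi, s2 in zip(x, sigma2)]

b1, b2, var_b2 = wls_two_variable(x, y, sigma2)
print(b1, b2, var_b2)
```

With equal variances the weights are constant and the function reduces to ordinary least squares, which is a quick way to sanity-check it.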

Detection

• There are no hard-and-fast rules for detecting heteroscedasticity, only a few rules of thumb.
  o This is inevitable because $\sigma_i^2$ can be known only if we have the entire Y population corresponding to the chosen X's.
  o More often than not, there is only one sample Y value corresponding to a particular value of X, and there is no way one can know $\sigma_i^2$ from just one Y observation.
  o Thus, heteroscedasticity may be a matter of intuition, educated guesswork, or prior empirical experience.
• Most of the detection methods are based on examination of OLS residuals.
  o Those are the ones we observe, and not $u_i$. We hope they are good estimates.
  o This hope may be fulfilled if the sample size is fairly large.

Informal Methods

• Nature of the Problem
  o The nature of the problem may suggest whether heteroscedasticity is likely to be encountered.
  o Residual variance around the regression of consumption on income increases with income.
• Graphical Method
  o Estimated $\hat{u}_i^2$ are plotted against estimated $\hat{Y}_i$.
  o Is the estimated mean value of Y systematically related to the squared residual?
  o a) No systematic pattern: perhaps no heteroscedasticity.
  o b-e) Definite pattern: perhaps no homoscedasticity.
  o Using such knowledge, one may transform the data to alleviate the problem.

Formal Methods

• Park Test
  o Park formalizes the graphical method by suggesting a log-linear model:
    $\ln \sigma_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i$
  o Since $\sigma_i^2$ is generally unknown, Park suggests using $\hat{u}_i^2$ as a proxy:
    $\ln \hat{u}_i^2 = \alpha + \beta \ln X_i + v_i$
  o If β turns out to be insignificant, the homoscedasticity assumption may be accepted.
  o The particular functional form chosen by Park is only suggestive.
  o Note: the error term $v_i$ may not satisfy the OLS assumptions.

Formal Methods

• Glejser Test
  o Glejser suggests regressing the absolute value of the estimated error term on the X variable.
  o The following functional forms are suggested:
    $|\hat{u}_i| = \beta_1 + \beta_2 X_i + v_i$
    $|\hat{u}_i| = \beta_1 + \beta_2 \sqrt{X_i} + v_i$
    $|\hat{u}_i| = \beta_1 + \beta_2 \dfrac{1}{X_i} + v_i$
    $|\hat{u}_i| = \beta_1 + \beta_2 \dfrac{1}{\sqrt{X_i}} + v_i$
    $|\hat{u}_i| = \sqrt{\beta_1 + \beta_2 X_i} + v_i$
    $|\hat{u}_i| = \sqrt{\beta_1 + \beta_2 X_i^2} + v_i$
  o For large samples the first four give generally satisfactory results.
  o The last two models are nonlinear in the parameters.
  o Note: some have argued that $v_i$ does not have a zero expected value, that it is serially correlated, and that it is heteroscedastic.
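A minimal sketch of the Park test for the two-variable model follows. It runs OLS, then regresses the log of the squared residuals on the log of X and reports the slope and its t ratio. The helper names `ols_two_variable` and `park_test` and the simulated data are assumptions made for illustration. The sketch requires positive X values and nonzero residuals so that the logarithms exist.

```python
import math
import random


def ols_two_variable(x, y):
    """Return (intercept, slope, residuals, slope standard error)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)   # usual OLS variance estimate
    se_slope = math.sqrt(s2 / sxx)
    return intercept, slope, resid, se_slope


def park_test(x, y):
    """Regress ln(u_hat^2) on ln(X); return (beta, t ratio of beta)."""
    _, _, resid, _ = ols_two_variable(x, y)
    ln_x = [math.log(xi) for xi in x]
    ln_u2 = [math.log(e * e) for e in resid]
    _, beta, _, se_beta = ols_two_variable(ln_x, ln_u2)
    return beta, beta / se_beta


# Hypothetical data: error standard deviation assumed proportional to X.
rng = random.Random(1)
x = [float(i) for i in range(1, 61)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in x]

beta, t_beta = park_test(x, y)
print(beta, t_beta)   # compare |t| with the critical t value for n - 2 df
```

The Glejser test can be sketched the same way by replacing `ln_u2` with the absolute residuals and `ln_x` with the chosen function of X.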

Formal Methods

• Spearman's Rank Correlation Test
  o Fit the regression to the data on Y and X and estimate the residuals.
  o Rank both the absolute value of the residuals and $X_i$ (or estimated $\hat{Y}_i$) and compute Spearman's rank correlation coefficient:
    $r_s = 1 - 6\left[\dfrac{\sum d_i^2}{n(n^2 - 1)}\right]$
    where $d_i$ = difference in the ranks for the ith observation.
  o Assuming that the population rank correlation coefficient is zero and n > 8, the significance of the sample $r_s$ can be tested by the t test, with df = n − 2:
    $t = \dfrac{r_s \sqrt{n - 2}}{\sqrt{1 - r_s^2}}$
  o If the computed t value exceeds the critical t value, we may accept the hypothesis of heteroscedasticity.

Formal Methods

• Goldfeld-Quandt Test
  o Rank the observations according to $X_i$ values.
  o Omit c central observations, and divide the remaining observations into two groups each of (n − c) / 2 observations.
  o Fit separate OLS regressions to the first and last set of observations, and obtain the residual sums of squares RSS1 and RSS2.
  o Compute the ratio, where k is the number of estimated parameters:
    $\lambda = \dfrac{\mathrm{RSS}_2 / \mathrm{df}}{\mathrm{RSS}_1 / \mathrm{df}}$, with $\mathrm{df} = \dfrac{n - c - 2k}{2}$
  o If $u_i$ are assumed to be normally distributed, and if the assumption of homoscedasticity is valid, then it can be shown that λ follows the F distribution with (n − c − 2k) / 2 df in both the numerator and the denominator.
  o The ability of the test depends on how c is chosen.
  o Goldfeld and Quandt suggest that c = 8 if n = 30, c = 16 if n = 60.
  o Judge et al. note that c = 4 if n = 30 and c = 10 if n is about 60.
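The Goldfeld-Quandt steps above can be sketched as follows for the two-variable model (k = 2). The function names and the simulated data are hypothetical, and the sketch returns only the ratio λ and its degrees of freedom, leaving the comparison with the critical F value to a table or a statistics package.

```python
import random


def rss_two_variable(x, y):
    """Residual sum of squares from an OLS fit of Y on a constant and X."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def goldfeld_quandt(x, y, c):
    """Return (lambda, df) after omitting c central observations."""
    n = len(x)
    if (n - c) % 2 != 0:
        raise ValueError("n - c must be even so both groups have equal size")
    pairs = sorted(zip(x, y))      # rank observations by X (ties broken by Y)
    m = (n - c) // 2               # observations in each group
    low, high = pairs[:m], pairs[-m:]
    rss1 = rss_two_variable([p[0] for p in low], [p[1] for p in low])
    rss2 = rss_two_variable([p[0] for p in high], [p[1] for p in high])
    k = 2                          # intercept and slope
    df = m - k                     # equals (n - c - 2k) / 2
    lam = (rss2 / df) / (rss1 / df)
    return lam, df


# Hypothetical data with n = 30 and c = 4 (the value noted by Judge et al.).
rng = random.Random(2)
x = [float(i) for i in range(1, 31)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in x]

lam, df = goldfeld_quandt(x, y, c=4)
print(lam, df)   # compare lam with the critical F value for (df, df)
```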

Formal Methods

• Breusch–Pagan–Godfrey Test
  o Success of the GQ test depends on c and on the X variable with which observations are ordered.
  o Estimate $Y_i = \beta_1 + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + u_i$ by OLS and obtain the residuals.
  o Obtain $\tilde{\sigma}^2 = \sum \hat{u}_i^2 / n$ (the ML estimator of $\sigma^2$).
  o Construct variables $p_i$ defined as $p_i = \hat{u}_i^2 / \tilde{\sigma}^2$.
  o Regress $p_i$ on the Z's as $p_i = \alpha_1 + \alpha_2 Z_{2i} + \cdots + \alpha_m Z_{mi} + v_i$
    - $\sigma_i^2$ is assumed to be a linear function of the Z's.
    - Some or all of the X's can serve as Z's.
  o Obtain the ESS (explained sum of squares) and define $\Theta = 0.5\,\mathrm{ESS}$.
  o Assuming $u_i$ are normally distributed, one can show that if there is homoscedasticity and if the sample size n increases indefinitely, then $\Theta \sim \chi^2_{m-1}$.
  o The BPG test is an asymptotic, or large-sample, test.

Formal Methods

• White's General Heteroscedasticity Test
  o Does not rely on the normality assumption and is easy to implement.
  o Estimate $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$ and obtain the residuals.
  o Run the following auxiliary regression:
    $\hat{u}_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + v_i$
    Higher powers of regressors can also be introduced.
  o Under the null hypothesis (homoscedasticity), if the sample size n increases indefinitely, it can be shown that $nR^2 \sim \chi^2$ (df = number of regressors in the auxiliary regression, excluding the constant).
  o If the chi-square value exceeds the critical value, the conclusion is that there is heteroscedasticity.
  o If it does not, $\alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = \alpha_6 = 0$.
  o It has been argued that if cross-product terms are present, then it is a test of both heteroscedasticity and specification bias.
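The Breusch–Pagan–Godfrey steps listed above can be sketched for the simplest case, a two-variable model in which the single regressor X also serves as the only Z (so m = 2 and Θ is compared with a chi-square with 1 df). The function names and the simulated data are illustrative assumptions. The sketch divides by the residual variance, so it does not handle a perfect fit.

```python
import random


def fit_line(x, y):
    """OLS intercept and slope for Y on a constant and X."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope


def bpg_statistic(x, y, z=None):
    """Return Theta = ESS / 2 from the regression of p_i on a single Z."""
    if z is None:
        z = x                                  # let X serve as Z
    n = len(x)
    a, b = fit_line(x, y)
    resid2 = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    sigma2_ml = sum(resid2) / n                # ML estimator of sigma^2
    p = [r2 / sigma2_ml for r2 in resid2]
    a1, a2 = fit_line(z, p)
    pbar = sum(p) / n
    ess = sum((a1 + a2 * zi - pbar) ** 2 for zi in z)
    return 0.5 * ess


# Hypothetical data: error standard deviation assumed proportional to X.
rng = random.Random(3)
x = [float(i) for i in range(1, 61)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in x]

theta = bpg_statistic(x, y)
print(theta)   # compare with 3.841, the 5 percent chi-square value for 1 df
```

White's test follows the same pattern, except that the auxiliary regression uses the squared residuals directly, adds squares and cross products of the regressors, and uses n times its R² as the statistic.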

Remedial Measures

• Heteroscedasticity does not destroy unbiasedness and consistency.
• But OLS estimators are no longer efficient, not even asymptotically.
• There are two approaches to remediation:
  o When $\sigma_i^2$ is known, and
  o When $\sigma_i^2$ is not known.

Remedial Measures

• When $\sigma_i^2$ is known:
  o The most straightforward method of correcting heteroscedasticity is by means of weighted least squares (WLS).
  o The WLS method provides BLUE estimators.
• When $\sigma_i^2$ is unknown:
  o Is there a way of obtaining consistent estimates of the variances and covariances of OLS estimators even if there is heteroscedasticity? The answer is yes.

White's Correction

• White has suggested a procedure by which asymptotically valid statistical inferences can be made about the true parameter values.
• Several computer packages present White's heteroscedasticity-corrected variances and standard errors along with the usual OLS variances and standard errors.
• White's heteroscedasticity-corrected standard errors are also known as robust standard errors.

White's Procedure

• For a 2-variable regression model $Y_i = \beta_1 + \beta_2 X_{2i} + u_i$ we showed:
  $\mathrm{var}(\hat{\beta}_2) = \dfrac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}$
• White has shown that $\dfrac{\sum x_i^2 \hat{u}_i^2}{\left(\sum x_i^2\right)^2}$ is a consistent estimator of $\mathrm{var}(\hat{\beta}_2)$.
• For $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \cdots + \beta_k X_{ki} + u_i$ we have:
  $\mathrm{var}(\hat{\beta}_j) = \dfrac{\sum \hat{w}_{ji}^2 \hat{u}_i^2}{\left(\sum \hat{w}_{ji}^2\right)^2}$
  o $\hat{u}_i$ are the residuals obtained from the original regression.
  o $\hat{w}_j$ are the residuals obtained from the auxiliary regression of the regressor $X_j$ on the remaining regressors.
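For the two-variable model, White's estimator of the slope variance can be computed directly from the OLS residuals. The sketch below places it next to the usual OLS standard error for comparison. The function name `white_robust_se` and the simulated data are assumptions for illustration, not part of the lecture.

```python
import math
import random


def white_robust_se(x, y):
    """Return (slope, usual OLS standard error, White robust standard error)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dev = [xi - xbar for xi in x]                      # x_i in deviation form
    sxx = sum(d * d for d in dev)
    slope = sum(d * (yi - ybar) for d, yi in zip(dev, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Usual OLS variance: sigma_hat^2 / sum(x_i^2)
    var_ols = (sum(e * e for e in resid) / (n - 2)) / sxx
    # White's variance: sum(x_i^2 * u_hat_i^2) / (sum(x_i^2))^2
    var_white = sum(d * d * e * e for d, e in zip(dev, resid)) / sxx ** 2
    return slope, math.sqrt(var_ols), math.sqrt(var_white)


# Hypothetical data: error standard deviation assumed proportional to X.
rng = random.Random(4)
x = [float(i) for i in range(1, 61)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in x]

slope, se_ols, se_white = white_robust_se(x, y)
print(slope, se_ols, se_white)
```

A noticeable gap between the two standard errors is an informal signal that heteroscedasticity may be present.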

Example

• Y = per capita expenditure on public schools by state in 1979
• Income = per capita income by state in 1979
• On the basis of the OLS standard errors, both the regressors are statistically significant at the 5 percent level, whereas on the basis of White estimators they are not.
• Since robust standard errors are now available in established regression packages, it is recommended to report them.
• The WHITE option can be used to compare the output with regular OLS output as a check for heteroscedasticity.

Reasonable Heteroscedasticity Patterns

• Apart from being a large-sample procedure, one drawback of the White procedure is that the estimators thus obtained may not be so efficient as those obtained by methods that transform data to reflect specific types of heteroscedasticity.
• We may consider several assumptions about the pattern of heteroscedasticity.

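Reasonable Heteroscedasticity Patterns

• Assumption 1: if $E(u_i^2) = \sigma^2 X_i^2$, divide the model by $X_i$:
  $\dfrac{Y_i}{X_i} = \dfrac{\beta_1}{X_i} + \beta_2 + \dfrac{u_i}{X_i}$
• Assumption 2: if $E(u_i^2) = \sigma^2 X_i$, divide the model by $\sqrt{X_i}$:
  $\dfrac{Y_i}{\sqrt{X_i}} = \dfrac{\beta_1}{\sqrt{X_i}} + \beta_2 \sqrt{X_i} + \dfrac{u_i}{\sqrt{X_i}}$
• Assumption 3: if $E(u_i^2) = \sigma^2 [E(Y_i)]^2$, divide the model by $E(Y_i)$, using the estimated $\hat{Y}_i$ in practice:
  $\dfrac{Y_i}{\hat{Y}_i} = \dfrac{\beta_1}{\hat{Y}_i} + \beta_2 \dfrac{X_i}{\hat{Y}_i} + \dfrac{u_i}{\hat{Y}_i}$
• Assumption 4: a log transformation such as $\ln Y_i = \beta_1 + \beta_2 \ln X_i + u_i$ very often reduces heteroscedasticity.

Homework 5

Basic Econometrics (Gujarati, 2003)

1. Chapter 11, Problem 15 [50 points]
2. Chapter 11, Problem 16 [50 points]

Assignment weight factor = 0.5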