Section 8 Heteroskedasticity
Total Page:16
File Type:pdf, Size:1020Kb
Load more
										Recommended publications
									
								- 
												  Hypothesis Testing and Likelihood Ratio TestsHypottthesiiis tttestttiiing and llliiikellliiihood ratttiiio tttesttts Y We will adopt the following model for observed data. The distribution of Y = (Y1, ..., Yn) is parameter considered known except for some paramett er ç, which may be a vector ç = (ç1, ..., çk); ç“Ç, the paramettter space. The parameter space will usually be an open set. If Y is a continuous random variable, its probabiiillliiittty densiiittty functttiiion (pdf) will de denoted f(yy;ç) . If Y is y probability mass function y Y y discrete then f(yy;ç) represents the probabii ll ii tt y mass functt ii on (pmf); f(yy;ç) = Pç(YY=yy). A stttatttiiistttiiicalll hypottthesiiis is a statement about the value of ç. We are interested in testing the null hypothesis H0: ç“Ç0 versus the alternative hypothesis H1: ç“Ç1. Where Ç0 and Ç1 ¶ Ç. hypothesis test Naturally Ç0 § Ç1 = ∅, but we need not have Ç0 ∞ Ç1 = Ç. A hypott hesii s tt estt is a procedure critical region for deciding between H0 and H1 based on the sample data. It is equivalent to a crii tt ii call regii on: a critical region is a set C ¶ Rn y such that if y = (y1, ..., yn) “ C, H0 is rejected. Typically C is expressed in terms of the value of some tttesttt stttatttiiistttiiic, a function of the sample data. For µ example, we might have C = {(y , ..., y ): y – 0 ≥ 3.324}. The number 3.324 here is called a 1 n s/ n µ criiitttiiicalll valllue of the test statistic Y – 0 . S/ n If y“C but ç“Ç 0, we have committed a Type I error.
- 
												  Lecture 12 Robust EstimationLecture 12 Robust Estimation Prof. Dr. Svetlozar Rachev Institute for Statistics and Mathematical Economics University of Karlsruhe Financial Econometrics, Summer Semester 2007 Prof. Dr. Svetlozar Rachev Institute for Statistics and MathematicalLecture Economics 12 Robust University Estimation of Karlsruhe Copyright These lecture-notes cannot be copied and/or distributed without permission. The material is based on the text-book: Financial Econometrics: From Basics to Advanced Modeling Techniques (Wiley-Finance, Frank J. Fabozzi Series) by Svetlozar T. Rachev, Stefan Mittnik, Frank Fabozzi, Sergio M. Focardi,Teo Jaˇsic`. Prof. Dr. Svetlozar Rachev Institute for Statistics and MathematicalLecture Economics 12 Robust University Estimation of Karlsruhe Outline I Robust statistics. I Robust estimators of regressions. I Illustration: robustness of the corporate bond yield spread model. Prof. Dr. Svetlozar Rachev Institute for Statistics and MathematicalLecture Economics 12 Robust University Estimation of Karlsruhe Robust Statistics I Robust statistics addresses the problem of making estimates that are insensitive to small changes in the basic assumptions of the statistical models employed. I The concepts and methods of robust statistics originated in the 1950s. However, the concepts of robust statistics had been used much earlier. I Robust statistics: 1. assesses the changes in estimates due to small changes in the basic assumptions; 2. creates new estimates that are insensitive to small changes in some of the assumptions. I Robust statistics is also useful to separate the contribution of the tails from the contribution of the body of the data. Prof. Dr. Svetlozar Rachev Institute for Statistics and MathematicalLecture Economics 12 Robust University Estimation of Karlsruhe Robust Statistics I Peter Huber observed, that robust, distribution-free, and nonparametrical actually are not closely related properties.
- 
												  Generalized Linear Models and Generalized Additive Models00:34 Friday 27th February, 2015 Copyright ©Cosma Rohilla Shalizi; do not distribute without permission updates at http://www.stat.cmu.edu/~cshalizi/ADAfaEPoV/ Chapter 13 Generalized Linear Models and Generalized Additive Models [[TODO: Merge GLM/GAM and Logistic Regression chap- 13.1 Generalized Linear Models and Iterative Least Squares ters]] [[ATTN: Keep as separate Logistic regression is a particular instance of a broader kind of model, called a gener- chapter, or merge with logis- alized linear model (GLM). You are familiar, of course, from your regression class tic?]] with the idea of transforming the response variable, what we’ve been calling Y , and then predicting the transformed variable from X . This was not what we did in logis- tic regression. Rather, we transformed the conditional expected value, and made that a linear function of X . This seems odd, because it is odd, but it turns out to be useful. Let’s be specific. Our usual focus in regression modeling has been the condi- tional expectation function, r (x)=E[Y X = x]. In plain linear regression, we try | to approximate r (x) by β0 + x β. In logistic regression, r (x)=E[Y X = x] = · | Pr(Y = 1 X = x), and it is a transformation of r (x) which is linear. The usual nota- tion says | ⌘(x)=β0 + x β (13.1) · r (x) ⌘(x)=log (13.2) 1 r (x) − = g(r (x)) (13.3) defining the logistic link function by g(m)=log m/(1 m). The function ⌘(x) is called the linear predictor. − Now, the first impulse for estimating this model would be to apply the transfor- mation g to the response.
- 
												  Generalized Least Squares and Weighted Least Squares EstimationREVSTAT – Statistical Journal Volume 13, Number 3, November 2015, 263–282 GENERALIZED LEAST SQUARES AND WEIGH- TED LEAST SQUARES ESTIMATION METHODS FOR DISTRIBUTIONAL PARAMETERS Author: Yeliz Mert Kantar – Department of Statistics, Faculty of Science, Anadolu University, Eskisehir, Turkey [email protected] Received: March 2014 Revised: August 2014 Accepted: August 2014 Abstract: • Regression procedures are often used for estimating distributional parameters because of their computational simplicity and useful graphical presentation. However, the re- sulting regression model may have heteroscedasticity and/or correction problems and thus, weighted least squares estimation or alternative estimation methods should be used. In this study, we consider generalized least squares and weighted least squares estimation methods, based on an easily calculated approximation of the covariance matrix, for distributional parameters. The considered estimation methods are then applied to the estimation of parameters of different distributions, such as Weibull, log-logistic and Pareto. The results of the Monte Carlo simulation show that the generalized least squares method for the shape parameter of the considered distri- butions provides for most cases better performance than the maximum likelihood, least-squares and some alternative estimation methods. Certain real life examples are provided to further demonstrate the performance of the considered generalized least squares estimation method. Key-Words: • probability plot; heteroscedasticity; autocorrelation; generalized least squares; weighted least squares; shape parameter. 264 Yeliz Mert Kantar Generalized Least Squares and Weighted Least Squares 265 1. INTRODUCTION Regression procedures are often used for estimating distributional param- eters. In this procedure, the distribution function is transformed to a linear re- gression model. Thus, least squares (LS) estimation and other regression estima- tion methods can be employed to estimate parameters of a specified distribution.
- 
												  Should We Think of a Different Median Estimator?Comunicaciones en Estad´ıstica Junio 2014, Vol. 7, No. 1, pp. 11–17 Should we think of a different median estimator? ¿Debemos pensar en un estimator diferente para la mediana? Jorge Iv´an V´eleza Juan Carlos Correab [email protected] [email protected] Resumen La mediana, una de las medidas de tendencia central m´as populares y utilizadas en la pr´actica, es el valor num´erico que separa los datos en dos partes iguales. A pesar de su popularidad y aplicaciones, muchos desconocen la existencia de dife- rentes expresiones para calcular este par´ametro. A continuaci´on se presentan los resultados de un estudio de simulaci´on en el que se comparan el estimador cl´asi- co y el propuesto por Harrell & Davis (1982). Mostramos que, comparado con el estimador de Harrell–Davis, el estimador cl´asico no tiene un buen desempe˜no pa- ra tama˜nos de muestra peque˜nos. Basados en los resultados obtenidos, se sugiere promover la utilizaci´on de un mejor estimador para la mediana. Palabras clave: mediana, cuantiles, estimador Harrell-Davis, simulaci´on estad´ısti- ca. Abstract The median, one of the most popular measures of central tendency widely-used in the statistical practice, is often described as the numerical value separating the higher half of the sample from the lower half. Despite its popularity and applica- tions, many people are not aware of the existence of several formulas to estimate this parameter. We present the results of a simulation study comparing the classic and the Harrell-Davis (Harrell & Davis 1982) estimators of the median for eight continuous statistical distributions.
- 
												  Generalized and Weighted Least Squares EstimationLINEAR REGRESSION ANALYSIS MODULE – VII Lecture – 25 Generalized and Weighted Least Squares Estimation Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur 2 The usual linear regression model assumes that all the random error components are identically and independently distributed with constant variance. When this assumption is violated, then ordinary least squares estimator of regression coefficient looses its property of minimum variance in the class of linear and unbiased estimators. The violation of such assumption can arise in anyone of the following situations: 1. The variance of random error components is not constant. 2. The random error components are not independent. 3. The random error components do not have constant variance as well as they are not independent. In such cases, the covariance matrix of random error components does not remain in the form of an identity matrix but can be considered as any positive definite matrix. Under such assumption, the OLSE does not remain efficient as in the case of identity covariance matrix. The generalized or weighted least squares method is used in such situations to estimate the parameters of the model. In this method, the deviation between the observed and expected values of yi is multiplied by a weight ω i where ω i is chosen to be inversely proportional to the variance of yi. n 2 For simple linear regression model, the weighted least squares function is S(,)ββ01=∑ ωii( yx −− β0 β 1i) . ββ The least squares normal equations are obtained by differentiating S (,) ββ 01 with respect to 01 and and equating them to zero as nn n ˆˆ β01∑∑∑ ωβi+= ωiixy ω ii ii=11= i= 1 n nn ˆˆ2 βω01∑iix+= βω ∑∑ii x ω iii xy.
- 
												  Lecture 24: Weighted and Generalized Least Squares 1Lecture 24: Weighted and Generalized Least Squares 1 Weighted Least Squares When we use ordinary least squares to estimate linear regression, we minimize the mean squared error: n 1 X MSE(b) = (Y − X β)2 (1) n i i· i=1 th where Xi· is the i row of X. The solution is T −1 T βbOLS = (X X) X Y: (2) Suppose we minimize the weighted MSE n 1 X W MSE(b; w ; : : : w ) = w (Y − X b)2: (3) 1 n n i i i· i=1 This includes ordinary least squares as the special case where all the weights wi = 1. We can solve it by the same kind of linear algebra we used to solve the ordinary linear least squares problem. If we write W for the matrix with the wi on the diagonal and zeroes everywhere else, then W MSE = n−1(Y − Xb)T W(Y − Xb) (4) 1 = YT WY − YT WXb − bT XT WY + bT XT WXb : (5) n Differentiating with respect to b, we get as the gradient 2 r W MSE = −XT WY + XT WXb : b n Setting this to zero at the optimum and solving, T −1 T βbW LS = (X WX) X WY: (6) But why would we want to minimize Eq. 3? 1. Focusing accuracy. We may care very strongly about predicting the response for certain values of the input | ones we expect to see often again, ones where mistakes are especially costly or embarrassing or painful, etc. | than others. If we give the points near that region big weights, and points elsewhere smaller weights, the regression will be pulled towards matching the data in that region.
- 
												  Bias, Mean-Square Error, Relative Efficiency3 Evaluating the Goodness of an Estimator: Bias, Mean-Square Error, Relative Efficiency Consider a population parameter ✓ for which estimation is desired. For ex- ample, ✓ could be the population mean (traditionally called µ) or the popu- lation variance (traditionally called σ2). Or it might be some other parame- ter of interest such as the population median, population mode, population standard deviation, population minimum, population maximum, population range, population kurtosis, or population skewness. As previously mentioned, we will regard parameters as numerical charac- teristics of the population of interest; as such, a parameter will be a fixed number, albeit unknown. In Stat 252, we will assume that our population has a distribution whose density function depends on the parameter of interest. Most of the examples that we will consider in Stat 252 will involve continuous distributions. Definition 3.1. An estimator ✓ˆ is a statistic (that is, it is a random variable) which after the experiment has been conducted and the data collected will be used to estimate ✓. Since it is true that any statistic can be an estimator, you might ask why we introduce yet another word into our statistical vocabulary. Well, the answer is quite simple, really. When we use the word estimator to describe a particular statistic, we already have a statistical estimation problem in mind. For example, if ✓ is the population mean, then a natural estimator of ✓ is the sample mean. If ✓ is the population variance, then a natural estimator of ✓ is the sample variance. More specifically, suppose that Y1,...,Yn are a random sample from a population whose distribution depends on the parameter ✓.The following estimators occur frequently enough in practice that they have special notations.
- 
												  18.650 (F16) Lecture 10: Generalized Linear Models (Glms)Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X (µ(X), σ2I), | ∼ N And ⊤ IE(Y X) = µ(X) = X β, | 2/52 Components of a linear model The two components (that we are going to relax) are 1. Random component: the response variable Y X is continuous | and normally distributed with mean µ = µ(X) = IE(Y X). | 2. Link: between the random and covariates X = (X(1),X(2), ,X(p))⊤: µ(X) = X⊤β. · · · 3/52 Generalization A generalized linear model (GLM) generalizes normal linear regression models in the following directions. 1. Random component: Y some exponential family distribution ∼ 2. Link: between the random and covariates: ⊤ g µ(X) = X β where g called link function� and� µ = IE(Y X). | 4/52 Example 1: Disease Occuring Rate In the early stages of a disease epidemic, the rate at which new cases occur can often increase exponentially through time. Hence, if µi is the expected number of new cases on day ti, a model of the form µi = γ exp(δti) seems appropriate. ◮ Such a model can be turned into GLM form, by using a log link so that log(µi) = log(γ) + δti = β0 + β1ti. ◮ Since this is a count, the Poisson distribution (with expected value µi) is probably a reasonable distribution to try. 5/52 Example 2: Prey Capture Rate(1) The rate of capture of preys, yi, by a hunting animal, tends to increase with increasing density of prey, xi, but to eventually level off, when the predator is catching as much as it can cope with.
- 
												  A Joint Central Limit Theorem for the Sample Mean and Regenerative Variance Estimator*Annals of Operations Research 8(1987)41-55 41 A JOINT CENTRAL LIMIT THEOREM FOR THE SAMPLE MEAN AND REGENERATIVE VARIANCE ESTIMATOR* P.W. GLYNN Department of Industrial Engineering, University of Wisconsin, Madison, W1 53706, USA and D.L. IGLEHART Department of Operations Research, Stanford University, Stanford, CA 94305, USA Abstract Let { V(k) : k t> 1 } be a sequence of independent, identically distributed random vectors in R d with mean vector ~. The mapping g is a twice differentiable mapping from R d to R 1. Set r = g(~). A bivariate central limit theorem is proved involving a point estimator for r and the asymptotic variance of this point estimate. This result can be applied immediately to the ratio estimation problem that arises in regenerative simulation. Numerical examples show that the variance of the regenerative variance estimator is not necessarily minimized by using the "return state" with the smallest expected cycle length. Keywords and phrases Bivariate central limit theorem,j oint limit distribution, ratio estimation, regenerative simulation, simulation output analysis. 1. Introduction Let X = {X(t) : t I> 0 } be a (possibly) delayed regenerative process with regeneration times 0 = T(- 1) ~< T(0) < T(1) < T(2) < .... To incorporate regenerative sequences {Xn: n I> 0 }, we pass to the continuous time process X = {X(t) : t/> 0}, where X(0 = X[t ] and [t] is the greatest integer less than or equal to t. Under quite general conditions (see Smith [7] ), *This research was supported by Army Research Office Contract DAAG29-84-K-0030. The first author was also supported by National Science Foundation Grant ECS-8404809 and the second author by National Science Foundation Grant MCS-8203483.
- 
												  Weighted Least SquaresWeighted Least Squares 2 ∗ The standard linear model assumes that Var("i) = σ for i = 1; : : : ; n. ∗ As we have seen, however, there are instances where σ2 Var(Y j X = xi) = Var("i) = : wi ∗ Here w1; : : : ; wn are known positive constants. ∗ Weighted least squares is an estimation technique which weights the observations proportional to the reciprocal of the error variance for that observation and so overcomes the issue of non-constant variance. 7-1 Weighted Least Squares in Simple Regression ∗ Suppose that we have the following model Yi = β0 + β1Xi + "i i = 1; : : : ; n 2 where "i ∼ N(0; σ =wi) for known constants w1; : : : ; wn. ∗ The weighted least squares estimates of β0 and β1 minimize the quantity n X 2 Sw(β0; β1) = wi(yi − β0 − β1xi) i=1 ∗ Note that in this weighted sum of squares, the weights are inversely proportional to the corresponding variances; points with low variance will be given higher weights and points with higher variance are given lower weights. 7-2 Weighted Least Squares in Simple Regression ∗ The weighted least squares estimates are then given as ^ ^ β0 = yw − β1xw P w (x − x )(y − y ) ^ = i i w i w β1 P 2 wi(xi − xw) where xw and yw are the weighted means P w x P w y = i i = i i xw P yw P : wi wi ∗ Some algebra shows that the weighted least squares esti- mates are still unbiased. 7-3 Weighted Least Squares in Simple Regression ∗ Furthermore we can find their variances 2 ^ σ Var(β1) = X 2 wi(xi − xw) 2 3 1 2 ^ xw 2 Var(β0) = 4X + X 25 σ wi wi(xi − xw) ∗ Since the estimates can be written in terms of normal random variables, the sampling distributions are still normal.
- 
												  This Is Dr. Chumney. the Focus of This Lecture Is Hypothesis Testing –Both What It Is, How Hypothesis Tests Are Used, and How to Conduct Hypothesis TestsTRANSCRIPT: This is Dr. Chumney. The focus of this lecture is hypothesis testing –both what it is, how hypothesis tests are used, and how to conduct hypothesis tests. 1 TRANSCRIPT: In this lecture, we will talk about both theoretical and applied concepts related to hypothesis testing. 2 TRANSCRIPT: Let’s being the lecture with a summary of the logic process that underlies hypothesis testing. 3 TRANSCRIPT: It is often impossible or otherwise not feasible to collect data on every individual within a population. Therefore, researchers rely on samples to help answer questions about populations. Hypothesis testing is a statistical procedure that allows researchers to use sample data to draw inferences about the population of interest. Hypothesis testing is one of the most commonly used inferential procedures. Hypothesis testing will combine many of the concepts we have already covered, including z‐scores, probability, and the distribution of sample means. To conduct a hypothesis test, we first state a hypothesis about a population, predict the characteristics of a sample of that population (that is, we predict that a sample will be representative of the population), obtain a sample, then collect data from that sample and analyze the data to see if it is consistent with our hypotheses. 4 TRANSCRIPT: The process of hypothesis testing begins by stating a hypothesis about the unknown population. Actually we state two opposing hypotheses. The first hypothesis we state –the most important one –is the null hypothesis. The null hypothesis states that the treatment has no effect. In general the null hypothesis states that there is no change, no difference, no effect, and otherwise no relationship between the independent and dependent variables.