Chapter 12 Autocorrelation in Time Series Data


In a time series model,

    Yt = β0 + β1 Xt + εt,   t = 1, ..., n,

we allow the errors εt to be correlated with one another; that is, ρ(εi, εj) ≠ 0 for i ≠ j. By way of comparison, in a simple linear regression model we assume the errors εi have zero covariance, σ(εi, εj) = 0. Recall that covariance and correlation are closely related to one another; in particular, σ(εi, εj) = 0 for i ≠ j implies ρ(εi, εj) = 0 for i ≠ j.

12.1 Problems of Autocorrelation

If we treat a time series model as a simple linear regression model, that is, if we assume ρ(εi, εj) = 0 when, in fact, ρ(εi, εj) ≠ 0, then the variance of the error terms of the time series model will be underestimated. Any confidence intervals or tests for the time series model in this case would then be suspect.

Exercise 12.1 (Problems of Autocorrelation)

1. The errors in (a) below are (circle one) not autocorrelated / autocorrelated because they fluctuate at random above and below zero (0).

[Figure 12.1 (Positively Autocorrelated Error Terms): three error-versus-time plots; (a) uncorrelated et, (b) correlated et = et-1 + ut where e0 = 10, (c) correlated et = et-1 + ut where e0 = -10.]

   time, t      1   2   3   4   5   6   7   8   9  10
   error, εt    1   2   4  -1  -5   0   2  -1   2   2

2. The errors in (b) above are positively autocorrelated because adjacent error terms tend to be of the same magnitude (see the sketch after this exercise).

   time, t                                0   1   2   3   4   5   6   7   8   9  10
   not autocorrelated error, ut               1   2   4  -1  -5   0   2  -1   2   2
   autocorrelated error, εt = εt-1 + ut  10  11  13  17  16  11  11  13  12  14  16

   In this case, all of the errors fluctuate at random above the horizontal line at (choose one) -10 / 0 / 10.

3. Consider the errors in (c) above.

   time, t                                 0   1   2   3   4   5   6   7   8   9  10
   not autocorrelated error, ut                1   2   4  -1  -5   0   2  -1   2   2
   autocorrelated error, εt = εt-1 + ut  -10  -9  -7  -3  -4  -9  -9  -7  -8  -6  -4

   These errors are (circle one) positively / negatively autocorrelated because, even though adjacent error terms are all negative, they still tend to be of the same magnitude. Negative autocorrelation occurs when εt = -εt-1 + ut, that is, when adjacent error terms are of about the same absolute magnitude but opposite in sign.

4. Match residual plots with scatter plots.

[Figure 12.2 (Residual plots and associated scatter plots): residual-versus-time plots (a) residual plot 1, (b) residual plot 2, (c) residual plot 3, and scatter plots of Y versus time (d) scatter plot 1, (e) scatter plot 2, (f) scatter plot 3.]

   Match the residual plots with the scatter plots.

   Residual plot   (a)   (b)   (c)
   Scatter plot    ___   ___   ___

5. Problem of detecting autocorrelation. What if we were unaware that the simple linear regression was, in fact, a time series model? That is, what if we assumed ρ(εi, εj) = 0 when, in fact, we should have assumed ρ(εi, εj) ≠ 0? In this case, we would draw scatterplot (choose one) (d) / (e) / (f) in the figure above. In other words, autocorrelation is difficult to detect.

6. What if Positively Autocorrelated Error Goes Undetected?

[Figure 12.3 (What if Positively Autocorrelated Error Goes Undetected?): (a) undetected positive autocorrelation, actual regression versus fitted regression; (b) detected positive autocorrelation, actual error versus fitted error.]

   If the positively autocorrelated error remains undetected, then the fitted error terms are (choose one) smaller / larger than the actual error terms.

7. More on What if Positively Autocorrelated Error Goes Undetected. If the positively autocorrelated error remains undetected, the variance of the error terms in the model is often (circle one) underestimated / overestimated. Implications of this are that
   (a) the MSE may underestimate the variance of the error terms,
   (b) s{bk} may underestimate the standard deviation of the estimated regression coefficient bk, and
   (c) the estimated regression coefficients may still be unbiased, but they no longer have minimum variance,
   and this all implies that confidence intervals and tests, using the t and F distributions, cannot really be used for inference purposes.

8. Property of Positively Autocorrelated Error. When errors are positively autocorrelated, the initial error, ε0, tends to (circle one) die out quickly (tend to zero) / linger (stay at the same magnitude). Consequently, to detect positively autocorrelated error, it makes sense to create a measure that detects error that lingers at the same magnitude.

9. Autocorrelation Versus Cross Correlation. True / False: The word autocorrelation refers to the analysis of the correlation of error variables within one time series; cross correlation refers to the analysis of the correlation of error variables in different time series.
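The cumulative construction used in items 2 and 3 is easy to reproduce in a few lines of code. The following is an added illustration in Python (the chapter itself works in SAS); the ut values are the ones tabulated above.

```python
# Illustrative sketch (not part of the original exercise): building the
# autocorrelated errors of items 2 and 3 by accumulating the u_t values.
u = [1, 2, 4, -1, -5, 0, 2, -1, 2, 2]   # uncorrelated errors u_t, t = 1..10

def accumulate(e0, u):
    """Return e_t = e_{t-1} + u_t, t = 0..10, starting from the initial error e0."""
    errors = [e0]
    for ut in u:
        errors.append(errors[-1] + ut)
    return errors

print(accumulate(10, u))    # [10, 11, 13, 17, 16, 11, 11, 13, 12, 14, 16], plot (b)
print(accumulate(-10, u))   # [-10, -9, -7, -3, -4, -9, -9, -7, -8, -6, -4], plot (c)
```

Because each error carries the previous error forward, a large initial error persists for many periods; this is exactly the "lingering" behavior that item 8 associates with positive autocorrelation.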
12.2 First-Order Autoregressive Error Model

The simple first-order autoregressive error model is

    Yt = β0 + β1 Xt + εt
    εt = ρ εt-1 + ut

where |ρ| < 1, the ut are independent N(0, σ²), and ρ is the autocorrelation parameter. In a similar way, the multiple first-order autoregressive error model is

    Yt = β0 + β1 Xt1 + β2 Xt2 + ... + βp-1 Xt,p-1 + εt
    εt = ρ εt-1 + ut

where, again, |ρ| < 1 and the ut are independent N(0, σ²). Notice, in particular, that we assume in this model that the error terms εt depend on one another according to εt = ρ εt-1 + ut.
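The behavior of this model is easy to see by simulation. The sketch below is an added Python illustration (it is not the chapter's SAS program); it generates errors from εt = ρ εt-1 + ut and checks, empirically, the stationary properties that Exercise 12.2 works through next. The choice ρ = 0.8, σ = 1 is only an example.

```python
# Illustrative sketch (added here, not the chapter's SAS code): simulating errors
# from the first-order autoregressive model eps_t = rho*eps_{t-1} + u_t, u_t ~ N(0, sigma^2).
import numpy as np

def simulate_ar1_errors(rho, sigma, n, rng):
    """Simulate n errors from the first-order autoregressive error model."""
    eps = np.empty(n)
    # Start from the stationary distribution, whose variance is sigma^2 / (1 - rho^2).
    eps[0] = rng.normal(0.0, sigma / np.sqrt(1.0 - rho**2))
    u = rng.normal(0.0, sigma, size=n)          # independent disturbances u_t
    for t in range(1, n):
        eps[t] = rho * eps[t - 1] + u[t]        # eps_t = rho*eps_{t-1} + u_t
    return eps

rng = np.random.default_rng(0)
eps = simulate_ar1_errors(rho=0.8, sigma=1.0, n=5000, rng=rng)

# Sample versions of the properties listed in Exercise 12.2:
print(eps.mean())                               # near E{eps_t} = 0
print(eps.var(), 1.0 / (1.0 - 0.8**2))          # near sigma^2 / (1 - rho^2)
print(np.corrcoef(eps[:-1], eps[1:])[0, 1])     # lag-1 correlation near rho
```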
Exercise 12.2 (Properties of First-Order Autoregressive Error Model)

1. True / False: The one and only way the error terms εt can depend on one another is according to εt = ρ εt-1 + ut. (We assume the error terms εt depend on one another in this way because it is reasonable and the mathematics works out this way.)

2. If ρ = 1, then εt = ρ εt-1 + ut = εt-1 + ut. In this case, the magnitude of the error now, at time t, is (choose one) less than / equal to / greater than the magnitude of the error one time unit in the past, at time t - 1.

3. If ρ = 1, the error terms are (choose one) positively autocorrelated / uncorrelated / negatively autocorrelated.

4. If ρ = -1, the error terms are (choose one) positively autocorrelated / uncorrelated / negatively autocorrelated.

5. If ρ = 0, then εt = ρ εt-1 + ut = ut, where, recall, the ut are assumed to be independent. In this case, the error terms are (choose one) positively autocorrelated / uncorrelated / negatively autocorrelated.

6. Mean: E{εt} = 0. Even though the error terms are autocorrelated, they (circle one) are / are not unbiased.

7. Variance: σ²{εt} = σ² / (1 - ρ²). The smaller (closer to zero) the correlation parameter ρ is, the (circle one) closer / farther away the variance of the autocorrelated error terms is to the variance of the unautocorrelated error terms, σ².

8. Covariance: σ{εt, εt-1} = ρ (σ² / (1 - ρ²)) = ρκ. If the error terms are not autocorrelated (ρ = 0), the covariance is (circle one) zero / one / infinity.

9. Correlation: ρ{εt, εt-1} = ρ. If the error terms are not autocorrelated (ρ = 0), the correlation (standardized covariance) is (circle one) zero / one / infinity.

10. Covariance s time periods apart: σ{εt, εt-s} = ρ^s (σ² / (1 - ρ²)) = ρ^s κ. The covariance s time periods apart, σ{εt, εt-s}, is (circle one) smaller / larger than the covariance one time period apart, σ{εt, εt-1}, because 0 ≤ ρ ≤ 1.

11. True / False: The correlation s time periods apart is ρ{εt, εt-s} = ρ^s.

12. True / False: The variance-covariance matrix is

        σ²{ε} = [ κ          κρ         ...  κρ^(n-1)
                  κρ         κ          ...  κρ^(n-2)
                  ...        ...        ...  ...
                  κρ^(n-1)   κρ^(n-2)   ...  κ        ]   (n × n)

    where κ = σ² / (1 - ρ²).

12.3 Durbin-Watson Test for Autocorrelation

SAS program: att11-12-3-read-durbin

If the Durbin-Watson test statistic,

    D = Σ_{t=2}^{n} (et - et-1)² / Σ_{t=1}^{n} et²,   where et = Yt - Ŷt,

is small, then this indicates εt ≈ εt-1, or, in other words, that the error terms are positively autocorrelated with one another. In fact, if

    D > dU:        ρ = 0, not autocorrelated
    dL < D < dU:   undetermined, or weakly autocorrelated
    D < dL:        ρ > 0, positively autocorrelated

where dU and dL are found in Table B.7, pages 1349-1350. (A computational sketch of D follows the data below.)

Exercise 12.3 (Durbin-Watson Test for Autocorrelation)

    illumination, X      1   2   3   4   5   6   7   8   9  10
    ability to read, Y  70  70  75  88  91  94 100  92  90  85

1.
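As a hedged illustration added here (Python, not the SAS program att11-12-3-read-durbin; the least-squares fit and the resulting value of D are products of this sketch, not values quoted from the text), the Durbin-Watson statistic for the tabulated data can be computed directly from the residuals of a fitted line:

```python
# Added sketch: computing the Durbin-Watson statistic D for the
# illumination/reading data tabulated above.
import numpy as np

X = np.arange(1, 11, dtype=float)
Y = np.array([70, 70, 75, 88, 91, 94, 100, 92, 90, 85], dtype=float)

# Fit the simple linear regression Y = b0 + b1*X by least squares.
b1, b0 = np.polyfit(X, Y, deg=1)
residuals = Y - (b0 + b1 * X)                  # e_t = Y_t - Yhat_t

# D = sum_{t=2}^{n} (e_t - e_{t-1})^2 / sum_{t=1}^{n} e_t^2
D = np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2)
print(round(D, 3))   # compare D with d_L and d_U from Table B.7
```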