
Time Series Regression (Part 1)
LECTURE 7 | TIME SERIES FORECASTING METHOD
rahmaanisa@apps.ipb.ac.id

Review

• Smoothing method for non-seasonal time series data:
  - Moving average: SMA, DMA
  - Exponential smoothing: SES, DES
• Smoothing method for seasonal time series data:
  - Additive Holt-Winters
  - Multiplicative Holt-Winters

Outline

• Review of regression model
• Independence assumption and the consequences of its violation
• Regression model for time series

Linear Regression

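$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$

where $\mathbf{y}$ is the dependent variable, $\mathbf{X}$ holds the independent variable(s), $\boldsymbol{\beta}$ is the vector of coefficients, and $\boldsymbol{\varepsilon}$ is the model error.

Linear Regression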
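For reference, the ordinary least squares (OLS) estimator mentioned later in this lecture has the standard closed form below for the model $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$. The hat notation and the residual vector $\mathbf{e}$ are introduced here for convenience and do not appear on the slides.

```latex
\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y},
\qquad
\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}},
\qquad
\mathbf{e} = \mathbf{y} - \hat{\mathbf{y}}
```

The residuals $\mathbf{e}$ are the quantities examined whenever the error assumptions are checked.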

[Figure: fitted line plot of Y (vertical axis) against X (horizontal axis). Fitted line: Y = 2.803 + 1.511 X; S = 0.911075, R-Sq = 95.9%, R-Sq(adj) = 95.9%]

Assumptions on Linear Regression Model

• The relationship between X and Y is linear

• $\varepsilon \sim \text{i.i.d. Normal}(0, \sigma^2)$

• No multicollinearity

Diagnostics

Serially Correlated Error

$\operatorname{cov}(e_t, e_{t-k}) \neq 0$

where

$e_t$ = error at time $t$

$e_{t-k}$ = error at time $(t - k)$, $k = 1, 2, \ldots$

Problems in Linear Regression: Serial Correlation
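The condition that the covariance between errors k periods apart is non-zero can be made concrete with a small simulation. The sketch below assumes a first-order autoregressive error process, e_t = rho * e_{t-1} + u_t, which is one common form of serial correlation and is an assumption made here, not something the slides specify. The function names are illustrative.

```python
import numpy as np


def simulate_ar1_errors(rho, n=500, seed=0):
    """Generate e_t = rho * e_{t-1} + u_t with u_t ~ N(0, 1) (assumed AR(1) form)."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(n)
    e = np.zeros(n)
    for t in range(1, n):
        e[t] = rho * e[t - 1] + u[t]
    return e


def lag_autocorrelation(e, k=1):
    """Sample autocorrelation between e_t and e_{t-k}."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    return float(np.sum(e[k:] * e[:-k]) / np.sum(e * e))


# Positive rho gives slowly drifting errors, negative rho gives errors that
# tend to flip sign from one period to the next, rho = 0 gives independence.
for rho in (0.8, 0.0, -0.8):
    errors = simulate_ar1_errors(rho)
    print(f"rho = {rho:+.1f}  lag-1 autocorrelation = {lag_autocorrelation(errors):+.3f}")
```

Plotting the simulated series against time should show the two patterns described next: gradual oscillation for positive rho and rapid sign changes for negative rho.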

Positive serial correlation of residuals

The residuals change sign in gradual oscillation.

Problems in Linear Regression: Serial Correlation

Negative serial correlation of residuals

The residuals bounce between positive and negative, but not randomly.

Possible Causes of Serially Correlated Error

1) omitted variables

2) ignoring nonlinearities

3) measurement errors

Consequences of Serially Correlated Error

1. The OLS estimators are still unbiased and consistent.

2. In large samples, the estimators are still normally distributed.

3. The estimators are no longer efficient, that is, they are no longer BLUE (best linear unbiased estimators).

4. The estimated standard errors may be underestimated.

5. As a result, tests based on the t and F distributions may no longer be appropriate.

Identification of Serially Correlated Error

• Residual plot
• Durbin-Watson test
• Runs test
• Breusch-Godfrey test
• Etc.

Possible Solutions for the Problem

• Cochrane-Orcutt

• Hildreth-Lu

• Distributed lag

• Etc.

Illustration
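As a brief aside on the first remedy listed above, the Cochrane-Orcutt procedure can be sketched as an iteration: estimate the error autocorrelation rho from the OLS residuals, quasi-difference the data, re-fit, and repeat. The sketch below is the textbook form under an assumed AR(1) error process and a single regressor, run on synthetic data. The function names, the true coefficients (2.0 and 1.5) and rho = 0.7 are all illustrative assumptions, and this is not necessarily the variant the lecture has in mind.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y = X b + e."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b


def cochrane_orcutt(x, y, n_iter=20, tol=1e-8):
    """Iterative Cochrane-Orcutt fit of y_t = b0 + b1 * x_t + e_t,
    assuming AR(1) errors e_t = rho * e_{t-1} + u_t."""
    X = np.column_stack([np.ones_like(x), x])
    beta = ols(X, y)  # start from plain OLS
    rho = 0.0
    for _ in range(n_iter):
        e = y - X @ beta  # residuals on the original scale
        rho_new = float(e[1:] @ e[:-1] / (e[:-1] @ e[:-1]))
        # Quasi-difference: y_t - rho*y_{t-1} = b0*(1 - rho) + b1*(x_t - rho*x_{t-1}) + u_t
        y_star = y[1:] - rho_new * y[:-1]
        x_star = x[1:] - rho_new * x[:-1]
        X_star = np.column_stack([np.full_like(x_star, 1.0 - rho_new), x_star])
        beta = ols(X_star, y_star)  # beta[0] is b0 on the original scale
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return beta, rho


# Synthetic example (all values below are made up for illustration).
rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0.0, 10.0, n)
u = rng.standard_normal(n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.7 * e[t - 1] + u[t]
y = 2.0 + 1.5 * x + e

beta_hat, rho_hat = cochrane_orcutt(x, y)
print("intercept, slope:", beta_hat)
print("estimated rho:", rho_hat)
```

Using (1 - rho) as the intercept column keeps the intercept on the original scale, so no back-transformation is needed after the last iteration.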

Consider the number of labour hours and sales (in dollars) data set as follows:

Year   Quarter   Number of labour hours   Sales in dollars
2011   1         126754                   15349829
2011   2         129839                   15629384
2011   3         106872                   15720934
2011   4         123787                   16230984
2012   1         137678                   16809312
2012   2         138279                   16923347
2012   3         109873                   16978434
2012   4         137368                   17203948
2013   1         139823                   17830230
2013   2         138346                   17937463
2013   3         112837                   18074652
2013   4         149870                   18347655
2014   1         147263                   18438749
2014   2         147868                   18604334
2014   3         113897                   18740234
2014   4         149879                   18943340
2015   1         149376                   19276345
2015   2         156982                   19173645
2015   3         123783                   19147234
2015   4         159734                   19842667
2016   1         159734                   20783274
2016   2         169283                   20348753
2016   3         128647                   20873488
2016   4         163467                   20475644

Source: kaggle.com

Illustration

The dataset is available at:

https://github.com/raoy/Time-Series-Analysis

Illustration

[Figure: scatterplot of sales in dollars (vertical axis) vs number of labour hours (horizontal axis)]

Correlations
Pearson correlation: 0.615
P-value: 0.001

Illustration

Regression Equation
sales in dollars = 10373187 + 56.8 Number of labour hours

Analysis of Variance
Source       DF   Adj SS     Adj MS     F-Value   P-Value
Regression    1   2.32E+13   2.32E+13   13.37     0.001
Error        22   3.82E+13   1.74E+12
Total        23   6.14E+13

Model Summary
S         R-sq     R-sq(adj)   R-sq(pred)
1317579   37.79%   34.97%      27.52%

Coefficients
Term                     Coef       SE Coef   T-Value   P-Value   VIF
Constant                 10373187   2167627   4.79      0
Number of labour hours   56.8       15.5      3.66      0.001     1

Illustration
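The fit reported above can be reproduced, at least approximately, from the 24 quarterly observations listed earlier. The sketch below uses NumPy least squares rather than the statistical package that produced the output shown. The variable names are illustrative, and the printed slope and R-squared should land close to the reported 56.8 and 37.79%.

```python
import numpy as np

# Quarterly observations, 2011 Q1 to 2016 Q4, in the order listed earlier.
labour_hours = np.array([
    126754, 129839, 106872, 123787, 137678, 138279, 109873, 137368,
    139823, 138346, 112837, 149870, 147263, 147868, 113897, 149879,
    149376, 156982, 123783, 159734, 159734, 169283, 128647, 163467,
], dtype=float)
sales = np.array([
    15349829, 15629384, 15720934, 16230984, 16809312, 16923347,
    16978434, 17203948, 17830230, 17937463, 18074652, 18347655,
    18438749, 18604334, 18740234, 18943340, 19276345, 19173645,
    19147234, 19842667, 20783274, 20348753, 20873488, 20475644,
], dtype=float)

# Design matrix with an intercept column, then ordinary least squares.
X = np.column_stack([np.ones_like(labour_hours), labour_hours])
beta, *_ = np.linalg.lstsq(X, sales, rcond=None)

residuals = sales - X @ beta
r_squared = 1.0 - (residuals @ residuals) / np.sum((sales - sales.mean()) ** 2)

print("intercept:", beta[0])
print("slope:", beta[1])
print("R-squared:", r_squared)
```

Printing or plotting `residuals` in time order is a quick way to see whether they look random.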

The residuals are NOT RANDOM!

Illustration
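One way to quantify the non-random residual pattern is the Durbin-Watson statistic, d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2. It is close to 2 when there is no first-order autocorrelation, moves toward 0 under positive autocorrelation, and moves toward 4 under negative autocorrelation. The sketch below computes it by hand for the regression of sales on labour hours. The function names are illustrative, and no formal critical values are applied here.

```python
import numpy as np

# Quarterly observations, 2011 Q1 to 2016 Q4, in the order listed earlier.
labour_hours = np.array([
    126754, 129839, 106872, 123787, 137678, 138279, 109873, 137368,
    139823, 138346, 112837, 149870, 147263, 147868, 113897, 149879,
    149376, 156982, 123783, 159734, 159734, 169283, 128647, 163467,
], dtype=float)
sales = np.array([
    15349829, 15629384, 15720934, 16230984, 16809312, 16923347,
    16978434, 17203948, 17830230, 17937463, 18074652, 18347655,
    18438749, 18604334, 18740234, 18943340, 19276345, 19173645,
    19147234, 19842667, 20783274, 20348753, 20873488, 20475644,
], dtype=float)


def ols_residuals(x, y):
    """Residuals from the simple regression of y on x with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def durbin_watson(e):
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    e = np.asarray(e, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


def lag1_autocorrelation(e):
    """Sample lag-1 autocorrelation of the residuals."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    return float(np.sum(e[1:] * e[:-1]) / np.sum(e * e))


res = ols_residuals(labour_hours, sales)
dw = durbin_watson(res)
r1 = lag1_autocorrelation(res)
print("Durbin-Watson statistic:", dw)
print("lag-1 residual autocorrelation:", r1)
```

A value of d well below 2 together with a positive lag-1 autocorrelation is consistent with positively correlated errors. A formal conclusion would still require the Durbin-Watson bounds for the sample size and number of regressors.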

[Figure: scatterplot of sales in dollars (vertical axis) vs sales in dollars (t-1) (horizontal axis)]

Sales is HIGHLY CORRELATED with its value at period (t-1).

Illustration

Add the lag of SALES as an independent variable.

Regression Equation
sales in dollars = 1111051 + 0.9296 sales in dollars (t-1) + 2.80 Number of labour hours
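The lagged model stated above can be sketched in the same way: shift the sales series by one quarter, drop the first observation (which has no lag), and regress sales on lagged sales and labour hours. This is a NumPy least-squares sketch with illustrative variable names, not the original software run. It leaves 23 usable observations, and the coefficient on lagged sales should come out close to the reported 0.9296.

```python
import numpy as np

# Quarterly observations, 2011 Q1 to 2016 Q4, in the order listed earlier.
labour_hours = np.array([
    126754, 129839, 106872, 123787, 137678, 138279, 109873, 137368,
    139823, 138346, 112837, 149870, 147263, 147868, 113897, 149879,
    149376, 156982, 123783, 159734, 159734, 169283, 128647, 163467,
], dtype=float)
sales = np.array([
    15349829, 15629384, 15720934, 16230984, 16809312, 16923347,
    16978434, 17203948, 17830230, 17937463, 18074652, 18347655,
    18438749, 18604334, 18740234, 18943340, 19276345, 19173645,
    19147234, 19842667, 20783274, 20348753, 20873488, 20475644,
], dtype=float)

# Response: sales from the second quarter onward.
y = sales[1:]
# Regressors: intercept, sales one quarter earlier, same-quarter labour hours.
X = np.column_stack([np.ones(len(y)), sales[:-1], labour_hours[1:]])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta
r_squared = 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)

print("intercept:", beta[0])
print("coefficient on sales (t-1):", beta[1])
print("coefficient on labour hours:", beta[2])
print("R-squared:", r_squared)
```

Because a lagged dependent variable now appears on the right-hand side, the residuals of this model should be re-checked for autocorrelation before the fit is trusted.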

Analysis of Variance
Source       DF   Adj SS     Adj MS     F-Value   P-Value
Regression    2   5.06E+13   2.53E+13   237.69    0
Error        20   2.13E+12   1.06E+11
Total        22   5.27E+13

Model Summary
S        R-sq     R-sq(adj)   R-sq(pred)
326161   95.96%   95.56%      94.30%

Coefficients
Term                     Coef      SE Coef   T-Value   P-Value   VIF
Constant                 1111051   796218    1.4       0.178
sales in dollars (t-1)   0.9296    0.0546    17.01     0         1.58
Number of labour hours   2.8       4.88      0.57      0.573     1.58

Chapter Summary

• Assumptions on classical regression modeling
• Consequences of autocorrelated residuals
• Regression modeling for time series data

Another Example

See chapter 4.8 of Hyndman and Athanasopoulos (2013): https://www.otexts.org/fpp/4/8

Exercise 1

Suppose market share data for a toothpaste product are available for 20 periods:

Period   Market share   Price
1        3.63           0.97
2        4.20           0.95
3        3.33           0.99
4        4.54           0.91
5        2.89           0.98
6        4.87           0.90
7        4.90           0.89
8        5.29           0.86
9        6.18           0.85
10       7.20           0.82
11       7.25           0.79
12       6.09           0.83
13       6.80           0.81
14       8.65           0.77
15       8.43           0.76
16       8.29           0.80
17       7.18           0.83
18       7.90           0.79
19       8.45           0.76
20       8.23           0.78

Conduct regression modeling of market share (Y) on price (X). Investigate autocorrelation of the residuals.

Exercise 2

Conduct appropriate regression modeling using the following data set, and investigate autocorrelation of the residuals.

Year   Sales   Advertising
1975   11.7    9.4
1976   12.0    9.6
1977   12.3    10
1978   12.8    10.4
1979   13.1    10.8
1980   13.6    10.9
1981   13.9    11.7
1982   14.4    12.2
1983   14.7    12.5
1984   15.3    12.9
1985   15.5    13.0
1986   15.8    13.2
1987   16.1    13.8
1988   16.6    14.2
1989   16.9    14.6
1990   16.7    14.4
1991   16.9    15.0
1992   17.4    15.4
1993   17.6    15.7
1994   17.9    15.9
1995   18.0    15.9
1996   17.9    16.0
1997   18.0    16.3
1998   18.2    16.2
1999   18.2    16.8
2000   18.3    17.3
2001   18.6    17.6
2002   19.2    18.1
2003   19.3    18.3
2004   19.5    18.5
2005   19.2    18.7
2006   19.3    18.9
2007   19.5    19.2
2008   20.0    20.0
2009   20.0    20.0
2010   19.9    20.3
2011   19.8    20.4
2012   19.9    21.0
2013   20.2    21.5
2014   21.0    22.1

Next Topic…

Regression for Time Series Data Set (Part 2)

References

Gujarati, D., McMillan, P. 2011. Econometrics by Example. London: Palgrave Macmillan.

Hyndman, R.J. and Athanasopoulos, G. 2013. Forecasting: principles and practice. https://www.otexts.org/fpp/6/2/ [March 21st, 2018]

Paulson, D.S. 2007. Handbook of Regression and Modeling: Applications for the Clinical and Pharmaceutical Industries. Boca Raton: Chapman & Hall.

The handouts are available on the following site:

stat.ipb.ac.id/en

PREPARE YOUR MID-EXAM