Statistics and Econometrics Using XLispStat
John E. Floyd
University of Toronto
May 27, 2009

Contents

1 Introduction
2 Working with XLispStat
  2.1 Using XLispStat as a Calculator
  2.2 Defining Objects and Working with Lists
  2.3 Writing Lisp Functions
  2.4 Working with Matrices
  2.5 Reading and Writing Data Files
  2.6 Transforming Data
  2.7 Error Messages
3 Descriptive Statistics
4 Hypothesis Tests
  4.1 Probability Densities and Quantiles
  4.2 Plotting Probability Distributions
  4.3 Generating Random Data
  4.4 Tests of the Mean and Standard Deviation
  4.5 Tests of the Difference Between Two Means
  4.6 Tests of Goodness of Fit
5 Linear Regression Analysis
  5.1 Using Matrix Calculations
  5.2 Using the Regression-Model Function
  5.3 Heteroskedasticity
  5.4 Time Series: Autocorrelated Residuals
  5.5 Multicollinearity
  5.6 Some Improved Linear Regression Functions
    5.6.1 A Basic OLS-Regression Function
    5.6.2 Regressions on Cross-Sectional Data
    5.6.3 Time-Series Regressions
    5.6.4 Adjusting the Lengths of Time-Series and Setting up Lagged Values
6 Regression Analysis of Panel Data
  6.1 Differences Estimation
  6.2 Entity Demeaned Fixed Effects Regression
  6.3 Using Fixed-Effects Dummy Variables
  6.4 Reorganisation of Panel Data Sets
7 Instrumental Variables Regression
  7.1 Two-Stage Least Squares
  7.2 Estimation Using Ordinary Least Squares
  7.3 First Stage TSLS Estimation
  7.4 Second Stage TSLS Estimation
  7.5 An Application to Panel Data
8 Probit, Logit and Nonlinear Regression
  8.1 The Linear Probability Model
  8.2 Probit and Logit Models
  8.3 Nonlinear Least Squares Estimation
  8.4 Maximum Likelihood Estimation
9 Spurious Regression and Cointegration
  9.1 Checking for Stationarity
    9.1.1 Dickey-Fuller Tests
    9.1.2 Phillips-Perron Tests
    9.1.3 The Problem of Low Power
  9.2 Testing for Cointegration
    9.2.1 Tests of Regression Residuals for Cointegration
    9.2.2 Johansen Cointegration Tests
10 Further Topics in Regression Analysis
  10.1 Joint Hypotheses Tests
  10.2 Non-Nested Hypotheses Tests
    10.2.1 F-Tests
    10.2.2 J-Tests
    10.2.3 Complete Parameter Encompassing Tests
  10.3 Generalised Least Squares
    10.3.1 The Nature of GLS
    10.3.2 Quasi-Differencing
    10.3.3 Seemingly Unrelated Regression Techniques
11 Vector Autoregression Analysis
  11.1 Standard-Form Estimation
  11.2 Moving Average Representation
  11.3 Identification
    11.3.1 Choleski Decompositions
    11.3.2 Structural Decompositions
    11.3.3 Blanchard-Quah Decompositions
  11.4 Bootstrapping Confidence Intervals
12 Forecasting
  12.1 Trend Projections
  12.2 ARIMA Forecasts
  12.3 OLS Forecasts
  12.4 Near-VAR Forecasts

Chapter 1

Introduction

The purpose of this manual is to show the reader how to use the free program XLispStat to do basic statistical and econometric analysis. It has evolved into something of a tutorial for those interested in learning basic statistics and econometrics. A small amount of Lisp programming, which a diligent reader will learn how to do in a few hours, will be required. For students and other beginners, this will provide a good background for subsequently learning how to cope with the commercial statistical and computer programs that one often eventually needs to use. For day-to-day work, and even to learn the basics, the reader can work through the small manual I have written and the exercises and examples there referred to, consulting this big manual for details and for deeper and more sophisticated issues.

XLispStat is a wonderfully rich platform for statistical computing written by Luke Tierney at the University of Minnesota. Its depth far exceeds what is utilised here.
Those who, having worked through this manual, want to get really serious about XLispStat are advised to get Luke Tierney's book.[1] I believe that one could program in XLispStat routines equivalent in purpose and result to virtually anything commonly found in commercial econometrics software. Functions are already present for non-linear least squares, maximisation, maximum likelihood estimation and Bayesian computations, along with object-oriented methods for data handling and graphics. So most of the work required would involve adapting these existing resources to the needs at hand. And one of the best ways to develop an understanding of statistical and econometric techniques is to program the routines oneself. Indeed, what follows would not have been written but for my inclination to try to find out what is really happening when my favourite commercial program, RATS, is performing its calculations.

[1] Luke Tierney, Lisp-Stat: An Object-Oriented Environment for Statistical Computing and Dynamic Graphics, Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons, 1990.

Nearly all of the actual econometrics functions used here, as well as many data handling routines, were written by me and are available from my web-site in the file addfuncs.lsp. That file must be loaded into the workspace after loading XLispStat and before working through any material presented in this manual. Readers are free to work through and modify any of the materials in that file and add new ones as desired, thereby making the program their own.

An MS-Windows version of XLispStat can be obtained by following the appropriate links on my web-site www.economics.utoronto.ca/floyd. You will need the self-extracting zip files wxls32zp.exe, which contains the program itself, xlispdf.exe, which contains the data referred to in this manual, and exexamp.exe, which contains some exercises and example programs.
The file addfuncs.lsp must, of course, also be obtained, along with the XLispStat file maximize.lsp that will be needed for maximum likelihood estimation. A version of XLispStat for Apple computers can be obtained by searching the Web, as can Linux versions for most distributions. The data used here are made available for Linux versions in the file xlispdf.tar.gz, and the exercise and example files are in the file exeamp.tar.gz.

The next chapter provides a simple guide to working with and writing programs in XLispStat. Everything you would need to program all functions created here is explained. Chapter 3 sets out the procedures that enable us to describe properly the data we are working with, and Chapter 4 focuses on hypothesis testing, beginning with a discussion of probability distributions.

Chapter 5 introduces regression analysis, starting with a discussion of how to run regressions using matrix calculations. The XLispStat function called regression-model is explored next. This function is important because it is used repeatedly in all functions I wrote that use OLS regression calculations although, used alone, its output presentation is too crude for day-to-day work. Heteroskedastic and autocorrelated residuals are then discussed, followed by the problem of multicollinearity. Finally, three new functions to perform OLS regressions on cross-sectional and time-series data are presented, along with some additional functions to simplify the process of adjusting the lengths of time-series and setting up lagged values.

Chapters 6, 7, and 8 deal respectively with panel-data analysis, instrumental variables, and logit and probit estimation. Of these topics, only instrumental variables estimation has been used in my own research, so the other two chapters contain only very rudimentary analyses.
All three chapters are based on the introductory textbook by Stock and Watson and the data sets there referred to.[2] More sophisticated extensions dealing with panel-data and logit and probit analysis await my finding a joint author whose main research uses these techniques.

My own focus on time-series analysis accounts for the intensive examination in Chapter 9 of how to test for stationarity and cope with problems of spurious regression. I find my functions dealing with these issues, particularly the tests for stationarity and cointegration, more useful for my purposes than those in most commercial programs. Students working through this chapter, and the references to the textbooks by Enders and Hamilton, should find the effort helpful in understanding the basics of cointegration analysis.[3]

Chapter 10 deals with a number of additional topics in regression analysis that have been important in my own work. These are joint hypothesis tests, non-nested hypothesis tests and generalised least squares estimation of seemingly-unrelated regressions.

An extensive treatment of vector autoregression analysis is the subject of Chapter 11, again reflecting the importance of this area in my own empirical work. Students should find these materials, and the references on which they are based, useful in developing an understanding of the basics in this area. And the functions I present can do all types of VARs, along with bootstrapped confidence intervals, though admittedly not as elegantly as RATS.

The final chapter deals in a rudimentary way with forecasting time series. There is no pretension of competing with business software, but the functions provided are useful in making pseudo (in-sample) forecasts of agents' expected levels of variables, from which unanticipated shocks to these variables can be calculated and used in econometric analysis. The forecasting of variables beyond the period for which data are available is also briefly discussed.
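
The calculation of unanticipated shocks from pseudo (in-sample) forecasts, mentioned above, can be written compactly. What follows is only a sketch, and the notation is assumed here rather than taken from the manual: x_t denotes the observed value of a variable in period t, \hat{x}_{t|t-1} denotes the pseudo forecast of that value formed from information available through period t-1, and u_t denotes the resulting shock, which is simply the forecast error.

```latex
% Assumed notation (not from the manual):
%   x_t              observed value in period t
%   \hat{x}_{t|t-1}  in-sample forecast of x_t using data through t-1
%   u_t              unanticipated shock (forecast error)
u_t = x_t - \hat{x}_{t \mid t-1}, \qquad t = 1, \dots, T
```

Under this reading, a positive u_t means the variable turned out higher than agents are assumed to have expected, and the series of u_t values is what would enter a subsequent regression as the unanticipated component.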