Econometrics Streamlined, Applied and E-Aware
Total Page:16
File Type:pdf, Size:1020Kb
Econometrics Streamlined, Applied and e-Aware Francis X. Diebold University of Pennsylvania Edition 2014 Version Thursday 31st July, 2014 Econometrics Econometrics Streamlined, Applied and e-Aware Francis X. Diebold Copyright c 2013 onward, by Francis X. Diebold. This work is licensed under the Creative Commons Attribution-NonCommercial- NoDerivatives 4.0 International License. (Briefly: I retain copyright, but you can use, copy and distribute non-commercially, so long as you give me attribution and do not modify.) To view a copy of the license, visit http://creativecommons.org/licenses/by-nc-nd/4.0. To my undergraduates, who continually surprise and inspire me Brief Table of Contents About the Author xxi About the Cover xxiii Guide to e-Features xxv Acknowledgments xxvii Preface xxxv 1 Introduction to Econometrics1 2 Graphical Analysis of Economic Data 11 3 Sample Moments and Their Sampling Distributions 27 4 Regression 41 5 Indicator Variables in Cross Sections and Time Series 73 6 Non-Normality, Non-Linearity, and More 105 7 Heteroskedasticity in Cross-Section Regression 143 8 Serial Correlation in Time-Series Regression 151 9 Heteroskedasticity in Time-Series Regression 207 10 Endogeneity 241 11 Vector Autoregression 243 ix x BRIEF TABLE OF CONTENTS 12 Nonstationarity 279 13 Big Data: Selection, Shrinkage and More 317 I Appendices 325 A Construction of the Wage Datasets 327 B Some Popular Books Worth Reading 331 Detailed Table of Contents About the Author xxi About the Cover xxiii Guide to e-Features xxv Acknowledgments xxvii Preface xxxv 1 Introduction to Econometrics1 1.1 Welcome..............................1 1.1.1 Who Uses Econometrics?.................1 1.1.2 What Distinguishes Econometrics?...........3 1.2 Types of Recorded Economic Data...............3 1.3 Online Information and Data..................4 1.4 Software..............................5 1.5 Tips on How to use this book..................7 1.6 Exercises, Problems and Complements.............8 1.7 Historical and Computational Notes............... 10 1.8 Concepts for Review....................... 10 2 Graphical Analysis of Economic Data 11 2.1 Simple Techniques of Graphical Analysis............ 11 2.1.1 Univariate Graphics................... 12 2.1.2 Multivariate Graphics.................. 12 2.1.3 Summary and Extension................. 15 2.2 Elements of Graphical Style................... 17 2.3 U.S. Hourly Wages........................ 18 xi xii DETAILED TABLE OF CONTENTS 2.4 Concluding Remarks....................... 19 2.5 Exercises, Problems and Complements............. 20 2.6 Historical and Computational Notes............... 23 2.7 Concepts for Review....................... 23 2.8 Graphics Legend: Edward Tufte................. 25 3 Sample Moments and Their Sampling Distributions 27 3.1 Populations: Random Variables, Distributions and Moments. 27 3.1.1 Univariate......................... 27 3.1.2 Multivariate........................ 30 3.2 Samples: Sample Moments.................... 31 3.2.1 Univariate......................... 31 3.2.2 Multivariate........................ 34 3.3 Finite-Sample and Asymptotic Sampling Distributions of the Sample Mean........................... 34 3.3.1 Exact Finite-Sample Results............... 34 3.3.2 Approximate Asymptotic Results (Under Weaker As- sumptions)........................ 35 3.4 Exercises, Problems and Complements............. 36 3.5 Historical and Computational Notes............... 38 3.6 Concepts for Review....................... 38 4 Regression 41 4.1 Preliminary Graphics....................... 41 4.2 Regression as Curve Fitting................... 41 4.2.1 Bivariate, or Simple, Linear Regression......... 41 4.2.2 Multiple Linear Regression................ 46 4.2.3 Onward.......................... 47 4.3 Regression as a Probability Model................ 47 4.3.1 A Population Model and a Sample Estimator..... 47 4.3.2 Notation, Assumptions and Results........... 48 A Bit of Matrix Notation................. 49 Assumptions: The Full Ideal Conditions (FIC)..... 50 Results........................... 51 4.4 A Wage Equation......................... 52 4.4.1 Mean dependent var 2.342................ 54 4.4.2 S.D. dependent var .561................. 55 DETAILED TABLE OF CONTENTS xiii 4.4.3 Sum squared resid 319.938................ 55 4.4.4 Log likelihood -938.236.................. 55 4.4.5 F statistic 199.626.................... 56 4.4.6 Prob(F statistic) 0.000000............... 56 4.4.7 S.E. of regression .492.................. 56 4.4.8 R-squared .232...................... 57 4.4.9 Adjusted R-squared .231............... 58 4.4.10 Akaike info criterion 1.423................ 58 4.4.11 Schwarz criterion 1.435.................. 58 4.4.12 Hannan-Quinn criter. 1.427............... 59 4.4.13 Durbin-Watson stat. 1.926................ 59 4.4.14 The Residual Scatter................... 60 4.4.15 The Residual Plot..................... 60 4.5 Exercises, Problems and Complements............. 61 4.6 Historical and Computational Notes............... 66 4.7 Regression's Inventor: Carl Friedrich Gauss........... 67 4.8 Concepts for Review....................... 67 5 Indicator Variables in Cross Sections and Time Series 73 5.1 Cross Sections: Group Effects.................. 73 5.1.1 0-1 Dummy Variables................... 73 5.1.2 Group Dummies in the Wage Regression........ 75 5.2 Time Series: Trend and Seasonality............... 76 5.2.1 Linear Trend....................... 76 5.2.2 Seasonality........................ 79 Seasonal Dummies.................... 79 More General Calendar Effects............. 81 5.2.3 Trend and Seasonality in Liquor Sales.......... 81 5.3 Structural Change........................ 83 5.3.1 Gradual Parameter Evolution.............. 83 5.3.2 Sharp Parameter Breaks................. 84 Exogenously-Specified Breaks.............. 84 Endogenously-Selected Breaks.............. 86 5.3.3 Liquor Sales........................ 87 5.4 Panel Regression......................... 87 5.4.1 Panel Data........................ 88 xiv DETAILED TABLE OF CONTENTS 5.4.2 Individual and Time Effects............... 89 Individual Effects..................... 89 Time Effects........................ 89 Individual and Time Effects............... 90 5.5 Binary Regression......................... 90 5.5.1 Binary Response..................... 90 5.5.2 The Logit Model..................... 91 Logit............................ 91 Ordered Logit....................... 92 Dynamic Logit...................... 93 Complications....................... 93 5.5.3 Classification and \0-1 Forecasting"........... 94 5.6 Exercises, Problems and Complements............. 94 5.7 Historical and Computational Notes............... 101 5.8 Concepts for Review....................... 101 5.9 Dummy Variables, ANOVA, and Sir Ronald Fischer...... 103 6 Non-Normality, Non-Linearity, and More 105 6.1 Non-Normality.......................... 106 6.1.1 \y Problems" (via "): Non-Normality.......... 106 OLS Without Normality................. 106 Assessing Residual Non-Normality............ 107 Nonparametric Density Estimation........... 107 Outliers.......................... 108 6.1.2 \X Problems": Measurement Error and Multicollinearity109 Measurement Error.................... 109 Collinearity and Multicollinearity............ 110 Leverage.......................... 111 6.1.3 Estimation......................... 111 Robustness Iteration................... 112 Least Absolute Deviations................ 112 6.1.4 Wages and Liquor Sales................. 113 Wages........................... 113 Liquor Sales........................ 113 6.2 Non-Linearity........................... 114 6.2.1 Parametric Non-Linearity................ 114 DETAILED TABLE OF CONTENTS xv Models Linear in Transformed Variables........ 114 Intrinsically Non-Linear Models............. 116 Interactions........................ 117 A Final Word on Nonlinearity and the FIC....... 117 Testing for Non-Linearity................ 118 Non-Linearity in Wage Determination.......... 119 Non-linear Trends..................... 122 More on Non-Linear Trend................ 125 Non-Linearity in Liquor Sales Trend.......... 126 6.2.2 Non-Parametric Non-Linearity.............. 127 Series Expansions..................... 127 Step Functions (Dummies)................ 128 Regression Splines.................... 129 Smoothing Splines.................... 130 Local Polynomial Regression............... 131 Neural Networks..................... 132 Kernel Regression..................... 133 6.2.3 Dimensions and Interactions............... 133 Higher Dimensions: Generalized Additive Models... 133 Dummies......................... 133 Trees............................ 133 Forrests.......................... 133 Bagging.......................... 133 Boosting.......................... 133 6.3 Exercises, Problems and Complements............. 133 6.4 Historical and Computational Notes............... 140 6.5 Concepts for Review....................... 140 7 Heteroskedasticity in Cross-Section Regression 143 7.1 Exercises, Problems and Complements............. 149 7.2 Historical and Computational Notes............... 150 7.3 Concepts for Review....................... 150 8 Serial Correlation in Time-Series Regression 151 8.1 Observed Time Series....................... 151 8.1.1 Characterizing Time-Series Dynamics.......... 151 Covariance Stationary Time Series........... 152 xvi DETAILED TABLE OF CONTENTS 8.1.2 White Noise........................ 155 8.1.3 Estimation and Inference for the Mean, Autocorrela- tion and Partial Autocorrelation Functions....... 159