Econometric Theory
Total Page:16
File Type:pdf, Size:1020Kb
Econometric Theory John Stachurski January 10, 2014 Contents Preface v I Background Material1 1 Probability2 1.1 Probability Models.............................2 1.2 Distributions................................. 16 1.3 Dependence................................. 25 1.4 Asymptotics................................. 30 1.5 Exercises................................... 39 2 Linear Algebra 49 2.1 Vectors and Matrices............................ 49 2.2 Span, Dimension and Independence................... 59 2.3 Matrices and Equations........................... 66 2.4 Random Vectors and Matrices....................... 71 2.5 Convergence of Random Matrices.................... 74 2.6 Exercises................................... 79 i CONTENTS ii 3 Projections 84 3.1 Orthogonality and Projection....................... 84 3.2 Overdetermined Systems of Equations.................. 90 3.3 Conditioning................................. 93 3.4 Exercises................................... 103 II Foundations of Statistics 107 4 Statistical Learning 108 4.1 Inductive Learning............................. 108 4.2 Statistics................................... 112 4.3 Maximum Likelihood............................ 120 4.4 Parametric vs Nonparametric Estimation................ 125 4.5 Empirical Distributions........................... 134 4.6 Empirical Risk Minimization....................... 137 4.7 Exercises................................... 149 5 Methods of Inference 153 5.1 Making Inference about Theory...................... 153 5.2 Confidence Sets............................... 155 5.3 Hypothesis Tests.............................. 160 5.4 Use and Misuse of Testing......................... 170 5.5 Exercises................................... 172 6 Linear Least Squares 175 6.1 Least Squares................................ 175 6.2 Transformations of the Data........................ 179 6.3 Goodness of Fit............................... 183 6.4 Exercises................................... 188 JOHN STACHURSKI January 10, 2014 CONTENTS iii III Econometric Models 194 7 Classical OLS 195 7.1 The Model.................................. 195 7.2 Variance and Bias.............................. 198 7.3 The FWL Theorem............................. 201 7.4 Normal Errors................................ 207 7.5 When the Assumptions Fail........................ 215 7.6 Exercises................................... 219 8 Time Series Models 226 8.1 Some Common Models........................... 226 8.2 Dynamic Properties............................. 233 8.3 Maximum Likelihood for Markov Processes............... 248 8.4 Models with Latent Variables....................... 256 8.5 Exercises................................... 257 9 Large Sample OLS 264 9.1 Consistency................................. 264 9.2 Asymptotic Normality........................... 269 9.3 Exercises................................... 273 10 Further Topics 276 10.1 Model Selection............................... 276 10.2 Method of Moments............................ 292 10.3 Breaking the Bank.............................. 293 10.4 Exercises................................... 293 JOHN STACHURSKI January 10, 2014 CONTENTS iv IV Appendices 294 11 Appendix A: An R Primer 295 11.1 The R Language............................... 295 11.2 Variables and Vectors............................ 299 11.3 Graphics................................... 305 11.4 Data Types.................................. 311 11.5 Simple Regressions in R.......................... 319 12 Appendix B: More R Techniques 323 12.1 Input and Output.............................. 323 12.2 Conditions.................................. 330 12.3 Repetition.................................. 335 12.4 Functions................................... 340 12.5 General Programming Tips........................ 345 12.6 More Statistics................................ 348 12.7 Exercises................................... 358 13 Appendix C: Analysis 360 13.1 Sets and Functions............................. 360 13.2 Optimization................................. 365 13.3 Logical Arguments............................. 366 JOHN STACHURSKI January 10, 2014 Preface This is a quick course on modern econometric and statistical theory, focusing on fundamental ideas and general principles. The course is intended to be concise— suitable for learning concepts and results—rather than an encyclopedic treatment or a reference manual. It includes background material in probability and statistics that budding econometricians often need to build up. Most of the topics covered here are standard, and I have borrowed ideas, results and exercises from many sources, usually without individual citations to that effect. Some of the large sample theory is new—although similar results have doubtless been considered elsewhere. The large sample theory has been developed to illu- minate more clearly the kinds of conditions that allow the law of large numbers and central limit theorem to function, and to make large sample theory accessible to students without knowledge of measure theory. The other originality is in the presentation of otherwise standard material. In my humble opinion, many of the mathematical arguments found here are neater, more precise and more insightful than other econometrics texts at a similar level. Finally, the course also teaches programming techniques via an extensive set of ex- amples, and a long discussion of the statistical programming environment R in the appendix. Even if only for the purpose of understanding theory, good programming skills are important. In fact, one of the best ways to understand a result in econo- metric theory is to first work your way through the proof, and then run a simulation which shows the theory in action. These notes have benefitted from the input of many students. In particular, I wish to thank without implication Blair Alexander, Frank Cai, Yiyong Cai, Patrick Carvalho, Paul Kitney, Bikramaditya Datta, Stefan Webb and Varang Wiriyawit. v Part I Background Material 1 Chapter 1 Probability Probability theory forms the foundation stones of statistics and econometrics. If you want to be a first class statistician/econometrician, then every extra detail of proba- bility theory that you can grasp and internalize will prove an excellent investment. 1.1 Probability Models We begin with the basic foundations of probability theory. What follows will involve a few set operations, and you might like to glance over the results on set operations (and the definition of functions) in x13.1. 1.1.1 Sample Spaces In setting up a model of probability, we usually start with the notion of a sample space, which, in general, can be any nonempty set, and is typically denoted W. We can think of W as being the collection of all possible outcomes in a random experi- ment. A typical element of W is denoted w. The general idea is that a realization of uncertainty will lead to the selection of a particular w 2 W. Example 1.1.1. Let W := f1, . , 6g represent the six different faces of a dice. A realization of uncertainty corresponds to a roll of the dice, with the outcome being an integer w in the set f1, . , 6g. 2 1.1. PROBABILITY MODELS 3 The specification of all possible outcomes W is one part of our probability model. The other thing we need to do is to assign probabilities to outcomes. The obvious thing to do here is to assign a probability to every w in W, but it turns out, for technical reasons beyond the scope of this text (see, e.g., Williams, 1991), that is is not the right way forward. Instead, the standard approach is to assign probabilities to subsets of W. In the language of probability theory, subsets of W are called events. The set of all events is usually denoted by F, and we follow this convention.1 Example 1.1.2. Let W be any sample space. Two events we always find in F are W itself and the empty set Æ. (The empty set is regarded as being a subset of every set, and hence Æ ⊂ W). In this context, W is called the certain event because it always occurs (regardless of which outcome w is selected, w 2 W is true by definition). The empty set Æ is called the impossible event. In all of what follows, if B is an event (i.e., B 2 F), then the notation P(B) will represent the probability that event B occurs. The way you should think about it is this: P(B) represents the probability that when uncertainty is resolved and some w 2 W is selected by “nature,” the statement w 2 B is true. Example 1.1.3. Continuing example 1.1.1, let B be the event f1, 2g. The number P(B) represents the probability that the face w selected by the roll is either 1 or 2. The second stage of our model construction is to assign probabilities to elements of F. In order to make sure our model of probability is well behaved, it’s best to put certain restrictions on P. (For example, we wouldn’t want to have a B with P(B) = −93, as negative probabilities don’t make much sense.) These restrictions are imposed in the next definition: A probability P on (W, F) is a function that associates to each event in F a number in [0, 1], and, in addition, satisfies 1. P(W) = 1, and 2. Additivity: P(A [ B) = P(A) + P(B) whenever A, B 2 F with A \ B = Æ. 1I’m skirting technical details here. In many common situations, we take F to be a proper subset of the set of all subsets of W. In particular, we exclude a few troublesome subsets of W from F, because they are so messy that assigning probabilities to these sets cause problems for the theory. See x1.1.2 for more discussion. JOHN STACHURSKI January 10, 2014