Econometrics I Midterm Examination Fall 2004 Answer Key

Please answer all of the questions and show all of your work. If you think that a question is ambiguous, clearly state how you interpret it before providing an answer.

1. (28 points) Consider the regression function
$$y = \beta_0 + \beta_1 x^{\beta_2} + \varepsilon,$$
where $x$ is a scalar individual characteristic that is always positive, $\varepsilon$ is a random variable that is independently and identically distributed (i.i.d.) in the population as $N(0, \sigma_\varepsilon^2)$, and $\beta = (\beta_0\ \beta_1\ \beta_2)'$ is a vector of unknown parameters. You have access to a random sample of $N$ observations containing information on $(y, x)$ drawn from this population.

(a) Define the nonlinear least squares estimator of $\beta$ and $\sigma_\varepsilon^2$ for this problem.

The disturbance term is homoskedastic, so we define
$$(\hat\beta_0\ \hat\beta_1\ \hat\beta_2)' = \arg\min_{\beta} \sum_{i=1}^{N} \left(y_i - \beta_0 - \beta_1 x_i^{\beta_2}\right)^2$$
for the estimator of the $\beta$ vector. A consistent estimator of $\sigma_\varepsilon^2$ is given by
$$\hat\sigma_\varepsilon^2 = N^{-1} \sum_{i=1}^{N} \left(y_i - \hat\beta_0 - \hat\beta_1 x_i^{\hat\beta_2}\right)^2.$$

(b) Are all of the parameters identified, or are some restrictions on the parameter space required to ensure identification of the model?

If $\beta_2 = 0$, then $x_i^0 = 1$ for all $i$ and the NLLS estimator of $\beta_0 + \beta_1$ is the sample mean ($\beta_0$ and $\beta_1$ cannot be individually identified). Similarly, if $\beta_1 = 0$, then $\beta_2$ is not identified. Thus the identification conditions are that (1) the parameter space excludes the values $\beta_1 = 0$ and $\beta_2 = 0$, and (2) $x_i$ is not constant.

(c) Derive the log likelihood function and define the maximum likelihood estimators of the model parameters. How does this estimator compare to the one you defined in (a)?

Writing $\sigma$ for $\sigma_\varepsilon$, the log likelihood function is given by
$$\ln L = -N \ln \sigma + \sum_{i=1}^{N} \ln \phi\left(\frac{y_i - \beta_0 - \beta_1 x_i^{\beta_2}}{\sigma}\right),$$
where $\phi$ is the standard normal p.d.f. The m.l. estimator of the conditional mean for the normal minimizes the sum of squared deviations, just as was the case for the least squares estimator. In this case the conditional mean function happens to be nonlinear in the parameters, but the "quadratic loss" criterion is the same.
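The claim that the normal m.l. estimator of the conditional mean coincides with nonlinear least squares can be checked numerically. The following is a minimal sketch on simulated data, not part of the original answer: the "true" parameter values, the sample size, and the helper names `profile_ols` and `log_lik` are all assumptions made for illustration. It exploits the fact that for a fixed $\beta_2$ the model is linear in $(\beta_0, \beta_1)$, so a one-dimensional grid search over $\beta_2$ is enough.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" values, chosen only for illustration.
N = 400
b0, b1, b2, sigma = 1.0, 2.0, 0.5, 0.3
x = rng.uniform(0.5, 5.0, N)                      # x is always positive
y = b0 + b1 * x**b2 + sigma * rng.standard_normal(N)

def profile_ols(beta2):
    """For fixed beta2 the model is linear in (beta0, beta1), so use OLS."""
    X = np.column_stack([np.ones(N), x**beta2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = float(np.sum((y - X @ coef) ** 2))
    return coef, ssr

def log_lik(beta, s):
    """ln L = -N ln(s) + sum_i ln phi((y_i - b0 - b1 * x_i**b2) / s)."""
    z = (y - beta[0] - beta[1] * x**beta[2]) / s
    return -N * np.log(s) + np.sum(-0.5 * np.log(2 * np.pi) - 0.5 * z**2)

# Grid over beta2 that excludes beta2 = 0, where beta0 and beta1 are not
# separately identified.
grid = np.linspace(0.05, 2.0, 391)
fits = [profile_ols(g) for g in grid]
ssr = np.array([f[1] for f in fits])

# NLS: minimise the sum of squared residuals.
k_nls = int(np.argmin(ssr))
beta_nls = np.append(fits[k_nls][0], grid[k_nls])
sigma2_nls = ssr[k_nls] / N

# ML: maximise ln L, with sigma set to its maximiser sqrt(SSR / N).
ll = np.array([log_lik(np.append(f[0], g), np.sqrt(f[1] / N))
               for f, g in zip(fits, grid)])
k_ml = int(np.argmax(ll))
beta_ml = np.append(fits[k_ml][0], grid[k_ml])
sigma2_ml = ssr[k_ml] / N

print("NLS:", beta_nls, sigma2_nls)
print("ML: ", beta_ml, sigma2_ml)
```

Because the concentrated log likelihood is a decreasing function of the sum of squared residuals, both searches should select the same grid point; a grid search is used here only to keep the sketch short, and a gradient-based optimizer would be the usual choice in practice.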
Since the estimators of the conditional mean function are identical, it is not surprising that the estimators of the variance parameter are also identical.

(d) Assume that only the sign of $y$ is observable, i.e., define
$$d = \begin{cases} 1 & \text{iff } y > 0 \\ 0 & \text{iff } y \le 0. \end{cases}$$
Write down the log likelihood function corresponding to this model, and define the maximum likelihood estimators of the model parameters. Which parameters are identified given this information, and under what conditions?

In this case, we get a nonlinear (in the parameters) probit model. The probability of $d = 1$ is simply
$$\Phi\left(\frac{\beta_0 + \beta_1 x_i^{\beta_2}}{\sigma}\right),$$
so the log likelihood function is
$$\ln L = \sum_{i=1}^{N} \left\{ d_i \ln \Phi\left(\frac{\beta_0 + \beta_1 x_i^{\beta_2}}{\sigma}\right) + (1 - d_i) \ln\left(1 - \Phi\left(\frac{\beta_0 + \beta_1 x_i^{\beta_2}}{\sigma}\right)\right) \right\}.$$
From inspection of the log likelihood, we see that, as in the linear-in-the-parameters case, only the functions $\beta_0/\sigma$ and $\beta_1/\sigma$ are identified. Given these two estimable functions of the parameters, the parameter $\beta_2$ is identified. We still require the identification conditions discussed in part (b).

2. (42 points) A dependent variable $y$ is deterministically related to a scalar $x$ as follows:
$$y = \beta_0 + \beta_1 x.$$

(a) Assume that in the population $\beta_1$ is a constant while $\beta_0$ is independently and identically distributed with mean $\mu_{\beta_0}$ and variance $\sigma_{\beta_0}^2$. You have access to a random sample of $N$ observations containing the information $\{y_i, x_i\}_{i=1}^{N}$. Derive unbiased estimators for all of the unknown model parameters.

We simply rewrite the regression function as
$$y = \mu_{\beta_0} + \beta_1 x + \varepsilon,$$
where $\varepsilon = \beta_0 - \mu_{\beta_0}$. Then by the assumptions on $\beta_0$, $\varepsilon$ is independent of $x$, so OLS yields unbiased estimators for $\mu_{\beta_0}$ and $\beta_1$. The sum of squared OLS residuals divided by $N - 2$ (two mean parameters are estimated) is an unbiased estimator of $\sigma_{\beta_0}^2$.

(b) Consider a generalization of the model in which $\beta_1$ is treated as a random variable in the population, which is independent of $\beta_0$. $\beta_1$ is assumed to be i.i.d., with mean $\mu_{\beta_1}$ and variance $\sigma_{\beta_1}^2$. Define consistent estimators of all unknown model parameters, if they exist.
The model now becomes
$$y = \mu_{\beta_0} + \mu_{\beta_1} x + (\beta_1 - \mu_{\beta_1}) x + (\beta_0 - \mu_{\beta_0}),$$
so the "new" disturbance term is
$$u = (\beta_1 - \mu_{\beta_1}) x + (\beta_0 - \mu_{\beta_0}).$$
By definition we have that $E(u \mid x) = 0$ for all $x$. Then the OLS estimators of the conditional mean function are unbiased for $\mu_{\beta_0}$ and $\mu_{\beta_1}$. Because $\beta_0$ and $\beta_1$ are independent, $\mathrm{Var}(u \mid x) = \sigma_{\beta_0}^2 + \sigma_{\beta_1}^2 x^2$. Define the residuals from the first-stage OLS regression by $r_i$. Then form the estimator
$$\hat\gamma = \arg\min_{\gamma} N^{-1} \sum_{i=1}^{N} \left(r_i^2 - \gamma_1 - \gamma_2 x_i^2\right)^2.$$
Then $\hat\gamma_1$ is a consistent estimator of $\sigma_{\beta_0}^2$ and $\hat\gamma_2$ is a consistent estimator of $\sigma_{\beta_1}^2$.

(c) Given your response to (b), is it possible to define more efficient estimators for $\mu_{\beta_0}$ and $\mu_{\beta_1}$? If so, define these estimators. If not, argue why the estimators defined in (b) are efficient given the sample information and the model.

We can use the consistent estimates of $\sigma_{\beta_0}^2$ and $\sigma_{\beta_1}^2$ to form consistent estimates of the covariance matrix of the disturbances, and then form the Feasible GLS estimator. Define
$$\hat w_i = \left(\hat\gamma_1 + \hat\gamma_2 x_i^2\right)^{.5}.$$
Then the FGLS estimates of the conditional mean function are given by
$$(\hat\mu_{\beta_0}\ \hat\mu_{\beta_1}) = \arg\min \sum_{i=1}^{N} \frac{\left(y_i - \mu_{\beta_0} - \mu_{\beta_1} x_i\right)^2}{\hat w_i^2}.$$
These are (asymptotically) efficient in the class of LS estimators.

(d) Describe tests, formal and/or informal, that would allow you to investigate whether $\beta_1$ was a constant in the population (i.e., $\sigma_{\beta_1}^2 = 0$).

We are looking for systematic heteroskedasticity, in that the conditional variance is posited to be a linear function of $x^2$. We could plot the squared residuals as a function of $x_i^2$ to see if there was any evidence of this relationship. More formally, we can estimate the linear regression model
$$r_i^2 = a + b x_i^2 + \xi_i.$$
The OLS estimators of $a$ and $b$ are consistent under the null. The random variable $\xi_i$ may be heteroskedastic, so in large samples we can compute the standard errors of the regression estimates using the Huber-Eicker-White method. Then $\hat b / \mathrm{s.e.}(\hat b)$ is asymptotically distributed as a $N(0, 1)$ under the null.

(e)
Say that we are willing to assume that both $\beta_0$ and $\beta_1$ are independently distributed as $N(\mu_{\beta_i}, \sigma_{\beta_i}^2)$, $i = 0, 1$. Write down the log likelihood function for this model, and determine whether all parameters are identified.

The log likelihood function is given by
$$\ln L = \sum_{i=1}^{N} \left\{ -.5 \ln\left(\sigma_{\beta_0}^2 + \sigma_{\beta_1}^2 x_i^2\right) + \ln \phi\left(\frac{y_i - \mu_{\beta_0} - \mu_{\beta_1} x_i}{\sqrt{\sigma_{\beta_0}^2 + \sigma_{\beta_1}^2 x_i^2}}\right) \right\}.$$
Identification issues are essentially the same as in the regression case. The main difference is that the estimates are all computed in one step instead of two (or three, when we compute the FGLS estimator). Identification could be verified by computing the matrix of first partials and confirming that it is of full column rank.

(f) If you had to choose between using the estimators you defined in (c) or the maximum likelihood estimator defined in (e), which would you prefer? On what factors would your choice depend?

If the normality assumption is correct, the ML estimator is consistent and asymptotically efficient. Then if the sample is large and one has some faith in the normality assumption, the ML estimator may be appropriate. The LS estimators are consistent under more general conditions than normality. Since differences in the asymptotic standard errors are often negligible, the LS estimators may be preferable even in large samples. In small samples, use of the LS estimators would pretty clearly be indicated.

3. (30 points) Unemployed individuals look for jobs according to the model of continuous time search in stationary environments presented in class. We showed that for an individual facing labor market parameters $\theta = \{\rho, b, F, \lambda, \eta\}$, their optimal policy was to accept any wage offer that was greater than $w^*$, where this constant was the solution to
$$w^* = b + \frac{\lambda}{\rho + \eta} \int_{w^*} (w - w^*)\, dF(w),$$
and where
$b$ is the utility flow in the unemployment state
$\lambda$ is the rate of meeting potential employers
$\eta$ is the rate of being dismissed at a job
$\rho$ is the instantaneous discount rate
$F$ is the wage offer distribution.
Assume you have access to a random sample of $N$ individuals which contains the following information for each sample member:
$t_u$ the length of a complete unemployment spell
$t_e$ the length of the complete employment spell that follows the unemployment spell
$w$ the wage rate associated with the employment spell (which is constant over the spell).
Further assume that the distribution of wage offers is negative exponential, i.e.,
$$F(w) = 1 - \exp(-\alpha w), \quad w > 0, \ \alpha > 0,$$
$$f(w) = \alpha \exp(-\alpha w),$$
where $F$ and $f$ are the c.d.f. and p.d.f., respectively.

(a) Write down the log likelihood function for the sample.

We observe one unemployment spell followed by an employment spell with a wage observation. Both duration distributions are negative exponential, with the parameter of the unemployment duration distribution being given by $h_u = \lambda \tilde F(w^*)$, and the parameter of the employment duration distribution given by $h_e = \eta$.
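To make the reservation wage concrete under the exponential offer distribution, note that $\int_{w^*}^{\infty} (w - w^*)\, \alpha \exp(-\alpha w)\, dw = \exp(-\alpha w^*)/\alpha$, so $w^*$ solves $w^* = b + \frac{\lambda}{\rho + \eta} \cdot \frac{\exp(-\alpha w^*)}{\alpha}$. The sketch below solves this equation by bisection and then evaluates the two hazard rates. It rests on assumptions not stated in the text: the integral is taken over $[w^*, \infty)$, $\tilde F = 1 - F$ is read as the survivor function, $b \ge 0$, and the parameter values and function names are hypothetical.

```python
import math

def expected_surplus(w_star, alpha):
    """Closed form of the integral of (w - w*) dF(w) over [w*, inf)
    when F(w) = 1 - exp(-alpha * w)."""
    return math.exp(-alpha * w_star) / alpha

def reservation_wage(b, lam, eta, rho, alpha):
    """Solve w* = b + lam / (rho + eta) * expected_surplus(w*) by bisection.

    Assumes b >= 0, so that the root lies in
    [b, b + lam / ((rho + eta) * alpha)].
    """
    def g(w):
        # g is strictly increasing in w, so it has a single root.
        return w - b - lam / (rho + eta) * expected_surplus(w, alpha)

    lo, hi = b, b + lam / ((rho + eta) * alpha)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameter values, for illustration only.
b, lam, eta, rho, alpha = 0.5, 0.3, 0.02, 0.05, 0.2

w_star = reservation_wage(b, lam, eta, rho, alpha)
h_u = lam * math.exp(-alpha * w_star)   # lam * (1 - F(w*)), assuming F~ = 1 - F
h_e = eta

print("reservation wage:", w_star)
print("unemployment hazard h_u:", h_u)
print("employment hazard h_e:", h_e)
```

Bisection is used because the equation in $w^*$ is monotone, which guarantees a unique root in the bracket; any standard root finder would serve equally well.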
