
Chapter 3

Robustness Properties of the Student t Based Pseudo Maximum Likelihood

In Chapter 2, some concepts from the robustness literature were introduced. An important concept was the influence function. In the present chapter, the influence function is used to assess the robustness properties of the Student t based pseudo maximum likelihood estimator with estimated degrees of freedom parameter. This estimator is often employed in the econometric literature as a first relaxation of the usual normality assumption (see, e.g., de Jong et al. (1992), Kleibergen and van Dijk (1993), Prucha and Kelejian (1984), and Spanos (1994)). In this chapter I show that the estimator is nonrobust in the sense that it has an unbounded influence function and an unbounded change-of-variance function[1] if the degrees of freedom parameter is estimated rather than fixed a priori. This result can already be established in the setting of the simple location/scale model and has obvious consequences for other robust estimators that estimate the tuning constant from the data.

At the basis of the above results lies the observation that the score function for the pseudo maximum likelihood estimator for the degrees of freedom parameter is unbounded. As a result, the influence functions of the degrees of freedom and scale estimators are also unbounded. In contrast, the influence function of the location parameter is bounded due to the block-diagonality of the Fisher information matrix under the assumption of symmetry. The change-of-variance function of the estimator for the location parameter, however, is unbounded, suggesting that standard inference procedures for the location parameter are nonrobust if they are based on the Student t pseudo maximum likelihood estimator with estimated degrees of freedom parameter. These results illustrate two basic points. First, one should carefully distinguish between parameters of interest and nuisance parameters when assessing the robustness properties of statistical procedures. Second, if a parameter can be estimated robustly in the sense that an estimator can be constructed with a bounded influence function, this does not imply that the corresponding inference procedure for that parameter is also robust.

[1] See Section 3.2 for a definition of the change-of-variance function.

The chapter is set up as follows. Section 3.1 introduces the model and the pseudo maximum likelihood estimator based on the Student t distribution. Section 3.2 derives the influence function and change-of-variance function and provides a simple finite sample approximation to these functions. Section 3.3 provides a simulation based comparison of several robust and nonrobust estimators for the simple model of Section 3.1. Section 3.4 concludes this chapter.

3.1 The Model and Estimator

Consider the simple location/scale model

$$y_t = \mu + \sigma\varepsilon_t, \qquad (3.1)$$

where μ is the location parameter, σ is the scale parameter, {ε_1, …, ε_T} is a set of i.i.d. drawings with unit scale, and T denotes the sample size. Model (3.1) is extremely simple, but it suffices to illustrate the problems studied in this chapter. The difficulties that arise for (3.1) also show up in more complicated models.

The usual way of estimating μ and σ in (3.1) is by means of ordinary least squares (OLS). This produces the arithmetic sample mean and the sample standard deviation as estimators for μ and σ, respectively. As described in the previous chapter, the standard OLS estimator is sensitive to outliers in the data. One big outlier is enough to corrupt the estimates completely. In order to reduce the sensitivity of the results to outliers, the class of M estimators[2] was proposed by Huber (1964, 1981). In this chapter a specific element from this class is studied, namely the pseudo maximum likelihood estimator based on the Student t distribution (MLT estimator).

As was described in Section 2.3, an M estimator minimizes

$$\sum_{t=1}^{T}\rho(y_t;\mu,\sigma,\nu) \qquad (3.2)$$

with respect to μ and σ, with ρ denoting some smooth function and ν > 0 denoting a user specified tuning constant. In this chapter

$$\rho(y_t;\mu,\sigma,\nu) = -\ln\left(\frac{\Gamma\big(\frac{\nu+1}{2}\big)}{\sqrt{\pi\nu}\,\Gamma\big(\frac{\nu}{2}\big)\,\sigma}\left(1+\frac{(y_t-\mu)^2}{\nu\sigma^2}\right)^{-\frac{\nu+1}{2}}\right), \qquad (3.3)$$

[2] Methods for reducing the possibly bad effects of outliers have a long history, as can be seen from the references in Section 1.3 of Hampel et al. (1986). One of these references dates back to Greek antiquity.

with Γ(·) denoting the gamma function, such that ν can be interpreted as the degrees of freedom parameter of a Student t distribution. The first order conditions for the MLT estimators for μ and σ are ∑_{t=1}^T ψ_μ(y_t) = 0 and ∑_{t=1}^T ψ_σ(y_t) = 0, respectively, with

$$\psi_\mu(y_t) = \frac{(\nu+1)(y_t-\mu)}{\nu\sigma^2+(y_t-\mu)^2} \qquad (3.4)$$

and

$$\psi_\sigma(y_t) = \sigma^{-1}\big((y_t-\mu)\psi_\mu(y_t)-1\big) = \frac{\nu\big((y_t-\mu)^2-\sigma^2\big)}{\sigma\big(\nu\sigma^2+(y_t-\mu)^2\big)}. \qquad (3.5)$$
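For fixed ν, the first-order conditions above can be solved by simple iterative reweighting. The sketch below is my own illustration (it is not code from this chapter) and assumes NumPy; a fixed point of the iteration satisfies ∑_t ψ_μ(y_t) = ∑_t ψ_σ(y_t) = 0, and the weights (ν+1)/(ν+((y_t−μ)/σ)²) make explicit how large residuals are downweighted.

```python
import numpy as np

def mlt_scores(y, mu, sigma, nu):
    """Score functions (3.4) and (3.5) of the Student t pseudo likelihood."""
    r = y - mu
    d = nu * sigma**2 + r**2
    return (nu + 1) * r / d, nu * (r**2 - sigma**2) / (sigma * d)

def mlt_fixed_nu(y, nu, tol=1e-12, max_iter=1000):
    """MLT estimates of (mu, sigma) for a user-specified tuning constant nu."""
    mu = np.median(y)
    sigma = 1.483 * np.median(np.abs(y - mu))          # robust starting values
    for _ in range(max_iter):
        w = (nu + 1) / (nu + ((y - mu) / sigma) ** 2)  # t-based weights
        mu_new = np.average(y, weights=w)
        sigma_new = np.sqrt(np.mean(w * (y - mu_new) ** 2))
        done = abs(mu_new - mu) + abs(sigma_new - sigma) < tol
        mu, sigma = mu_new, sigma_new
        if done:
            break
    return mu, sigma
```

At a solution both average scores are numerically zero, which is a convenient sanity check on any implementation.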

Although ν in the above setup can be regarded as a user specified tuning constant that determines the degree of robustness of the M estimator, it is not unusual to estimate ν together with μ and σ (see the references below). Several estimators for ν are available from the literature. Using the pseudo log likelihood, the most obvious estimator for ν is given by the (pseudo) maximum likelihood (ML) estimator, i.e., the value ν̂ that solves ∑_{t=1}^T ψ_ν(y_t) = 0, with

$$\psi_\nu(y_t) = \frac{1}{2}\left(\psi\Big(\frac{\nu+1}{2}\Big) - \psi\Big(\frac{\nu}{2}\Big) - \frac{1}{\nu} - \ln\Big(1+\frac{(y_t-\mu)^2}{\nu\sigma^2}\Big) + \frac{(y_t-\mu)\psi_\mu(y_t)}{\nu}\right), \qquad (3.6)$$

with ψ(·) the digamma function (ψ(x) = d ln(Γ(x))/dx) and Γ(·) the gamma function.

function. This estimator is used in, e.g., Fraser (1976), Little (1988), and

Lange, Little, and Taylor (1989). Spanos (1994) used an alternative estimator

for  based on the sample co ecient. A third estimator used in the

literature is the one prop osed by Prucha and Kelejian (1984). They emb ed the

family of Student t distributions in a more general class of distributions. Their

estimator for  uses an estimate of the rst absolute of the disturbance

term. Yet another p ossibility for estimating  is by using tail-index estimators

3

for the distribution of y (see, e.g., Gro enendijk, Lucas, and de Vries (1995)).

t

It is easily checked that the estimator of Spanos (1994) for  is nonrobust.

The nonrobustness of this estimator follows from the nonrobust estimation of

the kurtosis co ecient. Similarly, the estimator used by Prucha and Kelejian

[3] It is perhaps useful to note here that (3.3) is closely linked to the assumption of i.i.d. Student t distributed errors in (3.1). Alternatively, one could study (3.1) under the assumption that (ε_1, …, ε_T) has a multivariate Student t distribution with diagonal precision matrix, such that the errors are uncorrelated rather than independent. Zellner (1976, p. 402) proved that in this setting μ, σ, and ν cannot be estimated simultaneously by means of ML if only one realization of {y_t}_{t=1}^T is available. One way to solve this problem is by using several realizations of {y_t}_{t=1}^T, as is possible in a panel data context. One can then construct a suitable estimator for the degrees of freedom parameter (see, e.g., Sutradhar and Ali (1986)).

(1984, p. 731) for estimating the first absolute moment is also not robust to outliers, which results in a nonrobust estimator for ν. Finally, tail-index estimators in their usual implementation are intrinsically nonrobust, because they concentrate on observations in the extreme quantiles of the distribution. So the only remaining candidate for robust estimation of ν is the MLT estimator.[4] The next section demonstrates, however, that the MLT estimator for ν is also nonrobust.

3.2 A Derivation of the Influence Function

Define θ = (μ, σ, ν)^⊤ and ψ(y_t) = (ψ_μ(y_t), ψ_σ(y_t), ψ_ν(y_t))^⊤; then the MLT estimator θ̂ = (μ̂, σ̂, ν̂)^⊤ solves ∑_{t=1}^T ψ(y_t) = 0. This section presents the IF of the MLT estimator. First, some additional notation is needed. Let ψ'(y_t) = ∂ψ(y_t)/∂θ^⊤, with

$$\psi'(y_t) = \begin{pmatrix} \psi_{\mu\mu}(y_t) & \psi_{\mu\sigma}(y_t) & \psi_{\mu\nu}(y_t) \\ \psi_{\sigma\mu}(y_t) & \psi_{\sigma\sigma}(y_t) & \psi_{\sigma\nu}(y_t) \\ \psi_{\nu\mu}(y_t) & \psi_{\nu\sigma}(y_t) & \psi_{\nu\nu}(y_t) \end{pmatrix},$$

and, using the symmetry of ψ'(y_t),

$$\begin{aligned}
\psi_{\mu\mu}(y_t) &= (\nu+1)\big((y_t-\mu)^2-\nu\sigma^2\big)/\big(\nu\sigma^2+(y_t-\mu)^2\big)^2,\\
\psi_{\mu\sigma}(y_t) &= -2\nu\sigma(\nu+1)(y_t-\mu)/\big(\nu\sigma^2+(y_t-\mu)^2\big)^2,\\
\psi_{\mu\nu}(y_t) &= (y_t-\mu)\big((y_t-\mu)^2-\sigma^2\big)/\big(\nu\sigma^2+(y_t-\mu)^2\big)^2,\\
\psi_{\sigma\sigma}(y_t) &= \big((y_t-\mu)\psi_{\mu\sigma}(y_t)-\psi_\sigma(y_t)\big)/\sigma,\\
\psi_{\sigma\nu}(y_t) &= (y_t-\mu)\psi_{\mu\nu}(y_t)/\sigma,
\end{aligned}$$

[4] Of course, one can object that the other estimators can easily be extended such that they become robust to at least some extent. For example, one can try to estimate the first absolute moment of the errors in a robust way, thus `robustifying' the estimator of Prucha and Kelejian (1984). The problem with this approach is that one wants a consistent estimator for ν for the whole class of Student t distributions, as opposed to an estimator that is consistent for only one specific Student t distribution. If one constructs a robust estimator for the first absolute moment by downweighting extreme observations, one will end up estimating ν too high, in general. Therefore, the estimator based upon the robustly estimated first absolute moment has to be multiplied by a correction constant in order to make it a consistent estimator for ν. (This can be compared with the multiplication by 1.483 for the median absolute deviation in order to make it a consistent estimator for the standard deviation of the Gaussian distribution.) The problem now becomes one of choosing the appropriate multiplication constant. If one chooses this constant such that the estimator is consistent for, say, ν = 5, the estimator will in general be inconsistent for all other values of ν.
A different objection can be raised against the use of tail-index estimators. In their usual implementation, these are based on means of observations in the extreme quantiles of the sample. One can robustify these estimators to some extent by replacing the mean by, e.g., the median. As the number of observations on which the tail-index estimators are based is usually quite small, even such revised tail-index estimators have an intrinsically low breakdown point. Therefore, they are not considered further in this chapter.

$$\psi_{\nu\nu}(y_t) = \frac{1}{4}\psi'\Big(\frac{\nu+1}{2}\Big) - \frac{1}{4}\psi'\Big(\frac{\nu}{2}\Big) + \frac{1}{2\nu^2} + \frac{(y_t-\mu)^2}{2\nu\big(\nu\sigma^2+(y_t-\mu)^2\big)} - \frac{(y_t-\mu)\big(\psi_\mu(y_t)-\nu\psi_{\mu\nu}(y_t)\big)}{2\nu^2},$$

with ψ'(·) the derivative of the digamma function. The following assumption is made.

Assumption 3.1 {ε_t} is an i.i.d. process; ε_t has a cumulative distribution function F(ε_t) with probability density function f(ε_t); the probability density function f is continuous and symmetric around zero; there exists an α > 0 such that f(ε_t) = O(ε_t^{−(1+α)}) for large values of ε_t; there exists a θ_0 such that E(ψ(y_t)) = 0 and |E(ψ'(y_t))| ≠ 0, where the expectation is taken with respect to f(ε_t).

The following theorem is easily proved along the lines of Hampel, Ronchetti, Rousseeuw, and Stahel (1986, pp. 85, 101–102).

Theorem 3.1 Given Assumption 3.1 and ν < ∞, the IF of the MLT estimator θ̂ equals

$$IF(y;\hat\theta,F) = -\big(E(\psi'(y_t))\big)^{-1}\psi(y).$$

The IF of θ̂ is thus a linear transformation of the score vector ψ(y_t). From (3.4) and (3.5) it follows that the score functions for μ and σ are bounded. The score function for ν, however, is unbounded, as is demonstrated by (3.6). The following result can be proved.

Theorem 3.2 Given Assumption 3.1 and ν < ∞, the IF of the MLT estimator for μ is bounded, while the IFs of the MLT estimators for σ and ν are unbounded.

Proof. The unboundedness of the IFs of σ̂ and ν̂ follows directly from the unboundedness of ψ_ν(y_t) and the fact that the (2,3)-element of (E(ψ'(y_t)))^{−1} is nonzero, in general. Moreover, from the symmetry of f(ε_t) it follows that E(ψ_μσ(y_t)) = 0 and E(ψ_μν(y_t)) = 0, implying that the IF of μ̂ only depends on y through the score function for μ. As the score function for μ is bounded for finite ν, the theorem is proved. □
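The contrast in Theorem 3.2 is easy to see numerically. The following sketch is my own illustration (assuming NumPy and SciPy): it evaluates the scores at increasingly remote observations. ψ_μ redescends toward zero and never exceeds its maximum (ν+1)/(2σ√ν), while ψ_ν decreases without bound at a logarithmic rate because of the −ln(1+(y−μ)²/(νσ²)) term in (3.6).

```python
import numpy as np
from scipy.special import digamma

def psi_mu(y, mu, sigma, nu):
    """Location score (3.4): bounded and redescending."""
    r = y - mu
    return (nu + 1) * r / (nu * sigma**2 + r**2)

def psi_nu(y, mu, sigma, nu):
    """Degrees-of-freedom score (3.6): the log term makes it unbounded."""
    r2 = (y - mu) ** 2 / sigma**2
    return 0.5 * (digamma((nu + 1) / 2) - digamma(nu / 2) - 1 / nu
                  - np.log1p(r2 / nu)
                  + (y - mu) * psi_mu(y, mu, sigma, nu) / nu)

y = np.array([1.0, 1e2, 1e4, 1e8])
print(psi_mu(y, 0.0, 1.0, 5.0))   # shrinks back toward zero
print(psi_nu(y, 0.0, 1.0, 5.0))   # decreases without bound
```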

Figure 3.1 displays some IFs evaluated at the standard Student t distribution (μ = 0, σ = 1) with 1, 3, and 5 degrees of freedom (ν). The IF of μ̂ has a redescending shape: if |y| becomes large, the IF approaches zero. This means that large outliers have a small influence on the MLT estimator for μ. The IFs of σ̂ and ν̂ are negative and decreasing for sufficiently large values of y. This means, for example, that the estimate of ν is negatively biased if there is a single large outlier in the sample. The negative bias is more severe for higher values of ν. For ν = 1, the IF of ν̂ is almost flat. This is due to the fact that the Cauchy distribution (ν = 1) is already very fat-tailed. Observations as large as y = 12 are not unreasonable if a Cauchy distribution is generating the data. Therefore, the IF has a relatively flat shape for this low value of ν.

The effect of outliers on the scale estimator reveals a similar pattern. Finally, it is interesting to see that for small values of |y| the estimators for both the scale and degrees of freedom parameter demonstrate a negative bias. This is due to the effect of centrally located observations (so-called inliers, as opposed to outliers).

Figure 3.1.— Influence Functions and Change-of-Variance Function of the MLT Estimators for μ, σ, and ν

The unboundedness of the IF of the MLT estimator crucially depends on the fact that ν is estimated rather than fixed. If ν is considered as a tuning constant and fixed at a user specified value, the IFs of the MLT estimators for μ and σ are bounded. Therefore, it is important to specify the parameters of interest.

If ν is a parameter of interest, the nonrobustness of ν̂ is rather discomforting.[5] Solving the nonrobustness of the MLT estimator for ν, or for any of the other estimators mentioned in Section 3.1, is, however, nontrivial. One possibility is to bound the score function for ν as in Theorem 1 of Hampel et al. (1986, p. 117). This requires that one specifies a central model for which the estimator must be consistent. As a result, one can only devise an outlier robust estimator for ν that is consistent for one specific Student t distribution, but inconsistent for any other Student t distribution (compare Footnote 4). If ν really is a parameter of interest, it seems undesirable to have an estimator that is only consistent for one specific, user specified value of ν.

In contrast, if only μ and σ are the parameters of interest, there are fewer difficulties. One can then fix ν and perform the remaining analysis conditional on ν. This corresponds to the strategy that is often followed in the robustness literature. Alternatively, one can estimate ν along with μ and σ and ignore the potential bias in the estimate of ν due to the occurrence of outliers. This second strategy is closely linked to the adaptive estimation literature (see, e.g., Hogg (1974)). An advantage of this strategy is that it allows a weighting of the observations conditional on the sample. If the sample is, for instance, approximately Gaussian, the estimate of ν will be very large and the estimator for μ will be close to the efficient estimator: the sample mean. If, in contrast, there are some severe outliers in the sample, the estimate of ν will be lower and the extreme observations will be weighted less heavily.

If only μ is the parameter of interest, it is not only interesting to know whether μ can be estimated robustly, but also whether robust inference procedures can be constructed for this parameter. In order to answer this question, the sensitivity of the variance of μ̂ to outliers must be assessed. This can be done by means of the change-of-variance function (CVF), introduced by Rousseeuw (1981). The CVF is a concept similar to the IF. Whereas the IF measures the shift in an estimator due to an infinitesimal contamination, the CVF measures the corresponding shift in the variance of the estimator. If both the IF and the CVF of an estimator are bounded, robust inference procedures can be based on the estimator. It has already been shown that the only source of nonrobustness for the MLT estimator stems from the estimation of ν. If ν is fixed by the user, the MLT estimators for μ and σ have a bounded IF. It is rather straightforward to show that for fixed ν these estimators also have a bounded CVF. Moreover, even if ν is estimated, Theorem 3.2 states that μ̂ still has a bounded IF. The only interesting question that is left, therefore, is whether the estimation of ν affects the variance of the estimator μ̂.

In order to define the CVF of the MLT estimator, an expression for the asymptotic variance V of the estimator is needed. From Equation (4.2.2) of Hampel et al. (1986), one obtains

$$V(\hat\theta,F) = E\big(IF(y_t;\hat\theta,F)\,IF(y_t;\hat\theta,F)^\top\big), \qquad (3.7)$$

[5] The nonrobustness of ν̂ is not alarming if the parameter is only used to check whether the sample contains outliers or exhibits leptokurtosis. In that case, the estimate of ν is only used as a diagnostic measure and has no intrinsic meaning. If, however, one is interested in ν as the degrees of freedom parameter of the Student t distribution, the nonrobustness of the estimator for ν is worrying.

where the expectation is again taken with respect to f(ε_t). Following Equation (2.5.11) of Hampel et al. (1986, p. 128), the CVF of the MLT estimator is defined as

$$CVF(y;\hat\theta,F) = \left.\frac{\partial V(\hat\theta,F_\delta)}{\partial\delta}\right|_{\delta=0}, \qquad (3.8)$$

where F_δ is the contaminated cumulative distribution function F_δ(ε_t) = (1−δ)F(ε_t) + δ1_{\{ε_t ≥ y\}}(ε_t), with 1_A(·) the indicator function of the set A. The following theorem establishes the unboundedness of the CVF of μ̂ if ν is estimated by means of the MLT estimator.

Theorem 3.3 Let the conditions of Assumption 3.1 be satisfied and ν < ∞. If ν is estimated with the MLT estimator ν̂, then the CVF of μ̂ is unbounded.

Proof. The asymptotic variance of μ̂ is the (1,1)-element of the matrix V(θ̂, F). Therefore, it only has to be shown that the (1,1)-element of CVF(y; θ̂, F) is an unbounded function of y. Define the matrices B_1 and B_2 as

$$B_1(F_\delta) = \int_{-\infty}^{\infty}\tilde\psi(y_t)\big(\tilde\psi(y_t)\big)^\top dF_\delta(\varepsilon_t),$$

$$B_2(F_\delta) = \int_{-\infty}^{\infty}\tilde\psi'(y_t)\,dF_\delta(\varepsilon_t),$$

with ψ̃ and ψ̃' defined as ψ and ψ', respectively, only with θ replaced by the functional θ̂(F_δ), where θ̂(F_δ) is the MLT estimator evaluated at the distribution F_δ. Note that θ̂(F) = (μ, σ, ν)^⊤. The asymptotic variance of the MLT estimator is now equal to

$$V(\hat\theta,F_\delta) = \big(B_2(F_\delta)\big)^{-1}B_1(F_\delta)\big(B_2(F_\delta)\big)^{-1}.$$

The (1,1)-element of the CVF of θ̂ is equal to the (1,1)-element of the matrix

$$-V_0\left.\frac{dB_2(F_\delta)}{d\delta}\right|_{\delta=0}\big(B_2(F)\big)^{-1} - \big(B_2(F)\big)^{-1}\left.\frac{dB_2(F_\delta)}{d\delta}\right|_{\delta=0}V_0 + \big(B_2(F)\big)^{-1}\left.\frac{dB_1(F_\delta)}{d\delta}\right|_{\delta=0}\big(B_2(F)\big)^{-1}, \qquad (3.9)$$

with V_0 = V(θ̂, F). Due to the symmetry of f(ε_t), it is easily checked that both V_0 and B_2(F) are block-diagonal, with the blocks being the (1,1)-element and the lower-right (2×2) block. Therefore, it only has to be shown that either the (1,1)-element of dB_1(F_δ)/dδ|_{δ=0} or that of dB_2(F_δ)/dδ|_{δ=0} is unbounded. These elements are given by

$$\big(\psi_\mu(y)\big)^2 - e_1^\top B_1(F)e_1 + E\big(2\psi_\mu(y_t)\psi_{\mu\mu}(y_t)\big)IF(y;\hat\mu,F) + E\big(2\psi_\mu(y_t)\psi_{\mu\sigma}(y_t)\big)IF(y;\hat\sigma,F) + E\big(2\psi_\mu(y_t)\psi_{\mu\nu}(y_t)\big)IF(y;\hat\nu,F) \qquad (3.10)$$

and

$$\psi_{\mu\mu}(y) - e_1^\top B_2(F)e_1 + E\big(\psi_{\mu\mu\mu}(y_t)\big)IF(y;\hat\mu,F) + E\big(\psi_{\mu\mu\sigma}(y_t)\big)IF(y;\hat\sigma,F) + E\big(\psi_{\mu\mu\nu}(y_t)\big)IF(y;\hat\nu,F), \qquad (3.11)$$

>

resp ectively, with e =(1;0;0) and three indices denoting third order partial

1

derivatives, e.g., (y )= @ (y )=@ . Using (3.10) and (3.11), it is evi-

 t  t

dent that without further assumptions the score function for  is, in general,

present in (3.9) with a nonzero loading factor. This causes the CVF of ^ to

be unb ounded. 2

Theorem 3.3 only discusses the unboundedness of the CVF in the general case. An interesting question concerns the behavior of the CVF if the true distribution actually belongs to the Student t class. The following corollary gives the result.

Corollary 3.1 Given the conditions of Theorem 3.3 and given that the ε_t's follow a Student t distribution with ν degrees of freedom, then the CVF of μ̂ is bounded.

Proof. Without loss of generality, set σ = 1 and μ = 0. It is tedious, but straightforward to show that

$$E\big(\psi_\mu(y_t)\psi_{\mu\sigma}(y_t)\big) = -E\big(\psi_{\mu\mu\sigma}(y_t)\big) = -\frac{2(\nu+1)(\nu+2)}{(\nu+3)(\nu+5)}$$

and

$$E\big(\psi_\mu(y_t)\psi_{\mu\nu}(y_t)\big) = -E\big(\psi_{\mu\mu\nu}(y_t)\big) = \frac{2(\nu-1)}{\nu(\nu+3)(\nu+5)}.$$

The result now follows directly from (3.9), (3.10), and (3.11). □

Figure 3.1 shows the CVF of μ̂ evaluated at several Student t distributions. As Corollary 3.1 predicts, this CVF is bounded. Centrally located values of y, i.e., inliers, cause a downward bias in the asymptotic variance of μ̂, while outliers result in a (bounded) upward bias.

Both Theorems 3.2 and 3.3 lead to the conclusion that estimating ν leads to nonrobust statistical procedures. Both the IF and the CVF are, however, defined in an asymptotic context. It might well be the case that the unboundedness of the IF and the CVF is less important in finite samples. In the next section, this is investigated by means of a Monte Carlo simulation experiment. Here, a much simpler strategy is used. Let {y_1, …, y_25} be a set of i.i.d. drawings from a Student t distribution with location zero, scale one, and degrees of freedom parameter ν. Construct the symmetrized sample {ỹ_1, …, ỹ_50}, with ỹ_{2k} = y_k and ỹ_{2k−1} = −y_k for k = 1, …, 25. Let y be some real number and enlarge the sample with ỹ_51 = y. For the sample {ỹ_t}_{t=1}^{51} the MLT estimates of μ, σ, and ν can be computed, together with an estimate of the asymptotic variance of μ̂. This can be done for several values of y. Figure 3.2 displays the difference between the estimated and true values of the parameters for several values of y.
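This finite sample exercise is straightforward to replicate. The sketch below is my reconstruction (not the original programs) and assumes NumPy and SciPy; it maximizes the Student t pseudo log likelihood over (μ, ln σ, ln ν) with a Nelder–Mead search, fits the symmetrized sample plus one added observation y, and prints how the estimates react as y moves out.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import t as student_t

def mlt_fit(y):
    """MLT estimates (mu, sigma, nu) by direct pseudo likelihood maximization."""
    def negloglik(p):
        mu, log_s, log_nu = p
        return -student_t.logpdf(y, df=np.exp(log_nu), loc=mu,
                                 scale=np.exp(log_s)).sum()
    mad = 1.483 * np.median(np.abs(y - np.median(y)))
    res = minimize(negloglik, x0=[np.median(y), np.log(mad), np.log(5.0)],
                   method="Nelder-Mead",
                   options={"xatol": 1e-8, "fatol": 1e-8, "maxiter": 5000})
    return res.x[0], np.exp(res.x[1]), np.exp(res.x[2])

rng = np.random.default_rng(1)
half = rng.standard_t(3, size=25)        # location zero, scale one, nu = 3
sym = np.concatenate([half, -half])      # symmetrized sample of size 50
for y_add in (0.0, 5.0, 50.0):
    mu, sigma, nu = mlt_fit(np.append(sym, y_add))
    print(f"y = {y_add:5.1f}: mu = {mu:6.3f}, sigma = {sigma:5.3f}, nu = {nu:7.3f}")
```

Unlike the fifth estimator in Section 3.3.1, this sketch does not restrict ν to a bounded interval, so it is only suitable for illustration.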

The curves for μ̂ and σ̂ reveal a qualitatively similar picture as the IFs in Figure 3.1. Large outliers have a small effect on μ̂, while causing a downward bias in σ̂. Note that Figure 3.2 gives the finite sample approximation to the IF of 1/ν̂ instead of ν̂ in order to facilitate the presentation. For ν = 5, for example, the bias in 1/ν̂ for y = 0 is approximately −0.1, implying that the estimate of ν is approximately 10. The curve for 1/ν̂ shows the same pattern as the IF in Figure 3.1. Large outliers cause an upward bias in 1/ν̂ and, thus, a downward bias in ν̂. Finally, also the shape of the curve showing the discrepancy between the estimated and the asymptotic standard error of μ̂ corresponds to the shape of the CVF of μ̂. Again it is seen that for ν = 1 and ν = 3 moderate outliers have a larger (absolute) effect on the standard error than extreme outliers.

3.3 A Numerical Illustration

This section presents the results of a small simulation experiment conducted in order to obtain insight into the finite sample behavior of the MLT estimator and several alternative estimators in a variety of circumstances. The model is always (3.1) with μ = 0 and σ = 1. The estimators that are used are discussed in Subsection 3.3.1, while the different distributions for ε_t can be found in Subsection 3.3.2. Subsection 3.3.3 discusses the results. This discussion centers around the behavior of the estimators for μ, as μ has a similar interpretation for all error distributions except the χ² distribution. In contrast, the estimators for σ have different probability limits for different error distributions. Therefore, these estimators cannot be compared directly. This should be kept in mind when interpreting the results in Subsection 3.3.3.

3.3.1 Estimators

I consider the following seven estimators.

The first estimator uses the arithmetic mean and the ordinary standard deviation to estimate μ and σ, respectively. The standard error of the mean is estimated by the standard deviation divided by the square root of the sample size.

The second estimator uses the median and the median absolute deviation to estimate μ and σ, respectively. The median absolute deviation is multiplied by 1.483 in order to make it a consistent estimator for the standard deviation of a Gaussian distribution. The asymptotic standard error of the median is estimated by (2f̂(μ̃))^{−1} (compare Hampel et al. (1986, p. 109)), with μ̃ denoting the median of y_t and f̂(·) denoting a sample based estimate of the density function of the y_t's. The kernel estimator used to construct this density estimate is described when discussing the seventh estimator.

Figure 3.2.— Finite Sample Influence Curves for μ̂, σ̂, 1/ν̂, and the Standard Error of μ̂

The third estimator uses the MLT estimators for μ and σ with a fixed degrees of freedom parameter ν = 5. This estimator is computed by means of an iteration scheme. The starting values used for μ and σ are the median and the median absolute deviation, respectively. Also the fourth through the seventh estimator below are computed by means of iteration schemes. For all estimators, the starting values mentioned above are used.

The fourth estimator is the same as the third estimator, only with ν = 1 instead of ν = 5.

The fifth estimator is the MLT estimator with estimated degrees of freedom parameter using (3.6). The MLT estimator for ν is restricted to the interval [0.5, 694.678] in order to avoid nonconvergent behavior of the estimator.[6]

The sixth estimator uses the MLT score functions for μ and σ, but employs a different method for fixing the degrees of freedom parameter. The idea for determining ν is inspired by a method for determining the optimal amount of trimming for the trimmed mean (see Andrews et al. (1972), Hogg (1974)). The estimator is, therefore, called an adaptive estimator. For a given sample and a given value of ν, one can obtain an estimate of μ and σ and of the standard error of μ̂. The value of ν is then chosen such that the estimated standard error of μ̂ is minimized.
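A compact version of this adaptive rule might look as follows. This is my own sketch (the chapter gives no code), assuming NumPy, an arbitrary small grid of candidate ν values, and the usual sandwich formula for the estimated standard error of μ̂.

```python
import numpy as np

def mu_and_se(y, nu, iters=500):
    """Fixed-nu MLT fit plus sandwich standard error of the location estimate."""
    mu = np.median(y)
    sigma = 1.483 * np.median(np.abs(y - mu))
    for _ in range(iters):                            # iterative reweighting
        w = (nu + 1) / (nu + ((y - mu) / sigma) ** 2)
        mu = np.average(y, weights=w)
        sigma = np.sqrt(np.mean(w * (y - mu) ** 2))
    r = y - mu
    d = nu * sigma**2 + r**2
    psi = (nu + 1) * r / d                            # score (3.4)
    dpsi = (nu + 1) * (r**2 - nu * sigma**2) / d**2   # its derivative in mu
    se = np.sqrt(np.sum(psi**2)) / abs(np.sum(dpsi))  # sandwich standard error
    return mu, se

def adaptive_mu(y, nu_grid=(1, 2, 3, 5, 8, 12, 25, 50)):
    """Pick the nu whose estimated standard error of mu-hat is smallest."""
    fits = [(mu_and_se(y, nu), nu) for nu in nu_grid]
    (mu, se), nu = min(fits, key=lambda p: p[0][1])
    return mu, se, nu
```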

The seventh estimator is also adaptive, but in a more general sense, because it treats the whole error distribution f(ε_t) as an (infinite-dimensional) nuisance parameter (see Manski (1984)). It can, therefore, also be called a semiparametric estimator. The ideas for this estimator are taken from Manski (1984), although the actual implementation differs in certain details, e.g., the choice of the bandwidth parameter and the kernel estimator. Given a preliminary estimate of μ, an estimate of the density function is constructed. The estimated density is then used to obtain a (nonparametric) maximum likelihood estimate of μ. The empirical mean of the squared estimated score is used to estimate the asymptotic standard error of this estimator. I use the median as the preliminary estimator.

The estimates of the density and the score function are constructed in the following way. Let {y_t}_{t=1}^T denote the observed sample and let μ̃ denote the preliminary estimate of μ. Construct ũ_t = y_t − μ̃ and let ũ_{t:T} denote the corresponding ascending order statistics. In order to protect the procedure against outliers, I remove the upper and lower 0.05-quantiles of the sample.

[6] The upper bound of 694.678 is due to the algorithm that was used for computing the estimator. The half-line 0.5 ≤ ν < ∞ was mapped in a one-to-one way to the interval 0.5 ≤ g(ν) ≤ 2, with g(ν) = ν for 0.5 ≤ ν ≤ 1 and g(ν) = 2 − ν^{−1} for ν > 1. Next, g(ν) was estimated using a golden search algorithm with endpoints 0.5 and 2, while the estimate of ν was set equal to g^{−1}(ĝ(ν)), with g^{−1} the inverse mapping of g. Due to the fact that the golden search algorithm uses a positive tolerance level for checking for convergence, the largest estimated value of g(ν) was approximately 1.998560484. This corresponds to the maximal value for ν of 694.678.

Let T̂ = T − 2⌊0.05T⌋ and û_t = ũ_{(t+⌊0.05T⌋):T} for t = 1, …, T̂; then

$$\hat f(u) = \hat T^{-1}h^{-1}\sum_{t=1}^{\hat T}\phi\big((u-\hat u_t)/h\big), \qquad (3.12)$$

with h a bandwidth parameter, φ(u) = (2π)^{−1/2} exp(−u²/2) the standard normal density function, and u an arbitrary real value (see Hendry (1995, p. 695) and Silverman (1986)).[7] The bandwidth parameter is set equal to 1.06T^{−0.2} times the median absolute deviation, which is again multiplied by 1.483. The score function is estimated by f̂'(u)/f̂(u) for f̂(u) ≥ 10^{−3} and zero otherwise, with

$$\hat f'(u) = \hat T^{-1}h^{-2}\sum_{t=1}^{\hat T}\phi'\big((u-\hat u_t)/h\big), \qquad (3.13)$$

and φ'(u) = −uφ(u).

In order to obtain the (nonparametric) maximum likelihood estimate based on f̂(u), the minimum of the function

$$\left(\sum_{t=1}^{T}\frac{\hat f'(y_t-\mu)}{\hat f(y_t-\mu)}\right)^2$$

with respect to μ is determined using a golden search algorithm with endpoints μ^ℓ and μ^u. In order to avoid nonconvergent behavior of the estimator in the simulations, I set μ^ℓ = μ̃ − σ̃ and μ^u = μ̃ + σ̃, with σ̃ the median absolute deviation of the y_t's, multiplied by 1.483.
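A minimal implementation of the density and score estimates (3.12) and (3.13) might look as follows. The sketch is mine (assuming NumPy), with the trimming, the 1.483 times the median absolute deviation scale, and the 1.06T^{−0.2} bandwidth taken from the description above.

```python
import numpy as np

def trimmed_residuals(y, mu_tilde, q=0.05):
    """Sort y - mu_tilde and drop the lower and upper q-quantiles."""
    u = np.sort(y - mu_tilde)
    k = int(np.floor(q * len(y)))
    return u[k:len(u) - k] if k > 0 else u

def kernel_f_and_score(u, u_hat):
    """Gaussian-kernel estimates of f (3.12), f' (3.13), and f'/f at points u."""
    T_hat = len(u_hat)
    mad = 1.483 * np.median(np.abs(u_hat - np.median(u_hat)))
    h = 1.06 * T_hat ** (-0.2) * mad                 # bandwidth
    z = (np.atleast_1d(u)[:, None] - u_hat[None, :]) / h
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    f = phi.mean(axis=1) / h                         # (3.12)
    f_prime = (-z * phi).mean(axis=1) / h**2         # (3.13), phi'(x) = -x phi(x)
    score = np.where(f >= 1e-3, f_prime / np.where(f >= 1e-3, f, 1.0), 0.0)
    return f, f_prime, score
```

The estimated score can then be summed over the sample and the squared sum minimized over μ with a golden-section search, as described above.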

3.3.2 Error Distributions

The performance of the above seven estimators is compared for several data generating processes. As mentioned earlier, the model is always (3.1), so only the error density f(ε_t) is varied. The following seven choices for the error distribution are considered.

First, f is set equal to the standard Gaussian density. This density serves as a benchmark. For the Gaussian density, the mean is optimal. The other estimators, therefore, have a larger variance in this setting. It is interesting to know whether the increase in variance is acceptable compared with the reduction in sensitivity of the robust estimators to alternative error distributions.

Second, f is set equal to the Student t distribution with three degrees of freedom. This distribution still has finite second moments, so the mean should still be well behaved. The third order moment, however, does not exist, which implies an unstable behavior of the standard deviation as an estimator for the scale.

[7] Note that f̂(u) in fact estimates the density of û_t + u*_t, with u*_t a Gaussian random variable with mean zero and variance h², independent of û_t.

Third, f is set equal to the (symmetrically) truncated Cauchy distribution. The truncation is performed in such a way that 95% of the original probability mass of the Cauchy is retained. In simulations, this distribution is very useful for demonstrating the superiority of robust procedures in situations where all moments of the error distribution are still finite (see Chapters 6 and 7).

Fourth, f is set equal to the standard Cauchy distribution. The first and higher order moments of this distribution do not exist.

Fifth, f is set equal to the slash distribution (see Andrews et al. (1972)). Drawings ε_t from this distribution are constructed by letting ε_t = u_t/v_t, with u_t a standard Gaussian random variable and v_t a uniform [0,1] random variable, independent of u_t. The first and higher order moments of this distribution do not exist.

The sixth distribution is added in order to illustrate the effect of asymmetry. In this case f is set equal to the recentered χ² distribution with two degrees of freedom. The distribution was centered so as to have mean zero. Note that the robust estimators now have a different probability limit than the mean. This should not be taken as an argument against robust estimators. Robust estimators just estimate a different quantity if the error distribution is asymmetric (see, e.g., Hogg (1974, Section 7)). The real question is whether one is more interested in the mean or in the probability limit (or population version) of the robust estimator, e.g., the median. Using numerical integration, one can show that the probability limit of the median for this distribution is approximately −0.614, while the probability limits of the MLT estimators with ν = 5 and ν = 1 equal −0.506 and −0.773, respectively.

The seventh distribution, a mixture of normals, is added to illustrate the
effect of outliers on the estimators for ν. With probability 0.9, a drawing is
made from a normal with mean zero and variance 1/9, and with probability
0.1, a drawing is made from a normal with mean zero and variance 9. The
variance of this mixture distribution is one. It can be expected that the robust
estimators estimate the parameters from the normal with mean zero and variance
1/9 instead of the parameters of the mixture of normals. This, however,
does not hold for the degrees of freedom parameter. As the largest component
(approximately 90% of the observations) of the mixture is a normal distribution
with mean zero and variance 1/9, one expects a robust estimator for ν to
produce a very high estimate. It follows from Section 3.2, however, that the
presence of the second component of the mixture causes a downward bias in
the estimators of ν that are discussed in this chapter.
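The mixture design can be checked directly: with component variances 1/9 and 9 and weights 0.9 and 0.1, the overall variance is 0.9·(1/9) + 0.1·9 = 1. A minimal sketch of the drawing scheme (the function name is mine, not from the chapter):

```python
import random

def mixture_draw(rng):
    """One draw from 0.9 * N(0, 1/9) + 0.1 * N(0, 9)."""
    if rng.random() < 0.9:
        return rng.gauss(0.0, 1.0 / 3.0)   # sd = 1/3, variance 1/9
    return rng.gauss(0.0, 3.0)             # sd = 3, variance 9

# the mixture variance is 0.9 * (1/9) + 0.1 * 9 = 0.1 + 0.9 = 1
assert abs(0.9 * (1.0 / 9.0) + 0.1 * 9.0 - 1.0) < 1e-12
```

The rare wide-component draws act as outliers relative to the dominant narrow component, which is what drives the estimators of ν towards zero.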

3.3.3 Results

As (3.1) is an extremely simple model, I only consider small samples of size
T = 25. The means and standard deviations of the Monte-Carlo estimates over
400 replications are presented in Table 3.1. The medians and median absolute
deviations (multiplied by 1.483) are presented in Table 3.2.
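The design of the experiment (samples of size T = 25, 400 replications, means and standard deviations for Table 3.1, medians and scaled median absolute deviations for Table 3.2) can be sketched for the two simplest estimators, the mean and the median. The factor 1.483 ≈ 1/Φ⁻¹(0.75) makes the MAD consistent for the standard deviation at the Gaussian. All names below are illustrative, and only the Gaussian design cell is reproduced.

```python
import random
import statistics

def mad_1483(xs):
    """Median absolute deviation scaled by 1.483 (~ 1 / Phi^{-1}(0.75)),
    so that it is consistent for the standard deviation at the Gaussian."""
    m = statistics.median(xs)
    return 1.483 * statistics.median(abs(x - m) for x in xs)

rng = random.Random(0)
T, R = 25, 400                       # sample size and number of replications
means, medians = [], []
for _ in range(R):
    sample = [rng.gauss(0.0, 1.0) for _ in range(T)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# Monte-Carlo summaries in the style of Tables 3.1 and 3.2
print(statistics.fmean(means), statistics.stdev(means))   # ~0 and ~1/sqrt(25)
print(statistics.median(medians), mad_1483(medians))
```

The same loop, with the error generator and the estimator swapped out, produces every cell of the two tables.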

3.3. A NUMERICAL ILLUSTRATION 61

TABLE 3.1
Monte-Carlo Means and Standard Deviations for Several
Estimators for the Location/Scale Model

          μ̂      σ̂_μ     s̄_μ     σ̂_s      σ̂     σ̂_σ        ν̂      σ̂_ν

standard Gaussian
mean   -0.007   0.205   0.198   0.028   0.992   0.141
med    -0.008   0.248   0.236   0.105   0.966   0.236
mlt5   -0.009   0.213   0.204   0.036   0.834   0.127
mlt1   -0.009   0.259   0.240   0.095   0.576   0.117
mlt    -0.009   0.209   0.189   0.030   0.919   0.167   499.157   308.508
adapt  -0.007   0.241   0.176   0.039   0.771   0.265   344.191   345.286
npml   -0.007   0.249   0.211   0.061

Student t(3)
mean   -0.002   0.340   0.316   0.139   1.580   0.697
med     0.002   0.264   0.321   0.162   1.117   0.293
mlt5   -0.002   0.251   0.244   0.052   1.097   0.233
mlt1    0.000   0.260   0.255   0.095   0.682   0.154
mlt    -0.001   0.258   0.232   0.057   1.015   0.286   168.535   292.740
adapt   0.002   0.262   0.210   0.059   0.816   0.297    88.152   227.472
npml    0.002   0.276   0.252   0.074

truncated Cauchy
mean    0.002   0.523   0.517   0.143   2.583   0.713
med     0.003   0.305   0.513   0.322   1.383   0.430
mlt5   -0.006   0.336   0.341   0.096   1.680   0.462
mlt1   -0.006   0.287   0.280   0.106   0.884   0.242
mlt    -0.004   0.306   0.285   0.093   1.167   0.426    31.018   137.405
adapt  -0.005   0.293   0.252   0.086   0.944   0.340     9.827    75.582
npml    0.000   0.333   0.335   0.107

standard Cauchy
mean    0.232  29.121   4.482  28.805  22.408 144.025
med    -0.005   0.317   0.601   0.390   1.489   0.469
mlt5   -0.008   0.383   0.384   0.121   2.231   0.846
mlt1   -0.008   0.280   0.292   0.105   0.974   0.279
mlt    -0.010   0.289   0.291   0.104   1.059   0.393    11.257    81.527
adapt  -0.009   0.289   0.267   0.090   1.043   0.375     3.418    37.948
npml   -0.005   0.328   0.388   0.129

slash
mean   -0.071  23.398   5.115  22.839  25.577 114.196
med     0.021   0.525   1.264   0.706   2.182   0.610
mlt5    0.046   0.552   0.541   0.163   3.069   1.148
mlt1    0.015   0.474   0.453   0.157   1.420   0.370
mlt     0.011   0.477   0.442   0.135   1.585   0.503    16.930   101.751
adapt   0.015   0.481   0.406   0.123   1.581   0.532     6.772    59.372
npml    0.029   0.525   0.570   0.176

μ̂, s̄_μ, σ̂, and ν̂ are the Monte-Carlo means of the estimators for μ, for the standard error
of the estimator for μ, for σ, and for ν, respectively. The corresponding Monte-Carlo
standard errors are σ̂_μ, σ̂_s, σ̂_σ, and σ̂_ν for μ̂, s̄_μ, σ̂, and ν̂, respectively. The estimators
are described in Subsection 3.3.1, while the error distributions are discussed in Subsection 3.3.2.


TABLE 3.1
(Continued)

          μ̂      σ̂_μ     s̄_μ     σ̂_s      σ̂     σ̂_σ        ν̂      σ̂_ν

recentered χ²(2)
mean   -0.005   0.400   0.385   0.107   1.924   0.537
med    -0.545   0.397   0.481   0.278   1.358   0.404
mlt5   -0.370   0.360   0.296   0.072   1.336   0.300
mlt1   -0.711   0.380   0.312   0.137   0.788   0.211
mlt    -0.480   0.476   0.272   0.080   1.130   0.408   143.209   277.654
adapt  -0.595   0.451   0.250   0.079   0.970   0.381    80.893   219.260
npml   -0.466   0.423   0.346   0.113

0.9·N(0, 1/9) + 0.1·N(0, 9)
mean   -0.005   0.206   0.184   0.093   0.922   0.467
med    -0.000   0.088   0.034   0.016   0.366   0.090
mlt5   -0.000   0.081   0.079   0.020   0.409   0.143
mlt1   -0.002   0.085   0.085   0.030   0.227   0.051
mlt    -0.001   0.079   0.076   0.020   0.278   0.068    79.262   217.611
adapt  -0.001   0.084   0.070   0.019   0.269   0.087    57.433   188.017
npml   -0.001   0.089   0.091   0.028

For the Gaussian error distribution, the mean is the most efficient estimator
for μ, at least if we consider the Monte-Carlo standard deviation of the
estimator (see the σ̂_μ column). The mean is closely followed in terms of
efficiency (σ̂_μ) by the MLT estimator with estimated ν (mlt) and the MLT
estimator with ν fixed at 5 (mlt5). The remaining estimators perform much
worse in terms of σ̂_μ. The standard errors of all estimators for μ (s̄_μ) seem to
underestimate the true variability of the estimators (σ̂_μ) over the simulations.
This holds in particular for the adaptive estimator. The MLT estimator of ν
has a very high mean (see the ν̂ column). If we consider the median of the MLT
estimator for ν, it is at its upper boundary. The corresponding median absolute
deviation reveals that for at least half of the simulations, the estimate
of ν was at this boundary value. The adaptive estimate of ν is much lower
(see especially the value in Table 3.2). The scale estimates (σ̂) vary considerably
over the different estimators. This is due to the fact that except for the
mean, the median, and the MLT estimator with estimated ν, the estimators
are estimating different quantities (compare Subsection 3.3.2). The adaptive
estimator for σ has the highest variance (σ̂_σ).

For the Student t distribution with three degrees of freedom, the mean
performs much worse in terms of σ̂_μ. Now the MLT estimators perform best
on the basis of Table 3.1, while on the basis of Table 3.2 the MLT estimators
with fixed ν and the median perform best. The mean estimate of ν (ν̂) is again
fairly high. The median estimate of ν, however, is much closer to the true
value 3 for the MLT estimator. The adaptive estimator again underestimates
ν. The discrepancy between the Monte-Carlo mean and median estimate of


TABLE 3.2
Monte-Carlo Medians and Median Absolute Deviations for
Several Estimators for the Location/Scale Model

          μ̂      σ̂_μ     s̄_μ     σ̂_s      σ̂     σ̂_σ        ν̂      σ̂_ν

standard Gaussian
mean   -0.001   0.200   0.198   0.028   0.988   0.141
med    -0.006   0.244   0.222   0.100   0.959   0.236
mlt5   -0.001   0.206   0.203   0.037   0.831   0.131
mlt1   -0.007   0.256   0.228   0.088   0.568   0.118
mlt    -0.002   0.203   0.189   0.029   0.930   0.162   694.678     0.000
adapt  -0.001   0.226   0.179   0.039   0.816   0.333    49.035    71.513
npml   -0.003   0.247   0.205   0.056

Student t(3)
mean   -0.006   0.320   0.288   0.076   1.441   0.378
med     0.010   0.255   0.292   0.141   1.098   0.278
mlt5   -0.000   0.251   0.241   0.050   1.070   0.225
mlt1    0.007   0.255   0.237   0.088   0.671   0.146
mlt     0.001   0.268   0.230   0.055   0.991   0.282     3.811     3.203
adapt   0.003   0.266   0.209   0.059   0.758   0.325     1.754     1.414
npml    0.003   0.264   0.245   0.072

truncated Cauchy
mean   -0.006   0.509   0.515   0.150   2.575   0.748
med    -0.004   0.295   0.425   0.229   1.308   0.380
mlt5   -0.001   0.312   0.326   0.086   1.624   0.427
mlt1   -0.012   0.267   0.265   0.096   0.847   0.220
mlt    -0.013   0.279   0.273   0.083   1.091   0.361     1.763     0.613
adapt  -0.009   0.283   0.243   0.082   0.889   0.306     0.855     0.080
npml   -0.006   0.326   0.326   0.103

standard Cauchy
mean   -0.069   1.327   1.095   0.814   5.474   4.070
med    -0.003   0.300   0.513   0.300   1.424   0.430
mlt5   -0.024   0.343   0.365   0.104   2.058   0.650
mlt1   -0.001   0.270   0.279   0.097   0.950   0.261
mlt    -0.011   0.285   0.280   0.096   1.005   0.327     1.075     0.383
adapt  -0.008   0.278   0.259   0.087   0.972   0.338     0.978     0.263
npml   -0.004   0.300   0.377   0.127

slash
mean    0.020   1.881   1.456   1.161   7.279   5.803
med     0.030   0.512   1.120   0.589   2.105   0.598
mlt5    0.059   0.541   0.514   0.143   2.828   0.967
mlt1    0.023   0.500   0.433   0.144   1.382   0.359
mlt     0.012   0.497   0.428   0.128   1.520   0.427     1.173     0.470
adapt   0.011   0.490   0.399   0.113   1.526   0.535     1.225     0.629
npml    0.020   0.501   0.546   0.172

μ̂, s̄_μ, σ̂, and ν̂ are the Monte-Carlo medians of the estimators for μ, for the standard
error of the estimator for μ, for σ, and for ν, respectively. The corresponding
Monte-Carlo median absolute deviations, multiplied by 1.483, are σ̂_μ, σ̂_s, σ̂_σ, and
σ̂_ν for μ̂, s̄_μ, σ̂, and ν̂, respectively. The estimators are described in Subsection
3.3.1, while the error distributions are discussed in Subsection 3.3.2.


TABLE 3.2
(Continued)

          μ̂      σ̂_μ     s̄_μ     σ̂_s      σ̂     σ̂_σ        ν̂      σ̂_ν

recentered χ²(2)
mean   -0.050   0.386   0.369   0.097   1.844   0.484
med    -0.584   0.405   0.415   0.223   1.303   0.374
mlt5   -0.401   0.363   0.289   0.069   1.300   0.289
mlt1   -0.744   0.399   0.290   0.120   0.768   0.202
mlt    -0.496   0.487   0.267   0.075   1.086   0.419     2.576     1.791
adapt  -0.628   0.478   0.247   0.075   0.909   0.377     1.949     1.703
npml   -0.506   0.422   0.334   0.112

0.9·N(0, 1/9) + 0.1·N(0, 9)
mean   -0.010   0.186   0.172   0.102   0.858   0.512
med    -0.003   0.087   0.032   0.015   0.365   0.093
mlt5   -0.002   0.079   0.077   0.016   0.379   0.107
mlt1   -0.003   0.087   0.081   0.028   0.224   0.048
mlt     0.000   0.080   0.075   0.019   0.275   0.067     1.669     0.752
adapt   0.002   0.084   0.069   0.019   0.264   0.095     2.177     1.899
npml   -0.001   0.085   0.088   0.027

ν is due to large outlying values of ν̂ to the right. These are caused by the
fact that for some samples of size 25 it is hardly possible to distinguish the
Gaussian (ν = ∞) from the Student t distribution. Further note that the scale
estimator for the mean (σ̂), i.e., the ordinary sample standard deviation, has a
high variability (σ̂_σ). This is caused by the nonexistence of the fourth moment
of the distribution.

I now turn to the truncated Cauchy. For this distribution, the MLT estimator
with ν = 1 performs best in terms of σ̂_μ. This can be expected, as this
estimator resembles the maximum likelihood estimator for this distribution.
Only the standard error (s̄_μ) of the adaptive estimator seriously underestimates
the true variability of the estimator (σ̂_μ). The median estimates of
ν (ν̂) are now very low, explaining the relatively good performance of these
estimators in terms of σ̂_μ.

For the standard Cauchy distribution, the mean is by far the worst estimator
(see the σ̂_μ and the σ̂_σ columns). The MLT estimator with ν = 1, which is
now exactly equal to the maximum likelihood estimator, performs best. The
median estimates of ν are in the neighborhood of one for both the MLT and
the adaptive estimator. Again the median estimate of ν obtained with the
adaptive estimator is somewhat below that obtained with the MLT estimator.
The second and third best estimators for μ are the adaptive estimator and the
MLT estimator with estimated ν, respectively. Also note that the standard
error of the adaptive estimator (s̄_μ) again seriously underestimates the true
variability of the estimator (σ̂_μ).

The results for the slash distribution are similar to those for the Cauchy.


The variability of all estimators (σ̂_μ) appears to have increased with respect
to the Cauchy case. Again the median estimates of ν are fairly close to one.

For the recentered χ²(2) distribution, the different probability limits of the
various estimators for asymmetric error distributions are apparent. It is
interesting to note that the two MLT estimators with fixed ν have better efficiency
properties (σ̂_μ) than the mean, at least for Table 3.1. Moreover, based on Table
3.1, this distribution is the first one for which the nonparametric maximum
likelihood estimator (npml) has a lower variability (σ̂_μ) than the estimators
that only estimate ν instead of the whole error distribution. For Table 3.2,
this also appeared for the Student t distribution with three degrees of freedom.
The median estimates of ν (ν̂) are again quite low, with the adaptive estimate
of ν below the MLT estimate.

Finally, for the mixture of normals, the mean performs worst in terms of
σ̂_μ. This is due to the fact that the mean takes all observations into account,
including the ones from the mixture component with variance 9. The MLT
estimators with ν = 5 and ν estimated perform best in terms of σ̂_μ. The
standard error (s̄_μ) of the median seriously underestimates the true variability
of the estimator (σ̂_μ) for this distribution. As expected from Section 3.2, the
estimate of ν (ν̂) is biased towards zero. Note that the median estimate of ν
for the adaptive estimator is now for the first time above that for the MLT
estimator with estimated ν.

Summarizing, the following conclusions can be drawn from the simulation
experiment.

1. In the considered experiment, the mean is only the best estimator in
terms of σ̂_μ if the errors are Gaussian. Even in this situation, the MLT
estimator for μ with estimated ν performs approximately the same in
terms of σ̂_μ.

2. The standard error of the adaptive estimator (s̄_μ) underestimates the true
variability of the estimator (σ̂_μ) for all error distributions considered.

3. The nonparametric maximum likelihood estimator performs worse than
or approximately the same as the median. The standard errors (s̄_μ) of
both estimators in several cases grossly under- or overestimate the true
variability (σ̂_μ) of the estimator for μ. Therefore, one can better use
the median if one is only interested in a (robust) point estimate of μ,
because this estimator is much easier to compute. If one also wishes to
perform inference on μ, however, it is questionable whether either of these
two estimators can be advised for practical use.

4. The estimators that only treat ν as a nuisance parameter (adapt and mlt)
perform better in terms of σ̂_μ for all symmetric distributions considered
than the estimator that treats the whole error distribution as an (infinite-
dimensional) nuisance parameter (npml). The only exception to this
statement can be found in Table 3.2 for the Student t distribution with
three degrees of freedom.


5. The nonrobustness of the MLT estimator for ν is evident if there are
outliers in the data, as in the case of the mixture of normals. The
nonrobustness of estimating ν along with μ and σ, however, seems to be
either advantageous or negligible for estimation of and inference on μ.

6. One should be careful in defining the parameters of interest, because
different estimators can have different probability limits for different
error distributions. For σ and ν, this appears from all the experiments,
while for μ it is illustrated by the experiment with the recentered χ²(2)
distribution.

3.4 Concluding Remarks

In this chapter I considered the simple location/scale model. For this model,
I have demonstrated that the influence functions (IF) of the MLT estimators
for the degrees of freedom parameter (ν) and the scale parameter (σ) are
unbounded. The IF of the MLT estimator for the location parameter (μ) is
bounded, but its change-of-variance function (CVF) is unbounded if the central
or uncontaminated distribution does not belong to the Student t class. The
easiest solution to the unboundedness of the IF's and the CVF is to fix the
degrees of freedom parameter ν at a user specified value. This value can be
chosen such that the estimator is reasonably efficient at a central distribution,
e.g., the Gaussian distribution.

The unboundedness of the IF's and the CVF is, however, only a qualitative
result. For example, the rate at which the IF diverges is very slow. Therefore,
the practical implications of the unboundedness of the IF's and the CVF seem
to be limited. This was illustrated in Section 3.3 by means of simulations.
The MLT estimator with estimated degrees of freedom parameter seemed to
perform as well as or better than the MLT estimators with fixed ν. Only the
interpretation of the estimate of ν as the degrees of freedom parameter from
an underlying Student t distribution seems to be incorrect if there are outliers
in the data. Inference on μ, however, remains valid for most situations of
practical interest.