Inference for vast dimensional elliptical distributions

Yves Dominicy1, Hiroaki Ogata2 and David Veredas3

Abstract

We propose a quantile–based method to estimate the parameters (i.e. locations, dispersions, co–dispersions and the tail index) of an elliptical distribution, and a battery of tests for model adequacy. The method is suitable for vast dimensions since the estimators of the location vector and the dispersion matrix have closed form expressions, while estimation of the tail index boils down to univariate optimizations. The tests for model adequacy are for the null hypothesis of correct specification of one or several level contours. A Monte Carlo study for three distributions (Gaussian, Student–t and elliptical stable) and dimensions 20, 200 and 2000 reveals the goodness of the method, both in terms of computational time and in finite samples. An empirical application to financial data illustrates the usefulness of the approach.

Keywords: Quantiles, elliptical family, simulations, heavy tails.

JEL classification: C13, C15, G11

1ECARES, Solvay Brussels School of Economics and Management, Université libre de Bruxelles; email: [email protected]. 2Tokyo Metropolitan University and Waseda University; email: [email protected]. 3ECARES, Solvay Brussels School of Economics and Management, Université libre de Bruxelles; email: [email protected]. Corresponding address: David Veredas, ECARES, Université libre de Bruxelles, 50 Av F.D. Roosevelt CP114, B1050 Brussels, Belgium. Phone: +3226504218. Fax: +3226504475. This work started during the visit of Hiroaki Ogata to the Université libre de Bruxelles, as holder of the Chaire Waseda, in March 2010. Yves Dominicy acknowledges financial support from a F.R.I.A. grant. Hiroaki Ogata acknowledges financial support from the Japanese Grant–in–Aid for Young Scientists (B), 22700291, and the International Relations Department of the Université libre de Bruxelles. David Veredas acknowledges financial support from the Belgian National Bank and the IAP P6/07 contract, from the IAP program (Belgian Scientific Policy), 'Economic policy and finance in the global economy'. Yves Dominicy and David Veredas are members of ECORE, the recently created association between CORE and ECARES. Any errors and inaccuracies are ours.

1 Introduction

The elliptical family of distributions is commonly used as it nests, among others, the Gaussian, Student–t, elliptical stable (ESD henceforth), Cauchy, Laplace and Kotz laws. These distributions are defined by three sets of parameters: a vector of locations, a dispersion matrix that produces the ellipticity, and (possibly) a tail index that generates tail thickness. While the statistical properties of the elliptical family of distributions are well known (see, among others, Kelker, 1970, Cambanis et al., 1981, Fang et al., 1990, and Frahm, 2004), inference for vast dimensions is still an almost unexplored area, in particular for heavy–tailed distributions. For moderate dimensions and thin–tailed distributions, such as the Gaussian, standard estimation methods –namely maximum likelihood (ML) and the generalized method of moments (GMM)– are straightforward. For heavy–tailed distributions they may fail because of intractability of the probability density function and/or lack of existence of moments.1 This is the case of, for instance, the ESD, the Cauchy and the Student–t. Alternative methods, such as Indirect Inference (Lombardi and Veredas, 2009) or projections (Nolan, 2010), can be used but, as the authors acknowledge, they do not apply to vast dimensions. Another branch of the literature has focused on the estimation of the shape matrix, i.e. the dispersion matrix up to a positive scalar factor. Tyler (1987) introduces an affine-equivariant estimator, and Hallin et al. (2006) propose an R-estimator of the shape matrix.

In this article we propose inference (i.e. estimation and testing) for vast dimensional elliptical distributions. Estimation is based on quantiles, which always exist regardless of the thickness of the tails, and testing is based on the geometry of the elliptical family. More precisely, the contribution of this article is threefold. First, we introduce a quantile–based function that is informative about the co–dispersions. Second, we propose a fast method for the estimation of the parameters that i) does not require tractability of the density function or existence of moments, and ii) does not suffer from the curse of dimensionality. Last, we introduce simple testing procedures for the null hypothesis of correct specification of one or several level contours.

Estimation is based on an enhanced version of the Method of Simulated Quantiles (MSQ) of Dominicy and Veredas (2012). MSQ is based on a vector of functions of quantiles that can be computed either from data (the sample functions) or from the distribution (the theoretical functions). The estimated parameters are those that minimize a quadratic distance between both. Since the theoretical functions of quantiles may not have a closed form expression, we rely on simulations. One of the centerpieces of the method is that the functions of quantiles have to be informative about the parameters of interest. The first contribution of this article is to propose a function of quantiles for the co–dispersion that is based on the following simple idea: if two centered and scaled random variables co–move, most of the times the pairs of standardized observations have the same sign, so a new random variable equal to their projection onto the 45–degree line has large dispersion. If, by contrast, they anti–move, most of the times the pairs of standardized observations have opposite signs, and their projection onto the 45–degree line has small dispersion. Therefore the interquantile range of the projection is an informative function about the co–dispersion.

1There exists a tail–trimmed version of GMM (Hill and Renault, 2012) that does not require existence of moments.

Furthermore, and this is the second contribution of the article, due to the properties of the elliptical family, we find a way to circumvent almost all the optimizations of the original MSQ. Only univariate quadratic minimizations are used for the tail index, if there is any, while the other parameters are obtained straightforwardly without any optimization procedure. This makes the method fast and applicable to vast dimensions. To assess the finite sample properties of the estimators we carry out a Monte Carlo study for the Gaussian, Student–t and ESD distributions of dimensions 20, 200 and 2000. For a wide range of sample sizes and tail indexes, we find that the estimated parameters are essentially unbiased. Moreover, the procedure is computationally fast, even for vast dimensions.

The third contribution is a battery of tests for correct specification of one or several level contours. This is different from testing for the correct distributional assumption, which can be done with standard goodness–of–fit tests (Cramér–von–Mises, Anderson–Darling or Kolmogorov–Smirnov, to name a few). In many applications the researcher is not interested in fitting correctly the whole distribution but a small set of level contours. In finance, for instance, a risk manager is concerned with extreme level contours. In the univariate case, this motivation has led to back–tests for Value–at–Risk that are based on the failure rate, that is, the percentage of times that the observations are below a given quantile. If the theoretical quantile is well specified and estimated, the empirical failure rate should not be statistically different from the nominal rate. A similar approach is followed in this article, where the failure rate is defined as the percentage of times that the observations are inside an estimated level contour.2 Moreover, due to the geometry of the elliptical family, the tests can be easily performed without an increase in complexity with the dimension. This concept can be extended to testing the adequacy of several level contours, which leads to a vector of failure rates. We also consider the case where there is a level contour of main interest while we take into account the information in the neighboring contours. This leads to a Bonferroni type of test where for the level contour of interest we are very intolerant (i.e. low size), with increasing tolerance as we move further away. In all testing scenarios, the distribution of the vector of failure rates is multinomial and a simple statistic can be used. We derive the asymptotic distribution of the tests, which incorporates the uncertainty of the estimated parameters.

The rest of the article is laid out as follows. Section 2 briefly reviews the elliptical family of distributions. Section 3 covers the estimation method for the locations, dispersions and tail index as an extension of MSQ (Section 3.1), explaining at length the function of quantiles for the co–dispersions (Section 3.2), and the asymptotic properties of the estimators (Section 3.3). Then, Section 3.4 presents the fast procedure, which overcomes almost all the optimizations. A comprehensive Monte Carlo study for a variety of distributions, tail thicknesses and sample sizes is presented in Section 4. The tests for level contours are described in Section 5. Section 6 illustrates the theory with an application to 22 de–volatilized daily return series of financial market indexes. Lastly, Section 7 presents the conclusions and directions for further research.

Finally, a word on notation.
Unless otherwise indicated, vectors are treated as column vectors. Random vectors are in capital and bold (e.g. X) and the corresponding realizations in small caps and bold (e.g. x). Elements of the random vector and of the realizations are not in bold (e.g. Xl and xl). Vectors of parameters are denoted in small caps and bold (e.g.

2Gonzalez–Rivera et al. (2011) and Gonzalez–Rivera and Yoldas (2011) propose tests in the same spirit and in a more general context. But while these tests require numerical integration, ours, because of the properties of the elliptical family, do not.

$\mu$) while matrices are denoted in capitals and bold (e.g. $\Sigma$). Single parameters and elements of the vectors and matrices are in small caps (e.g. $\alpha$, $\mu_i$ and $\sigma_{i\,i}^2$). The letter $J$ is reserved for the dimension and $N$ for the size of the sequence of random vectors and their realizations (with $l$–th element). The letters $i$, $j$, $r$ and $g$ are used for elements in vectors and matrices of parameters.

2 Elliptical Distributions

Let X be a random vector of size J. We assume the following

A1 $X_1, \ldots, X_N$ is a sequence of i.i.d. random vectors distributed according to a known distribution that belongs to the elliptical family, i.e. the $l$–th element has the stochastic representation $X_l \stackrel{d}{=} \mu + R_{\alpha\,l}\Lambda U_l$.

$U_1, \ldots, U_N$ is a sequence of $J$–dimensional i.i.d. random vectors uniformly distributed, and therefore radial, on the unit sphere with $J-1$ dimensions $S^{J-1} = \{u_l \in \mathbb{R}^J : \|u_l\|_2 = 1\}$. The $J \times J$ scaling matrix $\Lambda$ produces the ellipticity and is a matrix such that $\Sigma = \Lambda\Lambda'$, a positive definite symmetric dispersion matrix –also called the shape matrix– of rank $J$ and with $(i\,i)$–th and $(i\,j)$–th elements $\sigma_{i\,i}^2$ and $\sigma_{i\,j}$.3 If $\Lambda$ equals the identity matrix, the density of $X_l$ remains radial. The non–negative random variable $R_{\alpha\,l}$ –often called the generating variate– generates the tail thickness, its sequence is i.i.d., and it is stochastically independent of $U_l$. The $J$–dimensional vector $\mu$ re–allocates the center of the distribution. Let $\theta = (\mu, \Sigma, \alpha) \in \Theta$ denote the vector of unknown parameters. We make the following assumption

A2 (a) The parameter space $\Theta$ is a non–empty and compact set on $\mathbb{R}^{J + J(J+1)/2 + 1}$. (b) The true parameter value $\theta_0$ belongs to the interior of $\Theta$.

The family of elliptical distributions possesses a number of useful properties, among which its closedness under affine transformations and aggregation (Fang et al., 1990), and the fact that the conditional and marginal distributions are also elliptical. Numerous distributions that are relevant for theoretical and practical work belong to this family: Gaussian, Laplace, Student–t, ESD (and hence Cauchy with $\alpha = 1$), and Kotz among others.4 Gaussian $\mathcal{N}(\mu, \Sigma)$ and Laplace $\mathcal{L}(\mu, \lambda, \Sigma)$ distributions are obtained if $R_\alpha = \sqrt{\chi_J^2}$ and $R_\alpha = \sqrt{\chi_J^2}\sqrt{E(\lambda)}$ respectively –where $E(\lambda)$ is an exponential random variable with parameter $\lambda$ and stochastically independent of $\chi_J^2$. The generating variate of these distributions does not depend on a tail index: $R_\alpha = R$. Student–t $St_\alpha(\mu, \Sigma)$ and ESD $S_\alpha(\mu, \Sigma)$ are obtained if $R_\alpha = \sqrt{\chi_J^2}\,(\sqrt{\chi_\alpha^2/\alpha})^{-1}$ and $R_\alpha = \sqrt{\chi_J^2}\sqrt{S_{\alpha/2}}$ respectively –where $S_{\alpha/2}$ is a positive $\alpha/2$ stable distributed random variable, and $\chi_\alpha^2$ and $S_{\alpha/2}$ are stochastically independent of $\chi_J^2$. The Cauchy $\mathcal{C}(\mu, \Sigma)$ is the ESD with $\alpha = 1$. The Kotz $\mathcal{K}(\mu, \Sigma, \alpha)$ distribution is obtained if $R_\alpha = \sqrt{G(\kappa, \xi)}$ –where $G(\kappa, \xi)$ is a gamma random variable with $\kappa = J/((J+2)\alpha + 2)$ and $\xi = (J+2)\alpha + 2$ being the shape and scale parameters

3Σ is the covariance matrix, up to a scale, if the second moments exist. 4Hereafter we skip the term multivariate but the reader should always keep in mind that $U$ and $X$ are random vectors and $R_\alpha$ is a random variable.

respectively. For more complicated choices of $R_\alpha$ we can obtain the Discrete Scale Mixture and the Polynomial Expansions. The probability density function of $X$ is

$$f_X(x) = \sqrt{|\Sigma^{-1}|}\; g_{R_\alpha}\!\left((x - \mu)'\Sigma^{-1}(x - \mu)\right),$$
where
$$g_{R_\alpha}(x) = \frac{\Gamma\!\left(\frac{J}{2}\right)}{2\pi^{J/2}}\, \sqrt{x}^{\,-(J-1)} f_{R_\alpha}(\sqrt{x})$$
is the density generator, and $f_{R_\alpha}$ is the probability density function of the generating variate. Likewise, the characteristic function can be expressed as a function of the cumulative distribution function of the generating variate:
$$\varphi_X(\xi) = \exp(i\xi'\mu) \int_0^\infty \varphi_U(r^2 \xi'\Sigma\xi)\, dF_{R_\alpha}(r),$$
where $\xi \in \mathbb{R}^J$ is a $J \times 1$ vector of frequencies, $\varphi_U(\cdot)$ is the characteristic function of $U$, and $F_{R_\alpha}$ is the cumulative distribution function of the generating variate. For instance, the characteristic function of the ESD is

$$\varphi_X(\xi) = \exp(i\xi'\mu)\exp\!\left(-(\xi'\Sigma\xi)^{\alpha/2}\right),$$
which reduces to the characteristic function of a Gaussian if $\alpha = 2$, or a Cauchy if $\alpha = 1$. It is worth recalling the characteristic function of an affine transformation, as it is used extensively in the sequel. Let $Z = b + AX$; then
$$\varphi_Z(\xi) = \exp(i\xi'(b + A\mu)) \int_0^\infty \varphi_U(r^2 \xi' A\Sigma A'\xi)\, dF_{R_\alpha}(r) \qquad (1)$$
is the characteristic function of an elliptical random vector with location vector $b + A\mu$, dispersion matrix $A\Sigma A'$ and tail index $\alpha$.
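The stochastic representation in A1 doubles as a simulation recipe, which the simulated quantiles used below rely on. The following is a minimal sketch under that representation, assuming numpy; the names `sample_elliptical` and `student_R` are ours, purely for illustration.

```python
import numpy as np

def sample_elliptical(N, mu, Lam, gen_variate, rng):
    """Draw N vectors via X = mu + R * Lambda * U (assumption A1)."""
    J = len(mu)
    G = rng.standard_normal((N, J))
    U = G / np.linalg.norm(G, axis=1, keepdims=True)  # uniform on the sphere S^{J-1}
    R = gen_variate(N, rng)                           # generating variate
    return mu + (R[:, None] * U) @ Lam.T

def student_R(alpha, J):
    """Student-t generating variate: sqrt(chi2_J) / sqrt(chi2_alpha / alpha)."""
    def gen(N, rng):
        return np.sqrt(rng.chisquare(J, N)) / np.sqrt(rng.chisquare(alpha, N) / alpha)
    return gen

rng = np.random.default_rng(0)
J = 5
Sigma = 0.5 * np.eye(J) + 0.5        # an arbitrary positive definite dispersion matrix
Lam = np.linalg.cholesky(Sigma)      # Sigma = Lam Lam'
x = sample_elliptical(10_000, np.zeros(J), Lam, student_R(3, J), rng)
```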

3 Quantile–based Inference

In this section we first introduce the functions of quantiles for the locations, dispersions and tail index, which are drawn from Dominicy and Veredas (2012). Next, we explain the quantile–based measure for the co–dispersions and we derive its properties. In the third part we discuss the minimization problem and the asymptotic properties of the estimators. Once the general theory is discussed, in the last part we show how to avoid almost all the optimizations, so that the method is applicable to vast dimensions.

3.1 Locations, dispersions and tail index

Let $x_{l\,j}$ be the $l$–th realization of the $j$–th random variable $X_j$ and let $x_j = (x_{1\,j}, \ldots, x_{N\,j})$ be the vector of $N$ realizations. Denote by $\hat{q}_{j\,N} = (\hat{q}_{j\,\tau_1\,N}, \ldots, \hat{q}_{j\,\tau_{s_j}\,N}) \in \mathbb{R}^{s_j}$ an $s_j \times 1$ vector of sample quantiles of $x_j$; that is, $\hat{q}_{j\,\tau_k\,N}$ denotes the $\tau_k$–th sample quantile of $x_j$. Gather all the observations into the $N \times J$ matrix $x$, and the sample quantiles into the vector $\hat{q}_N = (\hat{q}_{1\,N}, \ldots, \hat{q}_{J\,N}) \in \mathbb{R}^{s}$. Let $h(\hat{q}_N)$ be an $M \times 1$ vector $\mathbb{R}^{s} \to \mathbb{R}^{M}$ of measurable functions of $x$. Likewise, denote by $q_{\theta_j} = (q_{\tau_1\,\theta_j}, \ldots, q_{\tau_{s_j}\,\theta_j}) \in \mathbb{R}^{s_j}$ an $s_j \times 1$ vector of theoretical quantiles; that is, $q_{\tau_k\,\theta_j}$ denotes the $\tau_k$–th theoretical quantile of $x_j$. These quantiles may not be available analytically but they can be computed through simulation. All theoretical quantiles are gathered into the vector $q_\theta = (q_{\theta_1}, \ldots, q_{\theta_J}) \in \mathbb{R}^{s}$, and let $h(q_\theta)$ be an $M \times 1$ vector $\mathbb{R}^{s} \to \mathbb{R}^{M}$ of functions.

The vectors of functions of quantiles $h(\hat{q}_N)$ and $h(q_\theta)$ should be informative about the parameters of interest. As for the location parameters $\mu$, the $J \times 1$ vector of medians are the best candidates. For the sample quantiles:

$$h_\mu(\hat{q}_N) = (\hat{q}_{1\,0.50\,N}, \ldots, \hat{q}_{J\,0.50\,N}).$$

And similarly for the theoretical quantiles $h_\mu(q_\theta)$. Regarding the tail index $\alpha$, if there is any, Dominicy and Veredas (2012), following Fama and Roll (1971) and McCulloch (1986), propose the $J \times 1$ vector of functions of quantiles
$$h_\alpha(\hat{q}_N) = \left(\frac{\hat{q}_{1\,0.95\,N} - \hat{q}_{1\,0.05\,N}}{\hat{q}_{1\,0.75\,N} - \hat{q}_{1\,0.25\,N}}, \ldots, \frac{\hat{q}_{J\,0.95\,N} - \hat{q}_{J\,0.05\,N}}{\hat{q}_{J\,0.75\,N} - \hat{q}_{J\,0.25\,N}}\right),$$
and likewise for the vector of theoretical quantiles $h_\alpha(q_\theta)$. As far as the dispersion matrix is concerned, the interquantile ranges $\hat{q}_{j\,\tau\,N} - \hat{q}_{j\,1-\tau\,N}$, which we denote by $\widehat{IQR}_{j\,N}$, are very informative for the diagonal elements, i.e. the dispersions $\sigma_{j\,j}$, that we gather in the $J \times 1$ vector
$$h_{diag\Sigma}(\hat{q}_N) = \left(\widehat{IQR}_{1\,N}, \ldots, \widehat{IQR}_{J\,N}\right),$$
where typically $\tau = 0.75$ (and hence $1 - \tau = 0.25$) but not necessarily. Similarly for the vector of theoretical quantiles $h_{diag\Sigma}(q_\theta)$.
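In code, the sample versions of these functions of quantiles are one–liners; their theoretical counterparts are obtained by applying the same functions to simulated observations. A minimal sketch with our own naming, assuming the data are stored in an $N \times J$ numpy array:

```python
import numpy as np

def h_mu(x):
    """Location functions: the J marginal medians."""
    return np.quantile(x, 0.50, axis=0)

def h_alpha(x):
    """Tail functions: (q_.95 - q_.05) / (q_.75 - q_.25), one per marginal."""
    q05, q25, q75, q95 = np.quantile(x, [0.05, 0.25, 0.75, 0.95], axis=0)
    return (q95 - q05) / (q75 - q25)

def h_diag(x, tau=0.75):
    """Dispersion functions: the marginal interquantile ranges."""
    return np.quantile(x, tau, axis=0) - np.quantile(x, 1 - tau, axis=0)
```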

3.2 Co–dispersions

The construction of a function of quantiles that is informative about the co–dispersion $\sigma_{j\,k}$ is the first contribution of this article. In a nutshell, it is a pairwise function equal to the interquantile range of the projection of $X_j$ and $X_k$ onto the 45–degree line. Figure 1 shows the intuition: the left panel shows a scatter plot, along with the 45–degree line, where $X_j$ and $X_k$ are positively co–dispersed (the pairs are represented by the circles). Projecting the observations onto the 45–degree line produces a new random variable $Z_{(j\,k)}$ (represented by the squares) that is dispersed –since most of the times the pairs have the same sign– and therefore the interquantile range of the projection is large (in a sense to be defined below). By contrast, the right panel shows the case where $X_j$ and $X_k$ are negatively co–dispersed. Projecting them onto the 45–degree line produces a new random variable $Z_{(j\,k)}$ that is concentrated around the origin –since very frequently the pairs have the opposite sign– and therefore the interquantile range of the projection is small (in a sense to be defined below).

[FIGURE 1 ABOUT HERE]

The construction of this function is as follows: let the random variable $Y_j$ be a standardization of $X_j$ by means of the median and the interquantile range:
$$Y_j = \frac{X_j - q_{0.50\,\theta_j}}{IQR_{\theta_j}}.$$

The next step is to project the pair of standardized random variables $(Y_j, Y_k)$ onto the 45–degree line. By standard trigonometric arguments:
$$Z_{(j\,k)} = \frac{1}{\sqrt{2}}(Y_j + Y_k),$$
which can be written as
$$Z_{(j\,k)} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}\begin{pmatrix} Y_j \\ Y_k \end{pmatrix} = a_{\sqrt{2}}\, Y_{(j\,k)}. \qquad (2)$$

Let $\hat{q}_{(j\,k)\,\tau\,N}$ and $q_{(j\,k)\,\tau\,\theta}$ be the $\tau$–th sample and theoretical quantiles of $Z_{(j\,k)}$ respectively. The function that we use as informative for the co–dispersion between $X_j$ and $X_k$ is the interquantile range of $Z_{(j\,k)}$: $\widehat{IQR}_{(j\,k)\,N} = \hat{q}_{(j\,k)\,\tau\,N} - \hat{q}_{(j\,k)\,1-\tau\,N}$ for the sample values and $IQR_{\theta(j\,k)} = q_{(j\,k)\,\tau\,\theta} - q_{(j\,k)\,1-\tau\,\theta}$ for the theoretical counterparts. As shown in the following Lemma, $IQR_{\theta(j\,k)}$ has interesting and intuitive properties.5

Lemma 1 $IQR_{\theta(j\,k)}$ is bounded above and below by $\sqrt{2}$ and 0 respectively (which corresponds to maximal and minimal co–dispersion), it takes value 1 when the co–dispersion is zero, and it does not depend on the tail index $\alpha$.

Proof The vector $a_{\sqrt{2}}$ in (2) is a scale shift. Let $A_{IQR\,\theta\,(j\,k)} = \mathrm{diag}(IQR_{\theta_j}^{-1}, IQR_{\theta_k}^{-1})$, $b_{0.5\,\theta\,(j\,k)} = (q_{0.50\,\theta_j}, q_{0.50\,\theta_k})$, $\mu_{(j\,k)} = (\mu_j, \mu_k)$, and $\Sigma_{(j\,k)}$ a $2 \times 2$ matrix with diagonal elements $\sigma_{j\,j}^2$ and $\sigma_{k\,k}^2$ and off–diagonal element $\sigma_{j\,k}$. Then
$$\varphi_{Z_{(j\,k)}}(\xi) = \exp\!\left(i\xi a_{\sqrt{2}} A_{IQR\,\theta\,(j\,k)}(\mu_{(j\,k)} - b_{0.5\,\theta\,(j\,k)})\right) \int_0^\infty \varphi_U\!\left(r^2 \xi\, a_{\sqrt{2}} A_{IQR\,\theta\,(j\,k)} \Sigma_{(j\,k)} A_{IQR\,\theta\,(j\,k)}' a_{\sqrt{2}}'\, \xi\right) dF_{R_\alpha}(r)$$
is the characteristic function of a univariate elliptical distribution with location and scale
$$\mu_{z\,(j\,k)} = a_{\sqrt{2}} A_{IQR\,\theta\,(j\,k)}(\mu_{(j\,k)} - b_{0.5\,\theta\,(j\,k)}), \qquad \sigma_{z\,(j\,k)}^2 = a_{\sqrt{2}} A_{IQR\,\theta\,(j\,k)} \Sigma_{(j\,k)} A_{IQR\,\theta\,(j\,k)}' a_{\sqrt{2}}'.$$

5The idea of using $X \pm Y$ as a way to glean information about dependence is embedded in the concept of co–difference (Samorodnitsky and Taqqu, 1994). Co–difference and our proposal are however substantially different in several respects. First, our measure is not the sum but the projection onto the 45–degree line. Second, co–difference, as defined in Samorodnitsky and Taqqu (1994), applies to strictly stable random variables and does not generalize to the elliptical family. Third, co–difference is a measure of dependence between two strictly stable random variables, i.e. the equivalent of $\sigma_{i\,j}$ in the elliptical family. Our projection is a way to estimate the latter.

The latter takes the explicit form
$$\sigma_{z\,(j\,k)}^2 = \frac{1}{2}\left(\frac{\sigma_{j\,j}^2}{IQR_{\theta_j}^2} + 2\frac{\sigma_{j\,k}}{IQR_{\theta_j} IQR_{\theta_k}} + \frac{\sigma_{k\,k}^2}{IQR_{\theta_k}^2}\right). \qquad (3)$$

There is a relation between $IQR_{\theta_j}$ and $\sigma_{j\,j}$: $IQR_{\theta_j} = c(\alpha)\sigma_{j\,j}$. Substituting in (3)
$$\sigma_{z\,(j\,k)}^2 = \frac{1}{c(\alpha)^2}\left(1 + \frac{\sigma_{j\,k}}{\sigma_{j\,j}\sigma_{k\,k}}\right),$$
and, since $\sigma_{z\,(j\,k)}^2 = c(\alpha)^{-2} IQR_{\theta(j\,k)}^2$,
$$IQR_{\theta(j\,k)} = \sqrt{1 + \frac{\sigma_{j\,k}}{\sigma_{j\,j}\sigma_{k\,k}}}.$$

So if the co–dispersion is negative, zero or positive, $IQR_{\theta(j\,k)}$ is smaller than, equal to or larger than one. It is bounded below by 0 and above by $\sqrt{2}$ since, by positive definiteness of $\Sigma$, $-\sigma_{j\,j}\sigma_{k\,k} < \sigma_{j\,k} < \sigma_{j\,j}\sigma_{k\,k}$. Last, $IQR_{\theta(j\,k)}$ does not depend on $c(\alpha)$ and therefore neither on the tail index. Q.E.D.

This result was for the co–dispersion between the $j$–th and the $k$–th random variables. Note that $IQR_{\theta(j\,k)}$ uses two quantiles: $q_{(j\,k)\,\tau\,\theta}$ and $q_{(j\,k)\,1-\tau\,\theta}$. We now gather all the quantiles for the $J(J-1)/2$ combinations between the $J$ random variables. Let

$$\hat{q}_{z\,N} = (\hat{q}_{(1\,2)\,\tau\,N}, \hat{q}_{(1\,2)\,1-\tau\,N}, \ldots, \hat{q}_{(J-1\,J)\,\tau\,N}, \hat{q}_{(J-1\,J)\,1-\tau\,N})$$
and
$$q_{z\,\theta} = (q_{(1\,2)\,\tau\,\theta}, q_{(1\,2)\,1-\tau\,\theta}, \ldots, q_{(J-1\,J)\,\tau\,\theta}, q_{(J-1\,J)\,1-\tau\,\theta})$$
be the $J(J-1) \times 1$ vectors of sample and theoretical quantiles of $Z_{(1\,2)}, \ldots, Z_{(J-1\,J)}$. The $J(J-1)/2 \times 1$ vector that is informative about the co–dispersions is
$$h_{off\Sigma}(\hat{q}_{z\,N}) = \left(\widehat{IQR}_{(1\,2)\,N}, \ldots, \widehat{IQR}_{(J-1\,J)\,N}\right),$$
and equivalently for the theoretical quantiles $h_{off\Sigma}(q_{z\,\theta})$.
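A compact sketch of the sample co–dispersion functions, together with a numerical check of Lemma 1 on a bivariate Gaussian (where $\sigma_{j\,j} = \sigma_{k\,k} = 1$ and $\sigma_{j\,k} = \rho$, so the population value is $\sqrt{1+\rho}$, free of the tail index); the name `h_off` is ours:

```python
import numpy as np

def h_off(x, tau=0.75):
    """IQR of the 45-degree projection of the standardized pairs, for each (j,k)."""
    med = np.quantile(x, 0.50, axis=0)
    iqr = np.quantile(x, tau, axis=0) - np.quantile(x, 1 - tau, axis=0)
    y = (x - med) / iqr                              # median/IQR standardization
    out = {}
    J = x.shape[1]
    for j in range(J):
        for k in range(j + 1, J):
            z = (y[:, j] + y[:, k]) / np.sqrt(2)     # projection, equation (2)
            out[(j, k)] = np.quantile(z, tau) - np.quantile(z, 1 - tau)
    return out

rng = np.random.default_rng(1)
rho = 0.6
x = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=200_000)
print(h_off(x)[(0, 1)], np.sqrt(1 + rho))            # both close to 1.2649
```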

3.3 The optimization and asymptotic properties

In this section we make use of standard results on a vector of sample quantiles of a random variable (see for instance Cramér, 1946), and of Babu and Rao (1988), who derive the joint asymptotic distribution of marginal quantiles in samples from a random vector (for instance, the asymptotic distribution of the medians of the elements of a random vector).

Recall that $\hat{q}_{j\,N}$ is a vector of sample quantiles from the same marginal distribution. Cramér (1946) shows that $\sqrt{N}(\hat{q}_{j\,N} - q_\theta) \xrightarrow{d} \mathcal{N}(0, \eta_\theta^*)$, where the $(j\,g)$–th element of $\eta_\theta^*$ is
$$\eta_{\theta\,j\,g}^* = \frac{\tau_j \wedge \tau_g - \tau_j\tau_g}{f(F^{-1}(\tau_j))\, f(F^{-1}(\tau_g))}.$$
On the other hand, Babu and Rao (1988) show that the $J \times 1$ vector of sample quantiles $\hat{q}_N = (\hat{q}_{1\,\tau_1\,N}, \ldots, \hat{q}_{J\,\tau_J\,N})'$, where each element is computed from a different marginal distribution, has the following asymptotic distribution: $\sqrt{N}(\hat{q}_N - q_\theta) \xrightarrow{d} \mathcal{N}(0, \eta_\theta^{**})$, where the $(j\,g)$–th element of $\eta_\theta^{**}$ is
$$\eta_{\theta\,j\,g}^{**} = \frac{F_{j\,g}(F_j^{-1}(\tau_j), F_g^{-1}(\tau_g)) - \tau_j\tau_g}{f_j(F_j^{-1}(\tau_j))\, f_g(F_g^{-1}(\tau_g))}$$
and $F_{j\,g}(\cdot)$ is the bivariate cumulative distribution function between the $j$–th and the $g$–th random variables.

We gather $\eta_\theta^*$ and $\eta_\theta^{**}$ into the $J(J+4) \times J(J+4)$ matrix $\eta_\theta$ that is composed of blocks. The diagonal blocks are due to Cramér (1946). Let $\eta_{\theta\,k\,g}^{(r\,r)}$ be the $(k\,g)$–th element of the $r$–th diagonal block. Then
$$\eta_{\theta\,k\,g}^{(r\,r)} = \frac{\tau_k \wedge \tau_g - \tau_k\tau_g}{f_r(F_r^{-1}(\tau_k))\, f_r(F_r^{-1}(\tau_g))}.$$
The off–diagonal blocks are due to Babu and Rao (1988):
$$\eta_{\theta\,k\,g}^{(r\,j)} = \frac{F_{r\,j}(F_r^{-1}(\tau_k), F_j^{-1}(\tau_g)) - \tau_k\tau_g}{f_r(F_r^{-1}(\tau_k))\, f_j(F_j^{-1}(\tau_g))}$$
for $r \neq j$. Let $\hat{\eta}$ be the estimator of $\eta_\theta$.

We gather $h_\mu(\hat{q}_N)$, $h_{diag\Sigma}(\hat{q}_N)$, $h_{off\Sigma}(\hat{q}_{z\,N})$ and $h_\alpha(\hat{q}_N)$ into the $\frac{J(J+5)}{2} \times 1$ vector $h(\hat{q}_N, \hat{q}_{z\,N})$ of functions of quantiles:
$$h(\hat{q}_N, \hat{q}_{z\,N}) = (h_\mu(\hat{q}_N), h_{diag\Sigma}(\hat{q}_N), h_{off\Sigma}(\hat{q}_{z\,N}), h_\alpha(\hat{q}_N)).$$

Note that while $h(\hat{q}_N, \hat{q}_{z\,N})$ is of dimension $J(J+5)/2$, the dimensions of the vectors $\hat{q}_N$ and $\hat{q}_{z\,N}$ are $5J$ and $J(J-1)$ respectively; i.e. $(\hat{q}_N, \hat{q}_{z\,N})$ is of dimension $J(J+4)$. Denote by $\hat{\Omega}$ the $\frac{J(J+5)}{2} \times \frac{J(J+5)}{2}$ sample variance–covariance matrix of $h(\hat{q}_N, \hat{q}_{z\,N})$. Equivalently, we gather the theoretical quantiles into $h(q_\theta, q_{z\,\theta})$. We need assumptions A3, A4 and A5 below, followed by Lemma 2, which shows the asymptotic properties of $h(\hat{q}_N, \hat{q}_{z\,N})$.

A3 (a) $h(q_\theta, q_{z\,\theta})$ is continuously differentiable with respect to $\theta$ for all $x$, and measurable in $x$ for all $\theta \in \Theta$. (b) $\frac{\partial h(q_{\theta_0}, q_{z\,\theta_0})}{\partial (q_{\theta_0}, q_{z\,\theta_0})'}$ is of full column rank. (c) $h(q_\theta, q_{z\,\theta})$ is injective with respect to $\theta$.

A4 (a) $\lim_{N\to\infty} \hat{\eta} = \eta_{\theta_0}$. (b) $\lim_{N\to\infty} \hat{\Omega} = \Omega_{\theta_0}$.

A5 There exists a unique true value $\theta_0$ such that the probabilistic limit of the sample functions of quantiles equals the theoretical ones.

Lemma 2 Under A1–A5,
$$\sqrt{N}\left(h(\hat{q}_N, \hat{q}_{z\,N}) - h(q_{\theta_0}, q_{z\,\theta_0})\right) \xrightarrow{d} \mathcal{N}(0, \Omega_{\theta_0}),$$
where $\Omega_{\theta_0} = \frac{\partial h(q_{\theta_0}, q_{z\,\theta_0})}{\partial (q_{\theta_0}, q_{z\,\theta_0})'}\, \eta_{\theta_0}\, \frac{\partial h(q_{\theta_0}, q_{z\,\theta_0})'}{\partial (q_{\theta_0}, q_{z\,\theta_0})}$ is a symmetric positive definite variance–covariance matrix.

The proof is immediate and follows by applying the delta method. The vector $h(q_\theta, q_{z\,\theta})$ does not have an explicit relation with $\theta$ but it is obtained from the sample quantiles of simulated observations, which we denote by $\tilde{h}(q_\theta, q_{z\,\theta})$. Moreover, we draw $R$ simulated paths and compute the average vector $\tilde{h}^R(q_\theta, q_{z\,\theta}) = \frac{1}{R}\sum_{r=1}^{R} \tilde{h}^r(q_\theta, q_{z\,\theta})$.

The estimation principle is to find the value of the parameters that best matches the sample and theoretical functions of quantiles. This is done by minimizing the quadratic distance between $h(\hat{q}_N, \hat{q}_{z\,N})$ and $\tilde{h}^R(q_\theta, q_{z\,\theta})$:
$$\hat{\theta}_N = \operatorname*{argmin}_{\theta\in\Theta}\; (h(\hat{q}_N, \hat{q}_{z\,N}) - \tilde{h}^R(q_\theta, q_{z\,\theta}))'\, W_\theta\, (h(\hat{q}_N, \hat{q}_{z\,N}) - \tilde{h}^R(q_\theta, q_{z\,\theta})). \qquad (4)$$

The $J(J+5)/2 \times J(J+5)/2$ weighting matrix $W_\theta$ defines the metric and complies with the following assumption.

A6 $W_\theta$ is full rank and symmetric positive definite.

Since it depends on the parameters, we proceed similarly to GMM and Indirect Inference: we optimize (4) with $W_\theta = I$, the identity matrix. The estimated parameters, $\check{\theta}_N$, albeit inefficient, are consistent. Then we replace $\theta$ by $\check{\theta}_N$ in $W_\theta$ and solve
$$\hat{\theta}_N = \operatorname*{argmin}_{\theta\in\Theta}\; (h(\hat{q}_N, \hat{q}_{z\,N}) - \tilde{h}^R(q_\theta, q_{z\,\theta}))'\, W^*_{\check{\theta}_N}\, (h(\hat{q}_N, \hat{q}_{z\,N}) - \tilde{h}^R(q_\theta, q_{z\,\theta})). \qquad (5)$$
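The two–step procedure can be sketched generically as follows. This is a schematic implementation with our own naming: `h_sample` stands for $h(\hat{q}_N, \hat{q}_{z\,N})$, `h_model` maps $\theta$ to $\tilde{h}^R(q_\theta, q_{z\,\theta})$ (simulated with common random numbers across evaluations, so that the objective is smooth), and `estimate_Omega` is a user–supplied estimator of $\Omega$ (e.g. by bootstrap); scipy is assumed.

```python
import numpy as np
from scipy.optimize import minimize

def msq_two_step(h_sample, h_model, theta0, estimate_Omega):
    """Two-step MSQ: first W = I (consistent but inefficient),
    then W* = Omega^{-1} evaluated at the first-step estimate."""
    def objective(theta, W):
        d = h_sample - h_model(theta)
        return d @ W @ d

    m = len(h_sample)
    step1 = minimize(objective, theta0, args=(np.eye(m),), method="Nelder-Mead")
    W_star = np.linalg.inv(estimate_Omega(step1.x))
    step2 = minimize(objective, step1.x, args=(W_star,), method="Nelder-Mead")
    return step2.x
```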

Under Lemma 2 and A6, the estimator $\hat{\theta}_N$ is consistent and asymptotically Gaussian, as shown in the following Theorem.

Theorem Given Lemma 2 and under A6,
$$\sqrt{N}(\hat{\theta}_N - \theta_0) \xrightarrow{d} \mathcal{N}\!\left(0, \left(1 + \frac{1}{R}\right) D_{\theta_0} W_{\theta_0} \Omega_{\theta_0} W_{\theta_0}' D_{\theta_0}'\right),$$
where $D_{\theta_0} = \left(\frac{\partial h(q_{\theta_0}, q_{z\,\theta_0})'}{\partial\theta}\, W_{\theta_0}\, \frac{\partial h(q_{\theta_0}, q_{z\,\theta_0})}{\partial\theta'}\right)^{-1} \frac{\partial h(q_{\theta_0}, q_{z\,\theta_0})'}{\partial\theta}$.

The multilayer sandwich form of the variance–covariance matrix has an intuitive explanation: $\Omega_{\theta_0}$ is the variance–covariance matrix of $h(\hat{q}_N, \hat{q}_{z\,N})$. The first layer $W_{\theta_0} \cdot W_{\theta_0}'$ is the effect of the weighting matrix. The second layer, $D_{\theta_0} \cdot D_{\theta_0}'$, captures the mapping of $h(\hat{q}_N, \hat{q}_{z\,N})$ on $\hat{\theta}_N$. The proof is not shown here as it follows the same lines as in Dominicy and Veredas (2012) and Gouriéroux, Monfort and Renault (1993). The calculation of the asymptotic variance–covariance matrix of $\hat{\theta}_N$ needs an estimator of $\Omega_{\theta_0}$, which is a function of the grid of $\tau$'s and the sparsity function, and is obtained via simulations. The vector of derivatives $\frac{\partial h(q_{\theta_0}, q_{z\,\theta_0})}{\partial\theta'}$ can be computed numerically.

The inverse of the variance–covariance matrix of the quantile functions is the optimal weighting matrix, in the sense that it minimizes the variance–covariance matrix of $\sqrt{N}(\hat{\theta}_N - \theta_0)$. Thus, under $W^*_{\theta_0} = \Omega_{\theta_0}^{-1}$, the estimator has the following asymptotic distribution:
$$\sqrt{N}(\hat{\theta}_N - \theta_0) \xrightarrow{d} \mathcal{N}\!\left(0, \left(1 + \frac{1}{R}\right)\left(\frac{\partial h(q_{\theta_0}, q_{z\,\theta_0})'}{\partial\theta}\, \Omega_{\theta_0}^{-1}\, \frac{\partial h(q_{\theta_0}, q_{z\,\theta_0})}{\partial\theta'}\right)^{-1}\right).$$

3.4 Going fast

For moderate dimensions, the minimization (5) can be expensive in terms of computational time, and for vast dimensions (i.e. hundreds or thousands) it is infeasible. However, there is a way to circumvent almost all the optimizations.

A closer look at the functions of quantiles for the tail index $h_\alpha(\hat{q}_N)$ reveals that they are location and scale free. We can therefore optimize them with respect to $\alpha$ independently of $h_\mu(\hat{q}_N)$, $h_{diag\Sigma}(\hat{q}_N)$ and $h_{off\Sigma}(\hat{q}_{z\,N})$:
$$\hat{\alpha}_N = \operatorname*{argmin}_{\alpha\in\Theta}\; (h_\alpha(\hat{q}_N) - \ddot{h}_\alpha^R(q_\theta))'\, \hat{W}_{\check{\alpha}_N}\, (h_\alpha(\hat{q}_N) - \ddot{h}_\alpha^R(q_\theta)), \qquad (6)$$
where the double dot indicates that the theoretical functions are estimated from simulated observations from a standardized distribution, and so they only depend on $\alpha$. The optimization (6) involves $J$ equations, which is still infeasible for vast dimensions. However, by the properties of the elliptical family, the marginal distributions are also elliptical with the same tail index, meaning that optimization with respect to just one function is enough to obtain a consistent estimator for $\alpha$. Let $h_\alpha(\hat{q}_{j\,N}) = \frac{\hat{q}_{j\,0.95\,N} - \hat{q}_{j\,0.05\,N}}{\hat{q}_{j\,0.75\,N} - \hat{q}_{j\,0.25\,N}}$ and similarly for $\ddot{h}_\alpha^R(q_{\theta_j})$; then

$$\hat{\alpha}_{j\,N} = \operatorname*{argmin}_{\alpha\in\Theta}\; (h_\alpha(\hat{q}_{j\,N}) - \ddot{h}_\alpha^R(q_{\theta_j}))'\, \hat{W}_{\check{\alpha}_N}\, (h_\alpha(\hat{q}_{j\,N}) - \ddot{h}_\alpha^R(q_{\theta_j})).$$

While we know that $\hat{\alpha}_{j\,N}$ converges in probability to $\alpha$, in finite samples the use of the realizations of just one element of the random vector may introduce undesirable biases. To sweep them out, we compute the pooled estimator $\hat{\alpha}_N^{(J)} = \frac{1}{J}\sum_{j=1}^{J} \hat{\alpha}_{j\,N}$. Furthermore, for very large dimensions, the average over $J^* \ll J$ marginals suffices. A sensible choice, but not the unique one, for $J^*$ is $\lfloor \log(J) + 100 \rfloor$; i.e. for dimensions smaller than 100, $J^* = J$, while $J^*$ increases logarithmically for higher dimensions. The Monte Carlo study in the next section confirms the goodness of this methodology.
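A sketch of this fast tail–index step for the Student–t case (the ESD case only changes the simulator). Matching the quantile ratio marginal by marginal requires only a bounded scalar optimization; fixing the seed across evaluations plays the role of common random numbers and keeps the objective smooth in $\alpha$. The search bounds are our own choice:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def quantile_ratio(v):
    q05, q25, q75, q95 = np.quantile(v, [0.05, 0.25, 0.75, 0.95])
    return (q95 - q05) / (q75 - q25)

def ratio_student(alpha, R=20, N=100_000, seed=42):
    """Average simulated ratio for a standardized Student-t (the double-dot h)."""
    rng = np.random.default_rng(seed)
    return np.mean([quantile_ratio(rng.standard_t(alpha, N)) for _ in range(R)])

def alpha_pooled(x, J_star):
    """Univariate estimates on J* marginals, then the pooled average."""
    estimates = []
    for j in range(J_star):
        target = quantile_ratio(x[:, j])
        res = minimize_scalar(lambda a: (target - ratio_student(a)) ** 2,
                              bounds=(2.1, 40.0), method="bounded")
        estimates.append(res.x)
    return np.mean(estimates)
```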

Once the tail index is estimated, the dispersions $\sigma_{j\,j}$ are straightforwardly estimated, following McCulloch (1986):
$$\hat{\sigma}_{j\,j\,N} = \frac{\hat{q}_{j\,\tau\,N} - \hat{q}_{j\,1-\tau\,N}}{\ddot{q}_{j\,\tau\,\hat{\alpha}_N} - \ddot{q}_{j\,1-\tau\,\hat{\alpha}_N}},$$
where $\ddot{q}_{\tau\,\hat{\alpha}_N}$ is the theoretical $\tau$–th quantile of a standardized distribution, similarly to (6). Thus, no optimizations are needed for the dispersion parameters. Likewise, no optimizations are needed to estimate the co–dispersions $\sigma_{j\,k}$. Recall that $Z_{(j\,k)}$ is the projection of the $j$–th and the $k$–th standardized random variables onto the 45–degree line. The dispersion of $Z_{(j\,k)}$ can be computed similarly to above:

$$\hat{\sigma}_{z\,(j\,k)\,N} = \frac{\hat{q}_{(j\,k)\,\tau\,N} - \hat{q}_{(j\,k)\,1-\tau\,N}}{\ddot{q}_{(j\,k)\,\tau\,\hat{\alpha}_N} - \ddot{q}_{(j\,k)\,1-\tau\,\hat{\alpha}_N}},$$
where $\ddot{q}_{(j\,k)\,\tau\,\hat{\alpha}_N}$ is the theoretical $\tau$–th quantile of the standardized distribution of $Z_{(j\,k)}$ and so it only depends on the tail index. The object of interest is, however, the co–dispersion $\hat{\sigma}_{j\,k\,N}$ between $X_j$ and $X_k$. By re–arranging (3) it can be shown that
$$\hat{\sigma}_{j\,k\,N} = \hat{\sigma}_{z\,(j\,k)\,N}^2\, IQR_{\theta_j} IQR_{\theta_k} - \frac{1}{2}\left(\hat{\sigma}_{j\,j\,N}^2 \frac{IQR_{\theta_k}}{IQR_{\theta_j}} + \hat{\sigma}_{k\,k\,N}^2 \frac{IQR_{\theta_j}}{IQR_{\theta_k}}\right),$$
where $IQR_{\theta_j}$ is the theoretical interquantile range for the $j$–th random variable evaluated at $\hat{\sigma}_{j\,j\,N}$ and $\hat{\alpha}_N$, and equivalently for $IQR_{\theta_k}$.
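Both closed forms translate directly into code. A sketch with our own naming, where the standardized quantiles $\ddot{q}$ are assumed to have been simulated beforehand at $\hat{\alpha}_N$:

```python
import numpy as np

def dispersion_hat(xj, q_std_tau, q_std_1mtau, tau=0.75):
    """sigma_jj: sample IQR divided by the standardized-distribution IQR."""
    return ((np.quantile(xj, tau) - np.quantile(xj, 1 - tau))
            / (q_std_tau - q_std_1mtau))

def codispersion_hat(sig2_z, iqr_j, iqr_k, sig_jj, sig_kk):
    """sigma_jk by inverting equation (3)."""
    return (sig2_z * iqr_j * iqr_k
            - 0.5 * (sig_jj**2 * iqr_k / iqr_j + sig_kk**2 * iqr_j / iqr_k))
```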

Last, the locations are also estimated without optimizations: $\hat{\mu}_{j\,N} = \hat{q}_{j\,0.50\,N}$.

To sum up, we only optimize $J^*$ univariate problems for the tail index $\alpha$, while the estimators for the locations, dispersions and co–dispersions are available analytically. This makes the method feasible for vast dimensions and it provides accurate estimators, as shown in the Monte Carlo study.

Since dispersions and co–dispersions are estimated separately, in finite samples it is likely that the estimated dispersion matrix $\hat{\Sigma}_N$ is not positive definite. Though the estimates can be very close to the true values, the small finite sample biases in the elements of the matrix accumulate and, as a result, it may not have full rank. To circumvent this problem one can average the simulated functions of quantiles over $R$ draws, but it is time consuming and for large dimensions it does not ensure positive definiteness.6 An alternative is regularizing by means of the eigenvalue cleaning technique (see Laloux, Cizeau, Bouchaud and Potters, 1999, Tola, Lillo, Gallegati and Mantegna, 2008, and Hautsch, Kyj and Oomen, 2011). Let $\hat{\Gamma}_N = \mathrm{diag}(\hat{\Sigma}_N)^{-1/2}\, \hat{\Sigma}_N\, \mathrm{diag}(\hat{\Sigma}_N)^{-1/2}$ be the estimated standardized dispersion matrix with spectral decomposition $\hat{\Gamma}_N = \hat{Q}_N \hat{\Lambda}_N \hat{Q}_N'$, where $\hat{Q}_N$ is the orthonormal matrix of estimated eigenvectors and $\hat{\Lambda}_N$ is the diagonal matrix of estimated eigenvalues. Let $\hat{\lambda}_{(1)\,N} \geq \ldots \geq \hat{\lambda}_{(J)\,N}$ be the ordered eigenvalues (i.e. $\hat{\lambda}_{(1)\,N}$ is the largest and $\hat{\lambda}_{(J)\,N}$ is the smallest). Eigenvalue cleaning is based on replacing the eigenvalues less than a threshold $\lambda_{max}$ by the average of the positive eigenvalues below $\lambda_{max}$:
$$\tilde{\lambda}_N = \frac{\sum_{l=0}^{L} \max(0, \hat{\lambda}_{(J-l)\,N})}{L+1},$$
where $L+1$ corresponds to the position, from the right, of the largest eigenvalue smaller than $\lambda_{max}$. The resulting estimated standardized dispersion matrix is $\tilde{\Gamma}_N = \hat{Q}_N \tilde{\Lambda}_N \hat{Q}_N'$ and the positive definite estimated dispersion matrix is obtained by un–standardizing $\tilde{\Gamma}_N$: $\tilde{\Sigma}_N = \mathrm{diag}(\hat{\Sigma}_N)^{1/2}\, \tilde{\Gamma}_N\, \mathrm{diag}(\hat{\Sigma}_N)^{1/2}$.

The threshold is given by $\lambda_{max} = \left(1 - \sum_{l=1}^{L^*} \hat{\lambda}_{(l)\,N}/J\right)\left(1 + J/N + 2\sqrt{J/N}\right)$, i.e. it is a function of the $L^*$ largest eigenvalues. The smaller $L^*$, the larger the difference between $\hat{\Sigma}_N$ and $\tilde{\Sigma}_N$. The above mentioned references consider $L^* = 1$ on the grounds that $\hat{\lambda}_{(1)\,N}$ represents the common dispersion. There is however no reason in our case to consider this value. After calibration we have found that $L^* = 10$ is the best compromise.7
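The eigenvalue cleaning step, as described above, is a few lines of linear algebra. A sketch assuming numpy:

```python
import numpy as np

def eigenvalue_clean(Sigma_hat, N, L_star=10):
    """Regularize an estimated dispersion matrix by eigenvalue cleaning."""
    d = np.sqrt(np.diag(Sigma_hat))
    Gamma = Sigma_hat / np.outer(d, d)            # standardized dispersion matrix
    lam, Q = np.linalg.eigh(Gamma)                # eigenvalues in ascending order
    J = len(lam)
    lam_desc = lam[::-1]
    lam_max = ((1.0 - lam_desc[:L_star].sum() / J)
               * (1.0 + J / N + 2.0 * np.sqrt(J / N)))
    small = lam < lam_max                         # the L+1 eigenvalues to replace
    if small.any():
        lam[small] = np.maximum(lam[small], 0.0).mean()
    Gamma_tilde = (Q * lam) @ Q.T                 # Q diag(lam) Q'
    return Gamma_tilde * np.outer(d, d)           # un-standardize
```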

6Detailed results are available upon request. 7Results for the Monte Carlo study and the empirical application with other values of $L^*$ are available upon request.

4 Assessing the finite sample properties

We analyze the finite sample properties of the estimators for the Gaussian, Student–t and ESD.8 We consider dimensions 20, 200 and 2000. For dimension 20 the parameters are estimated with both the optimization (4) and the fast method, while for the 200 and 2000 dimensions we only use the fast method. Locations are set to zero. Tail indexes are set to 3, 10 and 20 for the Student–t, and to 1.5, 1.7 and 1.9 for the ESD. Regarding the dispersion matrices, the top left plots of Figures 3, 4 and 5 display the true ones.

[FIGURE 2 ABOUT HERE]

For each dimension we simulate 10, 100 and 500 draws of 100, 1000 and 5000 observations. Due to the high number of parameters (231, 20301 and 2003001 for 20, 200 and 2000 dimensions respectively) we only present the medians of the estimators for the locations (in the form of line plots) and dispersion matrices (in the form of heat maps). Because of space considerations, we present results just for the fast method, 500 draws and 1000 observations. Results for other configurations, which do not change qualitatively, as well as detailed results (such as the root mean square error and the mean absolute deviation) for all configurations, are available upon request.

[FIGURE 3 ABOUT HERE]

[FIGURE 4 ABOUT HERE]

[FIGURE 5 ABOUT HERE]

Figure 2 shows the results for the location parameters, Figures 3, 4 and 5 show the estimated dispersion matrices, and Table 1 displays the estimated tail indexes and the computational time per draw. The estimated location parameters are all around zero, with a variance that increases with the dimension. This is due to the fact that in this set up the dispersions increase with the dimension (as can be easily understood by looking at the y–axis of the heat maps). The estimated dispersion matrices are in general very close to the true ones, with a slight deterioration as the dimension grows.9 The distributional assumption and the degree of tail thickness do not affect the estimation, as the estimated dispersion matrices are alike.

[TABLE 1 ABOUT HERE]

8Simulating from an ESD requires simulating from a Gaussian and from a totally right skewed standardized univariate stable distribution:
$$A \sim S_{\alpha/2}\!\left(\left(\cos\frac{\pi\alpha}{4}\right)^{2/\alpha}, 1, 0\right),$$
for which we use Chambers et al. (1976). 9The plots of the estimated dispersion matrices in Figure 5 should be taken cautiously since the scale is not the same as in the true dispersion matrix.

The estimated tail indexes are very close to the true ones for all dimensions, with a marginal upper bias of 1.5–2.5% that deserves further research. The table also shows the estimation time per draw (in seconds).10 The maximum computational times are 10 seconds for dimension 20, 1 minute and 15 seconds for the 200 dimensional case, and 70 minutes for dimension 2000. These times show that the method is indeed fast. Substantial computational gains can be obtained if the code is written in C and run in a parallel processing environment.
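For completeness, the ESD simulation of footnote 8 can be sketched as follows. Instead of the general Chambers et al. (1976) algorithm, this sketch swaps in Kanter's representation of a positive $a$–stable variate (valid for $0 < a < 1$, here $a = \alpha/2$), which coincides with the $S_{\alpha/2}((\cos\frac{\pi\alpha}{4})^{2/\alpha}, 1, 0)$ law of footnote 8; the $\sqrt{2}$ scaling is our convention so that the characteristic function matches the one in Section 2.

```python
import numpy as np

def positive_stable(a, size, rng):
    """Kanter's representation: Laplace transform exp(-s**a), 0 < a < 1."""
    U = rng.uniform(0.0, np.pi, size)
    E = rng.exponential(1.0, size)
    return (np.sin(a * U) / np.sin(U) ** (1.0 / a)
            * (np.sin((1.0 - a) * U) / E) ** ((1.0 - a) / a))

def sample_esd(N, alpha, mu, Lam, rng):
    """ESD draws X = mu + sqrt(2A) * Lam * G with G ~ N(0, I), so that
    the cf is exp(i xi' mu) exp(-(xi' Sigma xi)^(alpha/2))."""
    J = len(mu)
    A = positive_stable(alpha / 2.0, N, rng)
    G = rng.standard_normal((N, J))
    return mu + np.sqrt(2.0 * A)[:, None] * (G @ Lam.T)
```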

5 Testing for level contours fit

In many applications the researcher is interested in fitting one, or several, level contours rather than the whole distribution. In financial applications, for instance, the interest lies in the tails. The tests that we propose in this section are based on the failure rate: the percentage of times that the observations are inside one, or several, estimated level contours. If these contours are well specified and estimated, the empirical failure rates should not be statistically different from the nominal rates. To test this hypothesis, we rely on the fact that a vector of empirical failure rates is multinomial, and therefore simple Wald test statistics can be used.

The stochastic representation of the elliptical family of distributions can be re–written as $(X - \mu)'\Sigma^{-1}(X - \mu) = R_\alpha^2$. Testing for the correct distributional assumption boils down to testing for the distributional assumption of $R_\alpha$, which can be done, once we substitute $\mu$ and $\Sigma$ by estimators, with standard goodness–of–fit test statistics (Cramér–von–Mises, Anderson–Darling, and Kolmogorov–Smirnov among others). A different issue is testing for the choice of $R_\alpha$ that best fits certain level contours, in particular those in the tails. For the same $\mu$ and $\Sigma$, what makes one elliptical distribution different from another is the generating variate $R_\alpha$. Let $\hat{\mu}_N$, $\hat{\Sigma}_N$ and $\hat{\alpha}_N$ be the estimators; then:
$$(X - \hat{\mu}_N)'\hat{\Sigma}_N^{-1}(X - \hat{\mu}_N) = R_{\hat{\alpha}_N}^2.$$
For a given value of $R_{\hat{\alpha}_N}^2 = c$, this is the equation of a $J$–dimensional ellipsoid. By varying $c$ we obtain ellipsoids of different sizes. But as they are ellipsoids of a distribution, they have the interpretation of level contours: the constant $c$ is the $\tau$–quantile of $R_{\hat{\alpha}_N}^2$ (which we denote by $q_{\tau,R}$) or the $\tau$–level contour of $X$:
$$(X - \hat{\mu}_N)'\hat{\Sigma}_N^{-1}(X - \hat{\mu}_N) = q_{\tau,R}.$$

Note that $q_{\tau,R}$ can be easily computed with simulations. If the distributional assumption on $R_\alpha$ fits correctly the $\tau$–level contour, then $\tau_0$% (the hypothesized value) of the observations should lie inside the corresponding $\tau_0$–level contour. Let
$$I_l = \begin{cases} 1 & \text{if } (x_l - \hat{\mu}_N)'\hat{\Sigma}_N^{-1}(x_l - \hat{\mu}_N) < q_{\tau_0,R} \\ 0 & \text{otherwise,} \end{cases}$$
for $l = 1, \ldots, N$, and where $x_l$ is a $J \times 1$ vector of observations. Then
$$\hat{\tau}_N = \frac{1}{N}\sum_{l=1}^{N} I_l$$

10All the simulation study and the empirical illustration below were performed on a Sony Vaio with an Intel Core Duo processor of 2.10GHz and 4GB of SDRAM.

is the empirical proportion of observations inside the $\tau$–level contour. We can proceed similarly for a grid $\tau = (\tau_1, \ldots, \tau_b)$ with $\tau_0 = (\tau_{1\,0}, \ldots, \tau_{b\,0})$ hypothesized levels and empirical counterparts $\hat{\tau}_N = (\hat{\tau}_{1\,N}, \ldots, \hat{\tau}_{b\,N})$. Equipped with these vectors, several hypothesis testing scenarios are possible, which are shown in Figure 6 and which we exemplify in terms of financial risk management. The top left plot in Figure 6 shows scenario 1, where the risk manager is interested in fitting correctly just one $\tau$–level contour, say the 95%. The top right plot shows scenario 2, where the risk manager is, instead, interested in fitting correctly several $\tau$–level contours, e.g. $\tau = (0.99, 0.975, 0.95, 0.925, 0.90)$. The bottom plots show two situations where the main interest is one contour but the risk manager would also like to fit reasonably well the neighbouring ones (the precise meaning of "reasonably well" to be explained below). The bottom left shows scenario 3, where the middle $\tau$–level contour is the main concern, and two $\tau$–level contours that are equally spaced from the one of interest have the same importance. The bottom right plot shows a similar case but the relation is asymmetric: outer neighbouring $\tau$–level contours are more important than inner $\tau$–level contours. This is scenario 4.

[FIGURE 6 ABOUT HERE]

Whatever the scenario is, they can all be framed in the following null hypothesis:
$$H_0 : R(\tau - \tau_0) = 0,$$
where $R$ is the identity matrix of size $b \times b$ for scenarios 2, 3 and 4, or a $1 \times b$ vector $R = (0, \ldots, 1, \ldots, 0)$ for scenario 1. The test statistic is a Wald–type that, following Laurent and Veredas (2011), incorporates the parameter uncertainty in $\hat{\mu}_N$, $\hat{\Sigma}_N$ and $\hat{\alpha}_N$:
$$\xi_N = N\, (R(\hat{\tau}_N - \tau_0))'\, \left(R(\Pi_{\theta_0} - B\Psi_{\theta_0}B')R'\right)^{-1} (R(\hat{\tau}_N - \tau_0)) \sim \chi_b^2,$$
where $\Pi_{\theta_0}$ is the variance–covariance matrix of a multinomial (replaced by its estimator $\hat{\Pi}_{i\,i} = \hat{\tau}_{i\,N}(1 - \hat{\tau}_{i\,N})$ and $\hat{\Pi}_{i\,j} = \hat{\tau}_{i\,N} \wedge \hat{\tau}_{j\,N} - \hat{\tau}_{i\,N}\hat{\tau}_{j\,N}$), $\Psi_{\theta_0}$ is the variance–covariance matrix of $\sqrt{N}(\hat{\theta}_N - \theta_0)$, replaced by its estimator $\hat{\Psi}_{\hat{\theta}_N}$, and $B = \lim_{N\to\infty} \frac{\partial \hat{\tau}_N}{\partial\theta'}$. For vast dimensional problems, $\Psi_{\theta_0}$ is very large and the computation of $B$ can be time consuming. An alternative is to bootstrap from $I_l$ and substitute $\hat{\tau}_{i\,N}$ and $\Pi_{\theta_0} - B\Psi_{\theta_0}B'$ by the mean and variance of the bootstrapped $\tau$'s.

Scenarios 3 and 4 are special, as the main interest is one $\tau$–level contour while the neighbouring ones are of minor concern, but they are taken into account. In other words, the main interest is fitting one $\tau$–level contour with a high degree of precision, or intolerance, while the fit of neighbouring contours can be less precise (with increasing tolerance as they are further from the one of interest). From a statistical viewpoint, increasing tolerance translates into increasing size, or probability of type I error. Let $\zeta_0$ be the size for the $\tau$–level contour of interest, and let $\zeta = \zeta_0 w$ be the vector of sizes for all the considered contours, where $w = (w_{-(b-1)/2}, \ldots, w_0, \ldots, w_{(b-1)/2})$ is a vector of weights (larger than one except $w_0 = 1$) that governs the increase in the sizes of the neighbouring contours. A sensible pattern for the weights is
$$w_k = \begin{cases} 1 + \left(\frac{c_r}{\zeta_0} - 1\right)(1 - \exp(-k/s_r)) & \text{for } k > 0 \\[4pt] 1 + \left(\frac{c_l}{\zeta_0} - 1\right)(1 - \exp(k/s_l)) & \text{for } k \leq 0, \end{cases}$$
where $(c_r, s_r)$ and $(c_l, s_l)$ are tuning parameters that control (to the right and left side of $\zeta_0$ respectively) the degree of asymmetry and the maximum tolerance. Their bounds, for $k \leq 0$, are

$$\zeta_0 \leq c_l \leq \frac{1 - \zeta_0 \exp(k/s_l)}{1 - \exp(k/s_l)}, \qquad 0 < s_l < +\infty.$$

And equivalently for $c_r$ and $s_r$ (but for $k > 0$ and substituting $k$ by $-k$). This is therefore a Bonferroni–type of multiple testing problem for $b$ hypotheses with different sizes. Since the critical regions of the hypotheses may not be independent, it is difficult to compute the level of the test statistic $\xi_N$ under the null (denoted by $\zeta$).11

[TABLE 2 ABOUT HERE]

We know however that the upper bound is given by $\zeta \leq \sum_{i=1}^{b} \zeta_i = \zeta_0 \sum_{i=1}^{b} w_i$, and that the difference with the case of independence is small for reasonable values of $b$. Table 2 shows the values of the sizes in the cases of independence and dependence among the hypotheses. Note that, for instance, $b = 9$ means that there are 4 neighbouring contours on each side. Though differences increase with $\zeta_0$, for small and moderate numbers of neighbouring contours they are minor.
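A sketch of the test for scenario 2 ($R = I$), using simulation–based quantiles of $R^2_{\hat{\alpha}_N}$ and the multinomial covariance; for brevity the parameter–uncertainty correction $B\Psi_{\theta_0}B'$ is omitted, which corresponds to treating the parameters as known. The naming is ours; numpy and scipy are assumed.

```python
import numpy as np
from scipy.stats import chi2

def contour_test(x, mu_hat, Sigma_hat, q_R, tau0):
    """Wald test of H0: tau_hat = tau0 for the level contours whose
    R^2-quantiles q_R (one per element of tau0) come from simulation."""
    N = x.shape[0]
    xc = x - mu_hat
    d2 = np.einsum('ij,jk,ik->i', xc, np.linalg.inv(Sigma_hat), xc)  # Mahalanobis^2
    tau_hat = np.array([(d2 < q).mean() for q in q_R])
    # Multinomial-type covariance: Pi_ij = min(tau_i, tau_j) - tau_i * tau_j
    Pi = np.minimum.outer(tau_hat, tau_hat) - np.outer(tau_hat, tau_hat)
    diff = tau_hat - np.asarray(tau0)
    xi = N * diff @ np.linalg.solve(Pi, diff)
    return xi, chi2.sf(xi, df=len(tau0))
```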

6 Illustration

We illustrate the method with 9 years of daily returns of 22 major worldwide market indexes that represent three geographical areas: America (S&P500, NASDAQ, TSX, Merval, Bovespa and IPC), Europe and Middle East (AEX, ATX, FTSE, DAX, CAC40, SMI, MIB and TA100), and East Asia and Oceania (HgSg, Nikkei, StrTim, SSEC, BSE, KLSE, KOSPI and AllOrd). This is the same data as used in Dominicy and Veredas (2012). We refer to Table 4 of this article for further details. The sample consists of 2536 observations, from January 4, 2000 to September 22, 2009.

[FIGURE 7 ABOUT HERE]

The volatility behavior is very heterogeneous. Some countries display strong clustering while others present large deviations but no clusters. It is known (de Vries, 1991, Ghose and Kroner, 1995) that heavy tails generated by volatility clustering can be mistakenly interpreted as evidence in favor of unconditional stable distributions. To safeguard against conditional volatility (and possible mean reversion), and following standard practice, we adjust each return series with an AR(2)–GARCH(1,1) model such that the remaining dispersion is not conditional.12

Admittedly, the illustration is subject to certain criticisms. First, the tail index is the same for all countries. Dominicy and Veredas (2012) show that indeed this is not the case, ranging (for

11If the hypotheses were independent, $\zeta = 1 - \prod_{i=1}^{b}(1 - \zeta_0 w_i)$. 12If we adjust jointly with a VAR(2)–CCC model, the dispersion matrix of the adjusted returns is the identity matrix.

the ESD) between 1.54 and 1.93, but when plotted with the 5% confidence bands, the constraint of a single tail index does not seem unreasonable. Second, the data are skewed, yet we do not allow for asymmetries. Third, we assume that the AR(2)–GARCH(1,1) adjustment gets rid of all the dependence. This is standard in empirical finance and there is no evidence that dependencies beyond the location and dispersion are a stylized fact. Fourth, we consider constant co–dispersion. To check this, Figure 7 shows the sample correlations for a rolling window of five years: 2000-2004, 2001-2005, 2002-2006, 2003-2007, 2004-2008, 2005-2009. Markets are ordered by geographical areas and from top left to bottom right as explained in the first paragraph of this section. The American and European markets are strongly related, which we qualify as the Atlantic block, and the North American and European markets are more related within than between each other (except the Austrian market ATX). By contrast, the East Asia and Oceania markets are not very related, neither within each other –except Hang Seng and Nikkei– nor with the Atlantic block. Though there is an evolution in the correlations due to the integration of markets worldwide, the assumption of constant co–dispersion is plausible. In any case, this estimation exercise is purely illustrative and an application in a dynamic and asymmetric context is beyond the scope of this paper, though it is a research avenue worth investigating, as explained in the conclusions.
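The de–volatilization step can be sketched with a plain Gaussian QML fit of a GARCH(1,1) on demeaned returns; the AR(2) mean equation is omitted for brevity, and the starting values are our own choice.

```python
import numpy as np
from scipy.optimize import minimize

def garch11_variance(r, omega, a, b):
    """Conditional variance recursion of a GARCH(1,1)."""
    h = np.empty_like(r)
    h[0] = r.var()
    for t in range(1, len(r)):
        h[t] = omega + a * r[t - 1] ** 2 + b * h[t - 1]
    return h

def devolatilize(r):
    """Gaussian QML estimation and standardized residuals r_t / sqrt(h_t)."""
    r = r - r.mean()
    def nll(p):
        omega, a, b = p
        if omega <= 0 or a < 0 or b < 0 or a + b >= 1:
            return np.inf
        h = garch11_variance(r, omega, a, b)
        return 0.5 * np.sum(np.log(h) + r**2 / h)
    p = minimize(nll, [0.05 * r.var(), 0.05, 0.90], method="Nelder-Mead").x
    return r / np.sqrt(garch11_variance(r, *p))
```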

[FIGURE 8 ABOUT HERE]

The parameter space is 275–dimensional for the Gaussian distribution and 276–dimensional for the Student–t and ESD. We do not show all the results (available upon request). Instead we focus on the dispersion matrices and the tail indexes. Figure 8 shows, in the form of heat maps, the estimated standardized dispersion matrices (i.e. main diagonal equal to one) under Gaussianity (top), Student–t (middle) and ESD (bottom), prior and posterior to the eigenvalue correction (left and right columns respectively). The dispersions present important clustering. Yet, there are differences between the three distributions. For the Student–t and ESD they are lower than for the Gaussian. This is due to the fact that the tail indexes explain the extreme dispersion. Indeed, the estimated tail indexes are 9.51 and 1.76 for the Student–t and the ESD respectively. The eigenvalue cleaning has an effect on the estimates, decreasing slightly their values.

[TABLE 3 ABOUT HERE]

Table 3 shows the test statistics for individual testing of the 0.90, 0.95 and 0.99 $\tau$–level contours (gathered under the legend Individual), as well as the statistics for the null hypothesis of $\tau$ equal to 0.95 and 0.99 when information on the neighbouring level contours is taken into account. The tuning parameters are $c_r = c_l = 0.06$ and $s_r = s_l = 20$. Top and bottom panels show the results of the test statistics when $\Pi_{\theta_0} - B\Psi_{\theta_0}B'$ is computed exactly and with bootstrap respectively. The sub–panels Plain and EV (for eigenvalue) are for results prior and posterior to the eigenvalue cleaning. Numbers in regular and small fonts are the test statistics $\xi_N$ and their p–values respectively (p–values in bold are those larger than 0.01). Results show that the Gaussian distribution never fits the level contours correctly, while the Student–t and the ESD do it reasonably well except for $\tau = 0.99$, and all the distributions struggle in the weighted tests. The eigenvalue cleaning affects the results since it modifies the shape of the ellipsoid, as seen above. A further investigation of this issue is worth exploring.

7 Conclusions

In this article we propose a fast way to do inference in vast dimensional elliptical distributions. Since it is based on quantiles, there is no need for existence of moments or an analytic form of the density function. The contributions of this paper are three. The first one is a function of quantiles that is informative about the co–dispersions and that is based on the following idea: if two centered and scaled random variables co–move, most of the times the pairs of observations have the same sign, so their projection onto the 45–degree line has large dispersion, and vice–versa if they anti–move. Therefore the interquantile range of the projection is informative about the co–dispersion between the two random variables. The second contribution is that we are able to overcome almost all the optimizations, making the estimation method suitable for vast dimensional distributions. The last contribution is a battery of tests for the correct specification of level contours. Several extensions for future research are possible, such as time–varying dispersion matrices and time dependencies.

References

[1] Babu, G.J. and Rao, C.R. (1988) Joint asymptotic distribution of marginal quantiles and quantile functions in samples from a multivariate population. Journal of Multivariate Analysis 27, 15-23.

[2] Bonato, M. (2012) Modeling fat tails in stock returns: a multivariate Stable–GARCH approach. Computational Statistics, forthcoming.

[3] Cambanis, S., Huang, S. and Simons, G. (1981) On the theory of elliptically contoured distributions. Journal of Multivariate Analysis 11, 368-385.

[4] Chambers, J.M., Mallows, C.L. and Stuck, B.W. (1976) A method for simulating stable random variables. Journal of the American Statistical Association 71, 340-344. Corrections 82 (1987): 704, 83 (1988): 581.

[5] Cramér, H. (1946) Mathematical methods of statistics. Princeton, N.J.: Princeton University Press.

[6] de Vries, C.G. (1991) On the relation between GARCH and stable processes. Journal of Econometrics 48, 313-324.

[7] Dominicy, Y. and Veredas, D. (2012) The method of simulated quantiles. Journal of Econometrics forth- coming.

[8] Fama, E. and Roll, R. (1968) Some properties of symmetric stable distributions. Journal of the American Statistical Association 63, 817-836.

[9] Fama, E. and Roll, R. (1971) Parameter estimates for symmetric stable distributions. Journal of the American Statistical Association 66, 331-338.

[10] Fang K.T., Kotz, S. and Ng, K.W. (1990) Symmetric multivariate and related distributions. New York: Chapman and Hall.

[11] Frahm, G. (2004) Generalized elliptical distributions. PhD thesis, University of Cologne.

[12] Ghose, D. and Kroner K.F. (1995) The relationship between GARCH and symmetric stable processes: Finding the source of fat tails in financial data, Journal of Empirical Finance 2, 225-251.

[13] Gonzalez–Rivera, G., Senyuz, Z. and Yoldas E. (2011) Autocontours: Dynamic Specification Testing, Journal of Business and Economic Statistics 29, 186-200.

[14] Gonzalez–Rivera, G. and Yoldas, E. (2011) Autocontour–based Evaluation of Multivariate Predictive Densities, International Journal of Forecasting, forthcoming.

[15] Gouriéroux, C., Monfort, A. and Renault, E. (1993) Indirect Inference. Journal of Applied Econometrics 8, 85-118.

[16] Hallin, M., Paindaveine, D. and Oja, H. (2006) Semiparametrically efficient rank-based inference for shape. II. Optimal R-estimation of shape. Annals of Statistics 34, 2757-2789.

[17] Hautsch, N., Kyj, L.M. and Oomen, R.C. (2011) A blocking and regularization approach to high dimensional realized covariance estimation, Journal of Applied Econometrics, forthcoming.

[18] Hill, J.B. and Renault, E. (2012) Generalized Method of Moments with Tail Trimming. University of North Carolina at Chapel Hill, mimeo.

[19] Kelker, D. (1970) Distribution theory of spherical distributions and a location-scale parameter generalization. Sankhya A 32, 419-430.

[20] Laloux, L., Cizeau, P., Bouchaud, J-P. and Potters, M. (1999) Noise dressing of financial correlation matrices, Physical Review Letters 83, 1467-1470.

[21] Laurent S. and Veredas, D. (2011) Testing Conditional Asymmetry. A Residual-Based Approach, Journal of Economic Dynamics and Control, forthcoming.

[22] Lombardi, M.J. and Veredas, D. (2009) Indirect inference of elliptical fat tailed distributions. Computa- tional Statistics and Data Analysis 53, 2309-2324.

[23] McCulloch, J.H. (1986) Simple consistent estimators of stable distribution parameters. Communications in Statistics - Simulation and Computation 15(4), 1109-1136.

[24] Nolan, J.P. (2010) Multivariate elliptically contoured stable distributions: theory and estimation. Mimeo.

[25] Samorodnitsky, G. and Taqqu, M.S. (1994) Stable non-gaussian random processes: Stochastic models with infinite variance. Chapman & Hall/CRC, Stochastic Modeling.

[26] Tola, V., Lillo, F., Gallegati, M. and Mantegna, R. (2008) Cluster analysis for portfolio optimization, Journal of Economic Dynamics and Control 32, 235-258.

[27] Tyler, D.E. (1987) A distribution-free M-estimator of multivariate scatter. Annals of Statistics 15, 234-251.

Tables and Figures

Figure 1: A diagrammatic representation of the function of quantiles for the co–dispersion

[Scatter plots: (a) Positive co–dispersion, (b) Negative co–dispersion; horizontal axis $X_j$, vertical axis $X_i$, with the projections $Z_{(i\,j)}$ marked on the 45–degree line]

The left plot shows a situation where $X_i$ and $X_j$ (the pairs are represented by the circles) are positively related. This results in a random variable $Z_{(i\,j)}$ (represented by the squares), their projection onto the 45–degree line, that is very dispersed. By contrast, the right plot shows the situation where $X_i$ and $X_j$ are negatively related, which results in a projection $Z_{(i\,j)}$ that is little dispersed.

Table 1: Estimated tail indexes and computational time

                        Dim. 20            Dim. 200           Dim. 2000
                    α̂N       Time     α̂N^(J*)    Time     α̂N^(J*)    Time
Gaussian             –        1.18       –        47.77      –        2774
ESD α = 1.5        1.525      7.540    1.517      63.64    1.529      3080
ESD α = 1.7        1.731      5.713    1.723      70.92    1.736      3263
ESD α = 1.9        1.918      5.197    1.928      71.27    1.936      3279
Student–t α = 3    3.074     10.05     3.075      72.34    3.073      4209
Student–t α = 10  10.26       7.084   10.25       75.23   10.24       4016
Student–t α = 20  20.51       6.580   20.59       72.17   20.49       4510

Computational time (denoted by Time) measured in seconds per draw.

Figure 2: Locations

[Line plots of the estimated locations: (a) Dim. 20, (b) Dim. 200, (c) Dim. 2000]

Estimated location parameters for the Gaussian (solid line), ESD with α = 1.7 (dotted line) and Student–t with α = 10 (dashed line) for dimensions 20, 200 and 2000.

Table 2: Sizes and upper bounds of ξN

        ζ0 = 0.01          ζ0 = 0.05
  b   Indep.   Bound    Indep.   Bound
  3    0.03     0.03     0.15     0.16
  5    0.07     0.07     0.25     0.27
  9    0.14     0.16     0.42     0.54
 13    0.23     0.26     0.57     0.82
 17    0.33     0.39     0.69     1.12
 21    0.42     0.53     0.77     1.45

Indep. denotes the sizes for ξN in the case of independence, while Bound denotes the upper bounds when the b hypotheses are not independent. The table considers two sizes for the hypothesis of interest: (i) ζ0 = 0.01 with cr = cl = 0.02 and sr = sl = 10, and (ii) ζ0 = 0.05 with cr = cl = 0.06 and sr = sl = 10. The first column denotes b; the number of neighbouring contours is b − 1.

Figure 3: Dispersion matrices: Dimension 20

[Heat maps of the true and estimated dispersion matrices: (a) True, (b) Gaussian, (c) ESD α = 1.5, (d) ESD α = 1.7, (e) ESD α = 1.9, (f) Student–t α = 3, (g) Student–t α = 10, (h) Student–t α = 20]

Figure 4: Dispersion matrices: Dimension 200

[Heat maps of the true and estimated dispersion matrices: (a) True, (b) Gaussian, (c) ESD α = 1.5, (d) ESD α = 1.7, (e) ESD α = 1.9, (f) Student–t α = 3, (g) Student–t α = 10, (h) Student–t α = 20]

Figure 5: Dispersion matrices: Dimension 2000

[Heat maps of the true and estimated dispersion matrices: (a) True, (b) Gaussian, (c) ESD α = 1.5, (d) ESD α = 1.7, (e) ESD α = 1.9, (f) Student–t α = 3, (g) Student–t α = 10, (h) Student–t α = 20]

Figure 6: Hypothesis testing scenarios

[Level contour plots in the (Xj, Xk) plane: (a) Scenario 1, (b) Scenario 2, (c) Scenario 3, (d) Scenario 4]

(a) Scenario 1: the interest lies in fitting correctly just one τ–level contour. (b) Scenario 2: the interest lies in fitting correctly several τ–level contours. (c) Scenario 3: the main interest is one contour while the neighbouring ones are of minor interest. At the same time, the adjacent contours lose relevance as they get further from the one of interest. (d) Scenario 4: similar case to (c) but the relation is asymmetric, i.e. outer neighbouring τ–level contours are more important than inner τ–level contours.

Figure 7: Sample correlations

[Heat maps of rolling-window correlations: (a) 2000-2004, (b) 2001-2005, (c) 2002-2006, (d) 2003-2007, (e) 2004-2008, (f) 2005-2009]

Heat maps of the sample correlations for the rolling windows 2000-2004, 2001-2005, 2002-2006, 2003-2007, 2004-2008, 2005-2009. Markets are ordered by geographical areas and from top left to bottom right: America (S&P500, NASDAQ, TSX, Merval, Bovespa and IPC), Europe and Middle East (AEX, ATX, FTSE, DAX, CAC40, SMI, MIB and TA100), and East Asia and Oceania (HgSg, Nikkei, StrTim, SSEC, BSE, KLSE, KOSPI and AllOrd).

Figure 8: Estimated standardized dispersion matrices

[Heat maps: (a) Gaussian, (b) Gaussian – Eigenvalue cleaning, (c) Student–t, (d) Student–t – Eigenvalue cleaning, (e) ESD, (f) ESD – Eigenvalue cleaning]

The top, middle and bottom plots show the heat maps of the standardized dispersion matrices under Gaussianity, Student–t and ESD respectively, prior and posterior to the eigenvalue cleaning (left and right columns respectively). Markets are ordered by geographical areas and from top left to bottom right: America (S&P500, NASDAQ, TSX, Merval, Bovespa and IPC), Europe and Middle East (AEX, ATX, FTSE, DAX, CAC40, SMI, MIB and TA100), and East Asia and Oceania (HgSg, Nikkei, StrTim, SSEC, BSE, KLSE, KOSPI and AllOrd).

Table 3: Testing results

                              Individual                   Weighted
                       0.90      0.95      0.99      0.95      0.99
Exact
  Plain
    Gaussian          65.94     64.64     110.1     114.6     105.4
                      0.000     0.000     0.000     0.000     0.000
    Student–t         37.82     13.62     5.492     39.43     2.425
                      0.000     0.000     0.193     0.000     0.279
    ESD               1.185     1.754     166.7     167.3     187.5
                      0.447     0.386     0.000     0.000     0.000
  EV
    Gaussian          75.40     114.8     149.9     151.8     144.8
                      0.000     0.000     0.000     0.000     0.000
    Student–t         19.01     21.22     0.005     31.59     5.071
                      0.000     0.000     0.943     0.000     0.167
    ESD               1.893     0.973     193.2     229.6     166.2
                      0.269     0.724     0.000     0.000     0.000
Bootstrap
  Plain
    Gaussian          72.06     71.60     67.69     130.1     224.9
                      0.000     0.000     0.000     0.000     0.000
    Student–t         8.671     0.218     9.987     40.23     11.09
                      0.029     0.641     0.015     0.000     0.006
    ESD               9.201     1.617     275.4     129.8     139.7
                      0.019     0.403     0.000     0.000     0.000
  EV
    Gaussian          1.7e5     1.8e5     1.3e5     4.7e5     3.6e5
                      0.000     0.000     0.000     0.000     0.000
    Student–t         315.2     428.6     186.5     931.2     544.8
                      0.000     0.000     0.000     0.000     0.000
    ESD               1537      4.822     292.5     1834      344.6
                      0.000     0.147     0.000     0.000     0.000

Top and bottom panels show the results of the test statistics when Πθ0 − BΨθ0 B′ is computed exactly and with bootstrap respectively. The sub–panels Plain and EV (for eigenvalue) are for results prior and posterior to the eigenvalue cleaning. For each distribution, the first row is the test statistic ξN and the second row its p–value (p–values in bold in the original are those larger than 0.01).
