New Insights Into the Statistical Properties of M-Estimators Gordana Draskovic, Frédéric Pascal
To cite this version: Gordana Draskovic, Frédéric Pascal. New insights into the statistical properties of M-estimators. IEEE Transactions on Signal Processing, Institute of Electrical and Electronics Engineers, 2018, 66 (16), pp.4253-4263. 10.1109/TSP.2018.2841892. hal-01816084

HAL Id: hal-01816084 https://hal.archives-ouvertes.fr/hal-01816084 Submitted on 26 Feb 2020

Gordana Drašković, Student Member, IEEE, Frédéric Pascal, Senior Member, IEEE

Gordana Drašković and Frédéric Pascal are with L2S - CentraleSupélec - CNRS - Université Paris-Sud - 3 rue Joliot-Curie, F-91192 Gif-sur-Yvette Cedex, France (e-mails: [email protected], [email protected]). This paper has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the author. The material includes the results for the real case. This material is 149KB in size.

Abstract: This paper proposes an original approach to better understand the behavior of robust scatter matrix M-estimators. Scatter matrices are of particular interest for many signal processing applications since the resulting performance strongly relies on the quality of the matrix estimation. In this context, M-estimators appear as very interesting candidates, mainly due to their flexibility with respect to the statistical model and their robustness to outliers and/or missing data. However, the behavior of such estimators remains poorly understood, since they are described by fixed-point equations that make their statistical analysis very difficult. To fill this gap, the main contribution of this work is to prove that the distribution of these estimators is more accurately described by a Wishart distribution than by the classical asymptotic Gaussian approximation. To that end, we propose a new “Gaussian-core” representation for Complex Elliptically Symmetric (CES) distributions and we analyze the proximity between M-estimators and a Gaussian-based Sample Covariance Matrix (SCM), which is unobservable in practice and plays only a theoretical role. To confirm our claims, we also provide results for a widely used function of M-estimators, the Mahalanobis distance. Finally, Monte Carlo simulations for various scenarios are presented to validate the theoretical results.

Index Terms: M-estimators, Complex Elliptically Symmetric distributions, robust estimation, Wishart distribution, Mahalanobis distance.

I. INTRODUCTION

In signal processing applications, knowledge of the scatter matrix is of crucial importance. It arises in diverse applications such as filtering, detection, estimation or classification. In recent years, there has been growing interest in covariance matrix estimation, with a vast literature on the topic (see e.g., [1]–[8] and references therein). Generally, in most signal processing methods the data can be locally modelled by a multivariate zero-mean circular Gaussian stochastic process, which is completely determined by its covariance matrix. The complex multivariate Gaussian distribution, also called the complex normal (CN) distribution, plays a vital role in the theory of statistical analysis [9]. Very often the multivariate observations are approximately normally distributed. This approximation is (asymptotically) valid even when the original data is not multivariate normal, due to the central limit theorem. In that case, the classical covariance matrix estimator is the sample covariance matrix (SCM), whose behavior is perfectly known. Indeed, it follows the Wishart distribution [10], which is the multivariate extension of the gamma distribution. Thanks to its explicit form, the SCM is easy to manipulate and therefore widely used in the signal processing community.

Nevertheless, complex normality is sometimes a poor approximation of the underlying physics. Noise and interference can be spiky and impulsive, i.e., have heavier tails than the Gaussian distribution. An alternative has been proposed by introducing elliptical distributions [11], namely the Complex Elliptically Symmetric (CES) distributions. These distributions have an important property: their higher order moment matrices are scalar multiples of those of the corresponding normal distribution. This is the starting point for the analysis carried out in this paper. These distributions have been frequently employed for non-Gaussian modeling (see e.g., for radar applications [12]–[16]).

Although Huber introduced robust M-estimators in [17] for the scalar case, Maronna provided the detailed analysis of the corresponding scatter matrix estimators in the multivariate real case in his seminal work [18]. M-estimators are a generalization of the well-known Maximum Likelihood estimators (MLE), which have been widely studied in the statistics literature [19], [20]. In contrast to ML-estimators, where the estimating equation depends on the probability density function (PDF) of a particular CES distribution, the weight function in the M-estimating equation can be completely independent of the data distribution. Consequently, M-estimators form a wide class of scatter matrix estimators, including the ML-estimators, that are robust to the data model. In [18], it is shown that, under some mild assumptions, the estimator is defined as the unique solution of a fixed-point equation and that the robust estimator converges almost surely (a.s.) to a deterministic matrix, equal to the scatter matrix up to a scale factor (depending on the true statistical model). Their asymptotic properties have been studied by Tyler in the real case [21]. This has recently been extended to the complex case, which is more useful for signal processing applications, in [1], [6].

In most papers, three main M-estimators are studied and used in practice: Student’s M-estimator, which is the MLE for the t-distribution, Huber’s M-estimator and Tyler’s M-estimator [22], also known as the Fixed Point (FP) estimator [2]. The Student t-distribution is widely employed for non-Gaussian data modeling since it offers flexibility thanks to an additional parameter, namely the Degree of Freedom (DoF). As a consequence, Student’s M-estimator is often used for scatter matrix estimation. Huber’s M-estimator, especially its complex multivariate extension, has received a lot of attention since it has been proven to be very robust to outliers. Tyler’s M-estimator is not exactly an M-estimator¹ but it is very useful because of the rare property that any CES distribution with the same scatter matrix leads to the same result (hence “distribution-free”). Asymptotic properties of this estimator have been analyzed in [1], [23]. Recently, it has been shown that the behavior of Tyler’s estimator can be better approximated by a Wishart distribution [24]. This work aims to provide more general results that can be applied to all M-estimators, and to analyze the gain of this approach on the robust Mahalanobis distance [25], [26], which is very useful in various problems such as detection and clustering.

¹ Especially because it does not respect all of Maronna’s conditions [18].

The contributions of this work are multiple. First, the originality of the results comes from a new CES representation introducing “Gaussian cores”. This representation is a modified version of the stochastic representation given in [1] and is crucial to understanding the proposed method. Second, in this paper, M-estimators are, for the first time, analyzed through a comparison with a very simple estimator, the SCM. Indeed, the direct statistical analysis of these estimators is difficult because they are defined as the solution of an implicit equation and have been analyzed only in classical asymptotic regimes. Here, we propose a different approach to overcome this difficulty. More precisely, a sort of distance between M-estimators and the SCM is computed in order to propagate the non-asymptotic properties of the SCM towards M-estimators. Third, the paper gives new insights into the correlation between M-estimators and the corresponding SCM in the Gaussian context, which is the central part of our approach. Finally, we present a practical interest of the results, specifically the application to the Mahalanobis distance. Note that all the results are provided in the complex case.

II. PROBLEM FORMULATION

A. Complex distributions

Let $\mathbf{z} = \mathrm{Re}(\mathbf{z}) + j\,\mathrm{Im}(\mathbf{z})$ be an $m$-dimensional complex random vector, which consists of a pair of real random vectors $\mathrm{Re}(\mathbf{z})$ and $\mathrm{Im}(\mathbf{z})$. The distribution of $\mathbf{z}$ on $\mathbb{C}^m$ determines the joint real $2m$-variate distribution of $\mathrm{Re}(\mathbf{z})$ and $\mathrm{Im}(\mathbf{z})$ on $\mathbb{R}^{2m}$, and conversely. To completely define the second-order moments of $\mathrm{Re}(\mathbf{z})$ and $\mathrm{Im}(\mathbf{z})$, $\mathbf{z}$ is described by its covariance matrix $\mathbf{C} = E[(\mathbf{z}-\boldsymbol{\mu})(\mathbf{z}-\boldsymbol{\mu})^H]$ and its pseudo-covariance matrix $\mathbf{P} = E[(\mathbf{z}-\boldsymbol{\mu})(\mathbf{z}-\boldsymbol{\mu})^T]$. If the complex vector is circular (see [1] for details), the pseudo-covariance vanishes, i.e. $\mathbf{P} = \mathbf{0}$.

1) Generalized Complex Normal distribution: An $m$-dimensional random vector has the generalized complex normal distribution $\mathbf{z} \sim \mathbb{C}\mathcal{N}(\boldsymbol{\mu}, \mathbf{C}, \mathbf{P})$ if its probability density function (PDF) can be written as

$$
h_{\mathbf{z}}(\mathbf{z}) = \frac{\exp\left\{ -\frac{1}{2} \begin{pmatrix} (\mathbf{z}-\boldsymbol{\mu})^H & (\mathbf{z}-\boldsymbol{\mu})^T \end{pmatrix} \mathbf{V}^{-1} \begin{pmatrix} \mathbf{z}-\boldsymbol{\mu} \\ \mathbf{z}^*-\boldsymbol{\mu}^* \end{pmatrix} \right\}}{\pi^m \sqrt{|\mathbf{V}|}} \tag{1}
$$

where $\boldsymbol{\mu}$ is the statistical mean and $\mathbf{V} = \begin{pmatrix} \mathbf{C} & \mathbf{P} \\ \mathbf{P}^* & \mathbf{C}^* \end{pmatrix}$. If $\mathbf{z}$ is circular CN-distributed, the pseudo-covariance will be omitted in the notation, i.e. $\mathbf{z} \sim \mathbb{C}\mathcal{N}(\boldsymbol{\mu}, \mathbf{C})$.

B. Complex Elliptically Symmetric distributions

An important class of circular distributions are the CES distributions. An $m$-dimensional random vector has a CES distribution if its probability density function (PDF) can be