Asymptotic Properties of Robust Complex Covariance Matrix Estimates

Total Page:16

File Type:pdf, Size:1020Kb

Mélanie Mahot, Student, IEEE, Frédéric Pascal, Member, IEEE, Philippe Forster, Member, IEEE, Jean-Philippe Ovarlez, Member, IEEE

arXiv:1209.0897v2 [stat.AP] 6 Nov 2012

M. Mahot is with SONDRA, Supelec, Plateau du Moulon, 3 rue Joliot-Curie, F-91190 Gif-sur-Yvette, France (e-mail: [email protected]).
F. Pascal is with SONDRA, Supelec, Plateau du Moulon, 3 rue Joliot-Curie, F-91190 Gif-sur-Yvette, France (e-mail: [email protected]).
P. Forster is with SATIE, ENS Cachan, CNRS, UniverSud, 61, Av. du Pdt Wilson, F-94230 Cachan, France (e-mail: [email protected]).
J.-P. Ovarlez is with ONERA, DEMR/TSI, Chemin de la Hunière, F-91120 Palaiseau, (e-mail: [email protected]).

Abstract—In many statistical signal processing applications, the estimation of nuisance parameters and parameters of interest is strongly linked to the resulting performance. Generally, these applications deal with complex data. This paper focuses on covariance matrix estimation problems in non-Gaussian environments and particularly, the M-estimators in the context of elliptical distributions. Firstly, this paper extends to the complex case the results of Tyler in [1]. More precisely, the asymptotic distribution of these estimators as well as the asymptotic distribution of any homogeneous function of degree 0 of the M-estimates are derived. On the other hand, we show the improvement of such results on two applications: DOA (directions of arrival) estimation using the MUSIC (MUltiple SIgnal Classification) algorithm and adaptive radar detection based on the ANMF (Adaptive Normalized Matched Filter) test.

Index Terms—Covariance matrix estimation, robust estimation, elliptical distributions, Complex M-estimators.

I. INTRODUCTION

Many signal processing applications require the knowledge of the data covariance matrix. The most often used estimator is the well-known Sample Covariance Matrix (SCM) which is the Maximum Likelihood (ML) estimator for Gaussian data. However, the SCM suffers from major drawbacks. When the data turn out to be non-Gaussian, as for instance in adaptive radar and sonar processing [2], the performance involved by the SCM can be strongly degraded. Indeed, this is the case in impulsive noise contexts and in the presence of outliers as shown in [3]. To overcome these problems, there has been an intense research activity in robust estimation theory in the statistical community these last decades [4], [5], [6]. Among several solutions, the so-called M-estimators originally introduced by Huber [7] and investigated in the seminal work of Maronna [8], have imposed themselves as an appealing alternative to the classical SCM. They have been introduced within the framework of elliptical distributions. Elliptical distributions, originally introduced by Kelker in [9], encompass a large number of well-known distributions as for instance the Gaussian distribution, or the multivariate Student (or t) distribution. They may also be used to model heavy tailed distributions by means of the K-distribution, as may be met for instance in adaptive radar with impulsive clutter [10], [11], [12].

M-estimators of the covariance matrix are however seldom used in the signal processing community. Only a limited case, the Tyler's estimator [13] also called the Fixed Point Estimator [14] has been widely used as an alternative to the SCM for radar applications. Concerning the M-estimators, notable exceptions are the recent papers by Ollila [15], [16], [17], [18], [19] who advocates their use in several applications such as array processing. The M-estimators have also been studied in the case of large datasets, where the dimension of the data is of the same order as the dimension of the sample [20].

One possible reason for this lack of interest is that their statistical properties are not well-known in the signal processing community, as opposed to the Wishart distribution of the SCM in the Gaussian context. They have been studied by Tyler [21] in the real case. However, in signal processing applications, data are usually complex and the purpose of this paper is to derive the asymptotic distribution of complex M-estimators in the framework of elliptically distributed data. This result is also provided in [15] but without proof. We will also extend to the complex case, a property initially derived by Tyler in [1]: we show that in the complex elliptical distributions context, the asymptotic distribution of any positive homogeneous functional of degree 0 of estimates such as M-estimates and the SCM, is the same up to a scale factor. This result, useful for applications, extends the one proposed in [15]. Thus, for a Gaussian context and for signal processing applications which only need the covariance matrix up to a scale factor, for example Direction-of-Arrival (DOA) estimation or adaptive radar detection, the parameter estimated has the same mean square error when estimated with the SCM or with an M-estimator with a few more data (depending on $\sigma_1$). Moreover, when the context is non-Gaussian or contains outliers, the performance obtained with M-estimators is scarcely influenced while it is unreliable and possibly completely damaged with the SCM as shown for instance in [3]. We illustrate this effect using the MUSIC method and the Adaptive Normalized Matched Filter (ANMF) test introduced by Kraut and Scharf [22], [23]. It is also illustrated by Ollila in [16], for MVDR beamforming.

This paper is organized as follows. Section II introduces the required background and Section III the known properties of real M-estimators. Then Section IV provides our contribution about the estimators asymptotic distribution. Eventually, in Section V, simulations validate the theoretical analysis and Section VI concludes this work.

Vectors (resp. matrices) are denoted by bold-faced lowercase letters (resp. uppercase letters). $^*$, $^T$ and $^H$ respectively represent the conjugate, the transpose and the Hermitian operator. $\sim$ means "distributed as", $\overset{d}{=}$ stands for "shares the same distribution as", $\overset{d}{\to}$ denotes convergence in distribution and $\otimes$ denotes the Kronecker product. vec is the operator which transforms a matrix $m \times n$ into a vector of length $mn$, concatenating its $n$ columns into a single column. Moreover, $\mathbf{I}_m$ is the $m \times m$ identity matrix, $\mathbf{0}_{m,p}$ the $m \times p$ matrix of zeros, $\mathbf{J}_{m^2} = \sum_{i=1}^{m} \mathbf{J}_{ii} \otimes \mathbf{J}_{ii}$ where $\mathbf{J}_{ii}$ is the $m \times m$ matrix with a one in the $(i,i)$ position and zeros elsewhere and $\mathbf{K}$ is the commutation matrix which transforms $\mathrm{vec}(\mathbf{A})$ into $\mathrm{vec}(\mathbf{A}^T)$. Eventually, $\mathrm{Im}(\mathbf{y})$ represents the imaginary part of the complex vector $\mathbf{y}$ and $\mathrm{Re}(\mathbf{y})$ its real part.

II. BACKGROUND

A. Elliptical symmetric distribution

Let $\mathbf{z}$ be an $m$-dimensional real (resp. complex circular) random vector. The vector $\mathbf{z}$ has a real (resp. complex) elliptical symmetric distribution if its probability density function (PDF) can be written as

$$
\begin{aligned}
g_{\mathbf{z}}(\mathbf{z}) &= |\boldsymbol{\Lambda}|^{-1/2}\, h_{\mathbf{z}}\big((\mathbf{z}-\boldsymbol{\mu})^T \boldsymbol{\Lambda}^{-1} (\mathbf{z}-\boldsymbol{\mu})\big) && \text{in the real case,} \\
g_{\mathbf{z}}(\mathbf{z}) &= |\boldsymbol{\Lambda}|^{-1}\, h_{\mathbf{z}}\big((\mathbf{z}-\boldsymbol{\mu})^H \boldsymbol{\Lambda}^{-1} (\mathbf{z}-\boldsymbol{\mu})\big) && \text{in the complex case,}
\end{aligned}
\tag{1}
$$

where $h_{\mathbf{z}} : [0,\infty) \to [0,\infty)$ is any function such that (1) defines a PDF, $\boldsymbol{\mu}$ is the statistical mean and $\boldsymbol{\Lambda}$ is a scatter matrix. The scatter matrix $\boldsymbol{\Lambda}$ reflects the structure of the covariance matrix of $\mathbf{z}$, i.e. the covariance matrix is equal to $\boldsymbol{\Lambda}$ up to a scale factor. This real (resp. complex) elliptically symmetric distribution will be denoted by $\mathcal{E}(\boldsymbol{\mu}, \boldsymbol{\Lambda}, h_{\mathbf{z}})$ (resp. $\mathcal{CE}(\boldsymbol{\mu}, \boldsymbol{\Lambda}, h_{\mathbf{z}})$). One can notice that the Gaussian distribution is a particular case of elliptical distributions. A survey on complex elliptical distributions can be found in [15].

In this paper, we will assume that $\boldsymbol{\mu} = \mathbf{0}_{m,1}$. Without loss of generality, the scatter matrix will be taken to be equal to the covariance matrix when the latter exists. Indeed, when the second moment of the distribution is finite, function $h_{\mathbf{z}}$ in [...]

C. M-estimators of the scatter matrix

Let $(\mathbf{z}_1, ..., \mathbf{z}_N)$ be an $N$-sample of $m$-dimensional real (resp. complex circular) independent vectors with $\mathbf{z}_i \sim \mathcal{E}(\mathbf{0}_{m,1}, \boldsymbol{\Lambda}, h_{\mathbf{z}})$ (resp. $\mathbf{z}_i \sim \mathcal{CE}(\mathbf{0}_{m,1}, \boldsymbol{\Lambda}, h_{\mathbf{z}})$), $i = 1, ..., N$. The real (resp. complex) M-estimator of $\boldsymbol{\Lambda}$ is defined as the solution of the following equation

$$
\widehat{\mathbf{M}} = \frac{1}{N} \sum_{n=1}^{N} u\big(\mathbf{z}_n' \widehat{\mathbf{M}}^{-1} \mathbf{z}_n\big)\, \mathbf{z}_n \mathbf{z}_n', \tag{2}
$$

where the symbol $'$ stands for $T$ in the real case and for $H$ in the complex one.

M-estimators have first been studied in the real case, defined as solution of (2) with real samples. Existence and uniqueness of the solution of (2) has been shown in the real case, provided function $u$ satisfies a set of general assumptions stated by Maronna in [8]. These conditions have been extended to the complex case by Ollila in [17]. They are recalled here below in the case where $\boldsymbol{\mu} = \mathbf{0}_{m,1}$:
- $u$ is non-negative, non increasing, and continuous on $[0, \infty)$.
- Let $\psi(s) = s\,u(s)$ and $K = \sup_{s \geq 0} \psi(s)$. $m < K < \infty$, $\psi$ is increasing, and strictly increasing on the interval where $\psi < K$.
- Let $P_N(.)$ denote the empirical distribution of $(\mathbf{z}_1, ..., \mathbf{z}_N)$. There exists $a > 0$ such that for every hyperplane $S$, $\dim(S) \leq m - 1$, $P_N(S) \leq 1 - \frac{m}{K} - a$. This assumption can be strongly relaxed as shown in [25], [26].

Let us now consider the following equation, which is roughly speaking the limit of (2) when $N$ tends to infinity:

$$
\mathbf{M} = E\big[u(\mathbf{z}' \mathbf{M}^{-1} \mathbf{z})\, \mathbf{z}\mathbf{z}'\big], \tag{3}
$$

where $\mathbf{z} \sim \mathcal{E}(\mathbf{0}_{m,1}, \boldsymbol{\Lambda}, h_{\mathbf{z}})$ (resp. $\mathcal{CE}(\mathbf{0}_{m,1}, \boldsymbol{\Lambda}, h_{\mathbf{z}})$) and where the symbol $'$ stands for $T$ in the real case and for $H$ in the complex one.

Then, under the above conditions, it has been shown for the real case in [26], [8] that:
- Equation (3) (resp.
Recommended publications
  • New Insights Into the Statistical Properties of M-Estimators Gordana Draskovic, Frédéric Pascal
    Gordana Draskovic, Frédéric Pascal. New insights into the statistical properties of M-estimators. IEEE Transactions on Signal Processing, Institute of Electrical and Electronics Engineers, 2018, 66 (16), pp.4253-4263. 10.1109/TSP.2018.2841892. hal-01816084. HAL Id: hal-01816084, https://hal.archives-ouvertes.fr/hal-01816084, submitted on 26 Feb 2020. Gordana Drašković, Student Member, IEEE, Frédéric Pascal, Senior Member, IEEE. Abstract—This paper proposes an original approach to better understanding the behavior of robust scatter matrix M-estimators. Scatter matrices are of particular interest for many signal processing applications since the resulting performance strongly relies on the quality of the matrix estimation. In this context, M-estimators appear as very interesting candidates, mainly due to their flexibility to the statistical model and their [...] its explicit form, the SCM is easy to manipulate and therefore widely used in the signal processing community. Nevertheless, the complex normality sometimes presents a poor approximation of underlying physics. Noise and interference can be spiky and impulsive i.e., have heavier tails than the Gaussian distribution.
  • Robust Scatter Matrix Estimation for High Dimensional Distributions with Heavy Tail Junwei Lu, Fang Han, and Han Liu
    IEEE Transactions on Information Theory. Robust Scatter Matrix Estimation for High Dimensional Distributions with Heavy Tail. Junwei Lu, Fang Han, and Han Liu. Abstract—This paper studies large scatter matrix estimation for heavy tailed distributions. The contributions of this paper are twofold. First, we propose and advocate to use a new distribution family, the pair-elliptical, for modeling the high dimensional data. The pair-elliptical is more flexible and easier to check the goodness of fit compared to the elliptical. Secondly, built on the pair-elliptical family, we advocate using quantile-based statistics for estimating the scatter matrix. For this, we provide a family of quantile-based statistics. They outperform the existing ones for better balancing the efficiency and robustness. In particular, we show that the proposed estimators have comparable performance to the moment-based counterparts under the Gaussian assumption. The method is also tuning-free compared to Catoni's M-estimator for covariance matrix estimation. We further apply the method to conduct a variety of statistical methods. [...] distribution family, the pair-elliptical. The pair-elliptical family is strictly larger and requires less symmetry structure than the elliptical. We provide detailed studies on the relation between the pair-elliptical and several heavy tailed distribution families, including the nonparanormal, elliptical, and transelliptical. Moreover, it is easier to test the goodness of fit for the pair-elliptical. For conducting such a test, we combine the existing results in low dimensions (20; 21; 22; 23; 24) with the familywise error rate controlling techniques including the Bonferroni's correction, the Holm's step-down procedure (25), and the higher criticism method (26; 27). Secondly, built on the pair-elliptical family, we propose a new set of quantile-based statistics for estimating scat-
  • Multivariate Statistical Functions in R
    Multivariate statistical functions in R. Michail T. Tsagris, [email protected]. College of engineering and technology, American university of the middle east, Egaila, Kuwait. Version 6.1. Athens, Nottingham and Abu Halifa (Kuwait), 31 October 2014.
    Contents
    1 Mean vectors
    1.1 Hotelling's one-sample T^2 test
    1.2 Hotelling's two-sample T^2 test
    1.3 Two two-sample tests without assuming equality of the covariance matrices
    1.4 MANOVA without assuming equality of the covariance matrices
    2 Covariance matrices
    2.1 One sample covariance test
    2.2 Multi-sample covariance matrices
    2.2.1 Log-likelihood ratio test
    2.2.2 Box's M test
    3 Regression, correlation and discriminant analysis
    3.1 Correlation
    3.1.1 Correlation coefficient confidence intervals and hypothesis testing using Fisher's transformation
    3.1.2 Non-parametric bootstrap hypothesis testing for a zero correlation coefficient
    3.1.3 Hypothesis testing for two correlation coefficients
    3.2 Regression
    3.2.1 Classical multivariate regression
    3.2.2 k-NN regression
    3.2.3 Kernel regression
    3.2.4 Choosing the bandwidth in kernel regression in a very simple way
    3.2.5 Principal components regression
    3.2.6 Choosing the number of components in principal component regression
    3.2.7 The spatial median and spatial median regression
    3.2.8 Multivariate ridge regression
    3.3 Discriminant analysis
    3.3.1 Fisher's linear discriminant function
  • Multivariate L1 Statistical Methods: the Package
    Journal of Statistical Software, July 2011, Volume 43, Issue 5. http://www.jstatsoft.org/ Multivariate L1 Methods: The Package MNM. Klaus Nordhausen (University of Tampere), Hannu Oja (University of Tampere). Abstract: In the paper we present an R package MNM dedicated to multivariate data analysis based on the L1 norm. The analysis proceeds very much as does a traditional multivariate analysis. The regular L2 norm is just replaced by different L1 norms, observation vectors are replaced by their (standardized and centered) spatial signs, spatial ranks, and spatial signed-ranks, and so on. The procedures are fairly efficient and robust, and no moment assumptions are needed for asymptotic approximations. The background theory is briefly explained in the multivariate linear regression model case, and the use of the package is illustrated with several examples using the R package MNM. Keywords: least absolute deviation, mean deviation, mean difference, multivariate linear regression, R, shape matrix, spatial sign, spatial signed-rank, spatial rank, transformation-retransformation method. 1. Introduction. Classical multivariate statistical inference methods (Hotelling's T^2, multivariate analysis of variance, multivariate regression, tests for independence, canonical correlation analysis, principal component analysis, and so on) are based on the use of the L2 norm. These standard moment-based multivariate techniques are optimal under the multivariate normality of the residuals but poor in their efficiency for heavy-tailed distributions. They are also highly sensitive to outlying observations. In this paper we present an R package MNM (available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=MNM) which uses different L1 norms and the corresponding scores (spatial signs, spatial signed-ranks, and spatial ranks) in the analysis of multivariate data.
  • Robust Estimation of Structured Scatter Matrices in (Mis)Matched Models Bruno Meriaux, Chengfang Ren, Mohammed Nabil El Korso, Arnaud Breloy, Philippe Forster
    Bruno Meriaux, Chengfang Ren, Mohammed Nabil El Korso, Arnaud Breloy, Philippe Forster. Robust Estimation of Structured Scatter Matrices in (Mis)matched Models. Signal Processing, Elsevier, 2019, 165, pp.163-174. 10.1016/j.sigpro.2019.06.030. hal-02165848. HAL Id: hal-02165848, https://hal.archives-ouvertes.fr/hal-02165848, submitted on 2 Jul 2019. Affiliations: Bruno Mériaux and Chengfang Ren, SONDRA, CentraleSupélec, 91192 Gif-sur-Yvette, France; Mohammed Nabil El Korso and Arnaud Breloy, Paris Nanterre University, LEME EA-4416, 92410 Ville d'Avray, France; Philippe Forster, Paris Nanterre University, SATIE, 94230 Cachan, France. Abstract: Covariance matrix estimation is a ubiquitous problem in signal processing. In most modern signal processing applications, data are generally modeled by non-Gaussian distributions with covariance matrices exhibiting a particular structure. Taking into account this structure and the non-Gaussian behavior improve drastically the estimation accuracy.
  • Covariance Estimation in Two-Level Regression
    2nd International Conference on Control and Fault-Tolerant Systems, SysTol'13, October 9-11, 2013, Nice, France. Covariance Estimation in Two-Level Regression. Nicholas Moehle and Dimitry Gorinevsky. Abstract—This paper considers estimation of covariance matrices in multivariate linear regression models for two-level data produced by a population of similar units (individuals). The proposed Bayesian formulation assumes that the covariances for different units are sampled from a common distribution. Assuming that this common distribution is Wishart, the optimal Bayesian estimation problem is shown to be convex. This paper proposes a specialized scalable algorithm for solving this two-level optimal Bayesian estimation problem. The algorithm scales to datasets with thousands of units and trillions of data points per unit, by solving the problem recursively, allowing new data to be quickly incorporated into the estimates. An example problem is used to show that the proposed approach improves over existing approaches to estimating covariance matrices in linear models for two-level data. [...] Using data from multiple units in a two-level data set can improve the covariance estimation accuracy. Covariance matrix estimation for one-level datasets has attracted substantial attention earlier. Several approaches to the maximum likelihood estimation (MLE) with shrinkage (regularization) have been proposed, e.g., see [10], [11] where further references can be found. The approaches related to this paper add regularization by using a Bayesian prior in a maximum a posteriori probability (MAP) estimation, such as an inverse Wishart prior for the covariance matrix, see [13]. The inverse Wishart is the conjugate prior for the covariance
  • Minimum Covariance Determinant and Extensions
    Minimum Covariance Determinant and Extensions. Mia Hubert (Department of Mathematics, KU Leuven, Celestijnenlaan 200B, BE-3001 Leuven, Belgium), Michiel Debruyne (Dexia Bank, Belgium), Peter J. Rousseeuw (Department of Mathematics, KU Leuven, Celestijnenlaan 200B, BE-3001 Leuven, Belgium). September 22, 2017. arXiv:1709.07045v1 [stat.ME] 20 Sep 2017. Abstract: The Minimum Covariance Determinant (MCD) method is a highly robust estimator of multivariate location and scatter, for which a fast algorithm is available. Since estimating the covariance matrix is the cornerstone of many multivariate statistical methods, the MCD is an important building block when developing robust multivariate techniques. It also serves as a convenient and efficient tool for outlier detection. The MCD estimator is reviewed, along with its main properties such as affine equivariance, breakdown value, and influence function. We discuss its computation, and list applications and extensions of the MCD in applied and methodological multivariate statistics. Two recent extensions of the MCD are described. The first one is a fast deterministic algorithm which inherits the robustness of the MCD while being almost affine equivariant. The second is tailored to high-dimensional data, possibly with more dimensions than cases, and incorporates regularization to prevent singular matrices. INTRODUCTION. The Minimum Covariance Determinant (MCD) estimator is one of the first affine equivariant and highly robust estimators of multivariate location and scatter [1, 2]. Being resistant to outlying observations makes the MCD very useful for outlier detection. Although already introduced in 1984, its main use has only started since the construction of the computationally efficient FastMCD algorithm of [3] in 1999. Since then, the MCD has been applied in numerous fields such as medicine, finance, image analysis and chemistry.
  • Estimation of the Inverse Scatter Matrix for a Scale Mixture of Wishart
    Djamila Boukehil, Dominique Fourdrinier, Fatiha Mezoued, William Strawderman. Estimation of the inverse scatter matrix for a scale mixture of Wishart matrices under Efron-Morris type losses. 2020. hal-02503972. HAL Id: hal-02503972, https://hal.archives-ouvertes.fr/hal-02503972, preprint submitted on 10 Mar 2020. Manuscript dated February 17, 2020. Abstract: We consider estimation of the inverse scatter matrix $\Sigma^{-1}$ for a scale mixture of Wishart matrices under various Efron-Morris type losses, $\mathrm{tr}[\{\hat{\Sigma}^{-1} - \Sigma^{-1}\}^2 S^k]$ for $k = 0, 1, 2, \ldots$, where $S$ is the sample covariance matrix. We improve on the standard estimators $a S^+$, where $S^+$ denotes the Moore-Penrose inverse of $S$ and $a$ is a positive constant, through an unbiased estimator of the risk difference between the new estimators and $a S^+$.
  • Robust Estimation of Scatter Matrix, Random Matrix Theory and an Application to Spectrum Sensing
    Robust Estimation of Scatter Matrix, Random Matrix Theory and an Application to Spectrum Sensing. Thesis by Zhedong Liu, in partial fulfillment of the requirements for the degree of Master of Science, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia, April 2019. Examination committee: Mohamed-Slim Alouini (chairperson), Abla Kammoun, Håvard Rue. Abstract: The covariance estimation is one of the most critical tasks in multivariate statistical analysis. In many applications, reliable estimation of the covariance matrix, or scatter matrix in general, is required. The performance of the classical maximum likelihood method relies a great deal on the validity of the model assumption. Since the assumptions are often approximately correct, many robust statistical methods have been proposed to be robust against the deviation from the model assumptions. M-estimator is an important class of robust estimator of the scatter matrix. The properties of these robust estimators under high dimensional setting, which means the number of dimensions has the same order of magnitude as the number of observations, is desirable. To study these, random matrix theory is a very important tool. With high dimensional properties of robust estimators, we introduced a new method for blind spectrum sensing in cognitive radio networks. ACKNOWLEDGEMENTS. First of all, I would like to thank my supervisor Prof.
  • Scatter Matrices and Independent Component Analysis 1 Introduction
    Austrian Journal of Statistics, Volume 35 (2006), Number 2&3, 175–189. Scatter Matrices and Independent Component Analysis. Hannu Oja (University of Tampere, Finland), Seija Sirkiä (University of Jyväskylä, Finland), and Jan Eriksson (Helsinki University of Technology, Finland). Abstract: In the independent component analysis (ICA) it is assumed that the components of the multivariate independent and identically distributed observations are linear transformations of latent independent components. The problem then is to find the (linear) transformation which transforms the observations back to independent components. In the paper the ICA is discussed and it is shown that, under some mild assumptions, two scatter matrices may be used together to find the independent components. The scatter matrices must then have the so called independence property. The theory is illustrated by examples. Keywords: Affine Equivariance, Elliptical Model, Independence, Independent Component Model, Kurtosis, Location, Principal Component Analysis (PCA), Skewness, Source Separation. 1 Introduction. Let $x_1, x_2, \ldots, x_n$ denote a random sample from a $p$-variate distribution. We also write $X = (x_1\ x_2\ \cdots\ x_n)'$ for the corresponding $n \times p$ data matrix. In statistical modelling of the observed data, one often assumes that the observations $x_i$ are independent $p$-vectors "generated" by the model $x_i = A z_i + b$, $i = 1, \ldots, n$, where the $z_i$'s are called standardized vectors, $b$ is a location $p$-vector, $A$ is a full-rank $p \times p$ transformation matrix and $V = AA'$ is a positive definite $p \times p$ ($PDS(p)$) scatter matrix. In most applications (two-samples, several-samples case, linear model, etc.), $b = b_i$ may be dependent on the design.
  • Tools for Exploring Multivariate Data: the Package ICS
    Tools for Exploring Multivariate Data: The Package ICS. Klaus Nordhausen (University of Tampere), Hannu Oja (University of Tampere), David E. Tyler (Rutgers, The State University of New Jersey). Abstract: This introduction to the R package ICS is a (slightly) modified version of Nordhausen, Oja, and Tyler (2008c), published in the Journal of Statistical Software. Invariant coordinate selection (ICS) has recently been introduced as a method for exploring multivariate data. It includes as a special case a method for recovering the unmixing matrix in independent components analysis (ICA). It also serves as a basis for classes of multivariate nonparametric tests, and as a tool in cluster analysis or blind discrimination. The aim of this paper is to briefly explain the (ICS) method and to illustrate how various applications can be implemented using the R package ICS. Several examples are used to show how the ICS method and ICS package can be used in analyzing a multivariate data set. Keywords: clustering, discriminant analysis, independent components analysis, invariant coordinate selection, R, transformation-retransformation method. 1. Introduction. Multivariate data normally arise by collecting $p$ measurements on $n$ individuals or experimental units. Such data can be displayed in tables with each row representing one individual and each column representing a particular measured variable. The resulting data matrix $X$ is then $n \times p$, with the row vector $x_i \in \mathbb{R}^p$ denoting the measurements taken on the $i$th individual or $i$th experimental unit. Hence $X = [x_1^\top, \ldots, x_n^\top]^\top$. To be consistent with the convention used in the programming language R, all vectors are understood to be row vectors.