Robust Scatter Matrix Estimation for High Dimensional Distributions with Heavy Tail
Junwei Lu, Fang Han, and Han Liu


IEEE TRANSACTIONS ON INFORMATION THEORY

Abstract—This paper studies large scatter matrix estimation for heavy tailed distributions. The contributions of this paper are twofold. First, we propose and advocate to use a new distribution family, the pair-elliptical, for modeling high dimensional data. The pair-elliptical is more flexible, and its goodness of fit is easier to check, compared to the elliptical. Secondly, built on the pair-elliptical family, we advocate using quantile-based statistics for estimating the scatter matrix. For this, we provide a family of quantile-based statistics. They outperform the existing ones by better balancing efficiency and robustness. In particular, we show that the proposed estimators have comparable performance to their moment-based counterparts under the Gaussian assumption. The method is also tuning-free, in contrast to Catoni's M-estimator for covariance matrix estimation. We further apply the method to conduct a variety of statistical methods. The corresponding theoretical properties as well as numerical performances are provided.

Index Terms—Heavy-tailed distribution; Pair-elliptical distribution; Quantile-based statistics; Scatter matrix.

Junwei Lu is with the Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA (e-mail: [email protected]). Fang Han is with the Department of Statistics, University of Washington, Seattle, WA (e-mail: [email protected]). Han Liu is with the Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL (e-mail: [email protected]).

I. INTRODUCTION

Large covariance matrix estimation is a core problem in multivariate statistics. Pearson's sample covariance matrix is widely used for estimation and proves to enjoy certain optimality under the subgaussian assumption (1; 2; 3; 4; 5). However, this assumption is not realistic in many real applications where data are heavy-tailed (6; 7; 8).

To handle heavy-tailed data, rank-based statistics have been proposed. Compared to Pearson's sample covariance, rank-based estimators achieve extra efficiency by exploiting the dataset's geometric structures. Such structures, like symmetry, are naturally involved in the data generating scheme and allow for both efficient and robust inference. Rank-based covariance matrix estimation includes two steps. The first step is to estimate the (latent) correlation matrix; for this, (9), (10), (11), (12), (13), (14), and (15) exploit Spearman's rho and Kendall's tau estimators, which work under the nonparanormal or the transelliptical distribution family. The second step is to estimate the marginal variances; for this, (16), (17), and (18) exploit Catoni's M-estimator (19). However, Catoni's estimator requires tuning parameters. Moreover, it is sensitive to outliers and accordingly is not a robust estimator.

In this paper, we strengthen the results in the literature in two directions. First, we propose and advocate to use a new distribution family, the pair-elliptical. The pair-elliptical family is strictly larger and requires less symmetry structure than the elliptical. We provide detailed studies on the relation between the pair-elliptical and several heavy tailed distribution families, including the nonparanormal, elliptical, and transelliptical. Moreover, it is easier to test the goodness of fit for the pair-elliptical. For conducting such a test, we combine the existing results in low dimensions (20; 21; 22; 23; 24) with familywise error rate controlling techniques, including Bonferroni's correction, Holm's step-down procedure (25), and the higher criticism method (26; 27).

Secondly, built on the pair-elliptical family, we propose a new set of quantile-based statistics for estimating scatter/covariance matrices (the scatter matrix is any matrix proportional to the covariance matrix; see (28) for more details). We also provide the theoretical properties of the proposed methods. In particular, we show that the proposed quantile-based methods outperform the existing ones by better balancing robustness and efficiency. As applications, we exploit the proposed estimators for conducting several high dimensional statistical methods, and show the advantages of using the quantile-based statistics both theoretically and empirically.

A. Other Related Works

Quantile-based statistics, such as the median absolute deviation (29) and the Qn estimator (30; 31), have been used in estimating marginal standard deviations. Their properties in parameter estimation and their robustness to outliers have been further studied in low dimensions (32). Moreover, these estimators have been generalized to estimate the dispersions between random variables (33; 34; 35; 36).

Given these results, we mainly make three contributions. (i) Methodologically, we propose new quantile-based scatter matrix estimators that generalize the existing MAD and Qn estimators for better balancing efficiency and robustness. (ii) Theoretically, we provide more understanding of the quantile-based methods, confirming that the quantile-based estimators are good alternatives to the prevailing moment-based estimators in high dimensions. (iii) We propose a projection method for overcoming the lack of positive semidefiniteness, which is typical in robust scatter matrix estimation. This approach maintains efficiency as well as robustness to data contamination, while the prevailing SVD decomposition approach (36) cannot.

Of note, the effectiveness of quantile-based methods is being realized in other fields of high dimensional statistics. For example, (37), (38), and (39) analyze penalized quantile regression and show that it can handle very heavy-tailed noise. Our method, although very different from theirs, shares similar properties.

B. Notation System

Let $M = [M_{jk}] \in \mathbb{R}^{d\times d}$ be a matrix and $v = (v_1,\ldots,v_d)^T \in \mathbb{R}^d$ be a vector. We denote by $v_I$ the subvector of $v$ whose entries are indexed by a set $I \subset \{1,\ldots,d\}$, and by $M_{I,J}$ the submatrix of $M$ whose rows and columns are indexed by $I$ and $J$. For $0 < q < \infty$, we define the $\ell_0$, $\ell_q$, and $\ell_\infty$ vector (pseudo-)norms to be $\|v\|_0 := \sum_{j=1}^d I(v_j \neq 0)$, $\|v\|_q := (\sum_{j=1}^d |v_j|^q)^{1/q}$, and $\|v\|_\infty := \max_{1\le j\le d} |v_j|$, where $I(\cdot)$ represents the indicator function. For a matrix $M$, we denote the matrix $\ell_q$, $\ell_{\max}$, and Frobenius norms of $M$ by $\|M\|_q := \max_{\|v\|_q=1} \|Mv\|_q$, $\|M\|_{\max} := \max_{jk} |M_{jk}|$, and $\|M\|_{\mathrm F} := (\sum_{j,k} |M_{jk}|^2)^{1/2}$. For any matrix $M \in \mathbb{R}^{d\times d}$, we denote by $\mathrm{diag}(M)$ the diagonal matrix with the same diagonal entries as $M$, and by $I_d \in \mathbb{R}^{d\times d}$ the $d$ by $d$ identity matrix. Let $\lambda_j(M)$ and $u_j(M)$ represent the $j$-th largest eigenvalue and the corresponding eigenvector of $M$, and let $\langle M_1, M_2\rangle := \mathrm{Tr}(M_1^T M_2)$ be the inner product of $M_1$ and $M_2$. For any two random vectors $X$ and $Y$, we write $X \overset{D}{=} Y$ if and only if $X$ and $Y$ are identically distributed. Throughout the paper, we let $c, C$ be two generic absolute constants whose values may vary at different locations.

C. Paper Organization

The rest of the paper is organized as follows. In the next section we provide a theoretical evaluation of the impact of heavy tails on moment-based estimators; this motivates our work. Section III proposes the pair-elliptical family and reveals the connections among the Gaussian, elliptical, nonparanormal, transelliptical, and pair-elliptical. In Section IV, we introduce the generalized MAD and Qn estimators for estimating scatter/covariance matrices. Section V provides the theoretical results. Section VI discusses parameter selection. In Section VII, we apply the proposed estimators to conduct multiple multivariate methods. We put experiments on synthetic and real data in Section VIII, more discussions in Section IX, and technical proofs in the appendix.

The random variable $X$ is heavy tailed if there exists some $p > 0$ such that $\|X\|_{L_q} \le K < \infty$ for $q \le p$, and $\|X\|_{L_{p+1}} = \infty$. The heavy tailedness of $X$ is measured by how large $p$ can be such that the $p$-th moment exists.

In the following we first provide an error bound for the sample mean. It illustrates the "optimal rate but sub-optimal scaling" phenomenon.

Theorem II.1. Suppose $X = (X_1,\ldots,X_d)^T \in \mathbb{R}^d$ is a random vector with population mean $\mu$. Assume $X$ satisfies $\|X_j\|_{L_p} \le K$ for each $j$, where we assume $d = O(n^{\gamma})$ and $p = 2 + 2\gamma + \delta$. Letting $\bar\mu$ be the sample mean of $n$ independent observations of $X$, we then have, with probability no smaller than $1 - 2d^{-2.5} - (\log d)^{p/2} n^{-\delta/2}$,
$\|\bar\mu - \mu\|_\infty \le 12K\sqrt{\log d / n}$.

Theorem II.1 shows that, for preserving the $O_P(\sqrt{\log d/n})$ rate of convergence, $p$ determines how large the dimension $d$ can be compared to $n$. For example, when at most the $(4+\epsilon)$-th moment exists for $X$ for some $\epsilon > 0$, the sample mean attains the optimal rate $O_P(\sqrt{\log d/n})$ under the suboptimal scaling $d = O(n)$.

The result in Theorem II.1 cannot be improved without adding more assumptions. Via a worst case analysis, the next theorem characterizes the sharpness of Theorem II.1.

Theorem II.2. For any fixed constant $C$, $p = 2 + 2\gamma$ with $\gamma > 0$, and $d = n^{\gamma+\delta_0}$ for some $\delta_0 > 0$, there exists a random vector $X$ satisfying $\|X_i\|_{L_q} < K$, for some absolute constant $K > 0$ and all $q \le p$, such that, with probability tending to 1, we have
$\|\bar\mu - \mu\|_\infty \ge C\sqrt{\log d / n}$.

Theorems II.1 and II.2 together illustrate the constraints of applying moment-based estimators to study heavy tailed distributions. This motivates us to consider alternative methods that are more efficient in handling heavy tailedness.
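Theorems II.1 and II.2 concern the max-norm error of the moment-based sample mean under heavy tails; the phenomenon is easy to simulate. Below is a minimal illustration (ours, not from the paper): coordinates are t-distributed with 3 degrees of freedom, so only moments of order below 3 exist, and the sample mean is compared with a quantile-based competitor, the coordinate-wise median.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 50

# Heavy-tailed data: t(3) coordinates, population mean 0.
X = rng.standard_t(df=3, size=(n, d))

# Max-norm error of the sample mean, the moment-based estimator
# discussed in Theorems II.1 and II.2.
mean_err = np.max(np.abs(X.mean(axis=0)))

# Max-norm error of the coordinate-wise median, a quantile-based
# alternative (the t-distribution is symmetric, so its median is 0).
median_err = np.max(np.abs(np.median(X, axis=0)))

print(mean_err, median_err)
```

Both errors are of order $\sqrt{\log d/n}$ here; the theorems say the interesting difference appears in how large $d$ may grow relative to $n$ before the moment-based estimate breaks down.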
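The generalized MAD construction of Section IV is not reproduced here, but its spirit can be sketched from the polarization identity $\mathrm{Cov}(X_j, X_k) = \{\mathrm{Var}(X_j + X_k) - \mathrm{Var}(X_j - X_k)\}/4$, with each variance replaced by a squared MAD scale calibrated to the Gaussian (the factor $1.4826 \approx 1/\Phi^{-1}(3/4)$). The function names below are ours; this is a hedged sketch of the idea, not the authors' exact estimator.

```python
import numpy as np

def mad_scale(x):
    """Median absolute deviation, rescaled to be consistent for the
    standard deviation under Gaussianity (factor ~1.4826)."""
    return 1.4826 * np.median(np.abs(x - np.median(x)))

def mad_scatter(X):
    """Pairwise MAD-based scatter matrix via the polarization identity
    Cov(Xj, Xk) = (Var(Xj + Xk) - Var(Xj - Xk)) / 4, with variances
    replaced by squared MAD scales. A sketch in the spirit of the
    paper's quantile-based estimators, not its exact construction."""
    n, d = X.shape
    S = np.empty((d, d))
    for j in range(d):
        for k in range(j, d):
            splus = mad_scale(X[:, j] + X[:, k])
            sminus = mad_scale(X[:, j] - X[:, k])
            S[j, k] = S[k, j] = (splus**2 - sminus**2) / 4.0
    return S

rng = np.random.default_rng(1)
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
X = rng.multivariate_normal(np.zeros(2), Sigma, size=5000)
S_hat = mad_scatter(X)
print(S_hat)
```

As the paper notes for robust pairwise estimators generally, a matrix built entry-by-entry this way need not be positive semidefinite; the projection method of contribution (iii) addresses exactly that defect.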
Recommended publications
  • New Insights Into the Statistical Properties of M-Estimators Gordana Draskovic, Frédéric Pascal
New insights into the statistical properties of M-estimators. Gordana Draskovic, Frédéric Pascal. IEEE Transactions on Signal Processing, 66(16):4253-4263, 2018. doi:10.1109/TSP.2018.2841892 (hal-01816084, https://hal.archives-ouvertes.fr/hal-01816084).

Abstract—This paper proposes an original approach to better understanding the behavior of robust scatter matrix M-estimators. Scatter matrices are of particular interest for many signal processing applications, since the resulting performance strongly relies on the quality of the matrix estimation. In this context, M-estimators appear as very interesting candidates, mainly due to their flexibility to the statistical model and their robustness. Owing to its explicit form, the SCM is easy to manipulate and therefore widely used in the signal processing community. Nevertheless, complex normality sometimes presents a poor approximation of the underlying physics: noise and interference can be spiky and impulsive, i.e., have heavier tails than the Gaussian distribution.
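The flavor of a scatter M-estimator is easiest to see from its fixed-point equation. Below is a minimal numpy sketch of Tyler's estimator, one illustrative member of the class discussed in the abstract above (naming ours; data assumed centered; the estimator is defined only up to scale, so the result is normalized to trace $d$):

```python
import numpy as np

def tyler_estimator(X, n_iter=200, tol=1e-8):
    """Tyler's fixed-point M-estimator of the scatter shape matrix.

    Iterates Sigma <- (d/n) * sum_i x_i x_i^T / (x_i^T Sigma^{-1} x_i)
    on centered rows of X until convergence, normalizing to trace d.
    """
    n, d = X.shape
    Sigma = np.eye(d)
    for _ in range(n_iter):
        inv = np.linalg.inv(Sigma)
        w = np.einsum('ij,jk,ik->i', X, inv, X)   # x_i^T Sigma^{-1} x_i
        Sigma_new = (d / n) * (X.T * (1.0 / w)) @ X
        Sigma_new *= d / np.trace(Sigma_new)
        if np.max(np.abs(Sigma_new - Sigma)) < tol:
            return Sigma_new
        Sigma = Sigma_new
    return Sigma

# Elliptical multivariate t with 2 degrees of freedom: second moments
# do not exist, yet Tyler's estimator still recovers the shape matrix.
rng = np.random.default_rng(2)
shape = np.array([[1.0, 0.5], [0.5, 1.0]])    # trace 2 already
G = rng.standard_normal((5000, 2)) @ np.linalg.cholesky(shape).T
X = G / np.sqrt(rng.chisquare(2, size=5000) / 2)[:, None]
T_hat = tyler_estimator(X)
print(T_hat)
```

The distribution-free behavior over the elliptical family is exactly why this estimator is also called the Fixed Point Estimator in the radar literature cited above.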
  • Multivariate Statistical Functions in R
Multivariate statistical functions in R. Michail T. Tsagris, [email protected]. College of Engineering and Technology, American University of the Middle East, Egaila, Kuwait. Version 6.1. Athens, Nottingham and Abu Halifa (Kuwait), 31 October 2014.

Contents
1 Mean vectors
  1.1 Hotelling's one-sample T² test
  1.2 Hotelling's two-sample T² test
  1.3 Two two-sample tests without assuming equality of the covariance matrices
  1.4 MANOVA without assuming equality of the covariance matrices
2 Covariance matrices
  2.1 One-sample covariance test
  2.2 Multi-sample covariance matrices
    2.2.1 Log-likelihood ratio test
    2.2.2 Box's M test
3 Regression, correlation and discriminant analysis
  3.1 Correlation
    3.1.1 Correlation coefficient confidence intervals and hypothesis testing using Fisher's transformation
    3.1.2 Non-parametric bootstrap hypothesis testing for a zero correlation coefficient
    3.1.3 Hypothesis testing for two correlation coefficients
  3.2 Regression
    3.2.1 Classical multivariate regression
    3.2.2 k-NN regression
    3.2.3 Kernel regression
    3.2.4 Choosing the bandwidth in kernel regression in a very simple way
    3.2.5 Principal components regression
    3.2.6 Choosing the number of components in principal component regression
    3.2.7 The spatial median and spatial median regression
    3.2.8 Multivariate ridge regression
  3.3 Discriminant analysis
    3.3.1 Fisher's linear discriminant function
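The first routine in the contents above, Hotelling's one-sample T² test, is compact enough to sketch outside R. A Python version under the usual multivariate-normality assumption (function name ours; the p-value uses the exact F reference distribution $\frac{n-p}{p(n-1)} T^2 \sim F_{p,\,n-p}$):

```python
import numpy as np
from scipy import stats

def hotelling_one_sample(X, mu0):
    """Hotelling's one-sample T^2 test of H0: E[X] = mu0.

    Returns the T^2 statistic and a p-value from the F distribution,
    which is exact under multivariate normality.
    """
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)              # unbiased sample covariance
    diff = xbar - mu0
    t2 = n * diff @ np.linalg.solve(S, diff)  # n * diff' S^{-1} diff
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, stats.f.sf(f_stat, p, n - p)

# Data whose true mean differs from the hypothesized zero vector.
rng = np.random.default_rng(3)
X = rng.standard_normal((60, 3)) + np.array([0.0, 0.0, 1.0])
t2, pval = hotelling_one_sample(X, np.zeros(3))
print(t2, pval)
```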
  • Multivariate L1 Statistical Methods: the Package
Journal of Statistical Software, July 2011, Volume 43, Issue 5. http://www.jstatsoft.org/

Multivariate L1 Methods: The Package MNM. Klaus Nordhausen (University of Tampere), Hannu Oja (University of Tampere).

Abstract: In the paper we present an R package MNM dedicated to multivariate data analysis based on the L1 norm. The analysis proceeds very much as does a traditional multivariate analysis. The regular L2 norm is just replaced by different L1 norms, observation vectors are replaced by their (standardized and centered) spatial signs, spatial ranks, and spatial signed-ranks, and so on. The procedures are fairly efficient and robust, and no moment assumptions are needed for asymptotic approximations. The background theory is briefly explained in the multivariate linear regression model case, and the use of the package is illustrated with several examples using the R package MNM.

Keywords: least absolute deviation, mean deviation, mean difference, multivariate linear regression, R, shape matrix, spatial sign, spatial signed-rank, spatial rank, transformation-retransformation method.

1. Introduction

Classical multivariate statistical inference methods (Hotelling's T², multivariate analysis of variance, multivariate regression, tests for independence, canonical correlation analysis, principal component analysis, and so on) are based on the use of the L2 norm. These standard moment-based multivariate techniques are optimal under multivariate normality of the residuals but poor in their efficiency for heavy-tailed distributions. They are also highly sensitive to outlying observations. In this paper we present an R package MNM (available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=MNM) which uses different L1 norms and the corresponding scores (spatial signs, spatial signed-ranks, and spatial ranks) in the analysis of multivariate data.
  • Robust Estimation of Structured Scatter Matrices in (Mis)Matched Models Bruno Meriaux, Chengfang Ren, Mohammed Nabil El Korso, Arnaud Breloy, Philippe Forster
Robust Estimation of Structured Scatter Matrices in (Mis)matched Models. Bruno Mériaux, Chengfang Ren, Mohammed Nabil El Korso, Arnaud Breloy, Philippe Forster. Signal Processing (Elsevier), 2019, 165, pp. 163-174. doi:10.1016/j.sigpro.2019.06.030 (hal-02165848, https://hal.archives-ouvertes.fr/hal-02165848).

Affiliations: SONDRA, CentraleSupélec, Gif-sur-Yvette, France; Paris Nanterre University, LEME EA-4416, Ville d'Avray, France; Paris Nanterre University, SATIE, Cachan, France.

Abstract: Covariance matrix estimation is a ubiquitous problem in signal processing. In most modern signal processing applications, data are generally modeled by non-Gaussian distributions with covariance matrices exhibiting a particular structure. Taking this structure and the non-Gaussian behavior into account drastically improves the estimation accuracy.
  • Covariance Estimation in Two-Level Regression
2nd International Conference on Control and Fault-Tolerant Systems, SysTol'13, October 9-11, 2013, Nice, France.

Covariance Estimation in Two-Level Regression. Nicholas Moehle and Dimitry Gorinevsky.

Abstract—This paper considers estimation of covariance matrices in multivariate linear regression models for two-level data produced by a population of similar units (individuals). The proposed Bayesian formulation assumes that the covariances for different units are sampled from a common distribution. Assuming that this common distribution is Wishart, the optimal Bayesian estimation problem is shown to be convex. This paper proposes a specialized scalable algorithm for solving this two-level optimal Bayesian estimation problem. The algorithm scales to datasets with thousands of units and trillions of data points per unit by solving the problem recursively, allowing new data to be quickly incorporated into the estimates. An example problem is used to show that the proposed approach improves over existing approaches to estimating covariance matrices in linear models for two-level data.

Using data from multiple units in a two-level data set can improve the covariance estimation accuracy. Covariance matrix estimation for one-level datasets has attracted substantial attention earlier. Several approaches to maximum likelihood estimation (MLE) with shrinkage (regularization) have been proposed; see, e.g., [10], [11], where further references can be found. The approaches related to this paper add regularization by using a Bayesian prior in maximum a posteriori probability (MAP) estimation, such as an inverse Wishart prior for the covariance matrix; see [13]. The inverse Wishart is the conjugate prior for the covariance matrix.
  • Minimum Covariance Determinant and Extensions
Minimum Covariance Determinant and Extensions. Mia Hubert (Department of Mathematics, KU Leuven, Celestijnenlaan 200B, BE-3001 Leuven, Belgium), Michiel Debruyne (Dexia Bank, Belgium), Peter J. Rousseeuw (KU Leuven). September 22, 2017. arXiv:1709.07045v1 [stat.ME].

Abstract: The Minimum Covariance Determinant (MCD) method is a highly robust estimator of multivariate location and scatter, for which a fast algorithm is available. Since estimating the covariance matrix is the cornerstone of many multivariate statistical methods, the MCD is an important building block when developing robust multivariate techniques. It also serves as a convenient and efficient tool for outlier detection. The MCD estimator is reviewed, along with its main properties such as affine equivariance, breakdown value, and influence function. We discuss its computation, and list applications and extensions of the MCD in applied and methodological multivariate statistics. Two recent extensions of the MCD are described. The first one is a fast deterministic algorithm which inherits the robustness of the MCD while being almost affine equivariant. The second is tailored to high-dimensional data, possibly with more dimensions than cases, and incorporates regularization to prevent singular matrices.

INTRODUCTION

The Minimum Covariance Determinant (MCD) estimator is one of the first affine equivariant and highly robust estimators of multivariate location and scatter [1, 2]. Being resistant to outlying observations makes the MCD very useful for outlier detection. Although already introduced in 1984, its main use has only started since the construction of the computationally efficient FastMCD algorithm of [3] in 1999. Since then, the MCD has been applied in numerous fields such as medicine, finance, image analysis and chemistry.
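The concentration step at the heart of the FastMCD algorithm mentioned above is short enough to sketch. The following is a crude illustration (ours): random initial subsets are refined by C-steps, and the subset whose covariance has the smallest determinant is kept. It omits FastMCD's partitioning and selective-iteration refinements, so it is a sketch of the principle, not the published algorithm.

```python
import numpy as np

def c_step(X, idx):
    """One concentration step: re-estimate location/scatter from the
    current subset, then keep the h points with the smallest
    Mahalanobis distances. Each C-step cannot increase det(cov)."""
    h = len(idx)
    mu = X[idx].mean(axis=0)
    S = np.cov(X[idx], rowvar=False)
    diff = X - mu
    d2 = np.einsum('ij,ji->i', diff, np.linalg.solve(S, diff.T))
    return np.argsort(d2)[:h]

def mcd_estimate(X, h=None, n_starts=20, n_steps=25, seed=0):
    """Crude MCD sketch: random starts + C-steps, smallest det wins."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    h = h or (n + p + 1) // 2
    best, best_det = None, np.inf
    for _ in range(n_starts):
        idx = rng.choice(n, size=h, replace=False)
        for _ in range(n_steps):
            new = c_step(X, idx)
            if set(new) == set(idx):
                break
            idx = new
        det = np.linalg.det(np.cov(X[idx], rowvar=False))
        if det < best_det:
            best_det, best = det, idx
    return X[best].mean(axis=0), np.cov(X[best], rowvar=False)

# 200 clean points around the origin plus 20 gross outliers near (10, 10):
# the robust location stays near 0 while the sample mean is pulled away.
rng = np.random.default_rng(6)
X = np.vstack([rng.standard_normal((200, 2)),
               rng.standard_normal((20, 2)) * 0.1 + 10.0])
mu_robust, S_robust = mcd_estimate(X)
print(mu_robust, X.mean(axis=0))
```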
  • Estimation of the Inverse Scatter Matrix for a Scale Mixture of Wishart
Estimation of the inverse scatter matrix for a scale mixture of Wishart matrices under Efron-Morris type losses. Djamila Boukehil, Dominique Fourdrinier, Fatiha Mezoued, William E. Strawderman. Preprint, February 17, 2020 (hal-02503972, https://hal.archives-ouvertes.fr/hal-02503972).

Abstract: We consider estimation of the inverse scatter matrix $\Sigma^{-1}$ for a scale mixture of Wishart matrices under various Efron-Morris type losses, $\mathrm{tr}[\{\hat{\Sigma}^{-1} - \Sigma^{-1}\}^2 S^k]$ for $k = 0, 1, 2, \ldots$, where $S$ is the sample covariance matrix. We improve on the standard estimators $a S^+$, where $S^+$ denotes the Moore-Penrose inverse of $S$ and $a$ is a positive constant, through an unbiased estimator of the risk difference between the new estimators and $a S^+$.
  • Robust Estimation of Scatter Matrix, Random Matrix Theory and an Application to Spectrum Sensing
Robust Estimation of Scatter Matrix, Random Matrix Theory and an Application to Spectrum Sensing. Thesis by Zhedong Liu, in partial fulfillment of the requirements for the degree of Master of Science. King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia, April 2019. Examination committee: Mohamed-Slim Alouini (chairperson), Abla Kammoun, Håvard Rue.

Abstract: Covariance estimation is one of the most critical tasks in multivariate statistical analysis. In many applications, reliable estimation of the covariance matrix, or scatter matrix in general, is required. The performance of the classical maximum likelihood method relies a great deal on the validity of the model assumptions. Since the assumptions are often only approximately correct, many robust statistical methods have been proposed to be robust against deviations from the model assumptions. The M-estimator is an important class of robust estimators of the scatter matrix. The properties of these robust estimators in the high dimensional setting, where the number of dimensions has the same order of magnitude as the number of observations, are of particular interest, and random matrix theory is a very important tool for studying them. Using high dimensional properties of robust estimators, we introduce a new method for blind spectrum sensing in cognitive radio networks.
  • Asymptotic Properties of Robust Complex Covariance Matrix Estimates
Asymptotic properties of robust complex covariance matrix estimates. Mélanie Mahot (Student Member, IEEE), Frédéric Pascal (Member, IEEE), Philippe Forster (Member, IEEE), Jean-Philippe Ovarlez (Member, IEEE).

Abstract—In many statistical signal processing applications, the estimation of nuisance parameters and parameters of interest is strongly linked to the resulting performance. Generally, these applications deal with complex data. This paper focuses on covariance matrix estimation problems in non-Gaussian environments and particularly on M-estimators in the context of elliptical distributions. Firstly, this paper extends to the complex case the results of Tyler in [1]. More precisely, the asymptotic distribution of these estimators as well as the asymptotic distribution of any homogeneous function of degree 0 of the M-estimates are derived. On the other hand, we show the improvement brought by such results in two applications: DOA (directions of arrival) estimation using the MUSIC (MUltiple SIgnal Classification) algorithm, and adaptive radar detection based on the ANMF (Adaptive Normalized Matched Filter) test.

M-estimators of the covariance matrix are, however, seldom used in the signal processing community. Only a limited case, Tyler's estimator [13], also called the Fixed Point Estimator [14], has been widely used as an alternative to the SCM for radar applications. Concerning the M-estimators, notable exceptions are the recent papers by Ollila [15], [16], [17], [18], [19], who advocates their use in several applications such as array processing. The M-estimators have also been studied in the case of large datasets, where the dimension of the data is of the same order as the dimension of the sample [20]. One possible reason for this lack of interest is that their statistical ...
  • Scatter Matrices and Independent Component Analysis 1 Introduction
AUSTRIAN JOURNAL OF STATISTICS, Volume 35 (2006), Number 2&3, 175-189.

Scatter Matrices and Independent Component Analysis. Hannu Oja (University of Tampere, Finland), Seija Sirkiä (University of Jyväskylä, Finland), and Jan Eriksson (Helsinki University of Technology, Finland).

Abstract: In independent component analysis (ICA) it is assumed that the components of the multivariate independent and identically distributed observations are linear transformations of latent independent components. The problem then is to find the (linear) transformation which transforms the observations back to independent components. In the paper the ICA is discussed and it is shown that, under some mild assumptions, two scatter matrices may be used together to find the independent components. The scatter matrices must then have the so-called independence property. The theory is illustrated by examples.

Keywords: Affine Equivariance, Elliptical Model, Independence, Independent Component Model, Kurtosis, Location, Principal Component Analysis (PCA), Skewness, Source Separation.

1 Introduction

Let $x_1, x_2, \ldots, x_n$ denote a random sample from a $p$-variate distribution. We also write $X = (x_1\ x_2\ \cdots\ x_n)'$ for the corresponding $n \times p$ data matrix. In statistical modelling of the observed data, one often assumes that the observations $x_i$ are independent $p$-vectors "generated" by the model
$x_i = A z_i + b$, $i = 1, \ldots, n$,
where the $z_i$'s are called standardized vectors, $b$ is a location $p$-vector, $A$ is a full-rank $p \times p$ transformation matrix, and $V = AA'$ is a positive definite $p \times p$ ($PDS(p)$) scatter matrix. In most applications (two-sample and several-sample cases, the linear model, etc.), $b = b_i$ may be dependent on the design.
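A classical concrete instance of the two-scatter idea described above is FOBI, which pairs the covariance matrix with the fourth-moment scatter $E[\|z\|^2 z z^\top]$ of the whitened data; the pair has the independence property, and separation works when the independent components have distinct kurtoses. A sketch (our implementation, not code from the paper):

```python
import numpy as np

def fobi(X):
    """FOBI: ICA from two scatter matrices.

    Whiten with the covariance matrix (first scatter), then
    eigendecompose the fourth-moment scatter E[||z||^2 z z^T] of the
    whitened data (second scatter). Requires components with
    distinct kurtoses."""
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    W = vecs @ np.diag(vals ** -0.5) @ vecs.T     # symmetric whitener
    Z = Xc @ W.T
    r2 = np.sum(Z**2, axis=1)                     # ||z_i||^2 weights
    S2 = (Z.T * r2) @ Z / len(Z)                  # fourth-moment scatter
    _, U = np.linalg.eigh(S2)
    return U.T @ W                                # unmixing estimate

# Two independent sources with different kurtoses (uniform, Laplace),
# mixed by an assumed mixing matrix A: x = A s.
rng = np.random.default_rng(4)
S = np.column_stack([rng.uniform(-1, 1, 20000),
                     rng.laplace(size=20000)])
A = np.array([[2.0, 1.0], [1.0, 1.0]])
B = fobi(S @ A.T)
print(B @ A)   # close to a scaled permutation matrix if separation works
```

Up to sign, scale, and permutation, `B @ A` should be diagonal: each estimated component matches one latent source, which is exactly the recovery guarantee the two-scatter theory provides.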
  • Tools for Exploring Multivariate Data: the Package ICS
Tools for Exploring Multivariate Data: The Package ICS. Klaus Nordhausen (University of Tampere), Hannu Oja (University of Tampere), David E. Tyler (Rutgers, The State University of New Jersey).

Abstract: This introduction to the R package ICS is a (slightly) modified version of Nordhausen, Oja, and Tyler (2008c), published in the Journal of Statistical Software. Invariant coordinate selection (ICS) has recently been introduced as a method for exploring multivariate data. It includes as a special case a method for recovering the unmixing matrix in independent components analysis (ICA). It also serves as a basis for classes of multivariate nonparametric tests, and as a tool in cluster analysis or blind discrimination. The aim of this paper is to briefly explain the ICS method and to illustrate how various applications can be implemented using the R package ICS. Several examples are used to show how the ICS method and ICS package can be used in analyzing a multivariate data set.

Keywords: clustering, discriminant analysis, independent components analysis, invariant coordinate selection, R, transformation-retransformation method.

1. Introduction

Multivariate data normally arise by collecting $p$ measurements on $n$ individuals or experimental units. Such data can be displayed in tables with each row representing one individual and each column representing a particular measured variable. The resulting data matrix $X$ is then $n \times p$, with the row vector $x_i \in \mathbb{R}^p$ denoting the measurements taken on the $i$-th individual or $i$-th experimental unit. Hence $X = [x_1^\top, \ldots, x_n^\top]^\top$. To be consistent with the convention used in the programming language R, all vectors are understood to be row vectors.