Testing mutual independence in high dimension via distance covariance

By Shun Yao, Xianyang Zhang, Xiaofeng Shao ∗

∗Address correspondence to Xianyang Zhang ([email protected]), Assistant Professor, Department of Statistics, Texas A&M University.

arXiv:1609.09380v2 [stat.ME] 18 Sep 2017

Abstract

In this paper, we introduce an L2-type test for mutual independence and banded dependence structure in high-dimensional data. The test is constructed from the pairwise distance covariance, and it accounts for non-linear and non-monotone dependences among the data, which cannot be fully captured by existing tests based on either Pearson correlation or rank correlation. Our test can be conveniently implemented in practice because the limiting null distribution of the test statistic is shown to be standard normal. It exhibits excellent finite-sample performance in our simulation studies, even when the sample size is small and the dimension is high, and is shown to successfully identify nonlinear dependence in empirical data analysis. On the theory side, asymptotic normality of our test statistic is shown under quite mild moment assumptions and with little restriction on the growth rate of the dimension as a function of the sample size. As a demonstration of the good power properties of our distance covariance based test, we further show that an infeasible version of our test statistic is rate optimal in the class of Gaussian distributions with equal correlation.

Keywords: Banded dependence, Degenerate U-statistics, Distance correlation, High dimensionality, Hoeffding decomposition

1 Introduction

In statistical multivariate analysis and machine learning research, a fundamental problem is to explore the relationships and dependence structure among subsets of variables. An important