Journal of Machine Learning Research 22 (2021) 1-45
Submitted 6/20; Revised 12/20; Published 1/21

Optimal Structured Principal Subspace Estimation: Metric Entropy and Minimax Rates

Tony Cai
[email protected]
Department of Statistics
University of Pennsylvania
Philadelphia, PA 19104, USA

Hongzhe Li
[email protected]
Department of Biostatistics, Epidemiology and Informatics
University of Pennsylvania
Philadelphia, PA 19104, USA

Rong Ma
[email protected]
Department of Biostatistics, Epidemiology and Informatics
University of Pennsylvania
Philadelphia, PA 19104, USA

Editor: Ambuj Tewari

Abstract

Driven by a wide range of applications, several principal subspace estimation problems have been studied individually under different structural constraints. This paper presents a unified framework for the statistical analysis of a general structured principal subspace estimation problem which includes as special cases sparse PCA/SVD, non-negative PCA/SVD, subspace constrained PCA/SVD, and spectral clustering. General minimax lower and upper bounds are established to characterize the interplay between the information-geometric complexity of the constraint set for the principal subspaces, the signal-to-noise ratio (SNR), and the dimensionality. The results yield interesting phase transition phenomena concerning the rates of convergence as a function of the SNRs and the fundamental limit for consistent estimation. Applying the general results to the specific settings yields the minimax rates of convergence for those problems, including the previously unknown optimal rates for sparse SVD, non-negative PCA/SVD and subspace constrained PCA/SVD.

Keywords: Low-rank matrix; Metric entropy; Minimax risk; Principal component analysis; Singular value decomposition

1. Introduction

Spectral methods such as principal component analysis (PCA) and singular value decomposition (SVD) are ubiquitous techniques in modern data analysis, with a wide range of applications in many fields including statistics, machine learning, applied mathematics, and engineering.