A Least Squares Formulation for a Class of Generalized Eigenvalue Problems in Machine Learning

Liang Sun [email protected]
Shuiwang Ji [email protected]
Jieping Ye [email protected]

Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, USA

Abstract

Many machine learning algorithms can be formulated as a generalized eigenvalue problem. One major limitation of such a formulation is that the generalized eigenvalue problem is computationally expensive to solve, especially for large-scale problems. In this paper, we show that under a mild condition, a class of generalized eigenvalue problems in machine learning can be formulated as a least squares problem. This class of problems includes classical techniques such as Canonical Correlation Analysis (CCA), Partial Least Squares (PLS), and Linear Discriminant Analysis (LDA), as well as Hypergraph Spectral Learning (HSL).

1. Introduction

Many machine learning algorithms can be formulated as a generalized eigenvalue problem. Such techniques include Canonical Correlation Analysis (CCA), Partial Least Squares (PLS), Linear Discriminant Analysis (LDA), and Hypergraph Spectral Learning (HSL) (Hotelling, 1936; Rosipal & Krämer, 2006; Sun et al., 2008a; Tao et al., 2009; Ye, 2007). Although well-established algorithms in numerical linear algebra have been developed to solve generalized eigenvalue problems, they are in general computationally expensive and hence may not scale to large-scale machine learning problems. In addition, it is challenging to directly incorporate the sparsity constraint into the mathematical formulation of these techniques. Sparsity often leads to easy interpretation and good generalization ability. It has been used successfully in linear regression (Tibshirani, 1996) and Principal Component Analysis (PCA) (d'Aspremont et al., 2004).

Multivariate Linear Regression (MLR) that minimizes the sum-of-squares error function is a classical technique for regression problems.
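To make the claimed equivalence concrete, the following is a minimal numerical sketch, not the paper's general construction, for the simplest special case: two-class LDA. Here the direction obtained from the generalized eigenvalue problem S_B w = lambda S_T w (between-class versus total scatter) coincides, up to scaling, with the ordinary least squares solution for a centered class-indicator target. The variable names and the specific target coding below are our own illustrative choices; the multi-class construction and the mild condition under which the equivalence holds are developed later in the paper.

    import numpy as np
    from scipy.linalg import eigh

    rng = np.random.default_rng(0)
    n1, n2, d = 60, 40, 5
    X = np.vstack([rng.normal(0.0, 1.0, size=(n1, d)),
                   rng.normal(1.0, 1.0, size=(n2, d))])
    n = n1 + n2
    Xc = X - X.mean(axis=0)          # globally centered data matrix

    # Scatter matrices: between-class (S_B) and total (S_T).
    m1, m2 = Xc[:n1].mean(axis=0), Xc[n1:].mean(axis=0)
    Sb = n1 * np.outer(m1, m1) + n2 * np.outer(m2, m2)
    St = Xc.T @ Xc

    # Generalized eigenvalue problem S_B w = lambda S_T w;
    # keep the eigenvector for the largest eigenvalue.
    _, vecs = eigh(Sb, St)
    w_eig = vecs[:, -1]

    # Least squares fit to a centered class-indicator target
    # (any two-valued coding gives the same direction after centering).
    y = np.concatenate([np.full(n1, n2 / n), np.full(n2, -n1 / n)])
    w_ls, *_ = np.linalg.lstsq(Xc, y, rcond=None)

    # The two directions agree up to scaling: |cosine| close to 1.
    cos = abs(w_eig @ w_ls) / (np.linalg.norm(w_eig) * np.linalg.norm(w_ls))
    print(f"|cos(w_eig, w_ls)| = {cos:.6f}")

Once the problem is in least squares form, standard regression machinery applies directly; for instance, adding an l1 penalty on w in the sketch above would yield a sparse discriminant direction, which is precisely the kind of extension that is difficult to attach to the generalized eigenvalue formulation.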