Semi-Supervised Support Vector Machines with Tangent Space Intrinsic Manifold Regularization

Shiliang Sun and Xijiong Xie

Abstract—Semi-supervised learning has been an active research topic in machine learning and data mining. One main reason is that labeling examples is expensive and time-consuming, while large numbers of unlabeled examples are available in many practical problems. So far, Laplacian regularization has been widely used in semi-supervised learning. In this paper, we propose a new regularization method called tangent space intrinsic manifold regularization. It is intrinsic to the data manifold and favors linear functions on the manifold. The fundamental elements involved in its formulation are local tangent space representations, which are estimated by local principal component analysis, and the connections that relate adjacent tangent spaces. We further explore its application to semi-supervised classification and propose two new learning algorithms called tangent space intrinsic manifold regularized support vector machines (TiSVMs) and tangent space intrinsic manifold regularized twin support vector machines (TiTSVMs), which effectively integrate the tangent space intrinsic manifold regularization. The optimization of TiSVMs can be solved by a standard quadratic program, while the optimization of TiTSVMs can be solved by a pair of standard quadratic programs. Experimental results on semi-supervised classification problems show the effectiveness of the proposed semi-supervised learning algorithms.

Index Terms—Support vector machine, twin support vector machine, semi-supervised classification, manifold learning, tangent space intrinsic manifold regularization

1 INTRODUCTION

Semi-supervised classification, which estimates a decision function from few labeled examples and a large quantity of unlabeled examples, is an active research topic. Its prevalence is mainly motivated by the need to reduce the expensive or time-consuming label acquisition process. Evidence shows that, provided that the unlabeled data, which are inexpensive to collect, are properly exploited, one can obtain superior performance over the counterpart supervised learning approaches trained with few labeled examples. For a comprehensive survey of semi-supervised learning methods, refer to [1] and [2].

Current semi-supervised classification methods can be divided into two categories, called single-view and multi-view algorithms, respectively. Their difference lies in the number of feature sets used to train classifiers. If more than one feature set is adopted to learn classifiers, the algorithm is called a multi-view semi-supervised learning algorithm. The Laplacian support vector machines (LapSVMs) [3], [4] and Laplacian twin support vector machines (LapTSVMs) [5], which can be regarded as two applications of Laplacian eigenmaps to semi-supervised learning, are representative algorithms for single-view semi-supervised classification. Typical multi-view classification methods include co-training [6], SVM-2K [7], co-Laplacian SVMs [8], manifold co-regularization [9], multi-view Laplacian SVMs [10], sparse multi-view SVMs [11] and multi-view Laplacian TSVMs [12]. Although the regularization method presented in this paper can be applied to multi-view semi-supervised classification after some appropriate manipulation, in this paper we only focus on the single-view classification problem.

The principle of regularization has its root in mathematics for solving ill-posed problems [13], and is widely used in statistics and machine learning [3], [14], [15]. Many well-known algorithms, e.g., SVMs [16], TSVMs [17], ridge regression and the lasso [18], can be interpreted as instantiations of the idea of regularization. A close parallel to regularization is the capacity control of function classes [19]. Both regularization and capacity control can alleviate the ill-posed and overfitting problems of learning algorithms. Moreover, from the point of view of Bayesian learning, the solution to a regularization problem corresponds to the maximum a posteriori (MAP) estimate for a parameter of interest. The regularization term plays the role of the prior distribution on the parameter in the Bayesian model [20].

In many real applications, data lying in a high-dimensional space can be assumed to be intrinsically of low dimensionality. That is, the data can be well characterized by far fewer parameters or degrees of freedom than the actual ambient representation. This setting is usually referred to as manifold learning, and the distribution of the data is regarded to live on or near a low-dimensional manifold.

Manuscript received 6 Dec. 2014; revised 7 May and 18 Jul. 2015; accepted 23 Jul. 2015. This work was supported by the National Natural Science Foundation of China under Project 61370175 and the Science and Technology Commission of Shanghai Municipality under Grant 14DZ2260800. Shiliang Sun and Xijiong Xie are with the Shanghai Key Laboratory of Multidimensional Information Processing, Department of Computer Science and Technology, East China Normal University, 500 Dongchuan Road, Shanghai 200241, P.R. China. E-mail: [email protected].
The validity of manifold learning has already been testified by some recent developments, e.g., the work in [21], [22] and [23]. Laplacian regularization is an important manifold learning method, which exploits the geometry of the probability distribution by assuming that its support has the geometric structure of a Riemannian manifold. Graph-based learning methods often use this regularization to obtain an approximation to the underlying manifold. In particular, Laplacian regularization has been widely used in semi-supervised learning to effectively combine labeled and unlabeled examples. For example, LapSVMs and LapTSVMs are two representative semi-supervised classification methods with Laplacian regularization, and Laplacian regularized least squares is regarded as a representative semi-supervised regression method with Laplacian regularization [3]. The optimization of such algorithms is built on a representer theorem that provides a basis for many algorithms for unsupervised, semi-supervised and fully supervised learning.

In this paper, we propose a new regularization method called tangent space intrinsic manifold regularization to approximate a manifold more subtly. Through this regularization we can learn a linear function f(x) on the manifold. The new regularization has the potential to be applied to a variety of statistical and machine learning problems. As shown in later descriptions, Laplacian regularization is in fact only a part of the tangent space intrinsic manifold regularization.

Part of this research has been reported in a short conference paper [24]. Compared to the previous work, we have derived the formulation of the new regularization in detail. While the previous work mainly considered data representation with the new regularization, this paper considers a different task, semi-supervised classification, and exhibits the usefulness of the new regularization method for this task. Two new learning machines, TiSVMs and TiTSVMs, are thus proposed. TiSVMs integrate the common hinge loss for classification, norm regularization, and the tangent space intrinsic manifold regularization term, and lead to a quadratic programming problem, while TiTSVMs lead to a pair of quadratic programming problems. Semi-supervised classification experiments with TiSVMs and TiTSVMs on multiple datasets give encouraging results.

The remainder of this paper is organized as follows. In Section 2, we introduce the methodology of the tangent space intrinsic manifold regularization. In Section 3 we generalize it based upon the popular weighted-graph representation of manifolds from data, and reformulate the regularization term as a matrix quadratic form, through which we can gain insights about the regularization and draw connections with related regularization methods. Moreover, this reformulation facilitates resolving semi-supervised classification tasks, and the corresponding algorithms are given in Section 4 and Section 5, respectively. Experimental results which show the effectiveness of the regularization method in semi-supervised classification are reported in Section 6, followed by Section 7, which discusses further refinements of the proposed methods and other possible applications. Concluding remarks are given in Section 8.

2 METHODOLOGY OF THE TANGENT SPACE INTRINSIC MANIFOLD REGULARIZATION

We are interested in estimating a function f(x) defined on M ⊂ R^d, where M is a smooth manifold in R^d. We assume that f(x) can be well approximated by a linear function with respect to the manifold M. Let m be the dimensionality of M. At each point z ∈ M, f(x) can be represented locally around z as a linear function $f(x) \approx b_z + w_z^\top u_z(x) + o(\|x - z\|^2)$, where $u_z(x) = T_z(x - z)$ is an m-dimensional vector representing x in the tangent space around z, and T_z is an m × d matrix that projects x around z to a representation in the tangent space of M at z. Note that in this paper the basis for T_z is computed using local principal component analysis (PCA) for its simplicity and wide applicability. In particular, the point z and its neighbors are sent to the regular PCA procedure [25], and the top m eigenvectors of the d × d covariance matrix are returned as the rows of matrix T_z. The weight vector w_z ∈ R^m is an m-dimensional vector, and it is also the manifold derivative of f(x) at z with respect to the u_z(·) representation on the manifold, which we write as $\nabla_T f(x)|_{x=z} = w_z$.
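To make the construction above concrete, the following is a minimal sketch (not the authors' code) of estimating the tangent bases T_z by local PCA as just described: each anchor point together with its nearest neighbors is centered, and the top m principal directions are kept as the rows of T_z. The function name, the neighborhood size, and the use of an SVD instead of an explicit covariance eigendecomposition are our own illustrative choices.

```python
import numpy as np

def local_tangent_bases(X, m, n_neighbors=10):
    """X: (k, d) anchor points; returns a (k, m, d) array whose i-th slice is T_{z_i}."""
    k, d = X.shape
    # pairwise squared Euclidean distances between anchor points
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    bases = np.zeros((k, m, d))
    for i in range(k):
        idx = np.argsort(sq[i])[:n_neighbors + 1]     # z_i itself plus its n nearest neighbors
        nbrs = X[idx]
        centered = nbrs - nbrs.mean(axis=0)           # local PCA: center the neighborhood
        # top-m principal directions = top-m right singular vectors of the centered block
        _, _, Vt = np.linalg.svd(centered, full_matrices=False)
        bases[i] = Vt[:m]                             # rows of T_{z_i}, shape (m, d); requires m <= min(n_neighbors + 1, d)
    return bases
```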

Mathematically, a linear function with respect to the manifold M, which is not necessarily a globally linear function in R^d, is a function that has a constant manifold derivative. However, this does not mean that w_z is a constant function of z, due to the different coordinate systems when the "anchor point" z changes from one point to another. This needs to be compensated using "connections" that map a coordinate representation u_{z'} to u_z for any z' near z. For points far apart, the connections are not of interest for our purpose, since coordinate systems on a manifold usually change and representing distant points with a single basis would thereby lead to a large bias.

To see how our approach works, we assume for simplicity that T_z is an orthogonal matrix for all z: $T_z T_z^\top = I_{(m\times m)}$. This means that if x ∈ M is close to z ∈ M, then $x - z \approx T_z^\top T_z (x - z) + O(\|x - z\|^2)$. Now consider x that is close to both z and z'. We can express f(x) both in the tangent space representation at z and at z', which gives

$$b_z + w_z^\top u_z(x) \approx b_{z'} + w_{z'}^\top u_{z'}(x) + O(\|x - z'\|^2 + \|x - z\|^2).$$

That is, $b_z + w_z^\top u_z(x) \approx b_{z'} + w_{z'}^\top u_{z'}(x)$. This means that

$$b_z + w_z^\top T_z (x - z) \approx b_{z'} + w_{z'}^\top T_{z'} (x - z').$$

Setting x = z, we obtain $b_z \approx b_{z'} + w_{z'}^\top T_{z'} (z - z')$, and

$$b_{z'} + w_{z'}^\top T_{z'} (z - z') + w_z^\top T_z (x - z) \approx b_{z'} + w_{z'}^\top T_{z'} (x - z').$$

This implies that

$$w_z^\top T_z (x - z) \approx w_{z'}^\top T_{z'} (x - z) \approx w_{z'}^\top T_{z'} T_z^\top T_z (x - z) + O(\|x - z'\|^2 + \|x - z\|^2). \quad (1)$$

Since (1) holds for arbitrary x ∈ M close to z ∈ M, it follows that $w_z^\top \approx w_{z'}^\top T_{z'} T_z^\top + O(\|z - z'\|)$, or $w_z \approx T_z T_{z'}^\top w_{z'} + O(\|z - z'\|)$.

This means that if we expand at points z_1, ..., z_k ∈ Z, and denote the neighbors of z_j by N(z_j), then the correct regularizer will be

$$R(\{b_z, w_z\}_{z\in Z}) = \sum_{i=1}^{k} \sum_{j \in N(z_i)} \Big[ \big(b_{z_i} - b_{z_j} - w_{z_j}^\top T_{z_j}(z_i - z_j)\big)^2 + \gamma \big\| w_{z_i} - T_{z_i} T_{z_j}^\top w_{z_j} \big\|_2^2 \Big]. \quad (2)$$

The function f(x) is approximated as

$$f(x) = b_{z(x)} + w_{z(x)}^\top T_{z(x)} \big(x - z(x)\big), \quad (3)$$

where $z(x) = \arg\min_{z\in Z} \|x - z\|_2$. This is a very natural formulation for out-of-sample extensions.

2.1 Effect of Local PCA

Since we consider the setting of manifold learning and the dimensionality of the tangent space is usually smaller than that of the ambient space, local PCA is used to determine the local tangent space. If we do not use local PCA and instead work in the original space, the above approach is equivalent to considering a piecewise linear function, namely $T_z = T_z^\top = I$. The corresponding expression of the regularization becomes

$$R(\{b_z, w_z\}_{z\in Z}) = \sum_{i=1}^{k} \sum_{j \in N(z_i)} \Big[ \big(b_{z_i} - b_{z_j} - w_{z_j}^\top (z_i - z_j)\big)^2 + \gamma \| w_{z_i} - w_{z_j} \|_2^2 \Big]. \quad (4)$$

However, this leads to higher dimensionality and more parameters.

The rationale of the proposed regularization principle can be interpreted from the standpoint of effective function learning. A globally linear function in the original ambient Euclidean space is often too simple, whereas a locally or piecewise linear function in this space would be too complex, since it can have too many parameters to be estimated. The linear function with respect to the manifold, as favored by the current regularization method, is a good trade-off between these two situations. Basically, it can be seen as a locally linear function in the ambient space, but since the dimensionality of the function weights is m rather than d, the number of parameters to be learned is greatly reduced compared to the locally linear function setting in the original Euclidean space. This reflects a good balance between flexibility and manageability for effective function learning from data.

3 GENERALIZATION AND REFORMULATION OF THE REGULARIZATION TERM

Relating data with a discrete weighted graph is a popular choice, and there is indeed a large family of graph-based statistical and machine learning methods. It also makes sense for us to generalize the regularizer R({b_z, w_z}_{z∈Z}) in (2) using a symmetric weight matrix W constructed from the above data collection Z.

Entries in W characterize the closeness of different points, where the points are often called nodes in the terminology of graphs. Usually two steps are involved in constructing a weighted graph. The first step builds an adjacency graph by putting an edge between two "close" points. One can choose a parameter ε ∈ R or a parameter n ∈ N to determine close points, which means that two nodes are connected if their Euclidean distance is within ε, or if either node is among the n nearest neighbors of the other as indicated by the Euclidean distance. The second step calculates weights on the edges of the graph with a certain similarity measure. For example, the heat kernel computes the weight W_{ij} for two connected nodes i and j by

$$W_{ij} = \exp\Big(-\frac{\|x_i - x_j\|^2}{t}\Big), \quad (5)$$

where parameter t > 0, while for nodes not directly connected the weights are zero [23]. One can also use the polynomial kernel to calculate weights $W_{ij} = (x_i^\top x_j)^p$, where parameter p ∈ N is the polynomial degree. The simplest approach for weight assignment is to adopt {0, 1} values, where weights are 1 for connected edges and 0 otherwise.

Therefore, the generalization of the tangent space intrinsic manifold regularizer turns out to be

$$R(\{b_z, w_z\}_{z\in Z}) = \sum_{i=1}^{k} \sum_{j=1}^{k} W_{ij} \Big[ \big(b_{z_i} - b_{z_j} - w_{z_j}^\top T_{z_j}(z_i - z_j)\big)^2 + \gamma \big\| w_{z_i} - T_{z_i} T_{z_j}^\top w_{z_j} \big\|_2^2 \Big]. \quad (6)$$

The previous regularizer is a special case of (6) with {0, 1} weights. The advantage of the new regularizer is that the discrepancy of the neighborhood relationships is treated more coherently.
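As an illustration of how the pieces defined so far fit together, here is a hedged sketch that builds a symmetric nearest-neighbor graph with heat-kernel weights (5) and evaluates the generalized regularizer (6) directly for given {b_z, w_z, T_z}. All function and variable names are ours; the paper itself does not prescribe an implementation.

```python
import numpy as np

def heat_kernel_weights(X, n_neighbors=10, t=1.0):
    """Symmetric n-nearest-neighbor graph with heat-kernel weights, Eq. (5)."""
    k = X.shape[0]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((k, k))
    for i in range(k):
        for j in np.argsort(sq[i])[1:n_neighbors + 1]:
            W[i, j] = W[j, i] = np.exp(-sq[i, j] / t)   # symmetric by construction
    return W

def regularizer(b, w, T, X, W, gamma=1.0):
    """Direct evaluation of Eq. (6). b: (k,), w: (k, m), T: (k, m, d), X: (k, d), W: (k, k)."""
    k = X.shape[0]
    value = 0.0
    for i in range(k):
        for j in range(k):
            if W[i, j] == 0.0:
                continue
            Bji = T[j] @ (X[i] - X[j])                       # T_{z_j}(z_i - z_j)
            fit = (b[i] - b[j] - w[j] @ Bji) ** 2            # first-order consistency term
            conn = np.sum((w[i] - T[i] @ T[j].T @ w[j]) ** 2)  # connection (tangent alignment) term
            value += W[i, j] * (fit + gamma * conn)
    return value
```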

Now we reformulate the regularizer (6) into a canonical matrix quadratic form. The benefits include relating our regularization method to other regularization approaches and facilitating the subsequent formulations for semi-supervised classification. In particular, we would like to rewrite the regularizer as a quadratic form in terms of a symmetric matrix S as follows,

$$R(\{b_z, w_z\}_{z\in Z}) = v^\top S\, v = v^\top \begin{pmatrix} S_1 & S_2 \\ S_2^\top & S_3 \end{pmatrix} v, \qquad v = \big(b_{z_1}, \ldots, b_{z_k}, w_{z_1}^\top, \ldots, w_{z_k}^\top\big)^\top, \quad (7)$$

where $\begin{pmatrix} S_1 & S_2 \\ S_2^\top & S_3 \end{pmatrix}$ is a block matrix representation of matrix S and the size of S_1 is k × k. In this paper, we suppose that w_{z_i} (i = 1, ..., k) is an m-dimensional vector. Therefore, the size of S is (k + mk) × (k + mk). In order to fix S, we decompose (6) into four additive terms as follows and then examine their separate contributions to the whole S:

$$\begin{aligned} R(\{b_z, w_z\}_{z\in Z}) = & \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} (b_{z_i} - b_{z_j})^2 && \text{(term one)} \\ & + \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big(w_{z_j}^\top T_{z_j}(z_i - z_j)\big)^2 && \text{(term two)} \\ & + \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big(-2 (b_{z_i} - b_{z_j})\, w_{z_j}^\top T_{z_j}(z_i - z_j)\big) && \text{(term three)} \\ & + \gamma \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big\| w_{z_i} - T_{z_i} T_{z_j}^\top w_{z_j} \big\|_2^2. && \text{(term four)} \end{aligned}$$

3.1 Term One

$$\sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} (b_{z_i} - b_{z_j})^2 = 2\, (b_{z_1}, \ldots, b_{z_k})\, (D - W)\, (b_{z_1}, \ldots, b_{z_k})^\top,$$

where D is a diagonal weight matrix with $D_{ii} = \sum_{j=1}^{k} W_{ij}$. This means that term one contributes to S_1 in (7). Actually, we have S_1 = 2(D − W).

3.2 Term Two

Define the vector $B_{ji} = T_{z_j}(z_i - z_j)$. Then,

$$\sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big(w_{z_j}^\top T_{z_j}(z_i - z_j)\big)^2 = \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big(w_{z_j}^\top B_{ji}\big)^2 = \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij}\, w_{z_j}^\top B_{ji} B_{ji}^\top w_{z_j} = \sum_{j=1}^{k} w_{z_j}^\top \Big(\sum_{i=1}^{k} W_{ij} B_{ji} B_{ji}^\top\Big) w_{z_j} = \sum_{j=1}^{k} w_{z_j}^\top H_j w_{z_j},$$

where we have defined matrices $\{H_j\}_{j=1}^{k}$ with $H_j = \sum_{i=1}^{k} W_{ij} B_{ji} B_{ji}^\top$.

Now suppose we define a block diagonal matrix $S_3^H$ sized mk × mk with block size m × m. Set the (i, i)-th block (i = 1, ..., k) of $S_3^H$ to be H_i. Then the resultant $S_3^H$ is the contribution of term two to S_3 in (7).

3.3 Term Three

$$\begin{aligned} & \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big(-2 (b_{z_i} - b_{z_j})\, w_{z_j}^\top T_{z_j}(z_i - z_j)\big) = \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big(-2 (b_{z_i} - b_{z_j})\, w_{z_j}^\top B_{ji}\big) \\ & = 2 \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big(-b_{z_i} w_{z_j}^\top B_{ji}\big) + 2 \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big(b_{z_j} w_{z_j}^\top B_{ji}\big) \\ & = \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big(-b_{z_i} B_{ji}^\top w_{z_j}\big) + \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big(-w_{z_j}^\top B_{ji} b_{z_i}\big) + \sum_{j=1}^{k} b_{z_j} \Big(\sum_{i=1}^{k} W_{ij} B_{ji}^\top\Big) w_{z_j} + \sum_{j=1}^{k} w_{z_j}^\top \Big(\sum_{i=1}^{k} W_{ij} B_{ji}\Big) b_{z_j} \\ & = \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big(-b_{z_i} B_{ji}^\top w_{z_j}\big) + \sum_{i=1}^{k} b_{z_i} F_i^\top w_{z_i} + \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big(-w_{z_j}^\top B_{ji} b_{z_i}\big) + \sum_{i=1}^{k} w_{z_i}^\top F_i b_{z_i}, \end{aligned}$$

where we have defined vectors $\{F_j\}_{j=1}^{k}$ with $F_j = \sum_{i=1}^{k} W_{ij} B_{ji}$. From this equation, we can give the formulation of S_2, and S_2^⊤ in (7) is simply its transpose.

Suppose we define two block matrices $S_2^1$ and $S_2^2$ sized k × mk each, where the block size is 1 × m, and $S_2^2$ is a block diagonal matrix. Set the (i, j)-th block (i, j = 1, ..., k) of $S_2^1$ to be $-W_{ij} B_{ji}^\top$, and the (i, i)-th block (i = 1, ..., k) of $S_2^2$ to be $F_i^\top$. Then, it is clear that $S_2 = S_2^1 + S_2^2$.

3.4 Term Four

Denote the matrix $T_{z_i} T_{z_j}^\top$ by $A_{ij}$. Then, with γ omitted temporarily,

$$\begin{aligned} \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big\| w_{z_i} - T_{z_i} T_{z_j}^\top w_{z_j} \big\|_2^2 & = \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} \big\| w_{z_i} - A_{ij} w_{z_j} \big\|_2^2 \\ & = \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij}\, w_{z_i}^\top w_{z_i} + \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij}\, w_{z_j}^\top A_{ij}^\top A_{ij} w_{z_j} - 2 \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij}\, w_{z_i}^\top A_{ij} w_{z_j} \\ & = \sum_{i=1}^{k} D_{ii}\, w_{z_i}^\top I_{(m\times m)} w_{z_i} + \sum_{j=1}^{k} w_{z_j}^\top \Big(\sum_{i=1}^{k} W_{ij} A_{ij}^\top A_{ij}\Big) w_{z_j} - 2 \sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij}\, w_{z_i}^\top A_{ij} w_{z_j} \\ & = \sum_{i=1}^{k} w_{z_i}^\top (D_{ii} I + C_i) w_{z_i} - \sum_{i=1}^{k}\sum_{j=1}^{k} w_{z_i}^\top (2 W_{ij} A_{ij}) w_{z_j}, \end{aligned}$$

where in the last line we have defined matrices $\{C_j\}_{j=1}^{k}$ with $C_j = \sum_{i=1}^{k} W_{ij} A_{ij}^\top A_{ij}$.

Now suppose we define two block matrices $S_3^1$ and $S_3^2$ sized mk × mk each, where the block size is m × m, and $S_3^1$ is a block diagonal matrix. Set the (i, i)-th block (i = 1, ..., k) of $S_3^1$ to be $D_{ii} I + C_i$, and the (i, j)-th block (i, j = 1, ..., k) of $S_3^2$ to be $2 W_{ij} A_{ij}$. Then the contribution of term four to S_3 would be $\gamma (S_3^1 - S_3^2)$. Further considering the contribution of term two to S_3, we finally have $S_3 = S_3^H + \gamma (S_3^1 - S_3^2)$.

3.5 Connections with the Laplacian Regularization

From our reformulation, we can draw a connection between our regularizer and the Laplacian regularization [23]. The Laplacian regularizer, in our current terminology, can be expressed as

$$\sum_{i=1}^{k}\sum_{j=1}^{k} W_{ij} (b_{z_i} - b_{z_j})^2 = 2\, (b_{z_1}, \ldots, b_{z_k})\, (D - W)\, (b_{z_1}, \ldots, b_{z_k})^\top = 2\, (b_{z_1}, \ldots, b_{z_k})\, L\, (b_{z_1}, \ldots, b_{z_k})^\top,$$

where L = D − W is the Laplacian matrix. Obviously, the matrix S_1 in the tangent space intrinsic manifold regularizer equals 2L. In this sense, we can say that our regularizer reflects the Laplacian regularization to a certain extent. However, our regularizer is more complicated, as it intends to favor linear functions on the manifold, while the Laplacian regularization only requests the function values of connected nodes to be as close as possible.
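The block structure of S derived in Sections 3.1-3.4 can be assembled mechanically. The following sketch follows our reading of that derivation (S_1 = 2(D − W); S_2 from term three; S_3 from terms two and four) and is meant only as an illustration, not as the authors' implementation.

```python
import numpy as np

def assemble_S(X, T, W, gamma=1.0):
    """X: (k, d) anchors, T: (k, m, d) tangent bases, W: (k, k) symmetric weights.
    Returns the ((1+m)k, (1+m)k) matrix S of Eq. (7)."""
    k, m, _ = T.shape
    D = np.diag(W.sum(axis=1))
    S1 = 2.0 * (D - W)                                   # term one
    S2 = np.zeros((k, k * m))
    S3 = np.zeros((k * m, k * m))
    F = np.zeros((k, m))                                 # F_j = sum_i W_ij B_ji
    for i in range(k):
        for j in range(k):
            if W[i, j] == 0.0:
                continue
            Bji = T[j] @ (X[i] - X[j])                   # B_ji = T_{z_j}(z_i - z_j)
            Aij = T[i] @ T[j].T                          # A_ij = T_{z_i} T_{z_j}^T
            F[j] += W[i, j] * Bji
            # term three: (i, j)-th 1 x m block of S_2^1 is -W_ij B_ji^T
            S2[i, j * m:(j + 1) * m] += -W[i, j] * Bji
            # term two: H_j accumulated on the (j, j)-th m x m block of S_3
            S3[j * m:(j + 1) * m, j * m:(j + 1) * m] += W[i, j] * np.outer(Bji, Bji)
            # term four: gamma * (S_3^1 - S_3^2)
            S3[j * m:(j + 1) * m, j * m:(j + 1) * m] += gamma * W[i, j] * Aij.T @ Aij
            S3[i * m:(i + 1) * m, j * m:(j + 1) * m] += -gamma * 2.0 * W[i, j] * Aij
        S3[i * m:(i + 1) * m, i * m:(i + 1) * m] += gamma * D[i, i] * np.eye(m)
    for i in range(k):
        S2[i, i * m:(i + 1) * m] += F[i]                 # diagonal 1 x m blocks of S_2^2 are F_i^T
    return np.block([[S1, S2], [S2.T, S3]])
```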

4 SEMI-SUPERVISED SVMS

In this section, we first introduce the model of TiSVMs. Then we transform the optimization problem into a quadratic programming problem. Finally, we give the computational complexity and an algorithmic description of TiSVMs.

4.1 TiSVMs

The SVM is a powerful tool for classification and regression. It is based on structural risk minimization, which minimizes an upper bound of the generalization error [28]. The standard SVMs solve a quadratic programming problem and output the hyperplane that maximizes the margin between two parallel hyperplanes.

For the semi-supervised classification problem, suppose we have ℓ labeled examples and u unlabeled examples $\{x_i\}_{i=1}^{\ell+u}$, and without loss of generality the first ℓ examples correspond to the labeled ones with labels $y_i \in \{+1, -1\}$ (i = 1, ..., ℓ). Only binary classification is considered in this paper.

We propose a new method named tangent space intrinsic manifold regularized SVMs (TiSVMs) for semi-supervised learning, which attempts to solve the following problem

$$\min_{\{b_i, w_i\}_{i=1}^{\ell+u}} \ \frac{1}{\ell} \sum_{i=1}^{\ell} \big(1 - y_i f(x_i)\big)_+ + \gamma_1 \sum_{i=1}^{\ell+u} \|w_i\|_2^2 + \gamma_2 R(\{b_i, w_i\}_{i=1}^{\ell+u}), \quad (8)$$

where $(1 - y_i f(x_i))_+$ is the hinge loss [16], [26] defined as

$$(1 - y f(x))_+ = \begin{cases} 1 - y f(x), & \text{for } 1 - y f(x) \ge 0 \\ 0, & \text{otherwise}, \end{cases}$$

$\gamma_1, \gamma_2 \ge 0$ are regularization coefficients, and $R(\{b_i, w_i\}_{i=1}^{\ell+u})$ is the regularizer analogous to that given in (6). Note that, for labeled examples, f(x_i) is equal to the corresponding b_i.

For classification purposes, the classifier outputs for the unlabeled training examples are

$$y_i = \mathrm{sgn}(b_i), \quad i = \ell+1, \ldots, \ell+u, \quad (9)$$

where sgn(·) is the sign function, which outputs 1 if its input is nonnegative and −1 otherwise. For a test example x which is not in $\{x_i\}_{i=1}^{\ell+u}$, the SVM classifier predicts its label by

$$\mathrm{sgn}\big( b_{z(x)} + w_{z(x)}^\top T_{z(x)} (x - z(x)) \big), \quad (10)$$

where $z(x) = \arg\min_{z \in \{x_i\}_{i=1}^{\ell+u}} \|x - z\|_2$.

4.2 Optimization via Quadratic Programming

We now show that the optimization of (8) can be implemented by a standard quadratic programming solver. To begin with, using slack variables to replace the hinge loss, we rewrite (8) as

$$\begin{aligned} \min_{\{b_i, w_i\}_{i=1}^{\ell+u},\, \{\xi_i\}_{i=1}^{\ell}} \ & \frac{1}{\ell} \sum_{i=1}^{\ell} \xi_i + \gamma_1 \sum_{i=1}^{\ell+u} \|w_i\|_2^2 + \gamma_2 R(\{b_i, w_i\}_{i=1}^{\ell+u}) \\ \text{s.t.} \ & y_i b_i \ge 1 - \xi_i, \quad i = 1, \ldots, \ell, \\ & \xi_i \ge 0, \quad i = 1, \ldots, \ell. \end{aligned} \quad (11)$$

Define $\xi = (\xi_1, \ldots, \xi_\ell)^\top$ and $a = (\xi^\top, b^\top, w^\top)^\top$. Suppose the first ℓ entries of b correspond to the ℓ labeled examples $x_1, \ldots, x_\ell$, respectively. We can reformulate (11) as a standard quadratic program [27]

$$\min_{a} \ \frac{1}{2} a^\top H_1 a + h_1^\top a \quad \text{s.t.} \ H_2 a \le h_2, \quad (12)$$

where H_1 is a sparse matrix whose nonzero entries are only included in the bottom-right sub-block $H_1^{br}$ sized (ℓ+u)(m+1) × (ℓ+u)(m+1), $h_1 = (1/\ell, \ldots, 1/\ell, 0, \ldots, 0)^\top$ with the first ℓ entries being nonzero, H_2 is a sparse matrix with 2ℓ rows, and $h_2 = (-1, \ldots, -1, 0, \ldots, 0)^\top$ with the front ℓ entries being −1 and the back ℓ entries being 0. The matrix $H_1^{br}$ is given by $2(\gamma_1 H_d + \gamma_2 S)$, where H_d is a diagonal matrix whose first ℓ+u diagonal elements are zero and whose last m(ℓ+u) diagonal elements are 1, and S is taken from (7) where we set k = ℓ+u and z_i = x_i (i = 1, ..., k). The nonzero entries of H_2 can be specified as follows. For row i (i ≤ ℓ), the i-th element is −1 and the (ℓ+i)-th element is −y_i. For row ℓ+i (i ≤ ℓ), the i-th element is −1.

TiSVMs solve a quadratic program and have a computational complexity of O([(ℓ+u)(m+1) + ℓ]^3). For clarity, we explicitly state our tangent space intrinsic manifold regularized support vector machines algorithm in Algorithm 1.

Algorithm 1 Tangent Space Intrinsic Manifold Regularized Support Vector Machines (TiSVMs)
1: Input: ℓ labeled examples, u unlabeled examples.
2: Obtain H_1, h_1, H_2, h_2.
3: Solve the quadratic program (12), using cross-validation to choose parameters.
4: Output: Predict the labels of the unlabeled training examples according to (9); predict the label of a new example according to (10).
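For illustration, the sketch below assembles the quadratic program (12) from a given matrix S and solves it with a generic QP solver; the paper only states that a standard solver is used, so the choice of cvxopt and the small ridge added for numerical stability are our own assumptions.

```python
import numpy as np
from cvxopt import matrix, solvers

def solve_tisvm(S, y, ell, u, m, gamma1, gamma2):
    """S: ((1+m)(ell+u)) square matrix from (7); y: (ell,) labels in {+1, -1}.
    Returns (b, w) with b of shape (ell+u,) and w of shape (ell+u, m)."""
    n_bw = (ell + u) * (m + 1)                # dimension of (b, w)
    n = ell + n_bw                            # dimension of a = (xi, b, w)
    Hd = np.diag(np.r_[np.zeros(ell + u), np.ones(m * (ell + u))])
    H1 = np.zeros((n, n))
    H1[ell:, ell:] = 2.0 * (gamma1 * Hd + gamma2 * S)   # bottom-right block H1^br
    H1 += 1e-8 * np.eye(n)                    # tiny ridge for solver stability (our addition)
    h1 = np.r_[np.full(ell, 1.0 / ell), np.zeros(n_bw)]
    H2 = np.zeros((2 * ell, n))
    h2 = np.r_[-np.ones(ell), np.zeros(ell)]
    for i in range(ell):
        H2[i, i] = -1.0                       # -xi_i ...
        H2[i, ell + i] = -y[i]                # ... - y_i b_i <= -1
        H2[ell + i, i] = -1.0                 # -xi_i <= 0
    sol = solvers.qp(matrix(H1), matrix(h1), matrix(H2), matrix(h2))
    a = np.array(sol['x']).ravel()
    b = a[ell:ell + ell + u]
    w = a[ell + ell + u:].reshape(ell + u, m)
    return b, w
```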

5 SEMI-SUPERVISED TSVMS

In this section, we first introduce the model of TiTSVMs. Then we transform the optimization problems into quadratic programming problems and give the computational complexity and an algorithmic description of TiTSVMs. Finally, we introduce related work and make comparisons.

5.1 TiTSVMs

The twin support vector machine (TSVM) [17] is a nonparallel hyperplane classifier which aims to generate two nonparallel hyperplanes such that each hyperplane is closer to one class and has a certain distance to the other class. The two classification hyperplanes are obtained by solving a pair of quadratic programming problems. The label of a new example is determined by the minimum of the perpendicular distances of the example to the two classification hyperplanes.

For the semi-supervised classification problem of TSVMs, suppose we have ℓ labeled examples, which contain ℓ_1 positive examples and ℓ_2 negative examples, and u unlabeled examples. We propose a new method named TiTSVMs for semi-supervised learning, which attempts to solve the two quadratic programming problems in turn

$$\begin{aligned} \min_{\{b_i^+, w_i^+\}_{i=1}^{\ell+u},\, \{\xi_i\}_{i=1}^{\ell_2}} \ & \frac{1}{\ell_2} \sum_{i=1}^{\ell_2} \xi_i + \gamma_1 \sum_{i=1}^{\ell_1} \|b_i^+\|_2^2 + \gamma_2 R(\{b_i^+, w_i^+\}_{i=1}^{\ell+u}) \\ \text{s.t.} \ & b_{\ell_1+i}^+ \ge 1 - \xi_i, \quad i = 1, \ldots, \ell_2, \\ & \xi_i \ge 0, \quad i = 1, \ldots, \ell_2, \end{aligned} \quad (13)$$

and

$$\begin{aligned} \min_{\{b_i^-, w_i^-\}_{i=1}^{\ell+u},\, \{\eta_i\}_{i=1}^{\ell_1}} \ & \frac{1}{\ell_1} \sum_{i=1}^{\ell_1} \eta_i + \gamma_1 \sum_{i=1}^{\ell_2} \|b_i^-\|_2^2 + \gamma_2 R(\{b_i^-, w_i^-\}_{i=1}^{\ell+u}) \\ \text{s.t.} \ & b_{\ell_2+i}^- \ge 1 - \eta_i, \quad i = 1, \ldots, \ell_1, \\ & \eta_i \ge 0, \quad i = 1, \ldots, \ell_1, \end{aligned} \quad (14)$$

together with additional constraints in (14) that couple b^- with the b^+ obtained from (13); these are encoded in the last ℓ_2 + ℓ_1 rows of H_2^- and h_2^- specified below. Here $\gamma_1, \gamma_2 \ge 0$ are regularization coefficients, and $R(\{b_i^+, w_i^+\}_{i=1}^{\ell+u})$ and $R(\{b_i^-, w_i^-\}_{i=1}^{\ell+u})$ are the regularizers analogous to that given in (6). Note that, for labeled examples, f^+(x_i) is equal to the corresponding b_i^+ and f^-(x_i) is equal to the corresponding b_i^-. From (13), we obtain a set of classification hyperplanes such that positive examples are closer to them and negative examples are at a certain distance from them. Then, after we obtain the first set of classification hyperplanes, we add two sets of constraints such that the distance of positive examples to the first set of classification hyperplanes is smaller than their distance to the other set, and the distance of negative examples to the first set of classification hyperplanes is larger than their distance to the other.

For classification purposes, we first search for the nearest neighbor of a new example and find the tangent space representation of that nearest neighbor. The classifier outputs for the unlabeled training examples are

$$y_i = \mathrm{sgn}\big( |b_i^-| - |b_i^+| \big), \quad i = \ell+1, \ldots, \ell+u. \quad (15)$$

For a test example x which is not in $\{x_i\}_{i=1}^{\ell+u}$, the TiTSVM classifier predicts its label by

$$\mathrm{sgn}\Big( \big| b_{z(x)}^- + w_{z(x)}^{-\top} T_{z(x)} (x - z(x)) \big| - \big| b_{z(x)}^+ + w_{z(x)}^{+\top} T_{z(x)} (x - z(x)) \big| \Big), \quad (16)$$

where $z(x) = \arg\min_{z \in \{x_i\}_{i=1}^{\ell+u}} \|x - z\|_2$.

5.2 Optimization via Quadratic Programming

Define $\xi = (\xi_1, \ldots, \xi_{\ell_2})^\top$, $\eta = (\eta_1, \ldots, \eta_{\ell_1})^\top$, $a^+ = (\xi^\top, b^{+\top}, w^{+\top})^\top$, and $a^- = (\eta^\top, b^{-\top}, w^{-\top})^\top$. Suppose the first ℓ entries of b^+ and b^- correspond to the ℓ labeled examples $x_1, \ldots, x_\ell$, respectively. We can reformulate (13) as a standard quadratic program

$$\min_{a^+} \ \frac{1}{2} a^{+\top} H_1^+ a^+ + h_1^{+\top} a^+ \quad \text{s.t.} \ H_2^+ a^+ \le h_2^+, \quad (17)$$

where H_1^+ is a sparse matrix whose nonzero entries are only included in the bottom-right sub-block $H_1^{+br}$ sized (ℓ+u)(m+1) × (ℓ+u)(m+1), $h_1^{+\top} = (1/\ell_2, \ldots, 1/\ell_2, 0, \ldots, 0)$ with the first ℓ_2 entries being nonzero, H_2^+ is a sparse matrix with 2ℓ_2 rows, and $h_2^{+\top} = (-1, \ldots, -1, 0, \ldots, 0)$ with the front ℓ_2 entries being −1 and the back ℓ_2 entries being 0. The matrix $H_1^{+br}$ is given by $2\gamma_2 S^+ + M_{\ell_1}$, where S^+ is taken from (7) by setting k = ℓ+u and z_i = x_i (i = 1, ..., k), and $M_{\ell_1}$ is a diagonal matrix with the first ℓ_1 diagonal elements being 2γ_1 and the rest 0. The nonzero entries of H_2^+ can be specified as follows. For row i (i ≤ ℓ_2), both the i-th element and the (ℓ_1 + ℓ_2 + i)-th element are −1. For row ℓ_2 + i (i ≤ ℓ_2), the i-th element is −1.

Then we reformulate (14) as a standard quadratic program

$$\min_{a^-} \ \frac{1}{2} a^{-\top} H_1^- a^- + h_1^{-\top} a^- \quad \text{s.t.} \ H_2^- a^- \le h_2^-, \quad (18)$$

where H_1^- is a sparse matrix whose nonzero entries are only included in the bottom-right sub-block $H_1^{-br}$ sized (ℓ+u)(m+1) × (ℓ+u)(m+1), $h_1^{-\top} = (1/\ell_1, \ldots, 1/\ell_1, 0, \ldots, 0)$ with the first ℓ_1 entries being nonzero, H_2^- is a sparse matrix with 2ℓ_1 + ℓ_2 + ℓ_1 rows, and $h_2^{-\top} = (-1, \ldots, -1, 0, \ldots, 0, b_{\ell_1+1}^+, \ldots, b_{\ell_1+\ell_2}^+, -b_1^+, \ldots, -b_{\ell_1}^+)$ with the front ℓ_1 entries being −1 and the following ℓ_1 entries being 0. The matrix $H_1^{-br}$ is given by $2\gamma_2 S^- + M_{\ell_2}$, where S^- is taken from (7) by setting k = ℓ+u and z_i = x_i (i = 1, ..., k), and $M_{\ell_2}$ is a diagonal matrix with the first ℓ_2 diagonal elements being 2γ_1 and the rest 0. The nonzero entries of H_2^- can be specified as follows. For row i (i ≤ ℓ_1), both the i-th element and the (ℓ_1 + ℓ_2 + i)-th element are −1. For row ℓ_1 + i (i ≤ ℓ_1), the i-th element is −1. For row 2ℓ_1 + i (i ≤ ℓ_2), the (ℓ_1 + i)-th element is 1. For row 2ℓ_1 + ℓ_2 + i (i ≤ ℓ_1), the (ℓ_1 + ℓ_2 + i)-th element is −1.

TiTSVMs solve a pair of quadratic programs and have a computational complexity of O([(ℓ+u)(m+1) + ℓ_1]^3 + [(ℓ+u)(m+1) + ℓ_2]^3). For clarity, we explicitly state our tangent space intrinsic manifold regularized twin support vector machines algorithm in Algorithm 2.

Algorithm 2 Tangent Space Intrinsic Manifold Regularized Twin Support Vector Machines (TiTSVMs)
1: Input: ℓ labeled examples (ℓ_1 positive examples and ℓ_2 negative examples), u unlabeled examples.
2: Obtain H_1^+, H_1^-, h_1^+, h_1^-, H_2^+, H_2^-, h_2^+, h_2^-.
3: Solve the quadratic programs (17) and (18), using cross-validation to choose parameters.
4: Output: Predict the labels of the unlabeled training examples according to (15); predict the label of a new example according to (16).

5.3 Comparisons with Related Work

For the purpose of comparison, below we give the objective functions for SVMs [16], [28], TSVMs [17] and LapSVMs [3], [10], respectively.

In its most natural form, the optimization of soft-margin SVMs for supervised classification is to find a function f_s from a reproducing kernel Hilbert space (RKHS) H. Optimization in the original space is subsumed in the RKHS case with a linear kernel. Given a set of ℓ labeled examples $\{(x_i, y_i)\}$ (i = 1, ..., ℓ) with $y_i \in \{+1, -1\}$, the optimization problem is

$$\min_{f_s \in H} \ \frac{1}{\ell} \sum_{i=1}^{\ell} \big(1 - y_i f_s(x_i)\big)_+ + \gamma_s \|f_s\|_2^2,$$

where $\gamma_s \ge 0$ is a norm regularization parameter.

Suppose examples belonging to classes 1 and −1 are represented by matrices A_+ and B_-, and the sizes of A_+ and B_- are (ℓ_1 × d) and (ℓ_2 × d), respectively. We define two matrices A, B and four vectors v_1, v_2, e_1, e_2, where e_1 and e_2 are vectors of ones of appropriate dimensions and

$$A = (A_+, e_1), \quad B = (B_-, e_2), \quad v_1 = \begin{pmatrix} w_1 \\ b_1 \end{pmatrix}, \quad v_2 = \begin{pmatrix} w_2 \\ b_2 \end{pmatrix}. \quad (19)$$

TSVMs obtain two nonparallel hyperplanes

$$w_1^\top x + b_1 = 0 \quad \text{and} \quad w_2^\top x + b_2 = 0 \quad (20)$$

around which the examples of the corresponding class get clustered. The classifier is given by solving the following quadratic programs separately

$$\text{(TSVM1)} \quad \min_{v_1, q_1} \ \frac{1}{2} (A v_1)^\top (A v_1) + c_1 e_2^\top q_1, \quad \text{s.t.} \ -B v_1 + q_1 \succeq e_2, \ q_1 \succeq 0, \quad (21)$$

$$\text{(TSVM2)} \quad \min_{v_2, q_2} \ \frac{1}{2} (B v_2)^\top (B v_2) + c_2 e_1^\top q_2, \quad \text{s.t.} \ A v_2 + q_2 \succeq e_1, \ q_2 \succeq 0, \quad (22)$$

where c_1, c_2 are nonnegative parameters and q_1, q_2 are slack vectors of appropriate dimensions. The label of a new example x is determined by the minimum of $|x^\top w_r + b_r|$ (r = 1, 2), which are the perpendicular distances of x to the two hyperplanes given in (20).

The intention of LapSVMs is effective semi-supervised learning under local smoothness regularization. The role of unlabeled data is to restrict the capacity of the considered function set through the Laplacian matrix L. Namely, desirable functions must be smooth across all the training examples. Given ℓ labeled and u unlabeled examples, LapSVMs solve the following problem in an RKHS

$$\min_{f_{ls} \in H} \ \frac{1}{\ell} \sum_{i=1}^{\ell} \big(1 - y_i f_{ls}(x_i)\big)_+ + \gamma_1 \|f_{ls}\|_2^2 + \gamma_2 f_{ls}^\top L f_{ls},$$

where γ_1 and γ_2 are nonnegative regularization parameters as given before, and the vector $f_{ls} = (f_{ls}(x_1), \ldots, f_{ls}(x_{\ell+u}))^\top$.
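Before moving on to the experiments, a small sketch of the TiTSVM decision rules (15) and (16) from Section 5.1 may be helpful; the helper names and array layout are ours.

```python
import numpy as np

def titsvm_predict_unlabeled(b_plus, b_minus, ell, u):
    """Rule (15) for the u unlabeled training examples."""
    diff = np.abs(b_minus[ell:ell + u]) - np.abs(b_plus[ell:ell + u])
    return np.where(diff >= 0, 1, -1)

def titsvm_predict_new(x, anchors, b_plus, w_plus, b_minus, w_minus, T):
    """Rule (16) for a test point x; anchors: (k, d) training points, T: (k, m, d)."""
    i = int(np.argmin(((anchors - x) ** 2).sum(axis=1)))   # nearest anchor z(x)
    u_coord = T[i] @ (x - anchors[i])                      # tangent-space coordinates of x at z(x)
    score = abs(b_minus[i] + w_minus[i] @ u_coord) - abs(b_plus[i] + w_plus[i] @ u_coord)
    return 1 if score >= 0 else -1
```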

Fig. 1. Examples of digits 3 and 8 in the handwritten digit data set.

Fig. 2. Examples of face and non-face images in the face detection data set.

SVMs, TSVMs and LapSVMs are important learning machines making use of the "kernel trick", where a linear classification function in a potentially high-dimensional RKHS is learned whose relationship to the original inputs is usually nonlinear when nonlinear kernel maps are adopted to generate the space. There are two main differences between them and our TiSVMs and TiTSVMs: i) they seek linear functions in possibly high-dimensional kernel spaces [19], while TiSVMs and TiTSVMs pursue linear functions in low-dimensional tangent spaces of manifolds; ii) they learn only one single classification function, while TiSVMs and TiTSVMs learn many (b_i, w_i) pairs which are then dynamically selected to determine the label of a test example depending on the distances between this example and the training examples. However, there is a common property among all the considered learning machines: they are all able to learn nonlinear classifiers with respect to the original feature representation.

6 EXPERIMENTS

We evaluated the tangent space intrinsic manifold regularization for semi-supervised classification on multiple real-world datasets. There is a parameter γ, which is intrinsic to the regularization matrix S and balances the two components in (6). Because there is no compelling reason to overweight one component over the other, we treat them equally in S, namely γ = 1. Other parameters of the classifiers are selected through cross-validation. For example, the regularization parameters for TiSVMs and TiTSVMs, namely γ_1 and γ_2, are selected from the set {10^-6, 10^-4, 10^-2, 1, 10, 100} through two-fold cross-validation of the labeled training examples on the training set.

6.1 Handwritten Digit Classification

This dataset comes from the UCI Machine Learning Repository. The data we use here include 2400 examples of digits 3 and 8 chosen from the MNIST digit images (footnote 1), and half of the data are digit 3. Image sizes are 28 × 28. Ten images are shown in Figure 1. The training set includes 80% of the data, and the remaining 20% serve as the test set. For semi-supervised classification, 1% of the whole data set are randomly selected from the training set to serve as labeled training data.

For the construction of adjacency graphs, 10 nearest neighbors are used. For edge weight assignment, we choose the polynomial kernel of degree 3, following [30] and [3] on similar classification tasks. The local tangent spaces are fixed to be 2-dimensional and 3-dimensional, since for handwritten digits people usually obtain good embedding results with these dimensionalities.

With the identified regularization parameters, we retrain TiSVMs and TiTSVMs and evaluate performance on the test data and on the unlabeled training data, respectively. The entire process is repeated over 10 random divisions of the training and test sets, and the reported performance is the averaged accuracy and the corresponding standard deviation. Besides SVMs, LapSVMs and LapTSVMs, parallel field regularization (PFR) [29] is also adopted for comparison with our method under the same setting, e.g., the same kernel is used and the regularization parameters are selected with the same two-fold cross-validation method from the same range. PFR, which is briefly mentioned in Section 7.4, is a recent semi-supervised regression method which is used here for classification. For SVMs the unlabeled training data are neglected because they are only for semi-supervised learning.

6.2 Face Detection

Face detection is a binary classification problem which intends to identify whether a picture is a human face or not. In this experiment, 2000 face and non-face images from the MIT CBCL repository (footnote 2) [31], [32] are used, where half of them are faces and each image is a 19 × 19 gray picture. Figure 2 shows a number of examples. The same experimental setting as in the previous handwritten digit classification is adopted, such as the percentage of training data and the percentage of labeled data.

6.3 Speech Recognition

The Isolet dataset (footnote 3) comes from the UCI Machine Learning Repository. It is collected from 150 subjects speaking each letter of the alphabet twice. Hence, we have 52 training examples from each speaker. Due to the lack of three examples, there are 7797 examples in total. The speakers are grouped into five sets of 30 speakers each; these groups are referred to as isolet1-isolet5. Each of these datasets has 26 classes. The attribute information includes spectral coefficients, contour features, sonorant features, pre-sonorant features and post-sonorant features. In the Isolet dataset, we choose two classes (a, b) for classification, so there are in total 480 examples. The training set includes 80% of the data, and the remaining 20% serve as the test set. For semi-supervised classification, 1/48 of the whole data set are randomly selected from the training set to serve as labeled training examples. The same experimental setting as in the previous experiments is adopted.

1. http://yann.lecun.com/exdb/mnist/
2. http://cbcl.mit.edu/software-datasets/FaceData2.html
3. http://archive.ics.uci.edu/ml/datasets/
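The parameter selection protocol described at the beginning of this section (a grid over {10^-6, 10^-4, 10^-2, 1, 10, 100} for γ_1 and γ_2 with two-fold cross-validation on the labeled training examples) can be sketched as follows; train_and_score is a hypothetical placeholder for training a TiSVM or TiTSVM and returning validation accuracy.

```python
import itertools
import numpy as np

GRID = [1e-6, 1e-4, 1e-2, 1.0, 10.0, 100.0]

def select_params(X_lab, y_lab, X_unlab, train_and_score, seed=0):
    """Two-fold cross-validation over the labeled training examples."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y_lab))
    folds = [idx[::2], idx[1::2]]                 # two folds of the labeled data
    best, best_acc = None, -np.inf
    for g1, g2 in itertools.product(GRID, GRID):
        accs = []
        for f in range(2):
            tr, va = folds[1 - f], folds[f]
            accs.append(train_and_score(X_lab[tr], y_lab[tr], X_unlab,
                                        X_lab[va], y_lab[va], g1, g2))
        acc = float(np.mean(accs))
        if acc > best_acc:
            best, best_acc = (g1, g2), acc
    return best
```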

TABLE 1 Classification Accuracies (%) (with Standard Deviations) and Time (with Standard Deviations) of Different Methods on the Handwritten Digit Classification Data.

        SVM                      LapSVM                   PFR(m=2)                 TiSVM(m=2)
U       88.04 (1.78)             90.56 (1.74)             88.51 (6.64)             92.09 (1.58)
T       87.98 (1.96)             90.40 (1.37)             88.56 (6.25)             92.40 (1.79)
T1      3.62×10^-1 (4.86×10^-2)  3.03×10^2 (9.75)         6.05×10^3 (7.07×10)      6.70×10^3 (1.6×10^2)
T2      6.87×10^-2 (2.11×10^-2)  5.90×10^-3 (1.40×10^-3)  3.00×10^-4 (4.80×10^-4)  3.00×10^-4 (4.80×10^-4)
T3      1.55×10^-2 (5.00×10^-4)  1.55×10^-1 (2.39×10^-2)  1.57×10^-1 (2.24×10^-2)  1.55×10^-1 (1.28×10^-2)

        TiTSVM(m=2)              PFR(m=3)                 TiSVM(m=3)               TiTSVM(m=3)              LapTSVM
U       91.21 (1.99)             90.48 (2.63)             91.49 (1.88)             90.43 (2.52)             92.80 (1.71)
T       91.48 (2.34)             90.73 (2.87)             91.90 (1.85)             90.71 (2.29)             90.31 (2.37)
T1      4.52×10^4 (8.26×10^2)    7.29×10^3 (2.13×10^2)    8.95×10^3 (2.08×10^2)    1.24×10^5 (9.31×10^3)    2.88×10^3 (1.96×10^2)
T2      2.00×10^-3 (0.00)        4.00×10^-4 (5.16×10^-4)  4.00×10^-4 (5.16×10^-4)  4.50×10^-3 (4.30×10^-3)  1.03×10^-1 (8.45×10^-3)
T3      2.22×10^-1 (8.74×10^-2)  1.89×10^-1 (7.00×10^-2)  1.85×10^-1 (4.61×10^-2)  1.96 (1.96)              2.59 (3.16×10^-1)

TABLE 2 Classification Accuracies (%) (with Standard Deviations) and Time (with Standard Deviations) of Different Methods on the Face Detection Data.

        SVM                      LapSVM                   PFR(m=2)                 TiSVM(m=2)
U       76.42 (6.34)             79.42 (6.65)             80.25 (7.21)             85.42 (3.86)
T       76.43 (6.12)             79.43 (5.65)             79.75 (7.76)             84.63 (3.64)
T1      3.58×10^-1 (4.97×10^-2)  1.87×10^2 (4.83)         1.97×10^3 (1.29×10)      2.27×10^3 (6.64×10)
T2      2.12×10^-2 (2.70×10^-3)  3.90×10^-3 (8.75×10^-4)  1.00×10^-4 (3.16×10^-4)  1.00×10^-4 (3.16×10^-4)
T3      5.50×10^-3 (1.40×10^-3)  7.70×10^-2 (9.30×10^-3)  7.21×10^-2 (9.70×10^-3)  6.40×10^-2 (3.30×10^-3)

        TiTSVM(m=2)              PFR(m=3)                 TiSVM(m=3)               TiTSVM(m=3)              LapTSVM
U       77.04 (10.78)            80.17 (7.77)             85.26 (4.11)             77.94 (7.79)             81.44 (2.77)
T       77.22 (10.73)            79.95 (7.95)             84.58 (3.97)             78.70 (8.40)             81.12 (3.26)
T1      2.68×10^4 (5.48×10^2)    2.89×10^3 (8.43×10)      4.17×10^3 (1.13×10^2)    6.69×10^4 (7.86×10^3)    1.77×10^3 (2.48×10)
T2      1.90×10^-3 (3.16×10^-4)  2.00×10^-4 (4.22×10^-4)  2.00×10^-4 (4.22×10^-4)  2.90×10^-3 (3.20×10^-3)  6.34 (3.00×10^-1)
T3      7.75×10^-2 (1.82×10^-2)  6.53×10^-2 (1.19×10^-2)  7.39×10^-2 (9.70×10^-3)  4.00×10^-1 (5.24×10^-1)  1.56 (5.13×10^-2)

6.4 German Credit Data

The German Credit dataset (footnote 4) also comes from the UCI Machine Learning Repository and consists of 1000 examples (300 positive examples and 700 negative examples). The training set includes 80% of the data, and the remaining 20% serve as the test set. For semi-supervised classification, 1% of the whole data set are randomly selected from the training set to serve as labeled training data. The same experimental setting as in the previous experiments is adopted.

6.5 Australian

The Australian dataset (footnote 5) contains 690 examples (307 positive examples and 383 negative examples) and 14 attributes. The training set includes 80% of the data, and the remaining 20% serve as the test set. For semi-supervised classification, 1/30 of the whole data set are randomly selected from the training set to serve as labeled training data. For this dataset, we use the RBF kernel and set the kernel parameter to 100, which performs well for LapSVMs. The other experimental settings are adopted from the previous experiments.

6.6 Contraceptive Method Choice

The Contraceptive Method Choice dataset (footnote 6) contains 1473 examples from three classes. We choose 629 positive examples and 511 negative examples for binary classification. The training set includes 80% of the data, and the remaining 20% serve as the test set. For semi-supervised classification, 1/50 of the whole data set are randomly selected from the training set to serve as labeled training data. For this dataset, we use the RBF kernel and set the kernel parameter to 100, which performs well for LapSVMs. The other experimental settings are adopted from the previous experiments.

4. https://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data)
5. http://archive.ics.uci.edu/ml/datasets/Statlog+%28Australian+Credit+Approval%29
6. https://archive.ics.uci.edu/ml/datasets/Contraceptive+Method+Choice

TABLE 3 Classification Accuracies (%) (with Standard Deviations) and Time (with Standard Deviations) of Different Methods on the Isolet Data.

        SVM                      LapSVM                   PFR(m=2)                 TiSVM(m=2)
U       93.93 (3.30)             97.11 (0.95)             95.61 (8.01)             97.49 (0.63)
T       94.16 (4.48)             97.08 (2.90)             95.42 (9.82)             98.23 (0.85)
T1      4.20×10^-2 (7.40×10^-3)  4.54 (2.79×10^-1)        5.22×10^2 (2.01×10)      3.54×10^2 (7.00)
T2      1.51×10^-2 (3.16×10^-2)  5.00×10^-4 (5.30×10^-4)  1.00×10^-4 (3.16×10^-4)  1.00×10^-4 (3.16×10^-4)
T3      1.30×10^-2 (6.75×10^-4)  8.60×10^-3 (2.40×10^-3)  1.98×10^-2 (2.48×10^-2)  9.90×10^-3 (2.10×10^-3)

        TiTSVM(m=2)              PFR(m=3)                 TiSVM(m=3)               TiTSVM(m=3)              LapTSVM
U       98.18 (0.92)             95.24 (8.53)             97.27 (0.47)             97.41 (1.78)             97.91 (0.85)
T       98.65 (1.21)             95.42 (10.57)            97.81 (1.04)             98.23 (1.63)             98.33 (1.22)
T1      1.08×10^3 (1.06×10^2)    3.97×10^2 (1.72×10)      4.25×10^2 (2.10×10)      1.67×10^3 (7.99×10)      9.30×10 (4.82)
T2      5.00×10^-4 (5.27×10^-4)  1.00×10^-4 (4.25×10^-4)  1.00×10^-4 (3.16×10^-4)  7.00×10^-4 (4.83×10^-4)  8.26×10^-2 (1.47×10^-2)
T3      1.26×10^-2 (3.50×10^-3)  1.25×10^-2 (4.50×10^-3)  1.04×10^-2 (2.40×10^-3)  1.15×10^-2 (2.20×10^-3)  2.48×10^-2 (8.00×10^-3)

TABLE 4 Classification Accuracies (%) (with Standard Deviations) and Time (with Standard Deviations) of Different Methods on the German Credit Data.

        SVM                      LapSVM                   PFR(m=2)                 TiSVM(m=2)
U       60.03 (7.63)             67.48 (4.22)             70.00 (0.00)             70.00 (0.00)
T       59.50 (6.21)             66.55 (3.47)             70.00 (0.00)             70.00 (0.00)
T1      1.05×10^-1 (1.35×10^-1)  2.79×10 (1.86)           3.28×10^2 (8.32)         6.90×10^2 (2.58×10^2)
T2      1.10×10^-3 (1.10×10^-3)  1.10×10^-3 (3.16×10^-4)  1.00×10^-4 (3.16×10^-4)  1.00×10^-4 (3.16×10^-4)
T3      4.00×10^-4 (6.99×10^-4)  1.74×10^-2 (3.10×10^-3)  1.63×10^-2 (1.01×10^-2)  2.04×10^-2 (1.96×10^-2)

        TiTSVM(m=2)              PFR(m=3)                 TiSVM(m=3)               TiTSVM(m=3)              LapTSVM
U       66.55 (0.92)             70.00 (0.00)             70.00 (0.00)             68.13 (3.79)             61.96 (7.44)
T       67.55 (4.46)             70.00 (0.00)             70.00 (0.00)             67.10 (5.11)             62.30 (7.26)
T1      4.42×10^3 (8.26×10)      4.14×10^2 (8.21)         4.92×10^2 (1.35×10)      9.18×10^3 (4.03×10^2)    3.40×10^2 (1.08×10)
T2      1.00×10^-3 (2.12×10^-5)  1.00×10^-4 (3.16×10^-4)  1.00×10^-4 (3.16×10^-4)  1.00×10^-3 (0.00)        7.42×10^-1 (6.71×10^-2)
T3      1.14×10^-2 (2.30×10^-3)  1.16×10^-2 (3.50×10^-3)  1.06×10^-2 (2.90×10^-3)  2.56×10^-2 (4.83×10^-2)  1.87×10^-1 (2.87×10^-2)

6.7 Experimental Results

Handwritten Digit Classification: The classification accuracies and time of the different methods on this dataset are shown in Table 1, where U and T represent the accuracy on the unlabeled training data and the test data, respectively, and the best accuracies are indicated in bold. T1, T2 and T3 (all in seconds) represent training time, test time on the unlabeled data and test time on the labeled data, respectively. From this table, we see that the semi-supervised methods give better performance than the supervised SVMs, which indicates the usefulness of unlabeled examples. Moreover, the proposed TiSVMs and TiTSVMs perform much better than LapSVMs and PFR. TiSVMs perform a little better than TiTSVMs. LapTSVMs perform best on the classification of the unlabeled data, while TiSVMs perform best on the classification of the labeled data.

Face Detection: The classification results and time of TiSVMs and TiTSVMs with comparisons to the other methods are given in Table 2, which shows that our TiSVMs achieve the best accuracies on both the test and unlabeled training data. However, the performance of our TiTSVMs is not good, only a little better than SVMs. We conjecture that this is caused by the fact that TiTSVMs have more parameters to train and thus it is sometimes difficult to return a very good solution.

Speech Recognition: The classification accuracies and time of the different methods on this dataset are shown in Table 3. This table shows that the semi-supervised methods give better performance than the supervised SVMs. Moreover, the proposed TiSVMs and TiTSVMs perform much better than LapSVMs and PFR. LapTSVMs and TiTSVMs perform a little better than TiSVMs.

German Credit Data: The classification accuracies and time of the different methods on this dataset are shown in Table 4. Again, we see that the semi-supervised methods give better performance than the supervised SVMs. Moreover, the proposed TiSVMs perform much better than TiTSVMs, LapSVMs, LapTSVMs and SVMs, and perform the same as PFR. TiTSVMs perform a little better than LapSVMs on the classification of the labeled data and a little worse than LapSVMs on the classification of the unlabeled data.

Australian: The classification accuracies and time of the different methods on this dataset are shown in Table 5. We see that the semi-supervised methods give better performance than the supervised SVMs. Moreover, the proposed TiTSVMs perform much better than TiSVMs, LapSVMs, LapTSVMs and SVMs.

Contraceptive Method Choice: The classification accuracies and time of the different methods on this dataset are shown in Table 6. The semi-supervised methods give better performance than the supervised SVMs. TiTSVMs (m = 3) perform best on the classification of the unlabeled data and TiSVMs (m = 2) perform best on the classification of the labeled data.

TABLE 5 Classification Accuracies (%) (with Standard Deviations) and Time (with Standard Deviations) of Different Methods on the Australian Data.

        SVM                      LapSVM                   PFR(m=2)                 TiSVM(m=2)
U       56.25 (5.92)             60.44 (4.76)             61.70 (4.92)             61.02 (4.52)
T       54.86 (7.60)             60.14 (6.12)             61.16 (5.10)             60.58 (3.17)
T1      6.33×10^-2 (9.98×10^-2)  1.34×10 (9.56×10^-1)     1.65×10^2 (6.80)         2.03×10^3 (7.16×10^2)
T2      1.50×10^-3 (1.30×10^-3)  5.00×10^-4 (5.27×10^-4)  1.00×10^-4 (3.16×10^-4)  1.00×10^-4 (3.16×10^-4)
T3      9.00×10^-4 (5.68×10^-4)  1.74×10^-2 (3.86×10^-2)  6.70×10^-3 (1.70×10^-3)  6.10×10^-3 (1.90×10^-3)

        TiTSVM(m=2)              PFR(m=3)                 TiSVM(m=3)               TiTSVM(m=3)              LapTSVM
U       64.64 (3.92)             62.20 (4.74)             62.10 (4.19)             62.36 (4.54)             59.96 (6.64)
T       63.48 (6.49)             62.32 (6.74)             60.72 (4.19)             64.04 (6.94)             60.80 (11.84)
T1      4.49×10^3 (1.14×10^3)    2.09×10^2 (7.67)         2.84×10^3 (7.47×10^2)    1.05×10^4 (2.16×10^3)    1.64×10^2 (1.68×10)
T2      8.00×10^-4 (4.22×10^-4)  1.40×10^-3 (4.10×10^-3)  1.40×10^-3 (4.10×10^-3)  8.75×10^-4 (3.53×10^-4)  1.70×10^-1 (1.85×10^-2)
T3      7.70×10^-3 (3.20×10^-3)  7.90×10^-3 (4.00×10^-3)  7.90×10^-3 (4.40×10^-3)  1.92×10^-2 (3.27×10^-2)  4.08×10^-2 (7.60×10^-3)

TABLE 6 Classification Accuracies (%) (with Standard Deviations) and Time (with Standard Deviations) of Different Methods on the Contraceptive Method Choice Data.

        SVM                      LapSVM                   PFR(m=2)                 TiSVM(m=2)
U       55.81 (1.00)             56.20 (3.47)             56.53 (1.29)             56.71 (1.19)
T       55.13 (2.08)             57.06 (3.93)             56.54 (3.48)             57.06 (3.79)
T1      7.70×10^-2 (1.02×10^-2)  3.77×10 (1.37×10^-1)     4.86×10^2 (2.66×10)      5.03×10^2 (9.39×10)
T2      1.10×10^-3 (3.16×10^-4)  1.40×10^-3 (5.16×10^-4)  3.00×10^-4 (4.83×10^-4)  3.00×10^-4 (4.83×10^-4)
T3      3.00×10^-4 (4.83×10^-4)  8.80×10^-3 (6.32×10^-4)  2.51×10^-2 (1.09×10^-2)  1.27×10^-2 (2.20×10^-3)

        TiTSVM(m=2)              PFR(m=3)                 TiSVM(m=3)               TiTSVM(m=3)              LapTSVM
U       56.58 (1.74)             56.90 (1.41)             56.57 (1.18)             56.98 (0.02)             56.37 (6.05)
T       56.14 (3.77)             56.89 (3.91)             56.71 (3.03)             56.62 (4.27)             56.93 (6.46)
T1      4.70×10^3 (9.40×10^2)    6.41×10^2 (6.13)         5.60×10^2 (2.54×10)      8.68×10^3 (4.80×10^2)    4.45×10^2 (2.36×10)
T2      1.00×10^-3 (0.00)        1.00×10^-4 (3.16×10^-4)  1.00×10^-4 (3.16×10^-4)  1.10×10^-3 (3.16×10^-4)  1.19 (6.95×10^-2)
T3      1.84×10^-2 (5.10×10^-3)  2.30×10^-2 (7.50×10^-3)  1.89×10^-2 (2.50×10^-2)  2.02×10^-2 (2.05×10^-2)  3.10×10^-1 (4.53×10^-2)

TABLE 7 Classifier Rank of All the Methods

Method             SVM    LapSVM  PFR(m=2)  TiSVM(m=2)  TiTSVM(m=2)  PFR(m=3)  TiSVM(m=3)  TiTSVM(m=3)  LapTSVM
Average rank (U)   9.00   6.33    5.25      2.83        4.33         4.42      3.67        4.33         4.83
Average rank (T)   9.00   5.75    5.67      2.75        4.50         4.17      3.75        4.75         4.67

Table 7 lists the average rank of all the methods for their classification accuracies, from which we can conclude that our methods outperform the other methods. From the perspective of time, as can be seen from Tables 1-6, they usually take more training time, less test time on the unlabeled data and more test time on the labeled data.

7 DISCUSSIONS

Here we discuss some possible improvements and extensions of the proposed tangent space intrinsic manifold regularization and semi-supervised classification algorithms, which can be very helpful for adapting them to different applications.

7.1 Out-of-Sample Extension Using Multiple Neighbors

In this paper, we only used one nearest neighbor from the training data to represent the out-of-sample extension for the learned functions, as given in (3). The performance would depend largely on the properties of the selected neighboring point. However, in order to enhance robustness, we can adopt a weighted average to represent the function f(x) for x outside the training set Z. Suppose we adopt n neighbors to carry out the out-of-sample extension, and N(x) ⊂ Z includes the n neighbors of x. Then f(x) is computed as

$$f(x) = \frac{1}{\sum_{z \in N(x)} W_{xz}} \sum_{z \in N(x)} W_{xz} \Big( b_z + w_z^\top T_z (x - z) \Big),$$

where W_{xz} is the weight calculated in the same manner as when constructing the weighted graphs from Z. Here n, which is not necessarily equal to the neighborhood number used to construct the adjacency graphs, can be selected through some appropriate model selection procedure. The prediction rules (10) and (16) can be extended similarly.
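A minimal sketch of the multi-neighbor out-of-sample rule above, under the assumption that the heat kernel (5) is used for the weights W_{xz}; the function and parameter names are ours.

```python
import numpy as np

def predict_weighted(x, anchors, b, w, T, n=5, t=1.0):
    """anchors: (k, d) training points; b: (k,); w: (k, m); T: (k, m, d)."""
    sq = ((anchors - x) ** 2).sum(axis=1)
    idx = np.argsort(sq)[:n]                                  # N(x): the n nearest anchors
    wx = np.exp(-sq[idx] / t)                                 # heat-kernel weights W_{xz}
    vals = np.array([b[i] + w[i] @ (T[i] @ (x - anchors[i])) for i in idx])
    return float(wx @ vals / wx.sum())                        # weight-normalized average
```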
7.2 Reducing Anchor Points

We treated each example from the training set as an anchor point, where local PCA is used to calculate the tangent space. The number of parameters that should be estimated in our methods basically grows linearly with respect to the number of anchor points. Therefore, in order to reduce the parameters to be estimated, one possible approach is to reduce the anchor points, where only "key" examples are kept as anchor points and the function values for the other examples are extended from those of the anchor points. This is a kind of research on data set sparsification. People can devise different methods to find the examples which they regard as "key".

The research on anchor point reduction is especially useful when training data are either very limited or of large scale. For limited data, the precision of parameter estimation can be improved as a result of parameter reduction, while for large-scale data, anchor point reduction can be promising for speeding up the training process. For example, reducing anchor points can be applied to semi-supervised learning where the training data include a large number of unlabeled examples.

7.3 Improving the Estimation of Tangent Spaces

For the manifold learning problems considered in this paper, the estimation of bases for tangent spaces is an important step, where local PCA with a fixed neighborhood size was used. This is certainly not the optimal choice, since data could be non-uniformly sampled and manifolds can have a varying curvature. Notice that the neighborhood size can determine the evolution of the calculated tangent spaces along the manifold. If a small neighborhood size is used, the tangent spaces would change more sharply when the manifold is not flat. Moreover, noise can damage the manifold assumption as well to a certain extent. All these factors explain the necessity of using different neighborhood sizes and more robust subspace estimation methods.

In addition, data can exhibit different manifold dimensions in different regions, especially for complex data. Therefore, adaptively determining the dimensionality at different anchor points is also an important refinement concern for the current approach.

7.4 Semi-Supervised Regression

The proposed tangent space intrinsic manifold regularization can also be applied to semi-supervised regression problems. Using the same notations as in Section 4, we give the following objective function for semi-supervised regression with the squared loss

$$\min_{\{b_i, w_i\}_{i=1}^{\ell+u}} \ \frac{1}{\ell} \sum_{i=1}^{\ell} \big(y_i - f(x_i)\big)^2 + \gamma_1 \sum_{i=1}^{\ell+u} \|w_i\|_2^2 + \gamma_2 R(\{b_i, w_i\}_{i=1}^{\ell+u}). \quad (23)$$

The generalization to examples not in the training set is analogous to (10) but without the sign function.

It can be easily shown that the optimization of (23) amounts to solving a sparse linear system, which is much simpler than the quadratic programming for the classification case. This is reminiscent of the works of [33] and [29], which are both on semi-supervised regression preferring linear functions on manifolds. These works perform regression from the more complex perspectives of Hessian energy and parallel fields, respectively. There are two main differences between (23) and their works: i) their objective functions do not contain the second term of (23) and thus cannot regularize the norms of $\{w_i\}_{i=1}^{\ell+u}$; ii) their works focus on transductive learning, while by solving (23) we can do both inductive and transductive learning. Comparing the performance of these methods on regression is not the focus of this paper, and can be explored as interesting future work.

7.5 Entirely Supervised Learning

The TiSVMs for semi-supervised classification can be extended to the entirely supervised learning scenario when all the training examples are labeled (a similar extension of TiTSVMs is also possible). In this case, the objective function could probably have the following form

$$\min_{\{b_i, w_i\}_{i=1}^{\ell}} \ \frac{1}{\ell} \sum_{i=1}^{\ell} \big(1 - y_i f(x_i)\big)_+ + \gamma_1 \sum_{i=1}^{\ell} \|w_i\|_2^2 + \gamma_2 R(\{b_i, w_i\}_{i \in I_+}) + \gamma_3 R(\{b_i, w_i\}_{i \in I_-}),$$

where I_+ and I_- are the index sets for positive and negative examples, respectively. The separation of positive and negative examples reflects our intention to construct a weight graph for each class [3], since we do not expect that a positive example is among the neighbors of a negative example or vice versa. The function values for the positive and negative examples would tend to be far apart, which is clearly an advantage. A more detailed study of the entirely supervised learning case is left for future work.

8 CONCLUSION

In this paper, we have proposed a new regularization method called tangent space intrinsic manifold regularization, which favors linear functions on the manifold. Local PCA is involved as an important step to estimate tangent spaces and to compute the connections between adjacent tangent spaces. Simultaneously, we proposed two new methods for semi-supervised classification with tangent space intrinsic manifold regularization. Experimental results on multiple datasets, including comparisons with state-of-the-art algorithms, have shown the effectiveness of the proposed methods.

Future work directions include analyzing the generalization error of TiSVMs and TiTSVMs, and applying the tangent space intrinsic manifold regularization to semi-supervised regression and supervised learning tasks.

REFERENCES

[1] O. Chapelle, B. Schölkopf, and A. Zien, Semi-supervised Learning. Cambridge, MA: The MIT Press, 2006.
[2] X. Zhu, "Semi-supervised learning literature survey," Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, Technical Report 1530, Jul. 2008.
[3] M. Belkin, P. Niyogi, and V. Sindhwani, "Manifold regularization: A geometric framework for learning from labeled and unlabeled examples," Journal of Machine Learning Research, vol. 7, no. 11, pp. 2399–2434, 2006.
[4] R. Johnson and T. Zhang, "On the effectiveness of Laplacian normalization for graph semi-supervised learning," Journal of Machine Learning Research, vol. 8, no. 7, pp. 1489–1517, 2007.
[5] Z. Qi, Y. Tian, and Y. Shi, "Laplacian twin support vector machine for semi-supervised classification," Neural Networks, vol. 35, no. 11, pp. 46–53, 2012.
[6] A. Blum and T. Mitchell, "Combining labeled and unlabeled data with co-training," in Proceedings of the 11th Annual Conference on Computational Learning Theory, Madison, WI, Jul. 1998, pp. 92–100.
[7] J. Farquhar, D. Hardoon, H. Meng, J. Shawe-Taylor, and S. Szedmak, "Two view learning: SVM-2K, theory and practice," Advances in Neural Information Processing Systems, vol. 18, no. 1, pp. 355–362, 2006.
[8] V. Sindhwani, P. Niyogi, and M. Belkin, "A co-regularization approach to semi-supervised learning with multiple views," in Proceedings of the Workshop on Learning with Multiple Views with the 22nd ICML, University of Bonn, Germany, Aug. 2005, pp. 1–6.
[9] V. Sindhwani and D. Rosenberg, "An RKHS for multi-view learning and manifold co-regularization," in Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, Jul. 2008, pp. 976–983.
[10] S. Sun, "Multi-view Laplacian support vector machines," Lecture Notes in Computer Science, vol. 7121, no. 1, pp. 209–222, 2011.
[11] S. Sun and J. Shawe-Taylor, "Sparse semi-supervised learning using conjugate functions," Journal of Machine Learning Research, vol. 11, no. 12, pp. 2423–2455, 2010.
[12] X. Xie and S. Sun, "Multi-view Laplacian twin support vector machines," Applied Intelligence, vol. 41, no. 4, pp. 1059–1068, 2014.
[13] A. N. Tikhonov, "Regularization of incorrectly posed problems," Soviet Mathematics Doklady, vol. 4, no. 6, pp. 1624–1627, 1963.
[14] T. Evgeniou, M. Pontil, and T. Poggio, "Regularization networks and support vector machines," Advances in Computational Mathematics, vol. 13, no. 1, pp. 1–50, 2000.
[15] C. Hou, F. Nie, F. Wang, C. Zhang, and Y. Wu, "Semisupervised learning using negative labels," IEEE Transactions on Neural Networks and Learning Systems, vol. 22, pp. 420–432, Mar. 2011.
[16] V. N. Vapnik, Statistical Learning Theory. New York, NY: Wiley, 1998.
[17] Jayadeva, R. Khemchandani, and S. Chandra, "Twin support vector machines for pattern classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, pp. 905–910, May 2007.
[18] R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society: Series B (Methodological), vol. 58, no. 1, pp. 267–288, 1996.
[19] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis. Cambridge, England: Cambridge University Press, 2004.
[20] D. Rosenberg, Semi-supervised Learning with Multiple Views. Ph.D. Thesis, Department of Statistics, University of California, Berkeley, 2008.
[21] S. Roweis and L. Saul, "Nonlinear dimensionality reduction by locally linear embedding," Science, vol. 290, no. 5500, pp. 2323–2326, 2000.
[22] J. Tenenbaum, V. de Silva, and J. Langford, "A global geometric framework for nonlinear dimensionality reduction," Science, vol. 290, no. 5500, pp. 2319–2323, 2000.
[23] M. Belkin and P. Niyogi, "Laplacian eigenmaps for dimensionality reduction and data representation," Neural Computation, vol. 15, no. 6, pp. 1373–1396, 2003.
[24] S. Sun, "Tangent space intrinsic manifold regularization for data representation," in Proceedings of the 1st IEEE China Summit and International Conference on Signal and Information Processing, Beijing, China, Jul. 2013, pp. 179–183.
[25] I. T. Jolliffe, Principal Component Analysis. New York, NY: Springer-Verlag, 1986.
[26] B. Schölkopf and A. Smola, Learning with Kernels. Cambridge, MA: The MIT Press, 2001.
[27] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, England: Cambridge University Press, 2004.
[28] J. Shawe-Taylor and S. Sun, "A review of optimization methodologies in support vector machines," Neurocomputing, vol. 74, no. 17, pp. 3609–3618, 2011.
[29] B. Lin, C. Zhang, and X. He, "Semi-supervised regression via parallel field regularization," Advances in Neural Information Processing Systems, vol. 24, no. 1, pp. 433–441, 2012.
[30] B. Schölkopf, C. Burges, and V. Vapnik, "Extracting support data for a given task," in Proceedings of the 1st International Conference on Knowledge Discovery and Data Mining, Montreal, Quebec, Aug. 1995, pp. 252–257.
[31] M. Alvira and R. Rifkin, "An empirical comparison of SNoW and SVMs for face detection," Center for Biological and Computational Learning, Massachusetts Institute of Technology, Cambridge, MA, Technical Report 193, Jan. 2001.
[32] S. Sun, "Ensembles of feature subspaces for object detection," Lecture Notes in Computer Science, vol. 5552, no. 1, pp. 996–1004, 2009.
[33] K. Kim, F. Steinke, and M. Hein, "Semi-supervised regression using Hessian energy with an application to semi-supervised dimensionality reduction," Advances in Neural Information Processing Systems, vol. 22, no. 1, pp. 979–987, 2010.

Shiliang Sun is a professor at the Department of Computer Science and Technology and the head of the Pattern Recognition and Machine Learning Research Group, East China Normal University. He received the Ph.D. degree from Tsinghua University, Beijing, China, in 2007. His research interests include kernel methods, learning theory, approximate inference, sequential modeling and their applications, etc.

Xijiong Xie is a Ph.D. student in the Pattern Recognition and Machine Learning Research Group, Department of Computer Science and Technology, East China Normal University. His research interests include machine learning, pattern recognition, etc.