A graph based approach to semi-supervised learning
Michael Lim
1 Feb 2011
Two papers
M. Belkin, P. Niyogi, and V. Sindhwani. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 1–48, 2006.

M. Belkin and P. Niyogi. Towards a theoretical foundation for Laplacian-based manifold methods. Journal of Computer and System Sciences, 2007.
What is semi-supervised learning?
Prediction, but with the help of unlabeled examples.
Why semi-supervised learning?

Practical reasons: unlabeled data is cheap.
A more natural model of human learning.
An example
Semi-supervised learning framework 1
l labeled examples (x, y) generated by a distribution P.
u unlabeled examples drawn from the marginal P_X. Mercer kernel K.

f^* = \operatorname*{argmin}_{f \in \mathcal{H}_K} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma \|f\|_K^2
Semi-supervised learning framework 2

Classical representer theorem:

f^*(x) = \sum_{i=1}^{l} \alpha_i K(x_i, x)
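A minimal sketch of the representer expansion, assuming NumPy and a Gaussian Mercer kernel (both my choices, not specified on this slide); the coefficients come from the classical RLS solution α* = (K + λlI)⁻¹Y given on the regularized least squares slide.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Mercer kernel K(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

# l labeled examples (x_i, y_i) -- synthetic data for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 2))
y = np.sign(X[:, 0])

# Classical RLS coefficients: alpha* = (K + lambda*l*I)^{-1} Y
lam, l = 0.1, len(X)
K = gaussian_kernel(X, X)
alpha = np.linalg.solve(K + lam * l * np.eye(l), y)

# Representer theorem: f*(x) = sum_i alpha_i K(x_i, x)
def f_star(Xnew):
    return gaussian_kernel(Xnew, X) @ alpha
```

The point of the theorem is that the minimizer over the whole RKHS reduces to a finite expansion over the l training points, so only the vector alpha needs to be computed.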
Manifold regularization: assumptions

Assumptions:
P is supported on a manifold M.
P(y|x) varies smoothly along geodesics of P_X.

Modified objective:

f^* = \operatorname*{argmin}_{f \in \mathcal{H}_K} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_K^2 + \gamma_I \|f\|_I^2
Manifold regularization: known marginal

Theorem
If P_X is known and M is a smooth Riemannian manifold, then

f^*(x) = \sum_{i=1}^{l} \alpha_i K(x_i, x) + \int_M \alpha(z) K(x, z) \, dP_X(z)
Manifold regularization: unknown marginal

Need to estimate the marginal and \|f\|_I. Only requires unlabeled data.
Natural choice: \|f\|_I^2 = \int_M \|\nabla_M f\|^2 \, dP
Approximate M with a graph.
Manifold regularization: building the graph

Single-linkage clustering.
Nearest neighbor methods.
Use the graph Laplacian instead of the manifold Laplacian.
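The graph construction above can be sketched in NumPy: connect each point to its nearest neighbors, put exponential (heat-kernel) weights on the edges as in the next slide's theorem, and form L = D − W. The function name and the parameters k and t are illustrative choices, not taken from the papers.

```python
import numpy as np

def graph_laplacian(X, k=5, t=1.0):
    """Build a kNN graph with exponential edge weights
    W_ij = exp(-||x_i - x_j||^2 / t) and return L = D - W."""
    n = len(X)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(sq[i])[1:k + 1]   # k nearest neighbors, excluding i itself
        W[i, nbrs] = np.exp(-sq[i, nbrs] / t)
    W = np.maximum(W, W.T)                  # symmetrize the adjacency
    D = np.diag(W.sum(axis=1))              # degree matrix
    return D - W

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
L = graph_laplacian(X)
```

By construction L is symmetric, has zero row sums, and is positive semi-definite, which is what makes f^T L f usable as a smoothness penalty.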
Manifold regularization: using the graph

Theorem
By choosing exponential weights for the edges, the graph Laplacian converges to the manifold Laplacian in probability.

f^* = \operatorname*{argmin}_{f \in \mathcal{H}_K} \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_K^2 + \frac{\gamma_I}{(u+l)^2} f^T L f, \qquad L = D - W
Main result

Theorem
f^*(x) = \sum_{i=1}^{l+u} \alpha_i K(x_i, x)
Regularized least squares

Classical RLS:
\operatorname*{argmin}_{f \in \mathcal{H}_K} \frac{1}{l} \sum_{i=1}^{l} (y_i - f(x_i))^2 + \lambda \|f\|_K^2
Solution: f^*(x) = \sum_{i=1}^{l} \alpha_i^* K(x_i, x), \quad \alpha^* = (K + \lambda l I)^{-1} Y

Laplacian RLS:
\operatorname*{argmin}_{f \in \mathcal{H}_K} \frac{1}{l} \sum_{i=1}^{l} (y_i - f(x_i))^2 + \lambda_A \|f\|_K^2 + \frac{\lambda_I}{(u+l)^2} f^T L f
Solution: f^*(x) = \sum_{i=1}^{l+u} \alpha_i^* K(x, x_i), \quad \alpha^* = \left( JK + \lambda_A l I + \frac{\lambda_I l}{(u+l)^2} LK \right)^{-1} Y
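The Laplacian RLS closed form can be sketched end to end in NumPy. This is an illustrative implementation under my own choices (Gaussian kernel, fully connected exponential-weight graph, synthetic data, values of λ_A and λ_I); J is the diagonal matrix with ones for the l labeled points and zeros for the u unlabeled ones, and Y pads the labels with zeros.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))          # first l rows labeled, rest unlabeled
l, n = 5, len(X)                      # n = l + u
y = np.sign(X[:l, 0])

# Gaussian Mercer kernel on all l + u points
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 2)

# Fully connected graph with exponential weights; L = D - W
W = np.exp(-sq) - np.eye(n)           # zero out the diagonal
Lap = np.diag(W.sum(axis=1)) - W

# alpha* = (J K + lam_A l I + lam_I l / (u+l)^2 L K)^{-1} Y
lam_A, lam_I = 1e-2, 1e-2
J = np.diag(np.r_[np.ones(l), np.zeros(n - l)])
Y = np.r_[y, np.zeros(n - l)]
A = J @ K + lam_A * l * np.eye(n) + (lam_I * l / n**2) * (Lap @ K)
alpha = np.linalg.solve(A, Y)

# f*(x) = sum_{i=1}^{l+u} alpha_i K(x, x_i); evaluated at the training points:
f_train = K @ alpha
```

Note the contrast with classical RLS: the expansion runs over all l + u points, so the unlabeled data enter both through the kernel matrix and through the Laplacian penalty.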
Support vector machines

As with regularized least squares, there is a Laplacian-regularized version of the SVM, called the Laplacian SVM.
Two moons dataset
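The slide's figure (not reproduced here) shows the standard two-moons benchmark for semi-supervised methods. A hypothetical NumPy re-creation of a dataset of this shape, with constants of my own choosing:

```python
import numpy as np

# Two interleaving half-moons; in the semi-supervised setting only a
# handful of the 2n points would carry labels.
rng = np.random.default_rng(0)
n = 100
t = np.pi * rng.random(n)
upper = np.c_[np.cos(t), np.sin(t)]                 # upper half-circle
lower = np.c_[1 - np.cos(t), 0.5 - np.sin(t)]       # shifted lower half-circle
X = np.vstack([upper, lower]) + rng.normal(scale=0.05, size=(2 * n, 2))
y = np.r_[np.zeros(n), np.ones(n)]
```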
Wisconsin breast cancer data

683 samples. Benign or malignant?
Features: clump thickness, uniformity of cell size and shape, etc.
Wisconsin breast cancer data: results
Longer term stuff

Besides geometric structure, what else can we use? Invariance?
Learning the manifold: a simplicial complex instead of a graph? Homology.
Nice example in natural image statistics (Mumford et al., 2003).
Longer term stuff 2

Hickernell, Song, and Zhang. Reproducing kernel Banach spaces with the ℓ1 norm. Preprint.
Reproducing kernel Banach spaces with the ℓ1 norm II: error analysis for regularized least squares regression. Preprint.