A graph based approach to semi-supervised learning

Michael Lim

1 Feb 2011

Two papers

M. Belkin, P. Niyogi, and V. Sindhwani. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 7:2399-2434, 2006.

M. Belkin and P. Niyogi. Towards a theoretical foundation for Laplacian-based manifold methods. Journal of Computer and System Sciences, 2007.

What is semi-supervised learning?

Prediction, but with the help of unlabeled examples.

Why semi-supervised learning?

Practical reasons: unlabeled data is cheap
More natural model of human learning

An example

Semi-supervised learning framework 1

l labeled examples (x, y) generated by distribution P.

u unlabeled examples drawn from the marginal P_X. Mercer kernel K.

$$f^* = \underset{f \in \mathcal{H}_K}{\operatorname{argmin}} \; \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma \|f\|_K^2$$

Semi-supervised learning framework 2

Classical representer theorem:

$$f^*(x) = \sum_{i=1}^{l} \alpha_i K(x_i, x)$$
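The slide states the theorem without proof; the standard short argument (ordinary RKHS reasoning, not reproduced from the talk) goes as follows. Decompose any $f \in \mathcal{H}_K$ as

$$f = \sum_{i=1}^{l} \beta_i K(x_i, \cdot) + f_\perp, \qquad \langle f_\perp, K(x_i, \cdot)\rangle_K = 0 \;\; \text{for all } i.$$

By the reproducing property $f(x_j) = \langle f, K(x_j, \cdot)\rangle_K$, the component $f_\perp$ does not change any loss term $V(x_j, y_j, f)$, while it strictly increases $\|f\|_K^2$ unless $f_\perp = 0$. Hence the minimizer lies in the span of the kernel functions at the labeled points.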

Manifold regularization: assumptions

Assumptions:
P supported on a manifold M
P(y|x) varies smoothly along geodesics in the intrinsic geometry of P_X

Modified objective:

$$f^* = \underset{f \in \mathcal{H}_K}{\operatorname{argmin}} \; \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_K^2 + \gamma_I \|f\|_I^2$$

Manifold regularization: known marginal

Theorem

If P_X is known and M is a smooth Riemannian manifold, then

$$f^*(x) = \sum_{i=1}^{l} \alpha_i K(x_i, x) + \int_{\mathcal{M}} \alpha(z)\, K(x, z)\, dP_X(z)$$
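The graph-based algorithm below never evaluates this integral. One informal way to anticipate the finite expansion in the main result (a heuristic reading, not an argument made on the slides) is that, with only a sample from P_X, the integral would be approximated by an empirical average over the l + u observed points,

$$\int_{\mathcal{M}} \alpha(z)\, K(x, z)\, dP_X(z) \;\approx\; \frac{1}{l+u} \sum_{j=1}^{l+u} \alpha(x_j)\, K(x, x_j),$$

which again has the form of a kernel expansion over all labeled and unlabeled points.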

Manifold regularization: unknown marginal

Need to estimate the marginal and ||f||_I
Only requires unlabeled data
Natural choice: $\|f\|_I^2 = \int_{\mathcal{M}} \|\nabla_{\mathcal{M}} f\|^2 \, dP_X$
Approximate M with a graph (an empirical version of the intrinsic norm is written out below)
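Concretely, once the graph is in place, the standard empirical surrogate for the intrinsic norm (stated here for completeness; the normalization matches the objective on the later slides, with any constant factor absorbed into the regularization parameter) is

$$\|f\|_I^2 \;\approx\; \frac{1}{(u+l)^2}\, \mathbf{f}^T L\, \mathbf{f}, \qquad \mathbf{f}^T L\, \mathbf{f} = \frac{1}{2} \sum_{i,j=1}^{l+u} W_{ij}\, \bigl(f(x_i) - f(x_j)\bigr)^2, \qquad L = D - W,\;\; D_{ii} = \sum_{j} W_{ij}.$$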

Manifold regularization: building the graph

Single-linkage clustering
Nearest neighbor methods
Use the graph Laplacian instead of the manifold Laplacian (a construction sketch follows below)
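A minimal sketch of one such construction, assuming a k-nearest-neighbor graph with the exponential (heat kernel) edge weights mentioned on the next slide; the function name and the defaults for k and t are illustrative choices, not values prescribed by the papers.

```python
import numpy as np

def knn_graph_laplacian(X, k=6, t=1.0):
    """Build a k-NN graph with exponential (heat kernel) weights and
    return its unnormalized Laplacian L = D - W together with W.

    Sketch only: k and the heat-kernel width t are illustrative.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)

    # Symmetric k-nearest-neighbor adjacency (i and j are connected if
    # either is among the other's k nearest neighbors).
    order = np.argsort(sq_dists, axis=1)
    adj = np.zeros((n, n), dtype=bool)
    rows = np.repeat(np.arange(n), k)
    adj[rows, order[:, 1:k + 1].ravel()] = True  # column 0 is the point itself
    adj |= adj.T

    # Exponential edge weights W_ij = exp(-||x_i - x_j||^2 / (4 t)).
    W = np.where(adj, np.exp(-sq_dists / (4.0 * t)), 0.0)

    D = np.diag(W.sum(axis=1))      # degree matrix
    return D - W, W                 # unnormalized graph Laplacian, weights
```

Sparser or differently normalized variants are equally common; the only object the later objectives need is the matrix L = D - W.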

Manifold regularization: using the graph

Theorem
By choosing exponential weights for the edges, the graph Laplacian converges to the manifold Laplacian in probability.

$$f^* = \underset{f \in \mathcal{H}_K}{\operatorname{argmin}} \; \frac{1}{l} \sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_K^2 + \frac{\gamma_I}{(u+l)^2}\, \mathbf{f}^T L\, \mathbf{f}, \qquad L = D - W$$

Main result

Theorem

$$f^*(x) = \sum_{i=1}^{l+u} \alpha_i K(x_i, x)$$

Regularized least squares

Classical RLS:
$$\underset{f \in \mathcal{H}_K}{\operatorname{argmin}} \; \frac{1}{l} \sum_{i=1}^{l} (y_i - f(x_i))^2 + \lambda \|f\|_K^2$$
Solution: $f^*(x) = \sum_{i=1}^{l} \alpha_i^* K(x_i, x)$, with $\alpha^* = (K + \lambda l I)^{-1} Y$

Laplacian RLS:
$$\underset{f \in \mathcal{H}_K}{\operatorname{argmin}} \; \frac{1}{l} \sum_{i=1}^{l} (y_i - f(x_i))^2 + \lambda_A \|f\|_K^2 + \frac{\lambda_I}{(u+l)^2}\, \mathbf{f}^T L\, \mathbf{f}$$
Solution: $f^*(x) = \sum_{i=1}^{l+u} \alpha_i^* K(x, x_i)$, with $\alpha^* = \bigl(JK + \lambda_A l I + \tfrac{\lambda_I l}{(u+l)^2} L K\bigr)^{-1} Y$,
where J is the diagonal matrix with the first l diagonal entries equal to 1 and the rest 0, and Y is the label vector padded with zeros for the unlabeled points.
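Both closed-form solutions are direct to implement. Below is a minimal sketch assuming a Gaussian RBF kernel (the slides leave K generic) and a graph Laplacian L built on all l + u points, e.g. with the k-NN sketch above; the function names and hyperparameter defaults are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian RBF Mercer kernel (one possible choice of K)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def classical_rls(X_l, y_l, lam=1e-2, sigma=1.0):
    """Closed-form classical RLS: alpha = (K + lam * l * I)^{-1} Y."""
    l = len(y_l)
    K = rbf_kernel(X_l, X_l, sigma)
    alpha = np.linalg.solve(K + lam * l * np.eye(l), y_l)
    return lambda X_new: rbf_kernel(X_new, X_l, sigma) @ alpha

def laplacian_rls(X, y_l, L, lam_A=1e-2, lam_I=1e-2, sigma=1.0):
    """Closed-form Laplacian RLS from the slides:
    alpha = (J K + lam_A l I + lam_I l / (u+l)^2 * L K)^{-1} Y.

    X holds the l labeled points first, then the u unlabeled ones;
    L is the graph Laplacian built on all l + u points.
    """
    n, l = X.shape[0], len(y_l)
    K = rbf_kernel(X, X, sigma)
    J = np.diag(np.r_[np.ones(l), np.zeros(n - l)])   # selects labeled points
    Y = np.r_[y_l, np.zeros(n - l)]                   # labels padded with zeros
    A = J @ K + lam_A * l * np.eye(n) + (lam_I * l / n ** 2) * (L @ K)
    alpha = np.linalg.solve(A, Y)
    return lambda X_new: rbf_kernel(X_new, X, sigma) @ alpha
```

For binary classification one would encode the labels as ±1, fit on the labeled-first ordering of the data, and threshold the returned real-valued predictions at zero.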

Support vector machines

As with regularized least squares, there is a version of the SVM with the added intrinsic penalty, called the Laplacian SVM.

Two moons dataset
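A rough illustration of how such an experiment could be set up (the sample size, noise level, and one-label-per-class split below are assumptions for the sketch, not the exact settings used in the talk):

```python
import numpy as np
from sklearn.datasets import make_moons

# Two interleaving half-circles.
X, y = make_moons(n_samples=200, noise=0.05, random_state=0)

# Semi-supervised split: reveal one label per class, treat the rest as unlabeled.
labeled_idx = [np.where(y == c)[0][0] for c in (0, 1)]
unlabeled_idx = [i for i in range(len(y)) if i not in labeled_idx]

X_ordered = np.vstack([X[labeled_idx], X[unlabeled_idx]])  # labeled points first
y_labeled = 2.0 * y[labeled_idx] - 1.0                     # map {0, 1} -> {-1, +1}
```

Building L on X_ordered with the earlier k-NN sketch and calling laplacian_rls(X_ordered, y_labeled, L) gives a decision function whose sign can be compared against classical RLS trained on the two labeled points alone.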

Wisconsin breast cancer data

683 samples. Benign or malignant?
Features: clump thickness, uniformity of cell size and shape, etc.

Wisconsin breast cancer data: results

Longer term stuff

Besides geometric structure, what else can we use? Invariance?
Learning the manifold: simplicial complex instead of graph? Homology.
Nice example in natural image statistics (Mumford et al., 2003)

Longer term stuff 2

Hickernell, Song, and Zhang. Reproducing kernel Banach spaces with the l1 norm. Preprint.

Reproducing kernel Banach spaces with the l1 norm II: error analysis for regularized least squares regression. Preprint.
