Robust Non-Linear Dimensionality Reduction Using Successive 1-Dimensional Laplacian Eigenmaps
Samuel Gerber [email protected]
Tolga Tasdizen [email protected]
Ross Whitaker [email protected]
Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, UT 84112

Abstract

Non-linear dimensionality reduction of noisy data is a challenging problem encountered in a variety of data analysis applications. Recent results in the literature show that spectral decomposition, as used for example by the Laplacian Eigenmaps algorithm, provides a powerful tool for non-linear dimensionality reduction and manifold learning. In this paper, we discuss a significant shortcoming of these approaches, which we refer to as the repeated eigendirections problem. We propose a novel approach that combines successive 1-dimensional spectral embeddings with a data advection scheme that allows us to address this problem. The proposed method does not depend on a non-linear optimization scheme; hence, it is not prone to local minima. Experiments with artificial and real data illustrate the advantages of the proposed method over existing approaches. We also demonstrate that the approach is capable of correctly learning manifolds corrupted by significant amounts of noise.

Appearing in Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, 2007. Copyright 2007 by the author(s)/owner(s).

1. Introduction

Dimensionality reduction is an important procedure in various high-dimensional data analysis problems. Applications range from image compression (Ye et al., 2004) to visualization (Vlachos et al., 2002). Often the data lies on a low dimensional manifold embedded in a high dimensional space, and the dimensionality of the data can be reduced without significant loss of information. However, in many cases the lower dimensional manifold is non-linear, which makes common linear approaches, such as principal component analysis (PCA) (Jolliffe, 1986) and multidimensional scaling (MDS) (Cox & Cox, 1994), unsuitable. A common example of a non-linear manifold embedded in a high dimensional space is the set of image vectors of an object taken from different angles and lighting directions (Lin et al., 2006). In such a case, the intrinsic dimensionality is restricted by the degrees of freedom of the physical constraints under which the images were taken, whereas the data itself has a much greater dimensionality that depends on the resolution of the images. The analysis of climate data (Gámez et al., 2004) in geophysics and the study of protein motions (Das et al., 2006) in computational biology are other examples.

Isomap (Tenenbaum et al., 2000) chooses a lower dimensional embedding in which Euclidean distances approximate the geodesic distances on the manifold in the high-dimensional space. While this method has many desirable properties, such as distance preservation and a closed-form solution, the geodesic distance approximation depends on the definition of local connectivity for each data point and can be sensitive to noise. Figure 1 (top row) shows increasingly noisy versions of a Swiss roll data set, and the second row illustrates the 2-dimensional embeddings obtained by the Isomap algorithm. Tenenbaum et al. propose k-nearest neighbors and ε-radius balls as two methods of determining local connectivity. For data with moderate and higher amounts of noise, we were unable to fine-tune the algorithmic parameters to recover the correct manifold structure. This can be seen in the second row of Figure 1, where the results for the noisier data sets exhibit wrong neighborhood relationships in the lower dimensional embedding (different colors next to each other). A more detailed discussion of the topological stability of Isomap with noisy data can be found in Balasubramanian and Schwartz (2002).

Figure 1. The Swiss roll (θ cos(θ), y, θ sin(θ)) is a 2D manifold (parametrized by θ and y) in ℝ³ that is commonly used in dimensionality reduction experiments. Top row: Swiss rolls colored by θ, viewed along the y-axis. Independent normally distributed noise with variance increasing from left to right was added. 2D embeddings, also colored by θ, found by Isomap (2nd row), Laplacian Eigenmaps (3rd row), and our approach (4th row).

Other recent approaches, such as Laplacian Eigenmaps (Belkin & Niyogi, 2003), exploit proximity relations instead of geodesic distances and use elements of spectral graph theory to compute a lower dimensional embedding. We found that, in comparison to Isomap, Laplacian Eigenmaps is more robust with regard to noisy input data. This can be attributed to the absence of a geodesic distance computation. Figure 1 (3rd and 4th rows) shows the results of Laplacian Eigenmaps and our method, respectively, on the noisy Swiss roll data; neighborhood relationships are correctly preserved by both methods for the noise levels shown.
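To make the Figure 1 setup concrete, the following sketch generates a noisy Swiss roll and embeds it with a basic Laplacian Eigenmaps pipeline (k-nearest-neighbor graph, heat-kernel weights, and the generalized eigenproblem on L = D - W). This is a minimal illustration, not the authors' implementation; the neighborhood size k, kernel width sigma, noise level, and parameter ranges are assumed values.

import numpy as np
from scipy.spatial import cKDTree
from scipy.linalg import eigh

rng = np.random.default_rng(0)

def swiss_roll(n=2000, noise_std=0.3):
    """Sample (theta*cos(theta), y, theta*sin(theta)) plus Gaussian noise."""
    theta = rng.uniform(1.5 * np.pi, 4.5 * np.pi, n)
    y = rng.uniform(0.0, 20.0, n)
    X = np.column_stack([theta * np.cos(theta), y, theta * np.sin(theta)])
    return X + rng.normal(scale=noise_std, size=X.shape), theta

def laplacian_eigenmaps(X, n_components=2, k=10, sigma=2.0):
    """Embed X via the bottom non-trivial eigenvectors of L v = lambda D v."""
    n = X.shape[0]
    dist, idx = cKDTree(X).query(X, k=k + 1)   # idx[:, 0] is the point itself
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    W[rows, idx[:, 1:].ravel()] = np.exp(-dist[:, 1:].ravel() ** 2 / (2 * sigma ** 2))
    W = np.maximum(W, W.T)                     # symmetrize the k-NN graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                  # unnormalized graph Laplacian
    _, V = eigh(L, D)                          # generalized eigenvectors, ascending
    return V[:, 1:1 + n_components]            # skip the constant eigenvector

X, theta = swiss_roll()
Y = laplacian_eigenmaps(X)   # coloring Y by theta reproduces the layout of Figure 1

The two returned eigenvectors are exactly the coordinates whose local correlation is examined in Figure 2 below.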
Unfortunately, dimensionality reduction methods based on spectral graph theory implicitly put severe constraints on the shape of the low dimensional manifold. The eigenvectors computed from the graph Laplacian are orthogonal but not independent; in other words, eigenvectors can be strongly correlated locally. Orthogonality alone is therefore not sufficient to guarantee a satisfactory parametrization. Figure 2 illustrates this problem, which we refer to as the repeated eigendirections problem. The two eigenvectors computed from the graph Laplacian and LLE methods recover only the first non-linear dimension of the manifold; the second non-linear dimension is not discovered. The strong local correlations between the two eigenvectors can also be seen by plotting them against each other in the plane (Figure 2, 3rd column). The repeated eigendirections problem occurs when the extents of the underlying non-linear dimensions of the manifold differ significantly. We discuss this problem in more depth in Section 3.

Figure 2. Repeated eigendirections problem. The first two columns show the first two eigenvectors colored on the Swiss roll as computed by LLE, Laplacian Eigenmaps, and our approach. The last column plots the first eigenvector against the second for each approach. The two eigenvectors display strong local correlations using LLE and Laplacian Eigenmaps, but are independent using the proposed method. Note that the proposed method is also able to recover approximate proportions of the manifold, as discussed further in Section 4.

We propose a method that successively collapses the dimensions of the low dimensional manifold based on an advection scheme driven by the eigenvector. The term advection describes the process of transportation in fluids; in our setting, we flow the individual points along the low dimensional manifold. The flow on the low dimensional manifold is defined by a vector field based on the eigenvector computed from the graph Laplacian. This collapses the dimension of the low dimensional manifold defined by this eigenvector, and this dimension is therefore eliminated from subsequent eigenvector computations. Our algorithm consists of computing an eigenvector based on the current points, advecting the points along the eigenvector, and then repeating these two steps (using the advected points) until the embedding is uncovered. The eigenvectors computed at each iteration give the low dimensional parametrization. Our method resolves the repeated eigendirections problem and is capable of dealing with significant amounts of noise, while still having the desirable properties of eigendecomposition-based methods, namely a solution that does not suffer from local minima. A significant drawback of this approach is the high computational cost of the advection: for the Swiss roll example with 2000 points in 3-dimensional space in Figure 1, it takes about 10 minutes to compute the complete low dimensional embedding.
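The following sketch shows one plausible discretization of this loop; it reuses laplacian_eigenmaps from the sketch above. The advection step, which estimates the ambient gradient of the eigenvector by local least squares and takes small explicit steps against it, is our own illustrative choice, as are the step size eta and step count n_steps; the paper's actual vector field construction may differ.

import numpy as np
from scipy.spatial import cKDTree

def first_eigenvector(X, k=10, sigma=2.0):
    # First non-trivial generalized eigenvector of L = D - W (see sketch above).
    return laplacian_eigenmaps(X, n_components=1, k=k, sigma=sigma).ravel()

def ambient_gradient(X, f, k=10):
    """Least-squares estimate of grad f at each point from its k neighbors."""
    _, idx = cKDTree(X).query(X, k=k + 1)
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        A = X[idx[i, 1:]] - X[i]           # neighbor displacements
        b = f[idx[i, 1:]] - f[i]           # corresponding changes in f
        G[i], *_ = np.linalg.lstsq(A, b, rcond=None)
    return G

def successive_embedding(X, n_components=2, n_steps=50, eta=0.2):
    """One eigenvector per recovered dimension, collapsing each in turn."""
    X = X.copy()
    coords = []
    for _ in range(n_components):
        f = first_eigenvector(X)
        coords.append(f)                   # record this dimension's parametrization
        for _ in range(n_steps):           # advect until the f-direction flattens
            G = ambient_gradient(X, f)
            norm2 = (G ** 2).sum(axis=1, keepdims=True) + 1e-12
            X -= eta * f[:, None] * G / norm2   # flow points along the field
            f = first_eigenvector(X)       # re-estimate the flow after each step
    return np.column_stack(coords)

Z = successive_embedding(X)   # 2-D parametrization of the Swiss roll from above

Re-solving the eigenproblem inside the advection loop is what makes the method expensive, consistent with the running time reported above; the sign ambiguity of the recomputed eigenvector is harmless here because the update depends on the product of f and its gradient.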
2. Related Work

Traditional methods such as PCA (Jolliffe, 1986) or MDS (Cox & Cox, 1994) work well under the condition that the lower dimensional manifold is close to linear. Various methods have been developed for the non-linear case. Recently, there has been increased interest in non-linear dimensionality reduction methods that do not depend on an iterative minimization procedure that can be prone to local minima. However, as discussed in Section 1, methods based on the eigendecomposition of a matrix that captures local interactions are susceptible to the repeated eigendirections problem.

Other approaches that have closed-form solutions include Isomap (Tenenbaum et al., 2000), a non-linear extension of MDS based on approximate geodesic distances computed by shortest paths on a graph. Isomap does not suffer from the repeated eigendirections problem but is sensitive to noise in the input, as discussed in Figure 1. Kernel PCA (Schölkopf et al., 1998) extends PCA to handle non-linear cases using the kernel trick. It has been shown that spectral embedding methods are directly related to kernel PCA (Bengio et al., 2004). The difficulty in kernel-based methods is choosing an appropriate kernel; recent work by Weinberger et al. (2004) addresses this problem and provides promising results.

Other approaches to non-linear dimensionality reduction use iterative optimization techniques. Self-organizing maps (Kohonen, 1997) learn a mapping from the high dimensional space to a fixed lower dimensional space, usually a 2-dimensional Cartesian grid. Topology preserving networks (Martinetz & Schulten, 1994) use aspects of computational geometry combined with a neural network to find a low dimensional embedding. Various other neural network based approaches that utilize bottleneck hidden layers have been proposed (Bishop, 1995; Perantonis & Virvilis, 1999; Garrido, 1999; Lee & Verleysen, 2002; Hinton & Salakhutdinov, 2006). Another approach minimizes the Kullback-Leibler divergence between a probability density function (PDF) defined in the high dimensional space and a PDF in the low dimensional space (Hinton & Roweis, 2002).
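For reference, the kernel PCA construction mentioned above reduces to double-centering a kernel matrix and reading off its leading eigenvectors. The sketch below is the standard textbook form, not code from this paper; the RBF kernel and its width gamma are assumed choices.

import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def kernel_pca(X, n_components=2, gamma=0.05):
    """Project X onto the leading principal directions in RBF feature space."""
    K = np.exp(-gamma * cdist(X, X, 'sqeuclidean'))   # kernel matrix
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                                    # center in feature space
    w, V = eigh(Kc)                                   # eigenvalues ascending
    w, V = w[::-1], V[:, ::-1]                        # largest first
    return V[:, :n_components] * np.sqrt(np.maximum(w[:n_components], 0.0))

Replacing the RBF kernel with a suitable data-dependent kernel recovers the spectral embedding methods discussed in this paper, which is the connection established by Bengio et al. (2004).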