École polytechnique fédérale de Lausanne

Master Project

Nonsmooth Riemannian optimization for completion and eigenvalue problems

Author: Francesco Nobili
Supervisor: Prof. Daniel Kressner

June 23, 2017

Contents

1 Introduction

2 Geometrical background for Riemannian optimization
  2.1 First order geometry
  2.2 Riemannian geometry
    2.2.1 Riemannian steepest descent
  2.3 Distances on manifolds and geodesics
  2.4 The exponential map and retractions
    2.4.1 Convergence Analysis
  2.5 Vector Transport
    2.5.1 Riemannian conjugate gradients

3 Geometric CG for Matrix Completion
  3.1 Different formulations
  3.2 The manifold M_k
  3.3 The proposed method
    3.3.1 Implementation aspects
  3.4 Error Analysis
    3.4.1 Numerical simulations

4 Nonsmooth Riemannian Optimization
  4.1 Overview
    4.1.1 Subdifferential for Convex functions
    4.1.2 Generalized gradients for Lipschitz functions
  4.2 Riemannian subgradients
    4.2.1 Convex and Lipschitz maps
    4.2.2 Approximating the subdifferential
    4.2.3 Implementation aspects
  4.3 ε-subdifferential method
  4.4 An example of optimization on S^{n-1}
    4.4.1 Numerical Results

5 Sparse Rayleigh quotient minimization
  5.1 The eigenvalue problem
  5.2 Sparse eigenvectors
    5.2.1 Localized spectral projectors
    5.2.2 Generating process
  5.3 Weighted Rayleigh quotient sparse minimization
    5.3.1 The manifold St_p^n
    5.3.2 Sparse eigenvector on S^{n-1}
    5.3.3 Sparse eigenvectors on St_p^n
  5.4 Nonsmooth Matrix Completion
    5.4.1 Further work directions

Bibliography

Chapter 1

Introduction

In numerical analysis, optimization problems arise from very natural tasks, such as solving a linear system or an eigenvalue problem. In this Thesis, we focus on constrained optimization problems where the constraints identify a submanifold embedded in a Euclidean space. In such cases, assuming a Riemannian structure on the manifold M allows us to switch to an unconstrained optimization problem whose active set becomes the whole set M. Smooth Riemannian optimization has been investigated extensively by researchers in recent years; the most important results are gathered, for instance, in the monograph [1]. The Riemannian approach is helpful when optimizing with constraints that are difficult to deal with directly. Consider, for example, cost functions defined on the matrix manifolds

\[
\mathcal{M}_k := \{X \in \mathbb{R}^{m\times n} : \operatorname{rank}(X) = k\},
\qquad
\mathrm{St}_p^n := \{X \in \mathbb{R}^{n\times p} : X^\top X = I_p\}. \tag{1.0.1}
\]
Nevertheless, a whole theoretical apparatus coming from Riemannian geometry is needed in order to design an optimization process for a map f : M → R. One can also ask how hard it is to optimize convex or Lipschitz maps defined on a Riemannian manifold. Recently, nonsmooth optimization on Riemannian manifolds has been investigated in [29, 13, 17, 19]. In this Thesis we deal closely with the sets (1.0.1), as they are the natural setting for many issues arising in linear algebra. We consider smooth and nonsmooth formulations of two main problems and propose algorithms and numerical simulations.

Matrix Completion The matrix completion problem is a numerical linear algebra task that consists in completing, in a unique way, a partially observed matrix. Suppose we observe a matrix A on a subset of its entries. The goal is to fill in the unknown entries so as to recover uniquely the original matrix A. The challenge lies in the size of the set of observed entries; remarkably, a whole matrix can be recovered from just a small portion of the original data.
Formally, let A be in R^{m×n} and suppose that we observe A on a subset Ω of the complete set of entries {1, ..., m} × {1, ..., n}. We denote the cardinality of this set by |Ω|. It is convenient to define the orthogonal projector P_Ω onto the set of indices Ω as follows:
\[
P_\Omega : \mathbb{R}^{m\times n} \to \mathbb{R}^{m\times n},
\qquad
(P_\Omega(X))_{i,j} =
\begin{cases}
X_{i,j}, & \text{if } (i,j) \in \Omega,\\
0, & \text{otherwise}.
\end{cases}
\]

We denote by A_Ω = P_Ω(A) the known entries of the matrix A. The action of P_Ω is schematized in Figure 1.1. The matrix completion problem consists in finding a matrix

Figure 1.1: Matrix completion problem: the action of the projection operator P_Ω. In black, the entries set to zero.

X satisfying P_Ω(X) = A_Ω. In this Thesis, we assume an a priori knowledge of the solution's rank and we consider, on M_k,
\[
\min_{X\in\mathcal{M}_k} \tfrac{1}{2}\,\|P_\Omega(X - A)\|_F^2.
\]
This formulation is robust, but unfortunately it cannot deal with localized types of noise. This is often the case in applications, and one prefers to include convex penalties in the cost function to deal with the presence of outliers. For this purpose, we investigate a generalization proposed in [19, 17] in the nonsmooth Riemannian setting for the completion of matrices.
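As an illustration, the projector P_Ω and the completion cost above take only a few lines of MatLab. This is a minimal sketch under our own assumptions: the index set Ω is stored as a logical mask Omega of the same size as A, and all variable names are ours, not from the thesis.

```matlab
% Minimal sketch: matrix completion cost on a rank-k candidate X.
% Assumptions (ours): Omega is an m-by-n logical mask of observed entries,
% A is the (partially) known data matrix.
m = 100; n = 100; k = 5;
A = rand(m,k)*rand(k,n);                     % low-rank test matrix
Omega = rand(m,n) < 0.3;                     % observe ~30% of the entries

P_Omega = @(X) X .* Omega;                   % orthogonal projector onto Omega
f = @(X) 0.5*norm(P_Omega(X - A),'fro')^2;   % completion cost

X0 = rand(m,k)*rand(k,n);                    % some rank-k candidate
fprintf('f(X0) = %.4e\n', f(X0));
```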

The eigenvalue problem The eigenvalue problem is the cornerstone of every linear algebra course and, despite its age, it still occupies a central position in ongoing numerical research. For a good introduction we refer to the monograph [16]. Formally, given a matrix A ∈ R^{n×n}, we are interested in the eigenpairs (λ, v) satisfying

Av = λv.

In the 20th century, iterative methods to approximate an eigenpair were proposed for the first time. The most basic are the power method and inverse iteration, for the largest

and smallest eigenvalue, respectively. To approximate multiple eigenvectors at the same time, we can rely instead on subspace iteration. For a more efficient approach, one prefers a Krylov subspace method, building the set

\[
\mathcal{K}_r(A, b) = \operatorname{span}\{b, Ab, A^2 b, \dots, A^{r-1} b\},
\]
and then performing a Ritz extraction on a smaller matrix to obtain approximate eigenpairs of A. In recent years, the growth in computational capacity has challenged researchers to design algorithms for ever larger eigenvalue problems. In this Thesis, we approach the eigenvalue problem by means of the Riemannian optimization of the Rayleigh quotient

\[
\rho_A(X) = \operatorname{trace}\big(X^\top A X\,(X^\top X)^{-1}\big). \tag{1.0.2}
\]

We show how the local minima of this map are related to the eigenvectors of a symmetric matrix A. As a warm up, we consider a Riemannian steepest descent approach on the unit sphere S^{n-1} to approximate the smallest eigenpair (λ_min, v). Finally, we investigate the eigenvalue problem for structured symmetric matrices A that admit a localized basis of eigenvectors. The goal is to propose a penalized formulation of (1.0.2),

\[
\min_{X \in \mathrm{St}_p^n} \operatorname{trace}(X^\top A X) + \lambda\,\|\operatorname{vec}(X)\|_{\ell_1},
\]
to seek a sparse basis of p eigenvectors with a nonsmooth Riemannian approach on the manifold St_p^n.

Outline of the Thesis The Thesis is structured as follows. In Chapter 2, we recall the most important tools in differential and Riemannian geometry. In Chapter 3, we propose a Riemannian CG method with a special emphasis on the matrix completion problem. Several numerical results are discussed for the completion of different low-rank matrices, and an application in computational imaging is proposed. In Chapter 4, we discuss the generalization of certain notions, coming from convex analysis, to the Riemannian setting. A numerical approach based on a subdifferential set is proposed. Finally, in Chapter 5, nonsmooth strategies are applied to the eigenvalue problem and numerical results are shown. We conclude the Thesis with some nonsmooth formulations for the matrix completion problem. All the numerical outputs of this work are obtained in MatLab.

Chapter 2

Geometrical background for Riemannian optimization

In this Chapter, we describe the properties of a manifold M ⊂ R^n in order to understand a Riemannian geometric descent method. We will not recall the basic ingredients of differential geometry, such as charts, atlases or smooth maps, except for the purpose of fixing notation with the reader. On the other hand, we introduce the basic theory of differential and Riemannian geometry, such as tangent spaces, the Riemannian metric, geodesics and retractions. Throughout this part, we will denote by M, N general manifolds, by (U, ψ) a general chart and by f : M → R a smooth map in the sense of manifolds. Moreover, we recall that an embedded submanifold M ⊂ N is an immersed manifold for which the inclusion map M → N is a topological embedding, i.e. the subspace topology of M coincides with the manifold topology.

2.1 First order geometry

The aim of this section is to introduce the ingredients of differential geometry needed to properly set the basis for a simple Riemannian descent optimization method. In unconstrained optimization, a function is minimized step by step by looking for suitable line searches containing the successive approximations; on manifolds, due to the lack of a linear structure, descent directions live outside the active set. The natural environment in which to generalize these concepts is the tangent space. One way to introduce the tangent space of a manifold M at a point x is by means of smooth curves γ : R → M. Let F_x(M) be the set of smooth maps f : M → R defined around x. We define the tangent vector to the curve γ as the mapping γ̇(0) from F_x(M) to R,

\[
\dot\gamma(0)f := \frac{d\, f(\gamma(t))}{dt}\bigg|_{t=0}, \qquad \text{for every } f \in \mathcal{F}_x(\mathcal{M}).
\]

We are ready to define a tangent vector to a manifold.

Definition 2.1.1. We define a tangent vector at the point x, denoted by ξ_x, as a mapping F_x(M) → R such that there exists a curve γ on M with γ(0) = x satisfying
\[
\xi_x f := \dot\gamma(0) f = \frac{d\, f(\gamma(t))}{dt}\bigg|_{t=0}.
\]
The collection of tangent vectors is called the tangent space at the point x, denoted by T_xM.

In the notation ξ_x we stress the fact that x represents the "foot" of the vector and γ is a curve through x that realizes the tangent vector. For a sketch of the tangent space geometry, see Figure 2.1.

Figure 2.1: Tangent space T_xM and a tangent vector ξ_x realized by a curve γ defined on M.

It can be shown that the mapping γ̇(0) is entirely characterized by its values in a neighborhood of x. Moreover, T_xM is a vector space, as the next proposition shows.

Proposition 2.1.2. Let M be a d-dimensional manifold and x ∈ M. Then, TxM is a vector space.

Proof. Consider ξx, ηx ∈ TxM. For scalars a, b ∈ R we define the map Fx(M) → R

\[
(a\xi_x + b\eta_x)f := a(\xi_x f) + b(\eta_x f).
\]
We need to show that (aξ_x + bη_x) ∈ T_xM. Indeed, let x be in U ⊂ R^d for a chart (U, ψ), and let γ_1, γ_2 be two curves through x realizing ξ_x and η_x, i.e. γ̇_1(0) = ξ_x and γ̇_2(0) = η_x. We consider the path
\[
\gamma(t) := \psi^{-1}\big(a\,\psi(\gamma_1(t)) + b\,\psi(\gamma_2(t))\big).
\]
Then γ(0) = x and
\[
\begin{aligned}
\frac{d\, f(\gamma(t))}{dt}\bigg|_{t=0}
&= \nabla(f\circ\psi^{-1})\,\frac{d}{dt}\big(a\,\psi\circ\gamma_1 + b\,\psi\circ\gamma_2\big)(0)\\
&= a\,\nabla(f\circ\psi^{-1})\,\frac{d}{dt}(\psi\circ\gamma_1)(0) + b\,\nabla(f\circ\psi^{-1})\,\frac{d}{dt}(\psi\circ\gamma_2)(0)\\
&= a\,\frac{d\, f(\gamma_1(t))}{dt}\bigg|_{t=0} + b\,\frac{d\, f(\gamma_2(t))}{dt}\bigg|_{t=0}\\
&= a\,\xi_x f + b\,\eta_x f,
\end{aligned}
\]

where, by noticing that
\[
f\circ\psi^{-1} : \mathbb{R}^d \to \mathbb{R}, \qquad
a\,\psi\circ\gamma_1 : \mathbb{R} \to \mathbb{R}^d, \qquad
b\,\psi\circ\gamma_2 : \mathbb{R} \to \mathbb{R}^d,
\]
we applied in the second equality the chain rule for the composition f(γ(t)) = (f ∘ ψ^{-1}) ∘ (aψ ∘ γ_1 + bψ ∘ γ_2), and in the third equality the chain rule backwards for f ∘ γ_1 and f ∘ γ_2. This concludes the proof, since we have exhibited a path γ satisfying
\[
(a\xi_x + b\eta_x)f = \frac{d\, f(\gamma(t))}{dt}\bigg|_{t=0}.
\]

An equivalent definition of tangent vectors can be introduced with the concept of derivations. See [21] for a rigorous theoretical treatment. In the submanifold case, we can practically work with tangent vectors as the next example shows.

Example 2.1.1 (Tangent space for S^{n-1}). The unit sphere is the embedded submanifold of R^n formed by unit vectors, denoted by S^{n-1} := {x ∈ R^n : x^⊤x = 1}. The aim of this example is to characterize the tangent spaces of the unit sphere as linear subspaces of R^n. Let x_0 be in S^{n-1} and let x(t) be a curve on S^{n-1} through x_0, i.e. x(t)^⊤x(t) = 1 for every t and x(0) = x_0. Differentiating, we get ẋ(t)^⊤x(t) = 0 and, in particular, ẋ(0)^⊤x_0 = 0. Defining Z := {z ∈ R^n : z^⊤x_0 = 0}, we thus conclude that T_{x_0}S^{n-1} ⊂ Z. Conversely, if z ∈ Z, we define x(t) := (x_0 + tz)/‖x_0 + tz‖, which belongs to S^{n-1} for every t and is such that ẋ(0) = z. Finally, we have the equality

\[
T_{x_0} S^{n-1} = \{z \in \mathbb{R}^n : z^\top x_0 = 0\}.
\]

Next, we introduce the concept of differential for a smooth map. Let F : M → N be a smooth mapping between two manifolds. Let ξ_x be in T_xM; then (dF(x)[ξ])f := ξ(f ∘ F) defines a mapping F_{F(x)}(N) → R, which is a tangent vector, i.e. dF(x)[ξ] ∈ T_{F(x)}N. The differential of F is then defined as follows.

Definition 2.1.3. Let F : M → N be a smooth map between two manifolds. We define the differential of F at the point x ∈ M, denoted by dF(x), as the mapping
\[
dF(x) : T_x\mathcal{M} \to T_{F(x)}\mathcal{N}, \qquad \xi \mapsto dF(x)[\xi].
\]
Finally, we define the tangent bundle as the set of all tangent vectors,
\[
T\mathcal{M} := \bigcup_{x\in\mathcal{M}} T_x\mathcal{M}.
\]
The tangent bundle is the natural environment for vector fields. They can be pictured as a collection of tangent arrows attached to each point of the manifold. More rigorously, a vector field ξ is a smooth function ξ : M → TM that assigns to each point x ∈ M a tangent vector ξ_x ∈ T_xM. In other words, a vector field is a section of the tangent bundle of the manifold.

Example 2.1.2 (An example from optimization). The Euclidean steepest descent method needs, at each iteration, the steepest direction in order to compute the successive approximation. In R^n, for a function f : R^n → R, we define the directional derivative of f at the point x in the direction ξ as the limit
\[
Df(x)[\xi] := \lim_{t\to 0} \frac{f(x + t\xi) - f(x)}{t}. \tag{2.1.1}
\]

A simple application of the chain rule produces Df(x)[ξ] = ⟨∇f(x), ξ⟩_{R^n}. The steepest (normalized) ascent direction then becomes
\[
v := \operatorname*{argmax}_{\|\xi\|=1} Df(x)[\xi]
= \operatorname*{argmax}_{\|\xi\|=1} \|\nabla f(x)\|\,\|\xi\|\cos(\theta)
= \operatorname*{argmax}_{\|\xi\|=1} \cos(\theta),
\]
θ being the angle between ∇f(x) and ξ. The maximum is attained for θ = 0, i.e. when the two vectors are collinear and v = ∇f(x)/‖∇f(x)‖. The steepest (normalized) descent direction is obtained by imposing ξ = −∇f(x)/‖∇f(x)‖. Suppose now we want to seek the minimum of f with an iterative approach. We can produce a sequence of iterates by moving, for a suitable step length, along steepest descent directions. Let x_i be an approximation of the minimizer of f. We would naturally consider the following update process:

\[
\alpha_i \leftarrow \operatorname*{argmin}_{\alpha} f(x_i + \alpha\xi), \qquad
x_{i+1} \leftarrow x_i + \alpha_i \xi.
\]

The straight curve x_i + tξ is a line search in the steepest descent direction ξ obtained from the gradient. Our intent is to investigate a geometric descent method for smooth maps defined on a manifold. A first remark concerns (2.1.1): without a linear structure on the manifold, the sum x + tξ is defined only as an element of the (affine) tangent space, on which f might not be defined. Moreover, the Euclidean gradient is strictly linked to the metric structure of R^n, and this suggests endowing the tangent space with an additional structure, the Riemannian metric.

2.2 Riemannian geometry

In this section, we generalize the concept of steepest descent to smooth maps defined on M. As pointed out in Example 2.1.2, we miss a notion of metric on T_xM to properly define a gradient vector. To fix this, we equip the manifold with a smoothly varying scalar product g_x : T_xM × T_xM → R. We call g the Riemannian metric and the couple (M, g) a Riemannian manifold, provided x ↦ g_x(ξ_x, η_x) is smooth for an arbitrary couple of vector fields ξ, η. The Riemannian metric turns the tangent space into a normed vector space by defining
\[
\|\xi_x\|_{g_x} := \sqrt{g_x(\xi_x,\xi_x)}.
\]

From now on, we shall write g(ξ, ζ) and ‖ξ‖_g for tangent vectors ξ_x, ζ_x ∈ T_xM, to avoid a heavy notation for the metric and tangent vectors. We use interchangeably g(·, ·) = ⟨·, ·⟩_g for the scalar product; it will be clear whether we are dealing with tangent vectors or vector fields. Naturally, we define the Riemannian gradient grad f(x) ∈ T_xM as the unique tangent vector satisfying

\[
\langle \operatorname{grad} f(x), \xi\rangle_g = df(x)[\xi], \qquad \text{for all } \xi \in T_x\mathcal{M}.
\]
It is worthwhile to treat more specifically the case of an embedded submanifold M ⊂ R^n, where the Riemannian gradient admits an easy representation. The matrix case M ⊂ R^{n×m} is included in the discussion thanks to the identification vec : R^{n×m} → R^{nm}, which stacks the columns of a matrix one on top of another. The couple (M, g), where g is obtained by restricting the Euclidean inner product to the tangent bundle, is a Riemannian submanifold whose metric g is constant w.r.t. the foot x ∈ M. In this situation we will use ⟨·, ·⟩_g when the metric g does not vary with x. We call the normal space of M at the point x the space
\[
N_x\mathcal{M} = (T_x\mathcal{M})^{\perp} := \{\xi \in \mathbb{R}^n : g(\eta, \xi) = 0 \ \ \forall \eta \in T_x\mathcal{M}\}.
\]
In numerical linear algebra optimization problems, the tangent and normal space are often nested in R^n, where the cost function is still defined. We can rely on the Hilbert projection theorem (proved, for instance, in [26]) to characterize the components of a vector w.r.t. a linear subspace.

Theorem 2.2.1. Let G ⊂ H be a closed and convex set of a Hilbert space H. Then, for every x ∈ H, there exists a unique element x̂ ∈ G, called the projection, satisfying

\[
\|x - \hat x\|_{\mathcal{H}} = \inf_{g\in G} \|x - g\|_{\mathcal{H}}.
\]
Moreover, we denote by
\[
P_G : \mathcal{H} \to G, \qquad x \mapsto P_G(x) := \operatorname*{argmin}_{g\in G} \|x - g\|_{\mathcal{H}},
\]
the metric projector over the set G.

Remark. Let G ⊂ H be a closed linear subspace of a Hilbert space H. Then the result of Theorem 2.2.1 still holds. The Euclidean gradient of a smooth map f then admits the following decomposition,

\[
\nabla f(x) = P_{T_x\mathcal{M}}(\nabla f(x)) + P_{N_x\mathcal{M}}(\nabla f(x)),
\]

where P_{T_xM}(·) and P_{N_xM}(·) are the metric projectors onto the tangent and the normal space, respectively. It can be proved that the Riemannian gradient is exactly the tangent component,

\[
\operatorname{grad} f(x) = P_{T_x\mathcal{M}}(\nabla f(x)). \tag{2.2.1}
\]
When optimizing on an embedded manifold, this result is exploited efficiently, as we will see in several cases. If the Euclidean gradient of f is known and the tangent space admits a known structure, then the Riemannian gradient can be computed by projection.
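For instance, for the Rayleigh quotient f(x) = x^⊤Ax on the unit sphere (treated later in Example 2.4.1), formula (2.2.1) can be coded directly, since the tangent projector of Example 2.1.1 is explicit. The following is a minimal sketch with variable names of our own choosing.

```matlab
% Minimal sketch: Riemannian gradient by projection, formula (2.2.1),
% for f(x) = x'*A*x on the unit sphere S^{n-1}.
n = 6;
A = randn(n); A = (A + A')/2;        % symmetric test matrix
x = randn(n,1); x = x/norm(x);       % point on the sphere

egrad = 2*A*x;                       % Euclidean gradient of x'*A*x
rgrad = egrad - (x'*egrad)*x;        % projection onto T_x S^{n-1}
fprintf('rgrad''*x = %.2e (tangency check)\n', rgrad'*x);
```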

2.2.1 Riemannian steepest descent

We are now ready to define the Riemannian steepest (normalized) descent direction for f at the point x ∈ M as the tangent vector ξ_x ∈ T_xM,

\[
\xi_x = -\frac{\operatorname{grad} f(x)}{\|\operatorname{grad} f(x)\|_{g_x}}. \tag{2.2.2}
\]

This vector provides a direction along which f locally decreases but, as we will discuss later, conjugate directions are preferable for optimization purposes.

2.3 Distances on manifolds and geodesics

The Riemannian metric g induces a norm on the tangent spaces that can be exploited to measure the length of curves γ : [0, 1] → M on the manifold via the arc length integral
\[
L(\gamma) := \int_0^1 \|\dot\gamma(t)\|_g \, dt,
\]
where the norm may vary with the point γ(t). This allows us to equip the Riemannian manifold with a notion of distance as follows: we denote Γ := {γ : [0, 1] → M, γ smooth} and we define the Riemannian distance as the minimal arc length over all possible curves,

\[
d_g : (\mathcal{M}, g)\times(\mathcal{M}, g) \to \mathbb{R}, \qquad (x, y) \mapsto d_g(x, y) := \inf_{\gamma\in\Gamma} L(\gamma).
\]

It is reasonable to ask whether the infimum is attained by a curve on the manifold, at least locally for the reasonably close points we are interested in in optimization. In Euclidean spaces, straight lines minimize the distance between two points; we are now interested in generalizing the concept of straight lines, i.e. curves with zero acceleration, to Riemannian manifolds, where they are called geodesics. In a general framework, to define a geodesic we need to introduce the further structure of an affine connection to properly define the acceleration vector γ̈(0). For the Riemannian submanifolds that we encounter in this Thesis, we follow instead a simpler approach. A vector field along a curve γ is a smooth map ξ : I → R^n that assigns to each point γ(t) ∈ M a tangent vector ξ(t) ∈ T_{γ(t)}M, for every t ∈ I. In general, the velocity field ξ'(t), where (·)' stands for the usual derivative in R^n, is not necessarily part of the tangent space T_{γ(t)}M. Its component on the tangent space is called the covariant derivative, denoted by ∇ξ(t) := P_{T_{γ(t)}M} ξ'(t). Notice that γ'(t) is indeed a vector field along γ, i.e. γ'(t) ∈ T_{γ(t)}M, hence it is possible to evaluate its covariant derivative ∇γ'(t) = P_{T_{γ(t)}M} γ''(t). This yields the following definition.

Definition 2.3.1. Let M ⊂ R^n be a smooth Riemannian submanifold and I ⊂ R an interval. A smooth curve γ : I → M is called a geodesic if ∇γ'(t) = 0 for all t ∈ I.

Example 2.3.1 (Geodesics on S^{n-1}). We consider the Riemannian manifold (S^{n-1}, g), where g is the constant Riemannian metric
\[
g : T_x S^{n-1} \times T_x S^{n-1} \to \mathbb{R}, \qquad (\xi, \eta) \mapsto g(\xi, \eta) = \xi^\top\eta,
\]
for every x ∈ S^{n-1}. The Riemannian metric is simply the restriction of the Euclidean inner product to the unit sphere. The geodesics on S^{n-1} are curves between points with minimal arc length, also called great circles. A geodesic is uniquely prescribed in terms of the initial value γ(0) ∈ S^{n-1} and its velocity γ̇(0) ∈ T_{γ(0)}S^{n-1} as
\[
\gamma(t) = \cos(\|\dot\gamma(0)\| t)\,\gamma(0) + \sin(\|\dot\gamma(0)\| t)\,\frac{\dot\gamma(0)}{\|\dot\gamma(0)\|}.
\]
By direct manipulation, it can be proved that γ(t)^⊤γ(t) = 1 for every t, so that it is indeed a curve on the sphere, but also that its covariant derivative is zero, as the following computation shows:
\[
\begin{aligned}
\nabla\gamma'(t) &= (I - \gamma(t)\gamma(t)^\top)\,\gamma''(t)\\
&= -\|\dot\gamma(0)\|^2 (I - \gamma(t)\gamma(t)^\top)\,\gamma(t)\\
&= -\|\dot\gamma(0)\|^2 (\gamma(t) - \gamma(t)) = 0,
\end{aligned}
\]

where in the second equality we used the fact that γ''(t) = −‖γ̇(0)‖² γ(t).
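The closed form above is easy to check numerically. The following MatLab sketch (function handle names are ours) evaluates the geodesic and verifies that it stays on the sphere.

```matlab
% Minimal sketch: geodesic on S^{n-1} emanating from x with velocity xi,
% following the closed form of Example 2.3.1.
sphere_geodesic = @(x, xi, t) cos(norm(xi)*t)*x + sin(norm(xi)*t)*xi/norm(xi);

n  = 4;
x  = randn(n,1); x = x/norm(x);         % base point on the sphere
xi = randn(n,1); xi = xi - (x'*xi)*x;   % tangent vector at x
t  = 0.7;
gt = sphere_geodesic(x, xi, t);
fprintf('norm(gamma(t)) = %.6f (should be 1)\n', norm(gt));
```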

2.4 The exponential map and retractions

The sphere S^{n-1} is a good example of a complete manifold, where geodesics are defined on the manifold for every time t ∈ R. In general, one cannot hope that the solution of the second order differential equation involved in Definition 2.3.1 exists for every t. Thanks to the Picard–Lindelöf theorem (see, e.g., [28]), it can be shown that for a given tangent vector ξ_x ∈ T_xM there exists an interval I_{x,ξ} containing 0 where a unique geodesic γ(t) : I_{x,ξ} → M with γ(0) = x and γ̇(0) = ξ_x is defined. Moreover, one can always choose I_{x,ξ} ⊂ R such that it also contains 1. This translates into the existence of a sufficiently small neighborhood U_x ⊂ T_xM for which the exponential map

\[
\exp_x : \mathcal{U}_x \subset T_x\mathcal{M} \to \mathcal{M}, \qquad \xi \mapsto \exp_x(\xi) = \gamma(1)
\]
is well defined. The role of the exponential map in Riemannian optimization is to identify line searches on M when a descent direction is provided. Suppose we have found a descent direction ξ_x according to (2.2.2): to move on M in this direction we can consider the path exp_x(tξ). Unfortunately, even when the differential equation involved in Definition 2.3.1 is completely understood, computing geodesics may not be feasible within an iterative optimization process. Vandereycken has shown in his PhD thesis [30, Propositions 3.32-3.25] that the geodesics of the low-rank manifold S_+^{n,k} := {YY^⊤ : Y ∈ R^{n×p}, rank(Y) = k} of fixed-rank positive-semidefinite matrices can be computed from a differential system that is not numerically well conditioned. He still proposes a numerical approach, but one prefers instead to work with retractions, which are a first order approximation of exp_x(·). From the optimization point of view, a computationally friendly tangent-to-manifold map is preferable.

Figure 2.2: Retraction mapping R_x : T_xM → M.

Definition 2.4.1. A retraction on a manifold M is a smooth mapping R : T M → M with the following properties. Let Rx be the restriction of R to TxM:

(i) R_x(0_x) = x, where 0_x is the zero element of T_xM.

(ii) dRx(0x) = idTxM, where idTxM is the identity mapping on TxM.

The condition (ii) is called local rigidity, and it states that dR_x behaves locally, around the origin of T_xM, as the identity map id_{T_xM}. Working with retractions, we can consider the modified update process that generates successive approximations in a meaningful way:

\[
\begin{aligned}
\alpha_i &\leftarrow \operatorname*{argmin}_{\alpha} f(R_{x_i}(\alpha\,\xi_{x_i})),\\
x_{i+1} &\leftarrow R_{x_i}(\alpha_i\,\xi_{x_i}).
\end{aligned} \tag{2.4.1}
\]

For a sketch of the retraction map, see Figure 2.2. The line search R_{x_i}(tξ_{x_i}) is perfectly well defined as a curve on the manifold along which the cost function is minimized in the direction of steepest descent. As R is called at each step i, we would like it to be computationally friendly. To this end, the metric projection can be exploited to define a retraction; we report the result of [2, Proposition 5].

Proposition 2.4.2 (Projective retraction). Let M ⊂ E be a closed manifold embedded in a vector space E. Let PM : E → M be the metric projector (Theorem 2.2.1) and G : T M → E be the smooth mapping (x, u) 7→ x + u. Then

\[
P_{\mathcal{M}} \circ G : T\mathcal{M} \to \mathcal{M}, \qquad (x, u) \mapsto P_{\mathcal{M}}(x + u)
\]
is a retraction.

The update rule (2.4.1), combined with the retraction mapping, gives us a Riemannian steepest descent method that involves a nonlinear optimization step at each iteration. It is convenient to consider the linearized step

\[
t_i \leftarrow \operatorname*{argmin}_{t} f(x_i + t\,\xi_{x_i}),
\]

which is effective when the function f admits an extension, for example to R^n. However, to be sure to move along a descent direction, it is better to consider t_i as an "initial guess" while ensuring a suitable step α_i on the line search with the Armijo backtracking rule:

\[
\text{Find the smallest integer } m \ge 0 \ \text{ s.t. } \quad
f(x_i) - f\big(R_{x_i}(\delta^m t_i \eta_i)\big) \ge -\sigma\,\langle \eta_i, \delta^m t_i \eta_i\rangle_g, \tag{2.4.2}
\]

where σ ∈ (0, 1) and δ ∈ (0, 1) have to be set, and the Armijo step size is α_i = δ^m t_i.
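The backtracking rule (2.4.2) is a short loop in MatLab. The following sketch instantiates it on the unit sphere for the Rayleigh quotient; the parameter values σ = 10^{-4} and δ = 0.5 follow Algorithm 1 of Chapter 3, and all function handles and variable names are our own illustration, not from the thesis.

```matlab
% Minimal sketch of the Armijo backtracking rule (2.4.2) on S^{n-1}.
n = 6; A = randn(n); A = (A + A')/2;
f    = @(x) x'*A*x;                          % Rayleigh quotient
retr = @(x, xi) (x + xi)/norm(x + xi);       % projective retraction on the sphere

x   = randn(n,1); x = x/norm(x);
g   = 2*A*x; eta = -(g - (x'*g)*x);          % steepest descent direction
t   = 1;                                     % initial guess for the step
sigma = 1e-4; delta = 0.5;

m = 0; x_new = retr(x, t*eta);
while f(x) - f(x_new) < -sigma*(eta'*(delta^m*t*eta)) && m < 50
    m = m + 1;                               % shrink the step until (2.4.2) holds
    x_new = retr(x, delta^m*t*eta);
end
alpha = delta^m * t;                         % accepted Armijo step size
```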

2.4.1 Convergence Analysis

We are now ready to provide a convergence result for the iterates {x_i} produced by the derived retraction-based update rule:

\[
\begin{aligned}
\eta_i &\leftarrow -\operatorname{grad} f(x_i)\\
t_i &\leftarrow \operatorname*{argmin}_{t} f(x_i + t\eta_i)\\
\alpha_i &\leftarrow \text{Armijo backtracking}\\
x_{i+1} &\leftarrow R_{x_i}(\alpha_i \eta_i).
\end{aligned} \tag{2.4.3}
\]

We notice that the sequence of tangent vectors η_i is such that ⟨grad f(x_i), η_i⟩_g < 0 for the trivial choice η_i = −grad f(x_i). This is a particular case of a more general gradient-related property of sequences of directions. Obviously, the topology of the manifold will play an important role in characterizing accumulation points for a smooth map f. We follow [1, Chapter 4] for this part. A manifold can be considered equipped with the topology induced by a maximal atlas. The open sets in the atlas topology are the sets V ⊂ M such that for every x ∈ V there exists a chart (U, ψ) with x ∈ U ⊂ V. We can characterize accumulation points for sequences in the manifold according to this topology, but we give instead an equivalent definition of convergence in the coordinate domain.

Definition 2.4.3. A sequence {x_i}_{i=0}^{∞} ⊂ M is said to be convergent if there exist a chart (U, ψ), an integer I such that x_i ∈ U for i ≥ I, and a point x_* ∈ M so that ψ(x_i) converges to ψ(x_*).

The uniqueness of the limit is not to be taken for granted: we require the manifold to be at least Hausdorff for a meaningful convergence analysis (and this will always be the case for embedded submanifolds of R^n). Finally, we conclude with the convergence result.

Theorem 2.4.4. Let {x_i}_{i=0}^{∞} be an infinite sequence produced by the update rule (2.4.3). Then every accumulation point is a critical point of f. Moreover, if M is compact, we have
\[
\lim_{i\to\infty}\|\operatorname{grad} f(x_i)\|_g = 0.
\]
We conclude the section with an illustrative example of convergence on a compact manifold.

Example 2.4.1 (Smooth optimization on S^{n-1}). We consider the Riemannian manifold (S^{n-1}, g) as in Example 2.3.1. According to Proposition 2.4.2, a retraction mapping is given by ξ ↦ R_x(ξ) = (x + ξ)/‖x + ξ‖ for every ξ ∈ T_xS^{n-1}, where the projection onto the sphere translates into a normalization step. Given a positive-semidefinite matrix A ∈ R^{n×n}, we seek the minimum eigenvalue λ_min and the corresponding eigenvector v. Inverse iteration and the Lanczos algorithm can already deal with this problem, but we consider in this Thesis the following Riemannian optimization of the Rayleigh quotient,
\[
\min_{x\in S^{n-1}} x^\top A x,
\]
to approximate the eigenpair (λ_min, v). It is known that the Rayleigh quotient is minimized by the eigenvector associated with the smallest eigenvalue; we postpone this theoretical discussion to Section 5.1. In Figure 2.3, we apply the update rule (2.4.3) to approximate the smallest eigenpair of a randomly generated s.d.p. 3-by-3 matrix.

Figure 2.3: Rayleigh quotient Riemannian optimization: iterates produced by the steepest descent method.

We can see the iterates x_i converging to the eigenvector v computed with the eig command in MatLab and, at the same time, we remark that the quantity x_i^⊤ A x_i converges to λ_min.
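For completeness, the whole update rule (2.4.3) for this example fits in a few lines of MatLab. This is a minimal sketch under our own naming; the exact line-search guess below follows from the quadratic form of f along x + tη, and the Armijo constants are the ones used later in Algorithm 1.

```matlab
% Minimal sketch of the Riemannian steepest descent (2.4.3) for the
% Rayleigh quotient on S^{n-1}.
n = 3;
A = randn(n); A = A'*A;                      % random s.p.d. test matrix
x = randn(n,1); x = x/norm(x);
retr = @(y, xi) (y + xi)/norm(y + xi);

for i = 1:500
    g   = 2*A*x;
    eta = -(g - (x'*g)*x);                   % Riemannian steepest descent direction
    if norm(eta) <= 1e-8, break; end
    t = -(x'*A*eta)/(eta'*A*eta);            % exact minimizer of f(x + t*eta) along the line
    % Armijo backtracking as in (2.4.2), sigma = 1e-4, delta = 0.5
    m = 0; xnew = retr(x, t*eta);
    while x'*A*x - xnew'*A*xnew < -1e-4*(eta'*(0.5^m*t*eta)) && m < 50
        m = m + 1;
        xnew = retr(x, 0.5^m*t*eta);
    end
    x = xnew;
end
[V, D] = eig(A);
fprintf('x''Ax = %.6f, lambda_min = %.6f\n', x'*A*x, min(diag(D)));
```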

2.5 Vector Transport

Finally, we would like to consider directions different from the steepest descent, as in the conjugate gradient method (CG) in R^n. We need to properly design conjugate directions on a manifold, with special consideration for embedded matrix manifolds. The key role is played by the vector transport mapping: to linearly combine the steepest descent with previous tangent vectors, we need to transport information from one tangent space to another. We will discuss in this section only the case of embedded submanifolds of the Euclidean space R^n. We consider results on vector transport from [1], while we refer to [21] for a complete and rigorous discussion. For an embedded submanifold M ⊂ R^n and x, y ∈ M, translating and then projecting

a tangent vector ξ_x ∈ T_xM to T_yM yields the following vector transport mapping,
\[
\mathcal{T}_{x\to y} : T_x\mathcal{M} \to T_y\mathcal{M}, \qquad \xi_x \mapsto P_{T_y\mathcal{M}}(\xi_x). \tag{2.5.1}
\]

For an intuitive sketch of the vector transport action, see Figure 2.4.

Figure 2.4: Vector transport mapping T_{x→y} : T_xM → T_yM.

Example 2.5.1 (Parallel transport along geodesics). On certain manifolds, such as the unit sphere, one has at one's disposal parallel transport maps along geodesics. On S^{n-1}, we can exploit the geodesics derived in Example 2.3.1 to transport tangent vectors. Let x be in S^{n-1} and ξ_x ∈ T_xM. We consider the map

\[
\mathcal{T}_{x\to\gamma(t)}(\xi) := \Big(I_n + (\cos(\|\dot\gamma(0)\| t) - 1)\,u u^\top - \sin(\|\dot\gamma(0)\| t)\, x u^\top\Big)\,\xi_x,
\]

where γ(t) ∈ M for t ∈ I with γ(0) = x and u = γ̇(0)/‖γ̇(0)‖. The map T_{x→γ(t)}(ξ) transports ξ_x to the tangent spaces T_{γ(t)}M in a parallel manner.
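The formula can be checked numerically: a transported vector must remain tangent at the transported foot γ(t). The sketch below uses anonymous functions of our own naming.

```matlab
% Minimal sketch: parallel transport along a sphere geodesic (Example 2.5.1).
n  = 5;
x  = randn(n,1);  x  = x/norm(x);
dg = randn(n,1);  dg = dg - (x'*dg)*x;        % geodesic velocity gamma_dot(0)
xi = randn(n,1);  xi = xi - (x'*xi)*x;        % tangent vector to transport
u  = dg/norm(dg); s = norm(dg);

transport = @(t) (eye(n) + (cos(s*t)-1)*(u*u') - sin(s*t)*(x*u'))*xi;
geodesic  = @(t) cos(s*t)*x + sin(s*t)*u;

t  = 0.4;
y  = geodesic(t);
zt = transport(t);
fprintf('zt''*y = %.2e (transported vector stays tangent)\n', zt'*y);
```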

2.5.1 Riemannian conjugate gradients

Suppose we have already obtained the iterates x_{i-1}, x_i and a conjugate direction η_{i-1}. We can obtain the next conjugate direction by linearly combining the Riemannian gradient ξ_i = grad f(x_i) with the transported tangent vector T_{x_{i-1}→x_i}(η_{i-1}). For conjugate directions, a possible factor for the linear combination is the Polak–Ribière+ update,
\[
\beta_i = \max\left\{0,\ \frac{\langle \xi_i,\ \xi_i - \mathcal{T}_{x_{i-1}\to x_i}(\xi_{i-1})\rangle_g}{\|\xi_{i-1}\|_g^2}\right\},
\]

presented in [1] and adapted to the Riemannian setting. We thus have the Riemannian conjugate gradient update rule:

\[
\begin{aligned}
\eta_i &\leftarrow -\xi_i + \beta_i\,\mathcal{T}_{x_{i-1}\to x_i}(\eta_{i-1})\\
t_i &\leftarrow \operatorname*{argmin}_{t} f(x_i + t\eta_i)\\
\alpha_i &\leftarrow \text{Armijo backtracking}\\
x_{i+1} &\leftarrow R_{x_i}(\alpha_i\eta_i).
\end{aligned} \tag{2.5.2}
\]

Chapter 3

Geometric CG for Matrix Completion

In this Chapter we face more closely the matrix completion problem presented in the Introduction. After a first review of the literature, to understand which assumptions lead to uniqueness of the completion, we move on to the Riemannian setting. The Riemannian geometry of M_k, the set of fixed rank-k matrices, is then described in order to propose a Geometric CG method. We conclude the Chapter with an exhaustive error analysis.

3.1 Different formulations

Let A be in R^{m×n} and let P_Ω be the orthogonal projector
\[
P_\Omega : \mathbb{R}^{m\times n} \to \mathbb{R}^{m\times n},
\qquad
(P_\Omega(X))_{i,j} =
\begin{cases}
X_{i,j}, & \text{if } (i,j) \in \Omega,\\
0, & \text{otherwise},
\end{cases}
\]
over a subset Ω of the complete set of entries {1, ..., m} × {1, ..., n}. We denote by A_Ω the observations P_Ω(A). The matrix completion problem consists in finding a matrix X from the equation
\[
P_\Omega(X) = A_\Omega. \tag{3.1.1}
\]
Clearly, there exists an infinite number of matrices X that agree with A on Ω, thus the problem is ill posed without further assumptions. We translate this in terms of the kernel of P_Ω(·) by noticing that the matrix system is equivalent to finding X s.t. P_Ω(X − A) = 0. The mapping is not injective and the smaller |Ω| is, the larger the kernel Ker(P_Ω) becomes. Moreover, equation (3.1.1) can be turned into a vector system by noticing that
\[
y = \operatorname{vec}(A_\Omega) = \operatorname{vec}(P_\Omega(X)) = \operatorname{vec}(P \odot X), \tag{3.1.2}
\]
where P = P_Ω(1), with 1 being the m-by-n matrix of ones, ⊙ is the Hadamard product, and y ∈ R^{mn} is given by the linearization vec(·) of the observations. Finally, defining the operator
\[
\mathcal{A} : \mathbb{R}^{m\times n} \to \mathbb{R}^{mn}, \qquad X \mapsto \mathcal{A}(X) = \operatorname{vec}(P \odot X),
\]
we turn the matrix completion problem into a linear system A(X) = y for which A can have a very large kernel. In analogy with compressive sensing, a sparsity assumption leads instead to a well-posed problem. In the matrix case, sparsity translates into a low-rank structure. We thus consider the problem of finding a matrix X with lowest rank that agrees with A on the set Ω:

\[
\begin{aligned}
&\min_{X\in\mathbb{R}^{m\times n}} \operatorname{rank}(X),\\
&\text{subject to } P_\Omega(X) = P_\Omega(A),
\end{aligned} \tag{M0}
\]
where we avoid using the operator A in the constraint to stress the case of the matrix completion problem. A log-det heuristic approach has been proposed in [12] to face numerically a rank minimization problem with convex constraints. In general, a rank minimization problem is NP-hard. Problem (M0) is equivalent to a minimization problem with a sparse solution, since it seeks a minimizer with the sparsest vector of singular values. In the same spirit as compressive sensing (see, e.g., [14, Section 4.6]), one can look at the convex relaxation

\[
\begin{aligned}
&\min_{X\in\mathbb{R}^{m\times n}} \|X\|_*,\\
&\text{subject to } P_\Omega(X) = P_\Omega(A),
\end{aligned} \tag{M1}
\]
where, setting k = min{m, n}, we consider the singular value decomposition (SVD) of a matrix X = UΣV^⊤, with U ∈ R^{m×k}, V ∈ R^{n×k} having orthonormal columns and Σ = diag(σ_1, ..., σ_k), to define the (convex) nuclear norm of a matrix, ‖X‖_* := Σ_{i=1}^k σ_i. Notice that
\[
\sqrt{X^\top X} = \sqrt{(U\Sigma V^\top)^\top (U\Sigma V^\top)} = \sqrt{V\Sigma^2 V^\top} = V\Sigma V^\top.
\]

Since the spectrum of Σ does not change under the similarity transformation by V, we have the equivalence ‖X‖_* = trace(√(X^⊤X)). Different approaches based on nuclear norm minimization have been considered, for instance, in [9, 7]. The relaxation is meaningful whenever the minimal rank solution can be recovered via (M1). Indeed, problems (M0) and (M1) are equivalent if a suitable null rank property of A is satisfied, as the next theorem shows.

Theorem 3.1.1. Let A : C^{n_1×n_2} → C^m be a linear mapping and consider the problem

\[
\begin{aligned}
&\min_{Z\in\mathbb{C}^{n_1\times n_2}} \|Z\|_*,\\
&\text{subject to } \mathcal{A}(Z) = y.
\end{aligned} \tag{3.1.3}
\]

Set n = min{n1, n2}. We have the equivalence

(i) Every matrix X ∈ C^{n_1×n_2} with rank(X) ≤ k is the unique solution of (3.1.3) with y := A(X).

(ii) For every M ∈ Ker(A) \ {0}, with singular values σ_1(M) ≥ ... ≥ σ_n(M) ≥ 0, we have
\[
\sum_{i=1}^{k} \sigma_i(M) < \sum_{i=k+1}^{n} \sigma_i(M). \tag{null rank property}
\]

Proof. To avoid confusion among the singular values involved, we adopt in this proof the notation σ_i(B) for the i-th singular value of a matrix B. We start with (i) ⇒ (ii). Consider M ∈ Ker(A) \ {0} and its SVD M = UΣV^⊤ with Σ = diag(σ_1(M), ..., σ_n(M)). Then we set

\[
\begin{aligned}
M_1 &= U\operatorname{diag}(\sigma_1(M), \dots, \sigma_k(M), 0, \dots, 0)\,V^\top,\\
M_2 &= U\operatorname{diag}(0, \dots, 0, -\sigma_{k+1}(M), \dots, -\sigma_n(M))\,V^\top.
\end{aligned}
\]

We have the following facts

• 0 = A(M) = A(M1) − A(M2) ⇒ A(M1) = A(M2),

• rank(M1) ≤ k by construction.

By hypothesis, M_1 is the unique minimizer of (3.1.3) with y := A(M_1). By uniqueness, every other solution must have a strictly larger nuclear norm. In particular, M_2 is a different solution of the same constraint, so it satisfies ‖M_1‖_* < ‖M_2‖_*, which translates into

\[
\sum_{i=1}^{k} \sigma_i(M) < \sum_{i=k+1}^{n} \sigma_i(M).
\]

Conversely, for (ii) ⇒ (i), assume every M ∈ Ker(A) \ {0} has singular values satisfying the null rank property of order k. Consider X ∈ C^{n_1×n_2} with rank(X) ≤ k and a matrix Z ≠ X satisfying A(X) = A(Z). Set M = X − Z ∈ Ker(A) \ {0}. By [14, Lemma A.18] we have
\[
\|Z\|_* = \sum_{i=1}^{n} \sigma_i(X - M) \ge \sum_{i=1}^{n} |\sigma_i(X) - \sigma_i(M)|.
\]

For every 1 ≤ i ≤ k, |σ_i(X) − σ_i(M)| ≥ σ_i(X) − σ_i(M) holds, while for k + 1 ≤ i ≤ n we have |σ_i(X) − σ_i(M)| = σ_i(M), because rank(X) ≤ k by assumption. Finally,

\[
\|Z\|_* \ge \sum_{i=1}^{k} \sigma_i(X) - \sum_{i=1}^{k} \sigma_i(M) + \sum_{i=k+1}^{n} \sigma_i(M)
> \sum_{i=1}^{k} \sigma_i(X) = \|X\|_*,
\]
where the strict inequality is due to the hypothesis, which makes the difference of the last two sums positive. Since Z is arbitrary, the claim is proved.

This theorem gives a necessary and sufficient condition for recovering exactly the lowest rank solution of the matrix completion problem via nuclear norm minimization. In [24], an exhaustive overview of Problem (M1) is presented in the more general form of nuclear norm minimization with linear constraints. Moreover, a semidefinite programming approach is presented and an interior point numerical method is discussed. Indeed, (M1) is equivalent to the positive semidefinite formulation

\[
\begin{aligned}
&\min_{X, W_1, W_2} \ \tfrac{1}{2}\big[\operatorname{trace}(W_1) + \operatorname{trace}(W_2)\big],\\
&\text{subject to } P_\Omega(X) = P_\Omega(A),\\
&\qquad\qquad
\begin{bmatrix} W_1 & X \\ X^\top & W_2 \end{bmatrix} \succeq 0,
\end{aligned} \tag{PSD}
\]

where X ∈ R^{m×n}, W_1 ∈ R^{m×m}, W_2 ∈ R^{n×n}. To see this, it suffices to notice that a solution X that minimizes the nuclear norm also satisfies
\[
\|X\|_* = \tfrac{1}{2}\Big[\operatorname{trace}\big(\sqrt{XX^\top}\big) + \operatorname{trace}\big(\sqrt{X^\top X}\big)\Big],
\qquad
\begin{bmatrix} \sqrt{XX^\top} & X \\ X^\top & \sqrt{X^\top X} \end{bmatrix} \succeq 0.
\]

The second property is obvious by exploiting the SVD X = UΣV^⊤ as follows:
\[
\begin{bmatrix} \sqrt{XX^\top} & X \\ X^\top & \sqrt{X^\top X} \end{bmatrix}
= \begin{bmatrix} U \\ V \end{bmatrix} \Sigma \begin{bmatrix} U \\ V \end{bmatrix}^\top \succeq 0.
\]
Unfortunately, checking null space or rank properties for a linear operator A is never easy, and thus one prefers to include a random sampling process in P_Ω that satisfies a restricted isometry property (RIP) with high probability. A RIP analysis has been presented in [24] for nuclear norm minimization. Alternatively, in [9, Theorem 1.1] it is stated that, if the matrix A is generated and sampled according to a suitable model, then with high probability the convex problem achieves exact recovery from the knowledge A_Ω of the entries. However, (M1) cannot deal with the presence of noise, and it is then convenient to relax the condition P_Ω(X) = P_Ω(A), which is no longer satisfied, and try instead to minimize the difference P_Ω(X) − P_Ω(A). We thus seek a different optimization approach, assuming an a priori knowledge of the solution's rank and exploiting the Riemannian geometry of the admissible set. The optimization method that we are going to consider attempts to solve the matrix completion problem in a more robust form. Set k = rank(A); then we have the robust formulation

\[
\begin{aligned}
&\min_{X} \ f(X) := \tfrac{1}{2}\|P_\Omega(X - A)\|_F^2,\\
&\text{subject to } X \in \mathcal{M}_k := \{X \in \mathbb{R}^{m\times n} \ \text{s.t.} \ \operatorname{rank}(X) = k\}.
\end{aligned} \tag{MC}
\]

A possible way to face this problem numerically is a retraction-based Riemannian optimization method which exploits the Riemannian structure of the manifold M_k. A Geometric CG method can be designed efficiently for problem (MC) by exploiting the geometrical properties of the embedded manifold M_k ⊂ R^{m×n}. We start with the geometry of the manifold M_k before going through the algorithm and an exhaustive error analysis obtained in MatLab as part of this Thesis work.

3.2 The manifold Mk

The manifold of fixed rank matrices admits an equivalent definition based on the singular value decomposition (SVD) that will be deeply exploited in the implementation. Writing X = UΣV^⊤, we have

\[
\mathcal{M}_k = \{U\Sigma V^\top : U \in \mathrm{St}_k^m,\ V \in \mathrm{St}_k^n,\ \Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_k),\ \sigma_1 \ge \dots \ge \sigma_k > 0\},
\]

where St_k^n is the Stiefel manifold of n-by-k real matrices with orthonormal columns and diag(σ_i) is the diagonal matrix with the σ_i on the diagonal. We will deal more closely with the Stiefel manifold in Section 5.3.1; for now we only use its definition to properly describe M_k. In what follows, we will refer interchangeably to an element X of M_k and its SVD X = UΣV^⊤. The first result regarding the manifold concerns the tangent space, which is essential for numerical optimization purposes. This result is obtained with the help of the general theory of embedded submanifolds in differential geometry in [31, Proposition 2.1].

Theorem 3.2.1. The set M_k is a smooth submanifold of dimension (m + n − k)k embedded in R^{m×n}. The tangent space at X ∈ M_k, denoted by T_X M_k, is

\[
\begin{aligned}
T_X \mathcal{M}_k &= \begin{bmatrix} U & U_\perp \end{bmatrix}
\begin{bmatrix} \mathbb{R}^{k\times k} & \mathbb{R}^{k\times(n-k)} \\ \mathbb{R}^{(m-k)\times k} & 0_{(m-k)\times(n-k)} \end{bmatrix}
\begin{bmatrix} V & V_\perp \end{bmatrix}^\top \\
&= \{U M V^\top + U_p V^\top + U V_p^\top : M \in \mathbb{R}^{k\times k},\ U_p \in \mathbb{R}^{m\times k},\ U_p^\top U = 0,\ V_p \in \mathbb{R}^{n\times k},\ V_p^\top V = 0\}.
\end{aligned} \tag{3.2.1}
\]

We will consider from now on (M_k, g) as a Riemannian submanifold of R^{m×n}, with the Riemannian metric inherited from the Euclidean scalar product for matrices, ⟨A, B⟩ = trace(A^⊤B) for A, B ∈ R^{m×n}. The first remark about this theorem concerns the projection operator onto the tangent space at X. Defining P_U := UU^⊤ and P_U^⊥ := (I − P_U), we have
\[
P_{T_X\mathcal{M}_k} : \mathbb{R}^{m\times n} \to T_X\mathcal{M}_k, \qquad
Z \mapsto P_U Z P_V + P_U^{\perp} Z P_V + P_U Z P_V^{\perp},
\]
where U, V are the orthonormal factors in the SVD of X. The projection operator is used at every iteration to compute the steepest descent direction as the projection of the Euclidean gradient ∇f(X), which is computed w.r.t. the Euclidean basis as follows. Let E_{i,j} be the matrix whose only nonzero entry is a 1 in position (i, j). For (i, j) ∈ Ω, we have
\[
\langle \nabla f(X), E_{i,j}\rangle_F
= \frac{1}{2}\,\frac{d}{dt}\Big|_{t=0} \big\langle P_\Omega(X + tE_{i,j} - A),\, P_\Omega(X + tE_{i,j} - A)\big\rangle_F
\]

\[
= \big\langle P_\Omega(X - A) + tE_{i,j},\, E_{i,j}\big\rangle_F\Big|_{t=0}
= \langle P_\Omega(X - A), E_{i,j}\rangle_F,
\]

where in the first equality we used ⟨∇f(X), E_{i,j}⟩_F = d/dt f(X + tE_{i,j})|_{t=0} and the fact that f can be expressed as a scalar product. For (i, j) ∉ Ω, the equation continues to hold trivially, since the derivative vanishes. We thus conclude that ∇(½‖P_Ω(X − A)‖_F²) = P_Ω(X − A), since the equality holds for every E_{i,j}, and these span the whole of R^{m×n}. Combining this with (2.2.1) and (2.2.2), we have the following important formula for the steepest (normalized) descent direction ξ_X at X ∈ M_k:

\[
\xi_X = -\frac{P_{T_X\mathcal{M}_k}(P_\Omega(X - A))}{\|P_{T_X\mathcal{M}_k}(P_\Omega(X - A))\|_F}.
\]

Moreover, according to Proposition 2.4.2, a possible retraction for the manifold M_k is
\[
R_X : T_X\mathcal{M}_k \to \mathcal{M}_k, \qquad \xi \mapsto \operatorname*{argmin}_{Z\in\mathcal{M}_k} \|X + \xi - Z\|_F,
\]

since the metric projector minimizes the Frobenius norm. The minimizer can be computed in closed form thanks to the Eckart–Young theorem [11] as follows:
\[
R_X(\xi) = \sum_{i=1}^{k} \sigma_i u_i v_i^\top, \tag{3.2.2}
\]
where σ_i are the singular values of X + ξ and u_i, v_i are the corresponding left and right singular vectors, respectively. We denote the k-truncated SVD of a matrix X by [U, Σ, V] = svd_k(X). For a quick glance, we summarize all the geometrical tools involved in Table 3.1.

|                              | Manifold M_k                                                | Total space R^{m×n}          |
| Cost function                | f(X) = ½‖P_Ω(X − A)‖_F², rank(X) = k                        | ½‖P_Ω(X − A)‖_F²             |
| Metric                       | induced metric                                              | ⟨X, Y⟩ = trace(X^⊤Y)         |
| Tangent space T_X M_k        | UMV^⊤ + U_pV^⊤ + UV_p^⊤, with X = UΣV^⊤                     | R^{m×n}                      |
| Projection P_{T_X M_k}(Z)    | P_U Z P_V + P_U^⊥ Z P_V + P_U Z P_V^⊥                       | —                            |
| Gradient                     | grad f(X) = P_{T_X M_k}(∇f(X))                              | ∇f(X) = P_Ω(X − A)           |
| Retraction                   | R_X(ξ) = svd_k(X + ξ)                                       | —                            |
| Vector transport T_{X→Y}     | P_{T_Y M_k}(ξ_X), ξ_X ∈ T_X M_k                             | translation                  |

Table 3.1: Cost function for the matrix completion problem on the fixed rank matrix manifold.
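The tangent-space projection of Table 3.1 never requires forming the projectors P_U, P_V explicitly. The following MatLab sketch (our own naming, working directly with the factors U, V of X) builds the components M, U_p, V_p of the projected matrix and checks idempotence.

```matlab
% Minimal sketch: projection of an ambient matrix Z onto T_X M_k, with
% X represented by orthonormal factors U, V (notation as in Table 3.1).
m = 50; n = 40; k = 4;
[U,~] = qr(randn(m,k), 0);           % orthonormal factors of some X in M_k
[V,~] = qr(randn(n,k), 0);
Z = randn(m,n);                      % ambient matrix, e.g. a Euclidean gradient

M  = U'*Z*V;                         % k-by-k block
Up = Z*V - U*M;                      % satisfies Up'*U = 0 by construction
Vp = Z'*U - V*M';                    % satisfies Vp'*V = 0 by construction
PZ = U*M*V' + Up*V' + U*Vp';         % P_{T_X M_k}(Z)

% sanity check: projecting twice changes nothing (up to round-off)
M2 = U'*PZ*V; Up2 = PZ*V - U*M2; Vp2 = PZ'*U - V*M2';
PZ2 = U*M2*V' + Up2*V' + U*Vp2';
fprintf('||PZ - P(PZ)||_F = %.2e\n', norm(PZ - PZ2, 'fro'));
```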

3.3 The proposed method

In the same spirit as the update rule (2.5.2), we write in Algorithm 1 the pseudocode of the Riemannian conjugate gradient method for the matrix completion problem. The algorithm is presented in [31], to which we refer for the implementation details. The geometry of the embedded submanifold M_k ⊂ R^{m×n} is completely understood by the results presented in Section 3.2, and the notions of Riemannian gradient and vector transport can also be computed numerically.

Algorithm 1 Geometric CG for Matrix Completion (MC)
Input: smooth f : M_k → R, initial approximations X_0, X_1 ∈ M_k, gradient tolerance τ > 0, tangent vector η_0 = 0.
 1: for i = 1, 2, ... do
 2:   ξ_i ← grad f(X_i)                                   ▷ Riemannian gradient
 3:   if ‖ξ_i‖ ≤ τ then
 4:     Exit.
 5:   end if
 6:   δ ← ξ_i − T_{X_{i-1}→X_i}(ξ_{i-1})
 7:   β_i ← max{0, ⟨ξ_i, δ⟩ / ‖ξ_{i-1}‖²}
 8:   η_i ← −ξ_i + β_i T_{X_{i-1}→X_i}(η_{i-1})           ▷ CG direction by PR+
 9:   t_i ← argmin_t f(X_i + tη_i)                        ▷ initial guess
10:   Find the smallest integer m ≥ 0 s.t.
      f(X_i) − f(R_{X_i}(0.5^m t_i η_i)) ≥ −0.0001 ⟨η_i, 0.5^m t_i η_i⟩   ▷ Armijo rule
11:   X_{i+1} ← R_{X_i}(0.5^m t_i η_i)
12: end for
Output: X_i

Nevertheless, line 9 has not been discussed yet. In general, computing the initial guess t_* can be nontrivial for arbitrary nonlinear cost functions. In the matrix completion problem we deal with a smooth

quadratic cost function, and thus line 9 reads
\[
\min_t f(X + t\eta) = \min_t \tfrac{1}{2}\,\|P_\Omega(X) + tP_\Omega(\eta) - P_\Omega(A)\|_F^2.
\]
The solution is obtained by setting the t-derivative to zero,

\[
0 = \frac{d f(X + t\eta)}{dt} = t\,\langle P_\Omega(\eta), P_\Omega(\eta)\rangle + \langle P_\Omega(\eta), P_\Omega(X - A)\rangle,
\]
which yields the initial guess

\[
t_* = \frac{\langle P_\Omega(\eta), P_\Omega(A - X)\rangle}{\langle P_\Omega(\eta), P_\Omega(\eta)\rangle}.
\]
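In MatLab the exact guess t_* is a one-liner. The sketch below assumes the data A, the mask Omega, the current iterate X and a tangent direction eta are already in the workspace (all names are ours, for illustration only).

```matlab
% Minimal sketch: exact initial guess t_* of the line search.
PO     = @(Z) Z .* Omega;                    % projector onto the observed set
r      = PO(A - X);                          % residual on Omega
peta   = PO(eta);
t_star = (peta(:)'*r(:)) / (peta(:)'*peta(:));
```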

3.3.1 Implementation aspects

Working on a matrix manifold, dimensions and computational costs grow rapidly with the matrix size. We have fully analyzed and described the mappings involved in Algorithm 1 from a theoretical point of view. We now discuss some implementation details that can have a big impact on running time and stability.

Non-linear CG In order to improve robustness, a variant of the PR+ rule is adopted. Whenever two consecutive conjugate directions are almost collinear, we prefer to restart from the steepest descent at the successive iteration. A further if-check is then used in MatLab as follows:

δ ← ξ_i − T_{X_{i-1}→X_i}(ξ_{i-1})
θ ← ⟨ξ_i, δ⟩ / ‖ξ_{i-1}‖²
if θ ≥ 0.1, β_i ← 0                            ▷ collinear directions: restart
else β_i ← max{0, ⟨ξ_i, δ⟩ / ‖ξ_{i-1}‖²}
η_i ← −ξ_i + β_i T_{X_{i-1}→X_i}(η_{i-1})

Retraction A full SVD has a cubic cost that becomes prohibitively expensive over the CG iterations. Even for small m, n, the retraction is needed at every step and it is convenient to improve its performance. An alternative way of computing the k-truncated SVD, designed in [31], exploits the recurrent form of the tangent vectors: X + ξ can be written as
\[
X + \xi = \begin{bmatrix} U & U_p \end{bmatrix}
\begin{bmatrix} \Sigma + M & I_k \\ I_k & 0 \end{bmatrix}
\begin{bmatrix} V & V_p \end{bmatrix}^\top,
\]

where X = UΣV^⊤ ∈ M_k and ξ = UMV^⊤ + U_pV^⊤ + UV_p^⊤ ∈ T_X M_k. We want to exploit this factorization to compute the SVD of X + ξ more efficiently. First, we compute the reduced QR decompositions of U_p and V_p, denoted Q_uR_u = U_p and Q_vR_v = V_p. Observe that it holds
\[
X + \xi = \begin{bmatrix} U & Q_u \end{bmatrix}
\underbrace{\begin{bmatrix} \Sigma + M & R_v^\top \\ R_u & 0 \end{bmatrix}}_{=:S}
\begin{bmatrix} V & Q_v \end{bmatrix}^\top.
\]

Then, by computing [U_s, Σ_s, V_s] = svd_k(S), we finally have the retraction on M_k,

\[
R_X(\xi) = \Big(\begin{bmatrix} U & Q_u \end{bmatrix} U_s\Big)\, \Sigma_s\, \Big(\begin{bmatrix} V & Q_v \end{bmatrix} V_s\Big)^\top.
\]

The economy-sized QR decompositions and the SVD of the small matrix S reduce the computational effort to the order O((m + n)k² + k³).
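The construction above can be sketched as a small MatLab function; the code below follows the factorization just described but only in spirit of [31], and function and variable names are our own.

```matlab
% Minimal sketch of the economy-sized retraction: given the SVD factors
% (U,S0,V) of X and a tangent vector (M,Up,Vp), return the factors of
% R_X(xi) without ever forming an m-by-n matrix.
function [U1, S1, V1] = retract_lowrank(U, S0, V, M, Up, Vp, k)
    [Qu, Ru] = qr(Up, 0);                    % economy-size QR of Up
    [Qv, Rv] = qr(Vp, 0);                    % economy-size QR of Vp
    S = [S0 + M, Rv'; Ru, zeros(k)];         % small 2k-by-2k core matrix
    [Us, Ss, Vs] = svd(S);                   % full SVD of the small core
    U1 = [U, Qu]*Us(:,1:k);                  % k-truncated factors of X + xi
    S1 = Ss(1:k,1:k);
    V1 = [V, Qv]*Vs(:,1:k);
end
```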

3.4 Error Analysis

The rank-k matrix A is randomly generated in MatLab as follows: L = rand(m,k); R = rand(n,k); A = L*R';. Before analyzing the behavior of the algorithm, we summarise the roles played by the rank and the set Ω in the problem.

• As a low-rank matrix problem, we keep k relatively small w.r.t the matrix size in the numerical simulations.

• The interesting case is when the number |Ω| is relatively small, meaning that we are able to recover (or complete) a matrix from little information about P_Ω(A). We control this by defining the oversampling factor OS := |Ω| / (k(2n − k)), where the denominator is the number of degrees of freedom of a rank-k matrix.

To measure the distance between the approximation and the original matrix, we look at the behaviour of two quantities:

\[
\text{Relative Error} = \frac{\|A - X\|_F}{\|A\|_F}, \tag{3.4.1}
\]
\[
\text{Relative Residual} = \frac{\|P_\Omega(X - A)\|_F}{\|P_\Omega(A)\|_F}. \tag{3.4.2}
\]
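Both measures are direct to evaluate in MatLab; the lines below assume (our assumption, not stated in the thesis) that A, the approximation X and the logical mask Omega are in the workspace.

```matlab
% Minimal sketch: the accuracy measures (3.4.1)-(3.4.2).
rel_err = norm(A - X, 'fro') / norm(A, 'fro');
rel_res = norm((X - A).*Omega, 'fro') / norm(A.*Omega, 'fro');
fprintf('relative error %.2e, relative residual %.2e\n', rel_err, rel_res);
```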

3.4.1 Numerical simulations

We propose an exhaustive error analysis through five different numerical simulations, and we look at the behaviour of the quantities (3.4.1)-(3.4.2) to measure the accuracy of the matrix completion.

First Simulation The first simulation consists in looping over the size of the matrices (square, for simplicity). The exit condition is ‖grad f(X)‖_2 ≤ 10^{-7}. Manipulating the rank k and the cardinality |Ω|, we keep the OS factor approximately fixed at 3.5. In Figure 3.1 we report the results.


Figure 3.1: Geometric CG for different-size matrix completion problems (MC). Top left: 2-norm behaviour of the Riemannian gradient. Top right: relative error. Bottom: relative residual.

We can see that the geometric method requires a small number of iterations to converge according to the tolerance set on the gradient. Moreover, a high accuracy is reached, but smaller values of n require fewer iterations.

Second Simulation The second simulation consists in approximating the matrix completion problem for a set of 100-by-100 rank-10 random matrices with different OS factors. Intuitively, the larger the quantity OS is, the easier A is to recover. Theoretically,

one needs at least OS > 1 to uniquely complete any rank-k matrix. Practically, we show in Figure 3.2 that OS > 2 is a suitable request.


Figure 3.2: Geometric CG for different OS factors. Top left: 2-norm behaviour of the Riemannian gradient. Top right: relative error. Bottom: relative residual.

We see that for OS < 2 the relative error does not decay, or it is not reliable. The gradients, in this case, oscillate very fast; we will discuss this in the next simulation. Moreover, the plots confirm the intuition: there is a monotonic decrease in the number of iterations needed as OS increases, while for OS < 2 the CG iterations can reach convergence to a matrix that does not complete the original A.

Third Simulation We remark that in the previous routines we did not take into account the randomness of the generating process for A and of the sampling set Ω. The goal of this simulation is to verify that, whenever |Ω| ≈ n log(n), the completion is not always achieved. Twenty different problems are generated and we report the results in Figure 3.3.


Figure 3.3: Geometric CG for 20 different matrices with |Ω| ≈ n log(n).

The CG method manages to decrease (even if oscillating) the Riemannian gradient of the cost function f(·), but its decay is affected by the condition |Ω| ≈ n log(n). Nevertheless, only a few matrices are actually recovered by the method: the relative error does not decay for some attempts.

Noisy Simulation This simulation focuses on noisy matrix completion, i.e. when the data A is observed in the presence of noise (and this will always be the case in applications). The setting is the same as in the previous simulations, but the presence of noise translates into a perturbation A^(ε), obtained by adding to the data Gaussian noise with a sufficiently small standard deviation ε, e.g. 10^{-3}. We consider a perturbation of A with the Gaussian noise matrix N obtained in MatLab as N = randn(m,n).

34 Finally, to have an a priori estimate of the errors as in [31], we define

\[
A^{(\varepsilon)} = A + \varepsilon\, \frac{\|A_\Omega\|_F}{\|N_\Omega\|_F}\, N.
\]
We cannot rely on the residual or the relative error as exit conditions: these indices would not reach convergence due to the noise. However, the conjugate gradient method seeks a solution that approximates the perturbed data, so we can expect the Riemannian gradient norm to converge. The numerical experiments in Figure 3.4 confirm this expectation.
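The perturbed data of this simulation can be generated as follows; this is a minimal sketch with our own variable names, and the sampling density is only chosen so that OS is roughly 3.

```matlab
% Minimal sketch: building the perturbed data A^(eps) of the noisy simulation.
m = 1000; n = 1000; k = 30; eps_noise = 1e-3;
L = rand(m,k); R = rand(n,k); A = L*R';      % rank-k data as in Section 3.4
Omega = rand(m,n) < 3*k*(2*n-k)/(m*n);       % sampling mask, roughly OS = 3
N = randn(m,n);                              % Gaussian noise matrix

A_eps = A + eps_noise * norm(A.*Omega,'fro')/norm(N.*Omega,'fro') * N;
```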


Figure 3.4: Noisy matrix completion problem with a rank-30 matrix A ∈ R^{1000×1000}, OS ≈ 3 and standard deviation ε = 10^{-3}.

Image Completion A further simulation consists in the completion of a matrix arising from an application in computational imaging. Suppose that we have a low-rank gray-scale image heavily corrupted on more than half of the overall pixels. After the corruption, the intensity of those pixels is completely insignificant, and we may think of them as unknown entries, as in the completion problem. An effective way to measure the quality of the reconstruction is the peak signal-to-noise ratio (PSNR), a positive index that tells us the accuracy reached. The PSNR can be seen as a function of the reference image X and the image Y we want to compare with it, as follows:
\[
\mathrm{PSNR}(X, Y) = 20\,\log_{10}\!\left(\frac{\mathrm{MAX}(X)}{\sqrt{\mathrm{MSE}(X, Y)}}\right),
\]

with
\[
\mathrm{MSE}(X, Y) = \frac{1}{MN}\sum_{i=0}^{N-1}\sum_{j=0}^{M-1}\big[X(i, j) - Y(i, j)\big]^2
\]
being the mean square error of two images X, Y of M × N pixels, and MAX(X) the maximum pixel value of the image X. The index is measured in decibels (dB): the smaller the MSE is, the higher the PSNR, so for better accuracy we want to increase the PSNR value. For our simulations, we expect a value between 20 dB and 40 dB. A second important formula is the structural similarity (SSIM) index, which tells us the similarity between the original and the reconstructed image. It is defined as

\[
\mathrm{SSIM}(X, Y) = \frac{(2\mu_X\mu_Y + c_1)(2\sigma_{XY} + c_2)}{(\mu_X^2 + \mu_Y^2 + c_1)(\sigma_X^2 + \sigma_Y^2 + c_2)},
\]

where µ_(·) is the average, σ_(·)² is the variance and σ_(·,·) is the covariance operator, while c_1 and c_2 are two constants that stabilize the division by the denominator. The SSIM index takes values between 0 and 1; in our simulations, we would like to keep it above 0.4 at least. The results are reported in Figure 3.5.

Figure 3.5: Corrupted image completion of Lena.png (512 × 512) with ≈ 50% corrupted pixels. On the right the corrupted image, on the left the reconstruction, with PSNR = 25.06, SSIM = 0.644.

The quality is significantly improved as the corrupted pixels (approximately 50%) are replaced in the completion of the 512 × 512 Lena.png image. The numerical rank is not known a priori, but the lack of structure in the Lena picture translates into a small magnitude for many singular values. The Geometric CG method is thus performed in M_100 to obtain an accuracy of PSNR = 25.06 and SSIM = 0.664.
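The PSNR formula above translates directly into MatLab; the sketch below uses our own function handles and random stand-in images, not the actual Lena data.

```matlab
% Minimal sketch: PSNR of a reconstructed gray-scale image, following the
% formulas above (X is the reference, Y the reconstruction).
mse_fun  = @(X, Y) mean((X(:) - Y(:)).^2);
psnr_fun = @(X, Y) 20*log10(max(X(:)) / sqrt(mse_fun(X, Y)));

X = rand(512);                     % stand-in for the reference image
Y = X + 0.05*randn(512);           % stand-in for a reconstruction
fprintf('PSNR = %.2f dB\n', psnr_fun(X, Y));
```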

Chapter 4

Nonsmooth Riemannian Optimization

Smooth optimization turns out not to be effective when dealing with non-Gaussian types of noise. Smooth quadratic norms, such as the Frobenius norm, tend to spread the noise among all the other entries when optimized. A further application regards nonsmooth sparse optimization, where ℓ_1 penalties are present in the cost function to increase the numerical sparsity of the minimizer. Moreover, Riemannian optimization is powerful in turning a rank-constrained problem into an unconstrained Riemannian formulation. These facts suggest exploring the nonsmooth framework, to take into account convex or Lipschitz cost maps defined on Riemannian manifolds. In this Chapter, we investigate a subdifferential Riemannian theory that has been developed recently in [17] and [19]. Thanks to some examples in R^n, we show the ideas that have brought the authors to generalize many concepts of convex analysis to Riemannian manifolds. Finally, we discuss a subdifferential approach to optimize convex and Lipschitz maps.

4.1 Overview

Nonsmooth optimization is characterized by nonsmooth cost functions, typically convex or Lipschitz, that attain the minima of interest at non-differentiable points. The challenge lies in the fact that we cannot rely on single steepest descent directions, since the slopes may have nonsmooth angles around local minima. In the Euclidean setting, nonsmooth optimization has been investigated extensively in statistics and data science. In this Thesis, we are interested in replacing gradients with subdifferential sets, which provide descent directions at nondifferentiable minimum points.

37 4.1.1 Subdifferential for Convex functions

Subdifferentials are part of the differential theory in convex analysis introduced, for instance, by Rockafellar in [25, Part V]. The first generalization proposed is about directional derivatives for convex functions.

Definition 4.1.1. Let f be convex and let domf ⊂ R^n be its domain. We define the directional derivative of f in the direction v at the point x ∈ domf as the limit

\[
f'(x; v) := \lim_{\delta\searrow 0} \frac{f(x + \delta v) - f(x)}{\delta},
\]
when it exists.

Notice that the increment δ is positive in the limit, contrary to the classical directional derivative. Next, we generalize the concept of gradients.

Definition 4.1.2 (Subdifferential). We call a vector g ∈ R^n a subgradient for a convex function f : R^n → R at the point x_0 if it satisfies

\[
f(x) \ge f(x_0) + \langle g, x - x_0\rangle, \qquad \text{for every } x \in \operatorname{dom} f.
\]

We define the subdifferential at x_0, denoted by ∂f(x_0), as the collection of all subgradients.

The first important result concerning the subdifferential at minimizing points x_* ∈ R^n tells us that a point is a minimizer of a function f if and only if 0 is a subgradient.

Proposition 4.1.3. Let f : R^n → R be convex. Then f(x_*) = min_{x∈R^n} f(x) if and only if 0 ∈ ∂f(x_*).

Proof. The result is clear by rewriting
\[
f(x) \ge f(x_*) = f(x_*) + \langle 0, x - x_*\rangle, \qquad \text{for every } x \in \operatorname{dom} f.
\]

From the definition of subgradient, we have the statement.

A useful way to characterize the subdifferential is as the support function of f'(x_0; ·). We report this result in the next theorem [22, Theorem 3.1.14].

Theorem 4.1.4. Let f : R^n → R be a closed and convex function. Then, for every x_0 ∈ (domf)° and v ∈ R^n, we have

\[
f'(x_0; v) = \max\{\langle g, v\rangle : g \in \partial f(x_0)\}.
\]

Remark. Notice that f(x_0 + δv) − f(x_0) ≥ ⟨g, x_0 + δv − x_0⟩ for every g ∈ ∂f(x_0), by definition of subgradient. Hence, for every δ > 0 we have
\[
\frac{f(x_0 + \delta v) - f(x_0)}{\delta} \ge \langle g, v\rangle.
\]
Thus, taking the limit for δ ↘ 0, we have
\[
f'(x_0; v) \ge \langle g, v\rangle \qquad \text{for every } g \in \partial f(x_0).
\]

This inequality provides an upper bound for the quantities ⟨g, v⟩ with g ∈ ∂f(x_0), whenever f'(x_0; v) is finite.

The theorem can be exploited to derive trivial subdifferentials at differentiable points. Suppose f is a closed and convex function, differentiable in the interior of its domain. Fix an arbitrary v ∈ R^n; then for every x ∈ (domf)° and g ∈ ∂f(x) we have ⟨∇f(x), v⟩ = f'(x; v) ≥ ⟨g, v⟩, where the last inequality is due to Theorem 4.1.4. Thanks to the existence of the gradient, we can also consider −v in the directional derivative and claim the existence of the limit f'(x; −v), to derive

\[
\langle \nabla f(x), -v\rangle = f'(x; -v) \ge \langle g, -v\rangle.
\]

Finally, combining the inequalities, we have ⟨∇f(x), v⟩ = ⟨g, v⟩ for an arbitrary g ∈ ∂f(x). Letting now v vary among the Euclidean basis vectors e_k, we obtain the equality

∂f(x) = {∇f(x)}. ◦ √ ◦ The assumption√ x ∈ (domf) is crucial, as the function f(x) = − x shows: 0 ∈/ (domf) but ∂(− 0) = ∅. Example 4.1.1 (Modulus). The model problem for this class of functions is,

min_{x∈R^n} ||x||_{ℓ1}.

In R, the ℓ1-norm reduces to the modulus |x|. In this example, we analyze the subdifferential ∂|0|. An arbitrary subgradient g ∈ ∂|0| satisfies

|x| ≥ 0 + gx, for every x ∈ R.

Clearly this is possible if and only if g ∈ [−1, 1]; hence we obtain ∂|0| = [−1, 1]. On the other hand, if x_0 ≠ 0, the subdifferential reduces to the single element x_0/|x_0|, i.e. its gradient. Geometrically, we can see in Figure 4.1 that each subgradient can be seen as the angular coefficient of a line lying below the graph of |x|.

Figure 4.1: Subdifferential ∂|0| = [−1, 1] for the convex modulus.

Subgradients for convex functions have been exploited to design a subdifferential method presented, for instance, by Nesterov in [22, Chapter 3] and in Boyd's lecture notes [6] at Stanford. They provide convergence results for the update rule

x^{k+1} ← x^k − α_k g^k,   g^k ∈ ∂f(x^k),
f_best^{k+1} = min{f(x^{k+1}), f(x^k), ...},

when, for instance, the sequence α_k is s.t. Σ_k α_k = +∞ and Σ_k α_k^2 < +∞. The choice of the k-th subgradient is completely arbitrary, but we remark here that convergence requires uniform boundedness of the g^k (which is the case for Lipschitz functions).
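As an illustration (a minimal sketch in MATLAB, not taken from [22] or [6]), the update rule applied to the model problem f(x) = ||x||_{ℓ1} reads:

    n = 10;
    x = randn(n, 1);
    fbest = norm(x, 1);
    for k = 1:500
        g = sign(x);                  % an arbitrary subgradient of ||.||_1 at x
        alpha = 1/k;                  % steps with sum = inf and sum of squares < inf
        x = x - alpha*g;
        fbest = min(fbest, norm(x, 1));
    end
    fbest                             % decreases towards the minimum value 0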

4.1.2 Generalized gradients for Lipschitz functions

A generalization of gradients for Lipschitz functions is due to Clarke, one of Rockafellar's students. In [10], Clarke exploits the almost everywhere existence of ∇f(x) for Lipschitz functions f to define the generalized gradient of f at x. We call a locally Lipschitz function f : R^n → R a mapping s.t. for every x ∈ R^n there exist a ball B with x ∈ B and a constant K > 0 satisfying

|f(y) − f(z)| ≤ K||y − z||, for every y, z ∈ B.

Before characterizing generalized gradients, we recall that the convex hull of a set S, denoted by conv(S), is the smallest convex set containing S.

Definition 4.1.5. Let f : R^n → R be a locally Lipschitz function. We define the generalized gradient of f at x, denoted ∂f(x), as the convex hull

∂f(x) := conv{ lim_{i→+∞} ∇f(x + h_i) : h_i → 0 }.

We remark that arbitrary sequences {h_i}_{i=1}^{+∞} ⊂ R^n s.t. the x + h_i are differentiability points of f exist if f is Lipschitz. If f is convex, by [25, Theorem 25.6], ∂f(x) is equivalent to the set of subgradients in Definition 4.1.2. Moreover, the proper directional derivative for the class of locally Lipschitz functions is the following.

Definition 4.1.6. Let f : R^n → R be a locally Lipschitz function. We define the generalized directional derivative of f at x in the direction v, denoted by f°(x; v), as the limit

f°(x; v) = limsup_{h→0, δ↘0} (f(x + h + δv) − f(x + h))/δ.

In the same spirit as Theorem 4.1.4, we have analogously f°(x; v) = max{⟨g, v⟩ : g ∈ ∂f(x)}, whose proof is more technical because of the lim sup in the directional derivative. Finally, Goldstein proposed in [15] a further generalization of gradient directions for the class of Lipschitz functions that has proved fruitful also numerically. He managed to build a differential theory for the class of Lipschitz functions, exploiting again the almost everywhere existence of the gradients ∇f(x). He then considered as subdifferential set an ε-dependent convex hull of gradients at points whose distance is controlled by the ε parameter. In this Thesis, as we will see when dealing with nonsmooth maps defined on Riemannian manifolds, we will be closer to Goldstein's definition of ε-subdifferential to seek descent directions. For a Lipschitz function f at a point x we have the definition:

Definition 4.1.7. Let f : R^n → R be a locally Lipschitz function and B̂(x, ε) = {y ∈ B(x, ε) : ∇f(y) exists}. Let {θ_k}_{k=1}^{+∞} ⊂ R be a sequence converging downward to 0. We define the ε-generalized gradient at x as the set

∂_ε f(x) := conv( ∩_{k=1}^∞ {∇f(y) : y ∈ B̂(x, ε + θ_k)} ).

Remark. The object ∂_ε f(x) is well-defined. Indeed, we define the sets

T(θ_k) := {∇f(y) : y ∈ B̂(x, ε + θ_k)},

which are non-empty and nested, i.e. T(θ_{k+1}) ⊂ T(θ_k). Thus, the intersection ∩_{k=1}^∞ T(θ_k) is non-empty. Moreover, it can be proved that ∂_ε f(x) is also compact and convex.

4.2 Riemannian subgradients

In this section we are going to generalize the concepts we have introduced so far to the Riemannian setting. Nonsmooth analysis on Riemannian manifolds is theoretically developed, for instance, in [29] and [13]. Firstly, we are going to discuss what nonsmooth maps defined on Riemannian manifolds are. In what follows, Riemannian norms and metrics will often appear and it is convenient to lighten the notation: it will be clear from the context whether || · || and ⟨·, ·⟩ are Euclidean or Riemannian forms.

4.2.1 Convex and Lipschitz maps

The definition of convex functions requires the linear structure of R^n, which is missing for maps defined on manifolds. The natural idea to discuss convexity on Riemannian manifolds is by means of geodesics. Convexity is discussed in [29] in the more general framework of non-complete manifolds, while we follow the approach in [13] for an easier discussion. The completeness is to be understood w.r.t. the Riemannian distance. In this case, the Hopf-Rinow theorem [20] ensures that the exponential mapping is well defined over the whole tangent space.

Definition 4.2.1 (Convexity via geodesics). Let (M, g) be a complete and connected Riemannian manifold. We say that f : M → R is convex if f ◦ γ : R → R satisfies

f(γ(ta + (1 − t)b)) ≤ tf(γ(a)) + (1 − t)f(γ(b)),

for every geodesic γ and a, b ∈ R, t ∈ [0, 1]. In other words, we require the map to be convex in the usual sense when composed with any complete geodesic. Subgradients can be defined in the same fashion, as follows.

Definition 4.2.2 (Subgradient of convex maps). Let (M, g) be a connected and complete Riemannian manifold, f be a convex map and x ∈ M. We say that s ∈ T_xM is a subgradient for f at x if for every geodesic γ with γ(0) = x we have

f(γ(t)) ≥ f(x) + t⟨s, γ̇(0)⟩, for every t ≥ 0.

Moreover, we denote by ∂f(x) the set of all subgradients.

On the other hand, the Lipschitz property of a map requires only the presence of metric structure in the definition. A Riemannian manifold equipped with its Riemannian distance dg is indeed a metric space. A Lipschitz map satisfies

|f(x) − f(y)| ≤ K d_g(x, y) for every x, y ∈ M, where K > 0 is the Lipschitz constant. Furthermore, we say that f is locally Lipschitz if, for every x ∈ M, there exists a neighborhood on which f is Lipschitz. It is better at this point to drop the geodesic completeness of the Riemannian manifold. We know that the exponential map is strictly connected to the solution of a second order ordinary differential equation. In practice, for general Riemannian manifolds, we expect this solution to be unique only in neighborhoods. For x ∈ M, this translates into the existence of an injectivity radius i_M(x), that is, the supremum of the radii r s.t. B(0_x, r) ⊂ T_xM and exp_x(·) is a diffeomorphism from B(0_x, r) to B(x, r) ⊂ M. We introduce next the concept of directional derivatives and subdifferentials as in [17].

Definition 4.2.3 (Clarke generalized directional derivative). Let f : M → R be a locally Lipschitz map. Let φ_x : U_x → T_xM be an exponential chart at x ∈ M. Given another point y ∈ U_x, consider σ_{y,v}(t) := φ_y^{-1}(tw), a geodesic through y with initial velocity w, where φ_y is an exponential chart around y satisfying d(φ_x ∘ φ_y^{-1})(0_y)(w) = v. We define the generalized directional derivative of f at x in the direction v ∈ T_xM, denoted by f°(x; v), as

f°(x; v) := limsup_{y→x, t↘0} (f(σ_{y,v}(t)) − f(y))/t.

The subdifferential set, again denoted ∂f(x), can then be introduced as the subset of T_xM whose support function is f°(x; ·). Alternatively, an equivalent definition is

∂f(x) = conv{ lim_{i→∞} grad f(x_i) : {x_i} ⊂ Ω_f, x_i → x },

where Ω_f is a (dense) subset of M on which the Riemannian gradient exists.

Definition 4.2.4 (ε-subdifferential). Let f : M → R be a locally Lipschitz map and θ_k a scalar sequence converging downward to zero. For each ε > 0 s.t. ε + θ_k < i_M(x) for every k, the ε-subdifferential is defined as

∂_ε f(x) = conv( ∩_{k=1}^∞ {dexp_x^{-1}(y)(grad f(y)) : y ∈ B̂(x, ε + θ_k)} ),

where B̂(x, ε + θ_k) = B(x, ε + θ_k) ∩ Ω_f. This definition requires some remarks to be understood as a generalization of Goldstein's ε-subdifferential. Firstly, we stress the fact that the gradients taken around a point x belong to different tangent spaces T_yM. On R^n there is no need to bring back these directions, because of the flat structure of the vector space. This job is achieved by the map dexp_x^{-1}(y) : T_yM → T_xM. Secondly, we shall select ε small enough so that f is Lipschitz on B(x, ε) and exp_x(·) is a diffeomorphism from B(0_x, ε) to B(x, ε). Finally, we are interested in the presence of descent directions in the subdifferential set. For this purpose, we give a notion of descent direction in the nonsmooth optimization framework.

Definition 4.2.5 (Descent direction). Let x ∈ M, w ∈ T_xM and f : M → R be locally Lipschitz. We call a descent direction for f at the point x the tangent vector g = −w/||w||, if there exists α > 0 satisfying

f(exp_x(tg)) − f(x) ≤ −t||w||, ∀t ∈ (0, α).

In other words, we require the function to decrease uniformly in the direction g along the geodesic exp_x(tg). We now consider the problem of existence of such descent directions in the ε-subdifferential set. The answer is addressed by the following theorem [19, Theorem 3.11].

Theorem 4.2.6. Assume δ, ε > 0 are s.t. 0 ∉ ∂_ε f(x) for every x ∈ B(x̄, δ). Fix x ∈ B(x̄, δ) and consider the element of ∂_ε f(x) satisfying

w_0 := argmin_{v ∈ ∂_ε f(x)} ||v||.

Set g_0 = −w_0/||w_0||. Then g_0 affords a uniform decrease of f over B(x, ε) in the sense of Definition 4.2.5, i.e.

f(exp_x(εg_0)) − f(x) ≤ −ε||w_0||.

Remark. The theorem actually provides an optimization approach to extract a descent direction from ∂_ε f(x). Unfortunately the subdifferential set is not easy to characterize, so we are going to see next how to replace it with a suitable approximation.

4.2.2 Approximating the subdifferential

In this section, we present a practical approach to work with descent directions in the set ∂_ε f(x) to optimize nonsmooth maps defined on Riemannian manifolds. We follow closely the results presented in [17]. For general Lipschitz maps f, it could be hard to give an explicit description of the ε-subdifferential set. Yet, for numerical purposes, we are going to see that it suffices to work with a partial description of ∂_ε f(x). It is sufficient to gradually enlarge an approximation W_k := {v_1, v_2, ..., v_k} ⊂ ∂_ε f(x) until we detect the presence of a descent direction in conv(W_k). According to Theorem 4.2.6, we consider the rule

w_k ← argmin_{v ∈ conv(W_k)} ||v||,
g_k ← −w_k/||w_k||.

Now if

f(exp_x(εg_k)) − f(x) ≤ −cε||w_k||,   c ∈ (0, 1),

then we say that W_k is a suitable approximation of ∂_ε f(x); otherwise we enlarge the convex hull by adding another gradient v_{k+1} ∈ ∂_ε f(x)\conv(W_k). The enlargement process requires some theoretical backup. We follow the discussion in [17, Section 3.3]. First, the starting vector v_1 can be chosen from ∂f(y), with y in a neighborhood of x, according to the following lemma.

Lemma 4.2.7. For every y ∈ B(x, ε),

dexp_x^{-1}(y)(∂f(y)) ⊂ ∂_ε f(x).

Then, for successive gradients wk we have:

Lemma 4.2.8. Set W_k = {v_1, ..., v_k} ⊂ ∂_ε f(x) s.t. 0 ∉ conv(W_k) and

w_k = argmin_{v ∈ conv(W_k)} ||v||.

If f(exp_x(εg_k)) − f(x) > −cε||w_k||, where g_k = −w_k/||w_k||, then there exist θ_0 ∈ (0, ε] and v̄_{k+1} ∈ ∂f(exp_x(θ_0 g_k)) s.t.

⟨dexp_x^{-1}(exp_x(θ_0 g_k))(v̄_{k+1}), g_k⟩ ≥ −c||w_k||,

and v_{k+1} = dexp_x^{-1}(exp_x(θ_0 g_k))(v̄_{k+1}) ∉ conv(W_k). The combination of the two lemmas produces the strategy summarized in Algorithm 2 and Algorithm 3.

Algorithm 2 Descent direction Algorithm; g_k = Descent(f, x, δ, c, ε).
Input: f : M → R locally Lipschitz, x ∈ M, δ, c, ε ∈ (0, 1).
1: Let g_0 ∈ T_xM s.t. ||g_0|| = 1.
2: if grad f(exp_x(εg_0)) exists then
3:   v = dexp_x^{-1}(exp_x(εg_0))(grad f(exp_x(εg_0)))
4: else
5:   Select v ∈ dexp_x^{-1}(exp_x(εg_0))(∂f(exp_x(εg_0)))
6: end if
7: W_1 = {v}
8: for k = 1, 2, ... do
9:   w_k ← argmin_{v ∈ conv(W_k)} ||v||        ▷ Extraction step
10:  if ||w_k|| ≤ δ then
11:    Exit.
12:  else
13:    g_k ← −w_k/||w_k||
14:  end if
15:  if f(exp_x(εg_k)) − f(x) ≤ −cε||w_k|| then
16:    Exit.                                   ▷ g_k is a descent direction
17:  end if
18:  v_{k+1} = Increasing(f, x, g_k, 0, ε)
19:  W_{k+1} = W_k ∪ {v_{k+1}}
20: end for

4.2.3 Implementation aspects

We discuss in this part some implementation aspects, to explain how the first two algorithms are implemented in practice. As for the Geometric Conjugate Gradient, the embedded structure of the manifold M ⊂ R^n plays an important role in handling all the abstract concepts involved. Still, it is not clear so far how to select arbitrary subgradients v ∈ ∂f(y).

Algorithm 3 h-increasing Algorithm; v = Increasing(f, x, g, a, b).
Input: f : M → R locally Lipschitz, x ∈ M, g = −w/||w|| ∈ T_xM and a, b ∈ R.
Set h(t) = f(exp_x(tg)) − f(x) + ct||w|| and t = b.
1: repeat
2:   Select v ∈ ∂f(exp_x(tg))
3:   if ⟨v, dexp_x(tg)(g)⟩ < c||w|| then
4:     t = (a + b)/2
5:     if h(b) > h(t) then
6:       a = t
7:     else
8:       b = t
9:     end if
10:  end if
11: until ⟨v, dexp_x(tg)(g)⟩ ≥ c||w||

For this issue, we remark that we are interested in cost functions involving nonsmooth || · ||_{ℓ1} norms. The almost everywhere differentiability can be exploited to choose grad f(y) ∈ ∂f(y) for almost every y ∈ M. Then we bring back the new vector to the current tangent space T_xM. This transport action can be properly accomplished by a vector transport map instead of dexp_x^{-1}(y)(·), provided they act sufficiently close. More formally, for a vector transport T_{x←y} we require, for every x ∈ M and l > 0, the existence of δ := δ(x, l) > 0 s.t.

||dexp_x^{-1}(y) − T_{x←y}|| ≤ l, provided d_g(x, y) ≤ δ.

The second issue concerns the minimization step over the convex hull of W_k. Let W_k denote also the matrix whose columns are the vectors v_i, i = 1, 2, ..., k, produced by Algorithm 2. We consider the following alternative definition for the convex hull of a set containing a finite number of vectors.

Definition 4.2.9. Let S be a finite collection of vectors s_i ∈ R^n. We define the convex hull of S, denoted by conv(S), as the set

conv(S) = { Σ_{i=1}^{|S|} α_i s_i : s_i ∈ S, Σ_{i=1}^{|S|} α_i = 1 and α ≥ 0 }.

To optimize a quadratic norm over a convex hull we have the equivalences

min_{v∈conv(W_k)} ||v|| ⇔ min_{v∈conv(W_k)} ||v||^2 ⇔ min_{α·1=1, α≥0} Σ_{i,j=1}^k α_i α_j v_i^T v_j.

Denoting by W_{i,j} = v_i^T v_j for i, j = 1, 2, ..., k, we finally have

min_{α·1=1, α≥0} α^T W α,   (4.2.1)

that is a quadratic program with linear constraints. Such a minimization can be easily handled by an interior point method; in MatLab, we rely on the quadprog(·) function for this part. Nevertheless, such a solver only reaches an approximation of the minimizer for some prescribed tolerance, and it is not convenient to slow down the method by lowering the tolerance. In view of Riemannian optimization, this vector becomes the velocity of a geodesic through the current iterate and it is crucial that it lies accurately in the tangent space. An easy trick (that has shown to work on S^{n−1}) is to project the numerical approximation of (4.2.1) onto the tangent space, so that a geodesic with such velocity is well defined numerically on the manifold. Otherwise, a retraction mapping based on Proposition 2.4.2 could be used instead to move along linesearches on the manifold. In this Thesis, we are also interested in handling the case of matrix manifolds. To optimize over a convex hull of matrices, we now show that a vectorization step leads back to the case we have just discussed. Rearranging the notation as usual with matrices, if now the tangent vectors are matrices V_i ∈ R^{m×n}, i = 1, ..., k, we denote their collection by W_k = {V_1, ..., V_k}. We consider the equivalence

min_{V∈conv(W_k)} ||V||_F^2 ⇔ min_{V∈conv(W_k)} ||vec(V)||^2

and we define W_{i,j} = trace(V_i^T V_j), the symmetric matrix whose entries are all the possible scalar (Frobenius) products, or simply W_{i,j} = vec(V_i)^T vec(V_j). The matrix minimization then becomes

min_{α·1=1, α≥0} α^T W α,   (4.2.2)

looking for a solution v = Σ_{i=1}^k α_i vec(V_i) ∈ R^{nm} and then finally considering V ∈ R^{m×n} s.t. v = vec(V).
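As a sketch of this step (own variable names; quadprog belongs to MATLAB's Optimization Toolbox), the minimum-norm element of the convex hull of a few stored vectors can be computed as follows:

    V = randn(5, 3);                  % columns play the role of the collected gradients v_i
    G = V'*V;                         % Gram matrix W with W(i,j) = v_i'*v_j
    k = size(V, 2);
    Aeq = ones(1, k); beq = 1;        % sum(alpha) = 1
    lb = zeros(k, 1);                 % alpha >= 0
    alpha = quadprog(2*G, zeros(k,1), [], [], Aeq, beq, lb, []);
    w = V*alpha;                      % shortest vector in conv{v_1, ..., v_k}

The factor 2 accounts for quadprog minimizing (1/2) α^T H α, so that H = 2G yields exactly (4.2.1).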

4.3 ε-subdifferential method

We are now ready to discuss a minimization approach based on the extraction procedure for descent directions of Algorithm 2. The idea is to successively decrease the ε parameter while approaching a stationary point, seeking descent directions among gradients in closer and closer neighborhoods. We summarize the strategy in Algorithm 4. We remark that Line 10 could actually be improved. The ε_i steplength is sufficient to accomplish a uniform decrease of the cost function but, as shown in [17], a further linesearch could be performed both to avoid too small steps on the manifold and to accomplish a sufficiently large step towards the minimum. The linesearch is based on a generalization of the rule presented in [23, pp. 60-62] to satisfy the Wolfe conditions for nonlinear cost functions.

Algorithm 4 A minimization Algorithm; x = Min(f, x_0, ε_1, δ_1, c_1, tol).
Input: f : M → R locally Lipschitz, x_0 starting point and c_1, ε_1, δ_1 ∈ (0, 1).
1: x = x_0.
2: for i = 1, 2, ... do
3:   x_1 = x
4:   for k = 1, 2, ... do
5:     g_k = Descent(f, x_k, δ_i, c_1, ε_i) with g_k ← −w_k/||w_k||   ▷ Descent direction
6:     if ||w_k|| ≤ tol then
7:       x = x_k, Exit.
8:     end if
9:     if f(exp_{x_k}(ε_i g_k)) − f(x_k) ≤ −c_1 ε_i ||w_k|| then
10:      x_{k+1} = exp_{x_k}(ε_i g_k)                                  ▷ Uniform decrease
11:    else
12:      Set ε_{i+1} < ε_i, δ_{i+1} < δ_i and x = x_k                  ▷ Shrink ∂_{ε_i} f(x)
13:      Break.
14:    end if
15:  end for
16: end for

4.4 An example of optimization on S^{n−1}

We consider as a first simulation the following convex optimization problem:

min_{x∈S^{n−1}} ||Qx||_{ℓ1},   (4.4.1)

with Q ∈ R^{m×n} of the form

Q = [ 1  0
      0  Q̃ ],

where Q̃ ∈ R^{(m−1)×(n−1)} is a real orthonormal matrix. Problem (4.4.1) admits the sparse local minimum x = [1, 0, ..., 0]^T, which we seek with the Riemannian nonsmooth approach.

Consider f(x) = ||Qx||_{ℓ1} on the unit sphere manifold. The Riemannian gradient, when it exists, admits the formula grad f(x) = (I − xx^T)Q^T sign(Qx). As we pointed out, descent directions can be obtained from convex hulls of Riemannian gradients, exploiting the almost everywhere differentiability of the convex map. Moreover, these tangent vectors need to be transported to the current iterate: we rely on the parallel transport map along geodesics, obtained as follows. Assume that we want to transport a tangent vector ξ_y ∈ T_yM to T_xM along a geodesic linking y to x. In the next proposition we show how to obtain a geodesic on the sphere prescribing two points of passage.

Proposition 4.4.1. Consider x, y ∈ S^{n−1} and the geodesic σ(t) = exp_y(tv), where v is given by

v = ( d_g(x, y) / ||(I_n − yy^T)(x − y)|| ) (I_n − yy^T)(x − y),

and I_n is the n-by-n identity matrix. Then σ(1) = x.

Proof. On S^{n−1} the exponential mapping is a homeomorphism (away from the antipodal point of y) and its inverse at y is given by

exp_y^{-1} : S^{n−1} → T_y S^{n−1},   x ↦ exp_y^{-1}(x) = (θ/sin(θ)) (x − y cos(θ)),

where θ = arccos(x^T y) is the angle between x and y. We shall consider v = exp_y^{-1}(x). The tangent vector v ∈ T_yM is thus obtained as a lift of the point x, as sketched in Figure 4.2. Thus we have

Figure 4.2: Two-dimensional sketch of the geodesic σ(t) connecting y to x on the unit sphere, with v = exp_y^{-1}(x) ∈ T_y S^{n−1}.

v = exp_y^{-1}(x) = (θ/sin(θ)) (x − y cos(θ))
  = ( d_g(x, y) / √(1 − cos^2(arccos(x^T y))) ) (x − yy^T x)
  = ( d_g(x, y) / √(1 − cos^2(arccos(x^T y))) ) (I_n − yy^T)(x − y).

Moreover, ||(I_n − yy^T)(y − x)||^2 = ||y cos(θ) − x||^2 = x^T x − 2 cos(θ) x^T y + cos^2(θ) y^T y = 1 − cos^2(arccos(x^T y)). Finally, rewriting the denominator, we get

v = exp_y^{-1}(x) = ( d_g(x, y) / ||(I_n − yy^T)(y − x)|| ) (I_n − yy^T)(x − y),

and σ(t) = exp_y(t exp_y^{-1}(x)). Clearly σ(1) = x, and the proof is concluded.

According to Example 2.5.1, we have explicitly the parallel (backward) transport

T_{x←y}(ξ_y) = (I_n + (cos(||v||) − 1) uu^T − sin(||v||) y u^T) ξ_y,

where u = v/||v||. We summarize in Table 4.1 the tools derived in this section.

                          S^{n−1}                                        Total space R^n
Cost function f(x)        ||Qx||_{ℓ1}, x^T x = 1                         ||Qx||_{ℓ1}
Metric                    induced metric                                 ⟨x, y⟩ = x^T y
Tangent space             T_x S^{n−1} = {z ∈ R^n : z^T x = 0}            R^n
Projection                P_{T_x S^{n−1}}(z) = (I − xx^T)z               \
Gradient                  grad f(x) = (I − xx^T)∇f(x)                    ∇f(x) = Q^T sign(Qx)
Retraction                R_x(ξ) = exp_x(ξ)                              \
Transport along geodesics T_{x→γ(t)}(ξ), ξ ∈ T_{γ(0)} S^{n−1}            translation

Table 4.1: ℓ1 sparse optimization on the unit sphere.
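A minimal MATLAB sketch of the log map from Proposition 4.4.1 and of the parallel transport above (own variable names, chosen for illustration):

    n = 5;
    y = randn(n,1); y = y/norm(y);
    x = randn(n,1); x = x/norm(x);

    theta = acos(x'*y);                      % Riemannian distance d_g(x, y)
    w = (eye(n) - y*y')*(x - y);             % unnormalized direction in T_y S^{n-1}
    v = (theta/norm(w))*w;                   % v = exp_y^{-1}(x)
    u = v/norm(v);

    expy = @(t) y*cos(t*norm(v)) + u*sin(t*norm(v));
    norm(expy(1) - x)                        % ~0: the geodesic reaches x at t = 1

    transp = @(xi) xi + (cos(norm(v)) - 1)*u*(u'*xi) - sin(norm(v))*y*(u'*xi);
    xi = (eye(n) - y*y')*randn(n,1);         % a tangent vector at y
    abs(x'*transp(xi))                       % ~0: the transported vector lies in T_x S^{n-1}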

Figure 4.3: On the left: Riemannian distance in logscale of the approximation x^k to e_1 ∈ S^{49} for Problem (4.4.1). On the right: ε updates for the subdifferential ∂_ε f(x).

4.4.1 Numerical Results

We consider a 50-dimensional linear subspace of R^500, whose orthonormal basis is given by the matrix Q ∈ R^{500×50} as in Problem (4.4.1) with n = 50 and m = 500, and we seek the sparsest vector-solution e_1 ∈ S^{n−1}. We report the results in Figure 4.3. In the left plot, we can see how the method recovers the sparse solution e_1 ∈ S^{49} with high accuracy, as the Riemannian distance decays very fast towards machine precision. On the right, it is interesting to see how the ε parameter behaves during the iterations: approaching the solution, the subdifferential method requires shrinking the set ∂_ε f(x) around the minimum to move along linesearches and find new descent directions.
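For reference, the test problem can be set up in a few lines (a sketch; the sizes match the experiment above):

    m = 500; n = 50;
    Qtilde = orth(randn(m-1, n-1));          % random (m-1)-by-(n-1) orthonormal matrix
    Q = blkdiag(1, Qtilde);                  % Q = [1 0; 0 Qtilde] as in (4.4.1)
    e1 = [1; zeros(n-1, 1)];                 % the sparse local minimizer on S^{n-1}
    norm(Q*e1, 1)                            % = 1, the value of the cost at e_1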

Chapter 5

Sparse Rayleigh quotient minimization

In this Chapter we are going to discuss a Riemannian optimization approach to approximate a collection of p eigenvectors of a symmetric matrix A. In numerical linear algebra, subspace iterations and Arnoldi-type methods can already deal with the eigenvalue problem. Nevertheless, we are going to see that band matrices admit a sparse basis of eigenvectors. We apply the nonsmooth Riemannian approach to a penalized Rayleigh quotient formulation to achieve an even sparser approximate eigenbasis. We finally conclude the chapter with numerical results.

5.1 The eigenvalue problem

In this part of the Thesis we focus on the eigenvalue problem for a symmetric matrix A ∈ R^{n×n}, seeking an A-invariant p-dimensional subspace X ⊂ R^n spanned by eigenvectors of A and satisfying

AX = XN,   (5.1.1)

where X ∈ R^{n×p} with span(X) = X and N ∈ R^{p×p}. The spectral theorem for self-adjoint linear and compact operators ensures that R^n admits an orthonormal decomposition w.r.t. the eigenvectors of A. Moreover, we are going to deal with a diagonal matrix N with the corresponding p real eigenvalues on its diagonal. We now derive the theoretical backup to understand how eigenvectors are related to the minimizers of the Rayleigh quotient. Part of the techniques involved are taken from [32] and, respecting the complex operations, the discussion can be easily generalized to Hermitian complex matrices.

Definition 5.1.1. Let A be in R^{n×n} and X ∈ R^{n×p} with rank(X) = p. Then

ρ_A(X) := trace(X^T A X (X^T X)^{-1})

is the Rayleigh quotient of A at X.

Remark. For p = 1, Definition 5.1.1 reduces to the function

ρ_A(x) = (x^T A x)/(x^T x),

for every x ∈ R^n \ {0}. We will see that it is useful to discuss minimizers and maximizers of ρ_A(·) in the case p = 1.

Theorem 5.1.2. Let A ∈ R^{n×n} be symmetric with eigenvalues λ_1 ≤ ... ≤ λ_n. Then

λ_1 = min_{x∈R^n\{0}} ρ_A(x),   λ_n = max_{x∈R^n\{0}} ρ_A(x).

Proof. If A is symmetric, let {v_i}_{i=1}^n be an orthonormal basis of eigenvectors, so that every x ∈ R^n can be expanded on this basis as x = Σ_{i=1}^n α_i v_i for scalars α_i ∈ R. Then ρ_A(x) reads

ρ_A(x) = ( Σ_{j=1}^n α_j v_j )^T A ( Σ_{i=1}^n α_i v_i ) / ( Σ_{i,j=1}^n α_i α_j v_i^T v_j )
       = ( Σ_{i,j=1}^n α_i α_j λ_i δ_{i,j} ) / ( Σ_{i,j=1}^n α_i α_j δ_{i,j} )            (5.1.2)
       = ( Σ_{i=1}^n α_i^2 λ_i ) / ( Σ_{i=1}^n α_i^2 ),

where δ_{i,j} is the Kronecker delta. Trivially, λ_1 ≤ ρ_A(x) ≤ λ_n for every x ∈ R^n \ {0}. Moreover, we have the equalities ρ_A(v_1) = λ_1 and ρ_A(v_n) = λ_n.
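A quick numerical sanity check of Theorem 5.1.2 (a sketch on an arbitrary random symmetric matrix):

    n = 6; A = randn(n); A = (A + A')/2;
    lams = sort(eig(A));
    x = randn(n, 1);
    rho = (x'*A*x)/(x'*x);
    [lams(1) <= rho, rho <= lams(end)]       % both 1: lambda_1 <= rho_A(x) <= lambda_n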

Remark. The fact that A is symmetric is crucial to exploit the decomposition of R^n w.r.t. an orthonormal basis of eigenvectors. In general, we cannot hope the theorem to hold for non-symmetric matrices. For this, we give the following counterexample. Consider

A = [ 0 1
      0 0 ],

which is not symmetric. The only eigenvalue is λ = 0, while

max_{||x||=1} ρ_A(x) = 1/2,

attained for x = (1/√2) [1, 1]^T.

A variant of this theorem is obtained when x is constrained to be orthogonal to the first p − 1 eigenvectors.

Lemma 5.1.3. Let A ∈ R^{n×n} be symmetric with eigenvalues λ_1 ≤ ... ≤ λ_n. Set V_{p−1} := {v_1, ..., v_{p−1}} for 2 ≤ p ≤ n. Then

λ_p = min_{x ∈ V_{p−1}^⊥, x≠0} ρ_A(x).

Proof. If x ∈ V_{p−1}^⊥, then it admits a decomposition of the kind x = Σ_{i=p}^n α_i v_i and (5.1.2) reduces to

ρ_A(x) = ( Σ_{i=p}^n α_i^2 λ_i ) / ( Σ_{i=p}^n α_i^2 ).

In the same way, we conclude that the minimum is attained at vp and ρA(vp) = λp.

Lemma 5.1.3 can be extended to gain the same results for interior eigenvalues without the knowledge of the previous p − 1 eigenvectors.

Theorem 5.1.4 (Courant-Fischer min-max theorem). Let A ∈ R^{n×n} be symmetric with eigenvalues λ_1 ≤ ... ≤ λ_n. Then

λ_p = min_{X ⊂ R^n subspace, dim(X)=p} max_{x∈X, x≠0} ρ_A(x) = min_{X∈R^{n×p}, rank(X)=p} max_{y≠0} ρ_A(Xy).

Proof. The second equality is obvious by changing coordinate system. For the first equality, we set

Vp−1 = {v1, ..., vp−1}.

For every p-dimensional subspace X, there is x ∈ X ∩ V_{p−1}^⊥ with x^T x = 1. Indeed, dim(V_{p−1}^⊥) = n − p + 1 and the intersection with a p-dimensional subspace is at least 1-dimensional. From Lemma 5.1.3, for such an x it follows that ρ_A(x) ≥ λ_p. The equality is obtained by considering X = span{v_1, ..., v_p} and x = v_p.

We have the following Corollary.

Corollary 5.1.5 (Monotonicity principle). Let A ∈ R^{n×n} be symmetric with eigenvalues λ_1 ≤ ... ≤ λ_n and let U ∈ R^{n×p} be orthonormal, i.e. U^T U = I_p. Then the eigenvalues µ_1 ≤ ... ≤ µ_p of the symmetric matrix M := U^T A U satisfy

λk ≤ µk, for every 1 ≤ k ≤ p.

Proof. Let U = span(U). By applying Theorem 5.1.4, we have

λ_k = min_{X ⊂ R^n subspace, dim(X)=k} max_{x∈X, x≠0} ρ_A(x)

    ≤ min_{X ⊂ U subspace, dim(X)=k} max_{x∈X, x≠0} ρ_A(x)

    = min_{X∈R^{p×k}, rank(X)=k} max_{y∈R^k, y≠0} ρ_A(UXy),

where in the last equality we have used again Theorem 5.1.4 and the fact that minimizing over the k-dimensional subspaces of U is equivalent to minimizing over full-rank X ∈ R^{p×k}, with the argument UX spanning such subspaces. Finally, we notice that

ρ_A(UXy) = (UXy)^T A (UXy) / ((UXy)^T (UXy)) = (Xy)^T M (Xy) / ((Xy)^T (Xy)) = ρ_M(Xy)

and also

min_{X∈R^{p×k}, rank(X)=k} max_{y∈R^k, y≠0} ρ_M(Xy) = µ_k,

which concludes the proof.

We now state and prove the result for the trace minimization of the Rayleigh quotient over the set of orthonormal n-by-p matrices.

Theorem 5.1.6 (Ky-Fan theorem). Let A ∈ R^{n×n} be symmetric with eigenvalues λ_1 ≤ ... ≤ λ_n. Then

λ_1 + ... + λ_p = min_{X∈R^{n×p}, X^T X = I_p} trace(X^T A X).

Proof. Let X be an arbitrary orthonormal n-by-p matrix and let µ_1 ≤ ... ≤ µ_p be the eigenvalues of X^T A X. By the monotonicity principle,

λ1 + ... + λp ≤ µ1 + ... + µp.

Again, the equality holds when X is an orthonormal basis for the smallest p eigenvectors, i.e. X = [v_1, ..., v_p].

The Rayleigh quotient ρ_A(X) is defined for arbitrary full-rank matrices X, but in this Thesis we focus on orthonormal n-by-p matrices, since they form an embedded submanifold called the Stiefel manifold. We are thus interested in optimizing modified cost functions obtained from

ρ_A(X) = trace(X^T A X), with X^T X = I_p.
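A short numerical check of Theorem 5.1.6 (a sketch; sizes chosen arbitrarily):

    n = 8; p = 3; A = randn(n); A = (A + A')/2;
    [V, D] = eig(A);
    [lams, idx] = sort(diag(D));
    Xstar = V(:, idx(1:p));               % orthonormal basis of the p smallest eigenvectors
    sum(lams(1:p))                        % lambda_1 + ... + lambda_p
    trace(Xstar'*A*Xstar)                 % attains the minimum of trace(X'AX) over X'X = I_p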

5.2 Sparse eigenvectors

We now discuss the eigenvalue problem and an intrinsic "hidden" property for the case of band matrices. Suppose we would like to approximate a p-dimensional basis of eigenvectors of a matrix A obtained as follows: A=rand(n); A=A+A'; A=hess(A);. We will get a better insight on this generating process later in the section, but it is already clear that we are generating a symmetric tridiagonal matrix. What if we now extract, with the help of the eig(·) command, the basis V ∈ R^{n×p} of the p smallest eigenvectors and plot the spectral projector P = VV^T? In Figure 5.1 we show the magnitude of the spectral projector's entries P_{i,j} for a symmetric tridiagonal matrix A ∈ R^{100×100}, with V a 100-by-40 orthonormal basis of eigenvectors.

Figure 5.1: P_{i,j} and log(|P_{i,j}|) on the top and bottom, respectively, for the entries of the spectral projector P = VV^T ∈ R^{100×100} relative to a symmetric tridiagonal matrix. In the bottom log-plot, the entries below 10^{−10} are set to zero.

In the top plot, we can see the localization properties of P_{i,j} around the diagonal while, in the bottom one, the rapid decay of the off-diagonal entries.
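A minimal sketch of this experiment (sizes as in Figure 5.1, own variable names):

    n = 100; p = 40;
    A = rand(n); A = A + A'; A = hess(A);    % symmetric tridiagonal matrix
    [V, D] = eig(A);
    [~, idx] = sort(diag(D));
    Vp = V(:, idx(1:p));                     % basis of the p smallest eigenvectors
    P = Vp*Vp';                              % spectral projector
    surf(log10(abs(P) + 1e-16));             % visualize the off-diagonal decay of |P_ij|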

5.2.1 Localized spectral projectors

Decay properties for functions of band matrices have been investigated in [4, 5]. In these works, the authors have shown how the structure of a band matrix is numerically preserved when a suitable matrix function is applied. If, for example, we consider a tridiagonal matrix A, then the matrix exponential exp(A) has strong localization properties. The goal of this section is to show how the sparsity of a basis of eigenvectors is linked to this decay theory, by rewriting the spectral projector as a matrix function of a tridiagonal symmetric matrix. We follow the discussion in [3, Section 4.1.3]. For simplicity we start from p = 1. Let v be an eigenvector of A; then the spectral projector associated to v is

 2  v1 v1v2 . . . v1vn v v v2 . . . v v  >  2 1 2 2 n P = vv =   ,  . . .. .   . . . .  2 vnv1 vnv2 . . . vn where here, differently from Section 5.1, we denote by v1 the i-th component of a vector v. Let us now assume that v is sparse numerically, i.e. its "mass" is localized on a few entries. Then, the spectral projector P is numerically sparse since most of the entries vivj are going to be close to zero. As in the Figure 5.1, the biggest magnitudes tend to be localized in the diagonal. We then showed that P is localized if and only if v is. Consider now 1 ≤ p ≤ n, and let Vp be an orthonormal basis w.r.t the p smallest eigenvectors: in this case, due to a numerical cancellation issue, the localization of the > eigenvector is only a sufficient condition for the localization of P = VpVp . Indeed, if p = n, we denote by V the of the spectral decomposition of A. Notice > that VV = In (maximal localization), while V can be strongly delocalized. It is then reasonable to ask p  n. Most importantly, we remark that if P is delocalized, we may seek a more sparse basis U (as localized as possible)

> > > > P = VpVp = VpΘΘ Vp = UU ,U = VpΘ, for another unitary transformation Θ. The spectral projector associated to the smallest p eigenvectors can be considered as a regular matrix function φ(·) satisfying ( 1, if 1 ≤ i ≤ p, φ(λi) = 0, otherwise.

Then it holds

φ(A) = φ(VDV^T) = V φ(D) V^T = V_p V_p^T.

Evidently, we relied on the spectral decomposition D = V^T A V of the symmetric tridiagonal matrix A and on the fact that a matrix function acts as a polynomial: if A admits a diagonal unitary decomposition, φ(·) acts directly on the diagonal as an analytic function that interpolates part of the spectrum. For more on matrix functions we refer to [18]. In conclusion, we can exploit the decay behaviour of φ(A) = V_p V_p^T to infer that A admits a localized basis of eigenvectors V_p. This theoretical excursus is the basis for the numerically sparse approach we are going to present next.

5.2.2 Generating process

Our goal is to design a numerical strategy, involving Riemannian optimization of the Rayleigh quotient, to reach an even sparser approximation of a p-dimensional eigenspace. Nevertheless, we need a way to generate such matrices while controlling the spectrum at the same time, in order to obtain meaningful numerical results. We are interested in working with tridiagonal symmetric matrices that have a well-separated spectrum around the p-th eigenvalue and a controlled gap l ∈ R. In MatLab, we directly build their spectral decomposition by hand as follows. Let D = diag(λ_1, ..., λ_n) so that λ_1 ≤ ... ≤ λ_p < λ_{p+1} ≤ ... ≤ λ_n and |λ_n − λ_1| ≤ l. We then consider an orthonormal matrix V ∈ R^{n×n} to obtain

B = VDV^T.

Clearly, B is an n-by-n real symmetric matrix whose spectrum matches exactly the chosen eigenvalues. To obtain a symmetric tridiagonal matrix A from B with the same spectrum, we consider the Hessenberg reduction by means of Householder transformations. The latter works as follows: let u ∈ R^n and consider the Householder reflection

P := I_n − (2/(u^T u)) uu^T.

Then P reflects vectors across the hyperplane span(u)^⊥. As explained in [16], by setting

u = x ± ||x||e_1, we have

P x = (I_n − (2/(u^T u)) uu^T) x = ∓||x||e_1.

To reduce a matrix to Hessenberg form, we apply on the left and on the right an (n−1)-dimensional Householder reflection P_1 to the (n−1)×(n−1) submatrix B_1 of B, as

[ 1  0   ] [ ×  ... ] [ 1  0   ]^{-1}
[ 0  P_1 ] [ ...  B_1] [ 0  P_1 ]          (5.2.1)

Clearly, symmetry is preserved and we thus obtain a matrix of the form

[ ×  ×  0  ...  0
  ×  ×  ×  ...  ×
  0  ×
  ...
  0  ×             ].

Iterating Householder reflections on smaller and smaller submatrices yields the desired tridiagonal matrix A with the same spectrum as D. In MatLab, the matrix A is obtained by typing

d=sort(l*rand(n,1)-l/2); d(p+1)=d(p)+0.2;
D=diag(d); V=orth(rand(n));                        (5.2.2)
B=V*D*V'; A=hess(B);

where in the first line we generate the spectrum with the desired gap l and separate the p smallest eigenvalues from the others. This rule generalizes the generating process introduced at the beginning of the section, allowing us to control the spectrum and run coherent and meaningful numerical tests.
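A quick check (a sketch with arbitrary n, p, l) that the construction (5.2.2) indeed yields a numerically tridiagonal matrix with the prescribed spectrum:

    n = 100; p = 10; l = 10;
    d = sort(l*rand(n,1) - l/2); d(p+1) = d(p) + 0.2; D = diag(d);
    V = orth(rand(n)); B = V*D*V'; A = hess(B);
    norm(A - triu(tril(A,1),-1), 'fro')    % ~1e-15: A is numerically tridiagonal
    norm(sort(eig(A)) - sort(d))           % ~1e-13: same spectrum as D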

5.3 Weighted Rayleigh quotient sparse minimization

We are interested in approximating the solution of a weighted formulation of the Rayleigh quotient, to find an even more sparsified basis X ∈ R^{n×p} for p eigenvectors of A as in (5.2.2). To do this, we naturally set the problem in the Riemannian framework of the Stiefel manifold St_p^n := {X ∈ R^{n×p} : X^T X = I_p}. We consider the ℓ1-weighted formulation

min_{X∈St_p^n} f(X) := trace(X^T A X) + λ||vec(X)||_{ℓ1},   (WRQ)

where λ ∈ R is the weight parameter. While the smooth part tends to be minimized by the collection of the p smallest eigenvectors, the presence of the ℓ1-norm penalizes the entries and, depending on the λ parameter, allows us to approximate the basis of eigenvectors while sparsifying it at the same time. We would like to investigate the applicability of the ε-subdifferential method to the nonsmooth cost function involved.

Remark. The nonsmooth part of the cost function is simply ||vec(X)||_{ℓ1} := Σ_{i,j} |X_{i,j}|. In MatLab, we shall use the command norm(X(:),1) to compute this quantity. Moreover, its Euclidean gradient is

∇||vec(X)||_{ℓ1} = sign(X).

58 n 5.3.1 The manifold Stp As for the sphere and the manifold of fixed rank matrices, we seek for the Stiefel manifold a analytic expression of maps such as exponential, geodesics, projections and vector transport. Unfortunately, contrary to the unit-sphere, geodesics and exponential mappings are too expensive to compute even if given in closed form. We are going to consider a more computational friendly retraction. n The tangent space at point X ∈ Stp is the set

n n×p > > TX Stp = {Z ∈ R : X Z + Z X = 0},

n > (n−k)×k or equivalently, TX Stp = {XΩ + X⊥K :Ω = Ω,K ∈ R }. An effective retraction is the map RX (ξ) = qf(X + ξ), n with ξ ∈ TX Stp and qf(·) is the q factor in the QR decomposition of a matrix. The n×p n projection of a matrix Z ∈ R onto the tangent space TX Stp is obtained by

n > > > PTX Stp (Z) = Z − Xsym(X Z) = (I − XX )Z + Xskew(X Z), (5.3.1) where 1 1 sym(M) := (M + M >), skew(M) := (M − M >) 2 2 denote the symmetric and the skew-symmetric part of a matrix M. We now derive the Riemannian gradient for the cost function (WRQ). Firstly, we notice that g(X) := > > trace(X AX) = trace(Φ(X)), where Φ(X) = X AX. Thus, ∇X trace(Φ(X)) = ∇Φ(X)trace(Φ(X))∇X Φ(X) = 2AX. The Riemannian gradient for the weighted Rayleigh quotient with symmetric matrix A is thus

grad f(X) = (I − XX^T)(2AX + λ sign(X)) + λ X skew(X^T sign(X)),

obtained according to the projection (5.3.1). We summarize all the derived tools in Table 5.1. We stress that the optimization on S^{n−1} is also described in the table, as the Stiefel manifold reduces to the unit sphere when p = 1. Nevertheless, on the sphere it is preferable to retract and parallel translate tangent vectors via geodesics.
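A minimal MATLAB sketch of these tools (own naming; A symmetric and the weight lambda given), assembling the cost, the Euclidean and Riemannian gradients, and the QR retraction of Table 5.1:

    n = 100; p = 5; lambda = 1e-3;
    A = randn(n); A = (A + A')/2;
    X = orth(randn(n, p));                             % a point on St_p^n

    f     = @(X) trace(X'*A*X) + lambda*norm(X(:), 1); % cost in (WRQ)
    egrad = @(X) 2*A*X + lambda*sign(X);               % Euclidean gradient
    proj  = @(X, Z) Z - X*((X'*Z + Z'*X)/2);           % projection (5.3.1)
    rgrad = proj(X, egrad(X));                         % Riemannian gradient at X

    xi = proj(X, randn(n, p));                         % a tangent vector at X
    [Q, ~] = qr(X + xi, 0); Xnew = Q;                  % retraction R_X(xi) = qf(X + xi)
    norm(Xnew'*Xnew - eye(p), 'fro')                   % ~0: Xnew stays on the manifold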

5.3.2 Sparse eigenvectors on S^{n−1}

Problem (WRQ) reduces to a Riemannian optimization on the sphere when p = 1. The implementation of this case comes for free, applying the machinery discussed in Section 4.4 to the functions

f(x) = x^T A x + λ||x||_{ℓ1},
grad f(x) = (I_n − xx^T)(2Ax + λ sign(x)).

                    St_p^n                                              Total space R^{n×p}
Cost function f(X)  trace(X^T A X) + λ||vec(X)||_{ℓ1}                   trace(X^T A X) + λ||vec(X)||_{ℓ1}
Metric              induced metric                                      ⟨X, Y⟩ = trace(X^T Y)
Tangent space       T_X St_p^n = {Z ∈ R^{n×p} : sym(X^T Z) = 0}         R^{n×p}
Projection          P_{T_X St_p^n}(Z) = Z − X sym(X^T Z)                \
Gradient            grad f(X) = P_{T_X St_p^n}(∇f(X))                   ∇f(X) = 2AX + λ sign(X)
Retraction          R_X(ξ) = qf(X + ξ)                                  \
Vector transport    T_{X→Y}(ξ_X) = P_{T_Y St_p^n}(ξ_X), ξ_X ∈ T_X St_p^n   translation

Table 5.1: Weighted Rayleigh function on the Stiefel manifold.

Consider now a matrix A ∈ R^{n×n}, with n = 1000 and spectral gap l = 10, obtained in MatLab according to (5.2.2). We apply the ε-subdifferential method to the weighted minimization for different values of λ. We remark that the unit sphere is equipped with an efficient exponential mapping, so we can closely follow the steps of the algorithms presented. In Figure 5.2, we measure the quality of the approximation with the Riemannian distance d_g in logscale. We denote by v the eigenvector associated with the smallest eigenvalue of A and by x_i the iterates produced by the algorithm. Intuitively, the smaller λ is, the closer the final approximation is to the eigenvector v of A.

Figure 5.2: Riemannian distance in logscale between the sparse approximations x_i and the eigenvector v on S^{999}, for different weight parameters λ (from 10^{−2} down to 10^{−7}).

The plot confirms the intuition, but we are now interested in the sparsity of the approximation: for which λ do we have a good compromise between accuracy and sparsity? How does the sparsity behave w.r.t. n?

To measure the sparsity of a matrix A ∈ R^{m×n}, we define the index

nnz(A) := #{A_{i,j} : |A_{i,j}| ≥ tol},

where tol is a threshold below which an entry of A is considered as zero. In Figure 5.3 we address these questions. Firstly, we notice in the right plot that the accuracy does not change significantly w.r.t. n, but rather w.r.t. λ.
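In MatLab terms, the index above and the sparsity fraction used in the plots can be written as one-line helpers (a sketch, own naming):

    nnz_tol  = @(A, tol) nnz(abs(A) >= tol);           % entries above the threshold
    sparsity = @(A, tol) nnz_tol(A, tol) / numel(A);   % fraction of "nonzero" entries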

Figure 5.3: On the left: reached accuracies for different-size approximations x(n) ∈ R^n, varying the λ weight. On the right: sparsity of x(n) compared to the eigenvector v obtained by the eig(·) command in MatLab.

We denote by x(n) ∈ R^n the n-size approximation achieved by the ε-subdifferential method. In the left plot, we compare instead the numerical sparsity nnz(x(n))/n to the one computed by the eig(·) command in MatLab, again for different λ. A tolerance tol = 10^{−4} is fixed to consider an entry as zero in the sparsity analysis. There is no significant gain in sparsity and, even for larger λ (e.g. 10^{−3}), the graph lies below the sparsity of the exact eigenvectors v(n). Nevertheless, the goal is to work with multiple eigenvectors at the same time on the Stiefel manifold, where even a small gain in sparsity for each vector can be meaningful.

5.3.3 Sparse eigenvectors on St_p^n

Since in this part we need to compare subspaces that contain multiple directions spanning eigenspaces, we introduce a general notion of angles between subspaces. Indeed, even if we get an orthonormal matrix as output of the numerical approach, we would like to measure numerically how close it is to being A-invariant and to the span of the smallest p eigenvectors. To do so, we define a notion of angles between subspaces.

Definition 5.3.1. Let X, Y ∈ R^{n×p} have orthonormal columns and denote by X, Y the subspaces spanned by the columns of X and Y. We define the i-th canonical angle between X and Y as the quantity

θ_i(X, Y) := arccos σ_i,   i = 1, ..., p,

where 0 ≤ σ_1 ≤ ... ≤ σ_p are the singular values of X^T Y.

The definition does not change for complex subspaces (up to considering the complex adjoint), but we prefer to remain in the real case in order to stay close to the Stiefel manifold.

Remark. A few remarks about this definition:

(i) θ_1 is the largest canonical angle, since the arccos(·) function is monotonically decreasing; it coincides with the angle between vectors in the one-dimensional case, i.e. p = 1 and θ(x, y) = arccos(x^T y / (||x|| ||y||)).

(ii) We have the geometric characterization

θ_1(X, Y) = max_{x∈X, x≠0} min_{y∈Y, y≠0} θ(x, y).

It follows that θ_1(X, Y) < π/2 if and only if X ∩ Y^⊥ = {0}.

(iii) In MatLab, it suffices to call the function subspace(X,Y) to measure the biggest canonical angle.
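A short sketch of Definition 5.3.1 in MATLAB, checked against the built-in subspace(·,·):

    n = 50; p = 5;
    X = orth(randn(n, p)); Y = orth(randn(n, p));
    s = svd(X'*Y);                          % singular values of X'Y, lying in [0, 1]
    theta = acos(min(max(s, 0), 1));        % canonical angles (clipping guards roundoff)
    [max(theta), subspace(X, Y)]            % both give the largest canonical angle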

We now apply the ε-subdifferential method to problem (WRQ) for the case of a tridiagonal symmetric matrix A ∈ R^{300×300}. Unfortunately, on the Stiefel manifold computing retractions via the exponential mapping is computationally impractical; we thus apply the method with the retraction and vector transport discussed in Table 5.1. We seek a sparsified approximation of the smallest p = 40 eigenvectors, as a minimizer belonging to St_40^300. The method only needs to evaluate the cost function and its Riemannian gradient at several points during the iterations: it is convenient to store A in sparse format to speed up matrix multiplications. Let X_i be the iterates of the subspaces produced, and V the subspace spanned by the p smallest eigenvectors of A. We indicate by X_i, V ∈ St_p^n the two orthonormal bases of these subspaces, respectively. In Figure 5.4 we can see the behaviour of the iterates produced by the nonsmooth Riemannian approach for different values of the weight. The plots confirm the intuition that, for lower values of the λ parameter, the subspaces closely approach the span of the smallest p eigenvectors, as the biggest canonical angle reaches smaller and smaller values. Moreover, the cost function gets significantly close to the Rayleigh quotient ρ_A(V).


Figure 5.4: Biggest canonical angles between the approximated subspaces and the p-smallest eigenspace on the top. Values f(X_i) converging to the Rayleigh quotient on the bottom.

Now that we have established the effectiveness of the nonsmooth approach through a numerical analysis of the weight parameter, we investigate how sparse (numerically) the output X_i is. Intuitively, the λ parameter, if not too small, penalizes the magnitude of the entries of X_i, as their 1-norm contributes to the cost function. We consider a set of matrices with separated spectrum, as in Section 5.2, with increasing sizes n. We denote by X(n), V(n) the approximation matrix and the exact eigenvectors, respectively, and by X(n), V(n) the corresponding spanned subspaces. The goal is to show that for larger n the sparsity nnz(X(n))/(np) decreases while still keeping a good level of accuracy, expressed again in terms of the biggest canonical angle. We fix p = 40 and the gap |λ_min − λ_max| = 10, and we propose in Figure 5.5 the

same simulation performed for S^{n−1}.

Figure 5.5: On the top: reached accuracies for the different-size approximations X(n) of V(n) ∈ St_40^n, varying the λ weight. On the bottom: sparsity of X(n) compared to the eigenbasis V(n) obtained by the eig(·) command in MatLab.

The accuracy, expressed in terms of the biggest canonical angle, is kept approximately fixed for each λ while varying n. Moreover, the reached accuracy is in line with the error observed on the unit sphere. Regarding the sparsity of the basis X(n), we see that a big improvement has been achieved compared to the eig(·) command; for each n, the recovered eigenbasis X(n) is sensibly sparser numerically (tol = 10^{−4}). A key strategy was used here: the approximation X(n) obtained for one λ is used as the initial guess for the successive call of the method with a smaller λ. This is helpful in keeping the matrices sparse and in decreasing the number of iterations needed for convergence. On the other hand, it is not particularly effective in gaining accuracy.

5.4 Nonsmooth Matrix Completion

In [8], a Riemannian method is proposed to face the nonsmooth regularized formulation of the matrix completion problem

min_{X∈M_k} f(X) := ||vec(P_Ω(X − A))||_{ℓ1} + µ||P_Ω̄(X)||_F^2,   (MC-R)

where Ω̄ := {(i, j) : 1 ≤ i ≤ m, 1 ≤ j ≤ n} \ Ω. The regularization term µ||P_Ω̄(X)||_F^2 penalizes the entries on the complementary set Ω̄, while the distance P_Ω(X − A) is minimized in the ℓ1 norm. One prefers to introduce convex 1-norms to take into account non-Gaussian types of noise. The authors propose a Riemannian approach based on the following smoothing technique. Select δ > 0 and consider

Σ_{(i,j)∈Ω} √(δ^2 + (X_{i,j} − A_{i,j})^2),

which is a smooth approximation of ||vec(P_Ω(X − A))||_{ℓ1} when δ is small; for a sketch see Figure 5.6.

Figure 5.6: Smoothing technique: the approximation √(δ^2 + x^2) of the function |x|.

The goal is then to perform a Geometric CG for the optimization problem

min_{X∈M_k} f_δ(X) := Σ_{(i,j)∈Ω} √(δ^2 + (X_{i,j} − A_{i,j})^2) + µ||P_Ω̄(X)||_F^2,   (5.4.1)

for decreasing values of δ. Notice that the Euclidean gradient of f_δ(X) is given entrywise by

(∇f_δ(X))_{i,j} = (X_{i,j} − A_{i,j}) / √(δ^2 + (X_{i,j} − A_{i,j})^2) if (i, j) ∈ Ω,   and 2µX_{i,j} otherwise.

The Riemannian gradient is then obtained by projection according to formula (2.2.1). One can finally consider the optimization Algorithm 5. The algorithm produces a sequence of objective values {f^l}_{l=0}^∞ that can be proved to be decreasing (see [8, Theorem 1]). The optimization method is shown by the authors to perform well in the cases of exact and noisy completion.

Algorithm 5 Smoothing Completion for Problem (MC-R)
Input: f_δ : M_k → R smooth, X_0 ∈ M_k, δ^0 > 0, 0 < θ < 1, µ > 0 and tol > 0.
1: f^0 ← ∞
2: for l = 0, 1, ... do
3:   Solve with Geometric CG
       min_{X∈M_k} Σ_{(i,j)∈Ω} √((δ^l)^2 + (X_{i,j} − A_{i,j})^2) + µ||P_Ω̄(X)||_F^2
     and set X_{l+1} to its solution.
4:   f^{l+1} ← f_{δ^l}(X_{l+1})
5:   ε ← |f^l − f^{l+1}|
6:   if ε ≤ tol then
7:     Exit.
8:   end if
9:   δ^{l+1} ← θδ^l
10: end for
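A minimal sketch (own setup: random data, arbitrary sizes) of the smoothed cost f_δ of (5.4.1) and its Euclidean gradient, with Ω stored as a logical mask:

    m = 20; n = 15; mu = 1e-3; delta = 1e-2;
    A = randn(m, n); Omega = rand(m, n) < 0.3; X = randn(m, n);

    fdelta = sum(sqrt(delta^2 + (X(Omega) - A(Omega)).^2)) + mu*sum(X(~Omega).^2);
    G = zeros(m, n);
    G(Omega)  = (X(Omega) - A(Omega)) ./ sqrt(delta^2 + (X(Omega) - A(Omega)).^2);
    G(~Omega) = 2*mu*X(~Omega);   % Euclidean gradient; the Riemannian one follows by projection onto T_X M_k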

5.4.1 Further work directions

A natural follow-up of this Thesis work is to investigate the ε-subdifferential method for the convex formulation of matrix completion, instead of using smoothing techniques. We have already seen that convex ℓ1 penalties can be efficiently handled by the subdifferential. In addition to the regularized Problem (MC-R), one can simply consider

min_{X∈M_k} ||vec(P_Ω(X − A))||_{ℓ1}.   (MC1)

Obviously, if the matrix completion problem is well posed in M_k, there exists a unique global minimizer X ∈ M_k s.t. ||vec(P_Ω(X − A))||_{ℓ1} = 0. Nevertheless, the implementation of the ε-subdifferential method on M_k requires further research and coding work in MatLab. One needs to explore how well the SVD retraction and the vector transport by metric projection behave for the nonsmooth approach on M_k. If too much error accumulates in the ε-subdifferential method, we cannot hope to reach a suitable accuracy in the approximations. Moreover, further coding work is needed to be able to work with high-dimensional matrix manifolds. The quadratic program involved in (4.2.2) can be replaced by a much more effective choice: the minimizer lies on a facet of the convex hull lying in a d-dimensional tangent space, and the program can be replaced by an efficient projection if we are able to identify the right facet without building the whole convex hull. Finally, one can investigate the applicability of the nonsmooth method in the tensor case, in the same spirit as the PhD thesis [27], where Riemannian optimization has been extended to the tensor completion problem.

Bibliography

[1] P. A. Absil, R. Mahony, and R. Sepulchre. Optimization algorithms on matrix manifolds. Princeton University Press, 2008.

[2] P.-A. Absil and J. Malick. Projection-like retractions on matrix manifolds. SIAM J. Optim., 2012.

[3] M. Benzi, D. Bini, D. Kressner, H. Munthe-Kaas, and C. Van Loan. Exploiting Hidden Structure in Matrix Computations: Algorithms and Applications: Cetraro, Italy 2015. Springer, 2017.

[4] M. Benzi and P. Boito. Decay properties for functions of matrices over C*-algebras. Linear Algebra Appl., 2014.

[5] M. Benzi, P. Boito, and N. Razouk. Decay properties of spectral projectors with applications to electronic structure. SIAM Rev., 2013.

[6] S. Boyd and L. Vandenberghe. Convex optimization. Cambridge university press, 2004.

[7] J.-F. Cai, E. J. Candès, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM J. Optim., 2010.

[8] L. Cambier and P.A. Absil. Robust low-rank matrix completion by Riemannian optimization. SIAM J. Sci. Comput., 2016.

[9] E. J. Candès and B. Recht. Exact matrix completion via convex optimization. Found. Comput. Math., 2009.

[10] F. H. Clarke. Generalized gradients and applications. Trans. Amer. Math. Soc., 1975.

[11] C. Eckart and G. Young. The approximation of one matrix by another of lower rank. Psychometrika, 1936.

[12] M. Fazel, H. Hindi, and S. P. Boyd. Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices. In Proceedings of the 2003 American Control Conference. IEEE, 2003.

[13] O. P. Ferreira and P. R. Oliveira. Subgradient algorithm on Riemannian manifolds. J. Optim. Theory Appl., 1998.

[14] S. Foucart and H. Rauhut. A mathematical introduction to compressive sensing. Applied and Numerical Harmonic Analysis. Birkhäuser/Springer, 2013.

[15] A. A. Goldstein. Optimization of Lipschitz continuous functions. Math. Program- ming, 1977.

[16] G. H. Golub and C. F. Van Loan. Matrix computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, fourth edition, 2013.

[17] P. Grohs and S. Hosseini. ε-subgradient algorithms for locally Lipschitz functions on Riemannian manifolds. Adv. Comput. Math., 2016.

[18] N. J. Higham. Functions of matrices. Society for Industrial and Applied Mathematics (SIAM), 2008.

[19] S. Hosseini and M. R. Pouryayevali. Generalized gradients and characterization of epi-Lipschitz sets in Riemannian manifolds. Nonlinear Anal., 2011.

[20] S. Kobayashi and K. Nomizu. Foundations of differential geometry. Wiley Classics Library. John Wiley & Sons, 1996.

[21] J. M. Lee. Introduction to smooth manifolds. Graduate Texts in Mathematics. Springer, 2013.

[22] Y. Nesterov. Introductory lectures on convex optimization. Applied Optimization. Kluwer Academic Publishers, 2004.

[23] J. Nocedal and S. J. Wright. Numerical optimization. Springer, second edition, 2006.

[24] B. Recht, M. Fazel, and P. A. Parrilo. Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev., 2010.

[25] R. T. Rockafellar. Convex analysis. Princeton Landmarks in Mathematics. Prince- ton University Press, 1997.

[26] B. P. Rynne and M. A. Youngson. Linear functional analysis. Springer Undergraduate Mathematics Series. Springer-Verlag London, second edition, 2008.

[27] M. M. Steinlechner. Riemannian Optimization for Solving High-Dimensional Problems with Low-Rank Tensor Structure. PhD thesis, EPFL, 2016.

[28] G. Teschl. Ordinary differential equations and dynamical systems. Graduate Studies in Mathematics. American Mathematical Society, 2012.

[29] C. Udrişte. Convex functions and optimization methods on Riemannian manifolds. Mathematics and its Applications. Kluwer Academic Publishers Group, 1994.

[30] B. Vandereycken. Riemannian and multilevel optimization for rank-constrained matrix problems with applications to Lyapunov equations. PhD thesis, Katholieke Universiteit Leuven, 2010.

[31] B. Vandereycken. Low-rank matrix completion by Riemannian optimization. SIAM J. Optim., 2013.

[32] J. H. Wilkinson, editor. The Algebraic Eigenvalue Problem. Oxford University Press, Inc., 1988.
