Biconvex Relaxation for Semidefinite Programming in Computer Vision

Sohil Shah¹, Abhay Yadav¹, Tom Goldstein¹, David Jacobs¹, Carlos Castillo¹, and Christoph Studer²
¹University of Maryland, College Park   ²Cornell University, Ithaca, NY

Abstract

Semidefinite programming has become an indispensable tool in computer vision. Unfortunately, general-purpose solvers for semidefinite programs are too slow to be used on large-scale problems and exhibit excessive memory requirements. We propose a general framework to approximately solve large-scale semidefinite problems (SDPs) at low computational complexity. Our new approach, referred to as biconvex relaxation (BCR), transforms a general SDP into a specific biconvex optimization problem, which can then be solved in the original, low-dimensional variable space at low complexity. The resulting biconvex problem is approximately solved using an efficient alternating minimization (AM) procedure. Since AM has the potential to get stuck in local minima, we propose a general initialization scheme that enables our BCR framework to start close to the global optimum of the original SDP—this is key for our algorithm to quickly converge to optimal or near-optimal solutions. We showcase the efficacy of our approach on two applications in computer vision, namely segmentation and co-segmentation. BCR achieves a comparable solution quality to state-of-the-art SDP methods with speedups ranging between 4× and 35×. At the same time, BCR handles a more general set of SDPs than the fastest previous approaches, which are more specialized.

1. Introduction

Optimization problems involving either integer-valued vectors or matrices with low-rank structure are ubiquitous in computer vision. Graph-cut methods for image segmentation, for example, involve optimization problems where integer-valued variables represent region labels [26, 13, 30, 9]. Problems in multi-camera structure-from-motion [1], manifold embedding [37], and matrix completion [20] require the solution of optimization problems over matrices that are constrained to have low rank. Since the constraints in such optimization problems are non-convex, the design of computationally efficient algorithms that find globally optimal solutions is, in general, a difficult task.

For a wide range of applications [16, 14, 3, 8, 27, 37], non-convex constraints, such as integer-valued vectors and low-rank matrices, can be handled by semidefinite relaxation (SDR) [16]. In this method, a non-convex problem involving a vector of unknowns is "lifted" to a higher-dimensional convex problem involving a positive semidefinite matrix (i.e., a matrix whose eigenvalues are all non-negative) of unknowns, which can then be solved as a semidefinite program (SDP) [31]. While SDR was shown to deliver state-of-the-art performance in a wide range of applications [9, 16, 10, 30, 37, 20], the approach significantly increases the dimensionality of the original optimization problem (i.e., it replaces a vector with a matrix), which typically results in exorbitant computational costs and memory requirements. Nevertheless, SDR leads to SDPs whose globally optimal solution(s) can be found using robust numerical methods.

A growing number of computer-vision applications involve high-resolution images (or even videos) that require the solution of SDPs with an excessive number of variables. General-purpose solvers for SDPs (e.g., interior point methods) require high computational complexity; the worst-case complexity is O(N^6.5 log(1/ε)) [25] for an N×N problem with ε objective error. Very often, N is proportional to the number of image pixels, which is potentially very large.

The prohibitive complexity and memory requirements of solving SDPs exactly with a large number of variables have spawned recent interest in fast, non-convex solvers that avoid lifting, i.e., that operate in the original low-dimensional variable space. For example, recent progress in phase retrieval by Candès et al. [5] has shown that non-convex optimization methods provably achieve a solution quality comparable to exact SDR-based methods at significantly lower complexity. This method operates in the original dimension, which enables its use on high-dimensional problems (reference [5] performs phase retrieval on a 1080×1920-pixel image in less than 22 minutes). Another prominent example is max-norm regularization by Lee et al. [15], which was proposed for solving high-dimensional matrix-completion problems and for approximately performing max-cut clustering. This method was shown to outperform exact SDR-based methods in terms of computational complexity, while delivering acceptable solution quality. While both of these examples outperform classical SDP-based methods, they are limited to very
1 cific problem types, i.e., they perform one of phase retrieval, vex constraints via semidefinite relaxation (SDR) [16] and matrix completion, or max-cut clustering. More importantly, (ii) the resulting problems often come with strong theoreti- these methods are unable to handle complex SDPs that typi- cal guarantees. For example, the Goemans-Williamson [9] cally appear in computer vision. SDR seeks to find a solution to the NP-complete max-cut problem, which can be relaxed to the form (1) and solved 1.1. Contributions using general-purpose methods. Furthermore, the value of In this paper, we introduce a novel framework for approx- the graph cut obtained via this relaxation is known to exceed imately solving general SDPs in a computationally-efficient 80% of the maximum achievable value. manner and with small memory footprint. Our approach, re- In computer vision, a large number of problems can be ferred to as biconvex relaxation (BCR), transforms a general cast as SDPs of the general form (1). For example, [37] SDP into a specific biconvex optimization problem, which formulates image manifold learning as an SDP, [27] uses can then be solved in the original, low-dimensional vari- an SDP to enforce a non-negative lighting constraint when able space at low complexity. The resulting biconvex prob- recovering scene lighting and object albedos, [2] uses an lem is solved using a computationally-efficient alternating SDP for graph matching, [1] proposes an SDP that recovers minimization (AM) procedure. Since AM is prone to get the orientation of multiple cameras from point correspon- stuck in local minima, we propose a general initialization dences and essential matrices, and [20] uses low-rank SDPs scheme that enables our BCR framework to start close to to solve matrix-completion problems that arise in structure- the global optimum of the original SDP—this initialization from-motion and photometric stereo. 
is key for our algorithm to quickly converge to an optimal or near-optimal solution. We showcase the effectiveness of 2.2. SDR for Binary-Valued Quadratic Problems the proposed BCR framework by comparing the resulting A large body of results that use SDR to solve binary la- algorithms to that of highly-specialized SDP solvers for a beling problems exist in the literature. For such problems, select set of problems in computer vision involving image a set of variables take on binary values (e.g., indicating a segmentation and co-segmentation. Our results demonstrate segmentation into two groups) while minimizing a quadratic that BCR enables high-quality results at best-in-class run- cost function that depends on the assignment of pairs of vari- times. Specifically, BCR achieves speedups ranging from ables. Such labeling problems typically arise from Markov 4× to 35× over state-of-the-art competitor methods [34, 17] random fields (MRFs) for which a rich literature on solution for image segmentation and co-segmentation. methods exist (see [33] for a recent survey). Spectral meth- ods, e.g., [26], are often used to solve such binary-valued 2. Background and Relevant Prior Art quadratic problems (BQPs)—the references [13, 30] used We now briefly review semidefinite programs (SDPs) and SDR, inspired by the work of [9] that provides a generalized then discuss prior work on fast, approximate solvers for SDR for the max-cut problem. BQP problems have wide SDPs in computer vision and related applications. applicability to computer vision problems, such as segmen- tation and perceptual organization [13, 34, 11], semantic 2.1. Semidefinite Programs (SDPs) segmentation [35], matching [30, 24], surface reconstruction including photometric stereo and shape from defocus [8], SDPs find use in a large (and growing) number of fields, and image restoration [23]. 
including computer vision, machine learning, signal and im- The key idea of solving BQPs via SDR is to lift the binary- age processing, statistics, communications, and control [31]. N 2 SDPs can be written in the following general form: valued label vector b ∈ {±1} to an N -dimensional ma- trix space by forming the PSD matrix B = bbT , whose minimize hC, Yi non-convex rank-1 constraint is relaxed to PSD matrices N×N Y∈R + B ∈ SN×N with an all-ones diagonal [16]. The goal is then subject to hA , Yi = b , ∀i ∈ E, i i (1) to solve a SDP for B in the hope that the resulting matrix is hAj, Yi ≤ bj, ∀j ∈ B, rank 1 (in this case, SDR found an optimal solution); if B has Y ∈ S+ , higher rank, an approximate solution must be extracted. Ap- N×N proximate solutions can either be obtained from the leading + eigenvector or via randomization methods [16]. where SN×N represents the set of N × N symmetric, pos- hC, Yi = tr(CT Y) itive semidefinite matrices, and is the 2.3. Specialized Solvers for SDPs matrix inner product. The sets E and B contain the indices associated to the equality and inequality constraints, respec- While a number of general-purpose solvers for SDPs tively; the matrices Ai and Aj have appropriate dimensions. exist, such as SeDuMi [28] or SDPT3 [29], their computa- The key advantages of SDPs are that (i) they enable the tional complexity and memory requirements are generally transformation of certain non-convex constraints into con- high. Hence, their use is restricted to low-dimensional prob-

problems. For problems in computer vision, where the number of variables can become comparable to the number of pixels in an image, more efficient algorithms are necessary. To this end, a handful of special-purpose algorithms have been proposed to solve specific problem types arising in computer vision. These algorithms fall into two classes: (i) convex algorithms that solve the original SDP by exploiting the problem structure and (ii) non-convex methods that avoid lifting.

For certain problems in computer vision, it is possible to exactly solve SDPs with much lower complexity than interior point schemes. Most work in this area has focused on BQP problems. Ecker et al. [8] deployed a number of heuristics to speed up the Goemans-Williamson SDR [9] for surface reconstruction. Olsson et al. [23] proposed a spectral method to solve BQP problems that include a linear term, but it is unable to handle inequality constraints. A particularly popular approach is the SDCut algorithm of Wang et al. [34]. This method solves BQPs for some types of segmentation problems by using gradient descent on the dual. This approach leads to a similar relaxation as for BQP problems, but enables significantly lower computational complexity for graph cutting and its variants. To the best of our knowledge, the method by Wang et al. [34] yields state-of-the-art performance; nevertheless, our proposed method is at least an order of magnitude faster, as we show in Section 4.

Another algorithm class consists of non-convex approximation methods that avoid lifting. Because they work with low-dimensional unknowns, such algorithms are potentially more efficient than lifted methods. Simple examples include the Wiberg method [22] for low-rank matrix approximation, which uses Newton-type iterations to minimize a non-convex objective. A number of methods have been proposed for SDPs where the objective function is simply the trace norm of Y (i.e., problem (1) with C = I) and there are no inequality constraints. Approaches include replacing the trace norm with the max norm [15], or using the so-called Wirtinger flow to solve phase-retrieval problems [5]. One of the earliest non-convex approaches is due to Burer and Monteiro [4], who propose an augmented Lagrangian method. While this method is able to handle arbitrary objective functions, it does not naturally support inequality constraints (i.e., without auxiliary slack variables). Furthermore, the convergence of this approach is not well understood, and the method is highly sensitive to the initialization value.

While most of the above-mentioned methods provide best-in-class performance at low computational complexity, they are limited to very specific problems and cannot be generalized to other, more general SDPs.

3. Biconvex Relaxation (BCR) Framework

We now present our framework, referred to as biconvex relaxation (BCR). We then propose an alternating minimization procedure and a suitable initialization method.

3.1. Biconvex Relaxation

Rather than attempting to solve the general SDP (1) directly, we exploit the following key fact: any matrix Y is symmetric positive semidefinite if and only if it has an expansion of the form Y = XX^T. By substituting the factorization Y = XX^T into (1), we are able to remove the semidefinite constraint and arrive at the following problem:

  minimize_{X ∈ R^(N×r)}  tr(X^T C X)
  subject to  tr(X^T A_i X) = b_i, ∀i ∈ E,        (2)
              tr(X^T A_j X) ≤ b_j, ∀j ∈ B.

Above, we used the fact that ⟨C, XX^T⟩ = tr(X^T C X), which is a quadratic function in X, and r = rank(Y).¹

¹Straightforward extensions of our approach allow us to also handle constraints of the form tr(X^T A_k X) ≥ b_k, ∀k ∈ A, as well as complex-valued matrices and vectors. For the sake of simplicity, we will not elaborate on these extensions.

Note that any symmetric matrix A has a (possibly complex-valued) square root L with A = L^T L. Furthermore, we have tr(X^T A X) = tr(X^T L^T L X) = ‖LX‖_F², where ‖·‖_F is the Frobenius (matrix) norm. This enables us to rewrite (2) as follows:

  minimize_{X ∈ R^(N×r)}  tr(X^T C X)
  subject to  Q_i = L_i X, ‖Q_i‖_F² = b_i, ∀i ∈ E,        (3)
              Q_j = L_j X, ‖Q_j‖_F² ≤ b_j, ∀j ∈ B.

If the matrices {A_i}, {A_j}, and C are themselves symmetric positive semidefinite, then the objective function in (3) is convex and quadratic, and the inequality constraints in (3) are convex—the non-convexity of the problem is caused only by the equality constraints. The core idea of BCR, explained next, is to relax these equality constraints.

In the formulation (3), we have lost convexity. Nevertheless, whenever r < N, we achieve a (potentially large) dimensionality reduction compared to the original SDP (1). We now relax (3) into a form that is biconvex, i.e., convex with respect to one group of variables when the remaining variables are held constant. By relaxing the problem into biconvex form, we retain many advantages of the convex formulation while maintaining low dimensionality and speed. In particular, we propose to approximate (1) with the
following biconvex relaxation (BCR):

  minimize_{X, {Q_i}, i ∈ B∪E}  tr(X^T C X) + (α/2) Σ_{i∈E∪B} ‖Q_i − L_i X‖_F² − (β/2) Σ_{j∈E} ‖Q_j‖_F²        (4)
  subject to  ‖Q_i‖_F² ≤ b_i, ∀i ∈ B∪E,

where α > β > 0 are (suitably chosen) regularization parameters. In this BCR formulation, we relaxed the equality constraints ‖Q_i‖_F² = b_i, ∀i ∈ E, to inequality constraints ‖Q_i‖_F² ≤ b_i, ∀i ∈ E, and added negative quadratic penalty functions −(β/2)‖Q_j‖_F², ∀j ∈ E, to the objective function. These negative quadratic penalties promote that the relaxed inequality constraints in E are satisfied with equality. We furthermore replaced the constraints Q_i = L_i X and Q_j = L_j X by quadratic penalty functions in the objective.

Our BCR formulation in (4) has some interesting properties. First, whenever C ∈ S⁺_(N×N), the problem is biconvex, i.e., convex with respect to X when the {Q_i} are held constant, and vice versa. Consider the case of solving a constraint feasibility problem (i.e., problem (1) with C = 0). If Y = XX^T is a solution to (1), then problem (4) attains the objective value −Σ_i b_i, which is the global minimum of the BCR formulation (4). Likewise, it is easy to see that any global minimizer of (4) with objective value −Σ_i b_i must be a solution to the original problem (1).

3.2. Alternating Minimization Algorithm

One of the key benefits of biconvexity is that (4) can be globally minimized with respect to {Q_i} or with respect to X. Hence, it is natural to compute approximate solutions to (4) via alternating minimization (AM). We emphasize that the convergence of AM is well understood for a variety of biconvex problems [7, 6]. AM proceeds by optimizing the problem for the matrices {Q_i} while keeping X fixed, and then optimizing for X while keeping the matrices {Q_i} fixed, in an alternating fashion. The two stages of the proposed AM method for BCR are detailed below.

Stage 1: Minimize with respect to {Q_i}. The objective of the BCR in (4) is quadratic in {Q_i}, and there is no dependence among these matrices. Consequently, the optimal value of Q_i can be found by minimizing the quadratic objective and then re-projecting back onto a Frobenius-norm ball of radius √b_i. The minimizer of the quadratic objective is given by (α/(α−β_i)) L_i X, where β_i = 0 if i ∈ B and β_i = β if i ∈ E. The projection onto the Frobenius-norm ball then leads to the following expansion–re-projection update:

  Q_i ← (L_i X / ‖L_i X‖_F) · min{ √b_i, (α/(α−β_i)) ‖L_i X‖_F }.        (5)

Intuitively, this expansion–re-projection update causes the matrix Q_i to expand if i ∈ E, thus encouraging it to satisfy the relaxed inequality constraints in (4) with equality.

Stage 2: Minimize with respect to X. This stage requires us to solve the following least-squares problem:

  X ← arg min_{X ∈ R^(N×r)}  tr(X^T C X) + (α/2) Σ_{i∈E∪B} ‖Q_i − L_i X‖_F².        (6)

Since the optimality conditions for this problem are linear equations, its solution is simply given by

  X ← (C + α Σ_{i∈E∪B} L_i^T L_i)^(−1) (α Σ_{i∈E∪B} L_i^T Q_i),        (7)

where the matrix inverse may be replaced by a pseudo-inverse, if necessary. Alternatively, one may perform a simple gradient-descent step with a suitable step size, which avoids the costly (pre-)computation of a potentially high-dimensional (pseudo-)inverse.

The resulting AM algorithm to approximately solve the proposed BCR (4) is summarized in Algorithm 1.

Algorithm 1 AM for Biconvex Relaxation
1: inputs: C, {L_i}, b_i, α, and β
2: Compute an initializer for X as in Section 3.3
3: Precompute M = (C + α Σ_{i∈E∪B} L_i^T L_i)^(−1)
4: while not converged do
5:    Q_i ← (L_i X / ‖L_i X‖_F) · min{ √b_i, (α/(α−β_i)) ‖L_i X‖_F }
6:    X ← M (α Σ_{i∈E∪B} L_i^T Q_i)
7: end while
8: output: X

3.3. Initialization

Since the problem (4) is biconvex, a global minimizer can be found with respect to either {Q_i} or X. With the proposed AM procedure, we hope to find a global minimizer in both the matrices {Q_i} and X at low complexity. In practice, however, AM is prone to getting trapped in local minima, especially if the variables have been initialized poorly. We now discuss a principled method for computing an initializer for X that is often close to the global optimum of the BCR problem—our initializer is key for the success of the proposed AM procedure and enables fast convergence.

The papers [21, 5] have considered optimization problems that arise in phase retrieval, where B = ∅ (i.e., there are only equality constraints), C = I is the identity, and Y is rank one. For such problems, the objective of (1) reduces to tr(Y). By setting Y = xx^T, we obtain

  minimize_{x ∈ R^N}  ‖x‖₂²
  subject to  q_i = L_i x, ‖q_i‖₂² = b_i, ∀i ∈ E.        (8)

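Algorithm 1 can be rendered in a few lines of NumPy. The sketch below is our own minimal rendition under simplifying assumptions: equality constraints only (B = ∅), dense matrices, and a fixed iteration budget in place of a convergence test; none of the function or variable names come from a released implementation.

```python
import numpy as np

def bcr_am(C, Ls, bs, X0, alpha=10.0, beta=5.0, iters=100):
    """Alternating minimization for the BCR problem (4), assuming
    equality constraints only, so beta_i = beta for every i."""
    assert alpha > beta > 0
    X = X0.copy()
    # Precompute M = (C + alpha * sum_i L_i^T L_i)^(-1)  (Algorithm 1, line 3)
    M = np.linalg.inv(C + alpha * sum(L.T @ L for L in Ls))
    for _ in range(iters):
        acc = np.zeros_like(X)
        for L, b in zip(Ls, bs):
            LX = L @ X
            nrm = np.linalg.norm(LX)              # Frobenius norm of L_i X
            # Stage 1, update (5): expand by alpha/(alpha - beta),
            # then re-project onto the ball of radius sqrt(b_i).
            scale = min(np.sqrt(b), alpha / (alpha - beta) * nrm)
            acc += L.T @ ((scale / max(nrm, 1e-12)) * LX)
        # Stage 2, update (7): X <- M (alpha * sum_i L_i^T Q_i)
        X = M @ (alpha * acc)
    return X

# Toy feasibility problem (C = 0): a single constraint tr(X^T X) = 4.
N, r = 6, 2
rng = np.random.default_rng(1)
X = bcr_am(np.zeros((N, N)), [np.eye(N)], [4.0],
           X0=0.1 * rng.standard_normal((N, r)))
assert abs(np.linalg.norm(X) - 2.0) < 1e-6   # constraint met: ||X||_F = 2
```

In the toy run, the expansion factor α/(α−β) = 2 grows the iterate each pass until the re-projection caps its norm at √b_i, so the equality constraint is met at convergence.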
In order to find a good initializer for such phase retrieval problems of the form (8), Netrapalli et al. [21] proposed the following strategy. Define

  Z = (1/|E|) Σ_{i∈E} b_i L_i^T L_i.        (9)

Let v be the leading eigenvector of Z and λ the leading eigenvalue. Then x = λv is an accurate approximation to the true solution of (8). In fact, if the matrices L_i are sampled from a random normal distribution, then it was shown in [21, 5] that E‖x* − λv‖₂² → 0 (in expectation) as |E| → ∞, where x* is the true solution to (8).

We are interested in a good initializer for the general problem in (3), where X can be rank one or higher. We focus on problems with equality constraints only—note that one can use slack variables to convert a problem with inequality constraints into this form [31]. Given that C is a symmetric positive definite matrix, it can be decomposed as C = U^T U. By the change of variables X̃ = UX, we can rewrite (1) as follows:

  minimize_{X̃ ∈ R^(N×r)}  ‖X̃‖_F²
  subject to  ⟨Ã_i, X̃X̃^T⟩ = b_i, ∀i ∈ E,        (10)

where Ã_i = U^(−T) A_i U^(−1), and we omitted the inequality constraints. To initialize the proposed AM procedure in Algorithm 1, we make the change of variables X̃ = UX to transform the BCR formulation into the form of (10). Analogously to the initialization procedure in [21] for phase retrieval, we then compute an initializer X̃₀ using the leading r eigenvectors of Z scaled by the leading eigenvalue λ. Finally, we calculate the initializer for the original problem by reversing the change of variables as X₀ = U^(−1) X̃₀.

3.4. Advantages of Biconvex Relaxation

The proposed BCR framework has numerous advantages over other non-convex methods. First and foremost, our approach can be applied to general SDPs. Specialized methods, such as the Wirtinger flow [5] proposed for phase retrieval and the Wiberg method [22] for low-rank approximation, are computationally efficient but restricted in the type of problem they can solve. Similarly, the max-norm regularization in [15] is limited to solving trace-norm-regularized SDPs. The method proposed by Burer and Monteiro [4] is less specialized, but does not naturally support inequality constraints. Furthermore, since BCR problems are biconvex, it is possible to develop corresponding numerical methods that are guaranteed to converge. We emphasize that convergence is guaranteed not only for the proposed AM least-squares method in Algorithm 1 (for which the objective can be shown to decrease monotonically), but also for a broad range of gradient-descent schemes suitable for finding solutions to biconvex problems [38]. In contrast, the method in [4] uses augmented Lagrangian methods with non-linear constraints, for which convergence is not guaranteed.

4. Benchmark Problems

In this section we evaluate our solver using both synthetic and real-world data. We begin with a brief comparison showing that biconvex solvers outperform interior-point methods for general-form SDPs. Of course, for more specific problem forms, specialized solvers for convex problems achieve superior performance to classical interior point schemes. For this reason, we evaluate our proposed method on two important computer vision applications, segmentation and co-segmentation, using public datasets and comparing with state-of-the-art methods. These applications are ideal because (i) it is easy to generate large problems and (ii) customized solvers are available (SDCut and its variants) that exploit problem structure to solve the lifted problem efficiently. This way we can compare our BCR framework against relatively powerful and optimized solvers for the lifted case.

4.1. General-form Problems

We briefly demonstrate that BCR in (4) performs well on general SDPs by comparing it to the widely used SDP solvers SDPT3 and SeDuMi. Note that SDPT3 and SeDuMi use an interior point approach to solve the convex problem in (1). We randomly generate a rank-3 data matrix of the form Y_true = x₁x₁^T + x₂x₂^T + x₃x₃^T. Each vector x_i is a 256-dimensional standard normal vector. We generate a standard normal matrix L and compute C = L^T L. Standard normal matrices {A_i} of dimension 250×250 are sampled, and M measurements b_i are recorded. We report the relative error in the recovered solution Y_rec, measured as |⟨C, Y_rec⟩ − ⟨C, Y_true⟩| / ⟨C, Y_true⟩. We fix α = 1000 and β = 500. Average runtimes for varying numbers of constraints are shown in Figure 1a, while Figure 1b plots the average relative error. Figure 1a shows that our method has better runtime complexity than interior point schemes. Figure 1b shows that the convex interior point methods do not recover the correct solution for small numbers of constraints. Because the lifted SDP has many unknowns, the convex problem becomes under-determined, allowing the objective to go to zero. The proposed BCR approach is able to enforce a rank-3 constraint, which is advantageous when the number of constraints is low.

4.2. Image Segmentation

Consider an image of N pixels containing one major foreground object. The problem of segmenting this image into foreground and background can be solved using graph-based approaches. For these methods, vertices that represent the pixels are bisected into two disjoint sets using spectral clustering approaches including normalized cut [26] and
ratio cut [36]; graph edges encode the similarities between pairs of pixels. This graph cut problem is formulated as

  minimize_{x ∈ {−1,1}^N}  x^T L x,        (11)

which is NP-hard due to the binary-valued vector constraints. Spectral clustering approaches attempt to solve a relaxed version of (11), given by

  minimize_{x ∈ R^N}  x^T L x  subject to  1^T x = 0, ‖x‖₂² = N.        (12)

Here, L is a positive semidefinite Laplacian matrix given by L = D − W, where W is the weighted affinity matrix and D = diag(W1). Although the above problem is non-convex, it can be solved using an eigendecomposition. Equation (12) can be reformulated as follows:

  minimize_{X ∈ S⁺_(N×N)}  tr(L^T X)
  subject to  1^T X 1 = 0        (13)
              diag(X) = 1
              rank(X) = 1.

This problem has been solved approximately in [10, 13, 24] using convex methods after dropping the non-convex rank constraint. This approach, however, does not scale well to a large number of pixels and becomes computationally inefficient when solved using off-the-shelf solvers such as SeDuMi [28] or SDPT3 [29]. In contrast, BCR enables us to incorporate rank constraints while constraining the solution search space, which leads to efficient spectral clustering.

[Figure 1: Results on synthetic data for a varying number of linear constraints. (a) Average solver runtime. (b) Average relative error.]

Furthermore, our BCR is capable of incorporating annotated foreground and background pixel priors [39] using linear equality and inequality constraints. Following the work of [39], we consider three grouping constraints on the pixels: (t_f^T P x)² ≥ κ‖t_f^T P x‖₁², (t_b^T P x)² ≥ κ‖t_b^T P x‖₁², and ((t_f − t_b)^T P x)² ≥ κ‖(t_f − t_b)^T P x‖₁², where κ ∈ [0, 1], P = D^(−1) W is the normalized affinity matrix, and t_f and t_b are indicator variables denoting the foreground and background pixels. Considering these constraints and the formulation in (13) with X = YY^T, the problem can be written in the form of (2) as follows:

  minimize_{Y ∈ R^(N×r)}  tr(Y^T L Y)
  subject to  tr(Y^T A_i Y) = 1, ∀i = 1, ..., N
              tr(Y^T B₁ Y) = 0        (14)
              tr(Y^T B₂ Y) ≥ κ‖t_f^T P x‖₁²
              tr(Y^T B₃ Y) ≥ κ‖t_b^T P x‖₁²
              tr(Y^T B₄ Y) ≥ κ‖(t_f − t_b)^T P x‖₁²,

where r is the rank of the desired solution, B₁ = 11^T, B₂ = P t_f t_f^T P, B₃ = P t_b t_b^T P, B₄ = P(t_f − t_b)(t_f − t_b)^T P, and A_i = a_i a_i^T, where a_i ∈ R^N is a vector of all zeros except for a one at the i-th position. The affinity matrix W is given by

  W_ij = exp(−‖f_i − f_j‖₂²/γ_f² − d(i,j)²/γ_d²)  if d(i,j) < r, and W_ij = 0 otherwise,        (15)

where f_i and f_j are the color histograms of pixels i and j, and d(i,j) is the spatial distance between i and j. By transforming (14) into the BCR form (4), we compute a score vector and the final binary solution.

SDCut, as proposed in [34], presents an efficient algorithm for binary quadratic problems. The paper devises an efficient dual formulation of the SDR. Although SDCut is accurate and fast, it is limited to solving BQP problems rather than general SDP problems. In the following experiments, we compare our proposed approach (i.e., BCR) with the highly customized solver SDCut. We also compare both methods to another fast method, namely biased normalized cut (BNCut) [18]. BNCut is an extension of the normalized cut algorithm [26] that is capable of incorporating prior information for constrained image segmentation. However, this formulation can only support one quadratic grouping constraint per problem.

4.2.1 Experiments

In this section, we evaluate the BCR formulation to segment a single foreground object and compare the runtime and quality of our AM method for BCR with that of SDCut and BNCut. We use the publicly available code for SDCut and
6 T T 2 2 Method BNCut SDCut BCR equal cut 1 x = 0 is replaced by the bias (x δi) ≤ λ . Rank 1 7 2 We can write the image co-segmentation problem as (2) Time (s) 0.08 27.64 0.79 minimize tr(XT AX) Objective 10.84 6.40 6.34 n×r X∈R T (17) Table 1: Results on image segmentation. Numbers are the subject to: tr(X ZiX) = 1, ∀i = 1,...,N T 2 mean over the images in Fig.2. Lower numbers are better. tr(X ∆iX) ≤ λ , ∀i = 1, . . . , q

BNCut with default settings. Following [34], we consider the Berkeley segmentation dataset [19] and segment each image into super-pixels using the VL-Feat [32] toolbox. The super-pixel-wise affinity matrix is computed using (15). For all experiments, we set the parameters α and β to 2 and 0.06, respectively, and limit the number of iterations to 1000. All experiments are conducted on the same computer. Quantitative results are tabulated in Table 1 and qualitative results are compared in Fig. 2. Observe that for all the images considered, our approach gives better foreground-object segmentation than SDCut; for example, the grass between the hind legs of the kangaroo is clearly segmented out as background. Moreover, Table 1 shows that our solver is 35× faster than SDCut and yields a lower objective energy. For all the images, our segmentations are achieved using only rank-2 solutions, whereas SDCut requires rank-7 solutions to obtain similar results.² On the other hand, although BNCut with its rank-1 solutions is much faster than the SDP-based methods, its segmentation results are not on par with those of the SDP approaches.

4.3. Image Co-segmentation

In co-segmentation, the same object is segmented across multiple images simultaneously. The co-segmentation formulation optimizes two criteria: (i) color and spatial consistency within a single image, and (ii) discrimination between foreground and background pixels over multiple images. We follow the work of Joulin et al. [11], which uses discriminative clustering to enforce the second criterion. The problem is given by

    minimize_{x ∈ {±1}^N}  x^T A x                                 (16)
    subject to  (x^T δ_i)^2 ≤ λ^2,  ∀i = 1, ..., M,

where N = Σ_{i=1}^M N_i is the total number of pixels over all images and M denotes the number of images. The matrix A = A_b + (µ/N) A_w, where A_w = I_N − D^{−1/2} W D^{−1/2} is the intra-image affinity matrix and A_b = λ_k (I_N − e_N e_N^T/N)(N λ_k I_N + K)^{−1}(I_N − e_N e_N^T/N) is the inter-image discriminative clustering cost matrix. The matrix K is a kernel based on the χ² distance between SIFT features (see [11] for the details).

Note that the above problem resembles a graph-cut problem, except for the balancing constraint imposed on each image. The corresponding SDR is

    minimize_{X ⪰ 0}  tr(AX)                                       (17)
    subject to  tr(∆_i X) ≤ λ^2,  ∀i = 1, ..., M,
                tr(Z_i X) = 1,    ∀i = 1, ..., N,

where ∆_i = δ_i δ_i^T and Z_i = e_i e_i^T, with e_i ∈ R^N an all-zeros vector with a single one at the ith position. For fairness, we compare to two well-known co-segmentation methods, namely low-rank factorization [12] (denoted LR) and SDCut [11]. Following [12], we recover the optimal score vector x*_p from X* using an eigendecomposition. The final binary solution is obtained by thresholding x*_p.

4.3.1 Experiments

We evaluate all three methods on the Weizman horses³ and MSRC⁴ datasets. There are a total of four classes (horse, car-front, car-back, and face) with 6 ∼ 10 images per class. Each image is over-segmented into 400 ∼ 700 SLIC super-pixels using the VLFeat [32] toolbox, which gives a total of around 4000 ∼ 7000 super-pixels per class. Compared to the image segmentation formulation, this application requires 10× more variables. Note that standard SDP solvers such as SeDuMi and SDPT3 are not effective with such large numbers of variables [34]. We used the publicly available code for the LR algorithm and closely followed the experimental procedure detailed in [34].

The qualitative results for all three methods are presented in Figure 3, while Table 2 provides a quantitative comparison. From Table 2, we observe that on average our method converges ∼ 9.5× faster than SDCut and ∼ 60× faster than LR. The objective value achieved by BCR in (17) is also significantly lower than that of both SDCut and LR. Figure 3 visualizes the final score vector x*_p on selected images; for all classes, SDCut and LR produce visually similar results. Moreover, for BCR the optimal score vector x*_p is obtained from a solution of only rank 2, compared to the rank-3 and rank-7 solutions of SDCut and LR, respectively.

5. Conclusion

We have presented a novel biconvex relaxation framework (short: BCR) that enables the solution of general semidefinite programs (SDPs) at low computational complexity and with a low memory footprint. We have provided an alternating minimization (AM) procedure along with a new initialization procedure which, together, are guaranteed to converge,

² Except for one class with rank 5, all SDCut solutions have rank 7.
³ www.msri.org/people/members/eranb/
⁴ www.research.microsoft.com/en-us/projects/objectclassrecognition/
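The lifting step that connects the binary problem (16) to the SDP (17), and the eigendecomposition-based rounding used to recover the score vector, can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the affinity matrix W, the vector δ, and the perturbation of the solution are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

# Intra-image term A_w = I - D^{-1/2} W D^{-1/2} for a (hypothetical)
# symmetric, nonnegative super-pixel affinity matrix W.
W = rng.random((n, n))
W = 0.5 * (W + W.T)
np.fill_diagonal(W, 0.0)
d_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(axis=1)))
A_w = np.eye(n) - d_inv_sqrt @ W @ d_inv_sqrt

# Lifting: for X = x x^T, x^T A x = tr(A X) and (x^T delta)^2 = tr(Delta X)
# with Delta = delta delta^T -- the identities that turn (16) into (17).
x = rng.choice([-1.0, 1.0], size=n)
X = np.outer(x, x)
delta = rng.random(n)
Delta = np.outer(delta, delta)
assert np.isclose(x @ A_w @ x, np.trace(A_w @ X))
assert np.isclose((x @ delta) ** 2, np.trace(Delta @ X))

# Rounding: scale the leading eigenvector of a (near-)rank-1 PSD solution
# and threshold its sign to obtain a binary labeling.
X_sdp = X + 0.01 * np.eye(n)            # stand-in for an SDP solution
evals, evecs = np.linalg.eigh(X_sdp)    # eigenvalues in ascending order
x_p = np.sqrt(evals[-1]) * evecs[:, -1]
labels = np.sign(x_p)
labels *= np.sign(labels @ x)           # resolve the global sign ambiguity
```

The sign fix in the last line is only possible here because the ground-truth x is known; in practice the sign of x_p is irrelevant, since both signs encode the same foreground/background split.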

Figure 2: Image segmentation results on the Berkeley dataset. From top to bottom: the original images and the results of BNCut, SDCut, and BCR.

Dataset               horse   face   car-back   car-front
Number of images         10     10          6           6
Variables in BQPs      4587   6684       4012        4017
Time (s)   LowRank     2647   1614        724         749
           SDCut        220    274        180         590
           BCR           15     55         44          42
Objective  LowRank     4.84   4.48       5.00        4.17
           SDCut       5.24   4.94       4.53        4.27
           BCR         4.64   3.29       4.36        3.94
Rank       LowRank       18     11          7          10
           SDCut          3      3          3           3
           BCR            2      2          2           2

Table 2: Summary of co-segmentation results.
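As a quick sanity check of the speedups quoted in the text, the average per-class runtime ratios can be recomputed from the timings in Table 2 (a short Python sketch; the numbers are copied from the table):

```python
# Per-class runtimes (seconds) from Table 2: horse, face, car-back, car-front.
lowrank = [2647, 1614, 724, 749]
sdcut   = [220, 274, 180, 590]
bcr     = [15, 55, 44, 42]

def mean_speedup(baseline, ours):
    """Average of the per-class runtime ratios baseline/ours."""
    ratios = [b / o for b, o in zip(baseline, ours)]
    return sum(ratios) / len(ratios)

print(round(mean_speedup(sdcut, bcr), 1))    # roughly 9.4x vs. SDCut
print(round(mean_speedup(lowrank, bcr), 1))  # roughly 60x vs. LowRank
```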

computationally efficient (even for large-scale problems), and able to handle a variety of SDPs. Comparisons of the biconvex relaxation approach with state-of-the-art methods for specific vision problems, such as segmentation and co-segmentation, show that BCR provides similar or better solution quality with significantly lower runtime. While this paper only shows applications for a select set of computer vision problems, the efficacy of BCR for other problems in signal processing, machine learning, control, etc. is left for future work.

Figure 3: Co-segmentation results on the Weizman horses and MSRC datasets. From top to bottom: the original images and the results of LowRank, SDCut, and BCR.

References

[1] M. Arie-Nachimson, S. Z. Kovalsky, I. Kemelmacher-Shlizerman, A. Singer, and R. Basri. Global motion estimation from point matches. In 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT), 2012 Second International Conference on, pages 81–88. IEEE, 2012.
[2] X. Bai, H. Yu, and E. R. Hancock. Graph matching using spectral embedding and alignment. In Proceedings of the 17th International Conference on Pattern Recognition, volume 3, pages 398–401, 2004.
[3] S. Boyd and L. Vandenberghe. Semidefinite programming relaxations of non-convex problems in control and combinatorial optimization. In Communications, Computation, Control, and Signal Processing, pages 279–287. Springer, 1997.
[4] S. Burer and R. D. Monteiro. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, 95(2):329–357, 2003.
[5] E. J. Candès, X. Li, and M. Soltanolkotabi. Phase retrieval via Wirtinger flow: Theory and algorithms. IEEE Transactions on Information Theory, 61(4):1985–2007, 2015.
[6] J. Douglas Jr. and J. E. Gunn. A general formulation of alternating direction methods. Part I. Parabolic and elliptic problems. Numerische Mathematik, 6:428–453, 1964.
[7] J. C. Duchi and Y. Singer. Efficient online and batch learning using forward backward splitting. Journal of Machine Learning Research, 10:2899–2934, 2009.
[8] A. Ecker, A. D. Jepson, and K. N. Kutulakos. Semidefinite programming heuristics for surface reconstruction ambiguities. In Computer Vision–ECCV 2008, pages 127–140. Springer, 2008.
[9] M. X. Goemans and D. P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM, 42(6):1115–1145, 1995.
[10] M. Heiler, J. Keuchel, and C. Schnörr. Semidefinite clustering for image segmentation with a-priori knowledge. In Pattern Recognition, pages 309–317. Springer, 2005.
[11] A. Joulin, F. Bach, and J. Ponce. Discriminative clustering for image co-segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
[12] M. Journée, F. Bach, P.-A. Absil, and R. Sepulchre. Low-rank optimization on the cone of positive semidefinite matrices. SIAM Journal on Optimization, 20(5):2327–2351, 2010.
[13] J. Keuchel, C. Schnörr, C. Schellewald, and D. Cremers. Binary partitioning, perceptual grouping, and restoration with semidefinite programming. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(11):1364–1379, 2003.
[14] J. B. Lasserre. An explicit exact SDP relaxation for nonlinear 0-1 programs. In Integer Programming and Combinatorial Optimization, pages 293–303. Springer, 2001.
[15] J. D. Lee, B. Recht, N. Srebro, J. Tropp, and R. R. Salakhutdinov. Practical large-scale optimization for max-norm regularization. In Advances in Neural Information Processing Systems, pages 1297–1305, 2010.
[16] Z.-Q. Luo, W.-k. Ma, A. M.-C. So, Y. Ye, and S. Zhang. Semidefinite relaxation of quadratic optimization problems. IEEE Signal Processing Magazine, 27(3):20–34, May 2010.
[17] S. Maji, N. K. Vishnoi, and J. Malik. Biased normalized cuts. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2011.
[18] S. Maji, N. K. Vishnoi, and J. Malik. Biased normalized cuts. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2057–2064, 2011.
[19] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proc. 8th Int'l Conf. Computer Vision, volume 2, pages 416–423, July 2001.
[20] K. Mitra, S. Sheorey, and R. Chellappa. Large-scale matrix factorization with missing data under additional constraints. In Advances in Neural Information Processing Systems, pages 1651–1659, 2010.
[21] P. Netrapalli, P. Jain, and S. Sanghavi. Phase retrieval using alternating minimization. In Advances in Neural Information Processing Systems, pages 2796–2804, 2013.
[22] T. Okatani and K. Deguchi. On the Wiberg algorithm for matrix factorization in the presence of missing components. International Journal of Computer Vision, 72(3):329–337, 2007.
[23] C. Olsson, A. P. Eriksson, and F. Kahl. Solving large scale binary quadratic problems: Spectral methods vs. semidefinite programming. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1–8, 2007.
[24] C. Schellewald and C. Schnörr. Probabilistic subgraph matching based on convex relaxation. In Energy Minimization Methods in Computer Vision and Pattern Recognition, pages 171–186. Springer, 2005.
[25] C. Shen, J. Kim, and L. Wang. A scalable dual approach to semidefinite metric learning. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2601–2608. IEEE, 2011.
[26] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905, 2000.
[27] S. Shirdhonkar and D. W. Jacobs. Non-negative lighting and specular object recognition. In Tenth IEEE International Conference on Computer Vision (ICCV), volume 2, pages 1323–1330. IEEE, 2005.
[28] J. F. Sturm. Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones, 1998.
[29] K. C. Toh, M. Todd, and R. Tütüncü. SDPT3 - a MATLAB software package for semidefinite programming. Optimization Methods and Software, 11:545–581, 1998.
[30] P. H. Torr. Solving Markov random fields using semidefinite programming. In Artificial Intelligence and Statistics, 2003.
[31] L. Vandenberghe and S. Boyd. Semidefinite programming. SIAM Review, 38(1):49–95, July 1996.
[32] A. Vedaldi and B. Fulkerson. VLFeat: An open and portable library of computer vision algorithms (2008), 2012.
[33] C. Wang, N. Komodakis, and N. Paragios. Markov random field modeling, inference & learning in computer vision & image understanding: A survey. Computer Vision and Image Understanding, 117(11):1610–1627, 2013.
[34] P. Wang, C. Shen, and A. van den Hengel. A fast semidefinite approach to solving binary quadratic problems. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2013.
[35] P. Wang, C. Shen, and A. van den Hengel. Efficient SDP inference for fully-connected CRFs based on low-rank decomposition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
[36] S. Wang and J. M. Siskind. Image segmentation with ratio cut. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25:675–690, 2003.
[37] K. Q. Weinberger and L. K. Saul. Unsupervised learning of image manifolds by semidefinite programming. International Journal of Computer Vision, 70(1):77–90, 2006.
[38] Y. Xu and W. Yin. A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM Journal on Imaging Sciences, 6(3):1758–1789, 2013.
[39] S. X. Yu and J. Shi. Segmentation given partial grouping constraints. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2):173–183, 2004.
