Sparse Semidefinite Programs with Near-Linear Time Complexity

Richard Y. Zhang and Javad Lavaei

This work was supported by the ONR YIP Award, DARPA YFA Award, AFOSR YIP Award, NSF CAREER Award, and NSF EPCN Award. R.Y. Zhang and J. Lavaei are with the Department of Industrial Engineering and Operations Research, University of California, Berkeley, CA 94720, USA ([email protected], [email protected]).

Abstract—Some of the strongest polynomial-time relaxations to NP-hard combinatorial optimization problems are semidefinite programs (SDPs), but their solution complexity of up to O(n^6.5 L) time and O(n^4) memory for L accurate digits limits their use in all but the smallest problems. Given that combinatorial SDP relaxations are often sparse, a technique known as chordal conversion can sometimes reduce complexity substantially. In this paper, we describe a modification of chordal conversion that allows any general-purpose interior-point method to solve a certain class of sparse SDPs with a guaranteed complexity of O(n^1.5 L) time and O(n) memory. To illustrate the use of this technique, we solve the MAX k-CUT relaxation and the Lovasz Theta problem on power system models with up to n = 13659 nodes in 5 minutes, using SeDuMi v1.32 on a 1.7 GHz CPU with 16 GB of RAM. The empirical time complexity for attaining L decimal digits of accuracy is ≈ 0.001 n^1.1 L seconds.

I. INTRODUCTION

Consider a large-and-sparse instance of the semidefinite program

    minimize    C • X                                   (SDP)
    subject to  Ai • X = bi        ∀i ∈ {1, ..., m}
                X ⪰ 0

where the problem data are the sparse n × n real symmetric matrices C, A1, ..., Am sharing a common sparsity pattern, and the m-dimensional vector b. The notation Ai • X ≡ tr AiX refers to the usual matrix inner product, and the constraint X ⪰ 0 restricts the matrix X to be symmetric positive semidefinite.

Interior-point methods are the most reliable approach for solving small- and medium-scale SDPs, but become prohibitively time- and memory-intensive for large-scale problems.
A fundamental difficulty is the positive semidefinite constraint X ⪰ 0, which usually couples every element of the matrix decision variable X with every other, causing all of them to take on nonzero values. The solution X* of (SDP) is usually fully dense, despite any sparsity in the problem data. Interior-point methods solve highly sparse SDPs in approximately the same time as dense SDPs of the same size: to L digits of accuracy in O(n^6.5 L) time and O(n^4) memory.

Semidefinite programs arise as some of the best convex relaxations to a number of important nonconvex optimization problems, including integer programming and combinatorial optimization [1], [2], but their high complexity severely limits their use in practice. Semidefinite relaxations are generally considered intractable for problems that arise in real-life applications.

A. Sparse semidefinite programming

Much larger instances of (SDP) can be solved when the data matrices C, A1, ..., Am are sparse, with a nonzero pattern that has a sparse chordal completion. If we identify the problem sparsity with an undirected graph on n vertices, then this is to say that the graph has a bounded treewidth. Algebraically, this means that every matrix of the form S = C − Σi yi Ai ≻ 0 factors into a sparse Cholesky factor L satisfying LL^T = S, possibly after reordering the rows and columns. Note that this condition is satisfied by many real-life problems, with applications ranging from electric power systems [3], [4], [5] to machine learning [6].

Problems with the chordal sparsity structure may be solved at reduced complexity using a reformulation procedure introduced by Fukuda and Nakata et al. [7], [8]. The essential idea is to split the large conic constraint X ⪰ 0 into ℓ ≤ n smaller conic constraints, enforced over its principal submatrices (we refer to the principal submatrix of X indexed by Ij as X[Ij, Ij])

    minimize    C • X                                   (CSDP)
    subject to  Ai • X = bi        ∀i ∈ {1, ..., m}
                X[Ij, Ij] ⪰ 0      ∀j ∈ {1, ..., ℓ}

where the index sets I1, ..., Iℓ ⊂ {1, ..., n} are chosen to make (SDP) and (CSDP) equivalent. To induce sparsity in an interior-point solution of (CSDP), the ℓ principal submatrices X[I1, I1], ..., X[Iℓ, Iℓ] are split into distinct matrix variables X1, ..., Xℓ, constrained by the need for their overlapping elements to agree

    minimize    Σ_{j=1}^{ℓ} Cj • Xj                     (CTC)
    subject to  Σ_{j=1}^{ℓ} Ai,j • Xj = bi    ∀i ∈ {1, ..., m},
                Xj ⪰ 0                        ∀j ∈ {1, ..., ℓ},
                Nu,v(Xv) = Nv,u(Xu)           ∀{u, v} ∈ E.      (1)

Here, the linear operator Ni,j(·) outputs the overlapping elements of two principal submatrices given the latter as the argument,

    Nu,v(X[Iv, Iv]) = X[Iu ∩ Iv, Iu ∩ Iv],

and E are the edges of a certain tree graph. Once a solution X1*, ..., Xℓ* to (CTC) is found, a corresponding solution for (SDP) is recovered by solving

    find        X ⪰ 0                                   (PSDMC)
    satisfying  X[Ij, Ij] = Xj*    ∀j ∈ {1, ..., ℓ},

in closed form using a modification of the sparse Cholesky factorization algorithm (see e.g. [9]). Section II reviews this reformulation procedure in detail, including implementation considerations.

The complexity of solving the reformulated problem (CTC) is closely associated with the constant

    (Clique size)    ω = max{|I1|, |I2|, ..., |Iℓ|},

whose value has the interpretation as the size of the largest clique on the chordal graph completion. Finding the best choice of I1, ..., Iℓ to minimize ω is NP-complete [10], but provably good choices within a log(n) factor of the minimum are readily available, using heuristics originally developed for the fill-reduction problem in numerical linear algebra [11], [12], [13].

Our assumption of sparsity with a sparse chordal completion guarantees the existence of I1, ..., Iℓ with bounded ω = O(1). In practice, such a choice is readily found using standard fill-reducing heuristics in a negligible amount of time. Our numerical results in Section V used MATLAB's in-built approximate minimum degree heuristic to find choices of I1, ..., Iℓ with ω ≤ 35 in less than 0.01 seconds on a standard desktop for problems as large as n = 13659.
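As an illustration of how cheap this step is, the following MATLAB sketch (ours, not the authors' released code; the MATPOWER case name is illustrative) computes a fill-reducing ordering, a symbolic factorization, and the resulting clique size ω for a power-system sparsity pattern:

    % Sketch: estimate the clique size (omega) of a chordal completion.
    % Assumes MATPOWER is installed; 'case9241pegase' is one of its test cases.
    mpc  = loadcase('case9241pegase');
    Ybus = makeYbus(mpc);                 % bus admittance matrix (sparse)
    n    = size(Ybus, 1);
    E    = spones(Ybus) + speye(n);       % symmetric sparsity pattern of the data
    p    = amd(E);                        % approximate minimum degree ordering
    S    = E(p, p) + 2*n*speye(n);        % diagonally dominant, hence positive definite
    L    = chol(S, 'lower');              % pattern of L is F = symbchol(QEQ')
    omega = full(max(sum(spones(L), 1))); % clique size: the largest |Ij|
    fprintf('n = %d, omega = %d\n', n, omega);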
To give a rigorous complexity bound for (CTC), we state some nondegeneracy assumptions, which are standard for interior-point methods.

Assumption 1 (Nondegeneracy). We assume in (CTC):
1) (Linear independence) The matrix A = [vec A1, ..., vec Am] has full column rank, meaning that A^T A is invertible.
2) (Slater's condition) There exist X1, ..., Xℓ ≻ 0, y, and S1, ..., Sℓ ≻ 0, such that Σj Ai,j • Xj = bi and Σi yi Ai,j + Sj = Cj for all i ∈ {1, ..., m} and j ∈ {1, ..., ℓ}.

Note that linear independence sets m = O(n), since we have already assumed each column of A to contain no more than O(n) nonzeros. Also, Slater's condition is satisfied in solvers like SeDuMi [14] and MOSEK [15] using the homogeneous self-dual embedding technique [16].

Theorem 1. Under Assumption 1, there exists an interior-point method that computes an approximate solution for (CTC) that is accurate to L digits in O(ω^6.5 n^3.5 L) time and O(ω^4 n^2) memory.

Proof: It is a seminal result of Nesterov and Nemirovski [17] that an interior-point method solves every order-θ linear conic program to L accurate digits in k = O(√θ L) iterations assuming Slater's condition. Problem (CTC) optimizes over a convex cone of order θ ≤ ωn. We show in Section IV that each interior-point iteration takes O(ω^6 n^3) time and O(ω^4 n^2) memory. ∎

B. Main result

Empirical results [7], [8], [18], [4] find that many instances of (CTC) can actually be solved in near-linear time using an interior-point method. Unfortunately, other problem instances attain the worst-case cubic complexity quoted in Theorem 1. The key issue is the large number of overlap constraints

    Nu,v(Xv) = Nv,u(Xu)    ∀{u, v} ∈ E,

which are imposed in addition to the constraints already present in the original problem. These overlap constraints can significantly increase the size of the linear system solved at each interior-point iteration, thereby offsetting the benefits of increased sparsity [19, Section 14.2]. In [20], omitting some of the constraints made (CTC) easier to solve, but at the cost of also making the reformulation from (SDP) inexact.

The main result in this paper is a set of extra conditions on (SDP) that allow (CTC) to be solved in near-linear time, by applying a generic interior-point method to the dualized version of the problem.

Assumption 2 (Decoupled submatrices). Let A = {A1, ..., Am} be the data, and I = {I1, ..., Iℓ} be the index sets in (CSDP). Then for every Ai ∈ A, there is an Ij ∈ I and some Ai,j ∈ S^n such that Ai • X = Ai,j • X[Ij, Ij].

The condition is similar to the correlative sparsity property studied in [21]. However, note that previous efforts have only been able to exploit this property inside a first-order method [22], [21], [23], [24]. These have time complexity that is exponential O(exp L) in the number of accurate digits L, are sensitive to the conditioning of the data, and require considerable fine-tuning of the parameters in practice.

Theorem 2. Under Assumptions 1 and 2, there exists an interior-point method that computes an approximate solution for (CTC) that is accurate to L digits in O(ω^6.5 n^1.5 L) time and O(ω^4 n) memory.

Proof: We repeat the proof of Theorem 1, but show in Section IV that, under Assumption 2, each interior-point iteration of the dualized version of (CTC) solves in just O(ω^6 n) time and O(ω^4 n) memory, due to a chordal block-sparsity structure discussed in Section III. ∎

C. Applications

Assumption 2 is widely applicable to many important SDP relaxations of combinatorial problems. In this paper, we consider two examples that trivially satisfy the assumption by construction. Accordingly, the techniques in this paper are able to solve these problems on graphs with up to n = 13659 vertices to 5-6 digits of accuracy in less than 5 minutes on a modest workstation; see our numerical results in Section V.
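Assumption 2 is also mechanical to verify. A minimal MATLAB sketch (ours; the cell arrays As and Is are hypothetical names for the data matrices and index sets):

    % Sketch: check Assumption 2 by searching, for each constraint matrix
    % Ai, for a single index set Ij containing every row and column that
    % Ai touches; save as check_assumption2.m.
    function ok = check_assumption2(As, Is)
    ok = true;
    for i = 1:numel(As)
        [r, c]  = find(As{i});
        touched = unique([r; c]);             % rows/columns where Ai is nonzero
        fits = false;
        for j = 1:numel(Is)
            if all(ismember(touched, Is{j}))  % Ai acts on X[Ij,Ij] alone
                fits = true; break;
            end
        end
        ok = ok && fits;
    end
    end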
MAXCUT and MAX k-CUT relaxations. Let C be the (weighted) Laplacian matrix for a graph G = {V, E} with n vertices. Frieze and Jerrum [2] proposed a randomized algorithm that solves MAX k-CUT with an approximation ratio of 1 − 1/k, based on solving

    maximize    ((k − 1)/(2k)) C • X                    (MkC)
    subject to  Xi,i = 1                   ∀i ∈ {1, ..., n}
                Xu,v ≥ −1/(k − 1)          ∀{u, v} ∈ E
                X ⪰ 0.

The Goemans-Williamson 0.878 algorithm [1] is recovered by setting k = 2 and removing the redundant constraint Xu,v ≥ −1. In both cases, observe that each constraint in (MkC) affects just a single matrix element in X, so Assumption 2 is trivially satisfied.

Lovasz Theta. Let G = {V, E} be a graph with n vertices. The Lovasz number ϑ(G) [25] is the solution to the following dual semidefinite program (here, ej is the j-th column of the size-n identity matrix)

    minimize    λ                                       (LT)
    subject to  11^T − Σ_{{i,j}∈E} yi,j (ei ej^T + ej ei^T) ⪯ λI

and serves as a bound for a number of graph-theoretical quantities that are NP-hard to compute. Problem (LT) does not satisfy Assumption 2. However, given that λ > 0 for all interesting applications, we may divide the linear matrix inequality through by λ, redefine y ← y/λ, apply the Schur complement lemma, and take the Lagrangian dual to yield a sparse formulation

    minimize    [I, 1; 1^T, 0] • X                      (LT')
    subject to  Xi,j = 0               ∀{i, j} ∈ E
                X(n+1),(n+1) = 1
                X ⪰ 0.

Like in (MkC), each constraint affects just a single matrix element in X, so Assumption 2 is trivially satisfied.
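To make the canonical form concrete, the following MATLAB sketch (ours; SeDuMi is assumed to be installed) sets up the k = 2 case of (MkC), i.e. the MAXCUT relaxation with the redundant edge constraints dropped, in SeDuMi's primal form min c'x subject to Ax = b, x ∈ K:

    % Sketch: MAXCUT relaxation (the k = 2 case of (MkC)) on a random graph.
    % x = vec(X); each constraint X(i,i) = 1 touches one matrix element.
    n = 50;
    W = spones(sprandsym(n, 0.05));  W = W - diag(diag(W));  % 0/1 adjacency
    C = diag(sum(W, 2)) - W;                                 % weighted Laplacian
    A = sparse(1:n, (0:n-1)*n + (1:n), 1, n, n^2);           % rows are vec(ei*ei')'
    b = ones(n, 1);
    c = -(1/4) * C(:);                 % maximize (1/4) C . X  <=>  minimize c'x
    K = struct('s', n);                % one order-n semidefinite cone
    [x, y, info] = sedumi(A, b, c, K);
    fprintf('SDP bound on the maximum cut: %.4f\n', full(C(:)' * x) / 4);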
Even problems that do not strictly satisfy Assumption 2 tend to solve much faster after the "dualizing" step. An example with important real-life implications is the SDP relaxation of the OPF problem. In Section V we solve instances of the OPF SDP with as many as n = 13659 buses to 5-6 digits of accuracy in less than 5 minutes on the same modest workstation, with an empirical linear-time complexity.

Optimal power flow (OPF). The OPF problem is a large-scale optimization problem that plays a vital role in the operation of an electric power system. Variants of this problem are solved by the system operator every few minutes to every several months, in order to ensure a reliable and inexpensive supply of electricity. The OPF problem is well known to be nonlinear, nonconvex, and NP-hard in general, but can be relaxed into an SDP problem using standard techniques:

    minimize    C • X                                   (OPF)
    subject to  bi^ℓ ≤ Ai • X ≤ bi^u    ∀i ∈ {1, ..., m}
                X ⪰ 0.

Here, X is a Hermitian positive semidefinite matrix, the constraint matrices A1, ..., Am measure physical quantities within the system (e.g. voltage magnitudes, power flows, and power injections), and b^ℓ, b^u ∈ R^m provide a corresponding list of lower and upper bounds. If the solution matrix X̂ is rank-one, then the relaxation (OPF) is exact, and a globally optimal solution to the original NP-hard problem can be extracted by factoring X̂ = v̂v̂^T.

A surprising feature of the OPF problem is that the SDP relaxation (OPF) is often exact [26]. Even when this is not the case, the problem can be slightly perturbed to guarantee an exact relaxation; a solution of the perturbed problem can then serve as a near-globally-optimal solution to the original problem. This perturbation technique, known as penalized SDP, was tested on the IEEE benchmarks and the Polish, Ontario and New York grids, over 7000 settings, where it always obtained a solution with a global optimality guarantee exceeding 99% [27], [4]. It is therefore safe to say that the practical cost of solving a size-n OPF is about the same as that of solving a size-n SDP.

Structurally, the OPF problem almost satisfies Assumption 2. More specifically, the voltage magnitude and line flow measures satisfy the assumption, but the nodal power constraints do not. Nevertheless, we would expect its solution time to closely match the figure quoted in Theorem 2. Our numerical results in Section V find this to indeed be the case.

D. Related work

Conversion methods, which split a size-n semidefinite cone into numerous smaller cones, were first proposed in [7], [8], and have been implemented as the SparseCoLO package in [18]. The same idea has been explored extensively in power systems applications to solve SDPs with n on the order of several thousands, largely motivated by the OPF problem described above [4], [3]. Our work builds upon the existing literature by studying the sparsity patterns of the converted problems in detail. Our goal is to offer an explanation for the rapid solution times, as well as guarantees in special cases where Assumption 2 is satisfied.

Another property of chordal sparsity is that it allows the barrier function −µ log det S and its gradient and Hessian to have recursive closed-form solutions, with about the same cost as computing a sparse Cholesky factorization. Andersen, Dahl, and Vandenberghe [28], [9] used this insight to develop an efficient interior-point method on the nonsymmetric cone of sparse positive semidefinite matrices, available as the CVXOPT package [29]. It is worth noting that the algorithm explicitly forms and factorizes the m × m fully-dense Hessian matrix, so its worst-case per-iteration complexity is cubic O(n^3) time and quadratic O(n^2) memory.

The per-iteration cost of an optimization algorithm can be significantly reduced by avoiding second-order information. This idea appears in first-order methods like bundle methods [30] and ADMM [24], [23], as well as nonconvex algorithms based on factoring the primal variable X = RR^T [31]. Unfortunately, convergence guarantees are significantly weakened. Of the methods with global convergence bounds, only sublinear error rates of O(1/k^α) can be established; an approximation with L accurate digits requires exponentially many iterations k = O(exp(L)) to compute.
By comparison, our results guarantee the Fukuda and Nakata et al. reformulation procedure [7], [8] to solve SDPs satisfying Assumption 2 in O(n^1.5 L) time and O(n) memory—linear in the number of accurate digits L and near-linear in the problem size n. In Section V, we solve the MAX 3-CUT, Lovasz Theta, and OPF problems for a suite of 40 power system graphs, the largest of which contains n = 13659 nodes, but with ω ≤ 35. Our results have a time complexity of ≈ 0.001 n^1.1 L seconds for L decimal digits. In particular, we solve an example with n = 13659 to L = 4.5 digits in 5 minutes, on a relatively modest CPU with just 16 GB of RAM.

Sparse matrix notations. The sets R^{m×n}, S^n, S^n_+, S^n_{++} refer to the m × n real matrices, n × n real symmetric matrices, positive semidefinite matrices, and positive definite matrices, respectively. A sparsity pattern is a binary matrix E ∈ {0, 1}^{m×n}. The sum of two sparsity patterns is their logical "or". A matrix X is said to have sparsity pattern E if Xi,j = 0 whenever Ei,j = 0; the set R^{m×n}_E (resp. S^n_E) refers to the m × n matrices (resp. n × n symmetric matrices) with sparsity pattern E. The Euclidean projection onto a sparsity pattern is denoted as ΠE(·): the (i, j)-th element of ΠE(M) is zero if Ei,j = 0, and Mi,j if Ei,j = 1.

II. PRELIMINARIES

In this section, we review the steps taken to convert (SDP) into (CTC) under the assumption that the data matrices C, A1, ..., Am share a sparsity pattern E with a sparse chordal completion. The original derivations in [7], [8] made extensive use of graph theory. This section gives an alternative presentation of the conversion method using linear algebra. As a result, the conversion can be easily implemented with high-level function calls to sparse linear algebra software libraries.

A. Sparse Chordal Completion

We identify the symmetric sparsity pattern E with an undirected graph on n vertices, containing an edge between the i-th and j-th vertices whenever Ei,j ≠ 0. The pattern E is said to be chordal if its graph does not contain an induced cycle with length greater than three. If E is not chordal, then we may add nonzeros to it until it becomes chordal; the resulting pattern E0 ≥ E is called a chordal completion (or triangulation) of E. Any chordal completion E0 with at most O(n) nonzeros is a sparse chordal completion of E.

Matrices with chordal sparsity are precisely those that factor with no fill. To explain, consider solving the linear equations Sx = b with S positive definite by factoring S into LL^T. If S is sparse with sparsity pattern E, then its Cholesky factor L = chol(S) has sparsity pattern F = symbchol(E) determined by symbolic analysis. Now, F contains all nonzero elements in the lower-triangular portion of E, but usually also additional nonzeros in positions where E is zero, known as fill-in, which adversely affect complexity. Fill-in can often be reduced by reordering the rows and columns, i.e. by choosing a permutation Q and factoring the permuted pattern QEQ^T. It is a classic result that E can be reordered to factor with zero fill-in if and only if E is chordal [32, p.851]; see also [19, Sec.4.2].

Matrices with sparse chordal completions are precisely those with sparse Cholesky factorizations; see the discussion in [19, Sec.6.6]. More specifically, if the Cholesky factorization pattern F = symbchol(QEQ^T) is sparse, then E0 = Q^T(F + F^T)Q is a sparse chordal completion. Finding the optimal permutation Q that minimizes fill-in is NP-hard, but provably good orderings can be computed in polynomial time, and heuristics like minimum degree and nested dissection are usually sufficient in practice [13].

B. Chordal decomposition

Let the data matrices C, A1, ..., Am share a sparsity pattern E. Since the data can always be permuted ahead of time, we assume without loss of generality that F = symbchol(E) is sparse. Then, it is natural to describe F in terms of the index sets I1, ..., In ⊆ {1, ..., n}, where

    (Cliques)    Ij = {i ∈ {1, ..., n} : Fi,j ≠ 0}.     (2)

(The index sets I1, ..., In are the cliques of the chordal completion of the problem sparsity graph.) The following result is well known in the study of sparse positive semidefinite matrices [33], and can be proved using only linear algebra arguments; see [19, Thm.10.1].

Theorem 3 (Primal cone separability). Given E and F = symbchol(E), define E0 = F + F^T as the chordal completion, and define {Ij} as in (2). Then we have

    X ∈ ΠE0(S^n_+)  ⟺  X[Ij, Ij] ⪰ 0    ∀j ∈ {1, ..., n}.

The objective and constraints in (SDP) affect the decision variable X only where E0 is nonzero; the remaining elements have their values determined solely by the conic constraint X ⪰ 0. Replacing the constraint X ⪰ 0 with X ∈ ΠE0(S^n_+) and applying Theorem 3 to the latter problem yields (CSDP). Indeed, it is always possible to recover a solution X* ⪰ 0 given ΠE0(X*); see [7] and also [19, Sec.10] and [9].

For certain problems like (MkC), we also need to factor X* = V^T V and compute matrix-vector products of the form s = V^T a, where a is a given vector. Knowing only ΠE0(X*), an ε-approximation to s may be computed in linear time and memory, by using a maximum determinant completion algorithm [19, Alg.10.1] to find a sparse triangular Cholesky factor L ∈ R^{n×n}_F satisfying ΠE0[(LL^T)^{−1}] = ΠE0(X* + εI) and computing s ≈ L^{−T} a via back-substitution.
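Both the index sets (2) and the easy direction of Theorem 3 are simple to check numerically. A self-contained MATLAB sketch (ours) on toy data:

    % Sketch: extract the cliques (2) from the factor pattern F, then verify
    % that projecting a PSD matrix onto E0 = F + F' leaves every principal
    % submatrix X[Ij,Ij] positive semidefinite.
    n  = 30;
    R  = sprandn(n, n, 0.1);  S = R*R' + n*speye(n);   % random sparse PD matrix
    L  = chol(S, 'lower');                             % pattern of L is F
    E0 = spones(L) | spones(L)';                       % chordal completion F + F'
    X  = randn(n);  X = X*X';                          % dense PSD matrix
    PX = full(X .* E0);                                % projection onto E0
    for j = 1:n
        Ij = find(L(:, j));                            % clique, as in (2)
        assert(min(eig(PX(Ij, Ij))) > -1e-8);          % X[Ij,Ij] is PSD
    end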
C. Clique tree conversion

Next, we split the principal submatrices of X into distinct matrix variables, in order to put the problem in a canonical form, and to induce sparsity in an interior-point solution. Defining {Cj}, {Ai,j} ⊂ S^n_{E0} to satisfy

    Σ_{j=1}^{n} Cj • X[Ij, Ij] = C • X                  (3)
    Σ_{j=1}^{n} Ai,j • X[Ij, Ij] = Ai • X

for all i ∈ {1, ..., m} allows us to rewrite (CSDP) in terms of Xj = X[Ij, Ij], each constrained to lie within its own cone Xj ⪰ 0. These new variables are coupled by the need for their overlapping elements to remain equal. Naively enforcing this results in O(n^2) pairwise comparisons, but the figure may be reduced to O(n) by restricting the comparisons to the edges of the corresponding elimination tree, as in

    Nu,v(Xv) = Nv,u(Xu)    ∀{u, v} ∈ E.                 (4)

Given the index sets I1, ..., In from (2), the associated elimination tree T = {V, E} is a rooted tree (or forest) on n vertices, with edges E = {{1, p1}, ..., {n, pn}} defined to connect each j-th vertex to its parent at the pj-th vertex (except root nodes, which have "0" as their parent) [34]:

    (E-tree)    pj = 0                          if |Ij| = 1,
                pj = min{i > j : i ∈ Ij}        if |Ij| > 1.        (5)

The index sets I1, ..., In and the elimination tree T combine to form a clique tree (also known as a tree decomposition, a join tree or a junction tree) for the graph associated with our original sparsity pattern E [34, Sec.4.2]; see also [19, Sec.4.3]. It follows from properties of the clique tree that just O(n) comparisons along the edges E in (4) are needed to guarantee that all submatrices agree in their overlapping elements [7], [18].
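The parent function (5) is exactly what MATLAB's in-built etree computes. A self-contained sketch (ours), on toy data:

    % Sketch: the elimination tree (5), computed from the definition and
    % checked against etree().
    n = 30;
    R = sprandn(n, n, 0.1);  S = R*R' + n*speye(n);
    L = chol(S, 'lower');
    parent = zeros(1, n);
    for j = 1:n
        Ij    = find(L(:, j));                 % index set (2)
        above = Ij(Ij > j);
        if isempty(above)
            parent(j) = 0;                     % root: |Ij| = 1
        else
            parent(j) = min(above);            % min{ i > j : i in Ij }
        end
    end
    assert(isequal(parent, double(etree(S))));  % matches the in-built etree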
D. Redundant index sets

Some of the index sets I1, ..., In are redundant, and may be removed to yield ℓ ≤ n remaining index sets. More specifically, if Ik ⊆ Ij, then Theorem 3 is unaffected by deleting Ik, while Xk may be dropped from (CTC), after modifying the splittings (3) to set all Ai,k = Ck = 0, and substituting Ij in lieu of Ik for all overlap constraints. Redundant index sets can be naively identified in O(n^2) pairwise comparisons, or more efficiently in O(n) comparisons using the elimination tree; see [19, Sec.4.4].

III. BLOCK SPARSITY IN THE OVERLAP CONSTRAINTS

Problem (CTC) can be rewritten in vectorized form as

    minimize    c^T x                                   (6)
    subject to  [A^T; N^T] x = [b; 0]
                x ∈ K

in which A, c, and x are vectorized versions of the matrix variables,

    A = [vec A1,1 ··· vec Am,1; ··· ; vec A1,ℓ ··· vec Am,ℓ] = [A1; ··· ; Aℓ],
    c = [vec C1; ··· ; vec Cℓ],    x = [vec X1; ··· ; vec Xℓ],

the cone K = S^{|I1|}_+ × ··· × S^{|Iℓ|}_+ is the product of ℓ small semidefinite cones, and the matrix N enforces the overlap constraints in (4),

    N = [N1 ··· Nℓ],                                    (7)
    Nj^T x = vec[Npj,j(Xj) − Nj,pj(Xpj)].

Observe that the matrix N comprises (up to) ℓ block-columns N1, ..., Nℓ, and that the j-th block-column contains exactly two nonzero blocks, with one at the j-th block-row, and the other at the pj-th block-row. In other words, the block sparsity pattern of N matches that of the incidence matrix for the elimination tree T = {V, E}. The following is an immediate consequence.

Proposition 4. The matrix NN^T has a block-sparsity pattern that coincides with the adjacency matrix of the elimination tree T: its (i, j)-th block is nonzero if and only if the i-th and j-th vertices of T are adjacent.

Trees are the simplest chordal graphs, and NN^T is guaranteed to factor with zero block-fill (i.e. without introducing additional blocks in positions where NN^T is zero) after a minimum degree block-ordering, although each existing block may become fully dense. Each block in N has up to ω(ω + 1)/2 rows and columns, where ω = maxj |Ij|. Therefore, the Cholesky factor of NN^T (as well as that of any matrix with the same block sparsity pattern) requires at most O(ω^6 n) time and O(ω^4 n) memory to compute. The following is also an immediate consequence.

Proposition 5. The matrix N^T N has a block-sparsity pattern that coincides with the adjacency matrix of the line graph of the elimination tree L(T): its (i, j)-th block is nonzero if and only if the i-th and j-th edges of T share a common vertex.

The line graph of a tree is not necessarily sparse. Suppose T = {V, E} were the star graph, with vertices V = {v1, ..., vn} and edges E = {{v1, vn}, {v2, vn}, ..., {vn−1, vn}}. Then all n − 1 edges share the common vertex vn, so the line graph L(T) is the complete graph, and N^T N would be fully dense despite any sparsity in N. Hence, in the worst case, the Cholesky factor of N^T N requires O(ω^6 n^3) time and O(ω^4 n^2) memory to compute.
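The star-graph example is easy to reproduce with scalar blocks, where N reduces to the ordinary vertex-by-edge incidence matrix of the tree (a sketch, standing in for the block matrix in (7)):

    % Sketch: for a star tree on n vertices, N*N' has the sparsity of the
    % tree adjacency (plus diagonal), while N'*N is completely dense.
    n  = 8;
    pj = n * ones(1, n-1);                       % parent of vertices 1..n-1 is n
    N  = sparse([1:n-1, pj], [1:n-1, 1:n-1], ...
                [ones(1, n-1), -ones(1, n-1)], n, n-1);
    fprintf('nnz(N*N'')  = %3d of %3d\n', nnz(N*N'),  n^2);      % 3n - 2 nonzeros
    fprintf('nnz(N''*N)  = %3d of %3d\n', nnz(N'*N), (n-1)^2);   % fully dense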
IV. COMPLEXITY OF AN INTERIOR-POINT ITERATION

The cost of an interior-point iteration is entirely dominated by the cost of forming and solving the so-called normal equation, which is used to compute the Newton search direction. In this section, we show that the normal equation associated with the dualized version of (6) can be solved in linear time and memory under Assumption 2.

A. The standard iteration

Let us begin by examining the cost of a single interior-point iteration for (CTC), following the general approach outlined in the authoritative survey [14]. Each iteration begins by forming the positive definite scaling matrix D. Given that K = S^{|I1|}_+ × ··· × S^{|Iℓ|}_+, the matrix D = diag(D^s_1, ..., D^s_ℓ) is block-diagonal with ℓ ≤ n fully-dense blocks, and each block D^s_j is of size (1/2)|Ij|(|Ij| + 1). Forming D takes O(ω^3 n) time and O(ω^2 n) memory, where ω = maxj |Ij|.

Next, the normal equation is formulated, with Hessian matrix

    [A N]^T D [A N] = Σ_{j=1}^{ℓ} [Aj^T D^s_j Aj, Aj^T D^s_j Nj; Nj^T D^s_j Aj, Nj^T D^s_j Nj],    (8)

whose order is no larger than (1/2)ω(ω + 1)n if we take (CTC) to be primal-dual strictly feasible. Forming and factoring the matrix in (8) is the most expensive part of the iteration. It was shown in [21] that the (1,1), (1,2) and (2,1) blocks are often sparse (indeed, they are diagonal under Assumption 2), and in the best case, the Hessian matrix can be formed and factored in near-linear time and memory. However, observe that the matrix N^T D N in the (2,2) block shares its block-sparsity pattern with N^T N, which may be fully dense according to Proposition 5. In the worst case, the cost of forming and factoring the Hessian matrix is O(ω^6 n^3) time and O(ω^4 n^2) memory.

Once the Hessian matrix (8) is factored, the normal equation is solved with a right-hand side, the solution is used to compute a Newton search direction, and the method takes a step along this direction, with the step size determined using a line search. Although we will not go into the details, we note that all of these operations have the same cost as setting up the scaling matrix D. After taking the step, the method either terminates if the desired accuracy has been achieved, or proceeds with a new iteration. We arrive at a per-iteration cost of O(ω^6 n^3) time and O(ω^4 n^2) memory.

B. The dualized iteration

Instead, applying the dualization technique of [35] reformulates (6) into

    maximize    −c^T y                                  (9)
    subject to  [A^T; N^T] y + s1 = [b; 0]
                −y + s2 = 0
                s1 ∈ {0}^f,    s2 ∈ K

with f ≤ (1/2)ω(ω + 1)n equality constraints. One way of enforcing these equality constraints is to embed them into a size-2f linear cone, i.e. by splitting the equality constraint s1 = 0 into linear inequalities s1 ≥ 0 and s1 ≤ 0. The resulting cone has scaling matrix D = diag(D^l, D^s), where D^s is the scaling matrix for K from above, and D^l = diag(D^l_1, D^l_2, D^l_3, D^l_4) is the diagonal scaling matrix for the linear cone R^{2f}_+. The normal equation has Hessian matrix

    D^s + N(D^l_1 + D^l_3)N^T + A(D^l_3 + D^l_4)A^T.    (10)

Under Assumption 2, A D^l A^T is block-diagonal, so (10) has the same block sparsity pattern as NN^T. After a minimum degree block-ordering, the cost of forming and factoring (10) is guaranteed to be linear: O(ω^6 n) time and O(ω^4 n) memory, by virtue of Proposition 4. Combining this with the O(ω^3 ℓ + f) time and O(ω^2 ℓ + f) memory needed to form D, and to compute and perform line searches along the Newton search direction, we arrive at a per-iteration cost of O(ω^6 n) time and O(ω^4 n) memory.

Alternatively, we may also embed the f equality constraints in (9) into a size-(f + 1) second-order cone, i.e. by replacing s1 = 0 with the conic constraint ||s1|| ≤ s0 and the linear constraint 0 · y + s0 = 0. The resulting cone has scaling matrix D = diag(D^q, D^s), where D^s is the same scaling matrix for K, and D^q = rr^T − µJ is the diagonal-plus-rank-1 scaling matrix for the quadratic cone, with r = [r0; r1; r2] ∈ R^{f+1}, µ > 0, and J = diag(1, −If). The resulting normal equation has Hessian matrix

    D^s + µN(I + r1 r1^T)N^T + µA(I + r2 r2^T)A^T.      (11)

This matrix is a dense rank-2 update of a sparse matrix with the same sparsity pattern as (10); the standard solution technique is to compute the Cholesky factor of the sparse portion, and then to make a rank-2 update to the Cholesky factor in product form [36]. Assuming that A A^T is block-diagonal as above, the cost of forming and factoring the sparse portion, and then making the rank-2 update, is O(ω^6 n) time and O(ω^4 n) memory, again by virtue of Proposition 4. Combining this with the O(ω^3 ℓ + f) time and O(ω^2 ℓ + f) memory needed to form D, and to compute and perform line searches along the Newton search direction, we again arrive at a per-iteration cost of O(ω^6 n) time and O(ω^4 n) memory.
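For intuition, the rank-2 update strategy behind (11) can be mimicked with MATLAB's dense cholupdate; this is a toy stand-in, whereas a real implementation would keep the update in product form on a sparse factor [36]:

    % Sketch: factor the "sparse" portion of (11) once, then fold in the two
    % dense rank-1 terms by Cholesky updates instead of refactoring.
    f  = 200;  mu = 0.5;
    B  = sprandn(f, f, 0.02);
    Hs = full(B*B' + f*speye(f));          % sparse portion (densified for chol)
    r1 = randn(f, 1);  r2 = randn(f, 1);
    R  = chol(Hs);                         % upper-triangular Cholesky factor
    R  = cholupdate(R, sqrt(mu)*r1, '+');  % + mu * r1 * r1'
    R  = cholupdate(R, sqrt(mu)*r2, '+');  % + mu * r2 * r2'
    H  = Hs + mu*(r1*r1') + mu*(r2*r2');
    fprintf('relative update error: %.2e\n', norm(R'*R - H, 'fro') / norm(H, 'fro'));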

V. NUMERICAL EXAMPLES

Using the techniques described above, we solve sparse SDPs posed on the 40 power system test cases in the MATPOWER suite [37]. The largest two cases have n = 9241 and n = 13659; the associated SDPs are so large that simply forming the n × n matrix decision variable X would deplete all memory. Fortunately, power systems are graphs with bounded treewidths [4], and their corresponding sparsity patterns are guaranteed to have sparse chordal completions [12].

In every problem listed below, we formulate a semidefinite program of the exact form (SDP). Before we apply the steps in Section II to reformulate (SDP) into (CTC), we first reorder the rows and columns of the data matrices C, A1, ..., Am using a minimum degree ordering, in order to reduce the value of the clique number ω. As we mentioned before, finding the optimal fill-reducing ordering is NP-hard, but finding a "good-enough" ordering is easy. For these 40 test cases, the minimum degree ordering was able to reduce ω to 35 or less, which is low enough to be considered O(1).

We solved each dualized version of (CTC) using the standard version of SeDuMi v1.32 [38], on a Xeon 3.3 GHz quad-core CPU with 16 GB of RAM. The solver is based on a wide-region neighborhood, and has a guaranteed iteration bound of k = O(√θ log ε^{−1}) for an order-θ convex cone [14].


Figure 1: MAX 3-CUT (◦) and Lovasz Theta (×) problems: (a) Time per iteration, with regression T/k = 10^{−3} n; (b) Iterations per decimal digit of accuracy, with (solid) regression k/L = 0.929 n^{0.123} and (dashed) bound k/L = √n. [Plots omitted.]

Figure 2: Comparison of solving OPF reformulated into (CTC) using SeDuMi ("Reformulation") vs. a direct solution of (SDP) via SeDuMi ("Direct"). Regression lines show O(n^{1.12}) and O(n^{6.27}) respectively. [Plot omitted.]

Table I: The six largest Optimal Power Flow SDP problems ("Prep", "Sol", "Total" in seconds).

    #    n      m      ω    Prep    Sol      Total
    35   6468   9000   30   6.62    99.85    106.47
    36   6470   9005   30   6.62    183.17   189.78
    37   6495   9019   31   6.78    97.02    103.80
    38   6515   9037   31   6.66    191.60   198.26
    39   9241   16049  35   10.56   375.76   386.32
    40   13659  20467  35   17.48   233.32   250.81

A. MAXCUT and Lovasz Theta

We begin by considering the MAXCUT and Lovasz Theta problems, which satisfy Assumption 2 by default, and hence have solution complexities of O(n^1.5) time and O(n) memory. For each of the 40 test cases, we use the MATPOWER function makeYbus to generate the bus admittance matrix Ybus = [Yi,j], and symmetricize it to yield Yabs = (1/2)[|Yi,j| + |Yj,i|]. We view this matrix as the weighted adjacency matrix for the system graph. For MAX k-CUT, we define the weighted Laplacian matrix C = diag(Yabs 1) − Yabs, and set up problem (MkC) with k = 3. For Lovasz Theta, we extract the locations of the graph edges from Yabs and set up (LT').

We then apply the techniques described in the paper to solve the 80 SDPs. At each k-th iteration, we compute the associated number of accurate decimal digits L = log10(1/max{pinf, dinf, gap}) using the DIMACS feasibility and duality gap metrics pinf = ||A(X) − b||2/(1 + ||b||2), dinf = λmax(A^T(y) − C)/(1 + ||C||2), and gap = (C • X − b^T y)/(1 + |C • X| + |b^T y|). Of the 80 SDPs considered, 79 solved to L ≥ 5 digits in k ≤ 23 iterations and T ≤ 306 seconds; the largest instance solved to L = 4.48.
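These accuracy metrics are straightforward to evaluate from SeDuMi's iterates. A sketch, assuming problem data A, b, c with a single order-nX semidefinite block and solver output x, y (hypothetical workspace names):

    % Sketch: DIMACS-style count of accurate decimal digits, as above.
    Cmat = reshape(c, nX, nX);
    M    = reshape(A'*y, nX, nX) - Cmat;   % the adjoint A^T(y), minus C
    obj  = full(c'*x);  dobj = full(b'*y);
    pinf = norm(A*x - b) / (1 + norm(b));
    dinf = max(eig(full(M + M')/2)) / (1 + norm(full(Cmat)));
    gap  = (obj - dobj) / (1 + abs(obj) + abs(dobj));
    L    = log10(1 / max([pinf, dinf, gap, eps]));
    fprintf('accurate decimal digits: L = %.2f\n', L);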
After chordal decomposition (Section II-B), Figure 1a plots T/k, the mean time taken per iteration. Unsurprisingly, the per-iteration time is linear with respect to n: a log-log regression yields T/k = 10^{−3} n, with R^2 = 0.9636. Figure 1b plots k/L, the number of iterations per factor-of-ten error reduction. We see that SeDuMi's guaranteed iteration complexity k = O(√n log ε^{−1}) is a significant over-estimate; a log-log regression yields k/L = 0.929 n^{0.123} ≈ n^{1/8}, with R^2 = 0.5432. Combined, the data suggests an actual time complexity of T ≈ 10^{−3} n^{1.1} L.

B. Optimal power flow

We now solve instances of the OPF posed on the same 40 power systems mentioned above. Here, we use the MATPOWER function makeYbus to generate the bus admittance matrix Ybus, and then manually generate each constraint matrix Ai from Ybus using the recipes described in [26]. For each example, we formulate the OPF problem to minimize the cost of generation, which is set to equal the real-power injection of each generator at a fixed price per MW, set to $1 without loss of generality. Every system bus is limited to a voltage deviation of 0.05 per-unit, meaning that we constrain each voltage magnitude to lie between 0.95 and 1.05 per-unit. Every generator is constrained by its power curve, comprising minimum and maximum real and reactive power limits. We fixed the load profile according to the original power flow solution.

Figure 2 compares the running times between our embedding-dualization approach and a direct solution with SeDuMi, for a relatively coarse solution with L ≈ 3. Table I gives the associated details for the six largest examples: here n is the number of buses (i.e. vertices), m is the number of branches (i.e. edges), ω is the clique number, "Prep" is the time to convert (SDP) into (CTC), "Sol" is the time used to solve the dualized version, and "Total" is the total time. Our approach manages to solve every problem within seven minutes, with an empirical complexity of T ≈ 8×10^{−3} n^{1.12}. In comparison, the original problem posed directly over the size-n semidefinite cone becomes too time-consuming to solve past n ≈ 100. It is worth emphasizing that both figures were obtained by applying the same solver to the same problem formulated in two different ways.

Despite failing to satisfy Assumption 2, the empirical time complexity of our method is nevertheless near-linear. This result suggests that Assumption 2 is too strong, and can probably be weakened without significantly affecting the complexity figure quoted in Theorem 2.

VI. CONCLUSION

Chordal decomposition and clique tree conversion split a large semidefinite variable X ⪰ 0 into many smaller semidefinite variables Xj ⪰ 0. These smaller variables are coupled by "overlapping constraints", whose block-sparsity pattern coincides with the incidence matrix of a tree. In this paper, we describe a reformulation whose normal equations have a block-sparsity pattern that matches the adjacency matrix of the same tree (under Assumption 2). Such matrices factor with zero block-fill, so any interior-point method solves the reformulation with a per-iteration complexity that is linear in time and memory. We use this insight to solve the MAX 3-CUT relaxation, the Lovasz Theta problem, and the optimal power flow SDP relaxation on power system test cases—the largest of which has n = 13659—in 7 minutes or less using an off-the-shelf solver on a modest computer.

REFERENCES

[1] M. X. Goemans and D. P. Williamson, "Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming," J. ACM, vol. 42, no. 6, pp. 1115-1145, 1995.
[2] A. Frieze and M. Jerrum, "Improved approximation algorithms for MAX k-CUT and MAX BISECTION," Algorithmica, vol. 18, no. 1, pp. 67-81, 1997.
[3] D. K. Molzahn, J. T. Holzer, B. C. Lesieutre, and C. L. DeMarco, "Implementation of a large-scale optimal power flow solver based on semidefinite programming," IEEE Trans. Power Syst., vol. 28, no. 4, pp. 3987-3998, 2013.
[4] R. Madani, M. Ashraphijuo, and J. Lavaei, "Promises of conic relaxation for contingency-constrained optimal power flow problem," IEEE Trans. Power Syst., vol. 31, no. 2, pp. 1297-1307, 2016.
[5] R. Madani, S. Sojoudi, G. Fazelnia, and J. Lavaei, "Finding low-rank solutions of sparse linear matrix inequalities using convex optimization," SIAM J. Optim., vol. 27, no. 2, pp. 725-758, 2017.
[6] R. G. Cowell, P. Dawid, S. L. Lauritzen, and D. J. Spiegelhalter, Probabilistic Networks and Expert Systems: Exact Computational Methods for Bayesian Networks. Springer, 1999.
[7] M. Fukuda, M. Kojima, K. Murota, and K. Nakata, "Exploiting sparsity in semidefinite programming via matrix completion I: General framework," SIAM J. Optim., vol. 11, no. 3, pp. 647-674, 2001.
[8] K. Nakata, K. Fujisawa, M. Fukuda, M. Kojima, and K. Murota, "Exploiting sparsity in semidefinite programming via matrix completion II: Implementation and numerical results," Math. Program., vol. 95, no. 2, pp. 303-327, 2003.
[9] M. S. Andersen, J. Dahl, and L. Vandenberghe, "Logarithmic barriers for sparse matrix cones," Optim. Methods Softw., vol. 28, no. 3, pp. 396-423, 2013.
[10] S. Arnborg, D. G. Corneil, and A. Proskurowski, "Complexity of finding embeddings in a k-tree," SIAM J. Algebraic Discrete Methods, vol. 8, no. 2, pp. 277-284, 1987.
[11] J. R. Gilbert, "Some nested dissection order is nearly optimal," Inf. Process. Lett., vol. 26, no. 6, pp. 325-328, 1988.
[12] H. L. Bodlaender, J. R. Gilbert, H. Hafsteinsson, and T. Kloks, "Approximating treewidth, pathwidth, frontsize, and shortest elimination tree," J. Algorithms, vol. 18, no. 2, pp. 238-255, 1995.
[13] A. Agrawal, P. Klein, and R. Ravi, "Cutting down on fill using nested dissection: Provably good elimination orderings," in Graph Theory and Sparse Matrix Computation. Springer, 1993, pp. 31-55.
[14] J. F. Sturm, "Implementation of interior point methods for mixed semidefinite and second order cone optimization problems," Optim. Methods Softw., vol. 17, no. 6, pp. 1105-1154, 2002.
[15] E. D. Andersen and K. D. Andersen, "The MOSEK interior point optimizer for linear programming: an implementation of the homogeneous algorithm," in High Performance Optimization. Springer, 2000, pp. 197-232.
[16] Y. Ye, M. J. Todd, and S. Mizuno, "An O(√nL)-iteration homogeneous and self-dual linear programming algorithm," Math. Oper. Res., vol. 19, no. 1, pp. 53-67, 1994.
[17] Y. Nesterov and A. Nemirovskii, Interior-Point Polynomial Algorithms in Convex Programming. SIAM, 1994.
[18] S. Kim, M. Kojima, M. Mevissen, and M. Yamashita, "Exploiting sparsity in linear and nonlinear matrix inequalities via positive semidefinite matrix completion," Math. Program., vol. 129, no. 1, pp. 33-68, 2011.
[19] L. Vandenberghe and M. S. Andersen, "Chordal graphs and semidefinite optimization," Found. Trends Optim., vol. 1, no. 4, pp. 241-433, 2015.
[20] M. S. Andersen, A. Hansson, and L. Vandenberghe, "Reduced-complexity semidefinite relaxations of optimal power flow problems," IEEE Trans. Power Syst., vol. 29, no. 4, pp. 1855-1863, 2014.
[21] Y. Sun, M. S. Andersen, and L. Vandenberghe, "Decomposition in conic optimization with partially separable structure," SIAM J. Optim., vol. 24, no. 2, pp. 873-897, 2014.
[22] Z. Lu, A. Nemirovski, and R. D. Monteiro, "Large-scale semidefinite programming via a saddle point mirror-prox algorithm," Math. Program., vol. 109, no. 2-3, pp. 211-237, 2007.
[23] R. Madani, A. Kalbat, and J. Lavaei, "ADMM for sparse semidefinite programming with applications to optimal power flow problem," in Proc. 54th IEEE Conf. Decision and Control (CDC), 2015, pp. 5932-5939.
[24] A. Kalbat and J. Lavaei, "A fast distributed algorithm for decomposable semidefinite programs," in Proc. 54th IEEE Conf. Decision and Control (CDC), 2015, pp. 1742-1749.
[25] L. Lovász, "On the Shannon capacity of a graph," IEEE Trans. Inf. Theory, vol. 25, no. 1, pp. 1-7, 1979.
[26] J. Lavaei and S. H. Low, "Zero duality gap in optimal power flow problem," IEEE Trans. Power Syst., vol. 27, no. 1, pp. 92-107, 2012.
[27] R. Madani, S. Sojoudi, and J. Lavaei, "Convex relaxation for optimal power flow problem: Mesh networks," IEEE Trans. Power Syst., vol. 30, no. 1, pp. 199-211, 2015.
[28] M. S. Andersen, J. Dahl, and L. Vandenberghe, "Implementation of nonsymmetric interior-point methods for linear optimization over sparse matrix cones," Math. Program. Comput., vol. 2, no. 3, pp. 167-201, 2010.
[29] ——, "CVXOPT: A Python package for convex optimization, version 1.1.6," available at cvxopt.org, 2013.
[30] C. Helmberg and F. Rendl, "A spectral bundle method for semidefinite programming," SIAM J. Optim., vol. 10, no. 3, pp. 673-696, 2000.
[31] S. Burer and R. D. Monteiro, "A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization," Math. Program., vol. 95, no. 2, pp. 329-357, 2003.
[32] D. Fulkerson and O. Gross, "Incidence matrices and interval graphs," Pacific J. Math., vol. 15, no. 3, pp. 835-855, 1965.
[33] R. Grone, C. R. Johnson, E. M. Sá, and H. Wolkowicz, "Positive definite completions of partial Hermitian matrices," Linear Algebra Appl., vol. 58, pp. 109-124, 1984.
[34] J. W. Liu, "The role of elimination trees in sparse factorization," SIAM J. Matrix Anal. Appl., vol. 11, no. 1, pp. 134-172, 1990.
[35] J. Löfberg, "Dualize it: software for automatic primal and dual conversions of conic programs," Optim. Methods Softw., vol. 24, no. 3, pp. 313-325, 2009.
[36] D. Goldfarb and K. Scheinberg, "Product-form Cholesky factorization in interior point methods for second-order cone programming," Math. Program., vol. 103, no. 1, pp. 153-179, 2005.
[37] R. D. Zimmerman, C. E. Murillo-Sánchez, and R. J. Thomas, "MATPOWER: Steady-state operations, planning, and analysis tools for power systems research and education," IEEE Trans. Power Syst., vol. 26, no. 1, pp. 12-19, 2011.
[38] J. F. Sturm, "Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones," Optim. Methods Softw., vol. 11, no. 1-4, pp. 625-653, 1999.