Lower Bounds on the Size of Semidefinite Relaxations
Total Page:16
File Type:pdf, Size:1020Kb
Lower bounds on the size of semidefinite relaxations David Steurer Cornell James R. Lee Prasad Raghavendra Washington Berkeley Institute for Advanced Study, November 2015 overview of results unconditional computational lower bounds for classical combinatorial optimization problems o examples: MAX CUT, TRAVELING SALESMAN in restricted but powerful model of computation o generalizes best known algorithms o all possible linear and semidefinite relaxations o first super-polynomial lower bound in this model general program goes back to [Yannakakis’88] for refuting flawed P=NP proofs connection to optimization / convex geometry settle open question about semidefinite lifts of polytopes and positive semidefinite rank [see survey Fazwi, Gouveia, Parrilo, Robinson, Thomas] overview of results proof strategy for lower bounds optimal approximation algorithm in this model achieves best possible approximation guarantees among all poly-time algorithms in this model wide-range of problems: every constraint satisfaction problem concrete algorithm: sum-of-squares (aka Lasserre) hierarchy [Shor’87, Parrilo’00, Lasserre’00] derive lower bounds for general model from known counterexamples (integrality gaps) for sum-of-squares algorithm [Grigoriev, Schoenebeck, Tulsiani, Barak-Chan-Kothari] mathematical programming relaxations: powerful general approach for approximating NP-hard optimization problems three flavors: linear (LP) spectral (A)TSP edge expanders semidefinite (SDP) MAX CUT, CSP, SPARSEST CUT intriguing connection to hardness reductions (e.g., Unique Games Conjecture) plausibly optimal polynomial-time algorithms mathematical programming relaxations: powerful general approach for approximating NP-hard optimization problems Yannakakis’s model motivated by flawed P=NP proofs [Yannakakis’88] formalizes intuitive notion of LP relaxations for problem enough structure for unconditional lower bounds (indep. of P vs. NP) Fiorini-Massar-Pokutta-Tiwary-de Wolf’12, Braun-Pokutta-S., Braverman-Moitra, Chan-Lee-Raghavendra-S., Rothvoß extends to SDP relaxations (but LP lower bound techniques break down) [Fiorini-Massar-Pokutta-Tiwary-de Wolf, Gouveia-Parrilo-Thomas] testing computational complexity conjectures approximation / PCP: UNIQUE GAMES, sliding scale conjecture average-case: RANDOM 3 SAT, PLANTED CLIQUE LP formulations of MAX CUT 푥푖 = −1 find bipartition in given 푛-vertex graph 퐺 푥 = 1 to cut as many edges as possible 푖 2 푛 maximize 푓퐺 푥 = σ푖푗∈퐸 퐺 푥푖 − 푥푗 /4 over 푥 ∈ −1,1 (hypercube) equivalently: maximize σ푖푗∈퐸 퐺 (1 − 푋푖푗)/2 over cut polytope ⊤ 푛 CUT푛 = convex hull of 푥푥 ∣ 푥 ∈ −1,1 #facets is exponential no small direct LP formulation LP formulations of MAX CUT 푥푖 = −1 find bipartition in given 푛-vertex graph 퐺 푥 = 1 to cut as many edges as possible 푖 2 푛 maximize 푓퐺 푥 = σ푖푗∈퐸 퐺 푥푖 − 푥푗 /4 over 푥 ∈ −1,1 (hypercube) equivalently: maximize σ푖푗∈퐸 퐺 (1 − 푋푖푗)/2 over cut polytope ⊤ 푛 CUT푛 = convex hull of 푥푥 ∣ 푥 ∈ −1,1 #facets is exponential no small direct LP formulation 풅 general size-풏 LP formulation of MAX CUT [Yannakakis’88] 푛푑 푑 polytope 푃 ⊆ ℝ defined by ≤ 푛 linear inequalities that projects to CUT푛 often exponential savings: ℓ1-norm unit ball, Held-Karp TSP relaxation, LP/SDP hierarchies size lower bounds for LP formulations of MAX CUT? (implied by NP ≠ P/poly) example: poly-size LP formulation for ℓퟏ-norm ball ℓ1-unit ball project on 푥 variables −푦 ≤ 푥 ≤ 푦 σ 푦 ≤ 1 σ푖 푥푖 ≤ 1 푖 푖 푛 푥 ∈ ℝ푛 푥, 푦 ∈ ℝ 2푛 linear inequalities 2푛 + 1 linear inequalities exponential savings 풅 general size-풏 LP formulation of MAX CUT [Yannakakis’88] 푛푑 푑 polytope 푃 ⊆ ℝ defined by ≤ 푛 linear inequalities that projects to CUT푛 often exponential savings: ℓ1-norm unit ball, Held-Karp TSP relaxation, LP/SDP hierarchies size lower bounds for LP formulations of MAX CUT? (implied by NP ≠ P/poly) general size-풏풅 LP formulation of MAX CUT 푛푑 푑 polytope 푃 ⊆ ℝ defined by ≤ 푛 linear inequalities that projects to CUT푛 size lower bounds for LP formulations of MAX CUT exponential lower bound: 푑 ≥ Ω෩(푛) [Fiorini-Massar-Pokutta-Tiwary-de Wolf’12] approx. ratio > ½ requires superpolynomial size [Chan-Lee-Raghavendra-S.’13] but: best known MAX CUT algorithms based on semidefinite programming SDP general size-풏풅 LP formulation of MAX CUT 푛푑 푑 polytope 푃 ⊆ ℝ defined by ≤ 푛 linear inequalities that projects to CUT푛 푑 푑 spectrahedron 푃 ⊆ ℝ푛 ×푛 defined by intersecting some affine linear subspace with psd cone (artistic freedom) SDP size lower bounds for LP formulations of MAX CUT ? [Lee-Raghavendra-S.’15] exponential lower bound: 푑 ≥ Ω 푛0.1 approx. ratio > 0.99 requires super polynomial size (match NP-hardness for CSPs) best approx. ratio by 푛푑-size SDP no better than 푂(푑)-deg. sum-of-squares sum-of-squares is optimal SDP approximation algorithm for CSPs 2 푛 recall MAX CUT: maximize 푓퐺 푥 = σ푖푗∈퐸 퐺 푥푖 − 푥푗 /4 over 푥 ∈ −1,1 upper bound certificates algorithm with approx. guarantee must certify upper bounds on objective function 푓퐺 approx. ratio 훼 ⇒ algorithm certifies 푓퐺 ≤ 푐 for some 푐 ≤ OPT퐺/훼 can characterize LP/SDP algorithms by their certificates certificates of deg-풅 sum-of-squares algorithm (풏풅-size SDP example) 푛 2 certify 푓 ≥ 0 for function 푓: ±1 → ℝ iff 푓 = σ푖 푔푖 with ∀푖. deg 푔푖 ≤ 푑 equal as f’ns on hypercube captures best known algorithms for wide range of problems deg-ퟏ sum-of-squares captures Goemans-Williamson MAX CUT 0.878-approx. 2 for every graph 퐺, OPT퐺 − 0.878 ⋅ 푓퐺 = σ푖 푔푖 with deg 푔푖 ≤ 1 2 푛 recall MAX CUT: maximize 푓퐺 푥 = σ푖푗∈퐸 퐺 푥푖 − 푥푗 /4 over 푥 ∈ −1,1 certificates of deg-풅 sum-of-squares algorithm (풏풅-size SDP example) 푛 2 certify 푓 ≥ 0 for function 푓: ±1 → ℝ iff 푓 = σ푖 푔푖 with ∀푖. deg 푔푖 ≤ 푑 equal as f’ns on hypercube captures best known algorithms for wide range of problems deg-ퟏ sum-of-squares captures Goemans-Williamson MAX CUT 0.878-approx. 2 for every graph 퐺, OPT퐺 − 0.878 ⋅ 푓퐺 = σ푖 푔푖 with deg 푔푖 ≤ 1 2 푛 recall MAX CUT: maximize 푓퐺 푥 = σ푖푗∈퐸 퐺 푥푖 − 푥푗 /4 over 푥 ∈ −1,1 connection to Unique Games Conjecture best candidate algorithm to refute UGC: deg-푂෨ 1 sum of squares 2 enough to show: ∃푑. ∀퐺. OPT퐺 − 0.87ퟗ ⋅ 푓퐺 = σ푖 푔푖 with deg 푔푖 ≤ 푑 does larger degree help? yes: if 푓 ≥ 0, then 푓 = 푔2 for some function 푔 with deg 푔 ≤ 푛 (but 2푛-size SDP) 2 (tight: ½ − σ푖 푥푖 − ¼ ≥ 0 has no deg-표(푛) s.o.s. certificate) certificates of deg-풅 sum-of-squares algorithm (풏풅-size SDP example) 푛 2 certify 푓 ≥ 0 for function 푓: ±1 → ℝ iff 푓 = σ푖 푔푖 with ∀푖. deg 푔푖 ≤ 푑 equal as f’ns on hypercube captures best known algorithms for wide range of problems deg-ퟏ sum-of-squares captures Goemans-Williamson MAX CUT 0.878-approx. 2 for every graph 퐺, OPT퐺 − 0.878 ⋅ 푓퐺 = σ푖 푔푖 with deg 푔푖 ≤ 1 where are the vectors? 푓퐺 − 푐 퐷 suppose: no deg-푑 sos certificate for 푓퐺 ≥ 푐 2 separating hyperplane 퐷: ±1 푛 → ℝ σ푖 푔푖 ∣ deg 푔푖 ≤ 푑 2 σ푥 퐷 푥 ⋅ 푔 푥 ≥ 0 whenever deg 푔 ≤ 푑 σ푥 퐷(푥) ⋅ 1 = 1 σ푥 퐷 푥 ⋅ 푓퐺 푥 > 푐 ⊤ 푀 = σ푥 퐷 푥 ⋅ 푥푥 is usual SDP solution 푛 ℝ ±1 (in particular 푀 ≽ 0 and 푀푖푖 = 1) ∃ vectors 푣푖 with 푀푖푗 = 푣푖, 푣푗 certificates of deg-풅 sum-of-squares algorithm (풏풅-size SDP example) 푛 2 certify 푓 ≥ 0 for function 푓: ±1 → ℝ iff 푓 = σ푖 푔푖 with ∀푖. deg 푔푖 ≤ 푑 equal as f’ns on hypercube captures best known algorithms for wide range of problems deg-ퟏ sum-of-squares captures Goemans-Williamson MAX CUT 0.878-approx. 2 for every graph 퐺, OPT퐺 − 0.878 ⋅ 푓퐺 = σ푖 푔푖 with deg 푔푖 ≤ 1 where are the vectors? 푓퐺 − 푐 퐷 suppose: no deg-푑 sos certificate for 푓퐺 ≥ 푐 2 separating hyperplane 퐷: ±1 푛 → ℝ σ푖 푔푖 ∣ deg 푔푖 ≤ 푑 2 σ푥 퐷 푥 ⋅ 푔 푥 ≥ 0 whenever deg 푔 ≤ 푑 σ푥 퐷(푥) ⋅ 1 = 1 퐷 behaves like probability σ푥 퐷 푥 ⋅ 푓퐺 푥 > 푐 distribution over MAX CUT ⊤ solutions with expected value > 푐 푀 = σ푥 퐷 푥 ⋅ 푥푥 is usual SDP solution 푛 pseudo-distribution:ℝ ±1 useful (in particular 푀 ≽ 0 and 푀푖푖 = 1) way to think about LP/SDP ∃ vectors 푣푖 with 푀푖푗 = 푣푖, 푣푗 relaxations in general certificates of deg-풅 sum-of-squares algorithm (풏풅-size SDP example) 푛 2 certify 푓 ≥ 0 for function 푓: ±1 → ℝ iff 푓 = σ푖 푔푖 with ∀푖. deg 푔푖 ≤ 푑 equal as f’ns on hypercube captures best known algorithms for wide range of problems deg-ퟏ sum-of-squares captures Goemans-Williamson MAX CUT 0.878-approx. 2 for every graph 퐺, OPT퐺 − 0.878 ⋅ 푓퐺 = σ푖 푔푖 with deg 푔푖 ≤ 1 certificates of deg-풅 sum-of-squares SDP algorithm 푛 2 certify 푓 ≥ 0 for function 푓: ±1 → ℝ iff 푓 = σ푖 푔푖 with ∀푖. deg 푔푖 ≤ 푑 certificates of general 풏풅-size SDP algorithm 푑 푑 characterized by psd-matrix valued function 푄: ±1 푛 → ℝ푛 ×푛 2 certify 푓 ≥ 0 iff ∃푃 ≽ 0. ∀푥 ∈ ±1 푛. 푓 푥 = Tr 푃푄(x) = 푃 푄 푥 퐹 ⊤ example: deg-풅 sum-of-squares SDP algorithm, 푄 푥 = 푥⊗푑 푥⊗푑 general SDP Q captured by deg-풅 sum-of-squares if deg 푄 푥 ≤ 푑 certificates of deg-풅 sum-of-squares SDP algorithm 푛 2 certify 푓 ≥ 0 for function 푓: ±1 → ℝ iff 푓 = σ푖 푔푖 with ∀푖. deg 푔푖 ≤ 푑 certificates of general 풏풅-size SDP algorithm 푑 푑 characterized by psd-matrix valued function 푄: ±1 푛 → ℝ푛 ×푛 2 certify 푓 ≥ 0 iff ∃푃 ≽ 0.