arXiv:2005.07030v6 [cs.DS] 30 Jan 2021

A Polynomial-Time Algorithm for Unconstrained Binary Quadratic Optimization

Juan Ignacio Mulero-Martínez
Department of Automatic Control, Electrical Engineering and Electronic Technology, Technical University of Cartagena, Campus Muralla del Mar, 30203 Cartagena, Spain. E-mail: [email protected]

Abstract

In this paper, an exact algorithm is developed to solve unrestricted binary quadratic programs in polynomial time. The computational complexity, although very conservative, is sufficient to prove that this minimization problem is in the complexity class P. The implementation aspects are also described in detail, with special emphasis on the transformation of the quadratic program into a linear program that can be solved in polynomial time. The algorithm was implemented in MATLAB and checked by generating five million random matrices of arbitrary dimensions up to 30 with random entries in the range [−50, 50]. All the experiments carried out have revealed that the method works correctly.

Keywords: Unconstrained binary quadratic programming, global optimization, complexity measures and classes

Preprint submitted to ArXiv, February 2, 2021

1. Introduction

The unconstrained binary quadratic programming (UBQP) problem occurs in many applications in computer vision, including but not limited to image processing, image registration/matching, image denoising/restoration, pattern recognition, image segmentation/pixel labeling, clustering, partitioning of graphs, and data classification. Much of the algorithmic progress has been due to the computer vision research community [1], [2], [3], [4]. For example, the objective functions in the UBQP problem are a class of energy functions that are widely useful and have had a very striking success in vision (see [5] for a recent survey).

The UBQP problem dates back to the 1960s, when pseudo-Boolean functions were introduced by Hammer and Rudeanu [6]. Since then, it has become an active research topic in Discrete Mathematics and Complexity Theory ([7] and [8] give a good account of this area). The problem has become particularly prominent in recent years due to the discovery that UBQP represents a unifying framework for a very wide variety of combinatorial optimization problems. In particular, as pointed out in [9], the UBQP model includes the following important combinatorial optimization problems: maximum cut problems, maximum clique problems, maximum independent set problems, graph coloring problems, satisfiability problems, quadratic knapsack problems, etc.

The UBQP problem is generally NP-hard [10] (the UBQP problem can be used to optimize the number of constraints satisfied on a 0/1 integer programming instance, one of Karp's 21 NP-complete problems). Only a few special cases are solvable in polynomial time.
In fact, the problem of determining local minima of pseudo-Boolean functions is in the PLS-complete class (the class of hardest polynomial local search problems), [10], [11], and in general, local search problems are found in the EXP class, [12], [13], [14], [15], [16], [17]. Global optimization methods are NP-complete. To obtain a global optimal solution by exact methods (generally based on branch and bound strategies), the following techniques should be highlighted: the combinatorial variable elimination algorithm¹, [6], [18], [19]; the continuous relaxation with linearization (where the requirement of binary variables is replaced by the weaker restriction of membership to the closed interval [0, 1]), [20], [21]; the posiform transformations, [22], [23]; the conflict graphs (the connection between the posiform minimization problem and the maximum weighted stability problem), [24], [25], [26], [27], [28], [29]; the linearization strategies such as standard linearization (consisting of transforming the minimization problem into an equivalent linear 0–1 programming problem), [30], [31], [32], [33], the Glover method, [34], and the improved linearization strategy, [35] (the reader is referred to [36] for a recent comparison of these methods); semidefinite-based solvers, [37] (and the references therein); et cetera. The reader is referred to the survey [38] for a detailed description of these techniques up to 2014.

Many researchers have extensively studied the UBQP problem; however, to date nobody has succeeded in developing an algorithm running in polynomial time. We claim, and this is the main contribution of this work, that UBQP is in the complexity class P. The main idea is to transform the UBQP problem into a linear programming (LP) problem, which is solved in polynomial time. We guarantee that the minimum of the LP problem is also the minimum of the UBQP problem.
We also provide the implementation details of the algorithm, motivated by the following aspects that any work on discrete optimization should present:

(i) Describe the algorithms in enough detail that the experiments can be reproduced and even improved in the future.

(ii) Provide the source code so that it is openly available to the scientific community. Interestingly, a recent study by Dunning has revealed that only 4% of papers on heuristic methods provide the source code, [39].
¹This algorithm is in the class EXP and only runs in polynomial time for pseudo-Boolean functions associated with graphs of bounded tree-width.
(iii) Establish random test problems with an arbitrary input size. Here it is important to indicate the ranges of the parameters in the UBQP problem.

This procedure has been implemented in MATLAB (source code is provided as supplementary material) and checked with five million random matrices up to dimension 30, with entries in the range [−50, 50]. An advantage of this algorithm is its modularity with respect to the dimension of the problem: the set of linear constraints of the equivalent linear programming problem is fixed for a given arbitrary dimension, regardless of the objective function of the quadratic problem. Finally, we highlight that the objective of this work is not the speed of resolution of the problem but simply to show that the UBQP problem can be solved in polynomial time. Future works will analyze large-scale UBQP problems as well as the design of more efficient polynomial-time algorithms.

The paper is organized as follows: Section 2 describes the relaxation process for the UBQP problem. Next, in Section 3, the main result about the equivalence of the UBQP problem with a linear programming problem is addressed. For simplicity in the exposition, the case n = 3 is presented first and then it is generalized for n > 3. The computational complexity in both time and space is analyzed in Section 4. The implementation features concerning primary variables, transformation of the objective function, and convexity and consistency constraints are treated in Section 5. The design of the experiment for testing the solution is presented in Section 6. Finally, Section 7 is dedicated to discussing the main aspects presented in this work as well as possible future works.
2. Background
Let B = {0, 1} and f : B^n → R be a quadratic objective function defined as f(x) = x^T Q x + b^T x, with Q = Q^T ∈ R^{n×n}, diag(Q) = (0, ..., 0), and b ∈ R^n. The UBQP problem is defined as follows:
(UBQP): min_{x∈B^n} f(x).
The objective function f is usually called a quadratic pseudo-Boolean function, i.e. a multilinear polynomial in binary unknowns. These functions represent a class of energy functions that are widely useful and have had very striking success in computer vision (see [5] for a recent survey). This problem can naturally be extended to the solid hypercube H_n = [0, 1]^n spanned by B^n. The extension of the pseudo-Boolean function f : B^n → R is a polynomial function f^pol : H_n → R that coincides with f at the vertices of H_n. Rosenberg discovered an attractive feature regarding the multilinear polynomial extension f^pol, [40]: the minimum of f^pol is always attained at a vertex of H_n, and hence this minimum coincides with the minimum of f. From this, our optimization problem is reduced to the following relaxed quadratic problem:
(P_n): min_{x∈H_n} f(x).
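As an illustrative aside (not part of the paper, whose experiments are in MATLAB; NumPy is assumed), the vertex-minimum property behind this relaxation can be checked numerically on a small random instance. The brute-force enumeration below is exponential and only meant for tiny n:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 4
Q = rng.integers(-50, 51, size=(n, n)).astype(float)
Q = (Q + Q.T) / 2
np.fill_diagonal(Q, 0.0)                 # diag(Q) = 0 as in the UBQP statement
b = rng.integers(-50, 51, size=n).astype(float)

def f(x):
    x = np.asarray(x, dtype=float)
    return x @ Q @ x + b @ x

# Minimum over the vertices {0,1}^n (brute force, exponential in n).
vertex_min = min(f(v) for v in itertools.product([0, 1], repeat=n))

# Because f is multilinear on H_n, no sampled interior point beats the best vertex.
interior_best = min(f(rng.random(n)) for _ in range(20000))
assert interior_best >= vertex_min - 1e-9
```

The entry range [−50, 50] mirrors the one used in the paper's experiments.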
3. Main Result
In this section, we prove that Problem (P_n) can be reduced to a Linear Programming Problem.
3.1. A Simple Case

We begin with the simple case of minimization of a quadratic form f(x) in the cube H_3. Here the minimization problem is stated as follows:
(P_3): min_{x∈H_3} f(x).
Associated with the cube H_3 we have a map φ : H_3 → [0, 2]^3 × [0, 1/2]^3 defined as

φ(x_1, x_2, x_3) = ( (x_1 + 2x_1x_2 + x_2)/2, (x_1 + 2x_1x_3 + x_3)/2, (x_2 + 2x_2x_3 + x_3)/2, (x_1 − 2x_1x_2 + x_2)/2, (x_1 − 2x_1x_3 + x_3)/2, (x_2 − 2x_2x_3 + x_3)/2 )^T.

An important fact is that the cube H_3 can be expressed as the convex hull of a finite set of vertices V = {0, 1}^3. For simplicity, we enumerate the vertices in V as p_1, p_2, ..., p_8 so that H_3 can be written as convex combinations of those vertices, i.e. H_3 = conv(V), where
conv(V) = { Σ_{i=1}^{8} α_i p_i : α_i ≥ 0, Σ_{i=1}^{8} α_i = 1 }.

The map φ is a composition of the maps α : H_3 → [0, 1]^6 and β : [0, 1]^6 → [0, 2]^3 × [0, 1/2]^3 defined as

α(x) = (x_1, x_1x_2, x_1x_3, x_2, x_2x_3, x_3), (1)
β(y) = E_3 y for every y ∈ [0, 1]^6, (2)

where E_3 is

E_3 = (1/2) [ 1  2  0  1  0  0
              1  0  2  0  0  1
              0  0  0  1  2  1
              1 −2  0  1  0  0
              1  0 −2  0  0  1
              0  0  0  1 −2  1 ]. (3)

More specifically, φ = β ∘ α. As a summary, the maps φ, α, and β are represented in the diagram of Figure 1; in coordinates,

β(y) = (1/2) ( y_1 + 2y_2 + y_4, y_1 + 2y_3 + y_6, y_4 + 2y_5 + y_6, y_1 − 2y_2 + y_4, y_1 − 2y_3 + y_6, y_4 − 2y_5 + y_6 )^T.

The map φ is the composition of α with β, i.e. φ = β ∘ α, where α can be built from H_3 as a selection of the Kronecker product x̃ ⊗ x̃ with x̃ = (1, x^T)^T
Figure 1: Diagram for the maps φ, α, and β.
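A minimal numerical sketch (illustrative, NumPy assumed; E_3 is the matrix in (3)) checking that β(α(x)) = φ(x) on H_3:

```python
import numpy as np

E3 = 0.5 * np.array([
    [1,  2,  0, 1,  0, 0],   # u12 = (x1 + 2*x1*x2 + x2)/2
    [1,  0,  2, 0,  0, 1],   # u13
    [0,  0,  0, 1,  2, 1],   # u23
    [1, -2,  0, 1,  0, 0],   # v12 = (x1 - 2*x1*x2 + x2)/2
    [1,  0, -2, 0,  0, 1],   # v13
    [0,  0,  0, 1, -2, 1],   # v23
])

def alpha(x):
    x1, x2, x3 = x
    return np.array([x1, x1*x2, x1*x3, x2, x2*x3, x3])

def phi(x):
    x1, x2, x3 = x
    return 0.5 * np.array([
        x1 + 2*x1*x2 + x2, x1 + 2*x1*x3 + x3, x2 + 2*x2*x3 + x3,
        x1 - 2*x1*x2 + x2, x1 - 2*x1*x3 + x3, x2 - 2*x2*x3 + x3])

rng = np.random.default_rng(1)
for _ in range(100):
    x = rng.random(3)
    assert np.allclose(E3 @ alpha(x), phi(x))   # phi = beta o alpha
```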
and x ∈ H_3. The set H̃_3 = { x̃ = (1, x^T)^T : x ∈ H_3 } is convex: this is trivial simply by building a convex combination of two points x̃ and ỹ in H̃_3,
λx̃ + (1 − λ)ỹ = (1, (λx + (1 − λ)y)^T)^T ∈ H̃_3, with λ ∈ [0, 1].

Since H_3 = conv(V), it follows that x = Σ_{i=1}^{8} λ_i p_i where Σ_{i=1}^{8} λ_i = 1, λ_i ≥ 0, and p_i ∈ V. The set of vertices of H̃_3 is

Ṽ = { p̃_i = (1, p_i^T)^T : i = 1, ..., 8 }.
So H̃_3 = conv(Ṽ) and x̃ ⊗ x̃ is written as a convex combination:

x̃ ⊗ x̃ = Σ_{i,j=1}^{8} λ_i λ_j (p̃_i ⊗ p̃_j).
There exists a matrix S3 given by
S_3 = (1/2) [ 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0
              0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0
              0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0
              0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0
              0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0
              0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 ],

such that α(x) = S_3 (x̃ ⊗ x̃).
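The selection property α(x) = S_3(x̃ ⊗ x̃) can likewise be sketched numerically (illustrative, NumPy assumed; each monomial appears twice in the symmetric Kronecker product and the two copies are averaged):

```python
import numpy as np

# 0-based positions of the two symmetric copies of each monomial of
# alpha(x) = (x1, x1x2, x1x3, x2, x2x3, x3) inside xt ⊗ xt with xt = (1, x).
copies = [(1, 4), (6, 9), (7, 13), (2, 8), (11, 14), (3, 12)]
S3 = np.zeros((6, 16))
for row, (i, j) in enumerate(copies):
    S3[row, i] = S3[row, j] = 0.5      # average the two copies

def alpha(x):
    x1, x2, x3 = x
    return np.array([x1, x1*x2, x1*x3, x2, x2*x3, x3])

rng = np.random.default_rng(2)
for _ in range(100):
    x = rng.random(3)
    xt = np.concatenate(([1.0], x))
    assert np.allclose(S3 @ np.kron(xt, xt), alpha(x))
```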
Figure 2: Map φ between the cube H3 and the 6-dimensional convex-hull C3.
From the map φ, another convex hull is built: C_3 = conv(φ(V)). In Figure 2, the transformation between H_3 and C_3 through the map φ is illustrated. From φ(x) we can recover x through the linear transformation x = Lφ(x), where

L = (1/2) [ 1  1 −1  1  1 −1
            1 −1  1  1 −1  1
           −1  1  1 −1  1  1 ].

We have seen that the points of C_3 are convex combinations of the vertices φ(p_i), i = 1, ..., 8. For simplicity, we write such points as w = Bλ, where the matrix B is given by
B = [ 0   0   1/2 1/2 1/2 1/2 2   2
      0   1/2 0   1/2 1/2 2   1/2 2
      0   1/2 1/2 2   0   1/2 1/2 2
      0   0   1/2 1/2 1/2 1/2 0   0
      0   1/2 0   1/2 1/2 0   1/2 0
      0   1/2 1/2 0   0   1/2 1/2 0 ], (4)

and the vector λ ∈ [0, 1]^8 is such that Σ_{i=1}^{8} λ_i = 1.

Next we show with a counterexample that C_3 ⊄ φ(H_3). For this, let w be the point of C_3 given by w = B(e_2 + e_3)/2, where e_k represents the standard basis vector in R^8 with a '1' at the k-th component. Suppose that there was an x ∈ H_3 such that φ(x) = w; then x = Lφ(x) = Lw, and therefore it should be
verified that φ(Lw) = w. However, a contradiction would be reached since

φ(Lw) = (1/8, 1/8, 5/16, 1/8, 1/8, 3/16)^T ≠ (1/4, 1/4, 1/2, 1/4, 1/4, 1/2)^T = w,

where Lw = (0, 1/4, 1/4)^T. This result reveals that φ(H_3) is not convex, because otherwise, since C_3 has its vertices in φ(H_3), it would follow that C_3 ⊆ φ(H_3). The image of H_3 through φ is, however, contained in C_3.
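Both the recovery x = Lφ(x) and the failure of convexity can be checked numerically. The sketch below (illustrative, NumPy assumed, with the binary vertex ordering used above) only asserts that the midpoint w has no preimage, without relying on the particular decimal values:

```python
import numpy as np

def phi(x):
    x1, x2, x3 = x
    return 0.5 * np.array([
        x1 + 2*x1*x2 + x2, x1 + 2*x1*x3 + x3, x2 + 2*x2*x3 + x3,
        x1 - 2*x1*x2 + x2, x1 - 2*x1*x3 + x3, x2 - 2*x2*x3 + x3])

# L recovers x from phi(x) on the image phi(H_3).
L = 0.5 * np.array([[ 1,  1, -1,  1,  1, -1],
                    [ 1, -1,  1,  1, -1,  1],
                    [-1,  1,  1, -1,  1,  1]])

rng = np.random.default_rng(6)
for _ in range(100):
    x = rng.random(3)
    assert np.allclose(L @ phi(x), x)

# Counterexample: the midpoint w of phi(p2) and phi(p3) lies in C_3, but
# its only possible preimage x = Lw does not map back onto it.
p2, p3 = np.array([0.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0])
w = (phi(p2) + phi(p3)) / 2
assert not np.allclose(phi(L @ w), w)   # hence phi(H_3) is not convex
```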
Lemma 1. φ(H_3) ⊆ C_3.

Proof. Matrix B has rank 6. In fact, the submatrix B̃ formed by columns 2, ..., 7 of B has full rank. Let x be an arbitrary point in H_3; we will prove that φ(x) ∈ C_3. For this, we must find a vector λ ∈ [0, 1]^8 such that Σ_{i=1}^{8} λ_i = 1 and

Bλ = φ(x). (5)
For simplicity we will use the notation λ̃ = (λ_2, λ_3, λ_4, λ_5, λ_6, λ_7)^T. Then the vector λ̃ will depend on the point x in H_3 and on the free parameter λ_8:
λ̃ = (B̃^T B̃)^{−1} B̃^T (φ(x) − B_8 λ_8) = ( λ_8 + x_3 − x_1x_3 − x_2x_3, λ_8 + x_2 − x_1x_2 − x_2x_3, x_2x_3 − λ_8, λ_8 + x_1 − x_1x_2 − x_1x_3, x_1x_3 − λ_8, x_1x_2 − λ_8 )^T,

where B_8 denotes the eighth column of B. Furthermore λ_1, λ̃, and λ_8 should satisfy the convexity constraints:
0 ≤ x_i x_j − λ_8 ≤ 1, for i, j ∈ {1, 2, 3} with i ≠ j, (6)

0 ≤ λ_8 + x_i (x_j + x_k − 1) ≤ 1, for distinct i, j, k ∈ {1, 2, 3}, (7)

0 ≤ λ_1, λ_8 ≤ 1, (8)

λ_1 + λ_8 + x_1 + x_2 + x_3 − x_1x_2 − x_1x_3 − x_2x_3 = 1. (9)
The identity (9) is precisely the convexity constraint Σ_{i=1}^{8} λ_i = 1. Let us define the following quantities:
M1 (x) = min {1 − x1 (1 − x2 − x3) , 1 − x2 (1 − x1 − x3) , 1 − x3 (1 − x1 − x2)} ,
M2 (x) = min {x1x2, x1x3, x2x3} ,
m1 (x) = max {−x1 (1 − x2 − x3) , −x2 (1 − x1 − x3) , −x3 (1 − x1 − x2)} ,
m2 (x) = max {x1x2 − 1, x1x3 − 1, x2x3 − 1} .
7 It can be verified that m1 (x) ≤ 1, m2 (x) ≤ 0 and that M1 (x) ,M2 (x) ≥ 0: multilinear polynomials reach their extremes at the vertices of H3, so
min_{x∈H_3} [1 − x_i (1 − x_j − x_k)] = 0,

min_{x∈H_3} x_i x_j = 0,

max_{x∈H_3} [−x_i (1 − x_j − x_k)] = 1,

max_{x∈H_3} [x_i x_j − 1] = 0.
Then, max {m1 (x) ,m2 (x)}≤ λ8 ≤ min {M1 (x) ,M2 (x)} . On the other hand, m3 (x)= − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3) ≤ λ8 ≤ 1−(x1 + x2 + x3 − x1x2 − x1x3 − x2x3)= M3 (x) .
It is immediate to verify that minx∈H3 M3 (x) = maxx∈H3 m3 (x) = 0, which implies that m3 (x) ≤ 0 and that M3 (x) ≥ 0 in the hypercube H3. Now it is enough to prove that: (i) −xi (1 − xj − xk) ≤ 1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3): This boils down to simply analyzing the case (i, j, k)=(1, 2, 3) since the polynomial that appears on the right side is symmetric (for the cases (i, j, k) = (2, 1, 3) and (i, j, k)=(3, 1, 2) the proof would be identical). We will analyze the sign of the multilinear polynomial, p (x1, x2, x3)= −x1 (1 − x2 − x3)−(1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3)) .
The maximum of p in the hypercube H3 is reached at one of its vertices with value 0. With this, we have shown that
m1 (x) ≤ 1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3).
(ii) xixj − 1 ≤ 1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3). By the same argument as in (i) we reduce the problem to the pair (i, j)=(1, 2) and simply analyze the multilinear polynomial,
p (x1, x2, x3)= x1x2 − 1 − (1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3)) , whose maximum in H3 is 0. In this way,
m2 (x) ≤ 1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3).
(iii) − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3) ≤ 1 − xi (1 − xj − xk). Again we reduce it to (i, j, k)=(1, 2, 3), and we construct the multilinear polynomial: p (x1, x2, x3)= − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3)−(1 − x1 (1 − x2 − x3)) .
It can be verified that min_{x∈H_3} p(x_1, x_2, x_3) = −2, so
− (x1 + x2 + x3 − x1x2 − x1x3 − x2x3) ≤ M1.
(iv) − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3) ≤ xixj . We reduce it to (i, j) = (1, 2) and to the polynomial
p (x1, x2, x3)= − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3) − x1x2,
such that again minx∈H3 p (x1, x2, x3)= −2. Thus,
− (x1 + x2 + x3 − x1x2 − x1x3 − x2x3) ≤ M2.
With this, we have proven that
max {m1 (x) ,m2 (x)}≤ λ8 ≤ M3 (x),
m3 (x) ≤ λ8 ≤ min {M1 (x) ,M2 (x)} , and that
max {m1 (x) ,m2 (x) ,m3 (x)}≤ λ8 ≤ min {M1 (x) ,M2 (x) ,M3 (x)} = M (x).
Since m2 (x) ,m3 (x) ≤ 0, we should simply take λ8 ∈ [max {0,m1 (x)} , min {M (x) , 1}] and λ1 =1 − (λ8 + x1 + x2 + x3 − x1x2 − x1x3 − x2x3).
Since the previous proof is constructive, in the following examples we will show how λ ∈ [0, 1]^8 can be constructed from x ∈ H_3. We must emphasize that since λ_8 moves in a permitted interval [max{0, m_1(x)}, min{M(x), 1}], in general there can be many solutions.
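Before the worked examples, the constructive proof of Lemma 1 can be condensed into a short routine (an illustrative sketch in Python, not the paper's MATLAB implementation) that picks λ_8 = max{0, m_1(x)} and checks the convexity constraints and Bλ = φ(x) on random points:

```python
import numpy as np

# Columns of B are phi(p_1), ..., phi(p_8) in binary vertex order.
B = np.array([
    [0, 0.0, 0.5, 0.5, 0.5, 0.5, 2.0, 2.0],
    [0, 0.5, 0.0, 0.5, 0.5, 2.0, 0.5, 2.0],
    [0, 0.5, 0.5, 2.0, 0.0, 0.5, 0.5, 2.0],
    [0, 0.0, 0.5, 0.5, 0.5, 0.5, 0.0, 0.0],
    [0, 0.5, 0.0, 0.5, 0.5, 0.0, 0.5, 0.0],
    [0, 0.5, 0.5, 0.0, 0.0, 0.5, 0.5, 0.0]])

def phi(x):
    x1, x2, x3 = x
    return 0.5 * np.array([
        x1 + 2*x1*x2 + x2, x1 + 2*x1*x3 + x3, x2 + 2*x2*x3 + x3,
        x1 - 2*x1*x2 + x2, x1 - 2*x1*x3 + x3, x2 - 2*x2*x3 + x3])

def build_lambda(x):
    x1, x2, x3 = x
    m1 = max(-x1*(1 - x2 - x3), -x2*(1 - x1 - x3), -x3*(1 - x1 - x2))
    lam8 = max(0.0, m1)            # any value in the permitted interval works
    s = x1 + x2 + x3 - x1*x2 - x1*x3 - x2*x3
    return np.array([
        1 - lam8 - s,              # lambda_1 from the convexity constraint (9)
        lam8 + x3 - x1*x3 - x2*x3,
        lam8 + x2 - x1*x2 - x2*x3,
        x2*x3 - lam8,
        lam8 + x1 - x1*x2 - x1*x3,
        x1*x3 - lam8,
        x1*x2 - lam8,
        lam8])

rng = np.random.default_rng(3)
for _ in range(200):
    x = rng.random(3)
    lam = build_lambda(x)
    assert np.all(lam >= -1e-9) and np.isclose(lam.sum(), 1.0)
    assert np.allclose(B @ lam, phi(x))
```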
Example 2. Let x = (1, 1/2, 1/2)^T. According to the notation in Lemma 1:
M_1(x) = min{1 − x_1(1 − x_2 − x_3), 1 − x_2(1 − x_1 − x_3), 1 − x_3(1 − x_1 − x_2)} = min{1, 5/4, 5/4} = 1,
M_2(x) = min{x_1x_2, x_1x_3, x_2x_3} = min{1/2, 1/2, 1/4} = 1/4,
m_1(x) = max{−x_1(1 − x_2 − x_3), −x_2(1 − x_1 − x_3), −x_3(1 − x_1 − x_2)} = max{0, 1/4, 1/4} = 1/4,
M_3(x) = 1 − (x_1 + x_2 + x_3 − x_1x_2 − x_1x_3 − x_2x_3) = 1/4,

and as a result M(x) = min{M_1(x), M_2(x), M_3(x)} = 1/4, and λ_8 = 1/4, λ_1 = 0. With this selection, we will have

λ = (λ_1, λ_8 + x_3 − x_1x_3 − x_2x_3, λ_8 + x_2 − x_1x_2 − x_2x_3, x_2x_3 − λ_8, λ_8 + x_1 − x_1x_2 − x_1x_3, x_1x_3 − λ_8, x_1x_2 − λ_8, λ_8)^T = (0, 0, 0, 0, 1/4, 1/4, 1/4, 1/4)^T.

For this λ we verify that Bλ = φ(x) = (5/4, 5/4, 3/4, 1/4, 1/4, 1/4)^T.

Example 3. Let (x_1, x_2, x_3) = (0, 1/4, 1/4):
M_1(x) = min{1 − x_1(1 − x_2 − x_3), 1 − x_2(1 − x_1 − x_3), 1 − x_3(1 − x_1 − x_2)} = min{1, 13/16, 13/16} = 13/16,
M_2(x) = min{x_1x_2, x_1x_3, x_2x_3} = min{0, 0, 1/16} = 0,
m_1(x) = max{−x_1(1 − x_2 − x_3), −x_2(1 − x_1 − x_3), −x_3(1 − x_1 − x_2)} = max{0, −3/16, −3/16} = 0,
M_3(x) = 1 − (x_1 + x_2 + x_3 − x_1x_2 − x_1x_3 − x_2x_3) = 9/16,

and as a result M(x) = min{M_1(x), M_2(x), M_3(x)} = 0, and λ_8 = 0, λ_1 = 9/16. With this selection we obtain the vector λ = (9/16, 3/16, 3/16, 1/16, 0, 0, 0, 0)^T. Now for this λ, we check that Bλ = φ(x):
Bλ = (1/8, 1/8, 5/16, 1/8, 1/8, 3/16)^T = φ(x).

Example 4. For the point x = (1, 1/2, 0)^T we compute the quantities M_i(x) and m_i(x):
M_1(x) = min{1 − x_1(1 − x_2 − x_3), 1 − x_2(1 − x_1 − x_3), 1 − x_3(1 − x_1 − x_2)} = min{1/2, 1, 1} = 1/2,
M_2(x) = min{x_1x_2, x_1x_3, x_2x_3} = min{1/2, 0, 0} = 0,
m_1(x) = max{−x_1(1 − x_2 − x_3), −x_2(1 − x_1 − x_3), −x_3(1 − x_1 − x_2)} = max{−1/2, 0, 0} = 0,
M_3(x) = 1 − (x_1 + x_2 + x_3 − x_1x_2 − x_1x_3 − x_2x_3) = 0,

which implies that M(x) = min{M_1(x), M_2(x), M_3(x)} = 0, and λ_8 = 0, λ_1 = 0. Using this selection we have that
λ = (0, 0, 0, 0, 1/2, 0, 1/2, 0)^T, and for this λ we verify that Bλ = φ(x) = (5/4, 1/2, 1/4, 1/4, 1/2, 1/4)^T.

The following example shows that the associated vector λ for a given φ(x) is not unique.
Example 5. Let x = (1/4, 1/4, 1/4)^T:

M_1(x) = min{1 − x_1(1 − x_2 − x_3), 1 − x_2(1 − x_1 − x_3), 1 − x_3(1 − x_1 − x_2)} = min{7/8, 7/8, 7/8} = 7/8,
M_2(x) = min{x_1x_2, x_1x_3, x_2x_3} = min{1/16, 1/16, 1/16} = 1/16,
m_1(x) = max{−x_1(1 − x_2 − x_3), −x_2(1 − x_1 − x_3), −x_3(1 − x_1 − x_2)} = max{−1/8, −1/8, −1/8} = −1/8,
M_3(x) = 1 − (x_1 + x_2 + x_3 − x_1x_2 − x_1x_3 − x_2x_3) = 7/16,
and as a consequence M(x) = min{M_1(x), M_2(x), M_3(x)} = 1/16, and λ_8 ∈ [0, 1/16]. If we select λ_8 = 1/32 then λ_1 = 13/32 and we will have the vector λ = (13/32, 5/32, 5/32, 1/32, 5/32, 1/32, 1/32, 1/32)^T. For this λ we check that Bλ = φ(x):
Bλ = (5/16, 5/16, 5/16, 3/16, 3/16, 3/16)^T = φ(x).

On the other hand, with the selection λ_8 = 1/16, we would have λ_1 = 3/8, and
therefore λ = (3/8, 3/16, 3/16, 0, 3/16, 0, 0, 1/16)^T. For this λ we check again that Bλ = φ(x):
Bλ = (5/16, 5/16, 5/16, 3/16, 3/16, 3/16)^T = φ(x).

The elements in C_3 are called primary variables and are denoted as w. We will make a distinction between primary variables u ∈ [0, 2]^3 and primary variables v ∈ [0, 1/2]^3, which make up the vector w as w = (u^T, v^T)^T.
Definition 6 (Primary variables). For the triplet of variables (x_1, x_2, x_3) ∈ H_3 we define the primary variables for 1 ≤ i < j ≤ 3:

u_ij = (x_i + 2x_i x_j + x_j)/2, (10)
v_ij = (x_i − 2x_i x_j + x_j)/2, (11)

where u_ij ∈ [0, 2] and v_ij ∈ [0, 1/2].

The vector of primary variables w = (u^T, v^T)^T is such that u = (u_12, u_13, u_23)^T is given by (10) and v = (v_12, v_13, v_23)^T by (11). The advantage of defining these variables is that they satisfy the following relationships:

(i) Cross-products in the objective function (corresponding to the off-diagonal entries in Q):

x_i x_j = (u_ij − v_ij)/2 for 1 ≤ i < j ≤ 3. (12)

(ii) Single variables (corresponding to the vector b):

x_i = (u_ij + v_ij + u_ik + v_ik − u_jk − v_jk)/2 for distinct i, j, k ∈ {1, 2, 3} with j < k. (13)
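Relations (12) and (13) are straightforward to check numerically (illustrative sketch, NumPy assumed, with 0-based indices in the code):

```python
import numpy as np

rng = np.random.default_rng(5)
for _ in range(100):
    x = rng.random(3)
    u, v = {}, {}
    for i in range(3):
        for j in range(i + 1, 3):
            u[i, j] = (x[i] + 2*x[i]*x[j] + x[j]) / 2   # (10)
            v[i, j] = (x[i] - 2*x[i]*x[j] + x[j]) / 2   # (11)
    for i, j in u:
        assert np.isclose((u[i, j] - v[i, j]) / 2, x[i] * x[j])      # (12)
    # (13) for x_1 with the triplet (1, 2, 3):
    assert np.isclose((u[0, 1] + v[0, 1] + u[0, 2] + v[0, 2]
                       - u[1, 2] - v[1, 2]) / 2, x[0])
```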
We have seen in Lemma 1 that given an x ∈ H3 there always exists a w ∈C3 such that φ (x)= w.
We define the linear transformation ϕ : C_3 → R^6 given by ϕ(w) = T_3 w, with
T_3 = (1/2) [  1  1 −1  1  1 −1
               1  0  0 −1  0  0
               0  1  0  0 −1  0
               1 −1  1  1 −1  1
               0  0  1  0  0 −1
              −1  1  1 −1  1  1 ].

We claim that ϕ ∘ β = id; this can be checked by inspection:
T_3 E_3 = I_6, (14)

where E_3 was given in (3).

Remark 7. According to the definition of α there exists a vector c ∈ R^6 such that f(x) = c^T α(x). Additionally, there always exists a vector c̃ ∈ R^6 such that f(x) = c̃^T φ(x), where

c̃^T = c^T T_3. (15)

This is due to c̃^T E_3 α(x) = c^T T_3 E_3 α(x), and we know from (14) that T_3 E_3 = I. Letting f̃(w) = c̃^T w, we define the optimization problem:

(P′_3): min_{w∈C_3} f̃(w).
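The identity (14), and the resulting equality c̃^T φ(x) = c^T α(x) behind (15), can be verified by direct multiplication (illustrative sketch, NumPy assumed):

```python
import numpy as np

E3 = 0.5 * np.array([
    [1,  2,  0, 1,  0, 0],
    [1,  0,  2, 0,  0, 1],
    [0,  0,  0, 1,  2, 1],
    [1, -2,  0, 1,  0, 0],
    [1,  0, -2, 0,  0, 1],
    [0,  0,  0, 1, -2, 1]])
T3 = 0.5 * np.array([
    [ 1,  1, -1,  1,  1, -1],   # x1 = g_{1,2,3}(w)
    [ 1,  0,  0, -1,  0,  0],   # x1*x2 = (u12 - v12)/2
    [ 0,  1,  0,  0, -1,  0],   # x1*x3
    [ 1, -1,  1,  1, -1,  1],   # x2
    [ 0,  0,  1,  0,  0, -1],   # x2*x3
    [-1,  1,  1, -1,  1,  1]])  # x3
assert np.allclose(T3 @ E3, np.eye(6))                  # identity (14)

# Consequently c^T alpha(x) = (c^T T3) E3 alpha(x) = ctilde^T phi(x):
rng = np.random.default_rng(7)
c = rng.normal(size=6)
ct = T3.T @ c
for _ in range(50):
    x = rng.random(3)
    alpha = np.array([x[0], x[0]*x[1], x[0]*x[2], x[1], x[1]*x[2], x[2]])
    assert np.isclose(ct @ (E3 @ alpha), c @ alpha)
```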
The following theorem states that the minimum of f over H_3 is the minimum of f̃ over C_3.

Theorem 8. Let f(x∗) be the minimum of the problem (P_3), and f̃(w∗) the minimum of the problem (P′_3). Then f(x∗) = f̃(w∗).
Proof. By virtue of Lemma 1, φ(H_3) ⊆ C_3. This means that there exists a point w ∈ C_3 such that φ(x∗) = w. The minimum of f̃ over C_3 is attained at w∗ ∈ C_3, so that

f̃(w∗) ≤ f̃(w). (16)

The connection between f and f̃ yields:
f̃(φ(x∗)) = c̃^T φ(x∗) = c^T T_3 (E_3 α(x∗)) = c^T α(x∗) = f(x∗). (17)
According to (16) and (17) it follows that
f˜(w∗) ≤ f˜(w)= f (x∗).
On the other hand, we know that there exists a vertex y in H_3 such that φ(y) = w∗, so
f(y) = c^T α(y) = c^T T_3 (E_3 α(y)) = c̃^T φ(y) = f̃(φ(y)). (18)

Since the minimum of f over H_3 is attained at x∗ ∈ H_3, and accounting for (18), we have that
f(y) = f̃(φ(y)) = f̃(w∗) ≥ f(x∗).
Hence, f(x∗) = f̃(w∗).
In the previous theorem the condition φ(H_3) ⊆ C_3 could be eliminated, since φ(x∗) is a vertex of C_3. The problem (P′_3) can be written as a linear programming problem:
(LP_3): min c̃^T w such that Bλ − w = 0, u^T λ = 1, w ≥ 0, λ ≥ 0,
where λ = (λ_1, ..., λ_8)^T, B = (φ(p_1), ..., φ(p_8)), and u is an all-ones vector of appropriate dimension. Throughout this work the variables λ are called secondary variables.

Example 9. Let f(x) = x^T Q x + b^T x with
Q = [   0  −10  −20
      −10    0  −10
      −20  −10    0 ],   b = (−2, −2, −26)^T.

This objective function is written as f(x) = c^T α(x), where c^T = (−2, −20, −40, −2, −20, −26). The problem (LP_3) has an objective function f̃(w) = c̃^T w with c̃^T = c^T T_3 = (1, −33, −23, 21, 7, −3). The matrix B in the constraints was given in (4). The minimum of (P′_3) is −110 and is attained at w∗ = (2, 2, 2, 0, 0, 0)^T.
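For Example 9, the linear program (LP_3) can be handed to any off-the-shelf LP solver; the following sketch uses SciPy's `linprog` purely for illustration (the paper's implementation is in MATLAB) and recovers the minimum −110 at w∗ = (2, 2, 2, 0, 0, 0)^T:

```python
import numpy as np
from scipy.optimize import linprog

T3 = 0.5 * np.array([
    [ 1,  1, -1,  1,  1, -1],
    [ 1,  0,  0, -1,  0,  0],
    [ 0,  1,  0,  0, -1,  0],
    [ 1, -1,  1,  1, -1,  1],
    [ 0,  0,  1,  0,  0, -1],
    [-1,  1,  1, -1,  1,  1]])
B = np.array([
    [0, 0.0, 0.5, 0.5, 0.5, 0.5, 2.0, 2.0],
    [0, 0.5, 0.0, 0.5, 0.5, 2.0, 0.5, 2.0],
    [0, 0.5, 0.5, 2.0, 0.0, 0.5, 0.5, 2.0],
    [0, 0.0, 0.5, 0.5, 0.5, 0.5, 0.0, 0.0],
    [0, 0.5, 0.0, 0.5, 0.5, 0.0, 0.5, 0.0],
    [0, 0.5, 0.5, 0.0, 0.0, 0.5, 0.5, 0.0]])

c = np.array([-2, -20, -40, -2, -20, -26], dtype=float)  # f(x) = c^T alpha(x)
ct = T3.T @ c                                            # ctilde^T = c^T T3
assert np.allclose(ct, [1, -33, -23, 21, 7, -3])

# (LP3): min ct^T w  s.t.  B lam - w = 0, sum(lam) = 1, w >= 0, lam >= 0,
# with the stacked variable z = (w, lam) in R^{6+8}.
A_eq = np.zeros((7, 14))
A_eq[:6, :6] = -np.eye(6)
A_eq[:6, 6:] = B
A_eq[6, 6:] = 1.0
b_eq = np.concatenate([np.zeros(6), [1.0]])
cost = np.concatenate([ct, np.zeros(8)])

res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 14)
assert res.status == 0
assert np.isclose(res.fun, -110.0)                       # minimum of (P'3)
assert np.allclose(res.x[:6], [2, 2, 2, 0, 0, 0], atol=1e-6)
```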
3.2. The General Case

In this subsection, we generalize the simple case to the n-dimensional hypercube H_n.
For each triplet (i, j, k) with 1 ≤ i < j < k ≤ n, we define the map

φ_{i,j,k}(x_i, x_j, x_k) = ( (x_i + 2x_i x_j + x_j)/2, (x_i + 2x_i x_k + x_k)/2, (x_j + 2x_j x_k + x_k)/2, (x_i − 2x_i x_j + x_j)/2, (x_i − 2x_i x_k + x_k)/2, (x_j − 2x_j x_k + x_k)/2 )^T.

From the set of vertices V^(i,j,k) of H_3^(i,j,k), the convex hull C_3^(i,j,k) = conv(φ_{i,j,k}(V^(i,j,k))) is defined. Recall that although the set φ_{i,j,k}(H_3^(i,j,k)) is not convex, the image of H_3^(i,j,k) through φ_{i,j,k} is contained in C_3^(i,j,k).

Similarly as done in (10) and (11), we define primary variables u_ij and v_ij for 1 ≤ i < j ≤ n. For the sake of clarity we adopt the notation w_{i,j,k} to refer to the vector (u_ij, u_ik, u_jk, v_ij, v_ik, v_jk) and x_{i,j,k} for the vector (x_i, x_j, x_k). With this notation the elements in C_3^(i,j,k) are w_{i,j,k} ∈ [0, 2]^3 × [0, 1/2]^3. Also, we will make a distinction between primary variables u_{i,j,k} ∈ [0, 2]^3 and primary variables v_{i,j,k} ∈ [0, 1/2]^3, which make up the vector w_{i,j,k} as w_{i,j,k} = (u_{i,j,k}^T, v_{i,j,k}^T)^T. We have seen that given an x_{i,j,k} ∈ H_3^(i,j,k) there always exists a w_{i,j,k} ∈ C_3^(i,j,k) such that φ_{i,j,k}(x_{i,j,k}) = w_{i,j,k}. This result should be borne in mind because it is key in the main theorem of this subsection.

From the convex hulls C_3^(i,j,k) we create the set C_n ⊆ [0, 2]^{n(n−1)/2} × [0, 1/2]^{n(n−1)/2} by introducing consistency constraints. For it, we introduce the functions

g_{i,j,k}(w) = (u_{i,j} + v_{i,j} + u_{i,k} + v_{i,k} − u_{j,k} − v_{j,k}) / 2,

where i, j, k are distinct and 1 ≤ j < k ≤ n. Given a point w ∈ [0, 2]^{n(n−1)/2} × [0, 1/2]^{n(n−1)/2}, we define the following consistency constraints for 1 ≤ j < k ≤ n:

(C1): g_{1,2,3}(w) = g_{1,j,k}(w) with (j, k) ≠ (2, 3).
(C2): g_{2,1,3}(w) = g_{2,j,k}(w) with (j, k) ≠ (1, 3).
(C3): g_{i,1,2}(w) = g_{i,j,k}(w) with (j, k) ≠ (1, 2) and i ≥ 3.

The set C_n is then defined as

C_n = { w ∈ [0, 2]^{n(n−1)/2} × [0, 1/2]^{n(n−1)/2} : w_{i,j,k} ∈ C_3^(i,j,k) and w satisfies (C1), (C2), and (C3) }.

In this case u and v are n(n−1)/2-dimensional vectors, and then w = (u^T, v^T)^T ∈ R^{n(n−1)}.
Let w ∈ C_n, and consider two triplets (i, j, k) and (i, j, l) with l ≠ k. There exist one-to-one maps φ_{i,j,k} and φ_{i,j,l} (as shown in the simple case) such that

w_{i,j,k} = φ_{i,j,k}(x_{i,j,k}), x_{i,j,k} ∈ H_3^(i,j,k), and
w_{i,j,l} = φ_{i,j,l}(y_{i,j,l}), y_{i,j,l} ∈ H_3^(i,j,l).

Consistency means that we expect that

x_i = g_{i,j,k}(w) = g_{i,j,l}(w) = y_i,
x_j = g_{j,i,k}(w) = g_{j,i,l}(w) = y_j.

Analogously to the simple case we have that φ(H_n) ⊆ C_n. This result is argued in the following lemma:

Lemma 10. Given an arbitrary point x ∈ H_n, there exists a w ∈ C_n such that φ(x) = w.

Proof. For a triplet (i_1, j_1, k_1) there exists a point w_{i_1,j_1,k_1} ∈ C_3^(i_1,j_1,k_1) and a map φ_{i_1,j_1,k_1} : H_3 → C_3 such that φ_{i_1,j_1,k_1}(x_{i_1}, x_{j_1}, x_{k_1}) = w_{i_1,j_1,k_1} (this was shown above in Lemma 1 for the simple case). Let (i_2, j_2, k_2) be another triplet with φ_{i_2,j_2,k_2}(x_{i_2}, x_{j_2}, x_{k_2}) = w_{i_2,j_2,k_2}, such that {i_1, j_1, k_1} ∩ {i_2, j_2, k_2} ≠ ∅. Without loss of generality let us assume that i_1 = i_2 (otherwise we can make a permutation of the indices to get that configuration); then from the consistency constraints we have x_{i_1} = y_{i_1}:

x_{i_1} = g_{i_1,j_1,k_1}(φ_{i_1,j_1,k_1}(x_{i_1}, x_{j_1}, x_{k_1})) = g_{i_1,j_2,k_2}(φ_{i_1,j_2,k_2}(y_{i_1}, y_{j_2}, y_{k_2})) = y_{i_1}.

Extending this idea to all pairs of triplets, we conclude that for every x ∈ H_n there exists a vector w ∈ C_n such that φ(x) = w.

Example 11.
For n = 4, the consistency constraints are:

Consistency for x_1:
(u_12 + v_12 + u_13 + v_13 − u_23 − v_23) − (u_12 + v_12 + u_14 + v_14 − u_24 − v_24) = 0,
(u_12 + v_12 + u_13 + v_13 − u_23 − v_23) − (u_13 + v_13 + u_14 + v_14 − u_34 − v_34) = 0.

Consistency for x_2:
(u_12 + v_12 + u_23 + v_23 − u_13 − v_13) − (u_12 + v_12 + u_24 + v_24 − u_14 − v_14) = 0,
(u_12 + v_12 + u_23 + v_23 − u_13 − v_13) − (u_23 + v_23 + u_24 + v_24 − u_34 − v_34) = 0.

Consistency for x_3:
(u_13 + v_13 + u_23 + v_23 − u_12 − v_12) − (u_13 + v_13 + u_34 + v_34 − u_14 − v_14) = 0,
(u_13 + v_13 + u_23 + v_23 − u_12 − v_12) − (u_23 + v_23 + u_34 + v_34 − u_24 − v_24) = 0.

Consistency for x_4:
(u_14 + v_14 + u_24 + v_24 − u_12 − v_12) − (u_14 + v_14 + u_34 + v_34 − u_13 − v_13) = 0,
(u_14 + v_14 + u_24 + v_24 − u_12 − v_12) − (u_24 + v_24 + u_34 + v_34 − u_23 − v_23) = 0.

Figure 3: Consistency constraints for n = 4.

In Figure 3, we graphically show the consistency constraints for the variables x_1, x_2, x_3 and x_4, generated from the convex hulls C_3^(1,2,3), C_3^(1,2,4), C_3^(1,3,4) and C_3^(2,3,4) via the linear transformation C = (M, M) with

M = (1/2) [ 1  1 −1
            1 −1  1
           −1  1  1 ].

Note that C φ_{i,j,k}(x_{i,j,k}) = (x_i, x_j, x_k)^T.

We will see that the minimization problem of f(x) in H_n is equivalent to the minimization problem of a linear objective function f̃(w) in C_n, where C_n can be expressed through equality constraints given by the convexity constraints and the consistency constraints, in addition to the natural constraints for the vectors λ (secondary variables) in the definition of the convex hull C_3^(i,j,k):

C_3^(i,j,k) = { Σ_{l=1}^{8} λ_l^(i,j,k) φ(p_l) : Σ_{l=1}^{8} λ_l^(i,j,k) = 1, p_l ∈ V^(i,j,k), λ_l^(i,j,k) ≥ 0 for l = 1, ..., 8 }.
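The consistency constraints can be checked numerically for n = 4: for a w built from an actual x ∈ H_4, every g_{i,j,k}(w) recovers x_i, so all pairs of constraints agree. The sketch below is illustrative (NumPy assumed; the 0-based pair ordering is an assumption of this sketch):

```python
import itertools
import numpy as np

n = 4
pairs = list(itertools.combinations(range(n), 2))   # (i, j) with i < j, 0-based
idx = {p: k for k, p in enumerate(pairs)}

def uv(x):
    u = np.array([(x[i] + 2*x[i]*x[j] + x[j]) / 2 for i, j in pairs])
    v = np.array([(x[i] - 2*x[i]*x[j] + x[j]) / 2 for i, j in pairs])
    return u, v

def g(u, v, i, j, k):
    # g_{i,j,k}(w) = (u_ij + v_ij + u_ik + v_ik - u_jk - v_jk)/2
    def s(a, b):
        q = idx[(min(a, b), max(a, b))]
        return u[q] + v[q]
    return (s(i, j) + s(i, k) - s(j, k)) / 2

rng = np.random.default_rng(4)
for _ in range(50):
    x = rng.random(n)
    u, v = uv(x)
    for i in range(n):
        others = [t for t in range(n) if t != i]
        for j, k in itertools.combinations(others, 2):
            assert np.isclose(g(u, v, i, j, k), x[i])   # all constraints agree
```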
In particular, to define f̃(w) it is necessary to introduce a transformation T_n such that

f̃(w) = c̃^T w = c^T T_n w.

This transformation is obtained according to the following relationships:

x_i = g_{i,j,k}(w),
x_i x_j = (u_{i,j} − v_{i,j}) / 2.

More specifically, if α : H_n → α(H_n) is the map

α(x) = (x_1, x_1x_2, ..., x_1x_n, x_2, x_2x_3, ..., x_2x_n, ..., x_{n−1}, x_{n−1}x_n, x_n),

and β : α(H_n) → C_n is a linear map β(α(x)) = E_n α(x) = w such that for a triplet (i, j, k) we verify that

u_ij = (x_i + 2x_i x_j + x_j)/2,  u_ik = (x_i + 2x_i x_k + x_k)/2,  u_jk = (x_j + 2x_j x_k + x_k)/2,
v_ij = (x_i − 2x_i x_j + x_j)/2,  v_ik = (x_i − 2x_i x_k + x_k)/2,  v_jk = (x_j − 2x_j x_k + x_k)/2.

Here E_n is not a square matrix but a rectangular n(n−1) × n(n+1)/2 matrix. The matrices E_n and T_n are connected by the relation

T_n E_n = I,

where I is the identity matrix of dimension n(n+1)/2 × n(n+1)/2. Obviously the transformation T_n satisfies the relation T_n β(α(x)) = α(x).

Figure 4 illustrates the maps φ, α, and β, in a similar way as in the simple case (n = 3).

Figure 4: Diagram for the maps φ, α, and β in the general case.

Example 12. For n = 4:

T_4 = (1/2) [  1  1  0 −1  0  0  1  1  0 −1  0  0
               1  0  0  0  0  0 −1  0  0  0  0  0
               0  1  0  0  0  0  0 −1  0  0  0  0
               0  0  1  0  0  0  0  0 −1  0  0  0
               1 −1  0  1  0  0  1 −1  0  1  0  0
               0  0  0  1  0  0  0  0  0 −1  0  0
               0  0  0  0  1  0  0  0  0  0 −1  0
              −1  1  0  1  0  0 −1  1  0  1  0  0
               0  0  0  0  0  1  0  0  0  0  0 −1
              −1  0  1  0  1  0 −1  0  1  0  1  0 ].

The matrix E_4 is

E_4 = (1/2) [ 1  2  0  0  1  0  0  0  0  0
              1  0  2  0  0  0  0  1  0  0
              1  0  0  2  0  0  0  0  0  1
              0  0  0  0  1  2  0  1  0  0
              0  0  0  0  1  0  2  0  0  1
              0  0  0  0  0  0  0  1  2  1
              1 −2  0  0  1  0  0  0  0  0
              1  0 −2  0  0  0  0  1  0  0
              1  0  0 −2  0  0  0  0  0  1
              0  0  0  0  1 −2  0  1  0  0
              0  0  0  0  1  0 −2  0  0  1
              0  0  0  0  0  0  0  1 −2  1 ].

It can be checked that T_4 E_4 = I_10.

Now we formulate the minimization problem in C_n:

(P′_n): min_{w∈C_n} f̃(w),

where the objective function f̃ : C_n → R is defined as f̃(w) = c̃^T w with c̃^T = c^T T_n.
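The construction of E_n and T_n, and the identity T_n E_n = I, can be sketched programmatically for n = 4 (illustrative, NumPy assumed; the index bookkeeping follows the ordering of α(x) stated above, with 0-based indices):

```python
import itertools
import numpy as np

n = 4
pairs = list(itertools.combinations(range(n), 2))
npairs = len(pairs)                        # n(n-1)/2 = 6
pidx = {p: k for k, p in enumerate(pairs)}
m = n * (n + 1) // 2                       # dimension of alpha(x): 10

# positions of x_i and x_i*x_j inside alpha(x) = (x1, x1x2, ..., x_{n-1}x_n, xn)
apos, c = {}, 0
for i in range(n):
    apos[(i,)] = c; c += 1
    for j in range(i + 1, n):
        apos[(i, j)] = c; c += 1

# E_n maps alpha(x) to w = (u, v): u_ij = (x_i + 2 x_i x_j + x_j)/2, etc.
E = np.zeros((2 * npairs, m))
for k, (i, j) in enumerate(pairs):
    for row, sign in ((k, 1.0), (k + npairs, -1.0)):   # u row, then v row
        E[row, apos[(i,)]] = 0.5
        E[row, apos[(j,)]] = 0.5
        E[row, apos[(i, j)]] = sign

# T_n maps w back to alpha(x): x_i via g_{i,j,k}, x_i*x_j via (u_ij - v_ij)/2.
T = np.zeros((m, 2 * npairs))
def g_row(i, j, k):
    r = np.zeros(2 * npairs)
    for a, b, s in ((i, j, 0.5), (i, k, 0.5), (j, k, -0.5)):
        q = pidx[(min(a, b), max(a, b))]
        r[q] += s
        r[q + npairs] += s
    return r
for i in range(n):
    j, k = [t for t in range(n) if t != i][:2]   # e.g. x_1 via g_{1,2,3}
    T[apos[(i,)]] = g_row(i, j, k)
    for j2 in range(i + 1, n):
        T[apos[(i, j2)], pidx[(i, j2)]] = 0.5
        T[apos[(i, j2)], pidx[(i, j2)] + npairs] = -0.5

assert np.allclose(T @ E, np.eye(m))       # T_4 E_4 = I_10
```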
The following theorem is an extension of Theorem 8 and proves that the minimum of the problem (P′_n) is the same as that of (P_n).

Theorem 13. Let w∗ ∈ C_n be the point that minimizes the function f̃, and x∗ ∈ H_n the point where f reaches the minimum (which we know to be a vertex of H_n); then f̃(w∗) = f(x∗).

Proof. From Lemma 10, we know that φ(H_n) ⊆ C_n, so there is a point w ∈ C_n such that φ(x∗) = w. The minimum of f̃ over C_n is attained at w∗ ∈ C_n, so that

f̃(w∗) ≤ f̃(w). (19)

The connection between f and f̃ yields:

f̃(φ(x∗)) = c̃^T φ(x∗) = c^T T_n (E_n α(x∗)) = c^T α(x∗) = f(x∗). (20)

According to (19) and (20) it follows that f̃(w∗) ≤ f̃(w) = f(x∗). On the other hand, we know that there exists a vertex y in H_n such that φ(y) = w∗, so

f(y) = c^T α(y) = c^T T_n (E_n α(y)) = c̃^T φ(y) = f̃(φ(y)). (21)

Since the minimum of f over H_n is attained at x∗ ∈ H_n, and accounting for (21), we have that f(y) = f̃(φ(y)) = f̃(w∗) ≥ f(x∗). Hence, f(x∗) = f̃(w∗).

As for the simple case n = 3, in the general case the condition φ(H_n) ⊆ C_n can be relaxed. In the appendix, another proof is given in which this relaxation is performed and only the correspondence between the vertices of H_n and the vertices of C_n is considered. This requires expressing the problem in secondary variables along with box constraints.

With the ideas presented above, the optimization problem (P′_n) is transformed into the following linear optimization problem:

min c̃^T w such that for 1 ≤ i