arXiv:2005.07030v6 [cs.DS] 30 Jan 2021 piiain opeiymaue,adclasses and measures, complexity optimization, works method the that Keywords: revealed have out carried experiments the rirr iesosu o3 ihrno nre nterne[ range the in o entries matrices random million with 30 five to generating algorith The up by dimensions time. checked polynomial arbitrary and in MATLAB solved be in quadr can the implemented that of transformation program the linear on a emphasis into special a with minimiza detail class this in complexity that the prove in to is sufficient lem is it conservative, very although .Introduction 1. solv complexit to computational developed The is time programs. polynomial quadratic in binary algorithm stricted exact an paper, this In Abstract xml,teojciefntosi h BPpolmaeacaso e of survey). class succes recent a striking a very are for had [5] problem have [3], (see and UBQP [2], useful vision the widely [1], community, are in that research functions functions vision objective computer progres the the algorithmic example, to the due of ima Much been has labeling, classification. segmentation/pixel data and image clustering, recognition to partitioning pattern limited denoising/restoration, image and not registration/matching, processing, but image including vision, tions, computer many in n opeiyTer sresi 7 n n[] ieago con of account good a give Mat [8], Discrete in in and area [7] research in active topic). (surveys an Ru Theory become and Complexity has Hammer and it by then, introduced Since were [6]. optimization quadratic binary and rpitsbitdt ArXiv to submitted Preprint ve a for framework unifying a represents UBQP that discovery the eateto uoai oto,Eetia niern a Engineering Electrical Control, Automatic of Department h nosrie iayqartcpormig(BP rbe o problem (UBQP) programming quadratic binary unconstrained The h BPpolmdtsbc ote16sweeped-ola f pseudo-boolean where 1960s the to back dates problem UBQP The urnl,ti rbe a eoeamjrpolmi eetyears recent in problem major a become has problem this Currently, oyoilTm loih o Unconstrained for Algorithm Polynomial-Time A ehia nvriyo atgn,Cmu ual e a 3 Mar del Muralla Campus Cartagena, of University Technical nosrie iayqartcpormig global programming, quadratic binary Unconstrained iayQartcOptimization Quadratic Binary unIncoMulero-Mart´ınez Ignacio Juan -al [email protected] E-mail: h mlmnainapcsaeas described also are aspects implementation The . dEetoi Technology, Electronic nd 23 Spain. 0203, fgah,data graphs, of eray2 2021 2, February ywd variety wide ry ncomputer in s is y − tcprogram atic correctly. tUBQP at s 50 inprob- tion O , hematics unctions applica- 0.All 50]. 4.For [4].  unre- e deanu, u to due was m n nergy ccurs 15 this 2 ge  f , of combinatorial optimization problems. In particular, as pointed out in [9] the UBQP model includes the following important combinatorial optimization prob- lems: problems, maximum, click problems, maximum indepen- dent set problems, graph coloring problems, satisfiability problems, quadratic knapsack problems, etc. The UBQP problem is generally NP-Hard, [10] (you can use the UBQP problem to optimize the number of constraints satisfied on a 0/1 integer Pro- gramming instance, one of the Karp’s 21 NP- problems). Only a few special cases are solvable in polynomial time. 
In fact, the problem of deter- mining local minima of pseudo-boolean functions is found in the PLS-complete class (the class of hardest polynomial local search problems), [10], [11], and in general, local search problems are found in the EXP class, [12], [13], [14], [15], [16], [17]. Global optimization methods are NP-complete. To obtain a global optimal solution by exact methods (generally based on branch and bound strategies), the following techniques should be highlighted: the combinatorial variable elimination algorithm1, [6], [18], [19]; the continuous relaxation with linearization (where the requirement of binary variables is replaced by a weaker restriction of membership to the closed interval [0, 1]), [20], [21]; the posiform transformations, [22], [23]; the conflict graphs (the connection between posi- form minimization problem and the maximum weighted stability), [24], [25], [26], [27], [28], [29]; the linearization strategies such as standard linearization (consisting in transforming the minimization problem into an equivalent linear 0–1 programming problem), [30], [31], [32], [33], Glover method, [34], improved linearization strategy, [35], (the reader is referred to [36] for a recent comparison of these methods); semidefinite-based solvers, [37] (and the references therein), et cetera. The reader is referred to the survey [38] for a detailed description of these techniques until 2014. Many researchers have extensively studied the UBQP problem, however, up to date nobody has succeeded in developing an algorithm running in polynomial time. We claim, and this is the main contribution of this work, that UBQP is in the P . The main idea is to transform the UBQP problem into a (LP) problem, that is solved in polynomial time. We guar- antee that the minimum of the LP problem is also the minimum of the UBQP problem. We also provide the implementation details of the algorithm, moti- vated by the following aspects that any work on discrete optimization should present: (i) Describe in detail the algorithms to be able to reproduce the experiments and even improve them in the future. (ii) Providing the source code so that it is openly available to the scientific community: Interestingly, a recent study by Dunning has revealed that only 4% of papers on heuristic methods provide the source code, [39].

1This algorithm is in the class EXP and only runs in polynomial time for pseudo-Boolean functions associated with graphs of bounded tree-width.

2 (iii) Establish random test problems with an arbitrary input size. Here it is important to indicate the ranges of the parameters in the UBQP problem. This procedure has been implemented in MATLAB (source code is provided as supplementary material) and checked with five million random matrices up to dimension 30, with entries in the range [−50, 50]. An advantage of this algorithm is its modularity concerning the dimension of the problem: the set of linear constraints of the equivalent linear programming problem is fixed for a constant arbitrary dimension regardless of the objective function of the quadratic problem. Finally, we highlight that the objective of the work is not the speed of resolution of the problem but simply to show that the UBQP problem can be solved in polynomial time. Future works will analyze large-scale UBQP problems as well as the design of more efficient polynomial- time algorithms. The paper is organized as follows: Section 2 describes the relaxation process for the UBQP problem. Next in section 3, the main result about the equiva- lence of the UBQP problem with a linear programming problem is addressed. For simplicity in the exposition, the case n = 3 is presented first and then it is generalized for n > 3. The computational complexity in both time and space is analyzed in section 4. The implementation features about primary variables, transformation of the objective function, and convexity and consistency con- straints are treated in section 5. The design of the experiment for testing the solution is presented in section 6. Finally, section 7 is dedicated to discussing the main aspects presented in this work as well as possible future works.

2. Background

Let B = {0, 1} and f : Bn → R be a quadratic objective function defined as f (x) = xT Qx + bT x with Q = QT ∈ Rn×n, diag (Q) = (0,..., 0) and b ∈ Rn. The UBQP problem is defined as follows:

UBQP: minx∈B f (x).

The objective function f is usually called a quadratic pseudo-boolean func- tion, i.e. multilinear polynomials in binary unknowns. These functions repre- sent a class of energy functions that are widely useful and have had very striking success in computer vision (see [5] for a recent survey). n This problem can naturally be extended to the solid hypercube Hn = [0, 1] spanned by Bn. The extension of the pseudo-Boolean function f : Bn → R is a pol function f : Hn → R that coincides with f at the vertices of Hn. Rosenberg discovered an attractive feature regarding the multilinear polynomial extension pol pol f , [40]: the minimum of f is always attained at a vertex of Hn, and hence, that this minimum coincides with the minimum of f. From this, our optimization problem is reduced to the following relaxed quadratic problem:

(Pn): minx∈Hn f (x).

3 3. Main Result

In this section, we prove that Problem (P) can be reduced to a Linear Pro- gramming Problem.

3.1. A Simple Case We begin with the simple case of minimization of a quadratic form f (x) in the cube H3. Here the minimization problem is stated as follows:

(P3): minx∈H3 f (x).

3 1 3 Associated with the cube H3 we have a map φ : H3 → [0, 2] × 0, 2 defined as x1+2x1x2+x2   2 x1+2x1x3+x3 2  x2+2x2x3+x3  φ (x , x , x )= − 2 . 1 2 3  x1 2x1x2+x2   − 2   x1 2x1x3+x3   − 2   x2 2x2x3+x3   2    An important fact is that the cube H3 can be expressed as a convex hull of a finite set of vertices V = {0, 1}3. For simplicity, we enumerate the vertices in V as p1,p2,...,p8 so that H3 can be written as convex combinations of those vertices, i.e. H3 = conv (V ), where

8 8 conv (V )= αipi : ai ≥ 0, αi =1 . (i=1 i=1 ) X X 6 6 The map φ is a composition of the maps α : H3 → [0, 1] and β : [0, 1] → 3 1 3 [0, 2] × 0, 2 defined as   α (x) = (x1, x1x2, x1x3, x2, x2x3, x3) , (1)

6 β (y)= E3y for every y ∈ [0, 1] , (2) where E3 is 12 0100 10 2001 1  00 0121  E = . (3) 3 2  1 −20100     1 0 −20 0 1     00 01 −2 1    More specifically φ = β ◦ α.   As a summary, the maps φ, α, and β are represented in the diagram of Figure 1. The map φ is composition of α with β, i.e. φ = β ◦ α, where α can be T T built from H3 as a selection of the Kronecker productx ˜ ⊗ x˜ withx ˜ = 1, x  4 x1+2x1x2+x2 x1+2x1x3+x3 x2+2x2x3+x3 x1−2x1x2+x2 x1−2x1x3+x3 x2−2x2x3+x3 w = φ(x)= 2 , 2 , 2 , 2 , 2 , 2 w x  u φ 12 u13 β(y)= E3y C H n 3 u  β(α(x)) = φ(x)= 23 v12    y +2y + y v13  1 2 4 α   y +2y + y β v23   1 3 6   1 y4 +2y5 + y6   2 = β(y) y1 − 2y2 + y4   y1 − 2y3 + y6 α(H )   y 3 y4 − 2y5 + y6   α(x) = (x1, x1x2, x1x3, x2, x2x3 , x3)  

Figure 1: Diagram for the maps φ, α, and β.

1 and x ∈ H . The set H˜ = x˜ = : x ∈ H is convex: this is trivial 3 3 x 3     simply by building a convex combination of two pointsx ˜ andy ˜ in H˜3,

1 λx˜ + (1 − λ)˜y = ∈ H˜ with λ ∈ [0, 1] . λx + (1 − λ) y 3   8 8 Since H3 = conv (V ), it follows that x = i=1 λipi where i=1 λi = 1, λi ≥ 0, and p ∈ V . The set of vertices of H˜ is i 3 P P 1 V˜ = p˜i = : i =1,..., 8 . pi    

So H˜3 = conv V˜ andx ˜ ⊗ x˜ is written as a convex combination:   8 x˜ ⊗ x˜ = λiλj (˜pi ⊗ p˜j). i,j=1 X

There exists a matrix S3 given by

0100100000000000 0000001001000000 1  0000000100000100  S = , 3 2  0010000010000000     0000000000010010     0001000000001000      such that α (x)= S3 (˜x ⊗ x˜).

5 3 − dimensional Space H∋ 6 − dimensional Space C3 1 1 1 1 φ(p )= , , 2, , , 0 (0, 0, 1) (0, 1, 1) = p 4 2 2 2 2 4   (1, 0, 1) (1, 1, 1) φ(x)

(0, 0, 0) (0, 1, 0) = p3

1 1 1 1 φ(p )= , 0, , , 0, 3 2 2 2 2 (1, 0, 0) (1, 1, 0)  

Figure 2: Map φ between the cube H3 and the 6-dimensional convex-hull C3.

From the map φ, another convex hull is built C3 = conv (φ (V )). In Figure 2, the transformation between H3 and C3 through the map φ is illustrated. From φ (x) we can recover x through the linear transformation x = Lφ (x), where 1 1 −1 1 1 −1 1 L = 1 −1 1 1 −1 1 . 2  −1 1 1 −1 1 1    We have seen that the points of C3 are convex combinations of the vertices φ (pi), i =1,..., 8. For simplicity, we write such points as w = Bλ where matrix B is given by

1 1 1 1 0 0 2 2 2 2 2 2 1 1 1 1 0 2 0 2 2 2 2 2  0 1 1 2 0 1 1 2  B = 2 2 2 2 , (4)  0 0 1 1 1 1 0 0   2 2 2 2   0 1 0 1 1 0 1 0   2 2 2 2   0 1 1 0 0 1 1 0   2 2 2 2    8 8 and the vector λ ∈ [0, 1] is such that i=1 λi = 1. Next we show with a counterexample that C3 * φ (H3). For this, let w eP2+e3 be the point of C3 given by w = B 2 where ek represents the standard base vector in R8 with a ’1’ at the k-th component. Suppose that there was an  x ∈ H3 such that φ (x)= w, then x = Lφ (x)= Lw, and therefore it should be

6 verified that φ (Lw)= w. However, a contradiction would be reached since

1 1 8 4 1 1 8 4  5   1  φ (Lw)= 16 6= 2 = w,  1   1   8   4   1   1   8   4   3   1   16   2      1 1 T where Lw = 0, 4 , 4 . This result reveals that φ (H3) is not convex, because otherwise, when C has its vertices in φ (H ) it would be that C ⊆ φ (H ). 3  3 3 3 The image of H3 through φ is contained in C3.

Lemma 1. φ (H3) ⊆C3. Proof. Matrix B has rank 6. In fact submatrix B˜ formed by columns 2,..., 7 from B has full rank. Let x be an arbitrary point in H3, we will prove that 8 8 φ (x) ∈C3. For this, we must find a vector λ ∈ [0, 1] such that Σi=1λi = 1 and Bλ = φ (x) . (5)

T For simplicity we will use the notation λ˜ = λ2 λ3 λ4 λ5 λ6 λ7 . Then the vector λ˜ will depend on the point x in H and on the free parameter 3  λ8:

λ8 + x3 − x1x3 − x2x3 λ8 + x2 − x1x2 − x2x3 −1   T T x2x3 − λ8 λ˜ = B˜ B˜ B˜ (φ (x) − B8λ8)= .  λ8 + x1 − x1x2 − x1x3       x x − λ   1 3 8   x1x2 − λ8      Furthermore λ1, λ˜, and λ8 should satisfy the convexity constraints:

0 ≤ xixj − λ8 ≤ 1, i, j ∈{1, 2, 3} and i 6= j, (6)

0 ≤ λ8 + xi (xj + xk − 1) ≤ 1, i, j, k ∈{1, 2, 3} , i 6= j, i 6= k, j 6= k, (7)

0 ≤ λ1, λ8 ≤ 1, (8)

λ1 + λ8 + x1 + x2 + x3 − x1x2 − x1x3 − x2x3 = 1. (9)

8 The identity (9) is precisely the convexity constraint Σi=1λi = 1. Let us define the following quantities:

M1 (x) = min {1 − x1 (1 − x2 − x3) , 1 − x2 (1 − x1 − x3) , 1 − x3 (1 − x1 − x2)} ,

M2 (x) = min {x1x2, x1x3, x2x3} ,

m1 (x) = max {−x1 (1 − x2 − x3) , −x2 (1 − x1 − x3) , −x3 (1 − x1 − x2)} ,

m2 (x) = max {x1x2 − 1, x1x3 − 1, x2x3 − 1} .

7 It can be verified that m1 (x) ≤ 1, m2 (x) ≤ 0 and that M1 (x) ,M2 (x) ≥ 0: multilinear polynomials reach their extremes at the vertices of H3, so

min 1 − xi (1 − xj − xk) = 0, x∈H3

min xixj = 0, x∈H3

max −xi (1 − xj − xk) = 1, x∈H3

max xixj − 1 = 0. x∈H3

Then, max {m1 (x) ,m2 (x)}≤ λ8 ≤ min {M1 (x) ,M2 (x)} . On the other hand, m3 (x)= − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3) ≤ λ8 ≤ 1−(x1 + x2 + x3 − x1x2 − x1x3 − x2x3)= M3 (x) .

It is immediate to verify that minx∈H3 M3 (x) = maxx∈H3 m3 (x) = 0, which implies that m3 (x) ≤ 0 and that M3 (x) ≥ 0 in the hypercube H3. Now it is enough to prove that: (i) −xi (1 − xj − xk) ≤ 1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3): This boils down to simply analyzing the case (i, j, k)=(1, 2, 3) since the polynomial that appears on the right side is symmetric (for the cases (i, j, k) = (2, 1, 3) and (i, j, k)=(3, 1, 2) the proof would be identical). We will analyze the sign of the multilinear polynomial, p (x1, x2, x3)= −x1 (1 − x2 − x3)−(1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3)) .

The maximum of p in the hypercube H3 is reached at one of its vertices with value 0. With this, we have shown that

m1 (x) ≤ 1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3).

(ii) xixj − 1 ≤ 1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3). By the same argument as in (i) we reduce the problem to the pair (i, j)=(1, 2) and simply analyze the multilinear polynomial,

p (x1, x2, x3)= x1x2 − 1 − (1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3)) , whose maximum in H3 is 0. In this way,

m2 (x) ≤ 1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3).

(iii) − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3) ≤ 1 − xi (1 − xj − xk). Again we reduce it to (i, j, k)=(1, 2, 3), and we construct the multilinear polynomial: p (x1, x2, x3)= − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3)−(1 − x1 (1 − x2 − x3)) .

8 It can be verified that minx∈H3 p (x1, x2, x3)= −2, so

− (x1 + x2 + x3 − x1x2 − x1x3 − x2x3) ≤ M1.

(iv) − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3) ≤ xixj . We reduce it to (i, j) = (1, 2) and to the polynomial

p (x1, x2, x3)= − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3) − x1x2,

such that again minx∈H3 p (x1, x2, x3)= −2. Thus,

− (x1 + x2 + x3 − x1x2 − x1x3 − x2x3) ≤ M2.

With this, we have proven that

max {m1 (x) ,m2 (x)}≤ λ8 ≤ M3 (x),

m3 (x) ≤ λ8 ≤ min {M1 (x) ,M2 (x)} , and that

max {m1 (x) ,m2 (x) ,m3 (x)}≤ λ8 ≤ min {M1 (x) ,M2 (x) ,M3 (x)} = M (x).

Since m2 (x) ,m3 (x) ≤ 0, we should simply take λ8 ∈ [max {0,m1 (x)} , min {M (x) , 1}] and λ1 =1 − (λ8 + x1 + x2 + x3 − x1x2 − x1x3 − x2x3).

Since the previous proof is constructive, in the following examples we will 8 show how λ ∈ [0, 1] can be constructed from x ∈ H3. We must emphasize that since λ8 moves in a permitted interval [max {0,m1 (x)} , min {M (x) , 1}], in general, there can be many solutions.

1 1 T Example 2. Let x = 1 2 2 . According to the notation in Lemma 1:

M1 (x) = min {1 − x1 (1 − x2 − x3), 1 − x2 (1 − x1 − x3) , 1 − x3 (1 − x1 − x2)} = 5 5 = min 1, , =1, 4 4   1 1 1 1 M (x) = min {x x , x x , x x } = min , , = , 2 1 2 1 3 2 3 2 2 4 4   m1 (x) = max {−x1 (1 − x2 − x3) , −x2 (1 − x1 − x3) , −x3 (1 − x1 − x2)} = 1 1 1 = max 0, , = , 4 4 4   1 M (x)=1 − (x + x + x − x x − x x − x x )= , 3 1 2 3 1 2 1 3 2 3 4

9 1 1 and as a result M (x) = min {M1 (x) ,M2 (x) ,M3 (x)} = 4 , and λ8 = 4 , λ1 =0. With this selection, we will have

λ1 0 λ + x − x x − x x 0  8 3 1 3 2 3    λ8 + x2 − x1x2 − x2x3 0  x2x3 − λ8   0  λ =   =  1  .  λ8 + x1 − x1x2 − x1x3   4     1   x1x3 − λ8   4     1   x1x2 − λ8       4   λ   1   8   4      For this λ we verify that Bλ = φ (x): 0 1 1 1 1 5 0 0 2 2 2 2 2 2 0 4 0 1 0 1 1 2 1 2  0  5  2 2 2 2   4  0 1 1 2 0 1 1 2  0  3 Bλ = 2 2 2 2   = 4 = φ (x) .  0 0 1 1 1 1 0 0   1   1   2 2 2 2   4   4   0 1 0 1 1 0 1 0   1   1   2 2 2 2   4   4   0 1 1 0 0 1 1 0   1   1   2 2 2 2   4   4     1     4  1 1   Example 3. Let (x1, x2, x3)= 0, 4 , 4 :

M1 (x) = min {1 − x1 (1 − x2 − x3) , 1 −x2 (1 − x1 − x3) , 1 − x3 (1 − x1 − x2)} = 13 13 13 = min 1, , = , 16 16 16   1 M (x) = min {x x , x x , x x } = min 0, 0, =0, 2 1 2 1 3 2 3 16   m1 (x) = max {−x1 (1 − x2 − x3) , −x2 (1 − x1 − x3) , −x3 (1 − x1 − x2)} = 3 3 = max 0, − , − =0, 16 16   9 M (x)=1 − (x + x + x − x x − x x − x x )= , 3 1 2 3 1 2 1 3 2 3 16 and as a result M (x) = min {M1 (x) ,M2 (x) ,M3 (x)} = 0, and λ8 = 0, λ1 = 9 16 . We this selection we obtain the vector λ: 9 λ1 16 λ + x − x x − x x 0  8 3 1 3 2 3    λ8 + x2 − x1x2 − x2x3 0  x2x3 − λ8   0  λ =   =  1  .  λ8 + x1 − x1x2 − x1x3   4     1   x1x3 − λ8   4     1   x1x2 − λ8   4     1   λ8       4      10 Now for this λ, we check that Bλ = φ (x):

9 16 1 1 1 1 3 1 0 0 2 2 2 2 2 2 16 8 0 1 0 1 1 2 1 2  3  1  2 2 2 2  16  8  0 1 1 2 0 1 1 2  1  5 Bλ = 2 2 2 2  16  = 16 = φ (x) .  0 0 1 1 1 1 0 0   0   1   2 2 2 2     8   0 1 0 1 1 0 1 0   0   1   2 2 2 2     8   0 1 1 0 0 1 1 0   0   3   2 2 2 2     16     0        1 T Example 4. For the point x = 1 2 0 we compute the quantities Mi (x) and mi (x): 

M1 (x) = min {1 − x1 (1 − x2 − x3) , 1 − x2 (1 − x1 − x3) , 1 − x3 (1 − x1 − x2)} = 1 1 = min , 1, 1 = , 2 2   1 M (x) = min {x x , x x , x x } = min , 0, 0 =0, 2 1 2 1 3 2 3 2   m1 (x) = max {−x1 (1 − x2 − x3) , −x2 (1 − x1 − x3) , −x3 (1 − x1 − x2)} = 1 = max − , 0, 0 =0, 2   M3 (x)=1 − (x1 + x2 + x3 − x1x2 − x1x3 − x2x3)=0, which implies that M (x) = min {M1 (x) ,M2 (x) ,M3 (x)} = 0, and λ8 = 0, λ1 =0. Using this selection we have that

λ1 0 λ + x − x x − x x 0  8 3 1 3 2 3    λ8 + x2 − x1x2 − x2x3 0  x2x3 − λ8   0  λ =   =  1  .  λ8 + x1 − x1x2 − x1x3       2   x1x3 − λ8   0     1   x1x2 − λ8       2   λ   0   8        For this λ we verify that Bλ = φ (x): 0 1 1 1 1 5 0 0 2 2 2 2 2 2 0 4 0 1 0 1 1 2 1 2  0  1  2 2 2 2   2  0 1 1 2 0 1 1 2  0  1 Bλ = 2 2 2 2   = 4 = φ (x) .  0 0 1 1 1 1 0 0   1   1   2 2 2 2   2   4   0 1 0 1 1 0 1 0   0   1   2 2 2 2     2   0 1 1 0 0 1 1 0   1   1   2 2 2 2   2   4     0        11 The following example shows that the associated vector λ for a φ (x) is not unique.

1 1 1 T Example 5. Let x = 4 4 4 :  M1 (x) = min {1 − x1 (1 − x2 − x3) , 1 − x2 (1 − x1 − x3) , 1 − x3 (1 − x1 − x2)} = 7 7 7 7 = min , , = , 8 8 8 8   1 1 1 1 M (x) = min {x x , x x , x x } = min , , = , 2 1 2 1 3 2 3 16 16 16 16   m1 (x) = max {−x1 (1 − x2 − x3) , −x2 (1 − x1 − x3) , −x3 (1 − x1 − x2)} = 1 1 1 1 = max − , − , − = − , 8 8 8 8   7 M (x)=1 − (x + x + x − x x − x x − x x )= , 3 1 2 3 1 2 1 3 2 3 16

1 and as a consequence, M (x) = min {M1 (x) ,M2 (x) ,M3 (x)} = 16 , and λ8 ∈ 1 0, 16 . If we select λ = 1 then λ = 13 and we will have the following vector λ:   8 32 1 32 13 λ1 32 5 λ8 + x3 − x1x3 − x2x3 32    5  λ8 + x2 − x1x2 − x2x3 32 1  x2x3 − λ8   32  λ =   =  5  .  λ8 + x1 − x1x2 − x1x3   32     1   x1x3 − λ8   32     1   x1x2 − λ8   32     1   λ8       32      For this λ we check that Bλ = φ (x):

13 32 1 1 1 1 5 5 0 0 2 2 2 2 2 2 32 16 0 1 0 1 1 2 1 2  5  5  2 2 2 2  32  16  0 1 1 2 0 1 1 2  1  5 Bλ = 2 2 2 2  32  = 16 = φ (x) .  0 0 1 1 1 1 0 0   5   3   2 2 2 2   32   16   0 1 0 1 1 0 1 0   1   3   2 2 2 2   32   16   0 1 1 0 0 1 1 0   1   3   2 2 2 2   32   16     1     32    1 3 On the other hand, with the selection λ8 = 16 , we would have λ1 = 8 , and

12 therefore 3 8 3 16  3  16  0  λ =   .  3   16   0     0     1   16  For this λ we check again that Bλ =φ (x): 

3 8 1 1 1 1 3 5 0 0 2 2 2 2 2 2 16 16 0 1 0 1 1 2 1 2  3  5  2 2 2 2  16  16  0 1 1 2 0 1 1 2  0  5 Bλ = 2 2 2 2   = 16 = φ (x) .  0 0 1 1 1 1 0 0   3   3   2 2 2 2   16   16   0 1 0 1 1 0 1 0   0   3   2 2 2 2     16   0 1 1 0 0 1 1 0   0   3   2 2 2 2     16     1     16    The elements in C3 are called primary variables and are denoted as w. We will make a distinction between primary variables u ∈ [0, 2]3 and primary variables 1 3 T T T v ∈ 0, 2 , which make up the vector w as w = u , v .

Definition  6 (Primary variables). For the triplet of variables (x1, x2, x3) ∈ H3 we define the primary variables for 1 ≤ i < j ≤ 3: x +2x x + x u = i i j j , (10) ij 2 x − 2x x + x v = i i j j , (11) ij 2 1 where uij ∈ [0, 2] and vij ∈ 0, 2 . T T T T The vector of primary variables  w = u , v is such that u = (u12,u13,u23) is given by (10) and vT = (v , v , v ) by (11). The advantage of defining these 12 13 23  variables is that they satisfy the following relationships: (i) Cross-products in the objective function (corresponding to the off-diagonal entries in Q): u − v x x = ij ij for 1 ≤ i < j ≤ 3. (12) i j 2 (ii) Single variables (corresponding to the vector b): u + v + u + v − u − v x = ij ij ik ik jk jk for 1 ≤ i < j ≤ 3. (13) i 2

We have seen in Lemma 1 that given an x ∈ H3 there always exists a w ∈C3 such that φ (x)= w.

13 6 We define the linear transformation ϕ : C3 → R given by ϕ (w)= T3w, with

1 1 −1 1 1 −1 1 0 0 −1 0 0 1  0 1 0 0 −1 0  T = . 3 2  1 −1 1 1 −1 1     0 0 1 0 0 −1     −1 1 1 −1 1 1      We claim that ϕ ◦ β = id; this can be checked by inspection:

T3E3 = I6, (14) where E3 was given in (3). Remark 7. According to the definition of α there exists a vector c ∈ R6 such that f (x)= cT α (x). Additionally, always there exists a vector c˜ ∈ R6 such that f (x)=˜cT φ (x), where T T c˜ = c T3. (15) This is due to T T c˜ E3α (x)= c T3E3α (x) , and we know from (14) that T3E3 = I. Let f˜(w)=˜cT w, we define the optimization problem: ′ ˜ (P3): minw∈C3 f (w).

The following theorem states that the minimum of f over H3 is the minimum of f˜ over C3. ∗ ∗ Theorem 8. Let f (x ) be the minimum of the problem (P3), and f˜(w ) the ′ ∗ ˜ ∗ minimum of the problem (P3’). Then f (x )= f (w ).

Proof. In virtue of Lemma 1, φ (H3) ⊆C3. This means that there exists a point ∗ ∗ w ∈C3 such that φ (x )= w. The minimum of f˜ over C3 is attained at w ∈C3, so that f˜(w∗) ≤ f˜(w) . (16) The connection between f and f˜ yields:

∗ T ∗ T ∗ f˜(φ (x ))=˜c φ (x )= c T3 (E3α (x )) (17) T ∗ ∗ = c α (x )=f (x ).

According to (16) and (17) it follows that

f˜(w∗) ≤ f˜(w)= f (x∗).

14 On the other hand, we know that there exists a vertex y in H3 such that φ (y)= w∗, so

T T T f (y)= c α (y)= c T3 (E3α (y))=˜c φ (y)= f˜(φ (y)) . (18)  ∗ Since the minimum of f over H3 is attained at x ∈ H3, and accounting for (18), we have that

f (y)= f˜(φ (y)) = f (w∗) ≥ f (x∗).

Henceforth, f (x∗)= f˜(w∗).

In the previous theorem the condition φ (H3) ⊆C3 could be eliminated since ∗ φ (x ) is a vertex of C3. ′ The problem (P3): can be written as a linear programming problem:

minc ˜T w Bλ − w =0 (LP3):  T λ  such that u =1   λ  w ≥ 0, ≥ 0

 T whereλ = (λ1,...,λ 8), B = (φ (p1) ,...,φ (p8)), and u is an all-ones vector of appropriate dimension. Throughout this work the variables λ are called secondary variables. Example 9. Let f (x)= xT Qx + bT x with

0 −10 −20 Q = −10 0 −10 ,  −20 −10 0   −2  b = −2 .  −26    This objective function is written as f (x)= cT α (x), where cT = (−2, −20, −40, −2, −20, −26). T T T The problem (LP3) has an objective function f˜(w)=˜c w with c˜ = c T3 = (1, −33, −23, 21, 7, −3). The matrix B in the constraints was given in 4. The minimum of (P’) is −110 and is attained at w∗ = (2, 2, 2, 0, 0, 0)T .

3.2. The General Case In this subsection, we generalize the simple case to the n-dimensional hyper- cube Hn.

15 For each triplet (i, j, k) with 1 ≤ i

 xi+2xixj +xj 2 xi+2xixk+xk  2  xj +2xj xk+xk φ (x , x , x )= 2 . i,j,k i j k  xi−2xixj +xj   2   xi−2xixk+xk   2   xj −2xj xk+xk     2    (i,j,k) (i,j,k) (i,j,k) (i,j,k) From the set of vertices V of H3 the convex-Hull C3 = conv φi,j,k V ) (i,j,k) is defined. Recall that although the set φi,j,k H3 is not convex, the image  (i,j,k) (i,j,k)  of H3 through φi,j,k is contained in C3 . Similarly as done in (10) and (11) we define primary variables uij and vij for 1 ≤ i < j ≤ n. For the sake of clarity we adopt the notation wi,j,k to refer to the vector (uij ,uik,ujk vij , vik, vjk) and xi,j,k for the vector (xi, xj , xk). With this (i,j,k) 3 1 3 notation the elements in C3 are wi,j,k ∈ [0, 2] × 0, 2 . Also, we will make 3 a distinction between primary variables ui,j,k ∈ [0, 2] and primary variables   T 1 3 T T vi,j,k ∈ 0, 2 , which make up the vector wi,j,k as wi,j,k = ui,j,k, vi,j,k . We (i,j,k)  (i,j,k) have seen that given an xi,j,k ∈ H3 there always exists a w ∈ C3 such that φi,j,k (xi,j,k) = wi,j,k. This result should be borne in mind because it is key in the main theorem of this subsection. (i,j,k) n 1 n From the convex-hulls C3 we create the set Cn ⊆ [0, 2] × 0, 2 by introducing consistency constraints: For it, we introduce the functions   u + v + u + v − u − v g (w)= i,j i,j i,k i,k j,k j,k , i,j,k 2 where 1 ≤ i

n 1 n Given a point w ∈ [0, 2] × 0, 2 we define the following consistency constraints for 1 ≤ j < k ≤ n:   (C1): g1,2,3 (w)= g1,j,k (w) with (j, k) 6= (2, 3). (C2): g2,1,3 (w)= g2,j,k (w) with (j, k) 6= (1, 3). (C3): gi,1,2 (w)= gi,j,k (w) with (j, k) 6= (1, 2) and i ≥ 3. The set Cn is then defined as 1 n C = w ∈ [0, 2]n × 0, : w ∈C(i,j,k) n 2 i,j,k 3    and w satisfies (C1), (C2), and (C3) . 

16 n T T T In this case u and v are 2 -dimensional vectors, and then w = u , v ∈ n(n−1) R . Let w ∈ Cn, and triplets (i, j, k), (i,j,l), with l 6= k, there exist   one-to-one maps φi,j,k and φi,j,l (as shown in the simple case) such that

(i,j,k) wi,j,k = φi,j,k (xi,j,k), xi,j,k ∈ H3 , and (i,j,l) wi,j,l = φi,j,l (yi,j,k), xi,j,l ∈ H3 . Consistency means that we expect that

xi = gi,j,k (w)= gi,j,l (w)= yi,

xj = gj,i,k (w)= gj,i.l (w)= yj.

Analogously to the simple case we have that φ (Hn) ⊂ Cn. This result is argued in the following lemma:

Lemma 10. Given an arbitrary point x ∈ Hn, there exists a w ∈Cn such that φ (x)= w.

Proof. For a triplet (i1, j1, k1) there exists a point (wi1 , wj1 , wk1 ) ∈ C3 and a map φi1,j1,k1 : H3 → C3 such that φi1 ,j1,k1 (xi1 , xj1 , xk1 ) = wi1,j1,k1 (this was shown above in Lemma 1 for the simple case). Let (i2, j2, k2) be another triplet with φi2 ,j2,k2 (xi2 , xj2 , xk2 ) = wi2 ,j2,k2 , such that {i1, j1, k1}∩{i2, j2, k2}= 6 ∅. Without loss of generality let us assume that i1 = i2 (otherwise we can make a permutation of the indices to get that configuration) then from the consistency constraints we have xi1 = yi1 :

xi1 = gi1,j1,k1 (φi1,j1,k1 (xi1 , xj1 , xk1 )) = gi1,j2,k2 (φi1 ,j2,k2 (yi1 ,yj2 ,yk2 )) = yi1

Extending this idea to all pair of triplets, we conclude that for every x ∈ Hn, there exists a vector w ∈Cn such that φ (x)= w.

Example 11. For n =4, the consistency constraints are

Consistency for x1:

(u12 + v12 + u13 + v13 − u23 − v23) − (u + v + u + v − u − v )=0  12 12 14 14 24 24 (u + v + u + v − u − v )  12 12 13 13 23 23  − (u13 + v13 + u14 + v14 − u34 − v34)=0   Consistency for x2:

(u12 + v12 + u23 + v23 − u13 − v13) − (u + v + u + v − u − v )=0  12 12 24 24 14 14 (u + v + u + v − u − v )  12 12 23 23 13 13  − (u23 + v23 + u24 + v24 − u34 − v34)=0   17 Consistency Constraints Cw123 T T T (1,2,3) (x1, x2, x3) x = r w = r w = r w C3 1 123 124 134

Cw124 T T T (1,2,4) (x1, x2, x4) x = r w = r w = r w C3 2 213 214 234

Cw134 T T T (1,3,4) (x1, x3, x4) x = r w = r w = r w C3 3 312 314 324

Cw234 T T T (2,3,4) (x2, x3, x4) x = r w = r w = r w C3 4 412 413 423

Figure 3: Consistency constraints for n = 4

Consistency for x3:

(u13 + v13 + u23 + v23 − u12 − v12) − (u + v + u + v − u − v )=0  13 13 34 34 14 14 (u + v + u + v − u − v )  13 13 23 23 12 12  − (u23 + v23 + u34 + v34 − u24 − v24)=0   Consistency for x4:

(u14 + v14 + u24 + v24 − u12 − v12) − (u + v + u + v − u − v )=0  14 14 34 34 13 13 (u + v + u + v − u − v )  14 14 24 24 12 12  − (u24 + v24 + u34 + v34 − u23 − v23)=0  In Figure 3, we graphically show the consistency constraints for the variables (1,2,3) (1,2,4) (1,3,4) x1, x2, x3 and x4, generated from the convex-hulls C3 , C3 , C3 and (2,3,4) C3 via the linear transformation C = (M,M) with 1 1 −1 1 M = 1 −1 1 . 4  −1 1 1    Note that xi Cφi,j,k (xi,j,k)= xj .   xk   We will see that the minimization problem of f (x) in Hn is equivalent to the minimization problem of a linear objective function f˜(w) in Cn, where Cn can

18 be expressed as constraints on equality given by the convexity constraints and consistency constraints, in addition to the natural constraints for the vectors (i,j,k) (i,j,k) λ (secondary variables) in the definition of the convex-hull C3 :

8 8 (i,j,k) (i,j,k) (i,j,k) C3 = λl φ (pl): λl =1, l l  X=1 X=1 (i,j,k) (i,j,k) pl ∈ V , λl ≥ 0 for l =1,..., 8 . 

In particular, to define f˜(w) it is necessary to introduce a transformation Tn such that T T f˜(w)=˜c w = c Tnw. This transformation is obtained according to the following relationships:

xi = gi,j,k (w) , u − v x x = i,j i,j . i j 2

More specifically, if α : Hn → α (Hn) is the map

α (x) = (x1, x1x2, ··· , x1xn, x2, x2x3, ··· ,

x2xn, ··· , xn−1, xn−1xn, xn). and β : α (Hn) → Cn is a linear map β (α (x)) = Enα (x) = w such that for a triplet (i, j, k) we verify that x +2x x + x x +2x x + x u = i i j j , u = i i k k , ij 2 ik 2 x +2x x + x x − 2x x + x u = j j k k , v = i i j j , jk 2 ij 2 x − 2x x + x x − 2x x + x v = i i k k , v = j j k k . ik 2 jk 2

n(n+1) Here En is not a square matrix but a rectangular n (n − 1) × 2 . The matrices En and Tn are connected by the relation:

TnEn = I,

n(n+1) n(n+1) where I is the identity matrix of dimension 2 × 2 . Obviolusly the transformation Tn satisfies the relation

Tnβ (α (x)) = α (x) .

Figure 4 illustrates the maps φ, α, and β, in a similar way as in the simple case (n = 3).

19 wijk = φijk (x) xi+2xixj +xj ) xi+2xixk+xk) xj +2xj xk+xk) xi−2xixj +xj ) xi−2xixk+xk) xj −2xj xk+xk) = 2 , 2 , 2 , 2 , 2 , 2   w x φ β(y)= Eny Cn Hn β(α(x)) = φ(x)

β α

α(Hn) y α(x) = (x1, x1x2,...,xn−1xn, xn)

Figure 4: Diagram for the maps φ, α, and β in the general case.

Example 12. For n =4: 1 1 0 −1001 1 0 −1 0 0 1 00000 −10 0 0 0 0  0 100000 −10 0 0 0   0 010000 0 −10 0 0    1  1 −10 1 00 1 −10 1 0 0  T =   . 4 2  0 001000 0 0 −1 0 0     0 000100 0 0 0 −1 0     −11 0 1 00 −11 0 1 0 0     0 000010 0 0 0 0 −1     −10 1 0 10 −10 1 0 1 0      The matrix E4 is 12 0 010 0000 10 2 000 0100  10 0 200 0001   00 0 012 0100     00 0 010 2001    1  00 0 000 0121    . 2  1 −20 010 0000     1 0 −2000 0100     1 0 0 −200 0001     00 0 01 −20100     00 0 010 −20 0 1     00 0 000 01 −2 1      It can be checked that T4E4 = I10.

20 Now we formulate the minimization problem in Cn: ′ ˜ (Pn): minw∈Cn f (w),

T where the objective function f˜ : Cn → R is defined as f˜(x)=˜c w with T T c˜ = c Tn. The following Theorem is an extension of Theorem 8 and proves ′ that the minimum of the problem (Pn) is the same as that of (Pn). ∗ Theorem 13. Let w ∈ Cn be the point that minimizes the function f˜, and ∗ x ∈ Hn the point where f reaches the minimum (which we know to be a vertex ∗ ∗ of Hn, then f˜(w )= f (x )).

Proof. From Lemma 10, we know that φ (Hn) ⊆Cn, so there is a point w ∈Cn ∗ ∗ such that φ (x ) = w. The minimum of f˜ over Cn is attained at w ∈ Cn, so that f˜(w∗) ≤ f˜(w) . (19) The connection between f and f˜ yields:

∗ T ∗ T ∗ f˜(φ (x ))=˜c φ (x )= c Tn (Enα (x )) (20) T ∗ ∗ = c α (x )=f (x ).

According to (19) and (20) it follows that

f˜(w∗) ≤ f˜(w)= f (x∗).

On the other hand, we know that there exists a vertex y in Hn such that φ (y)= w∗, so

T T T f (y)= c α (y)= c Tn (Enα (y))=˜c φ (y)= f˜(φ (y)) . (21)  ∗ Since the minimum of f over Hn is attained at x ∈ Hn, and accounting for (21), we have that

f (y)= f˜(φ (y)) = f (w∗) ≥ f (x∗).

Henceforth, f (x∗)= f˜(w∗).

As for the simple case n = 3, in the general case, the condition φ (Hn) ⊆Cn can be relaxed. In the appendix, another proof is made where this relaxation is performed and the correspondence between the vertices of Hn and the vertices of Cn is simply considered. This requires expressing the problem in secondary variables along with box constraints. ′ With the ideas presented above, the optimization problem (Pn) is trans- formed into the following linear optimization problem:

21 minc ˜T w such that for 1 ≤ i

A11 A12 0 A˜ = A , A˜ = A , ˜b = 0 . 11  21  22  22    A31 A32 u       Example 14. Let Q ∈ Z4×4 be the symmetric matrix:

0 −30 6 −22 −30 0 15 −2 Q = ,  6 15 0 −5   −22 −2 −5 0      and b ∈ Z4 the vector −8 −22 b = .  0   −32     

22 Problem (P ′ ): Problem (P ): n n minimize f˜(w)=˜cT w minimize f (x)= xT Qx + bT x = cT α (x) subject to w ∈Cn subject to x ∈ Hn

Problem (LPn): minimize f˜(w)=˜cT w λ subject to A˜ A˜ = ˜b 11 12 w   and w ≥ 0,λ ≥ 0 

Figure 5: Equivalence of problems (Pn), (P’n) and (LPn)

The consistency constraints were given in Example 11. The optimal value of f is −170 and

∗ λ = e7 + e16 + e22 + e30, 1 1 1 1 1 1 T w∗ = 2, , 2, , 2, , 0, , 0, , 0, , 2 2 2 2 2 2   32 where ek ∈ R is the k-th vector of the standard basis (with an entry ’1’ at the position k and ’0’ for the rest of positions). ∗ The optimal point x can be recovered from w by applying E3 to wijk . To illustrate this point we compute x∗ for the previous example. Example 15. From example 14:

1 1 1 1 T w∗ = 2, , , 0, , , 123 2 2 2 2   1 1 1 1 T w∗ = , 2, , , 0, . 234 2 2 2 2   We can recover the point x∗ as follows:

−1 ∗ T T E3 w123 = (1, 1, 0, 1, 0, 0) = (x1, x1x2, x1x3, x2, x2x3, x3) ,

23 ∗ ∗ ∗ ∗ and this means x1 = x2 =1, x3 =0. We proceed in a similar way for w234:

−1 ∗ T T E3 w234 = (1, 0, 1, 0, 0, 1) = (x2, x2x3, x2x4, x3, x3x4, x4) , ∗ ∗ ∗ ∗ which implies that x2 =1, x3 =0, x4 =1. The optimal point is x = (1, 1, 0, 1).

4. Computational Complexity

The problem (LPn) requires an amount of memory given by the dimension of A in (22), that is (7N + N2) (8N +2N1). Henceforth, the space complexity is of order O n6 . Since both B (involved in the definition of the convex-hull (i,j,k) 8 λ(i,j,k) C3 ) and E3 (corresponding to the constraint l=1 l = 1 in the convex- hull C(i,j,k)) are constant, the is also O n6 (assuming that the 3 P storing of an entry in a matrix is O (1)). Note that the objective function T  f˜(w)=˜c w requires 2N1 multiplications and 2N1 − 1 sums, resulting in a time complexity of order O n2 . Once generated the matrix A and the vectorc ˜, the problem (LPn) can be solved in polynomial-time  via interior-point methods (the reader is referred to [41] for more details): (i) ellipsoid method due to Khachiyan (O m4L where L denotes the number of bits in a binary representation of A and m is the space 7  dimension, (ii) projective algorithm of Karmarkar (O m 2 L ), (iii) Gonzaga al- gorithm and Vaidya’s 87 algorithm, with the complexity of O m3L operations (these two algorithms were simultaneously developed in 1987), or (iv) Vaidya’s 5  89 algorithm (O m 2 ), among others. Henceforth, the problem (Pn) is solved in polynomial time. The number of variables in (PLn) is 8N +2N1 so that the 15 problem can be solved with Vaidya’s 89 algorithm in O n 2 .   5. Implementation Aspects

This section addresses the implementation aspects of the equivalent linear program (LPn). We will develop the different algorithms both for the trans- formation of the objective function in the form f˜(w)=˜cT w through the linear transformation Tn and for the equality constraints (convexity constraints and consistency constraints). The convexity and consistency constraints allow us to define the feasible set Cn as follows: n 1 λ C = w ∈ [0, 2]n × 0, : A˜ A˜ = ˜b, n 2 11 12 w       λ ≥ 0 ,  where A11 A12 0 A˜ = A , A˜ = A , ˜b = 0 . 11  21  12  22    A31 A32 u       24 We will start with the indexing of the primary variables since this point is key for a correct implementation of the method.

5.1. Definition of Primary Variables As indicated above, the primary variables are stored in a vector w ∈ Rn(n−1) n as a stacking of the 2 -dimensional vectors u, and v:  u w = . v   In turn, each element in u and v is defined by a pair of indices (i, j) such that 1 ≤ i < j ≤ n. The element uij (idem for vij ), is stored in u (in v) in the position 2n−i 2n−i (i − 1) 2 + (j − i) when i < j (in the position N1 + (i − 1) 2 + (j − i) for vij ). This can be analyzed simply according to the scheme below where different subvectors are represented according to the first index:

u1,2 u2,3 u1,3 u2,4 [h] .  Length = n − 1, .  Length = n − 2, ···, .  .    u1,n u2,n

 −   Length= i 1 (n−k)  Pk=1  u|i,i+1 {z ui+1,i+2 } ui,i+2 ui+2,i+3 Length = .  Length = n − i, ··· , .  .  .  n − (i + 1)   ui,n ui+1,n   un−2,n−1  ··· ,  Length = 2, un−1,n Length = 1 un− ,n 2 

The i-th subvector, that is (ui,i+1,ui,i+2, ··· ,ui,n) has above i−1 subvectors with lengths from n − 1 to n − (i − 1). Therefore, the number of entries above the i-th subvector is i−1 i (i − 1) 2n − i (n − k)= n (i − 1) − = (i − 1) . 2 2 k X=1 Finally, within the subvector i, the first element is ui,i+1, so the input ui,j is in position (j − i) within this subvector:

ui,i+1 ui,i+2  .  (j − i) positions .     u   i,j   u   i,j+1  .   .   .       ui,n 

25 Similarly we can position ourselves in vector v: when i ≥ j, the position of the element ui,j can be calculated immediately by swapping the roles of i and j. Based on the previous information, we define the positioning index function:

2n−i (i − 1) 2 + (j − i) , when i < j ι (i, j)= − (j − 1) (2n j) + (i − j) , when i ≥ j  2 Throughout this presentation, we will exemplify the main ideas and algorithms for a problem of dimension n = 4. Example 16. For n =4, in Figures 6a and 6b the correspondence between pri- mary variables and the indices generated by the ι function has been represented.

Primary variables u ι (1, 2) ι (1, 3) ι (1, 4) ι (2, 3) ι (2, 4) ι (3, 4) z 1 2 3}| 4 5 6 { ↓ ↓ ↓ ↓ ↓ ↓ u12 u13 u14 u23 u24 u34

(a) Primary variables u in the vector w Primary variables v

N1 + ι (1, 2) N1 + ι (1, 3) N1 + ι (1, 4) N1 + ι (2, 3) N1 + ι (2, 4) N1 + ι (3, 4) z 6+1 6+2 6+3}| 6+4 6+5 6+6 { ↓ ↓ ↓ ↓ ↓ ↓ v12 v13 v14 v23 v24 v34

(b) Primary variables v in the vector w

Figure 6: Positions of the primary variables in the vector w according to the index ι.

5.2. Transformation of Objective Function

In this subsection we will build the linear transformation Tn that allows us to transform the objective function f : Hn → R into f˜ : Cn → R. Specifically, T T T f˜(w)=˜c w wherec ˜ = c Tn. The matrix Tn is obtained by exploiting the following relationships: u + v + u + v − u − v x = ij ij ik ik jk jk , i 2 u − v x x = ij ij , i j 2 such that TnEnα (x) = α (x), remember that w = φ (x) = Enα (x). We will illustrate the construction of Tn with an example: 4 Example 17. For n = 4 we have that N1 = 2 = 6 (= number of primary variables u = number of primary variables v). In Tables 1 and 2, the T matrix  4 is represented indicating the correspondence of the rows with the single variables x1, x2, x3, and x4, as well as the cross products x1x2, x1x3, x1x4, x2x3, x2x4

26 and x3x4. The vector T4w is a linear combination of the column vectors in T4 through the primary variables u and v (this has been highlighted with each column indicating which primary variable it is associated with). Furthermore, the primary variables that correspond to each other are found within the vector w according to the index ι (i, j). In reality, the matrix T4 is used to express the variables xi as well as the cross products xixj with j >i as a function of the primary variables. For clarity we will divide the matrix T4 into a part u v T4 corresponding to the primary variables u and another T4 to the primary u v u v variables v so that T4 = T4 T4 . Tables 1 and 2 represent T4 and T4 respectively.  Table 1: Part of the matrix T4 corresponding to the primary variables u.

ι (1, 2) ι (1, 3) ι (1, 4) ι (2, 3) ι (2, 4) ι (3, 4) 1 2 3 4 5 6 ↓ ↓ ↓ ↓ ↓ ↓ x1 → 1 1 0 −1 0 0 x1x2 → 1 0 0 0 0 0 x1x3 → 0 1 0 0 0 0 x1x4 → 0 0 1 0 0 0 x2 → 1 −1 0 1 0 0 x2x3 → 0 0 0 1 0 0 x2x4 → 0 0 0 0 1 0 x3 → −1 1 0 1 0 0 x3x4 → 0 0 0 0 0 1 x4 → −1 0 1 0 1 0 ↑ ↑ ↑ ↑ ↑ ↑ u12 u13 u14 u23 u24 u34

When i =1, the variable x1 is expressed as a function of u and v as u + u − u + v + v − v x = 12 13 23 12 13 23 , 1 2 and cross products as u − v x x = 12 12 . (23) 1 2 2

The variable x1 corresponds to row 1 in T4, the primary variables u12, u13 and u23 are associated with columns ι (1, 2)=1, ι (1, 3)=2 and ι (2, 3)=4, while the primary variables v12, v13 and v23 are placed in columns N1 + ι (1, 2)=7, N1 + ι (1, 3)= 8, and N1 + ι (2, 3) = 10. The cross product x1x2 corresponds to row 2 in T4. According to (23) we must access columns ι (1, 2) = 1 (corresponding to u12) and N1 + ι (1, 2)= 7 (corresponding to v12). Following this process we would finish building the matrix T4 completely.

Algorithm 1 calculates Tn for an arbitrary n according to these ideas.

27 Table 2: Part of the matrix T4 corresponding to the primary variables v.

N1 + ι (1, 2) N1 + ι (1, 3) N1 + ι (1, 4) 6+1 6+2 6+3 ↓ ↓ ↓ 1 1 x1 → 2 2 0 1 x1x2 → − 2 0 0 1 x1x3 → 0 − 2 0 1 x1x4 → 0 0 − 2 1 1 x2 → 2 − 2 0 x2x3 → 0 0 0 x2x4 → 0 0 0 1 1 x3 → − 2 2 0 x3x4 → 0 0 0 1 1 x4 → − 2 0 2 ↑ ↑ ↑ v23 v24 v34

Table 2: Part of the matrix T4 corresponding to the primary variables v (Continuation).

N1 + ι (2, 3) N1 + ι (2, 4) N1 + ι (3, 4) 6+4 6+5 6+6 ↓ ↓ ↓ 1 x1 → − 2 0 0 x1x2 → 0 0 0 x1x3 → 0 0 0 x1x4 → 0 0 0 1 x2 → 2 0 0 1 x2x3 → − 2 0 0 1 x2x4 → 0 − 2 0 1 x3 → 2 0 0 1 x3x4 → 0 0 − 2 1 x4 → 0 2 0 ↑ ↑ ↑ v23 v24 v34

28 Algorithm 1 Transformation of objective function procedure Transformed Objective Function(c) ⊲ Computation ofc ˜ in f˜(w)=˜cT w T ← 0 n(n+1) × 2 2N1 3: r ← 0 for i ← 1,n do S ←{1, 2 ...,n} 6: r ← r +1 if i=1 then 1 Tr,8N+ι(1,2) ← 2 1 9: Tr,8N+ι(1,3) ← 2 1 Tr,8N+ι(2,3) ←− 2 1 Tr,8N+N1+ι(1,2) ← 2 12: 1 Tr,8N+N1+ι(1,3) ← 2 1 Tr,8N+N1+ι(2,3) ←− 2 else if i=2 then 1 15: Tr,8N+ι(1,2) ← 2 1 Tr,8N+ι(2,3) ← 2 1 Tr,8N+ι(1,3) ←− 2 18: 1 Tr,8N+N1+ι(1,2) ← 2 1 Tr,8N+N1+ι(2,3) ← 2 1 Tr,8N+N1+ι(1,3) ←− 2 21: else 1 Tr,8N+ι(1,2) ←− 2 1 Tr,8N+ι(1,i) ← 2 1 24: Tr,8N+ι(2,i) ← 2 1 Tr,8N+N1+ι(1,2) ←− 2 1 Tr,8N+N1+ι(1,i) ← 2 27: 1 Tr,8N+N1+ι(2,i) ← 2 end if Generate an ordered array C of the elements of S¯ taken two by two. 30: for j ← i +1,n do r ← r +1 T ← 1 r,8N+ι(i,j) 2 33: 1 Tr,8N+N1+ι(i,j) ←− 2 end for end for 36: c˜T ← cT T return c˜ end procedure

29 5.3. Equality Constraints At the beginning of the procedure, we will assume that A˜ is an empty matrix that we will fill in. This matrix is of dimension 7N + N2 × 8N +2N1.

5.3.1. Convexity Constraints For the secondary variables λ we have adopted the following notation λ(1,2,3) λ(1,2,4) λ =  .  , .    λ(n−2,n−1,n)      (i,j,k) 8 where λ ∈ [0, 1] (this notation does not follow the definition of wi,j,k, that is, for each (i, j, k) we have a vector λ(i,j,k) which is exclusive to the convex hull (i,j,k) (i,j,k) C3 ). For each triplet (i, j, k) we generate the convex hull C3 :

(i,j,k) λ(i,j,k) T λ(i,j,k) λ(i,j,k) C3 = wi,j,k = B : u =1, ≥ 0 , (24) n o where u = (1, 1, 1, 1, 1, 1, 1, 1)T and

ui,j,k wi,j,k = , vi,j,k   (i,j,k) (i,j,k) 3 (i,j,k) with ui,j,k , v ∈ R . According to (24) the convexity constraints for C3 are equal constraints written as

(i,j,k) Bλ − wi,j,k =0, (25)

uT λ(i,j,k) =1, (26) along with the natural constraint λ(i,j,k) ≥ 0. Let us start by looking at the implementation of (25). As we see in (25) we need the basic block B = (φ (p1) ,...,φ (p8)) that is implemented in Algorithm 2. In the previous algorithm, squares appear for simplicity in writing since we are evaluating binary variables. For simplicity of implementation, each triplet (i, j, k) is associated with a single index r ∈ {1,...,N}, where we remember n ˜ that N = 3 ; This index r will allow us to traverse rows inside the matrix A. 6N×8N (r) We write the  matrix A11 ∈ R as a stack of submatrices A11 of dimension R6×8N each one of them transforming a vector λ(i,j,k):

(1) A11 A(2) A =  11  . 11 .  .   (N)   A   11    30 Algorithm 2 Basic block B 1: procedure Basic Block ⊲ The basic block B 2: B ← 06×8 3: i ← 0 ⊲ it is used as a column index of B 4: for x1 ← 0, 1 do 5: for x2 ← 0, 1 do 6: for x3 ← 0, 1 do 7: i ← i +1 2 (x1+x2) 8: B1,i ← 2 2 (x1+x3) 9: B2,i ← 2 2 (x2+x3) 10: B3,i ← 2 2 (x1−x2) 11: B4,i ← 2 2 (x1−x3) 12: B5,i ← 2 2 (x2−x3) 13: B6,i ← 2 14: end for 15: end for 16: end for 17: end procedure

(r) These sub-matrices A11 are divided according to (i, j, k):

(1,2,3) (1,2,4) (n−2,n−1,n)  λ λ ··· λ  A(r) = ↓ ↓ ··· ↓ , 11    A(r)(1) A(r)(2) ··· A(r)(N)   11 11 11    Part corresponding to the secondary variables λ   (r)(i) 6×8  where A11 ∈ R .| A part of the constraint{z in (25) is written} simply as: (r)(r) A11 ← B. 6N×2N1 Similarly, the matrix A12 ∈ R will be written as

(1) A12 A(2) A =  12  , 12 .  .   (N)   A   12    (r) 6×8 (r) where A ∈ R transforms a vector wi,j,k. In turn, each matrix A will 12 ′ ′′ 12 (r) (r) be divided into A12 and A12 corresponding to the primary variables u and v ′ ′′ (r) (r) (r) respectively, i.e. A12 = A12 A12 . Given an index r ∈ {1,...,N} we ′ ′′ (r) (r)  can generate A12 and A12 by accessing columns ι (i, j), ι (i, k), and ι (j, k). For primary variables u we have to make the following assignments

31 ′ (r) A12 ←−1, 1,ι(i,j) ′  (r)  A12 ←−1, 2,ι(i,k) ′  (r)  A12 ←−1, 3,ι(j,k)   and for primary variables v:

′′ (r) A12 ←−1, 4,ι(i,j) ′′  (r)  A12 ←−1, 5,ι(i,k) ′′  (r)  A12 ←−1. 6,ι(j,k)   With this, we achieve that

(r) A12 w = −I6×6wi,j,k = −wi,j,k. N×8N Finally, the constraint in (26) can be entirely written through A31 ∈ R (r) following the same idea as for A11. Specifically, A31 is a stack of rows A31 that transform the vector λ(i,j,k) into a scalar: (1) A31 (2) A31 A =   . 31 .  .   (N)   A   31    (r) Again these sub-matrices row A31 are divided according to (i, j, k):

(1,2,3) (1,2,4) (n−2,n−1,n)  λ λ ··· λ  A(r) = ↓ ↓ ··· ↓ , 31    A(r)(1) A(r)(2) ··· A(r)(N)   31 31 31    Part corresponding to the secondary variables λ     (r)(i) 1×8 | {z } where A31 ∈ R . So the matrix A31 from the constraint (26) is written (r)(r) simply as A31 ← (1, 1, 1, 1, 1, 1, 1, 1) for r =1,...,N.

5.3.2. Consistency Constraints Consistency constraints express the cancellation of a linear combination of primary variables u and v. In particular the consistency constraints can be written as g1,2,3 (w) − g1,j,k (w) = 0 with (j, k) 6= (2, 3) , (27)

32 g2,1,3 (w) − g2,j,k (w) = 0 with (j, k) 6= (1, 3) , (28)

gi,1,2 (w) − gi,j,k (w) = 0 with (j, k) 6= (1, 2), i ≥ 3, (29) or in matrix form as A22w = 0. Remembering that the number of primary n variables is 2N1 = 2 2 , we have that this will be the number of columns of A22. The number of rows in A22 matches the number of consistency constraints  n−1 that we saw to be N2 = n 2 − 1 . From the definition of gi,j,k (w) we have that the combination of variables u   is the same as that of variables v; remember that u + u − u v + v − v g (w)= ij ik jk + ij ik jk . i,j,k 2 2 This idea reduces many calculations since you simply have to create a matrix N2×N1 M = (M1,M1) such that M1 ∈ R and A22 = M. The matrix M will be filled sequentially by rows. Initially, it is reset to zero. To generate the consistency constraints as defined in the equations (27), (28), and (29), we must create in first a set of indices S = {1, 2,...,n} that will be traversed sequentially. In the i-th step we remove the element i from the set S which gives us a set S¯ = S {i}, and it is from this set we generate the pairs (j, k) as they appear in the constraints of consistency. For this, we − 2 n 1 generate an ordered array C ∈ N ( 2 ) according to the lexicographic order and including all the combinations of two elements taken from S¯. Note that when i = 1, the first combination of this list is (j1, k1) = (2, 3) and this corresponds to the subscripts of g1,2,3 (w) in equation (27). Similarly, for i = 2, the first ordered pair in C is (j1, k1)=(1, 3), corresponding to g2,1,3 (w) in the equation (28), and when i ≥ 2, (j1, k1) = (1, 2) is the first ordered pair in C associated with gi,1,2 (w) in (29). Now all that is left is to generate the rest of the indexes that appear in the consistency constraints. For it, we go through C from the second ordered pair onwards. Let (j, k) be this ordered pair, and suppose we are at step r. We introduce the part of gi,j1,k1 (w) corresponding to the variables u:

Mr,ι(i,j1) ← Mr,ι(i,j1) − 1,

Mr,ι(i,k1) ← Mr,ι(i,k1) − 1,

Mr,ι(j1,k1) ← Mr,ι(j1,k1) +1.

Similarly, we do with the part of gi,j,k (w) that corresponds to the variables u:

Mr,ι(i,j) ← Mr,ι(i,j) − 1,

Mr,ι(i,k) ← Mr,ι(i,k) − 1,

Mr,ι(j,k) ← Mr,ι(j,k) +1.

Once we have completed this task we have filled in the part of A22 corresponding to u, that is to say, A22 = (M1, ·). To complete the part corresponding to v we

33 Algorithm 3 Create consistency constraints

M ← 0N2×2N1 2: S ←{1, 2,...,n} ⊲ Index Set r ← 0 ⊲ Row index of M 4: for i ← 1,n do S¯ ← S {i} 6: Generate an ordered array C of the elements of S¯ taken two by two. (j1, k1) ← C1 ⊲ first ordered pair in C 8: for l ← 2,n do r ← r +1 10: (j, k) ← Cl ⊲ l-th ordered pair in C

Mr,ι(i,j1) ← Mr,ι(i,j1) − 1

12: Mr,ι(i,k1) ← Mr,ι(i,k1) − 1

Mr,ι(j1,k1) ← Mr,ι(j1,k1) +1 14: Mr,ι(i,j) ← Mr,ι(i,j) − 1 Mr,ι(i,k) ← Mr,ι(i,k) − 1 16: Mr,ι(j,k) ← Mr,ι(j,k) + 1. end for 18: end for M1 ← M [1,...,N2;1,...N1] 20: M [1,...,N2; N1 +1,... 2N1] ← M1

simply replicate M1 in A22, i.e. A22 = (M1,M1). The construction process of the matrix M is shown in Algorithm 3. In Algorithm 3 the submatrix formed by the rows {1,...,N2} and by the columns {1,...,N1} is denoted by M [1,...,N2;1,...N1]. Similarly for M [1,...,N2; N1 +1,... 2N1]. When n is large, the generation of the array C of elements of S¯ taken two by two can take up a lot of memory, causing the system to blow up. To avoid this problem, the combinations must be generated one by one. Algorithm 4 is a variant of Algorithm 3 that sequentially generates the combinations using the indices l1 ∈ [1,...,n − 2] and l2 ∈ [l1 +1 ...,n − 1]. Finally, it was already commented in the previous sections that A21 is a null matrix since the secondary variables λ do not intervene in the definition of the consistency constraints.

5.4. Linear Optimizer Once the binary quadratic problem has been translated into a linear op- timization problem, it only remains to invoke a standard optimizer. Here it should be noted, although it is a well-known fact, that the linear optimizer works in polynomial time. There are many interior-point methods to solve the linear programming problem, although they are all improvements to the ellip- soid method due to Khachiyan. The objective of this technical note is not to present an efficient method but to demonstrate that the procedure of transfor- mation to a linear program is successful. For this reason, it is sufficient for the

34 Algorithm 4 Create improved consistency constraints

M ← 0N2×2N1 S ←{1, 2,...,n} ⊲ Index Set r ← 0 ⊲ Row index of M for i ← 1,n do S¯ ← S {i} j1 ← S¯ (1) k1 ← S¯ (2) for l1 ← 1,n − 2 do for l2 ← l1 +1,n − 1 do if l1 > 1 or l2 > 2 then r ← r +1 j ← S¯ (l1) k ← S¯ (l2) . . Here the same assignments as in Algorithm 3 . . end if end for end for end for M1 ← M [1,...,N2;1,...N1] M [1,...,N2; N1 +1,... 2N1] ← M1

35 implementation to simply invoke a standard resolver. The problem (LPn) will be written compactly as

min f˜(w)=˜cT w λ subject to A˜ = ˜b w   λ ≥ 0

The implementation was done in MATLAB and the linprog function was in- voked with parametersc ˜, A˜, and ˜b, and the constraint λ ≥ 0. Here we are not interested in the implementation details of linprog as they are not relevant to ensure that the proposed method works correctly, we are only interested in the result produced by the linear optimizer. For a future improvement in the implementation, we suggest the following ∗ stop condition: if all the components wi are distant from the minimum wi in less 1 than 4 the optimizer must stop. This is because the primary variables u and v n 1 n move in the domains [0, 2] and 0, 2 respectively. The presented formulation is matrix for reasons of clarity and simplicity, however this involves having a   somewhat sparsed structure with many zero entries.

6. Experiment Description

The implementation of the algorithm presented above was done in MATLAB (the code can be found in the supplementary material). Experimentation was performed for arbitrary dimensions of the problem (up to n = 30, due to memory limitations in MATLAB). Both matrix Q and vector b were chosen arbitrarily in a range of values [lv,uv] = [−50, 50] (this range can be freely modified by the experimenter). Also, the domain of values is allowed to be the reals or the integers. Regarding the success condition of the experiment, a comparison is made of the optimalc ˜T w∗ obtained from linprog with that produced by brute force, f (x∗) (exploring all the possible binary combinations for the original prob- lem). For this, we set a dimension ǫ > 0 and comparec ˜T w∗ with f (x∗): if c˜T w∗ − f (x∗) <ǫ the result is successful. The method has been tested in Matlab R2016a 64-bit under Windows 10 (64-

bit) Intel® Core ™ i5-4300U CPU@ 1.90 GHz. RAM 8.00GB. After five million experiments with variability in the dimension of the problem and randomness in Q and b, we did not record any unsatisfactory results, which suggests that this method is correct.

7. Discussion

In this paper an algorithm has been developed to find the global minimum of a quadratic binary function in polynomial time. Specifically, the computational 15 complexity of the algorithm when using the Vaidya linear optimizer is O n 2 .   36 This bound is very conservative but it is enough to prove that the problem is in class P . The of the complexity exponent, and therefore the speed of resolution of the method, strongly depends on the advances that occur in linear optimizers as well as of technological aspects such as parallel computing. A great advantage of this algorithm is its modularity according to the di- mension: the definition of the consistency and convexity constraints through A and b are fixed for all problems of a given dimension n. This means that these matrices can be precalculated for different sizes of problems and stored in a data file (physically it could be stored in a fast access ROM memory). To test the algorithm for problems of dimension n, this precalculated information is loaded into memory and the linear optimizer is directly invoked. The algorithm has been implemented in MATLAB and has been verified generating more than five million matrices of arbitrary dimension up to 30 with random entries in the range [−50, 50]. Actually, the algorithm is designed for any dimension n, however, tests beyond 30 can be done considering the storage limits of matrices in MATLAB. In particular, matrix A of the problem (LPn) 6 n n has dimension (8N +2N1) × (7N + N1) ∈ O n where N = 3 , N1 = 2 y N = n n−1 − 1 .. This matrix is memory intensive if its sparse condition is 2 2    not exploited. Therefore, the algorithm can be made to work more efficiently   by storing A as a sparse matrix (matrix A is extremely sparse so the memory demand is not as great as it might seem a priori from the dimension). If you do not exploit this property, with MATLAB’s default options, there is a memory limit on matrices of about 248 = 256·1012 entries. The size of matrix A (number of entries) according to dimension n of the problem as indicated in Figure 7 where a conservative memory limit of 1012 is indicated. Thus, for this threshold, it is theoretically possible to calculate dimension problems up to n = 93. In practice, the situation is more dramatic if sparse matrices are not handled. In particular, for dimension n = 30, MATLAB begins to give space reservation problems requiring 10.1 GB of storage. For all the reasons stated above it is recommended to exploit the sparsity of the matrices. Although the problem has polynomial computational complexity in both time and space when n is large this takes time to compute. For example, n = 100 implies a bound proportional to 1015. On a small scale, polynomial-time algorithms generally do not perform well compared to metaheuristic methods. Performance is found when handling large dimensions. For large-scale UBQP problems, it is necessary to compile the code avoiding the use of interpreted MATLAB code. On the other hand, the system should be parallelized as much as possible using multi-core processor architectures and offloading computing in graphics processing units (GPUs). Parallelization should avoid the dependency problem. This requires that the counter variable r within the for-loops in algo- rithms 1, 3, and 4 should be made explicit as a dependent function of i and j, r (i, j), thus avoiding the increment r ← r + 1. Finally, concerning the experiments, it should be noted that the method solution has been compared with the brute force solution. 
Although the problem has polynomial computational complexity in both time and space, for large $n$ the computation still takes time: for example, $n = 100$ implies a bound proportional to $10^{15}$. On a small scale, polynomial-time algorithms generally do not perform well compared to metaheuristic methods; the advantage appears when handling large dimensions. For large-scale UBQP problems it is necessary to compile the code, avoiding interpreted MATLAB code. In addition, the system should be parallelized as much as possible, using multi-core processor architectures and offloading computations to graphics processing units (GPUs). Parallelization must avoid the dependency problem: the counter variable $r$ within the for-loops of Algorithms 1, 3, and 4 should be made explicit as a function $r(i,j)$ of the indices $i$ and $j$, thus avoiding the sequential increment $r \leftarrow r + 1$.
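The sketch below shows this dependency removal on a generic pair loop; the loop body (storing the pair $(i,j)$) is a hypothetical stand-in for the actual row assignments of Algorithms 1, 3, and 4.

    % Sequential pattern (loop-carried dependency):
    %   r = 0;
    %   for i = 1:n-1, for j = i+1:n, r = r + 1; ...(r)...; end, end
    % Dependency-free pattern: r is computed in closed form from (i, j),
    % so the iterations become independent and can be distributed.
    n = 30;
    ridx = @(i, j) (i - 1)*n - i*(i - 1)/2 + (j - i);  % rank of (i,j), i < j
    out = zeros(nchoosek(n, 2), 2);
    for i = 1:n-1
        for j = i+1:n
            out(ridx(i, j), :) = [i, j];   % no r = r + 1 between iterations
        end
    end
    assert(ridx(1, 2) == 1 && ridx(n-1, n) == nchoosek(n, 2));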

Figure 7: Size of $A$ in terms of $n$. The memory threshold for non-sparse matrices is set at $10^{12}$ and is represented by a dashed red line.
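The threshold $n = 93$ can be reproduced from the stated dimensions of $A$; the MATLAB sketch below recomputes the curve of Figure 7, assuming the definitions $N = \binom{n}{3}$ and $N_1 = \binom{n}{2}$ given above.

    % Entry count of A as a function of n (the curve of Figure 7), and the
    % largest n whose dense storage stays below the 10^12 threshold.
    nmax = 120;
    entries = zeros(nmax, 1);
    for n = 3:nmax
        N  = nchoosek(n, 3);                      % number of triples (i,j,k)
        N1 = nchoosek(n, 2);                      % number of pairs (i,j)
        entries(n) = (8*N + 2*N1) * (7*N + N1);   % size of A in entries
    end
    find(entries <= 1e12, 1, 'last')              % returns 93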

Finally, concerning the experiments, it should be noted that the solutions produced by the method have been compared with brute-force solutions. For medium and large dimensions this way of proceeding does not work: for example, for $n = 100$ the brute-force method would require $2^{100}$ iterations, approximately a one followed by thirty zeros. Therefore, in the future it would be necessary to design verification experiments that compare solutions obtained by methods of a different nature.
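For small $n$, the verification scheme can be sketched as follows in MATLAB; solve_ubqp_lp is a hypothetical name standing in for the LP-based method of this paper.

    % Brute-force check for small n; solve_ubqp_lp is a hypothetical
    % placeholder for the LP-based method described in this paper.
    n = 12;
    Q = randi([-50, 50], n, n);  Q = (Q + Q')/2;   % random symmetric instance
    best = Inf;
    for k = 0:2^n - 1
        x = double(dec2bin(k, n).' == '1');        % k-th binary assignment
        best = min(best, x.'*Q*x);                 % exhaustive minimum
    end
    assert(abs(best - solve_ubqp_lp(Q)) < 1e-6);   % the two must coincide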

8. Appendix

We rewrite the problem (LPn) in secondary variables as:

\[ (LP'_n): \quad \min_{\lambda} \; d^T \lambda \quad \text{s.t.} \quad F\lambda = g, \quad \lambda \in [0,1]^{8N}, \]

where $F \in \mathbb{R}^{(N_2 + N_1) \times 8N}$. In the problem $(LP'_n)$, box restrictions of the form $0 \le \lambda_i \le 1$ have been added. The upper bound on $\lambda$ is redundant, since for each $C_3^{(i,j,k)}$ we have that

\[ \sum_{l=1}^{8} \lambda_l^{(i,j,k)} = 1, \]

and $\lambda_l^{(i,j,k)} \ge 0$ for $l = 1, \dots, 8$, which implies that $\lambda_l^{(i,j,k)} \le 1$. However, this redundancy is convenient to determine the vertices of the feasible region $\mathcal{C}_n$.
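As an implementation note, the MATLAB sketch below passes $(LP'_n)$ to linprog; build_LPn_data is a hypothetical placeholder for the assembly of $d$, $F$ and $g$, and $B$ denotes the transformation to primary variables of Section 3.1.

    % Sketch: solving (LP'_n) with linprog; build_LPn_data and B are
    % hypothetical placeholders for the constructions described in the text.
    [d, F, g] = build_LPn_data(Q);       % F stored as a sparse matrix
    nvar = size(F, 2);                   % 8N secondary variables
    lb = zeros(nvar, 1);                 % lambda >= 0
    ub = ones(nvar, 1);                  % redundant upper bound lambda <= 1
    opts = optimoptions('linprog', 'Algorithm', 'interior-point');
    lambda = linprog(d, [], [], F, g, lb, ub, opts);
    w = B*lambda;                        % recover the primary variables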

Lemma 18. Let $V$ be the set of vertices of the hypercube $H_n$. For each $p \in V$ there is a $\lambda \in \{0,1\}^{8N}$ that is in the feasible region of $(LP'_n)$. Conversely, for every $\lambda \in \{0,1\}^{8N}$ in the feasible region of $(LP'_n)$ there exists a vertex $p$ of $H_n$ such that $\varphi(p) = w \in \mathcal{C}_n$.

Proof. For each triple $(i,j,k)$ we have that $\varphi_{i,j,k}(p_{i,j,k})$ is a vertex of $C_3^{(i,j,k)}$. Consequently, there exists a $\lambda^{(i,j,k)} \in \mathbb{R}^8$ with one component equal to one and the remainder zero. These vertices have a preimage $p_{i,j,k}$ through $\varphi_{i,j,k}$ given by

\[ p_{i,j,k} = \begin{pmatrix} x_i \\ x_j \\ x_k \end{pmatrix}, \]

with

\[ x_i = \frac{u_{ij} + v_{ij} + u_{ik} + v_{ik} - u_{jk} - v_{jk}}{2} \in \{0,1\}. \]

Since $x_i$ must be the same for each convex $C_3^{(i,j,k)}$, the consistency restrictions must be satisfied at the point $\varphi(p)$. Hence $\varphi(p) \in \mathcal{C}_n$, and there is a $\lambda \in \{0,1\}^{8N}$ in the feasible region of $(LP'_n)$.

For the converse, we have that $\lambda^{(i,j,k)}$ is a vector of the standard basis of $\mathbb{R}^8$; that is, the vector of primary variables $w$ is such that $w_{i,j,k}$ is a vertex of the convex $C_3^{(i,j,k)}$. For this vertex there exists a vertex $p_{i,j,k}$ in $H_3$ such that

\[ \varphi_{i,j,k}(p_{i,j,k}) = w_{i,j,k}. \]

The consistency constraints guarantee that the coordinate $x_i$ at two points $p_{i,j,k}$ and $p_{i,j',k'}$ is the same. Hence there exists a unique preimage $p$ for $w$ through $\varphi$ such that $p \in V$. □

In the previous lemma, although for each $p \in V$ the vector $\lambda \in \{0,1\}^{8N}$ does not have to be unique, the associated vector of primary variables is unique through the transformation $w = B\lambda$ (see Example 5 of Section 3.1). This shows that there is a bijection between the vertices of $H_n$ and those of $\mathcal{C}_n$ through $\varphi$; therefore, the number of vertices of $\mathcal{C}_n$ is exactly $2^n$.

Theorem 19. Let $w^*$ be the point in $\mathcal{C}_n$ where the minimum of the problem $(P'_n)$ is reached and let $x^*$ be the point in $H_n$ where the minimum of $(P_n)$ is reached; then $\tilde{f}(w^*) = f(x^*)$.

Proof. According to Lemma 18, the set of vertices of $\mathcal{C}_n$ is $\{\varphi(p) : p \in V\}$ (where $V$ is the set of vertices of the hypercube $H_n$). We know that the minimum of the problem $(P'_n)$ is reached at a vertex of $\mathcal{C}_n$, so there will be a $p \in V$ such that $\varphi(p) = w^*$. On the other hand, the minimum of $f$ over $H_n$ is achieved at $x^* \in H_n$, so that

\[ f(x^*) \le f(p). \tag{30} \]

The connection between $f$ and $\tilde{f}$ leads to

\[ \tilde{f}(\varphi(x^*)) = \tilde{c}^T \varphi(x^*) = c^T T_n(E_n \alpha(x^*)) = c^T \alpha(x^*) = f(x^*), \tag{31} \]

where $\varphi(x^*)$ is a vertex of $\mathcal{C}_n$ because $x^* \in V$, and to

\[ f(p) = c^T \alpha(p) = c^T T_n(E_n \alpha(p)) = \tilde{c}^T \varphi(p) = \tilde{f}(\varphi(p)). \tag{32} \]

According to (30) and (32),

\[ f(x^*) \le f(p) = \tilde{f}(w^*). \]

Since the minimum of $\tilde{f}$ over $\mathcal{C}_n$ is attained at $w^* \in \mathcal{C}_n$, and accounting for (31), we have that

\[ f(x^*) = \tilde{f}(\varphi(x^*)) \ge \tilde{f}(w^*). \]

Henceforth, $f(x^*) = \tilde{f}(w^*)$. □
