European Journal of Operational Research 281 (2020) 36–49
Contents lists available at ScienceDirect
European Journal of Operational Research
journal homepage: www.elsevier.com/locate/ejor
Discrete Optimization
Parametric convex quadratic relaxation of the quadratic knapsack problem
∗ M. Fampa a, , D. Lubke a, F. Wang b, H. Wolkowicz c a PESC/COPPE, Universidade Federal do Rio de Janeiro, Caixa Postal 68511, Rio de Janeiro, RJ21941-972, Brazil b Department of Mathematics, Royal Institute of Technology, Sweden c Department of Combinatorics and Optimization, University of Waterloo, Waterloo, Canada
a r t i c l e i n f o a b s t r a c t
Article history: We consider a parametric convex quadratic programming (CQP) relaxation for the quadratic knapsack Received 20 January 2019 problem (QKP). This relaxation maintains partial quadratic information from the original QKP by per-
Accepted 16 August 2019 turbing the objective function to obtain a concave quadratic term. The nonconcave part generated by Available online 20 August 2019 the perturbation is then linearized by a standard approach that lifts the problem to matrix space. We
Keywords: present a primal-dual interior point method to optimize the perturbation of the quadratic function, in a Quadratic knapsack problem search for the tightest upper bound for the QKP. We prove that the same perturbation approach, when Quadratic binary programming applied in the context of semidefinite programming (SDP) relaxations of the QKP, cannot improve the Convex quadratic programming relaxations upper bound given by the corresponding linear SDP relaxation. The result also applies to more general Parametric optimization integer quadratic problems. Finally, we propose new valid inequalities on the lifted matrix variable, de-
Valid inequalities rived from cover and knapsack inequalities for the QKP, and present separation problems to generate cuts for the current solution of the CQP relaxation. Our best bounds are obtained alternating between opti- mizing the parametric quadratic relaxation over the perturbation and applying cutting planes generated by the valid inequalities proposed. ©2019 Elsevier B.V. All rights reserved.
1. Introduction is a generalization of the knapsack problem, which has the same feasible set of the QKP, and a linear objective function in x . The We study a convex quadratic programming (CQP) relaxation of linear knapsack problem can be solved in pseudo-polynomial time the quadratic knapsack problem (QKP), using dynamic programming approaches with complexity of O ( nc ). ∗ The QKP appears in a wide variety of fields, such as biol- p := max x T Qx QKP ogy, logistics, capital budgeting, telecommunications and graph ( QKP ) s.t. w T x ≤ c (1) theory, and has received a lot of attentio0n in the last decades. x ∈ { 0 , 1 } n , Several papers have proposed branch-and-bound algorithms for where Q ∈ S n is a symmetric n × n nonnegative integer profit ma- the QKP, and the main difference between them is the method n trix, w ∈ Z ++ is a vector of positive integer weights for the items, used to obtain upper bounds for the subproblems ( Billionnet & ∈ Z ≥ , ∈ = and c ++ is the knapsack capacity with c wi for all i N : Calmels, 1996; Billionnet, Faye, & Soutif, 1999; Caprara, Pisinger, & { 1 , . . . , n } . The binary (vector) variable x indicates which items are Toth, 1999; Chaillou, Hansen, & Mahieu, 1989; Helmberg, Rendl, & chosen for the knapsack, and the inequality in the model, known Weismantel, 1996; 20 0 0 ). The well known trade-off between the as a knapsack inequality, ensures that the selection of items does strength of the bounds and the computational effort required to not exceed the knapsack capacity. We note that any linear costs in obtain them is intensively discussed in Pisinger (2007) , where the objective can be included on the diagonal of Q by exploiting semidefinite programming (SDP) relaxations proposed in Helmberg the {0, 1} constraints and, therefore, are not considered. et al. (1996) and Helmberg, Rendl, and Weismantel (20 0 0) are The QKP was introduced in Gallo, Hammer, and Simeone presented as the strongest relaxations for the QKP. The linear (1980) and was proved to be NP-Hard in the strong sense by re- programming (LP) relaxation proposed in Billionnet and Calmels duction from the clique problem. The quadratic knapsack problem (1996) , on the other side, is presented as the most computation- ally inexpensive.
∗ Both the SDP and the LP relaxations have a common fea- Corresponding author.
E-mail addresses: [email protected] (M. Fampa), [email protected] ture, they are defined in the symmetric matrix lifted space T (D. Lubke), [email protected] (F. Wang), [email protected] (H. Wolkowicz). determined by the equation X = xx , and by the replacement of https://doi.org/10.1016/j.ejor.2019.08.027 0377-2217/© 2019 Elsevier B.V. All rights reserved. M. Fampa, D. Lubke and F. Wang et al. / European Journal of Operational Research 281 (2020) 36–49 37 the quadratic objective function in QKP with a linear function in Table 1
X , namely, trace( QX ). As the constraint X = xx T is nonconvex, it is Equations number corresponding to acronyms. relaxed by convex constraints in the relaxations. The well known QKP (1) CI (27)
McCormick inequalities ( McCormick, 1976 ), and also the semidef- QKPlifted (2) ECI (28) inite constraint, X − xx T 0 , have been extensively used to relax LPR (3) LCI (29) CQP (5) SCI (32) = T , Qp the nonconvex constraint X xx in relaxations of the QKP. LSDP (21) CILS (34) In this paper, we investigate a CQP relaxation for the QKP, QSDPQ p (15) SCILS (36) where instead of linearizing the objective function, we perturb SKILS (41) the objective function Hessian Q , and maintain the (concave) per- turbed version of the quadratic function in the objective, lineariz- ing only the remaining part derived from the perturbation. Our X − xx T 0 . An IPM could also be applied to this parametric prob- relaxation is a parametric convex quadratic problem, defined as lem in order to generate the best possible bound. Nevertheless, − a function of a matrix parameter Qp , such that Q Qp 0. This we prove an interesting result concerning the relaxations, in case matrix parameter is iteratively optimized by a primal-dual interior the constraint X − xx T 0 is imposed: the tightest bound gener- point method (IPM) to generate the best possible bound for the ated by the parametric quadratic SDP relaxation is obtained when QKP. During this iterative procedure, valid cuts are added to the the perturbation Q p is equal to Q , or equivalently, when we lin- formulation to strengthen the relaxation, and the search for the earize the entire objective function, obtaining the standard linear best perturbation is adapted accordingly. Our procedure alternates SDP relaxation. We conclude, therefore, that keeping the (concave) between optimizing the matrix parameter and applying cutting perturbed version of the quadratic function in the objective of the planes generated by valid inequalities. At each iteration of the SDP relaxation does not lead to a tighter bound. A result that could procedure, a new bound for the QKP is computed, considering be derived from our analysis is that the CQP relaxation cannot gen- the updated matrix parameter and the cuts already added to the erate a tighter bound than the standard linear SDP relaxation. This relaxation. result was already proved in Billionnet et al. (2016) , and we show
In Billionnet, Elloumi, and Lambert (2016) (see also Billionnet, that it still holds for unbounded and nonconvex feasible set.
Elloumi, & Lambert, 2012; Billionnet, Elloumi, & Plateau, 2009 for Another contribution of this work is the development of valid previous results), a similar parametric convex quadratic problem inequalities for the CQP relaxation on the lifted matrix variable. was investigated for the more general problem of minimizing a The inequalities are first derived from cover inequalities for the quadratic function of bounded integer variables subject to a set of knapsack problem. The idea is then extended to knapsack inequali- quadratic constraints. However, the authors consider a unique per- ties. Taking advantage of the lifting X := xx T , we propose new valid turbation of the Hessian of each quadratic function in the model, inequalities that can also be applied to more general relaxations to reformulate it as a mixed integer quadratic programming (MIQP) of binary quadratic programming problems that use the same lift- problem with a CQP continuous relaxation. They propose to solve ing. We discuss how cuts for the quadratic relaxation can be ob- the reformulated problem by an MIQP solver. To reformulate the tained by the solution of separation problems, and investigate pos- problem, the authors also seek the best possible perturbations of sible dominance relation between the inequalities proposed. the Hessians, which are considered as the ones, such that the solu- Finally, we present an algorithmic framework, where we iter- tion of the continuous relaxation of the MIQP is maximal. In other atively improve the upper bound for the QKP by optimizing the words, they seek perturbations that lead to the best bound at the choice of the perturbation of the objective function and adding root node of a branch-and-bound algorithm. The authors claim that cutting planes to the relaxation. At each iteration, lower bounds for these perturbations can be computed from an optimal dual solu- the problem are also generated from feasible solutions constructed tion of a standard SDP relaxation of the problem. However, the from a rank-one approximation of the solution of the CQP relax- dual SDP problem is not correctly formulated in the paper, and ation. the proof presented is not correct. In this paper, we prove the re- In Section 2 , we introduce our parametric convex quadratic re- sult following the idea presented in Billionnet et al. (2009), which laxation for the QKP. In Section 3 , we explain how we optimize is based on Lemarchal and Oustry (1999, Theorem 4.4). Further- the parametric problem over the perturbation of the objective; more, we show that the result is valid to the more general problem i.e., we present the IPM applied to obtain the perturbation that where the feasible set is any bounded polyhedron with nonempty leads to the best possible bound. In Section 4 , we present our con- interior. We also note that, if no cuts are added to the relaxation clusion about the parametric quadratic SDP relaxation, and relate during the iterations of our interior point method, it becomes an our results to the results presented in Billionnet et al. (2016) . In alternative way of obtaining the optimal perturbation considered Section 5 , we introduce new valid inequalities on the lifted matrix in Billionnet et al. (2016). variable of the convex quadratic model, and we describe how cut-
Another similar approach to handle nonconvex quadratic func- ting planes are obtained by the solution of separation problems. In tions consists in decomposing it as a difference of convex (DC) Section 6 , we present the heuristic used to generate lower bounds quadratic function (Horst & Thoai, 1999). DC decompositions to the QKP. In Section 7 , we discuss our numerical experiments have been extensively used in the literature to generate convex and in Section 8 , we present our final remarks. quadratic relaxations of nonconvex quadratic problems. See, for ex- ample, Fampa, Lee, and Melo (2017) and references therein. Unlike the approach used in DC decompositions, we do not necessarily Notation ∈ S n , decompose x TQx as a difference of convex functions, or equiva- If A then svec(A) is a vector whose entries come from A lently, as a sum of a convex and a concave function. Instead, fol- by stacking up its ‘lower half’, i.e., lowing the approach introduced in Billionnet et al. (2016) , we de- T n (n +1) / 2 svec (A ) := (a 11 , . . . , a n 1 , a 22 , . . . , a n 2 , . . . , a nn ) ∈ R . compose it as a sum of a concave function and a quadratic term derived from the perturbation applied to Q . This perturbation can The operator sMat is the inverse of svec, i.e., sMat ( svec (A )) = A . − λ be any symmetric matrix Qp , such that Q Qp 0. We also denote by min (A), the smallest eigenvalue of A and by λ In an attempt to obtain stronger bounds, we also investi- i (A) the ith largest eigenvalue of A. gated the parametric convex quadratic SDP problem, where we To facilitate the reading of the paper, Table 1 relates the add to our CQP relaxation, the positive semidefinite constraint acronyms used with the associated equations numbers. 38 M. Fampa, D. Lubke and F. Wang et al. / European Journal of Operational Research 281 (2020) 36–49
Table 2 where Z ∈ S n and ∈ S n denote, respectively, the slack and the dual
List of abbreviations. symmetric matrix variables. We consider the Lagrangian function CQP Convex Quadratic Programming ∗ μ( , , ) = ( ) − μ + (( − + )) . QKP Quadratic Knapsack Problem L Qp Z : pCQP Qp log det Z trace Q Qp Z
SDP Semidefinite Programming Some important points should be emphasized here. We first MIQP Mixed Integer Quadratic Programming ∗ ( ) MILP Mixed Integer Linear Programming note that the objective function for pCQP Qp is linear in Qp, i.e., this function is the maximum of linear functions over feasible points x , X . Therefore, this is a convex function. We also show the standard abbreviations used in the paper in Moreover, as will be detailed next, the search direction of the Table 2 . IPM, computed at each iteration of the algorithm, depends on = ( ) , = ( ) , the optimum solution x x Qp X X Qp of CQPQ p for a fixed 2. A parametric convex quadratic relaxation matrix Q p . At each iteration of the IPM, we have Z 0, and there- − ≺ fore Q Qp 0. Thus, problem CQPQ p maximizes a strictly con- In order to construct a convex relaxation for QKP , we start by cave quadratic function, subject to linear constraints over a com- considering the following standard reformulation of the problem pact set P, and consequently, has a unique optimal solution (see in the lifted space of symmetric matrices, defined by the lifting e.g. Turlach & Wright, 2015 ). From standard sensitivity analysis re- X := xx T . sults, e.g. Fiacco (1983 , Corollary 3.4.2), Hogan (1973) , and Danskin ∗ = ( ) (1966 , Theorem 1), as the optimal solution x = x (Q p ) , X = X(Q p ) is pQKP : max trace QX lifted ∗ ( ) s.t. w T x ≤ c unique, the function pCQP Qp is differentiable, and therefore, the ( QKP ) (2) lifted X = xx T Lagrangian function is also differentiable. Since Q appears only in the objective function in CQP , and x ∈ { 0 , 1 } n. p Qp T T T We consider an initial LP relaxation of QKP , given by x (Q − Q p ) x + trace (Q p X ) = x Qx + trace (Q p (X − xx )) , max trace (QX ) ( LPR ) (3) we have ( , ) ∈ P, s.t. x X ∗ T ∇ p CQP (Q p ) = X − xx . (8) where P ⊂ [0 , 1] n × S n is a bounded polyhedron, such that
The optimality conditions for (7) are obtained by differentiating { (x, X ) : w T x ≤ c, X = xx T , x ∈ { 0 , 1 } n} ⊂ P. the Lagrangian L μ with respect to Q p , , Z , respectively,
∂ 2.1. The perturbation of the quadratic objective Lμ ∗ : ∇ p CQP (Q p ) − = 0 , ∂Q p Next, we propose a convex quadratic relaxation with the same ∂L μ − + = , (9) feasible set as LPR , but maintaining a concave perturbed version of ∂ : Q Qp Z 0 the quadratic objective function of QKP, and linearizing only the ∂ Lμ − remaining nonconcave part derived from the perturbation. More : −μZ 1 + = 0 , ( or ) Z − μI = 0 . ∂ n Z specifically, we choose Q p ∈ S such that This gives rise to the nonlinear system Q − Q p 0 , (4) ⎛ ⎞ ∗ ∇ p (Q ) − and we get CQP p G μ(Q , , Z) = ⎝ Q − Q + Z ⎠ = 0 , Z, 0 . (10) T T T T T p p x Qx = x (Q − Q ) x + x Q x = x (Q − Q ) x + trace (Q xx ) p p p p Z − μI T = x (Q − Q p ) x + trace (Q p X ) . ∗ , We use a BFGS approximation for the Hessian of pCQP since it Finally, we define the parametric convex quadratic relaxation of is not guaranteed to be twice differentiable everywhere, and up- QKP : date it at each iteration (see Lewis & Overton, 2013 ). We denote 2 ∗ ∗ the approximation of ∇ p (Q ) by B , and begin with the ap- p (Q ) := max x T (Q − Q ) x + trace (Q X ) BFGS CQP p ( ) CQP p p p + CQPQ = k, k 1 p s.t. (x, X ) ∈ P. proximation B0 I. Recall that if Qp Qp are two successive it- + ∇ ∗ ( k ) , ∇ ∗ ( k 1 ) , (5) erates with gradients pCQP Qp pCQP Qp respectively, with ∈ S n (n +1) / 2 , current Hessian approximation Bk then we set 3. Optimizing the parametric problem over the parameter Q ∗ + ∗ + p = ∇ ( k 1 ) − ∇ ( k ) , = k 1 − k, Yk : pCQP Qp pCQP Qp Sk : Qp Qp ∗ ( ) The upper bound pCQP Qp in the convex quadratic problem and,
CQPQ depends on the feasible perturbation Qp of the Hessian Q. p υ := Y , S , ω := svec (S ) , B svec (S ) . To find the best upper bound, we consider the parametric problem k k k k k Finally, we update the Hessian approximation with ∗ ∗ = ( ) . paramQKP : min pCQP Qp (6) 1 1 Q−Q p 0 T T B + := B + svec (Y ) svec (Y ) − B svec (S ) svec (S ) B . k 1 k υ k k ω k k k k We solve (6) with a primal-dual interior-point method (IPM), and υ > we describe in this section how the search direction of the algo- We note that the curvature condition 0 should be verified rithm is obtained at each iteration. to guarantee the positive definiteness of the updated Hessian. In We start with minimizing a log-barrier function. We use the our implementation, we address this by skipping the BFGS update when υ is negative or too close to zero. barrier function, B μ( Q p , Z ) with barrier parameter, μ > 0, to obtain the barrier problem The equation for the search direction is ∗ min B μ(Q , Z) := p (Q ) − μ log det Z Q p p CQP p ( , , ) = − μ( , , ) , s.t. Q − Q p + Z = 0 (: ) (7) Gμ Qp Z G Qp Z (11) Z 0 , Z M. Fampa, D. Lubke and F. Wang et al. / European Journal of Operational Research 281 (2020) 36–49 39 where Algorithm 1 Updating the perturbation Q p . ∗ ∇ ∗ ( ) − k k k ( k ) ( k ) ∇ ( k ) μk τ = pCQP Qp Rd Input: k, Qp , Z , , x Qp , X Qp , pCQP Qp , Bk , , α : . τ = . G μ(Q p , , Z) = Q − Q p + Z =: R p . (12) 0 95, μ : 0 9.
Z − μI R c Compute the residuals: ⎛ ⎞ ∗ ∇ ( k ) − k If B is the current estimate of the Hessian, then (11) becomes Rd pCQP Qp = ⎝ − k + k ⎠ . Rp : Q Qp Z ( ( )) − = − , k k k sMat B svec Qp Rd Rc Z − μ I −Q p + Z = −R p , Solve the linear system for Q : Z + Z = −R c . p k k k k Z sMat (B svec (Q )) + (Q ) = −R − Z R + R . We can substitute for the variables and Z in the third equa- k p p c d p tion of the system. The elimination gives us a simplified system, Set: and therefore, we apply it, using the following two equations for := sMat (B svec (Q )) + R , Z := −R + Q . elimination and backsolving, k p d p p Update Q p , Z and : = sMat (B svec (Q )) + R , Z = −R + Q . (13) p d p p + + + k 1 = k + α , k 1 = k + α , k 1 = k + α , Qp : Qp ˆ p Qp Z : Zp ˆ p Z : ˆ d Accordingly, we have a single equation to solve, and the system where finally becomes
α = τα × { , { k + α }} , ˆ p : min 1 argmaxαp Zp p Z 0 ( ( )) + ( ) = − − + . Z sMat B svec Qp Qp Rc ZRd Rp k αˆ := τα × min{ 1 , argmaxα { + α 0 }} . d d d We emphasize that to compute the search direction at each it- + + eration of our IPM, we need to update the residuals defined in (12) , ( k 1 ) ( k 1 ) Obtain the optimal solution x Qp , X Qp of relaxation and therefore we need the optimal solution x = x (Q p ) , X = X(Q p ) k +1 CQP , where Q p := Q . of the convex quadratic relaxation CQP for the current perturba- Qp p Q p ∗ Update the gradient of pCQP : tion Qp. Problem CQPQ p is thus solved at each iteration of the IPM ∗ + + + + method, each time for a new perturbation Q p , such that Q − Q p ≺ ∇ ( k 1 ) = ( k 1 ) − ( k 1 ) ( k 1 ) T . pCQP Qp : X Qp x Qp x Qp 0 . ∗ υ > In Algorithm 1 , we present in details an iteration of the IPM. Update the Hessian approximation of pCQP (if 0): The algorithm is part of the complete framework used to generate ∗ k +1 ∗ k k +1 k Y := ∇ p (Q ) − ∇ p (Q ) , S := Q − Q , bounds for QKP , as described in Section 7 . k CQP p CQP p k p p υ = , , ω = ( ) , ( ) , : Yk Sk : svec Sk Bk svec Sk Remark 1. Algorithm is an interior-point method with a quasi- 1 T B + := B + svec (Y ) svec (Y ) Newton step (BFGS). The object function we are minimizing is k 1 k υ k k differentiable with exception possibly at the optimum. A com- 1 T plete convergence analysis of the algorithm is not in the scope of − B svec (S ) svec (S ) B . ω k k k k this paper, however, convergence analysis for some similar prob- lems can be found in the literature. In Armand, Gilbert, and Jégou Update μ: (20 0 0) , it is shown that if the objective function is always differen- k +1 k +1 tiable and strongly convex, then it is globally convergent to the an- + trace (Z ) μk 1 := τμ . alytic center of the primal-dual optimal set when μ tends to zero n and strict complementarity holds. + + + + k 1 k +1 k +1 ( k 1 ) ( k 1 ) ∇ ∗ ( k 1 ) Output: Qp , Z , , x Qp , X Qp , pCQP Qp , μk +1 Bk +1 , . 4. The parametric quadratic SDP relaxation
In an attempt to obtain tighter bounds, a promising approach might seem to be to include the positive semidefinite constraint We now consider the parametric SDP problem given by T X − xx 0 in our parametric quadratic relaxation CQP , and ∗ Qp = T ( − ) + ( ) pQSDP : sup x Q Qp x trace Qp X solve a parametric convex quadratic SDP relaxation, also using Qp ( QSDP ) ( , ) ∈ F an IPM. Nevertheless, we show in this section that the convex Qp s.t. x X − T , quadratic SDP relaxation cannot generate a better bound than the X xx 0 linear SDP relaxation, obtained when we set Q p equal to Q . In fact, (15) as shown below, the result applies not only to the QKP, but to more where Q − Q 0 . general problems as well. We emphasize here that the same result p n n does not apply for CQPQ p . We could observe with our computa- Theorem 2. Let F be any subset of R × S . For any choice of matrix tional experiments that the best bounds were obtained by CQP , Q p Q p satisfying Q − Q p 0 , we have − = , when we had Q Qp 0 for all instances considered. ∗ ∗ p QSDP ≥ p LSDP . (16) Consider the linear SDP problem given by Q p
∗ { ∗ − } = ∗ = ( ) Moreover, inf pQSDP : Q Qp 0 pLSDP . pLSDP : sup trace QX Q p ( LSDP ) s.t. (x, X ) ∈ F (14) ( ) X − xx T 0 , Proof. Let x˜, X˜ be a feasible solution for LSDP. We have ∗ ≥ T ( − ) + ( ˜ ) n n n n pQSDP x˜ Q Qp x˜ trace Qp X (17) where x ∈ R , X ∈ S , and F is any subset of R × S . Qp 40 M. Fampa, D. Lubke and F. Wang et al. / European Journal of Operational Research 281 (2020) 36–49
T T L (x, X, y ) := x Q x + trace (Q X ) = trace ((Q − Q p )( x˜x ˜ − X˜ )) + trace ((Q − Q p ) X˜ ) + trace (Q p X˜ ) n p (18) q
+ T ( − ( ) − γ T ) . yk bk trace k X k x = T k 1 = trace ((Q − Q p )( x˜x ˜ − X˜ )) + trace (Q X˜ ) (19) The Lagrangian dual function is then defined as
≥ trace (Q X˜ ) . (20) g(y ) := max L (x, X, y ) x,X q
The inequality (17) holds because ( x˜, X˜ ) is also a feasible solution = T + ( ) + ( − ( ) − γ T ) max x Qn x trace Qp X yk bk trace k X k x x,X = for QSDPQ p . The inequality in (20) holds because of the negative k 1 ∗ q q q − T − ˜ semidefiniteness of Q Qp and x˜x˜ X. Because pQSDP is an up- = − + T − γ T + Qp max trace Qp yk k X x Qn x yk k x yk bk x,X per bound on the objective value of LSDP at any feasible solu- ⎧ k =1 k =1 k =1 ∗ ∗ q q q tion, we can conclude that p ≥ p . Clearly, Q = Q satis- ⎨ QSDP LSDP p x T Q x − y γ T x + y b , if Q − y = 0 , Qp = max n k k k k p k k = = = fies Q − Q p = 0 0 and LSDP is the same as QSDP for this choice x ⎩ k 1 k 1 k 1 Qp + ∞ , otherwise. { ∗ − } = ∗ ⎧ of Qp . Therefore, inf pQSDP : Q Qp 0 pLSDP . Q p ⎪ q q ⎪ ⎪− 1 γ T † γ + T , ⎪ yk k Qn yk k y b Notice that in Theorem 2 we do not require that F be convex ⎨ 4 k =1 k =1 = nor bounded. Also, in principle, for some choices of Q p , we could q q ∗ ∗ ⎪ − = γ ∈ ( ) , have p = + ∞ with p = + ∞ or not. ⎪ if Qp yk k 0 and yk k range Qn QSDP LSDP ⎪ Qp ⎩ k =1 k =1 + ∞ , otherwise, Remark 3. As a corollary from Theorem 2 , we have that the upper , bound for QKP given by the solution of the quadratic relaxation † where Qn is the pseudo-inverse of Qn . CQP , cannot be smaller than the upper bound given by the so- Qp As F has nonempty interior, the Slater condition holds for the lution of the SDP relaxation obtained from it, by adding the SDP inner maximization problem in (22) . Therefore, problem (22) has constraint X − xx T 0 and setting Q equal to Q . p the same optimal value as In Billionnet et al. (2016 , Theorem 1), this result was already proven for the case where F is a particular convex set. The authors q q also claim that in this particular case, the best bound obtained by 1 T † T min − y γ Q y γ + y b , , 4 k k n k k the CQP relaxation is exactly the same as the bound given by the Qp Qn y k =1 k =1 linear SDP relaxation, and that the best perturbation can be de- q − = , rived from the optimal dual variables of the SDP problem. Never- s.t. Qp yk k 0 theless the proof presented in Billionnet et al. (2016) is based on k =1 an incorrect formulation of the dual SDP problem. In the follow- q γ ∈ ( ) , ing, we prove that the result holds, following the same idea used yk k range Qn = in Billionnet et al. (2009) , which is based on Lemarchal and Oustry k 1 Q = Q − Q 0 , y ≥ 0 , (1999 , Theorem 4.4). Furthermore, we show that the result is valid n p to the more general problem where the feasible set is any bounded which is equivalent to polyhedron with nonempty interior. The result then applies to the
CQP relaxations that we use in this work. − max t , , t Qp y Theorem 4. Consider F ⊂ R n × S n as a bounded polyhedron with q q T nonempty interior, defined by: 1 γ T ( − ) † γ − − ≥ , s.t. 4 yk k Q Qp yk k y b t 0
F = { ( , ) ( ) + γ T ≤ , = , . . . , } , k =1 k =1 : x X : trace k X k x bk k 1 q Q − Q 0 , p ∈ S n γ ∈ R n, = , . . . , q where k and k for k 1 q. γ ∈ ( − ) , Define yk k range Q Qp = ∗ T k 1 p polCQP (Q p ) := max{ x (Q − Q p ) x + trace (Q p X ) : (x, X ) ∈ F} , q − = , Qp yk k 0 where Q − Q p 0 , and k =1 ∗ T ≥ . p polSDP := max { trace (QX ) : (x, X ) ∈ F, X − xx 0 } . (21) y 0
Then By Schur complement ( Horn & Zhang, 2005 ), this is equivalent to ∗ ∗ the following SDP problem min p polCQP (Q p ) = p polSDP . Q−Q p 0 − max t Proof. First we consider the parametric problem , , t Qp y ⎡ ⎤ q ∗ ( ) min ppolCQP Qp − − T γ T / − ⎢ t y b yk k 2⎥ Q Qp 0 ⎢ ⎥ T = = min max x (Q − Q p ) x + trace (Q p X ) ⎢ k 1 ⎥ , − , s.t. ⎢ ⎥ 0 Q Qp 0 x X q T ⎣ ⎦ s.t. trace ( X ) + γ x ≤ b , k = 1 , . . . , q. γ / − k k k yk k 2 Qp Q ( ≥ ) : yk 0 k =1 q (22) − = , Qp yk k 0 Considering Q n := Q n (Q p ) := Q − Q p , the Lagrangian function of k =1 the inner maximization problem in (22) is defined as y ≥ 0 . M. Fampa, D. Lubke and F. Wang et al. / European Journal of Operational Research 281 (2020) 36–49 41
After substitution, this is equivalent to is a valid inequality for QKPlifted . In this case, we say that (26) is
min −t a valid inequality for QKPlifted derived from the valid inequality t,y ⎡ ⎤ (25) for QKP . q
− − T γ T / ⎢ t y b yk k 2⎥ ⎢ ⎥ = 5.1. Preliminaries: knapsack polytope and cover inequalities ⎢ k 1 ⎥ s.t. ⎢ ⎥ q q ⎣ ⎦ (23) γ / − We begin by recall the concepts of knapsack polytopes and yk k 2 yk k Q cover inequalities. k =1 k =1 The knapsack polytope is the convex hull of the feasible points s z T 0 . : 0 of the knapsack problem, z Z
y ≥ 0 . KF := { x ∈ { 0 , 1 } n : w T x ≤ c} .
We can now derive the dual problem of (23), considering the La- Definition 5 (Zero-one knapsack polytope) . grangian function, defined as
= ( ) = ({ ∈ { , } n T ≤ } ) . L (t, y, s, z, Z) KPol : conv KF conv x 0 1 : w x c ⎛ ⎡ ⎤ ⎞ q Proposition 6. The dimension
− − T γ T ⎜ ⎢ t y b yk k 2⎥ ⎟ ⎜ ⎢ ⎥ ⎟ ( ) = , s z T = dim KPol n = − − ⎜ ⎢ k 1 ⎥ ⎟ t trace⎜ ⎢ ⎥ ⎟ z Z q q ⎝ ⎣ ⎦ ⎠ and KPol is an independence system, i.e., γ − yk k 2 yk k Q n k =1 k =1 x ∈ KPol, y ∈ { 0 , 1 } , y ≤ x ⇒ y ∈ KPol.
q q ≤ , ∀ ∈ R n , Proof. Recall that wi c i. Therefore, all the unit vectors ei = − − (− − T ) − γ T − − t s t y b yk k z trace Z yk k Q as well as the zero vector, are feasible, and the first statement fol- = = k 1 k 1 lows. The second statement is clear. q q
= ( − ) + T − γ T − − Cover inequalities were originally presented in Balas (1975) , t s 1 sy b yk k z trace Z yk k Q
k =1 k =1 Wolsey (1975), see also Nemhauser and Wolsey (1988, Section II.2). These inequalities can be used in general optimization problems q
= ( − ) + ( − γ T − ( )) + ( ) . with knapsack inequalities and binary variables and, particularly, t s 1 yk sbk z trace k Z trace QZ k in QKP . k =1 The Lagrangian dual function is defined as Definition 7 (Cover inequality, CI) . The subset C ⊆ N is a cover if it h (s, z, Z) := min L (t, y, s, z, Z) satisfies ≥ , y 0 t q
= ( − ) + ( − γ T − ( )) + ( ) w j > c. min t s 1 yk sbk k z trace k Z trace QZ y ≥0 ,t ∈ k =1 j C trace (QZ) , = = , − γ T − ( ) ≥ , = , . . . , , The (valid) CI is if s 1 sbk k z trace k Z 0 k 1 q −∞ , otherwise. x ≤ |C | − 1 . (27) Therefore, the dual problem of (23) is given by the maximization j j∈C of the Lagrangian dual function over the SDP cone, as in max trace (QZ) A cover C is minimal if no proper subset of C is a cover. z,Z T ∗ s.t. trace ( Z) + γ z ≤ b , k = 1 , . . . , q, Definition 8 (Extended CI, ECI) . Let w := max ∈ w and define k k k (24) j C j 1 z T the extension of C as 0 . z Z ∗ E(C) := C ∪ { j ∈ N\C : w j ≥ w } . We finally note that problems (24) and (21) are the same. Since The ECI is F has nonempty interior, then (21) is strictly feasible. Therefore, strong duality holds for problem (21), and the result of the theo- x j ≤ |C | − 1 . (28) rem follows. j∈ E(C)
α ≥ ∀ ∈ 5. Valid inequalities Definition 9 (Lifted CI, LCI). Given a cover C, let j 0, j N\C, α > ∈ and j 0, for some j N\C, such that We are now interested in finding valid inequalities to + α ≤ | | − , strengthen relaxations of QKP in the lifted space determined by x j j x j C 1 (29) the lifting X := xx T . Let us denote by CRel , any convex relaxation of j∈C j∈ N\C = T QKP in the lifted space, where the equation X xx was relaxed is a valid inequality for KF . Inequality (29) is a LCI . in some manner, by convex constraints, i.e., any convex relaxation of QKPlifted Cover inequalities are extensively discussed in Hammer, We note that if the inequality Johnson, and Peled (1975) , Balas and Zemel (1978) , Balas (1975) ,
τ T x ≤ β (25) Wolsey (1975) , Nemhauser and Wolsey (1988) , and Atamtürk
(2005). Details about the computational complexity of LCI is pre- is valid for QKP , where τ ∈ Z n and β ∈ Z + , then, as x is nonnega- + sented in Zemel (1989) and Gu, Nemhauser, and Savelsbergh tive and X := xx T , (1998) . Algorithm 2 ( Wolsey, 1998 , p.17), shows how to lift a LCI . It −β provides a facet-defining inequality for KPol when C is a minimal ( x X ) ≤ 0 (26) τ cover and t¯ = 1 . 42 M. Fampa, D. Lubke and F. Wang et al. / European Journal of Operational Research 281 (2020) 36–49
Algorithm 2 Procedure to Lift Cover Inequalities. 5.3. New valid inequalities in the lifted space Input: A cover C and a valid inequality As discussed, after finding any valid inequality in the form of t¯ −1 (25) for QKP , we may add the constraint (26) to CRel when aim- α + ≤ | | − , i j xi j xi C 1 ing at better bounds. We observe now, that besides (26) we can j=1 i ∈C also generate other valid inequalities in the lifted space by taking = T for KF , for some t¯ ∈ [1 , r] , where r := | N \ C| . advantage of the lifting X : xx , and also of the fact that xis bi- ∈ \ nary. In the following, we show how the idea can be applied to Sort the elements i N C in ascending wi order, defining cover inequalities. { i 1 , i 2 , . . . , i r } . = Let For: t t¯ to r x j ≤ β, (33) j∈C t−1 ζ = α + where C ⊂ N and β < | C |, be a valid inequality for KPol . t max i j xi j xi j=1 i ∈C Inequality (33) can be either a cover inequality, CI , an extended t−1 (30) cover inequality, ECI , or a particular lifted cover inequality, LCI , s.t. w x + w x ≤ c − w α ∈ ∀ ∈ i j i j i i it where j {0, 1}, j N\C in (29). Furthermore, given a general LCI, j=1 i ∈C α ∈ Z , ∈ where j + for all j N\C, a valid inequality of type (33) can be |C | +t −1 x ∈ { 0 , 1 } . α α constructed by replacing each j with min { j , 1} in the LCI. α = | | − − ζ Set it C 1 t . Definition 11 (Cover inequality in the lifted space, CILS ) . Let C ⊂ N End and β < | C | as in inequality (33) , and also consider here that β > 1. We define β X ≤ . (34) ij 2 5.2. Adding cuts to the relaxation i, j∈C ,i< j as the CILS derived from (33) . Given a solution ( x¯, X¯ ) of CRel , our initial goal is to obtain Theorem 12. If inequality (33) is valid for QKP , then the CILS (34) is a valid inequality for QKPlifted derived from a CI that is violated T T n by ( x¯, X¯ ) . A CI is formulated as α x ≤ e α − 1 , where α ∈ {0, 1} a valid inequality for QKPlifted . and e denotes the vector of ones. We then search for the CI that β Proof. Considering (33) , we conclude that at most products maximizes the sum of the violations among the inequalities in 2 of variables x x , where i , j ∈ C , can be equal to 1. Therefore, as Y¯ cut (α) ≤ 0 , where Y¯ := x¯ X¯ and i j = Xij : xi xj , the result follows. −e T α + 1 α) = . Remark 13. When β = 1 , inequality (33) is well known as a clique cut ( α cut, widely used to model decision problems, and frequently used as a cut in branch-and-cut algorithms. In this case, using similar To obtain such a CI , we solve the following linear knapsack prob- idea to what was used to construct the CILS , we conclude that it lem, is possible to fix ∗ = { T ¯ (α) T α ≥ + , α ∈ { , } n} . v : maxα e Ycut : w c 1 0 1 (31) X ij = 0 , for all i, j ∈ C, i < j. Let α∗ solve (31) . If ∗ > 0 , then at least one valid inequality v Given a solution ( x¯, X¯ ) of CRel , the following MIQP problem is in the following set of n scaled cover inequalities, denoted in the a separation problem, which searches for a CILS violated by X¯ . following by SCI , is violated by ( x¯, X¯ ) . ∗ = ( ¯ ) − β(β − ) , ( ) z : maxα,β,K trace XK 1 MIQP 1 T ∗ −e α + 1 T α ≥ + , ( ) ≤ . s.t. w c 1 x X α∗ 0 (32) β = e T α − 1 , K(i, i ) = 0 , i = 1 , . . . , n, Based on the following theorem, we note that to strengthen cut K(i, j) ≤ α , i, j = 1 , . . . , n, i < j, (32) , we may apply Algorithm 2 to the CI obtained, lifting it to an i K(i, j) ≤ α , i, j = 1 , . . . , n, i < j, LCI , and finally add the valid inequality (26) derived from the LCI j K(i, j) ≥ 0 , i, j = 1 , . . . , n, i < j, to CRel . K(i, j) ≥ αi + α j − 1 , i, j = 1 , . . . , n, i < j, α ∈ { , } n , β ∈ R , ∈ S n . Theorem 10. The valid inequality (26) for QKPlifted , which is derived 0 1 K from a valid LCI, dominates all inequalities derived from a CI that can α∗ β∗ ∗ ∗ > If , , K solves MIQP1 , with z 0, the CILS given by be lifted to the LCI . ∗ ∗ ∗ ∗ trace (K X ) ≤ β (β − 1) is violated by X¯ . The binary vector α de- Proof. Consider the LCI (29) derived from a CI (27) for QKP . The fines the CI from which the cut is derived. The CI is specifically ∗T T ∗ ∗ ∗ corresponding scaled cover inequalities (26) derived from the CI given by α x ≤ e α − 1 and β (β − 1) determines the right- and the LCI are, respectively, hand side of the CILS . The inequality is multiplied by 2 because we consider the variable K as a symmetric matrix, in order to sim- ≤ (| | − ) , ∀ ∈ , Xij C 1 xi i N plify the presentation of the model. j∈C Theorem 14. The valid inequality CILS for QKP , which is derived lifted and from a valid LCI in the form (33), dominates any CILS derived from a
X ij + α j X ij ≤ (|C | − 1) x i , ∀ i ∈ N, CI that can be lifted to the LCI. ∈ ∈ \ j C j N C Proof. As X is nonnegative, it is straightforward to verify that if X α ≥ ∀ ∈ where j 0, j N\C. Clearly, as all Xij are nonnegative, the second satisfies a CILS derived from a LCI, X also satisfies any CILS derived inequality dominates the first, for all i ∈ N . from a CI that can be lifted to the LCI . M. Fampa, D. Lubke and F. Wang et al. / European Journal of Operational Research 281 (2020) 36–49 43
( ) > β(β − ) β Any feasible solution of MIQP1 such that trace X¯ K 1 case, if at most elements of C can be selected in the so- ¯ β β generates a valid inequality for QKPlifted that is violated by X. = lution, clearly at most 2 2 subsets of Cs can also be Therefore, we do not need to solve MIQP to optimality to gener- 1 selected. ate a cut. Moreover, to generate distinct cuts, we can solve MIQP 1 several times (not necessarily to optimality), each time adding to it, the following “no-good” cut to avoid the previously generated Given a solution ( x¯, X¯ ) of CRel , we now present a mixed linear cuts: integer programming (MILP) separation problem, which searches
¯ ∈ α( )( − α( )) ≥ , for an inequality in SCILS that is most violated by X. Let A ¯ i 1 i 1 (35) n (n +1) n × i ∈ N { 0 , 1 } 2 . In the first n columns of A we have the n × n identity matrix. In the remaining n (n − 1) / 2 columns of the matrix, there where α¯ is the value of the variable α in the solution of MIQP 1 are exactly two elements equal to 1 in each column. All columns when generating the previous cut. = , α∗ β∗ ∗ α∗T ≤ T α∗ − are distinct. For example, for n 4 We note that, if , , K solves MIQP1 , then x e 1 ⎛ ⎞ is a valid CI for QKP , however it may not be a minimal cover. 1 0 0 0 1 1 1 0 0 0 Aiming at generating stronger valid cuts, based in Theorem 14 , we ⎜ 0 1 0 0 1 0 0 1 1 0 ⎟ = . −δ T α, A : ⎝ ⎠ might add to the objective function of MIQP1 , the term e 0 0 1 0 0 1 0 1 0 1 for some weight δ > 0. The objective function would then favor 0 0 0 1 0 0 1 0 1 1 minimal covers, which could be lifted to a facet-defining LCI , that would finally generate the CILS . We should also emphasize that if The columns of A represent all the subsets of items in N with one the CILS derived from a CI is violated by a given X¯ , then clearly, or two elements. Let ¯ ∗ = ( ) − , ( ) the CILS derived from the LCI will also be violated by X. z : maxα, v ,K,y trace X¯ K 2v MILP 2 Now, we also note that, besides defining one cover inequality s.t. w T α ≥ c + 1 , in the lifted space considering all possible pairs of indexes in C , K(i, i ) = 2 y (i ) , i = 1 , . . . , n, we can also define a set of cover inequalities in the lifted space, n ( ) ≤ , considering for each inequality, a partition of the indexes in C into y i 1 subsets of cardinality 1 or 2. In this case, the right-hand side of i =1 n (n +1) / 2 the inequalities is never greater than β/2. The idea is made precise K(i, j) = A (i, t) A ( j, t )) y (t ) , i, j = 1 , . . . , n, i < j, below. t= n +1 ≥ ( T α − ) / − . , Definition 15 (Set of cover inequalities in the lifted space, v e 1 2 0 5 ≤ ( T α − ) / , SCILS ) . Let C ⊂ N and β < | C | as in inequality (33) . Let v e 1 2 y (t) ≤ 1 − A (i, t) + α(i ) , i = 1 , . . . , n, = { ( , ) , . . . , ( , ) } ( + ) 1. Cs : i1 j1 ip jp be a partition of C, if |C| is even. = , . . . , n n 1 , t 1 2 = { ( , ) , . . . , ( , ) } ࢨ ( + ) 2. Cs: i1 j1 ip jp be a partition of C {i0 } for each α ≤ ≤ α + n n 1 ( − α) , Ay 2 1 i ∈ C , if | C | is odd and β is odd. ( + ) 0 n n n 1 α ∈ { 0 , 1 } , y ∈ { 0 , 1 } 2 , 3. C s : = {(i , i ) , (i , j ) , . . . , (i p , j p )} , where {(i , j ) , . . . , (i p , j p )} 0 0 1 1 1 1 n ࢨ ∈ β v ∈ Z , K ∈ S . is a partition of C {i0 } for each i0 C, if |C| is odd and is even. α∗, ∗, ∗, ∗ ∗ > If v K y solves MILP2 , with z 0, then the particular in-
< = , . . . , equality in SCILS given by In all cases, ik jk for all k 1 p. ∗ ∗ The inequalities in the SCILS derived from (33) are given by ( ) ≤ trace K X 2v (37)
β is violated by X¯ . The binary vector α∗ defines the CI from which X ≤ , (36) ij 2 the cut is derived. As the CI is given by α∗x ≤ e T α∗ − 1 , we can (i, j) ∈C s conclude that the cut generated either belongs to case (1) or (3) in for all partitions Cs defined as above. Definition 15. This fact is considered in the formulation of MILP2 . ∗
, The vector y defines a partition Cs as presented in case (3), if Theorem 16. If inequality (33) is valid for QKP then the inequalities n = y (i ) = 1 , and in case (1), otherwise. We finally note that the in the SCILS (36) are valid for QKP . i 1 lifted number 2 in the right-hand side of (37) is due to the symmetry of Proof. The proof of the validity of SCILS is based on the lifting the matrix K ∗. = relation Xij xi x j . We note that if the binary variable xi indicates We now may repeat the observations made for MIQP1 . ( ) > whether or not the item i is selected in the solution, the variable Any feasible solution of MILP2 such that trace X¯ K 2v gener-
Xij indicates whether or not the pair of items i and j, are both se- ates a valid inequality for CRel, which is violated by X¯ . Therefore, lected in the solution. we do not need to solve MILP2 to optimality to generate a cut.
Moreover, to generate distinct cuts, we can solve MILP2 several 1. If | C | is even, C is a partition of C in exactly | C |/2 subsets s times (not necessarily to optimality), each time adding to it, the with two elements each, and therefore, if at most β ele- following suitable “no-good” cut to avoid the previously generated