Conjugate of some Rational functions and Convex Envelope of Quadratic functions over a Polytope
by
Deepak Kumar
Bachelor of Technology, Indian Institute of Information Technology, Design and Manufacturing, Jabalpur, 2012
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
in
THE COLLEGE OF GRADUATE STUDIES
(Computer Science)
THE UNIVERSITY OF BRITISH COLUMBIA (Okanagan)
January 2019
© Deepak Kumar, 2019
The following individuals certify that they have read, and recommend to the College of Graduate Studies for acceptance, a thesis/dissertation entitled:
Conjugate of some Rational functions and Convex Envelope of Quadratic functions over a Polytope submitted by Deepak Kumar in partial fulfilment of the requirements of the degree of Master of Science.
Dr. Yves Lucet, I. K. Barber School of Arts & Sciences Supervisor
Dr. Heinz Bauschke, I. K. Barber School of Arts & Sciences Supervisory Committee Member
Dr. Warren Hare, I. K. Barber School of Arts & Sciences Supervisory Committee Member
Dr. Homayoun Najjaran, School of Engineering University Examiner
Abstract
Computing the convex envelope or biconjugate is the core operation that bridges the domain of nonconvex analysis with convex analysis. For a bivariate PLQ function defined over a polytope, we start with computing the convex envelope of each piece. This convex envelope is characterized by a polyhedral subdivision such that over each member of the subdivision, it has an implicitly defined rational form (square of a linear function over a linear function). Computing this convex envelope involves solving an exponential number of subproblems which, in turn, leads to an exponential time algorithm.
After that, we compute the conjugate of each such rational function defined over a polytope. It is observed that the conjugate has a parabolic subdivision such that over each member of its subdivision, it has an implicitly defined fractional form (linear function over square root of a linear function). This computation of the conjugate is performed with a worst-case linear time complexity algorithm.
Finally, some directions and insights about computing the maximum of all the conjugates of each piece of a PLQ function and then the conjugate of that to obtain the biconjugate are provided as conjectures for future work.
Lay Summary
Optimization problems occur in a wide variety of fields ranging from Engineering to Mathematical Finance, and solving these problems plays a crucial role in the process. When the optimization problem is nonconvex, it is often difficult to solve. In this thesis, we present a method to convert some classes of nonconvex problems to convex problems, which are often easier to solve, and a method to compute their conjugates, which can help solve the problem faster than before.
Table of Contents
Abstract
Lay Summary
Table of Contents
List of Tables
List of Figures
List of Notations and Abbreviations
Acknowledgements
Dedication
Chapter 1: Introduction
  1.1 PLQ Functions
  1.2 Convex Envelope
  1.3 Fenchel conjugate
  1.4 Motivation
Chapter 2: Preliminaries and Notations
Chapter 3: Convex Envelope of Bivariate Quadratic Functions over a polytope
  3.1 Quadratic Reduction
  3.2 Problem Formulation [Loc16]
    3.2.1 Problem Formulation
    3.2.2 Expressions for η(a, b)
    3.2.3 Subproblems
  3.3 Exponential subproblems
  3.4 Optimal Solutions [Loc16]
    3.4.1 Solving the subproblems
    3.4.2 Functional forms
    3.4.3 Solutions
  3.5 Convex envelope: the maximum of all subproblems
  3.6 Algorithmic Design
    3.6.1 Input and Output Data Structures
    3.6.2 Main Algorithm
    3.6.3 Solving the subproblems
  3.7 Map Overlay problem
    3.7.1 Data structures for polyhedral region
    3.7.2 Sorting vertices in clockwise direction
    3.7.3 Map Overlay Algorithm
Chapter 4: Conjugate of a class of Convex Bivariate Rational Functions over a polytope
  4.1 Subdifferentials in the interior of the polytope
  4.2 Subdifferentials at the Vertices
  4.3 Subdifferentials on the edges
  4.4 Structure of the conjugate domain
  4.5 Conjugate Expressions
    4.5.1 Fractional forms
Chapter 5: Algorithmic computation of the Conjugate for a class of Bivariate Rational functions over a polytope
  5.1 Data Structures
    5.1.1 Parabolic region
    5.1.2 Output Data Structure
  5.2 Algorithm
    5.2.1 Main Algorithm
    5.2.2 Algorithm for edges
  5.3 Example 1
  5.4 Example 2
Chapter 6: Conclusions and Future work
  6.1 Future work
    6.1.1 Maximum of all conjugates
    6.1.2 Conjugate of bivariate nonconvex PLQ Functions
    6.1.3 Convex envelope of bivariate nonconvex PLQ functions
Bibliography
Appendix
  Appendix A: The η(a, b) expressions for the bilinear functions
  Appendix B: Solving the subproblems
    B.1 Case 1: Quadratic-Quadratic
      B.1.1 Solving the equality constraint
      B.1.2 Solving the inequalities
      B.1.3 Solutions
    B.2 Case 2: Quadratic-Linear
      B.2.1 Solving the equality constraint
      B.2.2 Solving the inequalities
      B.2.3 Solutions
    B.3 Case 3: Linear-Linear
      B.3.1 Solving the equality constraint
      B.3.2 Solving the inequalities
      B.3.3 Solutions
  Appendix C: Conjugate expressions for a rational function over an edge
List of Tables
Table 3.1 Different cases depending upon the eigenvalues
Table 5.1 Conjugate domain for a rational function
Table 5.2 Conjugate domain for quadratic functions
Table 6.1 Observed intersections for computing the maximum of all conjugates
List of Figures
Figure 1.1 A univariate convex PLQ Function
Figure 1.2 A non convex univariate PLQ Function
Figure 1.3 $f(x) = x^2/2$, its conjugate $f^*(s) = s^2/2$, and the mapping between the two
Figure 1.4 Closed convex envelope (shown in Blue) of each piece of a univariate nonconvex PLQ function
Figure 2.1 A Polyhedral subdivision of $\mathbb{R}^2$
Figure 3.1 η1's polyhedral domain
Figure 3.2 η2(a, b)'s polyhedral domain
Figure 3.3 Nine Subregions formed by overlaying domains of η1 and η2
Figure 3.4 19 subregions, shown by different colors, formed by intersection of domains of 3 η(a, b)s belonging to a convex edge
Figure 3.5 Input Data Structure
Figure 3.6 Output Data Structure
Figure 3.7 Space division by line y = mx + c
Figure 3.8 Vertex Enumeration Problem
Figure 4.1 Primal and dual mapping for quadratic function in Example 4.4 over lines parallel to x1 + x2 = 0
Figure 4.2 Normal cone at a vertex v ∈ P shown as the arrows in the intersection of Red and Green region
Figure 4.3 Polyhedral subdivision for Example 4.14
Figure 4.4 Parabolic subdivision for r and P in Example 4.22
Figure 4.5 Parabolic subdivision for r and P from Example 4.5
Figure 5.1 A subdifferential region
Figure 5.2 Region with only active constraints
Figure 5.3 Proposed subdivision for storing a subdifferential region
Figure 5.4 Output Data structure
Figure 5.5 Normal cone division lines for Example 5.3
Figure 5.6 Conjugate for Example 5.3
Figure 5.7 Normal cone division lines for Example 5.4
Figure 5.8 Conjugate for Example 5.4
List of Notations and Abbreviations
$\mathbb{R}$    Real numbers
$\phi$    Empty set
$|A|$    Cardinality of set A
dom(f)    Effective domain of function f
cl(g)    Closed envelope of function g
conv $f_P$    Convex envelope of f over a region P
conv    Closed convex Envelope
$S^{n \times n}$    Symmetric Matrix of size n × n
int(P)    Interior of the set P
ri(P)    Relative interior of set P
PLQ    Piecewise Linear Quadratic
LFT    Legendre-Fenchel transform
$f^*$    Conjugate of function f
sup S    The supremum of set S
inf S    The infimum of set S
$N_P(x)$    Normal cone of set P at x
$\partial f(x)$    Subdifferential of function f at x
$\nabla f(x)$    Gradient of function f at x
$I_P$    Indicator function of set P
Acknowledgements
I would like to take this opportunity to acknowledge and express my sincere gratitude towards my supervisor Dr. Yves Lucet, for his constant guidance, encouragement, and precious advice throughout my graduate studies. I am very grateful for his endless support, especially for his expertise, patience and insightful suggestions during the course of my research. His attention to detail motivated me to pay closer attention to everything I work on, and his positive outlook and confidence in my research inspired me and gave me the right confidence to dig deeper into my work.
I would also like to extend my gratitude towards the members of my supervisory committee: Dr. Heinz Bauschke and Dr. Warren Hare for spending their valuable time reading my thesis and providing constructive and insightful feedback. I am also thankful to all of my committee members for making my defense a wonderful moment of my life.
I truly appreciate the support of my family and all of my friends and colleagues of the CCA research group for all they meant to me during the completion of my research. Without them, this journey and the lab would not have been such an enjoyable experience.
I am also thankful to the University of British Columbia, Okanagan and the Natural Sciences and Engineering Research Council of Canada (NSERC) for providing the financial assistance, without which this research would have seemed like a distant dream.
Dedication
To my family, for their constant love and support.
Chapter 1
Introduction
Computational convex analysis (CCA) focuses on creating efficient algorithms to compute fundamental transforms arising in the field of convex analysis. Computing the convex envelope or biconjugate is the core operation that bridges the domain of nonconvex analysis with convex analysis. The early idea of the computation of convex transforms can be traced back to [Mor65]. However, development of most of the algorithms in CCA began with the Fast Legendre Transform (FLT) in [Bre89] and was further developed in [Cor96, Luc96] (and independently in [NV94, SAF92]). Later, the algorithm was improved to the optimal linear worst-case time complexity in [Luc97]. Recent algorithms to compute the conjugate numerically have been based on either parameterization [HUL07], manipulation of graphs (GPH model) [GL11], or the computation of the Moreau envelope [Luc06]. More complex operators such as the proximal average operator [BGLW08, BLT08] can be built by using a combination of addition, scalar multiplication, and conjugacy operations. This has further motivated more research in the field of computational convex analysis [LLT17, JKL11, LBT09].
Computational convex analysis has numerous applications in a wide variety of fields, ranging from image processing and partial differential equations to thermodynamics. An extensive study has been carried out in the survey [Luc10].
1.1 PLQ Functions
Piecewise Linear Quadratic (PLQ) functions play a significant role in the approximation of functions. Compared to piecewise linear functions, PLQ functions capture the curvature of the original function. They are continuous but may not be smooth; however, they are twice differentiable in the interior of each piece. PLQ functions are well known in the field of convex analysis (the set of PLQ functions is closed under common convex operators [RW98]). Their applications in nontraditional statistical estimation problems have been studied in [CCHP17].
Figure 1.1: A univariate convex PLQ Function
An example of a convex univariate PLQ function is shown in Figure 1.1. A PLQ framework along with the existence of linear time algorithms for various convex transforms has been proposed in [LBT09, BLT08]. A detailed description of all the important transforms is presented in the study [Luc10]. Computing the full graph of the convex hull of univariate nonconvex PLQ functions using the biconjugate operation in optimal linear worst-case time complexity has been proposed in [GL10]. However, the area of bivariate PLQ functions, especially nonconvex, has not been explored much.
1.2 Convex Envelope
For a function $f$ defined over a region $X$, the pointwise supremum of all convex underestimators of $f$ over $X$ is called the convex envelope of $f$ and is denoted by $\operatorname{conv} f_X$. For example, for the univariate nonconvex PLQ function
\[
f(x) = \begin{cases} x^2 & x \le 0 \\ 1 - (x - 1)^2 & 0 < x < 1 \\ x^2 & x \ge 1 \end{cases}
\]
shown in Figure 1.2, its convex envelope is given by
\[
\operatorname{conv} f = \begin{cases} x^2 & x \le 0 \\ x & 0 < x < 1 \\ x^2 & x \ge 1 \end{cases}
\]
and is shown earlier in Figure 1.1.
Figure 1.2: A nonconvex univariate PLQ Function
Computing the convex envelope of a function is one of the fundamental tasks in mathematical optimization. A nonconvex function shares the same global minimum value as its convex envelope, and computing the minimum of a convex function is much easier than computing it for a nonconvex function. The task of computing the convex envelope of a function is very useful though not simple (computation of the convex envelope of a multilinear function over a unit hypercube has been proved to be NP-Hard in [Cra89]). However, the convex envelope of functions defined over a polytope $P$ and restricted by the vertices of $P$ can be computed in finite time using a linear program [Tar04, Tar08]. A method to reduce the computation of the convex envelope of functions that are convex in one lower dimension ($\mathbb{R}^{n-1}$) and have an indefinite Hessian to optimization problems in lower dimensions is discussed in [JMW08].
Results for specific functions exist in the literature; a particular one is the bilinear function. Any general bivariate nonconvex quadratic function can be linearly transformed to the sum of a bilinear and a linear function. Some studies involving bilinear functions are:
- convex envelopes for bilinear functions over rectangles have been discussed in [McC76] and validated in [AKF83];
- the convex envelope over special polytopes (not containing edges with finite positive slope) has been derived in [SA90];
- the convex envelope of a bilinear function over a triangle containing exactly one edge with finite positive slope is presented in [Lin05];
- the convex envelope over general triangles and triangulation of the polytopes through doubly nonnegative matrices (both semidefinite and nonnegative) is presented in [AB10, Ans12].
In [Loc14], it is shown that the analytical form of the convex envelope of some bivariate functions defined over polytopes can be computed by solving a continuously differentiable convex problem; this is later extended to any bivariate function satisfying some assumption (see Assumption [Loc16]). It is also proved there that the convex envelope of any bivariate function over a polytope satisfying the assumption is characterized by a polyhedral subdivision; however, the form of the envelope is determined implicitly.
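The univariate example of this section can be checked numerically. The following is a minimal sketch, assuming a uniform grid on $[-2, 3]$ (the grid, the tolerances, and all function names are illustrative choices, not part of the thesis). It verifies two necessary properties of the stated envelope, namely that it underestimates $f$ and that it is convex on the grid; it does not prove maximality.

```python
def f(x):
    """Nonconvex univariate PLQ function from the example of Section 1.2."""
    if x <= 0 or x >= 1:
        return x * x
    return 1 - (x - 1) ** 2


def conv_f(x):
    """Stated convex envelope: x^2 outside (0, 1), the chord y = x inside."""
    if x <= 0 or x >= 1:
        return x * x
    return x


# Illustrative grid on [-2, 3] with step 0.01 (an assumption of this sketch).
xs = [-2 + 0.01 * k for k in range(501)]

# Property 1: conv_f never exceeds f.
is_underestimator = all(f(x) - conv_f(x) >= -1e-12 for x in xs)

# Property 2: nonnegative second differences, a discrete convexity test.
vals = [conv_f(x) for x in xs]
second_diffs = [vals[k - 1] - 2 * vals[k] + vals[k + 1]
                for k in range(1, len(vals) - 1)]
is_convex_on_grid = all(d >= -1e-12 for d in second_diffs)
```

On $(0, 1)$ the gap $f(x) - x = x(1 - x)$ is nonnegative, which is what the first check observes on the grid.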
1.3 Fenchel conjugate
The Fenchel conjugate of a function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is given by
\[
f^*(s) = \sup_{x \in \mathbb{R}^n} \left[ \langle s, x \rangle - f(x) \right]
\]
where $\langle s, x \rangle = s^T x$ is the standard dot product. It is also known as the Legendre-Fenchel Transform or convex conjugate or simply conjugate. It plays a significant role in duality, and computing it is a key step in solving the dual optimization problem [Her16]. For the convex function $f(x) = x^2/2$, the conjugate is the function itself in the dual domain. The mapping between the primal and dual domain for this function is shown in Figure 1.3. For a smooth convex univariate function, the slope of the affine underestimator at a given point in the primal domain maps to a point in the dual domain, while the intercept multiplied by $-1$ becomes the value of the conjugate at that slope and vice-versa.
A method to compute the conjugate known as the fast Legendre transform was first introduced in [Bre89] and its applications have been studied in [Cor96, Luc96]. A linear time algorithm was later introduced by Lucet to compute the discrete Legendre transform [Luc97].
Figure 1.3: $f(x) = x^2/2$, its conjugate $f^*(s) = s^2/2$, and the mapping between the two
[Left panel: primal domain (x, y), with the line y = 1*x - 0.5. Right panel: dual domain (s, r), with the line r = 1*s - 0.5.]
Computation of the conjugate of convex univariate PLQ functions has been well studied in the literature, and linear time algorithms have been developed in [GL13, GJL14] for both the PLQ and the GPH models. Recently, a linear time algorithm to compute the conjugate of convex bivariate PLQ functions using an entity graph (storing the entities of the domain as a graph) has been proposed in [HL18].
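As an illustration of the definition, the conjugate of $f(x) = x^2/2$ can be approximated by replacing the supremum with a maximum over a grid. The sketch below is a brute-force discrete transform with quadratic cost, not the linear-time algorithm of [Luc97]; the grid bounds, step sizes, and the name `discrete_conjugate` are assumptions made for this example.

```python
def discrete_conjugate(points, values, slopes):
    """Brute-force discrete conjugate: f*(s) ~ max_i (s * x_i - f(x_i))."""
    return [max(s * x - v for x, v in zip(points, values)) for s in slopes]


# Sample f(x) = x^2 / 2 on [-3, 3] with step 0.01 (illustrative grid).
xs = [-3 + 0.01 * k for k in range(601)]
fs = [0.5 * x * x for x in xs]

# Evaluate the conjugate at slopes -2, -1.5, ..., 2, well inside the range
# of f' on the grid so that the maximizer x = s is not cut off.
ss = [-2 + 0.5 * k for k in range(9)]
conj = discrete_conjugate(xs, fs, ss)

# Compare with the closed form f*(s) = s^2 / 2.
max_err = max(abs(c - 0.5 * s * s) for c, s in zip(conj, ss))
```

Since the maximizer of $sx - x^2/2$ is $x = s$, the discretization error is at most half the squared distance from $s$ to the nearest grid point, so `max_err` stays small for this grid.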
1.4 Motivation
Let $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ be a piecewise function, that is,
\[
f(x) = \begin{cases} f_1(x) & x \in P_1 \\ f_2(x) & x \in P_2 \\ \quad\vdots & \quad\vdots \\ f_N(x) & x \in P_N. \end{cases}
\]
From [HUL93, Theorem 2.4.1], we have
\[
\left( \inf_i f_i \right)^* = \sup_i f_i^*. \tag{1.1}
\]
Figure 1.4: Closed convex envelope (shown in Blue) of each piece of a univariate nonconvex PLQ function
Also, from [HUL93, Proposition 2.6.1],
\[
\operatorname{conv}\Big( \inf_i (f_i + I_{P_i}) \Big) = \operatorname{conv}\Big( \inf_i \big[ \operatorname{conv}(f_i + I_{P_i}) \big] \Big) \tag{1.2}
\]
where $I_{P_i}$ is the indicator function for $P_i$. So
\begin{align*}
\operatorname{conv}\Big( \inf_i (f_i + I_{P_i}) \Big) &= \operatorname{conv}\Big( \inf_i \big[ \operatorname{conv}(f_i + I_{P_i}) \big] \Big) && \text{using (1.2)} \\
&= \Big( \Big( \inf_i \big[ \operatorname{conv}(f_i + I_{P_i}) \big] \Big)^* \Big)^* \\
&= \Big( \sup_i \big[ \operatorname{conv}(f_i + I_{P_i}) \big]^* \Big)^* && \text{using (1.1).}
\end{align*}
Hence the closed convex envelope of a PLQ function can be computed in four steps:
Step 1 Compute the convex envelope of each piece
Step 2 Compute the conjugate of the convex envelope of each piece
Step 3 Compute the maximum of all the conjugates
Step 4 Compute the conjugate of the function obtained in Step 3 to get the biconjugate (or the closed convex envelope) of the original function.
Note that computing the closed convex envelope of each piece does not result in the closed convex envelope of the whole function. This is illustrated in Figure 1.4. In this thesis, we begin with computing the convex envelope of each piece using the method proposed in [Loc16] in Chapter 3. After that, we compute the conjugate of each piece in Chapter 4. Chapter 5 covers the algorithmic design and examples for computing the conjugate of a convex rational function over a polytope. Steps 3 and 4 are discussed in Chapter 6 as future work.
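The four steps can be tried on the univariate example of Section 1.2 with a brute-force discrete conjugate. This is only a sketch under stated assumptions: the domain is truncated to $[-2, 3]$, the slopes are sampled on $[-5, 7]$, and the helper names `grid` and `discrete_conjugate` are hypothetical. Step 1 is implicit here, because a function and its closed convex envelope have the same conjugate, so conjugating the sampled piece directly gives the result of Step 2.

```python
def discrete_conjugate(points, values, slopes):
    """Brute-force discrete conjugate: g(s) = max_i (s * x_i - f(x_i))."""
    return [max(s * x - v for x, v in zip(points, values)) for s in slopes]


def grid(lo, hi, n):
    """n + 1 equally spaced points on [lo, hi]."""
    return [lo + (hi - lo) * k / n for k in range(n + 1)]


# The three pieces of the example, each sampled on its own (closed) interval.
pieces = [
    (grid(-2, 0, 200), lambda x: x * x),
    (grid(0, 1, 100), lambda x: 1 - (x - 1) ** 2),
    (grid(1, 3, 200), lambda x: x * x),
]
slopes = grid(-5, 7, 1200)

# Steps 1-2: conjugate of each piece (equal to that of its convex envelope).
piece_conjugates = [
    discrete_conjugate(pts, [q(x) for x in pts], slopes) for pts, q in pieces
]

# Step 3: pointwise maximum of all the conjugates, as in (1.1).
max_conjugate = [max(vals) for vals in zip(*piece_conjugates)]

# Step 4: conjugate of the maximum gives the biconjugate.
xs = grid(-2, 3, 500)
biconjugate = discrete_conjugate(slopes, max_conjugate, xs)

# Compare with the closed-form envelope of Section 1.2.
envelope = [x if 0 < x < 1 else x * x for x in xs]
max_err = max(abs(b - e) for b, e in zip(biconjugate, envelope))
```

On this grid the biconjugate agrees with the envelope ($x^2$ outside $(0, 1)$ and $x$ inside) up to rounding, whereas the per-piece envelopes alone would not.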
Chapter 2
Preliminaries and Notations
This chapter covers the basic background needed to understand the rest of the thesis.
Definition 2.1. Convex Set. A set $S$ is convex if $\forall x, y \in S$ and $\lambda \in [0, 1]$, $\lambda x + (1 - \lambda) y \in S$.
Definition 2.2. Cardinality of a finite set. Cardinality of a finite set is the number of elements in the set. For example, let $A = \{1, 5, 7\}$, then its cardinality $|A| = 3$.
Definition 2.3. Convex Function. A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is convex if and only if $\forall x, y \in \operatorname{dom}(f) = \{x : f(x) \in \mathbb{R}\}$ and $\lambda \in (0, 1)$,
\[
f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y).
\]
It is strictly convex when $f(\lambda x + (1 - \lambda) y) < \lambda f(x) + (1 - \lambda) f(y)$ for $x, y \in \operatorname{dom}(f)$, $x \ne y$ and $\lambda \in (0, 1)$.
Definition 2.4. Lower Semi-continuous/Closed function [HUL13, Definition 3.2.1]. A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is said to be closed if
\[
\liminf_{x \to x_0} f(x) \ge f(x_0) \quad \text{for all } x_0 \in \mathbb{R}^n.
\]
For any non closed function $g : \mathbb{R}^n \to \mathbb{R} \cup \{-\infty, +\infty\}$, we denote by $\operatorname{cl}(g)$ the closed envelope of $g$, that is, the largest closed function that is lower than $g$. We denote the set of closed convex functions on $\mathbb{R}^n$ as $\operatorname{conv} \mathbb{R}^n$. All the functions involved from here on are in $\operatorname{conv} \mathbb{R}^n$.
Definition 2.5. Convex Envelope. The convex envelope of a function $f$ over a region $P$, denoted $\operatorname{conv} f_P$, is the pointwise supremum of all the convex underestimators of $f$ over $P$. It is defined as
\[
\operatorname{conv} f_P(x, y) = \sup \left\{ c \;:\; f(x', y') - [a(x' - x) + b(y' - y) + c] \ge 0, \ \forall (x', y') \in P \right\}.
\]
The optimal value (a∗, b∗, c∗) defines the supporting hyperplane for the function f at a point (x, y) ∈ P .
Definition 2.6. Symmetric Matrix. A symmetric matrix, $A$, is a square matrix which is equal to its transpose, that is, $A = A^T$. The set of symmetric matrices is denoted by $S$.
Definition 2.7. Positive Semidefinite Matrix. An $n \times n$ symmetric matrix $M$ is called positive semidefinite if for any $x \ne 0 \in \mathbb{R}^n$, $x^T M x \ge 0$. When $x^T M x > 0$, then $M$ is a positive definite matrix.
Definition 2.8. Indefinite Matrix. A symmetric matrix, $M$, is called indefinite if it contains at least two nonzero eigenvalues of opposite signs.
For a function $f \in C^2$, an indefinite Hessian signifies $f$ is nonconvex.
Definition 2.9. Halfspace. A halfspace is an unbounded region formed by the set of points satisfying a linear inequality. Mathematically,
\[
H_s = \{x \in \mathbb{R}^n : a^T x \le b\}
\]
where $a \in \mathbb{R}^n$, $a \ne 0$ and $b \in \mathbb{R}$.
Definition 2.10. Hyperplane (Affine). A hyperplane is formed by the set of points satisfying a linear equality. Mathematically,
\[
H_p = \{x \in \mathbb{R}^n : a^T x = b\}
\]
where $a \in \mathbb{R}^n$, $a \ne 0$, $b \in \mathbb{R}$.
Definition 2.11. Polytope. A polytope $P$ is a bounded region formed by the intersection of a finite number of halfspaces. It can be written as
\[
P = \{x \in \mathbb{R}^n : Ax \le b\},
\]
where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$ and $m$ is the number of halfspaces. A polytope is always convex.
A polytope in $\mathbb{R}^2$ can be divided into three entities, namely vertices, edges, and the interior. A vertex is formed by the intersection of edges. An edge is a line segment on the boundary of the polytope joining two vertices, while the interior of a polytope $P$ can be written as $\operatorname{int}(P) = \{x \in \mathbb{R}^n : Ax < b\}$.
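The halfspace representation of Definition 2.11 translates directly into a membership test. A minimal sketch follows, using the unit square as a hypothetical polytope (the example data, the tolerance, and the name `in_polytope` are assumptions, not part of the thesis).

```python
def in_polytope(A, b, x, strict=False, tol=1e-12):
    """Test A x <= b row by row (A x < b when strict=True, i.e. the interior)."""
    for row, bi in zip(A, b):
        lhs = sum(a * xi for a, xi in zip(row, x))
        if strict:
            if not lhs < bi - tol:
                return False
        elif not lhs <= bi + tol:
            return False
    return True


# Hypothetical example: the unit square [0, 1] x [0, 1] as four halfspaces
#   -x1 <= 0,  x1 <= 1,  -x2 <= 0,  x2 <= 1.
A = [[-1, 0], [1, 0], [0, -1], [0, 1]]
b = [0, 1, 0, 1]

inside = in_polytope(A, b, (0.5, 0.5))                       # interior point
on_edge = in_polytope(A, b, (1.0, 0.5))                      # boundary point
edge_in_interior = in_polytope(A, b, (1.0, 0.5), strict=True)
outside = in_polytope(A, b, (1.5, 0.5))
```

A point on an edge satisfies $Ax \le b$ but fails the strict test, which matches the description of $\operatorname{int}(P)$ above.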
Figure 2.1: A Polyhedral subdivision of $\mathbb{R}^2$
Definition 2.12. Polyhedral Subdivision. A convex set $R = \bigcup_{i \in \{1, \ldots, m\}} R_i$, $R \subseteq \mathbb{R}^2$, defined as the union of a finite number of polyhedral regions is said to have a polyhedral subdivision if for any $j, k \in \{1, \cdots, m\}$, $j \ne k$, $R_j \cap R_k$ is either $\phi$ or is contained in a line.
Figure 2.1 shows an example of a polyhedral subdivision of $\mathbb{R}^2$.
Definition 2.13. Piecewise Linear-Quadratic function. A Piecewise Linear-Quadratic (PLQ) function is characterized by a polyhedral subdivision such that over each member of the subdivision, it has either linear or quadratic form. Mathematically, a PLQ function is
\[
f(x) = \begin{cases}
\frac{1}{2} x^T A_1 x + b_1^T x + \delta_1 & x \in P_1 \\
\frac{1}{2} x^T A_2 x + b_2^T x + \delta_2 & x \in P_2 \\
\quad\vdots & \quad\vdots \\
\frac{1}{2} x^T A_N x + b_N^T x + \delta_N & x \in P_N
\end{cases}
\]
where $x \in \mathbb{R}^n$, $A_i \in S^{n \times n}$, $b_i \in \mathbb{R}^n$, $\delta_i \in \mathbb{R}$ and $P_i$ are polyhedral sets.
Definition 2.14. Legendre-Fenchel transform. For a function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$, its Legendre-Fenchel transform (LFT), $f^*$, is defined as
\[
f^*(s) = \sup_{x \in \mathbb{R}^n} \left[ \langle s, x \rangle - f(x) \right]
\]
where $\langle s, x \rangle = s^T x$ is the standard dot product.
Definition 2.15. Biconjugate. For a function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$, its biconjugate is given as $(f^*)^*$.
When $f$ is convex, $(f^*)^* = \operatorname{cl}(f)$; otherwise $(f^*)^*$ is the closed convex envelope of $f$.
Definition 2.16. Cone. A set $C$ is a cone if it contains 0 and for every $x \in C$, $\lambda x \in C$ for all $\lambda \ge 0$.
Definition 2.17. Relative Interior [HUL13, Definition 2.1.1]. The relative interior of a convex set $C \subset \mathbb{R}^n$, denoted by $\operatorname{ri}(C)$, is defined as the interior within the affine hull of $C$. Mathematically, $x \in \operatorname{ri}(C)$ if and only if $x \in \operatorname{aff}(C)$ and $\exists \delta > 0$ such that $\operatorname{aff}(C) \cap B(x, \delta) \subset C$, where $\operatorname{aff}(C)$ is the affine hull of $C$ and $B(x, \delta)$ is a ball of radius $\delta$ centered on $x$.
Definition 2.18. Normal Cone. For a set $P$, the normal cone at a point $x \in P$ is given by
\[
N_P(x) = \{s : \langle s, y - x \rangle \le 0, \text{ for all } y \in P\}.
\]
For all $x \in \operatorname{int}(P)$, $N_P(x) = \{0\}$. When $P$ is a 2D polytope contained in the halfspace $\{x : L(x) \le 0\}$ with $L(x) = a x_1 + b x_2 + c$, for any edge $E = \{x : L(x) = 0, \ l_1 \le x_1 \le u_1\}$ between vertices $l$ and $u$, the normal cone at any $x \in \operatorname{ri}(E)$ is $N_P(x) = \{s : s = \lambda \nabla L(x), \ \lambda \ge 0\}$.
Definition 2.19. Subdifferential of a function. The subdifferential of a function $f$ at any $x \in \operatorname{dom}(f)$, denoted by $\partial f(x)$, is
\[
\partial f(x) = \{s : f(y) \ge f(x) + \langle s, y - x \rangle, \text{ for all } y \in \operatorname{dom}(f)\}.
\]
Every element of $\partial f(x)$ is called a subgradient of $f$ at $x$. When $f$ is differentiable at $x$, then $\partial f(x) = \{\nabla f(x)\}$.
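For a polytope, the condition in the definition of the normal cone only needs to be checked at the vertices, since a linear function attains its maximum over a polytope at a vertex. The sketch below uses this fact on the unit square, a hypothetical example; the name `in_normal_cone` and the tolerance are assumptions made for illustration.

```python
def in_normal_cone(s, x, vertices, tol=1e-12):
    """True if <s, v - x> <= 0 for every vertex v of the polytope."""
    return all(
        sum(si * (vi - xi) for si, vi, xi in zip(s, v, x)) <= tol
        for v in vertices
    )


# Hypothetical polytope: the unit square, given by its vertices.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# At the vertex (1, 1) the normal cone is the nonnegative quadrant.
at_vertex = in_normal_cone((1.0, 2.0), (1.0, 1.0), square)
not_at_vertex = in_normal_cone((1.0, -1.0), (1.0, 1.0), square)

# In the relative interior of the edge x1 = 1, only multiples of (1, 0) qualify.
on_edge = in_normal_cone((2.0, 0.0), (1.0, 0.5), square)
off_edge = in_normal_cone((2.0, 0.1), (1.0, 0.5), square)

# In the interior, the normal cone reduces to {0}.
interior_zero = in_normal_cone((0.0, 0.0), (0.5, 0.5), square)
interior_nonzero = in_normal_cone((0.1, 0.0), (0.5, 0.5), square)
```

The edge case agrees with the formula $N_P(x) = \{\lambda \nabla L(x), \lambda \ge 0\}$ for $L(x) = x_1 - 1$, whose gradient is $(1, 0)$.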
Definition 2.20. Parabola (or Parabolic curve). A parabola is a two dimensional planar curve whose points $(x, y)$ satisfy the equation $ax^2 + bxy + cy^2 + dx + ey + f = 0$ where $b^2 - 4ac = 0$. A line is a parabola with $a = b = c = 0$.
Definition 2.21. Parabolic region. A parabolic region, $P_r \subset \mathbb{R}^2$, is formed by the intersection of a finite number of parabolic inequalities. It can be defined as
\[
P_r = \{x \in \mathbb{R}^2 : a_i x_1^2 + b_i x_1 x_2 + c_i x_2^2 + d_i x_1 + e_i x_2 + f_i \le 0, \ i \in \{1, \cdots, k\}, \ b_i^2 - 4 a_i c_i = 0\}.
\]
For $C_p(x) = a x_1^2 + b x_1 x_2 + c x_2^2 + d x_1 + e x_2 + f$ with $b^2 - 4ac = 0$, the set $P_r = \{x \in \mathbb{R}^2 : C_p(x) \le 0\}$ is convex, but $P_s = \{x \in \mathbb{R}^2 : C_p(x) \ge 0\}$ is not.
Definition 2.22. Parabolic subdivision. A convex set $R = \bigcup_{i \in \{1, \ldots, m\}} R_i$, $R \subseteq \mathbb{R}^2$, defined as the union of a finite number of parabolic regions is said to have a parabolic subdivision if for any $j, k \in \{1, \cdots, m\}$, $j \ne k$, $R_j \cap R_k$ is either $\phi$ or is contained in a parabola.
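A parabolic region can be stored as a list of coefficient tuples $(a, b, c, d, e, f)$, one per inequality. The following is a minimal sketch of that idea; the tuple layout, the function names, and the example region $\{x : x_1^2 \le x_2 \le 1\}$ are assumptions made for illustration and are not the data structure of Chapter 5.

```python
def conic_value(coeffs, x):
    """Evaluate a*x1^2 + b*x1*x2 + c*x2^2 + d*x1 + e*x2 + f at x = (x1, x2)."""
    a, b, c, d, e, f = coeffs
    x1, x2 = x
    return a * x1 * x1 + b * x1 * x2 + c * x2 * x2 + d * x1 + e * x2 + f


def is_parabolic(coeffs, tol=1e-12):
    """Check the discriminant condition b^2 - 4ac = 0 (lines included)."""
    a, b, c = coeffs[:3]
    return abs(b * b - 4 * a * c) <= tol


def in_parabolic_region(constraints, x, tol=1e-12):
    """Membership test for the intersection of parabolic inequalities."""
    if not all(is_parabolic(k) for k in constraints):
        raise ValueError("every constraint must satisfy b^2 - 4ac = 0")
    return all(conic_value(k, x) <= tol for k in constraints)


# Hypothetical region: x1^2 - x2 <= 0 (a parabola) and x2 - 1 <= 0 (a line).
region = [(1, 0, 0, 0, -1, 0), (0, 0, 0, 0, 1, -1)]

inside = in_parabolic_region(region, (0.0, 0.5))
outside_parabola = in_parabolic_region(region, (2.0, 0.5))
outside_line = in_parabolic_region(region, (0.0, 2.0))
circle_is_parabolic = is_parabolic((1, 0, 1, 0, 0, -1))  # x1^2 + x2^2 - 1
```

The circle fails the discriminant test ($b^2 - 4ac = -4$), so it cannot appear as a constraint of a parabolic region.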
Chapter 3
Convex Envelope of Bivariate Quadratic Functions over a polytope
Computing the closed convex envelope of a 2D PLQ function requires computation of the closed convex envelope of each quadratic piece. For this purpose, we utilize the method proposed in [Loc16]. We start with problem formulation, reduce it to subproblems, and then obtain the optimal solution for each. The closed convex envelope then comes out to be the maximum of all such solutions over the given polytope. In this chapter, we formulate the mathematical form of the closed convex envelope of each piece of a bivariate nonconvex PLQ function using [Loc16]. After that, we introduce the data structures and algorithms involving various 2D geometry manipulations to compute the closed convex envelope of a bivariate nonconvex quadratic function. Before we present the method proposed in [Loc16] and our algorithmic approach, let us discuss a technique to reduce the general expression of any nonconvex 2D quadratic function to a sum of bilinear and affine functions.
3.1 Quadratic Reduction
Using linear algebra and eigenvalue decomposition, we can reduce any quadratic function to a sum of a bilinear function and an affine function. This reduction is required to simplify our process of solving the subproblems, which, otherwise, would have led to solving quartic equations in most of the cases. The other benefit of this reduction is that solving the subproblems only for the bilinear term, as compared to the general quadratic form, reduces the size, complexity, and number of special cases to be dealt with when creating an algorithmic solution for the problem.
A general quadratic expression in $\mathbb{R}^2$ can be written as
\[
f(x) = x^T A x + b^T x + \delta \tag{3.1}
\]
Table 3.1: Different cases depending upon the eigenvalues
| Eigenvalues | Quadratic function type | Convex envelope over a polytope |
|---|---|---|
| λ1 ≥ 0, λ2 ≥ 0 | convex | Function itself |
| λ1 > 0, λ2 > 0 | strictly convex | Function itself |
| λ1 ≤ 0, λ2 ≤ 0 | concave | Piecewise Linear |
| λ1 < 0, λ2 < 0 | strictly concave | Piecewise Linear |
| λ1 > 0, λ2 < 0 | nonconvex and nonconcave | Computed using [Loc16] |
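where $A \in S^{2 \times 2}$, $b \in \mathbb{R}^2$, $\delta \in \mathbb{R}$ and $S$ is the set of all symmetric matrices (see Definition 2.6). When $f$ is neither convex nor concave, $A$ is an indefinite matrix.
Let $f_1(x) = x^T A x$ and $f_2(x) = b^T x + \delta$. From [HUL13, p. 71], we have
\[
\operatorname{conv} f_1 + \operatorname{conv} f_2 \le \operatorname{conv}(f_1 + f_2),
\]
and because $f_2$ is affine,
\[
\operatorname{conv} f_1 + \operatorname{conv} f_2 = \operatorname{conv}(f_1 + f_2).
\]
An affine function is already convex, so
\[
\operatorname{conv} f = \operatorname{conv} f_1 + b^T x + \delta. \tag{3.2}
\]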
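For $f_1(x) = x^T A x$, by using eigenvalue decomposition
\[
A = U^T \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} U, \tag{3.3}
\]
where $\lambda_1$ and $\lambda_2$ are the eigenvalues, and $U$ is the matrix of eigenvectors of $A$. Let $\tilde{x} = Ux$, then
\[
f_1(U^{-1} \tilde{x}) = g_1(\tilde{x}) = \tilde{x}^T \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \tilde{x}
\]
or
\[
g_1(\tilde{x}) = \lambda_1 \tilde{x}^T \begin{bmatrix} 1 & 0 \\ 0 & -\alpha \end{bmatrix} \tilde{x}, \tag{3.4}
\]
with $\alpha = -\lambda_2 / \lambda_1$. When $f(x)$ is convex, $\alpha \le 0$. Since we are interested in the case when $f(x)$ is neither convex nor concave, $\lambda_1$ and $\lambda_2$ carry different signs, hence $\alpha > 0$. Table 3.1 enumerates the possible cases based on the eigenvalues. Now
\[
\begin{bmatrix} 1 & 0 \\ 0 & -\alpha \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 1 & 1 \\ \sqrt{\alpha} & -\sqrt{\alpha} \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ \sqrt{\alpha} & -\sqrt{\alpha} \end{bmatrix}^T. \tag{3.5}
\]
Let $\bar{x} = \begin{bmatrix} 1 & 1 \\ \sqrt{\alpha} & -\sqrt{\alpha} \end{bmatrix}^T \tilde{x}$, so by using (3.5) in (3.4), we get
\[
h_1(\bar{x}) = \frac{\lambda_1}{2} \bar{x}^T \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \bar{x}.
\]
Hence for $\bar{x} = [\bar{x}_1 \ \bar{x}_2]^T$,
\[
h_1(\bar{x}) = \lambda_1 \bar{x}_1 \bar{x}_2 \tag{3.6}
\]
which is our bilinear term.
All the linear transformations performed in (3.3)-(3.6) are invertible, as $U$ is invertible and the matrix $\begin{bmatrix} 1 & 1 \\ \sqrt{\alpha} & -\sqrt{\alpha} \end{bmatrix}$ is clearly invertible with inverse $\frac{1}{2\sqrt{\alpha}} \begin{bmatrix} \sqrt{\alpha} & 1 \\ \sqrt{\alpha} & -1 \end{bmatrix}$.
Proposition 3.1. For a bivariate nonlinear quadratic function $f$ of the form (3.1) and a polytope $P$, let $f(x) = f_1(x) + f_2(x)$ with $f_1(x) = x^T A x$ and $f_2(x) = b^T x + \delta$. Then for $\lambda_1 > 0$ and $h_1(x) = x_1 x_2$, $\operatorname{conv} f_P = \lambda_1 \operatorname{conv} h_{1P} \circ M^{-1} + \operatorname{conv} f_{2P}$, where $M$ is a $2 \times 2$ invertible matrix.
Proof. From [HUL13, p. 71], for any function $f$ and $\lambda > 0$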
\[
\operatorname{conv} \lambda f = \lambda \operatorname{conv} f.
\]
So for (3.6), $\operatorname{conv} f_1 = \lambda_1 \operatorname{conv} h_1$. The value $\lambda_1 > 0$ can be ensured by always assigning the positive eigenvalue to $\lambda_1$.
Now, for $M = \begin{bmatrix} 1 & 1 \\ \sqrt{\alpha} & -\sqrt{\alpha} \end{bmatrix}^T U$, $h_1(\bar{x}) = f_1(M \circ x)$. So
\[
\operatorname{conv} f_1 = \operatorname{conv} h_1 \circ M^{-1}. \tag{3.7}
\]
Finally, (3.2) becomes
\[
\operatorname{conv} f = \lambda_1 \operatorname{conv} h_1 \circ M^{-1} + b^T x + \delta.
\]
Hence the closed convex envelope of any nonconvex quadratic function $f$ defined over a polytope $P$ can be obtained by
\[
\operatorname{conv} f_P = \lambda_1 \operatorname{conv} h_{1P} \circ M^{-1} + \operatorname{conv} f_{2P}.
\]
Example 3.2. For the quadratic function $f(x) = x^T \begin{bmatrix} 3 & -1 \\ -1 & -5 \end{bmatrix} x + [2, -2]\, x + 4$, its quadratic term gets reduced to $h_1(\bar{x}) \approx 3.123 \cdot \bar{x}_1 \bar{x}_2$ with transformation matrix $M \approx \begin{bmatrix} -0.8360 & -1.1490 \\ 1.3934 & -1.1490 \end{bmatrix}$.
Since the problem of computing the closed convex envelope of a quadratic function reduces to computing the closed convex envelope of a bilinear function, from now on we focus on formulating the problem, its solutions, and the algorithmic design for the bilinear function only. Also, for consistency of this chapter with [Loc16], we use variables $x, y \in \mathbb{R}$ instead of $x = [x_1, x_2] \in \mathbb{R}^2$ in the algebraic expressions.
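The reduction of this section can be sketched in a few lines for a $2 \times 2$ symmetric indefinite matrix, using the closed-form eigendecomposition. The function name `bilinear_reduction` and the sign and ordering conventions for the eigenvectors are assumptions of this sketch, so the matrix it returns may differ from the $M$ printed in Example 3.2 by ordering or sign while giving the same product $\bar{x}_1 \bar{x}_2$.

```python
import math


def bilinear_reduction(A):
    """For symmetric indefinite A = [[p, q], [q, r]], return (lam1, M) such that
    x^T A x = lam1 * (M x)_1 * (M x)_2 with lam1 > 0."""
    (p, q), (_, r) = A
    mean, half_diff = (p + r) / 2, (p - r) / 2
    root = math.hypot(half_diff, q)
    lam1, lam2 = mean + root, mean - root          # lam1 >= lam2
    if not lam1 > 0 > lam2:
        raise ValueError("A must be indefinite")
    # Unit eigenvector u1 for lam1; u2 is orthogonal to it (eigenvector for lam2).
    if q != 0:
        u1 = (q, lam1 - p)
    else:
        u1 = (1.0, 0.0) if p > r else (0.0, 1.0)
    norm = math.hypot(u1[0], u1[1])
    u1 = (u1[0] / norm, u1[1] / norm)
    u2 = (-u1[1], u1[0])
    # x_tilde = (u1.x, u2.x); x_bar = (x_tilde1 + sqrt(alpha) x_tilde2,
    #                                  x_tilde1 - sqrt(alpha) x_tilde2).
    ra = math.sqrt(-lam2 / lam1)
    M = [[u1[0] + ra * u2[0], u1[1] + ra * u2[1]],
         [u1[0] - ra * u2[0], u1[1] - ra * u2[1]]]
    return lam1, M


def apply(M, x):
    """Matrix-vector product for a 2 x 2 matrix."""
    return (M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1])


# Quadratic term of Example 3.2.
lam1, M = bilinear_reduction([[3.0, -1.0], [-1.0, -5.0]])
```

The identity behind the code is $\bar{x}_1 \bar{x}_2 = \tilde{x}_1^2 - \alpha \tilde{x}_2^2$, so $\lambda_1 \bar{x}_1 \bar{x}_2 = \lambda_1 \tilde{x}_1^2 + \lambda_2 \tilde{x}_2^2 = x^T A x$. For the matrix of Example 3.2, `lam1` equals $\sqrt{17} - 1$, consistent with the reported coefficient of about 3.123.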
3.2 Problem Formulation[Loc16]
In this section, we will reformulate the problem of computing the closed convex envelope into a group of subproblems as presented in [Loc16]. The maximum of the solutions of such subproblems, when computed over the given polytope, leads to a polyhedral subdivision of the polytope such that over each such division, the closed convex envelope belongs to a particular class of bivariate rational functions.
3.2.1 Problem Formulation
This section covers a new formulation of the closed convex envelope problem presented before in Definition 2.5. The closed convex envelope of any function $f$ at a point $(x_0, y_0) \in P$, where $P \subset \mathbb{R}^2$ is a convex set, is given by an optimization problem in $a, b, c$ with an infinite number of linear constraints as
\[
\begin{aligned}
\operatorname{conv} f_P(x_0, y_0) = \max \ & c \\
& f(x, y) - [a(x - x_0) + b(y - y_0) + c] \ge 0 \quad \forall (x, y) \in P.
\end{aligned}
\]
The optimal solution $(a^*, b^*, c^*)$ of this problem defines the supporting hyperplane to the closed convex envelope of $f$ at point $(x_0, y_0)$ over $P$.
The infinite number of constraints can be replaced with a single constraint as
\[
\begin{aligned}
\operatorname{conv} f_P(x_0, y_0) = \max \ & c \\
& \min_{(x, y) \in P} f(x, y) - [a(x - x_0) + b(y - y_0)] \ge c.
\end{aligned}
\]
Let us introduce the following assumption,
2 2 Assumption 3.2.1. For a polytope P ⊂ R and function f ∈ C , − the Hessian of f is indefinite in the interior of P , and
− along each edge of P , f is either strictly convex or concave.
This first part of the assumption is sufficiently reasonable as we want to compute the closed convex envelope of a function which is neither convex nor concave. However the second condition is restrictive and requires careful selection of the polytope P , but it plays a crucial role in the new formula- tion of the convex envelope problem. Most of the 2D nonconvex nonlinear quadratic functions over any polytope P satisfy these assumptions. For the quadratic functions that do not statisfy these assumptions, a change in the polytope can be used to make them valid under our assumptions. Now under the above assumptions, the minimum of f(x, y)−[a(x−x0)+ b(y − y0)] ≥ c will always be attained either at a vertex of P or over an edge such that the constriction of f along the edge is a strictly convex function. So the problem of computing the closed convex envelope of f over a polytope P can then be written as
\[
\begin{aligned}
\operatorname{conv} f_P(x, y) = \max_{a, b, c}\ & c \\
\text{s.t.}\ & f(x_{v_i}, y_{v_i}) - [a(x_{v_i} - x) + b(y_{v_i} - y)] \ge c, && \forall i \in V'(P), \\
& \min_{(x', y') \in [v_{j_1}, v_{j_2}]} \{ f(x', y') - [a(x' - x) + b(y' - y)] \} \ge c, && \forall j \in E'(P),
\end{aligned}
\tag{3.8}
\]
where
− $E'(P)$ is the index set of edges where $f$ is strictly convex,
− $\forall j \in E'(P)$, vertices $v_{j_1}$ and $v_{j_2}$ are the end points of the edge,
− $V'(P)$ is the set of all the remaining vertex indices not belonging to $E'(P)$,
− $\forall i \in V'(P)$, $(x_{v_i}, y_{v_i})$ are the coordinates of the corresponding vertex.
Example 3.3. Let $f(x, y) = xy$ and $P$ be a polytope with vertices
v1 = (0, 0), v2 = (1, 2), v3 = (2, 2), v4 = (2, 1).
The lines constituting the boundary of the polytope are $e_1 := y = 2x$ from $v_1$ to $v_2$, $e_2 := y = 2$ from $v_2$ to $v_3$, $e_3 := x = 2$ from $v_3$ to $v_4$, and $e_4 := y = x/2$ from $v_4$ to $v_1$. Since $f$ is strictly convex only along the lines $y = 2x$ and $y = x/2$, we have $E'(P) = \{1, 4\}$ and $V'(P) = \{3\}$.
A detailed formulation of the above steps has been presented in [LS14, Sec. 2]. In the above formulation, each constraint belongs to either an edge or a vertex of the polytope $P$. However, the constraint belonging to an edge still remains a minimization problem in itself.
For $j \in E'(P)$, a convex edge lying on the line $y = m_j x + q_j$, let $f_j(x') = f(x', m_j x' + q_j)$ be the restriction of $f$ along the edge. The minimization problem related to this edge can then be written as
\[
\min_{x' \in [x_{v_{j_1}}, x_{v_{j_2}}]} \{ f_j(x') - [a(x' - x) + b(m_j x' + q_j - y)] \} \ge c. \tag{3.9}
\]
Let $s_j(a, b) = \arg\min\{(3.9)\}$. If the function is decreasing (resp. increasing), $s_j(a, b) = +\infty$ (resp. $-\infty$), otherwise
\[
f_j'(s_j(a, b)) = a + b m_j. \tag{3.10}
\]
Since $f$ is strictly convex along $[x_{v_{j_1}}, x_{v_{j_2}}]$, we have $f_j'(x_{v_{j_2}}) \ge f_j'(x_{v_{j_1}})$. The minimum will lie either inside the interval or at one of its end points. Let us define the three regions
\[
\begin{aligned}
D_j^- &= \{(a, b) : a + b m_j \le f_j'(x_{v_{j_1}})\}, \\
D_j &= \{(a, b) : f_j'(x_{v_{j_1}}) \le a + b m_j \le f_j'(x_{v_{j_2}})\}, \\
D_j^+ &= \{(a, b) : a + b m_j \ge f_j'(x_{v_{j_2}})\}.
\end{aligned}
\tag{3.11}
\]
The $\arg\min\{(3.9)\}$ is
\[
x_j(a, b) =
\begin{cases}
x_{v_{j_1}} & \text{if } (a, b) \in D_j^-, \\
s_j(a, b) & \text{if } (a, b) \in D_j, \\
x_{v_{j_2}} & \text{if } (a, b) \in D_j^+.
\end{cases}
\tag{3.12}
\]
Correspondingly,
\[
y_j(a, b) = m_j x_j(a, b) + q_j.
\]
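The case distinction in (3.11) and (3.12) amounts to clamping the stationary point $s_j(a, b)$ to the edge interval. The sketch below illustrates this for the special case where the restriction is a quadratic $f_j(x') = \alpha x'^2 + \beta x' + \gamma$ with $\alpha > 0$ and $x_{v_{j_1}} < x_{v_{j_2}}$. The function name `edge_argmin` and its parameters are hypothetical and only serve the illustration.

```python
def edge_argmin(a, b, m, x_lo, x_hi, alpha, beta):
    """Region label and minimizer x_j(a, b) of (3.9) on the edge y = m*x + q.

    Assumes f_j(x') = alpha*x'**2 + beta*x' + gamma with alpha > 0 and
    x_lo < x_hi (hypothetical parameter names, not from the source).
    """
    slope = a + b * m                  # right-hand side of (3.10)
    d_lo = 2 * alpha * x_lo + beta     # f_j'(x_{v_j1})
    d_hi = 2 * alpha * x_hi + beta     # f_j'(x_{v_j2})
    if slope <= d_lo:                  # (a, b) in D_j^-
        return "D-", x_lo
    if slope >= d_hi:                  # (a, b) in D_j^+
        return "D+", x_hi
    # (a, b) in D_j: s_j(a, b) solves f_j'(s) = a + b*m
    return "D", (slope - beta) / (2 * alpha)


# Edge e1 of Example 3.3: y = 2x on [0, 1], so f_1(x') = 2*x'**2.
region, x = edge_argmin(a=1, b=1, m=2, x_lo=0, x_hi=1, alpha=2, beta=0)
y = 2 * x  # y_j(a, b) = m_j * x_j(a, b) + q_j with q_j = 0
print(region, x, y)
```

Points on the boundary between two regions belong to both in (3.11); the sketch assigns them to the end-point region, which returns the same minimizer.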
Also, for the constraints belonging to the remaining vertex indices $i \in V'(P)$,
\[
x_i(a, b) = x_{v_i} \quad \text{and} \quad y_i(a, b) = y_{v_i}.
\]
So $\forall k \in V'(P) \cup E'(P)$, we define
\[
\eta_k(a, b) = f(x_k(a, b), y_k(a, b)) - a x_k(a, b) - b y_k(a, b). \tag{3.13}
\]
By using (3.13), the closed convex envelope problem (3.8) becomes
\[
\begin{aligned}
\operatorname{conv} f_P(x, y) = \max_{a, b, c}\ & c \\
\text{s.t.}\ & \eta_k(a, b) + ax + by \ge c, \quad \forall k \in V'(P) \cup E'(P).
\end{aligned}
\tag{3.14}
\]
The problem (3.14) is a simplified version of (3.8) in which $\eta_k$ takes one of two forms, depending on whether $k$ indexes a convex edge or a vertex of the polytope.
3.2.2 Expressions for $\eta(a, b)$
The bilinear function $f(x, y) = xy$ is strictly convex on an edge $y = m_j x + q_j$ if $m_j > 0$ (as $f_j''(x) = 2 m_j$, and strict convexity requires $f_j''(x) > 0$). From (3.10), over such an edge
\[
s_j(a, b) = \frac{a + b m_j - q_j}{2 m_j}.
\]
So $\forall j \in E'(P)$,
\[
\eta_j(a, b) =
\begin{cases}
-\dfrac{(a + m_j b - q_j)^2}{4 m_j} - b q_j & (a, b) \in D_j, \\[2ex]
-a x_j(a, b) - b y_j(a, b) + x_j(a, b)\, y_j(a, b) & (a, b) \in D_j^- \cup D_j^+,
\end{cases}
\]
and, $\forall i \in V'(P)$ with vertex $(x_{v_i}, y_{v_i})$,
\[
\eta_i(a, b) = -a x_{v_i} - b y_{v_i} + x_{v_i} y_{v_i}. \tag{3.15}
\]
The derivation of the above expressions is presented in Appendix A.
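As a sanity check on these expressions, $\eta_j$ can be evaluated directly from the definition (3.13) and compared with the closed form on $D_j$. The sketch below does this for the bilinear function only. The helper names `eta_edge`, `eta_edge_closed_form` and `eta_vertex` are illustrative, and the edge $y = x + 1$ on $[0, 2]$ is a made-up test edge chosen to exercise a nonzero $q_j$.

```python
def eta_edge(a, b, m, q, x_lo, x_hi):
    """eta_j(a, b) from (3.13) for f(x, y) = x*y on the edge y = m*x + q, m > 0."""
    s = (a + b * m - q) / (2 * m)      # stationary point s_j(a, b)
    x = min(max(s, x_lo), x_hi)        # clamping reproduces the cases of (3.12)
    y = m * x + q
    return x * y - a * x - b * y


def eta_edge_closed_form(a, b, m, q):
    """Closed form of eta_j, valid only when (a, b) lies in D_j."""
    return -(a + m * b - q) ** 2 / (4 * m) - b * q


def eta_vertex(a, b, xv, yv):
    """eta_i(a, b) from (3.15) for the vertex (xv, yv)."""
    return xv * yv - a * xv - b * yv


# Hypothetical edge y = x + 1 on [0, 2]; (a, b) = (2, 1) lies in D_j
# because f_j'(0) = 1 <= a + b*m = 3 <= f_j'(2) = 5.
direct = eta_edge(2, 1, m=1, q=1, x_lo=0, x_hi=2)
closed = eta_edge_closed_form(2, 1, m=1, q=1)
print(direct, closed)
```

Outside $D_j$ the two functions differ by design, since the closed form no longer corresponds to a point of the edge.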
Example 3.4. From Example 3.3, we have two convex edges: $e_1$ on the line $y = 2x$ from $v_1 = (0, 0)$ to $v_2 = (1, 2)$, and $e_4$ on the line $y = x/2$ from $v_1 = (0, 0)$ to $v_4 = (2, 1)$. For the edge $e_1$,
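\[
\eta_1(a, b) =
\begin{cases}
0 & a + 2b \le 0, \\[1ex]
-\dfrac{(a + 2b)^2}{8} & 0 \le a + 2b \le 4, \\[2ex]
2 - a - 2b & 4 \le a + 2b.
\end{cases}
\]
Similarly, for edge $e_4$,
\[
\eta_2(a, b) =
\begin{cases}
0 & a + \dfrac{b}{2} \le 0, \\[2ex]
-\dfrac{(a + b/2)^2}{2} & 0 \le a + \dfrac{b}{2} \le 2, \\[2ex]
2 - 2a - b & 2 \le a + \dfrac{b}{2}.
\end{cases}
\]
The indexing of the $\eta$ functions does not matter, as our main objective is only to obtain the constraint set. Finally, for vertex $v_3$,
\[
\eta_3(a, b) = 4 - 2a - 2b.
\]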
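With the three $\eta$ functions of this example in hand, problem (3.14) reduces to maximizing $\min_k \eta_k(a, b) + ax + by$ over $(a, b)$. The sketch below approximates that maximum by a coarse grid search. This is a brute-force illustration, not the method of [Loc16], and the grid bounds and step size are arbitrary choices.

```python
def eta1(a, b):
    """Edge e1: y = 2x, x in [0, 1]."""
    t = a + 2 * b
    if t <= 0:
        return 0.0
    if t <= 4:
        return -t * t / 8
    return 2 - a - 2 * b


def eta2(a, b):
    """Edge e4: y = x/2, x in [0, 2]."""
    t = a + b / 2
    if t <= 0:
        return 0.0
    if t <= 2:
        return -t * t / 2
    return 2 - 2 * a - b


def eta3(a, b):
    """Vertex v3 = (2, 2)."""
    return 4 - 2 * a - 2 * b


def envelope_lower_bound(x, y, lo=-1.0, hi=3.0, step=0.01):
    """Grid search over (a, b) for max of min_k eta_k(a, b) + a*x + b*y.

    Every grid point is feasible for (3.14), so the result never exceeds
    the true optimum; lo, hi and step are illustrative settings.
    """
    n = int(round((hi - lo) / step))
    best = float("-inf")
    for i in range(n + 1):
        a = lo + i * step
        for j in range(n + 1):
            b = lo + j * step
            c = min(eta1(a, b), eta2(a, b), eta3(a, b)) + a * x + b * y
            best = max(best, c)
    return best
```

At the point $(x, y) = (1, 1)$, calling `envelope_lower_bound(1.0, 1.0)` should return a value just below $8/9$. That value is attained at $a = b = 8/9$, where the constraints of $\eta_1$ and $\eta_2$ are both active, and it lies below $f(1, 1) = 1$.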
3.2.3 Subproblems
From [Loc16, Lemma 3.1], at any point in the polytope not lying on a strictly convex edge or at a vertex, the solution of problem (3.14) has at least two active constraints. So the optimization problem (3.14) can be reduced to subproblems, one for each pair of $\eta$ functions. Hence, $\forall h, w \in E'(P) \cup V'(P)$ with $h \ne w$,
\[
\begin{aligned}
\operatorname{conv} f_P(x, y) = \max_{h, w}\ \max_{a, b}\ & \eta_h(a, b) + ax + by \\
\text{s.t.}\ & \eta_h(a, b) = \eta_w(a, b), \\
& \eta_h(a, b) \le \eta_k(a, b), \quad k \in E'(P) \cup V'(P) \setminus \{h, w\}.
\end{aligned}
\tag{3.16}
\]
Since each $\eta$ is a piecewise-defined function with a polyhedral subdivision, solving the subproblems for each pair is the same as solving them over each member of the polyhedral subdivision obtained by overlaying the polyhedral subdivisions of all $\eta$ functions. When arranged according to the subdivided region, the subproblems become
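\[
\begin{aligned}
\max_{a, b}\ & \eta_h(a, b) + ax + by \\
\text{s.t.}\ & \eta_h(a, b) = \eta_w(a, b), \\
& \eta_h(a, b) \le \eta_k(a, b), \quad k \in E'(P) \cup V'(P) \setminus \{h, w\}, \\
& (a, b) \in S_r,
\end{aligned}
\tag{3.17}
\]
where
− the index $r$ of the subproblem ranges over $1, \dots, 3^{|E'(P)|}$, where $|E'(P)|$ is the cardinality of the set $E'(P)$ (see Definition 2.2),
− each polyhedral set $S_r$ is defined as
\[
S_r = \bigcap_{j \in E'(P)} \Gamma_j,
\]
where $\Gamma_j \in \{D_j^-, D_j, D_j^+\}$, i.e. $S_r$ ranges over all possible combinations of $D_j^-$, $D_j$, $D_j^+$ for all $j \in E'(P)$.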
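The $3^{|E'(P)|}$ candidate sets $S_r$ can be enumerated as a Cartesian product of the three pieces of each convex edge. The sketch below does this for the two convex edges of Example 3.4, with breakpoints taken from $\eta_1$ and $\eta_2$. The edge order (first $e_4$, then $e_1$) and the piece order ($D^-$, $D$, $D^+$) are assumptions of this sketch, chosen to match the order in which the sets of $S$ are written out in Example 3.5.

```python
from itertools import product

# One entry per convex edge: (slope m_j, f_j'(x_{v_j1}), f_j'(x_{v_j2})).
EDGES = [
    (0.5, 0.0, 2.0),   # e4: a + b/2 is compared with 0 and 2
    (2.0, 0.0, 4.0),   # e1: a + 2b  is compared with 0 and 4
]
PIECES = ("D-", "D", "D+")

# All 3**|E'(P)| combinations of pieces, i.e. the labels of the sets S_r.
REGIONS = list(product(PIECES, repeat=len(EDGES)))


def piece(a, b, m, lo, hi):
    """Piece of one edge containing (a, b); boundary points are assigned to D."""
    t = a + m * b
    if t < lo:
        return "D-"
    if t > hi:
        return "D+"
    return "D"


def region_index(a, b):
    """1-based index r of the set S_r containing the point (a, b)."""
    label = tuple(piece(a, b, m, lo, hi) for m, lo, hi in EDGES)
    return REGIONS.index(label) + 1


print(len(REGIONS), region_index(8 / 9, 8 / 9))
```

With two convex edges whose dividing lines are not parallel, all nine combinations are nonempty. The enumeration alone does not detect empty combinations, which would need a separate feasibility check.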
The set $S = \{S_r\}$ is the collection of all these polyhedral sets, such that over each $S_r$ there is a unique combination of pieces of the $\eta$ functions. Also, $\bigcup_{r \in \{1, \dots, |S|\}} S_r = \mathbb{R}^2$.
Example 3.5. From Example 3.4, the set $S = \{\{a + b/2 \le 0,\ a + 2b \le 0\},\ \{a + b/2 \le 0,\ 0 \le a + 2b \le 4\},\ \{a + b/2 \le 0,\ a + 2b \ge 4\},\ \cdots,\ \{a + b/2 \ge 2,\ a + 2b \ge 4\}\}$. The subproblem defined over the subregion $S_2$ is
\[
\begin{aligned}
\max_{a, b}\ & 0 + ax + by \\
\text{s.t.}\ & 0 = -\frac{(a + 2b)^2}{8}, \\
& 0 \le 4 - 2a - 2b, \\
& (a, b) \in S_2 = \{a + b/2 \le 0,\ 0 \le a + 2b \le 4\}.
\end{aligned}
\]
Similarly, the subproblem defined over the subregion $S_5$ is
\[
\begin{aligned}
\max_{a, b}\ & -\frac{(a + 2b)^2}{8} + ax + by \\
\text{s.t.}\ & -\frac{(a + 2b)^2}{8} = -\frac{(a + b/2)^2}{2}, \\
& -\frac{(a + 2b)^2}{8} \le 4 - 2a - 2b, \\
& (a, b) \in S_5 = \{0 \le a + b/2 \le 2,\ 0 \le a + 2b \le 4\}.
\end{aligned}
\]
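To illustrate how such a subproblem produces a rational expression, the $S_5$ subproblem can be solved by hand. The following is a worked sketch rather than part of the original derivation; the symbols $\varphi$ and $a^*$ are introduced here for illustration only.

```latex
\begin{align*}
-\frac{(a+2b)^2}{8} = -\frac{(a+b/2)^2}{2}
  &\iff (a+2b)^2 = (2a+b)^2
   \iff b = a \ \text{ or } \ b = -a, \\
\text{for } b = a:\quad
\varphi(a) &= -\frac{9a^2}{8} + a\,(x+y), \qquad
\varphi'(a) = 0 \iff a^* = \frac{4(x+y)}{9}, \\
\varphi(a^*) &= \frac{2\,(x+y)^2}{9}.
\end{align*}
```

Inside $S_5$ the branch $b = -a$ forces $a = b = 0$, so only $b = a$ matters. The maximizer $a^*$ lies in $S_5$ when $0 \le x + y \le 3$, and the remaining inequality $-9a^2/8 \le 4 - 4a$ holds for every $a$ because $9a^2 - 32a + 32$ has a negative discriminant. Under these conditions the value of the $S_5$ subproblem is $2(x+y)^2/9$, a squared linear function over a constant, which gives $8/9$ at $(x, y) = (1, 1)$.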
Figure 3.1: η1’s polyhedral domain
3.3 Exponential subproblems
We rearranged the subproblems from (3.16) to (3.17) by using the fact that each $\eta$ belonging to a convex edge is a piecewise function with exactly 3 pieces, each defined over a polyhedral set. Solving the subproblems for each pair $(h, w)$ eventually leads to solving them for each intersecting polyhedral set $S_r$. The index of the subproblems ranges over $\{1, \dots, 3^{|E'(P)|}\}$ because each additional $\eta$ can divide the $a$-$b$ space into up to 3 times the current number of regions. This is best illustrated with the help of Figures 3.1-3.3.
Figures 3.1 and 3.2 show the polyhedral domains for some $\eta_1(a, b)$ and $\eta_2(a, b)$. For the bilinear function, the domain of $\eta_j(a, b)$, $\forall j \in E'(P)$, is always divided by two parallel hyperplanes of the form $a + m_j b = f_j'(x_{v_l})$ for $l \in \{1, 2\}$, where $v_1$ and $v_2$ are the end points of the respective convex edge, and $m_j > 0$. Solving the subproblems in (3.16) for $\eta_1(a, b)$ and $\eta_2(a, b)$ is the same as solving all the subproblems defined over each polyhedral set in Figure 3.3. Since every polyhedral set contains exactly one piece from each $\eta(a, b)$, the formulation (3.17) is the rearrangement of the subproblems according to their subregions.
These subregions are formed by intersecting the regions $D_j^-$, $D_j$ and $D_j^+$ for all $\eta_j$. The number of such possible regions is $3^{|E'(P)|}$; however, out of all the $3^{|E'(P)|}$ combinations, a few will always be invalid due to the restriction $m_j > 0$. For the new subdivision to have 3 times the number of current members, each polyhedral set of the added subdivision needs to have a valid intersection with all the members of the