Conjugate of some Rational functions and Convex Envelope of Quadratic functions over a Polytope

by

Deepak Kumar

Bachelor of Technology, Indian Institute of Information Technology, Design and Manufacturing, Jabalpur, 2012

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE

in

THE COLLEGE OF GRADUATE STUDIES

(Computer Science)

THE UNIVERSITY OF BRITISH COLUMBIA (Okanagan)

January 2019

© Deepak Kumar, 2019

The following individuals certify that they have read, and recommend to the College of Graduate Studies for acceptance, a thesis/dissertation entitled:

Conjugate of some Rational functions and Convex Envelope of Quadratic functions over a Polytope submitted by Deepak Kumar in partial fulfilment of the requirements of the degree of Master of Science.

Dr. Yves Lucet, I. K. Barber School of Arts & Sciences Supervisor

Dr. Heinz Bauschke, I. K. Barber School of Arts & Sciences Supervisory Committee Member

Dr. Warren Hare, I. K. Barber School of Arts & Sciences Supervisory Committee Member

Dr. Homayoun Najjaran, School of Engineering University Examiner

Abstract

Computing the convex envelope or biconjugate is the core operation that bridges the domain of nonconvex analysis with convex analysis. For a bivariate PLQ function defined over a polytope, we start with computing the convex envelope of each piece. This convex envelope is characterized by a polyhedral subdivision such that over each member of the subdivision, it has an implicitly defined rational form (square of a linear function over a linear function). Computing this convex envelope involves solving an exponential number of subproblems which, in turn, leads to an exponential time algorithm.

After that, we compute the conjugate of each such rational function defined over a polytope. It is observed that the conjugate has a parabolic subdivision such that over each member of its subdivision, it has an implicitly defined fractional form (linear function over square root of a linear function). This computation of the conjugate is performed with a worst-case linear time complexity algorithm.

Finally, some directions and insights about computing the maximum of all the conjugates of each piece of a PLQ function and then the conjugate of that to obtain the biconjugate are provided as conjectures for future work.

Lay Summary

Optimization problems occur in a wide variety of fields ranging from Engineering to Mathematical Finance, and solving these problems plays a crucial role in the process. When the optimization problem is nonconvex, it is often difficult to solve. In this thesis, we present a method to convert a class of nonconvex problems to convex problems, which are often easier to solve, and a method to compute their conjugates, which can help solve the problem faster than before.

Table of Contents

Abstract
Lay Summary
Table of Contents
List of Tables
List of Figures
List of Notations and Abbreviations
Acknowledgements
Dedication

Chapter 1: Introduction
    1.1 PLQ Functions
    1.2 Convex Envelope
    1.3 Fenchel conjugate
    1.4 Motivation

Chapter 2: Preliminaries and Notations

Chapter 3: Convex Envelope of Bivariate Quadratic Functions over a polytope
    3.1 Quadratic Reduction
    3.2 Problem Formulation [Loc16]
        3.2.1 Problem Formulation
        3.2.2 Expressions for η(a, b)
        3.2.3 Subproblems
    3.3 Exponential subproblems
    3.4 Optimal Solutions [Loc16]
        3.4.1 Solving the subproblems
        3.4.2 Functional forms
        3.4.3 Solutions
    3.5 Convex envelope: the maximum of all subproblems
    3.6 Algorithmic Design
        3.6.1 Input and Output Data Structures
        3.6.2 Main Algorithm
        3.6.3 Solving the subproblems
    3.7 Map Overlay problem
        3.7.1 Data structures for polyhedral region
        3.7.2 Sorting vertices in clockwise direction
        3.7.3 Map Overlay Algorithm

Chapter 4: Conjugate of a class of Convex Bivariate Rational Functions over a polytope
    4.1 Subdifferentials in the interior of the polytope
    4.2 Subdifferentials at the Vertices
    4.3 Subdifferentials on the edges
    4.4 Structure of the conjugate domain
    4.5 Conjugate Expressions
        4.5.1 Fractional forms

Chapter 5: Algorithmic computation of the Conjugate for a class of Bivariate Rational functions over a polytope
    5.1 Data Structures
        5.1.1 Parabolic region
        5.1.2 Output Data Structure
    5.2 Algorithm
        5.2.1 Main Algorithm
        5.2.2 Algorithm for edges
    5.3 Example 1
    5.4 Example 2

Chapter 6: Conclusions and Future work
    6.1 Future work
        6.1.1 Maximum of all conjugates
        6.1.2 Conjugate of bivariate nonconvex PLQ Functions
        6.1.3 Convex envelope of bivariate nonconvex PLQ functions

Bibliography

Appendix
    Appendix A: The η(a, b) expressions for the bilinear functions
    Appendix B: Solving the subproblems
        B.1 Case 1: Quadratic-Quadratic
            B.1.1 Solving the equality constraint
            B.1.2 Solving the inequalities
            B.1.3 Solutions
        B.2 Case 2: Quadratic-Linear
            B.2.1 Solving the equality constraint
            B.2.2 Solving the inequalities
            B.2.3 Solutions
        B.3 Case 3: Linear-Linear
            B.3.1 Solving the equality constraint
            B.3.2 Solving the inequalities
            B.3.3 Solutions
    Appendix C: Conjugate expressions for a rational function over an edge

List of Tables

Table 3.1 Different cases depending upon the eigenvalues
Table 5.1 Conjugate domain for a rational function
Table 5.2 Conjugate domain for quadratic functions
Table 6.1 Observed intersections for computing the maximum of all conjugates

List of Figures

Figure 1.1 A univariate convex PLQ Function
Figure 1.2 A non convex univariate PLQ Function
Figure 1.3 $f(x) = \frac{x^2}{2}$, its conjugate $f^*(s) = \frac{s^2}{2}$, and the mapping between the two
Figure 1.4 Closed convex envelope (shown in Blue) of each piece of a univariate nonconvex PLQ function

Figure 2.1 A Polyhedral subdivision of $\mathbb{R}^2$

Figure 3.1 η1's polyhedral domain
Figure 3.2 η2(a, b)'s polyhedral domain
Figure 3.3 Nine Subregions formed by overlaying domains of η1 and η2
Figure 3.4 19 subregions, shown by different colors, formed by intersection of domains of 3 η(a, b)s belonging to a convex edge
Figure 3.5 Input Data Structure
Figure 3.6 Output Data Structure
Figure 3.7 Space division by line y = mx + c
Figure 3.8 Vertex Enumeration Problem

Figure 4.1 Primal and dual mapping for quadratic function in Example 4.4 over lines parallel to x1 + x2 = 0
Figure 4.2 Normal cone at a vertex v ∈ P shown as the arrows in the intersection of Red and Green region
Figure 4.3 Polyhedral subdivision for Example 4.14
Figure 4.4 Parabolic subdivision for r and P in Example 4.22
Figure 4.5 Parabolic subdivision for r and P from Example 4.5

Figure 5.1 A subdifferential region
Figure 5.2 Region with only active constraints
Figure 5.3 Proposed subdivision for storing a subdifferential region
Figure 5.4 Output Data structure
Figure 5.5 Normal cone division lines for Example 5.3
Figure 5.6 Conjugate for Example 5.3
Figure 5.7 Normal cone division lines for Example 5.4
Figure 5.8 Conjugate for Example 5.4

List of Notations and Abbreviations

$\mathbb{R}$                        Real numbers
$\phi$                              Empty set
$|A|$                               Cardinality of set $A$
$\operatorname{dom}(f)$             Effective domain of function $f$
$\operatorname{cl}(g)$              Closed envelope of function $g$
$\operatorname{conv} f_P$           Convex envelope of $f$ over a region $P$
$\overline{\operatorname{conv}}$    Closed convex envelope
$S^{n \times n}$                    Symmetric matrix of size $n \times n$
$\operatorname{int}(P)$             Interior of the set $P$
$\operatorname{ri}(P)$              Relative interior of set $P$
PLQ                                 Piecewise Linear Quadratic
LFT                                 Legendre-Fenchel transform
$f^*$                               Conjugate of function $f$
$\sup S$                            The supremum of set $S$
$\inf S$                            The infimum of set $S$
$N_P(x)$                            Normal cone of set $P$ at $x$
$\partial f(x)$                     Subdifferential of function $f$ at $x$
$\nabla f(x)$                       Gradient of function $f$ at $x$
$I_P$                               Indicator function of set $P$

Acknowledgements

I would like to take this opportunity to acknowledge and express my sincere gratitude towards my supervisor Dr. Yves Lucet, for his constant guidance, encouragement, and precious advice throughout my graduate studies. I am very grateful for his endless support, especially for his expertise, patience and insightful suggestions during the course of my research. His attention to detail motivated me to pay closer attention to everything I work on, and his positive outlook and confidence in my research inspired me and gave me the right confidence to dig deeper into my work.

I would also like to extend my gratitude towards the members of my supervisory committee: Dr. Heinz Bauschke and Dr. Warren Hare for spending their valuable time reading my thesis and providing constructive and insightful feedback. I am also thankful to all of my committee members for making my defense a wonderful moment of my life.

I truly appreciate the support of my family and all of my friends and colleagues of the CCA research group for all they meant to me during the completion of my research. Without them, this journey and the lab would not have been such an enjoyable experience. I am also thankful to the University of British Columbia, Okanagan and the Natural Sciences and Engineering Research Council of Canada (NSERC) for providing the financial assistance, without which this research would have seemed like a distant dream.

Dedication

To my family, for their constant love and support.

Chapter 1

Introduction

Computational convex analysis (CCA) focuses on creating efficient algorithms to compute fundamental transforms arising in the field of convex analysis. Computing the convex envelope or biconjugate is the core operation that bridges the domain of nonconvex analysis with convex analysis. The early idea of the computation of convex transforms can be traced back to [Mor65]. However, development of most of the algorithms in CCA began with the Fast Legendre Transform (FLT) in [Bre89] and was further developed in [Cor96, Luc96] (and independently in [NV94, SAF92]). Later, the algorithm was improved to the optimal linear worst-case time complexity in [Luc97]. Recent algorithms to compute the conjugate numerically have been based on either parameterization [HUL07], manipulation of graphs (GPH model) [GL11], or the computation of the Moreau envelope [Luc06]. More complex operators such as the proximal average operator [BGLW08, BLT08] can be built by using a combination of addition, scalar multiplication, and conjugacy operations. This has further motivated more research in the field of computational convex analysis [LLT17, JKL11, LBT09].

Computational convex analysis has numerous applications in a wide variety of fields, ranging from image processing and partial differential equations to thermodynamics. An extensive study has been carried out in the survey [Luc10].

1.1 PLQ Functions

Piecewise Linear Quadratic (PLQ) functions play a significant role in the approximation of functions. Compared to piecewise linear functions, PLQ functions capture the curvature of the original function. They are continuous but may not be smooth; however, they are twice differentiable in the interior of each piece. PLQ functions are well known in the field of convex analysis (the set of PLQ functions is closed under common convex operators [RW98]). Their applications in nontraditional statistical estimation problems have been studied in [CCHP17].


Figure 1.1: A univariate convex PLQ Function

An example of a convex univariate PLQ function is shown in Figure 1.1. A PLQ framework along with the existence of linear time algorithms for various convex transforms has been proposed in [LBT09, BLT08]. A detailed description of all the important transforms is presented in the study [Luc10]. Computing the full graph of the convex envelope of univariate nonconvex PLQ functions using the biconjugate operation in optimal linear worst-case time complexity has been proposed in [GL10]. However, the area of bivariate PLQ functions, especially nonconvex ones, has not been explored much.

1.2 Convex Envelope

For a function $f$ defined over a region $X$, the pointwise supremum of all convex underestimators of $f$ over $X$ is called the convex envelope and is denoted by $\operatorname{conv} f_X$. For example, for the univariate nonconvex PLQ function

\[
f(x) = \begin{cases} x^2 & x \le 0 \\ 1 - (x - 1)^2 & 0 < x < 1 \\ x^2 & x \ge 1 \end{cases}
\]

shown in Figure 1.2, its convex envelope is given by

Figure 1.2: A nonconvex univariate PLQ Function

\[
\operatorname{conv} f(x) = \begin{cases} x^2 & x \le 0 \\ x & 0 < x < 1 \\ x^2 & x \ge 1 \end{cases}
\]

and is shown earlier in Figure 1.1.

Computing the convex envelope of a function is one of the fundamental tasks in mathematical optimization. A nonconvex function and its convex envelope have the same global minimum value, and computing the minimum of a convex function is much easier than computing it for a nonconvex function.

The task of computing the convex envelope of a function is very useful though not simple (computation of the convex envelope of a multilinear function over a unit hypercube has been proved to be NP-Hard in [Cra89]). However, the convex envelope of functions defined over a polytope $P$ and restricted by the vertices of $P$ can be computed in finite time using a linear program [Tar04, Tar08]. A method to reduce the computation of the convex envelope of functions that are convex in one lower dimension ($\mathbb{R}^{n-1}$) and have an indefinite Hessian to optimization problems in lower dimensions is discussed in [JMW08].

Results for specific functions exist in the literature; a particular one is the bilinear function. Any general bivariate nonconvex quadratic function can be linearly transformed to the sum of a bilinear and a linear function. Some studies involving bilinear functions are


− convex envelopes for bilinear functions over rectangles have been discussed in [McC76] and validated in [AKF83];
− the convex envelope over special polytopes (not containing edges with finite positive slope) has been derived in [SA90];
− the convex envelope of a bilinear function over a triangle containing exactly one edge with finite positive slope is presented in [Lin05];
− the convex envelope over general triangles and triangulation of the polytopes through doubly nonnegative matrices (both semidefinite and nonnegative) is presented in [AB10, Ans12].

In [Loc14], it is shown that the analytical form of the convex envelope of some bivariate functions defined over polytopes can be computed by solving a continuously differentiable convex problem. This result is later extended to any bivariate function satisfying some assumption (see Assumption [Loc16]). The authors go on to prove that the convex envelope of any bivariate function over a polytope satisfying the assumption is characterized by a polyhedral subdivision; however, the form of the envelope is determined implicitly.

1.3 Fenchel conjugate

The Fenchel conjugate of a function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is given by

\[
f^*(s) = \sup_{x \in \mathbb{R}^n} [\langle s, x \rangle - f(x)]
\]

where $\langle s, x \rangle = s^T x$ is the standard dot product. It is also known as the Legendre-Fenchel transform or simply the conjugate. It plays a significant role in duality, and computing it is a key step in solving the dual optimization problem [Her16]. For the convex function $f(x) = \frac{x^2}{2}$, the conjugate is the function itself in the dual domain. The mapping between the primal and dual domain for this function is shown in Figure 1.3. For a smooth convex univariate function, the slope of the affine underestimator at a given point in the primal domain maps to a point in the dual domain, while the intercept multiplied by $-1$ becomes the value of the conjugate at that slope and vice-versa.

A method to compute the conjugate known as the fast Legendre transform was first introduced in [Bre89] and its applications have been studied in [Cor96, Luc96]. A linear time algorithm was later introduced by Lucet to compute the discrete Legendre transform [Luc97].
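The definition above can be checked numerically on a grid. The following is a minimal sketch, not the fast Legendre transform of [Bre89] or the linear time algorithm of [Luc97]: it takes the maximum over every sample point for every slope, so its cost is quadratic. The function name `conjugate_on_grid`, the grid bounds, and the chosen slopes are assumptions made for illustration only.

```python
def conjugate_on_grid(f, xs, slopes):
    """Approximate f*(s) = sup_x [s*x - f(x)] by a maximum over the samples xs."""
    return [max(s * x - f(x) for x in xs) for s in slopes]


def f(x):
    """The function f(x) = x^2 / 2, whose conjugate is f*(s) = s^2 / 2."""
    return 0.5 * x * x


# Sample points i/100 on [-3, 3]; the slopes are chosen so that the
# maximizer x = s of s*x - f(x) is itself a sample point.
xs = [i / 100.0 for i in range(-300, 301)]
slopes = [-2.0, -1.0, 0.0, 0.5, 1.0, 2.0]

approx = conjugate_on_grid(f, xs, slopes)
exact = [0.5 * s * s for s in slopes]

for s, a, e in zip(slopes, approx, exact):
    print(f"s = {s:5.2f}   grid conjugate = {a:.6f}   s^2/2 = {e:.6f}")
```

For slopes whose maximizer falls outside the sampled interval, the grid value would underestimate the true conjugate, which is why the slopes here are kept well inside $[-3, 3]$.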


[Figure 1.3 has two panels: "Primal domain (x,y)", showing the line y = 1*x - 0.5, and "Dual domain (s,r)", showing the line r = 1*s - 0.5.]

Figure 1.3: $f(x) = \frac{x^2}{2}$, its conjugate $f^*(s) = \frac{s^2}{2}$, and the mapping between the two

Computation of the conjugate of convex univariate PLQ functions has been well studied in the literature and linear time algorithms have been developed in [GL13, GJL14] both for the PLQ and the GPH models. Recently, a linear time algorithm to compute the conjugate of convex bivariate PLQ functions using an entity graph (storing the entities of the domain as a graph) has been proposed in [HL18].

1.4 Motivation

Let $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ be a piecewise function, that is,

\[
f(x) = \begin{cases} f_1(x) & x \in P_1 \\ f_2(x) & x \in P_2 \\ \cdots & \cdots \\ f_N(x) & x \in P_N. \end{cases}
\]

From [HUL93, Theorem 2.4.1], we have

\[
(\inf_i f_i)^* = \sup_i f_i^*. \tag{1.1}
\]


Figure 1.4: Closed convex envelope (shown in Blue) of each piece of a univariate nonconvex PLQ function

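Also, from [HUL93, Proposition 2.6.1],

\[
\overline{\operatorname{conv}}\big(\inf_i (f_i + I_{P_i})\big) = \overline{\operatorname{conv}}\big(\inf_i [\overline{\operatorname{conv}}(f_i + I_{P_i})]\big) \tag{1.2}
\]

where $I_{P_i}$ is the indicator function for $P_i$. So

\[
\begin{aligned}
\overline{\operatorname{conv}}\big(\inf_i (f_i + I_{P_i})\big) &= \overline{\operatorname{conv}}\big(\inf_i [\overline{\operatorname{conv}}(f_i + I_{P_i})]\big) && \text{using (1.2)} \\
&= \big(\big(\inf_i [\overline{\operatorname{conv}}(f_i + I_{P_i})]\big)^*\big)^* \\
&= \big(\sup_i [\overline{\operatorname{conv}}(f_i + I_{P_i})]^*\big)^* && \text{using (1.1).}
\end{aligned}
\]

Hence the closed convex envelope of a PLQ function can be computed in four steps:

Step 1 Compute the convex envelope of each piece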

Step 2 Compute the conjugate of the convex envelope of each piece


Step 3 Compute the maximum of all the conjugates

Step 4 Compute the conjugate of the function obtained in Step 3 to get the biconjugate (or the closed convex envelope) of the original function.

Note that computing the closed convex envelope of each piece does not result in the closed convex envelope of the whole function. This is illustrated in Figure 1.4. In this thesis, we begin with computing the convex envelope of each piece using the method proposed in [Loc16] in Chapter 3. After that, we compute the conjugate of each piece in Chapter 4. Chapter 5 covers the algorithmic design and examples for computing the conjugate of a convex rational function over a polytope. Steps 3 and 4 are discussed in Chapter 6 as future work.
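The four steps can be illustrated on the univariate example of Section 1.2 with a brute-force grid computation. This is only a sketch under stated assumptions: the unbounded pieces are truncated to $[-2, 0]$ and $[1, 3]$, Step 1 is supplied in closed form rather than computed, and the helper names `grid`, `conjugate`, and `biconjugate` are hypothetical. It is not the algorithm developed in this thesis.

```python
def grid(lo_hundredths, hi_hundredths):
    """Sample points i/100 for integers i between the two bounds (inclusive)."""
    return [i / 100.0 for i in range(lo_hundredths, hi_hundredths + 1)]


def conjugate(points, values, slopes):
    """Discrete conjugate: for each slope s, the max over samples of s*x - value."""
    pairs = list(zip(points, values))
    return [max(s * x - v for x, v in pairs) for s in slopes]


# Step 1 (assumed known in closed form): convex envelope of each piece
# on its own (truncated) domain.
pieces = [
    (grid(-200, 0), lambda x: x * x),    # x^2 is already convex on [-2, 0]
    (grid(0, 100), lambda x: x),         # 1 - (x - 1)^2 is concave on [0, 1]; its envelope is the chord
    (grid(100, 300), lambda x: x * x),   # x^2 is already convex on [1, 3]
]
slopes = grid(-500, 700)                 # covers the slopes -4 .. 6 reached on [-2, 3]

# Step 2: conjugate of the convex envelope of each piece.
piece_conjugates = [
    conjugate(points, [envelope(x) for x in points], slopes)
    for points, envelope in pieces
]

# Step 3: pointwise maximum of all the conjugates.
max_conjugate = [max(values) for values in zip(*piece_conjugates)]


# Step 4: conjugate of the maximum, evaluated at a single point x.
def biconjugate(x):
    return max(s * x - c for s, c in zip(slopes, max_conjugate))


for x in (-1.0, 0.5, 2.0):
    print(f"x = {x:5.2f}   biconjugate = {biconjugate(x):.6f}")
```

For this particular example the printed values should be close to 1, 0.5, and 4, which agrees with the convex envelope given in Section 1.2.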

Chapter 2

Preliminaries and Notations

This chapter covers the basic background to facilitate proper understanding of the rest of the thesis.

Definition 2.1. Convex Set. A set $S$ is convex if $\forall x, y \in S$ and $\lambda \in [0, 1]$, $\lambda x + (1 - \lambda) y \in S$.

Definition 2.2. Cardinality of a finite set. Cardinality of a finite set is the number of elements in the set. For example, let $A = \{1, 5, 7\}$, then its cardinality $|A| = 3$.

Definition 2.3. Convex Function. A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is convex if and only if $\forall x, y \in \operatorname{dom}(f) = \{x : f(x) \in \mathbb{R}\}$ and $\lambda \in (0, 1)$,

\[
f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y).
\]

It is strictly convex when $f(\lambda x + (1 - \lambda) y) < \lambda f(x) + (1 - \lambda) f(y)$ for $x, y \in \operatorname{dom}(f)$, $x \ne y$ and $\lambda \in (0, 1)$.

Definition 2.4. Lower Semi-continuous/Closed function [HUL13, Definition 3.2.1]. A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is said to be closed if

\[
\liminf_{x \to x_0} f(x) \ge f(x_0) \quad \text{for all } x_0 \in \mathbb{R}^n.
\]

For any non closed function $g : \mathbb{R}^n \to \mathbb{R} \cup \{-\infty, +\infty\}$, we denote by $\operatorname{cl}(g)$ the closed envelope of $g$, that is, the largest closed function that is lower than $g$. We denote the set of closed convex functions on $\mathbb{R}^n$ as $\operatorname{conv} \mathbb{R}^n$. All the functions involved from here on are in $\operatorname{conv} \mathbb{R}^n$.

Definition 2.5. Convex Envelope. The convex envelope of a function $f$ over a region $P$, denoted $\operatorname{conv} f_P$, is the pointwise supremum of all the convex underestimators of $f$ over $P$. It is defined as

\[
\operatorname{conv} f_P(x, y) = \sup_{a, b, c} \; c \quad \text{subject to} \quad f(x', y') - [a(x' - x) + b(y' - y) + c] \ge 0, \quad \forall (x', y') \in P.
\]

The optimal value $(a^*, b^*, c^*)$ defines the supporting hyperplane for the function $f$ at a point $(x, y) \in P$.

Definition 2.6. Symmetric Matrix. A symmetric matrix, $A$, is a square matrix which is equal to its transpose, that is, $A = A^T$. The set of symmetric matrices is denoted by $S$.

Definition 2.7. Positive Semidefinite Matrix. An $n \times n$ symmetric matrix $M$ is called positive semidefinite if for any $x \ne 0 \in \mathbb{R}^n$, $x^T M x \ge 0$. When $x^T M x > 0$, then $M$ is a positive definite matrix.

Definition 2.8. Indefinite Matrix. A symmetric matrix, $M$, is called indefinite if it contains at least two nonzero eigenvalues of opposite signs.

For a function f ∈ C1, an indefinite Hessian signifies f is nonconvex.

Definition 2.9. Halfspace. A halfspace is an unbounded region formed by the set of points satisfying a linear inequality. Mathematically,

\[
H_s = \{x \in \mathbb{R}^n : a^T x \le b\}
\]

where $a \in \mathbb{R}^n$, $a \ne 0$ and $b \in \mathbb{R}$.

Definition 2.10. Hyperplane (Affine). A hyperplane is formed by the set of points satisfying a linear equality. Mathematically,

\[
H_p = \{x \in \mathbb{R}^n : a^T x = b\}
\]

where $a \in \mathbb{R}^n$, $a \ne 0$, $b \in \mathbb{R}$.

Definition 2.11. Polytope. A polytope $P$ is a bounded region formed by the intersection of a finite number of halfspaces. It can be written as

\[
P = \{x \in \mathbb{R}^n : Ax \le b\},
\]

where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$ and $m$ is the number of halfspaces. A polytope is always convex.

A polytope in $\mathbb{R}^2$ can be divided into three entities, namely vertices, edges, and the interior. A vertex is formed by the intersection of edges. An edge is a line segment on the boundary of the polytope joining two vertices, while the interior of a polytope $P$ can be written as $\operatorname{int}(P) = \{x \in \mathbb{R}^n : Ax < b\}$.


Figure 2.1: A Polyhedral subdivision of $\mathbb{R}^2$

Definition 2.12. Polyhedral Subdivision. A convex set $R = \bigcup_{i \in \{1, \dots, m\}} R_i$, $R \subseteq \mathbb{R}^2$, defined as the union of a finite number of polyhedral regions is said to have a polyhedral subdivision if for any $j, k \in \{1, \cdots, m\}$, $j \ne k$, $R_j \cap R_k$ is either $\phi$ or is contained in a line.

Figure 2.1 shows an example of a polyhedral subdivision of $\mathbb{R}^2$.

Definition 2.13. Piecewise Linear-Quadratic function. A Piecewise Linear-Quadratic (PLQ) function is characterized by a polyhedral subdivision such that over each member of the subdivision, it has either linear or quadratic form. Mathematically, a PLQ function is

\[
f(x) = \begin{cases}
\frac{1}{2} x^T A_1 x + b_1^T x + \delta_1 & x \in P_1 \\
\frac{1}{2} x^T A_2 x + b_2^T x + \delta_2 & x \in P_2 \\
\cdots & \cdots \\
\frac{1}{2} x^T A_N x + b_N^T x + \delta_N & x \in P_N
\end{cases}
\]

where $x \in \mathbb{R}^n$, $A_i \in S^{n \times n}$, $b_i \in \mathbb{R}^n$, $\delta_i \in \mathbb{R}$ and $P_i$ are polyhedral sets.

Definition 2.14. Legendre-Fenchel transform. For a function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$, its Legendre-Fenchel transform (LFT), $f^*$, is defined as

\[
f^*(s) = \sup_{x \in \mathbb{R}^n} [\langle s, x \rangle - f(x)]
\]

where $\langle s, x \rangle = s^T x$ is the standard dot product.

Definition 2.15. Biconjugate. For a function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$, its biconjugate is given as $(f^*)^*$.

When $f$ is convex, $(f^*)^* = \operatorname{cl}(f)$; otherwise $(f^*)^*$ is the closed convex envelope of $f$.

Definition 2.16. Cone. A set $C$ is a cone if it contains $0$ and for every $x \in C$, $\lambda x \in C$ for all $\lambda \ge 0$.

Definition 2.17. Relative Interior [HUL13, Definition 2.1.1]. The relative interior of a convex set $C \subseteq \mathbb{R}^n$, denoted by $\operatorname{ri}(C)$, is defined as the interior within the affine hull of $C$. Mathematically, $x \in \operatorname{ri}(C)$ if and only if $x \in \operatorname{aff}(C)$ and $\exists \delta > 0$ such that $\operatorname{aff}(C) \cap B(x, \delta) \subset C$, where $\operatorname{aff}(C)$ is the affine hull of $C$ and $B(x, \delta)$ is a ball of radius $\delta$ centered on $x$.

Definition 2.18. Normal Cone. For a set $P$, the normal cone at a point $x \in P$ is given by

\[
N_P(x) = \{s : \langle s, y - x \rangle \le 0, \text{ for all } y \in P\}.
\]

For all $x \in \operatorname{int}(P)$, $N_P(x) = \{0\}$. When $P$ is a 2D polytope contained in the halfspace $\{x : L(x) \le 0\}$ with $L(x) = a x_1 + b x_2 + c$, for any edge $E = \{x : L(x) = 0, \ l_1 \le x_1 \le u_1\}$ between vertices $l$ and $u$, the normal cone at any $x \in \operatorname{ri}(E)$ is $N_P(x) = \{s : s = \lambda \nabla L(x), \lambda \ge 0\}$.

Definition 2.19. Subdifferential of a function. The subdifferential of a function $f$ at any $x \in \operatorname{dom}(f)$, denoted by $\partial f(x)$, is

\[
\partial f(x) = \{s : f(y) \ge f(x) + \langle s, y - x \rangle, \text{ for all } y \in \operatorname{dom}(f)\}.
\]

Every element of $\partial f(x)$ is called a subgradient of $f$ at $x$. When $f$ is differentiable at $x$ then $\partial f(x) = \{\nabla f(x)\}$.

Definition 2.20. Parabola (or Parabolic curve). A parabola is a two dimensional planar curve whose points $(x, y)$ satisfy the equation $ax^2 + bxy + cy^2 + dx + ey + f = 0$ where $b^2 - 4ac = 0$. A line is a parabola with $a = b = c = 0$.


Definition 2.21. Parabolic region. A parabolic region, $P_r \subset \mathbb{R}^2$, is formed by the intersection of a finite number of parabolic inequalities. It can be defined as

\[
P_r = \{x \in \mathbb{R}^2 : a_i x_1^2 + b_i x_1 x_2 + c_i x_2^2 + d_i x_1 + e_i x_2 + f_i \le 0, \ i \in \{1, \cdots, k\}, \ b_i^2 - 4 a_i c_i = 0\}.
\]

For $C_p(x) = a x_1^2 + b x_1 x_2 + c x_2^2 + d x_1 + e x_2 + f$ with $b^2 - 4ac = 0$, the set $P_r = \{x \in \mathbb{R}^2 : C_p(x) \le 0\}$ is convex, but $P_s = \{x \in \mathbb{R}^2 : C_p(x) \ge 0\}$ is not.

Definition 2.22. Parabolic subdivision. A convex set $R = \bigcup_{i \in \{1, \dots, m\}} R_i$, $R \subseteq \mathbb{R}^2$, defined as the union of a finite number of parabolic regions is said to have a parabolic subdivision if for any $j, k \in \{1, \cdots, m\}$, $j \ne k$, $R_j \cap R_k$ is either $\phi$ or is contained in a parabola.
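A parabolic region as in Definition 2.21 can be represented by a list of coefficient tuples $(a_i, b_i, c_i, d_i, e_i, f_i)$. The sketch below checks the condition $b_i^2 - 4 a_i c_i = 0$ and tests membership of a point. The tuple layout and the function names are assumptions for illustration and do not reproduce the data structure of Chapter 5.

```python
def is_parabolic(coeffs, tol=1e-12):
    """Check the condition b^2 - 4ac = 0 of Definition 2.20 (lines included)."""
    a, b, c, _, _, _ = coeffs
    return abs(b * b - 4.0 * a * c) <= tol


def evaluate(coeffs, x):
    """Value of a*x1^2 + b*x1*x2 + c*x2^2 + d*x1 + e*x2 + f at the point x."""
    a, b, c, d, e, f = coeffs
    x1, x2 = x
    return a * x1 * x1 + b * x1 * x2 + c * x2 * x2 + d * x1 + e * x2 + f


def in_parabolic_region(constraints, x):
    """True if x satisfies every parabolic inequality C_i(x) <= 0."""
    if not all(is_parabolic(c) for c in constraints):
        raise ValueError("every constraint must satisfy b^2 - 4ac = 0")
    return all(evaluate(c, x) <= 0.0 for c in constraints)


# Region between the parabola x2 = x1^2 and the line x2 = 1.
region = [
    (1.0, 0.0, 0.0, 0.0, -1.0, 0.0),   # x1^2 - x2 <= 0
    (0.0, 0.0, 0.0, 0.0, 1.0, -1.0),   # x2 - 1 <= 0 (a line: a = b = c = 0)
]

print(in_parabolic_region(region, (0.0, 0.5)))   # inside
print(in_parabolic_region(region, (1.0, 0.5)))   # below the parabola
print(in_parabolic_region(region, (0.0, 2.0)))   # above the line
```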

Chapter 3

Convex Envelope of Bivariate Quadratic Functions over a polytope

Computing the closed convex envelope of a 2D PLQ function requires computation of the closed convex envelope of each quadratic piece. For this purpose, we utilize the method proposed in [Loc16]. We start with problem formulation, reduce it to subproblems, and then obtain the optimal solution for each. The closed convex envelope then comes out to be the maximum of all such solutions over the given polytope. In this chapter, we formulate the mathematical form of the closed convex envelope of each piece of a bivariate nonconvex PLQ function using [Loc16]. After that, we introduce the data structures and algorithms involving various 2D geometry manipulations to compute the closed convex envelope of a bivariate nonconvex quadratic function. Before we present the method proposed in [Loc16] and our algorithmic approach, let us discuss a technique to reduce the general expression of any nonconvex 2D quadratic function to a sum of bilinear and affine functions.

3.1 Quadratic Reduction

Using linear algebra and eigenvalue decomposition, we can reduce any quadratic function to a sum of a bilinear function and an affine function. This reduction is required to simplify our process of solving the subproblems, which, otherwise, would have led to solving quartic equations in most of the cases. The other benefit of this reduction is that solving the subproblems only for the bilinear term, as compared to the general quadratic form, reduces the size, complexity, and number of special cases to be dealt with when creating an algorithmic solution for the problem.

A general quadratic expression in $\mathbb{R}^2$ can be written as

\[
f(x) = x^T A x + b^T x + \delta \tag{3.1}
\]


Table 3.1: Different cases depending upon the eigenvalues

Eigenvalues          Quadratic function type       Convex envelope over a polytope
λ1 ≥ 0, λ2 ≥ 0       convex                        Function itself
λ1 > 0, λ2 > 0       strictly convex               Function itself
λ1 ≤ 0, λ2 ≤ 0       concave                       Piecewise Linear
λ1 < 0, λ2 < 0       strictly concave              Piecewise Linear
λ1 > 0, λ2 < 0       nonconvex and nonconcave      Computed using [Loc16]

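where $A \in S^{2 \times 2}$, $b \in \mathbb{R}^2$, $\delta \in \mathbb{R}$ and $S$ is the set of all symmetric matrices (see Definition 2.6). When $f$ is neither convex nor concave, $A$ is an indefinite matrix.

Let $f_1(x) = x^T A x$ and $f_2(x) = b^T x + \delta$. From [HUL13, p. 71], we have

\[
\operatorname{conv} f_1 + \operatorname{conv} f_2 \le \operatorname{conv}(f_1 + f_2),
\]

and because $f_2$ is affine,

\[
\operatorname{conv} f_1 + \operatorname{conv} f_2 = \operatorname{conv}(f_1 + f_2).
\]

An affine function is already convex, so

\[
\operatorname{conv} f = \operatorname{conv} f_1 + b^T x + \delta. \tag{3.2}
\]

For $f_1(x) = x^T A x$, by using eigenvalue decomposition

\[
A = U^T \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} U, \tag{3.3}
\]

where $\lambda_1$ and $\lambda_2$ are the eigenvalues, and $U$ is the matrix of eigenvectors of $A$. Let $\tilde{x} = Ux$, then

\[
f_1(U^{-1} \tilde{x}) = g_1(\tilde{x}) = \tilde{x}^T \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \tilde{x}
\]

or

\[
g_1(\tilde{x}) = \lambda_1 \tilde{x}^T \begin{bmatrix} 1 & 0 \\ 0 & -\alpha \end{bmatrix} \tilde{x}, \tag{3.4}
\]

with $\alpha = -\lambda_2 / \lambda_1$. When $f(x)$ is convex, $\alpha \le 0$. Since we are interested in the case when $f(x)$ is neither convex nor concave, $\lambda_1$ and $\lambda_2$ carry different signs, hence $\alpha > 0$. Table 3.1 enumerates different possible cases based on different eigenvalues. Now

\[
\begin{bmatrix} 1 & 0 \\ 0 & -\alpha \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 1 & 1 \\ \sqrt{\alpha} & -\sqrt{\alpha} \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ \sqrt{\alpha} & -\sqrt{\alpha} \end{bmatrix}^T. \tag{3.5}
\]

Let $\bar{x} = \begin{bmatrix} 1 & 1 \\ \sqrt{\alpha} & -\sqrt{\alpha} \end{bmatrix}^T \tilde{x}$, so by using (3.5) in (3.4), we get

\[
h_1(\bar{x}) = \frac{\lambda_1}{2} \bar{x}^T \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \bar{x}.
\]

Hence for $\bar{x} = [\bar{x}_1 \ \bar{x}_2]^T$, since $\bar{x}^T \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \bar{x} = 2 \bar{x}_1 \bar{x}_2$,

\[
h_1(\bar{x}) = \lambda_1 \bar{x}_1 \bar{x}_2 \tag{3.6}
\]

which is our bilinear term.

All the linear transformations performed in (3.3)-(3.6) are invertible as $U$ is invertible and the matrix $\begin{bmatrix} 1 & 1 \\ \sqrt{\alpha} & -\sqrt{\alpha} \end{bmatrix}$ is clearly invertible with inverse $\frac{1}{2\sqrt{\alpha}} \begin{bmatrix} \sqrt{\alpha} & 1 \\ \sqrt{\alpha} & -1 \end{bmatrix}$.

Proposition 3.1. For a bivariate nonlinear quadratic function $f$ of form (3.1) and a polytope $P$, let $f(x) = f_1(x) + f_2(x)$ with $f_1(x) = x^T A x$ and $f_2(x) = b^T x + \delta$, then for $\lambda_1 > 0$ and $h_1(x) = x_1 x_2$, $\operatorname{conv} f_P = \lambda_1 \operatorname{conv} h_{1,P} \circ M^{-1} + \operatorname{conv} f_{2,P}$ where $M$ is a $2 \times 2$ invertible matrix.

Proof. From [HUL13, p. 71], for any function $f$ and $\lambda > 0$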

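\[
\operatorname{conv} \lambda f = \lambda \operatorname{conv} f.
\]

So for (3.6), $\operatorname{conv} f_1 = \lambda_1 \operatorname{conv} h_1$. The value $\lambda_1 > 0$ can be ensured by always assigning the positive eigenvalue to $\lambda_1$.

Now, for $M = \begin{bmatrix} 1 & 1 \\ \sqrt{\alpha} & -\sqrt{\alpha} \end{bmatrix}^T U$, $h_1(\bar{x}) = f_1(M \circ x)$. So

\[
\operatorname{conv} f_1 = \operatorname{conv} h_1 \circ M^{-1}. \tag{3.7}
\]

Finally, (3.2) becomes

\[
\operatorname{conv} f = \lambda_1 \operatorname{conv} h_1 \circ M^{-1} + b^T x + \delta.
\]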


Hence the closed convex envelope of any nonconvex quadratic function $f$ defined over a polytope $P$ can be obtained by

\[
\operatorname{conv} f_P = \lambda_1 \operatorname{conv} h_{1,P} \circ M^{-1} + \operatorname{conv} f_{2,P}.
\]

Example 3.2. For the quadratic function $f(x) = x^T \begin{bmatrix} 3 & -1 \\ -1 & -5 \end{bmatrix} x + \begin{bmatrix} 2 & -2 \end{bmatrix} x + 4$, its quadratic term gets reduced to $h_1(\bar{x}) \approx 3.123 \cdot \bar{x}_1 \bar{x}_2$ with transformation matrix $M \approx \begin{bmatrix} -0.8360 & -1.1490 \\ 1.3934 & -1.1490 \end{bmatrix}$.

Since the problem of computing the closed convex envelope of a quadratic function reduces to computing the closed convex envelope of a bilinear function, from now on we focus on formulating the problem, its solutions, and the algorithmic design for the bilinear function only. Also, for consistency of this chapter with [Loc16], we use variables $x, y \in \mathbb{R}$ instead of $x = [x_1, x_2] \in \mathbb{R}^2$ in the algebraic expressions.
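The reduction of this section can be reproduced numerically for the matrix of Example 3.2. The sketch below is written under a few assumptions: eigenvalues of the symmetric $2 \times 2$ matrix are obtained in closed form, the rows of $U$ are taken to be unit eigenvectors, and the change of variables is $\bar{x} = Mx$ with $M$ built from $U$ and $\sqrt{\alpha}$. The function names are hypothetical, and the classification follows Table 3.1.

```python
import math


def eigen_symmetric_2x2(A):
    """Eigenvalues (lam1 >= lam2) and unit eigenvectors of a symmetric 2x2 matrix."""
    (p, q), (_, r) = A
    mean, radius = (p + r) / 2.0, math.hypot((p - r) / 2.0, q)
    lam1, lam2 = mean + radius, mean - radius
    if q != 0.0:
        v = (lam1 - r, q)                    # solves (A - lam1 I) v = 0
    else:
        v = (1.0, 0.0) if p >= r else (0.0, 1.0)
    norm = math.hypot(*v)
    u1 = (v[0] / norm, v[1] / norm)
    u2 = (-u1[1], u1[0])                     # orthogonal unit vector, eigenvector for lam2
    return lam1, lam2, u1, u2


def classify(lam1, lam2):
    """Quadratic function type from Table 3.1, assuming lam1 >= lam2."""
    if lam2 > 0:
        return "strictly convex"
    if lam2 >= 0:
        return "convex"
    if lam1 < 0:
        return "strictly concave"
    if lam1 <= 0:
        return "concave"
    return "nonconvex and nonconcave"


def bilinear_reduction(A):
    """Return (lam1, M) such that x^T A x = lam1 * xb1 * xb2 with xb = M x."""
    lam1, lam2, u1, u2 = eigen_symmetric_2x2(A)
    if not lam1 > 0.0 > lam2:
        raise ValueError("A must be indefinite")
    root = math.sqrt(-lam2 / lam1)           # sqrt(alpha)
    M = [[u1[0] + root * u2[0], u1[1] + root * u2[1]],
         [u1[0] - root * u2[0], u1[1] - root * u2[1]]]
    return lam1, M


def quadratic_form(A, x):
    return sum(A[i][j] * x[i] * x[j] for i in range(2) for j in range(2))


def bilinear_form(lam1, M, x):
    xb1 = M[0][0] * x[0] + M[0][1] * x[1]
    xb2 = M[1][0] * x[0] + M[1][1] * x[1]
    return lam1 * xb1 * xb2


A = [[3.0, -1.0], [-1.0, -5.0]]
lam1, M = bilinear_reduction(A)
print("type:", classify(*eigen_symmetric_2x2(A)[:2]))
print("lambda_1 =", lam1)
print("M =", M)
```

Eigenvectors are only determined up to sign and ordering, so the entries of `M` computed here agree with the matrix reported in Example 3.2 in magnitude, while their signs and arrangement depend on the convention used.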

3.2 Problem Formulation[Loc16]

In this section, we will reformulate the problem of computing the closed convex envelope into a group of subproblems as presented in [Loc16]. The maximum of the solutions of such subproblems, when computed over the given polytope, leads to a polyhedral subdivision of the polytope such that over each such division, the closed convex envelope belongs to a particular class of bivariate rational functions.

3.2.1 Problem Formulation

This section covers a new formulation of the closed convex envelope problem presented before in Definition 2.5. The closed convex envelope of any function $f$ at a point $(x_0, y_0) \in P$, where $P \subset \mathbb{R}^2$ is a convex set, is given by an optimization problem in $a, b, c$ with an infinite number of linear constraints as

\[
\begin{aligned}
\operatorname{conv} f_P(x_0, y_0) = \max \ & c \\
& f(x, y) - [a(x - x_0) + b(y - y_0) + c] \ge 0 \quad \forall (x, y) \in P.
\end{aligned}
\]

The optimal solution $(a^*, b^*, c^*)$ of this problem defines the supporting hyperplane to the closed convex envelope of $f$ at point $(x_0, y_0)$ over $P$.


The infinite number of constraints can be replaced with a single constraint as

\[
\begin{aligned}
\operatorname{conv} f_P(x_0, y_0) = \max \ & c \\
& \min_{(x, y) \in P} f(x, y) - [a(x - x_0) + b(y - y_0)] \ge c.
\end{aligned}
\]

Let us introduce the following assumption.

Assumption 3.2.1. For a polytope $P \subset \mathbb{R}^2$ and function $f \in C^2$,
− the Hessian of $f$ is indefinite in the interior of $P$, and
− along each edge of $P$, $f$ is either strictly convex or concave.

The first part of the assumption is reasonable, as we want to compute the closed convex envelope of a function which is neither convex nor concave. The second condition, however, is restrictive and requires careful selection of the polytope $P$, but it plays a crucial role in the new formulation of the convex envelope problem. Most of the 2D nonconvex nonlinear quadratic functions over any polytope $P$ satisfy these assumptions. For the quadratic functions that do not satisfy these assumptions, a change in the polytope can be used to make them valid under our assumptions.

Now under the above assumptions, the minimum of $f(x, y) - [a(x - x_0) + b(y - y_0)]$ over $P$ will always be attained either at a vertex of $P$ or over an edge such that the restriction of $f$ along the edge is a strictly convex function. So the problem of computing the closed convex envelope of $f$ over a polytope $P$ can then be written as

\[
\begin{aligned}
\operatorname{conv} f_P(x, y) = \max_{a,b,c}\ & c \\
\text{s.t.}\ & f(x_{v_i}, y_{v_i}) - [a(x_{v_i} - x) + b(y_{v_i} - y)] \ge c, && \forall\, i \in V'(P), \\
& \min_{(x', y') \in [v_{j1}, v_{j2}]} f(x', y') - [a(x' - x) + b(y' - y)] \ge c, && \forall\, j \in E'(P),
\end{aligned}
\tag{3.8}
\]

where

− $E'(P)$ is the index set of edges where $f$ is strictly convex,

− $\forall j \in E'(P)$, vertices $v_{j1}$ and $v_{j2}$ are the end points of the edge,

− $V'(P)$ is the set of all the remaining vertex indices not belonging to $E'(P)$,

− $\forall i \in V'(P)$, $(x_{v_i}, y_{v_i})$ are the coordinates of the corresponding vertex.

Example 3.3. Let $f(x, y) = xy$ and $P$ be a polytope with vertices

v1 = (0, 0), v2 = (1, 2), v3 = (2, 2), v4 = (2, 1).

The equations of the lines constituting the boundary of the polytope are $e_1 := y = 2x$ from $v_1$ to $v_2$, $e_2 := y = 2$ from $v_2$ to $v_3$, $e_3 := x = 2$ from $v_3$ to $v_4$, and $e_4 := y = x/2$ from $v_4$ to $v_1$, respectively. Since $f$ is strictly convex only along the lines $y = 2x$ and $y = x/2$, we have $E'(P) = \{1, 4\}$ and $V'(P) = \{3\}$.

A detailed formulation of the above steps has been presented in [LS14, Sec. 2]. In the above formulation, each constraint belongs either to an edge or to a vertex of the polytope $P$. However, the constraint belonging to an edge still remains a minimization problem in itself.

For $j \in E'(P)$, a convex edge with equation of line $y = m_j x + q_j$, let $f_j(x') = f(x', m_j x' + q_j)$ be the restriction of $f$ along the edge. Now the minimization problem related to this edge can be written as

\[
\min_{x' \in [x_{v_{j1}}, x_{v_{j2}}]} f_j(x') - [a(x' - x) + b(m_j x' + q_j - y)] \ge c. \tag{3.9}
\]

Let $s_j(a, b) = \arg\min\{(3.9)\}$. If the function is decreasing (resp. increasing), $s_j(a, b) = +\infty$ (resp. $-\infty$), otherwise

\[
f_j'(s_j(a, b)) = a + b m_j. \tag{3.10}
\]

Since $f$ is strictly convex along $[x_{v_{j1}}, x_{v_{j2}}]$, we have $f_j'(x_{v_{j2}}) \ge f_j'(x_{v_{j1}})$. Now the minimum will lie either in the interior of the interval or on its boundary. Let us define the three regions

\[
\begin{aligned}
D_j^- &= \{(a, b) : a + b m_j \le f_j'(x_{v_{j1}})\}, \\
D_j &= \{(a, b) : f_j'(x_{v_{j1}}) \le a + b m_j \le f_j'(x_{v_{j2}})\}, \\
D_j^+ &= \{(a, b) : a + b m_j \ge f_j'(x_{v_{j2}})\}.
\end{aligned}
\tag{3.11}
\]

The $\arg\min\{(3.9)\}$ is

\[
x_j(a, b) =
\begin{cases}
x_{v_{j1}} & \text{if } (a, b) \in D_j^-, \\
s_j(a, b) & \text{if } (a, b) \in D_j, \\
x_{v_{j2}} & \text{if } (a, b) \in D_j^+.
\end{cases}
\tag{3.12}
\]

Correspondingly,

\[
y_j(a, b) = m_j x_j(a, b) + q_j.
\]


Also, for the constraints belonging to the remaining vertex indices $i \in V'(P)$,

\[
x_i(a, b) = x_{v_i} \quad \text{and} \quad y_i(a, b) = y_{v_i}.
\]

So $\forall k \in V'(P) \cup E'(P)$, we define

\[
\eta_k(a, b) = f(x_k(a, b), y_k(a, b)) - a x_k(a, b) - b y_k(a, b). \tag{3.13}
\]

By using (3.13), the closed convex envelope problem (3.8) becomes

\[
\begin{aligned}
\operatorname{conv} f_P(x, y) = \max_{a,b,c}\ & c \\
\text{s.t.}\ & \eta_k(a, b) + ax + by \ge c, \quad \forall\, k \in V'(P) \cup E'(P).
\end{aligned}
\tag{3.14}
\]

The problem (3.14) is a simplified version of (3.8) where $\eta_k$ can take two different forms depending on whether $k$ corresponds to a convex edge or a vertex of the polytope.

3.2.2 Expressions for η(a, b)

The bilinear function, $f(x, y) = xy$, is strictly convex on an edge $y = m_j x + q_j$ if $m_j > 0$ (as $f_j''(x) = 2m_j$, and strict convexity requires $f_j''(x) > 0$). From (3.12), over such an edge

\[
s_j(a, b) = \frac{a + b m_j - q_j}{2 m_j}.
\]

So $\forall j \in E'(P)$,

\[
\eta_j(a, b) =
\begin{cases}
-\dfrac{(a + m_j b - q_j)^2}{4 m_j} - b q_j & (a, b) \in D_j, \\[2ex]
-a x_j(a, b) - b y_j(a, b) + x_j(a, b)\, y_j(a, b) & (a, b) \in D_j^- \cup D_j^+,
\end{cases}
\]

and, $\forall i \in V'(P)$ and vertex $(x_{v_i}, y_{v_i})$,

\[
\eta_i(a, b) = -a x_{v_i} - b y_{v_i} + x_{v_i} y_{v_i}. \tag{3.15}
\]

The derivation of the above expressions is presented in Appendix A.

Example 3.4. From Example 3.3, we have two convex edges, $e_1$ and $e_4$, with lines $y = 2x$ and $y = x/2$ between vertices $v_1 = (0, 0)$ to $v_2 = (1, 2)$ and $v_1 = (0, 0)$ to $v_4 = (2, 1)$ respectively. Now for the edge $e_1$,

\[
\eta_1(a, b) =
\begin{cases}
0 & a + 2b \le 0, \\[1ex]
-\dfrac{(a + 2b)^2}{8} & 0 \le a + 2b \le 4, \\[1ex]
2 - a - 2b & 4 \le a + 2b.
\end{cases}
\]

Similarly, for edge $e_4$,

\[
\eta_2(a, b) =
\begin{cases}
0 & a + \frac{b}{2} \le 0, \\[1ex]
-\dfrac{(a + b/2)^2}{2} & 0 \le a + \frac{b}{2} \le 2, \\[1ex]
2 - 2a - b & 2 \le a + \frac{b}{2}.
\end{cases}
\]

The indexing of the η functions does not matter, as our main objective is only to obtain the constraint set. Finally, for vertex $v_3$,

\[
\eta_3(a, b) = 4 - 2a - 2b.
\]
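The piecewise expressions of Example 3.4 can be checked numerically. The following is a minimal Python sketch, assuming the bilinear case $f(x, y) = xy$ and an edge with end points ordered by increasing $x$; the function names `eta_edge` and `eta_vertex` are illustrative and are not part of the thesis code.

```python
def eta_edge(a, b, v1, v2):
    """eta_j(a, b) for f(x, y) = x*y on a convex edge from v1 to v2 (x1 < x2, slope > 0)."""
    (x1, y1), (x2, y2) = v1, v2
    m = (y2 - y1) / (x2 - x1)      # slope m_j of the edge line y = m*x + q
    q = y1 - m * x1                # intercept q_j
    t = a + b * m                  # compared against f_j'(x) = 2*m*x + q
    lo, hi = 2 * m * x1 + q, 2 * m * x2 + q
    if t <= lo:                    # region D_j^-: minimum at the left end point
        x = x1
    elif t >= hi:                  # region D_j^+: minimum at the right end point
        x = x2
    else:                          # region D_j: interior critical point s_j(a, b)
        x = (t - q) / (2 * m)
    y = m * x + q
    return x * y - a * x - b * y   # definition (3.13)


def eta_vertex(a, b, v):
    """eta_i(a, b) for a remaining vertex v, as in (3.15)."""
    x, y = v
    return x * y - a * x - b * y


# Edge e1 of Example 3.4: for 0 <= a + 2b <= 4 the value is -(a + 2b)^2 / 8.
print(eta_edge(2.0, 0.0, (0, 0), (1, 2)))
```

Evaluating through the minimizer $x_j(a, b)$ rather than through the closed-form pieces keeps the sketch independent of the derivation in Appendix A, so it can serve as a cross-check of those pieces.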

3.2.3 Subproblems

From [Loc16, lemma 3.1], at any point in the polytope not lying on the strictly convex edges or vertices, the solution of problem (3.14) has at least two active constraints. So the optimization problem (3.14) can be reduced to subproblems for each pair of η functions. Hence, $\forall h, w \in E'(P) \cup V'(P)$ with $h \ne w$,

\[
\begin{aligned}
\operatorname{conv} f_P(x, y) = \max_{h, w}\ \max_{a, b}\ & \eta_h(a, b) + ax + by \\
\text{s.t.}\ & \eta_h(a, b) = \eta_w(a, b), \\
& \eta_h(a, b) \le \eta_k(a, b), \quad k \in E'(P) \cup V'(P) \setminus \{h, w\}.
\end{aligned}
\tag{3.16}
\]

Since η is a piecewise defined function with a polyhedral subdivision, solving the subproblems for each pair is the same as solving them over each member of the polyhedral subdivision obtained by overlaying the polyhedral subdivisions of all η functions. When arranged according to the subdivided region, the form of the subproblems becomes

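\[
\begin{aligned}
\max_{a, b}\ & \eta_h(a, b) + ax + by \\
\text{s.t.}\ & \eta_h(a, b) = \eta_w(a, b), \\
& \eta_h(a, b) \le \eta_k(a, b), \quad k \in E'(P) \cup V'(P) \setminus \{h, w\}, \\
& (a, b) \in S_r,
\end{aligned}
\tag{3.17}
\]

where

− the index $r$ of a subproblem ranges over $1, \dots, 3^{|E'(P)|}$, where $|E'(P)|$ is the cardinality of the set $E'(P)$ (see Definition 2.2),

− each polyhedral set $S_r$ is defined as

\[
S_r = \bigcap_{j \in E'(P)} \Gamma_j,
\]

where $\Gamma_j \in \{D_j^-, D_j, D_j^+\}$, i.e., all possible combinations of $D_j^-$, $D_j$, $D_j^+$ over all $j \in E'(P)$.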

The set $\mathcal{S} = \{S_r\}$ is the collection of all polyhedral sets such that over each $S_r$ there is a unique combination of pieces of the η functions. Also, $\bigcup_{r \in \{1, \dots, |\mathcal{S}|\}} S_r = \mathbb{R}^2$.

Example 3.5. From Example 3.4, the set $\mathcal{S} = \{\{a + b/2 \le 0,\ a + 2b \le 0\},\ \{a + b/2 \le 0,\ 0 \le a + 2b \le 4\},\ \{a + b/2 \le 0,\ a + 2b \ge 4\},\ \cdots,\ \{a + b/2 \ge 2,\ a + 2b \ge 4\}\}$. Now the subproblem defined over subregion $S_2$ is

\[
\begin{aligned}
\max\ & 0 + ax + by \\
\text{s.t.}\ & 0 = -\frac{(a + 2b)^2}{8}, \\
& 0 \le 4 - 2a - 2b, \\
& (a, b) \in S_2 = \{a + b/2 \le 0,\ 0 \le a + 2b \le 4\}.
\end{aligned}
\]

Similarly, the subproblem defined over subregion $S_5$ is

\[
\begin{aligned}
\max\ & -\frac{(a + 2b)^2}{8} + ax + by \\
\text{s.t.}\ & -\frac{(a + 2b)^2}{8} = -\frac{(a + b/2)^2}{2}, \\
& -\frac{(a + 2b)^2}{8} \le 4 - 2a - 2b, \\
& (a, b) \in S_5 = \{0 \le a + b/2 \le 2,\ 0 \le a + 2b \le 4\}.
\end{aligned}
\]
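The labelling of subregions can be illustrated with a short Python sketch. It assumes the two convex edges of Example 3.4, each summarized by its slope and the two derivative breakpoints; the names `EDGES`, `piece`, and `subregion` are hypothetical, and points on a shared boundary are assigned to the middle piece by convention.

```python
from itertools import product

# (m_j, f_j'(x_vj1), f_j'(x_vj2)) for edges e1 and e4 of Example 3.4 (assumed inputs).
EDGES = [(2.0, 0.0, 4.0), (0.5, 0.0, 2.0)]


def piece(a, b, edge):
    """Return '-', '0' or '+' for D_j^-, D_j or D_j^+ respectively."""
    m, lo, hi = edge
    t = a + b * m
    if t < lo:
        return "-"
    if t > hi:
        return "+"
    return "0"


def subregion(a, b, edges=EDGES):
    """Label of the set S_r containing (a, b): one piece per convex edge."""
    return tuple(piece(a, b, e) for e in edges)


# All 3^{|E'(P)|} candidate combinations (some may be empty as sets).
all_labels = list(product("-0+", repeat=len(EDGES)))

print(len(all_labels))         # number of candidate subregions
print(subregion(-2.0, 1.5))    # a + 2b = 1 and a + b/2 = -1.25, the region S_2 above
```

Each label fixes which piece of every η function is active, which is what makes the subproblems (3.17) smooth over a single $S_r$.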


Figure 3.1: η1’s polyhedral domain

3.3 Exponential subproblems

We rearranged the subproblems from (3.16) to (3.17) by using the information that each η belonging to a convex edge is a piecewise function with exactly 3 pieces defined over a polyhedral set. Solving the subproblems for each pair $(h, w)$ would eventually lead to solving them for each intersecting polyhedral set $S_r$. The index of a subproblem ranges over $\{1, \dots, 3^{|E'(P)|}\}$ because each η could potentially divide the $a$-$b$ space into 3 times the number of current divisions. This is best illustrated with the help of Figures 3.1-3.3.

Figures 3.1 and 3.2 show the polyhedral domains for some $\eta_1(a, b)$ and $\eta_2(a, b)$. For the bilinear function, the domain of $\eta_j(a, b)$, $\forall j \in E'(P)$, will always be divided by two parallel hyperplanes of the form $a + m_j b = f_j'(x_{v_l})$ for $l \in \{1, 2\}$, where $v_1$ and $v_2$ are the end points of the respective convex edge, and $m_j > 0$. Now solving the subproblems in (3.16) for $\eta_1(a, b)$ and $\eta_2(a, b)$ is the same as solving all the subproblems defined over each polyhedral set in Figure 3.3. Since every polyhedral set contains exactly one piece from each $\eta(a, b)$, the formulation (3.17) is the rearrangement of subproblems according to their subregions.

Figure 3.2: η2(a, b)’s polyhedral domain

Figure 3.3: Nine Subregions formed by overlaying domains of η1 and η2

These subregions are formed as a result of the intersection between regions $D_j^-$, $D_j$ and $D_j^+$ for all $\eta_j$. The number of such possible regions is $3^{|E'(P)|}$; however, out of all the $3^{|E'(P)|}$ combinations, a few will always be invalid due to the restriction $m_j > 0$. For the new subdivision to have 3 times the number of current members, each polyhedral set of the added subdivision needs to have a valid intersection with all the members of the current subdivision. This only holds up to 2 layers, as each new layer of subdivision is of polyhedral form; however, $3^{|E'(P)|}$ still serves as a good upper bound on the number of subregions. This exponential bound also leads to an $O(3^N)$ time complexity algorithm, as any algorithmic approach to this problem needs to generate all the subregions, whose number grows exponentially with the number of convex edges. A result involving 19 subregions formed by intersecting 3 $\eta_j(a, b)$ for $j \in E'(P)$ is shown in Figure 3.4.

Figure 3.4: 19 subregions, shown by different colors, formed by intersection of domains of 3 η(a, b)s belonging to a convex edge.

The problem of obtaining all the valid subregions belongs to a special class of map overlay problems, and an algorithm to obtain all such subregions is discussed in Section 3.7. Since there is an exponential number of subregions, the time complexity of the algorithm is exponential as well.

Although the number of subregions is exponential, the number of subproblems is much larger. The $\eta_i(a, b)$, $\forall i \in V'(P)$, do not contribute to any division of subregions, as they are defined over the full domain. However, they add to the number of subproblems to be solved over a subregion. For $N = |E'(P)|$ and $M = |V'(P)|$, we have to solve the subproblems for each η pair, which leads to $\binom{N+M}{2}$ problems over each subregion. Finally, the total number of subproblems is bounded above by $\binom{N+M}{2} 3^N$, which further increases the time complexity of the solving algorithm beyond the $3^N$ bound on the number of subregions. The best strategy, in this regard, is to choose a polytope that contains a low number of convex edges.
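The growth of this bound is easy to tabulate. The sketch below is illustrative only: `subproblem_bound` is a hypothetical helper that evaluates the counts stated above for $N$ convex edges and $M$ remaining vertices.

```python
from math import comb


def subproblem_bound(n_convex_edges, n_remaining_vertices):
    """Upper bounds: (subregions, eta pairs per subregion, total subproblems)."""
    n, m = n_convex_edges, n_remaining_vertices
    subregions = 3 ** n            # at most 3^N subregions S_r
    pairs = comb(n + m, 2)         # one subproblem per pair (h, w) of eta functions
    return subregions, pairs, pairs * subregions


# Example 3.3 has N = 2 convex edges and M = 1 remaining vertex.
print(subproblem_bound(2, 1))
for n in range(1, 6):
    print(n, subproblem_bound(n, 1)[2])
```

Already for five convex edges and one remaining vertex the bound exceeds three thousand subproblems, which is why a polytope with few convex edges is preferable.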

3.4 Optimal Solutions [Loc16]

In this section, we introduce the method proposed in [Loc16, Section 4], and then formulate the possible forms of the optimal solutions for the subproblems obtained in Section 3.2.3.

3.4.1 Solving the subproblems

The method proposed can be summarized as follows:

Step 1 Use the equality constraint to find the relation between the variables which leads to eliminating one variable from the subproblem. Then use this relation to reduce the subregion to one dimension.

Step 2 Using the relation in step 1, obtain the feasible region by solving all the inequalities. The feasible region will either get reduced to a union of finite number of one-dimensional intervals or become infeasible. In the former case, the subproblem gets further divided into subproblems over each interval.

Step 3 Each one-dimensional problem is then solved to obtain the closed form solutions with a polyhedral subdivision of $\mathbb{R}^2$.

Step 4 After obtaining the solutions for each subproblem, the closed convex envelope is the maximum of all such optimal solutions over the polytope. This, again, leads to a further subdivision of the polytope such that over each member of the subdivision, the closed convex envelope has one of the explicit forms discussed in Section 3.4.2.

For the bilinear function, η has either a quadratic or a linear form as defined in (3.15). Solving the equality constraint in Step 1 then leads to three cases: quadratic-quadratic (qq), quadratic-linear (ql), and linear-linear (ll). For qq and ll, the relation between $a$ and $b$ comes out to be linear, of the form $a = \alpha_1 b + \alpha_0$ for $\alpha_1, \alpha_0 \in \mathbb{R}$. The case qq leads to two such linear relations resulting from the roots of a quadratic function. The case ql is special and requires a new variable $Z$ such that $b = \beta_2 Z^2 + \beta_1 Z + \beta_0$ for $\beta_i \in \mathbb{R}$, and $a = Z - m_j b$. These relations are then used to reduce the subregions, $S_r$, to a union of one-dimensional intervals.

Step 2 then leads to solving inequalities for all the remaining η functions over each one-dimensional region. Step 3 leads to formulating and solving the one-dimensional problem, while Step 4 involves computing the maximum of all the solutions of the one-dimensional optimization problems. A detailed formulation of Steps 1-3 is presented in Appendix B.

3.4.2 Functional forms

Let us introduce the following functional forms

\[
f_r(x, y) = \frac{\xi_1^2(x, y)}{\xi_2(x, y)} + \xi_0(x, y), \tag{3.18}
\]

\[
f_q(x, y) = \frac{\xi_1^2(x, y)}{\zeta} + \xi_0(x, y), \quad \zeta > 0, \tag{3.19}
\]

\[
f_l(x, y) = \zeta \xi_1(x, y) + \xi_0(x, y), \tag{3.20}
\]

where $\xi_i(x, y)$ are linear functions in $x$ and $y$, and $\zeta \in \mathbb{R}$.

The functional forms $f_q$ and $f_l$ are just special cases of $f_r$ and can be obtained by setting the coefficients of $x, y$ in $\xi_2(x, y)$ to zero, and the constant term to $\zeta$ and $1/\zeta$ respectively. Let $L = \{(x, y) : \alpha_{10} x + \alpha_{01} y + \alpha_{00} = 0\}$ where $\alpha_{01}, \alpha_{10}, \alpha_{00} \in \mathbb{R}$. In the next section, we will observe that the solutions to all the subproblems have one of the three functional forms mentioned above.

3.4.3 Solutions

After solving the equality and all the inequalities in Steps 1 and 2, each subproblem gets reduced to the form

\[
\begin{aligned}
\max\ & \eta_h(a, b) + ax + by \\
\text{s.t.}\ & d_l \le (b \text{ or } Z) \le d_u.
\end{aligned}
\tag{3.21}
\]

By substituting the relation between $a$ and $b$ in (3.21), we get

\[
\begin{aligned}
\max\ & -\xi_2(x, y) u^2 + 2u\, \xi_1(x, y) + \xi_0(x, y) \\
\text{s.t.}\ & d_l \le u \le d_u,
\end{aligned}
\tag{3.22}
\]

where $u = b$ or $Z$ (depending upon the case) and $\xi_2$, $\xi_1$ and $\xi_0$ are linear functions in $x$ and $y$. Since (3.22) is differentiable and concave in $u$, to compute the maximum we compute the critical points.


Case 1: ηh(a, b) quadratic, ηw(a, b) quadratic

When both $\eta(a, b)$ functions are quadratic, the equality constraint reduces to the linear relations

\[
a = \alpha_1 b \pm \alpha_0.
\]

After substituting one of the above relations in (3.21), we obtain

\[
\begin{aligned}
\max\ & -\xi_2 b^2 + 2b\, \xi_1(x, y) + \xi_0(x, y) \\
\text{s.t.}\ & d_l \le b \le d_u,
\end{aligned}
\tag{3.23}
\]

where $\xi_1, \xi_0 \in L$, $\xi_2 > 0$, and

\[
\begin{aligned}
\xi_2 &= \frac{(\alpha_1 + m_h)^2}{4 m_h} \ge 0, \\
\xi_1(x, y) &= -\frac{(\alpha_1 + m_h)(\alpha_0 - q_h)}{2 m_h} + y - q_h + \alpha_1 x, \\
\xi_0(x, y) &= -\frac{(\alpha_0 - q_h)^2}{4 m_h} + \alpha_0 x.
\end{aligned}
\]

Since (3.23) is differentiable and strictly concave in $b$, its optimal value is

\[
\begin{cases}
\dfrac{\xi_1^2(x, y)}{\xi_2} + \xi_0(x, y) & \text{if } d_l \xi_2 < \xi_1(x, y) < d_u \xi_2, \\[2ex]
-\xi_2 d_l^2 + 2 d_l \xi_1(x, y) + \xi_0(x, y) & \text{if } d_l \xi_2 \ge \xi_1(x, y), \\[1ex]
-\xi_2 d_u^2 + 2 d_u \xi_1(x, y) + \xi_0(x, y) & \text{otherwise.}
\end{cases}
\tag{3.24}
\]

For the quadratic-quadratic case, the closed form expression of the potential convex envelope is defined by a polyhedral subdivision with three pieces (3.24) such that over each piece, it has form (3.19) or (3.20).

Case 2: ηh(a, b) quadratic, ηw(a, b) linear

In this case, we define a variable $Z = a + m_h b$ such that the equality constraint leads to

\[
b = \beta_2 Z^2 + \beta_1 Z + \beta_0, \qquad a = Z - m_h b.
\]

Now the optimization problem becomes

\[
\begin{aligned}
\max\ & -\xi_2(x, y) Z^2 + 2Z\, \xi_1(x, y) + \xi_0(x, y) \\
\text{s.t.}\ & d_l \le Z \le d_u,
\end{aligned}
\tag{3.25}
\]

where $\xi_2, \xi_1, \xi_0 \in L$ and

\[
\begin{aligned}
\xi_2(x, y) &= \beta_2 x - \beta_1 y + q_h \beta_2 + \frac{1}{4 m_h}, \\
\xi_1(x, y) &= \frac{1}{2}\left( (1 - \beta_1) x + \beta_1 y + \frac{q_h}{2 m_h} - q_h \beta_1 \right), \\
\xi_0(x, y) &= -\beta_0 x + \beta_0 y - \frac{q_h^2}{4 m_h} - q_h \beta_0.
\end{aligned}
\]

Since, in this case, the concavity of the objective function depends upon $\xi_2$, when $\xi_2(x, y) > 0$ the optimal value of the subproblem is

\[
\begin{cases}
\dfrac{\xi_1^2(x, y)}{\xi_2(x, y)} + \xi_0(x, y) & \text{if } d_l \xi_2(x, y) < \xi_1(x, y) < d_u \xi_2(x, y), \\[2ex]
-\xi_2(x, y) d_l^2 + 2 d_l \xi_1(x, y) + \xi_0(x, y) & \text{if } d_l \xi_2(x, y) \ge \xi_1(x, y), \\[1ex]
-\xi_2(x, y) d_u^2 + 2 d_u \xi_1(x, y) + \xi_0(x, y) & \text{otherwise,}
\end{cases}
\tag{3.26}
\]

and when $\xi_2(x, y) \le 0$,

\[
\max\{-\xi_2(x, y) d_l^2 + 2 d_l \xi_1(x, y) + \xi_0(x, y),\ -\xi_2(x, y) d_u^2 + 2 d_u \xi_1(x, y) + \xi_0(x, y)\}. \tag{3.27}
\]

The expression (3.27) may further divide the region $\xi_2(x, y) \le 0$ into two subregions if the line along which the two functions intersect passes through $\xi_2(x, y) \le 0$. Finally, in this case, the potential closed convex envelope function is characterized by a polyhedral subdivision with 4 (or 5) pieces such that over each piece it has form (3.18) or (3.20).

Case 3: ηh(a, b) linear, ηw(a, b) linear

In this case, the equality constraint leads to a linear relation

\[
a = \alpha_1 b + \alpha_0.
\]

Since both $\eta(a, b)$ functions are linear, the subproblem reduces to

\[
\begin{aligned}
\max\ & 2b\, \xi_1(x, y) + \xi_0(x, y) \\
\text{s.t.}\ & d_l \le b \le d_u,
\end{aligned}
\tag{3.28}
\]

where $\xi_1, \xi_0 \in L$ with

\[
\begin{aligned}
\xi_1(x, y) &= \alpha_1 x + y - \alpha_1 x_h - y_h, \\
\xi_0(x, y) &= \alpha_0 x + x_h y_h - \alpha_0 x_h.
\end{aligned}
\]

Since the objective function is linear, the optimal value lies at the boundaries, that is,

max {dlξ1(x, y) + ξ0(x, y), duξ1(x, y) + ξ0(x, y)}.

The solution, in this case, leads to a polyhedral subdivision with only two pieces such that over each piece, the potential convex envelope has form (3.20).
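The three cases share one numerical pattern once $\xi_2$, $\xi_1$ and $\xi_0$ have been evaluated at a point $(x, y)$. The following Python sketch assumes the convention of (3.22), that is, the objective $-\xi_2 u^2 + 2u\xi_1 + \xi_0$ on $[d_l, d_u]$; the name `solve_1d` is illustrative and this is not the thesis solver.

```python
def solve_1d(xi2, xi1, xi0, dl, du):
    """Maximize -xi2*u^2 + 2*xi1*u + xi0 over dl <= u <= du; return (value, argmax)."""

    def objective(u):
        return -xi2 * u * u + 2.0 * xi1 * u + xi0

    candidates = [dl, du]              # end points always compete
    if xi2 > 0:                        # strictly concave: interior critical point
        u_star = xi1 / xi2
        if dl < u_star < du:
            candidates.append(u_star)
    best = max(candidates, key=objective)
    return objective(best), best


# Concave case with an interior maximizer: the value equals xi1^2 / xi2 + xi0.
print(solve_1d(2.0 / 9.0, -0.5, -3.0, -9.0, -1.0))
```

When `xi2 <= 0` only the two end points are compared, which corresponds to (3.27) and to the linear-linear case.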

3.5 Convex envelope: the maximum of all subproblems

The solutions of all the subproblems can be generalized as a piecewise function of form (3.18)-(3.20) with a polyhedrally divided domain in $\mathbb{R}^2$. Now the closed convex envelope is the pointwise maximum of all such solutions, that is,

\[
\operatorname{conv} f_P = \max_{S \in \mathcal{S}} u_S(x, y),
\]

where $\mathcal{S}$ is the set of all subproblems and $u_S$ is the solution of each.

Every solution is defined over the full domain $\mathbb{R}^2$ and each solution provides at least two candidate expressions for the closed convex envelope. Computing the closed convex envelope leads to solving a map overlay problem with each solution as a different layer. Since we are only interested in the functions defined over the polytope $P$, the contribution from each subproblem reduces to at least one. Hence for $N$ valid subproblems, we need to find the maximum of $N$ expressions over each member of the resulting subdivision of $P$.

From [Loc16, lemma 5.2], for $S_1, S_2 \in \mathcal{S}$, $S_1 \ne S_2$, the points $(x, y)$ such that

\[
u_{S_1}(x, y) = u_{S_2}(x, y) = \max_{S \in \mathcal{S}} u_S(x, y)
\]

lie along a line, as the functions $u_{S_1}$ and $u_{S_2}$ are linear. The quadratic and rational functions are the results of subproblems for $h \in E'(P)$, $(a, b) \in \operatorname{int}(D_h)$, and the optimal solutions that could be shared by two subproblems would always lie at the boundaries of the convex edge, that is, when $(a, b) \in D_h^-, D_h^+$, and over these regions the solution expressions are linear. Hence, if the intersecting line passes through the polyhedral set over which both are defined, it leads to a further subdivision of that region. However, the subdivision obtained after finding the maximum of all expressions would still retain the polyhedral form.


So finally, the closed convex envelope would again be characterized by a polyhedral subdivision such that over each member of the subdivision it has one of the three functional forms discussed in Section 3.4.2.

3.6 Algorithmic Design

Our algorithm is based on the same design as discussed in Section 3.4.1. An algorithm to compute the closed convex envelope using the above method is quite complex, as it requires generating and potentially solving an exponential number of problems with sufficient floating point accuracy. Even a small floating point error could make an infeasible subproblem feasible and vice versa.

We also need to create a new module including appropriate data structures and algorithms for all the 2D geometry manipulations required in the process. This involves methods for the map overlay problem, which for $N$ layers could be exponential in nature (see Section 3.3), and for the vertex enumeration problem, to efficiently convert between halfspaces and vertices of a polytope.

3.6.1 Input and Output Data Structures

The data structure for the input quadratic function involves storing its 6 coefficients in an array. For $Q(x, y) = ax^2 + bxy + cy^2 + dx + ey + f$, we store the coefficients as Qxy := [a b c d e f]. A linear function can be stored similarly by setting coefficients a = b = c = 0.

The input polytope with $N$ vertices is stored as an $N \times 2$ matrix where each row contains a vertex and the rows are sorted in the clockwise direction as shown in Figure 3.5. Since a 2D polytope is characterized by an edge-vertex (or vertex-edge) relationship, storing it requires storing all such relationships. One way would be to store these relationships as a graph with adjacency lists. However, this leads to redundant values in the data structure. Another way would be to store the polytope as a set of linear inequalities (halfspaces). This representation follows directly from the definition of a polytope as the intersection of a finite number of halfspaces. These inequalities can be stored in a matrix, but in that case, obtaining the adjacent vertices and edges would require $O(N)$ computations, whereas obtaining all the edge-vertex relationships would lead to $O(N^2)$ operations.

Figure 3.5: Input Data Structure

To overcome this issue, we introduced the clockwise direction in our data structure. When vertices are stored in clockwise direction, obtaining the neighborhood information is just a constant time operation, as adjacent vertices are at index ±1, and the edge between them can be obtained by computing the equation of the line passing through the two vertices it joins, which is again $O(1)$. We only store the vertices since, for a convex polytope, by using the clockwise direction we can obtain all the halfspaces in $O(N)$ time and vice-versa.

Example 3.6. Let $Q(x, y) = 2x^2 + xy - 10y^2 + 5x - 17y - 120$ be defined over a polytope formed by the vertex set $V = \{(3, 10), (9, -5), (9, 10), (-8, 1), (7, -8)\}$. Then its input data structure representation would be Qxy := [2, 1, −10, 5, −17, −120], and

\[
P := \begin{pmatrix} 3 & 10 \\ 9 & 10 \\ 9 & -5 \\ 7 & -8 \\ -8 & 1 \end{pmatrix}.
\]
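The clockwise vertex storage can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the module used in the thesis: the vertices are assumed to be in convex position, the sort is by decreasing angle around the centroid, and the names `sort_clockwise`, `neighbours`, and `edge_halfspace` are hypothetical.

```python
import math


def sort_clockwise(vertices):
    """Sort vertices of a convex polygon clockwise (decreasing angle about the centroid)."""
    cx = sum(x for x, _ in vertices) / len(vertices)
    cy = sum(y for _, y in vertices) / len(vertices)
    return sorted(vertices, key=lambda p: -math.atan2(p[1] - cy, p[0] - cx))


def neighbours(P, i):
    """Adjacent vertices of P[i] in O(1): the entries at index i - 1 and i + 1 (cyclic)."""
    n = len(P)
    return P[(i - 1) % n], P[(i + 1) % n]


def edge_halfspace(p, q):
    """(A, B, C) such that A*x + B*y <= C holds inside a clockwise polygon with edge p -> q."""
    (x1, y1), (x2, y2) = p, q
    A, B = y1 - y2, x2 - x1        # the interior lies to the right of the direction p -> q
    return A, B, A * x1 + B * y1


V = [(3, 10), (9, -5), (9, 10), (-8, 1), (7, -8)]    # vertex set of Example 3.6
P = sort_clockwise(V)
halfspaces = [edge_halfspace(P[i], P[(i + 1) % len(P)]) for i in range(len(P))]
print(P)
print(halfspaces)
```

The starting vertex of the sorted list depends on the sort key and is otherwise arbitrary; only the cyclic order matters for the index ±1 neighbourhood lookup and for the $O(N)$ halfspace conversion.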

Figure 3.6: Output Data Structure

We follow the same pattern for our output data structure. Since the closed convex envelope is a piecewise defined rational function over a polyhedrally divided domain, the coefficients of each function are stored in an array, and each polytope in a vertex matrix. So for

\[
r(x, y) = \frac{ax^2 + bxy + cy^2 + dx + ey + f}{ux + vy + w},
\]

the function is stored as

Rxy := [a b c d e f u v w],

and a polytope with $M$ vertices as an $M \times 2$ vertex matrix sorted in clockwise direction. For a closed convex envelope characterized by $N$ pieces, we store all of the functions in an $N \times 9$ matrix, and its corresponding polyhedral regions in an $N \times 1$ cell with each element representing a polytope as shown in Figure 3.6.

Example 3.7. Let $Q(x, y) = xy$ be defined over the polytope $P = \{(-5, 5), (1, 3), (-1, 0), (-5, -4)\}$; then its closed convex envelope is given by

\[
\operatorname{conv}_{Q,P}(x, y) =
\begin{cases}
\dfrac{5x^2 - xy + 5y^2 + 30x - 30y + 25}{x - y + 10} & (x, y) \in P_1, \\[2ex]
\dfrac{15x^2 - 3xy + 10y^2 + 90x - 65y + 75}{3x - 2y + 25} & (x, y) \in P_2,
\end{cases}
\]

where $P_1$ and $P_2$ are triangles with vertices $\{(-5, 5), (-1, 0), (-5, -4)\}$ and $\{(-5, 5), (1, 3), (-1, 0)\}$. So its output data structure representation is

fxps(1,:) = [5, −1, 5, 30, −30, 25, 1, −1, 10],
fxps(2,:) = [15, −3, 10, 90, −65, 75, 3, −2, 25],

and

\[
\text{rgns\{1,1\}} = \begin{pmatrix} -5 & 5 \\ -1 & 0 \\ -5 & -4 \end{pmatrix}, \qquad
\text{rgns\{2,1\}} = \begin{pmatrix} -5 & 5 \\ 1 & 3 \\ -1 & 0 \end{pmatrix}.
\]
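A stored row of nine coefficients can be evaluated directly. The Python sketch below is illustrative (the helper `eval_rational` is not part of the thesis code) and uses the two rows of Example 3.7; it checks that the envelope agrees with $xy$ at some vertices and on the convex edge from $(-1, 0)$ to $(-5, -4)$. The vertex $(-5, 5)$ is avoided because both denominators vanish there.

```python
def eval_rational(row, x, y):
    """Evaluate (a x^2 + b xy + c y^2 + d x + e y + f) / (u x + v y + w) for a 1x9 row."""
    a, b, c, d, e, f, u, v, w = row
    numerator = a * x * x + b * x * y + c * y * y + d * x + e * y + f
    denominator = u * x + v * y + w
    return numerator / denominator


fxps = [
    [5, -1, 5, 30, -30, 25, 1, -1, 10],      # piece over P1
    [15, -3, 10, 90, -65, 75, 3, -2, 25],    # piece over P2
]

print(eval_rational(fxps[0], -5, -4))    # vertex (-5, -4), where xy = 20
print(eval_rational(fxps[1], 1, 3))      # vertex (1, 3), where xy = 3
print(eval_rational(fxps[0], -3, -2))    # midpoint of a convex edge, where xy = 6
```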

3.6.2 Main Algorithm

Our main algorithm is presented in Algorithm 1. It takes two input arguments, Qxy and Pset, following the data structures discussed in Section 3.6.1. We start by verifying that the input quadratic function has an indefinite Hessian and then proceed to transform it into a bilinear term using the linear transformations discussed in Section 3.1. We use the transformation matrix to obtain the new transformed polytope points. After that, we separate the convex edges and the remaining vertices, and store them in the variable vertexSet. The vertexSet is a cell with two elements where vertexSet{1, 1} contains the edges as rows in the form $[v_1, v_2]$, with $v_1$ and $v_2$ as the end points of the edge with $v_{1x} \le v_{2x}$, while vertexSet{2, 1} contains the remaining vertices as rows.

Example 3.8. Let Qxy = xy be defined over the polytope with Pset = [−5, 5; 1, 3; −1, 0; −5, −4]. Now the function is strictly convex over the edges {(1, 3), (−1, 0)} and {(−1, 0), (−5, −4)}. So

\[
\text{vertexSet\{1, 1\}} = \begin{pmatrix} -1 & 0 & 1 & 3 \\ -5 & -4 & -1 & 0 \end{pmatrix}, \quad \text{and} \quad \text{vertexSet\{2, 1\}} = [-5, 5].
\]
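The separation into convex edges and remaining vertices can be sketched as follows for the bilinear function, where an edge is strictly convex exactly when its slope is positive. The helper name `split_edges` and the tuple-based storage are assumptions made for illustration; the thesis routine may differ.

```python
def split_edges(P):
    """For f(x, y) = x*y and a clockwise vertex list P, return (convex_edges, remaining)."""
    n = len(P)
    convex, on_convex = [], set()
    for i in range(n):
        p, q = P[i], P[(i + 1) % n]
        dx, dy = q[0] - p[0], q[1] - p[1]
        if dx * dy > 0:                   # slope dy/dx > 0, so f_j'' = 2 m_j > 0
            v1, v2 = sorted([p, q])       # order the end points so that v1x <= v2x
            convex.append(v1 + v2)        # row [v1x, v1y, v2x, v2y]
            on_convex.update([p, q])
    remaining = [v for v in P if v not in on_convex]
    return convex, remaining


# Polytope of Example 3.8.
print(split_edges([(-5, 5), (1, 3), (-1, 0), (-5, -4)]))
# Polytope of Example 3.3.
print(split_edges([(0, 0), (1, 2), (2, 2), (2, 1)]))
```

Vertical and horizontal edges give `dx * dy == 0`, so the restriction of $xy$ is linear there and the edge is not counted as strictly convex.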

We use the vertexSet to compute the η functions according to the convex edge and remaining vertex cases, and store them in the variables convexEta and nonconvexEta respectively. Since η functions are either linear or quadratic (see Section 3.2.2) in $a, b$, we store the linear term $x_v a + y_v b + c$ as $[x_v\ y_v\ c]$, and the quadratic term $-\dfrac{(a + m_j b - q_j)^2}{4 m_j} - b q_j$ as $[\infty\ m_j\ q_j]$. The ∞ signifies the quadratic form of the η function.

The element convexEta{1, 1} is a matrix with $3\,|E'(P)|$ rows and 3 columns containing all the η functions for the convex edges (three pieces per edge), while convexEta{2, 1} stores all the corresponding regions over which they are defined. Since the regions are either $D_j^- = \{(a, b) : a + b m_j \le u_1\}$, $D_j^+ = \{(a, b) : l_1 \le a + b m_j \le +\infty\}$ or $D_j = \{(a, b) : l_2 \le a + b m_j \le u_2\}$, they are stored as $[-\infty, 1, m_j, u_1]$, $[l_1, 1, m_j, +\infty]$ and $[l_2, 1, m_j, u_2]$ respectively. We follow the same pattern for nonconvexEta, but since these functions are defined over the full domain, their regions are stored as $[-\infty, 0, 0, +\infty]$.

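Example 3.9. For the function and polytope in Example 3.8, the η functions are

\[
\eta_1(a, b) =
\begin{cases}
a & a + \frac{3b}{2} \le -\frac{3}{2}, \\[1ex]
-\dfrac{\left(a + \frac{3b}{2} - \frac{3}{2}\right)^2}{6} - \dfrac{3b}{2} & -\frac{3}{2} \le a + \frac{3b}{2} \le \frac{9}{2}, \\[2ex]
-a - 3b + 3 & \frac{9}{2} \le a + \frac{3b}{2},
\end{cases}
\]

\[
\eta_2(a, b) =
\begin{cases}
5a + 4b + 20 & a + b \le -9, \\[1ex]
-\dfrac{(a + b - 1)^2}{4} - b & -9 \le a + b \le -1, \\[2ex]
a & -1 \le a + b,
\end{cases}
\]

and

\[
\eta_3(a, b) = 5a - 5b - 25.
\]

So they are stored as

\[
\text{convexEta\{1, 1\}} = \begin{pmatrix} 1 & 0 & 0 \\ \infty & 1.5 & 1.5 \\ -1 & -3 & 3 \\ 5 & 4 & 20 \\ \infty & 1 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \qquad
\text{convexEta\{2, 1\}} = \begin{pmatrix} -\infty & 1 & 1.5 & -1.5 \\ -1.5 & 1 & 1.5 & 4.5 \\ 4.5 & 1 & 1.5 & \infty \\ -\infty & 1 & 1 & -9 \\ -9 & 1 & 1 & -1 \\ -1 & 1 & 1 & \infty \end{pmatrix},
\]

and

nonconvexEta{1, 1} = [5, −5, −25], nonconvexEta{2, 1} = [−∞, 0, 0, ∞].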
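The storage rows of Example 3.9 can be generated from a vertexSet row. The following Python sketch is an assumption-laden illustration for the bilinear function only: `convex_eta_rows` is a hypothetical helper, and a linear piece at a vertex $(x_v, y_v)$ is stored as $[-x_v, -y_v, x_v y_v]$, which follows from (3.15).

```python
import math


def convex_eta_rows(edge):
    """Return (eta rows, region rows) for one convex edge [x1, y1, x2, y2] with x1 < x2."""
    x1, y1, x2, y2 = edge
    m = (y2 - y1) / (x2 - x1)                   # slope m_j
    q = y1 - m * x1                             # intercept q_j
    lo, hi = 2 * m * x1 + q, 2 * m * x2 + q     # f_j'(x) = 2 m x + q at both end points
    etas = [
        [-x1, -y1, x1 * y1],                    # linear piece on D_j^-
        [math.inf, m, q],                       # quadratic piece on D_j
        [-x2, -y2, x2 * y2],                    # linear piece on D_j^+
    ]
    regions = [
        [-math.inf, 1, m, lo],
        [lo, 1, m, hi],
        [hi, 1, m, math.inf],
    ]
    return etas, regions


etas1, regions1 = convex_eta_rows((-1, 0, 1, 3))
etas2, regions2 = convex_eta_rows((-5, -4, -1, 0))
print(etas1 + etas2)
print(regions1 + regions2)
```

Stacking the rows of both edges reproduces the six-row layout of convexEta{1, 1} and convexEta{2, 1} shown above.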

The convexEta and nonconvexEta are then used to generate all the possible subproblems. Since the subproblems are defined according to the subregions, we first solve the map overlay problem for all η functions and obtain all the subregions $S_r$. Then we use them to generate all the subproblems and store them in the variable subprobs. Each subproblem, sp, is again stored as a cell with two elements. The element sp{1, 1} contains the η functions, while sp{2, 1} contains the subregions in the same form as convexEta. All the subproblems are solved using the solve_all_subprobs() subroutine, which follows the same approach as presented in Section 3.4.1. A detailed description of the underlying algorithm and data structures is covered in Section 3.6.3.

The solutions of all the subproblems are then stored in the variables pxps and pcdns, where pxps is a cell containing all the solution expressions as cells, while pcdns stores all the solution regions as a cell of cells for the respective subproblems. Since the solution expressions are in the rational form, they are stored as a 1 × 9 array in the cells of pxps. On the other hand, each cell of pcdns stores the regions as a set of inequalities in the form [yf, m, c, gf], where yf and gf are flags for the y term and the “≥” sign respectively. This form for storing halfspaces is discussed in Section 3.7.

After we have computed all the solutions, we solve the map overlay problem to obtain the polyhedral subdivision of our transformed polytope nPset, and the maximum of all the expressions over each member of the subdivision. We use pxps, pcdns and nPset for this purpose and store the respective expressions and regions in nfxps and nrgns respectively. Then we use the linear transformation matrix to revert the transformations and obtain the closed convex envelope, with expressions stored in fxps and regions in rgns respectively. The data structure for fxps and rgns is covered in detail in Section 3.6.1.


Algorithm 1 Computation of closed convex envelope for 2D Quadratic functions

function cvxEnv2d(Qxy, Pset)
    qxp = Qxy(1:3);
    [isindefinite, isConvex] = hasIndefinite_EigenVal(qxp);
    if isindefinite then
        quadxp = [0,1,0,0,0,0];                                   ▷ Bilinear form
        [Tm, lambda1] = transform_quadxp_to_xy(Qxy);
        nPset = polytope_points_clockwise_sort(Pset*Tm);
        vertexSet = find_sconvex_edgesn(quadxp, nPset);
        [convexEta, nonconvexEta] = compute_eta(quadxp, vertexSet);
        if not isempty(convexEta{1,1}) then                       ▷ At least one convex edge
            Srs = compute_subregions(convexEta{2,1});
            ri = obtain_subregion_indices(Srs);
            subprobs = generate_subproblems(convexEta, nonconvexEta, ri);
        else
            subprobs{1,1}{1,1} = nonconvexEta{1,1};
            subprobs{1,1}{2,1} = nonconvexEta{2,1}(1,:);
        end if
        [pxps, pcdns] = solve_all_subprobs(subprobs);
        if not isempty(convexEta{1,1}) then
            [nfxps, nrgns, prgns] = compute_polyhed_subdiv(nPset, pcdns, pxps);
        else
            nfxps = pxps; nrgns = pcdns;
        end if
        [fxps, rgns] = revert_transformations_from_xy(nfxps, nrgns, Tm, lambda1, Qxy, Pset);
    else
        if isConvex then
            fxps = [Qxy, 0, 0, 1]; rgns{1,1} = Pset;              ▷ Convex case
        else
            print “Function is Concave.”                          ▷ Function is concave!
        end if
    end if
    return fxps, rgns
end function
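The first test of Algorithm 1 can be sketched through the determinant of the Hessian instead of its eigenvalues. The Python function below is a hypothetical stand-in for the indefiniteness check: for $Q(x, y) = ax^2 + bxy + cy^2 + \dots$ the Hessian is $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$, which is indefinite exactly when $4ac - b^2 < 0$.

```python
def classify_quadratic(Qxy):
    """Return (is_indefinite, is_convex) for Qxy = [a, b, c, d, e, f]."""
    a, b, c = Qxy[:3]
    det = 4 * a * c - b * b        # determinant of the Hessian [[2a, b], [b, 2c]]
    if det < 0:                    # eigenvalues of opposite sign
        return True, False
    # det >= 0: both eigenvalues share a sign, given by the diagonal entries.
    return False, (a >= 0 and c >= 0)


print(classify_quadratic([0, 1, 0, 0, 0, 0]))             # bilinear term xy
print(classify_quadratic([2, 1, -10, 5, -17, -120]))      # Q of Example 3.6
print(classify_quadratic([1, 0, 1, 0, 0, 0]))             # x^2 + y^2
```

With exact or well-scaled coefficients the determinant test avoids an eigenvalue computation; near `det == 0` a tolerance would be needed, in line with the floating point concerns raised in Section 3.6.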


3.6.3 Solving the subproblems

The algorithm to solve subproblems over a subregion, at its core, follows the same approach discussed in Section 3.4.1. In this section, we discuss the data structures and design decisions made for solving the subproblems in accordance with Algorithm 1.

In Algorithm 2, each subproblem sp is a cell of two elements, both of which are matrices. The matrix sp{1,1} stores all the coefficients of the η functions while sp{2,1} contains their respective subregions $S_r$. For each η pair, we solve the equality in cvxHull2d_equality_solver() and store the relation in the variable relab. The expressions for $a$ and $b$ are stored in a cell with relab{1,1} and relab{2,1} containing their respective relations. The element relab{1,1} is a vector (or matrix) of size 3 (or 2 × 3) containing the coefficients of the quadratic or linear relation(s) in $Z$ or $b$. The element relab{2,1} is empty if $a$ is defined in terms of $b$; otherwise it is a vector of size 3 containing the quadratic relation in $Z$, similar to relab{1,1}.

Using the relation, we reduce the subregion $S_r$ to one dimension. If the one-dimensional region is feasible (nonempty), we solve the inequality constraints for the remaining $\eta_j$. When all the inequalities are feasible, we formulate the one-dimensional optimization problem in $Z$ or $b$, knowing that it will have a valid solution. The problem is computed using compute_OP() and is stored in the variable mOP as a matrix of size 4 × 3. Each of the first three rows contains the coefficients of $x$, $y$ and the constant term of the linear function that multiplies a power of the optimization variable $u$ (see Section 3.4.3): row 1 corresponds to the quadratic term, row 2 to the linear term, and row 3 to the constant term in the optimization variable. Row 4 contains the case number of the problem for our solver function OP_Solver().

Example 3.10. An optimization problem of the form

\[
\begin{aligned}
\max\ & -\left(\frac{x}{36} - \frac{y}{36} + \frac{5}{18}\right) Z^2 + 2\left(\frac{x}{4} + \frac{y}{4}\right) Z + \frac{11x}{4} - \frac{11y}{4} + \frac{5}{2} \\
\text{s.t.}\ & -9 \le Z \le -1,
\end{aligned}
\]

defined in $Z$, would be stored as

\[
\text{mOP} = \begin{pmatrix} 1/36 & -1/36 & 5/18 \\ 1/4 & 1/4 & 0 \\ 11/4 & -11/4 & 5/2 \\ 0 & 1 & 0 \end{pmatrix}.
\]

The fourth row signifies that the optimization problem comes from the quadratic-linear case, and is defined in $Z$.


Algorithm 2 Algorithm for solving subproblems over one subregion

function solve_subprobs_over_subregion(sp)
    ae = sp{1,1}; ac = sp{2,1}; sc = 1;
    for i = 1 to size(ae) do
        for j = i+1 to size(ae) do
            eh = ae(i,:); ew = ae(j,:);
            [relab, isfeasibleRelab] = cvxHull2d_equality_solver(eh, ew);
            if isfeasibleRelab then
                for ii = 1 to size(relab) do
                    ab = relab{ii,1};
                    [Sr1d, isfeasibleSr1d] = compute_Sr_1d(ac, ab);
                    if isfeasibleSr1d then
                        [feasible_region, isfeasible] = cvxHull2d_inequality_solver(eh, ew, Sr1d, ab, ae);
                        if isfeasible then
                            [mOP] = compute_OP(eh, ew, ab);
                            [txps, tcdns] = OP_Solver(mOP, feasible_region);
                            solxp{sc,1} = txps; solcdn{sc,1} = tcdns;
                            sc = sc + 1;
                        end if
                    else
                        print “The subproblem for eh,ew is infeasible.”
                    end if
                end for
            end if
        end for
    end for
end function


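Similarly, an example of an optimization problem resulting from solving a linear-linear equality in $b$ would be

\[
\begin{aligned}
\max\ & \left(\frac{5x}{4} + y + \frac{5}{4}\right) b + \frac{25x}{4} + \frac{25}{4} \\
\text{s.t.}\ & -\frac{29}{9} \le b \le -\frac{31}{11},
\end{aligned}
\]

and it would be stored as

\[
\text{mOP} = \begin{pmatrix} 0 & 0 & 0 \\ 5/4 & 1 & 5/4 \\ 25/4 & 0 & 25/4 \\ 0 & 1 & 1 \end{pmatrix}.
\]

The feasible regions are stored in the feasible_region variable.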

We use the solver function OP_Solver() to solve the optimization problem mOP and obtain the respective solution expressions and regions. The expressions are stored in txps as a cell of 1 × 9 vectors, and the regions in tcdns as a cell of matrices whose rows store a region of the form l ≤ ax + by + c ≤ u as [l, a, b, c, u, ef], where the ef flag differentiates between "≤" and "<", with 1 for the former and 0 for the latter case.

Example 3.11. For the optimization problem defined in Z in Example 3.10, the solution would be in two regions, that is, when $\frac{x}{36} - \frac{y}{36} + \frac{5}{18} > 0$, the function would be
\[
\begin{cases}
\dfrac{5x^2 - xy + 30x + 5y^2 - 30y + 25}{x - y + 10} & -(x - y + 10) < x + y < -\frac{1}{9}(x - y + 10) \\[2mm]
-4x - 5y - 20 & -(x - y + 10) \ge x + y \\[2mm]
\dfrac{20x - 29y + 20}{9} & -\frac{1}{9}(x - y + 10) \le x + y,
\end{cases}
\]
and when $\frac{x}{36} - \frac{y}{36} + \frac{5}{18} \le 0$, the function would be
\[
\max\left\{-4x - 5y - 20,\ \frac{20x - 29y + 20}{9}\right\}.
\]


So they would be stored in the variables txps and tcdns as

txps{1,1} = [5/36, −1/36, 5/36, 5/6, −5/6, 25/36, 1/36, −1/36, 5/18],
tcdns{1,1} = [0, 1/36, −1/36, 5/18, ∞, 0;
              0, 1/2, 0, 5/2, ∞, 0;
              −∞, 5/18, 2/9, 5/18, 0, 0],
txps{2,1} = [0, 0, 0, −4, −5, −20, 0, 0, 1],
tcdns{2,1} = [0, 1/36, −1/36, 5/18, ∞, 0;
              −∞, 1/2, 0, 5/2, 0, 1],
txps{3,1} = [0, 0, 0, 20/9, −29/9, 20/9, 0, 0, 1],
tcdns{3,1} = [0, 1/36, −1/36, 5/18, ∞, 0;
              0, 5/18, 2/9, 5/18, ∞, 1],
txps{4,1} = [0, 0, 0, −4, −5, −20, 0, 0, 1;
             0, 0, 0, 20/9, −29/9, 20/9, 0, 0, 1],
tcdns{4,1} = [−∞, 1/36, −1/36, 5/18, 0, 1].

More than one expression in an expression matrix signifies the maximum of all, for example in txps{4,1} the solution expression is defined in two linear expressions. Also, when we have more than one expression, we may have further subdivision of the region as well.
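As an illustration of how such expression rows might be read, the following sketch evaluates a 1 × 9 row at a point. The coefficient layout (numerator coefficients of x², xy, y², x, y, 1 followed by denominator coefficients of x, y, 1) is an assumption inferred from Example 3.11, and the helper names eval_expression and eval_solution are hypothetical; they are not part of the module.

```python
from fractions import Fraction as F

def eval_expression(row, x, y):
    """Evaluate one 1x9 expression row at (x, y).

    Assumed layout (inferred from Example 3.11, not confirmed by the module):
    row[0:6] are numerator coefficients of x^2, x*y, y^2, x, y, 1 and
    row[6:9] are denominator coefficients of x, y, 1.
    """
    row = [F(c) for c in row]          # exact rational arithmetic
    num = (row[0] * x * x + row[1] * x * y + row[2] * y * y
           + row[3] * x + row[4] * y + row[5])
    den = row[6] * x + row[7] * y + row[8]
    return num / den

def eval_solution(rows, x, y):
    """Several rows in one expression matrix mean the maximum of all of them."""
    return max(eval_expression(r, x, y) for r in rows)

# Rows copied from txps{1,1}, txps{2,1} and txps{3,1} above.
rational = [F(5, 36), F(-1, 36), F(5, 36), F(5, 6), F(-5, 6), F(25, 36),
            F(1, 36), F(-1, 36), F(5, 18)]
linear_a = [0, 0, 0, -4, -5, -20, 0, 0, 1]
linear_b = [0, 0, 0, F(20, 9), F(-29, 9), F(20, 9), 0, 0, 1]
```

Under this reading, eval_solution([linear_a, linear_b], x, y) plays the role of the two-row matrix txps{4,1}.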

3.7 Map Overlay problem

Map overlay is one of the most commonly occurring problems when working with piecewise functions in more than one dimension. Even an operation as simple as adding two piecewise functions requires solving a map overlay problem on their domains. The data structures used to store the domains play a significant role in simplifying the problem. The map overlay problem is exponential in nature, that is, for N layers with m subdivisions each, the resulting subdivision can have $m^N$ subregions. The type of domain division of the functions can add further complexity to the problem. For example, to solve the map overlay problem for two parabolic subdivisions, not only do we need to enumerate all the possible intersections, but we also need to add more divisions to the solution to properly store some subregions. However, to obtain the closed convex envelope of a 2D quadratic function, we only need to solve the problem for polyhedral subdivisions. Next, we design an efficient data structure to reduce the complexity of the problem in our module.


[Figure: the line y = mx + c separating the halfspaces y ≥ mx + c and y ≤ mx + c]

Figure 3.7: Space division by line y = mx + c

3.7.1 Data structures for polyhedral region

The most common form for representing an inequality is ax + by ≤ c where a, b, c ∈ R. The form can be stored in an array as [a b c]. For $u \in \mathbb{R}^n$, it is $c^T u \le d$ where $c \in \mathbb{R}^n$ and $d \in \mathbb{R}$, and the data structure can be expanded to size n + 1 to store inequalities of this form. This data structure is intuitive and performs well; however, the representation of an inequality in this form is not unique. For example, ax + by ≤ c and N(ax + by ≤ c) for any $N \in \mathbb{R}_{>0}$ are two different representations of the same inequality. To force uniqueness, the form can be normalized by dividing the inequality by the coefficient of either the x or the y term. But in that case, not all inequalities can be represented by this form, because we would need the "≥" sign too.

In our algorithm, we came across the map overlay problem twice: first, when generating all the possible subproblems in the a-b space, and second, when computing the polyhedral subdivision for the closed convex envelope in the given polytope. For generating the subproblems, we solve the map overlay problem for η functions with 3 polyhedral regions separated by two parallel hyperplanes. We stored the subregions $l \le a + m_j b \le u$ as $[l, 1, m_j, u]$, while the regions $a + m_j b \le u$ and $l \le a + m_j b$ are stored as $[-\infty, 1, m_j, u]$ and $[l, 1, m_j, \infty]$ respectively. As this is a normalized form too, there will always be a unique representation for each inequality; however, it cannot be used to store any general inequality in two variables.


To do so, we formulate it as either y ≤ mx + c or y ≥ mx + c and then store it as [yf, m, c, gf]. The motivation for this representation comes from the general equation of a line, y = mx + c. This hyperplane divides the $\mathbb{R}^2$ space into two halfspaces as shown in Figure 3.7. This form is normalized as well: by forcing the coefficient of y to 1, we make sure that there can only be one representation of an inequality. Any other representation obtained by multiplying by a number $N \in \mathbb{R}_{>0}$ reduces to the same form. When the coefficient of y is zero, we set the coefficient of x to 1, thereby maintaining the unique representation. To store this form, we need four values [yf, m, c, gf] compared to only three in [a, b, c], because uniqueness requires a flag for the "≥" sign. The variable yf is a flag for the y term: if it is 1, the inequality is defined in terms of x and y; otherwise it is defined only in x. The terms m and c are the corresponding terms from the inequality, with m forced to be 1 whenever yf = 0. The gf flag represents "≥" when it is 1 and "≤" when it is 0.

Example 3.12. For y ≥ 2x + 3, y ≤ 3x + 4, 0 ≥ x + 2, y ≥ 1, their data structure arrays would be [1, 2, 3, 1], [1, 3, 4, 0], [0, 1, 2, 1], and [1, 0, 1, 1] .

Using the four terms in our data structure, we can uniquely represent any general inequality with the least amount of space possible (two of them are booleans). The other advantage is that it is easier to substitute the value of y in our algorithms. In our module, we also have a subroutine convert inequalities() which can convert inequalities of form l ≤ ax+by+c ≤ u to [yf, m, c, gf] form.
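The normalization described in this section can be sketched as follows for an inequality given as ax + by ≤ c. The function name to_yf_m_c_gf is hypothetical and this is not the module's convert_inequalities() subroutine (which takes the form l ≤ ax + by + c ≤ u); it only illustrates, under that assumption, how dividing by the coefficient of y (or of x when y is absent) yields a unique [yf, m, c, gf] array.

```python
from fractions import Fraction

def to_yf_m_c_gf(a, b, c):
    """Convert a*x + b*y <= c into the normalized array [yf, m, c, gf].

    Hypothetical helper; it is not the module's convert_inequalities().
    yf = 1: y (>= or <=) m*x + c;  yf = 0: 0 (>= or <=) x + c, with m forced to 1.
    gf = 1 encodes ">=", gf = 0 encodes "<=".
    """
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if b != 0:
        # Divide by b; a negative b flips "<=" into ">=".
        return [1, -a / b, c / b, 1 if b < 0 else 0]
    if a == 0:
        raise ValueError("0 <= c is not an inequality in x or y")
    # a*x <= c  <=>  0 >= x - c/a (a > 0)  or  0 <= x - c/a (a < 0).
    return [0, Fraction(1), -c / a, 1 if a > 0 else 0]
```

For instance, y ≥ 2x + 3 written as 2x − y ≤ −3 and its positive multiple 4x − 2y ≤ −6 both map to [1, 2, 3, 1], matching the first array in Example 3.12.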

3.7.2 Sorting vertices in clockwise direction

As discussed before in Section 3.6.1, storing points in clockwise direction significantly reduces the amount of computation required to obtain the neighborhood information for a vertex or edge. Apart from being an input data structure, our algorithms heavily rely on the computation of new polytopes, so we need a method to sort the vertices in clockwise direction as well.

The algorithm to sort polytope vertices in clockwise direction is presented in Algorithm 3. The idea is taken from the working of a clock itself. For a given set of vertices, we first compute the centroid. Since the centroid always lies inside the polytope (as it is convex), we use this point as the center of the clock.


Algorithm 3 Sorting vertices of a polytope in clockwise direction

function polytope_points_clockwise_sort(P)
    cen = mean(P); ps = []; ns = [];
    for i = 1 to size(P) do
        ts = compute_slope(P(i,:), cen);
        if ts ≥ 0 or isinf(ts) then
            ps = [ps; [P(i,:), abs(ts)]];    ▷ positive slopes or ±∞
        else
            ns = [ns; [P(i,:), ts]];         ▷ negative slopes
        end if
    end for
    [q1p, q3p] = polytope_points_sort_Q13(ps, cen);    ▷ sorts according to slopes in quadrants 1 & 3
    [q2p, q4p] = polytope_points_sort_Q24(ns, cen);    ▷ sorts according to slopes in quadrants 2 & 4
    cwp = [q2p; q1p; q4p; q3p];
    return cwp
end function

We divide the $\mathbb{R}^2$ space into four quadrants using the centroid as the origin. Then we compute the slope of each line connecting the centroid with the vertices. The slopes for points in quadrants 1 and 3 are positive, while those in quadrants 2 and 4 are negative. The special cases of slopes 0 or ±∞ are handled together with quadrants 1 and 3. We sort the points in quadrants 1 and 3, and in quadrants 2 and 4, according to their slopes in the clockwise direction (opposite to how angles are conventionally measured) and obtain the sorted points in all 4 quadrants. The clockwise sorted points are stored in the variable cwp in the order: quadrants 2, 1, 4 and 3. It does not matter where the ordering begins as long as the points follow the clockwise order.

Example 3.13. For a polytope $P = \{(8, 10), (10, 7), (7, 8), (1, 8), (9, 4)\}$, we start with computing the centroid cen = (7, 7.4). Next, we compute the slope of all the lines connecting the points with the centroid as
\[
\frac{cen(2) - P(i, 2)}{cen(1) - P(i, 1)}
\]
to get
\[
sv = \begin{pmatrix} 13/5 \\ -2/15 \\ -\infty \\ -1/10 \\ -17/10 \end{pmatrix}.
\]

Assuming cen as the origin, we divide the polytope points according to positive slopes (including ±∞) and negative slopes, that is,
\[
psp = \begin{pmatrix} 8 & 10 & 13/5 \\ 7 & 8 & -\infty \end{pmatrix}, \qquad
nsp = \begin{pmatrix} 10 & 7 & -2/15 \\ 1 & 8 & -1/10 \\ 9 & 4 & -17/10 \end{pmatrix}.
\]

In the algorithm, we convert −∞ to ∞, so that psp contains all the points with positive slopes. This conversion is handled in polytope_points_sort_Q13(). After that, we divide the points with positive slopes into quadrants 1 and 3, and those with negative slopes into quadrants 2 and 4. So the points and slopes combination becomes
\[
q1 = \begin{pmatrix} 8 & 10 & 13/5 \\ 7 & 8 & -\infty \end{pmatrix}, \qquad q3 = [\ ]
\]
and
\[
q2 = \begin{pmatrix} 1 & 8 & -1/10 \end{pmatrix}, \qquad
q4 = \begin{pmatrix} 10 & 7 & -2/15 \\ 9 & 4 & -17/10 \end{pmatrix}.
\]

Finally, after sorting the points in each quadrant according to the arctan of their slopes, in decreasing order, we get
\[
q1p = \begin{pmatrix} 7 & 8 \\ 8 & 10 \end{pmatrix}, \qquad q3p = [\ ]
\]
and
\[
q2p = \begin{pmatrix} 1 & 8 \end{pmatrix}, \qquad
q4p = \begin{pmatrix} 10 & 7 \\ 9 & 4 \end{pmatrix}.
\]

So the polytope points in the clockwise order (according to the algorithm, the default order is [q2p; q1p; q4p; q3p]) are
\[
cwp = \begin{pmatrix} 1 & 8 \\ 7 & 8 \\ 8 & 10 \\ 10 & 7 \\ 9 & 4 \end{pmatrix}.
\]

The time complexity of this algorithm is bounded by the time required to sort the points by decreasing slopes in each quadrant. So for N vertices, the worst-case time complexity is $O(N \log N)$. Although sorting takes $O(N \log N)$ time, sorted points still perform better than unsorted points: obtaining the neighborhood information takes $O(1)$ operations, whereas it would have taken $O(N)$ operations with unsorted points. Also, the transformation between the vertex and halfspace representations takes only $O(N)$ computations, which would otherwise have been $O(N^2)$ with unsorted points.
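A compact way to reproduce the ordering of Example 3.13 is sketched below. Note that it swaps the per-quadrant slope sorting of Algorithm 3 for a single sort on the atan2 angle around the centroid, which is an assumption made for brevity; the function name sort_clockwise is hypothetical. It has the same $O(N \log N)$ cost.

```python
import math

def sort_clockwise(points):
    """Return the vertices of a convex polygon in clockwise order.

    Uses the centroid as the clock centre, like Algorithm 3, but ranks the
    points by atan2 angle instead of per-quadrant slopes (an assumption made
    for brevity). The ordering starts near angle pi (negative x-axis).
    """
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    # Decreasing counter-clockwise angle equals clockwise traversal.
    return sorted(points, key=lambda p: -math.atan2(p[1] - cy, p[0] - cx))

# Polytope from Example 3.13.
P = [(8, 10), (10, 7), (7, 8), (1, 8), (9, 4)]
cwp = sort_clockwise(P)
```

For this input the sketch starts in quadrant 2, so it yields the same sequence as cwp in Example 3.13; for other inputs the starting vertex may differ, which does not matter as long as the clockwise order is kept.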

3.7.3 Map Overlay Algorithm

Since a map overlay problem for N layers with m subdivisions each could result in $m^N$ subdivisions in the worst case, no algorithm aiming at finding the solution can perform better in the worst case than a brute-force method that enumerates all the possible subregions. However, a constructive algorithm can still be designed that builds the solution by adding one layer on top of another and gets rid of the infeasible subregions at each step. An example with three layers has been shown in Figures 3.1-3.4.

The main algorithm for the map overlay problem is presented in Algorithm 4. Its implementation in the module has the API compute_polyhed_subdiv() because, along with computing the subregions, we need to store the corresponding function expressions as well. However, the subroutine map_overlay_prob() presented here contains the main concept of the additive computation of subregions.

At each iteration, the computation of the solution subregions is performed in compute_intersection_subregions(). For two layers Sr1 and Sr2, we first enumerate all the possible subregion pairs, compute the intersection of each pair, and then remove the infeasible ones. If Sr1 and Sr2 have $n_1$ and $n_2$ subregions, then each call to this method takes $O(n_1 \times n_2)$ time. Now, at the subregion level, for two polyhedral sets sr1 and sr2, the resulting subregion is the intersection of their inequality sets. In our case, all the subregions are bounded polyhedral sets (as they are members of the subdivision of the input polytope). So we enumerate the inequality pairs


Algorithm 4 Constructive algorithm for map overlay problem for polytopes

function map_overlay_prob(rgns)
    Srs = {};                               ▷ initialize Srs as empty cell
    N = size(rgns);                         ▷ number of regions
    if N == 1 then                          ▷ if only one region
        Srs = rgns;
    else
        for i = 1 to N-1 do
            if i == 1 then
                Srs = compute_intersection_subregions(rgns{1,1}, rgns{2,1});
            else
                Srs = compute_intersection_subregions(Srs, rgns{i+1,1});
            end if
        end for
    end if
    return Srs
end function

function compute_intersection_subregions(Sr1, Sr2)
    nSrs = {};
    cnt = 1;                                ▷ counter for new subregions
    for i = 1 to size(Sr1) do
        for j = 1 to size(Sr2) do
            tmpsr = compute_intersection_of_two_subregions(Sr1{i,1}, Sr2{j,1});
            tmpPoly = compute_polytope_from_regions(tmpsr);
            if size(tmpPoly) > 2 then
                nSrs{cnt,1} = compute_polytope_region(tmpPoly);
                cnt = cnt + 1;
            end if
        end for
    end for
    return nSrs
end function


Algorithm 5 Computing intersection of two subregions

function compute_intersection_of_two_subregions(sr1, sr2)
    isIntersectionfeasible = true;
    for i = 1 to size(sr1) do
        for j = 1 to size(sr2) do
            isfeasible = intersection_of_two_halfspaces(sr1(i,:), sr2(j,:));
            if !isfeasible then
                isIntersectionfeasible = false;
                break;
            end if
        end for
        if !isfeasible then
            break;
        end if
    end for
    if isIntersectionfeasible then
        nsr = reduce_to_required_inequalities([sr1; sr2]);
        if isempty(nsr) then
            isIntersectionfeasible = false;
        end if
    else
        nsr = [];
    end if
    return nsr
end function


Figure 3.8: Vertex Enumeration Problem

from both sets and verify whether each pair is feasible. If at least one pair is not, we set the isIntersectionfeasible flag to false and return an empty set. When all the pairs are feasible, we reduce the set of inequalities to only the active ones and return it using nsr. The problem of reducing the inequalities to only the active inequalities, a variant of the Vertex Enumeration Problem, is known to be NP-hard for unbounded polyhedral sets [KBB+09, Lin86]. However, we only have to deal with polytopes, specifically 2D polytopes, as the subregions are members of the polyhedral subdivision of the input polytope.

The method of computing the vertices is the same as for the intersection of two subregions. For each inequality pair, we compute their intersection point and verify whether it satisfies all the other inequalities. Since for each pair we verify the point against every other inequality, the computation of the polytope takes $O(N^3)$ operations for N inequalities.

The closed convex envelope of a 2D nonconvex quadratic function is a piecewise rational function defined over a polyhedral subdivision of the input polytope. After computing the convex envelope, we proceed to compute the conjugate of each piece of the closed convex envelope in the next chapter.
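The pairwise vertex computation described in this section can be sketched as follows. The halfspaces are assumed to be given as triples (a, b, c) meaning ax + by ≤ c rather than in the [yf, m, c, gf] form, and the function name polytope_vertices is hypothetical; the sketch only illustrates the $O(N^3)$ pair-and-verify idea, not the module's implementation.

```python
from itertools import combinations

def polytope_vertices(halfspaces, tol=1e-9):
    """Vertices of {(x, y) : a*x + b*y <= c for every (a, b, c)}.

    Brute-force sketch: intersect every pair of boundary lines and keep the
    points satisfying all inequalities, i.e. O(N^3) work for N inequalities.
    """
    vertices = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(halfspaces, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:              # parallel boundary lines: no unique point
            continue
        x = (c1 * b2 - c2 * b1) / det   # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        feasible = all(a * x + b * y <= c + tol for a, b, c in halfspaces)
        is_new = all(abs(x - vx) > tol or abs(y - vy) > tol for vx, vy in vertices)
        if feasible and is_new:
            vertices.append((x, y))
    return vertices

# Unit square plus one redundant inequality x + y <= 3.
square = [(-1, 0, 0), (1, 0, 1), (0, -1, 0), (0, 1, 1), (1, 1, 3)]
# Infeasible intersection: x <= 0 together with x >= 1.
empty = [(1, 0, 0), (-1, 0, -1), (0, 1, 1), (0, -1, 0)]
```

An intersection that returns fewer than three vertices has no interior, which corresponds to the size(tmpPoly) > 2 test in Algorithm 4.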

Chapter 4

Conjugate of a class of Convex Bivariate Rational Functions over a polytope

The closed convex envelope of a bivariate nonconvex quadratic function defined over a polytope is a piecewise rational function of form (3.18) with a polyhedral subdivision. The conjugate of the convex piecewise rational function, say R, is then computed as
\[
R^*(s) = \max_{j \in \{1, \dots, N\}} R_j^*(s),
\]
where N is the total number of pieces and $R_j^*(s)$ is the conjugate of the $j$-th piece.

In this chapter, we discuss the computation of the conjugate of a convex rational function defined over a polytope. We start with computing the subdifferentials corresponding to each entity of the polytope and then compute the expressions over them to finally obtain the conjugate. All the functions involved here are in $\operatorname{Conv} \mathbb{R}^n$.

Lemma 4.1. For $x \in \mathbb{R}^n$, a convex set $P \subset \mathbb{R}^n$, and the indicator function
\[
I_P(x) = \begin{cases} 0 & x \in P \\ +\infty & \text{otherwise,} \end{cases}
\]
the subdifferential set $\partial I_P(x) = N_P(x)$, where $N_P(x)$ is the normal cone of P at the point x.

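Proof. The subdifferential of $I_P(x)$ is
\[
\begin{aligned}
\partial I_P(x) &= \{s \in \mathbb{R}^n : I_P(y) \ge I_P(x) + \langle s, y - x \rangle,\ \forall y \in P\} \\
&= \{s \in \mathbb{R}^n : 0 \ge 0 + \langle s, y - x \rangle,\ \forall y \in P\} \\
&= \{s \in \mathbb{R}^n : \langle s, y - x \rangle \le 0,\ \forall y \in P\} \\
&= N_P(x).
\end{aligned}
\]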


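Lemma 4.2. For a function $g : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ with $\operatorname{dom}(g) = \mathbb{R}^n$ and a polyhedral set $P \subset \mathbb{R}^n$, let $f(x) = g(x) + I_P(x)$, then $\partial f(x) = \partial g(x) + N_P(x)$ and
\[
\partial f(x) = \begin{cases} \partial g(x) & x \in \operatorname{int}(P) \\ \partial g(x) + N_P(x) & x \in \operatorname{bndry}(P) \\ \emptyset & \text{otherwise.} \end{cases}
\]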

Proof. For $f(x) = g(x) + I_P(x)$, the subdifferential set $\partial f(x) = \partial(g(x) + I_P(x))$. Since $\operatorname{dom}(g) = \mathbb{R}^n$, we have $\operatorname{dom}(g) \cap P = P$. Hence from [HUL13, p. 114, Corollary 3.1.2], for all $x \in P$
\[
\begin{aligned}
\partial f(x) &= \partial(g(x) + I_P(x)) \\
&= \partial g(x) + \partial I_P(x) \\
&= \partial g(x) + N_P(x) \quad \text{[using Lemma 4.1].}
\end{aligned}
\]
Now for any $x \in \operatorname{int}(P)$, $N_P(x) = \{0\}$. So $\partial f(x) = \partial g(x)$.

In our case, the function g will either be a convex rational, quadratic or linear function of forms (3.18)-(3.20) respectively. In addition, g is the closed convex envelope of a bivariate nonconvex function Q satisfying Assumption 3.2.1.

4.1 Subdifferentials in the interior of the polytope

Proposition 4.3. For a bivariate convex quadratic nonlinear function q of form (3.19), there exist $\alpha_{10}, \alpha_{01}, \alpha_{00} \in \mathbb{R}$ such that $\bigcup_{x \in \mathbb{R}^2} \partial q(x) = \{s : C_q(s) = 0\}$ where $C_q(s) = \alpha_{10} s_1 + \alpha_{01} s_2 + \alpha_{00}$.

Proof. From form (3.19), let $q(x) = \frac{\xi_1^2(x)}{\xi_{20}} + \xi_0(x)$ be a convex quadratic function with $\xi_1(x) = \xi_{11} x_1 + \xi_{12} x_2 + \xi_{10}$, $\xi_0(x) = \xi_{01} x_1 + \xi_{02} x_2 + \xi_{00}$, $\xi_{20} > 0$, and $\xi_{11}$ or $\xi_{12} \neq 0$. The parameter $\xi_{20}$ is included to match $\alpha_{ij}$ with Formula (3.19). Since $q \in C^1$, for any $x \in \mathbb{R}^2$
\[
\partial q(x) = \{s : s = \nabla q(x)\}.
\]
The gradient $s = \nabla q(x)$ is then computed as
\[
\begin{aligned}
s_1 &= \frac{2\xi_{11}}{\xi_{20}}\, t + \xi_{01} \\
s_2 &= \frac{2\xi_{12}}{\xi_{20}}\, t + \xi_{02},
\end{aligned}
\tag{4.1}
\]
where $t = \xi_{11} x_1 + \xi_{12} x_2 + \xi_{10}$. Equation (4.1) represents the parametric equation of a line in the plane, and by eliminating t we get
\[
-\xi_{12} s_1 + \xi_{11} s_2 + \xi_{01}\xi_{12} - \xi_{02}\xi_{11} = 0,
\]
which is the Cartesian equation of our line. Let $C_q(s) = \alpha_{10} s_1 + \alpha_{01} s_2 + \alpha_{00}$ with $\alpha_{10} = -\xi_{12}$, $\alpha_{01} = \xi_{11}$, and $\alpha_{00} = \xi_{01}\xi_{12} - \xi_{02}\xi_{11}$. Then for all $x \in \mathbb{R}^2$, $\partial q(x)$ is contained in the line $C_q(s) = 0$, that is,
\[
\bigcup_{x \in \mathbb{R}^2} \partial q(x) \subset \{s : C_q(s) = 0\}. \tag{4.2}
\]
Now consider a point $s_q$ that satisfies $C_q(s_q) = 0$, so it must satisfy the parametric equation of the same line as well. Hence we also have
\[
\{s : C_q(s) = 0\} \subset \bigcup_{x \in \mathbb{R}^2} \partial q(x). \tag{4.3}
\]
So by using (4.2) and (4.3) we get
\[
\bigcup_{x \in \mathbb{R}^2} \partial q(x) = \{s : C_q(s) = 0\}.
\]

Example 4.4. For a convex quadratic function $q(x) = \frac{1}{4}(x_1 + x_2)^2 + \frac{1}{8}(x_1 - x_2)$, the gradient $s = \nabla q(x)$ is
\[
s_1 = \frac{t}{2} + \frac{1}{8}, \qquad s_2 = \frac{t}{2} - \frac{1}{8}, \tag{4.4}
\]
where $t = x_1 + x_2$. Upon solving the above equations, we get
\[
-4s_1 + 4s_2 + 1 = 0.
\]

[Figure: two panels, "Primal domain (x1, x2)" showing the lines x1 + x2 = 0, x1 + x2 − 1 = 0 and x1 + x2 + 1 = 0, and "Dual domain (s1, s2)" showing the line −4s1 + 4s2 + 1 = 0 with the corresponding image points]

Figure 4.1: Primal and dual mapping for quadratic function in Example 4.4 over lines parallel to x1 + x2 = 0.

For any $c \in \mathbb{R}$, let $t - c = 0$, that is, $x_1 + x_2 - c = 0$, be a line parallel to $x_1 + x_2 = 0$. When $c = 0$, both lines coincide. Now (4.4) can be written as
\[
s_1 = \frac{c}{2} + \frac{1}{8}, \qquad s_2 = \frac{c}{2} - \frac{1}{8}.
\]
So the subdifferentials along each dimension are a linear function of the intercepts of the lines parallel to $x_1 + x_2 = 0$. This is illustrated in Figure 4.1. The whole primal domain (shown in grey) maps to the line $-4s_1 + 4s_2 + 1 = 0$ (also grey) in the dual domain. The line-to-point mapping is shown by the three lines $x_1 + x_2 = 0$, $x_1 + x_2 - 1 = 0$ and $x_1 + x_2 + 1 = 0$, colored red, green and blue respectively, which map to the points of the same color on the line $-4s_1 + 4s_2 + 1 = 0$ in the dual domain.

Corollary 4.5. For a linear function l of form (3.20), and any $x \in \mathbb{R}^2$, the subdifferential set $\partial l(x)$ is a singleton.

Proof. By setting ξ11 = ξ12 = 0 in Proposition 4.3, the Equation (4.1) reduces to s1 = ξ01 and s2 = ξ02.

Hence ∂l(x) = {(ξ01, ξ02)} is a singleton set.


Corollary 4.6. For a bivariate convex quadratic function q of form (3.19), and a polytope $P \subset \mathbb{R}^2$, let $f(x) = q(x) + I_P(x)$, then for all $x \in \operatorname{int}(P)$, the set $\bigcup_{x \in \operatorname{int}(P)} \partial f(x)$ is contained in a line segment.

Proof. Since $P \subset \mathbb{R}^2$ is a compact set and $q \in C^1$, $\nabla q(P)$ is compact as well. Using Proposition 4.3, $\bigcup_{x \in P} \partial q(x) \subset L$ where $L \subset \mathbb{R}^2$ is a line. Since s is defined by the parametric equation of a line and P is convex (and so is connected), $\bigcup_{x \in P} \partial q(x) = \nabla q(P)$ is a segment and $\bigcup_{x \in \operatorname{int}(P)} \partial f(x) \subset \bigcup_{x \in P} \partial q(x)$ is contained in the same line segment as well.

Proposition 4.7. For a bivariate rational function r of form (3.18), there exist $\alpha_{ij}$ such that the set $\bigcup_{x \in \operatorname{dom}(r)} \partial r(x) = \{s : C_r(s) = 0\}$, where $C_r(s) = \alpha_{11} s_1^2 + \alpha_{12} s_1 s_2 + \alpha_{22} s_2^2 + \alpha_{10} s_1 + \alpha_{02} s_2 + \alpha_{00}$, and $C_r(s) = 0$ is a parabolic curve (as per Definition 2.20).

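Proof. From form (3.18), let $r(x) = \frac{\xi_1^2(x)}{\xi_2(x)} + \xi_0(x)$ be a bivariate rational function with $\xi_1(x) = \xi_{11} x_1 + \xi_{12} x_2 + \xi_{10}$, $\xi_2(x) = \xi_{21} x_1 + \xi_{22} x_2 + \xi_{20}$ and $\xi_0(x) = \xi_{01} x_1 + \xi_{02} x_2 + \xi_{00}$. Since r is differentiable everywhere in $\operatorname{dom}(r) = \mathbb{R}^2 \setminus \{z : \xi_{21} z_1 + \xi_{22} z_2 + \xi_{20} = 0\}$, for any $x \in \operatorname{dom}(r)$
\[
\partial r(x) = \{s : s = \nabla r(x)\}.
\]
Next we compute the gradient $s = \nabla r(x)$ as
\[
\begin{aligned}
s_1 &= 2\xi_{11} t - \xi_{21} t^2 + \xi_{01} \\
s_2 &= 2\xi_{12} t - \xi_{22} t^2 + \xi_{02},
\end{aligned}
\tag{4.5}
\]
where $t = \dfrac{\xi_{11} x_1 + \xi_{12} x_2 + \xi_{10}}{\xi_{21} x_1 + \xi_{22} x_2 + \xi_{20}}$. Equation (4.5) represents the parametric equation of a conic section, and by eliminating t we get
\[
\alpha_{11} s_1^2 + \alpha_{12} s_1 s_2 + \alpha_{22} s_2^2 + \alpha_{10} s_1 + \alpha_{02} s_2 + \alpha_{00} = 0, \tag{4.6}
\]
where $\alpha_{11} = \xi_{21}^2 \xi_{22}^2$, $\alpha_{12} = -2\xi_{21}^3 \xi_{22}$, $\alpha_{22} = \xi_{21}^4$ and the other $\alpha_{ij}$ are functions of the coefficients of r. Now since
\[
\alpha_{12}^2 - 4\alpha_{11}\alpha_{22} = (-2\xi_{21}^3 \xi_{22})^2 - 4\xi_{21}^6 \xi_{22}^2 = 0,
\]
the conic section represented by (4.6) is a parabola.

Let $C_r(s) = \alpha_{11} s_1^2 + \alpha_{12} s_1 s_2 + \alpha_{22} s_2^2 + \alpha_{10} s_1 + \alpha_{02} s_2 + \alpha_{00}$, then for all $x \in \operatorname{dom}(r)$, $\partial r(x)$ is contained in the parabolic curve $C_r(s) = 0$, so
\[
\bigcup_{x \in \operatorname{dom}(r)} \partial r(x) \subset \{s : C_r(s) = 0\}. \tag{4.7}
\]
Now any point $s_r$ that satisfies $C_r(s_r) = 0$ satisfies the parametric equation (4.5) as well. So we also have
\[
\{s : C_r(s) = 0\} \subset \bigcup_{x \in \operatorname{dom}(r)} \partial r(x). \tag{4.8}
\]
Hence by using (4.7) and (4.8), we get
\[
\bigcup_{x \in \operatorname{dom}(r)} \partial r(x) = \{s : C_r(s) = 0\}.
\]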

Example 4.8. For a rational function $r(x) = \dfrac{x_2^2}{x_2 - x_1 + 1}$ and any $x \in \operatorname{dom}(r)$, the gradient $s = \nabla r(x)$ is
\[
s_1 = t^2, \qquad s_2 = 2t - t^2,
\]
where $t = \dfrac{x_2}{x_2 - x_1 + 1}$. After eliminating t, the equation of the parabola comes out to be
\[
s_1^2 + 2 s_1 s_2 + s_2^2 - 4 s_1 = 0.
\]
So from Proposition 4.7,
\[
\bigcup_{x \in \operatorname{dom}(r)} \partial r(x) = \{s : s_1^2 + 2 s_1 s_2 + s_2^2 - 4 s_1 = 0\}.
\]
Unlike the quadratic function in Example 4.4, the mapping between the primal and dual domains is difficult to visualize for rational functions.

Corollary 4.9. For a bivariate rational function r of form (3.18) and a polytope P, let $f(x) = r(x) + I_P(x)$, then for all $x \in \operatorname{int}(P)$, the set $\bigcup_{x \in \operatorname{int}(P)} \partial f(x)$ is contained inside a parabolic arc.

Proof. Using the same arguments as in Corollary 4.6, $\bigcup_{x \in \operatorname{int}(P)} \partial f(x) \subseteq \bigcup_{x \in \operatorname{int}(P)} \partial r(x) \subset \bigcup_{x \in \operatorname{dom}(r)} \{\nabla r(x)\}$. From Proposition 4.7, $\bigcup_{x \in \operatorname{int}(P)} \partial r(x) \subset \mathcal{P}$ where $\mathcal{P} \subset \mathbb{R}^2$ is a parabolic curve. Since s is defined by the parametric equation of a parabola and P is connected, $\bigcup_{x \in \operatorname{int}(P)} \partial r(x) = \nabla r(\operatorname{int}(P)) \subset \mathcal{P}$ is contained in a parabolic arc.


4.2 Subdifferentials at the Vertices

Lemma 4.10. For a function $g \in C^1$, a polytope P, and a vertex v, let $f(x) = g(x) + I_P(x)$, then the subdifferential set $\partial f(v)$ is an unbounded polyhedral set.

Proof. Let v be a vertex of P, and let $v^-$ and $v^+$ be the adjacent vertices when P is traversed in the clockwise direction. Then for all $y \in P$, $y_l \in E^-$ and $y_r \in E^+$, where $E^-$ and $E^+$ are the respective edges adjacent to v, the normal cone at v is
\[
\begin{aligned}
N_P(v) &= \{s \in \mathbb{R}^2 : \langle s, y - v \rangle \le 0,\ \forall y \in P\} \\
&= \{s_l \in \mathbb{R}^2 : \langle s_l, y_l - v \rangle \le 0\} \cap \{s_r \in \mathbb{R}^2 : \langle s_r, y_r - v \rangle \le 0\} \\
&= \{s \in \mathbb{R}^2 : \langle s, y_l - v \rangle \le 0,\ \langle s, y_r - v \rangle \le 0\},
\end{aligned}
\]
which is an unbounded polyhedral set. Now from Lemma 4.2, we have
\[
\begin{aligned}
\partial f(v) &= \partial g(v) + N_P(v) \\
&= \nabla g(v) + N_P(v) \\
&= \{s + \nabla g(v) \in \mathbb{R}^2 : \langle s, y_l - v \rangle \le 0,\ \langle s, y_r - v \rangle \le 0\},
\end{aligned}
\]
which is again an unbounded polyhedral set. It is illustrated as the arrows in the intersection of the red and green regions in Figure 4.2.

Conjecture 4.11. For a bivariate rational function r of form (3.18), let f(x) = r(x) + IP(x) and v ∈ P be a vertex of P such that ξ1(v) = ξ2(v) = 0, then the subdifferential set ∂f(v) is a parabolic region (as per Definition 2.21). (Note that r is not differentiable at v, so Lemma 4.10 does not apply.)

4.3 Subdifferentials on the edges

Lemma 4.12. For a bivariate function $g \in C^1$, a polytope P, and an edge $E = \{x : x_2 = m x_1 + c,\ x_{l1} \le x_1 \le x_{u1}\}$ between vertices $x_l$ and $x_u$, let $f(x) = g(x) + I_P(x)$, then
\[
\bigcup_{x \in \operatorname{ri}(E)} \partial f(x) = \bigcup_{x \in \operatorname{ri}(E)} \{s + \nabla g(x) : s_1 + m s_2 = 0,\ s_2 \ge 0\}.
\]

Proof. From Lemma 4.2, for all $x \in \operatorname{ri}(E)$, $\partial f(x) = \partial g(x) + N_P(x)$. Let $L(x) = x_2 - m x_1 - c$ be the expression of the line joining $x_l$ and $x_u$ such that $P \subset \{x : L(x) \le 0\}$. The case $P \subset \{x : L(x) \ge 0\}$ is analogous.


Figure 4.2: Normal cone at a vertex v ∈ P shown as the arrows in the intersection of Red and Green region

Since P ⊂ R² is a polytope, for all x ∈ ri(E), NP(x) = {s : s = λ∇L(x), λ ≥ 0} is the normal cone of P at x. So

NP (x) = {s : s = λ∇L(x), λ ≥ 0} = {s : s = λ(−m, 1), λ ≥ 0}

= {s : s1 = −mλ, s2 = λ, λ ≥ 0}

= {s : s1 = −ms2, s2 ≥ 0} [by eliminating λ]

= {s : s1 + ms2 = 0, s2 ≥ 0}.

In the special case when E = {x : x1 = d, xl2 ≤ x2 ≤ xu2}, L(x) = x1 − d, so

NP (x) = {s : s = λ(1, 0), λ ≥ 0}

= {s : s2 = 0, s1 ≥ 0}.

Now for any x ∈ ri(E),

∂f(x) = ∂g(x) + NP (x)

= ∇g(x) + {s : s1 + ms2 = 0, s2 ≥ 0} (4.9)

= {s + ∇g(x): s1 + ms2 = 0, s2 ≥ 0}.


So
\[
\bigcup_{x \in \operatorname{ri}(E)} \partial f(x) = \bigcup_{x \in \operatorname{ri}(E)} \{s + \nabla g(x) : s_1 + m s_2 = 0,\ s_2 \ge 0\}.
\]

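Proposition 4.13. For a true bivariate convex quadratic function q of form (3.19), a polytope P and an edge $E = \{x : x_2 = m x_1 + c,\ x_{l1} \le x_1 \le x_{u1}\}$ between vertices $x_l$ and $x_u$, let $f(x) = q(x) + I_P(x)$, then the set $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x)$ is either an unbounded polyhedral set with nonempty interior or a ray.

Proof. From Corollary 4.6, there exist some l, u such that $\bigcup_{x \in \operatorname{ri}(E)} \partial q(x) = \bigcup_{x \in \operatorname{ri}(E)} \{s : C_q(s) = 0,\ l_1 \le s_1 \le u_1\}$. Now the computation of $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x)$ can be divided into the following two cases:

Case 1 (l = u). From Corollary 4.6,
\[
\begin{aligned}
\bigcup_{x \in \operatorname{ri}(E)} \partial q(x) &= \bigcup_{x \in \operatorname{ri}(E)} \{s : C_q(s) = 0,\ l_1 \le s_1 \le u_1\} \\
&= \{s : C_q(s) = 0,\ s = l = u = \nabla q(x)\} \quad \text{[since } l = u\text{]} \\
&= \{u\}.
\end{aligned}
\]
So from Lemma 4.12,
\[
\begin{aligned}
\bigcup_{x \in \operatorname{ri}(E)} \partial f(x) &= \bigcup_{x \in \operatorname{ri}(E)} \{s + \nabla q(x) : s_1 + m s_2 = 0,\ s_2 \ge 0\} \\
&= \{s + u : s_1 + m s_2 = 0,\ s_2 \ge 0\} \\
&= \{s : s_1 + m s_2 - (u_1 + m u_2) = 0,\ s_2 \ge u_2\}
\end{aligned}
\tag{4.10}
\]
is a ray.

Case 2 (l ≠ u). From Corollary 4.6, $\bigcup_{x \in \operatorname{ri}(E)} \partial q(x) = \{s : C_q(s) = 0,\ l_1 \le s_1 \le u_1\}$. Now for any $x \in \operatorname{ri}(E)$, $\partial f(x) = \nabla q(x) + \{s : s_1 + m s_2 = 0,\ s_2 \ge 0\}$. So when $\nabla q(x) = l$, $\partial f(x) = \{s : s_1 + m s_2 - (l_1 + m l_2) = 0,\ s_2 \ge l_2\}$ and when $\nabla q(x) = u$, $\partial f(x) = \{s : s_1 + m s_2 - (u_1 + m u_2) = 0,\ s_2 \ge u_2\}$. Now let $\partial f(x) \subset \{s : C_q(s) \le 0\}$; the case $\partial f(x) \subset \{s : C_q(s) \ge 0\}$ is analogous. Then
\[
\begin{aligned}
\bigcup_{x \in \operatorname{ri}(E)} \partial f(x) &= \bigcup_{x \in \operatorname{ri}(E)} \{s + \nabla q(x) : s_1 + m s_2 = 0,\ s_2 \ge 0\} \\
&= \{s : l_1 + m l_2 \le s_1 + m s_2 \le u_1 + m u_2,\ C_q(s) \le 0\}
\end{aligned}
\tag{4.11}
\]
is an unbounded polyhedral set with nonempty interior.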


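Example 4.14. For the quadratic function in Example 4.4, and polytope P with vertices $v_1 = \left(\frac{1}{2}, 1\right)$, $v_2 = \left(\frac{3}{4}, \frac{3}{4}\right)$, $v_3 = \left(\frac{1}{4}, \frac{1}{4}\right)$ and $v_4 = \left(0, \frac{1}{2}\right)$, let $f(x) = q(x) + I_P(x)$.

For the edge $E = \left\{x : x_2 = -x_1 + \frac{3}{2},\ \frac{1}{2} \le x_1 \le \frac{3}{4}\right\}$ between vertices $v_1$ and $v_2$, $\nabla q(v_1) = \nabla q(v_2) = \left(\frac{7}{8}, \frac{5}{8}\right)$, as the edge is parallel to $x_1 + x_2 = 0$. So in this case, $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x) = \{s : s_1 - s_2 - 1/4 = 0,\ s_2 \ge 5/8\}$ is a ray.

For the edge $E = \left\{x : x_2 = x_1,\ \frac{1}{4} \le x_1 \le \frac{3}{4}\right\}$ between vertices $v_2$ and $v_3$, $\nabla q(v_2) = \left(\frac{7}{8}, \frac{5}{8}\right)$ and $\nabla q(v_3) = \left(\frac{3}{8}, \frac{1}{8}\right)$. Also, from Example 4.4, $C_q(s) = -4s_1 + 4s_2 + 1$. Here $P \subset \{x : x_2 - x_1 \ge 0\}$, so the analogous case of Lemma 4.12 applies and the normal cone on $\operatorname{ri}(E)$ is $\{s : s_1 + s_2 = 0,\ s_2 \le 0\}$. When $\nabla q(x) = \nabla q(v_2)$, the translated ray is $\{s : s_1 + s_2 - 3/2 = 0,\ s_2 \le 5/8\}$, and when $\nabla q(x) = \nabla q(v_3)$, it is $\{s : s_1 + s_2 - 1/2 = 0,\ s_2 \le 1/8\}$. Clearly, for any $x \in \operatorname{ri}(E)$, $\partial f(x) \subset \{s : C_q(s) \le 0\}$. So
\[
\bigcup_{x \in \operatorname{ri}(E)} \partial f(x) = \left\{s : \frac{1}{2} \le s_1 + s_2 \le \frac{3}{2},\ -4s_1 + 4s_2 + 1 \le 0\right\},
\]
which is an unbounded polyhedral set.

Corollary 4.15. For a bivariate linear function l, a polytope P and an edge $E = \{x : x_2 = m x_1 + c,\ x_{l1} \le x_1 \le x_{u1}\}$ between vertices $x_l$ and $x_u$, let $f(x) = l(x) + I_P(x)$, then the subdifferential set $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x)$ is a ray.

Proof. From Corollary 4.5, for all $x \in \mathbb{R}^2$, $\partial l(x)$ is a singleton $\{\nabla l(x)\}$. Let $c = \nabla l(x)$. Then from Lemma 4.12,
\[
\begin{aligned}
\bigcup_{x \in \operatorname{ri}(E)} \partial f(x) &= \bigcup_{x \in \operatorname{ri}(E)} \{s + \nabla l(x) : s_1 + m s_2 = 0,\ s_2 \ge 0\} \\
&= \{s : s_1 + m s_2 - (c_1 + m c_2) = 0,\ s_2 \ge c_2\}
\end{aligned}
\]
is a ray.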

Proposition 4.16. For a bivariate rational function r of form (3.18), a polytope P and an edge $E = \{x : x_2 = m x_1 + c,\ v_1^- \le x_1 \le v_1^+\}$ between vertices $v^-$ and $v^+$, let $f(x) = r(x) + I_P(x)$, then the subdifferential set $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x)$ is either a parabolic region (see Definition 2.21) or a ray.

Proof. From Corollary 4.9, there exist $l, u \in \mathbb{R}^2$ such that $\bigcup_{x \in \operatorname{ri}(E)} \partial r(x) = \bigcup_{x \in \operatorname{ri}(E)} \{s : C_r(s) = 0,\ l_1 \le s_1 \le u_1\}$. So computing $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x)$ leads to the following two cases:


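Case 1 (l = u). Same as Proposition 4.13, Case 1.

Case 2 (l ≠ u). By setting g = r in Lemma 4.12, for any $x \in \operatorname{ri}(E)$, $\partial f(x) = \nabla r(x) + \{s : s_1 + m s_2 = 0,\ s_2 \ge 0\}$. Similar to Proposition 4.13, Case 2, when $\nabla r(x) = l$, $\partial f(x) = \{s : s_1 + m s_2 - (l_1 + m l_2) = 0,\ s_2 \ge l_2\}$ and when $\nabla r(x) = u$, $\partial f(x) = \{s : s_1 + m s_2 - (u_1 + m u_2) = 0,\ s_2 \ge u_2\}$. Let $\partial f(x) \subset \{s : C_r(s) \le 0\}$; the case $\partial f(x) \subset \{s : C_r(s) \ge 0\}$ is analogous. Then
\[
\begin{aligned}
\bigcup_{x \in \operatorname{ri}(E)} \partial f(x) &= \bigcup_{x \in \operatorname{ri}(E)} \{s + \nabla r(x) : s_1 + m s_2 = 0,\ s_2 \ge 0\} \\
&= \{s : l_1 + m l_2 \le s_1 + m s_2 \le u_1 + m u_2,\ C_r(s) \le 0\}
\end{aligned}
\]
is a parabolic region.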

Example 4.17. For the rational function r in Example 4.8 and polytope P with vertices $v_1 = (0, 0)$, $v_2 = (1, 0)$ and $v_3 = (1, 1)$, let $f(x) = r(x) + I_P(x)$. From Example 4.8, we also have $C_r(s) = s_1^2 + 2 s_1 s_2 + s_2^2 - 4 s_1$. For the edge $E = \{x : x_2 = x_1,\ 0 \le x_1 \le 1\}$ between vertices $v_1$ and $v_3$, $\nabla r(v_1) = (0, 0)$ and $\nabla r(v_3) = (1, 1)$. When $\nabla r(x) = \nabla r(v_1)$, the translated ray is $\{s : s_1 + s_2 = 0,\ s_2 \ge 0\}$, and when $\nabla r(x) = \nabla r(v_3)$, it is $\{s : s_1 + s_2 - 2 = 0,\ s_2 \ge 1\}$. Also, for any $x \in \operatorname{ri}(E)$, $\partial f(x) \subset \{s : C_r(s) \ge 0\}$. So
\[
\bigcup_{x \in \operatorname{ri}(E)} \partial f(x) = \{s : 0 \le s_1 + s_2 \le 2,\ s_1^2 + 2 s_1 s_2 + s_2^2 - 4 s_1 \ge 0\}
\]
is an unbounded parabolic region. This is illustrated in Figure 4.4 with $E = E_{13}$.

4.4 Structure of the conjugate domain

It is known that the conjugate of a PLQ function is a PLQ function so the image of the domain of a PLQ function has a polyhedral subdivision [GL13]. The alternate proof we provide here is useful for 2 reasons: (1) it generalizes to rational functions, and (2) it gives algorithmic insight.

Theorem 4.18. For a bivariate convex quadratic function q of form (3.19) and polytope P, let $f(x) = q(x) + I_P(x)$, then the union of subdifferentials set $\bigcup_{x \in P} \partial f(x)$ has a polyhedral subdivision.

Proof. The polytope $P \subset \mathbb{R}^2$ can be divided into vertices, faces (or edges), and the interior of P. Now
\[
\bigcup_{x \in P} \partial f(x) = \Big(\bigcup_{v \in V} \partial f(v)\Big) \cup \Big(\bigcup_{E \in F} \partial f(\operatorname{ri}(E))\Big) \cup \Big(\bigcup_{x \in \operatorname{int}(P)} \partial f(x)\Big), \tag{4.12}
\]
where V is the set of all vertices and F is the set of all the faces (or edges) of P. Now we compute the subdifferentials for each entity according to the following three cases:

Case 1 (Vertices). From Lemma 4.10, for a vertex v we have $\partial f(v) = \partial q(v) + N_P(v)$, which is an unbounded polyhedral set with nonempty interior.

Case 2 (Edges). From Proposition 4.13, for any edge E with slope m, $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x) = \{s : l_1 + m l_2 \le s_1 + m s_2 \le u_1 + m u_2,\ C_q(s) \le 0\}$ is an unbounded polyhedral set with nonempty interior. In the special case when l = u, it is a ray.

Case 3 (int(P)). From Corollary 4.6, $\bigcup_{x \in \operatorname{int}(P)} \partial f(x)$ is contained in a line segment.

Now for any $v \in V$ and $E \in F$, $\partial f(v) \cap \big(\bigcup_{x \in \operatorname{ri}(E)} \partial f(x)\big)$ is either $\emptyset$ or a ray. Similarly, for any $E \in F$, $\big(\bigcup_{x \in \operatorname{ri}(E)} \partial f(x)\big) \cap \big(\bigcup_{x \in \operatorname{int}(P)} \partial f(x)\big)$ is contained in the corresponding line segment, while for any $v \in V$, $\partial f(v) \cap \big(\bigcup_{x \in \operatorname{int}(P)} \partial f(x)\big)$ is a singleton $\{\nabla q(v)\}$. Since $\bigcup_{x \in P} \partial f(x) = \mathbb{R}^2$ is a convex set, the subdifferential set $\bigcup_{x \in P} \partial f(x)$ is the union of a finite number of polyhedral sets and consequently has a polyhedral subdivision.

Example 4.19. For the quadratic function and the polytope in Example 4.14, we have $f(x) = q(x) + I_P(x)$. Now to compute $\bigcup_{x \in P} \partial f(x)$, we compute the subdifferential set for each entity of P. For all vertices, from Lemma 4.10, the subdifferentials are

∂f(v1) = {s : 2s1 + 2s2 − 3 ≥ 0, −4s1 + 4s2 + 1 ≥ 0},

∂f(v2) = {s : 2s1 + 2s2 − 3 ≥ 0, −4s1 + 4s2 + 1 ≤ 0},

∂f(v3) = {s : 2s1 + 2s2 + 1 ≤ 0, −4s1 + 4s2 + 1 ≤ 0},

∂f(v4) = {s : 2s1 + 2s2 + 1 ≤ 0, −4s1 + 4s2 + 1 ≥ 0}.

Let $E_{ij}$ denote the edge between vertices $v_i$ and $v_j$. Since $\nabla f(v_1) = \nabla f(v_2)$ and $\nabla f(v_3) = \nabla f(v_4)$, $\partial f(E_{12})$ and $\partial f(E_{34})$ are each contained in a ray. For the remaining edges,

\[ \bigcup_{x \in \operatorname{ri}(E_{23})} \partial f(x) = \{s : 1 \le 2s_1 + 2s_2 \le 3,\ -4s_1 + 4s_2 + 1 \le 0\}, \]

\[ \bigcup_{x \in \operatorname{ri}(E_{14})} \partial f(x) = \{s : 1 \le 2s_1 + 2s_2 \le 3,\ -4s_1 + 4s_2 + 1 \ge 0\}. \]

So the subdifferential set $\bigcup_{x \in P} \partial f(x)$ is the union of all the subdifferentials corresponding to each edge and vertex, which are polyhedral sets. This is illustrated in Figure 4.3.

A simple method to obtain the polyhedral subdivision lines of $\bigcup_{x \in P} \partial f(x)$ can be formulated as follows:

Step 1 For each vertex v, compute its gradient ∇f(v).

Step 2 Connect the farthest two gradients (in terms of Euclidean distance) with a line segment. Since $\bigcup_{x \in P} \nabla q(x)$ is contained in a line segment, the gradients of the rest of the vertices would lie on the segment as well.

Step 3 Compute the boundary of subdifferentials for each vertex and obtain the subdivision lines.
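Steps 1 and 2 can be sketched in a few lines of Python. The sketch assumes that the quadratic function and polytope of this example are the ones written out explicitly in Example 5.4, namely $q(x) = \frac{1}{4}(x_1 + x_2)^2 + \frac{1}{8}(x_1 - x_2)$ with vertices $(0.5, 1)$, $(0.75, 0.75)$, $(0.25, 0.25)$ and $(0, 0.5)$. The helper names `grad_q` and `farthest_pair` are illustrative and are not part of the thesis module.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Assumed data (taken from Example 5.4): q(x) = (x1 + x2)^2 / 4 + (x1 - x2) / 8
VERTICES = [(Fr(1, 2), Fr(1)), (Fr(3, 4), Fr(3, 4)),
            (Fr(1, 4), Fr(1, 4)), (Fr(0), Fr(1, 2))]


def grad_q(x1, x2):
    """Gradient of q, differentiated by hand from the formula above."""
    half_sum = (x1 + x2) / 2
    return (half_sum + Fr(1, 8), half_sum - Fr(1, 8))


def farthest_pair(points):
    """Step 2: the two points with the largest squared Euclidean distance."""
    def dist2(pair):
        (a1, a2), (b1, b2) = pair
        return (a1 - b1) ** 2 + (a2 - b2) ** 2
    return max(combinations(points, 2), key=dist2)


# Step 1: gradient at every vertex.
gradients = [grad_q(*v) for v in VERTICES]
# Step 2: endpoints of the segment that contains all the gradients.
endpoints = farthest_pair(gradients)
print(gradients)
print(endpoints)
```

Under these assumptions the four gradients collapse to the two points $(7/8, 5/8)$ and $(3/8, 1/8)$, which are the points marked in Figure 4.3.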

Corollary 4.20. For a bivariate linear function $l$ and polytope $P$, let $f(x) = l(x) + I_P(x)$. Then the union of subdifferentials $\bigcup_{x \in P} \partial f(x)$ has a polyhedral subdivision.

Proof. Since $\bigcup_{x \in \operatorname{int}(P)} \partial f(x)$ is a singleton (from Corollary 4.5), $\bigcup_{x \in P} \partial f(x)$ can be written as $\bigcup_{v \in V} \partial f(v)$, which is the union of a finite number of polyhedral sets such that for any $u, w \in V$, $\partial f(u) \cap \partial f(w)$ is either $\emptyset$ or a ray. Consequently, $\bigcup_{x \in P} \partial f(x)$ has a polyhedral subdivision.

Theorem 4.21. Assuming Conjecture 4.11 holds, for a bivariate rational function $r$ of form (3.18) and polytope $P$, let $f(x) = r(x) + I_P(x)$. Then the union of the subdifferentials $\bigcup_{x \in P} \partial f(x)$ has a parabolic subdivision (see Definition 2.22).

Proof. Let $r(x) = \frac{\xi_1^2(x)}{\xi_2(x)} + \xi_0(x)$ be a bivariate rational function where $\xi_1(x)$, $\xi_2(x)$ and $\xi_0(x)$ are linear functions in $x$. Similar to Theorem 4.18, the polytope $P \subset \mathbb{R}^2$ can be divided into vertices, edges, and the interior. The set $\bigcup_{x \in P} \partial f(x)$ would be the union of the subdifferential sets for each such entity of $P$. It can be divided into the following three cases.

Case 1 (Vertices) From Lemma 4.10, for any $v \in V$ with $\xi_2(v) \ne 0$, the set $\partial f(v)$ is an unbounded polyhedral set; otherwise, for some $z \in V$ such that $\xi_2(z) = 0$, the set $\partial f(z)$ is a parabolic region (from Conjecture 4.11).

Case 2 (Edges) From Proposition 4.16, for any edge $E \in F$, $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x) = \{s : l_1 + m l_2 \le s_1 + m s_2 \le u_1 + m u_2,\ C_r(s) \le 0\}$ is a parabolic region. In the special case when $l = u$, $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x)$ is a ray.

Case 3 (int(P)) From Corollary 4.9, $\bigcup_{x \in \operatorname{int}(P)} \partial f(x)$ is a parabolic arc.

For any $v \in V$ and $E \in F$, $\partial f(v) \cap \big( \bigcup_{x \in E} \partial f(x) \big)$ is either $\emptyset$ or a ray, and $\big( \bigcup_{x \in \operatorname{ri}(E)} \partial f(x) \big) \cap \big( \bigcup_{x \in \operatorname{int}(P)} \partial f(x) \big)$ is contained in the corresponding parabolic arc, while for any $v \in V$, $\partial f(v) \cap \big( \bigcup_{x \in \operatorname{int}(P)} \partial f(x) \big)$ is either a singleton $\{\nabla r(v)\}$ or contained in the parabolic arc.

Hence the subdifferential set $\bigcup_{x \in P} \partial f(x)$ is the union of a finite number of parabolic regions such that any two regions meet either on a parabolic arc, on a ray, or at a point, thus leading to a parabolic subdivision.

Figure 4.3: Polyhedral subdivision for Example 4.14 (regions labeled $\partial f(v_1)$, $\partial f(v_2)$, $\partial f(v_3)$, $\partial f(v_4)$, $\bigcup_{x \in \operatorname{ri}(E_{14})} \partial f(x)$ and $\bigcup_{x \in \operatorname{ri}(E_{23})} \partial f(x)$; marked points $(7/8, 5/8)$ and $(3/8, 1/8)$)

Example 4.22. For the rational function $r$ and polytope $P$ in Example 4.17, let $f(x) = r(x) + I_P(x)$. To compute $\bigcup_{x \in P} \partial f(x)$, we compute the subdifferential set for each entity of $P$ as follows. For vertices, from Lemma 4.10, the subdifferentials at $v_1$ and $v_3$ are

\[ \partial f(v_1) = \{s : s_1 \le 0,\ s_1 + s_2 \le 0\}, \]

\[ \partial f(v_3) = \{s : s_1 + s_2 \ge 2,\ s_2 \ge 1\}. \]

Since $v_2$ is the special vertex case, $\partial f(v_2)$ would be computed according to Conjecture 4.11. However, it can also be obtained by exclusion, that is, $\partial f(v_2) = \mathbb{R}^2 \setminus \bigcup_{x \in P \setminus v_2} \partial f(x)$.

Similar to Example 4.19, let $E_{ij}$ be the edge between vertices $v_i$ and $v_j$. Since $\bigcup_{x \in \operatorname{ri}(E_{12})} \partial f(x) = \{(0, 0)\} = \{\nabla f(v_1)\}$ and $\bigcup_{x \in \operatorname{ri}(E_{23})} \partial f(x) = \{(1, 1)\} = \{\nabla f(v_3)\}$, $\partial f(E_{12})$ and $\partial f(E_{23})$ are each contained in a ray. For $E_{13}$,

\[ \bigcup_{x \in \operatorname{ri}(E_{13})} \partial f(x) = \{s : 0 \le s_1 + s_2 \le 2,\ s_1^2 + 2s_1s_2 + s_2^2 - 4s_1 \ge 0\}. \]

So by exclusion

\[ \partial f(v_2) = \{s : s_2 \le s_1,\ s_1 \ge 0,\ s_2 \le 1\} \cup \{s : s_2 \ge s_1,\ s_1^2 + 2s_1s_2 + s_2^2 - 4s_1 \le 0\}. \]

Figure 4.4: Parabolic subdivision for r and P in Example 4.22 (regions labeled $\partial f(v_1)$, $\partial f(v_2)$, $\partial f(v_3)$ and $\bigcup_{x \in \operatorname{ri}(E_{13})} \partial f(x)$; marked points $(0, 0)$ and $(1, 1)$)

64 4.5. Conjugate Expressions

The parabolic subdivision for $\bigcup_{x \in P} \partial f(x)$ is illustrated in Figure 4.4.

Example 4.23. For

\[ r = \frac{36x_1^2 + 21x_1x_2 + 36x_2^2 - 81x_1 + 24x_2 - 252}{-12x_1 + 9x_2 + 75} \]

and polytope $P$ formed by vertices $v_1 = (-1, 1)$, $v_2 = (-3, -3)$ and $v_3 = (-4, -3)$, let $f(x) = r(x) + I_P(x)$. Also from Proposition 4.7, we have $\bigcup_{x \in \operatorname{dom}(r)} \partial r(x) = \{s : C(s) = 0\}$ where $C(s) = 9s_1^2 + 24s_1s_2 - 234s_1 + 16s_2^2 + 200s_2 - 527$. From Lemma 4.10, the subdifferentials at the vertices are

\[ \partial f(v_1) = \{s : 3s_1 + 4s_2 + 1 \ge 0,\ 4s_1 + 8s_2 - 1 \ge 0\}, \]

\[ \partial f(v_2) = \{s : s_1 + 2s_2 + 11 \le 0,\ s_1 \ge -3\}, \]

\[ \partial f(v_3) = \{s : s_1 \le -3,\ 3s_1 + 4s_2 + 25 \le 0\}. \]

Again similar to Example 4.22, let $E_{ij}$ be an edge between vertices $v_i$ and $v_j$. Since $\nabla f(v_2) = \nabla f(v_3)$, $\bigcup_{x \in \operatorname{ri}(E_{23})} \partial f(x)$ would be a ray, while

\[ \bigcup_{x \in \operatorname{ri}(E_{12})} \partial f(x) = \{s : -13s_1 + 4s_2 - 23 \ge 0,\ C(s) \le 0\} \cup \{s : -13s_1 + 4s_2 - 23 \le 0,\ 4s_1 + 8s_2 - 1 \le 0,\ s_1 + 2s_2 + 11 \ge 0\}, \]

\[ \bigcup_{x \in \operatorname{ri}(E_{13})} \partial f(x) = \{s : 3s_1 + 4s_2 + 25 \ge 0,\ 3s_1 + 4s_2 + 1 \le 0,\ -13s_1 + 4s_2 - 23 \ge 0,\ C(s) \ge 0\}. \]

The parabolic subdivision for this example is shown in Figure 4.5.
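As a sanity check on the numbers in Example 4.23, the following Python sketch evaluates the gradient of $r$ at the three vertices with exact rational arithmetic and tests whether each gradient is a root of $C(s)$, as Proposition 4.7 suggests it should be. The quotient-rule derivatives are worked out by hand here, and the names `grad_r` and `C` are illustrative only.

```python
from fractions import Fraction as Fr


def grad_r(x1, x2):
    """Gradient of r = num / den from Example 4.23 via the quotient rule."""
    num = 36*x1**2 + 21*x1*x2 + 36*x2**2 - 81*x1 + 24*x2 - 252
    den = -12*x1 + 9*x2 + 75
    num_x1, num_x2 = 72*x1 + 21*x2 - 81, 21*x1 + 72*x2 + 24
    den_x1, den_x2 = -12, 9
    return (Fr(num_x1*den - num*den_x1, den**2),
            Fr(num_x2*den - num*den_x2, den**2))


def C(s1, s2):
    """The curve expression C(s) stated in Example 4.23."""
    return 9*s1**2 + 24*s1*s2 - 234*s1 + 16*s2**2 + 200*s2 - 527


vertices = {"v1": (-1, 1), "v2": (-3, -3), "v3": (-4, -3)}
grads = {name: grad_r(*v) for name, v in vertices.items()}
for name, s in grads.items():
    print(name, s, C(*s))
```

The same computation also shows that the gradients at $v_2$ and $v_3$ coincide, which is why the edge $E_{23}$ only contributes a ray.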

4.5 Conjugate Expressions

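4.5.1 Fractional forms

Let us introduce the following fractional forms

\[ g_f(s_1, s_2) = \frac{\psi_1(s_1, s_2)}{\zeta_{00} \sqrt{\psi_{1/2}(s_1, s_2)}} + \psi_0(s_1, s_2) \tag{4.13} \]

\[ g_q(s_1, s_2) = \zeta_{11} s_1^2 + \zeta_{12} s_1 s_2 + \zeta_{22} s_2^2 + \zeta_{10} s_1 + \zeta_{01} s_2 + \zeta_{00} \tag{4.14} \]

\[ g_l(s_1, s_2) = \zeta_{10} s_1 + \zeta_{01} s_2 + \zeta_{00} \tag{4.15} \]

where $\psi_0$, $\psi_{1/2}$ and $\psi_1$ are linear functions in $s$, and $\zeta_{ij} \in \mathbb{R}$.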

Figure 4.5: Parabolic subdivision for r and P from Example 4.23 (regions labeled $\partial f(v_1)$, $\partial f(v_2)$, $\partial f(v_3)$, $\bigcup_{x \in \operatorname{ri}(E_{12})} \partial f(x)$ and $\bigcup_{x \in \operatorname{ri}(E_{13})} \partial f(x)$)

Theorem 4.24. For a convex quadratic function $q$ of form (3.19) and polytope $P$, let $f(x) = q(x) + I_P(x)$. Then its conjugate $f^*(s)$ has a polyhedral subdivision such that over each member of the subdivision it has either form (4.14) or (4.15).

Proof. From form (3.19), let $q(x) = \frac{\xi_1^2(x)}{\xi_{20}} + \xi_0(x)$ be a convex quadratic function with $\xi_1(x) = \xi_{11}x_1 + \xi_{12}x_2 + \xi_{10}$, $\xi_0(x) = \xi_{01}x_1 + \xi_{02}x_2 + \xi_{00}$ and $\xi_{20} > 0$. Now the conjugate of $f$ is computed as

\[ f^*(s) = \sup_{x \in \mathbb{R}^n} \{\langle s, x \rangle - f(x)\}. \]

Since $f(x) = q(x) + I_P(x)$ is convex over $P$, the maxima in $f^*(s)$ can be obtained from the first-order optimality condition $s \in \partial f(x)$, which leads to the following three possible cases.

Case 1 (Vertices) Let $V \subset P$ be the set of all the vertices, then for $v \in V$

\[ f^*(s) = \sup_{x \in \{v\}} \{\langle s, x \rangle - f(x)\} = \langle s, v \rangle - (q(v) + I_P(v)) = \langle s, v \rangle - q(v) = s_1v_1 + s_2v_2 - q(v) \]

is a linear function in $s$. From Theorem 4.18, the set $\partial f(v)$ is an unbounded polyhedral set, so for any vertex $v$, the conjugate is a linear function defined on an unbounded polyhedral set.

Case 2 (Edges) Let $F$ be the set of all the edges, and $E = \{x : x_2 = mx_1 + c,\ l_1 \le x_1 \le u_1\} \in F$ be an edge between vertices $l$ and $u$. The case when $m = 0$ or $E = \{x : x_1 = d,\ l_2 \le x_2 \le u_2\}$ is analogous. The conjugate expression is then computed as

\[ f^*(s) = \sup_{x \in \operatorname{ri}(E)} \{\langle s, x \rangle - f(x)\} = \sup_{x \in \operatorname{ri}(E)} \{\langle s, x \rangle - (q(x) + I_P(x))\}. \]

By the first-order optimality condition, we have

\[ s - (\nabla q(x) + N_P(x)) = 0. \]

For $m$ as the slope of the edge, $N_P(x) = \{s : s = \lambda(-m, 1),\ \lambda \ge 0\}$. So we have

\[ s_1 = \frac{2\xi_{11}}{\xi_{20}} t + \xi_{01} - m\lambda, \qquad s_2 = \frac{2\xi_{12}}{\xi_{20}} t + \xi_{02} + \lambda \tag{4.16} \]

where $t = \xi_{11}x_1 + \xi_{12}x_2 + \xi_{10}$. Since $x \in \operatorname{ri}(E)$, we also have

x2 = mx1 + c. (4.17)

From (4.16) and (4.17), we have

\[ x_1 = \gamma_1 s_1 + \gamma_2 s_2 + \gamma_0, \]

\[ x_2 = m(\gamma_1 s_1 + \gamma_2 s_2 + \gamma_0) + c, \]

where all $\gamma_i$ are defined in the coefficients of $q$, and parameters $m$ and $c$. By substituting the above values in $f^*(s)$, we get

\[ f^*(s) = \zeta_{11} s_1^2 + \zeta_{12} s_1 s_2 + \zeta_{22} s_2^2 + \zeta_{10} s_1 + \zeta_{01} s_2 + \zeta_{00}, \]

which is a quadratic function of form (4.14) with all $\zeta_{ij}$ defined in the coefficients of $q(x)$, and parameters $m$ and $c$.

From Proposition 4.13, the set $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x)$ is either an unbounded polyhedral set with nonempty interior or a ray. So for any $E$, the conjugate is a quadratic function of form (4.14) defined over an unbounded polyhedral set. When $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x)$ is a ray, we do not compute the conjugate since its values on the ray are given by the intersection of the conjugate expressions of the corresponding two adjacent pieces.

Case 3 (int(P)) From Corollary 4.6, $\bigcup_{x \in \operatorname{int}(P)} \partial f(x)$ is contained in a line segment. In this case, the conjugate expressions are again obtained from its neighbours.

So for any convex quadratic function of form (3.19), defined over a polytope $P$, the conjugate has a polyhedral subdivision such that over each member of its subdivision, it is of form (4.14) or (4.15).

Example 4.25. For the quadratic function $q$ and polytope $P$ in Example 4.19, let $f(x) = q(x) + I_P(x)$. Similar to the domain, the computation of conjugate expressions can be divided into computing it for each entity of $P$. From Theorem 4.24 Case 1, the conjugate expressions for vertices are

\[ f^*_{v_1}(s) = \tfrac{1}{2}(s_1 + 2s_2 - 1), \quad f^*_{v_2}(s) = \tfrac{3}{16}(4s_1 + 4s_2 - 3), \quad f^*_{v_3}(s) = \tfrac{1}{16}(4s_1 + 4s_2 - 1), \quad f^*_{v_4}(s) = \tfrac{s_2}{2}, \]

respectively. Since the subdifferential set at each vertex is a polyhedral set, for vertices the conjugate is a linear function defined on a polyhedral set. Since from Example 4.19, $\bigcup_{x \in \operatorname{ri}(E_{12})} \partial f(x)$ and $\bigcup_{x \in \operatorname{ri}(E_{34})} \partial f(x)$ are rays, the conjugate expressions for the remaining edges are

\[ f^*_{E_{23}}(s) = \tfrac{1}{4}(s_1 + s_2)^2, \qquad f^*_{E_{14}}(s) = \tfrac{1}{16}(4s_1^2 + 8s_1s_2 + 4s_2^2 - 4s_1 + 4s_2 + 1), \]

respectively. From Example 4.19, the sets $\bigcup_{x \in \operatorname{ri}(E_{23})} \partial f(x)$ and $\bigcup_{x \in \operatorname{ri}(E_{14})} \partial f(x)$ are again unbounded polyhedral sets, so for the edges $E_{23}$ and $E_{14}$, the conjugate is a quadratic function defined on an unbounded polyhedral set. So

\[ f^*_P(s) = \begin{cases} \tfrac{1}{2}(s_1 + 2s_2 - 1) & s \in \partial f(v_1) \\ \tfrac{3}{16}(4s_1 + 4s_2 - 3) & s \in \partial f(v_2) \\ \tfrac{1}{4}(s_1 + s_2)^2 & s \in \bigcup_{x \in \operatorname{ri}(E_{23})} \partial f(x) \\ \tfrac{1}{16}(4s_1 + 4s_2 - 1) & s \in \partial f(v_3) \\ \tfrac{1}{2} s_2 & s \in \partial f(v_4) \\ \tfrac{1}{16}(4s_1^2 + 8s_1s_2 + 4s_2^2 - 4s_1 + 4s_2 + 1) & s \in \bigcup_{x \in \operatorname{ri}(E_{14})} \partial f(x) \end{cases} \]

is a piecewise defined function with a polyhedral subdivision.

Corollary 4.26. For a bivariate linear function $l$ of form (3.20) and a polytope $P$, let $f(x) = l(x) + I_P(x)$. Then its conjugate $f^*(s)$ has a polyhedral subdivision such that over each member of its subdivision, it is of form (4.15).

Proof. From Corollary 4.20, $\bigcup_{x \in P} \partial f(x) = \bigcup_{v \in V} \partial f(v)$, so at any vertex $v$

\[ f^*(s) = \sup_{x \in \{v\}} \{\langle s, x \rangle - f(x)\} = \langle s, v \rangle - (l(v) + I_P(v)) = s_1v_1 + s_2v_2 - l(v), \]

which is a linear function in $s$ and is of form (4.15).
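Because the supremum of a linear function over a polytope is attained at a vertex, the conjugate in Corollary 4.26 can be evaluated as the maximum of the vertex expressions. A minimal Python sketch follows. The linear function and the triangle are hypothetical data chosen only for illustration, and `conjugate_linear` is not a routine from the thesis module.

```python
def conjugate_linear(s, vertices, coeffs):
    """f*(s) for f = l + I_P, with l(x) = a1*x1 + a2*x2 + a0.

    The supremum over P is attained at a vertex, so it is enough to
    take the maximum of <s, v> - l(v) over the vertex list.
    """
    a1, a2, a0 = coeffs
    return max(s[0]*v1 + s[1]*v2 - (a1*v1 + a2*v2 + a0)
               for v1, v2 in vertices)


# Hypothetical data: l(x) = 2*x1 - x2 + 3 on the triangle (0,0), (1,0), (0,1).
triangle = [(0, 0), (1, 0), (0, 1)]
l_coeffs = (2, -1, 3)
print(conjugate_linear((5, 0), triangle, l_coeffs))   # maximum attained at (1, 0)
print(conjugate_linear((0, 0), triangle, l_coeffs))   # maximum attained at (0, 1)
```

Which vertex attains the maximum depends on $s$, and this is what induces the polyhedral subdivision of the dual domain.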

Theorem 4.27. Assume Conjecture 4.11 holds. For a bivariate rational function $r$ of form (3.18) and a polytope $P$, let $f(x) = r(x) + I_P(x)$. Then the conjugate $f^*(s)$ has a parabolic subdivision such that over each member of its subdivision it has one of the forms in (4.13)-(4.15).

Proof. Let $r(x) = \frac{\xi_1^2(x)}{\xi_2(x)} + \xi_0(x)$ be a bivariate rational function with $\xi_1(x) = \xi_{11}x_1 + \xi_{12}x_2 + \xi_{10}$, $\xi_2(x) = \xi_{21}x_1 + \xi_{22}x_2 + \xi_{20}$ and $\xi_0(x) = \xi_{01}x_1 + \xi_{02}x_2 + \xi_{00}$. The conjugate of a function $f$ is given by

\[ f^*(s) = \sup_{x \in \mathbb{R}^n} \{\langle s, x \rangle - f(x)\}. \]

Since $f$ is convex ($f = \operatorname{conv}_P(Q)$), its critical points are the maxima. Hence, similar to Theorem 4.24, computing the maxima of $f^*(s)$ leads to the following three cases.

Case 1 (Vertices) Similar to Theorem 4.24 Case 1, by setting $q = r$, for any vertex $v$

\[ f^*(s) = s_1v_1 + s_2v_2 - r(v) \]

is a linear function of form (4.15). In the special case when $\xi_2(v) = 0$, $f(x) = Q(x)$, so $f^*(s) = s_1v_1 + s_2v_2 - Q(v)$, which is again of the same form. Hence the conjugate would be a linear function of form (4.15) defined over an unbounded polyhedral set (from Lemma 4.10). In the special case when $\partial f(v)$ is a parabolic region (Conjecture 4.11), the conjugate would again be a linear function but defined over a parabolic region.

Case 2 (Edges) Similar to Theorem 4.24, let $F$ be the set of all the edges, and $E = \{x : x_2 = mx_1 + c,\ l_1 \le x_1 \le u_1\} \in F$ be an edge between vertices $l$ and $u$, then

\[ f^*(s) = \sup_{x \in \operatorname{ri}(E)} \{\langle s, x \rangle - f(x)\} = \sup_{x \in \operatorname{ri}(E)} \{\langle s, x \rangle - (r(x) + I_P(x))\}. \]

By computing the critical points, we have

\[ s - (\nabla r(x) + N_P(x)) = 0. \]

Again, for $m$ as the slope of the edge, $N_P(x) = \{s : s = \lambda(-m, 1),\ \lambda \ge 0\}$. So we have

\[ s_1 = -\xi_{21}t^2 + 2\xi_{11}t + \xi_{01} - m\lambda, \qquad s_2 = -\xi_{22}t^2 + 2\xi_{12}t + \xi_{02} + \lambda \tag{4.18} \]

where $t = \frac{\xi_{11}x_1 + \xi_{12}x_2 + \xi_{10}}{\xi_{21}x_1 + \xi_{22}x_2 + \xi_{20}}$. Since $x \in \operatorname{ri}(E)$, we also have

\[ x_2 = mx_1 + c. \tag{4.19} \]

From (4.18) and (4.19), we have

\[ x_1 = \begin{cases} \gamma_{10}s_1 + \gamma_{01}s_2 + \gamma_{00} & \text{when } \xi_{21} + m\xi_{22} = 0 \\ \gamma_{00} \pm \gamma_{1/2}\sqrt{\gamma_{10/2}s_1 + \gamma_{01/2}s_2 + \gamma_{00/2}} \pm \gamma_{-1/2}\sqrt{\gamma_{10/2}s_1 + \gamma_{01/2}s_2 + \gamma_{00/2}} & \text{otherwise,} \end{cases} \tag{4.20} \]

where all $\gamma_{ij}$ and $\gamma_{ij/k}$ are defined in the coefficients of $r$, and parameters $m$ and $c$. When $\xi_{21} + m\xi_{22} \ne 0$, solving (4.18) and (4.19) leads to a quadratic equation in $t$ with coefficients that are linear functions in $s$, and so the $\pm$ sign comes from the two roots of the quadratic.

By substituting (4.20) and (4.19) in $f^*(s)$, when $\xi_{21} + m\xi_{22} \ne 0$, we have

\[ f^*(s) = \frac{\psi_1(s_1, s_2)}{\zeta_{00}\sqrt{\psi_{1/2}(s_1, s_2)}} + \psi_0(s_1, s_2), \]

and when $\xi_{21} + m\xi_{22} = 0$,

\[ f^*(s) = \zeta_{11}s_1^2 + \zeta_{12}s_1s_2 + \zeta_{22}s_2^2 + \zeta_{10}s_1 + \zeta_{01}s_2 + \zeta_{00}, \]

where all $\zeta_{ij}$, $\psi_i$ and $\psi_{i/j}$ are defined in the coefficients of $r$, and parameters $m$ and $c$. Also, $\psi_i(s)$ and $\psi_{i/j}(s)$ are linear functions in $s$. A detailed formulation is provided in Appendix C.

From Proposition 4.16, $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x)$ is either a parabolic region or a ray. So for any $E$, the conjugate is a fractional function of form (4.13) defined over a parabolic region. Similar to Theorem 4.24 Case 2, when $\bigcup_{x \in \operatorname{ri}(E)} \partial f(x)$ is a ray, the conjugate is deduced from its neighbours by continuity.

Case 3 (int(P)) Similar to Theorem 4.24, since $\bigcup_{x \in \operatorname{int}(P)} \partial f(x)$ is contained in a parabolic arc (from Corollary 4.9), the conjugate is deduced by continuity.

So for any rational function of form (3.18), defined over a polytope $P$, its conjugate has a parabolic subdivision such that over each member of the subdivision it has one of the forms in (4.13)-(4.15).

Example 4.28. For the rational function $r$ and polytope $P$ in Example 4.23, let $f(x) = r(x) + I_P(x)$. Similar to Example 4.25, we obtain the full conjugate by computing the conjugate for the vertices and edges of $P$ only. From Theorem 4.27 Case 1, the conjugate expressions for the vertices are

\[ f^*_{v_1}(s) = -s_1 + s_2 + 1, \qquad f^*_{v_2}(s) = -3s_1 - 3s_2 - 9, \qquad f^*_{v_3}(s) = -4s_1 - 3s_2 - 12, \]

and from Example 4.23, the subdifferential set for each vertex is an unbounded polyhedral set. So for each vertex of $P$, the conjugate is a linear function defined over an unbounded polyhedral set. As $\bigcup_{x \in \operatorname{ri}(E_{23})} \partial f(x)$ is a ray, for the remaining edges

\[ f^*_{E_{12}}(s) = \frac{336s_1 + 672s_2 - 12432}{\sqrt{3(-s_1 - 2s_2 + 37)}} - 17s_1 - 31s_2 + 1181, \]

\[ f^*_{E_{13}}(s) = \tfrac{1}{48}(9s_1^2 + 24s_1s_2 + 16s_2^2 - 42s_1 + 56s_2 + 49), \]

respectively. So from Example 4.23, for edge $E_{12}$ the conjugate is a fractional function of form (4.13) defined over a parabolic region, while for edge $E_{13}$ the conjugate is a quadratic function, again defined over a parabolic region. So

\[ f^*_P(s) = \begin{cases} -s_1 + s_2 + 1 & s \in \partial f(v_1) \\ \frac{336s_1 + 672s_2 - 12432}{\sqrt{3(-s_1 - 2s_2 + 37)}} - 17s_1 - 31s_2 + 1181 & s \in \bigcup_{x \in \operatorname{ri}(E_{12})} \partial f(x) \\ -3s_1 - 3s_2 - 9 & s \in \partial f(v_2) \\ -4s_1 - 3s_2 - 12 & s \in \partial f(v_3) \\ \tfrac{1}{48}(9s_1^2 + 24s_1s_2 + 16s_2^2 - 42s_1 + 56s_2 + 49) & s \in \bigcup_{x \in \operatorname{ri}(E_{13})} \partial f(x) \end{cases} \]

is a piecewise defined function with parabolic subdivision.
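Since the conjugate is continuous across neighbouring pieces, the expressions of Example 4.28 can be spot-checked at points shared by several regions. The Python sketch below does this at two such points: the gradient of $r$ at $v_1$ and the common gradient of $r$ at $v_2$ and $v_3$. Both points are worked out by hand here from the formula for $r$ rather than quoted from the thesis, and the function names are illustrative.

```python
import math


def f_v1(s1, s2):
    return -s1 + s2 + 1


def f_v2(s1, s2):
    return -3*s1 - 3*s2 - 9


def f_v3(s1, s2):
    return -4*s1 - 3*s2 - 12


def f_E12(s1, s2):
    return ((336*s1 + 672*s2 - 12432) / math.sqrt(3*(-s1 - 2*s2 + 37))
            - 17*s1 - 31*s2 + 1181)


def f_E13(s1, s2):
    return (9*s1**2 + 24*s1*s2 + 16*s2**2 - 42*s1 + 56*s2 + 49) / 48


# Assumed test points: gradient of r at v1, and at v2 (equal to that at v3).
g1 = (-1.5, 0.875)
g23 = (-3.0, -4.0)

at_g1 = [f(*g1) for f in (f_v1, f_E12, f_E13)]
at_g23 = [f(*g23) for f in (f_v2, f_v3, f_E12, f_E13)]
print(at_g1)
print(at_g23)
```

All pieces meeting at a point should print the same value there; a mismatch would indicate a sign or coefficient error in one of the expressions.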

Chapter 5

Algorithmic computation of the Conjugate for a class of Bivariate Rational functions over a polytope

This chapter covers the data structures and the algorithmic design used for the computation of the conjugate. We start with our proposed division of a parabolic region and a hybrid data structure, between numeric and symbolic forms, to store the domain and the expressions of the conjugate. After that, we discuss our algorithmic design along with two examples for the computation of the conjugate.

5.1 Data Structures

The input data structure is similar to the output data structure in Sec- tion 3.6.1. The coefficients of the rational function are stored in an array and the regions in an N × 2 vertex or N × 4 halfspaces matrix. The output data structure chosen for this module is a hybrid between numerical and symbolic structures. Since the conjugate of a rational function is a piecewise fractional function with a parabolically divided domain, the fractional expressions are stored in a symbolic array and the domain as a cell of matrices.

Definition 5.1. Cell (Data Structure). A cell is a tabular data structure with row- and column-based indexing for its elements, where each element can again be a cell, matrix or vector. The advantage of a cell is that its elements can be of different sizes and can have different data types.

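Example 5.2. A cell $C$ of size $3 \times 2$ could be

\[ C\{1,1\} = [1, 3, 2, 4], \qquad C\{1,2\} = [x_1 + x_2 + 1], \]

\[ C\{2,1\} = \begin{bmatrix} 2x_1 + 3x_2 + 5 = 0 \\ s_1^2 + 3s_1s_2 + 7s_2^2 + 9s_2 = 0 \end{bmatrix}, \qquad C\{2,2\} = \begin{bmatrix} 22 \\ 7 \end{bmatrix}, \]

\[ C\{3,1\} = \begin{bmatrix} 0.1 & 0.7 \\ 0.5 & .47 \end{bmatrix}, \qquad C\{3,2\} = [\ ], \]

where $x_1$, $x_2$, $s_1$ and $s_2$ are symbolic variables.

Figure 5.1: A subdifferential region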

5.1.1 Parabolic region

To store the conjugate domain for a rational function, we need a way to store the subdifferentials with parabolic regions. Unlike the polyhedral case, storing all the active inequalities does not lead to the same region. For example, in Figure 5.1, storing only the active inequalities leads to the parabolic region with only one active inequality as shown in Figure 5.2. To overcome this problem, we store the region with a parabolic arc as a parabolic region with a parabolic inequality and a halfspace obtained by connecting the endpoints of the arc, and the rest of the region as a polyhedral region. This is illustrated in Figure 5.3.


Figure 5.2: Region with only active constraints

Figure 5.3: Proposed subdivision for storing a subdifferential region


Table 5.1: Conjugate domain for a rational function

Entity     Form 1                    Form 2
Vertex     Polyhedral                Parabolic + halfspace ∪ Polyhedral
ri(E)      Parabolic + Polyhedral    Parabolic + halfspace ∪ Polyhedral
int(P)     Parabolic arc             -

5.1.2 Output Data Structure

For the conjugate with $N$ pieces, the domain is stored as an $N \times 2$ cell. The first column contains all the halfspaces forming a region while the second column contains all the parabolic regions in symbolic form. For each row, the region is given by the intersection of the polyhedral and parabolic regions in the first and second columns. Each region can have at most one parabolic region. The possible structures for the domain of the conjugate of a rational function of form (3.18) are shown in Table 5.1. For

\[ f^*(s) = \begin{cases} f_1(s) & s \in \{s : A_1 s \le b_1,\ C_r(s) \le 0\} \\ f_2(s) & s \in \{s : A_2 s \le b_2\} \\ \ \vdots & \ \vdots \\ f_N(s) & s \in \{s : A_N s \le b_N,\ C_r(s) \ge 0\}, \end{cases} \]

the conjugate domain is stored as

\[ \texttt{cdiv}\{1,1\} = \begin{bmatrix} 1 & m_{11} & c_{11} & 1 \\ 1 & m_{12} & c_{12} & 0 \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}, \qquad \texttt{cdiv}\{1,2\} = [C_r(s) \le 0], \]

\[ \texttt{cdiv}\{2,1\} = \begin{bmatrix} 1 & m_{21} & c_{21} & 1 \\ 1 & m_{22} & c_{22} & 0 \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}, \qquad \texttt{cdiv}\{2,2\} = [\ ], \]

\[ \cdots \]

\[ \texttt{cdiv}\{N,1\} = \begin{bmatrix} 1 & m_{N1} & c_{N1} & 1 \\ 1 & m_{N2} & c_{N2} & 0 \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}, \qquad \texttt{cdiv}\{N,2\} = [C_r(s) \ge 0], \]

while the expressions are stored in a symbolic array as

\[ \texttt{cxps} = \begin{bmatrix} f_1(s) & f_2(s) & \cdots & f_N(s) \end{bmatrix}. \]

This is illustrated in Figure 5.4.

Figure 5.4: Output Data structure (the cell cdiv lists the regions $P_1, \ldots, P_5$ with second-column entries $[\ ]$, $[C_r \le 0]$, $[\ ]$, $[\ ]$ and $[C_r \ge 0]$ respectively; the array cxps holds $f_1(s), f_2(s), \ldots, f_N(s)$)

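Example 5.3. For a rational function $r(x) = \frac{x_2^2}{x_2 - x_1 + 1}$ and a polytope $P$ with vertices $v_1 = (0, 0)$, $v_2 = (1, 0)$ and $v_3 = (1, 1)$, let $f(x) = r(x) + I_P(x)$. Then the conjugate of $f$ is

\[ f^*_P(s) = \begin{cases} 0 & s \in \partial f(v_1) \\ s_1 & s \in \partial f(v_2) \\ s_1 + s_2 - 1 & s \in \partial f(v_3) \\ \tfrac{1}{4}(s_1 + s_2)^2 & s \in \bigcup_{x \in \operatorname{ri}(E_{13})} \partial f(x) \end{cases} \]

where the subdifferential sets are defined in Example 4.22. So it is stored as

\[ \texttt{cxps} = \begin{bmatrix} 0 & s_1 & s_1 & s_1 + s_2 - 1 & \tfrac{1}{4}(s_1 + s_2)^2 \end{bmatrix}, \]

and

\[ \texttt{cdiv}\{1,1\} = \begin{bmatrix} 1 & -1 & 2 & 1 \\ 1 & 0 & 1 & 1 \end{bmatrix}, \qquad \texttt{cdiv}\{1,2\} = [\ ], \]

\[ \texttt{cdiv}\{2,1\} = \begin{bmatrix} 1 & 1 & 0 & 1 \end{bmatrix}, \qquad \texttt{cdiv}\{2,2\} = [\, s_1^2 + 2s_1s_2 + s_2^2 - 4s_1 \le 0 \,], \]

\[ \texttt{cdiv}\{3,1\} = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}, \qquad \texttt{cdiv}\{3,2\} = [\ ], \]

\[ \texttt{cdiv}\{4,1\} = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & -1 & 0 & 0 \end{bmatrix}, \qquad \texttt{cdiv}\{4,2\} = [\ ], \]

\[ \texttt{cdiv}\{5,1\} = \begin{bmatrix} 1 & -1 & 0 & 1 \\ 1 & -1 & 2 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}, \qquad \texttt{cdiv}\{5,2\} = [\, s_1^2 + 2s_1s_2 + s_2^2 - 4s_1 \ge 0 \,]. \]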

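Since the domain of the conjugate of a quadratic or linear function only has a polyhedral subdivision, as shown in Table 5.2, their domains with $N$ pieces are stored as an $N \times 1$ cell. The symbolic column is not required in this case.

Example 5.4. For a quadratic function $q(x) = \tfrac{1}{4}(x_1 + x_2)^2 + \tfrac{1}{8}(x_1 - x_2)$ and polytope $P$ with vertices $v_1 = (0.5, 1)$, $v_2 = (0.75, 0.75)$, $v_3 = (0.25, 0.25)$ and $v_4 = (0, 0.5)$, let $f(x) = q(x) + I_P(x)$. Then

\[ f^*_P(s) = \begin{cases} \tfrac{1}{2}(s_1 + 2s_2 - 1) & s \in \partial f(v_1) \\ \tfrac{3}{16}(4s_1 + 4s_2 - 3) & s \in \partial f(v_2) \\ \tfrac{1}{4}(s_1 + s_2)^2 & s \in \bigcup_{x \in \operatorname{ri}(E_{23})} \partial f(x) \\ \tfrac{1}{16}(4s_1 + 4s_2 - 1) & s \in \partial f(v_3) \\ \tfrac{1}{2} s_2 & s \in \partial f(v_4) \\ \tfrac{1}{16}(4s_1^2 + 8s_1s_2 + 4s_2^2 - 4s_1 + 4s_2 + 1) & s \in \bigcup_{x \in \operatorname{ri}(E_{14})} \partial f(x), \end{cases} \]

where the subdifferential sets are defined in Example 4.19. It is stored as

\[ \texttt{cxps} = [\, s_1/2 + s_2 - 1/2,\ (3s_1)/4 + (3s_2)/4 - 9/16,\ (s_1 + s_2)^2/4,\ s_1/4 + s_2/4 - 1/16,\ s_2/2,\ s_1^2/4 + (s_1s_2)/2 - s_1/4 + s_2^2/4 + s_2/4 + 1/16 \,] \]

and

\[ \texttt{cdiv}\{1,1\} = \begin{bmatrix} 1 & -1 & 1.50 & 1 \\ 1 & 1 & -0.25 & 1 \end{bmatrix}, \qquad \texttt{cdiv}\{2,1\} = \begin{bmatrix} 1 & 1 & -0.25 & 0 \\ 1 & -1 & 1.50 & 1 \end{bmatrix}, \]

\[ \texttt{cdiv}\{3,1\} = \begin{bmatrix} 1 & -1 & 1.50 & 0 \\ 1 & -1 & 0.50 & 1 \\ 1 & 1 & -0.25 & 0 \end{bmatrix}, \qquad \texttt{cdiv}\{4,1\} = \begin{bmatrix} 1 & -1 & 0.50 & 0 \\ 1 & 1 & -0.25 & 0 \end{bmatrix}, \]

\[ \texttt{cdiv}\{5,1\} = \begin{bmatrix} 1 & 1 & -0.25 & 1 \\ 1 & -1 & 0.50 & 0 \end{bmatrix}, \qquad \texttt{cdiv}\{6,1\} = \begin{bmatrix} 1 & -1 & 0.50 & 1 \\ 1 & -1 & 1.50 & 0 \\ 1 & 1 & -0.25 & 1 \end{bmatrix}. \]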
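To make the stored form of Example 5.4 concrete, here is a small Python sketch that evaluates the conjugate by looking up the row of cdiv containing $s$ and then evaluating the matching entry of cxps. It compares the result with a brute-force maximization over a grid on $P$. The row encoding is an assumption inferred from the examples in this chapter, not a convention quoted from Section 3.6.1: a row $[a, m, c, \text{flag}]$ is read as $a s_2 \ge m s_1 + c$ when the flag is 1 and as $a s_2 \le m s_1 + c$ when it is 0. Python lists and lambdas stand in for the cell and the symbolic array, and all function names are illustrative.

```python
# Assumed row encoding (inferred, see the lead-in): [a, m, c, flag] means
# a*s2 >= m*s1 + c if flag == 1, and a*s2 <= m*s1 + c if flag == 0.
cdiv = [
    [[1, -1, 1.50, 1], [1, 1, -0.25, 1]],                     # v1
    [[1, 1, -0.25, 0], [1, -1, 1.50, 1]],                     # v2
    [[1, -1, 1.50, 0], [1, -1, 0.50, 1], [1, 1, -0.25, 0]],   # ri(E23)
    [[1, -1, 0.50, 0], [1, 1, -0.25, 0]],                     # v3
    [[1, 1, -0.25, 1], [1, -1, 0.50, 0]],                     # v4
    [[1, -1, 0.50, 1], [1, -1, 1.50, 0], [1, 1, -0.25, 1]],   # ri(E14)
]
cxps = [
    lambda s1, s2: s1/2 + s2 - 1/2,
    lambda s1, s2: 3*s1/4 + 3*s2/4 - 9/16,
    lambda s1, s2: (s1 + s2)**2 / 4,
    lambda s1, s2: s1/4 + s2/4 - 1/16,
    lambda s1, s2: s2/2,
    lambda s1, s2: s1**2/4 + s1*s2/2 - s1/4 + s2**2/4 + s2/4 + 1/16,
]


def in_region(rows, s1, s2, tol=1e-12):
    """True if s satisfies every halfspace row of one cdiv entry."""
    for a, m, c, flag in rows:
        lhs, rhs = a*s2, m*s1 + c
        if flag == 1 and lhs < rhs - tol:
            return False
        if flag == 0 and lhs > rhs + tol:
            return False
    return True


def conjugate_lookup(s1, s2):
    """Evaluate f*(s) from the stored pieces (index i of cxps <-> row i of cdiv)."""
    for rows, expr in zip(cdiv, cxps):
        if in_region(rows, s1, s2):
            return expr(s1, s2)
    raise ValueError("point not covered by any stored region")


def q(x1, x2):
    return (x1 + x2)**2 / 4 + (x1 - x2) / 8


def conjugate_brute_force(s1, s2, n=200):
    """max of <s, x> - q(x) over a grid on P = {0.5 <= x1+x2 <= 1.5, 0 <= x2-x1 <= 0.5}."""
    best = -float("inf")
    for i in range(n + 1):
        u = 0.5 + i / n              # u = x1 + x2
        for j in range(n + 1):
            w = 0.5 * j / n          # w = x2 - x1
            x1, x2 = (u - w) / 2, (u + w) / 2
            best = max(best, s1*x1 + s2*x2 - q(x1, x2))
    return best


for s in [(2, 2), (3, 0), (1, 0), (1, -1), (-1, -1), (0, 1)]:
    print(s, conjugate_lookup(*s), conjugate_brute_force(*s))
```

The six test points are chosen so that each one falls in the interior of a different stored region. The grid description of $P$ is derived here from the four vertices and is exact because $P$ is a parallelogram.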

5.2 Algorithm

This section covers the main algorithms in the module to compute the conjugate of bivariate functions of forms (3.18)-(3.20). The main algorithm follows the same method as Theorem 4.27. We also discuss the algorithm used to compute the subdifferentials for the edges of the polytope P .

Table 5.2: Conjugate domain for quadratic functions

Entity     Form
Vertex     Polyhedral
ri(E)      Polyhedral
int(P)     Line Segment

Algorithm 6 Computing the conjugate for rational function
function compute_conjugate(rx, P)
    [cdiv, cdivt, ncrgn] = compute_conjugate_domain_div(rx, P);
    cxpst = compute_conjugatexps_vertices(ncrgn);
    cxps = compute_conjugatexps_edges(cdivt, cxpst);
    return cxps, cdiv
end function

5.2.1 Main Algorithm

The main algorithm is presented in Algorithm 6. It takes as input arguments the rational function rx and the polytope region P, both following the input data structures. We start with computing the domain division and store it in the variable cdiv. The variables cdivt and ncrgn contain the intermediate data for computing the domain division and the normal cone regions respectively. The variable cdivt follows the same output data structure; however, it stores the mapping between the primal polytope and the parabolic or polyhedral subdivision in the dual domain. The variable ncrgn stores the vertices and their normal cones in the primal as an $N \times 2$ cell, where $N$ is the sum of the total number of vertices and edges of polytope P. By using ncrgn, we compute the conjugate expressions corresponding to the vertices of P and store them in the variable cxpst. After that, we compute the remaining expressions using the variables cdivt and cxpst, and store them in cxps. The variables cxps and cdiv contain the conjugate expressions and the domain division, with the indices of the array cxps mapped to the rows in cdiv.

The subroutine compute_conjugate_domain_div() is presented in Algorithm 7. It iterates through each entity once and computes their subdifferentials using the different subroutines. Since it traverses them only once, for a polytope with $n$ vertices, the time complexity is $T(n) = c_1 + c_2 n + c_3 n + c_4 n$ where $c_i$ are constants, and so is bounded by $O(n)$. Similarly, for computing the expressions, we go through each entity again only once, so the overall time complexity of the algorithm remains $O(n)$.

Algorithm 7 Computing the domain of the conjugate
function compute_conjugate_domain_div(rx, rgn)
    rels1s2 = compute_curve_equation(rx, rgn);    ▷ equation of the curve
    ncrgn = compute_normal_cone_for_polyhedral_domain(rgn);
    cdivt = compute_subdiff_vertices(ncrgn);
    [cdiv, iv] = compute_subdiff_edges(rels1s2, cdivt, ncrgn);
    return cdiv, cdivt, ncrgn, iv
end function

5.2.2 Algorithm for edges

An algorithm for computing the union of subdifferentials at the edges is shown in Algorithm 8. For each edge, we start with finding its corresponding adjacent vertices and the parallel halfspaces forming the subdifferential region. We compute the normal vector on the edge and store it in variable nv. After that, using the equation of the curve rels1s2, the adjacent vertices, the parallel halfspaces prs and the normal vector, we compute the parabolic region with the "≤ 0" inequality. To verify that this is our parabolic region, we compute the gradient at the midpoint of the edge and then add the normal vector to it. If this point satisfies all the inequalities of the "≤ 0" parabolic region, we set the flag islessthan to true; otherwise it is false. When the flag is true, it leads to a subdivision of the region into a parabolic and a polyhedral region; otherwise the parabolic region is "≥ 0" and can be represented with only one parabolic region.

Similarly, while computing the expressions for the union of subdifferentials on the edges, the relation between $x$ and $s$ has a quadratic form and leads to two relations as shown in Theorem 4.27 [Case 2 (Edges)]. Since the conjugate would be continuous on the boundary with neighboring expressions, we find the conjugate expression by substituting the equation of the boundaries in the two candidates and verifying its equality with the neighbors.

The cases of the quadratic and linear functions are simple and do not require any further subdivision or verification of the expressions.

Algorithm 8 Computing the subdifferentials for the edges
function compute_subdiff_edges(rels1s2, cdivt, ncrgn)
    j = 1;    ▷ next free row of cdiv
    for i = 1 to size(cdivt,1) do
        if isempty(cdivt{i,1}) then    ▷ if an edge
            [vli,vri] = choose_proper_indexpm1(i, size(ncrgn,1));
            v = [cdivt{vli,1}; cdivt{vri,1}];
            hslv = cdivt{vli,2}(2,:); hsrv = cdivt{vri,2}(1,:);
            prs = [hslv(1:3), abs(1-hslv(4)); hsrv(1:3), abs(1-hsrv(4))];
            [nv] = compute_normal_vector(ncrgn{i,1}, ncrgn{vli,1}, ncrgn{i,2}(1,:));
            [ldr] = compute_curve_less_than_region_parll(rels1s2, v, prs, nv);
            [x1v,x2v] = compute_midpoint_on_edge(ncrgn{i,1}, ncrgn{vli,1}, ncrgn{vri,1});
            sc = [drx1(x1v,x2v), drx2(x1v,x2v)];
            ptc = sc + nv;
            if inside_polytope_check(ldr{1,1}, ptc(1), ptc(2)) and rl(ptc(1), ptc(2)) ≤ 0 then
                islessthan = true;
            else if inside_polytope_check(ldr{2,1}, ptc(1), ptc(2)) then
                islessthan = true;
            else
                islessthan = false;
            end if
            if islessthan then
                cdiv{j,1} = ldr{1,1}; cdiv{j,2} = ldr{1,2}; iv(j) = i; j++;
                cdiv{j,1} = ldr{2,1}; iv(j) = i; j++;
            else
                cdiv{j,1} = [prs; ldr{1,1}]; cdiv{j,2} = [rels1s2 >= 0];
                iv(j) = i; j++;
            end if
        end if
    end for
    return cdiv, iv
end function

82 5.3. Example 1

Figure 5.5: Normal cone division lines for Example 5.3

5.3 Example 1

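For a bivariate rational function $r(x) = \frac{x_2^2}{x_2 - x_1 + 1}$ defined over a polytope $P$ with vertices $v_1 = (1, 1)$, $v_2 = (1, 0)$ and $v_3 = (0, 0)$, let $f(x) = r(x) + I_P(x)$. We start with computing the normal cone division lines for the polytope as shown in Figure 5.5 and store them in the variable ncrgn as

\[ \texttt{ncrgn}\{1,1\} = \begin{bmatrix} 1 & 1 \end{bmatrix}, \qquad \texttt{ncrgn}\{1,2\} = \begin{bmatrix} 1 & -1 & 2 & 1 \\ 1 & 0 & 1 & 1 \end{bmatrix}, \]

\[ \texttt{ncrgn}\{2,1\} = \begin{bmatrix} 0 & 1 & -1 & 1 \end{bmatrix}, \qquad \texttt{ncrgn}\{2,2\} = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}, \]

\[ \texttt{ncrgn}\{3,1\} = \begin{bmatrix} 1 & 0 \end{bmatrix}, \qquad \texttt{ncrgn}\{3,2\} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 \end{bmatrix}, \]

\[ \texttt{ncrgn}\{4,1\} = \begin{bmatrix} 1 & 0 & 0 & 1 \end{bmatrix}, \qquad \texttt{ncrgn}\{4,2\} = \begin{bmatrix} 0 & 1 & -1 & 1 \\ 0 & 1 & 0 & 0 \end{bmatrix}, \]

\[ \texttt{ncrgn}\{5,1\} = \begin{bmatrix} 0 & 0 \end{bmatrix}, \qquad \texttt{ncrgn}\{5,2\} = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & -1 & 0 & 0 \end{bmatrix}, \]

\[ \texttt{ncrgn}\{6,1\} = \begin{bmatrix} 1 & 1 & 0 & 0 \end{bmatrix}, \qquad \texttt{ncrgn}\{6,2\} = \begin{bmatrix} 1 & -1 & 0 & 1 \\ 1 & -1 & 2 & 0 \end{bmatrix}. \]

All the arrays in the first column represent an entity of the polytope $P$, with adjacent elements being the neighboring entities when traversed in the clockwise direction, while the second column shows the normal cone (or union of normal cones) region corresponding to that entity. The arrays of length 2 are vertices while the ones with length 4 are the equations of the edges. For example, the equation of the line given by ncrgn{2, 1} is the relation of the points lying on the edge between vertices ncrgn{1, 1} and ncrgn{3, 1}, while the region given by ncrgn{1, 2} is the normal cone for vertex $v_1$ and ncrgn{2, 2} is the union of normal cones when presented in the primal domain.

We compute the gradients of $r$ as

\[ \texttt{drx1}(x_1, x_2) = \left( \frac{x_2}{x_2 - x_1 + 1} \right)^2, \qquad \texttt{drx2}(x_1, x_2) = 2\left( \frac{x_2}{x_2 - x_1 + 1} \right) - \left( \frac{x_2}{x_2 - x_1 + 1} \right)^2, \]

where $\texttt{drx1} = \frac{\partial r(x)}{\partial x_1}$ and $\texttt{drx2} = \frac{\partial r(x)}{\partial x_2}$ are symbolic variables storing the gradients respectively. Using the gradients, we compute the equation of the parabolic curve for $r$, as shown in Example 4.8, and store it in the symbolic variable $\texttt{rels1s2} = s_1^2 + 2s_1s_2 + s_2^2 - 4s_1$.

After that, we compute the value of the gradients at all the vertices, as the endpoints of the parabolic arc would be the farthest two vertices. Now

\[ \texttt{drx1}(v_1) = 1, \quad \texttt{drx2}(v_1) = 1, \qquad \texttt{drx1}(v_3) = 0, \quad \texttt{drx2}(v_3) = 0; \]

however, since $r(v_2) = \frac{0}{0}$ is not defined, and

\[ \lim_{x_1 \to v_{21}} \Big( \lim_{x_2 \to v_{22}} \frac{\partial r(x)}{\partial x_1} \Big) \ne \lim_{x_2 \to v_{22}} \Big( \lim_{x_1 \to v_{21}} \frac{\partial r(x)}{\partial x_1} \Big) \]

and

\[ \lim_{x_1 \to v_{21}} \Big( \lim_{x_2 \to v_{22}} \frac{\partial r(x)}{\partial x_2} \Big) \ne \lim_{x_2 \to v_{22}} \Big( \lim_{x_1 \to v_{21}} \frac{\partial r(x)}{\partial x_2} \Big), \]

the set $\partial f(v_2)$ is a union of a parabolic and a polyhedral region given by $\mathbb{R}^2 \setminus \bigcup_{x \in P \setminus v_2} \partial f(x)$, so it would be computed at the end. Also, since for edges $E_{12}$ and $E_{23}$,

\[ \texttt{drx1}(E_{12}) = \texttt{drx1}(v_1) \text{ and } \texttt{drx2}(E_{12}) = \texttt{drx2}(v_1), \]

\[ \texttt{drx1}(E_{23}) = \texttt{drx1}(v_3) \text{ and } \texttt{drx2}(E_{23}) = \texttt{drx2}(v_3), \]

the subdifferentials for these edges lie along a ray and can be deduced from the neighboring entities.

Along with computing the gradients and the normal cones at vertices, we set up an identifier for $v_2$ in the transitioning variable cdivt as

\[ \texttt{cdivt}\{1,1\} = \begin{bmatrix} 1 & 1 \end{bmatrix}, \qquad \texttt{cdivt}\{1,2\} = \begin{bmatrix} 1 & -1 & 2 & 1 \\ 1 & 0 & 1 & 1 \end{bmatrix}, \]

\[ \texttt{cdivt}\{2,1\} = \begin{bmatrix} \text{NaN} & \text{NaN} & \text{NaN} & \text{NaN} \end{bmatrix}, \qquad \texttt{cdivt}\{2,2\} = [\ ], \]

\[ \texttt{cdivt}\{3,1\} = \begin{bmatrix} \text{NaN} & \text{NaN} \end{bmatrix}, \qquad \texttt{cdivt}\{3,2\} = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}, \]

\[ \texttt{cdivt}\{4,1\} = \begin{bmatrix} \text{NaN} & \text{NaN} & \text{NaN} & \text{NaN} \end{bmatrix}, \qquad \texttt{cdivt}\{4,2\} = [\ ], \]

\[ \texttt{cdivt}\{5,1\} = \begin{bmatrix} 0 & 0 \end{bmatrix}, \qquad \texttt{cdivt}\{5,2\} = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & -1 & 0 & 0 \end{bmatrix}, \]

\[ \texttt{cdivt}\{6,1\} = [\ ], \qquad \texttt{cdivt}\{6,2\} = [\ ]. \]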

An array in the first column with two NaN values signifies the vertex with the special case, and one with four NaN values signifies an edge for which the computation of subdifferentials is not required, as it is a ray and can be deduced from the neighboring entities. At this point, we only need to compute $\partial f(v_2)$ and $\bigcup_{x \in \operatorname{ri}(E_{13})} \partial f(x)$.

For the edge, we start with finding the gradients at the adjacent vertices, which in this case are $\nabla f(v_1) = [1, 1]$ and $\nabla f(v_3) = [0, 0]$. Now the parallel halfspaces forming the subdifferential region would be the opposite region of one of the halfspaces forming the normal cones at these vertices, that is,

\[ \texttt{prs} = \begin{bmatrix} 1 & -1 & 0 & 1 \\ 1 & -1 & 2 & 0 \end{bmatrix}, \]

which is [ cdivt{5, 2}(2, 1 : 3), abs(1 − cdivt{5, 2}(2, 4)); cdivt{1, 2}(1, 1 : 3), abs(1 − cdivt{1, 2}(1, 4)) ]. After that, we compute the normal vector corresponding to the edge of $P$ and store it as $\texttt{nv} = [-1/\sqrt{2},\ 1/\sqrt{2}]$. The normal cone at any point $x \in \operatorname{ri}(E_{13})$ is $\lambda \times \texttt{nv}$ for $\lambda \ge 0$, so $s = \nabla f(x) + \texttt{nv} \in \bigcup_{x \in \operatorname{ri}(E_{13})} \partial f(x)$. We pick $x$ to be the midpoint of the vertices $v_1$ and $v_3$ lying on the edge and store it in $\texttt{mp} = [1/2, 1/2]$. Since we do not know on which side of the parabolic arc the region lies, we assume that the region lies on the "≤ 0" side of the parabolic arc, which we verify by the criterion $s \in \bigcup_{x \in \operatorname{ri}(E_{13})} \partial f(x)$. This region is stored as

\[ \texttt{ldr}\{1,1\} = \begin{bmatrix} 1 & 1 & 0 & 1 \end{bmatrix}, \qquad \texttt{ldr}\{1,2\} = [\, s_1^2 + 2s_1s_2 + s_2^2 - 4s_1 \le 0 \,], \]

\[ \texttt{ldr}\{2,1\} = \begin{bmatrix} 1 & -1 & 0 & 1 \\ 1 & -1 & 2 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix}, \qquad \texttt{ldr}\{2,2\} = [\ ]. \]

However, since $s = \nabla f(\texttt{mp}) + \texttt{nv} \approx [-0.4571,\ 1.4571]$ fails to satisfy the constraints for ldr, as

\[ s_1^2 + 2s_1s_2 + s_2^2 - 4s_1 = 2.8284 \not\le 0, \]

it belongs to the "≥ 0" side of the parabolic arc, so the subdifferentials for the region are stored as

\[ \texttt{cdiv}\{5,1\} = \begin{bmatrix} 1 & -1 & 0 & 1 \\ 1 & -1 & 2 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}, \qquad \texttt{cdiv}\{5,2\} = [\, 0 \le s_1^2 + 2s_1s_2 - 4s_1 + s_2^2 \,]. \]
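The side test described above can be reproduced numerically. The Python sketch below is a simplified stand-in for the check in Algorithm 8: it evaluates only the sign of the curve expression at the test point and skips the halfspace membership checks. The function names are illustrative and not those of the thesis module.

```python
import math


def grad_r(x1, x2):
    """Gradient of r(x) = x2^2 / (x2 - x1 + 1), written in terms of t."""
    t = x2 / (x2 - x1 + 1)
    return (t**2, 2*t - t**2)


def rels1s2(s1, s2):
    """Left-hand side of the parabola s1^2 + 2 s1 s2 + s2^2 - 4 s1 = 0."""
    return s1**2 + 2*s1*s2 + s2**2 - 4*s1


mp = (0.5, 0.5)                               # midpoint of the edge E13
nv = (-1 / math.sqrt(2), 1 / math.sqrt(2))    # unit normal to the edge
g = grad_r(*mp)                               # gradient at the midpoint
s = (g[0] + nv[0], g[1] + nv[1])              # test point inside the edge region
value = rels1s2(*s)
islessthan = value <= 0                       # sign test only (simplification)
print(g, rels1s2(*g))
print(s, value, islessthan)
```

The gradient at the midpoint lies on the parabola itself, while the shifted point evaluates to a positive value, so the region is on the "≥ 0" side, in agreement with the stored cdiv{5, 2}.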

Finally, we compute $\partial f(v_2)$, which is identified by an array with two NaN values in cdivt. Since $\partial f(v_2)$ is a convex set, it lies on the “≤ 0” side of the parabolic arc, so we only need to compute the halfspaces forming the boundary of this region. Since $\nabla f(E_{12})$ and $\nabla f(E_{23})$ map to $\nabla f(v_1)$ and $\nabla f(v_3)$ respectively, similar to the edge $E_{13}$, the halfspaces are the opposite sides of the hyperplanes shared with the normal cones at the adjacent vertices, and are stored as a subdivision with one parabolic and one polyhedral region as

cdiv{2, 1} = [1 1 0 1]    cdiv{2, 2} = [ $s_1^2 + 2 s_1 s_2 + s_2^2 - 4 s_1 \le 0$ ]
cdiv{3, 1} = [1 1 0 0; 1 0 1 0; 0 1 0 0]    cdiv{3, 2} = [ ]

Finally, all the regions would be stored in cdiv as

cdiv{1, 1} = [1 −1 2 1; 1 0 1 1]    cdiv{1, 2} = [ ]
cdiv{2, 1} = [1 1 0 1]    cdiv{2, 2} = [ $s_1^2 + 2 s_1 s_2 - 4 s_1 + s_2^2 \le 0$ ]
cdiv{3, 1} = [1 1 0 0; 1 0 1 0; 0 1 −0 0]    cdiv{3, 2} = [ ]
cdiv{4, 1} = [0 1 −0 1; 1 −1 0 0]    cdiv{4, 2} = [ ]
cdiv{5, 1} = [1 −1 0 1; 1 −1 2 0; 1 1 0 1]    cdiv{5, 2} = [ $0 \le s_1^2 + 2 s_1 s_2 - 4 s_1 + s_2^2$ ].

Along with storing the transition matrix, we also store the indices mapping to compute the corresponding expression for the subregions in cdiv as an array

iv = [1, 3, 3, 5, 6],

where the index of iv corresponds to the row in cdiv and the value at that index to the row in cdivt. Next we use the variables ncrgn and rx to compute the conjugate expressions for vertices v1 and v3 as

cxpst(1) = s1 + s2 − 1, and cxpst(5) = 0.

For $v_2$, $f(v_2) = \frac{0}{0}$ is undefined. However, since $f$ is the closed convex envelope of a nonconvex quadratic function $Q$, $\lim_{x \to v_2} f(x)$ should exist, and computing the iterated limit gives $\lim_{x_1 \to v_{21}} \left( \lim_{x_2 \to v_{22}} f(x) \right) = 0$. So the conjugate expression corresponding to $v_2$ is

cxpst(3) = 1 · s1 + 0 · s2 − 0 = s1.

Now for the edge $E_{13}$, by computing the critical point, from Theorem 4.27 [Case 2], we have $s - (\nabla r(x) + N_P(x)) = 0$, so

\[ s_1 = t^2 + \lambda, \qquad s_2 = 2t - t^2 - \lambda, \]

where $t = \dfrac{x_2}{x_2 - x_1 + 1}$. By adding both of the above equations, we get

\[ t = \frac{s_1 + s_2}{2}. \]


Figure 5.6: Conjugate for Example 5.3

Since the points lie on the edge $E_{13}$, we also have $x_2 = x_1$. So $x_2 = x_1 = \dfrac{s_1 + s_2}{2}$. By using the optimal values of $x_1$ and $x_2$, upon simplification, the conjugate expression becomes

\[ \mathrm{cxpst}(6) = \frac{(s_1 + s_2)^2}{4}. \]

Now by using cxpst and iv, we get cxps = cxpst(iv) as

\[ \mathrm{cxps} = \left[\, s_1 + s_2 - 1,\; s_1,\; s_1,\; 0,\; \frac{(s_1 + s_2)^2}{4} \,\right]. \]

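The conjugate in mathematical form, with the expressions listed in the same order as cxps, is

\[ f_P^*(s) = \begin{cases} s_1 + s_2 - 1 & s \in R_1 \\ s_1 & s \in R_2 \\ s_1 & s \in R_3 \\ 0 & s \in R_4 \\ \frac{1}{4}(s_1 + s_2)^2 & s \in R_5 \end{cases} \]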

where

R1 = {s : s2 ≥ −s1 + 2, s2 ≥ 1}
R2 = {s : s2 ≥ s1, $s_1^2 + 2 s_1 s_2 - 4 s_1 + s_2^2 \le 0$}
R3 = {s : s2 ≤ s1, s2 ≤ 1, s1 ≥ 0}
R4 = {s : s1 ≤ 0, s2 ≤ −s1}
R5 = {s : s2 ≥ −s1, s2 ≤ −s1 + 2, s2 ≥ s1, $s_1^2 + 2 s_1 s_2 - 4 s_1 + s_2^2 \ge 0$}.

It is shown in Figure 5.6.
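The index mapping cxps = cxpst(iv) used in this example can be sketched as follows. This is only an illustration in Python, with the symbolic expressions held as plain strings and a dictionary standing in for the sparse cxpst array; the 1-based keys mirror the indices used in the text.

```python
# Conjugate expressions computed per row of cdivt (only rows 1, 3, 5, 6 are filled).
cxpst = {
    1: "s1 + s2 - 1",
    3: "s1",
    5: "0",
    6: "(s1 + s2)^2/4",
}

# iv[k] is the row of cdivt whose expression belongs to row k + 1 of cdiv.
iv = [1, 3, 3, 5, 6]

# Equivalent of the indexing operation cxps = cxpst(iv).
cxps = [cxpst[row] for row in iv]
print(cxps)
```

Rows 2 and 3 of cdiv share the expression of row 3 of cdivt because the subdifferential of v2 is split into one parabolic and one polyhedral region.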

5.4 Example 2

For

\[ r(x) = \frac{36 x_1^2 + 21 x_1 x_2 + 36 x_2^2 - 81 x_1 + 24 x_2 - 252}{-12 x_1 + 9 x_2 + 75} \]

and the polytope $P$ formed by vertices $v_1 = (-1, 1)$, $v_2 = (-3, -3)$ and $v_3 = (-4, -3)$, let $f(x) = r(x) + I_P(x)$. We start with computing the normal cones for each entity and store them in the variable ncrgn as

ncrgn{1, 1} = [−1 1]    ncrgn{1, 2} = [1 −0.75 0.25 1; 1 −0.50 0.50 1]
ncrgn{2, 1} = [1 2 3 1]    ncrgn{2, 2} = [1 −0.50 0.50 0; 1 −0.50 −4.50 1]
ncrgn{3, 1} = [−3 −3]    ncrgn{3, 2} = [1 −0.50 −4.50 0; 0 1 3 0]
ncrgn{4, 1} = [1 0 −3 1]    ncrgn{4, 2} = [0 1 3 1; 0 1 4 0]
ncrgn{5, 1} = [−4 −3]    ncrgn{5, 2} = [0 1 4 1; 1 −0.75 −6 0]
ncrgn{6, 1} = [1 1.33 2.33 0]    ncrgn{6, 2} = [1 −0.75 −6 1; 1 −0.75 0.25 0]

The normal cone division lines are shown in Figure 5.7. Next we compute the gradients as

\[ \mathrm{drx1}(x) = \frac{-48 x_1^2 + 72 x_1 x_2 + 600 x_1 + 69 x_2^2 + 126 x_2 - 1011}{(3 x_2 - 4 x_1 + 25)^2}, \]

\[ \mathrm{drx2}(x) = \frac{-64 x_1^2 - 96 x_1 x_2 + 224 x_1 + 36 x_2^2 + 600 x_2 + 452}{(3 x_2 - 4 x_1 + 25)^2}. \]


Figure 5.7: Normal cone division lines for Example 5.4

Using the above gradients, we compute the equation of the parabolic curve for $r$ and store it in the symbolic variable $rl(s_1, s_2) = 9 s_1^2 + 24 s_1 s_2 - 234 s_1 + 16 s_2^2 + 200 s_2 - 527$. Now the gradients at each vertex are

drx1(v1) = −1.5 drx2(v1) = 0.8750

drx1(v2) = −3 drx2(v2) = −4

drx1(v3) = −3 drx2(v3) = −4.
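The vertex gradients listed above can be checked by direct evaluation of drx1 and drx2. The snippet below is a plain numerical sketch in Python rather than the symbolic computation used in the thesis; the dictionary names are illustrative only.

```python
def drx1(x1, x2):
    """Partial derivative of r with respect to x1."""
    num = -48 * x1**2 + 72 * x1 * x2 + 600 * x1 + 69 * x2**2 + 126 * x2 - 1011
    return num / (3 * x2 - 4 * x1 + 25) ** 2

def drx2(x1, x2):
    """Partial derivative of r with respect to x2."""
    num = -64 * x1**2 - 96 * x1 * x2 + 224 * x1 + 36 * x2**2 + 600 * x2 + 452
    return num / (3 * x2 - 4 * x1 + 25) ** 2

vertices = {"v1": (-1.0, 1.0), "v2": (-3.0, -3.0), "v3": (-4.0, -3.0)}
grads = {name: (drx1(*v), drx2(*v)) for name, v in vertices.items()}

for name, g in grads.items():
    print(name, g)
```

The equal gradients at v2 and v3 are what make the subdifferential of the edge E23 degenerate into a ray.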

Since $\nabla f(v_2) = \nabla f(v_3)$, the set $\partial f(E_{23})$ is a ray. So the transition variable cdivt is

cdivt{1, 1} = [−1.50 0.88]    cdivt{1, 2} = [1 −0.75 −0.25 1; 1 −0.50 0.12 1]
cdivt{2, 1} = [ ]    cdivt{2, 2} = [ ]
cdivt{3, 1} = [−3 −4]    cdivt{3, 2} = [1 −0.50 −5.50 0; 0 1 3 0]
cdivt{4, 1} = [nan, nan, nan, nan]    cdivt{4, 2} = [ ]
cdivt{5, 1} = [−3 −4]    cdivt{5, 2} = [0 1 3 1; 1 −0.75 −6.25 0]
cdivt{6, 1} = [ ]    cdivt{6, 2} = [ ]

where the nan values in cdivt{4, 1} represent the subdifferential information of $E_{23}$, which can be deduced from the neighboring entities, so direct computation of the subdifferentials is not required. At this point, we only need to compute $\bigcup_{x \in \mathrm{ri}(E_{12})} \partial f(x)$ and $\bigcup_{x \in \mathrm{ri}(E_{13})} \partial f(x)$. For edge $E_{12}$, we compute the parallel halfspaces forming the subdifferential region, which are the regions on the opposite side of one of the halfspaces forming the normal cones at the adjacent vertices. We store this in the variable

prs = [1 −0.50 0.12 0; 1 −0.50 −5.50 1].

Now we only need to verify on which side of the parabolic arc the subdifferential region lies. So we formulate the “≤ 0” side of the inequality region as

ldr{1, 1} = [1 3.25 5.75 1]    ldr{1, 2} = [ $9 s_1^2 + 24 s_1 s_2 - 234 s_1 + 16 s_2^2 + 200 s_2 - 527 \le 0$ ]
ldr{2, 1} = [1 −0.50 0.12 0; 1 −0.50 −5.50 1; 1 3.25 5.75 0]    ldr{2, 2} = [ ].

Then we compute the normal vector nv = [ 0.8944, −0.4472 ] and the value of the gradient at the midpoint of the edge as sc = [ −2.5733, −1.1200 ]. Next, we verify whether the point s = nv + sc = [ −1.6789, −1.5672 ] lies in the region ldr. We have

\[ 9 s_1^2 + 24 s_1 s_2 - 234 s_1 + 16 s_2^2 + 200 s_2 - 527 = -319.7627 \le 0, \]

hence the subdifferential region corresponding to the edge $E_{12}$ is ldr, and it is stored in cdiv as

cdiv{2, 1} = [1 3.25 5.75 1]    cdiv{2, 2} = [ $9 s_1^2 + 24 s_1 s_2 - 234 s_1 + 16 s_2^2 + 200 s_2 - 527 \le 0$ ]
cdiv{3, 1} = [1 −0.50 0.12 0; 1 −0.50 −5.50 1; 1 3.25 5.75 0]    cdiv{3, 2} = [ ].
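The side test for an edge can be reproduced with a few lines of numerical code. The sketch below is written in Python for illustration and is not the symbolic implementation of the thesis; `side_test` is a hypothetical helper, and the normal vector is passed in as given in the text rather than derived from the polytope.

```python
import math

def drx1(x1, x2):
    num = -48 * x1**2 + 72 * x1 * x2 + 600 * x1 + 69 * x2**2 + 126 * x2 - 1011
    return num / (3 * x2 - 4 * x1 + 25) ** 2

def drx2(x1, x2):
    num = -64 * x1**2 - 96 * x1 * x2 + 224 * x1 + 36 * x2**2 + 600 * x2 + 452
    return num / (3 * x2 - 4 * x1 + 25) ** 2

def rl(s1, s2):
    """Parabolic curve of the example, evaluated at s."""
    return 9 * s1**2 + 24 * s1 * s2 - 234 * s1 + 16 * s2**2 + 200 * s2 - 527

def side_test(va, vb, nv):
    """Evaluate rl at s = grad r(midpoint of va, vb) + nv and report the side."""
    mid = ((va[0] + vb[0]) / 2.0, (va[1] + vb[1]) / 2.0)
    sc = (drx1(*mid), drx2(*mid))
    s = (sc[0] + nv[0], sc[1] + nv[1])
    value = rl(*s)
    return s, value, ("<= 0" if value <= 0.0 else ">= 0")

v1, v2 = (-1.0, 1.0), (-3.0, -3.0)
nv12 = (2.0 / math.sqrt(5.0), -1.0 / math.sqrt(5.0))  # [0.8944, -0.4472]
s, value, side = side_test(v1, v2, nv12)
print(s, value, side)
```

The same helper applies to any other edge once its two vertices and its normal vector are supplied.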

Similarly for the edge $E_{13}$, the parallel halfspaces are

prs = [1 −0.75 −6.25 1; 1 −0.75 −0.25 0],

with nv = [ −0.8, 0.6 ] and sc = [ −2.6250, −1.2812 ]. So s = [ −3.4250, −0.6813 ], but

\[ 9 s_1^2 + 24 s_1 s_2 - 234 s_1 + 16 s_2^2 + 200 s_2 - 527 = 307.2 \not\le 0, \]

so it lies on the “≥ 0” side of the inequality. The subdifferential region, in this case, is then stored as

cdiv{6, 1} = [1 −0.75 −6.25 1; 1 −0.75 −0.25 0; 1 3.25 5.75 1]
cdiv{6, 2} = [ $9 s_1^2 + 24 s_1 s_2 - 234 s_1 + 16 s_2^2 + 200 s_2 - 527 \ge 0$ ].

Finally, all the subdifferential regions are stored in cdiv as

cdiv{1, 1} = [1 −0.75 −0.25 1; 1 −0.50 0.12 1]    cdiv{1, 2} = [ ]
cdiv{2, 1} = [1 3.25 5.75 1]    cdiv{2, 2} = [ $9 s_1^2 + 24 s_1 s_2 - 234 s_1 + 16 s_2^2 + 200 s_2 - 527 \le 0$ ]
cdiv{3, 1} = [1 −0.50 0.12 0; 1 −0.50 −5.50 1; 1 3.25 5.75 0]    cdiv{3, 2} = [ ]
cdiv{4, 1} = [1 −0.50 −5.50 0; 0 1 3 0]    cdiv{4, 2} = [ ]
cdiv{5, 1} = [0 1 3 1; 1 −0.75 −6.25 0]    cdiv{5, 2} = [ ]
cdiv{6, 1} = [1 −0.75 −6.25 1; 1 −0.75 −0.25 0; 1 3.25 5.75 1]    cdiv{6, 2} = [ $9 s_1^2 + 24 s_1 s_2 - 234 s_1 + 16 s_2^2 + 200 s_2 - 527 \ge 0$ ]

and the index mapping variable as iv = [1, 2, 2, 3, 5, 6]. Now using ncrgn and r, the conjugate expressions corresponding to the vertices are

cxpst(1) = −s1 + s2 + 1,

cxpst(3) = −3s1 − 3s2 − 9, and

cxpst(5) = −4s1 − 3s2 − 12.


For the edge $E_{12}$, by computing the critical point, from Theorem 4.27 [Case 2], we have $s - (\nabla r(x) + N_P(x)) = 0$, so

\[ s_1 = \mathrm{drx1}(x) + \alpha, \tag{5.1} \]

\[ s_2 = \mathrm{drx2}(x) - \alpha/2, \tag{5.2} \]

and since the points lie on an edge, we also have

\[ x_2 = 2 x_1 + 3. \tag{5.3} \]

By eliminating $\alpha$ from Equations (5.1)-(5.3), we have

\[ (s_1 + 2 s_2 - 37) x_1^2 + (34 s_1 + 68 s_2 - 1258) x_1 + 289 s_1 + 578 s_2 - 1285 = 0, \]

and upon simplification, we get

\[ x_1 = \frac{-17 s_1 - 34 s_2 + 629 \pm 56 \sqrt{3 (-s_1 - 2 s_2 + 37)}}{s_1 + 2 s_2 - 37}. \]

Now by using the above equation and (5.3), we have two conjugate expressions as

\[ \mathrm{tcxp}(1) = -\left( \frac{336 s_1 + 672 s_2 - 12432}{\sqrt{3 (-s_1 - 2 s_2 + 37)}} + 17 s_1 + 31 s_2 - 1181 \right), \]

\[ \mathrm{tcxp}(2) = \frac{336 s_1 + 672 s_2 - 12432}{\sqrt{3 (-s_1 - 2 s_2 + 37)}} - 17 s_1 - 31 s_2 + 1181. \]

Since the conjugate is continuous, we verify the continuity of the expressions at the boundaries with the neighboring entities. The neighboring entities in this case are the subdifferential regions corresponding to vertices $v_1$ and $v_2$, and the shared boundaries are the rays $s_2 = -0.5 s_1 + 0.1250$, $s_1 \ge -1.5$ and $s_2 = -0.5 s_1 - 5.5$, $s_1 \ge -3$, with expressions $cl(s) = -s_1 + s_2 + 1$ and $cr(s) = -3 s_1 - 3 s_2 - 9$ respectively. So on the boundaries, $clb(s_1) = cl(s_1, -0.5 s_1 + 0.1250) = -\frac{3}{2} s_1 + \frac{9}{8}$ and $crb(s_1) = cr(s_1, -0.5 s_1 - 5.5) = -\frac{3}{2} s_1 + \frac{15}{2}$. The restrictions of tcxp(1) along the boundaries are

\[ \mathrm{tpcl}(1) = -\frac{3}{2} s_1 + \frac{18825}{8}, \qquad \mathrm{tpcr}(1) = -\frac{3}{2} s_1 + \frac{5391}{2}, \]

and those of tcxp(2) are

\[ \mathrm{tpcl}(2) = -\frac{3}{2} s_1 + \frac{9}{8}, \qquad \mathrm{tpcr}(2) = -\frac{3}{2} s_1 + \frac{15}{2}. \]

Only tcxp(2) agrees with clb and crb on both boundaries.


So for edge E12, the conjugate expression is

\[ \mathrm{cxpst}(2) = \frac{336 s_1 + 672 s_2 - 12432}{\sqrt{3 (-s_1 - 2 s_2 + 37)}} - 17 s_1 - 31 s_2 + 1181. \]
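The continuity criterion that selects this expression can be checked numerically. The following Python sketch evaluates the selected expression along the two boundary rays and compares it with the neighboring vertex expressions cl and cr; the sample values of s1 and the function names are illustrative assumptions.

```python
import math

def cxpst2(s1, s2):
    """Selected conjugate expression for the edge E12."""
    root = math.sqrt(3.0 * (-s1 - 2.0 * s2 + 37.0))
    return (336.0 * s1 + 672.0 * s2 - 12432.0) / root - 17.0 * s1 - 31.0 * s2 + 1181.0

def cl(s1, s2):
    """Conjugate expression of the neighboring vertex v1."""
    return -s1 + s2 + 1.0

def cr(s1, s2):
    """Conjugate expression of the neighboring vertex v2."""
    return -3.0 * s1 - 3.0 * s2 - 9.0

gaps = []
for s1 in (-1.5, 0.0, 2.0, 10.0):        # points on the ray shared with v1
    s2 = -0.5 * s1 + 0.125
    gaps.append(abs(cxpst2(s1, s2) - cl(s1, s2)))
for s1 in (-3.0, 0.0, 2.0, 10.0):        # points on the ray shared with v2
    s2 = -0.5 * s1 - 5.5
    gaps.append(abs(cxpst2(s1, s2) - cr(s1, s2)))

max_gap = max(gaps)
print(max_gap)
```

A gap at the level of floating-point round-off on both rays indicates that the expression joins its neighbors continuously.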

Similarly for edge E13, for s − (∇r(x) + NP (x)) = 0, we have

\[ s_1 = \mathrm{drx1}(x) + \alpha, \tag{5.4} \]

\[ s_2 = \mathrm{drx2}(x) - \frac{3\alpha}{4}. \tag{5.5} \]

Also, from the equation of the line forming the edge,

\[ x_2 = \frac{4 x_1}{3} + \frac{7}{3}. \tag{5.6} \]

Now by eliminating $\alpha$ from Equations (5.4)-(5.5), we get

\[ x_1 = \frac{3}{8} s_1 + \frac{s_2}{2} - \frac{7}{8}. \]

Finally, by using the above equation and (5.6), the conjugate expression for $E_{13}$ comes out to be

\[ \mathrm{cxpst}(6) = \frac{1}{48} \left( 9 s_1^2 + 24 s_1 s_2 + 16 s_2^2 - 42 s_1 + 56 s_2 + 49 \right). \]

So by using cxpst and iv, the conjugate expressions cxps = cxpst(iv) are

\[ \begin{aligned} \mathrm{cxps} = \Big[ & -s_1 + s_2 + 1, \\ & \frac{336 s_1 + 672 s_2 - 12432}{\sqrt{3 (-s_1 - 2 s_2 + 37)}} - 17 s_1 - 31 s_2 + 1181, \\ & \frac{336 s_1 + 672 s_2 - 12432}{\sqrt{3 (-s_1 - 2 s_2 + 37)}} - 17 s_1 - 31 s_2 + 1181, \\ & -3 s_1 - 3 s_2 - 9, \\ & -4 s_1 - 3 s_2 - 12, \\ & \tfrac{1}{48} \left( 9 s_1^2 + 24 s_1 s_2 + 16 s_2^2 - 42 s_1 + 56 s_2 + 49 \right) \Big]. \end{aligned} \]

The conjugate in mathematical form is

\[ f_P^*(s) = \begin{cases} -s_1 + s_2 + 1 & s \in R_1 \\ \dfrac{336 s_1 + 672 s_2 - 12432}{\sqrt{3 (-s_1 - 2 s_2 + 37)}} - 17 s_1 - 31 s_2 + 1181 & s \in R_2 \\ \dfrac{336 s_1 + 672 s_2 - 12432}{\sqrt{3 (-s_1 - 2 s_2 + 37)}} - 17 s_1 - 31 s_2 + 1181 & s \in R_3 \\ -3 s_1 - 3 s_2 - 9 & s \in R_4 \\ -4 s_1 - 3 s_2 - 12 & s \in R_5 \\ \frac{1}{48} \left( 9 s_1^2 + 24 s_1 s_2 + 16 s_2^2 - 42 s_1 + 56 s_2 + 49 \right) & s \in R_6 \end{cases} \]

Figure 5.8: Conjugate for Example 5.4

where

R1 = {s : 3s1 + 4s2 + 1 ≥ 0, 4s1 + 8s2 − 1 ≥ 0}

R2 = {s : −13s1 + 4s2 − 23 ≥ 0,C(s) ≤ 0}

R3 = {s : −13s1 + 4s2 − 23 ≤ 0, 4s1 + 8s2 − 1 ≤ 0, s1 + 2s2 + 11 ≥ 0}

R4 = {s : s1 + 2s2 + 11 ≤ 0, s1 ≥ −3}

R5 = {s : s1 ≤ −3, 3s1 + 4s2 + 25 ≤ 0}

R6 = {s : 3s1 + 4s2 + 25 ≥ 0, 3s1 + 4s2 + 1 ≤ 0, −13s1 + 4s2 − 23 ≥ 0, C(s) ≥ 0}

for $C(s) = 9 s_1^2 + 24 s_1 s_2 - 234 s_1 + 16 s_2^2 + 200 s_2 - 527$. It is shown in Figure 5.8.
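As a sanity check, the piecewise conjugate of this example can be evaluated and compared with a brute-force supremum of s·x − r(x) over the triangle P. The sketch below is in Python and is not part of the thesis code; the functions `conjugate` and `brute_force`, the grid resolution and the order in which overlapping region boundaries are resolved are all assumptions made for illustration.

```python
import math

VERTICES = [(-1.0, 1.0), (-3.0, -3.0), (-4.0, -3.0)]

def r(x1, x2):
    num = 36 * x1**2 + 21 * x1 * x2 + 36 * x2**2 - 81 * x1 + 24 * x2 - 252
    return num / (-12 * x1 + 9 * x2 + 75)

def C(s1, s2):
    return 9 * s1**2 + 24 * s1 * s2 - 234 * s1 + 16 * s2**2 + 200 * s2 - 527

def conjugate(s1, s2):
    """Evaluate the piecewise expression; regions are tested in the order R1..R6."""
    def fractional():
        root = math.sqrt(3 * (-s1 - 2 * s2 + 37))
        return (336 * s1 + 672 * s2 - 12432) / root - 17 * s1 - 31 * s2 + 1181
    chord = -13 * s1 + 4 * s2 - 23
    if 3 * s1 + 4 * s2 + 1 >= 0 and 4 * s1 + 8 * s2 - 1 >= 0:             # R1
        return -s1 + s2 + 1
    if chord >= 0 and C(s1, s2) <= 0:                                     # R2
        return fractional()
    if chord <= 0 and 4 * s1 + 8 * s2 - 1 <= 0 and s1 + 2 * s2 + 11 >= 0:  # R3
        return fractional()
    if s1 + 2 * s2 + 11 <= 0 and s1 >= -3:                                # R4
        return -3 * s1 - 3 * s2 - 9
    if s1 <= -3 and 3 * s1 + 4 * s2 + 25 <= 0:                            # R5
        return -4 * s1 - 3 * s2 - 12
    if (3 * s1 + 4 * s2 + 25 >= 0 and 3 * s1 + 4 * s2 + 1 <= 0
            and chord >= 0 and C(s1, s2) >= 0):                           # R6
        return (9 * s1**2 + 24 * s1 * s2 + 16 * s2**2 - 42 * s1 + 56 * s2 + 49) / 48
    raise ValueError("point not covered by R1..R6")

def brute_force(s1, s2, n=200):
    """Maximum of s.x - r(x) over a barycentric grid on the triangle P."""
    (a1, a2), (b1, b2), (c1, c2) = VERTICES
    best = -math.inf
    for i in range(n + 1):
        for j in range(n + 1 - i):
            u, v = i / n, j / n
            w = 1.0 - u - v
            x1 = u * a1 + v * b1 + w * c1
            x2 = u * a2 + v * b2 + w * c2
            best = max(best, s1 * x1 + s2 * x2 - r(x1, x2))
    return best

samples = [(0.0, 5.0), (0.0, -2.0), (0.0, -10.0), (-10.0, -10.0), (-3.425, -0.68125)]
for s in samples:
    print(s, conjugate(*s), brute_force(*s))
```

The grid value approaches the closed-form value from below, so small negative differences are expected for points whose maximizer lies in the relative interior of an edge.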

Chapter 6

Conclusions and Future work

We provided an algorithmic design to compute the closed convex envelope of a bivariate nonconvex quadratic function defined over a polytope. Following [Loc16], we reformulated the original problem, under Assumption 3.2.1, into a subset of optimization subproblems. The solutions to these subproblems have either rational, quadratic or linear forms and are defined over $\mathbb{R}^2$ with a polyhedral subdivision. After solving the subproblems, the closed convex envelope is computed as the maximum of all such solutions, which involves solving the map overlay problem over the original polytope for all the polyhedral subdivisions of the solutions' domains.

The conjugate of the above rational function defined over a polytope $P$ has a parabolic subdivision such that over each member of the subdivision it has form (4.13)-(4.15). The subdifferentials of the rational function are observed to be contained in a parabola, and when they are computed for $\mathrm{int}(P) \subset \mathbb{R}^2$, they lie along a parabolic arc. Similarly, for a quadratic function of form (3.19) defined over a polytope, the conjugate is a piecewise defined function with either quadratic or linear form over a polyhedral subdivision of $\mathbb{R}^2$.

Unlike the computation of the closed convex envelope of a bilinear function over a polytope, the conjugate has a hybrid data structure involving both symbolic expressions and numerical values for the conjugate domain, while the expressions are stored entirely in symbolic form. This is well illustrated in Section 5.1.

We formulated an algorithmic approach to compute the conjugate of the rational functions of form (3.18). We start with computing the subdifferentials for the three entities of $P$, that is, vertices, edges and $\mathrm{int}(P)$. In the special case of a vertex $v$ where the function value is obtained by continuity, the computation of subdifferentials is done at the end as

\[ \partial f(v) = \mathbb{R}^2 \setminus \bigcup_{x \in P \setminus \{v\}} \partial f(x). \]

This region has been observed to be a union of a polyhedral and a parabolic region, and has been presented as Conjecture 4.11. If the conjecture does not hold, the conjugate would still be a fractional function defined over a parabolic subdivision of $\mathbb{R}^2$.

Table 6.1: Observed intersections for computing the maximum of all conjugates

Functions    Linear              Quadratic           Fractional
Linear       line                parabola            lines or parabola
Quadratic    parabola            two lines           lines or parabola
Fractional   lines or parabola   lines or parabola   lines or parabola

6.1 Future work

The symbolic algorithm presented in Section 5.2 relies heavily on the symbolic computations performed by the symbolic engine of the software Matlab. Symbolic computations are considerably slower than numeric ones, so a dedicated module for parabolic computations could be created. However, as with the polyhedral manipulations module, doing so will take significant developer time.

6.1.1 Maximum of all conjugates

This thesis covers the computation of the conjugate of a single rational function defined over a polytope. However, the closed convex envelope of a bivariate function is a piecewise rational function with a polyhedral subdivision, so computing the conjugate of the piecewise function requires computing the conjugate of each piece, then finding the maximum of all such conjugate expressions over each resulting parabolic subdivision. Now, since we are computing the maximum of the fractional functions of form (4.13), the points where two of them are equal may not lie on a parabolic arc (or parabola). However, in all the experiments conducted with valid conjugates of a piecewise rational function which is the closed convex envelope of a bilinear function, the intersection has been observed to be parabolic. Table 6.1 summarizes the observations.

A parabolic intersection is also expected since taking the conjugate of the conjugate gives us the closure of our original convex piecewise rational function and, by primal-dual symmetry, rational forms with polyhedral domain lead to fractional forms with parabolic domain, so the conjugate of fractional forms will lead to rational forms with a polyhedral subdivision as well.

Conjecture 6.1. For a bivariate nonconvex quadratic function $Q$ defined over a polytope $P$, let $f(x) = \mathrm{conv}\, Q_P(x)$ be a piecewise rational function with form (3.18)-(3.20), then the conjugate $f^*(s)$ has a parabolic subdivision such that over each member of its subdivision, it has one of the forms in (4.13)-(4.15).

The computation of the maximum of all conjugates has been implemented in the module max conjugate. However, since we need to solve the map overlay problem for parabolic regions, whose number can increase exponentially with each added layer, it takes a significant amount of computer time to return the resulting subdivision as the number of layers increases.

6.1.2 Conjugate of bivariate nonconvex PLQ Functions

In the case of PLQ functions, we start with computing the closed convex envelope of each piece and then compute the maximum of the conjugates of each closed convex envelope piece to obtain the conjugate of the PLQ function. Now, the conjugate is the maximum of all piecewise fractional functions with parabolic subdivisions, and since we do not know the structure of the closed convex envelope of the PLQ function, we cannot infer anything about the resulting subdivision of the conjugate. From [Loc16], we have that the closed convex envelope of any bivariate function satisfying Assumption 3.2.1 has a polyhedral subdivision; however, the form can only be implicitly defined. So if the closed convex envelope of a PLQ function has rational form, then by symmetry we would know that the conjugate would have a parabolic subdivision with fractional forms as well.

Conjecture 6.2. For a bivariate nonconvex PLQ function $g$ defined over a polytope $P$, let $f(x) = \mathrm{conv}\, g_P(x)$, then if $f$ is piecewise defined with a polyhedral subdivision of $P$ such that over each member of its subdivision it has form (3.18)-(3.20), then the conjugate $f^*(s)$ has a parabolic subdivision such that over each member of its subdivision, it has one of the forms in (4.13)-(4.15).

6.1.3 Convex envelope of bivariate nonconvex PLQ functions

Most of the PLQ functions defined over a polytope satisfy Assumption 3.2.1, so the closed convex envelope can be computed directly with the method proposed in [Loc16]. For a quadratic function $Q(x, y) = q_1 x^2 + q_2 xy + q_3 y^2 + q_4 x + q_5 y + q_6$ and a convex edge of the polytope with relation $y = m_j x + q_j$, the $\eta$ function is defined as

\[ \forall j \in E'(P), \quad \eta_j(a, b) = \begin{cases} -\dfrac{(a + m_j b - \gamma_{1j})^2}{4 \gamma_{2j}} - b q_j + \gamma_{0j} & (a, b) \in D_j \\ -a x_j(a, b) - b y_j(a, b) + x_j(a, b) y_j(a, b) & (a, b) \in \{D_j^-, D_j^+\}, \end{cases} \]

where $\gamma_{2j} = q_1 + q_2 m_j + q_3 m_j^2$, $\gamma_{1j} = q_4 + q_5 m_j + q_2 q_j + 2 m_j q_3 q_j$ and $\gamma_{0j} = q_6 + q_5 q_j + q_3 q_j^2$. Now when solving the subproblems in the quadratic-quadratic case, the relation between $a$ and $b$ is not linear, and so the one-dimensional optimization problems reduce to a quartic optimization problem. Similarly, the quadratic-linear case leads to a quartic optimization problem as well. So the solution may or may not have a rational form. However, if the solution has the same rational form as (3.18), the conjugate of the PLQ function would have a parabolic subdivision with the same fractional forms.

Conjecture 6.3. For a bivariate nonconvex PLQ function $g$ defined over a polytope $P$, if the solution of the one-dimensional optimization subproblems (3.17) has form (3.18)-(3.20), then $\mathrm{conv}\, g_P$ has the same form and is defined over a polyhedral subdivision of the original polytope $P$.

Bibliography

[AB10] Kurt M Anstreicher and Samuel Burer. Computable representations for convex hulls of low-dimensional quadratic forms. Mathematical Programming, 124(1-2):33–43, 2010. → pages 4

[AKF83] Faiz A Al-Khayyal and James E Falk. Jointly constrained biconvex programming. Mathematics of Operations Research, 8(2):273–286, 1983. → pages 4

[Ans12] Kurt M Anstreicher. On convex relaxations for quadratically constrained quadratic programming. Mathematical Programming, 136(2):233–251, 2012. → pages 4

[BGLW08] Heinz H Bauschke, Rafal Goebel, Yves Lucet, and Xianfu Wang. The proximal average: basic theory. SIAM Journal on Optimization, 19(2):766–785, 2008. → pages 1

[BLT08] Heinz H Bauschke, Yves Lucet, and Michael Trienis. How to transform one convex function continuously into another. SIAM Review, 50(1):115–132, 2008. → pages 1, 2

[Bre89] Yann Brenier. Un algorithme rapide pour le calcul de transformées de Legendre-Fenchel discretes. Comptes rendus de l’Académie des sciences. Série 1, Mathématique, 308(20):587–589, 1989. → pages 1, 4

[CCHP17] Ying Cui, Tsung-Hui Chang, Mingyi Hong, and Jong-Shi Pang. A study of piecewise linear-quadratic programs. arXiv preprint arXiv:1709.05758, 2017. → pages 1

[Cor96] Lucilla Corrias. Fast Legendre-Fenchel transform and applications to Hamilton-Jacobi equations and conservation laws. SIAM Journal on Numerical Analysis, 33(4):1534–1558, 1996. → pages 1, 4

[Cra89] Yves Crama. Recognition problems for special classes of polynomials in 0–1 variables. Mathematical Programming, 44(1-3):139–155, 1989. → pages 3

[GJL14] Bryan Gardiner, Khan Jakee, and Yves Lucet. Computing the partial conjugate of convex piecewise linear-quadratic bivariate functions. Computational Optimization and Applications, 58(1):249–272, 2014. → pages 5

[GL10] Bryan Gardiner and Yves Lucet. Convex hull algorithms for piecewise linear-quadratic functions in computational convex analysis. Set-Valued and Variational Analysis, 18(3-4):467–482, 2010. → pages 2

[GL11] Bryan Gardiner and Yves Lucet. Graph-matrix calculus for computational convex analysis. In Fixed-Point Algorithms for Inverse Problems in Science and Engineering, pages 243–259. Springer, 2011. → pages 1

[GL13] Bryan Gardiner and Yves Lucet. Computing the conjugate of convex piecewise linear-quadratic bivariate functions. Mathematical Programming, 139(1-2):161–184, 2013. → pages 5, 59

[Her16] Cristopher Hermosilla. Legendre transform and applications to finite and infinite optimization. Set-Valued and Variational Analysis, 24(4):685–705, 2016. → pages 4

[HL18] Tasnuva Haque and Yves Lucet. A linear-time algorithm to compute the conjugate of convex piecewise linear-quadratic bivariate functions. Computational Optimization and Applications, 70(2):593–613, 2018. → pages 5

[HUL93] Jean-Baptiste Hiriart-Urruty and Claude Lemaréchal. Convex analysis and minimization algorithms II: Advanced Theory and Bundle Methods. Springer Science & Business Media, 1993. → pages 5, 6

[HUL07] Jean-Baptiste Hiriart-Urruty and Yves Lucet. Parametric computation of the Legendre-Fenchel conjugate with application to the computation of the Moreau envelope. Journal of Convex Analysis, 14(3):657, 2007. → pages 1

[HUL13] Jean-Baptiste Hiriart-Urruty and Claude Lemaréchal. Convex analysis and minimization algorithms I: Fundamentals, volume 305. Springer Science & Business Media, 2013. → pages 8, 11, 14, 15, 50

[JKL11] Jennifer A Johnstone, Valentin R Koch, and Yves Lucet. Convexity of the proximal average. Journal of Optimization Theory and Applications, 148(1):107–124, 2011. → pages 1

[JMW08] Matthias Jach, Dennis Michaels, and Robert Weismantel. The convex envelope of (n–1)-convex functions. SIAM Journal on Optimization, 19(3):1451–1466, 2008. → pages 3

[KBB+09] Leonid Khachiyan, Endre Boros, Konrad Borys, Vladimir Gurvich, and Khaled Elbassioni. Generating all vertices of a polyhedron is hard. In Twentieth Anniversary Volume:, pages 1–17. Springer, 2009. → pages 48

[LBT09] Yves Lucet, Heinz H Bauschke, and Mike Trienis. The piecewise linear-quadratic model for computational convex analysis. Computational Optimization and Applications, 43(1):95–118, 2009. → pages 1, 2

[Lin86] Nathan Linial. Hard enumeration problems in geometry and combinatorics. SIAM Journal on Algebraic Discrete Methods, 7(2):331–335, 1986. → pages 48

[Lin05] Jeff Linderoth. A simplicial branch-and-bound algorithm for solving quadratically constrained quadratic programs. Mathematical Programming, 103(2):251–282, 2005. → pages 4

[LLT17] Florian Lauster, D Russell Luke, and Matthew K Tam. Symbolic computation with monotone operators. Set-Valued and Variational Analysis, pages 1–16, 2017. → pages 1

[Loc14] Marco Locatelli. A technique to derive the analytical form of convex envelopes for some bivariate functions. Journal of Global Optimization, 59(2-3):477–501, 2014. → pages 4

[Loc16] Marco Locatelli. Polyhedral subdivisions and functional forms for the convex envelopes of bilinear, fractional and other bivariate functions over general polytopes. Journal of Global Optimization, 66(4):629–668, 2016. → pages v, vi, 4, 7, 13, 14, 16, 17, 18, 19, 20, 21, 25, 26, 27, 28, 29, 96, 98

[LS14] Marco Locatelli and Fabio Schoen. On convex envelopes for bivariate functions over polytopes. Mathematical Programming, 144(1-2):65–91, 2014. → pages 18

[Luc96] Yves Lucet. A fast computational algorithm for the Legendre-Fenchel transform. Computational Optimization and Applications, 6(1):27–57, 1996. → pages 1, 4

[Luc97] Yves Lucet. Faster than the fast Legendre transform, the linear-time Legendre transform. Numerical Algorithms, 16(2):171–185, 1997. → pages 1, 4

[Luc06] Yves Lucet. Fast Moreau envelope computation i: Numerical algorithms. Numerical Algorithms, 43(3):235–249, 2006. → pages 1

[Luc10] Yves Lucet. What shape is your conjugate? a survey of computational convex analysis and its applications. SIAM Review, 52(3):505–542, 2010. → pages 1, 2

[McC76] Garth P McCormick. Computability of global solutions to factorable nonconvex programs: Part i—convex underestimating problems. Mathematical Programming, 10(1):147–175, 1976. → pages 4

[Mor65] Jean-Jacques Moreau. Proximité et dualité dans un espace hilbertien. Bull. Soc. Math. France, 93(2):273–299, 1965. → pages 1

[NV94] A Noullez and M Vergassola. A fast Legendre transform algorithm and applications to the adhesion model. Journal of Scientific Computing, 9(3):259–281, 1994. → pages 1

[RW98] R Tyrrell Rockafellar and Roger J-B Wets. Variational Analysis, volume 317. Springer Science & Business Media, 1998. → pages 1

[SA90] Hanif D Sherali and Amine Alameddine. An explicit characterization of the convex envelope of a bivariate bilinear function over special polytopes. Annals of Operations Research, 25(1):197–209, 1990. → pages 4

[SAF92] Zhen-Su She, Erik Aurell, and Uriel Frisch. The inviscid Burgers equation with initial data of brownian type. Communications in Mathematical Physics, 148(3):623–641, 1992. → pages 1

[Tar04] Fabio Tardella. On the existence of polyhedral convex envelopes. In Frontiers in Global Optimization, pages 563–573. Springer, 2004. → pages 3

[Tar08] Fabio Tardella. Existence and sum decomposition of vertex polyhedral convex envelopes. Optimization Letters, 2(3):363–375, 2008. → pages 3

Appendix

Appendix A

The η(a, b) expressions for the bilinear functions

Let $f(x, y) = xy$ be defined over a polytope $P$. The restriction of $f$ along the edge between vertices $v_{j_1}$ and $v_{j_2}$, defined by the points satisfying $y' = m_j x' + q_j$, is

\[ f_j(x') = x' \cdot (m_j x' + q_j) = m_j x'^2 + q_j x', \]

and its Hessian is

\[ \nabla^2 f_j(x') = 2 m_j. \]

So $f_j$ can only be strictly convex when $m_j > 0$. Now the minimization problem related to this edge can be written as

\[ \min_{x' \in [x_{v_{j_1}}, x_{v_{j_2}}]} f_j(x') - [a (x' - x) + b (y' - y)] \ge c, \]

\[ m_j x'^2 + q_j x' - (a + b m_j) x' - b q_j + a x + b y \ge c. \tag{A.1} \]

Let $C(x') = m_j x'^2 + q_j x' - (a + b m_j) x' + a x + b y - b q_j$. Then the optimal solution of the above unconstrained problem, obtained by computing the critical points, satisfies

\[ 2 m_j x' + q_j - (a + b m_j) = 0, \qquad x' = \frac{a + b m_j - q_j}{2 m_j}. \]

So the optimal solution of (A.1) is

\[ x_j(a, b) = \begin{cases} x_{v_{j_1}} & (a, b) \in D_j^- \\ x_{v_{j_2}} & (a, b) \in D_j^+ \\ \dfrac{a + b m_j - q_j}{2 m_j} & (a, b) \in D_j \end{cases} \]

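where the three regions (defined in (3.11)) are

\[ \begin{aligned} D_j^- &= \{(a, b) : a + b m_j \le 2 m_j x_{v_{j_1}} + q_j\}, \\ D_j^+ &= \{(a, b) : a + b m_j \ge 2 m_j x_{v_{j_2}} + q_j\}, \\ D_j &= \{(a, b) : 2 m_j x_{v_{j_1}} + q_j \le a + b m_j \le 2 m_j x_{v_{j_2}} + q_j\}. \end{aligned} \]

Also, $y_j(a, b) = m_j x_j(a, b) + q_j$. Now from (3.13), $\forall k \in V'(P) \cup E'(P)$ (see (3.8)),

\[ \eta_k(a, b) = f(x_k(a, b), y_k(a, b)) - a x_k(a, b) - b y_k(a, b). \]

So for $f(x, y) = xy$ and $j \in E'(P)$,

\[ \eta_j(a, b) = x_j(a, b) y_j(a, b) - a x_j(a, b) - b y_j(a, b). \]

Now when $(a, b) \in D_j$,

\[ \begin{aligned} \eta_j(a, b) &= m_j \left( \frac{a + b m_j - q_j}{2 m_j} \right)^2 + q_j \frac{a + b m_j - q_j}{2 m_j} - a \frac{a + b m_j - q_j}{2 m_j} - b \left( m_j \frac{a + b m_j - q_j}{2 m_j} + q_j \right) \\ &= \left( \frac{a + b m_j - q_j}{2 m_j} \right) \cdot \left( \frac{a + b m_j - q_j}{2} + q_j - a - b m_j \right) - b q_j \\ &= \left( \frac{a + b m_j - q_j}{4 m_j} \right) \cdot (a + b m_j - q_j + 2 q_j - 2a - 2 b m_j) - b q_j \\ &= \left( \frac{a + b m_j - q_j}{4 m_j} \right) \cdot (-a - b m_j + q_j) - b q_j \\ &= -\frac{(a + b m_j - q_j)^2}{4 m_j} - b q_j. \end{aligned} \]

So $\forall j \in E'(P)$,

\[ \eta_j(a, b) = \begin{cases} -a x_{v_{j_1}} - b y_{v_{j_1}} + x_{v_{j_1}} y_{v_{j_1}} & (a, b) \in D_j^- \\ -\dfrac{(a + m_j b - q_j)^2}{4 m_j} - b q_j & (a, b) \in D_j \\ -a x_{v_{j_2}} - b y_{v_{j_2}} + x_{v_{j_2}} y_{v_{j_2}} & (a, b) \in D_j^+. \end{cases} \]

Similarly, $\forall i \in V'(P)$,

\[ \eta_i(a, b) = x_{v_i} y_{v_i} - a x_{v_i} - b y_{v_i}. \]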
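The edge expression derived above amounts to clamping the unconstrained critical point to the edge and evaluating xy − ax − by there. The following Python sketch illustrates this for a single edge y = m x + q with m > 0; the function names and the grid-based cross-check are assumptions made for illustration and are not taken from the thesis code.

```python
def x_opt(a, b, m, q, x_lo, x_hi):
    """Minimizer of m*x^2 + q*x - (a + b*m)*x over [x_lo, x_hi], for m > 0."""
    x_star = (a + b * m - q) / (2.0 * m)
    return min(max(x_star, x_lo), x_hi)

def eta_edge(a, b, m, q, x_lo, x_hi):
    """eta_j(a, b) for f(x, y) = x*y restricted to the edge y = m*x + q."""
    x = x_opt(a, b, m, q, x_lo, x_hi)
    y = m * x + q
    return x * y - a * x - b * y

def eta_closed_form(a, b, m, q):
    """Closed form valid when the critical point lies inside the edge (region D_j)."""
    return -((a + b * m - q) ** 2) / (4.0 * m) - b * q

def eta_brute(a, b, m, q, x_lo, x_hi, n=10000):
    """Grid minimum of x*y - a*x - b*y along the edge, used only as a cross-check."""
    best = float("inf")
    for i in range(n + 1):
        x = x_lo + (x_hi - x_lo) * i / n
        y = m * x + q
        best = min(best, x * y - a * x - b * y)
    return best

# Critical point inside the edge: x* = 0.75 in [0, 1].
print(eta_edge(3.0, 0.5, 2.0, 1.0, 0.0, 1.0), eta_closed_form(3.0, 0.5, 2.0, 1.0))
# Critical point beyond the edge: x* = 2.25 is clamped to the vertex x = 1.
print(eta_edge(10.0, 0.0, 2.0, 1.0, 0.0, 1.0))
```

Clamping reproduces the three regions: the lower vertex for $D_j^-$, the upper vertex for $D_j^+$ and the interior critical point for $D_j$.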

Appendix B

Solving the subproblems

For $h, w \in E'(P)$, over a subregion $S_r$, the subproblem is

\[ \begin{aligned} \max_{a, b} \quad & \eta_h(a, b) + a x + b y \\ \text{s.t.} \quad & \eta_h(a, b) = \eta_w(a, b), \\ & \eta_h(a, b) \le \eta_k(a, b), \quad k \in E'(P) \cup V'(P) \setminus \{h, w\}, \\ & (a, b) \in S_r. \end{aligned} \]

Since for the bilinear term $\eta$ is either quadratic or linear, solving the subproblems leads to three cases: quadratic-quadratic, quadratic-linear and linear-linear, as discussed in Section 3.2.3.

B.1 Case 1: Quadratic-Quadratic

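Let the respective convex edges for $\eta_h$ and $\eta_w$ be characterized by $m_h > 0$, $q_h$ and $m_w > 0$, $q_w$. So

\[ \eta_h(a, b) = -\frac{(a + b m_h - q_h)^2}{4 m_h} - b q_h, \quad \text{and} \quad \eta_w(a, b) = -\frac{(a + b m_w - q_w)^2}{4 m_w} - b q_w. \]

Now, in this case, the subproblem is

\[ \begin{aligned} \max_{a, b} \quad & -\frac{(a + b m_h - q_h)^2}{4 m_h} - b q_h + a x + b y \\ \text{s.t.} \quad & -\frac{(a + b m_h - q_h)^2}{4 m_h} - b q_h = -\frac{(a + b m_w - q_w)^2}{4 m_w} - b q_w, \\ & -\frac{(a + b m_h - q_h)^2}{4 m_h} - b q_h \le \eta_k(a, b), \quad k \in E'(P) \cup V'(P) \setminus \{h, w\}, \\ & l_h \le a + b m_h \le u_h, \; l_w \le a + b m_w \le u_w, \; \cdots \end{aligned} \]

where $l_h \le a + b m_h \le u_h$, $l_w \le a + b m_w \le u_w$, $\cdots$ are the active constraints for the subregion $S_r$.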


B.1.1 Solving the equality constraint

We start with solving the equality constraint

\[ -\frac{(a + b m_h - q_h)^2}{4 m_h} - b q_h = -\frac{(a + b m_w - q_w)^2}{4 m_w} - b q_w. \]

By rearranging the terms, we obtain

\[ (m_h - m_w) a^2 - 2 (m_h q_w - m_w q_h) a + m_h (m_w b + q_w)^2 - m_w (m_h b + q_h)^2 = 0. \]

When $m_h = m_w$,

\[ \begin{aligned} & -2 m_h (q_w - q_h) a + m_h \left( (m_h b)^2 + 2 m_h q_w b + q_w^2 \right) - m_h \left( (m_h b)^2 + 2 m_h q_h b + q_h^2 \right) = 0, \\ \text{or} \quad & -2 m_h (q_w - q_h) a + 2 m_h^2 q_w b - 2 m_h^2 q_h b + m_h (q_w^2 - q_h^2) = 0, \\ \text{or} \quad & -2 (q_w - q_h) a + 2 m_h (q_w - q_h) b + q_w^2 - q_h^2 = 0, \\ \text{or} \quad & -2 (q_w - q_h) a + 2 m_h (q_w - q_h) b + (q_w + q_h) \cdot (q_w - q_h) = 0, \\ \text{or} \quad & -2 a + 2 m_h b + (q_h + q_w) = 0, \\ \text{or} \quad & a = m_h b + \frac{q_h + q_w}{2}, \end{aligned} \]

and when $m_h \ne m_w$,

\[ a = \frac{m_h q_w - q_h m_w \pm \sqrt{m_h m_w} \left[ (m_h - m_w) b + q_h - q_w \right]}{m_h - m_w}. \]
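The two closed-form relations between a and b can be verified by substituting them back into the equality constraint. The Python sketch below does this for arbitrary sample slopes and intercepts; the function names and the sample values are assumptions chosen only for illustration.

```python
import math

def eta_quad(a, b, m, q):
    """Quadratic piece of eta for a convex edge y = m*x + q of the bilinear function."""
    return -((a + b * m - q) ** 2) / (4.0 * m) - b * q

def a_from_b(b, mh, qh, mw, qw, sign=1):
    """Value of a solving eta_h(a, b) = eta_w(a, b) for the given b."""
    if mh == mw:
        # Parallel edges (with qh != qw): the relation is linear in b.
        return mh * b + (qh + qw) / 2.0
    root = math.sqrt(mh * mw) * ((mh - mw) * b + qh - qw)
    return (mh * qw - qh * mw + sign * root) / (mh - mw)

mh, qh, mw, qw, b = 2.0, 1.0, 0.5, -1.0, 0.7
residuals = []
for sign in (1, -1):
    a = a_from_b(b, mh, qh, mw, qw, sign)
    residuals.append(eta_quad(a, b, mh, qh) - eta_quad(a, b, mw, qw))

a_par = a_from_b(b, 2.0, 1.0, 2.0, 3.0)
residuals.append(eta_quad(a_par, b, 2.0, 1.0) - eta_quad(a_par, b, 2.0, 3.0))
print(residuals)
```

Both branches of the ± sign satisfy the equality, so the choice between them is made later by the bound and inequality constraints.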

Now using the above relation(s), we reduce the subregion $\{(a, b) : l_h \le a + b m_h \le u_h, \; l_w \le a + b m_w \le u_w, \cdots\}$ to simple bound constraints as

\[ l \le b \le u. \]

If the intersection of all such one-dimensional bounds for all the active constraints is an empty set, the subproblem is infeasible and we proceed to solving the next pair.

B.1.2 Solving the inequalities

When the one-dimensional subregion is feasible, using the relations, we solve the inequalities for all the remaining $\eta_k$, $\forall k \in E'(P) \cup V'(P) \setminus \{h, w\}$, which leads to two cases: quadratic-quadratic and quadratic-linear. For simplicity, let the relation be denoted by $a = \alpha_1 b + \alpha_0$.

Case 1.1 : Quadratic-Quadratic inequality

When $\eta_k$ is quadratic, the inequality constraint is

\[ -\frac{(a + b m_h - q_h)^2}{4 m_h} - b q_h \le -\frac{(a + b m_k - q_k)^2}{4 m_k} - b q_k, \]

or

\[ -\frac{((\alpha_1 + m_h) b + \alpha_0 - q_h)^2}{4 m_h} - b q_h \le -\frac{((\alpha_1 + m_k) b + \alpha_0 - q_k)^2}{4 m_k} - b q_k. \]

Upon simplification, it reduces to the form

\[ \beta_2 b^2 + \beta_1 b + \beta_0 \le 0 \]

where

\[ \begin{aligned} \beta_2 &= m_h (\alpha_1 + m_k)^2 - m_k (\alpha_1 + m_h)^2, \\ \beta_1 &= 2 m_h (\alpha_1 + m_k)(\alpha_0 - q_k) - 2 m_k (\alpha_1 + m_h)(\alpha_0 - q_h) + 4 m_h m_k (q_k - q_h), \\ \beta_0 &= m_h (\alpha_0 - q_k)^2 - m_k (\alpha_0 - q_h)^2. \end{aligned} \]

When $\beta_2 = 0$, the inequality is linear and its solution is a single (possibly unbounded) interval; otherwise, the solution is an interval or a union of intervals.
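The solution set of such a scalar inequality can be computed case by case from the sign of the leading coefficient and the discriminant. The following Python sketch returns the solution as a list of closed intervals; the function name, the tolerance and the use of infinite endpoints are assumptions for illustration rather than the data structure used in the thesis.

```python
import math

def solve_quadratic_inequality(b2, b1, b0, tol=1e-12):
    """Intervals (lo, hi) on which b2*b^2 + b1*b + b0 <= 0 holds."""
    inf = math.inf
    if abs(b2) <= tol:
        if abs(b1) <= tol:
            # Constant inequality: either every b or no b satisfies it.
            return [(-inf, inf)] if b0 <= 0 else []
        root = -b0 / b1
        return [(-inf, root)] if b1 > 0 else [(root, inf)]
    disc = b1 * b1 - 4.0 * b2 * b0
    if disc < 0:
        # No real roots: the sign of the quadratic is the sign of b2 everywhere.
        return [] if b2 > 0 else [(-inf, inf)]
    r1 = (-b1 - math.sqrt(disc)) / (2.0 * b2)
    r2 = (-b1 + math.sqrt(disc)) / (2.0 * b2)
    lo, hi = min(r1, r2), max(r1, r2)
    if b2 > 0:
        return [(lo, hi)]            # convex parabola: between the roots
    return [(-inf, lo), (hi, inf)]   # concave parabola: outside the roots

print(solve_quadratic_inequality(1.0, 0.0, -1.0))
print(solve_quadratic_inequality(-1.0, 0.0, 1.0))
print(solve_quadratic_inequality(0.0, 2.0, -4.0))
```

Intersecting the returned intervals with the current bounds on b gives the feasible set used in the next step.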

Case 1.2 : Quadratic-Linear inequality

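When $\eta_k$ is linear, the inequality constraint is

\[ -\frac{(a + b m_h - q_h)^2}{4 m_h} - b q_h \le -x_k a - y_k b + x_k y_k, \]

or

\[ -\frac{((\alpha_1 + m_h) b + \alpha_0 - q_h)^2}{4 m_h} - b q_h \le -x_k (\alpha_1 b + \alpha_0) - y_k b + x_k y_k. \]

Upon simplification, it reduces to

\[ \beta_2 b^2 + \beta_1 b + \beta_0 \le 0 \]

with

\[ \begin{aligned} \beta_2 &= -\frac{(\alpha_1 + m_h)^2}{4 m_h}, \\ \beta_1 &= -\frac{(\alpha_1 + m_h)(\alpha_0 - q_h)}{2 m_h} + y_k + x_k \alpha_1 - q_h, \\ \beta_0 &= -\frac{(\alpha_0 - q_h)^2}{4 m_h} + x_k \alpha_0 - x_k y_k. \end{aligned} \]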

Similar to Case 1.1 in Section B.1.2, when $\beta_2 = 0$, the inequality reduces to an interval, otherwise, to a union of intervals. We solve the inequalities for all $\eta_k$ and, at the same time, verify that the solution, intersected with the one-dimensional subregion, is feasible. After solving all the inequalities, the bounds on $b$ reduce to the form

dl ≤ b ≤ du.

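B.1.3 Solutions

Next, using the relation, we reduce the subproblem to one dimension as

\[ \max \; -\frac{((\alpha_1 + m_h) b + \alpha_0 - q_h)^2}{4 m_h} - b q_h + (\alpha_1 b + \alpha_0) x + b y \quad \text{s.t.} \quad d_l \le b \le d_u. \]

By simplifying the terms, it reduces to

\[ \max \; -\xi_2 b^2 + 2 b \xi_1(x, y) + \xi_0(x, y) \quad \text{s.t.} \quad d_l \le b \le d_u, \tag{B.1} \]

where

\[ \begin{aligned} \xi_2 &= \frac{(\alpha_1 + m_h)^2}{4 m_h} \ge 0, \\ \xi_1(x, y) &= \frac{1}{2} \left( -\frac{(\alpha_1 + m_h)(\alpha_0 - q_h)}{2 m_h} + y - q_h + \alpha_1 x \right), \\ \xi_0(x, y) &= -\frac{(\alpha_0 - q_h)^2}{4 m_h} + \alpha_0 x, \end{aligned} \]

as $m_h > 0$. Since (B.1) is concave and differentiable in $b$, the optimal solution of the above unconstrained problem satisfies

\[ \frac{d}{db} \left( -\xi_2 b^2 + 2 b \xi_1(x, y) + \xi_0(x, y) \right) = 0, \quad \text{or} \quad -2 \xi_2 b + 2 \xi_1(x, y) = 0, \quad \text{or} \quad b = \frac{\xi_1(x, y)}{\xi_2}. \]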


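Consequently, its optimal value is
\begin{align*}
-\xi_2 \left( \frac{\xi_1(x, y)}{\xi_2} \right)^2 + 2 \frac{\xi_1(x, y)}{\xi_2} \xi_1(x, y) + \xi_0(x, y)
&= -\frac{\xi_1(x, y)^2}{\xi_2} + 2 \frac{\xi_1(x, y)^2}{\xi_2} + \xi_0(x, y) \\
&= \frac{\xi_1(x, y)^2}{\xi_2} + \xi_0(x, y).
\end{align*}
So the optimal solution of the subproblem (B.1) is
\[
\begin{cases}
d_l & \text{when } \dfrac{\xi_1(x, y)}{\xi_2} \le d_l, \\[1.5ex]
\dfrac{\xi_1(x, y)}{\xi_2} & \text{when } d_l < \dfrac{\xi_1(x, y)}{\xi_2} < d_u, \\[1.5ex]
d_u & \text{otherwise,}
\end{cases}
\]
and hence its optimal value is
\[
\begin{cases}
-\xi_2 d_l^2 + 2 d_l \xi_1(x, y) + \xi_0(x, y) & \text{if } d_l \xi_2 \ge \xi_1(x, y), \\[1ex]
\dfrac{\xi_1(x, y)^2}{\xi_2} + \xi_0(x, y) & \text{if } d_l \xi_2 < \xi_1(x, y) < d_u \xi_2, \\[1.5ex]
-\xi_2 d_u^2 + 2 d_u \xi_1(x, y) + \xi_0(x, y) & \text{otherwise.}
\end{cases}
\]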
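The three-branch optimal value of (B.1) can be written down directly. The sketch below is a minimal illustration, assuming that $\xi_2 > 0$ and that the values of $\xi_1(x, y)$ and $\xi_0(x, y)$ have already been evaluated at the point of interest; the function name `optimal_value_b1` is hypothetical.

```python
def optimal_value_b1(xi2, xi1, xi0, dl, du):
    """Optimal value of max -xi2*b**2 + 2*xi1*b + xi0 over dl <= b <= du,
    written branch by branch; assumes xi2 > 0 (hypothetical helper)."""
    if xi2 <= 0:
        raise ValueError("this sketch assumes xi2 > 0")
    if dl * xi2 >= xi1:                        # stationary point at or left of dl
        return -xi2 * dl**2 + 2.0 * dl * xi1 + xi0
    if xi1 < du * xi2:                         # stationary point inside (dl, du)
        return xi1**2 / xi2 + xi0
    return -xi2 * du**2 + 2.0 * du * xi1 + xi0  # stationary point at or right of du
```

For instance, with $\xi_2 = 1$, $\xi_1 = 2$, $\xi_0 = 0$ the unconstrained maximizer is $b = 2$, so the middle branch applies on $[0, 5]$ and the upper bound is active on $[0, 1]$.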

B.2 Case 2: Quadratic-Linear

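Let the respective convex edge for $\eta_h$ be characterized by $m_h > 0$ and $q_h$, and the vertex for $\eta_w$ by $(x_w, y_w)$. Then
\[
\eta_h = -\frac{(a + b m_h - q_h)^2}{4 m_h} - b q_h \quad \text{and} \quad \eta_w = -x_w a - y_w b + x_w y_w.
\]
So the subproblem is
\[
\begin{array}{ll}
\max\limits_{a, b} & -\dfrac{(a + b m_h - q_h)^2}{4 m_h} - b q_h + a x + b y \\[1.5ex]
\text{subject to} & -\dfrac{(a + b m_h - q_h)^2}{4 m_h} - b q_h = -x_w a - y_w b + x_w y_w, \\[1.5ex]
& -\dfrac{(a + b m_h - q_h)^2}{4 m_h} - b q_h \le \eta_k(a, b), \quad k \in E'(P) \cup V'(P) \setminus \{h, w\}, \\[1.5ex]
& l_h \le a + b m_h \le u_h, \; \cdots
\end{array}
\]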


B.2.1 Solving the equality constraint

Since the equality constraint is
\[
-\frac{(a + b m_h - q_h)^2}{4 m_h} - b q_h = -x_w a - y_w b + x_w y_w,
\]
let $z_h = a + m_h b$, then $a = z_h - m_h b$. So
\[
-\frac{(z_h - q_h)^2}{4 m_h} - b q_h = -x_w (z_h - m_h b) - y_w b + x_w y_w,
\]
which upon rearranging results in
\[
b = \frac{z_h^2 - (4 m_h x_w + 2 q_h) z_h + 4 m_h x_w y_w + q_h^2}{4 m_h (y_w - m_h x_w - q_h)}.
\]
For simplicity, let $b = \beta_2 z_h^2 + \beta_1 z_h + \beta_0$. So the equality constraint results in a quadratic relation for $a$ and $b$ in $z_h$. Now using the above relations, we reduce the subregion $\{(a, b) : l_h \le a + b m_h \le u_h, \cdots\}$ to one-dimensional intervals as
\[
l \le z_h \le u.
\]
Similar to Case 1 (quadratic-quadratic), when the intersection results in an empty set, the subproblem is declared infeasible and we proceed to the next pair.

B.2.2 Solving the inequalities

Solving the inequalities for all the remaining $\eta_k$ defined on the subregion, similar to Case 1 (see Section B.1.2), leads to two cases: quadratic-quadratic and quadratic-linear.

Case 2.1 : Quadratic-Quadratic inequality

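When $\eta_k$ is quadratic, the inequality constraint is
\[
-\frac{(z_h - q_h)^2}{4 m_h} - (\beta_2 z_h^2 + \beta_1 z_h + \beta_0) q_h \le -\frac{(a + b m_k - q_k)^2}{4 m_k} - b q_k,
\]
or
\[
-\frac{(z_h - q_h)^2}{4 m_h} - (\beta_2 z_h^2 + \beta_1 z_h + \beta_0) q_h \le -\frac{\left( z_h + (m_k - m_h)(\beta_2 z_h^2 + \beta_1 z_h + \beta_0) - q_k \right)^2}{4 m_k} - (\beta_2 z_h^2 + \beta_1 z_h + \beta_0) q_k.
\]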


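Upon simplification, it reduces to a quartic inequality as
\[
\zeta_4 z_h^4 + \zeta_3 z_h^3 + \zeta_2 z_h^2 + \zeta_1 z_h + \zeta_0 \le 0,
\]
where
\begin{align*}
\zeta_4 &= \frac{t_2^2}{4 m_k}, \\
\zeta_3 &= \frac{2 t_2 t_1}{4 m_k}, \\
\zeta_2 &= \frac{t_1^2 + 2 t_2 t_0}{4 m_k} + \beta_2 (q_k - q_h) - \frac{1}{4 m_h}, \\
\zeta_1 &= \frac{2 t_1 t_0}{4 m_k} + \beta_1 (q_k - q_h) + \frac{2 q_h}{4 m_h}, \\
\zeta_0 &= \frac{t_0^2}{4 m_k} + \beta_0 (q_k - q_h) - \frac{q_h^2}{4 m_h},
\end{align*}
and $t_2 = \beta_2 (m_k - m_h)$, $t_1 = \beta_1 (m_k - m_h) + 1$, $t_0 = \beta_0 (m_k - m_h) - q_k$. When $m_h = m_k$, the inequality reduces to a quadratic one. The solution, in this case, is a union of intervals for $z_h$.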
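As a sanity check on the coefficients $\zeta_4, \dots, \zeta_0$, one can compare the quartic with a direct evaluation of $\eta_h - \eta_k$ along the curve $b = \beta_2 z_h^2 + \beta_1 z_h + \beta_0$, $a = z_h - m_h b$. The sketch below does this for arbitrary numerical data; the function names and the convention of passing `beta` as the tuple $(\beta_0, \beta_1, \beta_2)$ are assumptions made for illustration.

```python
def quartic_coefficients(beta, m_h, q_h, m_k, q_k):
    """Coefficients (zeta4, zeta3, zeta2, zeta1, zeta0) of the quartic in z_h.
    beta = (beta0, beta1, beta2), indexed by the power of z_h (assumption)."""
    b0, b1, b2 = beta
    t2 = b2 * (m_k - m_h)
    t1 = b1 * (m_k - m_h) + 1.0
    t0 = b0 * (m_k - m_h) - q_k
    z4 = t2 * t2 / (4 * m_k)
    z3 = 2 * t2 * t1 / (4 * m_k)
    z2 = (t1 * t1 + 2 * t2 * t0) / (4 * m_k) + b2 * (q_k - q_h) - 1 / (4 * m_h)
    z1 = 2 * t1 * t0 / (4 * m_k) + b1 * (q_k - q_h) + 2 * q_h / (4 * m_h)
    z0 = t0 * t0 / (4 * m_k) + b0 * (q_k - q_h) - q_h * q_h / (4 * m_h)
    return z4, z3, z2, z1, z0

def constraint_gap(z, beta, m_h, q_h, m_k, q_k):
    """eta_h - eta_k evaluated directly on the equality curve at z_h = z."""
    b = beta[2] * z * z + beta[1] * z + beta[0]
    a = z - m_h * b
    eta_h = -(a + b * m_h - q_h) ** 2 / (4 * m_h) - b * q_h
    eta_k = -(a + b * m_k - q_k) ** 2 / (4 * m_k) - b * q_k
    return eta_h - eta_k
```

Evaluating the quartic by Horner's rule at a few sample points and comparing with `constraint_gap` should agree up to rounding; the inequality $\eta_h \le \eta_k$ holds exactly where the quartic is nonpositive.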

Case 2.2 : Quadratic-Linear inequality

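When $\eta_k$ is linear, the inequality constraint is
\[
-\frac{(z_h - q_h)^2}{4 m_h} - (\beta_2 z_h^2 + \beta_1 z_h + \beta_0) q_h \le -x_k z_h + (x_k m_h - y_k)(\beta_2 z_h^2 + \beta_1 z_h + \beta_0) + x_k y_k.
\]
By rearranging the terms, we get
\[
\zeta_2 z_h^2 + \zeta_1 z_h + \zeta_0 \le 0,
\]
where
\begin{align*}
\zeta_2 &= \beta_2 (y_k - m_h x_k - q_h) - \frac{1}{4 m_h}, \\
\zeta_1 &= \beta_1 (y_k - m_h x_k - q_h) + x_k + \frac{q_h}{2 m_h}, \\
\zeta_0 &= \beta_0 (y_k - m_h x_k - q_h) - x_k y_k - \frac{q_h^2}{4 m_h}.
\end{align*}
The solution of the above quadratic inequality is again a union of intervals for $z_h$.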


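B.2.3 Solutions

By substituting the expressions for $a$ and $b$, the subproblem
\[
\begin{array}{ll}
\max & -\dfrac{(a + b m_h - q_h)^2}{4 m_h} - b q_h + a x + b y \\[1.5ex]
\text{subject to} & d_l \le a + b m_h \le d_u
\end{array}
\]
becomes
\[
\begin{array}{ll}
\max & -\dfrac{(z_h - q_h)^2}{4 m_h} - (\beta_2 z_h^2 + \beta_1 z_h + \beta_0) q_h + z_h x + (y - x m_h)(\beta_2 z_h^2 + \beta_1 z_h + \beta_0) \\[1.5ex]
\text{subject to} & d_l \le z_h \le d_u.
\end{array}
\]
By rearranging the terms, it reduces to the form
\[
\begin{array}{ll}
\max & -\xi_2(x, y) z_h^2 + 2 z_h \xi_1(x, y) + \xi_0(x, y) \\[1ex]
\text{subject to} & d_l \le z_h \le d_u,
\end{array}
\tag{B.2}
\]
where
\begin{align*}
\xi_2(x, y) &= \beta_2 m_h x - \beta_2 y + \beta_2 q_h + \frac{1}{4 m_h}, \\
\xi_1(x, y) &= \frac{1}{2}\left[ (1 - \beta_1 m_h) x + \beta_1 y - \beta_1 q_h + \frac{q_h}{2 m_h} \right], \\
\xi_0(x, y) &= -\beta_0 m_h x + \beta_0 y - \beta_0 q_h - \frac{q_h^2}{4 m_h}.
\end{align*}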

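Now when $\xi_2(x, y) > 0$, the optimal solution of the unconstrained problem of (B.2) satisfies
\[
\frac{d}{d z_h}\left( -\xi_2(x, y) z_h^2 + 2 z_h \xi_1(x, y) + \xi_0(x, y) \right) = 0,
\]
or $-2 \xi_2(x, y) z_h + 2 \xi_1(x, y) = 0$, or
\[
z_h = \frac{\xi_1(x, y)}{\xi_2(x, y)},
\]
and consequently, its optimal value is
\begin{align*}
-\xi_2(x, y) \left( \frac{\xi_1(x, y)}{\xi_2(x, y)} \right)^2 + 2 \frac{\xi_1(x, y)}{\xi_2(x, y)} \xi_1(x, y) + \xi_0(x, y)
&= -\frac{\xi_1(x, y)^2}{\xi_2(x, y)} + 2 \frac{\xi_1(x, y)^2}{\xi_2(x, y)} + \xi_0(x, y) \\
&= \frac{\xi_1(x, y)^2}{\xi_2(x, y)} + \xi_0(x, y).
\end{align*}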


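So the optimal solution of (B.2), when $\xi_2(x, y) > 0$, is
\[
\begin{cases}
d_l & \text{when } \dfrac{\xi_1(x, y)}{\xi_2(x, y)} \le d_l, \\[1.5ex]
\dfrac{\xi_1(x, y)}{\xi_2(x, y)} & \text{when } d_l < \dfrac{\xi_1(x, y)}{\xi_2(x, y)} < d_u, \\[1.5ex]
d_u & \text{otherwise,}
\end{cases}
\]
and its optimal value is
\[
\begin{cases}
-\xi_2(x, y) d_l^2 + 2 d_l \xi_1(x, y) + \xi_0(x, y) & \text{if } d_l \xi_2(x, y) \ge \xi_1(x, y), \\[1ex]
\dfrac{\xi_1(x, y)^2}{\xi_2(x, y)} + \xi_0(x, y) & \text{if } d_l \xi_2(x, y) < \xi_1(x, y) < d_u \xi_2(x, y), \\[1.5ex]
-\xi_2(x, y) d_u^2 + 2 d_u \xi_1(x, y) + \xi_0(x, y) & \text{otherwise,}
\end{cases}
\]
and when $\xi_2(x, y) \le 0$, it is
\[
\max\left\{ -\xi_2(x, y) d_l^2 + 2 d_l \xi_1(x, y) + \xi_0(x, y), \; -\xi_2(x, y) d_u^2 + 2 d_u \xi_1(x, y) + \xi_0(x, y) \right\}.
\]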
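In contrast to (B.1), the sign of $\xi_2(x, y)$ in (B.2) depends on the point $(x, y)$, so a numerical evaluation has to branch on it. A minimal sketch, assuming the three coefficients have already been evaluated at a given $(x, y)$ and using the hypothetical name `maximize_quadratic_on_interval`, could look as follows.

```python
def maximize_quadratic_on_interval(xi2, xi1, xi0, dl, du):
    """Maximize g(z) = -xi2*z**2 + 2*xi1*z + xi0 over dl <= z <= du.
    Returns (maximizer, optimal value). Hypothetical helper."""
    def g(z):
        return -xi2 * z * z + 2.0 * xi1 * z + xi0
    if xi2 > 0:
        # concave objective: clamp the stationary point xi1/xi2 to [dl, du]
        z_star = min(max(xi1 / xi2, dl), du)
    else:
        # convex or linear objective: the maximum is attained at an endpoint
        z_star = dl if g(dl) >= g(du) else du
    return z_star, g(z_star)
```

Clamping the stationary point is equivalent to the three-branch formula for $\xi_2(x, y) > 0$, since a concave function on an interval is maximized at the projection of its unconstrained maximizer.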

B.3 Case 3: Linear-Linear

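Let the two vertices corresponding to $\eta_h$ and $\eta_w$ be defined as $(x_h, y_h)$ and $(x_w, y_w)$. Then the subproblem in this case is defined as
\[
\begin{array}{ll}
\max\limits_{a, b} & -x_h a - y_h b + x_h y_h + a x + b y \\[1ex]
\text{subject to} & -x_h a - y_h b + x_h y_h = -x_w a - y_w b + x_w y_w, \\[1ex]
& -x_h a - y_h b + x_h y_h \le \eta_k(a, b), \quad k \in E'(P) \cup V'(P) \setminus \{h, w\}, \\[1ex]
& l_k \le a + b m_k \le u_k, \; \cdots
\end{array}
\]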

B.3.1 Solving the equality constraint

We start by solving the equality constraint. Since both $\eta_h$ and $\eta_w$ are linear, the relation
\[
-x_h a - y_h b + x_h y_h = -x_w a - y_w b + x_w y_w,
\]
or
\[
a = \frac{y_h - y_w}{x_w - x_h} b + \frac{x_h y_h - x_w y_w}{x_h - x_w},
\]
is linear as well. For simplicity, let $a = \alpha_1 b + \alpha_0$. By using this relation, we reduce the subregion to simple one-dimensional bounds as $l \le b \le u$. When the intersection results in an empty set, the subproblem is infeasible.


B.3.2 Solving the inequalities

Next we solve the inequalities for the remaining ηk.

Case 3.1 : Linear-Quadratic

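Let the convex edge corresponding to $\eta_k$ be characterized by $m_k$ and $q_k$. Then
\[
-x_h a - y_h b + x_h y_h \le -\frac{(a + b m_k - q_k)^2}{4 m_k} - b q_k,
\]
or
\[
-x_h (\alpha_1 b + \alpha_0) - y_h b + x_h y_h \le -\frac{((\alpha_1 + m_k) b + \alpha_0 - q_k)^2}{4 m_k} - b q_k.
\]
Upon simplification, this leads to a quadratic inequality in $b$ as
\[
\beta_2 b^2 + \beta_1 b + \beta_0 \le 0
\]
with
\begin{align*}
\beta_2 &= \frac{(\alpha_1 + m_k)^2}{4 m_k}, \\
\beta_1 &= \frac{(\alpha_1 + m_k)(\alpha_0 - q_k)}{2 m_k} + q_k - \alpha_1 x_h - y_h, \\
\beta_0 &= \frac{(\alpha_0 - q_k)^2}{4 m_k} + x_h y_h - x_h \alpha_0.
\end{align*}
So the solution in this case is a union of intervals.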

Case 3.2 : Linear-Linear

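When $\eta_k$ is linear, the inequality reduces to
\[
-x_h a - y_h b + x_h y_h \le -x_k a - y_k b + x_k y_k,
\]
or
\[
-x_h (\alpha_1 b + \alpha_0) - y_h b + x_h y_h \le -x_k (\alpha_1 b + \alpha_0) - y_k b + x_k y_k,
\]
or
\[
(\alpha_1 x_k + y_k - \alpha_1 x_h - y_h) b \le x_k y_k + \alpha_0 x_h - x_h y_h - \alpha_0 x_k.
\]
Now when $\alpha_1 x_k + y_k - \alpha_1 x_h - y_h > 0$,
\[
b \le \frac{x_k y_k + \alpha_0 x_h - x_h y_h - \alpha_0 x_k}{\alpha_1 x_k + y_k - \alpha_1 x_h - y_h},
\]
and when $\alpha_1 x_k + y_k - \alpha_1 x_h - y_h < 0$,
\[
b \ge \frac{x_k y_k + \alpha_0 x_h - x_h y_h - \alpha_0 x_k}{\alpha_1 x_k + y_k - \alpha_1 x_h - y_h}.
\]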
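The two sign cases translate into a one-sided bound on $b$. The sketch below is an illustration only; the name `linear_bound` and the tolerance are hypothetical, and the handling of a vanishing coefficient (which the text above does not discuss) is an assumption: the constraint then no longer depends on $b$ and is either always or never satisfied.

```python
import math

def linear_bound(alpha1, alpha0, xh, yh, xk, yk, tol=1e-12):
    """Solve eta_h <= eta_k along a = alpha1*b + alpha0 for two vertex pieces.
    Returns a closed interval (lo, hi) for b, or None if no b is feasible."""
    coeff = alpha1 * xk + yk - alpha1 * xh - yh
    rhs = xk * yk + alpha0 * xh - xh * yh - alpha0 * xk
    if coeff > tol:
        return (-math.inf, rhs / coeff)     # b <= rhs / coeff
    if coeff < -tol:
        return (rhs / coeff, math.inf)      # dividing by a negative flips the sign
    # assumption: coeff == 0, so the constraint reads 0 <= rhs for every b
    return (-math.inf, math.inf) if rhs >= 0 else None
```

For example, with $a = b$ (so $\alpha_1 = 1$, $\alpha_0 = 0$), $(x_h, y_h) = (0, 0)$ and $(x_k, y_k) = (1, 1)$, the constraint is $0 \le -2b + 1$, that is, $b \le 1/2$.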


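B.3.3 Solutions

By substituting the linear relation between $a$ and $b$ in the subproblem, we get
\[
\begin{array}{ll}
\max\limits_{b} & -x_h (\alpha_1 b + \alpha_0) - y_h b + x_h y_h + (\alpha_1 b + \alpha_0) x + b y \\[1ex]
\text{subject to} & d_l \le b \le d_u.
\end{array}
\]
Upon simplification, it reduces to
\[
\begin{array}{ll}
\max\limits_{b} & \xi_1(x, y) b + \xi_0(x, y) \\[1ex]
\text{subject to} & d_l \le b \le d_u,
\end{array}
\]
where
\begin{align*}
\xi_1(x, y) &= \alpha_1 x + y - \alpha_1 x_h - y_h, \\
\xi_0(x, y) &= \alpha_0 x + x_h y_h - \alpha_0 x_h.
\end{align*}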

Since we are maximizing a linear function, the optimal value is attained at one of the boundaries, that is, it equals
\[
\max\left\{ d_l \xi_1(x, y) + \xi_0(x, y), \; d_u \xi_1(x, y) + \xi_0(x, y) \right\}.
\]

Appendix C

Conjugate expressions for a rational function over an edge

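For a bivariate rational function $r(x) = \dfrac{\xi_1(x)^2}{\xi_2(x)} + \xi_0(x)$ with $\xi_2(x) = \xi_{21} x_1 + \xi_{22} x_2 + \xi_{20}$, $\xi_1(x) = \xi_{11} x_1 + \xi_{12} x_2 + \xi_{10}$ and $\xi_0(x) = \xi_{01} x_1 + \xi_{02} x_2 + \xi_{00}$, and a polytope $P$, let $f(x) = r(x) + I_P(x)$ and let $E = \{x : x_2 = m x_1 + q, \; v^-_1 \le x_1 \le v^+_1\} \subset P$ be an edge between vertices $v^-$ and $v^+$. Then the conjugate is
\begin{align*}
f^*(s) &= \sup_{x \in \operatorname{ri}(E)} \{ \langle s, x \rangle - f(x) \} \\
&= \sup_{x \in \operatorname{ri}(E)} \{ \langle s, x \rangle - (r(x) + I_P(x)) \}.
\end{align*}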

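By computing the critical points, we get
\[
0 \in s - (\nabla r(x) + N_P(x)).
\]
Now $N_P(x) = \{s : s = \lambda(-m, 1), \; \lambda \ge 0\}$, so
\begin{align*}
s_1 &= -\xi_{21} t^2 + 2 \xi_{11} t + \xi_{01} - m \lambda, \\
s_2 &= -\xi_{22} t^2 + 2 \xi_{12} t + \xi_{02} + \lambda,
\end{align*}
\hfill (C.1)

where $t = \dfrac{\xi_{11} x_1 + \xi_{12} x_2 + \xi_{10}}{\xi_{21} x_1 + \xi_{22} x_2 + \xi_{20}}$. As $x \in \operatorname{ri}(E)$, we also have
\[
x_2 = m x_1 + q. \tag{C.2}
\]
Now by eliminating $\lambda$ from (C.1), we obtain
\[
s_1 + m s_2 = -(\xi_{21} + m \xi_{22}) t^2 + 2 (\xi_{11} + m \xi_{12}) t + \xi_{01} + m \xi_{02}. \tag{C.3}
\]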


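When $\xi_{21} + m \xi_{22} = 0$, (C.3) becomes
\[
s_1 + m s_2 = 2 (\xi_{11} + m \xi_{12}) t + \xi_{01} + m \xi_{02},
\]
or
\[
t = \frac{s_1 + m s_2 - \xi_{01} - m \xi_{02}}{2 (\xi_{11} + m \xi_{12})}.
\]
Let $t = t_1 s_1 + t_2 s_2 + t_0$, then by using the value of $t$ and (C.2), we get
\[
\frac{\xi_{11} x_1 + \xi_{12} x_2 + \xi_{10}}{\xi_{21} x_1 + \xi_{22} x_2 + \xi_{20}} = t_1 s_1 + t_2 s_2 + t_0,
\]
or
\[
\xi_{11} x_1 + (q + m x_1) \xi_{12} + \xi_{10} - (t_1 s_1 + t_2 s_2 + t_0)(\xi_{20} + \xi_{21} x_1 + (q + m x_1) \xi_{22}) = 0,
\]
or
\[
x_1 = \frac{(\xi_{20} + q \xi_{22})(t_0 + t_1 s_1 + t_2 s_2) - \xi_{10} - q \xi_{12}}{\xi_{11} + m \xi_{12}}.
\]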

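For simplification, let $x_1 = \gamma_{10} s_1 + \gamma_{01} s_2 + \gamma_{00}$ where
\begin{align*}
\gamma_{10} &= \frac{t_1 (\xi_{20} + q \xi_{22})}{\xi_{11} + m \xi_{12}}, \\
\gamma_{01} &= \frac{t_2 (\xi_{20} + q \xi_{22})}{\xi_{11} + m \xi_{12}}, \\
\gamma_{00} &= \frac{t_0 (\xi_{20} + q \xi_{22}) - \xi_{10} - q \xi_{12}}{\xi_{11} + m \xi_{12}},
\end{align*}
then the conjugate expression, upon substituting values, becomes
\begin{align*}
f^*(s) &= s_1 x_1 + s_2 (m x_1 + q) - r(x_1, m x_1 + q) \\
&= \zeta_{11} s_1^2 + \zeta_{12} s_1 s_2 + \zeta_{22} s_2^2 + \zeta_{10} s_1 + \zeta_{01} s_2 + \zeta_{00},
\end{align*}
where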

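\begin{align*}
\zeta_{11} &= -\frac{(\xi_{11} \gamma_{10} + m \xi_{12} \gamma_{10})^2}{\xi_{20} + q \xi_{22}} + \gamma_{10}, \\
\zeta_{12} &= -\frac{2 (\xi_{11} \gamma_{01} + m \xi_{12} \gamma_{01})(\xi_{11} \gamma_{10} + m \xi_{12} \gamma_{10})}{\xi_{20} + q \xi_{22}} + \gamma_{01} + m \gamma_{10}, \\
\zeta_{22} &= -\frac{(\xi_{11} \gamma_{01} + m \xi_{12} \gamma_{01})^2}{\xi_{20} + q \xi_{22}} + \gamma_{01} m, \\
\zeta_{10} &= -\frac{2 (\xi_{11} \gamma_{10} + m \xi_{12} \gamma_{10})(\xi_{10} + \xi_{11} \gamma_{00} + \xi_{12} (q + m \gamma_{00}))}{\xi_{20} + q \xi_{22}} - m \xi_{02} \gamma_{10} + \gamma_{00} - \xi_{01} \gamma_{10}, \\
\zeta_{01} &= -\frac{2 (\xi_{11} \gamma_{01} + m \xi_{12} \gamma_{01})(\xi_{10} + \xi_{11} \gamma_{00} + \xi_{12} (q + m \gamma_{00}))}{\xi_{20} + q \xi_{22}} - m \xi_{02} \gamma_{01} - \xi_{01} \gamma_{01} + m \gamma_{00} + q, \\
\zeta_{00} &= -\frac{(\xi_{10} + \xi_{11} \gamma_{00} + \xi_{12} (q + m \gamma_{00}))^2}{\xi_{20} + q \xi_{22}} - \xi_{00} - \xi_{01} \gamma_{00} - \xi_{02} (q + m \gamma_{00}).
\end{align*}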
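For the case $\xi_{21} + m \xi_{22} = 0$, the stationary point and the conjugate value can also be obtained numerically, which gives a simple way to cross-check the coefficients $\zeta_{ij}$ on concrete data. The sketch below is a hedged illustration: it represents each affine function as a coefficient triple `(c1, c2, c0)`, the function names are hypothetical, and it does not check that the resulting $x_1$ lies in the relative interior of the edge.

```python
def rational(xi2, xi1, xi0, x1, x2):
    """r(x) = xi1(x)**2 / xi2(x) + xi0(x); each xi is a triple (c1, c2, c0)
    standing for the affine function c1*x1 + c2*x2 + c0 (assumed encoding)."""
    def affine(c):
        return c[0] * x1 + c[1] * x2 + c[2]
    return affine(xi1) ** 2 / affine(xi2) + affine(xi0)

def conjugate_linear_case(xi2, xi1, xi0, m, q, s, tol=1e-12):
    """Stationary point x1 and value s.x - r(x) on the line x2 = m*x1 + q,
    for the case xi21 + m*xi22 == 0. Edge bounds are NOT checked here."""
    if abs(xi2[0] + m * xi2[1]) > tol:
        raise ValueError("this sketch only covers xi21 + m*xi22 == 0")
    B = xi1[0] + m * xi1[1]                  # xi11 + m*xi12
    G = xi2[2] + q * xi2[1]                  # xi2 is constant along the edge
    t = (s[0] + m * s[1] - xi0[0] - m * xi0[1]) / (2.0 * B)
    x1 = (G * t - xi1[2] - q * xi1[1]) / B
    x2 = m * x1 + q
    value = s[0] * x1 + s[1] * x2 - rational(xi2, xi1, xi0, x1, x2)
    return x1, value
```

As a usage example, take $\xi_2 = x_1 - x_2 + 3$, $\xi_1 = x_1 + x_2$, $\xi_0 = x_1$ on the line $x_2 = x_1 + 1$ and $s = (3, 2)$. Along the line, $r = (2 x_1 + 1)^2 / 2 + x_1$, the stationary point is $x_1 = 1/2$, and the value is $2$.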

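Now when $\xi_{21} + m \xi_{22} \neq 0$, (C.3) results in
\[
s_1 + m s_2 = -(\xi_{21} + m \xi_{22}) t^2 + 2 (\xi_{11} + m \xi_{12}) t + \xi_{01} + m \xi_{02},
\]
or
\[
(\xi_{21} + m \xi_{22}) t^2 - 2 (\xi_{11} + m \xi_{12}) t + s_1 + m s_2 - \xi_{01} - m \xi_{02} = 0,
\]
or
\[
t = \frac{t_{00} \pm \sqrt{t_{10/2} s_1 + t_{01/2} s_2 + t_{00/2}}}{t_{-00}},
\]
where
\begin{align*}
t_{00} &= \xi_{11} + m \xi_{12}, \\
t_{10/2} &= -\xi_{21} - m \xi_{22}, \\
t_{01/2} &= -\xi_{22} m^2 - m \xi_{21}, \\
t_{00/2} &= \xi_{01} \xi_{21} + \xi_{11}^2 + m (m \xi_{12}^2 + \xi_{01} \xi_{22} + \xi_{02} \xi_{21} + 2 \xi_{11} \xi_{12} + m \xi_{02} \xi_{22}), \\
t_{-00} &= \xi_{21} + m \xi_{22}.
\end{align*}
For simplicity, let $t_{1/2} = \sqrt{t_{10/2} s_1 + t_{01/2} s_2 + t_{00/2}}$. By using $t = \dfrac{\xi_{11} x_1 + \xi_{12} x_2 + \xi_{10}}{\xi_{21} x_1 + \xi_{22} x_2 + \xi_{20}}$ and (C.2), $x_2 = m x_1 + q$, we get
\[
\frac{\xi_{11} x_1 + \xi_{12} x_2 + \xi_{10}}{\xi_{21} x_1 + \xi_{22} x_2 + \xi_{20}} = \frac{t_{00} \pm \sqrt{t_{10/2} s_1 + t_{01/2} s_2 + t_{00/2}}}{t_{-00}},
\]
or
\[
\xi_{10} + \xi_{11} x_1 + \xi_{12} x_2 - \frac{(t_{00} \pm t_{1/2})(\xi_{20} + \xi_{21} x_1 + \xi_{22} x_2)}{t_{-00}} = 0,
\]
or
\[
x_1 = \frac{\xi_{20} t_{00} - \xi_{10} t_{-00} \pm \xi_{20} t_{1/2} - q \xi_{12} t_{-00} + q \xi_{22} t_{00} \pm q \xi_{22} t_{1/2}}{\xi_{11} t_{-00} - \xi_{21} t_{00} \mp \xi_{21} t_{1/2} + m \xi_{12} t_{-00} - m \xi_{22} t_{00} \mp m \xi_{22} t_{1/2}},
\]
or
\[
x_1 = \frac{\gamma_{00} \pm \gamma_{1/2} \sqrt{\gamma_{10/2} s_1 + \gamma_{01/2} s_2 + \gamma_{00/2}}}{\mp \gamma_{-1/2} \sqrt{\gamma_{10/2} s_1 + \gamma_{01/2} s_2 + \gamma_{00/2}}},
\]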

where

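\begin{align*}
\gamma_{00} &= \xi_{20} t_{00} - \xi_{10} t_{-00} - q \xi_{12} t_{-00} + q \xi_{22} t_{00}, \\
\gamma_{1/2} &= \xi_{20} + q \xi_{22}, \\
\gamma_{10/2} &= t_{10/2}, \\
\gamma_{01/2} &= t_{01/2}, \\
\gamma_{00/2} &= t_{00/2}, \\
\gamma_{-1/2} &= \xi_{21} + m \xi_{22}.
\end{align*}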

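Now by substituting the optimal values of $x_1$ and $x_2$, the conjugate expression becomes
\begin{align*}
f^*(s) &= s_1 x_1 + s_2 (m x_1 + q) - r(x_1, m x_1 + q) \\
&= \frac{1}{\zeta_{00}} \left( \frac{\psi_1(s_1, s_2)}{\sqrt{\psi_{1/2}(s_1, s_2)}} + \psi_0(s_1, s_2) \right),
\end{align*}
where
\begin{align*}
\zeta_{00} ={}& (\xi_{21} + m \xi_{22})^2, \\
\psi_1(s_1, s_2) ={}& 2 (\xi_{21} + m \xi_{22})(\xi_{10} \xi_{21} - \xi_{11} \xi_{20} + m \xi_{10} \xi_{22} - m \xi_{12} \xi_{20} - q \xi_{11} \xi_{22} + q \xi_{12} \xi_{21}) s_1 \\
& + 2 m (m \xi_{22} + \xi_{21})(\xi_{10} \xi_{21} - \xi_{11} \xi_{20} + m \xi_{10} \xi_{22} - m \xi_{12} \xi_{20} - q \xi_{11} \xi_{22} + q \xi_{12} \xi_{21}) s_2 + \delta_1, \\
\psi_{1/2}(s_1, s_2) ={}& -(\xi_{21} + m \xi_{22}) s_1 - m (\xi_{21} + m \xi_{22}) s_2 + \xi_{01} \xi_{21} + \xi_{11}^2 \\
& + m (m \xi_{12}^2 + \xi_{01} \xi_{22} + \xi_{02} \xi_{21} + 2 \xi_{11} \xi_{12} + m \xi_{02} \xi_{22}), \\
\psi_0(s_1, s_2) ={}& (-\xi_{20} (\xi_{21} + m \xi_{22}) - q \xi_{22} (\xi_{21} + m \xi_{22})) s_1 \\
& + (q \xi_{21} (\xi_{21} + m \xi_{22}) - m \xi_{20} (\xi_{21} + m \xi_{22})) s_2 + \delta_0,
\end{align*}
and
\begin{align*}
\delta_1 ={}& -2 (\xi_{10} \xi_{21} - \xi_{11} \xi_{20} + m \xi_{10} \xi_{22} - m \xi_{12} \xi_{20} - q \xi_{11} \xi_{22} + q \xi_{12} \xi_{21}) \\
& \cdot (\xi_{01} \xi_{21} + \xi_{11}^2 + m (\xi_{12}^2 m + \xi_{01} \xi_{22} + \xi_{02} \xi_{21} + 2 \xi_{11} \xi_{12} + \xi_{02} \xi_{22} m)), \\
\delta_0 ={}& 2 \xi_{11} \xi_{20} (\xi_{11} + m \xi_{12}) - 2 \xi_{10} \xi_{21} (\xi_{11} + m \xi_{12}) - \xi_{00} \xi_{21} (\xi_{21} + m \xi_{22}) \\
& + \xi_{01} \xi_{20} (\xi_{21} + m \xi_{22}) - 2 m \xi_{10} \xi_{22} (\xi_{11} + m \xi_{12}) + 2 m \xi_{12} \xi_{20} (\xi_{11} + m \xi_{12}) \\
& - m \xi_{00} \xi_{22} (\xi_{21} + m \xi_{22}) + m \xi_{02} \xi_{20} (\xi_{21} + m \xi_{22}) + 2 q \xi_{11} \xi_{22} (\xi_{11} + m \xi_{12}) \\
& - 2 q \xi_{12} \xi_{21} (\xi_{11} + m \xi_{12}) + q \xi_{01} \xi_{22} (\xi_{21} + m \xi_{22}) - q \xi_{02} \xi_{21} (\xi_{21} + m \xi_{22}).
\end{align*}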

Hence when $\xi_{21} + m \xi_{22} \neq 0$,
\[
f^*(s) = \frac{1}{\zeta_{00}} \left( \frac{\psi_1(s_1, s_2)}{\sqrt{\psi_{1/2}(s_1, s_2)}} + \psi_0(s_1, s_2) \right),
\]


and when $\xi_{21} + m \xi_{22} = 0$,
\[
f^*(s) = \zeta_{11} s_1^2 + \zeta_{12} s_1 s_2 + \zeta_{22} s_2^2 + \zeta_{10} s_1 + \zeta_{01} s_2 + \zeta_{00},
\]
where all $\zeta_{ij}$, $\psi_i$ and $\psi_{i/j}$ have been formulated above.
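For the case $\xi_{21} + m \xi_{22} \neq 0$, a small numerical experiment can help validate a closed form on concrete data. The sketch below does not use the $\psi$ expressions; it solves the quadratic (C.3) for $t$, recovers $x_1$ from $\xi_1 = t \, \xi_2$ together with (C.2), and evaluates $\langle s, x \rangle - r(x)$. The function name, the triple encoding `(c1, c2, c0)` of affine functions, and the rule of keeping the root with $\xi_2(x) > 0$ are assumptions made for illustration; edge bounds are not checked.

```python
import math

def conjugate_sqrt_case(xi2, xi1, xi0, m, q, s):
    """Stationary point and value of s.x - r(x) on the line x2 = m*x1 + q when
    xi21 + m*xi22 != 0. Returns (x1, value) or None. Hypothetical helper."""
    A = xi2[0] + m * xi2[1]            # xi21 + m*xi22, nonzero in this case
    B = xi1[0] + m * xi1[1]            # xi11 + m*xi12
    C = s[0] + m * s[1] - xi0[0] - m * xi0[1]
    disc = B * B - A * C               # quantity under the square root
    if disc < 0:
        return None                    # (C.3) has no real solution t
    G = xi2[2] + q * xi2[1]            # xi2 on the edge is A*x1 + G
    H = xi1[2] + q * xi1[1]            # xi1 on the edge is B*x1 + H
    best = None
    for sign in (1.0, -1.0):
        t = (B + sign * math.sqrt(disc)) / A
        denom = B - t * A              # from B*x1 + H = t*(A*x1 + G)
        if denom == 0.0:
            continue
        x1 = (t * G - H) / denom
        den2 = A * x1 + G
        if den2 <= 0:                  # assumption: keep the root with xi2 > 0
            continue
        x2 = m * x1 + q
        num = B * x1 + H
        r_val = num * num / den2 + xi0[0] * x1 + xi0[1] * x2 + xi0[2]
        best = (x1, s[0] * x1 + s[1] * x2 - r_val)
    return best
```

As a usage example, for $r(x) = x_1^2 / (2 x_1 + 1)$ on the line $x_2 = 0$, a direct computation gives the value $\frac{1}{2}\left(1 - s_1 - \sqrt{1 - 2 s_1}\right)$ at the stationary point with $2 x_1 + 1 > 0$, which exhibits the square-root dependence on $s$ discussed above.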
