Chapter 1 Convexity, optimization, and convex duality

The purpose of this chapter is to cover some background theory in convexity, optimization, and convex duality. The structure of this chapter is as follows: Section 1.1 recalls some basic notions of convexity theory, such as convex sets, convex functions and properties of these. In Section 1.2 we consider a weaker requirement than convexity, namely quasiconvexity. Section 1.3 covers some of the most central theorems and ideas of optimization theory. In Section 1.4 we consider a method for solving constrained optimization problems, called Lagrange duality. Section 1.5 introduces the convex (conjugate) duality framework of Rockafellar [18], which can be applied to rephrase and solve a large variety of optimization problems, due to its generality. The convex duality framework is a generalized version of the Lagrange duality in Section 1.4. Some examples of optimization using convex duality are given in Section 1.6. Section 1.7 introduces conjugate functions. In Section 1.8, we introduce the Lagrange function of convex duality theory.¹

1.1 Basic convexity

This section summarizes some of the most important definitions and properties of convexity theory. The material of this section is mainly based on the presentation of convexity in Rockafellar [18], Hiriart-Urruty and Lemarèchal [11] and Dahl [4]. The last two consider X = Rn, but the extension to a general inner product space is straightforward.

¹These notes are an adaptation of parts of Dahl [5]: Exercises and solutions have been added, some new material has been added and other things have been removed. Some material has been rewritten and new figures have been added.

Therefore, in the following, let X be a real inner product space, i.e. a vector space X equipped with an inner product ⟨·, ·⟩ : X × X → R (so the function ⟨·, ·⟩ is symmetric, linear in the first component and positive definite in the sense that ⟨x, x⟩ ≥ 0 for all x ∈ X, with equality if and only if x = 0). For instance, X = Rn, n ∈ N, is such a space. We begin with some core definitions.

Definition 1.1.1 (i) (Convex set) A set C ⊆ X is called convex if λx1 + (1 − λ)x2 ∈ C for all x1, x2 ∈ C and 0 ≤ λ ≤ 1.

(ii) (Convex combination) A convex combination of elements x1, x2, . . . , xk in X is an element of the form ∑_{i=1}^k λixi where ∑_{i=1}^k λi = 1 and λi ≥ 0 for all i = 1, . . . , k.

(iii) (Convex hull, conv(·)) Let A ⊆ X be a set. The convex hull of A, denoted conv(A), is the set of all convex combinations of elements of A.

(iv) (Extreme points) Let C ⊆ X be a convex set. An extreme point of C is a point that cannot be written as a convex combination of any other points than itself. That is: e ∈ C is an extreme point for C if λx + (1 − λ)y = e for some x, y ∈ C and 0 < λ < 1 implies x = y = e.

(v) (Hyperplane) H ⊂ X is called a hyperplane if it is of the form H = {x ∈ X : ⟨a, x⟩ = α} for some nonzero vector a ∈ X and some α ∈ R.

(vi) (Halfspace) A hyperplane H divides X into two sets H+ = {x ∈ X : ⟨a, x⟩ ≥ α} and H− = {x ∈ X : ⟨a, x⟩ ≤ α}; these sets intersect in H. These sets are called halfspaces.

We will now look at some hyperplane theorems in Rn. These will be used in connection with environmental contours later in the course. Note that most of these theorems generalise to an arbitrary real inner product space X. However, the proofs are more complicated in the general case. Since the Rn versions are sufficient for our purposes in this course, we restrict ourselves to this.

Any hyperplane in Rn can be written in the form Π = {x : c′x = d}, where c ∈ Rn is a normal vector to Π and d ∈ R. Let Π− = {x : c′x ≤ d} and Π+ = {x : c′x ≥ d} denote the two half-spaces bounded by Π. Let S ⊆ Rn. A supporting hyperplane of S is a hyperplane Π such that we either have S ⊆ Π− or S ⊆ Π+, and such that Π ∩ ∂S ≠ ∅. If Π is a supporting hyperplane of the set S, and S ⊆ Π−, we say that Π+ is a supporting half-space of S. We observe that if Π+ is a supporting half-space of S, we also have: Π+ ∩ S ⊆ ∂S. Moreover, we introduce the notation: P(S) = the family of supporting half-spaces of S.

For a given nonempty set S ⊆ Rn and a vector x0 ∉ S, the vector x∗ ∈ S is said to be the projection of x0 onto S if x∗ is the point in S which is closest to x0. In general the projection x∗ may neither exist nor be unique. However, if S is a closed convex set, x∗ is well-defined, and we have:

Figure 1.1: Some convex sets in the plane.

Figure 1.2: A non-convex set.

Theorem 1.1.2 (Projection) Let S ⊆ Rn be a closed convex set, and let x0 ∉ S. Then the following holds true:

• There exists a unique solution to the projection problem.

• A vector x∗ ∈ S is the projection of x0 onto S if and only if:

(x∗ − x0)′(x − x∗) ≥ 0 for all x ∈ S.

See Figure 1.3 for an illustration of the projection in R2.

Figure 1.3: The point x∗ is the projection of x0 onto the closed convex set S.

Remark 1.1.3 If x ∈ S, and θ is the angle between (x∗ − x0) and (x − x∗), then we must have θ ∈ [−π/2, π/2]. This holds if and only if:

(x∗ − x0)′(x − x∗) ≥ 0 for all x ∈ S.
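As a small numerical sanity check of this characterization, here is a minimal Python sketch (using numpy; the choice of S as the closed unit ball in R2, where the projection of an outside point x0 is simply x0/‖x0‖, is an illustrative assumption):

```python
import numpy as np

# Projection onto the closed unit ball S = {x in R^2 : ||x|| <= 1}.
# For a point x0 outside S, the projection is x0 / ||x0||.
x0 = np.array([2.0, 1.0])
x_star = x0 / np.linalg.norm(x0)

# Check the characterization (x* - x0)'(x - x*) >= 0 for sampled points x in S.
rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.uniform(-1.0, 1.0, size=2)
    if np.linalg.norm(x) <= 1.0:          # only test points that lie in S
        assert (x_star - x0) @ (x - x_star) >= -1e-12
```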

Theorem 1.1.4 (Projection hyperplane) Let S ⊆ Rn be a closed convex set, and assume that x0 ∉ S. Then there exists a supporting hyperplane Π = {x : c′x = d} of S such that:

c′x ≤ d for all x ∈ S, and c′x0 > d.

Proof: Since S is a closed convex set, it follows by the projection theorem that the projection of x0 onto S, denoted x∗, exists and satisfies:

(x∗ − x0)′(x − x∗) ≥ 0 for all x ∈ S. (1.1)

Now, we let c = (x0 − x∗), and d = c′x∗. Then (1.1) can be written as:

c′(x − x∗) ≤ 0 for all x ∈ S. (1.2)

Hence, by (1.2) we have:

c′x ≤ c′x∗ = d for all x ∈ S.

Thus, S ⊆ Π− = {x : c′x ≤ d}, and since x∗ ∈ S ∩ Π, Π is a supporting hyperplane of S. Furthermore, we have:

c′(x0 − x∗) = (x0 − x∗)′(x0 − x∗) > 0.

Hence, it follows that: c′x0 > c′x∗ = d. □

Theorem 1.1.5 (Supporting hyperplane) Let S ⊆ Rn be a convex set, and assume that either x0 ∉ S or x0 ∈ ∂S. Then there exists a hyperplane Π such that S ⊆ Π− and such that x0 ∈ Π. If x0 ∈ ∂S, Π is a supporting hyperplane of S.

Proof: The result follows by a similar argument as for the projection hyperplane theorem and is left as an exercise to the reader. 

Let S, T ⊆ Rn. A hyperplane Π separates S and T if either S ⊆ Π− and T ⊆ Π+ or S ⊆ Π+ and T ⊆ Π−.

Theorem 1.1.6 (Separating hyperplane) Assume that S, T ⊆ Rn are convex, and that S ∩ T ⊆ ∂S. Then there exists a hyperplane Π separating S and T such that S ⊆ Π− and T ⊆ Π+.

Proof: We let u0 = 0 ∈ Rn and introduce the set:

U = {x − y : x ∈ So, y ∈ T}, where So = S \ ∂S is the (convex) set of interior points in S. We first argue that U is convex. To show this we must show that if u1, u2 ∈ U, then αu1 + (1 − α)u2 ∈ U for all α ∈ [0, 1]. Since u1, u2 ∈ U, there exist x1, x2 ∈ So and y1, y2 ∈ T such that:

u1 = x1 − y1 and u2 = x2 − y2.

Since So and T are convex, it follows that for any α ∈ [0, 1], we have:

αx1 + (1 − α)x2 ∈ So and αy1 + (1 − α)y2 ∈ T

Hence, we have:

αu1 + (1 − α)u2 = α(x1 − y1) + (1 − α)(x2 − y2)

= (αx1 + (1 − α)x2) − (αy1 + (1 − α)y2) ∈ U.

By the assumption that S ∩ T ⊆ ∂S it follows that So and T do not have any element in common. Hence, it follows that:

u = x − y ≠ 0, for all x ∈ So and y ∈ T.

Thus, we conclude that: u0 = 0 ∉ U.

Then, by the supporting hyperplane theorem there exists a hyperplane Π0 = {x : c′x = d0} such that U ⊆ Π0− and such that u0 ∈ Π0.

In fact, u0 ∈ Π0 implies that c′u0 = c′0 = d0. Thus, d0 = 0. Since U ⊆ Π0−, we have c′u ≤ d0 = 0 for all u ∈ U, implying that:

c′(x − y) ≤ 0 for all x ∈ So and y ∈ T, or equivalently:

c′x ≤ c′y for all x ∈ So and y ∈ T. (1.3)

We then let d = sup{c′x : x ∈ So}. By the definition of d we have:

c′x ≤ d, for all x ∈ So. (1.4)

If x0 ∈ ∂S, there exists {xk} ⊆ So such that lim_{k→∞} xk = x0, and so:

c′x0 = lim_{k→∞} c′xk ≤ d. (1.5)

By combining (1.4) and (1.5) it follows that we have:

c′x ≤ d, for all x ∈ S. (1.6)

Moreover, by (1.3) and the definition of d it follows that we have:

c′y ≥ d, for all y ∈ T. (1.7)

Hence, by letting Π = {x : c′x = d}, we conclude that:

S ⊆ Π− and T ⊆ Π+ (1.8)



In the proof of the separating hyperplane theorem, we defined Π = {x : c′x = d}, where d = sup{c′x : x ∈ So}.

This implies that there exists a sequence {xk} ⊆ S with limit x0 = lim_{k→∞} xk ∈ ∂S, such that c′x0 = d. Thus, x0 ∈ Π ∩ ∂S, implying that Π+ is a supporting halfspace of S.

Theorem 1.1.7 (Separating hyperplane, supporting halfspace) Assume that S, T ⊆ Rn are convex, and that S ∩T ⊆ ∂S. Then there exists a hyperplane Π separating S and T such that S ⊆ Π−, T ⊆ Π+ and where Π+ ∈ P(S).

We now define polyhedra and polytopes. These subsets of Rn are useful because they are descriptions of the solution sets of systems of linear inequalities. The following definitions are from Rockafellar [17].

Definition 1.1.8 (Polyhedron) A set K ⊆ Rn is called a polyhedron if it can be described as the intersection of finitely many closed half-spaces.

Hence, a polyhedron can be described as the solution set of a system of finitely many (non-strict) linear inequalities. It is straightforward to show that a polyhedron is a convex set. A (convex) polytope is a set of the following form:

Definition 1.1.9 (Polytope) A set K ⊆ Rn is called a (convex) polytope if it is the convex hull of finitely many points.

Clearly, all polytopes are convex since a convex hull is always convex. Examples of (convex) polytopes in R2 are triangles, squares and hexagons. Actually, all polytopes in Rn are compact sets.

Lemma 1.1.10 Let K ⊆ Rn be a polytope. Then K is a compact set.

Proof: Since K is a polytope, it is the convex hull of finitely many points, say K = conv({k1, . . . , km}), so

K = {∑_{i=1}^m λiki : ∑_{i=1}^m λi = 1, λi ≥ 0 for all i = 1, . . . , m}.

Consider the continuous function f : Rm → Rn, f(x1, . . . , xm) = ∑_{i=1}^m xiki, and the compact set

S = {(λ1, . . . , λm) : ∑_{i=1}^m λi = 1, λi ≥ 0 for all i = 1, . . . , m} ⊆ Rm

(S is closed and bounded, hence compact in Rm). Then, since f is continuous and S is compact, f(S) := {x : x = f(s) for some s ∈ S} ⊆ Rn is a compact set (see for example Munkres [15]). But f(S) = K from the definitions, and hence K is compact. □

From Lemma 1.1.10, any polytope is a closed and bounded set, since compactness is equivalent to being closed and bounded in Rn. The following theorem connects the notions of polytope and polyhedron.

Theorem 1.1.11 A set K ⊆ Rn is a polytope if and only if it is a bounded polyhedron.

For a proof of this, see Ziegler [25]. Sometimes, one needs to consider what is called the relative interior of a set.

Definition 1.1.12 (Relative interior, rint(·)) Let S ⊆ X. x ∈ S is a relative interior point of S if it is contained in some open set whose intersection with aff(S) is contained in S. rint(S) is the set of all relative interior points of S. Here, aff(S) is the smallest affine set that contains S (where a set is affine if it contains any affine combination of its points; an affine combination is like a convex combination except the coefficients are allowed to be negative).
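For example, the line segment S = {(t, 0) : 0 ≤ t ≤ 1} ⊆ R2 has empty interior in R2, but aff(S) is the x-axis, so rint(S) = {(t, 0) : 0 < t < 1}.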

Another useful notion is that of a convex cone.

Definition 1.1.13 (Convex cone) C ⊆ X is called a convex cone if for all x, y ∈ C and all α, β ≥ 0:

αx + βy ∈ C

From these definitions, one can derive some properties of convex sets.

Theorem 1.1.14 (Properties of convex sets) (i) If {Cj}j∈J ⊆ X is an arbitrary family of convex sets, then the intersection ∩j∈J Cj is also a convex set.

(ii) conv(A) is a convex set, and it is the smallest (set inclusion-wise) convex set containing A.

(iii) If C1,C2,...,Cm ⊆ X are convex sets, then the Cartesian product C1 × C2 × ... × Cm is also a convex set.

(iv) If C ⊆ X is a convex set, then the interior of C, int(C), the relative interior rint(C) and the closure of C, cl(C), are convex sets as well.

The proof is left as an exercise. Sometimes, one considers not just R, but R¯, the extended real numbers.

Definition 1.1.15 (The extended real numbers, R¯) Let R¯ = R∪{−∞, +∞} denote the extended real numbers.

When working with the extended real numbers the following computational rules apply: a − ∞ = −∞, a + ∞ = ∞, ∞ + ∞ = ∞, −∞ − ∞ = −∞ and ∞ − ∞ is not defined. The following function is often useful, in particular in optimization.

Definition 1.1.16 (The indicator function for a set M, δM) Let M ⊆ X be a set. The indicator function for the set M, δM : X → R¯, is defined as

δM(x) = 0 if x ∈ M, and δM(x) = +∞ if x ∉ M.

The following example shows why this function is useful in optimization. Consider the constrained minimization problem min f(x) s.t. x ∈ M for some function f : X → R¯ and some set M ⊆ X. This can be transformed into an unconstrained minimization problem by altering the objective function as follows

min f(x) + δM(x).

This is the same problem as before because the minimum above cannot be achieved for x ∉ M, because then δM(x) = +∞, so the objective function is infinitely large as well. The next definition is very important.

Definition 1.1.17 (Convex function) Let C ⊆ X be a convex set. A function f : C → R is called convex if the inequality

f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y) (1.9)

holds for all x, y ∈ C and every 0 ≤ λ ≤ 1.

There is an alternative way of defining convex functions, which is based on the notion of the epigraph.

Definition 1.1.18 (Epigraph, epi(·)) Let f : X → R¯ be a function. Then the epigraph of f is defined as epi(f) = {(x, α) : x ∈ X, α ∈ R, α ≥ f(x)}.

Definition 1.1.19 (Convex function) Let A ⊆ X. A function f : A → R¯ is called convex if the epigraph of f is convex (as a subset of the vector space X × R).

Of course, these definitions are actually equivalent.

Theorem 1.1.20 Definitions 1.1.17 and 1.1.19 are equivalent if the set A in Definition 1.1.19 is convex (A must be convex in order for Definition 1.1.17 to make sense).

Proof: 1.1.17 ⇒ 1.1.19: Assume that f is a convex function according to Definition 1.1.17. Let (x, a), (y, b) ∈ epi(f) and let λ ∈ [0, 1]. Then λ(x, a) + (1 − λ)(y, b) = (λx + (1 − λ)y, λa + (1 − λ)b). But f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y) from Definition 1.1.17, so f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y) ≤ λa + (1 − λ)b. So (λx + (1 − λ)y, λa + (1 − λ)b) ∈ epi(f).

1.1.19 ⇒ 1.1.17 uses the same type of arguments, thus it is omitted. □

Figure 1.4: The epigraph of a function f.

Figure 1.5: A convex function.

Figure 1.6: A lower semi-continuous function f.

Definition 1.1.21 (Concave function) A function g is concave if the function f := −g is convex.

When minimizing a function, the points where it is infinitely large are uninteresting; this motivates the following definitions.

Definition 1.1.22 (Effective domain, dom(·)) Let A ⊆ X and let f : A → R¯ be a function. The effective domain of f is defined as dom(f) = {x ∈ A : f(x) < +∞}.

Definition 1.1.23 (Proper function) Let A ⊆ X and let f : A → R¯ be a function. f is called proper if dom(f) ≠ ∅ and f(x) > −∞ for all x ∈ A.

For definitions of general topological terms, such as convergence, continuity and neighborhood, see any basic topology book, for instance Topology by James Munkres [15].

Definition 1.1.24 (Lower semi-continuity, lsc) Let A ⊆ X be a set, and let f : A → R¯ be a function. f is called lower semi-continuous, lsc, at a point x0 ∈ A if for each k ∈ R such that k < f(x0) there exists a neighborhood U of x0 such that f(u) > k for all u ∈ U. Equivalently: f is lower semi-continuous at x0 if and only if lim inf_{x→x0} f(x) ≥ f(x0).
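For example, the function f : R → R with f(x) = 0 for x ≤ 0 and f(x) = 1 for x > 0 is lower semi-continuous at every point, while the function g with g(x) = 0 for x < 0 and g(x) = 1 for x ≥ 0 is not lower semi-continuous at x0 = 0, since lim inf_{x→0} g(x) = 0 < 1 = g(0).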

Definition 1.1.25 (α-sublevel set of a function, Sα(f)) Let f : X → R¯ be a function and let α ∈ R. The α-sublevel set of f, Sα(f), is defined as

Sα(f) = {x ∈ X : f(x) ≤ α}.

Theorem 1.1.26 Let f : X → R¯ be a function. Then, f is lower semi-continuous if and only if the sublevel sets Sα(f) are closed for all α ∈ R.

Proof: The sublevel sets Sα(f) := {x ∈ X : f(x) ≤ α} are closed for all α ∈ R if and only if the complement sets Y = X − Sα(f) = {x ∈ X : f(x) > α} are open for all α. But this happens if and only if all y ∈ Y are interior points, which is equivalent to requiring that for each y ∈ Y there is a neighborhood U of y such that U ⊆ Y, i.e. f(u) > α for all u ∈ U. But this is the definition of f being lower semi-continuous at the point y (with k = α). Since this argument holds for all y ∈ X (by choosing different α), f is lower semi-continuous. □

Definition 1.1.27 (Convex hull of a function, co(f)) Let A ⊆ X be a set, and let f : A → R¯ be a function. Then the convex hull of f is the (pointwise) largest convex function h such that h(x) ≤ f(x) for all x ∈ A.

Clearly, if f is a convex function, then co(f) = f. One can define the lower semi-continuous hull, lsc(f), of a function f in a similar way: it is the (pointwise) largest lower semi-continuous function that is less than or equal to f.

Definition 1.1.28 (Closure of a function, cl(f)) Let A ⊆ X be a set, and let f : A → R¯ be a function. We define cl(f)(x) = lsc(f)(x) for all x ∈ A if lsc(f)(x) > −∞ for all x ∈ A, and cl(f)(x) = −∞ for all x ∈ A if lsc(f)(x0) = −∞ for some x0 ∈ A. We say that a function f is closed if cl(f) = f. Hence, f is closed if it is lower semi-continuous and f(x) > −∞ for all x, or if f(x) = −∞ for all x.

Theorem 1.1.29 Let M ⊆ X, and consider the indicator function for the set M, δM, as defined in Definition 1.1.16. Then, the following properties hold:

• If N ⊆ X, then M ⊆ N ⇐⇒ δN ≤ δM .

• M is a convex set ⇐⇒ δM is a convex function.

• δM is lower semi-continuous ⇐⇒ M is a closed set. Proof:

• From Definition 1.1.16: δN ≤ δM iff. (If δM (x) < +∞ then δN (x) < +∞) iff. (x ∈ M ⇒ x ∈ N) iff. M ⊆ N.

• δM is convex if and only if δM (λx+(1−λ)y) ≤ λδM (x)+(1−λ)δM (y) holds for all 0 ≤ λ ≤ 1 and all x, y ∈ X such that δM (x), δM (y) < +∞, that is, for all x, y ∈ M. But this means that λx + (1 − λ)y ∈ M, equivalently, M is convex.

• Assume δM is lower semi-continuous. Then it follows from Theorem 1.1.26 that Sα(δM) is closed for all α ∈ R. But, for any α ≥ 0, Sα(δM) = {x ∈ X : δM(x) ≤ α} = M (from the definition of δM), so M is closed. Conversely, assume that M is closed. Then, for any α ∈ R, Sα(δM) equals M (if α ≥ 0) or ∅ (if α < 0), which are both closed sets, hence δM is lower semi-continuous from Theorem 1.1.26.



A global minimum for a function f : A → R¯, where A ⊂ X, is an x0 ∈ A such that f(x0) ≤ f(x) for all x ∈ A. A local minimum for f is an x0 ∈ A such that there exists a neighborhood U of x0 such that x ∈ U ⇒ f(x0) ≤ f(x). Based on all these definitions, one can derive the following properties of convex functions.

Theorem 1.1.30 (Properties of convex functions) Let C ⊆ X be a convex set and let f : C → R be a convex function. Then the following properties hold:

1. If f has a local minimum x0, then x0 is also a global minimum for f.

2. If C = R, so that f : R → R, and f is differentiable, then f′ is monotonically increasing.

3. If a function g : R → R is twice differentiable and g′′(x) > 0 for all x, then g is convex.

4. Jensen's inequality: For x1, . . . , xn ∈ X, λ1, . . . , λn ∈ R, λk ≥ 0 for k = 1, . . . , n, ∑_{k=1}^n λk = 1, the following inequality holds

f(∑_{k=1}^n λkxk) ≤ ∑_{k=1}^n λkf(xk).

5. The sum of convex functions is convex.

6. αf is convex if α ∈ R, α ≥ 0.

7. If (fn)n∈N is a sequence of convex functions, fn : X → R, and fn → f pointwise as n → ∞, then f is convex.

8. dom(f) is a convex set.

9. If α ∈ R¯, then the sublevel set for f, Sα(f), is a convex set. Similarly, {x ∈ C : f(x) < α} is a convex set.

10. Maximization: Let {fλ} be an arbitrary family of convex functions, then g(x) = supλ fλ(x) is convex. Also, g(x) = supy f(x, y) is convex if f(x, y) is convex in x for all y.

11. Minimization: Let f : X × X → R¯ be a convex function. Then g(x) = infy f(x, y) is convex. Proof:

1. Suppose x0 is a local minimum for f, that is: There exists a neighborhood U ⊆ C of x0 such that f(x0) ≤ f(x) for all x ∈ U. We want to show that f(x0) ≤ f(x) for all x ∈ C. Let x ∈ C. Consider the convex combination λx + (1 − λ)x0. This convex combination converges towards x0 as λ → 0.

Therefore, for a sufficiently small λ∗, λ∗x + (1 − λ∗)x0 ∈ U, so since f is convex

f(x0) ≤ f(λ∗x + (1 − λ∗)x0) ≤ λ∗f(x) + (1 − λ∗)f(x0)

which, by rearranging the terms, shows that f(x0) ≤ f(x). Therefore, x0 is a global minimum as well.

2. Follows from Definition 1.1.17 and the definition of the derivative.

3. Left as an exercise.

4. Left as an exercise.

5. Left as an exercise.

6. Follows from Definition 1.1.17.

7. Use Definition 1.1.17 and the homogeneity and additivity of limits.

8. Follows from the definitions.

9. Follows from the definitions, but is included here as an example of a typical basic proof. Let x, y ∈ Sα(f). Then f(x), f(y) ≤ α. Then λx + (1 − λ)y ∈ Sα(f) because

f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y) ≤ λα + (1 − λ)α = α

where the first inequality follows from the convexity of f, and the second inequality follows from that x, y ∈ Sα(f).

10. The epigraph of g = supλ fλ is the intersection of the epigraphs of the fλ, which is a convex set by Theorem 1.1.14 (i), so g is convex; alternatively, one can argue via pointwise limits as in property 7.

11. Let x1, x2 ∈ X and λ ∈ [0, 1]. For any y1, y2 ∈ X, convexity of f gives f(λx1 + (1 − λ)x2, λy1 + (1 − λ)y2) ≤ λf(x1, y1) + (1 − λ)f(x2, y2). Taking the infimum over y1 and y2 on both sides yields g(λx1 + (1 − λ)x2) ≤ λg(x1) + (1 − λ)g(x2).



1.2 Quasiconvex functions

In this section we introduce quasiconvex functions. Quasi-convexity is a weaker requirement than convexity, but still strong enough that quasiconvex functions have applications in optimization, game theory and economics.

Definition 1.2.1 (Quasiconvex function) Let S ⊆ X be convex. A function f : S → R is quasiconvex if for all x, y ∈ S and λ ∈ [0, 1], we have

f(λx + (1 − λ)y) ≤ max{f(x), f(y)}.

Figure 1.7: A function which is quasiconvex, but not convex.

An equivalent way to define quasiconvexity is via convexity of the sublevel sets Sα := {x ∈ S : f(x) ≤ α} for all α.

Proposition 1.2.2 Let S ⊆ X be convex and let f : S → R. Then, f is quasiconvex if and only if the α-sublevel sets

Sα = {x ∈ S : f(x) ≤ α} are convex for all α ∈ R.

Proof: Left as an exercise to the reader. 

All convex functions are quasiconvex. However, the opposite implication is not true: There exist quasiconvex functions that are not convex, see Figure 1.7. Furthermore, concave functions can be quasiconvex. An example of this is f(x) = log(x), defined on the positive real numbers. Not all functions are quasiconvex, and an example of a function which is not quasiconvex is illustrated in Figure 1.8. Note that this function is not quasiconvex because the set of points in the domain where the function values are below the horizontal red line is the union of the two bold, red intervals, which is not a convex set. Hence, the sublevel set Sα for this particular α is not convex, and therefore the function does not satisfy the condition in Proposition 1.2.2.
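As a small numerical illustration of the sublevel-set criterion, the following Python sketch checks that f(x) = √|x| (an assumed example; it is quasiconvex since its sublevel sets are intervals, but it is not convex) satisfies the inequality in Definition 1.2.1 on random samples, while violating the convexity inequality:

```python
import numpy as np

def f(x):
    # f(x) = sqrt(|x|): quasiconvex on R, but not convex.
    return np.sqrt(np.abs(x))

rng = np.random.default_rng(1)
xs = rng.uniform(-5.0, 5.0, 10000)
ys = rng.uniform(-5.0, 5.0, 10000)
lams = rng.uniform(0.0, 1.0, 10000)
zs = lams * xs + (1 - lams) * ys

# Quasiconvexity: f(lam*x + (1-lam)*y) <= max(f(x), f(y)) in every sample.
assert np.all(f(zs) <= np.maximum(f(xs), f(ys)) + 1e-12)

# Convexity fails: with x = 0, y = 1, lam = 1/2, f(1/2) > (f(0) + f(1)) / 2.
assert f(0.5) > 0.5 * (f(0.0) + f(1.0))
```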

Figure 1.8: A function which is not quasiconvex.

1.3 Optimization

Optimization is the mathematical theory of maximization and minimization problems. It is useful in many applications, for example in logistics problems, finding the best spot to drill for oil, and in mathematical finance. In finance, one often considers an investor who wishes to maximize her utility, given various constraints (for instance her salary). The question is how one can solve such problems. This section gives a short summary of some background theory on optimization. Let X be a vector space, f : X → R¯, g : X → Rn and S ⊆ X. Consider an optimization problem of the form

min f(x)
subject to g(x) ≤ 0 (componentwise), (1.10)
x ∈ S.

In problem (1.10), f is called the objective function, while g(x) ≤ 0, x ∈ S are called the constraints of the problem. A useful technique when dealing with optimization problems is transforming the problem. For example, a constraint of the form h(x) ≥ y (for h : X → Rn, y ∈ Rn) is equivalent to y − h(x) ≤ 0, which is of the form g(x) ≤ 0 with g(x) = y − h(x). Similarly, any maximization problem can be turned into a minimization problem (and vice versa) by using that inf f(x) = − sup(−f(x)). Also, any equality constraint can be transformed into two inequality constraints: h(x) = 0 is equivalent to h(x) ≤ 0 and h(x) ≥ 0. One of the most important theorems of optimization is the extreme value theorem (see Munkres [15]).

Theorem 1.3.1 (The extreme value theorem) If f : X → R is a continuous function from a compact set into the real numbers, then there exist points a, b ∈ X such that f(a) ≥ f(x) ≥ f(b) for all x ∈ X. That is, f attains a maximum and a minimum.

The importance of the extreme value theorem is that it gives the existence of a maximum and a minimum in a fairly general situation. However, these may not be unique. But, for convex (or concave) functions, Theorem 1.1.30 implies that any local minimum (maximum) is a global minimum (maximum). This makes convex functions useful in optimization. For a function f : Rn → R, the maximum and minimum are attained at critical points. Critical points are points x such that

• f′(x) = 0, where f is differentiable at x,

• the function f is not differentiable at x or

• x is on the boundary of the set one is optimizing over.
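For example, for f(x) = |x| considered on [−1, 2], the critical points are x = 0 (where f is not differentiable; elsewhere f′(x) = ±1 ≠ 0) and the boundary points x = −1 and x = 2. Comparing f(−1) = 1, f(0) = 0 and f(2) = 2 identifies x = 0 as the minimum point and x = 2 as the maximum point.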

Hence, for a differentiable function which is optimized without extra constraints, one can find maximum and minimum points by solving f′(x) = 0 and comparing the objective value in these points to those of the points on the boundary. Constrained optimization can be tricky to handle. An example of constrained optimization is linear programming (LP): maximization of linear functions under linear constraints. In this situation, strong theorems regarding the solution have been derived. It turns out that corresponding to each LP problem, there is a "dual" problem, and these two problems have the same optimal value. This dual problem is introduced in order to get a second chance at solving an otherwise difficult problem. There is also an effective numerical method for solving LP problems, called the simplex algorithm. See Vanderbei [24] for more about linear programming. The concept of deriving a "dual" problem to handle constraints is the idea of Lagrange duality as well. Lagrange duality begins with a problem of the form (1.10) (or the corresponding maximization problem), and derives a dual problem which gives lower (upper) bounds on the optimal value of the problem. Actually, linear programming duality is a special case of Lagrange duality, but since Lagrange duality is more general, one cannot get the strong theorems of linear programming. The duality concept is generalized even more in convex duality theory, which is the topic of Section 1.5.

1.4 Lagrange duality

The method of Lagrange duality can be described as follows: Let X be a general inner product space with inner product ⟨·, ·⟩. Assume there is a function f : X → R to be maximized under certain constraints. Consider a problem of the following, very general, form

maximize f(x) subject to g(x) ≤ 0, x ∈ S (1.11)

where g is a function such that g : X → RN and S ≠ ∅ (to exclude a trivial case). This will be called the primal problem. Note that if one has a problem with equality constraints, one can rewrite this in the form of problem (1.11) by writing each equality as two inequalities. Also, ≥ can be turned into ≤ by multiplying with −1, and by basic algebra, one can always make sure there is 0 on one side of the inequality. Note that there are no constraints on f or S and only one (weak) constraint on g. Hence, many problems can be written in the form (1.11). Let λ ∈ RN be such that λ ≥ 0 (componentwise), and assume that g(x) ≤ 0 (componentwise) for all x ∈ S. Then:

f(x) ≤ f(x) − λ · g(x) (1.12) because λ · g(x) ≤ 0 (where · denotes the Euclidean inner product). This motivates the definition of the Lagrange function, L(x, λ)

L(x, λ) = f(x) − λ · g(x).

Hence, L(x, λ) is an upper bound on the objective function for each λ ∈ RN , λ ≥ 0 and x ∈ X such that g(x) ≤ 0. By taking supremum on each side of the inequality in (1.12), for each λ ≥ 0,

sup{f(x): g(x) ≤ 0, x ∈ S} ≤ sup{f(x) − λ · g(x): g(x) ≤ 0, x ∈ S} = sup{L(x, λ): x ∈ S, g(x) ≤ 0}

≤ sup_{x∈S} L(x, λ) := L(λ) (1.13)

where the second inequality follows because we are maximizing over a larger set, hence the optimal value cannot decrease. This implies that for all λ ≥ 0, L(λ) is an upper bound for the optimal value of problem (1.11). We want to find the smallest upper bound. This motivates the definition of the Lagrangian dual problem

inf_{λ≥0} L(λ). (1.14)

Therefore, the following theorem is proven (by taking the infimum on the right hand side of equation (1.13)).

Theorem 1.4.1 (Weak Lagrange duality) In the setting above, the follow- ing inequality holds

sup{f(x): g(x) ≤ 0, x ∈ S} ≤ inf{L(λ): λ ≥ 0}.
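As a simple illustration of Theorem 1.4.1, consider maximizing f(x) = −x² over x ∈ S = R subject to the single constraint g(x) = 1 − x ≤ 0. Here L(x, λ) = −x² − λ(1 − x), so for each λ ≥ 0, L(λ) = sup_{x∈R} {−x² + λx − λ} = λ²/4 − λ (the supremum is attained at x = λ/2). The dual problem inf_{λ≥0} {λ²/4 − λ} is solved by λ = 2, giving the value −1, which equals the primal optimal value f(1) = −1; in this example there is therefore no duality gap.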

This theorem shows that the Lagrangian dual problem gives the smallest upper bound on the optimal value of problem (1.11) generated by the Lagrange function.

Figure 1.9: Illustration of Lagrange duality with a duality gap.

The Lagrangian dual problem has only one, quite simple, constraint, namely λ ≥ 0, and this may mean that the dual problem is easier to solve than the original problem. In some special cases, one can proceed to show duality theorems, proving that sup{f(x) : g(x) ≤ 0, x ∈ S} = inf_{λ≥0} L(λ). If this is the case, one says that there is no duality gap. This typically happens under certain assumptions on the problem. However, often there actually is a duality gap, but the Lagrangian dual problem still gives us an upper bound, and hence some idea of the optimal value of our problem. An example where Lagrangian duality is applied, and a duality theorem is derived, is linear programming (LP) duality. The calculation of the LP dual from the primal is omitted here, but it is fairly straightforward. The definition of conjugate functions (see Section 1.7) also has its roots in Lagrangian duality. The conjugate function shows up naturally when finding the Lagrangian dual of a minimization problem with linear inequalities as constraints.

One can illustrate Lagrange duality such that it is simple to see graphically whether there is a duality gap. Consider problem (1.11) where S = X, and define the set G = {(g(x), f(x)) ∈ RN+1 : x ∈ X}. The optimal value of problem (1.11), denoted p∗, can then be written as p∗ = sup{t : (u, t) ∈ G, u ≤ 0} (from the definitions). This can be illustrated for g : X → R (i.e. for only one inequality) as in Figures 1.9 and 1.10. Figure 1.10 shows the set G, the optimal primal value p∗ and the Lagrange function for two different Lagrange multipliers. The value of the function L(l) = sup{f(x) − lg(x) : x ∈ X} is given by the intersection of the line t − lu and the t-axis.

One can illustrate Lagrange duality such that it is simple to see graphically whether there is a duality gap. Consider problem (1.11) where S = X, and define the set G = {(g(x), f(x)) ∈ RN+1 : x ∈ X}. The optimal value of problem (1.11), denoted p∗, can then be written as p∗ = sup{t :(u, t) ∈ G, u ≤ 0} (from the definitions). This can be illustrated for g : X → R (i.e. for only one inequality) as in Figures 1.9 and 1.10. Figure 1.10 shows the set G, the optimal primal value p∗ and the Lagrange- function for two different Lagrange multipliers. The value of the function L(l) = supx∈X {f(x)−lg(x)} is given by the intersection of the line t−lu and the t-axis. 20 CHAPTER 1. CONVEXITY

Figure 1.10: Illustration of Lagrange duality with no duality gap.

Note that the shaded part of G corresponds to the feasible solutions of problem (1.11). Hence, to find the optimal primal solution p∗ in the figure, find the point (u∗, t∗) in the shaded area of G such that t∗ is as large as possible. How can one find the optimal dual solution in Figure 1.10? Fix an l ≥ 0, and draw the line t − lu into the figure. Now, find the function L(l) by parallel-adjusting the line so that its intersection with the t-axis is as large as possible, while making sure that the line still intersects G. Having done this, tilt the line such that l is still greater than or equal to 0, but such that the intersection of the line and the t-axis becomes as small as possible. The final intersection is the optimal dual solution. Actually, there is no duality gap in the problem of Figure 1.10, since the optimal primal value corresponds to the optimal dual value, given by the intersection of the line t − l∗u and the t-axis. In Figure 1.9 there is a duality gap, since the optimal dual value, denoted d∗, is greater than the optimal primal value, denoted p∗. What goes wrong? By examining the two figures above, one sees that the absence of a duality gap has something to do with the set G being "locally convex" near the t-axis. Bertsekas [2] formalizes this idea, and shows a condition for the absence of a duality gap (in the Lagrange duality case), called the Slater condition.

The Slater condition, in the case where X = Rn (see Boyd and Vandenberghe [3]), states the following: Assume there is a problem of the form (1.11). If f is concave, S = X, each component function of g is convex and there exists x ∈ rint(D) (see Definition 1.1.12), where D is defined as the set of x ∈ X where both f and g are defined, such that g(x) < 0, then there is no duality gap. Actually, (from Boyd and Vandenberghe [3]) this condition can be weakened in the case where the component functions of g are actually affine (and f is still concave) and dom(f) is open. In this case, the existence of a feasible solution is sufficient for the absence of a duality gap. Note that for a minimization problem, the same condition holds as long as f is convex (since a maximization problem can be turned into a minimization problem by using that sup f = − inf(−f)).

There is also an alternative version of the Slater condition, where X = Rn. This is from Bertsekas et al. [2, p. 371]: If the optimal value of the primal problem (1.11) is finite, S is a convex set, f and g are convex functions and there exists x0 ∈ S such that g(x0) < 0, then there is no duality gap.

There is also a generalized version of the Lagrange duality method. The previous Lagrange duality argument can be done for g : X → Z, where Z is some normed space (see Rynne and Youngston [22] for more on normed spaces) with an ordering that defines a non-negative orthant. From this, one can derive a slightly more general version of the Slater condition (using the separating hyperplane theorem). This version of the Slater condition is Theorem 5 in Luenberger [13] (adapted to the notation of this section): Let X be a normed space and let f be a concave function, defined on a convex subset C of X. Also, let g be a convex function which maps into a normed space Z (with some ordering). Assume there exists some x0 ∈ C such that g(x0) < 0. Then the optimal value of the Lagrange primal problem equals the optimal value of the Lagrange dual problem, i.e. there is no duality gap.
In particular, since Rm is a normed space with an ordering that defines a non- negative orthant (componentwise ordering), this generalized Slater condition applies to the Lagrange problem at the beginning of this section. Finally, note that the Lagrange duality method is quite general, since it holds for an arbitrary vector space X.

1.5 Convex duality and optimization

This section is based on Conjugate Duality and Optimization by Rockafellar [18]. As mentioned, convex functions are very handy in optimization problems because of property 1 of Theorem 1.1.30: For any convex function, a local minimum is also a global minimum. Another advantage with convex functions in optimization is that one can exploit duality properties in order to solve problems. In the following, let X be a linear space, and let f : X → R be a function. The main idea of convex duality is to view a given minimization problem min_{x∈X} f(x) (note that it is common to write min instead of inf when introducing a minimization problem even though one does not know that the minimum is attained) as one half of a minimax problem where a saddle value exists. Very roughly, one does this by looking at an abstract optimization problem

min_{x∈X} F(x, u) (1.15)

where F : X × U → R is a function such that F(x, 0) = f(x), U is a linear space and u ∈ U is a parameter one chooses depending on the particular problem at hand. For example, u can represent time or some stochastic vector expressing uncertainty in the problem data. Corresponding to this problem, one defines an optimal value function

ϕ(u) = inf_{x∈X} F(x, u), u ∈ U. (1.16)

We then have the following theorem:

Theorem 1.5.1 Let X,U be real vector spaces, and let F : X × U → R be a convex function. Then ϕ is convex as well.

Proof: This follows from property 10 of Theorem 1.1.30. 

The following is a more detailed illustration of the dual optimization method: Let X and Y be general linear spaces, and let K : X × Y → R be a function. Define

f(x) = sup_{y∈Y} K(x, y) (1.17)

and

g(y) = inf_{x∈X} K(x, y). (1.18)

Then, consider two optimization problems

(P) min_{x∈X} f(x)

and

(D) max_{y∈Y} g(y).

From the definitions

g(y) ≤ K(x, y) ≤ f(x), ∀ x ∈ X, ∀ y ∈ Y. (1.19)

By taking the infimum over x and then the supremum over y in equation (1.19)

inf_{x∈X} sup_{y∈Y} K(x, y) = inf_{x∈X} f(x) ≥ sup_{y∈Y} g(y) = sup_{y∈Y} inf_{x∈X} K(x, y). (1.20)

If there is equality in equation (1.20), then the common value is called the saddle value of K. The saddle value exists if K has a saddle point, i.e. there exists a point (x0, y0) such that

K(x0, y) ≤ K(x0, y0) ≤ K(x, y0) (1.21)

for all x ∈ X and for all y ∈ Y. If such a point exists, the saddle value of K is K(x0, y0). From these definitions, one can prove the following theorem.

Theorem 1.5.2 A point (x0, y0) is a saddle point for K if and only if x0 solves (P), y0 solves (D), and the saddle value of K exists, i.e. inf_{x∈X} f(x) = sup_{y∈Y} g(y).

The proof is left as an exercise. Because of this theorem, (P) and (D) are called dual problems, since they can be considered as two halves of the problem of finding a saddle point for K. Hence, in order to prove that (P) and (D) have a solution, and actually find this solution, one can instead attempt to find a saddle point for the function K. In convex optimization, one is interested in what has been done above in the opposite order: If one starts with (P), where f : X → R, how can one choose a space Y and a function K on X × Y such that f(x) = sup_{y∈Y} K(x, y) holds? This approach gives freedom to choose Y and K in different ways, so that one can (hopefully) achieve the properties one would like Y and K to have. This idea is called the duality approach.
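As a simple example of these notions, let X = Y = R and K(x, y) = x² − y². Then f(x) = sup_y K(x, y) = x² and g(y) = inf_x K(x, y) = −y², so inf_{x∈X} f(x) = 0 = sup_{y∈Y} g(y), and (x0, y0) = (0, 0) is a saddle point: K(0, y) = −y² ≤ 0 = K(0, 0) ≤ x² = K(x, 0) for all x, y ∈ R.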

1.6 Examples of convex optimization via duality

Example 1.6.1 (Nonlinear programming) Let f0, f1, . . . , fm be real-valued, convex functions on a nonempty, convex set C in the vector space X. The duality approach consists of the following steps:

1. The given problem: min f0(x) over {x ∈ C : fi(x) ≤ 0 ∀ i = 1, . . . , m}.

2. Abstract representation: min f over X, where f(x) = f0(x) if x ∈ C and fi(x) ≤ 0 for i = 1, . . . , m, and f(x) = +∞ for all other x ∈ X.

3. Parametrization: Define (for example) F(x, u) for u = (u1, . . . , um) ∈ Rm by F(x, u) = f0(x) if x ∈ C and fi(x) ≤ ui for i = 1, . . . , m, and F(x, u) = +∞ for all other x. Then, F : X × Rm → [−∞, +∞] is convex and F(x, 0) = f(x).

Example 1.6.2 (Nonlinear programming with infinitely many constraints) Let f0 : C → R¯ where C ⊂ X is convex, and let h : X × S → R be convex in the x-argument, where S is an arbitrary set.

1. The problem: min f0(x) over K = {x ∈ C : h(x, s) ≤ 0 ∀ s ∈ S}.

2. Abstract representation: min f(x) over X, where f(x) = f0(x) if x ∈ K, and f(x) = +∞ for all other x.

3. Parametrization: Choose u analogously with Example 1.6.1: Let U be the linear space of functions u : S → R and let F(x, u) = f0(x) if x ∈ C and h(x, s) ≤ u(s) ∀ s ∈ S, and F(x, u) = +∞ for all other x. As in the previous example, this makes F convex and satisfies F(x, 0) = f(x).

Example 1.6.3 (Stochastic optimization) Let (Ω, F, P) be a probability space and let h : X × Ω → R¯ be convex in the x-argument, where X is a linear, topological space. Let C be a closed, convex subset of X.

1. The general problem: min h(x, ω) over all x ∈ C, where ω is a stochastic element with a known distribution. The difficulty here is that x must be chosen before ω has been observed.

2. We therefore solve the following problem: Minimize the expectation f(x) = ∫_Ω h(x, ω) dP(ω) over all x ∈ X. Here, it is assumed that h is measurable, so that f is well defined. Rockafellar then shows in [18], Theorem 3, that f actually is convex.

3. Parametrization: Let F(x, u) = ∫_Ω h(x − u(ω), ω) dP(ω) + δC(u) for u ∈ U, where U is a linear space of measurable functions and δC is the indicator function of C, as defined in Definition 1.1.16. Then F is (by the same argument as for f) well defined and convex, with F(x, 0) = f(x).

1.7 Conjugate functions in paired spaces

The material in this section is based on Rockafellar [18] and Rockafellar and Wets [20].

Definition 1.7.1 (Pairing of spaces) A pairing of two linear spaces X and V is a real-valued bilinear form ⟨·, ·⟩ on X × V.

The pairing associates with each v ∈ V a linear function ⟨·, v⟩ : x ↦ ⟨x, v⟩ on X, and similarly with each x ∈ X a linear function ⟨x, ·⟩ on V.

Definition 1.7.2 (Compatible topology) Assume there is a pairing between the spaces X and V. A topology on X is compatible with the pairing if it is a locally convex topology such that the linear function ⟨·, v⟩ is continuous, and any continuous linear function on X can be written in this form for some v ∈ V. A compatible topology on V is defined similarly.

Definition 1.7.3 (Paired spaces) X and V are paired spaces if one has cho- sen a pairing between X and V , and the two spaces have compatible topologies with respect to the pairing.

Example 1.7.4 Let X = Rn and V = Rn. Then, the standard Euclidean inner product is a bilinear form, so X and V become paired spaces.

Example 1.7.5 Let X = L1(Ω, F, P) and V = L∞(Ω, F, P). Then X and V are paired via the bilinear form ⟨x, v⟩ = ∫_Ω x(s)v(s) dP(s). Similarly, the spaces X = Lp(Ω, F, P) and V = Lq(Ω, F, P), where 1/p + 1/q = 1, are paired.

We now come to a central notion of convex duality, the conjugate of a function.

Definition 1.7.6 (Conjugate of a function, f∗) Let X and V be paired spaces. For a function f : X → R¯, define the conjugate of f, denoted by f∗ : V → R¯, by

f∗(v) = sup{⟨x, v⟩ − f(x) : x ∈ X}. (1.22)

Finding f∗ is called taking the conjugate of f in the convex sense. One may also define the conjugate g∗ of a function g : V → R¯ in the same way. Similarly, define:

Definition 1.7.7 (Biconjugate of a function, f∗∗) Let X and V be paired spaces. For a function f : X → R¯, define the biconjugate of f, f∗∗, to be the conjugate of f∗, so f∗∗(x) = sup{⟨x, v⟩ − f∗(v) : v ∈ V}.

Definition 1.7.8 (The Fenchel transform) The operation f ↦ f∗ is called the Fenchel transform.

If f : Rn → R¯, then the operation f ↦ f∗ is sometimes called the Legendre-Fenchel transform. To understand why the conjugate function f∗ is important, it is useful to consider it via the epigraph. This is most easily done in Rn, so let f : Rn → R¯ and consider X = Rn = V. From equation (1.22), it is not difficult to show that

(v, b) ∈ epi(f∗) ⇐⇒ b ≥ ⟨v, x⟩ − a for all (x, a) ∈ epi(f). (1.23)

This can also be expressed as

(v, b) ∈ epi(f∗) ⇐⇒ lv,b ≤ f (1.24)

where lv,b(x) := ⟨v, x⟩ − b. So, since specifying a function on Rn is equivalent to specifying its epigraph, equation (1.24) shows that f∗ describes the family of all affine functions that are majorized by f (since all affine functions on Rn are of the form ⟨v, x⟩ − b for fixed v, b). But from equation (1.23)

b ≥ f∗(v) ⇐⇒ b ≥ lx,a(v) for all (x, a) ∈ epi(f).

Here lx,a(v) := ⟨x, v⟩ − a. This means that f∗ is the pointwise supremum of all affine functions lx,a for (x, a) ∈ epi(f). This is illustrated in Figures 1.11 and 1.12. We then have the following very central theorem on duality, which is Theorem 5 in Rockafellar [18]:

Theorem 1.7.9 Let f : X → R¯ be arbitrary. Then the conjugate f∗ is a closed (as defined in Section 1.1), convex function on V and f∗∗ = cl(co(f)). Similarly if one starts with a function in V. In particular, the Fenchel transform induces a one-to-one correspondence f ↦ h, h = f∗, between the closed, convex functions on X and the closed, convex functions on V.

Figure 1.11: Affine functions majorized by f.

Figure 1.12: Affine functions majorized by f∗.

Proof: By definition f∗ is the pointwise supremum of the continuous, affine functions v ↦ ⟨x, v⟩ − α, where (x, α) ∈ epi(f). Therefore, f∗ is convex and lsc, hence it is closed. (v, β) ∈ epi(f∗) if and only if the continuous affine function x ↦ ⟨x, v⟩ − β satisfies f(x) ≥ ⟨x, v⟩ − β for all x ∈ X, that is, if the epigraph of this affine function contains the epigraph of f. Thus, epi(f∗∗) is the intersection of all the nonvertical, closed halfspaces in X × R containing epi(f). This implies, using the characterization of a closed, convex set as an intersection of closed halfspaces, that f∗∗ = cl(co(f)). □

Theorem 1.7.9 implies that if f is convex and closed, then f = f ∗∗. This gives a one-to-one correspondence between the closed convex functions on X, and the same type of functions on V . Hence, all properties and operations on such functions must have conjugate counterparts (see [20]).
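For instance, with X = V = R and the pairing ⟨x, v⟩ = xv, the function f(x) = x²/2 has conjugate f∗(v) = sup_x {xv − x²/2} = v²/2 (the supremum is attained at x = v), and hence f∗∗(x) = x²/2 = f(x), in accordance with Theorem 1.7.9, since f is closed and convex.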

Example 1.7.10 Let X and V be paired spaces, and let f = δL where L ⊆ X is a subspace (so in particular, L is convex) and δL is the indicator function of L, as in Definition 1.1.16. It follows from Theorem 1.1.29 that f = δL is convex. Then

δL∗(v) = sup{⟨x, v⟩ − δL(x) : x ∈ X} = sup{⟨x, v⟩ : x ∈ L}

since ⟨x, v⟩ − δL(x) = −∞ if x ∉ L. This function δL∗ is called the support function of L (and is often denoted by ψL). Note also that

δL∗(v) = δL⊥(v),

because if v ∈ L⊥, then ⟨x, v⟩ = 0 for all x ∈ L, but if v ∉ L⊥ then ⟨x0, v⟩ ≠ 0 for some x0 ∈ L. Hence, since L is a subspace, ⟨x0, v⟩ can be made arbitrarily large by multiplying x0 by either +t or −t (in order to make ⟨x0, v⟩ positive), and letting t → +∞. By a similar argument

δL∗∗ = δ(L⊥)⊥. (1.25)

We will now use conjugate duality to prove a central result in functional analysis, namely that for any subspace L ⊆ X, (L⊥)⊥ = L¯ (see for instance Linear Functional Analysis by Rynne and Youngston [22]).

Theorem 1.7.11 Let L ⊆ X be a subspace. Then (L⊥)⊥ = L¯.

Proof: From Example 1.7.10

δL∗∗ = δ(L⊥)⊥ (1.26)

But then, Theorem 1.7.9 implies that δ(L⊥)⊥ = cl(co(δL)). δL is convex, so co(δL) = δL. To proceed, we make the following claim:

Claim: cl(δL) = δL¯ .

Proof of Claim: From Definition 1.1.28, cl(δL) = lsc(δL) = the largest lower semi-continuous function that is less than or equal to δL. δL¯ is lower semi-continuous since L¯ is closed (from Theorem 1.1.26). Also, from Theorem 1.1.29,

δL¯ ≤ δL since L ⊆ L¯. All that remains to be proved is that if f is lower semi-continuous and f ≤ δL, then f ≤ δL¯. So assume that f is lower semi-continuous and f ≤ δL. For x ∈ L, we know that δL(x) = δL¯(x) = 0, so f(x) ≤ δL(x) = δL¯(x) from the assumption that f ≤ δL. If x ∉ L¯, then δL¯(x) = +∞, so f(x) ≤ δL¯(x) trivially. Finally, if x ∈ L¯ \ L, then δL(x) = +∞, but δL¯(x) = 0. Hence, we must show that f(x) ≤ 0. Since f is lower semi-continuous, Theorem 1.1.26 implies that the sublevel set S0(f) = {x ∈ X : f(x) ≤ 0} is closed. Because f ≤ δL, L ⊆ S0(f), hence (since S0(f) is closed) L¯ ⊆ S0(f), so f(x) ≤ 0 for all x ∈ L¯. So the claim is proved.

The arguments above imply that

δ(L⊥)⊥ = δL∗∗ = cl(co(δL)) = cl(δL) = δL¯

where the final equality uses the claim. But this again implies that (L⊥)⊥ = L¯. □

For a concave function g : X → R¯ one can define the conjugate as:

g∗(v) = inf{⟨x, v⟩ − g(x) : x ∈ X} (1.27)

This is called taking the conjugate of g in the concave sense.

1.8 Dual problems and Lagrangians

This is our situation so far. We have an abstract minimization problem:

(P) min_{x∈X} f(x)

which is assumed to have the representation:

f(x) = F (x, 0),F : X × U → R¯ (1.28) (where U is some linear space). Everything now depends on the choice of U and F . We want to exploit duality, so let X be paired with V , and U paired with Y , where U and Y are linear spaces (the choice of pairings may also be important in applications). Preferably, we want to choose (F,U) such that F is a closed, jointly convex function of x and u.

Definition 1.8.1 (The Lagrange function, K(x, y)) Define the Lagrange function K : X × Y → R¯ to be

K(x, y) = inf{F(x, u) + ⟨u, y⟩ : u ∈ U}. (1.29)

The following theorem is Theorem 6 in Rockafellar [18]. It says that K is a closed, concave function of y which satisfies a certain relation, and that all functions of this form actually are the Lagrange function associated with some function f.

Theorem 1.8.2 The Lagrange function K is closed, concave in y ∈ Y for each x ∈ X, and if F (x, u) is closed and convex in u

f(x) = sup_{y∈Y} K(x, y). (1.30)

Conversely, if K is an arbitrary extended-real valued function on X × Y such that (1.30) holds, and if K is closed and concave in y, then K is the Lagrange function associated with a unique representation f(x) = F (x, 0),F : X ×U → R¯ where F is closed and convex in u. This means that

F(x, u) = sup{K(x, y) − ⟨u, y⟩ : y ∈ Y}.

Further, if F is closed and convex in u, K is convex in x if and only if F (x, u) is jointly convex in x and u.

Proof: Everything in the theorem, apart from the last statement, follows from Theorem 1.7.9. For the last statement, assume that F and K respectively are convex, use the definitions of F and K and that the supremum and infimum of convex functions are convex (see Theorem 1.1.30). 

We now define, motivated by equation (1.30), the dual problem of (P ),

(D) max_{y∈Y} g(y)

where g(y) = inf_{x∈X} K(x, y). Note that this dual problem gives a lower bound on the primal problem, from (1.30), since

K(x, y) ≥ inf_{x∈X} K(x, y) = g(y).

But then

sup_{y∈Y} K(x, y) ≥ sup_{y∈Y} g(y).

So from equation (1.30), f(x) ≥ sup_{y∈Y} g(y). Therefore, taking the infimum with respect to x ∈ X on the left hand side implies (D) ≤ (P). This is called weak duality. Sometimes, one can prove that the dual and primal problems have the same optimal value. If this is the case, there is no duality gap and strong duality holds. The next theorem (Theorem 7 in Rockafellar [18]) is important:

Theorem 1.8.3 The function g in (D) is closed and concave. By taking the conjugate in concave sense, g = −ϕ∗, hence −g∗ = cl(co(ϕ)), so

sup_{y∈Y} g(y) = cl(co(ϕ))(0)

while

inf_{x∈X} f(x) = ϕ(0).

In particular, if F(x, u) is convex in (x, u), then −g∗ = cl(ϕ) and sup_{y∈Y} g(y) = lim inf_{u→0} ϕ(u) (except if 0 ∉ cl(dom(ϕ)) ≠ ∅, and lsc(ϕ) is nowhere finite valued).

For the proof, see Rockafellar [18]. What makes this theorem important is that it converts the question of whether inf_{x∈X} f(x) = sup_{y∈Y} g(y), and the question of whether the saddle value of the Lagrange function K exists, to a question of whether the optimal value function ϕ satisfies ϕ(0) = (cl(co(ϕ)))(0). Hence, if the value function ϕ is convex, the lower semi-continuity of ϕ is a sufficient condition for the absence of a duality gap.

By combining the results of the previous sections, we get the following rough summary of the duality method, based on conjugate duality:

• To begin, there is a minimization problem min_{x∈X} f(x) which cannot be solved directly.

• Find a function F : X × U → R¯, where U is a vector space, such that f(x) = F(x, 0).

• Introduce the linear space Y, paired with U, and define the Lagrange function K : X × Y → R¯ by K(x, y) = inf_{u∈U} {F(x, u) + ⟨u, y⟩}.

• Try to find a saddle point for K. If this succeeds, Theorem 1.5.2 tells us that this gives the solution of (P) and (D).

• Theorem 1.8.3 tells us that K has a saddle point if and only if ϕ(0) = (cl(co(ϕ)))(0). Hence, if the value function ϕ is convex, the lower semi-continuity of ϕ is a sufficient condition for the absence of a duality gap.

We can look at an example illustrating these definitions, based on Example 1.6.1.

Example 1.8.4 (Nonlinear programming) The Lagrange function takes the form

K(x, y) = inf{F(x, u) + ⟨u, y⟩ : u ∈ U}

= inf{f0(x) + ⟨u, y⟩ : u ∈ U, fi(x) ≤ ui for i = 1, . . . , m} if x ∈ C, and +∞ otherwise

= f0(x) + inf{⟨u, y⟩ : u ∈ U, fi(x) ≤ ui for i = 1, . . . , m} if x ∈ C, and +∞ otherwise

= f0(x) + f1(x)y1 + . . . + fm(x)ym if x ∈ C and y ≥ 0 (componentwise), −∞ if x ∈ C and y has a negative component, and +∞ otherwise,

where the last equality follows because if there is at least one negative yj, one can choose uj arbitrarily large and make the above expression arbitrarily small. Therefore, the dual function is

• To begin, there is a minimization problem minx∈X f(x) which cannot be solved directly. • Find a function F : X × U → R¯, where U is a vector space, such that f(x) = F (x, 0). • Introduce the linear space Y , paired with U, and define the Lagrange ¯ function K : X × Y → R by K(x, y) = infu∈U {F (x, u) + hu, yi}. • Try to find a saddle point for K. If this succeeds, Theorem 1.5.2 tells us that this gives the solution of (P ) and (D). • Theorem 1.8.3 tells us that K has a saddle point if and only if ϕ(0) = (cl(co(ϕ)))(0). Hence, if the value function ϕ is convex, the lower semi- continuity of ϕ is a sufficient condition for the absence of a duality gap. We can look at an example illustrating these definitions, based on Exam- ple 1.6.1. Example 1.8.4 (Nonlinear programming) The Lagrange function takes the form K(x, y) = inf{F (x, u) + hu, yi : u ∈ U} ( f (x) + hu, yi; x ∈ C, f (x) ≤ u = inf{ 0 i i : u ∈ U} +∞ + hu, yi; ∀ other x ( f (x) + inf{hu, yi : u ∈ U, f (x) ≤ u }, x ∈ C = 0 i i +∞, otherwise.  inf{f (x) + f (x)y + ... + f (x)y }, u ∈ U, x ∈ C, y ∈ m  0 1 1 m m R+ m = −∞, x ∈ C, y∈ / R+  +∞, otherwise. 1.9. EXERCISES TO CHAPTER ?? 31 where the last equality follows because if there is at least one negative yj, one can choose uj arbitrarily large and make the above expression arbitrarily small. Therefore, the dual function is

g(y) = inf_{x∈X} K(x, y)

= inf_{x∈X} { f0(x) + f1(x)y1 + . . . + fm(x)ym if x ∈ C and y ≥ 0 (componentwise); −∞ if x ∈ C and y has a negative component; +∞ otherwise }

= inf_{x∈C} {f0(x) + f1(x)y1 + . . . + fm(x)ym} if y ≥ 0 (componentwise), and g(y) = −∞ otherwise.

By making some small alterations to the approach above, Rockafellar [18] shows that by beginning with the standard primal linear programming problem (abbreviated LP problem)

max{⟨c, x⟩ : Ax ≤ b, x ≥ 0},

where c and b are given vectors and A is a given matrix, and finding its dual problem (in the above sense), one gets the standard dual LP problem back. That is,

min{⟨b, y⟩ : Aᵀy ≥ c, y ≥ 0}

(see Vanderbei [24]).
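As a small numerical sanity check of this primal/dual pair, here is a minimal Python sketch (the particular data A, b, c are illustrative assumptions, and scipy's linprog is used as the LP solver); both problems should return the same optimal value:

```python
import numpy as np
from scipy.optimize import linprog

# Primal:  max{<c,x> : Ax <= b, x >= 0}.   Dual:  min{<b,y> : A^T y >= c, y >= 0}.
A = np.array([[1.0, 1.0], [2.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([3.0, 2.0])

# linprog minimizes, so the primal max becomes a min of -<c,x>.
primal = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)
# The dual constraint A^T y >= c is rewritten as -A^T y <= -c for linprog.
dual = linprog(b, A_ub=-A.T, b_ub=-c, bounds=[(0, None)] * 2)

print(-primal.fun, dual.fun)   # both print the common optimal value, here 10.0
```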

1.9 Exercises to Chapter 1

Exercise 1.1 : Prove Theorem 1.1.14.

Exercise 1.2 : Prove items 3.-5. of Theorem 1.1.30.

Exercise 1.3 : Prove Proposition 1.2.2.

Exercise 1.4 : Prove that all convex functions are quasiconvex.

Exercise 1.5 : Prove that f(x) = log(x) defined on the positive real numbers is concave and quasiconvex.

Exercise 1.6 : Prove Theorem 1.5.2.

Exercise 1.7 : Prove that Definition 1.1.19 implies Definition 1.1.17. Note that this completes the proof of Theorem 1.1.20.

Chapter 2 Convex risk measures and duality

The purpose of this chapter is to present a possible way to quantify monetary risk. In practice, several different risk measures are used for this purpose. However, we will focus on a class of such measures, called convex risk measures, which satisfy certain economically reasonable properties. In Section 2.1, we define convex risk measures and some corresponding notions, such as coherent risk measures and acceptance sets. Then, we derive some properties of convex risk measures. In Section 1.5 we presented a brief introduction to convex duality theory. This theory is needed in Section 2.2, where we prove dual representation theorems for convex risk measures. These theorems provide an alternative characterisation of these measures. In Section 2.3, we give some examples of measures of risk commonly used in finance.

2.1 Convex and coherent risk measures

In the literature, there are many different methods for quantifying risk, depending on the context and purpose. In this section, we focus on monetary measures of financial risk. We will introduce measures for the risk of a financial position, X, which takes a random value at some set terminal time. This value depends on the current world scenario. An intuitive approach for quantifying risk is the variance. However, the variance does not distinguish between negative and positive deviations. Hence, it is only suitable as a measure of risk in cases where any kind of deviation from the target is a problem. In situations where deviations in one direction are OK, or even good, while deviations in the other direction are bad, the variance is unsuitable. Clearly, the variance is an unsuitable measure of financial risk. In finance, positive deviations are good (earning more money), but negative deviations are bad (earning less money). In order to resolve this, Artzner et al. [1] set up some economically reasonable axioms that a measure of risk should satisfy and thereby introduced coherent risk measures. This notion has later been modified to so-called convex risk measures.


In order to understand convex risk measures properly, we need some essential concepts from measure theory. We recall these definitions here for completeness, and refer the reader to Shilling [23] for a detailed introduction to measure and integration theory. Consider a given scenario space Ω. This may be a finite set Ω = {ω1, ω2, . . . , ωn} or an infinite set. On this space, we can define a σ-algebra F, i.e. a family of subsets of Ω that contains the empty set ∅ and is closed under complements and countable unions. The elements in the σ-algebra F are called measurable sets. (Ω, F) is then called a measurable space. A measurable function is a function f : (Ω, F) → (Ω′, F′) (where (Ω′, F′) is another measurable space) such that for any measurable set, E ∈ F′, the inverse image (preimage), f−1(E), is a measurable set, i.e.,

f−1(E) := {ω ∈ Ω : f(ω) ∈ E} ∈ F.

A random variable is a real-valued measurable function. On a measurable space (Ω, F) one can define a measure, i.e. a non-negative countably additive function µ : F → [0, ∞] such that µ(∅) = 0. Then, (Ω, F, µ) is called a measure space. A signed measure is the same as a measure, but without the non-negativity requirement. A probability measure is a measure P such that P(Ω) = 1. Let P denote the set of all probability measures on (Ω, F), and V the set of all signed measures on (Ω, F). Then V is a vector space (also called linear space), and P ⊆ V is a convex set (check this yourself as an exercise!).

In the following, let Ω be a fixed set of scenarios, or possible states of the world. Note that we make no further assumptions on Ω, so in particular, it may be infinite. Consider the measure space (Ω, F, P), where F is a given σ-algebra on Ω and P is a given probability measure on (Ω, F). A financial position (such as a portfolio of stocks) can be described by a mapping X : Ω → R, where X(ω) is the value of the position at the end of the trading period if the state ω occurs. More formally, X is a random variable. Hence, the dependency of X on ω describes the uncertainty of the value of the portfolio. Let X be a given vector space of such random variables X : Ω → R, which contains the constant functions. For c ∈ R, let c1 ∈ X denote the constant function c1(ω) = c for all ω ∈ Ω. An example of such a space is

Z p p 1/p L (Ω, F,P ) := {f : f is measurable and |f(ω)| dP (ω) < ∞}, 1 ≤ p ≤ ∞. Ω
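To make this setting concrete, the following small sketch (an illustration only; the scenario space, position and measure are hypothetical and not part of the formal development) represents a finite Ω, a position X and a probability measure Q as plain Python lists, so that E_Q[X] becomes a weighted sum over scenarios.

# A finite scenario space Omega = {omega_1, omega_2, omega_3}: a position X and a
# probability measure Q are represented as lists indexed by the scenarios.
X = [100.0, -20.0, 35.0]   # value of the position in each scenario (hypothetical numbers)
Q = [0.5, 0.3, 0.2]        # a probability measure on Omega (non-negative, sums to 1)

def expectation(Q, X):
    """E_Q[X] = sum over scenarios of Q(omega) * X(omega)."""
    return sum(q * x for q, x in zip(Q, X))

print(expectation(Q, X))                  # expected terminal value of the position
print(expectation(Q, [-x for x in X]))    # E_Q[-X], the expected loss

Representing probability measures as such weight vectors is also a convenient way to check numerically that P is a convex subset of V, as in the exercise above.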

A convex risk measure is defined as follows:

Definition 2.1.1 (Convex risk measure) A convex risk measure is a function ρ : X → R which satisfies the following for each X, Y ∈ X:

(i) (Convexity) ρ(λX + (1 − λ)Y) ≤ λρ(X) + (1 − λ)ρ(Y) for 0 ≤ λ ≤ 1.

(ii) (Monotonicity) If X ≤ Y , then ρ(X) ≥ ρ(Y).

(iii) (Translation invariance) If m ∈ R, then ρ(X + m1) = ρ(X) − m.

If ρ(X) ≤ 0, X is acceptable since it does not have a positive risk. On the other hand, if ρ(X) > 0, X is unacceptable. If a convex risk measure also satisfies positive homogeneity, that is if

λ ≥ 0 ⇒ ρ(λX) = λρ(X),

then ρ is called a coherent risk measure. The original definition of a coherent risk measure did not involve convexity directly, but instead required subadditivity:

Definition 2.1.2 (Coherent risk measure) A coherent risk measure is a function π : X → R which satisfies the following for each X, Y ∈ X:

(i) (Positive homogeneity) π(λX) = λπ(X) for λ ≥ 0.

(ii) (Subadditivity) π(X + Y) ≤ π(X) + π(Y).

(iii) (Monotonicity) If X ≤ Y , then π(X) ≥ π(Y ).

(iv) (Translation invariance) If m ∈ R, then π(X + m1) = π(X) − m.
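As a quick sanity check of Definitions 2.1.1 and 2.1.2, the sketch below (illustrative only, assuming a finite scenario space and hypothetical positions) implements the worst-case measure ρ(X) = −min_ω X(ω), which is in fact coherent, and verifies the defining properties numerically.

# Worst-case risk measure on a finite scenario space: rho(X) = -min over omega of X(omega).
def rho(X):
    return -min(X)

X = [30.0, -10.0, 5.0]      # hypothetical positions, chosen so that X <= Y component-wise
Y = [40.0, -5.0, 5.0]
lam, m = 0.4, 7.0

mix = [lam * x + (1 - lam) * y for x, y in zip(X, Y)]
assert rho(mix) <= lam * rho(X) + (1 - lam) * rho(Y) + 1e-12      # convexity
assert rho(X) >= rho(Y)                                           # monotonicity (since X <= Y)
assert abs(rho([x + m for x in X]) - (rho(X) - m)) < 1e-12        # translation invariance
assert abs(rho([2.0 * x for x in X]) - 2.0 * rho(X)) < 1e-12      # positive homogeneity
assert rho([x + y for x, y in zip(X, Y)]) <= rho(X) + rho(Y)      # subadditivity
print(rho(X), rho(Y))                                             # 10.0 5.0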

We can interpret ρ as a capital requirement, that is: ρ(X) is the extra amount of money which should be added to the portfolio in a risk free way to make the position acceptable for an agent. The conditions in Definition 2.1.1 are quite natural. Convexity reflects that diversification reduces risk: the total risk of loss should be reduced when two portfolios are combined into a weighted, mixed portfolio. Roughly speaking, spreading your eggs across several baskets should reduce the risk of broken eggs. Monotonicity says that the downside risk, the risk of loss, is reduced by choosing a portfolio that has a higher value in every possible state of the world. Finally, translation invariance can be interpreted in the following way: ρ(X) is the amount of money one needs to add to the portfolio in order to make it acceptable for an agent. Hence, if one adds a risk free amount m to the portfolio, the capital requirement should be reduced by the same amount. As mentioned, Artzner et al. [1] originally defined coherent risk measures, that is, they also required positive homogeneity. The reason for dropping this requirement in the definition of a convex risk measure is that positive homogeneity means that risk grows linearly with X, and this may not always be the case. In the following, we consider convex risk measures. However, the results can be proved for coherent risk measures as well.

Starting with n convex risk measures, one can derive more convex risk measures, as in the following theorem. This was proven by Rockafellar in [19], Theorem 3, for coherent risk measures.

Theorem 2.1.3 Let ρ_1, ρ_2, . . . , ρ_n be convex risk measures.

1. If λ_1, λ_2, . . . , λ_n ≥ 0 and ∑_{i=1}^n λ_i = 1, then ρ = ∑_{i=1}^n λ_i ρ_i is a convex risk measure as well.

2. ρ = max{ρ_1, ρ_2, . . . , ρ_n} is a convex risk measure.

Proof: 1. Let's check the conditions of Definition 2.1.1. Obviously, ρ : X → R, so we check for any X, Y ∈ X, 0 ≤ λ ≤ 1:

(i): This follows from the facts that a sum of convex functions is again a convex function, and that a non-negative constant times a convex function is still convex.

(ii): If X ≤ Y , then ρ(X) = ∑_{i=1}^n λ_i ρ_i(X) ≥ ∑_{i=1}^n λ_i ρ_i(Y) = ρ(Y).

(iii): If m ∈ R,

ρ(X + m1) = ∑_{i=1}^n λ_i ρ_i(X + m1)
          = ∑_{i=1}^n λ_i (ρ_i(X) − m)
          = ∑_{i=1}^n λ_i ρ_i(X) − m ∑_{i=1}^n λ_i
          = ρ(X) − m.

2. The proof is left as an exercise. 
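To illustrate Theorem 2.1.3 computationally, the sketch below (a hypothetical example on a finite scenario space) combines two simple risk measures, the worst-case loss and the expected loss under the uniform measure, both as a convex combination and as a pointwise maximum.

# Combining risk measures as in Theorem 2.1.3 on a finite scenario space.
def worst_case(X):            # rho_1: worst-case loss
    return -min(X)

def expected_loss(X):         # rho_2: expected loss under the uniform measure
    return -sum(X) / len(X)

def convex_combination(measures, weights):
    return lambda X: sum(w * r(X) for r, w in zip(measures, weights))

def maximum(measures):
    return lambda X: max(r(X) for r in measures)

rho_mix = convex_combination([worst_case, expected_loss], [0.5, 0.5])
rho_max = maximum([worst_case, expected_loss])

X = [30.0, -10.0, 5.0]
print(rho_mix(X), rho_max(X))   # both are convex risk measures by Theorem 2.1.3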

Associated with every convex risk measure ρ, there is a natural set of all acceptable portfolios, called the acceptance set, Aρ, of ρ.

Definition 2.1.4 (The acceptance set of a convex risk measure, Aρ) A convex risk measure ρ induces a set

Aρ = {X ∈ X : ρ(X) ≤ 0}

The set A_ρ is called the acceptance set of ρ.

Conversely, given a class A ⊆ X, one can associate a quantitative risk measure ρ_A to it.

Definition 2.1.5 (Associated measure of risk) Let A ⊆ X be a set of "acceptable" random variables. This set has an associated measure of risk ρ_A defined as follows: For X ∈ X, let

ρ_A(X) = inf{m ∈ R : X + m ∈ A}. (2.1)

This means that ρ_A(X) measures how much one must add to the portfolio X, in a risk free way, to get the portfolio into the set A of acceptable portfolios. This is the same interpretation as for a convex risk measure. The previous definitions show that one can either start with a risk measure and derive an acceptance set, or one can start with a set of acceptable random variables and derive a risk measure.
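The following sketch (an illustration with a hypothetical acceptance set, not a construction from the text) takes A = {X : E_P[X] ≥ 0} on a finite scenario space and approximates ρ_A(X) = inf{m ∈ R : X + m ∈ A} by bisection; for this particular A the infimum equals E_P[−X].

# rho_A(X) = inf{m : X + m in A} for the acceptance set A = {X : E_P[X] >= 0}.
P = [0.5, 0.3, 0.2]                       # reference probability measure (hypothetical)

def acceptable(X):
    return sum(p * x for p, x in zip(P, X)) >= 0.0

def rho_A(X, lo=-1e6, hi=1e6, tol=1e-9):
    """Bisection for the smallest cash amount m making X + m*1 acceptable."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if acceptable([x + mid for x in X]):
            hi = mid
        else:
            lo = mid
    return hi

X = [30.0, -10.0, 5.0]
print(rho_A(X))    # approximately E_P[-X] = -(0.5*30 - 0.3*10 + 0.2*5) = -13

The bisection is justified because acceptability of X + m1 is monotone in m for an acceptance set of this kind.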

Figure 2.1: Illustration of the risk measure ρ_A associated with a set A of acceptable portfolios: the point (X(ω_1), X(ω_2)) is shifted to (X(ω_1) + ρ_A, X(ω_2) + ρ_A), which lies in the set of acceptable portfolios.

Example 2.1.6 (Illustration of the risk measure ρ_A associated with a set A of acceptable portfolios) Let Ω = {ω_1, ω_2}, and let X : Ω → R be a portfolio. Let x = (X(ω_1), X(ω_2)). If the set of acceptable portfolios is as in Figure 2.1, the risk measure ρ_A associated with the set A can be illustrated as in the figure.

Based on this theory, a theorem on the relationship between risk measures and acceptance sets can be derived. The following theorem is a version of Proposition 2.2 in Föllmer and Schied [8].

Theorem 2.1.7 Let ρ be a convex risk measure with acceptance set Aρ. Then:

(i) ρ_{A_ρ} = ρ

(ii) Aρ is a nonempty, convex set.

(iii) If X ∈ Aρ and Y ∈ X are such that X ≤ Y , then Y ∈ Aρ

(iv) If ρ is a coherent risk measure, then Aρ is a convex cone.

Conversely, let A be a nonempty, convex subset of X. Let A be such that if X ∈ A and Y ∈ X satisfy X ≤ Y , then Y ∈ A. Then, the following holds:

(v) ρA is a convex risk measure.

(vi) If A is a convex cone, then ρA is a coherent risk measure.

(vii) A ⊆ A_{ρ_A}.

Proof:

Figure 2.2: Illustration of the proof of Theorem 2.1.7 part (v), showing the set of acceptable portfolios, two portfolios X and Y, their shifted, acceptable versions X + ρ_A(X) and Y + ρ_A(Y), and the convex combinations of X and Y.

(i) For any X ∈ X,

ρ_{A_ρ}(X) = inf{m ∈ R : m + X ∈ A_ρ}
           = inf{m ∈ R : m + X ∈ {Y ∈ X : ρ(Y) ≤ 0}}
           = inf{m ∈ R : ρ(m + X) ≤ 0}
           = inf{m ∈ R : ρ(X) − m ≤ 0}
           = inf{m ∈ R : ρ(X) ≤ m}
           = ρ(X),

where we have used the definition of a convex risk measure (Definition 2.1.1) and of an acceptance set (Definition 2.1.4).

(ii) A_ρ ≠ ∅ because for any X ∈ X, translation invariance gives ρ(X + ρ(X)1) = ρ(X) − ρ(X) = 0, so X + ρ(X)1 ∈ A_ρ. Since ρ is a convex function, its sublevel set A_ρ is a convex set.

(iii) The proof is left as an exercise.

(iv) The proof is left as an exercise.

(v) We check Definition 2.1.1: ρA : X → R. Also, for 0 ≤ λ ≤ 1, X,Y ∈ X

ρ_A(λX + (1 − λ)Y) = inf{m ∈ R : m + λX + (1 − λ)Y ∈ A}
                   ≤ λ inf{m ∈ R : m + X ∈ A} + (1 − λ) inf{m ∈ R : m + Y ∈ A}     (2.2)
                   = λρ_A(X) + (1 − λ)ρ_A(Y),

where the inequality holds because K + L := λρ_A(X) + (1 − λ)ρ_A(Y) is a real number which makes the combined portfolio acceptable:

(K + L) + (λX + (1 − λ)Y) = (K + λX) + (L + (1 − λ)Y)
                          = λ(inf{m ∈ R : m + X ∈ A} + X) + (1 − λ)(inf{m ∈ R : m + Y ∈ A} + Y) ∈ A,

since A is convex (see Figure 2.2). In addition, if X, Y ∈ X and X ≤ Y ,

ρA(X) = inf{m ∈ R : m + X ∈ A} ≥ inf{m ∈ R : m + Y ∈ A} = ρA(Y )

since X ≤ Y implies {m ∈ R : m + X ∈ A} ⊆ {m ∈ R : m + Y ∈ A} (by the assumed monotonicity of A: if m + X ∈ A and m + X ≤ m + Y , then m + Y ∈ A), and the infimum over a larger set is smaller. Finally, for k ∈ R and X ∈ X,

ρ_A(X + k) = inf{m ∈ R : m + X + k ∈ A}
           = inf{s − k ∈ R : s + X ∈ A}
           = inf{s ∈ R : s + X ∈ A} − k
           = ρ_A(X) − k.

Hence, ρA is a convex risk measure. (vi) From (v), all that remains to show is positive homogeneity. For α > 0

ρ_A(αX) = inf{m ∈ R : m + αX ∈ A}
        = inf{m ∈ R : α(m/α + X) ∈ A}
        = inf{m ∈ R : m/α + X ∈ A}
        = inf{αk ∈ R : k + X ∈ A}
        = α inf{k ∈ R : k + X ∈ A}
        = αρ_A(X),

where we have used that A is a convex cone in the third equality. Hence, ρ_A is a coherent risk measure.

(vii) Note that A_{ρ_A} = {X ∈ X : ρ_A(X) ≤ 0} = {X ∈ X : inf{m ∈ R : m + X ∈ A} ≤ 0}. Let X ∈ A; then inf{m ∈ R : m + X ∈ A} ≤ 0, since m = 0 will suffice (because X ∈ A). Hence X ∈ A_{ρ_A}.
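As a numerical cross-check of part (i) of Theorem 2.1.7 (purely illustrative, using the worst-case measure ρ(X) = −min_ω X(ω) on a finite scenario space), one can recover ρ from its own acceptance set A_ρ and compare:

# Check rho_{A_rho} = rho numerically for rho(X) = -min(X) on a finite scenario space.
def rho(X):
    return -min(X)

def rho_from_acceptance_set(X, lo=-1e6, hi=1e6, tol=1e-9):
    """rho_{A_rho}(X) = inf{m : rho(X + m*1) <= 0}, computed by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rho([x + mid for x in X]) <= 0.0:
            hi = mid
        else:
            lo = mid
    return hi

X = [30.0, -10.0, 5.0]
print(rho(X), rho_from_acceptance_set(X))   # both approximately 10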



We would like to derive an alternative, dual characterisation of convex risk measures. However, in order to do so, we need convex duality theory (also called conjugate duality theory). This theory was first introduced by Rockafellar [18].

2.2 A dual characterisation of convex risk measures

Now, we are ready to derive the dual characterisation of a convex risk measure ρ. Therefore, let V be a vector space that is paired with the vector space X of financial positions. For instance, if X is given a Hausdorff topology, so that it becomes a topological vector space (for definitions of these terms, see Pedersen [16]), V can be the set of all continuous linear functionals from X into R, as in Frittelli and Gianin [10]. Using the theory presented in Section 1.5, a dual characterisation of a convex risk measure ρ can be derived. The following Theorem 2.2.1 was originally proved by Frittelli and Gianin [10]. In the following, ρ* denotes the conjugate of ρ in the sense of Definition 1.7.6, ρ** is the biconjugate of ρ as in Definition 1.7.7, and ⟨·, ·⟩ is a pairing.

Theorem 2.2.1 Let ρ : X → R be a convex risk measure. Assume in addition that ρ is lower semi-continuous. Then ρ = ρ**. Hence for each X ∈ X,

ρ(X) = sup{⟨X, v⟩ − ρ*(v) : v ∈ V} = sup{⟨X, v⟩ − ρ*(v) : v ∈ dom(ρ*)},

where ⟨·, ·⟩ is a pairing between X and V.

Proof: Since ρ is a convex risk measure, it is a convex function (see Definition 2.1.1). Hence, the convex hull of ρ is equal to ρ, i.e., co(ρ) = ρ (see Definition 1.1.27). In addition, since ρ is lower semi-continuous and always greater than −∞, ρ is closed (see the comment after Definition 1.1.28), so cl(ρ) = ρ. Therefore cl(co(ρ)) = cl(ρ) = ρ. But Theorem 1.7.9 says that ρ** = cl(co(ρ)), hence ρ = ρ**. The second to last equation in the theorem follows directly from the definition of ρ** (Definition 1.7.7), while the last equation follows because the supremum cannot be attained at a v where ρ*(v) = +∞.

2.2.1 The finite dimensional case

Theorem 2.2.1 is quite abstract, but by choosing a specific set of paired spaces, X and V, some nice results can be derived. The next theorem is due to Föllmer and Schied [7]. Consider the paired spaces X = R^n, V = R^n with the standard Euclidean inner product, denoted ·, as pairing. In the following, let (Ω, F) be a measurable space and let P denote the set of all probability measures over Ω.

Theorem 2.2.2 Assume that Ω is finite. Then, any convex risk measure ρ : X → R can be represented in the form

ρ(X) = sup_{Q∈P} {E_Q[−X] − α(Q)}     (2.3)

where E_Q[·] denotes the expectation with respect to Q and α : P → (−∞, ∞] is a "penalty function" which is convex and closed. Actually, α(Q) = ρ*(−Q) for all Q ∈ P.

Proof: (Luthi and Doege [14]) To show that ρ : X → R (as in Theorem 2.2.2) is a convex risk measure we check Definition 2.1.1: Let λ ∈ [0, 1], m ∈ R,X,Y ∈ X.

(i):

ρ(λX + (1 − λ)Y) = sup_{Q∈P} {E_Q[−(λX + (1 − λ)Y)] − α(Q)}
                 = sup_{Q∈P} {λE_Q[−X] + (1 − λ)E_Q[−Y] − α(Q)}
                 ≤ λ sup_{Q∈P} {E_Q[−X] − α(Q)} + (1 − λ) sup_{Q∈P} {E_Q[−Y] − α(Q)}
                 = λρ(X) + (1 − λ)ρ(Y).

(ii): Assume X ≤ Y . Then −X ≥ −Y , so

ρ(X) = sup_{Q∈P} {E_Q[−X] − α(Q)} ≥ sup_{Q∈P} {E_Q[−Y] − α(Q)} = ρ(Y).

(iii):

ρ(X + m1) = sup_{Q∈P} {E_Q[−(X + m1)] − α(Q)}
          = sup_{Q∈P} {E_Q[−X] − mE_Q[1] − α(Q)}
          = sup_{Q∈P} {E_Q[−X] − m − α(Q)}
          = sup_{Q∈P} {E_Q[−X] − α(Q)} − m
          = ρ(X) − m.

Hence, ρ is a convex risk measure.

Conversely, assume that ρ is a convex risk measure. The conjugate function of ρ, denoted ρ*, is then defined as ρ*(v) = sup_{X∈X} {v · X − ρ(X)} (where · denotes the Euclidean inner product) for all v ∈ V = R^n. Fix an X ∈ X and consider Y_m := X + m1 ∈ X for an arbitrary m ∈ R. Then

ρ*(v) ≥ sup_{m∈R} {v · Y_m − ρ(Y_m)}

because {Y_m}_{m∈R} ⊂ X. This means that

ρ*(v) ≥ sup_{m∈R} {v · (X + m1) − ρ(X + m1)} = sup_{m∈R} {m(v · 1 + 1)} + v · X − ρ(X),

where the equality follows from the translation invariance of ρ (see Definition 2.1.1). The first term in the last expression is only finite if v · 1 + 1 = 0 (where 1 = (1, 1, . . . , 1) ∈ R^n), i.e. if ∑_{i=1}^n v_i = −1 (if not, one can make the first term go towards +∞ by letting m go towards either +∞ or −∞). It is now proved that in order for ρ*(v) < +∞, ∑_{i=1}^n v_i = −1 must hold.

Again, consider an arbitrary, but fixed X ∈ X, X ≥ 0 (here, X ≥ 0 means component-wise). Then, for all λ ≥ 0, we have λX ≥ 0 and λX ∈ X, and hence ρ(λX) ≤ ρ(0), from the monotonicity of ρ (again, see Definition 2.1.1). Therefore, by the same type of arguments as above,

ρ*(v) ≥ sup_{λ≥0} {v · (λX) − ρ(λX)} ≥ sup_{λ≥0} {v · (λX)} − ρ(0).

Here, ρ*(v) is only finite if v · X ≤ 0 for all X ≥ 0, hence v ≤ 0. We then get that the conjugate ρ* is reduced to

ρ*(v) = sup_{X∈X} {v · X − ρ(X)}   if v · 1 = −1 and v ≤ 0,
ρ*(v) = +∞                          otherwise.

Now, define α(Q) = ρ∗(−Q) for all Q ∈ P. From Theorem 2.2.1, ρ = ρ∗∗. But

ρ**(X) = sup_{v∈V} {v · X − ρ*(v)}
       = sup_{Q∈P} {(−Q) · X − α(Q)}
       = sup_{Q∈P} {∑_{i=1}^n Q_i(−X_i) − α(Q)}
       = sup_{Q∈P} {E_Q[−X] − α(Q)},

where Q_i, X_i denote the i'th components of the vectors Q, X respectively. Hence ρ(X) = ρ**(X) = sup_{Q∈P} {E_Q[−X] − α(Q)}.
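To see the representation in Theorem 2.2.2 at work, the sketch below (a crude numerical illustration with assumed, hypothetical penalty functions, not the general construction) evaluates ρ(X) = sup_{Q∈P} {E_Q[−X] − α(Q)} over a grid of probability measures on a three-point scenario space; with α ≡ 0 one recovers the worst-case measure −min_ω X(ω).

import itertools

# Approximate rho(X) = sup_Q { E_Q[-X] - alpha(Q) } over a grid of probability
# measures on a three-point scenario space (alpha = 0 gives the worst-case measure).
def dual_rho(X, alpha, steps=100):
    best = float("-inf")
    for i, j in itertools.product(range(steps + 1), repeat=2):
        q1, q2 = i / steps, j / steps
        if q1 + q2 <= 1.0:
            Q = [q1, q2, 1.0 - q1 - q2]
            value = sum(q * (-x) for q, x in zip(Q, X)) - alpha(Q)
            best = max(best, value)
    return best

X = [30.0, -10.0, 5.0]
print(dual_rho(X, alpha=lambda Q: 0.0))           # close to -min(X) = 10
print(dual_rho(X, alpha=lambda Q: 5.0 * max(Q)))  # a penalised, purely hypothetical example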

Theorem 2.2.2 says that any convex risk measure ρ : R^n → R is a penalised worst-case expected loss: ρ(X) is the supremum, over probability measures Q, of the expected value of −X under Q minus the penalty α(Q). Note that we consider the expectation of −X, not X, since losses are negative in our context. We already know that the penalty function α in Theorem 2.2.2 is of the form α(Q) = ρ*(−Q). Actually, Luthi and Doege [14] proved that it is possible to derive a more intuitive representation of α (see Corollary 2.5 in [14]).

Theorem 2.2.3 Let ρ : R^n → R be a convex risk measure, and let A_ρ be its acceptance set (in the sense of Definition 2.1.4). By Theorem 2.2.2, ρ(X) = sup_{Q∈P} {E_Q[−X] − α(Q)}, where α : P → R is a penalty function. Then, α is of the form

α(Q) = sup_{X∈A_ρ} {E_Q[−X]}.

Proof: It suffices to prove that for all Q ∈ P,

ρ*(−Q) = sup_{X∈X} {E_Q[−X] − ρ(X)} = sup_{X∈A_ρ} {E_Q[−X]},     (2.4)

since we know that α(Q) = ρ*(−Q). For all X ∈ A_ρ, ρ(X) ≤ 0 (see Definition 2.1.4), so E_Q[−X] − ρ(X) ≥ E_Q[−X]. Hence, since A_ρ ⊆ X,

ρ*(−Q) ≥ sup_{X∈A_ρ} {E_Q[−X] − ρ(X)} ≥ sup_{X∈A_ρ} {E_Q[−X]}.

To prove the opposite inequality, and hence to prove equation (2.4), assume for contradiction that there exists Q ∈ P such that ρ*(−Q) > sup_{X∈A_ρ} {E_Q[−X]}. From the definition of the supremum, there exists a Y ∈ X such that

E_Q[−Y] − ρ(Y) > E_Q[−X] for all X ∈ A_ρ.

Note that Y + ρ(Y)1 ∈ A_ρ since ρ(Y + ρ(Y)1) = ρ(Y) − ρ(Y) = 0. Therefore

E_Q[−Y] − ρ(Y) > E_Q[−(Y + ρ(Y)1)] = E_Q[−Y] + ρ(Y)E_Q[−1] = E_Q[−Y] − ρ(Y),

which is a contradiction. Hence, the result is proved.

Together, Theorem 2.2.2 and Theorem 2.2.3 provide a good understanding of convex risk measures in R^n: Any convex risk measure ρ : R^n → R can be written in the form ρ(X) = sup_{Q∈P} {E_Q[−X] − α(Q)}, where α(Q) = sup_{X∈A_ρ} {E_Q[−X]} and A_ρ is the acceptance set of ρ.

2.2.2 A little measure theory

We would like to prove a dual representation theorem for convex risk measures in the case where Ω is infinite as well. In order to do so, we need some concepts from measure theory. For completeness, we include the required definitions here. However, we refer the reader to Schilling [23] for a thorough introduction. First, we consider the Jordan decomposition theorem, which says that any signed measure can be uniquely decomposed into a positive and a negative part:

Theorem 2.2.4 (Jordan decomposition theorem) Every signed measure µ has a unique decomposition into a difference

µ = µ^+ − µ^−

of two positive measures, µ^+ and µ^−, where at least one of these two measures is finite. We say that µ^+ is the positive part, and µ^− is the negative part, of µ.

We also need to define absolutely continuous measures.

Definition 2.2.5 (Absolutely continuous measure) Let µ, ν be two measures on the measurable space (Ω, F). If

F ∈ F, µ(F ) = 0 =⇒ ν(F ) = 0, we say that ν is absolutely continuous w.r.t. µ and write ν << µ.

We also say that µ dominates ν. There is a very useful way of representing an absolutely continuous measure via its dominating measure. This representation is called the Radon-Nikodým theorem. In order to state this theorem, we need the concept of σ-finite measures.

Definition 2.2.6 (σ-finite measure) Let µ be a measure on the measurable space (Ω, F). Then, µ is called σ-finite if the set Ω can be covered by at most countably many measurable sets of finite measure, i.e. there exist F_1, F_2, . . . ∈ F with µ(F_i) < ∞ for all i ∈ N such that

⋃_{i∈N} F_i = Ω.

Now, we are ready to state the important Radon-Nikodým theorem. For a proof of this theorem, we refer to Schilling [23].

Theorem 2.2.7 (Radon-Nikodým theorem) Let µ, ν be two measures on the measurable space (Ω, F). If µ is σ-finite, then the following are equivalent:

• ν(F) = ∫_F f(x) µ(dx) for all F ∈ F, for some almost everywhere unique, non-negative measurable function f.

• ν << µ.

The unique function f is called the Radon-Nikodým derivative and is often denoted by f = dν/dµ.
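On a finite scenario space the Radon-Nikodým derivative takes a very concrete form: if ν << µ and µ({ω}) > 0, then dν/dµ(ω) = ν({ω})/µ({ω}). The sketch below (illustrative only, with hypothetical numbers) computes this density and checks ν(F) = ∫_F (dν/dµ) dµ for one measurable set F.

# Discrete Radon-Nikodym derivative: dnu/dmu(omega) = nu({omega}) / mu({omega})
# whenever mu({omega}) > 0 (and nu << mu forces nu({omega}) = 0 otherwise).
mu = {"w1": 0.5, "w2": 0.3, "w3": 0.2}
nu = {"w1": 0.2, "w2": 0.6, "w3": 0.2}

density = {w: (nu[w] / mu[w] if mu[w] > 0 else 0.0) for w in mu}

F = {"w1", "w3"}                                 # a measurable set
lhs = sum(nu[w] for w in F)                      # nu(F)
rhs = sum(density[w] * mu[w] for w in F)         # integral of dnu/dmu over F w.r.t. mu
print(density, lhs, rhs)                         # lhs equals rhs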

With these results from measure theory at hand, we are ready to generalize the dual representation of convex risk measures to the infinite dimensional case.

2.2.3 The infinite dimensional case

How about infinite-dimensional spaces? Can a similar representation of ρ be derived? This is partially answered in the following Theorem 2.2.8, which is Theorem 2.2 in Ruszczynski and Shapiro [21], modified slightly to our setting. First, let's introduce the setting. Let (Ω, F) be a measurable space and let V̄ be the vector space of all finite signed measures on (Ω, F). For each v ∈ V̄ we know that there exists a Jordan decomposition of v, so v = v^+ − v^− (see Schilling [23]). Let |v| = v^+ + v^−. Let X be a vector space of measurable functions X : Ω → R. Also, let X_+ = {X ∈ X : X(ω) ≥ 0 for all ω ∈ Ω}. This gives a partial order relation on X: for X, Y ∈ X, X ≤ Y means that Y − X ∈ X_+.

Let V ⊆ V̄ be the set of measures v ∈ V̄ such that ∫_Ω |X(ω)| d|v|(ω) < +∞ for all X ∈ X. V is a vector space because the total variation is subadditive and integrals are linear. For example: if v, w ∈ V then |v + w| ≤ |v| + |w|, hence ∫_Ω |X| d|v + w| ≤ ∫_Ω |X| d|v| + ∫_Ω |X| d|w| < +∞. Define the pairing ⟨X, v⟩ = ∫_Ω X(ω) dv(ω). Let V_− ⊆ V be the non-positive measures in V and let P be the set of probability measures in V. Assume the following:

(A): If v ∉ V_− = {v ∈ V : v ≤ 0}, then there exists an X' ∈ X_+ such that ⟨X', v⟩ > 0.

Now, let X and V have topologies so that they become paired spaces under the pairing ⟨·, ·⟩. For example, let X = L^p(Ω, F, P) where P is a measure, and let V be as above. Each signed measure v ∈ V can be decomposed so that v = v_P + v', where v_P is absolutely continuous with respect to P (i.e. P(A) = 0 ⇒ v_P(A) = 0). Then, dv_P = M dP, where M : Ω → R is the Radon-Nikodým density of v_P w.r.t. P. Look at V' := {v ∈ V : ∫_Ω |X(ω)| d|v_P| < +∞} ⊆ V. This is a vector space for the same reasons that V is a vector space. Then, any signed measure v ∈ V' can be identified with the Radon-Nikodým derivative of v_P w.r.t. P, that is, with M. Actually, M ∈ L^q(Ω, F, P), where 1/p + 1/q = 1, because ∫_Ω |M|^q dP = ∫_Ω |M|^{q−1} d|v_P| < +∞. Hence, each signed measure v ∈ V' is identified in L^q by its Radon-Nikodým density with respect to P.

Note that the pairing defined above actually is the usual bilinear form between L^p and L^q, since for p̄ ∈ L^p, q̄ ∈ L^q,

⟨p̄, q̄⟩ = ∫_Ω p̄(ω)q̄(ω) dP(ω) = ∫_Ω p̄(ω)M(ω) dP(ω) = ∫_Ω p̄(ω) dv(ω),     (2.5)

where the second equality follows from the fact that any q̄ ∈ L^q can be viewed as a Radon-Nikodým derivative w.r.t. P of some signed measure v ∈ V', and the third equality from the definition of a Radon-Nikodým derivative. In the following theorem, monotonicity and translation invariance mean the same as in Definition 2.1.1.

Theorem 2.2.8 Let X be a vector space paired with the space V, both of the form above. Let ρ : X → R be a proper, lower semi-continuous, convex function. From Theorem 1.7.9 the following holds: ρ(X) = sup{⟨X, v⟩ − ρ*(v) : v ∈ dom(ρ*)}. Then,

(i) ρ is monotone ⇐⇒ All v ∈ dom(ρ∗) are such that v ≤ 0

(ii) ρ is translation invariant ⇐⇒ v(Ω) = −1 for all v ∈ dom(ρ∗). Hence, if ρ is a convex risk measure (so monotonicity and translation invariance hold), then v ∈ dom(ρ∗) implies that Q := −v ∈ P and

ρ(X) = sup_{Q∈P} {⟨X, −Q⟩ − ρ*(−Q)}
     = sup_{Q∈P} {⟨−X, Q⟩ − ρ*(−Q)}
     = sup_{Q∈P} {E_Q[−X] − α(Q)},

where α(Q) := ρ∗(−Q) is a penalty function and the pairing, i.e. the integral, is viewed as an expectation.

Proof:

(i): Assume monotonicity of ρ. We want to show that ρ*(v) = +∞ for all v ∉ V_−. From assumption (A), v ∉ V_− implies that there exists X' ∈ X_+ such that ⟨X', v⟩ > 0. Take X ∈ dom(ρ), so that ρ(X) < +∞, and consider Y_m := X + mX'. For m ≥ 0, monotonicity implies that ρ(X) ≥ ρ(Y_m) (since Y_m = X + mX' ≥ X because X' ≥ 0). Hence

ρ*(v) ≥ sup_{m∈R_+} {⟨Y_m, v⟩ − ρ(Y_m)}
      = sup_{m∈R_+} {⟨X, v⟩ + m⟨X', v⟩ − ρ(X + mX')}
      ≥ sup_{m∈R_+} {⟨X, v⟩ + m⟨X', v⟩ − ρ(X)},

where the last inequality uses the monotonicity. But since ⟨X', v⟩ > 0, by letting m → +∞, one gets ρ*(v) = +∞ (since X ∈ dom(ρ), so ρ(X) < +∞, and ⟨X, v⟩ is a fixed finite number, because ⟨X, ·⟩ and ⟨·, v⟩ are bounded linear functionals). Hence, monotonicity implies that ρ*(v) = +∞ unless v ≤ 0, so all v ∈ dom(ρ*) are such that v ≤ 0.

Conversely, assume that all v ∈ dom(ρ*) are such that v ≤ 0. Take X, Y ∈ X such that Y ≤ X (i.e. X − Y ≥ 0). Then ⟨Y, v⟩ ≥ ⟨X, v⟩ for all v ≤ 0 (from the linearity of the pairing). Since ρ(X) = sup_{v∈dom(ρ*)} {⟨X, v⟩ − ρ*(v)}, it follows that ρ(X) ≤ ρ(Y). Hence (i) is proved.

(ii): Assume translation invariance. Let 1 : Ω → R denote the random variable constantly equal to 1, so 1(ω) = 1 for all ω ∈ Ω. This random variable is clearly measurable, so 1 ∈ X. For X ∈ dom(ρ),

ρ*(v) ≥ sup_{m∈R} {⟨X + m1, v⟩ − ρ(X + m1)}
      = sup_{m∈R} {m⟨1, v⟩ + ⟨X, v⟩ − ρ(X) + m}
      = sup_{m∈R} {m(v(Ω) + 1) + ⟨X, v⟩ − ρ(X)}.

Hence, ρ*(v) = +∞ unless v(Ω) = ⟨1, v⟩ = −1. Conversely, if v(Ω) = −1, then ⟨X + m1, v⟩ = ⟨X, v⟩ + v(Ω)m = ⟨X, v⟩ − m (where the first equality follows from linearity of the pairing). Hence, translation invariance follows from ρ(X) = sup_{v∈dom(ρ*)} {⟨X, v⟩ − ρ*(v)}.

Föllmer and Schied [7] proved a version of Theorem 2.2.8 for X = L^∞(Ω, F, P), V = L^1(Ω, F, P). In this case, it is sufficient to assume that the acceptance set A_ρ of ρ is weak*-closed (i.e., closed with respect to the coarsest topology that makes all the linear functionals ⟨·, v⟩ originating from the pairing continuous) in order to derive a representation of ρ as above.

2.3 Two commonly used measures of financial risk

In this section, we present two measures of monetary risk which are frequently used in practice. One of these measures, called value at risk, is not coherent, or even convex, in general. One may (rightfully so!) wonder why risk professionals use measures that do not satisfy the economically reasonable conditions of Definitions 2.1.1 and 2.1.2. There are several reasons for this:

• Old habits die hard: These measures were used before the concepts of coherent and convex risk measures were introduced. Hence, people are so used to the old measures that they are hesitant to implement others.

• Simplicity: As we will see in Section 2.3.1, value at risk is a very intuitive concept.

• Good enough: In many practical situations, the results attained are sufficient, even though the measures in question are economically unreasonable in theory.

In the next subsection, we define value at risk and look at its limitations as a measure of financial risk.

2.3.1 Value at risk

Value at risk, VaR, is the most commonly used risk measure in practice. For a given portfolio, time horizon and probability λ, VaR is the maximum potential loss over the time period after excluding the λ worst cases. Like in Section 2.1, let X be a random variable representing a financial position. Note that X may represent one stock, a portfolio of stocks or the financial holdings of an entire firm. As before, negative values of X(ω) correspond to losses, and positive values to profit. Mathematically, VaR is defined as follows: Fix some level λ ∈ (0, 1) (typically close to 0), and define Y := −X. Note that for the random variable Y, losses are positive numbers. Then, VaR_λ is defined as the (1 − λ)-quantile of Y:

VaR_λ(X) := F_Y^{−1}(1 − λ)     (2.6)

VaRλ(X) = − inf{x ∈ R : FX (x) > λ}. (2.7) That is, VaR is the maximum potential loss over the time period after excluding the λ worst cases.
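In practice (as discussed further below), VaR_λ is usually estimated from historical profit/loss data rather than from a known distribution of X. The sketch below (a simplified illustration with hypothetical data, using one of several common empirical-quantile conventions) estimates VaR_λ as an empirical (1 − λ)-quantile of the losses Y = −X.

import math

# Historical (empirical) VaR: the (1 - lambda)-quantile of the losses Y = -X,
# estimated from observed profit/loss data.
def historical_var(pnl, lam):
    losses = sorted(-x for x in pnl)                        # losses, in increasing order
    n = len(losses)
    k = min(max(math.ceil((1.0 - lam) * n) - 1, 0), n - 1)  # index of the empirical quantile
    return losses[k]

pnl = [1.2, -0.4, 0.3, -2.1, 0.8, -0.7, 1.5, -3.0, 0.2, 0.9]   # hypothetical daily P/L
print(historical_var(pnl, lam=0.1))   # 2.1: the worst remaining loss after excluding the 10% worst cases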

Lemma 2.3.1 VaRλ(X) is decreasing in λ.

Proof: Left as an exercise to the reader. 

Equivalently, VaR can be written as

VaR_λ(X) := inf{m ∈ R : P(X + m < 0) ≤ λ},     (2.8)

i.e., VaR_λ(X) is the smallest amount of money that needs to be added to X in order for the probability of a loss to be at most λ.

Financial firms and banks often use VaR to quantify the risk of their investments. This allows the firms to monitor their current risk at any time, and hence measure their potential losses. In practice, firms will specify their VaR depending on the confidence level λ, but also depending on some time horizon. In practice, the definition of VaR (2.6) is typically not used directly for computing the value at risk, since this formula requires that we know the exact distribution of X. Instead, most banks, insurance firms etc. use historical data as an approximation to the exact distribution, and compute the quantile in (2.6) based on this. As an alternative, some firms use Monte Carlo methods based on a stochastic model of the financial markets. Monte Carlo methods are based on random sampling from the stochastic model. The Monte Carlo approach is more time-consuming, and usually involves additional work by an analyst in order to fit the parameters of the model to the relevant problem based on historical data. However, a drawback of the historical data method is that this technique implicitly assumes that the future distribution will be the same as the past one; no further randomness or adaptation to the general economic situation is included.

Note that in our mathematical definition of VaR above, we didn't mention time at all. However, for a practical interpretation of VaR, one should think of our random variable X as the profit/loss random variable for the financial position. So, if a bank wants to compute its one-month VaR, X is the (uncertain) difference between the current value of the bank's financial holdings and the value a month from now.

So how is VaR used and interpreted in practice? Say a portfolio of stocks has a one-day 2% VaR of NOK 10 million. Then, there is a 0.02 probability that the value of the portfolio will decrease by more than NOK 10 million during this day, assuming no trading. On average, the bank will expect to lose more than this 1 out of 50 days. Note that it is very important for the VaR calculation that there is no trading happening in the portfolio. If there is trading, the distribution of the portfolio will change, and hence also the VaR.

Despite its frequent use in practice, value at risk has some major drawbacks as a risk measure:

• In general, VaR_λ is not convex, see Föllmer and Knispel [6]. This means that diversification may increase the risk w.r.t. VaR, which is economically unreasonable.

• In addition, we see from equation (2.6) that VaR_λ ignores extreme losses which occur with small probability. This tail insensitivity makes it an unsuitable measure of risk in situations where the consequences of large losses are very bad.

These drawbacks of VaR as a risk measure are what led to the development of the theory of convex and coherent risk measures. Nevertheless, value at risk is still widely used in practice, despite its deficiencies.

2.3.2 Average value at risk

Average value at risk (AVaR), also called expected shortfall (ES) or conditional value at risk (CVaR), was introduced to mend the deficiencies of value at risk. For λ ∈ (0, 1], the average value at risk is defined as

AVaR_λ(X) := (1/λ) ∫_0^λ VaR_α(X) dα.     (2.9)

Hence, average value at risk can be interpreted as the expected loss in a presupposed percentage of worst cases. Note that

AVaR_λ(X) ≥ VaR_λ(X)

(the proof is left as an exercise). So, when considering the same level λ, the average value at risk is always greater than or equal to the value at risk. Föllmer and Schied [9] prove that AVaR_λ is a coherent risk measure, with the dual representation

AVaR_λ(X) = max_{Q∈Q_λ} E_Q[−X],

where Q_λ := {Q << P : dQ/dP ≤ 1/λ}. That is, Q_λ is the set of all probability measures Q that are absolutely continuous w.r.t. the measure P and whose Radon-Nikodým derivative w.r.t. P is less than or equal to 1/λ (see Schilling [23] for more on these measure theoretical concepts). Note also that for λ = 1, average value at risk reduces to E_P[−X], i.e., the expected loss. Other examples of convex risk measures are shortfall risk and divergence risk measures, but these are beyond the scope of this chapter. We refer the interested reader to Föllmer and Knispel [6].
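Empirically, AVaR_λ is approximately the average of the worst λ-fraction of the observed losses. The sketch below (using the same hypothetical data and conventions as the VaR sketch above; a careful implementation would treat the fractional tail weight more precisely) estimates AVaR_λ and illustrates AVaR_λ(X) ≥ VaR_λ(X), cf. Exercise 2.5.

import math

# Empirical AVaR: the average loss over the worst lambda-fraction of the observations.
def historical_avar(pnl, lam):
    losses = sorted((-x for x in pnl), reverse=True)    # losses, worst first
    k = max(math.ceil(lam * len(losses)), 1)            # number of worst-case observations used
    return sum(losses[:k]) / k

pnl = [1.2, -0.4, 0.3, -2.1, 0.8, -0.7, 1.5, -3.0, 0.2, 0.9]   # same hypothetical data as above
print(historical_avar(pnl, lam=0.2))   # (3.0 + 2.1) / 2 = 2.55, the average of the two worst losses
# Compare with the VaR sketch above: historical_avar(pnl, 0.2) >= historical_var(pnl, 0.2).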

2.4 Exercises to Chapter 2

Exercise 2.1: Prove item 2. of Theorem 2.1.3.

Exercise 2.2: Prove items (iii) and (iv) of Theorem 2.1.7.

Exercise 2.3: Show the alternative representation of VaR in equation (2.7). Make sure you understand the interpretation of this expression. Hint: Make a figure of the distribution.

Exercise 2.4: Prove Lemma 2.3.1.

Exercise 2.5: Prove that

AVaR_λ(X) ≥ VaR_λ(X).

Chapter 3 Solutions to selected exercises

3.1 Solutions to exercises from Chapter 1

Solution to exercise 1.1 :

Proof: Follows from the definitions of convex set, conv(·), intersection, Cartesian product, interior, relative interior and closure. Statement (i) also uses the fact that any convex set must contain all convex combinations of its elements. This can be proved by induction, using that C is convex and that a convex combina- tion of convex combinations is also a convex combination. 

Solution to exercise 1.2 :

3. Use Definition 1.1.17 and the mean value inequality, see for example Kalkulus by Lindstrøm [12], or any other basic calculus book.

4. Follows from Definition 1.1.17 by induction, and the fact that a convex combination of convex combinations is a convex combination.

5. Use Definition 1.1.17 and induction.

Solution to exercise 1.6 :

Proof: One can rewrite the saddle point condition (1.21) as

f(x0) = K(x0, y0) = g(y0).

The theorem then follows from equation (1.20). 

3.2 Solutions to exercises from Chapter 2

Solution to exercise 2.1 : Proof: 2. Again, we check Definition 2.1.1 for any X,Y ∈ X: (i) : If 0 ≤ λ ≤ 1,

ρ(λX + (1 − λ)Y) = max{ρ_1(λX + (1 − λ)Y), . . . , ρ_n(λX + (1 − λ)Y)}
                 ≤ max{λρ_1(X) + (1 − λ)ρ_1(Y), . . . , λρ_n(X) + (1 − λ)ρ_n(Y)}
                 ≤ λ max{ρ_1(X), . . . , ρ_n(X)} + (1 − λ) max{ρ_1(Y), . . . , ρ_n(Y)}
                 = λρ(X) + (1 − λ)ρ(Y).

(ii) : If X ≤ Y ,

ρ(X) = max{ρ1(X), . . . , ρn(X)}

≥ max{ρ1(Y ), . . . , ρn(Y )} = ρ(Y ).

(iii) : For m ∈ R,

ρ(X + m) = max{ρ1(X + m), . . . , ρn(X + m)}

= max{ρ1(X) − m, . . . , ρn(X) − m}

= max{ρ1(X), . . . , ρn(X)} − m = ρ(X) − m.



Solution to exercise 2.2 : Proof:

(iii) Since X ∈ Aρ, ρ(X) ≤ 0, but because Y ∈ X is such that X ≤ Y , ρ(Y ) ≤ ρ(X) (from the definition of a convex risk measure). Hence ρ(Y ) ≤ ρ(X) ≤ 0.

So Y ∈ Aρ (from the definition of an acceptance set).

(iv) Let ρ be a coherent risk measure, and let X, Y ∈ A_ρ and α, β ≥ 0. Then, from the positive homogeneity and subadditivity of coherent risk measures (see Definition 2.1.2), in addition to the definition of A_ρ,

ρ(αX + βY) ≤ αρ(X) + βρ(Y) ≤ α · 0 + β · 0 = 0.

Hence αX + βY ∈ A_ρ, so A_ρ is a convex cone (from the definition of a convex cone).



Solution to exercise 2.5: Note that

AVaR_λ(X) = (1/λ) ∫_0^λ VaR_α(X) dα ≥ (1/λ) ∫_0^λ VaR_λ(X) dα = VaR_λ(X),

where the inequality follows because VaR_α(X) is decreasing in α, so VaR_α(X) ≥ VaR_λ(X) for all α ≤ λ.

Bibliography

[1] Philippe Artzner, Freddy Delbaen, Jean-Marc Eber, and David Heath. Co- herent measures of risk. Mathematical Finance, 9:203–228, 1999.

[2] D. P. Bertsekas, A. Nedić, and A. E. Ozdaglar. Convex analysis and optimization. Athena Scientific, Belmont, 2003.

[3] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, 2009.

[4] G. Dahl. An introduction to convexity, 2000.

[5] K. R. Dahl. Convex duality and mathematical finance. Master thesis, University of Oslo, 2012. https://www.duo.uio.no/bitstream/handle/10852/10746/KristinaRognlienDahlThesis.pdf?sequence=3

[6] Hans Föllmer and Thomas Knispel. Convex risk measures: Basic facts, law-invariance and beyond, asymptotics for large portfolios. In Handbook of the Fundamentals of Financial Decision Making: Part II, pages 507–554. World Scientific, 2013.

[7] Hans Föllmer and Alexander Schied. Convex measures of risk and trading constraints. Finance and Stochastics, 6, 2002.

[8] Hans Föllmer and Alexander Schied. Robust preferences and convex mea- sures of risk. Advances in Finance and Stochastics, 2002.

[9] Hans Föllmer and Alexander Schied. Stochastic Finance. An Introduction in Discrete Time. De Gruyter, Berlin, 2008.

[10] Marco Frittelli and Emanuela Rosazza Gianin. Putting order in risk mea- sures. Journal of Banking and Finance, 26, 2002.

[11] J.-B. Hiriart-Urruty and C. Lemaréchal. Convex Analysis and Minimization Algorithms I. Springer, Berlin Heidelberg, 1993.


[12] Tom Lindstrøm. Kalkulus. Universitetsforlaget, Oslo, 2006.

[13] D. G. Luenberger. Convex programming and duality in normed space. IEEE Transactions on Systems Science and Cybernetics, 2, 1968.

[14] Hans-Jakob Luthi and Jörg Doege. Convex risk measures for portfolio optimization and concepts of flexibility. Mathematical Programming, 104, 2005.

[15] James Munkres. Topology. Prentice Hall, Upper Saddle River, 2000.

[16] Gert Pedersen. Analysis Now. Springer, New York, 2002.

[17] Ralph Tyrell Rockafellar. Convex analysis. Princeton University Press, Princeton, 1972.

[18] Ralph Tyrell Rockafellar. Conjugate Duality and Optimization. SIAM, Philadelphia, 1974.

[19] Ralph Tyrell Rockafellar. Coherent approaches to risk in optimization under uncertainty. INFORMS, 2007.

[20] Ralph Tyrell Rockafellar and Roger Jean-Baptiste Robert Wets. Variational analysis. Springer, Berlin, 2004.

[21] Andrzej Ruszczynski and Alexander Shapiro. Optimization of convex risk functions. Mathematics of Operations Research, 31, 2006.

[22] Bryan Rynne and Martin Youngston. Linear functional analysis. Springer, London, 2008.

[23] René Schilling. Measures, integrals and martingales. Cambridge University Press, Cambridge, 2005.

[24] R. J. Vanderbei. Linear Programming: Foundations and Extensions. Springer, Berlin Heidelberg, 2008.

[25] Gunter Matthias Ziegler. Lectures on polytopes. Springer, Heidelberg, 1995.