
A mini-introduction to convexity

Geir Dahl∗ January 20, 2014

1 Introduction

Convexity, or convex analysis, is an area of mathematics where one studies questions related to two basic objects, namely convex sets and convex functions. Triangles, rectangles and “certain” polygons are examples of convex sets in the plane, and the quadratic function f(x) = ax^2 + bx + c is convex provided that a ≥ 0. Actually, the set of points in the plane on or above the graph of this quadratic function is another example of a convex set. But one may also consider convex sets in IR^n for any n, and convex functions of several variables. Convexity is the mathematical core of optimization, and it plays an important role in many other mathematical areas such as statistics, approximation theory, differential equations and mathematical economics.

This note is meant as a short (probably too short) introduction to some concepts and results in convexity. The focus is on convexity in connection with linear optimization. These notes are meant for two (or three) lectures in the course MAT-INF3100 Linear Optimization, where the main project is to study linear programming, but where some knowledge of convexity is useful. Due to the limited scope of these notes we do not discuss convex functions, except for a few remarks in a couple of places.

Example 1. (Optimization and convex functions) A basic optimization problem is to minimize a real-valued function f of n variables, say f(x) where x = (x_1, ..., x_n) ∈ A and A is the domain of f. Such problems arise in all sorts of applications: economics, statistics (estimation, regression, curve fitting), approximation problems, scheduling and planning problems, image analysis, medical imaging, engineering applications etc.

∗University of Oslo, Dept. of Mathematics ([email protected])

Figure 1: Some convex functions

A global minimum of f is a point x* with

f(x∗) ≤ f(x) for all x ∈ A

where A is the domain of f. Often it is hard to find a global minimum, so one settles for a local minimum point, which satisfies f(x*) ≤ f(x) for all x ∈ A that are sufficiently close to x*. There are several optimization algorithms that are able to locate a local minimum of f. Unfortunately, the function value at a local minimum may be much larger than the global minimum value. This raises the question: are there functions where a local minimum point is also a global minimum? The main answer to this question is: if f is a convex function, and the domain A is a convex set, then a local minimum point is also a global minimum point! Thus, one can find the global minimum of convex functions, whereas this may be hard (or even impossible) in other situations. Some convex functions are illustrated in Figure 1.
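To make this guarantee concrete, here is a minimal numerical sketch in Python; the particular function, step size and starting point are illustrative choices, not taken from these notes. Gradient descent on a convex quadratic: whatever local minimum it finds is automatically the global one.

```python
# Minimal sketch: gradient descent on a convex function f(x1, x2).
# Because f is convex, the local minimum found here is also global.
# The concrete f, gradient, step size and starting point are
# illustrative assumptions.

def f(x1, x2):
    return (x1 - 1) ** 2 + (x2 + 2) ** 2  # a convex quadratic

def grad_f(x1, x2):
    return (2 * (x1 - 1), 2 * (x2 + 2))

x1, x2 = 5.0, 5.0          # arbitrary starting point
step = 0.1                 # fixed step size
for _ in range(200):
    g1, g2 = grad_f(x1, x2)
    x1, x2 = x1 - step * g1, x2 - step * g2

print(f(x1, x2))           # close to 0, attained at (1, -2)
```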

In linear optimization (= linear programming) we minimize a linear function f subject to linear constraints; more precisely, these constraints are linear inequalities and linear equations. The feasible set in this case (the set of points satisfying the constraints) is always a convex set; in fact it is a special convex set called a polyhedron.

Example 2. (Convex set) Loosely speaking, a convex set in IR^2 (or IR^n) is a set “with no holes”. More accurately, a convex set C has the following property: whenever we choose two points in the set, say x, y ∈ C, then all points on the line segment between x and y also lie in C. Some examples of convex sets in the plane are: a sphere (ball), an ellipsoid, a point, a line, a line segment, a rectangle, a triangle; see Fig. 2. But, for instance, a set with a finite number p of points is only convex when p = 1. The union of two disjoint (closed) triangles is also nonconvex.

Figure 2: Some convex sets in the plane.

Example 3. (Approximation) A basic approximation problem, with several applications, may be presented in the following way: given some closed set S ⊆ IR^n and a vector a ∉ S, find a nearest point, defined as a point (vector) x ∈ S which is as close to a as possible among elements in S. Let us measure the distance between vectors using the Euclidean norm, so ||x − y|| = (∑_{j=1}^n (x_j − y_j)^2)^{1/2} for x, y ∈ IR^n. One can show that there is always at least one nearest point provided that S is nonempty and closed (contains its boundary). Now, if S is a convex set, then there is a unique nearest point.
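For one concrete convex set S this unique nearest point (the projection) even has a closed form. The small sketch below is an illustration under that assumption, with S taken to be the Euclidean unit ball: the projection of a is a itself if ||a|| ≤ 1, and a scaled to unit length otherwise.

```python
import math

# Minimal sketch: the unique nearest point (projection) of a onto the
# closed convex set S = {x in R^n : ||x|| <= 1}, the Euclidean unit ball.
# For this particular S the projection has a closed form.

def project_to_unit_ball(a):
    norm = math.sqrt(sum(aj * aj for aj in a))
    if norm <= 1.0:
        return list(a)              # a already lies in S
    return [aj / norm for aj in a]  # scale a back to the boundary

print(project_to_unit_ball([3.0, 4.0]))  # [0.6, 0.8]
```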

2 The definitions

You will now see three basic definitions.

1. A set C ⊆ IR^n is called convex if

   (1 − λ)x_1 + λx_2 ∈ C whenever x_1, x_2 ∈ C and 0 ≤ λ ≤ 1.

   Geometrically, this means that C contains the line segment between each pair of points in C.

2. Let C ⊆ IR^n be a convex set and consider a real-valued function f defined on C. The function f is called convex if the inequality

   f((1 − λ)x + λy) ≤ (1 − λ)f(x) + λf(y)     (1)

   holds for every x, y ∈ C and every 0 ≤ λ ≤ 1.

3. A polyhedron P ⊆ IR^n is defined as the solution set of a system of linear inequalities. Thus, P has the form

   P = {x ∈ IR^n : Ax ≤ b}     (2)

   where A is a real m × n matrix, b ∈ IR^m, and where the vector inequality is interpreted componentwise.
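Definition 2 can be explored numerically. The sketch below is an illustration (the chosen f and the sampling scheme are my own assumptions): it tests the inequality (1) on random points and random λ. Note that a finite sample can only refute convexity, never prove it.

```python
import random

# Minimal sketch: numerically test the convexity inequality (1),
# f((1-lam)*x + lam*y) <= (1-lam)*f(x) + lam*f(y),
# for a concrete f on random points. The f below is an assumption.

def f(x):
    return sum(xj * xj for xj in x)  # convex: squared Euclidean norm

def violates_convexity(f, x, y, lam):
    z = [(1 - lam) * xj + lam * yj for xj, yj in zip(x, y)]
    return f(z) > (1 - lam) * f(x) + lam * f(y) + 1e-12

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(3)]
    y = [random.uniform(-5, 5) for _ in range(3)]
    lam = random.random()
    assert not violates_convexity(f, x, y, lam)
print("no violations found in 1000 samples")
```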

Here are some important comments on these definitions:

• In part 1 of the definition above the point (1 − λ)x_1 + λx_2 is called a convex combination of x_1 and x_2. So the definition of a convex set says that it is closed under taking convex combinations of pairs of points. Actually, one can show that when C is convex it also contains every convex combination of any (finite) set of its points. A convex combination of points x_1, x_2, ..., x_m is a point of the form

  ∑_{j=1}^m λ_j x_j

where the coefficients λ_1, λ_2, ..., λ_m are nonnegative and sum to 1 (see the sketch after these remarks).

• In the definition of a convex function we actually use that the domain C is a convex set: this assures that the point (1 − λ)x + λy lies in C, so the defining inequality for f makes sense.

• In the definition of a polyhedron we consider systems of linear inequalities. Since a linear equation a^T x = α may be written as two linear inequalities, namely a^T x ≤ α and −a^T x ≤ −α, one may also say that a polyhedron is the solution set of a system of linear equations and inequalities.
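As promised in the first remark, here is a small sketch (with illustrative data) that forms convex combinations of finitely many points; with nonnegative weights summing to 1 the result always lies in the convex hull of the points.

```python
import random

# Minimal sketch: a convex combination sum_j lam_j * x_j of m points,
# with random nonnegative weights normalized to sum to 1.

def convex_combination(points, weights):
    assert all(w >= 0 for w in weights)
    assert abs(sum(weights) - 1.0) < 1e-9
    n = len(points[0])
    return [sum(w * p[i] for w, p in zip(weights, points)) for i in range(n)]

random.seed(0)
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]   # a triangle
raw = [random.random() for _ in points]
weights = [r / sum(raw) for r in raw]
print(convex_combination(points, weights))       # lies inside the triangle
```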

Proposition 1. Every polyhedron is a convex set.

Proof. Consider a polyhedron P = {x ∈ IR^n : Ax ≤ b} and let x_1, x_2 ∈ P and 0 ≤ λ ≤ 1. Then

  A((1 − λ)x_1 + λx_2) = (1 − λ)Ax_1 + λAx_2 ≤ (1 − λ)b + λb = b,

which shows that (1 − λ)x_1 + λx_2 ∈ P, and the convexity of P follows.
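A quick numerical companion to this proof (an illustration with made-up A, b and points): every sampled convex combination of two feasible points remains feasible.

```python
import numpy as np

# Minimal sketch echoing Proposition 1 on made-up data: two feasible
# points of P = {x : A x <= b}, and convex combinations of them.

A = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([4.0, 0.0, 0.0])

def in_polyhedron(x):
    return np.all(A @ x <= b + 1e-12)

x1 = np.array([1.0, 2.0])
x2 = np.array([3.0, 0.5])
assert in_polyhedron(x1) and in_polyhedron(x2)
for lam in np.linspace(0.0, 1.0, 11):
    assert in_polyhedron((1 - lam) * x1 + lam * x2)
print("all sampled convex combinations are feasible")
```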

3 Linear optimization and convexity

Recall that a linear programming problem may be written as

maximize   c_1 x_1 + ··· + c_n x_n
subject to
  a_11 x_1 + ··· + a_1n x_n ≤ b_1
  ...                                          (3)
  a_m1 x_1 + ··· + a_mn x_n ≤ b_m
  x_1, ..., x_n ≥ 0,

or more compactly in matrix form

  maximize   c^T x
  subject to Ax ≤ b                            (4)
             x ≥ O.

Here A = [a_ij] is the m × n coefficient matrix with (i, j)th element a_ij, b ∈ IR^m is a column vector, and O denotes a zero vector (here of dimension n). Again vector inequalities should be interpreted componentwise. Thus, the LP feasible set {x ∈ IR^n : Ax ≤ b, x ≥ O} is a polyhedron and therefore a convex set. (Actually, LP may be defined as minimizing or maximizing a linear function over a polyhedron.) As a consequence we have that if x_1 and x_2 are two feasible points, then every convex combination of these points is also feasible. But what can be said about the set of optimal solutions?

Proposition 2. In an LP problem with finite optimal value the set P* of optimal solutions is a convex set; actually P* is a polyhedron.

Proof. Let v* denote the optimal value. Then

  P* = {x ∈ IR^n : Ax ≤ b, x ≥ O, c^T x = v*},

which is a polyhedron.

So, if you have different optimal solutions of an LP problem, every convex combination of these will also be optimal. An attempt to illustrate the geometry of linear programming is given in Fig. 3 (where the feasible region is the solution set of five linear inequalities).
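This is easy to observe numerically. The sketch below (my illustration with made-up data; it uses SciPy's linprog, which minimizes, so the objective is negated) solves a small LP of the form (4):

```python
import numpy as np
from scipy.optimize import linprog

# Minimal sketch: solve max c^T x s.t. Ax <= b, x >= 0 with SciPy.
# linprog minimizes, so we pass -c. The data are made up.

c = np.array([1.0, 1.0])
A = np.array([[1.0, 1.0], [2.0, 1.0]])
b = np.array([4.0, 6.0])

res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)   # an optimal vertex and the optimal value
```

For this data the optimal value 4 is attained on a whole edge of the feasible set, so any convex combination of the optimal vertices (0, 4) and (2, 2) is again optimal, exactly as Proposition 2 predicts.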

Figure 3: Linear programming (the feasible set, level lines c^T x = const., and an optimal vertex x*).

4 The convex hull

Given a (possibly nonconvex) set S it is natural to ask for the smallest convex set containing S. This question is what we consider in this section.

Let S ⊆ IR^n be any set. Define the convex hull of S, denoted by conv(S), as the set of all convex combinations of points in S (see Fig. 4). The convex hull of two points x_1 and x_2 is the line segment between the two points. An important fact is that conv(S) is a convex set, whatever the set S might be. Thus, taking the convex hull becomes a way of producing new convex sets. The following proposition tells us that the convex hull of a set S is the smallest convex set containing S. Recall that the intersection of an arbitrary family of sets consists of the points that lie in all of these sets.

Proposition 3. Let S ⊆ IR^n. Then conv(S) is equal to the intersection of all convex sets containing S. Thus, conv(S) is the smallest convex set containing S.

Proof. It is an exercise to show that conv(S) is convex. Moreover, S ⊆ conv(S); just look at a convex combination of one point! Therefore W ⊆ conv(S), where W is defined as the intersection of all convex sets containing S. Now, consider a convex set C containing S. Then C must contain all convex combinations of points in S. But then conv(S) ⊆ C, and we conclude that W (the intersection of such sets C) must contain conv(S). This completes the proof.

Figure 4: Convex hull: a) the set S; b) conv(S).

Note that if S is convex, then conv(S) = S. The proof is left as an exercise.

We have seen that by taking the convex hull we produce a convex set whatever set we might start with. If we start with a finite set, a very interesting class of convex sets arises. A set P ⊂ IR^n is called a polytope if it is the convex hull of a finite set of points in IR^n. In Figure 3 the shaded area is a polytope (of dimension 2). An example of a three-dimensional polytope is the dodecahedron.

The convex set in Fig. 4 b) is not a polytope. Every polytope is clearly bounded, i.e., it lies in some suitably large ball. A central result in the theory of polytopes is the following: a set is a polytope if and only if it is a bounded polyhedron. We return to this in Section 6.
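Computing the convex hull of a finite point set is a standard task. The sketch below (illustrative points, using SciPy's ConvexHull) recovers exactly the points needed to span the polytope:

```python
import numpy as np
from scipy.spatial import ConvexHull

# Minimal sketch: the convex hull of a finite point set is a polytope;
# ConvexHull reports which input points are its vertices. The points
# below are made up for illustration.

pts = np.array([[0, 0], [2, 0], [0, 2], [2, 2], [1, 1], [0.5, 0.5]])
hull = ConvexHull(pts)
print(pts[hull.vertices])   # the corners of the square; interior points drop out
```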

Polytopes have been studied a lot during the history of mathematics. Today polytope theory is still a fascinating subject with a lot of activity. One of the reasons is its relation to linear programming, because in LP problems the feasible set P is a polyhedron. If, moreover, P is bounded, then it is actually a polytope (see above). This means that there is a finite set S of points which “span” P in the sense that P consists of all convex combinations of the points in S.

Example 4. (LP and polytopes) Consider a polytope

P = conv({x_1, x_2, ..., x_t}).

We want to solve the optimization problem

  max{c^T x : x ∈ P}     (5)

where c ∈ IR^n. As mentioned above, this problem is an LP problem, but we do not worry too much about this now. The interesting thing is the combination of a linear objective function and the fact that the feasible set is a convex hull of finitely many points. To see this, consider an arbitrary feasible point x ∈ P. Then x may be written as a convex combination of the t points x_1, x_2, ..., x_t, say x = ∑_{j=1}^t λ_j x_j for some λ_j ≥ 0, j = 1, ..., t, where ∑_j λ_j = 1. Define now v* = max_j c^T x_j. We then calculate

  c^T x = c^T ∑_{j=1}^t λ_j x_j = ∑_{j=1}^t λ_j c^T x_j ≤ ∑_{j=1}^t λ_j v* = v* ∑_{j=1}^t λ_j = v*.

Thus, v* is an upper bound for the optimal value in the optimization problem (5). We also see that this bound is attained whenever λ_j is positive only for those indices j satisfying c^T x_j = v*. Let J be the set of such indices. We conclude that the set of optimal solutions of the problem (5) is

  conv({x_j : j ∈ J}),

which is another polytope (contained in P). The procedure just described may be useful computationally if the number t of points defining P is not too large. In some cases t is too large, and then we may still be able to solve the problem (5) by different methods, typically linear programming related methods.
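When t is small this procedure is immediate to implement. Here is a sketch with made-up data:

```python
import numpy as np

# Minimal sketch of Example 4: maximize c^T x over P = conv({x_1,...,x_t})
# by evaluating c^T x_j at the spanning points only. Data are made up.

c = np.array([2.0, 1.0])
X = np.array([[0, 0], [3, 0], [0, 3], [2, 2]], dtype=float)  # the x_j

values = X @ c
v_star = values.max()
J = np.where(np.isclose(values, v_star))[0]
print(v_star, X[J])   # optimal value and the optimal spanning points
```

Every convex combination of the printed points is optimal, in line with the conclusion conv({x_j : j ∈ J}) above.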

5 Consistency and Farkas' lemma

Farkas' lemma is a theoretical result concerned with the consistency of a system of linear inequalities. It gives necessary and sufficient conditions for this system to be consistent, i.e., to have at least one solution. Here is Farkas' lemma; we prove it using LP duality theory. Other proofs also exist, for instance using separation theorems in convexity.

Lemma 4. Let A ∈ IR^{m,n} and b ∈ IR^m. Then the linear system Ax ≤ b has at least one solution x if and only if y^T b ≥ 0 for every y ∈ IR^m satisfying y^T A = O and y ≥ O.

Proof. Consider the pair of dual LP problems (P) max c^T x subject to Ax ≤ b, and (D) min y^T b subject to y^T A = c^T, y ≥ O. Let now c = O. Then (D) is feasible (y = O is feasible), so by LP theory (D) either has an optimal solution or is unbounded. If (D) is unbounded, weak LP duality implies that (P) cannot have any feasible solution, i.e., Ax ≤ b is not consistent. On the other hand, if (D) has an optimal solution, then y^T b ≥ 0 for every y ∈ IR^m satisfying y^T A = O, y ≥ O. (For: if y^T b < 0 for some y with y^T A = O and y ≥ O, then (D) would be unbounded; just consider y' = λy and increase λ.) This proves the result.

The geometrical content of Farkas' lemma may be explained using a different version of this lemma: Ax = b has a nonnegative solution if and only if y^T b ≥ 0 for all y with y^T A ≥ 0. Consider the cone C generated by the columns a_1, a_2, ..., a_n of the matrix A; this is the set of all linear combinations of the columns using nonnegative coefficients only. Then b ∈ C means that there is a nonnegative solution x to the linear system Ax = b. Now, if b ∉ C, Farkas' lemma says that there is a vector y with y^T a_j ≤ 0 for j ≤ n and y^T b > 0, which geometrically means that the hyperplane {x ∈ IR^n : y^T x = 0} separates b from a_1, a_2, ..., a_n. See the illustration in Fig. 5.

Figure 5: Geometry of Farkas' lemma: the cone C = cone({a_1, ..., a_4}), the vector b outside C, and the separating hyperplane {x : y^T x = 0}.
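The alternative in Farkas' lemma can also be observed computationally. The sketch below uses made-up data and SciPy's linprog; the box bounds on y are only there to keep the certificate LP bounded, so this is an illustration rather than a general procedure.

```python
import numpy as np
from scipy.optimize import linprog

# Minimal sketch of the Farkas alternative (made-up data): either
# Ax = b has a solution x >= 0, or there is y with y^T A >= 0 and
# y^T b < 0. We look for each via linprog.

A = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([-1.0, 2.0])          # b lies outside the cone of the columns

# Try to find x >= 0 with Ax = b (pure feasibility: zero objective).
primal = linprog(np.zeros(2), A_eq=A, b_eq=b, bounds=[(0, None)] * 2)
print("Ax = b, x >= 0 solvable:", primal.status == 0)

# Look for a certificate: minimize y^T b subject to y^T A >= 0,
# written as -A^T y <= 0, with box bounds to keep the LP bounded.
cert = linprog(b, A_ub=-A.T, b_ub=np.zeros(2), bounds=[(-1, 1)] * 2)
print("certificate y:", cert.x, "y^T b =", cert.fun)  # negative value
```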

6 Main theorem for polyhedra

We now present one of the most important theorems concerning polyhedra and polytopes. Actually, the theorem explains how these two objects are related. The result in its general form was proved in 1936 by T.S. Motzkin, and earlier more specialized versions are due to G. Farkas, H. Minkowski and H. Weyl.


Theorem 5. Let P ⊆ IR^n be a nonempty polyhedron. Then there are vectors v_1, v_2, ..., v_s ∈ IR^n and w_1, w_2, ..., w_t ∈ IR^n such that P consists precisely of the vectors x that may be written

  x = ∑_{i=1}^s λ_i v_i + ∑_{j=1}^t µ_j w_j     (6)

where all λ_i and µ_j are nonnegative and ∑_{i=1}^s λ_i = 1.

Conversely, let v_1, v_2, ..., v_s ∈ IR^n and w_1, w_2, ..., w_t ∈ IR^n. Then the set Q of vectors x of the form (6) is a polyhedron, so there is a matrix A ∈ IR^{m,n} and a vector b ∈ IR^m such that

  Q = {x ∈ IR^n : Ax ≤ b}.

We omit the proof here. The theorem above is an existence result; it shows the existence of certain vectors and matrices, but it does not tell us how to find these. Still, this can indeed be done using, for instance, the Fourier-Motzkin method, which we explain in Section 8. The core of the main theorem above is that polyhedra may be represented in two ways:

1. as the solution set of a linear system Ax ≤ b: this may be considered an exterior description as the intersection of halfspaces.

2. as an interior description via convex combinations and conical combinations of certain vectors.

An immediate consequence of the main theorem is the following result.

Corollary 6. A set P is a polytope if and only if it is a bounded polyhedron.

Theorem 5 has a strong connection to linear programming theory. Consider an LP problem

  max c^T x subject to Ax ≤ b,

and the corresponding feasible polyhedron P = {x ∈ IR^n : Ax ≤ b}, which we assume is nonempty. Let v_1, v_2, ..., v_s ∈ IR^n and w_1, w_2, ..., w_t ∈ IR^n be the vectors as described in the first part of Theorem 5. So every x ∈ P may be written as

  x = ∑_{i=1}^s λ_i v_i + ∑_{j=1}^t µ_j w_j

for suitable nonnegative scalars where ∑_{i=1}^s λ_i = 1. There are now two possibilities:

• c^T w_j > 0 for some j ≤ t. Then the LP is unbounded; we can get an arbitrarily large value of the objective function by moving along the ray {µ w_j : µ ≥ 0}.

• c^T w_j ≤ 0 for all j ≤ t. Then the LP has an optimal solution and, actually, one of the points v_1, v_2, ..., v_s is optimal (as we may let µ_j = 0 for all j); recall here the argument given in Example 4.

Moreover, the points v_1, v_2, ..., v_s in Theorem 5 may be chosen to be extreme points of the polyhedron P. An extreme point of P is defined as a point which cannot be written as a convex combination of other points in P (actually, it suffices to check whether it can be written as the midpoint between two points in P). Thus: x ∈ P is an extreme point if and only if

  x = (1/2)x^1 + (1/2)x^2 for x^1, x^2 ∈ P implies that x^1 = x^2 = x.

Proposition 7. Let A be an m × n matrix with rank m, and consider the polyhedron P = {x ∈ IR^n : Ax = b, x ≥ O}. Let x ∈ P. Then x is a basic solution (in the LP sense) if and only if x is an extreme point of P.

Proof. If x is a basic feasible solution, then (after possibly reordering variables; see the LP lectures) we may write x = (x_B, x_N), where x_B = A_B^{-1} b, x_N = O and A = [A_B  A_N]; here the m × m submatrix A_B is invertible (nonsingular). (This is possible as A has full row rank, so there must exist m linearly independent columns in A.) Assume that x = (1/2)x^1 + (1/2)x^2 for some x^1, x^2 ∈ P. This immediately implies that x^1_N = x^2_N = O (as x_j = 0 and x^1_j, x^2_j ≥ 0 for each j ∈ N). Moreover, Ax^1 = b, so A_B x^1_B + A_N x^1_N = b and therefore x^1_B = A_B^{-1} b = x_B (as x^1_N = O). Similarly, x^2_B = A_B^{-1} b = x_B, so x^1 = x^2 = x, and it follows that x is an extreme point.

Conversely, assume that x is an extreme point of P and consider the indices corresponding to positive components: J = {j ≤ n : x_j > 0}. Claim: the columns in A corresponding to J are linearly independent. Proof of Claim: If J is empty, there is nothing to prove. Otherwise, let a_j denote the jth column in A. From Ax = b we obtain ∑_{j∈J} x_j a_j = b. Assume ∑_{j∈J} λ_j a_j = O. Multiply this equation by a number ε and add it to the previous vector equation; this gives ∑_{j∈J} (x_j + ελ_j) a_j = b. In this equation each x_j is positive, so we can find a suitably small ε_0 > 0 (small compared to the λ_j's) such that x_j + ελ_j > 0 for all ε ∈ [−ε_0, ε_0]. Define x^1 and x^2 by x^1_j = x_j + ε_0 λ_j (j ∈ J), x^1_j = 0 (j ∉ J), and x^2_j = x_j − ε_0 λ_j (j ∈ J), x^2_j = 0 (j ∉ J). Then x^1, x^2 ∈ P (as x^1 ≥ O and Ax^1 = Ax + ε_0 ∑_{j∈J} λ_j a_j = Ax = b; similarly for x^2). Moreover, x = (1/2)x^1 + (1/2)x^2, and since x is an extreme point, we get x^1 = x^2 and therefore λ_j = 0 for each j ∈ J. This proves that the vectors a_j, j ∈ J, are linearly independent.

Finally, we may extend the columns a_j, j ∈ J, to a basis for IR^m by adding m − |J| other columns from A; this is possible since the column rank of A is m (the “extension theorem” in basic linear algebra). So these m columns form a basis A_B in A (a nonsingular submatrix), and since Ax = b (because x ∈ P) and x_N = O we get A_B x_B = b, so x_B = A_B^{-1} b and x is a basic solution.

Thus, the two possibilities discussed in the bullets above correspond to what is known as the fundamental theorem of linear programming.

7 Finding extreme points

An interesting, and sometimes important, task is to determine all (or some) of the extreme points of a given polyhedron P. From the previous section we now have two techniques for finding the extreme points:

1. Use the definition of extreme point. Look at a general point x in the polyhedron P. Try to write it as the midpoint between two other points in P; if so, x is not an extreme point. Modify x suitably so that “it is harder to write it as a midpoint”; this will happen when you force more inequalities to be active at your point. Eventually, if you succeed, you will find all extreme points in this way. Note that this technique is a combinatorial discussion/argument which could be very hard to perform in practice (sometimes impossible!). But the idea is to exploit the structure of the given inequalities/equations somehow. This technique may be used for a general polyhedron, e.g., in the form P = {x ∈ IR^n : Ax ≤ b}.

2. Use Proposition 7. This method may be applied when the polyhedron has the form P = {x ∈ IR^n : Ax = b, x ≥ O}, where the m × n matrix A has rank m. (Remark: there is a similar technique for Ax ≤ b, but we do not go into this here; enough is enough!) The technique is “simply” to determine all bases of A. So, again there will be a combinatorial discussion. In principle you may consider all choices of m columns selected from the n columns (i.e., choosing a subset B of size m from {1, 2, ..., n}), and for each choice you decide if the corresponding submatrix A_B is invertible by solving A_B x_B = b to see if the solution is unique. If so, and if, in addition, x_B ≥ O, then you have found an extreme point x = (x_B, x_N) = (A_B^{-1} b, O). A code sketch of this enumeration is given below.
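Here is a direct implementation of Technique 2 (a brute-force sketch using NumPy; it enumerates all column subsets, so it is only meant for small instances):

```python
import itertools
import numpy as np

# Minimal sketch of Technique 2: enumerate all choices of m columns of A,
# keep those giving an invertible A_B, solve A_B x_B = b, and accept the
# nonnegative solutions as extreme points.

def extreme_points(A, b, tol=1e-9):
    m, n = A.shape
    found = []
    for B in itertools.combinations(range(n), m):
        AB = A[:, list(B)]
        if abs(np.linalg.det(AB)) < tol:
            continue                      # singular submatrix: not a basis
        xB = np.linalg.solve(AB, b)
        if np.all(xB >= -tol):            # feasible basic solution
            x = np.zeros(n)
            x[list(B)] = xB
            found.append(x)
    if not found:
        return np.empty((0, n))
    # different bases may yield the same point; remove duplicates
    return np.unique(np.round(np.array(found), 9), axis=0)
```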

But a warning here: finding all vertices can be a very difficult mathematical challenge. Let us look at some examples where these methods do the job.

Example 5. (i) Let a_1, a_2, ..., a_n and b be given positive numbers and consider the polyhedron

  P = {x ∈ IR^n : a_1 x_1 + a_2 x_2 + ··· + a_n x_n = b, x ≥ O}.

Then P has the required form, so we can apply Technique 2 above. The matrix A is the 1 × n matrix A = [a_1 a_2 ··· a_n], which has rank m = 1. So a basis in A is simply an entry in A, a 1 × 1 submatrix, and it is invertible as each entry is nonzero. So if B = {j}, we solve A_B x_B = b and get x_B = b/a_j, which is positive. The corresponding extreme point is (0, ..., 0, b/a_j, 0, ..., 0)

where the nonzero entry is in position j. Thus, P has n extreme points (and they are the intersections between the coordinate axes and the hyperplane given by a^T x = b).

(ii) Let

  P = {x ∈ IR^4 : 2x_1 + 3x_2 = 12, x_1 + x_2 + x_3 + x_4 = 1, x ≥ O}.

Again we may apply Technique 2 for

  A = [ 2  3  0  0 ]
      [ 1  1  1  1 ]

which has rank 2. The following selections of index sets B correspond to bases in A: {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}. Here {3, 4} does not give a basis, as the first row of the corresponding submatrix is the zero vector (so the submatrix is singular). Note that some of the index sets may lead to the same basis (for example, {2, 3} and {2, 4} do so), but these may give different points x (as the B-parts are in different components). Then, for each basis, we may compute the corresponding basic solution, and the feasible (i.e., nonnegative) basic solutions are then the extreme points. We leave this computation to the reader, or to the code sketch at the end of this example!

(iii) Let

  P = {x ∈ IR^n : ∑_{j=1}^n a_j x_j ≤ b, O ≤ x ≤ e}

where a_j > 0 (j ≤ n), b > 0 and e is the all ones vector. Let us find all extreme points using Technique 1 above.

Consider first an x satisfying ∑_{j=1}^n a_j x_j < b and 0 < x_j < 1 for some j ≤ n. But then x = (1/2)x' + (1/2)x'', where x' = x + εe_j, x'' = x − εe_j and x', x'' ∈ P for suitably small ε > 0 (and where e_j is the jth unit vector). Thus we see that if x is an extreme point satisfying ∑_{j=1}^n a_j x_j < b, then x_j ∈ {0, 1} for each j! This gives the candidate extreme points: those (0, 1)-vectors that satisfy ∑_{j=1}^n a_j x_j < b.

Next, if an extreme point satisfies ∑_{j=1}^n a_j x_j = b, then a slight extension of the same argument shows that x_j ∈ {0, 1} for all except at most one j. Actually, if there were two components strictly between 0 and 1, we could increase one and decrease the other suitably, and violate the extreme point property. Moreover, if one such component x_j is strictly between 0 and 1, then x_j is determined by the equation ∑_{j=1}^n a_j x_j = b.

This gives all candidate extreme points, and it is not difficult to show that all these points are indeed extreme points of P. Again, we leave it to the reader to write down all the extreme points based on this discussion.
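For instance, the extreme_points sketch given earlier in this section carries out the computation left to the reader in (ii):

```python
import numpy as np

# Usage of the extreme_points sketch above on Example 5 (ii).
A = np.array([[2.0, 3.0, 0.0, 0.0],
              [1.0, 1.0, 1.0, 1.0]])
b = np.array([12.0, 1.0])
print(extreme_points(A, b))   # the feasible basic solutions, if any
```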

8 Fourier-Motzkin elimination and projection

Fourier-Motzkin elimination is a computational method which may be seen as a generalization of Gaussian elimination. It is used for finding one or all solutions of a linear system of inequalities, say Ax ≤ b where A ∈ IR^{m,n} and b ∈ IR^m. Moreover, the same method can be used to find the projection of a polyhedron into a subspace. The idea is to eliminate one variable at a time and rewrite the system accordingly. To explain the method we assume that we want to eliminate the variables in the order x_1, x_2, ..., x_n (although any order will do). The system Ax ≤ b may be split into three subsystems

  a_i1 x_1 + a_i2 x_2 + ··· + a_in x_n ≤ b_i   (i ∈ I^+)
  0·x_1 + a_i2 x_2 + ··· + a_in x_n ≤ b_i      (i ∈ I^0)     (7)
  a_i1 x_1 + a_i2 x_2 + ··· + a_in x_n ≤ b_i   (i ∈ I^−)

where I^+ = {i : a_i1 > 0}, I^0 = {i : a_i1 = 0} and I^− = {i : a_i1 < 0}. This system is clearly equivalent to

  a'_k2 x_2 + ··· + a'_kn x_n − b'_k ≤ x_1 ≤ b'_i − a'_i2 x_2 − ··· − a'_in x_n   (i ∈ I^+, k ∈ I^−)
  a_i2 x_2 + ··· + a_in x_n ≤ b_i   (i ∈ I^0)     (8)

where b'_i = b_i/|a_i1| and a'_ij = a_ij/|a_i1| for each i ∈ I^+ ∪ I^−. It follows that x_1, x_2, ..., x_n is a solution of the original system (7) if and only if x_2, x_3, ..., x_n satisfy

  a'_k2 x_2 + ··· + a'_kn x_n − b'_k ≤ b'_i − a'_i2 x_2 − ··· − a'_in x_n   (i ∈ I^+, k ∈ I^−)
  a_i2 x_2 + ··· + a_in x_n ≤ b_i   (i ∈ I^0)     (9)

and x_1 satisfies

  max_{k ∈ I^−} (a'_k2 x_2 + ··· + a'_kn x_n − b'_k) ≤ x_1 ≤ min_{i ∈ I^+} (b'_i − a'_i2 x_2 − ··· − a'_in x_n).     (10)

Note that, when values of x_2, x_3, ..., x_n have been selected, the constraint in (10) says that x_1 lies in a certain interval (determined by x_2, x_3, ..., x_n).

Thus, we have eliminated x_1 and may proceed to solve the system (9), which involves x_2, x_3, ..., x_n. We here eliminate x_2 in a similar way, and proceed until we obtain a linear system which only involves x_n, say l ≤ x_n ≤ u. Then a general solution to Ax ≤ b is obtained by choosing x_n in the interval [l, u], next choosing x_{n−1} in an interval which depends on x_n, then choosing x_{n−2} in an interval depending on x_n and x_{n−1}, etc. We may summarize our findings in the following theorem.

Theorem 8. The Fourier-Motzkin elimination method is a finite algorithm that finds a general solution to a given linear system Ax ≤ b. If there is no solution, the method determines this fact by finding an implied and inconsistent inequality (0 ≤ −1). Moreover, the method finds the projection P' of the given polyhedron P = {x ∈ IR^n : Ax ≤ b} into the space of a subset of the variables, and shows that P' is also a polyhedron by finding a linear inequality description of P'.

Proof. This follows from our description above using induction.

We note that the system we obtain after the elimination of a variable may contain many more inequalities than the original one. However, some of these new inequalities may be redundant, and this is checked for in computer implementations of the algorithm. We conclude with a small illustration of the Fourier-Motzkin elimination method.

Example 6. (Fourier-Motzkin) Consider the system

  1 ≤ x_1 ≤ 2,   1 ≤ x_2 ≤ 4,   x_1 − x_2 ≥ −1.

We eliminate x_1 and get

  max{1, x_2 − 1} ≤ x_1 ≤ 2

and the new system

  1 ≤ 2,   x_2 − 1 ≤ 2,   1 ≤ x_2 ≤ 4.

The last system has two redundant inequalities and is equivalent to

  1 ≤ x_2 ≤ 3.

Thus, the general solution is:

  1 ≤ x_2 ≤ 3,   max{1, x_2 − 1} ≤ x_1 ≤ 2.

Moreover, if P is the polyhedron consisting of the solutions to the original system, then the projection of P into the x_2-space is the interval [1, 3]. These facts may be checked by a suitable drawing in the plane.
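The elimination step (7)-(9) is mechanical enough to code directly. The sketch below is my own formulation, representing each inequality as a pair (a, b) meaning a·x ≤ b; run on Example 6 it reproduces the reduced system above.

```python
# Minimal sketch of one Fourier-Motzkin step: eliminate x1 from a system
# of rows (a, b), each meaning a[0]*x1 + ... + a[-1]*xn <= b. Each pair
# of an upper-bound row (a[0] > 0) and a lower-bound row (a[0] < 0) is
# combined as in (9); rows with a[0] == 0 are kept unchanged.

def eliminate_first(rows):
    upper = [(a, b) for a, b in rows if a[0] > 0]
    lower = [(a, b) for a, b in rows if a[0] < 0]
    new_rows = [(a[1:], b) for a, b in rows if a[0] == 0]
    for au, bu in upper:
        for al, bl in lower:
            cu, cl = au[0], -al[0]        # both positive scaling factors
            coeffs = [cu * lj + cl * uj for lj, uj in zip(al[1:], au[1:])]
            new_rows.append((coeffs, cu * bl + cl * bu))
    return new_rows

# Example 6: 1 <= x1 <= 2, 1 <= x2 <= 4, x1 - x2 >= -1, as rows a.x <= b.
rows = [([1, 0], 2), ([-1, 0], -1), ([0, 1], 4), ([0, -1], -1), ([-1, 1], 1)]
print(eliminate_first(rows))
# [([1], 4), ([-1], -1), ([0], 1), ([1], 3)]: i.e. 1 <= x2 <= 3 plus 0 <= 1
```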

9 Exercises

1. Show that the unit ball B(O, 1) := {x ∈ IR^n : ||x|| ≤ 1} is convex. Here ||·|| denotes the Euclidean norm. Hint: use the triangle inequality. Is the ball B(a, r) := {x ∈ IR^n : ||x − a|| ≤ r} convex for all a ∈ IR^n and r > 0?

2. Prove that every linear subspace of IR^n is a convex set.

3. Is the union of two convex sets again convex?

4. Show that if C_1, ..., C_t ⊆ IR^n are all convex sets, then C_1 ∩ ··· ∩ C_t is convex. In fact, a similar result holds for the intersection of any family of convex sets. Explain this.

5. Is the unit ball B = {x ∈ IR^n : ||x||_2 ≤ 1} a polyhedron?

6. Consider the unit ball B_∞ = {x ∈ IR^n : ||x||_∞ ≤ 1}. Here ||x||_∞ = max_j |x_j| is the max norm of x. Show that B_∞ is a polyhedron. Illustrate when n = 2.

7. Consider the unit ball B_1 = {x ∈ IR^n : ||x||_1 ≤ 1}. Here ||x||_1 = ∑_{j=1}^n |x_j| is the absolute norm of x. Show that B_1 is a polyhedron. Illustrate when n = 2.

8. Explain how you can write the LP problem max{c^T x : Ax ≤ b} in the form max{c_1^T x_1 : A_1 x_1 = b_1, x_1 ≥ O}.

9. Consider the linear system 0 ≤ x_i ≤ 1 for i = 1, ..., n and let P denote the solution set. Explain how to solve a linear programming problem

  max{c^T x : x ∈ P}.

What if the linear system were a_i ≤ x_i ≤ b_i for i = 1, ..., n? Here we assume a_i ≤ b_i for each i.

10. Show that conv(S) is convex for all S ⊆ IR^n. (Hint: look at two convex combinations ∑_j λ_j x_j and ∑_j µ_j y_j, and note that both these points may be written as a convex combination of the same set of vectors.)

11. Give an example of two distinct sets S and T having the same convex hull. It makes sense to look for a smallest possible subset S_0 of a set S such that S = conv(S_0). We study this question later.

12. Prove that if S ⊆ T, then conv(S) ⊆ conv(T).

13. If S is convex, then conv(S) = S. Show this!

14. Let S = {x ∈ IR^2 : ||x||_2 = 1}; this is the unit circle in IR^2. Determine conv(S).

15. Let S = {(0, 0), (1, 0), (0, 1)}. Show that conv(S) = {(x_1, x_2) ∈ IR^2 : x_1 ≥ 0, x_2 ≥ 0, x_1 + x_2 ≤ 1}.

16. Let S consist of the points (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1) and (1, 1, 1). Show that conv(S) = {(x_1, x_2, x_3) ∈ IR^3 : 0 ≤ x_i ≤ 1 for i = 1, 2, 3}. Also determine conv(S \ {(1, 1, 1)}) as the solution set of a system of linear inequalities. Illustrate all these cases geometrically.

17. Use the version of Farkas' lemma given in Lemma 4 to prove the following version: Ax = b has a nonnegative solution if and only if y^T b ≥ 0 for all y with y^T A ≥ 0.

18. Compute explicitly all the extreme points of the polyhedra discussed in Example 5.

19. Find all extreme points of the n-dimensional rectangle P = {x ∈ IR^n : a_i ≤ x_i ≤ b_i (i ≤ n)}, where a_i ≤ b_i (i ≤ n) are given real numbers.

20. Find all extreme points of the polyhedron

  x_1 ≥ 0,   x_2 ≥ 0,   x_1 + x_2 ≤ 1,   x_2 ≤ 1/2.     (11)

21. Find some (or even all!) extreme points of the polyhedron

  x_1 ≥ 0,   x_2 ≥ 0,   x_3 ≥ 0,   x_1 + x_2 + x_3 ≤ 4,   x_1 + x_2 ≥ 1,   x_3 ≤ 2.     (12)

22. Use Fourier-Motzkin elimination to find all solutions to the linear system in (11). Illustrate the solution set geometrically.

23. Use Fourier-Motzkin elimination to find all solutions to the linear system in (12). Illustrate the solution set geometrically.

24. Consider Fourier-Motzkin elimination in the case n = 2 (two variables) and m = 4 (four inequalities). Consider a general (or, if you prefer, a specific) such linear system. Eliminate x_1 and try to interpret the inequalities in the new system geometrically.

10 Further reading

If you are interested in reading more about convexity, there are many topics to choose from, for instance

• Convex functions

• Projection onto convex sets

• Caratheodory's theorem

• Separation theorems

• Polyhedral theory

• Polytopes and graphs

• Polyhedra and combinatorial optimization

• ...

In the references you will find several suggested books for further reading.

Welcome to the world of convexity!

References

[1] V. Chvátal. Linear Programming. W.H. Freeman and Company, 1983.

[2] G. Dahl. An Introduction to Convexity. Report 279, Dept. of Informatics, University of Oslo, 2001.

[3] G. Dahl. Combinatorial properties of Fourier-Motzkin elimination. Electronic Journal of Linear Algebra, 16 (2007), 334–346.

[4] M. Grötschel and M.W. Padberg. Polyhedral theory. In E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy Kan, and D.B. Shmoys, editors, The Traveling Salesman Problem, chapter 8, pages 251–361. Wiley, 1985.

[5] J.-B. Hiriart-Urruty and C. Lemaréchal. Convex Analysis and Minimization Algorithms I. Springer, 1993.

[6] W.R. Pulleyblank. Polyhedral combinatorics. In Nemhauser et al., editors, Optimization, volume 1 of Handbooks in Operations Research and Management Science, chapter 5, pages 371–446. North-Holland, 1989.

[7] R.T. Rockafellar. Convex Analysis. Princeton University Press, 1970.

[8] R. Schneider. Convex Bodies: The Brunn-Minkowski Theory, volume 44 of Encyclopedia of Mathematics and Its Applications. Cambridge University Press, Cambridge, 1993.

[9] E. Torgersen. Comparison of Statistical Experiments, volume 36 of Encyclopedia of Mathematics and Its Applications. Cambridge University Press, Cambridge, 1992.

[10] R. Webster. Convexity. Oxford University Press, Oxford, 1994.

[11] G. Ziegler. Lectures on Polytopes. Springer, 1995.
