
CONVEXITY IN $\mathbb{R}^n$

Juan Pablo Xandri

Up to this point we have studied metric spaces from a very general perspective. What we aim to do now is to study optimization problems on these spaces. Even though there are well-defined optimization methods on metric spaces, the challenge that we will face now is that these spaces are quite diverse. In this chapter we will focus on the analysis of concepts and properties that are specific to the case of $\mathbb{R}^n$, and which will help us in the next chapter to analyze the main optimization methods in $\mathbb{R}^n$. In this and the following notes we will introduce the concepts of convexity and differentiability for the particular case of normed spaces, as is the case of $\mathbb{R}^n$. Even though these concepts can be generalized to spaces such as $\mathcal{L}(\mathbb{R}^m, \mathbb{R}^k)$, $\mathcal{B}(\mathbb{R})$ or even spaces of bounded sequences, this will not be necessary for this course.

Convex Sets

Definition

We say that a set is convex if, whenever we consider any two elements of the set, the line segment connecting these elements is also contained in the set. Graphically, if we have the points $x, y \in A$ as in Figure 10, then the line segment that joins them ($\overline{xy}$) is also contained in the set.

Figure 10

In Figure 11 we present an example of a set that is not convex: by considering the points $x$ and $y$ in the set $A$, we see that there is an element $z \in \overline{xy}$ such that $z \notin A$.

Figure 11

In this section we will formalize the concept of convex sets and present some of their properties.

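Definition 1 (Linear Convex Combination) Given $x, y \in \mathbb{R}^n$, we say that $z \in \mathbb{R}^n$ is a linear convex combination of $x$ and $y$ if and only if there exists $\lambda \in [0,1]$ such that

$$z = \lambda x + (1-\lambda) y.$$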

Definition 2 (Convex Set) We say that a set $C \subseteq \mathbb{R}^n$ is a convex set if for every pair of points $x, y \in C$ and for all $\lambda \in [0,1]$, the point $z \equiv \lambda x + (1-\lambda) y \in C$. Equivalently, a set is said to be convex if whenever $x, y \in C$, all their linear convex combinations are also contained in $C$.

Hence, in order to prove that a set $C$ is convex, we need to prove a "theorem": basically, we consider any two points $x, y \in C$ and a number $\lambda \in [0,1]$ as given, and we need to prove that $\lambda x + (1-\lambda) y \in C$. We will now consider some simple examples of convex sets.

Examples of Convex Sets

(1) Normed Ball: In this example we prove that for any norm N, the set

$$B_N(a, r) = \{x \in \mathbb{R}^n : N(x - a) < r \iff d_N(x, a) < r\}$$

is a convex set. Consider any $x, y \in B_N(a, r)$ and $\lambda \in [0,1]$. We want to show that the point $z = \lambda x + (1-\lambda) y$ is contained in $B_N(a, r)$. In order to do so, notice that

$$N(z - a) = N(\lambda x + (1-\lambda) y - a) = N(\lambda (x - a) + (1-\lambda)(y - a)) \underset{(1)}{\leq} N(\lambda (x - a)) + N((1-\lambda)(y - a)) \underset{(2)}{=} \lambda N(x - a) + (1-\lambda) N(y - a) \underset{(3)}{<} \lambda r + (1-\lambda) r = r,$$

where in (1) we use the subadditivity of the norm, in (2) the property that states that $N(\alpha x) = |\alpha| N(x)$, and in (3) the fact that $x, y \in B_N(a, r)$.

(2) Polyhedra: A polyhedron is a set characterized by a collection of linear equalities and inequalities. Let $A$ be an $m \times n$ matrix, $B$ a $p \times n$ matrix, and $a \in \mathbb{R}^m$ and $b \in \mathbb{R}^p$ two vectors. A polyhedron $\mathcal{P}$ is characterized as follows:

$$\mathcal{P} = \{x \in \mathbb{R}^n : Ax \leq a \text{ and } Bx = b\}.$$

We will prove that any polyhedron is a convex set. Once again, we prove that if $x, y \in \mathcal{P}$ and $\lambda \in [0,1]$, then $z = \lambda x + (1-\lambda) y \in \mathcal{P}$. Thus, we have to prove that $Az \leq a$ and $Bz = b$. Notice that

$$Az = A(\lambda x + (1-\lambda) y) = \lambda Ax + (1-\lambda) Ay \leq \lambda a + (1-\lambda) a = a,$$

$$Bz = B(\lambda x + (1-\lambda) y) = \lambda Bx + (1-\lambda) By = \lambda b + (1-\lambda) b = b.$$

Therefore, $z \in \mathcal{P}$.
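The argument above can also be checked numerically. The following is a minimal sketch, not part of the original notes: the polyhedron (the unit simplex in $\mathbb{R}^3$), the two points and the helper names `in_polyhedron` and `convex_combination` are all hypothetical choices made for illustration.

```python
import random

def in_polyhedron(x, A, a, B, b, tol=1e-9):
    """Return True if A x <= a (row by row) and B x = b, up to a tolerance."""
    dot = lambda row: sum(r * xi for r, xi in zip(row, x))
    ineq_ok = all(dot(row) <= ai + tol for row, ai in zip(A, a))
    eq_ok = all(abs(dot(row) - bi) <= tol for row, bi in zip(B, b))
    return ineq_ok and eq_ok

def convex_combination(lam, x, y):
    """Return lam * x + (1 - lam) * y for vectors given as lists."""
    return [lam * xi + (1 - lam) * yi for xi, yi in zip(x, y)]

# Hypothetical polyhedron: the unit simplex in R^3,
# written as -x <= 0 (three inequalities) and x1 + x2 + x3 = 1.
A = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]
a = [0, 0, 0]
B = [[1, 1, 1]]
b = [1]

x = [0.2, 0.3, 0.5]
y = [0.7, 0.1, 0.2]
rng = random.Random(0)
# Every sampled convex combination of x and y should stay in the polyhedron.
all_inside = all(
    in_polyhedron(convex_combination(rng.random(), x, y), A, a, B, b)
    for _ in range(1000)
)
print(all_inside)
```

A check of this kind can only fail to find a counterexample; it does not replace the proof.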

Properties of Convex Sets

In this section we will prove that convexity is preserved under certain operations over sets. These properties will help us define some fundamental concepts involving convex sets.

Proposition 3 (Intersection of Convex Sets) Let $\mathcal{F}$ be a collection of sets in $\mathbb{R}^n$ such that every $C \in \mathcal{F}$ is convex. Then, the set

$$\mathcal{C} = \bigcap_{C \in \mathcal{F}} C$$

is also convex.

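Proof. Consider $x, y \in \mathcal{C}$ and $\lambda \in [0,1]$. We want to prove that $z = \lambda x + (1-\lambda) y \in \mathcal{C}$. Notice that

$$x \in \mathcal{C} \iff x \in C \text{ for all } C \in \mathcal{F}, \quad \text{and} \quad y \in \mathcal{C} \iff y \in C \text{ for all } C \in \mathcal{F}.$$

Hence, since every $C \in \mathcal{F}$ is convex, $\lambda x + (1-\lambda) y \in C$ for all $C \in \mathcal{F} \implies \lambda x + (1-\lambda) y \in \mathcal{C}$, which is what we wanted to prove.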

This proposition will enable us to define the concept of the convex hull of a set.

Definition 4 (Convex Hull) Let $A \subseteq \mathbb{R}^n$ be any set. We say that $C \supseteq A$ is the convex hull of $A$ (and we write $C = co(A)$) if it is the "smallest" convex set that contains $A$. By smallest we mean that if $A \subseteq B$ and $B$ is convex, then $co(A) \subseteq B$.

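Theorem 5 Given a set $A \subseteq \mathbb{R}^n$, define $\mathcal{F}_A = \{C \subseteq \mathbb{R}^n : A \subseteq C \text{ and } C \text{ is convex}\}$. Then,

$$co(A) = \bigcap_{C \in \mathcal{F}_A} C,$$

and this set is nonempty whenever $A \neq \emptyset$.

Proof. The intersection is well defined because $\mathcal{F}_A \neq \emptyset$, given that $\mathbb{R}^n \in \mathcal{F}_A$. Since $A \subseteq C$ for every $C \in \mathcal{F}_A \implies A \subseteq \bigcap_{C \in \mathcal{F}_A} C$ (so the intersection is nonempty when $A$ is), and by the previous proposition we know that $\bigcap_{C \in \mathcal{F}_A} C$ is a convex set. Hence, this set is a candidate to be the convex hull of $A$.

Since $co(A)$ is the smallest convex set that contains $A$, and $\bigcap_{C \in \mathcal{F}_A} C$ is a convex set that contains $A$, we must have that $co(A) \subseteq \bigcap_{C \in \mathcal{F}_A} C$. On the other hand, $co(A)$ is itself convex and contains $A$, so $co(A) \in \mathcal{F}_A$. Hence, if there is an $x$ such that $x \notin co(A) \implies x \notin \bigcap_{C \in \mathcal{F}_A} C$, that is, $\bigcap_{C \in \mathcal{F}_A} C \subseteq co(A)$. We conclude that $co(A) = \bigcap_{C \in \mathcal{F}_A} C$.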
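For a finite set of points in the plane, the convex hull is a polygon and its vertices can be computed directly. The sketch below is an illustration only: it uses Andrew's monotone chain algorithm, a standard computational method that is not the intersection construction of the theorem, and the point set `A` is a hypothetical example.

```python
def cross(o, p, q):
    """Cross product of vectors op and oq; positive for a left turn."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])

def convex_hull(points):
    """Vertices of co(points) in counterclockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would create a right turn or a collinear step.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]

# Hypothetical set: the corners of a square plus an interior and an edge point.
A = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(A)
print(hull)
```

Only the four corners survive: the points $(1,1)$ and $(1,0)$ are convex combinations of the corners, so they add nothing to $co(A)$.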

We will now prove a result that will be very useful when we want to show that certain sets are convex: the Cartesian product of convex sets is also a convex set.

Proposition 6 (Product of Convex Sets) Let $C_1 \subseteq \mathbb{R}^m$ and $C_2 \subseteq \mathbb{R}^k$ be convex sets. Then, $C = C_1 \times C_2 \subseteq \mathbb{R}^m \times \mathbb{R}^k$ is a convex set.

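Proof. Consider $x = (x_1, x_2) \in C_1 \times C_2$ and $y = (y_1, y_2) \in C_1 \times C_2$, and pick a $\lambda \in [0,1]$. We have that

$$\lambda x + (1-\lambda) y = \lambda (x_1, x_2) + (1-\lambda)(y_1, y_2) = (\lambda x_1 + (1-\lambda) y_1, \; \lambda x_2 + (1-\lambda) y_2) \in C_1 \times C_2,$$

given that $C_1$ and $C_2$ are convex sets.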

Separation of Convex Sets

One of the fundamental results regarding convex sets is the so-called Hyperplane Separation Theorem. Basically, this theorem tells us that if we take two convex sets $C, D \subseteq \mathbb{R}^n$ such that $C \cap D = \emptyset$, then we can find a hyperplane that separates them. In order to understand this, consider the case of $n = 2$. This theorem tells us that we can trace a straight line between both sets which separates the plane into two half-planes, $S_1$ and $S_2$, such that $S_1 \cup S_2 = \mathbb{R}^2$, $C \subseteq S_1$ and $D \subseteq S_2$. We can see this graphically in Figure 12.

Figure 12 (Separation of Convex Sets)

In Figure 13 we can also see an example of how this theorem might fail if one of the two sets is not convex.

Figure 13

In order to be able to prove this theorem, we must formalize the concepts of "lines" and half-spaces.

Definition 7 (Hyperplanes) Given a vector $a \in \mathbb{R}^n$ and a constant $b \in \mathbb{R}$, we say that the set $\mathcal{H}(a, b) \subseteq \mathbb{R}^n$ is a hyperplane if $x \in \mathcal{H}(a, b) \iff a^T x = b$.

Definition 8 (Half-spaces) Given a hyperplane $\mathcal{H}(a, b)$, define the sets $S^+(a, b), S^-(a, b) \subseteq \mathbb{R}^n$ as

$$S^+(a, b) = \{x \in \mathbb{R}^n : a^T x \geq b\}, \text{ and}$$

$$S^-(a, b) = \{x \in \mathbb{R}^n : a^T x \leq b\}.$$

We say that $S^+(a, b)$ is the positive half-space of $\mathcal{H}$ and, accordingly, that $S^-(a, b)$ is the negative half-space of $\mathcal{H}$.

Now that we have de…ned the previous concepts, we can state the Hyperplane Separation Theorem.

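Theorem 9 (Hyperplane Separation Theorem) Let $C, D \subseteq \mathbb{R}^n$ be two nonempty convex sets such that $C \cap D = \emptyset$. Then, there exist $a \in \mathbb{R}^n$ with $a \neq 0$ and $b \in \mathbb{R}$ such that $C \subseteq S^+(a, b)$ and $D \subseteq S^-(a, b)$.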
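In simple cases the separating hyperplane can be written down explicitly. The following sketch, which is only an illustration under stated assumptions, takes $C$ and $D$ to be two disjoint open Euclidean balls (a hypothetical choice), uses the direction joining the centers as $a$, and places $b$ in the gap between the projections of the two balls. The function names are illustrative.

```python
import math
import random

def separating_hyperplane(c1, r1, c2, r2):
    """For two disjoint Euclidean balls, return (a, b) with a^T x >= b on the
    first ball and a^T x <= b on the second. Assumes ||c1 - c2|| > r1 + r2."""
    a = [u - v for u, v in zip(c1, c2)]
    norm = math.sqrt(sum(ai * ai for ai in a))
    # Along direction a, ball 1 projects above a.c1 - r1*norm and
    # ball 2 projects below a.c2 + r2*norm; b is the midpoint of that gap.
    low1 = sum(ai * ci for ai, ci in zip(a, c1)) - r1 * norm
    high2 = sum(ai * ci for ai, ci in zip(a, c2)) + r2 * norm
    return a, (low1 + high2) / 2

def sample_ball(c, r, rng):
    """Draw a point of the open ball by rejection sampling in its bounding box."""
    while True:
        p = [ci + rng.uniform(-r, r) for ci in c]
        if math.dist(p, c) < r:
            return p

rng = random.Random(1)
c1, r1, c2, r2 = [0.0, 0.0], 1.0, [3.0, 1.0], 1.5
a, b = separating_hyperplane(c1, r1, c2, r2)
dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
ok_C = all(dot(a, sample_ball(c1, r1, rng)) >= b for _ in range(500))  # C in S+
ok_D = all(dot(a, sample_ball(c2, r2, rng)) <= b for _ in range(500))  # D in S-
print(ok_C, ok_D)
```

For general convex sets no such closed form is available; the theorem only guarantees that some pair $(a, b)$ exists.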

This theorem will be very relevant for many applications, for example, for the proof of the Second Welfare Theorem (if we have enough time, we will consider it later on as an example).

Brouwer's Fixed-Point Theorem

In this section we will once again study fixed-point problems. Brouwer's theorem, which is very relevant for many applications, is easy to understand, but extremely difficult to prove (it requires the knowledge of concepts of algebraic topology that are well beyond the scope of this course). However, its practical relevance is extraordinary and it involves many concepts that we have been studying up to this point.

Theorem 10 (Brouwer's Fixed-Point Theorem) Consider a function $T : C \to C$ where $C \subseteq \mathbb{R}^n$ is a nonempty, compact and convex set and $T$ is continuous. Then, there exists $x \in C$ such that $T(x) = x$.
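In one dimension the statement can be illustrated by elementary means. The sketch below assumes $C = [0, 1]$ and the hypothetical map $T(x) = \cos(x)$, which is continuous and sends $[0,1]$ into itself. It locates a fixed point by bisection on $g(x) = T(x) - x$, which relies on the intermediate value theorem and does not extend to higher dimensions.

```python
import math

def fixed_point_1d(T, lo=0.0, hi=1.0, tol=1e-12):
    """Locate x in [lo, hi] with T(x) = x by bisection on g(x) = T(x) - x.
    Assumes T is continuous and maps [lo, hi] into itself, so that
    g(lo) >= 0 >= g(hi) and a sign change is always bracketed."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if T(mid) - mid >= 0:
            lo = mid   # the fixed point lies to the right of mid
        else:
            hi = mid   # the fixed point lies to the left of mid
    return (lo + hi) / 2

x_star = fixed_point_1d(math.cos)
print(x_star, math.cos(x_star))
```

The two printed numbers agree to many decimal places, which is what $T(x) = x$ means numerically.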

This theorem does not tell us anything about the uniqueness of the fixed point or how to approximate it, which the Contraction Mapping Theorem does. However, the simple fact that under relatively generic assumptions there exists a fixed point will be of great practical relevance in other courses. For example, this result is used in game theory and general equilibrium to prove the existence of equilibria (as you will see in future courses, equilibria are almost by definition fixed points of certain processes).

Concave and Convex Functions

In this section we will study concave and convex functions. These types of functions, which are very important in probability and measure theory (as we will see later), are of fundamental importance for the study of optimization problems. Actually, almost any optimization problem on any space (not necessarily $\mathbb{R}^n$) has to deal with the concavity and convexity of the objective function.

We will (i) define the concepts of concave and convex functions, (ii) study the basic properties of these functions, (iii) prove what types of operations preserve concavity/convexity, (iv) analyze the relationship that these functions have with probability theory (and measure theory), and finally, (v) study the relationship that concave and convex functions have with optimization problems.

Definition

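Definition 11 (Convex Function) Let $C \subseteq \mathbb{R}^n$ be a convex set, and let $f : C \to \mathbb{R}$ be a function. We say that $f$ is a convex function if and only if for every $x, y \in C$ and for all $\lambda \in [0,1]$ it is the case that

$$f(\lambda x + (1-\lambda) y) \leq \lambda f(x) + (1-\lambda) f(y).$$

Definition 12 (Concave Function) Let $C \subseteq \mathbb{R}^n$ be a convex set, and let $f : C \to \mathbb{R}$ be a function. We say that $f$ is a concave function if and only if for every $x, y \in C$ and for all $\lambda \in [0,1]$ it is the case that

$$f(\lambda x + (1-\lambda) y) \geq \lambda f(x) + (1-\lambda) f(y).$$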

Intuitively, a function is convex if, whenever we draw a line segment between two points on the graph of the function, the function lies on or below the segment at every point between them. Mathematically, this means that every linear convex combination of two values of $f$ is greater than or equal to the value of $f$ at the corresponding linear convex combination in the domain. This concept can be easily seen graphically.

Convexity Graph

It is worth noticing that restricting the domain to convex sets is natural, given that we should be able to evaluate the function at any convex combination of points. Concave functions are simply the ones for which the inequality is reversed. An obvious result obtained from the previous definitions is the following proposition, which we should keep in mind in any application of convex functions.

Concavity Graph

Proposition 13 (Relationship Concave-Convex Functions) Let $C \subseteq \mathbb{R}^n$ be a convex set and $f : C \to \mathbb{R}$ a function. We have that

$$f \text{ is concave} \iff g \equiv -f \text{ is convex}.$$

Proof. Evident by simply multiplying the inequality in the definitions of concavity and convexity by $(-1)$.

We will now define the concepts of strictly concave and convex functions. These functions will be very important whenever we want to show that some optimization problems have unique solutions.

Definition 14 (Strictly Convex Functions) Let $C \subseteq \mathbb{R}^n$ be a convex set and $f : C \to \mathbb{R}$ a convex function. We say that $f$ is strictly convex if and only if for every $x, y \in C$ such that $x \neq y$ and for all $\lambda \in (0,1)$, it is the case that

$$f(\lambda x + (1-\lambda) y) < \lambda f(x) + (1-\lambda) f(y).$$

Definition 15 (Strictly Concave Functions) Let $C \subseteq \mathbb{R}^n$ be a convex set and $f : C \to \mathbb{R}$ a concave function. We say that $f$ is strictly concave if and only if for every $x, y \in C$ such that $x \neq y$ and for all $\lambda \in (0,1)$, it is the case that

$$f(\lambda x + (1-\lambda) y) > \lambda f(x) + (1-\lambda) f(y).$$

Notice that these definitions are very similar to those of convexity and concavity, but in this case we restrict attention to pairs of different points and strict linear convex combinations, in the sense that $x \neq \lambda x + (1-\lambda) y \neq y$.

Exercise 1: Let $f : C \to \mathbb{R}$, where $C \subseteq \mathbb{R}^n$ is a convex set and $f$ is a convex function. Show that if we have $x_1, x_2, \ldots, x_k \in C$ and real numbers $\{\lambda_i\}_{i=1}^{k}$ such that $\lambda_i \geq 0$ and $\sum_{i=1}^{k} \lambda_i = 1$, it is the case that

$$f\left(\sum_{i=1}^{k} \lambda_i x_i\right) \leq \sum_{i=1}^{k} \lambda_i f(x_i).$$

Hint: We know this is true for $k = 2$. In order to prove it for the general case, suppose that this is true for $k = h$ and show that it holds for $k = h + 1$ (this type of proof is known as proof by "complete induction").
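Before attempting the proof, it can help to see the inequality of the exercise hold on random data. The sketch below is only a numerical illustration, with $f = \exp$ on $\mathbb{R}$ as a hypothetical convex function and `jensen_gap` as an illustrative helper name; it is not a substitute for the induction argument.

```python
import math
import random

def jensen_gap(f, xs, lams):
    """Return sum_i lam_i f(x_i) - f(sum_i lam_i x_i); nonnegative when f is convex."""
    mean_x = sum(l * x for l, x in zip(lams, xs))
    return sum(l * f(x) for l, x in zip(lams, xs)) - f(mean_x)

rng = random.Random(0)
gaps = []
for _ in range(1000):
    k = rng.randint(2, 6)
    xs = [rng.uniform(-3, 3) for _ in range(k)]
    w = [rng.random() for _ in range(k)]
    lams = [wi / sum(w) for wi in w]   # nonnegative weights that sum to 1
    gaps.append(jensen_gap(math.exp, xs, lams))

# Allow a tiny negative value to absorb floating-point rounding.
print(min(gaps) >= -1e-12)
```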

Examples of Concave and Convex Functions

(1): Affine Functions

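Let $f : \mathbb{R}^n \to \mathbb{R}$ be a function defined as

$$f(x) = a^T x + b$$

with $a \in \mathbb{R}^n$, $b \in \mathbb{R}$. We will prove that $f$ is a convex, as well as a concave, function. Notice that a function is both convex and concave if and only if for every $x, y \in \mathbb{R}^n$ and $\lambda \in [0,1]$, we have that $f(\lambda x + (1-\lambda) y) = \lambda f(x) + (1-\lambda) f(y)$. Notice that

$$f(\lambda x + (1-\lambda) y) = a^T(\lambda x + (1-\lambda) y) + b = \lambda (a^T x + b) + (1-\lambda)(a^T y + b) = \lambda f(x) + (1-\lambda) f(y),$$

which proves the desired result.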

(2): Norms

Consider any norm $N : \mathbb{R}^n \to \mathbb{R}$. We will show that $N$ is, in fact, a convex function. So, consider $x, y \in \mathbb{R}^n$ and $\lambda \in [0,1]$. Then, we have that

$$N(\lambda x + (1-\lambda) y) \underset{(1)}{\leq} N(\lambda x) + N((1-\lambda) y) \underset{(2)}{=} |\lambda| N(x) + |1-\lambda| N(y) \underset{(3)}{=} \lambda N(x) + (1-\lambda) N(y),$$

where (1) uses the subadditivity of the norm, (2) the proportionality of norms, and finally (3) the fact that $\lambda \in [0,1]$. Notice that this result shows us that functions like

$$N_2(x) = \sqrt{\sum_{i=1}^{n} x_i^2}, \qquad N_1(x) = \sum_{i=1}^{n} |x_i|, \qquad \text{and} \qquad N_\infty(x) = \max_i |x_i|$$

are convex functions.
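The convexity inequality for the three norms above can be spot-checked on random vectors. This is a minimal sketch; the dimension, sampling range, tolerance and the helper name `convexity_violations` are arbitrary choices made for illustration.

```python
import math
import random

norms = {
    "N1": lambda x: sum(abs(xi) for xi in x),
    "N2": lambda x: math.sqrt(sum(xi * xi for xi in x)),
    "Ninf": lambda x: max(abs(xi) for xi in x),
}

def convexity_violations(N, dim=4, trials=2000, seed=0):
    """Count sampled (x, y, lam) with N(lam x + (1-lam) y) > lam N(x) + (1-lam) N(y)."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(dim)]
        y = [rng.uniform(-5, 5) for _ in range(dim)]
        lam = rng.random()
        z = [lam * xi + (1 - lam) * yi for xi, yi in zip(x, y)]
        # The small slack absorbs floating-point rounding.
        if N(z) > lam * N(x) + (1 - lam) * N(y) + 1e-12:
            bad += 1
    return bad

print({name: convexity_violations(N) for name, N in norms.items()})
```

No violations should be reported for any of the three norms.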

Relation Between Concave and Convex Functions and Convex Sets

We will now prove that many convex sets can be defined by applying the concept of convex function.

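Proposition 16 Let $C \subseteq \mathbb{R}^n$ be a convex set, $f : C \to \mathbb{R}$ a convex function, and define $A \subseteq C$ as

$$A = \{x \in C : f(x) \leq 0\}.$$

Then $A$ is a convex set.

Proof. Consider $x, y \in A$ and $\lambda \in [0,1]$. We want to show that $z \equiv \lambda x + (1-\lambda) y \in A \iff f(z) \leq 0$. We have that

$$f(z) = f(\lambda x + (1-\lambda) y) \underset{(1)}{\leq} \lambda f(x) + (1-\lambda) f(y) \underset{(2)}{\leq} 0,$$

using in (1) that $f$ is convex and in (2) that $x, y \in A$ and that $\lambda \in [0,1]$.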

Proposition 17 Let $C \subseteq \mathbb{R}^n$ be a convex set, $f : C \to \mathbb{R}$ a concave function, and define $B \subseteq C$ as

$$B = \{x \in C : f(x) \geq 0\}.$$

Then $B$ is a convex set.

Proof. Simply notice that $B = \{x \in C : (-f)(x) \leq 0\}$ and use the previous result with the convex function $\tilde{f} = -f$.

Operations that Preserve Convexity

We will now present a list of results that will be useful to determine if certain functions are concave or convex.

Theorem 18 (Weighted Sums of Concave (Convex) Functions) Let $C \subseteq \mathbb{R}^n$ be a convex set and $\{f_i\}_{i=1}^{k}$ a set of concave (convex) functions $f_i : C \to \mathbb{R}$. Given numbers $\alpha_i \geq 0$ with $i = 1, 2, \ldots, k$, the function $f^* : C \to \mathbb{R}$ defined as

$$f^*(x) = \sum_{i=1}^{k} \alpha_i f_i(x)$$

is also a concave (convex) function.

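Proof. We will provide a proof only for concave functions, since for convex functions it is almost identical. Consider $x, y \in C$ and $\lambda \in [0,1]$. Then, we have that

$$f^*(\lambda x + (1-\lambda) y) = \sum_{i=1}^{k} \alpha_i f_i(\lambda x + (1-\lambda) y) \underset{(1)}{\geq} \sum_{i=1}^{k} \alpha_i \left(\lambda f_i(x) + (1-\lambda) f_i(y)\right) = \lambda f^*(x) + (1-\lambda) f^*(y),$$

where (1) uses the concavity of each $f_i$ together with $\alpha_i \geq 0$. This implies that $f^*$ is a concave function.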

Theorem 19 (Integration of Concave (Convex) Functions) Let $C \subseteq \mathbb{R}^n$ be a convex set, $Y \subseteq \mathbb{R}^k$ a bounded set, and $f : C \times Y \to \mathbb{R}$ a function such that $f(\cdot, y)$ is concave (convex) for every $y \in Y$ and $f(x, \cdot)$ is integrable for every $x \in C$. Then, the function $f^* : C \to \mathbb{R}$ defined as

$$f^*(x) = \int_{y \in Y} f(x, y) \, dy$$

is a concave (convex) function.

Proof. Let us prove this result for the case of concave functions (the case for convex functions is identical). Since $f(x, \cdot)$ is integrable for every $x \in C$, the function $f^*$ is well defined. Consider $x, z \in C$ and $\lambda \in [0,1]$. Then, we have that

$$f^*(\lambda x + (1-\lambda) z) = \int_{y \in Y} f(\lambda x + (1-\lambda) z, y) \, dy \underset{(1)}{\geq} \int_{y \in Y} \left(\lambda f(x, y) + (1-\lambda) f(z, y)\right) dy \underset{(2)}{=} \lambda \int_{y \in Y} f(x, y) \, dy + (1-\lambda) \int_{y \in Y} f(z, y) \, dy = \lambda f^*(x) + (1-\lambda) f^*(z),$$

using in (1) the fact that $f(\cdot, y)$ is concave for every $y \in Y$ (together with the monotonicity of the integral) and in (2) the linearity of the integral.

We will now prove some results on the composition of concave (convex) functions.

Theorem 20 (Composition of Convex and Concave Functions) Let $C \subseteq \mathbb{R}^n$ be a convex set, $f : C \to \mathbb{R}$ a convex (concave) function such that $f(C) \subseteq \mathbb{R}$ is convex, and $g : f(C) \to \mathbb{R}$ a non-decreasing convex (concave) function. Then, the function $h : C \to \mathbb{R}$ defined as

$$h(x) = g(f(x)) \equiv (g \circ f)(x)$$

is also a convex (concave) function.

Proof. Let us provide the proof for the case of convex functions (the case of concave functions is analogous). Consider $x, y \in C$ and $\lambda \in [0,1]$. Then, we have that

$$g(f(\lambda x + (1-\lambda) y)) \underset{(1)}{\leq} g(\lambda f(x) + (1-\lambda) f(y)) \underset{(2)}{\leq} \lambda g(f(x)) + (1-\lambda) g(f(y)),$$

using the fact that $f$ is convex and $g$ is non-decreasing in (1), and the fact that $g$ is also convex in (2).

The main application of this result is the following:

Proposition 21 For any norm $N : \mathbb{R}^n \to \mathbb{R}$, the function $f : \mathbb{R}^n \to \mathbb{R}$ defined as

$$f(x) = (N(x))^2$$

is a convex function.

Proof. We know that $N$ is a convex function and that $f = g \circ N$ is a composition of functions. We also know that $N(\mathbb{R}^n) = \mathbb{R}_+$ and that on this set, the function $g : \mathbb{R}_+ \to \mathbb{R}$ given by $g(x) = x^2$ is a strictly increasing and strictly convex function. Using the previous result, we obtain that $f(x)$ is a convex function, which is the result we were trying to prove.

In the following theorem we will prove that the convexity or concavity of a function does not depend on the coordinate system that we choose. That is, if we transform the coordinate system $x \to x'$ according to $x' = Ax + b$ with $A_{n \times n}$ and $b \in \mathbb{R}^n$, the functions remain convex or concave.

Theorem 22 (Invariance to Changes in Coordinate System) Let $f : \mathbb{R}^n \to \mathbb{R}$ be a concave (convex) function, $A_{n \times n}$ a matrix and $b \in \mathbb{R}^n$. Then, the function $\tilde{f} : \mathbb{R}^n \to \mathbb{R}$ defined as

$$\tilde{f}(x) = f(Ax + b)$$

remains a concave (convex) function.

Proof. We will prove the result for convex functions. Consider $x, y \in \mathbb{R}^n$ and $\lambda \in [0,1]$. We have that

$$\tilde{f}(\lambda x + (1-\lambda) y) = f(A(\lambda x + (1-\lambda) y) + b) = f(\lambda (Ax + b) + (1-\lambda)(Ay + b)) \underset{(1)}{\leq} \lambda f(Ax + b) + (1-\lambda) f(Ay + b) = \lambda \tilde{f}(x) + (1-\lambda) \tilde{f}(y),$$

where (1) is due to the convexity of $f$. Hence, we have just proved that $\tilde{f}$ is also a convex function.

Lastly, let us consider one of the most important properties of concave and convex functions, one that deals with the operations of supremum and infimum over collections of concave and convex functions.

Theorem 23 (Supremum of Convex Functions) Let $\{f_\alpha\}_{\alpha \in I}$ be a collection of functions (indexed by $I$) $f_\alpha : C \to \mathbb{R}$ with $C$ convex and such that $f_\alpha$ is a convex function for all $\alpha \in I$. Then, the function $\bar{f} : C \to \mathbb{R}$ defined as

$$\bar{f}(x) = \sup_{\alpha \in I} f_\alpha(x)$$

is also a convex function.

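Proof. Consider $x, y \in C$ and $\lambda \in [0,1]$. We then have that

$$\bar{f}(\lambda x + (1-\lambda) y) = \sup_{\alpha \in I} f_\alpha(\lambda x + (1-\lambda) y) \underset{(1)}{\leq} \sup_{\alpha \in I} \{\lambda f_\alpha(x) + (1-\lambda) f_\alpha(y)\} \underset{(2)}{\leq} \lambda \sup_{\alpha \in I} f_\alpha(x) + (1-\lambda) \sup_{\alpha \in I} f_\alpha(y) = \lambda \bar{f}(x) + (1-\lambda) \bar{f}(y),$$

using in (1) the fact that the $f_\alpha$ are convex for all $\alpha \in I$ and that the supremum is monotone, and in (2) the fact that the supremum of a sum is less than or equal to the sum of the suprema.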

Corollary 24 (Infimum of Concave Functions) Let $\{f_\alpha\}_{\alpha \in I}$ be a collection of functions (indexed by $I$) $f_\alpha : C \to \mathbb{R}$ with $C$ convex and such that $f_\alpha$ is a concave function for all $\alpha \in I$. Then, the function $\underline{f} : C \to \mathbb{R}$ defined as

$$\underline{f}(x) = \inf_{\alpha \in I} f_\alpha(x)$$

is also a concave function.

Proof. This is derived from the fact that $f$ is concave $\iff -f$ is convex and the fact that $\sup(A) = -\inf(-A)$ for any bounded set $A \subseteq \mathbb{R}$.
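A familiar special case of Theorem 23 is the pointwise maximum of finitely many affine functions. The sketch below builds such a maximum from a hypothetical family of four lines on $\mathbb{R}$ and counts violations of the convexity inequality on a grid; the helper `pointwise_max` and the data are illustrative assumptions.

```python
def pointwise_max(functions):
    """Return x -> max_i f_i(x), the supremum of a finite family."""
    return lambda x: max(f(x) for f in functions)

# Hypothetical family of affine (hence convex) functions f_i(x) = m_i * x + c_i.
family = [lambda x, m=m, c=c: m * x + c
          for m, c in [(-2, 0), (-0.5, 1), (1, -1), (3, -6)]]
f_bar = pointwise_max(family)

grid = [i / 10 for i in range(-50, 51)]
violations = sum(
    1
    for x in grid for y in grid for lam in (0.25, 0.5, 0.75)
    # The small slack absorbs floating-point rounding.
    if f_bar(lam * x + (1 - lam) * y) > lam * f_bar(x) + (1 - lam) * f_bar(y) + 1e-9
)
print(violations)
```

By Corollary 24, replacing `max` with `min` would produce a concave function instead.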

Lastly, we present a result that shows that any convex function defined on an open convex set is, in fact, continuous. This will be very useful when we consider optimization problems.

Theorem 25 (Continuity of Convex Functions) Let $f : C \to \mathbb{R}$ be a convex function, where $C \subseteq \mathbb{R}^n$ is an open convex set. Then, $f$ is continuous in $C$.

Optimization with Concave and Convex Functions

In this section we will study fundamental properties of optimization problems regarding concave functions, and we will find conditions under which an optimum (if it exists) is unique. This would be the complement to Weierstrass' theorem, which provides sufficient conditions to guarantee the existence of an optimum, but does not tell us anything regarding its uniqueness or the properties of the set of maximizers. In optimization problems, when we consider concave functions, we will see that we can say much more regarding the results of a maximization problem.

Definition 26 (Level Sets) Let $f : C \to \mathbb{R}$ be a function, and $t$ a real number. The superlevel set of $f$ at $t$ is defined as the set $C_f^+(t) = \{x \in C : f(x) \geq t\}$. Similarly, we define the sublevel set as $C_f^-(t) = \{x \in C : f(x) \leq t\}$.

Proposition 27 (Convexity of Level Sets) Let $C \subseteq \mathbb{R}^n$ be a convex set and $f : C \to \mathbb{R}$ a function. If $f$ is concave, we then have that for every $t \in \mathbb{R}$, the set $C_f^+(t)$ is convex. Analogously, if $f$ is convex, then the set $C_f^-(t)$ is convex for every $t \in \mathbb{R}$.

Proof. We will provide the proof for concave functions. The result for convex functions can be derived from the fact that $C_f^-(t) \equiv C_{(-f)}^+(-t)$ for every $t \in \mathbb{R}$. Consider $t$ such that $C_f^+(t) \neq \emptyset$ (otherwise, the proof is trivial). Consider also $x, y \in C_f^+(t)$ and $\lambda \in [0,1]$. We want to prove that $z = \lambda x + (1-\lambda) y \in C_f^+(t)$. Notice that

$$f(z) = f(\lambda x + (1-\lambda) y) \underset{(1)}{\geq} \lambda f(x) + (1-\lambda) f(y) \underset{(2)}{\geq} \lambda t + (1-\lambda) t = t,$$

where (1) uses the fact that $f$ is concave, and (2) the fact that both $x$ and $y$ are in $C_f^+(t)$. Therefore, the sets $C_f^+(t)$ are convex for every $t \in \mathbb{R}$.

Definition 28 (Argument of the Maximum) Let $f : C \to \mathbb{R}$ be a function. We define the argument of the maximum of $f$ on $C$ (and we write $X_f(C) \equiv \arg\max_{x \in C} f(x)$) as the set

$$X_f(C) = \{x \in C : \text{for every } y \in C, \; f(x) \geq f(y)\}.$$

Theorem 29 (Convexity of the Argument of the Maximum) Let $C \subseteq \mathbb{R}^n$ be a convex set and $f : C \to \mathbb{R}$ a concave function. We then have that $X_f(C)$ is a convex set. Moreover, if $f$ is strictly concave and $X_f(C) \neq \emptyset \implies X_f(C) = \{x\}$ with $x \in C$.

Proof. If $X_f(C)$ is empty, the result is trivial. Otherwise, consider $t = \max_{x \in C} f(x)$. Then, we must have that, for every $x \in X_f(C) \implies f(x) = t$. Hence, we obtain that $X_f(C) = C_f^+(t)$, and using the previous result, we have that $C_f^+(t)$ is convex. Suppose now that $f$ is strictly concave and that $X_f(C) \neq \emptyset$, and suppose by contradiction that there exist two distinct points, $x \neq y$, that maximize $f$. If we define $z = \frac{1}{2} x + \frac{1}{2} y \in C$ (given that $C$ is convex), then since $f$ is strictly concave, we have that

$$f(z) = f\left(\tfrac{1}{2} x + \tfrac{1}{2} y\right) > \tfrac{1}{2} f(x) + \tfrac{1}{2} f(y) = \tfrac{1}{2} t + \tfrac{1}{2} t = t.$$

Hence, $t$ could not have been the maximum of $f$ on $C$. Therefore, the argument of the maximum is a single point.
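The two cases of the theorem can be seen on a grid. In this sketch, the functions `concave_flat` and `strictly_concave`, the grid and the helper `argmax_on_grid` are hypothetical choices for illustration: the first function is concave with a flat top, so its maximizers fill an interval, while the second is strictly concave and has a single maximizer.

```python
def argmax_on_grid(f, grid, tol=1e-12):
    """Return the grid points where f attains its maximum over the grid."""
    best = max(f(x) for x in grid)
    return [x for x in grid if f(x) >= best - tol]

grid = [i / 100 for i in range(-200, 201)]          # grid on [-2, 2]

concave_flat = lambda x: min(0.0, 1.0 - abs(x))     # concave, equal to 0 on [-1, 1]
strictly_concave = lambda x: -(x - 0.5) ** 2        # strictly concave, peak at 0.5

flat_set = argmax_on_grid(concave_flat, grid)
unique_set = argmax_on_grid(strictly_concave, grid)
print(flat_set[0], flat_set[-1], len(flat_set))     # endpoints and size of the argmax
print(unique_set)
```

The first argmax contains every grid point between its two endpoints, as convexity of $X_f(C)$ requires.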

Appendix: Linear Algebra Review

One of the main properties that we will prove using the previous results is one of the most relevant ones of the section. This result deals with the convexity of symmetric and positive definite quadratic forms. This will be very useful later on, when we present the characterization of convexity using the concept of differentiability (that is, a function will be convex if and only if it is locally convex at every point). In order to do this, we must first provide some definitions and present some theorems that we will not prove in this course.

Reviewing Concepts and Basic Results

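Definition 30 (Bases in $\mathbb{R}^n$) Let $\mathcal{B} = \{v_1, v_2, \ldots, v_k\} \subseteq \mathbb{R}^n$ be a set in $\mathbb{R}^n$. We say that $\mathcal{B}$ is a basis of $\mathbb{R}^n$ if and only if for every $x \in \mathbb{R}^n$ there exist unique real numbers $\{\alpha_i^x\}_{i=1}^{k}$ such that $x = \sum_{i=1}^{k} \alpha_i^x v_i$.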

Definition 31 (Invertible Matrix) A matrix $A_{n \times n}$ is invertible (or non-singular) if and only if there exists a matrix $A^{-1}_{n \times n}$ such that

$$A A^{-1} = A^{-1} A = I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}.$$

The concept of invertible matrix was developed as an answer to the question of existence and uniqueness of solutions to systems of linear equations. As we will see in the following proposition, a system of linear equations has a unique solution if and only if the matrix associated to the system is invertible.

Proposition 32 (Systems of Equations) Consider the system of equations

Ax = b

with $A_{n \times n}$ and $x, b \in \mathbb{R}^n$. There exists a unique solution to this system $\iff A$ is invertible.

Proof. ($\Longleftarrow$): Let us first prove that there exists a solution. If we define $x = A^{-1} b$, we then have that

$$A x = A A^{-1} b = I b = b,$$

which implies that this is a solution. Let us now suppose that there exists another solution $y$. We then have that

$$A y = b \iff A^{-1}(A y) = A^{-1} b \iff I y = x \iff y = x.$$

($\Longrightarrow$): Omitted. The general proof uses a general concept of diagonalization (of triangular forms) that we will not use in this course.

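Corollary 33 (Necessary and Sufficient Conditions for Bases) The set $\mathcal{B} = \{v_1, v_2, \ldots, v_k\} \subseteq \mathbb{R}^n$ is a basis of $\mathbb{R}^n \iff k = n$ and the matrix $V_{\mathcal{B}} \equiv \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}$ is invertible.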

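Proof. Being a basis implies that there are unique numbers $\alpha_i^x$ such that

$$x = \sum_{i=1}^{k} \alpha_i^x v_i = \begin{pmatrix} v_1 & v_2 & \cdots & v_k \end{pmatrix} \begin{pmatrix} \alpha_1^x \\ \alpha_2^x \\ \vdots \\ \alpha_k^x \end{pmatrix} \equiv V_{\mathcal{B}} \, \alpha^x.$$

But then, the problem boils down to solving this system of linear equations for $\alpha^x$, for any $x \in \mathbb{R}^n$. If we want a unique solution, we must have that $k = n$ and that $V_{\mathcal{B}}$ is invertible, as we wanted to show.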

Definition 34 (Determinants) A determinant is a function $\det : \underbrace{\mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n}_{n} \to \mathbb{R}$ that meets the following properties:

1. Multilinearity: for every $(x_1, x_2, \ldots, x_n) \in (\mathbb{R}^n)^n$ and for every coordinate $j$, every vector $y_j \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$ we have that

$$\det(x_1, \ldots, x_{j-1}, x_j + \alpha y_j, x_{j+1}, \ldots, x_n) = \det(x_1, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_n) + \alpha \det(x_1, \ldots, x_{j-1}, y_j, x_{j+1}, \ldots, x_n);$$

that is, $\det(\cdot)$ is multilinear in each coordinate;

2. Antisymmetry: for every $(x_1, x_2, \ldots, x_n) \in (\mathbb{R}^n)^n$ we have that

$$\det(x_1, \ldots, x_i, \ldots, x_j, \ldots, x_n) = -\det(x_1, \ldots, x_j, \ldots, x_i, \ldots, x_n);$$

that is, if two elements switch places, then the sign changes.

3. Let $\{e_1, e_2, \ldots, e_n\}$ be the canonical vectors of $\mathbb{R}^n$. We then have that $\det(e_1, e_2, \ldots, e_n) = 1$.

If there is a matrix $A_{n \times n} \equiv \begin{pmatrix} a_1 & a_2 & \cdots & a_n \end{pmatrix}$ with $a_i \in \mathbb{R}^n$, we have that $\det(A_{n \times n}) \equiv \det(a_1, a_2, \ldots, a_n)$.

In principle, many functions could be considered as determinants. Actually, it can be proven that there exists a unique determinant function, which is the one we usually use (we do not include the formula). Let us remember some basic properties of determinants.

1. $A$ is invertible $\iff \det(A) \neq 0$.

2. If $A$ is invertible $\implies \det(A^{-1}) = \dfrac{1}{\det(A)}$.

3. Given two matrices, $A_{n \times n}$ and $B_{n \times n}$, we have that $\det(AB) = \det(A) \det(B)$.

4. For every matrix $A \implies \det(A^T) = \det(A)$.

5. If we have a matrix $D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} \implies \det(D) = \prod_{i=1}^{n} \lambda_i$.
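The notes do not include a formula for the determinant. As a sketch, the code below uses the standard Laplace (cofactor) expansion along the first row, which is one way to compute it for small matrices, and verifies properties 3, 4 and 5 on hypothetical integer matrices `A`, `B` and `D`.

```python
def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def matmul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
B = [[1, 2, 0], [0, 1, 5], [3, 0, 1]]
D = [[2, 0, 0], [0, -3, 0], [0, 0, 5]]

print(det(A), det(B))
print(det(matmul(A, B)) == det(A) * det(B))   # property 3
print(det(transpose(A)) == det(A))            # property 4
print(det(D) == 2 * (-3) * 5)                 # property 5
```

The cost of this expansion grows like $n!$, so it is only meant for the small examples of this appendix.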

Definition 35 (Positive Semidefinite Matrix) A matrix $A_{n \times n}$ is said to be positive semidefinite if and only if for every $x \in \mathbb{R}^n$, we have that $x^T A x \geq 0$. We say that $A$ is a positive definite matrix if, additionally, for every $x \neq 0$ we have that $x^T A x > 0$.

The concept of positive definite and semidefinite matrix corresponds to the analogous concept for real numbers: notice that a real number $a$ is greater than or equal to zero if and only if, for every $x \in \mathbb{R}$, it happens that $a x^2 \geq 0 \iff x a x \geq 0 \iff x^T a x \geq 0$, and analogously for strictly positive real numbers.

Definition 36 (Negative Semidefinite Matrix) A matrix $A_{n \times n}$ is said to be negative (semi-)definite $\iff -A$ is a positive (semi-)definite matrix.

A fundamental concept that we will use when we study other topics is that of a diagonalizable matrix. This is a matrix $A$ that can be written as $A = V D V^{-1}$, with $V$ an invertible matrix and $D$ a diagonal matrix. Let us first define some important concepts.

Definition 37 (Eigenvectors and Eigenvalues) Let $A$ be a matrix with dimensions $n \times n$. We say that a vector $v \in \mathbb{R}^n$ with $v \neq 0$ is an eigenvector of $A$ with corresponding eigenvalue $\lambda \iff A v = \lambda v$.

In order to be able to find the eigenvalues and eigenvectors of a matrix $A$, it will prove very useful to define the concept of characteristic polynomial of a matrix $A$.

Definition 38 (Characteristic Polynomial) Given a matrix $A_{n \times n}$, we define the characteristic polynomial of matrix $A$ as the function $P_A : \mathbb{R} \to \mathbb{R}$ defined as

$$P_A(\lambda) = \det(A - \lambda I).$$

Theorem 39 (Roots of Characteristic Polynomial) Let $A_{n \times n}$ be a matrix. Then, $\lambda$ is an eigenvalue of $A \iff P_A(\lambda) = 0$.

Proof. By definition, we know that $\lambda$ is an eigenvalue of $A_{n \times n}$ if and only if there exists a vector $v \in \mathbb{R}^n$ with $v \neq 0$ such that

$$A v = \lambda v \iff (A - \lambda I) v = 0.$$

Consider the matrix $B = A - \lambda I$. If $B$ were invertible, then the unique solution to this system of equations (for $v$) would be $v = B^{-1} 0 = 0$, contradicting that $v$ is an eigenvector. On the other hand, $0$ is always a solution of this system, which implies that the system is not incompatible (that is, it has at least one solution). Therefore, in order to have some $v \neq 0$ as a solution, the matrix $B$ cannot be invertible (using the previous proposition on linear systems). Hence, using the first property of the determinant, we have that

$$\lambda \text{ is an eigenvalue} \iff B \text{ is not invertible} \iff \det(B) = 0 \iff P_A(\lambda) = 0,$$

as we wanted to show.

We will now define a concept that is essential for calculus in $\mathbb{R}^n$, which is that of diagonalization of square matrices.

Definition 40 (Diagonalizable Matrix) A matrix $A_{n \times n}$ is diagonalizable if and only if there exists a nonsingular matrix $V$ and a diagonal matrix $D$ such that

$$V^{-1} A V = D,$$

with

$$D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}.$$

The values on the diagonal of $D$ are called the eigenvalues of $A$. If we can choose a matrix $V$ such that $V^{-1} = V^T$, then we say that $A$ is orthogonally diagonalizable.

The matrices $V$ and $D$ considered in the previous definition have a particular meaning. Actually, the column vectors that form the matrix $V$ are eigenvectors of the matrix $A$, and the values in the diagonal of $D$ are their corresponding eigenvalues. We show this fact next.

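Theorem 41 (Eigenvalues and Diagonalization) If $A_{n \times n}$ is a diagonalizable matrix with associated matrices $V$ and $D$, then $V$ is a matrix whose column vectors are eigenvectors of $A$, and the values in the diagonal of $D$ are eigenvalues of $A$.

Proof. We know that if $A$ is diagonalizable with corresponding matrices $V$ and $D$, then $V^{-1} A V = D$. Let us express $V = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}$, where $v_i$ is the $i$-th column of $V$. Hence, if we premultiply both sides by $V$, we have that

$$A V = V D \iff A \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix} = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix} \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} \iff \begin{pmatrix} A v_1 & A v_2 & \cdots & A v_n \end{pmatrix} = \begin{pmatrix} \lambda_1 v_1 & \lambda_2 v_2 & \cdots & \lambda_n v_n \end{pmatrix}.$$

Therefore, for every $i$, $v_i$ is an eigenvector of $A$ with corresponding eigenvalue $\lambda_i$ (notice that $v_i \neq 0$ because $V$ is invertible).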

Corollary 42 (Necessary and Sufficient Conditions for Diagonalization) A matrix $A_{n \times n}$ is diagonalizable $\iff$ there exist eigenvectors $\{v_1, v_2, \ldots, v_n\}$ of $A$ such that they form a basis of $\mathbb{R}^n$.

A fundamental theorem in linear algebra shows that any symmetric matrix is orthogonally diagonalizable. This result is very useful, but also very hard to prove (given that many intermediate steps must be proven first). Even though we will not prove this theorem, we will state it given that it will be very useful when we come to study the relationship between differentiability and convexity.

Theorem 43 (Diagonalization and Symmetric Matrices) Let $A_{n \times n}$ be a symmetric matrix (that is, such that $A = A^T$). Then, $A$ is orthogonally diagonalizable.

A straightforward application of this result delivers a useful characterization of positive definite matrices in terms of their eigenvalues.

Theorem 44 (Positive Definiteness and Eigenvalues) Let $A_{n \times n}$ be a symmetric matrix, and $\{\lambda_i\}_{i=1}^{n}$ its corresponding eigenvalues. We have that:

1. $A$ is positive semidefinite $\iff \lambda_i \geq 0$ for every $i = 1, 2, \ldots, n$

2. $A$ is positive definite $\iff \lambda_i > 0$ for every $i = 1, 2, \ldots, n$

3. $A$ is negative semidefinite $\iff \lambda_i \leq 0$ for every $i = 1, 2, \ldots, n$

4. $A$ is negative definite $\iff \lambda_i < 0$ for every $i = 1, 2, \ldots, n$

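Proof. Proving (1) and (2) immediately implies that (3) and (4) also hold, since $\lambda_i$ is an eigenvalue of $A \iff -\lambda_i$ is an eigenvalue of $-A$. Given our previous result, we know that $A = V^T D V$ for some matrix $V$ with $V^{-1} = V^T$. Suppose first that $\lambda_i \geq 0$ for every $i$. Notice that by defining

$$D^{1/2} = \begin{pmatrix} \sqrt{\lambda_1} & 0 & \cdots & 0 \\ 0 & \sqrt{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{\lambda_n} \end{pmatrix},$$

we have that $D^{1/2} = (D^{1/2})^T$, which implies that

$$x^T A x = x^T V^T D V x = (D^{1/2} V x)^T (D^{1/2} V x).$$

If we let $z = D^{1/2} V x \in \mathbb{R}^n \implies x^T A x = z^T z = \sum_{i=1}^{n} z_i^2 \geq 0$, proving then that $A$ is positive semidefinite. Conversely, if $A$ is positive semidefinite (positive definite) and $v_i$ is an eigenvector with eigenvalue $\lambda_i$, then $v_i^T A v_i = \lambda_i v_i^T v_i$ is nonnegative (strictly positive), so $\lambda_i \geq 0$ ($\lambda_i > 0$). Moreover, if $\lambda_i > 0$ for every $i$, then $D^{1/2}$ is invertible, which implies that the matrix $H = D^{1/2} V$ is invertible, since it is the product of two invertible matrices. Hence, the linear transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ defined as

$$T(x) = D^{1/2} V x = H x$$

is bijective (one-to-one and onto). This implies that $T(x) = 0 \iff x = 0$. Therefore, relying on the previous observation, we can conclude that

$$x^T A x = 0 \iff (T(x))^T (T(x)) = 0 \iff N_2(T(x)) = 0 \iff T(x) = 0 \iff x = 0,$$

so that $x^T A x > 0$ for every $x \neq 0$, which finishes the proof.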
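For a symmetric $2 \times 2$ matrix the characteristic polynomial is $P_A(\lambda) = \lambda^2 - \mathrm{tr}(A) \lambda + \det(A)$, so the eigenvalues follow from the quadratic formula and the theorem can be applied directly. The sketch below assumes this $2 \times 2$ symmetric setting; the function names and the three test matrices are hypothetical.

```python
import math

def eigenvalues_sym_2x2(A):
    """Roots of P_A(lam) = lam^2 - tr(A) lam + det(A) for a symmetric 2x2 matrix."""
    (a, b), (c, d) = A
    assert b == c, "this sketch assumes a symmetric matrix"
    tr, dt = a + d, a * d - b * c
    # The discriminant equals (a - d)^2 + 4 b^2 >= 0; max() guards against rounding.
    root = math.sqrt(max(0.0, tr * tr - 4 * dt))
    return (tr - root) / 2, (tr + root) / 2

def definiteness(A):
    """Classify A by the signs of its smallest and largest eigenvalues."""
    lo, hi = eigenvalues_sym_2x2(A)
    if lo > 0:
        return "positive definite"
    if lo >= 0:
        return "positive semidefinite"
    if hi < 0:
        return "negative definite"
    if hi <= 0:
        return "negative semidefinite"
    return "indefinite"

for A in ([[2.0, 1.0], [1.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]):
    print(eigenvalues_sym_2x2(A), definiteness(A))
```

The third matrix has one negative and one positive eigenvalue, so none of the four cases of the theorem applies to it.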

Even though this theorem simplifies the task of characterizing positive definite and semidefinite matrices, we can rapidly see that we now face a computational challenge: as we previously saw, finding the eigenvalues of a matrix is equivalent to finding the roots of a polynomial of degree $n$. Hence, for $n \geq 5$ the task of finding eigenvalues analytically becomes much harder, since there is no general formula in radicals for the roots of polynomials of degree five or higher. We will now state a theorem that will allow us to simplify enormously the issue of determining if a matrix is negative or positive definite (however, it does not tell us anything about semidefiniteness).

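Theorem 45 (Sufficient Condition for the Definition of Sign) Let $A_{n \times n}$ be a symmetric square matrix, and define the sequence of matrices $C_t = [a_{ij}]_{i=1,j=1}^{i=t,j=t}$, that is, the top-left $t \times t$ blocks of $A$. If $\det(C_t) > 0$ for every $t = 1, 2, \ldots, n$, we have that $A$ is positive definite. If $\det(C_t) < 0$ for every $t$ odd and $\det(C_t) > 0$ for every $t$ even, then $A$ is negative definite.

Example 46 Consider the matrix

$$A = \begin{pmatrix} 1 & 0 & 0 \\ 4 & 2 & 0 \\ 2 & 1 & 1 \end{pmatrix}.$$

Consider $C_1 = 1$, $C_2 = \begin{pmatrix} 1 & 0 \\ 4 & 2 \end{pmatrix}$ and $C_3 = A$. We have that

$$\det(C_1) = 1 > 0, \qquad \det(C_2) = \det \begin{pmatrix} 1 & 0 \\ 4 & 2 \end{pmatrix} = 2 > 0 \qquad \text{and} \qquad \det(C_3) = \det \begin{pmatrix} 1 & 0 & 0 \\ 4 & 2 & 0 \\ 2 & 1 & 1 \end{pmatrix} = 2 > 0.$$

Hence, all three determinants are positive. Note, however, that the criterion is only valid for symmetric matrices and this $A$ is not symmetric, so it cannot be applied to $A$ directly: for $x = (1, -1, 0)^T$ we get $x^T A x = 1 - 4 + 2 = -1 < 0$, so $A$ is not positive definite in the sense of Definition 35. For a non-symmetric matrix, the test has to be applied to the symmetric matrix $\frac{1}{2}(A + A^T)$, which defines the same quadratic form.

Notice that this theorem provides only a sufficient condition. If any of the submatrices has a determinant equal to zero, we cannot say anything about the definiteness of matrix $A$.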
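The leading-minor test is easy to automate for small matrices. The sketch below assumes a symmetric input and computes each $\det(C_t)$ by cofactor expansion; the matrix `S` and the function names are hypothetical, and the label "inconclusive" simply means that neither sign pattern of the theorem was found.

```python
def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def leading_minors(A):
    """Determinants of the top-left t x t blocks C_t, for t = 1, ..., n."""
    return [det([row[:t] for row in A[:t]]) for t in range(1, len(A) + 1)]

def classify(A):
    """Apply the leading-minor test; assumes A is symmetric."""
    n = len(A)
    assert all(A[i][j] == A[j][i] for i in range(n) for j in range(n)), \
        "A must be symmetric"
    minors = leading_minors(A)
    if all(m > 0 for m in minors):
        return "positive definite"
    # Negative definite: minors alternate in sign, starting with a negative one.
    if all((m < 0) if t % 2 == 1 else (m > 0) for t, m in enumerate(minors, start=1)):
        return "negative definite"
    return "inconclusive"

S = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
neg_S = [[-x for x in row] for row in S]
print(leading_minors(S), classify(S))
print(leading_minors(neg_S), classify(neg_S))
```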

Quadratic Forms

By a quadratic form we will refer to an extension of the concept of quadratic function in $\mathbb{R}$. This type of function will play a crucial role when we study local differentiability and convexity. Using the following, as well as the previous results, we will show that, in fact, every quadratic form with a symmetric positive semidefinite matrix is a convex function.

Definition 47 (Quadratic Form) A function $f : \mathbb{R}^n \to \mathbb{R}$ is a quadratic form if and only if there exists a matrix $A_{n \times n}$ and a vector $b \in \mathbb{R}^n$ such that

$$f(x) = (x - b)^T A (x - b).$$

Theorem 48 (Convexity of Positive Quadratic Forms) Let $f : \mathbb{R}^n \to \mathbb{R}$ be a quadratic form with corresponding matrix $A$. If $A = A^T$ and $A$ is positive semidefinite, then $f$ is a convex function. Moreover, if $A$ is positive definite, then $f$ is strictly convex.

Proof. The proof is a simple corollary of a previous theorem together with the invariance to changes of coordinates. First, we change the coordinates from $x$ to $z = x - b$. Hence, we have that $f(x) = g(x - b)$ with $g(x) = x^T A x$. Hence, if we prove that $g$ is convex, we will have proved that $f$ is convex as well, and analogously for the case of strict convexity. In the proof of Theorem 44 we saw that, if we consider a new change of coordinates, we have that

$$g(x) = \varphi(Hx)$$

with $H = D^{1/2} V$ and $\varphi(z) = \sum_{i=1}^{n} z_i^2$. Once again, using the invariance to changes of coordinates, if we prove that $\varphi$ is convex (or strictly convex), we will have proved that $g$ is also convex (or strictly convex, given that $H$ is invertible when $A$ is positive definite), and therefore that $f$ also is. But since

$$\varphi(z) = (N_2(z))^2,$$

Proposition 21 shows that $\varphi$ is convex, and strict convexity of $\varphi$ follows from the strict convexity of $t \mapsto t^2$ in each coordinate.

Corollary 49 (Concavity of Negative Quadratic Forms) Let $f : \mathbb{R}^n \to \mathbb{R}$ be a quadratic form with corresponding matrix given by $A$. If $A = A^T$ and $A$ is negative semidefinite, then $f$ is a concave function. Moreover, if $A$ is negative definite, then $f$ is strictly concave.
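The convexity of a positive definite quadratic form can also be spot-checked numerically. In the sketch below, the matrix $A$ (symmetric, with leading minors 2 and 3), the vector $b$ and the helper `quadratic_form` are hypothetical choices made for illustration.

```python
import random

def quadratic_form(A, b):
    """Return f(x) = (x - b)^T A (x - b) for a matrix A (list of rows) and vector b."""
    def f(x):
        d = [xi - bi for xi, bi in zip(x, b)]
        return sum(d[i] * A[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))
    return f

# Hypothetical data: A is symmetric with leading minors 2 and 3, hence positive definite.
f = quadratic_form([[2.0, 1.0], [1.0, 2.0]], [1.0, -1.0])

rng = random.Random(0)
violations = 0
for _ in range(2000):
    x = [rng.uniform(-4, 4), rng.uniform(-4, 4)]
    y = [rng.uniform(-4, 4), rng.uniform(-4, 4)]
    lam = rng.random()
    z = [lam * xi + (1 - lam) * yi for xi, yi in zip(x, y)]
    # Convexity requires f(z) <= lam f(x) + (1 - lam) f(y); allow rounding slack.
    if f(z) > lam * f(x) + (1 - lam) * f(y) + 1e-9:
        violations += 1
print(violations)
```

Replacing $A$ with $-A$ and reversing the inequality gives the corresponding check for the concave case.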

Exercises

1 - Extreme points and concavity. Pareto Set.