
Constantin Bürgi: Optimization

Contents

1 Definiteness of a Matrix
  1.1 Definitions
  1.2 Finding the definiteness of a symmetric Matrix

2 Concave and Convex functions

3 Unconstrained optimization
  3.1 Definitions
  3.2 Existence of maxima
  3.3 Necessary condition
  3.4 Sufficient condition

4 Constrained optimization
  4.1 Equality constraints
    4.1.1 Marginal Rate of Substitution
    4.1.2 The Lagrangian
    4.1.3 Constraint qualification
    4.1.4 Sufficiency
    4.1.5 Linear optimization
    4.1.6 Monotonic increasing transformations
    4.1.7 Non-differentiability
  4.2 Inequality constraints
    4.2.1 Kuhn-Tucker Conditions

1 Definiteness of a Matrix

1.1 Definitions

Previously, it was shown that any linear system has a matrix representation. The same is true for any polynomial of degree two. The representation is called a quadratic form, as every variable appears at most quadratically. The general quadratic form is

\[ q(x_1, \dots, x_n) = \sum_{i,j=1}^{n} a_{ij} x_i x_j = x'Ax \]

Example 1. $a_{11}x_1^2 + a_{12}x_1x_2 + a_{22}x_2^2$ has the corresponding quadratic form

\[ \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} a_{11} & \frac{a_{12}}{2} \\ \frac{a_{12}}{2} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]

Remark 1. For $x'Ax$ to be a quadratic form, $A$ needs to be symmetric.

Depending on the symmetric matrix $A$, a quadratic form can be always (strictly) positive, always negative, or change its sign. Based on this, the following can be defined.

Definition 1. A symmetric matrix $A$ is positive (negative) definite, if the quadratic form $x'Ax > 0$ ($x'Ax < 0$) $\forall x \neq 0$.

Example 2. $I_n$ is positive definite, as it corresponds to the quadratic form $\sum_{i=1}^{n} x_i^2$. Similarly, $-I_n$ is negative definite.

Definition 2. A symmetric matrix $A$ is positive (negative) semi-definite, if the quadratic form $x'Ax \ge 0$ ($x'Ax \le 0$) $\forall x$ and $\exists x$ for which it holds with strict inequality.

Remark 2. Every positive (negative) definite matrix is also positive (negative) semi-definite.

Example 3. The matrix
\[ B = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \]
is positive semi-definite, as it corresponds to the quadratic form $(x_1 + x_2)^2$, which is $0$ for $x_1 = -x_2$. Similarly, $-B$ is negative semi-definite.

Definition 3. A symmetric matrix $A$ is indefinite, if $\exists x$ such that the quadratic form $x'Ax$ takes positive and negative values, or if $A = 0$.

Example 4. The matrix
\[ \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \]
is indefinite, as it corresponds to the quadratic form $x_1^2 - x_2^2$, which is positive for large $x_1$ and negative for large $x_2$.

The definitions above can easily be extended to non-symmetric matrices by noting that the quadratic form underlying any matrix $B$ is equivalent to the quadratic form underlying the symmetric matrix $A = \frac{1}{2}(B + B')$.
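As a quick check of this equivalence, here is a minimal sketch in Python (the helper names `quadratic_form` and `symmetrize` are my own):

```python
# Evaluate the quadratic form x'Ax and check that any square matrix B
# induces the same quadratic form as its symmetric part A = (B + B')/2.

def quadratic_form(A, x):
    """Evaluate q(x) = x'Ax = sum over i,j of a_ij * x_i * x_j."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def symmetrize(B):
    """Return A = (B + B')/2, the symmetric matrix with the same quadratic form."""
    n = len(B)
    return [[(B[i][j] + B[j][i]) / 2 for j in range(n)] for i in range(n)]

# A non-symmetric B and its symmetric part give identical values of the form.
B = [[1.0, 3.0],
     [1.0, 2.0]]
A = symmetrize(B)          # [[1.0, 2.0], [2.0, 2.0]]
x = [0.5, -1.5]
assert quadratic_form(B, x) == quadratic_form(A, x)
```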

1.2 Finding the definiteness of a symmetric Matrix

The two approaches discussed here for determining the definiteness of a symmetric matrix use eigenvalues and principal minors. Looking at eigenvalues, a matrix is positive (negative) definite, if all its eigenvalues $\lambda_i > (<)\ 0$. Similarly, a matrix is positive (negative) semi-definite, if all its eigenvalues $\lambda_i \ge (\le)\ 0$. Calculating eigenvalues can be tedious, however, and using principal minors is often easier.
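For the $2 \times 2$ symmetric case the eigenvalues have a closed form, so the eigenvalue test can be sketched directly (function names are illustrative; note that this simple sketch labels the zero matrix positive semi-definite, whereas Definition 3 classifies it as indefinite):

```python
import math

# Eigenvalues of the symmetric matrix [[a, b], [b, c]] are
# (a + c)/2 +- sqrt(((a - c)/2)^2 + b^2); their signs give the definiteness.

def eigenvalues_2x2(a, b, c):
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return mean - radius, mean + radius

def definiteness_2x2(a, b, c):
    lo, hi = eigenvalues_2x2(a, b, c)
    if lo > 0:
        return "positive definite"
    if hi < 0:
        return "negative definite"
    if lo >= 0:
        return "positive semi-definite"
    if hi <= 0:
        return "negative semi-definite"
    return "indefinite"

assert definiteness_2x2(1, 0, 1) == "positive definite"       # I_2 (Example 2)
assert definiteness_2x2(1, 1, 1) == "positive semi-definite"  # B from Example 3
assert definiteness_2x2(1, 0, -1) == "indefinite"             # Example 4
```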

Definition 4. A $k$th order principal minor, denoted $PM_k(A)$, is the determinant of a $k \times k$ submatrix of the $n \times n$ matrix $A$, where $(n - k)$ rows and the corresponding columns were deleted.

Example 5. A $3 \times 3$ matrix has 7 principal minors. The three diagonal elements are first order principal minors and the determinant of the entire matrix is the third order principal minor. Deleting the first, second and third row and column individually and calculating the determinant of the remaining matrix gives the three second order principal minors.

Definition 5. A $k$th order leading principal minor, denoted $LPM_k(A)$, is given by the determinant of the $k \times k$ submatrix of an $n \times n$ matrix $A$, where the last $(n - k)$ rows and columns were deleted.

Remark 3. Every n × n matrix has n leading principal minors.

Example 6. The three leading principal minors of a 3 × 3 matrix A are

\[ a_{11}; \qquad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}; \qquad \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} \]

Theorem 1.1. A matrix $A$ is positive (semi-)definite, if for all its leading principal minors (principal minors) $LPM_k(A) > 0$ ($PM_k(A) \ge 0$).

Theorem 1.2. A matrix $A$ is negative (semi-)definite, if for all its leading principal minors (principal minors) $(-1)^k LPM_k(A) > 0$ ($(-1)^k PM_k(A) \ge 0$).

Remark 4. As with the definitions for semi-definiteness, the inequalities have to hold with strict inequality for at least some principal minors.

Example 7. The leading principal minors of the matrix
\[ \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix} \]

are $LPM_1 = 1$, $LPM_2 = 0$ and $LPM_3 = 0$; hence the other principal minors need to be checked. Because all other first order principal minors are positive and all other principal minors are $0$, the matrix is positive semi-definite.
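The principal minor checks in this example can be reproduced with a short sketch (helper names are my own; the determinant uses cofactor expansion, which is fine for small matrices):

```python
from itertools import combinations

# Compute leading principal minors and all principal minors of a matrix.

def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def leading_principal_minors(A):
    """Determinants of the top-left k x k submatrices, k = 1..n."""
    n = len(A)
    return [det([row[:k] for row in A[:k]]) for k in range(1, n + 1)]

def principal_minors(A):
    """All principal minors: keep the same subset of rows and columns."""
    n = len(A)
    return [det([[A[i][j] for j in idx] for i in idx])
            for k in range(1, n + 1)
            for idx in combinations(range(n), k)]

A = [[1, 2, 1],
     [2, 4, 2],
     [1, 2, 1]]
assert leading_principal_minors(A) == [1, 0, 0]
# All 7 principal minors are >= 0, so A is positive semi-definite.
assert all(pm >= 0 for pm in principal_minors(A))
```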

2 Concave and Convex functions

Definition 6. A function $f : S \to \mathbb{R}$ defined over the convex set $S$ is concave (convex) if for all vectors $x, y \in S$,

\[ f(tx + (1 - t)y) \ge (\le)\ t f(x) + (1 - t) f(y), \quad \forall t \in [0, 1]. \]

The function is strictly concave (convex), if the condition holds with strict inequality. Intuitively, this means that a concave (convex) function lies weakly above (below) the line segment connecting any two points of that function. A linear function is both concave and convex.

[Figure: a concave function lies above the chord between $(x, f(x))$ and $(y, f(y))$; a convex function lies below it.]
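Definition 6 can be checked numerically on a grid of sample points; the sketch below (function and tolerance are my own choices) can only refute concavity on the sampled points, not prove it:

```python
# Sample the defining inequality f(t*x + (1-t)*y) >= t*f(x) + (1-t)*f(y)
# over grids of points and of t in [0, 1]; a small tolerance absorbs
# floating-point rounding.

def is_concave_on_samples(f, points, ts):
    return all(f(t * x + (1 - t) * y) >= t * f(x) + (1 - t) * f(y) - 1e-9
               for x in points for y in points for t in ts)

points = [i / 2 for i in range(-10, 11)]   # grid on [-5, 5]
ts = [i / 10 for i in range(11)]           # t in [0, 1]

assert not is_concave_on_samples(lambda x: x ** 2, points, ts)   # convex, not concave
assert is_concave_on_samples(lambda x: -x ** 2, points, ts)      # concave
assert is_concave_on_samples(lambda x: 2 * x + 1, points, ts)    # linear: concave...
assert is_concave_on_samples(lambda x: -(2 * x + 1), points, ts) # ...and convex
```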

Theorem 2.1. A twice continuously differentiable function $f : S \to \mathbb{R}$ defined over the convex set $S$ is concave (convex), if and only if the Hessian is negative (positive) semi-definite $\forall x \in S$.

Theorem 2.2. A twice continuously differentiable function f : S → R defined over the convex set S is strictly concave (convex), if the Hessian is negative (positive) definite ∀x ∈ S.

Example 8. The function $f(x) = x^2$ is strictly convex, as $f''(x) = 2 > 0$.

Remark 5. The converse of the second theorem is not true however, as the example of $x^4$ ($-x^4$) shows. This function is strictly convex (concave), however its Hessian is only positive (negative) semi-definite, as it becomes $0$ at $x = 0$.

A useful property of concave and convex functions is given by the next theorem.

Theorem 2.3. If $f(x)$ is a concave function, then $f(x) \le f(\bar{x}) + \nabla f(\bar{x})'[x - \bar{x}]$, $\forall x, \bar{x}$ in the domain of the function.

Proof. Define $z = x - \bar{x}$. Then the definition of a concave function can be rewritten as
\begin{align*}
f(tx + (1 - t)\bar{x}) &\ge t f(x) + (1 - t) f(\bar{x}) \\
f(\bar{x} + tz) &\ge t f(x) + (1 - t) f(\bar{x}) \\
f(\bar{x} + tz) - f(\bar{x}) &\ge t f(x) - t f(\bar{x}) \\
\frac{f(\bar{x} + tz) - f(\bar{x})}{t} &\ge f(x) - f(\bar{x}) \qquad \Big|\ \lim_{t \to 0} \\
\nabla f(\bar{x})'[z] &\ge f(x) - f(\bar{x}) \\
f(x) &\le f(\bar{x}) + \nabla f(\bar{x})'[x - \bar{x}]
\end{align*}

[Figure: a quasi-concave function (single peak) and a quasi-convex function (single trough).]

Definition 7. A function f : S → R defined over the convex set S is quasi-concave if for all vectors x, y ∈ S,

\[ f(tx + (1 - t)y) \ge \min[f(x), f(y)], \quad \forall t \in [0, 1]. \]

Definition 8. A function f : S → R defined over the convex set S is quasi-convex if for all vectors x, y ∈ S,

\[ f(tx + (1 - t)y) \le \max[f(x), f(y)], \quad \forall t \in [0, 1]. \]

[Figure: a function that is neither quasi-concave nor quasi-convex.]

Remark 6. There are equivalent definitions for quasi-concave (quasi-convex) functions using contour sets of the function. If the upper (lower) contour set of a function is a convex set, the function is quasi-concave (quasi-convex). More formally, a function is quasi-concave (quasi-convex), if the set

\[ S_a = \{x \mid f(x) \ge (\le)\ a\} \]
is convex for all $a \in \mathbb{R}$. Similarly to before, for strictly quasi-convex/quasi-concave functions the conditions have to hold with strict inequality. Functions that are (weakly) increasing/decreasing are both quasi-convex and quasi-concave (quasi-linear).

Example 9. The function $f(x) = x^3$ is both quasi-convex and quasi-concave, as it is an increasing function, but it is neither concave nor convex.
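Definitions 7 and 8 lend themselves to the same kind of grid check; again the sketch (my own helper names and tolerances) can only refute a property on the sampled points:

```python
# Sample the quasi-concavity and quasi-convexity inequalities on a grid.

def is_quasi_concave(f, points, ts):
    return all(f(t * x + (1 - t) * y) >= min(f(x), f(y)) - 1e-9
               for x in points for y in points for t in ts)

def is_quasi_convex(f, points, ts):
    return all(f(t * x + (1 - t) * y) <= max(f(x), f(y)) + 1e-9
               for x in points for y in points for t in ts)

points = [i / 2 for i in range(-8, 9)]   # grid on [-4, 4]
ts = [i / 10 for i in range(11)]         # t in [0, 1]
cube = lambda x: x ** 3

# x^3 is increasing, hence both quasi-concave and quasi-convex (Example 9).
assert is_quasi_concave(cube, points, ts) and is_quasi_convex(cube, points, ts)

# x^2 is quasi-convex but not quasi-concave: its upper contour sets are not convex.
square = lambda x: x ** 2
assert is_quasi_convex(square, points, ts)
assert not is_quasi_concave(square, points, ts)
```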

Aside from these quasi-linear functions, quasi-convex or quasi-concave functions intuitively have some point where the function reaches a global maximum or minimum. There are sufficient conditions for quasi-concave (quasi-convex) functions based on the bordered Hessian. The bordered Hessian has the Hessian matrix in the bottom right corner, the first order derivatives on both sides of the Hessian and a $0$ in the top left corner.

" # 0 5f(x)0 BH(f(·)) = 5f(x) H(f(x)) Theorem 2.4. A function is quasi-concave, if the leading principle minors of the bordered ( Hessian alternate their sign according to (−1) k + 1)LP Mk(BH(f(·))) > 0 for k > 1.

Theorem 2.5. A function is quasi-convex, if all but the first leading principal minors of the bordered Hessian are negative: $LPM_k(BH(f(\cdot))) < 0$ for $k > 1$.

Remark 7. If the above conditions hold with weak inequality, they are only necessary but not sufficient.

3 Unconstrained optimization

Unconstrained optimization assumes that there is no constraint to the optimization problem aside from simple restrictions on the domain.

3.1 Definitions

Definition 9. $x^* \in \mathbb{R}^n$ is a local maximizer of $f(x)$, if there exists an open $\varepsilon$-ball $O_\varepsilon$ around $x^*$ such that $f(x^*) \ge f(x)$, $\forall x \in O_\varepsilon$. If $f(x^*) \ge f(x)$ holds over the entire domain, it is a global maximizer.

Remark 8. $x^*$ is a strict local/global maximizer, if the above condition holds with strict inequality ($>$).

Remark 9. Any minimization problem can be turned into a maximization problem by changing the objective function f(x) to −f(x).

Example 10. The strict global maximizer of $f(x) = 2 - x^2$ is $x = 0$, with maximum $f(0) = 2$.

3.2 Existence of maxima

There is a set of conditions that guarantees that a function has a maximum and a minimum. These conditions are also known as the Weierstrass (extreme value) theorem.

Theorem 3.1. Suppose $f(x)$ is a continuous function defined over the compact (closed and bounded) set $C \subseteq \mathbb{R}^n$; then there exists a point $x \in C$ for which $f(\cdot)$ attains a maximum and a point $x \in C$ for which $f(\cdot)$ attains a minimum.

Remark 10. If some of these conditions are not satisfied, functions might only have a supremum and infimum, but no maximum or minimum.

Example 11. The function f(x) = x2 has a minimum for x ∈ R, but no maximum (but still a supremum). Does this change with restricting the domain to x ∈ (0, 1)?

Example 12. The function $f(x)$ is defined over $x \in [-1, 2]$ and takes the form
\[ f(x) = \begin{cases} x^2 & x \neq 0 \\ 1 & x = 0 \end{cases} \]
Does it have a minimum or maximum?

Remark 11. Any minimization problem can be transformed into a maximization problem by multiplying the objective function by -1.

3.3 Necessary condition

There are two possibilities for a maximum. Either the maximum lies at the boundary of the set $C$ or it is at an interior point. Assuming the (local) maximizer is an interior point and the function is twice continuously differentiable, the slope at this point must be $0$ in every direction. More formally, this gives the necessary first order condition (FOC) that for any interior maximizer $x^*$

\[ \nabla f(x^*) = \begin{bmatrix} \frac{\partial f(x^*)}{\partial x_1} \\ \vdots \\ \frac{\partial f(x^*)}{\partial x_n} \end{bmatrix} = 0 \]

Example 13. Assume a function $f(x)$ where $x \in \mathbb{R}$ as shown in the graphs below. If

[Figure: two panels showing $f(x)$ around $x_0$, one with $f'(x_0) < 0$ and one with $f'(x_0) > 0$.]

$x_0$ were a maximizer, the previous condition would require either a slope of $0$ or a boundary solution. Excluding the boundary solution, it is easy to see from the graph that $x_0$ is not a maximizer, because to the left (right) of that point the function still increases, as $f'(x_0) < 0$ ($f'(x_0) > 0$).

3.4 Sufficient condition

While every interior maximizer $x^*$ requires that the gradient at $x^*$ is $0$, the converse does not have to hold. There are interior points where the gradient is $0$ that are not maximizers.

Example 14. Let $f(x) = x^3$ for $x \in \mathbb{R}$. $f'(0) = 0$; however, the point is a saddle point and hence neither a maximizer nor a minimizer.

Also, even if all non-extremum points can be excluded, it remains to determine whether a critical point is a maximum or a minimum. This leads to the converse or sufficient second order condition, which puts restrictions on the Hessian, denoted $Hf(x)$.

Theorem 3.2. If $\nabla f(x^*) = 0$ and $Hf(x^*)$ is negative (positive) definite at some $x^*$, $f(\cdot)$ has a strict local maximum (minimum) at $x^*$.

Proof. A second degree Taylor approximation around $x^*$, where $\varepsilon \in \mathbb{R}^n$, is
\begin{align*}
f(x^* + \varepsilon) - f(x^*) &= \nabla f(x^*)'\varepsilon + \frac{1}{2}\varepsilon' Hf(x^*)\varepsilon + \text{Remainder} \\
&\approx \frac{1}{2}\varepsilon' Hf(x^*)\varepsilon
\end{align*}
for small $\|\varepsilon\|$ and due to the first order condition. But since $\varepsilon' Hf(x^*)\varepsilon$ is a quadratic form, restrictions on the Hessian are sufficient to determine the sign of the right hand side. If the Hessian is positive definite, the expression on the right is positive and $f(x^* + \varepsilon) > f(x^*)$, which makes $x^*$ a strict local minimizer. Conversely, if the Hessian is negative definite, $f(x^* + \varepsilon) < f(x^*)$, which makes $x^*$ a strict local maximizer. It is not necessarily a global minimizer/maximizer, because $\|\varepsilon\|$ is small.

Theorem 3.3. If $\nabla f(x^*) = 0$ and $Hf(x)$ is negative (positive) semi-definite $\forall x$ over the convex domain of $f(\cdot)$, $f(\cdot)$ has a global maximum (minimum) at $x^*$.

Proof. The semi-definiteness of the Hessian ensures a concave (convex) function. Due to this, $f(x) \le (\ge)\ f(x^*) + \nabla f(x^*)'[x - x^*]$, $\forall x$. But since $\nabla f(x^*) = 0$, the expression simplifies to $f(x) \le (\ge)\ f(x^*)$.

As seen in the previous section, this implies that the function is concave/convex, hence this property ensures that it is sufficient to check the first order (necessary) condition.

Remark 12. The sufficiency theorems only go in one direction, but not in the other. Thus there might be extrema that are not classified as such, because the Hessian is indefinite at that point (and hence fails the conditions of both theorems). An example of this is $f(x) = x^4$, which has a minimum at $x = 0$, but the Hessian at that point is indefinite ($0$).

Example 15. Let $f(x, y) = x^3 - y^3 + 12xy$, $\mathbb{R}^2 \to \mathbb{R}$, be the function to be maximized. The first order (necessary) condition giving the critical points is
\[ \nabla f(x, y) = \begin{pmatrix} 3x^2 + 12y \\ -3y^2 + 12x \end{pmatrix}, \]
which is $0$ at the points $(0,0)$ and $(4,-4)$. The Hessian is given by
\[ Hf(x, y) = \begin{bmatrix} 6x & 12 \\ 12 & -6y \end{bmatrix}, \]
which is indefinite, as $LPM_1 \lessgtr 0$, and hence the function is neither concave nor convex. $LPM_2(Hf(0, 0)) = -144$, so $(0,0)$ cannot be classified as a maximum or minimum, while $LPM_1(Hf(4, -4)) = 24 > 0$ and $LPM_2(Hf(4, -4)) = 24^2 - 12^2 > 0$, making $Hf(4, -4)$ positive definite and $(4,-4)$ a minimizer.
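The calculations in Example 15 can be verified directly (helper names are my own):

```python
# Check that the gradient of f(x, y) = x^3 - y^3 + 12xy vanishes at the two
# critical points, then classify them via the Hessian's leading principal minors.

def grad(x, y):
    return (3 * x ** 2 + 12 * y, -3 * y ** 2 + 12 * x)

def hessian(x, y):
    return [[6 * x, 12], [12, -6 * y]]

for (x, y) in [(0, 0), (4, -4)]:
    assert grad(x, y) == (0, 0)

H = hessian(4, -4)                        # [[24, 12], [12, 24]]
lpm1 = H[0][0]
lpm2 = H[0][0] * H[1][1] - H[0][1] * H[1][0]
assert lpm1 > 0 and lpm2 > 0              # positive definite -> (4,-4) is a minimizer

H0 = hessian(0, 0)                        # [[0, 12], [12, 0]]
assert H0[0][0] * H0[1][1] - H0[0][1] * H0[1][0] == -144   # LPM2 < 0: unclassified
```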

4 Constrained optimization

In general, constrained optimization takes the form
\begin{align*}
\max_{x \in \mathbb{R}^n}\ & f(x) \\
\text{s.t.}\ & g_1(x) = b_1 \\
& \quad\vdots \\
& g_m(x) = b_m \\
& h_1(x) \le c_1 \\
& \quad\vdots \\
& h_l(x) \le c_l
\end{align*}
where $f(x)$ is the objective function to be maximized, $g_i(x) = b_i$, $i = 1, \dots, m$, are the equality constraints and $h_j(x) \le c_j$, $j = 1, \dots, l$, the inequality constraints.

4.1 Equality constraints

Many constrained optimization problems with few variables and few equality constraints (but no inequality constraints) can be treated in a similar way as unconstrained optimization. If the constraints are linear, one can eliminate variables using the constraints and turn the constrained maximization problem into an unconstrained problem.

Example 16. Let the constrained maximization problem be $f(x_1, x_2) = x_1 x_2$ and the only constraint be $g(x_1, x_2) = x_1 + x_2 = 3$. Then it is easy to solve the constraint for $x_1$ and replace it in the objective function, turning this constrained optimization problem into an unconstrained one:
\[ \max_{x \in \mathbb{R}} f(x) = x(3 - x) = 3x - x^2 \]
The first order (necessary) condition is
\[ f'(x) = 3 - 2x = 0 \]
and there is one critical point, $x = 3/2$. Checking the second order (sufficient) condition,
\[ f''(x) = -2 < 0, \]
this point is a maximum. Note that it is a strict global maximum, because the function is concave, and corner solutions can be excluded due to the multiplicative objective function. With the help of the constraint, $x_1$ can be recovered and the solution is $x_1 = x_2 = 3/2$.
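The substitution approach of Example 16 can be mimicked with a simple grid search (the grid resolution is an arbitrary choice):

```python
# After substituting the constraint x1 = 3 - x2, maximize the unconstrained
# objective x * (3 - x) by brute force over a fine grid on [0, 3].

def objective(x):
    return x * (3 - x)        # f(x1, x2) = x1 * x2 with x1 = 3 - x

grid = [i / 1000 for i in range(3001)]
best = max(grid, key=objective)

assert abs(best - 1.5) < 1e-6              # x2 = 3/2, hence x1 = 3 - 3/2 = 3/2
assert abs(objective(best) - 2.25) < 1e-6  # maximized value f = 9/4
```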

4.1.1 Marginal Rate of Substitution

If the constraint is not linear, it might not be possible to turn a constrained optimization problem into an unconstrained problem this easily. Due to this, a more general solution method needs to be introduced. Consider the utility maximization problem with a constraint

\[ \max_{x \in \mathbb{R}^n} u(x) \quad \text{s.t.} \quad g(x) = b \]

In the two goods case, where the constraint is the budget line and excluding corner solutions, it is known that the highest attainable indifference curve must be tangent to the budget line for utility maximization. Graphically, this point corresponds to

[Figure: the optimal bundle $(x_1^*, x_2^*)$, where the indifference curve $u(x_1^*, x_2^*)$ is tangent to the budget line.]

The slope of these indifference curves, called the marginal rate of substitution (MRS), must be set equal to the slope of the budget line, which is the price ratio. There are multiple indifference curves that have that slope, some outside the budget set and some inside. To ensure that the correct indifference curve on the budget line is chosen, the constraint must hold with equality as well.

In the $n$ dimensional case with a general constraint, bilateral marginal rates of substitution can be constructed by total differentiation of the objective function
\[ \nabla u(x)' dx = 0, \]
which leads to the marginal rate of substitution between goods $i$ and $j$:
\[ -\frac{dx_i}{dx_j} = \frac{\partial u(x)/\partial x_j}{\partial u(x)/\partial x_i} = \frac{MU_j}{MU_i} = MRS_{ji} \]
Similarly for the constraint, total differentiation gives
\[ \nabla g(x)' dx = 0, \]
which leads to
\[ -\frac{dx_i}{dx_j} = \frac{\partial g(x)/\partial x_j}{\partial g(x)/\partial x_i} \]
Setting them equal,
\[ \frac{\partial u(x)/\partial x_j}{\partial u(x)/\partial x_i} = \frac{\partial g(x)/\partial x_j}{\partial g(x)/\partial x_i} \]
and rearranging,
\[ \frac{\partial u(x)/\partial x_j}{\partial g(x)/\partial x_j} = \frac{\partial u(x)/\partial x_i}{\partial g(x)/\partial x_i} = \lambda, \]
which has to hold for all $i, j$ and some constant $\lambda$. Separating this expression into $n$ equations and adding the constraint, the full set of conditions for an interior optimum becomes
\begin{align*}
\frac{\partial u(x)}{\partial x_1} - \lambda \frac{\partial g(x)}{\partial x_1} &= 0 \\
&\ \vdots \\
\frac{\partial u(x)}{\partial x_n} - \lambda \frac{\partial g(x)}{\partial x_n} &= 0 \\
g(x) - b &= 0
\end{align*}

Note that it is still possible to have corner solutions, as the following example shows.

Example 17. Let the objective function with positive domain be $f(x_1, x_2) = 2x_1 + x_2$ and the constraint $g(x_1, x_2) = x_1 + x_2 = 3$. Then an optimal interior solution would need to satisfy
\[ \frac{\partial f(x)/\partial x_1}{\partial f(x)/\partial x_2} = \frac{2}{1} = \frac{1}{1} = \frac{\partial g(x)/\partial x_1}{\partial g(x)/\partial x_2}, \]
plus the constraint, which is impossible. Hence it is necessary to check corner solutions as well, which easily shows that $x_1 = 3$, $x_2 = 0$ is the maximum.
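The corner solution in Example 17 can be confirmed by walking along the constraint (the grid is an arbitrary choice):

```python
# With a linear objective and a linear constraint, the interior tangency
# condition fails, so the maximum sits at a corner of the feasible segment
# x1 + x2 = 3 with x1, x2 >= 0.

def f(x1, x2):
    return 2 * x1 + x2

# Parameterize the constraint: x1 = t, x2 = 3 - t for t in [0, 3].
ts = [i / 100 for i in range(301)]
best_t = max(ts, key=lambda t: f(t, 3 - t))

assert best_t == 3.0                 # corner solution x1 = 3, x2 = 0
assert f(3, 0) == 6
```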

4.1.2 The Lagrangian

As seen in the previous section, every constrained maximization problem with interior solution and with equality constraints can be transformed into a system of $n + m$ equations, where $n$ of them involve first order conditions of the objective function and the constraints, and $m$ are the constraints themselves. The Lagrangian transforms all $n + m$ equations into an unconstrained optimization problem with $n + m$ unknowns by adding $0$ to the objective function. In particular, it takes the form

\[ L(x, \lambda_1, \dots, \lambda_m) = f(x) - \lambda_1(g_1(x) - b_1) - \dots - \lambda_m(g_m(x) - b_m). \]

By construction, $0$ was added to the objective function, as all constraints are binding. At the same time, the first order conditions of this function with respect to $x$ and the Lagrange multipliers $\lambda_1, \dots, \lambda_m$ will lead to the set of equations necessary for an interior solution.

Example 18. Assume Cobb-Douglas utility $u(x, y) = xy$ and the budget constraint $p_x x + p_y y = m$. Due to the form of utility, corner solutions can be ruled out and only interior solutions are optimal. Then the Lagrangian becomes

\[ L(x, y, \lambda) = xy - \lambda(p_x x + p_y y - m) \]

The corresponding first order conditions are
\begin{align*}
\frac{\partial L(x, y, \lambda)}{\partial x} &= y - \lambda p_x = 0 \\
\frac{\partial L(x, y, \lambda)}{\partial y} &= x - \lambda p_y = 0 \\
\frac{\partial L(x, y, \lambda)}{\partial \lambda} &= p_x x + p_y y - m = 0
\end{align*}
Solving for the variables and assuming only positive $x, y$ gives $x = \frac{m}{2p_x}$, $y = \frac{m}{2p_y}$ and $\lambda = \frac{m}{2p_x p_y}$.
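The candidate solution can be plugged back into the first order conditions; the prices and income below are illustrative numbers, not from the text:

```python
# Verify that x = m/(2*px), y = m/(2*py), lam = m/(2*px*py) satisfies all three
# first order conditions of L = x*y - lam*(px*x + py*y - m).

px, py, m = 2.0, 3.0, 12.0           # illustrative numbers

x = m / (2 * px)                     # 3.0
y = m / (2 * py)                     # 2.0
lam = m / (2 * px * py)              # 1.0

assert y - lam * px == 0             # dL/dx = 0
assert x - lam * py == 0             # dL/dy = 0
assert px * x + py * y - m == 0      # budget constraint binds
```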

4.1.3 Constraint qualification

A regularity condition is necessary for the first order conditions to all equal $0$ at interior solutions. The constraint qualification ensures that the Lagrangian works in those cases. For the constraint qualification to hold, the matrix of the first order derivatives of all $m$ binding constraints with respect to all $n$ variables must have rank $\min[m, n]$ at all points. This ensures that the slopes at the optimal point are the same for the constraints and the objective function, and thus all FOCs are equal to $0$.

If the constraint qualification fails for a given point, that point might be a maximum or minimum of the optimization problem as well and needs to be checked separately.

Example 19. Assume one wants to maximize the objective function $f(x, y) = x$ with the constraint $g(x, y) = y^2 + (1 - x)^2 = 0$. This leads to the Lagrangian

\[ L(x, y, \lambda) = x - \lambda\left(y^2 + (1 - x)^2\right) \]

Taking first order conditions,
\begin{align*}
\frac{\partial L(x, y, \lambda)}{\partial x} &= 1 + 2\lambda(1 - x) = 0 \\
\frac{\partial L(x, y, \lambda)}{\partial y} &= -2\lambda y = 0 \\
\frac{\partial L(x, y, \lambda)}{\partial \lambda} &= y^2 + (1 - x)^2 = 0
\end{align*}
From the constraint, it is clear that $x = 1$ and $y = 0$ is the unique solution to the problem, which also satisfies the last two FOCs. However, $\nexists \lambda$ such that the first FOC equals $0$! Looking at the gradient of the constraint,
\[ \nabla g(x, y) = [-2(1 - x), 2y], \]
which has rank $0$ at the optimal point $(1, 0)$ instead of $\min[m, n] = 1$. Thus the constraint qualification fails at this point, and the point needs to be checked separately.

Even when the constraint qualification is satisfied, boundary solutions need not result in all first order conditions being equal to $0$.

Example 20. Assume one wants to maximize the objective function $f(x, y) = x^2 + y^2$ with the constraint $g(x, y) = 2x + 3y = 5$ and $x, y \in \mathbb{R}_+$. This leads to the Lagrangian

\[ L(x, y, \lambda) = x^2 + y^2 - \lambda(2x + 3y - 5) \]

Taking first order conditions,
\begin{align*}
\frac{\partial L(x, y, \lambda)}{\partial x} &= 2x - 2\lambda = 0 \\
\frac{\partial L(x, y, \lambda)}{\partial y} &= 2y - 3\lambda = 0 \\
\frac{\partial L(x, y, \lambda)}{\partial \lambda} &= 2x + 3y - 5 = 0
\end{align*}
This leads to the unique solution $x = \lambda = \frac{10}{13}$ and $y = \frac{15}{13}$, which is a minimum and not the maximum. Due to linearity, the constraint qualification trivially holds, and the maximum is $x = \frac{5}{2}$ and $y = 0$.
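A grid search along the constraint illustrates that the stationary point of Example 20 is the minimum, while the maximum sits at the corner (grid resolution is my own choice):

```python
# Along 2x + 3y = 5 with x, y >= 0, the Lagrangian's stationary point
# (x = 10/13, y = 15/13) minimizes f(x, y) = x^2 + y^2; the maximum is at (5/2, 0).

def f_on_constraint(x):
    y = (5 - 2 * x) / 3              # solve the constraint for y
    return x ** 2 + y ** 2

xs = [i * 2.5 / 1000 for i in range(1001)]   # x in [0, 5/2] keeps y >= 0
values = [f_on_constraint(x) for x in xs]

x_min = xs[values.index(min(values))]
x_max = xs[values.index(max(values))]

assert abs(x_min - 10 / 13) < 0.01   # interior stationary point: a minimum
assert x_max == 2.5                  # maximum at the corner x = 5/2, y = 0
```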

4.1.4 Sufficiency

Similar to unconstrained optimization, the Lagrangian needs to satisfy several conditions to ensure the optimality of solutions.

Theorem 4.1. If the objective function $f(x)$ is concave (convex) and the constraints $g_i(x) - b_i$, $i = 1, \dots, m$, convex (concave), then the Lagrangian

\[ L(x, \lambda_1, \dots, \lambda_m) = f(x) - \sum_{i=1}^{m} \lambda_i (g_i(x) - b_i) \]
is concave (convex).

As the critical point of a concave (convex) function is a maximum (minimum), this is sufficient (see Theorem 3.3). A more complicated condition is necessary, if just the critical point is to be checked. Firstly, the bordered Hessian needs to be constructed. It has the Hessian of the Lagrangian with respect to the variables of the objective function in the bottom right corner, $0$s in the top left and the first order derivatives of the binding constraints in the other corners:
\[ BH(L(\cdot)) = \begin{bmatrix} 0 & \nabla g_E(x^*)' \\ \nabla g_E(x^*) & D^2 L(x^*) \end{bmatrix} \]

Theorem 4.2. Let $m$ denote the number of binding constraints and $n$ the size of the vector $x$. Then $n + m$, with $n + m > 2m$, is the size of the bordered Hessian at $x^*$. Then $x^*$ is a minimum (maximum), if the last $n - m$ leading principal minors all have the sign of $(-1)^m$ (alternate their sign, where the smallest one to check has the sign $(-1)^{m+1}$).

Remark 13. If $m = 0$, the theorem reduces to the rule for the normal Hessian. Also, with this method it cannot be determined whether a point is a maximum or minimum in a two variable case with two binding constraints.

If there is one constraint and two variables, it is sufficient to check the determinant of the bordered Hessian. If the determinant is positive, it is a maximum, if it is negative, it is a minimum.

Remark 14. Similarly to unconstrained optimization, the sufficiency conditions might determine a point to be indefinite, even if it is a maximum or minimum.

4.1.5 Linear optimization

If both the constraint and the objective function are linear, there is an easier solution method than the Lagrangian. In particular, this always leads to corner solutions, provided the gradient of the constraints and the gradient of the objective function are linearly independent. If the gradients are linearly dependent, there is an infinite number of solutions to the optimization problem.

4.1.6 Monotonic increasing transformations

An important property of optimization is that any strictly monotonic increasing transformation of the objective function will lead to the same solution. This can simplify the constrained optimization problem, for example by rendering the objective function additive. In particular, optimizing $(h \circ f)(x) = h(f(x))$ or $f(x)$ will lead to the same optimal points, provided $h(\cdot)$ is a strictly monotonic increasing function.

Example 21. Assume a general Cobb-Douglas function of the form $u(x_1, x_2) = x_1^c x_2^d$. In this case, it might be more convenient to use the log of the objective function, $\tilde{u}(x_1, x_2) = c \ln(x_1) + d \ln(x_2)$. Since the log is a strictly monotonic increasing transformation, this gives the same optimal points.
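The claim can be illustrated numerically; the grid, budget and exponents ($c = d = 1$) below are my own illustrative choices:

```python
import math

# Maximizing u(x) and ln(u(x)) over the same grid picks out the same point,
# since ln is strictly increasing. Budget: x1 + x2 = 1 with interior x1.

def u(x1, x2):
    return x1 * x2

xs = [i / 1000 for i in range(1, 1000)]          # interior grid, x2 = 1 - x1
argmax_u = max(xs, key=lambda x: u(x, 1 - x))
argmax_log = max(xs, key=lambda x: math.log(u(x, 1 - x)))

assert argmax_u == argmax_log                    # same maximizer
assert abs(argmax_u - 0.5) < 1e-9                # x1 = x2 = 1/2
```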

4.1.7 Non-differentiability

If the objective function or the constraint are not differentiable, other methods need to be found to solve the constrained optimization problem. The most common case is some variation of the example below.

Example 22. Leontief preferences show perfect complementarity among goods. They take the form $u(x_1, x_2) = \min[ax_1, bx_2]$ and are not differentiable. They were used before Cobb-Douglas preferences became prevalent. To maximize, one simply sets $ax_1 = bx_2$ and makes sure that the constraint is binding.
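Solving $ax_1 = bx_2$ together with a binding budget constraint gives a closed form; the sketch below (function name and numbers are my own) checks it:

```python
# For u = min(a*x1, b*x2), combine the kink condition a*x1 = b*x2 with the
# binding budget p1*x1 + p2*x2 = m, which yields x1 = b*m / (b*p1 + a*p2).

def leontief_demand(a, b, p1, p2, m):
    x1 = b * m / (b * p1 + a * p2)
    x2 = a * x1 / b                  # from a*x1 = b*x2
    return x1, x2

a, b, p1, p2, m = 1.0, 2.0, 1.0, 1.0, 6.0
x1, x2 = leontief_demand(a, b, p1, p2, m)

assert a * x1 == b * x2              # kink condition holds
assert p1 * x1 + p2 * x2 == m        # budget binds
assert min(a * x1, b * x2) == 4.0    # attained utility
```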

4.2 Inequality constraints

Consider the utility maximization problem with the constraints

\begin{align*}
\max_{x_1, x_2}\ & u(x_1, x_2) = x_1^2 x_2^2 \\
\text{s.t.}\ & p_1 x_1 + p_2 x_2 \le m \\
& x_1 \ge 0 \\
& x_2 \ge 0
\end{align*}

The objective function is quasi-concave for $x_1, x_2 \in \mathbb{R}_+$, but not over the entire real line. If negative values were allowed, one could make $x_1, x_2 < 0$ and arbitrarily small, satisfying the budget constraint while increasing the objective. The Lagrangian can be used to solve this problem as well, but some further restrictions are necessary.

4.2.1 Kuhn-Tucker Conditions

The Lagrangian for both equality and inequality constraints becomes

\[ L(x, \lambda_1, \dots, \lambda_m, \mu_1, \dots, \mu_l) = f(x) - \sum_{i=1}^{m} \lambda_i (g_i(x) - b_i) - \sum_{j=1}^{l} \mu_j (h_j(x) - c_j) \]
Given that still only $0$ should be added to the objective function, there is an issue with the inequality constraints, as they might not hold with equality. To circumvent this issue, $\mu_j = 0$ is defined to hold, if the constraint is not binding. This leads to the Kuhn-Tucker conditions

\begin{align*}
\nabla L(\cdot) &= \nabla f(x) - \sum_{i=1}^{m} \lambda_i \nabla g_i(x) - \sum_{j=1}^{l} \mu_j \nabla h_j(x) = 0 \\
g_i(x) - b_i &= 0; \quad i = 1, \dots, m \\
\mu_j (h_j(x) - c_j) &= 0; \quad j = 1, \dots, l \\
\mu_j &\ge 0; \quad j = 1, \dots, l
\end{align*}

The last two conditions are also referred to as complementary slackness and dual feasibility, respectively.

Remark 15. As it might not always be clear which constraints are binding and which are not, all possible combinations of binding and non-binding constraints need to be checked, unless some combinations can be excluded.

Example 23. Looking at the problem from the beginning of this section

\begin{align*}
\max_{x_1, x_2}\ & u(x_1, x_2) = x_1^2 x_2^2 \\
\text{s.t.}\ & p_1 x_1 + p_2 x_2 \le m \quad (\mu_1) \\
& x_1 \ge 0 \quad (\mu_2) \\
& x_2 \ge 0 \quad (\mu_3)
\end{align*}

The conditions become
\begin{align*}
\frac{\partial L(\cdot)}{\partial x_1} &= 2x_1 x_2^2 - \mu_1 p_1 + \mu_2 = 0 \\
\frac{\partial L(\cdot)}{\partial x_2} &= 2x_1^2 x_2 - \mu_1 p_2 + \mu_3 = 0 \\
\mu_1 (p_1 x_1 + p_2 x_2 - m) &= 0 \\
-\mu_2 x_1 &= 0 \\
-\mu_3 x_2 &= 0
\end{align*}

Due to the objective function, it is known that corner solutions are not maxima, as having both $x_1$ and $x_2$ non-zero gives a higher value for the objective function than having either or both equal to $0$. Hence it is known that the solution will be an interior point and $\mu_2 = \mu_3 = 0$. Similarly, it is known that the objective function is strictly increasing in both $x_1$ and $x_2$, and thus the budget constraint has to be binding and $\mu_1 \neq 0$. From the other three conditions, one can solve for $x_1$, $x_2$ and $\mu_1$ to get the standard Cobb-Douglas solution
\[ x_1 = \frac{m}{2p_1}, \quad x_2 = \frac{m}{2p_2}, \quad \mu_1 = \frac{m^3}{4p_1^2 p_2^2} \]

Creating the bordered Hessian and evaluating at $x^*$,
\[ |BH(L(\cdot))| = \begin{vmatrix} 0 & p_1 & p_2 \\ p_1 & 2x_2^2 & 4x_1 x_2 \\ p_2 & 4x_1 x_2 & 2x_1^2 \end{vmatrix} = \begin{vmatrix} 0 & p_1 & p_2 \\ p_1 & \frac{m^2}{2p_2^2} & \frac{m^2}{p_1 p_2} \\ p_2 & \frac{m^2}{p_1 p_2} & \frac{m^2}{2p_1^2} \end{vmatrix} = 2m^2 - \frac{m^2}{2} - \frac{m^2}{2} = m^2 > 0, \]
and it is a maximum.
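The Kuhn-Tucker conditions and the bordered Hessian determinant of Example 23 can be verified for illustrative numeric values (the prices and income are my own choices, not from the text):

```python
# The interior candidate x1 = m/(2*p1), x2 = m/(2*p2) with mu2 = mu3 = 0
# satisfies the Kuhn-Tucker conditions, and the bordered Hessian determinant
# equals m^2 > 0, confirming a maximum.

p1, p2, m = 1.0, 2.0, 8.0            # illustrative numbers

x1, x2 = m / (2 * p1), m / (2 * p2)  # 4.0, 2.0
mu1 = m ** 3 / (4 * p1 ** 2 * p2 ** 2)
mu2 = mu3 = 0.0

assert 2 * x1 * x2 ** 2 - mu1 * p1 + mu2 == 0   # dL/dx1 = 0
assert 2 * x1 ** 2 * x2 - mu1 * p2 + mu3 == 0   # dL/dx2 = 0
assert p1 * x1 + p2 * x2 - m == 0               # budget binds (mu1 > 0)

def det3(M):
    """3x3 determinant by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

BH = [[0,  p1,           p2],
      [p1, 2 * x2 ** 2,  4 * x1 * x2],
      [p2, 4 * x1 * x2,  2 * x1 ** 2]]
assert det3(BH) == m ** 2                        # positive determinant -> maximum
```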