Genericity in Linear Algebra and Analysis with Application to Optimization
October 4, 2019

Georg Still
University of Twente

Contents

Chapter 1. Introduction and notation
  1.1. Introduction
  1.2. Summary and comments on further reading
  1.3. Notation
Chapter 2. Basis Theory
  2.1. Properties of generic sets in R^n
  2.2. Generic sets in function spaces
  2.3. Algebraic and semi-algebraic sets
    2.3.1. Properties of semi-algebraic sets
    2.3.2. Properties of algebraic sets
Chapter 3. Genericity results in linear algebra
  3.1. Basic results
  3.2. Genericity results in linear programming
Chapter 4. Genericity results in analysis
  4.1. Basic results
  4.2. Genericity results for eigenvalues of real symmetric matrices
  4.3. Genericity results for unconstrained programs
  4.4. Genericity results for systems of nonlinear equations
  4.5. Genericity results for constrained programs
Chapter 5. Manifolds, stratifications, and transversality
  5.1. Manifolds in R^n
  5.2. Stratifications and Whitney regular stratifications of sets in R^n
  5.3. Stratifications and Whitney regular stratifications of sets of matrices
  5.4. Transversal intersections of manifolds
  5.5. Transversality of mappings
Chapter 6. Genericity results for parametric problems
  6.1. Jet transversality theorems
  6.2. Genericity results for parametric families of matrices
    6.2.1. The ranks of parametric families of symmetric matrices
    6.2.2. Eigenvalues of parametric families of symmetric matrices
  6.3. Genericity results for parametric unconstrained programs
    6.3.1. Parametric unconstrained programs
    6.3.2. One-parametric unconstrained programs

  6.4. Genericity results for parametric systems of nonlinear equations
  6.5. Genericity results for parametric constrained programs
    6.5.1. Parametric constrained programs
    6.5.2. One-parametric constrained programs
  6.6. One-parametric quadratic and one-parametric linear programs
    6.6.1. Genericity results for one-parametric quadratic programs
    6.6.2. Genericity results for one-parametric linear programs
Chapter 7. Appendix: Basics from linear algebra and analysis
Bibliography
Index

CHAPTER 1

Introduction and notation

1.1. Introduction
This first section aims to provide an introduction to the topic of the booklet. We start with an illustrative example to motivate the genericity concept. Consider the problem of solving linear equations,

(1.1) (P): Ax = b,

with a matrix A ∈ R^{n×n} and a vector b ∈ R^n, n ∈ N. It is well known that this equation has a unique solution for all b if and only if the following property is met:

(1.2) Property. A is nonsingular, or equivalently, det(A) ≠ 0.

So we could ask the following question: Can we expect that for problem (P) in the "normal" (or "generic") case (to be defined later on) the "nice" property (1.2) holds? To formulate this question properly we have to define what we mean by a problem and what we mean by the "generic case". For our example problem (P) and the property in (1.2), a problem instance is given by a matrix A ∈ R^{n×n}. If we specify the "size" n of A, we can consider the whole family P (problem class P) of problems (P), and we can identify this problem class with the set of all problem instances A:

(1.3) P = {A | A ∈ R^{n×n}} ≡ R^{n×n}.

For topological results, e.g., if we speak of open sets, we need to specify some topology on R^{n×n}. Here, we choose the topology induced by the (Frobenius) norm ‖A‖ = (Σ_{i,j} a_{ij}²)^{1/2} for a matrix A = (a_{ij})_{i,j=1,...,n}.

DEFINITION 1.1. [problem, problem instance, problem class, problem set (data)] We denote, e.g., by (P), a mathematical problem of a special type and are interested in a specific property for (P). Take for example (P) in (1.1) and property (1.2). This problem and the property depend on the problem data (A in our example). If we specify the data, then we speak of an instance Q of problem (P) (Q = A in our example).

By a class P of problems (P) we mean a family of problem instances Q of a specified size. In this report we assume that the problem instances Q are elements of some real linear vector space L endowed with some norm ‖·‖_L, or some other topology. So, the problem class P can be identified with the set of instances Q (problem set in short):

P = {a subset of instances Q in L}.

We often tacitly assume that the problem size (n in the example) is fixed and simply use the notion problem to denote the problem class.

Coming back to the problem (P) in (1.1), the property in (1.2), and the problem class P in (1.3), we have to make precise what we mean if we say that property (1.2) holds generically in P. Our first definition is restricted to the case that the problem set (data set) P is a subset of a space R^n. We denote the Lebesgue measure in R^n by µ.

DEFINITION 1.2. [generic set, weakly generic set, generic property]
(a) Let be given a set S ⊂ R^n. A subset S_0 ⊂ S is called a generic set in S if
(1) the complement S \ S_0 of S_0 has Lebesgue measure zero, i.e., µ(S \ S_0) = 0, and
(2) the set S_0 is open in R^n.

The set S_0 is called weakly generic if only the first condition (1) holds. We also simply call a set S_r ⊂ S generic in S if S_r contains such a generic subset S_0 in S.
(b) Let us consider a problem class P of a problem (P), where P ⊂ R^n is open. A property is said to be generic in the problem class P if there is a generic subset P_r ⊂ P, i.e.,
(1) P \ P_r has Lebesgue measure zero in P, µ(P \ P_r) = 0, and
(2) the set P_r is open in R^n,
such that the property holds for all instances Q ∈ P_r. We then also say that the property is generically fulfilled in P. The property is said to be weakly generic if for P_r only the first condition (1) holds.

REMARK 1.1. The condition (1) in the definition above says that the property holds for almost all instances from P. The second part (2) implies that the property is stable at any instance Q ∈ P_r. This condition (2) is particularly important in practical applications. It means that near Q ∈ P_r the ("nice") property is stable wrt. (sufficiently) small perturbations of the instance (data) Q.

REMARK 1.2. For the definition of a generic set in R (or R^n) we have chosen the Lebesgue measure. Other choices are possible. We have chosen this measure since it is common in analysis and the definition of a set of Lebesgue measure zero only involves basic notions (see Definition 2.1).
If we consider another (finite) measure λ in R which is defined by a distribution function F(x) = ∫_{−∞}^{x} δ dµ with nonnegative density function δ : R → R_+ such that the Lebesgue integral ∫_{−∞}^{∞} δ dµ exists, i.e., λ(S) = ∫_S δ dµ for any Lebesgue measurable set S ⊂ R, then µ(S) = 0 implies λ(S) = 0 (see e.g., [34, 11.23 Remark]). So, for such measures λ, genericity wrt. the Lebesgue measure implies genericity wrt. the measure λ. The Lebesgue measure corresponds to a uniform distribution given by a "constant" density function.

Let us reconsider property (1.2) for problem (P) in (1.1), with given n ∈ N, and corresponding problem set P ≡ R^{n×n}. To show that the property is generic, we have to show that the subset P_r of P given by

(1.4) P_r = {A ∈ P ≡ R^{n×n} | det(A) ≠ 0}

is generic. To do so, we apply the basic result in Theorem 3.1. According to formula (7.1) the function p(A) = det(A) defines a polynomial function on R^{n×n}, which, by det(I_n) = 1, is not the zero function. So, using Theorem 3.1, the set P \ P_r = {A ∈ P | det(A) = 0} has Lebesgue measure zero in P. To show the openness condition for P_r we conclude from (7.1) that det(A) is a continuous function of A. So if det(Ā) ≠ 0 holds, i.e., Ā ∈ P_r, then there must exist some ε > 0 such that det(A) ≠ 0 holds for all A ∈ P satisfying ‖A − Ā‖ < ε. Consequently, P_r is open and we have obtained our first genericity result.
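As a small numerical illustration of this argument (a sketch only, assuming Python with numpy is available; it is not a proof), one can sample random instances A and observe that none of them is singular, and that a nonsingular matrix stays nonsingular under a small Frobenius-norm perturbation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomly sampled instances A are (practically) never singular:
dets = [np.linalg.det(rng.standard_normal((5, 5))) for _ in range(10_000)]
print(min(abs(d) for d in dets) > 0.0)           # True: no sampled matrix had det(A) = 0

# Openness/stability of P_r: a nonsingular matrix A stays nonsingular
# under a sufficiently small perturbation (continuity of det).
A = rng.standard_normal((5, 5))
B = A + 1e-8 * rng.standard_normal((5, 5))        # ||B - A|| is of order 1e-8
print(np.sign(np.linalg.det(A)) == np.sign(np.linalg.det(B)))   # True: det did not cross zero
```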

THEOREM 1.1. The subset P_r in (1.4) is generic in P and thus the property that Ax = b has a unique solution for all b ∈ R^n is generic in P.
This genericity result can equivalently be formulated as follows: the property that (for all b ∈ R^n) the linear mapping F(x) := Ax − b has a nonsingular Jacobian ∇F(x) = A is generic in P = {A | A ∈ R^{n×n}}.
Nonlinear case:
If we consider problems where general nonlinear functions are involved, things become more complicated. Let us look at the problem of solving a system of n nonlinear equations in n variables:

(1.5) (P): F(x) = 0,

where F : R^n → R^n is a function from C^1(R^n, R^n). The problem set would then be

P = {F | F ∈ C^1(R^n, R^n)} ≡ [C^1(R^n, R)]^n,

where the space C^1(R^n, R) is endowed with some topology, say a C^1_s-topology. How could we define genericity wrt. such a function space, and what could be a property similar to (1.2)? In contrast to the linear case F(x) := Ax − b, for the system (1.5) of nonlinear equations, even in the "nice" situation we may have no solution, or many different solutions (even infinitely many, see Figure 1.1). So we can only expect that "generically" the following "nice" property holds:
Property 1. At each solution x ∈ R^n of F(x) = 0 the Jacobian
(1.6) ∇F(x) is nonsingular.

Later, by applying the Implicit Function Theorem 7.2, we will see that condition (1.6) implies that any solution x of F(x) = 0 is locally unique (isolated).

FIGURE 1.1. Infinitely many solutions of F(x) = cos(x)/(x² + 1) = 0.

In Section 4.4 the following genericity result will be proven. In this theorem the set C^1(R^n, R^n) is endowed with the so-called (strong) C^1_s-topology (see Section 2.2 for details).

THEOREM 1.2. The subset P = C^1(R^n, R^n) contains a generic subset P_r (generic wrt. the C^1_s-topology), such that Property 1 is satisfied for any F ∈ P_r.
As we shall see in Section 4.4, this result is a basic result for the Newton method for solving systems (1.5). In fact this theorem says that generically the assumptions for the (local) quadratic convergence of the Newton method are met (a small sketch of the Newton iteration is given after the questions below). But before proving such a result in Section 4.4 we first have to answer the following questions:

• How to define a generic set, e.g., in C^1(R^n, R)?
Rough answer: We will choose the C^1_s-topology for C^1(R^n, R) (see Section 2.2).

• Which techniques are needed to prove the genericity results?
Rough answer: We will apply techniques from Differential Geometry or Differential Topology (cf., Section 4.1).
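The following small sketch (an illustration under the assumption that Python with numpy is available; the concrete system F is only a toy example) shows the Newton iteration whose local quadratic convergence relies precisely on the nonsingularity of the Jacobian at the solution, i.e., on Property 1:

```python
import numpy as np

def newton(F, DF, x, steps=10):
    """Plain Newton iteration for F(x) = 0; each step solves DF(x) d = -F(x)."""
    for _ in range(steps):
        x = x + np.linalg.solve(DF(x), -F(x))   # requires DF(x) to be nonsingular
    return x

# Toy system with solution x* = (1, 1) and nonsingular Jacobian DF(x*):
F  = lambda x: np.array([x[0]**2 - x[1], x[1]**3 - 1.0])
DF = lambda x: np.array([[2*x[0], -1.0],
                         [0.0, 3*x[1]**2]])
print(newton(F, DF, np.array([2.0, 2.0])))       # converges rapidly to [1. 1.]
```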

REMARK 1.3. Let us note that the stronger property,
Property 2: ∇F(x) is nonsingular for all x ∈ R^n,
cannot be expected to hold generically in C^1(R^n, R^n). Take the example F ∈ C^1(R, R) given by (cf., Figure 1.2)
F(x) = sin x with ∇F(x) = cos x.

Here, Property 1 is fulfilled: at each solution x_k = kπ, k ∈ Z, of F(x) = 0, the condition ∇F(x_k) = cos kπ = (−1)^k ≠ 0 is valid.
However, Property 2 does not hold, and we cannot expect that by small (differentiable) perturbations of F we obtain some F̂(x) ≈ sin x, F̂ ∈ C^1(R, R), such that ∇F̂(x) ≈ cos x satisfies
∇F̂(x) ≠ 0 for all x ∈ R.

FIGURE 1.2. Function F(x) = sin x satisfies F′(x) ≠ 0 at all solutions x of F(x) = 0.

A general warning: Throughout this report many genericity results in linear algebra, analysis, and optimization will be presented. We however have to give a warning. The genericity results always concern a specific class of problems (with a certain structure). The point is that a genericity result for a given problem set cannot be directly transferred to a subclass (with a specific substructure) of this problem set. Unfortunately, each subclass of a problem class which consists of data with a special structure requires a new (independent) genericity analysis. In what follows we will regularly come back to this fact. For example, the genericity results for general real (n × n)-matrices need no longer be true for matrices with a special structure such as band matrices (see Section 5.3 for a counterexample). Also the genericity results for (general) optimization problems (non-parametric in Section 4.5, or parametric in Section 6.5) are not valid for the subclass of quadratic or linear programs (see Section 6.6).

A comment on the practical relevance of genericity results: A genericity result on a problem class is a statement on the structure shared by almost all problems of the class (see also Remark 1.1). What is the relevance of such results on the generic behavior of a specific class of mathematical problems? Beyond their mathematical value they are important for all methods for solving the problems numerically. A specific solution method typically requires some regularity conditions to be satisfied for the problem at hand. So, from a practical viewpoint, it is important that the required regularity conditions are generically fulfilled within the problem class. Otherwise it is likely that the method fails for most of the problems in the class. Therefore it is crucial that a solution method is designed in such a way that it is able to cope numerically with all situations which can occur generically. Ideally the algorithm should also be able to detect situations which are not generic.

1.2. Summary and comments on further reading
We start with a short summary of the contents of this report. In Chapter 2 we present the definitions and basic properties of generic sets in the space R^n and generic sets in function spaces C^k(R^n, R). We further provide an introduction to algebraic and semi-algebraic sets. Chapter 3 performs a genericity analysis for (non-parametric) linear programs, and Chapter 4 deals with genericity results for non-parametric problems in analysis, such as systems of nonlinear equations as well as unconstrained/constrained programs.

Chapter 5 provides the basic techniques needed for the analysis of parametric problems in Chapter 6. In the latter, genericity results for parametric families of matrices and parametric families of optimization problems are presented. In particular the structure of one-parametric general programs as well as one-parametric quadratic and linear programming problems is studied in detail.
The genericity results in this report are based on (Parametric) Sard Theorems and techniques from René Thom's transversality theory in Differential Topology. We emphasize that many of the ideas and much of the material discussed in this report are taken from the landmark book [21] and its predecessors [19, 20]. These investigations on the generic behavior of unconstrained/constrained optimization problems have inspired many researchers to study the generic properties of other classes of mathematical programs. We only sum up some of these studies: In [41], genericity results for (non-parametric) semi-infinite optimization problems

(P_SIP): min_{x∈R^n} f(x) s.t. g(x, y) ≤ 0 ∀y ∈ Y,

with Y ⊂ R^m a compact index set, are presented. In [35] and [24], one-parametric semi-infinite problems are studied, leading in particular to a genericity theorem of the 8 Types. The generic behavior of so-called generalized semi-infinite programs is investigated in [15]. For genericity results for programs with complementarity constraints of the form,

(P_CC): min_{x∈R^n} f(x) s.t. g_j(x) ≤ 0, j = 1, . . . , m,
        s_i(x) · r_i(x) = 0, i = 1, . . . , l,
        s_i(x), r_i(x) ≥ 0, i = 1, . . . , l,

we refer to [38], [6] for the non-parametric case and to [7] for one-parametric (P_CC) programs. Also the generic behavior of bilevel problems of the following form has been studied in [8]:

(P_BL): min_{x,y} f(x, y) s.t. g_j(x, y) ≤ 0, j = 1, . . . , m,
        and s.t. y solves the program Q(x),

where Q(x) is the so-called lower level problem:

Q(x): min_y h(x, y) s.t. v_i(x, y) ≤ 0, i = 1, . . . , l.

Genericity results for linear bilevel problems have been proven before in [37]. A summary of the generic behavior of (non-parametric) linear conic programs is to be found in [12]. In [1] the special (non-parametric) problem of maximizing homogeneous functions on the unit simplex has been investigated from a generic viewpoint.
We emphasize that the present report only considers mathematical problems with unknowns in a space R^n. But there are also genericity results for many problems where the unknowns are functions u(x) in some Banach space, such as problems where partial differential equations are involved. Take for example the eigenvalue problem with the Laplacian

operator ∆u(x_1, x_2) = ∂²u(x_1, x_2)/∂x_1² + ∂²u(x_1, x_2)/∂x_2²: find a function u : G → R, u ≠ 0, and a value λ ∈ R such that

(E): ∆u(x_1, x_2) + λu(x_1, x_2) = 0 ∀(x_1, x_2) ∈ G and u(x_1, x_2) = 0 ∀(x_1, x_2) ∈ bd G,

where G is some given compact set in R². It has, e.g., been proven in [2] that "generically" the eigenvalue problem (E) has simple eigenvalues (see also the results for algebraic eigenvalues in the Sections 4.2, 6.2.2).

1.3. Notation
To facilitate the reading we present a list of notation used in this booklet.

N: N = {0, 1, 2, ...} or, depending on the context, N = {1, 2, ...}
e_j: the standard unit vectors in R^n
x ∈ R^n: column vector with n components x_i, i = 1, . . . , n
x^T ∈ R^n: transpose of x, row vector x^T = (x_1, . . . , x_n)
x_{J_0}: for x ∈ R^n and a subset J_0 ⊂ {1, . . . , n}, the (partial) vector x_{J_0} = (x_j, j ∈ J_0)^T
‖x‖: the Euclidean norm of x ∈ R^n, ‖x‖ = (x_1² + . . . + x_n²)^{1/2}
‖x‖_∞: the maximum norm of x ∈ R^n, ‖x‖_∞ = max_{1≤i≤n} |x_i|
B(x̄, ε): for x̄ ∈ R^n, ε > 0, the neighborhood B(x̄, ε) = {x ∈ R^n | ‖x − x̄‖ < ε}
R^n_+: set of x ∈ R^n with components x_i ≥ 0, i = 1, . . . , n
A = (a_{ij}): A ∈ R^{n×m}, a real (n × m)-matrix, with element a_{ij} in row i, column j
A_{i·}: the ith row of A
A_{·j}: the jth column of A

A_{J_0}: for an (n × m)-matrix A and a subset J_0 ⊂ {1, . . . , n}, A_{J_0} is the (partial) matrix A_{J_0} = (A_{j·})_{j∈J_0}
0: zero vector or zero matrix of appropriate dimension
‖A‖: Frobenius norm, ‖A‖ = (Σ_{ij} a_{ij}²)^{1/2}; this matrix norm satisfies ‖Ax‖ ≤ ‖A‖ ‖x‖
M^n: set of real (n × n)-matrices
S^n: set of real, symmetric (n × n)-matrices
M^{n,m}: set of real (n × m)-matrices with n rows and m columns
C^0(R^n, R): set of real valued functions f, continuous on R^n; we often simply write f ∈ C^0
C^k(R^n, R): set of functions f, k-times continuously differentiable on R^n (k ≥ 1); we often simply write f ∈ C^k
C^∞(R^n, R): set of functions f, infinitely many times continuously differentiable on R^n; we often write f ∈ C^∞
C^k_s: strong topology on C^k(R^n, R), see Definition 2.2
∇f(x): row vector, gradient of f ∈ C^1(R^n, R), ∇f(x) = (∂f(x)/∂x_1, . . . , ∂f(x)/∂x_n)
∇^T f(x): or ∇f(x)^T, column vector, gradient of f, ∇^T f(x) = (∂f(x)/∂x_1, . . . , ∂f(x)/∂x_n)^T
∇²f(x): Hessian of f ∈ C²(R^n, R), ∇²f(x) = (∂²f(x)/∂x_j∂x_i)_{i,j}
∇_x f(x, t): partial derivatives wrt. x, ∇_x f(x, t) = (∂f(x,t)/∂x_1, . . . , ∂f(x,t)/∂x_n)
∇²_x f(x, t): Hessian wrt. x of f(x, t)
∇∇^T_x f(x, t): or ∇_{(x,t)}∇^T_x f(x, t), for f : R^n × R^p → R, an n × (n + p)-matrix
‖f‖_{k,C}: norm on C^k(R^n, R) wrt. C ⊂ R^n, C compact (see (2.6)), ‖f‖_{k,C} := max_{|α|≤k} max_{x∈C} |∂^α f(x)|

x_l → x̄: given a sequence x_l ∈ R^n, l ∈ N, by x_l → x̄ we mean lim_{l→∞} x_l = x̄ or lim_{l→∞} ‖x_l − x̄‖ = 0
J(x,t): active index set at (x, t), J(x,t) = {j ∈ J | g_j(x, t) = 0}
L(x, t, µ): Lagrangean function (locally at (x, t)): L(x, t, µ) = f(x, t) + Σ_{j∈J(x,t)} µ_j g_j(x, t)
C(x,t): the cone of critical directions, see Lemma 4.5
T(x,t): tangent space of critical directions, see (4.21)
N_x M: normal space of manifold M at x ∈ M, see Def. 5.2
T_x M: tangent space of manifold M at x ∈ M, see Def. 5.2
P(t): parametric program, see (6.30)
F(t): feasible set of P(t)
v(t): value function of P(t)
S(t): set of (global) minimizers of P(t)
cl S: closure of the set S
int S: interior of the set S
M_1 ⋔ M_2: manifolds M_1, M_2 intersect transversally, see Def. 5.6
f ⋔ M: function f meets manifold M transversally, see Def. 5.7

CHAPTER 2

Basis Theory

After the definition of generic sets in R^n (cf., Definition 1.2), the present chapter gives some facts on properties of generic sets in R^n and introduces the genericity concept in function spaces. Further, a survey of the main properties of algebraic and semi-algebraic sets is added.

2.1. Properties of generic sets in R^n
In this section we recall some basic facts. The set
q = {x ∈ R^n | |x_i − x̄_i| ≤ α/2, i = 1, . . . , n}, with some x̄ ∈ R^n and α > 0,
is called a (closed) cube in R^n of volume µ(q) := α^n.

DEFINITION 2.1. A subset S ⊂ R^n is said to be of Lebesgue measure zero (notation µ(S) = 0) if for every ε > 0 there is a countable family {q_ν}_{ν∈I} of cubes q_ν ⊂ R^n, with volumes µ(q_ν), ν ∈ I (I ⊂ N), such that
S ⊂ ∪_{ν∈I} q_ν and µ(∪_{ν∈I} q_ν) ≤ Σ_{ν∈I} µ(q_ν) < ε.

We then say that {q_ν}_{ν∈I} covers S.

LEMMA 2.1. Let {S_i}_{i∈N} be a countable family of sets S_i ⊂ R^n with µ(S_i) = 0, i ∈ N. Then
µ(∪_{i∈N} S_i) = 0.
Proof. We can choose ε > 0 and covers {q^i_ν}_{ν∈N} of S_i, i ∈ N, with cubes q^i_ν, such that µ(∪_{ν∈N} q^i_ν) < ε/2^i, i ∈ N. Then the countable family {q^i_ν}_{i,ν∈N} covers S := ∪_{i∈N} S_i and satisfies
µ(∪_{i∈N} S_i) ≤ µ(∪_{i,ν∈N} q^i_ν) < Σ_{i=1}^∞ ε/2^i = ε. □

EX. 2.1. Hyperplanes H^{n,k} = {x ∈ R^n | x_1 = . . . = x_k = 0} with 1 ≤ k ≤ n have Lebesgue measure zero.
Proof. We only give the proof for H^{2,1} = {(x_1, x_2) ∈ R² | x_1 = 0}. The extension of the proof to the general case is easy. Take the countable family of sets
S_i := {(0, x_2) | i − 1 ≤ x_2 ≤ i} ∪ {(0, x_2) | −i ≤ x_2 ≤ −(i − 1)}, i ∈ N.

The family {S_i}_{i∈N} covers the hyperplane H^{2,1}, so that by Lemma 2.1 the hyperplane has measure zero provided we prove that each set S_i has measure zero. To do so, fix i, choose k ∈ N arbitrarily, and put ε = 4/k. With the partition t_ν = (i − 1) + ν/k, ν = 1, . . . , k − 1, consider the cubes q_ν = {(x_1, x_2) | |x_1| ≤ 1/k, |x_2 − t_ν| ≤ 1/k} of side length 2/k. Then the family of cubes q_ν, ν = 1, . . . , k − 1, covers the set {(0, x_2) | i − 1 ≤ x_2 ≤ i} and we have
µ({(0, x_2) | i − 1 ≤ x_2 ≤ i}) ≤ Σ_{ν=1}^{k−1} µ(q_ν) = Σ_{ν=1}^{k−1} 4/k² < 4/k = ε.
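The covering bound can be checked numerically; the following lines (a small Python illustration of the arithmetic only, not part of the proof) evaluate the total volume 4(k − 1)/k² of the covering squares for increasing k:

```python
# Total volume of the k-1 covering squares (side length 2/k) from Ex. 2.1:
for k in (10, 100, 1000):
    total = (k - 1) * (2.0 / k) ** 2
    print(k, total)        # 0.36, 0.0396, 0.003996: tends to 0 as k grows
```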

The same holds for the second set {(0, x_2) | −i ≤ x_2 ≤ −(i − 1)} of S_i. So S_i is of measure zero. □
For continuous functions f : S ⊂ R^n → R^m it is possible to have µ(S) = 0 and µ(f(S)) > 0, even if m ≥ n. Famous examples are the "Peano space-filling curves" given by continuous surjective mappings f : [0, 1] → [0, 1]². Such a phenomenon is no longer possible if f is a C^1-function.

THEOREM 2.1. Let be given S ⊂ U ⊂ R^n with U open and µ(S) = 0 (measure in R^n). Suppose f : U → R^m is a C^1-map with m ≥ n. Then µ(f(S)) = 0 (measure in R^m).
Proof. Firstly, we note that we can assume that U is covered by a countable family of (closed) cubes q_i ⊂ U, i ∈ N, i.e., U ⊂ ∪_{i∈N} q_i. Indeed we can, e.g., take the countable set of all points r_i ∈ cl U with rational coefficients and define q_i as the cubes with center points r_i and side length α for some α > 0. Since µ(S) = 0 it follows that by putting S_i := S ∩ q_i we have µ(S_i) = 0. In view of f(S) ⊂ ∪_{i∈N} f(S_i), by Lemma 2.1 it suffices to show that µ(f(S_{i_0})) = 0 for any fixed i_0 ∈ N. Since µ(S_{i_0}) = 0, by Definition 2.1, to any ε > 0 there is a cover of S_{i_0},
S_{i_0} ⊂ ∪_{j∈N} q^j_{i_0}, µ(∪_{j∈N} q^j_{i_0}) < ε,
formed with cubes q^j_{i_0} of measures µ(q^j_{i_0}) = ε^j_{i_0}, j ∈ N. We can assume (by considering the intersection q^j_{i_0} ∩ q_{i_0}) that q^j_{i_0} ⊂ q_{i_0}, j ∈ N. Obviously, we have proven the statement if we have shown that there is a constant γ = γ_{i_0} such that for all j ∈ N:
(2.1) µ(q^j_{i_0}) ≤ ε^j_{i_0} ⇒ µ(f(q^j_{i_0})) ≤ γ ε^j_{i_0}.
Take such a (nondegenerate) cube q̂ := q^{j_0}_{i_0} ⊂ R^n and put ε̂ = ε^{j_0}_{i_0}. Suppose q̂ has side length α̂ > 0. Since µ(q̂) = α̂^n = ε̂, for the diameter d̂ = √(n α̂²) of q̂ we find d̂ = √n · ε̂^{1/n}.

Since q̂ ⊂ q_{i_0} is convex, in view of the mean value formula in Lemma 7.1 it follows with M := max_{x∈q_{i_0}} ‖∇f(x)‖ that
‖f(x) − f(y)‖ ≤ M‖x − y‖ for all x, y ∈ q̂.
Thus, for all x, y ∈ q̂,
‖f(x) − f(y)‖ ≤ M d̂ = M √n · ε̂^{1/n},
and f(q̂) is contained in a ball B ⊂ R^m of radius r := M √n · ε̂^{1/n} and volume (π^{m/2} / Γ(m/2 + 1)) r^m. Consequently, using m ≥ n and assuming ε̂ ≤ 1, we find

µ(f(q̂)) ≤ (π^{m/2} M^m n^{m/2} / Γ(m/2 + 1)) · ε̂^{m/n} ≤ (π^{m/2} M^m n^{m/2} / Γ(m/2 + 1)) · ε̂,
and (2.1) is satisfied. □
Note that the statement of Theorem 2.1 need no longer be true in the case m < n. To provide a simple counterexample, even of a C^∞-function f : S ⊂ R^n → R^m such that µ(S) = 0 but µ(f(S)) > 0, consider

f : S ⊂ R² → R, f(x_1, x_2) = x_1, S = {x ∈ R² | x_2 = 0}, with f(S) = R.
Recall the definition of generic and weakly generic sets in R^n (cf., Definition 1.2). From Lemma 2.1 we obtain

LEMMA 2.2. Let {S_i}_{i∈I}, I ⊂ N, be a (countable) family of weakly generic sets S_i ⊂ R^n. Then the intersection

S = ∩_{i∈I} S_i is also weakly generic.

Proof. By definition, the sets C_i = R^n \ S_i have measure zero. So by Lemma 2.1 also the set
R^n \ S = R^n \ (∩_{i∈I} S_i) = ∪_{i∈I} (R^n \ S_i) = ∪_{i∈I} C_i
has measure zero. □
As for the openness, we can only allow a finite intersection of open sets. As a counterexample look at the infinite family of open sets S_i = (−1/i, 1/i), i ∈ N. The intersection is however closed:

∩_{i∈N} S_i = {0}.

EX. 2.2. Let be given a finite family of open subsets S_i ⊂ R^n, i ∈ I. Then the (finite) intersection S = ∩_{i∈I} S_i is also open in R^n.
As a corollary of Lemma 2.2 and Ex. 2.2 we obtain a result that will be used repeatedly in the report.

COROLLARY 2.1. Let be given a finite family of generic subsets S_i ⊂ R^n, i ∈ I, |I| < ∞. Then also the (finite) intersection

S = ∩_{i∈I} S_i is generic in R^n.

For further definitions and basic properties of Lebesgue measure and Lebesgue integrable functions we refer to [34]. Here we only give Fubini's formula for (Lebesgue) integrable functions f(x, y), x ∈ R^n, y ∈ R^k:
(2.2) ∫_{A×B} f(x, y) dµ(R^{n+k}) = ∫_A (∫_B f(x, y) dµ(R^k)) dµ(R^n) = ∫_B (∫_A f(x, y) dµ(R^n)) dµ(R^k),
where A ⊂ R^n, B ⊂ R^k are Lebesgue measurable sets and µ(R^k) denotes the measure on R^k. We can apply this formula to prove

LEMMA 2.3. Let S ⊂ R^k be generic in R^k. Then the set
R^n × S = {(x, s) | x ∈ R^n, s ∈ S} is generic in R^{n+k}.
Proof. Weakly generic: By assumption, R^k \ S has measure zero in R^k. By formula (2.2), for the set C = (R^n × R^k) \ (R^n × S) = R^n × (R^k \ S) we find with f ≡ 1,
µ(C) = ∫_{R^n} (∫_{R^k \ S} dµ(R^k)) dµ(R^n) = ∫_{R^n} [µ(R^k \ S)] dµ(R^n) = 0.
Openness: Take (x̄, s̄) ∈ R^n × S. By assumption, there exists some ε > 0 such that
s ∈ S holds for all s ∈ R^k with ‖s − s̄‖ < ε.
Now for all (x, s) ∈ R^n × R^k satisfying ‖(x, s) − (x̄, s̄)‖ < ε, we find
‖s − s̄‖ ≤ ‖(x, s) − (x̄, s̄)‖ < ε,
i.e., s ∈ S, and thus (x, s) ∈ R^n × S. □
EX. 2.3. Let z = (x, y), x ∈ R^n, y ∈ R^k. Suppose that for any x ∈ R^n there is a weakly generic subset S_x ⊂ R^k, i.e., µ(R^k \ S_x) = 0. Then the following set is weakly generic in R^{n+k}:
S_z := ∪_{x∈R^n} ({x} × S_x).
Proof. By defining C_x := R^k \ S_x we have C_z := R^{n+k} \ S_z = ∪_{x∈R^n} ({x} × C_x). Now with the function f(x, y) := 1 if y ∈ C_x and f(x, y) := 0 otherwise, we find from (2.2),
µ(C_z) = ∫_{R^n} (∫_{R^k} f(x, y) dµ(R^k)) dµ(R^n) = ∫_{R^n} [µ(C_x)] dµ(R^n) = 0.
So, S_z ⊂ R^{n+k} is a weakly generic subset of R^{n+k}. □
We emphasize that the same argument as in the preceding proof cannot be used for the openness part. The fact that for any x ∈ R^n there is a generic (in particular open) subset S_x ⊂ R^k does not imply that the set ∪_{x∈R^n} ({x} × S_x) is open in R^{n+k}.

Open and dense vs generic. A set S ⊂ R^n such that µ(R^n \ S) = 0 must be dense. Indeed, by definition of measure zero, such a set R^n \ S cannot contain any (nonempty) open set. The converse is not true, as can be seen from the example of the set S = Q which is dense in R with measure µ(Q) = 0 and µ(R \ Q) = ∞. Interestingly enough we even have:
S ⊂ R^n open and dense does not imply µ(R^n \ S) = 0.
We provide an example: Take the (countable) set of rational numbers Q in R,
Q = {q_1, q_2, q_3, . . .} = {q_i | i ∈ N}.
This set is dense in R. Now choose ε > 0 (arbitrarily) and define the open intervals Q_i := (q_i − ε/2^i, q_i + ε/2^i) of diameter 2ε/2^i around q_i. Then the set
A := ∪_{i∈N} Q_i is open and dense
with µ(A) ≤ Σ_{i∈N} µ(Q_i) ≤ 4ε. Thus, this open, dense set A is not generic (µ(R \ A) = ∞). Later in Section 2.3.2 we will prove that if S is a semi-algebraic set then density of S alone implies that int S is generic.

2.2. Generic sets in function spaces
In this section we introduce a special topology on the sets C^k(R^n, R^m), k ≥ 1. We often cannot go into details, and we therefore refer, e.g., to [16, 18, 21] for further reading. Let X ⊂ R^n be open. Often we will take X = R^n. By C^k(X, R), k ≥ 1, we denote the space of all real-valued functions which are k-times continuously differentiable on X. C^∞(X, R) is the set of functions belonging to C^k(X, R) for any k ∈ N. As usual we identify
C^k(X, R^m) ≡ [C^k(X, R)]^m.
To motivate our choice for a special topology on C^k(X, R) (k ≥ 0), let us look again at the function f ∈ C^k(R, R) given by
f(x) = cos(x)/(1 + x²),
with infinitely many zeroes x_i, i ∈ Z (see Figure 2.1). This function is nondegenerate in the sense that at any zero x_i the derivative satisfies
f′(x_i) ≠ 0 ∀i ∈ Z.
Now, the topology on C^k(X, R) should be such that there exists a neighborhood U of f with the property that for all g ∈ U the same nondegeneracy condition holds:
(2.3) g′(x̃) ≠ 0 for all x̃ ∈ R satisfying g(x̃) = 0.
Let us for the moment restrict ourselves to a compact subset B = [−a, a], a > 0 fixed (see Figure 2.1). Seen as functions in C^1(B, R), it is clear that we can choose some ε > 0, ε small enough, such that for all functions g ∈ C^1(B, R) satisfying
(2.4) |f(x) − g(x)|, |f′(x) − g′(x)| < ε ∀x ∈ B

it holds
g′(x̃) ≠ 0 for all x̃ ∈ B satisfying g(x̃) = 0.

FIGURE 2.1. Function f(x) = cos(x)/(x² + 1).

However this is no longer true if we take B = R. Obviously for any constant ε > 0 there exists some ε_0, 0 < ε_0 < ε, such that the function g̃ = f + ε_0 has the form as indicated in Figure 2.2, with a point x̃_0 satisfying g̃(x̃_0) = g̃′(x̃_0) = 0, and (2.3) does not hold at x̃_0.

FIGURE 2.2. Function f(x) = cos(x)/(x² + 1) + ε_0.

So the ε-neighborhood of f as defined in (2.4) does not capture the behaviour of "g ≈ f near ∞". To circumvent this problem, instead of defining an ε-neighborhood of f by a constant ε > 0 (as in (2.4)), we define a φ-neighborhood of f by a function φ : R → R, φ ∈ C^0(R, R), φ(x) > 0 ∀x ∈ R. Choosing, e.g., φ(x) = 1/(2(1 + x²)), and a neighborhood of f via
(2.5) |f(x) − g(x)|, |f′(x) − g′(x)| < 1/(2(1 + x²)) ∀x ∈ R,
for all functions g ∈ C^1(R, R) satisfying this relation also the nondegeneracy condition (2.3) will hold.
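The effect can be checked on a grid (a rough numerical illustration only, assuming Python with numpy; a finite grid cannot, of course, replace the statement for all x ∈ R): the constant shift f + ε_0 stays uniformly close to f, yet it leaves the φ-neighborhood defined by φ(x) = 1/(2(1 + x²)) for large |x|:

```python
import numpy as np

f    = lambda x: np.cos(x) / (1 + x**2)      # the example function
eps0 = 0.01
g    = lambda x: f(x) + eps0                  # constant perturbation as in Figure 2.2
phi  = lambda x: 0.5 / (1 + x**2)             # weight function from (2.5)

x = np.linspace(-50.0, 50.0, 200001)
print(np.max(np.abs(f(x) - g(x))))            # = eps0: uniformly small on all of the grid
print(np.any(np.abs(f(x) - g(x)) >= phi(x)))  # True: the phi-tube is violated once 1 + x^2 >= 1/(2*eps0)
```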

In this report, instead of considering functions on compact sets, we prefer to develop our theory for functions C^k(R^n, R) defined on the whole R^n. To this end we introduce the so-called strong topology on C^k(R^n, R) (cf., also [21, Def. 6.2.8]). Recall that for α = (α_1, . . . , α_n) ∈ N^n, |α| = Σ_{i=1}^n α_i, the expression ∂^α f(x) denotes the α-partial derivative
∂^α f(x) = ∂^{|α|} f(x) / (∂x_n^{α_n} · · · ∂x_1^{α_1}).

DEFINITION 2.2. For fixed k ∈ N, a basis of open sets (neighborhoods) for the (strong) C^k_s-topology on C^k(R^n, R) is given by all sets U^k_{φ,f},
U^k_{φ,f} = {g ∈ C^k(R^n, R) | |∂^α f(x) − ∂^α g(x)| < φ(x), ∀x ∈ R^n, ∀α, |α| ≤ k},
where φ is a function in C^0_+(R^n, R) := {φ ∈ C^0(R^n, R) | φ(x) > 0 ∀x ∈ R^n} and f ∈ C^k(R^n, R).
In an obvious way, for ℓ ≥ 0, ℓ ≤ k ≤ ∞, a C^ℓ_s-topology can be defined on C^k(R^n, R). A C^∞_s-topology on C^∞(R^n, R) is defined by all neighborhoods U^k_{φ,f}, k ∈ N, φ ∈ C^0_+(R^n, R), f ∈ C^∞(R^n, R).
The C^k_s-topology on [C^k(R^n, R)]^m is defined as the product topology. More precisely, for F = (f_1, . . . , f_m) ∈ [C^k(R^n, R)]^m, the neighborhoods U^k_{φ,F} are defined by (see Ex. 2.4)
U^k_{φ,F} := ⊗_{j=1}^m U^k_{φ,f_j}.

EX. 2.4. Let F = (f_1, . . . , f_m) ∈ [C^k(R^n, R)]^m. Show that for given φ_j ∈ C^0_+(R^n, R), j = 1, . . . , m, we have
U^k_{φ,F} := ⊗_{j=1}^m U^k_{φ,f_j} ⊂ ⊗_{j=1}^m U^k_{φ_j,f_j},
if we define
φ(x) := min_{1≤j≤m} φ_j(x) ∈ C^0_+(R^n, R).

REMARK 2.1. We notice that if a subset S of C^∞(R^n, R) is dense in C^∞(R^n, R) wrt. the C^k_s-topology (k ∈ N), then S is dense in C^∞(R^n, R) wrt. the C^ℓ_s-topology for all ℓ ∈ N, ℓ ≤ k.
Indeed, density of S wrt. the C^k_s-topology means that, given any f ∈ C^∞(R^n, R), for any neighborhood U^k_{φ,f} (i.e., for any φ ∈ C^0_+(R^n, R)) there exists some f_φ ∈ S such that
|∂^α f(x) − ∂^α f_φ(x)| < φ(x) ∀x ∈ R^n, ∀α, |α| ≤ k.
Then clearly also for all ℓ ≤ k it follows that
|∂^α f(x) − ∂^α f_φ(x)| < φ(x) ∀x ∈ R^n, ∀α, |α| ≤ ℓ,
i.e., f_φ ∈ U^ℓ_{φ,f}. Note that such a statement is not true for openness.

REMARK 2.2. A warning: The spaces C^k(R^n, R) endowed with the C^k_s-topology are not topological vector spaces. By definition, in a topological vector space the mappings
C^k(R^n, R) × C^k(R^n, R) → C^k(R^n, R), (f, g) → f + g,
R × C^k(R^n, R) → C^k(R^n, R), (λ, f) → λf,

should be continuous wrt. the C^k_s-topology (see, e.g., [21, Rem. 6.1.10]). The first mapping is continuous (proof as an exercise) but the second is not. Take, e.g., f ≡ 1, λ = 0, f̄ := λ · f = 0, the sequences λ_n = 1/n, f_n := λ_n · f, and the function φ(x) = e^{−x²}. Then λ_n → λ, but f_n = λ_n · f never enters the neighborhood U^k_{φ,f̄} of f̄, so scalar multiplication is not continuous. In other words, wrt. C^k_s,

f_n ↛ f̄ = 0.

It can be shown (see [21, p. 306]) that for all k ∈ N ∪ {∞} the spaces
C^k(R^n, R) endowed with the C^k_s-topology are Baire spaces,
i.e., they have the following property: If A_i ⊂ C^k(R^n, R), i ∈ N, are C^k_s-open and C^k_s-dense, then
A = ∩_{i∈N} A_i is C^k_s-dense.

We can now define what we mean by genericity in function spaces (see, e.g., [14, p. 127], [21, p. 306]).

DEFINITION 2.3. [generic set in C^k(R^n, R)]
Let X be a Baire space (such as C^k(R^n, R) with the C^k_s-topology). Then a subset A ⊂ X is called generic in X (or residual in X, cf., [18, p. 74]) if A contains a countable intersection of open and dense sets A_i ⊂ X. In particular a generic set is dense but need not be open. Note that by definition, in a Baire space, a countable intersection of generic sets is still a generic set.
We emphasize that in particular, a subset A ⊂ C^k(R^n, R) (endowed with the C^k_s-topology) is generic in C^k(R^n, R) if A is open and dense in C^k(R^n, R) wrt. the C^k_s-topology.
Later on, we will regularly make use of the following norm ‖·‖_{k,C} for f ∈ C^k(R^n, R), resp. F = (f_1, . . . , f_m) ∈ [C^k(R^n, R)]^m, k ∈ N, and compact C ⊂ R^n:
(2.6) ‖f‖_{k,C} := max_{|α|≤k} max_{x∈C} |∂^α f(x)|, ‖F‖_{k,C} := max_{1≤j≤m} ‖f_j‖_{k,C}.

2.3. Algebraic and semi-algebraic sets
This section presents definitions and basic properties of algebraic and semi-algebraic sets. Algebraic sets are solution sets of polynomial equations, and semi-algebraic sets are defined by polynomial equalities and inequalities. A polynomial function on R^n of degree ≤ k is a function p : R^n → R of the form
p(x) = Σ_{|α|≤k} c_α x^α,
where the sum is taken over all vectors α = (α_1, . . . , α_n) ∈ N^n with |α| := Σ_i α_i ≤ k, and x^α stands for x^α = x_1^{α_1} x_2^{α_2} · · · x_n^{α_n}. We write p ≠ 0 if at least one of the coefficients c_α is not equal to zero, or equivalently if p(x) ≠ 0 holds for some x ∈ R^n.

EX. 2.5. Show that for polynomials we have:
p(x) = 0 ∀x in a nonempty open subset U of R^n ⇒ p(x) = 0 ∀x ∈ R^n.
Proof. Let the open set U ⊂ R^n contain a point x̄. By Taylor expansion, p has the representation (α! = α_1! · · · α_n!)
p(x) = Σ_{|α|≤k} (∂^α p(x̄)/α!) (x − x̄)^α.
In view of p(x) = 0, x ∈ U, it follows that ∂p(x̄)/∂x_j = lim_{h→0} (p(x̄ + h e_j) − p(x̄))/h = 0, j = 1, . . . , n, and further, in the same way, ∂^α p(x̄) = 0 for all |α| ≤ k. So we find p ≡ 0. □

DEFINITION 2.4. A set S ⊂ R^n is called algebraic if it can be written as
(2.7) S = {x ∈ R^n | p_1(x) = 0, . . . , p_k(x) = 0},
with a finite number of polynomial functions p_j : R^n → R, j = 1, . . . , k.
A set S ⊂ R^n is called semi-algebraic if it can be written as a finite union of sets of the following form:
(2.8) V = {x ∈ R^n | p_1(x) = 0, . . . , p_k(x) = 0; q_1(x) > 0, . . . , q_m(x) > 0},
with a finite number of polynomial functions p_j, q_ℓ : R^n → R, j = 1, . . . , k; ℓ = 1, . . . , m.

We present two examples. Denote by M^n the set of all real (n × n)-matrices, M^n ≡ R^{n×n}.
• The set S_1 = {A ∈ M^n | det(A) = 0} is algebraic, as p(A) = det(A) (cf., (7.1)) is a polynomial function.
• The complement S_2 = S_1^c,
S_2 = {A ∈ M^n | det(A) ≠ 0},

is a semi-algebraic set, as S_2 can be written as a union of semi-algebraic sets, S_2 = {A ∈ M^n | det(A) > 0} ∪ {A ∈ M^n | −det(A) > 0}.
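That det(A) is indeed a polynomial in the n² entries of A can be made explicit with a computer algebra system; a short sketch (assuming Python with sympy is available) for n = 3:

```python
import sympy as sp

n = 3
entries = sp.symbols(f'a0:{n*n}')               # the n^2 matrix entries as variables
A = sp.Matrix(n, n, entries)
p = sp.expand(A.det())                          # det(A) as a polynomial in the entries
print(p)                                        # a0*a4*a8 - a0*a5*a7 - a1*a3*a8 + ...
print(sp.Poly(p, *entries).total_degree())      # 3 (= n): det is a polynomial of degree n
```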

2.3.1. Properties of semi-algebraic sets.

EX. 2.6. The sets V in Definition 2.4 of semi-algebraic sets may also contain finitely many inequality constraints h_1(x) ≥ 0, . . . , h_s(x) ≥ 0.
Hint: Use {x | h(x) ≥ 0} = {x | h(x) = 0} ∪ {x | h(x) > 0}.

EX. 2.7. Let S_1, S_2 ⊂ R^n be semi-algebraic. Then the intersection S_1 ∩ S_2 is semi-algebraic.

Proof. Let S_i = ∪_{ν_i∈J_i} V^i_{ν_i}, i = 1, 2, with finite index sets J_i (V^i_{ν_i} of the form (2.8)). Obviously for each pair ν_1 ∈ J_1, ν_2 ∈ J_2, the set V^1_{ν_1} ∩ V^2_{ν_2} is semi-algebraic (take the =, >'s of both sets V^1_{ν_1}, V^2_{ν_2}). The statement then follows by the formula

S_1 ∩ S_2 = (∪_{ν_1∈J_1} V^1_{ν_1}) ∩ (∪_{ν_2∈J_2} V^2_{ν_2}) = ∪_{ν_1∈J_1, ν_2∈J_2} (V^1_{ν_1} ∩ V^2_{ν_2}). □
By definition, for a polynomial p the set p^{−1}(0) is algebraic. More generally we have

EX. 2.8. Let S ⊂ R^n be a semi-algebraic set, and let h : R^p → R^n be a polynomial function. Then the inverse image h^{−1}(S) = {y ∈ R^p | h(y) ∈ S} is a semi-algebraic subset of R^p.
Hint: E.g., for the case S = {x ∈ R^n | p(x) = 0} we find h^{−1}(S) = {y ∈ R^p | h(y) ∈ S} = {y ∈ R^p | (p ◦ h)(y) = 0}.

REMARK 2.3. By definition, algebraic sets are closed, whereas semi-algebraic sets may be neither closed nor open.

Recall the example Q ⊂ R which shows that density of a set S ⊂ R does not imply weak genericity (nor openness). So a dense subset S ⊂ R^n can be "far" from being generic. However, if S is semi-algebraic, then density of S alone implies genericity of int S.

THEOREM 2.2. Let S ⊂ R^n be a semi-algebraic set. If S is dense in R^n then R^n \ S has Lebesgue measure zero in R^n. Moreover, the set int S is a generic subset of R^n.
Proof. We provide an elementary proof due to David Preinerstorfer (private communication) which is only based on Theorem 3.1. By definition, the semi-algebraic set is given by S = ∪_{i=1}^ℓ S_i, with
S_i = {x ∈ R^n | p_{ij}(x) = 0, j = 1, . . . , k_i; q_{ij}(x) > 0, j = 1, . . . , m_i}, i = 1, . . . , ℓ,
where k_i and m_i are positive integers, p_{ij} and q_{ij} are polynomials, and S_i ≠ ∅ for all i. We next show that
bd(S) ⊆ ∪_{i=1}^ℓ bd(S_i) has Lebesgue measure zero.

For i such that p_{ij} ≢ 0 for at least one j we have

bd(S_i) ⊆ cl(S_i) ⊆ p_{ij}^{−1}(0),
which by Theorem 3.1 satisfies µ(bd(S_i)) = 0. For all other i, we have

S_i = {x ∈ R^n | q_{ij}(x) > 0 ∀j = 1, . . . , m_i},
with q_{ij} ≢ 0 for all j as S_i ≠ ∅. One can easily show that

bd(S_i) ⊆ ∪_{j=1}^{m_i} q_{ij}^{−1}(0),
and therefore also in this case µ(bd(S_i)) = 0. Thus we have µ(bd(S)) = 0. Moreover, for any dense set S ⊂ R^n the complement of int(S) coincides with bd(S), i.e.,
(2.9) S dense ⇒ (R^n \ int(S)) = bd(S).
This proves that the set int(S) is a generic subset of R^n. To prove (2.9) we only note that for dense S we have cl(S) = R^n, so that
bd(S) := cl(S) \ int(S) = R^n \ int(S). □
We also present some further, deeper results on semi-algebraic sets.
Other properties of semi-algebraic sets. Let S ⊂ R^n be semi-algebraic.

(1) Then the sets cl S, int S, bd S are semi-algebraic.

(2) S has only finitely many connected components, each of which is semi-algebraic.

(3) Let A be a semi-algebraic subset of vectors (x, y) ∈ R^n × R^k. Then the projection proj_x A = {x ∈ R^n | ∃y ∈ R^k such that (x, y) ∈ A} of A onto R^n is a semi-algebraic set in R^n.

(4) [Tarski-Seidenberg] Let p : R^n → R^p be a polynomial mapping. Then the image p(S) is a semi-algebraic set.
For proofs see, e.g., [14, p. 18] for (1), (2) and [10, Cor. 2.4] for (3), (4).

Later, in Theorem 5.9, we will see that any semi-algebraic set allows a complete (stratification) partition into smooth manifolds. We add two illustrative examples. Recall that a set C ⊂ R^n is a cone if x ∈ C implies λx ∈ C for all 0 ≤ λ ∈ R. Let S^n denote the set of real symmetric (n × n)-matrices, S^n ≡ R^{(n+1)n/2}.

EX. 2.9. The set
CP^n = {A ∈ S^n | A = Σ_{i=1}^p b_i b_i^T, 0 ≤ b_i ∈ R^n, i = 1, . . . , p, for some p}
is called the cone of completely positive matrices. In the definition of CP^n, wlog. we can assume p ≤ (n+1)n/2. This follows by Caratheodory's Theorem (see [13, Theorem 3.6]), which implies that each element A in the cone CP^n ⊂ R^{(n+1)n/2} can be written as a conic combination A = Σ_{i=1}^p λ_i b_i b_i^T, λ_i ≥ 0, b_i ≥ 0, of at most p ≤ (n+1)n/2 linearly independent elements b_i b_i^T.

Show that CP^n is a semi-algebraic set in S^n.
Proof. The sets
V_p = {(A, b_1, . . . , b_p) ∈ S^n × R^{n×p} | A = Σ_{i=1}^p b_i b_i^T, 0 ≤ b_i, i = 1, . . . , p},
1 ≤ p ≤ (n+1)n/2, are semi-algebraic as solution sets of polynomial =, ≥'s. By property (3) above also the projections proj_A V_p,
proj_A V_p = {A ∈ S^n | ∃ b_i ∈ R^n such that (A, b_1, . . . , b_p) ∈ V_p}, 1 ≤ p ≤ (n+1)n/2,
are semi-algebraic. Since CP^n is the union of these sets proj_A V_p, also CP^n is semi-algebraic. □
In [5], by applying Theorem 2.2, genericity results on the so-called cp-rank of matrices in CP^n are obtained.
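For illustration, a completely positive matrix can be generated directly from the definition; the following sketch (assuming Python with numpy) builds A = Σ_i b_i b_i^T from nonnegative vectors b_i and checks two necessary properties (entrywise nonnegativity and positive semidefiniteness):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 4, 10
B = rng.random((n, p))                                  # columns b_i >= 0 entrywise
A = sum(np.outer(B[:, i], B[:, i]) for i in range(p))   # A = sum_i b_i b_i^T, so A is in CP^n

print(np.all(A >= 0))                                   # True: A is entrywise nonnegative
print(np.all(np.linalg.eigvalsh(A) >= -1e-12))          # True: A is positive semidefinite
```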

EX. 2.10. Let C ⊂ R^n be a semi-algebraic cone. Then the dual cone C^* of C,
C^* = {y ∈ R^n | c^T y ≥ 0 ∀c ∈ C},
is also semi-algebraic.
Proof. Consider the sets
S = {y ∈ R^n | ∃c ∈ C with c^T y < 0}
and
S_1 = {(y, c) ∈ R^n × R^n | c^T y < 0}, S_2 = {(y, c) ∈ R^n × R^n | c ∈ C}.

The sets S_1, S_2 are semi-algebraic and by Ex. 2.7 also the set
S_0 := S_1 ∩ S_2 = {(y, c) ∈ R^n × R^n | c ∈ C, c^T y < 0}
is semi-algebraic. In view of property (3) above, also the projection S = proj_y S_0 is semi-algebraic. Since C^* is the complement S^c of S, by property (1) above also C^* must be semi-algebraic. □
2.3.2. Properties of algebraic sets. In this subsection we consider algebraic sets, i.e., semi-algebraic sets that are defined only by equalities (see Definition 2.4). We denote by R[x] the set (ring) of all polynomials in the variable x ∈ R^n.
An algebraic set V = {x ∈ R^n | p_1(x) = 0, . . . , p_k(x) = 0} will often be denoted by V(p_1, . . . , p_k). We are interested in the geometric structure of such sets. This is the topic of the field of Algebraic Geometry. The structure of solution sets of (general) C^1-functions is investigated in Differential Geometry.

A first observation explains the difference between Algebraic and Differential Geometry. It is easy to construct C^∞-functions f ≠ 0 such that the solution set S = {x ∈ R^n | f(x) = 0} contains (nonempty) open sets; see Theorem 5.3 in Section 5.1. According to Ex. 2.5 this is not possible for polynomial functions.

We now give some illustrative examples of algebraic sets. They in particular show that in general an algebraic set need not be a manifold (see Definition 5.1). However these sets allow a partition into manifolds (see Theorem 5.9).

EX. 2.11. V = V(x_2² − x_1²(x_1 − 1)) (see Figure 2.3).

FIGURE 2.3. Algebraic set defined by x_2² − x_1²(x_1 − 1) = 0, with an isolated point at (0, 0).

EX. 2.12. V = V(p_1, p_2) with p_1(x) = x_1x_2, p_2(x) = x_1x_3.

The gradients of p_1, p_2 are ∇p_1(x) = (x_2, x_1, 0), ∇p_2(x) = (x_3, 0, x_1). At the point x̄ = (1, 0, 0) ∈ V the gradients

∇p_1(x̄) = (0, 1, 0), ∇p_2(x̄) = (0, 0, 1)
are linearly independent. So (by the Implicit Function Theorem 7.2), locally around x̄, the set V consists of a 1-dimensional manifold given by {x ∈ R³ | x_2 = x_3 = 0}. However, at the point x̃ = (0, 1, 1) ∈ V, the gradients ∇p_1(x̃) = (1, 0, 0), ∇p_2(x̃) = (1, 0, 0) are linearly dependent. Around x̃ the set V is a 2-dimensional manifold given by {x ∈ R³ | x_1 = 0}.

Ex. 2.12 shows that for algebraic sets V(p_1, . . . , p_k) the following (bad) situation is possible:
• There are algebraic sets V = V(p_1, . . . , p_k) ⊂ R^n which locally around some x̄ ∈ V define a manifold of dimension n − k but which contain other parts of higher dimension. See, e.g., Ex. 2.12.

EX. 2.13. V = V(p_1, p_2) with p_1(x) = x_1² − x_2x_3, p_2(x) = x_1(x_3 − 1).

More precise information about the structure of an algebraic set can be obtained by techniques from Algebra. We only give some more details connected with the notion of reducibility of an algebraic set and the concept of ideals. For further reading we refer to textbooks on the subject (see e.g., [3]).
Let us emphasize that the number of equations defining V alone does not give any (nontrivial) information about the algebraic set V. Indeed the set V = V(p_1, . . . , p_k) can be written equivalently as
V = {x ∈ R^n | p(x) := Σ_{i=1}^k p_i²(x) = 0}.

DEFINITION 2.5. An algebraic set V ⊂ R^n is called reducible if V can be decomposed as V = V_1 ∪ V_2 with algebraic sets V_1, V_2 satisfying V_1 ≠ V, V_2 ≠ V. An algebraic set V which is not reducible is called irreducible.

The set V(x_1x_2, x_1x_3) of Ex. 2.12 is reducible as it can be decomposed as

V = V_1 ∪ V_2 with V_1 = {x | x_1 = 0}, V_2 = {x | x_2 = x_3 = 0}.
Note however that this decomposition does not define a partition (see Definition 5.3), since V_1 ∩ V_2 = {0} is nonempty. A partition of V into manifolds is, e.g., given by

V = {x | x_1 = 0} ∪ {x | x_2 = x_3 = 0, x_1 > 0} ∪ {x | x_2 = x_3 = 0, x_1 < 0}.
However the last two sets are not algebraic but only semi-algebraic. Many other partitions are possible.

EX. 2.14. Give a decomposition into irreducible algebraic subsets of the algebraic set V in Example 2.13.
Hint: Show V = {x | x_1 = 0, x_2 · x_3 = 0} ∪ {x | x_3 = 1, x_2 = x_1²}.
One can show that the algebraic set V of Example 2.11 is irreducible.
As a consequence of the famous Hilbert Basis Theorem one can obtain the following decomposition result (see e.g., [3, 3.1.5 Proposition]).
PROPOSITION 2.1. Any algebraic set V ⊂ R^n can be decomposed as

V = V_1 ∪ . . . ∪ V_s with irreducible sets V_i.

If we remove the sets Vi that are contained in another set Vj, the decomposition obtained is unique.

Recall that the sets V_i of this decomposition need not be manifolds.

Since the same algebraic set can be defined by different sets of polynomials p_i, the question arises whether there is some sort of "basic set". To answer this, we have to consider the ideal I(V) of an algebraic set:
(2.10) I(V) := {p ∈ R[x] | p(x) = 0 holds for every x ∈ V}.

DEFINITION 2.6. A subset I of a ring R is called an ideal if:
(1) 0 ∈ I,
(2) p, q ∈ I ⇒ p + q ∈ I,
(3) p ∈ I, q ∈ R ⇒ p · q ∈ I.

Obviously the set I(V ) ⊂ R[x] in (2.10) is an ideal in R[x] (proof as Exercise).

Consider now, with explicitly given polynomials p_i, the algebraic set V(p_1, . . . , p_k). We can define the ideal
I(p_1, . . . , p_k) = {Σ_{i=1}^k p_i q_i | q_i ∈ R[x]}.

EX. 2.15. Show that I(p_1, . . . , p_k) is an ideal in R[x] satisfying

I(p_1, . . . , p_k) ⊂ I(V(p_1, . . . , p_k)).
In general equality need not hold for the ideals in Exercise 2.15. Take for example V(p_1) ⊂ R² with p_1(x) = x_1² + x_2². Then V(p_1) = {0}, and p = x_1 (or p = x_2) is in I(V(p_1)), but obviously x_1 (or x_2) are not in I(x_1² + x_2²). However, with the help of the Hilbert Basis Theorem one can prove that to each algebraic set V there is a set of polynomials generating I(V) (see e.g., [3, Section 3.1]).

PROPOSITION 2.2. Let V ⊂ R^n be an algebraic set. Then there exist polynomials P_1, . . . , P_s such that I(V) = I(P_1, . . . , P_s) and thus V = V(P_1, . . . , P_s).

EX. 2.16. Show that for an algebraic set V the relation I(V) = I(P_1, . . . , P_s) implies V = V(P_1, . . . , P_s).

Proof. Let V = V(p_1, . . . , p_k). Since p_i ∈ I(V) = I(P_1, . . . , P_s), each polynomial p_i has a representation p_i = Σ_{j=1}^s P_j q_{ji}. So x ∈ V(P_1, . . . , P_s) implies x ∈ V(p_1, . . . , p_k). On the other hand we have P_j ∈ I(V), and thus x ∈ V implies P_j(x) = 0. So we have proven V = V(p_1, . . . , p_k) = V(P_1, . . . , P_s). □
One can show, for example, that for the set V = V(x_1² + x_2²) we have I(V(x_1² + x_2²)) = I(x_1, x_2). Also the reducibility of an algebraic set V is connected with the set I(V) as follows.

DEFINITION 2.7. Let I be an ideal in a commutative ring R. Then I is called a prime ideal if
(1) I ≠ R,
(2) p, q ∈ R, p · q ∈ I ⇒ p ∈ I or q ∈ I.

Take, e.g., V = V(x_1 · x_2) and consider its ideal

I(V(x_1 · x_2)) = {p ∈ R[x] | p(x) = 0 ∀x ∈ V(x_1 · x_2)}.

Then for the polynomials x_1, x_2 we have p := x_1 · x_2 ∈ I(V(x_1 · x_2)) but x_1, x_2 ∉ I(V(x_1 · x_2)). So the ideal I(V(x_1 · x_2)) is not prime. Note that V is reducible, V = {x | x_1 = 0} ∪ {x | x_2 = 0}. The following characterisation holds.
PROPOSITION 2.3. The algebraic set V ⊂ R^n is irreducible if and only if the ideal I(V) is prime.
We end this subsection with some observations.

• Recall (example after Ex. 2.15) that in general I(p_1, . . . , p_k) is not equal to I(V(p_1, . . . , p_k)). Also, even if I(p_1, . . . , p_k) is prime, the ideal I(V(p_1, . . . , p_k)) need not be prime (i.e., V(p_1, . . . , p_k) need not be irreducible). An example is given by p = x_1²(x_1 − 1)² + x_2². Here one can show that I(p) is prime but I(V(p)) is not, and thus V(p) = {(0, 0), (1, 0)} is reducible.

• One can show that the set V(x_1²(x_1 − 1) + x_2²) is irreducible.

CHAPTER 3

Genericity results in linear algebra

3.1. Basic results
In this section we present first genericity results in linear algebra. They will be based on the following statement on the set of zeroes of polynomial functions.

THEOREM 3.1. Let p : R^N → R be a polynomial mapping, p ≠ 0. Then the set p^{−1}(0) = {x ∈ R^N | p(x) = 0} has (Lebesgue) measure zero in R^N. Moreover, since p^{−1}(0) is closed, the set R^N \ p^{−1}(0) is generic.
Proof. We refer the reader to a short, elegant proof by Richard Caron and Tim Traynor in an unpublished note [9]. □
Let us define the space M^n of real (n × n)-matrices,
M^n = {A = (a_{ij})_{i,j=1,...,n}} ≡ R^{n×n}.
A "nice" property for a matrix A in M^n is, e.g., to be nonsingular or, equivalently, to satisfy det(A) ≠ 0. In Theorem 1.1 we have already shown that
(3.1) M^n_r = {A ∈ M^n | det(A) ≠ 0} is a generic subset of M^n.
This statement is generalized in

EX. 3.1. Let M^{n,m} denote the space of real (n × m)-matrices, M^{n,m} ≡ R^{n×m}. Assume m ≥ n. Show that the set M^{n,m}_r := {A ∈ R^{n×m} | rank(A) = n} is a generic set in R^{n×m}.
Proof. Decompose the matrix A ∈ M^{n,m}, m ≥ n, as
A = (A^1, A^2), A^1 ∈ M^{n,n}, A^2 ∈ M^{n,m−n}.
Then by Theorem 1.1 the set S_1 = {A^1 ∈ R^{n×n} | det(A^1) ≠ 0} is generic in R^{n×n}. In view of Lemma 2.3 also the set S = {(A^1, A^2) ∈ R^{n×m} | det(A^1) ≠ 0} is generic in R^{n×m}. But clearly S ⊂ M^{n,m}_r. □
The next corollary shows that generically a system Ax = b with A ∈ M^n, b ∈ R^n, has a unique solution x satisfying x_i ≠ 0 for all i = 1, . . . , n.

COROLLARY 3.1. In the problem set P := {(A, b) | A ∈ M^n, b ∈ R^n} ≡ R^{n×n+n} let us consider the set

P_r := {(A, b) ∈ P | Ax = b has a unique solution x satisfying x_i ≠ 0 ∀i = 1, . . . , n}.

Then the set P_r is a generic subset of P.

Proof. Recall Cramer's rule for a solution x of Ax = b (valid if det(A) ≠ 0, see (7.3)):
x = A^{−1}b with components x_i = det(A_i(b)) / det(A),
where A_i(b) is the matrix obtained from A by replacing the i-th column A_{·i} of A by b, i.e., A_i(b) is a (general) matrix in R^{n×n}. We define the sets

S_0 := {(A, b) ∈ P | det(A) ≠ 0}, S_i := {(A, b) ∈ P | det(A_i(b)) ≠ 0}, i = 1, . . . , n.
By Theorem 1.1 each of these sets is a generic subset of P, and obviously
P_r = ∩_{i=0}^n S_i
holds, so that in view of Corollary 2.1 also P_r is generic in P. □
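A small numerical check of Corollary 3.1 (a sketch only, assuming Python with numpy is available): for a randomly drawn instance (A, b), Cramer's rule yields the unique solution, and all of its components are nonzero:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A, b = rng.standard_normal((n, n)), rng.standard_normal(n)   # a "random" instance (A, b)

detA = np.linalg.det(A)                    # generically nonzero
x = np.empty(n)
for i in range(n):
    Ai = A.copy()
    Ai[:, i] = b                           # A_i(b): replace column i of A by b
    x[i] = np.linalg.det(Ai) / detA        # Cramer's rule
print(np.allclose(A @ x, b))               # True: x solves Ax = b
print(np.all(x != 0))                      # True: generically every component is nonzero
```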

3.2. Genericity results in linear programming
In this section we apply the genericity results of Section 3.1 to Linear Programming problems (LP). In particular we show that in the generic situation linear programs are "nondegenerate" with respect to the Simplex method. Let in the sequel m, n ∈ N be given, m, n ≥ 1. Consider the following pair of primal/dual linear programs,
(P): max c^T x s.t. x ∈ F_P := {x ∈ R^n | Ax ≤ b},
(D): min b^T y s.t. y ∈ F_D := {y ∈ R^m | A^T y = c, y ≥ 0},
where A is a given real m × n-matrix and c ∈ R^n, b ∈ R^m are given vectors. The sets F_P, F_D, respectively, are called the feasible sets of (P), (D), respectively. (P) is called the primal and (D) the corresponding dual program. Throughout the section the abbreviation J := {1, 2, . . . , m} is used. We introduce the optimal values v_P, v_D of (P), (D) by
(3.2) v_P := sup_{x∈F_P} c^T x, v_D := inf_{y∈F_D} b^T y,
and define v_P := −∞ if F_P = ∅, v_D := ∞ if F_D = ∅. In the sequel, A_{j·}, j = 1, . . . , m, denote the rows of A, and A_{·i}, i = 1, . . . , n, the columns. According to the next remark, wlog. we can assume that A^T has full row rank n. Then we also have n ≤ m.

REMARK 3.1. In our linear programs (P), (D) we can assume rank A^T = n. Indeed, suppose that the rows A_{·i}^T of A^T are linearly dependent, e.g., assume that the last row A_{·n}^T is a linear combination of the other rows. Applying a Gauss elimination step, by adding an appropriate combination of the rows A_{·i}^T, i = 1, . . . , n − 1, to A_{·n}^T, the last row is transformed to the zero vector. If by the same operation the value c_n is also transformed to zero, we can skip the last row of A^T y = c. If c_n is transformed to a value α ≠ 0, then the condition A^T y = c is not feasible and (see Theorem 3.2(b)) (P) is unbounded.

For any pair x ∈ F_P, y ∈ F_D we obtain the weak duality relation,
(3.3) b^T y − c^T x = y^T(b − Ax) ≥ 0, or c^T x ≤ b^T y, implying v_P ≤ v_D.
Note that from (3.3) it follows that if x ∈ F_P, y ∈ F_D satisfy c^T x = b^T y, then both must be optimal solutions of (P), (D), respectively. We summarize the well-known strong duality results in LP.

THEOREM 3.2. For the programs (P), (D) precisely one of the following three alternatives holds:

(a) Both programs are infeasible, i.e., v_P = −∞ and v_D = ∞.

(b) Precisely one is feasible. In case F_P = ∅, F_D ≠ ∅ we then have v_P = v_D = −∞, and in case F_P ≠ ∅, F_D = ∅, v_P = v_D = ∞.
(c) Both programs are feasible. Then for both (P) and (D) optimal solutions x ∈ F_P, y ∈ F_D exist, and v_P = c^T x = b^T y = v_D holds.
Proof. We refer to [13, Section 4.1] for a proof. □

By this duality result, feasible points x ∈ F_P, y ∈ F_D are optimal solutions if and only if c^T x = b^T y holds, or the equivalent complementarity condition is valid:
b^T y − c^T x = y^T(b − Ax) = 0, or y_j(b_j − A_{j·}x) = 0 ∀j ∈ J.
Such optimal solutions x, y are called strictly complementary if

(SC) for all j ∈ J: y_j = 0 ⇔ (b_j − A_{j·}x) > 0.
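The duality and complementarity relations can also be observed numerically; the following sketch (an illustration only, assuming Python with numpy and scipy; the random instance is constructed so that both (P) and (D) are feasible) solves the primal/dual pair and checks v_P = v_D and the complementarity condition:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
m, n = 6, 3
A = rng.standard_normal((m, n))
b = rng.random(m) + 1.0                    # b > 0, so x = 0 is feasible for (P)
c = A.T @ rng.random(m)                    # c = A^T y0 with y0 >= 0, so (D) is feasible

# (P): max c^T x s.t. Ax <= b  (linprog minimizes, hence -c; x is a free variable)
P = linprog(-c, A_ub=A, b_ub=b, bounds=[(None, None)] * n)
# (D): min b^T y s.t. A^T y = c, y >= 0
D = linprog(b, A_eq=A.T, b_eq=c, bounds=[(0, None)] * m)

x, y = P.x, D.x
print(np.isclose(c @ x, b @ y))            # strong duality: v_P = v_D
print(np.allclose(y * (b - A @ x), 0.0))   # complementarity: y_j (b - Ax)_j = 0 for all j
```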

Recall the abbreviation J = {1, . . . , m}. In the following, for any subset J_0 ⊂ J we define the subvector y_{J_0} of y ∈ R^m and the submatrix A_{J_0} of A ∈ R^{m×n} by
y_{J_0} = (y_j)_{j∈J_0}, A_{J_0} = (A_{j·})_{j∈J_0}.

DEFINITION 3.1. [vertex, nondegenerate vertex]

For x ∈ F_P we define the active index set J_x := {j ∈ J | A_{j·}x = b_j}. A feasible x ∈ F_P is called a primal vertex (or extreme point of F_P) if A_{J_x} has rank n, implying |J_x| ≥ n. The vertex x is called nondegenerate if

(3.4) |J_x| = n, and thus A_{J_x} is nonsingular with x as the unique solution of A_{J_x}x = b_{J_x}.

In case |J_x| > n, a vertex x is said to be degenerate.

A feasible point y ∈ F_D is called a dual vertex (or basic solution of F_D) if there exists an index set J_B ⊂ J, |J_B| = n, such that A_{J_B}^T is nonsingular (such a submatrix exists since A^T has rank n) and the vector y_{J_B} ≥ 0 solves
(3.5) A_{J_B}^T y_{J_B} = c.

So, the dual vertex y is given componentwise by y_j, j ∈ J_B, and y_j = 0, j ∈ J \ J_B. The dual vertex y is said to be nondegenerate if y_{J_B} satisfies

(3.6) y_j > 0 for all j ∈ J_B.

FIGURE 3.1. Primal vertex x (left) and nondegenerate vertex x (right).
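The notions of Definition 3.1 are easy to check computationally. The sketch below (an illustration only, assuming Python with numpy; the feasible set is a hand-made example, the unit square with one redundant constraint) computes the active index set J_x and tests rank and nondegeneracy at two vertices:

```python
import numpy as np

# F_P = {x in R^2 | Ax <= b}: the unit square, plus the redundant constraint x1 + x2 <= 2.
A = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0,  0.0],
              [ 0.0, -1.0],
              [ 1.0,  1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0, 2.0])

def active_set(x, tol=1e-10):
    return [j for j in range(len(b)) if abs(A[j] @ x - b[j]) <= tol]

for x in (np.array([0.0, 0.0]), np.array([1.0, 1.0])):
    Jx = active_set(x)
    is_vertex = np.linalg.matrix_rank(A[Jx]) == 2     # rank A_{J_x} = n?
    print(x, Jx, is_vertex, len(Jx) == 2)             # vertex? nondegenerate (|J_x| = n)?
# (0,0) is a nondegenerate vertex; (1,1) is a vertex but degenerate (three active constraints).
```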

We summarise some facts on vertex solutions.

LEMMA 3.1. (a) Let v_P = v_D be finite, i.e., optimal solutions x, y of (P), (D) exist. Then, if F_P has a vertex, the max value of (P) is (also) attained at some vertex. In this case F_D ≠ ∅, a dual vertex always exists, and the min value is (also) attained at some dual vertex.

(b) If there exists a nondegenerate optimal primal vertex x ∈ F_P, then (D) has a unique optimal solution y (and y is a vertex).

(c) If there is a nondegenerate optimal dual vertex y ∈ F_D, then (P) has a unique optimal solution x (and x is a vertex).
(d) If x, y is a pair of nondegenerate optimal vertex solutions of (P), (D), then strict complementarity (SC) holds.
Proof. (a) We refer to [4, Theorem 2.7].
(b) Recall that for a pair x, y of optimal solutions the complementarity conditions

(3.7) y_j (b − Ax)_j = 0, j = 1, . . . , m,
hold. Let now x be an optimal nondegenerate vertex solution, i.e., we have |J_x| = n, A_{J_x} is nonsingular, and

c (b − Ax)Jx = 0, (b − Ax)j > 0 ∀j ∈ Jx := J \ Jx . According to (3.7) for any optimal solution y it follows

c T yj = 0, j ∈ Jx and [AJx ] yJx = c , which uniquely determines y, and y is a dual vertex with JB = Jx. 3.2. GENERICITY RESULTS IN LINEAR PROGRAMMING 31

(c) Let y be an optimal nondegenerate vertex solution of (D). Then with the correspond- ing index set JB we have T |JB| = n, yj > 0, j ∈ JB , and [AJB ] is nonsingular . In view of (3.7) any primal optimal solution x must satisfy

AJB x = bJB which uniquely determines x. Since JB ⊂ Jx and AJB is nonsingular, x is a vertex. (d) If both optimal solutions x, y are nondegenerate (thus unique) by the arguments in (b),(c) we must have |JB| = n = |Jx|, and JB ⊂ Jx implies JB = Jx. Together with the nondegeneracy condition this is equivalent to strict complementarity. 

By definition, each primal vertex x is determined by a subset J0 ⊂ Jx with |J0| = n (such that AJ0 is nonsingular). A dual vertex y is determined by the index set JB, |JB| = n (such m that AJB is nonsingular). By noticing that there are n different possibilities to select n elements out of J = {1, . . . , m} the following is immediate.

m LEMMA 3.2. The number of vertices of (P ) and of (D) is less than or equal to n .

A common method to solve linear programs is the Simplex method. The primal version of this method proceeds as follows:

Starting from a vertex x0 of FP the algorithm generates a sequence of primal vertices xk, k = 1, 2,... (or stops with an unbounded direction). If the vertex xk is nondegenerate, then this algorithm guaranties [31, 13]. T T c xk < c xk+1 .

If this holds at all generated vertices xk, then all vertices xk must be distinct, and m since there are finitely many vertices, the algorithm must end up after at most n steps with an optimal vertex (or with an unbounded direction implying that the problem is unbounded). The same holds for the dual version. Note that at a degenerate vertex a so-called cycling can occur in the Simplex method and the method can generate infinitely many times the same non-optimal vertex. In this case, the Simplex method does not find the solution of (P ). For more details on Linear Programming the reader is referred to, e.g., [31, 13]. We are now going to analyse whether generically we can exclude the appearance of de- generate vertices in linear programming and thus generically exclude a cycling during the simplex algorithm. We first have to define the problem set properly. For given n and m, m ≥ n, a pair of problems (P ), (D) can be seen as an element of the space m,n m,n m n N P = {(A, b, c) | A ∈ M , b ∈ R , c ∈ R } ≡ R ,N := n · m + m + n . m,n Recall that a subset S ⊂ P is called generic (see Definition 1.2) if S contains a set Sr m,n such that P \ Sr has measure zero, and Sr is open. We will first consider problem (P ). 32 3. GENERICITY RESULTS IN LINEAR ALGEBRA

m,n LEMMA 3.3. There is a generic subset SP of P such that for all instances (A, b, c) ∈ SP all vertices of the corresponding primal program (P ) are nondegenerate.

Proof. By definition, a vertex x of FP is degenerate if |Jx| ≥ n+1. In view of AJx x = bJx , this implies that the

|Jx| × (n + 1)-matrix (AJx , bJx ) has rank < (n + 1) . m,n But in view of Ex. 3.1, for a generic subset S(Jx) of P this is excluded. There are only finitely many such subsets Jx ⊂ J. So in view of Corollary 2.1 the set

SP := ∩Jx⊂J S(Jx) m,n is a generic set in P . By definition, for all instances (A, b, c) ∈ SP all vertices of the corresponding primal program (P ) are nondegenerate.  We obtain the same result for the dual program.

m,n LEMMA 3.4. There is a generic subset SD of P such that for all instances (A, b, c) ∈ SD all vertices of the corresponding dual program (D) are nondegenerate.

Proof. A vertex y of FD is degenerate if with the corresponding (basic) index set JB |J | = n y ∈ n B , the solution JB R of [A ]T y = c y = 0 j ∈ J . JB JB satisfies j0 for some 0 B m,n But by Corollary 3.1, for a generic subset S(JB) of P this is excluded. Since there are only finitely many such subsets JB ⊂ J, according to Corollary 2.1, the set

SD := ∩JB ⊂J S(JB) m,n is a generic set in P . By definition, for all instances (A, b, c) ∈ SD all vertices of the corresponding dual program (D) are nondegenerate.  By defining m,n Pr = SP ∩ SD , m,n m,n from Lemma 3.3 and Lemma 3.4 we conclude that the set Pr is generic in P .

m,n m,n COROLLARY 3.2. There is a generic subset Pr ⊂ P such that for all instances m,n (A, b, c) in Pr all primal and dual vertices of the corresponding programs (P ), (D) are nondegenerate. Corollary 3.2 implies that, given a ”degenerate problem” (A, b, c), by (almost all) arbi- trarily small perturbations we obtain a “nondegenerate problem” (A, b, c), i.e., a problem m,n (A, b, c) ∈ Pr . This explains the fact, that in practice the Simplex method works well even without any anticycling procedure.

n,m We add some remarks on feasibility. We emphasize, that the generic set Pr in Corol- lary 3.2 does contain problem pairs such that (P ) or (D) are infeasible. We give a simple 3.2. GENERICITY RESULTS IN LINEAR PROGRAMMING 33

2,1 example in Pr : (P ) : max x s.t. x ≤ −1 x∈R −x ≤ −1

(D) : min −y1 − y2 s.t. y1 − y2 = 1 y∈R2 y1, y2 ≥ 0

Here, FP = ∅, i.e., (P ) is infeasible and (D) possesses one vertex y = (1, 0), which is non- degenerate. Note, that beginning with y = (1, 0) the (dual) Simplex-method immediately 1 1 finds the unbounded edge y(t) = 0 + t 1 , t ≥ 0, showing that (D) is unbounded. These properties are stable wrt. small perturbations in (A, b, c). 2,2 We also give an example in Pr with FP = ∅, and FD = ∅:

(P ) : max x1 + 2x2 s.t. x1 + x2 ≤ 1 x∈R2 −x1 − x2 ≤ −2 (3.8) 1 −1 1 (D) : min y1 − 2y2 s.t. 1 y1 + −1 y2 = 2 y∈R2 y1, y2 ≥ 0

We emphasize that the condition FP = FD = ∅ is not stable wrt. to small perturbations in the whole parameter set (A, b, c) (see [12]). Also in our example, almost all small perturbations of A make the program (P ) in (3.8) feasible (see, Figure 3.2).

x2 x2

x1 x1

x2 ≥ 2 − x1

x2 ≤ 1 − x1 F

FIGURE 3.2. Empty feasible set (left) and perturbed nonempty feasible set (right).

We add some comments on topological properties of, e.g., the set of instances (A, b, c) ∈ Pm,n such that (P ) is feasible.

LEMMA 3.5. m×(n+1) (a) The set S1 := {(A, b) ∈ R | Ax ≤ b is feasible} is neither closed nor open. So also the complement set of instances which are infeasible are neither open nor closed. m×(n+1) (b) The set S0 := {(A, b) ∈ R | Ax < b has a solution} is open and S0 = int (S1). 34 3. GENERICITY RESULTS IN LINEAR ALGEBRA

Proof. (a) We give the proof by counterexamples. To show non-closedness we consider the sequence of feasible systems in R2: 1 x ≥ 1 , x − x ≤ 0 for k ∈ . 1 1 k 2 N Obviously, the points (1, k), k ∈ N are feasible. For k → ∞ the limiting system x1 ≥ 1, x1 ≤ 0, is however infeasible. An example that shows non-openness is given by the following inequalities in R: The system x1 ≤ 0, x1 ≥ 0 is feasible but for any ε > 0 the system x1 ≤ 0, x1 ≥ ε is not.

(b) The set S0 is obviously open. Indeed, if Ax < b holds then Ax < b for all (A, b) near (A, b). To prove S0 = int (S1) it is sufficient to show that an instance (A, b) ∈ S1 \ S0 is a boundary point of S1. To do so, we show that given an instance (A, b) ∈ S1 \ S0, by an appropriate small perturbation the instance becomes infeasible. Such a pertubation is given, e.g., by (A, b) := (A, b − εe), ε > 0 (e = (1,..., 1)). Indeed, if Ax ≤ b − εe has a feasible point x then Ax < b holds, a contradiction.  CHAPTER 4

Genericity results in analysis

We are interested in generic properties of (non-parametric) problems in analysis, such as nonlinear equations, or unconstrained and constrained programs. We have to mention that much of the material in Section 4.3 and Section 4.5 is essentially taken from the book [21].

4.1. Basic results In this section we sketch basic results from analysis needed to prove the genericity results lateron. Partition of unity. Recall that the support supp f of a function f : Rn → Rm is defined as (the closed set) n supp f = cl {x ∈ R | f(x) 6= 0} .

n DEFINITION 4.1. Let {Ui}i∈I be an open cover of R , with I an arbitrary (countable) ∞ n n index set. A collection {φi}i∈I of functions φi ∈ C (R , R), φi(x) ≥ 0 ∀x ∈ R , i ∈ I, ∞ is called a C -partition of unity subordinate to {Ui}i∈I if

(1) supp φi ⊂ Ui n (2) {supp φi}i∈I is locally finite, i.e., for each x ∈ R there exists an open neighbor- hood Ux such that {i ∈ I | supp φi ∩ Ux 6= ∅} is finite. P n (3) i∈I φi(x) = 1 for all x ∈ R .

EX. 4.1. Show that condition (3) of this definition implies that also the set n {int supp φi}i∈I is an open cover of R . The following theorem assures the existence of a partition of unity and will be used regu- larly to globalize a local behavior.

n ∞ THEOREM 4.1. Every open cover {Ui}i∈I of R has a C -partition of unity subordinate to {Ui}i∈I . Proof. See, e.g., [18, 2.1 Theorem].  Concering Definition 4.1 and Theorem 4.1 we have to add an important

REMARK 4.1. We recall some notions from differential geometry. A topological vector space V is called a Hausdorff space if for any pair x, y ∈ V , x 6= y, there exist neighbor- hoods Ux of x and Uy of y satisfying Ux ∩ Uy = ∅. V is said to be paracompact if every open cover of V has an open refinement that is locally finite. The topological vector space 35 36 4. GENERICITY RESULTS IN ANALYSIS

V is called second countable if it has a countable base, i.e., if there is a countable family

{Ui}i∈N of open subsets Ui of V such that any open subset U of V can be written as union of a subset of the family {Ui}i∈N. It is well-known that the space Rn endowed with the standard topology (open Euclidean ε-balls around points) is a paracompact, second countable Hausdorff space. Further, for a locally Euclidean Hausdorff space V the following equivalence holds (see, e.g., [30, Exercise 2.15]): V is second countable ⇔ V is paracompact with countably many components We emphasize that in the proof of a C∞-partition of unity in Theorem 4.1 the paracom- n pactness of the space R (i.e., the existence of a locally finite refinement of {Ui}i∈I ) plays a crucial role. We regularly will make use of the following result.

LEMMA 4.1. Let W ⊂ Rn be a closed set and let U, W ⊂ U, be an open neighborhood of W . Let further f ∈ Ck(U, Rm), k ≥ 1, be given. (a) Then there exists a function ξ ∈ C∞(Rn, R), such that n 0 ≤ ξ(x) ≤ 1 ∀x ∈ R , ξ(x) = 1 on an open neighborhood U1 of W, supp ξ ⊂ U. ˜ k n m ˜ (b) There exists f ∈ C (R , R ) such that f and f coincide on a neighborhood U1 of W . Moreover, if W is compact, then f˜ can be chosen such that supp f˜ is also compact and supp f˜ ⊂ U. Proof. See [21, Lemma 2.2.2]. 

Parametric Sard Theorem. This theorem is the basis for the density part of the generic- ity results in this chapter. Let be given a function f ∈ C1(Rn, Rs). Then 0 ∈ Rs is called a regular value of f if (4.1) ∇f(x) has (full) rank s for all x such that f(x) = 0 . The following is a useful generalisation of the Sard Theorem (see e.g., [40, Prop. 78.10] for a proof).

THEOREM 4.2. [Parametric Sard Theorem] Let h : Q × P ⊂ Rq × Rp → Rs, (x, y) 7→ h(x, y), be a Ck-mapping with k > max{0, q − s} and with open sets Q ⊂ Rq,P ⊂ Rp. If 0 ∈ Rs is a regular value of h, then for almost all y ∈ P the value 0 is a regular value of the function h1(x) := h(x, y). Note for the case s > q, in Theorem 4.2, that then the fact that 0 is a regular value of h1(x) = h(x, y) (for fixed y) implies that the equation h1(x) = 0 cannot possess any solution. 4.2. GENERICITY RESULTS FOR EIGENVALUES OF REAL SYMMETRIC MATRICES 37 4.2. Genericity results for eigenvalues of real symmetric matrices In this section we present some genericity results concerning eigenvalues of real valued matrices. Clearly, real matrices may have complex eigenvalues. However in this report we only are interested in real eigenvalues and corresponding real eigenvectors x. So here we restrict ourselves to symmetric matrices, i.e., matrices from Sn = {A ∈ M n | A = AT } (see Chapter 7). To find eigenpairs (x, λ), of A ∈ Sn we can consider solutions x ∈ Rn, λ ∈ R, of the system: Ax − λx = 0 (4.2) H(x, λ; A) := . xT x − 1 = 0

We say that λ ∈ R is an s-fold eigenvalue of A, if (4.2) allows s linearly independent n solutions (eigenvectors) x1, . . . , xs ∈ R wrt. the same value λ. The eigenvalue λ is called simple if λ is not a s-fold eigenvalue with s ≥ 2. Note that if λ is a double eigenvalue of A then with corresponding linearly independent eigenvectors x1, x2, for any vector

x1 + ρ(x2 − x1) x(ρ) := , ρ ∈ R , kx1 + ρ(x2 − x1)k the pair (x(ρ), λ) is a solution of (4.2) (an eigenpair of A). Consequently, an eigenvalue λ with corresponding (normalized) eigenvector x is a simple eigenvalue if

(4.3) the solution (x, λ) of (4.2) is locally unique .

By the Inverse Function Theorem 7.1 this condition holds if the Jacobian of (4.2) at (x, λ) (A fixed),  A − λI −x  (4.4) ∇ H(x, λ; A) = is nonsingular . (x,λ) 2xT 0 By applying the Parametric Sard Theorem 4.2 we now prove that the following (nice) prop- erty is generic for A ∈ Sn: Property. All eigenvalues of A ∈ Sn are simple.

n THEOREM 4.3. There is a generic subset Pr of P = S such that for any A ∈ Pr all eigenvalues λ of A are simple.

Proof. weakly generic: To apply Theorem 4.2, we have to show that at all solutions (x, λ; A) of H(x, λ; A) = 0 in (4.2) the Jacobian

 A − λI −x ∇ (Ax − λx)  ∇ H(x, λ; A) = A (x,λ;A) 2xT 0 0 38 4. GENERICITY RESULTS IN ANALYSIS has full row rank n + 1. Due to symmetry, the matrix A has the form

 a11 a12 a13 . . . a1n   a12 a22 a23 . . . a2n    A =  a13 a23 a33 . . . a3n  ,  . .   ......  a1n a2n a3n . . . ann and the Jacobian ∇A(Ax − λx) = ∇AAx reads,

a a a a  a11 a12 a13 a1n 22 23 33 nn  z}|{ z}|{ z}|{ z}|{ z}|{ z}|{ z}|{ z}|{  x1 x2 x3 ... xn 0 0 ...... 0 ...... 0     0 x1 0 ... 0 x2 x3 ...... 0 ...... 0    .  0 0 x1 ... 0 0 x2 ...... x3 ...... 0   . . .   ......  0 0 0 . . . x1 ...... xn A moment of reflection shows that in view of x 6= 0 this matrix has full row rank n. Indeed, if x1 6= 0 this is directly clear. The same holds for x2 6= 0, or x3 6= 0, etc. So n n by Theorem 4.2, for almost all A ∈ S , i.e., for a weakly generic subset Pr ⊂ S the following is true: for all A ∈ Pr at any solution (x, λ) of (4.2) the Jacobian wrt. (x, λ),  A − λI −x  ∇ H(x, λ; A) = has full row rank n + 1 . (x,λ) 2xT 0

By the arguments above (see (4.4)), for all A ∈ Pr all eigenvalues of A are simple. openness: Take A ∈ Pr arbitrarily and consider all its (real) simple eigenvalues λ1 <

. . . < λn, with corresponding (orthonormal) eigenvectors xi, i = 1, . . . , n, such that the Jacobians   A − λiI −xi T are nonsingular , i = 1, . . . , n . 2xi 0

According to the Implicit Function Theorem 7.2, to any eigenpair (xi, λi) there exists εi > 1 n+1 0, neigborhoods B(A, εi), and C -functions (xi, λi): B(A, εi) → R with xi(A) = xi, λi(A) = λi, such that for A ∈ B(A, εi):  xi(A), λi(A) are locally unique (isolated) eigenpairs of A.

Then for 0 < ε ≤ min{εi | i = 1, . . . , n}, ε small, the n eigenvalues λi(A), A ∈ B(A, ε), still satisfy

λ1(A) < . . . < λn(A) , with corresponding (orthonormal) eigenvectors xi(A), i.e., the eigenvalues λi(A) are sim- ple.  4.3. GENERICITY RESULTS FOR UNCONSTRAINED PROGRAMS 39 4.3. Genericity results for unconstrained programs For a given function f ∈ C2(Rn, R) we consider the unconstrained optimization problem to find local minimizers of f,

(P0) : min f(x) x∈Rn It is well-known that at each local minimizer x the first and second order optimality con- ditions ∇f(x) = 0 and ∇2f(x) is positive semidefinite must hold. These conditions are necessary but not sufficient for x to be a local minimizer. We however have

LEMMA 4.2. Let f ∈ C2(Rn, R) and x ∈ Rn such that ∇f(x) = 0 and ∇2f(x) is positive definite . Then x is an isolated, strict local minimizer of f. Proof. See, e.g., [13, Lemma 11.2, Ex. 11.4].  To solve (P0) so-called Newton type methods can be used. The classical version of such a method computes critical points , i.e., solutions x of ∇f(x) = 0 , via the following Newton procedure: Chose a starting point x0 and iterate 2 −1 T (4.5) xk+1 = xk − [∇ f(xk)] ∇ f(xk) , k = 0, 1,.... Under the condition that the Hessian ∇2f(x) is nonsingular at a critical point x, this itera- tion is locally quadratically convergent.

THEOREM 4.4. [Convergence result for the Newton iteration] Let f ∈ C3(Rn, R) and x ∈ Rn such that ∇f(x) = 0 and ∇2f(x) is nonsingular .

Then there exists a neighborhood Ux of x such that starting with any x0 ∈ Ux the procee- dure (4.5) converges quadratically to x, i.e., with some C > 0 we have 2 kxk+1 − xk ≤ C kxk − xk , k = 0, 1,.... Proof. We refer to, e.g., [13, Theorem 11.4] for a proof.  We will call a point x satisfying ∇f(x) = 0 and ∇2f(x) nonsingular, a nondegenerate critical point. For the genericity results, now, we restrict ourselves to function f in C∞(Rn, R). We however mention that all results remain true for functions f in Ck(Rn, R), k ≥ 2. According to the above facts, with regard to the Newton method for solving the minimation ∞ n problem (P0), the following property would be desirable for f ∈ C (R , R). 40 4. GENERICITY RESULTS IN ANALYSIS

Property. The function f belongs to the set  ∞ n n (4.6) Pr := f ∈ C (R , R) | at all solutions x ∈ R of ∇f(x) = 0 the Hessian ∇2f(x) is nonsingular .

In the sequel we will show that this set Pr is open and dense and thus generic wrt. the k Cs -topology (cf., Definition 2.2), k ≥ 2. The proof of the density part will be based on the following lemma.

LEMMA 4.3. Let f ∈ C∞(Rn, R) be given arbitrarily. Then for almost all a ∈ Rn the T perturbed function fa(x) := f(x) + a x belongs to Pr. Proof. To prove the statement, we will apply the Parametric Sard Theorem 4.2 to the equation T T H(x; a) := ∇ fa(x) = ∇ f(x) + a = 0 . n The Jacobian of H wrt. to (x; a) has the form (In is the unit matrix in M ) 2  ∇(x;a)H(x; a) = ∇ f(x) In . Obviously, this matrix has full row rank n, and by the Parametric Sard Theorem for almost all a ∈ Rn the following is true: at all solutions x of T H(x; a) = ∇ fa(x) = 0 , 2 2 the Jacobian ∇xH(x; a) = ∇ f(x) has full row rank n, i.e., ∇ f(x) is nonsingular.  We are now ready to prove the main genericity statement in unconstrained programming.

THEOREM 4.5. [Genericity result for unconstrained programs] ∞ n k The set Pr in (4.6) is open and dense in C (R , R) wrt. the Cs -toloplogy for any k ≥ 2. k This implies that Pr is also Cs -dense for k = 0, 1 (see Remark 2.1). Proof. The idea of the proof is taken from the proofs of Theorem 7.1.13 and Theo- rem 7.3.10 in [21]. We start by defining the function 2 n σf (x) := k∇f(x)k + | det ∇ f(x)| , x ∈ R . This function obviously has the following property for f ∈ C∞(Rn, R): n f ∈ Pr ⇔ σf (x) > 0 ∀x ∈ R . openness part. Let f ∈ Pr, i.e., n σf (x) > 0 for all x ∈ R . Take any x ∈ Rn (arbitrarily). Then by defining B(x, δ) := {x ∈ Rn | kx − xk < δ} we have, ηx := min σf (x) > 0 , x∈cl B(x,1) ∞ n and there exists εx > 0, such that for any g ∈ C (R , R) satisfying

kf − gkk,cl B(x,1) < εx 4.3. GENERICITY RESULTS FOR UNCONSTRAINED PROGRAMS 41

(cf., (2.6)) we have 1 (4.7) min σg(x) > ηx . x∈cl B(x,1) 2 n We now can chose a sequence of points xi ∈ R , i ∈ N, such that the open balls n {B(xi, 1)}i∈N cover R and that this cover is locally finite. Let further {θi}i∈N be a parti- tion of unity subordinate to {B(xi, 1)}i∈N (cf., Definition 4.1) and let εxi , ηxi be defined as above: for all i ∈ N we have

1 ∞ n min σg(x) > ηx ∀g ∈ C ( , ) with kf − gk < εx . i R R k,cl B(xi,1) i x∈cl B(xi,1) 2

Since {B(xi, 1)}i∈N is a locally finite cover (cf., Definition 4.1), for any i0 ∈ N the set

I(i0) = {i ∈ N | supp θi ∩ supp θi0 6= ∅} is finite. We define εi0 = mini∈I(i0) εxi and the strictly positive continuous function X ρ(x) = εiθi(x) . i∈N

For an arbitrary point x0 ∈ supp θi0 we find X X ρ(x ) = ε θ (x ) ≤ ε θ (x ) ≤ ε . 0 i i 0 xi0 i 0 xi0 i∈I(i0) i∈I(i0)

k k Consequently, by construction, the function ρ(x) defines a Cs -neighborhood Uρ,f such that k n for any g ∈ Uρ,f the function σg(x) is positive on R and thus g is contained in Pr. ∞ n k density part. Let g ∈ C (R , R) be an arbitrary function and let Uφ,g be an arbitrary k 0 n Cs -neigborhood defined by the function φ ∈ C+(R , R). ∞ n k We now construct a sequence of functions fi ∈ C (R , R) ∩ Uφ,g, i ∈ N, such that (the pointwise limit) k f := lim fi is contained in Pr ∩ Uφ,g . i→∞ n Recall the open cover {B(xi, 1)}i∈N of R , and the corresponding partition of unity {θi}i∈N n in the openness part. By Ex. 4.1 also {int supp θi}i∈N is an open cover of R and we can chose a partition {χi}i∈N of unity subordinate to {int supp θi}i∈N. Note that the sets supp θi are closed and satisfy

int supp θi ⊂ supp θi ⊂ B(xi, 1) . ∞ n Thus we can find functions ξi ∈ C (R , R) satisfying (see Lemma 4.1(a)) n 0 ≤ ξi(x) ≤ 1 ∀x ∈ R , ξi(x) = 1 on an open neighborhood Wi of supp χi, supp ξi ⊂ int supp θi . n We now start our construction with i = 1. According to Lemma 4.3 we can chose u1 ∈ R T (small enough) such that h1(x) := g(x) + u1 x is contained in Pr and T (4.8) kξ1(x)u1 xkk,supp θ1 < ε1 =: min φ(x) . x∈supp θ1 42 4. GENERICITY RESULTS IN ANALYSIS

By defining  T g(x) + ξ1(x)u1 x x ∈ B(x1, 1) f1(x) := n g(x) x ∈ R \ B(x1, 1) ∞ n k we have given a function f1 ∈ C (R , R)∩Uφ,g, and f1 is contained in Pr(supp χ1) where for a given closed C ⊂ Rn we put ∞ n 2 Pr(C) := {f ∈ C (R , R) | det(∇ f(x)) 6= 0 ∀x ∈ C with ∇f(x) = 0} . ∞ n k 2 In the same way we define f2 ∈ C (R , R) ∩ Uφ,g ∩ Pr(∪i=1supp χi) by  T f1(x) + ξ2(x)u2 x x ∈ B(x2, 1) f2(x) := n , f1(x) x ∈ R \ B(x2, 1) n where u2 ∈ R is chosen such that f2 ∈ Pr(supp χ2) and T kξ2(x)u2 xkk,supp θ2 < ε2 =: min φ(x) . x∈supp θ2

In case supp θ1 ∩ supp θ2 6= ∅, we also can force

f2(x) ∈ Pr(C) for C = supp χ1 .

With regard to the proof of the openness part of Pr this is possible by chosing u2 in the T part ξ2(x)u2 x of f2 small enough. ∞ n k In this way after s steps we have constructed a function fs ∈ C (R , R) ∩ Uφ,g,  T fs−1(x) + ξs(x)us x x ∈ B(xs, 1) fs(x) := n , fs−1(x) x ∈ R \ B(xs, 1) satisfying fs ∈ Pr(supp χs) and by openness arguments also s−1 fs ∈ Pr(∪i=1 supp χi) . n Since the cover {supp θi}i∈N is locally finite, for any x ∈ R there exists a number s(x) such that

fj1 (x) = fj2 (x) for all j1, j2 > s(x) . Thus, for any x ∈ Rn the pointwise limit

f(x) := lim fi(x) i→∞ ∞ n k is well defined and by construction f ∈ C (R , R) ∩ Uφ,g. Moreover, since {supp χi}i∈N n n constitutes a cover of R we also have f ∈ Pr(R ) = Pr.  4.4. Genericity results for systems of nonlinear equations In Section 3.1 we have shown that matrices A ∈ M m,n ≡ Rm×n (m ≤ n) generically have full rank m. In this section we generalize such a result to functions F : Rn → Rm. T For fixed m, n ∈ N, we consider for vector functions F (x) = f1(x), . . . , fm(x) with ∞ n components fj(x) ∈ C (R , R), j = 1, . . . , m, the system of m equations in n unknowns:

(4.9) (E0): F (x) = 0 or fj(x) = 0 , j = 1, . . . , m .

A desirable property for solving (E0) is 4.4. GENERICITY RESULTS FOR SYSTEMS OF NONLINEAR EQUATIONS 43

Property. The function F belongs to  ∞ n m (4.10) Pr := F ∈ C (R , R ) | at all solutions x of F (x) = 0 the Jacobian ∇F (x) has full row rank m

By using the same technique as in Section 4.3 we will show that the set Pr is open and k dense (i.e., generic) wrt. the Cs -topology for any k ≥ 1. Note that in case m > n the Jacobian ∇F has rank rank ∇F (x) ≤ min{m, n} = n < m .

So for F ∈ Pr in this case m > n, there cannot be any solution of F (x) = 0. Again the density proof is based on

LEMMA 4.4. Let F ∈ C∞(Rn, Rm) be given. Then for almost all a ∈ Rm the perturbed function F (x; a) := F (x) + a belongs to Pr. Proof. The Jacobian of F (x; a) wrt. to (x, a) has the form  ∇(x,a)F (x; a) = ∇F (x) Im , and obviously has full row rank m. So, the Parametric Sard Theorem implies that for almost all a ∈ Rm the following is true: at all solutions x of F (x; a) = F (x) + a = 0 the Jacobian ∇xF (x; a) = ∇F (x) has full row rank m.  The genericity statement wrt. problem (E0) is as follows.

THEOREM 4.6. [Genericity result for systems of equations] ∞ n m k The set Pr in (4.10) is open and dense in C (R , R ) wrt. the Cs -toloplogy for any k ≥ 1. 0 n Proof. As in the proof of Theorem 4.5 we first define a function σF (x) ∈ C (R , R) such that for F ∈ C∞(Rn, Rm) we have n (4.11) F ∈ Pr ⇔ σF (x) > 0 ∀x ∈ R . Obviously such a function is given by X (4.12) σF (x) := kF (x)k + | det σ(x)| , σ where the sum ranges over all (m × m)-submatrices σ(x) of ∇F (x). Note that ∇F (x) has rank m iff there exists at least one (m × m)-submatrix σ0(x) of ∇F (x) satisfying det σ0(x) 6= 0. openness part. Let F ∈ Pr, i.e., n σF (x) > 0 for all x ∈ R . Take any x ∈ Rn (arbitrarily). Then

ηx := min σF (x) > 0 , x∈cl B(x,1) 44 4. GENERICITY RESULTS IN ANALYSIS

∞ n m and there exists εx > 0 such that for any G ∈ C (R , R ) satisfying

kF − Gkk,cl B(x,1) < εx (cf., (2.6)) we have 1 (4.13) min σG(x) > ηx . x∈cl B(x,1) 2 n We now can chose a sequence of points xi ∈ R , i ∈ N, such that the open balls n {B(xi, 1)}i∈N cover R and that this cover is locally finite. Let further {θi}i∈N be a parti- tion of unity subordinate to {B(xi, 1)}i∈N (cf., Definition 4.1) and let εxi , ηxi be defined as above: for all i ∈ N we have

1 ∞ n m min σG(x) > ηx ∀G ∈ C ( , ) with kF − Gk < εx . i R R k,cl B(xi,1) i x∈cl B(xi,1) 2

Since {supp θi}i∈N is locally finite, for any i0 ∈ N the set I(i0) = {i ∈ N | supp θi ∩ supp θi0 6= ∅} is finite. We define εi0 = mini∈I(i0) εxi and the strictly positive continuous function X ρ(x) = εiθi(x) . i∈N

For an arbitrary point x0 ∈ supp θi0 we find X X ρ(x ) = ε θ (x ) ≤ ε θ (x ) ≤ ε . 0 i i 0 xi0 i 0 xi0 i∈I(i0) i∈I(i0)

k Consequently, by construction (supp θi ⊂ B(xi, 1)), function ρ(x) defines a Cs -neighborhood k k n Uρ,F such that for any G ∈ Uρ,F , the function σG(x) is positive on R and thus G is con- tained in Pr. ∞ n m k density part. Let G ∈ C (R , R ) be an arbitrary function and let Uφ,G be an arbitrary k 0 n Cs -neigborhood defined by the function φ ∈ C+(R , R). ∞ n m k We now construct a sequence of functions Fi ∈ C (R , R ) ∩ Uφ,G, i ∈ N, such that (the pointwise limit) k F := lim Fi is contained in Pr ∩ Uφ,G . i→∞ n Recall the open cover {B(xi, 1)}i∈N of R , and the corresponding partition of unity {θi}i∈N n in the openness part. By Ex. 4.1 also {int supp θi}i∈N is an open cover of R and we can chose an partition of unity {χi}i∈N subordinate to {int supp θi}i∈N. Note that the sets supp θi are closed and satisfy

int supp θi ⊂ supp θi ⊂ B(xi, 1) . ∞ n Thus we can find functions ξi ∈ C (R , R) satisfying (see Lemma 4.1) n 0 ≤ ξi(x) ≤ 1 ∀x ∈ R , ξi(x) = 1 on an open neighborhood Wi of supp χi, supp ξi ⊂ int supp θi . 4.4. GENERICITY RESULTS FOR SYSTEMS OF NONLINEAR EQUATIONS 45

m Starting with i = 1, according to Lemma 4.4 we can chose u1 ∈ R (small enough) such that H1(x) := G(x) + u1 is contained in Pr and

(4.14) kξ1(x)u1kk,supp θ1 < ε1 =: min φ(x) . x∈supp θ1 By defining  G(x) + ξ1(x)u1 x ∈ B(x1, 1) F1(x) := n G(x) x ∈ R \ B(x1, 1) ∞ n m k we have given a function F1 ∈ C (R , R ) ∩ Uφ,G and F1 is contained in Pr(supp χ1), where for given closed C ⊂ Rn we again put ∞ n m Pr(C) := {F ∈ C (R , R ) | rank ∇F (x) has rank m ∀x ∈ C with F (x) = 0} . ∞ n m k 2 In the same way we define F2 ∈ C (R , R ) ∩ Uφ,G ∩ Pr(∪i=1supp χi) by  F1(x) + ξ2(x)u2 x ∈ B(x2, 1) F2(x) := n , F1(x) x ∈ R \ B(x2, 1) m where u2 ∈ R is chosen such that F2 ∈ Pr(supp χ2) holds as well as

kξ2(x)u2kk,supp θ2 < ε2 =: min φ(x) , x∈supp θ2 and in case supp θ1 ∩ supp θ2 6= ∅ such that we have

F2(x) ∈ Pr(C) also for C = supp χ1 .

With regard to the proof of the openness part of Pr the latter is possible by chosing u2 in the part ξ2(x)u2 of F2 small enough. ∞ n m k In this way after s steps we have constructed a function Fs ∈ C (R , R ) ∩ Uφ,G,  Fs−1(x) + ξs(x)us x ∈ B(xs, 1) Fs(x) := n , fs−1(x) x ∈ R \ B(xs, 1) satisfying Fs ∈ Pr(supp χs) and by openness arguments also s−1 Fs ∈ Pr(∪i=1 supp χi) . n Since the cover {supp θi}i∈N is locally finite, for any x ∈ R there exists a number s(x) such that

Fj1 (x) = Fj2 (x) for all j1, j2 > s(x) . Thus for any x ∈ Rn the pointwise limit

F (x) := lim Fi(x) i→∞ ∞ n k is well defined and by construction F ∈ C (R , R) ∩ Uφ,G. Moreover, since {supp χi}i∈N n n constitutes a cover of R , we also have F ∈ Pr(R ) = Pr.  An implication of Theorem 4.6 is that generically the Newton method for solving a system ∞ n n (4.15) F (x) = 0 , where F ∈ C (R , R ) , 46 4. GENERICITY RESULTS IN ANALYSIS works well. More precisely, we consider the Newton procedure: chose a starting point n x0 ∈ R and iterate −1 (4.16) xk+1 = xk − [∇F (xk)] F (xk) , k = 0, 1,.... The well-known local convergence result for this Newton method is as follows.

THEOREM 4.7. Given F ∈ C2(Rn, Rn), let F (x) = 0 for some x ∈ Rn and suppose that the Jacobian ∇F (x) is nonsingular .

Then there exists an open neighborhood Ux of x such that starting with any x0 ∈ Ux the Newton iteration (4.16) converges quadratically to x, i.e., with some c > 0 we have 2 kxk+1 − xk ≤ c kxk − xk , k = 0, 1,.... According to this theorem, with respect to the Newton method (4.16), the nice function set in C∞(Rn, Rn) is given by  ∞ n n n (4.17) Pr = F ∈ C (R , R ) | at any x ∈ R with F (x) = 0 the Jacobian ∇F (x) is nonsingular . As a special case of Theorem 4.6 we obtain the genericity result.

COROLLARY 4.1. [Genericity result for the Newton method] ∞ n n k The set Pr in (4.17) is open and dense (thus generic) in C (R , R ) wrt. the Cs -topology for any k ≥ 1. 4.5. Genericity results for constrained programs In this section we perform a genericity analysis for nonlinear programs of the form (omit- ting equality constraints for notational simplicity), n (4.18) (P ) : min f(x) s.t. x ∈ F := {x ∈ | gj(x) ≤ 0, j ∈ J} , x R with index set J = {1, . . . , m} and feasible set F. The functions f, gj are assumed to be in C∞(Rn, R). To make explicit that (P ) is given by the problem functions f, G := T (g1, . . . , gm) we often write P (f, G), F(G) instead of (P ), F.

For x ∈ F we introduce the active index set Jx = {j ∈ J | gj(x) = 0}, and the Lagrangean function (near x) X L(x, µ) = f(x) + µjgj(x) .

j∈Jx The coefficients µj are called Lagrangean multipliers.

DEFINITION 4.2. We say that the Linear Independence Constraint Qualification (LICQ) is satisfied at x ∈ F if

∇gj(x), j ∈ Jx, are linearly independent . We now state the well-known necessary optimality conditions. 4.5. GENERICITY RESULTS FOR CONSTRAINED PROGRAMS 47

LEMMA 4.5. [Necessary optimality conditions] Let x ∈ F be a local minimizer of (P ) such that LICQ holds at x. Then there exist multipliers µj ≥ 0, j ∈ Jx (unique by LICQ), such that the Karush-Kuhn-Tucker (KKT) condition holds: X (4.19) ∇xL(x, µ) = ∇f(x) + µj∇gj(x) = 0 .

j∈Jx Moreover the following second order conditions are satisfied: T 2 d ∇xL(x, µ)d ≥ 0 for all d in Cx , where Cx is the cone of critical directions, n Cx = {d ∈ R | ∇f(x)d ≤ 0, ∇gj(x)d ≤ 0, j ∈ Jx} . Proof. For a proof we refer to [13, Theorem 12.7].  A point x ∈ F satisfying the KKT condition is called a KKT point.

EX. 4.2. Let the KKT-condition (4.19) hold at x ∈ F. Then if LICQ is satisfied at x, the multipliers µj, j ∈ J0(x), are uniquely determined.

At a KKT point x, with corresponding multipliers µj, j ∈ Jx, strict complementarity is said to hold if

(4.20) µj > 0, ∀j ∈ Jx . (SC)

It is not difficult to show that if at a KKT point x the condition SC holds, then the cone Cx coincides with the tangent space Tx (see Ex. 4.3), n (4.21) Cx = Tx := {d ∈ R | ∇gj(x)d = 0, j ∈ Jx} .

EX. 4.3. Let the KKT condition hold at x with µj ≥ 0, j ∈ Jx. Show that we have

Cx = {d | ∇gj(x)d = 0, if µj > 0, ∇gj(x)d ≤ 0, if µj = 0, j ∈ Jx} , i.e., Tx ⊂ Cx for Tx := {d | ∇gj(x)d = 0, j ∈ Jx}. If moreover SC holds, Cx equals the tangent space: Cx = Tx . The necessary optimality conditions in Lemma 4.5 are not sufficient for optimality. We however obtain the following sufficiency and stability conditions.

THEOREM 4.8. Let P (f, G) be given for (f, G) ∈ [C2(Rn, R)]1+m and let x ∈ F(G) satisfy LICQ, as well as the KKT condition (4.19) with SC and the second order condition T 2 (SOC): d ∇xL(x, µ)d > 0 ∀d ∈ Tx \{0} . (a) [Sufficient optimality conditions] Then x is a strict local minimizer and an isolated KKT point of P (f, G).

(b) [Stability result] There exist δ, ε > 0, open neighborhoods B(x, δ) of x, and Uε of (f, G) given by all (f,˜ G˜) ∈ [C2(Rn, R)]1+m satisfying ˜ ˜ k(f, G) − (f, G)k2,cl B(x,δ) < ε , 48 4. GENERICITY RESULTS IN ANALYSIS as well as a (Frechet) differentiable map x : Uε → B(x, δ) with the following property: ˜ ˜ ˜ ˜ For all (f, G) ∈ Uε the point x(f, G) is a strict local minimizer and the unique KKT point of (P (f,˜ G˜)) in cl B(x, δ). Proof. For (a) we refer to [13, Theorem 2.4,2.5]. The proof of the stability result in (b) depends on an application of the Implicit Function Theorem (in Banach spaces) to the system of KKT equations T P T ∇ f(x) + µj∇ gj(x) = 0 (4.22) H(x, µ; f, G) := j∈Jx , gj(x) = 0, j ∈ Jx

(with µj ≥ 0 and x feasible) (see also [21, Theorem 6.1.5]). It is not difficult to show that T under the assumptions of the theorem, the Hessian (recall GJx = (gj, j ∈ Jx) )  2 T  ∇xL(x, µ) ∇GJx (x) (4.23) ∇(x,µ)H(x, µ; f, G) = ∇GJx (x) 0 is nonsingular at (x, µ) and at the given problem functions (f, G) (see, Ex.4.4). 

EX. 4.4. Let be given a symmetric matrix A ∈ Sn and a matrix B ∈ Rn×k with full rank AB k. Then for the matrix M := BT 0 the following holds. (a) sT As 6= 0 ∀0 6= s ∈ ker(BT ) ⇒ M is nonsingular. One can even show ⇔ (see, e.g., [17, Lemma 2.43]). (b) Suppose sT As ≥ 0 ∀s ∈ ker(BT ) holds. Then M is nonsingular ⇔ sT As > 0 ∀0 6= s ∈ ker(BT ). Proof. The proof of (a) is straightforward. The statement (b) can be shown by using [21, Ex.3.2.22]. 

EX. 4.5. Show that x = ( √1 , 1 ) is the unique (global) minimizer of 2 2 2 2 2 (P ) : min f(x) := x1 + (x2 − 1) s.t. x2 ≤ x1, x1 ≥ 0 . Hint: Show that x satisfies the KKT conditions with LICQ, SC, and SOC.

For later purposes we also need a stability result assuring that if no KKT point exists for P (f, G) near a point x, then after a sufficiently small perturbation of (f, G) this property remains true.

LEMMA 4.6. Let (f, G) ∈ [C∞(Rn, R)]1+m and x ∈ Rn be given. Assume that there exists a neigborhood B(x, δ), δ > 0, such that LICQ holds for all x ∈ F(G) ∩ cl B(x, δ) and that no KKT point of P (f, G) is contained in cl B(x, δ). Then there exists ε > 0 and an open neigborhood Uε of (f, G) defined by ˜ ˜ ∞ n 1+m ˜ ˜ Uε = {(f, G) ∈ [C (R , R)] | k(f, G) − (f, G)k1,cl B(x,δ) < ε} , 4.5. GENERICITY RESULTS FOR CONSTRAINED PROGRAMS 49 ˜ ˜ ˜ such that for all (f, G) ∈ Uε we have: the condition LICQ holds at all x ∈ F(G) ∩ cl B(x, δ) and the set cl B(x, δ) does not contain any KKT point of P (f,˜ G˜) . Proof. By a continuity argument (similar to the following reasoning) the statement about the stability of LICQ is easily proven. Let us now assume that the statement about the ˜ ˜ stability for KKT points is false. Then there must exist a sequence of functions (fν, Gν) ∈ [C∞(Rn, R)]1+m, ν ∈ N, satisfying ˜ ˜ (4.24) k(fν, Gν) − (f, G)k1,cl B(x,δ) → 0 , ˜ ˜ such that P (fν, Gν) possesses KKT points xν in cl B(x, δ). The KKT conditions for xν read: ˜ P ν ∇fν(xν) + j∈J µj ∇[˜gj]ν(xν) = 0 (4.25) xν   ν ∈ N . g˜j ν(xν) = 0, j ∈ Jxν Wlog. (by chosing a subsequence) we can assume (use (4.24)) ˜ (4.26) xν → x˜ ∈ cl (B(x, δ) ∩ F(G)), ∇fν(xν) → ∇f(˜x), ∇[gj]ν(xν) → ∇gj(˜x) .

Since there are only finitely many subsets Jxν ⊂ J, we also can assume Jxν = J0 ⊂ Jx˜ ν for all ν (Jxν ⊂ Jx˜ holds by continuity). We now show that the multipliers µj , j ∈ J0, are uniformly bounded by some constant M: ν (4.27) 0 ≤ µj ≤ M ∀j ∈ J0, ∀ν ∈ N . ν ν Let us suppose that this is not the case. Then we must have µ0 := maxj∈J0 µj → ∞ for ν ν ν → ∞. Wlog. we can assume µ0 = µj0 for some j0 ∈ J0. By dividing the first relation ν in (4.25) by µj0 we find ˜ ν ∇fν(xν) X µj + ∇[˜g ] (x ) + ∇[˜g ] (x ) = 0 . µν j0 ν ν µν j ν ν j0 j0 j∈J0\{j0} ν ν µj µj Letting ν → ∞, in view of 0 ≤ µν ≤ 1, we have (for some subsequence) µν → µ˜j and j0 j0 thus (see (4.26)) X ∇gj0 (˜x) + µ˜j∇gj(˜x) = 0 ,

j∈J0\{j0} contradiction the fact that LICQ holds at x˜ ∈ cl (B(x, δ) ∩ F(G)). In view of (4.27) we conclude (for some subsequence of ν), ν µj → µ˜j, j ∈ J0, for ν → ∞ , and by taking in (4.25) the limit ν → ∞ we find P ∇f(˜x) + µ˜j∇gj(˜x) = 0 j∈J0 . gj(˜x) = 0, j ∈ J0

Thus, recalling J0 ⊂ Jx˜, x˜ is a KKT point of P (f, G) in cl B(x, δ) contradicting our assumption.  50 4. GENERICITY RESULTS IN ANALYSIS

An important type of solution method for computing local minimizers of P (f, G) are so- called Newton methods. These methods apply (some sort of) Newton iteration to the sys- tem (4.22) of KKT equations for P (f, G). In the proof of Theorem 4.8(b) we have seen that if a KKT point x satisfies LICQ, SC, and SOC, then the Jacobian (wrt. (x, µ), see (4.23)) of the system (4.22) is nonsingular, and according to Theorem 4.7 the Newton iteration applied to the KKT equation (4.22) is locally quadratically convergent to x. So, a desirable property is that the problem functions (f, G) of P (f, G) belong to the following ”nice set” ∞ n 1+m (4.28) Pr := {(f, G) ∈ [C (R , R)] | at all local minimizers x of P (f, G) we have LICQ, SC, and SOC} . Here, n, m ∈ N are given numbers defining the ”size” of problem P (f, G). A local mini- mizer satisfying LICQ, SC, and SOC, is called a nondegenerate minimizer.

k We will now prove that this set Pr is open and dense in the Cs -topology for any k ≥ 2. We need two auxiliary lemmas. As a corollary of Theorem 4.6 we now show that generically the LICQ condition is valid at all feasible points in F(G).

LEMMA 4.7. T ∞ n m m (a) Let G = (g1, . . . , gm) ∈ C (R , R ) be given. Then for almost all a ∈ R we have: for the perturbed function G˜(x) = G(x) + a at all feasible points in F(G˜) the condition LICQ is valid. (b) The subset of constraint functions G ∈ C∞(Rn, Rm) such that at all feasible points in ∞ n m k F(G) the condition LICQ holds is open and dense in C (R , R ) wrt. the Cs -topology for all k ≥ 1.

Proof. Chose any subset J0 ⊂ J = {1, . . . , m} and consider the feasible points x ∈ F(G) with Jx = J0, i.e, x solves the system

gj(x) G (x) := = 0 , J0 . . j∈J0 of |J0| equations in n unknowns.

Note that LICQ at x is equivalent with the condition that ∇GJ0 (x) has rank |J0|. By ∞ n |J0| Theorem 4.6 the set of functions F (x) := GJ0 (x) ∈ C (R , R ) such that LICQ holds k for all x satisfying GJ0 (x) = 0, is open and dense in the Cs -topology for any k ≥ 1. By noticing that there only exist finitely many subsets J0 of J, by taking the intersection of the corresponding sets we have proven the statement.  In the next lemma, for given problem functions (f, G) ∈ [C∞(Rn, R)]1+m, we consider ˜ ˜ ˜ T the perturbations (f, G), G = (˜g1,..., g˜m) defined by f˜(x) = f(x) + bT x (4.29) , g˜j(x) = gj(x) + αj , j ∈ J n+m depending on the parameter (b, a) = (b, α1, . . . , αm) ∈ R . 4.5. GENERICITY RESULTS FOR CONSTRAINED PROGRAMS 51

LEMMA 4.8. Let (f, G) ∈ [C∞(Rn, R)]1+m be given. Then for almost all parameters n+m ˜ ˜ (b, a) ∈ R the perturbed function (f, G) in (4.29) belongs to Pr. Proof. In view of Lemma 4.7(a) for almost all a ∈ Rm the set F(G˜) satisfies LICQ. Thus in view of Lemma 4.5 for these a all local minimizers must be KKT points. So we ˜ ˜ can consider KKT solutions (x, µ) of P (f, G) with a given active index set J0 = Jx, i.e., |J0| (x, µ), µ ∈ R+ solve the KKT equations T ˜ P T ∇ f(x) + µj∇ g˜j(x) = 0 (4.30) H(x, µ; f,˜ G˜) := j∈J0 . g˜j(x) = 0, j ∈ J0 We first show that for almost all (b, a) ∈ Rn+m at all KKT points of P (f,˜ G˜) the SC condition is met. Suppose that at a KKT point (x, µ), i.e., a solution of (4.30), SC does not hold. Wlog. we assume J0 = {1, . . . , s}. This means that for at least one j ∈ J0 (J0 6= ∅) say j = 1, the condition µ1 = 0 is satisfied. Thus the KKT point (x, µ) is a solution of the system H(x, µ; f,˜ G˜) = 0 see (4.30) with J = J (4.31) x 0 . µ1 = 0

The Jacobian of this system wrt. (x, µ, b, a) has the form (e1 is the unit vector (1, 0, .., 0) ∈ R|J0|)  2 T  ∇xL(x, µ) ∇GJ0 (x) In 0

 ∇GJ0 (x) 0 0 I|J0|  . T 0 e1 0 0 By adding an appropriate multiple of the last row to the first n rows the T T T ∇GJ0 (x) 0 ∇ g2(x) ... ∇ gs(x) submatrix 0 is transformed to 0 0 ... 0 , T e1 1 0 ... 0 while the other parts remain unchanged. Obviously the obtained matrix has full row rank n + |J0| + 1. According to the Parametric Sard Theorem, for almost all (b, a) also the Jacobian of the system (4.31) wrt. the n+|J0| variables (x, µ) has full row rank n+|J0|+1 at all solutions (x, µ) of (4.31). Consequently for almost all (b, a) solutions (x, µ) of (4.31) (with µ1 = 0) are excluded. Now we check the SOC condition (assuming that SC holds). To do so, we look at the Jacobian of H(x, µ; f,˜ G˜) wrt. to (x, µ, b, a), which has the form (recall the abbreviation T GJ0 (x) = (gj(x), j ∈ J0) )  2 T  ˜ ˜ ∇xL(x, µ) ∇GJ0 (x) In 0 ∇(x,µ,b,a)H(x, µ; f, G) = . ∇GJ0 (x) 0 0 I|J0|

Obviously, this matrix has full row rank n + |J0| and the Parametric Sard Theorem 4.2 implies that for almost all (b, a) ∈ Rn+m the following is true: at all solutions (x, µ) of (4.30) the Jacobian  2 T  ˜ ˜ ∇xL(x, µ) ∇GJ0 (x) (4.32) ∇(x,µ)H(x, µ; f, G) = ∇GJ0 (x) 0 52 4. GENERICITY RESULTS IN ANALYSIS has full rank n+|J0|. Note that by the result of Lemma 4.4 (and the observation before this |J0| lemma), for almost all a ∈ R the case |J0| > n is excluded. So we can assume |J0| ≤ n and that the matrix ∇GJ0 (x) has full rank |J0| at all solutions (x, µ) of (4.30) (due to LICQ, cf., Lemma 4.7(a)). With regard to the necessary (second order) optimality conditions in Lemma 4.5, by applying Ex 4.4 (use Tx = Cx because of SC) we conclude that for almost n+m ˜ ˜ all (b, a) ∈ R at all minimizers x of P (f, G) with Jx = J0, the condition LICQ, SC and SOC must hold. Since there only exist finitely many subsets J0 of J, by taking the corresponding intersec- tions of the perturbations (b, a), the statement of the lemma is shown.  We finally can formulate our main genericity result.

THEOREM 4.9. [Genericity result for constrained programs] ∞ n 1+m k The set Pr in (4.28) is open and dense in [C (R , R)] wrt. the Cs -toloplogy for any k ≥ 2. Proof. To avoid the technicalities involved with a function σ which characterizes the adherence to Pr as in the proof of [21, Th. 7.1.13] (see the functions σf , σF in the proofs of Theorem 4.5, 4.6) we base our proof on the stability result in Theorem 4.8, and on Lemmas 4.6,4.7. openness part. Let (f, G) ∈ Pr. Then P (f, G) has a sequence xi, i ∈ I ⊂ N (I possibly infinite) of local minimizers satisfying the conditions of Theorem 4.8 with corresponding

εxi ,B(xi, δxi ) and has no further KKT points. According to Lemma 4.7(b) we can assume ˆ that LICQ holds for F(G). So we can select an infinite sequence of points xˆi, i ∈ I ⊂ N, such that with corresponding εxˆi ,B(ˆxi, δxˆi ) the conditions in Lemma 4.6 are satisfied and n such that the sets {B(xi, δxi )}i∈I ∪ {B(ˆxi, δxˆi )}i∈Iˆ form a locally finite open cover of R . We re-arrange the points xˆi, xi into one sequence

{xi}i∈N = {xi}i∈I ∪ {xˆi}i∈Iˆ with corresponding εxi ,B(xi, δxi ) . By construction, for any i ∈ N it holds: For any (f,˜ G˜) ∈ [C∞(Rn, R)]m+1 satisfying ˜ ˜ k(f, G) − (f, G)k < εxi , k,cl B(xi,δxi ) ˜ ˜ the problem P (f, G) either has a unique local minimizer (unique KKT point) in cl B(xi, δxi ) satisfying the conditions of Theorem 4.8, or has no KKT points in cl B(xi, δxi ). We now take a partition of unity {θi}i∈N subordinate to {B(xi, δxi )}i∈N and define for any i0 ∈ N the finite set I(i0) = {i ∈ N | supp θi ∩ supp θi0 6= ∅} (finite, since the cover

{B(xi, δxi )}i∈N is locally finite). As before we put εi0 := mini∈I(i0) εxi and introduce the strictly positive continuous function X ρ(x) = εiθi(x) . i∈N

For an arbitrary point x0 ∈ supp θi0 we find X X ρ(x ) = ε θ (x ) ≤ ε θ (x ) ≤ ε . 0 i i 0 xi0 i 0 xi0 i∈I(i0) i∈I(i0) 4.5. GENERICITY RESULTS FOR CONSTRAINED PROGRAMS 53

k k Consequently, by construction, the function ρ(x) defines a Cs -neighborhood Uρ,(f,G) such ˜ ˜ k that any (f, G) ∈ Uρ,(f,G) is contained in Pr. density part. We shall argue as in the proofs of Theorems 4.5, 4.6. Let (f,ˆ Gˆ) ∈ [C∞( n, )]m+1 be an arbitrary (fixed) pair of problem functions and let U k be R R φ,(f,ˆ Gˆ) k 0 n an arbitrary Cs -neigborhood defined by φ ∈ C+(R , R) (see Definition 2.2). We again construct a sequence of functions (f ,G ) ∈ [C∞( n, )]m+1 ∩ U k , i ∈ , such that i i R R φ,(f,ˆ Gˆ) N (the pointwise limit)

k (f, G) := lim (fi,Gi) is contained in Pr ∩ U ˆ ˆ . i→∞ φ,(f,G)

To that end chose points xi ∈ N, such that {B(xi, 1)}i∈N constitutes a locally finite cover n of R , and take a corresponding partition of unity {θi}i∈N. Recall that by Ex. 4.1 also n {int supp θi}i∈N is an open cover of R and we can chose a partition of unity {χi}i∈N subordinate to {int supp θi}i∈N. In view of

int supp θi ⊂ supp θi ⊂ B(xi, 1) , ∞ n we can find functions ξi ∈ C (R , R) satisfying (see Lemma 4.1) n 0 ≤ ξi(x) ≤ 1 ∀x ∈ R , ξi(x) = 1 on an open neighborhood Wi of supp χi , supp ξi ⊂ int supp θi . n+m Starting with i = 1, according to Lemma 4.8 we can chose (b1, a1) ∈ R (small enough) ˆ T ˆ  such that f(x) + b1 x, G(x)) + a1 is contained in Pr and T  (4.33) ξ1(x) b1 x, a1 k, θ < ε1 =: min φ(x) . supp 1 x∈supp θ1 By defining  fˆ(x) + ξ (x)bT x, Gˆ(x) + ξ (x)a  x ∈ B(x , 1) f (x),G (x) := 1 1 1 1 1 , 1 1 ˆ ˆ  n f(x), G(x) x ∈ R \ B(x1, 1) we have given a function (f ,G ) ∈ [C∞( n, )]m+1 ∩ U k , which also is contained 1 1 R R φ,(f,ˆ Gˆ) n in Pr(supp χ1), where for given closed C ⊂ R we put ∞ n m+1 Pr(C) := {(f, G) ∈ [C (R , R)] | all local minimizers x ∈ C of P (f, G) are nondegenerate} . In the same way we define (f ,G ) ∈ [C∞( n, )]m+1 ∩ U k ∩ P (∪2 supp χ ) by 2 2 R R φ,(f,ˆ Gˆ) r i=1 i

 T   f1(x) + ξ2(x)b2 x, G2(x) + ξ2(x)a2 x ∈ B(x2, 1) f2(x),G2(x) :=  n , f1(x),G1(x) x ∈ R \ B(x2, 1) n+m where (b2, a2) ∈ R is chosen such that T  k ξ2(x)b2 x, ξ2(x)a2 kk,supp θ2 < ε2 =: min φ(x) , x∈supp θ2 54 4. GENERICITY RESULTS IN ANALYSIS and thus (f2,G2) ∈ Pr(supp χ2) holds, and such that in case supp θ1 ∩ supp θ2 6= ∅, we have (f2,G2) ∈ Pr(C) also for C = supp χ1 .

With regard to the proof of the openness part of Pr the latter is possible by chosing the vectors b2, a2 of (f2,G2) small enough. ∞ n m+1 In this way after s steps we have constructed a function (fs,Gs) ∈ [C (R , R)] ∩ U k , φ,(f,ˆ Gˆ)  T   fs−1(x) + ξs(x)bs x, Gs−1(x) + ξs(x)as x ∈ B(xs, 1) fs(x),Gs(x) :=  n , fs−1(x),Gs−1(x) x ∈ R \ B(xs, 1) satisfying (fs,Gs) ∈ Pr(supp χs) and by openness arguments also s−1 (fs,Gs) ∈ Pr(∪i=1 supp χi) . n Since the cover {supp θi}i∈N is locally finite, for any x ∈ R there exists a number s(x) such that   fj1 (x),Gj1 (x) = fj2 (x),Gj2 (x) for all j1, j2 > s(x) . Thus the pointwise limit   n f(x),G(x) := lim fi(x),Gi(x) , x ∈ , i→∞ R is well defined and by construction (f, G) ∈ [C∞( n, )]m+1 ∩ U k . Moreover, since R R φ,(f,ˆ Gˆ) n n {supp χi}i∈N constitutes a cover of R , we also have (f, G) ∈ Pr(R ) = Pr.  CHAPTER 5

Manifolds, stratifications, and transversality

The goal of this chapter is to provide the basic techniques and results needed to formulate and to prove the genericity results for parametric problems in the next chapter.

5.1. Manifolds in Rn Manifolds in Rn of dimension d are subsets of Rn which locally are defined as solution sets of n-d independent equations.

DEFINITION 5.1. Let k ∈ N ∪ {∞}, k ≥ 1. A subset M ⊂ Rn is called a Ck-manifold of dimension d = n − r (0 ≤ d ≤ n), or of codimension r (0 ≤ r ≤ n) (notation dim M = d or codim M = r), if the following condition CM1 is satisfied. n CM1: To any x ∈ M there exist a R -neighborhood U of x and r functions fj : U → k R, fj ∈ C (U, R), j = 1, ··· , r, such that

(i) M ∩ U = {x ∈ U | fj(x) = 0, j = 1, ··· , r} , (ii) ∇fj(x), j = 1, . . . , r, are linearly independent. n Let M1,M be manifolds in R of codimensions, r1, r, respectively. Then if M1 ⊂ M holds, we say that M1 is a submanifold of M of codimension r1 − r in M. Manifolds in Rn of dimension d can also be seen as subsets which locally look like pieces of the Cartesian d-space Rd. This is a consequence of the following equivalent condition. CM2: Let be given M ⊂ Rn, d ∈ N, 0 ≤ d ≤ n, such that to any x ∈ M there exist Rn-neighborhoods U of x, V of 0 ∈ Rn, and a map Φ: U → V satisfying with r = n − d: (i) Φ(U ∩ M) = V ∩ (0r × Rd) (ii) Φ is a (local) Ck-diffeomorphism (k ≥ 1) .

r d n Here 0 × R = {y ∈ R | y1 = y2 = ··· = yr = 0}.

THEOREM 5.1. Let be given M ⊂ Rn, d ∈ N, 0 ≤ d ≤ n, r = n − d, and k ∈ N ∪ {∞}, k ≥ 1. Then, the conditions CM1 and CM2 are equivalent.

Proof. “CM1 ⇒ CM2”: Let x ∈ M, U, fj, j = 1, . . . , r, as in CM1. By CM1 (ii) we n T T can choose n−r vectors ξr+1, . . . , ξn ∈ R , such that ∇ f1(x),..., ∇ fr(x), ξr+1, . . . , ξn, form a basis of Rn. Putting

T T T Φ(x) = f1(x), . . . , fr(x), ξr+1(x − x), . . . , ξn (x − x) 55 56 5. MANIFOLDS, STRATIFICATIONS, AND TRANSVERSALITY it follows Φ(x) = 0 and ∇Φ(x) is nonsingular. By the Inverse Function Theorem 7.1 there exist a Rn-neighbourhood Uˆ of x, Uˆ ⊂ U and a Rn-neighbourhood V of 0 such that Φ: Uˆ → V is a Ck-diffeomorphism (see Remark 7.1). Moreover by CM1 (i) we find ˆ ˆ Φ(U ∩ M) = {y = Φ(x) | x ∈ U ∩ M} = {y ∈ V | yj = fj(x) = 0, j = 1, . . . , r}.

“CM2 ⇒ CM1”: Let x ∈ M, U, V, Φ as in CM2. Putting fj = Φj, j = 1, . . . , r, it −1 r d follows by CM2 that U ∩ M = Φ (V ∩ (0 × R )) = {x ∈ U | fj(x) = 0, j = 1, . . . , r}. By Remark 7.1 the matrix ∇Φ(x) is nonsingular. Consequently, the rows ∇fj(x) = ∇Φj(x), j = 1, . . . , r, of ∇Φ(x) are linearly independent. 

REMARK 5.1. Commonly a subset M of Rn is called a d-dimensional Ck-manifold (sub- manifold of Rn) if M is locally Ck-diffeomorph to an open subset of Rd in the following sense: CM3: To any x ∈ M there exists an open subset U˜ of M containing x, and a Ck- diffeomorphism Φ:˜ U˜ → V˜ with V˜ open in Rd. This condition CM3 is equivalent with the relations in CM1 and CM2. We refer to [21, Theorem 3.1.1] for a proof. More generally, a Ck-manifold M is often introduced as a locally Euclidean, second count- able Hausdorff space (not necessarily a subset of Rn) such that CM3 holds (see, e.g., [39, Definition 5.2]). In our definition, a manifold M is always a subset (submanifold) of Rn and as such M in- herits from Rn the Hausdorff topology with a countable base (see [32, p.5, 1.2.1 Definition and following observation] Note that by these equivalent conditions CM1, CM2, and CM3, we have defined smooth manifolds of smoothness class k for k ≥ 1. We give some simple examples of manifolds.

EX. 5.1. (a) Given 0 ≤ r ≤ n, the hyperplane

n,r n H = {x ∈ R | x1 = ... = xr = 0} is a C∞-manifold of codimension r and of dimension n − r. This is clear from Definition 5.1, taking fj(x) = xj, j = 1, . . . , r.. n (b) The sphere Sn−1 in R , n n X 2 Sn−1 = {x ∈ R | xi = 1} i=1 is a C∞-manifold with codimension r = 1 and dimension n − 1. Here again, CM1 is Pn 2 satisfied. The relation f1(x) := i=1 xi − 1 = 0 defines Sn−1 globally. Note that ∇f1(x) = 2(x1, . . . , xn) 6= 0 for all x ∈ Sn−1, i.e., CM1 (ii) is valid.

(c) Any linear subspace S spanned by d given linearly independent vectors ξ1, . . . , ξd ∈ n 5.1. MANIFOLDS IN R 57

Rn, d X S = {x = αjξj , αj ∈ R}, j=1 is a C∞-manifold of dimension d, codimension n − d. In fact, by choosing n − d linearly n independent vectors ξd+1, . . . , ξn ∈ R which are perpendicular to S (i.e. perpendicular to the basis vectors ξ1, . . . , ξd of S), the set S is defined (globally) as the solution set of the n − d equations T f`(x) = ξ` x = 0 , ` = d + 1, . . . , n. n ∞ (d) Any open set M0 ⊂ R is a C -manifold of dimension n (codimension r = 0). Indeed, CM1 is valid with r = 0. (e) Let M 2 denote the set of real 2 × 2 matrices,     2 a11 a12 M = A = , aij ∈ R , a21 a22 which can be identified with R4. Then, the subset A = {A ∈ M 2 | det A = 0}\{0} is a C∞-submanifold of M 2 with dimension 3, codimension 1. To see this, consider the equation

f1(A) := detA = a11a22 − a12a21 = 0.

The gradient ∇f1(A) = (a22, −a21, −a12, a11) is not equal to 0 if A 6= 0, i.e., ∇f1(A) 6= 0 for all A ∈ A. Thus CM1 is valid.

Let M ⊂ Rn be a d-dimensional manifold. By a tangent vector wrt. M at x ∈ M we mean a vector ξ ∈ Rn, such that there is a C1-curve through x in M, with tangent vector to the curve in x equal to ξ. Since M is of dimension d there are d linearly independent directions ξ with such a property, spanning a d-dimensional space, the tangent space TxM. We will give a definition of that space based on Definition 5.1.

DEFINITION 5.2. Let M ⊂ Rn be a Ck-manifold (k ≥ 1) of dimension d (codimension r = n − d) as defined in CM1. Let x ∈ M. The linear space

NxM = span{∇fj(x), j = 1, . . . , r} is called the normal space of M at x, and the space n TxM = {ξ ∈ R | ∇fj(x)ξ = 0, j = 1, . . . , r} , is called the tangent space of M at x. By definition, dim TxM = d, dim NxM = r. The following lemma shows that the normal- and the tangent spaces in Definition 5.2 are well-defined.

LEMMA 5.1. The spaces NxM and TxM in Definition 5.2 do not depend on the (possibly different systems of) functions fj, j = 1, . . . , r, defining the manifold M around x ∈ M. 58 5. MANIFOLDS, STRATIFICATIONS, AND TRANSVERSALITY

Proof. Let x ∈ M, U, fj, j = 1, . . . , r, be as in CM1. Consider the corresponding local diffeomorphism Φ: U ∩ M → V ∩ (0r × Rd)(d = n − r) (cf., CM2 and Theorem T T  5.1). Then, with the matrices F = ∇ f1(x),..., ∇ fr(x) and X = (ξr+1, . . . , ξn) F T  (cf., the proof of Theorem 5.1) we have ∇Φ(x) = XT . We can assume that S := {ξj, j = r + 1, . . . , n} forms an orthonormal system and that it is orthogonal to NxM = T T span {∇fj(x), j = 1, . . . , r}, i.e., span S = TxM,F X = 0, and X X = Id. Suppose, ˜ we have given another system of functions fj, j = 1, . . . , r, defining M locally in U. (By making U small enough we can assume that U is the same.) Let Φ:˜ U ∩M → V ∩(0r ×Rd) T ˜ T ˜  be the corresponding diffeomorphism. Then again, with Fe := ∇ f1(x),..., ∇ fr(x) ˜ FeT  T ˜ ˜ T ˜ we can assume ∇Φ(x) = X˜ T , Fe X = 0, X X = Id. Now, it suffice to show, that the ˜ system S is also orthogonal to all gradients ∇fj(x), j = 1, . . . , r, or equivalently

(5.1) FeT X = 0. In order to show (5.1) let us consider the local diffeomorphism Φ˜ ◦ Φ−1 : V ∩ (0r × Rd) → V ∩ (0r × Rd). Since for any t(0r, y) ∈ V, y ∈ Rd, t ∈ R,(0r ∈ Rn−d) we have ˜ −1 r ˜ −1 Φ ◦ Φ (t(0 , y)) − Φ ◦ Φ (0) −1 r T r d lim = ∇(Φ˜ ◦ Φ )(0)(0 , y) ∈ 0 × R , t→0 t it follows  0   0   0  (5.2) ∇(Φ˜ ◦ Φ−1)(0) · = ∇Φ(˜ x)(∇Φ(x))−1 = Id Id B with some (nonsingular) d × d-matrix B. Putting G := (F T F )−1, we find using ∇Φ(x) = F T  XT ,  F T  · (FGX) = I or ∇Φ(x)−1 = (FGX). XT n Substituting into (5.2) gives           0 FeT 0 FeT FeT X = T · (FGX) = T X = T , B X˜ Id X˜ X˜ X i.e., (5.1). 2 In the equivalent definitions CM1, CM2, and CM3, a manifold M is locally defined around r ˜ ˜ any x ∈ M by pairs (U, {fj}j=1), (U, Φ), (U, Φ) (called charts). The neigborhoods U = ˜ ˜ ˜ Ux, U = Ux and functions fj, Φ, Φ depend on x. Thus, the collection of all neighborhoods Ux,

[ (5.3) Ux is an open cover of M (possibly uncountable). x∈M n 5.1. MANIFOLDS IN R 59 When M is compact, then, we can choose a finite cover. But even in the case, that M is not compact we can choose a countable cover. It is well-known (see [32, p.1, 1.1.2 Def. and following observation]) that for a manifold M we have the equivalence: M as a topological space, cf., Remark 5.1 M can be covered by countably ⇔ . has a countable base (is second countable) many charts (Ux` , Φx` ) Together with the observations in Remark 5.1 we obtain the

COROLLARY 5.1. Let M be a manifold of Rn. Then, in (5.3) we always can choose a countable cover, i.e., we can choose x` ∈ M, ` ∈ N, and charts (Ux` , Φx` ) such that [ Ux` is an open cover of M. `∈N

EX. 5.2. Let f : Rn → Rn define a (global) Ck-diffeomorphism, k ≥ 1. If M is a Ck- n k n submanifold of R of codimension cd, then f(M) is also a C -manifold in R of the same codimension cd. The same holds for a Ck-diffeomorphism f : X → Y between manifolds X,Y in Rn.

Proof. Let near x ∈ M the manifold M be defined by the open neighborhood Ux and k the C -functions gj : Ux → R, j = 1, . . . , cd, with linearly independent gradients ∇gj(x). −1 Since f is a diffeomorphism, V := f(Ux) is an open neighborhood of f(x). Indeed, f −1 −1 is in particular continuous and thus the inverse image V = (f ) (Ux) = f(Ux) of Ux wrt. f −1 is open. Moreover we have y ∈ f(M) ∩ V iff using x = f −1(y), y = f(x), the equations −1 hj(y) := gj(f (y)) = 0, j = 1, . . . , cd , −1 are satisfied, where the gradients ∇hj(y) = ∇gj(x) · ∇f (y) are linear independent due to the nonsingularity of ∇f −1(y).  The following theorem is an immediate consequence of Definition 5.1.

THEOREM 5.2. Let S ⊂ R^n be open and let f: S → R^r be a C^k-function with k ≥ 1. Suppose that for all x ∈ f^{-1}(0) the Jacobian ∇f(x) has full rank r (r ≤ n). Then the inverse image f^{-1}(0) = {x ∈ S | f_j(x) = 0, j = 1, …, r} is a C^k-manifold of R^n of dimension n − r (codimension r).
Proof. Let x ∈ f^{-1}(0) and choose U = S (independent of x). Since ∇f(x) has full rank r, the r rows ∇f_j(x), j = 1, …, r, must be linearly independent, i.e., CM1 is satisfied. □
Without the assumption in Theorem 5.2 that ∇f(x) must be of full rank, the set f^{-1}(0) need not be a manifold. This fact becomes evident from the following theorem due to Whitney (see, e.g., [33, IV.1.2 Lemma] for a proof).

THEOREM 5.3. Let S ⊂ R^n be an arbitrary closed set. Then there exists a C^∞-function f: R^n → R such that S = f^{-1}(0).
The following implication of Corollary 5.1 is useful.

THEOREM 5.4. Any C^k-manifold M ⊂ R^n (k ≥ 1) of dimension d < n has Lebesgue measure zero in R^n.
Proof. By Corollary 5.1 the manifold M can be covered by countably many charts

(U_{x_ℓ}, Φ_{x_ℓ}), ℓ ∈ N. Thus, it suffices to prove the statement locally, i.e., it suffices to show that for any local diffeomorphism Φ_{x_ℓ}: U_{x_ℓ} → V_ℓ (ℓ fixed) as given in CM2 we have µ(M ∩ U_{x_ℓ}) = 0. In view of CM2, Φ_{x_ℓ}^{-1} is a C^1-function with Φ_{x_ℓ}(U_{x_ℓ} ∩ M) = V_ℓ ∩ (0^{n−d} × R^d). Since d < n, by Example 2.1 the set 0^{n−d} × R^d has measure zero in R^n, i.e., µ(Φ_{x_ℓ}(U_{x_ℓ} ∩ M)) = 0. Then, by Theorem 2.1 it follows that µ(Φ_{x_ℓ}^{-1}(Φ_{x_ℓ}(U_{x_ℓ} ∩ M))) =

µ(U_{x_ℓ} ∩ M) = 0. □
We finish the section with a more specific result. For a proof we refer to [21, Lemma 3.1.8].

THEOREM 5.5. Let M ⊂ R^n be a C^k-manifold, k ≥ 1. Then:
1. Every connected component M_c ⊂ M is a C^k-manifold.
2. If M is connected then M is path-connected.

5.2. Stratifications and Whitney regular stratifications of sets in Rn In this section we consider special partitions of subsets in Rn. Stratification.

DEFINITION 5.3. Let S be a subset of R^n, and let Σ = {S_j, j ∈ J} be a family of subsets S_j ⊂ S, j ∈ J, J a finite index set.

a) Σ = {S_j, j ∈ J} is called a partition of S if
S = ⋃_{j∈J} S_j and S_j ∩ S_ℓ = ∅ for all j, ℓ ∈ J, j ≠ ℓ.
b) A partition Σ of S is called a C^k-stratification (k ≥ 1) of S if every set S_j ∈ Σ, j ∈ J, is a C^k-manifold.
In this case, we call the pair (S, Σ) a stratified set in R^n, and each S_j ∈ Σ, j ∈ J, a stratum of S.

As illustrative examples we consider four different partitions of R^2.
1. Σ_1 = {S_1^0, S_2^2} with S_2^2 = {0}, S_1^0 = R^2 \ {0}

2. Σ_2 = {S_3^0, S_4^1} with S_4^1 = {x ∈ R^2 | x_1 · x_2 = 0}, S_3^0 = R^2 \ S_4^1

3. Σ_3 = {S_8^0, S_9^1, S_{10}^2} with S_{10}^2 = {0}, S_9^1 = {x ∈ R^2 | x_1 · x_2 = 0} \ {0}, S_8^0 = R^2 \ (S_9^1 ∪ S_{10}^2).

4. Σ_4 = {S_5^0, S_6^1, S_7^1} with S_7^1 = {x ∈ R^2 | x_1 = 0} \ {0}, S_6^1 = {x ∈ R^2 | x_2 = 0}, S_5^0 = R^2 \ (S_7^1 ∪ S_6^1).

The partitions 1, 3, 4 are stratifications of R^2. Since S_4^1 is not a manifold (S_4^1 contains an intersection point at x = 0), the partition Σ_2 does not provide a stratification.

All stratifications considered later on will arise from a concrete practical problem. The partitions above can be related to the following problems.
1. Asking for the rank of all real 2 × 1 matrices X = (x_1, x_2)^T ∈ R^2, we have
S_2^2 = {X ∈ R^2 | rank X = 0} is a manifold of codimension 2 in R^2,
S_1^0 = {X ∈ R^2 | rank X = 1} is an (open) manifold of codimension 0 in R^2.
2. Consider all real 2 × 2 diagonal matrices X = \begin{pmatrix} x_1 & 0 \\ 0 & x_2 \end{pmatrix} (≡ R^2). Then it follows

S_4^1 = {X ∈ R^2 | det X = 0}.
3. Let us investigate the rank of the matrices X = \begin{pmatrix} x_1 & 0 \\ 0 & x_2 \end{pmatrix} ∈ R^2. Then obviously,

S_{10}^2 = {X ∈ R^2 | rank X = 0} (codimension 2),
S_9^1 = {X ∈ R^2 | rank X = 1} (codimension 1),
S_8^0 = {X ∈ R^2 | rank X = 2} (codimension 0).
To show that a partition Σ = {S_j, j ∈ J} of a subset S ⊂ R^n is a stratification, the following principle is useful.

LEMMA 5.2. [Local transformation principle]
The partition Σ = {S_j, j ∈ J} is a C^k-stratification (k ≥ 1) of S if the following holds:
For any j ∈ J and x ∈ S_j there exist a C^k-diffeomorphism Φ = Φ_x, Φ: R^n → R^n, such that Φ(S_j) = S_j for all j ∈ J, and an R^n-neighborhood V of Φ(x) such that V ∩ S_j is a C^k-manifold.
Proof. Under our assumptions let V be the neighborhood of y = Φ(x) and let g: V → R^r be the C^k-function defining the C^k-manifold S_j locally near y via: y ∈ V ∩ S_j iff g(y) = 0. Then with U = Φ^{-1}(V), f: U → R^r, f(x) := g(Φ(x)) (y = Φ(x)), we obviously have: x ∈ U ∩ S_j iff f(x) = 0. So, locally near x the function f defines the manifold S_j. □
This principle will allow us later on to restrict ourselves to elements y = Φ(x) in especially simple (standard) form.
Whitney Regular Stratification. We now introduce so-called Whitney Regular Stratifications (see also [21, Section 7.5]). These are stratifications enjoying certain additional regularity conditions. We will make use of this concept only for C^∞-stratifications of algebraic sets. So, also in the present section we restrict ourselves to C^∞-manifolds.

Let G_{n,d} (d, n ∈ N, d ≤ n) denote the set of all d-dimensional subspaces of R^n. The sets G_{n,d} are called Grassmann manifolds. The set G_{2,1}, for example, can be identified with the following arc of the unit circle (cf. Figure 5.1):
G_{2,1} ≡ {x = (cos φ, sin φ) | φ ∈ [0, π)}.


FIGURE 5.1. Grassmann manifold G2,1.

By identifying antipodal points x, x′ we can extend G_{2,1} to the whole unit circle. In this way G_{2,1} becomes a compact manifold of dimension one. Such a simple construction is no longer possible for n ≥ 3. Note that by identifying any d-dimensional subspace S of R^n with its (n − d)-dimensional orthogonal complement S^⊥, we can identify G_{n,d} with G_{n,n−d}. In [21, p.314], by means of orthogonal projections, the sets G_{n,d} are presented as submanifolds of the sets S^n(d) of symmetric (n × n)-matrices of rank d.

n THEOREM 5.6. The set Gn,d is diffeomorph to a compact submanifold of S (d)(⊂ R(n+1)n/2) of dimension (n − d)d.

DEFINITION 5.4. [Whitney regular stratification] (See, e.g., [14, p.10], [21, Def. 7.5.1])
Let (S, Σ) be a C^∞-stratified set in R^n. Let X, Y be strata in Σ, X ≠ Y.
(a) Let be given x ∈ X. Then Y is called Whitney regular over X at x if for every sequence of pairs (x_j, y_j) ∈ X × Y, j ∈ N, satisfying for j → ∞
(1) x_j → x, y_j → x
(2) T_{y_j}Y → T ⊂ G_{n, dim Y}
(3) L_j := span{y_j − x_j} → L ∈ G_{n,1},
the following holds: L ⊂ T.
(b) The stratum Y is said to be Whitney regular over X if Y is Whitney regular over X at every point x ∈ X. (S, Σ) is called a Whitney regular stratification if every stratum Y ∈ Σ is Whitney regular over every stratum X ∈ Σ, X ≠ Y.

As examples let us consider the stratifications 3, 4 above. In Σ_4 we choose Y = S_7^1, X = S_6^1, x = 0, and the sequence (x_j, y_j) = ((−1/j, 0), (0, 1/j)), j = 1, 2, … (see Figure 5.2). We find
T_{y_j}Y = {λ(0, 1)^T | λ ∈ R} =: T,
L_j = span{y_j − x_j} = {λ(1, 1)^T | λ ∈ R} =: L.


FIGURE 5.2. Stratification with stratum Y = S_7^1 which is not Whitney regular over X = S_6^1 at x = 0.

Obviously, L ⊄ T and Y is not Whitney regular over X at x. Note that since x = 0 ∉ Y, the stratum X is Whitney regular over Y at 0. We invite the reader to prove that the stratification Σ_3 of R^2 (see Section 5.2.1) is Whitney regular.

EX. 5.3. Let (S, Σ) be a Whitney regular stratification of the set S ⊂ Rm. Then (Rn × S, Rn × Σ) is a Whitney regular stratification of Rn × S in Rn+m and ({0} × S, {0} × Σ) is a Whitney regular stratification of {0} × S.

Proof. We only have to mention that for R^n (with only one stratum) the conditions in Definition 5.4 (Whitney regularity) trivially hold. Then for the product stratification (R^n × S, R^n × Σ) the relations in Definition 5.4 are fulfilled by noticing that for any stratum Y in Σ we have T_{(x,y)}(R^n × Y) ≡ T_x(R^n) × T_y(Y) = R^n × T_y(Y). The same arguments are valid for {0} × S. □
REMARK 5.2. (a) In view of condition (1) in Definition 5.4(a) it follows that Y is trivially Whitney regular over X at x ∈ X if x ∉ cl Y.
(b) The fact that Y is Whitney regular over X at x ∈ X remains invariant under a (local) diffeomorphism (see Ex. 5.2)
Φ: U → R^n, with Φ(X ∩ U) = X′, Φ(Y ∩ U) = Y′,
where U is a neighbourhood of x. This means that Y′ is Whitney regular over X′ at x′ = Φ(x). (See [21, Rem. 7.5.2].)

Whitney regular stratifications enjoy certain nice properties. We will give these results without proofs.

THEOREM 5.7. Let (S, Σ), S ⊂ R^n, be a C^∞-stratified set. Suppose that X, Y ∈ Σ, X ≠ Y, x ∈ X ∩ cl Y and Y is Whitney regular over X in x. Then,
(a) dim X < dim Y.
(b) If {y_j}_{j=1}^∞ ⊂ Y, y_j → x, T_{y_j}Y → T (⊂ G_{n, dim Y}), then T_x X ⊂ T.
Proof. Cf. [21, Lem. 7.5.7, Cor. 7.5.9]. □

THEOREM 5.8. Let (S, Σ) be a Whitney regular stratification in R^n.
(a) Let S be closed and let Σ_j denote the union of all j-dimensional strata in Σ, 0 ≤ j ≤ n. Then the sets ⋃_{j=0}^k Σ_j are closed for all k = 0, …, n.
(b) Let S be locally closed. Let Σ^c denote the family of all connected components of the strata of Σ. Then,
1) Σ^c is locally finite, i.e., for any x ∈ S there exists an R^n-neighbourhood U such that U ∩ Y ≠ ∅ only for finitely many strata Y ∈ Σ^c.
2) [frontier conditions] If X, Y ∈ Σ^c and X ∩ cl Y ≠ ∅, then X ⊂ cl Y (i.e., X ⊂ ∂Y).

Proof. See [21, Cor. 7.5.10] for (a), and [21, Th. 7.5.12], [14, p.16f] for (b). 

Whitney regular stratification of semi-algebraic sets.

THEOREM 5.9. Each semi-algebraic set S ⊂ R^n allows (at least one) Whitney regular C^∞-stratification with finitely many semi-algebraic strata.
Proof. See [14, p.12, (2.7)]. □
This general theorem allows another proof of the genericity result in Theorem 3.1.

EX. 5.4. Let p: R^n → R be a polynomial function, p ≢ 0. Then the inverse image p^{-1}(0) is a (closed) set of measure zero.
Proof. By Theorem 5.9 the algebraic set p^{-1}(0) has a stratification into finitely many manifolds S_j, j ∈ J. It must follow that
dim S_j ≤ n − 1, j ∈ J.
In fact, assuming dim S_{j_0} = n for some j_0 ∈ J, the stratum S_{j_0} is a (nonempty) open set in R^n and Ex. 2.5 implies p ≡ 0, a contradiction to p ≢ 0. Using Theorem 5.4 the result is proven. □

Theorem 5.9 states that any semi-algebraic set allows a canonical Whitney regular stratification. Note that a given set can allow different Whitney regular stratifications. Consider for example the stratifications Σ_1 and Σ_3 of R^2 above. However, in the sequel we are interested in proving that a concrete partition of a given semi-algebraic subset S ⊂ R^n is a Whitney regular stratification. The following homogeneity condition will be helpful.

DEFINITION 5.5. Let be given a semi-algebraic set S ⊂ R^n and a semi-algebraic C^∞-stratification Σ = {S_j, j ∈ J} of S, i.e., the C^∞-strata S_j are semi-algebraic (cf. Definition 2.4). The stratification Σ is said to satisfy the homogeneity condition if the following holds:
Given any stratum S_{j_0} ∈ Σ, j_0 ∈ J, and x, y ∈ S_{j_0} (x ≠ y), there exist R^n-neighbourhoods U of x and V of y and a C^∞-diffeomorphism Φ: U → V, Φ(x) = y, which (locally) preserves the subsets S_j ∈ Σ, j ∈ J, i.e., z ∈ S_j ∩ U ⇒ Φ(z) ∈ S_j ∩ V.

THEOREM 5.10. Let be given a semi-algebraic set S ⊂ R^n and a semi-algebraic C^∞-stratification Σ = {S_j, j ∈ J} of S. Suppose the stratification fulfills the homogeneity condition of Definition 5.5. Then (S, Σ) is a Whitney regular C^∞-stratified set.
Proof. See [14]. □
5.3. Stratifications and Whitney regular stratifications of sets of matrices
In this section we investigate certain stratifications of sets of matrices. Let us recall some notation and introduce new notation. For given n, m ∈ N, m ≥ n, we define
M^{n,m} = {A | A is a real (n × m)-matrix},
M^n = M^{n,n},
S^n = {A ∈ M^n | A is symmetric, A^T = A}.

It is obvious that the following identifications hold:
M^{n,m} ≡ R^{n×m}, S^n ≡ R^N with N = (n + 1)n/2.
Here we consider stratifications of sets of matrices according to their ranks and define, for fixed n, m ∈ N, m ≥ n,
M^{n,m}(j) = {A ∈ M^{n,m} | rank A = j}, j = 0, …, n,
S^n(j) = {A ∈ S^n | rank A = j}, j = 0, …, n.

The following lemma will be used frequently.

LEMMA 5.3. Let A be a real (n × m)-matrix, decomposed as follows (ℓ > 0):
A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, with A_{11} an (ℓ × ℓ)-matrix, A_{12} an (ℓ × k)-matrix, A_{21} a (p × ℓ)-matrix, A_{22} a (p × k)-matrix, n = ℓ + p, m = ℓ + k.

Suppose A_{11} is nonsingular. Then we have
(5.4) rank A = ℓ ⇔ A_{22} = A_{21} A_{11}^{-1} A_{12}.

Proof. Since A_{11} is a nonsingular (ℓ × ℓ)-matrix it follows that rank \begin{pmatrix} A_{11} \\ A_{21} \end{pmatrix} = ℓ. Consequently, rank A = ℓ iff there is a matrix X ∈ R^{ℓ×k} such that
\begin{pmatrix} A_{11} \\ A_{21} \end{pmatrix} X = \begin{pmatrix} A_{12} \\ A_{22} \end{pmatrix}, i.e., A_{11}X = A_{12} and A_{21}X = A_{22}.
Then, A_{11} being regular, we find X = A_{11}^{-1} A_{12} and A_{22} = A_{21} A_{11}^{-1} A_{12}. □
Obviously, by
(5.5) Σ = {M^{n,m}(j), j = 0, …, n}
we have given a partition of M^{n,m}. We will show that (5.5) yields a stratification.

THEOREM 5.11. Let n, m ∈ N be given, m ≥ n ≥ 1. Then the sets M^{n,m}(j) are C^∞-submanifolds of M^{n,m} (≡ R^{n×m}) with
codim M^{n,m}(j) = (n − j)(m − j), j = 0, …, n.
Thus, the partition Σ in (5.5) is a stratification of M^{n,m}.

Proof. Choose j_0 ∈ {0, …, n} and Â ∈ M^{n,m}(j_0). Then Â contains a (j_0 × j_0)-submatrix A_{11} of full rank j_0. We can choose appropriate permutation matrices P_1, P_2 such that Ā = P_1 Â P_2 has the form
(5.6) A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}.
Since Φ: M^{n,m} → M^{n,m}, Φ(B) := P_1 B P_2, is a strata preserving diffeomorphism, in view of the local transformation principle in Lemma 5.2 it suffices to show that M^{n,m}(j_0) fulfills the conditions of a manifold around Ā = Φ(Â).
By continuity, there exists an R^{n×m}-neighborhood U of Ā such that for all A ∈ U decomposed as in (5.6) the corresponding (j_0 × j_0)-submatrix A_{11} is nonsingular. In view of Lemma 5.3, A ∈ U has rank j_0 iff
(5.7) f(A) := A_{22} − A_{21} A_{11}^{-1} A_{12} = 0.
To any element a_{ρσ}^{22} of the (n − j_0) × (m − j_0)-matrix A_{22} there corresponds one rational (thus C^∞-) equation f_{ρσ}(A) = 0, ρ = 1, …, n − j_0, σ = 1, …, m − j_0. Obviously, for the partial derivatives we find
∂f_{ρσ}(A)/∂a_{νµ}^{22} = 1 if (ν, µ) = (ρ, σ), and 0 otherwise.

Consequently, the gradients ∇f_{ρσ}(A), ρ = 1, …, n − j_0, σ = 1, …, m − j_0, are linearly independent. □
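The rank characterization (5.4) and the local defining equations (5.7) are easy to test numerically. The following is a minimal sketch (NumPy; the sizes n = 4, m = 6, j_0 = 2 and the random sample matrix are arbitrary choices for illustration): it builds a matrix of prescribed rank and verifies that the Schur complement A_{22} − A_{21}A_{11}^{-1}A_{12} vanishes exactly on the rank stratum.

```python
import numpy as np

# Numerical illustration of Lemma 5.3 / equation (5.7):
# for A with nonsingular leading (j0 x j0) block A11,
#   rank A = j0  <=>  A22 - A21 A11^{-1} A12 = 0.
rng = np.random.default_rng(0)
n, m, j0 = 4, 6, 2

# a random matrix of rank j0 (generically its leading block is nonsingular)
A = rng.standard_normal((n, j0)) @ rng.standard_normal((j0, m))
A11, A12 = A[:j0, :j0], A[:j0, j0:]
A21, A22 = A[j0:, :j0], A[j0:, j0:]

schur = A22 - A21 @ np.linalg.solve(A11, A12)     # f(A) in (5.7)
print(np.linalg.matrix_rank(A), np.linalg.norm(schur))   # expect: 2, ~0

# a generic small perturbation leaves the stratum M^{n,m}(j0):
# the Schur complement becomes nonzero and the rank jumps to n
B = A + 1e-3 * rng.standard_normal((n, m))
B11, B12, B21, B22 = B[:j0, :j0], B[:j0, j0:], B[j0:, :j0], B[j0:, j0:]
print(np.linalg.matrix_rank(B),
      np.linalg.norm(B22 - B21 @ np.linalg.solve(B11, B12)))
```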

EX. 5.5. Let Ā ∈ M^n be a matrix in the manifold M^n(n − 1) of codimension 1. Then in a neighborhood U_{Ā} of Ā the following holds:
A ∈ M^n ∩ U_{Ā} is in M^n(n − 1) ⇔ det A = 0.

Moreover, ∇_A det(A) ≠ 0.
Proof. We can assume that Ā has the form
Ā = \begin{pmatrix} Ā_{11} & b̄ \\ c̄^T & γ̄ \end{pmatrix} with Ā_{11} a nonsingular (n − 1) × (n − 1)-matrix.
By (5.7) the defining equation for the condition A ∈ M^n(n − 1) ∩ U_{Ā}, for a matrix A of the form A = \begin{pmatrix} A_{11} & b \\ c^T & γ \end{pmatrix}, is given by f(A) := γ − c^T A_{11}^{-1} b = 0. In view of formula (7.2) the condition
det A = det A_{11} · det(γ − c^T A_{11}^{-1} b) = 0
is equivalent (use det A_{11} ≠ 0) with γ − c^T A_{11}^{-1} b = f(A) = 0. To show ∇_A det(A) ≠ 0 we only have to notice that by the multilinearity of det A we find
∂ det A / ∂γ = det \begin{pmatrix} A_{11} & 0 \\ c^T & 1 \end{pmatrix} = det A_{11} ≠ 0. □
A result as in Theorem 5.11 is also valid for symmetric matrices.

THEOREM 5.12. Let n ∈ N be given. Then the sets S^n(j) are submanifolds of S^n (≡ R^N, N = (n + 1)n/2) with
codim S^n(j) = (n − j + 1)(n − j)/2, j = 0, …, n.
Thus the partition Σ = {S^n(j), j = 0, …, n} yields a stratification of S^n.

Proof. Let j_0 ∈ {0, …, n} and Ā ∈ S^n(j_0) be given. Similar to the arguments in the proof of Theorem 5.11, by applying a transformation P^T A P, for any A in a neighborhood U of Ā we can assume a decomposition
A = \begin{pmatrix} A_{11} & A_{12} \\ A_{12}^T & A_{22} \end{pmatrix}, with A_{22} a symmetric (n − j_0) × (n − j_0)-matrix and A_{11} a nonsingular symmetric (j_0 × j_0)-matrix.
Then, in view of Lemma 5.3, an element A ∈ U ∩ S^n is of rank j_0 iff
A_{22} − A_{12}^T A_{11}^{-1} A_{12} = 0.

Arguing as in the proof of Theorem 5.11, by symmetry of A_{22} this identity represents (n − j_0 + 1)(n − j_0)/2 linearly independent equations. □
As a corollary of Theorems 5.11 and 5.12 we obtain another proof of the result in Ex. 3.1.

COROLLARY 5.2. Let be given m, n ∈ N, m ≥ n ≥ 1. The set M^{n,m}(n) (S^n(n)) of full rank matrices is generic in M^{n,m} (S^n).

Proof. By Theorem 5.11 the set M^{n,m}(n) is a manifold of codimension 0, thus open. Any of the manifolds M^{n,m}(j), 0 ≤ j ≤ n − 1, has codimension ≥ 1, i.e., dim M^{n,m}(j) < dim M^{n,m} = nm. By Theorem 5.4 these sets M^{n,m}(j) are of measure zero. Consequently (cf. Lemma 2.1), ⋃_{j=0}^{n−1} M^{n,m}(j) = M^{n,m} \ M^{n,m}(n) has measure zero. The proof for S^n(n) is similar. □
Theorems 5.11 and 5.12 lead one to believe that a partition of any natural subspace S of M^{n,m} (or S^n) according to the rank will always yield a stratification. Unfortunately, this is not the case in general. Problems arise if the elements of S have a special structure. As an example we will examine symmetric band matrices. For given n, q ∈ N, 1 ≤ q ≤ n, we define the set S^{n,q} of real symmetric (n × n)-matrices of bandwidth q,
S^{n,q} = {A ∈ S^n | a_{jℓ} = 0 for |j − ℓ| ≥ q},
and
S^{n,q}(j) = {A ∈ S^{n,q} | rank A = j}, j = 0, …, n.

Obviously, the set S^{n,q} can be identified with R^{N_q}, where N_q = n + (n − 1) + … + (n − q + 1). We will make use of the following lemma.

LEMMA 5.4. Let be given A ∈ S^{n,q}, 1 < q ≤ n. Suppose A has an eigenvalue ξ of multiplicity m ≥ q. Then (at least) one of the elements in the q-th superdiagonal must be equal to zero, i.e.,
a_{1,q} · a_{2,q+1} · … · a_{n−q+1,n} = 0.
Proof. Consider the (n − q + 1) × (n − q + 1)-submatrix
S = \begin{pmatrix} a_{1,q} & 0 & \cdots & 0 \\ * & a_{2,q+1} & & \vdots \\ \vdots & & \ddots & 0 \\ * & \cdots & * & a_{n−q+1,n} \end{pmatrix}
of (A − ξI). Then it follows that
rank S ≤ rank(A − ξI) = n − m ≤ n − q.

This implies det S = a_{1,q} · a_{2,q+1} · … · a_{n−q+1,n} = 0. □
Note that by continuity the set S^{n,q}(n) of nonsingular band matrices is open (and thus a manifold of codimension 0). However, for 1 ≤ j ≤ n − 1, the sets S^{n,q}(j) need not be manifolds, as will be shown by a simple counterexample. Let us choose n = 3, q = 2 (tridiagonal matrices), j = 1, and
Ā = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} ∈ S^{3,2}(1).

Then we look for matrices A from S^{3,2}(1),
(5.8) A = \begin{pmatrix} a_{11} & a_{12} & 0 \\ a_{12} & a_{22} & a_{23} \\ 0 & a_{23} & a_{33} \end{pmatrix},
in a (small) neighborhood U of Ā. Since A has 0 as a double eigenvalue, by Lemma 5.4 we must have a_{12}a_{23} = 0. Hence, in view of (5.8), using a_{22} ≈ 1, a matrix A is an element of U ∩ S^{3,2}(1) if and only if one of the following 3 cases occurs:

Case 0. a_{12} = a_{23} = 0, implying a_{11} = a_{33} = 0.
Case 1. a_{12} = 0, a_{23} ≠ 0, implying a_{11} = 0 and a_{22}a_{33} − a_{23}^2 = 0.
Case 2. a_{23} = 0, a_{12} ≠ 0, implying a_{33} = 0 and a_{11}a_{22} − a_{12}^2 = 0.
Thus,
A ∈ (M_0 ∪ M_1 ∪ M_2) ∩ U ⇔ A ∈ S^{3,2}(1) ∩ U,
where
M_0 = {A ∈ S^{3,2}(1) | a_{12} = a_{23} = a_{11} = a_{33} = 0},
M_1 = {A ∈ S^{3,2}(1) | f_1(A) = 0, a_{23} ≠ 0} with f_1(A) = \begin{pmatrix} a_{12} \\ a_{11} \\ a_{22}a_{33} − a_{23}^2 \end{pmatrix},
M_2 = {A ∈ S^{3,2}(1) | f_2(A) = 0, a_{12} ≠ 0} with f_2(A) = \begin{pmatrix} a_{23} \\ a_{33} \\ a_{11}a_{22} − a_{12}^2 \end{pmatrix}.

An easy verification using a_{22} ≈ 1 shows that ∇f_1(A) and ∇f_2(A) have (full) rank 3. Consequently, M_1 and M_2 are C^∞-manifolds of dimension two in S^{3,2} ≡ R^5, defined by different functions f_1, f_2 with different tangent spaces T_Ā M_1 ≠ T_Ā M_2. The intersection

M_1 ∩ M_2 = {αĀ | α ∈ R} = M_0
is a one-dimensional submanifold of M_1, M_2. This shows that S^{3,2}(1) is not a manifold. However, this construction indicates that by an appropriate sub-partitioning of the sets S^{n,q}(j) a stratification (into manifolds) could be obtained.
We emphasize that the stratification results in this section allow much deeper results than merely density results and statements on Lebesgue measures of certain strata as given in Corollary 5.2. Since the strata (submanifolds) are defined as solution sets of C^∞-functions, the tools of analysis and differential geometry can be applied. In Chapter 6, for example, we will apply René Thom's Transversality Theory to obtain so-called genericity results for parameter dependent families of matrices and corresponding problems.
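The rank condition on ∇f_1, ∇f_2 used in the band-matrix example above can be checked symbolically. A minimal sketch (SymPy; the coordinate ordering (a_{11}, a_{12}, a_{22}, a_{23}, a_{33}) on S^{3,2} ≡ R^5 is our own choice):

```python
import sympy as sp

a11, a12, a22, a23, a33 = sp.symbols('a11 a12 a22 a23 a33')
coords = sp.Matrix([a11, a12, a22, a23, a33])     # coordinates on S^{3,2} = R^5

f1 = sp.Matrix([a12, a11, a22*a33 - a23**2])      # defines M1 (on the set a23 != 0)
f2 = sp.Matrix([a23, a33, a11*a22 - a12**2])      # defines M2 (on the set a12 != 0)

Abar = {a11: 0, a12: 0, a22: 1, a23: 0, a33: 0}   # the matrix \bar A

J1 = f1.jacobian(coords).subs(Abar)
J2 = f2.jacobian(coords).subs(Abar)
print(J1.rank(), J2.rank())     # 3, 3: both zero sets are 2-dimensional manifolds

# tangent spaces at \bar A are the kernels of the Jacobians; they are different planes
print([list(v) for v in J1.nullspace()])
print([list(v) for v in J2.nullspace()])
```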

Whitney regular stratifications of sets of matrices. We will now prove the Whitney regularity of some stratifications of matrix sets.

THEOREM 5.13.
(a) The stratification (M^{n,m}, Σ) (m ≥ n ≥ 1) with Σ = {M^{n,m}(j), j = 0, …, n} in Theorem 5.11 is Whitney regular.
(b) The stratification (S^n, Σ) with Σ = {S^n(j), j = 0, …, n} in Theorem 5.12 is Whitney regular.
Proof. The proofs of (a) and (b) will be given simultaneously. Let j ∈ {0, …, n} be fixed and put X = M^{n,m} ≡ R^{nm}, X_j = M^{n,m}(j) for part (a), and X = S^n ≡ R^{(n+1)n/2}, X_j = S^n(j), m = n, for part (b). We are going to show that the assumptions of Theorem 5.10 are satisfied.
X_j is a semi-algebraic subset of X: in view of x^T A^T A x = ‖Ax‖^2 ≥ 0 the matrix A^T A is positive semidefinite, and for such matrices the relation holds:
x^T A^T A x = 0 ⇔ Ax = 0 ⇔ A^T A x = 0.

So the (n × m)-matrix A ∈ X has rank j, i.e., A ∈ X_j, iff the symmetric (m × m)-matrix A^T A has 0 as an (m − j)-fold eigenvalue. In case (b), A^T A can be replaced by the symmetric matrix A (with m = n). Now, consider the characteristic polynomial of A^T A,
det(tI − A^T A) = t^m + c_m(A^T A) t^{m−1} + … + c_1(A^T A).
Then A ∈ X_j iff c_1(A^T A) = … = c_{m−j}(A^T A) = 0 and c_{m−j+1}(A^T A) ≠ 0. From matrix theory it is known that the coefficients c_k(A^T A) can be represented as the sum of all (m + 1 − k) × (m + 1 − k)-principal minors of A^T A (cf., e.g., [29, Section 4.11]). Consequently, the mapping ψ: X → R^{m−j+1}, ψ_k(A) = c_k(A^T A), k = 1, …, m − j + 1, is a polynomial mapping such that

A ∈ X_j ⇔ ψ_k(A) = 0, k = 1, …, m − j, and ψ_{m−j+1}(A) ≠ 0.
The subset V_j = {x ∈ R^{m−j+1} | x_1 = … = x_{m−j} = 0, x_{m−j+1} ≠ 0} of R^{m−j+1} is semi-algebraic. By construction it follows that X_j = ψ^{-1}(V_j), which by Ex. 2.8 is semi-algebraic.
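The coefficient criterion is easy to test numerically. A minimal sketch (NumPy; the sizes n = 3, m = 5, j = 2 and the random sample are arbitrary choices): np.poly returns the characteristic polynomial coefficients from the highest power down, so in the notation above the last m − j entries (that is, c_{m−j}, …, c_1) should vanish for a rank-j matrix, while c_{m−j+1} stays away from zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, j = 3, 5, 2

A = rng.standard_normal((n, j)) @ rng.standard_normal((j, m))   # an n x m matrix of rank j
coeffs = np.poly(A.T @ A)      # [1, c_m, c_{m-1}, ..., c_1] of det(tI - A^T A)

print(np.round(coeffs, 6))
print("c_1..c_{m-j} ~ 0 :", np.allclose(coeffs[j + 1:], 0, atol=1e-6))
print("c_{m-j+1} != 0   :", abs(coeffs[j]) > 1e-6)
```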

The partition (X, Σ) fulfills the homogeneity property (cf. Def. 5.5): Let be given A_1, A_2 ∈ X_j. We have to construct a (local) strata preserving diffeomorphism Φ: U → V from an X-neighbourhood U of A_1 onto an X-neighbourhood V of A_2 such that Φ(A_1) = A_2. Since A_1, A_2 ∈ X_j, i.e., they have rank j, there exists the following transformation into standard form:
In case (a) there exist nonsingular matrices C_1, C_2 in M^n and H_1, H_2 in M^m such that
C_k A_k H_k = \begin{pmatrix} I_j & 0 \\ 0 & 0 \end{pmatrix}, k = 1, 2,
and in case (b) there exist nonsingular matrices B_1, B_2 in M^n such that
B_k A_k B_k^T = \begin{pmatrix} I_j & 0 \\ 0 & 0 \end{pmatrix}, k = 1, 2.

Then the following mappings provide a (global) diffeomorphism Φ: X → X which preserves the rank, i.e., the strata, and satisfies Φ(A_1) = A_2.
In case (a): Φ: X = M^{n,m} → X is defined by Φ(A) = C_2^{-1} C_1 A H_1 H_2^{-1}.
In case (b): Φ: X = S^n → X is defined by Φ(A) = B_2^{-1} B_1 A B_1^T (B_2^T)^{-1}. □
In connection with linear equations Ax = b with symmetric matrices A, we consider the set

(5.9) S1^n = {B = (A b) | A ∈ S^n, b ∈ R^n} ≡ R^{N_1}, where N_1 = (n + 3)n/2.
Obviously, the following defines a partition of S1^n:

(5.10) Σ = {S1^n_{j,0}, j = 0, …, n} ∪ {S1^n_{j,1}, j = 0, …, n − 1}, where
S1^n_{j,0} = {B = (A b) ∈ S1^n | rank A = rank (A b) = j},
S1^n_{j,1} = {B = (A b) ∈ S1^n | rank A = j, rank (A b) = j + 1}.

THEOREM 5.14. The partition (S1^n, Σ) of S1^n defined by (5.10) is a Whitney regular stratification. The codimensions of the strata are given by:
codim S1^n_{j,0} = (n − j)(n − j + 3)/2, codim S1^n_{j,1} = (n − j)(n − j + 1)/2.
Proof. We firstly apply Theorem 5.10 to show that the partition yields a Whitney regular stratification. The proof is similar to the proof of Theorem 5.13.
The sets S1^n_{j,0}, S1^n_{j,1} are semi-algebraic: Let be given B = (A b) ∈ S1^n. Then

(5.11)
B ∈ S1^n_{j,0} ⇔ 0 is an (n − j)-fold eigenvalue of A and an (n − j + 1)-fold eigenvalue of B^T B,
B ∈ S1^n_{j,1} ⇔ 0 is an (n − j)-fold eigenvalue of A and of B^T B.

The coefficients c_k, b_k of the characteristic polynomials of A and B^T B,
det(tI − A) = t^n + c_n(A) t^{n−1} + … + c_1(A),
det(tI − B^T B) = t^{n+1} + b_{n+1}(B^T B) t^n + … + b_1(B^T B),
respectively, are sums of minors of A and B^T B, respectively. Thus, the mapping (with c_{n+1} = b_{n+2} = 1) ψ: S1^n → R^{2(n−j)+3}, ψ_k(B) = c_k(A), k = 1, …, n − j + 1, ψ_{n−j+1+k}(B) = b_k(B^T B), k = 1, …, n − j + 2, is a polynomial function of B. By (5.11) we have

B ∈ S1^n_{j,0} ⇔ ψ_k(B) = 0, k = 1, …, n − j, n − j + 2, …, 2(n − j) + 2, and ψ_k(B) ≠ 0, k = n − j + 1, 2(n − j) + 3,
B ∈ S1^n_{j,1} ⇔ ψ_k(B) = 0, k = 1, …, n − j, n − j + 2, …, 2(n − j) + 1, and ψ_k(B) ≠ 0, k = n − j + 1, 2(n − j) + 2.

We define the sets

V_{j,0} = {x ∈ R^{2(n−j)+3} | x_k = 0, k = 1, …, n − j, n − j + 2, …, 2(n − j) + 2, and x_k ≠ 0, k = n − j + 1, 2(n − j) + 3},
V_{j,1} = {x ∈ R^{2(n−j)+3} | x_k = 0, k = 1, …, n − j, n − j + 2, …, 2(n − j) + 1, and x_k ≠ 0, k = n − j + 1, 2(n − j) + 2}.

These sets are semi-algebraic, and by construction S1^n_{j,0} = ψ^{-1}(V_{j,0}), S1^n_{j,1} = ψ^{-1}(V_{j,1}), which in view of Ex. 2.8 are semi-algebraic.
The partition (S1^n, Σ) satisfies the homogeneity property: Let be given B_1 = (A_1 b_1) ∈ S1^n_{j,0}. Since rank A_1 = j there exists a matrix C_1 ∈ M^n such that C_1 A_1 C_1^T = \begin{pmatrix} I_j & 0 \\ 0^T & 0 \end{pmatrix}. Then,
(5.12) C_1 (A_1 b_1) \begin{pmatrix} C_1^T & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} I_j & 0 & c_1 \\ 0^T & 0 & d_1 \end{pmatrix}, with \begin{pmatrix} c_1 \\ d_1 \end{pmatrix} = C_1 b_1,

where c_1 ∈ R^j, d_1 ∈ R^{n−j}. Since rank A_1 = rank (A_1 b_1) = j, it follows that d_1 = 0. Furthermore, we obtain
(5.13) \begin{pmatrix} I_j & 0 & c_1 \\ 0^T & 0 & 0 \end{pmatrix} \begin{pmatrix} I_n & −c_1 \\ 0^T & 1 \end{pmatrix} = \begin{pmatrix} I_j & 0 & 0 \\ 0^T & 0 & 0 \end{pmatrix}.

The same transformation holds for another B_2 = (A_2 b_2) ∈ S1^n_{j,0} with corresponding C_2, c_2, and d_2 = 0. Consider the mapping
Φ(A, b) = C_2^{-1} C_1 (A b) \begin{pmatrix} C_1^T & 0 \\ 0^T & 1 \end{pmatrix} \begin{pmatrix} I_n & −c_1 \\ 0^T & 1 \end{pmatrix} \begin{pmatrix} I_n & −c_2 \\ 0^T & 1 \end{pmatrix}^{-1} \begin{pmatrix} C_2^T & 0 \\ 0^T & 1 \end{pmatrix}^{-1} = (Â b̂).

Then, since \begin{pmatrix} I_n & −c \\ 0^T & 1 \end{pmatrix}^{-1} = \begin{pmatrix} I_n & c \\ 0^T & 1 \end{pmatrix}, the matrix Â is symmetric. Thus, we have given a diffeomorphism Φ: S1^n → S1^n. By construction, Φ(A_1 b_1) = (A_2 b_2) and Φ preserves the ranks, i.e., the strata.
Now let be given B_1 = (A_1 b_1), B_2 = (A_2 b_2) ∈ S1^n_{j,1}. We again will perform a transformation of these matrices into a standard form. We consider B_1. Firstly, we can apply the transformation (5.12). Since now rank (A_1 b_1) = rank A_1 + 1 = j + 1, we must have d_1 ≠ 0. The transformation as given in (5.13) yields
\begin{pmatrix} I_j & 0 & c_1 \\ 0^T & 0 & d_1 \end{pmatrix} \begin{pmatrix} I_n & −c_1 \\ 0^T & 1 \end{pmatrix} = \begin{pmatrix} I_j & 0 & 0 \\ 0^T & 0 & d_1 \end{pmatrix}.

Now, since 0 ≠ d_1 ∈ R^{n−j}, we can assume (d_1)_r ≠ 0 for an index r, 1 ≤ r ≤ n − j. Suppose r ≠ n − j. By the permutation matrix P_1, obtained from I_n by interchanging the rows j + r and n, we find
(5.14) P_1 \begin{pmatrix} I_j & 0 & 0 \\ 0^T & 0 & d_1 \end{pmatrix} \begin{pmatrix} P_1^T & 0 \\ 0^T & 1 \end{pmatrix} = \begin{pmatrix} I_j & 0 & 0 \\ 0^T & 0 & d_1 \end{pmatrix}

such that α_1 := (d_1)_{n−j} = (d_1)_r ≠ 0. By multiplication with \begin{pmatrix} I_n & 0 \\ 0^T & α_1^{-1} \end{pmatrix} from the right we obtain a matrix of the form of the right-hand matrix in (5.14) with (d_1)_{n−j} = 1. Then we successively subtract (d_1)_k times the n-th row of (5.14) from the (j + k)-th row and perform the corresponding column transformations, for k = 1, …, n − j − 1. In matrix form these transformations read, with an appropriate nonsingular (n × n)-matrix H_1:
(5.15) H_1 \begin{pmatrix} I_j & 0 & 0 \\ 0^T & 0 & d_1 \end{pmatrix} \begin{pmatrix} H_1^T & 0 \\ 0^T & 1 \end{pmatrix} = \begin{pmatrix} I_j & 0 & 0 \\ 0^T & 0 & e_{n−j} \end{pmatrix}, e_{n−j} ∈ R^{n−j}.

The same transformation holds for B_2 with corresponding C_2, P_2, H_2, d_2, α_2. Then by
Φ: S1^n → S1^n, Φ(A b) = L_2^{-1} L_1 (A b) R_1 R_2^{-1}, where L_k = H_k P_k C_k and
R_k = \begin{pmatrix} C_k^T & 0 \\ 0^T & 1 \end{pmatrix} \begin{pmatrix} I_n & −c_k \\ 0^T & 1 \end{pmatrix} \begin{pmatrix} P_k^T & 0 \\ 0^T & 1 \end{pmatrix} \begin{pmatrix} I_n & 0 \\ 0^T & α_k^{-1} \end{pmatrix} \begin{pmatrix} H_k^T & 0 \\ 0^T & 1 \end{pmatrix}, k = 1, 2,
we have defined a global diffeomorphism. Moreover, Φ is (rank-) strata-preserving with Φ(A_1 b_1) = (A_2 b_2).
Proof of the formula for the codimensions: We start with S1^n_{j,0} and choose an element B̄ = (Ā b̄) ∈ S1^n_{j,0}. By using the above transformation Φ of B̄ into the standard form (5.13) we can assume that B̄ already has the form B̄ = (Ā b̄) with Ā = \begin{pmatrix} I_j & 0 \\ 0^T & 0 \end{pmatrix}, b̄ = 0. Take any B = (A b) in a small neighbourhood U of B̄, composed as follows:
A = \begin{pmatrix} A_{11} & A_{12} \\ A_{12}^T & A_{22} \end{pmatrix}, A_{11} a (j × j)-matrix, b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}.
By continuity, A_{11} is nonsingular. By Lemma 5.3, rank A = j ⇔ A_{22} = A_{12}^T A_{11}^{-1} A_{12}. The condition rank (A b) = rank A is equivalent with the existence of a solution x = (x_1, x_2) of
\begin{pmatrix} A_{11} & A_{12} \\ A_{12}^T & A_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}.
Together we find that A_{12}x_2 = b_1 − A_{11}x_1 and A_{12}^T x_1 + A_{12}^T A_{11}^{-1} A_{12} x_2 = b_2, giving A_{12}^T x_1 + A_{12}^T A_{11}^{-1}(b_1 − A_{11}x_1) = b_2, or b_2 = A_{12}^T A_{11}^{-1} b_1. Thus, the condition rank (A b) = rank A = j is equivalent with the conditions
(5.16) A_{22} = A_{12}^T A_{11}^{-1} A_{12} and b_2 = A_{12}^T A_{11}^{-1} b_1.
These relations represent (by symmetry of A) N := (n − j + 1)(n − j)/2 + (n − j) = (n − j + 3)(n − j)/2 equations f_k(A, b) = 0, k = 1, …, N. As in the proof of Theorem 5.12 one can easily check that the gradients ∇f_k(A, b), k = 1, …, N, are linearly independent.
Now let be given B̄ = (Ā b̄) ∈ S1^n_{j,1} in the standard form (5.15). Let us consider an element B = (A b) in a small neighbourhood U of B̄. Then, since rank (Ā b̄) = j + 1, we obviously have j + 1 ≤ rank (A b) ≤ rank A + 1, or j ≤ rank A. Consequently,
(A b) ∈ S1^n_{j,1} ∩ U ⇔ rank A = j.

By Theorem 5.12 this defines (locally) a manifold of codimension (n − j + 1)(n − j)/2. □
5.4. Transversal intersections of manifolds
If two manifolds M_1, M_2 of R^n intersect "nicely" we say that they are in general position or that they intersect transversally. Here is the formal definition.

DEFINITION 5.6. [Transversal intersection of M_1 and M_2]
Let M_1, M_2 be C^k-manifolds in R^n (k ≥ 1) with codimensions r_1, r_2. We say that M_1 and M_2 intersect transversally at x ∈ M_1 ∩ M_2 if, with the systems of C^k-functions h_j: U → R, j = 1, …, r_1, and g_ℓ: U → R, ℓ = 1, …, r_2, defining the manifolds M_1, M_2, respectively, in a neighbourhood U of x, the following holds:

(5.17) ∇h_j(x), ∇g_ℓ(x), j = 1, …, r_1, ℓ = 1, …, r_2, are linearly independent.
We say that M_1 and M_2 intersect transversally (notation M_1 ⋔ M_2) if M_1 and M_2 intersect transversally at every point x ∈ M_1 ∩ M_2. Instead of saying that M_1 and M_2 intersect transversally one often also says that M_1 and M_2 are in general position.

REMARK 5.3. Let M_1, M_2 be C^k-manifolds in R^n (k ≥ 1) with codimensions r_1, r_2.
(a) If M_1 ∩ M_2 = ∅ then by Definition 5.6 we have M_1 ⋔ M_2. If M_1 ∩ M_2 ≠ ∅ and M_1 ⋔ M_2, then the intersection M = M_1 ∩ M_2 defines submanifolds of M_1, M_2, and R^n of codimensions r_2, r_1, and r_1 + r_2, respectively.

(b) By Definition 5.6 the following holds: Suppose that codim M_1 + codim M_2 > n, or equivalently dim M_1 + dim M_2 < n. Then we have
M_1 ⋔ M_2 ⇒ M_1 ∩ M_2 = ∅.
(c) We note that if (5.17) is satisfied, then, by continuity, the condition (5.17) holds for all x′ ∈ (M_1 ∩ M_2) ∩ U with a (sufficiently) small R^n-neighbourhood U of x.
Transversal intersections can also be characterized in terms of conditions on tangent spaces.

LEMMA 5.5. Let M_1, M_2 be manifolds in R^n. Then the following conditions are equivalent:
(a) M_1 ⋔ M_2
(b) For all x ∈ M_1 ∩ M_2: T_x M_1 + T_x M_2 = R^n
(c) For all x ∈ M_1 ∩ M_2: N_x M_1 ∩ N_x M_2 = {0}

Proof. With the notation of Definition 5.6 it follows for A := {∇h_j(x), ∇g_ℓ(x), j = 1, …, r_1, ℓ = 1, …, r_2}:
A^⊥ = {∇h_j(x), ∇g_ℓ(x), j = 1, …, r_1, ℓ = 1, …, r_2}^⊥ = {∇h_j(x), j = 1, …, r_1}^⊥ ∩ {∇g_ℓ(x), ℓ = 1, …, r_2}^⊥ = T_x M_1 ∩ T_x M_2.
Consequently, the system A is a linearly independent set if and only if
n − (r_1 + r_2) = dim A^⊥ = dim (T_x M_1 ∩ T_x M_2),
or equivalently

dim (T_x M_1 + T_x M_2) = dim T_x M_1 + dim T_x M_2 − dim (T_x M_1 ∩ T_x M_2) = (n − r_1) + (n − r_2) − n + (r_1 + r_2) = n.
To show the equivalence of (b) and (c) note that (b) becomes

R^n = T_x M_1 + T_x M_2 = N_x M_1^⊥ + N_x M_2^⊥ = (N_x M_1 ∩ N_x M_2)^⊥,
or equivalently N_x M_1 ∩ N_x M_2 = {0}. □
We give some illustrative examples.

FIGURE 5.3. Two circles: transversal intersection (left), nontransversal in- tersection (right).

Figure 5.3 shows the intersection of two circles in R^2. It is intuitively clear that a nontransversal intersection of two manifolds M_1, M_2 can be transformed into a transversal one by arbitrarily small perturbations. A nontransversal intersection can be regarded as an exceptional case. Moreover, a transversal intersection of two closed manifolds is stable wrt. small perturbations. However, if M_1 is closed and M_2 is not, then the "set of transversally intersecting M_1, M_2" need not be open. We give an example (cf. also [21, Remark 7.2.5]).
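The transversality test of Lemma 5.5 can be carried out numerically for the two circles of Figure 5.3. A minimal sketch (NumPy; the particular radii and centers are our own choice): each circle is given by one defining equation, and transversality at an intersection point x amounts to the two gradients being linearly independent, as in (5.17).

```python
import numpy as np

def gradients(x, c):
    """Gradients of the defining equations
       h(x) = x1^2 + x2^2 - 1          (circle M1, center (0,0)),
       g(x) = (x1 - c)^2 + x2^2 - 1    (circle M2, center (c,0))."""
    return np.array([2 * x, 2 * (x - np.array([c, 0.0]))])

def transversal_at(x, c):
    # (5.17): transversal at x iff the two gradients are linearly independent
    return np.linalg.matrix_rank(gradients(x, c)) == 2

# centers at distance 1: the circles cross at (1/2, +-sqrt(3)/2) -> transversal
print(transversal_at(np.array([0.5, np.sqrt(3) / 2]), c=1.0))   # True

# centers at distance 2: the circles touch at (1, 0) -> not transversal
print(transversal_at(np.array([1.0, 0.0]), c=2.0))              # False
```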

EX. 5.6. Let be given the closed manifold M_1 = {0} and the "relatively open" manifold M_2 = {(x_1, 0) | x_1 > 0} in R^2. Since M_1 ∩ M_2 = ∅, by definition we have M_1 ⋔ M_2. However, under arbitrarily "small" perturbations of M_1 (or M_2) nontransversal intersections may appear, since cl M_1 ∩ cl M_2 ≠ ∅ (see Figure 5.4).


FIGURE 5.4. Unperturbed manifolds M_1, M_2 intersecting transversally; perturbed manifolds M_1, M̃_2 show a nontransversal intersection.
5.5. Transversality of mappings
In this section we again formulate all results for C^∞-manifolds and C^∞-functions, but we recall that the conclusions remain true for the C^k-case, k ≥ 1.
Given a function f ∈ C^∞(R^n, R^m) with components (f_1, …, f_m), we introduce the graph
graph (f) = {(x, f(x)) ∈ R^n × R^m | x ∈ R^n} ⊂ R^{n+m}.
Note that graph (f) is a C^∞-submanifold in R^{n+m} of codimension c_d = m and dimension d = n (see Definition 5.1). Indeed, (x, y) ∈ graph (f) iff the system of m C^∞-equations g(x, y) := y − f(x) = 0 is satisfied. Since the Jacobian

∇g(x, y) = (−∇f(x)  I_m)
has full rank m, the gradients ∇g_j(x, y) = (−∇f_j(x), e_j), j = 1, …, m, are linearly independent.


FIGURE 5.5. Sketch of the mapping graph (f).

EX. 5.7. Let f ∈ C^∞(R^n, R^m). The C^∞-mapping
G_f: R^n → graph (f), x ↦ (x, f(x)),
defines a C^∞-diffeomorphism between the manifolds of dimension n given by R^n and graph (f). Its inverse is given by
G_f^{-1} = Π^x|_{graph (f)},
where Π^x|_{graph (f)} is the restriction to graph (f) of the projection Π^x: R^n × R^m → R^n, Π^x(x, y) = x.
Proof. The mapping G_f is bijective and satisfies (Π^x|_{graph (f)} ∘ G_f)(x) = x. □

EX. 5.8.
(a) If M ⊂ R^m is a C^∞-manifold of codimension c_d, then R^n × M is a C^∞-manifold in R^{n+m} of the same codimension c_d.
(b) Let M be a C^∞-submanifold of graph (f) of codimension c_d; then N = Π^x(M) is a C^∞-manifold in R^n of the same codimension c_d.

Proof. (a) Let the manifold M be locally defined in a neighborhood V_y of y ∈ M by the equations g_j(y) = 0, j = 1, …, c_d. Then obviously, for (x, y) ∈ R^n × M, an element (x, y) ∈ R^n × V_y is in R^n × M iff the equations h_j(x, y) := g_j(y) = 0, j = 1, …, c_d, are satisfied. By assumption the gradients ∇h_j(x, y) = (0, ∇g_j(y)), j = 1, …, c_d, are linearly independent.
(b) Recall that G_f: R^n → graph (f) defines a C^∞-diffeomorphism. With G_f^{-1}(M) = Π^x|_{graph (f)}(M) = N, the statement follows by using Ex. 5.2. □

DEFINITION 5.7. [f meets M transversally]
Let M ⊂ R^m be a manifold and let f ∈ C^∞(R^n, R^m). We say that f meets M transversally, notation f ⋔ M, if the manifolds M_1 = graph (f) and M_2 = R^n × M intersect transversally, i.e., if M_1 ⋔ M_2.
This definition implies the following.

LEMMA 5.6. Let f ∈ C^∞(R^n, R^m) and let M be a manifold in R^m of codimension c_d (dimension d = m − c_d). Assume f ⋔ M and n < c_d. Then
{f(x) | x ∈ R^n} ∩ M = ∅,
i.e., f does not meet M at all.

Proof. By Definition 5.7 the condition f ⋔ M means graph (f) ⋔ R^n × M for the manifold M_1 := graph (f) of codimension m and M_2 := R^n × M of codimension c_d in R^{n+m}. The assumption n < c_d implies
codim M_1 + codim M_2 = m + c_d > m + n,
and Remark 5.3(b) yields M_1 ∩ M_2 = ∅, or graph (f) ∩ (R^n × M) = ∅, or equivalently
{f(x) | x ∈ R^n} ∩ M = ∅. □

LEMMA 5.7. Let M ⊂ R^m be a C^∞-manifold of codimension c_d, and f ∈ C^∞(R^n, R^m). Suppose f ⋔ M. Then either f^{-1}(M) = ∅ or f^{-1}(M) is a manifold in R^n of codimension c_d.

Proof. (See [21, Th. 7.3.3]) Let f ⋔ M and f^{-1}(M) ≠ ∅. Since graph (f), R^n × M are manifolds in R^{n+m} of codimensions m, c_d, respectively, by Remark 5.3(a) the intersection graph (f) ∩ (R^n × M) defines a submanifold of graph (f) of codimension c_d. In view of Ex. 5.2 (recall that Π^x|_{graph (f)} defines a diffeomorphism) the image f^{-1}(M) = Π^x(graph (f) ∩ (R^n × M)) is also a submanifold of R^n of codimension c_d. □
We give a characterization of the transversal intersection of a mapping f with a manifold M in terms of the Jacobian of f and the tangent space of M.

THEOREM 5.15. Let f ∈ C^∞(R^n, R^m) and let M ⊂ R^m be a manifold. Then we have f ⋔ M if and only if for every x ∈ R^n with f(x) ∈ M it holds that
∇f(x)[R^n] + T_{f(x)}M = R^m.
Proof. (See [21, Th. 7.3.4]) By definition and Lemma 5.5 the relation f ⋔ M holds iff for all (x, y) in graph (f) ∩ (R^n × M) we have:
(5.18) T_{(x,y)}graph (f) + T_{(x,y)}(R^n × M) = R^n × R^m.

Recall that (x, y) is in graph (f) iff g(x, y) := y − f(x) = 0. So, T_{(x,y)}graph (f) = N_{(x,y)}(graph (f))^⊥ = ker(−∇f(x)  I_m). Obviously this set is spanned by the columns of
\begin{pmatrix} I_n \\ ∇f(x) \end{pmatrix}.
The tangent space T_{(x,y)}(R^n × M) of the product R^n × M is linearly isomorphic to T_x R^n × T_y M = R^n × T_y M. So (5.18) is valid iff the space ∇f(x)[R^n] spanned by the columns of ∇f(x) satisfies ∇f(x)[R^n] + T_{f(x)}M = R^m. □
As an example consider the function f: R → R, f(x) = (1/3)x^3 − x. Take the manifold M_c = {c}, c ∈ R, of codimension 1 in R, with T_c M_c = {0}. Then, by Theorem 5.15, for c = 2/3 the mapping f does not meet M_{2/3} transversally, since at x = −1 with f(−1) ∈ M_{2/3} we have (cf. Figure 5.6)
f′(x)[R] + T_c M_c = 0[R] + {0} = {0} ≠ R.
For all c ∉ {±2/3} we find, for any x with f(x) = c, that f′(x) ≠ 0 and thus
f′(x)[R] + T_c M_c = f′(x)[R] = R,
and f meets M_c transversally.


FIGURE 5.6. Nontransversal intersection of M2/3 with f, or of R × M2/3 with graph (f).
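For this one-dimensional example the criterion of Theorem 5.15 can be checked directly: f meets M_c transversally iff f′(x) ≠ 0 at every real solution of f(x) = c. A minimal sketch (NumPy; the sampled values of c are our own choice):

```python
import numpy as np

f  = lambda x: x**3 / 3 - x
df = lambda x: x**2 - 1

def meets_transversally(c):
    # real solutions of f(x) = c, i.e. roots of x^3/3 - x - c
    roots = np.roots([1 / 3, 0.0, -1.0, -c])
    real = roots[np.abs(roots.imag) < 1e-9].real
    # Theorem 5.15 with m = 1 and T_c M_c = {0}: transversal iff f'(x) != 0 there
    return all(abs(df(x)) > 1e-9 for x in real)

print(meets_transversally(2 / 3))   # False: x = -1 is a critical point with f(-1) = 2/3
print(meets_transversally(0.0))     # True
print(meets_transversally(1.0))     # True
```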

EX. 5.9. Let S ⊂ R^m be a C^∞-manifold of codimension c_d and let f ∈ C^∞(R^p, R^m). Then f ⋔ S holds if and only if for every x ∈ R^p with f(x) ∈ S the following is satisfied: if in a neighborhood U_y of y = f(x) the functions h_j ∈ C^∞(U_y, R), j = 1, …, c_d, define S (via h_j(y) = 0), then the functions g_j(x) := h_j(f(x)), j = 1, …, c_d, are linearly independent in the sense that
(5.19) ∇g_j(x) = ∇h_j(f(x)) · ∇f(x), j = 1, …, c_d, are linearly independent vectors.
Proof. Consider the condition f ⋔ S, or equivalently (cf. Definition 5.7) graph (f) ⋔ (R^p × S). The manifold graph (f) of codimension m in R^p × R^m is globally defined by the m equations F(x, y) := y − f(x) = 0, and near y = f(x) the manifold S of codimension c_d is given by the c_d equations
H(x, y) := \begin{pmatrix} h_1(y) \\ \vdots \\ h_{c_d}(y) \end{pmatrix} = 0.
So, by definition, f ⋔ S means that the m + c_d rows (gradients of the defining equations) of the (m + c_d) × (p + m)-matrix
(5.20) \begin{pmatrix} ∇_x F(x, y) & ∇_y F(x, y) \\ 0 & ∇_y H(x, y) \end{pmatrix} = \begin{pmatrix} −∇f(x) & I_m \\ 0 & ∇h(y) \end{pmatrix}
are linearly independent. This implies c_d ≤ p. By applying Gaussian elimination, subtracting "∇h(y) times the rows of (−∇f(x)  I_m) from the c_d last rows of (0  ∇h(y))", the right-hand side of (5.20) becomes
\begin{pmatrix} −∇f(x) & I_m \\ ∇h(y)∇f(x) & 0 \end{pmatrix},
and this matrix has rank c_d + m iff the matrix ∇h(y)∇f(x) has full rank c_d. □

CHAPTER 6

Genericity results for parametric problems

In this chapter we examine the genericity properties of parametric problems. Again, only C^∞-functions and C^∞-manifolds are considered. The results are based on so-called transversality theorems which are originally due to René Thom. However, the presentation of the material and the proofs in Sections 6.1, 6.3, 6.5 are essentially taken from the landmark book [21].

6.1. Jet transversality theorems
For later purposes we have to extend the notion of the graph for a function f ∈ C^∞(R^n, R^m), f(x) = (f_1(x), …, f_m(x)). The 1-jet extension j^1 f of f is given by the mapping
j^1 f: R^n → R^{N(n,m,1)} with N(n, m, 1) = n + m + mn,
x ↦ (x, f_1(x), …, f_m(x), ∂f_1(x)/∂x_1, …, ∂f_1(x)/∂x_n, ∂f_2(x)/∂x_1, …, ∂f_m(x)/∂x_n).
The 2-jet extension j^2 f of f is the mapping
j^2 f: R^n → R^{N(n,m,2)} with N(n, m, 2) = n + m + mn + m·(n + 1)n/2,
x ↦ (j^1 f(x), ∂²f_1(x)/∂x_1∂x_1, ∂²f_1(x)/∂x_2∂x_1, …, ∂²f_1(x)/∂x_n∂x_n, ∂²f_2(x)/∂x_1∂x_1, …, ∂²f_2(x)/∂x_n∂x_n, …, ∂²f_m(x)/∂x_1∂x_1, …, ∂²f_m(x)/∂x_n∂x_n).
In the same way we can define the ℓ-jet extension j^ℓ f, ℓ ≥ 3, of f by
j^ℓ f: R^n → R^{N(n,m,ℓ)},
x ↦ (j^{ℓ−1} f(x), ∂^ℓ f_1(x)/∂x_1^ℓ, ∂^ℓ f_1(x)/∂x_2∂x_1^{ℓ−1}, …, ∂^ℓ f_m(x)/∂x_n^ℓ),
where N(n, m, ℓ) is the length of the vector j^ℓ f(x). The space J(n, m, ℓ) := R^{N(n,m,ℓ)} is called the jet space of order ℓ. The function graph (f) coincides with the 0-jet extension j^0 f. Note that j^ℓ f = {j^ℓ f(x) | x ∈ R^n} is a manifold of dimension n in J(n, m, ℓ).
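A minimal sketch of how a jet extension can be assembled symbolically (SymPy; the example function f: R² → R and the concrete ordering of the derivative entries are our own choices following the description above):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
X = [x1, x2]
f = sp.Matrix([x1**3 / 3 - x1 * x2])       # f : R^2 -> R, i.e. n = 2, m = 1

def jet1(f, X):
    # j^1 f(x) = (x, f(x), all first-order partial derivatives)
    firsts = [sp.diff(fi, xj) for fi in f for xj in X]
    return sp.Matrix(X + list(f) + firsts)

def jet2(f, X):
    # j^2 f(x) = (j^1 f(x), second-order partials, one entry per unordered
    # index pair, matching N(n,m,2) = n + m + mn + m(n+1)n/2)
    seconds = [sp.diff(fi, X[i], X[j])
               for fi in f
               for i in range(len(X)) for j in range(i, len(X))]
    return sp.Matrix(list(jet1(f, X)) + seconds)

print(jet1(f, X).T)   # length N(2,1,1) = 2 + 1 + 2 = 5
print(jet2(f, X).T)   # length N(2,1,2) = 5 + 3 = 8
```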

The following transversality theorems from Jongen et al. in [21] are the basis for proving genericity results for parametric problems.

THEOREM 6.1. [Jet Transversality Theorem] Let be given n, m ≥ 1, ℓ ≥ 0, and a manifold M ⊂ J(n, m, ℓ). Then the set
(6.1) P = {f ∈ C^∞(R^n, R^m) | j^ℓ f ⋔ M}
is generic (thus dense, cf. Definition 2.3) in C^∞(R^n, R^m) wrt. the C_s^k-topology for any k ≥ ℓ + 1. If moreover M is closed in J(n, m, ℓ), then the set P in (6.1) is also open in this topology for k ≥ ℓ + 1. If f ∈ P and (j^ℓ f)^{-1}(M) ≠ ∅, then (j^ℓ f)^{-1}(M) is a manifold in R^n of codimension
codim (j^ℓ f)^{-1}(M) = codim M.
Proof. We refer the reader to [21, Th. 7.4.5] and [18, 2.1 Transversality Theory] for a proof. The statement on the codimension of (j^ℓ f)^{-1}(M) follows from Lemma 5.7. □
This result can be extended to stratifications.

THEOREM 6.2. [Jet Transversality Theorem for stratifications] Let be given n, m ≥ 1, ℓ ≥ 0, and a Whitney regular stratified set (S, Σ), S ⊂ J(n, m, ℓ). Then the set
(6.2) P = {f ∈ C^∞(R^n, R^m) | j^ℓ f ⋔ X for all X ∈ Σ}
is generic (thus dense, cf. Definition 2.3) in C^∞(R^n, R^m) wrt. the C_s^k-topology for any k ≥ ℓ + 1. If moreover the stratified set S is closed in J(n, m, ℓ), then the set P in (6.2) is also open in this topology for k ≥ ℓ + 1. Furthermore, for f ∈ P, the inverse images {(j^ℓ f)^{-1}(X) | X ∈ Σ} form a Whitney regular stratification of (j^ℓ f)^{-1}(S) satisfying codim (j^ℓ f)^{-1}(X) = codim X (for all X such that (j^ℓ f)^{-1}(X) ≠ ∅).
Proof. We refer the reader to [21, Th. 7.5.11] and [18, 2.1 Transversality Theory] for a proof. The statement on the codimension of (j^ℓ f)^{-1}(X) follows from Lemma 5.7. □
We often are not interested in the whole jet extension j^ℓ f but only in a part of it. This can be done with the help of

LEMMA 6.1. Let F ∈ C^∞(R^n, R^{m_1} × R^{m_2}). With a manifold M̃ in R^{m_2} and an open set S_0 in R^{m_1}, consider the manifold M := S_0 × M̃ in R^{m_1} × R^{m_2}. Let further, with the projection Π_2(y_1, y_2) = y_2, (y_1, y_2) ∈ R^{m_1} × R^{m_2}, be defined F̃ = Π_2 ∘ F. Then it holds that
F ⋔ S_0 × M̃ ⇔ F̃ ⋔ M̃.

Proof. We make use of the formulas (Π_1 is the projection Π_1(y_1, y_2) = y_1)
∇F̃(x) = ∇(Π_2 ∘ F)(x) = [∇F(x)]_{y_2} and T_{F(x)}(S_0 × M̃) ≡ T_{Π_1(F(x))}S_0 × T_{F̃(x)}M̃ = R^{m_1} × T_{F̃(x)}M̃,
to obtain that F ⋔ S_0 × M̃ is equivalent with (see Theorem 5.15)
∇F(x)[R^n] + T_{F(x)}(S_0 × M̃) = R^{m_1} × R^{m_2},
or
[∇F(x)]_{y_1}[R^n] × [∇F(x)]_{y_2}[R^n] + R^{m_1} × T_{F̃(x)}M̃ = R^{m_1} × R^{m_2}.
By taking the last m_2 relations we obtain our statement. □

REMARK 6.1. As an example of Lemma 6.1 consider, for f ∈ C^∞(R^n, R), the 1-jet extension
F(x) := j^1 f(x) = (x, f(x), ∇f(x)), J(n, 1, 1) = (R^n × R) × R^n,
with m_1 = n + 1, m_2 = n, and the reduced 1-jet extension
j̃^1 f: R^n → R^n, j̃^1 f(x) := ∇f(x), M̃ = {0}, M = (R^n × R) × M̃.
By Lemma 6.1 we have
j^1 f ⋔ M ⇔ j̃^1 f ⋔ M̃ ⇔ ∇f ⋔ {0}.

We wish to point out that the results below on parametric problems, based on the Jet Transversality Theorems 6.1 and 6.2, in particular contain the genericity results for the corresponding non-parametric problems in Chapter 4 as special cases. See Remark 6.2 below for details.

6.2. Genericity results for parametric families of matrices
In this section we study parametric families A(t) of matrices. In the first subsection we are interested in the generic behavior of such families wrt. the rank, and in the second subsection in the generic properties of A(t) wrt. eigenvalues. In both cases we only consider families of symmetric matrices.
6.2.1. The ranks of parametric families of symmetric matrices.
In Corollary 5.2 we have seen that generically matrices A ∈ S^n have full rank n. So generically, matrices A ∈ S^n avoid all manifolds S^n(j) of matrices of rank j ≤ n − 1. We now consider families A(t) of matrices in S^n depending smoothly on a parameter t ∈ R^p. More precisely, let be given a function
(6.3) A: R^p → S^n, t ↦ A(t), A ∈ C^∞(R^p, S^n).
We cannot expect that generically A(t) will be of rank n for all t ∈ R^p. Take, e.g., the family A(t) ∈ S^2, t ∈ R, given by
A(t) = \begin{pmatrix} t & 1 \\ 1 & t \end{pmatrix}.

Then the continuous function det A(t) = t^2 − 1 has value −1 for t_1 = 0 and value 3 for t_2 = 2. Consequently, by the Intermediate Value Theorem there must exist at least one point t_0 in the interval (0, 2) such that det A(t_0) = 0, i.e., A(t_0) has rank < 2. In fact, det A(t_0) = 0 for t_0 = 1 (or t_0 = −1) (see Figure 6.1). Moreover, any small smooth perturbation of A(t) will result in a family of matrices which has rank < 2 for some t near t_0 = 1 (and near t_0 = −1).


FIGURE 6.1. The function det A(t) = t^2 − 1, with two zeroes t_0 = ±1.
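A minimal numerical sketch of this observation (NumPy; the small symmetric perturbation E(t) is an arbitrary choice for illustration): even after perturbing the family, det A(t) still changes sign on (0, 2), so a rank-deficient parameter value persists near t_0 = 1.

```python
import numpy as np

A = lambda t: np.array([[t, 1.0], [1.0, t]])
# a small symmetric perturbation (arbitrary choice for illustration)
E = lambda t: 1e-2 * np.array([[np.sin(t), np.cos(t)], [np.cos(t), -t]])

def singular_parameter(fam, a=0.0, b=2.0, tol=1e-12):
    """Bisection on det fam(t): a sign change on [a, b] yields t0 with rank fam(t0) < 2."""
    d = lambda t: np.linalg.det(fam(t))
    assert d(a) * d(b) < 0
    while b - a > tol:
        m = 0.5 * (a + b)
        a, b = (m, b) if d(a) * d(m) > 0 else (a, m)
    return 0.5 * (a + b)

print(singular_parameter(A))                      # 1.0
print(singular_parameter(lambda t: A(t) + E(t)))  # close to 1.0: singularity persists
```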

We will now apply the Jet Transversality Theorem 6.2 to examine which manifolds S^n(j) are generically avoided by a matrix function A(t). As can be expected, the behavior essentially depends on the dimension p of the parameter space R^p. Recall that by Lemma 5.6, for A ⋔ S^n(j), the family A(t) does not meet S^n(j) at all if p < codim S^n(j).

COROLLARY 6.1. [Genericity result for parametric families of symmetric matrices] Let be given the Whitney regular stratified set (S^n, Σ) with Σ = {S^n(j), j = 0, …, n}, see Theorem 5.13(b). Let

P_r = {A ∈ C^∞(R^p, S^n) | A ⋔ S^n(j) ∀ j = 0, …, n}.
Then the set P_r is dense and open in C^∞(R^p, S^n) wrt. the C_s^k-topology for k ≥ 1. Moreover, for A ∈ P_r the sets
S_j^0 = A^{-1}(S^n(j)), j = 0, …, n,
are manifolds in R^p which define a Whitney regular stratification (R^p, Σ^0) of R^p with strata set Σ^0 = {S_j^0 | j = 0, …, n}.
Proof. To apply Theorem 6.2 to the matrix family in (6.3) we take the manifolds

M_j := R^p × S^n(j), j = 0, …, n.

By Ex. 5.3 and Theorem 5.13(b) the family Σ = {M_j, j = 0, …, n} defines a Whitney regular stratification (R^p × S^n, Σ) of R^p × S^n. Note that S := R^p × S^n is a closed set. Then Theorem 6.2 states that for ℓ = 0 and the jet-extension j^0 A = graph (A) the set
P_r^* := {A ∈ C^∞(R^p, S^n) | graph (A) ⋔ R^p × S^n(j) ∀ j = 0, …, n}

is dense and open in the C_s^k-topology in C^∞(R^p, S^n) for all k ≥ 1. Recalling that by Definition 5.7 we have
A ⋔ S^n(j) ⇔ graph (A) ⋔ (R^p × S^n(j)),
this proves the first statement.
Furthermore, Theorem 6.2 asserts that for A ∈ P_r^* the sets S_j^* := (graph (A))^{-1}(R^p × S^n(j)) form a Whitney regular stratification of the set (graph (A))^{-1}(R^p × S^n) = R^p. In view of
(graph (A))^{-1}(R^p × S^n(j)) = {t ∈ R^p | (t, A(t)) ∈ R^p × S^n(j)} = {t ∈ R^p | A(t) ∈ S^n(j)} = A^{-1}(S^n(j)),
also the second claim follows. □
To illustrate this genericity result let p = 1, i.e., t ∈ R. Suppose A is in P_r. By Lemma 5.6 we have
{A(t) | t ∈ R} ∩ S^n(j) = ∅ if 1 < codim S^n(j).
For j = n − 1 we have codim S^n(n − 1) = 1, and for j ≤ n − 2 it follows codim S^n(j) ≥ codim S^n(n − 2) = 3 (cf. Theorem 5.12). Consequently, in our case p = 1, the set {A(t) | t ∈ R} does generically not meet any set S^n(j) of matrices of rank j ≤ n − 2. For the manifold S^n(n − 1) we possibly have A^{-1}(S^n(n − 1)) ≠ ∅, but by Lemma 5.7 (for A ∈ P_r) the set A^{-1}(S^n(n − 1)) is a subset of R of codimension 1, i.e., this set consists of isolated points t_i ∈ R, i ∈ I, with a possibly infinite index set I. In other words, generically, a one-parametric family A(t), t ∈ R, of matrices in S^n only meets the manifold S^n(n − 1) on a discrete point set.
For the case p > 1, t ∈ R^p: Here we can conclude in the same way that A ∈ P_r does not meet S^n(j) if
p < codim S^n(j) = (n − j + 1)(n − j)/2.
Similar results are valid for families A(t) of matrices in M^n (see [36] for details).

6.2.2. Eigenvalues of parametric families of symmetric matrices.
In Section 4.2 we have shown that generically a real symmetric matrix A ∈ S^n has only simple eigenvalues. In this section we are interested in the generic behavior of the eigenvalues of a parametric family of real symmetric matrices,
A: R^p → S^n, t ↦ A(t), A ∈ C^∞(R^p, S^n).
Here n and p are given natural numbers. The genericity results below are based on the following stratification of S^n wrt. the distribution of multiplicities of eigenvalues. Let A ∈ S^n have l distinct eigenvalues λ_1 < … < λ_l and let m_j(A) denote the multiplicity of the eigenvalue λ_j, j = 1, …, l. We define

σ(A) = {m1(A), . . . , ml(A)} . 86 6. GENERICITY RESULTS FOR PARAMETRIC PROBLEMS

Then σ(A) characterizes the multiplicities of the eigenvalues of A. Now we introduce partitions σ of n into positive integers m_j, j = 1, …, k,
σ = {m_1, …, m_k}, m_1 + … + m_k = n.
The set of all such partitions σ is denoted by S. For σ ∈ S we can define the subsets
A_σ = {A ∈ S^n | σ(A) = σ} ⊂ S^n ≡ R^{(n+1)n/2}.
Here is the stratification result.

THEOREM 6.3. [Stratification of S^n according to multiplicities of eigenvalues]
The partition Σ = {A_σ, σ ∈ S} forms a Whitney regular C^∞-stratification of the set S^n. Moreover, for σ = {m_j, j = 1, …, k} ∈ S we have
codim A_σ = Σ_{j=1}^{k} ((m_j + 1)m_j/2 − 1).
Proof. For a proof of this result we refer to [25]. The proof uses techniques as in the proof of Theorem 5.13. □
For a similar stratification result for real skew-symmetric matrices we refer to [26]. As a direct consequence of Theorem 6.2, for ℓ = 0 and the 0-jet mapping j^0 A = graph (A), we obtain the following genericity result.

COROLLARY 6.2. Let (S^n, Σ) be the Whitney regular stratification of Theorem 6.3. Then the set
P_r = {A ∈ C^∞(R^p, S^n) | A ⋔ A_σ for all A_σ ∈ Σ}
is dense and open in C^∞(R^p, S^n) wrt. the C_s^k-topology for all k ≥ 1. Moreover, for A ∈ P_r the sets A^{-1}(A_σ), σ ∈ S, form a Whitney regular stratification (R^p, Σ^0), Σ^0 := {A^{-1}(A_σ), σ ∈ S}, of the set R^p = A^{-1}(S^n), with codim A^{-1}(A_σ) = codim A_σ.

As a general observation for A ∈ P_r we note:
• Generically (i.e., for A ∈ P_r) the family A(t), t ∈ R^p, does not meet any set A_σ with codim A_σ > p.
The case p = 1 is especially interesting.

One-parametric families of symmetric matrices.

In this case p = 1, or t ∈ R, generically all sets A_σ of matrices with multiple eigenvalues are avoided by A(t). Indeed, according to Theorem 6.3, the stratum A_{σ_0} with σ_0 = {1, …, 1}, containing all matrices A ∈ S^n with n distinct eigenvalues, has codimension 0, i.e., the set A_{σ_0} is an open subset of S^n. The next stratum A_{σ_1}, σ_1 = {2, 1, …, 1}, with exactly one double and n − 2 simple eigenvalues, has codimension
codim A_{σ_1} = 3·2/2 − 1 = 2.

This leads to the following special case of Corollary 6.2.

COROLLARY 6.3. Let p = 1 and let A ∈ P_r. Then for all parameter values t ∈ R the matrices A(t) have n simple eigenvalues.
More precisely, the set of eigenvalues λ_1(t), …, λ_n(t) of A(t) is given by n C^∞-functions λ_j(t), j = 1, …, n, t ∈ R, which never intersect on the whole set R (see Figure 6.2).


FIGURE 6.2. Eigenvalue curves of a generic one-parametric family A(t).

The density part of Corollary 6.3 means that if we have given a (non-generic) matrix function A ∈ C^∞(R, S^n) with intersecting eigenvalue functions, then by arbitrarily small appropriate C^∞-perturbations E(t) (small wrt. the C_s^∞-topology) we obtain a family Ã(t) = A(t) + E(t) from P_r with n non-intersecting eigenvalue curves λ̃_j(t), t ∈ R.
Here a question arises: Can such a perturbation E(t) be given explicitly? As we shall see, the answer is yes, at least for one-parameter families A(t) which are not only C^∞-functions but analytic functions of t ∈ R. Recall that a matrix function A: R → S^n, A ∈ C^∞(R, S^n), is called analytic on R if, at each t̄ ∈ R, for any component a_{ij}(t) of A(t) the Taylor expansion
a_{ij}(t) = Σ_{ν=0}^{∞} (1/ν!) (d^ν a_{ij}/dt^ν)(t̄) (t − t̄)^ν

is convergent in some interval (t̄ − ε, t̄ + ε) for some ε > 0, ε = ε_{ij}(t̄), i, j = 1, …, n.
To understand why we need analyticity of A(t), we have to summarize some facts on the stability of eigenvalues and eigenvectors of a one-parametric family of symmetric matrices from the landmark book of Kato [28]. Let A ∈ C^∞(R, S^n). It is easy to show, by applying the Implicit Function Theorem, that near a simple eigenvalue λ̄ of A(t̄) there are locally defined C^∞-functions λ(t), λ(t̄) = λ̄, and x(t), such that for t near t̄ the value λ(t) is a simple eigenvalue of A(t) with corresponding eigenvector x(t). Problems arise if λ̄ is an eigenvalue with multiplicity m > 1. Then the eigenvalue functions are still C^∞-functions, but the corresponding eigenvectors need not even behave continuously. The classical counterexample is due to Rellich (cf., e.g., [28, p. 111]),
A(t) = e^{−1/t^2} \begin{pmatrix} cos 2/t & sin 2/t \\ sin 2/t & −cos 2/t \end{pmatrix}, t ≠ 0, A(0) = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.

This C^∞-matrix family (not analytic) has C^∞-eigenvalue functions λ_{1,2}(t) = ±e^{−1/t^2} with double eigenvalue λ_{1,2}(0) = 0 for t = 0. But it can be shown that no corresponding eigenvector functions x_{1,2}(t) can be defined as continuous functions near t = 0.
As with magic, things change when, instead of merely a C^∞-function, the family A(t) is analytic. Then we have the following deep result (cf. [28, Vol. II, Theorem 6.1]).

THEOREM 6.4. Let A: R → S^n be analytic on R. Then appropriately defined eigenvalue functions λ_j(t) and corresponding orthonormal eigenvectors x_j(t), j = 1, …, n (i.e., ‖x_j(t)‖ = 1, x_j(t)^T x_i(t) = 0, j ≠ i), are analytic.

Now, in case the eigenvector functions x_j(t) are smooth (say C^∞- or analytic functions), they allow a simple perturbation of A(t) which completely avoids multiple eigenvalues. We present an illustrative example,
A(t) = \begin{pmatrix} t & 0 \\ 0 & −t \end{pmatrix}, t ∈ R,
with eigenvalue functions λ_1(t) = t, λ_2(t) = −t, intersecting at t = 0, and corresponding (constant) eigenvectors x_1(t) = e_1, x_2(t) = e_2.
Perturbing the family A with the symmetric matrix e_1e_2^T + e_2e_1^T (dyadic product with the eigenvectors e_1, e_2) leads to a function
Ã(t) = A(t) + ε(e_1e_2^T + e_2e_1^T) = \begin{pmatrix} t & ε \\ ε & −t \end{pmatrix}, ε ≠ 0,
with no more intersecting eigenvalue functions λ_{1,2}^ε(t) = ±\sqrt{t^2 + ε^2} (cf. Figure 6.3).


FIGURE 6.3. Eigenvalue functions of the unperturbed and the perturbed matrix family.
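A minimal numerical check of this perturbation (NumPy; the grid of t-values and the value of ε are arbitrary choices): the gap between the two eigenvalue curves of Ã(t) is 2√(t² + ε²) and hence stays bounded below by 2|ε|, so the curves cannot intersect.

```python
import numpy as np

eps = 0.1
ts = np.linspace(-2.0, 2.0, 2001)

def eigs(t):
    A_tilde = np.array([[t, eps], [eps, -t]])   # perturbed family \tilde A(t)
    return np.linalg.eigvalsh(A_tilde)          # sorted eigenvalues

gaps = np.array([np.diff(eigs(t))[0] for t in ts])
print(gaps.min(), 2 * abs(eps))    # minimal gap equals 2|eps| (attained at t = 0)
```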

This perturbation can be generalized (see [36] for details).

THEOREM 6.5. Let the matrix function A: R → S^n be analytic on R and let λ_j(t), x_j(t), j = 1, …, n, be the analytic eigenpair functions according to Theorem 6.4, with possibly intersecting eigenvalue functions λ_j(t). Then for any ε ≠ 0 the perturbed analytic family

A^ε(t) = A(t) + ε Σ_{j=1}^{n−1} (x_j(t)x_{j+1}(t)^T + x_{j+1}(t)x_j(t)^T)

has eigenvalue functions λ_j^ε(t) which never intersect on the whole of R.
Proof. (See also [36, Corollary 1].) With the analytic orthogonal matrices Q(t) = [x_1(t) … x_n(t)] the families A(t), A^ε(t) are transformed as follows:
Q(t)^T A(t) Q(t) = \begin{pmatrix} λ_1(t) & & 0 \\ & \ddots & \\ 0 & & λ_n(t) \end{pmatrix},
and
Q(t)^T A^ε(t) Q(t) = \begin{pmatrix} λ_1(t) & ε & \cdots & 0 \\ ε & λ_2(t) & \ddots & \vdots \\ \vdots & \ddots & \ddots & ε \\ 0 & \cdots & ε & λ_n(t) \end{pmatrix} =: T^ε(t),
where ε ≠ 0. Obviously, for any λ ∈ R the matrix T^ε(t) − λI has rank ≥ n − 1, since it contains a nonsingular (n − 1) × (n − 1)-submatrix. So the eigenvalues λ_j^ε(t) of T^ε(t), and thus of A^ε(t), are simple for all t ∈ R. □
As a final example we take the matrix family
A(t) = \begin{pmatrix} t^2 & 0 & 0 \\ 0 & t & 0 \\ 0 & 0 & −t \end{pmatrix}, t ∈ R,

with eigenvalue functions λ_1(t) = t^2, λ_2(t) = t, λ_3(t) = −t, two double eigenvalues λ = 1 at t = ±1, and an eigenvalue λ = 0 at t = 0 with multiplicity 3 (see Figure 6.4). The eigenvalue curves of the perturbed family (ε ≠ 0)
A^ε(t) = A(t) + ε(e_1e_2^T + e_2e_1^T + e_2e_3^T + e_3e_2^T) = \begin{pmatrix} t^2 & ε & 0 \\ ε & t & ε \\ 0 & ε & −t \end{pmatrix}, t ∈ R,
never intersect (see Figure 6.5). Let us remark that we also obtain similar results for the real eigenvalues of matrix families A(t) in M^n. For the one-parametric case we refer to [36].
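This claim can be checked numerically for the example above. A minimal sketch (NumPy; the grid of t-values and ε are arbitrary choices): on a fine grid the minimal gap between consecutive eigenvalue curves of A^ε(t) stays strictly positive, while for the unperturbed family A(t) it hits zero at t = 0, ±1.

```python
import numpy as np

eps = 0.05

def A(t):
    return np.diag([t**2, t, -t])

def A_eps(t):
    E = np.zeros((3, 3))
    E[0, 1] = E[1, 0] = E[1, 2] = E[2, 1] = eps   # eps*(e1 e2^T + e2 e1^T + e2 e3^T + e3 e2^T)
    return A(t) + E

ts = np.linspace(-2.0, 2.0, 4001)                 # grid contains t = 0, +-1
min_gap = lambda fam: min(np.diff(np.linalg.eigvalsh(fam(t))).min() for t in ts)

print(min_gap(A))       # ~0: the unperturbed curves intersect
print(min_gap(A_eps))   # strictly positive: the perturbed curves stay apart
```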


FIGURE 6.4. Three eigenvalue curves with two double eigenvalues and one triple eigenvalue of the family A(t).


FIGURE 6.5. Eigenvalue curves of the perturbed matrix family A^ε(t) without intersections.

6.3. Genericity results for parametric unconstrained programs

Given a function f ∈ C∞(Rn × Rp, R), we consider problems
(6.4) P(t) : min_{x∈Rn} f(x, t) ,
depending on a parameter t ∈ Rp. For any fixed t the program P(t) represents an unconstrained minimization problem (see Section 4.3). In what follows, the numbers n, p are always fixed. The first subsection studies the general case p ≥ 1, whereas the second subsection especially looks at one-parametric problems. We again emphasize that the material presented in this section is essentially taken from [21, Sections 10.1, 10.2].

6.3.1. Parametric unconstrained programs. Recall from Section 4.3 that a local minimizer x of P(t) must necessarily be a solution of the critical point equation
(6.5) ∇x^T f(x, t) = 0 .

If at such a point (x, t) the regularity condition, 2 (6.6) ∇xf(x, t) is nonsingular, is fulfilled, then by the Implicit Function Theorem 7.2, around (x, t), the solution set of (6.5) is a C∞-manifold in Rn × Rp of dimension p parameterized by a (locally defined) C∞-function x(t) with x(t) = x. We again call a point (x, t) satisfying (6.5) and (6.6) a nondegenerate critical point. For a non-parametric unconstrained minimization problem given by f(x) we have proven in Section 4.3 that generically all critical points of f are nondegenerate. Again here in the parametric case, we cannot expect that generically for all t ∈ Rp at all critical points (x, t), T i.e., at all points satisfying ∇x f(x, t) = 0, the relation (6.6) holds. However, as we shall see in Lemma 6.2, the following weaker condition will be satisfied generically: T T (6.7) ∇(x,t) ∇x f(x, t) has full rank n at all solutions (x, t) of ∇x f(x, t) = 0 . Also under this condition the critical point set n p T (6.8) C(f) = {(x, t) ∈ R × R | ∇x f(x, t) = 0} is a C∞-manifold of dimension p (codimension n). Indeed, the set C(f) is defined globally T by the n equations ∇x f(x, t) = 0, which under (6.7) satisfy the conditions CM1 for a man- T ifold (see Section 5.1). We call a point (x, t) ∈ C(f) satisfying rank ∇(x,t) ∇x f(x, t) = n 2 but rank ∇xf(x, t) < n a turning point of the set C(f).

LEMMA 6.2. The set 1 ∞ n p (6.9) Pr = {f ∈ C (R × R , R) | f fulfills (6.7)} ∞ n p k is open and dense (thus generic) in C (R × R , R) wrt. the Cs -topology, k ≥ 2. More- 1 n p over, for f ∈ Pr , the set C(f) is a manifold of R × R of codimension n. 1 1 Proof. Consider the reduced j -function ˜j f(x, t) = ∇xf(x, t). By Theorem 5.15 the ˜1 − T condition j f t {0} holds iff for any solution (x, t) of ∇x f(x, t) = 0 we have (use ˜1 T∇xf(x,t){0} = {0}, j f = ∇xf) T n p n ∇(x,t) ∇x f(x, t)[R × R ] + {0} = R , T ˜1 − 1 or equivalently ∇(x,t) ∇x f(x, t) has full rank n. Consequently, j f t {0} iff f ∈ Pr . 1 − 1 − n p According to Lemma 6.1 (see also Remark 6.1) ˜j f t {0} holds iff j f t (R × R × {0}). Since the set (Rn × Rp × {0}) is closed, the statement now follows from the Jet Transversality Theorem 6.1. 1 The fact that for f ∈ Pr the set C(f) is a manifold has been shown before the lemma.  1 This lemma in particular states that generically (for f ∈ Pr ) the critical point set C(f) consists of a C∞-manifold in Rn × Rp of codimension n. But now we are interested in the set of degenerate critical points, i.e., the points (x, t) ∈ C(f) with singular Hessian 2 ∇xf(x, t): n p T 2 (6.10) Cs(f) = {(x, t) ∈ R × R | ∇x f(x, t) = 0, det ∇xf(x, t) = 0} . 92 6. GENERICITY RESULTS FOR PARAMETRIC PROBLEMS

To prove the main genericity result of this section, we consider the reduced 2-jet function ˜2 2  n n j f(x, t) = ∇xf(x, t), ∇xf(x, t) ∈ R × S , and the manifolds n Mj = {0} × S (j) . 2 Note that the set Cs(f) consists of all points (x, t) where ˜j f meets one of the manifolds T 2 Mj, j = 0, . . . , n − 1, i.e., points where ∇x f(x, t) = 0 but ∇xf(x, t) has rank j ≤ n − 1. In view of Theorem 5.13(b) and Ex. 5.3 the stratification n ({0} × S , Σ) , with Σ = {Mj, j = 0, . . . , n} , is Whitney regular. Also notice that the set {0} × Sn is closed. So, by applying Theo- rem 6.2 and Remark 6.1, in our context, we obtain the following genericity theorem for the set ∞ n p ˜2 − (6.11) Pr = {f ∈ C (R × R , R) | j f t Mj ∀Mj ∈ Σ} .

∞ n p THEOREM 6.6. The set Pr is open and dense (thus generic) in C (R × R , R) wrt. the k Cs -topology, k ≥ 3. Moreover, for f ∈ Pr, the inverse images (j˜2f)−1{0} × Sn(j) , j = 0, . . . , n ,

n 1 are submanifolds in C(f) of codimension cd = codim S (j) = 2 (n − j + 1)(n − j), and Σ−1 := {(˜j2f)−1{0} × Sn(j) | j = 0, . . . , n} , yields a Whitney regular stratification of the set C(f) = (j˜2f)−1{0} × Sn. Note that we have 1 (6.12) f ∈ Pr ⇒ f ∈ Pr .

Indeed, using Lemma 6.1 (with S0 × Mf replaced by Mf × S0) with the open subset S0 = Sn(n) of Sn and Mf = {0} we have ˜2 − n − j f t Mn = {0} × S (n) ⇒ ∇xf t {0} , − 1 where according to the proof of Lemma 6.2 ∇xf t {0} is equivalent with f ∈ Pr . n p We present some examples: Let f ∈ Pr. Then the set C(f) ⊂ R × R is a manifold of codimension n, dimension p in Rn × Rp. Further: • (j˜2f)−1{0} × Sn(n) is a relatively open submanifold of codimension n in Rn × Rp and of codimension 0 in C(f). ˜2 −1 n  n p • (j f) {0} × S (n − 1) is a submanifold of codimension n + 1 in R × R and of codimension 1 in C(f).

˜2 −1 n  n p • (j f) {0} × S (n − 2) is a submanifold of codimension n + 3 in R × R and of codimension 3 in C(f). ˜2 −1 n  In particular, for p ≤ 2 the set (j f) {0} × S (n − 2) is empty (for f ∈ Pr). 6.3. GENERICITY RESULTS FOR PARAMETRIC UNCONSTRAINED PROGRAMS 93

So, roughly speaking, for "low" p the mapping j̃²f only meets sets {0} × Sn(j) with "high" rank j (low codimension). More precisely, for f ∈ Pr we have
p < codim Sn(j) ⇒ {(x, t) ∈ Rn × Rp | ∇x^T f(x, t) = 0, ∇x²f(x, t) ∈ Sn(j)} = ∅ .

We finish the section with some simple examples for n = p = 1. Consider
P(t) : min_{x∈R} f(x, t) := (1/3)x³ − tx .
The critical point set
C(f) = {(x, t) ∈ R × R | ∇x f(x, t) = x² − t = 0}
is a (p = 1)-dimensional manifold in R² (see Figure 6.6). Here, f ∈ Pr^1 holds since


FIGURE 6.6. Critical point set C(f) for f(x, t) = (1/3)x³ − tx with turning point (x, t) = (0, 0).

∇(x,t)∇x^T f(x, t) = (2x, −1) has full rank 1 .

This function is even in Pr, with only one point (x, t) in the set
(j̃²f)⁻¹({0} × Sn(n − 1)) = {(x, t) | ∇x f(x, t) = x² − t = 0, ∇x²f(x, t) = 2x = 0} = {(0, 0)} .
So, (x, t) = (0, 0) is a so-called turning point of C(f). For the function f(x, t) = (1/3)x³ − t²x the critical point set reads
C(f) = {(x, t) | x² − t² = 0} ,
and C(f) is not a manifold (see Figure 6.7).


FIGURE 6.7. Critical point set C(f) for f(x, t) := (1/3)x³ − t²x, f ∉ Pr.

So, f ∉ Pr^1, and at the critical point (x, t) = (0, 0) we obtain
∇(x,t)∇x^T f(x, t) = (0, 0) .
However, if we take the perturbations of f given by
f±(x, t) := f(x, t) ± εx , ε > 0 ,
we obtain functions in Pr^1 (even in Pr). The set C(f−) (see Figure 6.8) only consists of nondegenerate critical points, i.e., points (x, t) such that det ∇x²f(x, t) ≠ 0 holds. The set C(f+) (see Figure 6.8) has two degenerate critical points (x, t) = (0, ±√ε), which are turning points.
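The following sympy sketch (an illustration added here, not part of the original text) reproduces these computations symbolically: it solves the critical point equation together with the degeneracy condition det ∇x²f = 0 for f and for the perturbed functions f±, with a fixed illustrative value of ε.

```python
import sympy as sp

x, t = sp.symbols('x t', real=True)
f = x**3 / 3 - t**2 * x                  # the example f(x,t) = (1/3)x^3 - t^2 x

def degenerate_critical_points(func):
    """Solve f_x = 0 together with f_xx = 0 (singular 1x1 'Hessian')."""
    return sp.solve([sp.diff(func, x), sp.diff(func, x, 2)], [x, t], dict=True)

eps = sp.Rational(1, 10)                 # an assumed small eps > 0 for illustration
print(degenerate_critical_points(f))          # [{x: 0, t: 0}]  -- the corner of the cross
print(degenerate_critical_points(f + eps*x))  # two turning points (0, +-sqrt(eps))
print(degenerate_critical_points(f - eps*x))  # []  -- C(f-) has only nondegenerate critical points
```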


FIGURE 6.8. Critical points of the perturbed functions f(x, t) ± εx.

6.3.2. One-parametric unconstrained programs. As an important special case of (6.4) we examine one-parametric families of problems P (t), i.e., the case p = 1, t ∈ R. We refer to [21, Section 10.2] for further details.

EX. 6.1. Let (x, t) ∈ Rn × R be a point in C(f) such that ∇∇x^T f(x, t) (∇ = ∇(x,t)) has rank n and det ∇x²f(x, t) = 0, i.e., (x, t) ∈ Cs(f). Then it holds: rank ∇x²f(x, t) = n − 1. In particular, the latter condition is fulfilled for all (x, t) ∈ Cs(f) if f is in Pr.
Proof. In view of ∇∇x^T f(x, t) = ( ∇x²f(x, t) | ∂/∂t ∇x^T f(x, t) ) and det ∇x²f(x, t) = 0 we must have
rank ∇x²f(x, t) ≤ n − 1 and n = rank ∇∇x^T f(x, t) ≤ rank ∇x²f(x, t) + 1 .
This gives rank ∇x²f(x, t) = n − 1. Note that by (6.12), f ∈ Pr implies f ∈ Pr^1 (cf., (6.9)), and thus for all (x, t) ∈ Cs(f) the assumptions in Ex. 6.1 are fulfilled. □
As we will see in a moment, there is a strong connection between the set Cs(f) (cf., (6.10)) and the following optimization problem with objective φ(x, t) := t,
(6.13) min_{x∈Rn, t∈R} t s.t. ∇x^T f(x, t) = 0 .
We introduce the Lagrangean function L for this program,
$$L(x, t, \lambda) = \varphi(x, t) + \sum_{i=1}^{n} \lambda_i \frac{\partial}{\partial x_i} f(x, t) ,$$
and recall that by Definition 6.1(b) below, a feasible point (x, t) is a critical point of (6.13) if ∇∇x^T f(x, t) has full rank n (LICQ holds), and with some multiplier vector λ ∈ Rn we have (∇ = ∇(x,t)),
$$\text{(6.14)}\qquad \nabla^T L(x, t, \lambda) = \begin{pmatrix} 0 \\ 1 \end{pmatrix} + \sum_{i=1}^{n} \lambda_i \nabla^T \frac{\partial}{\partial x_i} f(x, t) = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

Here is the connection between (6.13) and the set Cs(f).

LEMMA 6.3. Let f ∈ Pr. Then (x, t) is a critical point of (6.13) if and only if (x, t) ∈ Cs(f). Proof. The critical point condition (6.14) in particular means that the multiplier λ 6= 0 Pn ∂ 2 2 is a solution of 0 = λi∇x f(x, t) = ∇ f(x, t)λ. So, det ∇ f(x, t) = 0 and thus i=1 ∂xi x x (x, t) ∈ Cs(f). T 2 Suppose now (x, t) ∈ Cs(f), i.e., ∇x f(x, t) = 0 and det ∇xf(x, t) = 0. So, there is a 2 1 solution ξ 6= 0 of ∇xf(x, t)ξ = 0. But since f ∈ Pr and thus f ∈ Pr (cf., (6.12)) the T matrix ∇∇x f(x, t) has rank n and we can conclude    T n  T ∂    0 X ∇x f(x, t) 0 := ∇∇T f(x, t) ξ = ξ ∂xi 6= . α x i ∂ ∂ f(x, t) 0 i=1 ∂t ∂xi Pn ∂ ∂ Consequently, the number α = ξi f(x, t) is not equal to zero, and by dividing i=1 ∂t ∂xi the foregoing relation by α we have found a solution λ := −ξ/α of the critical point equation (6.14) for the program (6.13).  Now let again f ∈ Pr and (x, t) ∈ Cs(f), and recall that by Ex. 6.1 we have 2 rank ∇xf(x, t) = n − 1. By applying an appropriate linear transformation x → Ux, wlog. we can 2 (6.15) assume: ∇xf(x, t) = diag (0, σ2, . . . , σn) , σi 6= 0, i = 2, . . . , n .

Then, since (x, t) ∈ Cs(f) (see Lemma 6.3), (x, t) solves the critical point equations (6.14), where the first n equations have the form ∇x²f(x, t)λ = (λ1·0, λ2σ2, . . . , λnσn)^T = 0 with λ ≠ 0. Consequently we find

(6.16) λ1 ≠ 0, λi = 0, i = 2, . . . , n .
The last equation in (6.14) then implies −1 = λ1 ∂²f/∂t∂x1(x, t), or
(6.17) ∂²f/∂t∂x1(x, t) ≠ 0 .

LEMMA 6.4. Let f ∈ Pr and let (x, t) be a critical point of (6.13), i.e., (x, t) ∈ Cs(f). Then (under the assumption (6.15)), (x, t) is a nondegenerate critical point of (6.13) if and 96 6. GENERICITY RESULTS FOR PARAMETRIC PROBLEMS only if ∂3 (6.18) 3 f(x, t) 6= 0 . ∂x1 T Proof. For f ∈ Pr in particular we have rank ∇∇x f(x, t) = n if (x, t) ∈ C(f), where the rows of ∇∇T f(x, t) are the gradients of the constraint functions ∂ f(x, t), x ∂xi i = 1, . . . , n, and are linearly independent. Under (6.15) we find  ∂2 f(x, t)  ∂t∂x1 ∇∇T f(x, t) =  .  . x  diag (0, σ2, . . . , σn) .  ∂2 f(x, t) ∂t∂xn Obviously, the tangent space

n+1 ∂ T T(x,t) = {ξ ∈ R | ∇ f(x, t)ξ = 0, i = 1, . . . , n} = ker ∇∇x f(x, t) ∂xi is spanned by ξ = (1, 0,..., 0) ∈ Rn+1. By Definition 6.2 below, the critical point (x, t) of (6.13) is nondegenerate if with λ in (6.16) it holds 3 T 2 T 2 ∂ ∂ 0 6= ξ ∇ L(x, t, λ)ξ = λ1ξ ∇ f(x, t)ξ = λ1 3 f(x, t) . ∂x1 ∂x1 ∂3 In view of λ1 6= 0 this is equivalent with 3 f(x, t) 6= 0. ∂x1  T ∞ Recall that for f ∈ Pr the solution set C(f) of ∇x f(x, t) = 0 is a one-dimensional C - n manifold of R × R. Let us examine this manifold near a point (x, t) ∈ Cs(f). We again assume (6.15). Then (σi 6= 0, i = 2, . . . , n)   T ∂ T ∇∇ f(x, t) = diag (0, σ2, . . . , σn) ∇ f(x, t) has rank n , x ∂t x 2 ∂2 and also the (n × n)-submatrix (with x := (x2, . . . , xn), fit := f(x, t)) ∂xi∂t

 0 ··· 0 f1t  σ ··· 0 f T  2 2t  (6.19) ∇(x2,t)∇x f(x, t) =  . .  has rank n .  .. .  0 ··· σn fnt T The Implicit Function Theorem applied to ∇x f(x, t) = 0 assures that near (x, t) the set ∞ 2 C(f) can be parameterized via a C -function (x (x1), t(x1)), x1 ≈ x1, such that T 2 (6.20) ∇x f(x1, x (x1), t(x1)) ≡ 0 for x1 near x1 . T ∂ Differentiation wrt. x1, using ∇ f(x, t) = 0 (cf., (6.15)) yields x ∂x1 d 2 ! x (x1) ∂ T dx1 T (6.21) ∇(x2,t)∇x f(x, t) d = −∇x f(x, t) = 0 . t(x1) ∂x1 dx1 6.3. GENERICITY RESULTS FOR PARAMETRIC UNCONSTRAINED PROGRAMS 97

So in view of (6.19) we obtain

d 2 d (6.22) x (x1) = 0 , t(x1) = 0 . dx1 dx1 The first row of relation (6.21) reads

d 2 ! 2 ∂ x (x1) ∂ dx1 ∇(x2,t) f(x, t) d = − 2 f(x, t) ∂x1 t(x1) ∂x dx1 1

d 2 d T and a further differentiation wrt. x1 leads to (recall x (x1), t(x1) = 0) dx1 dx1 d2 2 ! 2 x (x1) 3 ∂ dx1 ∂ 0 + ∇ 2 f(x, t) = − f(x, t) . (x ,t) d2 3 ∂x1 2 t(x1) ∂x1 dx1 ∂ ∂2  In view of ∇ 2 f(x, t) = 0, f(x, t) it follows (x ,t) ∂x1 ∂t∂x1

(6.23) ∂²f/∂t∂x1(x, t) · d²t/dx1²(x̄1) = − ∂³f/∂x1³(x, t) , or equivalently d²t/dx1²(x̄1) = − [∂³f/∂x1³(x, t)] / [∂²f/∂t∂x1(x, t)] .
Recall that by (6.17) we have ∂²f/∂t∂x1(x, t) ≠ 0. So, under the assumption ∂³f/∂x1³(x, t) ≠ 0, for the function t(x1) we find (see (6.22)):
(6.24) dt/dx1(x̄1) = 0 and d²t/dx1²(x̄1) ≠ 0 .
Consequently, in this case, the program (6.13) has on C(f) (see Figure 6.9)
(6.25) a strict local maximizer (or minimizer) at (x, t) if d²t/dx1²(x̄1) < 0 (or > 0) .
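Formula (6.23) can be checked symbolically on the example f(x, t) = (1/3)x³ − tx from the beginning of this section (a small sympy sketch, not part of the original text; here n = 1, so the block x² of variables is empty):

```python
import sympy as sp

x1, t = sp.symbols('x1 t', real=True)
f = x1**3 / 3 - t * x1                   # the n = 1 example from this section

# On C(f) the parameter solves f_x1(x1, t(x1)) = 0; here t(x1) = x1**2.
t_of_x1 = sp.solve(sp.diff(f, x1), t)[0]

lhs = sp.diff(t_of_x1, x1, 2).subs(x1, 0)                            # d^2 t / dx1^2 at the turning point
rhs = (-sp.diff(f, x1, 3) / sp.diff(f, t, x1)).subs({x1: 0, t: 0})   # right-hand side of (6.23)
print(lhs, rhs)   # both equal 2: a quadratic turning point, a strict local minimizer of (6.13)
```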


FIGURE 6.9. Sketch of C(f) with a turning point (x, t).

In other words, near (x, t) ∈ Cs(f), the curve C(f) shows a "quadratic turning point". The following theorem states that in the generic case, i.e., for f ∈ Pr, at all points (x, t) ∈ Cs(f) such quadratic turning points occur.

THEOREM 6.7. Let f ∈ Pr. Then for all (x, t) ∈ Cs(f) the following holds.

(a) All critical points (x, t) of (6.13) (i.e., (x, t) ∈ Cs(f), see Lemma 6.3) are nondegen- erate critical points of (6.13). In particular by Lemma 6.4 and (6.24), around all points (x, t) ∈ Cs(f) the set C(f) has a quadratic turning point.

(b) Let, under assumption (6.15), near (x, t) ∈ Cs(f) the solution set C(f) be parameterized by (x1, x²(x1), t(x1)), x1 ≈ x̄1 (see (6.20)). Then for the determinant of ∇x²f(x, t) it holds
d/dx1 det ∇x²f(x1, x²(x1), t(x1)) ≠ 0 .
This means (together with det ∇x²f(x, t) = 0, see (6.10)) that the determinant det ∇x²f(x, t) changes sign when passing on the curve C(f) a point (x, t) ∈ Cs(f). Moreover, the number of positive (negative) eigenvalues of ∇x²f(x, t) changes exactly by one when passing (x, t) ∈ Cs(f).
Proof. For the proofs of (a) and (b) we refer to [21, Theorem 10.23], but for completeness we reproduce the proof of (b).
(b) Under assumption (6.15), by applying the multi-linearity of the determinant, it follows with the abbreviations fij := ∂²f/∂xi∂xj(x, t), f111 := ∂³f/∂x1³(x, t) etc., and using (f11, . . . , fn1) = (0, . . . , 0) (see (6.15)):

$$\frac{\partial}{\partial x_1}\det \nabla_x^2 f(x,t) = \frac{\partial}{\partial x_1}\det\begin{pmatrix} f_{11} & \cdots & f_{1i} & \cdots & f_{1n}\\ \vdots & & \vdots & & \vdots\\ f_{n1} & \cdots & f_{ni} & \cdots & f_{nn}\end{pmatrix} = \sum_{i=1}^{n}\det\begin{pmatrix} f_{11} & \cdots & f_{11i} & \cdots & f_{1n}\\ \vdots & & \vdots & & \vdots\\ f_{n1} & \cdots & f_{1ni} & \cdots & f_{nn}\end{pmatrix}$$
$$= \det\begin{pmatrix} f_{111} & 0 & \cdots & 0\\ f_{112} & \sigma_2 & \cdots & 0\\ \vdots & & \ddots & \\ f_{11n} & 0 & \cdots & \sigma_n\end{pmatrix} = f_{111}\cdot \sigma_2 \cdots \sigma_n \ .$$
(For i ≥ 2 the first column (f11, . . . , fn1) = (0, . . . , 0) remains unchanged, so only the term with i = 1 contributes to the sum.)

Recalling σi ≠ 0, i = 2, . . . , n, by applying (a) combined with Lemma 6.4 it follows that ∂/∂x1 det ∇x²f(x, t) ≠ 0.
It is well-known that the eigenvalues depend continuously on the matrix elements. So near (x, t) the n − 1 eigenvalues of ∇x²f(x, t) near σ2, . . . , σn will not change sign, and in view of ∂/∂x1 det ∇x²f(x, t) ≠ 0 only the eigenvalue near σ1 = 0 will change from positive to negative (or from negative to positive). □
We finish the section with a remark on the set
Cm(f) = {(x, t) ∈ Rn × R | x is a local minimizer of P(t)} ,
corresponding to local minimizers of P(t) (cf., (6.4)) for the one-parametric case t ∈ R. Clearly this set Cm(f) is a subset of C(f). It is also evident that for functions f ∈ Pr, on a connected component of the one-dimensional manifold C(f) \ Cs(f) the (n nonzero) eigenvalues of ∇x²f(x, t) do not change sign. So, suppose that on one "side" of the set C(f) near a point (x, t) ∈ Cs(f) the elements (x, t) ∈ C(f) are such that the points x are strict local minimizers of P(t) (i.e., all eigenvalues of ∇x²f(x, t) are positive). Then according to Theorem 6.7(b) (see Figure 6.9) on the other "side" the Hessian ∇x²f(x, t) has exactly one negative eigenvalue (n − 1 eigenvalues remain positive), and these points (x, t) ∈ C(f) cannot correspond to local minimizers. In case x ∈ R (only one eigenvalue), local maximizers occur on this side. Consequently, generically, the set Cm(f) looks as sketched in Figure 6.10. We also emphasize that for f ∈ Pr, because of the


FIGURE 6.10. Sketch of Cm(f) near a point (x, t) ∈ Cs(f), for f ∈ Pr.

conditions ∂f/∂x1(x, t) = ∂²f/∂x1²(x, t) = 0 and ∂³f/∂x1³(x, t) ≠ 0 (under (6.15)), the critical point (x, t) ∈ Cs(f) can never yield a minimizer (or a maximizer) x of P(t), i.e., this point does not belong to Cm(f).

6.4. Genericity results for parametric systems of nonlinear equations

In this section we examine systems of equations depending on a parameter t ∈ Rp,
E0(t): F(x, t) = 0 , where F ∈ C∞(Rn × Rp, Rm) .
For any fixed t ∈ Rp the problem E0(t) consists of a system of equations as discussed in Section 4.4. Assume that the following condition holds:

n p (6.26) ∇(x,t)F (x, t) has full rank m at all solutions (x, t) ∈ R × R of F (x, t) = 0 .

Under this condition the solution set
S(F) := {(x, t) ∈ Rn × Rp | F(x, t) = 0}
is a C∞-manifold of codimension m (dimension n + p − m) in Rn × Rp. Indeed, under (6.26) the set S(F) satisfies the condition CM1 (cf., Section 5.1). As the main result of this section we will show that the condition (6.26) is a generic property of E0(t). Before analyzing the generic structure of E0(t) we give some illustrative examples.
n = p = m = 1: F(x, t) := x² − t² = 0 with (see Figure 6.11)
S(F) = {(x, t) ∈ R × R | x² − t² = 0} .


FIGURE 6.11. Solution set S(F ) for F (x, t) := x2 − t2 = 0.

Note that at the solution point (x, t) = (0, 0) ∈ S(F) the "regularity condition" in (6.26) is not fulfilled: rank ∇(x,t)F(x, t) = rank (0, 0) = 0 < m. For the perturbed equations F±(x, t) := x² − t² ± ε = 0 (ε > 0 fixed) the sets S(F±) are sketched in Figure 6.12.


2 2 FIGURE 6.12. Solution sets S(F±) of the perturbed equations x − t ± ε = 0.

Both sets S(F±) are manifolds of dimension 1. For F− even the stronger condition is satisfied:
∇xF(x, t) has full rank m at all solutions (x, t) ∈ Rn × Rp of F(x, t) = 0 .
So by the Implicit Function Theorem 7.2 the set S(F−) can be parameterized by functions x(t). This is in contrast to S(F+), where at the solution points (0, ±√ε) ∈ S(F+) we have rank ∇(x,t)F(x, t) = rank (0, ±2√ε) = 1 = m but rank ∇xF(x, t) = rank 0 = 0 < m, and S(F+) has so-called turning points at (0, ±√ε).
n = p = 1, m = 2: F(x, t) := (x − t − 1, x − t²) = (0, 0). Obviously the solution set
S(F) = {(x, t) ∈ R × R | x = t + 1, x = t²} = { ((3 + √5)/2, (1 + √5)/2), ((3 − √5)/2, (1 − √5)/2) }
consists of two points (cf. Figure 6.13). Also here the condition (6.26) is valid and S(F) is a nice manifold, however in this case of dimension 0 in R². We will now show that the condition (6.26) is generically fulfilled.
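Before doing so, the two examples above can be checked directly with a few lines of sympy (an illustrative sketch, not part of the original text): for the system with m = 2 the full Jacobian ∇(x,t)F has rank m at both solution points, whereas for F(x, t) = x² − t² the rank drops at the origin.

```python
import sympy as sp

x, t = sp.symbols('x t', real=True)

# second example: n = p = 1, m = 2
F = sp.Matrix([x - t - 1, x - t**2])
J = F.jacobian([x, t])                               # full Jacobian wrt (x, t)
for sol in sp.solve(list(F), [x, t], dict=True):
    print(sol, 'rank =', J.subs(sol).rank())         # rank 2 = m at both points: (6.26) holds

# first example: F(x, t) = x^2 - t^2, the rank drops at the corner (0, 0)
G = sp.Matrix([x**2 - t**2])
print(G.jacobian([x, t]).subs({x: 0, t: 0}).rank())  # 0 < m = 1: (6.26) fails there
```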


FIGURE 6.13. Set S(F ) for F (x, t) = (x − t − 1, x − t2) = (0, 0) (red points).

THEOREM 6.8. The set ∞ n p m (6.27) Pr = {F ∈ C (R × R , R ) | F satisfies (6.26) } . ∞ n p m k is open and dense (thus generic) in C (R × R , R ) wrt. the Cs -topology, k ≥ 1. n p Moreover, for F ∈ Pr the solution set S(F ) is a manifold in R × R of codimension m. Proof. Consider the function j0F = graph (F ) and the reduced jet-function ˜j0F (x, t) = − F (x, t). By Theorem 5.15 the condition F t {0} holds iff for any solution (x, t) of F (x, t) = 0 we have (use TF (x,t){0} = {0}) n p m ∇F (x, t)[R × R ] + {0} = R , − or equivalently (6.26). Consequently, F t {0} iff F ∈ Pr. According to Lemma 6.1 − 0 − n p n p F t {0} holds iff j F t (R × R × {0}), and since (R × R × {0}) is closed, the statement follows from the Jet Transversality Theorem 6.1. The fact that for f ∈ Pr the set S(f) is a manifold of codimension m has been shown before.  We just present a special case. • case n + p < m: This in particular implies n + p < codim {0}, 0 ∈ Rm, and by Lemma 5.6 (with n replaced by n+p) it follows that generically, i.e., for F ∈ Pr, we have {F (x, t) | (x, t) ∈ Rn × Rp} ∩ {0} = ∅. In other words we have n p S(F ) = {(x, t) ∈ R × R | F (x, t) = 0} = ∅ .

Let us again discuss a stronger condition than in (6.26) which is related with the condition LICQ (see Section 6.5, (6.31)). Considering the system

T ∞ n p F (x, t) = g1(x, t), . . . , gm(x, t) = 0, gj ∈ C (R × R , R), j = 1, . . . , m , we say that at a solution (x, t) ∈ Rn × Rp the condition LICQ is satisfied if the gradients ∇xgj(x, t), j = 1, . . . , m, are linearly independent or equivalently if rank ∇xF (x, t) = m. 102 6. GENERICITY RESULTS FOR PARAMETRIC PROBLEMS

So, a nice condition would be the following (cf. (6.26)): n p (6.28) ∇xF (x, t) has full rank m at all solutions (x, t) ∈ R × R of F (x, t) = 0 . We can proceed as before in Subsection 6.3.1. By taking the Whitney regular stratification of the closed set {0} × M m,n given by m,n ˆ ˆ  ˆ m,n {0} × M , Σ with Σ = Mj := {0} × M (j), j = 0,..., min{n, m} , and by defining the reduced 1-jet function 1  ˆj F (x, t) = F (x, t), ∇xF (x, t) we obtain the following theorem for the set ˆ ∞ n p m ˆ1 − ˆ ˆ ˆ (6.29) Pr = {F ∈ C (R × R , R ) | j F (x, t) t Mj, ∀Mj ∈ Σ} .

ˆ ∞ n p m THEOREM 6.9. The set Pr is open and dense (thus generic) in C (R × R , R ) wrt. the k ˆ Cs -topology, k ≥ 2. Moreover, for F ∈ Pr the inverse images (ˆj1F )−1{0} × M m,n(j) , j = 0,..., min{n, m} ,

m,n  are submanifolds in S(F ) of codimension cd = codim M (j) = (m − j)(n − j), and these sets form a Whitney regular stratification of the set S(F ) = (ˆj1F )−1{0} × M m,n. Let us also consider some special cases. ˆ • case n + p < m: Then as before, for F ∈ Pr we find n p S(F ) = {(x, t) ∈ R × R | F (x, t) = 0} = ∅ . • case n + p = m: In particular m ≥ n. By Theorem 6.9 the sets (ˆj1F )−1({0} × m,n M (j)) are manifolds of codimension cd = (m − j)(n − j) in S(F ). So for ˆ F ∈ Pr,  discrete set for j = n (x, t) ∈ n+p | F (x, t) = 0, rank ∇ F (x, t) = j} = . R x ∅ for j < n ˆ • general case for F ∈ Pr: If n + p < m + (m − j)(n − j) then  n+p (x, t) ∈ R | F (x, t) = 0, rank ∇xF (x, t) = j} = ∅ . 6.5. Genericity results for parametric constrained programs This section deals with parametric constrained programs of the form: For t ∈ Rp solve n (6.30) P (t) : min f(x, t) s.t. x ∈ F(t) = {x ∈ R | gj(x, t) ≤ 0, j ∈ J} , ∞ n p where J = {1, . . . , m} and the functions f, gj, j ∈ J, are in C (R × R , R). For fixed t ∈ Rp the problem P (t) represents a constrained program as studied in Section 4.5. T We again use the abbreviation G = (gj, j ∈ J) and often write P (f, G), F(G) instead of P (t), F(t), if we wish to stress on the fact that the program is defined by the problem functions (f, G). 6.5. GENERICITY RESULTS FOR PARAMETRIC CONSTRAINED PROGRAMS 103

Let (x, t), x ∈ F(t), t ∈ Rp. As before we define the active index set J(x,t) = {j ∈ J | gj(x, t) = 0} and the Lagrangean function near (x, t):
L(x, t, µ) = f(x, t) + Σ_{j∈J(x,t)} µj gj(x, t) .

Recall that the linear independence constraint qualification (LICQ) is said to hold at (x, t) if the gradients

(6.31) ∇xgj(x, t), j ∈ J(x,t) , are linearly independent .

For a better understanding of the structure of the parametric problem P (t) we will not only look at local minimizers of P (t) but also at generalized critical points.

DEFINITION 6.1. Given a problem P(t) = P(f, G), let (x, t), x ∈ F(t), t ∈ Rp.
(a) The point (x, t) with x ∈ F(t) is called a generalized critical point (gc-point) of P(t), if there exist Lagrangean multipliers µ0 ∈ R, µj ∈ R, j ∈ J(x,t), not all zero, such that
(6.32) µ0 ∇xf(x, t) + Σ_{j∈J(x,t)} µj ∇xgj(x, t) = 0 ,
or equivalently that ∇xf(x, t), ∇xgj(x, t), j ∈ J(x,t), are linearly dependent.
(b) The point (x, t) with x ∈ F(t) is said to be a critical point of P(t), if LICQ holds at (x, t), and there are multipliers µj, j ∈ J(x,t) (unique by LICQ), such that
(6.33) ∇xf(x, t) + Σ_{j∈J(x,t)} µj ∇xgj(x, t) = 0 .

In the sequel Cgc denotes the set of all gc-points of P (t) and Cm denotes the set Cm = {(x, t) | x is a local minimizer of P (t)} . Note that (by the John-condition, cf., [13]) any local minimizer of P (t) must necessarily be a gc-point, i.e., Cm ⊂ Cgc.

DEFINITION 6.2. A generalized critical point (x, t) of P (t) is called a nondegenerate critical point if (6.33) holds together with • LICQ (i.e., (x, t) is a critical point) • strict complementarity,

(6.34) (SC): µj 6= 0 ∀j ∈ J(x,t) , • and the second order condition

T 2 (6.35) (SOC): d ∇xL(x, t, µ)d 6= 0 ∀d ∈ T(x,t) \{0} ,

n where T(x,t) is the tangent space T(x,t) = {d ∈ R | ∇xgj(x, t)d = 0, j ∈ J(x,t)}. The set of nondegenerate critical points is denoted by Cr. 104 6. GENERICITY RESULTS FOR PARAMETRIC PROBLEMS

6.5.1. Parametric constrained programs. In this subsection we study the generic properties of general parametric problems P (t) in (6.30). Recall from Section 4.5 that in the nonparametric case, generically all local minimizers x are nondegenerate, i.e., according to Theorem 4.8 the conditions LICQ, SC, and SOC are satisfied at x. In the same way we could have proven that generically, in the nonparametric case, all gc-points are nondegenerate critical points (cf., Definition 6.2). However again, it is clear that for parametric programs we cannot expect that generically for all t ∈ Rp all gc-points of P (t) are nondegenerate critical points. More precisely, the higher the dimension p of the parameter space Rp is, the more degeneracies may occur in the generic situation. In what follows, we are interested in a more detailed description of the generic structure of n p the set Cr of gc-points (x, t) ∈ R ×R which are nondegenerate and the set Cs := Cgc\Cr, where at least one of the nice conditions LICQ, SC, or SOC fails.

We start with the set Cr of a given problem P(t) = P(f, G). A critical point (x, t) is a solution of the system
$$\text{(6.36)}\qquad H(x, t, \mu) := \begin{pmatrix} \nabla_x^T f(x,t) + \sum_{j\in J_{(x,t)}} \mu_j \nabla_x^T g_j(x,t) \\ g_j(x,t),\ j \in J_{(x,t)} \end{pmatrix} = 0$$
with some µ. If (x, t) is nondegenerate, then according to Ex. 4.4 the Jacobian H(x,µ) of H at (x, t, µ) (with GJ(x,t) = (gj(x, t), j ∈ J(x,t))^T),
$$H_{(x,\mu)}(x, t, \mu) = \begin{pmatrix} \nabla_x^2 L(x,t,\mu) & \nabla_x G_{J_{(x,t)}}(x,t)^T \\ \nabla_x G_{J_{(x,t)}}(x,t) & 0 \end{pmatrix},$$
is nonsingular. So near (x, t, µ) we can apply the Implicit Function Theorem 7.2 to the system (6.36). This theorem implies that in a neighborhood U(x,t) × Uµ of (x, t, µ) the solution set of (6.36) is a p-dimensional manifold parameterized by locally defined C∞-functions (x(t), µ(t)). Moreover, the points (x(t), t) ∈ U(x,t) are nondegenerate critical points of P(t), see Figure 6.14.


FIGURE 6.14. Local structure of Cr near a nondegenerate critical point (x, t) of P (t).
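The Implicit Function Theorem argument above can be mimicked numerically by a simple continuation ("pathfollowing") scheme: solve the critical point system (6.36) on a grid of parameter values, warm-starting each solve at the previous solution. The following Python sketch does this for a small made-up problem (the data f and g below are illustrative assumptions, not taken from the text; scipy is assumed to be available).

```python
import numpy as np
from scipy.optimize import fsolve

# Assumed example: n = 2, one active constraint kept active along the branch,
# f(x, t) = x1^2 + 2*x2^2,  g(x, t) = x1 + x2 - t.
def H(z, t):
    x1, x2, mu = z
    return [2 * x1 + mu,        # d/dx1 L(x, t, mu)
            4 * x2 + mu,        # d/dx2 L(x, t, mu)
            x1 + x2 - t]        # active constraint g(x, t) = 0

z = np.array([0.0, 0.0, 0.0])   # start on the branch at t = 0
for t in np.linspace(0.0, 1.0, 11):
    z = fsolve(H, z, args=(t,)) # warm-started solve of the system (6.36)
    print(f"t={t:4.1f}  x(t)=({z[0]:.4f}, {z[1]:.4f})  mu(t)={z[2]:.4f}")
```

Here the Jacobian of H with respect to (x, µ) is nonsingular along the whole branch, so the printed pairs (x(t), µ(t)) trace exactly the local C∞-curve of nondegenerate critical points described above.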

We now look at the set Cs = Cgc \ Cr of gc-points where at least one of the conditions 6.5. GENERICITY RESULTS FOR PARAMETRIC CONSTRAINED PROGRAMS 105

LICQ, SC, or SOC fails. We give a rough picture for the general case of parametric pro- grams P (t), t ∈ Rp. Only for the one-parametric case (p = 1) a complete description has been accomplished until now (see Subsection 6.5.2 for details).

n p We are interested in points (x, t) ∈ R × R from the singular critical set Cs and consider the failure of LICQ, SC, and SOC for a program P (t) = P (f, G). To that end we introduce 2 n p m nm+1 nm+1 the reduced 2-jet function ˜j (f, G): R × R → R × R × S defined by (6.37) ˜2 2 2  j (f, G)(x, t) := G(x, t), ∇xf(x, t), ∇xgj(x, t), j ∈ J, ∇xf(x, t), ∇xgj(x, t), j ∈ J . T In what follows, we identify (G = (gj, j ∈ J) ) T T (6.38) a ≡ G(x, t), b0 = ∇x f(x, t),B ≡ ∇xG(x, t) , 2 2 A0 ≡ ∇xf(x, t),Aj ≡ ∇xgj(x, t), j ∈ J, and translate the failure of LICQ , SC, SOC into semi-algebraic conditions in the variables  m nm+1 nm+1 (6.39) a, b0,B,A0,Aj, j ∈ J ∈ R × R × S =: N .

LICQ: Points (x, t) ∈ Cs, where LICQ is not fulfilled with J(x,t) = J0 for some J0 ⊂ J, satisfy the conditions

GJ0 (x, t) = 0  rank ∇xGJ0 (x, t) = k < |J0| ≤ m .

In terms of the variables aJ0 ,BJ0 = [B·j, j ∈ J0] (see (6.38)) these conditions (for J0 fixed) read,

(6.40) aJ0 = 0, rank BJ0 = k < |J0| ≤ m . They can easily be shown to be linearly independent and they define a C∞-semi-algebraic manifold SJ0 (k) ⊂ N (see (6.39)) of codimension (cf., Theorem 5.11),

codim SJ0 (k) = |J0| + (n − k)(|J0| − k) ≥ |J0| + (n − |J0| + 1) = n + 1 . Note that in case LICQ fails at (x, t), this point is automatically a gc-point. Moreover, for

|J0| > n + p, generically, there is no solution of aJ0 ≡ GJ0 (x, t) = 0 (cf., the special case ”n + p < m” at the end of Section 6.3).

SC: We consider points (x, t) ∈ Cs, where SC fails but LICQ holds, with J(x,t) = J0 for some J0 ⊂ J. Such points satisfy for some j0 ∈ J0 (with µj0 = 0) the following conditions:

GJ0 (x, t) = 0   T T rank ∇xGJ0\{j0}(x, t) | ∇x f(x, t) = k ≤ |J0| − 1 .

In terms of the variables aJ0 ,BJ0\{j0}, b0 (see (6.38)) these relations give, T  (6.41) aJ0 = 0, rank BJ0\{j0} | b0 = k ≤ |J0| − 1 , 106 6. GENERICITY RESULTS FOR PARAMETRIC PROBLEMS and these conditions are linearly independent. They also define a C∞-semi-algebraic man- ifold SJ0,j0 (k) ⊂ N (see (6.39)) of codimension (cf., Theorem 5.11),

codim SJ0,j0 (k) = |J0| + (n − k)(|J0| − k) ≥ |J0| + (n − |J0| + 1) = n + 1 .

SOC: Here, we look at points (x, t) ∈ Cs, where SOC fails but LICQ holds, with J(x,t) =  J0 for some J0 ⊂ J. Such a point in particular satisfies the conditions, rank ∇xGJ0 (x, t) = |J0| and   T T (6.42) GJ0 (x, t) = 0, rank ∇xGJ0 (x, t) | ∇x f(x, t) = |J0| ≤ n .

T T We again identify aJ0 ≡ GJ0 (x, t),BJ0 ≡ ∇xGJ0 (x, t) , b0 ≡ ∇x f(x, t) etc. Then the T T critical point equation (6.33) yields BJ0 µ = −b0 or after multiplication by BJ0 , BJ0 BJ0 µ = T T −BJ0 b0 and since BJ0 BJ0 is nonsingular the unique Lagrangean multiplier is given by T −1 T µ = µ(b0,BJ0 ) = − BJ0 BJ0 BJ0 b0 . The failure of SOC (see Ex. 4.4) means that with the matrix A := A + P µ A ∈ Sn 0 j∈J0 j j 2 (recall A0 ≡ ∇xf(x, t) etc.), the relation   ABJ0 (6.43) rank T = k ≤ n + |J0| − 1 , BJ0 0 holds. Again, the conditions (6.42) and (6.43) (together with LICQ, i,e, rank BJ0 = |J0|) SOC define a semi-algebraic set SJ0 (k) ⊂ N (cf., (6.39)) of codimension (in both cases |J0| < n and |J0| = n) SOC 2 codim SJ0 (k) ≥ |J0| + (n − |J0|)(|J0| + 1 − |J0|) + (n + |J0| − n − |J0| + 1) = n + 1 .

SOC Now we can consider the union of all semi-algebraic sets SJ0 (k),SJ0,j0 (k),SJ0 (k) of codimension ≥ n + 1 above, corresponding to the sets where LICQ, SC, or SOC are not satisfied: [ [ [ SOC S1 := SJ0 (k) ,S2 = SJ0,j0 (k) ,S3 = SJ0 (k) . k≤|J0|−1, k≤|J0|−1, k≤n+|J0|−1, J0⊂J J0⊂J,j0∈J0 J0⊂J

Each of these sets S1,S2,S3 is closed. So also the union

S := S1 ∪ S2 ∪ S3 m nm+1 nm+1 is a closed semi-algebraic subset of N = R × R × S with codim S ≥ n + 1 . This closed semi-algebraic set S allows a Whitney regular stratification (see, Theorem 5.9)

(S, Σ) , Σ = {Xi, i ∈ I} , with finitely many strataXi, i ∈ I, and we can define the set Pr of nice problems P (t) = P (f, G) (cf., (6.37)) ∞ n p m+1 ˜2 − Pr = {(f, G) ∈ C (R × R , R) | j (f, G) t Xi ∀Xi ∈ Σ} . According to the jet-transversality Theorem 6.2 we obtain 6.5. GENERICITY RESULTS FOR PARAMETRIC CONSTRAINED PROGRAMS 107

∞ n p m+1 k COROLLARY 6.4. The set Pr is open and dense in C (R × R , R) wrt. the Cs - topology for all k ≥ 3. ˜2 −1 n p The sets j (f, G) (Xi) are manifolds in R ×R with codimension cd = codim Xi, i ∈ I, satisfying cd ≥ n + 1.

n p By definition, the set Cgc is a closed subset of R × R , and as discussed above (cf., n p Figure 6.14), the set Cr ⊂ Cgc of nondegenerate critical points is a manifold in R × R of dimension p (codimension n). ˜2 −1 n p ˜2 The set Cs = Cgc \ Cr equals j (f, G) (S) = {(x, t) ∈ R × R | j (f, G)(x, t) ∈ S} ˜2 −1 n p and this set consists of manifolds j (f, G) (Xi), Xi ∈ Σ, of R × R of codimension ≥ n + 1. Recall that by definition the set Cs contains precisely the gc-points of P (f, G) where (at least) one of the conditions LICQ, SC, or SOC fails. We finish this subsection with some remarks on problems P (t) = P (f, G) for (f, G) from the generic set Pr. 2 • For p = 1 the mapping ˜j (f, G) avoids strata Xi ∈ Σ with codim Xi ≥ n + 2 and possibly intersects the strata of codimension n + 1 at a discrete set of gc-points (x, t) from Cs. • We emphasize that until now, only for p = 1 a complete description of the generic structure of the set Cgc has been achieved. To obtain more details on generic properties for the case p > 1, we would need a more specific stratification Σ = {Xi, i ∈ I} of the jet set N such that each stratum corresponds to a specific distribution of degeneracies a gc-point (x, t) may show in the generic situation. More precisely, we should define strata Xi which precisely correspond, e.g., to situations where LICQ and SC fails but not SOC, or where SC and SOC fails but not LICQ etc. Even for p = 2 this becomes very complex.

REMARK 6.2. Let us mention that the genericity results for parametric problems of the present chapter contains the genericity results for the corresponding non-parametric prob- lems in Chapter 4 as a special case if we chose p = 0. Unconstrained programs: Note that all arguments at the beginning of Subsection 6.3.1 re- 1 main true if we chose p = 0. For this choice the set Pr in Lemma 6.2 equals the set Pr in (4.6), and for p = 0 Lemma 6.2 coincides with Theorem 4.5.

Systems of nonlinear equations: Here the set Pr in Theorem 6.8 for p = 0 equals the set Pr in (4.10) and Theorem 6.8 coincides with Theorem 4.6. Constrained programs: Also here, for p = 0 the statement of Corollary 6.4 yields the genericity result of Theorem 4.9 for non-parametric programs.

6.5.2. One-parametric constrained programs. We are especially interested in parametric programs P (t) (cf., (6.30)) with t ∈ R, i.e., in the one-parametric case p = 1. Here, the generic behavior of the sets Cgc of generalized critical points and the sets Cm corresponding to local minimizers is completely understood and analysed in the original papers of Jongen et al. [23, 22] and the books [21, 17]. We do not go into technical details 108 6. GENERICITY RESULTS FOR PARAMETRIC PROBLEMS and only summarize the generic properties as presented in [17, Chapter 2], and offer some illustrative examples. For so-called pathfollowing methods to trace numerically the curves of gc-points of one- parametric programs P (t), we also refer to [17].

As we shall see, generically in the sets Cgc only 5 types of gc-points can appear, namely nondegenerate critical points (Type 1), and points of Types 2-5, where precisely one of the regularity conditions SC, SOC, LICQ fails. We roughly describe the generic behavior of Cgc (or Cm) near a gc-point (x, t) of P (t) = P (f, G) for each type and finally present the genericity results. Type 1: These are nondegenerate critical points (x, t) (see Definition 6.2) where LICQ,

SC, and SOC are satisfied. We have seen above that near such a point (x, t) the set Cgc consists of a one-dimensional manifold given by a C∞-curve (x(t), t), t ≈ t, t ∈ R (see Figure 6.15).


FIGURE 6.15. Sketch of Cgc near a point (x, t) of P (t) of Type 1.

In case that for the point (x, t) of Type 1 the vector x is a nondegenerate local minimizer of P (t), locally near (x, t), this curve (x(t), t) consists of nondegenerate local minimizers x(t) of P (t). Type 2: These are critical points (x, t), where SC fails but LICQ and SOC holds. More precisely we have

(2a) J(x,t) 6= ∅, and precisely one of the multipliers in (6.33) is equal to zero, say:

µp = 0, for one p ∈ J(x,t) . (2b) and 3 other specific conditions are satisfied. All these conditions depend explicitly on quantities which are completely deter- mined by (x, t) and f, G (see [17, p. 43] for details).

Near such a point (x, t) of Type 2 there are two C∞-curves (x(t), t) and (˜x(t), t), t ≈ t (see Figure 6.16), where the first curve (in blue) consists of critical points with J(x(t),t)) = J(x,t) ∞ and with a corresponding C -multiplier vector µ(t). The component µp(t) changes from positive to negative (or from negative to positive) when passing t = t on this curve. 6.5. GENERICITY RESULTS FOR PARAMETRIC CONSTRAINED PROGRAMS 109


FIGURE 6.16. Sketch of Cgc near a point (x, t) of Type 2.

The other curve (x̃(t), t) (in red) consists of gc-points of the program P̃(t) obtained from P(t) by skipping the constraint gp(x, t) ≤ 0. Half of this curve (red line segment) satisfies gp(x̃(t), t) < 0, t < t (or t > t), and thus also belongs to Cgc. But the other, red dashed part satisfies gp(x̃(t), t) > 0, t > t (or t < t), and thus does not belong to Cgc (it is not feasible for P(t)). So generically, near a critical point of Type 2, the set Cgc looks as sketched in Figure 6.17.


FIGURE 6.17. Set Cgc near a point (x, t) of Type 2.

Moreover, if on one branch of the set Cgc there are local minimizers, then near (x, t), the set Cm looks topologically as in Figure 6.18.

Type 3: At critical points (x, t) of this type the condition SOC fails but LICQ and SC holds. More precisely, the critical point equation ∇xL(x, t, µ) = 0 (cf., (6.33)) holds such that with V = [v1 ... vk], where vj, j = 1, . . . , k, k = n − |J(x,t)|, is a basis of the tangent space T(x,t) (cf., Definition 6.2), it follows:

(3a) V^T ∇x²L(x, t, µ)V has exactly one zero eigenvalue,
(3b) and one further specific condition is satisfied. This condition is completely determined by (x, t) and f, G (see [17, p. 44f.] for details).


FIGURE 6.18. Set Cm near a point (x, t) of Type 2.

Near such a point (x, t) of Type 3 the set Cgc looks like sketched in Figure 6.19, i.e., (x, t) T 2 is a turning point. On these curves, precisely one eigenvalue of V ∇xL(x, t, µ)V changes from positive to negative (or from negative to positive) when passing (x, t).


FIGURE 6.19. Sketch of Cgc near a point (x, t) of Type 3.

T 2 Consequently, since for nondegenerate local minimizers all eigenvalues of V ∇xL(x, t, µ)V must be positive, near a Type 3 critical point (x, t), (in case one ”side” of Cgc corresponds to local minimizers) the set Cm must topologically look like in Figure 6.20. Note that the point (x, t) need not belong to Cm.


FIGURE 6.20. Sketch of Cm near a point (x, t) of Type 3.

Type 4: For these critical points (x, t) LICQ fails with |J(x,t)| ≤ n. More precisely it holds,

(4a) 0 < |J(x,t)| ≤ n, dim span {∇xgj(x, t), j ∈ J(x,t)} = |J(x,t)| − 1,

(4b) and 3 further specific conditions are satisfied. These conditions are completely determined by (x, t) and f, G (see [17, p. 46f.] for details).

Near such a point (x, t) of Type 4 the set Cgc looks like given in Figure 6.21, i.e., (x, t) is a turning point.


FIGURE 6.21. Sketch of Cgc near a point (x, t) of Type 4.

On these curves the signs of all multipliers µj, j ∈ J(x,t), and the signs of all eigenvalues of T 2 V ∇xL(x, t, µ)V change (V as in Type 3). So as for Type 3 (in case one ”side” corre- sponds to local minimizers) the set Cm must topologically look like in Figure 6.22, where the point (x, t) need not belong to Cm.


FIGURE 6.22. Sketch of Cm near a point (x, t) of Type 4.

Type 5: These are gc-points (x, t) where LICQ fails with |J(x,t)| = n+1. More precisely we have,

(5a) Since |J(x,t)| = n+1, the gradients ∇xgj(x, t), j ∈ J(x,t), are linearly dependent, (5b)

(6.44) ∇(x,t)gj(x, t), j ∈ J(x,t), are linearly independent, (5c) and 2 further specific conditions are satisfied. These conditions are completely determined by (x, t) and f, G (see [17, p. 48f.] for details).

By using (6.44) and |J(x,t)| = n + 1 one can show that for each q ∈ J(x,t) the point (x, t) of Type 5 is a nondegenerate critical point of the program Pq(t) obtained from P(t) by skipping the constraint gq(x, t) ≤ 0. So near (x, t), for each q ∈ J(x,t) there exists a curve (xq(t), t), t ≈ t, of nondegenerate critical points of Pq(t) (see Figure 6.23). One "side" of this curve corresponds to points satisfying gq(x, t) < 0 (in blue), whereas on the other "side" (in red) the condition gq(x, t) > 0 is fulfilled, so that these points of the curve are not feasible for P(t). Thus only the blue part together with (x, t) belongs to Cgc.


FIGURE 6.23. Sketch of the curve (xq(t), t) near a point (x, t) of Type 5.

Any q ∈ J(x,t) gives rise to such a curve segment in Cgc (see Figure 6.24). The local picture of Cgc now depends on whether the Mangasarian Fromovitz Constraint Qualification is satisfied at (x, t) or not.

DEFINITION 6.3. The Mangasarian Fromovitz Constraint Qualification (MFCQ) is said to hold at (x, t), x ∈ F(t), t ∈ R, if there exists a vector ξ ∈ Rn satisfying,

∇xgj(x, t)ξ < 0 , j ∈ J(x,t) . Note that MFCQ is weaker than LICQ.
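Whether MFCQ holds at a given point can be decided by a small linear program: maximize a margin s subject to ∇xgj(x, t)ξ ≤ −s for all active j and box bounds on ξ; MFCQ holds exactly when the optimal margin is positive. The Python sketch below (an illustration under these assumptions, using scipy's linprog; it is not part of the original text) implements this test for given active-constraint gradients.

```python
import numpy as np
from scipy.optimize import linprog

def mfcq_holds(active_grads, tol=1e-9):
    """Is there xi with grad_gj(x,t) @ xi < 0 for all active j?

    active_grads: (k, n) array whose rows are the x-gradients of the active constraints.
    Solve the LP  max s  s.t.  grad_gj @ xi + s <= 0,  -1 <= xi_i <= 1,  0 <= s <= 1;
    MFCQ holds iff the optimal s is strictly positive.
    """
    A = np.asarray(active_grads, dtype=float)
    k, n = A.shape
    c = np.zeros(n + 1); c[-1] = -1.0                  # maximize s == minimize -s
    A_ub = np.hstack([A, np.ones((k, 1))])
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(k),
                  bounds=[(-1, 1)] * n + [(0, 1)])
    return res.status == 0 and -res.fun > tol

print(mfcq_holds([[1.0, 0.0], [0.0, 1.0]]))    # True : xi = (-1, -1) is a strict descent direction
print(mfcq_holds([[1.0, 0.0], [-1.0, 0.0]]))   # False: opposite gradients, MFCQ fails
```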

If (x, t) is a point of Type 5 satisfying MFCQ then near (x, t) the set Cgc, resp, Cm looks like sketched in Figure 6.24, resp., Figure 6.25.


FIGURE 6.24. Sketch of Cgc near a point (x, t) of Type 5 where MFCQ holds.


FIGURE 6.25. Sketch of Cm near a point (x, t) of Type 5 where MFCQ holds.

If MFCQ fails at (x, t) then this point is an ”endpoint” of Cgc and Cm and these sets have the form as sketched in Figure 6.26, Figure 6.27. Concerning Cm, for this Type 5 the point (x, t) belongs to Cm.


FIGURE 6.26. Sketch of Cgc near a point (x, t) of Type 5 where MFCQ fails.


FIGURE 6.27. Sketch of Cm near a point (x, t) of Type 5 where MFCQ fails.

Now we summarize the genericity results for one-parametric problems P(t) = P(f, G). Let us introduce the set
Pr = {(f, G) ∈ [C∞(Rn × R, R)]^{m+1} | each point in Cgc belongs to one of the Types 1-5} .

We also define the sets i Cgc = {(x, t) ∈ Cgc | (x, t) is of Type i} , i = 1,..., 5 .

THEOREM 6.10. [Genericity result of the 5 Types]
(a) The set Pr is open and dense in [C∞(Rn × R, R)]^{m+1} wrt. the Cs^k-topology for each k ≥ 3.
(b) For (f, G) ∈ Pr the set Cgc^1 is dense and (relatively) open in Cgc. Moreover, all sets Cgc^i, i = 2, . . . , 5, are discrete point sets.
Proof. Part (a) is Theorem 2.1 in [22]. Part (b) coincides with Theorem 2.2 in [22], and as argued in [22], this statement is a direct implication of the local structure of the set Cgc near a point of Type 1-5 as discussed above. □

We finish the subsection with some additional observations on gc-points of the one-parametric program P (t) = P (f, G).

• By Definition 6.1 the set Cgc of gc-points is always closed, and the (relatively) 1 n open set Cgc is a one-dimensional manifold in R × R. So, for (f, G) ∈ Pr, by 1 Theorem 6.10(b), the set Cgc \ Cgc consists of a discrete point set: 1 i Cgc \ Cgc = ∪i=2,...,5 Cgc .

• Note that Cgc is closed but the set Cm need not be closed. Recall, e.g., the program (unconstrained, J = ∅)

P(t): min_{x∈R} f(x, t) := (1/3)x³ − tx , t ∈ R ,
with Cgc = {(x, t) | x² = t, t ≥ 0} (see Figure 6.28).


FIGURE 6.28. Critical point set Cgc for f(x, t) := (1/3)x³ − tx with turning point (x, t) = (0, 0).

The upper part (in blue) consists of points (x, t) with minimizers x of P(t) (∇x²f(x, t) = 2x > 0), and the points on the lower part (in red) correspond to maximizers (∇x²f(x, t) = 2x < 0). But the turning point (x, t) = (0, 0) is only a critical point (for min (1/3)x³) and x is neither a minimizer nor a maximizer of P(0).

To illustrate the mechanism of the generic structure of the sets Cgc and Cm of a one-parametric program P(t) = P(f, G) near a gc-point (x, t) of Types 2, 3, 4, and 5, we present some typical examples for x ∈ R², t ∈ R. These examples will not be given explicitly but are visualized by sketches of the changes in the feasible sets (in blue) and the level sets of f (in red, dashed) for t near t. We always indicate the situation for a point t < t, for t = t, and a value t > t. A point x in R², numbered, e.g., by 1, 2, or 3, always belongs to a gc-point (x, t) of P(t) on a branch of the set Cgc (or Cm) with the same number (see, e.g., Figure 6.29 and Figure 6.30 (left)). The direction −∇f indicates the direction of the negative gradient of f at a point of the level line of f.
Examples for Type 2: A first example is given by the following sketches in R² of the feasible sets, with 2 constraints g1, g2 ≤ 0, and a level set of f. As the intersection of two active constraints


FIGURE 6.29. Picture for: t < t, t = t, t > t.

(g1 = g2 = 0) in R², the point 2 represents a gc-point. So, these pictures correspond to sets Cgc and Cm in R² × R of the form given in Figure 6.30.


FIGURE 6.30. Sketch of Cgc (left), Cm (right) near the gc-point (x, t) of Type 2.

In the second example of Type 2, the objective function is (up to small C∞-perturbations) of the form f(x1, x2) = x1x2, with level lines as in Figure 6.31.


FIGURE 6.31. Some level lines for f(x) = x1x2.

In this example we are given one constraint g ≤ 0 and the program is sketched in Fig- ure 6.32. Here, at point 2, the condition ∇xf(x, t) = 0 is always satisfied, so that this


FIGURE 6.32. Picture for: t < t, t = t, t > t.

point (if feasible) belongs to a gc-point (neither a minimizer nor a maximizer). With the orientation of −∇f, point 1 is a critical point (neither a minimizer nor a maximizer) and point 3 is a local minimizer. So, for this problem the sets Cgc and Cm have the form given in Figure 6.33.
Example for Type 3: A typical example has been presented before in Section 6.3 by the unconstrained minimization problem (J = ∅) min f(x, t) = (1/3)x³ − tx, with (x, t) ∈ R × R (see the sketches of the set of critical points C(f) = Cgc in Figure 6.6).
Example for Type 4: In the first example MFCQ is satisfied at the gc-point (x, t), in the second MFCQ fails. In both examples we are given two constraints g1 ≤ 0 and g2 ≤ 0. Consider the sketches in Figure 6.34.


FIGURE 6.33. Sketch of Cgc (left), Cm (right) near the gc-point (x, t) of Type 2.


FIGURE 6.34. Picture for: t < t, t = t, t > t.

For t < t the points 1 and 2 are intersection points of two linearly independent constraints (g1 = g2 = 0), and they therefore correspond to critical points. Also for t = t, the point (x, t) is a generalized critical point. With the indicated orientation of −∇f, the points 1, 2 (and x) belong neither to minimizers nor to maximizers. For t > t there is no gc-point (x, t) near (x, t). So in a neighborhood of (x, t) the set Cgc looks like sketched in Figure 6.35, and Cm is empty.


FIGURE 6.35. Sketch of Cgc near the gc-point (x, t) of Type 4.

In the second example program P(t) the condition MFCQ fails at (x, t) (see Figure 6.36). For t < t, with the indicated orientation of −∇f, the point 1 corresponds to a local minimizer and point 2 represents a local maximizer. For t = t, the point x is an isolated feasible point and thus x is a local minimizer (and maximizer) of P(t). For t > t, near x the feasible set of P(t) is empty. So for this example the picture of Cgc, Cm is as in Figure 6.37.


FIGURE 6.36. Picture for: t < t, t = t, t > t.


FIGURE 6.37. Sketch of Cgc (left) and Cm (right) near the gc-point (x, t) of Type 4.

Example for Type 5: We again present sketches of two example programs. In the first example MFCQ is satis- fied at the gc-point (x, t), in the second not. In both examples we are given three constraints g1 ≤ 0, g2 ≤ 0, and g3 ≤ 0. Consider the pictures in Figure 6.38. With this orientation of −∇f the point 1 belongs to


FIGURE 6.38. Picture for: t < t, t = t, t > t.

a local minimizer for t < t and t = t. For t > t point 2 belongs to a local minimizer and point 3 to a critical point. So, in a neighborhood of (x, t) the sets Cgc and Cm look like presented in Figure 6.39. Now look at the second example program P(t) where MFCQ fails at (x, t), see Figure 6.40. For t < t, the feasible set and thus Cgc is (locally) empty. For t = t, the


FIGURE 6.39. Sketch of Cgc (left), Cm (right) near the gc-point (x, t) of Type 5 where MFCQ holds.


FIGURE 6.40. Picture for: t < t, t = t, t > t.

isolated feasible point (x, t) is a local minimizer and maximizer, and with the indicated orientation of −∇f, for t > t, the point 2 corresponds to a local minimizer and point 3 represents a local maximizer. So, the picture of Cgc, Cm is as in Figure 6.41.


FIGURE 6.41. Sketch of Cgc (left), Cm (right) near the gc-point (x, t) of Type 5 where MFCQ fails.

6.6. One-parametric quadratic and one-parametric linear programs

In this section we examine two special classes of parametric constrained programs, namely one-parametric quadratic and one-parametric linear programs. The aim is to discuss the differences between the generic behaviour of one-parametric general constrained programs P(t) (cf., (6.30)) analysed in Section 6.5.2 (results of the 5 Types) and the generic structure for one-parametric quadratic or linear programs. The results in particular show again that genericity results obtained for a class of problems need no longer be valid for a subclass of problems with a special substructure. Unfortunately, any special subclass needs a new specific genericity analysis.

6.6.1. Genericity results for one-parametric quadratic programs. In this subsection we consider one-parametric problems of the type
(6.45) Q(t): min f(x, t) := (1/2) x^T C(t)x + c(t)^T x s.t. x ∈ F(t) = {x ∈ Rn | B(t)x ≤ b(t)} ,
depending on the parameter t ∈ R, where C ∈ C∞(R, Sn), c ∈ C∞(R, Rn), B ∈ C∞(R, M^{m,n}), b ∈ C∞(R, Rm). So, Q(t) has a quadratic objective and m linear constraints

gj(x, t) := Bj·(t)x − bj(t) ≤ 0 , j ∈ J := {1, . . . , m} , and is given by the problem data ∞ n n mn m ∞ N (6.46) (C, c, B, b) ∈ C (R,S × R × R × R ) ≡ C (R, R ) , 1 where N = 2 (n + 3)n + (n + 1)m. Often we write Q(C, c, B, b) instead of Q(t). Recall that a point (x, t), x ∈ F(t), is a gc-point of Q(t) if the gradients T T T ∇x f(x, t) = C(t)x + c(t) , ∇x gj(x, t) = Bj·(t) , j ∈ J(x,t), are linearly dependent . T LICQ holds at (x, t) if the vectors Bj·(t) , j ∈ J(x,t), are linearly independent. Then at a gc-point (x, t) satisfying LICQ the critical point relation holds:

X T (6.47) C(t)x + c(t) + µjBj·(t) = 0 , with unique µj .

j∈J(x,t) As usual we say that at a critical point (x, t)

(6.48) (SC) holds, if in (6.47): µj 6= 0 for all j ∈ J(x,t) , n and with T(x,t) := {d ∈ R | Bj·(t)d = 0, j ∈ J(x,t)} T (6.49) (SOC) holds, if: d C(t)d 6= 0 for all d ∈ T(x,t) \{0} .

As before, we are interested in the generic structure of the set Cgc of gc-points of Q(t). This structure has been completely described in [27]. Here we will only present the main results, mostly without proofs. It appears that generically for Q(t) only gc-points of Types 1, 2, 5 occur.
Type 1: A gc-point (x, t) where LICQ, SC, and SOC hold. The set of all such critical points is denoted by Cgc^1.
Type 2: A gc-point (x, t) where LICQ holds but SC fails with precisely one multiplier equal to zero: µp = 0, for one p ∈ J(x,t). Moreover, two other conditions hold (see [27, Definition 2.2]).
Type 5: A gc-point (x, t) where LICQ fails with |J(x,t)| = n + 1. Moreover, 3 other conditions hold (see [27, Definition 2.3]).
These types are the same as for the general constrained programs P(t) in Subsection 6.5.2. Recall that, as for general programs in Subsection 6.5.2, the set Cgc^1 of Type 1 gc-points forms a one-dimensional C∞-manifold (locally) parameterized by a C∞-curve (x(t), t).

However, different from the general programs P (t), here, for quadratic problems Q(t), the ˆ 1 following may happen: There is a point t ∈ R and a branch of Cgc consisting of a curve locally given by (x(t), t) which satisfies lim kx(t)k = ∞ or lim kx(t)k = ∞ . t↑tˆ t↓tˆ As will be shown below, generically for Q(t) we may see a behavior as in Figure 6.42 at such points tˆ, which will be called points tˆof ”Type 3 at infinity” and of ”Type 4 at infinity” .


FIGURE 6.42. Point tˆof Type 3 and of Type 4 at infinity.

First we give the genericity result for the Types 1,2,5.

THEOREM 6.11. [Genericity result of the 3 Types] ∞ N ∞ N (a) There is a subset Pr of C (R, R ) (see (6.46)) which is open and dense in C (R, R ) k wrt. the Cs -topology for any k ≥ 1, such that the following is valid: For all (C, c, B, b) ∈ Pr any gc-point of P (C, c, B, b) belongs to one of the Types 1,2 or 5.

(b) For (C, c, B, b) ∈ Pr the sets of gc-points of Type 2 and Type 5 are discrete point sets which are situated in the topological closure of the manifold Cgc^1.
Proof. A detailed proof of this theorem is to be found in [27, Th. 3.1, Cor. 3.1]. Here we only comment on the main argument used. Given the problem data (C, c, B, b) and any active index set J0 ⊂ J we define the matrix families
$$\text{(6.50)}\qquad Q(t) = \begin{pmatrix} C(t) & B(t)^T & c(t) \\ B(t) & 0 & -b(t) \end{pmatrix}, \qquad Q_{J_0}(t) = \begin{pmatrix} C(t) & B_{J_0}(t)^T & c(t) \\ B_{J_0}(t) & 0 & -b_{J_0}(t) \end{pmatrix},$$
where as usual BJ0 = (Bj·)j∈J0, bJ0 = (bj)j∈J0. The genericity result is based on a Whitney regular stratification (M^{n,q}, Σ) of the set M^{n,q} of matrices M of the form (as QJ0(t) in (6.50), with fixed n, q, q = |J0|):
$$M = \begin{pmatrix} C & D^T & c \\ D & 0 & d \end{pmatrix}, \quad C \in S^n,\ D \in M^{q,n},\ c \in \mathbb{R}^n,\ d \in \mathbb{R}^q .$$

The strata set

(6.51) Σ = {Vi, i ∈ I} , I finite ,
is tailored to the structure of gc-points occurring for Q(t). This stratification allows the application of the transversality Theorem 6.2, which implies that for any J0 ⊂ J, |J0| = q, the set of families
{ QJ0 ∈ C∞(R, R^{(1/2)(n+3)n+(n+1)q}) | QJ0 ⋔ Vi ∀i ∈ I }

k ∞ 1 (n+3)n+(n+1)q is Cs open and dense in C (R, R 2 ) for k ≥ 1.  We now describe the phenomenon leading to points tˆ of Type 3 and of Type 4 at infin- ity. To do so, we consider a critical point (x, t) on a connected component of the one- 1 dimensional manifold Cgc. Let this (maximal) component (branch) be parameterized by ∞ the C -function (x(t), t), t ∈ I(x,t), with a maximal open interval I(x,t) containing t. We define ˆ (6.52) t := sup{t ∈ I(x,t)} . For tˆ = ∞ the picture for the right hand side of the branch is clear. For tˆ < ∞ the following may happen (cf., [27, Theorem 3.2]).

THEOREM 6.12. [Genericity result for points tˆof Type 3 and of Type 4 at infinity] ∞ N There is a subset Pr of C (R, R ) (see (6.46)) satisfying the conditions in Theorem 6.11 as well as the following: For (C, c, B, b) ∈ Pr (cf., Theorem 6.11) let tˆ be defined in (6.52) and satisfy tˆ < ∞. Then precisely one of the following alternatives (a) or (b) holds for Q(C, c, B, b). (a) We have n lim x(t) =x ˆ for some xˆ ∈ R , t↑tˆ ˆ and then (ˆx, t) is a gc-point of Type 2 with J(x,t) = J(ˆx,tˆ), |J(x,t)| ≤ n, or a gc-point of Type 5 with J(ˆx,tˆ) = J(x,t) ∪ {j0} for some j0 ∈ J, |J(ˆx,tˆ)| = n + 1. (b) We have lim kx(t)k = ∞ , t↑tˆ and then precisely one of the following cases (i) or (ii) occurs for tˆ. ˆ (i) t is a point of Type 3 at infinity with |J(x,t)| < n:

Here we have, with J0 = J(x,t) (cf. (6.50)),
rank [ C(t̂)  B_{J0}(t̂)^T  c(t̂) ; B_{J0}(t̂)  0  −b_{J0}(t̂) ] = rank [ C(t̂)  B_{J0}(t̂)^T ; B_{J0}(t̂)  0 ] + 1 = n + |J0|,
and rank B_{J0}(t̂) = |J0|.
(ii) t̂ is a point of Type 4 at infinity with |J(x,t)| = n:

Here we have, with J0 = J(x,t) (cf. (6.50)),
rank [ C(t̂)  B_{J0}(t̂)^T  c(t̂) ; B_{J0}(t̂)  0  −b_{J0}(t̂) ] = rank [ C(t̂)  B_{J0}(t̂)^T ; B_{J0}(t̂)  0 ] + 1 = 2n,
and rank (B_{J0}(t̂), b_{J0}(t̂)) = n, rank B_{J0}(t̂) = n − 1.
In both cases (i), (ii) we have
lim_{t↑t̂} x(t)/‖x(t)‖ = x̃ with ‖x̃‖ = 1.
More precisely, in case
(i) x̃ is given as x̃ = x̂/‖x̂‖, where (x̂, µ̂) is the unique (up to a nonzero constant) solution (x̂, µ̂) ≠ (0, 0) of
[ C(t̂)  B_{J0}(t̂)^T ; B_{J0}(t̂)  0 ] (x; µ) = (0; 0),
and in case
(ii) x̃ is the unique (up to a sign) solution x̃ of B_{J0}(t̂)x = 0 with ‖x̃‖ = 1.
The set of such points t̂ ∈ R in (a) and (b) is a discrete set. We emphasize that a corresponding result is valid if we define
t̂ := inf{t ∈ I(x,t)} and consider lim_{t↓t̂} x(t).

For the points t̂ of Type 3 and Type 4 at infinity we can even say more (cf. [27, Theorem 3.3]).

THEOREM 6.13. [Two-sided points t̂ at infinity]

Let (C, c, B, b) ∈ Pr (cf. Theorem 6.12) and let t̂ be a point of Type 3 or of Type 4 at infinity as in Theorem 6.12(b), so that the corresponding branch of C^1_gc left of t̂ is given by the C∞-curve (x(t), t), t < t̂, with active index set J(x(t),t) = J(x,t), and satisfies lim_{t↑t̂} x(t)/‖x(t)‖ = x̃. Then,
in case J(x,t) = J: there exists a branch of points in C^1_gc right of t̂, parameterized by a C∞-function (x(t), t), t̂ < t < t̂ + ε for some ε > 0, such that
lim_{t↓t̂} x(t)/‖x(t)‖ = −x̃.
Also for these Type 1 critical points (x(t), t), t > t̂, the active index set is the same, J(x(t),t) = J(x,t), and for t ↓ t̂ the same Type occurs as for t ↑ t̂. We call such points t̂ two-sided points of Type 3 or of Type 4 at infinity.
In the remainder of the subsection we present examples to illustrate the phenomenon of points of Type 3 and Type 4 at infinity. We further explain the difference between the ordinary gc-point of Type 4 for general programs P(t) and the Type 4 points at infinity for Q(t).

Examples of points t̂ of Type 3 at infinity:
Recall the simplest example of a Type 3 gc-point in Section 6.5, provided by the (non-quadratic) program (with J = ∅):

P(t):  min_{x∈R} (1/3)x³ − tx,  t ∈ R,
with the critical point set Cgc near the critical point (x, t) = (0, 0) of Type 3 as sketched in Figure 6.6. We now consider a one-parametric quadratic program with J = ∅ (see also [27, p. 229]),
Q(t):  min_{x∈R} (1/2)(1 − t)x² + tx,
with critical point equation ∇x f(x, t) = (1 − t)x + t = 0. The set Cgc is thus given by
Cgc = {(x(t), t) | x(t) = t/(t − 1), t ≠ 1},
with a two-sided point t̂ = 1 of Type 3 at infinity (see Figure 6.43).

FIGURE 6.43. Two-sided point t̂ = 1 of Type 3 at infinity.
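The blow-up of this critical point branch is easy to verify numerically. The following small Python sketch (an illustration added here, with no counterpart in the text) evaluates x(t) = t/(t − 1) on both sides of t̂ = 1 and checks the stationarity condition of Q(t).

```python
# Critical points of Q(t): min_x 0.5*(1 - t)*x**2 + t*x  (example from the text).
# The stationarity condition (1 - t)*x + t = 0 gives x(t) = t/(t - 1) for t != 1.

def x_crit(t):
    """Unique critical point of Q(t) for t != 1."""
    return t / (t - 1.0)

def grad(x, t):
    """Derivative d/dx of the objective of Q(t)."""
    return (1.0 - t) * x + t

# approach t_hat = 1 from the left and from the right
for t in [0.9, 0.99, 0.999, 1.001, 1.01, 1.1]:
    x = x_crit(t)
    print(f"t = {t:6.3f}   x(t) = {x:12.3f}   gradient at x(t) = {grad(x, t):.1e}")
# |x(t)| -> infinity as t -> 1 from either side: a two-sided point of Type 3 at infinity.
```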

To see why (in the generic case) a two-sided point t̂ of Type 3 at infinity can only occur if J(x,t) = J, we now look at the problem obtained from Q(t) by adding a constraint:

Q1(t):  min_{x∈R} (1/2)(1 − t)x² + tx  s.t.  g(x, t) := x − t − 2 ≤ 0.
Here the constraint x ≤ t + 2 cuts off a part of the critical point set in Figure 6.43 given by (x(t), t) = (t/(t − 1), t), t ≠ 1. The activity condition x = t/(t − 1) = t + 2, or t² = 2, leads to two gc-points of Type 2 at (x1, t1) = (2 − √2, −√2) and (x2, t2) = (2 + √2, √2). Indeed, the KKT conditions
∇x f(x, t) + µ ∇x g(x, t) = 0,  g(x, t) = 0
read
(1 − t)x + t + µ = 0,  x = t + 2,
or
x = t + 2,  µ = t² − 2.
So the multiplier µ vanishes at t = ±√2 and the points
(x̃(t), t) = (t + 2, t), t ∈ R,
also belong to Cgc. The gc-points of Q1(t) are sketched in Figure 6.44. This set Cgc only shows a one-sided point t̂ of Type 3 at infinity.

FIGURE 6.44. Set Cgc with a (one-sided) point t̂ = 1 of Type 3 at infinity and 2 critical points of Type 2 at (x1, t1), (x2, t2).
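The two Type 2 points of Q1(t) can be checked with a few lines of Python; the helper below is only an added illustration of the KKT relations derived above, not code from the text.

```python
# KKT points of Q1(t): min_x 0.5*(1 - t)*x**2 + t*x  s.t.  x - t - 2 <= 0.
# On the active branch x = t + 2 the multiplier is mu = t**2 - 2, so strict
# complementarity fails exactly at t = -sqrt(2) and t = +sqrt(2).
import math

def kkt_active_branch(t):
    """Return (x, mu) on the branch where the constraint is active."""
    x = t + 2.0
    mu = t * t - 2.0          # from (1 - t)*x + t + mu = 0 with x = t + 2
    return x, mu

for t in [-2.0, -math.sqrt(2), 0.0, math.sqrt(2), 2.0]:
    x, mu = kkt_active_branch(t)
    res = (1.0 - t) * x + t + mu          # residual of the stationarity condition
    print(f"t = {t:7.3f}   x = {x:6.3f}   mu = {mu:7.3f}   residual = {res:.1e}")
# mu changes sign at t = -sqrt(2) and t = +sqrt(2): the two gc-points of Type 2.
```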

Examples of points t̂ of Type 4 at infinity:
We close the subsection with some explanation of points of Type 4 at infinity for the quadratic programs Q(t). Consider first the example of a (general) program P(t) with the (nonlinear) constraints for x ∈ R²,
g1(x, t) = x1² − x2 ≤ 0,  g2(x, t) = x2 − t ≤ 0.
The feasible set F(t) (see Figure 6.45) is empty for t < 0, and consists of the isolated feasible point x = 0 for t = 0.

FIGURE 6.45. Picture for: t < 0, t = 0, 0 < t.

So for P(t), there is a part of the set Cgc given locally by C̃gc := {(x, t) | x2 = t, x1² = t, t ≥ 0} (intersection of g1 = g2 = 0) (see Figure 6.46) with a gc-point of Type 4 at (x, t) = (0, 0). Notice that, depending on the choice of an objective f, the set Cgc could also contain other branches corresponding to critical points where only one (or none) of the constraints is active.

FIGURE 6.46. The set C̃gc for P(t).

For linear constraints, generically, such Type 4 gc-points are no longer possible. To see this, consider now a problem Q(t) with the (linear) constraints (see Figure 6.47)

g1(x, t) = x2 − 1 ≤ 0 , g2(x, t) = −x1t + x2 ≤ 0 .

FIGURE 6.47. Picture for: t < 0, t = 0, 0 < t.

So, independent of f, the set of critical points of Q(t) contains the set
C̃gc = {(x, t) | g1(x, t) = g2(x, t) = 0} = {(x, t) | x = (1/t, 1), t ≠ 0},
with a two-sided point t̂ = 0 of Type 4 at infinity (see Figure 6.48). Note again that, depending on the choice of an objective f, the set Cgc could also contain other branches corresponding to critical points of Q(t) with only one (or no) active constraint.
For general (nonlinear) programs P(t) a gc-point (x, t) of Type 4 as in Figure 6.45 with two active constraints can be generalized to higher dimensions n > 2:
g1(x, t) = Σ_{i=1}^{n−1} xi² − xn ≤ 0,  g2(x, t) = xn − t ≤ 0.
Also here at (x, t) = (0, 0) ∈ R^n × R a generalized critical point of Type 4 occurs. This is generically excluded for one-parametric programs Q(t), since the constraints are linear. A Type 4 gc-point (x, t) would be a point satisfying, with J0 := J(x,t), |J(x,t)| ≤ n, the equations

(6.53)  B_{J0}(t)x = b_{J0}(t) and rank B_{J0}(t) = k < |J0|.

FIGURE 6.48. The set C̃gc for Q(t).
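The behaviour of this two-sided point of Type 4 at infinity can be reproduced numerically from the two linear constraints above. The sketch below is an added illustration (numpy is assumed); it solves g1 = g2 = 0 for t near t̂ = 0 and shows the blow-up and the flip of the normalized direction.

```python
# Intersection of the two linear constraints of Q(t) from the text:
# g1: x2 - 1 = 0,  g2: -t*x1 + x2 = 0,  i.e.  B(t) x = b with
# B(t) = [[0, 1], [-t, 1]], b = (1, 0).  det B(t) = t vanishes and changes sign at t_hat = 0.
import numpy as np

def intersection(t):
    B = np.array([[0.0, 1.0], [-t, 1.0]])
    b = np.array([1.0, 0.0])
    return np.linalg.solve(B, b), np.linalg.det(B)

for t in [-0.1, -0.01, 0.01, 0.1]:
    x, det = intersection(t)
    print(f"t = {t:6.2f}  det B(t) = {det:6.2f}  x(t) = {x}  x/||x|| = {x / np.linalg.norm(x)}")
# ||x(t)|| -> infinity as t -> 0 and the normalized direction flips from (-1, 0) to (+1, 0):
# the behaviour of a two-sided point of Type 4 at infinity.
```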

We show that the existence of such a point is generically excluded. To do so, we consider two possible cases for (x, t):

Case |J0| < n: This means that at t the (|J0| × n)-matrix B_{J0}(t) meets the set M^{|J0|,n}(k) of matrices of rank k < |J0| < n. This set is a manifold with codimension (see Theorem 5.11)
codim M^{|J0|,n}(k) = (n − k)(|J0| − k) ≥ (n − k) ≥ 2.

However, this situation is generically excluded for a one-parametric matrix function B_{J0}(t).
Case |J0| = n: The second condition in (6.53) means that at t the matrix B_{J0}(t) meets the set M^{n,n}(k) for k < n. For k = n − 1 the set M^{n,n}(n − 1) has codimension (n − (n − 1))² = 1. So this situation may happen in the generic situation (on a discrete set of points t ∈ R). But the first condition in (6.53) implies (|J0| = n)

rank (B_{J0}(t), b_{J0}(t)) = rank B_{J0}(t) < n.
So the matrix (B_{J0}(t), b_{J0}(t)) ∈ M^{n,n+1} has rank < n, i.e., this matrix meets the set M^{n,n+1}(k) with k < n. But M^{n,n+1}(k) then satisfies
codim M^{n,n+1}(k) = (n + 1 − k)(n − k) ≥ 2,
and consequently also this situation is generically excluded. In other words, even if at t the n gradients ∇x gj(x, t) = B_{j·}(t), j ∈ J0, |J0| = n, are linearly dependent, an intersection point x of the n constraints

B_{j·}(t)x = bj(t), j ∈ J0,
is generically excluded.
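The dimension count behind this argument is elementary and can be made explicit. The following sketch (added here only as an illustration) tabulates the codimension formula codim M^{m,n}(k) = (m − k)(n − k) from Theorem 5.11 for the cases discussed above.

```python
# Codimension of the manifold M^{m,n}(k) of m x n matrices of rank k (Theorem 5.11):
# codim = (m - k)(n - k).  A one-parametric curve t -> B_{J0}(t) generically avoids
# all strata of codimension >= 2 and meets codimension-1 strata only at isolated t.

def codim_rank_stratum(m, n, k):
    """Codimension of the set of m x n matrices of rank exactly k."""
    assert 0 <= k <= min(m, n)
    return (m - k) * (n - k)

n = 4
# case |J0| < n with rank k < |J0| < n:  codimension >= 2, generically avoided
print(codim_rank_stratum(3, n, 2))          # (3-2)*(4-2) = 2
# case |J0| = n with k = n-1:             codimension 1, possible at discrete t
print(codim_rank_stratum(n, n, n - 1))      # 1
# ...but then (B_{J0}(t), b_{J0}(t)) would also need rank < n:  codimension >= 2
print(codim_rank_stratum(n, n + 1, n - 1))  # (4-3)*(5-3) = 2
```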

6.6.2. Genericity results for one-parametric linear programs. We finally examine the properties of one-parametric linear programs in primal form (cf. Section 3.2),
(6.54)  L(t): min c(t)^T x  s.t.  x ∈ F(t) = {x ∈ R^n | B(t)x ≤ b(t)},
depending on the parameter t ∈ R, where c ∈ C∞(R, R^n), B ∈ C∞(R, M^{m,n}), b ∈ C∞(R, R^m). We refer the reader also to [11]. In order to facilitate a direct comparison with the results for quadratic programs Q(t) in Subsection 6.6.1 and for general nonlinear programs P(t) in Subsection 6.5.2, here, in contrast to Section 3.2, the primal program L(t) is formulated as a minimum problem.
For the one-parametric problem L(t) above (n, m fixed), the problem set is given by
(6.55)  P = {(c, B, b) ∈ C∞(R, R^n × M^{m,n} × R^m)} ≡ C∞(R, R^N),  N = n + n·m + m,
and we often write L(c, B, b) instead of L(t).
For L(t) a feasible point (x, t), x ∈ F(t), is a gc-point if, with f(x, t) := c(t)^T x, gj(x, t) := B_{j·}(t)x − bj(t), the gradients
(6.56)  ∇x f(x, t) = c(t)^T,  ∇x gj(x, t) = B_{j·}(t), j ∈ J(x,t),
are linearly dependent. LICQ holds at (x, t) if
(6.57)  B_{j·}(t)^T, j ∈ J(x,t), are linearly independent,
and such a point (x, t) is a critical point for L(t) if LICQ holds as well as the critical point relation
(6.58)  c(t) + Σ_{j∈J(x,t)} µj B_{j·}(t)^T = 0 with (unique) multipliers µj ∈ R.
Again we say that
(6.59)  (SC) holds, if in (6.58) we have: µj ≠ 0 for all j ∈ J(x,t).
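For concrete data these conditions are easy to test numerically. The helper below is a hypothetical illustration (the function name and the small example instance are our own, not from the text); it determines the active set at a feasible point of L(t) for fixed t and checks LICQ (6.57), the critical point relation (6.58) and SC (6.59).

```python
# Checking the active set, LICQ, criticality and SC for L(t) at a feasible point x
# (hypothetical helper; B, b, c are the data of L(t) evaluated at a fixed t).
import numpy as np

def classify_point(c, B, b, x, tol=1e-9):
    """Return (active set, LICQ, critical, SC, multipliers) at a feasible x."""
    res = B @ x - b
    if np.any(res > tol):
        raise ValueError("x is not feasible")
    J = np.where(np.abs(res) <= tol)[0]                  # active index set J_(x,t)
    if len(J) == 0:
        return [], True, bool(np.allclose(c, 0.0)), False, np.zeros(0)
    BJ = B[J, :]
    licq = np.linalg.matrix_rank(BJ) == len(J)           # (6.57)
    mu, *_ = np.linalg.lstsq(BJ.T, -c, rcond=None)       # try c + BJ^T mu = 0, cf. (6.58)
    critical = licq and np.allclose(BJ.T @ mu, -c, atol=1e-8)
    sc = critical and bool(np.all(np.abs(mu) > tol))     # (6.59)
    return list(J), licq, critical, sc, mu

# tiny example in R^2 with three constraints (illustration only)
B = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([1.0, 1.0, 0.0])
c = np.array([-1.0, -1.0])
print(classify_point(c, B, b, np.array([1.0, 1.0])))     # Type 1 vertex: LICQ, critical, SC
```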

As before, we are interested in the generic structure of the set Cgc of gc-points of L(t) and the set Cm corresponding to (global) minimizers. More precisely, we wish to know which types of gc-points can generically occur for programs L(t). As we shall see, there are differences between the generic behaviour of these one-parametric linear problems L(t) and the properties of the quadratic programs Q(t) (in Subsection 6.6.1) and the general nonlinear programs P(t) (in Subsection 6.5.2). We start with three differences concerning gc-points of Type 1, 2 and 3.

Firstly, consider a gc-point (x, t) of Type 1 for L(t). The critical point equations and the feasibility conditions for (x, t) read:
(6.60)  Σ_{j∈J(x,t)} µj B_{j·}(t)^T = −c(t),  B_{j·}(t)x = bj(t), j ∈ J(x,t).
In contrast to the corresponding equations for Q(t) or P(t), these relations (6.60) in the unknowns (x, µ) are completely decoupled. For a Type 1 critical point (x, t) for L(t) we in particular require that at t the system (6.60) has a unique solution (x, µ), i.e., the (n + |J(x,t)|) × (n + |J(x,t)|)-matrix
(6.61)  [ 0  B_{J(x,t)}(t)^T ; B_{J(x,t)}(t)  0 ]  is nonsingular.
This obviously implies rank B_{J(x,t)}(t) = |J(x,t)| and rank B_{J(x,t)}(t) = n, which is equivalent to
(6.62)  |J(x,t)| = n and rank B_{J(x,t)}(t) = n.
Consequently, for a Type 1 point (x, t) of L(t) the point x is always a non-degenerate vertex of F(t).

Secondly, since the Hessians of all problem functions of a linear program are identically zero, for L(t) no gc-points (x, t) of Type 3 as for P(t), and no points t̂ of Type 3 at infinity as for Q(t), can occur.

Thirdly, generically the gc-points (x, t) of P(t), Q(t) are always locally unique (isolated). Consider, e.g., the sketches of the feasible sets (in blue) and level sets (in red) of a quadratic program Q(t) for x ∈ R², t ∈ R (see Figure 6.49, picture for t1 < t < t2). We have given 3 linear constraints gj(x, t) := B_{j·}(t)x − bj(t) ≤ 0, j = 1, 2, 3.

FIGURE 6.49. Picture for t = t1 < t, t = t, t = t2 > t.

The intersection points x1(t) of g1 = g2 = 0, and x2(t) of g2 = g3 = 0, are always critical points, and obviously at (x1(t1), t1) and (x2(t2), t2) we have a critical point of Type 2, and at (x, t) a point of Type 1 for Q(t). The corresponding set Cgc is sketched in Figure 6.50. For the linear program L(t) with the same linear constraints sketched in Figure 6.51 the situation is essentially different. Here, for t = t1 (t = t2) the point x1(t1) (x2(t2)) is a unique (global) vertex minimizer of Type 1 for L(t1) (L(t2)). For any t ≠ t near t the intersection points x1(t) of g1 = g2 = 0 and x2(t) of g2 = g3 = 0 are locally unique critical points (primal vertices) of L(t). But at t = t the whole line segment

S(t) = {(1 − τ)x1(t) + τx2(t) | τ ∈ [0, 1]}
consists of critical points. So, for t ≈ t the set Cgc looks as given in Figure 6.52.

FIGURE 6.50. Set Cgc of Q(t) with two points (x1(t1), t1), (x2(t2), t2) of Type 2.

FIGURE 6.51. Picture for t = t1 < t, t = t, t = t2 > t.

FIGURE 6.52. Set Cgc of L(t) containing a whole line segment S(t) of gc-points of L(t).

The end points (x1(t), t) and (x2(t), t) of S(t) will be called gc-points (x, t) of Type 2* of L(t). In the sequel we will show that generically for one-parametric programs L(t) only gc-points of Type 1, 2*, 5 and points t̂ of Type 4 at infinity can occur.

Recall that a problem L(t) = L(c, B, b) is given by an element (c, B, b) ∈ P = C∞(R, R^n × M^{m,n} × R^m) (cf. (6.55)). For an index set J0 ⊂ J we again define
B_{J0}(t) = (B_{j·}(t))_{j∈J0} ∈ M^{|J0|,n},  b_{J0}(t) = (bj(t))_{j∈J0} ∈ R^{|J0|}.
Then a gc-point (x, t) of Type 1, 2* or 5 and a point t of Type 4 at infinity will be a point where the vector (c(t), B_{J0}(t), b_{J0}(t)) with some J0 ⊂ J meets a certain manifold in (R^n × M^{|J0|,n} × R^m). We now describe the different types of gc-points which can appear for L(t) in the generic case.

Type 1: A point (x, t), x ∈ F(t), where LICQ holds with |J(x,t)| = n and the critical point equation
c(t) + Σ_{j∈J(x,t)} µj B_{j·}(t)^T = 0
is valid with SC: µj ≠ 0, j ∈ J(x,t). Under this condition, the point x is a nondegenerate primal vertex of F(t) (cf. Definition 3.1) and µ is a non-degenerate dual vertex.

Type 5: A point (x, t), x ∈ F(t), satisfying

(5a) |J(x,t)| = n + 1 (so LICQ fails).
(5b) The vectors
∇(x,t) gj(x, t)^T = ( B_{j·}(t)^T ; B'_{j·}(t)x − b'_j(t) ), j ∈ J(x,t), are linearly independent,
where B'_{j·}(t) = (d/dt) B_{j·}(t), b'_j(t) = (d/dt) bj(t).
(5c) In view of (5a), (5b) there is (up to a common factor) a unique nonzero solution µ ∈ R^{|J(x,t)|} of
(6.63)  Σ_{j∈J(x,t)} µj B_{j·}(t)^T = 0.
We have: µj ≠ 0 for all j ∈ J(x,t).

(5d) By (5a), (5b) there exist unique yj, j ∈ J(x,t), solving
∇(x,t) f(x, t)^T + Σ_{j∈J(x,t)} yj ∇(x,t) gj(x, t)^T = 0,
or,
(6.64)  ( c(t) ; c'(t)x ) + Σ_{j∈J(x,t)} yj ( B_{j·}(t)^T ; B'_{j·}(t)x − b'_j(t) ) = 0.
Defining γ_{qj} := yj − yq µj/µq, j, q ∈ J(x,t), we have:
γ_{q,j} ≠ 0 for all q ≠ j, q, j ∈ J(x,t).

Type 2*: A gc-point (x̂, t̂) such that we have:
(2a) |J(x̂,t̂)| = n and rank B_{J(x̂,t̂)}(t̂) = n, i.e., x̂ is a non-degenerate vertex of F(t̂).
(2b) By (2a) there are unique multipliers µj ∈ R, j ∈ J(x̂,t̂), such that
c(t̂) + Σ_{j∈J(x̂,t̂)} µj B_{j·}(t̂)^T = 0.
For precisely one j0 ∈ J(x̂,t̂) we have: µ_{j0} = 0.
(2c) By (2a) there is (up to a positive factor) a unique solution d ≠ 0 of
(6.65)  B_{J(x̂,t̂)\{j0}}(t̂)d = 0,  B_{j0·}(t̂)d < 0.
If we define
τ0 := min_{j∈J\J(x̂,t̂), B_{j·}(t̂)d>0} (bj(t̂) − B_{j·}(t̂)x̂)/(B_{j·}(t̂)d) > 0,
then in case τ0 = ∞ the set
S(t̂) = {x̂ + τd | 0 ≤ τ ≤ ∞}, a half-line,
is a set of critical points of L(t̂). In case τ0 < ∞ the minimum value τ0 is attained at precisely one index j1 ∈ J \ J(x̂,t̂) and the set
S(t̂) = {x̂ + τd | 0 ≤ τ ≤ τ0}, a line segment,
is a set of critical points of L(t̂). Moreover, B_{J(x̂,t̂)\{j0}∪{j1}}(t̂) has rank n and the point x̃ := x̂ + τ0 d is a nondegenerate vertex of F(t̂) with J(x̃,t̂) = J(x̂,t̂) \ {j0} ∪ {j1}.

To characterize a point of Type 4 at infinity, as for the program Q(t) in Subsection 6.6.1, we consider a critical point (x, t) of Type 1 on a maximal component of the one-dimensional manifold of Type 1 points, parameterized by the curve (x(t), t), t ∈ I(x,t), with a maximal open interval I(x,t) ⊂ R containing t. We define (see (6.52))
(6.66)  t̂ := sup{t ∈ I(x,t)}  (or t̂ := inf{t ∈ I(x,t)}).
For t̂ = ∞ (or t̂ = −∞) the picture is clear.

Type 4 at infinity: A point t̂ ∈ R defined by (6.66) (corresponding to the Type 1 point (x, t) and the curve (x(t), t) above) which satisfies the following:

(4a) |J0| = n where J0 := J(x,t).
(4b) rank B_{J0}(t̂) = n − 1 and rank (B_{J0}(t̂) b_{J0}(t̂)) = n.
(4c) lim_{t↑t̂} x(t)/‖x(t)‖ = x̃,
where x̃ is (up to a sign) the unique vector satisfying
B_{J0}(t̂)x̃ = 0,  ‖x̃‖ = 1.

We now formulate the genericity result for one-parametric linear programs of the type L(t) = L(c, B, b), similar to Theorem 6.12. This result will be based on the Jet Transversality Theorem 6.1. More precisely, we show that a generic subset of problems (c, B, b) ∈ P ≡ C∞(R, R^n × M^{m,n} × R^m) avoids certain manifolds (of codimension 2) of R^n × M^{m,n} × R^m and only meets other manifolds (of codimension 1) at discrete point sets. Since these manifolds are not closed we will not obtain an "open and dense" result but only a "genericity" result, see Theorem 6.1.

THEOREM 6.14. [Genericity result for one-parametric linear programs]

There is a generic subset Pr of P such that for each (c, B, b) ∈ Pr the following holds: For any point t̂ defined in (6.66) precisely one of the following alternatives (a) or (b) holds for L(c, B, b).
(a) We have
(6.67)  lim_{t↑t̂} x(t) = x̂ for some x̂ ∈ R^n,
and then either
(i) (x̂, t̂) is a gc-point of Type 5 and |J(x̂,t̂)| = n + 1, or
(ii) (x̂, t̂) is a point of Type 2* and |J(x̂,t̂)| = n.

(b) We have
(6.68)  lim_{t↑t̂} ‖x(t)‖ = ∞,
and then t̂ is a point of Type 4 at infinity.
The set of points t̂ ∈ R in (a) and (b) is a discrete set.

Proof. (a) Assuming (6.67), by continuity the point (x̂, t̂) must be a gc-point. More precisely, with some µ0 ∈ R, µj ∈ R, j ∈ J(x,t), |J(x,t)| = n, it holds
(6.69)  µ0 c(t̂) + Σ_{j∈J(x,t)} µj B_{j·}(t̂)^T = 0,  |µ0| + Σ_{j∈J(x,t)} |µj| = 1,
(6.70)  B_{J(x,t)}(t̂)x̂ = b_{J(x,t)}(t̂).
The following cases may occur. Note that by assumption (x̂, t̂) is not a point of Type 1.
(a1) rank B_{J(x,t)}(t̂) = n, and J(x̂,t̂) = J(x,t) ∪ {j0}. Then, as we shall see, generically (x̂, t̂) is a point of Type 5.

(a2) rank B_{J(x,t)}(t̂) = n, and J(x̂,t̂) = J(x,t) ∪ J1, |J1| ≥ 2. So, putting J0 := J(x̂,t̂), the matrix
( B_{J0}(t̂)  b_{J0}(t̂) ) ∈ M^{n+q,n+1}, q ≥ 2, has rank ≤ n,
i.e., ( B_{J0}(t̂)  b_{J0}(t̂) ) meets a manifold of codimension ≥ (n + 2 − n)(n + 1 − n) = 2 (see Theorem 5.11). This can be excluded for a generic subset Pa2 of P.
(a3) rank B_{J(x,t)}(t̂) = k < n, and J(x̂,t̂) = J(x,t) ∪ J1, |J1| ≥ 0. Together with (6.70) this condition gives just the relations (6.53), which have been shown to be excluded for a generic subset Pa3 of P (with the arguments after (6.53) in Subsection 6.6.1).
(a4) rank B_{J(x,t)}(t̂) = n, J(x̂,t̂) = J(x,t), so that
(6.71)  c(t̂) + Σ_{j∈J(x,t)} µj B_{j·}(t̂)^T = 0 with unique µj ∈ R.

In this case (since (x̂, t̂) is not of Type 1) we can assume that q ≥ 1 of these µj are equal to zero. For q = 1, as we shall show below, (x̂, t̂) is a point of Type 2*. But the case q ≥ 2 is generically excluded. Indeed, for q ≥ 2 the condition (6.71) means that for a subset J0 ⊂ J(x,t), |J0| = |J(x,t)| − q ≤ n − 2 (µj = 0 for j ∈ J(x,t) \ J0), the n × (|J0| + 1)-matrix
( B_{J0}(t̂)^T  c(t̂) ) has rank |J0|,
i.e., this matrix meets a manifold of codimension (n − |J0|)(|J0| + 1 − |J0|) = (n − |J0|) ≥ 2 (cf. Theorem 5.11). Again this is excluded for a generic subset Pa4 of P.

Proof of (b): Let (6.68) hold. By assumption, the Type 1 points (x(t), t) satisfy, with

|J(x,t)| = n,
B_{J(x,t)}(t)x(t) = b_{J(x,t)}(t) for all t < t̂, t ∈ I(x,t).
By dividing this relation by ‖x(t)‖ and using (6.68) we find that x(t)/‖x(t)‖, t ↑ t̂, has some limit point x̃, ‖x̃‖ = 1, satisfying
(6.72)  B_{J(x,t)}(t̂)x̃ = 0.
So rank B_{J(x,t)}(t̂) = q ≤ n − 1. Again, for a generic subset Pb1 of P the condition q ≤ n − 2 is excluded, and the case q = n − 1 is possible for a discrete set of points t̂ ∈ R.
So let q = n − 1. Since rank B_{J(x,t)}(t̂) = n − 1, the relation B_{J(x,t)}(t̂)x̃ = 0 can only have (up to a sign) one solution x̃, ‖x̃‖ = 1. Consequently x(t)/‖x(t)‖, t ↑ t̂, can only have x̃ as a limit point (up to a sign), i.e.,
(6.73)  lim_{t↑t̂} x(t)/‖x(t)‖ = x̃.
Moreover, for a generic subset Pb2 of P we have rank (B_{J(x,t)}(t̂) b_{J(x,t)}(t̂)) = n. Indeed, the case that the n × (n + 1)-matrix (B_{J(x,t)}(t̂) b_{J(x,t)}(t̂)) has rank ≤ n − 1 (meets a manifold of codimension ≥ 2) can be generically excluded.

Proof of (a) continued: We now analyse further the remaining cases (a1) and (a4) in (a).

Case (a1), a point (x̂, t̂) of Type 5:
By assumption, the point (x̂, t̂) satisfies, with J(x,t), |J(x,t)| = n,
rank B_{J(x,t)}(t̂) = n and J(x̂,t̂) = J(x,t) ∪ {j0}.

So |J(x̂,t̂)| = n + 1 and condition (5a) of Type 5 is valid. We now prove the other conditions (5b)-(5d).
(5b): Defining J0 := J(x̂,t̂), then |J0| = n + 1. Note that, by the assumption rank B_{J(x,t)}(t̂) = n, the matrix
( B_{J0}(t̂)  b_{J0}(t̂) ) ∈ M^{n+1} has rank n.
So, in view of Ex. 5.5, this condition is equivalent to
det ( B_{J0}(t̂)  b_{J0}(t̂) ) = 0,
or, using the feasibility condition g_{J0}(x̂, t̂) = B_{J0}(t̂)x̂ − b_{J0}(t̂) = 0, it is also equivalent to
det ( B_{J0}(t̂)  g_{J0}(x̂, t̂) ) = 0.
We now apply Ex. 5.9 with S = M^{n+1}(n), f(t) := ( B_{J0}(t)  g_{J0}(x̂, t) ) and h(M) = det(M), M ∈ M^{n+1}. By Theorem 6.1 there is a generic subset P51 of P such that for (c, B, b) ∈ P51 we have f ⋔ M^{n+1}(n), or, equivalently (cf. Ex. 5.9),
0 ≠ (d/dt) h(f(t̂)) = (d/dt) det ( B_{J0}(t̂)  g_{J0}(x̂, t̂) ) = det ( B_{J0}(t̂)  (d/dt) g_{J0}(x̂, t̂) ),
where the last equality follows by using the multi-linearity of the determinant and g_{J0}(x̂, t̂) = 0. This relation implies that the n + 1 rows of ( B_{J0}(t̂)  (d/dt) g_{J0}(x̂, t̂) ), given by
∇(x,t) gj(x̂, t̂) = ( B_{j·}(t̂), B'_{j·}(t̂)x̂ − b'_j(t̂) ), j ∈ J0 = J(x̂,t̂), are linearly independent.
(5c): Recall that by the assumption rank B_{J(x,t)}(t̂) = n, J0 = J(x̂,t̂), |J0| = n + 1, there exists (up to a factor) a unique nonzero multiplier µ_{J0} such that
(6.74)  Σ_{j∈J0} µj B_{j·}(t̂)^T = 0.

We now show that the condition µ_{j1} = 0 for some j1 ∈ J0 can be generically excluded. To do so, note that by assumption there exist unique multipliers γj, j ∈ J(x,t), such that
Σ_{j∈J(x,t)} γj B_{j·}(t̂)^T = −B_{j0·}(t̂)^T.

If in (6.74) we had µ_{j1} = 0, then j1 ∈ J(x,t) and there would exist a solution γ ∈ R^{n−1} of
B_{J(x,t)\{j1}}(t̂)^T γ = −B_{j0·}(t̂)^T.
Thus, together with the feasibility condition B_{J0}(t̂)x̂ = b_{J0}(t̂) (recall J0 = J(x,t) ∪ {j0}), there must be a solution (γ, x) ∈ R^{n−1} × R^n of
(6.75)  B_{J(x,t)\{j1}}(t̂)^T γ = −B_{j0·}(t̂)^T,  B_{J(x,t)}(t̂)x = b_{J(x,t)}(t̂),  B_{j0·}(t̂)x = b_{j0}(t̂).

Since rank B_{J(x,t)\{j1}}(t̂) = n − 1 (after appropriate operations on the first n rows) we can assume that this system has the form:
A11 γ = −c1,  A11 ∈ M^{n−1} nonsingular,
a0^T γ = −c0,  a0, c1 ∈ R^{n−1}, c0 ∈ R,
B11 x = b1,  B11 ∈ M^n nonsingular,
b0^T x = γ0,  b0, b1 ∈ R^n, γ0 ∈ R.
Solving this system using γ = −A11^{−1} c1, x = B11^{−1} b1 leads to the relations
c0 − a0^T A11^{−1} c1 = 0,  γ0 − b0^T B11^{−1} b1 = 0,
which obviously define two linearly independent equations in the unknowns (A11, a0, c1, c0, B11, b0, b1, γ0). Consequently, the condition µ_{j1} = 0 means that the problem (c, B, b) at t̂ meets a manifold of codimension 2, which can generically be avoided.

(5d): By (5b) there exist unique solutions yj ∈ R, j ∈ J0, of
(6.76)  ( c(t̂) ; c'(t̂)x̂ ) + Σ_{j∈J0} yj ( B_{j·}(t̂)^T ; B'_{j·}(t̂)x̂ − b'_j(t̂) ) = 0.
For each q ∈ J0 we subtract yq·(1/µq) times (6.74) from the first n rows of (6.76) to obtain
c(t̂) + Σ_{j∈J0\{q}} (yj − yq µj/µq) B_{j·}(t̂)^T = 0,  q ∈ J0.
Assume now that γ_{q1,j1} = 0 holds for some pair (q1, j1), q1 ≠ j1 (recall γ_{q,j} = yj − yq µj/µq); then
c(t̂) + Σ_{j∈J0\{q1,j1}} (yj − y_{q1} µj/µ_{q1}) B_{j·}(t̂)^T = 0.

Together with the feasibility conditions (recall J0 = J(x,t) ∪ {j0}),
B_{J(x,t)}(t̂)x̂ = b_{J(x,t)}(t̂),  B_{j0·}(t̂)x̂ = b_{j0}(t̂),
this means that, with J1 := J0 \ {q1, j1}, |J1| = n − 1, there is a solution (γ, x) ∈ R^{n−1} × R^n of
(6.77)  B_{J1}(t̂)^T γ = −c(t̂),  B_{J(x,t)}(t̂)x = b_{J(x,t)}(t̂),  B_{j0·}(t̂)x = b_{j0}(t̂).
Generically we can assume that the n × (n − 1)-matrix B_{J1}(t̂)^T has rank n − 1 (use again Theorem 5.11). So (6.77) has the same structure as the system (6.75), and with the same arguments as above in the proof of part (5c) we conclude that the condition γ_{q1,j1} = 0 leads to the condition that the problem (c, B, b) meets at t̂ a manifold of codimension 2, which again can generically be excluded.

Case (a4), gc-point (x̂, t̂) of Type 2*:
Here, with J0 := J(x,t) = J(x̂,t̂), |J0| = n, we have rank B_{J0}(t̂) = n, and with unique µj, j ∈ J0,
(6.78)  c(t̂) + Σ_{j∈J0} µj B_{j·}(t̂)^T = 0, and µ_{j0} = 0 for precisely one j0 ∈ J0.

Moreover, x̂ is a feasible nondegenerate vertex of F(t̂),
B_{J0}(t̂)x̂ = b_{J0}(t̂).
Now consider the unique solution d ≠ 0 (unique up to a factor) of
B_{J0\{j0}}(t̂)d = 0.
We can assume B_{j0·}(t̂)d < 0 (choosing the appropriate sign of d). Then obviously, using J(x̂,t̂) = J0, for the points xτ := x̂ + τd we find
B_{J0\{j0}}(t̂)xτ − b_{J0\{j0}}(t̂) = 0 for all τ > 0,
B_{j0·}(t̂)xτ − b_{j0}(t̂) = τ B_{j0·}(t̂)d < 0 for all τ > 0,
B_{j·}(t̂)xτ − bj(t̂) = B_{j·}(t̂)x̂ − bj(t̂) + τ B_{j·}(t̂)d < 0, j ∈ J \ J0, for all small τ > 0.

Looking at the last relations, we require the feasibility conditions B_{j·}(t̂)xτ − bj(t̂) ≤ 0, j ∈ J \ J0, and define
τ0 := min_{j∈J\J0, B_{j·}(t̂)d>0} (bj(t̂) − B_{j·}(t̂)x̂)/(B_{j·}(t̂)d) > 0.
Then we obviously obtain (cf. also (6.78)) that the set

S(t̂) := {x̂ + τd | 0 ≤ τ ≤ τ0}
consists of gc-points of L(t̂). In case τ0 = ∞, S(t̂) is a half-line. In case τ0 < ∞, as we shall show, generically the min-value defining τ0 is attained at precisely one index j1 ∈ J \ J0. Note that the conditions rank B_{J0\{j0}}(t̂) = n − 1, B_{J0\{j0}}(t̂)d = 0, and B_{j1·}(t̂)d > 0 imply that the matrix B_{J0\{j0}∪{j1}}(t̂) is nonsingular. To see that the minimum defining τ0 is attained only at one index j1, assume to the contrary that τ0 is attained at j1 and j2, j1 ≠ j2. Then at the gc-point (x̃, t̂) with x̃ := x̂ + τ0 d the following relations hold (cf. (6.78)):
B_{J0\{j0}}(t̂)^T µ = −c(t̂),  B_{J0\{j0}∪{j1}}(t̂)x̃ = b_{J0\{j0}∪{j1}}(t̂),  B_{j2·}(t̂)x̃ = b_{j2}(t̂).
However, with the same arguments as in case (5d) for (6.77), these relations yield two linearly independent equations, and can be generically excluded.
If we take the intersection Pr of all generic subsets Pa2, Pa3, Pa4, Pb1, Pb2, P51 of P, we have found a generic subset Pr of P with the required conditions. We finally note that the conditions leading to points of Type 5 and Type 2* in (a) and Type 4 at infinity in (b) correspond to the fact that the problem (c, B, b) ∈ P transversally meets at t̂ a manifold in R^n × M^{m,n} × R^m of codimension 1. So generically these situations can only occur at a discrete set of points t̂ in R. □
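The construction of the direction d in (6.65) and of the min-ratio value τ0 in condition (2c) can be illustrated as follows. The function name and the small data set below are hypothetical, chosen only for this sketch; they follow the recipe of the Type 2* definition.

```python
# Illustration of condition (2c) of Type 2*: the direction d from (6.65) and the
# min-ratio value tau_0.  The data are a hypothetical instance, not from the text.
import numpy as np

def type2star_segment(B, b, x_hat, J0, j0):
    """Return d and tau_0 for a nondegenerate vertex x_hat with active set J0 and mu_{j0} = 0."""
    J_rest = [j for j in J0 if j != j0]
    _, _, Vt = np.linalg.svd(B[J_rest, :])   # d spans the kernel of B_{J0 \ {j0}}
    d = Vt[-1]
    if B[j0] @ d > 0:                        # fix the sign so that B_{j0} d < 0
        d = -d
    ratios = []
    for j in range(B.shape[0]):
        if j in J0:
            continue
        bjd = B[j] @ d
        if bjd > 1e-12:
            ratios.append((b[j] - B[j] @ x_hat) / bjd)
    tau0 = min(ratios) if ratios else np.inf
    return d, tau0

# n = 2, four constraints; x_hat = (0, 0) is a vertex with active set {0, 1}
B = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0], [1.0, 0.0]])
b = np.array([0.0, 0.0, 3.0, 2.0])
d, tau0 = type2star_segment(B, b, x_hat=np.array([0.0, 0.0]), J0=[0, 1], j0=0)
print("d =", d, " tau0 =", tau0)   # segment S(t_hat) = {x_hat + tau*d | 0 <= tau <= tau0}
```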

It is clear that a result as in Theorem 6.14 is also valid if, instead of (6.66), we define
t̂ = inf{t ∈ I(x,t)} and consider lim_{t↓t̂} x(t).
As for the quadratic program Q(t), also for L(t) generically a point of Type 4 at infinity is two-sided if J = J(x,t) holds.

THEOREM 6.15. [Two-sided point of Type 4 at infinity]

There is a generic subset Pr of P such that for any (c, B, b) ∈ Pr, in addition to the properties in Theorem 6.14, the following holds at any point of Type 4 at infinity as in Theorem 6.14(b), so that the corresponding branch of Type 1 points (x(t), t), t < t̂, satisfies J(x(t),t) = J(x,t), |J(x,t)| = n, and lim_{t↑t̂} x(t)/‖x(t)‖ = x̃.
In case J(x,t) = J: there exists a branch of Type 1 points of L(c, B, b), parameterized by a C∞-curve (x(t), t) for t̂ < t < t̂ + ε, for some ε > 0, such that J(x(t),t) = J(x,t) and
lim_{t↓t̂} x(t)/‖x(t)‖ = −x̃.
In case J(x,t) ⊊ J: this branch right of t̂ is generically excluded.

Proof. Recall that we have given a curve of Type 1 points (x(t), t) of L(c, B, b) such that for t < t̂ (t close to t̂), with J0 := J(x,t), |J0| = n, the matrix B_{J0}(t) is nonsingular and x(t) is the unique solution of
(6.79)  B_{J0}(t)x(t) = b_{J0}(t),  t < t̂,
with components given by Cramer's rule (see (7.3)) as
(6.80)  xi(t) = det B^i_{J0}(b_{J0})(t) / det B_{J0}(t),
where B^i_{J0}(b_{J0})(t) is the matrix obtained from B_{J0}(t) by replacing the i-th column by b_{J0}(t).
Now, condition (4b) of Type 4 at infinity implies rank B_{J0}(t̂) = n − 1, as well as rank (B_{J0}(t̂) b_{J0}(t̂)) = n. Consequently, there is some i0 such that the i0-th column B_{·i0}(t̂) of B_{J0}(t̂) is linearly dependent on the other columns. So for the matrix B^{i0}_{J0}(t̂) obtained from B_{J0}(t̂) by skipping column B_{·i0}(t̂) it follows, with the matrix B^{i0}_{J0}(b_{J0})(t) above:
rank B^{i0}_{J0}(b_{J0})(t̂) = rank ( B^{i0}_{J0}(t̂)  b_{J0}(t̂) ) = rank ( B_{J0}(t̂)  b_{J0}(t̂) ) = n.
Thus det B^{i0}_{J0}(b_{J0})(t̂) ≠ 0 is valid. We can assume that i0 is the index with maximal value |det B^{i0}_{J0}(b_{J0})(t̂)| > 0. In view of det B_{J0}(t̂) = 0, from (6.80) we conclude lim_{t↑t̂} x_{i0}(t) = ∞, and with the solution x̃ in (6.73) (proof of Theorem 6.14(b)) we find

(6.81)  lim_{t↑t̂} x_{i0}(t)/‖x(t)‖ = x̃_{i0} ≠ 0.

We now show that there is a generic subset P4 of P such that for any (c, B, b) ∈ P4 at all points t̂ of Type 4 at infinity we must have
(6.82)  (d/dt) det B_{J0}(t̂) ≠ 0.
This implies rank B_{J0}(t) = n for all t ≠ t̂, t close to t̂, and (6.79) defines also for t, t̂ < t < t̂ + ε (some ε > 0), a solution curve (x(t), t).
To prove (6.82) we apply Ex. 5.9 with S = M^n(n − 1), f(t) = B_{J0}(t), h(M) = det M, M ∈ M^n (t = x), as well as Theorem 6.1 with j^0 f = graph(f). This assures that there is a generic subset P4 of P such that for any (c, B, b) ∈ P4 we have f ⋔ M^n(n − 1), or equivalently (cf. Ex. 5.9)
0 ≠ (d/dt) h(f(t̂)) = (d/dt) det B_{J0}(t̂).

Case J0 = J: In this case also these points x(t), t > t̂, are feasible for L(t) = L(c, B, b) (vertices) and the points (x(t), t) are of Type 1 with J(x(t),t) = J0 = J(x,t). With the same arguments as before (6.73), in view of rank B_{J0}(t̂) = n − 1, there is (up to a sign) a unique x̂, ‖x̂‖ = 1, satisfying

B_{J0}(t̂)x̂ = 0,  lim_{t↓t̂} x(t)/‖x(t)‖ = x̂.

Compared with (6.73) we must have x̂ = ±x̃. But since by (6.82) the value det B_{J0}(t) changes its sign when passing t̂, together with (6.81) we can conclude x̂ = −x̃, which proves the statement for J = J0.

Case J0 ⊊ J: Then we can take an index j0 ∈ J \ J0 (recall J0 = J(x,t)). For a generic subset P42 of P the solution x̃, ‖x̃‖ = 1, of B_{J0}(t̂)x̃ = 0 (cf. (6.72) and (6.73)) satisfies
(6.83)  B_{j0·}(t̂)x̃ ≠ 0.
Indeed, assuming that x̃ satisfies B_{J0}(t̂)x̃ = 0 and B_{j0·}(t̂)x̃ = 0, then the (n + 1) × n-matrix
[ B_{J0}(t̂) ; B_{j0·}(t̂) ]
has rank n − 1, i.e., (c, B, b) meets at t̂ a submanifold of codimension 2, which can generically be excluded. Now consider the curve (x(t), t), t ≠ t̂, of solutions of (see (6.79) and (6.82))
(6.84)  B_{J0}(t)x(t) = b_{J0}(t),  t ≠ t̂.
As shown above, it holds
(6.85)  lim_{t↑t̂} x(t)/‖x(t)‖ = x̃,  lim_{t↓t̂} x(t)/‖x(t)‖ = −x̃.
By (6.83) it follows |B_{j0·}(t̂)x̃| =: δ > 0. Suppose now B_{j0·}(t̂)x̃ = δ > 0. Then by continuity, using lim_{t→t̂} ‖x(t)‖ = ∞, it follows from (6.85) for t < t̂, t close to t̂,
B_{j0·}(t̂) x(t)/‖x(t)‖ ≥ δ/2, and B_{j0·}(t̂)x(t) ≥ (δ/2)‖x(t)‖ → ∞ for t ↑ t̂,
and the points x(t), t < t̂, t close to t̂, would be infeasible, contradicting our assumption that (x(t), t), t < t̂, are Type 1 points of L(c, B, b) (t̂ is a left-sided point of Type 4 at infinity). So we must have B_{j0·}(t̂)(−x̃) = δ > 0 and, with the same reasoning, in view of the second relation in (6.85), for t > t̂, t close to t̂, we find
B_{j0·}(t) x(t)/‖x(t)‖ ≥ δ/2, and B_{j0·}(t)x(t) ≥ (δ/2)‖x(t)‖ → ∞ for t ↓ t̂.
But this means that the points x(t), t > t̂, of the solution curve (x(t), t) of (6.84) become infeasible for t ↓ t̂. Consequently, in this case J0 ⊊ J, a two-sided point of Type 4 at infinity is generically excluded. □
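The mechanism in this proof, blow-up of the Cramer's-rule solution (6.80) combined with the sign change (6.82) of det B_{J0}(t), can be seen on a tiny instance. The data below are an assumption made only for this sketch, not an example from the text.

```python
# Hypothetical illustration of (6.79)-(6.82): B_{J0}(t) = [[1, 1], [1, 1 + t]],
# b_{J0} = (1, 2).  At t_hat = 0 we have rank B_{J0} = 1, rank (B_{J0} b_{J0}) = 2,
# and det B_{J0}(t) = t changes sign at t_hat.
import numpy as np

def branch(t):
    B = np.array([[1.0, 1.0], [1.0, 1.0 + t]])
    b = np.array([1.0, 2.0])
    return np.linalg.solve(B, b), np.linalg.det(B)   # cf. Cramer's rule (6.80)

for t in [-0.1, -0.01, 0.01, 0.1]:
    x, det = branch(t)
    print(f"t = {t:6.2f}  det = {det:6.2f}  x(t)/||x(t)|| = {x / np.linalg.norm(x)}")
# The normalized solution tends to +-(-1, 1)/sqrt(2) and flips sign when t passes 0,
# exactly the behaviour described in Theorem 6.15.
```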

REMARK 6.3. In the formulation of our main Theorem 6.14 we tacitly assumed m ≥ n. In the case m ≤ n − 1 there does not exist any Type 1 point (primal vertex). Note that for m < n − 1 generically no gc-point occurs at all, and for m = n − 1, generically, only for a discrete set of points t̂ ∈ R a line
S(t̂) = {x + τd | τ ∈ R}
of gc-points xτ = x + τd, τ ∈ R, xτ ∈ F(t̂), may appear (see (6.63)) such that with

J = J(xτ,t̂), |J| = n − 1, and some µj ∈ R, j ∈ J, it holds (see Figure 6.53)
c(t̂) + Σ_{j∈J} µj B_{j·}(t̂)^T = 0,  B_J(t̂)d = 0.

FIGURE 6.53. A line of gc-points S(t̂) for m = n − 1.

Let us finally comment on the generic behaviour of the set

Cm = {(x, t) | x is a local minimizer of L(t)} ⊂ Cgc .

Clearly, if (x, t) ∈ Cm is a gc-point of Type 1 of L(c, B, b), then generically there exists a maximal component of points of Type 1 in Cm, parameterized by a curve (x(t), t), t ∈ I(x,t), where I(x,t) ⊂ R is a maximal open interval.

Around a gc-point of Type 5, generically, the sets Cgc,Cm show a structure as for Q(t) or P (t) (cf., Figure 6.24, Figure 6.25, Figure 6.26, Figure 6.27).

Structure of Cm near a point t̂ of Type 4 at infinity.

Let us also look at the generic behavior of the set Cm around a point t̂ of Type 4 at infinity. So suppose the curve (x(t), t), t ∈ I(x,t), t < t̂, consists of Type 1 points corresponding to (global) minimizers of L(t) = L(c, B, b). Recall that the minimizers x(t) and corresponding multiplier vectors µ(t) are given as the unique solutions of

(6.86)  B_{J0}(t)x(t) = b_{J0}(t),
(6.87)  B_{J0}(t)^T µ(t) = −c(t),
where J0 = J(x,t), |J0| = n (see condition (4a) above). If t̂ is a one-sided (left-sided) point of Type 4 at infinity, then near t̂ the set Cm contains the curve (x(t), t), t ∈ I(x,t), as given in Figure 6.54.

FIGURE 6.54. Part of Cm near a left-sided point t̂ of Type 4 at infinity, where x(t) is a minimizer of L(t).

Now suppose that t̂ is a two-sided point of Type 4 at infinity. By Theorem 6.15 we must then have J0 = J. Recall that for such a point t̂ we have shown (see the proof of Theorem 6.14, part (b)) that generically
rank B_{J0}(t̂) = n − 1,  rank ( B_{J0}(t̂)  b_{J0}(t̂) ) = n
holds. In the same way we can prove that generically at t̂ the condition (see (6.87))
rank ( B_{J0}(t̂)^T  −c(t̂) ) = n
holds. The proof of Theorem 6.15 was based on the analysis of the solution x(t) of (6.86) for t ≠ t̂. The components xi(t) of x(t) are given in (6.80). We can apply the same arguments to the solution µ(t) of (6.87), given explicitly by
µi(t) = det B^i_{J0}(t)^T / det B_{J0}(t)^T,  t ≠ t̂, t ≈ t̂,

where B^i_{J0}(t)^T denotes the matrix obtained from B_{J0}(t)^T by replacing the i-th column B_{i·}(t)^T by −c(t). We then find the following: lim_{t↑t̂} ‖µ(t)‖ = ∞ and, for some µ̃, ‖µ̃‖ = 1,
lim_{t↑t̂} µ(t)/‖µ(t)‖ = µ̃,  lim_{t↓t̂} µ(t)/‖µ(t)‖ = −µ̃.
Since x(t), t < t̂, are minimizers, the condition µ(t) > 0 must hold for t < t̂, implying

µ̃ ≥ 0. So −µ̃ must contain a negative component −µ̃_{i0} < 0, and therefore µ_{i0}(t) < 0 must be valid for t > t̂, t close to t̂. But recalling that the solutions x(t) of (6.86) for t > t̂ are gc-points of L(t) with corresponding multipliers µ(t) given by (6.87), we conclude that these points x(t) "right of t̂" cannot be minimizers. Consequently the set Cm of minimizers of L(t) only contains the points (x(t), t), t ∈ I(x,t), t < t̂, as shown in Figure 6.55.

FIGURE 6.55. Part of Cm near a two-sided point t̂ of Type 4 at infinity, where x(t) is a minimizer of L(t). The points (x(t), t), t > t̂, belong to Cgc \ Cm.

Structure of Cm near a point (x̂, t̂) of Type 2*.
We finally analyse the generic behavior of the set Cm near a point (x̂, t̂) of Type 2* (see the definition of this Type, and the proof of Theorem 6.14, case (a4)).

We first consider the case τ0 < ∞: Here, with x̃ = x̂ + τ0 d (d in (6.65)), the set S(t̂) is given by
S(t̂) = {(1 − τ)x̂ + τx̃ | τ ∈ [0, 1]}.
For notational simplicity we put (see condition (2c) of Type 2*)

J0 = J(x̂,t̂) := {1, . . . , n},  j0 := 1,  j1 := n + 1,
J1 = J(x̃,t̂) := {n + 1, 2, . . . , n},  J2 := {2, . . . , n}.
So we have the following situation for t ≈ t̂:
(6.88)  rank B_{J0}(t) = rank B_{J1}(t) = n,  rank B_{J2}(t) = n − 1.

We now consider the unique solutions x⁰(t), x¹(t) of
(6.89)  B_{J0}(t)x⁰(t) = b_{J0}(t),  B_{J1}(t)x¹(t) = b_{J1}(t),  t ≈ t̂.
By construction, x⁰(t), x¹(t), t ≠ t̂, are Type 1 vertices of L(t) satisfying x⁰(t̂) = x̂, x¹(t̂) = x̃, with corresponding Lagrangean multipliers µ⁰(t), µ¹(t) given as unique solutions of
(6.90)  B_{J0}(t)^T µ⁰(t) = −c(t),  B_{J1}(t)^T µ¹(t) = −c(t).
There also is a unique vector y(t) solving
(6.91)  B_{J0}(t)^T y(t) = B_{n+1·}(t)^T.
By Cramer's rule the solutions µ⁰(t), µ¹(t), y(t) are given by
(6.92)  µ⁰_j(t) = det B^j_{J0}(t)^T / det B_{J0}(t)^T,  µ¹_j(t) = det B^j_{J1}(t)^T / det B_{J1}(t)^T,  y_j(t) = det B̃^j_{J0}(t)^T / det B_{J0}(t)^T,
where B^j_{J0}(t)^T, B^j_{J1}(t)^T, B̃^j_{J0}(t)^T, resp., are obtained by replacing the j-th column of B_{J0}(t)^T, B_{J1}(t)^T, B_{J0}(t)^T, resp., by −c(t), −c(t), B_{n+1·}(t)^T, respectively.
Assuming that the vertices x⁰(t), t < t̂, are minimizers of L(t), we must have
(6.93)  µ⁰_j(t) > 0, j = 2, . . . , n, t ≈ t̂,  µ⁰_1(t) > 0, t < t̂,  µ⁰_1(t̂) = 0.
The condition µ⁰_1(t̂) = 0 implies det B^1_{J0}(t̂)^T = 0. By definition (see (6.92)) we find
(6.94)  det B^1_{J0}(t)^T = det B^1_{J1}(t)^T and thus µ⁰_1(t̂) = µ¹_1(t̂) = 0,
and further, using (6.88), (6.91),
0 ≠ det B_{J1}(t)^T = det ( B_{n+1·}(t)^T  B_{2·}(t)^T ... B_{n·}(t)^T )
(6.95)  = det ( B_{J0}(t)^T y(t)  B_{2·}(t)^T ... B_{n·}(t)^T )
(6.96)  = y_1(t) det ( B_{1·}(t)^T  B_{2·}(t)^T ... B_{n·}(t)^T ) = y_1(t) det B_{J0}(t)^T,
and thus y_1(t) ≠ 0 (for t ≈ t̂) and then (cf. (6.92), (6.94)) µ¹_1(t) = µ⁰_1(t)/y_1(t). Also for j ≥ 2 (use (6.92)):
det B^j_{J1}(t)^T = det ( B_{n+1·}(t)^T  B_{2·}(t)^T .. −c(t) .. B_{n·}(t)^T )  (with −c(t) in the j-th column)
= det ( y_1(t)B_{1·}(t)^T + y_j(t)B_{j·}(t)^T  B_{2·}(t)^T .. −c(t) .. B_{n·}(t)^T )
= y_1(t) det B^j_{J0}(t)^T + y_j(t) det ( B_{j·}(t)^T  B_{2·}(t)^T .. −c(t) .. B_{n·}(t)^T ).
In view of (6.92), (6.96) this implies for j ≥ 2,
(6.97)  µ¹_j(t) = µ⁰_j(t) + (y_j(t)/y_1(t)) · det ( B_{j·}(t)^T  B_{2·}(t)^T .. −c(t) .. B_{n·}(t)^T ) / det B_{J0}(t)^T  (with −c(t) in the j-th column).

From (6.90), using µ⁰_1(t̂) = 0, it follows that
det ( B_{j·}(t̂)^T  B_{2·}(t̂)^T .. −c(t̂) .. B_{n·}(t̂)^T ) = 0  (with −c(t̂) in the j-th column),
so that from (6.97) by continuity we conclude (see (6.93))
(6.98)  µ¹_j(t) > 0, j = 2, . . . , n, for t ≈ t̂.
Finally, as in the proof of (6.82), we can generically assume
(d/dt) det B^1_{J0}(t̂)^T ≠ 0 and (d/dt) det B^1_{J1}(t̂)^T ≠ 0,
which means (cf. (6.92)) that the values of µ⁰_1(t), µ¹_1(t) change sign when passing t̂. More precisely,
µ⁰_1(t) > 0 for t < t̂ and < 0 for t > t̂;  µ¹_1(t) < 0 for t < t̂ and > 0 for t > t̂.
Together with (6.98) this shows:
for t < t̂: the x⁰(t) are minimizers, the x¹(t) are not minimizers;
for t > t̂: the x¹(t) are minimizers, the x⁰(t) are not minimizers,
so that near (x̂, t̂), (x̃, t̂) the set Cm (in blue) looks as sketched in Figure 6.56.

FIGURE 6.56. Set Cm around a Type 2* point (x̂, t̂) where S(t̂) is a line segment.

In case τ0 = ∞, the same holds for the solution x⁰(t) of (6.89) with corresponding multiplier µ⁰(t) in (6.90). Here we also have µ⁰_j(t) > 0, j ≥ 2, for t ≈ t̂, and µ⁰_1(t) changes sign at t̂ from positive to negative. So we obtain
(6.99)  µ⁰_1(t) < 0 for t > t̂.
Now we consider the (unique) solution d(t) of
B_{J0}(t)d(t) = (−1, 0, . . . , 0)^T ∈ R^n,  t ≈ t̂.

For t = t̂ the vector d(t̂) coincides with the direction d in (6.65). Then, with the gc-points x⁰(t) and the corresponding multipliers µ⁰(t) in (6.90), it follows, using (6.99),
c(t)^T d(t) = −µ⁰(t)^T B_{J0}(t)d(t) = µ⁰_1(t) < 0 for t > t̂.
This means that for t > t̂, t close to t̂, the vector d(t) is a descent direction for c(t)^T x at x⁰(t). Consequently, for t > t̂, the objective of the program L(t) is unbounded. In particular, near (x̂, t̂) the set Cm (in blue) looks as given in Figure 6.57.

FIGURE 6.57. Set Cm around a Type 2* point (x̂, t̂) where S(t̂) is a half-line.

CHAPTER 7

Appendix: Basics from linear algebra and analysis

To keep the booklet largely self-contained, in this appendix we list some facts from linear algebra and calculus which are used regularly.

Determinant, inverse of a matrix, Cramer's rule. For a matrix A = (aij) ∈ R^{n×n} we denote its determinant by det(A). Here is the well-known explicit Leibniz formula,
(7.1)  det(A) = Σ_{σ∈Πn} (sgn σ) a_{1σ(1)} a_{2σ(2)} ··· a_{nσ(n)},
where the sum is taken over all n! permutations σ of {1, . . . , n} and (sgn σ) = ±1 is the so-called sign of σ. Other well-known formulas are: det(A^T) = det(A) and, in case A is nonsingular,
(7.2)  det [ A  B ; C  D ] = det(A) · det(D − CA^{−1}B).
The latter formula follows from the relation
[ A  B ; C  D ] [ I  −A^{−1}B ; 0  I ] = [ A  0 ; C  D − CA^{−1}B ].
Also the next relation is standard:
det(A) ≠ 0 ⇔ the inverse A^{−1} of A exists.
For such an A we have
A^{−1} = (1/det(A)) A_adj, where A_adj = (m_{ij})^T
with minors m_{ij} = (−1)^{i+j} det(A(i, j)), and A(i, j) is the (n − 1) × (n − 1)-matrix obtained from A by deleting row i and column j. This formula implies, for invertible A, the following representation of the unique solution x of Ax = b:
(7.3)  x = A^{−1}b with components xi = det(Ai(b))/det(A)  (Cramer's rule),
where Ai(b) is the matrix obtained from A by replacing the i-th column A_{·i} of A by b.
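As a quick sanity check (an added illustration, assuming numpy is available), Cramer's rule (7.3) can be compared with a direct solve:

```python
# Numerical check of Cramer's rule (7.3) against a direct linear solve.
import numpy as np

def cramer_solve(A, b):
    """Solve A x = b componentwise via x_i = det(A_i(b)) / det(A)."""
    detA = np.linalg.det(A)
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b                  # replace the i-th column of A by b
        x[i] = np.linalg.det(Ai) / detA
    return x

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
print(cramer_solve(A, b))
print(np.linalg.solve(A, b))          # the two results agree up to rounding
```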


Eigenvalues. Given A ∈ M^n, a number λ ∈ C is called an eigenvalue of A if there exists a nonzero vector x ∈ C^n such that
Ax = λx.
Such a pair (x, λ) is also called an eigenpair of A. We recall that a real-valued matrix A ∈ M^n may have complex eigenvalues and eigenvectors. However, if A is a symmetric real-valued matrix, i.e., if A ∈ S^n, then all eigenvalues and eigenvectors are real. It is also well-known that for A ∈ S^n the eigenpairs (xi, λi) ∈ R^n × R, i = 1, . . . , n, can be chosen such that the eigenvectors xi ∈ R^n are orthonormal:
xi^T xj = 1 if i = j, and xi^T xj = 0 if i ≠ j.
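For symmetric matrices these properties are exactly what numpy.linalg.eigh returns; the following added snippet verifies the eigenpair relation and the orthonormality of the eigenvectors on a small example.

```python
# Eigenpairs of a symmetric matrix: numpy.linalg.eigh returns real eigenvalues and
# an orthonormal set of eigenvectors, as stated above.
import numpy as np

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])   # A in S^3
lam, X = np.linalg.eigh(A)            # columns of X are the eigenvectors x_i

print("eigenvalues:", lam)                                   # all real
print("max |A x_i - lambda_i x_i| =", np.max(np.abs(A @ X - X * lam)))
print("X^T X (should be the identity):\n", np.round(X.T @ X, 10))
```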

EX. 7.1. Let A ∈ S^n. Show that eigenvectors x1, x2 corresponding to eigenvalues λ1 ≠ λ2 are orthogonal.
Proof. The statement follows from (use A^T = A)
(λ1 − λ2) x1^T x2 = (λ1 x1)^T x2 − x1^T (λ2 x2) = (Ax1)^T x2 − x1^T (Ax2) = x1^T A^T x2 − x1^T A x2 = 0.
So λ1 ≠ λ2 implies x1^T x2 = 0. □

Mean value formula, Taylor's formula. We make use of the mean value formula (or first order Taylor formula) in different forms. For f : R^n → R, f ∈ C^1, a well-known form is
f(y) − f(x) = ∇f(x)(y − x) + o(‖y − x‖),
or, with some τ = τ(x, y), 0 < τ < 1,
f(y) − f(x) = ∇f(x + τ(y − x))(y − x).
From the Fundamental Theorem of Calculus we obtain the following integral form of the mean value formula.

LEMMA 7.1. [Mean value formula] Let f : D ⊂ R^n → R^m, D open, f ∈ C^1(D, R^m), and y, x ∈ D be given such that the whole line segment {x + τ(y − x) | τ ∈ [0, 1]} is contained in D. Then we have
f(y) − f(x) = ∫_0^1 ∇f(x + τ(y − x))(y − x) dτ,
and thus
(7.4)  ‖f(y) − f(x)‖ ≤ max_{0≤τ≤1} ‖∇f(x + τ(y − x))‖ ‖y − x‖.

Proof. Let f(x) = (fi(x), i = 1, . . . , m)^T and define gi : [0, 1] → R by

gi(τ) = fi(x + τ(y − x)).
Then the Fundamental Theorem of Calculus yields
gi(1) − gi(0) = ∫_0^1 g'_i(τ) dτ,
and then, by the chain rule,
fi(y) − fi(x) = gi(1) − gi(0) = ∫_0^1 g'_i(τ) dτ = ∫_0^1 ∇fi(x + τ(y − x))(y − x) dτ.
By definition of the Jacobian this leads to
f(y) − f(x) = ∫_0^1 ∇f(x + τ(y − x))(y − x) dτ
and
‖f(y) − f(x)‖ ≤ ∫_0^1 ‖∇f(x + τ(y − x))‖ ‖y − x‖ dτ ≤ max_{0≤τ≤1} ‖∇f(x + τ(y − x))‖ ‖y − x‖. □

We also state the second order Taylor formula for C^2-functions f : R^n → R near a point x̄ ∈ R^n in the form
(7.5)  f(x) = f(x̄) + ∇f(x̄)(x − x̄) + (1/2)(x − x̄)^T ∇²f(x̄)(x − x̄) + o(‖x − x̄‖²)
or, with some τ = τ(x, x̄), 0 < τ < 1,
f(x) = f(x̄) + ∇f(x̄)(x − x̄) + (1/2)(x − x̄)^T ∇²f(x̄ + τ(x − x̄))(x − x̄).
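The second order Taylor formula (7.5) can be checked numerically. In the added sketch below the function f(x) = e^{x1} sin(x2) and the step direction are arbitrary choices made only for the illustration.

```python
# Numerical check of the second order Taylor formula (7.5): the remainder
# f(x) - [f(xb) + grad(xb)(x - xb) + 0.5 (x - xb)^T H(xb) (x - xb)] is o(||x - xb||^2).
import numpy as np

f = lambda x: np.exp(x[0]) * np.sin(x[1])
grad = lambda x: np.array([np.exp(x[0]) * np.sin(x[1]), np.exp(x[0]) * np.cos(x[1])])
hess = lambda x: np.array([[np.exp(x[0]) * np.sin(x[1]), np.exp(x[0]) * np.cos(x[1])],
                           [np.exp(x[0]) * np.cos(x[1]), -np.exp(x[0]) * np.sin(x[1])]])

xb = np.array([0.3, 0.7])
d = np.array([1.0, -2.0])
for h in [1e-1, 1e-2, 1e-3]:
    x = xb + h * d
    taylor = f(xb) + grad(xb) @ (x - xb) + 0.5 * (x - xb) @ hess(xb) @ (x - xb)
    print(f"h = {h:.0e}   remainder / h^2 = {(f(x) - taylor) / h**2:.3e}")
# the ratio tends to 0, i.e. the remainder is o(||x - xb||^2)
```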

Inverse and Implicit Function Theorem. We give some other basic results from analysis (see, e.g., [34, p. 221] for a proof).

THEOREM 7.1. [Inverse Function Theorem] Let F : R^n → R^n be a C^k-function (k ≥ 1) and let x̄ ∈ R^n be such that the Jacobian ∇F(x̄) is nonsingular. Then there are open neighborhoods U of x̄ and V = F(U) of F(x̄) such that the restriction F : U → V has a C^k-inverse F^{−1} : V → U satisfying
∇F^{−1}(F(x)) = [∇F(x)]^{−1} for x ∈ U.

This Inverse Function Theorem is directly related to the notion of a (local) diffeomorphism. We recall that a C^1-function F : R^n → R^n is said to define a local diffeomorphism at x̄ ∈ R^n if there exists an open neighborhood U of x̄ such that F : U → V, with V := F(U), is bijective (thus the inverse F^{−1} exists) and such that the inverse F^{−1} : V → U is a C^1-function.

REMARK 7.1. Note that if F defines a local diffeomorphism at x̄ then, by differentiating the identity F^{−1}(F(x)) = x, x ∈ U, we obtain
∇F^{−1}(F(x)) · ∇F(x) = I,  x ∈ U.
So in particular the Jacobian ∇F(x̄) must be nonsingular. In view of the Inverse Function Theorem we thus have for C^1-functions F:
∇F(x̄) is nonsingular ⇒ F defines a local diffeomorphism at x̄.

We also present a version of the Implicit Function Theorem. Consider a system of n equations in n + p variables:
F(x, t) = 0, where F : R^n × R^p → R^n, (x, t) ∈ R^n × R^p.
The Implicit Function Theorem (IFT) makes a statement on the structure of the solution set in the "regular" situation.

THEOREM 7.2. [Implicit Function Theorem (IFT)] Let F : R^n × R^p → R^n be a C^1-function F(x, t) with (x, t) ∈ R^n × R^p. Suppose that for (x̄, t̄) ∈ R^n × R^p we have F(x̄, t̄) = 0 and the matrix ∇x F(x̄, t̄) is nonsingular. Then, in neighborhoods Ux̄ of x̄ and Ut̄ of t̄, the solution set S(F) := {(x, t) | F(x, t) = 0} is described by a C^1-function x : Ut̄ → R^n such that x(t̄) = x̄ and F(x(t), t) = 0 for all t ∈ Ut̄. More precisely,
S(F) ∩ (Ux̄ × Ut̄) = {(x(t), t) | t ∈ Ut̄}.
So, locally near (x̄, t̄), the set S(F) is a p-dimensional C^1-manifold. Moreover, the gradients ∇x(t) are given by
(7.6)  ∇x(t) = −[∇x F(x(t), t)]^{−1} ∇t F(x(t), t) for t ∈ Ut̄.
Proof. For a proof see, e.g., [34]. Note that if x(t) is a C^1-function satisfying F(x(t), t) = 0 for t ≈ t̄, then, by differentiation wrt. t, we find

∇x F(x(t), t) ∇x(t) + ∇t F(x(t), t) = 0,  t ≈ t̄,
which yields (7.6). □
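A one-dimensional illustration of the IFT and of formula (7.6) is easy to set up; in the added sketch below the choice F(x, t) = x³ + x − t is arbitrary, and the implicit derivative is compared with a finite-difference approximation.

```python
# Numerical illustration of (7.6) for n = 1, p = 1 with F(x, t) = x**3 + x - t.
# Since dF/dx = 3x**2 + 1 > 0, a C^1 solution curve x(t) exists everywhere and
# x'(t) = -[dF/dx]^{-1} dF/dt = 1 / (3 x(t)**2 + 1).

def x_of_t(t, x0=0.0, iters=50):
    """Solve F(x, t) = 0 by Newton's method (F is strictly increasing in x)."""
    x = x0
    for _ in range(iters):
        x -= (x**3 + x - t) / (3 * x**2 + 1)
    return x

t0, h = 2.0, 1e-6
x0 = x_of_t(t0)
dx_ift = 1.0 / (3 * x0**2 + 1)                          # formula (7.6)
dx_num = (x_of_t(t0 + h) - x_of_t(t0 - h)) / (2 * h)    # central difference
print(f"x(t0) = {x0:.6f}   x'(t0) by (7.6) = {dx_ift:.6f}   numerically = {dx_num:.6f}")
```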

EX. 7.2. Show that in a neighborhood of a solution (x̄, t̄) of the system F(x, t) = 0 satisfying the assumptions of Theorem 7.2, the solution set S(F) is a manifold of codimension n, and of dimension p, in R^n × R^p.

Bibliography

[1] Ahmad F., Still G., Maximization of homogeneous polynomials over the simplex and the sphere: structure, stability, and generic behavior. J. Optim. Theory Appl. 181, no. 3, 972-996, (2019).
[2] Bando S., Urakawa H., Generic properties of the eigenvalue of the Laplacian for compact Riemannian manifolds, Tohoku Math. Journal 35, 155-172, (1983).
[3] Benedetti R., Risler J-J., Real algebraic and semi-algebraic sets, Hermann Éditeurs des sciences et des Arts, Paris, (1990).
[4] Bertsimas D., Tsitsiklis J.N., Introduction to Linear Optimization, Athena Scientific Books, Belmont, (1997).
[5] Bomze I.M., Dickinson P.J.C., Still G., The structure of completely positive matrices according to their CP-rank and CP-plus-rank. Linear Algebra Appl. 482, 191-206, (2015).
[6] Bouza Allende G., Mathematical programs with equilibrium constraints: Solution techniques from parametric optimization. Thesis (Dr.) Universiteit Twente (The Netherlands), (2006).
[7] Bouza G., Guddat J., Still G., Critical sets in one-parametric mathematical programs with complementarity constraints. Optimization 57, no. 2, 319-336, (2008).
[8] Bouza G., Still, G., Solving bilevel programs with the KKT-approach. Math. Program. 138, no. 1-2, Ser. A, 309-332, (2013).
[9] Caron R., Traynor T., The zero set of a polynomial. (2005), www.uwindsor.ca/math/sites/uwindsor.../05-03.pdf
[10] Coste M., An introduction to semi-algebraic geometry. Report, Institut de Recherche Mathématique, Rennes, (2002), https://gcomte.perso.math.cnrs.fr.costelintrotosemialgeo.pdf
[11] Darinka D., On the singularities in linear one-parametric optimization problems. Optimization 22, no. 2, 193-219, (1991).
[12] Dür M., Jargalsaikhan B., Still G., Genericity results in linear conic programming - a tour d'horizon. Math. Oper. Res. 42, no. 1, 77-94, (2017).
[13] Faigle U., Kern W., Still G., Algorithmic Principles of Mathematical Programming, Kluwer, Dordrecht, (2002).
[14] Gibson C.G., Wirthmüller K., Du Plessis A.A., Looijenga E.J.N., Topological stability of smooth mappings. Lecture Notes in Math., vol. 552, Springer-Verlag, Berlin (1976).
[15] Günzel H., Jongen H.Th., Stein O., On the closure of the feasible set in generalized semi-infinite programming. CEJOR Cent. Eur. J. Oper. Res. 15, no. 3, 271-280, (2007).
[16] Guillemin V., Pollack A., Differential Topology. Prentice-Hall, Inc., Englewood Cliffs, (1974).
[17] Guddat J., Guerra Vazquez F., Jongen H.Th., Parametric optimization: singularities, pathfollowing and jumps. B. G. Teubner, Stuttgart; John Wiley & Sons, Ltd., Chichester, (1990).
[18] Hirsch, M.W., Differential Topology. Springer, Berlin, (1976).
[19] Jongen H.Th., Jonker P., Twilt, F., Nonlinear optimization in Rn. I. Morse theory, Chebyshev approximation. Methoden und Verfahren der Mathematischen Physik [Methods and Procedures in Mathematical Physics], 29. Verlag Peter D. Lang, Frankfurt am Main, (1983).
[20] Jongen H.Th., Jonker P., Twilt F., Nonlinear optimization in Rn. II. Transversality, flows, parametric aspects. Methoden und Verfahren der Mathematischen Physik [Methods and Procedures in Mathematical Physics], 32. Verlag Peter D. Lang, Frankfurt am Main, (1986).


[21] Jongen H.Th., Jonker P., Twilt F., Nonlinear Optimization in finite Dimensions. Kluwer, Dordrecht, (2000).
[22] Jongen H.Th., Jonker P., Twilt F., Critical sets in parametric optimization. Mathematical Programming 34, p. 333-353, (1986).
[23] Jongen H.Th., Jonker P., Twilt F., One-parameter families of optimization problems: equality constraints. Journal of Optimization Theory and Applications 45, No. 1, p. 141-160, (1986).
[24] Jongen H.Th., Stein O., On generic one-parametric semi-infinite optimization. SIAM J. Optim. 7, no. 4, 1103-1137, (1997).
[25] Jonker P., Still G., Twilt F., On the partition of real symmetric matrices according to the multiplicities of their eigenvalues. Control and Cybernetics, vol. 23, No. 1/2, 169-181, (1994).
[26] Jonker P., Pouw M., Still G., Twilt F., On the partition of real skew-symmetric matrices according to the multiplicities of their eigenvalues. Charlemagne and his heritage. 1200 years of civilization and science in Europe, Vol. 2 (Aachen, 1995), 439-454, Brepols, Turnhout, (1998).
[27] Jonker P., Still G., Twilt, F., One-parametric linear-quadratic optimization problems. in: Optimization with data perturbations, II. Ann. Oper. Res. 101, 221-253, (2001).
[28] Kato T., Perturbation theory for linear operators, Vol. I and Vol. II. Springer Verlag, Tokyo, (1966).
[29] Lancaster P., Tismenetsky M., The theory of matrices. Academic Press, Inc., Boston, (1985).
[30] Lee J.M., Introduction into smooth manifolds. Springer, (2003).
[31] Luenberger, D.G., Linear and Nonlinear Programming. Springer, 2d ed., New York, (2003).
[32] Minerbe V., An introduction to differential geometry, UPCM-Université Paris 6, (2015). https://webusers.imj-prg.fr
[33] Müger M., An introduction to Differential Topology, de Rham Theory and Morse Theory. University of Groningen, www.math.ru.nl/ mueger/diff−notes.pdf
[34] Rudin W., Principles of mathematical analysis (3d ed.). McGraw-Hill, (1976).
[35] Rupp T., Kuhn-Tucker curves for one-parametric semi-infinite programming. Optimization 20, no. 1, 61-77, (1989).
[36] Still G., How to split the eigenvalues of a one-parametric family of matrices. Optimization vol. 49, p. 387-403, (2001).
[37] Still G., Linear bilevel problems: genericity results and an efficient method for computing local minima. Math. Methods Oper. Res. 55, no. 3, 383-400, (2002).
[38] Scholtes S., Stöhr M., How stringent is the linear independence assumption for mathematical programs with complementarity constraints? Math. Oper. Res. 26, no. 4, 851-863, (2001).
[39] Tu L.W., An introduction into manifolds (second edition). Springer, (2010).
[40] Zeidler E., Nonlinear Functional Analysis and Applications IV. Springer (1988).
[41] Zwier G., Structural analysis in semi-infinite programming. Thesis (Dr.) Universiteit Twente (The Netherlands), 165 pp, (1987).

Index

active index set, 29, 46, 103 critical point equation, 90 algebraic set, 19 critical point set, 91 properties, 22 cycling, 31 reducible, 24 decomposition into irreducible sets, 24 determinant of A, 147 analytic function, 87 Leibniz formula, 147 diffeomorphism, 55 Baire space, 18 between manifolds, 59 band matrix, 68 local, 150 band width, 68 dimension of a manifold, 55 charts dual cone, 22 of a manifold, 58 eigenpair of A, 37, 148 codimension eigenvalue curves of a manifold, 55 of one-parametric families of matrices, 88 complementarity condition, 29 eigenvalue of A, 37, 148 completely positive matrix, 21 simple, 37 cone, 21 eigenvector of A, 37, 148 connected component of a manifold, 60 feasible set of a stratum, 64 in linear programming, 28 constraint qualification in nonlinear programming, 46 linear independence (LICQ), 46, 103 frontier condition, 64 Mangasarian Fromovitz Constraint for a Whitney regular stratification, 64 Qualification (MFCQ), 112 Fubini formula, 14 cover countable open cover of M, 59 general position, 74 open cover of M, 58 see transversal intersection, 74 Cramer’s rule, 147 generalized critical point (gc-point), 103, 128 critical direction, 47 of Type 2∗, 130 critical point, 103 of Types 1-5, 108, 115 for constrained programs, 95, 103 generic property, 2 for linear programs, 128 of a problem set P ⊂ Rn, 2 for quadratic programs, 120 weakly, 2 for unconstrained programs, 39 generic set nondegenerate, for constrained programs, 95, in Ck(Rn, R), 18 103 in function spaces, 15 nondegenerate, for unconstrained programs, vs open and dense, 15 39, 91 weakly, in Rn, 2 153 154 INDEX

in Rn, 2, 13 irreducible algebraic set, 24 in a Baire space, 18 ` n jet extension j f, 81 weakly, in R , 2 genericity jet space, 81 of det(A) 6= 0, 27 jet transversality theorem, 82 of rank (A) has full rank, 27 Karush-Kuhn-Tucker condition (KKT), 47 of LP with no degenerate vertex, 32 KKT point, 47 of simple eigenvalues, 37 genericity result Lagrangean function, 46, 103 for parametric families of symmetric matrices Lagrangean multipltipliers, 46 wrt. rank, 84 Lebesgue measure, 2 for a parametric system of equations, 99 zero, 11 for constrained programs, 52 Leibniz formula for eigenvalues of real symmetric matrices, 37 for det(A), 147 for LICQ, 50 LICQ for linear equations, 3 see constraint qualification, 46 for linear programs, 32 linear independence constraint qualification for nonlinear equations, 4 (LICQ), 46, 103 for one-parametric constrained programs, 107, linear program, 28 113 primal/dual pair, 28 for one-parametric families of symmetric local transformation principle, 61 matrices wrt. eigenvalues, 86 for one-parametric linear programs, 128, 133 Mangasarian Fromovitz Constraint Qualification for one-parametric quadratic programs, 120 (MFCQ) for one-parametric unconstrained programs, 97 see constraint qualification, 112 manifold for parametric constrained programs, 102 k for parametric unconstrained programs, 90 a C -manifold, 55 for points tˆof Type 3 and Type 4 at infinity, matrix norm, 8 122 mean value formula, 148 for systems of equations, 42, 43 measure zero, 11 f(S) for the Newton method, 46 of an image , 12 ˆ for two-sided points t at infinity, 123 Newton iteration, 39 for unconstrained programs, 40 convergence result, 39 of the 3 Types, 121 Newton method, 45 of the 5 Types, 114 for solving F (x) = 0, 46 two-sided point of Type 4 at infinity, 138 for solving unconstrained programs, 39 Grassmann manifold, 62 nondegenerate critical point for constrained programs, 103 Hausdorff space, 35 for unconstrained programs, 91 Hilbert Basis Theorem, 24 nondegenerate minimizer homogeneity condition for constrained programs, 50 for a stratification, 65 nonlinear constrained program, 46 normal space ideal of a minifold M, 57 of a ring, 25 of an algebraic set, 25 open cover of Rn, 35 prime ideal, 25 optimal value, 28 Implicit Function Theorem (IFT), 150 optimality conditions in Banach spaces, 48 for unconstrained optimization, 39 Inverse Function Theorem, 149 Karush-Kuhn-Tucker (KKT), 47 inverse of A, 147 necessary for constrained programs, 47 INDEX 155

sufficient for constrained programs, 47 of M n,m wrt. ranks, 66 orthonormal eigenvectors, 148 of Sn wrt. eigenvalues, 85 of Sn wrt. ranks, 67 paracompact of sets of matrices, 65 topological vector space, 35 Whitney regular, 61 parametric constrained program, 102 stratified set, 60 one-parametric, 107 stratum, 60 parametric Sard theorem, 36 strict complementarity (SC), 47 parametric symmetric eigenvalue problem, 85 for a critical point, 103, 120, 128 parametric unconstrained problems in LP, 29 one-parametric, 94 strong duality, 29 partition of S, 60 strong topology partition of unity k k n ∞ Cs -topology on C (R , R), 17 C -partition, 35 support of a function, 35 pathconnected, 60 Peano space filling curves, 12 tangent space, 47, 103 ˆ points t of Type 3 at infinity, 121 of a manifold M, 57 ˆ points t of Type 4 at infinity, 121 Tarski-Seidenberg, 21 polynomial, 18 Taylor expansion, 19, 87 set of zeroes, 27 Taylor formula, 148 prime ideal, 25 second order, 149 problem, 1 Thom’s transversality theorems, 81 class, 1 topological space data, 1 second countable, 56 instance, 1 topological vector space, 17, 35 set, 1 topology property k Cs -topology, 17 of a problem, see generic property, 1 transversal intersection, 74 reducible of two manifolds, 74 transversality of mappings, 76 algebraic set, 24 − regular value of f, 36 f t M, 77 residual set transversality theorem in a Baire space, 18 jet transversality, 82 jet transversality for stratifications, 82 Sard theorem turning point, 93, 94, 100, 110, 111 parametric version, 36 of the critical point set, 91 SC quadratic, 97 see strict complementarity, 47 second countable vertex, 29 topological vector space, 36 dual, 29 second order condition, 103 nondegenerate, 29 SOC, 47 primal, 29 for critical points, 120 vertices semi-algebraic set, 19 maximal number of, 31 connected components, 21 properties, 19 weak duality, 29 simplex method, 31 Whitney regular stratification, 61, 62 stability of a semi-algebraic set, 64 of a local minimizer, 47 of M n,m,Sn wrt. rank, 70 stratification of Sn wrt. eigenvalues, 86 a Ck-stratification, 60 of sets of matrices, 70 156 INDEX zeroes of polynomials, 27