Fakultät für Mathematik und Informatik

Preprint 2016-06

Susanne Franke, Patrick Mehlitz, Maria Pilecka

Optimality conditions for the simple convex bilevel programming problem in Banach spaces

ISSN 1433-9307

TU Bergakademie Freiberg

Fakultät für Mathematik und Informatik

Prüferstraße 9

09599 FREIBERG

http://tu-freiberg.de/fakult1

Publisher: Dean of the Fakultät für Mathematik und Informatik

Production: Medienzentrum der TU Bergakademie Freiberg

Abstract The simple convex bilevel programming problem is a convex minimization problem whose feasible set is the solution set of another convex optimization problem. Such problems appear frequently when searching for the projection of a certain point onto the solution set of another program. Due to the nature of the problem, Slater's constraint qualification generally fails to hold at any feasible point. Hence, one has to formulate weaker constraint qualifications or stationarity notions in order to state optimality conditions. In this paper, we use two different single-level reformulations of the problem, the optimal value and the Karush-Kuhn-Tucker approach, to derive optimality conditions for the original program. Since all these considerations are carried out in Banach spaces, the results are not limited to standard optimization problems in $\mathbb R^n$. Along the way, we introduce and discuss a certain concept of M-stationarity for mathematical programs with complementarity constraints in Banach spaces.

Keywords Bilevel Programming · Convex Programming · Constraint Qualifications · Mathematical Program with Complementarity Constraints · Programming in Banach Spaces

Mathematics Subject Classification (2000) 46N10 · 49K27 · 90C25 · 90C33

Dedicated to Professor Stephan Dempe on the occasion of his 60th birthday.

1 Introduction

Bilevel programming problems were first formulated by Stackelberg in his investigation of market economy in 1934 [49], whereas the first mathematical model can be found in [10]. Since then, bilevel problems have been studied thoroughly from both a theoretical point of view and with regard to applications as well as solution strategies, see the monographs [6,12,46]. In order to handle a bilevel programming problem mathematically, it is usually transformed into an optimization problem with only one level. Two such possibilities which we will use in our paper are the optimal value reformulation, introduced in [39], and the Karush-Kuhn-Tucker (KKT) approach, see for example [34]. Instead of the standard bilevel programming problem, which has one variable for the upper and one for the lower level, we investigate the special (simple) case where both levels share the same variable. Hence, consider the simple convex bilevel programming problem (SCBPP for short), which is given as stated below:

\[
F(x) \to \min, \qquad x \in \Psi := \operatorname{Argmin}\{f(y) \mid y \in \Theta\}. \tag{SCBPP}
\]

Here, $F, f\colon \mathcal X \to \mathbb R$ are convex, continuous, and sufficiently smooth functionals on a Banach space $\mathcal X$, while $\Theta \subseteq \mathcal X$ is a nonempty, closed, and convex set. Note that these assumptions guarantee that the set $\Psi$ is convex, i.e. (SCBPP) is a convex optimization problem. However, the derivation of optimality conditions is a challenging problem since most of the standard constraint qualifications generally fail to hold at any feasible point of (SCBPP). Problems of this type arise frequently when a best point (in a certain sense) among the optimal solutions of a convex optimization problem has to be found. Note that searching for an efficient solution of a convex biobjective optimization problem can also be reformulated as (SCBPP). However, solving (SCBPP) provides only special efficient solutions of such a vector optimization problem. Regarding the finite-dimensional situation, (SCBPP) was first introduced and investigated in [48], where an application of this problem is given as well. Further discussions and its relationship to standard bilevel programming problems can be found in [13]. In order to stay as close as possible to the terminology of bilevel programming, we call

\[
f(x) \to \min, \qquad x \in \Theta
\]

the lower level problem of (SCBPP). If the lower level feasible set $\Theta$ is described by conic constraints, it is not difficult to show, see Theorem 5.5, that (SCBPP) is equivalent to a certain mathematical program with complementarity constraints (MPCC for short) in Banach spaces, recently introduced and studied in [36, 51, 53]. These problems in finite-dimensional spaces have been the subject of theoretical research for many years. Since the usual KKT conditions for such problems are very restrictive, several different stationarity conditions have been defined, see [18, 41, 45, 60]. Furthermore, since some of the well-known constraint qualifications are violated at every feasible point of such problems, other regularity conditions for MPCCs have been introduced, cf. [17, 19]. Numerical approaches to MPCCs have been developed as well, see [29] and the references therein. For a systematic and comprehensive study of this topic, we refer the interested reader to [34]. Moreover, a study of MPCCs in very general settings can also be found in [38].

Due to the framework of the considered problem in Banach spaces, it is also connected to works on optimal control. Hierarchical programming in function spaces is the subject of discussion in [8,36,55,56]. The special case where the solution of a variational inequality in Sobolev spaces has to be controlled was investigated in [22, 27, 28, 31, 37, 50].

It turns out that the strong stationarity conditions of the surrogate problem are, in general, too strong to hold at the local minimizers of (SCBPP), see Section 5. Hence, weaker stationarity notions have to be considered. Therefore, in this paper, we consider a generalized concept of Mordukhovich-stationarity (M-stationarity for short), which is known to be weaker than strong stationarity for common finite-dimensional MPCCs.

This article is organized as follows: In Section 2, we introduce the notation and preliminary results we use throughout the paper. Section 3 is dedicated to the study of constraint qualifications for optimization problems in Banach spaces. Afterwards, MPCCs in Banach spaces are discussed in Section 4. Here, we recall the notions of weak and strong stationarity already known from [36,51]. Furthermore, we introduce a concept of M-stationarity which is shown to be reasonable in the case where the cone which generates the complementarity constraint is polyhedral. Finally, we consider (SCBPP) in Section 5. The lower level optimal objective value and the lower level KKT conditions of (SCBPP) are used to state two single-level surrogate problems which are exploited to derive optimality conditions for (SCBPP). The aforementioned KKT approach leads to necessary optimality conditions of M-stationarity-type.

S. Franke, P. Mehlitz (corresponding author), and M. Pilecka
Technische Universität Bergakademie Freiberg, Faculty of Mathematics and Computer Science, 09596 Freiberg, Germany
E-mail: {susanne.franke,mehlitz,pilecka}@math.tu-freiberg.de
The work of the first author has been supported by the Deutsche Forschungsgemeinschaft, grant DE-650/7-1.
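To make the structure of (SCBPP) and the failure of Slater's condition mentioned in the introduction more tangible, the following minimal sketch treats a toy instance in $\mathbb R^2$. The instance is our own assumption and not taken from the paper: $\Theta = \mathbb R^2$, $f(y) = (y_1 + y_2 - 1)^2$, and $F(x) = \tfrac12\|x\|^2$, so that (SCBPP) asks for the minimum-norm lower level solution. All function names are hypothetical.

```python
def lower_level_objective(y):
    """f(y) = (y1 + y2 - 1)^2; convex with minimal value 0 on the line y1 + y2 = 1."""
    return (y[0] + y[1] - 1.0) ** 2


def lower_level_gradient(y):
    """Gradient of f at y."""
    r = 2.0 * (y[0] + y[1] - 1.0)
    return (r, r)


def project_onto_solution_set(z):
    """Euclidean projection of z onto Psi = {y : y1 + y2 = 1}."""
    shift = (z[0] + z[1] - 1.0) / 2.0
    return (z[0] - shift, z[1] - shift)


# Upper level: F(x) = 0.5 * ||x - z||^2 with z = 0 (assumed toy choice),
# so the solution of (SCBPP) is the projection of the origin onto Psi.
x_bar = project_onto_solution_set((0.0, 0.0))
print(x_bar)  # (0.5, 0.5)

# Optimal value reformulation: the constraint reads f(x) - f_star <= 0 with f_star = 0.
# A Slater point would need f(x) - f_star < 0, which is impossible for a minimal value.
f_star = 0.0
grid = [(-1.0 + 0.25 * i, -1.0 + 0.25 * j) for i in range(13) for j in range(13)]
has_slater_point = any(lower_level_objective(y) - f_star < 0.0 for y in grid)
print(has_slater_point)  # False

# The gradient of the constraint function vanishes on Psi.
print(lower_level_gradient(x_bar))  # (0.0, 0.0)
```

The sampled grid only illustrates the absence of a Slater point; the actual argument is that $f(x) - f^* \geq 0$ holds for every $x \in \Theta$ by definition of the optimal value $f^*$.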

2 Notation and preliminary results

2.1 Notation

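Let us start this section by introducing the notation we exploit in this paper. We use $\mathbb N$, $\mathbb R$, $\overline{\mathbb R}$, $\mathbb R^n$, $\mathbb R^{n,+}_0$, $\mathbb R^{n\times m}$, and $\mathcal S_n$ to denote the natural numbers (without zero), the real numbers, the extended real line, the set of all real vectors with $n$ components, the cone of all vectors from $\mathbb R^n$ possessing nonnegative components, the set of all real matrices with $n$ rows and $m$ columns, and the set of all symmetric matrices from $\mathbb R^{n\times n}$, respectively. Furthermore, we stipulate $\mathbb R^+_0 := \mathbb R^{1,+}_0$. For an arbitrary matrix $Q \in \mathbb R^{n\times m}$ and an index set $I \subseteq \{1,\dots,n\}$, $Q_I \in \mathbb R^{|I|\times m}$ denotes the matrix composed of the rows of $Q$ whose indices come from $I$. Furthermore, $\mathbf O \in \mathbb R^{n\times m}$ and $\mathbf I \in \mathbb R^{n\times n}$ represent the zero matrix and the identity matrix of appropriate dimension, respectively. By $\mathcal S_n^+$ and $\mathcal S_n^-$ we denote the cone of all positive and negative semidefinite matrices from $\mathcal S_n$, respectively.

The forthcoming definitions and results in this section are based on [5, 22, 38, 40]. Let $\mathcal X$ be a real Banach space with norm $\|\cdot\|_{\mathcal X}$ and zero vector $0$, and let $A \subseteq \mathcal X$ be nonempty. We denote by $\operatorname{lin}(A)$, $\operatorname{cone}(A)$, $\operatorname{conv}(A)$, $\operatorname{int}(A)$, and $\operatorname{cl}(A)$ the smallest subspace of $\mathcal X$ containing $A$, the smallest convex cone containing $A$, the convex hull of $A$, the interior of $A$, and the closure of $A$, respectively. The indicator function $\delta_A\colon \mathcal X \to \overline{\mathbb R}$ of the set $A$ is defined by

\[
\forall x \in \mathcal X\colon \quad \delta_A(x) := \begin{cases} 0 & \text{if } x \in A, \\ +\infty & \text{if } x \notin A. \end{cases}
\]

The (topological) dual space of $\mathcal X$ is denoted by $\mathcal X^\star$, and $\langle\cdot,\cdot\rangle\colon \mathcal X \times \mathcal X^\star \to \mathbb R$ is the corresponding dual pairing. Now fix some $x \in A$. Then we define the radial cone, the tangent (or Bouligand) cone, and the weak tangent cone to $A$ at $x$ as stated below:

\[
\begin{aligned}
\mathcal R_A(x) &:= \{d \in \mathcal X \mid \exists \alpha > 0\ \forall t \in (0,\alpha)\colon x + td \in A\}, \\
\mathcal T_A(x) &:= \{d \in \mathcal X \mid \exists \{d_k\} \subseteq \mathcal X\ \exists \{t_k\} \subseteq \mathbb R\colon d_k \to d,\ t_k \searrow 0,\ x + t_k d_k \in A\ \forall k \in \mathbb N\}, \\
\mathcal T^w_A(x) &:= \{d \in \mathcal X \mid \exists \{d_k\} \subseteq \mathcal X\ \exists \{t_k\} \subseteq \mathbb R\colon d_k \rightharpoonup d,\ t_k \searrow 0,\ x + t_k d_k \in A\ \forall k \in \mathbb N\}.
\end{aligned}
\]

Here, $\to$ and $\rightharpoonup$ denote the norm and the weak convergence in $\mathcal X$, respectively. Furthermore, we will use $\overset{\star}{\rightharpoonup}$ to denote the weak-$\star$-convergence in $\mathcal X^\star$. Observe that the cone $\mathcal T_A(x)$ is always a closed set which satisfies

\[
\mathcal R_A(x) \subseteq \mathcal T_A(x) \subseteq \mathcal T^w_A(x).
\]

Let $\mathbb B_{\mathcal X}$ be the closed unit ball of $\mathcal X$, while for any $\varepsilon > 0$ and $x \in \mathcal X$, we use $\mathbb U^\varepsilon_{\mathcal X}(x)$ and $\mathbb B^\varepsilon_{\mathcal X}(x)$ to denote the open and closed $\varepsilon$-ball around $x$, respectively. Let $B \subseteq \mathcal X^\star$ be nonempty. Then $\operatorname{cl}^\star(B)$ denotes the weak-$\star$-closure of $B$, i.e. the closure with respect to (w.r.t.) the weak-$\star$-topology. We define the polar cones and annihilators of the sets $A$ and $B$ by

\[
\begin{aligned}
A^\circ &:= \{x^* \in \mathcal X^\star \mid \forall x \in A\colon \langle x, x^*\rangle \le 0\}, &\qquad B_\circ &:= \{x \in \mathcal X \mid \forall x^* \in B\colon \langle x, x^*\rangle \le 0\}, \\
A^\perp &:= \{x^* \in \mathcal X^\star \mid \forall x \in A\colon \langle x, x^*\rangle = 0\}, &\qquad B_\perp &:= \{x \in \mathcal X \mid \forall x^* \in B\colon \langle x, x^*\rangle = 0\},
\end{aligned}
\]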

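respectively. In the case where $\mathcal X$ is reflexive, i.e. if $\mathcal X \cong \mathcal X^{\star\star}$ is valid, we can exploit $B_\circ = B^\circ$ and $B_\perp = B^\perp$. For the purpose of simplicity, we omit curly brackets when considering singletons, i.e. we set $\operatorname{lin}(x) := \operatorname{lin}(\{x\})$ and $x^\perp := \{x\}^\perp$ for any $x \in \mathcal X$, and similar definitions shall hold for all $x^* \in \mathcal X^\star$.

Now $\mathcal X$ and $\mathcal Y$ may be arbitrary Banach spaces again. Then the product space $\mathcal X \times \mathcal Y$ is a Banach space, too, when, e.g., equipped with the sum norm induced by $\|\cdot\|_{\mathcal X}$ and $\|\cdot\|_{\mathcal Y}$. In the case where $\mathcal Y = \mathcal X$ holds, we use $\mathcal X^2 := \mathcal X \times \mathcal X$, and the components of any $x \in \mathcal X^2$ are addressed by $x_1, x_2 \in \mathcal X$. We exploit a similar notation for all $n \in \mathbb N$ satisfying $n \ge 3$ as well as for arbitrary subsets $A \subseteq \mathcal X$, i.e., $A^n$ denotes the Cartesian product of order $n$ of $A$.

For a closed, convex set $C \subseteq \mathcal X$ and a fixed point $x \in C$, we obtain

\[
\mathcal R_C(x) = \operatorname{cone}(C - \{x\}) \qquad \text{and} \qquad \mathcal T_C(x) = \operatorname{cl}(\mathcal R_C(x)),
\]

respectively. Moreover, if $x^* \in \mathcal T_C(x)^\circ$ is chosen, then the critical cone to $C$ w.r.t. $x$ and $x^*$ is defined by

\[
\mathcal K_C(x, x^*) := \mathcal T_C(x) \cap (x^*)^\perp.
\]

In the case that $C$ is additionally a cone, we easily obtain

\[
\mathcal R_C(x) = C + \operatorname{lin}(x), \qquad \mathcal T_C(x) = \operatorname{cl}(C + \operatorname{lin}(x)), \qquad \mathcal T_C(x)^\circ = C^\circ \cap x^\perp.
\]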
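As a small finite-dimensional illustration of the three formulas for cones (our own example, not part of the original text, with $\mathbb R^2$ identified with its dual), one may consider the nonnegative orthant and a point on its boundary:

```latex
Let $\mathcal X = \mathbb R^2$, $C = \mathbb R^{2,+}_0$, and $x = (1,0) \in C$. Then
$\operatorname{lin}(x) = \mathbb R \times \{0\}$ and
\[
  \mathcal R_C(x) = C + \operatorname{lin}(x) = \mathbb R \times [0,\infty),
  \qquad
  \mathcal T_C(x) = \operatorname{cl}\bigl(\mathcal R_C(x)\bigr) = \mathbb R \times [0,\infty).
\]
Since $C^\circ = (-\infty,0]^2$ and $x^\perp = \{0\} \times \mathbb R$, we obtain
\[
  \mathcal T_C(x)^\circ = C^\circ \cap x^\perp = \{0\} \times (-\infty,0],
\]
which is indeed the polar cone of $\mathbb R \times [0,\infty)$.
For $x^* = (0,-1) \in \mathcal T_C(x)^\circ$, the critical cone reads
\[
  \mathcal K_C(x,x^*) = \mathcal T_C(x) \cap (x^*)^\perp = \mathbb R \times \{0\}.
\]
```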

The (not necessarily conic) set $C$ is called polyhedric w.r.t. $(x, x^*)$, where $x \in C$ and $x^* \in \mathcal T_C(x)^\circ$ hold, if

\[
\operatorname{cl}\bigl(\mathcal R_C(x) \cap (x^*)^\perp\bigr) = \mathcal K_C(x, x^*)
\]

is satisfied. Moreover, $C$ is called polyhedric if it is polyhedric w.r.t. all points $(x, x^*)$ satisfying $x \in C$ and $x^* \in \mathcal T_C(x)^\circ$. The concept of polyhedricity was introduced in [22,37] and was recently studied in [52].

Recall that $C$ is called polyhedral if there exist functionals $x_1^*, \dots, x_n^* \in \mathcal X^\star$ and scalars $\beta_1, \dots, \beta_n \in \mathbb R$ such that $C$ possesses the representation

\[
C = \{x \in \mathcal X \mid \langle x, x_i^*\rangle \le \beta_i\ \forall i \in \{1,\dots,n\}\}.
\]

Since the radial cone to a polyhedral set at an arbitrary point is already closed, polyhedral sets are polyhedric everywhere, see Lemma 2.3 as well.

For a Banach space $\mathcal X$, a closed (but not necessarily convex) set $D \subseteq \mathcal X$, and a point $x \in D$, we define the set of all Fréchet $\sigma$-normals to $D$ at $x$ as stated below:

\[
\widehat{\mathcal N}^\sigma_D(x) := \left\{ x^* \in \mathcal X^\star \,\middle|\, \limsup_{y \to x,\ y \in D} \frac{\langle y - x, x^*\rangle}{\|y - x\|_{\mathcal X}} \le \sigma \right\}.
\]

The closed, convex cone $\widehat{\mathcal N}_D(x) := \widehat{\mathcal N}^0_D(x)$ is called the Fréchet normal cone to $D$ at $x$. If $\mathcal X$ is reflexive, $\widehat{\mathcal N}_D(x) = \mathcal T^w_D(x)^\circ$ follows from [38, Theorem 1.10]. Next, we introduce the basic (or limiting, Mordukhovich) normal cone to $D$ at $x$ by

\[
\mathcal N_D(x) := \left\{ x^* \in \mathcal X^\star \,\middle|\,
\begin{aligned}
&\exists \{\sigma_k\} \subseteq \mathbb R\ \exists \{x_k\} \subseteq D\ \exists \{x_k^*\} \subseteq \mathcal X^\star\colon \\
&\sigma_k \searrow 0,\ x_k \to x,\ x_k^* \overset{\star}{\rightharpoonup} x^*,\ x_k^* \in \widehat{\mathcal N}^{\sigma_k}_D(x_k)\ \forall k \in \mathbb N
\end{aligned}
\right\}.
\]

From [38, Theorem 2.35], we can fix $\sigma_k$ to be zero in the above definition as long as $\mathcal X$ is an Asplund space, i.e. a Banach space whose separable subspaces possess separable duals. Note that any reflexive Banach space possesses the Asplund property. Furthermore, we can replace the weak-$\star$-convergence in $\mathcal X^\star$ by the weak convergence in the setting of reflexive Banach spaces. In the case of the set $D$ being convex, we have

\[
\mathcal N_D(x) = \widehat{\mathcal N}_D(x) = \mathcal T_D(x)^\circ.
\]

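We call $D$ sequentially normally compact (SNC for short) at $x$ if for any sequences $\{\sigma_k\} \subseteq \mathbb R$, $\{x_k\} \subseteq D$, and $\{x_k^*\} \subseteq \mathcal X^\star$ such that $x_k^* \in \widehat{\mathcal N}^{\sigma_k}_D(x_k)$ is valid for all $k \in \mathbb N$ and $\sigma_k \searrow 0$, $x_k \to x$, as well as $x_k^* \overset{\star}{\rightharpoonup} 0$ hold, $x_k^* \to 0$ is satisfied. Again, we can fix $\sigma_k$ to zero in the Asplund setting. Obviously, any subset of a finite-dimensional Banach space is SNC at any of its points. On the other hand, a singleton in $\mathcal X$ is SNC if and only if $\mathcal X$ is finite-dimensional, see [38, Theorem 1.21].

Let $\mathcal U$, $\mathcal V$, and $\mathcal W$ be arbitrary Banach spaces. By $\mathbb L[\mathcal U, \mathcal V]$ we denote the space of bounded linear operators mapping from $\mathcal U$ to $\mathcal V$. For any operator $\mathsf F \in \mathbb L[\mathcal U, \mathcal V]$, the operator $\mathsf F^\star \in \mathbb L[\mathcal V^\star, \mathcal U^\star]$ denotes the adjoint of $\mathsf F$. The identity mapping of $\mathcal U$, denoted by $\mathsf I_{\mathcal U}$, is always a continuous operator as long as $\mathcal U$ is equipped with the same norm in its domain and image space. For another operator $\mathsf G \in \mathbb L[\mathcal U, \mathcal W]$, the linear operator $(\mathsf F, \mathsf G) \in \mathbb L[\mathcal U, \mathcal V \times \mathcal W]$ is defined as stated below:

\[
\forall u \in \mathcal U\colon \quad (\mathsf F, \mathsf G)[u] := (\mathsf F[u], \mathsf G[u]).
\]

Let $\Gamma\colon \mathcal U \to 2^{\mathcal V}$ be an arbitrary set-valued mapping. Then $\operatorname{gph}\Gamma := \{(u,v) \in \mathcal U \times \mathcal V \mid v \in \Gamma(u)\}$ denotes its graph. Recall that $\Gamma$ is called Lipschitz-like at some point $(\bar u, \bar v) \in \operatorname{gph}\Gamma$ if there are constants $\varepsilon > 0$, $\rho > 0$, and $L > 0$ such that

\[
\forall u, u' \in \mathbb U^\varepsilon_{\mathcal U}(\bar u)\colon \quad \Gamma(u) \cap \mathbb U^\rho_{\mathcal V}(\bar v) \subseteq \Gamma(u') + L\|u - u'\|_{\mathcal U}\, \mathbb B_{\mathcal V}
\]

is satisfied. If the above condition holds whenever $u' := \bar u$ is fixed, then $\Gamma$ is called calm at $(\bar u, \bar v)$. Obviously, calmness is implied by the Lipschitz-like property. Note that the latter condition is also known as pseudo Lipschitz or Aubin property, see [12, 25].

Assume that $\mathcal U$ and $\mathcal V$ are reflexive Banach spaces. The normal coderivative $D^*_N\Gamma(\bar u, \bar v)\colon \mathcal V^\star \to 2^{\mathcal U^\star}$ is defined by

\[
\forall v^* \in \mathcal V^\star\colon \quad D^*_N\Gamma(\bar u, \bar v)(v^*) = \{u^* \in \mathcal U^\star \mid (u^*, -v^*) \in \mathcal N_{\operatorname{gph}\Gamma}(\bar u, \bar v)\}.
\]

We call $\Gamma$ partially sequentially normally compact (PSNC for short) at the point $(\bar u, \bar v) \in \operatorname{gph}\Gamma$ if for any sequences $\{(u_k, v_k)\} \subseteq \operatorname{gph}\Gamma$ and $\{(u_k^*, v_k^*)\} \subseteq \mathcal U^\star \times \mathcal V^\star$ which satisfy $(u_k, v_k) \to (\bar u, \bar v)$, $u_k^* \rightharpoonup 0$, $v_k^* \to 0$, and $(u_k^*, v_k^*) \in \widehat{\mathcal N}_{\operatorname{gph}\Gamma}(u_k, v_k)$ for any $k \in \mathbb N$, we obtain $u_k^* \to 0$. Note that if $\Gamma$ is PSNC at $(\bar u, \bar v)$, satisfies $D^*_N\Gamma(\bar u, \bar v)(0) = \{0\}$, and possesses a closed graph in a neighborhood of $(\bar u, \bar v)$, then $\Gamma$ is Lipschitz-like at the latter point, see [38, Theorem 4.10] where a weaker condition w.r.t. the so-called mixed coderivative has been formulated.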

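We drop the reflexivity assumption on $\mathcal U$ and $\mathcal V$. Let $\varphi\colon \mathcal U \to \overline{\mathbb R}$ be a convex functional and $\bar u \in \mathcal U$ be a point where $|\varphi(\bar u)| < \infty$ is satisfied. Then

\[
\partial\varphi(\bar u) := \{u^* \in \mathcal U^\star \mid \varphi(u) \ge \varphi(\bar u) + \langle u - \bar u, u^*\rangle\ \forall u \in \mathcal U\}
\]

denotes the subdifferential of $\varphi$ at $\bar u$. For a nonempty, closed, and convex set $S \subseteq \mathcal U$, the functional $\delta_S$ is convex, and it is easy to see that for any point $\bar u \in S$, we have $\partial\delta_S(\bar u) = \mathcal T_S(\bar u)^\circ$. For some convex cone $K \subseteq \mathcal V$, we say that a mapping $\psi\colon \mathcal U \to \mathcal V$ is $K$-convex if the following condition holds:

\[
\forall u, u' \in \mathcal U\ \forall \sigma \in (0,1)\colon \quad -\sigma\psi(u) - (1-\sigma)\psi(u') + \psi(\sigma u + (1-\sigma)u') \in K.
\]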

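Note that for a $K$-convex mapping, the set $\{u \in \mathcal U \mid \psi(u) \in K\}$ is convex.

Suppose that $\theta\colon \mathcal U \to \mathcal V$ is a twice Fréchet differentiable mapping and $\bar u \in \mathcal U$ is fixed. Then the linear operators $\theta'(\bar u) \in \mathbb L[\mathcal U, \mathcal V]$ and $\theta^{(2)}(\bar u) \in \mathbb L[\mathcal U, \mathbb L[\mathcal U, \mathcal V]]$ denote its first and second order Fréchet derivative at $\bar u$, respectively. For any $\xi \in \mathcal V^\star$, we use $\langle\theta^{(2)}(\bar u), \xi\rangle$ to represent the linear operator in $\mathbb L[\mathcal U, \mathcal U^\star]$ defined below:

\[
\forall d_u \in \mathcal U\colon \quad \langle\theta^{(2)}(\bar u), \xi\rangle[d_u] := \bigl(\theta^{(2)}(\bar u)[d_u]\bigr)^\star[\xi] = \xi \circ \theta^{(2)}(\bar u)[d_u].
\]

Here, $\circ$ denotes the composition of mappings.

For some bounded domain $\Omega \subseteq \mathbb R^d$, $C(\Omega)$ denotes the Banach space of all scalar continuous functions on $\operatorname{cl}(\Omega)$ equipped with the maximum norm. We identify $\Omega$ with the measure space $(\Omega, \Sigma, |\cdot|)$ where $|\cdot|$ denotes the Lebesgue measure and $\Sigma$ represents the $\sigma$-algebra of all Lebesgue-measurable subsets of $\Omega$. Fix some constant $p \in [1,\infty]$. Then $L^p(\Omega)$ denotes the Banach space of (equivalence classes of) measurable functions on $\Omega$ equipped with the norm

\[
\forall u \in L^p(\Omega)\colon \quad \|u\|_{L^p(\Omega)} := \left(\int_\Omega |u(\omega)|^p \,\mathrm d\omega\right)^{\frac1p}
\]

in the case $p \in [1,\infty)$ and

\[
\forall u \in L^\infty(\Omega)\colon \quad \|u\|_{L^\infty(\Omega)} := \inf_{N \in \Sigma,\ |N| = 0}\ \sup_{\omega \in \Omega \setminus N} |u(\omega)|
\]

otherwise. For $p \in [1,\infty)$, the dual space of $L^p(\Omega)$ is $L^{p'}(\Omega)$ where $p' \in (1,\infty]$ denotes the so-called conjugate coefficient of $p$ characterized via $1/p + 1/p' = 1$. The corresponding dual pairing is given by

\[
\forall u \in L^p(\Omega)\ \forall v \in L^{p'}(\Omega)\colon \quad \langle u, v\rangle := \int_\Omega u(\omega)v(\omega)\,\mathrm d\omega.
\]

If $p \in (1,\infty)$ is valid, $L^p(\Omega)$ is reflexive. For any set $S \in \Sigma$, $\chi_S\colon \Omega \to \mathbb R$ denotes the characteristic function of $S$ given by

\[
\forall \omega \in \Omega\colon \quad \chi_S(\omega) := \begin{cases} 1 & \text{if } \omega \in S, \\ 0 & \text{if } \omega \notin S. \end{cases}
\]

Clearly, we have $\chi_S \in L^p(\Omega)$ for all $p \in [1,\infty]$. Moreover, $\|\chi_S\|_{L^p(\Omega)} = |S|^{1/p}$ is valid for any $p \in [1,\infty)$. For some $p \in (1,\infty)$, $W^{1,p}(\Omega)$ denotes the Sobolev space of all first order weakly differentiable functions $u$ from $L^p(\Omega)$ whose weak derivatives $\partial_{\omega_1}u, \dots, \partial_{\omega_d}u$ belong to $L^p(\Omega)$ as well. We equip this space with the norm defined below:

\[
\forall u \in W^{1,p}(\Omega)\colon \quad \|u\|_{W^{1,p}(\Omega)} := \left(\|u\|^p_{L^p(\Omega)} + \sum_{i=1}^d \|\partial_{\omega_i}u\|^p_{L^p(\Omega)}\right)^{\frac1p}.
\]

A detailed introduction to these and more general function spaces can be found in [3].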
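The identity $\|\chi_S\|_{L^p(\Omega)} = |S|^{1/p}$ is used repeatedly later on. The following minimal sketch checks it numerically under our own assumptions $\Omega = (0,1)$ and $S = (0.25, 0.75)$, with the integral replaced by a midpoint rule; the function names are hypothetical.

```python
def lp_norm_midpoint(u, p, n=1000):
    """Approximate the L^p(0,1) norm of u by the midpoint rule with n cells."""
    h = 1.0 / n
    total = sum(abs(u((i + 0.5) * h)) ** p for i in range(n)) * h
    return total ** (1.0 / p)


def chi_S(omega):
    """Characteristic function of the assumed set S = (0.25, 0.75)."""
    return 1.0 if 0.25 < omega < 0.75 else 0.0


for p in (1, 2, 4):
    approx = lp_norm_midpoint(chi_S, p)
    exact = 0.5 ** (1.0 / p)  # |S|^(1/p) with |S| = 0.5
    print(p, approx, exact)
```

Since no midpoint of the grid coincides with a boundary point of $S$, the quadrature is exact up to rounding in this particular example.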

2.2 Preliminary results

In this section, we present all supplementary results which are used in this paper.

2.2.1 Some facts on bounded linear operators

The first part of the following result is called the Generalized Farkas Lemma and is stated in a more abstract setting, e.g., in [20, Theorem 1, Lemma 3].

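Lemma 2.1 Let $\mathcal X$, $\mathcal Y$ be Banach spaces, and suppose that $K \subseteq \mathcal Y$ is a nonempty, closed, convex cone. Assume that $\varphi\colon \mathcal X \to \overline{\mathbb R}$ is a positively homogeneous and convex functional while $\mathsf A \in \mathbb L[\mathcal X, \mathcal Y]$ is a bounded linear operator.

1. For any $\xi \in \mathcal X^\star$, the following assertions are equivalent:
(a) $\xi \in \operatorname{cl}^\star\bigl(\partial\varphi(0) + \mathsf A^\star[K^\circ]\bigr)$,
(b) $\forall x \in \mathcal X\colon \mathsf A[x] \in K \Longrightarrow \langle x, \xi\rangle \le \varphi(x)$.

Particularly, setting $\varphi(\cdot) := \delta_{\mathcal T_S(\bar x)}(\cdot)$ (indicator function of the tangent cone to $S$ at $\bar x$) for some closed, convex set $S \subseteq \mathcal X$ and $\bar x \in S$, we have

\[
\operatorname{cl}^\star\bigl(\mathcal T_S(\bar x)^\circ + \mathsf A^\star[K^\circ]\bigr) = \{x \in \mathcal T_S(\bar x) \mid \mathsf A[x] \in K\}^\circ.
\]

2. Suppose that $\mathsf A[\mathcal X] - K = \mathcal Y$ is satisfied. Then we have

\[
\mathsf A^\star[K^\circ] = \{x \in \mathcal X \mid \mathsf A[x] \in K\}^\circ.
\]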

Note that if $\mathcal X$ is reflexive, then we can replace the weak-$\star$-closure in Lemma 2.1 by the common closure since the weak and the weak-$\star$-topology in $\mathcal X^\star$ coincide, and convex sets are closed if and only if they are weakly closed by Mazur's theorem. Indeed, the first statement of Lemma 2.1 is a generalization of the famous Farkas Lemma.

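Remark 2.1 Choosing $\mathcal X = \mathbb R^n$, $\mathcal Y = \mathbb R^m$, $A \in \mathbb R^{m\times n}$, $b \in \mathbb R^n$, $K := \mathbb R^{m,+}_0$, as well as $\xi = 0$ and defining $\varphi\colon \mathbb R^n \to \mathbb R$ by $\varphi(x) := b^\top x$ for any $x \in \mathbb R^n$, Lemma 2.1 yields that the assertions

1. $\exists y \in \mathbb R^m\colon A^\top y = b,\ y \ge 0$,
2. $\forall x \in \mathbb R^n\colon Ax \ge 0 \Longrightarrow b^\top x \ge 0$

are equivalent. Hence, precisely one of the following systems possesses a solution:

\[
\begin{cases} A^\top y = b \\ y \ge 0 \end{cases}
\qquad\qquad
\begin{cases} Ax \ge 0 \\ b^\top x < 0. \end{cases}
\]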
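The alternative in Remark 2.1 can be illustrated by checking certificates for either system. The following is a minimal sketch with hypothetical helper names and hand-picked data; it only verifies given certificates and does not compute them.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def solves_first_system(A, b, y, tol=1e-12):
    """Check A^T y = b and y >= 0."""
    ATy = matvec(transpose(A), y)
    nonnegative = all(yi >= -tol for yi in y)
    return nonnegative and all(abs(s - bi) <= tol for s, bi in zip(ATy, b))


def solves_second_system(A, b, x, tol=1e-12):
    """Check A x >= 0 and b^T x < 0."""
    Ax = matvec(A, x)
    inner = sum(bi * xi for bi, xi in zip(b, x))
    return all(v >= -tol for v in Ax) and inner < -tol


A = [[1.0, 0.0], [0.0, 1.0]]   # rows a_1 = (1, 0), a_2 = (0, 1)
b_inside = [2.0, 3.0]          # b = 2 a_1 + 3 a_2, so the first system is solvable
b_outside = [-1.0, 1.0]        # not a nonnegative combination of the rows

print(solves_first_system(A, b_inside, [2.0, 3.0]))     # True
print(solves_second_system(A, b_outside, [1.0, 0.0]))   # True
print(solves_second_system(A, b_inside, [1.0, 0.0]))    # False
```

For `b_inside`, the vector $y = (2,3)$ solves the first system, so no $x$ can solve the second one; for `b_outside`, the vector $x = (1,0)$ solves the second system, which rules out a nonnegative $y$ with $A^\top y = b$.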

Moreover, we can characterize the adjoint of a dense range operator simply by applying Lemma 2.1.

Remark 2.2 Let $\mathcal X$ and $\mathcal Y$ be reflexive Banach spaces, while $\mathsf A \in \mathbb L[\mathcal X, \mathcal Y]$ is an arbitrary operator. Then $\mathsf A^\star$ is injective if and only if $\mathsf A$ possesses a dense range.

Lemma 2.2 Let $\mathsf A \in \mathbb L[\mathcal X, \mathcal Y]$ be an injective, bounded, linear operator with a closed range between Banach spaces $\mathcal X$ and $\mathcal Y$. Then there is a constant $c > 0$ satisfying $c\|x\|_{\mathcal X} \le \|\mathsf A[x]\|_{\mathcal Y}$ for all $x \in \mathcal X$.

Proof Clearly, due to its closedness, $\mathsf A[\mathcal X]$ is a Banach space as well. Hence, we can define a bijection $\tilde{\mathsf A} \in \mathbb L[\mathcal X, \mathsf A[\mathcal X]]$ by means of $\tilde{\mathsf A}[x] := \mathsf A[x]$ for any $x \in \mathcal X$. Then $\tilde{\mathsf A}^{-1}$ is a bounded linear operator as well, i.e., there is a constant $\tilde c > 0$ which satisfies $\|\tilde{\mathsf A}^{-1}[y]\|_{\mathcal X} \le \tilde c\|y\|_{\mathcal Y}$ for all $y \in \mathsf A[\mathcal X]$. For arbitrary $x \in \mathcal X$, we obtain

\[
\|x\|_{\mathcal X} = \|\tilde{\mathsf A}^{-1}[\mathsf A[x]]\|_{\mathcal X} \le \tilde c\|\mathsf A[x]\|_{\mathcal Y}.
\]

Hence, the statement of the lemma follows by choosing $c := \frac{1}{\tilde c}$. □

2.2.2 On polyhedral cones

Let $\mathcal Z$ be a reflexive Banach space and $\{z_1^*, \dots, z_k^*\} \subseteq \mathcal Z^\star$ be a set of linearly independent functionals. We consider the closed, convex, polyhedral cone

\[
K := \{z \in \mathcal Z \mid \langle z, z_i^*\rangle \le 0\ \forall i \in \{1,\dots,k\}\}. \tag{1}
\]

As mentioned earlier, this set is polyhedric everywhere. Although this result is well-known, we provide a short proof for the sake of completeness.

Lemma 2.3 Let the polyhedral cone K be given as stated in (1). Then K is polyhedric.

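Proof The assertion clearly follows if we can show $\mathcal R_K(\bar u) = \mathcal T_K(\bar u)$ for arbitrary $\bar u \in K$. Therefore, choose $\bar u \in K$ arbitrarily, define $I(\bar u) := \{i \in \{1,\dots,k\} \mid \langle\bar u, z_i^*\rangle = 0\}$, and observe

\[
\begin{aligned}
\mathcal R_K(\bar u) &= K + \operatorname{lin}(\bar u) = \{z + \kappa\bar u \in \mathcal Z \mid \kappa \in \mathbb R,\ \langle z, z_i^*\rangle \le 0\ \forall i \in \{1,\dots,k\}\} \\
&= \{d \in \mathcal Z \mid \exists\kappa \in \mathbb R\colon \langle d - \kappa\bar u, z_i^*\rangle \le 0\ \forall i \in \{1,\dots,k\}\} \\
&= \{d \in \mathcal Z \mid \exists\kappa \in \mathbb R\colon \langle d, z_i^*\rangle \le 0\ \forall i \in I(\bar u),\ \langle d, z_i^*\rangle \le \kappa\langle\bar u, z_i^*\rangle\ \forall i \in \{1,\dots,k\} \setminus I(\bar u)\} \\
&= \{d \in \mathcal Z \mid \langle d, z_i^*\rangle \le 0\ \forall i \in I(\bar u)\}.
\end{aligned}
\]

Obviously, $\mathcal R_K(\bar u)$ is closed and, hence, it coincides with $\mathcal T_K(\bar u)$. This completes the proof. □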
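The last equality in the chain of the preceding proof relies on the fact that the inactive constraints can always be satisfied by a suitable choice of $\kappa$. A short sketch of this step, which is our own elaboration and not part of the original proof, reads as follows:

```latex
For $i \in \{1,\dots,k\} \setminus I(\bar u)$, we have $\langle \bar u, z_i^*\rangle < 0$ due to $\bar u \in K$.
Let $d \in \mathcal Z$ satisfy $\langle d, z_i^*\rangle \le 0$ for all $i \in I(\bar u)$ and set
\[
  \kappa := \min\Bigl\{0,\ \min_{i \notin I(\bar u)} \frac{\langle d, z_i^*\rangle}{\langle \bar u, z_i^*\rangle}\Bigr\},
\]
where the inner minimum is dropped if all constraints are active.
Multiplying $\kappa \le \langle d, z_i^*\rangle / \langle \bar u, z_i^*\rangle$ by the negative number
$\langle \bar u, z_i^*\rangle$ yields
\[
  \langle d, z_i^*\rangle \le \kappa \langle \bar u, z_i^*\rangle
  \qquad \forall i \in \{1,\dots,k\} \setminus I(\bar u),
\]
i.e. $d$ belongs to the set in the second to last line of the chain.
The converse inclusion is trivial.
```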

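Later we will define a generalized notion of M-stationarity for MPCCs where the cone which induces the complementarity is given by (1). Before we formulate a lemma which summarizes the results obtained in [24], we introduce some necessary notation. From the definition of the polar cone,

\[
K^\circ = \Bigl\{ \textstyle\sum_{i=1}^k \alpha_i z_i^* \in \mathcal Z^\star \,\Bigm|\, \alpha_i \ge 0\ \forall i \in \{1,\dots,k\} \Bigr\}
\]

follows easily. Let us analyze the complementarity set

\[
\mathcal F := \{(u,v) \in K \times K^\circ \mid \langle u, v\rangle = 0\}.
\]

Choosing $(\bar u, \bar v) \in \mathcal F$, we define index sets $I(\bar u, \bar v)$ and $J(\bar u, \bar v)$ by

\[
\begin{aligned}
I(\bar u, \bar v) &:= \{i \in \{1,\dots,k\} \mid \langle\bar u, z_i^*\rangle = 0\}, \\
J(\bar u, \bar v) &:= \{i \in I(\bar u, \bar v) \mid \bar\alpha_i > 0\}.
\end{aligned} \tag{2}
\]

Here, we exploited the unique representation $\bar v = \sum_{i=1}^k \bar\alpha_i z_i^*$ for some $\bar\alpha_1, \dots, \bar\alpha_k \ge 0$. For any index sets $J := J(\bar u, \bar v) \subseteq P \subseteq Q \subseteq I(\bar u, \bar v) =: I$, we define

\[
\begin{aligned}
C_{Q,P} &:= \operatorname{cone}(\{z_i^* \mid i \in Q \setminus P\}) + \operatorname{lin}(\{z_i^* \mid i \in P\}), \\
D_{Q,P} &:= \{z \in \mathcal Z \mid \langle z, z_i^*\rangle \le 0\ \forall i \in Q \setminus P\} \cap \{z_i^* \mid i \in P\}^\perp.
\end{aligned}
\]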

Lemma 2.4 Using the above notations, the following relations hold:

\[
\widehat{\mathcal N}_{\mathcal F}(\bar u, \bar v) = C_{I,J} \times D_{I,J}, \qquad \mathcal N_{\mathcal F}(\bar u, \bar v) = \bigcup_{J \subseteq P \subseteq Q \subseteq I} C_{Q,P} \times D_{Q,P}.
\]
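The union in Lemma 2.4 runs over all nested index sets $J \subseteq P \subseteq Q \subseteq I$. The following minimal sketch, with hypothetical function names and an assumed toy choice of $I$ and $J$, enumerates these pairs; each pair $(Q,P)$ corresponds to one convex component $C_{Q,P} \times D_{Q,P}$ of the limiting normal cone.

```python
from itertools import combinations


def subsets(items):
    """Yield all subsets of the given collection as frozensets."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def index_pairs(I, J):
    """Yield all pairs (Q, P) of index sets with J <= P <= Q <= I."""
    I, J = frozenset(I), frozenset(J)
    if not J <= I:
        raise ValueError("J must be a subset of I")
    for extra_Q in subsets(I - J):
        Q = J | extra_Q
        for extra_P in subsets(Q - J):
            yield Q, J | extra_P


# Assumed toy data: three active indices, one of them with positive multiplier.
pairs = list(index_pairs(I={1, 2, 3}, J={1}))
print(len(pairs))  # 9
```

Every index in $I \setminus J$ is either outside $Q$, in $Q \setminus P$, or in $P$, which gives $3^{|I \setminus J|}$ components. The pair $(Q,P) = (I,J)$ reproduces the Fréchet normal cone.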

2.2.3 On the SNC property of the nonnegative cone in different function spaces

Let $\Omega \subseteq \mathbb R^d$ be a bounded domain. In this section, we want to comment on the SNC property of the cone of nonnegative functions in different function spaces defined on $\Omega$. Therefore, we provide the following three lemmata which describe the situation w.r.t. continuous functions, functions from Lebesgue spaces, and functions from certain Sobolev spaces.

Lemma 2.5 The closed, convex cone

\[
C(\Omega)^+_0 := \{u \in C(\Omega) \mid u(\omega) \ge 0 \text{ for all } \omega \in \Omega\}
\]

is SNC everywhere.

Proof For some function $u \in C(\Omega)$, we obtain $\max\{u; 0\}, \max\{-u; 0\} \in C(\Omega)^+_0$. This shows the relation $\operatorname{lin} C(\Omega)^+_0 = C(\Omega)$. Noting that $\operatorname{int} C(\Omega)^+_0 \neq \emptyset$ is valid, the lemma's assertion follows from [38, Theorem 1.21]. □

Lemma 2.6 For any $p \in [1,\infty)$, the closed, convex cone

\[
L^p(\Omega)^+_0 := \{u \in L^p(\Omega) \mid u(\omega) \ge 0 \text{ f.a.a. } \omega \in \Omega\}
\]

is nowhere SNC. On the other hand, the closed, convex cone $L^\infty(\Omega)^+_0$ is SNC everywhere.

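Proof For $p = \infty$, we can adapt the proof of Lemma 2.5 since $L^\infty(\Omega)^+_0$ possesses a nonempty interior and its span already equals $L^\infty(\Omega)$.

Thus, let us fix $p \in (1,\infty)$, its conjugate coefficient $p' \in (1,\infty)$ (i.e. the number satisfying $1/p + 1/p' = 1$), and some arbitrary $\bar u \in L^p(\Omega)^+_0$. Clearly, since $\Omega$ is open, we find a sequence $\{\Omega_k\} \subseteq 2^\Omega$ of measurable sets with positive measure such that $|\Omega_k| \searrow 0$ holds true. Let us define $u_k := (1 - \chi_{\Omega_k})\bar u$. We easily see $u_k \to \bar u$ in $L^p(\Omega)$ from the dominated convergence theorem, see [47, Theorem 5.2.2]. Let us define

\[
\forall \omega \in \Omega\colon \quad \eta_k(\omega) := -|\Omega_k|^{-\frac{1}{p'}} \chi_{\Omega_k}(\omega).
\]

This leads to

\[
\eta_k \in -L^{p'}(\Omega)^+_0 \cap u_k^\perp = \bigl(L^p(\Omega)^+_0\bigr)^\circ \cap u_k^\perp = \mathcal N_{L^p(\Omega)^+_0}(u_k) = \widehat{\mathcal N}_{L^p(\Omega)^+_0}(u_k).
\]

For arbitrary $v \in L^p(\Omega)$, we have $v \in L^p(\Omega_k)$ and Hölder's inequality yields

\[
|\langle v, \eta_k\rangle| = |\Omega_k|^{-\frac{1}{p'}} \left|\int_{\Omega_k} v(\omega)\chi_{\Omega_k}(\omega)\,\mathrm d\omega\right| \le |\Omega_k|^{-\frac{1}{p'}} \|v\|_{L^p(\Omega_k)} \|\chi_{\Omega_k}\|_{L^{p'}(\Omega_k)} = \left(\int_{\Omega_k} |v(\omega)|^p \,\mathrm d\omega\right)^{\frac1p}.
\]

Using the dominated convergence theorem once more, the integral on the right tends to zero as $k$ goes to $\infty$. Thus, we obtain $\eta_k \overset{\star}{\rightharpoonup} 0$. On the other hand, $\|\eta_k\|_{L^{p'}(\Omega)} = 1$ for all $k \in \mathbb N$ shows $\eta_k \nrightarrow 0$. Thus, $L^p(\Omega)^+_0$ is not SNC at $\bar u$.

For $p = 1$ and $\bar u \in L^1(\Omega)^+_0$, we can reprise the above arguments with $u_k := (1 - \chi_{\Omega_k})\bar u$ and $\eta_k := -\chi_{\Omega_k}$ for all $k \in \mathbb N$ to see $u_k \to \bar u$ in $L^1(\Omega)$, $\eta_k \overset{\star}{\rightharpoonup} 0$ in $L^\infty(\Omega)$, and $\eta_k \nrightarrow 0$ in $L^\infty(\Omega)$. This shows that $L^1(\Omega)^+_0$ is not SNC at $\bar u$ as well. □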
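The behaviour of the sequence $\{\eta_k\}$ from the preceding proof can be made concrete in a simple setting. The sketch below uses our own assumptions $\Omega = (0,1)$, $\Omega_k = (0, 1/k)$, and the fixed test function $v \equiv 1$, for which all quantities are available in closed form; the function names are hypothetical. It only illustrates the two competing effects and is not a proof of weak-$\star$ convergence, which requires all $v \in L^p(\Omega)$.

```python
def conjugate_exponent(p):
    """Return p' with 1/p + 1/p' = 1 for p in (1, infinity)."""
    return p / (p - 1.0)


def eta_norm(k, p):
    """Norm of eta_k = -|Omega_k|^(-1/p') * chi_{Omega_k} in L^{p'}(0,1), Omega_k = (0, 1/k)."""
    q = conjugate_exponent(p)
    measure = 1.0 / k
    # ||chi_{Omega_k}||_{L^q} = measure^(1/q), scaled by the factor measure^(-1/q).
    return measure ** (-1.0 / q) * measure ** (1.0 / q)


def pairing_with_constant(k, p):
    """Dual pairing <v, eta_k> for the constant function v = 1; equals -measure^(1/p)."""
    q = conjugate_exponent(p)
    measure = 1.0 / k
    return -measure ** (-1.0 / q) * measure


for k in (1, 100, 10000):
    print(k, eta_norm(k, 2.0), pairing_with_constant(k, 2.0))
```

The norms stay equal to one while the pairings tend to zero, which is the mechanism behind the failure of the SNC property.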

Lemma 2.7 Suppose that $\Omega$ possesses a Lipschitz continuous boundary. For any $p \in (1,\infty)$ satisfying $p > d$, the cone

\[
W^{1,p}(\Omega)^+_0 := \{u \in W^{1,p}(\Omega) \mid u(\omega) \ge 0 \text{ f.a.a. } \omega \in \Omega\}
\]

is SNC everywhere.

Proof For any $u \in W^{1,p}(\Omega)$, we have $\max\{u; 0\}, \max\{-u; 0\} \in W^{1,p}(\Omega)^+_0$ from [4, Theorem 5.8.2], which shows $\operatorname{lin} W^{1,p}(\Omega)^+_0 = W^{1,p}(\Omega)$. We invoke the Sobolev embedding theorem, see [3, Theorem 4.12], to obtain $W^{1,p}(\Omega) \hookrightarrow C(\Omega)$, i.e. $W^{1,p}(\Omega) \subseteq C(\Omega)$ and there is a constant $\gamma > 0$ with

\[
\forall u \in W^{1,p}(\Omega)\colon \quad \|u\|_{C(\Omega)} \le \gamma\|u\|_{W^{1,p}(\Omega)}.
\]

Let $\bar u \in W^{1,p}(\Omega)$ be a function satisfying $\bar u(\omega) \ge 2\gamma$ almost everywhere on $\Omega$. Obviously, we have $\bar u \in W^{1,p}(\Omega)^+_0$. For any $v \in W^{1,p}(\Omega)$ with $\|\bar u - v\|_{W^{1,p}(\Omega)} \le 1$, we obtain

\[
|\bar u(\omega) - v(\omega)| \le \|\bar u - v\|_{C(\Omega)} \le \gamma\|\bar u - v\|_{W^{1,p}(\Omega)} \le \gamma,
\]

i.e. $v(\omega) \ge \gamma$ for almost every $\omega \in \Omega$. This shows $\bar u \in \operatorname{int} W^{1,p}(\Omega)^+_0$. Once more, we apply [38, Theorem 1.21] to deduce that $W^{1,p}(\Omega)^+_0$ is SNC everywhere. □

Note that the above results can be extended to sets in function spaces with upper and lower bounds. Especially, we see that for functions $u_a, u_b \in L^2(\Omega)$ with $u_a(\omega) \le u_b(\omega)$ for almost every $\omega \in \Omega$, the set

\[
U_{ad} := \{u \in L^2(\Omega) \mid u_a(\omega) \le u(\omega) \le u_b(\omega) \text{ f.a.a. } \omega \in \Omega\}
\]

is nowhere SNC. Note that $U_{ad} \subseteq L^2(\Omega)$ is the standard set for control constraints in optimal control.

3 Constraint qualifications in Banach spaces

Throughout the whole section, assume that the following is fulfilled: Let $\mathcal X$ and $\mathcal Y$ be arbitrary Banach spaces, let $h\colon \mathcal X \to \mathbb R$ as well as $H\colon \mathcal X \to \mathcal Y$ be continuously Fréchet differentiable, and let $C \subseteq \mathcal Y$ as well as $S \subseteq \mathcal X$ be nonempty and closed. Feasible regions of mathematical programs are often described by sets of type

\[
M := \{x \in S \mid H(x) \in C\}.
\]

In order to guarantee that necessary optimality conditions of KKT-type hold at local minimizers of

\[
h(x) \to \min, \qquad x \in M, \tag{3}
\]

one generally has to postulate certain constraint qualifications. The aforementioned KKT conditions of (3) at a certain feasible point $\bar x \in M$ are stated below:

\[
\exists\lambda \in \mathcal N_C(H(\bar x))\ \exists\xi \in \mathcal N_S(\bar x)\colon \quad 0 = h'(\bar x) + H'(\bar x)^\star[\lambda] + \xi. \tag{4}
\]

We introduce the set $\Lambda(\bar x)$ of Lagrange multipliers (cf. [9]) of (3) at $\bar x$ by

\[
\Lambda(\bar x) := \{(\lambda, \xi) \in \mathcal N_C(H(\bar x)) \times \mathcal N_S(\bar x) \mid 0 = h'(\bar x) + H'(\bar x)^\star[\lambda] + \xi\}.
\]

Consequently, the KKT conditions are satisfied at $\bar x \in M$ if and only if the set $\Lambda(\bar x)$ is nonempty. In this section, we give a short overview of regularity conditions and discuss the relationship between them.

3.1 Weak constraint qualifications

Without mentioning it again, we assume throughout this section that the sets $C$ and $S$ are convex. This situation is well-studied in the literature. For the upcoming considerations, we choose an arbitrary point $\bar x \in M$. The most common constraint qualification in our setting is KRZCQ (cf. [9, 32, 43]), the so-called Kurcyusz Robinson Zowe Constraint Qualification, which postulates

\[
H'(\bar x)[\mathcal R_S(\bar x)] - \mathcal R_C(H(\bar x)) = \mathcal Y.
\]

Polarizing this equation (i.e. considering the equality of the polar sets to the sets on the left- and right-hand side of the above equation) leads to

\[
\{(\lambda, \xi) \in \mathcal T_C(H(\bar x))^\circ \times \mathcal T_S(\bar x)^\circ \mid H'(\bar x)^\star[\lambda] + \xi = 0\} = \{(0,0)\}. \tag{5}
\]

Note that due to the convexity of $C$ and $S$, we have $\mathcal N_C(H(\bar x)) = \mathcal T_C(H(\bar x))^\circ$ and $\mathcal N_S(\bar x) = \mathcal T_S(\bar x)^\circ$. Hence, in terms of (4), the set on the left in (5) is called the set of singular Lagrange multipliers (we use again the notation from [9]). Observe that this set is always a nonempty, closed, and convex cone. We polarize (5) once more to obtain the condition

\[
\operatorname{cl}\bigl(H'(\bar x)[\mathcal T_S(\bar x)] - \mathcal T_C(H(\bar x))\bigr) = \mathcal Y,
\]

which is equivalent to (5) by the bipolar theorem, cf. [9, Proposition 2.40]. Now it is easy to see that the latter condition is equivalent to

\[
\operatorname{cl}\bigl(H'(\bar x)[\mathcal R_S(\bar x)] - \mathcal R_C(H(\bar x))\bigr) = \mathcal Y. \tag{6}
\]

It follows from [9, Corollary 2.98] that the constraint qualifications KRZCQ, (5), and (6) are equivalent in the case where $S = \mathcal X$ holds and $C$ possesses a nonempty interior. Let us consider the situation $\mathcal X = \mathbb R^n$, $\mathcal Y = \mathbb R^m$, $C = -\mathbb R^{m,+}_0$, $S = \mathbb R^n$, which describes a standard nonlinear program with inequality constraints. It is common knowledge that the dual form of the well-known MFCQ, the Mangasarian Fromovitz Constraint Qualification, takes precisely the form (5), and, consequently, KRZCQ equals MFCQ in this case. A condition which obviously implies KRZCQ is given by

\[
H'(\bar x)[\mathcal R_S(\bar x)] = \mathcal Y,
\]

which reduces to the surjectivity of $H'(\bar x)$ if either $S = \mathcal X$ or $\bar x \in \operatorname{int}(S)$ is satisfied. We call this condition FRCQ, the Full Range Constraint Qualification. Observe that for standard nonlinear programs, FRCQ is strictly stronger than LICQ, the Linear Independence Constraint Qualification, as long as not all constraints are active at $\bar x$.

In order to state weaker constraint qualifications, we introduce two convex cones. First, the linearized tangent cone to $M$ at $\bar x$, denoted by $\mathcal L_M(\bar x)$, is given below:

\[
\mathcal L_M(\bar x) := \{d \in \mathcal T_S(\bar x) \mid H'(\bar x)[d] \in \mathcal T_C(H(\bar x))\}.
\]

Obviously, this cone is closed. Furthermore, we will deal with the linearized normal cone $\mathcal S_M(\bar x)$ to $M$ at $\bar x$ defined by

\[
\mathcal S_M(\bar x) := \{H'(\bar x)^\star[\lambda] + \xi \in \mathcal X^\star \mid \lambda \in \mathcal T_C(H(\bar x))^\circ,\ \xi \in \mathcal T_S(\bar x)^\circ\}.
\]

Note that this cone is not necessarily closed, e.g., in the case where $C = \{0\}$ and $S = \mathcal X$ hold, while the image of $H'(\bar x)^\star$ is dense in $\mathcal X^\star$. In the following proposition, we state some results on the cones $\mathcal L_M(\bar x)$ and $\mathcal S_M(\bar x)$.

Proposition 3.1 Let $\bar x \in M$ be arbitrarily chosen. Then the following assertions are satisfied:

1. $\mathcal T_M(\bar x) \subseteq \mathcal L_M(\bar x)$,
2. $\mathcal L_M(\bar x)^\circ = \operatorname{cl}^\star\bigl(\mathcal S_M(\bar x)\bigr)$,
3. if KRZCQ holds at $\bar x$, then $\mathcal L_M(\bar x) \subseteq \mathcal T_M(\bar x)$ is satisfied and $\mathcal L_M(\bar x)^\circ = \mathcal S_M(\bar x)$ holds, i.e. $\mathcal S_M(\bar x)$ is closed w.r.t. the weak-$\star$-topology.

Proof The proof of the first assertion is straightforward and, hence, omitted. For the proof of the second statement, we refer the reader to the first part of Lemma 2.1. The first statement of the third assertion follows from [9, Corollary 2.91]. Hence, we only need to show that $\mathcal L_M(\bar x)^\circ = \mathcal S_M(\bar x)$ is valid under KRZCQ. Therefore, observe that the latter regularity condition is equivalent to

\[
\bigl(H'(\bar x), \mathsf I_{\mathcal X}\bigr)[\mathcal X] - \mathcal R_C(H(\bar x)) \times \mathcal R_S(\bar x) = \mathcal Y \times \mathcal X,
\]

which implies

\[
\bigl(H'(\bar x), \mathsf I_{\mathcal X}\bigr)[\mathcal X] - \mathcal T_C(H(\bar x)) \times \mathcal T_S(\bar x) = \mathcal Y \times \mathcal X.
\]

Now we apply the second statement of Lemma 2.1 with $\mathsf A = \bigl(H'(\bar x), \mathsf I_{\mathcal X}\bigr)$ and $K = \mathcal T_C(H(\bar x)) \times \mathcal T_S(\bar x)$ to obtain $\mathcal L_M(\bar x)^\circ = \mathcal S_M(\bar x)$. □

In the following definition, we present two constraint qualifications which are weaker than KRZCQ. These conditions can be found in a slightly different setting in [1, 21].

Definition 3.1 Let $\bar x \in M$ be arbitrarily chosen. We say that ACQ, the Abadie Constraint Qualification, holds true at $\bar x$ if $\mathcal T_M(\bar x) = \mathcal L_M(\bar x)$ is valid, while $\mathcal S_M(\bar x)$ is closed w.r.t. the weak-$\star$-topology. Furthermore, GCQ, the Guignard Constraint Qualification, is said to be satisfied at $\bar x$ if the relation $\mathcal T_M(\bar x)^\circ = \mathcal S_M(\bar x)$ holds.

From Proposition 3.1, we derive the following result.

Lemma 3.1 Let x¯ ∈ M be arbitrarily chosen. Then the following implications hold for the constraint qualifications from above which might be satisfied at x¯:

FRCQ =⇒ KRZCQ =⇒ ACQ =⇒ GCQ.

Furthermore, if x¯ is a local minimizer of (3) where at least GCQ holds, then the KKT conditions are satisfied at this point. Hence, any of the above constraint qualifications is sufficient for the KKT conditions to be necessary optimality conditions for (3).

Proof We start by verifying the implications for the presented constraint qualifications. Obviously, FRCQ implies KRZCQ. If KRZCQ holds at $\bar x$, then by Proposition 3.1, $\mathcal T_M(\bar x) = \mathcal L_M(\bar x)$ is satisfied, while $\mathcal S_M(\bar x)$ is closed w.r.t. the weak-$\star$-topology, i.e. ACQ holds at $\bar x$. Now suppose that ACQ is valid at $\bar x$. Then we can easily deduce

$$\mathcal T_M(\bar x)^\circ = \mathcal L_M(\bar x)^\circ = \operatorname{cl}^\star\big(\mathcal S_M(\bar x)\big) = \mathcal S_M(\bar x).$$

Consequently, GCQ holds at $\bar x$ as well. Now assume that $\bar x$ is a local minimizer of (3) where GCQ is satisfied. Then the local optimality of $\bar x$ for (3) implies that the first order optimality condition

$$\forall d \in \mathcal T_M(\bar x)\colon\quad h'(\bar x)[d] \geq 0$$

holds, see [30, Theorem 4.14]. This implies $-h'(\bar x) \in \mathcal T_M(\bar x)^\circ = \mathcal S_M(\bar x)$ by GCQ. Hence, the KKT conditions are valid at $\bar x$. □

A special setting where ACQ always holds is described in the following example.

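Example 3.1 Let $\mathcal X$ and $\mathcal Y$ be Banach spaces, let $A \in L[\mathcal X, \mathcal Y]$ be a bounded linear operator with closed range, let $b \in \mathcal Y$ as well as $a_1^*, \dots, a_k^* \in \mathcal X^\star$ and $\beta_1, \dots, \beta_k \in \mathbb R$ be fixed, and consider the set

$$M := \{x \in \mathcal X \mid A[x] = b,\ \langle x, a_i^* \rangle \leq \beta_i \ \forall i \in \{1, \dots, k\}\}.$$

For some $\bar x \in M$, we define $I(\bar x) := \{i \in \{1, \dots, k\} \mid \langle \bar x, a_i^* \rangle = \beta_i\}$ and obtain

$$\begin{aligned}
\mathcal L_M(\bar x) &= \{d \in \mathcal X \mid A[d] = 0,\ \langle d, a_i^* \rangle \leq 0 \ \forall i \in I(\bar x)\},\\
\mathcal S_M(\bar x) &= A^\star[\mathcal Y^\star] + \Big\{\textstyle\sum_{i \in I(\bar x)} \alpha_i a_i^* \in \mathcal X^\star \,\Big|\, \alpha_i \geq 0 \ \forall i \in I(\bar x)\Big\}.
\end{aligned}$$

It is not difficult to see that $\mathcal T_M(\bar x) = \mathcal L_M(\bar x)$ holds true, and from [9, Proposition 2.201], $\mathcal L_M(\bar x)^\circ = \mathcal S_M(\bar x)$ follows, i.e. $\mathcal S_M(\bar x)$ is closed w.r.t. the weak-$\star$-topology. Hence, ACQ is valid at $\bar x$. Thus, analogously to optimization problems in finite dimensions, ACQ holds at each feasible point of a feasible set defined by linear restrictions in infinite-dimensional settings.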

3.2 Constraint qualifications for nonconvex abstract constraints

Now we want to take a closer look at the situation where $C$ and $S$ are not necessarily convex. Then the machinery used to construct and analyze KRZCQ is no longer applicable, i.e. it does not provide a constraint qualification for the underlying mathematical problem (3) anymore. On the other hand, the validation of the applicable regularity conditions ACQ and GCQ is an enormously challenging task. Thus, in [53, Section 3], the author applies a generalized concept of tangent approximations to derive a KRZCQ-type regularity condition for (3). We want to use another approach here which dates back to the variational calculus introduced by Mordukhovich, see [38].

Definition 3.2 Let $\bar x \in M$ be arbitrarily chosen. Then BCQ, the Basic Constraint Qualification, is satisfied at $\bar x$ if the following two conditions are valid:

$$H'(\bar x) \text{ is surjective}, \qquad \left.\begin{aligned} &H'(\bar x)^\star[\lambda] + \xi = 0,\\ &\lambda \in \mathcal N_C(H(\bar x)),\ \xi \in \mathcal N_S(\bar x)\end{aligned}\right\} \Longrightarrow\ \xi = 0.$$

Furthermore, we say that PCBCQ, the Partially Convex Basic Constraint Qualification, is satisfied at $\bar x$ if the sets $C$ and $\Upsilon := \{x \in \mathcal X \mid H(x) \in C\}$ are convex and the following two conditions are valid:

$$H'(\bar x)[\mathcal X] - \mathcal R_C(H(\bar x)) = \mathcal Y, \qquad \left.\begin{aligned} &H'(\bar x)^\star[\lambda] + \xi = 0,\\ &\lambda \in \mathcal T_C(H(\bar x))^\circ,\ \xi \in \mathcal N_S(\bar x)\end{aligned}\right\} \Longrightarrow\ \xi = 0.$$

We start our considerations with a short statement on these conditions.

Corollary 3.1 Choose $\bar x \in M$ arbitrarily. Then PCBCQ implies the following condition:

$$\left.\begin{aligned} &H'(\bar x)^\star[\lambda] + \xi = 0,\\ &\lambda \in \mathcal T_C(H(\bar x))^\circ,\ \xi \in \mathcal N_S(\bar x)\end{aligned}\right\} \Longrightarrow\ \lambda = 0,\ \xi = 0. \tag{7}$$

On the other hand, if $C$ possesses a nonempty interior and the sets $C$ as well as $\Upsilon$ are convex, then (7) implies PCBCQ. Furthermore, BCQ implies the following condition:

$$\left.\begin{aligned} &H'(\bar x)^\star[\lambda] + \xi = 0,\\ &\lambda \in \mathcal N_C(H(\bar x)),\ \xi \in \mathcal N_S(\bar x)\end{aligned}\right\} \Longrightarrow\ \lambda = 0,\ \xi = 0. \tag{8}$$

Proof Let PCBCQ hold at $\bar x \in M$ and suppose that $H'(\bar x)^\star[\lambda] + \xi = 0$ is valid for some $\lambda \in \mathcal T_C(H(\bar x))^\circ$ and $\xi \in \mathcal N_S(\bar x)$. Then we automatically have $\xi = 0$ from the second condition of PCBCQ, i.e. $H'(\bar x)^\star[\lambda] = 0$ is satisfied. Polarizing the first condition of PCBCQ yields $\{\lambda \in \mathcal T_C(H(\bar x))^\circ \mid H'(\bar x)^\star[\lambda] = 0\} = \{0\}$. This leads to $\lambda = 0$. Consequently, condition (7) is valid.

Now assume that condition (7) holds. Then the second condition of PCBCQ is trivially satisfied. Furthermore, choosing $0 \in \mathcal N_S(\bar x)$ in (7) leads to $\{\lambda \in \mathcal T_C(H(\bar x))^\circ \mid H'(\bar x)^\star[\lambda] = 0\} = \{0\}$. Since $C$ possesses a nonempty interior and due to the postulated convexity assumptions, the statements of Section 3.1 are applicable, and, consequently, this condition equals the first condition of PCBCQ.

The fact that BCQ implies (8) is obvious since the surjectivity of $H'(\bar x)$ leads to the injectivity of $H'(\bar x)^\star$ by the closed range theorem [54, Theorem IV.5.1]. □

Now we want to show that BCQ and PCBCQ are sufficient for the KKT conditions to be necessary optimality conditions for (3) under an additional assumption.

Proposition 3.2 Let $\mathcal X$ be reflexive and $\bar x \in M$ be a local minimizer of (3) where $S$ is SNC and BCQ or PCBCQ holds. Then the KKT conditions are satisfied at $\bar x$.

Proof From [38, Proposition 5.1], we have $-h'(\bar x) \in \mathcal N_M(\bar x)$. Note that $M = \Upsilon \cap S$ holds. Using the first condition of BCQ and [38, Theorem 1.17], we obtain

$$\mathcal N_\Upsilon(\bar x) = H'(\bar x)^\star[\mathcal N_C(H(\bar x))].$$

If PCBCQ holds, by Proposition 3.1

$$\mathcal N_\Upsilon(\bar x) = \mathcal T_\Upsilon(\bar x)^\circ = \mathcal L_\Upsilon(\bar x)^\circ = \mathcal S_\Upsilon(\bar x) = H'(\bar x)^\star[\mathcal T_C(H(\bar x))^\circ]$$

holds as well. Hence, the second condition of BCQ and PCBCQ is equivalent to

$$\mathcal N_\Upsilon(\bar x) \cap \big(-\mathcal N_S(\bar x)\big) = \{0\}.$$

The latter condition and the property of $S$ to be SNC at $\bar x$ imply $\mathcal N_M(\bar x) \subseteq \mathcal N_\Upsilon(\bar x) + \mathcal N_S(\bar x)$ by [38, Corollary 3.5]. Consequently, we obtain

$$-h'(\bar x) \in H'(\bar x)^\star[\mathcal N_C(H(\bar x))] + \mathcal N_S(\bar x),$$

which yields the KKT conditions at $\bar x$. □

Following the argumentation in the proof of Proposition 3.2, we note that the constraint qualification (8) also implies that the KKT conditions hold at a local minimizer of (3) under the additional assumption that $C$ is SNC at $H(\bar x)$ while $S$ needs to be SNC at $\bar x$, see [38, Theorem 3.8]. As we have shown in Section 2.2.3, the SNC property is very restrictive in reflexive function spaces. That is why we introduced the constraint qualifications BCQ and PCBCQ: as the above proposition shows, they imply that the KKT conditions hold at a local minimizer of (3) even in the case where $C$ suffers from a lack of the SNC property.

In the case where the Banach spaces $\mathcal X$ and $\mathcal Y$ are finite-dimensional, we can drop the surjectivity assumption on the constraint mapping $H$ and still obtain an appropriate constraint qualification since $C \subseteq \mathcal Y$ is SNC everywhere. This result follows, e.g., from [16, Theorem 2.1, Proposition 2.2, Proposition 2.3].

Remark 3.1 Let $\mathcal X$ as well as $\mathcal Y$ be finite-dimensional Banach spaces and assume that $\bar x \in M$ is a local minimizer of (3) where the condition (8) is satisfied. Then the KKT conditions hold at $\bar x$. That is why in the finite-dimensional setting, the condition (8) is called basic constraint qualification as well. Note that the condition (8) corresponds to NNAMCQ, the No Nonzero Abnormal Multiplier Constraint Qualification, provided that the finite-dimensional space $\mathcal Y = \mathbb R^p$ and the cone $C = \mathbb R^{p,+}_0$ are taken into consideration [59].

Let us consider the set-valued mapping $\mathcal M\colon \mathcal Y \to 2^{\mathcal X}$ as stated below:

$$\forall y \in \mathcal Y\colon\quad \mathcal M(y) := \{x \in S \mid H(x) + y \in C\}. \tag{9}$$

Obviously, $\mathcal M(0) = M$ is satisfied, i.e. the unperturbed image is the feasible set of (3). It is well-known that certain stability properties of the perturbation mapping $\mathcal M$ may serve as a constraint qualification for (3) as well. We obtain the following result which is similar to [16, Theorem 2.1] or [38, Lemma 5.47] where slightly different settings are analyzed.

Proposition 3.3 Let x¯ ∈ M be a local optimal solution of (3) where X and Y are reflexive and assume that M is calm at (0, x¯). Then the KKT conditions are valid.

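Proof Due to the calmness of $\mathcal M$ at $(0, \bar x)$, we find $\varepsilon, \rho > 0$ and $L_{\mathcal M} > 0$ such that

$$\forall y \in U_{\mathcal Y}^{\varepsilon}(0)\colon\quad \mathcal M(y) \cap U_{\mathcal X}^{\rho}(\bar x) \subseteq \mathcal M(0) + L_{\mathcal M} \|y\|_{\mathcal Y}\, B_{\mathcal X}$$

is satisfied. Furthermore, we find a constant $\varrho > 0$ such that $h$ and $H$ are Lipschitz continuous on $U_{\mathcal X}^{\varrho}(\bar x)$ with modulus $L_h$ and $L_H$, respectively, and $h(x) \geq h(\bar x)$ holds true for all $x \in \mathcal M(0) \cap U_{\mathcal X}^{\varrho}(\bar x)$.

Now set

$$\epsilon := \min\Big\{\varepsilon L_{\mathcal M};\ \rho;\ \frac{\varrho}{2}\Big\}, \qquad \alpha := \min\Big\{\epsilon;\ \frac{\epsilon}{2 L_H L_{\mathcal M}}\Big\}, \qquad \beta := \min\Big\{\epsilon;\ \frac{\epsilon}{2 L_{\mathcal M}}\Big\}$$

and choose points $x \in S \cap U_{\mathcal X}^{\alpha}(\bar x)$ and $z \in C \cap U_{\mathcal Y}^{\beta}(H(\bar x))$ arbitrarily. Then it follows

$$\|z - H(x)\|_{\mathcal Y} \leq \|z - H(\bar x)\|_{\mathcal Y} + L_H \|x - \bar x\|_{\mathcal X} < \frac{\epsilon}{L_{\mathcal M}} \leq \varepsilon.$$

Observe that $x \in \mathcal M(z - H(x)) \cap U_{\mathcal X}^{\rho}(\bar x)$ is satisfied, which is why we obtain the existence of $x' \in \mathcal M(0)$ satisfying

$$\|x - x'\|_{\mathcal X} \leq L_{\mathcal M} \|z - H(x)\|_{\mathcal Y} < \epsilon$$

from the calmness condition. Consequently, we derive

$$\|\bar x - x'\|_{\mathcal X} \leq \|\bar x - x\|_{\mathcal X} + \|x - x'\|_{\mathcal X} \leq 2\epsilon.$$

Hence, $x' \in \mathcal M(0) \cap U_{\mathcal X}^{\varrho}(\bar x)$ holds. Now we easily obtain

$$h(\bar x) \leq h(x') = h(x) + (h(x') - h(x)) \leq h(x) + L_h \|x' - x\|_{\mathcal X} \leq h(x) + L_h L_{\mathcal M} \|z - H(x)\|_{\mathcal Y}.$$

That is why $(\bar x, 0, H(\bar x))$ is a local optimal solution of

$$\begin{aligned} h(x) + L_h L_{\mathcal M} \|y\|_{\mathcal Y} &\to \min_{x, y, z}\\ y - H(x) + z &= 0\\ (x, z) &\in S \times C. \end{aligned}$$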

Note that the continuously Fréchet differentiable mapping $(x, y, z) \mapsto (y - H(x) + z, x, z)$ possesses a surjective derivative everywhere. Applying [38, Propositions 1.2, 1.107, 5.3 and Theorem 1.17], which is possible due to the reflexivity of $\mathcal X$ and $\mathcal Y$, we obtain the existence of $y_1^* \in \mathcal Y^\star$, $y_2^* \in \mathcal N_C(H(\bar x))$, and $\xi \in \mathcal N_S(\bar x)$ which satisfy

$$0 = h'(\bar x) - H'(\bar x)^\star[y_1^*] + \xi, \qquad -y_1^* \in L_h L_{\mathcal M}\, \partial \|\cdot\|_{\mathcal Y}(0), \qquad y_1^* + y_2^* = 0.$$

We put $\lambda := -y_1^*$ in order to see that the KKT conditions for (3) are valid at $\bar x$. This completes the proof. □
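The role of the calmness assumption in Proposition 3.3 can be illustrated by a one-dimensional instance that is not taken from this paper and is chosen here purely for illustration: $\mathcal X = \mathcal Y = \mathbb R$, $S = \mathbb R$, $C = (-\infty, 0]$, $H(x) = x^2$, and $h(x) = x$. Then $\mathcal M(y) = \{x \mid x^2 + y \leq 0\}$, the feasible set is $\mathcal M(0) = \{0\}$, and $\bar x = 0$ is the unique minimizer. The following minimal Python sketch (all function names are hypothetical) evaluates the quotient $\operatorname{dist}(x, \mathcal M(0)) / |y|$ along $y \uparrow 0$, which would have to remain bounded if $\mathcal M$ were calm at $(0, 0)$, and the residual of the stationarity equation $h'(0) + \lambda H'(0) = 0$.

```python
import math


def radius_of_perturbed_set(y):
    """M(y) = {x in R | x**2 + y <= 0} equals [-r, r] with r = sqrt(-y) for y <= 0."""
    if y > 0:
        raise ValueError("M(y) is empty for y > 0")
    return math.sqrt(-y)


def calmness_quotient(y):
    """dist(x, M(0)) / |y| for the extreme point x = sqrt(-y) of M(y), where M(0) = {0}."""
    return radius_of_perturbed_set(y) / abs(y)


def kkt_residual(lam):
    """h'(0) + lam * H'(0) for h(x) = x and H(x) = x**2 (no extra multiplier since S = R)."""
    return 1.0 + lam * 0.0


# Quotients along y_k = -10^(-k); calmness at (0, 0) would require them to stay bounded.
quotients = [calmness_quotient(-(10.0 ** -k)) for k in range(1, 7)]

for k, q in enumerate(quotients, start=1):
    print(f"y = -1e-{k}: quotient = {q:.2f}")
print("KKT residuals:", [kkt_residual(lam) for lam in (0.0, 1.0, 100.0)])
```

In this toy instance the quotients grow like $|y|^{-1/2}$, so no calmness modulus exists, and the residual equals $1$ for every multiplier, so the KKT conditions fail at the minimizer. This is consistent with Proposition 3.3 but does not prove anything beyond the chosen example.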

In the following lemma, we show a relationship between BCQ and the calmness of M.

Lemma 3.2 Let $\bar x \in M$ be a point where BCQ holds, assume that $\mathcal X$ as well as $\mathcal Y$ are reflexive, and suppose that $S$ is SNC at $\bar x$. Then $\mathcal M$ is calm at $(0, \bar x)$.

Proof Due to the closedness of $S$ and $C$ and the continuity of $H$, $\operatorname{gph} \mathcal M$ is closed. Now we show the relation $D_N^* \mathcal M(0, \bar x)(0) = \{0\}$ and that $\mathcal M$ is PSNC at $(0, \bar x)$. By means of [38, Theorem 4.10], this is sufficient for $\mathcal M$ to be Lipschitz-like and, hence, calm at this point.

Since $S$ is SNC at $\bar x$, $\mathcal Y \times S$ is SNC at $(0, \bar x)$. Due to the special structure of the perturbation mapping $\mathcal M$, we deduce the relationship $D_N^* \mathcal M(0, \bar x)(0) = \{0\}$ from [38, Corollary 1.69(ii), Theorem 4.32(b)] using the second condition postulated in the definition of BCQ.

Now we need to show that $\mathcal M$ is PSNC at $(0, \bar x)$. Exploiting the fact that $\mathcal Y \times S$ is SNC at this point, by means of [38, Corollary 3.80] we only need to show that the set-valued map $\mathcal M'\colon \mathcal Y \to 2^{\mathcal X}$ such that $\mathcal M'(y) = \{x \in \mathcal X \mid H(x) + y \in C\}$ is satisfied for any $y \in \mathcal Y$ is PSNC at $(0, \bar x)$. Hence, we choose sequences $\{(y_k, x_k)\} \subseteq \operatorname{gph} \mathcal M'$ and $\{(y_k^*, x_k^*)\} \subseteq \mathcal Y^\star \times \mathcal X^\star$ such that $(y_k, x_k) \to (0, \bar x)$, $y_k^* \rightharpoonup 0$, $x_k^* \to 0$, and $(y_k^*, x_k^*) \in \widehat{\mathcal N}_{\operatorname{gph} \mathcal M'}(y_k, x_k)$ for all $k \in \mathbb N$ hold true. Applying [38, Corollary 1.15], we obtain

$$\widehat{\mathcal N}_{\operatorname{gph} \mathcal M'}(y_k, x_k) = \{(v_k^*, H'(x_k)^\star[v_k^*]) \in \mathcal Y^\star \times \mathcal X^\star \mid v_k^* \in \widehat{\mathcal N}_C(H(x_k) + y_k)\}.$$

Thus, there is a sequence $\{v_k^*\} \subseteq \mathcal Y^\star$ satisfying $v_k^* \in \widehat{\mathcal N}_C(H(x_k) + y_k)$, $y_k^* = v_k^*$, as well as $x_k^* = H'(x_k)^\star[v_k^*]$ for any $k \in \mathbb N$. The observation

$$H'(\bar x)^\star[v_k^*] + \underbrace{\big(H'(x_k)^\star - H'(\bar x)^\star\big)[v_k^*]}_{\to 0} = x_k^* \to 0$$

leads to $H'(\bar x)^\star[v_k^*] \to 0$ since $\{v_k^*\}$ converges weakly and, hence, is bounded. From BCQ, we know that $H'(\bar x)$ is surjective which is why $H'(\bar x)^\star$ is injective, see Remark 2.2. Moreover, $H'(\bar x)^\star[\mathcal Y^\star]$ is closed by [54, Theorem IV.5.1]. Applying Lemma 2.2, we find a constant $c > 0$ such that

$$0 \leq c\, \|v_k^*\|_{\mathcal Y^\star} \leq \|H'(\bar x)^\star[v_k^*]\|_{\mathcal X^\star} \to 0$$

holds, i.e. $v_k^* \to 0$ is satisfied which is equivalent to $y_k^* \to 0$. Hence, $\mathcal M'$ is PSNC at $(0, \bar x)$. Due to the above argumentation, $\mathcal M$ is PSNC at $(0, \bar x)$ as well. This completes the proof. □

Once more we want to take a look at the finite-dimensional situation. The upcoming result and its proof can be found in [16, Proposition 2.3].

Remark 3.2 Let $\mathcal X$ as well as $\mathcal Y$ be finite-dimensional Banach spaces and assume that $\bar x \in M$ is a point where the condition (8) is satisfied. Then the corresponding perturbation mapping $\mathcal M$ is calm at $(0, \bar x)$.

Remark 3.3 The relationship between ACQ and calmness is investigated in [25, Proposition 1] for maps in finite-dimensional spaces where it is established that the latter condition is weaker than the former one under reasonable assumptions. This result can be easily generalized to a general infinite-dimensional framework.

Remark 3.4 Be aware that the conditions in Lemma 3.2 and Remark 3.2 already imply that the perturbation mapping $\mathcal M$ is Lipschitz-like at the reference point, which is a stronger property than calmness. That is why the calmness assumption is a much weaker constraint qualification than BCQ in general, see Proposition 3.3.

4 MPCCs in Banach spaces

In this section, we take a closer look at MPCCs in Banach spaces which can be stated as given below:

$$\begin{aligned} h(x) &\to \min\\ a(x) &\in C\\ A(x) &\in K\\ B(x) &\in K^\circ\\ \langle A(x), B(x) \rangle &= 0. \end{aligned} \tag{MPCC}$$

Here, $h\colon \mathcal X \to \mathbb R$, $a\colon \mathcal X \to \mathcal Y$, $A\colon \mathcal X \to \mathcal Z$, and $B\colon \mathcal X \to \mathcal Z^\star$ are continuously Fréchet differentiable mappings between real Banach spaces $\mathcal X$, $\mathcal Y$, and $\mathcal Z$ such that $\mathcal Z$ is reflexive. Furthermore, $C \subseteq \mathcal Y$ is a nonempty, closed, convex set, while $K \subseteq \mathcal Z$ is a nonempty, closed, convex cone. The reflexivity of $\mathcal Z$ yields the symmetry of the complementarity condition.

Note that (MPCC) is a quite irregular problem. The constraint qualification KRZCQ fails to hold at any feasible point of it, cf. [36, Lemma 3.1]. Let $\bar x$ be a feasible point of (MPCC). Then the tangent cone to the feasible set of (MPCC) at $\bar x$ turns out to be generally nonconvex, while the corresponding linearized tangent cone is convex. Hence, we cannot expect ACQ to be satisfied at $\bar x$ in general. As was shown in [19] by means of a simple example, GCQ may be satisfied at a feasible point of (MPCC) under appropriate assumptions on the given problem. Hence, the KKT conditions of (MPCC) at $\bar x$, i.e., the existence of $\lambda \in \mathcal Y^\star$, $\mu \in \mathcal Z^\star$, and $\nu \in \mathcal Z$ which satisfy

$$\begin{aligned}
0 &= h'(\bar x) + a'(\bar x)^\star[\lambda] + A'(\bar x)^\star[\mu] + B'(\bar x)^\star[\nu], && \text{(10a)}\\
\lambda &\in \mathcal T_C(a(\bar x))^\circ, && \text{(10b)}\\
\mu &\in \mathcal R_{K^\circ}(B(\bar x)) \cap A(\bar x)^\perp, && \text{(10c)}\\
\nu &\in \mathcal R_K(A(\bar x)) \cap B(\bar x)^\perp, && \text{(10d)}
\end{aligned}$$

see [51, Lemma 5.1], may hold. Nevertheless, these conditions turn out to be too strong to yield applicable necessary optimality conditions for (MPCC) in general. In the finite-dimensional setting, various stationarity concepts for (MPCC) have been formulated for example in [41, 45, 60].

Here, we recall the generalized concepts of weak and strong stationarity for (MPCC), recently introduced in [36] and [51], respectively. Afterwards, we define and study a generalized notion of M-stationarity which turns out to be reasonable w.r.t. common finite-dimensional, semidefinite, and polyhedral MPCCs.

A feasible point $\bar x$ of (MPCC) is called strongly stationary (S-stationary for short) if there are multipliers $\lambda \in \mathcal Y^\star$, $\mu \in \mathcal Z^\star$, and $\nu \in \mathcal Z$ which satisfy (10a), (10b), as well as

$$\begin{aligned}
\mu &\in \mathcal K_{K^\circ}(B(\bar x), A(\bar x)), && \text{(11a)}\\
\nu &\in \mathcal K_K(A(\bar x), B(\bar x)). && \text{(11b)}
\end{aligned}$$

Observe that the conditions (11a) and (11b) are weaker than the corresponding conditions (10c) and (10d), respectively. Hence, any KKT point of (MPCC) is S-stationary, but the converse implication, which holds for common MPCCs in finite dimensions, is generally not satisfied. It was shown in [51] that the S-stationarity conditions provide a necessary optimality condition of reasonable strength whenever the cone $K$ is polyhedric. In the absence of polyhedricity, one could use linearization approaches in order to obtain better optimality conditions than S-stationarity, see [51, Section 6] and [53].

A feasible point $\bar x$ of (MPCC) is called weakly stationary (W-stationary for short) whenever there exist multipliers $\lambda \in \mathcal Y^\star$, $\mu \in \mathcal Z^\star$, and $\nu \in \mathcal Z$ such that (10a) and (10b) as well as

$$\begin{aligned}
\mu &\in \operatorname{cl}\big(K^\circ - K^\circ \cap A(\bar x)^\perp\big) \cap A(\bar x)^\perp, && \text{(12a)}\\
\nu &\in \operatorname{cl}\big(K - K \cap B(\bar x)^\perp\big) \cap B(\bar x)^\perp && \text{(12b)}
\end{aligned}$$

are satisfied. Note that any S-stationary point is also W-stationary, see [36, Lemma 3.4].

It was shown in [51, Proposition 5.2] that a local minimizer $\bar x$ of (MPCC) is S-stationary whenever the linear operator $(a'(\bar x), A'(\bar x), B'(\bar x))$ is surjective. Observe that this condition is a stronger version of MPCC-LICQ (cf. [19, Definition 4.4]) for common finite-dimensional MPCCs.

We state the following result which is a direct consequence of [51, Lemma 5.1, Proposition 5.2] and the definition of the critical cones appearing in the strong stationarity conditions.

Corollary 4.1 Let $\bar x \in \mathcal X$ be a local optimal solution of (MPCC) where $(a'(\bar x), A'(\bar x), B'(\bar x))$ is a surjective operator, whereas the cones $\mathcal R_K(A(\bar x))$ and $\mathcal R_{K^\circ}(B(\bar x))$ are closed. Then $\bar x$ is a KKT point of (MPCC).

Remark 4.1 Let $\bar x \in \mathcal X$ be a feasible point of (MPCC). Observe that in the case where $\mathcal Z$ is infinite-dimensional, the cones $\mathcal R_K(A(\bar x))$ and $\mathcal R_{K^\circ}(B(\bar x))$ are rarely closed. A similar situation appears for semidefinite MPCCs, see [9, Section 5.3.1]. On the other hand, the closedness assumption is always satisfied if $K$ is a polyhedral cone, see the proof of Lemma 2.3. Note that this closedness assumption implies that the S-stationarity and KKT conditions are equivalent which, as we already mentioned above, is not the case for general MPCCs.

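Recall that a feasible point $\bar x$ of a common finite-dimensional MPCC (i.e. $\mathcal X = \mathbb R^n$, $\mathcal Y = \mathbb R^m$, $\mathcal Z = \mathbb R^k$, $C = -\mathbb R^{m,+}_0$, $K = \mathbb R^{k,+}_0$) is called M-stationary if there are $\lambda \in \mathbb R^m$, $\mu, \nu \in \mathbb R^k$, which solve the system

$$\begin{aligned}
&0 = h'(\bar x) + \sum_{j=1}^m \lambda_j a_j'(\bar x) + \sum_{i=1}^k \mu_i A_i'(\bar x) + \sum_{i=1}^k \nu_i B_i'(\bar x),\\
&\lambda \geq 0, \quad \lambda^\top a(\bar x) = 0,\\
&\forall i \in I^{+0}(\bar x)\colon\ \mu_i = 0,\\
&\forall i \in I^{0-}(\bar x)\colon\ \nu_i = 0,\\
&\forall i \in I^{00}(\bar x)\colon\ \mu_i \nu_i = 0 \ \vee\ (\mu_i < 0 \wedge \nu_i > 0).
\end{aligned} \tag{13}$$

Here, the index sets $I^{+0}(\bar x)$, $I^{0-}(\bar x)$, and $I^{00}(\bar x)$ are defined as stated below:

$$\begin{aligned}
I^{+0}(\bar x) &:= \{i \in \{1, \dots, k\} \mid A_i(\bar x) > 0 \wedge B_i(\bar x) = 0\},\\
I^{0-}(\bar x) &:= \{i \in \{1, \dots, k\} \mid A_i(\bar x) = 0 \wedge B_i(\bar x) < 0\},\\
I^{00}(\bar x) &:= \{i \in \{1, \dots, k\} \mid A_i(\bar x) = 0 \wedge B_i(\bar x) = 0\}.
\end{aligned}$$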
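The last three conditions in (13) are purely combinatorial once the values $A(\bar x)$, $B(\bar x)$ and candidate multipliers $\mu$, $\nu$ are known. The following Python sketch checks only these sign conditions for the standard finite-dimensional setting $A(\bar x) \geq 0$, $B(\bar x) \leq 0$. It does not check the stationarity equation or the conditions on $\lambda$. The function names and the tolerance `tol` are assumptions made for this illustration.

```python
def index_sets(A, B, tol=1e-12):
    """Split the indices into I^{+0}, I^{0-}, I^{00} for a feasible pair A >= 0, B <= 0."""
    I_plus0, I_0minus, I_00 = [], [], []
    for i, (a, b) in enumerate(zip(A, B)):
        if a > tol and abs(b) <= tol:
            I_plus0.append(i)
        elif abs(a) <= tol and b < -tol:
            I_0minus.append(i)
        elif abs(a) <= tol and abs(b) <= tol:
            I_00.append(i)
        else:
            raise ValueError(f"index {i} is infeasible or violates complementarity")
    return I_plus0, I_0minus, I_00


def satisfies_m_sign_conditions(A, B, mu, nu, tol=1e-12):
    """Check the last three conditions of system (13) for given multipliers mu, nu."""
    I_plus0, I_0minus, I_00 = index_sets(A, B, tol)
    if any(abs(mu[i]) > tol for i in I_plus0):
        return False
    if any(abs(nu[i]) > tol for i in I_0minus):
        return False
    for i in I_00:
        product_zero = abs(mu[i]) <= tol or abs(nu[i]) <= tol  # mu_i * nu_i = 0
        strict_signs = mu[i] < -tol and nu[i] > tol            # mu_i < 0 and nu_i > 0
        if not (product_zero or strict_signs):
            return False
    return True


A = [1.0, 0.0, 0.0]   # index 0 lies in I^{+0}
B = [0.0, -2.0, 0.0]  # index 1 lies in I^{0-}, index 2 in I^{00}
print(index_sets(A, B))
print(satisfies_m_sign_conditions(A, B, mu=[0.0, 3.0, -1.0], nu=[5.0, 0.0, 2.0]))
print(satisfies_m_sign_conditions(A, B, mu=[0.0, 3.0, 1.0], nu=[5.0, 0.0, 2.0]))
```

The biactive index is the only place where the M-stationarity pattern differs from a plain product condition: a pair with $\mu_i > 0$ and $\nu_i > 0$ is rejected, whereas a pair with one vanishing component is accepted regardless of the sign of the other.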

Observe that the last three conditions in (13) are equivalent to $(\mu, \nu) \in \mathcal N_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(A(\bar x), B(\bar x))$, where $\mathcal T_K(\cdot)^\circ$ denotes the normal cone mapping generated by the convex cone $K$, i.e., the mapping $x \mapsto \mathcal T_K(x)^\circ$. This observation gives rise to the following equivalent reformulation of (MPCC):

$$\begin{aligned} h(x) &\to \min\\ a(x) &\in C\\ (A(x), B(x)) &\in \operatorname{gph} \mathcal T_K(\cdot)^\circ. \end{aligned}$$

Indeed, this problem is equivalent to (MPCC) since we have

$$(u, v) \in \operatorname{gph} \mathcal T_K(\cdot)^\circ \iff u \in K \ \wedge\ v \in K^\circ \ \wedge\ v \in u^\perp.$$

Now we introduce dummy variables in order to state (MPCC) as a problem of type (3) with a nonconvex set $S$:

$$\begin{aligned} h(x) &\to \min_{x, u, v}\\ a(x) &\in C\\ A(x) - u &= 0\\ B(x) - v &= 0\\ (u, v) &\in \operatorname{gph} \mathcal T_K(\cdot)^\circ. \end{aligned} \tag{14}$$

This justifies the following definition.

Definition 4.1 A feasible point $\bar x \in \mathcal X$ of (MPCC) is called Mordukhovich stationary (M-stationary for short) if there exist multipliers $\lambda \in \mathcal Y^\star$, $\mu \in \mathcal Z^\star$, and $\nu \in \mathcal Z$ which satisfy (10a), (10b), and

$$(\mu, \nu) \in \mathcal N_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(A(\bar x), B(\bar x)).$$

Observe that without a specific setting this definition is rarely applicable since the appearing basic normal cone is usually difficult to compute. However, as presented above, if a common finite-dimensional MPCC is considered, then this generalization of M-stationarity equals the already existing notion. Furthermore, in the setting of semidefinite complementarity programming, the notion of M-stationarity is derived in a similar way, see [16]. A formula for the corresponding basic normal cone to the graph of the normal cone mapping for the cone $S_k^+$ is presented in [16, Theorem 3.1].

Clearly, in the finite-dimensional case, the introduced stationarity concepts correspond exactly to the concepts from literature. Therefore, from results for common finite-dimensional MPCCs, we expect

$$\text{S-stationarity} \Longrightarrow \text{M-stationarity} \Longrightarrow \text{W-stationarity} \tag{15}$$

for any feasible point of (MPCC). However, combining [51, Section 6] and [16, Definition 6.1] shows that there does not need to exist a relation between S- and M-stationary points (in the above meaning) for semidefinite MPCCs. Here, the difficulties arise from the fact that the cone $S_k^+$ is not polyhedric everywhere. Even if we restrict ourselves to the consideration of polyhedric cones, the situation might be delicate: in [35], the authors show that M- and W-stationarity are equivalent for MPCCs whose complementarity condition is induced by the polyhedric cone $L^2(\Omega)^+_0$.

In the following, we are going to verify that for polyhedric cones $K$, we always obtain that an S-stationary point is M-stationary as well. Therefore, we need the following result which has been shown in the setting of Hilbert spaces in [53, Lemma 3.11]. Here, we provide a generalized version in reflexive Banach spaces.

Lemma 4.1 Let $(z, z^*) \in \operatorname{gph} \mathcal T_K(\cdot)^\circ$ be given such that $K$ is polyhedric w.r.t. $(z, z^*)$. Then the following calculation rules hold:

$$\begin{aligned}
\mathcal T_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*) &= \{(d, d^*) \in \mathcal K_K(z, z^*) \times \mathcal K_{K^\circ}(z^*, z) \mid \langle d, d^* \rangle = 0\},\\
\operatorname{conv}\big(\mathcal T_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*)\big) &= \mathcal K_K(z, z^*) \times \mathcal K_{K^\circ}(z^*, z).
\end{aligned}$$

Proof From [33, Theorem 3.1] it is possible to deduce

$$\mathcal T_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*) = \{(d, d^*) \in \mathcal K_K(z, z^*) \times \mathcal K_K(z, z^*)^\circ \mid \langle d, d^* \rangle = 0\}.$$

Exploiting the polyhedricity of $K$ w.r.t. $(z, z^*)$, we obtain $\mathcal K_K(z, z^*)^\circ = \mathcal K_{K^\circ}(z^*, z)$ from [51, Lemma 5.2] which shows the first formula.

For the validation of the second formula, we first observe that the inclusion $\subseteq$ is a straightforward consequence of the lemma's first assertion. For the proof of the converse inclusion, choose $d \in \mathcal K_K(z, z^*)$ and $d^* \in \mathcal K_{K^\circ}(z^*, z)$ arbitrarily. Then we obviously have $(2d, 0), (0, 2d^*) \in \mathcal T_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*)$ from the first formula of this lemma. This yields $(d, d^*) \in \operatorname{conv}\big(\mathcal T_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*)\big)$, and the proof is completed. □

Now we can prove the desired relation between S- and M-stationarity in the polyhedric situation.

Lemma 4.2 Let $\bar x \in \mathcal X$ be a feasible point of (MPCC) where the complementarity cone $K$ is polyhedric w.r.t. $(z, z^*)$ with $z := A(\bar x)$ and $z^* := B(\bar x)$. Then

$$\widehat{\mathcal N}_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*) = \mathcal K_{K^\circ}(z^*, z) \times \mathcal K_K(z, z^*)$$

is satisfied. Particularly, if $\bar x$ is an S-stationary point of (MPCC), it is M-stationary as well.

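Proof We proceed in two steps. First, we show

$$\mathcal T_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*) \subseteq \mathcal T^w_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*) \subseteq \operatorname{conv}\big(\mathcal T_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*)\big). \tag{16}$$

The first of these inclusions is clear because of the definition of the appearing tangent cones. Hence, we verify the second one. Therefore, choose $(u, v) \in \mathcal T^w_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*)$ arbitrarily. Then there are sequences $\{(u_k, v_k)\} \subseteq \operatorname{gph} \mathcal T_K(\cdot)^\circ$ and $\{t_k\} \subseteq \mathbb R$ satisfying $u_k \to z$, $v_k \to z^*$, $t_k \searrow 0$, and

$$\frac{u_k - z}{t_k} \rightharpoonup u, \qquad \frac{v_k - z^*}{t_k} \rightharpoonup v.$$

Obviously, we have $\frac{u_k - z}{t_k} \in \mathcal R_K(z)$ and $\frac{v_k - z^*}{t_k} \in \mathcal R_{K^\circ}(z^*)$ for all $k \in \mathbb N$. Hence, by Mazur's theorem and the convexity of $K$ and $K^\circ$, we obtain $u \in \mathcal T_K(z)$ and $v \in \mathcal T_{K^\circ}(z^*)$. Now choose some $w \in K \cap (z^*)^\perp$. Then we have

$$\Big\langle w - u_k, \frac{v_k - z^*}{t_k} \Big\rangle \leq \Big\langle w - u_k, -\frac{z^*}{t_k} \Big\rangle = \Big\langle z - u_k, -\frac{z^*}{t_k} \Big\rangle = \Big\langle \frac{u_k - z}{t_k}, z^* \Big\rangle \leq 0,$$

and passing to the limit $k \to \infty$ yields

$$\langle w - z, v \rangle \leq \langle u, z^* \rangle \leq 0.$$

Using $w := z$, we obtain $u \in (z^*)^\perp$ and, hence, $u \in \mathcal K_K(z, z^*)$ holds. Moreover, we derive $v \in z^\perp$ by fixing $w := 0$ and $w := 2z$. Consequently, $v \in \mathcal K_{K^\circ}(z^*, z)$ is satisfied. From Lemma 4.1 we have

$$\mathcal K_K(z, z^*) \times \mathcal K_{K^\circ}(z^*, z) = \operatorname{conv}\big(\mathcal T_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*)\big),$$

which finally yields the second inclusion.

Now we simply polarize (16). From $M^\circ = \operatorname{conv}(M)^\circ$ for arbitrary sets $M$, we obtain

$$\begin{aligned}
\widehat{\mathcal N}_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*) &= \mathcal T^w_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*)^\circ\\
&= \mathcal K_K(z, z^*)^\circ \times \mathcal K_{K^\circ}(z^*, z)^\circ\\
&= \mathcal K_{K^\circ}(z^*, z) \times \mathcal K_K(z, z^*),
\end{aligned}$$

where the last equality holds due to the polyhedricity of $K$ w.r.t. $(z, z^*)$, see [51, Lemma 5.2]. This completes the proof. □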
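The formula of Lemma 4.2 can be checked by hand in the simplest polyhedral case, which is added here only as an illustration: $\mathcal Z = \mathbb R$, $K = \mathbb R^+_0$, and $(z, z^*) = (0, 0)$. By the characterization of $\operatorname{gph} \mathcal T_K(\cdot)^\circ$ given above, the graph is the union of the two rays generated by $(1, 0)$ and $(0, -1)$. Since this set is a closed cone, its Fréchet normal cone at the origin is its polar cone. Assuming the standard form $\mathcal K_K(z, z^*) = \mathcal T_K(z) \cap (z^*)^\perp$ of the critical cone (its definition lies outside this section), Lemma 4.2 predicts the product $K^\circ \times K = (-\mathbb R^+_0) \times \mathbb R^+_0$. The sketch below compares both descriptions on a small grid; all names are hypothetical.

```python
# Generators of gph T_K(.)° = {(u, v) | u >= 0, v <= 0, u * v = 0} for K = [0, inf).
GENERATORS = [(1.0, 0.0), (0.0, -1.0)]


def in_polar_of_graph(mu, nu):
    """(mu, nu) lies in the polar cone iff it has nonpositive pairing with both generating rays."""
    return all(mu * u + nu * v <= 0.0 for (u, v) in GENERATORS)


def in_predicted_product(mu, nu):
    """Membership in K° x K = (-inf, 0] x [0, inf), the right-hand side of Lemma 4.2 at (0, 0)."""
    return mu <= 0.0 and nu >= 0.0


grid = [i / 2.0 for i in range(-4, 5)]
agree = all(
    in_polar_of_graph(mu, nu) == in_predicted_product(mu, nu)
    for mu in grid
    for nu in grid
)
print("descriptions agree on the grid:", agree)
```

A grid comparison of this kind is of course no proof; it merely makes the sign pattern of the Fréchet normal cone in this scalar example visible.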

As we remarked earlier, the polyhedricity assumption is necessary for S-stationarity to imply M-stationarity since this relation is not necessarily satisfied for semidefinite MPCCs. In the next lemma, we show that for polyhedral complementarity cones, the relations in (15) are valid.

Lemma 4.3 Let x¯ ∈ X be a feasible point of (MPCC) where the complementarity cone K is given as stated in (1). Then the implications from (15) hold for x¯.

Proof Note that due to Lemma 2.3, the cone $K$ is polyhedric. Hence, by Lemma 4.2, S-stationarity implies M-stationarity. Consequently, we only need to show

$$\mathcal N_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(z, z^*) \subseteq \Big[\operatorname{cl}\big(K^\circ - K^\circ \cap z^\perp\big) \cap z^\perp\Big] \times \Big[\operatorname{cl}\big(K - K \cap (z^*)^\perp\big) \cap (z^*)^\perp\Big] \tag{17}$$

where $z := A(\bar x)$ and $z^* := B(\bar x)$. First, we define the index sets $I := I(z, z^*)$ and $J := J(z, z^*)$ as stated in (2). By means of Lemma 2.4, it is sufficient to show

$$\begin{aligned}
\operatorname{cl}\big(K^\circ - K^\circ \cap z^\perp\big) \cap z^\perp &= \operatorname{lin}(\{z_i^* \mid i \in I\}),\\
\operatorname{cl}\big(K - K \cap (z^*)^\perp\big) \cap (z^*)^\perp &= \{z_i^* \mid i \in J\}^\perp.
\end{aligned}$$

We start with the verification of the first equality. Therefore, observe $K^\circ \cap z^\perp = \operatorname{cone}(\{z_i^* \mid i \in I\})$. This leads to

$$K^\circ - K^\circ \cap z^\perp = \operatorname{cone}(\{z_i^* \mid i \in \{1, \dots, k\} \setminus I\}) + \operatorname{lin}(\{z_i^* \mid i \in I\}),$$

which is a closed set by means of [9, Proposition 2.201]. Hence, we obtain

$$\operatorname{cl}\big(K^\circ - K^\circ \cap z^\perp\big) \cap z^\perp = \big(K^\circ - K^\circ \cap z^\perp\big) \cap z^\perp = \operatorname{lin}(\{z_i^* \mid i \in I\}),$$

i.e. the first assertion is correct. In order to prove the second one, we show both inclusions separately. Therefore, we first observe

$$K \cap (z^*)^\perp = \{w \in \mathcal Z \mid \langle w, z_i^* \rangle \leq 0 \ \forall i \in \{1, \dots, k\} \setminus J\} \cap \{z_i^* \mid i \in J\}^\perp.$$

This yields

$$K - K \cap (z^*)^\perp \subseteq \{w \in \mathcal Z \mid \langle w, z_i^* \rangle \leq 0 \ \forall i \in J\}.$$

Since the set on the right is closed, we have

$$\operatorname{cl}\big(K - K \cap (z^*)^\perp\big) \cap (z^*)^\perp \subseteq \{w \in \mathcal Z \mid \langle w, z_i^* \rangle \leq 0 \ \forall i \in J\} \cap (z^*)^\perp = \{z_i^* \mid i \in J\}^\perp.$$

On the other hand, the set $\{z_i^* \mid i \in J\}^\perp$ is obviously contained in $(z^*)^\perp$. Moreover, we obtain

$$\{z_i^* \mid i \in J\}^\perp \subseteq K \cap (z^*)^\perp - K \cap (z^*)^\perp \subseteq K - K \cap (z^*)^\perp \subseteq \operatorname{cl}\big(K - K \cap (z^*)^\perp\big),$$

which shows the other inclusion, i.e. the second equation is satisfied as well. □

From Lemma 2.4 and the proof of the above lemma, we obtain the following corollary which states explicit representations of the S-, M-, and W-stationarity conditions for MPCCs whose complementarity cone is polyhedral.

Corollary 4.2 Let $\bar x \in \mathcal X$ be a feasible point of (MPCC) where the complementarity cone $K$ is given as stated in (1). Let $I := I(A(\bar x), B(\bar x))$ and $J := J(A(\bar x), B(\bar x))$ be the index sets defined in (2). Then $\bar x$ is

1. W-stationary if and only if there exist $\lambda \in \mathcal Y^\star$, $\mu_1, \dots, \mu_k \in \mathbb R$, and $\nu \in \mathcal Z$ which solve the system

$$\begin{aligned}
&0 = h'(\bar x) + a'(\bar x)^\star[\lambda] + \sum_{i=1}^k \mu_i A'(\bar x)^\star[z_i^*] + B'(\bar x)^\star[\nu], && \text{(18a)}\\
&\lambda \in \mathcal T_C(a(\bar x))^\circ, && \text{(18b)}\\
&\forall i \in \{1, \dots, k\} \setminus I\colon\ \mu_i = 0, && \text{(18c)}\\
&\forall i \in J\colon\ \langle \nu, z_i^* \rangle = 0; && \text{(18d)}
\end{aligned}$$

2. M-stationary if and only if there exist multipliers $\lambda \in \mathcal Y^\star$, $\mu_1, \dots, \mu_k \in \mathbb R$, and $\nu \in \mathcal Z$ as well as a partition $\{I_1, I_2, I_3\}$ of $I \setminus J$ which satisfy the conditions in (18) and

$$\begin{aligned}
&\forall i \in I_1\colon\ \mu_i = 0, && \text{(19a)}\\
&\forall i \in I_2\colon\ \langle \nu, z_i^* \rangle = 0, && \text{(19b)}\\
&\forall i \in I_3\colon\ \mu_i > 0 \wedge \langle \nu, z_i^* \rangle < 0; && \text{(19c)}
\end{aligned}$$

3. S-stationary if and only if there exist multipliers $\lambda \in \mathcal Y^\star$, $\mu_1, \dots, \mu_k \in \mathbb R$, and $\nu \in \mathcal Z$ which satisfy the conditions in (18) and

$$\forall i \in I \setminus J\colon\quad \mu_i \geq 0 \wedge \langle \nu, z_i^* \rangle \leq 0. \tag{20}$$

Observe that the conditions (19) can be equivalently represented by

$$\forall i \in I \setminus J\colon\quad \langle \nu, \mu_i z_i^* \rangle = 0 \ \vee\ (\mu_i > 0 \wedge \langle \nu, z_i^* \rangle < 0).$$

Recalling the standard finite-dimensional MPCC, where $\mathcal Z = \mathbb R^k$ and $K = \mathbb R^{k,+}_0$ hold, we have

$$K = \{z \in \mathbb R^k \mid \langle z, -e_i \rangle \leq 0 \ \forall i \in \{1, \dots, k\}\}.$$

Here, $e_i \in \mathbb R^k$ denotes the $i$-th unit vector in $\mathbb R^k$. Hence, it is easily seen that the aforementioned notion of M-stationarity for standard MPCCs coincides with the M-stationarity notion w.r.t. an MPCC whose complementarity cone is polyhedral.

In [7], the authors discuss a linear problem with conic constraints comprising the nonpolyhedral (and nonpolyhedric) second-order cone $K^k \subseteq \mathbb R^{k+1}$ given by

$$K^k = \{(x, t) \in \mathbb R^k \times \mathbb R \mid t \geq \|x\|_2\}$$

where $\|\cdot\|_2$ denotes the Euclidean norm in $\mathbb R^k$. They use a polyhedral approximation of the cone $K^k$ in order to simplify the given conic problem. It is shown that this approximation is reasonably good under mild assumptions. Transferring this idea to (MPCC), one could think of approximating the possibly nonpolyhedral complementarity cone by a polyhedral one in certain situations to obtain a surrogate MPCC of a type studied in Lemma 4.3 and Corollary 4.2. How this can be done and how the two problems are related is, however, beyond the scope of this paper and a question of future research.

We strongly believe that the relations in (15) hold in much more general situations than the ones described in Lemmata 4.2 and 4.3. However, a detailed study of M-stationarity is beyond the scope of this paper and a topic of future research as well.

We want to close this section with the presentation of general situations where M-stationarity is a necessary optimality condition for (MPCC). Applying Proposition 3.2 to (14), we obtain the following result.

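Proposition 4.1 Let $\bar x \in \mathcal X$ be a local optimal solution of (MPCC) such that $\operatorname{gph} \mathcal T_K(\cdot)^\circ$ is SNC at $(A(\bar x), B(\bar x))$, and let $\mathcal X$ be reflexive. Furthermore, let $a'(\bar x)$ be surjective, while the following constraint qualification holds:

$$\left.\begin{aligned} &0 = a'(\bar x)^\star[\lambda] + A'(\bar x)^\star[\mu] + B'(\bar x)^\star[\nu],\\ &\lambda \in \mathcal T_C(a(\bar x))^\circ,\ (\mu, \nu) \in \mathcal N_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(A(\bar x), B(\bar x))\end{aligned}\right\} \Longrightarrow\ \mu = 0,\ \nu = 0. \tag{21}$$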

Then x¯ is an M-stationary point of (MPCC).

Let $\mathcal X$ and $\mathcal Y$ be reflexive. Be aware that the constraint qualification (21) is satisfied for some feasible point $\bar x \in \mathcal X$ of (MPCC) whenever the linear operator $(a'(\bar x), A'(\bar x), B'(\bar x))$ has a dense range since in this case, its adjoint is injective, see Remark 2.2. This condition may be strictly weaker than the surjectivity of $(a'(\bar x), A'(\bar x), B'(\bar x))$ if at least one of the spaces $\mathcal Y$ or $\mathcal Z$ is infinite-dimensional.

We have to mention that the SNC property of the complementarity set is restrictive in a function space setting as presented in [31, Example 2] for the Sobolev space $W^{1,2}(\Omega)$ where $\Omega \subseteq \mathbb R$ is some interval. The case $\Omega \subseteq \mathbb R^d$ with $d \geq 2$ needs to be considered in the future. In [35, Lemma 4.8], the authors show that the complementarity set which is induced by the polyhedric cone $L^2(\Omega)^+_0$ is nowhere SNC, and this observation does not depend on the domain's dimension.

5 Optimality conditions for the SCBPP

Now we want to deduce optimality conditions for (SCBPP) using some reformulations of the problem. Therefore, we first use an optimal value approach where we replace the lower level problem by two new upper level constraints ensuring lower level feasibility and lower level optimality by bounding the lower level objective function value appropriately from above. On the other hand, we can replace the lower level problem by its KKT conditions provided the set $\Theta$ is stated in a certain form. Due to the results in [14], it seems necessary to study the relationship between the original and the surrogate problem.

In order to characterize the solutions of (SCBPP), one first needs to ensure that this convex optimization problem possesses a solution. Therefore, we included the subsequent theorem (the proof of this standard result can be deduced from the contributions in [30]). Recall that the general assumptions on (SCBPP) only comprise that $F$ and $f$ are convex and continuous, while $\Theta$ is nonempty, closed, and convex.

Theorem 5.1 Problem (SCBPP) possesses an optimal solution if X is reflexive, while Θ is bounded or f is coercive.

5.1 Optimal value approach

In this section, we assume that F and f are continuously Fr´echet differentiable. Suppose that the set Ψ is nonempty. Then we denote the optimal value of the lower level problem by α ∈ R, i.e., α := f(x∗) for arbitrary x∗ ∈ Ψ. Consequently, we have Ψ = {x ∈ Θ | f(x) − α ≤ 0}, and (SCBPP) is equivalent to

F (x) → min f(x) − α ≤ 0 (22) x ∈ Θ which is a standard nonlinear but convex optimization problem in Banach spaces. Slater’s constraint qualification generally fails to hold everywhere, since there is no feasible point which satisfies the first constraint in (22) as a strict inequality. Checking the standard constraint qualifications, we easily obtain the following result.

Lemma 5.1 KRZCQ fails to hold at any feasible point of (22).

Proof Let $\bar x \in \mathcal X$ be a feasible point of (22). Hence, it follows from $\bar x \in \Psi$ that $f'(\bar x)[d] \ge 0$ holds for all $d \in \mathcal T_\Theta(\bar x)$, see [30, Theorem 4.14]. Furthermore, we have $f(\bar x) = \alpha$. This leads to

\[
f'(\bar x)[\mathcal R_\Theta(\bar x)] - \mathcal R_{-\mathbb R_0^+}(0) \subseteq \mathbb R_0^+ + \mathbb R_0^+ = \mathbb R_0^+,
\]

i.e. KRZCQ fails to hold at $\bar x$. □

On the other hand, the constraint qualification ACQ may be satisfied at a feasible point of (SCBPP), as the following example illustrates.

Example 5.1 Consider $\mathcal X := \mathbb R^2$, $\Theta := [0,1]^2$, and $F, f\colon \mathbb R^2 \to \mathbb R$ given as stated below:

\[
\forall x = (x_1, x_2) \in \mathbb R^2\colon \quad F(x) := x_1^2 + x_2^2, \qquad f(x) := (x_1 - 2)^2.
\]

Then we have $\Psi = \{1\} \times [0,1]$, and the global optimal solution of the corresponding (SCBPP) is $\bar x = (1, 0)$. Observe that we have $\mathcal T_\Psi(\bar x) = \operatorname{cone}(\{(0,1)\})$ and

\[
\begin{aligned}
\mathcal L_\Psi(\bar x) &= \{(d_1, d_2) \in \operatorname{cone}(\{(-1,0),(0,1)\}) \mid -2d_1 \le 0\} = \operatorname{cone}(\{(0,1)\}), \\
\mathcal S_\Psi(\bar x) &= \{(-2\lambda + \xi_1, \xi_2) \in \mathbb R^2 \mid \lambda \ge 0,\ (\xi_1, \xi_2) \in \operatorname{cone}(\{(1,0),(0,-1)\})\} = \mathbb R \times (-\mathbb R_0^+).
\end{aligned}
\]

Consequently, ACQ holds at $\bar x$.

The latter observation yields that there are situations where the KKT conditions of (22) may be necessary and sufficient optimality conditions since ACQ (and, hence, GCQ) may hold for (22).

Theorem 5.2 Let $\bar x \in \mathcal X$ be a feasible point of (SCBPP) where GCQ for (22) holds. Then $\bar x$ is an optimal solution of (SCBPP) if and only if the following condition is satisfied:

\[
\exists \lambda \ge 0 \ \exists \xi \in \mathcal T_\Theta(\bar x)^\circ\colon \quad 0 = F'(\bar x) + \lambda \cdot f'(\bar x) + \xi. \tag{23}
\]
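Condition (23) can be checked by hand for the data of Example 5.1. The following Python sketch is only an illustration under stated assumptions: the function names and the hand-picked multipliers are ours, and the polar cone $\mathcal T_\Theta(\bar x)^\circ = \operatorname{cone}(\{(1,0),(0,-1)\})$ is taken from the computation in Example 5.1.

```python
# Data of Example 5.1 at the candidate point x_bar = (1, 0).
x_bar = (1.0, 0.0)


def grad_F(x):
    # F(x) = x1^2 + x2^2
    return (2.0 * x[0], 2.0 * x[1])


def grad_f(x):
    # f(x) = (x1 - 2)^2
    return (2.0 * (x[0] - 2.0), 0.0)


# Generators of the polar cone of T_Theta(x_bar) = cone{(-1, 0), (0, 1)}.
POLAR_GENERATORS = [(1.0, 0.0), (0.0, -1.0)]


def residual_23(lam, coeffs):
    """Residual of 0 = F'(x_bar) + lam * f'(x_bar) + xi,
    where xi = sum_j coeffs[j] * POLAR_GENERATORS[j] with coeffs[j] >= 0."""
    g_upper, g_lower = grad_F(x_bar), grad_f(x_bar)
    xi = [sum(c * g[k] for c, g in zip(coeffs, POLAR_GENERATORS)) for k in range(2)]
    return tuple(g_upper[k] + lam * g_lower[k] + xi[k] for k in range(2))


# Two hand-picked (hypothetical) multiplier choices; both give a zero residual.
print(residual_23(1.0, (0.0, 0.0)))
print(residual_23(2.0, (2.0, 0.0)))
```

Both choices solve (23), which indicates that the multiplier λ need not be unique in this example.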

Example 5.2 Suppose that there exist functionals $c^*, a_1^*, \ldots, a_m^* \in \mathcal X^\star$, an operator $A \in \mathbb L[\mathcal X, \mathcal Y]$ with closed range (where $\mathcal Y$ is another Banach space), some vector $b \in \mathcal Y$, and scalars $\beta_1, \ldots, \beta_m \in \mathbb R$ such that

\[
\Psi := \operatorname{Argmin}\{\langle x, c^* \rangle \mid A[x] = b,\ \langle x, a_i^* \rangle \le \beta_i \ \forall i \in \{1, \ldots, m\}\}
\]

holds and Ψ is nonempty. Then the KKT conditions are necessary and sufficient for optimality of (SCBPP) by Theorem 5.2 and Example 3.1.

Next we present a different approach to derive the necessity of the KKT conditions. To this end, we introduce a set-valued mapping $\Delta\colon \mathbb R \to 2^{\mathcal X}$ as stated below:

∀u ∈ R: ∆(u) := {x ∈ Θ | f(x) − α ≤ u}.

From the definition of ∆ and the choice of α, we clearly have ∆(0) = Ψ and ∆(u) = ∅ for all u < 0. We obtain the following theorem which is inspired by the results developed in [26] and is similar to Proposition 3.3. However, in the proof of the upcoming result, we avoid any nonsmooth calculus, which allows us to formulate less restrictive assumptions on the data. In particular, we do not need any reflexivity of the Banach space $\mathcal X$.

Theorem 5.3 Let $\bar x \in \mathcal X$ be a feasible point of (SCBPP) such that ∆ is calm at $(0, \bar x) \in \operatorname{gph} \Delta$. Then $\bar x$ is an optimal solution of (SCBPP) if and only if condition (23) is satisfied.

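Proof Again, the KKT conditions are sufficient for optimality due to the inherent convexity of (SCBPP). Hence, we suppose that $\bar x$ is a solution of (SCBPP). Since ∆ is calm at $(0, \bar x)$, there exist $\varepsilon > 0$, $\varrho > 0$, and $C > 0$ such that

\[
\Delta(u) \cap \mathbb U_{\mathcal X}^{\varrho}(\bar x) \subseteq \Psi + C|u|\, \mathbb B_{\mathcal X}
\]

is satisfied for all $u \in \mathbb U_{\mathbb R}^{\varepsilon}(0)$. Moreover, F is continuously Fréchet differentiable, i.e., it is locally Lipschitz continuous at $\bar x$. In particular, there exist constants $\sigma > 0$ and $L > 0$ such that the condition $|F(x) - F(y)| \le L\|x - y\|_{\mathcal X}$ holds for all $x, y \in \mathbb U_{\mathcal X}^{\sigma}(\bar x)$. We define $\gamma := \min\{\varepsilon;\ \varrho;\ \tfrac{\sigma}{2};\ \tfrac{\sigma}{2C}\}$ and choose $(u, x) \in \operatorname{gph} \Delta \cap \mathbb U_{\mathbb R \times \mathcal X}^{\gamma}(0, \bar x)$. Hence, there is some $x^* \in \Psi$ which satisfies $\|x - x^*\|_{\mathcal X} \le C|u| < \tfrac{\sigma}{2}$. Moreover, we have

\[
\|x^* - \bar x\|_{\mathcal X} \le \|x^* - x\|_{\mathcal X} + \|x - \bar x\|_{\mathcal X} < \tfrac{\sigma}{2} + \tfrac{\sigma}{2} = \sigma.
\]

Consequently, we obtain $x, x^* \in \mathbb U_{\mathcal X}^{\sigma}(\bar x)$. Observe that we have $F(x^*) \ge F(\bar x)$ from the feasibility of $x^*$ for (SCBPP). Taking all these observations together, we can derive

\[
F(\bar x) - F(x) \le F(x^*) - F(x) \le L\|x^* - x\|_{\mathcal X} \le CL|u|.
\]

Hence, we have

\[
\forall (u, x) \in \operatorname{gph} \Delta \cap \mathbb U_{\mathbb R \times \mathcal X}^{\gamma}(0, \bar x)\colon \quad F(x) - F(\bar x) + CL|u| \ge 0. \tag{24}
\]

Now we choose some $\tilde x \in \Theta \cap \mathbb U_{\mathcal X}^{\gamma}(\bar x)$. If $f(\tilde x) - \alpha < \gamma$ holds, then (24) yields $F(\tilde x) + CL(f(\tilde x) - \alpha) \ge F(\bar x)$. Otherwise, we have

\[
F(\bar x) \le F(\tilde x) + L\|\bar x - \tilde x\|_{\mathcal X} \le F(\tilde x) + L\gamma \le F(\tilde x) + L(f(\tilde x) - \alpha).
\]

Summing up the above results, $\bar x$ is a local and, hence, global optimal solution of the convex optimization problem

\[
\begin{aligned}
F(x) + \max\{L;\ CL\}(f(x) - \alpha) &\to \min \\
x &\in \Theta.
\end{aligned}
\]

Consequently, there is some $\xi \in \mathcal T_\Theta(\bar x)^\circ$ satisfying $0 = F'(\bar x) + \max\{L;\ CL\} f'(\bar x) + \xi$, which completes the proof. □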
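The penalization step at the end of the proof of Theorem 5.3 can be illustrated numerically on the data of Example 5.1. The following Python sketch is not part of the original argument: the penalty parameter `c` is a hand-picked stand-in for the constant max{L; CL}, and the minimization over Θ = [0, 1]² is replaced by a search on a uniform grid (both are assumptions made for illustration only).

```python
ALPHA = 1.0  # lower level optimal value of Example 5.1: min of (x1 - 2)^2 on [0, 1]^2


def F(x1, x2):
    return x1 ** 2 + x2 ** 2


def f(x1, x2):
    return (x1 - 2.0) ** 2


def penalized_argmin(c, n=101):
    """Grid minimizer of F(x) + c * (f(x) - ALPHA) over Theta = [0, 1]^2."""
    grid = [i / (n - 1) for i in range(n)]
    best = min((F(a, b) + c * (f(a, b) - ALPHA), a, b) for a in grid for b in grid)
    return best[1], best[2]


# A sufficiently large penalty parameter recovers x_bar = (1, 0) ...
print(penalized_argmin(2.0))
# ... while a too small one yields a point that is not lower level optimal.
print(penalized_argmin(0.5))
```

In this example, the one-dimensional optimality condition in $x_1$ suggests that any `c` of at least 1 works, which matches the multiplier λ = 1 solving (23) at $\bar x$.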

We just want to mention two criteria which ensure the calmness condition on the perturbation mapping ∆ appearing in Theorem 5.3. The first result is taken from [23, Theorem 3.3].

Remark 5.1 Let $\bar x \in \mathcal X$ be a feasible point of (SCBPP) where the condition

\[
-f'(\bar x) \notin \operatorname{bd} \mathcal T_\Theta(\bar x)^\circ \tag{25}
\]

is satisfied. Then the perturbation mapping ∆ is calm at $(0, \bar x)$. Since we have $\bar x \in \Psi$, $-f'(\bar x) \in \mathcal T_\Theta(\bar x)^\circ$ always holds, i.e. (25) is equivalent to $-f'(\bar x) \in \operatorname{int} \mathcal T_\Theta(\bar x)^\circ$. An example of (SCBPP) satisfying the condition (25) in the finite-dimensional case can be found in [15].

Recall that due to [11], we say that the lower level problem of (SCBPP) possesses a weak sharp minimum if there exists a constant $\gamma > 0$ such that

\[
\forall x \in \Theta\colon \quad f(x) - \alpha \ge \gamma \inf_{y \in \Psi} \|x - y\|_{\mathcal X}
\]

is satisfied. In [11, Section 3], one can find several examples of optimization problems which possess this property. We easily obtain the following result. Its proof mainly parallels the argumentation in the proof of [26, Proposition 3.8] and, thus, is omitted.

Remark 5.2 Suppose that the lower level problem of (SCBPP) possesses a weak sharp minimum. Then the perturbation mapping ∆ is calm at any point $(0, \bar x) \in \operatorname{gph} \Delta$.
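As a brief illustration (our own computation, not taken from the cited references), the lower level problem of Example 5.1 possesses a weak sharp minimum. There, Θ = [0, 1]², f(x) = (x₁ − 2)², α = 1, and Ψ = {1} × [0, 1], so the Euclidean distance of x ∈ Θ to Ψ equals 1 − x₁. Assuming the Euclidean norm on ℝ², one obtains the estimate below with γ = 2:

```latex
\[
  \forall x \in \Theta\colon \quad
  f(x) - \alpha = (x_1 - 2)^2 - 1 = (1 - x_1)(3 - x_1)
  \;\ge\; 2\,(1 - x_1) = 2 \inf_{y \in \Psi} \|x - y\|_2 .
\]
```

The inequality uses 3 − x₁ ≥ 2 and 1 − x₁ ≥ 0 on Θ. By Remark 5.2, ∆ is then calm at every point of its graph of the form (0, x̄) in this example.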

5.2 KKT approach

Here, we assume that $\mathcal X$ is a reflexive Banach space and that F is continuously Fréchet differentiable, while f is twice continuously Fréchet differentiable. Furthermore, we suppose that there exist a reflexive Banach space $\mathcal Z$, a nonempty, closed, convex cone $K \subseteq \mathcal Z$, and a K-convex, twice continuously Fréchet differentiable mapping $g\colon \mathcal X \to \mathcal Z$ such that $\Theta = \{x \in \mathcal X \mid g(x) \in K\}$ holds. In this case, we have

\[
\forall x \in \Theta\colon \quad \Lambda(x) = \{\lambda \in \mathcal T_K(g(x))^\circ \mid 0 = f'(x) + g'(x)^\star[\lambda]\} = \{\lambda \in K^\circ \mid 0 = f'(x) + g'(x)^\star[\lambda],\ 0 = \langle g(x), \lambda \rangle\}.
\]

Let GCQ be satisfied at any feasible point of Θ. Then due to the inherent convexity of the lower level problem of (SCBPP) and Lemma 3.1, for any $x \in \mathcal X$, we have $x \in \Psi$ if and only if $g(x) \in K$ and $\Lambda(x) \ne \emptyset$ hold. We need the following result, which is easily deduced from the saddle point property of the lower level Lagrange function.

Proposition 5.1 Let x1, x2 ∈ Ψ be points where GCQ is satisfied for the lower level problem. Then the relation Λ(x1) = Λ(x2) is valid.

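Proof We introduce the lower level Lagrange function $L\colon \mathcal X \times \mathcal Z^\star \to \mathbb R$ by means of

\[
\forall x \in \mathcal X \ \forall \lambda \in \mathcal Z^\star\colon \quad L(x, \lambda) := f(x) + \langle g(x), \lambda \rangle.
\]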

Due to the validity of GCQ, the sets Λ(x1) and Λ(x2) are nonempty, see Lemma 3.1. Hence, we can choose λ1 ∈ Λ(x1) and λ2 ∈ Λ(x2). According to [9, Proposition 2.157] which is applicable due to the postulated convexity assumptions, (x1, λ1) and (x2, λ2) are saddle points of L. In particular, we obtain the inequalities

L(x1, λ2) ≤ L(x1, λ1) ≤ L(x2, λ1),

L(x2, λ1) ≤ L(x2, λ2) ≤ L(x1, λ2).

We deduce that L(x1, λ2) = L(x1, λ1) and L(x2, λ1) = L(x2, λ2) hold true, i.e., (x1, λ2) and (x2, λ1) are saddle points of the Lagrange function L. Once more we apply [9, Proposition 2.157] to obtain λ2 ∈ Λ(x1) and λ1 ∈ Λ(x2). This yields the above assertion. □

Suppose that GCQ holds at all feasible points of the lower level problem, while some $x^* \in \Psi$ and $\lambda^* \in \Lambda(x^*)$ are known a priori. Then by means of Proposition 5.1 we can replace the condition $x \in \Psi$ equivalently by $g(x) \in K$, $0 = f'(x) + g'(x)^\star[\lambda^*]$, and $0 = \langle g(x), \lambda^* \rangle$. Hence, without mentioning it again, we will assume throughout the section that GCQ holds at any lower level feasible point and that there is some $x^* \in \Psi$, and we will use the notation $\Lambda := \Lambda(x^*)$. Consequently, (SCBPP) is equivalent to the optimization problem

\[
\begin{aligned}
F(x) &\to \min \\
f'(x) + g'(x)^\star[\lambda^*] &= 0 \\
g(x) &\in K \\
\langle g(x), \lambda^* \rangle &= 0
\end{aligned}
\tag{26}
\]

for any choice of λ∗ ∈ Λ. The following theorem is a consequence of Lemma 3.1.

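Theorem 5.4 Let $\bar x \in \mathcal X$ be a feasible point of (SCBPP) and assume that there exists some $\lambda^* \in \Lambda$ such that GCQ holds for (26) at $\bar x$. Then $\bar x$ is an optimal solution of (SCBPP) if and only if there exist multipliers $\zeta \in \mathcal X$, $\eta \in \mathcal Z^\star$, and $\kappa \in \mathbb R$ which solve the following system:

\[
\begin{aligned}
0 &= F'(\bar x) + f^{(2)}(\bar x)[\zeta] + \langle g^{(2)}(\bar x)[\zeta], \lambda^* \rangle + g'(\bar x)^\star[\eta + \kappa \lambda^*], \\
\eta &\in K^\circ \cap g(\bar x)^\perp.
\end{aligned}
\tag{27}
\]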

Proof The necessity of the presented optimality condition is a consequence of Lemma 3.1. Since GCQ holds for (26) at $\bar x$, the relation $\mathcal S_\Psi(\bar x) = \mathcal T_\Psi(\bar x)^\circ$ is satisfied. Hence, if the system (27) possesses a solution, then $-F'(\bar x) \in \mathcal S_\Psi(\bar x) = \mathcal T_\Psi(\bar x)^\circ$ holds, which yields $F'(\bar x)[d] \ge 0$ for all $d \in \mathcal T_\Psi(\bar x)$. Due to the convexity of F and Ψ, $\bar x$ must be an optimal solution of (SCBPP). □

Note that the optimality system (27) is generally not related to the condition (23). While (27) involves only second order information on f, (23) contains only first order information on the lower level objective function. Observe that we generally cannot expect KRZCQ to hold at a certain point of (26) for some $\lambda^* \in \Lambda$ due to the structure of the appearing constraints. However, weaker constraint qualifications such as ACQ or GCQ may hold for this problem. Following Example 3.1, the reformulation (26) satisfies ACQ everywhere whenever the lower level problem is a quadratic optimization problem with linear constraints and certain closed-range assumptions hold for the appearing linear operators. The upcoming example is taken from semidefinite programming.

Example 5.3 We consider the SCBPP

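\[
\begin{aligned}
&\tfrac12 (x_1 - 2)^2 + \tfrac12 x_2^2 + \tfrac12 (x_3 - 2)^2 \to \min \\
&x \in \Psi := \operatorname{Argmin}\left\{ y_1 + y_2 + y_3 \,\middle|\, \begin{pmatrix} y_1 & y_2 & 0 \\ y_2 & y_1 & 0 \\ 0 & 0 & y_3 \end{pmatrix} \in \mathbb S_3^+ \right\}.
\end{aligned}
\]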

It is easy to see $\Theta = \{y \in \mathbb R^3 \mid y_1 \ge 0,\ y_3 \ge 0,\ y_1 + y_2 \ge 0,\ y_1 - y_2 \ge 0\}$ and, thus, $\Psi = \operatorname{cone}(\{(1, -1, 0)\})$ in this example. Consequently, its optimal solution is given by $\bar x := (1, -1, 0)$. Since the lower level feasible set Θ possesses a nonempty interior, Slater's constraint qualification holds for the lower level problem, and, thus, KRZCQ is valid at any point of Θ, see [9, Propositions 2.104, 2.106]. The lower level KKT conditions at $\bar x$ equal

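\[
0 = 1 + \lambda_{1,1} + \lambda_{2,2}, \quad 0 = 1 + 2\lambda_{1,2}, \quad 0 = 1 + \lambda_{3,3}, \quad 0 = \lambda_{1,1} + \lambda_{2,2} - 2\lambda_{1,2}, \quad \lambda \in \mathbb S_3^-
\]

and can be satisfied by any matrix from

\[
\Lambda = \left\{ \begin{pmatrix} -\tfrac12 & -\tfrac12 & \gamma \\ -\tfrac12 & -\tfrac12 & \gamma \\ \gamma & \gamma & -1 \end{pmatrix} \in \mathbb S_3 \,\middle|\, -\tfrac{\sqrt 2}{2} \le \gamma \le \tfrac{\sqrt 2}{2} \right\}.
\]

We choose $\lambda^* \in \Lambda$ to be the matrix where γ = 0 holds. Then the surrogate problem (26) possesses the form

\[
\begin{aligned}
&\tfrac12 (x_1 - 2)^2 + \tfrac12 x_2^2 + \tfrac12 (x_3 - 2)^2 \to \min \\
&\begin{pmatrix} x_1 & x_2 & 0 \\ x_2 & x_1 & 0 \\ 0 & 0 & x_3 \end{pmatrix} \in \mathbb S_3^+ \\
&x_1 + x_2 + x_3 = 0.
\end{aligned}
\tag{28}
\]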
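The admissible range of γ in the multiplier set Λ of Example 5.3 can be probed numerically. The following Python sketch is a sanity check under stated assumptions rather than a proof: it evaluates the quadratic form of the candidate matrix on a small integer grid of test vectors (the grid and the helper names are ours) and reports the largest value found, which must not be positive for a negative semidefinite matrix.

```python
import math


def multiplier(gamma):
    """Candidate lower level multiplier from Example 5.3 for a given gamma."""
    return [[-0.5, -0.5, gamma],
            [-0.5, -0.5, gamma],
            [gamma, gamma, -1.0]]


def quad_form(M, v):
    """Value of v^T M v for a 3x3 matrix M and a vector v."""
    return sum(v[i] * M[i][j] * v[j] for i in range(3) for j in range(3))


def max_quad_form_on_grid(M, radius=3):
    """Largest value of v^T M v over integer vectors with entries in [-radius, radius]."""
    rng = range(-radius, radius + 1)
    return max(quad_form(M, (a, b, c)) for a in rng for b in rng for c in rng)


for gamma in (0.0, math.sqrt(2) / 2, 0.8):
    print(gamma, max_quad_form_on_grid(multiplier(gamma)))
```

For γ = 0.8, the test vector (1, 1, 2) already produces a positive value, so this matrix is not negative semidefinite, in line with the bound |γ| ≤ √2/2.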

Its feasible set equals Ψ, and we have $\mathcal T_\Psi(\bar x) = \operatorname{lin}(\{(1, -1, 0)\})$. Now we need to compute $\mathcal L_\Psi(\bar x)$ w.r.t. (28). Using [9, Section 5.3.1], the formula

\[
\mathcal T_{\mathbb S_3^+}(g(\bar x)) = \left\{ P \in \mathbb S_3 \,\middle|\, \begin{pmatrix} p_{1,1} + 2p_{1,2} + p_{2,2} & \sqrt 2\,(p_{1,3} + p_{2,3}) \\ \sqrt 2\,(p_{1,3} + p_{2,3}) & 2p_{3,3} \end{pmatrix} \in \mathbb S_2^+ \right\}
\]

is obtained. Combining this observation with some straightforward calculations shows

\[
\mathcal L_\Psi(\bar x) = \{d \in \mathbb R^3 \mid d_1 + d_2 \ge 0,\ d_3 \ge 0,\ d_1 + d_2 + d_3 = 0\} = \operatorname{lin}(\{(1, -1, 0)\}) = \mathcal T_\Psi(\bar x),
\]

i.e. ACQ is valid at $\bar x$ for (28). One can check that

\[
\mathbb S_3^- \cap g(\bar x)^\perp = \mathcal T_{\mathbb S_3^+}(g(\bar x))^\circ = \left\{ \begin{pmatrix} a & a & b \\ a & a & b \\ b & b & c \end{pmatrix} \in \mathbb S_3 \,\middle|\, a \le 0,\ c \le 0,\ ac - b^2 \ge 0 \right\}
\]

is satisfied. Thus, the optimality conditions (27) reduce to the existence of constants $a, b, c, \kappa \in \mathbb R$ such that

\[
0 = -1 + 2a - \kappa, \quad 0 = -2 + c - \kappa, \quad a \le 0, \quad c \le 0, \quad ac - b^2 \ge 0.
\]

Choosing a = −1, b = 0, c = −1, and κ = −3 provides a solution of this system.

Note that due to the polyhedrality of Θ and the linearity of the lower level objective function, the perturbation mapping ∆ from Section 5.1 is calm everywhere, see [44, Proposition 1]. Thus, we know from Theorem 5.3 that the KKT conditions (23) provide a necessary and sufficient optimality condition at $\bar x$ as well. Noting that the second order derivative of the lower level objective function vanishes, the KKT system (23) is essentially different from the optimality system derived via Theorem 5.4.
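The multipliers stated in Example 5.3 can be verified by direct substitution into (27). The following Python sketch does this with plain lists; the helper names are ours, and the adjoint of the linear map g is written out by hand with respect to the trace inner product (an assumption about the chosen pairing on the space of symmetric matrices). Since f and g are linear here, the second order terms in (27) vanish.

```python
X_BAR = (1.0, -1.0, 0.0)

# lambda^* from Example 5.3 (the matrix in Lambda with gamma = 0).
LAMBDA_STAR = [[-0.5, -0.5, 0.0],
               [-0.5, -0.5, 0.0],
               [0.0, 0.0, -1.0]]


def grad_F(x):
    # F(x) = 1/2 (x1 - 2)^2 + 1/2 x2^2 + 1/2 (x3 - 2)^2
    return (x[0] - 2.0, x[1], x[2] - 2.0)


def g_adjoint(M):
    """Adjoint of y -> [[y1, y2, 0], [y2, y1, 0], [0, 0, y3]] w.r.t. the trace inner product."""
    return (M[0][0] + M[1][1], M[0][1] + M[1][0], M[2][2])


def eta(a, b, c):
    """Element of the cone S_3^- intersected with g(x_bar)^perp, provided a, c <= 0 and ac - b^2 >= 0."""
    return [[a, a, b], [a, a, b], [b, b, c]]


def residual_27(a, b, c, kappa):
    """Residual of 0 = F'(x_bar) + g'(x_bar)^*[eta + kappa * lambda^*]."""
    E = eta(a, b, c)
    M = [[E[i][j] + kappa * LAMBDA_STAR[i][j] for j in range(3)] for i in range(3)]
    adj = g_adjoint(M)
    grad = grad_F(X_BAR)
    return tuple(grad[k] + adj[k] for k in range(3))


print(residual_27(-1.0, 0.0, -1.0, -3.0))
# Lower level stationarity: f'(x) = (1, 1, 1) plus the adjoint applied to lambda^*.
print(tuple(1.0 + t for t in g_adjoint(LAMBDA_STAR)))
```

Both printed residuals vanish, and the sign conditions a ≤ 0, c ≤ 0, ac − b² ≥ 0 hold for the chosen values.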

Remark 5.3 It may happen that for some feasible point $\bar x \in \mathcal X$, GCQ for (26) holds for some but not all multipliers in Λ.

Example 5.4 We consider the SCBPP

\[
\begin{aligned}
&-x_2 \to \min \\
&x \in \Psi := \operatorname{Argmin}\{y_3 \mid y_1^2 - y_3 \le 0,\ -y_3 \le 0,\ y_1^2 + y_2^2 - 1 \le 0\}.
\end{aligned}
\]

Obviously, the lower level problem satisfies Slater's constraint qualification, so GCQ holds at any lower level feasible point. We easily obtain $\Psi = \operatorname{conv}(\{(0, -1, 0), (0, 1, 0)\})$ and $\Lambda = \operatorname{conv}(\{(1, 0, 0), (0, 1, 0)\})$. That means that $\bar x = (0, 1, 0)$ is the unique global optimal solution of this SCBPP. Observe that the relation $\mathcal T_\Psi(\bar x) = \operatorname{cone}(\{(0, -1, 0)\})$ holds true. For $\lambda^* = (1, 0, 0)$, the surrogate problem (26) takes the form

\[
\begin{aligned}
-x_2 &\to \min \\
2x_1 &= 0 \\
-x_3 &\le 0 \\
x_1^2 + x_2^2 - 1 &\le 0 \\
x_1^2 - x_3 &= 0.
\end{aligned}
\]

Clearly, its feasible set equals Ψ. The corresponding linearized tangent cone takes the form

\[
\mathcal L_{\Psi,\lambda^*}(\bar x) = \{(d_1, d_2, d_3) \in \mathbb R^3 \mid 2d_1 = 0,\ 2d_2 \le 0,\ -d_3 = 0\} = \operatorname{cone}(\{(0, -1, 0)\}),
\]

i.e., it equals $\mathcal T_\Psi(\bar x)$. Since we are in finite dimensions, the corresponding cone $\mathcal S_{\Psi,\lambda^*}(\bar x)$ is finitely generated and, hence, closed. This means that for $\lambda^* = (1, 0, 0)$, ACQ holds at $\bar x$ and, consequently, GCQ is satisfied for this problem as well. Now we consider $\tilde\lambda = (0, 1, 0)$. The corresponding surrogate problem (26) is given by

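\[
\begin{aligned}
-x_2 &\to \min \\
x_1^2 - x_3 &\le 0 \\
x_1^2 + x_2^2 - 1 &\le 0 \\
-x_3 &= 0.
\end{aligned}
\]

The corresponding linearized tangent cone is presented below:

\[
\mathcal L_{\Psi,\tilde\lambda}(\bar x) = \{(d_1, d_2, d_3) \in \mathbb R^3 \mid 2d_2 \le 0,\ -d_3 = 0\} = \mathbb R \times (-\mathbb R_0^+) \times \{0\}.
\]

Due to the finite-dimensional setting, the cone $\mathcal S_{\Psi,\tilde\lambda}(\bar x)$ equals the polar cone of $\mathcal L_{\Psi,\tilde\lambda}(\bar x)$, i.e., we obtain

\[
\mathcal S_{\Psi,\tilde\lambda}(\bar x) = \{0\} \times \mathbb R_0^+ \times \mathbb R.
\]

On the other hand, we have

\[
\mathcal T_\Psi(\bar x)^\circ = \mathbb R \times \mathbb R_0^+ \times \mathbb R.
\]

Hence, GCQ fails to hold at $\bar x$ for the choice $\tilde\lambda = (0, 1, 0)$.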
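For the reader's convenience, the linearization used for the multiplier (0, 1, 0) in Example 5.4 can be spelled out. The following derivation is our own elaboration; it assumes the usual convention that only constraints active at x̄ = (0, 1, 0) enter the linearized tangent cone, and the labels h₁, h₂, h₃ are hypothetical names for the three constraint functions of the surrogate problem.

```latex
\[
\begin{aligned}
  h_1(x) &= x_1^2 - x_3 \le 0, & h_1(\bar x) &= 0, & \nabla h_1(\bar x) &= (0, 0, -1)
    &&\Longrightarrow\ -d_3 \le 0, \\
  h_2(x) &= x_1^2 + x_2^2 - 1 \le 0, & h_2(\bar x) &= 0, & \nabla h_2(\bar x) &= (0, 2, 0)
    &&\Longrightarrow\ 2 d_2 \le 0, \\
  h_3(x) &= -x_3 = 0, & h_3(\bar x) &= 0, & \nabla h_3(\bar x) &= (0, 0, -1)
    &&\Longrightarrow\ -d_3 = 0 .
\end{aligned}
\]
```

The first condition is implied by the third one, so the linearization places no restriction on d₁. In particular, the vector (1, 0, 0) belongs to the polar of the tangent cone to Ψ at x̄ but not to {0} × ℝ₀⁺ × ℝ, which is one way to see the failure of GCQ for this multiplier.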

If no lower level multiplier from Λ is known a priori, one might consider the surrogate problem

\[
\begin{aligned}
F(x) &\to \min_{x, \lambda} \\
f'(x) + g'(x)^\star[\lambda] &= 0 \\
g(x) &\in K \\
\lambda &\in K^\circ \\
\langle g(x), \lambda \rangle &= 0
\end{aligned}
\tag{29}
\]

which is an MPCC. Due to the appearing complementarity constraint, we have to clarify the relationship between (SCBPP) and (29). In the case of standard bilevel programming, [14] shows that this consideration might be delicate. Applying Proposition 5.1 once more, the feasible set of (29) equals Ψ × Λ, which is a convex set, so the discussion may lead to stronger results than for standard bilevel programming problems. The next theorem characterizes the situation for (SCBPP).

Theorem 5.5 The following statements hold.
1. If $\bar x \in \mathcal X$ is an optimal solution of (SCBPP), then for any $\lambda \in \Lambda$ the point $(\bar x, \lambda)$ is a global optimal solution of (29).
2. If $(\bar x, \bar\lambda) \in \mathcal X \times \mathcal Z^\star$ is a global optimal solution of (29), then $\bar x$ is an optimal solution of (SCBPP).
3. The feasible set of (29) is convex.
Combining these three statements with Proposition 5.1, problem (29) possesses no local solutions different from its global solutions.

Proof The proof of the first two statements parallels the procedure used to verify [14, Theorem 2.1, Theorem 2.3] for finite-dimensional bilevel programming problems. For the proof of the third statement, we fix two arbitrary feasible points $(x_1, \lambda_1), (x_2, \lambda_2) \in \mathcal X \times \mathcal Z^\star$ of (29). Then we have $x_1, x_2 \in \Psi$ and $\lambda_1, \lambda_2 \in \Lambda$. Choose an arbitrary scalar $\tau \in (0, 1)$. Then $\tau x_1 + (1 - \tau) x_2$ is an element of Ψ since this set is convex. Moreover, GCQ holds at $\tau x_1 + (1 - \tau) x_2$, so the KKT conditions of the lower level are satisfied, and by means of Proposition 5.1 the corresponding set of Lagrange multipliers is Λ, which is convex as well. That is why $\tau \lambda_1 + (1 - \tau) \lambda_2 \in \Lambda$ is satisfied. This yields the feasibility of $\tau (x_1, \lambda_1) + (1 - \tau)(x_2, \lambda_2)$ for (29). □

Now we state optimality conditions for (SCBPP) arising from (29).

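Theorem 5.6 Let $\bar x \in \mathcal X$ be a feasible point of (SCBPP) and assume that there is some $\bar\lambda \in \Lambda$ such that GCQ holds for (29) at $(\bar x, \bar\lambda)$. Then $\bar x$ is an optimal solution of (SCBPP) if and only if there exist multipliers $\zeta \in \mathcal X$ and $\mu \in \mathcal Z^\star$ which solve the following system:

\[
\begin{aligned}
0 &= F'(\bar x) + f^{(2)}(\bar x)[\zeta] + \langle g^{(2)}(\bar x)[\zeta], \bar\lambda \rangle + g'(\bar x)^\star[\mu], && \text{(30a)} \\
\mu &\in \mathcal R_{K^\circ}(\bar\lambda) \cap g(\bar x)^\perp, && \text{(30b)} \\
-g'(\bar x)[\zeta] &\in \mathcal R_K(g(\bar x)) \cap \bar\lambda^\perp. && \text{(30c)}
\end{aligned}
\]

Proof Let $\bar x$ be an optimal solution of (SCBPP). Then, by means of Theorem 5.5, $(\bar x, \bar\lambda)$ is a global optimal solution of (29). Since GCQ for (29) holds at this point, the KKT conditions must be satisfied by Lemma 3.1. From Section 4 it is clear that there exist multipliers $\zeta \in \mathcal X$, $\mu \in \mathcal Z^\star$, and $\nu \in \mathcal Z$ which solve

\[
\begin{aligned}
0 &= F'(\bar x) + f^{(2)}(\bar x)[\zeta] + \langle g^{(2)}(\bar x)[\zeta], \bar\lambda \rangle + g'(\bar x)^\star[\mu], \\
0 &= g'(\bar x)[\zeta] + \nu, \\
\mu &\in \mathcal R_{K^\circ}(\bar\lambda) \cap g(\bar x)^\perp, \\
\nu &\in \mathcal R_K(g(\bar x)) \cap \bar\lambda^\perp.
\end{aligned}
\tag{31}
\]

Now ν is eliminated from the above system to show that (30) possesses a solution.

Conversely, suppose that for $(\bar x, \bar\lambda)$, there are multipliers $\zeta \in \mathcal X$ and $\mu \in \mathcal Z^\star$ which solve (30). Then we can introduce $\nu := -g'(\bar x)[\zeta]$ to see that $(\bar x, \bar\lambda)$ is a KKT point of (29). From [51, Lemma 5.1, remarks after Lemma 5.2] we obtain that $(-F'(\bar x), 0) \in \mathcal S_{\Psi \times \Lambda}(\bar x, \bar\lambda)$ is satisfied. We exploit the validity of GCQ to obtain $F'(\bar x)[d] \ge 0$ for all $(d, \xi) \in \mathcal T_{\Psi \times \Lambda}(\bar x, \bar\lambda)$, which leads to $F'(\bar x)[d] \ge 0$ for all $d \in \mathcal T_\Psi(\bar x)$. Hence, $\bar x$ is an optimal solution of (SCBPP) and the proof is completed. □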

Remark 5.4 Note that by definition of the radial cone, condition (30b) in Theorem 5.6 is equivalent to

\[
\mu \in (K^\circ + \operatorname{lin}(\bar\lambda)) \cap g(\bar x)^\perp = K^\circ \cap g(\bar x)^\perp + \operatorname{lin}(\bar\lambda).
\]

Hence, if the system (30) possesses a solution for some $\bar\lambda \in \Lambda$, then (27) possesses a solution for $\lambda^* := \bar\lambda$, i.e., the optimality conditions from Theorem 5.6 are stronger than those presented in Theorem 5.4.

Remark 5.5 In the case of a finite-dimensional standard (SCBPP) (i.e. $\mathcal X = \mathbb R^n$, $\mathcal Z = \mathbb R^k$, $K = -\mathbb R_0^{k,+}$), the conditions (30) hold at an optimal solution $\bar x \in \mathcal X$ if there exists $\bar\lambda \in \Lambda$ such that the matrix

\[
\begin{bmatrix}
f^{(2)}(\bar x) + \sum_{i=1}^k \bar\lambda_i g_i^{(2)}(\bar x) & g'(\bar x)^\star \\
g'(\bar x) & O \\
O & I
\end{bmatrix} \in \mathbb R^{(n+2k) \times (n+k)}
\]

possesses full row rank n + 2k, see Corollary 4.1. Unfortunately, this condition never holds in the presence of lower level constraints. By means of [19, Theorem 4.6], we can weaken this assumption such that only

\[
\begin{bmatrix}
f^{(2)}(\bar x) + \sum_{i=1}^k \bar\lambda_i g_i^{(2)}(\bar x) & g'(\bar x)^\star \\
g'(\bar x)_{I_{00}(\bar x, \bar\lambda) \cup I_{0+}(\bar x, \bar\lambda)} & O \\
O & I_{I_{-0}(\bar x, \bar\lambda) \cup I_{00}(\bar x, \bar\lambda)}
\end{bmatrix} \in \mathbb R^{(n+k+|I_{00}(\bar x, \bar\lambda)|) \times (n+k)}
\]

possesses full row rank, which might be satisfied in the case of strict complementarity, i.e., $I_{00}(\bar x, \bar\lambda) = \emptyset$. Note that we used

\[
\begin{aligned}
I_{-0}(\bar x, \bar\lambda) &:= \{i \in \{1, \ldots, k\} \mid g_i(\bar x) < 0 \wedge \bar\lambda_i = 0\}, \\
I_{0+}(\bar x, \bar\lambda) &:= \{i \in \{1, \ldots, k\} \mid g_i(\bar x) = 0 \wedge \bar\lambda_i > 0\}, \\
I_{00}(\bar x, \bar\lambda) &:= \{i \in \{1, \ldots, k\} \mid g_i(\bar x) = 0 \wedge \bar\lambda_i = 0\}.
\end{aligned}
\tag{32}
\]

Either way, the above arguments show that the surjectivity condition imposed to imply S-stationarity or even the KKT conditions at optimal solutions of the surrogate problem (29) is too strong to hold in most of the cases. Hence, it seems necessary to consider weaker stationarity notions. Obviously, such conditions are weaker than the KKT conditions, so we cannot expect to obtain sufficient optimality criteria. Here, M-stationarity may pay off. Applying Proposition 4.1 yields the following M-stationarity-type result.

Theorem 5.7 Let $\bar x \in \mathcal X$ be a local optimal solution of (SCBPP). Suppose that $\operatorname{gph} \mathcal T_K(\cdot)^\circ$ is SNC at $(A(\bar x), B(\bar x))$. Moreover, suppose that there is some $\bar\lambda \in \Lambda$ such that for any $x^* \in \mathcal X^\star$, the system

\[
x^* = f^{(2)}(\bar x)[\zeta] + \langle g^{(2)}(\bar x)[\zeta], \bar\lambda \rangle + g'(\bar x)^\star[\mu]
\]

possesses a solution $(\zeta, \mu) \in \mathcal X \times \mathcal Z^\star$, whereas the following constraint qualification holds:

\[
\left.
\begin{array}{l}
0 = f^{(2)}(\bar x)[\zeta] + \langle g^{(2)}(\bar x)[\zeta], \bar\lambda \rangle + g'(\bar x)^\star[\mu], \\
(\mu, -g'(\bar x)[\zeta]) \in \mathcal N_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(g(\bar x), \bar\lambda)
\end{array}
\right\} \Longrightarrow \mu = 0,\ g'(\bar x)[\zeta] = 0.
\]

Then there exist multipliers $\zeta \in \mathcal X$ and $\mu \in \mathcal Z^\star$ which solve the following system:

\[
\begin{aligned}
0 &= F'(\bar x) + f^{(2)}(\bar x)[\zeta] + \langle g^{(2)}(\bar x)[\zeta], \bar\lambda \rangle + g'(\bar x)^\star[\mu], \\
(\mu, -g'(\bar x)[\zeta]) &\in \mathcal N_{\operatorname{gph} \mathcal T_K(\cdot)^\circ}(g(\bar x), \bar\lambda).
\end{aligned}
\tag{33}
\]

We could easily specify the above result for the case where K is polyhedral, see Corollary 4.2. Here, M-stationarity was shown to be weaker than S- but stronger than W-stationarity. However, we just present this result for the case of a standard finite-dimensional SCBPP as described in Remark 5.5 since it is the most interesting one in view of possible applications, and we leave it to the interested reader to state the more general version.
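Note that we make use of Remark 3.1 in order to weaken the corresponding constraint qualification even more.

Corollary 5.1 Let $\bar x \in \mathcal X$ be a local optimal solution of the standard finite-dimensional (SCBPP) (i.e. $\mathcal X = \mathbb R^n$, $\mathcal Z = \mathbb R^k$, $K = -\mathbb R_0^{k,+}$). Suppose that there is some $\bar\lambda \in \Lambda$ such that the constraint qualification

\[
\left.
\begin{array}{l}
0 = f^{(2)}(\bar x)[\zeta] + \sum_{i=1}^k \bar\lambda_i g_i^{(2)}(\bar x)[\zeta] + \sum_{i=1}^k \mu_i g_i'(\bar x)^\star, \\
\forall i \in I_{-0}(\bar x, \bar\lambda)\colon\ \mu_i = 0, \\
\forall i \in I_{0+}(\bar x, \bar\lambda)\colon\ g_i'(\bar x)[\zeta] = 0, \\
\forall i \in I_{00}(\bar x, \bar\lambda)\colon\ \mu_i g_i'(\bar x)[\zeta] = 0 \vee (\mu_i > 0 \wedge g_i'(\bar x)[\zeta] > 0)
\end{array}
\right\} \Longrightarrow \mu = 0,\ \zeta = 0
\]

holds where the appearing index sets are defined in (32). Then there exist multipliers $\zeta \in \mathcal X$ and $\mu \in \mathbb R^k$ which solve the following system:

\[
\begin{aligned}
&0 = F'(\bar x) + f^{(2)}(\bar x)[\zeta] + \sum_{i=1}^k \bar\lambda_i g_i^{(2)}(\bar x)[\zeta] + \sum_{i=1}^k \mu_i g_i'(\bar x)^\star, \\
&\forall i \in I_{-0}(\bar x, \bar\lambda)\colon\ \mu_i = 0, \\
&\forall i \in I_{0+}(\bar x, \bar\lambda)\colon\ g_i'(\bar x)[\zeta] = 0, \\
&\forall i \in I_{00}(\bar x, \bar\lambda)\colon\ \mu_i g_i'(\bar x)[\zeta] = 0 \vee (\mu_i > 0 \wedge g_i'(\bar x)[\zeta] > 0).
\end{aligned}
\]

This system equals (33) in this special situation.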
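The index sets in (32) are straightforward to compute once the constraint values and a lower level multiplier are available. The following Python sketch is a minimal illustration under stated assumptions: the function name, the tolerance `tol`, and the error handling are ours, and the constraint functions of Example 5.5 are used as test data.

```python
def index_sets(g_values, multipliers, tol=1e-12):
    """Split {1, ..., k} into (I_{-0}, I_{0+}, I_{00}) as in (32).

    g_values[i-1] is g_i(x_bar) <= 0 and multipliers[i-1] is lambda_i >= 0.
    """
    i_minus0, i_0plus, i_00 = [], [], []
    for i, (g_i, lam_i) in enumerate(zip(g_values, multipliers), start=1):
        if g_i > tol or lam_i < -tol:
            raise ValueError(f"index {i}: g_i <= 0 and lambda_i >= 0 are required")
        active = g_i >= -tol      # constraint i is active at x_bar
        positive = lam_i > tol    # multiplier i is strictly positive
        if active and positive:
            i_0plus.append(i)
        elif active:
            i_00.append(i)
        elif not positive:
            i_minus0.append(i)
        else:
            raise ValueError(f"index {i}: complementarity g_i * lambda_i = 0 is violated")
    return i_minus0, i_0plus, i_00


def g_example_55(x1, x2):
    """Lower level constraint functions of Example 5.5."""
    return (-x1, x1 + x2 - 1.0, (x1 - 1.0) ** 2 + (x2 - 0.5) ** 2 - 1.25)


print(index_sets(g_example_55(0.0, 1.0), (1.0, 0.0, 0.0)))
```

At the point (0, 1) with multiplier (1, 0, 0), the second and third constraints are biactive, so strict complementarity fails there.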

Finally, it is possible to derive these optimality conditions of M-stationarity-type via the calmness of a certain perturbation mapping of type (9). Let us define $\mathcal P\colon \mathcal X^\star \times \mathcal Z \times \mathcal Z^\star \to 2^{\mathcal X \times \mathcal Z^\star}$ by

\[
\mathcal P(x^*, z, z^*) := \{(x, \lambda) \in \mathcal X \times \mathcal Z^\star \mid f'(x) + g'(x)^\star[\lambda] + x^* = 0,\ (g(x) + z, \lambda + z^*) \in \operatorname{gph} \mathcal T_K(\cdot)^\circ\}
\]

for any $(x^*, z, z^*) \in \mathcal X^\star \times \mathcal Z \times \mathcal Z^\star$. Then we obtain the following result from Proposition 3.3.

Theorem 5.8 Let $\bar x \in \mathcal X$ be a local optimal solution of (SCBPP) and suppose that there is some $\bar\lambda \in \Lambda$ such that $\mathcal P$ is calm at $(0, 0, 0, \bar x, \bar\lambda)$. Then there exist multipliers $\zeta \in \mathcal X$ and $\mu \in \mathcal Z^\star$ which satisfy the conditions in (33).

Analogous results for the finite-dimensional case can be found in [42, 57, 58, 60]. Note that the constraint qualifications postulated in Theorem 5.7 and Corollary 5.1 imply the calmness condition appearing in Theorem 5.8, see Lemma 3.2 and Remark 3.1, respectively. Due to Remark 3.4, the calmness condition is much weaker than these constraint qualifications. One may check [2] for further constraint qualifications implying the calmness of $\mathcal P$ in the finite-dimensional setting.

If we consider, for example, a finite-dimensional (SCBPP) where the lower level problem is a quadratic convex optimization problem with polyhedral feasible set, then the corresponding perturbation mapping $\mathcal P$ is polyhedral (that means its graph is the union of finitely many polyhedral sets). Due to [44, Proposition 1], $\mathcal P$ is calm at any point of its graph. Moreover, GCQ holds at any lower level feasible point of such a problem, see Example 3.1. Consequently, the M-stationarity-type optimality conditions are satisfied at each optimal solution of this problem.

Our final example shows that there exist quite general problems where the corresponding perturbation mapping $\mathcal P$ is calm. Observe that the calmness property holds at the point of interest although the presented problem possesses quadratic constraints in the lower level. Since many semidefinite problems reflect quadratic constraints, this shows that the calmness approach is promising for semidefinite lower level problems as well.

Example 5.5 We consider the (SCBPP)

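\[
\begin{aligned}
&-x_2 \to \min \\
&x \in \Psi := \operatorname{Argmin}\{y_1 \mid -y_1 \le 0,\ y_1 + y_2 - 1 \le 0,\ (y_1 - 1)^2 + (y_2 - \tfrac12)^2 - \tfrac54 \le 0\}.
\end{aligned}
\]

Taking a closer look at the KKT reformulation

\[
\begin{aligned}
&-x_2 \to \min_{x, \lambda} \\
&1 - \lambda_1 + \lambda_2 + (2x_1 - 2)\lambda_3 = 0 \\
&\lambda_2 + (2x_2 - 1)\lambda_3 = 0 \\
&-x_1 \le 0, \quad \lambda_1 \ge 0, \quad \lambda_1 x_1 = 0 \\
&x_1 + x_2 - 1 \le 0, \quad \lambda_2 \ge 0, \quad \lambda_2 (x_1 + x_2 - 1) = 0 \\
&(x_1 - 1)^2 + (x_2 - \tfrac12)^2 - \tfrac54 \le 0, \quad \lambda_3 \ge 0, \quad \lambda_3 \left( (x_1 - 1)^2 + (x_2 - \tfrac12)^2 - \tfrac54 \right) = 0,
\end{aligned}
\]

its feasible set reduces to $\{(x, \lambda) \in \mathbb R^2 \times \mathbb R^3 \mid x_1 = 0,\ x_2 \in [0, 1],\ \lambda_1 = 1,\ \lambda_2 = \lambda_3 = 0\}$ and its unique optimal solution is given by $(\bar x, \bar\lambda) = ((0, 1), (1, 0, 0))$. Technical calculations show that the corresponding perturbation map $\mathcal P$ is calm at $(0, 0, 0, \bar x, \bar\lambda)$ (the choice of the circle in the lower level constraints is of essential importance). Now we want to check that the M-stationarity-type conditions presented in Corollary 5.1 are fulfilled at this point. First, we deduce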

\[
(0, 0) = (0, -1) + \mu_1 (-1, 0) + \mu_2 (1, 1) + \mu_3 (-2, 1).
\]

Observe $I_{0+}(\bar x, \bar\lambda) = \{1\}$, which leads to $\zeta_1 = 0$, and $I_{00}(\bar x, \bar\lambda) = \{2, 3\}$, which means that

\[
\begin{aligned}
&\mu_2 (\zeta_1 + \zeta_2) = 0 \vee (\mu_2 > 0 \wedge \zeta_1 + \zeta_2 > 0), \\
&\mu_3 (-2\zeta_1 + \zeta_2) = 0 \vee (\mu_3 > 0 \wedge -2\zeta_1 + \zeta_2 > 0)
\end{aligned}
\]

need to be fulfilled. The multipliers $\mu_1 = -5$, $\mu_2 = -1$, $\mu_3 = 2$, and $\zeta_1 = \zeta_2 = 0$ satisfy the M-stationarity-type conditions from Corollary 5.1, but not the KKT conditions (31) since $\mu_2$ is negative. Note that there exist multipliers which satisfy the corresponding KKT conditions (e.g. $\mu_1 = \mu_2 = 1$, $\mu_3 = 0$, $\zeta_1 = \zeta_2 = 0$), which is not surprising in light of Theorem 5.6 since GCQ for the KKT reformulation holds at $(\bar x, \bar\lambda)$.
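The multiplier claims of Example 5.5 can be checked mechanically. The Python sketch below is an illustration under stated assumptions: the function names are ours, indices are zero-based in the code, and the sign test in `satisfies_kkt_signs` encodes our reading of (31) for the cone $K = -\mathbb R_0^{k,+}$ at this particular point (nonnegative $\mu_i$ and nonnegative $g_i'(\bar x)[\zeta]$ on the biactive set). Since f and the constraint with positive multiplier are linear, the second order terms do not appear.

```python
GRAD_F = (0.0, -1.0)                              # F(x) = -x2
GRAD_G = [(-1.0, 0.0), (1.0, 1.0), (-2.0, 1.0)]   # gradients of g_1, g_2, g_3 at x_bar = (0, 1)
I_0PLUS, I_00 = [0], [1, 2]                       # zero-based versions of {1} and {2, 3}


def stationarity_residual(mu):
    """Residual of 0 = F'(x_bar) + sum_i mu_i * g_i'(x_bar)."""
    return tuple(GRAD_F[k] + sum(m * g[k] for m, g in zip(mu, GRAD_G)) for k in range(2))


def directional(zeta, i):
    """Value of g_i'(x_bar)[zeta]."""
    return sum(GRAD_G[i][k] * zeta[k] for k in range(2))


def is_m_stationary(mu, zeta, tol=1e-12):
    """Check the system of Corollary 5.1 for the data of Example 5.5."""
    if any(abs(r) > tol for r in stationarity_residual(mu)):
        return False
    if any(abs(directional(zeta, i)) > tol for i in I_0PLUS):
        return False
    for i in I_00:
        d = directional(zeta, i)
        if not (abs(mu[i] * d) <= tol or (mu[i] > 0 and d > 0)):
            return False
    return True


def satisfies_kkt_signs(mu, zeta, tol=1e-12):
    """Assumed sign conditions of (31) on the biactive set I_00."""
    return all(mu[i] >= -tol and directional(zeta, i) >= -tol for i in I_00)


for mu in [(-5.0, -1.0, 2.0), (1.0, 1.0, 0.0)]:
    print(mu, is_m_stationary(mu, (0.0, 0.0)), satisfies_kkt_signs(mu, (0.0, 0.0)))
```

The first multiplier triple passes the M-stationarity check only, while the second one also satisfies the sign conditions, in agreement with the discussion in Example 5.5.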

Acknowledgements The authors would like to thank Gerd Wachsmuth for some fruitful discussion which led to the result postulated in Lemma 4.2.

References

1. Abadie, J.M.: Problèmes d'optimisation. Institut Blaise Pascal, Paris (1965)
2. Adam, L., Henrion, R., Outrata, J.V.: On M-stationarity conditions in MPECs and the associated constraint qualifications. Mathematical Programming, Series B pp. 1–31 (2017). DOI 10.1007/s10107-017-1146-3
3. Adams, R.A., Fournier, J.J.F.: Sobolev spaces. Elsevier Science, Oxford (2003)
4. Attouch, H., Buttazzo, G., Michaille, G.: Variational Analysis in Sobolev and BV Spaces: Applications to PDEs and Optimization. SIAM, Philadelphia (2005)
5. Aubin, J.P., Frankowska, H.: Set-Valued Analysis. Birkhäuser, Basel (2009)
6. Bard, J.F.: Practical Bilevel Optimization: Algorithms and Applications. Kluwer Academic Publishers, Dordrecht (1998)
7. Ben-Tal, A., Nemirovski, A.: On Polyhedral Approximations of the Second-Order Cone. Mathematics of Operations Research 26(2), 193–205 (2001)
8. Benita, F., Mehlitz, P.: Bilevel Optimal Control With Final-State-Dependent Finite-Dimensional Lower Level. SIAM Journal on Optimization 26(1), 718–752 (2016)
9. Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2002)
10. Bracken, J., McGill, J.: Mathematical programs with optimization problems in the constraints. Operations Research 21, 37–44 (1973)
11. Burke, J.V., Ferris, M.C.: Weak Sharp Minima in Mathematical Programming. SIAM Journal on Control and Optimization 31(5), 1340–1359 (1993)
12. Dempe, S.: Foundations of Bilevel Programming. Kluwer Academic, Dordrecht (2002)
13. Dempe, S., Dinh, N., Dutta, J.: Optimality Conditions for a Simple Convex Bilevel Programming Problem. In: R.S. Burachik, J.C. Yao (eds.) Variational Analysis and Generalized Differentiation in Optimization and Control, Springer Optimization and Its Applications, vol. 47, chap. 7, pp. 149–161. Springer, New York (2010)
14. Dempe, S., Dutta, J.: Is bilevel programming a special case of a mathematical program with complementarity constraints? Mathematical Programming, Series A 131(1), 37–48 (2012)
15. Dempe, S., Zemkoho, A.B.: The Generalized Mangasarian-Fromovitz Constraint Qualification and Optimality Conditions for Bilevel Programs. Journal of Optimization Theory and Applications 148(1), 46–68 (2011)
16. Ding, C., Sun, D., Ye, J.J.: First order optimality conditions for mathematical programs with semidefinite cone complementarity constraints. Mathematical Programming, Series A 147(1-2), 539–579 (2014)
17. Flegel, M.L., Kanzow, C.: Abadie-Type Constraint Qualification for Mathematical Programs with Equilibrium Constraints. Journal of Optimization Theory and Applications 124(3), 595–614 (2005)
18. Flegel, M.L., Kanzow, C.: On M-stationary points for mathematical programs with equilibrium constraints. Journal of Mathematical Analysis and Applications 310(1), 286–302 (2005)
19. Flegel, M.L., Kanzow, C.: On the Guignard constraint qualification for mathematical programs with equilibrium constraints. Optimization 54(6), 517–534 (2005)
20. Glover, B.M.: A generalized Farkas lemma with applications to quasidifferentiable programming. Zeitschrift für Operations Research 26(1), 125–141 (1982)
21. Guignard, M.: Generalized Kuhn-Tucker Conditions for Mathematical Programming Problems in a Banach Space. SIAM Journal on Control 7(2), 232–241 (1969)
22. Haraux, A.: How to differentiate the projection on a convex set in Hilbert space. Some applications to variational inequalities. Journal of the Mathematical Society of Japan 29(4), 615–631 (1977)
23. Henrion, R., Jourani, A.: Subdifferential Conditions for Calmness of Convex Constraints. SIAM Journal on Optimization 13(2), 520–534 (2002)
24. Henrion, R., Mordukhovich, B.S., Nam, N.M.: Second-Order Analysis of Polyhedral Systems in Finite and Infinite Dimensions with Applications to Robust Stability of Variational Inequalities. SIAM Journal on Optimization 20(5), 2199–2227 (2010)
25. Henrion, R., Outrata, J.V.: Calmness of constraint systems with applications. Mathematical Programming 104(2), 437–464 (2005)
26. Henrion, R., Surowiec, T.M.: On calmness conditions in convex bilevel programming. Applicable Analysis 90(6), 951–970 (2011)
27. Hintermüller, M., Kopacka, I.: Mathematical Programs with Complementarity Constraints in Function Space: C- and Strong Stationarity and a Path-Following Algorithm. SIAM Journal on Optimization 20(2), 868–902 (2009)
28. Hintermüller, M., Surowiec, T.M.: First-Order Optimality Conditions for Elliptic Mathematical Programs with Equilibrium Constraints via Variational Analysis. SIAM Journal on Optimization 21(4), 1561–1593 (2011)
29. Hoheisel, T., Kanzow, C., Schwartz, A.: Theoretical and numerical comparison of relaxation methods for mathematical programs with complementarity constraints. Mathematical Programming 137(1), 257–288 (2013)
30. Jahn, J.: Introduction to the Theory of Nonlinear Optimization. Springer, Berlin (1996)
31. Jarušek, J., Outrata, J.V.: On sharp necessary optimality conditions in control of contact problems with strings. Nonlinear Analysis: Theory, Methods & Applications 67(4), 1117–1128 (2007)
32. Kurcyusz, S., Zowe, J.: Regularity and Stability for the Mathematical Programming Problem in Banach Spaces. Applied Mathematics and Optimization 5(1), 49–62 (1979)
33. Levy, A.B.: Sensitivity of Solutions to Variational Inequalities on Banach Spaces. SIAM Journal on Control and Optimization 38(1), 50–60 (1999)
34. Luo, Z.Q., Pang, J.S., Ralph, D.: Mathematical Programs with Equilibrium Constraints. Cambridge University Press, Cambridge (1996)
35. Mehlitz, P., Wachsmuth, G.: On the Limiting Normal Cone to Pointwise Defined Sets in Lebesgue Spaces. Set-Valued and Variational Analysis (2016). DOI 10.1007/s11228-016-0393-4
36. Mehlitz, P., Wachsmuth, G.: Weak and strong stationarity in generalized bilevel programming and bilevel optimal control. Optimization 65(5), 907–935 (2016)
37. Mignot, F.: Contrôle dans les inéquations variationelles elliptiques. Journal of Functional Analysis 22(2), 130–185 (1976)
38. Mordukhovich, B.S.: Variational Analysis and Generalized Differentiation. Springer, Berlin (2006)
39. Outrata, J.V.: A note on the usage of nondifferentiable exact penalties in some special optimization problems. Kybernetika 24(4), 251–258 (1988)
40. Outrata, J.V.: On Optimization Problems with Variational Inequality Constraints. SIAM Journal on Optimization 4(2), 340–357 (1994)
41. Outrata, J.V.: Optimality Conditions for a Class of Mathematical Programs with Equilibrium Constraints. Mathematics of Operations Research 24(3), 627–644 (1999)
42. Outrata, J.V.: A Generalized Mathematical Program with Equilibrium Constraints. SIAM Journal on Control and Optimization 38(5), 1623–1638 (2000)
43. Robinson, S.M.: Stability theory for systems of inequalities, Part II: Differentiable nonlinear systems. SIAM Journal on Numerical Analysis 13(4), 497–513 (1976)
44. Robinson, S.M.: Some continuity properties of polyhedral multifunctions. In: H. König, B. Korte, K. Ritter (eds.) Mathematical Programming at Oberwolfach, pp. 206–214. Springer, Berlin (1981)
45. Scheel, H., Scholtes, S.: Mathematical Programs with Complementarity Constraints: Stationarity, Optimality, and Sensitivity. Mathematics of Operations Research 25(1), 1–22 (2000)
46. Shimizu, K., Ishizuka, Y., Bard, J.F.: Nondifferentiable and two-level mathematical programming. Kluwer Academic Publishers, Dordrecht (1997)
47. Simonnet, M.: Measures and Probabilities. Springer, New York (1996)
48. Solodov, M.: An explicit descent method for bilevel convex optimization. Journal of Convex Analysis 14(2), 227–238 (2007)
49. von Stackelberg, H.: Marktform und Gleichgewicht. Springer, Berlin (1934)
50. Wachsmuth, G.: Strong Stationarity for Optimal Control of the Obstacle Problem with Control Constraints. SIAM Journal on Optimization 24(4), 1914–1932 (2014)
51. Wachsmuth, G.: Mathematical Programs with Complementarity Constraints in Banach Spaces. Journal of Optimization Theory and Applications 166(2), 480–507 (2015)
52. Wachsmuth, G.: A guided tour of polyhedric sets. Basic properties, new results on intersections and applications (2016). URL https://www.tu-chemnitz.de/mathematik/part_dgl/publications/Wachsmuth__A_guided_tour_of_polyhedric_sets.pdf
53. Wachsmuth, G.: Strong stationarity for optimization problems with complementarity constraints in absence of polyhedricity. Set-Valued and Variational Analysis 25(1), 133–175 (2016)
54. Werner, D.: Funktionalanalysis. Springer, Berlin (1995)
55. Ye, J.J.: Necessary Conditions for Bilevel Dynamic Optimization Problems. SIAM Journal on Control and Optimization 33(4), 1208–1223 (1995)
56. Ye, J.J.: Optimal Strategies For Bilevel Dynamic Problems. SIAM Journal on Control and Optimization 35(2), 512–531 (1997)
57. Ye, J.J.: Optimality Conditions for Optimization Problems with Complementarity Constraints. SIAM Journal on Optimization 9(2), 374–387 (1999)
58. Ye, J.J.: Constraint Qualifications and Necessary Optimality Conditions for Optimization Problems with Variational Inequality Constraints. SIAM Journal on Optimization 10(4), 943–962 (2000)
59. Ye, J.J.: Necessary and sufficient optimality conditions for mathematical programs with equilibrium constraints. Journal of Mathematical Analysis and Applications 307(1), 350–369 (2005)
60. Ye, J.J., Ye, X.Y.: Necessary Optimality Conditions for Optimization Problems with Variational Inequality Constraints. Mathematics of Operations Research 22(4), 977–997 (1997)