An interior point method with a primal-dual quadratic barrier penalty function for nonlinear semidefinite programming

Atsushi Kato ∗ and Hiroshi Yabe † and Hiroshi Yamashita ‡

August 20, 2011 (Revised June 1, 2012)

Abstract

In this paper, we consider an interior point method for nonlinear semidefinite programming. Yamashita, Yabe and Harada presented a primal-dual interior point method in which a nondifferentiable merit function was used. By using shifted barrier KKT conditions, we propose a differentiable primal-dual merit function within the framework of their strategy, and prove the global convergence property of our method.

Keywords. Nonlinear semidefinite programming, Primal-dual interior point method, Primal-dual quadratic barrier penalty function, Global convergence

1 Introduction

In this paper, we consider the following nonlinear semidefinite programming (SDP) problem:

$$\begin{array}{ll} \min & f(x), \quad x \in \mathbb{R}^n, \\ \mathrm{s.t.} & g(x) = 0, \quad X(x) \succeq 0, \end{array} \qquad (1.1)$$

where the functions $f : \mathbb{R}^n \to \mathbb{R}$, $g : \mathbb{R}^n \to \mathbb{R}^m$ and $X : \mathbb{R}^n \to \mathbb{S}^p$ are sufficiently smooth, and $\mathbb{S}^p$ denotes the set of $p$-th order real symmetric matrices. By $X(x) \succeq 0$ and $X(x) \succ 0$, we mean that the matrix $X(x)$ is positive semidefinite and positive definite, respectively.

Numerical methods which solve the nonlinear SDP have been developed and studied by several authors [3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 15, 17, 21]. Drummond, Iusem and Svaiter [4] studied the central path of a nonlinear convex SDP and discussed the relationship between the logarithmic barrier function and the objective function. Leibfritz and Mostafa [15] proposed an interior point constrained trust region method.

∗ Department of Mathematical Information Science, Tokyo University of Science, 1-3, Kagurazaka, Shinjuku-ku, Tokyo 162-8601, Japan. e-mail: [email protected]
† Department of Mathematical Information Science, Tokyo University of Science, 1-3, Kagurazaka, Shinjuku-ku, Tokyo 162-8601, Japan. e-mail: [email protected]
‡ Mathematical Systems Inc., 2-4-3, Shinjuku, Shinjuku-ku, Tokyo 160-0022, Japan. e-mail: [email protected]

Huang, Yang and Teo [9] transformed a nonlinear SDP into a mathematical program with a matrix equality constraint and proposed a sequential quadratic penalty method. Correa and Ramirez [3] presented a sequential semidefinite programming method, which is based on the sequential quadratic programming method with a line search strategy. Kanzow, Nagel, Kato and Fukushima [12] proposed a trust-region-type method, which solves a subproblem that can be converted to a linear semidefinite program at each iteration. Yamashita, Yabe and Harada [21] proposed a primal-dual interior point method with a line search strategy. Huang, Teo and Yang [10] introduced an approximate augmented Lagrangian function for nonlinear SDP. Huang, Yang and Teo [11] considered a nonlinear SDP with a matrix equality constraint and proposed a lower-order penalty method. Gomez and Ramirez [7] proposed a filter method for nonlinear SDP.

Though Yamashita, Yabe and Harada [21] derived a primal-dual merit function, their function was not differentiable. Thus it is significant to consider a differentiable primal-dual merit function. In this paper, following the idea of Yamashita and Yabe [20] for nonlinear optimization problems, we propose a primal-dual interior point method whose merit function is differentiable for nonlinear SDP problems. Our primal-dual interior point method obtains the global convergence property under weaker assumptions than those in [21].

This paper is organized as follows. In Section 2, we introduce the optimality conditions for problem (1.1) and some notations, and briefly review the primal-dual interior point method of Yamashita et al. [21]. In Section 3, we describe how to obtain a search direction and a step size. We prove the global convergence of the proposed method in Section 4.

In what follows, $\|\cdot\|$, $\|\cdot\|_1$ and $\|\cdot\|_\infty$ denote the $\ell_2$, $\ell_1$ and $\ell_\infty$ norms for vectors, and $\|\cdot\|_F$ denotes the Frobenius norm for matrices. We define the norm

$$\left\|\begin{pmatrix} s \\ t \\ U \end{pmatrix}\right\|_* = \sqrt{\left\|\begin{pmatrix} s \\ t \end{pmatrix}\right\|^2 + \|U\|_F^2}$$

for $(s, t, U) \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{S}^p$. The superscript $T$ denotes the transpose of a vector or a matrix, $I$ denotes the identity matrix, and $\mathrm{tr}(M)$ denotes the trace of the matrix $M$. We define the inner product $\langle M_1, M_2 \rangle$ by $\langle M_1, M_2 \rangle = \mathrm{tr}(M_1M_2^T)$ for any matrices $M_1$ and $M_2$ in $\mathbb{R}^{p \times p}$.
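For concreteness, the composite norm $\|\cdot\|_*$ can be evaluated as in the following small sketch (NumPy-based; the function name star_norm is ours):

```python
import numpy as np

def star_norm(s, t, U):
    """The norm ||(s, t, U)||_* defined above, for s in R^n, t in R^m, U in S^p."""
    return np.sqrt(np.linalg.norm(s)**2 + np.linalg.norm(t)**2
                   + np.linalg.norm(U, 'fro')**2)
```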

2 Optimality conditions and algorithm for finding a KKT point

In this section, we introduce the optimality conditions for problem (1.1) and Algorithm SDPIP, which finds a KKT point. First, we recall the optimality conditions for problem (1.1). The Lagrangian function of problem (1.1) is defined by

$$L(w) = f(x) - y^Tg(x) - \langle X(x), Z \rangle,$$

where $w = (x, y, Z)$, and $y \in \mathbb{R}^m$ and $Z \in \mathbb{S}^p$ are the Lagrange multiplier vector and matrix corresponding to the equality and positive semidefiniteness constraints, respectively.

The first-order necessary conditions for optimality of problem (1.1), which are called the Karush-Kuhn-Tucker (KKT) conditions, are given by (see [2]):

$$r_0(w) \equiv \begin{pmatrix} \nabla_x L(w) \\ g(x) \\ X(x)Z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \qquad (2.1)$$

and

$$X(x) \succeq 0, \quad Z \succeq 0, \qquad (2.2)$$

where $\nabla_x L(w)$ is the gradient vector of the Lagrangian function given by

$$\nabla_x L(w) = \nabla f(x) - \nabla g(x)y - \mathcal{A}^*(x)Z.$$

Here the matrix $\nabla g(x)$ is given by

$$\nabla g(x) = (\nabla g_1(x) \ \cdots \ \nabla g_m(x)) \in \mathbb{R}^{n \times m},$$

and $\mathcal{A}^*(x)$ is the operator such that, for $Z$,

$$\mathcal{A}^*(x)Z = \begin{pmatrix} \langle A_1(x), Z \rangle \\ \vdots \\ \langle A_n(x), Z \rangle \end{pmatrix},$$

where the matrices $A_i(x) \in \mathbb{S}^p$ are defined by

$$A_i(x) = \frac{\partial X(x)}{\partial x_i}$$

for $i = 1, \cdots, n$.

Given a positive barrier parameter $\mu$, interior point methods usually replace the complementarity condition $X(x)Z = 0$ by $X(x)Z = \mu I$ and deal with the barrier KKT conditions

$$\begin{pmatrix} \nabla_x L(w) \\ g(x) \\ X(x)Z - \mu I \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

and

$$X(x) \succ 0, \quad Z \succ 0$$

instead of conditions (2.1) and (2.2). These conditions are derived from the necessary conditions for optimality of the following problem:

$$\min_{X(x) \succ 0} F_{BP1}(x, \mu) = f(x) + \rho\|g(x)\|_1 - \mu\log(\det X(x)), \quad x \in \mathbb{R}^n, \qquad (2.3)$$

where $\rho$ is a given positive penalty parameter (see [19]). In connection with (1.1), we consider the following problem:

$$\min_{X(x) \succ 0} F_{BP2}(x, \mu) = f(x) + \frac{1}{2\mu}\|g(x)\|^2 - \mu\log(\det X(x)), \quad x \in \mathbb{R}^n. \qquad (2.4)$$
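As a small illustration, (2.4) can be evaluated as in the sketch below, where f_val and g_val stand for precomputed values of $f(x)$ and $g(x)$ (our own interface, not the paper's); the Cholesky factorization doubles as a check that $X(x) \succ 0$:

```python
import numpy as np

def F_BP2(f_val, g_val, X, mu):
    """Quadratic penalty plus log-det barrier, as in (2.4)."""
    L = np.linalg.cholesky(X)                 # raises LinAlgError unless X > 0
    log_det_X = 2.0 * np.sum(np.log(np.diag(L)))
    return f_val + np.linalg.norm(g_val)**2 / (2.0 * mu) - mu * log_det_X
```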

The necessary conditions for the optimality of this problem are given by

$$\nabla F_{BP2}(x, \mu) = \nabla f(x) + \frac{1}{\mu}\nabla g(x)g(x) - \mu\mathcal{A}^*(x)X^{-1}(x) = 0 \qquad (2.5)$$

and $X(x) \succ 0$. We define the variables $y$ and $Z$ by $y = -g(x)/\mu$ and $Z = \mu X(x)^{-1}$. Then the above conditions are written as

$$r(w, \mu) = \begin{pmatrix} \nabla_x L(w) \\ g(x) + \mu y \\ X(x)Z - \mu I \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \qquad (2.6)$$

and

$$X(x) \succ 0, \quad Z \succ 0. \qquad (2.7)$$

We call these conditions the shifted barrier KKT (SBKKT) conditions, a point which satisfies the SBKKT conditions an SBKKT point, and a point which satisfies conditions (2.7) an interior point. It should be noted that we deal with $x$, $y$ and $Z$ as independent variables and that condition (2.6) reduces to condition (2.1) when $\mu = 0$, i.e., $r_0(w) = r(w, 0)$. In order to obtain a KKT point, we propose the following algorithm, which uses the SBKKT conditions.

Algorithm SDPIP

Step 0. (Initialization) Set $\varepsilon > 0$, $M_c > 0$ and $k = 0$. Let a positive decreasing sequence $\{\mu_k\}$ with $\mu_k \to 0$ be given.

Step 1. (Approximate SBKKT point) Find an interior point $w_{k+1}$ that satisfies

$$\|r(w_{k+1}, \mu_k)\|_* \le M_c\mu_k. \qquad (2.8)$$

Step 2. (Termination) If $\|r_0(w_{k+1})\|_* \le \varepsilon$, then stop.

Step 3. (Update) Set $k := k + 1$ and go to Step 1.

In contrast to the interior point method in [21], the stopping criterion (2.8) contains $g(x) + \mu y$. We should note that the global convergence property of Algorithm SDPIP can be shown in the same way as Theorem 1 of [21].
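For illustration, the stopping test (2.8) and the outer loop can be sketched as follows. The helpers grad_L (for $\nabla_x L(w)$) and sdpls (for the inner solver of Section 3) are assumed interfaces of this sketch, not the paper's pseudocode:

```python
import numpy as np

def r_norm(grad_L, g_val, X, Z, y, mu):
    """||r(w, mu)||_* for the residual (2.6); mu = 0 gives ||r0(w)||_*."""
    return np.sqrt(np.linalg.norm(grad_L)**2
                   + np.linalg.norm(g_val + mu * y)**2
                   + np.linalg.norm(X @ Z - mu * np.eye(X.shape[0]), 'fro')**2)

def sdpip(w0, eps, Mc, mu_seq, sdpls, r0_norm):
    """Skeleton of Algorithm SDPIP. sdpls(w, mu) is assumed to return an
    interior point satisfying (2.8); r0_norm(w) evaluates ||r0(w)||_*."""
    w = w0
    for mu_k in mu_seq:                 # positive, decreasing, mu_k -> 0
        w = sdpls(w, mu_k)              # Step 1: ||r(w, mu_k)||_* <= Mc * mu_k
        if r0_norm(w) <= eps:           # Step 2: termination
            return w
    return w                            # Step 3 is the loop increment
```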

3 How to obtain an approximate SBKKT point

Algorithm SDPIP given in the previous section needs to find an approximate SBKKT point at each iteration. In this section, we propose an algorithm to obtain such a point, which will be described as Algorithm SDPLS at the end of this section. Algorithm SDPLS uses the following iterative scheme:

$$w_{k+1} = w_k + \alpha_k\Delta w_k,$$

where the subscript $k$ denotes the iteration count of Algorithm SDPLS, $\Delta w_k = (\Delta x_k, \Delta y_k, \Delta Z_k) \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{S}^p$ is the $k$-th search direction and $\alpha_k > 0$ is the $k$-th step size. In what follows, we denote $X(x)$ simply by $X$ if it is not confusing. Note that an initial interior point $w_0$ and a fixed barrier parameter $\mu > 0$ are taken over from Algorithm SDPIP.

This section is organized as follows. In Section 3.1, we introduce a Newton-like method to calculate a search direction $\Delta w_k$. In Section 3.2, we propose our merit function. In Section 3.3, we describe how to obtain a suitable step size $\alpha_k$ which guarantees that the new point $w_{k+1}$ is an interior point, and we give Algorithm SDPLS.

3.1 How to obtain a Newton direction

In this subsection, we omit the subscript $k$ and assume that $X \succ 0$ and $Z \succ 0$. To obtain a search direction $\Delta w$, we apply a Newton-like method to the system of equations (2.6). However, it is difficult to express the Newton direction explicitly because $XZ = ZX$ does not hold in general. To overcome this matter, we introduce a nonsingular matrix $T \in \mathbb{S}^p$ and scale $X$ and $Z$ by $\tilde{X} = TXT^T$ and $\tilde{Z} = T^{-T}ZT^{-1}$, respectively, so that $\tilde{X}\tilde{Z} = \tilde{Z}\tilde{X}$ holds. We replace the equation $XZ = \mu I$ by the form $\tilde{X} \circ \tilde{Z} = \mu I$, where the multiplication $\tilde{X} \circ \tilde{Z}$ is defined by

$$\tilde{X} \circ \tilde{Z} = \frac{\tilde{X}\tilde{Z} + \tilde{Z}\tilde{X}}{2}.$$

Instead of (2.6), we consider the following scaled and symmetrized system:

$$\begin{pmatrix} \nabla_x L(w) \\ g(x) + \mu y \\ \tilde{X} \circ \tilde{Z} - \mu I \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \qquad (3.1)$$

Remark 3.1. To achieve $\tilde{X}\tilde{Z} = \tilde{Z}\tilde{X}$, the scaling matrix $T$ can be chosen as follows.

(i) If we set $T = X^{-1/2}$, then we have $\tilde{X} = I$ and $\tilde{Z} = X^{1/2}ZX^{1/2}$, which corresponds to the HRVW/KSH/M direction for linear SDP problems [8, 14, 16].

(ii) If we set $T = W^{-1/2}$ with $W = X^{1/2}(X^{1/2}ZX^{1/2})^{-1/2}X^{1/2}$, then we have $\tilde{X} = W^{-1/2}XW^{-1/2} = W^{1/2}ZW^{1/2} = \tilde{Z}$, which corresponds to the NT direction for linear SDP problems [18].

Now we apply the Newton method to the nonlinear equations (3.1), and we can obtain the Newton step $\Delta w$ by solving the following linear system of equations:

$$G\Delta x - \nabla g(x)\Delta y - \mathcal{A}^*(x)\Delta Z = -\nabla_x L(x, y, Z), \qquad (3.2)$$

$$\nabla g(x)^T\Delta x + \mu\Delta y = -g(x) - \mu y, \qquad (3.3)$$

$$\Delta\tilde{X}\tilde{Z} + \tilde{Z}\Delta\tilde{X} + \tilde{X}\Delta\tilde{Z} + \Delta\tilde{Z}\tilde{X} = 2\mu I - \tilde{X}\tilde{Z} - \tilde{Z}\tilde{X}, \qquad (3.4)$$

where $G$ denotes the Hessian matrix of the Lagrangian function $\nabla_x^2 L(w)$ or its approximation, $\Delta X = \sum_{i=1}^n \Delta x_iA_i(x) \in \mathbb{S}^p$, $\Delta\tilde{X} = T\Delta XT^T$ and $\Delta\tilde{Z} = T^{-T}\Delta ZT^{-1}$.

Remark 3.2. In practice, the matrix $G$ approximates the Hessian matrix $\nabla_x^2 L(w)$ by a Levenberg-Marquardt type modification or a quasi-Newton updating formula (see Remarks 2 and 3 of [21]).

The explicit forms of the Newton directions are given by (see [21])

$$\begin{pmatrix} G + H & -\nabla g(x) \\ \nabla g(x)^T & \mu I \end{pmatrix}\begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} = -\begin{pmatrix} \nabla f(x) - \nabla g(x)y - \mu\mathcal{A}^*(x)X^{-1} \\ g(x) + \mu y \end{pmatrix}$$

and

$$\Delta\tilde{Z} = \mu\tilde{X}^{-1} - \tilde{Z} - (\tilde{X} \odot I)^{-1}(\tilde{Z} \odot I)\Delta\tilde{X},$$

where the elements of the matrix $H \in \mathbb{S}^n$ are represented by the form

$$H_{ij} = \left\langle \tilde{A}_i(x), (\tilde{X} \odot I)^{-1}(\tilde{Z} \odot I)\tilde{A}_j(x) \right\rangle$$

with $\tilde{A}_i(x) = TA_i(x)T^T$, and the operator $P \odot Q$ is defined by

$$(P \odot Q)U = \frac{1}{2}(PUQ^T + QUP^T)$$

for $U \in \mathbb{S}^p$ and nonsingular $P, Q \in \mathbb{R}^{p \times p}$. Therefore, the equations (3.2) – (3.4) can be rewritten as

$$\left\{G + H + \frac{1}{\mu}\nabla g(x)\nabla g(x)^T\right\}\Delta x = -\nabla F_{BP2}(x, \mu), \qquad (3.5)$$

$$\Delta y = -\frac{1}{\mu}\{g(x) + \mu y + \nabla g(x)^T\Delta x\} \qquad (3.6)$$

and

$$\Delta Z = \mu X^{-1} - Z - (T^T \odot T^T)(\tilde{X} \odot I)^{-1}(\tilde{Z} \odot I)(T \odot T)\Delta X. \qquad (3.7)$$

When the matrices $A_i(x)$ $(i = 1, \ldots, n)$ are linearly independent, the matrix $H$ is positive definite (see [21]). Note that we can easily calculate the direction $\Delta Z$ and the matrix $H$ by using the scaling matrices of Remark 3.1. The following lemma is obtained directly from the relationships (3.5) – (3.7).

Lemma 3.1. If the matrix $G + H + \frac{1}{\mu}\nabla g(x)\nabla g(x)^T$ is nonsingular, then the Newton equations (3.2) – (3.4) give a unique search direction $\Delta w = (\Delta x, \Delta y, \Delta Z) \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{S}^p$, which is given by (3.5) – (3.7).

To obtain a unique direction, the interior point method in [21] needs the assumption that $\nabla g(x)$ is of full rank, while our interior point method does not need such an assumption.
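As an illustration of how (3.5) – (3.7) are used in practice, the following sketch assembles the search direction with the HRVW/KSH/M scaling $T = X^{-1/2}$ of Remark 3.1 (i), for which $\tilde{X} = I$ and $(\tilde{X} \odot I)^{-1}$ is the identity operator. The argument names (G, grad_f, Jg, A_list, ...) are our own interface, not the paper's:

```python
import numpy as np

def newton_direction(G, grad_f, Jg, g_val, A_list, X, Z, y, mu):
    """Compute (dx, dy, dZ) from (3.5)-(3.7) with T = X^{-1/2}.
    Jg = grad g(x) (n x m); A_list holds A_1(x), ..., A_n(x)."""
    p = X.shape[0]
    lam, Q = np.linalg.eigh(X)               # X = Q diag(lam) Q^T, lam > 0
    T = Q @ np.diag(lam**-0.5) @ Q.T         # T = X^{-1/2} (symmetric)
    Tinv = Q @ np.diag(lam**0.5) @ Q.T       # T^{-1} = X^{1/2}
    Xinv = Q @ np.diag(1.0 / lam) @ Q.T
    Ztil = Tinv @ Z @ Tinv                   # Ztil = X^{1/2} Z X^{1/2}
    Atil = [T @ Ai @ T for Ai in A_list]
    # H_ij = <Atil_i, (Ztil . I) Atil_j>, since (Xtil . I) is the identity
    H = np.array([[0.5 * np.trace(Ai @ (Ztil @ Aj + Aj @ Ztil))
                   for Aj in Atil] for Ai in Atil])
    # gradient of F_BP2 from (2.5)
    AstarXinv = np.array([np.trace(Ai @ Xinv) for Ai in A_list])
    gradF = grad_f + Jg @ g_val / mu - mu * AstarXinv
    # (3.5) and (3.6)
    dx = np.linalg.solve(G + H + Jg @ Jg.T / mu, -gradF)
    dy = -(g_val + mu * y + Jg.T @ dx) / mu
    # (3.7), written in the scaled space and mapped back by T^T (.) T
    dXtil = T @ sum(dxi * Ai for dxi, Ai in zip(dx, A_list)) @ T
    dZtil = mu * np.eye(p) - Ztil - 0.5 * (Ztil @ dXtil + dXtil @ Ztil)
    dZ = T @ dZtil @ T                       # T is symmetric here
    return dx, dy, dZ
```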

3.2 Differentiable primal-dual merit function

Yamashita et al. [21] proposed a primal-dual interior point method which consists of an outer iteration that finds a KKT point and an inner iteration that calculates an approximate barrier KKT point. To globalize the algorithm, they introduced the primal-dual merit function

$$F_{\ell 1}(x, Z, \mu) = F_{BP1}(x, \mu) + \nu F_{PD1}(x, Z, \mu),$$

where $\nu$ is a positive parameter, the function $F_{BP1}(x, \mu)$ is defined by (2.3) and $F_{PD1}(x, Z, \mu)$ is defined by

$$F_{PD1}(x, Z, \mu) = \langle X, Z \rangle - \mu\log\{\det(XZ)\}.$$

However, this merit function is not differentiable, and the dual variable $y$ is not contained in it. Taking account of these issues, we propose the following differentiable merit function in the whole primal-dual space:

$$F_{\ell 2}(w, \mu) = F_{BP2}(x, \mu) + \nu F_{PD2}(w, \mu), \qquad (3.8)$$

where $F_{BP2}(x, \mu)$ is defined by (2.4) and $F_{PD2}(w, \mu)$ is the primal-dual barrier function given by

$$F_{PD2}(w, \mu) = \frac{1}{2}\|g(x) + \mu y\|^2 + \log\frac{\langle X, Z \rangle/p + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2}{\{\det(XZ)\}^{1/p}}. \qquad (3.9)$$

Note that $\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2$ can be rewritten as

$$\begin{aligned} \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2 &= \mathrm{tr}\{(Z^{\frac12}XZ^{\frac12} - \mu I)(Z^{\frac12}XZ^{\frac12} - \mu I)^T\} \\ &= \mathrm{tr}\{(Z^{\frac12}XZ^{\frac12} - \mu I)(Z^{\frac12}XZ^{\frac12} - \mu I)\} \\ &= \mathrm{tr}(Z^{\frac12}XZXZ^{\frac12}) - 2\mu\,\mathrm{tr}(Z^{\frac12}XZ^{\frac12}) + \mu^2\mathrm{tr}(I) \\ &= \mathrm{tr}(XZXZ) - 2\mu\,\mathrm{tr}(XZ) + p\mu^2. \qquad (3.10) \end{aligned}$$

Let $\lambda_i$ and $\tau_i$ for $i = 1, \cdots, p$ denote the eigenvalues of the matrices $\tilde{X}$ and $\tilde{Z}$, respectively. Since $\tilde{X}\tilde{Z} = \tilde{Z}\tilde{X}$ holds, the matrices $\tilde{X}$ and $\tilde{Z}$ share the same eigensystem. Therefore, the matrix $\tilde{X}\tilde{Z}$ has the eigenvalues $\lambda_i\tau_i$, $i = 1, \cdots, p$, i.e., the following hold:

$$\langle X, Z \rangle = \sum_{i=1}^p \lambda_i\tau_i, \quad \det(XZ) = \prod_{i=1}^p \lambda_i\tau_i. \qquad (3.11)$$
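For reference, (3.9) can be evaluated directly, with the identity (3.10) used to avoid forming $Z^{1/2}$; this is a sketch and the names are ours:

```python
import numpy as np

def F_PD2(X, Z, g_val, y, mu):
    """Primal-dual barrier function (3.9); the Frobenius term uses (3.10)."""
    p = X.shape[0]
    frob2 = (np.trace(X @ Z @ X @ Z) - 2.0 * mu * np.trace(X @ Z)
             + p * mu**2)                    # ||Z^{1/2} X Z^{1/2} - mu I||_F^2
    num = np.trace(X @ Z) / p + frob2
    log_det_XZ = np.linalg.slogdet(X)[1] + np.linalg.slogdet(Z)[1]
    return 0.5 * np.linalg.norm(g_val + mu * y)**2 + np.log(num) - log_det_XZ / p
```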

The function $F_{PD2}(w, \mu)$ has the following properties.

Lemma 3.2. The following relationships hold:

(i) $F_{PD2}(w, \mu) \ge 0$.

(ii) $F_{PD2}(w, \mu) = 0$ if and only if $g(x) + \mu y = 0$ and $XZ - \mu I = 0$.

Proof. To show this lemma, we separate $F_{PD2}(w, \mu)$ in (3.9) into the two functions $F_S(x, y, \mu)$ and $F_P(x, Z, \mu)$, given by

$$F_S(x, y, \mu) = \frac{1}{2}\|g(x) + \mu y\|^2$$

and

$$F_P(x, Z, \mu) = \log\frac{\langle X, Z \rangle/p + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2}{\{\det(XZ)\}^{1/p}}.$$

(i) The inequality $F_S(x, y, \mu) \ge 0$ obviously holds. The equations (3.11) and the arithmetic-geometric mean inequality yield

$$\frac{1}{p}\sum_{i=1}^p \lambda_i\tau_i + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2 \ge \left(\prod_{i=1}^p \lambda_i\tau_i\right)^{\frac{1}{p}} = \{\det(XZ)\}^{\frac{1}{p}}.$$

Thus we have $F_P(x, Z, \mu) \ge \log 1 = 0$.

(ii) Suppose that $g(x) + \mu y = 0$ and $XZ - \mu I = 0$. From $g(x) + \mu y = 0$, we obtain $F_S(x, y, \mu) = 0$. $XZ - \mu I = 0$ means $Z^{\frac12}XZ^{\frac12} - \mu I = 0$. Since $\lambda_1\tau_1 = \cdots = \lambda_p\tau_p = \mu$ holds, we get

$$\frac{1}{p}\sum_{i=1}^p \lambda_i\tau_i = \left(\prod_{i=1}^p \lambda_i\tau_i\right)^{1/p},$$

which implies $F_P(x, Z, \mu) = \log 1 = 0$. Therefore, $F_{PD2}(w, \mu) = 0$.

Conversely, suppose that $F_{PD2}(w, \mu) = 0$. This means that

$$F_S(x, y, \mu) = \frac{1}{2}\|g(x) + \mu y\|^2 = 0, \quad F_P(x, Z, \mu) = \log 1 = 0.$$

The equality $g(x) + \mu y = 0$ clearly holds. It follows from (3.11) and the arithmetic-geometric mean inequality that

$$\frac{1}{p}\sum_{i=1}^p \lambda_i\tau_i + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2 = \left(\prod_{i=1}^p \lambda_i\tau_i\right)^{1/p} \le \frac{1}{p}\sum_{i=1}^p \lambda_i\tau_i,$$

which implies $\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2 \le 0$. We obtain $XZ - \mu I = 0$. Therefore, the proof is complete. □

The next lemma helps derive the derivative of $F_{\ell 2}(w, \mu)$.

Lemma 3.3. The following holds for $i = 1, \cdots, n$:

$$\frac{\partial}{\partial x_i}\mathrm{tr}(XZXZ) = 2\langle A_i(x), ZXZ \rangle.$$

Proof. From the definition of the matrix $A_i(x) \in \mathbb{S}^p$, we obtain

$$\begin{aligned} \frac{\partial}{\partial x_i}\mathrm{tr}(XZXZ) &= \mathrm{tr}\left(\frac{\partial X}{\partial x_i}ZXZ\right) + \mathrm{tr}\left(XZ\frac{\partial X}{\partial x_i}Z\right) \\ &= \mathrm{tr}(A_i(x)ZXZ) + \mathrm{tr}(XZA_i(x)Z) \\ &= \mathrm{tr}(A_i(x)ZXZ) + \mathrm{tr}(A_i(x)ZXZ) \\ &= 2\,\mathrm{tr}(A_i(x)ZXZ) \\ &= 2\langle A_i(x), ZXZ \rangle. \end{aligned}$$

Therefore, the lemma is proven. □
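Lemma 3.3 is easy to verify numerically. The sketch below uses a hypothetical affine map $X(x) = C + \sum_i x_iA_i$, so that $A_i = \partial X/\partial x_i$ exactly, and compares a central difference with the trace formula:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 4, 3
A = [(lambda M: (M + M.T) / 2)(rng.standard_normal((p, p))) for _ in range(n)]
C = 5.0 * np.eye(p)                          # base point of the affine X(x)
Z = 2.0 * np.eye(p)
x = rng.standard_normal(n)
X = C + sum(xi * Ai for xi, Ai in zip(x, A))

phi = lambda M: np.trace(M @ Z @ M @ Z)      # phi(X) = tr(XZXZ)
i, t = 0, 1e-6
fd = (phi(X + t * A[i]) - phi(X - t * A[i])) / (2 * t)
exact = 2 * np.trace(A[i] @ Z @ X @ Z)       # 2 <A_i(x), ZXZ>
print(fd, exact)                             # the two values agree
```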

From Lemma 3.3, (3.8) and (3.10), the derivative of the merit function is given by

$$\nabla F_{\ell 2}(w, \mu) = \begin{pmatrix} \nabla F_{BP2}(x, \mu) + \nu\nabla_x F_{PD2}(w, \mu) \\ \nu\nabla_y F_{PD2}(w, \mu) \\ \nu\nabla_Z F_{PD2}(w, \mu) \end{pmatrix}, \qquad (3.12)$$

where

$$\nabla_x F_{PD2}(w, \mu) = \frac{\frac{1}{p}\mathcal{A}^*(x)Z + 2\{\mathcal{A}^*(x)(ZXZ) - \mu\mathcal{A}^*(x)Z\}}{\langle X, Z \rangle/p + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2} - \frac{1}{p}\mathcal{A}^*(x)X^{-1} + \nabla g(x)\{g(x) + \mu y\}, \qquad (3.13)$$

$$\nabla_y F_{PD2}(w, \mu) = \mu\{g(x) + \mu y\}, \qquad (3.14)$$

$$\nabla_Z F_{PD2}(w, \mu) = \frac{\frac{1}{p}X + 2(XZX - \mu X)}{\langle X, Z \rangle/p + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2} - \frac{1}{p}Z^{-1} \qquad (3.15)$$

and

$$\mathcal{A}^*(x)(ZXZ) = \begin{pmatrix} \langle A_1(x), ZXZ \rangle \\ \vdots \\ \langle A_n(x), ZXZ \rangle \end{pmatrix}.$$

Note that $\nabla F_{BP2}(x, \mu)$ is given in (2.5), and that $\nabla_x F_{PD2}(w, \mu)$ and $\nabla_y F_{PD2}(w, \mu)$ are the derivatives of $F_{PD2}(w, \mu)$ with respect to the vectors $x$ and $y$, respectively. $\nabla_Z F_{PD2}(w, \mu)$ is the derivative of $F_{PD2}(w, \mu)$ with respect to the matrix $Z$ (see [1]). The following lemma shows that an SBKKT point is equivalent to a stationary point of the function $F_{\ell 2}(w, \mu)$.

Lemma 3.4. The following statements are equivalent:

(a) $r(w, \mu) = 0$.

(b) $\nabla F_{BP2}(x, \mu) = 0$, $g(x) + \mu y = 0$ and $XZ - \mu I = 0$.

(c) $\nabla F_{\ell 2}(w, \mu) = 0$.

Proof. The equivalence of (a) and (b) is obvious from (2.5) and (2.6). Therefore, it suffices to show the equivalence of (b) and (c).

Suppose that (b) is satisfied. From the equality $g(x) + \mu y = 0$, $\nabla_y F_{PD2}(w, \mu) = 0$ clearly holds. Since the expression $XZ - \mu I = 0$ implies that $X = \mu Z^{-1}$, $Z = \mu X^{-1}$ and $Z^{\frac12}XZ^{\frac12} - \mu I = 0$, we have

$$\nabla_Z F_{PD2}(w, \mu) = \frac{\frac{1}{p}\mu Z^{-1} + 2(\mu X - \mu X)}{\mu} - \frac{1}{p}Z^{-1} = 0.$$

By (3.13), the expressions $g(x) + \mu y = 0$ and $XZ - \mu I = 0$ also yield

$$\nabla_x F_{PD2}(w, \mu) = \frac{\frac{1}{p}\mathcal{A}^*(x)(\mu X^{-1}) + 2\{\mu\mathcal{A}^*(x)Z - \mu\mathcal{A}^*(x)Z\}}{\mu} - \frac{1}{p}\mathcal{A}^*(x)X^{-1} = 0.$$

Therefore, (c) holds by (3.12).

Conversely, assume that (c) is satisfied. By the assumption $\nabla_y F_{PD2}(w, \mu) = 0$, $g(x) + \mu y = 0$ is obvious. From $\nabla_Z F_{PD2}(w, \mu) = 0$, we get

$$\frac{\frac{1}{p}X + 2(XZX - \mu X)}{\langle X, Z \rangle/p + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2} - \frac{1}{p}Z^{-1} = 0.$$

Multiplying both sides of the above equation by $X^{-1}$ from the right, we have

$$\frac{\frac{1}{p}I + 2(XZ - \mu I)}{\langle X, Z \rangle/p + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2} - \frac{1}{p}Z^{-1}X^{-1} = 0,$$

which implies

$$2(XZ - \mu I) = \frac{1}{p}\left\{\frac{\langle X, Z \rangle}{p} + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2\right\}Z^{-1}X^{-1} - \frac{1}{p}I.$$

Multiplying both sides of the above equation by $Z^{\frac12}$ from the left and $Z^{-\frac12}$ from the right, we have

$$2(Z^{\frac12}XZ^{\frac12} - \mu I) = \frac{1}{p}\left\{\frac{\langle X, Z \rangle}{p} + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2\right\}(Z^{\frac12}XZ^{\frac12})^{-1} - \frac{1}{p}I.$$

Multiplying both sides of the above equation by $(Z^{\frac12}XZ^{\frac12} - \mu I)^T$ from the right, we obtain

$$2(Z^{\frac12}XZ^{\frac12} - \mu I)(Z^{\frac12}XZ^{\frac12} - \mu I)^T = \frac{1}{p}\left\{\frac{\langle X, Z \rangle}{p} + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2\right\}\{I - \mu(Z^{\frac12}XZ^{\frac12})^{-1}\} - \frac{1}{p}(Z^{\frac12}XZ^{\frac12} - \mu I).$$

Taking the trace of the above equation, we have

$$\begin{aligned} 2\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2 &= \frac{1}{p}\left\{\frac{\langle X, Z \rangle}{p} + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2\right\}[p - \mu\,\mathrm{tr}\{(Z^{\frac12}XZ^{\frac12})^{-1}\}] - \frac{1}{p}\{\mathrm{tr}(Z^{\frac12}XZ^{\frac12}) - \mu p\} \\ &= \left[1 - \frac{\mu}{p}\mathrm{tr}\{(Z^{\frac12}XZ^{\frac12})^{-1}\}\right]\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2 + \mu\left\{1 - \frac{1}{p^2}\langle X, Z \rangle\langle X^{-1}, Z^{-1} \rangle\right\}. \end{aligned}$$

Then it follows from the arithmetic-geometric mean inequality that

$$\left[1 + \frac{\mu}{p}\mathrm{tr}\{(Z^{\frac12}XZ^{\frac12})^{-1}\}\right]\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2 = \mu\left\{1 - \frac{\langle \tilde{X}, \tilde{Z} \rangle}{p}\,\frac{\langle \tilde{X}^{-1}, \tilde{Z}^{-1} \rangle}{p}\right\} \le \mu\left\{1 - \frac{(\prod_{i=1}^p \lambda_i\tau_i)^{1/p}}{(\prod_{i=1}^p \lambda_i\tau_i)^{1/p}}\right\} = 0.$$

Because $\mathrm{tr}\{(Z^{\frac12}XZ^{\frac12})^{-1}\} > 0$ holds, we obtain $XZ - \mu I = 0$. From $g(x) + \mu y = 0$ and $XZ - \mu I = 0$, we get

$$\nabla_x F_{PD2}(w, \mu) = \frac{\frac{\mu}{p}\mathcal{A}^*(x)X^{-1} + 2\{\mu\mathcal{A}^*(x)Z - \mu\mathcal{A}^*(x)Z\}}{\mu} - \frac{1}{p}\mathcal{A}^*(x)X^{-1} = 0.$$

Then $\nabla F_{\ell 2}(w, \mu) = 0$ and $\nabla_x F_{PD2}(w, \mu) = 0$ imply $\nabla F_{BP2}(x, \mu) = 0$. Therefore, the proof is complete. □
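Before moving on, the gradient formula (3.15) used in the proof above admits a quick numerical sanity check: compare $\langle \nabla_Z F_{PD2}, D \rangle$ with a finite difference of the log term of (3.9) along a symmetric direction $D$ (a sketch; the numerator is computed via (3.10)):

```python
import numpy as np

def log_term(X, Z, mu):
    """The log term of (3.9), with the numerator rewritten via (3.10)."""
    p = X.shape[0]
    num = (np.trace(X @ Z) / p + np.trace(X @ Z @ X @ Z)
           - 2 * mu * np.trace(X @ Z) + p * mu**2)
    return np.log(num) - np.linalg.slogdet(X @ Z)[1] / p

rng = np.random.default_rng(1)
p, mu = 3, 0.5
B = rng.standard_normal((p, p)); X = B @ B.T + np.eye(p)   # X > 0
B = rng.standard_normal((p, p)); Z = B @ B.T + np.eye(p)   # Z > 0
num = (np.trace(X @ Z) / p + np.trace(X @ Z @ X @ Z)
       - 2 * mu * np.trace(X @ Z) + p * mu**2)
grad_Z = (X / p + 2 * (X @ Z @ X - mu * X)) / num - np.linalg.inv(Z) / p  # (3.15)
D = rng.standard_normal((p, p)); D = (D + D.T) / 2         # symmetric direction
t = 1e-6
fd = (log_term(X, Z + t * D, mu) - log_term(X, Z - t * D, mu)) / (2 * t)
print(fd, np.trace(grad_Z @ D))              # the two values agree
```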

Now we introduce the directional derivative $D(F_{\ell 2}(w, \mu); \Delta w)$ of the merit function $F_{\ell 2}(w, \mu)$ at a point $w$ in the search direction $\Delta w$, which is defined by

$$D(F_{\ell 2}(w, \mu); \Delta w) = \nabla F_{BP2}(x, \mu)^T\Delta x + \nu D(F_{PD2}(w, \mu); \Delta w), \qquad (3.16)$$

where

$$D(F_{PD2}(w, \mu); \Delta w) = \nabla_x F_{PD2}(w, \mu)^T\Delta x + \nabla_y F_{PD2}(w, \mu)^T\Delta y + \langle \nabla_Z F_{PD2}(w, \mu), \Delta Z \rangle. \qquad (3.17)$$
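Given the gradient blocks (2.5) and (3.13) – (3.15), assembling (3.16) – (3.17) is mechanical; the following sketch assumes the blocks are already available as arrays:

```python
import numpy as np

def dir_derivative(grad_FBP2, gx, gy, gZ, dx, dy, dZ, nu):
    """D(F_l2(w, mu); dw) from (3.16)-(3.17); gx, gy, gZ are
    grad_x F_PD2, grad_y F_PD2 and grad_Z F_PD2, respectively."""
    D_PD2 = gx @ dx + gy @ dy + np.trace(gZ @ dZ)   # (3.17)
    return grad_FBP2 @ dx + nu * D_PD2              # (3.16)
```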

The next two lemmas evaluate the directional derivatives $D(F_{PD2}(w, \mu); \Delta w)$ and $D(F_{\ell 2}(w, \mu); \Delta w)$ along the search direction $\Delta w$.

Lemma 3.5. The Newton direction $\Delta w$ satisfies

$$D(F_{PD2}(w, \mu); \Delta w) \le -\frac{\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2}{\langle X, Z \rangle/p + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2} - \|g(x) + \mu y\|^2.$$

Proof. From (3.13) – (3.15) and the definition of $\Delta X$, we have

$$\nabla_x F_{PD2}(w, \mu)^T\Delta x = \frac{\langle \Delta X, Z \rangle/p + 2\{\langle \Delta X, ZXZ \rangle - \mu\langle \Delta X, Z \rangle\}}{\langle X, Z \rangle/p + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2} - \frac{1}{p}\langle \Delta X, X^{-1} \rangle + \{\nabla g(x)^T\Delta x\}^T\{g(x) + \mu y\}, \qquad (3.18)$$

$$\nabla_y F_{PD2}(w, \mu)^T\Delta y = \mu\Delta y^T\{g(x) + \mu y\} \qquad (3.19)$$

and

$$\langle \nabla_Z F_{PD2}(w, \mu), \Delta Z \rangle = \frac{\langle X, \Delta Z \rangle/p + 2\{\langle XZX, \Delta Z \rangle - \mu\langle X, \Delta Z \rangle\}}{\langle X, Z \rangle/p + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2} - \frac{1}{p}\langle Z^{-1}, \Delta Z \rangle. \qquad (3.20)$$

It follows from (3.4) that

$$\begin{aligned} \langle \Delta X, Z \rangle + \langle X, \Delta Z \rangle &= \mathrm{tr}(\Delta XZ) + \mathrm{tr}(X\Delta Z) = \mathrm{tr}(\Delta\tilde{X}\tilde{Z} + \tilde{X}\Delta\tilde{Z}) \\ &= \frac{1}{2}\mathrm{tr}(\Delta\tilde{X}\tilde{Z} + \tilde{X}\Delta\tilde{Z} + \Delta\tilde{Z}\tilde{X} + \tilde{Z}\Delta\tilde{X}) \\ &= \frac{1}{2}\mathrm{tr}(2\mu I - \tilde{X}\tilde{Z} - \tilde{Z}\tilde{X}) = \mathrm{tr}(\mu I - Z^{\frac12}XZ^{\frac12}) \qquad (3.21) \end{aligned}$$

and

$$\begin{aligned} \langle \Delta X, ZXZ \rangle + \langle XZX, \Delta Z \rangle &= \mathrm{tr}(\Delta XZXZ) + \mathrm{tr}(XZX\Delta Z) = \mathrm{tr}\{(\tilde{X}\tilde{Z})(\Delta\tilde{X}\tilde{Z} + \tilde{X}\Delta\tilde{Z})\} \\ &= \frac{1}{2}\mathrm{tr}\{(\tilde{X}\tilde{Z})(\Delta\tilde{X}\tilde{Z} + \tilde{X}\Delta\tilde{Z} + \Delta\tilde{Z}\tilde{X} + \tilde{Z}\Delta\tilde{X})\} \\ &= \frac{1}{2}\mathrm{tr}\{(\tilde{X}\tilde{Z})(2\mu I - \tilde{X}\tilde{Z} - \tilde{Z}\tilde{X})\} = \mathrm{tr}\{(Z^{\frac12}XZ^{\frac12})(\mu I - Z^{\frac12}XZ^{\frac12})\}. \end{aligned}$$

Thus we obtain

$$\langle \Delta X, ZXZ \rangle + \langle XZX, \Delta Z \rangle - \mu\{\langle \Delta X, Z \rangle + \langle X, \Delta Z \rangle\} = \mathrm{tr}\{(Z^{\frac12}XZ^{\frac12})(\mu I - Z^{\frac12}XZ^{\frac12})\} - \mathrm{tr}\{\mu I(\mu I - Z^{\frac12}XZ^{\frac12})\} = -\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2. \qquad (3.22)$$

By (3.4), (3.11) and (3.21), we get

$$\langle \Delta X, Z \rangle + \langle X, \Delta Z \rangle = \mathrm{tr}(\mu I - \tilde{X}\tilde{Z}) = p\mu - \sum_{i=1}^p \lambda_i\tau_i \qquad (3.23)$$

and

$$\begin{aligned} \langle \Delta X, X^{-1} \rangle + \langle Z^{-1}, \Delta Z \rangle &= \mathrm{tr}(\Delta XX^{-1}) + \mathrm{tr}(Z^{-1}\Delta Z) = \mathrm{tr}(X^{-1}\Delta XZZ^{-1}) + \mathrm{tr}(\Delta ZZ^{-1}X^{-1}X) \\ &= \mathrm{tr}\{(\Delta\tilde{X}\tilde{Z} + \tilde{X}\Delta\tilde{Z})(\tilde{X}\tilde{Z})^{-1}\} = \frac{1}{2}\mathrm{tr}\{(2\mu I - \tilde{X}\tilde{Z} - \tilde{Z}\tilde{X})(\tilde{X}\tilde{Z})^{-1}\} \\ &= \mathrm{tr}\{(\mu I - \tilde{X}\tilde{Z})(\tilde{X}\tilde{Z})^{-1}\} = \mathrm{tr}\{\mu(\tilde{X}\tilde{Z})^{-1} - I\} = \mu\sum_{i=1}^p \frac{1}{\lambda_i\tau_i} - p. \qquad (3.24) \end{aligned}$$

We have from (3.6) that

$$\{\nabla g(x)^T\Delta x + \mu\Delta y\}^T\{g(x) + \mu y\} = -\|g(x) + \mu y\|^2. \qquad (3.25)$$

For simplicity, we define the function $h(x, Z, \mu)$ by

$$h(x, Z, \mu) = \frac{\langle X, Z \rangle}{p} + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2 = \frac{1}{p}\sum_{i=1}^p \lambda_i\tau_i + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2.$$

It follows from (3.17) – (3.20) and (3.22) – (3.25) that

$$\begin{aligned} D(F_{PD2}(w, \mu); \Delta w) &= \frac{1}{h(x, Z, \mu)}\left[\frac{1}{p}\{\langle \Delta X, Z \rangle + \langle X, \Delta Z \rangle\} + 2\{\langle \Delta X, ZXZ \rangle + \langle XZX, \Delta Z \rangle - \mu(\langle \Delta X, Z \rangle + \langle X, \Delta Z \rangle)\}\right] \\ &\qquad - \frac{1}{p}\{\langle \Delta X, X^{-1} \rangle + \langle Z^{-1}, \Delta Z \rangle\} + \{\nabla g(x)^T\Delta x + \mu\Delta y\}^T\{g(x) + \mu y\} \\ &= \frac{1}{h(x, Z, \mu)}\left\{\mu - \frac{1}{p}\sum_{i=1}^p \lambda_i\tau_i - 2\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2\right\} - \frac{\mu}{p}\sum_{i=1}^p \frac{1}{\lambda_i\tau_i} + 1 - \|g(x) + \mu y\|^2 \\ &= \frac{\mu - \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2}{h(x, Z, \mu)} - \frac{\mu}{p}\sum_{i=1}^p \frac{1}{\lambda_i\tau_i} - \|g(x) + \mu y\|^2 \\ &\le \frac{p\mu}{\sum_{i=1}^p \lambda_i\tau_i} - \frac{\mu}{p}\sum_{i=1}^p \frac{1}{\lambda_i\tau_i} - \frac{\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2}{h(x, Z, \mu)} - \|g(x) + \mu y\|^2. \end{aligned}$$

From the arithmetic-geometric mean inequality, we obtain

$$\begin{aligned} & \frac{p\mu}{\sum_{i=1}^p \lambda_i\tau_i} - \frac{\mu}{p}\sum_{i=1}^p \frac{1}{\lambda_i\tau_i} - \frac{\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2}{h(x, Z, \mu)} - \|g(x) + \mu y\|^2 \\ &\le \frac{p\mu}{\sum_{i=1}^p \lambda_i\tau_i} - \frac{\mu}{(\prod_{i=1}^p \lambda_i\tau_i)^{1/p}} - \frac{\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2}{h(x, Z, \mu)} - \|g(x) + \mu y\|^2 \\ &\le -\frac{\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2}{h(x, Z, \mu)} - \|g(x) + \mu y\|^2. \end{aligned}$$

Therefore, the proof is complete. □

Lemma 3.6. The Newton direction $\Delta w$ satisfies

$$D(F_{\ell 2}(w, \mu); \Delta w) \le -\Delta x^T\left\{G + H + \frac{1}{\mu}\nabla g(x)\nabla g(x)^T\right\}\Delta x - \nu\frac{\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2}{\langle X, Z \rangle/p + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2} - \nu\|g(x) + \mu y\|^2.$$

Proof. By (3.16) and the Newton equation (3.5), $\nabla F_{BP2}(x, \mu)^T\Delta x = -\Delta x^T\{G + H + \frac{1}{\mu}\nabla g(x)\nabla g(x)^T\}\Delta x$, and Lemma 3.5 bounds $\nu D(F_{PD2}(w, \mu); \Delta w)$ by the remaining two terms. Therefore, the proof is complete. □

The following lemma claims that the Newton direction is a descent direction for the merit function and that an SBKKT point is obtained if $\Delta x = 0$, $g(x) + \mu y = 0$ and $XZ - \mu I = 0$ hold.

Lemma 3.7. If the matrix $G + H + \frac{1}{\mu}\nabla g(x)\nabla g(x)^T$ is positive definite, then the following hold:

(i) The Newton direction $\Delta w$ is a descent direction for the merit function $F_{\ell 2}(w, \mu)$.

(ii) If $D(F_{\ell 2}(w, \mu); \Delta w) = 0$ holds, then the point $w$ is an SBKKT point.

Proof. (i) It is obvious from Lemma 3.6.

(ii) By Lemma 3.6 and the assumption, we obtain

$$0 \le -\Delta x^T\left\{G + H + \frac{1}{\mu}\nabla g(x)\nabla g(x)^T\right\}\Delta x - \nu\frac{\|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2}{\langle X, Z \rangle/p + \|Z^{\frac12}XZ^{\frac12} - \mu I\|_F^2} - \nu\|g(x) + \mu y\|^2 \le 0,$$

which implies that $\Delta x = 0$, $g(x) + \mu y = 0$ and $XZ - \mu I = 0$ hold. $\Delta x = 0$ means $\nabla F_{BP2}(x, \mu) = 0$ from (3.5). Thus, from Lemma 3.4, $r(w, \mu) = 0$ follows. □

3.3 Algorithm SDPLS

To globalize our interior point method and to guarantee that the new point $w_{k+1}$ is an interior point, we use a line search strategy to obtain a suitable step size $\alpha_k$. First, we introduce the algorithm that finds a step size $\alpha_k$ by backtracking at the $k$-th iteration.

Algorithm LS

Step 0. (Initialization) Choose the parameters $\nu > 0$, $\gamma \in (0, 1)$, $\beta \in (0, 1)$ and $\varepsilon_0 \in (0, 1)$. Calculate an initial step size $\bar{\alpha}_k$ by

$$\bar{\alpha}_k = \min\{\bar{\alpha}_{xk}, \bar{\alpha}_{zk}, 1\},$$

where $\bar{\alpha}_{xk}$ and $\bar{\alpha}_{zk}$ are calculated by

$$\bar{\alpha}_{xk} = \begin{cases} -\dfrac{\gamma}{\lambda_{\min}(X_k^{-1}\Delta X_k)} & \text{if } X \text{ is linear}, \\ 1 & \text{otherwise} \end{cases}$$

and

$$\bar{\alpha}_{zk} = -\frac{\gamma}{\lambda_{\min}(Z_k^{-1}\Delta Z_k)}.$$

Here, $\lambda_{\min}(M)$ denotes the minimum eigenvalue of the matrix $M$. Set $\ell_k = 0$.

Step 1. (Backtracking) If the integer $\ell_k$ satisfies the conditions

$$F_{\ell 2}(w_k + \bar{\alpha}_k\beta^{\ell_k}\Delta w_k, \mu) \le F_{\ell 2}(w_k, \mu) + \varepsilon_0\bar{\alpha}_k\beta^{\ell_k}D(F_{\ell 2}(w_k, \mu); \Delta w_k) \qquad (3.26)$$

and

$$X(x_k + \bar{\alpha}_k\beta^{\ell_k}\Delta x_k) \succ 0, \qquad (3.27)$$

then we set $\alpha_k = \bar{\alpha}_k\beta^{\ell_k}$ and stop.

Step 2. (Update) Set $\ell_k := \ell_k + 1$ and return to Step 1.
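Algorithm LS translates almost line by line into code. The sketch below makes explicit one choice the text leaves open: when $\lambda_{\min}(X_k^{-1}\Delta X_k)$ or $\lambda_{\min}(Z_k^{-1}\Delta Z_k)$ is nonnegative, the corresponding ratio imposes no restriction on the step and is treated here as $+\infty$. The callables merit and X_of, and the default parameter values, are assumptions of this sketch, not the paper's notation:

```python
import numpy as np

def initial_step(X, dX, Z, dZ, gamma, X_is_linear):
    """Step 0 of Algorithm LS: alpha_bar = min{alpha_xbar, alpha_zbar, 1}."""
    def ratio(M, dM):
        # the eigenvalues of M^{-1} dM are real, since it is similar to
        # the symmetric matrix M^{-1/2} dM M^{-1/2}
        lam_min = np.min(np.linalg.eigvals(np.linalg.solve(M, dM)).real)
        return -gamma / lam_min if lam_min < 0 else np.inf
    a_x = ratio(X, dX) if X_is_linear else 1.0   # (3.27) guards nonlinear X
    a_z = ratio(Z, dZ)
    return min(a_x, a_z, 1.0)

def line_search(merit, X_of, w, dw, Dphi, alpha_bar, beta=0.5, eps0=1e-4):
    """Steps 1-2 of Algorithm LS: backtrack until the Armijo test (3.26)
    and the interiority test (3.27) hold; Dphi = D(F_l2(w, mu); dw) < 0.
    Z stays positive definite for any step <= alpha_bar by the choice of
    alpha_zbar, so only X(x) needs to be re-checked."""
    x, y, Z = w
    dx, dy, dZ = dw
    F0 = merit(x, y, Z)
    step = alpha_bar
    while True:
        xn = x + step * dx
        interior = np.min(np.linalg.eigvalsh(X_of(xn))) > 0      # (3.27)
        if interior and (merit(xn, y + step * dy, Z + step * dZ)
                         <= F0 + eps0 * step * Dphi):            # (3.26)
            return step
        step *= beta
```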

Since the functions $f$, $g$ and $X$ are sufficiently smooth, there exists a step size $\alpha_k \neq 0$ which satisfies (3.26) and (3.27) at each $k$. Thus, Algorithm LS terminates in a finite number of iterations. Now we propose Algorithm SDPLS, which finds an approximate SBKKT point.

Algorithm SDPLS

Step 0. (Initialization) Give an initial interior point $w_0$, the fixed barrier parameter $\mu > 0$ and a parameter $M_c > 0$. Set $k = 0$.

Step 1. (Termination) If $\|r(w_k, \mu)\|_* \le M_c\mu$, then stop.

Step 2. (Newton direction) Calculate the matrix $G_k$ and the scaling matrix $T_k$. Determine a search direction $\Delta w_k$ by (3.5) – (3.7).

Step 3. (Step size) Find a step size $\alpha_k$ by Algorithm LS.

Step 4. (Update) Set $w_{k+1} = w_k + \alpha_k\Delta w_k$ and $k := k + 1$. Return to Step 1.

It is notable that, unlike the interior point method in [21], we can use a common step size $\alpha_k$ for all variables $(x, y, Z)$ in Step 4.
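Putting the pieces together, the inner iteration has the following shape; the helper callables mirror the earlier sketches and are assumptions of this skeleton:

```python
def sdpls(w0, mu, Mc, res_norm, direction, step_size, max_iter=200):
    """Skeleton of Algorithm SDPLS: res_norm(w, mu) = ||r(w, mu)||_*,
    direction(w, mu) returns (dx, dy, dZ) via (3.5)-(3.7), and
    step_size(w, dw, mu) runs Algorithm LS."""
    x, y, Z = w0
    for _ in range(max_iter):
        if res_norm((x, y, Z), mu) <= Mc * mu:           # Step 1
            break
        dx, dy, dZ = direction((x, y, Z), mu)            # Step 2
        a = step_size((x, y, Z), (dx, dy, dZ), mu)       # Step 3
        x, y, Z = x + a * dx, y + a * dy, Z + a * dZ     # Step 4: one common step
    return x, y, Z
```

The single scalar a scaling all three blocks in Step 4 is exactly the common step size noted above.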

4 Global convergence to an SBKKT point

This section shows the global convergence property of Algorithm SDPLS.

Since $D(F_{\ell 2}(w_k, \mu); \Delta w_k) = 0$ means that $w_k$ is an SBKKT point by Lemma 3.7 (ii), we assume that Algorithm SDPLS generates an infinite sequence $\{w_k\}$, i.e., we assume $D(F_{\ell 2}(w_k, \mu); \Delta w_k) \neq 0$ for any $k \ge 0$. To prove global convergence, we make the following assumptions.

Assumptions

(A1) The functions $f$, $g_i$, $i = 1, \ldots, m$, and $X$ are twice continuously differentiable.

(A2) The sequence $\{x_k\}$ generated by Algorithm SDPLS remains in a compact set $\Omega$ of $\mathbb{R}^n$.

(A3) The matrix $G_k + H_k + \frac{1}{\mu}\nabla g(x_k)\nabla g(x_k)^T$ is uniformly positive definite, and the matrix $G_k + H_k$ is uniformly bounded.

(A4) The scaling matrix $T_k$ is chosen such that $\tilde{X}_k$ and $\tilde{Z}_k$ commute, and both of the sequences $\{T_k\}$ and $\{T_k^{-1}\}$ are bounded.

Assumption (A3) means that there exists a positive constant $\vartheta$ such that

$$\vartheta\|v\|^2 \le v^T\left\{G_k + H_k + \frac{1}{\mu}\nabla g(x_k)\nabla g(x_k)^T\right\}v$$

holds for all $k \ge 0$ and any $v \in \mathbb{R}^n$. The interior point method in [21] assumed that the matrix $G_k + H_k$ was uniformly positive definite and bounded, and that for all $x_k$ in $\Omega$, the matrix $\nabla g(x_k)$ was of full rank and the matrices $A_1(x_k), \ldots, A_n(x_k)$ were linearly independent. Assumption (A3) is much weaker than that. Moreover, the interior point method in [21] supposed that the penalty parameter $\rho$ was sufficiently large so that $\rho > \|y_k + \Delta y_k\|_\infty$ held for all $k$, while our method does not need such an assumption.

The following lemma shows the boundedness of the sequence $\{\Delta w_k\}$.

Lemma 4.1. Suppose that assumptions (A1) – (A3) hold. Let the sequence $\{w_k\}$ be generated by Algorithm SDPLS. Then the following hold:

(i) $\liminf_{k\to\infty}\det(X_k) > 0$ and $\liminf_{k\to\infty}\det(Z_k) > 0$.

(ii) The sequence $\{w_k\}$ is bounded.

Moreover, if assumption (A4) is satisfied, the following holds:

(iii) The sequence $\{\Delta w_k\}$ is bounded.

Proof. (i) The logarithmic function term in (3.8) guarantees that all the eigenvalues $(\lambda_k)_i$ and $(\tau_k)_i$ $(i = 1, \cdots, p)$ are bounded away from zero at each $k$.

(ii) Lemma 3.7 (i) and condition (3.26) imply that the sequence $\{F_{\ell 2}(w_k, \mu)\}$ is monotonically decreasing, i.e., $F_{\ell 2}(w_0, \mu) \ge F_{\ell 2}(w_k, \mu)$. Since assumptions (A1) and (A2) hold, there exists a constant $c$ such that $F_{BP2}(x_k, \mu) \ge c$ for all $k \ge 0$. Then we obtain

$$F_{\ell 2}(w_0, \mu) \ge F_{\ell 2}(w_k, \mu) \ge F_{BP2}(x_k, \mu) + \frac{\nu}{2}\|g(x_k) + \mu y_k\|^2 \ge F_{BP2}(x_k, \mu) \ge c,$$

which implies that the sequences $\{y_k\}$ and $\{F_{\ell 2}(w_k, \mu)\}$ are bounded. Next we prove the boundedness of $\{Z_k\}$ by contradiction. Suppose that there exists a sequence $\{(\tau_j)_k\}$ of the maximum eigenvalues of $Z_k$ such that

$$\lim_{k\to\infty}(\tau_j)_k = \infty.$$

It follows from (3.9), (3.11) and the arithmetic-geometric mean inequality that

$$\begin{aligned} F_{PD2}(w_k, \mu) &\ge \log\frac{\langle X_k, Z_k \rangle/p + \|Z_k^{\frac12}X_kZ_k^{\frac12} - \mu I\|_F^2}{\{\det(X_kZ_k)\}^{1/p}} \ge \log\frac{\langle X_k, Z_k \rangle/p + \|Z_k^{\frac12}X_kZ_k^{\frac12} - \mu I\|_F^2}{\langle X_k, Z_k \rangle/p} \\ &= \log\frac{\sum_{i=1}^p(\lambda_i)_k^2(\tau_i)_k^2 + \frac{1 - 2p\mu}{p}\sum_{i=1}^p(\lambda_i)_k(\tau_i)_k + p\mu^2}{\frac{1}{p}\sum_{i=1}^p(\lambda_i)_k(\tau_i)_k} = \log(\bar{F}_1(w_k, \mu) + \bar{F}_2(w_k, \mu)), \qquad (4.1) \end{aligned}$$

where $\bar{F}_1(w_k, \mu)$ and $\bar{F}_2(w_k, \mu)$ are defined by

$$\bar{F}_1(w_k, \mu) = \frac{(\lambda_j)_k^2(\tau_j)_k^2 + \frac{1 - 2p\mu}{p}(\lambda_j)_k(\tau_j)_k}{\frac{1}{p}(\lambda_j)_k(\tau_j)_k + \frac{1}{p}\sum_{i\neq j}(\lambda_i)_k(\tau_i)_k}$$

and

$$\bar{F}_2(w_k, \mu) = \frac{\sum_{i\neq j}(\lambda_i)_k^2(\tau_i)_k^2 + \frac{1 - 2p\mu}{p}\sum_{i\neq j}(\lambda_i)_k(\tau_i)_k + p\mu^2}{\frac{1}{p}(\lambda_j)_k(\tau_j)_k + \frac{1}{p}\sum_{i\neq j}(\lambda_i)_k(\tau_i)_k}.$$

Dividing the numerators and denominators of $\bar{F}_1(w_k, \mu)$ and $\bar{F}_2(w_k, \mu)$ by $(\tau_j)_k$ and taking the limit as $k \to \infty$, we have from assumption (A2) that

$$\frac{(\lambda_j)_k^2(\tau_j)_k + \frac{1 - 2p\mu}{p}(\lambda_j)_k}{\frac{1}{p}(\lambda_j)_k + \frac{1}{p}\sum_{i\neq j}\frac{(\lambda_i)_k(\tau_i)_k}{(\tau_j)_k}} \to \infty \quad\text{and}\quad \frac{\sum_{i\neq j}\frac{(\lambda_i)_k^2(\tau_i)_k^2}{(\tau_j)_k} + \frac{1 - 2p\mu}{p}\sum_{i\neq j}\frac{(\lambda_i)_k(\tau_i)_k}{(\tau_j)_k} + \frac{p\mu^2}{(\tau_j)_k}}{\frac{1}{p}(\lambda_j)_k + \frac{1}{p}\sum_{i\neq j}\frac{(\lambda_i)_k(\tau_i)_k}{(\tau_j)_k}} \to +0,$$

which imply by (4.1) that $F_{\ell 2}(w_k, \mu) \to \infty$. This contradicts the boundedness of $\{F_{\ell 2}(w_k, \mu)\}$. Therefore, $\{Z_k\}$ is bounded.

(iii) By assumptions (A2) and (A3), the matrix $G_k + H_k + \frac{1}{\mu}\nabla g(x_k)\nabla g(x_k)^T$ is uniformly bounded. Thus we conclude from the result (ii) and (3.5), (3.6) and (3.7) that $\Delta w_k$ is uniformly bounded. □

Now we prove the global convergence property of an infinite sequence $\{w_k\}$ generated by Algorithm SDPLS.

Theorem 4.1. Suppose that assumptions (A1) – (A4) hold. Let $\{w_k\}$ be the sequence generated by Algorithm SDPLS. Then there exists at least one accumulation point of $\{w_k\}$, and any accumulation point of the sequence $\{w_k\}$ is an SBKKT point.

Proof. By Lemma 4.1 (ii), the sequence $\{w_k\}$ has at least one accumulation point. From Lemma 3.6 and assumption (A3), we have

$$D(F_{\ell 2}(w_k, \mu); \Delta w_k) \le -\vartheta\|\Delta x_k\|^2 + \nu D(F_{PD2}(w_k, \mu); \Delta w_k) < 0, \qquad (4.2)$$

and from (3.26) and (4.2),

$$\begin{aligned} F_{\ell 2}(w_{k+1}, \mu) - F_{\ell 2}(w_k, \mu) &\le \varepsilon_0\bar{\alpha}_k\beta^{\ell_k}D(F_{\ell 2}(w_k, \mu); \Delta w_k) \\ &\le -\varepsilon_0\bar{\alpha}_k\beta^{\ell_k}\{\vartheta\|\Delta x_k\|^2 - \nu D(F_{PD2}(w_k, \mu); \Delta w_k)\} \\ &\le \varepsilon_0\bar{\alpha}_k\beta^{\ell_k}\nu D(F_{PD2}(w_k, \mu); \Delta w_k) < 0. \qquad (4.3) \end{aligned}$$

Since the sequence $\{F_{\ell 2}(w_k, \mu)\}$ is monotonically decreasing and bounded below by the proof of Lemma 4.1 (ii), the left-hand side of (4.3) converges to 0. From Lemma 4.1 (i), we have $\liminf_{k\to\infty}\bar{\alpha}_k > 0$, which yields by (4.3)

$$\lim_{k\to\infty}\beta^{\ell_k}D(F_{\ell 2}(w_k, \mu); \Delta w_k) = 0.$$

We will show

$$\lim_{k\to\infty}D(F_{\ell 2}(w_k, \mu); \Delta w_k) = 0. \qquad (4.4)$$

For this purpose, we suppose that there exist an infinite subsequence $K \subset \{0, 1, \cdots\}$ and a positive constant $\delta$ such that

$$|D(F_{\ell 2}(w_k, \mu); \Delta w_k)| \ge \delta > 0 \quad \forall k \in K, \qquad (4.5)$$

which implies

$$\lim_{k\to\infty,\ k\in K}\beta^{\ell_k} = 0.$$

Thus we can assume that $\ell_k > 0$ for sufficiently large $k \in K$ without loss of generality. Since the point $w_k + \alpha_k\Delta w_k/\beta$ does not satisfy condition (3.26), we have

$$F_{\ell 2}(w_k + \alpha_k\Delta w_k/\beta, \mu) - F_{\ell 2}(w_k, \mu) > \varepsilon_0\alpha_kD(F_{\ell 2}(w_k, \mu); \Delta w_k)/\beta. \qquad (4.6)$$

By assumption (A1) and the mean value theorem, there exists a $\theta_k \in (0, 1)$ such that

$$F_{\ell 2}(w_k + \alpha_k\Delta w_k/\beta, \mu) - F_{\ell 2}(w_k, \mu) = \alpha_kD(F_{\ell 2}(w_k + \theta_k\alpha_k\Delta w_k/\beta, \mu); \Delta w_k)/\beta. \qquad (4.7)$$

Combining (4.6) and (4.7), we have

$$D(F_{\ell 2}(w_k + \theta_k\alpha_k\Delta w_k/\beta, \mu); \Delta w_k) > \varepsilon_0D(F_{\ell 2}(w_k, \mu); \Delta w_k).$$

This inequality yields

$$D(F_{\ell 2}(w_k + \theta_k\alpha_k\Delta w_k/\beta, \mu); \Delta w_k) - D(F_{\ell 2}(w_k, \mu); \Delta w_k) > -(1 - \varepsilon_0)D(F_{\ell 2}(w_k, \mu); \Delta w_k) > 0. \qquad (4.8)$$

By the property $\ell_k \to \infty$ $(k \in K)$ and Lemma 4.1 (iii), we have $\alpha_k \to 0$ and $\|\theta_k\alpha_k\Delta w_k/\beta\| \to 0$, $k \in K$. Thus the left-hand side of (4.8), and therefore $D(F_{\ell 2}(w_k, \mu); \Delta w_k)$, converges to 0 as $k \to \infty$, $k \in K$. This contradicts assumption (4.5). Therefore, we get (4.4).

It follows from Lemma 3.5 and (4.2) that

$$D(F_{\ell 2}(w_k, \mu); \Delta w_k) \le -\vartheta\|\Delta x_k\|^2 - \nu\frac{\|Z_k^{\frac12}X_kZ_k^{\frac12} - \mu I\|_F^2}{\langle X_k, Z_k \rangle/p + \|Z_k^{\frac12}X_kZ_k^{\frac12} - \mu I\|_F^2} - \nu\|g(x_k) + \mu y_k\|^2 < 0.$$

Since the boundedness of the sequence $\{w_k\}$ guarantees that the denominator in the above expression does not approach infinity, (4.4) implies that

$$\Delta x_k \to 0, \quad g(x_k) + \mu y_k \to 0, \quad X_kZ_k - \mu I \to 0,$$

which implies by (3.5), assumption (A3) and Lemma 3.4 that $\nabla F_{BP2}(x_k, \mu) \to 0$ and that any accumulation point of the sequence $\{w_k\}$ is an SBKKT point. □

5 Concluding remarks

In this paper, we have proposed a new merit function for a primal-dual interior point method for nonlinear SDP problems and have proved the global convergence of our method. Yamashita et al. [21] considered a nondifferentiable merit function in the $(x, Z)$-space, while we have dealt with a differentiable primal-dual merit function in the whole space by using the shifted barrier KKT conditions. The global convergence of the present method has been shown under weaker assumptions than those in [21].

References

[1] Bernstein, D.S.: Matrix Mathematics: Theory, Facts, and Formulas (Second Edition). Princeton University Press, Princeton (2009)

[2] Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)

[3] Correa, R., Ramirez, H.C.: A global algorithm for nonlinear semidefinite programming. SIAM Journal on Optimization. 15, 303–318 (2004)

[4] Drummond, L.M.G., Iusem, A.N., Svaiter, B.F.: On the central path for nonlinear semidefinite programming. RAIRO-Recherche Operationnelle-Operations Research. 34, 331–345 (2000)

[5] Fares, B., Apkarian, P., Noll, D.: An augmented Lagrangian method for a class of LMI-constrained problems in robust control theory. International Journal of Control. 74, 348–360 (2001)

[6] Freund, R.W., Jarre, F., Vogelbusch, C.H.: Nonlinear semidefinite programming: sensitivity, convergence, and an application in passive reduced-order modeling. Mathematical Programming. 109, 581–611 (2007)

[7] Gomez, W., Ramirez, H.: A filter algorithm for nonlinear semidefinite programming. Computational and Applied Mathematics. 29, 297–328 (2010)

[8] Helmberg, C., Rendl, F., Vanderbei, R.J., Wolkowicz, H.: An interior point method for semidefinite programming. SIAM Journal on Optimization. 6, 342–361 (1996)

[9] Huang, X.X., Yang, X.Q., Teo, K.L.: A sequential quadratic penalty method for nonlinear semidefinite programming. Optimization. 52, 715–738 (2003)

[10] Huang, X.X., Teo, K.L., Yang, X.Q.: Approximate augmented Lagrangian functions and nonlinear semidefinite programs. ACTA Mathematica Sinica-English Series. 22, 1283–1296 (2006)

[11] Huang, X.X., Yang, X.Q., Teo, K.L.: Lower-order penalization approach to nonlinear semidefinite programming. Journal of Optimization Theory and Applications. 132, 1–20 (2007)

[12] Kanzow, C., Nagel, C., Kato, H., Fukushima, M.: Successive linearization methods for nonlinear semidefinite programs. Computational Optimization and Applications. 31, 251–273 (2005)

[13] Kocvara, M., Stingl, M.: PENNON: a code for convex nonlinear and semidefinite programming. Optimization Methods and Software. 18, 317–333 (2003)

[14] Kojima, M., Shindoh, S., Hara, S.: Interior point methods for the monotone semidefinite linear complementarity problem in symmetric matrices. SIAM Journal on Optimization. 7, 86–125 (1997)

[15] Leibfritz, F., Mostafa, E.M.E.: An interior point constrained trust region method for a special class of nonlinear semidefinite programming problems. SIAM Journal on Optimization. 12, 1048–1074 (2002)

[16] Monteiro, R.D.C.: Primal-dual path-following algorithms for semidefinite programming. SIAM Journal on Optimization. 7, 663–678 (1997)

[17] Stingl, M.: On the Solution of Nonlinear Semidefinite Programs by Augmented Lagrangian Methods. Doctoral Thesis, University of Erlangen (2005)

[18] Todd, M.J., Toh, K.C., Tütüncü, R.H.: On the Nesterov-Todd direction in semidefinite programming. SIAM Journal on Optimization. 8, 769–796 (1998)

[19] Yamashita, H.: A globally convergent primal-dual interior point method for constrained optimization. Optimization Methods and Software. 10, 443–469 (1998)

[20] Yamashita, H., Yabe, H.: An interior point method with a primal-dual quadratic barrier penalty function for nonlinear optimization. SIAM Journal on Optimization. 14, 479–499 (2003)

[21] Yamashita, H., Yabe, H., Harada, K.: A primal-dual interior point method for nonlinear semidefinite programming. To appear in Mathematical Programming.
