Solving Variational Problems via Smoothing-Nonsmooth Reformulations ∗

Defeng Sun and Liqun Qi

School of Mathematics, The University of New South Wales, Sydney 2052, Australia

Email: (Defeng Sun) [email protected]; (Liqun Qi) L.Qi@unsw.edu.au

February 5, 1999; Revised, September 21, 1999

Abstract

It has long been known that variational inequality problems can be reformulated as nonsmooth equations. Recently, locally high-order convergent Newton methods for nonsmooth equations have been well established via the concept of semismoothness. When the constraint set of the variational inequality problem is a rectangle, several locally convergent Newton methods for the reformulated nonsmooth equations can also be globalized. In this paper, our main aim is to provide globally and locally high-order convergent Newton methods for solving variational inequality problems with general constraints. To achieve this, we first prove via convolution that these nonsmooth equations can be well approximated by smooth equations, which have desirable properties for the design of Newton methods. We then reformulate the variational inequality problems as equivalent smoothing-nonsmooth equations and apply Newton-type methods to solve the latter systems, and so the variational inequality problems. Stronger convergence results are obtained.

1 Introduction

Constructing smoothing functions to approximate nonsmooth functions has a long history in the mathematical programming field. In this paper we restrict our study to the smoothing functions of those nonsmooth functions arising from variational inequality problems. The variational inequality problem (VIP for abbreviation) is to find x* ∈ X such that

(x − x*)ᵀF(x*) ≥ 0 for all x ∈ X, (1.1)

∗The main results of this paper were presented at the International Conference on Nonlinear Opti- mization and Variational Inequalities, Hong Kong, December 15–18, 1998. This work is supported by the Australian Research Council.

where X is a nonempty closed convex subset of ℜⁿ and F : D → ℜⁿ is a continuous function defined on an open set D containing X. The VIP can be reformulated as the nonsmooth equation

W(x) := x − Π_X(x − F(x)) = 0, x ∈ D, (1.3)

where Π_X denotes the Euclidean projection operator onto X, and as the normal map equation

E(y) := F(Π_X(y)) + y − Π_X(y) = 0 (1.4)

in the sense that if y* ∈ ℜⁿ is a solution of (1.4) then x* := Π_X(y*) is a solution of (1.1), and conversely, if x* is a solution of (1.1) then y* := x* − F(x*) is a solution of (1.4) [33]. Both (1.3) and (1.4) are nonsmooth equations and have led to various generalized Newton methods under semismoothness assumptions [32, 26, 23]. The difference between (1.3) and (1.4) is that W is only defined on D, and may have no definition outside D if F is not well defined there, while E is defined on all of ℜⁿ.

2 Preliminaries

2.1 Generalized Jacobians

Suppose that H : ℜⁿ → ℜᵐ is locally Lipschitz continuous. By Rademacher's theorem, H is differentiable almost everywhere, and Clarke's generalized Jacobian ∂H(x) [4] is well defined at every x ∈ ℜⁿ. For u ∈ ℜⁿ and v ∈ ℜᵐ, let H⁰(x; u, v) := (vᵀH)⁰(x; u) denote Clarke's generalized directional derivative of vᵀH at x in the direction u.

For a compact set of matrices A ⊆ ℜ^{m×n}, the plenary hull of A is defined by

plen A := {M ∈ ℜ^{m×n} | vᵀMu ≤ max_{A∈A} vᵀAu for all u ∈ ℜⁿ and v ∈ ℜᵐ}.

In [15], Hiriart-Urruty proved the following useful results.

Theorem 2.1 (i) H⁰(x; u, v) = max_{V∈∂H(x)} vᵀVu.

(ii) C_H(x) = plen ∂H(x), where C_H(x) := {V ∈ ℜ^{m×n} | vᵀVu ≤ H⁰(x; u, v) for all u ∈ ℜⁿ and v ∈ ℜᵐ}.

2.2 Convolution

Let Φ : ℜⁿ → ℜ₊ be a kernel function, i.e., Φ is integrable (in the sense of Lebesgue) and

∫_{ℜⁿ} Φ(x) dx = 1.

Define Θ : ℜ₊₊ × ℜⁿ → ℜ₊ by

Θ(ε, x) := ε⁻ⁿΦ(ε⁻¹x),

where (ε, x) ∈ ℜ₊₊ × ℜⁿ. Then a smoothing approximation of H via convolution can be described by

G(ε, x) := ∫ H(x − y)Θ(ε, y) dy = ∫ H(x − εy)Φ(y) dy = ∫ H(y)Θ(ε, x − y) dy, (2.2)

where (ε, x) ∈ ℜ₊₊ × ℜⁿ. In order to make (2.2) meaningful, we need some assumptions on the kernel function Φ, which will be addressed in the next section. For some discussions of G when H is bounded and uniformly continuous, see [27]. For convenience of discussion, we always define

G(0, x) := H(x) and, for any ε < 0, G(ε, x) := G(−ε, x), x ∈ ℜⁿ.
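As a concrete illustration (ours, not from the paper), the convolution (2.2) can be evaluated numerically. The sketch below takes H(t) = max(0, t) in one dimension and the uniform kernel Φ ≡ 1 on [−1/2, 1/2] (a bounded-support kernel); G(ε, ·) is then smooth and converges to H pointwise as ε ↓ 0. The quadrature size and test points are arbitrary choices.

```python
import numpy as np

def smooth_via_convolution(H, eps, x, n=2000):
    # G(eps, x) = int H(x - eps*y) Phi(y) dy with the uniform kernel
    # Phi(y) = 1 on [-1/2, 1/2]; evaluated by the trapezoid rule.
    y = np.linspace(-0.5, 0.5, n + 1)
    v = H(x - eps * y)
    return float((v[:-1] + v[1:]).sum() / (2 * n))

H = lambda t: np.maximum(0.0, t)    # the plus function

# G(eps, .) smooths the kink at 0 and approaches H as eps shrinks:
for eps in (1.0, 0.1, 0.01):
    print(eps, smooth_via_convolution(H, eps, 0.0), smooth_via_convolution(H, eps, 0.3))
```

For this kernel, G(ε, t) admits the closed form G(ε, t) = (t + ε/2)²/(2ε) for |t| ≤ ε/2 and G(ε, t) = max(0, t) otherwise, which the quadrature reproduces.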

2.3 Jacobian Characterizations of the Projection Operator ΠX

The following properties about the projection operator ΠX are well known [42].

Proposition 2.1 For any x, y ∈ ℜⁿ:

(i) ‖Π_X(y) − Π_X(x)‖ ≤ ‖y − x‖.

(ii) (y − x)ᵀ(Π_X(y) − Π_X(x)) ≥ ‖Π_X(y) − Π_X(x)‖².

Part (i) of Proposition 2.1 says that Π_X is a nonexpansive, globally Lipschitz continuous operator, while part (ii) implies that Π_X is a monotone operator. Then Clarke's generalized Jacobian ∂Π_X is well defined everywhere by Rademacher's theorem, and for any x ∈ ℜⁿ, all S ∈ ∂Π_X(x) are positive semidefinite [17]. The next theorem, which gives a partial answer to an open question posed by Hiriart-Urruty [15], summarizes some important properties of ∂Π_X.

Theorem 2.2 For any x ∈ ℜⁿ, all S ∈ ∂Π_X(x) are symmetric, positive semidefinite and satisfy ‖S‖ ≤ 1.

Proof: In view of the arguments before this theorem, we only need to prove that all S ∈ ∂Π_X(x) are symmetric. Define φ : ℜⁿ → ℜ by

φ(y) = ½(‖y‖² − ‖y − Π_X(y)‖²), y ∈ ℜⁿ.

Then φ is continuously differentiable with

∇φ(y) = Π_X(y).

Thus, if Π_X is differentiable at some point y, we have

Π_X′(y) = ∇²φ(y),

which, according to [22, 3.3.4], proves that Π_X′(y) is symmetric. This, by the definition of ∂Π_X, in fact proves that all S ∈ ∂Π_X(x) are symmetric.

We will see subsequently that the symmetric property of all S ∈ ∂ΠX (x) plays an essential role in our analysis.
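As a numerical sanity check of Theorem 2.2 (our illustration; the choice of X and the finite-difference step are arbitrary), take X the Euclidean unit ball, where Π_X(x) = x/‖x‖ for ‖x‖ > 1 is differentiable, and verify symmetry and the spectral bounds of its Jacobian:

```python
import numpy as np

def proj_ball(x):
    # Projection onto the Euclidean unit ball X = {x : ||x|| <= 1}.
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

def jacobian_fd(f, x, h=1e-6):
    # Central finite-difference Jacobian of f at x.
    n = x.size
    J = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n); e[i] = h
        J[:, i] = (f(x + e) - f(x - e)) / (2 * h)
    return J

x = np.array([2.0, -1.0, 0.5])     # a point outside X, where Pi_X is smooth
S = jacobian_fd(proj_ball, x)

# Theorem 2.2: S is symmetric, positive semidefinite, with ||S|| <= 1.
print(np.allclose(S, S.T, atol=1e-5))
eig = np.linalg.eigvalsh((S + S.T) / 2)
print(eig.min(), eig.max())
```

Here S = (I − uuᵀ)/‖x‖ with u = x/‖x‖, so its eigenvalues are 0 and 1/‖x‖, all inside [0, 1].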

2.4 Quasi P₀-matrix and quasi P-matrix

A matrix A ∈ ℜⁿˣⁿ is called a P₀-matrix (P-matrix) if every one of its principal minors is nonnegative (positive). Here, we introduce some generalizations of P₀-matrices and P-matrices in order to exploit the properties of the generalized Jacobians of the projection operator Π_X.

Definition 2.1 A matrix A ∈ ℜⁿˣⁿ is called a quasi P₀-matrix (quasi P-matrix) if there exists an orthogonal matrix U ∈ ℜⁿˣⁿ such that UAUᵀ is a P₀-matrix (P-matrix).

It is obvious that any P₀-matrix (P-matrix) is a quasi P₀-matrix (quasi P-matrix). Any quasi P-matrix is a quasi P₀-matrix, and any quasi P-matrix is nonsingular. If A is a quasi P₀-matrix, then for any ε > 0, B := A + εI is a quasi P-matrix, where I is the identity matrix. We will see later that the concepts of quasi P₀-matrix and quasi P-matrix are useful in analyzing the nonsingularity of the generalized Jacobians considered in this paper.
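For small matrices, the P₀/P property and the remark above can be tested directly by enumerating principal minors (our sketch; the sample matrix is an arbitrary choice):

```python
from itertools import combinations
import numpy as np

def principal_minors(A):
    # Yield every principal minor of the square matrix A.
    n = A.shape[0]
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            yield np.linalg.det(A[np.ix_(idx, idx)])

def is_P0(A):   # every principal minor nonnegative
    return all(m >= -1e-12 for m in principal_minors(A))

def is_P(A):    # every principal minor positive
    return all(m > 1e-12 for m in principal_minors(A))

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])           # a P0-matrix that is not a P-matrix
print(is_P0(A), is_P(A))

# A P0-matrix plus eps*I is a P-matrix; since U(A + eps I)U^T = UAU^T + eps I,
# the same holds in the "quasi" sense of Definition 2.1.
print(is_P(A + 0.5 * np.eye(2)))
```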

Theorem 2.3 Suppose that S ∈ ∂Π_X[x − F(x)], x ∈ ℜⁿ. Then there exists an orthogonal matrix U ∈ ℜⁿˣⁿ such that Σ := USUᵀ is a diagonal matrix with 0 ≤ Σᵢᵢ ≤ 1, i ∈ {1, 2, ···, n}. Moreover, if UF′(x)Uᵀ is a P₀-matrix (P-matrix), then V := I − S(I − F′(x)) is a quasi P₀-matrix (quasi P-matrix).

Proof: By Theorem 2.2 we know that there exists an orthogonal matrix U such that Σ := USUᵀ is a diagonal matrix with 0 ≤ Σᵢᵢ ≤ 1, i ∈ {1, 2, ···, n}. Then

UVUᵀ = (I − Σ) + Σ(UF′(x)Uᵀ).

Since 0 ≤ Σᵢᵢ ≤ 1, i ∈ {1, 2, ···, n}, and UF′(x)Uᵀ is a P₀-matrix (P-matrix), UVUᵀ is a P₀-matrix (P-matrix) as well. This, by Definition 2.1, completes our proof.

In [7], Facchinei and Pang introduced the concept of a generalized P₀-function. Suppose that X is the Cartesian product of m (with m ≥ 1) lower dimensional sets:

X := ∏_{j=1}^m X^j, (2.3)

with each X^j being a nonempty closed convex subset of ℜ^{n_j} and ∑_{j=1}^m n_j = n. Correspondingly, suppose that both the variable x and the function F(x) are partitioned in the following way:

x = (x¹, x², ···, x^m) and F(x) = (F¹(x), F²(x), ···, F^m(x)), (2.4)

where for each j, both x^j and F^j(x) belong to ℜ^{n_j}. F is said to be a generalized P₀-function on a set D of the structure (2.3) if for every pair of distinct points x, y ∈ D there exists an index j₀ such that

x^{j₀} ≠ y^{j₀} and (x^{j₀} − y^{j₀})ᵀ(F^{j₀}(x) − F^{j₀}(y)) ≥ 0.

(Note that this definition is the one given in [7] if D = X. The above presentation is more accurate than that in [7]). If F is a generalized P0-function on any D ∈ L(X), we say that F is a generalized P0-function on L(X).

Corollary 2.1 If F is a continuously differentiable generalized P₀-function on L(X) and x ∈ ℜⁿ, then for any S ∈ ∂Π_X[x − F(x)], V := I − S[I − F′(x)] is a quasi P₀-matrix.

Proof: According to the Cartesian structure (2.3) of X, for any S ∈ ∂Π_X[x − F(x)] there exist m matrices

Sᵢ ∈ ∂Π_{X^i}[(x − F(x))^i], i ∈ {1, 2, ···, m},

such that

S = diag(S₁, S₂, ···, S_m),

and, by Theorem 2.2, for each i ∈ {1, 2, ···, m} there exists an orthogonal matrix Uᵢ such that Σᵢ := UᵢSᵢUᵢᵀ is a diagonal matrix with 0 ≤ (Σᵢ)ⱼⱼ ≤ 1, j ∈ {1, 2, ···, nᵢ}. Define

U := diag(U₁, U₂, ···, U_m) and Σ := diag(Σ₁, Σ₂, ···, Σ_m).

Then U is an orthogonal matrix and S = UᵀΣU.

Since F is a generalized P₀-function on L(X), it is easy to show that for any nonzero vector h ∈ ℜⁿ there exists a block index i such that, with d := Uh and Q := UF′(x)Uᵀ,

d^i ≠ 0 and (d^i)ᵀ(Qd)^i ≥ 0.

Then there exists an index j(i) such that d^i_{j(i)} ≠ 0 and d^i_{j(i)}(Qd)^i_{j(i)} ≥ 0, which implies that Q is a P₀-matrix. Thus, by Theorem 2.3, we get the desired result.

Corollary 2.2 If X is a rectangle and, for x ∈ ℜⁿ, F′(x) is a P₀-matrix (P-matrix), or if F′(x) is positive semidefinite (positive definite), then for any S ∈ ∂Π_X[x − F(x)], V := I − S[I − F′(x)] is a quasi P₀-matrix (quasi P-matrix).

Actually, when X is a rectangle and F is a P₀-function (P-function), Ravindran and Gowda [34] have proved that W and E are P₀-functions (P-functions). See [12] for more discussions of this topic. One might expect that if F is monotone, then W or E must be monotone as well. This is, however, not true, as can be seen clearly from the following example. Consider the NCP with

F(x) = Mx, M = [ 2 0 ; 3 2 ], x ∈ ℜ²,

and X = ℜ²₊. Then

V := [ 1 0 ; 3 2 ] ∈ ∂W(0).

Clearly, V is not positive semidefinite, and so the function W defined in (1.3) is not monotone even though F itself is strongly monotone. It is also noted that if X is not a rectangle and, for some x ∈ ℜⁿ, F′(x) is only a P₀-matrix, then V ∈ ∂W(x) may fail to be a quasi P₀-matrix, as the following example demonstrates. Consider the VIP with

F(x) = Mx, M = [ 1 0 ; 20 1 ], x ∈ ℜ²,

and X = {x ∈ ℜ² | x₁ + x₂ = 0}. Note that M is a P-matrix, but is not a positive semidefinite matrix. Then the function W defined in (1.3) is continuously differentiable with

W′(0, 0) = [ −9 0 ; 10 1 ].

The matrix W′(0, 0) is not a quasi P₀-matrix, because otherwise T := W′(0, 0) + diag(1, 1) would be a quasi P-matrix and hence det(T) > 0; the latter is not true since det(T) = −16 < 0.

Next, we discuss a result related to level sets of the function W defined in (1.3).
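The second example can be verified numerically (our check). On X = {x ∈ ℜ² | x₁ + x₂ = 0} the projection is the linear map I − ½J, with J the all-ones matrix, so W is linear and W′(0, 0) can be computed exactly:

```python
import numpy as np

J = np.ones((2, 2))
P_X = np.eye(2) - 0.5 * J            # projection onto {x : x1 + x2 = 0}
M = np.array([[1.0, 0.0],
              [20.0, 1.0]])          # a P-matrix, not positive semidefinite

# W(x) = x - Pi_X(x - Mx) is linear here, so W'(0,0) = I - P_X (I - M).
W_prime = np.eye(2) - P_X @ (np.eye(2) - M)
print(W_prime)

T = W_prime + np.diag([1.0, 1.0])
print(np.linalg.det(T))              # negative, so T cannot be a quasi P-matrix
```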

Theorem 2.4 Suppose that X is of the structure (2.3) and that F is a continuous generalized P₀-function on L(X). Suppose that H : ℜⁿ → ℜⁿ is continuous, H is partitioned in the same way as F in (2.4), and there exists a constant γ > 0 such that for any j ∈ {1, 2, ···, m} and y, x ∈ ℜⁿ,

(H^j(y) − H^j(x))ᵀ(y^j − x^j) ≥ γ‖y^j − x^j‖². (2.5)

Define

Y(x) := x − Π_X[x − (F(x) + H(x))], x ∈ ℜⁿ.

Then for any c ≥ 0, the following set

L_c := {x ∈ ℜⁿ | ‖Y(x)‖ ≤ c}

is bounded.

Proof: Suppose by contradiction that L_c is unbounded. Then there exists an unbounded sequence {x^k} with x^k ∈ L_c for all k. For each k, x^k can be partitioned as

x^k = ((x^k)¹, (x^k)², ···, (x^k)^m),

with (x^k)^j ∈ ℜ^{n_j}. By taking a subsequence if necessary, we may define

I^∞ := {j | {(x^k)^j} is unbounded} and I^b := {j | {(x^k)^j} is bounded}.

Then I^∞ is nonempty. Let a = (a¹, a², ···, a^m) be an arbitrary vector in X with a^j ∈ X^j, j ∈ {1, 2, ···, m}, and define a sequence {y^k} by

(y^k)^j := (x^k)^j if j ∈ I^b, and (y^k)^j := a^j + Y^j(x^k) if j ∈ I^∞.

Then {y^k} is bounded. Since F is a generalized P₀-function on L(X), there exists an index j_k ∈ I^∞ (note that (y^k)^j = (x^k)^j for all j ∈ I^b) such that (y^k)^{j_k} ≠ (x^k)^{j_k} and

[(y^k)^{j_k} − (x^k)^{j_k}]ᵀ(F^{j_k}(y^k) − F^{j_k}(x^k)) ≥ 0. (2.6)

Without loss of generality, assume that j_k ≡ j₀. Since

(x^k)^{j₀} − Y^{j₀}(x^k) = Π_{X^{j₀}}[(x^k)^{j₀} − (F^{j₀}(x^k) + H^{j₀}(x^k))],

by (ii) of Proposition 2.1 with y := a^{j₀} and x := (x^k)^{j₀} − (F^{j₀}(x^k) + H^{j₀}(x^k)), we have

(−Y^{j₀}(x^k) + F^{j₀}(x^k) + H^{j₀}(x^k))ᵀ[a^{j₀} − (x^k)^{j₀} + Y^{j₀}(x^k)] ≥ 0,

which, together with (2.6) and the fact that a^{j₀} − (x^k)^{j₀} + Y^{j₀}(x^k) = (y^k)^{j₀} − (x^k)^{j₀}, implies that

(H^{j₀}(x^k))ᵀ[(y^k)^{j₀} − (x^k)^{j₀}] ≥ (Y^{j₀}(x^k) − F^{j₀}(y^k))ᵀ[(y^k)^{j₀} − (x^k)^{j₀}].

This means that there exists a constant c₁ > 0 such that for all k sufficiently large,

(H^{j₀}(x^k))ᵀ[(y^k)^{j₀} − (x^k)^{j₀}] ≥ −c₁‖(y^k)^{j₀} − (x^k)^{j₀}‖.

Thus, for all k sufficiently large, we have

(H^{j₀}(x^k) − H^{j₀}(y^k))ᵀ[(y^k)^{j₀} − (x^k)^{j₀}] ≥ −(c₁ + ‖H^{j₀}(y^k)‖)‖(y^k)^{j₀} − (x^k)^{j₀}‖,

which contradicts (2.5) because {y^k} is bounded while ‖(y^k)^{j₀} − (x^k)^{j₀}‖ → ∞. This contradiction shows that L_c is bounded.

When F itself is strongly monotone, from Theorem 2.4 we have the following result, which appeared in [24] with a lengthy proof.

Corollary 2.3 Suppose that F is continuous and strongly monotone on ℜⁿ. Then for any c ≥ 0, the set

{x ∈ ℜⁿ | ‖W(x)‖ ≤ c}

is bounded.

All the results for W in this subsection carry over to E in a parallel way. Here we omit the details.

3 Smoothing approximations

In this section, unless otherwise stated, we suppose that H : ℜⁿ → ℜᵐ is a locally Lipschitz continuous function and that Φ is a kernel function as described in Subsection 2.2. The support of Φ is defined by

supp(Φ) := {y ∈ ℜⁿ | Φ(y) > 0}.

The argument Φ in supp(Φ) will be omitted if it is obvious from the context.

3.1 Supp(Φ) is bounded

In this subsection, we suppose that supp(Φ) is bounded. The first smoothing function we want to discuss is the Steklov averaged function. Define

Φ(y) = 1 if maxᵢ |yᵢ| ≤ 0.5, and Φ(y) = 0 otherwise. (3.1)

Then, for any ε > 0, the function G(ε, ·) defined in (2.2) is, according to [6], the Steklov averaged function of H:

G(ε, x) = (1/εⁿ) ∫_{x₁−ε/2}^{x₁+ε/2} ··· ∫_{xₙ−ε/2}^{xₙ+ε/2} H(y) dyₙ ··· dy₁, x ∈ ℜⁿ. (3.2)
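For intuition (our example, not from the paper), in one dimension the Steklov average (3.2) of H(t) = |t| has the closed form G(ε, t) = t²/ε + ε/4 for |t| ≤ ε/2 and G(ε, t) = |t| otherwise; the quadrature sketch below reproduces it:

```python
import numpy as np

def steklov_abs(eps, t, n=4000):
    # Steklov average of H(s) = |s|:
    # G(eps, t) = (1/eps) * int_{t - eps/2}^{t + eps/2} |s| ds  (trapezoid rule).
    s = np.linspace(t - eps / 2, t + eps / 2, n + 1)
    v = np.abs(s)
    return float((v[:-1] + v[1:]).sum() / (2 * n))

def closed_form(eps, t):
    return t * t / eps + eps / 4 if abs(t) <= eps / 2 else abs(t)

for (eps, t) in [(0.2, 0.0), (0.2, 0.05), (0.2, 1.0)]:
    print(steklov_abs(eps, t), closed_form(eps, t))
```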

Proposition 3.1 Suppose that H : ℜⁿ → ℜᵐ is continuous and that Φ is defined by (3.1). Then for any ε > 0, G(ε, ·) is continuously differentiable on ℜⁿ, with Jacobian given componentwise by

(G_j)′_x(ε, x) = (1/ε^{n−1}) Σ_{i=1}^n eᵢᵀ ∫_{x₁−ε/2}^{x₁+ε/2} ··· ∫_{x_{i−1}−ε/2}^{x_{i−1}+ε/2} ∫_{x_{i+1}−ε/2}^{x_{i+1}+ε/2} ··· ∫_{xₙ−ε/2}^{xₙ+ε/2} (1/ε)[H_j(y₁, ···, y_{i−1}, xᵢ + ε/2, y_{i+1}, ···, yₙ) − H_j(y₁, ···, y_{i−1}, xᵢ − ε/2, y_{i+1}, ···, yₙ)] dy₁ ··· dy_{i−1} dy_{i+1} ··· dyₙ, (3.3)

where j = 1, 2, ···, m and eᵢ is the ith unit coordinate vector, i = 1, 2, ···, n.

Proposition 3.2 (Sobolev [37], Schwartz [35]) If supp(Φ) is bounded and Φ is continuously differentiable, then for any ε > 0, G(ε, ·) is continuously differentiable and, for x ∈ ℜⁿ,

G′_x(ε, x) = ∫ H(y)Θ′_x(ε, x − y) dy. (3.4)

Recall that H is said to be semismooth at x if H is directionally differentiable at x and, for any V ∈ ∂H(x + h) with h → 0,

Vh − H′(x; h) = o(‖h‖);

H is said to be strongly semismooth at x if, in addition,

Vh − H′(x; h) = O(‖h‖²).

See [32, 26] for details about the discussion of semismoothness and strong semismoothness.

Theorem 3.1 If Φ is defined by (3.1), or if supp(Φ) is bounded and Φ is continuously differentiable, then we have:

(i) G(·, ·) is continuously differentiable on ℜ₊₊ × ℜⁿ.

(ii) G is locally Lipschitz continuous on ℜ × ℜⁿ.

(iii) For any x ∈ ℜⁿ, lim_{z→x, ε↓0} G′_x(ε, z) ⊆ plen ∂H(x).

(iv) For any x ∈ ℜⁿ,

π_x∂G(0, x) ⊆ plen ∂H(x) ⊆ ∂H₁(x) × ∂H₂(x) × ··· × ∂H_m(x), (3.6)

where

π_x∂G(0, x) := {A ∈ ℜ^{m×n} | there exists a ∈ ℜᵐ such that (a, A) ∈ ∂G(0, x)}.

(v) For any ε > 0 and x ∈ ℜⁿ, G′_x(ε, x) is positive semidefinite (positive definite) if H is monotone (strongly monotone) on ℜⁿ.

(vi) If m = n and there exists a continuously differentiable function f : ℜⁿ → ℜ such that ∇f(x) = H(x) for all x ∈ ℜⁿ, then for any ε > 0 and x ∈ ℜⁿ, G′_x(ε, x) is symmetric.

(vii) If H is semismooth at x, then for any (ε, d) ∈ ℜ₊₊ × ℜⁿ with (ε, d) → 0 we have

G(ε, x + d) − G(0, x) − G′(ε, x + d)(ε, d) = o(‖(ε, d)‖), (3.7)

and

G(ε, x + d) − G(0, x) − G′(ε, x + d)(ε, d) = O(‖(ε, d)‖²) (3.8)

if H is strongly semismooth at x. Here (ε, d) is regarded as a vector in ℜ^{1+n}.

Proof: (i) Similar to the proofs of Propositions 3.1 and 3.2, we can see that for any fixed x ∈ ℜⁿ, G(·, x) is continuously differentiable on ℜ₊. Moreover, by (3.3) and (3.4), we can see that for any ε > 0 and x ∈ ℜⁿ,

lim_{τ→ε, z→x} G′_x(τ, z) = G′_x(ε, x).

Then, by the definition, we can prove that G′(ε, x) exists for any (ε, x) ∈ ℜ₊₊ × ℜⁿ. The continuity of G′ follows from the continuous differentiability of G(·, x) and G(ε, ·).

(ii) Let (ε, x), (τ, z) be any two points in ℜ × ℜⁿ. Then

‖G(ε, x) − G(τ, z)‖ = ‖∫ [H(x − |ε|y) − H(z − |τ|y)]Φ(y) dy‖,

and the local Lipschitz continuity of G follows from the local Lipschitz continuity of H and the boundedness of supp(Φ).

(iii) For any u ∈ ℜⁿ and v ∈ ℜᵐ,

vᵀG′_x(ε, z)u = lim_{t↓0} vᵀ[G(ε, z + tu) − G(ε, z)]/t
= lim_{t↓0} ∫_supp vᵀ[H(z + tu − εy) − H(z − εy)]/t Φ(y) dy
≤ sup_{y∈supp} H⁰(z − εy; u, v).

Therefore,

vᵀG′_x(ε, z)u ≤ sup_{y∈supp} max_{V∈∂H(z−εy)} vᵀVu.

Since ∂H is upper semicontinuous, we have

lim sup_{z→x, ε↓0} vᵀG′_x(ε, z)u ≤ max_{V∈∂H(x)} vᵀVu.

Hence, by Theorem 2.1, for any matrix

A = lim_{z→x, ε↓0} G′_x(ε, z),

we have vᵀAu ≤ H⁰(x; u, v), which, by Theorem 2.1, implies A ∈ plen ∂H(x).

(iv) Define

B(x) := {B ∈ ℜ^{m×(1+n)} | B is a limit of Jacobians G′(εₖ, zₖ) at differentiable points (εₖ, zₖ) → (0, x)}.

Since, if G is differentiable at a point (0, y) ∈ ℜ × ℜⁿ, the x-part of its Jacobian equals H′(y), part (iii) and the fact that ∂G(0, x) is the convex hull of B(x) imply that

π_x∂G(0, x) ⊆ plen ∂H(x).

Since for each i, ∂Hᵢ(x) is compact and convex,

plen ∂H(x) ⊆ ∂H₁(x) × ∂H₂(x) × ··· × ∂H_m(x).

So, (3.6) is proved.

(v) First, from our assumptions there exists a nonnegative number α (α > 0 if H is strongly monotone) such that

(x − z)ᵀ[H(x) − H(z)] ≥ α‖x − z‖² for all x, z ∈ ℜⁿ.

Then for any h ∈ ℜⁿ we have

hᵀG′_x(ε, x)h = lim_{t↓0} hᵀ[G(ε, x + th) − G(ε, x)]/t
= lim_{t↓0} ∫_supp hᵀ[H(x + th − εy) − H(x − εy)]/t Φ(y) dy
≥ lim_{t↓0} ∫_supp αhᵀh Φ(y) dy
= αhᵀh,

which proves (v).

(vi) For any ε > 0, define γ_ε : ℜⁿ → ℜ by

γ_ε(x) = ∫ f(x − εy)Φ(y) dy, x ∈ ℜⁿ.

Then, by (i) of this theorem, γ_ε is continuously differentiable. By direct computation, we have

∇γ_ε(x) = ∫ ∇f(x − εy)Φ(y) dy = ∫ H(x − εy)Φ(y) dy = G(ε, x), x ∈ ℜⁿ.

Since G(ε, ·) is continuously differentiable, γ_ε is twice continuously differentiable with

G′_x(ε, x) = ∇²γ_ε(x), x ∈ ℜⁿ,

which means that G′_x(ε, x) is symmetric by [22, 3.3.4].

(vii) According to the definition, for any (ε, d) ∈ ℜ₊₊ × ℜⁿ with (ε, d) → 0 we have

G(ε, x + d) − G(0, x) − G′(ε, x + d)(ε, d)
= lim_{t↓0} ∫_supp [H(x + d − εy) − H(x) − (H(x + d + td − (ε + tε)y) − H(x + d − εy))/t] Φ(y) dy
= lim_{t↓0} ∫_supp [H(x + d − εy) − H(x) − V_t(d − εy)]Φ(y) dy,

where the last equality follows from the mean value theorem [4, Proposition 2.6.5] and V_t is an element of the convex hull of

∂H([x + d − εy, x + d − εy + t(d − εy)]).

By Carathéodory's theorem, there exist at most mn + 1 elements

W_{t,i} ∈ ∂H(x + d − εy + tθᵢ(d − εy))

such that

V_t = Σ_{i=1}^{mn+1} λᵢW_{t,i},

where θᵢ, λᵢ ∈ [0, 1] and Σ_{i=1}^{mn+1} λᵢ = 1. Then

G(ε, x + d) − G(0, x) − G′(ε, x + d)(ε, d)
= lim_{t↓0} ∫_supp Σ_{i=1}^{mn+1} λᵢ [H(x + d − εy) − H(x) − W_{t,i}(d − εy)] Φ(y) dy
= lim_{t↓0} ∫_supp Σ_{i=1}^{mn+1} λᵢ [H(x + d − εy + tθᵢ(d − εy)) − H(x) − W_{t,i}(d − εy + tθᵢ(d − εy)) + O(t)] Φ(y) dy.

Then, if H is semismooth at x, we have

‖G(ε, x + d) − G(0, x) − G′(ε, x + d)(ε, d)‖ = sup_{y∈supp} o(‖d − εy‖) = o(‖(ε, d)‖).

This proves (3.7). If H is strongly semismooth at x, by following the above arguments, we can prove (3.8).

In proving part (iii) of Theorem 3.1, we used an idea from Ioffe [16]. For the Steklov averaged function, Xu and Chang [41] proved a result similar to part (iii) of Theorem 3.1. Recently, by assuming that H :

3.2 Supp(Φ) is infinite

Assumption 3.1 (i) H is globally Lipschitz continuous with Lipschitz constant L.

(ii) κ := ∫ ‖y‖Φ(y) dy < ∞.

(iii) Φ is continuously differentiable, and for any ε > 0 and x ∈ ℜⁿ,

∫ H(y)Θ′_x(ε, x − y) dy

exists.

(iv) For any ε > 0, x ∈ ℜⁿ and h → 0,

∫ sup_{t∈[0,1]} ‖y‖ ‖[Θ′_x(ε, x + th − y) − Θ′_x(ε, x − y)]h‖ dy = o(‖h‖).

Proposition 3.3 Suppose that Assumption 3.1 holds. Then for any ε > 0, G(ε, ·) is continuously differentiable with Jacobian given by

G′_x(ε, x) = ∫ H(y)Θ′_x(ε, x − y) dy, x ∈ ℜⁿ. (3.10)

Proof: By (i) and (ii) of Assumption 3.1, G is well defined for any (ε, x) ∈ ℜ₊₊ × ℜⁿ. By (iii) and (iv) of Assumption 3.1, for any h ∈ ℜⁿ with h → 0,

‖G(ε, x + h) − G(ε, x) − ∫ H(y)Θ′_x(ε, x − y)h dy‖
= ‖∫ H(y)[Θ(ε, x + h − y) − Θ(ε, x − y) − Θ′_x(ε, x − y)h] dy‖
≤ ∫ ‖H(y)‖ max_{t∈[0,1]} ‖[Θ′_x(ε, x + th − y) − Θ′_x(ε, x − y)]h‖ dy
≤ L ∫ sup_{t∈[0,1]} ‖y‖ ‖[Θ′_x(ε, x + th − y) − Θ′_x(ε, x − y)]h‖ dy
+ (‖H(y₀)‖ + L‖y₀‖) ∫ sup_{t∈[0,1]} ‖[Θ′_x(ε, x + th − y) − Θ′_x(ε, x − y)]h‖ dy
= o(‖h‖),

where y₀ ∈ ℜⁿ is an arbitrary point and we used ‖H(y)‖ ≤ ‖H(y₀)‖ + L‖y₀‖ + L‖y‖. Then we have proved that G′_x(ε, x) exists and (3.10) holds. The continuity of G′_x(ε, ·) follows from (iv) of Assumption 3.1.

Assumption 3.2 (i) For any ε > 0 and x ∈ ℜⁿ,

∫ H(y)Θ′_ε(ε, x − y) dy (3.11)

exists.

(ii) For any ε > 0, x ∈ ℜⁿ and τ → 0,

∫ sup_{t∈[0,1]} ‖y‖ |[Θ′_ε(ε + tτ, x − y) − Θ′_ε(ε, x − y)]τ| dy = o(|τ|). (3.12)

(iii) For any ε > 0 and x ∈ ℜⁿ,

lim_{τ→ε, z→x} ∫ ‖y‖ |Θ′_ε(τ, z − y) − Θ′_ε(ε, x − y)| dy = 0. (3.13)

A typical kernel function with infinite support is the Gaussian kernel

Φ(y) = (1/(√π)ⁿ) e^{−‖y‖²}.
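Since this Φ is the density of a normal random variable Y ~ N(0, I/2), the smoothing G(ε, x) = E[H(x − εY)] of (2.2) sometimes has a closed form. As an illustration (ours, not from the paper), for the plus function H(t) = max(0, t) in one dimension, G(ε, t) = tΦ_N(t/a) + aφ_N(t/a) with a = ε/√2, where Φ_N and φ_N denote the standard normal distribution and density functions:

```python
import math

def G_gauss_plus(eps, t):
    # Smoothing of max(0, t) by the Gaussian kernel Phi(y) = pi^{-1/2} exp(-y^2):
    # t - eps*Y ~ N(t, a^2) with a = eps/sqrt(2), and
    # E[max(0, N(t, a^2))] = t*Phi_N(t/a) + a*phi_N(t/a).
    a = eps / math.sqrt(2.0)
    z = t / a
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return t * cdf + a * pdf

print(G_gauss_plus(1.0, 0.0))    # equals 1/(2*sqrt(pi)) at t = 0
print(G_gauss_plus(1e-6, 1.0))   # close to max(0, 1) = 1 as eps -> 0
```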

Analogously to Theorem 3.1 we have the following theorem.

Theorem 3.2 Suppose that supp(Φ) is infinite and Φ is continuously differentiable. Suppose that Assumption 3.1 holds. Then we have:

(i) If Assumption 3.2 holds, then G(·, ·) is continuously differentiable on ℜ₊₊ × ℜⁿ.

(ii) G is globally Lipschitz continuous on ℜ × ℜⁿ and, for any ε > 0 and x ∈ ℜⁿ,

‖G′_x(ε, x)‖ ≤ L. (3.14)

(iii) For any x ∈ ℜⁿ, lim_{z→x, ε↓0} G′_x(ε, z) ⊆ plen ∂H(x).

(iv) For any x ∈ ℜⁿ,

π_x∂G(0, x) ⊆ plen ∂H(x) ⊆ ∂H₁(x) × ∂H₂(x) × ··· × ∂H_m(x). (3.15)

(v) For any ε > 0 and x ∈ ℜⁿ, G′_x(ε, x) is positive semidefinite (positive definite) if H is monotone (strongly monotone) on ℜⁿ.

(vi) Suppose that m = n. If there exists a continuously differentiable function f : ℜⁿ → ℜ such that ∇f = H and, for any ε > 0 and x ∈ ℜⁿ,

γ_ε(x) := ∫ f(x − εy)Φ(y) dy

is well defined, then G′_x(ε, x) is symmetric at any (ε, x) ∈ ℜ₊₊ × ℜⁿ.

(vii) If H is semismooth at x, then for any (ε, d) ∈ ℜ₊₊ × ℜⁿ with (ε, d) → 0 we have

G(ε, x + d) − G(0, x) − G′(ε, x + d)(ε, d) = o(‖(ε, d)‖). (3.16)

Proof: (i) From parts (i) and (ii) of Assumption 3.2 we can see that for any (ε, x) ∈ ℜ₊₊ × ℜⁿ, G′_ε(ε, x) exists and G′_ε(·, x) is continuous. Part (iii) of Assumption 3.2 guarantees that

lim_{τ→ε, z→x} G′_ε(τ, z) = G′_ε(ε, x),

which, together with the continuity of G′_ε(·, x) and G′_x(ε, ·) (Proposition 3.3), implies that G is continuously differentiable on ℜ₊₊ × ℜⁿ.

(ii) Let (ε, x), (τ, z) be any two points in ℜ × ℜⁿ. Then

‖G(ε, x) − G(τ, z)‖ = ‖∫ [H(x − |ε|y) − H(z − |τ|y)]Φ(y) dy‖
≤ L ∫ (‖x − z‖ + ‖y‖ ||ε| − |τ||)Φ(y) dy
≤ L(‖x − z‖ + κ|ε − τ|),

so G is globally Lipschitz continuous. In particular, for any ε,

‖G(ε, y) − G(ε, z)‖ ≤ L‖y − z‖ for all y, z ∈ ℜⁿ.

This implies (3.14).

(iii) For any given δ > 0, there exists a number M > 0 such that

∫_{‖y‖≥M} Φ(y) dy ≤ δ/2.

Let D_M := {y ∈ ℜⁿ | ‖y‖ ≤ M}. Suppose that u ∈ ℜⁿ and v ∈ ℜᵐ. Since ∂H is upper semicontinuous, there exists a number τ > 0 such that for all ε ∈ [0, τ] and all z with ‖z − x‖ ≤ τ we have

sup_{y∈D_M} max_{V∈∂H(z−εy)} vᵀVu ≤ max_{V∈∂H(x)} vᵀVu + (δ/2)‖v‖‖u‖. (3.17)

Since H is globally Lipschitz continuous, by the definition, we have

vᵀG′_x(ε, z)u = lim_{t↓0} vᵀ[G(ε, z + tu) − G(ε, z)]/t
= lim_{t↓0} { ∫_{D_M} vᵀ[H(z + tu − εy) − H(z − εy)]/t Φ(y) dy + ∫_{‖y‖≥M} vᵀ[H(z + tu − εy) − H(z − εy)]/t Φ(y) dy }
≤ sup_{y∈D_M} H⁰(z − εy; u, v) + L‖v‖‖u‖ δ/2,

which, together with (3.17), implies that for any δ > 0 there exists a number τ > 0 such that for all ε ∈ (0, τ] and all z with ‖z − x‖ ≤ τ we have

vᵀG′_x(ε, z)u ≤ sup_{y∈D_M} max_{V∈∂H(z−εy)} vᵀVu + L‖v‖‖u‖ δ/2 ≤ max_{V∈∂H(x)} vᵀVu + (L + 1)‖v‖‖u‖ δ/2.

This means that

lim sup_{z→x, ε↓0} vᵀG′_x(ε, z)u ≤ max_{V∈∂H(x)} vᵀVu.

Hence, by Theorem 2.1, for any matrix

A = lim_{z→x, ε↓0} G′_x(ε, z),

we have vᵀAu ≤ H⁰(x; u, v), which, by Theorem 2.1, implies A ∈ plen ∂H(x).

(iv) Similar to the proof of (iv) in Theorem 3.1.

(v) Similar to the proof of (v) in Theorem 3.1.

(vi) By the definition,

γ_ε(x + h) − γ_ε(x) − ∫ H(x − εy)ᵀh Φ(y) dy
= ∫ [f(x + h − εy) − f(x − εy) − f′(x − εy)h]Φ(y) dy
= ∫ ∫₀¹ [f′(x + th − εy) − f′(x − εy)]h dt Φ(y) dy.

Since f′ = H is globally Lipschitz continuous, for any δ > 0 there exists a number M > 0 such that

∫_{‖y‖≥M} ∫₀¹ ‖[f′(x + th − εy) − f′(x − εy)]h‖ dt Φ(y) dy ≤ (δ/2)‖h‖².

For the given M, there exists a number τ > 0 such that for all ε ∈ [0, τ] and all h with ‖h‖ ≤ τ we have

∫_{‖y‖≤M} ∫₀¹ ‖[f′(x + th − εy) − f′(x − εy)]h‖ dt Φ(y) dy ≤ (δ/2)‖h‖,

because f′ is uniformly continuous on any bounded set. Therefore, for any δ > 0 there exists a number τ ∈ (0, 1] such that for all ε ∈ [0, τ] and all h with ‖h‖ ≤ τ we have

∫ ∫₀¹ ‖[f′(x + th − εy) − f′(x − εy)]h‖ dt Φ(y) dy ≤ δ‖h‖.

Hence

γ_ε(x + h) − γ_ε(x) − ∫ H(x − εy)ᵀh Φ(y) dy = o(‖h‖),

which implies that ∇γ_ε(x) = ∫ H(x − εy)Φ(y) dy = G(ε, x).

Since G(ε, ·) is continuous, γ_ε is continuously differentiable. By Proposition 3.3, G(ε, ·) is continuously differentiable, and so γ_ε is twice continuously differentiable with

∇²γ_ε(x) = G′_x(ε, x), x ∈ ℜⁿ.

This, together with the symmetry of ∇²γ_ε(x), implies that G′_x(ε, x) is symmetric.

(vii) First, for any given δ > 0, under the assumptions, we know that there exists a number M > 0 such that

max{ ∫_{‖y‖≥M} Φ(y) dy, ∫_{‖y‖≥M} ‖y‖Φ(y) dy } ≤ δ/4.

Then for any d ∈ ℜⁿ and t > 0 we have

‖∫_{‖y‖≥M} [H(x + d − εy) − H(x) − (H(x + d + td − (ε + tε)y) − H(x + d − εy))/t] Φ(y) dy‖ ≤ L(‖d‖ + ε) δ/2 ≤ Lδ‖(ε, d)‖.

Next, according to the proof of part (vii) of Theorem 3.1, we know that there exists a number τ > 0 such that for all (ε, d) with ε > 0 and ‖(ε, d)‖ ≤ τ we have

lim sup_{t↓0} ‖∫_{‖y‖≤M} [H(x + d − εy) − H(x) − (H(x + d + td − (ε + tε)y) − H(x + d − εy))/t] Φ(y) dy‖ ≤ (δ/2)‖(ε, d)‖.

Therefore, for any given δ > 0, there exists a number τ > 0 such that for all (ε, d) with ε > 0 and ‖(ε, d)‖ ≤ τ we have

‖G(ε, x + d) − G(0, x) − G′(ε, x + d)(ε, d)‖
= ‖lim_{t↓0} ∫ [H(x + d − εy) − H(x) − (H(x + d + td − (ε + tε)y) − H(x + d − εy))/t] Φ(y) dy‖
≤ (δ/2)‖(ε, d)‖ + Lδ‖(ε, d)‖
≤ (L + 1)δ‖(ε, d)‖,

which proves (3.16).

In contrast to Theorem 3.1, in Theorem 3.2 we could not prove a result similar to (3.8). When X is a rectangle, such a result holds for several smoothing functions with supp(Φ) infinite [31].

3.3 Smoothing-nonsmooth reformulations

If in (2.2) H is replaced by Π_X, a smoothing approximation of Π_X via convolution can be described by

G(ε, x) = ∫ Π_X(x − y)Θ(ε, y) dy. (3.18)

Since X is convex and Θ(ε, ·) is a probability density, for any ε > 0 and x ∈ ℜⁿ we have G(ε, x) ∈ X. As we stated earlier, we always define

G(0, x) := Π_X(x) and G(−|ε|, x) := G(|ε|, x), x ∈ ℜⁿ.

Suppose that Φ is chosen such that G is continuous on ℜ × ℜⁿ. Define

P(ε, x) := x − G(ε, x − F(x)).

Then, corresponding to the nonsmooth equations (1.3) and (1.4), we obtain the smoothing-nonsmooth equations

Q(ε, x) := (ε, P(ε, x)) = 0 (3.19)

and

R(ε, x) := (ε, S(ε, x)) = 0, (3.20)

where S(ε, x) := F(G(ε, x)) + x − G(ε, x).

Theorem 3.3 Suppose that Φ is defined by (3.1), or that supp(Φ) is bounded and Φ is continuously differentiable. Then Q is continuously differentiable on ℜ₊₊ × ℜⁿ, and for any ε > 0 there exists an orthogonal matrix U such that Σ := UG′_x(ε, x − F(x))Uᵀ is a diagonal matrix with 0 ≤ Σᵢᵢ ≤ 1, i ∈ {1, 2, ···, n}. Moreover, we have:

(i) if UF′(x)Uᵀ is a P₀-matrix (P-matrix), then P′_x(ε, x) is a quasi P₀-matrix (quasi P-matrix);

(ii) if Π_X is semismooth at x − F(x), then for any (ε, d) ∈ ℜ₊₊ × ℜⁿ with (ε, d) → 0 we have

Q(ε, x + d) − Q(0, x) − Q′(ε, x + d)(ε, d) = o(‖(ε, d)‖), (3.21)

and

Q(ε, x + d) − Q(0, x) − Q′(ε, x + d)(ε, d) = O(‖(ε, d)‖²) (3.22)

if Π_X is strongly semismooth at x − F(x) and F′ is Lipschitz continuous at x.

Proof: First, from Theorem 3.1, Q is continuously differentiable on ℜ₊₊ × ℜⁿ. Then, since Π_X is monotone and globally Lipschitz continuous with Lipschitz constant 1, from Theorem 3.1 and the proof of Theorem 2.2 we know that for any ε > 0, G′_x(ε, x) is symmetric, positive semidefinite and ‖G′_x(ε, x)‖ ≤ 1. Thus, there exists an orthogonal matrix U such that Σ := UG′_x(ε, x − F(x))Uᵀ is a diagonal matrix with 0 ≤ Σᵢᵢ ≤ 1, i ∈ {1, 2, ···, n}. Part (i) can be proved similarly to Theorem 2.3, and part (ii) follows from (vii) of Theorem 3.1.

Theorem 3.4 Suppose that supp(Φ) is infinite and Φ is continuously differentiable. Suppose that Assumptions 3.1 and 3.2 hold and that

∫ ‖y‖²Φ(y) dy < ∞.

Then Q is continuously differentiable on ℜ₊₊ × ℜⁿ, and for any ε > 0 there exists an orthogonal matrix U such that Σ := UG′_x(ε, x − F(x))Uᵀ is a diagonal matrix with 0 ≤ Σᵢᵢ ≤ 1, i ∈ {1, 2, ···, n}. Moreover, we have:

(i) if UF′(x)Uᵀ is a P₀-matrix (P-matrix), then P′_x(ε, x) is a quasi P₀-matrix (quasi P-matrix);

(ii) if Π_X is semismooth at x − F(x), then for any (ε, d) ∈ ℜ₊₊ × ℜⁿ with (ε, d) → 0 we have

Q(ε, x + d) − Q(0, x) − Q′(ε, x + d)(ε, d) = o(‖(ε, d)‖). (3.23)

Proof: Similar to the proof of Theorem 3.3, we can prove this theorem by Theorem 3.2. We omit the details.

The results of this subsection stated for P hold for R as well. We omit the details here.

3.4 Computable smoothing functions of Π_X

The smoothing approximation function G via convolution is not "computable" if X is not a rectangle, since in this case a multivariate integral is involved. When X is a rectangle, see [2, 9] for various concrete forms of G. In [30], the authors discussed a way to get an approximate smoothing function of Π_X(·) when X is explicitly expressed as

X := {x ∈ ℜⁿ | gᵢ(x) ≤ 0, i = 1, 2, ···, m}, (3.24)

where for each i, gᵢ is a twice continuously differentiable convex function. Suppose that the Slater constraint qualification holds, i.e., there exists a point x⁰ such that gᵢ(x⁰) < 0 for all i ∈ {1, 2, ···, m}. Then for any x ∈ ℜⁿ, y = Π_X(x) if and only if there exists a vector λ ∈ ℜᵐ₊ such that

y − x + Σ_{i=1}^m λᵢ∇gᵢ(y) = 0,
λ − max(0, λ + g(y)) = 0. (3.25)

Suppose that a(ε, t) is the Chen-Harker-Kanzow-Smale (CHKS) smoothing function [1, 18, 36] of max(0, t), t ∈ ℜ, which is given by

a(ε, t) := (√(4ε² + t²) + t)/2, t ∈ ℜ.

(We could use other smoothing functions of max(0, t), t ∈ ℜ; here we choose the CHKS smoothing function for the ease of discussion.) Define A : ℜ × ℜᵐ → ℜᵐ by

Aᵢ(ε, z) := a(ε, zᵢ), i ∈ {1, 2, ···, m}.
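The CHKS function is easy to probe directly (our sketch): for ε > 0, a(ε, ·) is smooth, 0 < ∂a/∂t < 1, a(ε, 0) = ε, and a(ε, t) → max(0, t) as ε ↓ 0:

```python
import math

def chks(eps, t):
    # Chen-Harker-Kanzow-Smale smoothing of max(0, t).
    return (math.sqrt(4.0 * eps * eps + t * t) + t) / 2.0

def chks_dt(eps, t):
    # Derivative in t: strictly between 0 and 1 whenever eps > 0.
    return (t / math.sqrt(4.0 * eps * eps + t * t) + 1.0) / 2.0

for t in (-2.0, 0.0, 2.0):
    print(t, chks(1e-8, t), chks_dt(0.5, t))
```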

Consider the perturbed system of (3.25):

D((y, λ), (ε, x)) := ( y − x + Σ_{i=1}^m λᵢ∇gᵢ(y) ; λ − A(ε, λ + g(y)) ) = 0, (3.26)

where ε and x are treated as parameters and y and λ as variables. For any (ε, x) ∈ ℜ₊₊ × ℜⁿ, the system (3.26) has a unique solution (y(ε, x), λ(ε, x)).

Proposition 3.4 ([30]) Suppose that the Slater constraint qualification holds. Then

(i) y(·, ·) is continuously differentiable on ℜ₊₊ × ℜⁿ and, for any (ε, x) ∈ ℜ₊₊ × ℜⁿ, y′_x(ε, x) is symmetric, positive semidefinite and

‖y′_x(ε, x)‖ ≤ 1.

(ii) For any x⁰ ∈ ℜⁿ, lim_{ε↓0, x→x⁰} y(ε, x) = Π_X(x⁰).

See [30] for some important differentiability properties of the above defined smoothing function. Note that for any ε > 0 and x ∈ ℜⁿ, y(ε, x) ∈ int X. For convenience, we define

y(0, x) := Π_X(x) and y(−|ε|, x) := y(|ε|, x), (ε, x) ∈ ℜ × ℜⁿ.

Then we can define a smoothing function P of W by

P(ε, x) := x − y(ε, x − F(x)), (ε, x) ∈ ℜ × ℜⁿ,

and, correspondingly, a smoothing function S of E by

S(ε, x) := F(y(ε, x)) + x − y(ε, x), (ε, x) ∈ ℜ × ℜⁿ.

It is noted that for any ε ≠ 0, y(ε, x) ∈ int X. This guarantees that S is well defined even if F is only defined on X.

4 Algorithm and its convergence

Suppose that Q : ℜ^{1+n} → ℜ^{1+n} is defined as in (3.19) and satisfies the following assumption.

Assumption 4.1 (i) Q is continuous on ℜ^{1+n} and continuously differentiable on ℜ₊₊ × ℜⁿ.

(ii) For any ε > 0 and x ∈ ℜⁿ, Q′(ε, x) is nonsingular.

Remark: According to Theorems 3.3 and 3.4 and Corollary 2.1, we know that if F is a continuously differentiable generalized P₀-function on L(X), then for any ε > 0 and x ∈ ℜⁿ, P′_x(ε, x) is a quasi P₀-matrix. This implies that if we redefine

P(ε, x) := P(ε, x) + εx or P(ε, x) := x − G(ε, x − (F(x) + εx)),

then P′_x(ε, x) becomes a quasi P-matrix, and thus a nonsingular matrix. Therefore, if F is a generalized P₀-function on L(X), then, by redefining P if necessary, part (ii) of Assumption 4.1 always holds.

Choose ε̄ ∈ ℜ₊₊ and γ ∈ (0, 1) such that γε̄ < 1. Let z̄ := (ε̄, 0) ∈ ℜ × ℜⁿ. Define the merit function ψ : ℜ^{1+n} → ℜ₊ by

ψ(z) := ‖Q(z)‖²

and define β : ℜ^{1+n} → ℜ₊ by

β(z) := γ min{1, ψ(z)}.

Let

Ω := {z = (ε, x) ∈ ℜ × ℜⁿ | ε ≥ β(z)ε̄}.

Algorithm 4.1 Step 0. Choose constants δ ∈ (0, 1) and σ ∈ (0, 1/2). Let ε⁰ := ε̄, let x⁰ ∈ ℜⁿ be an arbitrary point, and set z⁰ := (ε⁰, x⁰) and k := 0.

Step 1. If Q(zᵏ) = 0, stop. Otherwise, let βₖ := β(zᵏ).

Step 2. Compute Δzᵏ := (Δεᵏ, Δxᵏ) ∈ ℜ × ℜⁿ by solving

Q(zᵏ) + Q′(zᵏ)Δzᵏ = βₖz̄. (4.1)

Step 3. Let lₖ be the smallest nonnegative integer l satisfying

ψ(zᵏ + δˡΔzᵏ) ≤ [1 − 2σ(1 − γε̄)δˡ]ψ(zᵏ). (4.2)

Define zᵏ⁺¹ := zᵏ + δ^{lₖ}Δzᵏ.

Step 4. Replace k by k + 1 and go to Step 1.

The following global convergence result is proved in Qi et al. [31].
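The following is a simplified sketch of Algorithm 4.1 (ours; not the paper's general setting). It takes the NCP with F(x) = Mx + q and X = ℜ²₊, smooths Π_{ℜ₊} componentwise by the CHKS function a(ε, ·) of Subsection 3.4 instead of a convolution kernel, and sets Q(z) = (ε, P(ε, x)) with P(ε, x) = x − a(ε, x − F(x)). The data M, q and all constants are illustrative choices; the Newton equation is (4.1) and the line search is (4.2).

```python
import numpy as np

def chks(eps, t):      # CHKS smoothing of max(0, t), applied componentwise
    return (np.sqrt(4 * eps**2 + t**2) + t) / 2

def chks_dt(eps, t):   # derivative in t
    return (t / np.sqrt(4 * eps**2 + t**2) + 1) / 2

def chks_de(eps, t):   # derivative in eps
    return 2 * eps / np.sqrt(4 * eps**2 + t**2)

M = np.array([[2.0, 1.0], [1.0, 2.0]])   # positive definite (hence a P-matrix)
q = np.array([-3.0, 1.0])
F = lambda x: M @ x + q

def P(eps, x):         # smoothing of W(x) = x - Pi_X(x - F(x)), X = R^2_+
    return x - chks(eps, x - F(x))

def psi(eps, x):       # merit function psi(z) = ||Q(z)||^2, Q(z) = (eps, P(eps, x))
    return eps**2 + float(P(eps, x) @ P(eps, x))

eps_bar, gamma, delta, sigma = 1.0, 0.2, 0.5, 1e-4   # gamma * eps_bar < 1
eps, x = eps_bar, np.zeros(2)
for k in range(60):
    if psi(eps, x) < 1e-18:
        break
    beta = gamma * min(1.0, psi(eps, x))
    u = x - F(x)
    Px = np.eye(2) - np.diag(chks_dt(eps, u)) @ (np.eye(2) - M)
    d_eps = beta * eps_bar - eps                                  # first row of (4.1)
    dx = np.linalg.solve(Px, -P(eps, x) + chks_de(eps, u) * d_eps)  # remaining rows
    step = 1.0
    for _ in range(40):                                           # line search (4.2)
        if psi(eps + step * d_eps, x + step * dx) <= \
           (1 - 2 * sigma * (1 - gamma * eps_bar) * step) * psi(eps, x):
            break
        step *= delta
    eps, x = eps + step * d_eps, x + step * dx

print(x, eps)   # x approaches the NCP solution (1.5, 0) and eps -> 0
```

Note that eps stays positive throughout: each update moves eps toward βε̄ > 0, mirroring the role of the set Ω.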

Theorem 4.1 Suppose that Assumption 4.1 is satisfied. Then an infinite sequence {zᵏ} is generated by Algorithm 4.1 with {zᵏ} ⊆ Ω, and each accumulation point z̃ of {zᵏ} is a solution of Q(z) = 0.

When F is a generalized P₀-function on L(X), we have the following stronger result.

Theorem 4.2 Suppose that Q is defined by (3.19) with P given by

P(ε, x) := x − G(ε, x − (F(x) + εx)), (ε, x) ∈ ℜ × ℜⁿ,

where G is continuously differentiable on ℜ₊₊ × ℜⁿ and, for any (ε, x) ∈ ℜ₊₊ × ℜⁿ, G′_x(ε, x) is symmetric, positive semidefinite and ‖G′_x(ε, x)‖ ≤ 1. Suppose that X is of the structure (2.3) and that F is a continuously differentiable generalized P₀-function on L(X). If the solution set of the VIP is nonempty and bounded, then a bounded infinite sequence {zᵏ} is generated by Algorithm 4.1 and each accumulation point z̃ of {zᵏ} is a solution of Q(z) = 0.

Proof: By the proof of Corollary 2.1, for any ε > 0 and x ∈ ℜⁿ, P′_x(ε, x) is a quasi P-matrix, and so Q′(ε, x) is nonsingular. Then, by Theorem 4.1, an infinite sequence {zᵏ} is generated by Algorithm 4.1 and {zᵏ} ⊆ Ω. Since {ψ(zᵏ)} is a decreasing sequence, lim_{k→∞} ψ(zᵏ) exists. If

lim_{k→∞} ψ(zᵏ) > 0, (4.3)

then there exists an ε₀ > 0 such that εₖ ≥ ε₀ for all k. By the proof of Theorem 2.4, we can prove that for any δ > 0 the following set

{x ∈ ℜⁿ | ‖P(ε, x)‖ ≤ δ for some ε ∈ [ε₀, ε̄]}

is bounded. Hence {zᵏ} is bounded and has an accumulation point z̃, which, by Theorem 4.1, is a solution of Q(z) = 0; this gives ψ(z̃) = 0 and contradicts (4.3). Therefore lim_{k→∞} ψ(zᵏ) = 0. Then, since the solution set of the VIP is nonempty and bounded, we can similarly prove that for any δ > 0 the following set

{x ∈ ℜⁿ | ‖P(ε, x)‖ ≤ δ for some ε ∈ [0, ε̄]}

is bounded, which implies that {zᵏ} is bounded, and each accumulation point z̃ of {zᵏ} is, by Theorem 4.1, a solution of Q(z) = 0.

Theorem 4.3 Suppose that Assumption 4.1 is satisfied and that z* := (0, x*) is an accumulation point of the infinite sequence {zᵏ} generated by Algorithm 4.1. Suppose that for any (ε, d) ∈ ℜ₊₊ × ℜⁿ with (ε, d) → 0,

P(ε, x* + d) − P(0, x*) − P′(ε, x* + d)(ε, d) = o(‖(ε, d)‖),

and that all V ∈ ∂Q(z*) are nonsingular. Then the whole sequence {zᵏ} converges to z*,

‖zᵏ⁺¹ − z*‖ = o(‖zᵏ − z*‖) and εₖ₊₁ = o(εₖ).

Furthermore, if for any (ε, d) ∈ ℜ₊₊ × ℜⁿ with (ε, d) → 0,

P(ε, x* + d) − P(0, x*) − P′(ε, x* + d)(ε, d) = O(‖(ε, d)‖²),

then

‖zᵏ⁺¹ − z*‖ = O(‖zᵏ − z*‖²) and εₖ₊₁ = O(εₖ²).

In Theorem 4.3 we assume that all V ∈ ∂Q(z*) are nonsingular in order to get a high-order convergence result. According to Theorems 3.1 and 3.2, this assumption is satisfied if all

T ∈ I − plen ∂Π_X[x* − F(x*)][I − F′(x*)]

are nonsingular. By the definition of plen, the latter is equivalent to saying that all

T ∈ I − ∂Π_X[x* − F(x*)][I − F′(x*)]

are nonsingular. See Section 2 for conditions ensuring the nonsingularity of the above matrices. In particular, if F′(x*) is positive definite, all

T ∈ I − ∂Π_X(x* − F(x*))[I − F′(x*)]

are nonsingular.

5 Concluding Remarks

In this paper, by using convolution, we reformulated the VIP equivalently as smoothing-nonsmooth equations, which have some desirable properties. A globally and locally high-order convergent Newton method has been applied to solve the smoothing-nonsmooth equations, and so the VIP. There is no specific assumption on the structure of the constraint set X. Because a multivariate integral is involved, it may not be easy to compute the smoothing functions of the projection operator Π_X via convolution. However, this approach may lead to the discovery of effective ways to compute smoothing functions of Π_X when the structure of X can be exploited. In fact, based on this observation, an effective way to compute the smoothing functions of Π_X is proposed in [30] when X can be expressed as the set defined by several twice continuously differentiable convex functions. We believe that the research done in this paper can deepen the understanding of smoothing functions of Π_X when X is not a rectangle.

Acknowledgement: The authors would like to thank Houduo Qi for his comments and suggestions during the writing of this paper.

References

[1] B. Chen and P.T. Harker, “A non-interior-point continuation method for linear complementarity problems”, SIAM Journal on Matrix Analysis and Applications, 14 (1993), 1168–1190.

[2] C. Chen and O.L. Mangasarian, “A class of smoothing functions for nonlinear and mixed complementarity problems”, Computational Optimization and Applications, 5 (1996), 97–138.

[3] X. Chen, L. Qi and D. Sun, “Global and superlinear convergence of the smoothing Newton method and its application to general box constrained variational inequalities”, Mathematics of Computation, 67 (1998), 519–540.

[4] F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.

[5] B.C. Eaves, “On the basic theorem of complementarity”, Mathematical Programming, 1 (1971), 68–75.

[6] Y.M. Ermoliev, V.I. Norkin and R.J.-B. Wets, “The minimization of semicontinuous functions: mollifier subgradients”, SIAM Journal on Control and Optimization, 33 (1995), 149–167.

[7] F. Facchinei and J.S. Pang, “Total stability of variational inequalities”, Preprint, Department of Mathematical Sciences, Whiting School of Engineering, The Johns Hopkins University, Baltimore, Maryland 21218-2682, U.S.A., 1998.

[8] A. Fischer, “Solution of monotone complementarity problems with locally Lipschitzian functions”, Mathematical Programming, 76 (1997), 513–532.

[9] S.A. Gabriel and J.J. Moré, “Smoothing of mixed complementarity problems”, in Complementarity and Variational Problems: State of the Art, M.C. Ferris and J.S. Pang, eds., SIAM, Philadelphia, Pennsylvania, 1997, pp. 105–116.

[10] M.S. Gowda and G. Ravindran, “Algebraic univalence theorems for nonsmooth functions”, Research Report, Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, Maryland 21250, 1998.

[11] M.S. Gowda and R. Sznajder, “Weak univalence and connectedness of inverse images of continuous functions”, Mathematics of Operations Research, 24 (1999), 255–261.

[12] M.S. Gowda and M.A. Tawhid, “Existence and limiting behavior of trajectories associated with P0-equations”, Computational Optimization and Applications, 12 (1999), 229–251.

[13] A.M. Gupal, “On a method for the minimization of almost differentiable functions”, Kibernetika, (1977), 114–116.

[14] P.T. Harker and J.-S. Pang, “Finite-dimensional variational inequality and nonlinear complementarity problems: A survey of theory, algorithms and applications”, Mathematical Programming, 48 (1990), 161–220.

[15] J.-B. Hiriart-Urruty, “Characterizations of the plenary hull of the generalized Jacobian matrix”, Mathematical Programming Study, 17 (1982), 1–12.

[16] A.D. Ioffe, “Nonsmooth analysis: differentiable calculus of nondifferentiable mappings”, Transactions of the American Mathematical Society, 266 (1981), 1–56.

[17] H. Jiang and L. Qi, “Local uniqueness and convergence of iterative methods for nonsmooth variational inequalities”, Journal of Mathematical Analysis and Applications, 196 (1995), 314–331.

[18] C. Kanzow, “Some noninterior continuation methods for linear complementarity problems”, SIAM Journal on Matrix Analysis and Applications, 17 (1996), 851–868.

[19] B. Kummer, “Newton’s method for nondifferentiable functions”, in: Advances in Mathematical Optimization, 114–125, Akademie-Verlag, Berlin, 1988.

[20] B. Kummer, “Newton’s method based on generalized derivatives for nonsmooth functions: convergence analysis”, in: Advances in Optimization, 171–194, Springer, Berlin, 1992.

[21] D.Q. Mayne and E. Polak, “Nondifferential optimization via adaptive smoothing”, Journal of Optimization Theory and Applications, 43 (1984), 19–30.

[22] J.M. Ortega and W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.

[23] J.-S. Pang and L. Qi, “Nonsmooth equations: Motivation and algorithms”, SIAM Journal on Optimization, 3 (1993), 443–465.

[24] J.M. Peng and M. Fukushima, “A hybrid Newton method for solving the variational inequality problem via the D-gap function”, Mathematical Programming, 86 (1999), 367–386.

[25] D. Pu and J. Zhang, “Inexact generalized Newton methods for second-order C-differentiable optimization”, Journal of Computational and Applied Mathematics, 93 (1998), 107–122.

[26] L. Qi, “Convergence analysis of some algorithms for solving nonsmooth equations”, Mathematics of Operations Research, 18 (1993), 227–244.

[27] L. Qi and X. Chen, “A globally convergent successive approximation method for severely nonsmooth equations”, SIAM Journal on Control and Optimization, 33 (1995), 402–418.

[28] L. Qi, D. Ralph and G. Zhou, “Semiderivative functions and reformulation methods for solving complementarity and variational problems”, to appear in Nonlinear Optimization and Applications 2, G. Di Pillo and F. Giannessi (eds.), Kluwer Academic Publishers.

[29] L. Qi and D. Sun, “A survey of some nonsmooth equations and smoothing Newton methods”, in Progress in Optimization: Contributions from Australasia, A. Eberhard, B. Glover, R. Hill and D. Ralph (eds.), Kluwer Academic Publishers, 1999, pp. 121–146.

[30] L. Qi and D. Sun, “Smoothing functions and a smoothing Newton method for complementarity and variational inequality problems”, AMR 98/24, Applied Mathematics Report, School of Mathematics, The University of New South Wales, October 1998.

[31] L. Qi, D. Sun and G. Zhou, “A new look at smoothing Newton methods for nonlinear complementarity problems and box constrained variational inequalities”, Mathematical Programming, 87 (2000), 1–35.

[32] L. Qi and J. Sun, “A nonsmooth version of Newton’s method”, Mathematical Programming, 58 (1993), 353–367.

[33] S.M. Robinson, “Normal maps induced by linear transformations”, Mathematics of Operations Research, 17 (1992), 691–714.

[34] G. Ravindran and M.S. Gowda, “Regularization of P0-functions in box variational inequality problems”, to appear in SIAM Journal on Optimization.

[35] L. Schwartz, Théorie des Distributions, Hermann, Paris, 1966.

[36] S. Smale, “Algorithms for solving equations”, in Proceedings of the International Congress of Mathematicians, Berkeley, California, 1986, pp. 172–195.

[37] S.L. Sobolev, Some Applications of Analysis in Mathematical Physics, 3rd edition, Nauka, Moscow, 1988.

[38] D. Sun, M. Fukushima and L. Qi, “A computable generalized Hessian of the D-gap function and Newton-type methods for variational inequality problems”, in: M.C. Ferris and J.-S. Pang, eds., Complementarity and Variational Problems: State of the Art, SIAM, Philadelphia, 1997, pp. 452–473.

[39] T.H. Sweetser, “A set-valued strong derivative for vector-valued Lipschitz functions”, Journal of Optimization Theory and Applications, 23 (1977), 549–562.

[40] H. Xu, “Deterministically computable generalized Jacobians and Newton’s methods”, Preprint, School of Information Technology and Mathematical Sciences, University of Ballarat, VIC 3353, Australia, 1998.

[41] H. Xu and X. Chang, “Approximate Newton methods for nonsmooth equations”, Journal of Optimization Theory and Applications, 93 (1997), 373–394.

[42] E.H. Zarantonello, “Projections on convex sets in Hilbert space and spectral theory”, in Contributions to Nonlinear Functional Analysis, E.H. Zarantonello, ed., Academic Press, New York, 1971, pp. 237–424.