ABS Algorithms for Linear Equations and Optimization
Journal of Computational and Applied Mathematics 124 (2000) 155–170
www.elsevier.nl/locate/cam

Emilio Spedicato (a,*), Zunquan Xia (b), Liwei Zhang (b)
(a) Department of Mathematics, University of Bergamo, 24129 Bergamo, Italy
(b) Department of Applied Mathematics, Dalian University of Technology, Dalian, China
* Corresponding author. E-mail address: [email protected] (E. Spedicato).

Received 20 May 1999; received in revised form 18 November 1999

Abstract

In this paper we review basic properties and the main achievements obtained by the class of ABS methods, developed since 1981 to solve linear and nonlinear algebraic equations and optimization problems. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Linear equations; Optimization; ABS methods; Quasi-Newton methods; Linear programming; Feasible direction methods; KT equations; Interior point methods

1. The scaled ABS class: general properties

ABS methods were introduced by Abaffy et al. [2] in a paper considering the solution of linear equations via what is now called the basic or unscaled ABS class. The basic ABS class was later generalized to the so-called scaled ABS class and subsequently applied to linear least squares, nonlinear equations and optimization problems, as described by Abaffy and Spedicato [4] and Zhang et al. [43]. Recent work has also extended ABS methods to the solution of Diophantine equations, with possible applications to combinatorial optimization. There are presently over 350 papers in the ABS field; see the bibliography by Spedicato and Nicolai [24].

In this paper we review the basic properties of ABS methods for solving linear determined or underdetermined systems and some of their applications to optimization problems. Let us consider the general linear (determined or underdetermined) system, where rank(A) is arbitrary,

    Ax = b,  x ∈ R^n,  b ∈ R^m,  m ≤ n,                                   (1)

or

    a_i^T x − b_i = 0,  i = 1, ..., m,                                     (2)

where

    A = (a_1, ..., a_m)^T.                                                 (3)

The steps of the scaled ABS class algorithms are the following:

(A) Let x_1 ∈ R^n be arbitrary, H_1 ∈ R^{n,n} be arbitrary nonsingular, and v_1 be an arbitrary nonzero vector in R^m. Set i = 1.

(B) Compute the residual r_i = Ax_i − b. If r_i = 0, stop (x_i solves the problem). Otherwise compute s_i = H_i A^T v_i. If s_i ≠ 0, go to (C). If s_i = 0 and v_i^T r_i = 0, set x_{i+1} = x_i, H_{i+1} = H_i and go to (F). Otherwise stop (the system has no solution).

(C) Compute the search vector p_i by

    p_i = H_i^T z_i,                                                       (4)

    where z_i ∈ R^n is arbitrary save for the condition

    v_i^T A H_i^T z_i ≠ 0.                                                 (5)

(D) Update the estimate of the solution by

    x_{i+1} = x_i − α_i p_i,                                               (6)

    where the stepsize α_i is given by

    α_i = v_i^T r_i / v_i^T A p_i.                                         (7)

(E) Update the matrix H_i by

    H_{i+1} = H_i − H_i A^T v_i w_i^T H_i / w_i^T H_i A^T v_i,             (8)

    where w_i ∈ R^n is arbitrary save for the condition

    w_i^T H_i A^T v_i ≠ 0.                                                 (9)

(F) If i = m, stop (x_{m+1} solves the system). Otherwise define v_{i+1} as an arbitrary vector in R^m, linearly independent of v_1, ..., v_i. Increment i by one and go to (B).

The matrices H_i appearing in step (E) are generalizations of (oblique) projection matrices. They probably first appeared in a book by Wedderburn [34].
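To make steps (A)-(F) concrete, the following NumPy sketch runs the recursion with one simple admissible parameter selection (H_1 = I, v_i = e_i as in the basic class, and z_i = w_i = A^T v_i); the function name and these particular choices are illustrative assumptions, not prescribed by the text.

```python
import numpy as np

def abs_solve(A, b, tol=1e-12):
    """Minimal sketch of the scaled ABS iteration, steps (A)-(F).

    Parameter choices (H_1 = I, v_i = e_i as in the basic class,
    z_i = w_i = A^T v_i) are one admissible selection, used only for
    illustration.  A is m x n with m <= n.
    """
    m, n = A.shape
    x = np.zeros(n)                        # step (A): x_1 arbitrary, taken as 0
    H = np.eye(n)                          # step (A): H_1 arbitrary nonsingular, taken as I
    for i in range(m):
        v = np.zeros(m); v[i] = 1.0        # v_i = e_i (basic/unscaled subclass)
        r = A @ x - b                      # step (B): residual r_i
        s = H @ (A.T @ v)                  # s_i = H_i A^T v_i
        if np.linalg.norm(s) < tol:
            if abs(v @ r) < tol:           # redundant equation: keep x_i and H_i
                continue
            raise ValueError("system has no solution")
        z = w = A.T @ v                    # z_i = w_i, a common choice
        p = H.T @ z                        # step (C): search vector, eq. (4)
        alpha = (v @ r) / (v @ (A @ p))    # step (D): stepsize, eq. (7)
        x = x - alpha * p                  # eq. (6)
        H = H - np.outer(s, H.T @ w) / (w @ s)   # step (E): Abaffian update, eq. (8)
    return x

# Usage: a full-rank underdetermined system; the residual should vanish.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))
b = rng.standard_normal(3)
x = abs_solve(A, b)
print(np.allclose(A @ x, b))
```

Other members of the class are obtained by changing how H_1, v_i, z_i and w_i are generated at each step; only the update formulas (4)-(9) are fixed.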
The matrices H_i have been named Abaffians since the First International Conference on ABS Methods (Luoyang, China, 1991) and this name will be used here. It should be noted that there are several alternative formulations of the linear algebra of the above process, using, e.g., vectors instead of the square matrix H_i (see [4]). Such alternative formulations differ in storage requirement and arithmetic cost, and the choice of the most convenient formulation may depend on the relative values of m and n.

The above recursion defines a class of algorithms, each particular method being determined by the choice of the parameters H_1, v_i, z_i, w_i. The basic ABS class is obtained by taking v_i = e_i, e_i being the ith unit vector in R^m. The parameters w_i, z_i, H_1 have been introduced, respectively, by Abaffy, Broyden and Spedicato, whose initials form the name of the class.

It is possible to show that the scaled ABS class is a complete realization of the so-called Petrov–Galerkin iteration for solving a linear system, where the iteration has the form x_{i+1} = x_i − α_i p_i with α_i, p_i chosen so that r_{i+1}^T v_j = 0, j = 1, ..., i, holds, the vectors v_j being arbitrary and linearly independent. It appears that all deterministic algorithms in the literature having finite termination on a linear system are members of the scaled ABS class. Moreover, the quasi-Newton methods of the Broyden class, which (under some conditions) are known to terminate in at most 2n steps, can be embedded in the ABS class: the iterate of index 2i − 1 generated by Broyden's iteration corresponds to the ith iterate of a certain algorithm in the scaled ABS class.

Referring to the monograph of Abaffy and Spedicato [4] for proofs, we give some properties of methods of the scaled ABS class, assuming, for simplicity, that A has full rank:

• Define V_i = (v_1, ..., v_i) and W_i = (w_1, ..., w_i). Then H_{i+1} A^T V_i = 0 and H_{i+1}^T W_i = 0, meaning that the vectors A^T v_j and w_j, j = 1, ..., i, span the null spaces of H_{i+1} and of its transpose, respectively.

• The vectors H_i A^T v_i and H_i^T w_i are nonzero if and only if a_i and w_i are linearly independent of a_1, ..., a_{i−1} and of w_1, ..., w_{i−1}, respectively.

• Define P_i = (p_1, ..., p_i). Then the implicit factorization V_i^T A P_i = L_i holds, where L_i is nonsingular lower triangular. From this relation, if m = n, one obtains the following semiexplicit factorization of the inverse, with P = P_n, V = V_n, L = L_n:

    A^{-1} = P L^{-1} V^T.                                                 (10)

For several choices of the matrix V the matrix L is diagonal, hence formula (10) gives a fully explicit factorization of the inverse as a byproduct of the ABS solution of a linear system, a property that does not hold for the classical solvers. It can also be shown that all possible factorizations of form (10) can be obtained by proper parameter choices in the scaled ABS class, another completeness result.

• Define S_i and R_i by S_i = (s_1, ..., s_i) and R_i = (r_1, ..., r_i), where s_i = H_i A^T v_i and r_i = H_i^T w_i. Then the Abaffian can be written in the form H_{i+1} = H_1 − S_i R_i^T, and the vectors s_i, r_i can be built via a Gram–Schmidt-type iteration involving the previous vectors (the search vector p_i can be built in a similar way). This representation of the Abaffian in terms of 2i vectors is computationally convenient when the number of equations is much less than the number of variables. Notice that there is also a representation in terms of n − i vectors.
• A compact formula for the Abaffian in terms of the parameter matrices is the following:

    H_{i+1} = H_1 − H_1 A^T V_i (W_i^T H_1 A^T V_i)^{-1} W_i^T H_1.        (11)

Letting V = V_m and W = W_m, one can show that the parameter matrices H_1, V, W are admissible (i.e., are such that condition (9) is satisfied) if and only if the matrix Q = V^T A H_1^T W is strongly nonsingular (i.e., is LU factorizable). Notice that this condition can always be satisfied by suitable exchanges of the columns of V or W, equivalent to a row or a column pivoting on the matrix Q. If Q is strongly nonsingular and we take, as is done in all algorithms so far considered, z_i = w_i, then condition (5) is also satisfied.

It can be shown that the scaled ABS class corresponds to applying (implicitly) the unscaled ABS algorithm to the scaled (or preconditioned) system V^T A x = V^T b, where V is an arbitrary nonsingular matrix of order m. Therefore, the scaled ABS class is also complete with respect to all possible left preconditioning matrices, which in the ABS context are defined implicitly and dynamically (only the ith column of V is needed at the ith iteration, and it can also be a function of the previous column choices).

2. Subclasses of the ABS class

In the monograph of Abaffy and Spedicato [4] nine subclasses of the scaled ABS class are considered. Here we mention only two of them.

(a) The conjugate direction subclass. This class is obtained by setting v_i = p_i. It is well defined under the condition (sufficient but not necessary) that A is symmetric and positive definite. It contains the implicit Choleski algorithm and the ABS versions of the Hestenes–Stiefel and Lanczos algorithms (i.e., algorithms that generate the same sequence x_i as these classical methods in exact arithmetic). This class generates all possible algorithms whose search directions are A-conjugate. The vector x_{i+1} minimizes the energy or A-weighted Euclidean norm of the error over x_1 + Span(p_1, ..., p_i). If x_1 = 0, then the solution is approached monotonically from below in the energy norm.
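As a concrete check of the A-conjugacy property, the sketch below (assuming a symmetric positive definite A, with H_1 = I and z_i = w_i = e_i as illustrative choices that are not fixed by the text) runs the subclass with v_i = p_i and verifies that P^T A P is diagonal and that the final iterate solves the system.

```python
import numpy as np

def abs_conjugate_directions(A, b):
    """Sketch of the conjugate direction subclass: v_i = p_i.

    Assumes A is symmetric positive definite.  H_1 = I and z_i = w_i = e_i
    are illustrative parameter choices.  Returns the computed solution and
    the matrix P whose columns are the search directions.
    """
    n = len(b)
    x = np.zeros(n)
    H = np.eye(n)
    P = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n); e[i] = 1.0
        p = H.T @ e                        # eq. (4) with z_i = e_i
        v = p                              # defining choice of the subclass
        r = A @ x - b
        x = x - (v @ r) / (v @ (A @ p)) * p      # eqs. (6)-(7)
        s = H @ (A.T @ v)                  # s_i = H_i A^T v_i
        H = H - np.outer(s, H.T @ e) / (e @ s)   # eq. (8) with w_i = e_i
        P[:, i] = p
    return x, P

# Usage: P^T A P should be diagonal (the search directions are A-conjugate).
rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5.0 * np.eye(5)              # symmetric positive definite test matrix
b = rng.standard_normal(5)
x, P = abs_conjugate_directions(A, b)
G = P.T @ A @ P
print(np.allclose(A @ x, b), np.allclose(G, np.diag(np.diag(G))))
```

Different choices of z_i and w_i yield the other members of the subclass mentioned above, such as the ABS versions of the Hestenes–Stiefel and Lanczos algorithms.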