Uzawa-Type and Augmented Lagrangian Methods for Double Saddle Point Systems

Michele Benzi and Fatemeh Panjeh Ali Beik

Michele Benzi: Department of Mathematics and Computer Science, Emory University, Atlanta, Georgia 30322, USA. Current address: Classe di Scienze, Scuola Normale Superiore, Piazza dei Cavalieri 7, 56126 Pisa, Italy.

Fatemeh Panjeh Ali Beik: Department of Mathematics, Vali-e-Asr University of Rafsanjan, PO Box 518, Rafsanjan, Iran.

Abstract

We study different types of stationary iterative methods for solving a class of large, sparse linear systems with double saddle point structure. In particular, we propose a class of Uzawa-like methods including a generalized (block) Gauss–Seidel (GGS) scheme and a generalized (block) successive overrelaxation (GSOR) method. Both schemes rely on a relaxation parameter, and we establish convergence intervals for these parameters. Additionally, we investigate the performance of these methods in combination with an augmented Lagrangian approach. Numerical experiments are reported for test problems from two different applications, a mixed-hybrid discretization of the potential fluid flow problem and finite element modeling of liquid crystal directors. Our results show that fast convergence can be achieved with a suitable choice of parameters.

1 Introduction

Consider the following linear system of equations:

\[
\mathcal{A}u \equiv
\begin{pmatrix} A & B^T & C^T \\ B & 0 & 0 \\ C & 0 & -D \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}
\equiv b, \tag{1}
\]

where $A \in \mathbb{R}^{n \times n}$ is symmetric positive definite (SPD), $B \in \mathbb{R}^{m \times n}$ has full row rank, $C \in \mathbb{R}^{p \times n}$, and the matrix $D \in \mathbb{R}^{p \times p}$ is symmetric positive semidefinite (SPS). In this paper we focus primarily on two cases: $D$ is either SPD or the zero matrix. When $D$ is zero, $C \in \mathbb{R}^{p \times n}$ is assumed to have full row rank. Throughout the paper we assume that $n \ge m + p$.
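The block structure of (1) and the standing assumptions can be illustrated with a small NumPy sketch. The function names `assemble_double_saddle` and `check_assumptions`, the problem sizes, and the random test data are hypothetical choices made only for this illustration; they are not taken from the paper.

```python
import numpy as np

def assemble_double_saddle(A, B, C, D=None):
    """Assemble the 3x3 block matrix of (1); D=None gives a zero (3,3)-block."""
    m, p = B.shape[0], C.shape[0]
    if D is None:
        D = np.zeros((p, p))
    return np.block([
        [A, B.T,              C.T],
        [B, np.zeros((m, m)), np.zeros((m, p))],
        [C, np.zeros((p, m)), -D],
    ])

def check_assumptions(A, B, C):
    """Check that A is SPD, B has full row rank, and n >= m + p."""
    n, m, p = A.shape[0], B.shape[0], C.shape[0]
    is_spd = np.allclose(A, A.T) and np.linalg.eigvalsh(A).min() > 0
    return bool(is_spd and np.linalg.matrix_rank(B) == m and n >= m + p)

# Hypothetical small test problem (random data, assumed for illustration only).
rng = np.random.default_rng(1)
n, m, p = 6, 2, 2
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)          # SPD by construction
B = rng.standard_normal((m, n))
C = rng.standard_normal((p, n))

K = assemble_double_saddle(A, B, C)  # case D = 0
print(K.shape, check_assumptions(A, B, C))
```

Forming the full matrix is only sensible for small dense examples such as this one; for the large sparse systems considered here one would keep the blocks separate.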
Linear systems of this type arise, e.g., from finite element models of liquid crystals (case $D = 0$) and from mixed finite element approximation of potential fluid flow problems (case $D \neq 0$); see [1, 10, 13] and the references therein for detailed descriptions of these problems. The following two propositions give necessary and sufficient conditions for the invertibility of the coefficient matrix $\mathcal{A}$ in (1).

Proposition 1. [1, Proposition 2.3] Let $A$ be SPD and assume that $B$ and $C$ have full row rank. Consider the linear system (1) with $D = 0$. Then $\mathrm{range}(B^T) \cap \mathrm{range}(C^T) = \{0\}$ is a necessary and sufficient condition for the coefficient matrix $\mathcal{A}$ to be invertible.

Proposition 2. [1, Proposition 2.1] Assume that $A$ and $D$ are respectively SPD and SPS matrices. Then the matrix $\mathcal{A}$ is invertible if and only if $B$ has full row rank.

As is well known, stationary iterative schemes for solving $\mathcal{A}u = b$ are uniquely associated with a given splitting $\mathcal{A} = \mathcal{M} - \mathcal{N}$, where $\mathcal{M}$ is nonsingular. More precisely, an iterative scheme produces a sequence of approximate solutions as follows:

\[
u_{k+1} = \mathcal{G}u_k + \mathcal{M}^{-1}b, \qquad k = 0, 1, 2, \ldots, \tag{2}
\]

where $\mathcal{G} = \mathcal{M}^{-1}\mathcal{N}$ and $u_0$ is given. It is well known that (2) is convergent for any initial guess if and only if $\rho(\mathcal{G}) < 1$; see [14].

In practice, stationary methods may fail to converge or converge too slowly. For this reason they are usually combined with acceleration techniques, such as Chebyshev or Krylov subspace methods [14]. These acceleration schemes, while very successful, have some limitations. For instance, the use of Chebyshev acceleration may require spectral information that is not always available, while Krylov acceleration necessitates the computation of an orthonormal basis for the Krylov subspace. For methods like GMRES the latter operation is known to have an adverse impact on the parallel efficiency, especially on emerging multicore and hybrid architectures [7, 17].
On future-generation exascale architectures, resilience is also expected to be an issue with these methods [3, 15]. This realization has spurred renewed interest in classical fixed point iterations of the form (2), which do not require any orthogonalization steps. Alternative acceleration techniques, such as Monte Carlo and Anderson-type acceleration, are currently being investigated by researchers [3, 9, 11, 12, 16].

Acceleration is only needed, of course, if the basic stationary scheme (2) converges slowly. There are, however, situations where fast convergence of (2) can be obtained, for example through the use of suitable relaxation parameters and, in the case of saddle point problems, augmented Lagrangian techniques. In this paper we show that it is possible to have fast convergence of stationary methods for linear systems of the form (1), without the need for Krylov acceleration.

The remainder of this paper is organized as follows. Before ending this section, we introduce some notation that is used throughout the paper. In section 2 we investigate a class of Uzawa-like methods, which can also be interpreted as generalized (block) Gauss–Seidel methods. There we also consider the use of an augmented Lagrangian technique to improve the performance of the iteration. In section 3 we propose the GSOR method to solve (1) in the case that its (3,3)-block is SPD. The convergence properties of the GSOR method are also studied. Illustrative examples are reported in section 4 for test problems appearing in groundwater flow and liquid crystal modeling. Finally, we briefly state our conclusions in section 5.

Notations. For a given arbitrary square matrix $W$, its spectrum is denoted by $\sigma(W)$. If all eigenvalues of $W$ are real, we use $\lambda_{\min}(W)$ and $\lambda_{\max}(W)$ to denote the minimum and maximum eigenvalues of $W$, respectively. Moreover, the notation $\rho(W)$ stands for the spectral radius of $W$.
If $W$ is symmetric positive (semi)definite we write $W \succ 0$ ($W \succeq 0$). Furthermore, for two given matrices $W_1$ and $W_2$, by $W_1 \succ W_2$ ($W_1 \succeq W_2$) we mean $W_1 - W_2 \succ 0$ ($W_1 - W_2 \succeq 0$). For given vectors $x$, $y$ and $z$ of dimensions $n$, $m$ and $p$, $(x; y; z)$ will denote a column vector of dimension $n + m + p$.

2 Uzawa-like iterative schemes

Uzawa's method (see, e.g., [4]) has long been a popular technique for solving saddle point problems. In this section, we investigate possible extensions of Uzawa's method to the double saddle point problem (1). Since this involves a (lower) block triangular splitting of the coefficient matrix, these schemes can also be regarded as a generalization of the classical (block) Gauss–Seidel scheme. To this end, first we split $\mathcal{A}$ as follows:

\[
\mathcal{A} = \mathcal{M}_{GGS} - \mathcal{N}_{GGS}, \tag{3}
\]

where

\[
\mathcal{M}_{GGS} =
\begin{pmatrix} A & 0 & 0 \\ B & -\frac{1}{\alpha}Q & 0 \\ C & 0 & M \end{pmatrix}
\quad \text{and} \quad
\mathcal{N}_{GGS} =
\begin{pmatrix} 0 & -B^T & -C^T \\ 0 & -\frac{1}{\alpha}Q & 0 \\ 0 & 0 & N \end{pmatrix},
\]

in which the parameter $\alpha > 0$ and the matrix $Q \succ 0$ are given, and $D = N - M$ where $M$ is a negative definite matrix.

2.1 Double saddle point problems with zero (3,3)-block

Here we assume that the matrix $D$ in $\mathcal{A}$ is zero. Substituting $M = N$ into the splitting (3), we consider the following iterative method for solving (1),

\[
u_{k+1} = \bar{\mathcal{G}}_{GGS}\, u_k + \mathcal{M}_{GGS}^{-1} b, \qquad k = 0, 1, 2, \ldots, \tag{4}
\]

in which the arbitrary initial guess $u_0$ is given and

\[
\bar{\mathcal{G}}_{GGS} =
\begin{pmatrix}
0 & -A^{-1}B^T & -A^{-1}C^T \\
0 & I - \alpha Q^{-1}S_B & -\alpha Q^{-1}BA^{-1}C^T \\
0 & M^{-1}CA^{-1}B^T & I + M^{-1}S_C
\end{pmatrix}, \tag{5}
\]

where $S_B = BA^{-1}B^T$ and $S_C = CA^{-1}C^T$.

We recall next the following theorem and lemma, which we need to prove the convergence of iterative method (4) under appropriate conditions. The lemma is an immediate consequence of Weyl's Theorem; see [8, Theorem 4.3.1].

Theorem 1. [8, Theorem 7.7.3] Let $A$ and $B$ be two $n \times n$ real symmetric matrices such that $A$ is positive definite and $B$ is positive semidefinite. Then $A \succeq B$ if and only if $\rho(A^{-1}B) \le 1$, and $A \succ B$ if and only if $\rho(A^{-1}B) < 1$.

Lemma 1. Let $A$ and $B$ be two Hermitian matrices.
Then,

\[
\lambda_{\max}(A + B) \le \lambda_{\max}(A) + \lambda_{\max}(B),
\qquad
\lambda_{\min}(A + B) \ge \lambda_{\min}(A) + \lambda_{\min}(B).
\]

Theorem 2. Let $A \succ 0$, $Q \succ 0$ and $M \prec 0$. Assume that the matrices $B$ and $C$ have full row rank and that $\mathrm{range}(B^T) \cap \mathrm{range}(C^T) = \{0\}$. If $-M \succeq CA^{-1}C^T$ and

\[
0 < \alpha \le \frac{1}{\lambda_{\max}(Q^{-1}S_B)}, \tag{6}
\]

the iterative scheme (4) converges to the solution of (1).

Proof. Let $\lambda \in \sigma(\bar{\mathcal{G}}_{GGS})$ and let $(x; y; z)$ be a corresponding eigenvector, which is equivalent to saying that

\[
-B^T y - C^T z = \lambda A x, \tag{7}
\]
\[
-\frac{1}{\alpha} Q y = \lambda \left( Bx - \frac{1}{\alpha} Q y \right), \tag{8}
\]
\[
M z = \lambda (Cx + Mz). \tag{9}
\]

First we observe that $\lambda \neq 1$. Otherwise, from (8) and (9), we respectively conclude that $Bx = 0$ and $Cx = 0$, which together with (7) and the positive definiteness of $A$ imply that $x = 0$. Now using the assumption $\mathrm{range}(B^T) \cap \mathrm{range}(C^T) = \{0\}$, we can deduce that $y$ and $z$ are both zero vectors. Consequently, it must be that $(x; y; z) = (0; 0; 0)$, which contradicts our assumption that $(x; y; z)$ is an eigenvector.

Since $\lambda \neq 1$, from (8) and (9) we have

\[
y = \frac{\lambda}{\lambda - 1}\, \alpha Q^{-1} B x
\quad \text{and} \quad
z = \frac{\lambda}{1 - \lambda}\, M^{-1} C x.
\]

We observe that $x$ cannot be zero. Substituting $y$ and $z$ from the above relation into (7), it can be found that $\lambda$ is either zero or it satisfies the following relation:

\[
1 - \lambda = \alpha \tilde{p} - \tilde{q}, \tag{10}
\]

where

\[
\tilde{p} = \frac{x^* B^T Q^{-1} B x}{x^* A x}
\quad \text{and} \quad
\tilde{q} = \frac{x^* C^T M^{-1} C x}{x^* A x}.
\]
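As a numerical illustration of iteration (4) and of the hypotheses of Theorem 2, the following NumPy sketch applies the splitting (3) with $M = N$ to a small random problem. Row by row, $\mathcal{M}_{GGS} u_{k+1} = \mathcal{N}_{GGS} u_k + b$ reads $A x_{k+1} = b_1 - B^T y_k - C^T z_k$, $y_{k+1} = y_k + \alpha Q^{-1}(B x_{k+1} - b_2)$ and $z_{k+1} = z_k - M^{-1}(C x_{k+1} - b_3)$, which is the form implemented below. The function names `ggs_iteration_matrix` and `ggs_solve`, the problem sizes, the random data, and the particular choices $Q = I$ and $M = -(S_C + I)$ are assumptions made only for this sketch and are not taken from the paper.

```python
import numpy as np

def ggs_iteration_matrix(A, B, C, Q, M, alpha):
    """Iteration matrix of (4): solve M_GGS * G = N_GGS with N = M (D = 0)."""
    n, m, p = A.shape[0], B.shape[0], C.shape[0]
    Z = np.zeros
    M_ggs = np.block([[A, Z((n, m)),  Z((n, p))],
                      [B, -Q / alpha, Z((m, p))],
                      [C, Z((p, m)),  M]])
    N_ggs = np.block([[Z((n, n)), -B.T,       -C.T],
                      [Z((m, n)), -Q / alpha, Z((m, p))],
                      [Z((p, n)), Z((p, m)),  M]])
    return np.linalg.solve(M_ggs, N_ggs)

def ggs_solve(A, B, C, Q, M, alpha, b1, b2, b3, tol=1e-10, max_iter=50000):
    """Block forward-substitution form of (4); returns (x, y, z, iterations)."""
    y = np.zeros(B.shape[0])
    z = np.zeros(C.shape[0])
    bnorm = np.linalg.norm(np.concatenate([b1, b2, b3]))
    for k in range(1, max_iter + 1):
        x = np.linalg.solve(A, b1 - B.T @ y - C.T @ z)
        r2 = B @ x - b2
        r3 = C @ x - b3
        # With x just computed from (y, z), the first block row of (1) holds
        # up to rounding, so the residual norm reduces to that of (r2, r3).
        if np.hypot(np.linalg.norm(r2), np.linalg.norm(r3)) <= tol * bnorm:
            return x, y, z, k
        y = y + alpha * np.linalg.solve(Q, r2)
        z = z - np.linalg.solve(M, r3)
    raise RuntimeError("GGS iteration did not reach the tolerance")

# Hypothetical test problem (random data, assumed for illustration only).
rng = np.random.default_rng(0)
n, m, p = 20, 3, 2
G = rng.standard_normal((n, n))
A = np.eye(n) + G @ G.T / n                  # SPD by construction
B = rng.standard_normal((m, n))
C = rng.standard_normal((p, n))
b1, b2, b3 = rng.standard_normal(n), rng.standard_normal(m), rng.standard_normal(p)

S_B = B @ np.linalg.solve(A, B.T)
S_C = C @ np.linalg.solve(A, C.T)
Q = np.eye(m)                                # assumed choice Q = I
alpha = 1.0 / np.linalg.eigvalsh(S_B).max()  # upper end of interval (6) for Q = I
M = -(S_C + np.eye(p))                       # assumed choice, so -M - S_C = I

rho = np.abs(np.linalg.eigvals(ggs_iteration_matrix(A, B, C, Q, M, alpha))).max()
x, y, z, iters = ggs_solve(A, B, C, Q, M, alpha, b1, b2, b3)
print(f"spectral radius = {rho:.4f}, iterations = {iters}")
```

The dense solves with $A$, $Q$ and $M$ stand in for whatever sparse or approximate solvers would be used in practice, and forming the iteration matrix explicitly is done here only to inspect its spectral radius on a small example.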
