Randomized Iterative Methods with Alternating Projections
Total Page:16
File Type:pdf, Size:1020Kb
Randomized Iterative Methods with Alternating Projections Hua Xianga∗ Lin Zhangb† a School of Mathematics and Statistics, Wuhan University, Wuhan, 430072, P.R. China b Institute of Mathematics, Hangzhou Dianzi University, Hangzhou 310018, P.R. China. September 24, 2018 Abstract We use a unified framework to summarize sixteen randomized iterative methods including Kaczmarz method, coordinate descent method, etc. Some new iterative schemes are given as well. Some relationships with mg and ddm are also discussed. We analyze the convergence properties of the iterative schemes by using the matrix integrals associated with alternating projectors, and demonstrate the convergence behaviors of the randomized iterative methods by numerical examples. Keywords. linear systems, randomized iterative methods, Kaczmarz, co- ordinate descent, alternating projection 1 Introduction Solving a linear system is a basic task in scientific computing. Given a real matrix A Rm×n and a real vector b Rm, in this paper we consider the following consistent∈ linear system ∈ Ax = b. (1) For a large scale problem, iterative methods are more suitable than direct meth- ods. Though the classical iterative methods are generally deterministic [1, 3], arXiv:1708.09845v1 [math.NA] 31 Aug 2017 in this paper, we consider randomized iterative methods, for example, Kacz- marz method, the coordinate descent (cd) method and their block variants. Motivated by work in [4], we give a very general framework to such randomized iterative schemes. We need two matrices Y Rm×l and Z Rn×l, where l 6 min m,n , and usually l is much smaller than∈ m and n. In∈ most practical cases,{ only} one of ∗Corresponding author (H. Xiang). E-mail: [email protected]. H. Xiang is supported by the National Natural Science Foundation of China (No.11571265, No.11471253) and NSFC- RGC (No.11661161017). †E-mail: [email protected]. L. Zhang is supported by Natural Science Foundation of Zhe- jiang Province of China (LY17A010027) and National Natural Science Foundation of China (No.11301124). 1 the matrices Y and Z is given, and the other one is determined consequently. Suppose that Y and Z are of full column rank. Let E Y T AZ, which is invertible in most cases, otherwise we use the pseudoinverse≡ in the following. For convenience, we define Ξ := ZE†Y T = Z(Y T AZ)†Y T , where denotes the (Moore-Penrose) pseudoinverse. In this† paper, we consider the following iterative scheme: xk+1 = xk + Ξ(b Axk), (2) − where xk (k = 0, 1, ) is the kth approximate solution, and x0 is the initial guess. This is our general··· framework under consideration. Let T := I ΞA = I Z(Y T AZ)†Y T A. Then (2) can be rewritten as − − xk+1 = T xk + Ξb. (3) Assume that x∗ is the true solution of (1), and define the kth iteration error ek = xk x∗. Then we have the error propagation: − ek+1 = (I ΞA)ek = Tek. (4) − Besides the formulations (2) and (3), the iteration scheme can be viewed in seemingly different but equivalent ways as pointed out in [4]. From an algebraic viewpoint, xk+1 is the solution of Y T Ax = Y T b and x = xk + Zy. From a geometric viewpoint, xk+1 is the random intersect of two affine spaces. That is, xk+1 = (x∗ + Null(Y T A)) (xk + Range(Z)). { We} can easily check that∩ ΞA and T are projectors. If Y or Z is different in each step k, then the scheme is nonstationary and T depends on k, denoted by Tk. Choosing Y or Z properly in each step, we can recover the classical Kaczmarz method, and the coordinate descent method, etc. We will show how to achieve this in Section 2. Furthermore, from the error propagation, we have k+1 0 e = Tk T e , (5) ··· 0 where T0, ,Tk are a series of different alternating projectors applied to the initial iteration··· error e0. We show that under such alternating projections the final iteration error approaches zero in the sense of probability in Section 3. The convergence is proved by using the matrix integrals associated projections, without the assumption on the matrix positive definiteness in [4]. Numerical examples are given in Section 4. 2 Iterative schemes We consider the iterative methods under the unified framework of (2) or (3). Depending on the choice of parameter matrices Y and Z, we divide the iterative schemes into three types for convenience. In the following we always assume that the parameter matrix G is symmetric positive definite (spd). (1) Type-k methods. In this type, Y is given. We set Z = GAT Y , then consequently ΞA = ZE†Y T A = GAT Y (Y T AGAT Y )†Y T A. These are row action methods, and the Kaczmarz algorithm is the typical method of this type. 2 For this type, as pointed in [4], the iterative scheme can be expressed from a sketching viewpoint as k+1 k T T x = arg min x x −1 , s.t., Y Ax = Y b. || − ||G And also it can be rewritten from the optimization viewpoint as follows. k+1 ∗ k x = arg min x x −1 , s.t., x = x + Zy. || − ||G (2) Type-c methods. These are column action methods, and the coordinate descent algorithm is the typical method of this type. In this type, Z is given. we set Y = GAZ, then ΞA = ZE†Y T A = Z(ZT AT GAZ)†ZT AT GA. Similarly, for this type methods we can reexpress the iterative scheme from a sketching viewpoint as k+1 k T T x = arg min x x G, s.t., Y Ax = Y b, || − || b where A is supposed to be of full column rank, and then G := AT GA is spd. And also we can reformulate from the optimization viewpoint b k+1 ∗ k x = arg min x x G, s.t., x = x + Zy. || − || b (3) Type-s methods. This type of iterative schemes is applied to the symmet- ric cases. In this type, we choose Y = Z, then E = ZT AZ,Ξ= Z(ZT AZ)†ZT , and ΞA = Z(ZT AZ)†ZT A. For the spd case, under the settting G = A−1, type-k and type-c methods will be the same, and both reduce to type-s meth- ods. In the following, by choosing specific Y or Z, we recover the classical well- known iteration schemes, and also obtain some new ones. The iteration schemes are summarize in Tables 1 and 2. We will derive and discuss them one by one. 2.1 Kaczmarz and row methods The Kaczmarz method, also known as the algebraic reconstruction technique (art), is a typical row action method, dated back to the Polish mathematician Stefan Kaczmarz in 1937 and reignited by Strohmer and Vershynin [11]. The advantage of Kaczmarz method and the following cd lies in the fact that at a time they only need access to individual rows (or columns) rather than the entire coefficient matrix. Due to its simplicity, it has numerous applications in image reconstruction, signal processing, etc. In the Kaczmarz algorithm, the approximate solution x is projected onto the hyperplane determined by a random row of the matrix A and the respective T T element of the right hand side b. Let A = (α1, , αm). That is, αi is the ··· T T αi x bi ith row of A. The ith component of (1) reads α x = bi, i.e., = (i = i ||αi|| ||αi|| 1, ,m). For an approximation x not in the plane, the distance of this point ··· αT x x to the plane is bi i αi , which is used to update the approximate ||αi|| − ||αi|| ||αi|| solution. Then the iterative scheme reads: T bi αi x x x + − 2 αi, (6) ← αi || || 3 Table 1: Matrices in the randomized iterative methods No. Y Z E = Y T AZ Ξ= ZE†Y T T = I ΞA T T T T T T † T T− T T † T T K1 ei A ei e AA ei A ei(e AA ei) e I A ei(e AA ei) (A ei) i i i − i K2 ω AT ω ωT AAT ω AT ω(ωT AAT ω)†ωT I AT ω(ωT AAT ω)†ωT A T T T T T T † T − T T T † T K3 IR A IR I AA IR A IR(I AA IR) I I A IR(I AA IR) I A R R R − R R K4 Ω AT Ω ΩT AAT Ω AT Ω(ΩT AAT Ω)†ΩT I AT Ω(ΩT AAT Ω)†ΩT A T T T T T T † T − T T T † T K5 IR GA IR IR AGA IR GA IR(IR AGA IR) IR I GA IR(IR AGA IR) IR A K6 Ω GAT Ω ΩT AGAT Ω GAT Ω(ΩT AGAT Ω)†ΩT I − GAT Ω(ΩT AGAT Ω)†ΩT A T T T T † T − T T † T C1 Aej ej e A Aej ej(e A Aej ) (Aej ) I ej (e A Aej) (Aej ) A j j − j C2 Aω ω ωT AT Aω ω(ωT AT Aω)†ωT AT I ω(ωT AT Aω)†ωT AT A T T T T † T T − T T † T T C3 AIC IC I A AIC IC (I A AIC ) I A I IC (I A AIC ) I A A C C C − C C C4 AΩΩ ΩT AT AΩ Ω(ΩT AT AΩ)†ΩT AT I Ω(ΩT AT AΩ)†ΩT AT A T T T T † T T − T T † T T C5 GAIC IC I A GAIC IC (I A GAIC ) I A G I IC (I A GAIC ) I A GA C C C − C C C6 GAΩΩ ΩT AT GAΩ Ω(ΩT AT GAΩ)†ΩT AT G I Ω(ΩT AT GAΩ)†ΩT AT GA T T † T − T † T S1 ei ei ei Aei ei(ei Aei) ei I ei(ei Aei) ei A S2 ω ω ωT Aω ω(ωT Aω)†ωT I − ω(ωT Aω)†ωT A T T † T − T † T S3 IC IC IC AIC IC (IC AIC ) IC I IC (IC AIC ) IC A S4Ω Ω ΩT AΩ Ω(ΩT AΩ)†ΩT I − Ω(ΩT AΩ)†ΩT A − Table 2: Randomized iterative schemes No.