MM Research Preprints, No. 24, pp. 170–182. KLMM, AMSS, Academia Sinica, December 2005

Fast Implementation of Low Rank Approximation of a Sylvester Matrix 1)

Bingyu Li, Zhuojun Liu and Lihong Zhi
Key Laboratory of Mathematics Mechanization, AMSS, Beijing 100080, China
{liby,zliu,lzhi}@mmrc.iss.ac.cn

Abstract. In [16], the authors described an algorithm based on Structured Total Least Norm (STLN) for constructing a Sylvester matrix of given lower rank and obtaining the nearest perturbed polynomials with an exact GCD of given degree. For their algorithm, the overall computation time depends on solving a sequence of least squares (LS) problems. In this paper, a fast implementation for solving these LS problems is proposed. The increased efficiency is obtained by exploiting the low displacement rank of the involved coefficient matrices.

1. Introduction

The authors of [16] described an algorithm based on STLN [23, 21] for constructing a Sylvester matrix of given lower rank and obtaining the nearest perturbed polynomials with an exact GCD of given degree. For their algorithm, the overall computation time depends on solving a sequence of least squares (LS) problems, and the algorithm has a complexity of $O(st^2)$, where $s = 2m+2n-k+3$ and $t = 2m+2n-2k+3$. Motivated by previous work [8, 27, 17] on the displacement structure of the Sylvester matrix, in this paper a fast implementation for solving these LS problems is proposed; it has quadratic complexity $O(s^2)$. The increased efficiency is obtained by exploiting the low displacement rank of the involved coefficient matrices. The fast implementation is more efficient especially when $k$ is not large.

The organization of this paper is as follows. In Section 2, we recall some results on the computation of the nearest perturbed polynomials described in [16]. In the following two sections, we propose a fast algorithm for solving the involved LS problems efficiently and derive a forward error analysis, respectively. In Section 5 we give numerical tests showing the performance of the fast algorithm. We give some concluding remarks in the last section.

2. Preliminaries

Given two polynomials $a, b \in \mathbb{R}[x]$ with $a = a_m x^m + \cdots + a_1 x + a_0$ and $b = b_n x^n + \cdots + b_1 x + b_0$, $a_m \neq 0$, $b_n \neq 0$, let $S$ be the Sylvester matrix of $a$ and $b$. The perturbations of $a$ and $b$ are denoted by
$$\Delta a = \Delta a_m x^m + \cdots + \Delta a_1 x + \Delta a_0, \qquad \Delta b = \Delta b_n x^n + \cdots + \Delta b_1 x + \Delta b_0,$$
respectively. We consider the minimal perturbation problem: minimize $\|\Delta a\|_2^2 + \|\Delta b\|_2^2$ subject to the constraint that $a + \Delta a$ and $b + \Delta b$ have an exact GCD of a given degree.

1) The work is partially supported by a National Key Basic Research Project of China 2004CB318000 and the Chinese National Science Foundation under Grants 10371127 and 10401035.

Denote by $S_k = [\mathbf{a}\;\; A_k] \in \mathbb{R}^{(m+n-k+1)\times(m+n-2k+2)}$ the $k$-th Sylvester matrix,
$$S_k = \begin{bmatrix}
a_m & 0 & \cdots & 0 & 0 & b_n & 0 & \cdots & 0 & 0 \\
a_{m-1} & a_m & \cdots & 0 & 0 & b_{n-1} & b_n & \cdots & 0 & 0 \\
\vdots & \vdots & \cdots & \vdots & \vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
0 & 0 & \cdots & a_0 & a_1 & 0 & 0 & \cdots & b_0 & b_1 \\
0 & 0 & \cdots & 0 & a_0 & 0 & 0 & \cdots & 0 & b_0
\end{bmatrix}, \qquad (2.1)$$
where the first block consists of $n-k+1$ columns built from $a$ and the second block of $m-k+1$ columns built from $b$; $\mathbf{a}$ is the first column of $S_k$ and $A_k$ consists of the last $m+n-2k+1$ columns of $S_k$. The perturbations $\Delta a$ and $\Delta b$ are expressed by an $(m+n+2)$-dimensional vector $d$,

$$d = [d_1, d_2, \ldots, d_{m+n+1}, d_{m+n+2}]^T.$$
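For concreteness, the block pattern of (2.1) is easy to assemble. Below is a minimal numpy sketch (the helper name sylvester_k is ours, not from [16]), with coefficient vectors given highest degree first:

```python
import numpy as np

def sylvester_k(a, b, k):
    """Assemble the k-th Sylvester matrix S_k of (2.1).

    a = [a_m, ..., a_1, a_0], b = [b_n, ..., b_1, b_0] (highest degree
    first).  S_k stacks n-k+1 shifted copies of a next to m-k+1 shifted
    copies of b and has shape (m+n-k+1) x (m+n-2k+2).
    """
    m, n = len(a) - 1, len(b) - 1
    S = np.zeros((m + n - k + 1, m + n - 2 * k + 2))
    for j in range(n - k + 1):            # columns built from a
        S[j:j + m + 1, j] = a
    for j in range(m - k + 1):            # columns built from b
        S[j:j + n + 1, n - k + 1 + j] = b
    return S

# k = 1 recovers the classical (m+n) x (m+n) Sylvester matrix S(a, b),
# e.g. for a = x^2 - 1 and b = x^2 + 2x + 1:
S1 = sylvester_k(np.array([1., 0., -1.]), np.array([1., 2., 1.]), 1)
```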

The $k$-th Sylvester structured perturbation of $S_k$ is represented as $[\Delta\mathbf{a}\;\; D_k]$.

Theorem 1. Given univariate polynomials $a(x), b(x) \in \mathbb{R}[x]$ with $\deg(a) = m$ and $\deg(b) = n$, let $S(a,b)$ be the Sylvester matrix of $a(x)$ and $b(x)$, and let $S_k$ be the $k$-th Sylvester matrix, $1 \le k \le \min(m,n)$. Then the following statements are equivalent:

(1) $\mathrm{rank}(S) \le m+n-k$.

(2) The rank deficiency of $S_k$ is greater than or equal to one.

Theorem 2 [16]. Given univariate polynomials $a(x), b(x) \in \mathbb{R}[x]$ with $\deg(a) = m$, $\deg(b) = n$, and a positive integer $k \le \min(m,n)$, suppose $S_k$ is the $k$-th Sylvester matrix of $a$ and $b$. Partition $S_k = [\mathbf{a}\;\; A_k]$, where $\mathbf{a}$ is the first column of $S_k$ and $A_k$ consists of the last $m+n-2k+1$ columns of $S_k$. Then we have
$$\dim \mathrm{Nullspace}(S_k) \ge 1 \iff A_k x = \mathbf{a} \text{ has a solution.}$$

Theorem 3 [16]. Given integers $m$, $n$ and $k$, $k \le \min(m,n)$, there exists a Sylvester matrix $S \in \mathbb{R}^{(m+n)\times(m+n)}$ with rank $m+n-k$.

Based on the above theorems, for a given degree $k$ it is always possible to find a $k$-th Sylvester structured perturbation matrix $[\Delta\mathbf{a}\;\; D_k]$ such that $\mathbf{a} + \Delta\mathbf{a} \in \mathrm{Range}(A_k + D_k)$. The minimal perturbation problem can be formulated as the following equality-constrained least squares problem:

$$\min_x \|d\|_2, \quad \text{subject to } r = 0, \qquad (2.2)$$
where the structured residual $r$ is given by

$$r = \mathbf{a} + \Delta\mathbf{a} - (A_k + D_k)x.$$

Applying the penalty method [1], we transform (2.2) into
$$\min_{d,x}\left\|\begin{bmatrix} wr \\ d \end{bmatrix}\right\|_2, \qquad (2.3)$$
where $w$ is a large penalty value. The vectors $\Delta\mathbf{a}$ and $D_k x$ can be expressed as

$$\Delta\mathbf{a} = P_k d, \qquad D_k x = X_k d.$$

Here
$$P_k = \begin{bmatrix} I_{m+1} & 0 \\ 0 & 0 \end{bmatrix},$$
where $I_{m+1}$ is an identity matrix of order $m+1$, and
$$X_k = \begin{bmatrix}
0 & & & x_{n+1-k} & & \\
x_1 & \ddots & & x_{n+2-k} & \ddots & \\
\vdots & \ddots & 0 & \vdots & \ddots & x_{n+1-k} \\
x_{n-k} & & x_1 & x_{m+n+1-2k} & & x_{n+2-k} \\
& \ddots & \vdots & & \ddots & \vdots \\
& & x_{n-k} & & & x_{m+n+1-2k}
\end{bmatrix},$$
where the first block consists of $m+1$ columns and the second block of $n+1$ columns.
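Rather than transcribing the entry pattern of $X_k$ directly, one can exploit its defining property $D_k x = X_k d$: the structure map $d \mapsto [\Delta\mathbf{a}\;\; D_k]$ is linear in $d$, so the $j$-th column of $X_k$ equals $D_k(e_j)\,x$. A hypothetical sketch (reusing sylvester_k from above, and assuming $d$ stacks the coefficients of $\Delta a$ and $\Delta b$ highest degree first):

```python
import numpy as np

def Xk_from_x(x, m, n, k):
    """Assemble X_k of D_k x = X_k d, column by column.

    x is the current solution vector of length m+n-2k+1.  Column j of
    X_k is D_k(e_j) x, where [Delta-a  D_k] is the k-th Sylvester
    structured matrix built from the unit vector e_j.
    """
    rows, cols = m + n - k + 1, m + n + 2
    Xk = np.zeros((rows, cols))
    for j in range(cols):
        e = np.zeros(cols)
        e[j] = 1.0
        # structured matrix [Delta-a  D_k] from e_j; drop first column
        Dk = sylvester_k(e[:m + 1], e[m + 1:], k)[:, 1:]
        Xk[:, j] = Dk @ x
    return Xk
```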

Then (2.3) becomes the following least squares problem:
$$\min_{\Delta d,\,\Delta x}\left\|\begin{bmatrix} w(X_k - P_k) & w(A_k + D_k) \\ I_{m+n+2} & 0 \end{bmatrix}\begin{bmatrix} \Delta d \\ \Delta x \end{bmatrix} + \begin{bmatrix} -wr \\ d \end{bmatrix}\right\|_2, \qquad (2.4)$$
where $I_{m+n+2}$ is an identity matrix of order $m+n+2$. Thus, an iterative algorithm was derived (see [16]) for solving the minimal perturbation problem proposed above. Let us denote the coefficient matrix of the system in (2.4) by $M$,
$$M = \begin{bmatrix} w(X_k - P_k) & w(A_k + D_k) \\ I_{m+n+2} & 0 \end{bmatrix},$$
and denote $y = \begin{bmatrix} \Delta d \\ \Delta x \end{bmatrix}$, $z = \begin{bmatrix} wr \\ -d \end{bmatrix}$; the least squares problem (2.4) can then be rewritten as

$$\min_y \|My - z\|_2. \qquad (2.5)$$
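Before turning to the fast solver, note that (2.5) can always be solved densely in $O(st^2)$ flops; this is the baseline that the structured implementation of Section 3 replaces, and a useful cross-check. A sketch (the function name is ours; the inputs are the quantities of the current iteration):

```python
import numpy as np

def ls_step_dense(Xk, Pk, Ak, Dk, r, d, w):
    """Dense O(s t^2) solve of (2.5), min_y ||M y - z||_2.

    Returns y = [Delta-d; Delta-x] together with M and z, so the
    structured solver can be checked against it.
    """
    p = Pk.shape[1]                                   # p = m + n + 2
    M = np.block([[w * (Xk - Pk), w * (Ak + Dk)],
                  [np.eye(p), np.zeros((p, Ak.shape[1]))]])
    z = np.concatenate([w * r, -d])
    y = np.linalg.lstsq(M, z, rcond=None)[0]
    return y, M, z
```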

3. Fast implementation of the LS problem (2.5)

In this section we propose a fast implementation for solving the LS problem (2.5). The increased efficiency is obtained by exploiting the low displacement rank of the involved coefficient matrices.

3.1. Displacement structure

Hereafter, $I_i$ and $Z_i$ denote the $i \times i$ identity matrix and the $i \times i$ lower shift matrix, respectively, where $i$ is an integer. The displacement of an $n \times n$ symmetric matrix $R$ is defined as:

$$\nabla R = R - FRF^T,$$
where $F$ is an $n \times n$ lower-triangular matrix. The choice of $F$ depends on the matrix $R$; e.g., if $R$ is a Toeplitz matrix, $F$ is chosen equal to the lower shift matrix $Z_n$. If $\nabla R$ has a low rank $r$ ($\ll n$) independent of $n$, the size of $R$, then $r$ is referred to as the displacement rank of $R$.
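As a quick illustration of the definition, a generic Toeplitz matrix has displacement rank 2 with $F = Z_n$; the following numpy check (ours, not from the paper) makes this concrete:

```python
import numpy as np
from scipy.linalg import toeplitz

n = 8
Zn = np.diag(np.ones(n - 1), -1)          # lower shift matrix Z_n
R = toeplitz(np.random.randn(n), np.random.randn(n))
nabla_R = R - Zn @ R @ Zn.T               # displacement of R
print(np.linalg.matrix_rank(nabla_R))     # 2 for a generic Toeplitz matrix
```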

It follows that $\nabla R$ can be factored as
$$\nabla R = GJG^T, \qquad (3.1)$$
where $G$ is an $n \times r$ matrix and $J$ is a signature matrix of the form
$$J = \begin{bmatrix} I_p & 0 \\ 0 & -I_q \end{bmatrix}, \qquad p + q = r.$$
The integers $p, q$ denote the numbers of positive and negative eigenvalues of $\nabla R$, respectively. The factorization (3.1) is nonunique: if $G$ satisfies (3.1), then $G\Theta$ also satisfies (3.1) for any $J$-unitary matrix $\Theta$, i.e., for any $\Theta$ such that $\Theta J \Theta^T = J$. The pair $(G, J)$ is called a generator pair.

The $LDL^T$ factorization of a strongly regular matrix $R$ (i.e., all its leading principal submatrices are nonsingular) can be carried out efficiently by a generalized Schur algorithm [15], which operates on the generator pair $(G, J)$ directly and costs $O(rn^2)$. The number of iterations of the generalized Schur algorithm equals the size of $R$. Let $G_1 = G$ and denote by $G_i$ the generator matrix at the beginning of the $i$-th iteration. A $J$-unitary matrix $\Theta_i$ is chosen such that $G_i' = G_i\Theta_i$ is in proper form, i.e., the top row of $G_i'$ has a single nonzero entry. More precisely, denote by $f_i$ the top row of $G_i$. The nonzero entry is in the first column if $f_i J f_i^T > 0$ (positive step); it is in the last column if $f_i J f_i^T < 0$ (negative step). The case $f_i J f_i^T = 0$ is ruled out by the strong regularity of $R$. Denote an index by $j$,
$$j = \begin{cases} 1 & \text{for a positive step;} \\ r & \text{for a negative step.} \end{cases}$$
Denote by $l_i = \begin{bmatrix} 0 \\ \bar l_i \end{bmatrix}$ the $i$-th column of $L$; then $\bar l_i$ is the $j$-th column of $G_i'$. If $f_i J f_i^T > 0$ we set $D(i,i) = 1$; if $f_i J f_i^T < 0$ we set $D(i,i) = -1$. The next generator matrix $G_{i+1}$ is the nonzero part of $\begin{bmatrix} 0 \\ G_{i+1} \end{bmatrix}$, which is formed by multiplying the $j$-th column of $G_i'$ by $F_i$, the submatrix obtained by deleting the first $i-1$ rows and columns of $F$, and keeping the other columns of $G_i'$ unchanged. To reduce the generator matrix $G_i$ to proper form, the $J$-unitary $\Theta_i$ can be formed as a product of a sequence of orthogonal transformations and a hyperbolic rotation. The implementation of the hyperbolic rotation needs care; it can be done along the lines of [5, 4]. A more extensive description of the generalized Schur algorithm can be found in [15].

The notion of displacement rank can also be extended to nonsymmetric matrices. The following result was given in [18]:

Theorem 4. $M$ is a structured matrix of displacement rank at most 4. With respect to the shift-block matrices
$$F_1 = \mathrm{diag}(Z_{m+n-k+1}, Z_{m+n+2}), \qquad F_2 = \mathrm{diag}(Z_{m+1}, Z_{n+1}, Z_{n-k}, Z_{m-k+1}),$$
we have
$$M - F_1 M F_2^T = [u_1, u_2, u_3, u_4]\,[e_1, e_{m+2}, e_{m+n+3}, e_{2n+m-k+3}]^T,$$
where $e_i$ denotes the $i$-th column of the identity matrix $I_{2m+2n-2k+3}$, and

$$u_1 = M(:,1), \quad u_2 = M(:, m+2), \quad u_3 = M(:, m+n+3), \quad u_4 = M(:, 2n+m-k+3).$$
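Theorem 4 is easy to confirm numerically. The sketch below assumes $m$, $n$, $k$ and a matrix M assembled as in Section 2 (e.g. by ls_step_dense above), with $k < n$ so every shift block is nonempty; the helper names are ours:

```python
import numpy as np
from scipy.linalg import block_diag

def Z(i):
    """i x i lower shift matrix Z_i (i >= 1)."""
    return np.diag(np.ones(i - 1), -1)

def block_shift(sizes):
    """Block-diagonal matrix diag(Z_{i_1}, Z_{i_2}, ...)."""
    return block_diag(*[Z(i) for i in sizes])

# m, n, k and M assumed bound from the setup above.
F1 = block_shift([m + n - k + 1, m + n + 2])
F2 = block_shift([m + 1, n + 1, n - k, m - k + 1])
print(np.linalg.matrix_rank(M - F1 @ M @ F2.T))   # at most 4
```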

3.2. Description of a fast solver for the LS problem (2.5)

Fast algorithms based on QR decomposition for solving least squares problems whose coefficient matrices are Toeplitz have been developed in many works, such as [9, 19, 7, 3, 10, 22, 24]. The stability properties of these algorithms are still not well understood, and most of them may suffer from loss of accuracy when applied to ill-conditioned problems. Based on the method of corrected semi-normal equations, the algorithm derived in [20] can produce a more accurate R factor in the QR decomposition of a Toeplitz matrix, even for certain ill-conditioned matrices. Hence this algorithm can be used to solve LS problems over a significantly extended range, from well-conditioned to certain ill-conditioned ones. However, for least squares problems (e.g. (2.5)) whose coefficient matrices have three or more very small singular values, this method may not work as well [2, 20]. A new fast and efficient algorithm was developed by Gu in [13], based on a new fast algorithm for solving Cauchy-like least squares problems.

In the present paper, we solve the least squares problem (2.5) by extending a fast solver for systems of linear equations described by Chandrasekaran et al. in [6]. For the least squares system (2.5), denote by $y_{LS}$ the least squares solution and by $r_{LS} = M y_{LS} - z$ the minimum residual vector; then $y_{LS}$ solves the following linear system:
$$\begin{bmatrix} M^T M & M^T \\ M & 0 \end{bmatrix}\begin{bmatrix} y \\ -z \end{bmatrix} = \begin{bmatrix} 0 \\ z + r_{LS} \end{bmatrix}. \qquad (3.2)$$

Denote by $Q$ the orthogonal matrix from the QR decomposition of $M$, and partition $Q$ as $Q = [Q_1, Q_2]$, where $Q_2 \in \mathbb{R}^{(2m+2n-k+3)\times k}$; then
$$\|r_{LS}\|_2 = \|Q_2^T z\|_2 = \left\|Q_2^T\begin{bmatrix} wr \\ -d \end{bmatrix}\right\|_2. \qquad (3.3)$$
Due to the heavy weight of the upper block $M(1..m+n-k+1, :)$ of $M$, the entries of the block $Q_2(m+n-k+2..2m+2n-k+3, :)$ are of $O(1)$ while the block $Q_2(1..m+n-k+1, :)$ consists of near-zero elements. Therefore, by (3.3), $\|r_{LS}\|_2$ is much smaller than $\|z\|_2$, i.e.

$$\|r_{LS}\|_2 \ll \|z\|_2.$$
This motivates us to compute an approximate LS solution $\hat y$ by means of the following augmented system:
$$\begin{bmatrix} M^T M & M^T \\ M & 0 \end{bmatrix}\begin{bmatrix} y \\ -z \end{bmatrix} \approx \begin{bmatrix} 0 \\ z \end{bmatrix}. \qquad (3.4)$$

We normalize the matrix M and the vector z as:

$$M := M/\|M\|_F, \quad \text{so that } \|M\|_2 \le 1, \qquad (3.5)$$

$$z := z/\|M\|_F, \qquad (3.6)$$
where $\|M\|_F$ is the Frobenius norm of $M$. Due to the large penalty value $w$, after normalization the lower left corner of $M$ has very small diagonal elements. This causes numerical rank deficiency of $M^T M$. Moreover, since $M$ is not a square matrix, the coefficient matrix of the linear system (3.4) is rank deficient. In order to complete the generalized Schur algorithm successfully, we construct $T$ as:
$$T = \begin{bmatrix} M^T M + \alpha I^{(1)} & M^T \\ M & -\beta I^{(2)} \end{bmatrix}, \qquad (3.7)$$
where $\alpha I^{(1)}, \beta I^{(2)}$ are small multiples of identity matrices such that $M^T M + \alpha I^{(1)}$ is positive definite, which ensures that the positive steps complete successfully, and the Schur complement of $T$ with respect to $M^T M + \alpha I^{(1)}$ is negative definite, which ensures that the negative steps complete successfully.
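A sketch of the construction (3.7); the function name and the sample shifts are ours, with the roles of $\alpha$ and $\beta$ noted in the comments:

```python
import numpy as np

def build_T(M, alpha, beta):
    """Augmented regularized matrix T of (3.7).

    alpha keeps M^T M + alpha*I positive definite, so the positive
    steps of the generalized Schur algorithm succeed; beta keeps the
    Schur complement negative definite, so the negative steps succeed.
    M is assumed already normalized as in (3.5).
    """
    t, s = M.shape[1], M.shape[0]
    return np.block([[M.T @ M + alpha * np.eye(t), M.T],
                     [M, -beta * np.eye(s)]])
```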

We explore the displacement structure of $T$ as follows.

Theorem 5 [18]. $T$ is a structured matrix with displacement rank at most 10. We construct a generator pair $(G, J)$ for $T$ such that
$$T - FTF^T = GJG^T,$$
where
$$F = \mathrm{diag}(Z_{m+1}, Z_{n+1}, Z_{n-k}, Z_{m-k+1}, Z_{m+n-k+1}, Z_{m+n+2}),$$

$$J = \mathrm{diag}(I_4, -I_6),$$
and $G$ is a matrix with 10 columns. In [18] we expressed $G$ by columns of $T$, which are of order $1/w^2$, where $w$ is the large penalty value; this is undesirable for numerical stability. Here we introduce a new generator matrix with columns $g_i$, $i = 1, \ldots, 10$, which are of order $1/w$. Define $t_1, \ldots, t_4$ as:

$$t_1 = \|M(:,1)\|_2^2 + \alpha, \quad t_2 = \|M(:, m+2)\|_2^2 + \alpha, \quad t_3 = \|M(:, m+n+3)\|_2^2 + \alpha, \quad t_4 = \|M(:, 2n+m-k+3)\|_2^2 + \alpha.$$
Let

$$
\begin{aligned}
c_1 &= \left[M^T(1,:)M,\; M^T(1,:)\right]^T + \alpha I(:,1),\\
c_2 &= \left[M^T(m+2,:)M,\; M^T(m+2,:)\right]^T + \alpha I(:,m+2),\\
c_3 &= \left[M^T(m+n+3,:)M,\; M^T(m+n+3,:)\right]^T + \alpha I(:,m+n+3),\\
c_4 &= \left[M^T(2n+m-k+3,:)M,\; M^T(2n+m-k+3,:)\right]^T + \alpha I(:,2n+m-k+3),
\end{aligned}
$$
where $I$ denotes the identity matrix of order $4m+4n-3k+6$. Then
$$
\begin{aligned}
g_1 &= c_1/\sqrt{t_1},\\
g_2 &= c_2/\sqrt{t_2}, \text{ except that } g_2[1] = 0,\\
g_3 &= c_3/\sqrt{t_3}, \text{ except that } g_3[1] = 0,\ g_3[m+2] = 0,\\
g_4 &= c_4/\sqrt{t_4}, \text{ except that } g_4[1] = 0,\ g_4[m+2] = 0,\ g_4[m+n+3] = 0,\\
g_5 &= \left[0,\; g_1^T(2 : 4m+4n-3k+6)\right]^T,\\
g_6 &= \left[g_2^T(1 : m+1),\; 0,\; g_2^T(m+3 : 4m+4n-3k+6)\right]^T,\\
g_7 &= \left[g_3^T(1 : m+n+2),\; 0,\; g_3^T(m+n+4 : 4m+4n-3k+6)\right]^T,\\
g_8 &= \left[g_4^T(1 : 2n+m-k+2),\; 0,\; g_4^T(2n+m-k+4 : 4m+4n-3k+6)\right]^T,\\
g_9 &= [\underbrace{0, \ldots, 0}_{2m+2n-2k+3},\ \sqrt{\beta},\ 0, \ldots, 0]^T,\\
g_{10} &= [\underbrace{0, \ldots, 0}_{3m+3n-3k+4},\ \sqrt{\beta},\ 0, \ldots, 0]^T.
\end{aligned}
$$
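Theorem 5 can likewise be checked numerically before constructing the generators explicitly; a sketch reusing block_shift and build_T from the sketches above, with $m$, $n$, $k$ and the normalized M assumed bound:

```python
import numpy as np

# Displacement of T with the shift sizes of Theorem 5; the rank
# should be at most 10.  alpha = beta = 1e-14 as in Section 5.
F = block_shift([m + 1, n + 1, n - k, m - k + 1,
                 m + n - k + 1, m + n + 2])
T = build_T(M, 1e-14, 1e-14)
print(np.linalg.matrix_rank(T - F @ T @ F.T))
```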

Expanding the proof in [6] and combining it with the singular value decomposition (SVD) of $M$, we prove that after applying $2m+2n-2k+3$ positive steps and $2m+2n-k+3$ negative steps of the generalized Schur algorithm, operating on the generator pair $(G, J)$, we obtain a backward stable factorization of $T$:
$$\begin{bmatrix} \hat R^T & 0 \\ \hat Q & \hat D \end{bmatrix}\begin{bmatrix} \hat R & \hat Q^T \\ 0 & -\hat D^T \end{bmatrix}, \qquad (3.8)$$
where $\hat R$ is upper triangular and $\hat D$ is lower triangular. Using the backward stable factorization (3.8) we can solve the augmented system
$$T\begin{bmatrix} y \\ \xi \end{bmatrix} = \begin{bmatrix} 0 \\ z \end{bmatrix}$$
to get
$$(T + H)\begin{bmatrix} \hat y \\ \hat\xi \end{bmatrix} = \begin{bmatrix} 0 \\ z \end{bmatrix}, \qquad (3.9)$$

with $\|H\|_2 = O(\epsilon)$, where $\epsilon$ is the machine precision. Here
$$\begin{bmatrix} \hat y \\ \hat\xi \end{bmatrix} = \begin{bmatrix} \hat R & \hat Q^T \\ 0 & -\hat D^T \end{bmatrix}^{-1}\begin{bmatrix} \hat R^T & 0 \\ \hat Q & \hat D \end{bmatrix}^{-1}\begin{bmatrix} 0 \\ z \end{bmatrix},$$
and $\hat y$ is computed by the expression

$$\hat R^{-1}\hat Q^T\hat D^{-T}\hat D^{-1}z. \qquad (3.10)$$

$\hat y$ is regarded as an approximate solution to the least squares system (2.5). In the phase of completing the triangular factorization of $T$ by the generalized Schur algorithm, assuming that in each iteration the reduction of the generator matrix to proper form is implemented via two Householder transformations and an elementary hyperbolic rotation in OD form [5], the number of flops needed is
$$\frac{47}{2}(4m+4n-3k+6)^2 + \frac{121}{2}(4m+4n-3k+6).$$
In the phase of obtaining the approximate solution $\hat y$ via (3.10), i.e., two backward substitutions, one forward substitution and one matrix-vector multiplication, the number of flops needed is [11]:

$$2(2m+2n-k+3)^2 + (2m+2n-2k+3)^2 + 2(2m+2n-k+3)(2m+2n-2k+3).$$
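The substitution phase (3.10) is straightforward once the triangular factors are available; a sketch using scipy (names are ours; $\hat R$ is $t \times t$ upper triangular, $\hat D$ is $s \times s$ lower triangular and $\hat Q$ is $s \times t$, following (3.8)):

```python
import numpy as np
from scipy.linalg import solve_triangular

def y_from_factors(R_hat, Q_hat, D_hat, z):
    """Recover y-hat via (3.10): R^{-1} Q^T D^{-T} D^{-1} z.

    One forward substitution, two backward substitutions and one
    matrix-vector product, matching the flop count quoted above.
    """
    v = solve_triangular(D_hat, z, lower=True)        # D^{-1} z
    v = solve_triangular(D_hat.T, v, lower=False)     # D^{-T} (.)
    return solve_triangular(R_hat, Q_hat.T @ v, lower=False)
```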

4. Error Analysis

We derive a relative forward error bound for the approximate LS solution to (2.5). Hereafter, we use $\kappa(\cdot)$ to denote the 2-norm condition number of its argument. We define $u = m+n-k+1$, $l = m+n-k+2$. For any integer $i$, $1 \le i \le u+l$, we denote by $\sigma_i$ the $i$-th largest singular value of $M$.

Lemma 6. Denote by $f$ a vector which satisfies the following linear system:
$$\begin{bmatrix} M^T M + \alpha I^{(1)} & M^T \\ M & -\beta I^{(2)} \end{bmatrix}\begin{bmatrix} \hat y \\ \hat\xi \end{bmatrix} = \begin{bmatrix} 0 \\ z \end{bmatrix} + f. \qquad (4.1)$$
Then an upper bound for $\frac{\|\hat y - y_{LS}\|_2}{\|y_{LS}\|_2}$ is given, roughly, by

$$\frac{\beta + \alpha\beta\kappa^2(M)}{\|M\|_2^2 - \alpha\beta\kappa^2(M)} + \frac{\kappa(M) + \beta\kappa^2(M)}{\|M\|_2^2 - \alpha\beta\kappa^2(M)}\cdot\frac{\|f\|_2}{\|y_{LS}\|_2}. \qquad (4.2)$$

Proof. Partition $f$ as $f = \left[f_1^T, f_2^T\right]^T$, $f_1 \in \mathbb{R}^{(2m+2n-2k+3)\times 1}$, $f_2 \in \mathbb{R}^{(2m+2n-k+3)\times 1}$; then from (4.1) we have
$$\begin{cases} \left(M^T M + \alpha I^{(1)}\right)\hat y + M^T\hat\xi = f_1, \\ M\hat y - \beta\hat\xi = z + f_2. \end{cases}$$
Eliminating $\hat\xi$, we get
$$\beta\left(M^T M + \alpha I^{(1)}\right)\hat y + M^T M\hat y = M^T z + M^T f_2 + \beta f_1. \qquad (4.3)$$

Noting that $M^T z = M^T M y_{LS}$ and using an elementary calculation, we get
$$\left(\beta M^T M + \alpha\beta I^{(1)} + M^T M\right)(y_{LS} - \hat y) = \left(\beta M^T M + \alpha\beta I^{(1)}\right)y_{LS} - M^T f_2 - \beta f_1, \qquad (4.4)$$
and

$$y_{LS} - \hat y = \left(\beta I^{(1)} + \alpha\beta(M^T M)^{-1} + I\right)^{-1}\left(\beta I^{(1)} + \alpha\beta(M^T M)^{-1}\right)y_{LS} - \left(\beta I^{(1)} + \alpha\beta(M^T M)^{-1} + I\right)^{-1}\left(M^\dagger f_2 + \beta(M^T M)^{-1}f_1\right).$$

So
$$\frac{\|y_{LS} - \hat y\|_2}{\|y_{LS}\|_2} \le \frac{\beta + \alpha\beta\left\|(M^T M)^{-1}\right\|_2}{1 - \beta - \alpha\beta\left\|(M^T M)^{-1}\right\|_2} + \frac{\left\|M^\dagger\right\|_2 + \beta\left\|(M^T M)^{-1}\right\|_2}{1 - \beta - \alpha\beta\left\|(M^T M)^{-1}\right\|_2}\cdot\frac{\|f\|_2}{\|y_{LS}\|_2} = \frac{\beta\|M\|_2^2 + \alpha\beta\kappa^2(M)}{(1-\beta)\|M\|_2^2 - \alpha\beta\kappa^2(M)} + \frac{\kappa(M)\|M\|_2 + \beta\kappa^2(M)}{(1-\beta)\|M\|_2^2 - \alpha\beta\kappa^2(M)}\cdot\frac{\|f\|_2}{\|y_{LS}\|_2},$$
where we use the identities
$$\left\|M^\dagger\right\|_2\|M\|_2 = \kappa(M), \qquad \left\|(M^T M)^{-1}\right\|_2\|M\|_2^2 = \kappa^2(M).$$
Noting that

$\|M\|_2 \le 1$ and $\beta \ll 1$, we get the final form of the inequality. $\Box$

Lemma 7. Partition $z$ as $z = \left[z_u^T, z_{l+k}^T\right]^T$, $z_u \in \mathbb{R}^{u\times 1}$, $z_{l+k} \in \mathbb{R}^{(l+k)\times 1}$; then for $T$ defined in (3.7) we have
$$\left\|T^{-1}\begin{bmatrix} 0 \\ z \end{bmatrix}\right\|_2 \le \left(\frac{4}{\sigma_u} + \frac{2}{w\beta} + \frac{1}{w\sigma_{u+l}}\right)\|z_u\|_2 + \left(\frac{2}{\beta} + \frac{3}{\sigma_{u+l}}\right)\|z_{l+k}\|_2.$$

Proof. Denote
$$T^{-1} = \begin{bmatrix} B_{11} & B_{12} \\ B_{12}^T & B_{22} \end{bmatrix};$$
then
$$
\begin{aligned}
B_{11} &= \beta M^\dagger\left[M\left(M^T M + \alpha I^{(1)}\right)^{-1}M^T + \beta I^{(2)}\right]^{-1}M\left(M^T M + \alpha I^{(1)}\right)^{-T},\\
B_{12} &= \left(M^T M + \alpha I^{(1)}\right)^{-1}M^T\left[M\left(M^T M + \alpha I^{(1)}\right)^{-1}M^T + \beta I^{(2)}\right]^{-1},\\
B_{22} &= -\left[M\left(M^T M + \alpha I^{(1)}\right)^{-1}M^T + \beta I^{(2)}\right]^{-1}.
\end{aligned}
$$
Let $M = U\begin{bmatrix} \Sigma \\ 0_{k\times(u+l)} \end{bmatrix}V^T$ be the SVD of $M$, where $U, V$ are orthogonal matrices and $\Sigma$ is a diagonal matrix consisting of the singular values of $M$. Write $\Sigma$ as
$$\Sigma = \begin{bmatrix} \Sigma_u & \\ & \Sigma_l \end{bmatrix},$$
where $\Sigma_u = \mathrm{diag}(\sigma_1, \cdots, \sigma_u)$, $\Sigma_l = \mathrm{diag}(\sigma_{u+1}, \cdots, \sigma_{u+l})$.

Partition $U, V$ as
$$U = \begin{bmatrix} U_{11} & U_{12} \\ U_{21} & U_{22} \end{bmatrix}, \qquad V = [V_1, V_2],$$

where $U_{11} \in \mathbb{R}^{u\times u}$, $V_1 \in \mathbb{R}^{(u+l)\times u}$. Define the diagonal matrices
$$\Lambda_u = \left(\left[I_u + \alpha\Sigma_u^{-2}\right]^{-1} + \beta I_u\right)^{-1},$$

$$
\begin{aligned}
\Lambda_u' &= \left(\Sigma_u + \alpha\Sigma_u^{-1}\right)^{-1}\Lambda_u,\\
\Lambda_l &= \left(\Sigma_l + \alpha\Sigma_l^{-1}\right)^{-1}\left(\left[I_l + \alpha\Sigma_l^{-2}\right]^{-1} + \beta I_l\right)^{-1},\\
\Lambda_{l+k} &= \begin{bmatrix} \left(\left[I_l + \alpha\Sigma_l^{-2}\right]^{-1} + \beta I_l\right)^{-1} & \\ & \frac{1}{\beta}I_k \end{bmatrix},
\end{aligned}
$$
where $I_u, I_l, I_k$ denote identity matrices of dimension $u$, $l$, $k$ respectively. Then

$$B_{12} = \left[V_1\Lambda_u'U_{11}^T + V_2[\Lambda_l\ 0]U_{12}^T,\;\; V_1\Lambda_u'U_{21}^T + V_2[\Lambda_l\ 0]U_{22}^T\right],$$
$$B_{22} = -\begin{bmatrix} U_{11}\Lambda_uU_{11}^T + U_{12}\Lambda_{l+k}U_{12}^T & U_{11}\Lambda_uU_{21}^T + U_{12}\Lambda_{l+k}U_{22}^T \\ U_{21}\Lambda_uU_{11}^T + U_{22}\Lambda_{l+k}U_{12}^T & U_{21}\Lambda_uU_{21}^T + U_{22}\Lambda_{l+k}U_{22}^T \end{bmatrix}.$$
Note that the following inequalities hold:

$$\|\Lambda_u\|_2 \le 1 + 1/\sigma_u, \quad \|\Lambda_u'\|_2 \le 1/\sigma_u, \quad \|\Lambda_l\|_2 \le 1/\sigma_{u+l}, \quad \|\Lambda_{l+k}\|_2 \le 1 + 1/\beta. \qquad (4.5)$$
For the last inequality the following assumption is used:

$$\alpha\beta < \sigma_{u+l}^2.$$
Meanwhile, due to the heavy weight (of order $O(w)$) of the submatrix $M(1..u, :)$, the following inequalities hold:

$$\|U_{12}\|_2,\ \|U_{21}\|_2 \le 1/w.$$
Hence, we have
$$\left\|T^{-1}\begin{bmatrix} 0 \\ z \end{bmatrix}\right\|_2 \le \|B_{12}z\|_2 + \|B_{22}z\|_2 \le \left(\frac{4}{\sigma_u} + \frac{2}{w\beta} + \frac{1}{w\sigma_{u+l}}\right)\|z_u\|_2 + \left(\frac{2}{\beta} + \frac{3}{\sigma_{u+l}}\right)\|z_{l+k}\|_2. \qquad \Box$$

Lemma 8. As previously defined, let $H$ be a backward error matrix which satisfies (3.9); then for the vector $f$ defined in Lemma 6 we have
$$\|f\|_2 \le \frac{\|H\|_2}{1 - \gamma\|H\|_2}\left[\left(\frac{4}{\sigma_u} + \frac{2}{w\beta} + \frac{1}{w\sigma_{u+l}}\right)\|z_u\|_2 + \left(\frac{2}{\beta} + \frac{3}{\sigma_{u+l}}\right)\|z_{l+k}\|_2\right],$$
where

$$\gamma = \beta/\sigma_{u+l}^2 + 2/\sigma_{u+l} + 1/\beta + 1.$$

Proof. By the definitions of $f$ and $H$, we get
$$f = -H\begin{bmatrix} \hat y \\ \hat\xi \end{bmatrix} = -H(T+H)^{-1}\begin{bmatrix} 0 \\ z \end{bmatrix} = -H\left(I + T^{-1}H\right)^{-1}T^{-1}\begin{bmatrix} 0 \\ z \end{bmatrix}.$$

So
$$\|f\|_2 \le \frac{\|H\|_2}{1 - \|T^{-1}H\|_2}\left\|T^{-1}\begin{bmatrix} 0 \\ z \end{bmatrix}\right\|_2 \le \frac{\|H\|_2}{1 - \|T^{-1}\|_2\|H\|_2}\left\|T^{-1}\begin{bmatrix} 0 \\ z \end{bmatrix}\right\|_2.$$
Note that
$$B_{11} = V\begin{bmatrix} \beta\Sigma_u^{-1}\Lambda_u' & \\ & \beta\Sigma_l^{-1}\Lambda_l \end{bmatrix}V^T, \qquad B_{12} = V\begin{bmatrix} \Lambda_u' & & 0_{u\times k} \\ & \Lambda_l & 0_{l\times k} \end{bmatrix}U^T, \qquad B_{22} = -U\begin{bmatrix} \Lambda_u & \\ & \Lambda_{l+k} \end{bmatrix}U^T;$$
from (4.5) we have
$$\left\|T^{-1}\right\|_2 \le \|B_{11}\|_2 + \|B_{12}\|_2 + \|B_{22}\|_2 \le \beta/\sigma_{u+l}^2 + 2/\sigma_{u+l} + 1/\beta + 1.$$
According to Lemma 7, we get the final form of the inequality. $\Box$

Theorem 9. Suppose $\hat y$ is a vector computed from the linear system
$$(T+H)\begin{bmatrix} \hat y \\ \hat\xi \end{bmatrix} = \begin{bmatrix} 0 \\ z \end{bmatrix},$$

with $\|H\|_2 = O(\epsilon)$, where $\epsilon$ is the machine precision and $T$ is defined in (3.7). Then $\hat y$ is an approximate solution to the least squares problem (2.5). Under the assumption $\alpha\beta < \sigma_{u+l}^2$, the relative forward error $\frac{\|\hat y - y_{LS}\|_2}{\|y_{LS}\|_2}$ admits an upper bound of the form
$$\frac{\beta + \alpha\beta\kappa^2(M)}{\|M\|_2^2 - \alpha\beta\kappa^2(M)} + \frac{\kappa(M) + \beta\kappa^2(M)}{\|M\|_2^2 - \alpha\beta\kappa^2(M)}\cdot\frac{\delta\|H\|_2}{(1 - \gamma\|H\|_2)\|y_{LS}\|_2}, \qquad (4.6)$$
where
$$\gamma = \beta/\sigma_{u+l}^2 + 2/\sigma_{u+l} + 1/\beta + 1, \qquad \delta = \left(\frac{4}{\sigma_u} + \frac{2}{w\beta} + \frac{1}{w\sigma_{u+l}}\right)\|z_u\|_2 + \left(\frac{2}{\beta} + \frac{3}{\sigma_{u+l}}\right)\|z_{l+k}\|_2.$$

Proof. It follows immediately from Lemmas 6, 7 and 8. $\Box$

Remark 10. In order to complete the generalized Schur algorithm stably, we take $\alpha, \beta$ satisfying the lower bound derived in [6]. In essence, this means that $\alpha, \beta$ can be taken somewhat larger than the machine precision. Besides this, based on the error analysis in this section, $\alpha, \beta$ should be taken such that the upper bound (4.6) is as small as possible. Hence the assumption $\alpha\beta < \sigma_{u+l}^2$ used in Theorem 9 is natural: it is approximately equivalent to $\alpha\beta\kappa^2(M) < 1$, which is a necessary condition for the upper bound (4.6) to be smaller than 1.

Remark 11. In practice, after the normalization (3.5), the following inequalities hold:

$$\|z_u\|_2 < 1, \qquad \|z_{l+k}\|_2 < 1/w.$$
Furthermore, when the given integer $k$ in (2.4) is taken as an upper bound on the degree of the approximate GCD, we have $\delta = O(1)$.
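All quantities entering the bound (4.6) are computable from an SVD of the normalized $M$, so the bound can be evaluated alongside the fast solve. A sketch (the function name is ours) with $u = m+n-k+1$:

```python
import numpy as np

def forward_error_bound(M, z, alpha, beta, w, u, y_ls_norm, H_norm):
    """Evaluate the relative forward error bound (4.6).

    M, z are assumed normalized as in (3.5)-(3.6); u = m + n - k + 1
    splits z into z_u and z_{l+k}.
    """
    sig = np.linalg.svd(M, compute_uv=False)
    s_u, s_ul = sig[u - 1], sig[-1]          # sigma_u, sigma_{u+l}
    kappa = sig[0] / s_ul                    # kappa(M)
    zu, zlk = np.linalg.norm(z[:u]), np.linalg.norm(z[u:])
    gamma = beta / s_ul**2 + 2 / s_ul + 1 / beta + 1
    delta = (4 / s_u + 2 / (w * beta) + 1 / (w * s_ul)) * zu \
          + (2 / beta + 3 / s_ul) * zlk
    den = sig[0]**2 - alpha * beta * kappa**2    # ||M||_2^2 - a*b*kappa^2
    return (beta + alpha * beta * kappa**2) / den \
         + (kappa + beta * kappa**2) / den \
           * delta * H_norm / ((1 - gamma * H_norm) * y_ls_norm)
```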

5. Experiments

In the following table we show the performance of the fast version of the algorithm AppSylv-k in [16], in which we apply the fast implementation described in this paper when solving the LS problem (2.5) in each iteration. In the numerical tests, we compute the minimal perturbations needed for univariate polynomials to have an exact GCD of given degree. Here we set Digits = 15 and $\alpha = \beta = 10^{-14}$.

Ex    m, n    k    error        error        error        cond.num     for.err.     ε̂(ŷ)
                   (Zeng)       (classic)    (new fast)   (LS Prob.)   (LS Prob.)   (LS Prob.)
1     2, 2    1    5.98099e-3   5.59933e-3   5.59933e-3   0.109e13     0.461e-3     0.238e-12
2     3, 3    2    1.08172e-2   1.07129e-2   1.07129e-2   0.148e13     0.434e-3     0.176e-13
3     5, 4    3    1.56146e-6   1.56146e-6   1.56146e-6   0.209e13     0.143e-2     0.143e-13
4     5, 5    3    1.34138e-8   1.34138e-8   1.34318e-8   0.232e13     0.664e-3     0.452e-13
5     6, 6    4    1.96333e-10  1.96333e-10  1.96333e-10  0.239e13     0.182e-3     0.448e-13
6     8, 7    4    1.98415e-16  1.98415e-16  1.98416e-16  0.312e13     0.322e-3     0.896e-14
7     10, 10  5    1.65332e-12  1.51551e-12  1.51552e-12  0.545e13     0.272e-2     0.598e-13
8     14, 13  7    2.61850e-4   2.61818e-4   2.61819e-4   0.697e13     0.163e-1     0.112e-12
9     28, 28  10   3.55567e-4   2.54575e-4   3.54600e-4   0.151e14     0.992e-1     0.512e-14
10    50, 50  30   9.35269e-6   9.35252e-6   9.40237e-6   0.302e14     0.134        0.168e-13

The sample polynomials are generated in the same way as those in [16]: for every example we use 50 random cases for each $(m, n)$ and report the average over all results. For each example, the prime parts and the GCD of the two polynomials are constructed by choosing polynomials with random integer coefficients in the range $-10 \le c \le 10$, and then adding a perturbation. For the noise we choose a relative tolerance $10^{-e}$, then randomly choose a polynomial that has the same degree as the product and coefficients in $[-10^e, 10^e]$. Finally, we scale the perturbation so that the relative error is $10^{-e}$.

In the table, $m, n$ denote the total degrees of the polynomials $a$ and $b$; $k$ is the degree of the approximate GCD of $a$ and $b$; error (Zeng), error (classic) and error (new fast) denote the minimal perturbations computed by the algorithms given in [26], [16] and this report, respectively; cond.num denotes the 2-norm condition number of the LS problem (2.5), and for.err. denotes the relative forward error (relative to the LS solution given in [16]) of the approximate LS solution. In the last column we show the backward perturbations $\hat\varepsilon(\hat y)$ that the approximate LS solutions satisfy. Here $\hat\varepsilon(\hat y)$ is an alternative F-norm bound on $\delta M$ such that $\hat y$ is an exact solution to the LS problem below:
$$\min_y \|(M + \delta M)y - z\|_2. \qquad (5.1)$$

$\hat\varepsilon(\hat y)$ differs from the smallest possible backward perturbation $\varepsilon(\hat y)$ derived in [25, 14] by at most a factor less than 2, and it is computed as follows [12]. As previously defined, $M = U\begin{bmatrix} \Sigma \\ 0 \end{bmatrix}V^T$ is the SVD of $M$. Assume that $\hat y \neq 0$ and let
$$\hat r = z - M\hat y = U\begin{pmatrix} \hat r_1 \\ \hat r_2 \end{pmatrix}, \quad \hat r_1 \in \mathbb{R}^{(2m+2n-2k+3)\times 1},\ \hat r_2 \in \mathbb{R}^{k\times 1}, \qquad \eta = \frac{\|\hat r\|_2}{\|\hat y\|_2}.$$

Then
$$\hat\varepsilon(\hat y) = \min(\eta, \tilde\sigma), \quad \text{where } \tilde\sigma = \sqrt{\frac{\hat r_1^T\Sigma^2\left(\Sigma^2 + \eta^2 I\right)^{-1}\hat r_1}{\|\hat r_2\|_2^2/\eta^2 + \eta^2\,\hat r_1^T\left(\Sigma^2 + \eta^2 I\right)^{-2}\hat r_1}}.$$
As shown in the table, the small backward perturbations $\hat\varepsilon(\hat y)$ imply that the approximate LS solutions are backward stable. The computed minimal polynomial perturbations have the same magnitudes as the results computed by the algorithms given in [26] and [16]; in particular, the new results in the first eight examples even have several significant digits identical to those of the classic results [16]. As the last two examples suggest, the accuracy of the new results is affected by the large condition number of $M$.
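The formula for $\hat\varepsilon(\hat y)$ above translates directly into code; a sketch (the function name is ours) of how the last column of the table could be produced:

```python
import numpy as np

def backward_eps(M, y_hat, z):
    """Backward perturbation bound eps-hat(y-hat) for (5.1), after [12].

    Within a factor less than 2 of the optimal bound of [25, 14].
    """
    U, sig, Vt = np.linalg.svd(M)
    t = M.shape[1]                           # number of singular values
    res = z - M @ y_hat
    r = U.T @ res                            # rotated residual [r1; r2]
    r1, r2 = r[:t], r[t:]
    eta = np.linalg.norm(res) / np.linalg.norm(y_hat)
    s2 = sig**2
    num = np.sum(s2 * r1**2 / (s2 + eta**2))
    den = r2 @ r2 / eta**2 + eta**2 * np.sum(r1**2 / (s2 + eta**2)**2)
    return min(eta, np.sqrt(num / den))
```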

6. Conclusions

In this paper, we describe a fast implementation of low rank approximation of a Sylvester matrix. An iterative algorithm for constructing a Sylvester matrix with given lower rank and obtaining the nearest perturbed polynomials was described in [16]. For this algorithm, the problem that needs to be solved in each iteration is a least squares problem. By exploiting the low displacement rank of the involved coefficient matrix, we obtain an implementation that requires
$$\frac{51}{2}s^2 + 49st + \frac{49}{2}t^2$$
flops after discarding the low order terms, where $s = 2m+2n-k+3$, $t = 2m+2n-2k+3$. In comparison, the straightforward implementation in [16] requires $2st^2 - \frac{2}{3}t^3$ flops. So the implementation here is more efficient, especially when $k$ is not large. Our fast algorithm can be generalized to several polynomials and also to polynomials with complex coefficients.

Acknowledgment The authors would like to thank Prof. Zhongzhi Bai for useful discussions.

References

[1] A. A. Anda and H. Park, Fast plane rotations with dynamic scaling, SIAM J. Matrix Anal. Applic., 15 (1994), pp. 162–174.
[2] Å. Björck, Stability analysis of the method of semi-normal equations for least squares problems, Lin. Alg. Appl., 88/89 (1987), pp. 31–48.
[3] A. W. Bojanczyk, R. P. Brent, and F. R. de Hoog, QR factorization of Toeplitz matrices, Numer. Math., 49 (1986), pp. 81–94.
[4] A. W. Bojanczyk, R. P. Brent, and F. R. de Hoog, A note on downdating the Cholesky factorization, SIAM J. Sci. Stat. Comput., 8 (1987), pp. 210–221.
[5] S. Chandrasekaran and A. H. Sayed, Stabilizing the generalized Schur algorithm, SIAM J. Matrix Anal. Applic., 17 (1996), pp. 950–983.
[6] S. Chandrasekaran and A. H. Sayed, A fast stable solver for nonsymmetric Toeplitz and quasi-Toeplitz systems of linear equations, SIAM J. Matrix Anal. Applic., 19 (1998), pp. 107–139.
[7] J. Chun, T. Kailath, and H. Lev-Ari, Fast parallel algorithms for QR and triangular factorization, SIAM J. Sci. Stat. Comput., 8 (1987), pp. 899–913.
[8] R. M. Corless, S. M. Watt, and L. Zhi, QR factoring to compute the GCD of univariate approximate polynomials, IEEE Transactions on Signal Processing, 52 (2004), pp. 3394–3402.
[9] G. Cybenko, A general orthogonalization technique with applications to time series analysis and signal processing, Math. Comp., 40 (1983), pp. 323–336.
[10] G. Cybenko, Fast Toeplitz orthogonalization using inner products, SIAM J. Sci. Stat. Comput., 8 (1987), pp. 734–740.
[11] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd edition, Johns Hopkins University Press, 1996.
[12] M. Gu, Backward perturbation bounds for linear least squares problems, SIAM J. Matrix Anal. Applic., 20 (1998), pp. 363–372.
[13] M. Gu, New fast algorithms for structured linear least squares problems, SIAM J. Matrix Anal. Applic., 20 (1998), pp. 244–269.
[14] N. J. Higham, Accuracy and Stability of Numerical Algorithms, Society for Industrial and Applied Mathematics, Philadelphia, 1996.
[15] T. Kailath and A. H. Sayed, Displacement structure: theory and applications, SIAM Review, 37 (1995), pp. 297–386.
[16] E. Kaltofen, Z. Yang, and L. Zhi, Structured low rank approximation of a Sylvester matrix. Submitted to SNC 2005, 2005.
[17] B. Li, Z. Liu, and L. Zhi, A structured rank-revealing method for a Sylvester matrix. Submitted to Journal of Computational and Applied Mathematics, 2005.
[18] B. Li, Z. Yang, and L. Zhi, Fast low rank approximation of a Sylvester matrix by Structured Total Least Norm, J. JSSAC (Japan Society for Symbolic and Algebraic Computation), 11 (2005), pp. 165–174.

[19] J. G. Nagy, Fast inverse QR factorization for Toeplitz matrices, SIAM J. Sci. Stat. Comput., 14 (1993), pp. 1174–1183.
[20] H. Park and L. Eldén, Stability analysis and fast algorithms for triangularization of Toeplitz matrices, Numer. Math., 76 (1997), pp. 383–402.
[21] H. Park, L. Zhang, and J. B. Rosen, Low rank approximation of a Hankel matrix by Structured Total Least Norm, BIT, 39 (1999), pp. 757–779.
[22] S. Qiao, Hybrid algorithm for fast Toeplitz orthogonalization, Numer. Math., 53 (1988), pp. 351–366.
[23] J. B. Rosen, H. Park, and J. Glick, Total least norm formulation and solution for structured problems, SIAM J. Matrix Anal. Applic., 17 (1996), pp. 110–128.
[24] D. R. Sweet, Fast Toeplitz orthogonalization, Numer. Math., 43 (1984), pp. 1–21.
[25] B. Waldén, R. Karlson, and J.-G. Sun, Optimal backward perturbation bounds for the linear least squares problem, Numer. Lin. Alg. Appl., 2 (1995), pp. 271–286.
[26] Z. Zeng, The approximate GCD of inexact polynomials. Part I: a univariate algorithm. Manuscript, 2004.
[27] L. Zhi, Displacement structure in computing approximate GCD of univariate polynomials, in Proc. Sixth Asian Symposium on Computer Mathematics (ASCM 2003), 2003, pp. 288–298.