
ISSN: 2277-3754, ISO 9001:2008 Certified
International Journal of Engineering and Innovative Technology (IJEIT)
Volume 7, Issue 4, October 2017

A New Algorithm to Obtain the Adjugate Using CUBLAS on GPU

González, H. E., Information Technology Department, ININ-ITTLA
Carmona L., J. J., Information Technology Department, ININ

Abstract—In this paper a parallel code to obtain the adjugate matrix with real coefficients is presented. We postulate a new linear transformation in matrix product form and apply it to an augmented matrix (A|I) by means of both a minimum and a complete pivoting strategy; we then obtain the adjugate matrix. Furthermore, if we apply this new linear transformation and the above pivoting strategy to an augmented matrix (A|b), we obtain a Cramer-type solution of the linear system of equations. The new algorithm presents O(n^3) computational complexity when A and b have real entries. We use CUBLAS subroutines of the 2nd and 3rd levels in double precision and obtain correct numeric results.

Index Terms—Adjoint matrix, Adjugate matrix, Cramer's rule, CUBLAS, GPU.

I. INTRODUCTION

A Linear System of Equations (LSE) can be defined as a set of m equations with n unknowns represented by a matrix A, a vector b and an unknown vector x, namely, Ax = b. Many methods have been proposed to solve such linear equations. A famous one is Cramer's rule, where each component of the solution is determined as the ratio of two determinants. When trying to solve a system of n equations, Gaussian elimination has proved to be the best option for most practical applications. The new transformation proposed here can be obtained from it. Next, we briefly review this topic.

II. LU MATRICIAL DECOMPOSITION WITH GT: NOTATION AND DEFINITIONS

The problem of solving a linear system of equations Ax = b is central to the field of matrix computation. There are several ways to perform the elimination process necessary for its matrix triangulation. We will focus on the Doolittle-Gauss elimination method: the algorithm of choice when A is square, dense, and unstructured.

Let us assume that A in R^{n x n} is nonsingular and that we wish to solve the linear system Ax = b. Here we show how, for exact arithmetic with partial pivoting and column interchanges, Gauss transformations M_1, ..., M_{n-1} can almost always be found such that M_{n-1} ... M_2 M_1 A = U is upper triangular [9]. The original problem is then equivalent to the upper triangular system

U x = M_{n-1} ... M_2 M_1 b,

which can be solved through back-substitution. Suppose, then, that A in R^{n x n} and that, for some k, zeros have already been introduced below the diagonal in the first k-1 columns.
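As a point of reference for the classical process just reviewed, the following is a minimal illustrative sketch of Doolittle-Gauss elimination with partial pivoting followed by back-substitution (plain Python, not the paper's GPU code; the example matrix is arbitrary):

```python
import numpy as np

def gauss_triangularize(A, b):
    """Reduce Ax = b to an equivalent upper triangular system Ux = c
    using Gauss transformations with partial pivoting."""
    U = A.astype(float).copy()
    c = b.astype(float).copy()
    n = len(U)
    for k in range(n - 1):
        # Partial pivoting: bring the largest entry of column k to row k.
        p = k + np.argmax(np.abs(U[k:, k]))
        U[[k, p]], c[[k, p]] = U[[p, k]], c[[p, k]]
        for i in range(k + 1, n):
            m = U[i, k] / U[k, k]        # multiplier (Gauss vector entry)
            U[i, k:] -= m * U[k, k:]     # subtract a multiple of row k
            c[i] -= m * c[k]
    return U, c

def back_substitution(U, c):
    n = len(c)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

A = np.array([[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]])
b = np.array([1.0, 2.0, 5.0])
U, c = gauss_triangularize(A, b)
x = back_substitution(U, c)
```

The triangularization corresponds to applying M_{n-1} ... M_1 (with row exchanges) to both A and b, after which only the backward process remains.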

DOI: 10.17605/OSF.IO/RPB6F

Definition. An elementary lower triangular matrix of order n and index k is a matrix of the form [10]

M_k = I_n - m e_k^T,

where e_k = (0, ..., 0, 1, 0, ..., 0)^T is the k-th canonical vector and m^T = (0, 0, ..., 0, m_{k+1}, ..., m_n) has k leading zeros.

A. LU Decomposition Theorem

Using the above expression, the following can be established [11]:

Theorem. Let A_k denote the leading (or main) submatrix (k x k) of A in R^{n x n}. If A_k is nonsingular for k = 1, ..., n-1, then there exist a lower triangular matrix L in R^{n x n} and an upper triangular matrix U in R^{n x n} such that A = PLU, where P is a permutation matrix that records the row exchanges. Furthermore, |A_k| = u_11 ... u_kk, k = 1, ..., n.
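The determinant identity |A_k| = u_11 ... u_kk can be observed numerically. Below is a hedged sketch with P = I (no pivoting, so it assumes every leading principal submatrix is nonsingular; the example matrix is ours, not from the paper):

```python
import numpy as np

def lu_no_pivot(A):
    """Doolittle elimination without pivoting: returns unit lower
    triangular L and upper triangular U with A = LU."""
    n = len(A)
    U = A.astype(float).copy()
    L = np.eye(n)
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]      # multiplier stored in L
            U[i, k:] -= L[i, k] * U[k, k:]   # eliminate below the pivot
    return L, U

A = np.array([[4.0, 2.0, 1.0], [2.0, 5.0, 3.0], [1.0, 3.0, 6.0]])
L, U = lu_no_pivot(A)

# det(A_k) should equal the product of the first k diagonal entries of U.
minor_products = [np.prod(np.diag(U)[:k]) for k in range(1, 4)]
leading_dets = [np.linalg.det(A[:k, :k]) for k in range(1, 4)]
```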

k times III. DERIVATION OF A NEW LINEAR In general an elementary lower triangular matrix has the TRANSFORMATION above form. In the previous section we have defined an elementary lower The computational significance of elementary lower T triangular matrix of order n and index k as Mk  In mek . triangular matrices is that they can be used to introduce zero We shall find the above expression and the next theorem, components into a vector. Thus, whose proof can be found in [10] useful.  a11   a11      Theorem. The pivot elements a (k) (k 1,..., n) , are  .   .  k,k  .   .  nonzero if and only if the leading principal sub      .   .  matrices A (k 1,..., n) , are non-singular.     k  ak1  ak1  M k   . On this basis, the following can be stated a   0   k 1,1    A  Rnxn A (k  1,..., n)  .   .  Corollary. Let be a matrix with k  .   .      non-singular leading principal sub-matrices. Then, there  .   .  exists a unique for k=1 (a (1) 1) whose     00  an1   0  entries are: The matrix M is said to be a GT. The vector m is referred to k 11  as the Gauss vector. The components of m are known as    .  multipliers. Then it follows that   A (k) A (k)  (k) . (k) (k1)  11 12    A  M k A  0 A (k)  (n  k)  .   22    (k) (n  k)  1k  (k)  (k)  Where A is an upper triangular matrix?  ak,k  11 D      This process illustrates the k-th step of the decomposition k  (k)    ak 1,k 1   process, in which we used  k 1  k k . 1 1 1 (i) T (i) T   (Mk  M1)  M1  Mk  (In  m ei )  In  m ei i1 i1  .  We find the final expression for the decomposition process as   . L (k) 0 I 0 A (k) A (k)     11  k  11 12  (k) A   (k)     a    (k)  0 A   k,k L 21 I nk  22  0 I nk       (k)  (k)  a  L 0    k 1,k 1 n  (M ...M ) 1   11   I  (m (1) ,...,m (k) ,0,...,0). 
k 1  (k)  n L 21 I nk  (k ) (k ) In general, the forward elimination consists of n-1 steps. At Where ak,k and ak 1,k 1 are the pivots elements. the k-th step, multiples of the k-th equations are subtracted Now, if we scale the matrix Mk with that diagonal matrix, from the remaining equations to eliminate the k-th variable. If we will have (k ) the pivot element ak,k is null or “small”, it is advisable to
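A GT in action: the following sketch (0-based indices, arbitrary example vector, not the paper's code) builds M_k = I - m e_k^T and verifies that it zeroes every component below position k:

```python
import numpy as np

n, k = 4, 1                      # zero everything below component k
a = np.array([3.0, 2.0, 4.0, 6.0])

m = np.zeros(n)
m[k + 1:] = a[k + 1:] / a[k]     # multipliers: the Gauss vector
e_k = np.zeros(n)
e_k[k] = 1.0
M_k = np.eye(n) - np.outer(m, e_k)   # elementary lower triangular matrix

result = M_k @ a                 # components below position k become zero
```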


The scaled transformation M'_k = D_k M_k has as its first k rows the corresponding rows of I_n, while for each row i > k its only nonzero entries are

(M'_k)_{ik} = - a_{i,k}^{(k)} / a_{k-1,k-1}^{(k)},   (M'_k)_{ii} = a_{k,k}^{(k)} / a_{k-1,k-1}^{(k)}.   (1)

Once simplified, it can be re-expressed in block form (with a_{0,0}^{(1)} = 1) as

M'_k = [ I_k  0 ; -(1/a_{k-1,k-1}^{(k)}) (a_{k+1,k}^{(k)}, ..., a_{n,k}^{(k)})^T e_k^T  (a_{k,k}^{(k)}/a_{k-1,k-1}^{(k)}) I_{n-k} ],  k = 1, ..., n-1,   (2)

where e_k in R^k.

A. New Decomposition Theorem through Determinants

Using any of the above expressions, one can state:

Theorem. Let A in R^{n x n} and M = prod_{k=n-1}^{1} M'_k. Then M is a lower triangular matrix all of whose components are determinants, and U = MA also has all of its components in determinant form. Furthermore, |A| = u_{n,n}.

Proof. It follows by induction on n. For n = 1 the theorem is trivially true, since M = M'_1 = 1 and U = MA = a_11. For the induction step (k = 1, ..., n-1) we have:

n = 2, k = 1:

A = [ a11 a12 ; a21 a22 ];  M' = M'_1 = [ 1 0 ; -a21 a11 ];
U = M'A = [ a11 a12 ; 0 |a11 a12; a21 a22| ].

n = 3, k = 1, 2:

A = [ a11 a12 a13 ; a21 a22 a23 ; a31 a32 a33 ];  M'_1 = [ 1 0 0 ; -a21 a11 0 ; -a31 0 a11 ];

U_1 = M'_1 A = [ a11 a12 a13 ; 0 |a11 a12; a21 a22| |a11 a13; a21 a23| ; 0 |a11 a12; a31 a32| |a11 a13; a31 a33| ];

M'_2 = (1/a11) [ a11 0 0 ; 0 a11 0 ; 0 -|a11 a12; a31 a32| |a11 a12; a21 a22| ];

M' = M'_2 M'_1 = [ 1 0 0 ; -a21 a11 0 ; |a21 a22; a31 a32| -|a11 a12; a31 a32| |a11 a12; a21 a22| ].

The last row of the matrix M' multiplied by the last column of the matrix A is equivalent to the Laplace expansion of |A| along the last column. Then we have u_33 = |A|.

By using this new transformation and applying the elimination process to a matrix A all of whose entries are integers, all intermediate results are integers too, forming a number ring [12], since they are obtained through additions and products. The products of these results times the factor 1/a_{k-1,k-1}^{(k)} are integers too because, in the (k-1)-th step of the elimination process, such results had been multiplied by a_{k-1,k-1}^{(k)}. This division leads to a simplification of the final result. Thus, if A has integer entries, then so do M' and U.
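The n = 3 steps above can be checked numerically. Below is a minimal sketch (not the paper's CUBLAS code; the integer matrix is an arbitrary example): the first scaled transformation turns the trailing entries into 2x2 determinants of A, and the second step, after exact division by the previous pivot a11, leaves u_33 = |A|.

```python
import numpy as np

# Step k = 1: M'_1 has -a21, -a31 in column 1 and a11 on the diagonal.
A = np.array([[2, 3, 1],
              [4, 1, 5],
              [6, 2, 7]])
M1 = np.array([[1,        0,       0],
               [-A[1, 0], A[0, 0], 0],
               [-A[2, 0], 0,       A[0, 0]]])
U1 = M1 @ A                 # rows 2 and 3 now hold 2x2 determinants of A

# Step k = 2: eliminate below the new pivot U1[1,1], then divide the last
# row exactly by the previous pivot a11 (the integer-ring simplification).
M2 = np.array([[1, 0,         0],
               [0, 1,         0],
               [0, -U1[2, 1], U1[1, 1]]])
U2 = M2 @ U1
U2[2] //= A[0, 0]           # exact integer division by the prior pivot

u33 = int(U2[2, 2])         # equals det(A) exactly
```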


In general, for A = (a_ij) in R^{n x n} and k = 1, ..., n-1, the product

M = M'_{n-1} ... M'_1 =
[ m_11   0      0      ...  0          0      ;
  m_21   m_22   0      ...  0          0      ;
  m_31   m_32   m_33   ...  0          0      ;
  ...                                         ;
  m_{n-1,1}  m_{n-1,2}  m_{n-1,3}  ...  m_{n-1,n-1}  0 ;
  m_{n,1}    m_{n,2}    m_{n,3}    ...  m_{n,n-1}    m_{n,n} ]

is lower triangular, and each entry is a signed minor of A built from its leading columns: for j <= i,

m_ij = (-1)^{i+j} | A( 1, ..., j-1, j+1, ..., i ; 1, ..., i-1 ) |,

the determinant of the submatrix of A formed with rows 1, ..., i except j and columns 1, ..., i-1. In particular

m_11 = 1;  m_21 = -a_21;  m_22 = a_11;
m_31 = |a21 a22; a31 a32|;  m_32 = -|a11 a12; a31 a32|;  m_33 = |a11 a12; a21 a22|;

and the entries m_{n-1,j} and m_{n,j} of the last two rows are, up to sign, the minors of order n-2 and n-1 built from the first n-2 and n-1 columns of A, respectively, deleting row j.



The entries of U = MA are likewise determinants: u_11 = a_11, u_1j = a_1j, and, for 2 <= i <= j,

u_ij = | a_11 ... a_{1,i-1} a_1j ; a_21 ... a_{2,i-1} a_2j ; ... ; a_i1 ... a_{i,i-1} a_ij |,

the determinant of the leading (i-1) x (i-1) submatrix of A bordered with row i and column j. For example,

u_22 = |a11 a12; a21 a22|;  u_2j = |a11 a1j; a21 a2j|;
u_33 = |a11 a12 a13; a21 a22 a23; a31 a32 a33|;  u_3j = |a11 a12 a1j; a21 a22 a2j; a31 a32 a3j|.

The Laplace expansion of the submatrix A_{n-1} along its last column is equivalent to multiplying the (n-1)-th row of M by the (n-1)-th column of A; then u_{n-1,n-1} = |A_{n-1}|. In a similar way, taking the last row of M times the last column of A, we finally have u_{n,n} = |A|. Altogether,

U = MA =
[ u_11  u_12  u_13  ...  u_{1,n-1}    u_{1,n}   ;
  0     u_22  u_23  ...  u_{2,n-1}    u_{2,n}   ;
  0     0     u_33  ...  u_{3,n-1}    u_{3,n}   ;
  ...                                           ;
  0     0     0     ...  u_{n-1,n-1}  u_{n-1,n} ;
  0     0     0     ...  0            u_{n,n}   ].

Now, if M is the new transformation, then M = prod_{k=n-1}^{1} M'_k and U = MA. In order to solve the linear system of equations Ax = b, we have

MAx = Mb,   i.e.,   Ux = Mb,

and we can use the backward process to solve the linear system of equations using only determinants.

For a matrix A with floating point entries this process requires

n(n-1)(2n-1)/6 + (n-1)(n-2)/2 = n^3/3 - 4n/3 + 1

floating point multiplications [13].


B. New Decomposition Theorem through Determinants with Total Pivoting

Gauss elimination in real numbers is unstable due to the possibility of finding arbitrarily small pivots. This can be alleviated, however, by exchanging rows during the elimination. In our case, we exchange a row or a column, in a way similar to the Gauss process, only when a pivot is zero. The following theorem is given without proof:

Theorem. Let A in R^{n x n}. Suppose that the new transformations M'_1, ..., M'_{k-1}, the row permutation matrices P = P_1 ... P_{k-1} and the column permutation matrices Pi = Pi_1 ... Pi_{k-1} have been determined so that U = M'_{k-1} P_{k-1} ... M'_1 P_1 A Pi_1 ... Pi_{k-1}. Then the upper triangular matrix U is obtained from PAPi without exchanging any rows or columns, and U = M PAPi. Furthermore, if exch is the number of row exchanges plus the number of column exchanges, we have |A| = (-1)^exch u_{n,n}.

C. Matrices L and U

Now, although it is not strictly necessary, should we wish to obtain the matrix L, the elements of the matrices L = (l_ij) and U = (u_ij) can be computed starting from the following formulas [14]:

u_1k = a_1k,  k = 1, ..., n;
l_j1 = a_j1 / u_11,  j = 2, ..., n;
u_jk = ( a_jk - sum_{s=1}^{j-1} l_js u_sk ) u_{j-1,j-1},  k = j, ..., n,  j >= 2;
l_jk = (1/u_kk) ( a_jk - sum_{s=1}^{k-1} l_js u_sk ),  j = k, ..., n,  k >= 2.

This process demands O(n^3) floating point multiplications.

IV. EXACT CALCULATION OF THE ADJUGATE MATRIX Adj(A) WITH ANOTHER NEW LINEAR TRANSFORMATION

The adjugate is also called the adjoint. We avoid the latter usage because, in functional analysis, it refers to the equivalent of the conjugate of a matrix. The adjugate A^Adj of a matrix A is the transpose of the matrix of cofactors of the elements of A. The computation of the adjugate from its definition involves the calculation of n^2 determinants of order n-1. On the other hand, the calculation from the formula A^Adj = |A| A^{-1} breaks down when A is singular and is potentially unstable when A is ill-conditioned with respect to inversion [15].

Expressing the new linear transformation in Jordan version (with a_{0,0}^{(1)} = 1 and k = 1, ..., n), row k of M_{J_k} is e_k^T and, for i different from k, row i is

(1/a_{k-1,k-1}^{(k)}) ( a_{k,k}^{(k)} e_i^T - a_{i,k}^{(k)} e_k^T ),   (3)

so that the k-th step now eliminates the entries of column k above the pivot as well as below it. Another form to express the new transformation is

M_{J_k} = (1/a_{k-1,k-1}^{(k)}) ( a_{k,k}^{(k)} I_n - m_J^{(k)} e_k^T ),   (4)

where the vector m_J^{(k)} carries the k-th column entries a_{i,k}^{(k)} (i different from k), its k-th component being chosen so that row k of M_{J_k} reduces to e_k^T.

It is sufficient to perform the operations indicated by the new linear transformation M_J, mapping integer matrices to integer matrices, to find the adjugate matrix of size n:

M_J = prod_{k=n}^{1} M_{J_k} = A^Adj.

Calculating the matrix with M_J proves highly efficient when working with the augmented matrix (A|I):

M_{J_n} M_{J_{n-1}} ... M_{J_1} ( A | I_n ) = ( |A| I_n | A^Adj ).

By means of both a minimum and a complete pivoting strategy, we have

A^Adj = Pi M_J P.


Proof. Let it suffice to explain how the algorithm works with the following case. Let n = 3 and

A = [ a11 a12 a13 ; a21 a22 a23 ; a31 a32 a33 ];  P = [ 0 1 0 ; 1 0 0 ; 0 0 1 ];  Pi = [ 1 0 0 ; 0 0 1 ; 0 1 0 ].

Then

M_{J_3} M_{J_2} M_{J_1} P A Pi = |A| I
M_J = |A| (P A Pi)^{-1}
M_J = |A| Pi^{-1} A^{-1} P^{-1}
Pi M_J P = |A| A^{-1} = A^Adj.

For k = 1, ..., 3 and a_{0,0}^{(1)} = 1 we have

A^Adj = Pi M_{J_3} M_{J_2} M_{J_1} P =
[  |a22 a23; a32 a33|  -|a12 a13; a32 a33|   |a12 a13; a22 a23| ;
  -|a21 a23; a31 a33|   |a11 a13; a31 a33|  -|a11 a13; a21 a23| ;
   |a21 a22; a31 a32|  -|a11 a12; a31 a32|   |a11 a12; a21 a22| ].

A. Solution of Ax = b with M_J

Simultaneous linear equation systems can also be solved with this new linear transformation. Since M_J = prod_{k=n}^{1} M_{J_k} = A^Adj, then

A^Adj A x = A^Adj b;   |A| I x = A^Adj b;   x = A^Adj b / |A|.

This result is a Cramer-type solution in O(n^3). Consequently, this is a new result.

B. Numeric Results

We used two models of Tesla GPU: M2070-Q and K20c. The results are shown in a timing graph comparing the two models. [Figure: performance results on Tesla M2070-Q and K20c.]

V. CONCLUSION

In this paper we have introduced a new LU theorem on the decomposition of a matrix A into determinants, together with new linear transformations, expressed as equations (1) and (2): a modified Doolittle-Gauss elimination process. Additionally, we expressed in equations (3) and (4) a modified Doolittle-Gauss-Jordan elimination process to calculate the adjugate matrix.

We have also proposed the modified Doolittle-Gauss elimination process in two versions: the first applied to the matrix A and the second to the augmented matrix (A|b). The first is a new algorithm to compute determinants in exact form if and only if A has integer entries, and the second is a new method to solve linear systems of equations.

On the other hand, we have proposed the modified Doolittle-Gauss-Jordan elimination process in two versions: the first applied to the augmented matrix (A|I) and the second to the augmented matrix (A|b). The first version is a new algorithm to calculate the adjugate matrix in exact form if A has integer entries. The second version is a new direct method to solve linear systems of equations in exact form if A and b have integer entries. Provided that A and b have real entries, the above algorithms calculate, in approximate form, Cramer-type solutions of the linear systems of equations.

Gaussian elimination is usually the most economical way to solve Ax = b. Nevertheless, there are three reasons why this new method might be relevant when the matrix coefficients are integers:
(1) The flop counts tend to exaggerate the Gaussian elimination advantage.
(2) The present method provides guaranteed stability; there is no "growth factor" to worry about as in Gaussian elimination.
(3) In cases of ill-conditioning, the reliability of the present method is unsurpassed when A has integer entries.

Finally, referring to Cramer's rule, G. Strang [16] has affirmed that: "…Thus each component of x is a ratio of two determinants, a polynomial of degree n divided by another polynomial of degree n. This fact might have been recognized from Gauss elimination, but it never was". This is made evident in the present paper.

REFERENCES
[1] Curtis F. Gerald, Wheatley, Patrick O., Applied Numerical Analysis, Addison Wesley Publishing Co., p. 140, 1994.
[2] Lay, David C., Linear Algebra and Its Applications, Addison Wesley Publishing Co., p. 177, 1994.
[3] Terrence J. Akai, Applied Numerical Methods for Engineers, John Wiley and Sons, Inc., p. 56, 1994.
[4] Valenza, Robert J., Linear Algebra: An Introduction to Abstract Mathematics, Springer-Verlag, N.Y., p. 163, 1993.
[5] Kolman, Bernard, Introductory Linear Algebra with Applications, Macmillan Publishing Company, p. 98, 1993.


[6] Eaves, Edgar D., Carruth, J. Harvey, Introductory Mathematical Analysis, Wm. C. Brown Publishers, p. 524, 1993.
[7] Kahaner, D., Moler, C., Nash, S., Numerical Methods and Software, Prentice Hall, Englewood Cliffs, N.J., pp. 41-53, 1989.
[8] Wilkinson, J. H., The Algebraic Eigenvalue Problem, Oxford University Press, Oxford, pp. 189-264, 1965.
[9] Householder, A. S., The Theory of Matrices in Numerical Analysis, Blaisdell Publishing Company, N.Y., first edition, second printing, pp. 122-146, 1965.
[10] Stewart, G. W., Introduction to Matrix Computations, Academic Press, Inc., pp. 115, 120, 1973.
[11] Golub, G. H., Van Loan, Ch. F., Matrix Computations, Johns Hopkins University Press, p. 56, 1983.
[12] Grosswald, E., Topics from the Theory of Numbers, Macmillan Company, N.Y., pp. 277-279, 1966.
[13] González, H. E., Carmona L., J. J., A New LU Decomposition on Hybrid GPU-Accelerated Multicore Systems, Computación y Sistemas, Vol. 17, No. 3, pp. 413-422, 2013.
[14] González, H. E., Método Cramer-LU aplicado al algoritmo Simplex, Tesis, Universidad Nacional Autónoma de México (UNAM), pp. 15-17, 2005.
[15] Stewart, G. W., On the Adjugate Matrix, Linear Algebra and its Applications, Vol. 283, pp. 151-164, 1998.
[16] Strang, G., Linear Algebra and its Applications, Academic Press, Inc., pp. 163-164, 1976.

AUTHOR BIOGRAPHY

H. E. González. He received the PhD in Operations Research at the National Autonomous University of Mexico (UNAM) in 2005. He has been working in research activities for more than twenty years. He has published a book and more than ten scientific papers. Presently, he is a full-time researcher at the National Institute of Nuclear Research (ININ) at the Department of Systems.

J. J. Carmona L. received his B.S. in Electronic Engineering from UAM. Presently, he is a full-time researcher at the National Institute of Nuclear Research (ININ) at the Department of Systems.
