HP MLIB User's Guide

Total Page:16

File Type:pdf, Size:1020Kb

HP MLIB User's Guide HP MLIB User’s Guide Seventh Edition HP Part No. B6061-96027 and B6061-96028 HP MLIB December 2004 Printed in USA Edition: Seventh Document Numbers: B6061-96027 and B6061-96028 Remarks: Released Dcember 2004 with HP MLIB software version9.0. User’s Guide split into two volumes with this release. Edition: Sixth Document Number: B6061-96023 Remarks: Released September 2003 with HP MLIB software version 8.50. Edition: Fifth Document Number: B6061-96020 Remarks: Released September 2002 with HP MLIB software version 8.30. Edition: Fourth Document Number: B6061-96017 Remarks: Released June 2002 with HP MLIB software version 8.20. Edition: Third Document Number: B6061-96015 Remarks: Released September 2001 with HP MLIB software version 8.10. Edition: Second Document Number: B6061-96012 Remarks: Released June 2001 with HP MLIB software version 8.00. Edition: First Document Number: B6061-96010 Remarks: Released December 1999 with HP MLIB software version B.07.00. This HP MLIB User’s Guide contains both HP VECLIB and HP LAPACK information. This document supercedes both the HP MLIB VECLIB User’s Guide, fifth edition (B6061-96006) and the HP MLIB LAPACK User’s Guide, sixth edition (B6061-96005). Notice Copyright 1979-2004 Hewlett-Packard Development Company. All Rights Reserved. Reproduction, adaptation, or translation without prior written permission is prohibited, except as allowed under the copyright laws. The information contained in this document is subject to change without notice. Hewlett-Packard makes no warranty of any kind with regard to this material, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Hewlett-Packard shall not be liable for errors contained herein or for incidental or consequential damages in connection with the furnishing, performance or use of this material. Table of Contents Preface . xv VECLIB . xv LAPACK . xvi ScaLAPACK . xvi SuperLU . xvii SOLVERS . xvii VMATH . xviii Purpose and audience . xix Organization . xx Notational conventions . xxii Documentation resources . xxiii Part 1 1 Introduction to VECLIB . 1 Overview . 1 Chapter objectives . 2 Standardization . 3 Accessing VECLIB . 3 Compiling and linking (VECLIB) . 5 Problem with +ppu compatibility and duplicated symbols . 14 Optimization . 17 Parallel processing . .17 Linking for parallel or non parallel processing . 18 Controlling VECLIB parallelism at runtime . 18 Performance benefits . 19 OpenMP-based nested parallelism . 20 v Message passing-based nested parallelism . 21 Default CPS library stack is too small for MLIB . 22 Default Pthread library stack is too small for MLIB . 22 Roundoff effects . 23 Data types and precision . 23 VECLIB naming convention . 24 Data type and byte length . 25 Operator arguments . 25 Error handling . 28 Low-level subprograms . 28 High-level subprograms . 28 Troubleshooting . .29 HP MLIB man pages . 30 2 Basic Vector Operations . 31 Overview . 31 Chapter objectives . .32 Associated documentation . 32 What you need to know to use vector subprograms . 33 BLAS storage conventions . 33 BLAS indexing conventions . 35 Operator arguments in the BLAS Standard . 36 Representation of a permutation matrix . 37 Representation of a Householder matrix . 38 Subprograms for basic vector operations . 39 ISAMAX/IDAMAX/IIAMAX/ICAMAX/IZAMAX Index of maximum of magnitudes 40 ISAMIN/IDAMIN/IIAMIN/ICAMIN/IZAMIN Index of minimum of magnitudes . 43 ISCTxx/IDCTxx/IICTxx/ICCTxx/IZCTxx Count selected vector elements. 46 ISMAX/IDMAX/IIMAX Index of maximum element of vector . 49 ISMIN/IDMIN/IIMIN Index of minimum element of vector. 51 ISSVxx/IDSVxx/IISVxx/ICSVxx/IZSVxx Search vector for element . 53 SAMAX/DAMAX/IAMAX/SCAMAX/DZAMAX Maximum of magnitudes . 56 SAMIN/DAMIN/IAMIN/SCAMIN/DZAMIN Minimum of magnitudes . 59 vi Table of Contents SASUM/DASUM/IASUM/SCASUM/DZASUM Sum of magnitudes . 62 SAXPY/DAXPY/CAXPY/CAXPYC/ZAXPY/ZAXPYC Elementary vector operation 65 SAXPYI/DAXPYI/CAXPYI/ZAXPYI Sparse elementary vector operation . 68 SCLIP/DCLIP/ICLIP Two sided vector clip . 71 SCLIPL/DCLIPL/ICLIPL Left sided vector clip . 74 SCLIPR/DCLIPR/ICLIPR Right sided vector clip . 77 SCOPY/DCOPY/ICOPY/CCOPY/CCOPYC/ZCOPY/ZCOPYC Copy vector . 80 SDOT/DDOT/CDOTC/CDOTU/ZDOTC/ZDOTU Dot product . 84 SDOTI/DDOTI/CDOTCI/CDOTUI/ZDOTCI/ZDOTUI Sparse dot product . 88 SFRAC/DFRAC Extract fractional parts. 92 SGTHR/DGTHR/IGTHR/CGTHR/ZGTHR Gather sparse vector . 94 SGTHRZ/DGTHRZ/IGTHRZ/CGTHRZ/ZGTHRZ Gather and zero sparse vector . 96 SLSTxx/DLSTxx/ILSTxx/CLSTxx/ZLSTxx List selected vector elements . 99 SMAX/DMAX/IMAX Maximum of vector . 103 SMIN/DMIN/IMIN Minimum of vector. 105 SNRM2/DNRM2/SCNRM2/DZNRM2 Euclidean norm . 107 SNRSQ/DNRSQ/SCNRSQ/DZNRSQ Euclidean norm squared . 110 SRAMP/DRAMP/IRAMP Generate linear ramp. 112 SROT/DROT/CROT/CSROT/ZROT/ZDROT Apply Givens rotation . 114 SROTG/DROTG/CROTG/ZROTG Construct Givens rotation . 118 SROTI/DROTI Apply sparse Givens rotation . 120 SROTM/DROTM Apply modified Givens rotation . 123 SROTMG/DROTMG Construct modified Givens rotation . 127 SRSCL/DRSCL/CRSCL/CSRSCL/ZRSCL/ZDRSCL Scale vector. 130 SSCAL/DSCAL/CSCAL/CSSCAL/CSCALC/ZSCAL/ZDSCAL/ZSCALC Scale vector . 133 SSCTR/DSCTR/ISCTR/CSCTR/ZSCTR Scatter sparse vector. 136 SSUM/DSUM/ISUM/CSUM/ZSUM Vector sum . 139 SSWAP/DSWAP/ISWAP/CSWAP/ZSWAP Swap two vectors . 141 SWDOT/DWDOT/CWDOTC/CWDOTU/ZWDOTC/ZWDOTU Weighted dot product . 145 SZERO/DZERO/IZERO/CZERO/ZZERO Clear vector . 150 F_SAMAX_VAL/F_DAMAX_VAL/F_CAMAX_VAL/F_ZAMAX_VAL Maximum absolute value and location . 153 F_SAMIN_VAL/F_DAMIN_VAL/F_CAMIN_VAL/F_ZAMIN_VAL Minimum absolute value.
Recommended publications
  • New Fast and Accurate Jacobi Svd Algorithm: I
    NEW FAST AND ACCURATE JACOBI SVD ALGORITHM: I. ZLATKO DRMAC· ¤ AND KRESIMIR· VESELIC¶ y Abstract. This paper is the result of contrived e®orts to break the barrier between numerical accuracy and run time e±ciency in computing the fundamental decomposition of numerical linear algebra { the singular value decomposition (SVD) of a general dense matrix. It is an unfortunate fact that the numerically most accurate one{sided Jacobi SVD algorithm is several times slower than generally less accurate bidiagonalization based methods such as the QR or the divide and conquer algorithm. Despite its sound numerical qualities, the Jacobi SVD is not included in the state of the art matrix computation libraries and it is even considered obsolete by some leading researches. Our quest for a highly accurate and e±cient SVD algorithm has led us to a new, superior variant of the Jacobi algorithm. The new algorithm has inherited all good high accuracy properties, and it outperforms not only the best implementations of the one{sided Jacobi algorithm but also the QR algorithm. Moreover, it seems that the potential of the new approach is yet to be fully exploited. Key words. Jacobi method, singular value decomposition, eigenvalues AMS subject classi¯cations. 15A09, 15A12, 15A18, 15A23, 65F15, 65F22, 65F35 1. Introduction. In der Theorie der SÄacularstÄorungenund der kleinen Oscil- lationen wird man auf ein System lineÄarerGleichungen gefÄurt,in welchem die Co- e±cienten der verschiedenen Unbekanten in Bezug auf die Diagonale symmetrisch sind, die ganz constanten Glieder fehlen und zu allen in der Diagonale be¯ndlichen Coe±cienten noch dieselbe GrÄo¼e ¡x addirt ist.
    [Show full text]
  • November 1960 Table of Contents
    AMERICAN MATHEMATICAL SOCIETY VOLUME 7, NUMBER 6 ISSUE NO. 49 NOVEMBER 1960 AMERICAN MATHEMATICAL SOCIETY Nottces Edited by GORDON L. WALKER Contents MEETINGS Calendar of Meetings ••••••••••.••••••••••••.••• 660 Program of the November Meeting in Nashville, Tennessee . 661 Abstracts for Meeting on Pages 721-732 Program of the November Meeting in Pasadena, California • 666 Abstracts for Meeting on Pages 733-746 Program of the November Meeting in Evanston, Illinois . • • 672 Abstracts for Meeting on Pages 747-760 PRELIMINARY ANNOUNCEMENT OF MEETINGS •.•.••••• 677 NEWS AND COMMENT FROM THE CONFERENCE BOARD OF THE MATHEMATICAL SCIENCES ••••••••••••.••• 680 FROM THE AMS SECRETARY •...•••.••••••••.•••.• 682 NEWS ITEMS AND ANNOUNCEMENTS .•••••.•••••••••• 683 FEATURE ARTICLES The Sino-American Conference on Intellectual Cooperation .• 689 National Academy of Sciences -National Research Council .• 693 International Congress of Applied Mechanics ..••••••••. 700 PERSONAL ITEMS ••.•.••.•••••••••••••••••••••• 702 LETTERS TO THE EDITOR.. • • • • • • • . • . • . 710 MEMORANDA TO MEMBERS The Employment Register . • . • • . • • • • • . • • • • 712 Employment of Retired Mathematicians ••••••.••••..• 712 Abstracts of Papers by Title • • . • . • • • . • . • • . • . • • . • 713 Reciprocity Agreement with the Societe Mathematique de Belgique .••.•••.••.••.•.•••••..•••••.•. 713 NEW PUBLICATIONS ..•..•••••••••.•••.•••••••••• 714 ABSTRACTS OF CONTRIBUTED PAPERS •.•••••••.••..• 715 RESERVATION FORM .•...•.•.•••••.••••..•••...• 767 MEETINGS CALENDAR OF MEETINGS Note: This Calendar lists all of the meetings which have been approved by the Council up to the date at which this issue of the NOTICES was sent to press. The summer and annual meetings are joint meetings of the Mathematical Asso­ ciation of America and the American Mathematical Society. The meeting dates which fall rather far in the future are subject to change. This is particularly true of the meetings to which no numbers have yet been assigned. Meet­ Deadline ing Date Place for No.
    [Show full text]
  • REPORTS in INFORMATICS Department of Informatics
    REPORTS IN INFORMATICS ISSN 0333-3590 Sparsity in Higher Order Methods in Optimization Geir Gundersen and Trond Steihaug REPORT NO 327 June 2006 Department of Informatics UNIVERSITY OF BERGEN Bergen, Norway This report has URL http://www.ii.uib.no/publikasjoner/texrap/ps/2006-327.ps Reports in Informatics from Department of Informatics, University of Bergen, Norway, is available at http://www.ii.uib.no/publikasjoner/texrap/. Requests for paper copies of this report can be sent to: Department of Informatics, University of Bergen, Høyteknologisenteret, P.O. Box 7800, N-5020 Bergen, Norway Sparsity in Higher Order Methods in Optimization Geir Gundersen and Trond Steihaug Department of Informatics University of Bergen Norway fgeirg,[email protected] 29th June 2006 Abstract In this paper it is shown that when the sparsity structure of the problem is utilized higher order methods are competitive to second order methods (Newton), for solving unconstrained optimization problems when the objective function is three times continuously differentiable. It is also shown how to arrange the computations of the higher derivatives. 1 Introduction The use of higher order methods using exact derivatives in unconstrained optimization have not been considered practical from a computational point of view. We will show when the sparsity structure of the problem is utilized higher order methods are competitive compared to second order methods (Newton). The sparsity structure of the tensor is induced by the sparsity structure of the Hessian matrix. This is utilized to make efficient algorithms for Hessian matrices with a skyline structure. We show that classical third order methods, i.e Halley's, Chebyshev's and Super Halley's method, can all be regarded as two step Newton like methods.
    [Show full text]
  • Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods1
    Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods1 Richard Barrett2, Michael Berry3, Tony F. Chan4, James Demmel5, June M. Donato6, Jack Dongarra3,2, Victor Eijkhout7, Roldan Pozo8, Charles Romine9, and Henk Van der Vorst10 This document is the electronic version of the 2nd edition of the Templates book, which is available for purchase from the Society for Industrial and Applied Mathematics (http://www.siam.org/books). 1This work was supported in part by DARPA and ARO under contract number DAAL03-91-C-0047, the National Science Foundation Science and Technology Center Cooperative Agreement No. CCR-8809615, the Applied Mathematical Sciences subprogram of the Office of Energy Research, U.S. Department of Energy, under Contract DE-AC05-84OR21400, and the Stichting Nationale Computer Faciliteit (NCF) by Grant CRG 92.03. 2Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830- 6173. 3Department of Computer Science, University of Tennessee, Knoxville, TN 37996. 4Applied Mathematics Department, University of California, Los Angeles, CA 90024-1555. 5Computer Science Division and Mathematics Department, University of California, Berkeley, CA 94720. 6Science Applications International Corporation, Oak Ridge, TN 37831 7Texas Advanced Computing Center, The University of Texas at Austin, Austin, TX 78758 8National Institute of Standards and Technology, Gaithersburg, MD 9Office of Science and Technology Policy, Executive Office of the President 10Department of Mathematics, Utrecht University, Utrecht, the Netherlands. ii How to Use This Book We have divided this book into five main chapters. Chapter 1 gives the motivation for this book and the use of templates. Chapter 2 describes stationary and nonstationary iterative methods.
    [Show full text]
  • Applied Numerical Linear Algebra. Lecture 14
    Algorithms for the Symmetric Eigenproblem: Divide-and-Conquer Algorithms for the Symmetric Eigenproblem: Bisection and Inverse Iteration Algorithms for the Symmetric Eigenproblem: Jacobi’s Method Algorithms for the Singular Value Decomposition(SVD) Applied Numerical Linear Algebra. Lecture 14. 1/54 Algorithms for the Symmetric Eigenproblem: Divide-and-Conquer Algorithms for the Symmetric Eigenproblem: Bisection and Inverse Iteration Algorithms for the Symmetric Eigenproblem: Jacobi’s Method Algorithms for the Singular Value Decomposition(SVD) 5.3.3. Divide-and-Conquer This method is the fastest now available if you want all eigenvalues and eigenvectors of a tridiagonal matrix whose dimension is larger than about 25. (The exact threshold depends on the computer.) It is quite subtle to implement in a numerically stable way. Indeed, although this method was first introduced in 1981 [J. J. M. Cuppen. A divide and conquer method for the symmetric tridiagonal eigenproblem. Numer. Math., 36:177-195, 1981], the ”right” implementation was not discovered until 1992 [M. Gu and S. Eisenstat. A stable algorithm for the rank-1 modification of the symmetric eigenproblem. Computer Science Dept. Report YALEU/DCS/RR-916, Yale University, September 1992;M. Gu and S. C. Eisenstat. A divide-and-conquer algorithm for the symmetric tridiagonal eigenproblem. SIAM J. Matrix Anal Appl, 16:172-191, 1995]). 2/54 Algorithms for the Symmetric Eigenproblem: Divide-and-Conquer Algorithms for the Symmetric Eigenproblem: Bisection and Inverse Iteration Algorithms for the Symmetric Eigenproblem: Jacobi’s Method Algorithms for the Singular Value Decomposition(SVD) a1 b1 0 ... ... 0 b1 a2 b2 0 ... 0 ... ... ... ... ... ... ... am 1 bm 1 ... ... 0 − − 0 bm 1 am bm 0 0 T = − 0 0 b a b 0 m m+1 m+1 0 0 0 b ..
    [Show full text]
  • The Astrometric Core Solution for the Gaia Mission Overview of Models, Algorithms and Software Implementation
    Astronomy & Astrophysics manuscript no. aa17905-11 c ESO 2018 October 30, 2018 The astrometric core solution for the Gaia mission Overview of models, algorithms and software implementation L. Lindegren1, U. Lammers2, D. Hobbs1, W. O’Mullane2, U. Bastian3, and J. Hernandez´ 2 1 Lund Observatory, Lund University, Box 43, SE-22100 Lund, Sweden e-mail: Lennart.Lindegren, [email protected] 2 European Space Agency (ESA), European Space Astronomy Centre (ESAC), P.O. Box (Apdo. de Correos) 78, ES-28691 Villanueva de la Canada,˜ Madrid, Spain e-mail: Uwe.Lammers, William.OMullane, [email protected] 3 Astronomisches Rechen-Institut, Zentrum fur¨ Astronomie der Universitat¨ Heidelberg, Monchhofstr.¨ 12–14, DE-69120 Heidelberg, Germany e-mail: [email protected] Received 17 August 2011 / Accepted 25 November 2011 ABSTRACT Context. The Gaia satellite will observe about one billion stars and other point-like sources. The astrometric core solution will determine the astrometric parameters (position, parallax, and proper motion) for a subset of these sources, using a global solution approach which must also include a large number of parameters for the satellite attitude and optical instrument. The accurate and efficient implementation of this solution is an extremely demanding task, but crucial for the outcome of the mission. Aims. We aim to provide a comprehensive overview of the mathematical and physical models applicable to this solution, as well as its numerical and algorithmic framework. Methods. The astrometric core solution is a simultaneous least-squares estimation of about half a billion parameters, including the astrometric parameters for some 100 million well-behaved so-called primary sources.
    [Show full text]
  • The Singular Value Decomposition: Anatomy of Optimizing an Algorithm for Extreme Scale
    The University of Manchester Research The singular value decomposition DOI: 10.1137/17M1117732 Document Version Final published version Link to publication record in Manchester Research Explorer Citation for published version (APA): Dongarra, J., Gates, M., Haidar, A., Kurzak, J., Luszczek, P., Tomov, S., & Yamazaki, I. (2018). The singular value decomposition: Anatomy of optimizing an algorithm for extreme scale. SIAM REVIEW, 60(4), 808-865. https://doi.org/10.1137/17M1117732 Published in: SIAM REVIEW Citing this paper Please note that where the full-text provided on Manchester Research Explorer is the Author Accepted Manuscript or Proof version this may differ from the final Published version. If citing, it is advised that you check and use the publisher's definitive version. General rights Copyright and moral rights for the publications made accessible in the Research Explorer are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. Takedown policy If you believe that this document breaches copyright please refer to the University of Manchester’s Takedown Procedures [http://man.ac.uk/04Y6Bo] or contact [email protected] providing relevant details, so we can investigate your claim. Download date:06. Oct. 2021 SIAM REVIEW \bigcirc c 2018 Society for Industrial and Applied Mathematics Vol. 60, No. 4, pp. 808–865 The Singular Value Decomposition: Anatomy of Optimizing an Algorithm \ast for Extreme Scale Jack Dongarray z Mark Gates Azzam Haidarz Jakub Kurzakz Piotr Luszczekz Stanimire Tomovz Ichitaro Yamazakiz Abstract. The computation of the singular value decomposition, or SVD, has a long history with many improvements over the years, both in its implementations and algorithmically.
    [Show full text]
  • Singular Value Decomposition in Embedded Systems Based on ARM Cortex-M Architecture
    electronics Article Singular Value Decomposition in Embedded Systems Based on ARM Cortex-M Architecture Michele Alessandrini †, Giorgio Biagetti † , Paolo Crippa *,† , Laura Falaschetti † , Lorenzo Manoni † and Claudio Turchetti † Department of Information Engineering, Università Politecnica delle Marche, Via Brecce Bianche 12, I-60131 Ancona, Italy; [email protected] (M.A.); [email protected] (G.B.); [email protected] (L.F.); [email protected] (L.M.); [email protected] (C.T.) * Correspondence: [email protected]; Tel.: +39-071-220-4541 † These authors contributed equally to this work. Abstract: Singular value decomposition (SVD) is a central mathematical tool for several emerging applications in embedded systems, such as multiple-input multiple-output (MIMO) systems, data an- alytics, sparse representation of signals. Since SVD algorithms reduce to solve an eigenvalue problem, that is computationally expensive, both specific hardware solutions and parallel implementations have been proposed to overcome this bottleneck. However, as those solutions require additional hardware resources that are not in general available in embedded systems, optimized algorithms are demanded in this context. The aim of this paper is to present an efficient implementation of the SVD algorithm on ARM Cortex-M. To this end, we proceed to (i) present a comprehensive treatment of the most common algorithms for SVD, providing a fairly complete and deep overview of these algorithms, with a common notation, (ii) implement them on an ARM Cortex-M4F microcontroller, in order to develop a library suitable for embedded systems without an operating system, (iii) find, through a comparative study of the proposed SVD algorithms, the best implementation suitable for a low-resource bare-metal embedded system, (iv) show a practical application to Kalman filtering of an inertial measurement unit (IMU), as an example of how SVD can improve the accuracy of existing algorithms and of its usefulness on a such low-resources system.
    [Show full text]
  • Object Oriented Design Philosophy for Scientific Computing
    Mathematical Modelling and Numerical Analysis ESAIM: M2AN Mod´elisation Math´ematique et Analyse Num´erique M2AN, Vol. 36, No 5, 2002, pp. 793–807 DOI: 10.1051/m2an:2002041 OBJECT ORIENTED DESIGN PHILOSOPHY FOR SCIENTIFIC COMPUTING Philippe R.B. Devloo1 and Gustavo C. Longhin1 Abstract. This contribution gives an overview of current research in applying object oriented pro- gramming to scientific computing at the computational mechanics laboratory (LABMEC) at the school of civil engineering – UNICAMP. The main goal of applying object oriented programming to scien- tific computing is to implement increasingly complex algorithms in a structured manner and to hide the complexity behind a simple user interface. The following areas are current topics of research and documented within the paper: hp-adaptive finite elements in one-, two- and three dimensions with the development of automatic refinement strategies, multigrid methods applied to adaptively refined finite element solution spaces and parallel computing. Mathematics Subject Classification. 65M60, 65G20, 65M55, 65Y99, 65Y05. Received: 3 December, 2001. 1. Introduction Finite element programming has been a craft since the decade of 1960. Ever since, finite element programs are considered complex, especially when compared to their rival numerical approximation technique, the finite difference method. Essential elements of complexity of the finite element method are: • Meshes are generally unstructured, leading to the necessity of node renumbering schemes and complex sparcity structures of the tangent
    [Show full text]
  • Ubung 4.10 Check That the Inverse of L
    Ubung¨ 4.10 Check that the inverse of L(k) is given by 1 .. 0 . . . .. .. . . 0 1 (L(k))−1 = . (4.7) . .. . −lk ,k . +1 . . 0 .. . . .. .. 0 ··· 0 −l 0 ··· 0 1 n,k Each step of the above Gaussian elimination process sketched in Fig. 4.2 is realized by a multiplication from the left by a matrix, in fact, the matrix (L(k))−1 (cf. (4.7)). That is, the Gaussian elimination process can be described as A = A(1) → A(2) =(L(1))−1A(1) → A(3) =(L(2))−1A(2) =(L(2))−1(L(1))−1A(1) → ... → A(n) =(L(n−1))−1A(n−1) = ... =(L(n−1))−1(L(n−2))−1 ... (L(2))−1(L(1))−1A(1) =:U upper triangular Rewriting this|{z yields,} the LU-factorization A = L(1) ··· L(n−1) U =:L The matrix L is a lower triangular matrix| as{z the product} of lower triangular matrices (cf. Exercise 4.5). In fact, due to the special structure of the matrices L(k), it is given by 1 . l .. L = 21 . . .. .. ln ··· ln,n 1 1 −1 as the following exercise shows: Ubung¨ 4.11 For each k the product L(k)L(k+1) ··· L(n−1) is given by 1 .. 0 . . . .. .. . . 0 1 L(k)L(k+1) ··· L(n−1) = . . . lk ,k 1 +1 . . l .. k+2,k+1 . . .. .. 0 ··· 0 l l ··· l 1 n,k n,k+1 n,n−1 40 Thus, we have shown that Gaussian elimination produces a factorization A = LU, (4.8) where L and U are lower and upper triangular matrices determined by Alg.
    [Show full text]
  • New Fast and Accurate Jacobi Svd Algorithm: I. Lapack Working Note 169
    NEW FAST AND ACCURATE JACOBI SVD ALGORITHM: I. LAPACK WORKING NOTE 169 ZLATKO DRMACˇ ∗ AND KRESIMIRˇ VESELIC´ † Abstract. This paper is the result of contrived efforts to break the barrier between numerical accuracy and run time efficiency in computing the fundamental decomposition of numerical linear algebra – the singular value decomposition (SVD) of a general dense matrix. It is an unfortunate fact that the numerically most accurate one–sided Jacobi SVD algorithm is several times slower than generally less accurate bidiagonalization based methods such as the QR or the divide and conquer algorithm. Despite its sound numerical qualities, the Jacobi SVD is not included in the state of the art matrix computation libraries and it is even considered obsolete by some leading researches. Our quest for a highly accurate and efficient SVD algorithm has led us to a new, superior variant of the Jacobi algorithm. The new algorithm has inherited all good high accuracy properties, and it outperforms not only the best implementations of the one–sided Jacobi algorithm but also the QR algorithm. Moreover, it seems that the potential of the new approach is yet to be fully exploited. Key words. Jacobi method, singular value decomposition, eigenvalues AMS subject classifications. 15A09, 15A12, 15A18, 15A23, 65F15, 65F22, 65F35 1. Introduction. In der Theorie der S¨acularst¨orungen und der kleinen Oscil- lationen wird man auf ein System line¨arer Gleichungen gef¨urt, in welchem die Co- efficienten der verschiedenen Unbekanten in Bezug auf die Diagonale symmetrisch sind, die ganz constanten Glieder fehlen und zu allen in der Diagonale befindlichen Coefficienten noch dieselbe Gr¨oße x addirt ist.
    [Show full text]
  • Arxiv:1301.5758V3
    Generalized Householder Transformations for the Complex Symmetric Eigenvalue Problem J. H. Noble,1 M. Lubasch,2 and U. D. Jentschura1, 3 1Department of Physics, Missouri University of Science and Technology, Rolla, Missouri 65409, USA 2Max–Planck–Institut f¨ur Quantenoptik, Hans–Kopfermann–Straße 1, 85748 Garching, Germany 3MTA–DE Particle Physics Research Group, P.O.Box 51, H-4001 Debrecen, Hungary We present an intuitive and scalable algorithm for the diagonalization of complex symmet- ric matrices, which arise from the projection of pseudo–Hermitian and complex scaled Hamilto- nians onto a suitable basis set of “trial” states. The algorithm diagonalizes complex and sym- metric (non–Hermitian) matrices and is easily implemented in modern computer languages. It is based on generalized Householder transformations and relies on iterative similarity transformations ′ − T → T = QT T Q, where Q is a complex and orthogonal, but not unitary, matrix, i.e, QT = Q 1 − but Q+ 6= Q 1. We present numerical reference data to support the scalability of the algorithm. We construct the generalized Householder transformations from the notion that the conserved scalar product of eigenstates Ψn and Ψm of a pseudo–Hermitian quantum mechanical Hamiltonian can be reformulated in terms of the generalized indefinite inner product R dx Ψn(x,t)Ψm(x,t), where the integrand is locally defined, and complex conjugation is avoided. A few example calculations are described which illustrate the physical origin of the ideas used in the construction of the algorithm. PACS numbers: 03.65.Ge, 02.60.Lj, 11.15.Bt, 11.30.Er, 02.60.-x, 02.60.Dc I.
    [Show full text]