libflame: The Complete Reference (version 5.1.0-56)

Field G. Van Zee
The University of Texas at Austin

Copyright © 2011 by Field G. Van Zee.

All rights reserved. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, contact either of the authors.

No warranties, express or implied, are made by the publisher, authors, and their employers that the programs contained in this volume are free of error. They should not be relied on as the sole basis to solve a problem whose incorrect solution could result in injury to person or property. If the programs are employed in such a manner, it is at the user's own risk and the publisher, authors, and their employers disclaim all liability for such misuse.

Trademarked names may be used in this book without the inclusion of a trademark symbol. These names are used in an editorial context only; no infringement of trademark is intended.

Library of Congress Cataloging-in-Publication Data not yet available

Draft, November 2008

This "Draft Edition" allows this material to be used while we sort out through what mechanism we will publish the book.

Contents

1. Introduction
    1.1. What's provided
    1.2. What's not provided
    1.3. Acknowledgments
2. Setup for GNU/Linux and UNIX
    2.1. Before obtaining libflame
        2.1.1. System software requirements
        2.1.2. System hardware support
        2.1.3. License
        2.1.4. Source code
        2.1.5. Tracking source code revisions
        2.1.6. If you have problems
    2.2. Preparation
    2.3. Configuration
        2.3.1. configure options
        2.3.2. Running configure
    2.4. Compiling
        2.4.1. Parallel make
    2.5. Installation
    2.6. Linking against libflame
        2.6.1. Linking with the lapack2flame compatibility layer
3. Setup for Microsoft Windows
    3.1. Before obtaining libflame
        3.1.1. System software requirements
        3.1.2. System hardware support
        3.1.3. License
        3.1.4. Source code
        3.1.5. Tracking source code revisions
        3.1.6. If you have problems
    3.2. Preparation
    3.3. Configuration
        3.3.1. IronPython
        3.3.2. Running configure.cmd
    3.4. Compiling
    3.5. Installation
    3.6. Dynamic library generation
    3.7. Linking against libflame
4. Using libflame
    4.1. FLAME/C examples
    4.2. FLASH examples
    4.3. SuperMatrix examples
5. User-level Application Programming Interfaces
    5.1. Conventions
        5.1.1. General terms
        5.1.2. Notation
        5.1.3. Objects
    5.2. FLAME/C Basics
        5.2.1. Initialization and finalization
        5.2.2. Object creation and destruction
        5.2.3. General query functions
        5.2.4. Interfacing with conventional matrix arrays
        5.2.5. More query functions
        5.2.6. Assignment/Update functions
        5.2.7. Math-related functions
        5.2.8. Miscellaneous functions
        5.2.9. Advanced query routines
    5.3. Managing Views
        5.3.1. Vertical partitioning
        5.3.2. Horizontal partitioning
        5.3.3. Bidirectional partitioning
        5.3.4. Merging views
    5.4. FLASH
        5.4.1. Motivation
        5.4.2. Concepts
        5.4.3. Interoperability with FLAME/C
        5.4.4. Object creation and destruction
        5.4.5. Interfacing with flat matrix objects
        5.4.6. Interfacing with conventional matrix arrays
        5.4.7. Object query functions
        5.4.8. Managing Views
            5.4.8.1. Vertical partitioning
            5.4.8.2. Horizontal partitioning
            5.4.8.3. Bidirectional partitioning
        5.4.9. Utility functions
            5.4.9.1. Miscellaneous functions
    5.5. SuperMatrix
        5.5.1. Overview
        5.5.2. API
        5.5.3. Integration with FLASH front-ends
    5.6. Front-ends
        5.6.1. BLAS operations
            5.6.1.1. Level-1 BLAS
            5.6.1.2. Level-2 BLAS
            5.6.1.3. Level-3 BLAS
        5.6.2. LAPACK operations
        5.6.3. Utility functions
    5.7. External wrappers
        5.7.1. BLAS operations
            5.7.1.1. Level-1 BLAS
            5.7.1.2. Level-2 BLAS
            5.7.1.3. Level-3 BLAS
        5.7.2. LAPACK operations
        5.7.3. LAPACK-related utility functions
    5.8. LAPACK compatibility (lapack2flame)
        5.8.1. Supported routines
A. FLAME Project Related Publications
    A.1. Books
    A.2. Dissertations
    A.3. Journal Articles
    A.4. Conference Papers
    A.5. FLAME Working Notes
    A.6. Other Technical Reports
B. License
    B.1. BSD 3-clause license

List of Contributors

A large number of people have contributed, and continue to contribute, to the FLAME project. For a complete list, please visit http://www.cs.utexas.edu/users/flame/

Below we list the people who have contributed directly to the knowledge and understanding that is summarized in this text.

Paolo Bientinesi, The University of Texas at Austin
Ernie Chan, The University of Texas at Austin
John A. Gunnels, IBM T.J. Watson Research Center
Kazushige Goto, The University of Texas at Austin
Tze Meng Low, The University of Texas at Austin
Margaret E. Myers, The University of Texas at Austin
Enrique S. Quintana-Ortí, Universidad Jaume I
Gregorio Quintana-Ortí, Universidad Jaume I
Robert A. van de Geijn, The University of Texas at Austin

Chapter 1
Introduction

In past years, the FLAME project, a collaborative effort between The University of Texas at Austin and Universidad Jaime I de Castellón, developed a unique methodology, notation, and set of APIs for deriving and representing linear algebra libraries. In an effort to better promote the techniques characteristic of the FLAME project, we have implemented a functional prototype library that demonstrates findings and insights from the last decade of research. We call this library libflame.

The primary purpose of libflame is to provide the scientific and numerical computing communities with a modern, high-performance dense linear algebra library that is extensible, easy to use, and available under an open source license. Its developers have published numerous papers and working notes over the last decade documenting the challenges and motivations
