Twenty-Plus Years of Netlib and NA-Net
Jack Dongarra and Gene Golub
Pages: 16 | File type: PDF | Size: 1020 KB
Recommended publications
-
Gene H. Golub 1932–2007
Gene H. Golub 1932–2007: A Biographical Memoir by Dianne P. O'Leary. ©2018 National Academy of Sciences. Any opinions expressed in this memoir are those of the author and do not necessarily reflect the views of the National Academy of Sciences.

Gene Howard Golub, February 29, 1932 – November 16, 2007. Elected to the NAS, 1993. Gene Howard Golub was the most influential person of his generation in the field of numerical analysis, in particular in the area of matrix computation. This was a consequence not just of his extraordinary technical contributions but was also due to his clear writing, his influential treatise on matrix computation, his mentorship of a host of students and collaborators, his service as founder and editor of two journals, his leadership in the Society for Industrial and Applied Mathematics (SIAM), his efforts to bring colleagues together in meetings and electronic forums, and his role as a research catalyst.

Biography. Gene was born in Chicago, Illinois, on Leap Day, February 29, 1932, to Bernice Gelman Golub (an immigrant to the U.S. from Latvia) and Nathan Golub (from Ukraine). Gene's father worked as a "bread man" and his mother was a seamstress, and Gene was proud of his Jewish heritage. From the age of 12 Gene worked, first at his cousin's pharmacy and later at other jobs. After junior college and a year at the University of Chicago, he transferred to the University of Illinois in Urbana-Champaign, where he received his BS in Mathematics (1953), MA in Mathematical Statistics (1954), and PhD in Mathematics (1959).
-
Overview of ScaLAPACK
Overview of ScaLAPACK. Jack Dongarra, Computer Science Department, University of Tennessee, and Mathematical Sciences Section, Oak Ridge National Laboratory. http://www.netlib.org/utk/people/JackDongarra.html

The situation: parallel scientific applications are typically written from scratch, or manually adapted from sequential programs, using a simple SPMD model in Fortran or C with explicit message passing primitives. This means similar components are coded over and over again, codes are difficult to develop and maintain, debugging is particularly unpleasant, components are difficult to reuse, and the result is not really designed as a long-term software solution.

What should math software look like? There are many more possibilities than shared memory, and a multiplicity of interfaces promotes ease of use. Possible approaches: minimum change to the status quo (message passing and object oriented); reusable templates (requires a sophisticated user); problem solving environments (integrated systems a la Matlab or Mathematica); a computational server, like netlib/xnetlib or f2c. Some users want high performance and access to details; others sacrifice speed to hide details. There is not enough penetration of libraries into the hardest applications on vector machines; we need to look at applications and rethink mathematical software.

The NetSolve client: a virtual software library offering programming interfaces (C interface, FORTRAN interface, use with heterogeneous platforms), non-graphic interactive interfaces (MATLAB interface, UNIX-shell interface), and graphic interfaces (Tk/Tcl interface, HotJava browser).
-
Prospectus for the Next LAPACK and ScaLAPACK Libraries
Prospectus for the Next LAPACK and ScaLAPACK Libraries. James Demmel (1), Jack Dongarra (2,3), Beresford Parlett (1), William Kahan (1), Ming Gu (1), David Bindel (1), Yozo Hida (1), Xiaoye Li (1), Osni Marques (1), E. Jason Riedy (1), Christof Voemel (1), Julien Langou (2), Piotr Luszczek (2), Jakub Kurzak (2), Alfredo Buttari (2), Julie Langou (2), Stanimire Tomov (2). (1) University of California, Berkeley CA 94720, USA; (2) University of Tennessee, Knoxville TN 37996, USA; (3) Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA.

1 Introduction. Dense linear algebra (DLA) forms the core of many scientific computing applications. Consequently, there is continuous interest in and demand for the development of increasingly better algorithms in the field. Here 'better' has a broad meaning, and includes improved reliability, accuracy, robustness, ease of use, and most importantly new or improved algorithms that would more efficiently use the available computational resources to speed up the computation. The rapidly evolving high end computing systems and the close dependence of DLA algorithms on the computational environment are what make the field particularly dynamic. A typical example of the importance and impact of this dependence is the development of LAPACK [1] (and later ScaLAPACK [2]) as a successor to the well known and formerly widely used LINPACK [3] and EISPACK [3] libraries. Both LINPACK and EISPACK were based, and their efficiency depended, on optimized Level 1 BLAS [4]. Hardware development trends though, and in particular an increasing processor-to-memory speed gap of approximately 50% per year, started to increasingly show the inefficiency of Level 1 BLAS versus Level 2 and 3 BLAS, which prompted efforts to reorganize DLA algorithms to use block matrix operations in their innermost loops. (A concrete sketch of the Level 1 versus Level 3 difference follows below.)
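To make the Level 1 versus Level 3 contrast concrete, here is a minimal C++ sketch against the standard CBLAS interface. The cblas_daxpy and cblas_dgemm calls are the standard routines; the build line in the comment and the choice of OpenBLAS are assumptions for the example, not something the prospectus prescribes.

    // Sketch: Level 1 vs. Level 3 BLAS
    // (assumed build: g++ blas_levels.cpp -lopenblas)
    #include <cblas.h>
    #include <cstdio>
    #include <vector>

    int main() {
        const int n = 512;
        std::vector<double> x(n, 1.0), y(n, 2.0);

        // Level 1: y = 3*x + y performs O(n) flops on O(n) data, so it is
        // limited by the processor-to-memory gap the prospectus describes.
        cblas_daxpy(n, 3.0, x.data(), 1, y.data(), 1);

        std::vector<double> A(n * n, 1.0), B(n * n, 1.0), C(n * n, 0.0);

        // Level 3: C = A*B performs O(n^3) flops on O(n^2) data, so a blocked
        // implementation can keep submatrices in cache and reuse them; this is
        // why LAPACK reorganized algorithms around block matrix operations.
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    n, n, n, 1.0, A.data(), n, B.data(), n,
                    0.0, C.data(), n);

        std::printf("y[0] = %g, C[0] = %g\n", y[0], C[0]);
        return 0;
    }

The flops-to-data ratio is the point: raising it from O(1) to O(n) is what lets Level 3 kernels run near peak while Level 1 kernels wait on memory.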
-
History of MATLAB and the Company MathWorks
Engineering Problem Solving and Programming (CS 1133): History of MATLAB and the Company MathWorks. K. Ming Leung, [email protected], http://cis.poly.edu/~mleung, Department of Computer Science and Engineering, Polytechnic Institute of NYU.

Material presented here was taken with slight modifications from "The Origin of MATLAB" and "The Growth of MATLAB and The MathWorks" at the MathWorks website. In the late 1970s, Cleve Moler, then a professor at the University of New Mexico, used a computer language called Fortran to develop the first version of MATLAB. His goal was to provide students the ability to utilize powerful numerical packages for scientific and engineering computations without having to write FORTRAN code. He visited Stanford University to spend his sabbatical in 1979 and taught the graduate numerical analysis course, where the students used MATLAB for some of their homework. Half of the students in the class were from mathematics and computer science, and the other half were from engineering. Although the students in mathematics and computer science didn't particularly like MATLAB, the engineering students were impressed. The emphasis on matrices (arrays) in MATLAB proved to be very useful to them. A few of the Stanford engineering students from his class later joined two consulting companies in Palo Alto, California. These companies extended MATLAB to have more capability in control analysis and signal processing. Jack Little, a Stanford- and MIT-trained control engineer, was the principal developer of one of the first commercial products based on Fortran MATLAB.
-
Jack Dongarra: Supercomputing Expert and Mathematical Software Specialist
Biographies. Jack Dongarra: Supercomputing Expert and Mathematical Software Specialist. Thomas Haigh, University of Wisconsin. Editor: Thomas Haigh.

Jack J. Dongarra was born in Chicago in 1950 to a family of Sicilian immigrants. He remembers himself as an undistinguished student during his time at a local Catholic elementary school, burdened by undiagnosed dyslexia. Only later, in high school, did he begin to connect material from science classes with his love of taking machines apart and tinkering with them. Inspired by his science teacher, he attended Chicago State University and majored in mathematics, thinking that this would combine well with education courses to equip him for a high school teaching career. The first person in his family to go to college, he lived at home and worked in a pizza restaurant to cover the cost of his education.[1]

In 1972, during his senior year, a series of chance events reshaped Dongarra's planned career. On the suggestion of Harvey Leff, one of his physics professors, he applied for an internship at nearby Argonne National Laboratory as part of a program that rewarded a small group of undergraduates with college credit. Dongarra credits his success here, against strong competition from students attending far more prestigious schools, to Leff's friendship with Jim Pool, who was then associate director of the Applied Mathematics Division at Argonne.[2]

EISPACK. Dongarra was supervised by Brian Smith, a young researcher whose primary concern at the time was the lab's EISPACK project. Although budget cuts forced Pool to make substantial layoffs during his time as acting director of the Applied Mathematics Division in 1970–1971, he had made a special effort to find funds to protect the project and hire Smith.
-
C++ API for BLAS and LAPACK
C++ API for BLAS and LAPACK (SLATE Working Note 2). Mark Gates (ICL), Piotr Luszczek (ICL), Ahmad Abdelfattah (ICL), Jakub Kurzak (ICL), Jack Dongarra (ICL), Konstantin Arturov (Intel Corporation), Cris Cecka (NVIDIA Corporation), Chip Freitag (Advanced Micro Devices, Inc.). ICL: Innovative Computing Laboratory, University of Tennessee. February 21, 2018.

This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering and early testbed platforms, in support of the nation's exascale computing imperative.

Revision notes: 06-2017, first publication; 02-2018, copy editing, new cover; 03-2018, adding a section about GPU support, adding Ahmad Abdelfattah as an author.

@techreport{gates2017cpp,
  author      = {Gates, Mark and Luszczek, Piotr and Abdelfattah, Ahmad and Kurzak, Jakub and Dongarra, Jack and Arturov, Konstantin and Cecka, Cris and Freitag, Chip},
  title       = {{SLATE} Working Note 2: C++ {API} for {BLAS} and {LAPACK}},
  institution = {Innovative Computing Laboratory, University of Tennessee},
  year        = {2017},
  month       = {June},
  number      = {ICL-UT-17-03},
  note        = {revision 03-2018}
}

Contents: 1 Introduction and Motivation. 2 Standards and Trends: 2.1 Programming Language Fortran (2.1.1 FORTRAN 77; 2.1.2 BLAST XBLAS; 2.1.3 Fortran 90); 2.2 Programming Language C (2.2.1 Netlib CBLAS; 2.2.2 Intel MKL CBLAS; 2.2.3 Netlib lapack cwrapper; 2.2.4 LAPACKE; 2.2.5 Next-Generation BLAS: "BLAS G2"). A sketch of the kind of overloaded C++ wrapper such an interface family motivates follows below.
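As a hedged illustration of why a native C++ API is attractive (this is not the interface the report specifies, just a sketch of the overloading idea), the following layers a single generic axpy name over the standard Netlib CBLAS routines, so templated code no longer has to spell out the s/d precision prefix. The wrapper and the helper scale_and_add are invented for this example; only the cblas_saxpy and cblas_daxpy calls are standard.

    #include <cblas.h>
    #include <cstdio>
    #include <vector>

    // One overloaded name replaces the type-prefixed Fortran-style names.
    inline void axpy(int n, float alpha, const float* x, int incx,
                     float* y, int incy) {
        cblas_saxpy(n, alpha, x, incx, y, incy);
    }
    inline void axpy(int n, double alpha, const double* x, int incx,
                     double* y, int incy) {
        cblas_daxpy(n, alpha, x, incx, y, incy);
    }

    // Templated client code works for float and double alike.
    template <typename T>
    void scale_and_add(std::vector<T>& y, T alpha, const std::vector<T>& x) {
        axpy(static_cast<int>(x.size()), alpha, x.data(), 1, y.data(), 1);
    }

    int main() {
        std::vector<double> x(4, 1.0), y(4, 0.5);
        scale_and_add(y, 2.0, x);           // resolves to cblas_daxpy
        std::printf("y[0] = %g\n", y[0]);   // prints 2.5
        return 0;
    }

Overload resolution picks the precision at compile time, which is exactly the property generic numerical code needs from a C++ binding and which the C and Fortran interfaces surveyed above cannot provide directly.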
-
A Peer-to-Peer Framework for Message Passing Parallel Programs
A Peer-to-Peer Framework for Message Passing Parallel Programs. Stéphane Genaud (AlGorille Team, LORIA) and Choopan Rattanapoka (King Mongkut's University of Technology).

Abstract. This chapter describes the P2P-MPI project, a software framework aimed at the development of message-passing programs in large scale distributed networks of computers. Our goal is to provide a light-weight, self-contained software package that requires minimum effort to use and maintain. P2P-MPI relies on three features to reach this goal: (i) its installation and use do not require administrator privileges, (ii) available resources are discovered and selected for a computation without intervention from the user, (iii) program executions can be fault-tolerant on user demand, in a completely transparent fashion (no checkpoint server to configure). P2P-MPI is typically suited for organizations having spare individual computers linked by a high speed network, possibly running heterogeneous operating systems, and having Java applications. From a technical point of view, the framework has three layers: an infrastructure management layer at the bottom, a middleware layer containing the services, and the communication layer implementing an MPJ (Message Passing for Java) communication library at the top. We explain the design and the implementation of these layers, and we discuss the allocation strategy based on network locality to the submitter. Allocation experiments of more than five hundred peers are presented to validate the implementation. We also present how fault-management issues have been tackled. First, the monitoring of the infrastructure itself is implemented through the use of failure detectors.
-
GEORGE ELMER FORSYTHE January 8, 1917—April 9, 1972
MATHEMATICS OF COMPUTATION, VOLUME 26, NUMBER 120, OCTOBER 1972. George Elmer Forsythe, January 8, 1917 – April 9, 1972.

With the sudden death of George Forsythe, modern numerical analysis has lost one of its earliest pioneers, an influential leader who helped bring numerical analysis and computer science to their present positions. We in computing have been fortunate to have benefited from George's abilities, and we all feel a deep sense of personal and professional loss. Although George's interest in calculation can be traced back at least to his seventh grade attempt to see how the digits repeated in the decimal expansion of 10000/7699, his early research was actually in summability theory, leading to his Ph.D. in 1941 under J. D. Tamarkin at Brown University. A brief period at Stanford thereafter was interrupted by his Air Force service, during which he became interested in meteorology, co-authoring an influential book in that field. George then spent a year at Boeing, where his interest in scientific computing was reawakened, and in 1948 he began his fruitful association with the Institute for Numerical Analysis of the National Bureau of Standards on the UCLA campus, remaining at UCLA for three years after leaving the Institute in 1954. In 1957 he rejoined the Mathematics Department at Stanford as Professor, became director of the Computation Center, and eventually Chairman of the Computer Science Department when it was founded in 1965. George greatly influenced the development of modern numerical analysis, especially in numerical linear algebra, where his 1953 classification of methods for solving linear systems gave structure to a field of great importance.
-
Evolving Software Repositories
Evolving Software Repositories. http://www.netlib.org/utk/projects/esr/ Jack Dongarra, University of Tennessee and Oak Ridge National Laboratory; Ron Boisvert, National Institute of Standards and Technology; Eric Grosse, AT&T Bell Laboratories.

Project focus areas: NHSE overview; Resource Cataloging and Distribution System (RCDS); safe execution environments for mobile code; application-level and content-oriented tools; repository interoperability; distributed, semantic-based searching.

NHSE, the National HPCC Software Exchange, is a CRPC project funded by NASA plus other agencies. The Center for Research on Parallel Computation (CRPC) comprises Argonne National Laboratory, the California Institute of Technology, Rice University, Syracuse University, and the University of Tennessee. NHSE provides a uniform interface to distributed HPCC software repositories, facilitates cross-agency and interdisciplinary software reuse, and carries material from the ASTA, HPCS, and IITA components of the HPCC program. http://www.netlib.org/nhse/

Goals: capture, preserve, and make available all software and software-related artifacts produced by the federal HPCC program (software-related artifacts include algorithms, specifications, designs, documentation, reports, ...); promote the formation, growth, and interoperation of discipline-oriented repositories that organize, evaluate, and add value to individual contributions; and employ and develop where necessary state-of-the-art technologies for assisting users in finding, understanding, and using HPCC software and technologies.

Benefits: (1) faster development of high-quality software, so that scientists can spend less time writing and debugging programs and more time on research problems; (2) less duplication of software development effort by sharing of software modules.
-
Wrangling Rogues: A Case Study on Managing Experimental Post-Moore Architectures
Wrangling Rogues: A Case Study on Managing Experimental Post-Moore Architectures. Will Powell, Jeffrey S. Young, Jason Riedy, Thomas M. Conte ([email protected], [email protected], [email protected], [email protected]); School of Computational Science and Engineering and School of Computer Science, Georgia Institute of Technology, Atlanta, Georgia.

Abstract. The Rogues Gallery is a new experimental testbed that is focused on tackling rogue architectures for the post-Moore era of computing. While some of these devices have roots in the embedded and high-performance computing spaces, managing current and emerging technologies provides a challenge for system administration that is not always foreseen in traditional data center environments. Our Rogues Gallery of novel technologies has produced research results and is being used in classes both at Georgia Tech as well as at other institutions.

We present an overview of the motivations and design of the initial Rogues Gallery testbed and cover some of the unique challenges that we have seen and foresee with upcoming hardware prototypes for future post-Moore research. The key lessons from this testbed (so far) are the following: invest in rogues, but realize some technology may be short-lived. We cannot invest too much time in configuring and integrating a novel architecture when tools and software stacks may be in an unstable state of development. Rogues may be short-lived; they may not achieve their goals. Finding their limits is an important aspect of the Rogues Gallery.
-
Simulating Low Precision Floating-Point Arithmetic
Simulating Low Precision Floating-Point Arithmetic. Nicholas J. Higham and Srikara Pranesh, 2019. MIMS EPrint 2019.4, Manchester Institute for Mathematical Sciences, School of Mathematics, The University of Manchester, Manchester, M13 9PL, UK. Reports available from http://eprints.maths.manchester.ac.uk/ and by contacting the MIMS Secretary. ISSN 1749-9097.

Abstract. The half precision (fp16) floating-point format, defined in the 2008 revision of the IEEE standard for floating-point arithmetic, and a more recently proposed half precision format, bfloat16, are increasingly available in GPUs and other accelerators. While the support for low precision arithmetic is mainly motivated by machine learning applications, general purpose numerical algorithms can benefit from it too, gaining in speed, energy usage, and reduced communication costs. Since the appropriate hardware is not always available, and one may wish to experiment with new arithmetics not yet implemented in hardware, software simulations of low precision arithmetic are needed. We discuss how to simulate low precision arithmetic using arithmetic of higher precision. We examine the correctness of such simulations and explain via rounding error analysis why a natural method of simulation can provide results that are more accurate than actual computations at low precision. We provide a MATLAB function chop that can be used to efficiently simulate fp16 and bfloat16 arithmetics, with or without the representation of subnormal numbers and with the options of round to nearest, directed rounding, stochastic rounding, and random bit flips in the significand. We demonstrate the advantages of this approach over defining a new MATLAB class and overloading operators. (A small sketch of the rounding idea follows below.)
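As a rough illustration of the simulation idea (not the paper's chop function, which is written in MATLAB and additionally supports fp16, subnormal control, directed and stochastic rounding, and random bit flips), here is a C++ sketch that rounds a single precision value to bfloat16 precision with round to nearest, ties to even, assuming the usual bfloat16 layout of 1 sign, 8 exponent, and 7 fraction bits; NaN handling is omitted.

    #include <cstdint>
    #include <cstdio>
    #include <cstring>

    // Round a float to bfloat16 precision and return it as a float.
    float round_to_bfloat16(float x) {
        std::uint32_t bits;
        std::memcpy(&bits, &x, sizeof bits);    // reinterpret the float as bits
        std::uint32_t lsb = (bits >> 16) & 1u;  // last bit that will be kept
        bits += 0x7FFFu + lsb;                  // round to nearest, ties to even
        bits &= 0xFFFF0000u;                    // discard the low 16 fraction bits
        float y;
        std::memcpy(&y, &bits, sizeof y);
        return y;
    }

    int main() {
        // 1/3 is not exactly representable; bfloat16 keeps 8 significand bits.
        float exact = 1.0f / 3.0f;
        std::printf("float:    %.9f\nbfloat16: %.9f\n",
                    exact, round_to_bfloat16(exact));
        return 0;
    }

Because bfloat16 shares its exponent range with single precision, rounding the significand like this is enough to mimic a single bfloat16 rounding; the paper's rounding error analysis addresses when computing in higher precision and rounding afterwards is at least as accurate as native low precision arithmetic.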
-
A Bibliography of Publications of Cleve B. Moler
A Bibliography of Publications of Cleve B. Moler. Nelson H. F. Beebe, University of Utah, Department of Mathematics, 110 LCB, 155 S 1400 E RM 233, Salt Lake City, UT 84112-0090, USA. Tel: +1 801 581 5254; FAX: +1 801 581 4148; E-mail: [email protected], [email protected], [email protected] (Internet); WWW URL: http://www.math.utah.edu/~beebe/. 10 February 2020, Version 1.56.

Abstract: This bibliography records publications of Cleve B. Moler.

[The remainder of the excerpt is the bibliography's title-word cross-reference index, too garbled by extraction to reconstruct; recognizable fragments include index terms such as MATLAB, EISPACK, Algorithm, and Computer, with citation keys like [Mol74] and [FM67].]