Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Copyright (C) 1986-1996 by Cambridge University Press. Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine- readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to [email protected] (outside North America). 90 Recipes Fortran Volume 2 of in Second Edition Numerical Fortran Numerical Recipes Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Copyright (C) 1986-1996 by Cambridge University Press. Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine- readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to [email protected] (outside North America). Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Copyright (C) 1986-1996 by Cambridge University Press. Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine- readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to [email protected] (outside North America). Computing 90 Recipes Scientific Foreword by Fortran Polaroid Corporation Volume 2 of Michael Metcalf CERN, Geneva, Switzerland Second Edition William H. Press
in Brian P. Flannery Saul A. Teukolsky William T. Vetterling Parallel Department of Physics, Cornell University
Harvard-Smithsonian Center for Astrophysics EXXON Research and Engineering Company of
Fortran Numerical Recipes Art
The rical Nume Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Copyright (C) 1986-1996 by Cambridge University Press. Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine- readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to [email protected] (outside North America). ). CIP http://www.nr.com ) Library of Congress Cataloging-in-Publication Data Cambridge University Press 1986, 1996, Numerical Recipes Software 1986, 1996, X E c c 52Ðdc20 96-5567 p. cm. Without an additional license to use the contained software, this book is intended as Machine-readable media containing the software in this book, with included licenses The software may also be downloaded, with immediate purchase of a license [et al.]. Ð 2nd ed. 0285 Includes bibliographical references and index. ISBN 0-521-57439-0 (hardcover) 1. FORTRAN 90 (Computerscience) program language) 3. 2. Numerical Parallel analysisÐData programming processing. (Computer ... Unlicensed transfer ofcomputer except one Numerical that is Recipes specifically licensed,corrections, is programs strictly prohibited. and to Technical questions, requestsSoftware, any for P.O. other Box information format,781 243, should 863-1739. Cambridge, or be MA to addressed 02238 any to (USA), Numerical email “[email protected]”, Recipes or fax a text and reference book,software for by reading the purposes individual only. ownerroutines of A into a free a copy single license of computer for this is“License limited granted book Information” under who use (pp. terms personally of described xviiÐxx) types on the for one p.low information or xviii. cost. on more See obtaining the more section general licenses at for use onorder a form single at screen,“[email protected]” the (rest are back of availableMidland of world), Avenue, from Port the Chester, Cambridge or NY book, 10573 University write (USA), Press. email for to further information. to Cambridgealso “[email protected]” See University possible, (North the Press, from America) the 110 or Numerical Recipes Software Web site ( A catalog record for this book is availableISBN from the British 0ISBN Library. 521 57439 0 0ISBN 521 43064 Volume X 2 0 (thisISBN 521 book) Volume 43721 1 0 0 521 57440 Example 4 book in FORTRAN FORTRAN diskette (IBM 3.5 ISBN 0ISBN 521 57608 3 0 521 57607 CDROM 5 (IBM PC/Macintosh) CDROM (UNIX) Published by the PressThe Syndicate Pitt of Building, the Trumpington University40 Street, of West 20th Cambridge Cambridge CB2 Street,10 1RP New Stamford York, Road, NY Oakleigh, 10011-4211, Melbourne USA 3166, Australia Copyright except for all computerCopyright programs and procedures,and which except are for AppendixAll C1, which Rights is Reserved. placed into the public domain. Numerical Recipes in FortranVolume 90: 2 of The Fortran Art Numerical ofReprinted Recipes, Parallel with Second Scientific Edition, corrections Computing, firstThe 1997. published code 1996. in this volume is correctedPrinted to software in version the 2.08 Typeset United in States T of America Numerical recipes in Fortran 90 : the art of parallel scientific computing / William H. Press I. Press, William H. QA76.73.F25N85 1996 519.4 Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Copyright (C) 1986-1996 by Cambridge University Press. Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine- readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to [email protected] (outside North America). v Contents Fortran 90 Code Chapters 1009 23.0 Introduction and Summary Listing23.1 Routines That Move Data23.2 Routines Returning a Location23.3 Argument Checking and Error Handling23.4 Routines for Polynomials and Recurrences23.5 Routines for Outer Operations on23.6 Vectors Routines for Scatter with Combine23.7 Routines for Skew Operations on23.8 Matrices Other Routines 987 994 996 1000 992 990 1004 1002 1007 22.0 Why Think Parallel?22.1 Fortran 90 Data Parallelism:22.2 Arrays Linear and Recurrence Intrinsics and Related22.3 Calculations Parallel Synthetic Division and Related22.4 Algorithms Fast Fourier Transforms22.5 Missing Language Features 964 977 971 962 983 981 21.0 Introduction21.1 Quick Start: Using the21.2 Fortran Fortran 90 90 Numerical Language Recipes Concepts Routines21.3 More on Arrays and21.4 Array Fortran Sections 90 Intrinsic Procedures21.5 Advanced 936 Fortran 90 Topics21.6 And Coming Soon: Fortran 95 941 937 945 935 953 959 Preface to Volume 2Foreword by Michael MetcalfLicense Information x viii xvii 21 Introduction to Fortran 90 Language Features 935 23 Numerical Recipes Utility Functions for Fortran 90 987 22 Introduction to Parallel Programming 962 B1 PreliminariesB2 Solution ofB3 Linear Algebraic Equations Interpolation and Extrapolation 1014 1043 1010 vi Contents
B4 Integration of Functions 1052 B5 Evaluation of Functions 1070 B6 Special Functions 1083 B7 Random Numbers 1141 ii est tp/wwn.o rcl -0-7-43(ot mrc ny,or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) B8 Sorting 1167 B9 Root Finding and Nonlinear Sets of Equations 1182 B10 Minimization or Maximization of Functions 1201 B11 Eigensystems 1225 B12 Fast Fourier Transform 1235 B13 Fourier and Spectral Applications 1253 B14 Statistical Description of Data 1269 B15 Modeling of Data 1285 B16 Integration of Ordinary Differential Equations 1297 B17 Two Point Boundary Value Problems 1314 B18 Integral Equations and Inverse Theory 1325 B19 Partial Differential Equations 1332 B20 Less-Numerical Algorithms 1343
References 1359
Appendices
C1 Listing of Utility Modules (nrtype and nrutil) 1361 C2 Alphabetical Listing of Explicit Interfaces 1384 C3 Index of Programs and Dependencies 1434
General Index to Volumes 1 and 2 1447 or CDROMs g of machine- America). Contents vii
Contents of Volume 1: Numerical Recipes in Fortran 77
Plan of the Two-Volume Edition xiii Preface to the Second Edition xv Preface to the First Edition xviii or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) License Information xx Computer Programs by Chapter and Section xxiv 1 Preliminaries 1 2 Solution of Linear Algebraic Equations 22 3 Interpolation and Extrapolation 99 4 Integration of Functions 123 5 Evaluation of Functions 159 6 Special Functions 205 7 Random Numbers 266 8 Sorting 320 9 Root Finding and Nonlinear Sets of Equations 340 10 Minimization or Maximization of Functions 387 11 Eigensystems 449 12 Fast Fourier Transform 490 13 Fourier and Spectral Applications 530 14 Statistical Description of Data 603 15 Modeling of Data 650 16 Integration of Ordinary Differential Equations 701 17 Two Point Boundary Value Problems 745
18 Integral Equations and Inverse Theory 779 or CDROMs g of machine-
19 Partial Differential Equations 818 America). 20 Less-Numerical Algorithms 881 References 916 Index of Programs and Dependencies 921 Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Copyright (C) 1986-1996 by Cambridge University Press. Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine- readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to [email protected] (outside North America). viii Preface to Volume 2 principles, but we“tricks” are that also have not frequentin shy applicability. this about volume’s These principal pointing are text out not chapters parallelization only (Chapters discussed hint” 21Ð23), logo but (left, are also above).are specific “Fortran Also 90(left). scattered After tips,” throughout you with the read the their code text chapters, own chapters you distinct might want graphic simply call-out to browse 90 A special note to C programmers: Right now, there is no effort at producing a As we watched the gestation and birth of Fortran 90 by its governing “X3J3 Fortran 90 compilers are now widely available. Microsoft’s Fortran PowerSta- There have been previous special versions of Fortran for parallel supercomput- With the Fortran 90 language as a tool, we want this volume to be your complete Fortran 90 is not just the long-awaited updating of the Fortran language to f sprinkled throughout the chapters of Fortran 90 code, called out by a special “parallel among these hints and tips. parallel version of Cstability. that is comparable We to think,this Fortran therefore, volume 90 for that in an maturity, C educationalconstructions, acceptance, programmers excursion and and into will the Fortran be numerical 90,programming well its algorithms have parallel served that not programming by capitalize beenand using on far we them. from think our that(in C minds you Chapters and as will 21Ð23) C++ we find willdevelop have be that standard written amply time parallel this repaid spent volume, extensions. in in the absorbing future, its as principal C lessons and C++ eventually modern computing practices. Itcomputing, is that also of multiprocessor the computers vanguard andParallel of widespread computing a parallel has much programming. been larger a revolutiontime. in feature of Now, the however, largest it supercomputers is for rapidly quite moving some towardsCommittee” the (a desktop. processMichael interestingly Metcalf, described in the by Forewordmoment a that for follows), moving leading it Numerical committee became Recipesrather clear from member, than to Fortran us 77 later. that to the Fortran right 90 was sooner, tion for Windows 95 bringswe that have firm’s undeniable tested marketingcompatibility this and force performance. to compiler PC thoroughly Inhad desktop; on generally the fine UNIX our experiences world, with, code DEC’s wefor Fortran RS/6000 90 have and and for similar similarly Alpha found machines. AXP tested, andFortran it NAG’s and IBM’s Fortran xlf 90 90 excellent compiler also in compatibility bringsother excellent to excellent a compilers, both variety availablebackwards compatible and of with on Fortran UNIX the 77, by way. platforms.your the way, legacy Fortran so 90 code, you is don’t There or have completely to are keep throw away an no old doubt ers, compiler but always around. specific toto a particular provide hardware. a Fortran general, 90, architecture-independentEqually by importantly, framework contrast, it for is is designed parallel an computation. computer international standard, hardware agreed and software upon manufacturers by and a international large standards group bodies. of guide for learning howthis to regard, “think and parallel.” applicableparallel to computing The many languages as present language they and come itself along. future is Our computers, very treatment or emphasizes general general even in to other Preface to Volume 2 ix
A final word of truth in packaging: Don’t buy this volume unless you also buy (or already have) Volume 1 (now retitled Numerical Recipes in Fortran 77). Volume 2 does not repeat any of the discussion of what individual programs actually do, or of the mathematical methods they utilize, or how to use them. While our Fortran 90 code is thoroughly commented, and includes a header comment for
each routine that describes its input and output quantities, these comments are not or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) supposed to be a complete description of the programs; the complete descriptions are in Volume 1, which we reference frequently. But here’s a money-saving hint to our previous readers: If you already own a Second Edition version whose title is Numerical Recipes in FORTRAN (which doesn’t indicate either “Volume 1” or “Volume 2” on its title page) then take a marking pen and write in the words “Volume 1.” There! (Differences between the previous reprintings and the newest reprinting, the one labeled “Volume 1,” are minor.)
Acknowledgments
We continue to be in the debt of many colleagues who give us the benefit of their numerical and computational experience. Many, though not all, of these are listed by name in the preface to the second edition, in Volume 1. To that list we must now certainly add George Marsaglia, whose ideas have greatly influenced our new discussion of random numbers in this volume (Chapter B7). With this volume, we must acknowledge our additional gratitude and debt to a number of people who generously provided advice, expertise, and time (a great deal of time, in some cases) in the areas of parallel programming and Fortran 90. The original inspiration for this volume came from Mike Metcalf, whose clear lectures on Fortran 90 (in this case, overlooking the beautiful Adriatic at Trieste) convinced us that Fortran 90 could serve as the vehicle for a book with the larger scope of parallel programming generally, and whose continuing advice throughout the project has been indispensable. Gyan Bhanot also played a vital early role in the development of this book; his first translations of our Fortran 77 programs taught us a lot about parallel programming. We are also grateful to Greg Lindhorst, Charles Van Loan, Amos Yahil, Keith Kimball, Malcolm Cohen, Barry Caplin, Loren Meissner, Mitsu Sakamoto, and George Schnurer for helpful correspondence and/or discussion of Fortran 90’s subtler aspects. We once again express in the strongest terms our gratitude to programming consultant Seth Finkelstein, whose contribution to both the coding and the thorough testing of all the routines in this book (against multiple compilers and in sometimes- buggy, and always challenging, early versions) cannot be overstated. or CDROMs
WHP and SAT acknowledge the continued support of the U.S. National Science g of machine-
Foundation for their research on computational methods. America).
February 1996 William H. Press Saul A. Teukolsky William T. Vetterling Brian P. Flannery Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Copyright (C) 1986-1996 by Cambridge University Press. Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine- readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to [email protected] (outside North America). had become renowned x Numerical Recipes Foreword by Michael Metcalf authors and I were each lecturing at the International Center Fortran 90? How did it begin, what shaped it, and how, after nearly is Back in 1966, the version of Fortran now known as Fortran 66 was the first During all this time, X3J3 kept functioning, developing the successor version Sipping coffee on a sunbaked terrace can be surprisingly productive. One of But what Numerical Recipes The Birth of a Standard language ever to be standardized, by the predecessorStandards of the Institute present American (ANSI). National Itdeveloped was by John an Backus all-American ofprogramming IBM affair. language in New in Fortran York, and Northwhich had it Backus America. first was had the been also dominant hadfavored Many scientific a Algol for Europeans hand). its preferred precisely Eventually, expressibleengineers however, Algol syntax the began who (in mathematicians to appreciated who defer to Fortran’supgraded the pragmatic, Fortran scientists 77 and even was standardized natural,subsequently by endorsed style. the by ANSI other technical national In committee, bodiesIts 1978, and X3J3, dominance by in and the all ISO fields in of Geneva,optimizing scientific compilers Switzerland. and came numerical onto computing the grew market. asPascal, new, Although highly newer Basic, languages, particularly PL/1,throughout and the 1980s later remained Ada true todid attracted Fortran. C their draw Only increasing towards own support thepower from adherents, end scientific of of scientific programmers that decade structures who users had and discovered the pointers. to Fortran 77. It waslate to 1970s, be were a mainly decadethat to of standard. add strife to and Among contention. Fortran these 77it The were features early to dynamic that plans, map storage had in and the directly hadmarket. an to onto The array be intention the language, was left enabling architecture to outBut have of of two this new new supercomputers, factors version then ready becamethe within coming significant five next at onto years, standard that in the time. 1982. shouldcase not The in just first 1966 codify was and the existing decision 1978, practice, that but as also had extend largely the been functionality the of the language through the for Theoretical Physics in90. Trieste, Italy, he The on numericaldevelopment numerical of analysis analysis the and community new I Fortran hadapparent standard, on made and that Fortran so, important the unsurprisingly, contributions it algorithms became to for quickly the which foundering, did its driving forces finally steer it to a successful conclusion? could, to great advantage, bebeen recast expressed in in a serial newNow, form, mold. nested first iterations These in could algorithms Fortran be had,the 77 other replaced hitherto, features and by of then array a in rich operations“Numerical array C, and Recipes language Pascal, assignments, could in be and Fortran and exploited. Basic. 90"it Thus first is was conceived the a and, idea of after delight a three to years’ gestation, assist at the birth. Foreword xi
innovative additions (even though, for the array language, there was significant borrowing from John Iverson’s APL and from DAP Fortran). The second factor was that X3J3 no longer operated under only American auspices. In the course of the 1980s, the standardization of programming languages came increasingly under the authority of the international body, ISO. Initially this was in an advisory role, but
now ISO is the body that, through its technical committee WG5 (in full, ISO/IEC or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) JTC1/SC22/WG5), is responsible for determining the course of the language. WG5 also steers the work of the development body, then as now, the highly skilled and competent X3J3. As we shall see, this shift in authority was crucial at the most difficult moment of Fortran 90’s development. The internationalization of the standards effort was reflected in the welcome given by X3J3 to six or seven European members; they, and about one-third of X3J3’s U.S. members, provided the overlapping core of membership of X3J3 and WG5 that was vital in the final years in bringing the work to a successful conclusion. X3J3 membership, which peaked at about 45, is restricted to one voting member per organization, and significant decisions require a majority of two-thirds of those voting. Nationality plays no role, except in determining the U.S. position on an international issue. Members, who are drawn mainly from the vendors, large research laboratories, and academia, must be present or represented at two-thirds of all meetings in order to retain voting rights. In 1980, X3J3 reported on its plans to the forerunner of WG5 in Amsterdam, Holland. Fortran 8x, as it was dubbed, was to have a basic array language, new looping constructs, a bit data type, data structures, a free source form, a mechanism to “group” procedures, and another to manage the global name space. Old features, including COMMON, EQUIVALENCE, and the arithmetic-IF, were to be consigned to a so-called obsolete module, destined to disappear in a subsequent revision. This was part of the “core plus modules” architecture, for adding new features and retiring old ones, an aid to backwards compatibility. Even though Fortran 77 compilers were barely available, the work seemed well advanced and the mood was optimistic. Publication was intended to take place in 1985. It was not to be. One problem was the sheer number of new features that were proposed as additions to the language, most of them worthwhile in themselves but with the totality being too large. This became a recurrent theme throughout the development of the standard. One example was the suggestion of Lawrie Schonfelder (Liverpool University), at a WG5 meeting in Vienna, Austria, in 1982, that certain features already proposed as additions could be combined to provide a full-blown derived data type facility, thus providingFortran with abstract data types. This idea was taken up by X3J3 and has since come to be recognized, along with the array language, as or CDROMs
one of the two main advances brought about by what became Fortran 90. However, g of machine-
the ramifications go very deep: all the technical details of how to handle arrays of America). objects of derived types that in turn have array components that have the pointer attribute, and so forth, have to be precisely defined and rigorously specified.
Conflict
The meetings of X3J3 were often full of drama. Most compiler vendors were represented as a matter of course but, for many, their main objective appeared to be to maintain the status quo and to ensure that Fortran 90 never saw the light of xii Foreword day. One vendor’s extended (and much-copied) version of Fortran 77 had virtually become an industry standard, and it saw as its mission the maintenance of this lead. A new standard would cost it its perceived precious advantage. Other large vendors had similar points of view, although those marketing supercomputers were clearly keen on the array language. Most users, on the other hand, were hardly prepared to
invest large amounts of their employers’ and their own resources in simply settling or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) for a trivial set of improvements to the existing standard. However, as long as X3J3 worked under a simple-majority voting rule, at least some apparent progress could be made, although the underlying differences often surfaced. These were even sometimes between users — those who wanted Fortran to become a truly modern language and those wanting to maintain indefinite backwards compatibility for their billions of lines of existing code. At a watershed meeting, in Scranton, Pennsylvania, in 1986, held in an atmosphere that sometimes verged on despair, a fragile compromise was reached as a basis for further work. One breakthrough was to weaken the procedures for removing outdated features from the language, particularly by removing no features whatsoever from the next standard and by striking storage association (i.e., COMMON and EQUIVALENCE) from the list of features to be designated as obsolescent (as they are now known). A series of votes definitively removed from the language all plans to add: arrays of arrays, exception handling, nesting of internal procedures, the FORALL statement (now in Fortran 95), and a means to access skew array sections. There were other features on this list that, although removed, were reinstated at later meetings: user-defined operators, operator overloading, array and structure constructors, and vector-valued subscripts. After many more travails, the committee voted, a year later, by 26 votes to 9, to forward the document for what was to become the first of three periods of public comment. While the document was going through the formal standards bureaucracy and being placed before the public, X3J3 polished it further. X3J3 also prepared procedures for processing the comments it anticipated receiving from the public, and to each of which, under the rules, it would have to reply individually. It was just as well. Roughly 400 replies flooded in, many of them very detailed and, disappointingly for those of us wanting a new standard quickly, unquestionably negative towards our work. For many it was too radical, but many others pleaded for yet more modern features, such as pointers. Now the committee was deadlocked. Given that a document had already been published, any further change required not a simple but a two-thirds majority. The conservatives and the radicals could each block a move to modify the draft standard, or to accept a revised one for public review — and just that happened, or CDROMs
in Champagne-Urbana, Illinois, in 1988. Any change, be it on the one hand to g of machine- modify the list of obsolescent features, to add the pointers or bit data type wanted America). by the public, to add multi-byte characters to support Kanji and other non-European languages or, on the other hand, to emasculate the language by removing modules or operator overloading, and hence abstract data types, to name but some suggestions, none of these could be done individually or collectively in a way that would achieve consensus. I wrote: “In my opinion, no standard can now emerge without either a huge concession by the users to the vendors (MODULE / USE) and/or a major change in the composition of the committee. I do not see how members who have worked for up to a decade Foreword xiii
or more, devoting time and intellectual energy far beyond the call of duty, can be expected to make yet more personal sacrifices if no end to the work is in sight, or if that end is nothing but a travesty of what had been designed and intended as a modern scientific programming language. ...I think the August meeting will be a watershed — if no progress is achieved there will be dramatic resignations, and ISO
could even remove the work from ANSI, which is failing conspicuously in its task." or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) (However, the same notes began with a quotation from The Taming of the Shrew: “And do as adversaries do in law, / Strive mightily, but eat and drink / as friend." That we always did, copiously.)
Resolution
The “August meeting” was, unexpectedly, imbued with a spirit of compromise that had been so sadly lacking at the previous one. Nevertheless, after a week of discussing four separate plans to rescue the standard, no agreement was reached. Now the question seriously arose: Was X3J3 incapable of producing a new Fortran standard for the international community, doomed to eternal deadlock, a victim of ANSI procedures? Breakthrough was achieved at a traumatic meeting of WG5 in Paris, France, a month later. The committee spent several extraordinary days drawing up a detailed list of what it wanted to be in Fortran 8x. Finally, it set X3J3 an ultimatum that was unprecedented in the standards world: The ANSI committee was to produce a new draft document, corresponding to WG5’s wishes, within five months! Failing that, WG5 would assume responsibility and produce the new standard itself. This decision was backed by the senior U.S. committee, X3, which effectively directed X3J3 to carry out WG5’s wishes. And it did! The following November, it implemented most of the technical changes, adding pointers, bit manipulation intrin- sic procedures, and vector-valued subscripts, and removing user-defined elemental functions (now in Fortran 95). The actual list of changes was much longer. X3J3 and WG5, now collaborating closely, often in gruelling six-day meetings, spent the next 18 months and two more periods of (positive) public comment putting the finishing touches to what was now called Fortran 90, and it was finally adopted, after some cliff-hanging votes, for forwarding as a U.S. and international standard on April 11, 1991, in Minneapolis, Minnesota. Among the remaining issues that were decided along the way were whether pointers should be a data type or be defined in terms of an attribute of a variable, implying strong typing (the latter was chosen), whether the new standard should coexist alongside the old one rather than definitively replace it (it coexisted for a or CDROMs while in the U.S., but was a replacement elsewhere, under ISO rules), and whether, g of machine- in the new free source form, blanks should be significant (fortunately, they are). America).
Fortran 90
The main new features of Fortran 90 are, first and foremost, the array language and abstract data types. The first is built on whole array operations and assignments, array sections, intrinsic procedures for arrays, and dynamic storage. It was designed with optimization in mind. The second is built on modules and module procedures, derived data types, operator overloading and generic interfaces, together with xiv Foreword
pointers. Also important are the new facilities for numerical computation including a set of numeric inquiry functions, the parametrization of the intrinsic types, new control constructs — SELECT CASE and new forms of DO, internal and recursive procedures and optional and keyword arguments, improved I/O facilities, and many new intrinsic procedures. Last but not least are the new free source form, an
improved style of attribute-oriented specifications, the IMPLICIT NONE statement, or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) and a mechanism for identifying redundant features for subsequent removal from the language. The requirement on compilers to be able to identify, for example, syntax extensions, and to report why a program has been rejected, are also significant. The resulting language is not only a far more powerful tool than its successor, but a safer and more reliable one too. Storage association, with its attendant dangers, is not abolished, but rendered unnecessary. Indeed, experience shows that compilers detect errors far more frequently than before, resulting in a faster development cycle. The array syntax and recursion also allow quite compact code to be written, a further aid to safe programming. No programming language can succeed if it consists simply of a definition (witness Algol 68). Also required are robust compilers from a wide variety of vendors, documentation at various levels, and a body of experience. The first Fortran 90 compiler appeared surprisingly quickly, in 1991, especially in view of the widely touted opinion that it would be very difficult to write one. Even more remarkable was that it was written by one person, Malcolm Cohen of NAG, in Oxford, U.K. There was a gap before other compilers appeared, but now they exist as native implementations for almost all leading computers, from the largest to PCs. For the most part, they produce very efficient object code; where, for certain new features, this is not the case, work is in progress to improve them. The first book, Fortran 90 Explained, was published by John Reid and me shortly before the standard itself was published. Others followed in quick succession, including excellent texts aimed at the college market. At the time of writing there are at least 19 books in English and 22 in various other languages: Chinese, Dutch, French, Japanese, Russian, and Swedish. Thus, the documentation condition is fulfilled. The body of experience, on the other hand, has yet to be built up to a critical size. Teaching of the language at college level has only just begun. However, I am certain that this present volume will contribute decisively to a significant breakthrough, as it provides models not only of the numerical algorithms for which previous editions are already famed, but also of an excellent Fortran 90 style, something that can develop only with time. Redundant features are abjured. It shows that, if we abandon these features and use new ones in their place, the appearance of code can initially seem or CDROMs
unfamiliar, but, in fact, the advantages become rapidly apparent. This new edition g of machine- of Numerical Recipes stands as a landmark in this regard. America).
Fortran Evolution
The formal procedures under which languages are standardized require them either to evolve or to die. A standard that has not been revised for some years must either be revised and approved anew, or be withdrawn. This matches the technical pressure on the language developers to accommodate the increasing complexity both of the problems to be tackled in scientific computation and of the underlyinghardware Foreword xv
on which programs run. Increasing problem complexity requires more powerful features and syntax; new hardware needs language features that map onto it well. Thus it was that X3J3 and WG5, having finished Fortran 90, began a new round of improvement. They decided very quickly on new procedures that would avoid the disputes that bedevilled the previous work: WG5 would decide on a plan for
future standards, and X3J3 would act as the so-called development body that would or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) actually produce them. This would be done to a strict timetable, such that any feature that could not be completed on time would have to wait for the next round. It was further decided that the next major revision should appear a decade after Fortran 90 but, given the somewhat discomforting number of requests for interpretation that had arrived, about 200, that a minor revision should be prepared for mid-term, in 1995. This should contain only “corrections, clarifications and interpretations” and a very limited number (some thought none) of minor improvements. At the same time, scientific programmers were becoming increasingly concerned at the variety of methods that were necessary to gain efficient performance from the ever-more widely used parallel architectures. Each vendor provided a different set of parallel extensions for Fortran, and some academic researchers had developed yet others. On the initiative of Ken Kennedy of Rice University, a High-Performance Fortran Forum was established. A coalition of vendors and users, its aim was to produce an ad hoc set of extensions to Fortran that would become an informal but widely accepted standard for portable code. It set itself the daunting task of achieving that in just one year, and succeeded. Melding existing dialects like Fortran D, CM Fortran, and Vienna Fortran, and adopting the new Fortran 90 as a base, because of its array syntax, High-Performance Fortran (HPF) was published in 1993 and has since become widely implemented. However, although HPF was designed for data parallel codes and mainly implemented in the form of directives that appear to non-HPF processors as comment lines, an adequate functionality could not be achieved without extending the Fortran syntax. This was done in the form of the PURE attributefor functions — an assertion that they contain no side effects — and the FORALL construct — a form of array assignment expressed with the help of indices. The dangers of having diverging or competing forms of Fortran 90 were immediately apparent, and the standards committees wisely decided to incorporate these two syntactic changes also into Fortran 95. But they didn’t stop there. Two further extensions, useful not only for their expressive power but also to access parallel hardware, were added: elemental functions, ones written in terms of scalars but that accept array arguments of any permitted shape or size, and an extension to allow nesting of WHERE constructs, Fortran’s form of masked assignment. To readers of Numerical Recipes, perhaps the most relevant of the minor improvements that or CDROMs
Fortran 95 brings are the ability to distinguish between a negative and a positive real g of machine-
zero, automatic deallocation of allocatable arrays, and a means to initialize the values America). of components of objects of derived data types and to initialize pointers to null. The medium-term objective of a relatively minor upgrade has been achieved on schedule. But what does the future hold? Developments in the underlying principles of procedural programming languages have not ceased. Early Fortran introduced the concepts of expression abstraction (X=Y+Z) and later control expression (e.g., the DO loop). Fortran 77 continued this with the if-then-else, and Fortran 90 with the DO and SELECT CASE constructs. Fortran 90 has a still higher level of expression abstraction (array assignments and expressions) as well as data structures and even xvi Foreword
full-blown abstract data types. However, during the 1980s the concept of objects came to the fore, with methods bound to the objects on which they operate. Here, one particular language, C++, has come to dominate the field. Fortran 90 lacks a means to point to functions, but otherwise has most of the necessary features in place, and the standards committees are now faced with the dilemma of deciding
whether to make the planned Fortran 2000 a fully object-oriented language. This or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) could possibly jeopardize its powerful, and efficient, numerical capabilities by too great an increase in language complexity, so should they simply batten down the hatches and not defer to what might be only a passing storm? At the time of writing, this is an open issue. One issue that is not open is Fortran’s lack of in-built exception handling. It is virtually certain that such a facility, much requested by the numerical community, and guided by John Reid, will be part of the next major revision. The list of other requirements is long but speculative, but some at the top of the list are conditional compilation, command line argument handling, I/O for objects of derived type, and asynchronous I/O (which is also planned for the next release of HPF). In the meantime, some particularly pressing needs have been identified, for the handling of floating-point exceptions, interoperability with C, and allowing allocatable arrays as structure components, dummy arguments, and function results. These have led WG5 to begin processing these three items using a special form of fast track, so that they might become optional but standard extensions well before Fortran 2000 itself is published in the year 2001.
Conclusion
Writing a book is always something of a gamble. Unlike a novel that stands or falls on its own, a book devoted to a programming language is dependent on the success of others, and so the risk is greater still. However, this new Numerical Recipes in Fortran 90 volume is no ordinary book, since it comes as the continuation of a highly successful series, and so great is its significance that it can, in fact, influence the outcome in its own favor. I am entirely confident that its publication will be seen as an important event in the story of Fortran 90, and congratulate its authors on having performed a great service to the field of numerical computing.
Geneva, Switzerland Michael Metcalf January 1996 or CDROMs g of machine- America). Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Copyright (C) 1986-1996 by Cambridge University Press. Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine- readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to [email protected] (outside North America). 50 (PC, Macintosh, educational $ xvii 140 (general UNIX). Downloading the packaged reported bugs in such cases are typing errors! ). They are packaged as a password-protected $ kind of license available to you is the free “immediate only License Information virtually all http://www.nr.com You can type the programs fromthis this case, book the directly into yourlicense” computer. (see In below).machine-readable You copy are to any not othertype authorized person, the to programs nor into to transfer a have computer orbug any on distribute reports your other behalf. a from person We you doshown if not that want you to choose hear this option, because experience has institutions’ UNIX) to software from the On-Lineacquire Store a is more also general (multiscreen, the site, way or to corporate) start license. if you want to You can download thethe Numerical Numerical Recipes Recipes programs On-Line electronically( Software from Store, located atfile, our and Web site you’llget need a to single-screen purchase licenseOn-Line and a Store, password with license immediately, fees to on-line, ranging from unpack from the them. You can You can purchase mediaversity containing Press. the programs Diskette fromfor versions Cambridge machines are running Uni- available Windows 3.1, in9660 95, IBM-compatible or format format NT. CDROM for versions PC,these in ISO- Macintosh, include both and Fortran UNIX and systems C are versions (as also well available; as versions in Pascal • • • Pick one of the following methods: We make no warranties, express or implied, that the programs contained Read this section if you want to use the programs in this book on a computer. How to Get the Code onto Your Computer Disclaimer of Warranty in this volume are freeof of merchantability, or error, that or they are willapplication. consistent meet They with your should any requirements not particular for be standard solution any relied could on particular for result solving in a injury problemprograms to whose in a incorrect such person a or manner, lossdisclaim it of all is property. liability at If for your youuse direct own do risk. of use or the the consequential The authors programs. damages and resulting publisher from your You’ll need to read the following Disclaimercomputer, of Warranty, and get the acquire programs a onto your Numericalwhich Recipes can software be license. the free (Withoutintended “immediate this as license” license, a under terms text described and below, reference the book book, is for reading purposes only.) xviii License Information
and BASIC from the first edition) on a single CDROM. Diskettes pur- chased from Cambridge University Press include a single-screen license for PC or Macintosh only. The CDROM is available with a single- screen license for PC or Macintosh (order ISBN 0 521 576083), or (at a slightly higher price) with a single-screen license for UNIX workstations
(order ISBN 0 521 576075). Orders for media from Cambridge Univer- or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) sity Press can be placed at 800 872-7423 (North America only) or by email to [email protected] (North America) or [email protected] (rest of world). Or, visit the Web sites http://www.cup.org (North America) or http://www.cup.cam.ac.uk (rest of world).
Types of License Offered
Here are the types of licenses that we offer. Note that some types are automatically acquired with the purchase of media from Cambridge University Press, or of an unlocking password from the Numerical Recipes On-Line Software Store, while other types of licenses require that you communicate specifically with Numerical Recipes Software (email: [email protected] or fax: 781 863-1739). Our Web site http://www.nr.com has additional information.
• [“Immediate License”] If you are the individual owner of a copy of this book and you type one or more of its routines into your computer, we authorize you to use them on that computer for your own personal and noncommercial purposes. You are not authorized to transfer or distribute machine-readable copies to any other person, or to use the routines on more than one machine, or to distribute executable programs containing our routines. This is the only free license. • [“Single-Screen License”] This is the most common type of low-cost license, with terms governed by our Single Screen (Shrinkwrap) License document (complete terms available through our Web site). Basically, this license lets you use Numerical Recipes routines on any one screen (PC, workstation, X-terminal, etc.). You may also, under this license, transfer pre-compiled, executable programs incorporating our routines to other, unlicensed, screens or computers, providing that (i) your application is noncommercial (i.e., does not involve the selling of your program for a fee), (ii) the programs were first developed, compiled, and successfully run on a licensed screen, and (iii) our routines are bound into the programs in such a manner that they cannot be accessed as individual routines and or CDROMs g of machine-
cannot practicably be unbound and used in other programs. That is, under America). this license, your program user must not be able to use our programs as part of a program library or “mix-and-match” workbench. Conditions for other types of commercial or noncommercial distribution may be found on our Web site (http://www.nr.com). • [“Multi-Screen, Server, Site, and Corporate Licenses”] The terms of the Single Screen License can be extended to designated groups of machines, defined by number of screens, number of machines, locations, or ownership. Significant discounts from the corresponding single-screen License Information xix
prices are available when the estimated number of screens exceeds 40. Contact Numerical Recipes Software (email: [email protected] or fax: 781 863-1739) for details. • [“Course Right-to-Copy License”] Instructors at accredited educational institutions who have adopted this book for a course, and who have already purchased a Single Screen License (either acquired with the or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) purchase of media, or from the Numerical Recipes On-Line Software Store), may license the programs for use in that course as follows: Mail your name, title, and address; the course name, number, dates, and estimated enrollment; and advance payment of $5 per (estimated) student to Numerical Recipes Software, at this address: P.O. Box 243, Cambridge, MA 02238 (USA). You will receive by return mail a license authorizing you to make copies of the programs for use by your students, and/or to transfer the programs to a machine accessible to your students (but only for the duration of the course).
About Copyrights on Computer Programs
Like artistic or literary compositions, computer programs are protected by copyright. Generally it is an infringement for you to copy into your computer a program from a copyrighted source. (It is also not a friendly thing to do, since it deprives the program’s author of compensation for his or her creative effort.) Under copyright law, all “derivative works” (modified versions, or translations into another computer language) also come under the same copyright as the original work. Copyright does not protect ideas, but only the expression of those ideas in a particular form. In the case of a computer program, the ideas consist of the program’s methodology and algorithm, including the necessary sequence of steps adopted by the programmer. The expression of those ideas is the program source code (particularly any arbitrary or stylistic choices embodied in it), its derived object code, and any other derivative works. If you analyze the ideas contained in a program, and then express those ideas in your own completely different implementation, then that new program implementation belongs to you. That is what we have done for those programs in this book that are not entirely of our own devising. When programs in this book are said to be “based” on programs published in copyright sources, we mean that the ideas are the same. The expression of these ideas as source code is our own. We believe that no material in this book infringes on an existing copyright. or CDROMs g of machine-
Trademarks America).
Several registered trademarks appear within the text of this book: Sun is a trademark of Sun Microsystems, Inc. SPARC and SPARCstation are trademarks of SPARC International, Inc. Microsoft, Windows 95, Windows NT, PowerStation, and MS are trademarks of Microsoft Corporation. DEC, VMS, Alpha AXP, and ULTRIX are trademarks of Digital Equipment Corporation. IBM is a trademark of International Business Machines Corporation. Apple and Macintosh are trademarks of Apple Computer, Inc. UNIX is a trademark licensed exclusively through X/Open xx License Information
Co. Ltd. IMSL is a trademark of Visual Numerics, Inc. NAG refers to proprietary computer software of Numerical Algorithms Group (USA) Inc. PostScript and Adobe Illustrator are trademarks of Adobe Systems Incorporated. Last, and no doubt least, Numerical Recipes (when identifying products) is a trademark of Numerical Recipes Software. ii est tp/wwn.o rcl -0-7-43(ot mrc ny,or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Attributions
The fact that ideas are legally “free as air” in no way supersedes the ethical requirement that ideas be credited to their known originators. When programs in this book are based on known sources, whether copyrighted or in the public domain, published or “handed-down,” we have attempted to give proper attribution. Unfor- tunately, the lineage of many programs in common circulation is often unclear. We would be grateful to readers for new or corrected information regarding attributions, which we will attempt to incorporate in subsequent printings. or CDROMs g of machine- America). Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Copyright (C) 1986-1996 by Cambridge University Press. Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine- readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to [email protected] (outside North America). (Oxford: Oxford University Press). [1] 935 . [3] to Fortran 90 Language Features Fortran 90/95 Explained and ”) it will not always work correctly! , cited throughout this chapter as “M&R.” Other good ”) that it is dealing with a Fortran 77 file. If you instead [1] [2] .f90 .for ”or“ .f language, with a large number of new constructions and intrinsic functions. big Our goal, in the remainder of this chapter, is to give a good, working description Fortran 90 is in many respects a backwards-compatible modernization of the In a certain trivial sense, Fortran 90 is strictly backwards-compatible with It is best, therefore, to approach Fortran 90 as a new computer language, albeit starting points include of those Fortranto 90 Fortran language 77 featuresfrequently programmers, that in are with the not particularchapter, Fortran by immediately emphasis 90 itself, self-explanatory on will versions notyou those of teach acquire you that the a to occur reading Numerical writea knowledge Fortran most head Recipes of 90 start the code. routines. that language,M&R But you and or it can This perhaps ought another rapidly provide to Fortran pick help enough up 90 of the reference rest book. of what you need to know from CITED REFERENCES AND FURTHERMetcalf, READING: M., and Reid, J. 1996, 21.0 Introduction Chapter 21. Introduction long-used (and much abused)a Fortran 77 new language, language butmachines. it for is These also, parallel twin in design programmingprocess goals other of on of respects, the becoming present language fluent sometimes and add in confusion Fortran future to 90 the multiprocessor programming. Fortran 77. That is,legacy any Fortran Fortran 90 77 compiler codetrivial, is however, supposed without is to error. that be you ableending have to The to in compile tell “ reason the any compiler for (usually terming via a this source compatibility file name one with a lotfairly in common with Fortran 77. Indeed, in such terms, Fortran 90 is a try to pass off Fortransomething 77 ending code in as “ native Fortran 90 (e.g., by naming the source file Here, in one shortthe chapter, language. we do Luckily, not thereis pretend are to by good provide books Metcalf a that and complete do description Reid exactly of that. Our favorite one 936 Chapter 21. Introduction to Fortran 90 Language Features
Kerrigan, J.F. 1993, Migrating to Fortran 90 (Sebastopol, CA: O’Reilly). [2] Brainerd, W.S., Goldberg, C.H., and Adams, J.C. 1996, Programmer’s Guide to Fortran 90, 3rd ed. (New York: Springer-Verlag). [3]
21.1 Quick Start: Using the Fortran 90 or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Numerical Recipes Routines
This section is for people who want to jump right in. We’ll compute a Bessel function J0(x), where x is equal to the fourth root of the Julian Day number of the 200th full moon since January 1900. (Now there’s a useful quantity!) First, locate the important files nrtype.f90, nrutil.f90, and nr.f90,as listed in Appendices C1, C1, and C2, respectively. These contain modules that either are (i) used by our routines, or else (ii) describe the calling conventions of our routines to (your) user programs. Compile each of these files, producing (with most compilers) a .mod file and a .o (or similarly named) file for each one. Second, create this main program file: PROGRAM hello_bessel USE nrtype USE nr, ONLY: flmoon, bessj0 IMPLICIT NONE INTEGER(I4B) :: n=200,nph=2,jd REAL(SP) :: x,frac,ans call flmoon(n,nph,jd,frac) x=jd**0.25_sp ans=bessj0(x) write (*,*) ’Hello, Bessel: ’, ans END PROGRAM Here is a quick explanation of some elements of the above program: The first USE statement includes a module of ours named nrtype, whose purpose is to give symbolic names to some kinds of data types, among them single-precision reals (“sp”) and four-byte integers (“i4b”). The second USE statement includes a module of ours that defines the calling sequences, and variable types, expected by (in this case) the Numerical Recipes routines flmoon and bessj0. The IMPLICIT NONE statement signals that we want the compiler to require us explicitly to declare all variable types. We strongly urge that you always take this option. The next two lines declare integer and real variables of the desired kinds. The variable n is initialized to the value 200, nph to 2 (a value expected by flmoon). We call flmoon, and take the fourth root of the answer it returns as jd. Note that the
constant 0.25 is typed to be single-precision by the appended sp. or CDROMs
We call the bessj0 routine, and print the answer. g of machine- America). Third, compile the main program file, and also the files flmoon.f90, bessj0.f90. Then, link the resulting object files with also nrutil.o (or sim- ilar system-dependent name, as produced in step 1). Some compilers will also require you to link with nr.o and nrtype.o. Fourth, run the resulting executable file. Typical output is: Hello, Bessel: 7.3096365E-02 21.2 Fortran 90 Language Concepts 937
21.2 Fortran 90 Language Concepts
The Fortran 90 language standard defines and uses a number of standard terms for concepts that occur in the language. Here we summarize briefly some of the most important concepts. Standard Fortran 90 terms are shown in italics. While by no means complete, the information in this section should help you get a quick start or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) with your favorite Fortran 90 reference book or language manual. A note on capitalization: Outside a character context, Fortran 90 is not case- sensitive, so you can use upper and lower case any way you want, to improve readability. A variable like SP (see below) is the same variable as the variable sp. We like to capitalize keywords whose use is primarily at compile-time (statements that delimit program and subprogram boundaries, declaration statements of variables, fixed parameter values), and use lower case for the bulk of run-time code. You can adopt any convention that you find helpful to your own programming style; but we strongly urge you to adopt and follow some convention.
Data Types and Kinds
Data types (also called simply types) can be either intrinsic data types (the familiar INTEGER, REAL, LOGICAL, and so forth) or else derived data types that are built up in the manner of what are called “structures” or “records” in other computer languages. (We’ll use derived data types very sparingly in this book.) Intrinsic data types are further specified by their kind parameter (or simply kind), which is simply an integer. Thus, on many machines, REAL(4) (with kind =4) is a single-precision real, while REAL(8) (with kind =8) is a double-precision real. Literal constants (or simply literals) are specified as to kind by appending an underscore, as 1.5 4 for single precision, or 1.5 8 for double precision. [M&R, §2.5Ч2.6] Unfortunately, the specific integer values that define the different kind types are not specified by the language, but can vary from machine to machine. For that reason, one almost never uses literal kind parameters like 4 or 8, but rather defines in some central file, and imports into all one’s programs, symbolic names for the kinds. For this book, that central file is the module named nrtype, and the chosen symbolic names include SP, DP (for reals); I2B, I4B (for two- and four-byte integers); and LGT for the default logical type. You will therefore see us consistently writing REAL(SP),or1.5 sp, and so forth. Here is an example of declaring some variables, including a one-dimensional
array of length 500, and a two-dimensional array with 100 rows and 200 columns: or CDROMs g of machine-
USE nrtype America). REAL(SP) :: x,y,z INTEGER(I4B) :: i,j,k REAL(SP), DIMENSION(500) :: arr REAL(SP), DIMENSION(100,200) :: barr REAL(SP) :: carr(500)
The last line shows an alternative form for array syntax. And yes, there are default kind parameters for each intrinsic type, but these vary from machine to machine and can get you into trouble when you try to move code. We therefore specify all kind parameters explicitly in almost all situations. 938 Chapter 21. Introduction to Fortran 90 Language Features
Array Shapes and Sizes
The shape of an array refers to both its dimensionality (called its rank), and also the lengths along each dimension (called the extents). The shape of an array is specified by a rank-one array whose elements are the extents along each dimension,
and can be queried with the shape intrinsic (see p. 949). Thus, in the above example, or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) shape(barr) returns an array of length 2 containing the values (100, 200). The size of an array is its total number of elements, so the intrinsic size(barr) would return 20000 in the above example. More often one wants to know the extents along each dimension, separately: size(barr,1) returns the value 100, while size(barr,2) returns the value 200. [M&R, §2.10] Section §21.3, below, discusses additional aspects of arrays in Fortran 90.
Memory Management
Fortran 90 is greatly superior to Fortran 77 in its memory-management capa- bilities, seen by the user as the ability to create, expand, or contract workspace for programs. Within subprograms (that is, subroutines and functions), one can have automatic arrays (or other automatic data objects) that come into existence each time the subprogram is entered, and disappear (returning their memory to the pool) when the subprogram is exited. The size of automatic objects can be specified by arbitrary expressions involving values passed as actual arguments in the calling program, and thus received by the subprogram through its corresponding dummy arguments. [M&R, §6.4] Here is an example that creates some automatic workspace named carr:
SUBROUTINE dosomething(j,k) USE nrtype REAL(SP), DIMENSION(2*j,k**2) :: carr Finer control on when workspace is created or destroyed can be achieved by declaring allocatable arrays, which exist as names only, without associated memory, until they are allocated within the program or subprogram. When no longer needed, they can be deallocated. The allocation status of an allocatable array can be tested by the program via the allocated intrinsic function (p. 952). [M&R, §6.5] Here is an example in outline:
REAL(SP), DIMENSION(:,:), ALLOCATABLE :: darr ...
allocate(darr(10,20)) or CDROMs
... g of machine-
deallocate(darr) America). ... allocate(darr(100,200)) ... deallocate(darr) Notice that darr is originally declared with only “slots” (colons) for its dimensions, and is then allocated/deallocated twice, with different sizes. Yet finer control is achieved by the use of pointers. Like an allocatable array, a pointer can be allocated, at will, its own associated memory. However, it has the additional flexibility of alternatively being pointer associated with a target that 21.2 Fortran 90 Language Concepts 939
already exists under another name. Thus, pointers can be used as redefinable aliases for other variables, arrays, or (see §21.3) array sections. [M&R, §6.12] Here is an example that first associates the pointer parr with the array earr, then later cancels that association and allocates it its own storage of size 50: REAL(SP), DIMENSION(:), POINTER :: parr
REAL(SP), DIMENSION(100), TARGET :: earr or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) ... parr => earr ... nullify(parr) allocate(parr(50)) ... deallocate(parr)
Procedure Interfaces
When a procedure is referenced (e.g., called) from within a program or subprogram (examples of scoping units), the scoping unit must be told, or must deduce, the procedure’s interface, that is, its calling sequence, including the types and kinds of all dummy arguments, returned values, etc. The recommended procedure is to specify this interface via an explicit interface, usually an interface block (essentially a declaration statement for subprograms) in the calling subprogram or in some module that the calling program includes via a USE statement. In this book all interfaces are explicit, and the module named nr contains interface blocks for all of the Numerical Recipes routines. [M&R, §5.11] Here is a typical example of an interface block: INTERFACE SUBROUTINE caldat(julian,mm,id,iyyy) USE nrtype INTEGER(I4B), INTENT(IN) :: julian INTEGER(I4B), INTENT(OUT) :: mm,id,iyyy END SUBROUTINE caldat END INTERFACE Once this interface is made known to a program that you are writing (by either explicit inclusion or a USE statement), then the compiler is able to flag for you a variety of otherwise difficult-to-find bugs. Although interface blocks can sometimes seem overly wordy, they give a big payoff in ultimately minimizing programmer time and frustration. For compatibility with Fortran 77, the language also allows for implicit inter- faces, where the calling program tries to figure out the interface by the old rules of
Fortran 77. These rules are quite limited, and prone to producing devilishly obscure or CDROMs program bugs. We strongly recommend that implicit interfaces never be used. g of machine- America). Elemental Procedures and Generic Interfaces
Many intrinsic procedures (those defined by the language standard and thus usable without any further definition or specification) are also generic. This means that a single procedure name, such as log(x), can be used with a variety of types and kind parameters for the argument x, and the result returned will have the same type and kind parameter as the argument. In this example, log(x) allows any real or complex argument type. 940 Chapter 21. Introduction to Fortran 90 Language Features
Better yet, most generic functions are also elemental. The argument of an elemental function can be an array of arbitrary shape! Then, the returned result is an array of the same shape, with each element containing the result of applying the function to the corresponding element of the argument. (Hence the name elemental, meaning “applied element by element.”) [M&R, §8.1] For example:
REAL(SP), DIMENSION(100,100) :: a,b or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) b=sin(a) Fortran 90 has no facility for creating new, user-defined elemental functions. It does have, however, the related facility of overloading by the use of generic interfaces. This is invoked by the use of an interface block that attaches a single generic name to a number of distinct subprograms whose dummy arguments have different types or kinds. Then, when the generic name is referenced (e.g., called), the compiler chooses the specific subprogram that matches the types and kinds of the actual arguments used. [M&R, §5.18] Here is an example of a generic interface block: INTERFACE myfunc FUNCTION myfunc_single(x) USE nrtype REAL(SP) :: x,myfunc_single END FUNCTION myfunc_single FUNCTION myfunc_double(x) USE nrtype REAL(DP) :: x,myfunc_double END FUNCTION myfunc_double END INTERFACE A program with knowledge of this interface could then freely use the function reference myfunc(x) for x’s of both type SP and type DP. We use overloading quite extensively in this book. A typical use is to provide, under the same name, both scalar and vector versions of a function such as a Bessel function, or to provide both single-precision and double-precision versions of procedures (as in the above example). Then, to the extent that we have provided all the versions that you need, you can pretend that our routine is elemental. In such a situation, if you ever call our function with a type or kind that we have not provided, the compiler will instantly flag the problem, because it is unable to resolve the generic interface.
Modules
Modules, already referred to several times above, are Fortran 90’s generalization
of Fortran 77’s common blocks, INCLUDEd files of parameter statements, and (to or CDROMs some extent) statement functions. Modules are program units, like main programs or g of machine- subprograms (subroutines and functions), that can be separately compiled. A module America). is a convenient place to stash global data, named constants (what in Fortran 77 are called “symbolic constants” or “PARAMETERs”), interface blocks to subprograms and/or actual subprograms themselves (module subprograms). The convenience is that a module’s information can be incorporated into another program unit via a simple, one-line USE statement. [M&R, §5.5] Here is an example of a simple module that defines a few parameters, creates some global storage for an array named arr (as might be done with a Fortran 77 common block), and defines the interface to a function yourfunc: 21.3 More on Arrays and Array Sections 941
MODULE mymodule USE nrtype REAL(SP), PARAMETER :: con1=7.0_sp/3.0_sp,con2=10.0_sp INTEGER(I4B), PARAMETER :: ndim=10,mdim=9 REAL(SP), DIMENSION(ndim,mdim) :: arr INTERFACE FUNCTION yourfunc(x)
USE nrtype or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) REAL(SP) :: x,yourfunc END FUNCTION yourfunc END INTERFACE END MODULE mymodule As mentioned earlier, the module nr contains INTERFACE declarations for all the Numerical Recipes. When we include a statement of the form
USE nr, ONLY: recipe1 it means that the program uses the additional routine recipe1. The compiler is able to use the explicit interface declaration in the module to check that recipe1 is invoked with arguments of the correct type, shape, and number. However, a weakness of Fortran 90 is that there is no fail-safe way to be sure that the interface module (here nr) stays synchronized with the underlying routine (here recipe1). You might think that you could accomplish this by putting USE nr, ONLY: recipe1 into the recipe1 program itself. Unfortunately, the compiler interprets this as an erroneous double definition of recipe1’s interface, rather than (as would be desirable) as an opportunity for a consistency check. (To achieve this kind of consistency check, you can put the procedures themselves, not just their interfaces, into the module.)
CITED REFERENCES AND FURTHER READING: Metcalf, M., and Reid, J. 1996, Fortran 90/95 Explained (Oxford: Oxford University Press).
21.3 More on Arrays and Array Sections
Arrays are the central conceptual core of Fortran 90 as a parallel programming language, and thus worthy of some further discussion. We have already seen that arrays can “come into existence” in Fortran 90 in several ways, either directly declared, as
REAL(SP), DIMENSION(100,200) :: arr or else allocated by an allocatable variable or a pointer variable, or CDROMs
REAL(SP), DIMENSION(:,:), ALLOCATABLE :: arr g of machine-
REAL(SP), DIMENSION(:,:), POINTER :: barr America). ... allocate(arr(100,200),barr(100,200)) or else (not previously mentioned) passed into a subprogram through a dummy argument:
SUBROUTINE myroutine(carr) USE nrtype REAL(SP), DIMENSION(:,:) :: carr ... i=size(carr,1) 942 Chapter 21. Introduction to Fortran 90 Language Features
j=size(carr,2) In the above example we also show how the subprogram can find out the size of the actual array that is passed, using the size intrinsic. This routine is an example of the use of an assumed-shape array, new to Fortran 90. The actual extents along each dimension are inherited from the calling routine at run time. A subroutine with assumed-shape array arguments must have an explicit interface in the calling or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) routine, otherwise the compiler doesn’t know about the extra information that must be passed. A typical setup for calling myroutine would be: PROGRAM use_myroutine USE nrtype REAL(SP), DIMENSION(10,10) :: arr INTERFACE SUBROUTINE myroutine(carr) USE nrtype REAL(SP), DIMENSION(:,:) :: carr END SUBROUTINE myroutine END INTERFACE ... call myroutine(a) Of course, for the recipes we have provided all the interface blocks in the file nr.f90, and you need only a USE nr statement in your calling program.
Conformable Arrays
Two arrays are said to be conformable if their shapes are the same. Fortran 90 allows practically all operations among conformable arrays and elemental functions that are allowed for scalar variables. Thus, if arr, barr, and carr are mutually conformable, we can write,
arr=barr+cos(carr)+2.0_sp and have the indicated operations performed, element by corresponding element, on the entire arrays. The above line also illustrates that a scalar (here the constant 2.0 sp, but a scalar variable would also be fine) is deemed conformable with any array — it gets “expanded” to the shape of the rest of the expression that it is in. [M&R, §3.11] In Fortran 90, as in Fortran 77, the default lower bound for an array subscript is 1; however, it can be made some other value at the time that the array is declared:
REAL(SP), DIMENSION(100,200) :: farr REAL(SP), DIMENSION(0:99,0:199) :: garr
... or CDROMs
farr = 3.0_sp*garr + 1.0_sp g of machine- America). Notice that farr and garr are conformable, since they have the same shape, in this case (100, 200). Also note that when they are used in an array expression, the operations are taken between the corresponding elements of their shapes, not necessarily the corresponding elements of their indices. [M&R, §3.10] In other words, one of the components evaluated is, farr(1,1) = 3.0_sp*garr(0,0) + 1.0_sp This illustrates a fundamental aspect of array (or data) parallelism in Fortran 90. Array constructions should not be thought of as merely abbreviations for do-loops 21.3 More on Arrays and Array Sections 943
over indices, but rather as genuinely parallel operations on same-shaped objects, abstracted of their indices. This is why the standard makes no statement about the order in which the individual operations in an array expression are executed; they might in fact be carried out simultaneously, on parallel hardware. By default, array expressions and assignments are performed for all elements
of the same-shaped arrays referenced. This can be modified, however, by use of or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) a where construction like this: where (harr > 0.0_sp) farr = 3.0_sp*garr + 1.0_sp end where Here harr must also be conformable to farrand garr. Analogously with the Fortran if-statement, there is also a one-line form of the where-statement. There is also a where ... elsewhere ... end where form of the statement, analogous to if ... else if ... end if. A significant language limitation in Fortran 90 is that nested where-statements are not allowed. [M&R, §6.8]
Array Sections
Much of the versatility of Fortran 90’s array facilities stems from the availability of array sections. An array section acts like an array, but its memory location, and thus the values of its elements, is actually a subset of the memory location of an already-declared array. Array sections are thus “windows into arrays,” and they can appear on either the left side, or the right side, or both, of a replacement statement. Some examples will clarify these ideas. Let us presume the declarations REAL(SP), DIMENSION(100) :: arr INTEGER(I4B), DIMENSION(6) :: iarr=(/11,22,33,44,55,66/) Note that iarr is not only declared, it is also initializedby an initializationexpression (a replacement for Fortran 77’s DATA statement). [M&R, §7.5] Here are some array sections constructed from these arrays: Array Section What It Means arr(:) same as arr arr(1:100) same as arr arr(1:10) one-dimensional array containing first 10 elements of arr arr(51:100) one-dimensional array containing sec- ond half of arr or CDROMs g of machine-
arr(51:) same as arr(51:100) America). arr(10:1:-1) one-dimensional array containing first 10 elements of arr, but in reverse order arr((/10,99,1,6/)) one-dimensional array containing ele- ments 10, 99, 1, and 6 of arr, in that order arr(iarr) one-dimensional array containing ele- ments 11, 22, 33, 44, 55, 66 of arr,in that order 944 Chapter 21. Introduction to Fortran 90 Language Features
Now let’s try some array sections of the two-dimensional array
REAL(SP), DIMENSION(100,100) :: barr
Array Section What It Means ii est tp/wwn.o rcl -0-7-43(ot mrc ny,or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) barr(:,:) same as barr barr(1:100,1:100) same as barr barr(7,:) one-dimensional array containing the 7th row of barr barr(7,1:100) same as barr(7,:) barr(:,7) one-dimensional array containing the 7th column of barr barr(21:30,71:90) two-dimensional array containing the sub-block of barr with the indicated ranges of indices; the shape of this array section is (10, 20) barr(100:1:-1,100:1:-1) two-dimensional array formed by flip- ping barr upside down and backwards barr(2:100:2,2:100:2) two-dimensional array of shape (50, 50) containing the elements of barr whose row and column indices are both even
Some terminology: A construction like 2:100:2, above, is called a subscript triplet. Its integer pieces (which may be integer constants, or more general integer expressions) are called lower, upper, and stride. Any of the three may be omitted. An omitted stride defaults to the value 1. Notice that, if (upper − lower) has a different sign from stride, then a subscript triplet defines an empty or zero-length array, e.g., 1:5:-1 or 10:1:1 (or its equivalent form, simply 10:1). Zero-length arrays are not treated as errors in Fortran 90, but rather as “no-ops.” That is, no operation is performed in an expression or replacement statement among zero-length arrays. (This is essentially the same convention as in Fortran 77 for do-loop indices, which array expressions often replace.) [M&R, §6.10] It is important to understand that array sections, when used in array expressions, match elements with other parts of the expression according to shape, not according to indices. (This is exactly the same principle that we applied, above, to arrays or CDROMs with subscript lower bounds different from the default value of 1.) One frequently g of machine- exploits this feature in using array sections to carry out operations on arrays that America). access neighboring elements. For example,
carr(1:n-1,1:n-1) = barr(1:n-1,1:n-1)+barr(2:n,2:n) constructs in the (n − 1) × (n − 1) matrix carr the sum of each of the corresponding elements in n × n barr added to its diagonally lower-right neighbor. Pointers are often used as aliases for array sections, especially if the same array sections are used repeatedly. [M&R, §6.12] For example, with the setup
REAL(SP), DIMENSION(:,:), POINTER :: leftb,rightb 21.4 Fortran 90 Intrinsic Procedures 945
leftb=>barr(1:n-1,1:n-1) rightb=>barr(2:n,2:n)
the statement above can be coded as
carr(1:n-1,1:n-1)=leftb+rightb ii est tp/wwn.o rcl -0-7-43(ot mrc ny,or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) We should also mention that array sections, while powerful and concise, are sometimes not quite powerful enough. While any row or column of a matrix is easily accessible as an array section, there is no good way, in Fortran 90, to access (e.g.) the diagonal of a matrix, even though its elements are related by a linear progression in the Fortran storage order (by columns). These so-called skew-sections were much discussed by the Fortran 90 standards committee, but they were not implemented. We will see examples later in this volume of work-around programming tricks (none totally satisfactory) for this omission. (Fortran 95 corrects the omission; see §21.6.)
CITED REFERENCES AND FURTHER READING: Metcalf, M., and Reid, J. 1996, Fortran 90/95 Explained (Oxford: Oxford University Press).
21.4 Fortran 90 Intrinsic Procedures
Much of Fortran 90’s power, both for parallel programming and for its concise expression of algorithmic ideas, comes from its rich set of intrinsicprocedures. These have the effect of making the language “large,” hence harder to learn. However, effort spent on learning to use the intrinsics — particularly some of their more obscure, and more powerful, optional arguments — is often handsomely repaid. This section summarizes the intrinsics that we find useful in numerical work. We omit, here, discussion of intrinsics whose exclusive use is for character and string manipulation. We intend only a summary, not a complete specification, which can be found in M&R’s Chapter 8, or other reference books. If you find the sheer number of new intrinsic procedures daunting, you might
want to start with our list of the “top 10” (with the number of different Numerical or CDROMs
Recipes routines that use each shown in parentheses): size (254), sum (44), g of machine- dot product (31), merge (27), all (25), maxval (23), matmul (19), pack (18), America). any (17), and spread (15). (Later, in Chapter 23, you can compare these numbers with our frequency of using the short utility functions that we define in a module named nrutil — several of which we think ought to have been included as Fortran 90 intrinsic procedures.) The type, kind, and shape of the value returned by intrinsic functions will usually be clear from the short description that we give. As an additional hint (though not necessarily a precise description), we adopt the following codes: 946 Chapter 21. Introduction to Fortran 90 Language Features
Hint What It Means [Int] an INTEGER kind type [Real] a REAL kind type [Cmplx] a COMPLEX kind type
[Num] a numerical type and kind or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) [Lgcl] a LOGICAL kind type [Iarr] a one-dimensional INTEGER array [argTS] same type and shape as the first argument [argT] same type as the first argument, but not necessarily the same shape
Numerical Elemental Functions
Little needs to be said about the numerical functions with identical counterparts in Fortran 77: abs, acos, aimag, asin, atan, atan2, conjg, cos, cosh, dim, exp, log, log10, max, min, mod, sign, sin, sinh, sqrt, tan, and tanh. In Fortran 90 these are all elemental functions, so that any plausible type, kind, and shape of argument may be used. Except for aimag, which returns a real type from a complex argument, these all return [argTS] (see table above). Although Fortran 90 recognizes,for compatibility,Fortran 77’s so-called specific names for these functions (e.g., iabs, dabs, and cabs for the generic abs), these are entirely superfluous and should be avoided. Fortran 90 corrects some ambiguity (or at least inconvenience) in Fortran 77’s mod(a,p) function, by introducing a new function modulo(a,p). The functions are essentially identical for positive arguments, but for negative a and positive p, modulo gives results more compatible with one’s mathematical expectation that the answer should always be in the positive range 0 to p. E.g., modulo(11,5)=1, and modulo(-11,5)=4. [M&R, §8.3.2]
Conversion and Truncation Elemental Functions
Fortran 90’s conversion (or, in the language of C, casting) and truncation functions are generally modeled on their Fortran 77 antecedents, but with the addition of an optional second integer argument, kind, that determines the kind of or CDROMs
the result. Note that, if kind is omitted, you get a default kind — not necessarily g of machine-
related to the kind of your argument. The kind of the argument is of course known America). to the compiler by its previous declaration. Functions in this category (see below for explanation of arguments in slanted type) are: [Real] aint(a,kind) Truncate to integer value, return as a real kind.
[Real] anint(a,kind) Nearest whole number, return as a real kind. [Cmplx] cmplx(x,y,kind) 21.4 Fortran 90 Intrinsic Procedures 947
Convert to complex kind. If y is omitted, it is taken to be 0. [Int] int(a,kind) Convert to integer kind, truncating towards zero.
[Int] nint(a,kind)
Convert to integer kind, choosing the nearest whole number. or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0)
[Real] real(a,kind) Convert to real kind.
[Lgcl] logical(a,kind) Convert one logical kind to another.
We must digress here to explain the use of optional arguments and keywords as Fortran 90 language features. [M&R, §5.13] When a routine (either intrinsic or user-defined) has arguments that are declared to be optional, then the dummy names given to them also become keywords that distinguish — independent of their position in a calling list — which argument is intended to be passed. (There are some additional rules about this that we will not try to summarize here.) In this section’s tabular listings, we indicate optional arguments in intrinsic routines by printing them in smaller slanted type. For example, the intrinsic function eoshift(array,shift,boundary,dim) has two required arguments, array and shift, and two optional arguments, boundary and dim. Suppose we want to call this routine with the actual arguments myarray, myshift, and mydim, but omitting the argument in the boundary slot. We do this by the expression eoshift(myarray,myshift,dim=mydim) Conversely, if we wanted a boundary argument, but no dim, we might write eoshift(myarray,myshift,boundary=myboundary) It is always a good idea to use this kind of keyword construction when invoking optional arguments, even though the rules allow keywords to be omitted in some unambiguous cases. Now back to the lists of intrinsic routines.
A peculiarity of the real function derives from its use both as a type conversion and for extracting the real part of complex numbers (related, but not identical, usages): If the argument of real is complex, and kind is omitted, then the result isn’t a default real kind, but rather is (as one generally would want) the real kind type corresponding to the kind type of the complex argument, that is, single-precision or CDROMs
real for single-precision complex, double-precision for double-precision, and so on. g of machine-
[M&R, §8.3.1] We recommend never using kind when you intend to extract the America). real part of a complex, and always using kind when you intend conversion of a real or integer value to a particular kind of REAL. (Use of the deprecated function dble is not recommended.) The last two conversion functions are the exception in that they don’t allow a kind argument, but rather return default integer kinds. (The X3J3 standards committee has fixed this in Fortran 95.) [Int] ceiling(a) Convert to integer, truncating towards more positive. 948 Chapter 21. Introduction to Fortran 90 Language Features
[Int] floor(a) Convert to integer, truncating towards more negative.
Reduction and Inquiry Functions on Arrays
These are mostly the so-called transformational functions that accept array or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) arguments and return either scalar values or else arrays of lesser rank. [M&R, §8.11] With no optional arguments, such functions act on all the elements of their single array argument, regardless of its shape, and produce a scalar result. When the optional argument dim is specified, they instead act on all one-dimensional sections that span the dimension dim, producing an answer one rank lower than the first argument (that is, omitting the dim dimension from its shape). When the optional argument mask is specified, only the elements with a corresponding true value in mask are scanned. [Lgcl] all(mask,dim) Returns true if all elements of mask are true, false otherwise. [Lgcl] any(mask,dim) Returns true if any of the elements of mask are true, false otherwise. [Int] count(mask,dim) Counts the true elements in mask. [Num] maxval(array,dim,mask) Maximum value of the array elements.
[Num] minval(array,dim,mask) Minimum value of the array elements. [Num] product(array,dim,mask) Product of the array elements.
[Int] size(array,dim) Size (total number of elements) of array, or its extent along dimension dim. [Num] sum(array,dim,mask) Sum of the array elements. The use of the dim argument can be confusing, so an example may be helpful. Suppose we have 1234 or CDROMs g of machine-
myarray = 5678 America). 9101112 where, as always, the i index in array(i,j) numbers the rows while the j index numbers the columns. Then sum(myarray,dim=1) =(15, 18, 21, 24) that is, the i indices are “summed away” leaving only a j index on the result; while sum(myarray,dim=2) =(10, 26, 42) 21.4 Fortran 90 Intrinsic Procedures 949
that is, the j indices are “summed away” leaving only an i index on the result. Of course we also have
sum(myarray) =78
Two related functions return the location of particular elements in an array. The returned value is a one-dimensional integer array containing the respective subscript or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) of the element along each dimension. Note that when the argument object is a one-dimensional array, the returned object is an integer array of length 1, not simply an integer. (Fortran 90 distinguishes between these.) [Iarr] maxloc(array,mask) Location of the maximum value in an array.
[Iarr] minloc(array,mask) Location of the minimum value in an array. Similarly returning a one-dimensional integer array are [Iarr] shape(array) Returns the shape of array as a one-dimensional integer array. [Iarr] lbound(array,dim) When dim is absent, returns an array of lower bounds for each dimension of subscripts of array. When dim is present, returns the value only for dimension dim, as a scalar. [Iarr] ubound(array,dim) When dim is absent, returns an array of upper bounds for each dimension of subscripts of array. When dim is present, returns the value only for dimension dim, as a scalar.
Array Unary and Binary Functions
The most powerful array operations are simply built into the language as operators. All the usual arithmetic and logical operators (+, -, *, /, **, .not., .and., .or., .eqv., .neqv.) can be applied to arrays of arbitrary shape or (for the binary operators) between two arrays of the same shape, or between arrays and scalars. The types of the arrays must, of course, be appropriate to the operator used. The result in all cases is to perform the operation element by element on the arrays. We also have the intrinsic functions,
[Num] dot product(veca,vecb) or CDROMs Scalar dot product of two one-dimensional vectors and . g of machine- veca vecb America). [Num] matmul(mata,matb) Result of matrix-multiplying the two two-dimensional matrices mata and matb. The shapes have to be such as to allow matrix multiplication. Vectors (one-dimensional arrays) are additionally allowed as either the first or second argument, but not both; they are treated as row vectors in the first argument, and as column vectors in the second. You might wonder how to form the outer product of two vectors, since matmul specifically excludes this case. (See §22.1 and §23.5 for answer.) 950 Chapter 21. Introduction to Fortran 90 Language Features
Array Manipulation Functions These include many powerful features that a good Fortran 90 programmer should master. [argTS] cshift(array,shift,dim) If dim is omitted, it is taken to be 1. Returns the result of circularly ii est tp/wwn.o rcl -0-7-43(ot mrc ny,or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) left-shifting every one-dimensional section of array (in dimension dim)byshift (which may be negative). That is, for positive shift, values are moved to smaller subscript positions. Consult a Fortran 90 reference (e.g., [M&R, §8.13.5]) for the case where shift is an array. [argTS] merge(tsource,fsource,mask) Returns same shape object as tsource and fsource containing the former’s components where mask is true, the latter’s where it is false. [argTS] eoshift(array,shift,boundary,dim) If dim is omitted, it is taken to be 1. Returns the result of end-off left- shifting every one-dimensional section of array (in dimension dim) by shift (which may be negative). That is, for positive shift, values are moved to smaller subscript positions. If boundary is present as a scalar, it supplies elements to fill in the blanks; if it is not present, zero values are used. Consult a Fortran 90 reference (e.g., [M&R, §8.13.5]) for the case where boundary and/or shift is an array. [argT] pack(array,mask,vector) Returns a one-dimensional array containing the elements of array that pass the mask. Components of optional vector are used to pad out the result to the size of vector with specified values. [argT] reshape(source,shape,pad,order) Takes the elements of source, in normal Fortran order, and returns them (as many as will fit) as an array whose shape is specified by the one-dimensional integer array shape. If there is space remaining, then pad must be specified, and is used (as many sequential copies as necessary) to fill out the rest. For description of order, consult a Fortran 90 reference, e.g., [M&R, 8.13.3]. [argT] spread(source,dim,ncopies) Returns an array whose rank is one greater than source, and whose dim dimension is of length ncopies. Each of the result’s ncopies
array sections having a fixed subscript in dimension dim is a copy of or CDROMs
source. (That is, it spreads source into the dimth dimension.) g of machine- America). [argT] transpose(matrix) Returns the transpose of matrix, which must be two-dimensional. [argT] unpack(vector,mask,field) Returns an array whose type is that of vector, but whose shape is that of mask. The components of vector are put, in order, into the positions where mask is true. Where mask is false, components of field (which may be a scalar or an array with the same shape as mask) are used instead. 21.4 Fortran 90 Intrinsic Procedures 951
Bitwise Functions
Most of the bitwise functions should be familiar to Fortran 77 programmers as longstanding standard extensions of that language. Note that the bit positions number from zero to one less than the value returned by the bit size function.
Also note that bit positions number from right to left. Except for bit size, the or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) following functions are all elemental. [Int] bit size(i) Number of bits in the integer type of i. [Lgcl] btest(i,pos) True if bit position pos is 1, false otherwise. [Int] iand(i,j) Bitwise logical and. [Int] ibclr(i,pos) Returns i but with bit position pos set to zero. [Int] ibits(i,pos,len) Extracts len consecutive bits starting at position pos and puts them in the low bit positions of the returned value. (The high positions are zero.) [Int] ibset(i,pos) Returns i but with bit position pos set to 1. [Int] ieor(i,j) Bitwise exclusive or. [Int] ior(i,j) Bitwise logical or. [Int] ishft(i,shift) Bitwise left shift by shift (which may be negative) with zeros shifted in from the other end. [Int] ishftc(i,shift) Bitwise circularly left shift by shift (which may be negative). [Int] not(i) Bitwise logical complement. or CDROMs g of machine-
Some Functions Relating to Numerical Representations America).
[Real] epsilon(x) Smallest nonnegligible quantity relative to 1 in the numerical model of x. [Num] huge(x) Largest representable number in the numerical model of x. [Int] kind(x) 952 Chapter 21. Introduction to Fortran 90 Language Features
Returns the kind value for the numerical model of x. [Real] nearest(x,s) Real number nearest to x in the direction specified by the sign of s. [Real] tiny(x)
Smallest positive number in the numerical model of x. or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0)
Other Intrinsic Procedures
[Lgcl] present(a) True, within a subprogram, if an optional argument is actually present, otherwise false.
[Lgcl] associated(pointer,target) True if pointer is associated with target or (if target is absent) with any target, otherwise false. [Lgcl] allocated(array) True if the allocatable array is allocated, otherwise false. There are some pitfalls in using associated and allocated, having to do with arrays and pointers that can find themselves in undefined status [see §21.5, and also M&R, §3.3 and §6.5.1]. For example, pointers are always “born” in an undefined status, where the associated function returns unpredictable values. For completeness, here is a list of Fortran 90’s intrinsic procedures not already mentioned: Other Numerical Representation Functions: digits, exponent, fraction, rrspacing, scale, set exponent, spacing, maxexponent, minexponent, precision, radix, range, selected int kind, selected real kind. Lexical comparison: lge, lgt, lle, llt. Character functions: ichar, char, achar, iachar, index, adjustl, adjustr, len trim, repeat, scan, trim, verify. Other: mvbits, transfer, date and time, system clock, random seed, random number. (We will discuss random numbers in some detail in Chapter B7.)
CITED REFERENCES AND FURTHER READING: Metcalf, M., and Reid, J. 1996, Fortran 90/95 Explained (Oxford: Oxford University Press). or CDROMs g of machine- America). 21.5 Advanced Fortran 90 Topics 953
21.5 Advanced Fortran 90 Topics
Pointers, Arrays, and Memory Management
One of the biggest improvements in Fortran 90 over Fortran 77 is in the handling of arrays, which are the cornerstone of many numerical algorithms. In this or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) subsection we will take a closer look at how to use some of these new array features effectively. We will look at how to code certain commonly occurring elements of program design, and we will pay particular attention to avoiding “memory leaks,” where — usually inadvertently — we keep cumulatively allocating new storage for an array, every time some piece of code is invoked. Let’s first review some of the rules for using allocatable arrays and pointers to arrays. Recall that a pointer is born with an undefined status. Its status changes to “associated” when you make it refer to a target, and to “disassociated” when you nullify the pointer. [M&R, §3.3] You can also use nullify on a newly born pointer to change its status from undefined to disassociated; this allows you to test the status with the associated inquiry function. [M&R, §6.5.4] (While many compilers will not produce a run-time error if you test an undefined pointer with associated, you can’t rely on this laissez-faire in your programming.) The initial status of an allocatable array is “not currently allocated.” Its status changes to “allocated” when you give it storage with allocate, and back to “not currently allocated” when you use deallocate. [M&R, §6.5.1] You can test the status with the allocated inquiry function. Note that while you can also give a pointer fresh storage with allocate, you can’t test this with allocated — only associated is allowed with pointers. Note also that nullifying an allocated pointer leaves its associated storage in limbo. You must instead deallocate, which gives the pointer a testable “disassociated” status. While allocating an array that is already allocated gives an error, you are allowed to allocate a pointer that already has a target. This breaks the old association, and could leave the old target inaccessible if there is no other pointer associated with it. [M&R, §6.5.2] Deallocating an array or pointer that has not been allocated is always an error. Allocated arrays that are local to a subprogram acquire the “undefined” status on exit from the subprogram unless they have the SAVE attribute. (Again, not all compilers enforce this, but be warned!) Such undefined arrays cannot be referenced in any way, so you should explicitly deallocate all allocated arrays that are not saved before returning from a subprogram. [M&R, §6.5.1] The same rule applies to arrays declared in modules that are currently accessed only by the subprogram. or CDROMs While you can reference undefined pointers (e.g., by first nullifying them), it is good g of machine- America). programming practice to deallocate explicitly any allocated pointers declared locally before leaving a subprogram or module. Now let’s turn to using these features in programs. The simplest example is when we want to implement global storage of an array that needs to be accessed by two or more different routines, and we want the size of the array to be determined at run time. As mentioned earlier, we implement global storage with a MODULE rather than a COMMON block. (We ignore here the additional possibility of passing 954 Chapter 21. Introduction to Fortran 90 Language Features
global variables by having one routine CONTAINed within the other.) There are two good ways of handling the dynamical allocation in a MODULE. Method 1 uses an allocatable array:
MODULE a REAL(SP), DIMENSION(:), ALLOCATABLE :: x END MODULE a ii est tp/wwn.o rcl -0-7-43(ot mrc ny,or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0)
SUBROUTINE b(y) USE a REAL(SP), DIMENSION(:) :: y ... allocate(x(size(y))) ... [other routines using x called here] ... END SUBROUTINE b Here the global variable x gets assigned storage in subroutine b (in this case, the same as the length of y). The length of y is of course defined in the procedure that calls b. The array x is made available to any other subroutine called by b by including a USE a statement. The status of x can be checked with an allocated inquiry function on entry into either b or the other subroutine if necessary. As discussed above, you must be sure to deallocate x before returning from subroutine b. If you want x to retain its values between calls to b, you add the SAVE attribute to its declaration in a, and don’t deallocate it on returning from b. (Alternatively, you could put a USE a in your main program, but we consider that bug-prone, since forgetting to do so can create all manner of difficult-to-diagnose havoc.) To avoid allocating x more than once, you test it on entry into b:
if (.not. allocated(x)) allocate(x(size(y))) The second way to implement this type of global storage (Method 2) uses a pointer:
MODULE a REAL(SP), DIMENSION(:), POINTER :: x END MODULE a
SUBROUTINE b(y) USE a REAL(SP), DIMENSION(:) :: y REAL(SP), DIMENSION(size(y)), TARGET :: xx ... x=>xx ... [other routines using x called here] ... END SUBROUTINE b
Here the automatic array xx gets its temporary storage automatically on entry or CDROMs
into b, and automatically gets deallocated on exit from b. [M&R, §6.4] The global g of machine- pointer x can access this storage in any routine with a USE a that is called by b. America). You can check that things are in order in such a called routine by testing x with associated. If you are going to use x for some other purpose as well, you should nullify it on leaving b so that it doesn’t have undefined status. Note that this implementation does not allow values to be saved between calls: You can’t SAVE automatic arrays — that’s not what they’re for. You would have to SAVE x in the module, and allocate it in the subroutine instead of pointing it to a suitable automatic array. But this is essentially Method 1 with the added complication of using a pointer, so Method 1 is simpler when you want to save values. When you don’t 21.5 Advanced Fortran 90 Topics 955
need to save values between calls, we lean towards Method 2 over Method 1 because we like the automatic allocation and deallocation, but either method works fine. An example of Method 1 (allocatable array) is in routine rkdumb on page 1297. An example of Method 1 with SAVE is in routine pwtset on p. 1265. Method 2 (pointer) shows up in routines newt (p. 1196), broydn (p. 1199), and fitexy
(p. 1286). A variation is shown in routines linmin (p. 1211) and dlinmin (p. 1212): or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) When the array that needs to be shared is an argument of one of the routines, Method 2 is better. An extension of these ideas occurs if we allocate some storage for an array initially, but then might need to increase the size of the array later without losing the already-stored values. The function reallocate in our utility module nrutil will handle this for you, but it expects a pointer argument as in Method 2. Since no automatic arrays are used, you are free to SAVE the pointer if necessary. Here is a simple example of how to use reallocate to create a workspace array that is local to a subroutine:
SUBROUTINE a USE nrutil, ONLY : reallocate REAL(SP), DIMENSION(:), POINTER, SAVE :: wksp LOGICAL(LGT), SAVE :: init=.true. if (init) then init=.false. nullify(wksp) wksp=>reallocate(wksp,100) end if ... if (nterm > size(wksp)) wksp=>reallocate(wksp,2*size(wksp)) ... END SUBROUTINE a Here the workspace is initially allocated a size of 100. If the number of elements used (nterm) ever exceeds the size of the workspace, the workspace is doubled. (In a realistic example, one would of course check that the doubled size is in fact big enough.) Fortran 90 experts can note that the SAVE on init is not strictly necessary: Any local variable that is initialized is automatically saved. [M&R, §7.5] You can find similar examples of reallocate (with some further discussion) in eulsum (p. 1070), hufenc (p. 1348), and arcode (p. 1350). Examples of reallocate used with global variables in modules are in odeint (p. 1300) and ran state (p. 1144). Another situation where we have to use pointers and not allocatable arrays is when the storage is required for components of a derived type, which are not allowed to have the allocatable attribute. Examples are in hufmak (p. 1346) and or CDROMs
arcmak (p. 1349). g of machine-
Turning away from issues relating to global variables, we now consider several America). other important programming situations that are nicely handled with pointers. The first case is when we want a subroutine to return an array whose size is not known in advance. Since dummy arguments are not allocatable, we must use a pointer. Here is the basic construction:
SUBROUTINE a(x,nx) REAL(SP), DIMENSION(:), POINTER :: x INTEGER(I4B), INTENT(OUT) :: nx LOGICAL(LGT), SAVE :: init=.true. if (init) then 956 Chapter 21. Introduction to Fortran 90 Language Features
init=.false. nullify(x) else if (associated(x)) deallocate(x) end if ... nx=...
allocate(x(nx)) or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) x(1:nx)=... END SUBROUTINE a Since the length of x can be found from size(x), it is not absolutely necessary to pass nx as an argument. Note the use of the initial logic to avoid memory leaks. If a higher-level subroutine wants to recover the memory associated with x from the last call to SUBROUTINE a, it can do so by first deallocating it, and then nullifying the pointer. Examples of this structure are in zbrak (p. 1184), period (p. 1258), and fasper (p. 1259). A related situation is where we want a function to return an array whose size is not predetermined, such as in voltra on (p. 1326). The discussion of voltra also explains the potential pitfalls of functions returning pointers to dynamically allocated arrays. A final useful pointer construction enables us to set up a data structure that is essentially an array of arrays, independently allocatable on each part. We are not allowed to declare an array of pointers in Fortran 90, but we can do this indirectly by defining a derived type that consists of a pointer to the appropriate kind of array. [M&R, §6.11] We can then define a variable that is an allocatable array of the new type. For example,
TYPE ptr_to_arr REAL(SP), DIMENSION(:), POINTER :: arr END TYPE TYPE(ptr_to_arr), DIMENSION(:), ALLOCATABLE :: x ... allocate(x(n)) ... do i=1,n allocate(x(i)%arr(m)) end do sets up a set x of n arrays of length m. See also the example in mglin (p. 1334). There is a potential problem with dynamical memory allocation that we should mention. The Fortran 90 standard does not require that the compiler perform “garbage collection,” that is, it is not required to recover deallocated memory into nice contiguous pieces for reuse. If you enter and exit a subroutine many times, and each time a large chunk of memory gets allocated and deallocated, you could or CDROMs
run out of memory with a “dumb” compiler. You can often alleviate the problem g of machine-
by deallocating variables in the reverse order that you allocated them. This tends to America). keep a large contiguous piece of memory free at the top of the heap.
Scope, Visibility, and Data Hiding
An important principle of good programming practice is modularization, the idea that different parts of a program should be insulated from each other as much as possible. An important subcase of modularization is data hiding, the principle that actions carried out on variables in one part of the code should not be able to 21.5 Advanced Fortran 90 Topics 957
affect the values of variables in other parts of the code. When it is necessary for one “island” of code to communicate with another, the communication should be through a well-defined interface that makes it obvious exactly what communication is taking place, and prevents any other interchange from occurring. Otherwise, different sections of code should not have access to variables that they don’t need.
The concept of data hiding extends not only to variables, but also to the names or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) of procedures that manipulate the variables: A program for screen graphics might give the user access to a routine for drawing a circle, but it might “hide” the names (and methods of operation) of the primitive routines used for calculating the coordinates of the points on the circumference. Besides producing code that is easier to understand and to modify, data hiding prevents unintended side effects from producing hard-to-find errors. In Fortran, the principal language construction that effects data hiding is the use of subroutines. If all subprograms were restricted to have no more than ten executable statements per routine, and to communicate between routines only by an explicit list of arguments, the number of programming errors might be greatly reduced! Unfortunately few tasks can be easily coded in this style. For this and other reasons, we think that too much procedurization is a bad thing; one wants to find the right amount. Fortunately Fortran 90 provides several additional tools to help with data hiding. Global variables and routine names are important, but potentially dangerous, things. In Fortran 90, global variables are typically encapsulated in modules. Access is granted only to routines with an appropriate USE statement, and can be restricted to specific identifiers by the ONLY option. [M&R, §7.10] In addition, variable and routine names within the module can be designated as PUBLIC or PRIVATE (see, e.g., quad3d on p. 1065). [M&R, §7.6] The other way global variables get communicated is by having one routine CONTAINed within another. [M&R, §5.6] This usage is potentially lethal, however, because all the outer routine’s variables are visible to the inner routine. You can try to control the problem somewhat by passing some variables back and forth as arguments of the inner routine, but that still doesn’t prevent inadvertent side effects. (The most common, and most stupid, is inadvertent reuse of variables named i or j in the CONTAINed routine.) Also, a long list of arguments reduces the convenience of using an internal routine in the first place. We advise that internal subprograms be used with caution, and only to carry out simple tasks. There are some good ways to use CONTAINS, however. Several of our recipes have the following structure: A principal routine is invoked with several arguments. It calls a subsidiary routine, which needs to know some of the principal routine’s or CDROMs
arguments, some global variables, and some values communicated directly as g of machine- arguments to the subsidiary routine. In Fortran 77, we have usually coded this by America). passing the global variables in a COMMON block and all other variables as arguments to the subsidiary routine. If necessary, we copied the arguments of the primary routine before passing them to the subsidiary routine. In Fortran 90, there is a more elegant way of accomplishing this, as follows: SUBROUTINE recipe(arg) REAL(SP) :: arg REAL(SP) :: global_var call recipe_private CONTAINS 958 Chapter 21. Introduction to Fortran 90 Language Features
SUBROUTINE recipe_private ... call subsidiary(local_arg) ... END SUBROUTINE recipe_private SUBROUTINE subsidiary(local_arg) ... END SUBROUTINE subsidiary or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) END SUBROUTINE recipe Notice that the principal routine (recipe) has practically nothing in it — only dec- larations of variables intended to be visible to the subsidiary routine (subsidiary). All the real work of recipe is done in recipe private. This latter routine has visibility on all of recipe’s variables, while any additional variables that recipe private defines are not visible to subsidiary — which is the whole purpose of this way of organizing things. Obviously arg and global var can be much more general data types than the example shown here, including function names. For examples of this construction, see amoeba (p. 1208), amebsa (p. 1222), mrqmin (p. 1292), and medfit (p. 1294).
Recursion
A subprogram is recursive if it calls itself. While forbidden in Fortran 77, recursion is allowed in Fortran 90. [M&R, §5.16Ч5.17] You must supply the keyword RECURSIVE in front of the FUNCTION or SUBROUTINEkeyword. In addition, if a FUNCTION calls itself directly, as opposed to calling another subprogram that in turn calls it, you must supply a variable to hold the result with the RESULT keyword. Typical syntax for this case is:
RECURSIVE FUNCTION f(x) RESULT(g) REAL(SP) :: x,g if ... g=... else g=f(...) end if END FUNCTION f When a function calls itself directly, as in this example, there always has to be a “base case” that does not call the function; otherwise the recursion never terminates. We have indicated this schematically with the if...else...end if structure. On serial machines we tend to avoid recursive implementations because of the additional overhead they incur at execution time. Occasionally there are algorithms or CDROMs for which the recursion overhead is relatively small, and the recursive implementation g of machine- is simpler than an iterative version. Examples in this book are quad 3d (p. 1065), America). miser (p. 1164), and mglin (p. 1334). Recursion is much more important when parallelization is the goal. We will encounter in Chapter 22 numerous examples of algorithms that can be parallelized with recursion.
SAVE Usage Style
A quirk of Fortran 90 is that any variable with initial values acquires the SAVE attribute automatically. [M&R, §7.5 and §7.9] As a help to understanding 21.6 And Coming Soon: Fortran 95 959
an algorithm, we have elected to put an explicit SAVE on all variables that really do need to retain their values between calls to a routine. We do this even if it is redundant because the variables are initialized. Note that we generally prefer to assign initial values with initialization expressions rather than with DATA statements. We reserve DATA statements for cases where it is convenient to use the repeat count
feature to set multiple occurrences of a value, or when binary, octal, or hexadecimal or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) constants are used. [M&R, §2.6.1]
Named Control Structures
Fortran 90 allows control structures such as do loops and if blocks to be named. [M&R, §4.3Ч4.5] Typical syntax is
name:do i=1,n ... end do name One use of naming control structures is to improve readability of the code, especially when there are many levels of nested loops and if blocks. A more important use is to allow exit and cycle statements, which normally refer to the innermost do loop in which they are contained, to transfer execution to the end of some outer loop. This is effected by adding the name of the outer loop to the statement: exit name or cycle name. There is great potential for misuse with named control structures, since they share some features of the much-maligned goto. We recommend that you use them sparingly. For a good example of their use, contrast the Fortran 77 version of simplx with the Fortran 90 version on p. 1216.
CITED REFERENCES AND FURTHER READING: Metcalf, M., and Reid, J. 1996, Fortran 90/95 Explained (Oxford: Oxford University Press).
21.6 And Coming Soon: Fortran 95
One of the more positive effects of Fortran 90’s long gestation period has been the general recognition, both by the X3J3 committee and by the community at large, that Fortran needs to evolve over time. Indeed, as we write, the process of bringing
forth a minor, but by no means insignificant, updating of Fortran 90 — named or CDROMs
Fortran 95 — is well under way. g of machine- Fortran 95 will differ from Fortran 90 in about a dozen features, only a handful America). of which are of any importance to this book. Generally these are extensions that will make programming, especially parallel programming, easier. In this section we give a summary of the anticipated language changes. In §22.1 and §22.5 we will comment further on the implications of Fortran 95 to some parallel programming tasks; in §23.7 we comment on what differences Fortran 95 will make to our nrutil utility functions. No programs in Chapters B1 through B20 of this book edition use any Fortran 95 extensions. 960 Chapter 21. Introduction to Fortran 90 Language Features
FORALL Statements and Blocks
Fortran 95 introduces a new forall control structure, somewhat akin to the where construct, but allowing for greater flexibility. It is something like a do-loop, but with the proviso that the indices looped over are allowed to be done in any order
(ideally, in parallel). The forall construction comes in both single-statement and or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) block variants. Instead of using the do-loop’s comma-separated triplets of lower- value, upper-value, and increment, it borrows its syntax from the colon-separated form of array sections. Some examples will give you the idea. Here is a simple example that could alternatively be done with Fortran 90’s array sections and transpose intrinsic:
forall (i=1:20, j=1:10:2) x(i,j)=y(j,i) The block form allows more than one executable statement:
forall (i=1:20, j=1:10:2) x(i,j)=y(j,i) z(i,j)=y(i,j)**2 end forall Here is an example that cannot be done with Fortran 90 array sections:
forall (i=1:20, j=1:20) a(i,j)=3*i+j**2 forall statements can also take optional masks that restrict their action to a subset of the loop index combinations:
forall (i=1:100, j=1:100, (i>=j .and. x(i,j)/=0.0) ) x(i,j)=1.0/x(i,j) forall constructions can be nested, or nested inside where blocks, or have where constructions inside them. An additional new feature in Fortran 95 is that where blocks can themselves be nested.
PURE Procedures
Because the inside iteration of a forall block can be done in any order, or in parallel, there is a logical difficulty in allowing functions or subroutines inside such blocks: If the function or subroutine has side effects (that is, if it changes any data elsewhere in the machine, or in its own saved variables) then the result of a forall calculation could depend on the order in which the iterations happen to be done. This can’t be tolerated, of course; hence a new PURE attribute for subprograms.
While the exact stipulations are somewhat technical, the basic idea is that if you or CDROMs declare a function or subroutine as PURE, with a syntax like, g of machine- America). PURE FUNCTION myfunc(x,y,z) or PURE SUBROUTINE mysub(x,y,z) then you are guaranteeing to the compiler (and it will enforce) that the only values changed by mysubor myfuncare returned function values, subroutine arguments with the INTENT(OUT) attribute, and automatic (scratch) variables within the procedure. You can then use your pure procedures within forall constructions. Pure functions are also allowed in some specification statements. 21.6 And Coming Soon: Fortran 95 961
ELEMENTAL Procedures
Fortran 95 removes Fortran 90’s nagging restriction that only intrinsic functions are elemental. The way this works is that you write a pure procedure that operates on scalar values, but include the attribute ELEMENTAL (which automatically implies PURE). Then, as long as the function has an explicit interface in the referencing program, you can call it with any shape of argument, and it will act elementally. or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Here’s an example: ELEMENTAL FUNCTION myfunc(x,y,z) REAL :: x,y,z,myfunc ... myfunc = ... END In a program with an explicit interface for myfunc you could now have REAL, DIMENSION(10,20) :: x,y,z,w ... w=myfunc(x,y,z)
Pointer and Allocatable Improvements
Fortran 95, unlike Fortran 90, requires that any allocatable variables (except those with SAVE attributes) that are allocated within a subprogram be automatically deallocated by the compiler when the subprogram is exited. This will remove Fortran 90’s “undefined allocation status” bugaboo. Fortran 95 also provides a method for pointer variables to be born with disassociated association status, instead of the default (and often inconvenient) “undefined” status. The syntax is to add an initializing=> NULL() to the declaration, as: REAL, DIMENSION(:,:), POINTER :: mypoint => NULL() This does not, however, eliminate the possibility of undefined association status, because you have to remember to use the null initializer if want your pointer to be disassociated.
Some Other Fortran 95 Features
In Fortran 95, maxloc and minloc have the additional optional argument DIM, which causes them to act on all one-dimensional sections that span through the named dimension. This provides a means for getting the locations of the values or CDROMs returned by the corresponding functions maxval and minval in the case that their g of machine-
DIM argument is present. America). The sign intrinsic can now distinguish a negative from a positive real zero value: sign(2.0,-0.0) is −2.0. There is a new intrinsic subroutine cpu time(time)that returns as a real value time a process’s elapsed CPU time. There are some minor changes in the namelist facility, in defining minimum field widths for the I, B, O, Z, and F edit descriptors, and in resolving minor conflicts with some other standards. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) Copyright (C) 1986-1996 by Cambridge University Press. Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine- readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books, diskettes, or CDROMs visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to [email protected] (outside North America). 962 to Parallel Programming to program multiprocessor machines. Symmetric multiprocessor In recent years we Numerical Recipes authors have increasingly become Let’s review the past: Take a screwdriver and open up the computer (workstation Next ask, how many of these chips are CPUs? That is, how many implement Why do our desktop computers have dozens or hundreds of memory chips, but Stabilizing this equilibrium has been the fact that there has been no standard, or 22.0 Why Think Parallel? Chapter 22. Introduction convinced that aprogramming,” certain is revolution, aboutcommunity cryptically to of burst denoted supercomputer forth by users,all from and the its computing. become words gestation the “parallel mainstream and methodology adolescence for inor the PC) that sitsbe on sure your to desk. unplugwith it more (Don’t first!) than blame amemory us Count chips million if the have gates one this integrated or voids (transistors).in circuits four your coming million — gates, warranty; years. As just and and the this wesuch number bigger You’ll write, chips will ones, probably increase in in rapidly count 1995, your at even computer. least lowly dozens,von and Neumann often processors hundreds,For capable of most of computers,of executing in computers arbitrary, 1995, do stored the havefunctions. secondary answer program So, processors is: two code? that or offloadunder three about input-output is the and/or one. often user’s a video more direct accurate A control. answer, significant but only number most one often is only usually oneto (user-accessible) manufacture? CPU? Do CPU No.primarily chips because intrinsically Are cost fixed CPU more developmentsmaller chips and number more design of expensive costs units thanCPU’s must are sold. memory relatively be chips? expensive distributed We becauseonly over have there Yes, one is a been per only in one computer, per a because computer; kind they and of are there is economic relativelywidely expensive. taught, equilibrium: methodology for parallel programming.scientific Except computing for the on special supercomputers case (where of or large geometric problems often character), have it areally regular knows is how not too much of an exaggeration to say that nobody 22.0 Why Think Parallel? 963
operating systems, for example, have been very slow in developing; and efficient, parallel methodologies for query-serving on large databases are even now a subject of continuing research. However, things are now changing. We consider it an easy prognostication that, by the first years of the new century, the typical desktop computer will have 4 to 8
user-accessible CPUs; ten years after that, the typical number will be between 16 and or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) 512. It is not coincidence that these numbers are characteristic of supercomputers (including some quite different architectures) in 1995. The rough rule of ten years’ lag from supercomputer to desktop has held firm for quite some time now. Scientists and engineers have the advantage that techniques for parallel com- putation in their disciplines have already been developed. With multiprocessor workstations right around the corner, we think that now is the right time for scientists and engineers who use computers to start thinking parallel. We don’t mean that you should put an axe through the screen of your fast serial (single-CPU) workstation. We do mean, however, that you should start programming somewhat differently on that workstation, indeed, start thinking a bit differently about the way that you approach numerical problems in general. In this volume of Numerical Recipes in Fortran, our pedagogical goal is to show you that there are conceptual and practical benefits in parallel thinking, even if you are using a serial machine today. These benefits include conciseness and clarity of code, reusability of code in wider contexts, and (not insignificantly) increased portability of code to today’s parallel supercomputers. Of course, on parallel machines, either supercomputers today or desktop machines tomorrow, the benefits of thinking parallel are much more tangible: They translate into significant improvements in efficiency and computational capability.
Thinking Parallel with Fortran 90
Until very recently, a strong inhibition to thinking parallel was the lack of any standard, architecture-independent, computer language in which to think. That has changed with the finalization of the Fortran 90 language standard, and with the availability of good, optimizing Fortran 90 compilers on a variety of platforms. There is a significant body of opinion (with which we, however, disagree) that there is no such thing as architecture-independent parallel programming. Proponents of this view, who are generally committed wizards at programming on one or another particular architecture, point to the fact that algorithms that are optimized to one architecture can run hundreds of times more slowly on other architectures. And,
they are correct! or CDROMs
Our opposing point of view is one of pragmatism. We think that it is not g of machine- hard to learn, in a general way, what kinds of architectures are in general use, and America). what kinds of parallel constructions work well (or poorly) on each kind. With this knowledge (much of which we hope to develop in this book) the user can, we think, write good, general-purpose parallel code that works on a variety of architectures — including, importantly, on purely serial machines. Equally important, the user will be aware of when certain parts of a code can be significantly improved on some, but not other, architectures. Fortran 90 is a good test-bench for this point of view. It is not the perfect language for parallel programming. But it is a language, and it is the only 964 Chapter 22. Introduction to Parallel Programming
cross-platform standard language now available. The committee that developed the language between 1978 and 1991 (known technically as X3J3) had strong representation from both a traditional “vectorization” viewpoint (e.g., from the Cray XMP and YMP series of computers), and also from the “data parallel” or “SIMD” viewpoints of parallel machines like the CM-2 and CM-5 from Thinking Machines,
Inc. Language compromises were made, and a few (in our view) almost essential or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) features were left out (see §22.5). But, by and large, the necessary tools are there: If you learn to think parallel in Fortran 90, you will easily be able to transfer the skill to future parallel standards, whether they are Fortran-based, C-based, or other.
CITED REFERENCES AND FURTHER READING: Metcalf, M., and Reid, J. 1996, Fortran 90/95 Explained (Oxford: Oxford University Press).
22.1 Fortran 90 Data Parallelism: Arrays and Intrinsics
The underlying model for parallel computation in Fortran 90 is data parallelism, implemented by the use of arrays of data, and by the provision of operations and intrinsic functions that act on those arrays in parallel, in a manner optimized by the compiler for each particular hardware architecture. We will not try to draw a fine definitional distinction between “data parallelism” and so-called SIMD (single instruction multiple data) programming. For our purposes the two terms mean about the same thing: The programmer writes a single operation, “+” say, and the compiler causes it to be carried out on multiple pieces of data in as parallel a manner as the underlying hardware allows. Any kind of parallel computing that is not SIMD is generally called MIMD (multiple instruction multiple data). A parallel programming language with MIMD features might allow, for example, several different subroutines — acting on different parts of the data — to be called into execution simultaneously. Fortran 90 has few, if any, MIMD constructions. A Fortran 90 compiler might, on some machines, execute MIMD code in implementing some Fortran 90 intrinsic functions (pack or unpack, e.g.), but this will be hidden from the Fortran 90 user. Some extensions of Fortran 90, like HPF, do implement MIMD features explicitly; but we will not consider these
in this book. Fortran 95’s forall and PURE extensions (see §21.6) will allow some or CDROMs significantly greater access to MIMD features (see §22.5). g of machine- America).
Array Parallel Operations
We have already met the most basic, and most important, parallel facility of Fortran 90, namely, the ability to use whole arrays in expressions and assignments, with the indicated operations being effected in parallel across the array. Suppose, for example, we have the two-dimensional matrices a, b, and c,
REAL, DIMENSION(30,30) :: a,b,c 22.1 Fortran 90 Data Parallelism: Arrays and Intrinsics 965
Then, instead of the serial construction,
do j=1,30 do k=1,30 c(j,k)=a(j,k)+b(j,k) end do end do
which is of course perfectly valid Fortran 90 code, we can simply write or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) c=a+b The compiler deduces from the declaration statement that a, b, and c are matrices, and what their bounding dimensions are. Let us dwell for a moment on the conceptual differences between the serial code and parallel code for the above matrix addition. Although one is perhaps used to seeing the nested do-loops as simply an idiom for “do-the-enclosed-on-all- components,” it in fact, according to the rules of Fortran, specifies a very particular time-ordering for the desired operations. The matrix elements are added by rows, in order (j=1,30), and within each row, by columns, in order (k=1,30). In fact, the serial code above overspecifies the desired task, since it is guaranteed by the laws of mathematics that the order in which the element operations are done is of no possible relevance. Over the 50 year lifetime of serial von Neuman computers, we programmers have been brainwashed to break up all problems into single executable streams in the time dimension only. Indeed, the major design problem for supercomputer compilers for the last 20 years has been to undo such serial constructions and recover the underlying “parallel thoughts,” for execution in vector or parallel processors. Now, rather than taking this expensive detour into and out of serial-land, we are asked simply to say what we mean in the first place, c=a+b. The essence of parallel programming is not to force “into the time dimen- sion” (i.e., to serialize) operations that naturally extend across a span of data, that is, “in the space dimension.” If it were not for 50-year-old collective habits, and the languages designed to support them, parallel programming would probably strike us as more natural than its serial counterpart.
Broadcasts and Dimensional Expansion: SSP vs. MMP
We have previously mentioned the Fortran 90 rule that a scalar variable is conformable with any shape array. Thus, we can implement a calculation such as
yi = xi + s, i =1,...,n (22.1.1) or CDROMs
with code like g of machine-
y=x+s America). where we of course assume previous declarations like REAL(SP) :: s REAL(SP), DIMENSION(n) :: x,y with n a compile-time constant or dummy argument. (Hereafter, we will omit the declarations in examples that are this simple.) This seemingly simple construction actually hides an important underlying parallel capability, namely, that of broadcast. The sums in y=x+s are done in parallel 966 Chapter 22. Introduction to Parallel Programming
on different CPUs, each CPU accessing different components of x and y. Yet, they all must access the same scalar value s. If the hardware has local memory for each CPU, the value of s must be replicated and transferred to each CPU’s local memory. On the other hand, if the hardware implements a single, global memory space, it is vital to do something that mitigates the traffic jam potentially caused by all the CPUs
trying to access the same memory location at the same time. (We will use the term or send email to [email protected] (outside North visit website http://www.nr.com or call 1-800-872-7423 (North America only), diskettes, computer, is strictly prohibited. To order Numerical Recipes books, readable files (including this one) to any server Permission is granted for internet users to make one paper copy their own personal use. Further reproduction, or any copyin Programs Copyright (C) 1986-1996 by Numerical Recipes Software. Copyright (C) 1986-1996 by Cambridge University Press. Sample page from NUMERICAL RECIPES IN FORTRAN 90: The Art of PARALLEL Scientific Computing (ISBN 0-521-57439-0) “broadcast” to refer equally to both cases.) Although hidden from the user, Fortran 90’s ability to do broadcasts is an essential feature of it as a parallel language. Broadcasts can be more complicated than the above simple example. Consider, for example, the calculation