THEME ARTICLE: THE TOP 10 ALGORITHMS

THE FORTRAN I COMPILER

The Fortran I compiler was the first demonstration that it is possible to automatically generate efficient machine code from high-level languages. It has thus been enormously influential. This article presents a brief description of the techniques used in the Fortran I compiler for the parsing of expressions, loop optimization, and register allocation.

DAVID PADUA
University of Illinois at Urbana-Champaign

During initial conversations about the topic of this article, it became evident that we can't identify the top compiler algorithm of the century if, as the CiSE editors originally intended, we consider only the parsing, analysis, and code optimization algorithms found in undergraduate compiler textbooks and the research literature. Making such a selection seemed, at first, the natural thing to do, because fundamental compiler algorithms belong to the same class as the other algorithms discussed in this special issue. In fact, fundamental compiler algorithms, like the other algorithms in this issue, are often amenable to formal descriptions and, as a result, to mathematical treatment. However, in the case of compilers, the difficulty is that, paraphrasing John Donne, no algorithm is an island, entire of itself. A compiler's components are designed to work together to complement each other. Furthermore, next to this conceptual objection, there is the very practical issue that we don't have enough information to decide whether any of the fundamental compiler algorithms has had a determinant impact on the quality of compilers.

At the same time, it is almost universally agreed that the most important event of the 20th century in compiling, and in computing, was the development of the first Fortran compiler between 1954 and 1957. By demonstrating that it is possible to automatically generate quality machine code from high-level descriptions, the IBM team led by John Backus opened the door to the Information Age.

The impressive advances in scientific computing, and in computing in general, during the past half century would not have been possible without high-level languages. Although the word algorithm is not usually used in that sense, from the definition it follows that a compiler is an algorithm and, therefore, we can safely say that the Fortran I translator is the 20th century's top compiler algorithm.

The language

The IBM team not only developed the compiler but also designed the Fortran language, and today, almost 50 years later, Fortran is still the language of choice for scientific programming. The language has evolved, but there is a clear family resemblance between Fortran I and today's Fortran 77, 90, and 95.

Fortran's influence is also evident in the most popular languages today, including numerically oriented languages such as Matlab as well as general-purpose languages such as C++ and Java.

Ironically, Fortran has been the target of criticism almost from the beginning, and even Backus voiced serious objections: "'von Neumann languages' [like Fortran] create enormous, unnecessary intellectual roadblocks in thinking about programs and in creating the higher-level combining forms required in a powerful programming methodology."1

Clearly, some language features, such as implicit typing, were not the best possible choices, but Fortran's simple, direct design enabled the development of very effective compilers. Fortran I was the first of a long line of very good Fortran compilers that IBM and other companies developed. These powerful compilers are perhaps the single most important reason for Fortran's success.

The compiler

The Fortran I compiler was fairly small by today's standards. It consisted of 23,500 assembly language instructions and required 18 person-years to develop. Modern commercial compilers might contain 100 times more instructions and require many more person-years to develop. However, its size notwithstanding, the compiler was a very sophisticated and complex program. It performed many important optimizations, some quite elaborate even by today's standards, and it "produced code of such efficiency that its output would startle the programmers who studied it."1

However, as expected, the success was not universal.2 The compiler seemingly generated very good code for regular computations; however, irregular computations, including sparse and symbolic computations, are generally more difficult to analyze and transform. Based on my understanding of the techniques used in the Fortran I compiler, I believe that it did not do as well on these types of computations.
A manifestation of the difficulties with irregular computations is that subscripted subscripts, such as A(M(I,J),N(I,J)), were not allowed in Fortran I.

The compiler's sophistication was driven by the need to produce efficient object code. The project would not have succeeded otherwise. According to Backus:

It was our belief that if Fortran, during its first months, were to translate any reasonable scientific source program into an object program only half as fast as its hand-coded counterpart, the acceptance of our system would be in serious danger.1

The flip side of using novel and sophisticated compiler algorithms was implementation and debugging complexity. Late delivery and many bugs created more than a few Fortran skeptics, but Fortran eventually prevailed:

It gradually got to the point where a program in Fortran had a reasonable expectancy of compiling all the way through and maybe even of running. This gradual change in status from an experimental to a working system was true of most compilers. It is stressed here in the case of Fortran only because Fortran is now almost taken for granted, as if it were built into the hardware.2

Optimization techniques

The Fortran I compiler was the first major project in code optimization. It tackled problems of crucial importance whose general solution was an important research focus in compiler technology for several decades. Many classical techniques for compiler analysis and optimization can trace their origins and inspiration to the Fortran I compiler. In addition, some of the terminology the Fortran I implementers used almost 50 years ago is still in use today. Two of the terms today's compiler writers share with the 1950s IBM team are basic block ("a stretch of program which has a single entry point and a single exit point"3) and symbolic/real registers. Symbolic registers are variable names the compiler uses in an intermediate form of the code to be generated. The compiler eventually replaces symbolic registers with real ones that represent the target machine's registers.

Although more general and perhaps more powerful methods have long since replaced those used in the Fortran I compiler, it is important to discuss Fortran I methods to show their ingenuity and to contrast them with today's techniques.

Parsing expressions

One of the difficulties the designers faced was how to compile arithmetic expressions taking into account the precedence of operators. That is, in the absence of parentheses, exponentiations should be evaluated first, then products and divisions, followed by additions and subtractions. Operator precedence was needed to avoid extensive use of parentheses and the problems associated with them. For example, IT, an experimental compiler completed by A. Perlis and J.W. Smith in 1956 at the Carnegie Institute of Technology,4 did not assume operator precedence.

As Knuth pointed out: "The lack of operator priority (often called precedence or hierarchy) in the IT language was the most frequent single cause of errors by the users of that compiler."5

The Fortran I compiler would expand each operator into a sequence of parentheses. In a simplified form of the algorithm, it would

• replace + and - with ))+(( and ))-((, respectively;
• replace * and / with )*( and )/(, respectively;
• add (( at the beginning of each expression and after each left parenthesis in the original expression; and
• add )) at the end of the expression and before each right parenthesis in the original expression.

Although not obvious, the algorithm was correct, and, in the words of Knuth, "The resulting formula is properly parenthesized, believe it or not."5 For example, the expression A + B * C was expanded as ((A))+((B)*(C)). The translation algorithm then scanned the resulting expression from left to right and inserted a temporary variable for each left parenthesis. Thus, it translated the previous expression as follows:

u1=u2+u4; u2=u3; u3=A; u4=u5*u6; u5=B; u6=C.

Here, variable ui (2 ≤ i ≤ 6) is generated when the (i-1)th left parenthesis is processed. Variable u1 is generated at the beginning to contain the expression's value. These assignment statements, when executed from right to left, will evaluate the original expression according to the operator precedence rules. A subsequent optimization eliminates redundant temporaries. This optimization reduces the code to only two instructions:

u1=A+u4; u4=B*C.

Here, variables A, B, and C are propagated to where they are needed, eliminating the instructions rendered useless by this propagation. In today's terminology, this optimization was equivalent to applying, at the expression level, copy propagation followed by dead-code elimination.6 Given an assignment x = a, copy propagation substitutes a for occurrences of x whenever it can determine it is safe to do so. Dead-code elimination deletes statements that do not affect the program's output. Notice that if a is propagated to all uses of x, x = a can be deleted.

The Fortran I compiler also identified permutations of operations, which reduced memory accesses and eliminated redundant computations resulting from common subexpressions.3
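To see the whole scheme in action, here is a minimal Python sketch of the simplified algorithm just described, together with the temporary-variable translation and the cleanup pass. The function names and single-letter tokens are my own illustrative choices, and the cleanup is the expression-level copy propagation and dead-code elimination discussed above, not the compiler's actual code.

def expand(expr):
    # Step 1: replace each operator with a parenthesis sequence and
    # wrap the expression (and any original parentheses) with (( )).
    out = ["(("]
    for ch in expr:
        if ch in "+-":
            out.append("))" + ch + "((")
        elif ch in "*/":
            out.append(")" + ch + "(")
        elif ch == "(":
            out.append("(((")            # original ( followed by the extra ((
        elif ch == ")":
            out.append(")))")            # the extra )) before the original )
        else:
            out.append(ch)               # an operand (a single letter here)
    out.append("))")
    return "".join(out)

def translate(expanded):
    # Step 2: scan left to right, generating temporary u(i) for the
    # (i-1)th left parenthesis; u1 holds the expression's value.
    counter, rhs = 1, {"u1": []}
    stack, current = [], "u1"
    for ch in expanded:
        if ch == "(":
            counter += 1
            temp = "u%d" % counter
            rhs[current].append(temp)    # the new temporary feeds its parent
            rhs[temp] = []
            stack.append(current)
            current = temp
        elif ch == ")":
            current = stack.pop()
        else:
            rhs[current].append(ch)      # operand or operator
    return {t: "".join(v) for t, v in rhs.items()}

def simplify(assigns):
    # Step 3: copy propagation plus dead-code elimination. An assignment
    # whose right-hand side contains no operator is a plain copy:
    # substitute it everywhere, then delete it. (This string-based
    # substitution assumes temporary names do not collide.)
    for temp, value in list(assigns.items()):
        if temp != "u1" and not any(op in value for op in "+-*/"):
            del assigns[temp]
            assigns = {t: v.replace(temp, value) for t, v in assigns.items()}
    return assigns

print(expand("A+B*C"))                      # ((A))+((B)*(C))
print(translate(expand("A+B*C")))           # u1=u2+u4, u2=u3, ..., u6=C
print(simplify(translate(expand("A+B*C")))) # u1=A+u4, u4=B*C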
It is interesting to contrast the parsing algorithm of Fortran I with more advanced parsing algorithms developed later on. These algorithms, which are much easier to understand, are based on a syntactic representation of expressions such as:7

expression = term [ [ + | - ] term ]...
term = factor [ [ * | / ] factor ]...
factor = constant | variable | ( expression )

Here, a factor is a constant, variable, or expression enclosed by parentheses. A term is a factor possibly followed by a sequence of factors separated by * or /, and an expression is a term possibly followed by a sequence of terms separated by + or -. The precedence of operators is implicit in the notation: terms (sequences of products and divisions) must be formed before expressions (sequences of additions and subtractions). When represented in this manner, it is easy to build a recursive descent parser with a routine associated with each type of object, such as term or factor. For example, the routine associated with term will be something like

procedure term(){
  call factor()
  while token is * or / {
    get next token
    call factor()
  }
}

Multiplication and division instructions could be generated inside the while loop (and addition or subtraction in a similar routine written to represent expressions) without redundancy, thus avoiding the need for copy-propagation or dead-code-elimination optimization within an expression.
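For concreteness, the pseudocode above can be rendered as a small, runnable recursive descent parser. The Python below is only a sketch of this later technique; the single-character tokens and the tuple representation of the parse are illustrative assumptions, not part of any particular compiler.

class Parser:
    # One routine per kind of object (expression, term, factor);
    # precedence is implicit: a term is completed before an expression.
    def __init__(self, text):
        self.tokens = list(text.replace(" ", ""))
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expression(self):                 # term [ [+|-] term ]...
        node = self.term()
        while self.peek() in ("+", "-"):
            node = (self.advance(), node, self.term())
        return node

    def term(self):                       # factor [ [*|/] factor ]...
        node = self.factor()
        while self.peek() in ("*", "/"):
            node = (self.advance(), node, self.factor())
        return node

    def factor(self):                     # constant | variable | (expression)
        tok = self.advance()
        if tok == "(":
            node = self.expression()
            assert self.advance() == ")"  # consume the closing parenthesis
            return node
        return tok

print(Parser("A+B*C").expression())       # ('+', 'A', ('*', 'B', 'C'))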

DO loop optimizations and subscript computations

One of the Fortran I compiler's main objectives was "to analyze the entire structure of the program in order to generate optimal code from DO statements and references to subscripted variables."1

For example, the address of the Fortran array element A(I,J,c3*K+6) could take the form

base_A + I-1 + (J-1)*di + (c3*K+6-1)*di*dj

where di and dj are the lengths of the first two dimensions of A, and these two values as well as the coefficient c3 are assumed to be constant. Clearly, address expressions such as this can slow down a program if not computed efficiently.

It is easy to see that there are constant subexpressions in the address expression that can be incorporated in the address of the instruction that makes the reference.3 Thus, an instruction making reference to the previous array element could incorporate the constant base_A+(6-1)*di*dj-di-1. It is also important to evaluate the variant part of the expression as efficiently as possible. The Fortran I compiler used a pattern-based approach to achieve this goal. For the previous expression, every time "K is increased by n (under control of a DO), the index quantity is increased by c3didjn, giving the correct new value."3
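To make this concrete, here is a small Python sketch of the transformation; the numeric values of base_A, di, dj, c3, I, and J are made up for the demonstration, and the code is a reconstruction of the idea rather than the compiler's actual pattern matcher. The constant part of the address is folded once, and the variant part is updated by a single addition each time K steps.

base_A, di, dj, c3 = 1000, 10, 20, 4      # made-up constants for the demo
I, J = 3, 5                               # subscripts held fixed in this loop

def address_naive(K):
    # Direct evaluation: two multiplications on every access.
    return base_A + (I - 1) + (J - 1)*di + (c3*K + 6 - 1)*di*dj

# The constant subexpression folded into the instruction, as in the text:
const = base_A + (6 - 1)*di*dj - di - 1

# Strength-reduced loop: the index quantity starts at its value for K = 1
# and is increased by c3*di*dj*n whenever K is increased by n (here n = 1).
step = c3*di*dj
addr = const + I + J*di + step*1          # address for K = 1
for K in range(1, 6):                     # DO K = 1, 5
    assert addr == address_naive(K)       # same address, no multiplications
    addr += step                          # K increased by 1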
Today's compilers apply removal of loop invariants, induction-variable detection, and strength reduction to accomplish similar results.6,8 The idea of induction-variable detection is to identify those variables within a loop that assume a sequence of values forming an arithmetic sequence. After identifying these induction variables, strength reduction replaces multiplications of induction-variable and loop-invariant values with additions. The Fortran I compiler applied, instead, a single transformation that simultaneously moved subexpressions to the outermost possible level and applied strength reduction. A limitation of the Fortran I compiler, with respect to modern methods, was that it only recognized loop indices as induction variables:

It was decided that it was not practical to track down and identify linear changes in subscripts resulting from assignment statements. Thus, the sole criterion for linear changes, and hence for efficient handling of array references, was to be that the subscripts involved were being controlled by DO statements.1

Register allocation

The IBM 704, the Fortran I compiler's target machine, had three index registers. The compiler applied register allocation strategies to reduce the number of load instructions needed to bring values to the index registers. The compiler section that Sheldon Best designed, which performed index-register allocation, was extremely complex and probably had the greatest influence on later compilers. Indeed, seven years after Fortran I was delivered, Saul Rosen wrote:

Part of the index register optimization fell into disuse quite early, but much of it was carried along into Fortran II and is still in use on the 704/9/90. In many programs it still contributes to the production of better code than can be achieved on the new Fortran IV compiler.2

The register allocation section was preceded by another section whose objective was to create what today is called a control-flow graph. The nodes of this graph are basic blocks, and its arcs represent the flow of execution. Absolute execution frequencies were computed for each basic block using a Monte Carlo method and the information provided by Frequency statements. Fortran I programmers had to insert the Frequency statements in the source code to specify the branching probability of IF statements, computed GOTO statements, and average iteration counts for DO statements that had variable limits.9

Compilers have used Frequency information for register allocation and for other purposes. However, modern compilers do not rely on programmers to insert information about frequency in the source code. Modern register allocation algorithms usually estimate execution frequency using syntactic information such as the level of nesting. When compilers used actual branching frequencies, as was the case with the Multiflow compiler,10 they obtained the information from actual executions of the source program.
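The following Python sketch illustrates the flavor of this frequency estimation. The control-flow graph, the branch probabilities (standing in for what a Frequency statement would supply), and all names are invented for the example; this is not the Fortran I implementation.

import random
from collections import Counter

# Toy control-flow graph: each basic block lists (successor, probability)
# pairs, standing in for programmer-supplied branching probabilities.
cfg = {
    "entry": [("loop", 1.0)],
    "loop":  [("then", 0.7), ("else", 0.3)],   # an IF inside the loop
    "then":  [("latch", 1.0)],
    "else":  [("latch", 1.0)],
    "latch": [("loop", 0.9), ("exit", 0.1)],   # about ten iterations
    "exit":  [],
}

def simulate(max_steps=1000):
    # Trace one random walk through the graph, counting block executions;
    # the walk is cut off after a fixed number of steps, as in the text.
    counts, block = Counter(), "entry"
    for _ in range(max_steps):
        counts[block] += 1
        if not cfg[block]:
            break
        r, acc = random.random(), 0.0
        for succ, p in cfg[block]:
            acc += p
            if r < acc:
                break
        block = succ    # the last successor serves as a rounding fallback
    return counts

# Repeat several times "in an effort to secure reliable frequency statistics".
runs = [simulate() for _ in range(100)]
print({b: sum(r[b] for r in runs) / len(runs) for b in cfg})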

Although the Monte Carlo algorithm delivered the necessary results, not everybody liked the strategy:

The possibility of solving the simultaneous equations determining path frequency in terms of transition frequency using known methods for solving sparse matrix equations was considered, but no methods which would work in the presence of DO-loops and assigned GOTO statements [were] hit upon, although IF-type branches alone could have been handled without explicit interpretation. The frequency estimating simulation traced the flow of control in the program through a fixed number of steps, and was repeated several times in an effort to secure reliable frequency statistics. Altogether an odd method!9

With the estimated value of execution frequency at hand, the compiler proceeded to create connected regions, similar to the traces used many years later in the Multiflow compiler. Regions were created iteratively. In each iteration, the basic blocks of the control-flow graph were scanned to find the one with the highest absolute execution frequency. Then, working backwards and forward, a chain was formed by following the branches with the highest probability of execution as specified in the Frequency statements. Then, registers were allocated in the new region

… by simulating the action of the program. Three cells are set aside to represent the object machine index registers. As each new tagged instruction is encountered, these cells are examined to see if one of them contains the required tag; if not, the program is searched ahead to determine which of the index registers is the least undesirable to replace.3
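A schematic Python rendering of this look-ahead search appears below. It keeps three cells for the 704's index registers and, on a miss, evicts the register whose tag is next needed farthest in the future. The function and tag names are mine, and this is a reconstruction of the rule the quotation describes, not Best's code.

def allocate(tags, num_registers=3):
    # tags: the sequence of symbolic index registers that successive
    # tagged instructions require. Returns the number of loads issued.
    registers, loads = [], 0
    for i, tag in enumerate(tags):
        if tag in registers:              # a cell already holds this tag
            continue
        loads += 1                        # must load the tag into a register
        if len(registers) < num_registers:
            registers.append(tag)
            continue
        # Search ahead for the least undesirable register to replace:
        # the one whose tag is reused most remotely, or never again.
        def next_use(r):
            future = tags[i + 1:]
            return future.index(r) if r in future else float("inf")
        victim = max(registers, key=next_use)
        registers[registers.index(victim)] = tag
    return loads

# Symbolic index registers i1..i4 required by successive instructions:
print(allocate(["i1", "i2", "i3", "i1", "i4", "i1", "i2"]))   # -> 4 loads
# To mimic the loop treatment described below (a loop concatenated with a
# copy of itself so that look-ahead sees the next iteration), one can run
# the same procedure on a body concatenated with itself.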

The new regions could connect with old regions and subsume them into larger regions.

In processing a new path connecting two previously disconnected regions, register usage was matched by permuting all the register designations of one region to match those of the other as necessary.9

The process of dealing with loops was somewhat involved:

In processing a new path linking a block to itself and thus defining a loop, the loop was first considered to be concatenated with a second copy of itself, and straight-line register allocation carried out in normal fashion through the first of the two copies, with look-ahead extending into the second copy. … Straight-line allocation was carried out for a second loop copy in essentially normal fashion.9

The only difference was that the look-ahead procedure employed during this allocation was a version of the original look-ahead procedure modified to account for register reuse across loop iterations. Finally, "the allocation produced for the second loop was that ultimately used in generating machine code."9

The "least undesirable" register the look-ahead procedure identified was one whose value was dead or, if all registers were live, the one reused most remotely within the region. This strategy is the same as that proved optimal by Laszlo A. Belady in a 1965 paper on page replacement strategies.11 Belady's objective was to minimize the number of page faults; as a result, the algorithm is optimal "as long as one is concerned only with minimizing the number of loads of symbolic indexes into actual registers and not with minimizing the stores of modified indexes."9

The goal, of course, was not to prove or even achieve optimality of the register allocation algorithm. In fact,

[i]n order to simplify the index register allocation, it was implicitly assumed that calculations were not to be reordered. The contrary assumption would have introduced a new order of difficulty into the allocation process, and required the abstraction of additional information from the program to be processed.9

This assumption meant that the result is not always optimal because, in some cases, "… there is much advantage to be had by reordering computations."7 Nevertheless, "… empirically, Best's 1955–1956 procedure appeared to be optimal."1

During the last decade, the relative importance of traditional programming languages as the means to interact with computers has rapidly declined. The availability of powerful interactive applications has made it possible for many people to use computers without needing to write a single line of code. Although traditional programming languages and their compilers are still necessary to implement these applications, this is bound to change. I do not believe that 100 years hence computers will still be programmed the same way they are today. New applications-development technology will supersede our current strategies that are based on conventional languages. Then, the era of compilers that Fortran I initiated will come to an end.

Technological achievements are usually of interest for a limited time only. New techniques or devices rapidly replace old ones in an endless cycle of progress. All the techniques used in the Fortran I compiler have been replaced by more general and effective methods. However, Fortran I remains an extraordinary achievement that will forever continue to impress and inspire.

Acknowledgments
This work was supported in part by US Army contract N66001-97-C-8532; NSF contract ACI98-70687; and Army contract DABT63-98-1-0004. This work is not necessarily representative of the positions or policies of the Army or Government.

References
1. J. Backus, "The History of Fortran I, II, and III," IEEE Annals of the History of Computing, Vol. 20, No. 4, 1998.
2. S. Rosen, "Programming Systems and Languages—A Historical Survey," Proc. Eastern Joint Computer Conf., Vol. 25, 1964, pp. 1–15; reprinted in Programming Systems and Languages, S. Rosen, ed., McGraw-Hill, New York, 1967, pp. 3–22.
3. J.W. Backus et al., "The Fortran Automatic Coding System," Proc. Western Joint Computer Conf., Vol. 11, 1957, pp. 188–198; reprinted in Programming Systems and Languages, S. Rosen, ed., McGraw-Hill, New York, 1967, pp. 29–47.
4. D.E. Knuth and L.T. Pardo, "Early Development of Programming Languages," Encyclopedia of Computer Science and Technology, Vol. 7, Marcel Dekker, New York, 1977, p. 419.
5. D.E. Knuth, "A History of Writing Compilers," 1962; reprinted in Compiler Techniques, B.W. Pollack, ed., Auerbach Publishers, Princeton, N.J., 1972, pp. 38–59.
6. A.V. Aho, R. Sethi, and J.D. Ullman, Compilers: Principles, Techniques, and Tools, Addison-Wesley, Reading, Mass., 1988.
7. W.M. McKeeman, "Compiler Construction," Compiler Construction: An Advanced Course, F.L. Bauer and J. Eickel, eds., Lecture Notes in Computer Science, Vol. 21, Springer-Verlag, Berlin, 1976.
8. S.S. Muchnick, Advanced Compiler Design and Implementation, Morgan Kaufmann, San Francisco, 1997.
9. J. Cocke and J.T. Schwartz, Programming Languages and Their Compilers, Courant Inst. of Mathematical Sciences, New York Univ., New York, 1970.
10. J.A. Fisher, "Trace Scheduling: A Technique for Global Microcode Compaction," IEEE Trans. Computers, Vol. C-30, No. 7, July 1981, pp. 478–490.
11. L.A. Belady, "A Study of Replacement Algorithms for a Virtual-Storage Computer," IBM Systems J., Vol. 5, No. 2, 1966, pp. 78–101.

David Padua is a professor of computer science at the University of Illinois, Urbana-Champaign. His interests are in compiler technology, especially for parallel computers, and machine organization. He is a member of the ACM and a fellow of the IEEE. Contact him at the Dept. of Computer Science, 3318 Digital Computer Laboratory, 1304 W. Springfield Ave., Urbana, IL 61801; [email protected]; polaris.cs.uiuc.edu/~padua.
