Tony Brooker and the Atlas Compiler Compiler
Total Page:16
File Type:pdf, Size:1020Kb
Tony Brooker and the Atlas Compiler Compiler. Simon Lavington and many others. [email protected] February 2014, and as revised April 2016. Introduction and acknowledgements. Of all the software associated with the Ferranti Atlas computer, two systems were destined to be remembered as landmarks. One was the Supervisor, which was the first multi- tasking, multi-user operating system. The other was the Compiler Compiler which, when provided with a syntactic and semantic description of a programming language, would automatically generate a compiler for that language. Both the Supervisor and the Compiler Compiler (CC) were conceived and developed at the University of Manchester between about 1960 to 1964, in the spirit of making a high-performance computer more efficient and user-friendly. The Compiler Compiler was the idea of R A (Tony) Brooker. This article begins by describing the pre-history of the CC and then goes on to discuss its implementation and use on Atlas. We try to capture the spirit of the time by including an account of the practical difficulties with which the researchers had to deal and the comings and goings of compiler team members. The wider CC picture is then briefly described. Finally there is a substantial technical Appendix, specially written by Tony Brooker, which describes the CC methodology and gives Tony’s comments on contemporary research. Unless otherwise stated, all the quotes in the main text come from Tony Brooker – see [1] in the list of References given at the end (page 26 and following). One of the aims of this article is to record personal views and the background information that never gets into the formal published papers. Many ex-Atlas people have contributed their memories. Besides Tony Brooker himself, particular thanks are due to John Clegg, Robin Kerr and Iain MacCallum. Alas Derrick Morris and Jeff Rohl, two other members of the original joint Ferranti/University compiler team at Manchester, are no longer with us. Several (non-Manchester) Atlas software implementers, particularly George Coulouris, Bob Hopgood, Ian Pyle and Dik Leatherdale, have also been most helpful in illuminating the wider CC picture. 1. Pre-history. The Compiler Compiler was conceived by Tony Brooker, who had arrived at Manchester in October 1951. The Ferranti Mark I had been delivered to the University in February of that year and Tony’s job, he recalls, “was to make it useable”. When he arrived, the Mark I programming system was “extremely neat and very clever but pretty meaningless and very unfriendly”. Programs had to be written in ‘backwards binary’ as 5-track teleprinter characters, with guidance from a manual written by Alan Turing that was full of minor errors. 1 An obvious improvement would have been to develop some sort of symbolic assembler for the Mark I, but Tony looked deeper. Whilst helping the scientific and engineering users with their problem-solving he became aware that two fundamental issues were causing difficulties even for experienced programmers: firstly, the need to scale numerical calculations involving real quantities; secondly, the need for memory-management between the small but fast primary store and the larger but slower drum backing store. The scaling issue could be solved by implementing a library of interpretive routines for floating-point arithmetic – so long as the invoking of these routines was made easy. The second issue could be eased by providing systems software which gave the end-user the impression of a one-level store – again, so long as the user was treated gently. Tony’s answer was Mark I Autocode, which allowed arithmetic expressions to be written down directly using named integer and real variables without the user being aware of where those variables were stored. The one-level store could be implemented in a balanced way on the Mark I because the time to transfer a block of information from the drum was approximately the same as the time taken by a software-implemented floating- point operation running in the fast store. For the Mark I users of the day, floating-point calculations dominated their problem- solving. Tony’s autocode pre-allocated some alphabetic symbols {r, s, t, … z} to real variables and other symbols {i, j, k, ..} to integers. A library of standard functions and organisational routines for printing, etc., was permanently stored on the Mark I’s drum. When running Autocode programs, 128 tracks on the drum were reserved for instructions and 128 tracks for variables – which was adequate for all but the most ambitious problem- solving at the time [ref. 2]. The autocode system was released in March 1954. Whilst Mark I Autocode is now recognised as one of the earliest high-level languages, its benefits to end-users at the time were perhaps more organisational than linguistic. The next Manchester computer, the Ferranti Mercury, had floating-point hardware but the memory-management problems were still present. By 1958 Tony had developed Mercury Autocode, which extended the linguistic richness of Mark I Autocode whilst failing to provide a general solution to the problem of Mercury’s two levels of storage (core and drum). Tony later recalled that “In a way I missed out... I introduced the idea of chapters where the programmer would break his huge programme up into chapters, and usually the chapter was sufficient …[Only one chapter could be in the main memory at a time and it was the programmer’s responsibility to transfer control between the chapters]. As regards numbers... I’ve forgotten offhand how I solved the problem of numbers. Well I didn’t really solve it”. By this time (1958) the MUSE project, subsequently called Atlas, was under way at Manchester University and Tony Brooker was placed in charge of software provision. The engineers were in the process of solving the problem of the Atlas one-level store via hardware page-address registers and the drum learning program – though Tony had certainly contributed to the discussions – see [ref. 3]. Tony later recalled that “the only problem left for me to solve” was what high-level language(s) should be implemented for Atlas. “I suppose I’ll have to come up with some, some kind of autocode for the Atlas”, he said. Fortran was of course a contender, though Tony added somewhat dismissively that he “wouldn’t have called Fortran a high level language”. 2 At this time there were proposals in America and Europe for a universal programming language, initially known as International Algorithmic Language, IAL. Detailed specifications were put forward by the ACM and also by GAMM (Gesellschaft für Angewandte Mathematik und Mechanik). It was decided to hold a joint meeting to combine the two proposals. The meeting took place from 27 th May to 2nd June 1958 in Zurich, as a result of which ALGOL 58 was defined. Historians now view the Zurich meeting as seminal. Tony recalls: “I never received any invitation to the conference at Zurich and I had no contact whatever with the authors of the 1958 report…. By that time I was resigned to ploughing my own furrow. Although I had attended BCS conferences I was really quite naive about overseas activities”. When he read the Zurich report, Tony thought that ALGOL 58 was “an incredibly inefficient language in some ways” but he nevertheless considered devising a strict subset of ALGOL 58 for Atlas. Robin Kerr, who implemented an Atlas compiler for Algol 60 several years later, thinks [ref. 4] that Tony was initially put off by the difficulty of writing a one-pass compiler for ALGOL 58. Tony himself says his trouble lay in “finding a strict subset [of Algol 58] which was at the same time a useful programming tool”. In the end, of course, Tony designed and implemented Atlas Autocode which was “perfectly usable and it had recursion …the only thing wrong with the Atlas Autocode was that it wasn’t ALGOL 60”. Back in 1958, Tony put off the decision about choice of languages for Atlas. “I thought, well, before jumping in, why don’t I think about a system for writing any kind of compiler, so that whatever turns out to be popular, I can do it”. Tony started by considering the question of how one could describe autocodes in general. Thus was born the idea of the Compiler Compiler. Tony and his former research student Derrick Morris began to publish relevant CC research from 1960 onwards – see [refs. 5 to 9] – long before an operational CC system had been implemented. Their first paper [ref. 5] used the phrase “an autocode to write autocodes”. Tony’s first UK publication with the words Compiler Compiler in the title did not appear until 1963 [ref. 11]. Shortly before this Saul Rosen, an enthusiast at Purdue University, “decided to rewrite my account in terms that Americans could understand”. Rosen’s paper, which drew attention to what Rosen called the “elegance and generality” of the CC, appeared in a journal that was much more widely-read [ref. 11]. Henceforth the by-line Brooker and Morris became better-known. The fact that anything which describes a high-level language is, of course, itself a high- level language was what first attracted interest from the theoreticians. Interest from software implementers was slower to materialise. The Compiler-Compiler as a tool for writing compilers was intrinsically inefficient because it worked top-down and was very recursive – (see also the description given in the Appendix). “You’ve got to keep invoking all this heavy apparatus, which is very time-consuming….to make the CC practically useful we had to put in an artificial addition which says, first check that it isn’t a simple expression, when it can be done in a blinding flash”.