System Programming in a High Level Language
Total Page:16
File Type:pdf, Size:1020Kb
System Programming in a High Level Language Andrew D Birrell Dissertation submitted for the degree of Doctor of Philosophy in the University of Cambridge, December 1977. Except where otherwise stated in the text, this dissertation is the result of my own work, and is not the outcome of work done in collaboration. Copyright © A.D.Birrell, 1977 Acknowledgements The work described in this dissertation has been supported by the Science Research Council. It would not have been possible without the support and encouragement of Brenda, and of my supervisor Roger Needham, and of my head of department, Professor M.V.Wilkes. In particular, Brenda showed particular diligence in typing this document, and perseverance in encouraging me to write it. The paper reproduced as Appendix Y was first published as an invited paper at the 1977 conference on Algol68 at the University of Strathclyde [40]. Contents Section A: Introduction 1. The Problem 1.1 Aims and Requirements 1.2 High Level Language Requirements 1.3 System Programming Language Requirements 1.4 The Present Approach 2. Background 2.1 The Algol68C Project 2.2 The CAP Project Section B: High Level Language Implementation 1. Today's Algol68C System 2. Separate Compilation Mechanisms 2.1 Requirements 2.2 Separate Compilation in Algol68C 2.3 Separate Compilation in Other Systems 2.4 A Complete Separate Compilation Scheme 3. Compiler Portability 3.1 Other Intermediate Codes 3.2 Portability in Algol68C 3.3 Summary and Conclusions Section C: System Programming Facilities 1. Runtime System 1.1 The CAP Algol68C Runtime System 2. Hardware Objects and Operations 2.1 Storage Allocation 2.2 New Data-types 2.3 Conclusions 3. The 'Program' 3.1 Program Structure 3.2 Program Environment Section D: Summary and Conclusions Appendix X: CAP Algol68C Documentation Appendix Y: Storage Management for Algol68 Appendix Z: Cap Project Notes Section A: Introduction 1 The Problem 1.1 Aims and Requirements This thesis is concerned with the construction of a high level language system suitable for the implementation of a general purpose operating system for a computer. There are three aspects to this task: firstly, a suitable high level language must be chosen or designed; secondly, a suitable implementation of this language must be manufactured; thirdly, the operating system itself must be written. These three aspects inevitably overlap in time - experience in implementing the language may cause one to review decisions taken in the design of the language, and experience in constructing the operating system will bring to light inadequacies, inconveniences and inelegancies in both the implementation and the language. Most previous work in this field has been concerned with the first of these aspects, and has adopted the approach of designing special-purpose languages, categorized as 'System Programming Languages' (SPL's) or 'Machine Oriented Languages' (MOL's). Various such languages have been developed, some of which are discussed below. Few such languages have achieved the elegance or generality of ordinary general-purpose languages such as Pascal or Algol68. Little or no investigation has previously - 1 - been made into the second of these aspects, the implementation of the language. The implementation, as distinct from the language, can have a very considerable effect on the practicability of using the resulting language system for manufacturing an operating system. Certainly, there are languages which, however brilliant the implementation, would inevitably be disastrous for writing an operating system; but the implementation, however suitable the language, makes the difference between the language system being an aid or an impediment to the system programmer. It is with aspects of the implementation that this thesis is mainly concerned. It should be emphasised that we are considering the real construction of an operating system on physical hardware without sophisticated external support. The 'language system' must not amount to a simulation package, nor include facilities which would normally be considered part of the operating system (such as in Simula or Concurrent Pascal), unless those facilities are themselves implemented using a high level language (they could then be considered to be part of the operating system). It is a principle of the design that the language system should not contain imbedded design decisions which would be more properly considered as part of the design of the operating system - it should be a tool, not an integral part of the operating system. Also, since we are providing a general purpose operating system, user programs can be written in any language - we are not assuming a single-language operating system. - 2 - When embarking on the production of a suitable language system, we must at an early stage make a fundamental decision: whether to base the work on an existing high level language which can be modified as necessary, or whether to construct a new language from scratch. If we adopt the latter alternative, we will inevitably design a special-purpose language and follow in the footsteps of BLISS et al. In fact, I have adopted the former alternative. This leads to new ground: we must decide to what extent modifications to the language are needed, and this should enable us to decide which factors in the design of a suitable language arise peculiarly from writing an operating system, and which arise from the normal requirements of a non-numerical program. Also, when following this path we can investigate the extent to which suitable implementation techniques can aid us in achieving our aims with minimal special purpose modifications to the language. The language system produced was intended to be used (and is used) for an operating system for the Cambridge CAP computer. This computer, and its operating system, have many unusual fea- tures, but the techniques and facilities developed for the language system are not especially CAP-oriented - they have applicability to many other machines and environments. - 3 - 1.2 Use of Normal High Level Language Facilities The advantages of using a high-level language instead of machine-code are numerous and well known; they amount to saying that, when writing in a high-level language, it is easier to write programs, more difficult to write faulty ones, and when you have written a faulty program you discover the fault sooner. We are not much concerned here with faulty programs (although much of the distinction between a 'good' and a 'bad' implementation of a language lies in what it will do with a faulty program), but it must always be borne in mind when considering any topic in language design or implementation that a programmer must be told as early as possible of any mistake. As far as possible, all error checking should be performed at compile time, and we must always try to have sufficient redundancy in constructs for a simple mistake to be detectable. Error detection is one of the major gains in using a high level language, but first we must be able to express our problem conveniently. Much system programming is amenable to writing in a normal high level language with no particular trouble. Such operations as converting a textual file title into a disc address are merely mathematical mappings (albeit somewhat complicated ones), and can be implemented using techniques remarkably similar to those used in any non-numerical computing. A programmer implementing such algorithms is involved in the task of taking a single, complicated operation and breaking it into several simpler ones. - 4 - When writing in machine code, he is forced to resolve the problem into particular bit or word manipulations allowed by his particular hardware and operating environment; when writing in a high level language, such drastic measures are not forced upon him, since he has available to him operations which are less basic. A high level language makes available to the programmer sets of abstract objects, which he can consider not in terms of his hardware and operating system, but purely in terms of the objects, their relationships to each other, and the operations he can perform upon them. For almost all of almost every system program, it is sufficient to express the algorithm in such abstract terms - it is extremely rare for the programmer to require to manipulate non-abstract (hardware or system oriented) objects or operations. Expressing algorithms in such abstract terms clearly has many advantages. If the abstract objects and operations are suitable, it will be much easier to convert an algorithm into them than into the machine objects, purely because they involve 'higher level' concepts more closely related to the original algorithm. For example, indexing an array is more closely related to table look-up than is adding a fixed point integer, multiplied by the amount of store-per-element, to the base address of a sequence of elements; indeed, the programmer can use the abstract facility without knowing how it is implemented today. Thus, a suitable set of abstractions will be ones which would be encountered while converting the abstract algorithm into machine code - using the abstractions saves a step - 5 - in the conversion. If the algorithm can be expressed purely in terms of the abstractions, then it has no dependency on the hardware or operating environment. This has several beneficial consequences: firstly, it implies that the programmer has not made a mistake in mapping his objects onto the hardware (all such mistakes are centralised in the compiler!); secondly, the hardware can change without affecting his program (for a system program, the gain here is that the program can be developed on a different computer, or before the hardware of the target computer has stabilised); thirdly, the operating system interfaces can change (this is quite likely to happen during the development of a system, and having consequently to rewrite all the programs, rather than just recompile them, would be unfortunate).