A DATA DEFINITION FACILITY for PROGRAMMING LANGUAGES By
Total Page:16
File Type:pdf, Size:1020Kb
A DATA DEFINITION FACILITY FOR PROGRAMMING LANGUAGES by T. A. Standish Carnegie Institute of Technology ....... PittsbUrgh, Pennsylvania May 18, 1967 Submitted to the Carnegie Institute of Technology ....... in partial fulfillment of the requirements for the degree of Doctor of Philosophy This work was supported by the Advanced Research Projects Agency of the Office of the Secretary of Defense (SD-146). Abstract This dissertation presents a descriptive notation for data structures which is embedded in a programming language in such a way that the resulting language behaves as a synthetic tool for describing data and processes in a number of application areas. A series of examples including formulae, lists, flow charts, Algol text, files, matrices, organic molecules and complex variables is presented to explore the use of this tool. In addition, a small formal treatment is given dealing with the equivalence of evaluators and their data structures. -ii- Table of Contents Title Page ....................... i Abstract ........................ ii Table of Contents .................... iii Acknowledgments .................... v Chapter I. Introduction .................. 1 Chapter II. A Selective Review of the Work of Others ...... 17 Chapter III. The Data Definition Facility ........... 25 1. Chapter Summary .............. 25 2. General Description .............. 25 3. Component Descriptions ........... 30 4. Elementary Descriptors ........... 34 5. Modified Descriptors ............ 40 6. Descriptor Formulae ............ 48 7. Declaring Descriptor Variables. and Descriptor Procedures ......... 51 8. Predicates, Selectors, Constructors and Declarations ............. 53 9. Constructors ............... 53 10. Selectors ................. 58 11. Predicates ................ 60 12. Declarations ............... 64 13. Reference Variables, Pointer Expressions and the Contents Operation ......... 65 14. Overlay Assignments, Sharing of Structures and Copying of Structures .......... 68 - iii- Table of Contents, Continued 15. Paths, Path Variables and Path Concatenation .......... 70 16. Other Operations on Descriptors ........ 71 17. Parallel Assignments and Block Expressions. 72 Chapter IV. Examples and Applications .......... 73 1. Formula Manipulation ............ 73 2. List Processing .............. 95 3. Mapping Algol Text into Flow Charts ...... 104 4. Electronic Circuits ............. 121 5. Complex Variables ............. 128 6. Files .................. 129 7. Matrices ................. 134 8. Recognizers for Graphical Objects ....... 137 9. Organic Chemistry Molecules ......... 182 Chapter V. A Small Formal Study of the Equivalence Of Evaluators and Their Data Structures ..... 195 Chapter VI. Some Reduction Algorithms and Background Machines ............. 239 Chapter VII.Future Directions and Conclusions. ..... 254 Appendix I. Additions to the Revised Algol Report Defining the Syntax of the Data Definition Facility ..... 267 Appendix II. Summary of the Semantics of the Data Definition Facility ............. 2 76 Appendix III. Data Structure Definitions .......... 280 References ..................... 28 7 -iv- A cknowledgements These early days of the science of computing, even if somewhat muddled, are filled with excitement and at Carnegie Tech especially the environment has been stimulating. This is due mainly, I feel, to the presense and leadership of a gifted, dynamic and personable triumvirate who have marshalled people, resources and, most of all, inspiring ideas in the construction of the computing enterprise at Carnegie. It has been both a unique privilege and an enormous pleasure to have had the opportunity to participate in this enterprise for the past four years. I am particularly grateful to several individuals who have been helpful in the preparation of this dissertation. First and foremost is Professor Alan J. Perlis. His outstanding lectures and teachings have provided many of the ideas that I find are now keystones in my thinking, and he has given generously of his valuable time and advice in helping me to prepare this dissertation. My colleagues in the Formula Algol group : Renato Iturriaga, Rudy Krutar and Jay Earley, with whom I have shared the rewards and frustrations of sustained, constructive toil, are also deserving of mention for hours of interesting conversation and for their creative suggestions. I am also grateful to Professor Allen Newell whose remarkable originality has been a broadening influence and -V - a stimulus in getting me to look at different approaches to the work of this dissertation. As a recipient of a National Science Foundation fellowship during my graduate studies it is a pleasure gratefully to acknowledge my indebtedness to the National Science Foundation for financial support. Finally, computer time and facilities were provided in part by the Advanced Research Projects Agency of the Office of the Secretary of Defense under contract SD-146 to the Carnegie Institute of Technology. - vi - Chapter I Introduction This dissertation presents a method for defining and manipulating several varieties of data structures. This method consists of embedding a descriptive notation for data structures within a programming language in such a way that the resulting language behaves as a synthetic tool for describing and constructing data and programs in a variety of application areas. A series of examples are presented which explore the use of this tool and a small formal presentation is given dealing with the equivalence of evaluators and their data structures. One hope for the dissertation is that the data structure notation it presents will serve both a descriptive and an integrative role. Enough examples have accumulated in our programming experience of different kinds of useful and durable data structures that it becomes worthwhile to study them deliberately and to search for a notation which, to some extent, summarizes the variations in structure that they represent. What is desired is a notation whose permissible variations allow us to describe a wide range of different data structures including, as subsets, the useful structures known to us and new ones as well. A good descriptive notation should not only -2- be a summary of what is known, it should lead along natural lines of generalization suggested by its combining laws and rules of growth to the description of new structures. With regard to its use as part of a programming technique for describing and manipulating data, the notation should permit descriptions of data structures to be concise, clear and direct. In our view, the significant problems that we face in dealing initially with new areas of research in computer science are "problems of discovery, formulation, representation and immediate generalization [45]" and that with regard to practical application "we are not [during the initial stages of exploration] at the place of building very elaborate or formal mathematical structures that are significant [45]". The study of data structures has perhaps advanced beyond this initial stage. At present, we have many results of the processes of discovery and formulation (e. g. lists, arrays, strings, rings, formulae, trees and directed graphs ). These discoveries make possible a process of immediate generalization. The data structure notation presented in this dissertation exhibits one immediate generalization of known data structures. Thus the notation summarizes, integrates and provides descriptive power. It also points in new directions. -3- Giving a notation of broad descriptive power enhances our opportunities for creating meaningful formal mathematical structures. A small formal treatment of the equivalence of evaluators and their data structures, given in Chapter V, although not of broad practical significance, is intended as a harbinger of what becomes available to us in the realm of mathematical formalization. Ultimately, of course, we hope that significant mathematical structures can be formulated which will summarize the significant discoveries about data structures with precision and generality. At this point the study of data structures will approach maturity. Another objective of the dissertation is to provide a method for improving programming languages. One of the constant concerns of computer science is to design good programming languages, and, once they are implemented, to enrich them and improve them, for programming languages are a principal means whereby we specify and control the behavior of computers and whereby we can organize our transactions with computers to achieve multiplication of effect without multiplication of effort. For a given programming task, the greater the ease with which a programmer can describe in a given language the data of -4- interest and the greater the ease in that language with which he can manipulate the data he has described, the greater the usefulness of that language for that task. Conversely, a programming language is relatively less useful for a given task to the degree that it becomes awkward, indirect and complex to formulate and manipulate representations of the data of interest to the task in terms of the primitive data and primitive operations supplied in that language. Indirectness of data representation and data manipulation may cause a loss both of clarity of structure and also of efficiency since operations on the representations which could be expressed or performed once prior to writing an algorithm may be distributed through it and since the primitive data structures used