Controlling Large Software Development . in a Distributed Environment
Total Page:16
File Type:pdf, Size:1020Kb
Controlling Large Software Development . In a Distributed Environment Eric Emerson Schmidt Controlling Large Software Development in a Dist ri buted Envi ronment Eric Emerson Schmidt CSL-82-7 December 1982 [P82-00039] © Copyright Eric Emerson Schmidt 1982. All rights reserved. Xerox Corporation XEROX Palo Alto Research Centers 3333 Coyote Hill Road Palo Alto, California 94304 Controlling Large Software Development in a Distributed Environment Report number CSL-82-7. This report reproduces a dissertation submitted to the University of California. Berkeley in partial fulfillment of the degree requirements for the Doctor of Philosophy in Computer Science. The research in this dissertation was supported by Xerox Corporation. Abstract Breaking a program up into modules is an important technique for managing the complexity of large systems. As the number of modules increases, the modules themselves need to be managed. Changing even a single module can be difficult. Compilation and loading are complicated. Saving the state of a program for others to build on is quite error-prone. The development of a large program as part of a multi-person project is even worse. This thesis presents solutions to these problems. We use new languages to describe the modules that comprise a system and tools that automate software development. The first solution developed is a version control system of modest goals that has been used to maintain up to 450,000 lines of code over the past year. Users of this system list versions of files in description files (OF files) that are automatically maintained for the user. OF files may refer to other OF files when one software package depends on another. A working set of software that is saved in a safe location is called a release. The need for a release process was identified and an iterative algorithm that uses OF files to perform releases has been developed. Based on experience with the OF system and the desire to automate the entire compile-edit debug-release cycle, a second solution was developed in which the development cycle is controlled by the System M odeller. The modeller automatically manages the compilation, loading. and saving of new modules as they are produced. The user describes his software in a system model that lists the versions of files used. the information needed to compile the system. and the interconnections between the various modules. The modeller is connected to the editor and is notified when files are edited and new versions are created. To provide fast response. the modeller behaves like an incremental compiler: only those modules that change are analyzed and recompiled. CR categories: 0.1.1 [Programming Techniques]: Applicative (Functional) Programming; 0.2.2 [Software Engineering]: Tools and Techniques - Modules and Interfaces; 0.2.7 [Software Engineering]: Distribution and Maintenance - Version Control; 0.2.9 [Software Engineering]: Management - Software Configuration Management; 0.3.2 [Programming Languages]: Language Classifications - Applicative Languages. Key words and ph rases: Version control, release process, Cedar, Mesa, software management, automatic compilation, applicative language, modules, OF files, system models. TABLE OF CONTENTS Chapter 1 Page 1 Chapter 2 Page 15 Chapter 3 Page 49 Chapter 4 Page 75 Chapter 5 Page 103 References Page 107 Appendix A Page III Appendix B Page 115 Appendix C Page 125 Appendix D Page 129 Appendix E Page 135 Acknowledgments I would like to thank my committee, Robert S. Fabry and Richard J. Fateman of the Computer Science Department of V.C. Berkeley, James A. Reeds of the Statistics Department of Berkeley, and Butler W. Lampson of Xerox PARe's Computer Science Laboratory (esL) for their comments on this thesis. Fabry was my principal research adviser throughout my years at Berkeley; his encouragement and support were constant. Lampson is the intellectual father of the research described in this thesis. He coined the term "System Modelling" and designed the Cedar Kernel language, on which the SML language is based. Monthly meetings with him over the past three years gave me insight and encouragement in this research. Every idea in this thesis has been affected by his critical approach and incisive suggestions. Parts of section 3.4 are based on an internal memo by Lampson on the Cedar Kernel language. Roy Levin of eSL had a strong influence on this research. He proposed the use of OF files as part of the Cedar release process, served as Release Master for almost all Cedar releases, and participated in the design of the Release Tool. His thorough approach is responsible for the success of the release process. He has contributed several ideas to the design. presented here. of a release process based on system models. James H. Morris of eSL brought me into Xerox as a research intern five years ago. He remained my official sponsor at Xerox through another summer and later made me a member of the Cedar project. which he headed until this year. His strong personal influence through those years kept me directed toward a Ph.D.: he is both my friend and mentor. System Modelling has been strongly influenced by Edwin H. Satterthwaite of eSL, who has used the system modeller heavily and has played a major role in providing fast turnaround through module replacement. Satterthwaite has given me advice on the design of the SM language and its implementation, and has allowed me to reproduce the system model that he developed for the Cedar compiler as an appendix to the thesis. I would like to thank these fearless readers of earlier versions of my thesis: Mark Brown. Jim Donahue and Jim Horning of esL, Brian Lewis of Xerox SOD. David Elliott of SRI International, William N. Joy of Sun Microsystems, Inc. and Dan Halbert of V.C. Berkeley. Each made valuable structural and syntactic improvements to this thesis. Lewis took the OF software described in Chapter 2. converted it to run in SOD'S programming environment. and served as Release Master for their first release. Halbert carried many versions of this thesis between Palo Alto and Berkeley: his door-to-door service was invaluable. Every member of the Cedar project has used some of the software described here. I thank them all for their patience in helping to debug these programs, and for their many suggestions that helped turn the programs into a usable system. Thanks also go to Paul Rovner and Warren Teitelman of eSL for allowing me to reproduce some of their OF files in the appendices to the thesis. This research and the Cedar project would not have been possible without the support of CSL and particularly of its manager, Robert W. Taylor. The text of this thesis was prepared using the Bravo editor, and the figures were done using the SIL illustrator. Both of these programs were run on a Dorado. These three tools made preparation of this thesis much easier. I would like to thank my wife, Wendy, for editing this thesis as it was prepared and for supporting me through the good and bad times. This thesis is dedicated to the memory of my father, Dr. Wilson Schmidt, Professor of Economics, who died while this thesis was being prepared. His emphasis on the importance of graduate education and the personal value of a Ph.D. was the inspiration for my pursuit of a Ph.D. CHAPTER 1: INTRODUCTION 1 1. Introduction 1.1 Introduction This dissertation presents a solution for problems occurring when large numbers of components of a software system are developed and maintained by many different programmers. Tools and languages are described that automate production of new versions of the system and the integration of packages into a stable system. The solution is developed in the context of a project to build Cedar, a new programming environment at Xerox's Palo Alto Research Center. The Cedar project [Deutsch-Taft. 1980] is an attempt to take the Mesa language and build around it a programming environment based on ideas from Interlisp and Smalltalk, while retaining the strong type-checking properties of Mesa. Cedar makes it easy for programmers to share packages and collaborate on software development. Cedar programmers work in a distributed computing environment and have to be able to share each other's programs in various stages of development. In this setting, control of versions and file management is difficult because of the large number of files in Cedar and the requirement that versions of files must agree. The first solution developed manages versions of files based on description files (OF files) that have information about versions of files needed and their locations. OF files that describe packages of software are input to a release process. The release process checks the submitted OF files to see if the programs they describe are made from compatible versions of software, and. if so, copies the files to a safe location. The Release Tool performs these checks and copies the files; an interactive algorithm is used that can be repeated after errors in OF files are found and fixed. Use of the Release Tool allows the person making the release, who is called the Release Master. to release software with which he ·is not familiar. The OF system automates version control for the programmer. Based on experience with it. a second solution was developed that is a complete program management system. The user describes his software· in system models. which are complete descriptions of a software system. Similar to a blueprint or schematic, a model combines in one place 1) information about the versions of files needed and hints about their locations, 2) additional information needed to compile the system, and 3) information about interconnections between modules, such as which procedures are used and where they are defined.