CALIFORNIA STATE UNIVERSITY, NORTHRIDGE

THE EVOLUTION OF SOFTWARE DESIGN IDEAS

A thesis submitted in partial satisfaction of the requirements for the degree of Master of Science in Computer Science

by

David J. Koepke

May 1985

The thesis of David J. Koepke is approved:

Professor Diane Schwartz

Professor Russell Abbott, Chair

California State University, Northridge

ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to Professor Russ Abbott for all the enlightening conversations, suggestions, and generous extended use of his software design books and articles. I would also like to thank Pat Morales for her vital assistance in typing and revising this thesis, typing and distributing the inquiry letters, and her moral support throughout this project. A special thanks to Larry Constantine, Edsger Dijkstra, C.A.R. Hoare, James Emery and Donald Knuth for their responses to my questions on the history of software design.

TABLE OF CONTENTS

List of Figures
Abstract

Chapter 1  INTRODUCTION
    1.1  Overview of Thesis
    1.2  Previous Historical Texts
    1.3  Historical Errors
    1.4  Design Influences of the Past

Chapter 2  ABSTRACTION
    2.1  Early Abstraction
    2.2  Top-Down Design/Stepwise Refinement
    2.3  Levels of Abstraction
    2.4  Program Families
    2.5  Abstract Data Types
    2.6  Summary

Chapter 3  MODULARITY
    3.1  Early Modularity
    3.2  Information Hiding
    3.3  Object-Oriented Programming
    3.4  Summary

Chapter 4  STRUCTURED PROGRAMMING
    4.1  Genesis of Structured Programming
    4.2  GOTO Controversy
    4.3  Summary

Chapter 5  CONCURRENCY
    5.1  Shared Variable Based Concurrency
        5.1.1  Busy Waiting
        5.1.2  Semaphores
        5.1.3  Critical Regions
        5.1.4  Monitors
    5.2  Message Based Concurrency
        5.2.1  Communicating Sequential Processes
        5.2.2  Distributed Processes
        5.2.3  Ada
    5.3  Summary

Chapter 6  DATA FLOW
    6.1  Early Data Flow
    6.2  Data Flow Computers and Languages
    6.3  Summary

Chapter 7  FUTURE PROGRAM DESIGN
    7.1  Conventional Languages
    7.2  Non-Conventional Languages

Chapter 8  CONCLUSIONS AND COMMENTS
    8.1  Software Design Trends
    8.2  Origin of Design Ideas
        8.2.1  Refinement of a Vague Idea
        8.2.2  Reapplication of a Proven Idea
    8.3  Understanding the Ideas

References

Appendix - Letters

LIST OF FIGURES AND TABLES

Figures

Figure 2.1  Abstract Flowchart
Figure 2.2  Abstract Data Type - Stack
Figure 2.3  Axiomatic Queue Specification
Figure 3.1  Emery's Module Hierarchy Chart
Figure 5.1  Evolution of Concurrency
Figure 5.2  CSP Correspondence
Figure 5.3  Ada Rendezvous
Figure 6.1  Early Data Flow Diagram
Figure 6.2  Adams' Data Flow Graph
Figure 6.3  First Design Data Flow Diagram
Figure 6.4  Serial Data Flow
Figure 6.5  Multiple Data Flow
Figure 8.1  Procedural Program Model
Figure 8.2  Object Program Model
Figure 8.4  Evolution of High-Level Data Abstraction
Figure 8.5  Evolution of Concurrency Constructs
Figure 8.6  Evolution of Information Hiding

Tables

Table 8.3  Action vs. Object Model

ABSTRACT

THE EVOLUTION OF SOFTWARE DESIGN IDEAS

by

David J. Koepke

Master of Science in Computer Science

This thesis is a study of how software design ideas have evolved from the invention of the flowchart in 1947 to the present. The intent of this study is to accurately record software design history, to discover how new design ideas are conceived, and to recognize historical trends that may suggest future software design ideas. The specific design ideas discussed are abstraction, top-down design, levels of abstraction, program families, abstract data types, modularity, information hiding, object-oriented programming, structured programming, concurrency, and data flow. Each idea is defined and a step-by-step history of how it evolved is given. Also included is a brief discussion of future program design.

Chapter 1 INTRODUCTION

The importance of software design has been recognized since the NATO-Sponsored Conference on Software Engineering in 1968 [Dijk 72b]. Its importance is realized by considering that design significantly affects both the cost of developing and maintaining software. For example, in 1979 the annual cost of software in the U.S. was about 20 billion dollars [Boeh 79]. Approximately one third of this amount was used to develop software, of which design is an important step. The remaining two thirds of this 20 billion was spent on software maintenance: correcting errors, enhancing functionality, and improving performance [Wulf 79]. All of these activities are directly affected by a program's design.

This thesis is a study of the evolution of software design ideas. Design ideas or design principles are general guidelines that program designers use in transforming the specification of a problem into a well-structured program. The design ideas discussed in this text are abstraction, levels of abstraction, program families, abstract data types, modularity, top-down design, information hiding, object-oriented programming, structured programming, concurrency and data flow.

The purpose of this thesis is to see what can be learned from studying the evolution of software design ideas. In particular, to observe past and present software design trends and to extrapolate from these trends to perhaps discover where software design is headed. Also, by observing how present design ideas have originated, insight may be gained and used to develop future software design ideas. In other words, this thesis is a study of the human learning process applied specifically to design and the programming community of the past thirty-five years.

A second goal is to accurately record software design history: who first stated an idea, when they said it, in what book or article it first appeared, and how the idea has developed. A list of common historical misconceptions is given in Section 1.3. Portions of software design history have been reintroduced in some of the chapters to present each idea's history separately, so that maximum continuity of idea development would be obtained. The dates of origin given to the ideas discussed are based on the date of publication of an article or technical report or when it was first presented at a conference. The actual date of origin of an idea precedes the published date.

1.1 Overview of Thesis

This thesis is presented in four conceptual sections: the process of abstraction (Chapter 2), object or module abstraction (Chapter 3), control abstraction (Chapters 4, 5, 6), and future abstraction (Chapter 7). The chapters are organized as follows:

Chapter 2 discusses abstraction, the major design technique of programming. Subtopics include the techniques for applying abstraction: top-down design/stepwise refinement, levels of abstraction, program families, and abstract data types.

Chapter 3 discusses modularity. Modules are program sub-units that contain desired abstractions. The subtopics of this chapter are information hiding, the idea which has greatly influenced the module concept, and object-oriented programming, the latest form of modularity.

Chapter 4 covers structured programming. This idea has been presented in a separate chapter because it is a design philosophy that contains some of the other ideas within it. The main thrust of this chapter is the use of GOTOs.

Chapter 5 discusses concurrency and is presented in two major sections, shared variable concurrency and message based concurrency. Shared variable concurrency discusses the ideas of busy waiting, semaphores, critical regions, and monitors. Message based concurrency discusses communicating sequential processes, distributed processes, and Ada tasks.

Chapter 6 explains data flow, the idea that program control should be based on the availability of data. Subsections of this chapter discuss early data flow concepts and recent uses of data flow to facilitate concurrency.

Chapter 7 presents a brief look at software design in the future.

Chapter 8, the conclusion, discusses the benefits gained from studying the evolution of software design ideas.

1.2 Previous Historical Texts

The research of current literature on software design did not reveal any general papers or books on the history of software design ideas (i.e., none covering a broad range of design ideas). However, papers by Shaw on abstraction [Shaw 84], by Yourdon on structured programming [Your 79], and by Gehani [Geha 84] and Andrews [Andr 83] on concurrency each covered the history of a single idea.

1.3 Historical Errors

The few historical inaccuracies that were found during research were very slight in that credit was usually given to the second person in a sequence of developers of an idea. For example, structured programming is often claimed [McCr 73], [Shaw 84] to have begun with Dijkstra's 1968 letter to the ACM, "Go To Statement Considered Harmful" [Dijk 68a], because it triggered a GOTO controversy. Perhaps a more accurate genesis of structured programming is Dijkstra's 1965 article, "Programming Considered as a Human Activity" [Dijk 65a]. This article not only contains arguments against the use of GOTOs but also presents the concepts of modularity and stepwise refinement within the general goal of program correctness (see Section 4.1 for more detail).

Larry Constantine is often credited [Cons 84] with the invention of the data flow diagram because of the 1974 paper "Structured Design" [Stev 74]. In fact, he was the first to apply this type of diagram (which had been used as a model for the analysis of parallel computations in 1963) to software design (see Section 6.1 for more detail).

The class concept is usually attributed to SIMULA-67 [Lisk 74], the first language to use this concept. However, the creator of the class concept was C.A.R. Hoare, who first presented this idea in the 1965 Algol Bulletin [Hoar 65] (see Sections 2.5 and 3.3 for more detail).

The monitor concept [Brya 79], [Barn 82] is often attributed to Brinch Hansen [Brin 72] and/or Hoare [Hoar 71], [Hoar 74]. However, both Brinch Hansen [Brin 73a] and Hoare [Hoar 74] attribute the concept of the monitor to Dijkstra, in his 1971 article, "Hierarchical Ordering of Sequential Processes" [Dijk 71] (see Section 5.1.4 for more detail).

1.4 Design Influences of the Past

The last major stimulus to software design was caused by the rapid advance in hardware technology of the early 1960's. The limited memory and execution speed of computers of the 1950's and early 1960's forced programmers to concentrate on one design goal -- efficiency. Programmers had to develop clever tricks, such as modifying their own code, to overcome the computer's limitations. This restrictive hardware environment negatively impacted design and resulted in programs that were less clear. The limited technology of these machines also meant programs were necessarily small. The smaller programs of earlier times suffered less (than today's large programs) from the problems caused by poor designs. As a result, program design was not a problem and was not even considered as a topic of discussion, as is evidenced by the void of design techniques in books and articles from the 1950's to 1962.

By the early 1960's, faster machines with more memory became available. Suddenly, efficiency was no longer of overriding importance. Also, these new machines brought with them larger, more complex problems to be solved. These larger programs amplified the problems associated with poor designs; program debugging and modification became overwhelming problems. The technological advances of hardware caused a shift in emphasis from relatively little design (with efficiency as its primary goal) to the importance of good design (with modifiability, understandability, and reliability as the goals).

The turning point came in 1968 at the NATO-Sponsored Conference on Software Engineering at Garmisch, Germany [Dijk 72b]. The production of software had reached what many at the conference called a software crisis. It was generally agreed that the development of software (and software design) needed to be studied. Shortly afterward, in the mid-1970's, true software design methodologies emerged for the first time [Stev 74], [Jack 75]. Software design had begun to make the transition from craft to science.

More recently, the influence of hardware on software has been felt in different ways. The restrictive or negative influence of hardware on software and software design has caused many researchers to re-evaluate the architectural basis of today's computers [Back 78]. It has been realized that current programming languages are more concerned with the architectural needs of hardware design than with the problem solving needs of the programmer [Step 82]. This doesn't make sense with software costs greatly outstripping hardware costs! This idea (that it is time for hardware to accommodate the needs of languages) was proposed by Dijkstra in 1962 [Dijk 62]!

Another impact of hardware on software has been the recent advances made in LSI semiconductor technology. Soon, parallelism in hardware will be available on a large scale, making the use of concurrency in languages and software design more desirable. A brief look at newly proposed hardware architectures is given in Chapters 6 and 7 to assess their likely influence on future software design.

Chapter 2 ABSTRACTION

Abstraction is the process of concentrating on the relevant properties of an object while ignoring its other properties. An abstraction represents some of an object's properties (the ones that are relevant from a particular frame of reference). A road map, for example, is an abstraction of the actual roads in the country it represents. The map shows the directional and length relationships of the roads. It ignores other properties such as the scenery, the frequency of road usage, etc. The abstraction (the map) is useful because it is easier to use (study) than the actual roads it represents. Abstraction is used for the same reason in program design: it simplifies the design process. Knuth explains:

We understand complex things by systematically breaking them into successively simpler parts and understanding how these parts fit together locally. Thus, we have different levels of understanding, and each of these levels corresponds to an abstraction ... [Knut 74:291].

Kopetz notes that:

... abstraction only makes sense if its purpose, a point of view, is specified. Only then is it possible to distinguish between relevant properties and irrelevant detail (in relation to this viewpoint). [Kope 79:103]

In software design, the purpose of abstraction is to make programs understandable, modifiable, and reliable. The specific abstraction ideas discussed in this chapter are (chronologically): subroutines, stepwise refinement, levels of abstraction, program families, and abstract data types.

2.1 Early Abstraction

The use of abstraction dates back to the beginning of programming because abstraction permeates nearly every aspect of programming. For example, the value of an integer, a variable, or a programming language itself are all abstractions of their actual representation on a given machine. During the early years of programming (1947-1967), abstraction was used but not discussed explicitly. The earliest use of abstraction as a programming design aid was the creation of the flow diagram in 1947, by Herman Goldstine and John von Neumann [Knut 80]. The flow diagram, later shortened to flowchart, was an abstract way of representing algorithms in which control flow was easily visualized.

The invention of the closed subroutine was one of the most important early abstraction techniques because it allowed the abstraction of actions. In 1945, Konrad Zuse described notations for subroutines in his manuscript, "Der Plankalkül" (program calculus), but hardware was not yet available to support his ideas [Knut 80]. Zuse's paper remained unpublished until 1972; however, it showed that the subroutine concept had been conceived as early as 1945! Independently of Zuse, David Wheeler invented the closed subroutine in 1949 [Wilk 80]. Although the subroutine allowed programmers to use abstraction by concentrating on what it does while abstracting away how it does it, it was not thought of in this sense at this time (1951). It was merely considered as a labor saving device:

The labor of drawing up a program for a particular problem is often reduced if short, ready-made programs for performing the more common operations are available. These short programs are usually called subroutines, and may be incorporated as they stand in the program, thus reducing the amount of work which has to be done ... [Wilk 51:1].

As Dijkstra recalled:

I do not remember having appreciated subroutines as a means for "rebuilding" a given machine into a more suitable one, curiously enough. Nor do I remember from those days [1951] subroutines as objects to be conceived and constructed by the user to reflect his analysis: they were more the standard routines to be used by the user. [Dijk 72a:46]

One year later in 1952, at the first ACM National Conference, Wheeler first described subroutines as a design abstraction:

When a programme has been made from a set of subroutines the breakdown of the code is more complete than it would otherwise be. This allows the coder to concentrate on one section of a programme at a time without the overall detailed programme continually intruding. [Whee 52:170]

Wheeler's comment about the coder (they weren't called programmers yet!) concentrating on one section of the program at a time was the first explanation of how the subroutine (and the abstraction it contained) could be used to simplify program development.

Assemblers and macro-assemblers were first developed in the mid-1950's [Wegn 76]. The use of these more abstract symbolic languages made programming easier by abstracting away the programming details of particular machine codes. Macro-assemblers allowed programmers to define their own macro-instructions. These macro-instructions provided programmers with a means of creating new instructions and therefore allowed a more abstract language. By 1957, the subroutine was beginning to be used and thought of in a more abstract sense than previously:

The concept of subroutine used here is an extremely broad one which corresponds roughly to any connected chunk carved out of the flow chart of a routine when we find it convenient to think of this chunk as a unit. Subroutines may therefore be thought of as convenient molecules of routines which, in macro flow chart, can be presented as single blocks. [Gorn 57:263]

The use of macro flow charts demonstrated how abstraction was being used in program development at this time (Figure 2.1). Thus, by 1957, abstraction had begun to play a bigger role in program design. The development of FORTRAN I, the first high-level language, in 1957 and ALGOL 60 in 1960 [Wegn 76] made programming easier than before because these languages were more abstract or English-like than assembly languages. ALGOL allowed subroutine nesting to an arbitrary level and forced programs to be made entirely out of blocks.

[Flowchart with blocks: DO WHAT IS NEEDED TO BEGIN; PERFORM THE DESIRED OPERATIONS; FIND OUT WHETHER A REPETITION IS NEEDED; DO WHAT IS NEEDED TO BE ABLE TO REPEAT.]

Figure 2.1 Abstract Flowchart
Source: [Gorn 57:263]

The first intended use of abstraction as a program design technique began during the early 1960's. This technique is known as top-down design.

2.2 Top-Down Design/Stepwise Refinement

Top-down design, also known as stepwise refinement and functional decomposition, is a general program development method in which a program is developed by a sequence of refinements. First, a problem is abstractly described in terms of what it is to do functionally. Then this statement is divided into two or more subproblems that will implement the initially described problem. Each of the new subproblems is then divided into subproblems that will implement it. This process repeats until all the subtasks are defined in terms of the underlying implementation language.

Early notions of top-down design were discussed in terms of dividing or breaking a problem into pieces. However, these descriptions were usually quite vague and did not describe a systematic step-by-step process. Perhaps the earliest notion of dividing a program into parts was an unpublished paper by Haskell Curry in 1950 [Curr 50]. Curry explains his view:

The first step in planning a program is to analyze the computation into certain main parts, called here divisions, such that the program can be synthesized from them. These main parts must be such that they, or at any rate some of them, are independent computations in their own right, ... [Curr 50:paragraph 34]

Although Curry's proposals were very advanced for their time, in practice they were not successful because, according to Knuth, he factored problems in an unnatural way, producing component parts that had multiple entrances and exits [Knut 80]. Yet his paper showed that the idea of program development by problem divisions had been envisioned as early as 1950!

As part of a book review of Daniel McCracken's first book, Charles L. Baker suggested that the idea of program division was widely used in 1957:

One technique that is widely used is to break the problem into small, self-contained subroutines, trying at all times to isolate the various sections of coding as much as possible ... by isolating the various portions of the code as much as possible the problem is reduced to many much smaller ones. [Bake 57:304]

Despite Baker's suggestion, a technique was not described by authors of the time. In fact, the reason Baker mentioned this technique was because he felt McCracken had erred by not describing it in his book. Thus, the idea of program subdivision was still quite vague in 1957. It was not yet explained how one goes about breaking a problem into subtasks. The guidelines "to isolate various sections of coding" and to make the subroutines "small" and "self-contained" were too vague to be of much use but were a predecessor to top-down design.

Five years later, in 1962, James Emery talked about factors to be considered during task segmentation: "The segmentation of a major data processing task into modular subtasks must take a number of factors into consideration." [Emer 62:264]. Although Emery talked about the criteria for the division process (to simplify, to avoid repeated code, to encapsulate possible changes, and for tasks dealing with the same data), he did not talk about the division process itself.

The first description of the top-down design process was given by Dijkstra in his 1965 paper, "Programming Considered as a Human Activity":

I assume the programmer's genius matches the difficulty of his problem and assume that he has arrived at a suitable subdivision of the task. He then proceeds in the usual manner in the following stages:
- he makes the complete specifications of the individual parts;
- he satisfies himself that the total problem is solved provided he had at his disposal program parts meeting the various specifications;
- he constructs the individual parts, satisfying the specifications, but independent of one another and the further context in which they are used.
Obviously, the construction of such an individual part may again be a task of such complexity, that inside this part of the job, a further subdivision is required. [Dijk 65a:215]

Dijkstra called this process the "dissection technique", but it was the first description of what is currently called top-down design and stepwise refinement. Dijkstra noted that this technique of "Divide and Rule" has been known since ancient times.

Dijkstra explained that the dissection technique was a transplanted use of the method of mathematical proofs:

The analogy between proof construction and program construction is, again, striking. In both cases, the available starting points are given (axioms and existing theory versus primitives and available library programs); in both cases the goal is given (the theorem to be proved versus the desired performance); in both cases the complexity is tackled by division into parts (lemmas versus subprograms and procedures). [Dijk 65a:215]

The origin of the term "top-down" is unknown. It is possible that many people used this term nearly simultaneously because of its obvious reference to the direction in which a hierarchy is built using the dissection technique. Just a few months after Dijkstra explained his dissection technique, Larry Constantine came very close to using this term:

Working from the top level down allows the programmer analyst to limit the scope of his considerations to the immediate module or hierarchical level being worked on ... Contrast the "bottom up" approach. [Cons 65:21]

Although Constantine said "top level down" rather than "top-down", the terms are very close and their meaning synonymous. Nearly three years later in 1968, at the NATO Conference on Software Engineering, the term top-down was used by several of the participants as if its meaning were common knowledge [Buxt 76]. Thus, between 1965 and 1968, the term top-down became popular and meant the dissection technique conveyed by Dijkstra.

In 1969, Dijkstra added the idea of simultaneous refinement of data and operations to the dissection technique:

In the refinement of an abstract program (i.e. composed from abstract statements operating on abstract data structures) we observe the phenomenon of "joint refinement." For abstract data structures of a given type a certain representation is chosen in terms of new (perhaps still rather abstract) data structures. The immediate consequence of this design decision is that the abstract statements operating on the original abstract data structure have to be redefined in terms of algorithmic refinements operating on new data structures in terms of which it was decided to represent the original abstract data structure. [Dijk 69:47]

The "joint refinement" of data and operations further clarified the top-down idea. Prior to this, the refinement (and abstraction) of data had not been mentioned as part of the dissection technique.

In 1970, Harlan Mills gave the first description of the language to be used in the top-down development process. Mills proposed that the top-down design process be carried out directly in code and functional specifications in English [Mill 71]. He explained that, in this way, syntax checking and program execution through the use of program stubs can occur very early during implementation. Mills also gave the first example (although not complete) of the top-down development process.

The term "stepwise refinement" was first used by Niklaus Wirth in his 1971 article, "Program Development by Stepwise Refinement" [Wirt 71]. Wirth explains the process of stepwise refinement:

•.• the program is gradually developed in a sequence of refinement steps. In each step, one or several instructions of the given program are decomposed into more detailed instructions. This successive decomposition or refinement of specifications terminates when all instructions are expressed in terms of the underlying computer or programming language • • • [Wirt 71:221] . ·

Wirth presented a complete discussion of program refinement criteria (to decompose decisions, to untangle seemingly interdependent parts, to allow modifiability, and to defer details) and gave the first complete example of a program developed by stepwise refinement. In summary, top-down design is the application of the old adage "Divide and Rule" to program design. The technique of "Divide and Rule" has been known since ancient times [Dijk 65a]. Who first applied this idea to program design and when is not clearly marked in books or journals of the time. Based on Baker's comments it appears that an early form of top-down design existed 1n 1957 [Bake 57]. Dijkstra, however, was the first to describe the top-down process as a sequence of steps, in 1965 [Dijk 65a], and is essentially the founder of top-down design. 1 20 ' '
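The flavor of the technique can be shown with a small sketch. The program below (written in C purely for concreteness; the report-printing problem and all function names are hypothetical, not drawn from the papers cited above) records two stages of refinement: the top level states the problem as three abstract statements, and the second level expresses each statement in terms of the underlying language.

    #include <stdio.h>

    #define MAX_RECORDS 100

    /* First refinement: "produce a sorted report" is decomposed
       into three subproblems, specified but not yet implemented. */
    static int  read_records(int records[], int max);
    static void sort_records(int records[], int n);
    static void print_report(const int records[], int n);

    int main(void)
    {
        int records[MAX_RECORDS];
        int n = read_records(records, MAX_RECORDS);
        sort_records(records, n);
        print_report(records, n);
        return 0;
    }

    /* Second refinement: each subproblem is expressed in terms
       of the underlying language.                               */
    static int read_records(int records[], int max)
    {
        int n = 0, value;
        while (n < max && scanf("%d", &value) == 1)
            records[n++] = value;
        return n;
    }

    static void sort_records(int records[], int n)
    {
        /* The choice of sorting method is a design decision
           confined to this level; main's view is unaffected.    */
        for (int i = 1; i < n; i++) {
            int key = records[i];
            int j = i - 1;
            while (j >= 0 && records[j] > key) {
                records[j + 1] = records[j];
                j--;
            }
            records[j + 1] = key;
        }
    }

    static void print_report(const int records[], int n)
    {
        for (int i = 0; i < n; i++)
            printf("%d\n", records[i]);
    }

Note that main satisfies itself that the total problem is solved given parts meeting the three specifications, exactly as in Dijkstra's description of the stages above.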

2.3 Levels of Abstraction

Levels of abstraction, also referred to as hierarchical design, is the idea that software should be designed as a series of abstract layers. Each layer is designed to use only the resources (i.e., data, I/O devices) of its own layer or the layer beneath it. The advantage of building software in layers is that it allows the designer to think about each level independently, in terms of its conceived objects and the objects it uses in the level immediately below it.

The idea of levels of abstraction was first described by Dijkstra in his 1967 paper, "The Structure of the 'THE'-Multiprogramming System" [Dijk 67]. Dijkstra described the structure of an operating system as a hierarchy of abstractions. Each level of the hierarchy fulfilled a partial implementation of the system. The system was designed "bottom up" of software layers that became progressively more abstract at higher levels. The key idea of this concept was that each level dealt only with resources of the level immediately below it. Dijkstra explains:

At level 0 we find the responsibility for processor allocation .... Our first abstraction has been achieved; above level 0 the number of processors actually shared is no longer relevant. [Dijk 67b:343]

Dijkstra's paper is somewhat vague, because he described the resulting structure of a particular operating system's design. One has to extract the general design ideas out of the system description. Dijkstra described the idea of levels of abstraction in more general terms the next year in his 1968 paper, "Complexity Controlled through Hierarchical Ordering of Function and Variability":

[software] ... can be regarded as structured in layers. We conceive an ordered sequence of machines: A[0], A[1], ..., A[n], where A[0] is the given hardware machine ... the software of layer i will use some of the resources of machine A[i] to provide resources for A[i+1]: in machine A[i+1] and higher these used resources of machine A[i] must be regarded as no longer there! Phrasing the structure of our total task as the design of an ordered sequence of machines provided us with a useful framework in marking the successive stages of design and production of the system. [Dijk 68b:114]

Dijkstra's papers were the first to conceptualize software as layers of abstract machines bridging the gap between an operating system and a given machine. Although this idea was applied to an operating system, it can easily be applied to application software, to bridge the gap between a problem and the implementation language.

As mentioned in top-down design (Section 2.2), Dijkstra presented the idea of "joint refinement" of abstract statements and abstract data structures in 1969 [Dijk 69]. This idea added the refinement of abstract data structures to the layered approach.

The key to distinguishing levels of abstraction from a hierarchy of modules resulting from top-down design is that an abstract layer deals only with functions or data in its own level or the level immediately below it, whereas a module in a hierarchy could conceivably call a module several levels below it.
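The layering rule can be sketched in code. In the hypothetical C fragment below (the names and the in-memory "disk" are invented for illustration; Dijkstra's system was, of course, not written this way), level 1 is built only out of level-0 resources, and level 2 is built only out of level-1 resources.

    #include <stdio.h>
    #include <string.h>

    /* Level 0: the "given machine" -- raw block storage,
       simulated here by an in-memory array.                   */
    static char disk[16][64];
    static void block_read(int b, char buf[64])        { memcpy(buf, disk[b], 64); }
    static void block_write(int b, const char buf[64]) { memcpy(disk[b], buf, 64); }

    /* Level 1: records, built only out of level-0 blocks.
       Above this level the blocks are, in Dijkstra's phrase,
       "no longer there".                                       */
    static void record_put(int rec, const char *text)
    {
        char buf[64];
        strncpy(buf, text, 63);
        buf[63] = '\0';
        block_write(rec, buf);              /* level 0 only */
    }
    static void record_get(int rec, char *out)
    {
        char buf[64];
        block_read(rec, buf);               /* level 0 only */
        strcpy(out, buf);
    }

    /* Level 2: application code, built only out of level-1
       records; it never calls block_read or block_write.      */
    int main(void)
    {
        char text[64];
        record_put(0, "layered design");
        record_get(0, text);                /* level 1 only */
        printf("%s\n", text);
        return 0;
    }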

2.4 Program Families

Program families is the idea that a program should be conceived as a family of related programs rather than as an individual program. Closely related programs are distinguished from one another by the implementation decisions they contain. Each design decision made during program development eliminates some family members. Eventually (in the last refinement), a single family member is produced.

The program family view keeps one aware of the limitations imposed (families excluded) by design decisions implemented at each hierarchical level. Thus, high-level decisions should be general and retain many related family members, while low-level decisions should be problem specific.

In 1969, Dijkstra first suggested the program family viewpoint by explaining:

... a large program will, over its lifetime, exist in a multitude of versions, so that during the composition process we should view our composition as a family of related programs rather than as a single program. [Dijk 69:44]

He clarified this by saying that various members of a family should have common (high-level) structure. Thus, high-level modules should be generalized, not detailed.

Parnas further explored the program family idea in 1976 [Parn 76]. Parnas, like Dijkstra, stated that program development should be conceived as developing a family of related programs, rather than as a single program. Parnas explained that it is design decisions that distinguish program families. Decisions that are shared by the family should be made early in the stepwise refinement process. Decisions that differentiate the family members should be contained in low-level modules (or made later in the stepwise refinement process). Thus, one conclusion Parnas reached based on the family viewpoint was that a new version of a program should come from a common ancestor (a common higher level), rather than from modifying version 1 into version 2.

Parnas explained that stepwise refinement and "module specification" are two techniques for developing program families. Stepwise refinement was discussed in Section 2.2. Parnas described "module specification" as identifying those design decisions that cannot be common to the family and creating a module to hide those decisions. This technique can be used during early stages of design to create low-level modules which can be used by high-level modules or the main program. As Parnas notes:

... the concept of program families provides one way of considering program structure more objectively. One may ask which programs have been excluded and which still remain. [Parn 76:8]
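One way to make the family view concrete, sketched below in C with invented names (neither Dijkstra nor Parnas proposed this particular mechanism), is to keep the common high-level structure fixed and isolate a differentiating design decision in a low-level function selected at build time; choosing one definition excludes the family members the other would have produced.

    #include <stdio.h>

    /* Common ancestor: the high-level structure shared by every
       member of this (hypothetical) family of report programs.  */
    static double ambient_reading(void);   /* differentiating decision */

    int main(void)
    {
        printf("reading: %.1f\n", ambient_reading());
        return 0;
    }

    /* Low-level decision: each definition yields a different
       family member. Compiling with -DFAMILY_FAHRENHEIT selects
       member A; otherwise member B is produced.                  */
    #ifdef FAMILY_FAHRENHEIT
    static double ambient_reading(void) { return 72.0; }   /* member A */
    #else
    static double ambient_reading(void) { return 22.2; }   /* member B */
    #endif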

2.5 Abstract Data Types

An abstract data type is a definition of a class of abstract data structures and its associated abstract operations. An example of an abstract data type is shown in Figure 2.2. This example is a definition of a class of objects of type stack. The programmer must assign an identifier to this type to create an object of type stack.

Abstract data types provide a way for abstract data, and the abstract operations which use that data, to be hidden in a single construct. They also allow users to define useful data types which are not present in the language being used. Abstract data types fully facilitate levels of abstraction by allowing abstract objects to be described in terms of other abstract objects, and they are used just like any other built-in language type.
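Figure 2.2 shows the idea in the CLU-style notation of Liskov and Zilles. For comparison, a minimal sketch of the same stack type in C follows (the names are invented; C has no abstract data type construct, so the hiding is by convention, and error handling is omitted for brevity):

    #include <stdlib.h>

    /* The interface: the only assumptions a user of Stack may make. */
    typedef struct stack Stack;            /* representation hidden   */
    Stack *stack_create(int capacity);
    void   stack_push(Stack *s, int v);
    int    stack_pop(Stack *s);
    int    stack_empty(const Stack *s);

    /* The implementation: in practice this would live in a separate
       file, so that the array representation could be replaced (by
       a linked list, say) without touching any user of the type.    */
    struct stack { int tp; int *stk; };

    Stack *stack_create(int capacity)
    {
        Stack *s = malloc(sizeof *s);
        s->tp = 0;
        s->stk = malloc(capacity * sizeof *s->stk);
        return s;
    }
    void stack_push(Stack *s, int v) { s->stk[s->tp++] = v; }
    int  stack_pop(Stack *s)         { return s->stk[--s->tp]; }
    int  stack_empty(const Stack *s) { return s->tp == 0; }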

According to Liskov [Lisk 74], the development of abstract data types was based on three previous areas of work: extensible languages, standard abstract operations, and the notion of classes. Another influence, which was a continuation of standard abstract operations, was information hiding.

stack: cluster(element_type: type) is push, pop, top, erasetop, empty;

    rep(type_param: type) = (tp: integer;
                             e_type: type;
                             stk: array[1..] of type_param);

    create
        s: rep(element_type);
        s.tp := 0;
        s.e_type := element_type;
        return s;
        end

    push: operation(s: rep, v: s.e_type);
        s.tp := s.tp + 1;
        s.stk[s.tp] := v;
        return;
        end

    pop: operation(s: rep) returns s.e_type;
        if s.tp = 0 then error;
        s.tp := s.tp - 1;
        return s.stk[s.tp + 1];
        end

    top: operation(s: rep) returns s.e_type;
        if s.tp = 0 then error;
        return s.stk[s.tp];
        end

    erasetop: operation(s: rep);
        if s.tp = 0 then error;
        s.tp := s.tp - 1;
        return;
        end

    empty: operation(s: rep) returns boolean;
        return s.tp = 0;
        end

end stack

Figure 2.2 Abstract Data Type - Stack
Source: [Lisk 74:54]

Extensible languages are languages that allow programmers to define their own data types. SIMULA 67 (1967) and PASCAL (1970) are examples of early extensible languages. In general, new data types are constructed by combining primitive types of the language. COBOL (1961) was the first language to use "record" data types that contained user named components of more than one type [Wegn 79]. However, COBOL data type elements were limited in variety. An early pioneer of extensible languages was Douglas Ross and his AED-1 language [Wirt 66]. Ross described what he termed "n-component elements" and "plex structures" as early as 1961:

Stated briefly, an n-component element is a single unit of information about a problem, which specifies in each of its components one attribute or property of the element. Plex structures are an interconnected set of n-component elements. [Ross 63:306]

The contents of n-component elements were accessed by alphabetical indices (i.e., A(B) obtains component A of element B). Thus, one had to remember what is contained in each component to access it correctly. Hoare improved on this access method by making data attributes accessible by meaningful identifiers [Hoar 65]. For example, Salary(John) would access John's salary. Hoare and Ross used the reference concept (pointers) to allow the expression of relationships between objects.

Extensible languages provided a means of creating richer abstract data types, but they did not combine these types with operations acting on them in a single program unit. The idea of linking the operations on data types with the data types themselves is derived from 1967 papers by Mealy [Meal 67] and Balzer [Balz 67] (according to [Lisk 74]).

Mealy presented the view that a set of data can be modeled as a mapping from a set of selectors to a set of values, and that operations on the data either use the map to access parts of the data or redefine the map. Thus, Mealy presented a more abstract view of operations on data.

Balzer presented the concept he termed "dataless programming":

The programmer should be able to construct his program in terms of the logical processing required without regard to either the representation of data or the method of accessing and updating. This concept we call "Dataless Programming." [Balz 67:535]

To achieve this concept, Balzer proposed the use of a single canonical form for all data and function references that specified what data the system was to retrieve or store. The implementation of data, and of operations that use that data, was hidden in what Balzer called a "data collection." To modify or access data in a "data collection," Balzer proposed the use of a standard set of abstract operators (delete, insert, replace, and add) to operate on all data collections.

Balzer's idea, that a standard set of operations could manipulate all data types, was too limiting for all applications [Lisk 74]. But his notion that data, and operations on that data, should be specified together in one program structure to hide representation details was a major step towards abstract data types. Balzer had the right idea of hiding a data representation in a single form but optimistically tried to apply it to all the abstract types of a program.

Another impetus to abstract data types was the concept of classes. The class concept was first introduced by Hoare in his 1965 article entitled "Record Handling" [Hoar 65]. The class concept provided a means of defining many objects by a single class definition. Abstract data types use this mechanism to define many individual types rather than a single type.

In 1967, SIMULA 67 introduced the class construct to programming languages [Wegn 79]. SIMULA 67 was designed by Ole-Johan Dahl, Bjorn Myhrhaug, and Kristen Nygaard and implemented by members of the Norwegian Computing Center [Birt 73]. SIMULA's classes are similar in structure to abstract data types but lack the notion of information hiding (i.e., SIMULA allows local data to be directly accessed by an outside object via "remote accessing"). Despite this flaw, SIMULA objects went beyond earlier extensible languages by encouraging a separation of the definition of abstract data from its implementation. SIMULA also allowed abstract actions to be defined as part of a record class. Thus, SIMULA's creators had conceived of combining abstract data with abstract operations at about the same time as Mealy and Balzer. However, a published description of SIMULA objects was not produced until 1968 [Dahl 68].

The idea of information hiding, introduced by David Parnas in 1971 (see Section 3.2), stated that a data structure and the operations on that structure should always be hidden together in a single module [Parn 72]. Whereas Balzer had proposed that all data structure implementations be hidden in a single module, Parnas advocated that each data structure should hide its implementation and modification details within a single module. This is exactly the intention of abstract data types.

Two years after Parnas's information hiding idea, in 1973, papers by James Morris [Morr 73] and J. Palme [Palm 73] were among the first to discuss protection methods for SIMULA's class construct. These protection techniques essentially allowed the creation of abstract data types, but the true spirit of abstract data types (i.e., that the type implementation should be hidden) was not expressed. For example, Morris presented a mechanism which allowed the programmer to indicate whether or not an abstract data type was to be protected.

The first formal presentation of abstract data types was given by Barbara Liskov and Stephen Zilles in their 1974 paper, "Programming with Abstract Data Types" [Lisk 74]. In this paper, they described a programming language which permitted the use and definition of abstract data types. They also gave an example of an abstract data type definition and usage for a stack (Figure 2.2).

Recent research on abstract data types has focused on their specification, implementation, and verification [Shaw 84]. John Guttag has proposed that abstract data types be specified by algebraic axioms prior to their implementation [Gutt 77]. Algebraic axioms give meaning to an abstract data type independently of how it is to be implemented. For example, in the specification of the abstract data type Queue (Figure 2.3), there are no assumptions about type "Item", the element type of the queue. The queue could be implemented using a variety of underlying structures (e.g., arrays, linked lists, etc.). Thus, the queue specification is a type schema rather than a single type definition.

NEW:      -> Queue
ADD:      Queue x Item -> Queue
FRONT:    Queue -> Item
REMOVE:   Queue -> Queue
ISEMPTY?: Queue -> Boolean

1) ISEMPTY? (NEW) = true
2) ISEMPTY? (ADD(q,i)) = false
3) FRONT (NEW) = error
4) FRONT (ADD(q,i)) = if ISEMPTY? (q) then i else FRONT(q)
5) REMOVE (NEW) = error
6) REMOVE (ADD(q,i)) = if ISEMPTY? (q) then NEW else ADD(REMOVE(q),i)

Figure 2.3 Axiomatic Queue Specification
Source: [Gutt 77:201]
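Because the axioms make no reference to an implementation, they can be exercised against any candidate implementation. The sketch below (a deliberately naive, never-mutated linked list in C; the interface names are invented, not Guttag's) encodes axioms 1, 2, 4, and 6 as runtime assertions; memory leaks are ignored for brevity.

    #include <assert.h>
    #include <stdlib.h>

    typedef struct node { int item; struct node *next; } Queue;

    Queue *q_new(void) { return NULL; }

    /* ADD appends at the rear, rebuilding the list so that the
       original queue value is never mutated.                    */
    Queue *q_add(Queue *q, int i)
    {
        Queue *n = malloc(sizeof *n);
        n->item = (q == NULL) ? i : q->item;
        n->next = (q == NULL) ? NULL : q_add(q->next, i);
        return n;
    }

    int    q_isempty(Queue *q) { return q == NULL; }
    int    q_front(Queue *q)   { assert(q != NULL); return q->item; }
    Queue *q_remove(Queue *q)  { assert(q != NULL); return q->next; }

    int main(void)
    {
        Queue *q = q_add(q_new(), 1);   /* a sample non-empty queue */
        int i = 2;

        assert( q_isempty(q_new()));                       /* axiom 1 */
        assert(!q_isempty(q_add(q, i)));                   /* axiom 2 */
        assert( q_front(q_add(q, i)) ==
                (q_isempty(q) ? i : q_front(q)));          /* axiom 4 */
        /* axiom 6, non-empty case; comparing fronts is a weak
           proxy for full queue equality                           */
        if (!q_isempty(q))
            assert(q_front(q_remove(q_add(q, i))) ==
                   q_front(q_add(q_remove(q), i)));
        return 0;
    }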

Describing abstract data types in an implementation independent manner can be used as an intermediate first step in implementing abstract data types and in informal and formal proofs. Axiomatic specifications can also be used to hide implementation details from programmers, as advocated by Parnas [Parn 72]. Problems with the axiomatic approach are that it is difficult to tell when the axioms are complete (all cases defined), that performance constraints cannot be specified, and that they require formal training to be produced and read.

Recent research on the implementation of abstract data types deals with selecting a best implementation among many possible implementations. Choosing a best implementation is determined by analytic methods that give performance measures for the different implementations. White argues that the selection of an implementation should be based on the frequency with which operations of the type are used [Whit 83]. He also contends that performance may be enhanced, if this frequency varies, by using multiple implementations. Present weaknesses of these analytic methods are the necessary use of overly complex mathematical expressions and the numerous assumptions which must be made to simplify them.

The practical usefulness of abstract data types is still an unanswered question because successful applications on a large scale have not been substantiated [Wirt 79]. Wirth suggests that the lack of proven implementations may be a result of missing implementation facilities and that it may be inherently difficult to create a useful abstract data type for a given problem [Wirt 79].

2.6 Summary

Abstraction is "the" major tool of program design.

The use of abstraction reduces program complexity by allowing one to focus from a particular frame of reference at each stage of development. Abstraction has been applied in many ways because it is a very general design idea. The specific design ideas that use abstraction are 33 " '

(chronologically): subroutines (action abstraction), top-down design (program development abstraction), levels of abstraction {resource abstraction), program families (decision abstraction), and abstract data types (object-­ data and actions). Abstract data types are the most recent usage of abstraction in program design. They simultaneously provide the concepts of modularity and levels of abstraction. 0 '

Chapter 3 MODULARITY

The meaning of modularity is somewhat nebulous because several slightly different definitions exist in the literature. The numerous definitions are partly due to the evolving meaning of the term "module." Initially, the term "module" was used to describe a callable sequence of statements (i.e., a subroutine). Modularity meant programs composed of or created from modules. It was then learned that there were good and bad ways of dividing a program into modules [Grie 79]. This is because module creation also determines the interdependence of modules on one another, and this interdependence, if slight, results in good modules and, if great, results in poor modules (with respect to program modifiability). As a result, the meaning of modularity also entails a description of the interrelationship of modules to one another. This relationship is often described in terms of goals or objectives: "The primary goal ... should be to decompose the program in such a way that the modules are highly independent from one another" [Myer 73:100]. Module independence means:

... well defined interfaces such that each module makes no assumptions about the operation of other modules except what is contained in the interface specified. [Earl 73:1]

In other words, a module is a callable segment of code which has a conceptual bullet-proof shield and can only be accessed via its parameters. Module independence can also mean independence on a more physical level. That is, modules have also been described as subprograms that can be compiled separately [Lisk 72]. Modularity also means:

One must be able to convince himself of the correctness of a program module, independently of the context of its use in building larger units of software [Denn 73:129].

Since modularity is defined in terms of goals, its meaning has become an ideal which we strive to reach.

3.1 Early Modularity

The beginning of modularity is marked by the invention of the subroutine. In 1945, Zuse described notations for subroutines, but hardware was not yet available to support his ideas [Knut 78] and his work was not published until 1972. However, Zuse's paper showed that the subroutine concept had been conceived as early as 1945 (prior to stored program computers)!

Independent of Zuse, Wheeler invented the closed subroutine in 1949 [Wilk 80]. The closed subroutine for the first time allowed the creation of separate program units which could be written once and reused. Parameters were passed to allow different computations. Prior to this, the only type of subroutine available was the open subroutine, in which a sequence of commands was inserted into the main program text. The closed subroutine allowed programs to be built out of modules, but it was not used or thought of in that sense in 1951. It was considered a labor-saving device, saving the time involved in recoding a commonly used function for another program. Wilkes explains:

The labor of drawing up a program for a particular problem is often reduced if short, ready-made programs for performing the more common computing operations are available. These short programs are usually called subroutines, and they may be incorporated as they stand in the program, thus reducing the amount of work which has to be done ... [Wilk 51:1]

Subroutines were made available to other programmers by placing them in subroutine libraries. The perceived advantage of library subroutines, which have been thoroughly checked out, was that they limited the possibilities of mistakes to new code, thus reducing debugging time.

In 1950, Curry described program parts as independent, thus being one of the first to touch on an important characteristic of modularity. Curry explains:

The first step in planning the program is to analyze the computation into certain main parts, called here divisions, such that the program can be synthesized from them. Those main parts must be such that they, or at any rate some of them, are independent computations in their own right, ... [Curr 50:paragraph 54].

Curry's article was not published until 1972, but his paper showed that the concept of a program being organized into independent parts had been conceived as early as 1950.

In 1952, Wheeler stated: "When a program has been made from a set of subroutines the breakdown of the code is more complete than it would otherwise be" [Whee 52:186]. Wheeler's comments, although advanced for 1952, are too vague to determine exactly what he meant. The notion of a program being made up from a set of subroutines sounds very modular, but it is not clear if he means the program is composed entirely of subroutines. The description, "the breakdown of the code is more complete," is too vague to give much meaning.

Throughout much of the 1950's, subroutines remained standard functions to be used [Dijk 72a]. By the early 1960's, with the emergence of FORTRAN (1954-1958) and ALGOL 60 (1957-1960), the concept of modularity arose. These languages were the first high-level languages and their structures (procedures and functions) reflected current thoughts on programming. Dijkstra explains:

... when ALGOL 60 emerged, the scene changed and we did not talk any more about closed subroutines: we called them "procedures" instead. They remained to be appreciated by the programmer as a very handy means for shortening the program text, and more and more programmers started to use them for the purpose of structuring, so that program adaptation to foreseen changes in problem specification could be confined to the replacement of one or more procedure bodies, or to a procedure call with some actual parameters changed. [Dijk 72a:47]

In 1957, as part of a book review, Baker described some of the attributes of the module:

One technique that is widely used is to break the problem into small, self-contained subroutines, trying at all times to isolate the various sections of coding as much as possible. [Bake 57:304]

The terms "self-contained" and "isolated" described modularity more accurately than before, but more importantly, Baker says "[to create modules] as much as possible." Thus, by 1957, programs were beginning to be thought of as composed entirely of modules. Prior to this, a program called a few subroutines for convenience. Another aspect often associated with modularity, 1s the ability of a module to be compiled separately from the rest of the program. Separately compilable subroutines were first implemented 1n 1958, in FORTRAN II [Back 81]. This feature .meant that if a change was localized to a particular subroutine, the entire program did not have to be recompiled. 39 (l

The term "modular" dates back to 1960, in an unpublished paper entitled "Modular Programming," by Robert Bizzell [Bizz 60]. The first published use of the term "modular" was Emery's 1962 paper, "Modular- Data Processing Systems Written in Cobol" [Emer 62]. In this article, Emery defined modular as segmented programs; i.e., programs that are finely divided or made up of many segments. These segments were implemented as subroutines, usually called from a subroutine library. The new advantage to this modular approach was that it added flexibility in altering programs. Emery explains:

In order to allow for great flexibility in systems design, the library of modules available to the systems designer must be largely independent of one another. In this way the presence or absence of one module will not affect the other modules. Because of this independence, modifications and extensions to the module library are relatively easy to make. [Emer 62:263]

Although Emery stated that modules should be independent, he did not describe in any detail how this independence was to be achieved. Emery's paper contained module hierarchy charts (Figure 3.1) which showed that by 1962 programs were conceived as being made-up entirely of modules. Thus, the module concept had made the transition from being thought of as a useful sequence of code to components in system 40 design. Emery recommended creating modules for portions of a task likely to change and in general to simplify higher level processing.

[Chart showing the module FORECASTING-AND-INVENTORY-CONTROL at the top, with submodules INVENTORY-CALCULATIONS, ANALYSIS, FORECAST-CALCULATIONS, SEASONAL-ADJUSTMENT, and TREND-ADJUSTMENT beneath it.]

Figure 3.1 Emery's Module Hierarchy Chart
Source: [Emer 62:264]

Emery also suggested that modules dealing with the same data should exist as submodules of the same higher-level modules. This idea could be considered a predecessor to the idea of abstract data types. Emery's concept of modularity was somewhat different from its current meaning; he conceived of modules as standard building blocks from which a tailored system could be constructed by selecting desired module functions. Emery borrowed the modular idea from the construction industry [Emer 84].

The use of library subroutines as program building blocks has been only partially successful because documentation is often inadequate and because parameters do not always allow a wide variety of uses [Shaw 84].

In 1965, Constantine characterized modularity as a "black box" with minimized interdependencies and restricted inter-module communication [Cons 65]. The term "black box" (an object of known functionality with unknown internal workings) gave, for the first time, the notion of a shell to modularity. Minimized interdependencies and restricted inter-module communication were descriptions that finally zeroed in on what had previously been described as module independence.

Three years later, in 1968, Constantine introduced the terms "coupling" and "cohesion" [Cons 68] (according to [Myer 78]), which helped explain module independence. Coupling is a measure of the strength of association between two modules. Cohesion is a measure of the strength of relationships between elements of the same module. These terms brought to light the programming specifics of module independence. In describing the degree of coupling, for example, Constantine explained that it is determined by the complexity of the module interface, by the type of connection (how called), and by the type of information passed (data, control, or a combination of the two). Cohesion helped focus on the purpose of the module elements. That is, modules that performed a single function were considered the most cohesive. These ideas were forerunners to the idea of information hiding because they described what had been meant by module independence (Section 3.2).
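The coupling distinction can be seen in a two-function example (hypothetical code, not Constantine's): the first routine below is coupled to its caller only through data, while the second is control-coupled, because the flag transmits a decision about the module's internal cases rather than data to act on.

    /* Data coupling: the caller passes only the data to operate on. */
    double fahrenheit_to_celsius(double f)
    {
        return (f - 32.0) * 5.0 / 9.0;
    }

    /* Control coupling: the flag tells the module which of its
       internal cases to execute, so the caller must know about
       them -- a stronger, less desirable association.            */
    double convert(double t, int to_celsius)
    {
        return to_celsius ? (t - 32.0) * 5.0 / 9.0
                          : t * 9.0 / 5.0 + 32.0;
    }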

In 1969, Dijkstra described the module concept by using the analogy of a module as a pearl:

As each pearl embodies a specific design decision (or, as the case may be, a specific aspect of the original problem statement) it is the natural unit of interchange in program modification (or, as the case may be, program adaptation to a change in problem statement). [Dijk 69:47]

Dijkstra's description clarified the reason for module creation: to embody a specific design decision or problem aspect. The analogy of the module as a pearl again gave the module a protected quality. This notion became crystallized by information hiding.

3.2 Information Hiding

Information hiding is the idea that a module hides all information about its internal workings to as great an extent as possible from all other modules. The only information that is revealed is through the module interface (i.e., parameters that are passed). Information hiding also says that these parameters are to reveal as little as possible about how the module carries out its function(s).

Perhaps the earliest notion of information hiding was the idea of "Dataless Programming" expressed by Balzer in 1967 [Balz 67]. Balzer explains "dataless programming":

It conceives of a program as the specification of a set of manipulations to be performed on a set of data values, and that this specification should be independent of the form in which these data values are represented. To achieve such independence, there must be a set of declarations that tell the programming system how to retrieve and store data values from the particular representation being used. [Balz 67:535]

Balzer's idea was to isolate data and actions on that data in a data collection that is separate from the source program. The program could access data by referencing the collection name, along with a standard operation (delete, insert, replace, add). Using this scheme meant that changing a data structure from an array to a linked list did not affect the source text, only the data declaration. The use of a standard set of operators for all data types has not proved feasible, but Balzer's idea of hiding data implementation details was a predecessor to information hiding.

A more general view of information hiding was first presented by Parnas in 1971 [Parn 71]. Parnas explained that the less information module connections contain, the better. He defined module connections as the assumptions modules make about one another. Parnas explained that the reason for limiting the amount of information in connections is to facilitate correctness proofs and to allow modules to be easily modified.

A second point Parnas made was that programmers developing particular modules should be given only the information they need to implement those modules (i.e., limited information instead of all available information on the system to be built). In this way, a programmer cannot make any unnecessary assumptions about other modules. It is interesting to note that the idea of information hiding presented by Parnas in 1971 [Parn 71] is slightly different from its current view. Parnas said a module's developer (and therefore the module) may only know of limited information, rather than saying that a module should hide its information from all other modules. Whether each module knows of only limited information or each module hides that information from other modules, the result is the same: information is hidden within modules.

In 1972, one year later, Parnas wrote "On the Criteria to be Used in Decomposing Systems into Modules" [Parn 72]. In this paper Parnas stated: "The effectiveness of modularization is dependent upon the criteria used in dividing a system into modules." [Parn 72:1053]. He says the criterion that should be used is information hiding. By information hiding he meant that a module should wholly contain and hide from other modules all knowledge of a particular design decision. This means


that other modules can make no assumptions about the hidden design decision except through parameters. Parnas proposed that modules be created on the basis of design decisions which are likely to change. A list of such decisions is made and modules are created to hide those decisions. In this way changes can easily be made within a single module. Parnas also explained that the interface to a module should be chosen to reveal as little as possible about how it works. That is, to use generalized parameters and to use as few as possible. In addition to the general criterion of information hiding for modules, Parnas suggested the following specific advisable decompositions:

1) A data structure, its internal organization, accessing procedures, and modification procedures should be part of a single module (the idea behind abstract data types);

2) The sequence of instructions necessary to call a given routine and the routine itself should be part of a single module;

3) The formats of control blocks for queues in operating systems should be hidden in a single module;

4) Character codes, alphabetic orderings, and similar data should be contained in a single module; and

5) The order of module execution should be hidden in a single module.

Parnas argued that the traditional method of decomposition, making each major processing step a module, is incorrect, because in most cases design decisions transcend time of execution. The purpose of information hiding is to limit program modification to a single module if a design decision must later be changed. In summary, Parnas' information hiding idea was the missing refinement to the meaning of module independence. That is, not only should a module contain a specific aspect of a problem, but no other module should know of that aspect by any means other than through parameters.
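The first of Parnas' advisable decompositions can be made concrete with a small sketch (a modern Python rendering added for illustration, not from the period literature; the names are hypothetical). The representation of the stack, here a list, is a design decision hidden inside one module; callers see only the operations, so the representation could later be changed to a linked structure without modifying any other module:

    class Stack:
        """Hides the design decision of how elements are stored."""
        def __init__(self):
            self._items = []          # hidden representation

        def push(self, item):
            self._items.append(item)

        def pop(self):
            return self._items.pop()

        def is_empty(self):
            return len(self._items) == 0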

3.3 Object-Oriented Programming

Object-oriented programming is programming that consists solely of creating objects and facilitating communication between those objects. The object-oriented system SMALLTALK-80 [Gold 83] is discussed in this section to present a current view of object systems. An object is the basic component of an object-oriented software system and represents everything from numbers to programs. Objects consist of protected data and a set of operations. Once an object is created, its variables and operations remain in existence for the lifetime of the object. An object's variables are local to itself and can only be accessed via messages.


In object-oriented programming, operators and operands are replaced by messages and objects. A message is a request for an object to carry out one of its operations. A message consists of a selector and possibly some arguments. A selector is a literal name for the type of operation desired from the receiver (the object receiving the message). Arguments are additional information used by the receiver to carry out the desired operation. In object-oriented systems, messages (their selectors) determine which operation the receiving object will perform. For example, in the statement:

    3 + 4

the receiver is the object "3", the selector, "+", indicates that addition is desired, and the argument, "4", is the number to be added to the receiver. The objects of object-oriented systems facilitate modularity and information hiding because local data may only be accessed by messages.

In object-oriented programming, all objects are instances of a class. A class is a definition of a set of similar objects which acts like a template in producing individual objects. A class definition is composed of the class name, a declaration of variables, and a definition of the class methods. Each method consists of a message pattern (selector and dummy variables) and a sequence of

operations. When an initialization method of a class is executed, a new object, or instance of the class, is produced. The following example is a partial description of the class named FinancialHistory in SMALLTALK-80 [Gold 83]:

    class name            FinancialHistory
    instance variables    cashOnHand incomes expenditures

    instance methods
    transaction recording
        <receive: amount from: source>
            incomes at: source
                put: (self totalReceivedFrom: source) + amount.
            cashOnHand <- cashOnHand + amount
        <spend: amount for: reason>
            expenditures at: reason
                put: (self totalSpentFor: reason) + amount.
            cashOnHand <- cashOnHand - amount
    inquiries
        <cashOnHand>
            ^cashOnHand
        <totalReceivedFrom: source>
            (incomes includesKey: source)
                ifTrue: [^incomes at: source]
                ifFalse: [^0]
        <totalSpentFor: reason>
            (expenditures includesKey: reason)
                ifTrue: [^expenditures at: reason]
                ifFalse: [^0]
    initialization
        <initialBalance: amount>
            cashOnHand <- amount.
            incomes <- Dictionary new.
            expenditures <- Dictionary new

The instance variables of this class are "cashOnHand," "incomes," and "expenditures." The subheadings "transaction recording," "inquiries," and "initialization" are method category descriptions and do not affect or subdivide the methods. The instance methods are listed beneath each selector (the selectors are enclosed in angle brackets). Dummy arguments act as place holders and are separated from the selector by a colon. The details of the methods are not important to the object concept.
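For readers unfamiliar with SMALLTALK syntax, a rough modern equivalent of the same class may clarify its structure (an illustrative Python translation added here, not part of the original SMALLTALK-80 description). Each message pattern becomes a method, and the instance variables remain hidden inside the object:

    class FinancialHistory:
        def __init__(self, amount):              # <initialBalance: amount>
            self._cash_on_hand = amount
            self._incomes = {}
            self._expenditures = {}

        def receive_from(self, amount, source):  # <receive: amount from: source>
            self._incomes[source] = self.total_received_from(source) + amount
            self._cash_on_hand += amount

        def spend_for(self, amount, reason):     # <spend: amount for: reason>
            self._expenditures[reason] = self.total_spent_for(reason) + amount
            self._cash_on_hand -= amount

        def cash_on_hand(self):                  # <cashOnHand>
            return self._cash_on_hand

        def total_received_from(self, source):   # <totalReceivedFrom: source>
            return self._incomes.get(source, 0)

        def total_spent_for(self, reason):       # <totalSpentFor: reason>
            return self._expenditures.get(reason, 0)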

When the message "initialBalance: 200" is sent to an object of the class "FinancialHistory," the above selectors are searched. Since "initialBalance:" is one of the possible methods, a match of this selector is found, and the three statements below it are executed. The execution of these statements causes the allocation of the instance variables. In an object-oriented system, many classes exist as part of the standard functionality of the programming language and its environment. Programming in an object-oriented system consists of creating classes, creating instances of classes, and specifying sequences of message exchanges among the objects.

The history of objects originated in programming languages of the late 1950's. In assembly languages and FORTRAN, a variable (local to a subroutine) retained, when the subroutine was called a second time, the value it received from its previous call. Although unintentional, a variable's value was technically accessible from one call to the next. The local variable had already acquired an important quality of objects -- it didn't disappear after being called. In 1960, ALGOL 60 introduced the "own" variable [Wegn 76]. The "own" variable made it possible for selected variables declared as "own" to retain their value after their block had been exited. The "own" was the first intentional use of local variables as permanent objects.

The concept of objects was first fully developed in 1967 by the programming language SIMULA 67. SIMULA was the first language to introduce the class, object, and instance concepts. In SIMULA, objects are created by instantiating an instance of a class. These objects remain in existence until they are no longer referenced. SIMULA objects were the first to allow abstract data, and operations on that data, to coexist in a single language construct that survived beyond being called. However, SIMULA's objects have a shortcoming in that they allow their local data to be accessed by other objects via a method known as "remote accessing." For example, the internal data of an object named OBJ could be accessed by another object by specifying OBJ.DATANAME. This deficiency is the one major aspect of object-orientedness that SIMULA lacked. By allowing local data to be directly accessed by other objects, implementation details could spread throughout the program. Abstract data types, developed in 1974, corrected SIMULA's hiding deficiency by no longer allowing direct outside access of local data. Abstract data types and their history were presented in Section 2.5.

The SMALLTALK language is a recent development in object-oriented systems. SMALLTALK was originally developed by Alan Kay as a first step in his Dynabook project goal of extremely flexible, human-oriented computer systems [Kay 82]. Kay explains SMALLTALK:

The major influences on the design were SIMULA and FLEX, but a few deep properties of LISP and PLANNER were also incorporated. SMALLTALK is simply organized as a universe consisting solely of objects in communication ... [Kay 82:211]

In SMALLTALK, there is no distinction between data and the procedures that manipulate data. SMALLTALK went through a series of refinements from 1976 to 1980 to reach its current status of SMALLTALK-80. Researchers at the Xerox Palo Alto Research Center, headed by Dan Ingalls, developed the SMALLTALK-80 object-oriented system. SMALLTALK, like abstract data types, added what SIMULA had lacked: local data that cannot be directly accessed by other objects.

SMALLTALK also differs from SIMULA and other programming languages by its dynamic (run-time) binding of operators and by its use of the message/object model rather than the conventional operator/operand model. The difference between these two models is a subtle one that results in a higher degree of information hiding by the message/object model [Cox 83].

In the operator/operand model, operators and operands are treated as if they are independent. When the operation A + B is to be executed in a module, it is this module's environment that determines how this operation will be implemented. That is, the environment (the module) explicitly declares A and B's types via type declarations. This determines how A + B will be executed (i.e., if A and B are integers then integer addition will be performed; if they are reals then real addition will be performed).

In contrast, when the message/object pair A + B is to be executed in the message/object model, it is the object A that determines how this message/object pair will be executed (i.e., object A's methods). If object A is an integer then integer addition is performed. If A is a real, then real addition is performed. In the message/object model, nothing has to be declared about the types of A and B in the module containing the message/object pair A + B. The significance of this difference is that in the operator/operand model, operator/operand interdependencies (type declarations) proliferate throughout the system. Each and every time the '+' operator is used, the types of the variables involved in addition must be declared. In the message/object model, operator/operand dependencies are not spread throughout the system; they are hidden within objects. This extra degree of information hiding is facilitated by the fact that object operations are bound at run time, not during compilation.
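The distinction can be made concrete with a small sketch (an illustrative modern example in Python, a language whose operations are dynamically bound in just this way; the class and function names are hypothetical). The function below declares nothing about the types of its arguments; the receiving object's own methods determine how "+" is carried out, so a new type can be added without changing the modules that use it:

    class Dollars:
        def __init__(self, cents):
            self.cents = cents

        # The receiver decides what "+" means for it.
        def __add__(self, other):
            return Dollars(self.cents + other.cents)

    def total(a, b):
        # No type declarations here: integers, reals, or Dollars
        # all work, because binding happens at run time.
        return a + b

    print(total(3, 4))                              # integer addition
    print(total(3.0, 4.0))                          # real addition
    print(total(Dollars(300), Dollars(400)).cents)  # object-defined addition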


3.4 Summary

The module concept has evolved from a useful sequence of commands, to basic building blocks of programs, and more recently to a highly defined object (data and actions) instance. Thinking of modules as objects is the latest evolutionary trend in modular programming. Objects contain a data type, as well as all related functions that operate on that type, hidden within a single construct. Objects may only communicate by predefined messages. Another characteristic of objects is that they remain in existence for the duration of the program. The meaning of the term module is still nebulous, but our perception of a useful one has been greatly clarified.

Chapter 4

STRUCTURED PROGRAMMING

Structured programming is a design term that, quite curiously, lacks a precise definition. One reason for this is that Dijkstra, the term's originator, never defined structured programming nor had he ever intended to do so:

I never intended to give a definition of when a program was "structured." I coined the term "structured programming" to label a scientific effort aimed at controlling/avoiding program complexity. My target was to discover how to keep programs intellectually manageable. [Dijk 84:1]

Dijkstra's explanation confirms some of the more general definitions given to structured programming. For example, Wirth described structured programming as

the expression of a conviction that a programmer's knowledge must not consist of a bag of tricks and trade secrets, but of a general intellectual ability to tackle problems systematically, and that particular techniques should be replaced (or augmented) by a method. At its heart lies an attitude rather than a recipe: the admission of the limitations of our minds. [Wirt 74:249]

A definition given by Bates accurately describes the spirit of Dijkstra's original meaning:

What was originally meant by the term structured programming was the philosophy of structuring programs in such a way as to make them more intellectually manageable and more amenable to a convincing proof of correctness. [Bates 76:1]

Currently, structured programming has a meaning on two levels. In a more general sense, it is constructing programs that are more understandable and therefore easier to prove. Structured programming is also a handful of distinct techniques for achieving this more general goal. These associated methods are: using only the three basic control structures (sequence, selection, and iteration, i.e., no GOTOs), top-down design and implementation, modularity, single-entry/single-exit modules, smaller more comprehensible modules, and indented code to show structure. These ideas have become associated with structured programming through general use (or misuse) of the term. The idea of chief programmer teams is often included in many definitions [Bate 76]; however, this concept is a management idea and is not discussed.

Structured programming is not actually a design idea. It is a programming philosophy which contains two design ideas (stepwise refinement and modularity), two control structure disciplines (the three basic control structures and single-entry/single-exit modules), and two techniques of good programming style (indented code and smaller modules). Since stepwise refinement and modularity are presented in other chapters, this chapter will concentrate on the evolution of the other ideas.

4.1 Genesis of Structured Programming

The earliest published remarks calling for structured programming were briefly stated by Dijkstra in 1962 [Dijk 62], [Dijk 84]. Dijkstra stated that it was time for languages to assist in program correctness and that the design goal of increased processor speed should be replaced by the goal of program reliability. However, he did not give any specific suggestions at this time.

The first presentation of many of the structured programming ideas can be found in Dijkstra's 1965 paper, "Programming Considered as a Human Activity" [Dijk 65a]. This insightful article presented a concern for program correctness, arguments against the GOTO statement, top-down design, and modularity.

Dijkstra proposed that program correctness is analogous to a mathematical proof and that the complexity of either is best handled by division of the problem (program or proof) into parts. Dijkstra explained that a programmer should develop a program by repeatedly dividing it into parts. He called this method the "dissection technique". Although this idea had been vaguely described prior to 1965, Dijkstra presented a more abstract view in which the dissection

technique was also a means of aiding in proofs of program correctness and of controlling program complexity. Dijkstra also explained his conviction that the ability to convince oneself of program correctness is greatly dependent on the clarity of the program: program clarity being the degree to which the program reflects the structure of the process performed. He stated that self-modifying code is one cause of lack of clarity. Another cause surfaced when he was told by two unrelated programming managers that the quality of their programs was inversely proportional to the density of GOTO statements in their programs. Dijkstra reported conducting experiments in which the GOTO statement was eliminated from programs and found that in all cases tried, programs written without GOTOs were shorter and more lucid. The reason the GOTO decreases program clarity, Dijkstra explained, is that it gives separate control over what we know as transfer of control (i.e., sequence, procedure call and return, conditional clause, and the FOR statement). Also, program clarity is decreased because the GOTO greatly increases the number of ways a program may fail to stop.

With all the good design ideas contained in this paper, one may wonder why programming industry habits were not immediately affected. It has been suggested that Dijkstra's article was simply not read because the IFIP Congress (where this paper was presented) did not attract a large reading audience, and that in 1965, getting hardware to work right was a much higher priority than worrying about GOTOs [Your 79].

4.2 GOTO Controversy

Many people [McCr 73], [Shaw 84] feel the beginning of structured programming was marked by Dijkstra's famous 1968 letter to the Communications of the ACM, "Go To Statement Considered Harmful" [Dijk 68a], because it incited a GOTO controversy. In this letter, Dijkstra warned that the unbridled use of the GOTO statement has potentially disastrous effects on programs. He even recommended its abolition from high-level languages. This was heresy to many Assembly and FORTRAN programmers of the time [Your 79]. The GOTO controversy had begun. In the conclusion of his letter, Dijkstra stated that the undesirability of the GOTO was not new, and he credited several people for influencing his thinking in this area, including P. Landin, C. Strachey, H. Zemanek, Hoare, and Wirth.

The GOTO debate reached a peak in 1972 with articles appearing both for [Hopk 72] and against [Wulf 72] the use of the GOTO statement. This debate was laid to rest by Knuth in 1974 [Knut 74]. The point brought out by the GOTO controversy was that when the three basic control structures (sequence, selection, and iteration) are used correctly, there simply isn't much need for the GOTO (except when efficiency is considered [Knut 74]).

The non-necessity of the GOTO had already been proven by Bohm and Jacopini in 1966 [Bohm 66]. They showed that it is possible to write any flowchart (and therefore any program logic) using only the three basic control structures. In other words, the GOTO statement is not needed (logically) at all.

According to Knuth, the first programmer to systematically avoid using GOTOs was perhaps D. V. Schorre [Knut 74]. In 1966, Schorre stated that since the summer of 1960, he had been writing programs in an outline form and that he never found it necessary to use GOTOs [Knut 74]. He reported that this method made programs easier to plan, modify, and test.

Knuth also noted that in 1964, Peter Naur had published remarks about how a GOTO that "looks back" is often a FOR statement in disguise, and that the clarity of an algorithm improves when the FOR statement is inserted where it belongs [Knut 74]. However, neither Schorre nor Naur explained why the GOTO was undesirable, nor did they say they were against its use. In fact, Naur's article was against the misuse of the GOTO, in particular, when another control structure is being simulated by it [Naur 63]. Dijkstra's letter was the first to say, don't use GOTOs. However,

these earlier remarks showed that the use of GOTOs had already been in question. The benefit to programs written without GOTOs is that they are much easier to read and understand. The path of control can be easily read from top to bottom because each of the three basic control structures has a single point of entry and a single point of exit.

In 1969, Dijkstra presented the paper entitled "Structured Programming" [Dijk 69]. This was the first published occurrence of the term "structured programming". Unfortunately, and this is a major source of the term's confusion, the title is the only place the words "structured programming" appear together in the article. By not giving an explicit definition, Dijkstra left the term's meaning open to interpretation. The theme of this paper, program composition techniques to assure correctness, was very similar to his earlier article, "Programming Considered as a Human Activity," however some of his earlier ideas were expanded. Dijkstra explained that program correctness via testing is out of the question for large programs because the number of inputs to test is too large and because, "Program testing can be used to show the presence of bugs, but never their absence!" [Dijk 69:44]. Dijkstra was more concerned with the correctness of large programs because correctness proofs had already been given for small programs, and because the amount of labor needed to prove large programs increases exponentially with program size.

So rather than first writing a large program and then setting about the difficult task of proving correctness, Dijkstra asked, "For what program structures can we give correctness proofs without undue labor, even if the programs get large?" [Dijk 69:44]. The answer, he said, is a rigid adherence to certain sequencing disciplines (namely, conditional and repetitive clauses and procedure calls), to allow stepwise abstraction from possible different routings. Based solely on the content of this article, the techniques presented under the structured programming title are: using only sequence, repetition, and conditional control structures (i.e., no GOTOs), stepwise refinement, and modularity. However, a concern for program correctness is the overriding purpose of these techniques.

One year later, in 1970, Mills added some new ideas to structured programming [Mill 71]. Mills, perhaps for the first time, defined structured programming as "a complex of ideas of organization and discipline in the programming process" [Mill 71:42]. He said it involves two major principles: 1) top-down programming and 2) the use of the three basic control structures. Mills added the idea of keeping program segments or modules to within a page in length (about fifty lines of code). He explained that, in this way, all control paths are visible and proving a program segment correct is reduced to at most one page. The idea of smaller, more comprehensible modules had already been mentioned in 1962 by Emery [Emer 62], but he did not propose a specific module size. Mills also proposed that program segments have only a single entry at the top and a single exit at the bottom. He explained that by following this rule there are no side effects on program control. This idea was very similar to the idea of eliminating GOTOs, except it was applied to procedure call and return statements. Also explained by Mills was the use of conventions to indent the body of code within control structures, to make control logic visually apparent. Knuth credits Schorre as being the first to use this convention in 1962 [Knut 74]. Mills explained that there were advantages to top-down implementation in addition to the benefits of top-down design. They are: 1) that problems of syntax and control logic were usually isolated within newly added segments; and 2) that once the top-level program skeleton is made, several programmers can work simultaneously on lower segments.

Although Dijkstra's GOTO letter created waves in academic circles, it did not make an impact on the programming industry of the time [Your 79]. This impact

came in 1972 with Terry Baker's article, "Chief Programmer Team Management of Production Programming" [Bake 72a] (according to [Your 79]). Baker's paper described how IBM used a highly skilled programmer team, a program production library, and structured programming to implement a highly successful large programming project for the New York Times.

This project was described as a huge success because of increased programmer productivity and greatly reduced coding error rates, much of which was attributed to structured programming. For the first time the idea of structured programming was given undeniable credibility by its use in a successful production environment, by IBM [Your 79]. As a result, structured programming became popular and an industry "buzz word".

Baker's article, however, presented structured programming and top-down design as separate ideas. This separation was a likely source of confusion to authors who would later try to define it. Also, Baker gave no reference to top-down programming and referenced Bohm and Jacopini as the source of structured programming. This may have been the reason many people believed that IBM had invented structured programming [Your 79]. In 1972, just eight months after his earlier article, Baker presented a follow-up paper on IBM's New York Times project [Bake 72b]. In this article, Baker corrected his earlier paper by describing structured programming as a coupling of top-down design and the use of the three basic control patterns, and by referencing Dijkstra and Mills as the source of these ideas.

During this same year, the book "Structured Programming" [Dahl 72] was published. In the first section of this book Dijkstra presented a synthesis of his structured programming ideas. Again he did not define structured programming. In December of 1973, Datamation magazine devoted its entire issue to the explanation of structured programming [Data 73]. The authors of four of the articles each gave a slightly different definition. McCracken said it was the use of the three basic control structures, single-entry/single-exit modules, and the use of indented code. Donaldson explained it as the use of the three basic control structures, short routine lengths, and indented coding. Miller and Lindamood described it as the three basic control structures, application of management techniques to top-down design and implementation, and levels of abstraction. And Baker and Mills said it was "a top-down sequence for program unit creation and testing, and a technical standard for the coding of each unit [i.e., no GOTOs]." [Bake 73:59]. The only idea in common in all the definitions was the use of the three basic control structures. This helps explain why structured programming has been referred to as "GOTO-less programming".

4.3 Summary

Structured programming is the philosophy that programs should be constructed of constructs that are easily understood and easily proven correct. Its meaning has been mystified due to a missing definition and a myriad of slightly different definitions. Most of its associated ideas were first described by Dijkstra in 1965 [Dijk 65a]. The meaning of structured programming may still be debatable, but its impact on programming is not. It has helped programming become more of a science. Program development, verification, control structures, and style have all been positively influenced by structured programming.

Chapter 5

CONCURRENCY

Concurrency is the simultaneous execution of two or more program components (modules or processes). This simultaneous execution may be conceptual, via interleaved execution on one CPU (multiprogramming), or actual, via more than one CPU (multiprocessing). The implementation of concurrency, whether actual or virtual, is logically equivalent [Dijk 65b] and does not affect concurrent program design. Concurrency is a design idea because it is a possible method of program control. Adding concurrency to one's arsenal of design ideas removes the restriction of thinking of solutions as sequential. Dennis explains: "Imposing a time relation on independent actions of separate parts of a system is a common source of overspecification." [Denn 73b:111]. Concurrency is useful in design because many problems contain independent aspects that are more accurately modeled and more easily solved using concurrency (e.g., operating systems, real time systems, database systems, and simulation systems). The main advantage of concurrency is that it permits faster processing. That is, concurrent program solutions can be implemented more efficiently on multicomputers and multiprocessors than can sequential solutions [Geha 84].

Real time savings can also be realized by concurrent solutions. For example, an editing program designed to accept inputs during processing (due to concurrent design) has the potential advantage of altering the actions of an earlier command with a later command. This can save the user time if the first command was not desired or the actual desired command was a combination of two commands (e.g., two last-page commands in which the first last-page command would paint the last page prior to interpreting the second command).

The difficult aspect of concurrent programming occurs when otherwise independent processes must interact with one another. Three interrelated facets of this interaction are communication, synchronization, and mutual exclusion. Communication is the transfer of information between independent processes. Synchronization is the coordination or set of constraints on the time ordering of events between processes. Mutual exclusion is the constraint that only one process may have access to a shared resource (data, input/output devices, or processes). The earliest form of concurrency was the coroutine, first developed simultaneously by Melvin Conway and Joel Erdwinn in 1963 [Conw 63a]. Control of the coroutine was

specified by the resume statement, which transferred control from one coroutine to the other and back, as follows:

    coroutine A                 coroutine B
        ...                         ...
        resume B;   --------->      ...
        ...         <---------      resume A;
        ...                         ...
        end                         return
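Coroutine-style control transfer survives in modern languages; as an illustrative aside (a Python sketch added for comparison, not from the 1963 paper), a generator's yield plays the role of the resume statement by suspending one routine and returning control to the other:

    def coroutine_b():
        print("B: first part")
        yield                 # "resume A": suspend B, control returns to A
        print("B: second part")

    b = coroutine_b()
    print("A: before resuming B")
    next(b)                   # "resume B": run B until its yield
    print("A: running again")
    try:
        next(b)               # resume B again; B then returns
    except StopIteration:
        print("A: B has returned")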

The coroutine is a low-level way of specifying concurrency because process switching is completely specified by the "resume" statement. It does not provide true concurrency because only one routine at a time is actually executed. The development of concurrent language constructs has taken two main paths that have recently rejoined, as shown in Figure 5.1. Initially (and on the left hand path), concurrent constructs facilitated process interaction via shared variables. The right hand path shows constructs based on message passing.

5.1 Shared Variable Based Concurrency

In 1963, Conway proposed the "fork" and "join" statements as a means of implementing concurrency [Conw 63b]. The "fork" statement, unlike the "resume", executes two or more routines concurrently. The "fork" and its associated "join" statement are used as follows:

[Figure 5.1 shows two evolutionary paths of concurrent constructs: a shared-variable path running from busy waiting through semaphores and critical regions to monitors, and a message-based path running from message passing to remote procedure calls, with the two paths rejoining.]

Figure 5.1 Evolution of Concurrency
Source: [Andr 83:38]

    Program P1;             Program P2;
        ...                     ...
        fork P2;                ...
        ...                     ...
        join P2;                end
        ...

When the "fork" in P1 is executed, P1 and P2 execute concurrently until either P1 executes the "join" or P2 terminates. When P1 reaches the join and P2 terminates, the statements following the join are executed. A similar notion was used in PL/I. PL/I and ALGOL 68 were among the first languages to provide concurrency [Geha 84].
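The fork/join pattern maps directly onto thread creation in modern libraries; the following Python sketch (an added modern analogue, not from the thesis) starts a second thread of control and then waits for it, just as P1 forks and later joins P2:

    import threading

    def p2():
        print("P2: running concurrently with P1")

    t = threading.Thread(target=p2)
    t.start()          # "fork P2": P1 and P2 now execute concurrently
    print("P1: still running")
    t.join()           # "join P2": P1 waits here until P2 terminates
    print("P1: statements after the join")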

5.1.1 Busy Waiting

The "fork" and "join" statements were implemented by means of shared variables (e.g., a lock bit [Denn 66]). If the lock bit associated with a routine is locked, the routine trying to access shared data must wait and continually test this bit to gain access. This is known as busy waiting. Busy waiting wastes processor time because the continuous testing is non-productive. Using only busy waiting for synchronization and mutual exclusion is difficult to design, understand, and prove [Andr 83]. 71

5.1.2 Semaphores

In 1965, Dijkstra first introduced the "cobegin" statement (originally called "parbegin") [Dijk 65b]. The cobegin statement:

    cobegin P1 || P2 || ... || Pn coend

causes concurrent execution of processes P1 through Pn. The cobegin construct was the first attempt at making concurrency more structured. That is, it explicitly stated, in the calling routine, which routines are to execute concurrently (as opposed to nested calls), and provided a single-entry/single-exit control structure for concurrent processes.

The cobegin statement was implemented by the semaphore, also invented by Dijkstra in 1965 [Dijk 65b]. The semaphore is a nonnegative integer variable used to synchronize concurrent processes via the operations V(s) and P(s), which have the following definitions [Brin 71]:

P(s) - If the semaphore S has the value TRUE, then suspend the calling task; otherwise set S to TRUE and let the task continue.

V(s) - If there are tasks waiting, suspended as a result of executing a P operation on S, then allow one such task to proceed; otherwise set the semaphore to FALSE.

The semaphore provides mutual exclusion by having all tasks execute a P(s) before accessing common data, and a V(s) after accessing the data (a sketch of this protocol follows the list below). The advantage of the semaphore was that it delayed processes rather than using busy waiting. Delayed processes could then gain access to a resource via any scheduling algorithm (e.g., first-in first-out) using queues. Semaphores are difficult to understand because they are very low level in nature (analogous to the GOTO). Problems occur using semaphores because [Geha 84]: 1) It is possible to jump around them; 2) It is not possible to perform an alternative action if the semaphore is busy; 3) It is not possible to wait for more than one semaphore; 4) Semaphores are visible to tasks that don't need them; and 5) Problems occur if one forgets either a P or V operation.
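The P-before/V-after protocol can be sketched with a modern counting semaphore (an illustrative Python example added here; the task and variable names are hypothetical). Each task acquires the semaphore before touching the shared balance and releases it afterward, so only one task updates the data at a time:

    import threading

    s = threading.Semaphore(1)    # 1 = resource free
    balance = 0

    def deposit(amount):
        global balance
        s.acquire()               # P(s): wait until the resource is free
        balance += amount         # critical section on shared data
        s.release()               # V(s): let a waiting task proceed

    threads = [threading.Thread(target=deposit, args=(10,)) for _ in range(5)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(balance)                # always 50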

5.1.3 Critical Regions

The problem of forgetting P and V semaphore operations was solved by the use of critical regions and conditional critical regions. The concept of critical regions is due to Dijkstra [Brin 73b], while its language constructs were introduced independently by Hoare in 1971 [Hoar 71] and by Brinch Hansen in 1972 [Brin 72]. The critical region statement:

    region SO do S

states that when statement "S" is executed, it has exclusive access to the shared data "SO". The conditional critical region statement:

    region SO when C do S

states that the condition "C" must be true before statement "S" can be executed with a guarantee of exclusive access to shared data "SO". The value of "C" is found by repeated testing (i.e., busy waiting). Critical regions and conditional critical regions improved upon the semaphore by grouping the P and V operations into a single construct but had the disadvantage of using busy waiting. As Brinch Hansen put it:

This controlled amount of busy waiting is the price we pay for the conceptual simplicity achieved by using arbitrary Boolean expressions as synchronization conditions. [Brin 72:576]

5.1.4 Monitors

The succeeding evolutionary development of concurrency was the concept of a monitor. A monitor is a collection of data and its associated procedures, which are shared by concurrent processes. Each monitor operation is performed in mutual exclusion. Simultaneous requests for monitor operations are handled within the monitor. The monitor concept was first briefly described by Dijkstra as "secretaries" in 1968:

Instead of N sequential processes cooperating in critical sections via common variables, we take out the critical sections and combine them into an N + 1st process, called a secretary ... now we have identified a process to which the common variables belong: they belong to the common secretary. [Dijk 68b:134]

The monitor concept was developed into viable language constructs by Brinch Hansen in 1973 [Brin 73b] and Hoare in 1974 [Hoar 74]. The key quality of a monitor is that all operations for task synchronization and communication are hidden in a single construct -- the monitor. As Hoare explains:

The textual grouping of critical regions together with data which they update seems much superior to critical regions scattered through the user program, ... [Hoar 74:555]

The monitor, although a high-level construct, must be implemented by a low-level synchronization mechanism (i.e., the semaphore) to provide mutual exclusion of shared data [Geha 84]. As a result, the monitor is a combination of high-level and low-level concurrent mechanisms and embodies some of the problems associated with semaphores.
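A monitor can be sketched as a class whose every operation runs under one hidden lock (an illustrative Python rendering added here, using a condition variable for the waiting that monitor implementations provide; the names are hypothetical):

    import threading

    class BufferMonitor:
        """Data and its procedures grouped together; each
        operation executes in mutual exclusion."""
        def __init__(self):
            self._items = []
            self._lock = threading.Lock()
            self._nonempty = threading.Condition(self._lock)

        def put(self, item):
            with self._lock:              # entry implies mutual exclusion
                self._items.append(item)
                self._nonempty.notify()   # wake a process waiting inside

        def get(self):
            with self._lock:
                while not self._items:    # wait, releasing the lock,
                    self._nonempty.wait() # until data is available
                return self._items.pop(0)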

5.2 Message Based Concurrency

One of the first systems to use message passing for concurrent synchronization and communication was the RC 4000 system described by Brinch Hansen in 1970 [Brin 70] (according to [Brya 79]). In this system, the primitive mechanisms:

    send message (receiver, message, buffer)
    wait message (sender, message, buffer)
    send answer (result, answer, buffer)
    wait answer (result, answer, buffer)

were used to communicate between processes. A common pool of buffers was used to temporarily hold a message until another process read it. A message queue was associated with each process to hold many simultaneous requests. A drawback of message buffering was that it introduced an additional resource problem, the limited pool of buffers. Another disadvantage was that it may not be possible to implement mutual exclusion efficiently [Geha 84]. These early message passing operations were too low level in nature because they only provided a basic form of communication, requiring the programmer to decide on the types of control and data messages, and where these messages must be placed in the program [Brya 79].
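Buffered message passing of this kind corresponds closely to the thread-safe queues of modern libraries; the following illustrative Python sketch (added for comparison; the process names are hypothetical) gives each process a message queue and exchanges a request and an answer through them:

    import threading, queue

    server_queue = queue.Queue()   # the server's message queue
    client_queue = queue.Queue()   # the client's answer queue

    def server():
        message = server_queue.get()        # wait message
        client_queue.put(message * 2)       # send answer

    t = threading.Thread(target=server)
    t.start()
    server_queue.put(21)                    # send message
    print(client_queue.get())               # wait answer: prints 42
    t.join()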

5.2.1 Communicating Sequential Processes

More recently, several high-level languages with high-level message passing constructs have been developed. One of the first, developed by Hoare in 1976, was communicating sequential processes [Hoar 78]. In communicating sequential processes (CSP), processes communicate and synchronize via input and output statements. A correspondence must occur between an input statement of one process and the output statement of another process before they can synchronize or communicate. If either task is not ready, waiting occurs until both are ready. The basic form of the CSP input command (denoted by "?"):

    P ? (x)

states that a value is to be received from process "P" and assigned to the target variable "x". A CSP output command (denoted by "!") of the form:

    P ! (e)

states that expression "e" is to be output to process "P". A correspondence between an input and an output statement is shown in Figure 5.2 (lines show the threads of control). In CSP, concurrent processes are specified by a variation of the cobegin statement (e.g., [P1 || P2]). The ideas of CSP have had a major impact on the design of concurrent language constructs because of their simplicity and suitability for efficient implementation [Geha 84]. The strength of CSP is that it unifies the concepts of synchronization and communication (made evident by Figure 5.2).

[Figure 5.2 shows the threads of control of two processes meeting at a matching input/output pair; at the correspondence the value of the output expression is transmitted to the input's target variable, so communication and synchronization occur at the same point.]

Figure 5.2 CSP Correspondence
Source: [Wegn 83:449]
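The unification of synchronization and communication can be sketched with a synchronous channel (an illustrative Python construction added here, not CSP notation; the class is hypothetical): the sender blocks until a receiver takes the value, so the transfer itself is the correspondence:

    import threading

    class Channel:
        """CSP-style synchronous channel: output blocks until the
        matching input occurs, synchronizing both sides."""
        def __init__(self):
            self._value = None
            self._ready = threading.Semaphore(0)   # value offered
            self._taken = threading.Semaphore(0)   # value consumed

        def output(self, value):       # plays the role of  P ! (e)
            self._value = value
            self._ready.release()
            self._taken.acquire()      # wait for the correspondence

        def input(self):               # plays the role of  P ? (x)
            self._ready.acquire()
            value = self._value
            self._taken.release()
            return value

    ch = Channel()
    t = threading.Thread(target=lambda: print(ch.input()))
    t.start()
    ch.output("hello")                 # blocks until the input occurs
    t.join()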

5.2.2 Distributed Processes

In 1978, Brinch Hansen described the concurrent programming concept called "Distributed Processes" (DP) [Brin 78]. DP was the first language description to use remote procedure calls [Andr 83]. A remote procedure call is a call to a procedure defined in another process. A DP process has the form:

    process name
        own variables
        common procedures
        initial statement

When a common procedure is called (e.g., call process name.procedure name), a server process is created to execute the called procedure. The server process and the calling process execute with mutual exclusion by means of a variant of conditional critical regions. DP is similar to monitors but uses active processes rather than inactive collections of procedures for concurrent communication. In DP, information can be passed in either direction between processes via input and output parameters. The calling mechanism in DP is asymmetric. That is, the called process does not specify the name of its caller (as it must in CSP). This feature allows DP processes to be incorporated into libraries where the names of calling processes are not known.

5.2.3 Ada

Some of the deficiencies of CSP [Hoar 78] are rectified by the realities of implementation in the recently developed language, ADA [USDe 81]. ADA is a first attempt to provide high-level concurrency in a general purpose language [Geha 84]. Communication in ADA is effected by the use of remote procedure calls and accept statements. Processes in ADA are called tasks. A task has the form:

    task A specification
        entry P1 (..)
        entry P2 (..)
    end specification A

    task A body
    begin
        loop
            select
                accept P1 (..)
            or
                accept P2 (..)
            end select
        end loop
    end

When execution reaches the accept statement, task A is delayed until a matching entry call has been made. A call to task A has the syntax A.P1(..). When this occurs, a rendezvous is accomplished. As shown in Figure 5.3, the calling task passes its parameters, the remote procedure is executed exclusively, parameters are returned, and both tasks again execute concurrently.

Figure 5.3 ADA Rendezvous
Source: [Wegn 83:449]

ADA is a selective combination of CSP and DP concepts and represents the state of the art in high-level concurrent constructs. ADA's inter-task communication is closer in form to DP: one task calls a procedure-like entry defined in another task, with parameters providing two-way transfer of data. However, ADA's accept and entry call more closely resemble CSP's input and output commands (as is visually evident by a comparison of Figure 5.2 and Figure 5.3).


ADA also adopted DP's asymmetric remote procedure call so that library processes could be used. A somewhat different approach to concurrency has been taken by proponents of data flow programming. This approach is discussed in Chapter 6.

5.3 Summary

Concurrency's evolutionary course has gradually traversed a path from low-level (basic instruction) constructs to high-level (structured) constructs. These constructs are (from lowest to highest level): busy waiting, semaphores, critical regions, monitors, and remote procedure calls. High-level concurrent constructs have been easier to use and understand. The recent development of the CSP correspondence and the ADA rendezvous concept has unified the notions of synchronization and communication into a single construct.

Chapter 6

DATA FLOW

Data flow, also referred to as data driven, is the idea that a program's order of execution is determined only by the availability of needed data. Once the data required by a program object is available, execution occurs immediately. The power of the data flow concept is that ordering based on data flow is the best (fastest) one can hope to achieve, because the availability of data is the ultimate restricting factor in program execution. This idea can be applied during program design to achieve maximum efficiency of program control.

6.1 Early Data Flow

An early notion of data flow existed in data processing flowcharts from 1920-1950 [Coug 74]. Figure 6.1 is an example of this type of early flowchart. Data was represented pictorially and arrows showed the data's movement through the system. This diagram demonstrates that the concept of data flow was realized at a very early date.


[Figure 6.1 reproduced an early data-processing flowchart in which requisitions, purchase orders, acknowledgments, and receiving reports flow among a production planning department, a purchasing department, and an accounting department.]

Figure 6.1 Early Data Flow Diagram
Source: [Coug 74:52]

The data flow graph (previously referred to as a directed or transitive directed graph) was used as early as 1959 as a tool in the analysis of flowcharts [Pros 59], [Karp 60]. Data flow graphs derived from flowcharts were transformed into matrices and analyzed to determine errors and to eliminate program redundancies. In 1963, the directed graph was used as a computational model to aid in the analysis and optimization of parallel computations [Estr 63]. The directed graph was used to depict parallel computations, from which program computation speeds were calculated and maximized. Possibly the first use of the term "data flow" was by Duane Adams in his 1968 thesis entitled "A Computational Model with Data Flow Sequencing" [Adam 68]. Adams also used the data flow graph to describe and estimate parallel computations. Figure 6.2 shows a typical data flow diagram from this paper. Data flow was first applied to program design by Wayne Stevens, Glen Myers, and Larry Constantine in their 1974 paper, "Structured Design" [Stev 74]. Structured design is a program design technique in which a problem is initially specified as a data flow diagram. This diagram is transformed into a hierarchy of modules based on major points of data input and output. The idea was to base the order of module execution on a problem's data flow.

Constantine was the originator of this idea [Myer 78]. 84

Figure 6.2 Adams' Data Flow Graph
Source: [Adam 68:82]

The first data flow diagram used in presenting this technique is shown in Figure 6.3. Inexplicably, it was not given a name at the time, and it was not until 1978 [Your 78] that Yourdon and Constantine referred to this diagram as a data flow diagram.

Figure 6.3 First Design Data Flow Diagram

Source: [Stev 74:120]

A weakness of the structured design method is that it assumes modules are to be executed sequentially. However, program design based on the flow of data was a significant first for structured design. Currently, data flow is not being used in program design in a high-level form, other than as described in structured design. Data flow can be implemented in a high-level form, to some extent (i.e., serially, Figure 6.4), by present concurrent constructs. For example, if process B requires data from process A to continue execution, B can use the accept statement/rendezvous mechanism of Ada to trigger continued execution as soon as A's data has been produced. High-level data flow program components of this type have been described as transducers [Abbo 85]. A transducer is an object that accepts data, processes it, and passes it on to another program component.

Figure 6.4 Serial Data Flow

Perhaps a needed high-level data flow construct is one in which data is accepted from two or more independent processes. For example, a task C could receive data from two separate tasks A and B, where A and B input data of types A and B respectively (Figure 6.5). This construct can be implemented by ADA accept statements as:

    task body PROCESSC is
    begin
        loop                                  -- forever
            FLAGA := FALSE;
            FLAGB := FALSE;
            while not (FLAGA and FLAGB) loop
                select
                    accept INPUTA (A : typeA) do
                        FLAGA := TRUE;
                    end INPUTA;
                or
                    accept INPUTB (B : typeB) do
                        FLAGB := TRUE;
                    end INPUTB;
                end select;
            end loop;
            Processdata;
            PROCESSD (C : typeC);
        end loop;
    end PROCESSC;

Figure 6.5 Multiple Data Flow

In this example, as soon as data A and B are received, they are processed and the result is passed to PROCESSD. However, the use of flags and the while loop to implement this construct is burdensome, and a simpler syntax seems desirable.

6.2 Data Flow Computers and Languages

The data flow concept has recently been used as the basis of computer architectural design [Ager 82]. That is, instead of using an instruction counter to keep track of the next instruction to be executed, as in traditional von Neumann machines, data flow machines enable an instruction if all required input values have been computed. The purpose of data flow computers is to provide efficient concurrency. The concurrency provided by data flow systems is of a low-level form (i.e., at the level of basic operations such as addition or subtraction). Data flow machines consist of hundreds, even thousands, of cooperating processors and are therefore powerfully suited to handle "micro-level" concurrency on a large scale [Turn 82].

To take advantage of large scale concurrent hardware, data flow languages are being developed. The strategy behind the data flow language school of thought is to express computations in such a way as to allow the concurrency inherent in programs to be extracted at the instruction level. Thus, data flow languages do not allow concurrency to be explicitly stated. In fact, control structures in general are not allowed. Data flow programs are viewed as an unordered set of instructions (i.e., the order listed is not the order of execution). The notion of sequential execution is replaced by concurrent execution of operators as soon as their operands are ready.

In order to extract all possible concurrency from a problem, data flow languages have many restrictions. One such restriction is that data flow languages must be side-effect free. Side effects occur when more than one identifier is able to access the same data object. When a data object is modified via access by one identifier, the other identifier's value is changed. This is referred to as a side effect. To avoid side effects, execution of an instruction in a data flow language can only compute values for new identifiers; it cannot alter a previously defined identifier. An additional restriction is that the language must follow the single assignment rule: each identifier may only be defined once in a given environment. Data flow languages have been claimed to be more modular (than non-data flow languages) because side effects, which may allow the effects of a computation to propagate in an uncontrolled manner, are eliminated [Brya 79], [Wege 79].

However, data flow languages are presently in an infant state, and it is too early to draw any conclusions about their potential for success. In fact, no realistic data flow machines have been constructed as yet [Geha 84]. Also, progress in the development of data flow computers and languages has been slow [Wege 79]. At present, practical data flow machines are behind the state of the art of data flow languages. Additionally, even proponents of data flow languages admit the need for a high-level data flow language [Brya 79] so that data flow can be used at a level higher than basic instructions. An example of a data flow language and discussion of its likely impact on design is given in Section 7.2.

6.3 Summary

Data flow is the idea that the order of program execution should be based on the availability of data. The data flow concept has existed since the early days of data processing (1950). It was applied to program design in 1974 to determine an efficient order of module execution. In recent years, it has been used as a basic model for computer architecture, which has given rise to data flow languages. These restrictive languages allow concurrency to be extracted from programs at the "micro" (instruction) level.

Chapter 7

FUTURE PROGRAM DESIGN

7.1 Conventional Languages

In his 1982 article, "New Directions in Programming," Anthony Wasserman discusses his view of future programming practices. He divides his predictions into: the short term (next five years), the medium term (1987-2000), and the long term (2000 to the early part of the 21st century).

In the short term, Wasserman predicts a greater development of software tools, increased use of high-level languages, environmental programming improvements, and increased prototyping of systems. By software tools he means computer-based support for some of the tasks carried out during software development. These changes would probably have little effect on software design, except that software tools would assist in the design process. The current design ideas of levels of abstraction, generics (abstract data types accepting a type parameter), and program families already support the development of prototypes.

In the medium term (1987-2000), Wasserman predicts greater automation of the development process and the use of well-tested library components. By automated methods he means that it will be possible to produce executable programs for certain common classes of systems from a description of inputs and desired outputs. Should these developments occur, program design would rise to a higher level than is presently employed. Designers would work with proven, well-documented low-level components and would not have to resort to the level of detail currently employed.

In the long term (2000 to the early part of the 21st century), Wasserman predicts that programmers will be able to describe the function of a program in terms of behavior and outputs without precise algorithms. Abstractions will be used at unprecedented high levels, so that designers will describe what is desired without describing how. So-called automatic programming, the ability to produce an executable program from non-procedural specifications, will become a reality. If these changes do occur, software design would virtually become software specification. Objects would be described by axioms or sets of rules, and from these specifications code would be generated under automated guidance by the designer. Design would shift away from data representation and program structure and move toward understanding and describing the problem itself.

7.2 Non-Conventional Languages

The last major influence on software design was an advance in hardware technology during the early 1960's (more memory and faster processing speed, mentioned in Chapter 1). Currently, new hardware architectures are being influenced by the recognition of inherent flaws in existing architectures and by recent advances in hardware technology. These new architectures will have an effect on languages and therefore on software design.

From the beginning of programming, programming languages and software design have been influenced by the original single-word fetch/store hardware designed by John von Neumann in the 1940's [Step 82]. This architecture has been referred to as the "von Neumann bottleneck" [Back 78] because only one word of program or data can be processed at a time. Programming languages, and therefore the lower levels of design, are negatively impacted by the von Neumann architecture. For example, assignment statements can only affect a single variable at a time. This is a reflection of the underlying single-location fetch/store architecture [Step 82].

A predominant factor influencing software systems today is the advancement of microprocessor and LSI semiconductor technology [Wirt 79]. In the near future it will become possible to cheaply build computers containing, ultimately, thousands of independent processing units [Turn 82]. The use of these multiprocessor architectures and the displacement of von Neumann machines will be dependent on the development of programming languages and constructs that facilitate the use of concurrency [Turn 82].

Turner has explored the form these languages might take if concurrent architectures become a reality. He predicts that it will be necessary to use a data flow language [Turn 82] (discussed in Chapter 6). Turner gives an example of this type of language, called KRC. In this language, data structures are built using lists (e.g., [1..100] for an array from 1 to 100). Elements of lists may be of any type. Functions are defined by equations which characterize their functionality. Another feature of KRC is the use of set abstraction to denote lists. For example, the statement:

    {N*N ; N <- [1..]}

denotes the list of the squares of all natural numbers. The values of the list can be obtained by iterating the formula over all members of the set. Programs written in KRC tend to be 10 to 20 times shorter than equivalent programs in conventional languages, and less detailed. For example:

    FAC N = PRODUCT [1..N]

is a program for calculating the factorial function. An equivalent ALGOL program is nine lines long. Programming in a language like KRC would be at a very high level (descriptive rather than procedural). Many of the details necessary in conventional languages would become implicit. For example, KRC programs have no type declarations (the type is implicit) and no order of execution is specified. It appears that design for this type of language would be simplified and that only high-level specifications would be required. Turner predicts that data flow languages or a similar kind of non-procedural language will challenge von Neumann languages in about a decade.
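For comparison, the two KRC fragments above can be approximated in a conventional language. The following sketch (illustrative Python, not KRC; the names are hypothetical) mimics the set abstraction with a lazy generator and the factorial equation with a product over [1..N].

    import itertools
    import math

    # Analogue of {N*N ; N <- [1..]}: the squares of all natural
    # numbers, as a lazy (potentially infinite) sequence.
    squares = (n * n for n in itertools.count(1))
    print(list(itertools.islice(squares, 10)))   # first ten squares

    # Analogue of FAC N = PRODUCT [1..N].
    def fac(n):
        return math.prod(range(1, n + 1))

    print(fac(5))   # 120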

Chapter 8

CONCLUSIONS AND COMMENTS

The study of the evolution of software design ideas has provided three benefits: 1) it has clarified or characterized software design trends; 2) it has shown how new design ideas are conceived; and 3) it has aided in understanding software design ideas.

8.1 Software Design Trends

The evolution of software design ideas has clarified how programs are conceived during design. Programs are no longer viewed as a sequential hierarchy of procedures (actions) that pass data up and down the hierarchy (Figure 8.1), but rather as a set of objects (data and actions) which remain in existence, execute concurrently if possible, and interact with each other by sending messages (Figure 8.2). In this newer object model, high-level objects are not arranged in a hierarchy.

The historical development of this object view of systems is shown in Table 8.3. In the 1960's and early 1970's, programs were viewed from an action viewpoint:


Figure 8.1 Procedural Program Model

Figure 8.2 Object Program Model

Subroutines and procedures were the constructs used to hold action abstractions created using stepwise refinement. These procedures were executed sequentially via procedure calls. GOTO-less programming simplified the control between these procedures. Beginning in the late 1960's and early 1970's, programs began to be viewed as objects in concurrent interaction. Abstract data types and objects (SMALLTALK) were developed to support the object model of programs. Recent descriptions of object-oriented design are given by [Abbo 85] and [Wien 84].

    Action or Procedural Model     Object Model (data and actions)
    (1960 - 1970)                  (1971 - ?)

    Top-Down/Stepwise Refinement   Abstract Data Types, Objects
    Subroutines, Procedures        Concurrency
    Sequential Control,            Data Flow Constructs
      Procedure Calls
    GOTO-less Programming

Table 8.3 Action vs. Object Model

The development of the object model has resulted from three trends: the increasing development of higher level abstract constructs, the increasing use of information hiding, and the unification of programming concepts. The trend toward progressively higher levels of abstract constructs has been a significant and consistent trend over the course of software design history. An early example of this trend was the virtual elimination of the GOTO statement in the late 1960's and early 1970's. The GOTO statement was recognized as too low-level (machine level) in nature to be useful as a control construct in high-level languages. Another example of this trend has been the development of progressively higher levels of abstract user-defined data, shown in Figure 8.4.

    1960   COBOL              Limited user-named records
    1961   AED                Arbitrarily complex records
    1965   Hoare record class Accessible records, permanent data
    1967   SIMULA 67          Abstract user-defined objects
    1970   PASCAL             User-defined records
    1974   CLU, ALPHARD       Abstract data types (clusters, forms)
    1980   SMALLTALK-80       Objects
           ADA                Packages

Figure 8.4 Evolution of High-Level Data Abstraction

COBOL was the first language to allow users to describe record data structures, although in limited variety [Wegn 79]. Ross' AED language first presented the idea that data structures should be defined by the programmer in 1961 [Hoar 65], [Ross 61]. In 1965, Hoare proposed the record class concept, which made abstract data elements easily accessible. SIMULA 67 was the first language to allow abstract data types (although unprotected) by implementing the class concept in 1967. Since then, many languages which support abstract user-defined data have been developed. Abstract data types provide the programmer with a means of specifying arbitrarily high-level data (PASCAL, CLU, ALPHARD, SMALLTALK-80, MODULA-2, and ADA).

The evolution of concurrency has provided another example of the historical trend toward higher level constructs (Figure 8.5). The use of shared variables for synchronization and communication in 1963 was not only low-level, it was wasteful of CPU time (i.e., busy waiting). The introduction of semaphores in 1965 eradicated the need for busy waiting; however, semaphores were also a low-level construct. In 1971, critical regions improved upon semaphores by combining a P and V semaphore pair into a single construct. The monitor concept combined all critical regions acting on a given resource into a single high-level construct in 1973. However, the monitor used semaphores to provide mutual exclusion of its operations. Recently (1976-1981), the remote procedure call/rendezvous concept has placed concurrency at its highest level by unifying the concepts of synchronization and communication.

    1963   Busy waiting       Primitive shared variable
    1965   Semaphore          Primitive shared variable
    1971   Critical regions   P(s) and V(s) combined
    1973   Monitors           Passive procedure using semaphores
    1976   CSP                Input/output correspondence
    1978   DP                 Remote procedure call
    1980   ADA                Remote procedure/rendezvous

Figure 8.5 Evolution of Concurrent Constructs
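The difference in level between these constructs can be sketched in code. The fragment below uses Python's threading module purely for illustration (it is none of the historical constructs themselves, and the counter example is hypothetical); it contrasts a semaphore, whose P and V operations are scattered through the program text, with a monitor-style object that confines synchronization to one place.

    import threading

    # Low level: a semaphore guarding a shared counter. The P/V pair
    # (acquire/release) must be repeated, correctly, at every use site.
    sem = threading.Semaphore(1)
    counter = 0

    def increment():
        global counter
        sem.acquire()        # P(s)
        counter += 1         # critical region
        sem.release()        # V(s)

    # Higher level, monitor style: the shared data and the only
    # operations permitted on it are combined in a single construct,
    # and mutual exclusion is handled in one place.
    class CounterMonitor:
        def __init__(self):
            self._lock = threading.Lock()
            self._value = 0

        def increment(self):
            with self._lock:
                self._value += 1

        def value(self):
            with self._lock:
                return self._value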

The trend of increasingly higher levels of abstraction may continue, with the "next" higher level being implementation-independent problem specifications that facilitate automatic programming. This development will be contingent on the abilities of future compilers [Wulf 80]. For example, can a compiler be expected to find an algorithm for a specification? The trend toward higher level constructs supports the contention [Brya 79] that data flow languages (predicted by some to be the style of language of the future [Turn 82]) need a high-level form of concurrency.

A second, finer trend in software design ideas has been the increasing use of information hiding. Figure 8.6 shows the development of module constructs used to hide information.

    1951   Subroutines        Actions hidden
    1960   Procedures         Actions hidden
    1967   SIMULA 67          Actions grouped with objects
    1972   Parnas             Information hiding
    1974   CLU, ALPHARD       Data and actions hidden together
    1980   SMALLTALK-80       Objects
           ADA                Packages

Figure 8.6 Evolution of Information Hiding

Developed in 1951, the subroutine provided a means of hiding the implementation of procedures or abstract actions. In 1967, SIMULA objects allowed abstract data and the operations associated with that data to be contained in a single class construct. However, SIMULA objects leaked their hidden information by allowing other objects to directly access local hidden data. Beginning in 1974, abstract data types hid both data and associated operations in a single construct. SMALLTALK objects have the same hiding attributes as abstract data types but go a small step further by eliminating the proliferation of operator/operand type declarations in a program and hiding them within a single object.

Concurrent constructs have also followed the information hiding trend. The use of shared variables (1963), semaphores (1965) and critical regions (1971) spread synchronization commands throughout a program. The more recent developments of monitors (1973), CSP (1976), and remote procedure calls (1978) hide synchronization commands in a single place.

The third, more subtle, trend in software design has been the unification of several concepts or mechanisms into a single construct. That is, ideas formerly implemented by separate constructs are combined to form a new, more general one. For example, objects support the description of a class of abstract data and actions, levels of abstraction, and information hiding. Objects facilitate levels of abstraction by allowing the joint description of data and actions in terms of other, perhaps higher-level, data and actions. Another example of this trend is the rendezvous concept of concurrency, which unifies synchronization and communication into a single construct.
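A small sketch may make the object example concrete. The class below (illustrative Python; the stack is a hypothetical example, not drawn from the sources above) unifies in one construct the description of a class of abstract data and its actions, a level of abstraction for its callers, and information hiding of its representation.

    class Stack:
        """A class of abstract objects: data and actions together."""

        def __init__(self):
            self._items = []          # hidden representation

        def push(self, item):         # actions defined with the data
            self._items.append(item)

        def pop(self):
            return self._items.pop()

        def is_empty(self):
            return not self._items

    # Callers work at the stack's level of abstraction and never
    # touch the underlying list.
    s = Stack()
    s.push(1)
    s.push(2)
    print(s.pop())   # 2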

How widely the design ideas discussed in this thesis are used, and to what extent they are beneficial, is not clear. Certainly, recent concurrent constructs and abstract data types are new developments and have not been widely used.

History has shown that a significant amount of time tends to pass between an idea's introduction and its general acceptance as a good idea. For example, Dijkstra first argued against GOTOs in 1965. The GOTO debate reached a peak in 1972 and was laid to rest by Knuth in 1974.

There is quite probably also a lag between academic acceptance and general industry usage, although this has not been documented and suggests an area for future research.

There is little evidence that the newest languages have made a significant impact on software development [Wulf 80]. However, it has been suggested that the design ideas contained in these languages provide programmers with better mental tools, and that this benefit cannot be easily measured [Wulf 80].

At present, it is not clear how useful abstract data types or objects will be. This question will remain unanswered until several data abstraction languages have been used to construct large systems [Wulf 80] and the benefits of these systems have been analyzed. One experimental evaluation has concluded that the use of languages with statically defined types improves program readability and reliability, but only marginally [Gann 77].

Perhaps a greater benefit from objects will be their ease of modification. The use of objects should allow easier modification than earlier constructs because of their high degree of information hiding.

Another benefit of objects will be the simplification of program structural design. A number of prominent people [Dijk 65a], [Grie 79], [Jack 75] have stated that the key to successful program structure is to match the structure of the program to the structure of the problem. But what is problem structure? Problem structure can be thought of as the relationship of the problem parts to one another (assuming the problem has parts). Many classes of problems are made up of separate parts or objects (e.g., operating systems, simulation systems, etc.). These objects are often made up of both data and related actions. Creating program objects to match these problem objects models the problem structure more accurately than could be done by procedures. Guttag explains: "Subroutines, while well suited to the description of abstract events (operations), are not particularly well suited to the description of abstract objects." [Gutt 77:397]. Because the object construct is well suited to describe problem objects, many problems can be easily designed by simply identifying their objects and creating these objects in the program [Cox 83]! Accurately modeling problem structure results in increased program clarity [Dijk 65a].

Another aspect of problem structure is the structure of its data: the relation of the pieces of data to one another. Modeling program structure on a problem's data structure was the idea behind Jackson's data structure design technique [Jack 75]. The strategy was to make the program structure simultaneously match the problem's input and output data structures. Using this technique meant a designer had to map the inherent data structure of a problem into a hierarchy of modules (the program structure). Abstract data types and objects eliminate the need for this mapping because they allow these structures to be directly implemented within an object (i.e., the abstract data type). Furthermore, this structure is encapsulated in a single object; in the data structure technique, this structure is spread throughout the program. If a problem's data structure is to be changed, modification is confined to a single object, not the entire program.
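The point about confined modification can be shown in a short sketch (illustrative Python; the queue and its callers are hypothetical). The problem's data structure lives inside one object, so changing its representation touches nothing outside the object.

    from collections import deque

    class JobQueue:
        def __init__(self):
            # The representation is hidden. It was once a plain list;
            # replacing it with a deque (for efficient removal at the
            # front) is a change confined entirely to this class.
            self._items = deque()

        def add(self, job):
            self._items.append(job)

        def next_job(self):
            return self._items.popleft()   # was: self._items.pop(0)

    q = JobQueue()
    q.add("compile")
    q.add("link")
    print(q.next_job())   # prints "compile"; callers are unaffected
                          # by the representation change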

8.2 Origin of Design Ideas

One purpose of this thesis was to observe how new design ideas originated. Based on the ideas studied, software design ideas have originated in (at least) two ways: they have been either a refinement of a somewhat vague idea or the application of a previously proven idea to a new area (i.e., program design).

8.2.1 Refinement of a Vague Idea

The idea of information hiding is an example of a refinement of a vague idea. Modularity was a somewhat vague design idea in the early 1960's that meant little more than creating and using program units (subroutines). Although this idea was useful, saving re-coding, it was not ideal. In the following years, it was stated that modules should be independent of one another. Information hiding clarified or refined the more vague notion of module independence by requiring that a module hide information contained within it. Information hiding also made it clear that data and all operations on that data should be hidden together. Shortly afterward, abstract data types were developed to provide a general language construct for this new type of module. Thus, the design idea of information hiding came into being as a refinement of the vague notion of procedural module independence, and influenced the development of a new module construct: abstract data types.

The idea of levels of abstraction can be considered a refinement of top-down design because it restricts each module (resulting from stepwise refinement) to using only the resources of the level immediately below it. Program families can also be considered a refinement of top-down design. That is, the program family idea clarifies which design decisions should be considered first during the top-down development process.

8.2.2 Reapplication of a Proven Idea

The idea of stepwise refinement is the application of an ancient idea, "Divide and Rule," to program design. Although the proven technique of "Divide and Rule" had been known since ancient times, it took nearly a decade of programming before it was systematically applied to program design. When Dijkstra first described the stepwise refinement process, he did so with program correctness in mind and used the analogy of program construction being equivalent to a mathematical proof. Dijkstra used the proven method of the mathematical proof and applied it to programming (and program correctness) to obtain the new idea of stepwise refinement.

The idea of modularity was also an application of a previously proven idea to programming. Emery explains:

I borrowed the idea [of modules] from the construction industry, in which standard modules (such as standard wall panels) could be used in the construction of buildings. [Emer 84:1]

Although Emery's concept of the term module (as standard building blocks) is somewhat different from its current meaning, it was a forerunner that helped mold the module concept. Emery's original meaning may yet be revived if Wasserman's predictions come true (Chapter 7). Thus, language and program designers searching for new ideas should consider refining a currently vague idea or applying a proven idea from another field to programming. An example of this second approach was discovered during the writing of this thesis. A book on strategies for academic writing [Hash 82] describes the factors to be considered in naming and dividing subjects. When applied to programming, these ideas describe, remarkably well, how modules might be divided and named.

8.3 Understanding the Ideas

The study of software design evolution helps one understand design ideas by clarifying their meaning and by clarifying how the ideas fit together.

Structured programming is, and has been, perhaps the most misunderstood and misused design term. The study of how this idea evolved reveals the reasons for this misunderstanding: it was not defined by its originator; it was subsequently defined incorrectly by those who made it popular; and it was given varying definitions by those who tried to define it accurately. The evolution of structured programming has also explained its present two-level meaning to be a merging of its implicit original meaning (programs written so as to be more amenable to a proof of correctness) and the ideas that became associated with it (originally stepwise refinement, no GOTOs, and modularity; later, indented code, smaller modules, and single-entry/single-exit modules).

Studying the evolution of modularity has clarified its nebulous meaning because its meaning has evolved. Originally, modularity was a somewhat vague concept. To clarify and improve its original meaning, its definition has been refined over the course of programming history. Being cognizant of this refinement helps one understand the current numerous, although similar, definitions.

Studying the development of software design ideas also helps one understand how the ideas fit together, because certain ideas are the product or merging of two or more earlier ideas. The evolution of an idea clarifies what that idea lacked and how a new idea fulfilled this need. This clarifies the connection between these ideas. For example, the idea of modularity had existed for about ten years (1962-1972), but lacked a precise way of explaining module independence. Information hiding explained module independence in 1972. As a result (along with other influences), abstract data types emerged in 1974. Just prior to information hiding, the idea of levels of abstraction was presented in 1968. Abstract data types filled the need for a high-level module construct that could better facilitate levels of abstraction than the procedure. Thus, abstract data types were a combined solution for information hiding and for facilitating levels of abstraction.

In final summary, the study of software design evolution has clarified: 1) software design trends (the object view of programs, progressively higher levels of abstraction, the increased use of information hiding, and the unification of design concepts); 2) the meaning of the design ideas; and 3) how new design ideas have been, and may yet be, conceived (the refinement of a vague idea and the application of a proven "good" idea from another discipline). Simply stated, the evolution of software design has endeavored to fulfill the philosophy of structured programming: to structure programs so that they are more understandable to humans.

REFERENCES

[Abbo 85] Abbott, Russell J. "An Integrated Approach to Software Development" John Wiley and Sons (1985).

[Adam 68] Adams, Duane "A Computational Model with Data Flow Sequencing" Technical Report CS 117, School of Humanities and Sciences, Stanford University, Stanford, Ca. (December 1968).

[Ager 82] Agerwala, Tilak and Arvind "Data Flow Systems" IEEE Computer (February 1982) pp.10-13.

[Andr 83] Andrews, Gregory R. and Schneider, Fred B. "Concepts and Notations for Concurrent Programming" ACM Computing Surveys Vol.15, No.1 (March 1983) pp.3-43.

[Back 78] Backus, John "Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs" Comm. ACM Vol.21, No.8 (August 1978) pp.613-641.

[Back 81] Backus, John "The History of FORTRAN I, II, and III" in History of Programming Languages, Ed. R. Wexelblat, Academic Press, N.Y. (1981) pp.25-44.

[Bake 57] Baker, Charles L. "Review of D. McCracken, Digital Computer Programming" Math Computing II (1957) pp.298-305.

[Bake 72] Baker, F. Terry "Chief Programmer Team Management of Production Programming" IBM Systems Journal Vol.11, No.1 (January 1972) pp.56-73. (a)

[Bake 72] Baker, F. Terry "System Quality Through Structured Programming" AFIPS Proceedings of the 1972 Fall Joint Computer Conference Vol.41, AFIPS Press, Montvale, N.J. (September 1972) pp.339-344. (b)

[Bake 73] Baker, F.T., and Mills, Harlan "Chief Programmer Teams" Datamation Vol.19, No.12 (December 1973) pp.58-61.


[Balz 67] Balzer, R.M. "Dataless Programming" Proceedings of the AFIPS Vol.31 (1967) pp.535-544.

[Barn 82] Barnes, J.G.P. "Concurrent Processing For Real-Time Programming" in State of the Art Report Programming Technology, Ed. P.J.L. Wallis, Maidenhead, England (1982).

[Bate 76] Bates, D. "Structured Programming Infotech State of the Art Report" Infotech International Limited, Maidenhead, England (1976).

[Birt 77] Birtwistle, Graham M., Dahl, O.J., Myhrhaug, B., Nygaard, K. "Simula Begin" Petrocelli, N.Y. (1977).

[Bizz 60] Bizzell, Robert "Modular Programming" unpublished paper File APS-137 (August 1960), cited in [Emer 62].

[Bohm 66] Böhm, Corrado and Jacopini, Giuseppe "Flow Diagrams, Turing Machines, and Languages with Only Two Formation Rules" Comm. ACM Vol.9, No.5 (May 1966) pp.366-371.

[Boeh 79] Boehm, Barry W. "Software Engineering: Research and Development Trends and Defense Needs" in Research Directions in Software Technology, Ed. P. Wegner, MIT Press, Cambridge, Mass. (1979) pp.44-86.

[Brin 70] Brinch Hansen, Per "The Nucleus of a Multiprogramming System" Comm. ACM Vol.13, No.4 (April 1970) pp.238-241, 250.

[Brin 72] Brinch Hansen, Per "Structured Multiprogramming" Comm. ACM Vol.15, No.7 (July 1972) pp.574-578.

[Brin 73] Brinch Hansen, Per "Concurrent Programming Concepts" ACM Computing Surveys Vol.5, No.4 (December 1973) pp.223-245. (a)

[Brin 73] Brinch Hansen, Per "Operating System Principles" Prentice-Hall, Englewood Cliffs, N.J. (1973). (b)

[Brin 78] Brinch Hansen, Per "Distributed Processes: A Concurrent Programming Concept" Comm. ACM Vol.21, No.11 (November 1978) pp.934-941.

[Brya 79] Bryant, R.E. and Dennis, J.B. "Concurrent Programming" in Research Directions in Software Technology, Ed. P. Wegner, MIT Press, Cambridge, Mass. (1979) pp.584-610.

[Buxt 76] Buxton, J., Naur, P., and Randell, B., Eds. "Software Engineering: Concepts and Techniques - Proceedings of the NATO Conferences" Petrocelli, N.Y. (1976).

[Cons 65] Constantine, Larry "Towards a Theory of Program Design" Data Processing (December 1965) pp.18-21.

[Cons 68] Constantine, L. "Segmentation and Design for Modular Programs" in Modular Programming: Proceedings of a National Symposium, Eds. T.O. Barnett and L. Constantine, Information and Systems Institute, Cambridge, Mass. (1968) pp.23-42.

[Cons 84] Constantine, L. Personal correspondence (August 28, 1984) in Appendix.

[Conw 63] Conway, M.E. "Design of a Separable Transition-Diagram Compiler" Comm. ACM Vol.6, No.7 (July 1963) pp.396-408. (a)

[Conw 63] Conway, M.E. "A Multiprocessor System Design" in Proceedings AFIPS Fall Joint Computer Conference Vol.24 (November 1963) pp.139-146. (b)

[Coug 74] Couger, J.D. and Knapp, R.W. "Evolution of Business System Techniques" Eds. J. Couger and J. Knapp, John Wiley and Sons, N.Y. (1974) pp.43-82.

[Cox 83] Cox, Brad "The Message/Object Programming Model" IEEE Softfair - Software Development: Tools, Techniques, Alternatives, IEEE Computer Society, Silver Springs, Md. (1983) pp.51-60.

[Curr 50] Curry, H.B. "A Program Composition Technique as Applied to Inverse Interpolation" Navy Ordnance Lab. Memorandum 10337, Silver Springs, Md. (1950) 98 pgs., cited by [Knut 80].

[Dahl 68] Dahl, O.J., Myhrhaug, B., Nygaard, K. "Simula 67 - Common Base Language" Norwegian Computing Center, Forskningsveien 1B, Oslo, Norway (May 1968).

[Dahl 72] Dahl, O.J., Dijkstra, E.W., Hoare, C.A.R. "Structured Programming" Academic Press, N.Y. (1972).

[Data 73] Datamation Vol.19, No.12 (December 1973).

[Denn 66] Dennis, J.B. and Van Horn, E.C. "Programming Semantics for Multiprogrammed Computations" Comm. ACM Vol.9, No.3 (March 1966) pp.143-155.

[Denn 73] Dennis, J.B. "Modularity" in Advanced Course in Software Engineering, Ed. F.L. Baur, Springer-Verlag, N.Y. (1973) pp.128-182. (a)

[Denn 73] Dennis, J.B. "Concurrency in Software Systems" in Advanced Course in Software Engineering, Ed. F. Baur, Springer-Verlag, N.Y. (1973) pp.111-127. (b)

[Dijk 62] Dijkstra, E.W. "Some Meditations on Advanced Programming" in Proceedings of the IFIP Congress (1962) pp.535-538.

[Dijk 65] Dijkstra, E.W. "Programming Considered as a Human Activity" Proceedings of the IFIP Congress, North-Holland, Amsterdam, The Netherlands (1965) pp.213-217. (a)

[Dijk 65] Dijkstra, E.W. "Cooperating Sequential Processes" Technological U. Eindhoven (1965) in Programming Languages, Ed. F. Genuys, Academic Press, N.Y. (1968) pp.43-119. (b)

[Dijk 67] Dijkstra, E.W. "The Structure of the 'THE' Multiprogramming System" presented at the ACM Symposium on Operating System Principles (October 1967) in Comm. ACM Vol.11, No.5 (May 1968) pp.341-346.

[Dijk 68] Dijkstra, E.W. "Go To Statement Considered Harmful" Letter to the Editor in Comm. ACM Vol.11, No.3 (March 1968) pp.147-148. (a)

[Dijk 68] Dijkstra, E.W. "Complexity Controlled by Hierarchical Ordering of Function and Variability" Proceedings of the IFIP Congress (1968) pp.114-116. (b)

[Dijk 69] Dijkstra, E.W. "Structured Programming" presented at NATO Science Conference (1969) in Classics in Software Engineering, Ed. E. Yourdon, Yourdon Press, N.Y. (1979) pp.43-50.

[Dijk 71] Dijkstra, E.W. "Hierarchical Ordering of Sequential Processes" Acta Informatica Vol.1, No.2 (1971) pp.115-138.

[Dijk 72] Dijkstra, E.W. "Notes on Structured Programming" in Structured Programming, O.J. Dahl, E.W. Dijkstra, C.A.R. Hoare, Academic Press, N.Y. (1972) pp.1-82. (a)

[Dijk 72] Dijkstra, E.W. "The Humble Programmer" Comm. ACM Vol.15, No.10 (October 1972) pp.859-866. (b)

[Dijk 84] Dijkstra, E.W. Personal correspondence (September 17, 1984) in Appendix.

[Earl 73] Earley, J. "Naming Structure and Methodology in Programming Languages" Tech. Rept. TR-17, University of Calif., Berkeley, Calif. (August 1973).

[Emer 62] Emery, J.C. "Modular Data Processing Systems Written in Cobol" Comm. ACM Vol.5, No.5 (1962) pp.263-268.

[Emer 84] Emery, J.C. Personal correspondence (December 10, 1984) in Appendix.

[Estr 63] Estrin, G. and Turn, R. "Automatic Assignment of Computations in a Variable Structure Computer System" IEEE Trans. on Electronic Computers EC-12 (1963) pp.754-773.

[Gann 77] Gannon, J.D. "An Experimental Evaluation of Data Type Conventions" Comm. ACM Vol.20, No.8 (August 1977) pp.584-595.

[Geha 84] Gehani, Narain "Ada Concurrent Programming" Prentice-Hall, Englewood Cliffs, N.J. (1984).

[Gold 83] Goldberg, Adele "Smalltalk-80: The Language and Its Implementation" Addison-Wesley, Reading, Mass. (1983).

[Gorn 57] Gorn, Saul "Standardized Programming Methods and Universal Coding" Journal of the ACM (1957) pp.254-273.

[Grie 79] Gries, David "Current Ideas in Programming Technology" in Research Directions in Software Technology, Ed. P. Wegner, MIT Press, Cambridge, Mass. (1979) pp.254-275.

[Gutt 77] Guttag, John "Abstract Data Types and the Development of Data Structures" Comm. ACM (June 1977) pp.396-404.

[Hash 82] Hashimoto, Irvin, Kroll, Barry, and Schafer, John "Strategies for Academic Writing" U. of Michigan Press (1982).

[Hoar 65] Hoare, C.A.R. "Record Handling" in Algol Bulletin Vol.21 (November 1965) pp.39-69.

[Hoar 71] Hoare, C.A.R. "Towards a Theory of Parallel Programming" International Seminar on Operating System Techniques, Belfast, N. Ireland (Aug.-Sept. 1971) in Operating System Techniques, Eds. C.A.R. Hoare and R.H. Perrott, Academic Press, N.Y. (1973) pp.61-71.

[Hoar 74] Hoare, C.A.R. "Monitors: An Operating System Structuring Concept" Comm. ACM (October 1974) pp.549-557.

[Hoar 78] Hoare, C.A.R. "Communicating Sequential Processes" Comm. ACM Vol.21, No.8 (August 1978) pp.666-677.

[Hopk 72] Hopkins, Martin E. "A Case for the GOTO" Proceedings of the 25th National ACM Conference, Boston, Mass. (August 1972) pp.787-790.

[Jack 75] Jackson, Michael A. "Principles of Program Design" Academic Press, London (1975).

[Karp 60] Karp, R. "A Note on the Application of Graph Theory to Digital Computer Programming" Information and Control Vol.3 (June 1960) pp.179-190.

[Kay 82] Kay, Alan "New Directions for the Novice Programmer in the 1980's" in State of the Art Report Programming Technology, Ed. P.L. Wallis, Pergamon Infotech Limited, Maidenhead, England (1982) pp.209-248.

[Knut 74] Knuth, Donald E. "Structured Programming with go to Statements" Computing Surveys Vol.6, No.4 (December 1974) pp.261-301.

[Knut 77] Knuth, Donald E. "Structured Programming with go to Statements" in Current Trends in Programming Methodology Vol.I, Ed. Raymond Yeh, Prentice-Hall, Englewood Cliffs, New Jersey (1977) pp.140-193.

[Knut 80] Knuth, Donald E. "The Early Development of Programming Languages" in A History of Computing in the 20th Century, Eds. N. Metropolis, J. Howlett, G.C. Rota, Academic Press (1980) pp.197-274.

[Kope 79] Kopetz, H. "Infotech State of the Art Report on Structured Software Development" Vol.I, Infotech International Limited, Maidenhead, England (1979).

[Lisk 72] Liskov, Barbara "A Design Methodology for Reliable Software Systems" Proceedings of the Fall Joint Computer Conference, AFIPS Press, Montvale, N.J. (1972) pp.191-199.

[Lisk 74] Liskov, Barbara and Zilles, Stephen "Programming with Abstract Data Types" ACM Sigplan Notices (April 1974) pp.50-59.

[McCr 73] McCracken, Daniel "Revolution in Programming: An Overview" Datamation Vol.19, No.12 (December 1973) pp.50-52.

[Meal 67] Mealy, George "Another Look at Data" in Proceedings of the AFIPS Vol.31 (1967) pp.525-534.

[Morr 73] Morris, J.H. Jr. "Protection in Programming Languages" Comm. ACM Vol.16, No.1 (January 1973) pp.15-21.

[Myer 73] Myers, Glenn J. "Characteristics of Composite Design" Datamation Vol.19, No.9 (September 1973) pp.100-102.

[Myer 78] Myers, Glenn J. "Composite/Structured Design" Van Nostrand-Reinhold, N.Y. (1978).

[Mill 71] Mills, Harlan D. "Top-Down Programming in Large Systems" in Debugging Techniques in Large Systems, Ed. R. Rustin, Prentice-Hall, N.J. (1971) pp.41-56.

[Naur 63] Naur, Peter "GOTO Statements and Good Algol Style" BIT Vol.3, No.3 (1963) pp.204-208.

[Nyga 81] Nygaard, Kristen and Dahl, Ole-Johan "The Development of the Simula Languages" in History of Programming Languages, Ed. R.L. Wexelblat, Academic Press, N.Y. (1981) pp.439-478.

[Palm 73] Palme, J. "Protected Programming Modules" in Simula 67 FOAP (1973).

[Parn 71] Parnas, D.L. "Information Distribution Aspects of Design Methodology" in Proceedings of the IFIP Congress, North-Holland, Amsterdam (1971) Booklet TA-3, pp.26-30.

[Parn 72] Parnas, D.L. "On the Criteria to be Used in Decomposing Systems into Modules" Comm. ACM Vol.15 (December 1972) pp.1053-1058.

[Parn 76] Parnas, D.L. "On the Design and Development of Program Families" IEEE Trans. on Software Engineering Vol.SE-2, No.1 (March 1976) pp.1-9.

[Pros 59] Prosser, R.T. "Applications of Boolean Matrices to the Analysis of Flow Diagrams" Proceedings Eastern Joint Computer Conference, Boston, Mass. (December 1959) pp.133-138.

[Ross 61] Ross, Douglas "A Generalized Technique for Symbol Manipulation and Numerical Calculation" Comm. ACM Vol.4, No.3 (March 1961) pp.147-150.

[Ross 63] Ross, Douglas T. and Rodriguez, Jorge E. "Theoretical Foundations for the Computer-Aided Design System" AFIPS (1963) pp.305-322.

[Shaw 84] Shaw, Mary "Abstraction Techniques in Modern Programming Languages" IEEE Software (October 1984) pp.10-26.

[Stev 74] Stevens, Wayne P., Myers, Glenn J., and Constantine, Larry "Structured Design" IBM Systems Journal Vol.13, No.2 (1974) pp.115-139.

[Step 82] Stepanek, Stephen "Block Structured Applicative State Transition Language" Masters Thesis, California State University Northridge, Northridge, Calif. (1982).

[Turn 82] Turner, D.A. "Prospects for Non-Procedural and Data Flow Languages" in State of the Art Report Programming Technology Series 10, No.2, Ed. P. Wallis, Maidenhead, England (1982) pp.341-352.

[USDe 81] U.S. Department of Defense "Ada Reference Manual" Lecture Notes in Computer Science Vol.106, Springer-Verlag, N.Y. (1981).

[Wass 82] Wasserman, Anthony "New Directions in Programming" in State of the Art Report on Programming Technology Series 10, No.2, Ed. P. Wallis, Maidenhead, England (1982) pp.367-383.

[Wegn 76] Wegner, Peter "Programming Languages: The First 25 Years" IEEE Trans. on Computers (December 1976) pp.1207-1225.

[Wegn 79] Wegner, Peter "Programming Languages: Concepts and Research Directions" in Research Directions in Software Technology, Ed. P. Wegner, MIT Press, Cambridge, Mass. (1979) pp.425-490.

[Wegn 83] Wegner, Peter and Smolka, Scott "Processes, Tasks, and Monitors: A Comparative Study of Concurrent Programming Primitives" IEEE Trans. on Software Engineering Vol.SE-9, No.4 (July 1983) pp.446-462.

[Whee 52] Wheeler, David J. First ACM National Conference (1952), cited by [Knut 80].

[Whit 83] White, John R. "On the Multiple Implementations of Abstract Data Types Within a Computation" IEEE Trans. on Software Engineering Vol.SE-9, No.4 (July 1983) pp.395-410.

[Wien 84] Wiener, Richard and Sincovec, Richard "Software Engineering with Modula-2 and Ada" John Wiley & Sons, N.Y. (1984).

[Wilk 51] Wilkes, M.V. "The Preparation of Programs for an Electronic Digital Computer" Addison-Wesley, Cambridge, Mass. (1951).

[Wilk 80] Wilkes, M.V. "Early Program Developments at Cambridge" in A History of Computing in the 20th Century, Eds. N. Metropolis, J. Howlett, G.C. Rota, Academic Press, N.Y. (1980) pp.497-504.

[Wirt 66] Wirth, Niklaus and Hoare, C.A.R. "A Contribution to the Development of Algol" Comm. ACM Vol.9, No.6 (June 1966) pp.413-431.

[Wirt 71] Wirth, Niklaus "Program Development by Stepwise Refinement" Comm. ACM Vol.14, No.4 (April 1971) pp.221-227.

[Wirt 74] Wirth, Niklaus "On the Composition of Well Structured Programs" Computing Surveys Vol.6, No.4 (December 1974) pp.247-259.

[Wirt 79] Wirth, Niklaus "The Module: A System Structuring Facility in High-Level Programming Languages" in Proceedings of the Symposium on Language Design and Programming Methodology (September 1979) pp.1-24.

[Wulf 72] Wulf, William A. "A Case Against the GOTO" Proceedings of the 25th National ACM Conference (August 1972) pp.791-797.

[Wulf 79] Wulf, W.A. "Introduction to Part I: Comments on 'Current Practice'" in Research Directions in Software Technology, Ed. P. Wegner, MIT Press, Cambridge, Mass. (1979).

[Wulf 80] Wulf, W.A. "Trends in the Design and Implementation of Programming Languages" IEEE Computer (January 1980) pp.14-23.

[Your 78] Yourdon, Ed and Constantine, Larry "Structured Design: Fundamentals of Computer Program and Systems Design" Yourdon Press, N.Y. (1978).

[Your 79] Yourdon, Edward, Ed. "Classics in Software Engineering" Yourdon Press, N.Y. (1979).

APPENDIX - LETTERS


DAVID J. KOEPKE 13912 Moorpark St. Sherman Oaks, California 91423

September 12, 1984

Professor Dr. Edsger W. Dijkstra
Department of Computer Sciences
University of Texas at Austin
Austin, Texas 78712-1188

Dear Mr. Dijkstra:

I am a Masters' student at California State University, Northridge, and I am writing a thesis on the History of Software Design Ideas. I need some help in the form of answers to the following short questions and would greatly appreciate your response.

1. I am finding the history of Software Design a very elusive subject. Your 1965 article "Programming as a Human Activity" is the earliest presentation I have found on the ideas of structured programming. Is there an earlier article or articles?

2. I have read your "Structured Programming" paper and feel I know your implicit definition of structured programming. However, I have read many articles by several authors that seem to derive their own definitions, causing conflict and confusion. Would you care to give a definition to set the record straight?

Any additional information you feel would be relevant or helpful would be appreciated. I am enclosing a self-addressed, stamped envelope for your convenience.

Thank you very much.

Sincerely,

David J. Koepke
DJK/pm

COLLEGE OF NATURAL SCIENCES THE UNIVERSITY OF TEXAS AT AUSTIN

Department of Computer Sciences · T. S. Painter 3.28 · Austin, Texas 78712-1188 · (512) 471-7316

Mr. ":Do.vicl J. )( o~]Oke 1Sq 1'1 Moorpo.r\'( Stree ~ S'ner'""o."" Ooka. CF\ q1~2S

"De,(l,.. Nr. K oep'kc.

As to your first question: at the first IFIP Congress, Munich, 1962, I gave an invited speech under the title "Advanced Programming". The text has been reprinted in the Conference Proceedings, which have been published - if I am not mistaken - by the North-Holland Publishing Company. (I remember that the speech was very well received.) As to your second question, I have never intended to give a definition.

Controlling/avoiding program complexity: my task was to discover how to keep programs intellectually manageable.

"W;}.h "".) ,:,reehn,js o.~d ~$\- wisk5 f(l'l"' .jov..­ thQsis ,

Yours ever,
Edsger W. Dijkstra

DAVID J. KOEPKE 13912 Moorpark St. Sherman Oaks, California 91423

August 16, 1984

Mr. Larry L. Constantine
c/o Ms. Wendy Ekin
Yourdon Press, Inc.
1133 Avenue of the Americas
New York, N.Y. 10036

Dear Mr. Constantine:

I am a Masters' student at California State University, Northridge, and I'm writing a thesis on the History of Software Design Ideas. I need some help in the form of answers to the following short questions and would greatly appreciate your response.

1. I'm having difficulty locating the origin of the term 'top-down'. I notice you mention it in your December 1965 article "Towards a Theory of Program Design." Where did you first hear of the term 'top-down'? Can you give any references of uses of this term prior to your article?

2. Which 'system engineering' works of Dijkstra's influenced your work? Is this where you learned about the hierarchy of modules? If not, where?

Any additional information you feel would be relevant or helpful would be appreciated. I am enclosing a self-addressed, stamped envelope for your convenience.

Thank you very much.

Sincerely,

David J. Koepke
DJK/pm
Enc.

LARRY L. CONSTANTINE

22 Bulette Road · Acton, Massachusetts 01720

28 August 1984

David J. Koepke 13912 Moorpark St. Sherman Oaks, CA 91423

Dear Mr. Koepke:

I am very pleased to hear of your thesis; I think there is much fertile ground for study in the history of design ideas. (E.g., many people credit Ed Yourdon and me for "data flow" or "bubble" charts. I did first apply them to analysis and design, but they were invented by Martin and Estrin.)

I do not any more own a copy of the original "Towards a Theory of Program Design," so I can't verify or check on the context, but if I used the term "top-down" then it is possible that I was the first to do so in print. It is likely that my mentors at C-E-I-R, Kenneth Mackenzie and David Jasper, used it first; much of that early article was my distillation and interpretation of their thinking. (An earlier work with a related emphasis on hierarchy and modularity was J. C. Emery's "Modular Data Processing Systems in Cobol," CACM 5 (5). I may have read this before writing the "Towards" piece in 1964, but I didn't meet Jim until 1967, when I started teaching for the Wharton School.)

By the time I started teaching for IAT in 1966, the only piece by Dijkstra that had "influenced" me was his 1965 IFIP piece, "Programming Considered as a Human Activity." I was less impressed by the later sound and fury over GOTOless programming because, as many of us knew all along, the issue in actually designing and building computer programs was not, ultimately, one of mathematical elegance or formal completeness but a matter of discipline and style. I was more involved in the theoretical bases and practical approaches to design as the more abstract precursor to programming. (I was probably not then a very good programmer, but I've always been a heck of a good designer. My real forte then, and I think today, was as a theorist. Fortunately, since I write a lot of my own software now, I've gotten to be a pretty good programmer, too.)

I am embarrassed that I cannot now cite the major works in systems engineering (not Dijkstra's) I found most helpful. (I presume you got this idea from my Preface to Structured Design.) There were several works, including a textbook called Systems Engineering I used in my I&SI courses, but I no longer have a copy or even a citation. (I gave my entire library of books and reprints to Framingham State College when I "left" the computer field for the first time in 1972.) I have long considered myself to be a general systems theorist wor-


et al.), information theory, and general systems theory (especially von Bertalanffy).

I really learned about module hierarchy from Ken Mackenzie and Dave Jasper at C-E-I-R, who pushed the idea of sensible, planned arrangements of subroutines in layers. In 1963 their thinking was very advanced in this area, all the more so because most of our work was on 8-12k IBM 1401s! What I did initially was to try to reduce their subjective judgements to rules and criteria.

As a principal in the "structural revolution" I was always somewhat of a lone wolf, always a bit peripheral, too "academic" for many practicing professionals and never considered legitimate in academic circles. Not until after I left the field did the ideas of cohesion and coupling, for example, gain much currency. Even now, though Structured Design is widely used as a text, academia has done comparatively little with the theoretical underpinnings. From what I have seen, there has been far more debate on the "approaches" (Jack-

My own view (and some others agree, e.g. Bill "P. J." Plauger) is that module cohesion and intermodular coupling as concepts are constructs of an essential theory of program complexity (from a human viewpoint) which undergirds all effective techniques for good systems design and development. This line of thinking still needs further research and exploration.

You may be interested in a recent paper of mine, to appear in a special issue of the Journal of Psychotherapy and the Family. It is called "Computer-Aided Assessment of Families: Design Considerations," and is the first paper in which I have cited my works in both family and computer fields! This piece represents coming full circle in the family field and acknowledging something of my roots. And, next year I am co-chairing a conference on computers, technology, and families.

My current work in family theory illustrates the ways in which general systems theory, cybernetics, and information theory continue to pervade my theoretical efforts. I have developed an integrated model of the interrelationships between observed behavior (family process), the communication feedback mechanisms which regulate interaction patterns (family regime), and the reference images for those feedback mechanisms (family paradigm). As in the computer field, it has been a long, tough road to acceptance of my ideas, but papers have appeared recently in two of the premier journals in the field and next year my magnum opus in the family field, Family Paradigms: The Practice of The-


Good luck with your thesis. Would you tell me when it is done and send me a copy of any papers derived from it?

Larry L. Constantine
Assistant Professor
Human Development & Family Relations
University of Connecticut

P.S. The National Symposium on Modular Programming in 1968 was a turning point in a number of lines of development, but the Proceedings never appeared in final form because the publisher (I&SI) folded. I have a single copy of the limited-edition preprint plus one copy of another I&SI limited-edition collection of early works on program structure and design, Concepts in Program Design. If these are of interest, I could lend them to you.

DAVID J. KOEPKE 13912 Moorpark St. Sherman Oaks, California 91423

September 13, 1984

Mr. Jim C. Emery
4423 Osage Ave.
Philadelphia, PA 19104

Dear Mr. Emery:

I am a Masters' student at California State University, Northridge, and I'm writing a thesis on the History of Software Design Ideas. I need some help in the form of answers to the following short questions and would greatly appreciate your response.

In your 1962 article in CACM, "Modular Data Processing Systems Written in Cobol," you talk about module hierarchy and using modules for flexibility. Where did you first learn of these ideas?

If it was from your references "Modular Programming" by Robert Bizzell or "Use of Systems Modules in Systems Design" by Frank J. Carr, where might I obtain a copy of these?

This and any additional information would be appreciated. I am enclosing a stamped, self-addressed envelope for your convenience.

Thank you very much.

Sincerely,

David J. Koepke
DJK/pm
Enc.

UNIVERSITY OF PENNSYLVANIA · PHILADELPHIA 19104

THE WHARTON SCHOOL · (215) 898-8635
DEPARTMENT OF DECISION SCIENCES
JAMES C. EMERY, Chairman

December 10, 1984

Mr. David J. Koepke
13912 Moorpark St.
Sherman Oaks, CA 91423

Dear Mr. Koepke:

In response to your letter of October 18 (which took a while reaching me from Macmillan), I am enclosing a 1961 memo that describes the concept. As you will see, my concept of program modules was somewhat different from (although related to) the current use of the term. As I used the term, modules are standard building blocks out of which a tailored system could be constituted. In today's usage, a system is broken down into self-contained blocks as a means of limiting the effects of a change in a program.

The concept, as I used it in my Westinghouse project, was original with me (at least as far as I know). I borrowed the idea from the construction industry, in which standard modules (such as a standard wall panel) could be used in the construction of buildings.

I am afraid I do not have Robert Bizzell's or Frank Carr's references. In fact, I do not remember anything about them, or their dates. In the case of Carr, however, his use of the modular concepts stems from my work (as described in the enclosed cover memo).

I hope that this information will be of some use to you.

Very truly yours,
James C. Emery

JCE:cah
Enclosure