<<

The Relationship Between COBOL and Science BEN SHNEIDERMAN

Based on interviews, reviews of the literature, and personal impressions, the author offers historical, technical, and social/psychological perspectives on the fragile relationship between COBOL and . The technical contributions of COBOL to design are evaluated. Five proposals for computer science research on COBOL and fourth-generation languages are described. Categories and Subject Descriptors: 0.3.0 [Genera/]-standards; 0.3.2 [Language Classifications]-cosoc K.2 []-coso& software; K.3.2 [ and Education] Computer and Information Science Education-computer science education General Terms: Design, Languages,

For a computer scientist to write sympathetically More recently I have turned my attention to psycho- about COBOL is an act bordering on heresy. It requires logical or human-factors issues of programming, courage because academic colleagues and data proc- query facilities, and human-computer inter- essing professionals are both likely to be suspicious of action (Shneiderman 1980). I have taught introduc- my motives. Therefore, I feel my first obligation is to tory PrOgramming coursesin ,BASIC,&isCal, make clear my intention and orientation. COBOL, APL, and PL/~ and have written or coauthored I believe that COBOL has had a strong and largely textbooks using the first three of these languages. positive influence on the emergence of computer usage. The development of COBOL was a pioneering Historical Perspective effort that advanced the state of the art in practical data processing and language design. COBOL clearly Five aspects of the historical development of COBOL had many flaws, some of which have been overcome can be traced as important influences in the alienation by revisions to the original language. Designers of of COBOL from the computer science community. First, other languages have learned from the mistakes and academic computer scientists did not participate in overcome the problems with novel constructs. This the design team. The developers of COBOL were from paper offers three perspectives on the rise of COBOL the commercial community: the manufacturers and (historical, technical, and social/psychological) and users of large data processing systems in and suggests directions for future cooperation. government (see the minutes in this issue for lists of My orientation is as a computer scientist whose the participants). early work was in database systems and programming. In 1959-1960 very few academics could have been classified as computer scientists, of course, and few of them could have made useful contributions, but en- 0 1985 by the American Federation of Information Processing Societies, Inc. Permission to copy without fee all or part of this gaging them might have been beneficial. material is granted provided that the copies are not made or distrib- The developers of the Ada language realized this uted for direct commercial advantage, the AFIPS copyright notice possibility and made ambitious and successful efforts and the title of the publication and its date appear, and notice is given that the copying is by permission of the American Federation to elicit the participation of academic and industrial of Information Processing Societies, Inc. To copy otherwise, or to researchers. Computer scientists can do more than republish, requires specific permission. contribute ideas; they are often effective in dissemi- Author’s Address: Department of Computer Science, University of Maryland, College Park, MD 20742. nating new concepts through their publishing, teach- 0 1985 AFIPS 0164-l 239/85/040348-352$01 .OO/OO ing, and lecturing efforts.

348 . Annals of the History of Computing, Volume 7, Number 4, October 1985 6. Shneiderman - COBOL and Computer Science

The second historical aspect is that the COBOL science. It is not surprising that they worked sepa- developers apparently had little interest in the aca- rately and in parallel. The development of computer demic or scientific aspects of their work. The May science and data processing might have been substan- 1962 issue of the Communications of the ACM was tially altered if these communities had taken the time devoted to 13 papers describing COBOL and related to meet and work together. issues. Every article was written by an industry or government person. These people did not have the Technical Perspective academic frame of mind that involves referencing previous and related work; only four of the papers had Getting convergence on the technical successes and any references. Sociologists of science who use cita- failures of COBOL was a difficult task. I spoke to 30- tions to trace the flow of ideas would recognize this 40 computer scientists and data processing profession- pattern as an indicator of intellectual separatism. als in trying to sort out the issues. The following The third aspect is the decision of the COBOL de- analysis is my own view guided by the interviews. velopers not to use the Backus-Naur Form (sometimes First the successes. The strongest point of agree- called Backus Normal Form) notation as the metalan- ment was that COBOL contributed the record structure, guage to describe COBOL. The COBOL developers were explicit structure definition, and the separation of unaware of this work, which appeared in a conference data definition from procedural aspects. The COBOL report in June 1959. I don’t see the failure to use BNF record and file structure certainly influenced the de- as central, but apparently the criticism at the time sign of PL/~ and Pascal. The notion of an aggregation was severe (Sammet 1981, p. 233). The COBOL style of of dissimilar items is a major advance over the FOR- metalanguage has become widely used and might be TRAN array. The Pascal notion of variant records can considered as an important contribution. be traced to the COBOL REDEFINES clause. A fourth concern is the process of describing COBOL Explicit file structure definitions that included a to the academic and industrial community. Profes- hierarchy of names for fields and the separate DATA sionals could learn the language from Division were the predecessors of the database man- reference manuals, but a well-written book emphasiz- agement system (DBMS) concept. DBMSs are a vital ing the conceptual foundations of COBOL would have part of computer science and are the source of a rich been an asset. It took a few years before successful theory that is still developing rapidly. introductions were available (McCracken 1963; Saxon Another contribution was the diverse set of control 1963) to teach COBOL to novices. Note the contrast structures, which were quite sophisticated for the time. with the dissemination of Pascal. pub- The COBOL IF-THEN-ELSE, in spite of its awkward lished a precise description of the language in Acta scope delimiter rules, reduced the need for and Informatica (1971), an interesting book titled System- permitted the creation of more comprehensible code. atic Programming: An Introduction (1973), the widely The variety of PERFORM statements allowed conveni- read Pascal User Manual and Report (1975), and a ent and powerful looping and some degree of modular forward-looking textbook Algorithms + Data Struc- design. The ease with which paragraphs could be tures = Programs (1976). created, named, and used facilitated hierarchical de- A search of the of Congress SCORPIO system sign. revealed 252 books indexed under COBOL, 529 under A popular feature of COBOL that has appeared in FORTRAN, and 1054 under BASIC, demonstrating the other languages is the COPY statement. By including relatively lower rate of publication on COBOL. Pascal groups of statements from a library, organizational is more recent, but already 208 books are indexed standards were easily enforced, were under that programming language. encouraged to cooperate, and reuse of code became The fifth historical aspect is that the people who convenient. might have accepted the title of computer scientist in Another interesting COBOL concept is the ENVIRON- 1960 were not interested in the problem domain of MENT Division, which allowed users to specify COBOL programs. The commercial file-processing machine or dependencies. It permitted deli- problems were remote from the concerns of computer nition of separate host and target computers, thus scientists, who generally dealt with numerical analy- foreshadowing the idea of cross-. sis, physics, engineering problems, and systems pro- Finally, the COBOL community demonstrated the gramming. power of portability and standardization. In spite of In summary, the people and the problems in the some local variations, COBOL is largely machine inde- COBOL world were very different from the people and pendent. Programmers could successfully transfer problems who were laying the basis for computer their knowledge and often their programs from one

Annals of the History of Computing, Volume 7, Number 4, October 1985 l 349 . Shneiderman * COBOL and Computer Science

organization to another. Standardization also encour- somewhat more powerful INSPECT command, plus aged the development of software tools and reuse of STRING and UNSTRING. code. Several computer scientists complained about the Some of the perceived technical failures of COBOL lack of in COBOL, but I think that it hardly might have been avoided by early consultation with would have made a difference in the use of the lan- computer scientists, but other problems would not guage. have been recognized until the early 1970s. The most Knowledgeable COBOL programmers in my survey serious omission is a function or procedure definition had other complaints, but I did not judge them facility with parameters and local scope of variables. to be vital. The original version of COBOL has only global vari- ables; therefore a generic to sum the ele- Social/Psychological Perspective ments of an array, search a string, or print a bar chart was not easy to write. Two programmers working Now we move on to some more speculative areas that together had to coordinate carefully to ensure that reflect on the fundamental differences between the they did not inadvertently both use the same variable computer science and business data processing com- name for different values. The lack of protected mod- munities. The rejection of COBOL by most computer ule boundaries also allowed complex and sometimes scientists is a product of their desire to avoid the dangerous programming techniques. A DEFINE business data processing domain, their pursuit of feature was in the original language description and mathematically oriented theory, and often their lack appears to be the first such facility in a high-level of knowledge about COBOL. language. Unfortunately, it was never implemented When asked for his comments, one computer sci- and was eventually dropped from the language. The entist who is a widely respected expert in program- 1974 revision to COBOL included the now-popular CALL ming languages boldly replied, “What’s COBOL?" His USING feature, which permitted parameters and run- world did not have a place for COBOL. In fact, several time creation of procedure names. programming language texts (e.g., MacLennan 1983) Computer scientists often complained about the do not include COBOL in the index. Another responded, wordiness of COBOL statements and the clutter of the “It’s terrible . . . ugly,” but had difficulty explaining optional noise words. The designers of COBOL appar- why. I suspect this prejudice emerges from the bias of ently believed that the English-like statements would many computer scientists against the problem domain make programs readable by managers or other non- and the wordy, nonmathematical style of COBOL, programmers, but the of programming are rather than from any serious consideration of the at least as challenging as the syntax. Computer sci- technical weaknesses. Dijkstra (1982) wrote, “COBOL entists whose background emphasized mathematical cripples the mind,” but he was equally harsh on FOR- notation, which is precise and concise, felt that COBOL TRAN (“infantile disorder”), PL/I (“fatal disease”), was too wordy and somehow unscientific. BASIC, and APL. Tompkins (1983) sought to defend I found and was sympathetic to several complaints COBOL, and Reid (1983) responded with many legiti- about COBOL control structures. The scope of an IF- mate criticisms. THEN-ELSE is delimited by a period, which is often The bias against the problem domain is stated ex- missed by programmers when reading and even when plicitly in Pratt’s programming language textbook writing programs. It might have been better to have (1975; 1984), which says that COBOL has “an orienta- used a keyword such as ENDIF. The PERFORM state- tion toward business data processing . . . in which the ment cannot contain a list of statements, only the problems are . . . relatively simple algorithms coupled name of a paragraph. Studying a program is tricky with high-volume input-output (e.g. computing the because the reader must hunt for the body of the payroll for a large organization).” Anyone who has PERFORM statement. With short loops, the overhead written a serious payroll program would hardly char- is annoying, and confusion can increase. acterize it as “relatively simple.” I believe that com- Poor string-handling facilities were cited by several puter scientists have simply not been exposed to the people as a major weakness of early COBOL. Moving complexity of many business data processing tasks. and copying of strings was convenient, but insertion Computer scientists may also find it difficult to pro- and deletion of characters inside a string was difficult. vide elegant theories for the annoying and pervasive The EXAMINE verbandthe TALLYING and REPLACING complexities of many realistic data processing appli- options were included in the early COBOL specifica- cations and therefore reject them. tions to facilitate plans to write a COBOL compiler in Tucker’s programming language textbook (1977) COBOL. The 1974 version of COBOL contained the has this : “We judge COBOL's programming

350 - Annals of the History of Computing, Volume 7, Number 4, October 1985 B. Shneiderman - COBOL and Computer Science

features as fair and its implementation dependent significantly influenced by COBOL," and goes on to features as poor . . . its overall writing as fair to poor, say, “Most computer scientists are simply not inter- its overall reading as fair and its data processing estedin COBOL." support as good. . . . [It has] tortuously poor compact- I do feel that COBOL was a major influence on PL/I, ness and poor uniformity.” Not much to warm the which was designed explicitly to include the popular heart of a COBOL programmer. features of COBOL,FORTRAN, and ALGOL. COBOL also Several computer scientists remarked about the had an impact on the use of record structures in “trade school” nature of COBOL and that university languages such as Pascal and on the creation of professors did not like dealing with current practice, database systems. but sought to distinguish themselves with novel lan- Most important, COBOL greatly facilitated the enor- guages, theory, and an abstract, more mathematical mous expansion of computer usage in data processing. orientation. One professor commented that he was The demand for programmers and computers bene- “hostile to teaching what is used commercially,” while fited the entire industry and stimulated further sci- a researcher sneered at the “folly of an English-looking entific advances because there was such a large market language.” for new ideas. The desire to be aloof from current practice was a common theme and leads me to the The Future: A Challenge to Computer Scientists Theory Conjecture: Computer scientists like pro- gramming language theory, but find fault with any The story of COBOL is not over. The continuing widely used language. changes to COBOL, refinements in design guidelines, The Theory Conjecture should be comforting to and improvements in teaching strategies mean that COBOL supporters because it means that computer the COBOL of today looks very different from the scientists will express displeasure for almost any pro- COBOL of 1960. COBOL and the so-called fourth-gen- gramming language that is widely used. Since com- eration languages will still be around in 25 more years, puter scientists desire to be with the state of the art, and maybe computer scientists still have an opportu- anything that is widely used must also be outdated. nity to influence their evolution. Commercial usage lowers academic prestige. Let me propose five areas of beneficial COBOL and A related principle might be expressed as the fourth-generation language research that might be of interest to computer scientists. Egocentricity Conjecture: Computer scientists appre- ciate no programming language except the one that they Code Optimization: There is a grand opportunity to design. apply traditional compiler optimization techniques to The role of a scientist is to innovate, so comments COBOL. Even more provocative would be to explore on other people’s work are often in the form of criti- optimizations that are specific to the COBOL domain. cism that lays the basis for a proposed improvement. Instead of eliminating redundant mathematical sub- expressions, the COBOL compiler writers could concen- trate on eliminating redundant MOVE or file opera- Summary tions. Global dataflow analysis would be a challenge in the COBOL context. Jean Sammet’s book on programming languages (1969) and her review of the history of COBOL (1981) Formal Semantics: Precise descriptions of COBOL offer lists of contributions of COBOL that are close to syntax are available, but there are variations in the my own impressions: semantics of some operations across implementations. Creating a formal semantic description of COBOL l l Readable language. would be useful to implementors and might require l l Separate data declaration section with rich record structure. novel techniques that could be applied in many pro- gramming language situations. l l Decent control structures. l l Machine-independent, portable, standardized Maintenance Tools: The enormous and valuable li- language. braries of COBOL programs are underutilized because l l A useful alternative metalanguage. tools are not adequate for indexing, searching, inter- l l Creation of a successful and large community of preting, and modifying this code. Library-science and data processing programmers. expert-system concepts might be applied to making In her review, Sammet (1981, p. 239) writes: “I per- the voluminous “literature” of COBOL more readily sonally do not see much language development . . . available for reuse.

Annals of the History of Computing, Volume 7, Number 4, October 1985 l 351 B. Shneiderman * COBOL and Computer Science

Programming-Style Research: Because COBOL is so tation to give a talk at the 1979 .National Computer widely used, we would see a substantial benefit if Conference’s Pioneer Day celebration of the 20th an- empirically tested style guidelines were available. niVerSary Of COBOL. Questions abound about the choice of mnemonic variable names, nesting of control structures, use of indentation, page formatting, modular design, REFERENCES commenting techniques, data structure design, etc. Dijkstra, Edsger W. May 1982. How do we tell truths that Empirical studies of program composition, compre- might hurt? ACM SIGPLAN Notices 17, 5, pp. 13-15. hension, , and modification by professional Jensen, Kathleen, and Niklaus Wirth. 1975. Pascal User programmers would be valuable in resolving some of Manual and Report. SecondEdition, New York, Springer- these issues and formulating a cognitive model of Verlag. programmer behavior. MacLennan, Bruce J. 1983.Principles of Programming Lan- guages: Evaluation and Implementation. New York, Holt, : In some ways COBOL is a Rinehart and Winston. convenient language as the target for a compiler or McCracken, Daniel D. 1963.A Guide to COBOL Program- preprocessor. Indeed, numerous preprocessors at- ming. New York, Wiley. Pratt, Terrence W. 1984. Programming Languages: Design tempt to offer higher-level control structures or data and Implementation. Second Edition, Englewood Cliffs, structures. How might procedural or data abstraction N.J., Prentice-Hall. concepts be molded to fit the COBOL context? Can Reid, E. October 1983.Fighting the disease:More comments computer scientists offer an interesting theory of pre- after Dijkstra and Tompkins. ACM SIGPLAN Notices 18, processors to parallel the theory of compilers? Does 10, pp. 16-21. Sammet, Jean E. 1969. Programming Languages: History large-system design in COBOL have features that are and Fundamentals. EnglewoodCliffs, N.J., Prentice-Hall. distinct from FORTRAN or Ada? Sammet, Jean E. 1981. “The Early History of COBOL.” In This list is only a starting point. COBOL and fourth- Wexelblat, . (ed.), History of Programming Languages, generation languages present many provocative chal- New York, Academic Press,pp. 199-243. Saxon, James A. 1963. COBOL. Englewood Cliffs, N.J., lenges to computer scientists. Also, electronic spread- Prentice-Hall. sheets such as VisiCalc and Lotus l-2-3 are exciting Shneiderman,Ben. 1980. Software Psychology: Human Fac- innovations that have yet to be properly acknowledged tors in Computer and Information Systems. Boston, Little, in the computer science community. Brown. Tompkins, H. E. April 1983. In defense of teaching struc- tured COBOL as computer science(or, Notes on being sage Acknowledgments struck). ACM SIGPLAN Notices 18, 4, pp. 86-94. Tucker, Allen B. 1977. Programming Languages. Reading, In addition to the people I interviewed, I was greatly Mass, Addison-Wesley. helped by knowledgeable reviews of early drafts by W&h, Niklaus. 1971. The programming languagePascal. , John Gannon, Rick Linger, Daniel Acta Informatica 1, 1, pp. 35-63. Wirth, Niklaus. 1973. Systematic Programming: An Intro- McCracken, Terrence Pratt, Edward Reid, and Jean duction. Englewood Cliffs, N.J., Prentice-Hall. Sammet. My first consideration of this topic was Wirth, Niklaus. 1976. Algorithms + Data Structures = Pro- stimulated by Hank Tropp and Jean Sammet’s invi- grams. Englewood Cliffs, N.J., Prentice-Hall.

352 * Annals of the History of Computing, Volume 7, Number 4, October 1985