TURING AWARD LECTURE

COMPUTER SCIENCE: THE EMERGENCE OF A DISCIPLINE

The continued rapid development of computer science will require an expansion of the science base and an influx of talented new researchers. Computers have already altered the way we think and live; now they will begin to elevate our knowledge of the world.

JOHN E. HOPCROFT

© 1987 ACM 0001-0782/87/0300-0198 75¢

Communications of the ACM March 1987 Volume 30 Number 3

It is a great honor to be a recipient of the 1986 Turing Award. Along with the recognition provided by the award comes the opportunity to present this lecture, and so to speak not only to computer scientists, but also to policy makers and to other members of the scientific establishment. I would like to start by relating some of my experiences in the field of computer science, and then to make some recommendations about the future development of the discipline.

I began my professional career in computer science in 1964, when computer science was just beginning to establish itself as an academic discipline. Having been a part of the academic community during the formative years of computer science, I have been in a fortunate position to observe it as it evolved and to watch it as it matured and developed. I have a great fondness for the field, and I would like to see it continue to flourish and grow. Even though computer science has emerged as a mature discipline, strong leadership and direction within the profession are still of great importance in order for it to contribute fully to science and society.

What has impressed me most during my years in computer science is the level of commitment of the participants. In the early years, I saw strong individual commitment. Later, I saw institutions joining forces with individuals to form strong support systems for computer science. Today, technology is advancing the discipline so rapidly that the combined commitment of individuals and institutions is no longer adequate to meet the challenges created by the expansion of knowledge. The demands of industrial research laboratories and academic institutions far surpass the current resources for producing the needed pool of talented researchers. To reap the maximum benefit from the scientific and technological advances of computing, a national commitment must be made and sustained.

Before expanding on this need, let me first relate some of the events in my career that have brought me to this position. When I received my Ph.D. in electrical engineering from Stanford University, my education had included only one course in computing, which was taught by David Huffman. My last year at Stanford coincided with the year Huffman spent there, and it was from him that I received a basic introduction to switching circuits, logic design, and the . During the spring of that same year, Edward McCluskey was recruiting faculty for his Digital Systems Laboratory at Princeton. By chance, he happened to be telephoning Bernard Widrow at Stanford to discuss prospective Ph.D. candidates just as I was walking by Widrow’s door. When Widrow saw me, he motioned me into his office and handed me the telephone, and we arranged that I would visit Princeton. The fact that I had no formal education in computer science, except for Huffman’s course, did not deter McCluskey. At that time, few people had an educational background in computing. After my visit, I was sufficiently impressed by McCluskey’s commitment to computer science and with the opportunity he presented me to begin a career in the developing science of computing that I accepted his job offer as soon as it was extended. This decision significantly altered my career plans, for prior to McCluskey’s telephone call, I had planned to teach electrical engineering on the West Coast.

My arrival at Princeton in the fall of 1964 occurred at a time when a dramatic change was taking place in the computing field. Much of the course content in computer science had focused on the design of circuits for digital computers and minimizing the number of transistors needed to build these circuits. By the mid-sixties, however, technology had advanced to the point where transistors were about to be replaced by computer chips with as many as a hundred components per chip. Thus, minimizing the number of transistors was no longer relevant. As you can imagine, this had profound ramifications for what was then called computer science; existing courses were about to become obsolete, and new ones had to be developed.

Princeton asked me to develop a course in automata theory to expand the scope of the curriculum beyond the digital circuit design course then being offered. Since there were no courses or books on the subject, I asked McCluskey to recommend some materials for a course on automata theory. He was not sure himself, but he gave me a list of six papers and told me that the material would probably give students a good background in automata theory. McCluskey’s list included works by Warren McCulloch and Walter Pitts, John Backus and Peter Naur, Noam Chomsky, Michael Rabin and Dana Scott, Juris Hartmanis and Richard Stearns, and, of course, Alan Turing.

At the time, I thought it strange that individuals were prepared to introduce courses into the curriculum without clearly understanding their content. In retrospect, I realize that people who believe in the future of a subject and who sense its importance will invest in the subject long before they can delineate its boundaries.

When I look back on the material in that early course in automata theory, I am struck by the diversity of these sources. In 1943 McCulloch and Pitts, working in neurophysiology, published a paper on a logical calculus for describing events in neuron nets. These events were series of electrical pulses and could be viewed as strings of zeros and ones. The paper had a notation for describing how these strings of zeros and ones combine in neurons to produce new strings of zeros and ones. This notation was subsequently developed into the language of regular expressions for describing sets of strings.

Rabin and Scott were mathematicians who developed a model of a computer with a finite amount of memory. They called this model the finite-state automaton, and showed that the possible behaviors of finite-state automata were precisely those behaviors that could be described by the regular expressions that grew out of the work of McCulloch and Pitts. This confluence of ideas from two widely different disciplines helped convince early computer scientists of the importance of regular expressions and finite automata.

Chomsky, a linguist, had been studying the syntax of natural languages. In the course of his work, he developed the concept of a context-free grammar. At about the same time, two computer scientists, Backus and Naur, were attempting to develop formalisms for describing programming languages. Before 1960 programming languages were defined by lengthy and often incomplete verbal descriptions. Inconsistencies in various implementations of a language often made it difficult to change software between systems. Backus and Naur developed a formal notation for describing the syntax of various programming languages. Amazingly enough, their notation was equivalent to the context-free grammars developed by Chomsky.

Turing had in 1936 introduced a simple model of a computing device, which is now known as the Turing machine. This device was simple enough that there could be no question that any function computed by it was computable. However, Turing argued further that his model could compute every function considered computable. Simply put, any computational process that could be carried out could be programmed on the Turing machine. Today this hypothesis is universally accepted, and the Turing machine is the foundation of modern computability theory.

Turing’s work might have remained in the realm of and logic were it not for a seminal paper on the computational complexity of algorithms by mathematicians Hartmanis and Stearns. They measured the complexity of an algorithm by the number of steps needed for its execution and used this method to develop a theory of complexity classes. This paper sparked the imagination of many computer scientists and led to the establishment of complexity theory as an integral part of the discipline.

Certain research papers are important not only for their technical contributions, but, more importantly, because they provide a conceptual view or establish a paradigm for research. The work of Hartmanis and Stearns attracted researchers and focused attention

on the topic of complexity. Among the more significant advances that resulted were the classification of the complexity of most major mathematical theories, the reducibility of many combinatorial problems, the concept of NP-completeness, and a deeper understanding of concepts such as randomness.

And so the early automata theory course served to emphasize two important aspects of computer science: the way it has used a multiplicity of ideas from diverse fields to develop and expand, and the way basic research can compound and escalate advances in computing. On a more personal note, the course had a tremendous impact on my career. Through it I met Alfred Aho and Jeffrey Ullman, with whom I subsequently collaborated for many years. Formal Languages and Automata Theory, which I wrote with Ullman, also evolved from this course.

In the spring of 1967, Princeton asked me to run a seminar series. As is often the case, funds for the seminars were limited. The intention was to invite local speakers; however, there was just enough money to sponsor two outside speakers from within a 266-mile radius. Drawing from my interest in automata theory, I invited Chomsky and Hartmanis, both of whom agreed to speak. Hartmanis’s visit had a significant impact on my career. During the course of his visit, he asked me about prospective Ph.D. candidates for faculty positions at Cornell. Our ensuing discussion that evening on the importance of the science base for computing and the fact that Cornell had established an independent computer science department led me to ask if I might be considered for one of the positions, instead of a new Ph.D. candidate. After visiting Cornell, I immediately accepted their job offer.

I had first learned of Cornell’s commitment to computer science from a news article that had appeared in 1966, the year before Hartmanis’s visit. At that time computer science departments were rapidly being established at a rate that exceeded the supply of faculty to staff them. The article discussed Cornell’s recognition of this problem and their efforts to solve it. Three faculty members from the mathematics and engineering departments had persuaded the Sloan Foundation to donate one million dollars to develop an independent computer science department to produce the Ph.D.’s needed to staff the computer science departments being created at other institutions.

By the time I arrived at Cornell in 1967, automata theory and complexity theory were established as parts of the computer science curriculum. And so shortly after switching institutions, I decided to switch fields and began to work on algorithms and data structures. I believed that the methodology of theoretical computer science could be used to develop a science base for algorithmic design that would be useful to the practitioner.

During the 1960s, research on algorithms had been very unsatisfying. A researcher would publish an algorithm in a journal along with execution times for a small set of sample problems, and then several years later, a second researcher would give an improved algorithm along with execution times for the same set of sample problems. The new algorithm would invariably be faster, since in the intervening years, both computer performance and programming languages had improved. The fact that the algorithms were run on different computers and programmed in different languages made me uncomfortable with the comparison. It was difficult to factor out both the effects of increased computer performance and the programming skills of the implementors, that is, to discover the effects due to the new algorithm as opposed to its implementation. Furthermore, it was possible that the second researcher had inadvertently tuned his or her algorithm to the sample problems. Conceivably, if the two algorithms were run again on another sample set of problems, the original algorithm might outperform the newer one.

I set out to demonstrate that a theory of algorithm design based on worst-case asymptotic performance could be a valuable aid to the practitioner. First, a size was associated with each instance of a problem; then the complexity of an algorithm was measured by calculating the rate of growth in computing time as a function of the problem size. Owing to problems in estimating input distribution, the worst-case analysis over all inputs of a given size was adopted as the measure of complexity.

There were a number of problems associated with worst-case asymptotic complexity. Some of the asymptotically optimal algorithms were numerically unstable; others were sufficiently complex that they were unlikely to be programmed. Further, the expected time for realistic input distributions would be a more reasonable measure of complexity. But the paradigm of worst-case asymptotic complexity provided a mathematical criterion for measuring the goodness of an algorithm, making it possible to ask such questions as “What is the optimal algorithm for a problem?” and “What is the intrinsic complexity of a problem?” The idea met with much resistance. People argued that faster computers would remove the need for asymptotic efficiency. Just the opposite is true, however, since as computers become faster, the size of the attempted problems becomes larger,

thereby making asymptotic efficiency even more important.

In early 1970 I took a year-long sabbatical at Stanford University, where I met and shared an office with Robert Tarjan, a second-year graduate student. The research recognized by the 1986 Turing Award took place during that period of collaboration. We worked together on a number of graph algorithms, including graph connectivity and planarity testing. We adopted the philosophy of designing for worst-case performance and demonstrated that a few simple techniques could be used to construct efficient algorithms in many areas of computer applications. We were able to prove that some of our algorithms had worst-case performance optimal within a constant factor, and also that the performance of the resulting algorithms far surpassed that of existing algorithms. Our planarity algorithm was able to test graphs with 1000 vertices in about 10 seconds, two orders of magnitude faster than existing algorithms. Our results attracted many researchers whose efforts created new data structures and design techniques. Today this area is an integral part of computer science. Again, it was not one specific algorithm that was of fundamental importance, but rather that our work captured the imagination of others. It also attracted the attention of bright young researchers who went on to establish data structures and algorithms as an important subdiscipline.

I would like to make some observations and recommendations for the future of computer science. During my career, I have seen the commitment of individuals and institutions grow and expand as computer science has grown and expanded. I believe, however, that much of the credit for the emergence of computer science as a discipline rests with the dedication and commitment of a relatively small number of researchers who had a vision of the potential of computing and the perseverance to make this vision a reality. It is now our responsibility to formulate a new vision, to shape the goals for the next generation of researchers. It is important that computer scientists share this new vision and make it a reality.

Today, computers have penetrated almost every aspect of modern life. They are used in agriculture, communication, education, manufacturing, and medicine. They are used to predict weather, to optimize food production, to control satellites in space, to develop new drugs, and to manufacture fuel-efficient automobiles. In medicine, their use in tomography allows precise imaging of vital organs to aid diagnosis and treatment. In the business world, they are used to route messages, handle commercial transactions, and access . They are advancing the frontiers of knowledge in physics, chemistry, and biology. Computers will also play a vital role in the economic revitalization of this country.

Computers have already made a major impact on the way we think and live. However, I envision an even greater impact. The potential of computer science, if fully explored and developed, will take us to a higher plane of knowledge about the world. Computer science will assist us in gaining a greater understanding of intellectual processes. It will enhance our knowledge of the learning process, the thinking process, and the reasoning process. Computer science will provide models and conceptual tools for the cognitive sciences. Just as the physical sciences have dominated man’s intellectual endeavors during this century as researchers explored the nature of matter and the beginning of the universe, today we are beginning the exploration of the intellectual universe of ideas, knowledge structures, and language. I foresee significant advances continuing to be made that will greatly alter our lives. The work Tarjan and I started in the 1970s has led to an understanding of data structures and how to manipulate them. Looking ahead, I can foresee an understanding of how to organize and manipulate knowledge.

In the sixties, a change in technology broadened the scope of computer science from circuit design to computation. The tremendous increases in computing power that are on the horizon today will once again fundamentally change computer science. No one can foresee the precise details, but we can sense the potential for understanding the semantics of language, abstraction, and knowledge representation. Just as early researchers were willing to invest their energies in computer science before fully understanding the details, we must also commit ourselves to the future of computer science before fully discerning its shape.

Today, there are signs that computer science is turning to applications areas. As it contributes its models, tools, and techniques to these new fields, they in turn will contribute new ideas and methodologies that will greatly enrich and expand the scope of computer science. Potentially, we are at the threshold of a new era of growth of the science. However, two areas impede our progress over this threshold.

First there is the inadequate size of the science base, as well as the lack of sufficient researchers to expand it. Despite a sizable body of knowledge, tools, and techniques, the science base of computer science is not developing as rapidly as it should. With computer technology advancing so quickly, it

is easy to lose sight of the importance of developing the underlying science base for computing. There is too great a temptation to focus simply on writing software. As we build larger and more complex systems, we must develop the conceptual tools that will allow us to comprehend the essence of a task and to develop the software tools needed to create complete systems. We have not been able to utilize fully the benefits of computing precisely because we have failed to develop the science base necessary for constructing reliable and user-friendly software systems quickly and economically. Existing programming languages are not satisfactory for the task; they appear to be requiring higher and higher skill levels. We must find ways of communicating with computers that lower the required skill level, just as the assembly line lowered the skill level needed in the production process. A factor of 100 improvement in software productivity would allow us to design, implement, and install a given system in two or three days rather than over the course of a year. Such improvements depend on the development of a science base for programming environments, software development, and man-machine interfaces.

Clearly, the science base established by early researchers has been beneficial. Each of the six papers covered in that early automata theory course added considerably to it. For example, extensions of the work of Backus and Naur on formal descriptions of programming languages allow us to automate the construction of the parsing component of compilation. An early Fortran compiler took 20-50 man-years of effort. Today we assign undergraduate term projects that involve constructing a compiler for a language that is conceptually far more sophisticated than Fortran and see it completed within the term. This dramatic improvement is attributable to the development of a science base in formal languages. This same science base allows us to construct compiler and to tailor languages to specific needs. It allows a chemist to write in a language where the data items are molecules and valences rather than integers and strings. Similar contributions have been made in databases, concurrency, algorithms, VLSI design tools, and many other areas. But greater efforts must be made.

Why is the science base lagging behind the technological base? The field of computer science has grown explosively, more rapidly than any other discipline in history. It is unique in that it evolved from researchers from diverse backgrounds instead of emerging from an existing discipline. Other fields, such as molecular biology, had the advantage of emerging from broader disciplines that could contribute researchers of all ages, along with resources and structures. Computer scientists came from many backgrounds and have not been able to bring the support structures of a mother discipline with them. Consequently, computer science began without a sufficient number of recognized senior scientists and without existing support structures to facilitate and channel its growth.

The demand for computer science researchers is ever increasing. Even within the profession, there is competition for researchers. The demands of educational institutions and research laboratories for computer scientists are growing faster than the field can build the infrastructure necessary for producing the needed pool of talent. The shortfall in computer science faculty, in particular, has serious ramifications. Unless an investment is made to provide sufficient researchers in our educational institutions, we will fall even further behind in trying to meet this ever-growing demand for computer scientists.

Universities that realized the importance of computer science early on and acted were able to establish departments of the highest quality. Generally, it has been difficult for institutions that started later to catch up. A similar situation is arising on the national scene. Today, there is a global struggle for technological and economic leadership. Computing will play a key role in this struggle. Unless we develop a national policy to support computer science, we will, by our own inaction, allow other countries to establish leads in computing that cannot be overcome.

A national commitment to computer science must be made and sustained. We must develop innovative programs to provide an adequate talent pool of researchers, use existing researchers more effectively, and develop environments to foster research in all aspects of computing. We must increase public awareness of computer science. And we must convince policymakers of the importance of a full-scale commitment to computer science.

CR Categories and Subject Descriptors: A.0 [General Literature]: General - biographies/autobiographies; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems; G.2.2 [Discrete Mathematics]: Graph Theory; K.2 [Computing Milieu]: History of Computing - people

General Terms: Algorithms, Design, Performance, Theory

Additional Key Words and Phrases: John E. Hopcroft, Robert E. Tarjan, Turing Award

Author’s Present Address: John E. Hopcroft, Dept. of Computer Science, Cornell University, Ithaca, NY 14853.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
