Concurrency Theory: A Historical Perspective on Coinduction and Process Calculi
Jos Baeten, Davide Sangiorgi

Readers: Luca Aceto and Robert van Glabbeek

1 INTRODUCTION

Computer science is a very young science that has its origins in the middle of the twentieth century. For this reason, describing its history, or the history of an area of research in computer science, is an activity that receives scant attention. The discipline of computer science does not have a clear demarcation, and even its name is a source of debate: computer science is sometimes called informatics, information science, or information and communication science. A quote widely attributed to Edsger Dijkstra is:

    Computer science is no more about computers than astronomy is about telescopes.

If computer science is not about computers, what is it about? We claim it is about behavior. This behavior can be exhibited by a computer executing a program, but equally well by a human being performing a series of actions. In general, behavior can be shown by any agent; here, an agent is simply an entity that can act. So, computer science is about behavior, about processes, about things happening over time. We are concerned with agents that are executing actions, taking input, emitting output, and communicating with other agents. Thus, an agent has behavior, e.g., the execution of a software system, the actions of a machine or even the actions of a human being. Behavior is the totality of events or actions that an agent can perform, the order in which they can be executed, and possibly other aspects of this execution such as timing, probabilities or evolution. We always describe certain aspects of behavior and disregard others, so we are considering an abstraction or idealization of the ‘real’ behavior. Rather, we can say that we have an observation of behavior, and an action is the chosen unit of observation.

But behavior is also the subject of dynamics and mechanics in physics. How does computer science distinguish itself from these fields? The most important difference is that dynamics and mechanics study behavior using continuous mathematics (differential equations, analysis), whereas computer science studies behavior using discrete mathematics (algebra, logic). So, behavior considered in computer science is discrete: it takes place at separate moments in time, we can observe separate things happening consecutively, and different actions are separated in time. This is why a process is sometimes also called a discrete event system. Discrete behavior is important in systems biology, e.g. when considering metabolic pathways or the behavior of a living cell or components of a cell. We see more and more scientists with a computer science background active in systems biology. Another example of discrete behavior is the working of an organization or part of an organization; this is called workflow.

There are two more ingredients involved before we can get to a definition and a demarcation of computer science: interaction and information. Computer science started off considering a single agent executing actions by itself, but with the birth of concurrency theory, it was realized that interaction is also an essential part of computer science. Agents interact amongst themselves, and interact with the outside world. A computer is usually not a stand-alone machine executing a batch process with limited user interaction, but is connected to other devices and the Internet.
Information is the stuff of informatics, the things that are communicated, processed, transformed, sent around. So now we have all ingredients in place and can formulate the following definition of computer science. It may be obvious that we find informatics a better name than computer science.

    Informatics or computer science is the study of discrete behavior of interacting information processing agents.

Given this definition of informatics, we can explain familiar notions in terms of it. A computer program is a prescription of behavior; an interpreter or compiler can turn this into specific behavior. An algorithm is a description of behavior. Computation refers to sequential behavior, not considering interaction. Communication is interaction with information exchange. Data is a manifestation of information. Intelligence has to do with a comparison between different agents, in particular between a human being and a computer. Finally, concurrency theory is that part of the foundations of computer science where interaction is considered explicitly.

Acknowledgements

The authors are grateful to Luca Aceto and Rob van Glabbeek for comments on an earlier draft of the paper. Sangiorgi’s work has been partially supported by the ANR project 12IS02001 “PACE”.

2 CONCURRENCY

The simplest model of behavior is to see behavior as an input/output function. A value or input is given at the beginning of the process, and at some moment there is a(nother) value as outcome or output. This view is in close agreement with the classic notion of “algorithmic problem”, which is described by giving its legal inputs and, for each legal input, the expected output. The input/output model was used to advantage as the simplest model of the behavior of a computer program from the start of the subject in the middle of the twentieth century. It was instrumental in the development of (finite state) automata theory.

In automata theory, a process is modeled as an automaton. An automaton has a number of states and a number of transitions, going from one state to a(nother) state. A transition denotes the execution of an (elementary) action, the basic unit of behavior. In addition, there is an initial state (sometimes more than one) and a number of final states. A behavior is a run, i.e. a path from the initial state to a final state. An important question from the beginning was when to consider two automata equal, expressed by a notion of equivalence. On automata, the basic notion of equivalence is language equivalence: a behavior is characterized by the set of executions from the initial state to a final state.

Later on, this model was found to be lacking in several situations. Basically, what is missing is the notion of interaction: during the execution from initial state to final state, an agent may interact with another agent. This is needed in order to describe parallel or distributed systems, or so-called reactive systems (see [Harel and Pnueli, 1985]). When dealing with interacting systems, we say we are doing concurrency theory, so concurrency theory is the theory of interacting, parallel and/or distributed systems.
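To make the automaton model above concrete, consider the following small sketch. It is our illustration (in Python; the names Automaton, runs and language_equivalent are ours, not taken from the literature): an automaton is given by labelled transitions, an initial state and a set of final states; a behavior is a run from the initial state to a final state; and two automata are compared by a bounded approximation of language equivalence, i.e. by the sets of action sequences of their completed runs.

    # A finite automaton: labelled transitions, an initial state, final states.
    class Automaton:
        def __init__(self, transitions, initial, finals):
            # transitions: dict mapping a state to a list of (action, next state)
            self.transitions = transitions
            self.initial = initial
            self.finals = set(finals)

        def runs(self, max_length):
            # Enumerate the action sequences of runs, i.e. paths from the
            # initial state to a final state, bounded in length so that the
            # enumeration terminates even on automata with cycles.
            stack = [(self.initial, ())]
            while stack:
                state, word = stack.pop()
                if state in self.finals:
                    yield word
                if len(word) < max_length:
                    for action, target in self.transitions.get(state, []):
                        stack.append((target, word + (action,)))

    def language_equivalent(a, b, max_length=10):
        # Bounded language equivalence: the two automata admit the same set
        # of completed action sequences (up to the given length).
        return set(a.runs(max_length)) == set(b.runs(max_length))

    # Two different automata with the same language {coin coffee}.
    m1 = Automaton({0: [("coin", 1)], 1: [("coffee", 2)]}, 0, [2])
    m2 = Automaton({0: [("coin", 1), ("coin", 3)],
                    1: [("coffee", 2)], 3: [("coffee", 2)]}, 0, [2])
    print(language_equivalent(m1, m2))  # True

Under language equivalence, m1 and m2 are equal even though they pass through different states; for input/output behavior this identification is harmless, but, as discussed below, it breaks down once interaction is taken into account.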
The history of concurrency theory can be traced back to the early sixties of the twentieth century, with the theory of Petri nets, conceived by Petri starting from his thesis in 1962 [Petri, 1962]. For some time, this remained a separate strand in the development of the foundations of computer science, with few connections to other foundational research. On the other hand, the theory of Petri nets is a strong and continuing contribution to the foundations of computer science, and it has yielded many beautiful results. The history of Petri nets has been described elsewhere (see e.g. [Brauer and Reisig, 2006]), so we will not consider it any further at this point.

Around 1970, we can distinguish three main styles of formal reasoning about computer programs, focusing on giving semantics (meaning) to programming languages.

1. Operational semantics. A computer program is modeled as an execution of an abstract machine, an automaton. A state of such a machine is a valuation of variables; a transition between states describes the effect of an elementary program instruction. A pioneer of this field is McCarthy [McCarthy, 1963].

2. Denotational semantics. This is more abstract than operational semantics; computer programs are usually modeled by a function transforming input into output. Pioneers of this line of research are Scott and Strachey [Scott and Strachey, 1971].

3. Axiomatic semantics. Here, emphasis is put on proof methods to prove programs correct. Central notions are program assertions, proof triples consisting of a precondition, a program statement and a postcondition, and invariants. For instance, the triple {x = 0} x := x + 1 {x = 1} asserts that executing x := x + 1 in a state satisfying the precondition x = 0 ends in a state satisfying the postcondition x = 1. Pioneers are Floyd [Floyd, 1967] and Hoare [Hoare, 1969].

Then the question was raised of how to give semantics to programs containing a notion of parallelism. This was found to be difficult using the methods of denotational, operational or axiomatic semantics as they existed at that time, although several attempts were made (later on, it became clear how to extend the different types of semantics to parallel programming; see e.g. [Owicki and Gries, 1976] or [Plotkin, 1976]).

Two paradigm shifts needed to be made before a theory of parallel programs could be developed. First of all, the idea of a behavior as an input/output function needed to be abandoned. A program could still be modeled as an automaton, but the notion of language equivalence was no longer appropriate, because the interaction a process has between input and output influences the outcome, disrupting functional behavior. Secondly, the notion of global variables needed to be overcome. Using global variables, a state of a modeling automaton is given as a valuation of the program variables, that is, a state is determined by the values of the variables. The independent execution of parallel processes makes it difficult or impossible to determine the values of global variables at a given moment. It turned out to be simpler to let each process have its own local variables, and to denote the exchange of information explicitly.

For each of these paradigm shifts, we will consider an important notion exemplifying this development. For the shift away from functional behavior we consider the history of bisimulation; for the shift away from global variables, the history of process calculi.
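Before turning to that history, the first paradigm shift can be made concrete with the classic example of two vending machines having the same language: one offers a choice between coffee and tea after a coin, the other commits to coffee or tea at the moment the coin is inserted. The sketch below is again ours (Python; the naive algorithm and all names are our own, not from the text): it checks bisimilarity, the equivalence whose history is taken up below, by starting from the full relation on states and deleting pairs that violate the transfer condition until a greatest fixed point is reached.

    def bisimilar(transitions, s, t):
        # Naive bisimilarity check on a finite labelled transition system:
        # start from the full relation over all states and repeatedly delete
        # pairs violating the transfer condition; the resulting greatest
        # fixed point is bisimilarity.
        states = set(transitions)
        for successors in transitions.values():
            states |= {target for _, target in successors}
        rel = {(p, q) for p in states for q in states}

        def matched(p, q, r):
            # Every transition of p must be matched by an equally labelled
            # transition of q into an r-related state, and vice versa.
            for a, p2 in transitions.get(p, []):
                if not any(b == a and (p2, q2) in r
                           for b, q2 in transitions.get(q, [])):
                    return False
            for a, q2 in transitions.get(q, []):
                if not any(b == a and (p2, q2) in r
                           for b, p2 in transitions.get(p, [])):
                    return False
            return True

        changed = True
        while changed:
            changed = False
            for pair in list(rel):
                if not matched(pair[0], pair[1], rel):
                    rel.discard(pair)
                    changed = True
        return (s, t) in rel

    # Both machines live in one transition relation, with disjoint state
    # names. They have the same language {coin coffee, coin tea}, but
    # machine q commits to coffee or tea already when the coin is inserted.
    lts = {
        "p0": [("coin", "p1")],
        "p1": [("coffee", "p2"), ("tea", "p3")],
        "q0": [("coin", "q1"), ("coin", "q2")],
        "q1": [("coffee", "q3")],
        "q2": [("tea", "q4")],
    }
    print(bisimilar(lts, "p0", "q0"))  # False

A user interacting with q may insert a coin and then find coffee refused, an outcome that p never exhibits; language equivalence cannot see this difference, while bisimulation does. This is exactly the shift away from functional behavior around which the history of bisimulation, considered next, revolves.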