Algebraic Implementation of Abstract Data Types*
Theoretical Computer Science 20 (1982) 209-263
North-Holland Publishing Company

H. EHRIG, H.-J. KREOWSKI, B. MAHR and P. PADAWITZ
Fachbereich Informatik, TU Berlin, 1000 Berlin 10, Fed. Rep. Germany

Communicated by M. Nivat
Received October 1981

Abstract. Starting with a review of the theory of algebraic specifications in the sense of the ADJ-group, a new theory for algebraic implementations of abstract data types is presented. While the main concepts of this new theory have already been presented at several conferences, this paper provides the full theory of algebraic implementations developed in Berlin, except for complexity considerations, which are given in a separate paper. The new concept of algebraic implementations includes implementations for algorithms in specific programming languages; on the other hand, it also meets the requirements for stepwise refinement of structured programs and software systems as introduced by Dijkstra and Wirth. On the syntactical level an algebraic implementation corresponds to a system of recursive programs, while the semantical level is defined by algebraic constructions called SYNTHESIS, RESTRICTION and IDENTIFICATION. Moreover, the concept allows composition of implementations and a rigorous study of correctness. The main results of the paper are different kinds of correctness criteria, which are applied to a number of illustrative examples including the implementation of sets by hash-tables. Algebraic implementations of larger systems like a histogram or a parts system are given in separate case studies which, however, are not included in this paper.

1. Introduction

The concept of abstract data types has been developed over roughly the last ten years, starting from the debacles of large software systems in the late 1960s. Today this concept seems to be one of the most important features in the development of programming and specification methods (see [44]).
Algebraic specification techniques for the design of software systems were introduced by Zilles [45] and Guttag [28], and the first precise mathematical version was given by the ADJ-group in [1]. Since that time a large number of papers on algebraic specification techniques have appeared, studying specification problems from both the theoretical and the applications point of view.

Much less attention was given in the first years to the problem of implementation of abstract data types, although an algebraic version of the implementation of symbol tables by stacks was already given by Guttag in [28]. Later on, algebraic implementation concepts were given by ADJ [1], Goguen-Nourani [27, 39], Ehrich [11, 12], Wand [41], Lehmann-Smyth [35], and most recently Hupbach [32, 33] and Ganzinger [23]. In Section 8 all these concepts are compared with our new approach, which was first announced in [16] and later presented as conference versions in [18] and [14].

* This paper is a revised and extended version of our ICALP-paper [18] combined with our MFCS-paper [14].
0304-3975/82/0000-0000/$02.75 © 1982 North-Holland

In contrast to most of the other authors we propose a clear distinction between the syntactical and the semantical level, together with corresponding correctness criteria. This distinction is widely accepted for specifications but, up to now, not for implementations. It is, however, a necessary step towards an implementation concept which can be used in a specification language for design and stepwise refinement of software systems. The concept of stepwise refinement has become most important in programming and software engineering since the early papers of Dijkstra [9] and Wirth [43].

The aim of our new implementation concept is two-fold: First of all, it should cover the informal notion of implementations for algorithms in specific programming languages.
Secondly, it should cover the notion of simulation of one data type by another one and, more generally, the notion of stepwise refinement of software systems. In Section 8 we give a short discussion, based on [5] and [34], of how algebraic specification methods can be used for the design of software systems, and we argue that we have good chances of meeting the general part of our second aim. Showing that our notion of implementation covers that of simulation of data types by each other is a central part of this paper, which is included in the motivation part of Sections 3 to 5. More precisely, we state conceptual requirements for implementations of abstract data types in Section 3, and these are shown to be satisfied by our concept in Section 5. Last but not least, we show in the introduction of Section 3 how far our first aim can be satisfied: algorithms can be considered as operations of abstract data types, and programming languages become abstract data types once we have a well-defined denotational or algebraic semantics. Hence the informal notion of implementation becomes a special case of algebraic implementations, provided that we have algebraic specifications for the corresponding abstract data types. First approaches to finding such algebraic specifications are given in [21] for algorithms and in [10] and [37] for programming languages.

The technical part of this paper starts in Section 2, where we give a review of algebraic specifications in the sense of the ADJ-group. We only introduce the basic syntactical and semantical notions which are used in later sections. This means that we need algebras and homomorphisms, but in the main part of the paper we can avoid categorical constructions like adjoint functors, which are still frightening for some computer scientists. In Section 4, however, we show that our semantical constructions actually are adjoint functors.
In Section 3 we discuss the syntactical level of implementations, which is given by a set ΣSORT of "sorts implementing operations" and a set EOP of "operations implementing equations". The equations in EOP are intended to define the new operations in terms of the old ones, while the operations in ΣSORT, like copy operations, establish the connection between old and new sorts. Three different implementations for sets of integers are discussed in detail to show the expressive power of our concept.

The semantical level of implementations is studied in Section 4. The semantical construction is given in three steps, called SYNTHESIS, RESTRICTION and IDENTIFICATION. Correctness of implementations is defined via a completeness and a consistency condition, called OP-completeness and RI-correctness respectively. We show that the data representation part of our implementation concept can be characterized as an algebra of colored trees.

The main results concerning correctness of implementations are given in Section 5. We give proof-theoretical as well as semantical conditions for OP-completeness and RI-correctness. Our characterization result for data equivalence shows that we do not need additional equations to express data equivalence with respect to multiple data representation, as suggested in [29]. Furthermore, we show that taking first RESTRICTION and then IDENTIFICATION in the semantical construction is strictly more general than taking first IDENTIFICATION and then RESTRICTION, as done in [11, 12]. This is based on the fact that RESTRICTION and IDENTIFICATION do not commute, as suggested by the well-known examples in automata theory.

In order to define the composition of implementations in Section 6 we first have to generalize the standard case of Section 3 by hidden components.
But semantics and correctness in Sections 4 and 5 were already formulated in such a way that they apply to the standard as well as the general case. Moreover, we define strong and persistent implementations, which are shown to lead to a strict hierarchy of implementation concepts.

In Section 7 we study the correctness of composition of algebraic implementations. It turns out that the composition is OP-complete but not necessarily RI-correct unless we assume additional consistency conditions or the more restrictive case of persistent implementations. Possible inconsistencies in the composition of implementations are due to the fact that the corresponding equations may be applied in a mixed version. This situation is similar to the scheduling problem for transactions in data base systems, where synchronization techniques have to be used to avoid inconsistencies.

This paper is concluded with Section 8, where we give a summary of our implementation approach, a comparison with other algebraic implementation concepts, and some general ideas towards stepwise refinement of software systems. In particular we point out how far our concepts are already useful, what other features have to be included, and what kind of new results should be shown.

This paper includes a 3-step implementation of sets of integers by strings of integers via hash-tables, where the correctness of the single steps and of the composition is shown in Sections 5 and 7 respectively. That these techniques can also be used for correct specification and implementation of larger systems is demonstrated in two case studies, a histogram in [19] and a parts system in [13].

2. Review of algebraic specifications

The foundations for a strict mathematical theory of algebraic specifications were given by the ADJ-group in [1], while first approaches to using algebraic specifications for the design of software systems had already been given by Zilles [45] and Guttag [28].
The main idea of the ADJ-approach is to give a syntactic description of an abstract data type using algebraic specifications. The semantics of the specification is given by the corresponding quotient term algebra (or any isomorphic algebra), which is the initial algebra in the category of all algebras satisfying the given specification. This is the reason for referring to the ADJ-approach as the "initial algebra approach", while the approach of some other authors, initiated by [41], is called the "final algebra approach". We will follow the ADJ-approach as given in [1] and continued in [15].
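
To make the notion of a quotient term algebra more concrete, the following minimal Python sketch may help. It uses a hypothetical specification of finite sets of integers that is not taken from this paper: one constant `empty`, one operation `insert(n, s)`, and the two equations insert(n, insert(n, s)) = insert(n, s) and insert(n, insert(m, s)) = insert(m, insert(n, s)). Terms are nested tuples, and each congruence class is represented by one chosen normal form (sorted, without duplicates). The names `EMPTY`, `insert`, `elements`, `normal_form` and `equivalent` are assumptions made for this illustration only.

```python
# Hypothetical signature (an assumption, not from the paper):
#   empty  : -> set
#   insert : int set -> set
# Terms are nested tuples, so building a term never evaluates anything.
EMPTY = ("empty",)


def insert(n, s):
    """Build the term insert(n, s)."""
    return ("insert", n, s)


def elements(term):
    """List the integers inserted in a term, outermost insert first."""
    out = []
    while term != EMPTY:
        _, n, term = term
        out.append(n)
    return out


def normal_form(term):
    """Choose one representative per congruence class.

    Under the two assumed equations (idempotence and commutativity of
    insert), two terms are congruent exactly when they insert the same
    set of integers, so a sorted, duplicate-free term is a valid choice.
    """
    nf = EMPTY
    for n in sorted(set(elements(term)), reverse=True):
        nf = insert(n, nf)  # smallest element ends up outermost
    return nf


def equivalent(t1, t2):
    """Equality in the quotient term algebra: compare representatives."""
    return normal_form(t1) == normal_form(t2)
```

In this sketch, `insert(1, insert(2, insert(1, EMPTY)))` and `insert(2, insert(1, EMPTY))` are different terms but denote the same element of the quotient term algebra, so `equivalent` returns `True` for them. Any algebra isomorphic to the set of normal forms, for instance ordinary finite sets of integers, would serve equally well as the semantics of this assumed specification.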