Theoretical Computer Science 20 (1982) 209-263
North-Holland Publishing Company

ALGEBRAIC IMPLEMENTATION OF ABSTRACT DATA TYPES*

H. EHRIG, H.-J. KREOWSKI, B. MAHR and P. PADAWITZ
Fachbereich Informatik, TU Berlin, 1000 Berlin 10, Fed. Rep. Germany

Communicated by M. Nivat Received October 1981

Abstract. Starting with a review of the theory of algebraic specifications in the sense of the ADJ-group, a new theory for algebraic implementations of abstract data types is presented. While the main concepts of this new theory were already given at several conferences, this paper provides the full theory of algebraic implementations developed in Berlin, except for complexity considerations, which are given in a separate paper. The new concept of algebraic implementations includes implementations for algorithms in specific programming languages, and on the other hand it also meets the requirements for stepwise refinement of structured programs and software systems as introduced by Dijkstra and Wirth. On the syntactical level an algebraic implementation corresponds to a system of recursive programs, while the semantical level is defined by algebraic constructions called SYNTHESIS, RESTRICTION and IDENTIFICATION. Moreover, the concept allows composition of implementations and a rigorous study of correctness. The main results of the paper are different kinds of correctness criteria, which are applied to a number of illustrating examples including the implementation of sets by hash-tables. Algebraic implementations of larger systems like a histogram or a parts system are given in separate case studies which, however, are not included in this paper.

1. Introduction

The concept of abstract data types has been developed over the last ten years or so, starting with the debacles of large software systems in the late 60's. Today this concept seems to be one of the most important features in the development of programming and specification methods (see [44]). Algebraic specification techniques for the design of software systems were introduced by Zilles [45] and Guttag [28], and the first precise mathematical version was given by the ADJ-group in [1]. Since that time a considerable number of papers on algebraic specification techniques have appeared, studying specification problems from the theoretical and the applications point of view. Much less attention was given in the first years to the problem of implementation of abstract data types, although an algebraic version of the implementation of symbol tables by stacks was already given by Guttag in [28]. Later on, algebraic implementation concepts were given by ADJ [1], Goguen-Nourani [27, 39], Ehrich [11, 12], Wand [41], Lehmann-Smyth [35], and most recently Hupbach [32, 33] and Ganzinger [23]. In Section 8 all these concepts are compared with our new approach, which was first announced in [16] and later presented as conference versions in [18] and [14].

* This paper is a revised and extended version of our ICALP-paper [18] combined with our MFCS-paper [14].

0304-3975/82/0000-0000/$02.75 © 1982 North-Holland

In contrast to most of the other authors we propose a clear distinction between the syntactical and the semantical level and corresponding correctness criteria. This distinction is widely accepted for specifications but, up to now, not for implementations. But it is a necessary step towards an implementation concept which can be used in a specification language for design and stepwise refinement of software systems. The concept of stepwise refinement has become most important in programming and software engineering since the early papers of Dijkstra [9] and Wirth [43].

The aim of our new implementation concept is two-fold: First of all it should cover the informal notion of implementations for algorithms in specific programming languages. Secondly it should cover the notion of simulation of one data type by another one and, more generally, the notion of stepwise refinement of software systems. In Section 8 we give a short discussion, based on [5] and [34], of how algebraic specification methods can be used for the design of software systems, and argue that we have good chances to meet the general part of our second aim. Showing that our notion of implementation covers that of simulation of data types by each other is a central part of this paper, which is included in the motivation part of Sections 3 up to 5. Actually we state conceptual requirements for implementations of abstract data types in Section 3 which are shown to be satisfied for our concept in Section 5.
Last but not least we show in the introduction of Section 3 how far our first aim can be satisfied: Algorithms can be considered as operations of abstract data types, and programming languages become abstract data types once we have a well-defined denotational or algebraic semantics. Hence the informal notion of implementation becomes a special case of algebraic implementations provided that we have algebraic specifications for the corresponding abstract data types. First approaches to find such algebraic specifications are given in [21] for algorithms and in [10] and [37] for programming languages.

The technical part of this paper starts in Section 2, where we give a review of algebraic specifications in the sense of the ADJ-group. We only introduce the basic syntactical and semantical notions which are used in later sections. This means we need algebras and homomorphisms, but in the main part of the paper we can avoid categorical constructions like adjoint functors, which are still frightening for some computer scientists. In Section 4, however, we show that our semantical constructions actually are adjoint functors.

In Section 3 we discuss the syntactical level of implementations, which is given by a set $\Sigma$SORT of "sorts implementing operations" and a set EOP of "operations implementing equations". The equations in EOP are intended to define the new operations in terms of the old ones, while the operations in $\Sigma$SORT, like copy operations, establish the connection between old and new sorts. Three different implementations for sets of integers are discussed in detail to show the expressive power of our concept.

The semantical level of implementations is studied in Section 4. The semantical construction is given in three steps, called SYNTHESIS, RESTRICTION and IDENTIFICATION. Correctness of implementations is defined via a completeness and a consistency condition, called OP-completeness and RI-correctness respectively.
We show that the data representation part of our implementation concept can be characterized as an algebra of colored trees.

The main results concerning correctness of implementations are given in Section 5. We give proof-theoretical as well as semantical conditions for OP-completeness and RI-correctness. Our characterization result for data equivalence shows that we do not need additional equations to express data equivalence with respect to multiple data representation as suggested in [29]. Furthermore, we show that the concept of taking first RESTRICTION and then IDENTIFICATION in the semantical construction is strictly more general than taking first IDENTIFICATION and then RESTRICTION as done in [11, 12]. This is based on the fact that RESTRICTION and IDENTIFICATION do not commute, as suggested by the well-known examples in automata theory.

In order to define the composition of implementations in Section 6 we first have to generalize the standard case of Section 3 by hidden components. But semantics and correctness in Sections 4 and 5 were already formulated in such a way that they apply to the standard as well as the general case. Moreover we define strong and persistent implementations, which are shown to lead to a strict hierarchy of implementation concepts.

In Section 7 we study the correctness of composition of algebraic implementations. It turns out that the composition is OP-complete but not necessarily RI-correct unless we assume additional consistency conditions or the more restrictive case of persistent implementations. Possible inconsistencies in the composition of implementations are due to the fact that the corresponding equations may be applied in a mixed version. This situation is similar to the scheduling problem for transactions in data base systems, where synchronization techniques have to be used to avoid inconsistencies.
This paper is concluded with Section 8, where we give a summary of our implementation approach, a comparison with other algebraic implementation concepts, and some general ideas towards stepwise refinement of software systems. In particular we point out how far our concepts are already useful, what other features have to be included and what kind of new results should be shown.

This paper includes a 3-step implementation of sets of integers by strings of integers via hash-tables, where the correctness of the single steps and of the composition is shown in Sections 5 and 7 respectively. That these techniques can also be used for correct specification and implementation of larger systems is demonstrated in two case studies, a histogram in [19] and a parts system in [13].

2. Review of algebraic specifications

The foundations for a strict mathematical theory of algebraic specifications were given by the ADJ-group in [1], while first approaches to using algebraic specifications for the design of software systems had already been given by Zilles [45] and Guttag [28]. The main idea of the ADJ-approach is to give a syntactic description of an abstract data type using algebraic specifications. The semantics of the specification is given by the corresponding quotient algebra (or any isomorphic algebra), which is the initial algebra in the category of all algebras satisfying the given specification. This is the reason for referring to the ADJ-approach as the "initial algebra approach", while the approach of some other authors, initiated by [41], is called the "final algebra approach". We will follow the ADJ-approach as given in [1] and continued in [15].

An algebraic specification, specification for short, $SPEC = \langle S, \Sigma, E \rangle$ consists of a set $S$ of sorts, a family $\Sigma = (\Sigma_{w,s})_{w \in S^*, s \in S}$ of operation (symbol)s and a family $E = (E_s)_{s \in S}$ of equations.

The sorts $s \in S$ denote data domains. The operations $\sigma \in \Sigma_{w,s}$, also written $\sigma: w \to s \in \Sigma$, are declarations with name $\sigma$, domain $w = s_1 \ldots s_n$ ($s_i \in S$, $i = 1, \ldots, n$) and range $s \in S$. In the special case $w = \lambda$ (empty word) $\sigma$ is called 0-ary or constant. The equations $e = (L, R) \in E_s$, more intuitively written $L = R$, are pairs of $\Sigma$-terms of sort $s$ with variables. The $\Sigma$-terms of sort $s$ with variables of a given family $X = (X_s)_{s \in S}$ form sets $T_\Sigma(X)_s$, which (simultaneously for all $s \in S$) are recursively defined by

(i) $\sigma \in T_\Sigma(X)_s$ for all $\sigma \in \Sigma_{\lambda,s}$,
(ii) $x \in T_\Sigma(X)_s$ for all $x \in X_s$,
(iii) $\sigma(t_1, \ldots, t_n) \in T_\Sigma(X)_s$ for all $\sigma \in \Sigma_{s_1 \ldots s_n, s}$, $t_i \in T_\Sigma(X)_{s_i}$, $i = 1, \ldots, n$.

In the notation of examples, specifications are in bold italics and sorts in normal italics; operations and equations are not presented as sets but are listed behind the corresponding key words "sorts", "opns", and "eqns" respectively. The key words are omitted if the corresponding sets are empty.
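The recursive definition of well-sorted terms can be made concrete with a small sketch. The encoding below is an assumption made only for illustration and is not taken from the paper: a signature is a Python dict from operation names to (domain, range) pairs, and the names `Var`, `App`, `NAT_SIG` and `sort_of` are hypothetical.

```python
from dataclasses import dataclass

# Assumed encoding: each operation name maps to (domain, range), where the
# domain is a tuple of sort names (the empty tuple for constants).
NAT_SIG = {"0": ((), "nat"), "SUCC": (("nat",), "nat")}


@dataclass(frozen=True)
class Var:
    name: str
    sort: str


@dataclass(frozen=True)
class App:
    op: str
    args: tuple = ()


def sort_of(term, sig):
    """Return the sort of a term, or raise TypeError if it is not well-sorted."""
    if isinstance(term, Var):                  # clause (ii): variables
        return term.sort
    domain, rng = sig[term.op]                 # KeyError for undeclared operations
    arg_sorts = tuple(sort_of(a, sig) for a in term.args)
    if arg_sorts != domain:                    # clauses (i) and (iii)
        raise TypeError(f"{term.op} expects {domain}, got {arg_sorts}")
    return rng


two = App("SUCC", (App("SUCC", (App("0"),)),))
print(sort_of(two, NAT_SIG))                               # nat
print(sort_of(App("SUCC", (Var("x", "nat"),)), NAT_SIG))   # nat
```

Constants are treated as applications with an empty argument tuple, so clauses (i) and (iii) share one code path.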

2.1. Examples. (1) The basic specification of natural numbers is given by

nat =
    sorts: nat
    opns:  0: → nat
           SUCC: nat → nat

All nat-terms (without variables) are of the form $SUCC^n(0)$ for $n \ge 0$. This basic specification can be used to specify additional operations like ADD, MULT: nat nat → nat (see [1] or [15]) or the Ackermann function

ackermann = nat +
    opns:  A: nat nat → nat
    eqns:  A(0, x) = SUCC(x)
           A(SUCC(x), 0) = A(x, SUCC(0))
           A(SUCC(x1), SUCC(x2)) = A(x1, A(SUCC(x1), x2))

where the notation nat + ... means the (disjoint) union of the corresponding sorts, operations and equations respectively.

(2) As basic specification for boolean values we will take

bool =
    sorts: bool
    opns:  TRUE, FALSE: → bool
           NON: bool → bool
           AND, OR: bool bool → bool
           IF-THEN-ELSE-BOOL: bool bool bool → bool
    eqns:  NON(TRUE) = FALSE
           NON(FALSE) = TRUE
           TRUE AND b = b
           FALSE AND b = FALSE
           b1 OR b2 = NON(NON b1 AND NON b2)
           IF TRUE THEN b1 ELSE b2 BOOL = b1
           IF FALSE THEN b1 ELSE b2 BOOL = b2

where AND, OR and IF-THEN-ELSE-BOOL are used in infix notation.

(3) A specification for natural numbers with equality is given by

nat1 = nat + bool +
    opns:  EQ: nat nat → bool
    eqns:  EQ(0, 0) = TRUE
           EQ(0, SUCC(n)) = FALSE
           EQ(SUCC(n), 0) = FALSE
           EQ(SUCC(n1), SUCC(n2)) = EQ(n1, n2)

The semantics of a specification $SPEC = \langle S, \Sigma, E \rangle$ is given by a (many-sorted) $\Sigma$-algebra with data domains and operations corresponding to $S$ and $\Sigma$, and which satisfies the equations $E$. More precisely, the semantics of SPEC is the initial algebra $T_{SPEC}$, which is uniquely determined up to isomorphism and hence representation invariant as required for abstract data types. A canonical construction for $T_{SPEC}$ is the quotient term algebra. All algebras isomorphic to $T_{SPEC}$ for some specification SPEC will be considered as abstract data types. For a more detailed motivation of abstract data types see [1].

A $\Sigma$-algebra $A$ consists of a family of data domains $(A_s)_{s \in S}$, distinguished elements $\sigma_A \in A_s$ for all $\sigma \in \Sigma_{\lambda,s}$ and operations $\sigma_A: A_{s_1} \times \cdots \times A_{s_n} \to A_s$ for all $\sigma \in \Sigma_{s_1 \ldots s_n, s}$. A $\Sigma$-homomorphism $h: A \to A'$ for $\Sigma$-algebras $A$, $A'$ consists of a family of functions $(h_s: A_s \to A'_s)_{s \in S}$ which preserve the operations, i.e.

$h_s(\sigma_A(a_1, \ldots, a_n)) = \sigma_{A'}(h_{s_1}(a_1), \ldots, h_{s_n}(a_n))$

for all $\sigma \in \Sigma_{s_1 \ldots s_n, s}$, $a_i \in A_{s_i}$, $i = 1, \ldots, n$. This includes the special case $n = 0$ requiring $h_s(\sigma_A) = \sigma_{A'}$ for all $\sigma \in \Sigma_{\lambda,s}$. A $\Sigma$-isomorphism is a bijective $\Sigma$-homomorphism.

Algebras $A$, $A'$ are isomorphic, written $A \cong A'$, if there is a $\Sigma$-isomorphism $h: A \to A'$ (and hence also an inverse $\Sigma$-isomorphism $h^{-1}: A' \to A$). The $\Sigma$-term algebra $T_\Sigma(X)$ (with variables in $X$) is given by the family $(T_\Sigma(X)_s)_{s \in S}$ of $\Sigma$-terms of sort $s$, distinguished elements $\sigma_T := \sigma$ for all $\sigma \in \Sigma_{\lambda,s}$ and operations $\sigma_T$ for all $\sigma \in \Sigma_{s_1 \ldots s_n, s}$ defined by

$\sigma_T(t_1, \ldots, t_n) := \sigma(t_1, \ldots, t_n)$.

For each assignment $h: X \to A$ of variables $X$ in a $\Sigma$-algebra $A$ there is a unique $\Sigma$-homomorphism $h^*: T_\Sigma(X) \to A$, called term evaluation, extending $h$. Especially for $X = \emptyset$ we write $eval: T_\Sigma \to A$. We say that $A$ satisfies $E$ if for all $e = (L, R) \in E_s$, $s \in S$, and all assignments $h: X \to A$ we have $h^*_s(L) = h^*_s(R)$ in $A$. $\Sigma$-algebras satisfying $E$ are called $(\Sigma, E)$- or SPEC-algebras.

Given two specifications $SPEC = \langle S, \Sigma, E \rangle$ and SPEC' with $SPEC \subseteq SPEC'$ componentwise, the SPEC-part $A_{SPEC}$ of a SPEC'-algebra $A$ is defined by $(A_{SPEC})_s = A_s$ for all $s \in S$ and $\sigma_{A_{SPEC}} = \sigma_A$ for all $\sigma \in \Sigma$. (In other words, $A_{SPEC}$ is the image $V(A)$ of the forgetful functor $V$ from the category of SPEC'-algebras to the category of SPEC-algebras.)

For an explicit construction of the quotient term algebra $T_{SPEC}$ let us consider the congruence $\equiv_E$, which is the congruence closure of the relation that consists of all pairs $(h^*_s(L), h^*_s(R))$ with $s \in S$, $(L, R) \in E_s$ and $h: X \to T_\Sigma$ (where $T_\Sigma := T_\Sigma(\emptyset)$). More precisely we will sometimes write $\equiv_{SPEC}$ instead of $\equiv_E$ to indicate that the congruence closure has to be taken with respect to $\Sigma$-operations. For each specification $SPEC = \langle S, \Sigma, E \rangle$ the corresponding quotient term algebra is given by the quotient

$T_{SPEC} := T_\Sigma / \equiv_E$.

The quotient term algebra is an initial SPEC-algebra, i.e. for all other SPEC-algebras $A$ there is exactly one $\Sigma$-homomorphism $h: T_{SPEC} \to A$. This is explicitly given by $h([t]) = eval(t)$ for all $\Sigma$-terms $t$, where eval is the term evaluation defined above.

This initiality property, which characterizes $T_{SPEC}$ up to isomorphism, allows us to define $T_{SPEC}$ as the semantics of the specification SPEC. A precise semantics for a specification is necessary to be able to speak about the correctness of a specification. In general a specification is correct with respect to a model $M$ if the semantics of the specification is isomorphic to $M$. In order to allow hidden functions in the specification which are not part of the algebra $A$, our definition of correctness is a little more subtle: A specification SPEC' is correct with respect to a SPEC-algebra $A$ if $SPEC = \langle S, \Sigma, E \rangle$ is a subspecification of SPEC' (i.e. $SPEC \subseteq SPEC'$ componentwise) and the SPEC-part of $T_{SPEC'}$ is isomorphic to $A$.

2.2. Examples. (1) The specification nat of Example 2.1 is correct with respect to the natural numbers $\mathbb{N}_0$ with constant 0 and SUCC being the usual successor function.
(2) The specification ackermann is correct with respect to the Ackermann function on the natural numbers.
(3) The specification bool is correct with respect to $A = (\{T, F\}, T, F, \neg, \wedge, \vee, ite)$ where $TRUE_A = T$ and $FALSE_A = F$ are the constants and $NON_A = \neg$, $AND_A = \wedge$, $OR_A = \vee$, $IF\text{-}THEN\text{-}ELSE\text{-}BOOL_A = ite$ are the usual boolean operations.
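One ingredient of Example (3) can be checked mechanically: since the carrier {T, F} is finite, it is possible to test by exhaustive assignment that the usual boolean operations satisfy every equation of bool. The sketch below does this, encoding T and F as Python booleans (an assumption) and using hypothetical names `EQUATIONS` and `violated`. Note that satisfaction of the equations is only necessary for correctness, not sufficient, because correctness also requires the initial algebra to be isomorphic to A.

```python
from itertools import product

# The boolean operations of the algebra A, with T and F encoded as True and False.
NON = lambda b: not b
AND = lambda a, b: a and b
OR = lambda a, b: a or b
ITE = lambda c, a, b: a if c else b

# Each equation of bool as a pair (left side, right side) of functions of b, b1, b2.
EQUATIONS = {
    "NON(TRUE) = FALSE": (lambda b, b1, b2: NON(True), lambda b, b1, b2: False),
    "NON(FALSE) = TRUE": (lambda b, b1, b2: NON(False), lambda b, b1, b2: True),
    "TRUE AND b = b": (lambda b, b1, b2: AND(True, b), lambda b, b1, b2: b),
    "FALSE AND b = FALSE": (lambda b, b1, b2: AND(False, b), lambda b, b1, b2: False),
    "b1 OR b2 = NON(NON b1 AND NON b2)": (
        lambda b, b1, b2: OR(b1, b2),
        lambda b, b1, b2: NON(AND(NON(b1), NON(b2))),
    ),
    "IF TRUE THEN b1 ELSE b2 = b1": (lambda b, b1, b2: ITE(True, b1, b2), lambda b, b1, b2: b1),
    "IF FALSE THEN b1 ELSE b2 = b2": (lambda b, b1, b2: ITE(False, b1, b2), lambda b, b1, b2: b2),
}


def violated(equations):
    """Names of equations that fail for at least one assignment of (b, b1, b2)."""
    return [
        name
        for name, (lhs, rhs) in equations.items()
        if any(lhs(*v) != rhs(*v) for v in product([True, False], repeat=3))
    ]


print(violated(EQUATIONS))   # []
```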

Remark. In order to get a correct specification for the natural numbers with addition ADD: nat nat → nat we need the hidden functions 0 and SUCC (if they are not assumed to be in the signature of the natural numbers already). Moreover, if we have the natural numbers with 0, SUCC and multiplication MULT we need ADD as hidden function. The correctness proofs for 2.2 are standard, see [1].

The use of subspecifications is also essential for stepwise refinement of specifications (see [15]) and modular structuring. The essential construction is the combination: $COMB = SPEC + (S0, \Sigma 0, E0)$ is called combination if $SPEC = \langle S, \Sigma, E \rangle$ and $\langle S + S0, \Sigma + \Sigma 0, E + E0 \rangle$ are specifications, where the latter one defines the semantics $T_{COMB}$ of the combination (+ is used for (disjoint) union of sets resp. many-sorted sets). Examples were given already in 2.1.1 and 2.1.3. Note that our combination combines the concepts combine and enrich in the algebraic specification language CLEAR (see [5]), where combine assumes that also $(S0, \Sigma 0, E0)$ is a specification (which is not required for our combinations) and enrich may have $S0 = \emptyset$. To simplify the notation, especially for iterated combinations, we sometimes write $\Sigma(COMB)$ for the operations $\Sigma + \Sigma 0$ and $E(COMB)$ for the equations $E + E0$ of COMB.

Note that for each combination $COMB = SPEC + (S0, \Sigma 0, E0)$ there is a unique $\Sigma$-homomorphism $h: T_{SPEC} \to (T_{COMB})_{SPEC}$ defined by $h([t]_E) = [t]_{E+E0}$ for all terms $t \in T_\Sigma$. If $h$ is surjective, we say that COMB is complete(ly specified) with respect to SPEC, because each $\Sigma + \Sigma 0$-term of some sort in $S$ is equivalent to a $\Sigma$-term via $E + E0$. For injective $h$ we say that COMB is consistent(ly specified) with respect to SPEC, because $E + E0$-equivalence of $\Sigma$-terms implies $E$-equivalence. If $h$ is bijective, we say that COMB is an extension of SPEC, i.e.

$T_{SPEC} \cong (T_{COMB})_{SPEC}$.

In other words the data type $T_{SPEC}$ is protected in $T_{COMB}$. If COMB contains no additional sorts, i.e. $S0 = \emptyset$, the extension is called enrichment. In Example 2.1 ackermann is an enrichment of nat, and nat1 is an extension of nat and bool respectively. Immediately from the definitions we have the following property:

2.3. Fact. Given a combination $COMB = SPEC + (S0, \Sigma 0, E0)$, then COMB is an extension of SPEC if and only if COMB is completely and consistently specified with respect to SPEC.

Remark. In general it is undecidable whether COMB is completely resp. consistently specified with respect to SPEC. But there are a number of sufficient conditions in the literature for each of the properties to be satisfied and hence also for the extension and enrichment property (see [15] with correction in [17, 40] and other papers summarized in [31]).

Another very important special case of combinations are parameterized specifications (see [2, 3]), because most data types coming up in software engineering include parameter types. Even basic specifications like those for strings, sets, stacks, queues or arrays, which are based on some parameter type data, must be treated as parameterized specifications string(data), set(data) etc. if we want to avoid giving separate specifications string(int), string(nat1), set(int), set(nat1), ... for all meaningful actual parameters int, nat1, .... The standard parameter passing concept in [3] defines how to replace the formal parameter data in set(data) by an actual parameter like int or nat1, leading to the "value specifications" set(int) or set(nat1), which are actualized parameterized specifications. Since the theory of parameterized specifications requires much more technical and conceptual detail, we restrict ourselves to actualized parameterized specifications in this paper, which are again combinations in the sense defined above.

2.4. Example. (1) The actualized parameterized specification for strings with actual parameter nat1 is given by:

string(nat1) = nat1 +
    sorts: string
    opns:  A: → string
           ADD: nat string → string

The essential part of the semantics is given by $(T_{string(nat1)})_{string} \cong \mathbb{N}_0^*$, the free monoid built up from the natural numbers.

(2) The actualized parameterized specification for sets with actual parameter nat1 is given by:

set(nat1) = nat1 +
    sorts: set
    opns:  CREATE: → set
           INSERT: nat set → set
           DELETE: nat set → set
           MEMBER: nat set → bool
           EMPTY: set → bool
           IF-THEN-ELSE-SET: bool set set → set
    eqns:  INSERT(n1, INSERT(n2, s)) = IF EQ(n1, n2) THEN INSERT(n1, s) ELSE INSERT(n2, INSERT(n1, s)) SET
           DELETE(n, CREATE) = CREATE
           DELETE(n1, INSERT(n2, s)) = IF EQ(n1, n2) THEN DELETE(n1, s) ELSE INSERT(n2, DELETE(n1, s)) SET
           MEMBER(n, CREATE) = FALSE
           MEMBER(n1, INSERT(n2, s)) = IF EQ(n1, n2) THEN TRUE ELSE MEMBER(n1, s) BOOL
           EMPTY(CREATE) = TRUE
           EMPTY(INSERT(n, s)) = FALSE
           IF TRUE THEN s1 ELSE s2 SET = s1
           IF FALSE THEN s1 ELSE s2 SET = s2

The essential part of the semantics is given by

$(T_{set(nat1)})_{set} \cong \mathcal{P}_{fin}(\mathbb{N}_0)$,

i.e. the set of all finite subsets of natural numbers. The operation CREATE creates the empty set, INSERT inserts a new natural number into a given finite set of natural numbers, DELETE deletes an element, MEMBER checks whether a given number belongs to a given set, EMPTY checks whether the given set is empty and IF-THEN-ELSE-SET is the usual if-then-else-operation on the sort set. For a correctness proof for this specification we refer to [2] where, however, the parameterized case is treated. But correctness of the parameterized specification and that of the actual parameter implies correctness of the value specification (see [3] and subsequent papers).

3. Syntactical level of implementations

In the previous section we have seen how to specify abstract data types in the algebraic framework. Now we want to consider the problem of implementations. The usual informal meaning of implementation is to give a program in some specific programming language which is intended to simulate a given algorithm or data type. But if we are interested in properties of implementations like correctness and complexity, we have to be much more precise. First of all we need precise definitions for syntax and semantics of the given algorithm or data type and of our specific programming language. Since algorithms can be regarded as special cases of data types (see the specification of Turing machines and their data type semantics in [21]), which were formally introduced in the previous section, it remains to consider suitable formulations of programming languages. Suitable for us means a general mathematical framework which allows (in principle) the formulation of all the common programming languages (syntax and semantics) and which is compatible with our formulation of data types. Since there is already an algebraic specification for the functional programming language LISP (see [10]) and since Goguen and Parsaye-Ghomi have shown in [26] how to view the imperative programming language MODEST as an abstract data type (see also [37, 38]), we feel free to assume that our specific programming language is defined by an algebraic specification. Hence we have the same level of description for source and target of implementation. This is not only nice from the mathematical point of view but also allows stepwise refinement of implementations, which is most desirable from the software engineering point of view (see [9, 43]).

According to the above motivation we can assume to have abstract data types ADT1 and ADT0 with algebraic specifications SPEC1 and SPEC0 respectively. It remains to give a meaning for the phrase "ADT1 implements ADT0". Before we give the formal definitions let us state the following informal conceptual requirements:

3.1. Concept (Conceptual requirements for algebraic implementations).
(1) Syntactical level: The implementation of an abstract data type ADT0 by an abstract data type ADT1 should be given on a syntactical level as an implementation of the corresponding specifications SPEC0 and SPEC1 respectively,

$SPEC1 \overset{IMPL}{\Longrightarrow} SPEC0$

where SPEC0-sorts and -operations are synthesized by those of SPEC1.
(2) Semantical level: There should be a construction on the semantical level transforming ADT1 into ADT0 which allows us to represent data and operations in ADT0 and to simulate compound operations in ADT0 by corresponding data and operations synthesized from those in ADT1

$ADT1 \overset{SEM_{IMPL}}{\Longrightarrow} ADT0$.

In addition, we postulate the following semantical requirements in order to achieve correct implementations:
(3) Data representation: Data of ADT0 should be represented by data synthesized from ADT1. Each one may have different representations. In any case different data are to be represented differently. However, there may be synthesized data which do not correspond to data of ADT0.
(4) Simulation of compound operations: Each operation and compound operation in ADT0 should be simulated by operations synthesized from those in ADT1. The computation of operation calls in ADT0 should lead to the same results (up to data representation) as the evaluation of the simulating operations, i.e. we should have an abstraction or representation function.

(5) Parameter protection: There may be a common parameter part ADT of ADT1 and ADTO which should be protected by the semantical construction.

Remarks. (1) Since the syntactical and the semantical level are given by constructions leading from SPEC1 resp. ADT1 to SPEC0 resp. ADT0, it is natural to call SPEC1 the source and SPEC0 the target specification. This should not be confused with the terminology in compiler construction, where the source language is the more abstract and the target language the more concrete one. In our implementation concept it is just the other way round because we do not want to change the direction in passing from semantical to diagrammatic notation.
(2) The common parameter ADT of ADT1 and ADT0 is considered to be an actual rather than a formal parameter. The latter would mean that we have to treat the corresponding specifications SPEC1 and SPEC0 as parameterized specifications with common subspecification SPEC for ADT (see [2, 3]). Although this is highly desirable from the applications point of view, we have restricted ourselves to "actualized parameterized specifications" in order to keep the conceptual and mathematical overhead as small as possible. (For first approaches to implementations of parameterized types we refer to [23, 32].)

The problem is now to find an implementation concept which satisfies these requirements. In general the semantical requirements are undecidable unless the syntax for implementations is restricted significantly. Undecidability, however, is a well-known phenomenon in formal language theory and programming which cannot be avoided even in the context-free case. Hence we cannot expect to avoid it for the implementation of abstract data types. But in Section 5 we will give sufficient conditions for the semantical requirements and hence for the correctness of implementations. A basic idea of our approach is to have a clear distinction between the syntactical and the semantical level and semantical requirements which have to be verified to show the correctness of implementations. In this section we will only consider "weak implementations" corresponding to the syntactical level of implementations, while semantics and correctness will be studied in the following sections.

But what are the reasons that we are so much interested in having a syntactical level for implementations at all? It should be clear from the introduction of this section that algebraic implementations are intended to be used for the design and stepwise refinement of software systems. Such a design requires a suitable specification language including the concept of implementations. Hence our syntactical level of implementations is intended to be used in the syntax part of a suitable algebraic specification language. The algebraic specification language CLEAR (see [5, 7]) already includes the derive concept corresponding to the implementation concept in [1]. The derive concept, however, seems to be too restrictive for most of the applications because it only allows composition of operations but no proper recursive definitions.

Let us motivate the syntactical level of our implementation concept with the following example:

3.2. Example (Implementation of powerset operations). We have introduced a specification set(nat1) for sets of natural numbers already in Example 2.4.2. The essential operations of set(nat1) are INSERT: nat set → set and DELETE: nat set → set, corresponding to a left action of natural numbers on finite subsets of natural numbers. The typical powerset operations, however, are "union", "intersection", and "difference", which are binary internal operations on powersets. Of course, we could add these operations (and suitable equations) as enrichment operations to the specification set(nat1). But for a number of applications the operations INSERT and DELETE are not necessary, so that we want to consider the following specification pset(nat1) for the typical powerset operations avoiding INSERT and DELETE. For simplicity we also avoid the operations "intersection" and "difference".

pset(nat1) = nat1 +
    sorts: pset
    opns:  ∅: → pset
           { }: nat → pset
           ∪: pset pset → pset
           ∈: nat pset → bool
    eqns:  M ∪ ∅ = M
           M ∪ M = M
           M ∪ M' = M' ∪ M
           (M ∪ M') ∪ M'' = M ∪ (M' ∪ M'')
           n ∈ ∅ = FALSE
           n ∈ {m} = EQ(m, n)
           n ∈ (M ∪ M') = (n ∈ M) OR (n ∈ M')

Note that the essential part of the semantics

$(T_{pset(nat1)})_{pset} \cong \mathcal{P}_{fin}(\mathbb{N}_0)$

is the same as for set(nat1) (see Example 2.4(2)), but now we have different operations, except for "∅" and "∈", which correspond exactly to CREATE and MEMBER in set(nat1). The operation "{ }" (singleton) creates the set {n} for each natural number $n \in \mathbb{N}_0$ and "∪" is the union of sets. Although we should give a correctness proof for this semantical interpretation of the specification pset(nat1), we avoid it because we want to concentrate on the implementation aspects. Our intention was to implement the powerset operations using the set(nat1)-operations. Due to our conceptual requirement 3.1.1 we will give an implementation of SPEC0 = pset(nat1) by SPEC1 = set(nat1), using already the syntactical scheme of 3.4, which for our example is explained below:

set(natl) impl pset(natl) by sorts impI opns c : set~ pset opns impI eqns 0= c(CREATE) {n}= c(INSERT(n, CREATE)) c(s)u c(CREATE) = c(s) c(s) u c(INSERT(n, s')) = c(INSERT(n, s)) u c(s') n E c(s) = c(MEMBER(n, s)) The key words "sorts impl opns" and "opns impl eqns" are standing for "sorts implementing operations" and "operations implementing equations" respectively. The idea of the sorts implementing operations is to generate data of sort pset from those of sort set. In our case we have a simple copy operation c :set ~ pset which makes sure that each date of pset is represented by a copy of a date in set. This is the simplest case of data representation as required in 3.1.3 where synthesis consists of copy only. The idea of the operations implementing equations is to define the new operations of specification pset(natl) in terms of the given operations of set(natl). In our algebraic implementation concept we allow arbitrary recursive definitions given by a set of equations like those for the union. The derive concept in CLEAR would only allow (non recursive) derived operations corresponding to the remaining equations for empty set (0), singleton ({ }) and element (E). The operations implementing equations are showing how the operations in pset(natl) are simulated by those in set(natl) as required in 3.1.4. The first equation means that the empty set operation for pset(natl) is given as composition of CREATE in set(natl) and the copy operation. In the second equation we learn that singleton is simulated by INSERT applied to CREATE where n is a variable of sort nat. Again we need the copy operation because INSERT(n, CREATE) is a term of sort set which becomes a term of sort pset after application of the copy operation. In the next two equations the union is recursively defined in terms of CREATE and INSERT using again the copy operation for adaptation of sorts. 
The union operation is well-defined because all terms of sort pset are of the form c(CREATE) or c(INSERT(n, s')) and the recursive equation is decreasing in the second argument. Finally the last equation says that "element" is simulated by "MEMBER". Although the use of the copy operation may seem to be an unnecessary burden, it is essential in order to keep a clear distinction between the sorts set and pset. (Forcing set = pset would change the given specifications or lead to syntactical inconsistencies like those in the first version of the symbol table implementation in [28].) Finally let us note that the parameter protection requirement (see 3.1.5) is reflected by the fact that we only give equations for those pset(natl)-operations which do not belong to the parameter part natl.
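The operations implementing equations of this example can be read as a system of recursive programs. The following is a minimal Python sketch under stated assumptions, not part of the formal concept: terms of sort set are modelled by the hypothetical classes `Create` and `Insert`, the copy operation c by the class `C`, and MEMBER is assumed to behave as the usual membership test on INSERT-terms.

```python
from dataclasses import dataclass

# Terms of sort set in set(natl): CREATE and INSERT(n, s) (modelling assumption).
@dataclass(frozen=True)
class Create:
    pass

@dataclass(frozen=True)
class Insert:
    n: int
    s: "Create | Insert"

# Copy operation c: set -> pset; it only changes the sort of a term.
@dataclass(frozen=True)
class C:
    s: object

def member(n, s):
    """MEMBER of set(natl), assumed to search the INSERT-term for n."""
    while isinstance(s, Insert):
        if s.n == n:
            return True
        s = s.s
    return False

def empty():
    # empty set = c(CREATE)
    return C(Create())

def singleton(n):
    # {n} = c(INSERT(n, CREATE))
    return C(Insert(n, Create()))

def union(p, q):
    # c(s) u c(CREATE)        = c(s)
    # c(s) u c(INSERT(n, s')) = c(INSERT(n, s)) u c(s')
    s, t = p.s, q.s
    if isinstance(t, Create):
        return C(s)
    # The second argument shrinks in every call, so the recursion terminates.
    return union(C(Insert(t.n, s)), C(t.s))

def element(n, p):
    # "element" is simulated by MEMBER on the underlying set term
    return member(n, p.s)
```

For instance, `union(singleton(1), singleton(2))` evaluates to `C(Insert(2, Insert(1, Create())))`, a term of sort pset built entirely from set(natl)-operations and the copy operation.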

Due to the conceptual requirements 3.1.1 and 3.1.5 we have the following:

3.3. General assumption. We assume that the following algebraic specifications are given:

  SPEC  = (S, Σ, E)               (actual parameter part)
  SPEC0 = SPEC + (S0, Σ0, E0)     (target specification)
  SPEC1 = SPEC + (S1, Σ1, E1)     (source specification)

where SPEC0 and SPEC1 are both extensions of SPEC, i.e. they are combinations in the sense of Section 2 and the common parameter part SPEC is protected, i.e.

  $(T_{SPEC0})_{SPEC} \cong T_{SPEC} \cong (T_{SPEC1})_{SPEC}$.

Now we are able to define the syntactical level of an implementation, called weak implementation, which becomes an implementation if additional semantical properties, given in Section 4, are satisfied. Moreover we only consider the standard case in this section, while the general case including hidden components is studied in Section 6.

3.4. Definition (Weak standard implementation). Given algebraic specifications SPEC0 and SPEC1 as in 3.3, a weak standard implementation of SPEC0 by SPEC1 is a pair IMPL = (ΣSORT, EOP) of operations ΣSORT, called sorts implementing operations, and equations EOP, called operations implementing equations, such that

  SORTIMPL = SPEC1 + (S0, ΣSORT, ∅)   and   OPIMPL = SORTIMPL + (∅, Σ0, EOP)

are combinations, called sort implementation level and operation implementation level respectively, and for all σ: → s and σ: s1 ... sn → s in ΣSORT the range s belongs to S0.

Notation. We use the following diagrammatic notation (see remark in 4.2)

  IMPL: SPEC1 → SPEC0

or, especially for examples, the syntactical schema

  SPEC1 impl SPEC0 by
    sorts impl opns: ... (operations of ΣSORT)
    opns impl eqns: ... (equations of EOP)

where the lists of operations and equations can be written as usual in algebraic specifications.

3.5. Remarks and interpretation

(1) Sorts in S0 and operations in Σ0 are used ambiguously in different specification and implementation levels. First they name data domains and operations of the abstract data type specified by SPEC0. On the other hand they refer to the corresponding realizations of these domains and operations in the implementation levels. Whereas in the former case the semantics of S0 and Σ0 is given by $T_{SPEC0}$, in the latter case data of S0-sorts are considered to be generated by sorts implementing operations applied to data of $T_{SPEC1}$. The effect of the Σ0-operations is determined by the operations implementing equations. (Cf. the synthesis step in 4.2.) Hopefully, it is not confusing for the reader that we use the same names for corresponding sorts and operations in different levels (which is done frequently in programming).

(2) Without any additional technical problem we can allow that some auxiliary (hidden) sorts in addition to ΣSORT are used to generate the S0-sorts and some auxiliary (hidden) operations with (hidden) equations in addition to EOP are used to define the Σ0-operations. This more general case of (weak) implementations will be studied in Section 6. The main features of implementations, however, can already be studied in the standard case.

(3) In our syntactical requirements we assume that the sorts implementing operations σ ∈ ΣSORT generate synthesized data in the target sorts S0 from those in SPEC1, while the operations implementing equations e ∈ EOP are intended to define recursive procedures for the target operations Σ0 in terms of the source operations Σ + Σ1 and the sorts implementing operations ΣSORT. The most common sorts implementing operations will be copy operations c: s1 → s0 where s1 ∈ S + S1 and s0 ∈ S0. But we also think of tuple- and table-operations like TUP: s1 ... sn → s0 or TAB: s1 ... sn s0 → s0, which generate tuples of data of sorts s1, ..., sn in sort s0, resp. sequences of such tuples which can be considered as tables with entries of sorts s1, ..., sn. In the second case, however, we need an additional constant like NIL: → s0 to initialize the recursive table construction.

(4) Restricting the form of sorts implementing operations, we can classify implementations by their type of sort implementation. The simplest case seems to be renaming of sorts by copy operations as defined above. Most of the known implementation concepts [28, 1, 27, 11, 12, 41, 35] belong to this type. More complex than copy are constructions like tuples and tables defined above, or union, given by INk: sk → s for k = 1, ..., n, and binary trees, given by EMPTY: → s and BIN: s s s1 ... sn → s. Each of these constructions and each combination defines a special class, sometimes called device of the implementation, provided that all sorts implementing operations (and possibly equations) belong to this class.

3.6. Examples (Implementation of sets using tables). We want to implement sets of natural numbers using tables. Let us consider the following two implementations where the target specification is in both cases SPEC0 = set(natl) as given in 2.4.2.

(1) The most straightforward solution is to take SPEC1 = natl as source specification and to generate the tables in the sorts implementing part. Then the operations implementing part includes a renaming of ΣSORT.

  natl impl set(natl) by
    sorts impl opns:
      NIL: → set
      TAB: nat set → set
    opns impl eqns:
      CREATE = NIL
      INSERT(n, s) = TAB(n, s)
      DELETE(n, NIL) = NIL
      DELETE(n1, TAB(n2, s)) = IF EQ(n1, n2) THEN DELETE(n1, s) ELSE TAB(n2, DELETE(n1, s)) SET
      (similarly for the equations for MEMBER, EMPTY, IF-THEN-ELSE-SET as in 2.4.2, where CREATE and INSERT are replaced by NIL and TAB respectively)

Taking string(natl) (see 2.4.1) instead of natl as source specification, we could avoid generating the tables in the sorts implementing part, which would reduce to a simple copy operation c: string → set in this case. In both cases the complexity of the operations as defined in [21] turns out to be undefined because the representation of each finite set in the implementation may be arbitrarily long.

(2) A more efficient implementation for all operations in the data type can be given by the well-known hash-table technique. Again we could use string(natl) as source specification and construct the hash-tables in the sorts implementing part. But at the moment we prefer to have the hash-tables already in the source specification. In Section 7 we will discuss the first case as an example of a composition of implementations. First we have to specify a suitable data type hash(natl), which consists of "generalized hash-tables" in contrast to "actual hash-tables" occurring in the implementation of sets.

Generalized hash-tables are constructed by CREATEAR in sort array as m-tuples of strings of natural numbers in sort list (we obtain the usual table layout if the components of the m-tuples are listed below each other). The total number m of rows is fixed by the HASH-function, which is addition modulo m in our example. "Actual hash-tables" are those generalized hash-tables where an element n of the set is represented as an entry in row j = HASH(n) and where all entries are pairwise distinct. The sort nat(m) provides a copy of the first m positive natural numbers. (s1, ..., sm) is an abbreviation for a list of m distinct variables: if, e.g., m = 4, then (s1, ..., sm) stands for (s1, s2, s3, s4).

  hash(natl) = natl +

  sorts: list, nat(m), array
  opns:  e: → list
         ADDLI: list nat → list
         ELEMi: → nat(m)   (i = 1, ..., m)
         CREATEAR: list^m → array
         ENTRY: nat(m) array → list
         CHANGE: nat(m) array list → array
         HASH: nat → nat(m)
         ADJOIN: list nat → list
         REMOVE: list nat → list
         SEARCH: list nat → bool
         EMPTYLI: list → bool
         IF-THEN-ELSE-LI: bool list list → list
  eqns:  ENTRY(ELEMi, CREATEAR(s1, ..., sm)) = si   (i = 1, ..., m)
         CHANGE(ELEMi, CREATEAR(s1, ..., sm), s) = CREATEAR(s1, ..., s(i-1), s, s(i+1), ..., sm)   (i = 1, ..., m)
         HASH(SUCC^(i-1)(0)) = ELEMi   (i = 1, ..., m)
         HASH(SUCC^m(n)) = HASH(n)
         ADJOIN(s, n) = IF SEARCH(s, n) THEN s ELSE ADDLI(s, n) LI
         REMOVE(e, n) = e
         REMOVE(ADDLI(s, n1), n2) = IF EQ(n1, n2) THEN s ELSE ADDLI(REMOVE(s, n2), n1) LI
         SEARCH(e, n) = FALSE
         SEARCH(ADDLI(s, n1), n2) = IF EQ(n1, n2) THEN TRUE ELSE SEARCH(s, n2) BOOL
         EMPTYLI(e) = TRUE
         EMPTYLI(ADDLI(s, n)) = FALSE
         IF TRUE THEN s1 ELSE s2 LI = s1
         IF FALSE THEN s1 ELSE s2 LI = s2
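The list equations of hash(natl) can be read directly as recursive programs. The following Python sketch is an illustration under assumptions: lists of sort list are modelled as tuples with ADDLI(s, n) = s + (n,), the number of rows `M` is an arbitrary choice, and the function names are hypothetical.

```python
M = 4  # number of rows m; an arbitrary value chosen for this sketch

def hash_row(n):
    """HASH: HASH(SUCC^(i-1)(0)) = ELEMi and HASH(SUCC^m(n)) = HASH(n),
    i.e. row number (n mod m) + 1, counted from 1."""
    return n % M + 1

def search(s, n2):
    # SEARCH(e, n) = FALSE
    # SEARCH(ADDLI(s, n1), n2) = IF EQ(n1, n2) THEN TRUE ELSE SEARCH(s, n2)
    if not s:
        return False
    rest, n1 = s[:-1], s[-1]
    return True if n1 == n2 else search(rest, n2)

def adjoin(s, n):
    # ADJOIN(s, n) = IF SEARCH(s, n) THEN s ELSE ADDLI(s, n)
    return s if search(s, n) else s + (n,)

def remove(s, n2):
    # REMOVE(e, n) = e
    # REMOVE(ADDLI(s, n1), n2) = IF EQ(n1, n2) THEN s ELSE ADDLI(REMOVE(s, n2), n1)
    if not s:
        return ()
    rest, n1 = s[:-1], s[-1]
    return rest if n1 == n2 else remove(rest, n2) + (n1,)
```

Note that REMOVE, read literally, deletes only the most recently added occurrence of an element; this suffices for rows of actual hash-tables, whose entries are pairwise distinct because they are built by ADJOIN.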

Each object of sort array has the following graphical representation, where for all 1 ≤ j ≤ m and all 1 ≤ k ≤ i_j we have n(j, k) ∈ ℕ and HASH(n(j, k)) = (n(j, k) mod m) + 1 = j:

  n(1,1)  n(1,2)  ...  n(1,i_1)
  n(2,1)  n(2,2)  ...  n(2,i_2)
    ...
  n(m,1)  n(m,2)  ...  n(m,i_m)

Now we are able to give the implementation IMPL: hash(natl) → set(natl):

  hash(natl) impl set(natl) by
    sorts impl opns (ΣSORT)
      a: array → set
    opns impl eqns (EOP)
      CREATE = a(CREATEAR(e, ..., e))
      INSERT(n, a(s)) = a(CHANGE(HASH(n), s, ADJOIN(ENTRY(HASH(n), s), n)))
      DELETE(n, a(s)) = a(CHANGE(HASH(n), s, REMOVE(ENTRY(HASH(n), s), n)))
      MEMBER(n, a(s)) = SEARCH(ENTRY(HASH(n), s), n)
      EMPTY(a(s)) = EMPTYLI(ENTRY(ELEM1, s)) AND ... AND EMPTYLI(ENTRY(ELEMm, s))
      IF b THEN a(s1) ELSE a(s2) SET = a(IF b THEN s1 ELSE s2 LI)

Let us give a brief interpretation of the INSERT-operation: Given a natural number n and an actual hash-table representing a set a(s), INSERT(n, a(s)) is a new actual hash-table where row r_k of a(s), with k = HASH(n) and r_k = ENTRY(HASH(n), s), is replaced by a new row r'_k = ADJOIN(r_k, n). The new row r'_k extends r_k by the element n if n does not belong to r_k and is equal to r_k otherwise. Similarly, in the case of the DELETE-operation the element n is deleted in r'_k if n belongs to r_k. Note that if r_k is a row of an actual hash-table, then so is r'_k. Since we only apply set(natl)-operations in the implementation, we only obtain actual hash-tables after each application of an operation.

The set of all actual hash-tables becomes an algebra $REP_{IMPL}$ with respect to the signature of set(natl), which will be called representation of the implementation.

But $REP_{IMPL}$ does not satisfy all the set(natl)-equations: Take e.g. the terms INSERT(n1, INSERT(n2, CREATE)) and INSERT(n2, INSERT(n1, CREATE)), which are interpreted by different actual hash-tables for n1 ≠ n2 with HASH(n1) = HASH(n2), although they correspond to the same set {n1, n2}. In other words, $REP_{IMPL}$ distinguishes between several representations of the same set(natl)-object (cf. 3.1.3).
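The effect just described can be observed in a small Python sketch of the hash-table implementation. It is an illustration under assumptions: arrays are modelled as tuples of `M` rows, rows are numbered from 0 instead of ELEM1, ..., ELEMm, the value of `M` is arbitrary, and the function `abstract` (a hypothetical name) maps each actual hash-table to the finite set it represents.

```python
M = 3  # number of rows; an arbitrary value chosen for this sketch

def hash_row(n):
    return n % M  # 0-based row index (the paper uses (n mod m) + 1)

def create():
    # CREATE = a(CREATEAR(e, ..., e))
    return tuple(() for _ in range(M))

def change(i, arr, row):
    # CHANGE replaces row i of the array by the given row
    return arr[:i] + (row,) + arr[i + 1:]

def adjoin(row, n):
    return row if n in row else row + (n,)

def remove(row, n):
    # Simplification: drops every occurrence; rows here never hold duplicates.
    return tuple(x for x in row if x != n)

def insert(n, arr):
    i = hash_row(n)
    return change(i, arr, adjoin(arr[i], n))

def delete(n, arr):
    i = hash_row(n)
    return change(i, arr, remove(arr[i], n))

def member(n, arr):
    return n in arr[hash_row(n)]

def is_empty(arr):
    return all(not row for row in arr)

def abstract(arr):
    """The set represented by an actual hash-table (illustrative helper)."""
    return frozenset(x for row in arr for x in row)

# 1 and 4 collide in row 1, so the order of insertion is visible in the table.
t1 = insert(1, insert(4, create()))
t2 = insert(4, insert(1, create()))
```

Here `t1` and `t2` are different tables, since the colliding row holds its entries in a different order, while `abstract(t1)` and `abstract(t2)` are the same set.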

4. Semantical level of implementations

In the last section we have studied the syntactical level of implementations, which consists of a set ΣSORT of operations and a set EOP of equations. The set ΣSORT of sorts implementing operations provides the connection between given sorts in the source specification and the new sorts in the target specification. On the other hand the new operations in the target specification are defined in terms of the given operations by the operations implementing equations. This is already an intuitive semantics for our implementation concept. More precisely, the semantics is a construction which, starting from the semantics $T_{SPEC1}$ of the source specification SPEC1, leads to a data type $S_{IMPL}$, the semantical algebra of the implementation, which should be isomorphic to the semantics $T_{SPEC0}$ of the target specification SPEC0 if the implementation is correct. Of course, our syntactical framework cannot assure that each weak implementation is automatically correct and hence an implementation. But it assures that we have for each weak implementation a well-defined semantical construction, and it is correct if two additional semantical requirements are satisfied. The semantical construction together with these semantical requirements will be shown to imply the conceptual requirements for algebraic implementations given in 3.1.

4.1. Motivation (semantics of implementations)

The semantical construction of an implementation consists of three steps: SYNTHESIS, RESTRICTION and IDENTIFICATION.

The idea of the SYNTHESIS step is to synthesize the new sorts and operations of the target specification SPEC0 from those of the source specification SPEC1 in two substeps. In the first substep, called SORT-SYNTHESIS, the data domains for the new sorts of SPEC0 are generated by the operations ΣSORT, while in the second substep, called OP-SYNTHESIS, the new operations of SPEC0 are defined by the equations EOP. This corresponds exactly to the sort implementation level SORTIMPL and the operation implementation level OPIMPL given in the definition of weak implementations. The result of the SYNTHESIS step is the semantics $T_{OPIMPL}$ of the operation implementation level OPIMPL, which in fact combines the source specification SPEC1 with the signature of the target specification SPEC0 using the components ΣSORT and EOP of the weak implementation. Note that OPIMPL does not contain the equations E0 of SPEC0, because the semantics of the Σ0-operations in OPIMPL is defined by the equations EOP.

In the RESTRICTION step, the second step of the semantical construction, we restrict the data type $T_{OPIMPL}$ to all operations of SPEC0 and all data which are reachable by these operations. This construction is done in two substeps, FORGETTING and REACHABILITY, and the result is a data type $REP_{IMPL}$ which has the signature of SPEC0 but is not intended to satisfy all the equations of SPEC0. Actually $REP_{IMPL}$ allows multiple representation of SPEC0-data, like the representation of the set {n1, n2} by the two different lists n1 n2 and n2 n1, which violates the commutativity axiom for insertion of elements in sets (see 2.4.2). Hence the data type $REP_{IMPL}$ is called representation of the implementation.

In the final IDENTIFICATION step all multiple representations of SPEC0-data are identified with respect to the equations of SPEC0. The result $S_{IMPL}$ of this construction becomes a SPEC0-algebra which, however, may include more identifications of data than those specified in SPEC0, like identification of TRUE and FALSE in bool. In this case the semantical construction would lead to an inconsistent and hence incorrect result with respect to SPEC0. But our semantical requirements will make sure that this inconsistency cannot occur in correct implementations.
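For the table implementation of sets (Example 3.6(1)) the RESTRICTION and IDENTIFICATION steps can be sketched in Python on a small finite fragment. This is an illustration under assumptions, not the construction itself: tables are modelled as tuples, only terms up to a fixed depth over a finite universe are generated, and the set equations of 2.4.2 are assumed to be commutativity and idempotence of INSERT, so that a sorted duplicate-free tuple can serve as canonical representative. The names `reachable`, `normal_form` and `identify` are hypothetical.

```python
from itertools import product

NIL = ()

def tab(n, s):
    # TAB: nat set -> set, modelled as prepending n to a tuple
    return (n,) + s

def create():
    return NIL                      # CREATE = NIL

def insert(n, s):
    return tab(n, s)                # INSERT(n, s) = TAB(n, s)

def delete(n1, s):
    # DELETE(n, NIL) = NIL
    # DELETE(n1, TAB(n2, s)) = IF EQ(n1, n2) THEN DELETE(n1, s)
    #                          ELSE TAB(n2, DELETE(n1, s))
    if s == NIL:
        return NIL
    n2, rest = s[0], s[1:]
    return delete(n1, rest) if n1 == n2 else tab(n2, delete(n1, rest))

def reachable(universe, depth):
    """RESTRICTION (finite fragment): values of CREATE/INSERT terms up to depth."""
    level = {create()}
    seen = set(level)
    for _ in range(depth):
        level = {insert(n, s) for n, s in product(universe, level)}
        seen |= level
    return seen

def normal_form(s):
    """Canonical representative under the assumed set equations."""
    return tuple(sorted(set(s)))

def identify(reps):
    """IDENTIFICATION (sketch): group representations by canonical form."""
    classes = {}
    for s in reps:
        classes.setdefault(normal_form(s), set()).add(s)
    return classes

reps = reachable([1, 2], 2)
classes = identify(reps)
```

With universe {1, 2} and depth 2 the fragment contains seven tables but only four classes; the class of the set {1, 2} contains both tables (1, 2) and (2, 1), which corresponds to the multiple representation discussed above.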

In the following definition the two substeps of the SYNTHESIS- and the RESTRICTION-step are not reflected explicitly, but the substeps are defined in 4.3.

4.2. Definition (semantics of implementations). Given a weak (standard) implementation IMPL of SPEC0 by SPEC1, the semantical construction $SEM_{IMPL}$ of the weak implementation is the composition of the following three constructions:

  $SEM_{IMPL}: T_{SPEC1} \xrightarrow{\text{SYNTHESIS}} T_{OPIMPL} \xrightarrow{\text{RESTRICTION}} REP_{IMPL} \xrightarrow{\text{IDENTIFICATION}} S_{IMPL}$

defining a sequence of algebras where

(1) $T_{SPEC1}$ is the semantics of the source specification SPEC1, i.e. $T_{SPEC1}$ is the initial SPEC1-algebra,

(2) $T_{OPIMPL}$ is the semantics of the operation implementation level OPIMPL (see 2.4 resp. 6.1), i.e. $T_{OPIMPL}$ is the initial OPIMPL-algebra,

(3) $REP_{IMPL}$, called representation of the implementation, is that part of $T_{OPIMPL}$ which is generated by (Σ + Σ0)-operations, i.e.

  $REP_{IMPL} = eval(T_{\Sigma+\Sigma 0})$

where eval is the term evaluation

  $eval: T_{\Sigma+\Sigma 0} \to (T_{OPIMPL})_{\Sigma+\Sigma 0}$

uniquely defined by initiality of $T_{\Sigma+\Sigma 0}$, and $(T_{OPIMPL})_{\Sigma+\Sigma 0}$ is the (Σ + Σ0)-part of $T_{OPIMPL}$,

(4) $S_{IMPL}$, called semantical algebra of the implementation, is the quotient of $REP_{IMPL}$ by the congruence generated by the E0-equations, i.e.

  $S_{IMPL} = REP_{IMPL} / \equiv_{E0}$.

Remark. This definition of the semantics makes sense for weak standard implementations as defined in 3.4, but also for weak implementations with hidden components, which will be introduced in Section 6. The only difference in the latter case is that OPIMPL in addition contains all the hidden components. In the same way the semantical requirements (see 4.5) will formally have the same definition, but SORTIMPL and OPIMPL in the standard case do not contain the hidden components.

While the intuitive interpretation of the semantical construction and the semantical algebra was already given in 4.1, we are now going to give a more categorical interpretation, which may be skipped on first reading.

4.3. Fact and remark (functorial interpretation of the semantics)

From the categorical point of view the semantical construction $SEM_{IMPL}$ can be considered as a composition of three functors: SYNTHESIS and IDENTIFICATION are free functors (left adjoints) with respect to the forgetful functors $V1: Alg_{OPIMPL} \to Alg_{SPEC1}$ and $V3: Alg_{SPEC0} \to Alg_{SPEC0^\circ}$ respectively, where SPEC0° = SPEC0 − E0. RESTRICTION is not a left adjoint but a composition of two right adjoint functors FORGETTING and REACHABILITY and an inclusion I. FORGETTING is the forgetful functor $V2: Alg_{OPIMPL} \to Alg_{SPEC0^\circ}$ and REACHABILITY is the right adjoint of the inclusion $I: Reach\text{-}Alg_{SPEC0^\circ} \to Alg_{SPEC0^\circ}$, where $Reach\text{-}Alg_{SPEC0^\circ}$ is the full subcategory of all reachable SPEC0°-algebras, i.e. all SPEC0°-algebras A such that the term evaluation $eval_A: T_{\Sigma+\Sigma 0} \to A$ is surjective. In other words, A is a quotient of $T_{\Sigma+\Sigma 0}$. The composition $SEM_{IMPL}$ of these three functors

  $SEM_{IMPL}: Alg_{SPEC1} \xrightarrow{\text{SYNTHESIS}} Alg_{OPIMPL} \xrightarrow{\text{RESTRICTION}} Alg_{SPEC0^\circ} \xrightarrow{\text{IDENTIFICATION}} Alg_{SPEC0}$

becomes a functor from the category of SPEC1-algebras to the category of SPEC0-algebras. In our semantical construction we do not need the full functor but only the functor $SEM_{IMPL}$ applied to one object, the initial SPEC1-algebra $T_{SPEC1}$. This is the reason why we have defined our semantical construction by the sequence of algebras and not by the composition of functors. Both views are compatible because for the functors SYNTHESIS, RESTRICTION and IDENTIFICATION, uniquely defined (up to isomorphism) as above, we have:

(1) $SYNTHESIS(T_{SPEC1}) = T_{OPIMPL}$,
(2) $RESTRICTION(T_{OPIMPL}) = REP_{IMPL}$,
(3) $IDENTIFICATION(REP_{IMPL}) = S_{IMPL}$.

The first property follows from the fact that free functors preserve initial objects; the second and third properties follow from the explicit construction of $REP_{IMPL}$ and $S_{IMPL}$ respectively.

The functorial view of the semantics is very important with respect to the generalization of our concept to the implementation of parameterized data types. As pointed out in the remark of 3.1, we only consider actualized parameterized specifications in this paper, where SPEC is the actual parameter of SPEC1 and SPEC0. If SPEC is considered as formal parameter, then we have to consider the semantics of parameterized types, i.e. the free functor $F1: Alg_{SPEC} \to Alg_{SPEC1}$ (see [2-4]), instead of the initial algebra $T_{SPEC1}$. Hence we have to apply our semantical construction $SEM_{IMPL}$ not only to $T_{SPEC1}$ but to all free algebras F1(A) for A ∈ $Alg_{SPEC}$. Our semantical requirement $SEM_{IMPL}(T_{SPEC1}) = S_{IMPL} \cong T_{SPEC0}$ (see 4.6.2) will then be generalized to $SEM_{IMPL} \circ F1 = F0$, where F0 is the free functor

  $F0: Alg_{SPEC} \to Alg_{SPEC0}$.

Finally let us remark that we have already defined implementations by functors in our paper [15]. At that time, however, we did not know suitable restrictions for those functors corresponding to the intuitive idea of implementations. In particular we did not have a syntactical description for such functors. Now we have one, because IMPL = (ΣSORT, EOP) is exactly a syntactical description of the semantical functor $SEM_{IMPL}: Alg_{SPEC1} \to Alg_{SPEC0}$ (where it is assumed that source and target specifications SPEC1 and SPEC0 are given in advance). Actually most of the syntactical description is already reflected in the SYNTHESIS-functor, where ΣSORT (together with S0) defines the free functor SORT-SYNTHESIS: $Alg_{SPEC1} \to Alg_{SORTIMPL}$, EOP (together with Σ0) defines the free functor OP-SYNTHESIS: $Alg_{SORTIMPL} \to Alg_{OPIMPL}$, and SYNTHESIS is the composition of SORT-SYNTHESIS and OP-SYNTHESIS. In the RESTRICTION part only the functor FORGETTING: $Alg_{OPIMPL} \to Alg_{SPEC0^\circ}$ (weakly) depends on ΣSORT and EOP because they are part of OPIMPL. But REACHABILITY and IDENTIFICATION (considered as functors) are independent of the specific implementation; they only depend on the target specification SPEC0. This is important to keep in mind when different implementations for the same target specification SPEC0 are compared with each other, as done in our complexity paper [21].

Before we are going to define the semantical requirements for implementations we will give the semantical constructions for our examples studied in Section 3.

4.4. Examples. (1) For set(natl) impl pset(natl) given in Example 3.2, $T_{SPEC1}$ is the initial set(natl)-algebra as given in 2.4.2. By the SYNTHESIS-construction a new sort pset is added, and the copy operation c: set → pset induces a bijection of the corresponding data domains in $T_{SORTIMPL}$. This corresponds to the SORT-SYNTHESIS-step. In the OP-SYNTHESIS-step we obtain the algebra $T_{OPIMPL}$, which in addition to $T_{SORTIMPL}$ has the pset(natl)-operations empty set, singleton, union and element. These operations already have the same semantics as the corresponding ones in $T_{pset(natl)}$. In other words we already have

  $(T_{OPIMPL})_{\Sigma+\Sigma 0} = REP_{IMPL} \cong S_{IMPL} \cong T_{pset(natl)}$

because $eval: T_{\Sigma+\Sigma 0} \to (T_{OPIMPL})_{\Sigma+\Sigma 0}$ is already surjective and $REP_{IMPL}$ already satisfies all pset(natl)-equations.

(2) In Example 3.6.1, natl impl set(natl), we have $T_{SPEC1} = T_{natl}$, $T_{SORTIMPL}$ is an extension of $T_{natl}$ with a new sort set such that $(T_{SORTIMPL})_{set} = \mathbb{N}^*$, and $T_{OPIMPL}$ is an enrichment of $T_{SORTIMPL}$ by the set(natl)-operations CREATE, INSERT, DELETE, MEMBER, EMPTY and IF-THEN-ELSE-SET. The semantics of these operations in $T_{OPIMPL}$ is that of the corresponding string operations and not yet that of the operations in $T_{set(natl)}$. Since all data in $(T_{OPIMPL})_{\Sigma+\Sigma 0}$ are reachable by the operations, we have $(T_{OPIMPL})_{\Sigma+\Sigma 0} = REP_{IMPL}$. But $S_{IMPL}$, the result after IDENTIFICATION, is a proper quotient of $REP_{IMPL}$ and isomorphic to $T_{set(natl)}$.

(3) Finally our Example 3.6.2, hash(natl) impl set(natl), has trivial SORT-SYNTHESIS but nontrivial RESTRICTION and IDENTIFICATION. In the SYNTHESIS-step we obtain copies of all generalized hash-tables in $(T_{OPIMPL})_{set}$, and the set(natl)-operations in $T_{OPIMPL}$ manipulate these generalized hash-tables as explained in 3.6.2. In the RESTRICTION-step all generalized hash-tables which are not actual hash-tables are removed, such that $(REP_{IMPL})_{set}$ consists exactly of all actual hash-tables. Moreover we have forgotten in $REP_{IMPL}$ the copy operation, all data domains of sorts list, nat(m), array, and hence also all hash(natl)-operations which are not natl-operations. Let us point out that FORGETTING does not mean that we do not need the hash(natl)-operations any more. They are still used as basic operations to define the set(natl)-operations in $REP_{IMPL}$, but they are hidden because they do not belong to the signature of $REP_{IMPL}$. Finally the IDENTIFICATION-step is also nontrivial because $REP_{IMPL}$ does not satisfy all the set(natl)-equations, like commutativity of INSERT for two arbitrary elements (see 3.6.2). Hence different actual hash-table representations for the same set are identified, such that the quotient $S_{IMPL}$ of $REP_{IMPL}$ becomes isomorphic to $T_{set(natl)}$.

All the previous examples already satisfy our semantical requirements for implementations as stated below. Hence all these examples are not only weak implementations but implementations. Before we state the requirements we need some more motivation.

4.5. Motivation (semantical requirements)

Given a weak implementation IMPL of SPEC0 by SPEC1 we have the following interpretation in terms of programming languages: IMPL is a program and the semantics of the program is given by the SPEC0-algebra $S_{IMPL}$. But in the same way as not every program is correct, we cannot expect that every weak implementation is correct in the sense of what it is intended to do. This intention is not always clear for programs but it is clear for our implementations, because the intended semantical algebra of IMPL is $T_{SPEC0}$. As mentioned already in 4.1 we need a semantical requirement to make sure that we avoid inconsistencies. Hence we will have to assume that the result $S_{IMPL}$ of our semantical construction is isomorphic to $T_{SPEC0}$, i.e. the semantical construction should be a transformation from the semantics of the source specification SPEC1 to the semantics of the target specification SPEC0. We call this property RI-correctness, which reflects that we have first the restriction (R) and then the identification (I) step in our semantical construction and not the other way round (see 4.2). In Section 5 we will show that RI-correctness is equivalent to the existence of a (surjective) representation homomorphism $rep: REP_{IMPL} \to T_{SPEC0}$, which in other implementation concepts is sometimes called abstraction function. Moreover, the idea to have an abstraction function f: A → B was one of the first ideas to define the notion "A implements B".

In order to satisfy our conceptual requirements concerning simulation of compound operations (see 3.1.4) we also need another semantical requirement, called OP(eration)-completeness. Simulation of compound operations means that SPEC0-operations should be simulated by synthesized SPEC1-operations, that is, by SORTIMPL-operations. Hence we have to make sure that the new Σ0-operations in OPIMPL are completely specified with respect to SORTIMPL. We cannot assume that they are completely specified with respect to SPEC1, because in general the Σ0-operations will use sorts in S0 which are not available in SPEC1. On the other hand, if we dropped OP-completeness, the empty implementation IMPL = (∅, ∅) would be a correct implementation for all SPEC1 and SPEC0, where in $T_{OPIMPL}$ and $REP_{IMPL}$ all the data in S0-sorts are freely generated by the new operations.

4.6. Definition (semantical requirements). A weak (standard) implementation IMPL of SPEC0 by SPEC1 is called

(1) OP-complete, if for each term t0 in $T_{\Sigma+\Sigma 0}$ there is an OPIMPL-equivalent term t1 in $T_{\Sigma(SORTIMPL)}$, i.e. $t0 \equiv_{OPIMPL} t1$;

(2) RI-correct, if the semantical algebra $S_{IMPL}$ of the implementation is isomorphic to $T_{SPEC0}$, i.e.

  $S_{IMPL} \cong T_{SPEC0}$;

(3) implementation, if it is OP-complete and RI-correct. We also say that a weak implementation is correct if it is an implementation.

Remarks. (1) A more detailed discussion of OP-completeness and RI-correctness will be given in Section 5. We will also consider some stronger semantical requirements like IR-correctness, corresponding to an IR-semantical construction where IDENTIFICATION is done before RESTRICTION.

(2) In most of the previous versions of our implementation concept we had a third semantical requirement, called type protection, which is now a corollary of our new definition of weak implementations (see 4.7).

Now we give an explicit description of TSORTIMPL as "SPEC1-colored trees" and hence a characterization of the SORT-SYNTHESIS step (see 4.3) in the semantical construction.

4.7. Fact (data representation). (1) Let IMPL = (ΣSORT, EOP) be a weak standard implementation of SPEC0 by SPEC1 and let us assume (without loss of generality) that for all σ: s1 ... sn → s in ΣSORT we have an m ≤ n such that s1, ..., sm ∈ S0 and s(m+1), ..., sn ∈ (S + S1). Then the initial algebra $T_{SORTIMPL}$ is isomorphic to the SORTIMPL-algebra $TREE_{IMPL}$ of SPEC1-colored trees defined as follows: $(TREE_{IMPL})_{SPEC1} = T_{SPEC1}$, and for all s ∈ S0 the set $(TREE_{IMPL})_s$ is recursively defined by

- σ ∈ $(TREE_{IMPL})_s$ for all σ: → s ∈ ΣSORT;
- σ(t1, ..., tm / t(m+1), ..., tn) ∈ $(TREE_{IMPL})_s$ for all σ: s1 ... sn → s ∈ ΣSORT, ti ∈ $(TREE_{IMPL})_{si}$ for i = 1, ..., m and tj ∈ $(T_{SPEC1})_{sj}$ for j = m+1, ..., n.

Now let $\sigma_T := \sigma$ for σ: → s ∈ ΣSORT, and for σ: s1 ... sn → s ∈ ΣSORT the operation

  $\sigma_T: (TREE_{IMPL})_{s1} \times \dots \times (TREE_{IMPL})_{sm} \times (T_{SPEC1})_{s(m+1)} \times \dots \times (T_{SPEC1})_{sn} \to (TREE_{IMPL})_s$

is defined by $\sigma_T(t1, ..., tn) = \sigma(t1, ..., tm / t(m+1), ..., tn)$ for all ti ∈ $(TREE_{IMPL})_{si}$ for i = 1, ..., m and tj ∈ $(T_{SPEC1})_{sj}$ for j = m+1, ..., n.

Remark. The elements σ(t1, ..., tm / t(m+1), ..., tn) of $(TREE_{IMPL})_s$ for s ∈ S0 can be illustrated as SPEC1-colored trees in the following way:

  [Figure: a node σ colored by (t(m+1), ..., tn), with subtrees t1, ..., tm]

In the case m = 0 we only have a colored node, and in the case m = n the color is empty.

(2) If we have in addition OP-enrichment, i.e. OPIMPL is an enrichment of SORTIMPL, then the elements of the representation algebra $REP_{IMPL}$ are SPEC1-colored trees, and $T_{SPEC}$ is protected, i.e. we have (up to isomorphism)

  $(REP_{IMPL})_s \subseteq (TREE_{IMPL})_s$ for all s ∈ S0, and $(REP_{IMPL})_{SPEC} = T_{SPEC}$.

Proof. (1) We will show that $TREE_{IMPL}$ is an initial SORTIMPL-algebra and hence isomorphic to $T_{SORTIMPL}$. We have to construct for each SORTIMPL-algebra B a unique homomorphism $f: TREE_{IMPL} \to B$. On the SPEC1-part of $TREE_{IMPL}$, f is given by the unique homomorphism $g: T_{SPEC1} \to B_{SPEC1}$ defined by initiality of $T_{SPEC1}$. For sorts s ∈ S0, f is defined for σ: s1 ... sn → s ∈ ΣSORT with n ≥ 0 by

  $f_s(\sigma(t1, ..., tm / t(m+1), ..., tn)) = \sigma_B(f_{s1}(t1), ..., f_{sm}(tm), g_{s(m+1)}(t(m+1)), ..., g_{sn}(tn))$

for all ti (i = 1, ..., n) as specified in the construction of $TREE_{IMPL}$. It follows by induction that f is a well-defined family $f = (f_s)$, s ∈ S + S1 + S0, of functions and the unique SORTIMPL-homomorphism from $TREE_{IMPL}$ to B.

(2) Since OPIMPL is an enrichment of SORTIMPL we have up to isomorphism $(T_{OPIMPL})_{SORTIMPL} = T_{SORTIMPL} = TREE_{IMPL}$. By construction of $REP_{IMPL}$ (see 4.2.3) we have for s ∈ S0

  $(REP_{IMPL})_s \subseteq (T_{OPIMPL})_s = (TREE_{IMPL})_s$.

Moreover by general assumption 3.3 and fact 4.7 we have $(T_{OPIMPL})_{SPEC} = (T_{SORTIMPL})_{SPEC} = (T_{SPEC1})_{SPEC} = T_{SPEC}$, which implies $(REP_{IMPL})_{SPEC} = T_{SPEC}$ because Σ(SPEC) ⊆ Σ(SPEC0). □

As a corollary of 4.7.1 we get

4.8. Fact (type protection). Each weak (standard) implementation IMPL of SPEC0 by SPEC1 is type protecting in the sense that SORTIMPL is an extension of SPEC1, i.e.

  $(T_{SORTIMPL})_{SPEC1} \cong (TREE_{IMPL})_{SPEC1} = T_{SPEC1}$.

5. Correctness of implementations

In the last section we have studied the semantics of implementations and corres• ponding semantical requirements. Each weak implementation has already a well• defined semantical construction. But in order to show that it is a correct implementa• tion we have to verify the semantical requirements OP-completeness and RI• correctness. In this section we will study these requirements in more detail and we will give sufficient conditions which can be used in concrete examples to verify the correctness of weak implementations. The conditions we are going to present are semantical conditions meaning that the existence and!or uniqueness of certain algebras and!or homomorphisms has to be shown. The corresponding proof tech• niques are those of set theory, algebra and . Of course, these semantical techniques in general are not suitable to be used in theorem provers although this would be highly desirable for standard applications in software system design. Having in mind this purpose we also show how proof theoretical conditions for the verification of the correctness of weak implementations can be obtained which - in the long run - should be suitable to be used in theorem provers. Actually OP-completeness and RI-correctness are conditions which are closely related to completeness resp. consistency conditions for enrichment and extension situations in the sense of Section 2. As pointed out in the remark of 2.3 there are a number of sufficient conditions being developed in the literature to verify completeness and consistency in a proof theoretical style which are suitable to be used in theorem provers. Since this field of efficient term evaluation and theorem proving is far beyond the scope of this paper we will only give the corresponding references. In addition to the proof theoretical and semantical correctness conditions we will study correctness with respect to a slightly different semantical construction for implementations, called IR-semantical construction. 
In contrast to our construction in Definition 4.2, the IR-semantical construction performs first the IDENTIFICATION and then the RESTRICTION step. This concept is used in some other papers [1, 11, 12, 27], some of which suggest that it is equivalent to the RI-semantical construction. Actually, in the well-known example of automata theory, where RESTRICTION corresponds to restriction of states to all those reachable from the initial state and IDENTIFICATION corresponds to reduction of states, both constructions commute (see e.g. 9.8 in [15] for the very general class of automata in pseudoclosed categories including several deterministic and nondeterministic types).

In spite of this fact we will give a basic counterexample showing that RESTRICTION and IDENTIFICATION do not commute in general. Finally we will show that in the IDENTIFICATION step it suffices to use the equations already given in the target specification. In some papers like [29] additional equations are given to specify the equivalence of all those data in the implementation corresponding to a specific datum in the target specification, which seems to be useful when we allow multiple data representation (see 3.1.3). In [1] and [27] the implementation concept provides a congruence for the same purpose. In our concept each correct implementation already has this congruence, namely the congruence generated by the equations of the target specification. Moreover, if we want to have an explicit equivalence check then it suffices to have an equality specification for the corresponding sorts in the target specification. Actually we will show that the EOP-equations for the equality operation in the implementation generate exactly the equivalence of data sketched above.

5.1. Fact (verification of OP-completeness). A weak implementation IMPL of SPEC0 by SPEC1 is OP-complete if one of the following conditions is satisfied:
(1) (Proof-theoretical condition): OPIMPL is completely specified with respect to SORTIMPL.
(2) (Semantical condition): There is an OPIMPL-algebra $A$ such that the restriction $A_{SORTIMPL}$ is isomorphic to $T_{SORTIMPL}$ and the unique SORTIMPL-homomorphism

$h: A_{SORTIMPL} \cong T_{SORTIMPL} \to (T_{OPIMPL})_{SORTIMPL}$,

where $h$ is defined by initiality of $T_{SORTIMPL}$, preserves all operations in $\Sigma(OPIMPL) - \Sigma(SORTIMPL)$ such that it becomes an OPIMPL-homomorphism $h': A \to T_{OPIMPL}$.

Remark. Both conditions are sufficient but not necessary for OP-completeness. Condition (1) is really proof theoretical because it means that it suffices to verify proof theoretical completeness conditions, which are based on structural induction and/or the termination of a term replacement system corresponding to the equations of OPIMPL (cf. [17, 31, 40]). Moreover these conditions should be refined for the case of OP-completeness, which is strictly weaker than completeness of OPIMPL with respect to SORTIMPL.

Proof. (1) OPIMPL completely specified with respect to SORTIMPL means that $h: T_{SORTIMPL} \to (T_{OPIMPL})_{SORTIMPL}$, defined by $h([t]_{SORTIMPL}) = [t]_{OPIMPL}$ for all $t \in T_{\Sigma(SORTIMPL)}$, is surjective (see Section 2). In order to show OP-completeness let $t_0 \in (T_{\Sigma+\Sigma 0})_{s0}$. Hence $[t_0]_{OPIMPL} \in (T_{OPIMPL})_{s0} = ((T_{OPIMPL})_{SORTIMPL})_{s0}$ such that by surjectivity of $h$ there is $t_1 \in T_{\Sigma(SORTIMPL)}$ with $[t_0]_{OPIMPL} = [t_1]_{OPIMPL}$, which implies OP-completeness.

(2) Given $A$ and $h'$ as above, initiality of $T_{OPIMPL}$ implies an OPIMPL-homomorphism $g: T_{OPIMPL} \to A$ with $h' \circ g = id: T_{OPIMPL} \to T_{OPIMPL}$. But this implies surjectivity of $h'$ and hence of $h$ such that we have OP-completeness by part (1). □

Before we give similar sufficient conditions for verification of RI-correctness we will prove the following characterization theorem.

5.2. Theorem (characterization of RI-correctness). Given a weak implementation IMPL of SPEC0 by SPEC1 together with the semantical construction $SEM_{IMPL}$ in 4.2, the following conditions are equivalent to RI-correctness of IMPL:
(1) The semantical algebra $S_{IMPL}$ is an initial SPEC0-algebra.
(2) The unique SPEC0-homomorphism $f: T_{SPEC0} \to S_{IMPL}$, defined by initiality of $T_{SPEC0}$, is injective.
(3) There is a $(\Sigma+\Sigma 0)$-homomorphism

$rep: REP_{IMPL} \to T_{SPEC0}$,

called representation homomorphism or abstraction function. In this case rep is uniquely defined by

$rep([t]_{OPIMPL}) = [t]_{SPEC0}$ for all $t \in T_{\Sigma+\Sigma 0}$

and surjective.
(4) All $(\Sigma+\Sigma 0)$-terms $t$ and $t'$ equivalent in OPIMPL are also equivalent in SPEC0, i.e. $t \equiv_{OPIMPL} t'$ implies $t \equiv_{SPEC0} t'$.

Proof. By construction in 4.2 there are natural homomorphisms $nat: REP_{IMPL} \to S_{IMPL}$ and $nat0: T_{\Sigma+\Sigma 0} \to T_{SPEC0}$ as well as a surjective homomorphism $e: T_{\Sigma+\Sigma 0} \to REP_{IMPL}$ which is the restriction of eval to $REP_{IMPL}$. A unique homomorphism $f: T_{SPEC0} \to S_{IMPL}$ exists by initiality of $T_{SPEC0}$.

[Diagram: $eval: T_{\Sigma+\Sigma 0} \to (T_{OPIMPL})_{\Sigma+\Sigma 0}$ factors through $e: T_{\Sigma+\Sigma 0} \to REP_{IMPL}$ and the inclusion of $REP_{IMPL}$ in $(T_{OPIMPL})_{\Sigma+\Sigma 0}$; the remaining arrows are $nat0: T_{\Sigma+\Sigma 0} \to T_{SPEC0}$, $nat: REP_{IMPL} \to S_{IMPL}$, $f: T_{SPEC0} \to S_{IMPL}$, and the dashed arrows $rep: REP_{IMPL} \to T_{SPEC0}$ and $f': S_{IMPL} \to T_{SPEC0}$.]

Moreover we have $nat \circ e = f \circ nat0$ because $T_{\Sigma+\Sigma 0}$ is initial. Surjectivity of the left-hand side implies that $f$ is also surjective. Hence injectivity of $f$ implies that $f$ is an isomorphism. Since initiality is closed under isomorphism, RI-correctness, i.e. $S_{IMPL} \cong T_{SPEC0}$, is equivalent to conditions (1) and (2) respectively.

If $f$ is an isomorphism with inverse $f'$, then the composition $rep := f' \circ nat$ is a surjective $(\Sigma+\Sigma 0)$-homomorphism satisfying $nat0 = rep \circ e$ by initiality of $T_{\Sigma+\Sigma 0}$. Hence RI-correctness implies condition (3) including the explicit definition of rep, while uniqueness of rep follows from surjectivity of $e$. Moreover well-definedness of the explicit definition of rep implies condition (4). Conversely condition (4) implies that there are well-defined functions $rep_s: (REP_{IMPL})_s \to (T_{SPEC0})_s$ for all $s \in S + S0$ defined as in condition (3). This family of functions becomes a $(\Sigma+\Sigma 0)$-homomorphism $rep: REP_{IMPL} \to T_{SPEC0}$ because $e$ and $nat0$ are homomorphisms and $e$ is surjective. This means that condition (4) implies (3) and it remains to show that (3) implies (2). Given rep, the quotient construction of $S_{IMPL} = REP_{IMPL}/\equiv_{E0}$ implies that there is a unique SPEC0-homomorphism $f': S_{IMPL} \to T_{SPEC0}$ such that $rep = f' \circ nat$.

Initiality of $T_{SPEC0}$ implies $f' \circ f = id$ such that $f$ becomes injective as required in condition (2). □

Now we will give sufficient conditions for verification of RI-correctness where the second condition is also necessary but not the first one.

5.3. Fact (verification of RI-correctness). A weak implementation IMPL of SPEC0 by SPEC1 is RI-correct if condition (1) or (2) below is satisfied:
(1) (Proof theoretical condition): $IDIMPL = OPIMPL + (\emptyset, \emptyset, E0)$ is consistently specified with respect to SPEC0.
(2) (Semantical condition): There is an OPIMPL-algebra $A$ and a $(\Sigma+\Sigma 0)$-homomorphism $h: RESTRICTION(A) \to T_{SPEC0}$, where $RESTRICTION(A)$ is the image of $eval_A: T_{\Sigma+\Sigma 0} \to A_{\Sigma+\Sigma 0}$, which coincides with $REP_{IMPL}$ for $A = T_{OPIMPL}$.

Remarks. (1) Condition (1) is called proof theoretical because it suffices to verify proof theoretical consistency conditions like those given in [40], which are mainly confluence properties of two term replacement systems corresponding to the equations of OPIMPL and SPEC0, respectively. But we have to add E0 to OPIMPL to obtain an inclusion of specifications and hence the usual consistency situation. Note that this condition is equivalent to IR-correctness (see 5.8).
(2) The semantical condition is closely related to the existence of a representation homomorphism in 5.2.3. Although we do not have to take $A = T_{OPIMPL}$, in most applications we will construct an OPIMPL-algebra $A$ which intuitively is isomorphic to $T_{OPIMPL}$. But it is not necessary to verify that $A$ is isomorphic to $T_{OPIMPL}$.

Proof. (1) IDIMPL consistently specified with respect to SPEC0 means that $h: T_{SPEC0} \to (T_{IDIMPL})_{SPEC0}$ is injective (see Section 2). In other words, for all $t, t' \in T_{\Sigma+\Sigma 0}$, $t \equiv_{IDIMPL} t'$ implies $t \equiv_{SPEC0} t'$. Since $E(IDIMPL) = E(OPIMPL) + E0$, this condition implies condition (4) in 5.2 and hence RI-correctness.

(2) Assume that we have an OPIMPL-algebra $A$ and a $(\Sigma+\Sigma 0)$-homomorphism $h: RESTRICTION(A) \to T_{SPEC0}$. By initiality of $T_{OPIMPL}$ there is a unique $f: T_{OPIMPL} \to A$. Since RESTRICTION is a functor with $RESTRICTION(T_{OPIMPL}) = REP_{IMPL}$ (see 4.3) we also have a $(\Sigma+\Sigma 0)$-homomorphism $RESTRICTION(f): REP_{IMPL} \to RESTRICTION(A)$. Hence the composition with $h$ is the representation homomorphism $rep: REP_{IMPL} \to T_{SPEC0}$, which implies RI-correctness by Theorem 5.2. The necessity of the semantical condition follows from the fact that we can take $A = T_{OPIMPL}$ and use Theorem 5.2. □

In the following examples we will show how our correctness criteria can be used to verify the correctness of our implementation examples in Section 3. We will make essential use of the semantical construction for these implementations given in Example 4.4 and will not make use of the proof theoretical conditions in [17] and [40]. Using such conditions, correctness proofs become more rigorous, but this is beyond the scope of this paper.

5.4. Examples. (1) For set(natl) impl pset(natl) given in Example 3.2 consider the OPIMPL-algebra $A$ with $A_{set} = A_{pset} = \mathcal{P}_{fin}(\mathbb{N}_0)$, $c_A$ = identity and the usual set- and powerset-operations acting on $A_{set}$ and $A_{pset}$ respectively. Then we have $RESTRICTION(A) = A_{pset} \cong T_{pset}$, which shows RI-correctness by 5.3.2. For OP-completeness (see 5.1.2) we have to show that the unique SORTIMPL-homomorphism $h$ preserves empty set, singleton, union and element. This can be shown by induction using that CREATE, INSERT and MEMBER are preserved by $h$ already as SORTIMPL-homomorphism. The proof in this case is nearly the same as verifying directly the OP-completeness condition in 4.6.1. But that is trivial for empty, singleton and element and follows by induction for union, using in each case the corresponding operation implementing equations and the fact that the recursive equation for union is decreasing in the second argument.
(2) In Example 3.6.1, natl impl set(natl), the corresponding OPIMPL-algebra $A$ has $A_{set} = \mathbb{N}_0^*$ and the homomorphism due to 5.3.2 maps each string to the set of all elements of that string. OP-completeness follows by verification of the condition in 4.6.1.
(3) Finally in our Example 3.6.2, hash(natl) impl set(natl), the corresponding OPIMPL-algebra $A$ has in $A_{set}$ all generalized hash-tables, while the homomorphism due to 5.3.2 is restricted to actual hash-tables, which are mapped to the set of all their entries. Verification of OP-completeness in 4.6.1 in this case is trivial because all the operation implementing equations are non-recursive.
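The homomorphism of part (2), which maps each string to the set of its elements, can be illustrated by a small executable sketch. It assumes, purely for illustration, that strings are modelled as Python tuples and that the string-level INSERT appends an element; the actual equations of Example 3.6.1 are not reproduced here, and all function names are hypothetical.

```python
# Sketch only: strings of naturals as tuples, sets of naturals as frozensets.
# The concrete INSERT below (append) is an assumption, not Example 3.6.1 itself.

def create():
    """String-level CREATE: the empty string."""
    return ()

def insert(n, s):
    """String-level INSERT (assumed): append n to the string s."""
    return s + (n,)

def member(n, s):
    """String-level MEMBER: n occurs somewhere in s."""
    return n in s

def rep(s):
    """Representation homomorphism: a string is mapped to the set of its elements."""
    return frozenset(s)

# rep is compatible with the operations on a few sample strings.
samples = [(), (1,), (1, 2), (2, 1), (1, 1, 3)]
for s in samples:
    for n in range(4):
        assert rep(insert(n, s)) == rep(s) | {n}
        assert member(n, s) == (n in rep(s))

# Different strings may represent the same set, so rep is not injective.
same_set = rep((1, 2)) == rep((2, 1)) == rep((1, 2, 1))
```

The checks only sample finitely many strings; they illustrate the homomorphism property and do not replace the verification by 5.3.2.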

Using the characterization theorem 5.2 we are also able to verify our conceptual requirements for implementations stated at the beginning of Section 3. Most of the arguments in 5.5 have already been used to motivate our implementation concept but it seems worthwhile to summarize these arguments. The verification will be given for standard implementations but with slight modifications it remains true for general implementations, which will be introduced in the next section.

5.5. Verification of conceptual requirements. Given a standard implementation IMPL of SPEC0 by SPEC1 in the notation of 4.2, the abstract data types ADT0 and ADT1 used in 3.1 are given by the initial algebras $T_{SPEC0}$ and $T_{SPEC1}$ respectively. The conceptual requirements of 3.1 are verified as follows:
(1) Syntactical level: The implementation on the syntactical level is given by $IMPL = (\Sigma SORT, EOP): SPEC1 \Rightarrow SPEC0$ (see 3.4 and 3.5.3).
(2) Semantical level: The semantical construction $SEM_{IMPL}$ (see 4.1 and 4.2)

$SEM_{IMPL}: T_{SPEC1} \xrightarrow{SYNTHESIS} T_{OPIMPL} \xrightarrow{RESTRICTION} REP_{IMPL} \xrightarrow{IDENTIFICATION} S_{IMPL} \cong T_{SPEC0}$

transforms $T_{SPEC1}$ into $S_{IMPL}$ isomorphic to $T_{SPEC0}$ because IMPL is RI-correct.
(3) Data representation: Since $REP_{IMPL}$ is a restriction of $T_{OPIMPL}$ and the representation homomorphism $rep: REP_{IMPL} \to T_{SPEC0}$ (see 5.2.3) is a surjective but not necessarily injective function, the data representation requirements 3.1.3 are satisfied.
(4) Simulation of compound operations: Since rep is a homomorphism it is compatible with operations. But this implies compatibility with compound operations in the sense of requirement 3.1.4.
(5) Parameter protection: The common parameter part of $T_{SPEC1}$ and $T_{SPEC0}$ is $T_{SPEC}$. This is protected because SPEC1 and SPEC0 are extensions of SPEC by general assumption in 3.3.

Now we will show how to obtain an explicit procedure to calculate whether two data representations in $REP_{IMPL}$ are equivalent with respect to SPEC0. Of course, this means exactly whether two elements in $REP_{IMPL}$ have the same image under the representation homomorphism $rep: REP_{IMPL} \to T_{SPEC0}$. But the representation homomorphism is not an explicit component of our implementation concept; at least it is not available to the user because it is not included in the syntactical level. If we intend to make it available to the user we have to include an "equality predicate" for the corresponding sorts of interest into the target specification SPEC0. We say that $EQ: s\ s \to bool$ for some $s \in S0$ is an equality predicate in SPEC0 if bool is protected in $SPEC \subseteq SPEC0$ and the corresponding operation $EQ_T$ of $T_{SPEC0}$ satisfies for all $x, y \in (T_{SPEC0})_s$:

$EQ_T(x, y) = TRUE \iff x = y$.

The following result shows that the existence of an equality predicate is already sufficient to calculate equivalence of data representation in $REP_{IMPL}$ and OPIMPL. This is based on the fact that bool is protected in $REP_{IMPL}$ although bool is not necessarily protected in OPIMPL.

5.6. Fact (characterization of data equivalence). Let a (standard) implementation IMPL of SPEC0 by SPEC1 and an equality predicate $EQ: s\ s \to bool$ for some $s \in S0$ in SPEC0 be given as defined above. Then we have:
(1) bool is protected in $REP_{IMPL}$, i.e. $(REP_{IMPL})_{bool} \cong T_{bool}$ with $(REP_{IMPL})_{bool} = \{TRUE_R, FALSE_R\}$ and $TRUE_R \neq FALSE_R$;
(2) For all $x, y \in (REP_{IMPL})_s$ the following conditions are equivalent:
(a) $rep_s(x) = rep_s(y)$,
(b) $EQ_T(rep_s(x), rep_s(y)) = TRUE$,
(c) $EQ_R(x, y) = TRUE_R$,
(d) $EQ(t_x, t_y) \equiv_{OPIMPL} TRUE$ for all $t_x, t_y \in (T_{\Sigma(OPIMPL)})_s$ with $x = [t_x]_{OPIMPL}$, $y = [t_y]_{OPIMPL}$,
where $rep: REP_{IMPL} \to T_{SPEC0}$ is the representation homomorphism and $EQ_R$ the EQ-operation in $REP_{IMPL}$.

Remark. The essential part of this result is the equivalence of conditions (a) and (d) because it allows us to calculate in OPIMPL whether two terms $t_x$, $t_y$ in OPIMPL, which may be different in $REP_{IMPL}$, represent the same data $rep_s(x)$ and $rep_s(y)$ in $T_{SPEC0}$. Although $EQ_T$ is an equality predicate in SPEC0 and bool is protected in $REP_{IMPL}$, $EQ_R$ will not be an equality predicate in $REP_{IMPL}$ in general. But $EQ_R$ is the equivalence kernel of the representation homomorphism rep. In particular this result shows that it is not necessary to give additional equations to specify the equivalence of terms with respect to multiple data representation as done in some papers like [29]. In our case those equations in IMPL which are used for the implementation of EQ are sufficient for this purpose.
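The equivalence of (a) and (c) can be made concrete with a minimal sketch. It assumes, as an illustration only, that sets are represented by strings (Python tuples), that rep maps a string to the set of its elements, and that the representation-level equality `eq_r` is computed from membership tests alone; these names and this particular way of implementing EQ are hypothetical and not taken from the paper.

```python
from itertools import product

def rep(s):
    """Assumed representation homomorphism: string -> set of its elements."""
    return frozenset(s)

def subset_r(x, y):
    """Every element of the string x occurs in the string y (uses membership only)."""
    return all(n in y for n in x)

def eq_r(x, y):
    """Representation-level EQ: mutual inclusion, computed without calling rep."""
    return subset_r(x, y) and subset_r(y, x)

# All strings over {0, 1, 2} of length at most 3.
strings = [tuple(p) for k in range(4) for p in product(range(3), repeat=k)]

# Condition (c) holds exactly when condition (a) holds on this sample.
kernel_ok = all(eq_r(x, y) == (rep(x) == rep(y)) for x in strings for y in strings)

# eq_r is not an equality predicate on representations: distinct strings are related.
distinct_but_equivalent = eq_r((1, 2), (2, 1, 1)) and (1, 2) != (2, 1, 1)
```

Here `eq_r` relates exactly those strings with the same image under `rep`, which is the sense in which $EQ_R$ is the equivalence kernel of rep without being an equality predicate on $REP_{IMPL}$.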

Proof. (1) For each $b \in (REP_{IMPL})_{bool}$ we know by OP-completeness that $b$ is OPIMPL-equivalent to TRUE or FALSE because bool is protected in SPEC and hence also in SPEC1 and SORTIMPL (see 3.3 and 4.7). Hence $b = TRUE_R$ or $b = FALSE_R$. On the other hand we cannot have $TRUE_R = FALSE_R$ because rep is a $(\Sigma+\Sigma 0)$-homomorphism and bool is protected in SPEC0.
(2) Conditions (a) and (b) are equivalent because EQ is an equality predicate in SPEC0, and conditions (c) and (d) are equivalent by definition of $T_{OPIMPL}$ and $REP_{IMPL}$. Hence it remains to show equivalence of (a) and (c). But this follows from part (1) and the fact that rep is a $(\Sigma+\Sigma 0)$-homomorphism, which implies

$rep_{bool}(b_R) = b$ for $b = TRUE$ or $b = FALSE$, and $rep_{bool}(EQ_R(x, y)) = EQ_T(rep_s(x), rep_s(y))$.

Since EQ is an equality predicate we have

$EQ_T(rep_s(x), rep_s(y)) = TRUE \iff rep_s(x) = rep_s(y)$

where the right part is condition (a). Hence (a) implies $rep_{bool}(EQ_R(x, y)) = TRUE = rep_{bool}(TRUE_R)$, which implies (c) by part (1), while (c) implies (a) without assuming OP-completeness. □

Now we are going to compare RI-correctness of implementations with IR-correctness corresponding to a slightly different semantical construction where the IDENTIFICATION step is done before RESTRICTION.

5.7. Remark (IR-semantical construction and IR-correctness). If in our semantical construction $SEM_{IMPL}$ of 4.2 the last two steps are performed in opposite order (that means first SYNTHESIS, then IDENTIFICATION from OPIMPL to IDIMPL = OPIMPL + E0 and then RESTRICTION to the SPEC0 part) we obtain another semantics, called IR-semantical construction $IR\text{-}SEM_{IMPL}$. At first glance it seems that both constructions lead to the same result, which is used for a special case in [27]. But we will show in 5.9 that this is not true in general. There is only a surjective homomorphism $g: S_{IMPL} \to IR\text{-}S_{IMPL}$ where $IR\text{-}S_{IMPL} = IR\text{-}SEM_{IMPL}(T_{SPEC1})$. Let us call a weak implementation IMPL IR-correct if $IR\text{-}S_{IMPL} \cong T_{SPEC0}$. Then IR-correctness implies RI-correctness but not vice versa. Hence the IR-semantical construction is more restrictive. But such a restriction is not assumed in our conceptual requirements. There is also an intuitive reason to prefer RI-semantics to IR-semantics. In the RI-semantics we first throw away the junk (i.e. non-reachable data) and then make the identification on useful data only. In IR-semantics, however, we make identifications on the junk before we throw it away. It would be most inefficient if the execution of an algebraic implementation on some computing device really corresponded to the IR-semantics, meaning that the E0-equations are used within the execution. RI-semantics means that E0-equations are not used in the execution but only to show the consistency of data representation (see 5.5.2).

To show the relationship between RI- and IR-constructions and correctness criteria we give the following IR-analogue of the characterization theorem 5.2 and the subsequent counterexample in 5.9 showing that RI is not isomorphic to IR.

5.8. Theorem (characterization of IR-correctness). Given a weak implementation IMPL of SPEC0 by SPEC1, there is a surjective SPEC0-homomorphism

$g: S_{IMPL} \to IR\text{-}S_{IMPL}$

from the semantical algebra $S_{IMPL}$ to $IR\text{-}S_{IMPL}$ corresponding to the RI- and IR-semantical construction of IMPL respectively. But in general $g$ is not an isomorphism.

Moreover each of the following conditions is equivalent to IR-correctness of IMPL:
(1) IMPL is RI-correct and $g: S_{IMPL} \to IR\text{-}S_{IMPL}$ is an isomorphism;
(2) The unique SPEC0-homomorphism $f: T_{SPEC0} \to IR\text{-}S_{IMPL}$ is injective;
(3) There is a SPEC0-homomorphism $f': IR\text{-}S_{IMPL} \to T_{SPEC0}$;
(4) All $(\Sigma+\Sigma 0)$-terms $t$ and $t'$ equivalent in $IDIMPL = OPIMPL + (\emptyset, \emptyset, E0)$ are also equivalent in SPEC0, i.e.

$t \equiv_{IDIMPL} t'$ implies $t \equiv_{SPEC0} t'$.

Proof. By construction of $REP_{IMPL}$, $IR\text{-}S_{IMPL}$ and $T_{IDIMPL}$ we have surjective homomorphisms $e$, $e'$, and $e''$ and injective homomorphisms $m$ and $m'$ making the following diagram commutative by initiality of $T_{\Sigma+\Sigma 0}$. Hence the well-known diagonal-lemma [...]

[Diagram: commutative diagram with $e: T_{\Sigma+\Sigma 0} \to REP_{IMPL}$, $m: REP_{IMPL} \to (T_{OPIMPL})_{\Sigma+\Sigma 0}$, the further homomorphisms $e'$, $e''$, $m'$ and $h$, and the bottom row $T_{SPEC0} \to S_{IMPL} \to IR\text{-}S_{IMPL}$ given by $f$ and $g$.]

[...] Since RI-correctness means that $f$ is an isomorphism while IR-correctness means that $g \circ f$ is an isomorphism, we also have the equivalence of IR-correctness with condition (1). □

Finally we will show that RESTRICTION and IDENTIFICATION cannot be commuted in general, which implies that RI-correctness and IR-correctness are not equivalent.

5.9. Counterexample. Let

int = sorts: int
      opns:  0': -> int
             SUCC', PRED': int -> int
      eqns:  SUCC'(PRED'(x)) = x
             PRED'(SUCC'(x)) = x.

Let clashnat = nat + {SUCC(x) = SUCC(SUCC(x))}. Then the following implementation $IMPL: int \Rightarrow clashnat$ is RI-correct but not IR-correct, i.e. $g: S_{IMPL} \to IR\text{-}S_{IMPL}$ is not an isomorphism:

int impl clashnat by
  sorts impl opns: c: int -> nat
  opns impl eqns:  0 = c(0')
                   SUCC(c(x)) = c(SUCC'(x)).

We obtain $(T_{OPIMPL})_{nat} \cong \mathbb{Z}$ and therefore $RESTRICTION((T_{OPIMPL})_{nat}) \cong \mathbb{N}_0$ because all integers which are generated by clashnat-operations are natural numbers. Moreover, $IDENTIFICATION(\mathbb{N}_0) = \{0, 1\}$ because the equation SUCC(x) = SUCC(SUCC(x)) identifies all positive natural numbers with 1. Hence $S_{IMPL,nat} \cong \{0, 1\} \cong T_{clashnat,nat}$ so that IMPL is RI-correct. On the other hand, $IDENTIFICATION((T_{OPIMPL})_{nat}) \cong \{0\}$ because the equation SUCC(x) = SUCC(SUCC(x)) identifies all integers with 0, due to the fact that each integer x is itself a successor, i.e. there is an integer y with SUCC(y) = x. Moreover, $IR\text{-}S_{IMPL,nat} = RESTRICTION(\{0\}) = \{0\}$ so that $IR\text{-}S_{IMPL,nat} \not\cong T_{clashnat,nat}$, i.e. IMPL is not IR-correct.
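The two orders of construction in this counterexample can be replayed on a finite window of the integers. The following is a sketch under stated assumptions: the carrier is truncated to the hypothetical range [-BOUND, BOUND], the equation SUCC(x) = SUCC(SUCC(x)) is read as "n is identified with n + 1 whenever n is a successor in the carrier", and the helper `identify` is an illustrative union-find, not part of the formal construction.

```python
BOUND = 20  # hypothetical truncation of the infinite carriers

def identify(carrier, is_successor):
    """Quotient `carrier` by n ~ n + 1 for every n that is a successor.

    Returns a dict mapping each element to a representative of its class.
    """
    parent = {n: n for n in carrier}

    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n

    for n in carrier:
        if is_successor(n) and n + 1 in parent:
            parent[find(n)] = find(n + 1)
    return {n: find(n) for n in carrier}

integers = range(-BOUND, BOUND + 1)
naturals = range(0, BOUND + 1)

# RI: first RESTRICTION to the naturals, then IDENTIFICATION.
# Among the naturals only n >= 1 is a successor.
ri = identify(naturals, lambda n: n >= 1)
ri_classes = len(set(ri.values()))

# IR: first IDENTIFICATION on all integers (every integer is a successor),
# then RESTRICTION to the classes reachable from 0 by SUCC.
ir = identify(integers, lambda n: True)
ir_classes = len({ir[n] for n in naturals})
```

On this window the RI order leaves two classes, matching {0, 1}, while the IR order collapses everything into a single class, matching {0}.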

6. General case and composition of implementations

In this section we study the general case of implementations including hidden components and also the composition of implementations. A trivial way to define compositions is to take the transitive closure, which means that one implementation is applied after the other. But we would rather like to combine all the intermediate steps into one implementation from the source of the sequence to the target specification. Taking for example the standard implementations $IMPL1: set(natl) \Rightarrow pset(natl)$ and $IMPL2: hash(natl) \Rightarrow set(natl)$ given in Example 3.2 and 3.6.2 we would like to obtain an implementation $IMPL(1, 2): hash(natl) \Rightarrow pset(natl)$ by combining the sorts implementing operations and the operations implementing equations of IMPL1 and IMPL2. For this specific example we could actually take the composition $c \circ a: array \to pset$ of the copy operations and substitute the right-hand sides of the equations for CREATE, INSERT and MEMBER in IMPL2 into the equations for empty set, singleton, union and element in IMPL1. Unfortunately

this procedure does not work in general. If we replace IMPL2 by $IMPL2': natl \Rightarrow set(natl)$ as given in Example 3.6.1, the composition

$c \circ TAB: natset \to pset$

of the sorts implementing operations still contains the sort set, which belongs neither to the source nor to the target specification of the composition. Moreover there are two equations for MEMBER in IMPL2' such that substitution into the equations of IMPL1 is no longer possible. A natural way to avoid these difficulties is to allow hidden sorts, operations and equations in the implementation such that the intermediate specification, in our case set(natl), can be pushed into the hidden part of the composite implementation. This, in fact, is the basic idea of our composition concept, which is one motivation to study a more general case of implementations including hidden components. Another motivation is the additional expressive power of hidden components to formulate single implementations. Although we do not have a formal proof we conjecture that general implementations are strictly more powerful than standard implementations.

Before we give the formal definitions let us mention another problem with compositions which will be studied in Section 7. The composition of implementations is not necessarily RI-correct even if both components are. Hence in general we need additional consistency conditions. This is similar to the situation for transactions on data bases where additional synchronization mechanisms are needed. But we also have a notion of persistent implementations which allows composition without additional conditions. Persistent implementations are special cases of strong implementations, which are also suitable to define a strong case of composition. In this case we get rid of the equations of the intermediate specification as shown in Section 7.

6.1. Definition (implementation with hidden components). Given specifications SPEC0 and SPEC1 as in 3.3, an implementation with hidden components (short: implementation) of SPEC0 by SPEC1, written $IMPL: SPEC1 \Rightarrow SPEC0$, is a triple $IMPL = (\Sigma SORT, EOP, HID)$ where $\Sigma SORT$ and EOP are sets of operations and equations with the same notation as in 3.4 and $HID = (SHID, \Sigma HID, EHID)$ consists of hidden sorts (SHID), hidden operations ($\Sigma HID$), and hidden equations (EHID) such that the following syntactical requirements and the semantical requirements of 4.6 (using the semantical construction of 4.2) are satisfied:

Syntactical requirements: $SORTIMPL = SPEC1 + (S0 + SHID, \Sigma SORT, \emptyset)$ and $OPIMPL = (SORTIMPL + (\emptyset, \Sigma HID, EHID)) + (\emptyset, \Sigma 0, EOP)$ are combinations, and for all $\sigma \in \Sigma SORT$ the range of $\sigma$ belongs to $S0 + SHID$.

If the semantical requirements are not assumed, IMPL is called a weak implementation.

Remarks. (1) Standard implementations (see 3.4) are implementations with empty hidden components.
(2) The semantical construction in 4.2 and the semantical requirements in 4.6 are formulated in terms of SORTIMPL and OPIMPL such that the same definitions can be used for the standard and the general case. However, SORTIMPL and OPIMPL now include hidden components as defined above. Moreover all the results given in Section 5 apply to the standard case and to the general case as defined above.
(3) OPIMPL is a 2-step combination and should not be confused with the combination $SORTIMPL + (\emptyset, \Sigma HID + \Sigma 0, EHID + EOP)$ where, in contrast to OPIMPL, the equations in EHID would be allowed to use $\Sigma 0$-operations. This restriction does not seem to be significant. But it forces the designer to use EHID only for the specification of $\Sigma HID$-operations or identification of some data generated in SORTIMPL. In our previous implementation concept in [18] we have used a set ESORT of equations to allow identification in the sorts implementing level. This extra component ESORT can be avoided in our revised concept because it may be regarded as a subset of EHID.

In the following we give three examples where the use of hidden components is essential. In the first two we need inverse copy operations which are treated as hidden operations, and hidden equations are used to make sure that the inverse copy operations are inverse to the corresponding copy operations in TOPIMPL. The third example is already an example for the composition concept of implementations which is formally defined in 6.3. It shows how hidden sorts, hidden operations and hidden equations arise naturally in composite implementations.

6.2. Examples. (1) Similar to Example 3.6.1 we want to implement sets of natural numbers by strings of natural numbers. But this time strings of natural numbers are given already in the source specification string(natl) (see 2.4.1) and we want to represent each set by those strings containing the elements in arbitrary order but without repetition.

string(natl) impl set(natl) by
  sorts impl opns: c: string -> set
  opns impl eqns:  CREATE = c(Λ)
                   INSERT(n, c(Λ)) = c(ADD(n, Λ))
                   INSERT(m, c(ADD(n, s))) = if EQ(m, n) then c(ADD(n, s))
                                             else c(ADD(n, r(INSERT(m, c(s)))))
                   (We omit the equations for DELETE, MEMBER, EMPTY and IF-THEN-ELSE-SET)
  hidden opns:     r: set -> string
  hidden eqns:     r(c(s)) = s

We obtain another implementation with the same source and target if HID is replaced by

  hidden opns:     rADD: nat string -> string
  hidden eqns:     rADD(m, Λ) = ADD(m, Λ)
                   rADD(m, ADD(n, s)) = if EQ(m, n) then ADD(n, s) else ADD(n, rADD(m, s))

and if the third equation of EOP is replaced by

                   INSERT(m, c(ADD(n, s))) = c(rADD(m, ADD(n, s))).

Note that we need the "recopy" r: set -> string in the operations implementing equations for INSERT. But we do not need a hidden equation like c(r(x)) = x because this equation is automatically true in $T_{OPIMPL}$ due to the fact that the copy operation c becomes a bijective function.

(2) Extending Example 3.6.2 we want to implement hash-tables by strings such that we obtain an implementation of sets by strings via hash-tables as composition (see part (3) below).

string(natl) impl hash(natl) by
  sorts impl opns: d: string -> list
                   TUP: string^m -> array
                   [i]: -> nat(m)   (i = 1, ..., m)
  opns impl eqns:  e = d(Λ)
                   ADDLI(d(s), n) = d(ADD(s, n))
                   ELEM_i = [i]   (i = 1, ..., m)
                   CREATEAR(d(s1), ..., d(sm)) = TUP(s1, ..., sm)
                   ENTRY([i], TUP(s1, ..., sm)) = d(si)   (i = 1, ..., m)
                   CHANGE([i], TUP(s1, ..., sm), d(s)) = TUP(s1, ..., si-1, s, si+1, ..., sm)   (i = 1, ..., m)
                   HASH(SUCC^i(0)) = [i + 1]   (i = 0, ..., m - 1)
                   HASH(SUCC^m(n)) = HASH(n)
                   ADJOIN(d(s), n) = IF SEARCH(d(s), n) THEN d(s) ELSE d(ADD(s, n)) LI
                   REMOVE(d(Λ), n) = d(Λ)
                   REMOVE(d(ADD(s, n1)), n2) = IF EQ(n1, n2) THEN d(s)
                                               ELSE d(ADD(r(REMOVE(d(s), n2)), n1)) LI
                   SEARCH(d(ADD(s, n1)), n2) = IF EQ(n1, n2) THEN TRUE ELSE SEARCH(d(s), n2)
                   SEARCH(d(Λ), n) = FALSE
                   EMPTYLI(d(Λ)) = TRUE
                   EMPTYLI(d(ADD(s, n))) = FALSE
                   IF TRUE THEN s1 ELSE s2 LI = s1
                   IF FALSE THEN s1 ELSE s2 LI = s2
  hidden opns:     r: list -> string
  hidden eqns:     r(d(s)) = s

(3) The composition of the implementations $IMPL2: string(natl) \Rightarrow hash(natl)$ (given in part (2) above) and $IMPL1: hash(natl) \Rightarrow set(natl)$ (given in 3.6.2) is the following implementation $IMPL(1, 2): string(natl) \Rightarrow set(natl)$, which is constructed according to the formal definition of composition in 6.3 below:

string(natl) impl set(natl) by
  sorts impl opns: d: string -> list
                   TUP: string^m -> array
                   [i]: -> nat(m)   (i = 1, ..., m)
                   a: array -> set
  opns impl eqns:  (equations for CREATE, INSERT, DELETE, MEMBER, EMPTY, and IF-THEN-ELSE-SET as given in 3.6.2
                   and equations for e, ADDLI, ELEM_i, CREATEAR, ENTRY, CHANGE, HASH, ADJOIN, REMOVE, SEARCH, EMPTYLI, and IF-THEN-ELSE-LI as given in part (2) above)
  hidden sorts:    list, array, nat(m)
  hidden opns:     r: list -> string
                   (hash(natl)-operations e, ADDLI, ..., IF-THEN-ELSE-LI as given in 3.6.2)
  hidden eqns:     r(d(s)) = s
                   (hash(natl)-equations for ENTRY, ..., IF-THEN-ELSE-LI as given in 3.6.2)

Actually we do not need the hash(natl)-equations to be a subset of the hidden equations for this specific example because IMPL1 and IMPL2 are both strong implementations (see Definition 6.4) such that we can use the strong composition of implementations (see remark in 6.3). In the composite implementation

IMPL(1, 2) a transformation from a set(natl)-term to a SORTIMPL(1, 2)-term can be done by applying first IMPL1 to transform the set(natl)-term to a SORTIMPL1-term. Then all its hash(natl)-subterms can be transformed by applying IMPL2 to obtain SORTIMPL2-terms which, substituted into the SORTIMPL1-term, lead to the resulting SORTIMPL(1, 2)-term. A very simple example of such a transformation is given by the following sequence of equations:

CREATE = a(CREATEAR(e, ..., e)) = a(CREATEAR(d(Λ), ..., d(Λ))) = a(TUP(Λ, ..., Λ)).
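The same two-step transformation can be sketched as a tiny term rewriter. This is an illustration under assumptions: terms are encoded as nested Python tuples, the table size `M` is a hypothetical fixed value, the empty string is written `LAMBDA`, and only the three equations used in the sequence above are encoded.

```python
M = 3  # hypothetical table size m; any m >= 1 works the same way

def apply_impl1(term):
    """IMPL1 step: CREATE = a(CREATEAR(e, ..., e))."""
    if term == ("CREATE",):
        return ("a", ("CREATEAR",) + (("e",),) * M)
    return term

def apply_impl2(term):
    """IMPL2 step, applied bottom-up to all hash(natl)-subterms."""
    head, *args = term
    args = [apply_impl2(arg) for arg in args]
    if head == "e":
        # e = d(LAMBDA)
        return ("d", ("LAMBDA",))
    if head == "CREATEAR" and all(arg[0] == "d" for arg in args):
        # CREATEAR(d(s1), ..., d(sm)) = TUP(s1, ..., sm)
        return ("TUP", *(arg[1] for arg in args))
    return (head, *args)

# First IMPL1, then IMPL2 on the resulting subterms.
result = apply_impl2(apply_impl1(("CREATE",)))
```

With these encodings `result` is the tuple form of a(TUP(Λ, ..., Λ)) with `M` empty strings.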

This two-step construction always leads to a correct result but it may be inefficient. Actually the composite implementation allows the equations (and hence the transformation steps) to be used in arbitrary order, where equations of both steps may be mixed. In general, however, this mixture may cause some additional identifications which may violate the consistency (RI-correctness) of the composition, such that we will have to assume additional consistency conditions to obtain a correct composite implementation (see Section 7). In our example, however, these conditions are satisfied such that correctness of IMPL1 and IMPL2 implies that of IMPL(1, 2).
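Returning to part (1) of these examples, the hidden operation rADD can be read directly as a recursive program. The sketch below assumes that strings are modelled as Python tuples whose first entry is the most recently ADDed element, so that ADD(n, s) corresponds to `(n,) + s`; the copy operation c is left implicit and the function names are hypothetical.

```python
def radd(m, s):
    """rADD(m, s): add m to the string s unless it already occurs in s."""
    if not s:
        # rADD(m, LAMBDA) = ADD(m, LAMBDA)
        return (m,)
    n, rest = s[0], s[1:]
    if m == n:
        # EQ(m, n) holds: ADD(n, s) is returned unchanged
        return s
    # otherwise ADD(n, rADD(m, s)): keep n and recurse on the shorter string
    return (n,) + radd(m, rest)

def insert(m, s):
    """INSERT(m, c(s)) = c(rADD(m, s)), with c left implicit."""
    return radd(m, s)

# Inserting 3, 1, 3, 2, 1 in this order into the empty string.
table = ()
for value in [3, 1, 3, 2, 1]:
    table = insert(value, table)
```

Each recursive call works on a strictly shorter string, which is the kind of decreasing argument that termination-based completeness arguments rely on.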

6.3. Definition (composition of implementations). Given implementations $IMPL2: SPEC2 \Rightarrow SPEC1$ and $IMPL1: SPEC1 \Rightarrow SPEC0$ with $SPECi = SPEC + (Si, \Sigma i, Ei)$ (i = 0, 1, 2) and $IMPLi = (\Sigma SORTi, EOPi, HIDi)$ (i = 1, 2), the composition

$IMPL(1, 2) = IMPL1 \circ IMPL2: SPEC2 \Rightarrow SPEC0$

is defined by

$IMPL(1, 2) = (\Sigma SORT(1, 2), EOP(1, 2), HID(1, 2))$

with

$\Sigma SORT(1, 2) = \Sigma SORT2 + \Sigma SORT1$,
$EOP(1, 2) = EOP2 + EOP1$,
$HID(1, 2) = (SHID(1, 2), \Sigma HID(1, 2), EHID(1, 2))$,
$SHID(1, 2) = SHID2 + SHID1 + S1$,
$\Sigma HID(1, 2) = \Sigma HID2 + \Sigma HID1 + \Sigma 1$,
$EHID(1, 2) = EHID2 + EHID1 + E1$,

where + (as in the previous sections) stands for disjoint union. More precisely we assume without loss of generality that all sort sets $S$, $S0$, $S1$, $S2$, SHID1 and SHID2 are pairwise disjoint and also that all operation symbol sets $\Sigma$, $\Sigma 0$, $\Sigma 1$, $\Sigma 2$, $\Sigma SORT1$, $\Sigma SORT2$, $\Sigma HID1$, and $\Sigma HID2$ are pairwise disjoint, while a corresponding condition for the sets of equations does not seem to be necessary.

Remarks. (1) The composition IMPL(1, 2): SPEC2 ⇒ SPEC0 is at least a weak implementation because

SORTIMPL(1, 2) = SPEC2 + (S0 + SHID(1, 2), ΣSORT(1, 2), ∅) and
OPIMPL(1, 2) = SORTIMPL(1, 2) + (∅, ΣHID(1, 2), EHID(1, 2)) + (∅, Σ0, EOP(1, 2))

become combinations and for all σ ∈ ΣSORT(1, 2) the range of σ belongs to S0 + SHID(1, 2). Moreover we will see that OP-completeness of IMPL1 and IMPL2 implies that of IMPL(1, 2) (see Theorem 7.2). But in general we cannot expect that RI-correctness of IMPL1 and IMPL2 implies RI-correctness of IMPL(1, 2) unless an additional consistency condition is satisfied (see Counterexample 7.1).

(2) If at least IMPL1 is a strong implementation in the sense of 6.4 below then the set of equations E1 can be omitted in EHID(1, 2) without violating OP-completeness of IMPL(1, 2). If we redefine EHID(1, 2) to EHID(1, 2) = EHID2 + EHID1 then IMPL(1, 2) is called strong composition.

(3) The composition of weak implementations is associative because the disjoint union in the definition of the components is associative. Hence the composition of n implementations IMPLi: SPECi ⇒ SPEC(i − 1) for i = n, ..., 1 can be written as IMPL(1, ..., n): SPECn ⇒ SPEC0. An example of a 2-step implementation is given by IMPL(1, 2): string(natl) ⇒ set(natl) as defined in 6.2.2. This implementation can be composed with IMPL0: set(natl) ⇒ pset(natl) as defined in 3.2 to obtain a 3-step implementation IMPL(0, 1, 2) = IMPL0 ∘ IMPL(1, 2): string(natl) ⇒ pset(natl). By the associativity property mentioned above we also have

IMPL(0, 1, 2) = IMPL(0, 1) ∘ IMPL2 where IMPL(0, 1): hash(natl) ⇒ pset(natl) is the composition of IMPL0 and IMPL1: hash(natl) ⇒ set(natl).

Now we are going to define special cases of implementations with respect to the semantical requirement OP-completeness. We define strong and persistent OP-completeness leading to strong and persistent implementations respectively. In the case of strong implementations we can use strong composition of implementations which is desirable for a number of applications (see 6.2.3). In the case of persistent implementations we have closure under (strong) composition without having to assume an additional consistency condition to obtain RI-correctness for the composition (see Theorem 7.3).

6.4. Definition (strong and persistent implementations). An implementation IMPL of SPEC0 by SPEC1 is called

(1) strong OP-complete, or strong for short, if for each term t0 in T_{Σ+Σ0} there is an OPIMPL'-equivalent term t1 in T_{Σ(SORTIMPL)} where OPIMPL' = OPIMPL − E1, i.e.

t0 ≡_{E(OPIMPL')} t1 where E(OPIMPL') = E + ESORT + EHID + EOP;

(2) persistent OP-complete, or persistent for short, if OPIMPL' = OPIMPL − E1 is a persistent enrichment of SORTIMPL' = SORTIMPL − E1, i.e. for all SORTIMPL'-algebras A we have

F(A)_SORTIMPL' ≅ A where F(A) is the OPIMPL'-algebra freely generated by A.

Remark. The freeness property of F(A) means more precisely that F: Alg_SORTIMPL' → Alg_OPIMPL' is the free functor. In other words F(A) is freely generated by A if there is a SORTIMPL'-homomorphism u_A: A → F(A)_SORTIMPL' such that for each OPIMPL'-algebra B and each SORTIMPL'-homomorphism f: A → B_SORTIMPL' there is a unique OPIMPL'-homomorphism g: F(A) → B such that g_SORTIMPL' ∘ u_A = f. In most examples we have strong but not persistent OP-completeness. A sufficient condition for persistent OP-completeness is that EHID and EOP are derivor equations for ΣHID and Σ0. This means that for each σ ∈ ΣHID + Σ0 there is exactly one equation in EHID + EOP with left-hand side σ(x1, ..., xn) or σ(c1(x1), ..., cn(xn)) where xi are variables and ci copy operations for i = 1, ..., n, and the right-hand side is a term in T_{Σ(SORTIMPL)}({x1, ..., xn}). A typical example is the implementation of set(natl) by hash(natl) in 3.6.2.
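The derivor condition is purely syntactical and can be checked mechanically. The following Python sketch is one possible reading of it, not the authors' formulation: terms are nested tuples `(op, arg1, ..., argn)`, variables are plain strings, each left-hand side argument may be a variable or a copy operation applied to a variable, the variables are required to be distinct, and no equations other than the defining ones are allowed. The function names are hypothetical. The equations used are those of IMPL2 and IMPL1 from Counterexample 7.1; the operation names 0, SUCC and PRED of int are taken from the derivation given there.

```python
def is_var(t):
    return isinstance(t, str)

def ops_of(t):
    """Set of operation symbols occurring in a term."""
    if is_var(t):
        return set()
    return {t[0]}.union(*(ops_of(a) for a in t[1:]))

def vars_of(t):
    """Set of variables occurring in a term."""
    if is_var(t):
        return {t}
    return set().union(*(vars_of(a) for a in t[1:]))

def derivor_head(lhs, copy_ops):
    """Return (σ, {x1..xn}) if lhs is σ applied to arguments xi or ci(xi), else None."""
    if is_var(lhs):
        return None
    xs = []
    for arg in lhs[1:]:
        if is_var(arg):
            xs.append(arg)
        elif arg[0] in copy_ops and len(arg) == 2 and is_var(arg[1]):
            xs.append(arg[1])
        else:
            return None
    if len(set(xs)) != len(xs):      # assumption: the variables are distinct
        return None
    return lhs[0], set(xs)

def are_derivor_equations(defined_ops, copy_ops, base_ops, equations):
    """Check that every op in defined_ops has exactly one defining equation
    whose right-hand side only uses base_ops and the left-hand variables."""
    seen = set()
    for lhs, rhs in equations:
        head = derivor_head(lhs, copy_ops)
        if head is None:
            return False
        op, xs = head
        if op not in defined_ops or op in seen:
            return False
        if not ops_of(rhs) <= base_ops or not vars_of(rhs) <= xs:
            return False
        seen.add(op)
    return seen == set(defined_ops)

# EOP2 of Counterexample 7.1 (int implements nat).
eop2 = [(("ZERO",), ("c2", ("0",))),
        (("NEXT", ("c2", "x")), ("c2", ("SUCC", "x")))]
# EOP1 of Counterexample 7.1 (nat implements bool).
eop1 = [(("TRUE",), ("c1", ("ZERO",))),
        (("FALSE",), ("c1", ("NEXT", ("ZERO",)))),
        (("c1", ("NEXT", "y")), ("c1", ("NEXT", ("NEXT", "y"))))]

impl2_is_derivor = are_derivor_equations(
    {"ZERO", "NEXT"}, {"c2"}, {"c2", "0", "SUCC", "PRED"}, eop2)
impl1_is_derivor = are_derivor_equations(
    {"TRUE", "FALSE"}, {"c1"}, {"c1", "ZERO", "NEXT"}, eop1)
print(impl2_is_derivor, impl1_is_derivor)
```

Under these assumptions EOP2 passes the check while EOP1 fails because of its third equation, whose left-hand side is headed by the copy operation c1. Since the condition is only sufficient, a failed check by itself says nothing about persistency.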

The three notions of OP-completeness form a strict hierarchy:

6.5. Fact (OP-completeness hierarchy). For each weak implementation we have the following strict hierarchy: Persistent OP-completeness implies strong OP-completeness, and the latter implies OP-completeness.

Proof. Persistent OP-completeness implies for A = T_SORTIMPL' and F(A) = T_OPIMPL' that OPIMPL' is an enrichment of SORTIMPL'. But this implies strong OP-completeness similar to the proof of 5.1.1. Since E(OPIMPL') ⊆ E(OPIMPL), strong OP-completeness implies OP-completeness. The implementations in 3.2, 3.6.1 and 6.2 are strong but not persistent. Finally the following slight modification of Example 6.2.1 is an OP-complete implementation which is not strong: Assume that the parameter part natl of string(natl) is renamed by natl' (disjoint to natl) and additional copy operations implementing equations are added such that string(natl') implements set(natl) with empty common parameter part. Then the term INSERT(0, INSERT(0, CREATE)) can only be reduced to a SORTIMPL-term using equations for EQ in natl' which are in E1. Hence we do not have strong OP-completeness but still OP-completeness. □

7. Correctness of composition

In the last section we have defined the composition of implementations. As pointed out already we cannot expect that the composition is RI-correct without assuming further consistency conditions. At first glance this is surprising and one might assume that either our notion of correctness or our notion of composition is not well chosen. We do not think so because similar effects occur in other well-known areas of computer science. Take for example two transactions t1 = (a1, b1) and t2 = (a2, b2) in a data base system in the sense of Eswaran et al. [22]. If both are applied one after the other (serial schedule) the consistency of the data base is preserved. But if they are applied concurrently, say (a1, a2, b1, b2), we cannot assume that consistency is preserved unless certain consistency conditions are satisfied, which may be forced by suitable synchronization mechanisms.

The situation with algebraic implementations, say IMPL1 and IMPL2, is similar. If both are applied one after the other, the result is correct. But the composition IMPL(1, 2) allows the applications of IMPL1 and IMPL2 to be mixed, which corresponds to the concurrent schedule (a1, a2, b1, b2) above. This, however, may violate consistency, in our case RI-correctness, unless some additional consistency conditions are satisfied. It is natural to ask for stronger correctness conditions such that the additional consistency conditions are satisfied automatically. We will show that this is the case for persistent implementations but not necessarily for strong implementations, which occur most frequently in applications.

We have argued that the mixture of the equations in IMPL(1, 2) is responsible for possible inconsistencies. This mixture is excluded if we apply the equations of IMPL1 and IMPL2 strictly one after the other, in the same way as we may restrict transaction schedules to serial schedules only. In the case of transactions it is well known that this would decrease efficiency significantly. A similar lack of efficiency is true for algebraic implementations. In [21] we have proved that the composition complexity is less than or equal to the substitution complexity corresponding to the case of serial application, and it should not be hard to find examples where the complexity differs considerably. Especially if we consider the composition IMPL(1, ..., n) of n-step implementations the difference will become significant, even more so if we apply suitable optimization procedures to such composite implementations combining certain operations and equations. Although we do not yet have formal optimization procedures, our consistency conditions make sure that an optimal evaluation of the composite implementation is still correct.

Let us start with an example showing that the composition of implementations is not necessarily RI-correct or IR-correct. The idea behind this counterexample is the same as that in 5.9.

7.1. Counterexample. Let us consider the following implementations where nat is supposed to have the operations ZERO and NEXT and bool only TRUE and FALSE:

IMPL2: int impl nat by
    sorts impl opns: c2: int → nat
    opns impl eqns:  ZERO = c2(0)                       (EOP2)
                     NEXT(c2(x)) = c2(SUCC(x))

IMPL1: nat impl bool by
    sorts impl opns: c1: nat → bool
    opns impl eqns:  TRUE = c1(ZERO)                    (EOP1)
                     FALSE = c1(NEXT(ZERO))
                     c1(NEXT(y)) = c1(NEXT(NEXT(y)))

IMPL2 has simple SYNTHESIS with persistent OP-completeness, essential RESTRICTION (negative numbers are deleted), simple IDENTIFICATION and hence RI- as well as IR-correctness. IMPL1 has essential SYNTHESIS with (T_OPIMPL1)_bool = {0, 1} and strong but not persistent OP-completeness, simple RESTRICTION and IDENTIFICATION and hence RI- as well as IR-correctness. Now let us consider the composition IMPL(1, 2) which is also strong because E1 = E(nat) = ∅:

IMPL(1, 2): int impl bool by
    sorts impl opns: c2: int → nat
                     c1: nat → bool
    opns impl eqns:  EOP2 + EOP1 as above
    hidden sorts:    nat
    hidden opns:     ZERO: → nat
                     NEXT: nat → nat

IMPL(1, 2) has essential SYNTHESIS with (T_OPIMPL(1,2))_bool = {0} (see below) and strong but not persistent OP-completeness, simple RESTRICTION and IDENTIFICATION but no RI-correctness because we have in OPIMPL(1, 2):

TRUE ≡ c1(ZERO) ≡ c1(c2(0)) ≡ c1(c2(SUCC(PRED(0)))) ≡ c1(NEXT(c2(PRED(0)))) ≡ c1(NEXT(NEXT(c2(PRED(0))))) ≡ c1(NEXT(c2(SUCC(PRED(0))))) ≡ c1(NEXT(c2(0))) ≡ c1(NEXT(ZERO)) ≡ FALSE

and hence TRUE ≡_{E(OPIMPL(1,2))} FALSE but TRUE ≢ FALSE in bool.
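The derivation of TRUE ≡ FALSE can be replayed step by step. The sketch below is a minimal equational-step checker written for this purpose, with hypothetical function names: terms are nested tuples, variables are strings, and a step is accepted if some equation, read in either direction, rewrites one subterm. The equations EOP1 and EOP2 are those listed above; the int equation SUCC(PRED(x)) = x is an assumption read off from the third and seventh steps of the derivation.

```python
def is_var(t):
    return isinstance(t, str)

def match(pattern, term, env):
    """Extend env so that pattern instantiated by env equals term, or return None."""
    if is_var(pattern):
        if pattern in env:
            return env if env[pattern] == term else None
        return {**env, pattern: term}
    if is_var(term) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        env = match(p, t, env)
        if env is None:
            return None
    return env

def instantiate(t, env):
    if is_var(t):
        return env[t]
    return (t[0],) + tuple(instantiate(a, env) for a in t[1:])

def rewrites(term, lhs, rhs):
    """All terms obtained by replacing one subterm matching lhs by the rhs instance."""
    results = set()
    env = match(lhs, term, {})
    if env is not None:
        results.add(instantiate(rhs, env))
    if not is_var(term):
        for i in range(1, len(term)):
            for new_arg in rewrites(term[i], lhs, rhs):
                results.add(term[:i] + (new_arg,) + term[i + 1:])
    return results

def one_step(s, t, equations):
    """True if t arises from s by one application of an equation in either direction."""
    return any(t in rewrites(s, l, r) or t in rewrites(s, r, l)
               for l, r in equations)

def T(op, *args):
    return (op,) + args

ZERO, O = T("ZERO"), T("0")
equations = [
    (ZERO, T("c2", O)),                                   # EOP2
    (T("NEXT", T("c2", "x")), T("c2", T("SUCC", "x"))),   # EOP2
    (T("TRUE"), T("c1", ZERO)),                           # EOP1
    (T("FALSE"), T("c1", T("NEXT", ZERO))),               # EOP1
    (T("c1", T("NEXT", "y")), T("c1", T("NEXT", T("NEXT", "y")))),  # EOP1
    (T("SUCC", T("PRED", "x")), "x"),                     # assumed int equation
]

p0 = T("PRED", O)
chain = [
    T("TRUE"),
    T("c1", ZERO),
    T("c1", T("c2", O)),
    T("c1", T("c2", T("SUCC", p0))),
    T("c1", T("NEXT", T("c2", p0))),
    T("c1", T("NEXT", T("NEXT", T("c2", p0)))),
    T("c1", T("NEXT", T("c2", T("SUCC", p0)))),
    T("c1", T("NEXT", T("c2", O))),
    T("c1", T("NEXT", ZERO)),
    T("FALSE"),
]
chain_is_valid = all(one_step(s, t, equations) for s, t in zip(chain, chain[1:]))
print(chain_is_valid)
```

The checker only confirms that each step of the displayed chain is a single equation application; it does not search for derivations. The essential step is the fifth one, where c1(NEXT(y)) = c1(NEXT(NEXT(y))) is applied with y = c2(PRED(0)), a term that is not generated by the nat-operations.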

Note that this counterexample works for composition and strong composition, OP-completeness and strong OP-completeness, RI- and IR-correctness, even with simple IDENTIFICATION (i.e. without multiple data representation). Responsible for the identification of TRUE and FALSE are the identification in the SYNTHESIS step of IMPL1 and the essential RESTRICTION in IMPL2. In view of the counterexample above we have to assume additional consistency conditions in order to obtain RI-correctness of composite implementations:

7.2. Theorem (correctness of composition). Implementations are closed under composition provided that one of the Consistency Conditions below is satisfied. In more detail we have: Given weak implementations IMPL2: SPEC2 ⇒ SPEC1 and IMPL1: SPEC1 ⇒ SPEC0, then the composition IMPL(1, 2): SPEC2 ⇒ SPEC0 is an implementation if the following two conditions are satisfied:

(1) IMPL1 and IMPL2 are OP-complete.

(2) IMPL1 (but not necessarily IMPL2) is RI-correct and one of the following Consistency Conditions is valid:

Proof Theoretical Consistency Condition: OPIMPL(1, 2) is consistently specified with respect to OPIMPL1, i.e. for all t, t' ∈ T_{Σ(OPIMPL1)} we have

t ≡_{E(OPIMPL(1,2))} t' implies t ≡_{E(OPIMPL1)} t'.

Semantical Consistency Condition: There is an OPIMPL(1, 2)-algebra A and a Σ(OPIMPL1)-homomorphism h: RESTRICTION(A) → T_OPIMPL1 where RESTRICTION(A) is the image of eval_A: T_{Σ(OPIMPL1)} → A_OPIMPL1.

Remark. Note that the consistency conditions above are slight modifications of those for RI-correctness of IMPL(1, 2) (see Fact 5.3) where SPEC0 is replaced by OPIMPL1. Thus the conditions become independent of SPEC0 because the consistency of OPIMPL1 with respect to SPEC0 is given already by RI-correctness of IMPL1 (see 5.3.1). In some cases it may be useful to verify RI-correctness of IMPL(1, 2) directly using Fact 5.3. In such cases we can also drop the assumption that IMPL1 is RI-correct.

Proof. The proof will be given in two parts.

Part 1: OP-completeness of IMPL(1, 2). It suffices to show that each (Σ + Σ0)-term t0 is equivalent via OPIMPL(1, 2)-equations to a Σ(SORTIMPL(1, 2))-term t2. By OP-completeness of IMPL1, t0 is equivalent via OPIMPL1-equations to a (Σ + Σ1 + ΣSORT1)-term t1. Moreover, using Fact 4.7, t1 can be assumed to be (Σ + Σ1)-normal in the following sense: All maximal subterms t1j (j = 1, ..., m) of sorts in S + S1 are (Σ + Σ1)-terms. Each of the t1j (j = 1, ..., m) is equivalent via OPIMPL2-equations to a (Σ + Σ2 + ΣSORT2)-term t2j. Hence t1 is equivalent via OPIMPL(1, 2)-equations to a Σ(SORTIMPL(1, 2))-term t2 where each subterm t1j of t1 is replaced by t2j for j = 1, ..., m.

Part 2: RI-correctness of IMPL(1, 2). The Proof Theoretical Consistency Condition together with RI-correctness of IMPL1 implies directly RI-correctness of IMPL(1, 2). The Semantical Consistency Condition and initiality of T_OPIMPL1 imply that T_OPIMPL1 → RESTRICTION(A) and hence also T_OPIMPL1 → A_OPIMPL1 is injective. Since the last morphism is equal to

T_OPIMPL1 → (T_OPIMPL(1,2))_OPIMPL1 → A_OPIMPL1

its first one is injective, too. Since this is equivalent to our Proof Theoretical condition above we are done. □

Now let us consider the case of strong and persistent implementations with respect to strong composition (see Remark 2 in 6.3).

7.3. Theorem (correctness of strong composition). Persistent implementations are closed under strong composition. The same is true for strong implementations provided that one of the Consistency Conditions in 7.2 applied to strong composition is satisfied.

Remarks. (1) Strong resp. persistent OP-completeness is closed under strong composition independent of RI-correctness. But RI-correctness for the composition needs persistent OP-completeness, because strong implementations are not closed under composition or strong composition (see Counterexample 7.1).

(2) The Consistency Conditions in the case of strong composition are weaker than those in the case of usual composition because E1-equations are not contained in OPIMPL(1, 2) for strong composition. The difference corresponds exactly to that of RI- and IR-correctness (see 5.8).

Proof. Again the proof is given in two parts.

Part 1: Strong resp. persistent OP-completeness of IMPL(1, 2). If IMPL1 and IMPL2 are strong OP-complete, part 1 of the proof of Theorem 7.2 can be used to show that also IMPL(1, 2) is strong OP-complete: Actually the use of E1 can be avoided in the first and that of E2 in the second step because IMPL1 and IMPL2 are strong OP-complete. Hence, as required for strong OP-completeness in the strong composition, E1 + E2 can be avoided in steps 1 and 2. In order to show persistent OP-completeness we need the following Persistency Lemma which is proved in [4].

Persistency Lemma. Let SPECi = SPEC1 + (Si, Σi, Ei) for i = 2, 3 where S2 and S3 as well as Σ2 and Σ3 are pairwise disjoint. Moreover let SPEC4 = SPEC2 + (S3, Σ3, E3) = SPEC3 + (S2, Σ2, E2). If SPEC2 is a persistent extension of SPEC1, then also SPEC4 is a persistent extension of SPEC3.

Remark. SPEC1 ⊆ SPEC2 is called persistent extension, if for all SPEC1-algebras A we have F(A)_SPEC1 ≅ A where F: Alg_SPEC1 → Alg_SPEC2 is the free functor (see Remark below 6.4).

Now let us assume that IMPL1 and IMPL2 are persistent OP-complete. Then we conclude from the Persistency Lemma above that persistency of SORTIMPL2' ⊆ OPIMPL2' implies persistency of SORTIMPL(1, 2)' ⊆ SORTIMPL(1, 2)' + AUX2 with AUX2 = (∅, Σ1 + ΣHID2, EOP2 + EHID2). On the other hand persistency of SORTIMPL1' ⊆ OPIMPL1' implies persistency of SORTIMPL1' + AUX1 ⊆ OPIMPL1' + AUX1 with AUX1 = (S2 + SHID2, Σ2 + ΣSORT2 + ΣHID2, EOP2 + EHID2). But SORTIMPL1' + AUX1 ⊆ SORTIMPL(1, 2)' + AUX2 and OPIMPL1' + AUX1 = OPIMPL(1, 2)' implies that the composition SORTIMPL(1, 2)' ⊆ OPIMPL(1, 2)' is persistent. Hence IMPL(1, 2) is persistent OP-complete.

Part 2: RI-correctness of IMPL(1, 2). First we consider the case of strong implementations. If IMPL1 is RI-correct and we have the Semantical Consistency Condition then we have an OPIMPL(1, 2)-algebra A and a homomorphism h: RESTRICTION(A) → T_OPIMPL1. Since RESTRICTION is a functor we have a Σ(OPIMPL1)-homomorphism RESTRICTION(g): RESTRICTION(T_OPIMPL(1,2)) → RESTRICTION(A) where g: T_OPIMPL(1,2) → A is the initial homomorphism. Similar to the proof of Fact 5.3.2 the existence of the composition h ∘ RESTRICTION(g): RESTRICTION(T_OPIMPL(1,2)) → T_OPIMPL1 implies the Proof Theoretical Consistency Condition. This condition together with RI-correctness of IMPL1 implies RI-correctness of IMPL(1, 2).

It remains to show RI-correctness of IMPL(1, 2) in the case of persistent implementations IMPL1 and IMPL2 without using additional conditions. Given (Σ + Σ0)-terms t0 and t0' which are OPIMPL(1, 2)-equivalent we have by strong OP-completeness transformations t0 ⇒ t1 ⇒ t2 and t0' ⇒ t1' ⇒ t2' where t1, t1' are (Σ + Σ1)-normal (Σ + Σ1 + ΣSORT1)-terms (see proof of Theorem 7.2 part 1) and t2, t2' are (Σ + Σ2 + ΣSORT2)-normal (Σ + Σ2 + ΣSORT2 + ΣSORT1)-terms. Moreover the OPIMPL2-transformations t1 ⇒ t2 and t1' ⇒ t2' can be restricted to transformations of (Σ + Σ1)-terms. OPIMPL(1, 2)-equivalence of t0 and t0' implies OPIMPL(1, 2)-equivalence of t2 and t2'. Now consistency of SORTIMPL(1, 2) ⊆ OPIMPL(1, 2) (which follows from persistent OP-completeness of IMPL(1, 2) in part 1 of this proof) implies SORTIMPL(1, 2)-equivalence of t2 and t2'. Since E(SORTIMPL(1, 2)) = E + E2 ⊆ E(OPIMPL(1, 2)) we have the following transformation

t0 ⇔ t1 ⇔ t2 ⇔ t2' ⇔ t1' ⇔ t0'

where t1 ⇔ t2 ⇔ t2' ⇔ t1' is an OPIMPL(1, 2)-transformation, which can be restricted to a transformation of (Σ + Σ1)-terms, and the remaining parts t0 ⇔ t1 and t1' ⇔ t0' are OPIMPL1-transformations. RI-correctness of IMPL2 allows us to replace the OPIMPL2-transformations by SPEC1-transformations. Hence the sequence t0 ⇔* t0' above can be transformed to an OPIMPL1-transformation which implies SPEC0-equivalence of t0 and t0' by RI-correctness of IMPL1. □

7.4. Remark (correctness of iterated composition). The results given in Theorems 7.2 and 7.3 can easily be generalized to the case that we have n weak implementations IMPLi: SPECi ⇒ SPEC(i − 1) for i = 1, ..., n. To show correctness of the composition (resp. strong composition)

IMPL(1, ..., n) = IMPL1 ∘ ... ∘ IMPLn: SPECn ⇒ SPEC0

we need OP-completeness (resp. strong OP-completeness) for each step. But in addition to RI-correctness of IMPL1 one Global Consistency Condition is sufficient for RI-correctness of IMPL(1, ..., n) because IMPL(1, ..., n) can be considered as 1-step composition of IMPL1 and IMPL(2, ..., n). Hence the Global Consistency Conditions are the same as the Consistency Conditions in Theorem 7.2 where only OPIMPL(1, 2) is replaced by OPIMPL(1, ..., n). In the case of persistent implementations IMPLi, i = 1, ..., n, also the strong composition IMPL(1, ..., n) is persistent.

In the following example we show the correctness of the 3-step implementation of pset(natl) by string(natl) mentioned in remark 3 of 6.3.

7.5. Example. Given the strong implementations IMPL3: string(natl) ⇒ hash(natl) (see 6.2.2), IMPL2: hash(natl) ⇒ set(natl) (see 3.6.2), and IMPL1: set(natl) ⇒ pset(natl) (see 3.2), we will show that the strong composition IMPL(1, 2, 3): string(natl) ⇒ pset(natl) is a strong implementation. In Example 5.4 we have shown already that each step is OP-complete and RI-correct. Actually each step is already strong OP-complete because the non-parameter equations of the source specifications were not needed to verify OP-completeness. To verify RI-correctness of IMPL(1, 2, 3) we use neither RI-correctness of the single steps nor the Global Consistency Condition but the semantical condition 5.3.2: We have to find an OPIMPL(1, 2, 3)-algebra A and a (Σ + Σ0)-homomorphism h: RESTRICTION(A) → T_SPEC0 where RESTRICTION(A) is the restriction of A to data generated by (Σ + Σ0)-operations.

We will give the idea of this algebra A and the homomorphism: A is an extension of T_OPIMPL3, where OPIMPL3 is the operation implementing level of IMPL3: string(natl) ⇒ hash(natl) which was studied in 3.6.2 and 4.4.3. Hence we have generalized hash-tables (instead of actual hash-tables) in sort array. Now let A_set = A_pset = (T_OPIMPL3)_array. The Σ1-operations on A are defined by the equations EOP2 where a_A is the identity. Taking also c_A to be the identity we are going to define the Σ0-operations such that the EOP1-equations are satisfied. The equations EOP1, however, are not suitable to define Σ0 completely and consistently because CREATE and INSERT are not generating A_set. We define ∅_A to be the empty table and {n}_A to be the table with entry n in row n (mod m). The union t1 ∪_A t2 of two tables t1, t2 is defined as concatenation in each row where, however, all elements of the second list occurring already in the first one are omitted. n ∈_A t is true if n occurs in row n (mod m) of table t and false otherwise. This completes the description of A.

RESTRICTION(A) consists of actual hash-tables in the base set of sort pset while all other sorts except nat and bool are forgotten. To define h: RESTRICTION(A) → T_SPEC0 let h_pset(t) be the set of all entries in the actual hash-table t while h_nat and h_bool are identities. Since T_SPEC0 is isomorphic to the finite powerset model it follows directly from the definition of the Σ0-operations of A above that h is a (Σ + Σ0)-homomorphism.
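The Σ0-operations of A and the map h_pset described in this example can be mimicked on concrete data. The following Python sketch is only an illustration under assumptions: a table is modeled as a tuple of rows, each row a tuple of natural numbers, the number of rows `M = 5` is an arbitrary hypothetical choice for m, and the finite powerset model is represented by Python frozensets. It spot-checks the homomorphism property of h on a few actual hash-tables rather than proving it.

```python
M = 5  # hypothetical number of rows m of the hash-table

def empty():
    """∅_A: the empty table."""
    return tuple(() for _ in range(M))

def singleton(n):
    """{n}_A: the table with entry n in row n (mod m)."""
    return tuple((n,) if row == n % M else () for row in range(M))

def union(t1, t2):
    """t1 ∪_A t2: row-wise concatenation, omitting elements of the second
    row that already occur in the first one."""
    return tuple(r1 + tuple(x for x in r2 if x not in r1)
                 for r1, r2 in zip(t1, t2))

def member(n, t):
    """n ∈_A t: true if n occurs in row n (mod m) of table t."""
    return n in t[n % M]

def h(t):
    """h_pset: the set of all entries of an actual hash-table."""
    return frozenset(x for row in t for x in row)

# Actual hash-tables, i.e. tables generated by the operations above.
t1 = union(singleton(3), singleton(8))   # 3 and 8 share row 3
t2 = union(singleton(8), singleton(1))
t12 = union(t1, t2)

# h commutes with the operations on these samples.
checks = [
    h(empty()) == frozenset(),
    h(singleton(7)) == frozenset({7}),
    h(t12) == h(t1) | h(t2),
    all(member(n, t12) == (n in h(t12)) for n in range(20)),
]
print(all(checks))
```

The membership check agrees with set membership only because the tables are actual hash-tables: in a generalized table an entry may sit in the wrong row, which is the reason why h is defined on RESTRICTION(A) and not on all of A.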

8. Conclusion

In this last section we try to summarize the main ideas of our implementation concept, compare it with other approaches in the literature, and we will discuss how far the concept fits already the aim of stepwise refinement for software systems.

8.1. Summary of our implementation approach

The implementation IMPL of specification SPEC0 by specification SPEC1 is given by a triple IMPL = (ΣSORT, EOP, HID) consisting of operations ΣSORT, called sorts implementing operations, equations EOP, called operations implementing equations, and HID, called hidden components (hidden sorts, hidden operations, hidden equations). We include the possibility that SPEC0 and SPEC1 are sharing a common actual parameter part SPEC such that only the non-parameter part must be implemented.

SYNTAX: IMPL: SPEC1 ⇒ SPEC0.

Since the semantics of SPEC0 and SPEC1 is given by the abstract data types T_SPEC0 and T_SPEC1 respectively, the semantics of the implementation is a construction SEM_IMPL transforming T_SPEC1 into T_SPEC0:

SEMANTICS: SEM_IMPL: T_SPEC1 ⇒ T_SPEC0.

The semantical construction SEM_IMPL is the composition of three simple constructions each of which is an adjoint functor in the sense of category theory:

(1) SYNTHESIS (sorts and operations of SPEC0 are implemented by synthesized sorts and operations of SPEC1).
(2) RESTRICTION (sorts and operations are restricted to those of SPEC0 and data to those generated by SPEC0-operations).
(3) IDENTIFICATION (data are identified with respect to the SPEC0-equations).

More precisely the SYNTHESIS step consists of a SORT-SYNTHESIS and an OP(eration)-SYNTHESIS. In the first step SPEC1 is extended by the new S0-sorts using operations ΣSORT while in the second step the new SPEC0-operations are defined by the equations EOP. Moreover the RESTRICTION step consists of FORGETTING all sorts and operations not belonging to SPEC0 and REACHABILITY where the data are restricted to those reachable by SPEC0-operations. From the point of view of category theory IMPL: SPEC1 ⇒ SPEC0 can be considered as a syntactical description of a functor SEM_IMPL: Alg_SPEC1 → Alg_SPEC0.

In addition to SYNTAX and SEMANTICS our concept includes a third important component CORRECTNESS. We have SYNTACTICAL and SEMANTICAL REQUIREMENTS to assure CORRECTNESS. In the SYNTACTICAL REQUIREMENTS we only require that the sort and the operation implementing level SORTIMPL and OPIMPL (built up from SPEC1, SPEC0 and IMPL) are specifications. This is easy to check. The SEMANTICAL REQUIREMENTS make sure that the implementation is sufficiently complete (OP-completeness) and consistent (RI-correctness) such that our CONCEPTUAL REQUIREMENTS stated in Section 3 are satisfied.

Sufficient conditions to verify OP-completeness and RI-correctness and hence the correctness of an implementation are given in Section 5. Finally it is important to mention that for implementations IMPL2: SPEC2 ⇒ SPEC1 and IMPL1: SPEC1 ⇒ SPEC0 there is also a composition IMPL(1, 2) = IMPL1 ∘ IMPL2: SPEC2 ⇒ SPEC0 where correctness criteria for the composition are given in Section 7. In the present version our concept is suitable to implement parameterized data types and parameterized specifications in the sense of [2] and [3, 4] only after parameter passing. But the concept will be extended to treat also the case before parameter passing and to show compatibility with parameter passing.

8.2. Comparison with other algebraic implementation concepts

Let us compare our implementation concept with other algebraic concepts in the literature.

Wand [41] and Lehmann and Smyth [35] assume that the data types ADT1 and ADT0 are already of the same type. Hence ADT1 corresponds to our REPIMPL and the implementation consists only of a surjective homomorphism (our representation homomorphism) in the IDENTIFICATION step.

Goguen, Nourani, Thatcher and Wagner [1, 25, 27] are using the derivor concept. This restricts the SORT-SYNTHESIS to copy operations and the OP-SYNTHESIS to nonrecursive enrichment equations. An implementation in their sense is a congruence on a derived (and restricted) algebra. This corresponds to our semantical constructions RESTRICTION and IDENTIFICATION where our congruence, however, is automatically generated by the SPEC0-equations E0. The possibility to consider arbitrary algebras in their implementation concepts forces one to leave the level of abstract data types. This is the reason why they cannot give a syntactical level of implementation. Our concept, however, allows stepwise implementation and refinement within the same concept of abstract data types.

Two basic features of our new implementation concept were sketched already by Guttag in [28] and in [29] (see also [24]): Recursive equations for (Σ + Σ0)-operations using (Σ + Σ1)-operations of the given specification SPEC1 and the idea of implementations on the specification level. In the concept of [24], IDENTIFICATION is given by an equational specification of an "abstraction function" A: T_OPIMPL → T_SPEC0 similar to rep (cf. 5.2), and RESTRICTION corresponds to the "representation invariant", which is a predicate P on T_OPIMPL. In [29], A is replaced by its equivalence kernel, the "equality interpretation". The implementation is called correct if the image of A restricted to all OPIMPL-data satisfying P is a SPEC0-algebra. The main drawback of this approach is the fact that syntax and semantics are mixed up: Since A and P are added to the syntax, an implementation can be incorrect only because A or P have not been specified appropriately.
Closely related to our concept is that of Ehrich in [11, 12] where an implementation of D0 by D1 is a triple I = (D2, f, t) with suitable specification morphisms f: D1 → D2 and t: D0 → D2. Actually his D2 corresponds to our IDIMPL (see 5.7), f: D1 → D2 "il-embedding and full wrt t" corresponds to our OP-completeness where, however, our SORT-SYNTHESIS is restricted to copy operations only. Finally his condition "true embedding" on t: D0 → D2 corresponds to our IR-correctness. Since the IR-semantical construction is less general than our RI-semantics (see 5.8) and copying is only a very special case of SORT-SYNTHESIS, Ehrich's implementation concept turns out to be a special case of ours although the concept of specification morphisms seems to be more general at first glance.

Similar to Ehrich's implementation concept is that by Ganzinger in [23] although some semantical requirements seem to be missing and the composition is just transitive closure. On the other hand it is important to note that Ganzinger is implementing already parameterized types and specifications in the sense of [2] and [3, 4]. Moreover he shows compatibility with parameter passing. Unfortunately, his approach is more or less only on the syntactical level such that the hard semantical part of these problems is still open.

Most recently another implementation concept was proposed by Hupbach [32] which is based on canons (initially restricted algebraic theories) and canon morphisms. This concept involves a lot of categorical constructions and computability considerations but allows no SORT-SYNTHESIS except copy and no multiple data representation (IDENTIFICATION). It seems to include RESTRICTION via use of partial functions where the domain is equationally specified. This allows closure under composition without additional consistency conditions. In [33] also compatibility with parameter passing is studied but, as far as we see, without considering suitable semantical requirements for parameter passing.

Similar to our first approach to implementation in [15] our semantics is given by a functor, actually a composition of adjoint functors (see 4.3). But we have avoided categorical terminology in this paper to be understandable for a wider audience. Actually we have given a syntactical description of the semantical functor SEM_IMPL in this paper. A similar situation is given by our algebraic specification schemes in [20]. In both cases the syntax completely determines the semantical construction. The main conceptual difference is that we implement SPEC0 by SPEC1 while in [20] SPEC0 is specified by SPEC1 and connection specifications (similar to our SORT- and OP-SYNTHESIS).

8.3. Towards stepwise refinement of software systems

In a first approximation stepwise refinement of software systems can be considered to consist of a sequence of specifications SPEC0, SPEC1, ..., SPECn and implementations IMPLi: SPECi ⇒ SPEC(i − 1) for i = 1, ..., n where each implementation IMPLi corresponds to a refinement of SPEC(i − 1) by SPECi. As mentioned already in the introduction of Section 3 the intention is to end up with a specification SPECn of an actual programming language. The syntax of such a refinement procedure would be the sequence

(IMPL1: SPEC1 ⇒ SPEC0, ..., IMPLn: SPECn ⇒ SPEC(n − 1))

of implementations, called compound implementation. We take the complete sequence rather than the n-step composition IMPL(1, ..., n): SPECn ⇒ SPEC0 because most of the structure of the refinement procedure, especially the intermediate specifications SPEC(n − 1), ..., SPEC1, are hidden in the composition IMPL(1, ..., n). This technique allows the design of a family of similar software systems if only some specifications or implementations in the sequence are changed. Concerning the semantics and correctness, however, only that of the composition IMPL(1, ..., n) seems to be important. The correctness of the intermediate steps is only interesting as far as it can be used to show the correctness of the composition. Especially RI-correctness of the intermediate steps is not necessary but mainly global consistency conditions (see Remark 7.4).

Up to now we have only considered the vertical structure of stepwise refinement starting with SPEC0 and ending up with SPECn where all specifications have the same actual parameter part as in our Example 7.5. In general, however, it is not advisable to specify an actual software system in terms of a single (unstructured) algebraic specification SPEC0. What we would need is something like a horizontal structure of SPEC0 and of the subsequent specifications SPEC1, ..., SPECn. Such a horizontal structure is proposed by the algebraic specification language CLEAR [5] for example. Two main features of CLEAR are the combine and the procedure concept. In order to handle the semantics of the procedure concept we are convinced that we need the semantics of parameter passing mechanisms for parameterized specifications which is going to be studied in [3, 4]. The semantics of the procedure concept in [7] lacks functionality and semantical consistency is not yet studied. Much simpler to handle is the combine concept, at least the special case which corresponds to the disjoint union of specifications. Given implementations IMPL: SPEC1 ⇒ SPEC0 and IMPL': SPEC1' ⇒ SPEC0' the parallel composition

IMPL + IMPL': SPEC1 + SPEC1' ⇒ SPEC0 + SPEC0'

can be defined by taking the disjoint union in each component. Actually, implementations are closed under parallel composition, and (sequential) composition is compatible with parallel composition in the following way:

(IMPL1 + IMPL2) ∘ (IMPL3 + IMPL4) = (IMPL1 ∘ IMPL3) + (IMPL2 ∘ IMPL4)

provided that the sequential composition on one side is well defined, which implies well-definedness of the other side as well. Using both types of composition we are able to consider a simple tree-structured refinement strategy, in contrast to the linear structure of sequential composition only. Consider the following tree scheme, where each node is colored with an algebraic specification and each bunch of edges is colored with an implementation of the top specification by the combination of all the bottom specifications:

                    SPEC0
                  /       \
             SPEC1         SPEC2
            /     \           |
        SPEC3     SPEC4     SPEC5
        /   \       |       /   \
    SPEC6 SPEC7   SPEC8  SPEC9 SPEC10

IMPL0: SPEC1 + SPEC2 ⇒ SPEC0,      IMPL3: SPEC6 + SPEC7 ⇒ SPEC3,
IMPL1: SPEC3 + SPEC4 ⇒ SPEC1,      IMPL4: SPEC8 ⇒ SPEC4,
IMPL2: SPEC5 ⇒ SPEC2,              IMPL5: SPEC9 + SPEC10 ⇒ SPEC5.

The semantics of this scheme can be defined by the semantics of the composition

IMPL0 ∘ (IMPL1 + IMPL2) ∘ (IMPL3 + IMPL4 + IMPL5)

or equivalently, using the compatibility result above, of

IMPL0 ∘ ((IMPL1 ∘ (IMPL3 + IMPL4)) + (IMPL2 ∘ IMPL5)).

The advantage of such a design is that each subtree can be considered as a separate smaller software system, and that the semantics of the complete system is independent of the order in which the subtrees are combined, as long as the combinations are syntactically well defined.

The concept of stepwise refinement developed so far is mathematically clean but only a first approach to the design of real systems. In our case study [13] we consider stepwise refinement of a parts system. The basic components are treated as specifications and implementations in the sense of this paper, but they should be treated as implementations of parameterized specifications. A detailed discussion of how far our concepts of implementation and parameterization are suitable for the design of software systems is given in [34]. The main problem from the theoretical point of view is to study the compatibility of these concepts, including syntax, semantics, semantical requirements and correctness. This also seems to be one of the main problems which have to be tackled in the system CAT sketched by Burstall and Goguen in [6].
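The equivalence of the two compositions for the tree scheme can be checked in a toy model. The following sketch is an illustration under strong simplifying assumptions and not the construction of this paper: a specification is reduced to a set of names, and an implementation IMPL: SPEC1 ⇒ SPEC0 is reduced to a table translating every name of SPEC0 into a name of SPEC1 (no hidden parts, equations or SYNTHESIS, RESTRICTION and IDENTIFICATION steps). The helpers `make`, `par`, `seq`, `spec` and `impl` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Impl:
    """Toy implementation: `source` implements `target` (assumption: names only)."""
    source: frozenset  # names of the implementing specification
    target: frozenset  # names of the implemented specification
    table: tuple       # sorted (target_name, source_name) pairs

    def lookup(self):
        return dict(self.table)


def make(source, target, table):
    """Build an Impl; every target name must be translated into a source name."""
    assert set(table) == set(target) and set(table.values()) <= set(source)
    return Impl(frozenset(source), frozenset(target), tuple(sorted(table.items())))


def par(*impls):
    """Parallel composition: disjoint union in each component."""
    source, target, table = set(), set(), {}
    for i in impls:
        assert not (source & i.source) and not (target & i.target), "not disjoint"
        source |= i.source
        target |= i.target
        table.update(i.lookup())
    return make(source, target, table)


def seq(outer, inner):
    """Sequential composition outer ∘ inner, defined only if the specifications fit."""
    assert outer.source == inner.target, "composition not well defined"
    inner_table = inner.lookup()
    return make(inner.source, outer.target,
                {t: inner_table[s] for t, s in outer.table})


def spec(k):
    """Hypothetical specification SPECk with two names."""
    return {f"SPEC{k}.x", f"SPEC{k}.y"}


def impl(target_k, *source_ks):
    """Hypothetical implementation of SPEC<target_k> by the sum of the given specs."""
    source = set().union(*(spec(k) for k in source_ks))
    table = {f"SPEC{target_k}.x": f"SPEC{source_ks[0]}.x",
             f"SPEC{target_k}.y": f"SPEC{source_ks[-1]}.y"}
    return make(source, spec(target_k), table)


IMPL0, IMPL1, IMPL2 = impl(0, 1, 2), impl(1, 3, 4), impl(2, 5)
IMPL3, IMPL4, IMPL5 = impl(3, 6, 7), impl(4, 8), impl(5, 9, 10)

# Level by level versus subtree by subtree.
layered = seq(seq(IMPL0, par(IMPL1, IMPL2)), par(IMPL3, IMPL4, IMPL5))
by_subtree = seq(IMPL0, par(seq(IMPL1, par(IMPL3, IMPL4)), seq(IMPL2, IMPL5)))
assert layered == by_subtree
```

In this model `seq` fails whenever the implemented and implementing specifications do not match, which mirrors the proviso that the compositions have to be syntactically well defined; the final assertion compares the two ways of evaluating the tree scheme.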

Acknowledgment

This paper is part of a project on algebraic specifications and implementations of abstract data types at TU Berlin. We are most grateful to all members of the ADJ-group, R. Burstall, H.D. Ehrich, W. Fey, C. Floyd and all members of the software engineering group at TU Berlin, J.A. Goguen, U.L. Hupbach, C. Lautemann, H. Reichel, D. Siefkes, H. Weber and the referee of this paper for several valuable comments concerning the topic of this paper.

References

[1] ADJ: J.A. Goguen, J.W. Thatcher and E.G. Wagner, An initial algebra approach to the specification, correctness and implementation of abstract data types, IBM Research Report RC-6487, 1976; and in: R. Yeh, Ed., Current Trends in Programming Methodology, IV: Data Structuring (Prentice-Hall, Englewood Cliffs, NJ, 1978) 80-144.
[2] ADJ: J.W. Thatcher, E.G. Wagner and J.B. Wright, Data type specification: Parameterization and the power of specification techniques, Proc. 10th SIGACT Symposium on Theory of Computing, San Diego (1978) 119-132.
[3] ADJ: H. Ehrig, H.-J. Kreowski, J.W. Thatcher, E.G. Wagner and J.B. Wright, Parameterized data types in algebraic specification languages, Proc. ICALP'80, Noordwijkerhout, Lecture Notes in Computer Science 85 (Springer, Berlin, 1980) 157-168.
[4] ADJ: H. Ehrig, H.-J. Kreowski, J.W. Thatcher, E.G. Wagner and J.B. Wright, Parameter passing in algebraic specification languages, Draft version (1981).
[5] R.M. Burstall and J.A. Goguen, Putting theories together to make specifications, Proc. International Conference Artificial Intelligence, Boston (1977).
[6] R.M. Burstall and J.A. Goguen, CAT, A system for the structured elaboration of correct programs from structured specifications, Preliminary draft (1979).
[7] R.M. Burstall and J.A. Goguen, Semantics of CLEAR, a specification language, Proc. 1979 Copenhagen Winter School on Abstract Software Specification, Lecture Notes in Computer Science 86 (Springer, Berlin, 1980) 294-332.
[8] J.A. Bergstra and J.V. Tucker, A characterization of computable data types by means of a finite, equational specification method, Proc. ICALP'80, Noordwijkerhout, Lecture Notes in Computer Science 85 (Springer, Berlin, 1980) 76-90.
[9] E.W. Dijkstra, Notes on structured programming, in: C.A.R. Hoare, Ed., Structured Programming (Academic Press, New York, 1972).
[10] P. Dybjer, Algebraic specification of LISP, Draft report, Department of Computer Science, University of California, Los Angeles (1980).
[11] H.D. Ehrich, Extensions and implementations of abstract data type specifications, Proc. MFCS'78, Zakopane, Lecture Notes in Computer Science 64 (Springer, Berlin, 1978) 155-163.
[12] H.D. Ehrich, On the theory of specification, implementation and parameterization of abstract data types, Forschungsbericht Universität Dortmund (1978).
[13] H. Ehrig, W. Fey and H.-J. Kreowski, Algebraische Spezifikation eines Stücklistensystems - Eine Fallstudie, Proc. German Chapter ACM Conference on Software Engineering: Entwurf und Spezifikation, Berlin (1980).
[14] H. Ehrig, H.-J. Kreowski, B. Mahr and P. Padawitz, Compound algebraic implementations: An approach to stepwise refinement of software systems, Proc. MFCS'80, Rydzyna, Lecture Notes in Computer Science 88 (Springer, Berlin, 1980) 231-245.
[15] H. Ehrig, H.-J. Kreowski and P. Padawitz, Stepwise specification and implementation of abstract data types, Proc. ICALP'78, Udine, Lecture Notes in Computer Science 62 (Springer, Berlin, 1978) 205-226.
[16] H. Ehrig, H.-J. Kreowski and P. Padawitz, Algebraic implementation of abstract data types: An announcement, SIGACT News 11 (2) (1979) 25-29.
[17] H. Ehrig, H.-J. Kreowski and P. Padawitz, Completeness in algebraic specifications, Bull. EATCS 11 (1980) 2-9.
[18] H. Ehrig, H.-J. Kreowski and P. Padawitz, Algebraic implementation of abstract data types: Concept, syntax, semantics and correctness, Proc. ICALP'80, Noordwijkerhout, Lecture Notes in Computer Science (Springer, Berlin, 1980) 142-156.
[19] H. Ehrig, H.-J. Kreowski and P. Padawitz, A case study of abstract implementations and their correctness, Proc. 4th International Symposium on Programming, Paris, Lecture Notes in Computer Science 83 (Springer, Berlin, 1980) 108-122.
[20] H. Ehrig, H.-J. Kreowski and H. Weber, New aspects of algebraic specification schemes for data base system, Proc. Fachtagung "Formale Modelle für Informationssysteme", Tutzing am See (1979).
[21] H. Ehrig and B. Mahr, Complexity of algebraic implementations for abstract data types, J. Comput. System Sci. 23 (2) (1981) 223-253.
[22] K.P. Eswaran, J.N. Gray, R.A. Lorie and I.L. Traiger, On the notions of consistency and predicate locks in a data base system, Comm. ACM 19 (11) (1976) 624-633.
[23] H. Ganzinger, Parameterized data types: Parameter passing and implementation, Draft manuscript, Computer Science Division, UC Berkeley (1980).
[24] M.-C. Gaudel, Algebraic specification of abstract data types, INRIA Rapport de Recherche No. 360 (1979).
[25] J.A. Goguen, How to prove algebraic inductive hypothesis without induction, Lecture Notes in Computer Science 87 (Springer, Berlin, 1980) 356-373.
[26] J.A. Goguen and Parsaye-Ghomi, Algebraic denotational semantics using parameterized abstract modules, in: J. Diaz and I. Ramos, Eds., Formalization of Programming Concepts, Lecture Notes in Computer Science 107 (Springer, Berlin, 1981) 292-309.
[27] J.A. Goguen and F. Nourani, Some algebraic techniques for proving correctness of data type implementation, Extended abstract, Computer Science Department, UCLA, Los Angeles (1978).
[28] J.V. Guttag, Abstract data types and the development of data structures, Supplement to Proc. Conference on Data Abstraction, Definition, and Structure, SIGPLAN Notices 8 (1976).
[29] J.V. Guttag, E. Horowitz and D.R. Musser, Abstract data types and software validation, Comm. ACM 21 (12) (1978) 1048-1063.
[30] M.A. Harrison and R.J. Lipton, Implementation of abstract data types, Extended abstract, Computer Science Division, UC Berkeley (1979).
[31] G. Huet and D.C. Oppen, Equations and rewrite rules: A survey, in: R. Book, Ed., Formal Languages: Perspectives and Open Problems (Academic Press, New York, 1980).
[32] U.L. Hupbach, Abstract implementation of abstract data types, Proc. MFCS'80, Rydzyna, Lecture Notes in Computer Science 88 (Springer, Berlin, 1980) 291-304.
[33] U.L. Hupbach, Abstract implementation and parametersubstitution, Proc. 3rd Hungarian Computer Science Conference, Budapest (1981).
[34] H.-J. Kreowski, Algebraische Spezifikation von Softwaresystemen, Proc. German Chapter ACM Conference on Software Engineering: Entwurf und Spezifikation, Berlin (1980).
[35] D.H. Lehmann and M.B. Smyth, Data types, Proc. 18th IEEE Symposium on Foundations of Computing, Providence, RI (1977) 7-12.
[36] B.H. Liskov and S.N. Zilles, Programming with abstract data types, SIGPLAN Notices 9 (1974) 50-59.
[37] P. Mosses, A constructive approach to compiler correctness, Proc. ICALP'80, Noordwijkerhout, Lecture Notes in Computer Science 85 (Springer, Berlin, 1980) 449-469.
[38] P. Mosses, Abstract semantic algebras, Aarhus University Report DAIMI IR-29 (1981).
[39] F. Nourani, Constructive extension and implementation of abstract data types and algorithms, Technical Report UCLA-ENG-7945 (1979).
[40] P. Padawitz, New results on completeness and consistency of abstract data types, Proc. MFCS'80, Rydzyna, Lecture Notes in Computer Science 88 (Springer, Berlin, 1980) 460-473.
[41] M. Wand, Final algebra semantics and data type extensions, J. Comput. System Sci. 19 (1) (1979) 27-44.
[42] M. Wirsing and M. Broy, Abstract data types as lattices of finitely generated models, Proc. MFCS'80, Rydzyna, Lecture Notes in Computer Science 88 (Springer, Berlin, 1980) 673-685.
[43] N. Wirth, Program development by stepwise refinement, Comm. ACM 14 (4) (1971) 221-227.
[44] W.A. Wulf, Abstract data types: A retrospective and prospective view, Proc. MFCS'80, Rydzyna, Lecture Notes in Computer Science 88 (Springer, Berlin, 1980) 94-112.
[45] S.N. Zilles, An introduction to data algebras, Working draft paper, IBM Research, San Jose (1975).