Electronic Notes in Theoretical Computer Science

Volume 44.1

COALGEBRAIC METHODS IN COMPUTER SCIENCE CMCS'01

Genova, Italy

April 6-7, 2001

Guest Editors:

A. Corradini M. Lenisa U. Montanari ii Contents

Foreword ...... v A Coalgebraic View of Infinite Trees and Iteration ...... 1 Peter Aczel, Jiˇr´ıAdámek and Jiˇr´ı Velebil From Varieties of Algebras to Covarieties of Coalgebras ...... 27 Jiˇr´ıAdámek and Hans-E. Porst Process Calculi àlaBird-Meertens ...... 47 Lu´ıs S. Barbosa Generalised Coinduction ...... 67 Falk Bartels Deforestation,programtransformation,andcut-elimination ...... 88 Robin Cockett Algebras, Coalgebras, Monads and Comonads ...... 128 Neil Ghani, Christoph Lüth, Federico de Marchi and John Power Whenisa functiona foldoranunfold? ...... 146 Jeremy Gibbons, Graham Hutton and Thorsten Altenkirch A Calculus of Terms for Coalgebras of Polynomial Functors ...... 160 Robert Goldblatt Monoid-labeledtransitionsystems...... 184 H. Peter Gumm and Tobias Schröder ModalOperatorsforCoequations...... 204 Jesse Hughes Two-dimensionallinearalgebra ...... 226 Martin Hyland and John Power ModalRulesareCo-Implications ...... 240 Alexander Kurz Invariantsofmonadiccoalgebras...... 253 Dragan Maˇsulović Modal Languages for Coalgebras in a Topological Setting ...... 270 Dirk Pattinson BialgebraicSemanticsandRecursion(ExtendedAbstract) ...... 284 Gordon Plotkin From Algebras and Coalgebras to Dialgebras ...... 288 Erik Poll and Jan Zwanenburg

iii iv Foreword

This volume contains the Proceedings of the Fourth Workshop on Coalgebraic Methods in Computer Science (CMCS’2001). The Workshop was held in Gen- ova, Italy on April 6 and 7, 2001, as a satellite event of ETAPS’2001. The aim of the CMCS workshop series is to bring together researchers with a common interest in theory of coalgebras and its applications. Previous workshops have been organized in Lisbon (1998), Amsterdam (1999) and Berlin (2000). The proceedings appeared as ENTCS Vols. 11,19 and 33. During the last few years it is becoming increasingly clear that a great variety of state-based dynamical systems, like transition systems, automata, process calculi and class-based systems can be captured uniformly as coalgebras. The first three volumes together with the current volume demonstrate that theory of coalgebras and its applications are developing into a field of its own interest presenting a deep mathematical foundation, a growing field of applications and interactions with various other fields, such as reactive and interactive system theory, object oriented and concurrent programming, formal system specification, modal logic, dynamical systems, control systems, category theory, algebra, analysis, etc. The Program Committee of CMCS’2001 consisted of • Alexandru Baltag (CWI, Amsterdam), • Andrea Corradini (Department of Computer Science, University of Pisa), • Bart Jacobs (Department of Computer Science, University of Nijmegen), • Marina Lenisa (Department of Mathematics and Computer Science, Uni- versity of Udine), • Ugo Montanari (Department of Computer Science, University of Pisa), chair, • Larry Moss (Department of Mathematics, Indiana University, Blooming- ton), • Ataru T. Nakagawa (Software Research Associates, Tokyo), • Dusko Pavlovic (Kestrel Institute, Palo Alto), • John Power (Laboratory for Foundations of Computer Science, University of Edinburgh), • Horst Reichel (Institute of Theoretical Computer Science, Dresden Univer- sity of Technology), • Jan Rutten (CWI, Amsterdam). The papers in this volume were reviewed by the program committee members and by Falk Bartels, Anna Bucalo, Corina Cirstea, Pietro Di Giananto- nio, Marcelo Fiore, Jesse Hughes, Alexander Kurz, Anna Labella, Lambert Meertens, Marino Miculan, Marco Pistore, Grigore Rosu, Doug Smith. This volume will be published in the series Electronic Notes in Theoretical v Computer Science (ENTCS). This series is published electronically through the facilities of Elsevier Science B.V. and its auspices. The volumes in the ENTCS series can be accessed at the URL http://www.elsevier.nl/locate/entcs A printed version of the current volume is distributed to the participants at the workshop in Genova. We are very grateful to the following persons, whose help has been crucial for the success of CMCS’2001: Maura Cerioli and Gianna Reggio for their help with the organization of the Workshop as satellite event of ETAPS’2001; and Mike Mislove, Managing Editor of the ENTCS series, for his assistance with the use of the ENTCS style files. Thanks are also due to the Department of Computer Science of the University of Pisa, which has covered the printing cost of the copies distributed in Genova.

March 13, 2000 Andrea Corradini, Marina Lenisa and Ugo Montanari

vi CMCS’01 Preliminary Version

A Coalgebraic View of Inﬁnite Trees and Iteration

Peter Aczel 1

Department of Mathematics and Computer Science, Manchester University, Manchester, United Kingdom

Jiˇr´ıAd´amek 2,4

Institute of Theoretical Computer Science, Technical University, Braunschweig, Germany

Jiˇr´ı Velebil 3,4

Faculty of Electrical Engineering, Technical University, Praha, Czech Republic

Abstract The algebra of inﬁnite trees is, as proved by C. Elgot, completely iterative, i.e., all ideal recursive equations are uniquely solvable. This is proved here to be a general coalgebraic phenomenon:let H be an endofunctor such that for every object X aﬁ- nal coalgebra, TX,ofH( )+X exists. Then TX is an object-part of a monad which is completely iterative. Moreover, a similar contruction of a “completely iterative monoid” is possible in every monoidal category satisfying mild side conditions. Key words: monad, coalgebra, monoidal category

1 Email:[email protected] 2 Email:[email protected] 3 Email:[email protected] 4 The support of the Grant Agency of the Czech Republic under the Grant No. 201/99/0310 is gratefully acknowledged. This is a preliminary version. The ﬁnal version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Aczel, Adamek,´ Velebil

1 Introduction

There are various algebraic approaches to the formalization of computations of data through a given program, taking into account that such computations are potentially infinite. In 1970’s the ADJ group have proposed continuous algebras, i.e., algebras built upon CPO’s so that all operations are continuous. Here, an infinite computation is a join of the directed set of all finite approximations, see e.g. [7]. Later, algebras on complete metric spaces were considered, where an infinite computation is a limit of a Cauchy sequence of finite approximations, see e.g. [6]. In the present paper we show that a coalgebraic approach makes it possible to study infinite computations without any additional (always a bit arbitrary) structure — that is, we can simply work in Set, the category of sets. We use the simple and well-known fact that for polynomial endofunctors H of Set the algebra of all (finite and infinite) properly labelled trees is a final H-coalgebra. Well, this is not enough: what we need is working with “trees with variables”, i.e., given a set X of variables, we work with trees whose internal nodes are labelled by operations, and leaves are labelled by variables and constants. This is a final coalgebra again: not for the original functor, but for the functor

H + CX : Set −→ Set

where CX is the constant functor with value X. We are going to show that for every polynomial functor H : Set −→ Set

(a) final coalgebras TX of the functors H + CX form a monad, called the completely iterative monad generated by H, (b) there is also a canonical structure of an H-algebra on each TX, and all these canonical H-algebras form the Kleisli category of the completely iterative monad, and (c) the H-algebra TX has unique solutions of all ideal systems of recursive equations. A surprising feature of the result we prove is its generality: this has nothing to do with polynomiality of H, nor with the base category Set. In fact, given an endofunctor H of a category A with binary coproducts, and assuming that each H+CX has a final coalgebra, then (a)–(c) hold. Moreover, the completely iterative monad T : A −→ A, as an object of the endofunctor category [A, A], is a final coalgebra of the following endofunctor H of [A, A]: H(B)=H · B +1A for all B : A −→ A. Now [A, A] is a monoidal category whose tensor product ⊗ is composition and unit I is 1A. And the completely iterative monad generated by H is a monoid in [A, A]. We thus turn to the more general problem: given a monoidal category B, we call an object H iteratable provided that the endofunctor H : B −→ B given by H(B)=H ⊗ B + I has a final coalgebra T . Assuming 2 Aczel, Adamek,´ Velebil that binary coproducts of B distribute on the left with the tensor product, we deduce that T has a structure of a monoid, called the completely iterative monoid generated by the object H. Coming back to polynomial endofunctors of Set: the solutions of equations mentioned in (c) above refer to a topic extensively studied in 1970’s by C. C. Elgot [11], J. Tiuryn [18], the ADJ group [7] and others: suppose that X and Y are disjoint sets of variables and consider equations of the form

x0 = t0(x0,x1,x2,...,y0,y1,y2,...) x1 = t1(x0,x1,x2,...,y0,y1,y2,...) . . where xi are variables in X and yj are variables in Y , while ti are trees using those variables. Following Elgot, we call the system ideal provided that each tree ti is diﬀerent from any variable, more precisely,

ti ∈ T (X + Y ) \ η[X + Y ] for each i =0, 1, 2,... It then turns out that the system has a unique solution in TY. That is, there exists a unique sequence si(y0,y1,y2,...) of trees in TY for which the following equalities

s0(yi)=t0(s0(yi),s1(yi),...,yi,...) (1) s1(yi)=t1(s0(yi),s1(yi),...,yi,...) . . hold. Expressed categorically, an ideal system of equations is a morphism e : X −→ T (X + Y ) which factors through the H-algebra structure

τX+Y : HT(X + Y ) −→ T (X + Y ) mentionedin(b)above: e / X T (6X + Y ) nnn nnn (2) nnn % nnn τX+Y HT(X + Y ) Asolutionofe is given by a morphism e† : X −→ TY for which the following diagram e† / X TYO

e µY (3) / T (X + Y ) † TTY T [e ,ηY ] commutes. (Here, µY : TTY −→ TY is the multiplication of the completely iterative monad. In case of polynomial functors this takes a properly labelled 3 Aczel, Adamek,´ Velebil tree whose leaves are again properly labelled trees, and it delivers the properly labelled tree obtained by ignoring the internal structure.) In fact, the † morphism T [e ,ηY ] takes a tree with variables from X + Y onitsleavesand substitutes the solution-tree e†(x) for each occurence of the variable x ∈ X. Thus the equality (3), i.e., † † e (x)=µY · T [e ,ηY ](e(x)) for all x ∈ X precisely corresponds to the condition (1) above. Now in this categorical formulation we can, again, forget about polynomiality and about Set:ifH has ﬁnal coalgebras for all functors H +CX ,thenwe prove that every ideal equation-morphism e : X −→ T (X + Y ) has a unique solution, i.e., a unique morphism e† : X −→ TY for which (3) commutes. Related work. After ﬁnishing the present version of our paper we have found out that a similar topic is discussed by L. Moss in his preprints [15] and [16].

2 A Completely Iterative Monad of an Endofunctor

Assumption 2.1 Throughout this section, H denotes an endofunctor of a category A with ﬁnite coproducts. Whenever possible we denote by inl : X −→ X + Y and inr : Y −→ X + Y the ﬁrst and the second coproduct injections. Remark 2.2 For the functor

H( )+CX : A −→ A i.e., for the coproduct of H with CX (the constant functor of value X)itis well-known that

initial (H + CX )-algebra ≡ free H-algebra on X. (See [4].) More precisely, suppose that FX together with

αX : HFX + X −→ FX is an initial (H + CX )-algebra. The components of αX form

an H-algebra ϕX : HFX −→ FX and 0 −→ a universal arrow ηX : X FX. That is, for every H-algebra HA −→ A and for every morphism f : X −→ A there exists a unique homomorphism f : FX −→ A of H-algebras with · 0 f = f ηX . 4 Aczel, Adamek,´ Velebil

Example 2.3 Polynomial endofunctors of Set. These are the endofunctors of the form n HZ = A0 + A1 × Z + A2 × Z × Z + ...= An × Z n<ω where

Σ=(A0,A1,A2,...) is a sequence of pairwise disjoint sets called the signature.AninitialH-algebra can be described as the algebra of all finite Σ-labelled trees. Here a Σ-labelled tree t is represented by a partial function ∗ t : ω −→ An n<ω ∗ whose definition domain Dt is a nonempty and prefix-closed subset of ω (the set of all finite sequences of natural numbers), such that for any i1i2 ...ir ∈ Dt with t(i1 ...ir) ∈ An we have

i1i2 ...iri ∈ Dt iﬀ i

H + CX is also polynomial of signature

ΣX =(X + A0,A1,A2,...). Therefore, FX can be described as the algebra of all ﬁnite ΣX -labelled trees. Remark 2.4 (i) Dualizing the concept of a free H-algebra, we can study cofree H-coalgebras. A cofree H-coalgebra on an object X of A is just a free Hop-algebra on X in Aop,whereHop : Aop −→ Aop is the obvious endofunctor. If A has ﬁnite products then, by dualizing 2.2, we see that initial (H × CX )-algebra ≡ cofree H-coalgebra on X. Example: let H be a polynomial functor on Set.Then

H × CX is also a polynomial functor, since n (H × CX )Z = X × An × Z . n<ω This is the polynomial functor of signature X Σ =(X × A0,X × A1,X × A2,...). A cofree H-coalgebra can be described as the coalgebra TX of all (ﬁnite and inﬁnite) ΣX -labelled trees. Thus every node is labelled by (i) an operation from An and (ii) a variable from X. 5 Aczel, Adamek,´ Velebil

(ii) Besides a free H-algebra on X and a cofree H-coalgebra on X we have a third structure associated with X: a ﬁnal coalgebra of H + CX . We will show that it has an important universal property. Deﬁnition 2.5 An endofunctor H of A is called iteratable provided that for every object X of A the endofunctor

H + CX has a ﬁnal coalgebra. Notation 2.6 Let TX denote a ﬁnal coalgebra of H + CX . The coalgebra map

αX : TX −→ H(TX)+X is, by Lambek’s lemma [13], an isomorphism. Thus, we have TX = H(TX)+X with coproduct injections

τX : H(TX) −→ TX and ηX : X −→ TX −1 −→ where [τX ,ηX ]=αX : H(TX)+X TX. In particular, TX is an H-algebra via τX . We can also deﬁne T on morphisms f : X −→ Y of A in the expected manner: we turn TX into the following coalgebra of type H + CY :

αX id+f TX /H(TX)+X /H(TX)+Y and denote by Tf : TX −→ TY the unique homomorphism of coalgebras: αX / id+f TX o H(TX)+X /H(TX)+Y [τX ,ηX ] Tf HTf+id αY / TY o H(TY)+Y [τY ,ηY ] That is, Tf is the unique morphism such that the following squares

τX ηX HTX /TX X /TX HTf Tf f Tf / / HTY τY TY Y ηY TY commute. Shortly, Tf is the unique homomorphism of H-algebras extending f. It is easy to verify that (due to the uniqueness) we obtain a well-deﬁned functor T : A −→ A 6 Aczel, Adamek,´ Velebil and a natural transformation

η :1A −→ T

Example 2.7 Continuous functors are iteratable. Here we assume that A has 1.aterminalobject1 2. limits of ωop-sequences and 3. binary coproducts commuting with ωop-limits. Suppose that H is ωop-continuous, i.e., preserves ωop-limits. Due to 3., all op functors H + CX are ω -continuous, thus, have a final coalgebra (see [2]) n TX = lim(H + CX ) 1. n<ω Observe that the functor T is also continuous: in fact, as we will see in Section 4, T is a final coalgebra of the functor H :[A, A] −→ [A, A] defined on objects by H(B)=H · B +1A for all B : A −→ A. Now [A, A] satisfies 1.–3. above, and H is continuous, thus, n T = lim H (C1). n<ω This, being a limit of continuous functors, is continuous.

Example 2.8 Polynomial endofunctors of Set. They are continuous, thus iteratable. A final coalgebra TX of the (polynomial!) functor H + CX of signature ΣX is the algebra of all ΣX -labelled trees. That is, unlike the coalgebra TX of all ΣX -labelled trees, see 2.4, where every node carries a label from X and one from An (for the case of n children), the trees in TX have nodes labelled in An except for leaves: they are labelled in X + A0. As a concrete example, consider the unary signature: HZ = A × Z. We have defined three algebras for a set X of variables: the free algebra FX = A∗ × X of all finite Σ-labelled trees for Σ = (∅,A,∅, ∅,...), the cofree coalgebra TX =(A × X)∞ 7 Aczel, Adamek,´ Velebil

(where ( )∞ denotes the set of all finite and infinite words in the given alphabet), and the algebra TX = A∗ × X + Aω (where ( )ω denotes the set of all infinite words in the given alphabet). Example 2.9 Generalized polynomial functors. We want to include functors such as HZ = ZB,whereB is a (not necessarily finite) set: since these functors are continuous, the description of TX is quite analogous to the preceding case. Here we introduce a generalized signature as a collection

Σ=(Ai)i∈Card of pairwise disjoint sets indexed by all cardinals such that for some cardinal λ we have

i ≥ λ implies Ai = ∅. (WesaythatΣisaλ-ary generalized signature; the case λ = ω being the above one.) The generalized polynomial functor of generalized signature Σ is defined on objects by j HZ = Aj × Z j<λ and analogously on morphisms. A final coalgebra is, again, described as the coalgebra of all Σ-labelled trees, i.e., partial maps ∗ t : λ −→ Aj j<λ ∗ defined on a nonempty, prefixed-closed subset Dt of λ (the set of all finite sequences of ordinals smaller than λ) such that for all i1i2 ...ir ∈ Dt with t(i1i2 ...ir) ∈ Aj we have

i1i2 ...iri ∈ Dt iﬀ i

Since H+CX is a generalized polynomial functor of signature ΣX , obtained from Σ by changing A0 to X + A0, we conclude that TX is the coalgebra of all (ﬁnite and inﬁnite) ΣX -labelled trees. Remark 2.10 Denote by U : H-Alg −→ A the forgetful functor of the category of all H-algebras and H-homomorphisms. The universal property of free H-algebras ϕX : HFX −→ FX (provided they exist on all objects X of A) makes U a right adjoint. The left adjoint is the functor

X → (FX,ϕX ).

We now show a related universal property of the H-algebras τX : HTX −→ TX: given a morphism s : X −→ TY we prove that there is a unique homomorphism s : TX −→ TY of H-algebras extending s. Thisisinteresting 8 Aczel, Adamek,´ Velebil even for the basic case of the polynomial endofunctors of Set: here a morphism s : X −→ TY can be viewed as a substitution rule, substituting a variable x ∈ X by a ΣY -labelled tree s(x). We obviously have a homomorphism s : TX −→ TY extending s: take a tree t ∈ TX, substitute every variable x ∈ X on any leave of t by the tree s(x) and obtain a tree t = Ts(t) ∈ TTY over TY. Now forget that t is a tree of trees and obtain a tree s(t)inTY. However, it is not obvious that such a homomorphism is unique. This is what we prove now:

Substitution Theorem 2.11 For every iteratable endofunctor H of A and any morphism s : X −→ TY in A there exists a unique extension into a homomorphism s : TX −→ TY of H-algebras. That is, a unique homomorphism s :(TX,τX ) −→ (TY,τY ) with s = s· ηX .

Proof. We turn TX + TY into a coalgebra of type H + CY as follows: the coalgebra map is

X Y β TX + TYα +α /HTX + X + HTY + Y /H(TX + TY)+Y where the components of β (denoted by β1, β2, β3 and β4 from left to right) are as follows: X s HTX TY HTY

Hinl αY Hinr H(TX + TY) H(TX + TY) eeee TT HTY + Y k eeeeee Y TTTT kkkk eeeeee TTT inlkkk eeeee TTT Hinr+id kkk eeeeee inl TTT) kukkkreeeeee inr H(TX + TY)+Y There exists a unique homomorphism

αX +αY / β TX + TY o H(TX)+X + HTY + Y /H(TX + TY)+Y [τX ,ηX ]+[τY ,ηY ] f Hf+id αY / TY o HTY + Y [τY ,ηY ] of (H + CY )-coalgebras. Equivalently, a unique morphism

f =[f1,f2]:TX + TY −→ TY 9 Aczel, Adamek,´ Velebil in A for which the following two squares

[τX ,ηX ] HTX + X /TX [τY ,ηY ] [β1,β2] / HTY + Y TY H(TX + TY)+Y f1 and Hf2+id f2 / Hf+id HTY + Y TY [τY ,ηY ] HTY + Y / [τY ,ηY ] TY commute. The right-hand square shows that f2 is an endomorphism of the ﬁnal (H + CY )-coalgebra — thus,

f2 = id. The left-hand square is equivalent to the commutativity of the following two squares (recall the deﬁnition of β1 and β2): ηX X /TX s τX HTX /TX TY αY Hf1 f1 and f1 / HTY τY TY HTY + Y

Hf2+id HTY + Y / [τY ,ηY ]TY

The square on the left tells us that f1 is a homomorphism of H-algebras. And −1 since f2 = id (thus Hf2 + id = id)andαY =[τY ,ηY ],thesquareonthe right states f1 · ηX = s, i.e., f1 extends s. This proves that there is a unique extension of s to a homomorphism: put s = f1. ✷ Corollary 2.12 Let K denote the full subcategory of H-Alg formed by all the H-algebras (TX,τX ),forX in A. The functor

Ψ:A −→ K,X → (TX,τX ) is left adjoint to the forgetful functor

U/K : K −→ A, (TX,τX ) → TX. This adjunction deﬁnes a monad T =(T,η,µ) on A. In fact, the natural transformation µ : TT −→ T is, in the notation of the Substitution Theorem, precisely µY = id TY : T (TY) −→ TY. Deﬁnition 2.13 The above monad T, associated with any iteratable endofunctor H, is called the completely iterative monad generated by H. 10 Aczel, Adamek,´ Velebil

Examples 2.14 (i) The completely iterative monad generated by the endofunctor HZ = A × Z of Set is the monad TX = A∗ × X + Aω. This can be described as the free-algebra monad of the variety of algebras with (a) unary operations oa for a ∈ A, ω (b) nullary operations indexed by A (i.e., constants of the names a0a1a2 ...∈ Aω), and (c) satisfying the equations

oa(a0a1a2 ...)=aa0a1a2 ... for all a, a0,a1,...∈ A In this case, T is a ﬁnitary monad on Set. (ii) The completely iterative monad generated by the endofunctor HZ = Z × Z of Set is the monad TX of all binary trees with leaves indexed in X.This is not ﬁnitary: consider the following element of T : ? ?? ?? ?? x1 ?? ?? x2 ?? ?? x3 x4

in which all xi are pairwise distinct. (iii) Let CPO denote the category of CPO’s (say, posets with a smallest element ⊥ and joins of ω-chains) and strict continuous functions (i.e., those preserving ⊥ and joins of ω-chains). For all locally continuous functors H : CPO −→ CPO, i.e., such that the derived functions CPO(A, B) −→ CPO(HA,HB),f → Hf are all continuous, it is well-known that initial H-algebra ≡ ﬁnal H-coalgebra, see [17]. Since each H + CX is also locally continuous, we deduce that locally continuous functors are iteratable, and in this case FX ≡ TX that is, the completely iterative monad T is just the free algebra monad F of H. 11 Aczel, Adamek,´ Velebil

(iv) Analogously for the category CMS of all complete metric spaces and contractions: every locally contractive endofunctor H : CMS −→ CMS, i.e., such that the derived functions CMS(A, B) −→ CMS(HA,HB),f → Hf are all contractive with a common constant < 1, has a single ﬁxed point, i.e., initial H-algebra ≡ ﬁnal H-coalgebra, see [6]. Since each H + CX is also locally contractive, we again get T = F. Remark 2.15 (i) The Kleisli category

AT −→ A of the completely iterative monad is the above category K of all H- algebras τX : HTX −→ TX (with its forgetful functor K −→ A). This follows from the Substitution Theorem. (ii) The Eilenberg-Moore category AT −→ A of all T-algebras and T-homomorphisms seems to be a new construct. As seen in 2.14, it is usually inﬁnitary. For example, if H : Set −→ Set,HZ= Z × Z we can describe SetT as the category of all algebras on one binary operation (say, )andonω-ary operations t(x0,x1,x2,...) for all inﬁnite trees t(x0,x1,x2,...)inT {xn | n<ω} satisfying the following equations: (a) x t = x t, (b) t x = t x, and (c) t(t0, t1, t2,...)=s if s is the tree obtained by substituting ti for xi, i<ω,intot(x0,x1,x2,...).

3 Solution Theorem

In the introduction above we have motivated the following Deﬁnition 3.1 Let H be an iteratable endofunctor of A. (i) By an equation-morphism we understand a morphism in A of the following form e : X −→ T (X + Y ),X,YareobjectsofA. (ii) An equation-morphism is called ideal if it factorizes through

τX+Y : HT(X + Y ) −→ T (X + Y ). 12 Aczel, Adamek,´ Velebil

(iii) A solution of an equation-morphism e is a morphism e† : X −→ TY such that the following diagram

e† / X TYO

e µY / T (X + Y ) † TTY T [e ,ηY ] commutes.

Examples 3.2 (i) Consider the following endofunctor HZ = A × Z of Set with TX =(A∗ × X)+Aω. The only interesting equations are of the form

x = a1a2 ...anx ∗ for a word a1a2 ...an ∈ A . This equation is ideal iﬀ the word is nonempty. Then it has a unique solution: † ω e (x)=a1a2 ...ana1a2 ...ana1a2 ...an ...∈ A . (ii) For the functor HZ = Z × Z on Set we have as TX the algebra of all binary trees with leaves labelled in X. Consider the following system e of equations ?? ? x1 = x2 y1 ?? ? x2 = x1 y2 It is ideal. The unique solution is ?? ? ? ?? y1 † ?? e (x )= ? y2 1 ? ?? y1 y2 ?? ? ? ?? y2 † ?? e (x )= ? y1 2 ? ?? y2 y1 Solution Theorem 3.3 Let H be an iteratable endofunctor. Then every ideal equation-morphism has a unique solution. 13 Aczel, Adamek,´ Velebil

Proof. Let e : X −→ T (X + Y ) be an ideal equation-morphism and denote by e0 : X −→ HT(X + Y ) a morphism such that the triangle K e / X KK Tn(6X + Y ) KKK nnn KK nnn 0 KK nn X+Y e K% nnn τ HT(X + Y ) commutes.

We are going to use the definition of a final (H +CY )-coalgebra for defining a morphism h : T (X + Y )+TY −→ TY. To do this, we must first define an (H + CY )-coalgebra γ : T (X + Y )+TY −→ H T (X + Y )+TY + Y on T (X + Y )+TY. The morphism γ is defined as a composite

αX+Y +αY T (X + Y )+TY /HT(X + Y )+X + Y + HTY + Y

δ H T (X + Y )+TY + Y where the components of δ are as follows: X

e0 HT(X + Y )

Hinl HT(X + Y ) H T (X + Y )+TY yY yy yy yy Hinl yy yy yy inryy H T (X + Y )+TY inl yy HTYll + Y TTT yy lll TTT yy ll TTTT y lll TTTT yy lll inl TT* |yy vlll Hinr+idY H T (X + Y )+TY + Y

Denote by h the unique homomorphism of (H + C )-coalgebras: Y T (X + Y )+TY γ /H T (X + Y )+TY + Y

h Hf+idY / TY αY HTY + Y Put:

h1 = h · inl : T (X + Y ) −→ TY and h2 = h · inr : TY −→ TY

The commutativity of the above square is (since αX+Y + αY is inverse to [τX+Y ,ηX+Y ]+[τY ,ηY ]) equivalent to the commutativity of the following four 14 Aczel, Adamek,´ Velebil diagrams: τX+Y HT(X + Y ) /T (X + Y )

(4) Hh1 h1 / HTY τY TY

ηX+Y X inl /X + Y /T (X + Y )

e0 HT(X + Y )

(5) Hh1 h1 HTY

inl HTY + Y / [τY ,ηY ] TY

ηX+Y Y inr /X + Y /T (X + Y )

(6) inr h1 HTY + Y / [τY ,ηY ] TY

[τY ,ηY ] HTY + Y /TY

(7) Hh2+idY h2 HTY + Y / [τY ,ηY ]TY

The square (7) asserts that h2 is an endomorphism of the ﬁnal (H + CY )- coalgebra, i.e.,

h2 = id TY .

The diagram (4) tells us that h1 is a homomorphism of H-algebras, thus, by Substitution Theorem, h1 is uniquely determined by

h1 · ηX+Y : X + Y −→ TY. The last morphism is determined uniquely by its ﬁrst component, s : X −→ TY, because, using the diagram (6), we conclude that the second component of h1 · ηX+Y is ηY (=[τY ,ηY ] · inr). That is h1 · ηX+Y =[s, ηY ] or, equivalently h1 = [s, ηY ]. Finally, the diagram (5) asserts that

s = h1 · ηX+Y · inl =[τY ,ηY ] · inl · Hh1 · e0 = τY · Hh1 · e0 = h1 · τX+Y · e0 = h1 · e whereweusenaturalityofτ and the above triangle for e0. 15 Aczel, Adamek,´ Velebil

We have proved that there exists s : X −→ TY with s = h1 · e = [s, ηY ] · e.

Moreover, this s is uniquely determined by h =[h1, id Y ], hence s is uniquely determined by e. ✷

4A Completely Iterative Monoid of an Object

We can view the procedure of Section 2 globally by working, instead of in the given category A, in the endofunctor category [A, A]. Here H is an object. If H is iteratable, then it deﬁnes another object, T , together with a morphism (natural transformation)

α : T −→ HT +1A. This is a coalgebra of the functor H :[A, A] −→ [A, A] defined on objects by H(S)=H · S +1A (for all S : A −→ A) and analogously on morphisms. We prove below that T is a final H-coalgebra. Within the realm of locally small categories with coproducts this global approach is equivalent to that of Section 2 as we now prove. Proposition 4.1 Let A be a locally small category with coproducts. Then, for every endofunctor H, the following are equivalent: (i) H is an iteratable object of [A, A], i.e., a final H-coalgebra exists.

(ii) H is an iteratable endofunctor, i.e., all final (H + CX )-coalgebras exist. Remark. The proof that (ii) implies (i) holds for all categories A with binary coproducts. For the proof that (i) implies (ii), only copowers indexed by hom-sets of the category A are used. Thus the proposition holds also for any poset A and for the category A = Setfin of finite sets.

Proof. (i) implies (ii): For every pair X, Y of objects in A denote by KX,Y the following endofunctor KX,Y A = Y A(X,A) for objects A, analogously for morphisms. This is just a left Kan extension of Y , considered as a functor 1 −→ A along X : 1 −→ A. In fact, for every functor P : A −→ A we have a bijection

KX,Y −→ P Y −→ PX 16 Aczel, Adamek,´ Velebil

natural in P , which to every natural transformation ϕ : KX,Y −→ P assigns the composite u Y ϕX Y / /PX A(X,X) where u is the id X -injection. Conversely, given a morphism f : Y −→ PX, the corresponding natural transformation f @ : K −→ P has components X,Y @ −→ fA : Y PA h:X−→ A f determined by Y /PX Ph /PA. Let α : T −→ HT +1A be a ﬁnal H-coalgebra. We are to show that

αX : TX −→ HTX + X isaﬁnal(H + CX )-coalgebra for every X. In fact, for every (H + CX )-coalgebra b : Y −→ HY + X when composing b with Hu + id : HY + X −→ H Y + X = HKX,Y +1A X A(X,X) we obtain a morphism ¯ b : Y −→ HKX,Y X which by the above adjointness yields an H-coalgebra ¯@ b : KX,Y −→ HKX,Y . Let ϕ be the unique homomorphism of H-coalgebras ¯b@ / KX,Y HKX,Y

ϕ Hϕ / T α HT Then ϕ = f @ for a unique f : Y −→ TX, and the commutativity of the above square yields the commutativity of Y b /HY + X

f Hf+id X / TX αX HTX + X

(ii) implies (i): It has been noted above (see 2.6) that if αX : TX −→ HTX+ X denotes a ﬁnal coalgebra for H + CX , then the assignment X → TX can be extended to a functor T : A −→ A.

Analogously one can show that the collection of all αX ’s constitutes a natural transformation α : T −→ H·T +1A.Thus,α makes T an H-coalgebra. 17 Aczel, Adamek,´ Velebil

To verify that α is indeed a ﬁnal H-coalgebra, consider any coalgebra β : S −→ H · S +1A.ForeachX in A there exists a unique morphism fX : SX −→ TX such that the square

βX SX /HSX + X

fX HfX +id / TX αX HTX + X commutes. It is easy to show that the collection of fX ’s is natural in X and that it deﬁnes a unique natural transformation f : S −→ T for which the square β / S HS +1A

f Hf+id / T α HT +1A commutes. ✷

Now [A, A] is a monoidal category with composition as tensor product and 1A as a unit. Moreover composition distributes with coproducts on the left: (H +K)·L =(H ·L)+(K ·L). This leads us to consider an arbitrary monoidal category (B, ⊗,I) with coherence isomorphisms (for all H, K, L in B):

lH : I ⊗ H −→ HrH : H ⊗ I −→ H and

aH,K,L : H ⊗ (K ⊗ L) −→ (H ⊗ K) ⊗ L satisfying the usual laws, and distributive in the following sense: Deﬁnition 4.2 (i) A monoidal category is called left distributive if it has binary coproducts and the canonical morphisms

dH,K,L :(H ⊗ L)+(K ⊗ L) −→ (H + K) ⊗ L are all isomorphisms. (ii) An object H of a monoidal category B is said to be iteratable provided that the endofunctor H : B −→ B deﬁned by H(B)=H ⊗ B + I has a ﬁnal coalgebra. (iii) A left distributive monoidal category with each object iteratable is called an iteratable category. Examples 4.3 18 Aczel, Adamek,´ Velebil

(i) The category Cont[Set, Set] of continuous endofunctors (i.e., those preserving ωop-limits) of Set is iteratable: we know that continuous functors are closed under (a) composition (here: a tensor product) (b) identity functor (here: unit I) and (c) finite coproducts, thus Cont[Set, Set] is a distributive monoidal subcategory of [Set, Set]. Now, as observed in 2.7, every continuous functor is iteratable, and the completely iterative monad is also continuous; therefore Cont[Set, Set]is an iteratable category. (ii) More in general, Cont[A, A] is an iteratable category for every category A satisfying conditions 1.–3. of Example 2.7. (iii) The category Fin[Set, Set] of all finitary endofunctors of Set (i.e., those preserving filtered colimits) is an iteratable category. In fact, finitary functors are closed under composition, identity functor, and finite coproducts, thus, Fin[Set, Set]isa distributive monoidal subcategory of [Set, Set]. A completely iterative monad T of a finitary functor H exists, since finitary functors always have final coalgebras, see [8], Theorem 1.2, and each H + CX is clearly finitary. However, this monad is seldom finitary, see 2.14.(ii) above. We can form a finitary part Tfin of every monad T on Set (see [14]): it is obtained by restricting the underlying functor T to the full subcategory Setfin of finite sets, and then forming a left Kan extension of T/Setfin along the embedding of Setfin in Set. It is quite easy to verify that Tfin is a final coalgebra of the endofunctor H · ( )+1Set of Fin[Set, Set]. In fact, given any coalgebra

S −→ H · S +1Set (with S finitary, of course) the unique H-homomorphism f : S −→ T is easily seen to have a factorization through the canonical morphism m : Tfin −→ T . That is, we have a unique f : S −→ Tfin with f = m·f .And f is the unique homomorphism of coalgebras of the functor H ·( )+1Set, considered as an endofunctor of Fin[Set, Set]. Example: the functor H : Set −→ Set with HZ = Z × Z has the completely iterative monad T where TX are all binary trees with leaves indexed in X.AndTfin is the finitary monad where Tfin X are all binary trees with leaves indexed in a finite subset of X. (iv) More generally, if A is a locally finitely presentable category (see [5]) then 19 Aczel, Adamek,´ Velebil

Fin[A, A], the category of finitary endofunctors of A, is iteratable. The argument is the same: we form a completely iterative monad T in [A, A], which exists by Theorem 1.2 in [8] (although formulated for Set,itholds in all locally presentable categories) and then take a finitary part Tfin just as in (iii) above. (v) In the formal language theory we study the monoidal category B of all subsets of Γ∗ (for the basic alphabet Γ) which is a complete lattice exp(Γ∗) considered as a category. Here

L1 ⊗ L2 = concatenation of languages L1 and L2 and

L1 + L2 = union of languages L1 and L2. Every language L is iteratable with T = L∗ (Kleene star). (vi) Kleene algebras, cf. [12], are distributive symmetric monoidal categories (B, ⊗,I)whereB is a join semilattice such that each of the endofunctors H = H ⊗ ( )+I has a least fixed point H∗. This is closely related to our concept, except that we are concerned with the largest fixed points. Since the basic motivation for Kleene algebras is the previous example of formal languages and since this example has the property that H = H ⊗ ( )+I always has a unique fixed point, the choice of least or largest seems to be rather arbitrary. (vii) Let B have a terminal object 1 and limits of ωop-chains which commute with both the tensor product and the binary coproduct. Then every object H is iteratable and T is a limit of the following countable chain: ⊗ id H⊗(H⊗!+id)+id 1 o ! H ⊗ 1+I o H !+ H ⊗ (H ⊗ 1+I)+I o ... For example: the category of sets with a binary product as ⊗ and a terminal object I as a unit is an iteratable category: the (polynomial) functor H(Z)=H × Z + I has a final coalgebra T = H∞ for every set H. And the cartesian closed category Cat of all small categories is an iteratable category. Every small category H is iteratable with T =1+H +(H × H)+...+ Hω (viii) Let H be an iteratable Abelian group (where we consider the category Ab of all Abelian groups with the usual tensor product). Then a final coalgebra of H is, as we show below in 4.6, a monoid in the given monoidal category — thus, in the present case T is a ring. 20 Aczel, Adamek,´ Velebil

Notation 4.4 For every iteratable object H we denote by T and α : T −→ H ⊗ T + I a ﬁnal coalgebra of H. By Lambek’s Lemma, T is a coproduct of H ⊗ T and I with injections τ : H ⊗ T −→ T and η : I −→ T where α−1 =[τ,η]. This makes T into an algebra for the functor H ⊗ . More generally, every object S of B yields an algebra

aH,T,S/ τ⊗idS/ τS ≡ H ⊗ (T ⊗ S) (H ⊗ T ) ⊗ S T ⊗ S

(where aH,T,S is the associativity isomorphism). Put

rS / η⊗idS/ ηS ≡ S I ⊗ S T ⊗ S Substitution Theorem 4.5 Let H be an iteratable object in a monoidal category B. For every morphism s : S −→ T in B there is a unique homomorphism s : T ⊗ S −→ T of algebras of type H ⊗ with

s = s· ηS.

Proof. This is quite analogous to the proof of Theorem 2.11. We turn the object T ⊗ S + T into an H-coalgebra as follows:

α⊗idS +α ∼ T ⊗ S + T /(H ⊗ T + I)+(H ⊗ T + I) = ∼ = H ⊗ (T ⊗ S)+S + H ⊗ T + I β /H ⊗ (T ⊗ S + T )+I where the isomorphism in the middle is the combination of the canonical ⊗ ⊗ ∼ ⊗ ⊗ ⊗ −1 ⊗ ⊗ −→ isomorphism (H T +I) S = (H T ) S+I S with aH,T,S :(H T ) S ⊗ ⊗ −1 ⊗ −→ H (T S)andrS : I S S,andβ has the following components (from left to right): S H ⊗ T

s H⊗inr H ⊗ (T ⊗ S) T H ⊗ (T ⊗ S + T ) ss ss ⊗ α ss H inl ss ss inlss H ⊗ (T ⊗ S + T ) ⊗ ss iii UUU H T + I ss iii I UUU ss iiii UUUU ss iii UUUH⊗inr+id ss iiii inr UUU* yss tiiii inr H ⊗ (T ⊗ S + T )+I The unique homomorphism

f =[f1,f2]:T ⊗ S + T −→ T 21 Aczel, Adamek,´ Velebil of H-coalgebras is the unique morphism of B which has the second component, f2, an endomorphism of the ﬁnal H-coalgebra α : T −→ H ⊗ T + I,thus,

f2 = id, and for the ﬁrst component we get two commutative diagrams: one tells us that f1 is a homomorphism of (H ⊗ )-algebras, and the other one is as follows:

ηS S /T ⊗ S s T

α f1 H ⊗ T + I

H⊗f2+id ⊗ / H T + I [τ,η] T

Since f2 = id, this diagram tells us that f1 · ηS = s, which proves the Substi- tution Theorem. ✷

Corollary 4.6 For every iteratable object H,aﬁnalH-coalgebra T is a monoid with respect to η : I −→ T and µ = id T : T ⊗ T −→ T.

Proof. In fact, the equality µ · ηT = id follows from the deﬁnition of µ and the other two equalities deﬁning monoids in (B, ⊗,I) easily follow from the uniqueness of s. ✷

Deﬁnition 4.7 The monoid from the above Corollary is called a completely iterative monoid generated by an iteratable object H.

We now show a remarkable property of completely iterative monoids: if a left distributive monoidal category is an iteratable category, then the assignment of a completely iterative monoid is a functor which underlies a completely iterative monoid again. Example: Set is an iteratable category, see item (vii) in 4.3, and the assignment H → H∞ is, as an object of [Set, Set], itself a completely iterative monoid generated by Id. For every monoidal category B we consider [B, B] as a monoidal category (with the “pointwise” tensor product P ⊗ Q : H → P (H) ⊗ Q(H)andthe “pointwise” unit CI : H → I). Theorem 4.8 Suppose that (B, ⊗,I) is an iteratable category. Then the following hold: (i) The functor category [B, B] is an iteratable category. 22 Aczel, Adamek,´ Velebil

(ii) The assignment of a completely iterative monoid to every object is an endofunctor of B and which, as an object of [B, B], is itself a completely iterative monoid generated by Id B. Proof. (i). First observe that [B, B] is indeed a distributive monoidal category, since the required structure is transported pointwise from B. Consider now any functor H : B −→ B. To show that the derived functor H = H ⊗ ( )+CI :[B, B] −→ [B, B] has a ﬁnal coalgebra, form, for each B in B, a ﬁnal coalgebra of the functor H(B) ⊗ ( )+I:

aB : T (B) −→ H(B) ⊗ T (B)+I. It is clear that there is a unique canonical way of making the assignment B → T (B) functorial: consider any morphism f : B −→ C in B and deﬁne T (f):T (B) −→ T (C) to be the unique morphism such that the diagram ⊗ id T (B) aB /H(B) ⊗ T (B)+IH(f) T (B)+ /H(C) ⊗ T (B)+I

T (f) H(C)⊗T (f)+id / ⊗ T (C) aC H(C) T (C)+I commutes. It is easy to show that this indeed deﬁnes a functor T : B −→ B.

The collection of morphisms aB : T (B) −→ H(B) ⊗ T (B)+I is natural in B and thus deﬁnes a coalgebra for H ⊗ ( )+CI :

a : T −→ H ⊗ T + CI . To show that a is a ﬁnal coalgebra, consider any coalgebra

b : S −→ H ⊗ S + CI .

For every B in B there exists a unique morphism λB : S(B) −→ T (B)such that the square S(B) bB /H(B) ⊗ S(B)+I

λB H(B)⊗λB+id / ⊗ T (B) aB H(B) T (B)+I commutes. To show that the collection (λB) constitutes a natural transformation, observe that, for every f : B −→ C,both

λC · S(f):S(B) −→ T (C)andT (f) · λB : S(B) −→ T (C) are homomorphisms of (H(C) ⊗ ( )+I)-coalgebras from

(H(f) ⊗ S(B)+id) · bB : S(B) −→ H(C) ⊗ S(B)+I to

aC : T (C) −→ H(C) ⊗ T (C)+I and therefore they are equal. We have formed a ﬁnal coalgebra

a : T −→ H ⊗ T + CI . 23 Aczel, Adamek,´ Velebil

(ii) Put Φ(B)=TB for every object B and extend the assignment B → Φ(B) toafunctorΦ:B −→ B as in the ﬁrst part of the proof. Let us now consider the functor

Id ⊗ ( )+CI :[B, B] −→ [B, B].

The collection of morphisms aB :Φ(B) −→ B ⊗ Φ(B)+I deﬁnes a coalgebra for Id ⊗ ( )+CI :

a : T −→ Id ⊗ T + CI and it follows from the ﬁrst part of the proof that this coalgebra is ﬁnal. To conclude the proof use now the monoidal version of the existence of a completely iterative monad from 4.6. ✷

5 Conclusions and Connections

We have seen that with every endofunctor H satisfying rather weak hypothesis we can associate a monad T by assigning to every object X a final coalgebra, TX, of the endofunctor H( )+X. This monad is specified by the Substitution Theorem for final coalgebras. It has the remarkable property that every ideal system of recursive equations in it has a unique solution (Solution Theorem). Even in the basic case, where a signature in (or a polynomial endofunctor of) the category of sets is given, this monad T of finite and infinite trees over given variables seems to be new. We can introduce T more globally, as an object of the endofunctor category [A, A]: here, T is simply a final coalgebra of the endofunctor H · + Id, and this generalizes to monoidal categories satisfying mild additional assumptions. One of the sources for the ideas in this paper has been the so-called hyperset theory which is an approach to axiomatic set theory that allows for non- well-founded sets. In hyperset theory the Foundation Axiom of the standard axiomatic set theory ZFC is replaced by the Anti-Foundation Axiom, AFA. This axiom expresses that every flat system of set equations

xi = {xj | j ∈ Ji} (i ∈ I) has a unique solution in the hyperuniverse of possibly non-well-founded sets. Here the xi are variables, indexed by a set I,andeachJi is a subset of I. By considering the variables as atoms (urelemente) taken from a class X,the right hand sides become sets of atoms taken from a hyperuniverse V [X]of possibly non-well-founded sets that are built out atoms taken from X.The hyperuniverses V [Y ], for classes Y , satisfy relativized forms of AFA which generalize to non-ﬂat systems of equations. In general we get the Solution Theorem for the V [Y ]. It was only the recent collaboration between the authors that led us to realise that the Substitution Theorem just expresses that the natural endofunctor T determined by the operation X → V [X] forms a monad T on the category of classes and that the Solution Theorem just expresses that T is the 24 Aczel, Adamek,´ Velebil completely iterative monad generated by the endofunctor Pow on the category of classes that associates to each class X the class Pow(X) of its subsets. The approach to working with hypersets using the Substitution and Solu- tion Theorems has been presented in the books [9], [1] and [10].

References

[1] Aczel, P., “Non-Well-Founded Sets”, CSLI Lecture Notes Number 14, CSLI Publications, Stanford, 1988.

[2] Ad´amek, J., Free Algebras and Automata Realizations in the Language of Categories, Comment. Math. Univ. Carolinae 15 (1974), 589–602.

[3] Ad´amek, J., and V. Koubek, On the Greatest Fixed Point of a Set Functor, Theor. Comp. Science 150 (1995), 57–75.

[4] Ad´amek, J., and V. Trnkov´a, “Automata and Algebras in Categories”, Kluwer Academic Publishers, Dordrecht, 1990.

[5] Ad´amek, J., and J. Rosick´y, “Locally presentable and accessible categories”, Cambridge University Press, 1994.

[6] America, P., and J. Rutten, Solving Reﬂexive Domain Equations in a Category of Complete Metric Spaces, J. Comp. System Sci. 39 (1989), 343– 375.

[7] Goguen, J. A., S. W. Thatcher, E. G. Wagner and J. B. Wright, Initial Algebra Semantics and Continuous Algebras, manuscript.

[8] Barr, M., Terminal Coalgebras in Well-founded Set Theory, Theor. Comp. Science 114 (1993), 299–315.

[9] Barwise, J., and J. Etchemendy, “The Liar:An Essay in Truth and Circularity”, Oxford University Press, 1987.

[10] Barwise, J., and L. Moss, “Vicious Circles”, CSLI Lecture Notes Number 60, CSLI Publications, Stanford, 1996.

[11] Elgot, C. C, Monadic Computation and Iterative Algebraic Theories,in: “Logic Colloquium ’73” (eds:H. E. Rose and J. C. Shepherdson), North- Holland Publishers, Amsterdam, 1975.

[12] Kozen, D., A Completeness Theorem for Kleene Algebras and the Algebra of Regular Events, Inform. and Comput. 110 (1994), 366–390.

[13] Lambek, J., A Fixpoint Theorem for Complete Categories, Mathematische Zeitschrift 103 (1968), 151–161.

[14] Linton, F. E. J., Some aspects of equational categories, in:“Proc. Conf. Categorical Algebra, La Jolla 1965”, Springer-Verlag 1966, 84–94. 25 Aczel, Adamek,´ Velebil

[15] Moss, L., Parametric Corecursion, preprint, available at http://math.indiana.edu/home/moss/parametric.ps.

[16] Moss, L., Recursion and Corecursion Have the Same Equational Logic, preprint, available at http://math.indiana.edu/home/moss/eqcoeq.ps.

[17] Smyth, M. B., and G. D. Plotkin, The Category-theoretic Solution of Recursive Domain Equations, SIAM J. Comput., 11 (1982), 761–783.

[18] Tiurin, J., Unique Fixed Points vs. Least Fixed Points, Theoretical Computer Science 12 (1980), 229–254.

26 CMCS’01 Preliminary Version

From Varieties of Algebras to Covarieties of Coalgebras

Jiˇr´ıAd´amek 1,2

Department of Theoretical Computer Science Technical University of Braunschweig Braunschweig, Germany

Hans–E. Porst 3,4

Department of Mathematics University of Bremen Bremen, Germany

Abstract Varieties of F -algebras with respect to an endofunctor F on an arbitrary cocomplete category C are defined as equational classes admitting free algebras. They are shown to correspond precisely to the monadic categories over C. Under suitable assumptions satisfied in particular by any endofunctor on Set and Setop the Birkhoff Variety Theorem holds. By dualization, covarieties over complete categories C are introduced, which then correspond to the comonadic categories over C, and allow for a characterization in dual terms of the Birkhoff Variety Theorem. Moreover, the well known conditions of accessibilitly and boundedness for Set-functors F , sufficient for the existence of cofree F -coalgebras, are shown to be equivalent.

Introduction

What is a variety? A classical answer is: an equationally presented class of finitary algebras (such as groups, lattices, etc). Less classical answer: an equationally presented class of algebras with infinitary operations of possibly unbounded arities (such as complete semilattices or compact Hausdorff spaces). The first, classical, case corresponds precisely to algebras of a finitary monad

1 Support under Project SMS No. 34-99141-301 is gratefully acknowledged. 2 Email: [email protected] 3 This paper was written during an extended stay of the second author at the Department of Mathematics of the University of Stellenbosch, whose hospitality is gratefully acknowledged. 4 Email: [email protected] This is a preliminary version. The final version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Adamek´ and Porst over Set. The latter one, then, to algebras of an arbitrary monad over Set. This, however, has nothing to do with Set as a base category: we are going to introduce equations for —and equational classes of—F -algebras, where F is an endofunctor on any cocomplete category C. Those equational classes that have free algebras are called varieties. They are proved to precisely correspond to monadic categories over C. Although this is a folklore fact, it seems that it has never been really formulated. Our formulation is based on the construction of free F -algebras as a colimit of “term-objects” (which, in general, diverges: we do not assume that free F -algebras exist) presented by the first author in [2]. Functors F for which free F -algebras exist are called varietors in [6]. For all varietors on Set (and all varietors on “reasonable” categories preserving regular epimorphisms) the Birkhoff Variety Theorem generalizes to the present context: varieties are precisely the full subcategories of Alg(F ) which are closed under products, subalgebras and quotients. What is a covariety? It is a simple but important observation that for every endofunctor on a category C the category Coalg(F )ofallF -coalgebras is the dual of the category of F op-algebras, where F op is the endofunctor on Cop acting as F . Consequently, by simple dualization we obtain the concept of coequation and coequational class of coalgebras. Those coequational classes that have cofree coalgebras (i.e., which are varieties of F op-algebras) are called covarieties. And, for complete base categories, those are precisely the comonadic categories. If F is a covarietor, i.e., if all cofree coalgebras exist, and C = Set or C is “reasonable” and F preserves regular monomorphisms, then covarieties are precisely the full subcategories of Coalg(F )which are closed under coproducts, quotients and subcoalgebras. Covarieties—for bounded endofunctors on Set only—have been considered by various authors. The equivalence of our approach, when specialized to this particular case, to most of these concepts will be shown below. Which Functors are (Co-)Varietors? Varietors on Set have been completely characterized in [6] by the existence of arbitrarily large fixed points. A full characterization of covarietors on Set is not known, but several sufficient conditions are known: M. Barr shows in [9] that every accessible functor is a covarietor, and Y. Kawahara and M. Mori [17] prove that every bounded functor has a final coalgebra (from which it follows that every bounded functor is a covarietor). In the present note we prove that accessibility is, in fact, equivalent to boundedness, and both are equivalent to F being small, i.e., a small colimit of hom-functors. This seems to be the first time that the implication accessible =⇒ small has been properly proved (but see e.g. P. Freyd [11], which contains this result implicitly).

1 Algebras and coalgebras with respect to a functor

Let F : C → C be an endofunctor of some category C. Categories Alg(F )and Coalg(F ) are deﬁned as follows. 28 Adamek´ and Porst

Objects of Alg(F ), called F -algebras (over C),arepairs(C, αC )whereC is a C-object and αC : FC → C is a C-morphism. Morphisms f :(C, αC ) → (D, αD)ofAlg(F ), called F -algebra homomorphisms,areC -morphisms f : C → D making the diagram

Ff FC /FD

αC αD / C f D commute.

Objects of Coalg(F ), called F -coalgebras (over C),arepairs(C, αC )where C is a C-object and αC : C → FC is a C-morphism. Morphisms f :(C, αC ) → (D, αD)ofCoalg(F ), called F -coalgebra homomorphisms,areC-morphisms f : C → D making the diagram f C /D

αC αD / FC Ff FD commute. Composition and identities in Alg(F )andCoalg(F ) respectively are those of C. Alg(F )andCoalg(F ) are concrete categories over C in that they are equipped with canonical underlying functors

F U : Alg(F ) → C and UF : Coalg(F ) → C respectively 5 . The dual of a functor F : C → D is the functor F op : Cop → Dop acting on objects and morphisms as F . With these notations one has

1.1 Lemma For any functor F : C → C the following hold: 1. Coalg(F )=(Alg(F op))op op 2. UF =(F op U)

1.2 Example Let Ω be a signature in Birkhoﬀ’s sense, i.e., Ω = (Ωn)n∈N is a countable family of sets Ωn. We shall describe the category AlgΩof Ω-algebras—up to a concrete isomorphism— as (Alg(F ),U) for a functor → F = FΩ : Set Set as follows: × n FΩ assigns to a set X the set n∈N Ωn X . Correspondingly FΩ assigns → × n × n → toamap f : X Y the map n∈N Ωn f , i.e., the map n∈N Ωn X × n n∈N Ωn Y mapping a pair (ω,(x1,...,xn)) to the pair (ω,(fx1,...,fxn)). 5 Whenever confusion is unlikely to arise we will omit the subscript F . 29 Adamek´ and Porst

Functors of the form FΩ are called polynomial functors. We collect a number of well known properties of categories of F -algebras in the case C = Set as follows:

1.3 Theorem For every endofunctor F of Set the following hold: 1. Alg(F ) has all limits and these are created by U. 2. Alg(F ) has all colimits. Those which are preserved by F are created by U. 3. Alg(F ) has regular factorizations of homomorphisms; these are created by U.

1.4Remark The above properties hold in more general situations than just over Set: as an inspection of the essentially standard proofs (see e.g. [3] or [6]) shows, the following properties of the category C = Set and the functor F respectively are needed only: • C is complete, cocomplete, (regularly) co-wellpowered and has (regular epi, mono)-factorizations of homomorphisms. • F preserves (regular) epimorphisms. These conditions can be assumed to be satisﬁed in particular for every endofunctor on the category Setop, too. One only has to recall the following result of V.Trnkov´a (see [6, III.4.5-6]): Every endofunctor F on Set either preserves monomorphisms, or there is a monomorphism-preserving functor F which coincides with F on all nonempty sets and functions, and F ∅= ∅= F ∅.

It follows that, concerning categories Alg(F )andCoalg(F )overSet,oneal- ways may assume that F preserves monomorphisms: Coalg(F )andCoalg(F ) are (concretely) isomorphic, while Alg(F )=Alg(F ) whenever F fails to preserve monomorphisms. By means of Lemma 1.1 one thus gets by dualization the following properties of categories of F -coalgebras:

1.5 Theorem Let F be an endofunctor of Set. Then the following hold: 1. Coalg(F ) has all colimits and these are created by U. 2. Coalg(F ) has all limits. Those which are preserved by F are created by U. 3. Coalg(F ) has regular factorizations for homorphisms; these are created by U.

30 Adamek´ and Porst

2 Free algebras and cofree coalgebras

2.1 Example For polynomial endofunctors FΩ on Set, the concept of free algebra X on a set X of generators is well known. We can describe it either ∪ recursively as X = i<ωXi where

X0 = X +Ω0 = X + FΩ∅ terms of depths 0 are variables and nullary operations { | ∈ ∈ } Xi+1 = X + (ω,t0,...,tn−1) ω Ωn,t0,...,tn−1 Xi

= X + FΩXi terms of depths i+1 Or directly: X is the algebra of all ﬁnite “properly” labeled trees. “Properly” means that a node with n>0 children is labeled by an n-ary operation, and a leaf is labeled by a variable or a nullary operation. We have the universal arrow ηX : X → X , embedding X into X .

2.2 Remark Free F -algebras on X for an object X (of “variables”) of C can be deﬁned for all functors F : C → C as pairs consisting of an F -algebra

ϕX FX −−→ X and a morphism ηX : X → X with the universal property that given an F -algebra (C, αC ) and a morphism f : X → C of C there exists a unique F -homomorphism f extending f, i.e., such that the following diagram commutes.

ϕX / o FX X ηX }X }} }} Ff f }} }} f }~} αC FC /C

In other words, ηX is a universal arrow of the forgetful functor U : Alg(F ) → C.

2.3 Lemma Let F : C → C be a functor where C has ﬁnite coproducts and X a C-object. The following are equivalent for a morphism ιX : X + FIX → IX with components ηX : X → IX and αX : FIX → IX .

(i) (IX ,ιX ) is the initial algebra of type X + F (−).

(ii) (IX ,αX ) is the free F -algebra on X with universal morphism ηX .

2.4Corollary For a free F -algebra X one has X X + FX . This is Lambek’s Lemma (saying that initial F -algebras are ﬁxed points of F ) applied to FX = X + F (−). It is good to have a name for endofunctors F such that every object of C admits a free F -algebra, that is, such that U has a left adjoint.

2.5 Deﬁnition ([6]) An endofunctor F : C → C is called a varietor provided that a free F -algebra exists on every C-object. 31 Adamek´ and Porst

2.6 Examples 1. Polynomial endofunctors on Set are varietors. 2. If, more generally, C has colimits and ﬁnite products such that colimits of ω-chains commute with ﬁnite products, then every polynomial endofunctor FΩ on C is a varietor. In fact, it is easy to see that, for all Ω, FΩ preserves colimits of ω-chains. And then free algebras can be obtained by the following

2.7 Finitary Free-Algebra Construction (see [2]): This is an application of the famous construction of an initial F-algebra (the free F -algebra on 0, an initial object of C) as a colimit of the chain

2 0 −→! F 0 −F→! F 20 −F−→! F 30 ··· to the functor FX = X + F (−) (see Lemma 2.3 above). Let C have countable colimits. Given an object X in C we deﬁne an ω-chain Xi (i<ω) as follows:

X+F (X+F !) 0 −→! X + F 0 −X−−+F→! X + F (X + F 0) −−−−−−−→ (X + F (X + F (X + F 0)) ···

That is: • −→! X0 =0,X1 = X + F 0=X + FX0 and x0,1 =0 X + F 0 is the unique morphism • ≤ Xi+1 = X + FXi and xi+1,j+1 = X + Fxi,j for all i j

Claim: If F —and thus X + F —preserve a colimit X = colimi<ωXi of the above chain, then X is a free F -algebra on X. More detailed: suppose −→xi (Xi X ) is a colimit cocone. If X + F preserves that colimit we have a unique morphism

ϕX : X + FX → X with ϕX ◦ (X + Fxi)=xi+1

The two components ηX : X → X and αX : FX → X of ϕX form a free F -algebra on X.

Proof For every F -algebra (C, αC ) and any morphism (“assignment to variables”) f : X → C deﬁne a cocone of the above chain ( computation of terms) recursively as follows: ◦ f0 =! and fi+1 =[f,αC Ffi ]

−→xi −f→ Then the (unique) factorization Xi X C = fi gives the (unique) homomorphism f :(X ,αX ) → (C, αC )withf = f ◦ ηX . ✸

2.8 Examples 1. For the endofunctor FY =1+Y × Y (i.e., one constant

and one binary operation) on Set, we know that the terms in Xi are just the binary trees of depths ≤ i labelled in X + 1. This corresponds precisely to the construction above. 32 Adamek´ and Porst

2. For the endofunctor FY = Y N (i.e., one ω-ary operation) on Set we again

might form the sets Xi of terms, but here the colimit after ω steps does not give a free F -algebra, of course: we need ω1 steps of the following

2.9 Free-Algebra Construction (see [2] or [6, IV.3.2]): Let C be a cocomplete category. For every endofunctor F on C and every object X (“of variables”) in C define a transfinite chain of objects Xi (i any ordinal) and connecting morphisms → ≤ xi,j : Xi Xj (i j) by the following transfinite induction:

• −→! X0 =0,X1 = X + F 0withx0,1 being the unique morphism 0 X + F 0 • ≤ Xi+1 = X + FXi for all ordinals i, xi+1,j+1 = X + Fxi,j for all i j • Xj = colimi

Proof Given an F -algebra (C, αC ) and a morphism f : X → C we deﬁne a → cocone fi : Xi C (i an ordinal) by transﬁnite induction as above (leaving ◦ out the limit steps; compatibility fj xi,j = fi (i

2.10 Deﬁnition ([6]) A functor F : C → C is called constructive varietor provided that its Free-Algebra Construction 2.9 stops for each C-object X. A functor can be a varietor, though the above chain-construction fails to stop for every X (see e.g. [6, IV.3.A]). However we have the following results:

2.11 An endofunctor F on a cocomplete category which preserves colimits of λ-chains for some inﬁnite cardinal λ is a constructive varietor.

In fact, if F preserves Xλ = colimj<λXj , then the free algebra construction stops after λ steps.

2.12 Theorem ([6, 4.3], [4]) Every varietor on each of the categories Set, op 6 op Set , Veck and Veck is a constructive varietor. 6 This is the category of vector spaces over some ﬁeld k. 33 Adamek´ and Porst

Calling an endofunctor F on Set trivial iﬀ F is constant on nonempty sets one can prove (note that trivial endofunctors clearly are varietors):

2.13 Theorem ([6]) A non-trivial endofunctor F on Set is a (constructive) varietor iﬀ F has arbitrarily large ﬁxed points.

2.14Cofree coalgebras are the corresponding dualization of free algebras. A cofree F -coalgebra (with respect to a functor F : C → C)onaC-object X (“of colours”) is a coalgebra ψX : X → FX together with a (“colouring”) morphism ρX : X → X having the universal property that given an F -coalgebra (C, αC ) and a morphism f : C → X of C there exists a unique F -coalgebra homomorphism f extending f, i.e., such that the diagram αC / ~C FC ~~ f ~~ ~~ f Ff ~~ ~~ o ψX / X ρX X FX commutes. In other words, ρX is a couniversal arrow of the forgetful functor U : Coalg(F ) → C.

2.15 Deﬁnition An endofunctor F : C → C is called a covarietor provided that a cofree F -coalgebra exists on every C-object.

This terminology is justiﬁed by the following remark based on 1.1 and 2.3.

2.16 Remark The following are equivalent for any F : C → C: • F is a covarietor. • F op is a varietor. In case C has ﬁnite products, another equivalent condition is: • For every object X in C the functor F X = X × F has a terminal (= ﬁnal) coalgebra.

Dualization of the free-algebra construction above gives the following

2.17 Cofree Coalgebra Construction: Let C be a complete category. For every endofunctor F on C and every object X (“of colours”) in C deﬁne i a transﬁnite cochain of objects X (i any ordinal) and connecting morphisms i,j i → j ≥ x : X X (i j) as follows (where 1 denotes a terminal object of C):

• 0 i × 1,0 × −→! X =1,X = X F 1withx : X F 1 1 the unique morphism • i+1 × i i+1,j+1 × i,j ≥ X = X FX for all ordinals i, x = X Fx for all i j • j i j,i X = limi

If this cochain construction stops after k steps, i.e, if k is an ordinal such k,k+1 × k → k k that x : X FX X is an isomorphism, then X is a cofree F - k,k+1 k → coalgebra on X. More detailed: Denoting the inverse of x by ϕX : X × k X FX with components

k → k k → αX : X FX and ρX : X X these form a cofree F -coalgebra on X. For an F -coalgebra (C, αC )anda morphism f : C → X the extension f of f is the k-th member of the cocone i → i f : C X which is deﬁned by transﬁnite induction (leaving out the limit steps) as follows:

0 i+1 i ◦ f =! and f = f,Ff αC . 2.18 Deﬁnition A functor F : C → C is called constructive covarietor provided that its Cofree-Coalgebra Construction 2.17 stops for each C-object X. As the dual of Corollary 2.11 the following holds:

2.19 Corollary An endofunctor F on a complete category which preserves limits of λ-cochains for some inﬁnite cardinal λ is a constructive covarietor.

2.20 Examples 1. Polynomial functors on Set are covarietors (here λ = ω). Ci 2. Generalized polynomial functors on Set, i.e., functors FY = i∈I Y for a given family (Ci)i∈I of (not necessarily finite) sets are covarietors (again, λ = ω). 3. Every endofunctor on Set which has arbitrarily large exponential fixed points (i.e., there are arbitrarily large sets X such that each set Y with cardX ≤ cardY ≤ card expX is a fixed point of F ) is a covarietor (see [4]). Compare with Theorem 2.13.

3 Varieties and Covarieties

The following deﬁnition generalizes concepts from [1]:

3.1 Deﬁnitions Let F be an endofunctor of a cocomplete category C. Using the notation Xi and fi as in 2.9 we deﬁne: → 1. An equation arrow over X is a regular epimorphism e: Xi E for some ordinal i.AnF -algebra (C, αC )issaidtosatisfy e provided that for every → morphism f : X C the morphism fi factors through e: e / Xi ? E ?? ?? ?? ?? fi ? ? C 35 Adamek´ and Porst

2. For any class E of equation arrows, Alg(F, E) denotes the full subcategory of Alg(F ) spanned by all F -algebras satisfying every e ∈E. Such categories are called equational categories (of F -algebras) over C. 3. An equational category Alg(F, E)overC will be called a variety (of F - algebras) over C provided that the underlying functor

UE = U|Alg(F,E) : Alg(F, E) → C

has a left adjoint.

3.2 Remarks 1. Equations in classical (ﬁnitary) universal algebra are pairs of terms, i.e, parallel pairs of morphisms

→ u, v :1 X = Xω

An algebra (C, αC ) satisﬁes this equation iﬀ for every morphism f : X →

C the unique homomorphism f = fω extending f merges u and v, i.e,

f ◦ u = f ◦ v.

This is equivalent to the satisfaction, in the above sense, of the equation arrow e: X → E which is a coequalizer of u and v. In general, every pair of parallel morphisms (with C-objects A, X and an ordinal i) → u, v : A Xi → in C defines an equation arrow e: Xi E, a coequalizer of u and v. An algebra (C, αC ) “satisfies u = v” (in the expected sense: for every → ◦ ◦ morphism f : X C we have fi u = fi v)iff(C, αC ) satisfies e in the above sense. 2. If the base category C has kernel pairs, then, conversely, equation arrows can be substituted by parallel pairs: given a regular epimorphism → → e: Xi E,denotebyu, v : A Xi akernelpairofe.Thenanalgebra satisfies u = v iff it satisfies e. Observe here that the index i can be upgraded arbitrarily: given a → parallel pair u, v : A Xi and an ordinal j>i, put u = xi,ju and v = xi,jv. Then the equations u = v and u = v are satisfied by the same algebras. 3. Let C have kernel pairs and let F be a constructive varietor. It follows from 2. that all equations we have to consider are of the form

u, v : A → X

for objects A, X in C: in fact, upgrade any given parallel pair to an ordinal

j such that X = Xj . Here, the satisfaction of u = v by (C, αC )means

that for every f : X → C the unique homomorphism f :(X ,ϕX ) → 36 Adamek´ and Porst

(C, αC )mergesu and v. Since all homomorphisms h on (X ,ϕX )have

the form h = f (for f = h ◦ ηX ), we see that (C, αC ) satisﬁes u = v

iﬀ hu = hv for all homorphisms h:(X ,ϕX ) → (C, αC ). Thus we can, equivalently, work with equation arrows

e: X → E

which are regular quotients, in C,offreeF -algebras. 4. Suppose that F is a constructive varietor such that Alg(F ) has coequalizers (e.g., whenever C has kernel pairs and regular factorizations of morphisms, is regularly cowellpowered and F preserves regular epimorphisms, see Remark 1.4). Then instead of regular epimorphisms e: X → E in C we can work with regular epimorphisms in Alg(F ). For that purpose recall the following notions: (a) An object I in a category C is called injective w.r.t. a given morphism e: C → D provided that each morphism h: C → I factorizes over e:

e / h: CD D DD DD DD DD h DD D" I (b) Given a class E of homomorphisms, we denote by InjE the injectivity class of E, i.e., the full subcategory of Alg(F ) spanned by all algebras injective w.r.t. each e ∈E. (c) The injectives w.r.t. all regular monomorphisms are called the regular injectives. (d) The dual notions are (regular) projective and projectivity class ProjE. Now observe that for every equation u, v : A → X we have a new equation u ,v : A → X which is satisﬁed by precisely the same algebras

(C, αC ) (because, given a homomorphism h:(X ,ϕX ) → (C, αC ), then → ¯ (hu) = hu and (hv) = hv ). Thus, ife ¯:(X ,ϕX ) (E,αE¯) denotes

a coequalizer of u and v in Alg(F ), then for every algebra (C, αC )we have

(C, αC ) satisﬁes e ⇐⇒ (C, αC ) is injective w.r.t.e. ¯

5. Consider the base category C = Set. It then follows from 4. that, for any varietor F : Set → Set, every variety of F -algebras is speciﬁed by injectivity to regular epimorphisms of Alg(F ) with regularly projective domains 7 . The converse is also true: every class of F -algebras speciﬁed by injectivity to regular epimorphisms of Alg(F ) with regularly projective domains is a variety.

7 Every free F -algebra over Set is regularly projective by the axiom of choice. 37 Adamek´ and Porst

In fact, consider such an epimorphism, e:(D, αD) → (E,αE), in Alg(F ).

Since Since the homomorphism id :(D ,ϕD) → (D, αD)isaregular epimorphism in Alg(F )and(D, αD) is regularly projective we have a homomorphism

m:(D, αD) → (D ,ϕD)with id ◦ m = id.

Choose a pair of homomorphisms u, v with coequalizer e in Alg(F ). Then an algebra (C, αC ) is orthogonal to e iﬀ for every homomorphism h:(D, αD) → (C, αC )wehaveh ◦ u = h ◦ v. Thisisequivalentto

stating that for every homomorphism k :(D ,ϕD) → (C, αC )wehave k ◦ (m ◦ u)=k ◦ (m ◦ v): given k, put h = k ◦ m,andgivenh, put ◦ → ¯ k = h id . Thus, ife ¯:(D ,ϕD) (E,αE¯) denotes a coequalizer of m ◦ u and m ◦ v in Alg(F ), then injectivity toe ¯ and e, respectively, is equivalent. (And the former can be substituted by the equations u0 = v0 obtained by the kernel pair ofe ¯.) This concept of equation and its satisfaction has already been considered by H. Herrlich and his co-authors in [16] and [8].

3.3 Example The power-set functor P on Set is not a varietor. However, we can consider equational categories of P-algebras. Complete semilattices are an example. In fact, the join-operation of a complete (upper) semilattice C is an arrow αC : PC → C satisfying (i) αC {x} = x, and (ii) αC Mi = αC {αC Mi | i ∈ I} for any collection Mi in PC. Conversely, every P-algebra satisfying (i) and (ii) is a (join operation of a unique) complete semilattice. → { } Now (i) can be expressed by the equation arrow e: X2 E where X = x and e just merges x and {x}, whereas (ii) corresponds to the equation arrows → P f : X3 Fwhere X is an arbitrary set and, for a given collection Mi in X, f merges Mi with {Mi | i ∈ I}. The homomorphisms are precisely the functions preserving all joins.

The following lemma—to be proven by an easy transﬁnite induction—will be used frequently:

3.4Lemma Homomorphisms of F -algebras preserve computation of terms, i.e., given a homomorphism h:(C, αC ) → (D, αD) and an assignment of vari- → ◦ ◦ ables f : X C then, for all ordinals i, (h f)i = h fi .

3.5 Proposition Alg(F, E) is always closed in Alg(F ) under 1. subalgebras and all limits which exist; 2. homomorphic images carried by split epimorphisms in C.

Proof 1. is trivial by an obvious diagonal ﬁll-in argument.

2. Let r :(C, αC ) → (D, αD) be a homomorphism with coretraction s in C, → → where (C, αC ) satisﬁes the equation arrow e: Xi E.Givenf : X D,one 38 Adamek´ and Porst

◦ ◦ ◦ ◦ ◦ has r (s f)i =(r s f)i = fi . Thus, since (s f)i factorizes through e so ✸ does fi .

3.6 Theorem Monadic categories over a cocomplete category C are precisely the categories concretely equivalent to varieties over C.

Proof I. Sufficiency: By Beck’s Theorem we have to verify that the underlying functor of a variety Alg(F, E) creates split coequalizers. Since Alg(F, E)is closed under quotients splitting in C, it suffices to prove that U : Alg(F ) → C creates absolute coequalizers. This is proved exactly as in the proof of Becks’s Theorem (see [18]). II. For the converse it suffices to show that, for any monad T =(T,η,µ) on a cocomplete category C, the Eilenberg-Moore category CT of T-algebras coincides with the subcategory Alg(T,E)ofAlg(T ) for a suitable class E of equation arrows. For doing so consider, for every C-object X, the coproduct

−m−−i+1→ ←n−i+1− X X + TXi = Xi+1 TXi .

AclassE1 of equation arrows now is deﬁned as follows: for every C-object

X let eX : X → EX be a coequalizer of the pair m2,n2 ◦η ◦m1.AT -algebra 2 X1 (C, αC ) satisﬁes eX iﬀ, for every morphism f : X → C, the morphism ◦ → f2 =[f,αC Tf1]: X + TX1 C

satisﬁes f ◦ m2 = f ◦ n2 ◦ η ◦ m1 or, equivalently, f = αC ◦ Tf ◦ η ◦ m1. 2 2 X1 1 X1 Since η is natural, this is equivalent to f = αC ◦ ηC ◦ f which, for X = C and f =1C yields satisfaction of the T-algebra axiom αC ◦ ηC =1C . Conversely, αC ◦ ηC =1C yields f = αC ◦ ηC ◦ f by composition with f. Thus, the satisfaction of E1 is equivalent αC ◦ ηC =1C . Next we deﬁne a class E2 of equation arrows as follows: for every C-object → ◦ ◦ X let dX : X + TX2 = X3 DX be a coequalizer of the pair n3 Tm2 2 µX ,n3 ◦ Tn2 ◦ T m1.

vTX; HH vv HH µXvv TmHH 2 vv HH vv # 2 n3 / dX / T XFF

A T -algebra (C, αC ) satisfies dX iff, for every morphism f : X → C,the morphism ◦ ◦ → f3 =[f,αC T [f,αC Tf1]]: X + TX2 C ◦ ◦ ◦ 2 ◦ ◦ ◦ satisfies f3 n3 Tn2 T m1 = f3 n3 Tm2 µX .Thisisequivalentto 2 2 αC ◦ TαC ◦ T f = αC ◦ Tf ◦ µX or, since µ is natural, to αC ◦ TαC ◦ T f = 39 Adamek´ and Porst

2 αC ◦ µC ◦ T f.Choosingf =1C this yields satisfaction of the T-algebra axiom αC ◦ TαC = αC ◦ µC . Conversely, αC ◦ TαC = αC ◦ µC yields αC ◦ TαC ◦ 2 2 2 T f = αC ◦ µC ◦ T f by composition with T f. Thus, the satisfaction of E2 is equivalent αC ◦ TαC = αC ◦ µC . T Chosing E = E1 ∪E2 one thus gets C = Alg(T,E). ✸

3.7 Theorem Let C be a cocomplete, regularly co-wellpowered category with regular factorizations and kernel pairs. If F : C → C is a constructive varietor preserving regular epimorphisms, the following are equivalent for any full subcategory K of Alg(F ): (i) K is a variety. (ii) K is closed under subalgebras, products and homomorphic images carried by split epimorphisms.

Proof In view of Proposition 3.5 we only have to show that (ii) implies (i). By Remark 1.4, Alg(F ) has regular factorizations. Thus (ii) implies that K is a reflective subcategory whose reflection-arrows are regular epimorphisms in Alg(F ), see [3, 16.8]. Let now E be the class of all reflection-arrows of free algebras. We claim K = InjE. Trivially each algebra in K is injective w.r.t. all reflection arrows. Conversely, if (C, αC ) is injective w.r.t. the reflection r of the free algebra (F, αF )overC, then the homomorphic extension idC of the ◦ identity of C factors as idC = g r. This shows that g is—as a C-morphism— a retraction. Thus, (C, αC ) is a split-epi carried quotient of the K-reflection of (F, αF ), hence belonging to K by hypothesis. Thus, K = InjE.FromRemark 3.2.4 above we conclude that K is a variety. ✸

3.8 Corollary (Birkhoﬀ Variety Theorem) For every varietor F on the category Set, varieties of F -algebras are precisely the full subcategories closed under products, subalgebras and homomorphic images.

3.9 Remark In the special case C = Set it suﬃces, in the proof of Theorem 3.7 above, to take as E the set of reﬂection arrows of free algebras on sets of cardinality less than λ (λ a regular cardinal), provided that F preserves λ-directed colimits (see e.g. proof of [5, 3.9]).

By formally dualizing Deﬁnition 3.1, see Lemma 1.1, we obtain the following

3.10 Deﬁnitions Let F be an endofunctor of a complete category C. → i 1. An coequation arrow over X is a regular monomorphism m: M X for some ordinal i.AnF -coalgebra (C, αC )issaidtosatisfy m provided → i that for every morphism f : C X the morphism f factors through m. 40 Adamek´ and Porst

2. For any class M of coequation arrows Coalg(F, M) is the full subcategory of Coalg(F ) spanned by all F -coalgebras satisfying every m ∈M. Such categories are called coequational categories (of F -coalgebras) over C. 3. A coequational category Coalg(F, M) will be called a covariety (of F - coalgebras) over C provided that the underlying functor

M U = U|Coalg(F,M) : Coalg(F, M) → C

has a right adjoint (that is, if Alg(F op, M)isavarietyoverCop). By dualizing the respective results on equational categories and varieties we immediately obtain the following results.

3.11 Corollary Comonadic categories over a complete category C are precisely the categories concretely equivalent to covarieties over C.

3.12 Corollary (Birkhoﬀ Covariety Theorem) For every covarietor F on the category Set, covarieties of F -algebras are precisely the full subcategories closed under coproducts, subalgebras and homomorphic images 8 .

3.13 Remarks 1. Clearly, by duality, an equivalent condition for a full subcategory of Coalg(F )(overSet) to be a covariety is to be a projectivity class w.r.t. of some class of regular monomorphisms M with cofree codomains. 2. Moreover, as in the case of varieties over Set (see Remark 3.9), also covarieties of F -coalgebras over Set can be speciﬁed by projectivity w.r.t. a set of regular monomorphisms—in fact a single one—with cofree codomains, provided that the functor F preserves λ-directed colimits for some regular cardinal λ. This follows easily from the boundedness property (see Theorem 4.1 below) of these functors (see [20,12]). Note, however, that this observation cannot be obtained by dualization of Remark 3.9. While Theorem 3.11 shows that the dual of a covariety over Set is a variety over Setop it moreover implies the following additional dualization principle:

3.14Proposition The dual of a covariety over Set is equivalent to a variety over Set. Proof By means of the contravariant power-set functor P the category Setop is monadic over Set.LetV : Coalg(F, M) → Set be the composite of (U M)op and P. We need to show that V is monadic. Since V has a left adjoint and creates limits it suﬃces to prove that V creates coequalizers of congruence relations (= kernel pairs). Hence let r, s :(C, αC ) → (D, αD)bea pair of Coalg(F, M)-morphisms sich that Vr,Vsis a congruence relation and

8 Observe that, in Coalg(F ), the homorphic images are given by (plain) epimorphims while the embeddings of subalgebras are the regular monomorphisms. 41 Adamek´ and Porst let q : P(D) → X be its coequalizer. Since P reﬂects congruence relations and creates their coequalizers there is a unique Setop-morphism q : D → X with P(q)=q and this is a coequalizer of the congruence relation U Mr, U Ms. If X = ∅ this will even be a split coequalizer such that U M creates from it a coequalizer of r, s. The remaining case X = ∅ is trivial: the unique F - coalgebra structure on ∅ obviously does the job. ✸

3.15 Remarks 1. Coequations and their satisfaction have the following simple interpretation in the case of coalgebras over Set: deﬁne, for every ∈ i “coterm” x X , the coequation [x] as the following embedding

i \{ } → i X x > X .

i → A coalgebra (C, αC ) satisﬁes [x]iﬀx does not lie in the image of f : C i → X for any colouring f : C X. These are all the coequations needed: we can substitute an arbitrary coequation

→ i m: M X

{ | ∈ i \ } by the set of coequations [x] x X m[M] . 2. Various concepts of covariety of F -coalgebras—all restricted to the case of a bounded endofunctor on Set (thus, a varietor—see Section 4)—have already been discussed in the literature: • subcategories of Alg(F ) closed w.r.t. coproducts, subcoalgebras and homomorphic images ([20]). • projectivity classes in Alg(F ) w.r.t. collections of embeddings of subcoalgebras of cofree coalgebras ([13]). • projectivity classes in Alg(F ) w.r.t. collections of embeddings of subcoalgebras of regularly injective coalgebras ([14] 9 ,[7]). Theorem 3.12 (in connection with Remark 3.2) shows in particular that all of them are equivalent to the concept introduced here if specialized to bounded Set-functors. 3. A more complicated concept of coequation appears in [10]; this probably is not equivalent to the one above.

4Properties of Set-functors

Every endofunctor F of Set preserving colimits of λ-chains (or, equivalently, λ- ﬁltered colimits) for some regular cardinal λ is a varietor by 2.11. Such functors are called accessible, see [19]. An accessible functor is also a covarietor, as observed by M. Barr [9]. A diﬀerent criterion is due to Y. Kawahara and M. Mori (see [17] or also [20] and [15]): recall that F is called bounded if there

9 Here, regular injectivity is called extension property 42 Adamek´ and Porst

exists an inﬁnite cardinal λ such that for every F -coalgebra (C, αC ) and every element x of C there is a coalgebra homomorphism h:(D, αD) → (C, αC ) with x ∈ h[D]andcardD ≤ λ. We are going to prove that this, however, is equivalent to accessibility and both are equivalent to F being small, i.e., a small colimit of hom-functors:

4.1 Theorem For an endofunctor F of Set the following conditions are equivalent: (i) F is small; (ii) F is accessible; (iii) F is bounded.

Proof I. Suppose ﬁrst that the given endofunctor F preserves ﬁnite intersections (i.e., pullbacks of monomorphisms). (iii) =⇒ (i): For the above cardinal λ let D be the (essentially small) categoryofallpairs(X, x)whereX is a set of cardinality ≤ λ and x ∈ FX, with morphisms f :(X, x) → (X ,x) all functions f : X → X with Ff(x )=x. We prove that F is a colimit of the diagram V : D → [Set, Set] where V (X, x)=hom(X, −) with the colimit cocone f(X,x) having components

Y → −→ → f(X,x) : hom(X, Y ) FY, q Fq(x) for all q : X Y. That is, we prove that for every set Y Y (a) the maps f(X,x) are collectively epimorphic, and Y Y (b) whenever f (q)=f (q )thenq is connected with q by a zig–zag (X,x) (X ,x ) in the diagram of elements of V composed with the evaluation–at–Y , evalY :[Set, Set] → Set. Proof of (a): Given y ∈ FY, for the coalgebra (Y,const(y)) there exists a homomorphism h :(D, αD) → (Y,const(y)) with card D ≤ λ which fulﬁlls D = ∅ if Y = ∅.ForY = ∅ choose d0 ∈ D,then(D, d) ∈D with d = αD(d0) Y · f(D,d)(h)=Fh(αD(d)) = const(y) h(d)=y. The case Y = ∅ is trivial since (Y,y) ∈D. Proofof(b):WehaveFq(x)=Fq (x ) for some q : X → Y and q : X → Y .Factorq as an epimorphism e: X → Z followed by a monomorphism m: Z → Y and put z = Fq(x); analogously e, m,andz. By assumtion, F preserves the pullback

P ?? u ??u ? Z ? Z ?? ?? m m Y 43 Adamek´ and Porst

The equality Fm(z)=Fq(x)=Fq(x)=Fm(z) thus guarantees that there exists p ∈ FP with z = Fu(p)andz = Fu(p). And since cardP ≤ card(Z × Z) ≤ card(X × X) ≤ λ2 = λ,weobtainanobject(P, p)ofD with morphisms q q (X, x) ←− (Z, z) →u (P, p) ←−u (Z,z) → (X,x) forming the desired zig-zag. (i) =⇒ (ii): Every hom–functor is accessible, and a small colimit of accessible functors is accessible, see [11]. (ii) =⇒ (iii): Let F preserve λ–ﬁltered colimits for some regular cardinal λ. Since every set is a λ–ﬁltered colimit of all subsets of cardinality less than λ,weseethat (∗)givensetsC and T ⊆ FC with card T<λthere exists a subset m: B>→ C with cardB<λand T ⊆ Fm[FB].

We prove that F is bounded: given (C, αC )andx ∈ C define a λ–chain mi : Bi >→ C (i<λ) of subsets of cardinality less than λ by transfinite induction as follows: • B0 = {x}; • given Bi, apply (∗)toT = αC [Bi]togetmi+1 : Bi+1 → C with mi ⊆ m ,α [B ] ⊆ Fm [FB ], and cardB <λ; i+1 C i i+1 i+1 i+1 • given a limit ordinal i define Bi = Bj – due to the regularity of λ,if j

4.2 Example of a covarietor which is not small. Given a class M of cardinal numbers, define PM : Set → Set on objects X by PM X = {A ⊆ X; A = ∅ or cardA ∈ M} and an morphismus f : X → Y by PM f(A)=f[A]iff/A is injective, else = ∅.ThenPM is small iff M is small ( = bounded). But every infinite set X with cardX/∈ M is, obviously, a fixed point of PM .Itiseasyto find an unbounded class M for which PM has arbitrarily large exponential fixed points (i.e., M is unbounded but has arbitrarily large ”exponential holes”). Then PM is a varietor and covarietor but is not small. 44 Adamek´ and Porst

References

[1] J. Ad´amek. Theory of Mathematical Structures. D. Reidel Publ. Comp., 1983. Dordrecht.

[2] J. Ad´amek. Free algebras and automata realizations in the language of categories. Comment. Math. Univ. Carolinae 15 (1974), 589–602.

[3] J. Ad´amek, H. Herrlich, and G.E. Strecker, Abstract and Concrete Categories, Wiley Interscience, 1990. New York

[4] J. Ad´amek and V. Koubek. On the greatest ﬁxed point of a functor. Theor. Comp. Science 150 (1995), 57–75.

[5] J. Ad´amek and J. Rosick´y. Locally Presentable and Accessible Categories. Cambridge University Press, 1994. Cambridge.

[6] J. Ad´amek and V. Trnkov´a. Automata and Algebras in Categories. Kluwer Acad. Publ., 1990. Dordrecht

[7] S. Awodey and J. Hughes. The coalgebraic dual of Birkhoﬀ’s variety theorem. Preprint.

[8] B. Banaschewski and H. Herrlich. Subcategories deﬁned by implications. Houston J. Math. 2 (1976), 149–171.

[9] M. Barr. Terminal coalgebras in well-founded set theory. Theoretical Computer Science 114 (1993), 299–315.

[10] C. Cˆırstea. An algebra-coalgebras framework for system speciﬁcation. Electronic Notes in Theor. Comp. Sci. 33 (2000).

[11] P. Freyd. Several new concepts:lucid and concordant functors, prelimits, precolimits, lucid and concordant completions of categories. Lect. Notes Mathem. 99 (1969), 196–241. Springer,

[12] H. P. Gumm. Elements of the general theory of coalgebras. Preprint.

[13] H. P. Gumm. Equational and implicational classes of coalgebras. Preprint.

[14] H. P. Gumm and T. Schr¨oder. Covarieties and Complete Covarieties. Electronic Notes in Theor. Comp. Sci. 11 (1998).

[15] H. P. Gumm and T. Schr¨oder. Products of coalgebras. Preprint.

[16] H. Herrlich and C.M. Ringel. Identities in categories. Can. Math. Bull. 15 (1972), 297–299.

[17] Y. Kawahara and M. Mori. A small ﬁnal coalgebra theorem. Theoretical Computer Science 233 (2000), 129–145.

[18] S. MacLane. Categories for the working mathematician. Springer, 1971. Berlin- Heidelberg-New York. 45 Adamek´ and Porst

[19] M. Makkai and R. Par´e. Accessible categories:the foundations of categorical model theory. Contemporary Math., 104, 1989. Amer. Math. Soc., Providence.

[20] J.J.M.M. Rutten. Universal Coalgebra:a theory of systems. Report CS-R9652. CWI Amsterdam, 1996

46 CMCS’01 Preliminary Version

Process Calculi `alaBird-Meertens

Lu´ıs S. Barbosa 1

Departamento de Inform´atica Universidade do Minho Braga, Portugal

Abstract This paper is an attempt to apply the reasoning principles and calculational style underlying the so-called Bird-Meertens formalism to the design of process calculi, parametrized by a behaviour model. In particular, basically equational and pointfree proofs of process properties are given, relying on the universal characterisation of anamorphisms and therefore avoiding the explicit construction of bisimulations. The developed calculi can be directly implemented on a functional language supporting coinductive types, which provides a convenient way to prototype processes and assess alternative design decisions.

1 Introduction

It is well known that initial algebras and final coalgebras provide abstract descriptions of a variety of phenomena in programming, in particular of data and behavioural structures, respectively. Both initiality and finality, as universal properties, entail definitional and proof principles, i.e., a basis for the development of program calculi directly based on (actually driven by) type specifications. Moreover, such properties can be turned into programming combinators and used, not only to calculate programs, but also to program with. In functional programming the role of such universals — combined with the ‘calculational’ style entailed by category theory — has been fundamental to a whole discipline of algorithm derivation and transformation. This can be traced back to the so-called Bird-Meertens formalism [5,6] and the foun- dational work of T. Hagino [8]. Since then, the area has known a remarkable progress, as witnessed by the vast bibliography published both on theory and applications — see [12,13,4], among many others references. This paper reports on an attempt to apply the same reasoning principles and calculational style to the design of process calculi, relying on the representation of processes as inhabitants of coinductive types, i.e., final coalgebras for

1 Email: [email protected] This is a preliminary version. The final version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Barbosa suitable Set endofunctors. Final semantics for processes is an active research area, namely after Aczel’s landmark paper [1]. Our emphasis is, however, actually placed on the design side: we intend to show how process calculi can be developed and their laws proved along the lines one gets used to in (data-oriented) program calculi. Although only the ‘fine grain’ observational equivalence entailed by (strict) bisimulation is considered in the sequel, we believe that the proposed approach has a number of advantages: • First of all it provides a uniform treatment of processes and other computational structures, e.g., data structures, both represented as categorical types for functors capturing signatures of, respectively, observers and constructors. Placing data and behaviour at a similar level conveys the idea that process models can be chosen and specified according to a given application area, in the same way that a suitable data structure is defined to meet a particular engineering problem. Moreover, processes and data become expressible in programming languages supporting categorical types, such as Charity [7], providing a convenient way to prototype processes and compare alternative design decisions. This builds on the author previous work on component algebras [3]. • Proofs are carried out in a purely calculational (basically equational and pointfree) style, therefore circumventing the explicit construction of bisimulations used in most of the literature on process calculi. In particular, a ‘conditional fusion’ result is proved to handle conditional laws. • Finally the approach is independent of any particular process calculus and makes explicit the different ingredients present in the design of any such calculi. In particular structural aspects of the underlying behaviour model (e.g., the dichotomies such as active vs reactive, deterministic vs non deterministic) become clearly separated from the interaction structure which defines the synchronisation discipline.

2 Preliminaries

Anamorphisms. Technically, the approach sketched here amounts to the systematic use of the universal property of anamorphisms. Recall that, for a functor T,aT- anamorphism is the unique T-comorphism to the ﬁnal coalgebra ωT : νT −→ T νT from any other coalgebra U, p. Also called the coinductive extension of p [17], it is written, in the tradition of [13], as [(p)] T or, simply, [(p)] , a n d satisﬁes the following universal property:

⇔ · · k =[(p)] T ωT k = T k p (1) 48 Barbosa from which other laws are easily derived, e.g.:

ωT · [( p)] = T [( p)] · p (2)

[( ωT)] = idνT (3) [( p)] · h =[(q)] i f p · h = T h · q (4)

In the context of the Bird-Meertens formalism equations (2) to (4) are seen as instances of a cancellation, reﬂection and fusion result, respectively.

Invariants. In order to derive conditional laws (section 4) we shall resort to the notion of a p-invariant, i.e.,apredicateφ over the carrier of a T-coalgebra p closed under the p dynamics. Following [11], φ is an invariant if, as a set, it satisfies ❡ ❡ φ ⊆ p φ,where p φ denotes the set of all states whose immediate successors, ❡ under p,ifany,satisfyφ. p is, in fact, a modal combinator which corresponds to the familiar (weak) next operator in modal logics 2 . [11] introduces a ❡ T definition of p in terms of a lifting operation ( ) : PX −→ P T X defined inductively on the structure of (extended polynomial) functors (see [9] for ❡ T the complete picture). Formally, p φ = {u ∈ U| pu ∈ (φ) }. The infinite ❡ extension of p characterise the (future) ‘box’ operator relative to a coalgebra p. Thisisdenotedin[11]byp,wherep is defined as the greatest fix point of ❡ λx .φ∩ p x. Informally, p φ reads ‘φ holds now and in all successor states’.

3 Process Structure and Combinators.

Processes. In designing a process calculus, its operational semantics is usually given in terms of a transition relation −→a over processes, indexed by a set Act of actions 3 , witnessing the collection of actions in which a process gets committed and the resulting ‘continuations’, i.e., the behaviours subsequently exhibited. A first basic design decision concerns the definition of what should be understood by such a collection. As a rule it is defined as a set, in order to express non determinism. Other, more restrictive, possibilities consider a sequence or even just a single continuation, modelling, respectively, ‘ordered’ non determinism or determinism. In general, this underlying behaviour model can be represented by a functor B .

2 The transition system deﬁned by p is the structure upon which the operator is interpreted. In fact, it has been recently recognised by a number of authors (notably in [15] and [11]) that a modal language associated to a T-coalgebra is determined by its shape, as recorded in T. 3 This set will later be equipped with further structure to support particular interaction disciplines. For the moment just assume that actions are generated from a set L of labels, i.e., a set of formal names. The embedding L → Act will usually be left implicit. 49 Barbosa

An orthogonal decision concerns the intended interpretation of the transition relation, which is usually left implicit or underspecified in process calculi. We may, however, distinguish between • An ‘active’ interpretation, in which a transition p −→a q is informally characterised as ‘p evolves to q by performing an action a’, both q and a being solely determined by p. • A ‘reactive’ interpretation, informally reading ‘p reacts to an external stimulus a by evolving to q’. Processes will then be taken as inhabitants of the carrier of the final coalgebra ω : ν −→ T ν,withT defined as B (Act×Id), in the first case, and (BId )Act,in the second. To illustrate the proposed approach to the development of process calculi, we shall focus on a particular case where B is the finite powerset functor and the ‘active’ interpretation is adopted. The transition relation, for this case, is given by p −→a q iff a, q∈ωp. Although this corresponds to the main trend in the literature, some alternatives will be considered in section 5. The restriction to the finite powerset avoids cardinality problems and as- sures the existence of a final coalgebra for T. This means, of course, we shall deal only with image-finite processes, a not too severe restriction in practice which may be partially circumvented by a suitable definition of the structure of Act 4 .

Dynamic Combinators. The cornerstone in the design of a process calculi is the judicious selection of a (hopefully small) set of process combinators. In [14], R. Milner classifies them into two distinct groups. The first group consists of all combinators which persist through action, i.e., which are present before and after a transition occurs. They are called static and used to set up process’ architectures, specifying how their components are linked and which parts of their interface are public or private. Dynamic combinators, on the other hand, are ‘consumed’ on action occurrence, disappearing from the expression representing the process continuation. In this paragraph the usual Ccs dynamic combinators — i.e., inaction, prefix and non-deterministic choice —aredefinedas operators on the final universe of processes considered above. Notice that, being non recursive, they have a direct (coinductive) definition which depends solely on the chosen process structure. Therefore, the inactive process is represented as a constant nil : 1 −→ ν upon which no relevant observation can be made. Prefix gives rise to an Act-indexed family of operators a. : ν −→ ν, with a ∈ Act. Finally, the possible actions of the non deterministic choice of two processes p and q corresponds to the collection of all actions allowed for

4 For instance, by taking Act as channel names through which data flows, which corresponds closely to ‘Ccs with value passing’ [14]. Therefore, only the set of channels, and not the messages (seen as pairs channel/data), must remain finite. 50 Barbosa p and q. Therefore, the operator + : ν × ν −→ ν can only be defined over a process structure in which observations form a collection. Formally,

inaction ω · nil = ∅

preﬁx ω · a. = sing · labela choice ω · +=∪ · (ω × ω)

where sing = λx . {x} and labela = λx . a, x. Clearly, structure ν;+, nil forms an Abelian idempotent monoid, a fact that can be proved by simple equational reasoning, resorting to the corresponding properties of set union. Moreover, ﬁnality turns ω into an isomorphism and therefore, to prove e = e it is enough to show that ω ·e = ω ·e.To illustrate the proposed proof style, consider the proof of a particularly simple result: + associativity, i.e.,+· (+ × id)=+· (id × +) · a 5 . Proof.

ω · + · (+ × id)=∪ · (ω × ω) · (+ × id) { definition } = ∪ · (ω · + × ω) { functoriality } = ∪ · ((∪ · (ω × ω)) × ω { definition } = ∪ · (∪×id) · ((ω × ω) · ω) { functoriality } ◦ = ∪ · (∪×id) · ((ω × ω) · ω) · a · a { a is isomorphism } ◦ = ∪ · (∪×id) · a · (ω × (ω × ω)) · a { a naturality } = ∪ · (id ×∪) · (ω × (ω × ω)) · a {∪associativity } = ∪ · (ω × (∪ · (ω × ω)) · a { functoriality } = ∪ · (ω × ω · +) · a { definition } = ∪ · (ω × ω) · (id × +) · a { functoriality } = ω · + · (id × +) · a { definition }

✷

Interleaving and Restriction. Persistence through action occurrence justifies the recursive definition of static combinators. This means that they arise as anamorphisms generated by suitable ‘gene’ coalgebras. Both interleaving and restriction are examples of static combinators, which, moreover, depend only on the process structure. We shall consider them in first place. 5 In the sequel, process properties are stated pointfree, for which we shall resort to standard natural isomorphisms in Set. In particular, associativity, commutativity and product left and right units, will be denoted by a :(A × B) × C −→ A × (B × C), s : A × B −→ B × A, r : 1 × A −→ A and l : A × 1 −→ A, respectively. The converse of an isomorphism i is written as i◦. 51 Barbosa

Although interleaving, a binary operator : ν × ν −→ ν, is not considered as a combinator in most process calculi, it represents the simplest form of ‘parallel’ aggregation in the sense that it is independent of any particular interaction discipline. The following deﬁnition captures the intuition that the observations over the interleaving of two processes correspond to all possible 6 interleavings of the observations of its arguments. Thus, =[(α)] , w h e r e

/ (ω×id)×(id×ω)/ α = ν × ν (ν × ν) × (ν × ν) (P(Act × ν) × ν) × (ν ×P(Act × ν))

r× ∪ τ τl /P(Act × (ν × ν)) ×P(Act × (ν × ν)) /P(Act × (ν × ν))

The restriction combinator \K , for each subset K ⊆ L, forbids the occurrence \ of actions in K. Formally, K =[(α\K )] w h e r e

ω /P × ﬁlterK /P × α\K = ν (Act ν) (Act ν) where ﬁlterK = λs . {t ∈ s| π1 t/∈ K}. The interleaving combinator also forms (with nil), an Abelian monoid. Re- striction, on the other hand, is idempotent and commutes with both choice and interleaving. As one could expect, the cornerstone in the proofs of equations involving static combinators, is the application of the fusion law (4). This is illustrated below in the proof of commutativity.

Proof. By deﬁnition · s = is equivalent to [(α)] · s =[(α)] , w h i c h , b y fusion, is implied by α · s = P(id × s) · α. Then,

α · s

= ∪ ·(τr × τl) · ((ω × id) × (id × ω)) · · s { deﬁnition }

= ∪ ·(τr × τl) · ((ω × id) × (id × ω)) · (s × s) · { nat }

= ∪ ·(τr × τl) · (s × s) · ((id × ω) × (ω × id)) · { s × s nat }

= ∪ ·(τr · s × τl · s) · ((id × ω) × (ω × id)) · { functor }

= ∪ ·(P(id × s) · τl ×P(id × s) · τr) · ((id × ω) × (ω × id)) · { τr vs τl }

= ∪ ·(P(id × s) ×P(id × s)) · (τl × τr) · ((id × ω) × (ω × id)) · { functor }

= P(id × s) · ∪ · (τl × τr) · ((id × ω) × (ω × id)) · {∪nat }

= P(id × s) · ∪ · s · (τl × τr) · ((id × ω) × (ω × id)) · {∪comm }

= P(id × s) · ∪ · (τr × τl) · ((ω × id) × (id × ω)) · s · { s nat }

= P(id × s) · ∪ · (τr × τl) · ((ω × id) × (id × ω)) · { s · = } = P(id × s) · α { deﬁnition }

✷

6 Morphisms τr : P(Act × X) × C −→ P (Act × (X × C)) and τl : C ×P(Act × X) −→ P(Act × (C × X)) stand for, respectively, the right and left strength associated to functor P(Act × Id). 52 Barbosa

4Interaction and Parallel Composition

Interaction Structures. Process combinators introduced so far depend solely on the process structure, as recorded in the shape of the functor. To specify interaction, however, there is a need to introduce some structure in the set Act of actions. There- fore, we define the interaction structure underlying a process calculus as an Abelian monoid Act; θ, 1 with a zero element 0. It is assumed that neither 0 nor 1 belong to the set L of labels. The intuition is that θ determines the interaction discipline whereas 0 represents the absence of interaction: for all a ∈ Act, aθ0 = 0. On the other hand, aθa =1iffa = a = 1. Notice that the role of both 0 and 1 is essentially technical in the description of the interaction discipline. In some situations 1 may be seen as an idle action, but its role, in the general case, is to equip the behaviour functor with a monadic structure, which would not be the case if Act were defined simply as an Abelian semigroup 7 . A basic example of an interaction structure captures action co-occurrence. Therefore, θ is defined as aθb = a, b, for all a, b ∈ Act different from 0 and 1. Ccs [14] synchronisation discipline provides another example. In this case the set L of labels carries an involutive operation represented by an horizontal bar as in a, for a ∈ L.Anytwoactionsa and a are called complementary. A special action τ/∈ L is introduced to represent the result of a synchronisation between a pair of complementary actions. Therefore, the result of θ is τ whenever applied to a pair of complementary actions and 0 in all other cases, except, obviously, if one of the arguments is 1 8 . Once an interaction structure is fixed, any homomorphism f : Act −→ Act lifts to a renaming combinator [f] between processes defined as [f]=[(α[f])] , where

ω / P(f×id/) α[f] = ν P(Act × ν) P(Act × ν)

Conditional Laws. The basic properties of renaming, namely that it preserves identity and composition of homomorphisms, extends along preﬁx and commutes with

7 The structure is similar to what is called a synchronisation algebra in [18] apart from some minor details. In particular, Winskel synchronisation algebras carry a specific constant to denote asynchronous occurrence and θ does not necessarily possess a unit. The monoid structure, however, allows for a more uniform characterisation of behaviour models. On the other hand, the definition of parallel composition below, in terms of synchronous product and interleaving, avoids the need for introducing . 8 For the Ccs case we follow the standard notational convention under which complements are considered implicitly. In particular, a restriction combinator \K ,forK ⊆ L is interpreted as \K∪K . Similarly, the parameter f of a renaming (see below), specifies only the ‘action’ part although it also implies that if fa= b then f a = b. Also, as τ is introduced as a constant in Act, f being a homomorphism forces fτ= τ. 53 Barbosa both choice and interleaving, are proved in the style illustrated above, always avoiding the explicit construction of bisimulations. Often, however, process equalities hold just if some ‘side conditions’ are fulfilled. Let us study how such laws are derived in our framework starting with a very simple example. Let f = {b/a} be substitution of a by b, i.e., a homomorphism over Act which is the identity in all actions but a. In several cases, but not in all, we may conclude that renaming with f has no effect. Can this be expressed on a general law? A simple calculation yields

[f]=id ≡{deﬁnition }

[( α[f])] = [( ω)] ⇐{fusion }

α[f] · id = P(id × id) · ω ≡{identity }

α[f] = ω

Clearly, the last equality holds only if a does not show up as an action in the immediate continuations of the process being renamed. This condition is formally expressed by the following predicate:

◦ φ ==∅ · ∩ ·(Pπ1 · ω × sing · a) · l (5)

Note, however, that φ is stated as a local condition on the immediate continuations of any process candidate to satisfy the given equality. Therefore, it cannot be directly taken as a suﬃcient condition for [f]=id. In fact, to proceed, predicate φ hastobemadeintoaninvariant in the sense of [11]. This is justiﬁed by the following result.

Theorem 4.1 Let α and β be T-coalgebras and φ a predicate on the carrier of β. Then the following ‘conditional’ fusion law holds

⇒ · · ⇒ ⇒ · (φ (α h = T h β)) ( β φ ([(α)] T h =[(β)] T))

Proof. Let X be the carrier of β and iφ the embedding of the subset of X classiﬁed by φ, i.e., φ·iφ = true·!. Recall also that any β-invariant φ induces a subcoalgebra β . Consequently, iφ becomes a comorphism from β to β.Then 54 Barbosa

φ ⇒ (α · h = T h · β)

≡{iφ deﬁnition }

α · h · iφ = T h · β · iφ

⇒{β φ ⊆ φ } · · · · α h iβ φ = T h β iβ φ ≡{ } iβ φ is a comorphism from β to β · · · · α h iβ φ = T h T iβ φ β ≡{functoriality} · · · · α h iβ φ = T (h iβ φ) β ≡{fusion law (4)} · · [( α)] T h iβ φ =[(β )] T ≡{ · } iβ φ being a comorphism implies [(β )] T =[(β)] T iβ φ · · · [( α)] T h iβ φ =[(β)] T iβ φ

≡{iφ deﬁnition } ⇒ · β φ ([(α)] T h =[(β)] T)

Notice the proof would work if β φ is replaced by any other β-invariant contained in φ.Asβ φ is the greatest such invariant, it provides the most ‘generous’ condition. ✷

Deriving the Condition. Applying the previous theorem to the case under consideration, leads to

[{b/a}]=id ⇐ ω φ (6)

with φ given by (5). Now recall that ω φ is defined as the greatest fixpoint ❡ of Φ = λx . φ ∩ ω x. Looking at predicates as sets, Φ is a function over a complete lattice — Pν, ⊆ — whose monotony is easily proved by induction on the functor structure. Therefore, a concrete representation for ω φ can be computed, by the Knaster-Tarski theorem [16], as the union of all post- fixpoints of Φ, i.e.,

❡ ω φ = {s ∈Pν| s ⊆ φ ∩ ω s} 55 Barbosa

Being a post-ﬁxpoint means, for each s above that, for any process p,

p ∈ s ❡ ⇒ p ∈ φ ∧ p ∈ ω s ≡ p ∈ φ ∧ p ∈{x ∈ ν| ωx∈ (s)P(Act×Id)} Act×Id ≡ p ∈ φ ∧ p ∈{x ∈ ν| ωx∈{c ∈P(Act × ν)|∀t .t∈ c ⇒ t ∈ (s) }}

≡ p ∈ φ ∧ p ∈{x ∈ ν| ωx∈{c ∈P(Act × ν)|∀t .t∈ c ⇒ π2 t ∈ s}}

≡ p ∈ φ ∧ p ∈{x ∈ ν| (Pπ2 · ω) x ∈ s}

Seen as a set, predicate (5) is given by φ = {x ∈ ν| (Pπ1 · ω) x ∩{a} = ∅}. Therefore, ω φ = {s ∈Pν| x ∈ s ⇒ ((Pπ1 · ω) x ∩{a} = ∅) ∧ ((Pπ2 · ω) x ∈ s))} or, in words, the set of all processes whose derivations never exhibit an action a. In Ccs, the set of all labels, seen as actions, in which a process p can commit itself, i.e., that appear in at least one derivation of p, is called the sort of p and denoted by L(p). [14] provides a syntactic criterion to compute a majoring approximation of L(p) by induction on the process expression. A semantic deﬁnition can, however, be given as L(p)=(Pπ1 · ·Pp) p true (7) where, again, the embedding of L in Act is left implicit. Law (6) may then be rewritten as 9 [{b/a}]=id ⇐ ∈/a ·L (8)

Parallel. The next static operator considered here is synchronous product, modelling the simultaneous execution of its two arguments. In each step the resulting action is determined by the interaction structure for the calculus. Formally, ⊗ =[(α⊗)] w h e r e

× (ω ω)/ sel·δr / α⊗ = ν × ν P(Act × ν) ×P(Act × ν) P(Act × (ν × ν)) where sel = ﬁlter{0} ﬁlters out all synchronisation failures. Notice how interac- 10 tion is catered by δr — the distributive law for the strong monad P(Act × Id) . δr is the Kleisli composition of the left and the right strengths. This, on its

9 Going pointwise and noticing that, by convention, the parameter of the Ccs renaming operator represents ‘coactions’ implicitly, we end up with the familiar Ccs law p [{b/a}]= a p ⇐ a, a/∈L(p). Function ∈/ is defined as λx . a∈ / x. 10 Notice that the monoidal structure in Act extends functor P(Act × Id) to a strong monad. 56 Barbosa turn, involves the application of the monad multiplication to ‘flatten’ the result and this, for a monoid monad, requires the suitable application of the underlying monoidal operation, which, in our case, fixes the interaction discipline. In fact,

P(Act×Id) P(Act×Id) P(Act×Id) P(Act×Id) δr = µ · P(id × τ ) · τ r l ◦ P P(Act×Id) P(Act×Id) = P((θ × id) · a ) · ·Pτl · P(id × τr ) · τl i.e., going pointwise,

P(Act×Id) δr c1,c2 = {a θa, p, p | a, p∈c1 ∧a ,p∈c2}

Finally, parallel composition arises as a combination of interleaving and synchronous product, in the sense that the evolution of p | q, for processes p and q, consists of all possible derivations of p and q plus the ones associated to the synchronisations allowed by the particular interaction structure for the calculus. This cannot be achieved by a simple composition of the corresponding combinators and ⊗: it has to be performed at the ‘genes’ level for and ⊗. Formally, | =[(α|)] , w h e r e

/ ∪·(α×α⊗) / α| = ν × ν (ν × ν) × (ν × ν) P(Act × (ν × ν))

Synchronous product is commutative, associative and has nil as a zero element. Furthermore, it distributes over choice and, conditionally, over restriction and renaming. On the other hand, as expected, the | combinator shares some properties that are common to both and ⊗. In particular, it gives rise, with nil, to an Abelian monoid and distributes along renaming and restriction in certain cases. However, it lacks a zero element and does not distribute through choice. Notice that the veriﬁcation of such properties ‘re-uses’ the proofs of the corresponding results for and ⊗.

Two Proofs. This paragraph is concerned with the proof of restriction distributivity over ⊗ and |, illustrating two important points in our approach: the derivation of side conditions along a proof and proof re-use. The proofs rely on the following immediate consequence of law (1): to prove the equality φ = ψ it is enough to show that both ω · φ = T φ · α and ω · ψ = T ψ · α hold. Consider, then, the derivation of the following property:

\K · ⊗ = ⊗ · (\K ×\K ) ⇐ uniform restriction (9)

Proof. We proceed by unfolding the composite of ω with both sides of the equation. Therefore, 57 Barbosa

ω · ⊗ · (\K ×\K ) = { comorphism, deﬁnition}

P(id ×⊗) · sel · δr · (ω × ω) · (\K ×\K ) = { functoriality}

P(id ×⊗) · sel · δr · (ω · \K × ω · \K ) = { comorphism, functoriality} P ×⊗ · · δ · P ×\ ×P ×\ · × (id ) sel r ( (id K ) (id K )) (α\K α\K )

= { δr naturality} P ×⊗ · · P × \ ×\ · δ · × (id ) sel (id ( K K )) r (α\K α\K ) = { sel deﬁnition, functoriality} P ×⊗· \ ×\ · · δ · × (id ( K K )) sel r (α\K α\K )

Next a similar calculation is done for ω·\K ·⊗, trying to arrive at an expression P × \ · ⊗ · · δ · × (id ( K )) α,withα = sel r (α\K α\K )asabove.

ω · \K · ⊗ = { comorphism, deﬁnition}

P(id ×\K ) · ﬁlterK · ω · ⊗ = { comorphism}

P(id ×\K ) · ﬁlterK · P(id ×⊗) · α⊗ = { deﬁnition}

P(id ×\K ) · ﬁlterK · P(id ×⊗) · sel · δr · (ω × ω)

= { ﬁlterK naturality}

P(id ×\K ) · P(id ×⊗) · ﬁlterK · sel · δr · (ω × ω) ? = { }

P(id ×\K ) · P(id ×⊗) · sel · δr · (ﬁlterK × ﬁlterK ) · (ω × ω) = { functoriality }

P(id × (\K · ⊗)) · sel · δr · (filterK · ω × filterK · ω) = { definition} P × \ · ⊗ · · δ · × (id ( K )) sel r (α\K α\K ) We have succeeded only partially: the step marked with a F does not hold uni- versally. We are then left with the task of establishing under what conditions, if any, the following equality holds:

filterK · sel · δr = sel · δr · (filterK × filterK )

Unfolding the deﬁnitions of the functions involved and going pointwise for a 58 Barbosa while, we get

( filterK · sel ·δr) c1,c2 = =(filterK · sel) {a θa, p, p | a, p∈c1 ∧a ,p∈c2} = filterK {a θa, p, p | a, p∈c1 ∧a ,p∈c2 ∧ a θa =0 } = {a θa, p, p | a, p∈c1 ∧a ,p∈c2 ∧ a θa =0 ∧ a θa∈ / K}

On the other hand,

( sel · δr ·(ﬁlterK × ﬁlterK )) c1,c2 = =(sel · δr) {a, p∈c1| a/∈ K}, {a ,p∈c2| a ∈/ K} = sel {a θa, p, p | a, p∈c1 ∧a ,p∈c2 ∧ a, a ∈/ K} = {a θa, p, p | a, p∈c1 ∧a ,p∈c2 ∧ a, a ∈/ K ∧ a θa =0 }

Clearly the two sets can be identiﬁed iﬀ, for all possible a and a, such that aθa =0, aθa∈ / K ≡ a, a ∈/ K. Therefore, step F is only possible if the expression scope is restricted to pairs of processes satisfying the following predicate:

∀ ⇒ ∈ ≡ ∈ φ p, q = a∈(Pπ1·ω) p,a ∈(Pπ1·ω) q .aθa =0 (aθa / K a, a / K) (10) which is lifted to the invariant uniform restriction = α φ. ✷ As expected, a similar result holds for parallel composition.

Proof. The attempt to prove \K · | = | ·(\K ×\K ), proceeds by reducing the composite of ω with both sides of the equation to identify a common coalgebra α.Thus,

ω · \K · | = { double application of comorphism and deﬁnition}

P(id ×\K ) · ﬁlterK · P(id×|) · ∪ · (α × α⊗) ·

= { ﬁlterK naturality}

P(id ×\K ) · P(id×|) · ﬁlterK · ∪ · (α × α⊗) · = { ∪ naturality, functoriality}

P(id × (\K · |)) · ∪ · (ﬁlterK · α × ﬁlterK · α⊗) · = { reusing the proof of (9) and a similar result for } P × \ · | · ∪ · ∪ · × · × × × · (id ( K ) (( (τr τl) ((α\K id) (id α\K )) ) × · δ · × · sel r (α\K α\K ))

Similarly, it is shown that ω· | ·(\K ×\K )=P(id × (| ·(\K ×\K )))·α ,where ∪ · ∪ · × · × × × · × · δ · × · α = (( (τr τl) ((α\K id) (id α\K )) ) sel r (α\K α\K )) 59 Barbosa is the common coalgebra. We are, thus, almost ready to conclude. Notice, however, that, on reusing the proof of law (9) in the last step of this derivation, we must also take into account the predicate which constrains the substitution made. Clearly, predicate (10) acts again as a local condition to validate the derivation here. We may then conclude the validity of the law, subjected to the restriction given by uniform restriction = α φ. ✷ We may ask what form such invariants take given a particular interaction structure. For example, in the Ccs case, the result of θ does not belong to L — θ has only three possible results under Ccs interaction discipline: τ,1and 0. Therefore, as K ⊆ L, condition aθa∈ / K holds for any K and φ becomes ∀ ⇒ ∈ ≡ ∈ a∈(Pπ1·ω) p,a ∈(Pπ1·ω) q .aθa =0 (aθa / K a, a / K) ≡{Ccs interaction structure } ∀ ∨ ⇒ ∈ a∈P(π1·ω) p,a ∈(Pπ1·ω) q . (aθa = τ aθa =1) a, a / K ≡{aθa =1 iﬀ a = a =1} ∀ ⇒ ∈ a∈(Pπ1·ω) p,a ∈(Pπ1·ω) q .aθa = τ a, a / K ≡{aθa = τ iﬀ a = a} ∀ ⇒ ∈ a∈(Pπ1·ω) p,a ∈(Pπ1·ω) q .a = a a, a / K ≡{rearranging}

(Pπ1 · ω) p ∩ (Pπ1 · ω) q ∩ (K ∪ K)=∅ where the overbar notation stands here for the lifting of the involutive Ccs complement operation to sets of actions. Now note that, although both invariants are derived from the same local condition (10), they stand for the closure of φ under diﬀerent coalgebras and are, consequently, distinct. In fact, the number of derivations that have to be considered under φ is much greater in the second case. In particular, uniform restriction does not consider conﬁgurations representing interleavings. For example, if one of the processes exhausts after n steps, the condition on the actions is not required to hold after that. On the other hand, in uniform restriction, φ is closed wrt the transitions on α, which include both interleavings and synchronisations. Therefore, and recalling the notion of sort, we conclude that uniform restriction ≡L(p) ∩ L(p) ∩ (K ∪ K)=∅ arriving to the following familiar Ccs presentation of this result

(p | q)\K = p\K | q\K ⇐L(p) ∩ L(p) ∩ (K ∪ K)=∅

This discussion illustrates our claim that such an approach to process calculi allows us to ‘discover’ the appropriate restrictions a law is constrained by, instead of ‘postulating’ them and verifying their suitability. It also makes explicit that such conditions are essentially dependent only on the calculus interaction structure. Consider, for example, what would happen to the law at hands if a Csp-like interaction discipline is chosen instead. In Csp [10] only 60 Barbosa equally named actions synchronise, leading to the following deﬁnition of θ:

aθa = a , aθ1=1θa = a and aθb = 0 in all other cases

Therefore, the condition aθa =0 ⇒ (aθa ∈/ K ≡ a, a ∈/ K) becomes trivially true and the law holds without any side condition.

Another Example. A similar situation occurs when studying the distribution of renaming over parallel. In an attempt to proof [f]· | = | ·([f] × [f]), application of theorem 4.1 makes the following local condition to pop out (see [2] for the complete derivation):

 ∀ ≡ φ p, q = a∈(Pπ1·ω) p,a ∈(Pπ1·ω) q .aθa =0 faθfa =0

Under the Ccs interaction discipline, aθa = 0 implies that aθa =1oraθa = τ. Hence, when does the equivalence aθa =0 ≡ faθfa = 0 hold? Clearly the implication from left to right holds trivially because f is a homomorphism on the interaction structure: aθa = F ⇒{Leibniz } f (aθa)=fF ≡{f is an Act-homomorphism } faθfa = F when F stands for either 1 or τ. The implication in the opposite direction, however, reads faθfa = F ⇒ aθa = F which, f being an Act-homomorphism, is equivalent to

f (aθa)=fF ⇒ aθa = F again for F standing for either 1 or τ. This holds only if f is mono. Thus, for the Ccs case, the generated invariant requires the injectivity of f or, at least, of its restriction to the relevant process sorts. The resulting law is usually written in Ccs as

(p | q)[f]=p[f] | q[f] ⇐ f restricted to L(p | q) ∪ L(p | q)ismono

5 Variants and Conclusions

Behaviour Monads. In the introduction to this paper we have remarked that the approach to process calculi design sketched here could cope with a variety of particu- 61 Barbosa lar cases because the emphasis was placed on the common underlying structures rather than on the distinctive particularities. A first source of genericity has already been introduced by separating the behaviour from the interaction structures. Note that all the process combinators introduced are either independent of any particular interaction discipline or parametrized by it. We shall now briefly examine what happens if the behaviour structure itself is changed. Recall that processes were defined as inhabitants of the carrier of the final coalgebra for T = B (Act × Id)whereB was taken as the finite powerset functor. Next we have shown that, assuming a commutative monoidal structure over Act, the behaviour model captured by P(Act × Id)isastrong Abelian monad. The definitions of some combinators build upon this — in particular synchronous product basically relies on the monad distribution law δr. Com- mutativity of ⊗, and consequently of |, depends on the monoid underlying the interaction structure being itself Abelian. Therefore, a first line of enquire followed below consists of replacing B by different monads and extracting a correspondent family of calculi still parametrized by the interaction structure. The simplest case takes B as the identity Id. The result is, of course, a universe of deterministic (and perpetual) processes. A further elaboration of this replaces Id by Id + 1, entailing a calculus for deterministic but partial processes in the sense that the derivation of a ‘dead’ state is always possible. Such a calculus is far less expressive than the one previously discussed. In fact derivations do not form any kind of collection and, therefore, non determinism is ruled out. Similarly, combinators which explore non determinism, lack a counterpart here. Such is the case of choice, interleaving and, as a generalisation of the later, parallel. On the other hand, the composition of Id + 1 with the monoidal monad generated by Act is still a strong Abelian monad, and therefore synchronous product is still definable. Inaction and prefix, for partial processes, are defined as ω ·nil = ι2 and ω ·a. = ι1 ·labela, respectively. On the other hand, product, restriction and renaming can be defined in a rather generic way as anamorphisms whose ‘genes’ are parametrized by the monad B :

ω / B (f×id) / α[f] = ν B (Act × ν) B (Act × ν) B ω / × ﬁlterK / × α\K = ν B (Act ν) B (Act ν) × B (Act×Id) (ω ω)/ δr / α⊗ = ν × ν B (Act × ν) × B (Act × ν) B (Act × (ν × ν))

B sel /B (Act × (ν × ν))

B (Act×Id) where δr is the distribution law associated to the composed monad B (Act × Id), therefore encapsulating the θ operation on actions. On the B B other hand, sel and filterK explore the B structure in order to rule out synchronisation failures, in the first case, and to perform the action restriction Id+1 in the second. For B = Act + 1, filterK =((∈/K ·π1) → ι1,ι2·!) + id is expressed by a conditional, whereas, for B = P, such conditional was iterated 62 Barbosa

Id+1 over a set. Notice that ∈/K is deﬁned as λa . a∈ / K and, again, sel = Id+1 ﬁlter{0} . Finally, notice that when B is non commutative, which is the case of, e.g., B X = X8, not only is commutativity lost for several combinators, but also two non bisimilar versions of ⊗ emerge, based, respectively, in δr and δl,asequationδr = δl holds only for commutative monads.

Reactive Processes. Up to this point we have assumed an ‘active’ interpretation of processes. Similarly, an universe for reactive processes may be specified as a final coalgebra ω for T X =(B X)Act,whereB captures, as usual, the behaviour structure. Notice that in this paragraph the overbar notation is used to refer to the transpose of a morphism under the curry/uncurry isomorphism. Let us revisit briefly the process combinators in this new setting, taking again B as the finite powerset monad. The definitions of inaction, prefix and choice follow closely the ones already considered for ‘active’ processes, reflecting, however, the fact that Act appears now as an exponent. Notice, for example, how prefixing a process p by an action a results in a new process which is blind for every stimulus different from a. Formally,

ω · nil = ∅·!

ω · a. = ((=a ·π2) → sing · π1, ∅·!) ω · +=∪ · (ω × ω) · pdl where pdl :(X × Y ) × Z −→ (X × Y ) × (Y × Z)isdefinedasπ1 × id,π2 × id. For any K ∈ L and renaming homomorphism f, restriction and renaming are given by \K =[(αK )] a n d [ f]=[(αf )] , w h e r e αK =((∈/K ·π2) → ω, ∅·!) and αf = ω · (id × f). Notice that both f and the restriction set K act by constraining the set of meaningful stimuli, thus before the effective derivation is computed. Under the ‘active’ interpretation their effect was defined over the (previously computed) derivations. The synchronous product is defined, in this setting, as ⊗ =[(α⊗)] , w h e r e ,

(ω×ω)/ prod / Act α⊗ = ν × ν PνAct ×PνAct P(Pν ×Pν) P Act (Pδr ) ∪Act /PP(ν × ν)Act /P(ν × ν)Act where

 →∅ { | ∀ } prod d1,d2 = λa . (a =0 , d1 a1,d2 a2 a1,a2∈Act .a= a1θa2 )

Recall that in the ‘active’ case, interaction was neatly captured by the distribution law for the P(Act × Id) monad, which is no longer the case here. This entails the need to introduce function prod, which, additionally, ﬁlters 63 Barbosa out synchronisation failures. On the other hand, the deﬁnition of the interleaving combinator corresponds to the one for the ‘active’ case, but for the replacement of set union by merge where merge d1,d2 = λa. d1 a ∪ d2 a. Thus, =[(α)] w i t h

× × × / (ω id) (id ω)/ Act Act α = ν × ν (ν × ν) × (ν × ν) (Pν × ν) × (ν ×Pν )

τr×τl merge /P(ν × ν)Act ×P(ν × ν)Act /P(ν × ν)Act where, of course, the right and left strengths are relative to the (PId)Act monad. The same observation applies to parallel composition, which arises, once again, as a combination of product and interleaving at the ‘genes’ level. The basic properties of this model of reactive processes are essentially the properties of the corresponding calculus of ‘active’ processes. The proof style is also similar, although calculation resorts now heavily to the properties of exponentiation (see [2]).

Process Languages and Prototyping. One advantage of thinking about processes as inhabitants of (coinductive) types is the possibility of developing prototypes for process calculi in functional languages supporting such types. Once a prototype implementation of a particular calculus is developed, processes can be defined in the language and their execution traced. Furthermore, to deal with recursive processes, a language of process expressions has to be defined allowing for guarded occurrences of process variables as valid terms. As expected, a term language for processes, over a set of labels, is defined as an inductive type. The set of terms is then taken as the carrier of a B (Act×Id)-coalgebra, whose coinductive extension (i.e., the associated anamorphism) provides an interpreter for the calculus. Moreover, process environments have to be introduced to collect all the process defining equations relevant to conduct experiments on a particular network of processes. The environment acts as context information for the interpreter which is, consequently, redefined as a strong anamorphism. Such a strategy is used by the author, in [2], to develop Charity implementations of interpreters for elementary process calculi parametric on the interaction structure.

References

[1] P. Aczel. Final universes of processes. In Brooks et al, editor, Proc. Math. Foundations of Programming Semantics. Springer Lect. Notes Comp. Sci. (802), 1993.

[2]L.S.Barbosa.Components as Coalgebras. PhD thesis, Universidade do Minho (submitted), 2000. 64 Barbosa

[3] L. S. Barbosa. Components as processes:An exercise in coalgebraic modeling. In S. F. Smith and C. L. Talcott, editors, FMOODS’2000 - Formal Methods for Open Object-Oriented Distributed Systems, pages 397–417, Stanford, USA, September 2000. Kluwer Academic Publishers.

[4] R. Bird and O. Moor. The Algebra of Programming. Series in Computer Science. Prentice-Hall International, 1997.

[5] R. S. Bird. An introduction to the theory of lists. In M. Broy, editor, Logic of Programming and Calculi of Discrete Design, volume 36 of NATO ASI Series F, pages 3–42. Springer-Verlag, 1987.

[6] R. S. Bird and L. Meertens. Two exercises found in a book on algorithmics. In L. Meertens, editor, Program Speciﬁcation and Transformation, pages 451–458. North-Holland, 1987.

[7] R. Cockett and T. Fukushima. About Charity. Yellow Series Report No. 92/480/18, Dep. Computer Science, University of Calgary, June 1992.

[8] T. Hagino. Category Theoretic Approach to Data Types. Ph.D. thesis, tech. rep. ECS-LFCS-87-38, Laboratory for Foundations of Computer Science, University of Edinburgh, UK, 1987.

[9] C. Hermida and B. Jacobs. Structural induction and coinduction in a ﬁbrational setting. Information & Computation, (145):105–121, 1998.

[10] C. A. R Hoare. Communicating Sequential Processes. Series in Computer Science. Prentice-Hall International, 1985.

[11] B. Jacobs. The temporal logic of coalgebras via Galois algebras. Techn. rep. CSI-R9906, Comp. Sci. Inst., University of Nijmegen, 1999.

[12] G. R. Malcolm. Data structures and program transformation. Science of Computer Programming, 14(2–3):255–279, 1990.

[13] E. Meijer, M. Fokkinga, and R. Paterson. Functional programming with bananas, lenses, envelopes and barbed wire. In J. Hughes, editor, Proceedings of the 1991 ACM Conference on Functional Programming Languages and Computer Architecture, pages 124–144. Springer Lect. Notes Comp. Sci. (523), 1991.

[14] A. J. R. G. Milner. Communication and Concurrency. Series in Computer Science. Prentice-Hall International, 1989.

[15] L. Moss. Coalgebraic logic. Ann. Pure & Appl. Logic, 1999.

[16] A. Tarski. A lattice–theoretic ﬁxpoint theorem and its applications. Paciﬁc Journal of Mathematics, 5:285–309, 1955.

[17] D. Turi and J. Rutten. On the foundations of ﬁnal coalgebra semantics:non- well-founded sets, partial orders, metric spaces. Mathematical Structures in Computer Science, 8(5):481–540, 1998. 65 Barbosa

[18] G. Winskel and M. Nielsen. Models for Concurrency. In S. Abramsky, D. M. Gabbay, and T. S. E. Gabbay, editors, Handbook of Logic in Computer Science (vol. 4), pages 1–148. 1995.

66 CMCS’01 Preliminary Version

Generalised Coinduction

F. Bartels 1

CWI, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands http://www.cwi.nl/~bartels

Abstract We introduce the λ-coiteration schema for a distributive law λ of a functor T over a functor F. Under certain conditions it can be shown to uniquely characterise functions into the carrier of a final F-coalgebra, generalising the basic coiteration schema as given by finality. The duals of primitive recursion and course-of-value iteration, which are known extensions of coiteration, arise as instances of our framework. One can furthermore obtain schemata justifying recursive specifications that involve operators such as addition of power series, regular operators for languages, or parallel and sequential composition of processes. Next, the same type of distributive law λ is used to generalise coinductive proof techniques. To this end, we introduce the notion of a λ-bisimulation relation. It specialises to what could be called bisimulation up-to-equality or bisimulation up- to-context for contexts built from operators of the type mentioned above. We state that every such relation is contained in some larger conventional bisimulation and demonstrate that this principle leads to simpler bisimilarity proofs using less complex relations.

1 Introduction

Around the early nineties, the categorical notion of a final F-coalgebra for a functor F on some category C was found useful for the abstract description of possibly infinite objects. Examples are datatypes such as infinite streams or dynamical systems like processes or automata (see e.g. the introductions by Jacobs and Rutten [JR96,Rut00b]). Treating these different entities uniformly as behavioural systems of some type F allowed for an abstract formulation of definition and proof principles that have been studied separately for various applications before. States of a final coalgebra can be characterised using a principle which is given directly by finality. We call it coiteration here to distinguish it from

1 The research reported here was part of the NWO project ProMACS. This is a preliminary version. The final version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Bartels derived variants. The main tool for proving states equal is that of an F- bisimulation [AM89], a categorical generalisation of notions of bisimulation used for different concrete systems. But these basic principles are often too rigid to nicely cover given examples. Many functions into the carrier of a final coalgebra can be shown not to be coiterative themselves and many statements about behavioural equivalence need bisimulations which are difficult to exhibit and check. Therefore several extensions for particular settings are studied. Ex- amples of a more general type are the definition principles that arise as the duals of primitive recursion and course-of-value iteration. The specification of processes by systems of recursive equations forms another one. For proof purposes one often uses relations which instead of being bisimulations themselves satisfy sufficient conditions for being contained in some bisimulation. These are often called bisimulations up-to (see e.g. Milner [Mil89]). Sangiorgi [San98] has introduced a framework to derive such principles for labeled transition systems. A categorical formulation of the schemata that form the duals of (i) primitive recursion and (ii) course-of-value iteration has been given, see e.g. [UV99]. A first step towards a more general description of extended principles on this level has been made by Lenisa in the course of her comparison of set-theoretic and coalgebraic (categorical) formulations of coinduction [Len99a]. She has demonstrated herself that her framework captures the arrows obtainable by (i) and the same can be shown for (ii), but in neither case does her theory directly yield the universal characterisation known for these morphisms. By slightly modifying Lenisa’s approach we arrived at a categorical description of generalised coinductive definition and proof principles based on such a universal characterisation which specialises directly to both principles from above. We introduce the λ-coiteration definition schema and mention two different sufficient conditions for it to uniquely characterise functions into the carrier of a final F-coalgebra. The framework is parametric in a distributive law λ of another functor T over F. Simply speaking, the conventional coiteration schema assigns infinite behaviours to the states of a set X by specifying for each element a direct observation and successor states determining the next layer of the behaviour. Since these successors are taken from X again, the observation can be continued with the same specification. A different approach is taken by our λ-coiteration schema. It allows these successors to be taken from TX for another functor T. This increases the expressiveness of the format in case TX can be regarded as being “richer” than X. For the observations to be continued with these successors, the distributive law λ of T over F is used to lift the specification for X to TX. Making similar use of T and λ,wedefinethenotionofaλ-bisimulation and show that the same conditions as for the validity of the λ-coiteration schema are sufficient to show that all states in such a relation are bisimilar. A small example demonstrates that the technique enables simpler proofs involving less complex relations. Our notion of λ-bisimulation could possibly link the use of

68 Bartels distributive laws to (a categorical reformulation of) Sangiorgi’s set-theoretical bisimulation proof method for labeled transition systems [San98]. The categorical formulation of primitive corecursion and the dual of course- of-value iteration mentioned above can be obtained from the λ-coiteration schema for suitable instantiations of T and λ. Moreover, we briefly explain how it can be used to justify the validity of specifications involving operators of a certain type. These are the operators that Turi and Plotkin [TP97] have shown to be closely related to those definable by structured transition rules in GSOS format [BIM95]. Presenting these schemata as instances of our framework at the same time produces the corresponding variants of the λ-bisimulation proof technique. For the case of definitions via operators one obtains a notion of bisimulation up-to-context [San98] for multivariate contexts (i.e. context with several “holes”) built from operators of the type mentioned above. This provides an abstract justification for the usability of these operators for this purpose. The theory presented is stated for some category C, but for easy reading our explanations and examples all refer to the category of sets and total functions, Set. This paper is a short version of a CWI technical report under the same name [Bar00]. The long version contains detailed proofs for most of the statements given here and further examples.

2 (Co)Algebras and (Co)Iteration

In this section we will brieﬂy present a categorical view on algebras and coalgebras. More detailed expositions can be found e.g. among the papers of Jacobs and Rutten [JR96,Rut00b]. We take C and T, F:C → C to denote a category and two functors on it. Deﬁnition 2.1 [T-algebra, F-coalgebra] A T-algebra is a pair X, β where X is an object of C and β :TX → X is an arrow in C. We will sometimes call X and β the carrier and operation of the algebra. Dually, an F-coalgebra is a pair X, α where the operation α : X → FX is an arrow going into the reversed direction.

Generally, algebra operations can be seen as a means for constructing elements of their carrier. The operation of a coalgebra – also called destruction or unfolding elsewhere – gives us information about its states, either in terms of attributes or (potential) successor states. In the ﬁrst case we will talk about the observation a state allows, in the second about its dynamics. Deﬁnition 2.2 [homomorphism] An arrow h : X → Y is a T-algebra homomorphism from one T-algebra X, βX to another T-algebra Y,βY if it makes the left diagram below commute. Similarly, an F-coalgebra homomorphism from one F-coalgebra X, αX to another F-coalgebra Y,αY is an arrow h : X → Y making the right diagram commute: 69 Bartels

TX Th /TY X h /Y

βX βY αX αY / / X h Y FX Fh FY We will often just talk about homomorphisms when their type is clear from the context. Algebras and coalgebras together with the corresponding T homomorphisms form the categories Alg and CoalgF respectively. Definition 2.3 [final F-coalgebra] A final F-coalgebra isafinalobjectinthe category of F-coalgebras CoalgF, i.e. a coalgebra – usually denote here by ΩF,ωF – such that there exists exactly one homomorphism from every other F-coalgebra to it. Example 2.4 Some of our later examples will involve infinite streams of real numbers σ ∈ IR ω. These arise as states of the final coalgebra of the functor S:Set → Set where SX := IR × X. Coalgebras of this functor have the shape X, o, s for a set X and two functions o : X → IR and s : X → X.Thatis, for each state we can observe a real number and move to a successor state. The infinite streams of real numbers form the final S-coalgebra IR ω, head, tail, ω where for each stream σ = σ0,σ1,...∈IR the observation is given by its first element head(σ):=σ0 and the successor by the stream that remains after removing it, tail(σ):=σ1,σ2,.... We will call S-coalgebras stream systems in the following.

The ﬁnality of an F-coalgebra ΩF,ωF yields an arrow h : X → ΩF for every F-coalgebra operation α on X by assuming it to be the unique homomorphism from X, α to ΩF,ωF. Such an arrow is then called the coiterative arrow deﬁned (or induced) by α: _ _∃!h_ _/ X ΩF

∀α ωF _ _ _/ FX Fh FΩF Example 2.5 [Coiteration for Streams] The coiteration schema for stream systems from Example 2.4 states that for every pair of functions o : X → IR and s : X → X there is a unique function f : X → IR ω satisfying head(f(x)) = o(x), tail(f(x)) = f(s(x)). As an example we take a look at the element-wise addition of two streams. By the coiteration schema there is a unique function ⊕ : IR ω × IR ω → IR ω satisfying the following two equations: head(σ ⊕ τ)=head(σ)+head(τ) tail(σ ⊕ τ)=tail(σ) ⊕ tail(τ)

70 Bartels

As a proof principle, one can show that for a final F-coalgebra ΩF,ωF two arrows h1,h2 : X → ΩF are equal by providing an F-coalgebra operation on X for which both arrows are homomorphisms. Most of the time one exploits this property for proving that two states show the same behaviour, i.e. that they are mapped onto the same state of the final coalgebra by the coiterative morphisms. Technically one uses a derived principle based on the notion of bisimulation. We work with a definition in terms of spans:

Definition 2.6 [span] In a category C,aspan R = R, r1,r2 between two C objects X and Y consists of an object R and two arrows r1 : R → X and r2 : R → Y . A span between X and itself is called a span on X. There is a preorder ) of spans between the objects X and Y defined as R, r1,r2)S, s1,s2 if and only if there is an arrow f : R → S such that both triangles in the following diagram commute: O r1nnn R OOrO2 wnnn OO' XYPgP ∃f o7 PPP ooo s1 P oo s2 S Definition 2.7 [Bisimulation] For a functor F : C → C,anF-bisimulation between two F-coalgebras X, αX and Y,αY is a span B = B,b1,b2 between their carriers X and Y , such that there is an F-coalgebra operation γ : B → FB turning b1 and b2 into homomorphisms: o b1 b2 / X B Y αX ∃γ αY o / FX Fb1 FB Fb2 FY A bisimulation between a coalgebra X, α and itself is called a bisimulation on X, α. In Set one often only considers bisimulations which are relations, i.e. spans B,π1,π2 for a relation B ⊆ X × Y (see e.g. Rutten [Rut00b]). We use the formulation based on spans because it generalises to other categories and is sometimes easier to work with. We will still often talk about bisimulation relations. This is justified by the observation that every span R, r1,r2 in Set can be regarded as representing the image r1,r2[R] ⊆ X × Y . The order ) of spans corresponds to relation inclusion of images. Furthermore the image of a (span) bisimulation is again a (relational) bisimulation. One usually calls two states bisimilar if they are related by some bisimulation relation and takes bisimilarity to be the notion of behavioural equivalence. This is justified by the fact that all bisimilar states are mapped onto the same element of a final coalgebra by the coiterative morphisms, if it exists. This can easily be seen as follows:

Let B ⊆ X × Y be a bisimulation between the F-coalgebras X, αX and Y,αY containing the pair x, y, where the bisimulation property is witnessed by γ : B → FB.LethX and hY denote the coiterative arrows from X, αX 71 Bartels

and Y,αY to the ﬁnal coalgebra ΩF,ωF, then one gets the diagram below in CoalgF, commuting by ﬁnality. This yields hX (x)=hX (π1( x, y )) = hY (π2(x, y)) = hY (y)aswanted. R, γ ?? π1 ??π2 ? X, αX? Y,αY ?? ?? hX hY ΩF,ωF

3 Deﬁnition by λ-Coiteration

The states of a final F-coalgebra ΩF,ωF can be argued to represent abstract behaviours of the type F. An arrow f : X → ΩF assigns such behaviours to the elements of X. The coiteration schema allows us to define this function such that it maps every element of X to the behaviour it exhibits as a state in a given F-coalgebra X, α. Unfortunately, for many functions f : X → ΩF that one wants to specify there is no F-coalgebra operation α on X itself making f the coiterative morphism or it may not be obvious from the given specification of f. In this section we are first going to present a specification of this kind. Then we will formulate the encountered pattern more abstractly and – as the main theorems of this paper – state sufficient conditions for it to uniquely characterise arrows into the carrier of a final coalgebra. This yields a definition schema that we call λ-coiteration.

3.1 Example: Multiplication of Formal Power Series

Like Rutten [Rut00a], we will consider infinite streams of real numbers σ = ∈ ω ∞ i σ0,σ1,... IR as representations of formal power series i=0 σiX .The operation ⊕ from Example 2.5 turns out to be such that σ ⊕ τ represents the sum of the two series represented by σ and τ. Similarly, we would like to define a binary operation ⊗ on IR ω such that σ ⊗ τ represents the convolution product of the two series represented by σ and τ. After some computation (c.f. [Rut00a]) one arrives at the following specification, where the constant series r ∈ IR is represented by [r]=r, 0, 0,...∈IR ω: 2 head(σ ⊗ τ)=head(σ) · head(τ), tail(σ ⊗ τ)=([head(σ)] ⊗ tail(τ)) ⊕ (tail(σ) ⊗ τ). These two equations do not form a coiterative definition as in Example 2.5 because of the use of ⊕ in the expression for the tail. To get a better picture of the type of definition we have here, we set X := IR ω × IR ω and o : X → IR , s1,s2 : X → X with

2 Coiteratively: head([r]) = r and tail([r]) = [0]. 72 Bartels

o(σ, τ):=head(σ) · head(τ),

s1(σ, τ):=[head(σ)], tail(τ), s2(σ, τ):=tail(σ),τ. The equations above can alternatively be stated by asking ⊗ to ﬁt into the diagram below: ⊗ X ______/IR ω

o, s1,s2 head,tail IR × (X × X) ______/IR × IR ω IdIR×(⊕◦(⊗×⊗)) The question is whether this condition uniquely deﬁnes the arrow ⊗.

3.2 The general Pattern

To describe the format of the speciﬁcation from above more abstractly, we make the following two deﬁnitions:

Deﬁnition 3.1 [T-extension] Let β :TY → Y be a T-algebra. For a given arrow f : X → Y ,wecallf|β := β ◦ Tf :TX → Y the T-extension of f along β: Tf / G TX G TY G β G β f| G# / X f Y

Deﬁnition 3.2 [homomorphism up-to-β]LetX, φ be an FT-coalgebra and Y,α be an F-coalgebra with a T-algebra operation β :TY → Y on its carrier. An arrow f : X → Y is called a homomorphism up-to-β from X, φ to Y,α, if it makes the following diagram commute (note that in the picture we added an arrow for β to visualise its typing, though it does not contribute to the commutativity expressed): TY β f X /Y

φ α / FTX β FY Ff|

By taking the functor T := Id × Id, ⊗ becomes a homomorphism up- ω to-⊕ from the ST-coalgebra X, o, s1,s2 to IR , head, tail,theﬁnal S-coalgebra, since we can rewrite the arrow at the bottom of the diagram intended to specify ⊗ as follows: ⊕ IdIR × (⊕◦(⊗×⊗)) = S(⊕◦(⊗×⊗)) = S(⊕◦T⊗)=S⊗| . 73 Bartels

3.3 Distributive Laws We are looking for a framework in which a homomorphism up-to-β into a ﬁnal F-coalgebra ΩF,ωF uniquely exists for a given T-algebra operation β on its carrier. Since it is easy to see that this is not the case for all β,one needs to restrict the class of operations allowed. The unique existence of a homomorphism up-to-β means that an FT-operation on an object X speciﬁes an arrow h : X → ΩF, i.e. an assignment of abstract F-behaviours to the elements in X.

This suggest that the way β evaluates elements of TΩF can be described more globally via a specification of the interaction of T and F. In our approach we assume this to be given by a distributive law λ of T over F defined below, which uniquely determines the T-algebra operation on the final F-coalgebra. In the case of our example, the operation ⊗ can be seen to arise from such a distributive law λ. Definition 3.3 [distributive law] Let T, F:C → C be two functors. A natural transformation λ :TF⇒ FT is called a distributive law of T over F. We will sometimes alternatively use the phrase that T distributes over F via λ. A major application of the notion of a distributive law in computer science has been given by Turi and Plotkin [TP97] where the two functors were coming as a monad and a comonad respectively and additional coherence ax- ions involving the extra structure were considered (see the work by Power and Watanabe [PW99] for a structured account of this setting). Subsequently, distributive laws were also used in situations where less structure was assumed for the functors and the corresponding coherence axioms were dropped (see [LPW00]). Here we start with the case where only plain functors are given and no coherence axioms are assumed to hold, but we will also consider a richer case later. Regarding the elements from TX as structured entities containing elements from X as arguments, a distributive law tells us how to assign an F-step to such an entity given the steps the arguments can do. With this information, we can derive an F-coalgebra operation for TX from the F-coalgebra X, α: Definition 3.4 [λ-lifting] Given a distributive law λ of a functor T over a → → functor F, we can lift T : C C to the functor Tλ : CoalgF CoalgF on the F-coalgebras by setting λ Tλ X, α := TX,liftα and Tλ h := Th, for an F-coalgebra X, α and an F-coalgebra homomorphism h,where λ ◦ → liftα := λX Tα :TX FTX. For this definition to make sense we need the following lemma which is easily proved using the assumption on λ being natural: Lemma 3.5 (see also [Rut00b, Theorem 15.3]) Let h be an F-coalgebra 74 Bartels

homomorphism from X, αX to Y,αY ,thenTh is a homomorphism from Tλ X, αX to Tλ Y,αY .

In the case of our example the distributive law λ of T := Id × Id over S should give a global description of the addition of two states. For a set X we deﬁne λX :(IR × X) × (IR × X) → IR × (X × X)as

(1) λX (ox,sx, oy,sy):=ox + oy, sx,sy. It is easy to verify that we get a natural transformation indeed. Given a stream system X, o, s, the resulting system Tλ X, o, s has pairs of states from X as states. Unfolding such a pair x, y yields the sum of the two observations o(x)+o(y) and the pair of the successors s(x),s(y), that is the behaviour of x, y in Tλ X, o, s is the (stream) sum of the behaviours of x and y in X, o, s as wanted. With the lifting from above, a distributive law λ assigns F-behaviours to the set TX, given the behaviours of the elements from X. The generalised coinduction schema we are about to state and justify involves T-algebra operations β which preserve these behaviours. More precisely, for a given F- coalgebra X, α we consider homomorphisms β from Tλ X, α to X, α.It turns out that this situation is captured by the notion of a λ-bialgebra occurring in the literature, thus we recall its deﬁnition here:

Definition 3.6 [T,F-bialgebra, T,F-bialgebra homomorphism, λ-bialgebra] AT,F-bialgebra is a triple X, β, α of an object X and two arrows β : TX → X and α : X → FX, i.e. a T-algebra and an F-coalgebra operation on a common carrier. Given two T,F-bialgebras X, βX ,αX and Y,βY ,αY , aT,F-bialgebra homomorphism from X, βX ,αX to Y,βY ,αY is an arrow h : X → Y which is both, a T-algebra homomorphism from X, βX to Y,βY and an F-coalgebra homomorphism from X, αX to Y,αY . Like with T- algebras and F-coalgebras, T,F-bialgebras and their homomorphisms form a T category, denoted by BialgF . Given a distributive law λ of T over F, a λ-bialgebra is a T,F-bialgebra X, β, α such that the following diagram commutes: jj TX ujTjjαj TFX β λX λ-bialg.X FTX TTT α TTT) Fβ FX T The full subcategory of BialgF containing all λ-bialgebras is denoted by λ-Bialg. Note that indeed the definition of X, β, α being a λ-bialgebra is equivalent to saying that β is an F-coalgebra homomorphism from Tλ X, α to X, α as wanted above. With this remark, for a final F-coalgebra ΩF,ωF there is exactly one choice for a T-algebra operation βλ :TΩF → ΩF such that 75 Bartels

ΩF,βλ,ωF is a λ-bialgebra, namely the coiterative morphism from Tλ ΩF,ωF to ΩF,ωF. Bialgebras for a distributive law play an important role in the paper by Turi and Plotkin [TP97] as well. As with the definition of the distributive law, our notion of a λ-bialgebra differs from theirs because we do not assume T and F to come as a monad and a comonad. Consequently, in their setting X, β and X, α are assumed to be an algebra for the monad and a coalgebra for the comonad respectively. In our example the bialgebra under consideration is IR ω, ⊕, head, tail. It is easily verified to be a λ-bialgebra for λ as in (1).

3.4 The λ-Coiteration Definition Schema We are now ready to define our new schema called λ-coiteration for a distributive law λ of a functor T over F: Definition 3.7 [λ-Coiterative Arrow] Assume we are given a functor F with a final coalgebra ΩF,ωF and a distributive law λ of another functor T over F. For an FT-coalgebra X, φ we call an arrow f : X → ΩF a λ-coiterative arrow induced by φ if it is a homomorphism up-to-βλ from X, φ to ΩF,ωF for the unique arrow βλ such that ΩF,βλ,ωF is a λ-bialgebra (c.f. remark following Definition 3.6):

TΩF βλ _ _ f_ _/ X ΩF

φ ωF _ _ _/ FTX β FΩF Ff| λ We want to use the above diagram as a deﬁnition for f, calling it the λ- coiteration schema. To be able to do so, we need to prove the unique existence of a λ-coiterative arrow for any given FT-coalgebra X, φ. The two answers we are going to present work by establishing a one-to-one correspondence between the λ-coiterative arrows and the coiterative arrows from an F-coalgebra LX ,αφ constructed from X, φ. The unique existence of the former then follows from that of the latter. The constructions require additional assumptions on C or T and λ respectively. Theorem 3.8 (λ-Coiteration (1)) Assume the category C has countable coproducts and we are given a functor F:C → C with a ﬁnal coalgebra ΩF,ωF and another functor T:C → C that distributes over F via λ. Then, for every FT-coalgebra X, φ there exists a unique λ-coiterative arrow induced by φ.

For this approach, one constructs an F-coalgebra on the countable coprod- ∞ i i → i+1 uct i=0 T X. The operation is a case analysis involving φi :TX FT X and appropriate coproduct injections where the φi are obtained as liftings of 76 Bartels

→ φ via λ. The correspondence of homomorphisms up-to-βλ f : X ΩF and ∞ i → → ∞ → ◦ homomorphisms h : i=0 T X ΩF is given by f [fi]i=0 and h h in0 |βλ ∞ where f0 := f, fi+1 := fi , ini is the ith coproduct injection, and [.]i=0 is the countable case analysis. A detailed proof can be found in [Bar00]. Using this theorem we can finally answer positively the question whether the specification in Section 3.1 uniquely defines the function ⊗. The example was living in the category Set, which of course has countable coproducts. Theorem 3.9 (λ-Coiteration (2)) Assume we are given a functor F:C → C with a final coalgebra ΩF,ωF,amonadT,η,µ in C and a distributive law λ of this monad over F, i.e. a distributive law of T over F such that the following diagrams commute, which we will refer to as the unit and multiplication law of λ:

λ +3 F<<< TFBJ FTT\11 ÓÓÓÓ <<< 11 ÓÓ <<< 11 ηFÓÓÓÓ <<

3.5 Properties and an Extension As an interesting though trivial observation note that the coiteration schema itself arises as the λ-coiteration schema for the identity functor distributing over F via the natural transformation FId :IdF⇒ FId. Another observation in this direction is that when there is a unit natural transformation η :Id⇒ T for which λ satisﬁes the unit law (which is assumed in Theorem 3.9 anyway), then the corresponding instance of the λ-coiteration 77 Bartels schema individually generalises the coiteration schema: one can obtain a coiterative arrow induced by any F-coalgebra operation α : X → FX as the λ-coiterative arrow induced by FηX ◦ α : X → FTX (the crucial observation is that the assumption on λ makes βλ an algebra operation for the pointed βλ functor T,η, which yields f| ◦ ηX = f). One simple extension of the coiteration schema arises as the dual of primitive recursion and is therefore sometimes called primitive corecursion (see e.g. [UV99]). It states that in a category with binary coproducts every arrow φ : X → F(X +ΩF) uniquely determines a morphism f : X → ΩF making the diagram below commute. _ _ _ _∃!h_ _ _ _/ X ΩF

∀φ ωF F(X +Ω ) _ _ _ _ _/ F F[h,Id] FΩF This characterisation can be obtained as the λ-coiteration schema for the functor TX := X +ΩF and the distributive law λ :TF⇒ FT defined as λX := [Finl, Finr ◦ ωF]:FX +ΩF → F(X +ΩF). In loc. cit. the schema that arises as the dual of course-of-value iteration is also given categorically. It can be obtained as an instance of λ-coiteration too. We do not state it here since this requires quite some technicalities. We still would like to mention that for this second example lambda-coiteration simplifies the justification of the schema considerably, compared to a proof “by hand”. Finally, we would like to mention that in case the category C has binary products the results from above can be extended to a setting where the distributive law λ is replaced by a natural transformation of the type λ :T(Id× F) ⇒ FT. This increases the expressiveness in that one can use both, the F-step that some x ∈ X appearing inside t ∈ TX can do and x itself, for the specification of the behaviour of t. The Theorems 3.8 and 3.9 can be generalised to this new setting with minor modifications: In the first case the construction further requires the existence of binary coproducts in the category C, in the second the unit and multiplication laws for λ have to be adapted as follows: 3

π2 +3 × λ +3 Id × F F T(Id;C F) FT[c? ???? µ(Id×F) ???Fµ ηId×F ??? unit λ Fη ??? +3 2 2 T(Id × F) T (Id × F)O mult. λ 3;FT λ FT OOOO ooo OOOO oooo OOO ooo T Tπ1,λ O #+ oo λT T(T × FT)

3 Following a suggestion by Lenisa, Power, and Watanabe [LPW00], one can elegantly show this by moving from the functor F to the free copointed functor generated by F, namely Id × F,π1. 78 Bartels

4Proof by λ-Coinduction

In this Section we will consider generalised proof principles for bisimilarity. We give an example of a statement for which the straightforward relation fails to be a bisimulation. But it satisﬁes a condition suﬃcient to conclude that it is contained in some larger bisimulation. Again the condition is expressed via the existence of an FT-coalgebra operation on the relation instead of an F-coalgebra structure and we will call such relations λ-bisimulations.The corresponding proof principle allows for simpler bisimilarity proofs involving less complex relations.

4.1 Example: Distributivity of ⊕ over ⊗

We would like to prove that the addition of streams of real numbers defined in the Example 2.5 distributes over the multiplication considered in Section 3.1: ∀σ, τ, ρ ∈ IR ω : σ ⊗ (τ ⊕ ρ)=(σ ⊗ τ) ⊕ (σ ⊗ ρ) Since on the final coalgebra bisimilarity is contained in equality, it suffices to prove that both terms are bisimilar. The bisimulation to be found needs to contain B ⊆ IR ω × IR ω defined as (2) B := {σ ⊗ (τ ⊕ ρ), (σ ⊗ τ) ⊕ (σ ⊗ ρ)|σ, τ, ρ ∈ IR ω} One can easily check that all pairs related by B have equal heads. Further- more, the bisimulation to be found has to relate the tails of every two streams related by B. A computation yields ⊗ ⊕ ⊗ ⊕ ⊕ ⊗ ⊕ tail(σ (τ ρ)) = ([σ0] (τ ρ )) (σ (τ ρ)) =:x1 =:x2 ⊗ ⊕ ⊗ ⊗ ⊕ ⊗ ⊕ ⊗ ⊕ ⊗ tail((σ τ) (σ ρ)) = (([σ0] τ ) ([σ0] ρ )) ((σ τ) (σ ρ)) . =:y1 =:y2

We obtain sums of streams related by B,namelyx1,y1, x2,y2∈B.Thus, the bisimulation should also contain T (B)where ω ω T (B):={x1 ⊕ x2,y1 ⊕ y2|x1,y1, x2,y2∈B}⊆IR × IR . But then we need to check on the pairs contained in T (B) as well. The continuation of this procedure would lead to the relation ∞ B˜ := T i(B). i=0 To show that it is a bisimulation one could prove by induction on i ∈ IN that for x, y∈T i(B) one ﬁnds head(x)=head(y)andtail(x), tail(y)∈T i+1(B). 79 Bartels

The argument from above constitutes the base case of this induction, the induction step is easy and independent of B. 4 It proves the following principle: Given a relation B ⊆ IR ω × IR ω such that for x, y∈B one has head(x)=head(y)andtail(x), tail(y)∈T (B) ˜ ∞ i then B := i=0 T (B) is a bisimulation containing B. With this principle, one can work directly with the relation B as in (2) which only contained the pairs used in the statement itself. There is no need to ﬁnd a description covering all the tails encountered while walking down the streams, as it is done e.g. by Rutten [Rut00a], who uses n n B˜ := { (σi ⊗ (τi ⊕ ρi)), ((σi ⊗ τi) ⊕ (σi ⊗ ρi))| i=0 i=0 ω n ∈ IN , σ i,τi,ρi ∈ IR (0 ≤ i ≤ n)}.

4.2 λ-Coinduction We are now going to formulate the idea from above more generally. For a functor T that distributes over F via λ, it corresponds to asking for an FT- coalgebra operation χ on the relation B ⊆ X ×Y which makes the projections homomorphisms up-to-β for suitable T-algebra operations on the F-coalgebras under consideration. We introduce the following notion: Deﬁnition 4.1 [λ-bisimulation] Let F, T:C → C be functors. A span B = B,b1,b2 is a λ-bisimulation between the F-coalgebras X, αX and Y,αY , if there exist an FT-operation χ on B and T-algebra operations βX and βY on X and Y , such that X, βX ,αX and Y,βY ,αY are λ-bialgebras and b1 and b2 are homomorphisms up-to-βX and -βY respectively: TX T Y ∃βX ∃βY o b1 b2 / X B Y αX ∃ αY χ o / FX β FTB β FY Fb1| X Fb2| Y

If we want to make the T-algebra operations βX and βY explicit, we talk about a λ-bisimulation with respect to βX and βY .Aλ-bisimulation between X, α and itself will be called a λ-bisimulation on X, α.

It turns out that under the assumptions made in Theorems 3.8 or 3.9 about C or T and λ, λ-bisimulations only relate bisimilar states. This can be proved by giving a construction that enlarges a λ-bisimulation such that a conventional bisimulation is encountered, similar to what we have done in the

4 The induction step basically states that T is a respectful function in the terminology of Sangiorgi [San98] (with a translation of this notion from labeled transition systems to stream systems). 80 Bartels example. The argument does not assume a final F-coalgebra to exist, hence it can be used in any context where behavioural equivalence is defined via bisimulations. Theorem 4.2 (λ-coinduction) Let two functors T, F:C → C be given such that T distributes over F via λ and assume that (i) the category C has countable coproducts or (ii) T is taken from a monad T,η,µ such that λ satisfies the unit and multiplication law from Theorem 3.9. If B is a λ-bisimulation between two given F-coalgebras X, αX and Y,αY then there exists a (conventional) bisimulation B˜ between them with B)B˜.

The theorem is proved by constructing an F-coalgebra LB,αχ from B,χ in the same way as in the proof of Theorem 3.8 or 3.9 respectively. Here we only need the construction of F-coalgebra homomorphisms from LB,αχ to ˜ X, αX and Y,αY from the homomorphisms up-to-βX and -βY ,sayb1 → b1 ˜ and b2 → b2. (Note that this direction only needs the assumption that the bialgebras involved are λ-bialgebras, ﬁnality is not needed.) This yields a ˜ ˜ bisimulation LB,αχ, b1, b2. The order of spans claimed is witnessed by in0 in (i) and by ηB in (ii).

4.3 The Example revisited Consider again the example from Section 4.1 where we had T := Id × Id and λ as given in equation (1) on page 75. In this setting the deﬁnition of a λ-bisimulation can be spelled out as follows:

Let X, oX ,sX and Y,oY ,sY be two stream systems, both coming with binary operations ⊕X and ⊕Y on their carrier turning them into λ- bialgebras. The condition for a relation B ⊆ X × Y to be a λ-bisimulation between X, oX ,sX and Y,oY ,sY with respect to ⊕X and ⊕Y can easily be seen to be equivalent to the following one: For all x, y∈B we have that oX (x)=oY (y)andsX (x),sY (y)∈T (B)where

T (B):={x1 ⊗X x2,y1 ⊗Y y2|xi,yi∈B; i =1, 2}. ω By taking X, oX ,sX = Y,oY ,sY = IR , head, tail and ⊗X = ⊗Y = ⊗ we ﬁnd that the assumption on the relation B in the principle given in Section 4.1 is just to be a λ-bisimulation on the ﬁnal stream system. Theorem 4.2 tells us that this is enough to conclude that there is a bisimulation B˜ ⊆ IR ω × IR ω with B ⊆ B˜. The span constructed in the proof of Theorem 4.2 corresponds to B˜ from Section 4.1.

5 Example: λ-Coiteration and Terms

In this section we will again consider specifications involving operators in the definition of the dynamics of a state, but this time in a more flexible manner. The definition of the convolution product of formal power series treated in the 81 Bartels previous sections could only be handled by the simple functor T := Id × Id because of the very regular occurrence of the auxiliary operation inside the specification: the tail of σ ⊗ τ could always be characterised as the sum of two products of streams. Here we will give a schema that allows for arbitrary terms build of possibly several operators belonging to a certain class. Again we start with a concrete example.

5.1 The Stream of Hamming Numbers Taking up an example from Dijkstra’s [Dij81] (also treated e.g. by Sijtsma [Sij89]), we consider the stream ham ∈ IN ω containing exactly those natural numbers in increasing order whose only prime factors are 2 and 3. These numbers are usually referred to as the Hamming Numbers (admittedly we dropped the prime factor 5 for simplicity). Similar to the previous examples the carrier of the final SNI -coalgebra, where SNI X := IN × X, is taken to model infinite streams of natural numbers. We will concentrate on the following specification using the auxiliary operators merge (binary) and mapg (unary) for g : IN → IN : head, tail (ham)= 1, merge(map∗2(ham), map∗3(ham))   σ0, merge(σ ,τ) if σ0 <τ0 head, tail(merge(σ, τ)) = σ0, merge(σ ,τ ) if σ0 = τ0  τ0, merge(σ, τ ) if σ0 >τ0 head, tail (mapg(σ)) = g(σ0), mapg(σ ) ω for σ = σ0 : σ ,τ = τ0 : τ ∈IN and the functions ∗2, ∗3:IN → IN that double and triple their arguments. Theideaistoviewthestreamham as a function ham :1→ IN ω and to define it by λ-coiteration, where 1 = {∗} is a one element set. This would involve the functor T mapping a set X to the set of terms freely generated by corresponding operator symbols merge and map (g : IN → IN )overX and a g distributive law λ that equips these terms with a behaviour according to the definition of merge and mapg from above. The stream of Hamming Numbers should arise as the λ-coiterative arrow induced by φ := ∗ →1, merge(map (∗), map (∗)) :1→ S T1. ∗2 ∗3 NI We are going to study this schema in a more general setting in the remainder of this section.

5.2 The λ-Coiteration Schema on Terms Assume we are given a signature Σ,ar, i.e. a set Σ of operator symbols coming with an arity assignment ar :Σ→ IN .LetT,η,µ denote the term monad generated by this signature. That is, TX is the set of terms freely generated by Σ,ar over X (we will sometimes call the set X asetof 82 Bartels variables in this context), Tf renames all variables inside a term by applying f, ηX converts a variable x ∈ X into a term (we will usually leave its application 2 implicit), and µX :T X → TX ﬂattens two-level terms. To use the functor T within λ-coiteration, we need to specify its interaction with F, i.e. a distributive law λ :TF⇒ FT or, using the generalisation mentioned at the end of Section 3.4, λ :T(Id×F) ⇒ FT. We will concentrate on the latter type, since many examples for this approach require the greater expressiveness it oﬀers.

Assume we are given such a distributive law λ.LetΩF,ωF be a final F-coalgebra and let [[.]] denote the coiterative morphism from Tλ ΩF,ωF to ΩF,ωF. By applying Theorem 3.8 we get that for every operation φ : X → [[ .]] FTX there is a unique arrow f : X → ΩF satisfying ωF ◦ f =Ff| ◦ φ. In the above example this means that ham is a unique arrow satisfying head(ham)=1andtail(ham)=[[merge(map (∗), map (∗))]].Forthistobea ∗2 ∗3 solution for the above specification we need that [[.]] evaluates terms according to the (semantic) operators merge and mapg. Generally, we should be able to ∗ n → ∈ associate operators op :ΩF ΩF to all operator symbols op Σ, ar(op)=n such that for σ ∈ ΩF, ti ∈ TΩF ∗ [[σ]] = σ and [[op(t1,...,tn)]] = op ([[t1]],...,[[tn]]). This property is equivalent to [[.]] being an algebra for the monad T,η,µ and it happens to be one if λ satisfies the modified unit and multiplication laws shown in Section 3.5. This again is equivalent to saying that λ is constructed inductively from natural transformations (3) ρop :(Id× F)n ⇒ FT, for each operator symbol op ∈ Σ, ar(op)=n. Then the semantic operators op∗ are unique ones satisfying ∗ op ωF(op (σ1,...,σn)) = (F[[.]])(ρ (σ1,ωF(σ1),...,σn,ωF(σn))). Taken together, the approach yields unique solutions for specifications involving a given set Σ∗ of auxiliary operators as in our example. The operators ∗ ∗ ∗ have to be such that for any op ∈ Σ the first F-step of op (σ1,...,σn)can be determined by only looking at the direct observations for σi ∈ ΩF and the successor states contained are among the σi, their immediate successors, or any state that can be obtained from those by applying the operators in Σ∗ (these restrictions are due to the naturality condition in (3)). The operations merge and mapg can easily be seen to fit into this framework. The proof principle that we get from the instance of the λ-coinduction principle in Theorem 4.2 when T and λ are constructed from Σ,ar and ρop (op ∈ Σ) states that a relation B ⊆ ΩF × ΩF with the following property only contains bisimilar states (for simplicity of presentation we treat relations on the final F-coalgebra only): ∗ Let again op for op ∈ Σ denote the decomposition of [[.]] : T λ ΩF,ωF→ ˜ ∗ ΩF,ωF as above. We set B to be the congruence closure of B under op for 83 Bartels op ∈ Σ, i.e. the smallest relation B˜ containing B such that for all op ∈ Σ, ∗ ∗ 5 ar(op)=n and σi,τi∈B˜ also op (σ1,...,σn),op (τ1,...,τn)∈B˜. We get that B is a λ-bisimulation on ΩF,ωF if and only if there is an operation χ : B → FB˜ making the diagram below commute (note that the projections πi in the upper and lower part of the diagram are taken w.r.t. different relations): o π1 π2 / ΩF B ΩF ωF ∃χ ωF FΩ o ˜ /FΩ F Fπ1 FB Fπ2 F A relation satisfying this condition is sometimes called a bisimulation up-to- context [San98]. Turi and Plotkin [TP97] show that specifications of operators by means of natural transformations ρop as in (3) are closely related to those by means of transition rules in GSOS format [BIM95]. Many important operators are definable this way. The operators on formal power series Rutten considers for his stream calculus [Rut00a], of which we have seen the sum and convolution product already, all fit into this framework (some of the properties he proves in Theorem 4.1 in loc. cit. could nicely be established using bisimulations up- to-context). Other examples are regular operators on automata and many operators used in Process Algebra, like parallel composition. Our treatment gives an abstract justification for these operators to be safely usable within a proof principle based on bisimulation up-to-context.

6 Related and Future Work

As mentioned in the introduction, we developed our framework as an ad- vancement of Lenisa’s schemata [Len99a]. She defines the definition principle of coiteration up-to-T for a pointed functor T = T,η – i.e. a functor T:C → C with a natural transformation η :Id⇒ T–andaproofprinciple based on the notion of bisimulation up-to-T , which additionally assumes that the pointed functor distributes over the behaviour functor F via some λ and that it is actually taken from a monad T,η,µ. In this case one can argue that there is a close relationship between our λ-coiterative arrows and her arrows coiterative up-to-T . Our definition principle differs from Lenisa’s in that it comes as a direct characterisation of the arrows to be defined instead of a construction. For the proof principles, λ-coiteration generalises the one based on bisimulations up-to-T in that it is no longer limited to bialgebras of the shape TX,µX ,α. These are central in Lenisa’s setting since the coalgebras involved appear within coiteration up-to-T . The main impact of this generalisation is pre- sumably the fact that it allows one to work directly on the final coalgebra, as we have demonstrated in our example. This is particularly helpful since it

5 [[ .]] [[ .]] The relation B˜ represents the span TB,π1| ,π2| . 84 Bartels obviates the need to reason up-to-bisimilarity, which arises in many examples but is possible with neither of the two principles yet. Generally, in relation to Lenisa’s theory we have emphasised and clarified the role of the distributive law, paying less attention to the monad structure. Lenisa has used bisimulations up-to-T to derive a sound principle for proving equivalences induced by arrows coiterative up-to-T . In our case of a λ- coiterative arrow f for φ : X → FTX her statement simplifies to the immediate observation that all FT-bisimular states in X, φ are equated by f.There are clearly stronger statements to be found about the equivalence induced by λ-coiterative arrows, which we leave for future work. Another paper that our work on bisimilarity proofs is related to is that by Sangiorgi [San98]. It is about relational bisimulations for labeled transition systems in Set, but a reformulation in terms of spans between arbitrary F- coalgebras in an abstract category C should be possible (certainly requiring extra assumptions to be made for particular aspects of the theory to carry over). We expect that our notion of λ-bisimulation can be made to fit into such a framework (as argued in [Bar00]). This may open our technique for further examples covered by Sangiorgi’s method, like bisimulation up-to-bisimilarity (see e.g. [Mil89]). Our use of a distributive law λ and λ-bialgebras follows that of Turi and Plotkin [TP97]. Starting from a signature functor Σ and a behavioural functor F, they describe program syntax using the term monad T,η,µ generated by Σ and global program behaviour by means of the behavioural comonad D,ε,δ generated by F. The specification of the semantics takes the shape of a distributive law λ of the monad over the comonad and its models are the λ-bialgebras. In the example of corecursive definitions using operators from Section 5 our theory gets related to their setting, in that the functor T is taken from the term monad as well and that the class of well behaved operators here coincides with the main class considered by Turi and Plotkin. We add to their approach a separate treatment of specifications of the type φ : X → FTX, which are similar to what is sometimes called guarded recursive definitions. One of the advantages of treating the operators and solutions of the recursive specifications separately is that we obtain a notion of bisimulation up-to- context to reason about these solutions. The setting of Turi and Plotkin is more general in the sense that they consider distributive laws of the term monad over the behavioural comonad D,ε,δ generated by F instead of F alone. This allows them to also treat a second major class of operators corresponding to tree rules. We cannot handle this class in our setting, which is in agreement with the fact that its operators may not safely be used for bisimulation up-to-context. Sangiorgi has already pointed out that this condition is stronger than only asking that bisimulation be a congruence for the operators under consideration (see the example at the end of Section 2 in [San98]). The latter condition is the decisive one for Turi

85 Bartels and Plotkin’s approach. We left for future work the quest for further interesting instances of our framework. Since we found suﬃcient conditions for our schema to work that do not assume the functor T to come as a pointed functor or monad, it would be particularly nice to come up with examples exploiting this generality. In all the examples we have so far this structure can be added immediately or at least after a straightforward reformulation of the problem. Another interesting aspect to study would be the connection between λ- coiterative arrows and invariant properties.

Acknowledgements I would like to thank my colleagues, in particular Jan Rutten, Alexandru Bal- tag, and Alexander Kurz, for discussions and guidance. I am further grateful to Dirk Pattinson for suggestions and to the anonymous referees for helpful and competent reports. Special thanks go to Matteo Coccia, who raised my interest in Category Theory during his stay at the CWI.

References

[AM89] Peter Aczel and Nax Mendler. A ﬁnal coalgebra theorem. In D.H. Pitt, D.E. Rydeheard, P. Dybjer, A.M. Pitts, and A. Poign´e, editors, Proceedings 3rd Conf. on Category Theory and Computer Science, CTCS’89, Manchester, UK, 5–8 Sept 1989, volume 389 of Lecture Notes in Computer Science, pages 357–365. Springer-Verlag, Berlin, 1989. [Bar00] F. Bartels. Generalised coinduction. Technical Report SEN-R0043, CWI - Centrum voor Wiskunde en Informatica, December 2000. [BIM95] Bard Bloom, Sorin Istrail, and Albert R. Meyer. Bisimulation can’t be traced. Journal of the ACM, 42(1):232–268, January 1995. [Dij81] E.W. Dijkstra. Hamming’s exercise in sasl. Personal Note EWD792, see http://www.cs.utexas.edu/users/EWD/, 1981. [JR96] Bart Jacobs and Jan Rutten. A tutorial on (co)algebras and (co)induction. Bulletin of the EATCS, 62:222–259, 1996. [JR99] B. Jacobs and J. Rutten, editors. Proc. Coalgebraic Methods in Computer Science (CMCS 1999), volume 19 of Electronic Notes in Theoretical Computer Science, 1999. [Len99a] M. Lenisa. From set-theoretic coinduction to coalgebraic coinduction: Some results, some problems. Unpublished extension of [Len99b] (available through the author’s homepage at http://www.dimi.uniud. it/~lenisa/), December 1999. [Len99b] M. Lenisa. From set-theoretic coinduction to coalgebraic coinduction: some results, some problems. In Jacobs and Rutten [JR99], pages 1–21. 86 Bartels

[LPW00] M. Lenisa, J. Power, and H. Watanabe. Distributivity for endofunctors, pointed and co-pointed endofunctors, monads and comonads. In H. Reichel, editor, Proc. Coalgebraic Methods in Computer Science (CMCS 2000), volume 33 of Electronic Notes in Theoretical Computer Science, pages 233–263, 2000.

[Mil89] R. Milner. Communication and Concurrency. International Series in Computer Science. Prentice Hall, 1989.

[PW99] J. Power and H. Watanabe. Distributivity for a monad and a comonad. In Jacobs and Rutten [JR99], pages 119–132.

[Rut00a] J.J.M.M. Rutten. Behavioural diﬀerential equations:a coinductive calculus of streams, automata, and power series. Technical Report SEN- R0023, CWI - Centrum voor Wiskunde en Informatica, 2000.

[Rut00b] J.J.M.M. Rutten. Universal coalgebra:A theory of systems. Theoretical Computer Science, 249(1):3–80, October 2000.

[San98] D. Sangiorgi. On the bisimulation proof method. Journal of Mathematical Structures in Computer Science, 8:447–479, 1998.

[Sij89] Ben A. Sijtsma. On the Productivity of Recursive List Deﬁnitions. ACM Transactions on Programming Languages and Systems, 11(4):633–649, October 1989.

[TP97] D. Turi and G.D. Plotkin. Towards a mathematical operational semantics. In Proc. 12th LICS Conf., pages 280–291. IEEE, Computer Society Press, 1997.

[UV99] Tarmo Uustalu and Varmo Vene. Primitive (co)recursion and course- of-value (co)iteration, categorically. INFORMATICA (IMI, Lithuania), 10(1):5–26, 1999.

87 CMCS’01 Preliminary Version

Deforestation,program transformation,and cut-elimination

Robin Cockett, 1,2

Department of Computer Science University of Calgary Calgary, Alberta, Canada

Abstract The problem of proving that two programs, in any reasonable programming language, are equivalent is well-known to be undecidable. In a formal programming system, in which the rules for equivalence are finitely presented, the problem of provable equivalence is semi-decidable. Despite this improved situation there is a significant lack of generally accepted automated techniques for systematically searching for a proof (or disproof) of program equivalence. Techniques for searching for proofs of equivalence often stumble on the formulation of induction and, of course, coinduction (when it is present) which are often formulated in such a manner as to require inspired guesses. There are, however, a number of well-known program transformation techniques which do address these issues. Of particular interest to this paper are the deforestation techniques introduced by Phil Wadler and the fold/unfold program transformation techniques introduced by Burstall and Darlington. These techniques are recognized to be shadows of an underlying cut-elimination procedure and, as such, should be more generally recognized as proof techniques. We show that these techniques apply to languages which allow both inductive and coinductive datatypes. The relationship between the above mentioned program transformation techniques and cut-elimination requires a transformation between the initial and final algebra proof rules into “circular proof” rules as introduced by Santocanale and as used implicitly in the model checking community. This transformation is sensitive to the ambient proof system. In the system we investigate, which is cartesian closed categories with coproducts and positive (initial and final) datatypes, closedness is an absolutely essential requirement. The cut-elimination theorems and attendant program transformation techniques rely heavily on this alternate presentation of induction and coinduction.

1 Email: [email protected] 2 Partially supported by NSERC, Canada. This is a preliminary version. The ﬁnal version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Cockett

1 Introduction

The problem of proving two programs equivalent in any reasonable programming language is well-known to be undecidable. In formal programming systems, in which equivalence is determined by a finite set of rules, the situation can be brought under better control as one can, at least, secure the semi-decidability of the problem. Despite this improvement, even in formal systems, there is still a significant lack of generally accepted automated techniques for systematically searching for a proof (or disproof) of the equivalence of two programs. Techniques for searching for a proof that two programs are equivalent often flounder on the formulation of induction and, of course, on coinduction (when coinductive types are present). This is reinforced by two strongly entrenched traditions: • The first is the mathematical tradition of inductive proofs in which an inspired “guess” to formulate an inductive step is often portrayed as being unavoidable. Not only is this view of developing proofs anathema to automation but it has also fueled the perception that searching for inductive (and therefore coinductive proofs) is invariably intrinsically difficult. • The second is the computer science habit of allowing general recursive programs which need not terminate. This means that thrown into any (inductive) proof of equivalence there is a second problem of showing the equivalence of termination conditions. This not only complicates the proof system but it also creates a psychological “black hole” into which two traditional impossibilities 3 have been concentrated!

1.1 Program equivalence in formal systems

The objective of this paper is to provide a view of the proof theory of inductive and coinductive datatypes which justiﬁes the discomfort with the traditional inductive proof techniques mentioned in the ﬁrst point. In order to do this I will present a formal programming language and discuss its proof theory. In doing this I will dodge the “black hole” mentioned in the second point above by choosing a the formal programming system which has good termination properties. The precise meaning of “good termination properties” is a little technical. While this is not the main focus of the paper, in order to understand more precisely what program equivalence actually means in these formal programming systems it is useful to understand more precisely what the ability to evaluate programs provides and does not provide.

3 It is beyond the scope of this paper, but perhaps worth mentioning, that recent advances in the equational understanding of partiality (see [6,3]) show that it is possible to formulate simple equational programming logics which give a formal account of partiality. This means that even out of this “black hole” might yet be teased a surprising amount of light! 89 Cockett

1.1.1 Evaluation in formal programming systems The language I shall consider is a variant of charity [5]: it has both inductive and coinductive datatypes. This means that we cannot promise termination in its full force. Consider, for example, the program which produces the infinite list of primes (which can certainly be written in this language): clearly it is impossible to produce this whole list. Instead the language has the weaker ability to reduce in a terminating manner to a “head normal form.” This allows the primes, in our example, to be produced one-by-one on a demand basis with the guarantee that we can always produce the next prime in a terminating manner. For further discussion of these issues see [16]. A head normal form will often contain further unevaluated material which can be unlocked by the process of coinductive destruction. This gives rise to a “lazy” evaluation strategy for coinductive types which is not simply a side- effect of evaluation order: it is quite fundamental to the whole type system. Outside the coinductive records (which, recall, are predetermined by type) one can perform evaluation in any order. A coinductive record in effect freezes its arguments. This means that one can in fact mix (with potential efficiency advantages) by-value evaluation and lazy evaluation in this sort of language without changing the termination properties. Now evaluation only applies to closed terms and thus, strictly speaking, cannot resolve the question of equivalence of programs – which will not usually be closed. Of course, in a higher-order language we may internalize arbitrary programs as closed terms of a higher type. However, these higher types are coinductive in nature so that evaluation simply produces the unevaluated program as a head normal form and, thus, no extra information is gained concerning equality. However, before we dismiss evaluation it is worth recalling that it can be used to disprove equivalence. Essentially the idea is to find ground “values” (or “points”) on which the two programs perform differently. For an arbitrary program this is achieved by a combination of providing values, and destructing to ground types. These processes are both enumerable so can be undertaken in a methodical manner. Clearly this technique of distinguishing programs using their pointwise behavior should be an integral component of any automated proof system as trying to prove something which is obviously false can cause a great deal of wasted effort!

1.1.2 Indistinguishable and formally equivalent programs We note that this discussion has revealed a fundamental (and possibly rather disturbing) feature of formal programming systems: two programs whose behavior cannot be distinguished on points are not necessarily provably equal. If they were we would immediately have a decision procedure for equality - which we cannot have! Thus, the indistinguishable programs and the provably equivalent programs in such systems are never the same. Thus, the fact that there are no points on which two given programs can be 90 Cockett distinguished does not imply that there is necessarily a proof that the two programs are equivalent. This, in turn, raise the unhappy specter that the proof system may be so weak that one may routinely encounter programs which are clearly indistinguishable on “points” and yet cannot be proven equivalent! In fact, in the system I shall be considering below this is not the case. It is known that equivalent and indistinguishable programs coincide for low complexity programs and therefore for these programs the equivalence problem is decidable. However, the precise manner in which program complexity relates to decidability – that is where indistinguishable and equivalence of programs coincides is an open question of some interest.

1.2 On formulations of induction and coinduction We now turn more directly to the issue of program equivalence and, in particular the question of how to tame inductive and coinductive program equivalence proofs. The ﬁrst small step away from standard mathematical induction based on the natural numbers is the step to, so called, structural induction. This allows a convenient expression of inductive arguments for arbitrary inductive (or initial) datatypes in much the same form as mathematical induction where the form of the inductive step is determined by the structure of the datatype (see Slind [13]). However, as a principle which could encompass coinductive as well as inductive datatypes in a programming setting it has signiﬁcant shortcomings. To start with its formulation relies on the presence of a calculus of subobjects (subsets): a structural inductive argument essentially works by showing that the subobject determined by a proposition must, in fact, be the complete object. Now it is arguable whether program logics a priori come equipped with a calculus of subobject strong enough to support these sorts of inductive arguments. Putting aside our concerns over the level of logic implied by these arguments, we are immediately forced to a further level of discomfort by what this approach implies for coinduction. For the analogous theory of structural coinduction would have to use a quotient structure. Again one may, this time with more force, argue that this is absent at the level of program logic. Further- more, this structure compared to that for subobject, if indeed we allow that it is present, is considerably less well-behaved. In practice one has to simulate this structure through the subobject structure and this leads one ultimately to the theory of bisimulation. Now, I do not want to suggest that the theory of bisimulation for coinductive types is without value: in fact, I believe it is an important tool in our understanding of coinduction. However, to reach this point one cannot fail to notice that one has had to employ a series of sophisticated contortions. In particular, these steps completely obscure the symmetry between inductive 91 Cockett and coinductive types as one has been forced to overlay the basic program logic with further structure which heavily emphasizes the asymmetry of the underlying setting. I think, therefore, it becomes a reasonable question to ask whether there might not be more direct approach than this to a proof system that encom- passes coinduction.

1.2.1 The categorical formulation An obvious alternative starting point, in which the symmetry between induction and coinduction is given by duality, uses the categorical universal property associated with inductive (initial) and coinductive (final) datatypes. This is familiar to category theorists who would claim that this the fundamental property expected of datatypes. The basic idea (see section 3 is that inductive (coinductive) datatypes are initial algebras (resp. final coalgebras) and as such have uniquely determined maps to all other algebras (resp. from all other coalgebras). However, there is a major problem with this (categorical) abstract view: it is difficult to use as a proof principle. In particular one may need considerable inspiration to produce an algebra (or coalgebra) which secures a desired equivalence. Thus, from the standpoint of proof automation, this is discouraging and hardly an improvement on the original situation described for mathematical induction. However, we should not loose sight of the fact that symmetry has been regained. Furthermore, these notions do seem to capture the fundamental properties of these structural types. Thus, any system we might choose to devise should, at a very minimum, satisfy the abstract categorical properties of initial and final datatypes.

1.2.2 Lessons from model checking Initial and final algebras, under the guise of least and greatest fixed points, are also used extensively in model checking and, in particular, the modal-µ calculus. These developments have found a lucrative application in hardware and protocol verification. While these applications are concerned with posetal models the abstract proof theory involved applies to programs viewed as proofs. Strikingly the more successful of these logics abandon the simple fixed point rules (which are in fact the categorical rules but in that community they are often called “the Kozen rules” [9]) in favor of more complex proof techniques. I will refer to these as circular proof systems following Santocanale [11] who introduced them to me in roughly the form I will use here. The advantage of this movement away from the simple fixed point rules is significant as these systems allow one to see quite easily that derivability, that is the ability to prove one thing from another, is decidable. While this is not the problem we wish to solve it is an indication of an 92 Cockett increased power that these circular proof systems provide. From a proof theoretic point of view there is also a strong imperative to adopt this modified proof system: the categorical or Kozen system does not have a very satisfactory formulation of cut elimination. At this point in time it may be rather hard to substantiate, from the model checking community itself, the claims that I am making. Model checkers do not generally take a proof theoretic view of the world. Indeed there is no direct published proof of cut-elimination for the modal-µ calculus even with the “circular proofs” (as far as I know). Instead there is a game theoretic argument which translates between proofs in the modal-µ calculus and winning strategies between games and then argues that winning strategies can be composed. A direct proof is, in fact, simpler and will appear, together with a suitable formulation of the modal-µ calculus, in Aldwinckle’s thesis [1].

1.3 The importance of cut-elimination A basic property of a proof system is that one should be able to compose proofs: this is the import of the cut rule in logical systems. The ability to compose proofs in this manner is clearly absolutely fundamental to the view of programs as proofs which we wish to exploit here. A crucial observation of Gentzen [14] is that logical systems often allow one to eliminate the cut by a series of proof rewritings. Basically this meant that if there was a proof from one proposition to another (which perhaps used some intermediate propositions) then there is a direct proof which does not use any intermediate propositions. Anyone familiar with Phil Wadler’s ideas on program deforestation [17] will see a parallel between the ideas there and the aim of cut-elimination. This is made more explicit in Simon Marlow’s thesis [10]. The idea of program deforestation is precisely to remove from the program the building of intermediate datatypes. This removal often results in more efficient programs, as intermediate structures are no longer built only to be destroyed, and so this has been of great interest to the program transformation and compiler community. It is tempting to think that cut elimination and deforestation are the same thing for programs. However, this is not the case: deforested terms can still have cuts present. Indeed, in some cases, there necessarily must be residual cuts. However, the cuts that do remain are “soft” in that they only rearrange structure -more precisely they are cuts essentially from the λ -calculus portion of the logic. The cuts that are removed are the “deforesting” cuts: these are the ones which cause the destruction of datatypes (including coproducts and products). In a logical system the cut-elimination process has further significance. It is a constructive process and its steps, which are usually not confluent actually determine equalities which are needed between proofs. 4 This is why

4 It is not the case that all proof equivalences are necessarily forced by cut elimination: logical systems can also have representation rules (such as “a comma on the left is equivalent 93 Cockett cut-elimination is also relevant to the discussion of program equivalence: thus, one can in some sense determine the notion of program equivalence from a cut elimination process. Now the form of the equivalence rules which arise naturally from these considerations in a circular proof system bear a striking resemblance to the fold/unfold methodology of program transformation [2]: this may help to explain why those ideas, despite being only partially correct, remain the basis for most of the high-level optimizing techniques for programs.

1.3.1 Transforming the proof system The transformation of the proof system from one based on initial and final datatypes (which I call the “algebra based” system) to circular proofs was a relatively easy conceptual step for the model checking community because their models were posetal. This relieved them of the necessity of maintain- ing the equality of the proofs. Unfortunately, when one regards proofs as programs, proof equality becomes the major concern. Also if proofs are to become programs in a programming language the manner of correctly expressing circular proofs becomes another matter of some interest. This direction has already been developed by David Turner: in fact, in [15] he argues that this style of programming, which he calls “elementary strong functional programming,” is the future of functional programming. The discussion in this paper links his ideas to the charity system by showing that his programming system (in so far as it is the circular proof system -and I have not checked this in detail) have the same power. It is a tribute to David Turner’s great programming intuition that he discovered and realized the significance of this system It turns out that a circular proof system implies the presence of initial and final datatypes but the reverse implication is dependent on the particulars of the proof system (if it is true at all). Given the nice properties of a circular proof system it is tempting just to use circular proofs directly rather than rules which might be implied by initial and final datatypes. However, circular proofs interact ultimately with all the rules of the system and if the whole package does not fit together the system simply will not be well-behaved. Therefore, I go to some trouble to establish that the programming logic, which I introduce below, has both a circular proof presentation and an algebra based presentation.

1.4 Contents After this rather lengthy introduction it is appropriate to show how these relationships play out for a particular programming logic. To this end we present a term calculus for cartesian closed categories with positive datatypes. We describe how it may be viewed as a proof system and illustrate some simple programming examples in the system. In this case the “circular” programs, to a product”) which may produce further proof identiﬁcations. 94 Cockett which are very much in the functional style, provide an equivalent formulation to the one implied by initial and ﬁnal datatypes. Finally we discuss the implied program equivalences, a technique for deforesting, and the possibilities for automating equivalence proofs.

2 Program logic for inductive and coinductive datatypes

We will set up the programming logic as a type theory as we wish to exemplify the propositions as types and programs as proofs paradigm. The semantic basis of our programming language is a cartesian closed category with positive datatypes. We shall take the view that the coproducts are primitive inductive datatypes while products and exponentiation with a fixed object (A ⇒ )are primitives of the coinductive structure. To avoid the usual issues which arise from having contravariant functors (and to keep things simple) we shall only admit as primitive functors: • The constant functors K; • The product functor X × Y ,orΠi∈I Xi for I a finite set, with 1 being the empty product; • The coproduct functor X + Y ,or i∈I Yi for I a finite set, with 0 being the empty coproduct; • Exponentiation with respect to fixed objects A as in A ⇒ X. We shall refer to the functors derived from these basic functors as the polynomial functors of the setting. These, of course, can involve multiple free type variables. These basic polynomials are the basis for forming the datatypes of the setting: these include the polynomials and those types which can be obtained from the basic functors and the following two forms of type binding: • An inductive binding µX.T giving the initial datatype or least fixed point of the type T with respect to the type variable X; • A coinductive binding νX.T giving the final datatype or greatest fixed point of the type T with respect to the type variable X. These constitute the positive types of the programming setting we describe below. The term annotated rules of the program logic can be described in the three tables which now follow. The first figure 1 gives the inference rules governing sums, products, and exponentiation: Here are some things to note about the terms defined in figure 1 for this logic which we should view as programs:

(i) The index sets I are all ﬁnite with element denoted by i1, ..., in. (ii) The variables “declared” on the left of the sequents must always be distinct but can in general be product patterns. Thus the following is a 95 Cockett

{ , } Γ1,xi : Xi, Γ2 ti : Y i∈I sumL , Γ1,z : i∈I Xi, Γ2 ixi → ti z : Y i∈I

{Γ , t : Y } , i i i∈I Γ t : Yi sum , prodR , R Γ (i : ti)i∈I :Πi∈I Yi Γ i(t): i∈I Yi

Γ1,x1 : X1, ..., xn : Xn, Γ2 , Y prodL Γ1, (i1 : x1, ..., in : xn):Πi∈I Xi, Γ2 , Y

Γ , s : Y Γx : X , t : Z x : X, Γ , t : Y close close Γf : Y ⇒ X , t[f@s/x]:Z L Γ , λx.t : X ⇒ Y R

Fig. 1. Sum, product, and closure rules

valid sequent:

(fst : x1, snd : x2):Xfst × Ysnd , x1 : X.

This exempliﬁes the “indexed” product notation. When there is no danger of confusion we will tend to omit the indexing letting the order of components carry this information. (iii) We follow the standard programming convention of separating the indexes of all products and coproducts so that the names of the indexes uniquely determine the type. (iv) Products have terms which are indexed records:

(fst : t1, snd : t2):Xfst × Ysnd.

the projection from this record to the ﬁrst coordinate is just the index of that coordinate, fst. (v) Maps from coproducts are expressed through “case” terms:     rgt x → t 1 →   : Xrgt + Ylft Z lft y → t2

and the coproduct embedding are denoted by the index. (vi) The logic uses λ abstraction with application given by the inﬁx operator @ which, as all operators do, binds more tightly that ordinary composition. The next ﬁgure 2 gives the structural rules and, in particular, the cut rule: 96 Cockett

Γ , t : Y id (A atomic) weak x : A , x : A Γ,x: X , t : Y

Γ ,x : X, x : X, Γ , t : Y 1 1 2 2 contr Γ1,x: X, Γ2 , t[x/x1,x/x2]:Y

Γ ,x : X ,x : X , Γ , Y 1 1 1 2 2 2 exch Γ1,x2 : X2,x1 : X1, Γ2 , Y

 Γ , t : C Γ1,x: C, Γ2 , t : Y cut Γ1, Γ, Γ2 , [x → t]t : Y

Fig. 2. Structural rules Notice that the ﬁrst rule introduces identity rules for atomic types.Itisim- portant to realize that the identity map for non-atomic types must be derived. This restriction is common in logics and greatly simpliﬁes the proof theory. However, from a programming point of view this looks a bit crazy! While it is possible to formulate a logic without the necessity to have “expanded” the identity maps it does make program equivalence checking slightly more complex the identities need to be expanded anyway. Here we shall adopt the standard logical convention because our main interest is in program equivalence rather than program optimization. Notice that this logic has explicit substitution given by the syntax

[x → t]t which intuitively means that x should be substituted by t in t. Written as a “let expression” this is let t be x in t. Notice also that the only rule which uses explicit substitution is the cut rule. Thus, for example, I have written the contr(action) rule using a direct substitution: this maintains the correspondence between the use of cut and the occurrences of explicit substitution in the programs. The juxtaposition of [x → t]andt should be regarded as the expression of ordinary composition. Thus, in [x → t]f@s, as operators bind more tightly than composition, the application is calculated before the composition. Those used to functional programming and the λ-calculus may ﬁnd it curious that we distinguish between application for higher-order terms and application of functions. In this respect, as we want our terms to represent proofs in the logic, we are simply being blindly faithful to the logic which makes this distinction. 97 Cockett

Clearly the expression, [x → t]f@s, should be formally equivalent to the substitution t[(f@s)/x] and, in fact, it is. As we shall discover many of the cut-elimination steps which will force the presence of certain proof equivalences, are actually concerned with turning explicit substitution into ordinary substitution and in this sense are rather mundane. However, recall that explicit substitution does permit the “variable” which is to be substituted to be a pattern which gives the cut-elimination steps a little more content. Now it should be clear that the semantics of the proof theory of this fragment lies in cartesian closed categories which have coproducts. That this fragment satisfies cut-elimination is well-known as it is a variant of intuitionistic logic. Furthermore this fragment has its proof equivalence decidable. In the pure logic (that is with no atomic types and no non-logical axioms) this is actually almost immediate as the initial model is just finite sets. In the general case, where arbitrary types and axioms are permitted, the problem is much harder; but even in this general case the calculus is decidable (these issues are discussed in Ghani’s thesis [7] and in recent work of Altenkirch, Dybjer, Hoffman, and Scott [12]). Finally we come to the rules of the logic which are of primary interest to this discussion. They are the rules which determine the behavior of the inductive and coinductive datatypes. The rules for the term construction are given in figure 3. These need some explaining: we shall start with the cons(truction) and dest(ruction) rules as they are somewhat simpler. The construction rule allows one to build an inductive datatype from its component: thus, for example, a list can be built from an element and a list, via µ cons. or (primitively) as a nil list, via µ nil. The application of a µ rule turns a coproduct term into a list term. This syntax may seem a little peculiar as usually one does not distinguish between a coproduct term and an inductive datatype term. The way we have set up this logic requires that we make the distinction. We will often ignore this requirement in order to make the terms look more familiar. The destruction rule is dual: it allows us to break up a coinductive datatype into its constituents. Notice that destruction must be substituted in the term. It might tempting to use an explicit substitution here but this should be resisted for, as explained above, we are reserving explicit substitution to record the occurrence of the cut rule. The remaining two rules are the inductive and coinductive circular proof rules. The form of these rules needs some discussion as they are not perhaps as familiar: essentially they allow the recursive specification of functions but in a tightly controlled manner. Thus these rules are essentially Landin’s letrec construction, of course, we need two variants and some extra notation besides so that we can track the recursive arguments. Please note I am following the charity tradition of using curly parentheses to denote inductive items and round parentheses to denote coinductive items. We shall name the datatypes: for example the inductive natural number and

98 Cockett

(Zi := µx.Pi(x))i∈I | zi : Zi,y :Γ,f Y

Π x : P (Z ), ..., x : P (Z ),y :Γ, t : Y 1 1 1  n n n    : f = , x1 : µx.P1(x), ..., y :Γ (x1, ..., y):Y  →  (:= x1, ..., := xn,y) t

Γ , t : P (µx.P (x)) cons Γ , µµx.P (x)t : µx.P (x)

Z := νx.P(x) | Γ ,g Z

Π y :Γ, t : P (Z)   : g =  y :Γ, y : νx.P(x) y → t 

Γ1,z : P (νx.P(x)), Γ2 , t : Y dest Γ1,z : νx.P(x), Γ2 , t[ννx.P(x)z /z]:Y

Fig. 3. Circular inductive and coinductive proof rules list datatypes are deﬁned by:

Nat = µx.1zero + xsucc

List(A)=µx.1nil +(Afst × xsnd)cons. The append function in this system becomes:     :app =      x, y → nil() → y (x, y)  →   (x :=,y)   x  cons(v,vs) → µ cons(v, app(vs,y)) where notice I have simpliﬁed the product notation. Notice also that we indicate the regenerated variable with the := symbol. There is another way 99 Cockett     :rev 1(:=,y = nil) =  → → x  nil() y  x     cons(v,vs) → rev1(vs,µ cons(v,y ))

Fig. 4. Fast reverse we shall write this in order to reduce the nesting of parentheses:     : app(:=,y = y)=  x, y → nil() → y x     cons(v,vs) → µ cons(v, app(vs,y))

This brings the notation more closely into line with that used by functional languages and the notation used by charity. Notice how the recursive variable is put outside while the non recursive parameter is assigned where the function is introduced. Another example of a program which makes a non-trivial use of the the power of these circular definitions is the “fast” reverse function (compare this to the higher-order definition see figure 8. The circular (or recursive) part of this function actually reverses a first list onto a fixed second list: The slow version of the reverse function uses the append function inside:     :rev 2 := =  → → x  nil() µ nil()  x.     cons(v,vs) → app(rev2(vs),µcons(v, nil))

Notice, in particular, that, despite the fact that I have not indicated them, there are a number of cuts in these terms. For example a circular term like app can only have variables as arguments so these must be supplied by a cut. The second “naive” version of reverse, as we shall discover in section 4.3.2, is not deforested and can be automatically improved. In general, we shall not be punctilious about recording cuts as there is a significant notational overhead instead we will feel free to use the substituted (although often not derivable) term. However, we remark that the presence of a cut is often going to be an indication that something can be simplified. Notice also that we will feel free to use names of previously defined circular functions. The circular proof rule for inductive datatypes also allow for simultaneous recursion. Thus, one can have more than one inductive datatype being “un- raveled” at the same time. However, notice that one cannot do this for the coinductive datatypes because there is a fundamental asymmetry in the logic which allows lists of propositions (types) on the left but only one proposition 100 Cockett     : odd(:=) =      nil() → µ nil →   x     x.  nil() → µ nil)   cons(v,vs) → µ cons(v, vs)      cons( ,L) → odd (L))

Fig. 5. The odd function (type) on the right. The ability to do simultaneous recursion is of practical importance as it allows eﬃcient implementations of certain functions. Here is a well-known example: without simultaneous recursion or higher-order function terms it is impossible to provide the following eﬃcient implementation of monus:     :monus(:=, :=) =      (zero, ) → µ zero → x, y   (x, y).  (m, zero) → m      (succ(m), succ(n)) → µ succ monus(m, n)

Notice that the positions of the recursive arguments in the term logic are indicated by the := type binder so that, in a circular function f which is to be recursively applied, we know precisely which arguments are active. In the proof theory the circular function involves the introduced types Zi whose scope is delimited by the box. These type variables Zi are regenerations of the underlying recursive types: this information is indicated by the assignment Zi := µx.Pi(x). The regeneration relation is, of course, transitive. A type variable Zi which is a regeneration of another type Zi can be treated exactly as if it were the type Zi. Crucially the converse is not true, thus the circular function cannot be applied to an earlier generation of its type variable. There is another important way in which a regenerated type variable diﬀers from the datatype from which it originated: we do not allowed the construction (or destruction) rules to be applied to a regenerated variable. This is because such an application would reverse the regeneration process. However, we do allow a regenerated variable to have the circular proof rule of the datatype applied to it. It is thus possible to have multiple regenerations, that is for a regenerated variable to be itself regenerated. To illustrate the power of regeneration consider the problem of producing, from a list, the list of those elements which have odd indexes. Here is the program: I have used another convention: namely that a circular proof term which never uses its circular function can be represented by simply omitting mention of the introduced function altogether. This almost reduces the syntax to that of the case combinator and, of course, this near confusion of notation is quite 101 Cockett deliberate. The purpose of this example was to illustrate the use of multiple regeneration. While it is not hard to see that we could implement this without a multiple regeneration (see later in ﬁgure 9 ), I would claim that the resulting program is much less natural. Below we give the derivation of this term which shows the pattern of regeneration in the proof:

Z := List(A) | Z ,odd List(A)

Z := Z

Z , List(A) odd A, Z , A A, Z , List(A) A, Z , A × List(A) A, Z , 1+A × List(A) µ cons A, Z , List(A) weak A, A, Z , List(A) µ nil A, 1 , List(A) A, A × Z , List(A) A, 1+A × Z , List(A) A, Z , List(A) µ nil A × Z , List(A) 1 , List(A) 1+A × Z , List(A) List(A) , List(A)

Notice that first the regenerated type variable Z is treated as if it were List(A) by being the subject of a circular derivation. However, this inner circular derivation does not use it’s circular function. Instead the regenerated type variable Z is treated as if it were Z and has the function odd applied to it. Now much the same considerations apply to the coinductive datatypes. We illustrate the coinductive side of this language with a function to “accumulate” an infinite list. This function given an infinite list InfL(A) produces an infinite list InfL(List(A)) in which the inner lists give the list of elements seen so far. The infinite list datatype is defined by

InfL(A)=νx.Ahd × xtl.

Notice that this sort of inﬁnite list always has a “next element” by ﬁat. It cannot be empty or decide to stop! 102 Cockett

  → :acc :InfL(A), List(A) List(A)=     →   IL   hd : vs   (IL,nil). (L, vs) → tl : acc(tl L, app(vs, cons(hd L, nil))) 

Fig. 6. The accumulate function Here is the function accumulate:   → :acc :InfL(A), List(A) List(A)=     →   IL   hd : vs   (IL,µnil). (L, vs) → tl : acc(tl νL,µcons(hd νL,vs))

This is a simple illustration of how coinductive datatypes and inductive datatypes can be used in combination to produce programs. The derivation of this term is (approximately!) as follows:

Z := InfL(A) | InfL(A), List(A) acc Z

A, List(A) A × List(A) cons acc A, List(A) List(A) InfL(A), List(A) Z cut A, List(A), InfL(A) Z List(A) List(A) A × InfL(A), List(A) Z dest InfL(A), List(A) List(A) InfL(A), List(A) Z InfL(A), List(A) List(A) × Z nil List(A) InfL(A), List(A) InfL(List(A)) cut InfL(A) InfL(List(A))

Notice that this proof does contain a number of cuts which, in the term, I have suppressed. However, these cuts are all soft cuts so that actually we shall regard this term as being already deforested. However, the version of the accumulate function in ﬁgure 6 which collects the lists in the right order is not deforested we shall see how this may be deforested in section 4.3.3): This then concludes our introduction to the formal programming language that we shall use in the sequel. At this stage the reader should be convinced – and it is most certainly the case – that one can write nontrivial programs in this language which is a basic “strong functional programming language” in the sense of David Turner.

3 The transformation from initial and ﬁnal datatypes

These circular proof rules despite according with the recursive programming style do not seem at ﬁrst sight to correspond to the initial algebra and ﬁnal 103 Cockett coalgebra interpretation of datatypes. Recall that if F (A, X) is functor then µX.F (A, X) equipped with a map

cons : F (A, µX.F (A, X)) → µX.F (A, X) is an initial algebra for F (A, ) in case for every F (A, )-algebra

g : F (A, Z) → Z thereisauniquemaph : µX.F (A, X) → Z such that the following commutes:

F (A, µX.F (A, X)) cons /µX.F (A, X)

F (A,h) h / F (A, Z) g Z.

This is the universal property of the initial algebra. Dually the universal property of the final coalgebra is defined as follows: νX.F(A, X)equipped with a map dest : νX.F(A, X) → F (A, νX.F (A, X)) is a final algebra for F (A, ) in case for every F (A, )-coalgebra

g : Z → F (A, Z) thereisauniquemaph : Z → νX.F(A, X) such that the following commutes:

g Z /F (A, Z)

h F (A,h ) / νX.F(A, X) destF (A, νX.F (A, X)).

These conditions assert not only the existence of a comparison map but, importantly, the uniqueness that map. The uniqueness of the comparison map forces certain natural equalities on the programs of the system. This algebra based approach to datatypes can also be translated into annotated inference rules see ﬁgure 7. When one writes down these rules one quickly realizes that one needs the initial property “in context” as, to maintain the style of the logic, one must allow other propositions (types) beside the initial one on the left-hand side of the sequent. This does not to relate directly to the property outlined above for the inductive datatypes, however, in the cartesian closed setting at least, the above diagrams do suﬃce as one can shunt the context onto the right-hand side. These matters are described in great categorical detail in [4] and in Bart Jacobs’s book on categorical logic [8]. This is essentially the charity syntax as described in [5] and that paper provides several examples of programs. 104 Cockett

 Γ1,z : P (Y ), Γ2 , t : Y initial Γ1,v : µx.P (x), Γ2 , z → t v : Y

Γ , t : P (µx.P (x)) cons Γ , consµx.P (x) t : µx.P (x)

Γ,x: Z , t : P (Z) final Γ,v : Z , x → t v : νx.P(x)

Γ1,z : P (νx.P(x)), Γ2 , t : Y dest Γ1,z : νx.P(x), Γ2 , t[destνx.P(x) z /z]:Y

Fig. 7. The algebra based proof system

The main purpose of this section is to show that it is possible to translate between these two styles of proof. 5 We shall show this through the sequent proofs but it is important to realize that this works must work for the proof terms too and that this does in fact gives a method of translating between the two styles of programming. We shall start by showing that the algebra based proof system (which in the model checking community is sometimes referred to as the Kozen system [9]) can be simulated by the circular proof system. To this end we need ﬁrst to show how the initial algebra derivation can be obtained in the circular proof system. The required derivation has the following form:

Z := µx.P (x) | Γ1ZΓ2 ,f X

f Γ ,Z,Γ , X 1 2 functor Π assumption Γ1,P(Z), Γ2 , P (X) Γ1,P(X), Γ2 , X cut Γ1,P(Z), Γ2 , X

Γ1,µx.P(x), Γ2 , X

The inference labeled “functor” has to be inductively established for all the

5 Please note the ideas behind this translation are not original. In particular, I am very much in debt to Santocanale for introducing me to his proof [11] of this for the the finitely bicomplete poset case with initial and final fixed points:I am unashamedly borrowing his ideas. 105 Cockett possible functors P . It corresponds to a well-known categorical observation that all the functors involved are “strong” [4]. Next we need to show how the final coalgebra derivation can be simulated in the circular proof system. This is very similar to the above and again uses the strength of the functor P :

Z := νx.P(x) | Γ1,Y,Γ2 ,h Z

h Γ ,Y,Γ , Z Π assumption 1 2 functor Γ1,Y,Γ2 , P (Y ) Γ1,P(Y ), Γ2 , P (Z) cut Γ1,Y,Γ2 , P (Z)

Γ1,Y,Γ2 , νx.P(x)

This then shows that the circular proof system can simulate all the derivations of the algebra based proof system. We now have to show that we can simulate all the proofs of the circular proof system in the algebra based proof system. This is a little more diﬃcult as we also have to handle the possibility of multiple regenerations and simultaneous recursion. We shall therefore approach this in three stages. First we will show that when the recursion is sequential and there are no multiple regenerations that we can easily translate the proof into the algebra based proof system. Next we will show that a circular proof with simultaneous recursion can be reduced (within the circular proof system) to one with only sequential recursion. Finally, we show how a proof with multiple regenerations can be reduced to one which has only single regenerations. Consider a circular proof for a program from an inductive type which has no regenerations. Letting π stand for the derivation from the circular assumption this has the general form:

Z := µx.P (x) | Γ,Z , X

Γ,Z , X π(Z) Γ,P(Z) , X Γ,µx.P(x) , X

Now there are actually many proofs into which we could transform this but we actually have to be somewhat careful. For example, the following seems 106 Cockett     nil : () → λx.x → (x, y)   x)@y. cons : (a, f) → λx.f@(cons(a, x))

Fig. 8. Higher-order fast reverse like a reasonable translation: X , X weak Γ,X,Γ , X π(X) Γ,P(X) , X initial Γ,µx.P(x) , X where we heavily use the fact, which is not totally obvious, that the proof π(Z) is parametric in Z. The reason this works is because the only rule which can involve using the implicit type of Z is the circular induction rules for the type. However, a use of that rule would have meant that there was a multiple regeneration. Thus we can substitute an arbitrary X for Z in the proof π(Z) to obtain a proof involving X. However, consider what this transformation does to the heart of the fast reverse introduced in figure 4: it becomes     nil : () → y → (x, y)   x cons : ( ,vs) → vs which is equivalent to (x, y) → y. Not at all what we want! The problem with this approach to the translation is that the non-recursive arguments are altered prior to being used by the rev1 function. By weakening we, in effect, throw away these modifications. In order to obtain a correct translation it is necessary to make essential use of the higher-order aspects of the language. It is convenient to assume that Γ is a singleton: this we can do without loss because if it is not then we may always form the product of the propositions to obtain an equivalent singleton. Here then is the desired translation: Γ, Γ ⇒ X , X π(Γ ⇒ X) Γ,P(Γ ⇒ X) , X P (Γ ⇒ X) , Γ ⇒ X initial µx.P (x) , Γ ⇒ X Γ,µx.P(x) , X which has the following effect (see figure 8) on the fast reverse introduced in figure 4: Notice that this does have the correct effect! Similarly consider a circular proof for a program to a coinductive type 107 Cockett which has no regenerations (where we assume Γ is a singleton):

Z := νx.P(x) | Γ,Y , Z

Γ,Y , Z π(Z) Γ,Y , P (Z) Γ,Y , νx.P(x)

this transforms to:

Γ , Γ Y , Y weak Γ,Y , Γ × Y π(Y ) Γ,Y , P (Γ × Y ) final Γ,Y , νx.P(x) .

Notice that for the inductive transformations we do not need the any higher- order constructs. Next we will illustrate how to remove simultaneous recursion within the circular proof system. This transformation uses the higher-order aspects of the language non-trivially: the transformation is well-known and was pointed out to me by both Eric Meijer and Simon Thompson. I shall illustrate the technique by transforming a proof which has a simultaneous recursion on two arguments. Here is a typical such proof:

Z1 := µx.P1(x),Z2 := µx.P2(x) | Z1,Z2, Γ , C

Z ,Z , Γ , C 1 2 π P1(Z1),P2(Z2), Γ , C

µx.P1(x),µx.P2(x), Γ , C

This gets transformed to the following proof with this simultaneous recur- 108 Cockett sion removed:

Z2 := µx.P2(x) | Z2, Γ , µx.P1(x) ⇒ C

Z1 := µx.P1(x) | Z1,P2(Z2), Γ , C

Z2, Γ , Z1 ⇒ C Z ,Z , Γ , C 1 2 π P1(Z1),P2(Z2), Γ , C

µx.P1(x),P(Z2), Γ , C

P (Z2), Γ , µx.P1(x) ⇒ C

µx.P2(x), Γ , µx.P1(x) ⇒ C

µx.P1(x),µx.P2(x), Γ , C

Here notice that we use the fact that Z1 is a regenerated variable so may be treated as one of its earlier incarnations namely µx.P1(x)sothatwecan discharge the inner induction using the outer premise. Notice that in this proof the inner circular induction is essentially a case combinator as its circular function is never used. It remains to remove multiple regenerations. Our strategy is, as for simultaneous recursion, to show that multiple regenerations can be removed within the circular proof system itself. Suppose to start with that we have a multiple regeneration on an inductive type, we would then have the following form to the proof: Z := µx.P (x) | Γ,Z , Y

Z := Z | Γ,Z , Y 

Γ,Z , Y Γ,Z , Y π (Z) Γ,P(Z) , Y 2 Γ,Z , Y π (Z) Γ,Z , Y 1 Γ,µx.P(x) , Y where the proofs π1 and π2 could both have other regenerations of Z and (for π2, Z ). In order to simplify the transformed proof it is convenient to assume that ΓandΓ are singletons. Please note that the transformed proof has many implicit cuts (e.g. in π1, π2, π1 ,andπ2 ). Also please note that again we have 109 Cockett

    nil : () → (nil(),λa.[a])) → x   x cons : (v, (L, g)) → (g@v, (λa. cons(a, L))) ;(V, ) → V.

Fig. 9. Higher-order odd made essential use of the fact that the language is higher-order.

Z := µx.P (x) | Z , (Γ ⇒ Y ) × (Γ ⇒ Y )

, ⇒ × ⇒ Z , (Γ ⇒ Y ) × (Γ ⇒ Y ) Z , (Γ ⇒ Y ) × (Γ ⇒ Y ) Z (Γ Y ) (Γ Y ) π2 π2 π1 Z , Γ ⇒ Y Γ ,Z , Y Γ,Z , Y π1(Z) π2(Z) Γ,Z , Y Γ,P(Z) , Y P (Z) , Γ ⇒ Y P (Z) , Γ ⇒ Y P (Z) , (Γ ⇒ Y ) × (Γ ⇒ Y ) µx.P (x) , (Γ ⇒ Y ) × (Γ ⇒ Y ) π1 µx.P (x) , Γ ⇒ Y Γ,µx.P(x) , Y

We remark that the proofs π1 and π2 are not parametric in the type variable Z as it is possible that these proofs contain another regeneration on these variables. However, what is certainly true is that we may substitute (any regenerated variable of) the same inductive type into these proofs. To illustrate this transformation process consider the function odd in figure 5. It has multiple regenerations, thus it is reasonable to ask what the program looks like when it is transformed into the algebra based system. In figure 9 there is a version written in charity using a fold. Notice the rather unexpected second component of the state of the fold: it is a higher order function. Essentially this component lags one step behind the first component, so that it holds the even list waiting for the next element to be added. When the next element is added it becomes (by application to that element) the odd list again. Of course this may not be a very efficient way to compute the list of odd elements but this really is not the point! The question is rather whether this is an equivalent expression of the program. Clearly the direct approach suggested by the earlier code for odd will be more efficient. To complete the transformation we need to also show that multiple regenerations can be removed from coinductive circular proofs. It turns out that the proof of this is much easier: we do not even need to use the closedness of the setting. Here is the general form of a coinductive regeneration in which the proofs π1 and π2 can themselves contain regenerations: 110 Cockett

Z := νx.P(x) | Γ , Z

Z := Z | Γ , Z

, , Γ Z Γ Z π (Z) Γ , P (Z) 2 Γ , Z π (Z) Γ , P (Z) 1 Γ , νx.P(x) Again to simplify the transformation it is convenient to assume that both ΓandΓ are singletons. Under this assumption we can remove one level of regeneration by transforming this to the following:

Z := νx.P(x) | Γ+Γ , Z

Γ+Γ , Z Γ+Γ , Z Γ+Γ , Z , , , Γ Z π (Z) Γ Z Γ Z π (Z) Γ , P (Z) 1 Γ , P (Z) 2 Γ+Γ , P (Z) Γ+Γ , νx.P(x) Γ , νx.P(x) We have now almost established the following: Theorem 3.1 There is an isomorphism between the algebra based proof system and the circular proof based system for the program logic. The respect in which we have not established this theorem is actually rather important: it concerns proof equivalence. We would like it to be the case that if we translate back and forth that the resulting proof (which is diﬀerent if there is recursion) is equivalent to the starting proof. Furthermore, we would like that, for both the translations, that equivalent proofs are translated into equivalent proofs. Of course, all we have really established (despite suggesting otherwise) is that the two systems have the same “derivability” power. Categorically speaking we are trying to establish an isomorphism between two diﬀerent presentations of a category. So far we have shown that with respect to the posetal collapse the two systems are isomorphic. However, we really do want the stronger result and I claim it is true. For proofs with no regenerations establishing the equivalence of the double 111 Cockett translated proof is quite easy. Proofs with multiple regenerations or simultaneous recursion are translated internally to this special case, thus, so long as the internal translation is between equivalent proofs and the translations themselves maintain equivalence, the result will follow. Here we do need to know that equivalence is maintained by both translations and this is has lots of details. Furthermore, to establish this we really must begin to explore the notion of proof equivalence in the circular proof systems. The algebra based proof system derives its notion of proof equivalence directly from its underlying categorical semantics: this is discussed at great length in [5]. On the other hand, it is not so clear where the natural notion of proof equivalence comes from in the circular system. However, the answer to this question should become clearer from how the cut-elimination procedure works for that system.

4Program equivalence

A good rule of thumb when trying to understand proof equivalence in a logical system is to look at its cut-elimination procedure – if it has one. This will usually employ the more important identities in the system. In this section we consider what cut-elimination may mean for the circular proof system. There is an obvious answer which is rather unsatisfactory as it does involve infinite proof terms. However, it does suggest what the proof equivalences for the circular proof system should be. The fact that we produce infinite term from cut-elimination is not really acceptable. Thus, we turn to deforestation which is a process which aims at removing the cuts which remove destroy structure. We informally introduce a technique for deforesting. This differs from that described in [10] in how it handles coinductive circular definitions. We give several examples of deforestation. However, before, I do any of that I must first say an embarrassed word or two about the underlying proof system which corresponds to cartesian closed categories with coproducts.

4.1 Equivalence in intuitionistic logic Despite the fact that the logic of (∧, ∨, ⇒) has been heavily studied since the turn of the century the question of providing a decision procedure for proof equivalence is still a thorny issue. Any approach which uses rewriting is considerably complicated by the presence of “commuting conversions” or equations which makes providing a complete proof that the system works a daunting and highly technical task (see [7]). All the evidence, however, is that this fragment is highly decidable (see as well [12] for an alternate semantic approach). Unfortunately, there is still an outstanding problem which I must acknowledge: there is no feasible and fully general decision procedure in the 112 Cockett ! Γ , C weak Xi, Γ , C i∈I Γ , C , , i∈I Xi, Γ C = i∈I Xi, Γ C

{ → } i(xi) t i∈I s = t

Fig. 10. Idempotence " # " # {X ,Y , Γ , C} {X ,Y , Γ , C} i j j∈J i j i∈I , , Xi, j∈J Yj, Γ C i∈I Xi,Yj, Γ C i∈I i∈I , , i∈I Xi, j∈J Yj, Γ C = i∈I Xi, j∈J Yj, Γ C

{i(xi) → {j(yj) → ti,j} t2} t1 = {j(yj) → {i(xi) → ti,j} t1} t2

Fig. 11. Transposition literature for this fragment of my formal programming system. I am not going to try to correct this situation here as this part of the system is not of primary interest in this exposition. I will therefore restrict myself to a brief discussion of some of the proof equivalences which arise. For this fragment we have the usual cut-elimination steps which give the following (representative) rewrites. → { → } ⇒ → [z i(xi) ti i∈I z]j(s)= [xj tj]s [(i1 : x1, ...) → t](i1 : s1, ...)=⇒ [x1 → ....[xn → t]sn...]s1 → { → } ⇒{ → → } [z t] i(xi) ti i∈I y = i(xi) [z t]ti i∈I y [z → t](i1 : s1, ..., in : sn)=⇒ (i1 :[z → t]s1, ..., in :[z → t]sn) [f → s1[f@s2/x]]λy.t =⇒ [x → s1]([y → t]s2) [ → t1]t2 =⇒ t1 [x → x]t =⇒ t All except the first two rule we regard as “soft” cuts. Notice that they both remove structure. There are in this system a number of additional proof equalities which arise as natural proof identifications. There are two simple examples which are illustrated in figures 10 and 12. However, there are other more subtle identities which arises through con- traction, see for example figure 12. Notice that in this repetition identity a cut appears from nowhere. This is necessary if one wishes to push contractions up the proof tree. The cut reflects the fact that a choice in x has been resolved and is recorded by adding a cut. I worked on this fragment with Guangwu Xu (who was my student at that time) as I had a long standing conjecture that these (and some identities I have not mentioned) can be oriented to make a rewrite system modulo the two 113 Cockett

 π   , , j  π Xj Xi Γ, Xi,Xj Y , j , cut Γ, Xi,Xj Y  Γ,Xj,Xj Y  j∈I  contr  Γ,Xj , Y Γ, Xi, Xi , Y j∈I contr Γ, Xi , Y = Γ, Xi , Y 

i(xi) → ti(x) x = i(xi) → [x → ti(x)]i(xi) x

Fig. 12. Repetition equations above. We almost completed (and may yet!) a proof of this for the ﬁrst-order fragment. However, the details are quite daunting. Clearly to make signiﬁcant progress in program transformation ultimately these problems must be completely sorted out.

4.2 Circular proof equivalences The rules governing the datatypes in the algebra based system are determined by the underlying categorical initial algebra and final algebra semantics: they were described in [5]. Therefore we will now focus on the rules of the circular proof system. The first and most obvious cuts involving the datatypes are those which are concerned with the construction and destruction. Both these cuts remove structure and so must be eliminated in any deforesting transformation. They arewhatweshallcalldeforesting cuts. A term with residual deforesting cuts will, by definition, not be deforested. We will deal with the inductive case first:

Z := µx.P (x) Y | Γ,Z Y Z := µx.P (x) | Γ,Z Y Γ,Z Y π Γ,P(Z) Y π Γ,Z Y Γ P (µx.P (x)) π Γ,µx.P(x) Y cons Γ,P(Z) Y π π Γ µx.P (x) Γ,µx.P(x) Y Γ P (µx.P (x)) Γ,P(µx.P (x)) Y Γ, Γ Y ⇒ Γ, Γ Y in the term calculus this is the following rewrite: $ " # % $ $ " # %% : f = : f = v → (v,u) µ(s) ⇒ z → t u/y, /f s (z :=,y) → t (z :=,y) → t

It may seem that no real progress has been made, but recall the the proof π has been hauled out and this will (most likely) allow the cut to move up. Notice also how the circular proof term is substituted into its body. 114 Cockett

The cut elimination rule for the destruction of a coinductive type is similar:

Z := νx.P (x) | Γ Z Z := νx.P (x) | Γ Z Γ Z π Γ Z Γ P (Z) π π Γ P (Z) Γ ,P(νx.P (x)) C Γ νx.P (x) dest π π Γ νx.P (x) Γ ,νx.P(x) C Γ P (νx.P (x)) Γ,P(νx.P (x)) C cut cut Γ, Γ C ⇒ Γ, Γ C in the term calculus this is:     : f = : f =     [z → t[ν(z)/z ]]( u) ⇒ [z → t]s[u/y, /f] y → s y → s

These rules cause the circular proofs to unroll – or to “unfold” in the terminology of Burstall and Darlington. Now, in fact, if we are to remove any cut from the conclusion of a circular proof then it is necessary to unroll the recursion to expose its inner structure. To eﬀect this unrolling it is necessary to introduce a degenerate circular proof which does not use its circular function. We may then move the cut inside such a step. There are the three ways this can happen: two inductive cases and one coinductive. For the inductive cases the cut formula can be on the left or the right. The case where the cut formula actually is the inductive type itself was handled above. Here is the cut elimination step when the cut formula is on the left:

Z := µx.P (x) | C, Γ,Z , D

| , C, Γ,Z , D Z := µx.P (x) C, Γ,Z D π C, Γ,P(Z) , D C, Γ,Z , D C, Γ,µx.P(x) , D π π π C, Γ,P(Z) , D Γ , C C, Γ,P(µx.P (x)) , D π cut Γ , C C, Γ,µx.P(x) , D Γ , Γ,P(µx.P (x)) , D cut Γ, Γ,µx.P(x) , D ⇒ Γ, Γ,µx.P(x) , D where the open circular proof box indicates that the circular function is not used. Here is the corresponding term.

" # "     #   : f = : f = → ⇒ →  →  [x (v,x)]s y x t[x/y ,   /f] s v (y :=,y) → t (y :=,y) → t 115 Cockett

Here is the cut step when the cut formula is on the right.

Z := µx.P (x) | Γ,Z , C

| , Γ,Z , C Z := µx.P (x) Γ,Z C π Γ,P(Z) , C Γ,Z , C Γ,µx.P(x) , C π π π Γ,P(Z) , C Γ,P(µx.P (x)) , C Γ , C π cut Γ,µx.P(x) , C C, Γ , D Γ,P(µx.P (x)), Γ , D cut Γ,µx.P(x), Γ , D ⇒ Γ,µx.P(x), Γ , D

In the coinductive case the cut formula must be on the left:

Z := νx.P(x) | C, Γ , Z

| , C, Γ , Z Z := νx.P(x) C, Γ Z π C, Γ , P (Z) C, Γ , Z C, Γ , νx.P(x) π π π C, Γ , P (Z) Γ , C C, Γ , P (νx.P(x)) π cut Γ , C C, Γ , νx.P(x) Γ , Γ , P (νx.P(x)) cut Γ, Γ , νx.P(x) ⇒ Γ, Γ , νx.P(x)

Notice that as the cut of a proof π with an identity proof is always equivalent to the original proof, and as we may precipitate unrolling with such a cut, we must conclude that the unrolled proof (to any depth) is equivalent to the original proof. These are, in fact, the cut elimination steps. It is clear, however, that these steps do not terminate. In fact, in general they will precipitate an infinite unrolling of a circular proof. In fact, it is reasonable to view cut-elimination in this system as producing infinite terms. There is an obvious “lazy” way in which one can regard this: for any demanded finite depth one can cut-eliminate to that depth. This idea was actually used, for example, by Simon Marlow [10] as the basis for a practical program transformation system. His first step was to lazily unroll the term performing cut elimination from the root upwards. As he wanted to produce a finite program this was followed up by a second stage he called knot tying, which was essentially Burstall and Darlington’s notion of folding. In this stage he searched for recurrences within the tree and tried to tie the lazy infinite program back into a finite program. The difficulty with the idea of knot tying was to secure the termination of the search for knots. That is a guarantee that on every path in the term either will be finite or go round a knot. While some special conditions are known it seems that in general the general recursive case there can be no guarantee that one can actually completely tie the knot. 116 Cockett

Π Γ,µx.P(x) , Y π Γ,P(µx.P (x)) , Y inferΓ,µx.P(x) , Y Π= Γ,µx.P(x) , Y

Z := µx.P (x) | Γ,Z , Y

Γ,Z , Y π Γ,P(Z) , Y Π Γ,µx.P(x) , Y = Γ,µx.P(x) , Y

Fig. 13. Inductive knot t(u, w)= x → s[w/y, [(u, w) → t]/f] u

    : f = t(u, w)=  (u, w) (x :=,y) → s 

Fig. 14. Inductive knot equation

We may explain the idea behind knot tying proof theoretically: the behavior of a circular proof is determined by the code it produces from its innards when it is unrolled. Now if another proof can simulate the production of the same innards when we unroll it then the two proofs must be equal. If one is transforming one ties the knot by replacing the second proof by the first which is more canonical. This scheme for inductive proofs is shown in figure 13. Notice that, in order that the proof be applicable to a regenerated variable there can be no construction used on that type in π. The proof Π need not start with a circular step. The term is shown in figure 14. There is a similar inference for the coinductive datatypes whose equation is given in figure 15 These circular inferences are the source of equivalences in this proof system. Once one realizes that cutting with a constructor causes unrolling becomes clear that these are just a reformulation of the initial algebra properties. The significance of the transformation lies in the fact that it is much more convenient to work with this form. These remarks suggest in more detail why 3.1 may be true. 117 Cockett t(u, w)= (x, y) → s[[(u, w) → t]/f] (u, w)

  : f =  t(u, w)= (u, w) (x, y) → s 

Fig. 15. Coinductive knot equation

4.3 Deforestation

In this section I shall demonstrate some program equivalences using a technique to deforest the terms of this programming language. This technique of deforesting relies on determining where there is demand in the term. A term with no demand is deforested. There are two ways that demand can occur in a term. The first source of demand is expressed at an argument of a term. This can either be at the active arguments of an inductive circular definition (including a coproduct) or at the argument of a destructor (including projection). This is one reason why we carefully labeled the active (often recursive) arguments of the inductive circular definitions using :=. However, these two expressions of demand only become real demand when they can potentially be put together with a supplier. For an active argument of an inductive circular definition a supplier is a constructor or something which could potentially produce a constructor. For a destructor a supplier is a coinductive circular definition or someone who can supply such a definition. This demand will cause us do one of three things: to perform a deforesting cut if the constructor (resp. destructor) is present, to unroll the circular definition on which the demand has been placed, or, if demand is placed internally on an argument of a circular definition we will “pass the demand up”: this is illustrated below (see section 4.3.3). It is important to realize that as we are dealing with finite terms whether there is demand or not can be easily detected. Below, through examples, I am describing an algorithm which can be implemented. If this algorithm terminates it will return a deforested term. Of course, the problem concerns the termination (see also [10]); I conjecture that this procedure does terminate and so will always return a deforested (finite) term, however, at the time of writing I do not have a proof.

4.3.1 Associativity of append We start with a classic example to illustrate the strategy of deforesting and the way it manages to discover useful identities. Recall the deﬁnition of append: 118 Cockett

    : app(:=,y = y)=  x, y → nil() → y x     cons(v,vs) → µ cons(v, app(vs,y)) Notice in this definition there is no demand: or more specifically the only demand is on a free variable which we cannot satisfy until it is instantiated in some way. This means that app is already deforested. We shall indicate the demand we are working on by a dot. This will usually be the demand nearest the root. However, the knot tying process may actually oblige us to calculate certain demands in order to secure the match it wants. The first step below responds to the demand by unrolling the inner append: · app( app(x, y),z)    nil() → y · = app(   x, z) cons(v,vs) → µ cons(v, app(vs,y))    →  nil() app(y, z) =   x cons(v,vs) → app(·µ cons(v, app(vs,y)),z)    →  nil() app(y, z) =   x cons(v,vs) → µ cons(v, app(· app(vs,y),z)) There is now an opportunity to tie a knot: the term app(app(x, y),z) occurs with vs substituted for the variable x. It is important to note that vs is a “regenerate” variable (that is of regenerated type) as this tells us where the recursion lies. The trick now is to tie a tight knot! The only structure which is of relevance lies between the occurrence where the recurrence template stands and the actual recurrence. All other structure are simply treated as new arguments. Thus we may infer:

    : F (:=,u= app(y, z)) =  app(app(x, y),z)= nil → u x     cons(w, ws) → µ cons(w, F(ws, u)) 

Clearly F = app and so we have just deforested

app(app(x, y),z) ⇒ app(x, app(y, z)).

In this example there is also a slight efficiency improvement: the first argument is traversed twice in the first expression but only once in the second. 119 Cockett

4.3.2 Improving naive reverse Deforestation can make arbitrary large efficiency improvements. Here is a classic example of an O(n2) program which can be improved to an O(n) program. This transformation is interesting as it derives precisely what one might hope would be derived. Consider the following definition of reverse:     :rev 2(:=) =  → x  nil() µ nil()  x     cons(v,vs) → app(rev2(vs),µcons(v,µ nil)) This definition already has internal demand: the app function creates demand on rev2(vs) which causes us to unroll. We let z =cons(v, nil)) in this calculation and we will use our previous calculation in the last step (which, of course, would be recalculated automatically by a system) with a deforesting constructor cut with append: · app( rev2(vs),z)    nil() → µ nil() · = app(   vs,z) cons(v,vs) → app(rev (vs),µcons(v,µnil))  2   → ·  nil() app( µ nil(),z) =   vs cons(v,vs) → app(· app(rev (vs),µcons(v,µnil)),z)  2   →  nil() z = vs   cons(v ,vs) → app(· rev2(vs ),µcons(v ,z)) We can now tie the knot to obtain:     :rev 1(:=,z = z)=  → app(rev2(vs),z)= nil() z  vs.     cons(v ,vs) → rev1(vs ,µcons(v ,z))

This has no residual demand and so is deforested. This is the fast deﬁni- tion of reverse which accumulates the reversed list on its second argument. Substituting this back into the original deﬁnition gives:    →  nil() nil() x → x   cons(v,vs) → rev1(vs, cons(v, nil)) which has no residual demand and so is a deforested version of our original algorithm. 120 Cockett

4.3.3 Improving the accumulate on an inﬁnite list We use the example of collecting the list of values so far in an inﬁnite list:

  :acc( L ,xs)=     L →   L   hd : x   (L ,xs) → tl : acc(tl νL, app(·xs, µ cons(hd νL,µnil))) 

This function has internal demand: the demand is generated at the first argument of the app function — the destructors do not generate demand as there is none passed down through the arguments of the inductive circular definition. To remove this internal demand we must, this time, pass up the demand: this involves creating another argument through which we express that demand (in as tight a fashion as possible). This demand then becomes a new parameter of the coinductive circular definition. Now when we create this extra argument, rather like a time anomaly, we must reflect the change in the future circular call. In this case we abstract the append on its non-recursive argument and This gives the following equalities:

acc(L, xs) =acc (L, xs, λz. app(xs, z))   hd : xs       tl : acc (tl νL  =    ·   , app( xs, µ cons(hd νL ,µnil)))  ,λz.app(app(·xs, µ cons(hd νL,µnil))),z))    hd : xs       tl : acc (tl L  =    ·   , app( xs, cons(hd L , nil)))  ,λz.app(xs, cons(hd L,z))))    hd : xs       tl : acc (tl νL  =    ·   , app( xs, µ cons(hd νL ,µnil)))  ,λz.(λz. app(xs, z))@(µ cons(hd νL,z)))

We can now tie the knot to obtain: 121 Cockett

acc(L, xs, F )  : G =          hd : y           =   tl : G (tl νL   (L, xs, F ).  (L, y, F ) →          , app(y, µ cons(hd νL ,µnil))   ,λv.F@(µ cons(hd νL,v)))

This is now deforested as all the demand has gone. However, we may make a further simpliﬁcation: we can note that second argument can be obtained from the ﬁrst at every call by xs = F @nil. This means we do not need the middle argument at all it can be replaced by the calculation. While this is a nice observation it is not necessary in order to complete the deforestation and so we shall not use it here. We now have:

acc(L, µ nil) = acc(L, µ nil,λz.app(µ nil,z)) = acc(L, µ nil,λz.z) which expresses the original function in deforested form.

4.3.4 Getting values from infinite structures One way to think of a computation, such as calculating the nth even number, is to create an infinite table containing the values and then use this to look up the value. Clearly this is horrendously inefficient, but, one might ask: if we allow for deforestation can this be transformed to an efficient program? This example of a program transformation illustrates the effect of having a coinductive type “inside” an inductive type. In particular it shows how demand can be generated through a containing inductive circular definition. We start by considering the program which gets the nth tail of an infinite list.     : gtt(:=,L = L)=  (L, n) → zero → L n     succ(n) → tl ν gtt(n,L)

This is deforested as the demand generated by tl is never supplied. However, if we substitute a coinductive circular deﬁnition in for L this will create a demand on inner occurrence of gtt. We shall resolve this as follows without reference to what caused the demand: 122 Cockett

· tl ν gtt(n, L)    zero → L · =tlν   n succ(n) → tl ν · gtt(n,L)    → ·  zero tl ν L =   n succ(n) → tl ν tl ν · gtt(n,L)

This allows us to tie the knot (tight):     : H(:=,L =tlL)=  tl · gtt(n, L)= zero → L n     succ(n) → tl νH(n,L) 

Now it is clear that H is gtt so we can replace tl gtt(n,L) by gtt(n, tl L)to get a new version of gtt (I shall not change the name):     : gtt(:=,L = L)=  (L, n) → zero → L n     succ(n) → gtt(n, tl νL)

Now let us consider the program which gets the nth even number. ﬁrst we introduce the notion of the inﬁnite list of every other number starting at m:   :evn=     m →   m.   hd : m   m → tl : µ succ µ succ m

Now we get the nth element of the inﬁnite list of evn starting at zero by applying hd to the the process of getting the nth tail. This means we want to deforest the following term:

n → hd ν · gtt(n, evn(µ zero))

Where this has a demand expressed at the argument of hd and supplied at the second argument of gtt. We shall resolve this demand as follows: 123 Cockett

· hd gtt(n, L)     zero → L · =hdν   n succ(n) → gtt(n, tl νL)    → ·  zero hd ν L =   n succ(n) → hd ν · gtt(n, tl νL) This allows us to tie another little knot:     :gtt (:=,L = L)=  hd · gtt(n, L)= zero → hd L n     succ(n) → gtt(n, tl L)

Now consider the original problem again, or rather a subproblem thereof: gtt(n, evn(m))   → ·  zero hd evn(m) =   n succ(n) → gtt(n, tl · evn(m))    →  zero m =   n succ(n) → gtt(n, evn(succ succ m)) Notice that abstracting the zero out of the calculation helps us to spot the knot. We may now tie this knot to obtain:     : E(:=,m = m)=  gtt (n, evn(m)) = zero → m n.     succ(n) → E(n,µsucc µ succ m)

This is now deforested. Now it is worth remarking that E(n, µ zero) may not have been quite the expected form for calculating even numbers. We may have expected:     : E (:=,m)=  hd · gtt(n, evn(m)) = zero → m n.     succ(n) → µ succ µ succ E(n,m)

Both of these functions are deforested and, furthermore, they are equal. We shall take this as a small reminder that deforesting does not solve the equivalence problem! 124 Cockett

4.4 Remarks on program equivalence As we have seen deforestation does not provide the solution to the program equivalence problem. However, it is very clear that we should in determining equivalence use the fact that one can deforest. If one does this then one of the remaining problems is to establish that the two programs can be expressed in such a manner as to have the same recursion pattern. Now, in principle, this is easy to achieve. The idea is this one unrolls the two programs in parallel. The demand to unroll arrives now from two sources: the first is because there are still deforesting cuts as usual but the second is because one wants to keep the programs unrolling in parallel. Thus, if there is demand on the one program this must be transmitted to the second. Finally knot tying must be done in parallel. There is an advantage to this process as unrolling is equivalent to looking at distinguishability. Thus, if the two programs unroll in different ways one can actually reconstruct a pattern of destruction and application to values which will distinguish the programs. Thus, in principle, one can either prove the programs equivalent or provide a counter-example – or never terminate, of course. This seems to me to be a standard tool which in the future will be provided with any programming language which is worth considering! There are a few outstanding problems with this idea. For example, this presumes that one has tamed the program equivalence problem for the proof theory of intuitionistic logic! It also presumes that such a proof technique would be complete (this I conjecture is the case). There is still therefore considerable work that is required before we can provide the sort of program verificationtoolwhichIfeelshouldbestandard.However,Idofeelthatwe now have most of the technology in hand to provide such a tool.

5 Conclusions

I hope that these discussions have underlined that arriving at an appropriate semantic and proof theoretic formulation for inductive and coinductive datatypes is an extremely important for the field of program transformation and optimization. Furthermore, that a satisfactory semantic formulation (as given by mathematical induction or categorical initial and final datatypes) does not necessarily translate into a good manipulative system. In the case of inductive and coinductive datatypes, I would argue, the circular proof system provides a crucial insight and link between formal settings and the various techniques which have proved to be most useful in practice. The questions as to the termination and completeness of the proposed techniques in this paper (let alone their feasibility) are all left open. However, at this stage, I find it hard to believe that they do not have positive answers! Such answers would be further vindication of the importance of these programming systems ... and would help to make tools, which are at the moment 125 Cockett still the stuff of dreams, standard.

References

[1] Aldwinckle, J., On the proof theory of model checking (to appear). [2] Burstall, R. and J. Darlington, A transformation system for developing recursive programs, Journal of the Association for Computing Machinery 24(1) (1997), pp. 44–67. [3] Cockett, R. and S. Lack, Restriction categories II: Partial map classiﬁcation, Theoretical Computer Science (to appear). [4] Cockett, R. and D. Spencer, Strong categorical datatypes I,in:R.A.G. Seely, editor, International Meeting on Category Theory 1991, Canadian Mathematical Society Proceedings (1992), pp. 141 – 169. [5] Cockett, R. and D. Spencer, Strong categorical datatypes II: A term logic for categorical programming, Theoretical Computer Science 139 (1995), pp. 69–113. [6] Fuhrmann, A. B. C. and A. Simpson, Equational lifting monads, Theoretical Computer Science (to appear). [7] Ghani, N., “Adjoint Rewriting,” Ph.D. thesis, University of Edinburgh (1995). [8] Jacobs, B., “Categorical Logic and Type Theory,” Elsevier, Amsterdam, 1999. [9] Kozen, D., Results on the propositional mu-calculus, Theoretical Computer Science 27 (1983), pp. 113–118. [10] Marlow, S., “Deforestation for Higher-Order Functional Programs,” Ph.D. thesis, University of Glasgow (1996). [11] Santocanale, L., “Sur les µ-treillis libre,” Ph.D. thesis, Univerite de Quebec a Montreal (1999). [12] Scott, T. A. P. D. M. H. P., Normalization by evaluation for typed lambda calculus with coproducts (2001), extended abstract. URL http://www.cs.nott.ac.uk/~txa/drafts/coprod.ps [13] Slind, K., “Reasoning about Terminating Functional Programs,” Ph.D. thesis, TU Munich (1999). URL http://www.cl.cam.ac.uk/users/kxs [14] Szabo, M., editor, “The Collected Papers of Gerhard Gentzen,” Studies in Logic and the Foundations of Mathematics, North-Holland, Amsterdam, 1969. [15] Telford, A. J. and D. A. Turner, Ensuring termination in esfp, in: 15th British Colloquium in Theoretical Computer Science, 1999. [16] Turner, D., Elementary strong functional programming, in:R.Plasmeijer and P.Hartel, editors, First International Symposium on Functional Programming Languages in Education, number 1022 in LNCS (1996). 126 Cockett

[17] Wadler, P., Deforestation: transforming programs to eliminate trees, Theoretical Computer Science 73 (1990), pp. 231–248, (Special issue of selected papers from 2’nd ESOP.).

127 CMCS’01 Preliminary Version

Algebras,Coalgebras,Monads and Comonads

Neil Ghani a,1 Christoph L¨uth b,2 Federico de Marchi a,3,5 John Power c,4

a Department of Mathematics and Computer Science, University of Leicester b FB 3 — Mathematik und Informatik, Universit¨at Bremen c Laboratory for Foundations of Computer Science, University of Edinburgh

Abstract Whilst the relationship between initial algebras and monads is well-understood, the relationship between final coalgebras and comonads is less well explored. This paper shows that the problem is more subtle and that final coalgebras can just as easily form monads as comonads and dually, that initial algebras form both monads and comonads. In developing these theories we strive to provide them with an associated notion of syntax. In the case of initial algebras and monads this corresponds to the standard notion of algebraic theories consisting of signatures and equations:models of such algebraic theories are precisely the algebras of the representing monad. We attempt to emulate this result for the coalgebraic case by defining a notion cosignature and coequation and then proving the models of this syntax are precisely the coalgebras of the representing comonad.

1 Introduction

While the theory of coalgebras for an endofunctor is well-developed, less attention has been given to comonads. We feel this is a pity since the corresponding theory of monads on Set explains the key concepts of universal algebra such as signature, variables, terms, substitution, equations etc. Moreover, applications to base categories other than Set has proven fruitful in many situations, e.g. the study of multi-sorted algebraic theories as monads over SetA,order- sorted theories as monads over Pos, the study of categories with structure

1 Email: [email protected] 2 Email: [email protected] 3 Email: [email protected] 4 Email: [email protected] 5 Research supported by EPSRC grant GRM96230/01: Categorical Rewriting: Monads and Modularity This is a preliminary version. The final version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Ghani, Luth,¨ de Marchi, Power using monads over Graph or Cat [8], the study of rewriting using monads over Pre or Cat [16,17] and the study of higher order abstract syntax using monads over SetF [9] (where F is the skeleton of the category of finite sets). We aim for a similar framework, based upon the theory of comonads, explaining the relationship between the coalgebraic duals of the above concepts, and laying the ground for applications in categories apart from Set.Thispaperis a first step in this direction. Traditionally, one begins with a signature Σ and then for each set of variables X, defines the term algebra TΣ(X). Next one defines substitution and shows this associative with the variables acting as left and right units. Finally, given equations E, one also defines the quotient algebra TΣ,E(X). Generalising this treatment of universal algebra to cover not just sets with extra structure but also algebraic structure over other mathematical objects can be achieved using a categorical reformulation of these ideas. In this categorical framework [14], one constructs from a signature Σ, an endofunctor FΣ (which can be thought of as sending an object X of variables to the terms of depth 1 with variables from X). The term algebra TΣ can then be characterised as the the free monad over FΣ. Being a monad gives TΣ a well behaved notion of substitution, while being free captures the inductive nature of the term algebra. By interpreting the equations of an algebraic theory as pairs of monad morphisms between two free monads, the quotient algebra induced by the equations is modelled as their coequaliser. Crucially, the models of an algebraic theory are isomorphic to the Eilenberg-Moore category of algebras of the monad representing the algebraic theory. Our aim is to dualise this categorical treatment of universal algebra and in particular derive coalgebraic duals of the key concepts described above. We therefore need to dualise firstly the construction of an endofunctor over a signature and secondly the free monad over an endofunctor. It turns out there is more than one possible dualisation and this is the subject matter of this paper. For example, given a finitary endofunctor F : A→A,onecan consider the mapping sending an object X to the underlying object of the initial X + F -algebra, the mapping sending X to underlying object of the final X + F -coalgebra, the mapping sending X to underlying object of the initial X × F -algebra and the mapping sending X to underlying object of the final X × F -coalgebra. Each of these four mappings form either a monad or a comonad as described in Table 1. Thus, there is perhaps not one dualisation at play, but two possible, and orthogonal, dualisations giving four concepts worthy of our attention. We do not claim that this paper provides complete answers to the questions raised above. However we do feel that the questions we address will be of fundamental importance to the CMCS audience and that our approach is novel in being more abstract than most. In particular, we hope that our notions of cosignature, coequations and the categorical representation of this syntax via comonads will be of general interest and stimulate further research.

129 Ghani, Luth,¨ de Marchi, Power

Monads Comonads

Initial Algebras µY. X + FY µY. X × FY

Final Coalgebras νY.X + FY νY.X × FY

Table 1 Algebras and Coalgebras forming Monads and Comonads

The rest of this paper is structured as follows: Sect. 2 contains preliminary definitions. In Sect. 3, we briefly review the presentation of finitary monads which we seek to dualise in the rest of the paper (the upper left entry in Table 1). Sections 4 and 5 explore the different possibilities of dualisation, with Section 4 exploring the lower left of Table 1 and Section 5 containing our proposed coalgebraic syntax with associated representation and correctness results (the right side of Table 1).

2 Preliminary Deﬁnitions and Notation

We assume the reader is familiar with standard category theory as can be found in [18]. In order to abstract away from the category Set we have had to employ certain constructions which some readers may not be completely familiar with. We give a short definition of these here; more details can be found in [18,3]. Locally Presentable and Accessible Categories: Let κ be a regular cardinal. A diagram D is κ-filtered iff every subcategory with less than κ objects and morphisms has a compatible cocone in D. An object X of a category A is said to be κ-presentable iff the hom-functor A(X, ) preserves κ-filtered colimits. A category is locally κ-presentable (abbreviated as lκp) if it is cocomplete and has, up to isomorphism, a set Nκ of κ-presentable objects such that every object is a κ-filtered colimit of objects from Nκ. A category is κ-accessible providing it is lκpexceptthatitonlyhasκ-filtered colimits rather than all colimits. A functor is κ-accessible iff it preserves κ-filtered colimits; we also say it has rank κ. The discrete category on Nκ is denoted Nκ.Thefull subcategory of κ-presentable objects is denoted Aκ. The inclusion functors are denoted Jκ : Nκ →Aκ and Iκ : Aκ →A.WhenA = Set, the set Nℵ0 can be taken to be the natural numbers which we denote N. Kan Extensions: Given a functor I : A→Band a category C, precomposition with F defines a functor ◦ I :[B, C] → [A, C]. The problem of left and right Kan extensions is the problem of finding left and right adjoints to ◦ I. More concretely, given a functor F : A→C, the left and right Kan 130 Ghani, Luth,¨ de Marchi, Power extensions satisfy the natural isomorphisms

∼ ∼ [B, C](LanI F, H) = [A, C](F, H ◦ I)[B, C](H, RanI F ) = [A, C](H ◦ I,F).

Kan extensions can be given pointwise using using colimits and limits, or more elegantly using ends and coends (see [18, Chapter X] for details).

3 Initial Algebras and Monads

We begin by recalling the well known equivalence between (finitary) monads on the category Set and universal algebra and the generalisation of this equivalence to locally presentable categories [14,21]. We illustrate these ideas with a specific example using Set as the base category — examples over other base categories can be found in [21,8]. A signature declares the operations from which the terms are constructed. Usually, a signature is given as a set of operations and a function assigning each operation an arity, but we can equivalently consider it as a function Σ:N → Set, assigning to each arity the set of operations of that arity. In order to abstract away from the category of Set, we need a notion of arity appropriate for different categories. The key observation is that the usual arities in Set, i.e. the natural numbers N, represent the finitely presentable objects in Set. Hence a natural notion for arities in a lκp-category is the set Nκ. Hence, a signature is a function Σ : Nκ →|A|, which is equivalent to a functor Σ : Nκ →A: Definition 3.1 (Signature) Let A be a lκp category. A signature is a func- def tor Σ:Nκ →A, and the object of e-ary operations of Σ is Σe =Σ(e). We now present two examples of such signatures, which we will later aug- ment by the appropriate equations.

Example 3.2 (Monoids) The signature ΣM : N → Set for the theory of monoidsisdeﬁnedasΣM (0) = {e}, ΣM (2) = {m}, ΣM (n)=∅ for all other n ∈ N.ThusΣM declares one operation e of arity 0 (a constant) and one binary operation m. Example 3.3 (Categories with 1) Categories with a terminal object can be seen as algebraic over Cat. The signature Σ must declare the terminal object and for each object X, the unique map !X : X →1. Since the terminal object does not depend upon any data, its arity is the empty category 0.Since the map !X depends upon an object, its arity is the one object category 1. Thus Σ(0)=1, Σ(1)=(◦→◦) (i.e. the category with two objects and one arrow) while Σ(c)=0 for all other ﬁnitely presentable categories c. We now turn to the construction of an endofunctor over a signature Σ. 131 Ghani, Luth,¨ de Marchi, Power

Working over Set, e.g. see [23], one defines a functor n FΣ(X)= X f∈Σ(n) which calculates the terms of depth 1 whose variables come from the set X. For example, given the signature of example 3.2, the associated endofunctor is 2 FM (X)=1+X . Such polynomial endofunctors (i.e. built from products and coproducts) are known to preserve all κ-filtered colimits — this is an important property which later will ensure the free monad can be easily calculated. Unfortunately, it is unclear how to generalise it to categories other than Set and hence we take an alternative approach. Recall that by definition, left Kan 2 ◦ N A → A A extension is left adjoint to restriction LanJκ Jκ :[ κ, ] [ κ, ]and hence we define

FΣ(X)=(LanJκ Σ)X (1) The standard formula for left Kan extensions shows that, in the case A = Set, both deﬁnitions of FΣ agree, as in Example 3.2: × (LanJℵ0 ΣM )X = Setℵ0(n, X) ΣM (n) n∈N = Set(0,X) × 1+Set(2,X) × 1 =1+X2

Notice that the domain of FΣ is Aκ and not A, and hence FΣ is not an endofunctor. However, an endofunctor F : A→Ais κ-accessibleiﬀitis (isomorphic to) the left Kan extension of its restriction to Aκ; hence there is an equivalence (2) between κ-accessible endofunctors and functors Aκ →A and we can regard FΣ as a κ-accessible endofunctor.

◦ Iκ ✲ [A, A]κ ✛ 1 [Aκ, A](2)

LanIκ

To recap, we have presented an abstract definition of signature Σ over an lκp-category and a construction of the associated endofunctor FΣ over the signature. In the case of Set, these definitions and constructions agree with the usual definitions. The associated term algebra TΣ : A→Ais then the free monad over the endofunctor FΣ, constructed in a number of ways: Lemma 3.4 Let F be a κ-accessible endofunctor over a lκp-category A.Then the free monad T over F satisfies (i) For every X in A, TX is the carrier of the initial X + F -algebra. (ii) The forgetful functor U : F−Alg →Afrom the category of F -algebras to ∼ A has a left adjoint L,andT = UL 132 Ghani, Luth,¨ de Marchi, Power

(iii) T is colimit colimi<κSi of the sequence Si where

S0 = JSn+1 = J + F Sn Sλ = colim Si (3) i<λ

def and the composition of F, E : Af →Ais given as F E =(LanI F .E).

Proof. Most of the proofs are standard and can be found in [14,12]. However, note that in the last of the claims, to calculate the free monad we start forming the sequence of endofunctors Si and we do not need to go further than the κ-ﬁltered colimit colimi<κSi because F is κ-ﬁltered. Hence T is κ-accessible and thus shares the same rank as F . ✷

When we come to the dual of this construction in Sect. 5, a construction like the above will not be possible since one would be interested in the limit of co-chain, and there is no reason for an accessible endofunctor to preserve such limits. The eﬀect is that the rank of the cofree comonad over an accessible endofunctor may increase. This change of rank underlies the technical diﬃculties which will arise in Sect. 5. We obtain the two adjunctions in (4) which compose to give an adjunction F −−| U between signatures and monads.

LanJκ ✲ H ✲ [N , A] ✛ ⊥ [Aκ, A] ✛ ⊥ Monκ(A)(4) ◦ Jκ V

The sequence Si in (3) in Lemma 3.4 is called the free algebra sequence [12] and can be seen as a uniform calculation of the initial X + F algebra. As an example of this construction, consider the signature ΣM . Aswehave 2 seen LanJκ ΣM (X)=1+X . The free algebra sequence then specialises to S0X = X and 2 Sn+1(X)=X + A(e, Sn(X)) × ΣM (e)=X +1+Sn(X) e∈N from which we can see Sn as deﬁning the terms of depth at most n,e.g. S0(X) contains the variables X,andS1(X) contains the variables X, the canonical element of 1 representing the unit of the monoid and a pair of elements of S0(X) which can be thought of as the multiplication of these elements. Within this framework, to give equations is to give another signature E and two natural transformations σ, τ : E → UFB. One should regard E as giving the shape of the equations and σ and τ as the actual equations. Example 3.5 Given the monoids example, there are three equations we wish to assert: left unit, right unit (both unary) and associativity (ternary). Hence we set E(3) = {a},E(1) = {l, r} and E(n)=∅ for all other n. 133 Ghani, Luth,¨ de Marchi, Power

σ and τ deﬁne the left, and the right hand side of equations, respectively:

σ(l)=m(e, x) σ(r)=m(x, e) σ(a)=m(x, m(y, z)) τ(l)=xτ(r)=xτ(a)=m(m(x, y),z)

For example 3.3, one equation is needed to force the uniqueness of !X : X →1, i.e. !Y .f =!X for all f : X → Y (see [8] for details). The advantage of this deﬁnition of equations is that it allows an elegant derivation of the representing monad for an algebraic theory: given such a theory σ, τ : E → UFB, under adjunction (4) we have two monad morphisms σ,τ whose coequaliser is the representing monad T . This construction makes sense, because the category of models of the algebraic theory is isomorphic to the Eilenberg-Moore-category of the representing monad. In (4), both adjunctions are monadic and their composition is of descent type, which means that each component of the counit εB : FUB ⇒ B is a coequaliser. This means that every κ-accessible monad T =(T,η,µ)onA is a coequaliser of two free monads over two signatures B,E : N→Ain the category of κ-accessible monads over A, or every monad is represented by equations in this general sense.

4Final Coalgebras and Monads

Given a signature Σ we have constructed an associated endofunctor FΣ,which can be thought of as calculating the closed terms of depth 1. By closing FΣ under composition, or more formally by considering the free monad over FΣ, we get the term algebra TΣ. Concretely TΣX is the carrier of the initial X+FΣ- ∞ algebra. In this section we investigate the map TΣ sending an object X to the carrier of the final X + FΣ-coalgebra, and apply our results to simplify some recent developments is the semantics of lazy datatypes in functional programming languages. A more general version of Lemma 4.2 appears in [1] and is implicit in [20]. In comparing µY.X + FY with νY.X + FY, the first observation [5] is that the final coalgebra of a Set-endofunctor can be regarded as a Cauchy completion of the initial algebra. Thus, if F arises from a signature and hence µY.X + FY is the usual term algebra, then νY.X + FY is the set of terms of finite and infinite depth. This result was extended to lfp categories in [2]. The goal of this section is to ascertain the structure possessed by the map sending X to the carrier of the final X + FΣ-coalgebra. Our answer is that this map also forms a monad. In proving this result we use the following result [2]: Lemma 4.1 Assume that A is lfp, that the unique map !:0→ 1 is a strong monomorphism, that F preserves monos and ωop-chains and that there is a map p :1→ F 0. Then for each object X ∈A, A(X, µF) and A(X, νF) are metric spaces with the latter being the Cauchy completion of the former. 134 Ghani, Luth,¨ de Marchi, Power

We use this result to prove our main lemma: Lemma 4.2 Let F be a polynomial endofunctor and A satisfy the premises of lemma 4.1. Then the map T ∞ assigning an object X to the carrier of the ﬁnal X + F -coalgebra carries the structure of a monad. Proof. We actually prove the lemma for the case A = Set since the general case involves slightly clumsy reasoning with hom-sets but is entirely analogous. We prove that T ∞ is a monad by constructing a Kleisli structure for it, namely ∞ → by deﬁning the following operations: i) for each object X, an arrow ηX : X ∞ ∞ T X; ii) for each pair of objects X, Y a function sX,Y : A(X, T Y ) → A(T ∞X, T ∞Y ) such that

∞ ∞ ∞ ◦ ◦ ◦ sX,X(ηX )=1T X sX,Z(sY,Zg f)=sY,Zg sX,Y fsX,Y (f) ηX = f ∞ ◦ Firstly define ηX = PX ηX where η is the unit of the free monad on F and PX is the unique comparison map between the initial X + F -algebra and the 6 final X + F -coalgebra. Next define sX,Y (f) as follows

η P X X ✲ TX X ✲ T ∞X

∗ ![f,h] ] f [f,h ✲ ❄ ✛ )=! ∞ s(f T Y where h : FT∞Y → T ∞Y is the obvious map derived from the structure map of T ∞Y .Then[f,h]:X + FT∞Y → T ∞Y endows T ∞Y with an ∞ X + F -algebra structure and hence induces a unique map ![f,h] : TX → T Y from the initial X + F -algebra. Diagram chasing shows that ![f,h] is uniformly continuous with respect to the relevant metrics and hence by lemma 4.1 there ∗ is an extension ![f,h] which can be taken to be sX,Y (f). The ﬁrst equation ∞ follows since by uniqueness ![ηX ,h] = PX and the extension of PX is the identity. The equation sX,Z(sY,Zg ◦ f)=sY,Zg ◦ sX,Y f holds if their restrictions via PX are the same which in turn holds if they arise from the same algebra structure on T ∞Z. This holds because the two algebra structures are both ∞ inherited from the map sY,Z(g) ◦ f : X → T Z. The last equation follows ◦ ∞ ◦ ◦ ◦ ✷ since sX,Y (f) ηX = sX,Y (f) PX ηX =![f,h] ηX = f Unlike the monad T , the monad T ∞ has a larger rank than F : Lemma 4.3 Let F be a polynomial endofunctor on Set. Then the monad ∞ ℵ TΣ has rank 1. Proof. The lemma can be proven directly using an interchange law between limits and ﬁltered colimits. ✷

6 Actually applying Lemma 4.1 requires that X + F 0 be non empty which is the case when ∞ X = 0. Thus the maps η0 and s0,Y and their properties must be established separately, but this all follows trivially from the initiality of 0 135 Ghani, Luth,¨ de Marchi, Power

Notice that lemma 4.2 indicates that there is no need to develop a new ∞ syntax for the monad TΣ since the canonical syntax should be the finite and infinite depth terms of the associated monad TΣ. We finish this section with an application of this result to reasoning in functional programming.

4.1 The Generic Approximation Lemma The generic approximation lemma [10] is a proof principle for reasoning about functions in a lazy functional programming language (such as Haskell). The approximation lemma itself pertains to lists and states that given a function approx (n+1) [] = [] approx (n+1) (x:xs) = x:(approx n xs) two lists xs and ys are equal iff ∀n.approx n xs = approx n ys.Notethe lack of a base case: approx 0 x is ⊥ (i.e., the denotation of undefined) in the denotational model, but because of non-strictness, approx n x (with n>0) is defined. This principle can be applied to other datatypes such as trees: data Tree a = Leaf a | Node Tree a Tree approx (n+1) (Leaf x) = Leaf x approx (n+1) (Node l x r) = Node (approx n l) x (approx n r) Two trees t1 and t2 are equal iff ∀n. approx n t1 = approx n t2.[10] proves the generic approximation lemma using the standard denotational semantics of functional programming languages, where types are interpreted as cpo’s, programs as continuous functions and recursive datatypes as least fixed points of functors. That is, the correctness of the proof principle depends upon the semantic category chosen; we have already seen the implicit use of ⊥ in the definition of approx. We propose an alternative and, we believe, more natural derivation of the approximation lemma which is independent of the particular denotational model chosen. The definition of a polymorphic datatype is usually interpreted as the free monad TΣ over its signature Σ. However, this does not capture laziness, since TΣ consists of only finite terms. Instead, we model such a ∞ ∞ datatype by the monad TΣ .SinceTΣ (X)isthefinalX + Σ-coalgebra and ∞ since FΣ is a polynomial endofunctor, TΣ (X) can be calculated as the limit of the following ωop-chain 1 ← (X + F )1 ← (X + F )21 ←···. The universal property of the limit states that two elements x and y of this limit will be equal iff for each n, πn(x)=πn(y)whereπn is the n-th projection. But these projections are precisely the approximation function for the datatype. Notice how the categorical argument replaces the semantic dependency on ⊥ with use of a co-chain beginning with 1. This establishes the correctness of the generic approximation lemma independently of any specific denotational model. More generally, a lazy functional language like Haskell combines aspects appropriately modelled by a monad (being finitely generated), while others are more appropriately captured by final semantics (laziness and non-termination). 136 Ghani, Luth,¨ de Marchi, Power

5 Final Coalgebras and Comonads

We now turn to another possible dualisation of Sect. 3 based upon the idea of mapping an object X to the carrier of the ﬁnal X × F -coalgebra for some κ-accessible functor F . The elegance of the monadic approach to algebraic theories suggests that we can apply these techniques to the development of coalgebraic syntax by dualising them apropriately.

5.1 Cosignatures and their Coalgebras Recall that the heart of the categorical approach to universal algebra is adjunction (4). The dualisation outlined in this section can be summed up as replacing the left adjoint to U with a right adjoint and the replacement of monads with comonads. As a result, the deﬁnition of a cosignature turns out to be formally the same as that of a signature.

Definition 5.1 (Cosignature) Let A be a lκp-category with arities Nκ.A κ-cosignature is a functor B : Nκ →A. Recall that in Section 3, we constructed a κ-accessible endofunctor from a signature by first taking a left Kan extension, and then used equivalence (2) to get a κ-accessible endofunctor. Therefore, in this section we take the A →A right Kan extension of a cosignature B to obtain a functor RanJκ B : κ , followed by the same equivalence to obtain a κ-accessible endofunctor on A. The standard formula for the right Kan extension gives us * A (RanJκ B)(X)= (X, c) Bc c∈Nκ where U B is the U-fold product of B, or more formally the representing object for the functor [U, A( ,B)] : A→Set.WhenA = Set, then this operation is simply exponentiation, i.e. U B =[U, B]. Thus, although signatures and cosignatures are formally the same, the endofunctors they generate are very different. For example, note that while the default value for signatures is 0, if there is a single arity c such that

B(c) = 0, then (RanJκ B)(X) = 0. In fact, the default value for cosignatures is 1 since U 1 = 1 and hence if c is an arity such that B(c)=1,then this arity will contribute nothing to the right Kan extension. Here are two example cosignatures which we will explore further below:

Example 5.2 Deﬁne the cosignature Bp by Bp(2) = 2 and Bp(c)=1for all other arities. Then RanJκ Bp(X)=[Set(X, 2), 2] = [[X, 2], 2].

Deﬁne the cosignature Bpω (2) = ω and Bpω (c)=1for all other arities.

Then (RanJκ Bpω )(X)=[Set(X, 2),ω]=[[X, 2],ω].

Recall that in order to turn a functor Aκ →Ainto a κ-accessible endofunctor, one takes its left Kan extension. Given a cosignature B,wecan thus consider coalgebras of the endofunctor LanIκ RanJκ B. The following is 137 Ghani, Luth,¨ de Marchi, Power an important result from [11]: Lemma 5.3 Let F : A→Abe an accessible endofunctor on a locally presentable category. Then F−coalg is locally presentable and the κ-presentable objects of F−coalg are simply those coalgebras whose underlying object is κ- presentable in A. Proof. The proof uses the fact that the category of all (small) accessible categories has all weighted limits [19, Theorem 5.1.6]. Then the category F −coalg can be constructed as such a limit, namely the inserter [13] of 1 C ✲ C F The second part of the lemma follows from the fact that colimits in F−coalg are formed pointwise. ✷

This result allows us to consider κ-presentable LanIκ RanJκ B-coalgebras:

Lemma 5.4 Let B be a κ-cosignature. A κ-presentable LanIκ RanJκ B-coalgebra consists of a κ presentable object X together with for every arity c,a function hc : A(X, c) →A(X, Bc). Proof. Given any κ-accessible endofunctor F ,andκ-presentable object X,

LanIκ F (X)=F (X). Hence coalgebras of the form mentioned are + A ∼ A A (X, (RanJκ B)(X)) = (X, ∈N (X, c) Bc) + c +κ ∼ A A ∼ A A = c∈Nκ (X, (X, c) Bc) = c∈Nκ [ (X, c), (X, Bc)]

In the theory of monads, a crucial result states that the models of an algebraic presentation are isomorphic to the category of algebras of the theories representing monad. In order to emulate this result we require a notion of model (or co-model if you like) for a κ-cosignature B, and Lemma 5.4 suggests this be LanIκ RanJκ B-coalgebra.

Example 5.5 A model of the cosignature Bp is given by a ﬁnite set X together with a function Set(X, 2) → Set(X, 2).Ifweregardamapf : X → 2 as a subset of X, or a predicate over X, then such a model can be regarded as a map between predicates.

Similarly, a model for the cosignature Bpω consists of a ﬁnite set X and a function Set(X, 2) → Set(X, ω). This is clearly related to the previous example if we think of it as mapping binary partitions of X to ω-ary partitions.

5.2 The Representing Comonad of a Cosignature Recall that in the algebraic case one starts with a signature, constructs an endofunctor, and then considers the free monad over the endofunctor. The second part of the dualisation is to consider the cofree comonad over an endofunctor. The following pair of results from [11] provides the framework for 138 Ghani, Luth,¨ de Marchi, Power this discussion: Lemma 5.6 The following conditions on a functor F : A→Aare equivalent: (i) There is a cofree comonad on F . (ii) The forgetful functor F−coalg →Ais comonadic. (iii) The forgetful functor F−coalg →Ahas a right adjoint (iv) (If A has products) For every object X, the functor X × F has a ﬁnal coalgebra. Similar to the monadic case, only accessible endofunctors have a cofree comonad. For example, there is no ﬁnal coalgebra for the powerset functor ∼ P : Set → Set, as such a coalgebra would be an isomorphism X = PX.The important result from [19,11] is the following: Lemma 5.7 If A is locally presentable and F is accessible, then F−coalg is locally presentable and there is a cofree comonad over F . Proof. Recall from lemma 5.3 that F−coalg is accessible. Cocompleteness of F−coalg follows since the forgetful functor creates colimits and A is cocomplete. Since the forgetful functor is a colimit-preserving functor between locally presentable categories, it follows from the Special Adjoint Functor The- orem [18, Theorem V.8.1] that the forgetful functor has a right adjoint. By lemma 5.6, there is a cofree comonad over F . ✷

Let Comκ(A) be the category of κ-accessible comonads on A,andfur- ther ACom(A) be the category of accessible comonads on A. For every κ, Comκ(A) is a full subcategory of ACom(A). Similarly, let [A, A]κ be the category of endofunctors with rank κ,andAEnd(A) be the category of accessible endofunctors. Piecing together the above results we have the following: Lemma 5.8 The forgetful functor V : ACom(A) → AEnd(A) has a right adjoint R. Proof. Given an accessible endofunctor F , let the right adjoint of the forgetful functor UF : F−coalg →Abe KF . Then choosing R(F )tobeUF KF we can deduce that R(F ) is accessible because UF is accessible (as a left adjoint, it preserves all colimits) and KF is accessible by [3, Proposition 1.66 on p. 52]. The action of the right adjoint on accessible endofunctors has already been given. Given a natural transformation α : F1 ⇒ F2, this induces a functor ∗ α : F1−coalg → F2−coalg which commutes with the respective forgetful functors. Using the dual of [6, Theorem 3 on p. 128] and the comonadicity of these categories of coalgebras, this in turn induces a natural transformation R(α):R(F1) ⇒ R(F2). This defines the functor R : AEnd(A) → ACom(A). The fact that it is right adjoint to V is easily verified. ✷ As opposed to the monadic case, the cofree comonad RM on an endofunctor M : A→Aof rank λ need not have rank λ; as a simple counterexample, def consider the endofunctor M : Set → Set defined as MX = A × X for a 139 Ghani, Luth,¨ de Marchi, Power

fixed set A, which clearly is finitary (has rank ℵ0). We know that the value of RM for a set X is given by RM(X)=νY.X × MY = νY.X × A × Y which is the set of all infinite streams of elements of X × A. Now consider a countably infinite set Y ,thenRM(Y ) contains a sequence with infinitely many different elements from Y . This sequence can not be given from any of the finite subsets Y0 ⊆ Y , which shows that RM has a rank larger than ℵ0.As we saw in Sect. 4, in general calculating coalgebras of accessible endofunctors invariably seems to increase their rank — see [24] for an investigation of this phenomenon.

Using the equivalence between [Aκ, A]and[A, A]κ, we now have the following functors, where ◦ Jκ is the precomposition with Jκ and V is the forgetful U for comonads of rank κ:

RanJκ ✲ ∼ ✛ [Nκ, A] ✛ 1 [Aκ, A] = [A, A]κ Comκ(A) ∩ V ∩ ◦ J κ (5) ❄ R ✲ ❄ AEnd(A) ✛ 1 ACom(A) V

We further deﬁne the composite functors

def Vκ : Comκ(A) → [Nκ, A] Vκ =( ◦ Jκ).V N A → A def . Rκ :[ κ, ] ACom( ) Rκ = R RanJκ

As we have seen, the rank of RκB may be greater than the rank of B,and hence the codomain of Rκ is not the domain of Vκ, but rather ACom(A). So although Rκ and Vκ can not be adjoint, there is the following isomorphism:

Lemma 5.9 For any κ-accessible comonad G and κ-cosignature B : Nκ → A there is an isomorphism ∼ [Nκ, A](VκG, B) = ACom(A)(G, RκB)(6) which is natural in G and B. Proof. The isomorphism is shown by the following chain of natural isomorphisms provided by the two adjunctions in (5) and the full and faithful embedding of [A, A]κ into AEnd(A): ∼ [Nκ, A](VκG, B) = [Nκ, A]( ◦ JκVG,B) ∼ A = [ κ,A](VG,RanJκ B) ∼ A = AEnd( )(VG,RanJκ B) ∼ A = ACom( )(G, RRanJκ B) ∼ = ACom(A)(G, RκB) ✷ 140 Ghani, Luth,¨ de Marchi, Power

So given a κ-cosignature B we have constructed its representing comonad RκB. Note that by lemma 5.6, the category of coalgebras RκB−Coalg is − isomorphic to the category LanIκ RanJκ B coalg. Restricting ourselves to κ-presentable coalgebras, we have that the κ-presentable coalgebras of the representing comonad RκB are isomorphic to the κ-presentable models of the cosignature B as deﬁned earlier. This is our partial dualisation of the result stating that the models of a signature are isomorphic to the Eilenberg-Moore category of the representing monad.

5.3 Coequational Presentations and their Representing Comonad In this section, we continue the dualisation by defining coequational presentations, deriving a representing comonad for such a coequational presentation, and relating the coalgebras of the representing comonad to the models of the presentation. As we have seen above, equations are interpreted as a pair of monad morphisms between free monads; the representing monad for an equational presentation is then defined to be the coequaliser of these monad morphisms. Dualising this requires a coequational presentation to form a pair of comonad morphisms between cofree comonads and taking the representing comonad for the coequational presentation to be the equaliser of these comonad morphisms. Definition 5.10 (Coequational Presentations) A coequational presentation is given by two cosignatures B :Nκ →Aand E :Nλ →A(where RκB is λ-accessible), and two comonad morphisms σ, τ : RκB → RλE. Under (6) the maps σ, τ : RκB → RλE are given by maps σ ,τ : VλRκB → → ∈N E, which in turn are families σc,τc : RκBc Ec of maps for c λ. So, as opposed to the monadic case where an equation is given by a pair of terms, a coequation consists of two partitionings of the coterms; for example, if E(n)= 2,eachofσ and τ partition RκB(n) into two subsets. As mentioned above, given a coequational presentation, our intention is to define its representing comonad to be the equaliser of the comonad morphisms:

σ ✲ ✲ G RκB ✲ RλE (7) τ Proving that these equalisers exist is made easier by our abstract categorical setting which provides us with an alternative deﬁnition of the coalgebras of a comonad. Recall that such coalgebras are given by an object x ∈Aand amapα : x → Gx which commutes with the unit and counit of G. First, observe that an object x of A is given by a map X : 1 →A(where 1 is the one-object category). Further, the functor category [1, A]isisomorphictoA, and we have ∼ ∼ A(x, Gx) = [1, A](X, G ◦ X) = [A, A](LanX X, G), (8) 141 Ghani, Luth,¨ de Marchi, Power so to give a map x → Gx is equivalent to giving a natural transformation from LanX X ⇒ G. In fact. LanX X is a comonad:

Lemma 5.11 LanX X is a comonad. If X is κ-presentable, then LanX X is κ-accessible.

Proof. Using the standard formula for left Kan extensions, LanX X(a)= A(X, a) ⊗ X (where ⊗ is the tensor;e.g.inSet,itisgivenbytheusual product). If X is κ-presentable, then A(X, −) preserves all κ-ﬁltered colimits, as does −⊗X; hence LanX X is κ-accessible. To have a comonad structure, we need natural transformations LanX X ⇒ 1andLanX X ⇒ LanX X ◦ LanX X which satisfy the comonad laws. The ﬁrst of these is given by the image of the identity transformation on X ∼ under the isomorphism [1, A](X, X) = [A, A](Lan X, 1 ). The second is X A ∼ given by the image under the isomorphism [1, A](X, LanX X ◦ LanX X ◦ X) = [A, A](LanX X, LanX X ◦ LanX X) of the transformation

A LanX XA X =⇒ LanX X ◦ X =⇒ LanX X ◦ LanX X ◦ X where P is the canonical transformation X ⇒ LanX X ◦ X. That the counit and comultiplication obey the comonad laws is easily veriﬁed. ✷

The fact that LanX X is a comonad allows us to strengthen equation (8) to obtain the promised characterisation of the coalgebras of a comonad. Proposition 5.12 A coalgebra for a comonad G is given by an object X of A and a map LanX X ⇒ G in ACom(A).

Proof. We have already seen that the structure map of the coalgebra is precisely a natural transformation between the two functors. It is then routine to verify that the properties of the structure map of a coalgebra correspond to the laws of a comonad morphism. ✷

With Proposition 5.12, we can show the existence of equalisers: Lemma 5.13 The category of accessible comonads ACom(A) has all equalisers.

σ ✲ Proof. Consider two comonad morphisms M ✲ E. Wedeﬁnethecate- τ gory (M,E)−Coalg to be given by coalgebras α :LanX X → M of M such that σ.α = τ.α. Recall from the proof of lemma 5.4 how we constructed the category of coalgebras as a weighted limit; using a similar diagram called an equiﬁer, we can construct (M,E)−Coalg as a weighted limit as well, making it accessible. With (M,E)−Coalg, M and E accessible, the forgetful functor from (M,E)−Coalg to A has a right adjoint; and post-composition of this right adjoint with the forgetful functor gives the equaliser of M and E. ✷ 142 Ghani, Luth,¨ de Marchi, Power

We finish by relating the models of a coequational presentation to the coalgebras of its representing comonad. If G is the representing comonad of the coequational presentation σ, τ : RκB → RλE,thenaG-coalgebra is a map α :LanX X → G which by the universal property of equalisers is a map α :LanX X → RκB such that σ.α = τ.α, or in other words a coalgebra for B which equalises σ and τ. In this case, we say that the coalgebra LanX X → RκB satisfies the coequational presentation σ, τ : RκB → RλE. Following the calculations above, if X is κ-presentable the map α :LanX X → RκB,gives us two RλE-coalgebras, which by Lemma 5.4 are families σ : A(X, c) → A(X, Ec)andτ : A(X, c) →A(X, Ec) of maps for all c ∈Nλ, which have to be equal if α :LanX X → RκB satisfies the presentation σ, τ : RκB → RλE.

6 Conclusions, Related and Further Work

This paper set out to analyse the essence of the categorical approach to universal algebra based upon the theory of accessible monads and then derive their coalgebraic counterparts. While still only work in progress, we feel we have made a definite contribution which will be of interest to the general coalgebra community. In particular, the derivation of the infinitary monad corresponding to the collection of final X + F -coalgebras seems to have ready applications in lazy functional programming. In addition, the derivation of the cofree comonad representing a cosignature and the interpretation of a coequational presentation as an equaliser of cofree comonads are dualisations of the corresponding results in the standard theory. We have also related the coalgebras of the representing comonad to the models of the coequational presentations. The major technical challenge was caused by the increase in rank when passing from a cosignature to its representing comonad. For example, this meant that the characterisation of coalgebras only holds for coalgebras up to a certain rank. In addition, this increase in rank makes it unlikely that every accessible comonad can be given as a coequational presentation (i.e. as an equaliser of cofree comonads), as is possible for finitary monads. Of course there remains much to be done. Our approach has been very abstract and perhaps the most important next step is to make precise the relationship between our approach and others pursued in the community. For example, Cirstea [7] defines an abstract cosignature as a functor F : C→Cwhere C has all finite limits and limits of ωop-chains, and F preserves pullbacks and such limits. Then, an observer is given by a functor K : C→C and a natural transformation c : U → K.U, and a coequation by two observers (K, l)and(K, r). Thus, an observer is a family β : C → KC of maps for all F -coalgebras α : X → FX. This notion corresponds roughly with a comonad morphism RκB ⇒ RλE, since we always have the cofree coalgebra RκBX → RκBRκBX. However, the difference is the stronger assumption that the abstract cosignature F preserves limits of ωop, which allows the con- 143 Ghani, Luth,¨ de Marchi, Power struction of a final coalgebra, and reasoning about it; in our more abstract setting all we have is the existence of a right adjoint and the corresponding weaker properties. We have not as yet developed the logical calculus underlying the categorical constructions of the comonadic semantics. It seems this underlying logic will have a modal flavour, and hence research in this area, e.g. by Kurz [15], seems relevant. Finally the use of coequations in recent work on co-Birkhoff theorems [4,22] should be compared with our coequations.

References

[1] P. Aczel, J. Adamek, and J. Velebil. Iteration monads. In U. Montanari, editor, Proceedings CMCS’01, volume 44, 2001.

[2] J. Adamek. On ﬁnal coalgebras of continuous functors. To appear.

[3]J.AdamekandJ.Rosick´y. Locally Presentable and Accessible Categories. Number 189 in London Mathematical Society Lecture Note Series. Cambridge University Press, 1994.

[4] S. Awodey and J. Hughes. The coalgebraic dual of birkhoﬀs variety theorem. http://www.contrib.andrew.cmu.edu/user/jesse/papers/CoBirkhoff.ps.gz, 2000.

[5] M. Barr. Terminal algebras in well-founded set theory. Theoretical Computer Science, 114:299– 315, 1993.

[6] M. Barr and C. Wells. Toposes, Triples and Theories. Number 278 in Grundlehren der mathematischen Wissenschaften. Springer Verlag, 1985.

[7] C. Cirstea. An algebra-coalgebra framework for system speciﬁcation. In Proceedings CMCS 2000, volume 33 of Electronic Notes in Theoretical Computer Science, 2000.

[8] E. J. Dubuc and G. M. Kelly. A presentation of topoi as algebraic relative to categories or graphs. Journal for Algebra, 81:420–433, 1983.

[9] M. Fiore, G. Plotkin, and D. Turi. Abstract syntax and variable binding. In Proc. LICS’99, 1999.

[10] G. Hutton and J. Gibbons. The generic approximation lemma. Information Processing Letters, 2001. To appear.

[11] P. T. Johnstone, A. J. Power, T. Tsjujushita, H. Watanabe, and J. Worrell. The structure of categories of coalgebras, 1998.

[12] G. M. Kelly. A uniﬁed treatment of transﬁnite constructions for free algebras, free monoids, colimits, associated sheaves and so on. Bulletins of the Australian Mathematical Society, 22:1– 83, 1980. 144 Ghani, Luth,¨ de Marchi, Power

[13] G. M. Kelly. Elementary observations on 2-categorical limits. Bulletins of the Australian Mathematical Society, 39:301–317, 1989.

[14] G. M. Kelly and A. J. Power. Adjunctions whose counits are coequalizers, and presentations of ﬁnitary monads. Journal for Pure and Applied Algebra, 89:163– 179, 1993.

[15] A. Kurz. Logics for Coalgebra and Applications to Computer Science. Dissertation, Ludwig-Maximilans-Universtit¨at M¨unchen, 2000.

[16] C. L¨uth. Compositional term rewriting:An algebraic proof of Toyama’s theorem. In H. Ganzinger, editor, Rewriting Techniques and Applications, 7th International Conference, number 1103 in LNCS, pages 261– 275. Springer Verlag, July 1996.

[17] C. L¨uth and N. Ghani. Monads and modular term rewriting. In Category Theory in Computer Science CTCS’97, number 1290 in LNCS, pages 69– 86. Springer Verlag, September 1997.

[18] S. Mac Lane. Categories for the Working Mathematician, volume 5 of Graduate Texts in Mathematics. Springer Verlag, 1971.

[19] M. Makkai and R. Par´e. Accessible Categories: The Foundations of Categorical Model Theory, volume 104 of Contemporary Mathematics. American Mathematical Society, 1989.

[20] L. Moss. Coalgebraic logic. Annals of Pure and Applied Logic, 96:277–317, 1999.

[21] E. Robinson. Variations on algebra:monadicity and generalisations of equational theories. Technical Report 6/94, Sussex Computer Science Technical Report, 1994.

[22] G. Rosu. Equational axiomatisability for coalgebra. Theoretical Computer Science, 260(1), 2000.

[23] D. Turi. Functorial Operational Semantics and its Denotational Dual.PhD thesis, Free University, Amsterdam, 1996.

[24] J. Worrell. Terminal sequences for accessible endofunctors. In Electronic Notes in Theoretical Computer Science, volume 19, 1999. Proceedings CMCS ’99.

145 CMCS’01 Preliminary Version

When is a function a fold or an unfold?

Jeremy Gibbons a, Graham Hutton b, and Thorsten Altenkirch b

a Oxford University Computing Laboratory, Wolfson Building, Parks Road, Oxford OX1 3QD, United Kingdom b Languages and Programming Group, School of Computer Science and IT, University of Nottingham, Jubilee Campus, Wollaton Road, Nottingham NG8 1BB, United Kingdom

Abstract We give a necessary and suﬃcient condition for when a set-theoretic function can be written using the recursion operator fold, and a dual condition for the recursion operator unfold. The conditions are simple, practically useful, and generic in the underlying datatype.

1 Introduction

The recursion operator fold encapsulates a common pattern for defining programs that consume values of a least fixpoint type such as finite lists. Dually, the recursion operator unfold encapsulates a common pattern for defining programs that produce values of a greatest fixpoint type such as streams (infinite lists). Theory and applications of fold abound — see [11,4] for recent sur- veys — while in recent years it has become increasingly clear that the less well-known concept of unfold is just as useful [5,6,10,12,14]. Given the interest in fold and unfold, it is natural to ask when a program can be written using one of these operators. Surprisingly little is known about this question. This article gives a complete answer for the special case in which programs are total functions between sets. In particular, we give a necessary and sufficient condition for when a set-theoretic function can be written using fold, and a dual condition for unfold. The conditions are simple, practically useful, and generic in the underlying datatype. However, our proofs are set- theoretic, and make essential use of classical logic and the Axiom of Choice; hence our results do not generalize to categories of constructive functions 1 .

1 Such as the eﬀective topos or the category of ω-sets. This is a preliminary version. The ﬁnal version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Gibbons, Hutton and Altenkirch

2 Fold and unfold

In this section we review the categorical treatment of fold and unfold in terms of initial algebras and ﬁnal coalgebras; for further details see [17,19,13,1]. Suppose that we ﬁx a category C and a functor F : C→C.Analgebra is pair (A, f) comprising an object A andanarrowf : FA → A,anda homomorphism h :(A, f) → (B,g) from one such algebra to another is an arrow h : A → B such that the following square commutes: FA Fh /FB

f g

/ A h B An initial algebra is an initial object in the category with algebras as objects and homomorphisms as arrows. We write (µF, in) for an initial algebra, and fold f for the unique homomorphism h :(µF, in) → (A, f) from the initial algebra to any other algebra (A, f). That is, fold f is deﬁned as the unique arrow that makes the following square commute: F (fold f) F (µF ) /FA

in f

/ µF fold f A The dual notions of coalgebra, cohomomorphism,andterminal coalgebra are deﬁned similarly. We write (νF,out) for a terminal coalgebra, and unfold f for the unique cohomomorphism h :(A, f) → (νF,out) from any coalgebra (A, f) to the terminal coalgebra. That is, unfold f is deﬁned as the unique arrow that makes the following square commute: unfold f A /νF

f out

 /F (νF) FA F (unfold f) In the literature, fold f and unfold f are sometimes written as (|f|)and45(f)67, and called catamorphisms and anamorphisms respectively.

2.1 Example: ﬁnite lists

Suppose that we define a functor L : SET →SET by LA = 1+(N×A)and Lf = id1+(idN×f), where N is the set of natural numbers. Then an algebra is 147 Gibbons, Hutton and Altenkirch apair(A, f) comprising a set A and a function f : 1+(N×A) → A. Functions of this type can always be uniquely decomposed into the form f =[g,h] for some other functions g : 1 → A and h : N × A → A. A homomorphism f :(A, [g,h]) → (B,[i, j]) is a function f : A → B such that f · g = i and f · h = j · (idN × f). The functor L has an initial algebra (µL, in)=(List(N), [nil, cons ]), where List(A) is the set of all finite lists with elements drawn from A,andnil : 1 → List(N)andcons : N × List(N) → List(N) are constructors for this set. Given any other set A and two functions i : 1 → A and j : N × A → A, the function fold [i, j]:List(N) → A is uniquely defined by the following two equations:

fold [i, j] · nil = i

fold [i, j] · cons = j · (idN × fold [i, j]) That is, fold [i, j] processes a list by replacing the nil constructor at the end of the list by the function i,andeachcons constructor within the list by the function j. For example, the function sum : List(N) → N that sums a list of naturals can be deﬁned by sum = fold [zero, plus], where zero : 1 → N and plus : N × N → N are given by zero () = 0 and plus (x, y)=x + y. We will use this datatype in examples later. For notational simplicity, we will write ‘[ ]’ for nil (), and ‘x : xs’ for cons (x, xs). Thus, we might have written the above deﬁnition of fold more perspicuously as: (fold [i, j]) [ ] = i (fold [i, j]) (x : xs)=j (x, (fold [i, j]) xs)

2.2 Example: streams Suppose that we define a functor S : SET →SET by SA = N × A and Sf = idN × f. Then a coalgebra is a pair (A, f) comprising a set A and a function f : A → N × A. Functions of this type can always be uniquely decomposed into the form f = g,h for some other functions g : A → N and h : A → A. A cohomomorphism f :(A, g,h) → (B,i, j) is a function f : A → B such that i · f = g and j · f = f · h. The functor S has a terminal coalgebra (νS,out)=(Stream(N), head, tail), where Stream(A) is the set of all streams with elements drawn from A, and head : Stream(N) → N and tail : Stream(N) → Stream(N) are destructors for this set. Given any other set A and two functions g : A → N and h : A → A, the function unfold g,h : A → Stream(N) is uniquely defined by the following two equations: head · unfold g,h = g tail · unfold g,h = unfold g,h·h That is, unfold g,h produces a stream by using the function g to produce 148 Gibbons, Hutton and Altenkirch the head of the stream, and the function h to generate another value that is then itself unfolded in the same way to produce the tail of the stream. For example, the function from : N → Stream(N), which produces a stream of naturals ascending in steps of one, can be defined by from = unfold idN, succ where succ : N → N is given by succ x = x +1.

3 When is an arrow a fold or an unfold?

The fold operator encapsulates a common pattern for defining an arrow of type µF → A. It is natural then to ask when an arrow of this type can be written using fold. More precisely, when can an arbitrary arrow h : µF → A be written in the form h = fold f for some other arrow f : FA → A? A technically complete, but nonetheless unsatisfactory, answer to this question is provided by the universal property of the fold operator [17], which can be stated as the following equivalence: h = fold f ⇔ h · in = f · Fh The ⇒ direction of this equivalence states that fold f is a homomorphism from the initial algebra (µF, in) to another algebra (A, f), while the ⇐ direction states that any other homomorphism h between these two algebras must be equal to fold f. Taken as a whole, the universal property expresses the fact that fold f is the unique homomorphism from (µF, in)to(A, f). The universal property provides a complete answer to our question — h can be written in the form fold f precisely when h · in = f · Fh— but is less helpful than it might be because it requires that we already know f.Given a specific h, however, the universal property can often be used to guide the construction of an appropriate f [11], but we do not consider this a completely satisfactory answer either, because this approach is only a heuristic, and it is sometimes difficult to apply in practice. The problem with the universal property is that it concerns an intensional aspect of h, namely the function f that forms part of its implementation. Often a condition based on purely extensional aspects is more useful. A partial answer to our question with purely extensional concerns is that every left invertible arrow h : µF → A can be written using fold [19]. Formally, if we assume that there exists an arrow g : A → µF such that g · h = idµF , then the equation h = fold f can be solved for f as follows: h = fold f ⇔{universal property } h · in = f · Fh ⇔{identities }

h · in · idF (µF ) = f · Fh

149 Gibbons, Hutton and Altenkirch

⇔{functors }

h · in · F (idµF )=f · Fh ⇔{assumption } h · in · F (g · h)=f · Fh ⇔{functors } h · in · Fg· Fh = f · Fh ⇐{substitutivity } f = h · in · Fg In summary, we have derived the following implication:

g · h = idµF ⇒ h = fold (h · in · Fg) As an example, the function rev : List(N) → List(N) that reverses a list is its own inverse, and hence it is immediate that rev can be written using fold by the above implication. Note, however, that this implication only provides a partial answer to our question, because the converse is not true in general. That is, not every arrow h : µF → A that can be written using fold is left invertible. For example, the function sum : List(N) → N was written using fold in the previous section, but is not left invertible. Dually, the unfold operator also satisﬁes a universal property, which can be used to show that every right invertible arrow of type A → νF can be written using unfold [19]. For example, the function evenpos : Stream(N) → Stream(N) that removes every other element from a stream has a right inverse (any function that inserts an element between each adjacent pair in a stream), and hence it is immediate that evenpos can be written using unfold. However, not every arrow h : A → νF that can be written using unfold is right invertible. For example, the function from : N → Stream(N) was written using unfold in the previous section, but is not right invertible. As far as we are aware, the invertibility results above are the only known results that state when arbitrary arrows of the correct type can be written using fold or unfold. We conclude this section by noting that much more progress has been made concerning speciﬁc kinds of arrows. For example, the fusion law states that the composition of a homomorphism and a fold can always be written as a fold, while the banana split law states that two folds applied to the same argument can always be written as a single fold [19].

4When is a function a fold?

In this section we give a necessary and sufficient condition for when an arrow can be written using fold, for the special case of the category SET in which the arrows are total functions between sets. We dualize the result to unfold in 150 Gibbons, Hutton and Altenkirch the following section. The result depends on the following definition: Definition 4.1 The kernel [16]ofafunctionf : A → B is the set of pairs of elements that are identified by f: ker f = { (a, a) ∈ A × A | fa= fa }

The main result of this section is a necessary and suﬃcient condition for when an arbitrary arrow h : µF → A in SET canbewrittenintheform h = fold f for some other arrow f : FA→ A. Theorem 4.2 Suppose that h : µF → A.Then (∃g : FA→ A. h = fold g) ⇔ ker (Fh) ⊆ ker (h · in)

The crux of the proof is the well-known observation that inclusion of kernels is equivalent to the existence of ‘postfactors’: Lemma 4.3 Suppose that f : A → B and h : A → C.Then (∃g : B → C. h = g · f) ⇔ (ker f ⊆ ker h ∧ B → C = ∅)

Proof. The proof is straightforward. For the ⇒ direction, assume that g : B → C and h = g · f; then clearly B → C = ∅, and moreover, (a, a) ∈ ker f ⇔{kernels } fa= fa ⇒{substitutivity } g (fa)=g (fa) ⇔{h = g · f } ha= ha ⇔{kernels } (a, a) ∈ ker h Conversely, assume that ker f ⊆ ker h and B → C = ∅, so that either B = ∅ or C = ∅.WhenB = ∅,letg be the unique function in B → C;notethatg is the ‘empty function’, and so g · f is empty too. Moreover, A = ∅ because ofthetypeoff,soh is also empty and hence equal to g · f.WhenC = ∅,we define gb for b in the range of f by gb= ha for some a with fa= b;thisis a proper definition, because if there are two choices a, a with fa= fa = b, then ha= ha also by assumption. For b outside the range of f,wedefinegb arbitrarily. By construction, this gives ha= g (fa) for every a. ✷

151 Gibbons, Hutton and Altenkirch

We also use the following simple fact concerning initial algebras: Lemma 4.4 µF → A = ∅⇒FA→ A = ∅

Proof. We note that FA→ A = ∅ is equivalent to A = ∅⇒FA= ∅,which implication can then be veriﬁed as follows: A = ∅ ⇒{µF → A = ∅} µF = ∅ ⇒{in : F (µF ) → µF } F (µF )=∅ ⇒{µF = ∅ = A } FA= ∅ ✷

Proof of Theorem 4.2 Given the two lemmata above, the proof of the theorem is almost embarrassingly simple: ∃g : FA→ A. h = fold g ⇔{universal property } ∃g : FA→ A. h · in = g · Fh ⇔{Lemma 4.3 } ker (Fh) ⊆ ker (h · in) ∧ FA→ A = ∅ ⇔{Lemma 4.4, h : µF → A } ker (Fh) ⊆ ker (h · in) ✷

Remark 4.5 For the type List(A) of finite lists with elements drawn from A, with constructors nil : 1 → List(A)andcons : A × List(A) → List(A), Theorem 4.2 reduces to stating that an arbitrary function h : List(A) → B can be written directly as a fold precisely when the lists that are identified by h are closed under cons , in the sense that for all x, xs, ys, h xs = h ys ⇒ h (x : xs)=h (x : ys) Example 4.6 If we define sum : List(N) → N by the equations

sum [] = 0 sum (x : xs)=x + sum xs 152 Gibbons, Hutton and Altenkirch then it is easy to show that the lists identiﬁed by sum are closed under cons : sum (x : xs)=sum (x : ys) ⇔{deﬁnition of sum } x + sum xs = x + sum ys ⇐{substitutivity } sum xs = sum ys Hence, sum can be written directly using fold.

Example 4.7 In contrast, if we deﬁne a function stail : List(N) → List(N) (for ‘safe tail’) by the equations

stail [] = [] stail (x : xs)=xs then a simple counterexample veriﬁes that the lists identiﬁed by stail are not closed under cons : for example, with xs =[]andys =0:[],wehave stail xs =[]=stail ys, but stail (1 : xs)=[]=0:[]= stail (1 : ys). Therefore stail cannot be written directly as a fold.

Example 4.8 For the type List(R) of finite lists of reals, consider the problem of computing floorsum = floor · rsum,wherersum : List(R) → R sums a list of reals and floor : R → Z rounds a real r down to the largest integer at most r. Because the result is an integer, one might wonder whether floorsum can be carried out as a fold to integers, thereby avoiding the computationally more expensive real arithmetic. It cannot: we have floorsum (0.3:[])= floorsum (0.6 : [ ]), but floorsum (0.5:0.3:[])= floorsum (0.5:0.6:[]). On the other hand, the reverse composition sum · map floor, which floors every element of the list before summing, can be written as a fold: an argument similar to Example 4.6 applies. This is an instance of deforestation [23], an optimisation whereby two computations are combined into one and the intermediate data structure (here of type List(Z)) is eliminated.

Remark 4.9 For the type Tree (A) of binary trees with constructors leaf : A → Tree (A)andnode : Tree (A) × Tree (A) → Tree (A), Theorem 4.2 reduces to stating that an arbitrary function h : Tree (A) → B can be written directly as a fold precisely when the trees that are identiﬁed by h are closed under node, in the sense that for all t, u, h t = h t ∧ h u = h u ⇒ h (node (t, u)) = h (node (t,u))

Example 4.10 For another deforestation example, consider flatsum = sum · flatten,whereflatten : Tree (A) → List(A) generates a list of the elements of a tree. The intermediate list in flatsum can be eliminated, because 153 Gibbons, Hutton and Altenkirch

flatsum (node (t, u)) = { definition of flatsum } sum (flatten (node (t, u))) = { definition of flatten } sum (flatten t ++ flatten u) = { sum distributes over ++ } sum (flatten t)+sum (flatten u) = { definition of flatsum } flatsum t + flatsum u from which we conclude that trees identified under flatsum are closed under node. (Here, ‘++’ concatenates two lists.) Example 4.11 The predicate bal : Tree (A) → B that holds of tree iff it is balanced (all the leaves at the same depth) is not a fold: with tree t being balanced and of depth 1, and tree u being balanced and of depth 2, both t and u are identified by bal (both yielding true), yet bal (node (t, t)) = bal (node (t, u)). Example 4.12 However, the function dbal : Tree (A) → N×B that computes a pair, the depth of the tree and whether it is balanced, is a fold. Because depth (node (t, u)) = 1 + max (depth t, depth u) bal (node (t, u)) = bal t ∧ bal u ∧ depth t = depth u trees identified by dbal are closed under node.Thisisanexampleofamu- tumorphism [7] or almost homomorphism [3,8]; transforming a function into such a form is an important step towards constructing an efficient data-parallel algorithm for computing it.

5 When is a function an unfold?

Dualising Theorem 4.2 to unfold is straightforward. The appropriate dual to the notion of the kernel of a function is simply its image: Deﬁnition 5.1 The image of a function f : A → B is the set of elements that are produced by f: img f = { b ∈ B |∃a ∈ A. f a = b }

The duality between kernels and images is perhaps not immediately evident, but is revealed by thinking relationally. In particular, if functions are viewed as relations in the obvious way, then the relational composition f ◦ · f of a function f with its converse f ◦ is precisely the kernel of f, while the dual 154 Gibbons, Hutton and Altenkirch composition f · f ◦ is (the identity relation on) the image of f. We can now present our result for unfold, which gives a necessary and suﬃcient condition for when an arbitrary arrow h : A → νF in SET can be written in the form h = unfold g for some other arrow g : A → FA. Theorem 5.2 Suppose that h : A → νF.Then (∃g : A → FA. h= unfold g) ⇔ img (Fh) ⊇ img (out · h)

The crux of the proof is the dual of Lemma 4.3, namely that inclusion of images is equivalent to the existence of ‘prefactors’: Lemma 5.3 Suppose that f : B → C and h : A → C.Then (∃g : A → B. h = f · g) ⇔ (img f ⊇ img h ∧ A → B = ∅)

Proof. For the ⇒ direction, assume that g : A → B and h = f · g;then clearly A → B = ∅, and moreover, c ∈ img h ⇔{images } ∃a. h a = c ⇔{h = f · g } ∃a. f (ga)=c ⇒{g : A → B } ∃b. f b = c ⇔{images } c ∈ img f Conversely, assume that img f ⊇ img h and A → B = ∅, so that either A = ∅ or B = ∅.WhenA = ∅,thenh is the empty function; let g be the empty function too, so f · g is also empty and hence equal to h.WhenB = ∅,we deﬁne ga for a ∈ A as follows. Let c = ha; by assumption, c ∈ img f too, so there exists b ∈ B with fb= c, and we deﬁne ga to be such a b.Ifthere is more than one such b,itdoesn’tmatterwhichonethatwechoose.By construction, this gives ha= f (ga) for every a. ✷

We also use the dual of Lemma 4.4: Lemma 5.4 A → νF = ∅⇒A → FA= ∅

Proof. We note that A → FA= ∅ is equivalent to A = ∅⇒FA= ∅,which implication can then be veriﬁed by combining the two calculations: 155 Gibbons, Hutton and Altenkirch

A = ∅ ⇒{A → νF = ∅} νF = ∅ ⇒{out : νF → F (νF) } F (νF) = ∅ and A = ∅ ⇒{functions } νF → A = ∅ ⇒{functors } F (νF) → FA= ∅ That is, A = ∅ implies that F (νF) = ∅ and F (νF) → FA = ∅,which conjunction in turn implies that FA= ∅, as required. ✷ Proof of Theorem 5.2 Again, the proof is simple: ∃g : A → FA. h= unfold g ⇔{universal property } ∃g : A → FA. out · h = Fh· g ⇔{Lemma 5.3 } img (Fh) ⊇ img (out · h) ∧ A → FA= ∅ ⇔{Lemma 5.4, h : A → νF } img (Fh) ⊇ img (out · h) ✷

Remark 5.5 For the type Stream(A) of streams with elements drawn from A, with destructors head : Stream(A) → A and tail : Stream(A) → Stream(A), Theorem 5.2 reduces to stating that an arbitrary function h : B → Stream(A) can be written directly as an unfold precisely when the tail of every stream producible by h is itself producible by h, in the sense that: img (tail · h) ⊆ img h Example 5.6 Consider the function from : N → Stream(N) deﬁned in Sec- tion 2.2. Then (tail · from) n is the stream [n +1,n+2,...],andingeneral, img (tail · from)isthesetofstreams{ [n +1,n+2,...] | n ∈ N },whichis in included in img from, the set of streams { [n, n +1,...] | n ∈ N }. Hence, from can be written directly using unfold. 156 Gibbons, Hutton and Altenkirch

Example 5.7 In contrast, if we deﬁne a function mults : N → Stream(N) such that mults n produces the stream of multiples [0,n,n× 2,n× 3,...]ofa natural n,then(tail·mults) n is the stream [n, n×2,...], and so img (tail·mults) is not included in img mults, which only includes streams whose head is 0. Therefore mults cannot be written directly as an unfold.

Remark 5.8 For the type CoTree(A) of inﬁnite binary trees with elements drawn from A, with destructors root : CoTree(A) → A and left, right : CoTree(A) → CoTree(A), Theorem 5.2 reduces to stating that an arbitrary function h : B → CoTree(A) can be written as an unfold precisely when the left and right of every tree producible by h are themselves producible by h: img (left · h) ⊆ img h img (right · h) ⊆ img h

Example 5.9 Consider the inﬁnite binary tree with every node labelled by its path, a ﬁnite list of booleans recording the left and right turns from the root in order to reach that node. The function paths : 1 → CoTree(List(B)) that produces this tree is not an unfold, because img (left · paths )andimg (right · paths ) contain trees with singleton lists at their roots, which are not included in img paths , which contains a tree with the empty list at its root.

Example 5.10 In contrast, the more general function pathsfrom : List(B) → CoTree(List(B)) that generates the tree of paths starting from a given path is an unfold, because (left · pathsfrom) bs = pathsfrom (false : bs) implies that img (left · pathsfrom) is included in img pathsfrom, and similarly for right.

6 Conclusion

We have given the first complete results for when an arbitrary arrow can be written directly as a fold or unfold, for the special case of the category SET .In future work we will investigate whether the results can be generalised to other categories, and to other patterns of recursion, such as primitive (co-)recursion [18,21] and course-of-value (co-)iteration [22]. As well as being interesting from a theoretical point of view, we also expect the results to have practical applications in program optimisation. A well-structured program is typically factored into several phases, each phase generating a data structure that is consumed by the subsequent phase; deforestation [9,15,20] fuses adjacent phases and eliminates the intermediate data structures. When performed as a compiler optimisation, it yields efficient object code without sacrificing the structure and clarity of the source code. Our results can be used to determine when two phases cannot be fused to a fold or an unfold. It might be possible to use an automatic testing system such as QuickCheck [2] to find counterexamples to the appropriate inclusions. 157 Gibbons, Hutton and Altenkirch

Acknowledgements

We are very grateful to Lambert Meertens, whose suggestions lead to a substantial simpliﬁcation of our proofs. We also thank the anonymous referees for their useful comments. Graham Hutton was supported by EPSRC grant Structured Recursive Programming, and together with Thorsten Altenkirch by ESPRIT Working Group Applied Semantics.

References

[1] R. Bird and O. de Moor. Algebra of Programming. Prentice Hall, 1997. [2] K. Claessen and J. Hughes. Quickcheck:A lightweight tool for random testing of Haskell programs. In Proc. 5th ACM SIGPLAN International Conference on Functional Programming, September 2000. [3] M. Cole. Parallel programming with list homomorphisms. Parallel Processing Letters, 5(2):191–203, 1995. [4] J. Gibbons. Calculating functional programs. In Summer School and Workshop on Algebraic and Coalgebraic Methods in the Mathematics of Program Construction, Oxford, April 2000. [5] J. Gibbons and G. Hutton. Proof methods for structured corecursive programs. In Proc. 1st Scottish Functional Programming Workshop, Stirling, Scotland, August 1999. [6] J. Gibbons and G. Jones. The under-appreciated unfold. In Proc. 3rd ACM SIGPLAN International Conference on Functional Programming, Baltimore, Maryland, September 1998. [7]M.M.Fokkinga. Law and Order in Algorithmics. PhD thesis, Universiteit Twente, 1992. [8] S. Gorlatch. Extracting and implementing list homomorphisms in parallel program development. Science of Computer Programming, 33:1–27, 1999. [9] Z. Hu, H. Iwasaki and M. Takeichi. Deriving structural hylomorphisms from recursive deﬁnitions. In Proc. 1st ACM SIGPLAN International Conference on Functional Programming, 1996. [10] G. Hutton. Fold and unfold for program semantics. In Proc. 3rd ACM SIGPLAN International Conference on Functional Programming, Baltimore, Maryland, September 1998. [11] G. Hutton. A tutorial on the universality and expressiveness of fold. Journal of Functional Programming, 9(4):355–372, July 1999. [12] B. Jacobs, L. Moss, H. Reichel, and J. Rutten, editors. Proc.oftheFirst Workshop on Coalgebraic Methods in Computer Science. Elsevier Science B.V., 1998. Electronic Notes in Theoretical Computer Science Volume 11. 158 Gibbons, Hutton and Altenkirch

[13] B. Jacobs and J. Rutten. A tutorial on (co)algebras and (co)induction. Bulletin of the European Association for Theoretical Computer Science, 62:222–259, 1997.

[14] B. Jacobs and J. Rutten, editors. Proc. of the Second Workshop on Coalgebraic Methods in Computer Science. Elsevier Science B.V., 1999. Electronic Notes in Theoretical Computer Science Volume 19.

[15] J. Launchbury and T. Sheard. Warm fusion:Deriving build-catas from recursive deﬁnitions. In Proc. Conference on Functional Programming Languages and Computer Architecture, ACM Press, 1995.

[16] S. Mac Lane. Categories for the Working Mathematician.Number5in Graduate Texts in Mathematics. Springer-Verlag, 1971.

[17] G. Malcolm. Algebraic data types and program transformation. Science of Computer Programming, 14(2-3):255–280, September 1990.

[18] L. Meertens. Paramorphisms. Formal Aspects of Computing, 4(5):413–424, 1992.

[19] E. Meijer, M. Fokkinga, and R. Paterson. Functional programming with bananas, lenses, envelopes and barbed wire. In J. Hughes, editor, Proc. Conference on Functional Programming and Computer Architecture,number 523 in LNCS. Springer-Verlag, 1991.

[20] A. Takano and E. Meijer. Shortcut deforestation in calculational form. In Proc. Conference on Functional Programming Languages and Computer Architecture, ACM Press, 1995.

[21] V. Vene and T. Uustalu. Functional programming with apomorphisms (corecursion). Proceedings of the Estonian Academy of Sciences: Physics, Mathematics, 47(3):147–161, 1998.

[22] V. Vene. Categorical Programming with Inductive and Coinductive Types.PhD thesis, Universitity of Tartu, 2000.

[23] P. Wadler. Deforestation:Transforming programs to eliminate trees. Theoretical Computer Science, 73:231–248, 1990.

159 CMCS’01 Preliminary Version

A Calculus of Terms for Coalgebras of Polynomial Functors

Robert Goldblatt 1

Centre for Logic, Language and Computation School of Mathematical and Computing Sciences Victoria University of Wellington, New Zealand [email protected] http://www.mcs.vuw.ac.nz/~rob

Abstract A syntax and semantics of types, terms and formulas for coalgebras of polynomial functors is developed, extending earlier work [4] on monomial coalgebras to include functors constructed using coproducts. A modified ultrapower construction for polynomial coalgebras is introduced, adapting the conventional ultrapower to retain only those states that evaluate observable terms in a standard way. A special role is played by terms that take observable values and are “rigid”:their free variables do not occur in any state-valued subterm. The following “co-Birkhoff” theorem is proved:a class of polynomial coalgebras is definable by Boolean combinations of equations between rigid terms iff the class is closed under disjoint unions, images of bisimulations, and observable ultrapowers.

1 Introduction

A coalgebra of a functor T : Set → Set is a pair (A, α)withα a function of the form A → TA. This notion has proven useful in modelling transition systems, such as automata, as well as classes in object-oriented programming languages [14,7,16,17]. α is viewed as a transition structure on a state set A. Relational models of propositional modal logic can be viewed as coalgebras [16] and this has lead to a number of proposals of languages with modalities for describing coalgebras [12,11,15,10]. An alternative method used here is to develop a syntax of equations between terms for coalgebraic operations that is similar to the standard equational logic of algebras, but subject to the principle that a coalgebraic term should have a single state-valued variable or parameter.

1 Paul Taylor’s diagrams package was used in preparing this document. This is a preliminary version. The ﬁnal version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Goldblatt

In a previous article [4] the author developed such a calculus of terms and equations for coalgebras of certain monomial functors. These are constructed from constant functors and the identity functor by forming products and exponential functors with constant exponent (which we will call power functors). It was shown that Boolean combinations of equations between terms of “observable” type form a suitable language of formulas for specifying properties of coalgebras and characterising bisimulation relations between them. A structural description was given of classes of coalgebras definable by such formulas, using the notion of the ultrafilter enlargement of a monomial coalgebra. Now many of the more significant examples in the above references involve also coproducts in their construction, and so are coalgebras for polynomial functors. The aim of this article is to explain how the theory of [4] can be extended to the case of polynomial functors. The presence of coproducts introduces considerable complexity, associated with the partiality of certain “path functions” that express the dynamics of the transition structure α. The approach taken here is to use type theory [8] to describe the construction of sets-as-types from some base types by forming products, powers and coproducts, and to provide rules of syntax for terms that take values in these types. Among the base types is the type St of states: this symbol St denotes the state set of a given coalgebra. The symbol s is reserved as the special state-valued parameter that appears in terms, and may be thought of as denoting the “current” state. The symbol tr denotes the transition structure, so that we are able to form the term tr(s), or more generally tr(M) for any state-valued term M. But the situation is far more subtle than previously, because we now allow state variables distinct from s in coalgebraic specifications, provided that they are not free. In the syntax of [4] all variables of a term are free, but here we have variable-binding operations on terms (lambda-abstraction, case-formation). A given term M may contain free state variables. More generally it may have a number of free variables of various types that occur in state-valued subterms, and hence provide a number of ways of referring to states by varying the values of those variables. M is rigid if this does not hold, i.e. if any variable occurring in a state-valued subterm is bound in M itself (an example will be given shortly). Rigidity is imposed on M by requiring that the type of any free variable of M does not involve St. Our main result (Theorem 7.1) is about the specification of coalgebras by combinations of equations between rigid terms. Following established practice in categorical logic, the “case” operation is used to introduce terms associated with coproducts. The coproduct A1 +A2 of sets A1,A2 is their disjoint union, and comes equipped with injective insertion functions ιj : Aj → A1 + A2 for j =1, 2. Each element of A1 + A2 is equal to ιj(a) for a unique j and a unique a ∈ Aj. Our syntax generates terms of the form

case N of [ι1v1 → M1 | ι2v2 → M2], where N is a term taking values in A1 + A2, M1 and M2 take values in some 161 Goldblatt

other set B,andthevj’s are variables that take values in Aj and are bound in the overall case term. The latter is evaluated by ﬁrst obtaining the value d of N in A1 + A2 and then, if d is equal to ιj(a), evaluating Mj with vj assigned value a. Another notation for this term [8, Section 2.3] is

unpack N as [ι1v1 in M1 ,ι2v2 in M2].

Example. To illustrate the use of rigid terms and case-formation in coalgebraic speciﬁcation, here is an example adapted from [9, Section 4]. Let A be a set of (possibly inﬁnite) binary trees. Each tree x either is a single node with no children, or has exactly two children obtained by deleting the top node of x. This gives an operation

children : A −→ 1+(A × A), where 1 = {∗}; children(x)=ι1∗ when x has no children, and children(x)= ι2(x1,x2)whenx1 and x2 are the left and right children of x. There is a size (number of nodes) operation

size : A −→ 1+N, where N is the set of positive integers and size(x)=ι1∗ when x is inﬁnite. The two operations can be “tupled” into a single function

α A ✲ (1 + (A × A)) × (1 + N) which is a coalgebra for the functor T (X)=(1+(X × X)) × (1 + N). The operations can be recovered from α as children = π1 ◦ α and size = π2 ◦ α, where π1 and π2 are the left and right projections. Now the size of a tree is 1 if it has no children, is infinite if at least one child is infinite, and otherwise is the sum of the sizes of the children plus 1. Thus our example validates the equation of Figure 1, in which the right-hand term M is obtained by iteration of case-formation. Validity means that the equation is satisfied no matter what member of A is denoted by the state parameter s.Thevariablev takes values in A × A,soπ1v and π2v take values in A. Although v is free in these subterms, and indeed in the subterms beginning case size(πjv)..., v is bound in M itself. M is rigid. A significant departure from [4] is to replace the notion of ultrafilter enlargement by a modified ultrapower. There is an obstacle to using the conventional ultrapower construction in that it produces states that assign “nonstandard” values to terms of observable type. Our modification is to retain only those states that are observable in the sense that they assign only standard observable values (see Section 6). One advantage of ultrapowers over ultrafilter enlargements is that lifting the operations of a coalgebra to an ultrapower is a more familiar exercise, and is less cumbersome in that it works 162 Goldblatt

size(s)= casechildren(s) of

ι1u → ι21

ι2v → casesize(π1v) of

ι1u → ι1∗

ι2n → case size(π2v) of

ι1u → ι1∗

ι2k → ι2(n + k +1) endcase endcase endcase

Fig. 1. case terms with elements rather than collections of sets. Also the proof that ultrapowers preserve satisfaction of observable formulas is more accessible, and follows the pattern ofLo´ c s’s Theorem for regular ultrapowers. On the other hand there is considerable intricacy in deﬁning the transition structure of an observable ultrapower. This is carried out with the help of the notion from [10] of a path from a functor to one of its component functors.

This article is in the nature of a research announcement, giving a survey of all the relevant concepts and explaining the results, but leaving out the more technical proofs, which would take up much more space than is available here (these proofs will appear elsewhere). To summarize, the main features of the work are: • The formulation of syntax and semantics of types and terms for coalgebras of any polynomial functor (Sections 3 and 4). • The deﬁnition of observable formulas as Boolean combinations of equations between terms of observable type, and their use in logically characterising bisimilarity of states: two states are bisimilar when they assign the same values to all ground observable terms, or equivalently when they satisfy the same rigid observable formulas (Theorem 5.8). • The construction of observable ultrapowers of polynomial coalgebras and derivation of a version ofLo´ c s’s Theorem (Section 6). • A proof that a class of polynomial coalgebras is deﬁnable by a set of observable formulas if, and only if, it is closed under disjoint unions, images of bisimulations, and observable ultrapowers (Theorem 7.1). This last result may be viewed as an analogue for polynomial coalgebras of 163 Goldblatt

Birkhoﬀ’s famous characterisation of varieties of classical algebras. For discussion of the nature of such “co-Birkhoﬀ” theorems and references to other proposals for them, see the Introduction to [4].

2 Polynomial Functors

Standard notation for products, powers and coproducts of sets will be used. The coproduct A1 + A2 and associated insertions ιj : Aj → A1 + A2 have already been described. πj : A1 × A2 → Aj is the projection function from the D product set A1 × A2 onto Aj.TheD-th power of set A is the set A of all functions from set D to A.Foreachd ∈ D there is the evaluation function D ev d : A → A having ev d(f)=f(d). The identity function on a set A is denoted idA : A → A. The symbol ◦ ✲ will be used for partial functions. Thus f : A ◦ ✲ B means that f is a function with codomain B and domain Dom f ⊆ A.Wemay write f(x)↓ to mean that f(x) is defined, i.e. x ∈ Dom f. Associated with each insertion ιj : Aj → A1 + A2 is its partial inverse, the extraction function ✲ εj : A1 + A2 ◦ Aj having εj(y)=x iff ιj(x)=y.Thusy ∈ Dom εj iff y ∈ ιjAj, i.e. y = ιj(x) for some x ∈ Aj. Extraction functions play a vital role in the analysis of coalgebras built out of coproducts, as will be seen below. Consider the following constructions of endofunctors T : Set → Set. • For a fixed set D = ∅,theconstant functor D¯ has D¯(A)=D on sets A and D¯(f)=idD on functions f. • The identity functor Id has IdA = A and Idf = f. • The product T1 × T2 of two functors has T1 × T2(A)=T1A × T2A, and, for a function f : A → B,hasT1 × T2(f) being the function

T1(f) × T2(f):T1A × T2A → T1B × T2B

that acts by (a1,a2) → (T1(f)(a1),T2(f)(a2)). • The coproduct T1 + T2 of two functors has T1 + T2(A)=T1A + T2A,and for f : A → B,hasT1 + T2(f) being the function

T1(f)+T2(f):T1A + T2A → T1B + T2B

that acts by ιj(a) → ιj(Tj(f)(a)). • The D-th power functor T D of a functor T has T DA =(TA)D,andT D(f): (TA)D → (TB)D being the function g → T (f) ◦ g. A functor T is polynomial if it is constructed from constant functors and Id by ﬁnitely many applications of products, coproducts and powers. Note that any polynomial functor constructed without the use of Id is constant. A T -coalgebra is a pair (A, α) comprising a set A and a function A −→α TA. A is the set of states and α is the transition structure of the coalgebra. 164 Goldblatt

Note that A is determined as the domain Dom α of α, so we can identify the coalgebra with its transition structure, i.e. a T -coalgebra is any function of the form α : Dom α → T (Dom α). A morphism from T -coalgebra α to T - coalgebra β is a function f : Dom α → Dom β between their state sets which commutes with their transition structures in the sense that β ◦ f = Tf ◦ α, i.e. the following diagram commutes:

f Dom α ✲ Dom β

α β ❄ ❄ Tf T (Dom α) ✲ T (Dom β)

If Dom α ⊆ Dom β,thenα is a subcoalgebra of β iﬀ the inclusion function → Dom α> Dom β is a morphism from α to β. { ∈ } Every set αi : i I of T -coalgebras has a disjoint union I αi,whichis a T -coalgebra whose domain is the disjoint union of the Dom αi’s and whose transition structure acts as αj on the summand ιjDom αj of Dom I αi.More → precisely, this transition is given by ιj(a) T (ιj)(αj(a)), with the insertion → ιj : Dom αj Dom I αi being an injective morphism making αj isomorphic to a subcoalgebra of the disjoint union (see [17, Section 4]).

3 Syntax of Types, Terms and Formulas

Types Fix a set O of symbols called observable types, and a collection {[[ o ]] : o ∈ O} of sets indexed by O.Membersof[[o ]] a r e observable elements, or constants, of type o. Example: O = {num, bool, 1, 0},with[[num]] = {0, 1,...}, [[ bool]] = {true, false},[[1]] = {0},[[0]] = ∅. The set of types over O,orO-types, is the smallest set T such that O ⊆ T, St ∈ T and

(1) if σ1,σ2 ∈ T then σ1 × σ2,σ1 + σ2 ∈ T; (2) if σ ∈ T and o ∈ O,theno ⇒ σ ∈ T. A subtype of an O-type τ is any type that occurs in the formation of τ. St is a type symbol that will denote the state set of a given coalgebra. A type is rigid if it does not have St as a subtype. The set of rigid types is thus the smallest set that includes O and satisﬁes (1) and (2). The symbol “o” will always be reserved for members of O. o ⇒ σ is a power type: such types will always have an observable exponent. 165 Goldblatt

Given any set A, we associate a set [[ σ ]] A with each O-type by putting [[ o ]] A =[[o ]] , [[ St ]] A = A, and inductively

[[ σ1 × σ2 ]] A =[[σ1 ]] A × [[ σ2 ]] A

[[ σ1 + σ2 ]] A =[[σ1 ]] A +[[σ2 ]] A [[ o ]] [[ o ⇒ σ ]] A =[[σ ]] A .

If σ is a rigid type, then [[ σ ]] A is a ﬁxed set whose deﬁnition does not depend on A,soitmaybewritten[[σ ]] .

Terms

To define terms we fix a denumerable set Var of variables and define a context to be a finite (possible empty) list

Γ=(v1 : σ1,...,vn : σn) of assignments of O-types σi to variables vi, with the proviso that v1,...,vn are all distinct. Γ is a rigid context if all of the σi’s are rigid types. Concatenation of lists Γ and Γ with disjoint sets of variables is written Γ, Γ.Aterm-in- context is an expression of the form

Γ ✄ M : σ, which signiﬁes that M is a “raw” term of type σ in context Γ. This may be abbreviated to Γ ✄ M if the type of the term is understood. Figure 2 gives axioms that legislate terms into existence, and rules for generating new terms from given ones. The rules for products, coproducts and powers are the standard ones for introduction and elimination of terms of those types. Axiom (Con) states that an observable element is a constant term of its type, while the raw term s in axiom (St) is a special parameter which will be interpreted as the “current” state in a coalgebra. Bindings of variables in raw terms occur in lambda-abstractions and case terms: the v in the consequent of rule (Abs) and the vj’s in the consequent of (Case) are bound in those terms. It is readily shown that in any term Γ ✄ ϕ, all free variables of M appear in the list Γ. A ground term is one of the form ∅ ✄ M : σ, which may be abbreviated to the raw term M. Thus a ground term has no free variables. Note that a ground term may contain the state parameter s, which behaves as a variable taking values in Dom α. Atermisdeﬁnedtoberigid if its context is rigid. This entails that every free variable of M is assigned a rigid type in Γ, and prevents any free variable of M from occurring in a subterm of type St. Of course all ground terms are rigid. 166 Goldblatt

Axioms v ∈ Var c ∈ [[ o]] (Var) (Con) (St) v : σ ✄ v : σ ∅ ✄ c : o ∅ ✄ s : St

Weakening Γ, Γ ✄ M : σ (Weak) where v does not occur in Γ or Γ. Γ,v : σ, Γ ✄ M : σ

Product Types Γ ✄ M : σ Γ ✄ M : σ (Pair) 1 1 2 2 Γ ✄ M1,M2 : σ1 × σ2

Γ ✄ M : σ1 × σ2 Γ ✄ M : σ1 × σ2 (Proj1) (Proj2) Γ ✄ π1M : σ1 Γ ✄ π2M : σ2

Coproduct Types

Γ ✄ M : σ1 Γ ✄ M : σ2 (In1) (In2) Γ ✄ ι1M : σ1 + σ2 Γ ✄ ι2M : σ1 + σ2

Γ ✄ N : σ + σ Γ,v : σ ✄ M : σ Γ,v : σ ✄ M : σ (Case) 1 2 1 1 1 2 2 2 Γ ✄ case N of [ι1v1 → M1 | ι2v2 → M2]:σ

Power Types Γ,v : o ✄ M : σ Γ ✄ M : o ⇒ σ Γ ✄ N : o (Abs) (App) Γ ✄ (λv.M):o ⇒ σ Γ ✄ M · N : σ

Fig. 2. Axioms and Rules for Generating Terms

τ-Terms

For a given O-type τ,aτ-term is any term that can be generated by the axioms and rules of Figure 2 together with the additional rule

Γ ✄ M : St (Tr) . Γ ✄ tr(M):τ 167 Goldblatt

Equations Γ ✄ M : σ Γ ✄ M : σ (Eq) 1 2 Γ ✄ M1 ≈ M2

Weakening Γ, Γ ✄ ϕ (Weak) where v does not occur in Γ or Γ. Γ,v : σ, Γ ✄ ϕ

Connectives Γ ✄ ϕ Γ ✄ ϕ Γ ✄ ϕ (Neg) (Con) 1 2 Γ ✄ ¬ϕ Γ ✄ ϕ1 ∧ ϕ2

Fig. 3. Formation Rules for Formulas

Note that from this rule and the axiom (St) we can derive the τ-term

∅ ✄ tr(s):τ.

The symbol tr will denote the transition structure of coalgebras of the form α✲ A [[ τ ]] A.IfM is interpreted as the state x of α,thentr(M) is interpreted as α(x).

τ-Formulas

An equation-in-context has the form Γ ✄ M1 ≈ M2 where Γ ✄ M1 and Γ ✄ M2 are terms of the same type.Aformula-in-context has the form Γ ✄ ϕ,with the expression ϕ being constructed from equations M1 ≈ M2 by propositional connectives. Formation rules for formulas are given in Figure 3, using the connectives ¬ and ∧. The other standard connectives ∨, →,and↔ can be introduced as deﬁnitional abbreviations in the usual way. A formula ∅ ✄ ϕ with empty context is ground, and may be abbreviated to the expression ϕ. A rigid formula is one whose context is rigid. A τ-formula is one that is generated by using only τ-terms as premisses in the rule (Eq). An observable formula is one that uses only terms of observable type in forming its component equations.

4Semantics of Terms and Formulas

Each O-type σ determines a polynomial functor |σ| : Set → Set.Foro ∈ O, |o| is the constant functor D¯ where D =[[o ]] ; |St| is the identity functor Id; 168 Goldblatt and inductively

[[ o ]] |σ1 × σ2| = |σ1|×|σ2|, |σ1 + σ2| = |σ1| + |σ2|, |o ⇒ σ| = |σ| .

Then in general, |σ|A =[[σ ]] A as deﬁned earlier in Section 3. If σ is a rigid type, then |σ| is the constant functor |σ|A =[[σ ]] . A τ-coalgebra is a coalgebra for the functor |τ|.Agivenτ-coalgebra α : A →|τ|A interprets types σ and contexts Γ = (v1 : σ1,...,vn : σn) by putting [[ σ ]] α = |σ|(Dom α)=[[σ ]] A,and

[[ Γ ]] α =[[σ1 ]] α ×···×[[ σn ]] α

(so [[ ∅ ]] α is the empty product 1). Hence α itself is a function of the form A → [[ τ ]] α.Thedenotation of each τ-term Γ✄M : σ, relative to the coalgebra α, is a function

[[ Γ ✄ M : σ ]] α : A × [[ Γ ]] α −→ [[ σ ]] α, deﬁned by induction on the formation of terms. For empty contexts, ∼ A × [[ ∅ ]] α = A × 1 = A, so we replace A × [[ ∅ ]] α by A itself and interpret a ground term ∅ ✄ M : σ as a function A → [[ σ ]] α. Var: [[ v : σ ✄ v : σ ]] α : A × [[ σ ]] α → [[ σ ]] α is the right projection function. Con: [[ ∅ ✄ c : o ]] α : A → [[ o ]] is the constant function with value c. St: [[ ∅ ✄ s : St ]] α : A → [[ St ]] α is the identity function A → A. Tr: [[ Γ ✄ tr(M):τ ]] α : A × [[ Γ ]] α → [[ τ ]] α is the composition of the functions ✄ [[ Γ M : St ]]✲α α✲ A × [[ Γ ]] α A [[ τ ]] α.

Weak: [[ Γ ,v : σ , Γ ✄ M : σ ]] α is the composition of [[Γ, Γ ✄ M : σ ]] α with the projection

 A × [[ Γ ]] α × [[ σ ]] α × [[ Γ ]] α −→ A × [[ Γ ]] α × [[ Γ ]] α.

Pair: [[ Γ ✄ M1,M2 : σ1 × σ2 ]] α is the product map ✄ ✄ [[ Γ M1 : σ1 ]] α, [[ Γ M2 : σ2 ]] α✲ A × [[ Γ ]] α [[ σ1 ]] α × [[ σ2 ]] α. 169 Goldblatt

Projj: [[ Γ ✄ πjM : σj ]] α is the composition of ✄ × [[ Γ M : σ1 σ2 ]]✲α πj✲ A × [[ Γ ]] α [[ σ1 ]] α × [[ σ2 ]] α [[ σj ]] α.

Injj: [[ Γ ✄ ιjM : σ1 + σ2 ]] α is the composition of ✄ [[ Γ M : σj ]]✲α ιj✲ A × [[ Γ ]] α [[ σj ]] α [[ σ1 ]] α +[[σ2 ]] α.

Case: This is most readily described at the level of function values. For x ∈ A and γ ∈ [[ Γ ]] α,let

[[ Γ ✄ N : σ1 + σ2 ]] α(x, γ)=ιj(a) ∈ [[ σ1 ]] α +[[σ2 ]] α

(which holds for a unique j and a ∈ [[ σj ]] α). Then the element

[[ Γ ✄ case N of [ι1v1 → M1 | ι2v2 → M2]:σ ]] α(x, γ)

of [[ σ ]] α is deﬁned to be

[[ Γ ,vj : σj ✄ Mj : σ ]] α(x, γ, a).

Abs: [[ Γ ✄ (λv.M):o ⇒ σ ]] α(x, γ) is the function [[ o ]] → [[ σ ]] α given by

a → [[ Γ ,v : o ✄ M : σ ]] α(x, γ, a).

App: [[ Γ ✄ M · N : σ ]] α(x, γ) is the element of [[ σ ]] α obtained by evaluating the function [[ Γ ✄ M : o ⇒ σ ]] α(x, γ):[[o ]] −→ [[ σ ]] α

at [[Γ ✄ N : o ]] α(x, γ) ∈ [[ o ]] .

This completes the inductive deﬁnition of [[Γ ✄ M : σ ]] α.

Semantics of Formulas

A τ-equation Γ ✄ M1 ≈ M2 is said to be valid in coalgebra α if the α- denotations [[Γ ✄ M1 ]] α and [[Γ ✄ M2 ]] α of the terms Γ ✄ Mj are identical. More generally we introduce a satisfaction relation

α, x, γ |=Γ✄ ϕ, for τ-formulas in τ-coalgebras, which expresses that Γ ✄ ϕ is satisﬁed,ortrue, in α at state x under the value-assigment γ to the variables of context Γ. This 170 Goldblatt is deﬁned inductively by

α, x, γ |=Γ✄ M1 ≈ M2 iﬀ [[Γ ✄ M1 ]] α(x, γ)=[[Γ✄ M2 ]] α(x, γ), α, x, γ |=Γ✄ ¬ϕ iﬀ not α, x, γ |=Γ✄ ϕ,

α, x, γ |=Γ✄ ϕ1 ∧ ϕ2 iﬀ α, x, γ |=Γ✄ ϕ1 and α, x, γ |=Γ✄ ϕ2.

Γ ✄ ϕ is true at x, written α, x |=Γ✄ ϕ,ifα, x, γ |=Γ✄ ϕ for all γ ∈ Γ. α is a model of Γ ✄ ϕ, written α |=Γ✄ ϕ,ifα, x, |=Γ✄ ϕ for all states x ∈ Dom α. In that case we also say that Γ ✄ ϕ is valid in the coalgebra α.

Substitution In working with this system it becomes essential to have available the operation N[M/v] of substituting the raw term M for the variable v in N. The following rule is derivable: Γ ✄ M : σ Γ,v : σ ✄ N : σ Γ ✄ N[M/v]:σ The semantics of terms obeys the basic principle that substitution is interpreted as composition of denotations [13, 2.2]. Because of the special role of the state set A, this takes the form

[[ Γ ✄ N[M/v]]]α =[[Γ,v : σ ✄ N ]] α ◦π1,π2, [[ Γ ✄ M ]] α, so that the diagram π1,π2, [[ M ]] α✲ A × [[ Γ ]] α A × [[ Γ ]] α × [[ σ ]] α ◗ ◗ ◗ ◗ ◗ ◗ [[ N ]] α [[ N[M/v]]] ◗ α ◗ s◗ ❄ [[ σ ]] α commutes. It is also possible to make substitutions N[M/s] for the state parameter s according to the rule Γ ✄ M : St Γ ✄ N : σ Γ ✄ N[M/s]:σ with the semantics [[Γ ✄ N[M/s]]]α =[[Γ✄ N ]] α ◦[[ Γ ✄ M ]] α,π2 : ✄ [[ Γ M ]] α,π2✲ A × [[ Γ ]] α A × [[ Γ ]] α ◗ ◗ ◗ ◗ ◗ ◗ [[ N ]] α [[ N[M/s]]] ◗ α ◗ s◗ ❄ [[ σ ]] α 171 Goldblatt

5 Paths and Bisimulations

If (A, α)and(B,β) are coalgebras for a functor T , then a relation R ⊆ A×B is a T -bisimulation from α to β if there exists a transition structure ρ : R → TR on R such that the projections from R to A and B are coalgebraic morphisms from ρ to α and β, i.e. the following diagram commutes:

π π A ✛ 1 R 2 ✲ B

α ρ β

❄ ❄ ❄ TA ✛ TR ✲ TB Tπ1 Tπ2

A function f : A → B is a morphism from α to β iff its graph {(a, f(a)) : a ∈ A} is a bisimulation from α to β [17, Theorem 2.5]: a morphism is essential a functional bisimulation. When Dom α ⊆ Dom β, α is a subcoalgebra of β iff the identity relation on Dom α is a bisimulation from α to β. The above categorial definition of bisimulation appeared in [1]. It has a characterisation in terms of “liftings” of relations [5,6]. For R ⊆ A × B,define arelationRT ⊆ TA× TB by induction on the formation of the polynomial functor T :

D¯ R =idD RId = R RT1×T2 = {(x, y):π xRT1 π y and π xRT2 π y} 1 1 2 2 T1+T2 T1 T2 R = {(ι1x, ι1y):xR y} {(ι2x, ι2y):xR y} D RT = {(f,g):∀d ∈ Df(d)RT g(d)}.

These liftings preserve many basic properties of relations. Thus if R is total (Dom R = A) or surjective (onto B) or injective or functional, then RT will also have the corresponding property.

Theorem 5.1 (Folklore) If R ⊆ Dom α × Dom β,whereα and β are T -coalgebras, then R is a bisimulation from α to β if, and only if, xRy implies α(x)RT β(y) for all states x in α and y in β. ✷

The inverse of a bisimulation is a bisimulation, and the union of any collection of bisimulations from α to β is a bisimulation [17, Section 5]. Hence there is a largest bisimulation from α to β,whichisasymmetric relation called bisimilarity.Wedenotethisby∼. States x and y are bisimilar, x ∼ y,when 172 Goldblatt xRy for some bisimulation R between α and β. This is intended to capture the notion that x and y are observationally indistinguishable. Theorem 5.1 can be used to show that bisimulations preserve the values of terms, and in particular that related states assign the same values to observable terms. To explain this we need some more notation. Let (A, α)and (B,β)beτ-coalgebras, and R ⊆ A × B. Then for each O-type σ we have |σ| the lifted relation R ⊆ [[ σ ]] A × [[ σ ]] B,where|σ| is the functor defined by σ. For observable σ, R|σ| is just the identity relation on [[ σ ]]. The same is true whenever σ is a rigid type. For any context Γ = (v1 : σ1,...,vn : σn)wedefine Γ |σi| arelationR ⊆ [[ Γ ]] A × [[ Γ ]] B as the direct product of the R ’s, i.e. Γ |σi| ≤ (γ1,...,γn) R (γ1,...,γn)iffγi R γi for all i n. Γ For rigid Γ, R is just the identity relation on [[Γ ]] = [[ σ1 ]] ×···×[[ σn ]] . Theorem 5.2 (Value-Preservation) Let R be a |τ|-bisimulation from α to β. Γ (1) For any τ-term Γ ✄ M : σ,ifγ ∈ [[ Γ ]] α and γ ∈ [[ Γ ]] β have γR γ ,then |σ| xRy implies [[ Γ ✄ M ]] α(x, γ) R [[ Γ ✄ M ]] β(y, γ ). (2) If Γ ✄ M is a term of observable type, and γRΓγ,then xRy implies [[ Γ ✄ M ]] α(x, γ)=[[Γ✄ M ]] β(y, γ ). (3) If Γ ✄ M is a rigid term of observable type, and γ ∈ [[ Γ ]] ,then

xRy implies [[ Γ ✄ M ]] α(x, γ)=[[Γ✄ M ]] β(y, γ). ✷

¿From part (2) of this result it follows, by induction on the formation of formulas, that if Γ ✄ ϕ is an observable formula, then

α, x, γ |=Γ✄ ϕ iﬀ β,y,γ |=Γ✄ ϕ whenever xRy and γRΓγ.ThusifΓ✄ϕ is valid in α and R is surjective, so that RΓ is also surjective, Γ ✄ ϕ will be valid in β. On the other hand if β |=Γ✄ ϕ and R is total, so that RΓ is also total, then α |=Γ✄ϕ. In other words, validity is preserved in passing from α to β if β is the image of a bisimulation from α, and is preserved in passing from β to α if α is the domain of a bisimulation to β.IfΓ✄ ϕ is also rigid, then its validity is preserved by disjoint unions: ∈ | ✄ given any element ιj(a)of I αi and any γ [[Γ ]], if αj =Γ ϕ we get | ✄ | ✄ Γ I αi,ιj(a),γ =Γ ϕ, because αj,a,γ =Γ ϕ, γR γ, and the insertion morphism ιj is a bisimulation. To sum up: Theorem 5.3 The class {α : α |=Γ✄ ϕ} of all models of an observable formula is closed under domains and images of bisimulations, including domains and images of morphisms as well as subcoalgebras. If Γ ✄ ϕ is rigid and observable, then its class of models is also closed under disjoint unions.✷ The main purpose of this Section is to strengthen Theorem 5.2 to a logical characterisation of bisimilarity: states are bisimilar when they assign the same 173 Goldblatt values to all ground terms of observable type, or equivalently when they satisfy the same rigid observable formulas (see Theorem 5.8). The key to this is the relation ≡αβ deﬁned by

x ≡αβ y iﬀ [[ M ]] α(x)=[[M ]] β(y) for all ground terms M : o of observable type.

≡αβ is a bisimulation from α to β, and turns out to be the largest one. The proof of this requires the development of another characterisation of bisimulation, using the notion of “paths” between functors [10, Section 6].

A path is a finite list of symbols of the kinds πj, εj, ev d. Write p.q for p concatenation of lists p and q. The notation T −S means that p is a path from functor T to functor S, and is defined by the following conditions • T −T ,where is the empty path. πj .p p • T1 × T2−S whenever Tj−S, for j =1, 2. εj .p p • T1 + T2−S whenever Tj−S, for j =1, 2. ev d.p p • T D −S for all d ∈ D whenever T −S. It is evident that for any path T −S, S is one of the functors involved in the formation of T . p ✲ ApathT −S induces a partial function pA : TA◦ SA for each set A, defined by induction on the length of p as follows. • ✲ A : TA◦ TA is the identity function idTA,soistotallydefined. π p • j✲ A✲ (πj.p)A = pA ◦ πj, the composition of T1A × T2A TA◦ SA. Thus x ∈ Dom (πj.p)A iff πj(x) ∈ Dom pA. ε p • j✲ A✲ (εj.p)A = pA ◦ εj, the composition of T1A + T2A ◦ TA◦ SA. Thus x ∈ Dom (εj.p)A iff x ∈ ιjTAj and εj(x) ∈ Dom pA. ev p • D ✲d A✲ (ev d.p)A = pA ◦ ev d, the composition of (TA) TA◦ SA. Thus f ∈ Dom (ev d.p)A iff f(d) ∈ Dom pA. ApathT −S is a state path if S =Id,andanobservation path if S = D¯ for some set D.AT -bisimulation can be characterised as a relation that is “preserved” by the partial functions induced by state and observation paths from T . To explain this we adopt the convention that whenever we write “f(x) Qg(y)” for some relation Q and some partial functions f and g we mean that f(x)isdefinediffg(y) is defined, and (f(x),g(y)) ∈ Q when they are both defined.

Theorem 5.4 Let R ⊆ A×B, x ∈ TA,andy ∈ TB,whereT is a polynomial 174 Goldblatt functor. Then the following are equivalent. (1) xRT y. p S (2) For all paths T −S, pA(x)R pB(y). p • (3) For all state paths T −Id, pA(x) RpB(y);and p • for all observation paths T −D¯, pA(x)=pB(y). ✷ Combining this result with Theorem 5.1 gives the desired “dynamic” characterisation of bisimulations: α β Theorem 5.5 If A ✲ TA and B ✲ TB are coalgebras for a polynomial functor T , then a relation R ⊆ A × B is a T -bisimulation if, and only if, xRy implies p • for all state paths T −Id, pA(α(x)) RpB(β(y));and p • for all observation paths T −D¯, pA(α(x)) = pB(β(y)). Corollary 5.6 If C ⊆ Dom α,thenC is a subcoalgebra of α iff x ∈ C implies p pA(α(x)) ∈ C for all state paths T −Id such that pA(α(x))↓. Proof. To say that C is a subcoalgebra of α means that there is some T - transition structure on C that is a subcoalgebra of α. Such a structure is unique, and exists iff the identity relation ∆C = {(x, x):x ∈ C} on C is a bisimulation relation on α [17, Proposition 6.2]. Now apply the Theorem with R =∆C and α = β, and use the fact that pA(α(x)) ∆C pA(α(x)) iff pA(α(x)) ∈ C. ✷ This characterisation makes it easy to see that if R is a bisimulation from α to β, then Dom R is a subcoalgebra of α.Forifx ∈ Dom R and pA(α(x))↓,then xRy for some y,sopA(α(x)) RpB(β(y)) by 5.5 and hence pA(α(x)) ∈ Dom R. Similarly, the image of R is seen to be a subcoalgebra of β. Path functions are thus an effective tool in the structural analysis of polynomial coalgebras. Their use in logical characterisations derives from the fact that the action of a path function is definable by a (ground) term. Lemma 5.7 (Path Lemma) p For any path |τ|−|σ| there exists a term of the form

v : τ ✄ p¯ : σ such that for any τ-coalgebra (A, α) and any x ∈ A,ifα(x) ∈ Dom pA then

pA(α(x)) = [[p ¯[tr(s)/v]]]α(x). ✷

Note that since ∅ ✄ tr(s):τ is a τ-theorem, so too is ∅ ✄ p¯[tr(s)/v]:σ by the rule of Substitution. Hencep ¯[tr(s)/v] is a ground term of type σ. 175 Goldblatt

In proving the Path Lemma (by induction on the length of p)itmustbe shown that the action of an extraction function εj is term-deﬁnable. In fact it can be shown that for any term Γ ✄ M : σ1 + σ2 of coproduct type there exist terms Γ ✄ εjM : σj for j =1, 2 such that , - [[ Γ ✄ εjM ]] α(x, γ)=εj [[ Γ ✄ M ]] α(x, γ) ∈ [[ σj ]] A whenever [[Γ ✄ M ]] α(x, γ) ∈ ιj[[ σj ]] A. Indeed, taking v1,v2 as new variables not in M, put

ε1M := case M of [ι1v1 → v1 | ι2v2 → N1], where N1 is any ground term of type σ1 (the existence of ground terms of every type follows by induction on term and type formation from axioms (Con) and (St) of Figure 2). ε2 is deﬁned similarly.

The term function [[p ¯[tr(s)/v]]]α has domain A, and so may not be identical to pA ◦ α if pA is partial. This is only an issue when the path p includes an extraction symbol εj (for otherwise pA is total), but further use of case allows the construction of terms that “discriminate” between the two summands of a coproduct [[ σ1 ]] A +[[σ2 ]] A and determine whether pA(α(x)) is deﬁned. For this to work we need the (reasonable) assumption that τ has at least one observable subtype o that is non-trivial in the sense that [[ o ]] has at least two distinct members, say c1 and c2. Then we form the term v : σ1 + σ2 ✄ P : o, where P := case v of [ι1v1 → c1 | ι2v2 → c2], and ﬁnd, when α is a σ1 + σ2-coalgebra, that the ground term P [tr(s)/v]:o is a discriminator:

[[ P [tr(s)/v]]]α(x)=cj iﬀ α(x) ∈ ιj[[ σj ]] A = Dom εj.

An inductive argument that repeats this construction for each extraction sym- p bol in a path |τ|−|σ| produces a ﬁnite set Tp of ground observable terms such that

if (A, α) and (B,β) are τ-coalgebras, and x ∈ A and y ∈ B have [[ M ]] α(x)= [[ M ]] β(y) for all M ∈ Tp,thenα(x) ∈ Dom pA iﬀ β(y) ∈ Dom pB. Combining this observation with the Path Lemma 5.7, the path-characterisation of bisimulations of Theorem 5.5, and application of Substitution rules, leads ultimately to a proof that the relation ≡αβ is a bisimulation. This in turn leads to the logical characterisation of bisimilarity of states: Theorem 5.8 Let (A, α) and (B,β) be τ-coalgebras, where τ has at least one non-trivial observable subtype. Then for any x ∈ A and y ∈ B, the following are equivalent: (1) x and y are bisimilar: x ∼ y. 176 Goldblatt

(2) α, x |=Γ✄ ϕ iﬀ β,y|=Γ✄ ϕ for all rigid observable formulas Γ ✄ ϕ. (3) α, x |= M ≈ N implies β,y|= M ≈ N for all ground observable terms M and N.

(4) [[ M ]] α(x)=[[M ]] β(y) for all ground observable terms M, i.e. x ≡αβ y.

6 Observable Ultrapowers

Let U be an ultraﬁlter on a set I. For each set A, the relation

f =U g iﬀ {i ∈ I : f(i)=g(i)}∈U is an equivalence relation on the I-th power AI of A.Eachf ∈ AI has the U I equivalence class f = {g ∈ A : f =U g}. The quotient set

AU = {f U : f ∈ AI } is called the ultrapower of A with respect to U. 2 There is a natural injection U U I eA : A A given by eA(a)=¯a ,where¯a ∈ A is the constant function with value a. The distinction between a anda ¯U is sometimes blurred, allowing A U to be identiﬁed with the subset eA(A)ofA . U A notation that will be useful below is to write f ∈U C, for C ⊆ A,when {i ∈ I : f(i) ∈ C}∈U. ×···× → U U ×···× U → U Amapθ : A1 An B has a U-lifting θ : A1 An B , given by U U U ∈ U θ (f1 ,...,fn )= θ(f1(i),...,fn(i)) : i I . In the case n =1,anyθ : A → B lifts to θU : AU → BU where θU (f U )= (θ ◦ f)U . This works also for a partial θ : A ◦ ✲ B, providing a U-lifting θU : AU ◦ ✲ BU in the same way, with the proviso that θU (f U )isdeﬁned only when f U ∈ Dom θ, i.e. when {i ∈ I : f(i) ∈ Dom θ}∈U.

Now let α : A → [[ τ ]] A be a τ-coalgebra which will remain ﬁxed throughout U U → U Section 6. The transition structure α lifts to a function α : A [[ τ ]] A,and each term denotation [[Γ ✄ M : σ ]] α lifts to a function

✄ U U × U ×···× U −→ U ‡ [[ Γ M : σ ]] α : A [[ σ1 ]] A [[ σn ]] A [[ σ ]] A ( ) where σ1,...,σn is the list of types of Γ. U U U | | U α is not a τ-coalgebra on A since its codomain is [[ τ ]] A =(τ (A)) U rather than [[ τ ]] AU = |τ|(A ). We wish to deﬁne a coalgebraic structure on U ✄ U A that interprets terms in a manner related to the functions [[Γ M : σ ]] α . To achieve this it is necessary to retain only some of the points of AU ,and the key to understanding which ones is provided by considering the U-lifting of the α-denotation of a ground observable term M : o. This is the function

2 For the standard theory of ultraﬁlters and ultrapowers see [2,3]. 177 Goldblatt

U U → U [[ M ]] α : A [[ o ]] . To act as a denotation for M it should assign values in [[ o ]] , v i e w e d a s a s u b s e t o f [[ o ]] U . In other words we should have

U ∈ { U ∈ }⊆ U [[ M ]] α (x) e[[ o ]] = c¯ : c [[ o ]] [[ o ]] .

U U ∈ We are thus led to deﬁne an element x of A to be observable if [[ M ]] α (x) e[[ o ]] for every ground observable τ-term M : o.Ifx = f U , this means that for each such M there exists an observable element cM ∈ [[ o ]] s u c h t h a t

{i ∈ I :[[M ]] α(f(i)) = cM }∈U. (†)

Put A+ = {x ∈ AU : x is observable}. For each a ∈ A and any ground M : o, U U U U ◦ U ∈ [[ M ]] α (eA(a)) = [[ M ]] α (¯a )=([[M ]] α a¯) = [[ M ]] α(a) e[[ o ]] ,

+ + so eA(a) is observable. Thus eA embeds A into A , allowing us to view A as an extension of A. p Theorem 6.1 For any path |τ|−|σ| beginning at |τ| there exist partial func- ◦ + + ◦ ✲ + U ◦ ✲ + tions (pA α) : A [[ σ ]] A and θσ :[[σ ]] A [[ σ ]] A ,

e A✲ A ✲ A+ ⊂ ✲ AU ◦ ◦ ◦

+ U pA ◦ α (pA ◦ α) (pA ◦ α)

❄ | | ❄ ❄ σ eA θσ U ✲ + ✛✛ ◦ [[ σ ]] A [[ σ ]] A [[ σ ]] A such that • + + U Dom (pA ◦ α) = A ∩ Dom (pA ◦ α) ; • + U x ∈ Dom (pA ◦ α) implies (pA ◦ α) (x) ∈ Dom θσ; • + a ∈ Dom pA ◦ α implies eA(a) ∈ Dom (pA ◦ α) ; • ¯U b ∈ [[ σ ]] A implies b ∈ Dom θσ; • θσ is surjective (onto [[ σ ]] A+ ), and the above diagram commutes wherever deﬁned. ✷

The proof of this theorem proceeds by induction on the formation of the end-type σ, and is too long and complex to be described here. But some comments are in order, particularly since the function θσ seems to be pointing in the “wrong” direction. When σ is observable, θσ is just the inverse of the U embedding [[ σ ]] [[ σ ]] ,andwhenσ = St, θσ is the inverse of the inclusion A+ >→ AU . The inductive cases for products and coproducts appeal to the fact that the ultrapower operation commutes with these constructions, in the 178 Goldblatt sense that there exist isomorphisms

∼ ∼ (B × D)U = BU × DU , (B + D)U = BU + DU for any sets B,D. However there is no corresponding commutation for powers: there is only a surjection χ :(BD)U (BU )D, given by the formula χ(x)(d)= U D → ev d (x), which may not be injective (this uses the U-lifting of ev d : B D). χ is used in the deﬁnition of θσ when σ is a power type, and this dictates the direction of θσ. Now applying Theorem 6.1 in the case that σ = τ and p is the empty + + ✲ path, so that pA =idA, gives a function α : A ◦ [[ τ ]] A+ whose domain is A+ ∩ Dom αU = A+,sothatα+ is total, such that the diagram

e A✲ A ✲ A+ ⊂ ✲ AU

α α+ αU

❄ | | ❄ ❄ τ eA θτ U ✲ + ✛✛ ◦ [[ τ ]] A [[ τ ]] A [[ τ ]] A commutes. α+ is thus a τ-coalgebra, which will be called the observable ultrapower of α with respect to U. The use we make of α+ derives ultimately from that fact that for a ground U observable term M : o, the denotation [[ M ]] α+ agrees with [[ M ]] α in the sense + ◦ U + U + ◦ + that [[ M ]] α = θo [[ M ]] α A ,orequivalently[[M ]] α A = e [[ M ]] α :

[[ o ]] U ◦ ✒ ✻ [[ M ]] U α θo e

❄ A+ ✲ [[ o ]] [[ M ]] α+

But to prove that takes an induction on the derivation of the ground term ∅ ✄ M, which may involve more complex types and non-empty contexts. Therefore we have to prove a more elaborate result. To formulate this, given a context Γ ×···× with types σ1,...,σn,letθΓ = θσ1 θσn be the product of the functions U ◦ ✲✲ + θσi :[[σi ]] A [[ σi ]] A . Then Dom θΓ is the product of the Dom θσi ’s, and so + × ✄ U A Dom θΓ is a subset of the domain of [[Γ M ]] α for any term in context ‡ ✄ U Γ(see() earlier in this section for Dom [[Γ M ]] α ). We can now state the result that explains the sense in which [[Γ ✄ M ]] α+ ✄ U can be viewed as a restriction of [[Γ M ]] α . The proof is a lengthy induction on term formation. 179 Goldblatt

+ Theorem 6.2 For any τ-term Γ ✄ M : σ,anyx ∈ A ,andanyγ ∈ Dom θΓ,

✄ U + [[ Γ M ]]✲α U A × Dom θΓ [[ σ ]] ◦ ◦ A

id × θΓ θσ ❄ ❄ ❄ ✄ + + [[ Γ M ]] α✲ A × [[ Γ ]] A+ [[ σ ]] A+

✄ U ∈ (1) [[Γ M ]] α (x, γ) Dom θσ,and

◦ ✄ U ✄ + (2) θσ [[ Γ M ]] α (x, γ)=[[Γ M ]] α (x, θΓ(γ)). ✷ The main use of this theorem is in deriving the following fundamental relationship between satisfaction in a coalgebra and in its observable ultrapowers.

Theorem 6.3 (cLo´s-type theorem for observable ultrapowers) + If Γ ✄ ϕ is an observable τ-formula, x ∈ A and (z1,...,zn) ∈ [[ Γ ]] A+ ,then + α ,x,z1,...,zn |=Γ✄ ϕ if, and only if,

{i ∈ I : α, f(i),g1(i),...,gn(i) |=Γ✄ ϕ}∈U

U U U whenever x = f and (z1,...,zn)=θΓ(g1 ,...,gn ). ✷ ¿From this we can conclude that the class of all models of an observable formula is closed under observable ultrapowers: Corollary 6.4 If Γ ✄ ϕ is observable, then

α |=Γ✄ ϕ if, and only if, α+ |=Γ✄ ϕ.

✷

Intrinsic Ultrapowers AsetΦofgroundformulasissatisfiable in coalgebra α if there is some state of α at which all members of Φ are true, i.e. some x ∈ A such that α, x |= ϕ for all ϕ ∈ Φ. Φ is finitely satisfiable ifeachfinitesubsetofΦissatisfiablein α. Putting ϕα = {x ∈ A : α, x |= ϕ}, we see that Φ is finitely satisfiable in α iff the collection Φα = {ϕα : ϕ ∈ Φ} of subsets of A has the finite intersection property. There is a well-known construction in the theory of ultrapowers that will enable us to force certain finitely α-satisfiable Φ’s become satisfiable in α+. By choosing a suitable ultrafilter U it can be arranged that any collection S of subsets of A with the finite intersection property has a “nonstandard element” in its intersection. This element is an f U ∈ AU such that for each C ∈S, U f ∈U C, i.e. {i : f(i) ∈ C}∈U. 180 Goldblatt

To see how this is done, let IA be the set of all finite subsets of the powerset of A.AtypicalelementofIA is of the form i = {C1,...,Cn} with the Cj’s being subsets of A.Foreachk ∈ IA,letIk = {i ∈ IA : k ⊆ i}. The collection { ∈ } ∩···∩ UA = Ik : k IA has the finite intersection property, since Ik1 Ikn contains the element i = k1 ∪···∪kn. Any ultrafilter U on IA that extends U + UA will be called intrinsic to A, and the associated A and α will be called intrinsic ultrapowers. Now if S is a collection of subsets of A with the. finite intersection property, → ∈ ∩S ∩S ∅ let f : IA A be any function such that f(i) (i ) whenever. i = . Note that by the finite intersection property, if i ∩S= ∅ then (i ∩S) = ∅, so such an f does exist. Then for any C ∈S, put k = {C}∈IA:ifi ∈ I{C} then C ∈ i ∩S,sof(i) ∈ C. This shows that I{C} ⊆{i : f(i) ∈ C},andso U f ∈U C as desired.

Theorem 6.5 Let τ be a type that has at least one non-trivial observable subtype. Suppose that every ground observable τ-formula valid in α is valid also in a τ-coalgebra β.Letα+ be any intrinsic observable ultrapower of α. Then each state of β is bisimilar to a state of α+.

Proof. Let y be a state of β.IfM : o is any ground observable term, let cM =[[M ]] β(y) ∈ [[ o ]] . L e t Φ y be the set of equations M ≈ cM for all ground observable M.Bydefinition,Φy is satisfied by y in β.Eachfinite {ϕ1,...,ϕn}⊆Φy is satisifiable in α, for otherwise the formula

¬(ϕ1 ∧···∧ϕn) would be valid in α, hence valid in β by hypothesis, contrary to the fact that this formula is false at y. α Thus the collection Φy has the ﬁnite intersection property. If U is the intrinsic ultraﬁlter that gives rise to α+, then by the above construction there U U U α is some f ∈ A such that for each M, f ∈U (M ≈ cM ) , which means that the set

IM = {i ∈ I : α, f(i) |= M ≈ cM }

= {i ∈ I :[[M ]] α(f(i)) = cM } belongs to U. Since this holds for all ground observable M, f U is observable by U + + U (†), so f ∈ A . Also, since IM ∈ U, Theorem 6.3 gives α ,f |= M ≈ cM , so U [[ M ]] α+ (f )=cM =[[M ]] β(y). Therefore f U and y assign the same values to all ground observable terms, and so are bisimilar by Theorem 5.8(4). ✷ 181 Goldblatt

7 Deﬁnable Classes of Coalgebras

The tools needed to give a structural characterisation of logically definable classes of coalgebras are now all in place. The following result is the analogue for polynomial functors of Theorem 9.2 of [4] for monomial functors, and the underlying reasoning is the same. Theorem 7.1 If τ has at least one non-trivial observable subtype, then for any class K of τ-coalgebras, the following are equivalent. (1) K is the class of all models of some set of rigid observable formulas. (2) K is the class of all models of some set of ground observable formulas. (3) K is closed under disjoint unions, images of bisimulations, and observable ultrapowers. (4) K is closed under disjoint unions, images of bisimilarity relations, and intrinsic observable ultrapowers. Proof. We explain why (4) implies (2), the proofs that (2) implies (1) which implies (3) which implies (4) being either evident or already discussed (The- orem 5.3, Corollary 6.4). Let Φ be the set of all ground observable formulas that are valid in all members of K. By definition all members of K are models of Φ, so it suffices to prove the converse. Let β be a model of Φ. For each ground observable ϕ such that β |= ϕ theremustbesomeαϕ ∈ K such that αϕ |= ϕ (or else ϕ belongs to Φ hence β |= ϕ). Let α be the disjoint union of all these αϕ’s. Then any ground observable formula valid in α is valid in every αϕ, hence valid in β. Therefore if α+ is an intrinsic observable ultrapower of β,thenby Theorem 6.5 the bisimilarity relation from α+ to β is surjective. In other words, β is the image under bisimilarity of an intrinsic ultrapower of a disjoint union of coalgebras from K. The closure conditions listed in (4) thus ensure that β ∈ K. ✷

References

[1] Aczel, P. and N. Mendler, A Final Coalgebra Theorem, in:D. H. Pitt et al., editors, Category Theory and Computer Science. Proceedings 1989, Lecture Notes in Computer Science 389, Springer-Verlag, 1989 pp. 357–365. [2] Bell, J. L. and A. B. Slomson, “Models and Ultraproducts,” North-Holland, Amsterdam, 1969. [3] Chang, C. C. and H. J. Keisler, “Model Theory,” North-Holland, Amsterdam, 1973. [4] Goldblatt, R., What is the Coalgebraic Analogue of Birkhoﬀ’s Variety Theorem?, Theoretical Computer Science, to appear. Manuscript available at http://www.mcs.vuw.ac.nz/~rob/papers/what.ps. 182 Goldblatt

[5] Hermida, C., “Fibrations, Logical Predicates and Indeterminates,” Ph.D. thesis, University of Edinburgh (1993), techn. rep. LFCS-93-277. Also available as Aarhus Univ. DAIMI Techn. rep. PB-462.

[6] Hermida, C. and B. Jacobs, Structural induction and coinduction in a ﬁbrational setting, Information and Computation 145 (1998), pp. 107–152.

[7] Jacobs, B., Objects and Classes, Coalgebraically, in:B. Freitag, C. B. Jones, C. Lengauer and H.-J. Schek, editors, Object-Orientation with Parallelism and Persistence, Kluwer Academic Publishers, 1996 pp. 83–103.

[8] Jacobs, B., “Categorical Logic and Type Theory,” Elsevier, 1999.

[9] Jacobs, B., Exercises in Coalgebraic Speciﬁcation, manuscript for the proceedings of the Mathematics for Information Technology summer school, Oxford, 2000. Available at http://www.cs.kun.nl/~bart/PAPERS/.

[10] Jacobs, B., Towards a Duality Result in Coalgebraic Modal Logic, Electronic Notes in Theoretical Computer Science 33 (2000), http://www.elsevier.nl/ locate/entcs.

[11] Kurz, A., Specifying Coalgebras with Modal Logic, Electronic Notes in Theoretical Computer Science 11 (1998), http://www.elsevier.nl/locate/ entcs.

[12] Moss, L. S., Coalgebraic Logic, Annals of Pure and Applied Logic 96 (1999), pp. 277–317.

[13]Pitts,A.M.,Categorical Logic, in:S. Abramsky, D. M. Gabbay and T. S. E. Maibaum, editors, Handbook of Logic in Computer Science, Volume 5: Algebraic and Logical Structures, Oxford University Press, 2000 .

[14] Reichel, H., An Approach to Object Semantics Based on Terminal Co-Algebras, Mathematical Structures in Computer Science 5 (1995), pp. 129–152.

[15] R¨oßiger, M., From Modal Logic to Terminal Coalgebras, Preprint MATH-AL-3- 1998, Technische Universit¨at Dresden (1998), to appear in Theoretical Computer Science.

[16] Rutten, J., A Calculus of Transition Systems (towards Universal Coalgebra), in:A. Ponse, M. de Rijke and Y. Venema, editors, Modal Logic and Process Algebra, CSLI Lecture Notes No. 53, CSLI Publications, Stanford, California, 1995 pp. 231–256.

[17] Rutten, J., Universal Coalgebra: a Theory of Systems, Theoretical Computer Science 249 (2000), pp. 3–80.

183 CMCS’01 Preliminary Version

Monoid-labeled transition systems

H. Peter Gumm, Tobias Schr¨oder

Fachbereich Mathematik und Informatik Philipps-Universit¨at Marburg Marburg, Germany {gumm,tschroed}@mathematik.uni-marburg.de

Abstract / Given a -complete (semi)lattice L, we consider L-labeled transition systems as coalgebras of a functor L(−), associating with a set X the set LX of all L-fuzzy subsets. We describe simulations and bisimulations of L-coalgebras to show that L(−) weakly preserves nonempty kernel pairs iff it weakly preserves nonempty pullbacks iff L is join infinitely distributive (JID). (−) Exchanging L for a commutative monoid M, we consider the functor Mω which associates with a set X all finite multisets containing elements of X with multiplici- ties m ∈ M. The corresponding functor weakly preserves nonempty pullbacks along injectives iff 0 is the only invertible element of M, and it preserves nonempty kernel pairs iff M is refinable, in the sense that two sum representations of the same value, r1 + ...+ rm = c1 + ...+ cn, have a common refinement matrix (mi,j) whose k-th row sums to rk and whose l-th column sums to cl for any 1 ≤ k ≤ m and 1 ≤ l ≤ n. Key words: Coalgebra, transition system, fuzzy transition, multiset, weak pullback preservation, bisimulation, refinable monoid, distributive lattice.

1 Introduction

It is well known that transition systems can be described as coalgebras of the covariant powerset functor P. The general theory of coalgebras automatically supplies the fundamental notions of homomorphism and bisimulation.The fact that the functor P is well behaved, i.e. that it preserves (generalized) weak pullbacks, guarantees a collection of useful properties. In particular, the relational product of bisimulations is a bisimulation, the largest bisimulation is an equivalence relation and kernels of homomorphisms are always bisimulations. The subclass of image-finite transition systems corresponds to coalgebras of the finite powerset functor Pω. This functor, in addition, is bounded, so a terminal Pω-coalgebra exists, providing semantics and a proof principle (coinduction) for image-finite transition systems. This is a preliminary version. The final version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Gumm and Schroder¨

Introducing labels does not complicate the situation. Given a set L of labels, an L-labeled transition system on a state set S is a ternary relation T ⊆ S × L × S. Again, the equivalent coalgebraic view as a map α : S →P(S)L, i.e. as a P(−)L-coalgebra, better captures the dynamic aspects of labeled transition systems. What has been said above for P-, resp. L L Pω-coalgebras carries over to P(−) -, resp. Pω(−) -coalgebras. In fact an L-labeled transition system is nothing but a collection of (plain) transition systems, one for each label. The situation becomes more interesting, when the labels carry some algebraic structure. This structure is relevant, when arc labels denote, for instance, flow capacities or durations. In general, we shall assume the labels to define a commutative monoid, i.e. a commutative, associative operation with a neutral element 0. This allows us a coalgebraic interpretation of labeled transition systems as a map from states to graded sets of successor states. This view will be seen to provide an interesting intertwining of coalgebraic structure with algebraic properties of the monoid. Another motivation for this work is the study of Set-functors. In previous work ([Gum98],[GSb],[GS00]) we have studied the coalgebraic significance of various preservation properties of Set-endofunctors. Here we construct such functors, that are parameterized by a commutative monoid, so we can custom build functors by selecting commutative monoids with appropriate algebraic properties. / In the first sections, we start with an arbitrary -complete semilattice L, and we consider L-multisets. A multiset S can be thought of as a set, each of whose elements s is contained in S with some multiplicity (probability, certainty) l ∈L. There are plenty of natural choices for L,suchas{0, 1}, N∪{∞}, or the real interval [0, 1], giving rise to the standard notions of (plain old) set, bag (multiset), or fuzzy set. The semilattice L givesrisetoafunctorL(−) which generalizes the powerset functor P for L = {0, 1}.ThusL(−)-coalgebras generalize transition systems to multiset transition systems and fuzzy transition systems. We show that L(−) always preserves nonempty pullbacks along injective maps, and that it preserves arbitrary nonempty weak pullbacks just in case L is join infinitely distributive, that is, finite meets distribute over infinite joins. In semilattices, the possibility of defining infinite sums is closely tied to the idempotent law. In arbitrary commutative monoids, we can form infinite sums only in cases where all but finitely many summands are zero. This leads (−) to a slightly different functor, Mω , for an arbitrary commutative monoid M. We study it in the second half of this paper. This functor preserves nonempty pullbacks along injective maps iff the monoid is positive and it preserves nonempty weak pullbacks of arbitrary maps iff additionally the monoid is refinable in a sense to be defined later.

185 Gumm and Schroder¨

2 Basic notions

Let R ⊆ A × B and S ⊆ B × C be binary relations. We denote by (R ; S) their composition or relational product:

(R ; S):={(a, c) |∃b ∈ B. (a, b) ∈ R, (b, c) ∈ S}.

With R− we denote the converse of R, i.e. R− := {(b, a) | (a, b) ∈ R}.Weuse “aRb” as a shorthand for “(a, b) ∈ R”. If ϕ : A → B is a map then we denote by G(ϕ) its graph, that is

G(ϕ):={(a, ϕ(a)) | a ∈ A}.

With these deﬁnitions we get: G(ψ ◦ ϕ)=G(ϕ);G(ψ).

2.1 F -coalgebras and homomorphisms Let F : Set →Set be a functor from the category of sets to itself. A coalgebra of type F is a pair A =(A, α), consisting of a set A and a map α : A → F (A). A is called the carrier set and α is called the structure map of A. If A =(A, α)andB =(B,β)areF -coalgebras, then a map ϕ : A → B is called a homomorphism,if

β ◦ ϕ = F (ϕ) ◦ α, that is, if the following diagram commutes: ϕ A /B

α β F (A) F (ϕ) /F (B)

F -coalgebras and their homomorphisms form a category SetF .Itiswell S S known that all colimits in etF exist, and they are formed just as in et.In A A particular, the sum i∈I 0i of a family of F -coalgebras i =(Ai,αi) has as carrier the0 disjoint union0 i∈I Ai and the coalgebra structure is the unique → map α : i∈I Ai F ( i∈I Ai)with ◦ ◦ α ιAi = F (ιAi ) αi ∈ for all 0i I,whereeachιAi is the canonical embedding of Ai into the disjoint union i∈I Ai.

2.2 Subcoalgebras A subset U ⊆ A is called a subcoalgebra of A =(A, α), provided there exists → ⊆A → a coalgebra structure ν : U F (U) so that the inclusion map U : U A is a homomorphism from U =(U, ν)toA. 186 Gumm and Schroder¨

One reads off the definition of homomorphism, that U is a subcoalgebra of A =(A, α)iff

∀ ∈ ∃ ∈ ⊆A u U. v F (U).α(u)=F ( U )(v).

∅ ∅ ⊆A U = is always a subcoalgebra. If U = , then the inclusion map U has ⊆A a left inverse. Consequently, F ( U )hasaleftinversetoo,inparticularitis injective. Thus, if a structure map ν as above exists, it is unique. For that reason we use the term “subcoalgebra” interchangeably for the coalgebra U as well as for its carrier set U.

2.3 Bisimulations A binary relation R ⊆ A × B is called a bisimulation between A and B if it is possible to deﬁne a coalgebra structure δ : R → F (R), so that the projection maps π1 : R → A and π2 : R → B are homomorphisms. This can be expressed by the following commutative diagram:

o π1 π2 / A R B α δ β F (A) oF (π1) F (R) F (π2)/F (B)

If R is a bisimulation between A and B, then its converse R− is a bisimulation between B and A. The union of a family of bisimulations is a bisimulation again, so there is always a largest bisimulation ∼A,B between coalgebras A and B. If A = B, then we shall call R a bisimulation on A. The diagonal relation ∆A = {(a, a) | a ∈ A} is always a bisimulation, hence the largest bisimulation on A, denoted by ∼A, is always reﬂexive and symmetric. In general, though, it need not be transitive. Amapϕ : A → B is a homomorphism between coalgebras A and B,iﬀ its graph G(ϕ) is a bisimulation. If R is a coalgebra, and ϕ : R→Aand ψ : R→Care homomorphisms, then {(ϕ(r),ψ(r)) | r ∈ R} =(G(ϕ)− ; G(ψ)) is a bisimulation between A and B. As a consequence, a bisimulation R between A and B gives rise to a bisimulation R˜ := {(ιA(a),ιB(b)) | (a, b) ∈ R} on their sum A + B.

3 The functor L(−) / Let L be a complete -semilattice. L can be made into a complete lattice by deﬁning for any subset S: 1 2 S = {l ∈ L |∀s ∈ S. l ≤ s}. 187 Gumm and Schroder¨

For any set A,letLA denote the set of all maps σ : A →L.Eachsuchσ can be thought of as the characteristic function of an L-multiset:

x ∈l σ ⇐⇒ σ(x)=l.

Given a map f : A → B,wedeﬁneamapLf : LA →LB by 2 Lf (σ)(b):= {σ(a) | f(a)=b}, then it is easy to check: Lemma 3.1 L(−) is a (covariant) Set-endofunctor.

Proof. Given sets A, B,andC and maps f : A → B, g : B → C,wecalculate for arbitrary σ : A →L, a ∈ A,andc ∈ C: 2 idA L (σ)(a)= {σ(x) | idA(x)=a} = σ(a) = idLA (σ)(a). 2 (Lg ◦Lf )(σ)(c)= {Lf (σ)(b) | g(b)=c} 2 2 = { {σ(a) | f(a)=b}|g(b)=c} 2 = {σ(a) | (g ◦ f)(a)=c} = Lg◦f (σ)(c).

Obviously, when L is the two-element lattice {0, 1}, this functor is naturally isomorphic to the powerset functor P via η : L(−) →P, deﬁned for any set X by

ηX (σ):={x ∈ X | σ(x)=1}. / It is also straightforward to check that each -preserving map ϕ : L→L − induces a natural transformation between L(−) and L( ).

3.1 L-Coalgebras. According to the general deﬁnition of coalgebras, an L(−)-coalgebra is a pair (A, α) consisting of a set A and a map α : A →LA.GiventwoL(−)-coalgebras (A, α)and(B,β), a map ϕ : A → B is a homomorphism if Lϕ ◦ α = β ◦ ϕ, that is, if for all a ∈ A and b ∈ B we have 2 β(ϕ(a))(b)=Lϕ(α(a))(b)= {α(a)(a) | a ∈ A, ϕ(a)=b}.

In the sequel we shall speak of L-coalgebras rather than of L(−)-coalgebras. It is convenient to introduce an “arrow notation” familiar from transition 188 Gumm and Schroder¨ systems as a shorthand. We write

a−→l a iﬀ α(a)(a)=l.

Obviously, the L-coalgebra structure α : A →LA may also be interpreted as an L-graded relation αˆ : A × A →Lby settingα ˆ(a, a):=α(a)(a). When L is the two-element lattice then an L-coalgebra is just a Kripke- frame. In the general case, L provides a set of measures for indicating how “strong” a pair (a, a) should be in the L-graded relationα ˆ, or, alternatively, how conﬁdent we might be that (a, a)isinˆα.IfL is the unit interval, then αˆ is just a fuzzy relation.

3.2 Subcoalgebras

The functor L(−) is not standard in the sense deﬁned in [Mos99], that is

A A (⊆U ) L L = ⊆LU .

Rather, we always have: " ⊆A τ(a), if a ∈ U L U (τ)(a)= 0, otherwise.

From that one obtains straightforwardly:

Lemma 3.2 U ⊆ A is a subcoalgebra of A =(A, α) iﬀ

∀u ∈ U, a ∈ A. ∀m =0 .u−→m a =⇒ a ∈ U.

Proof. U is a subcoalgebra of A =(A, α) ⊆A ⇐⇒ ∀ u ∈ U. ∃τ : U →L.α(u)=L U (τ) ⇐⇒ ∀ u ∈ U. ∀a ∈ (A − U).α(u)(a)=0 ⇐⇒ ∀ u ∈ U, a ∈ A ∀m =0 .u−→m a =⇒ a ∈ U.

For coalgebras of arbitrary type F it is known that subcoalgebras are always closed under ﬁnite intersections (see [GSa]). For the functor F = L(−), we even get from the above:

Corollary 3.3 The intersection of an arbitrary family (Ui)i∈I of subcoalgebras of an L-coalgebra A is again a subcoalgebra of A. 189 Gumm and Schroder¨

3.3 L-Simulations

Deﬁnition 3.4 We deﬁne a simulation between A and B as a relation S ⊆ A × B where for all (a, b) ∈ S and all a ∈ A 2 a−→m a =⇒ {l |∃y ∈ B. b−→l y, aSy}≥m.

Thus, if a is simulated by b and if a is an m-successor of a, then there must be suﬃciently/ many li-successors bi of b, each one simulating a and together ≥ yielding i∈I li m. We write A QS B,ifS is a simulation between A and B. Then we conclude directly from the deﬁnition:

S ;T Lemma 3.5 If A QS B and B QT C,thenA Q C. Thus the class of all L- coalgebras with simulations as arrows forms a category.

3.4 L-Homomorphisms Homomorphisms turn out to be maps between coalgebras which strengthen the graded relation and whose converse is a simulation, that is: Lemma 3.6 Amapϕ : A → B between coalgebras A =(A, α) and B = (B,β) is a homomorphism if and only if the following two conditions are satisﬁed for all a, a ∈ A,allb ∈ B and for all m ∈L: a−→m a =⇒ ϕ(a)−→m ϕ(a) for some m ≥ m (1) 2 ϕ(a)−→m b =⇒ m ≤ {l |∃x ∈ A. a−→l x, ϕ(x)=b}. (2)

Proof. The homomorphism condition requires that for every a ∈ A and every b ∈ B we have the equation

β(ϕ(a))(b)=Lϕ(α(a))(b).

Translating this, using the arrow–notation, we get 2 βˆ(ϕ(a),b)= {l |∃x ∈ A. a−→l x, ϕ(x)=b}.

The two homomorphism conditions are equivalent to the inequalities which are obtained by replacing “=” above by “≥”and“≤”. From the ﬁrst homomorphism condition we obtain βˆ(ϕ(a),b) ≥ l whenever a−→l x and ϕ(x)=b,thus / β(ϕ(a))(b) ≥ {l |∃x ∈ A. a−→l x, ϕ(x)=b}. From this inequality, in turn, we get the ﬁrst homomorphism condition back by considering only x = a in the right hand side, to obtain βˆ(ϕ(a),ϕ(a)) ≥ l whenever a−→l a. The second homomorphism condition is obviously equivalent with the inequality obtained from replacing “=” with “≤”. 190 Gumm and Schroder¨

Using the description of homomorphisms and of subcoalgebras, it is now straightforward to check:

Corollary 3.7 If ϕ : A→Bis a homomorphism and V ≤ B is a subcoalgebra of B,thenϕ−[V ] is a subcoalgebra of A.

Observe that the ﬁrst, resp. second, homomorphism condition just states that G(ϕ), resp. G(ϕ)−,isasimulation,thusweget:

Corollary 3.8 Amapϕ : A → B is a homomorphism between L-coalgebras A and B if and only if both G(ϕ) and G(ϕ)− are simulations.

3.5 L-Bisimulations We have already remarked that a map between coalgebras A and B is a homomorphism iﬀ its graph is a bisimulation. However, a relation R is not necessarily a bisimulation even if R and R− are simulations. To see the connections we have to characterize bisimulations:

Proposition 3.9 A relation R between coalgebras A =(A, α) and B =(B,β) is a bisimulation if and only if for all (a, b) ∈ R and all a ∈ A, b ∈ B we have: 2 a−→m a =⇒ {m ∧ l |∃y ∈ B. b−→l y, aRy} = m, and (3) 2 b−→m b =⇒ {m ∧ l |∃x ∈ A. a−→l x, xRb} = m. (4)

Proof. Let R be a bisimulation, and (a, b) ∈ R, then there is some τ := δ(a, b):R →Lwith α(a)=Lπ1 (τ), and (5) β(b)=Lπ2 (τ). (6) For a ∈ A we have therefore by (5) Lπ1 αˆ(a, a )=2 (τ)(a ) = {τ(u) | π1(u)=a } 2 = {τ(a,y) | aRy}. By (6) we have for every y with aRy:

ˆ Lπ2 β(b, y)=2 (τ)(y) = {τ(v) | π2(v)=y} 2 = {τ(x, y) | xRy} ≥ τ(a,y). Combining the above, we get 191 Gumm and Schroder¨ 2 αˆ(a, a)= {τ(a,y) | aRy} 2 = {αˆ(a, a) ∧ τ(a,y) | aRy} 2 ≤ {αˆ(a, a) ∧ βˆ(b, y) | aRy} ≤ αˆ(a, a). / Hence,α ˆ(a, a)= {αˆ(a, a)∧βˆ(b, y) | aRy}, which is the first bisimulation condition. The second one is obtained symmetrically. To show the converse, let the bisimulation conditions (3) and (4) be satisfied. We need to show that R is a bisimulation. Define a structure map δ : R →LR by δ(a, b)(x, y):=ˆα(a, x) ∧ βˆ(b, y). For an arbitrary a ∈ A we have 2 π1 L (δ(a, b))(a )= {δ(a, b)(u) | π1(u)=a} 2 = {δ(a, b)(a,y) | aRy} =ˆα(a, a) = α(a)(a).

π1 π1 Hence L (δ(a, b)) = α(a)=(α ◦ π1)(a, b), thus L ◦ δ = α ◦ π1, and, analo- π2 gously, L ◦ δ = β ◦ π2. Therefore, R is a bisimulation. Notice that lemma 3.6 could be inferred from the above, since for general coalgebras, homomorphisms are precisely those maps whose graph is a bisimulation. We also get the following corollary: Corollary 3.10 If R is a bisimulation then both R and R− are simulations.

4The role of distributivity

In the applications of coalgebras one often has type functors which satisfy an important technical property: They preserve weak pullbacks. This is to say that a (weak) pullback diagram is transformed into a weak pullback diagram. Many results in coalgebra theory (e.g. [Rut00]) hinge on this property. In [GSa] it was shown that a functor F preserves nonempty weak pullbacks if and only if bisimulations between F -coalgebras are closed under composition. Now consider corollary 3.10. If its converse is true then by lemma 3.5 bisimulations will be closed under composition, thus L(−) will preserve weak pullbacks. In this section we shall show that this and related properties hinge on a certain distributivity condition on the lattice L. Definition 4.1 [[Grä98]] A lattice is called join infinite distributive (in short JID), if it satisfies the law 2 2 x ∧ {xi | i ∈ I} = {x ∧ xi | i ∈ I}. 192 Gumm and Schroder¨

If L is a complete semilattice, we shall say that L isJIDiﬀthisisthecasefor the lattice induced on L. Whenever L is JID, the bisimilarity conditions for L-coalgebras simplify to: Lemma 4.2 If L is JID then a relation R ⊆ A × B is a bisimulation between coalgebras A and B if and only if 2 a−→m a =⇒ {l | b−→l y, aRy}≥m, and (7) 2 b−→m b =⇒ {l | a−→l x, xRb}≥m. (8)

The following theorem, amongst other things, provides a converse to this lemma and to corollary 3.10: / Theorem 4.3 Let L be a -semilattice, then the following are equivalent: (i) L is JID. (ii) R ⊆ A × B is a bisimulation ⇐⇒ R and R− are simulations. (iii) Bisimulations are closed under compositions. (iv) L(−) preserves nonempty weak pullbacks. (v) L(−) weakly preserves nonempty kernel pairs.

(vi) The largest bisimulation ∼A on every L-coalgebra is transitive. Proof. (i) → (ii) follows from lemma 4.2. (ii) → (iii) is a consequence of lemma 3.5. (iii) → (iv) is shown for arbitrary functors in [GS00]. (iv) → (v) → (vi) can be found in [GS00], so we may concentrate on proving (vi) → (i).

Let a family (li)i∈I and a further element m of L be given, it suﬃces to show 2 2 m ∧ li ≤ (m ∧ li), i∈I i∈I since the reverse inclusion holds in any lattice. On the sets A := {ai | i ∈ I}∪{a, a }

B := {bi | i ∈ I}∪{b} C := {c, c} deﬁne coalgebra structures α, β,andγ given in arrow-notation as follows: 2 e a−→ a , where e := m ∧ li

li a−→ ai, for all i ∈ I li b−→ bi, for all i ∈ I / li c−→ c. Using lemma 3.6 it is easy to see that ϕ : A → C,deﬁnedasϕ(a)=c and ϕ(a )=ϕ(ai)=c for all i ∈ I,andψ : B → C given by ψ(b)=c and 193 Gumm and Schroder¨

 − ψ(bi)=c for all i ∈ I are homomorphisms. Consequently, G(ϕ)andG(ψ) are bisimulations.

za2 c b2 z 22 2 zz 2 22 zz 2 2 zz 2 / 2 e z 22 ϕ / o ψ 22 zz 2 li 2 zz li lj 2 li lj 2 z 2 22 zz 22 2 }zz Õ Ö a ai ... aj c bi ... bj

Consider now the sum of the three coalgebras A + B + C. We still have that G(ϕ)andG(ψ)− are bisimulations, in particular, they are contained in the largest bisimulation ∼=∼A+B+C.Thusa ∼ c and c ∼ b. By assumption, now a ∼ b.

Proposition 3.9 tells us that there is a subcollection (bj)j∈J⊆I with 2 2 lj e ≤ {e ∧ lj | b−→ bj}≤ (e ∧ li). j∈J i∈I

Hence 2 2 2 m ∧ li = e ≤ (e ∧ li)= (m ∧ li), i∈I i∈I i∈I finishing the proof. The implication (i) → (iv) is due to S. Pfeiffer [Pfe99]. A finite lattice is distributive iff it does not contain one of the characteristic five-element nondistributive sublattices M3 or N5,see([Grä98]). By separately excluding these cases, she also obtained the converse (iv) → (i)incasethatL is finite and distributive. Observe that nonempty weak pullbacks along injective maps, resp. nonempty pullbacks of an arbitrary collection of injective maps, are always preserved, without assuming JID. In [GS00], these conditions have been shown to be equivalent to homomorphic preimages of subcoalgebras, resp. arbitrary intersection of subcoalgebras, being subcoalgebras again. Thus, these results follow from corollaries 3.7 and 3.3. One might wonder, whether in general, simulations could not have been defined by just one clause/ of 3.9 so that condition (ii) would automatically be satisfied for arbitrary -semilattices L. Notice, however, that with such a definition we would not have been able to show that simulations are closed under composition.

5 Labeling with a commutative monoid

(−) In the definition/ of the functor L it is essential that L has arbitrary suprema, i.e. that L is -complete. When trying to replace L by an arbitrary commutative monoid M =(M,+, 0), we do not have infinite sums available anymore, 194 Gumm and Schroder¨ unless when almost all summands are 0. Hence, we must redefine the functor by only considering maps σ : X → M with finite support: Definition 5.1 Let M =(M,+, 0) be a commutative monoid. Whenever X is a set and all but finitely many elements of a family (g(x))x∈X are 0, we denote its sum by (g(x) | x ∈ X). Given any set X and a map σ : X → M,wecall

supp(σ):={x ∈ X | σ(x) =0 } the support of σ.Let

MX { → || | } ω := σ : X M supp(σ) <ω be the set of all maps from X to M with ﬁnite support, and for any map → Mf ∈MX f : X Y let ω be the map deﬁned on any σ ω by: ∀ ∈ Mf | ∈ y Y. ω(σ)(y):= (σ(x) x X, f(x)=y).

Mf One easily checks that ω(σ) is a map from Y to M with ﬁnite support, so using associativity and commutativity of +, one veriﬁes as before: (−) Lemma 5.2 Mω is a Set-endofunctor.

(−) When M is the two-element Boolean algebra ({0, 1}, ∨, 0) then Mω is just the ﬁnite powerset functor Pω(−). (−) Coalgebras of type Mω , in the sequel called Mω-coalgebras, may again be viewed as graphs with arcs labeled by elements of M,sowecontinueusing the arrow-notation as in the case of L-coalgebras. In particular, if (A, α)isan Mω-coalgebra, a, a ∈ A and m ∈ M, we can choose between the equivalent notations α(a)(a)=m, orα ˆ(a, a)=m, or a−→m a.

For Mω-coalgebras, the basic coalgebraic constructions can be easily described:

Lemma 5.3 Let A =(A, α) and B =(B,β) be Mω-coalgebras, then (i) U ⊆ A is a subcoalgebra of A, iﬀ for all u ∈ U,andalla ∈ A:

u−→m a, m =0 =⇒ a ∈ U.

(ii) ϕ : A → B is a homomorphism iﬀ for all a ∈ A, b ∈ B:

ϕ(a)−→m b ⇐⇒ m = (m |∃a ∈ A. a−→m a,ϕ(a)=b).

Corollary 5.4 The intersection of an arbitrary family (U)i∈I of subcoalgebras of an Mω-coalgebra A is again a subcoalgebra of A. 195 Gumm and Schroder¨

If ϕ : A→Bis a homomorphism and U ⊆ B a subcoalgebra of B,then ϕ−[U] need not be a subcoalgebra of A. This stands in contrast to the situation (−) for lattice labeled coalgebras (corollary 3.7). Thus, the functor Mω does in general not preserve nonempty pullbacks along injective maps ([GS00]). In the next section, we shall study algebraic conditions on the monoid M which are responsible for such properties of the functor. We conclude this section with a characterization of bisimulations R ⊆ A × B between Mω-coalgebras A and B. For this we consider the elements MA | | MR of ω as vectors with A many components and the elements of ω as |A|×|B|-matrices with entries from M. From the deﬁnition of bisimulation in section 2.3, we obtain:

Lemma 5.5 Let A =(A, α) and B =(B,β) be Mω-coalgebras. A relation R ⊆ A × B is a bisimulation iﬀ for every (a, b) ∈ R there exists an |A|×|B|- matrix (mx,y) with entries from M such that: • all but ﬁnitely many mx,y are 0, • mx,y =0 implies (x, y) ∈ R, • α(a) is the vector of all row-sums of (mx,y), i.e. ∀x ∈ A.αˆ(a, x)= (mx,y | y ∈ B),

• β(b) is the vector of all column-sums of (mx,y), i.e. ˆ ∀y ∈ B.β(b, y)= (mx,y | x ∈ A).

5.1 Positive monoids. Any commutative semigroup can be turned into a commutative monoid by simply adjoining a new element 0. The obtained monoid is rather special though, it can be internally characterized by the fact that no nonzero element is invertible: Deﬁnition 5.6 A monoid element m ∈ M is called invertible if there exists some m− ∈ M with m + m− =0.AmonoidM =(M,+, 0) is called positive if 0 is the only invertible element. If a commutative monoid is not positive, we can obtain one by factoring out the invertible elements, or by deleting all invertible ones, except for 0, for it is easy to check: Lemma 5.7 The invertible elements of a commutative monoid form a group I(M). Factoring M by the induced congruence relation yields the largest positive factor of M. At the same time, any commutative monoid is the union of the subgroup I(M) of invertible elements with a positive submonoid M+. 196 Gumm and Schroder¨

Example 5.8 The following monoids are positive: (i) (N, +, 0) (ii) (N \{0}, ·, 1) (iii) (L, ∨, 0) for any (semi)lattice with 0.

5.2 Refinable monoids. We shall need to consider a further monoid condition which we shall call × refinable. For this, let us consider anm n-matrix (ai,j) of monoid elements. Consider their row-sums ri = 1≤j≤m ai,j, and their column sums cj = 1≤i≤n ai,j, then by associativity and commutativity one obviously has r1 + ...+ rn = c1 + ...+ cm. Refinability is just the inverse condition, that is: Definition 5.9 Given m, n ∈ N,amonoidM is called (m, n)-refinable, if for any r1,...,rm,c1,...,cn ∈ M with r1 + ...+ rm = c1 + ...+ cn one can find an m × n-matrix (ai,j)ofelementsofM, whose row sums are r1,...,rm and whose column sums are c1,...,cn.

a1,1 ··· a1,n r1 . . . ··· . ···

am,1 ··· am,n rm

c1 ··· cn

Obviously, when m =1orn = 1, the condition is vacuous. When m>1 then M is (m, 0)-refinable iff it is positive. For the remaining cases we prove: Proposition 5.10 For any m, n > 1 we have: A commutative monoid is (m, n)-refinable, iff it is (2, 2)-refinable.

Proof. In an (m, n+1)-reﬁnement of r1+...+rm = c1+...+cn+0, we can add corresponding elements from the last two rows to obtain an (m, n)-reﬁnement of r1 + ...+ rm = c1 + ...+ cn,soonedirectionisclear. The other direction is proved by an easy induction over the number of columns, followed by a similar induction over the number of rows. As a hint, we show how to get from (2, 2) to (3, 2):

Given r1 + r2 + r3 = c1 + c2, use the (2, 2) reﬁnement property to ﬁnd a 2 × 2-matrix (ai,j)withcolumnsumsc1, c2 and row sums r1,(r2 + r3). Now a2,1 + a2,2 = r2 + r3, so there is another 2 × 2 matrix with row sums r2, r3 and column sums a2,1, a2,1:

a1,1 a1,2 r1 b1,1 b1,2 r2

a2,1 a2,2 r2 + r3 and b2,1 b2,2 r3

c1 c2 a2,1 a2,2 197 Gumm and Schroder¨ obviously now, the following matrix solves the original problem:

a1,1 a1,2 r1

b1,1 b1,2 r2

b2,1 b2,2 r3

c1 c2

As a consequence of this proposition, we may simply call a commutative monoid refinable if it is (2, 2)-refinable. Refinability does not imply positivity, since nontrivial abelian groups, for instance, are refinable, but not positive. For the next section, we shall need the following observation, referring to infinite matrices:

Lemma 5.11 Suppose that M is reﬁnable. Given X, Y nonempty sets, σ ∈ MX ∈MY | ∈ | ∈ ω and τ ω with (σ(x) x X)= (τ(y) y Y ). There exists an | |×| | | ∈ X Y -matrix (mx,y) with row sums (mx,y y Y )=σ(x) and column sums (mx,y | x ∈ X)=τ(y), where all but ﬁnitely many mx,y are 0.

Proposition 5.10 makes it easy to check that the first two instances of example 5.8 are refinable. In fact, any commutative monoid which is cancellative and which satisfies

∀c, r ∈ M.∃x ∈ M. c = x + r or r = x + c is easily seen to be refinable. This also covers the case of (N, +, 0). In the case of (N \{0}, ·, 1), refinability is a consequence of the fact that every element has a unique prime factor decomposition. Refinability does not carry over to submonoids. Consider, for instance, the submonoid (N \{0, 2}, ·, 1) of the previous example, and the refinement problem 5 · 6=3· 10. Any refinement would require the prime factor 2, which is unavailable. In the case of a lattice L, we obtain a familiar property:

Lemma 5.12 If L is a lattice with smallest element 0,then(L, ∨, 0) is reﬁn- able if and only if L is distributive.

Proof. Given a distributive lattice L and a, b, c, d ∈ L with a∨b = c∨d,then we have a reﬁnement a ∧ ca∧ d a b ∧ cb∧ d b cd Conversely, if L is not distributive, then one of the following lattices, known 198 Gumm and Schroder¨

as N5,resp.M3, must be a sublattice of L (see e.g. [Gr¨a98]):

p PP p ? PPP ? P ?? c ?? ? N = ? M = a ? c 5 b ?? 3 ?? b ?? ?? ? nn a ? ? nnn ? q n q

In both cases, a ∨ b = b ∨ c. Suppose we had a reﬁnement

xya zub bc with x, y, u, v ∈ L. From the table, it follows that u ≤ b and u ≤ c,so u ≤ b ∧ c = q ≤ a.Also,y ≤ a, hence u ∨ y = c ≤ a.Butc ≤ a,bothinM3 and in N5.

5.3 Weak pullback preservation (−) We now study conditions under which the functor Mω weakly preserves nonempty kernel pairs, pullbacks along injectives, or arbitrary pullbacks. A functor F is said to (weakly) preserve pullbacks along injective maps, provided for any f : X → Z and g : Y → Z with g injective, a (weak) pullback of f with g is transformed by F into a weak pullback of F (f)with F (g). In [GS00], we have shown that a Set-endofunctor F weakly preserves nonempty pullbacks along injective maps if and only the preimage ϕ−[V ]of any F -subcoalgebra V≤Bunder a homomorphism ϕ : A→Bis again a subcoalgebra of A. For L-coalgebras, this condition is always satisfied according to corollary 3.7. We shall see, however, that this is not necessarily the case for Mω- coalgebras. In fact we shall algebraically characterize those monoids M for (−) which Mω preserves weak pullbacks along injective maps. Finally, we consider preservation of arbitrary weak pullbacks. Theorem 5.13 Let M =(M,+, 0) be a commutative monoid. (−) (i) Mω (weakly) preserves nonempty pullbacks along injective maps iff M is positive. (−) (ii) Mω weakly preserves nonempty kernel pairs iff M is refinable. (−) (iii) Mω weakly preserves nonempty pullbacks iff M is positive and refinable. Proof. (i): Assume that M is positive, and let ϕ : A→Bbe a homomorphism of M-coalgebras and V a subcoalgebra of B. We need to show that 199 Gumm and Schroder¨

ϕ−1[V ] is a subcoalgebra of A.Givena ∈ ϕ−1[V ], a ∈/ ϕ−1[V ]anda−→m a,we need to show that m = 0. We know that ϕ(a) ∈ V and ϕ(a) ∈/ V ,sobypart 0 (i) of lemma 5.3 we conclude ϕ(a)−→ ϕ(a ). Part (ii) of the same lemma then yields (n | a−→n x, ϕ(x)=ϕ(a)) = 0. Positivity forces each summand to be 0, in particular m =0.

To prove the converse, let m1,m2 ∈ M be given with m1 + m2 =0. Consider the coalgebra A, given by a point p and two transitions to points q1 and q2, labeled with m1 and m2.LetB consist of two points r and s with no transitions (all transitions labeled with 0). We get a homomorphism ϕ with ϕ(p)=r and ϕ(q1)=ϕ(q2)=s.Now{r} is a subcoalgebra of B,andthe assumption forces ϕ−1{r} = {p} to be a subcoalgebra of A, but this implies m1 = m2 =0.

p ? r ?? m1 ??m2 ϕ / A = ?? 0 = B ? q1 q2 s

A slight modification of this construction also gives us the backward direction of (ii): Assume that F weakly preserves nonempty kernel pairs, then kernels of homomorphisms are bisimulations. Suppose m1 + m2 = s1 + s2 in M.WetakeacopyA of A as above, but we label the arcs of A with s1 and s2.IfB is obtained from B by changing the edge label to m1 + m2 = s1 + s2, there is an obvious homomorphism ϕ : A + A →B. Its kernel must be a bisimulation, so lemma 5.5, provides us with a refinement of m1 +m2 = s1 +s2. We combine the ’if’-directions of (ii)and(iii): Assume that M is refinable. Given homomorphisms ϕ : A→C,andψ : B→C, we need to show that

pb(ϕ, ψ):={(a, b) | ϕ(a)=ψ(b)} is a bisimulation between A and B.

Let (a, b) ∈ pb(ϕ, ψ)begiven.Weshalldefinean|A|×|B|-matrix (mx,y) with entries from M, satisfying the conditions of lemma 5.5. −1 rx For any c ∈ ϕ[A] ∩ ψ[B], put X := ϕ ({c})andrx :=α ˆ(a, x), i.e. a−→ x, −1 ˆ for any x ∈ X . Similarly, Y := ψ ({c})andcy = β(b, y) for every y ∈ Y . | |×| | c With lemma 5.11 we obtain an X Y matrix (mx,y) with row sums (rx)x∈X and column sums (cy)y∈Y . Observe that we can achieve that • c for all but finitely many c is (mx,y) the 0-matrix • c all but finitely many entries in each (mx,y)are0. | |×| | c The final A B -matrix (mx,y) is obtained by putting all (mx,y) together 200 Gumm and Schroder¨ and filling up with zeroes:

··· 0 ··· " mc if ϕ(x)=c = ψ(y), c x,y 0 (m ) 0 mx,y := x,y 0 otherwise.

··· 0 ···

By construction, mx,y = 0 implies (x, y) ∈ pb(ϕ, ψ). Moreover, all but −→r ﬁnitelymanyentriesof(mx,y) are zero. Suppose now that a a . We need | ∈ to show that the a -th row sum is r, i.e. (ma ,b b B)=r. − 1 { } ∅ | ∈ c | ∈ Let c := ϕ(a ). If ψ ( c ) = then (ma ,b b B)= (ma,b b B)=r.Ifψ−1({c})=∅ (this case cannot happen in (ii)), we shall invoke positivity to show r = 0. Speciﬁcally, for s with ϕ(a)−→s c we have (m | a−→m a,ϕ(a)=c)=s = (m | b−→m b,ψ(b)=c)=0.

Hence r + u = 0 for some u ∈ M, whence r =0. Amoreelegantwaytosee(iii) is to directly conclude it from (i)and(ii), by invoking the following lemma from the second author’s thesis: Lemma 5.14 [Sch01] A Set-endofunctor weakly preserves pullbacks iﬀ it weakly preserves kernel pairs and pullbacks along injective maps.

6 Discussion

One motivation for this study was to provide a repository of examples of Set- endofunctors with particular combinations of preservation properties. This we achieve by parameterizing a certain class of functors with algebraic structures and translating the functorial properties into corresponding algebraic laws. For instance, theorem 5.13 can be used to obtain an example of a functor weakly preserving nonempty kernel pairs, but not weakly preserving nonempty pullbacks: Simply choose for M any nontrivial abelian group. (−) Of course, L-coalgebras and Mω -coalgebras as L-, resp. M-labeled transition systems are of independent interest. L-valued sets and relations are considered by Goguen in [Gog67]. In the book [FS90], Freyd and Scedrov consider the following operations on “L-valued relations” R : A × B →Land S : B × C →L: 2 (R ◦ S)(a, c):= {R(a, b) ∧ S(b, c) | b ∈ B}.

When L = {0, 1}, this agrees with the familiar composition of relations. The 201 Gumm and Schroder¨ authors remark that this operation is associative iff L is join infinitely distributive (JID), also called a locale in [Bor94]. R(−) L. Moss, in [Mos99], considers the following subfunctor of + : Q(X):={f : X → R | supp(f)finite, f(x)=1}. x∈X Coalgebras of this functor are stochastic transition systems([Mos99],[dVR99]). Moss invokes the “Row/Column-theorem” for R+, which is to say that R is (m, n)-refinable for each m, n > 1, to show that Q weakly preserves pullbacks. (He attributes the proof of the Row/Column theorem to Saley Aliyari). We have borrowed the term refinable from a classical line of algebraic investigation, asking for the existence of unique product decompositions of ∼ finite algebras. If A1 × ...×Am = B1 × ...×Bn are two representations of the same finite algebra as a product of indecomposables, one would like to ∼ conclude m = n and Bi = Aτ(i), for some permutation τ. It is easy to come up with examples of finite algebras that do not have unique decompositions. In such cases it may still be possible to prove a re- ∼ finement property:GiventhatA1 ×...×Am = B1 ×...×Bn, then each factor can be further decomposed into a product of smaller algebras Qi,j,untilone has the same collection of factors Qi,j on the left and on the right side. J.D.H. Smith has reminded us of a result of B. Jónsson and A. Tarski [JT47] which states that a class of algebras, amongst whose operations are a binary operation + and a constant 0, which is neutral with respect to + and idempotent for all fundamental operations, has the (m, n)-refinement property. This means that the class of all finite Jónsson-Tarski algebras, with the monoid structure given by the direct product (×)andwith{0} as neutral element, is a refinable monoid. Jónsson and Tarski needed the operations + and 0 to represent direct products as “inner products”. Without some such assumptions, refinement ∼ is not possible in general, for refinement implies cancellability: A×B = ∼ A×C =⇒B= C.WhenA has a 1-element subalgebra, cancellability holds, according to L. Lovász ([Lov67]), but otherwise, one needs to replace “isomorphy” by the weaker notion of “isotopy”, see [Gum77]. A refinement theorem up to isotopy for algebras in congruence modular varieties has been proved in [GH79].

References

[Bor94] F. Borceux, Handbook of categorical algebra 1: basic category theory, Cambridge University Press, 1994. [dVR99] E.P. de Vink and J.J.M.M. Rutten, Bisimulation for probabilistic transition systems: a coalgebraic approach, Theoretical Computer Science (1999), no. 211, 271–293. 202 Gumm and Schroder¨

[FS90] P. Freyd and A. Scedrov, Categories, allegories, Elsevier, 1990.

[GH79] H.P. Gumm and C. Herrmann, Algebras in modular varieties: Baer reﬁnements, cancellation and isotopy, Houston Journal of Mathematics 5 (1979), no. 4, 503–523.

[Gog67] J.A. Goguen, L-fuzzy sets, Journal of Mathematical Analysis and Applications (1967), no. 18, 145–174.

[Grä98] G. Grätzer, General lattice theory, Birkhäuser Verlag, 1998.

[GSa] H.P. Gumm and T. Schr¨oder, Coalgebras of bounded type, Submitted.

[GSb] H.P. Gumm and T. Schr¨oder, Products of coalgebras, Algebra Universalis, to appear.

[GS00] H.P. Gumm and T. Schr¨oder, Coalgebraic structure from weak limit preserving functors, CMCS (2000), no. 33, 113–133.

[Gum77] H.P. Gumm, A cancellation theorem for ﬁnite algebras, Coll. Math. Soc. J´anos Bolyai (1977), 341–344.

[Gum98] H.P. Gumm, Functors for coalgebras, Algebra Universalis, to appear, 1998.

[JT47] B. J´onsson and A. Tarski, Direct decompositions of ﬁnite algebraic systems, Notre Dame Mathematical Lectures (1947), no. 5.

[Lov67] L. Lov´asz, Operations with structures, Acta. Math. Acad. Sci. Hungar. 18 (1967), 321–328.

[Mos99] L.S. Moss, Coalgebraic logic, Annals of Pure and Applied Logic 96 (1999), 277–317.

[Pfe99] S. Pfeiffer, Funktoren für Coalgebren, Master Thesis, Universität Marburg, 1999.

[Rut00] J.J.M.M. Rutten, Universal coalgebra: a theory of systems, Theoretical Computer Science (2000), no. 249, 3–80.

[Sch01] T. Schr¨oder, Coalgebren und Funktoren, PhD-Thesis, Philipps-Universit¨at Marburg, 2001.

203 CMCS’01 Preliminary Version

Modal Operators for Coequations

Jesse Hughes

Dept. of Philosophy Carnegie Mellon University Pittsburgh, PA 15213 [email protected]

Abstract We present the dual to Birkhoﬀ’s variety theorem in terms of predicates over the carrier of a cofree coalgebra. We then discuss the dual to Birkhoﬀ’s completeness theorem, showing how closure under deductive rules dualizes to yield two modal operators acting on coequations. We discuss the properties of these operators and show that they commute, and we prove the invariance theorem, which is the formal dual of the completeness theorem.

1 Introduction

Jan Rutten’s development of the theory of coalgebras in [Rut96] provided a foundation for coalgebraic semantics for computer science. In addition, he proved the dual to Birkhoff’s variety theorem [Bir35] for coalgebras over Set. The covariety theorem states that a class V of coalgebras is closed under (regular) subcoalgebras, coproducts and codomains of epis just in case V is “coequationally definable”. The notion of a coequation and coequation satisfaction arises as the formal dual of sets of equations and equation satisfaction in categories of algebras. Peter Gumm and Tobias Schröder continued work on the duals of Birkhoff’s theorems for coalgebras over Set in [GS98], where they dualized the deductive completeness theorem as well. Namely, they showed that, given a regular injective coalgebra A, α, the partial order of quasi-covarieties definable by conditional coequations over A, α is isomorphic to the invariant subcoalgebras of A, α. Here, the notion of invariance arises as the dual of closure of sets of equations under substitution of terms for variables. In ibid, we also find the first discussion of “complete” or “behavioral” covarieties. These covarieties are definable by coequations over one “color” This research is part of the Logic of Types and Computation project at Carnegie Mellon University under the direction of Dana Scott. This is a preliminary version. The final version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Hughes or, equivalently, are the covarieties closed under total bisimulations. The work on coalgebraic specifications in [RJT01], for instance, involves giving models for classes in an object oriented language as behavioral covarieties in an appropriate category of coalgebras. Hence, we can understand this approach in terms of coequations over a single color. These coequations are dual to variable-free equations for a class of universal algebras, and so one has the idea that there is much more expressive power to exploit in the theory of coequations. We provide examples of coequations here which illustrate some of the expressive power available when one moves from behavioral covarieties to covarieties in general. See also [Ro¸s00] for a discussion of behavioral covarieties, called “sinks” there, and [AH00] or [Hug01] for a synthesis of the work of [GS98] and [Ro¸s00], as well as some further discussion of behavioral covarieties. In this paper, we develop the theory of coequations from a logical view- point. A coequation ϕ over a set C of colors is a regular subobject of UHC, the carrier of the cofree coalgebra over C. Hence, we can view ϕ as a predicate over UHC. In particular, we can form new coequations out of old by means of the logical connectives ∧, →, etc. Furthermore, we have available a modal operator ✷ taking a coequation ϕ to the (carrier of the) largest subcoalgebra ✷ϕ contained in the coequation. This modal operator is dual to taking a set E of equations to the least congruence containing E — hence, it is dual to closure of E under the first four rules of inference of Birkhoff’s equational logic. So we see that closure of E under deductive inferences is dual to the addition of related modal operators to Sub(UHC). We introduce a modal operator that is dual to closure under Birkhoff’s fifth rule of inference, substitution of terms for variables. We confirm that is an S4 operator and show that, under certain conditions, commutes with ✷. We then prove the invariance theorem in terms of and ✷. In this way, we develop the coequations-as-predicates view by augmenting the predicates over UHC with two modal operators ✷ and and show that the partial order of covarieties definable by coequations over C is isomorphic to the partial order of predicates ϕ over UHC such that ϕ = ✷ ϕ. In Section 2, we summarize the dual of Birkhoff’s variety theorem, introducing the relevant terminology and results. In Section 3, we generalize the covariety theorem to accommodate quasi-covarieties and conditional coequations. Section 4 is a categorical presentation of Birkhoff’s deductive completeness theorem and its dual, the invariance theorem. We discuss the well-known greatest subcoalgebra operator, ✷,inSection5andshowthatitisanS4 modal operator that commutes with pullbacks along homomorphisms. In Sec- tion 6, we introduce a second S4 operator, , taking a coequation to its largest invariant sub-coequation. This allows an easy proof of the invariance theorem in terms of the operators ✷ and in Section 7. This work forms part of the author’s doctoral dissertation, written under the supervision of Professors Dana S. Scott and Steve Awodey. Professor

205 Hughes

Scott suggested research into the dual of Birkhoff’s theorems, and that research and the presentation found here benefited from many conversations with both Professors Scott and Awodey. I also benefited from conversations with Jiˇr´ıAdámek, who pointed us to the Banaschewski and Herrlich article, Peter Gumm and Bart Jacobs.

2 The dual of Birkhoﬀ’s variety theorem

We begin with a brief summary of the dual of Birkhoﬀ’s variety theorem. This section summarizes the work found in [AH00], which can be viewed as a generalization of [Rut96] and [GS98]. A similar account of the covariety theorem can be found in [Kur00], and a similar categorical approach to the variety theorem for categories of algebras can be found in [BH76]. We start with some terminology. Recall that a morphism is a regular mono just in case it is the equalizer for some pair of maps, and that a subobject is regular in case its inclusion is a regular mono. In what follows, we call an object C regular injective if it is injective for regular subobjects; that is, if whenever B is a regular subobject of A,thenevery f :B /C can be extended to a (not necessarily unique) map

g :A /C such that the diagram 1 below commutes.

g / AO ~>C ~~ ~~ _LR~~ f B We say that a category E has enough regular injectives if, for every object A ∈E, there is a regular injective C such that A is a regular subobject of C. Definition 2.1 We say that a category E is quasi-co-Birkhoff if it is regularly well-powered, cocomplete and has epi-regular mono factorizations. If, in addition, E has enough regular injectives,thenE is co-Birkhoff. A full subcategory of a quasi-co-Birkhoff category is a quasi-covariety iff it is closed under coproducts and codomains of epis. A quasi-covariety of a co-Birkhoff category is a covariety iff it is also closed under regular subobjects. In fact, we could replace regular monos with strong monos throughout what follows and the results shown here would still apply. This entails weakening some assumptions (for instance, E needs only have epi-strong mono factorizations) while strengthening others (for instance, E needs enough strong 1 We use ,2/to denote regular monos. 206 Hughes injectives). We prefer to present the material in terms of regular monos, since there is a natural relationship between regular epis and sets of equations in the algebraic setting.

Example 2.2 The category EG of coalgebras for a comonad G = G, ε, δ forms a covariety in the category EG of coalgebras for the functor G. Given a map f :A /B in a category with epi-regular mono factorizations, we denote by Im(f) (read “the image of f) the object through which f uniquely (up to isomorphism) factors via an epi followed by a regular mono. We denote the partial order of regular subobjects of A by Sub(A). A map f :A /B induces a functor

f ∗ :Sub(B) /Sub(A), by pulling back a regular subobject of B along f. This functor has a left adjoint, / ∃f :Sub(A) Sub(B), which takes a regular subobject i:P ,2/A to Im(f ◦ i). Recall that an object A is orthogonal to an arrow f :B /C (written A ⊥ f) if, for every g :A /C , there is a unique map h:A /B such that g = f ◦ h 2 (see [Bor94, Volume 2]). Given a collection S ⊆E1 of arrows of E,theclass S⊥ ⊆E0 is the collection of all A such that, for all f ∈ S, A ⊥ f. The following theorem can be found in [AH00] or [Hug01].

Theorem 2.3 If C is a co-Birkhoff category, then V is a covariety iff V = S⊥ for some collection S of regular monos with regular injective codomains. One can show that, if G = G, ε, δ is a comonad on a quasi-co-Birkhoff category and G preserves regular monos, then EG inherits the epi-regular mono factorizations from E. We use this fact to prove the following. Theorem 2.4 Let G = G, ε, δ be a comonad on a (quasi-)co-Birkhoff category E and suppose that G preserves regular monos. Then EG is (quasi-)co- Birkhoff. In fact, Theorem 2.4 applies more generally than stated. If E is a quasi- co-Birkhoff category and Γ is any endofunctor that preserves regular monos, then the category EΓ of coalgebras for the endofunctor Γ is quasi-co-Birkhoff. We do not need cofree Γ-coalgebras for this result. Throughout what follows, we state our theorems in terms of coalgebras for a comonad, although we often indicate when the theorem applies to coalgebras for an endofunctor as well. The advantage of working with coalgebras for a comonad is that covarieties in EG are themselves comonadic over E,andso the results here may be “iterated”. Also, any category EΓ of coalgebras for 2 When we use the word collection, we allow that it is a proper class. We often abuse set notation and adopt it for classes in what follows. 207 Hughes an endofunctor is, given the presence of cofree coalgebras, equivalent to a category EG of coalgebras for a comonad (see [Tur96]). Since we often require cofree coalgebras in what follows, it’s reasonable to work with categories of coalgebras for a comonad. / / We let U :EG E (or U :EΓ E , resp.) denote the coalgebraic forgetful / / functor and H :E EG (H :E EΓ , if it exists, resp.) be the right adjoint of U.WeomitU when convenient, writing A for UA, α and just p for Up. Theorem 2.4 ensures that categories of coalgebras are (quasi-)co-Birkhoff, assuming that the base category is and that G preserves regular monos. Thus, the abstract co-Birkhoff theorem applies. In order to understand Theorem 2.3 in categories of coalgebras, we introduce the notion of coequation. Definition 2.5 Let C ∈Ebe regular injective. A coequation over C is a regular subobject ϕ ≤ GC(= UHC). We say that a coalgebra A, α satisfies ϕ (written A, α|= ϕ) just in case, for every homomorphism

p:A, α /HC

(equivalently, every “coloring” A /C ), there is a unique map

p:A /ϕ making the diagram below commute.

p / A GCO LR p !_ ϕ

If V is a class of coalgebras, we write V |= ϕ just in case each A, α∈V satisﬁes ϕ. In other words, A, α|= ϕ if, for every homomorphism

p:A, α /HC, we have Im(p) ≤ ϕ, or, equivalently, 1≤p∗ϕ. We similarly define, for each p:A, α /HC, A, α|= ϕ(p)iff Im(p) ≤ ϕ. Equivalently, following the presentation of [AN82] (also found in [AR94]), one could say that a coalgebra A, α satisfies a coequation ϕ over C just in case A, α is projective with respect to the inclusion ϕ ,2/UHC.Inthese terms, Theorem 2.3 says that any covariety is S-Proj for some collection S of regular monos with regular injective codomains. Acoequationϕ over C canbeviewedasapredicateoverGC.Thus,if Sub(GC) is a Heyting algebra, we can construct coequations ϕ ∧ ψ, ϕ → ψ, etc., and so we see that coequations over C come with a natural structure. 208 Hughes

Continuing this interpretation, if ϕ, ψ ∈ Sub(GC), we often write ϕ , ψ to mean ϕ ≤ ψ. It is easy to see that, if ϕ , ψ and A, α|= ϕ,thenA, α|= ψ. If we view coequations ϕ over C as predicates of a variable x of type GC, one may interpret pullback of coequations along homomorphisms

p:A, α /GC as substitution of p(y)(wherey is a variable of type A) for x, i.e., p∗ϕ = ϕ[p(y)/x]. Thus, A, α|= ϕ just in case, for every homomorphism p,wehave 1,ϕ[p(y)/x]. Remark 2.6 In the case of equations, one can easily distinguish between single equations and sets of equations. Gumm makes a similar distinction between single coequations and sets of coequations in [Gum01], by interpreting coequation satisfaction as an exclusionary condition, we prefer to keep the definition of satisfaction above, in keeping with our view of coequations as predicates. Hence, we do not distinguish between single coequations and sets of coequations. This notion of coequation allows a more familiar statement of the dual of Birkhoff’s variety theorem. Theorem 2.7 Suppose E is co-Birkhoff and G preserves regular monos. Then a full subcategory V of EG is a covariety iff there is a collection S of coequations such that for all A, α,

A, α∈V iﬀ ∀ϕ ∈ S A, α|= ϕ.

If, furthermore, G is bounded by C, then for each covariety V,thereisa coequation ϕ over C such that

A, α∈V iﬀ A, α|= ϕ.

The deﬁnition of a bounded functor can be found in [Rut96] or [GS98], for instance, where Theorem 2.7 is proved for coalgebras over Set.Aproofof this theorem in a more general setting can be found in [Hug01] or [Kur00]. The following corollary is a generalization of Theorem 12 from[Jac95], where the author proves it for a restricted class of coequations over Set, namely those coequations that arise as equalizers of a pair of terms related to the functor G. Corollary 2.8 Let E be co-Birkhoﬀ and G preserves regular monos, and let V be a covariety of EG. Then the forgetful functor

V /E is comonadic. Moreover, the associated comonad preserves regular monos and so V is again co-Birkhoﬀ. 209 Hughes

Proof. The forgetful functor V /E is the composite

UV / U / V EG E .

To show that this composite is comonadic, it suﬃces to show (by the dual of [Bor94, Volume 2, Theorem 4.4.4] that the following hold:

(i) U ◦ UV has a right adjoint;

(ii) U ◦ UV reﬂects isomorphisms;

(iii) U ◦ UV creates equalizers of pairs

• f /• g

such that U ◦ UVf, U ◦ UVg have a split equalizer in E. Condition (i) follows from Theorem 3.4, below. Condition (ii) is easily veriﬁed and (iii) follows from the same condition for U and the fact that UV creates equalizers. ✷ Remark 2.9 In the examples that follow, we prefer to describe the coalgebras as coalgebras for an endofunctor, rather than coalgebras for a comonad. Because these examples involve categories E in which the forgetful functor Γ ∼ has a right adjoint, there is a comonad G such that EΓ = EG [Tur96] and hence the previous results apply. Example 2.10 Fix a set of “inputs”, I and let Γ:Set /Set be deﬁned by

I ΓS =(PfinS) , where Pfin is the covariant finite powerset functor. A Γ-coalgebra S, σ can be regarded as a non-deterministic automaton over I, where the structure map gives the transition function. Explicitly, for each state s ∈ S and each input i ∈I, we write s i /s just in case s ∈ σ(s)(i). The deterministic automata are those automata S, σ such that, for each s ∈ S and each i ∈I, there is at most one s such that s i /s .LetDet denote the class of deterministic automata, so Det ⊆ SetΓ.Thenitiseasytosee that Det is a covariety in SetΓ. In fact, one can show that there is a coequation ϕ over 2 colors that defines Det. Namely, define ϕ ⊆ UH2by

ϕ = {x ∈ UH2 |∀i ∈I∀y, z ∈ δ(x)(i) .ε2(y)=ε2(z)},

∼ where δ :UH2 = /ΓUH2 is the structure map for H2. Then, it is easy to show that A, α|= ϕ iﬀ A, α∈Det . 210 Hughes

Example 2.11 Fix a set Z and let Γ:Set /Set be the functor

ΓX = Z × X.

Any Γ-coalgebra A, α can be viewed as a collection of streams over Z, then, in which the same stream may be multiply represented as elements of A. The cofree coalgebra HN is the ﬁnal N × Z ×−coalgebra – i.e., HN = (N × Z)ω. Given an element σ ∈ HN, we can deﬁne

Col(σ)={π1 ◦ σ(i) | i<ω}

i (equivalently, Col(σ)={εN ◦ t (σ) | i<ω},wheret is the tail destructor). In other words, Col(σ) is the set of all colors that occur in the stream σ.Deﬁne acoequationϕ over N by

ϕ = {σ | card(Col(σ)) < ℵ0},

(where card(X) is the cardinality of X)soσ ∈ ϕ just in case only ﬁnitely many colors occur in σ. One can check that, for any Γ-coalgebra A, α,wehaveA, α|= ϕ just in case, for all a ∈ A,thereisn ≥ 0, m>0 such that

tn(a)=tn+m(a),

(where α = h, t). In other words, A, α|= ϕ iﬀ each stream in A has only a ﬁnite number of “states”. Remark 2.12 If one is interested not in equality of states, but in the observable behavior of streams, then one might require instead that, for every a ∈ A, there is n ≥ 0, m>0 such that for all i ≥ 0,

h ◦ tn+i(a)=h ◦ tn+m+i(a).

This condition can be speciﬁed by a coequation over 1 color. Remark 2.13 One can easily generate other interesting coequations using Example 2.11. First, it’s easy to see that the same idea can be used with polynomial functors in general. Second, one can require that each state begins repeating within n applications of the destructors by replacing ℵ0 with n in the deﬁnition of ϕ.

3 Conditional coequations

In Definition 2.5, we defined a coequation ϕ over C as a regular subobject ϕ ,2 /UHC 211 Hughes in E. In this section, we generalize the notion of coequation to include regular subobjects ϕ ,2 /A, α where A, α is an arbitrary coalgebra. Definition 3.1 A conditional coequation over A, α is any regular subobject ϕ ≤ A = UA, α.WesaythatB, β|=α ϕ (or just B, β|= ϕ)ifandonly if, for every homomorphism

p:B, β /A, α,

Im(p) ≤ ϕ. We sometimes drop the word “conditional” and refer to ϕ ≤ A as a coequation over A, α. We adopt the name conditional coequation because the semantics introduced in Deﬁnition 3.1 arise from the dual of conditional equations in the algebraic case. Given two coequations, ϕ and ψ,overC,wesaythatB, β|= ϕ ⇒ ψ just in case, for every

p:B, β /HC, if B, β|= ϕ(p), then B, β|= ψ(p). (In [Kur99] and [Kur00], ϕ ⇒ ψ is denoted ϕ/ψ.) Now, for any pair of coequations ϕ and ψ over C, there is a coalgebra A, α and a conditional coequation ϑ over A, α such that, for all B, β,

B, β|= ϕ ⇒ ψ iﬀ B, β|=α ϑ.

Namely, we can take A, α =[ϕ]HC (see Section 5 for the deﬁnition of [−]) and ϑ = A ∧ ψ. On the other hand, given a conditional coequation ϑ over A, α,wecanviewbothϑ and A as coequations over A — that is, as subobjects of UHA. It is easy to check that

B, β|=α ϑ iﬀ B, β|= A ⇒ ϑ. Remark 3.2 Given coequations ϕ and ψ over C, one can consider the coequation ϕ → ψ over C,where→ is the exponential in Sub(UHC). One can show that, if A, α|= ϕ → ψ,thenA, α|= ϕ ⇒ ψ, but the converse does notholdingeneral. Example 3.3 Let Γ− = −×−and let A = {a, b}.Let

∼ = / εA,l,r:UHA A × UHA× UHA be the structure map of HA. Deﬁne coequations ϕ and ψ over A by

ϕ = {σ ∈ UHA | σ = l(σ)}, ψ = {σ ∈ UHA | σ = r(σ)}.

212 Hughes

Let α(a)=b, b and α(b)=b, a.ThenA, α|= ϕ ⇒ ψ, but A, α|= ϕ → ψ.

Conditional coequations provide a means of interpreting the co-quasivariety theorem, below. As before, we ﬁrst state an abstract version of the quasi-variety theorem and then interpret the theorem in categories of coalgebras. The proof of Theorem 3.4 and its corollaries can be found in [AH00]. The theorem also was proven independently by Alexander Kurz in [Kur00]. Theorem 3.4 Let C be a quasi-co-Birkhoﬀ category and V a full subcategory of C. The following are equivalent. (i) V is a quasi-covariety. (ii) The inclusion U V :V /C has a right adjoint HV such that each compo- V / V V nent of the counit ε :1C U H is a regular mono.

(iii) V = S⊥ for some collection S of regular monos. Corollary 3.5 Let C be a quasi-co-Birkhoﬀ category and V a quasi-covariety of C.Then (i) The inclusion U V :V /C has a right adjoint HV. V / V V (ii) The unit η :idV H U is an isomorphism. ∈C ∈ ⊥ V V (iii) For each C , C V iﬀ C εC ,whereε is the counit of the adjunction U V 2 HV. V V V V (iv) The corresponding comonad, G = U H ,ε,U ηHV , is idempotent. (v) The comonad GV preserves regular monos.

The following corollary restates the results of Theorem 3.4 for categories of coalgebras in terms of conditional coequations. Corollary 3.6 Let E be quasi-co-Birkhoﬀ and let Γ:E /E be an functor that preserves regular monos. A full subcategory V of EΓ is a quasi-covariety just in case there is a collection S of conditional coequations such that

B, β∈V iﬀ ∀ϕ ∈ S B, β|= ϕ.

The same claim holds if we replace the endofunctor Γ with a comonad G.

4Deductive completeness and invariance

We focus now on Birkhoff’s completeness theorem. Whereas the variety theorem gives an equivalence between closure conditions on classes of algebras and equational definability, the completeness theorem states an equivalence between deductively closed sets of equations and equational theories for classes of algebras. We first recall the completeness theorem in the classical setting. Let Σ be a signature and Γ the associated polynomial functor (so that 213 Hughes

Alg(Σ) = SetΓ), and let F :Set /SetΓ be the left adjoint of the forgetful functor U :SetΓ /Set.Wesaythataset of equations E over X (i.e., a subset of UFX × UFX)isclosed if it satisﬁes the following: (i) For each x ∈ X, x = x ∈ E;

(ii) For each t1 = t2 ∈ E, t2 = t1 ∈ E;

(iii) If t1 = t2 ∈ E and t2 = t3 ∈ E, then t1 = t3 ∈ E; (iv) For each function symbol f (n) ∈ Σ, and each n-tuple of equations,

s1 = t1,...,sn = tn,

(n) (n) in E, the equation f (s1,...,sn)=f (t1,...,tn) ∈ E. (v) E is closed under substitution of terms for variables. That is, for each t1 = t2 ∈ E, t ∈ UFX, x ∈ X,

t1[t/x]=t2[t/x] ∈ E. Theorem 4.1 (Birkhoﬀ’s completeness theorem) A set of equations E is the equational theory for some class V of Σ-algebras just in case E is closed.

We say that a (binary) relation E over UFX is stable just in case, for every homomorphism f :FX /FX, the image of E under f is contained in E, i.e.,

∃f E ≤ E.

In categorical terms, then, a set E of equations over X is closed just in case (i’) E is a congruence; (ii’) E is stable. The notion of stable sets of equations dualizes to a notion of invariant 3 coequations. This definition is first found in [GS98]. Definition 4.2 Let A, α be a G-coalgebra. A regular subobject ϕ of A is invariant just in case, for every homomorphism

p:A, α /A, α,

3 The use of the word “invariant” in coalgebra is a bit overloaded. We adopt the term from [GS98], but others ([Jac99], for instance) use “invariant” to mean “admits a structure map”, i.e., is the carrier of a subcoalgebra. Perhaps a better term for the latter concept is coinductive predicate, since such predicates are the analogues of inductive predicates for algebras. 214 Hughes the image of ϕ under p is contained in ϕ, i.e.,

∃pϕ ≤ ϕ. Remark 4.3 If A, α is a subcoalgebra of the ﬁnal coalgebra, then any conditional coequation ϕ over A, α is invariant. Given a coequational variety

V = {B, β|B, β|= ψ}, we are interested in the minimal coequation ϕ such that V |= ϕ. Such minimal coequations can be viewed as generating the collection of coequations that V satisfies, in the sense that, for any coequation ϑ,ifV |= ϑ,thenϕ , ϑ.In this sense, the minimal coequation represents the coequational theory of V — it represents the coequational commitment that V entails. This intuition motivates the following definition. Definition 4.4 Let ϕ be a (conditional) coequation over A, α and V acol- lection of coalgebras. We say that ϕ is the generating (conditional) coequation for V just in case (i) V |= ϕ; (ii) For any conditional coequation ψ over C,ifV |= ψ then ϕ , ψ. Theorem 4.5 (Invariance theorem) Acoequationϕ over C is the generating coequation for some collection V of coalgebras just in case ϕ is an invariant subcoalgebra of HC. We postpone the proof until we’ve defined the modal operators ✷ and . The invariance theorem first arises in [GS98], where it is proved for coalgebras over Set. The theorem is stated in different terms in their work, since it is not motivated by the coequation-as-predicate view that we take here.

5 The subcoalgebra operator

In what remains, we construct the modal operators that are used in the proof of the invariance theorem, and prove some basic results regarding these operators. Throughout what follows, we assume that E is co-Birkhoff and has pullbacks and that G preserves regular monos and pullbacks of regular monos, so that EG is co-Birkhoff and U creates pullbacks of regular monos (and, in particular, finite intersections). We further assume that, for each A ∈E, Sub(A) is a Heyting algebra. In this section, we introduce the modal operator ✷. Given a subobject ϕ of A = UA, α, ✷ϕ is the greatest subcoalgebra of A contained in ϕ.The construction is well-known, although the view that ✷ is a modal operator is perhaps less familiar. The ✷ operator is discussed in [Jac99], where it plays a 215 Hughes central role. It is from that work that we take the view of ✷ as a “henceforth” operator. / Since the coalgebraic forgetful functor U :EG E preserves regular monos, there is an induced forgetful functor,

/ Uα :Sub(A, α) Sub(A), from the partial order of regular subobjects of A, α to the partial order of regular subobjects of A. Asiswellknown,Uα has a right adjoint, which we denote [−]α (dropping the subscripts whenever convenient). The right adjoint maps a subobject B ≤ A to the largest subcoalgebra contained in B.More precisely, 2 [B]= {C, γ≤A, α|C ≤ B}.

Here, we use the fact that Uα creates joins. Alternatively, one may deﬁne [B] as the pullback shown below.

,2 / [B] HB_ _ _ A, α ,2 /HA

This adjoint pair yields a modal operator

/ ✷α :Sub(A) Sub(A), as usual, by taking the composite ✷α =[−]α◦Uα. Again, we drop the subscript when convenient. Theorem 5.1 ✷ is an S4 necessity operator, i.e., satisﬁes the following: (i) If ϕ , ψ,then✷ϕ , ✷ψ (ii) ✷ϕ , ϕ (iii) ✷ϕ , ✷✷ϕ (iv) ✷(ϕ → ψ) , ✷ϕ → ✷ψ

Proof. Condition (i) is just functoriality, and conditions (ii) and (iii) are just the counit and comultiplication for the comonad ✷.

The last item follows from the fact that Uα preserves meets, and hence so does ✷. The argument for (iv) from this is standard, but we include it here. By (i), we have ✷((ϕ → ψ) ∧ ϕ) , ✷ψ, and, hence, ✷(ϕ → ψ) ∧ ✷ϕ , ✷ψ. Therefore, ✷(ϕ → ψ) , ✷ϕ → ✷ψ. ✷ 216 Hughes

Theorem 5.2 ✷ is stable under pullback along homomorphisms. That is, for any f :A, α /B, β, we have ∗ ∗ ✷α ◦ f = f ◦ ✷β. Proof. The bottom face in Figure 1 commutes, since f is a homomorphism. The front and rear faces are pullbacks by deﬁnition of ✷, and the right face is a pullback since G preserves pullbacks along regular monos by assumption. Hence, the left face is a pullback. ✷

∗ ,2 / ∗ ✷f_ P? Gf_ P? ?? ?? ?? ?? ?? ?? ?? ?? ? ? ,2 / ✷P GP_ _ 

 ,2 / A ?? GA?$ ?? ?? ?? ?? ?? ?? ?? ?? B ,2 /GB

Fig. 1. ✷ commutes with pullback along homomorphisms.

Theorem 5.2 can be understood as a statement about substitution of terms for variables. Namely, we view conditional coequations ϕ over A, α as predicates of a single variable x of type A. Then, Theorem 5.2 says that, for any homomorphism f :B, β /A, α, and any variable y of type B,wehave

(✷ϕ)[f(y)/x]=✷(ϕ[f(y)/x]).

Thus, ✷ is stable under substitutions of terms built from homomorphisms for variables. (It is not stable under substitution of arbitrary terms for variables, however.)

6 The invariance operator

We apply the same approach to invariant coequations as in Section 5. That is, we first define an adjoint pair (a Galois correspondence) between the coequations over A, α and the invariant coequations. Then, we use this pair to define a modal operator on coequations over A, α. 217 Hughes

Accordingly, let Inv(α) denote the full subcategory of Sub(A) consisting of the invariant coequations over A, α,andlet

/ Iα :Inv(α) Sub(A) be the inclusion functor.

Theorem 6.1 Iα has a right adjoint. Proof. Let ϕ ≤ A and deﬁne

/ Pϕ = {ψ ≤ A |∀p:A, α A, α(∃pψ ≤ ϕ)}.

/ We deﬁne a functor Jα :Sub(A) Sub(A)by 2 Jα(ϕ)= ψ, ψ∈Pϕ omitting the subscripts when convenient. We ﬁrst show that Jϕ is invariant. Let

r :A, α /A, α be given. In order to show that ∃rJϕ ≤ Jϕ, it suﬃces to show that ∃rJϕ ∈ Pϕ, / i.e., for every homomorphism p:A, α A, α, we have ∃p(∃rJϕ) ≤ ϕ.A quick calculation shows 2 2 ∃p∃rJϕ = ∃p◦r ψ = ∃p◦rψ ≤ ϕ. ψ∈Pϕ ψ∈Pϕ

Next, we show that I 2 J.Letψ be invariant. If ψ ≤ ϕ, then, for every endomorphism p,

∃pψ ≤ ψ ≤ ϕ, so ψ ∈ Pϕ and hence ψ ≤ Jϕ. On the other hand, if ψ ≤ Jϕ,then

ψ ≤ Jϕ ≤ ϕ.

✷

Let α = IαJα. We confirm that is an S4 operator. Again, it suffices to show that preserves meets. Theorem 6.2 is an S4 necessity operator. Proof. Again, since is a comonad, it suffices to show that preserves meets, or, more specifically, that

ϕ ∧ ψ , (ϕ ∧ ψ). 218 Hughes

Let p:A, α /A, α be given (where ϕ and ψ are conditional coequations over A, α). Then ∃p(ϕ ∧ ψ) ≤∃p ϕ ≤ ϕ and, similarly, ∃p(ϕ ∧ ψ) ≤ ψ. Hence, ∃p(ϕ ∧ ψ) ≤ ϕ ∧ ψ.Sincep was an arbitrary homomorphism, ϕ ∧ ψ , (ϕ ∧ ψ). ✷ Remark 6.3 Unlike ✷, the operator does not commute with pullbacks along homomorphisms. Let Γ:Set /Set be the identity functor. We will consider a coequation ϕ over 2 colors, that is, a subset of UH2=2ω, the set of streams over 2. Speciﬁcally, let

ϕ = {0, 1}, where 0 and 1 are the constant streams. Note that ϕ is invariant. Let p:H3 /H2 be the homomorphism induced by the coloring p:3 /2, where p(0) = 0, p(1) = 0, p(2) = 1 (i.e., p = H(p)). Then p∗ϕ is the set

{σ ∈ 3ω |∀nσ(n) < 2}∪{2}.

It is easy to check that

p∗ϕ = {0, 1, 2}= p∗(ϕ)=p∗ϕ.

In terms of substitutions, then, it is not the case that, for every homomorphism f :B, β /A, α, (ϕ)[f(y)/x]=(ϕ[f(y)/x]). We return to the examples of Section 2 to give some idea of how works. In those examples, the coequations over C were described in terms of the coloring εC . Typically, takes a coequation defined in terms of colorings to a similar coequation defined in terms of equality of states, as these examples illustrate. I Example 6.4 Let ΓS =(PfinS) , as in Example 2.10. Recall that the class of deterministic automata Det forms a covariety of SetΓ, where the defining coequation ϕ over 2 is given by

ϕ = {x ∈ UH2 |∀i ∈I∀y, z ∈ σ(x)(i) .ε2(y)=ε2(z)}. It is easy to show that

ϕ = {x ∈ UH2 |∀i ∈I∀y, z ∈ σ(x)(i) .y = z}, or, more simply,

ϕ = {x ∈ UH2 |∀i ∈I. card(σ(x)(i)) < 2}. 219 Hughes

Example 6.5 Recall the functor ΓX = Z × X and the coequation ϕ over N deﬁned by ϕ = {σ | card(Col(σ)) < ℵ0}, from Example 2.11. For each σ ∈ UHN,let

St(σ)={tn(σ) | n ∈ ω},

∼ = / where εN,h,t:UHN N × Z × UHN is the structure map for HN.Then

ϕ = {σ | card(St(σ)) < ℵ0}.

7 Generating coequations

We return to the invariance theorem. To begin, we show that, for any ϕ over A, α, ϕ and ✷ϕ have the same expressive power as ϕ – i.e., deﬁne the same quasi-covariety.

Theorem 7.1 Let A, α be given. For every ϕ ∈ Sub(A), B, β∈EG,

B, β|= ϕ iﬀ B, β|= ϕ.

Proof. Since ϕ , ϕ, one direction is trivial. Suppose, then, that B, β|= ϕ.Let p:B, β /A, α be given. To show that Im(p) ≤ ϕ, we will show that, for every

r :A, α /A, α,

∃r Im(p) ≤ ϕ.But,∃r Im(p)=Im(r ◦ p) ≤ ϕ, since B, β|= ϕ. ✷

Theorem 7.2 Let A, α be given. For every ϕ ∈ Sub(A), B, β∈EG,

B, β|= ✷ϕ iﬀ B, β|= ϕ.

Proof. Again, one direction is trivial. Let B, β|= ϕ and let

p:B, β /A, α be given. Then Uα Im(p)=Im(Up) ≤ ϕ and so, by the adjunction Uα 2 [−]α, Im(p) ≤ [ϕ]α.Thus,

Im(Up)=Uα Im(p) ≤ Uα[ϕ]α = ✷αϕ. ✷

Lemma 7.3 Let ϕ be a coequation over C. Then the coalgebra [ϕ] satisﬁes the coequation ϕ. 220 Hughes

Proof. Let p:[]ϕ /HC be given. Because HC is regular injective, p extends to a homomorphism HC /HC, as shown below. Hence, because ✷ ϕ<ϕ and ϕ is invariant, there is a unique map ✷ ϕ / ϕ making the square and thus the lower triangle commute, as desired.

/ O 9 O UHC sUHCs ss ssp _LR ss _LR ✷ ϕ /ϕ ✷ Theorem 7.4(Invariance theorem) Acoequationϕ over C is the generating coequation for some collection V of coalgebras just in case ϕ is an invariant subcoalgebra of HC, i.e., ϕ = ✷ ϕ. Proof. Let ϕ = ✷ ϕ and deﬁne

V = {B, β|B, β|= ϕ}.

Then, clearly, V |= ϕ. We will show that, if V |= ψ,thenϕ , ψ.But,from Lemma 7.3, we know that [ϕ]=[ϕ]isinV. Consequently, []ϕ |= ψ and hence ϕ = ∃id✷ ϕ , ψ. ✷ Remark 7.5 The same claim and proof holds for conditional coequations over A, α where A, α is regular injective or A, α is an invariant subcoalgebra of HA. That is, a conditional coequation ϕ over such A, α is a generating coequation for some class V just in case ϕ = ✷ ϕ.

Remark 7.6 Let ϕ be a coequation over C and Vϕ the covariety it deﬁnes. ϕ / ϕ Let U :Vϕ EG be the inclusion and H right adjoint to V (as in Corol- lary 3.5). Then one can show that

UUϕHϕHC = ✷ ϕ. / I Example 7.7 Consider again the functor Γ:Set Set where ΓS =(PﬁnS) and the coequation ϕ deﬁned by

ϕ = {x ∈ UH2 |∀i ∈I∀y, z ∈ σ(x)(i) .ε2(y)=ε2(z)}.

We claimed in Example 6.4 that

ϕ = {x ∈ UH2 |∀i ∈I∀y, z ∈ σ(x)(i) .y = z}.

We write s /s if there is an i such that s i /s and we write ∗/ for the transitive closure of /. One can further show that

✷ ϕ = {x ∈ UH2 |∀w.x ∗/w →∀i ∈Icard(σ(w)(i)) < 2}. 221 Hughes

By Theorem 7.4, ✷ ϕ is the generating coequation for Det,theclassof deterministic automata. Theorem 7.8 For any coalgebra A, α,

✷ ≤ ✷.

Proof. By deﬁnition of , it suﬃces to show that, for every homomorphism / p:A, α A, α, ∃p✷ ϕ ≤ ✷ϕ. We know that, for every p, ∃p✷ ϕ ≤ ∃p ϕ ≤ ϕ. Thus, since Uα commutes with ∃p,

Uα∃p[ϕ]α = ∃pUα[ϕ]α ≤ ϕ, and so ∃p[ϕ]α ≤ [ϕ]α.Thus,

∃p✷ ϕ = Uα∃p[ϕ]α ≤ Uα[ϕ]α = ✷ϕ. ✷

We can prove that ✷ commutes with given further assumptions. Namely, if the modal operator ✷ has a left adjoint ,then✷ = ✷. The existence of such an adjoint arises naturally, given that the comonad G preserves nonempty intersections. In this case, the subcoalgebra forgetful functor Uα has a left adjoint, / Fα :Sub(A) Sub(A, α), takingasubobjectϕ to the least subcoalgebra B, β such that ϕ ≤ B.The closure operator α is the composite UαFα. See [Gum98b] for a discussion of functors which preserve non-empty intersections and an example of a functor which does not have this property. See also [Jac99] for a discussion of the closure operator , where it is denoted α α ⇐ (and ✷ is denoted α). ⇒

Theorem 7.9 If ✷α has a left adjoint, α,then✷ = ✷. Proof. To show that ✷ ≤ ✷, it is sufficient (by the adjunction 2 ✷)to show that ✷ ≤ . Let ϕ ≤ A = UA, α. We will show that, for every homomorphism / p:A, α A, α, ∃p ✷ϕ ≤ ϕ and conclude (by definition of )that ✷ϕ ≤ ϕ. Again, by the adjunctions, it suffices to show that

✷ϕ ≤ ✷p∗ϕ = p∗✷ϕ, or, equivalently, ∃p ✷ϕ ≤ ✷ϕ. This is immediate from the definition of .✷ One suspects that Theorem 7.9 does not depend on the existence of the closure operator — that is, there should be a proof that ✷ = ✷ that does not require an adjoint to the modal operator ✷.Atthistime,weare 222 Hughes unaware of such a proof. Nor do we have an example of a coequation ϕ in some EG for appropriate G such that ✷ϕ>✷ ϕ. In any case, one finds that for many functors of interest (polynomials, finite powerset, etc.), the operator ✷ does have an adjoint and so the assumptions above are not as limiting as one might suspect. Example 7.10 Let ΓX = Z × X and consider again the coequation ϕ over N from Example 2.11, where

ϕ = {σ | card(Col(σ)) < ℵ0}.

It is easy to check that ✷ϕ = ϕ,andso✷ϕ = ϕ (which was deﬁned in Example 6.5) is the least coequation over N such that A, α|= ϕ just in case A, α|= ϕ.

8 Future research

We have tried to develop the idea of “coequation-as-predicate” here. This approach naturally gives a means of constructing new coequations out of old, by using the standard logical operators ∧, ¬, ∃, etc., as well as the modal operators ✷ and . We have shown that, for any coequation ϕ, the covariety ϕ defines is just the same covariety that ✷ϕ and ϕ defines. It is also obvious that the covariety ϕ∧ψ defines is the intersection of the covarieties defined by ϕ and ψ. One would like to investigate the relation between the other logical operators (especially the quantifiers) and the partial order of covarieties. One would also like to investigate the apparent inequality between ✷ and ✷. Theorem 7.9 showed that for any functor G which preserves intersections, ✷ = ✷, but it’s not clear that the assumption is really necessary. To this end, it is instructive to consider the dual case. One supposes that closing a set of equations under the congruence conditions followed by stability always yields the same set as closure under stability followed by the congruence conditions, but perhaps there is a technical detail one needs to prove this (maybe an assumption true for all categories of algebras over Set,even).

References

[AGM92] S. Abramsky, Dov M. Gabbay, and T. S. E Maibaum, editors. Handbook of Logic in Computer Science, volume 1. Oxford Science Publications, 1992.

[AH00] Steve Awodey and Jesse Hughes. The coalgebraic dual of Birkhoﬀ’s variety theorem. Technical Report CMU-PHIL-109, Carnegie Mellon University, Pittsburgh, PA, 15213, November 2000.

[AN82] H. Andr´eka and I. N´emeti. A general axiomatizability theorem formulated in terms of cone-injective subcategories. In Proc. Conf. 223 Hughes

“Universal Algebra Esztergom 1977”, pages 13–15. North-Holland, Amsterdam, 1982.

[AR94] Jiˇr´ıAd´amek and Jiˇr´ıRosick´y. Locally Presentable and Accessible Categories, volume 189 of London Mathematical Society Lecture Note Series. Cambridge University Press, 1994.

[BH76] B. Banaschewski and H. Herrlich. Subcategories deﬁned by implications. Houston Journal of Mathematics, 2(2), 1976.

[Bir35] G. Birkhoﬀ. On the structure of abstract algebras. Proceedings of the Cambridge Philosophical Society, 31:433–454, 1935.

[Bor94] F. Borceux. Handbook of Categorical Algebra I-III. Cambridge University Press, 1994.

[Gr¨a68] George Gr¨atzer. Universal Algebra. D. Van Nostrand Company, Inc., 1968.

[GS98] H. Peter Gumm and Tobias Schr¨oder. Covarieties and complete covarieties. In Jacobs et al. [JMRR98], pages 43–56.

[Gum98a] H. Peter Gumm. Equational and implicational classes of coalgebras, 1998. Extended Abstract submitted for RelMiCS’4.

[Gum98b] H. Peter Gumm. Functors for coalgebras. Preprint, 1998.

[Gum01] H. Peter Gumm. Birkhoﬀ’s variety theorem for coalgebras. Cont. to General Algebra 2000, To appear 2001.

[Hug01] Jesse Hughes. A Study of Categories of Algebras and Coalgebras.PhD thesis, Carnegie Mellon University, 2001.

[Jac95] B. Jacobs. Mongruences and cofree coalgebras. Lecture Notes in Computer Science, 936, 1995.

[Jac99] B.P.F. Jacobs. The temporal logic of coalgebras via Galois algebras. Technical Report CSI-R9906, Computing Science Institute, April 1999.

[JMRR98] Bart Jacobs, Larry Moss, Horst Reichel, and Jan Rutten, editors. ENTCS, volume 11. Coalgebraic Methods in Computer Science (CMCS’1998), 1998.

[JR97] Bart Jacobs and Jan Rutten. A tutorial on (co)algebras and (co)induction. Bulletin of the European Association for Theoretical Computer Science, 62, 1997.

[Kur98] Alexander Kurz. A co-variety-theorem for modal logic. Proceedings of Advances in Modal Logic, 1998.

[Kur99] Alexander Kurz. Modal rules are co-implications. Draft. Available at http://helios.pst.informatik.uni-muenchen.de/~kurz, February 1999. 224 Hughes

[Kur00] Alexander Kurz. Logics for Coalgebras and Applications for Computer Science. PhD thesis, Ludwig-Maximilians-Universit¨at M¨unchen, 2000.

[Man76] Ernie Manes. Algebraic Theories. Number 26 in Graduate Texts in Mathematics. Springer-Verlag, 1976.

[MT92] K. Meinke and J. V. Tucker. Universal algebra. In Abramsky et al. [AGM92], pages 189–412.

[RJT01] J. Rothe, B. Jacobs, and H. Tews. The coalgebraic class speciﬁcation language ccsl. Journal of Universal Computer Science, To appear 2001.

[Ro¸s00] Grigore Ro¸su. Equational axiomatizability for coalgebra. Theoretical Computer Science, 260, 2000. Scheduled to appear.

[Rut96] J.J.M.M. Rutten. Universal coalgebra:a theory of systems. Technical Report CS-R9652, Centrum voor Wiskunde en Informatica, 1996.

[Tur96] Daniele Turi. Functorial Operational Semantics and its Denotational Dual. PhD thesis, Free University, Amsterdam, June 1996.

225 CMCS’01 Preliminary Version

Two-dimensional linear algebra

Martin Hyland

Department of Pure Mathematics and Mathematical Statistics University of Cambridge Cambridge, ENGLAND

John Power 1

LFCS, Division of Informatics University of Edinburgh King’s Buildings, Edinburgh EH9 3JZ, SCOTLAND

Abstract We introduce two-dimensional linear algebra, by which we do not mean two-dimensional vector spaces but rather the systematic replacement in linear algebra of sets by categories. This entails the study of categories that are simultaneously categories of algebras for a monad and categories of coalgebras for comonad on a category such as SymMons, the category of small symmetric monoidal categories. We outline relevant notions such as that of pseudo-closed 2-category, symmetric monoidal Lawvere theory, and commutativity of a symmetric monoidal Lawvere theory, and we explain the role of coalgebra, explaining its precedence over algebra in this setting. We outline salient results and perspectives given by the dual approach of algebra and coalgebra, extending to two dimensions the study of linear algebra.

1 Introduction

Fundamental to the development of linear algebra are extensions of the fact that for any commutative ring R, the forgetful functor U : R-Mod −→ Ab from the category of R-modules to the category of abelian groups has both left and right adjoints. The left adjoint sends an abelian group A to the tensor product R ⊗ A in Ab,withR-action induced by the multiplication of R.The right adjoint sends an abelian group A to the R-module [R, A]givenbythe set of abelian group morphisms from R to A, with the pointwise abelian group

1 This work is supported by EPSRC grant GR/M56333:The structure of programming languages :syntax and semantics, and a British Council grant and the COE budget of STA Japan. This is a preliminary version. The final version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Hyland and Power structure and with the action of R on [R, A] induced by precomposition. These adjoints can be expressed as left and right Kan extensions respectively, if one enriches in the symmetric monoidal closed category of abelian groups, because a commutative ring is exactly a commutative monoid in the category Ab.The two adjoints exhibit R-Mod as both the category of algebras for a monad on Ab and also as the category of coalgebras for a comonad on Ab. The above suggests that linear algebra is a potential area of application of the study of coalgebras. But it does not, a priori, imply that linear algebra is relevant to computer science. However, now suppose we drop the assumption of inverses in the definition of a group. Then, abelian groups above would be replaced by commutative monoids, rings would be replaced by semirings, and R-modules remain the same except that the underlying object need only be a commutative monoid rather than an abelian group. Having done this, consider a second step, generalising from sets to categories. Some of the equalities of commutative monoids are most naturally replaced by coherent isomorphisms, so let us assume we systematically do that. Abelian groups, which became commutative monoids above, now become small symmetric monoidal categories. One can prove that the category of small symmetric monoidal categories does not have a coherent symmetric monoidal closed structure on it, systematically generalising that of Ab, but it does have a pseudo-closed structure, which, with care, should suffice for our purposes. The underlying mathematics is not yet complete, as we shall explain. But in principle, a commutative ring now becomes a pseudo- commutative pseudo-monoid M in SymMon and we can consider a comonad of the form [M,−] and coalgebras for it. So we are still in the realm of coalgebra, and this is now relevant to computer science, as illustrated in [12], as it supports analysis of various kinds of contexts as well as various kinds of wiring, as provided for instance by categories with finite products, or finite coproducts, or symmetric monoidal structure, or variants or combinations of these. See for instance [18] for one sophisticated combination, and see [12] for several other examples. However, as mentioned above, the underlying mathematics is not yet complete. We do have a definition of pseudo-closed 2-category [13] generalising Eilenberg and Kelly’s definition of closed category [8], and SymMon is an example, but we do not yet have a notion of pseudo-symmetric pseudo-monoidal pseudo-closed 2-category. So, a priori, we cannot yet say what a pseudo- commutative pseudo-monoid in SymMon is. But that seems likely to be achievable in coming years, and we already have clear evidence, based on the work in [13], that the coalgebraic perspective here will be far more primitive than the algebraic perspective. Moreover, despite not yet having completed the underlying mathematics, we are already in a position to explicate much of the relevant coalgebraic structure. In particular, as we have a definition of pseudo-closed 2-category, we can define a monoid to be a comonad of the form [M,−], and continue in that vein. We largely leave that implicit in the

227 Hyland and Power paper, but it is the underlying idea. This application of coalgebra to computer science has not previously been considered, so here, we present some of the ideas and underlying structures that we have been developing. This may be seen as the beginning of what might be called two-dimensional linear algebra: just as linear algebra is characterised by its special additional features on top of those given by universal algebra, notably its coalgebraic structure, two-dimensional linear algebra may equally be characterised by its special features, notably coalgebraic ones, relative to the two-dimensional universal algebra of [2]. As usual in two- dimensional studies, the greatest challenge is in taking careful note of the subtle relationship between equality and coherent isomorphism [15,2]. We have not, at this time, developed other aspects of linear algebra two-dimensionally, but we anticipate doing so. The originality of this paper resides primarily in its identification of two- dimensional linear algebra as a topic of study, its development of relevant notions such as that of symmetric monoidal Lawvere theory, and in explaining how this can be seen as an area of application of coalgebra in computer science. Some of the technical results here, primarily those of Sections 3 and 4, have been presented previously in a different (more complicated) form, in [12], but others have not. The paper is organised as follows. In Section 2, we outline our conception of two-dimensional linear algebra and how we hope to develop it. We sketch the basic definitions and results, and we explain why coalgebra plays a stronger role here than in the ordinary one-dimensional linear algebra. In Sec- tion 3, we introduce the notion of a symmetric monoidal Lawvere theory. In Section 4, we define what it means for a symmetric monoidal Lawvere theory to be commutative, and we prove that this induces a comonad on SymMons. And in Section 5, we show how, by means of a more complex construction that may ultimately prove to be less natural, one can obtain a comonad to account for those examples of symmetric monoidal Lawvere theories that are not commutative.

2 An overview of two-dimensional linear algebra

Extending the situation for Ab, or perhaps better, the category CMon of commutative monoids, to two dimensions, we should like to prove that the category SymMons of small symmetric monoidal categories and strict symmetric monoidal functors is itself a symmetric monoidal category. There is a theorem, ultimately due to Kock [16], but also expressed in [14], which is a primary reference for us, that may help. Given a 2-monad T on Cat, the monad T automatically acquires a strength t : X × TY −→ T (X × Y ) using the cartesian closed structure of Cat together with the enrichment of T 228 Hyland and Power to a 2-functor. It also acquires a costrength t∗ : TX × Y −→ T (X × Y ) by trivial use of the symmetry of ﬁnite products in Cat. A 2-monad T on Cat is commutative if for every pair of categories X and Y , the diagram

t Tt∗ TX × TY ✲ T (TX × Y ) ✲ T 2(X × Y )

t∗ µ ❄ ❄ T (X × TY) ✲ T 2(X × Y ) ✲ T (X × Y ) Tt µ commutes, where µ is the multiplication of the 2-monad. Theorem 2.1 For any ﬁnitary 2-monad T on Cat, the category T -Alg is symmetric monoidal closed, making the forgetful functor U : T -Alg −→ Cat into part of a symmetric monoidal closed adjunction if and only if T is commutative.

Alas, it follows from this theorem that the 2-category SymMons has no symmetric monoidal closed structure that is coherent with that of Cat in the sense of the theorem: the reason is that T is not commutative, and the reason for that is that the commutativity diagram does not commute, but rather contains a non-trivial isomorphism determined by the symmetry in the definition of symmetric monoidal category. No coherence theorem can force that symmetry to be an equality, even in the most mundane examples. This contrasts with the situation for Ab relative to Set: the category Ab is symmetric monoidal closed, coherently with respect to the finite product structure of Set, and because of that, one can consider a commutative ring R as a commutative monoid in the symmetric monoidal category Ab and proceed to consider R ⊗−and [R, −].So,atfirstsight,weappeartobestuck. There seem to be two ways to negotiate this difficulty. The first, more elegant, approach is to define a notion of pseudo-symmetric pseudo-monoidal pseudo-closed 2-category and attempt to prove that the 2-category SymMon of small symmetric monoidal categories and strong symmetric monoidal functors has that structure. That should be possible within the coming few years, as we are making good progress in that direction in [13] based upon a notion of pseudo-commutative monad. We do not give detailed definitions here, as our account is not complete yet and the details might distract from the flow of the paper. So we refer the reader to [13] for detail. But the central idea is as follows. One first defines a notion of pseudo-commutative monad. For the 2-category theoretic terminology used here, we refer the reader to [15]. 229 Hyland and Power

Deﬁnition 2.2 A pseudo-commutative monad on Cat is a 2-monad T together with a isomorphism, natural in X and Y ,withcomponents

t Tt∗ TX × TY ✲ T (TX × Y ) ✲ T 2(X × Y )

∗ t ⇓ ρX,Y µ ❄ ❄ T (X × TY) ✲ T 2(X × Y ) ✲ T (X × Y ) Tt µ subject to one coherence axiom with respect to each of the symmetry of Cat, and the multiplication, unit, and strength of T .

Examples of pseudo-commutative monads include our leading example of the 2-monad for small symmetric monoidal categories, the 2-monad for small categories with finite products, and that for small categories with finite coproducts. Another example is the 2-monad for which an algebra is a small symmetric monoidal category together with a strong endofunctor, as lies at the heart of [9]. A non-example is the 2-monad for which an algebra is a small category together with a monad on it. This can all be verified by routine calculation. We then define a notion of pseudo-closed 2-category. The full definition is complex, owing to a lengthy but definitive list of coherence axioms: the axioms are only a little more complex than those in Eilenberg and Kelly’s definition of closed category in [8], which are also lengthy. The central data is that a pseudo-closed 2-category has, for each pair of objects X and Y , an object [X, Y ] that acts as an internal hom, or exponential, of X and Y . The construction becomes an endo-2-functor [X, −] on the pseudo-closed 2- category, so we are in a position in which we can consider coalgebra. A full definition appears in [13]. The main theorem of [13] yields Theorem 2.3 If T is a pseudo-commutative monad on Cat, then the 2- category T -Algp of strict T -algebras and pseudo-maps of algebras forms a pseudo-closed 2-category.

The leading example here has T being the 2-monad for which the 2- category T -Algp is exactly SymMon. So, in due course, we hope to use this theorem as a basis for two-dimensional linear algebra. Note that the pseudo-closedness is strict in that [X, −] is an endo-2-functor. In contrast, if a corresponding pseudo-monoidal structure exists, which we believe it will under mild hypotheses, then the construction −⊗X will not be a 2-functor but rather a pseudo-functor. So in this precise sense, coalgebra is more primitive here than algebra. There are difficulties with this line of argument that seem readily resolv- able but which we have not successfully addressed yet, in particular the fact 230 Hyland and Power that the coherence for a pseudo-symmetry appears to be of essentially the same character as that for a definition of tetracategory, which, owing to the complexity of the coherence, does not have a fully established definition yet: see [11] to see some of the relevant issues. But once we resolve that, we can define a notion of pseudo-commutative pseudo-monoid M, then consider the 2-category of coalgebras for what will be a 2-comonad [M,−]. But in the absence of the mathematics required to proceed in this way, we adopt a more subtle approach that includes all the examples of primary interest to us but bypasses this difficulty. The way we proceed is effectively by ignoring pseudo-monoidal structure and defining a monoid in a pseudo-closed category to be a comonad of the form [M,−]. We do not make that explicit in our development, as it will probably become obsolete before long. However, we believe that the constructions we do develop now are of independent interest and will, in due course, become integrated into the above setting. We already have a coalgebraic account here, so we start to explain that in the next section.

3 Symmetric monoidal Lawvere theories

In this section, we introduce the notion of a symmetric monoidal Lawvere theory. This is a symmetric monoidal version of the usual notion of Lawvere theory, and indeed it extends the usual definition. Both definitions may be seen as instances of the same general phenomenon, which may be described for any monad T on Cat: for ordinary Lawvere theories, consider the monad T for small categories with finite products; for symmetric monoidal Lawvere theories, consider the monad T for small symmetric monoidal categories. To make the comparison precise, we recall the definition of Lawvere theory. Let N denote the category whose objects are natural numbers and whose morphisms are all functions between natural numbers.

Definition 3.1 A Lawvere theory is a small category L with finite products together with an identity on objects strict finite product preserving functor j : N op −→ L.Amodel of a Lawvere theory L in a category C with finite products is a finite product preserving functor h : L −→ C.

The significance of N op here is that it is the free category with strictly associative finite products on 1. Every category with finite products is equivalent to one with strictly associative finite products, so there are a few different ways to deal with coherence issues here. For simplicitiy of exposition, we shall adopt a slightly different approach to that given by Lawvere as explained in [1]: we shall make our theories non-strict and our models strict.

Deﬁnition 3.2 A symmetric monoidal Lawvere theory is a small symmetric monoidal category L together with an identity on objects strict symmetric monoidal functor j : S(1) −→ L,whereS(1) is the free symmetric monoidal category on 1. A strict model of a symmetric monoidal Lawvere theory L 231 Hyland and Power in a symmetric monoidal category C is a strict symmetric monoidal functor h : L −→ C.

Usually, in referring to a Lawvere theory, we shall simply use the notation L, treating the rest of the data as implicit. Up to equivalence of categories, S(1) may be identified with P , the category of natural numbers and permutations. Strict models of a symmetric monoidal Lawvere theory L in a specified symmetric monoidal category C, together with symmetric monoidal natural transformations, yield a category Mods(L, C). Examples of symmetric monoidal Lawvere theories are given by any small symmetric monoidal category with objects, up to equivalence, given by natural numbers, inheriting the tensor product of natural numbers. Examples abound andareexplainedindetailin[12].Forinstance,ifL is an ordinary Lawvere theory, it is automatically a symmetric monoidal Lawvere theory. But also the opposite of an ordinary Lawvere theory is a symmetric monoidal Lawvere theory. More specifically, there is a Lawvere theory for which the models in a symmetric monoidal category amount to commutative comonoids in the category: this example is central to Milner’s work on action calculi in [17], as explained in [12]. Another symmetric monoidal Lawvere theory is that for which models are given by relational bimonoids, as Plotkin plans to use to model concurrency, again explained in [12]. We feel obliged to spell out at least one example in detail, so we do so here.

Example 3.3 Let CMon be the symmetric monoidal Lawvere theory for a commutative monoid. The underlying category of CMon is that required to express the data and commutativity axioms for a commutative monoid: its objects must all be generated by a single object X, it has all the maps given by permutations of natural numbers, and it has additional maps j : I −→ X and · : X ⊗ X −→ X together with maps generated by them by closing under tensor product and composition, all subject to the four commutativity axioms in the deﬁnition of commutative monoid. It follows that CMon is the free symmetric monoidal category on a commutative monoid, which, perhaps surprisingly, is equivalent to Setf . It may also be characterised as the free category with ﬁnite coproducts on 1.

One can easily produce variants of this along the lines of considering comonoids rather than monoids, or considering bimonoids, or structures with some of the data and some or perhaps more axioms than those of monoids, comonoids, and combinations of them. Now, we start to analyse how symmetric monoidal Lawvere theories give rise to comonads on SymMon.

Proposition 3.4 For any symmetric monoidal Lawvere theory L and any small symmetric monoidal category C, the category Mods(L, C) has a sym- 232 Hyland and Power metric monoidal structure given as follows: for strict models h and h, • put (h ⊗ h)(1) = h1 ⊗ h1 • extend the deﬁnition of h ⊗ h to an arbitrary object of L, which is the result of inductively applying the tensor operation to the unit and to 1,by induction on the complexity of the tensorial description of the object. • deﬁne h ⊗ h on arrows by conjugation using the canonical isomorphisms induced by induction between (h ⊗ h)(x) and h(x) ⊗ h(x).

Observe that the tensor product here is not given pointwise: if we tried to define a pointwise tensor product, we would not be able to make h ⊗ h strict symmetric monoidal, so it would not be an object of Mods(L, C). If we further tried to deal with that by extending from Mods(L, C) to the category Mod(L, C), we would be unable to obtain an endofunctor because of coherence difficulties: we believe we will be able to resolve this in [13], but this is one of the reasons why we have retreated to single-sorted theories here, as they allow us to keep tight control over the behaviour of a putative tensor product h ⊗ h on objects. With a little effort, the proposition can be extended to show Theorem 3.5 Given a symmetric monoidal Lawvere theory L,theconstruc- tion Mods(L, −) yields an endofunctor on SymMons with a copoint − ⇒ Mods(L, ) IdSymMons given by evaluation at 1.

If SymMons were symmetric monoidal closed, coherently with respect to Cat, then the unit of the symmetric monoidal closed structure on it would be S(1), because the left adjoint of a symmetric monoidal adjunction always preserves the symmetric monoidal structure. So, as part of the deﬁnition of a symmetric monoidal Lawvere theory j : S(1) −→ L,weimmediatelyhavethe unit data for a monoid structure on L. It remains for us to ﬁnd a construct that can act as a multiplication. As we have such tight control on the objects of L, the construct proves to be uniquely determined, so it just amounts to a condition on the data we already have. We explore the situation in the next section.

But for the moment, observe that categories of the form Mods(L, −)-Coalg for the copointed endofunctor (Mods(L, −),ev1) include categories of very substantial interest. For instance, if L is the symmetric monoidal Lawvere theory for a commutative monoid, the category of coalgebras is the category of small categories with finite coproducts, as can be checked by direct calculation. Dually, if L is the symmetric monoidal Lawvere theory for a commutative comonoid, the category of coalgebras here is the category of small categories with finite products. One can continue along these lines for other examples of symmetric monoidal Lawvere theories to account for the category of small categories with finite biproducts, or finite relational biproducts, or the like. 233 Hyland and Power

4Commutative symmetric monoidal Lawvere theories

In this section, we place a commutativity condition on the notion of symmetric monoidal Lawvere theory L in order to extend the copointed endofunctor (Mods(L, −),ev1)onSymMons to a comonad on SymMons. Recall that a symmetric monoidal Lawvere theory L has the same objects as S(1), which in turn is equivalent to the category P of natural numbers and permutations. For simplicity of exposition here, we suppress coherent isomorphisms and identify the objects of S(1) with the objects of P .Thus we identify the objects of L with natural numbers. Now, for natural numbers m and p,denotebym × p the tensor product of m copies of p. It follows that m ×−is functorial in L. Note that this does not mean that −×m is functorial in L! Deﬁnition 4.1 A symmetric monoidal Lawvere theory L is commutative if for all maps f : m −→ n and g : p −→ q in L,thetwomapsfromm × p to q × n,onegivenby

m × g q × f m × p ✲ m × q ✲ q × m ✲ q × n, with the other dual, where the unlabelled maps are given by canonical isomorphisms in P , agree. This deﬁnition provides the information we need to obtain a comonad. Proposition 4.2 If L is a commutative symmetric monoidal Lawvere theory, there is a natural transformation with C-component

δC : Mods(L, C) −→ Mods(Mods(L, C)) such that Mods(L, −) together with ev1 and δ form a comonad on the category SymMons. Proof. Given a strict symmetric monoidal functor h : L −→ C,wemust obtain a strict symmetric monoidal functor δC (h)fromL to the symmetric monoidal category Mods(L, C), whose objects are strict symmetric monoidal functors from L to C.SinceδC (h) must be strict symmetric monoidal, and since every object of L is given by a tensor product generated from 1, the behviour of δC (h) on objects is completely determined by its behaviour on 1. And since we must have ev1(δC (h)) = h in order to satisfy one of the comonad laws, we must have

δC (h)(1) = h : L −→ C The commutativity condition is exactly what is required to force the behaviour of δC on maps to be strict symmetric monoidal. ✷ The proposition gives us the comonad we seek. But we can say a little more that we have found valuable in our analysis. Speciﬁcally, we can identify the category of coalgebras for the comonad with the category of coalgebras for 234 Hyland and Power its underlying copointed endofunctor. That is an unususal situation, redolent of that for the Eckmann Hilton argument [7] as explained in [12]. Our argument goes as follows. Proposition 4.3 For a commutative symmetric monoidal Lawvere theory L, the strict symmetric monoidal functors

Mods(L, ev1):Mods(L, Mods(L, C)) −→ Mods(L, C) and −→ (ev1)Mods(L,C) : Mods(L, Mods(L, C)) Mods(L, C) are jointly monomorphic. Proof. The proof takes a little care, as it amounts to decomposing an arbitrary strict symmetric monoidal functor from L to Mods(L, C), which can be seen as a construction on two variables m and n, into consideration of its behaviour on pairs of variables of the form (1,n)and(m, 1) by use of the fact that each object of L is given by a tensor product generated by 1. ✷ The two propositions yield Theorem 4.4 For any commutative symmetric monoidal Lawvere theory L, the category of coalgebras for the copointed endofunctor (Mods(L, −),ev1) is equal to the category of coalgebras for the comonad (Mods(L, −),ev1,δ). The above all fits into our conception of two-dimensional linear algebra. Moreover, the definitions we have developed here, such as that of symmetric monoidal Lawvere theory, obviously can be extended far beyond symmetric monoidal categories. In particular, the use of a comonad along the lines of Mods(L, −)onSymMons extends in two directions that seem likely to be important, one of them definitely fitting within the scope of two-dimensional linear algebra, the second not quite as clearly. The first of these directions involves the generalisation from consideration of the 2-category SymMon to that of the 2-category T -Algp for a pseudo- commutative 2-monad T on Cat. We do not explore that here, but see [13]. The other direction involves removing the commutativity condition that we have just introduced, yet still obtaining a comonad, necessarily a somewhat different one from that we have described: we would hardly have introduced the notion of commutativity if we did not need it to obtain the comonad structure we have defined. We give details of that in the next section.

5 Deleting the commutativity condition

In this section, we try to obtain a comonad much as we did in the previous section, but without resort to the commutativity condition we introduced there. The reason is that some symmetric monoidal Lawvere theories of interest to us, speciﬁcally one for Frobenius objects, are not commutative; so we would like to extend our analysis. 235 Hyland and Power

More speciﬁcally, all of the speciﬁc examples of symmetric monoidal Law- vere theories we have described so far have been commutative. And so are all of the examples implicit in [12]. But for an example of a symmetric monoidal Lawvere theory that is not commutative, consider the following. Example 5.1 Let RFrob be the symmetric monoidal Lawvere theory for relational Frobenius objects. This is generated by an object X together with a commutative monoid structure on X and a commutative comonoid structure on X such that the diagram

X ⊗ δ X ⊗ X ✲ X ⊗ X ⊗ X

m m ⊗ X

❄ ❄ X ✲ X ⊗ X δ commutes (see [4]) and m.δ = idX . This category can be described explicitly as the category whose objects are ﬁnite sets and with a map from m to n given by an equivalence relation on m + n. This category is implicitly used by Danos and Regnier [6] in connection with Geometry of Interaction and is considered by Gardner in [10] for diﬀerent reasons.

For another example, one can drop the condition m.δ = idX in the above, giving the symmetric monoidal Lawvere theory for Frobenius objects, for which an explicit description is the main result of [3]. So there is some value in considering symmetric monoidal Lawvere theories that are not commutative, so we would like to incorporate such examples into coalgebra too. The technical heart of our construction of a comonad allowing us to do that is given by modifying our deﬁnition of Mods(L, C) for a symmetric monoidal Lawvere theory L and a symmetric monoidal category C. A priori, this leads us a little away from two-dimensional linear algebra as we initially envisioned it, as Mod(L, C) is the pseudo-closed structure of SymMon. However, the objects of the construction we now make are the same as the objects of Mods(L, C), and in a symmetric monoidal closed category such as Ab, there are no arrows between elements of an abelian group, so in a precise sense, if one restricted to a category like Ab, the constructs Mod(L, C) and Mod∗(L, C) would agree. So a more subtle view of two-dimensional linear algebra may well incorporate this construction. Deﬁnition 5.2 Given a symmetric monoidal Lawvere theory L,andasym- ∗ metric monoidal category C,letMods(L, C) denote the (unique) factorisation

✲ ∗ ✲ Mods(L, C) Mods(L, C) C 236 Hyland and Power

−→ −→ ∗ of ev1 : Mods(L, C) C into a functor Mods(L, C) Mods(L, C)thatis ∗ −→ the identity on objects followed by a fully faithful functor Mods(L, C) C. ∗ The construction Mods(L, C) extends to an endofunctor on SymMons and ∗ ev1 trivially restricts to give a copoint ev1 for the endofunctor. Emulating the results of the previous section in this somewhat more complex setting, we have Proposition 5.3 If L is a symmetric monoidal Lawvere theory, there is a natural transformation with C-component ∗ ∗ −→ ∗ ∗ δC : Mods(L, C) Mods(Mod (L, C)) ∗ − ∗ ∗ such that Mods(L, ) together with ev1 and δ form a comonad on the category SymMons. Proof. The proof is essentially the same as for the commutative case. In the commutative case, the commutativity was required to force δC to behave well ∗ on maps, but here, we have changed the maps so that the behaviour of δC on maps is trivial. ✷ Proposition 5.4 If L is a symmetric monoidal Lawvere theory, the pair of ∗ ∗ ∗ ∗ strict symmetric monoidal functors Mods(ev1) and (ev1)Mods (S.C) from the ∗ ∗ ∗ symmetric monoidal category Mods(Mods(L, C) to Mods(L, C) are jointly monomorphic in the category SymMons. Proof. The proof here is the same as that for the commutative case. ✷ Theorem 5.5 If L is a symmetric monoidal Lawvere theory, the category ∗ − of coalgebras for the copointed endofunctor (Mods(L, ),ev1) is equal to the ∗ − ∗ ∗ category of coalgebras for the comonad (Mods(L, ),ev1,δ ).

6 Conclusions and Further Work

In this paper, we have introduced the concept of two-dimensional linear algebra and we have commenced a development of it. There are difficult coherence questions that arise, some of which we have resolved, others of which we have not yet resolved. So we have had to skirt our way around a few difficulties at some point. But that in itself has been valuable as it has led us to identify the notion of symmetric monoidal Lawvere theory, generalising Lawvere’s original definition in what seems to us to be an interesting direction. We have further developed a condition, that of commutativity, on symmetric monoidal Law- vere theories. What may be of greatest interest to the coalgebra community is the extent to which, in the second dimension, the coalgebraic structure is simpler and more primitive than the algebraic structure of linear algebra. One intriguing observation, which we have not developed here, is that the comonads we discover here, in all our leading examples, are idempotent comonads, i.e., the comultiplication is an isomorphism, equivalently, the category of coalgebras is a full coreflective subcategory of SymMons. We believe 237 Hyland and Power

[12] that a full understanding of that observation might provide a conceptual foundation for the Eckmann Hilton argument [8]. So there is plenty of work with which we can proceed: resolving the outstanding coherence issues such as the notion of pseudo-symmetry and the relationship between pseudo-monoidal and pseudo-closed structures on a 2- category, providing a conceptual foundation for the Eckmann Hilton argument, or trying to develop speciﬁc classes of examples, for instance to support the use of structures on SymMon for concurrency [5] or contexts [9].

References

[1] Barr, M., and C. Wells, “Toposes, Triples, and Theories,” Grundlehren der math. Wissenschaften 278 (1985).

[2] Blackwell, R., G.M. Kelly, and A.J. Power, Two-dimensional monad theory,J. Pure Appl. Algebra 59 (1989) 1–41.

[3] Carmody, S.M., “Cobordism Categories,” Ph.D. dissertation, Cambridge, 1996.

[4] Carboni, A., and R.F.C. Walters, Cartesian bicategories I, J. Pure Appl. Algebra 49 (1987) 11–32.

[5] Cattani, G., M.P. Fiore, and G. Winskel, A theory of recursive domains with application to concurrency, Proc. LICS 98 (1998) 214–225.

[6] Danos, V., and L. Regnier, The Structure of Multiplicatives, Arch. Math. Logic 28 (1989) 181–207.

[7] Eckmann, B., and P. Hilton, Group-like structures in general categories I: multiplications and comultiplications, Math. Annalen 145 (1962) 227–255.

[8] Eilenberg, S., and G.M. Kelly, Closed categories, “Proc. Conference on Categorical Algebra (La Jolla 1965)”, Springer-Verlag, 1966.

[9] Fiore, M., G.D. Plotkin, and A.J. Power, Cuboidal sets in axiomatic domain theory, Proc. LICS 97 (1997) 268–279.

[10] Gardner, P., “Graphical presentations of interactive systems,” MathFit Summer School, 1998.

[11] Gordon, R., A.J. Power, and R. Street, “Coherence for tricategories,” Mem. Amer. Math. Soc. 558, 1995.

[12] Hyland, J.M.E., and A.J. Power, Symmetric Monoidal Sketches, Proc. PPDP 00 (2000) 280–288.

[13] Hyland, J.M.E., and A.J. Power, Pseudo-commutative monads and pseudo- closed 2-categories, J. Pure Appl. Algebra (to appear).

[14] Kelly, G.M., Coherence theorems for lax algebras and for distributive laws, Lecture Notes in Mathematics 420 (1974) 281–375. 238 Hyland and Power

[15] Kelly, G.M., and R. Street, Review of the elements of 2-categories, Lecture Notes in Mathematics 420 (1974) 75–103

[16] Kock, A., Closed categories generated by commutative monads, J. Austral. Math. Soc. 12 (1971) 405–424.

[17] Milner, R., Calculi for interaction, Acta Informatica 33 (1996) 707–737.

[18] O’Hearn, P.W., and D.J. Pym, The logic of Bunched Implications, Bull. Symbolic Logic (to appear).

239 CMCS’01 Preliminary Version

Modal Rules are Co-Implications

Alexander Kurz

CWI, P.O.Box 94079, NL-1090 GB Amsterdam, The Netherlands

Abstract In [13], it was shown that modal logic for coalgebras dualises—concerning definability— equational logic for algebras. This paper establishes that, similarly, modal rules dualise implications:It is shown that a class of coalgebras is definable by modal rules iff it is closed under H (images) and Σ (disjoint unions). As a corollary the expressive power of rules of infinitary modal logic on Kripke frames is characterised.

1 Introduction

The investigation of the relationship of modal logic and coalgebras is motivated by coalgebras being a generalisation of transition systems. A first major achievement was Moss’ paper [15] on ‘coalgebraic logic’ where it was shown how to formulate a modal logic for Ω-coalgebras depending in a canonical way on the functor Ω : Set → Set. Since then, modal logics as a specification language for coalgebras have been investigated in a number of papers, e.g. [17,18,12,10,9,5,16]. On the other hand, it is also interesting to apply categorical and (co)algebraic tools in order to obtain new insights in modal logic. For example, it was shown in [13] that one can characterise the expressive power of infinitary modal logics on Kripke frames by dualising the proof of Birkhoff’s variety theorem (which, in turn, characterises the expressive power of equational logic on algebras). Here, we continue this line3 of research. → We start from the correspondence between implications i∈I ti = ti s = s, I a set or class, and algebras. The classical result on implicationally definable classes is due to Banaschewski and Herrlich [3] (see also Wechler [20]): A class of algebras is implicationally definable iff it is closed under subalgebras and products (and isomorphisms). Similarly to [13], our aim here is to use the duality of algebras and coalgebras to prove a dual of this theorem for coalgebras. As it turns out, the concept dual to implication is that of a modal rule. Theorem 4.1 establishes that a class of coalgebras is rule-definable iff it is closed under images of morphisms and disjoint unions. Theorem 5.2 applies this result to Kripke frames: A class of Kripke frames is definable by rules of infinitary modal logic iff it is closed under images of p-morphisms and disjoint unions. This is a preliminary version. The final version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Kurz

An algebraically similar but logically different approach is followed by Gumm [8]. There, also the results on equationally and implicationally definable classes of algebras are dualised. But the logic used for coalgebras is different: A formula ϕ is an element of the carrier of a cofree coalgebra TC and ϕ holds in a coalgebra M iff for all valuations α : UM → C the formula ϕ is not in the image of the induced morphism α# : M → TC. If we consider the semantics of a modal rule or an co-implication in the sense of Gumm as given by the corresponding coreflection morphism (see the proof of theorem 4.1 or chapter 2 in [14]), then both approaches are equivalent. A previous version of this draft has been electronically available since February 1999. It was presented at the 11th International Congress of Logic, Methodology and Philosophy of Science, Cracow, 1999. The main improvement over this draft is that theorem 4.1 does not depend any more on the existence of cofree coalgebras in SetΩ. As a consequence, the application to Kripke frames in theorem 5.2 does not require a bound on the degree of branch- ing of the frames. This result is quite surprising and can not be transferred in an obvious way to the case of the covariety theorem in [13].

2 Coalgebras

We introduce notation and brieﬂy review coalgebras as models for modal logic (for more information see [13,14]). The classical paper on coalgebras is Rut- ten [19]. Coalgebras are given w.r.t. a category C and an endofunctor Ω : C→C. An Ω-coalgebra M =(UM,fM ) is then given by an object UM ∈Cand an arrow fM : UM → Ω(UM)inC. Ω-coalgebras form a category CΩ where a coalgebra morphism α :(UM,fM ) → (UN,fN ) is an arrow α : UM → UN in C such that Ωα ◦ fM = fN ◦ α. As an example consider the functor Ω : Set → Set given by ΩX = PX where P denotes powerset. 1 Then Ω-coalgebras are Kripke frames:given a coalgebra M and a world x ∈ UM, fM (x) is the set of successors of x. Coalgebra morphisms in SetP are functional bisimulations, i.e. p-morphisms.

A remark on epis and monos in SetΩ: Since a morphism α :(UM,fM ) → (UN,fN ) is also a function α : UM → UN,itisimmediatethatifα is epi (mono) in Set, α is also epi (mono) in SetΩ. The converse is also true for epis (see Rutten [19], 4.7), that is, a morphism in SetΩ is epi iff it is surjective. Concerning monos it holds: α ∈ SetΩ is mono in Set iff it is strong mono in SetΩ (see the appendix), that is, a morphism in SetΩ is strong mono iff it is injective.

1 This only deﬁnes Ω on sets. On functions Ω is deﬁned in the standard way, P being the covariant powerset functor:Given f : X → Y ,Ωf = λA ∈PX.{f(a):a ∈ A}. 241 Kurz

2.1 Cofree Coalgebras We first give the definition of cofree coalgebras and then show how they are interpreted in the context of modal logic. To define the notion of a cofree coalgebra consider the diagram below:

UTC✛ α # PC ❄ C ✛ UM α

Let Ω be a functor. An Ω-coalgebra TC (and a mapping PC : UTC → C)is called cofree over C iﬀ for all Ω-coalgebras M and all mappings α : UM → C there is a unique morphism α# : M → TC such that the diagram commutes. 2 Compared to universal algebra, the set of colours C corresponds to the set of variables and a colouring α to a valuation of variables.

As in the case of Ω = P, cofree coalgebras may not exist in SetΩ.This problem can be circumvented by extending the functor Ω on Set to a functor on SET 3 by deﬁning ΩK = {ΩX : X ⊂ K, X aset} for classes K.Itthen follows from a theorem by Aczel and Mendler [2] that for every functor Ω on Set and all C ∈ Set the cofree coalgebras TC exist in SETΩ. Thus, in the following, we will allow, without further mentioning, that cofree coalgebras exist in SETΩ instead of SetΩ. As an example consider Ω = P and C = PP where P is a set of propositional variables. Let M be an Ω-coalgebra, i.e. a Kripke frame. The functions α : UM → C are valuations: every world x in M is assigned the set of propositional variables holding in x, that is, (M,α)isaKripke model. (Note that an Ω-coalgebra M plus a valuation α : UM → C is a C × Ω-coalgebra; and a morphism between C × Ω-coalgebras f :(M,α) → (N,β) is an Ω-morphism f : M → N such that β ◦ f = α, see [13].) The diagram above then shows that any Kripke model (M,α) has a unique p-morphic image in the model (TC,PC ). We can think of (TC,PC ) as the disjoint union of all models based on Ω-frames with the additional feature that any two bisimilar worlds are identiﬁed.

2.2 Strong-mono-Coreflective Classes of Coalgebras The concept of a strong-mono-coreflection is a generalisation of the concept of cofreeness and dualises the concept of a strong epireflection (see Borceux [4], I.3.6). Strong monos do appear here because they are the categorical way of describing subcoalgebras (see the appendix). Coreflective classes are used

2 # Note that every morphism α : M → TC ∈ SetΩ is by definition of morphisms in SetΩ also a mapping α# : UM → UTC ∈ Set. 3 SET is the category of classes and class maps as in Aczel [1], chapter 7, and Aczel and Mendler [2]. 242 Kurz because, on the one hand, they are precisely those classes closed under the operators H (closure under images of coalgebra morphisms) and Σ (closure under disjoint unions (coproducts, sums) of coalgebras), and because, on the other hand, the ‘coreflection morphisms’ will allow us to see what the defining modal rules will be (section 4). Let Ω : Set → Set be a functor. A strong-mono-coreflection for a class K of Ω-coalgebras is given by coalgebras RK M and strong monomorphisms K → ∈ ∈ PM : RK M M for all M SetΩ such that for all N K every morphism → K α : N M factors uniquely through PM :

K ✛PM M✛ RK M ✻ ∗ α α N

K RK M is called the coreflection of M and PM the coreflection morphism of M. Proposition 2.1 (Existence of strong-mono-coreflections) Let Ω be a functor on Set and K a class of Ω-coalgebras. Then for all M ∈ SetΩ there ∈ K → ∈ is RK M SetΩ and a strong mono PM : RK M M SetΩ such that for ∗ all N ∈ K and all α : N → M in SetΩ there is a unique α : N → RK M in K ◦ ∗ ∈ SetΩ such that PM α = α. Moreover, RK M HΣHK.

Proof. Let M ∈ SetΩ.LetA = {α : N → M | N ∈ K} be the collection of all coalgebra morphisms N → M with N ∈ K. Now, the union of the images ∈ K 4 of all α A defines a subcoalgebra RK M of M with inclusion PM . Clearly, for each N ∈ K the above factorisation property holds. Moreover, the fact that union is a quotient of a disjoint union shows that RK M ∈ HΣHK. ✷ Definition 2.2 (Strong-mono-coreflective-classes, smc) Let Ω be a functor on Set and K a class of Ω-coalgebras. K is called strong-mono-coreflective, or smc for short, iff it is closed under isomorphisms and contains all core- 5 flections RK M. Strong-epi-reflective classes of algebras are characterised as being exactly those classes of algebras that are closed under subalgebras and products. Du- ally, smc-classes of coalgebras are characterised by closure under homomorphic images (denoted by H) and closure under disjoint unions, i.e. coproducts (denoted by Σ). Proposition 2.3 Let Ω be a functor on Set and K a class of Ω-coalgebras. K is smc iff it closed under H and Σ.

4 Every coalgebra morphism α ∈ SetΩ factors through its image Im α, see Rutten [19], theorem 7.1; and the union of images of coalgebra morphisms always exists, see [19], theorem 6.4. 5 Categorically: K is smc iﬀ it is closed under isomorphisms and the inclusion functor K K → SetΩ has a right adjoint RK with the counit (i.e. 6M ) being strong mono. 243 Kurz

Proof. “only if”: Closure under H is (the dual) of [4], I.3.6.4. For closure under Σ let Mi ∈ K (for all i ∈ I)andM the sum of all Mi.NotethatRK M i K is isomorphic to Mi. WehavetoshowthatRK M is isomorphic to M. PM K being a strong mono, it suffices to show that PM is epi in Set (and hence epi in SetΩ). This follows from the observation that every x ∈ UM is in the image → K of an inclusion ini : Mi M and every inclusion factors through PM . “if”: Follows from RK M ∈ HΣHK. ✷ Closure under H and Σ is equivalent to closure under the operator HΣ. This is dual to the fact that, in universal algebra, SP is closure under subalgebras and products (see Gumm and Schröder [6] for details on closure operators on coalgebras). We can therefore phrase the proposition above as K smc iff K = HΣ(K).

2.3 An Example To illustrate the notions above and their connection to modal logic we give an example. Let Ω = B ×P,whereB is the set of Booleans. That is, every state is assigned (b, Y ), where b is a Boolean and Y a set. We interpret b as the truth value of a ﬁxed proposition and Y as the set of successors. A modal language for this functor is build from the usual connectives, modal operators and propositional variables from a set P , plus a propositional constant denoted by start.ASetΩ-coalgebra M =(UM,f) is a Kripke frame together with a predicate interpreting start. To be more precise, let α : UM →PP, x ∈ UM,p ∈ P . Then (boolean cases as usual and π1,π2 denoting the projections from the product B ×P to its components):

M,α,x |= start ⇔ π1 ◦ f(x)=true M,α,x |= p ⇔ p ∈ α(x)

M,α,x |= ✷ϕ ⇔∀y ∈ π2 ◦ f(x):M,α,y |= ϕ The states x satisfying the ﬁrst clause are called states marked by start.Next, we want to axiomatise a subclass of these Kripke frames by modal rules. A modal rule ϕ/ψ (where ϕ, ψ are modal formulas) is interpreted via

M |= ϕ/ψ ⇐⇒ ∀ α : UM →PP : M,α |= ϕ ⇒ M,α |= ψ

Modal axioms are rules with true premise. Now, consider the following rules: (refl) ✷p → p (trans) ✷p → ✷✷p (start) start → ✷p/p The first two are the well-known axioms defining reflexivity and transitivity on Kripke frames. The third one is the start rule from Kröger [11]. In the 244 Kurz presence of reflexivity and transitivity it expresses that every state has to be reachable from a state marked by start. Call Φ the set of the three rules above and let K be the class of Kripke frames defined by Φ. We show that K is smc. Define RK M as the largest subcoalgebra of M satisfying Φ (that is, to find RK M, take the largest subcoalgebra of M that is reflexive and transitive and then cut off all states that are K → not reachable by a state marked by start). PM : RK M M is the canonical embedding and it is a strong mono since it is injective. Recalling the definition of a coreflective subcategory, it remains to show that for all N ∈ SetΩ satisfying Φ it holds that for all f : N → M there is a unique g : N → RK M K ◦ such that f = PM g:

K ✛PM M✛ R M K. .✻ . f . g . N Consider a factorisation N →e Im f →m M of f. Since rules are invariant under taking images (see proposition 3.5) it follows that Im f |=Φ.Moreoverm : Im f → M is a subcoalgebra of M and since RK M is the largest subcoalgebra K K ◦ of M satisfying Φ, m factors through PM as m = PM g for some g .Now, ◦ K g = g e is the required morphism and g is uniquely determined since PM is injective. Finally, let us note that K is closed under images and disjoint unions (coproducts) but not under subcoalgebras. Hence K is an example of a co- quasivariety that is not a covariety.

3 Modal Logics for Coalgebras

There are many different kinds of modal logics but most of them share the following features that are essential for a logic for coalgebras: formulas are evaluated in points (worlds, states) and they are invariant under bisimulations. Compared to the paper on covarieties [13] the definition below changed a little: Since we have no requirement that the functor Ω is bounded, a logic has to have formulas for arbitrary large sets of colours. Definition 3.1 Let Ω be a functor. A modal logic for coalgebras L is given by the following: • a class Col of sets (the sets in Col are called sets of colours of L), where Col contains for each cardinal κ a set with cardinality ≥ κ,andforeach C ∈ Col a class of formulas LC , • for all C ∈ Col, for all M ∈ SetΩ, and for all valuations α : UM → C a | C ⊂ ×L | ∈| C relation =(M,α) UM C .(WriteM,α,x = ϕ for (x, ϕ) =(M,α).) • for all C ∈ Col, M,N ∈ SetΩ, α : UM → C, β : UN → C, ϕ ∈LC , 245 Kurz

x ∈ UM,andallC × Ω-morphisms 6 f :(M,α) → (N,β), it has to hold

M,α,x |= ϕ ⇔ N,β,f(x) |= ϕ.

The last condition says that formulas have to be invariant under bisimulations respecting not only the structure of the Ω-coalgebras but also the given valuations. As usual, M,x |= ϕ,(M |= ϕ) are deﬁned by quantifying over all valuations (and all elements) of M.

Formulas ϕ ∈LC deﬁne subsets of the cofree coalgebra (TC,PC ). It is useful to introduce the following notation:

TC,AC [[ ϕ]] = {x ∈ UTC : TC,PC ,x|= ϕ}.

From the invariance of the formulas under bisimulations it follows the fundamental property allowing to reduce validity w.r.t. a valuation in any model to validity in the cofree models:

M,α |= ϕ ⇐⇒ Im α# ⊂ [[ ϕ]] TC,AC .

Next, we show that dualising the concept of an implication in algebra we obtain the notion of a modal rule. The reader might want to recall the semantics of implications in universal algebra. First two basic facts: Let X be a set of variables and TX be the term algebra over variables X.Then every valuation α : X → A has a unique lifting to an algebra morphism αG : TX → A. And every algebra morphism αG : TX → A determines G a congruence3 relation on TX that we denote by ker α . Next, consider an → implication i∈I ti = ti s = s . It determines two congruence3 relations P, Q on the carrier of TX, P standing for the relation induced by i∈I ti = ti and Q for the relation induced by s = s. Now, it is not difficult to see that the implication holds in an algebra A iff P ⊂ ker αG ⇒ Q ⊂ ker αG for all α : X → A. This characterisation of implications is dualised by the following definition of modal rules.

Deﬁnition 3.2 (Rules) Given two formulas ϕ, ψ ∈LC we call the expression ϕ/ψ a rule. The class of all rules built from formulas in LC is denoted by RuC .Deﬁne|=

M |= ϕ/ψ iﬀ ∀α : UM → C :Imα# ⊂ [[ ϕ]] TC,AC ⇒ Im α# ⊂ [[ ψ]] TC,AC ,

Definition 3.3 (rule-definable) Let Ω:Set → Set be a functor and L be a modal logic for Ω-coalgebras. K ⊂ SetΩ is rule-definable iff there are classes ΦC ⊂ RuC , C ∈ Col, such that M ∈ K ⇔∀C ∈ Col : M |=ΦC . Up to now, we have only required that formulas of modal logic are evaluated in points and are invariant under bisimulations. We need an additional

6 Recall that f :(M,α) → (N,β)isaC × Ω-morphisms iff f is an Ω-morphism such that β ◦ f = α. 246 Kurz property that guarantees enough expressive power. Definition 3.4 A modal logic for coalgebras L is called expressive if for all C ∈ Col and every C ×Ω-subcoalgebra S of (TC,PC ) there is a formula ϕ such that US = {x ∈ TC : TC,PC ,x|= ϕ}. An important consequence of our definition of a modal logic for coalgebras is that rules are preserved under images and disjoint unions. Proposition 3.5 Let L be a modal logic for Ω-coalgebras and let K be a rule- definable class of Ω-coalgebras. Then K is closed under the operators H and Σ.

Proof. Let C ∈ Col and ϕ, ψ ∈LC . “H”: Suppose M ∈ K and f : M → N epi in SetΩ. Wehavetoshow M |= ϕ/ψ =⇒ N |= ϕ/ψ. For a contradiction assume that N/|= ϕ/ψ, i.e. there exist β : UN → C and y ∈ UN s.t. N,β,y |= ϕ and N,β,y|= / ψ.Deﬁne α : UM → C by α = β ◦ f, i.e. f is a C × Ω-bisimulation between (M,α) and (N,β). Now, since f epi in SetΩ implies f epi in Set and since |=is compatible with C × Ω-bisimulations, there is x ∈ UM such that M,α,x |= ϕ and M,α,x|= / ψ, which is the desired contradiction.

“Σ”: Similar to the above. Let (Mi)i∈I be a family of models in K and M =Σi∈I Mi. Suppose M/|= ϕ/ψ. That is, there exist α : UM → C and x ∈ UM such that M,α,x |= ϕ and M,α,x|= / ψ. Since sums in SetΩ are 7 constructed as sums in Set there is a j ∈ I such that x ∈ Mj.Now,using that the inclusion inj of Mj into M is a bisimulation shows Mj,ij,x|= ϕ and Mj,ij,x|= / ψ. ✷

4Rule-Deﬁnable Classes of Coalgebras

We have already seen that rule-definable classes are closed under H and Σ. To show the converse, one uses that every class K closed under H and Σ is strong- mono-coreflective (smc) and then shows that K is ‘defined’ by its coreflection morphisms. Theorem 4.1 (Characterisation of rule-definable classes) Let Ω:Set → Set be a functor and L an expressive modal logic for Ω- coalgebras. Then a class K is definable by rules of L iff K is closed under H and Σ. Proof. “only if” is proposition 3.5. For “if” note that K is smc by proposition 2.3. The defining rules are now determined by the coreflection mor- K → ∈ ∈ phisms PM : RK M M.DefineΦC for C Col as follows. For M SetΩ, |UM|≤|C|, choose an injective mapping i : UM → C. By expressive- C C ∈L C TC,AC # ness there are formulas ϕM ,ψM C such that [[ϕM ]] =Imi and C TC,AC # ◦ K { C C ∈ | |≤| |} [[ ψM ]] =Im(i PM ). Let ΦC = ϕM /ψM : M SetΩ, UM C .

7 Categorically: U : SetΩ → Set creates all colimits, see [19], theorem 4.5. 247 Kurz

We have to show N ∈ K ⇔∀C ∈ Col : N |=ΦC . ⇒ C C ∈ | C # ⊂ C TC,AC “ ”: Let ϕM /ψM ΦC and suppose N,β = ϕM , i.e. Im β [[ ϕM ]] . C → C TC,AC # # By definition of ϕM there is i : UM C with [[ϕM ]] =Imi . Hence, β factors through i# as β# = i# ◦ f for some f : N → M.SinceN ∈ K and K K K ◦ → is smc, f factors through PM as f = PM g for some g : N RK M. It follows # # ◦ K ◦ ⊂ # ◦ K C TC,AC | C Im β =Im(i PM g) Im(i PM )=[[ψM ]] , i.e. N,β = ψM . “ ⇐ ”: Let N ∈ SetΩ.ChooseC ∈ Col, |C|≥|UN|,andi : UN → C # C TC,AC # # ◦ K such that Im i =[[ϕN ]] .WeshowImi =Im(i PN ) (which implies, # K ∈ ⊃ since i and PN injective, N RK N and hence N K). “ ”isobviousand # ⊂ C TC,AC # ◦ K | C C ✷ Im i [[ ψN ]] =Im(i PN ) holds due to N,i = ϕN /ψN . Remark 4.2 In the case of Ω=P the cofree coalgebras TC do not exist in # SetΩ but in SETΩ. This has no effect on the proof since for all α : M → TC, # M ∈ SetΩ, also Im α ∈ SetΩ. Note that this reasoning cannot be transferred to the proof of the covariety theorem in [13] since there one needs to consider coreflections RK TC (called FK C in [13]) of the cofree coalgebras which usually are only in SetΩ if TC ∈ SetΩ (which is not the case for Ω=P and C = {}).

5 Rule Deﬁnable Classes of Kripke Frames

The generality of theorem 4.1 allows for many applications. For example it is possible to give a version of this theorem for coalgebraic logic (Moss [15]). For covarieties instead of smc-classes this has been carried out in [13]. Coalgebraic logic has the advantage that it gives a definition of a modal logic for coalgebras for all functors Ω (preserving weak pullbacks). But here we only want to give one example of a (concrete) modal logic. We choose Ω = P and show that our theorem becomes a statement about rule-definable classes of Kripke frames. We denote with ML the infinitary modal logic built from a proper class of propositional variables Prop, the/ constant ⊥, the operators ¬, ✷ and conjunctions over any set of formulas. and ✸ are defined as abbreviations. When P ⊂ Prop and P asetwewriteML(P ) for the class of formulas taking only variables from P . In order to apply theorem 4.1 we need the following lemma: Lemma 5.1 The collection of all ML(P ) where P ranges over subsets of Prop is an expressive modal logic for coalgebras w.r.t. the functor P.Fur- thermore, the classes RuPP (see definition 3.2) are the classes of rules of ML(P ). Proof. Instantiating Col of definition 3.1 by {PP : P ⊂ Prop,P aset} and | C =(M,α) by the usual satisfaction relation of modal logic, it is immediate that the conditions of definition 3.1 are met. Expressiveness can be shown as in [13]. Next, let ϕ/ψ ∈ RuPP and M be a Kripke frame. Then, according to the definition of a rule in modal logic, M |= ϕ/ψ iff ∀α : UM →PP : M,α |= ϕ ⇒ M,α |= ψ, matching definition 3.2. ✷

248 Kurz

Theorem 5.2 Let K be a class of Kripke frames. Then K is rule-deﬁnable iﬀ K is closed under p-morphic images and disjoint unions.

Proof. Recall that Col = {PP : P ⊂ Prop,P aset}. “if”: By lemma 5.1 and theorem 4.1. “only if”: K is rule-deﬁnable, that is, there is a class Φ ⊂{ϕ/ψ : ϕ, ψ ∈ ML} { ∈ | } { ∈ such that K = M SetΩ : M =Φ..LetKPP = M SetΩ : | ∩ } { P ∈ } M =Φ RuPP . We. can then write K = KPP : P Col and, by { P ∈ } proposition 3.5, K = HΣKPP : P .Col . Now, it follows from a general fact on closure operators that K ⊃ HΣ {KPP : PP ∈ Col} and, therefore, K ⊃ HΣK. ✷

Some readers might feel that the ‘detour’ via coalgebras is unneccessary and a proof of the theorem from first principles could be shorter. Let us therefore emphasise that our proof is in fact easy and short: once we established that a class K closed under p-morphic images and disjoint unions is K determined by the coreflection morphisms PM (see proposition 2.3), it remains only to check that the coreflection morphisms (or more generally, generated subframes, ie., strong-monos) are indeed definable by rules (see lemma 5.1 and the proof of theorem 4.1).

6 Conclusion

This paper showed that the duality between quotients in algebra and subcoalgebras in coalgebra does not only allow for a dual of Birkhoff’s variety theorem but also for a dual of the result characterising implicationally definable classes of algebras. Moreover, it was shown that the modal concept corresponding to an implication is not that of a formula ϕ → ψ but that of a rule ϕ/ψ. To study finitary specification languages for coalgebras containing (the expressiveness of) modal rules and appropriate deduction calculi is left for future research. Let us mention that the duality of algebras and coalgebras has been used here as a heuristics. The proof of theorem 4.1 is not the formal (categorical) dual of a corresponding proof for algebras since it depends on the category of sets (and coalgebras over Set are dual to algebras over Setop). As shown in [14] it is possible to give an account of the duality of modal and equational logic which makes the duality precise in a categorical sense.

Acknowledgements

I want to thank Alexander Knapp for comments on a previous draft. Diagrams were produced with Paul Taylor’s macro package. 249 Kurz

References

[1] Peter Aczel. Non-Well-Founded Sets. Center for the Study of Language and Information, Stanford University, 1988.

[2] Peter Aczel and Nax Mendler. A ﬁnal coalgebra theorem. In D. H. Pitt et al, editor, Category Theory and Computer Science, volume 389 of LNCS, pages 357–365. Springer, 1989.

[3] B. Banaschewski and H. Herrlich. Subcategories deﬁned by implications. Houston Journal of Mathematics, 2(2):149–171, 1976.

[4] Francis Borceux. Handbook of Categorical Algebra. Cambridge University Press, 1994.

[5] Robert Goldblatt. What is the coalgebraic analogue of Birkhoﬀ’s variety theorem? Theoretical Computer Science. To appear.

[6] H. P. Gumm and T. Schr¨oder. Covarieties and complete covarieties. In B. Jacobs, L. Moss, H. Reichel, and J. Rutten, editors, Coalgebraic Methods in Computer Science (CMCS’98), volume 11 of Electronic Notes in Theoretical Computer Science, 1998.

[7] H. P. Gumm and T. Schr¨oder. Coalgebraic structure from weak limit preserving functors. In B. Jacobs, L. Moss, H. Reichel, and J. Rutten, editors, Coalgebraic Methods in Computer Science (CMCS’00), volume 33 of Electronic Notes in Theoretical Computer Science, pages 113–134, 2000.

[8] H. Peter Gumm. Equational and implicational classes of co-algebras. Extended abstract. RelMiCS’4. The 4th International Seminar on Relational Methods in Logic, Algebra and Computer Science, Warsaw, 1998.

[9] Bart Jacobs. The temporal logic of coalgebras via galois algebras. Technical Report CSI-R9906, Computing Science Institute Nijmegen, 1999.

[10] Bart Jacobs. Towards a duality result in coalgebraic modal logic. In Horst Reichel, editor, Coalgebraic Methods in Computer Science (CMCS’00), volume 33 of Electronic Notes in Theoretical Computer Science, pages 163–198, 2000.

[11] Fred Kr¨oger. Temporal Logic of Programs. Springer, 1987.

[12] Alexander Kurz. Specifying coalgebras with modal logic. Theoretical Computer Science, 260(1-2). To appear.

[13] Alexander Kurz. A co-variety-theorem for modal logic. In Proceedings of Advances in Modal Logic 2, Uppsala, 1998. Center for the Study of Language and Information, Stanford University, 2000.

[14] Alexander Kurz. Logics for Coalgebras and Applications to Computer Science. PhD thesis, Ludwig-Maximilians-Universit¨at M¨unchen, 2000. http://www. informatik.uni-muenchen.de/~kurz. 250 Kurz

[15] Lawrence Moss. Coalgebraic logic. Annals of Pure and Applied Logic, 96:277– 317, 1999.

[16] Dirk Pattinson. Semantical principles in the modal logic of coalgebras. In Proceedings 18th International Symposium on Theoretical Aspects of Computer Science (STACS 2001), Lecture Notes in Computer Science, Berlin, 2001. Springer. To appear.

[17] Martin R¨oßiger. From modal logic to terminal coalgebras. Theoretical Computer Science, 260(1-2). To appear.

[18] Martin R¨oßiger. Coalgebras and modal logic. In Horst Reichel, editor, Coalgebraic Methods in Computer Science (CMCS’00), volume 33 of Electronic Notes in Theoretical Computer Science, pages 299–320, 2000.

[19] J. J. M. M. Rutten. Universal coalgebra:A theory of systems. Theoretical Computer Science, 249:3–80, 2000.

[20] Wolfgang Wechler. Universal Algebra for Computer Scientists, volume 25 of EATCS Monographs on Theoretical Computer Science. Springer, 1992.

A Strong Monomorphisms

We establish that the strong monos in a category of coalgebras SetΩ are precisely the injective morphisms, ie. the subcoalgebras. First recall the definition of a strong mono (see eg. Borceux [4], I.4.3). Definition A.1 (Strong mono) Amonof : M → N is called strong iff for all epis g : X → Y and all u : X → M,v : Y → N such that f ◦ u = v ◦ g thereisa (necessarily unique) w : Y → M such that w ◦ g = u and f ◦ w = v:

g X ✲ Y

u w v ❄✛...... ❄ M ✲ N f

From a technical point of view, this factorisation property is crucial to the results of this paper (it is used implicitely in almost all of the proofs). An immediate consequence is the following useful proposition. Proposition A.2 (Extremal monos) A strong mono m is extremal, that is, if m factors as m = f ◦ e with e epi, then e is iso. In the category Set monos, extremal monos, and strong monos are all simply injective mappings. In the category of Ω-coalgebras monos need not be injective (see Gumm and Schr¨oder [7]). The following proposition shows that strong monos are precisely the injective morphisms in SetΩ. Proposition A.3

(i) If f ∈ SetΩ is mono as a mapping in Set then f is strong mono in SetΩ. 251 Kurz

(ii) Every morphism f ∈ SetΩ factors uniquely as an epi followed by a strong mono. Moreover, this factorisation is obtained as the epi/mono factorisation of f in Set.

(iii) Strong monos in SetΩ are monos in Set.

Proof.

(i) Let f,g,u,v ∈ SetΩ as in the diagram above and f mono in Set, g epi in SetΩ. Since g epi in Set (Rutten [19], 4.7) the required w exists as a mapping in Set. It remains to show that w is a morphism in SetΩ, i.e. Ωw ◦ fY = fM ◦ w where fY ,fM are the structure maps of the coalgebras Y,M, respectively. Drawing the appropriate diagram, this follows from g, u morphisms in SetΩ, g epi in Set and w ◦ g = u in Set.

e m (ii) Let h : X → Y ∈ SetΩ and UX → Im h → UY the factorisation of h in Set through its image Im h. We have to show that Im h can be equipped in a unique way with a coalgebra structure such that e, m become morphisms in SetΩ. This follows from the “diagonal ﬁll in”: e UX ✲ Im h h Im Ωe ◦ fX f fY ◦ m ❄✛ ❄ ΩImh ✲ ΩUY Ωm

The diagonal ﬁll in exists because either Im h = {} and Ωm mono 8 and e epi or Im h = {} and the empty map makes the diagram commute.

That m is strong in SetΩ follows from (1). Uniqueness of the factorisation in SetΩ may be found in [4], I.4.4.5.

(iii) By (2), a strong mono h : X → Y in SetΩ factors as epi/strong-mono h = e◦m. Since h is also extremal and e is epi, it follows e iso. m is strong, so is h. ✷

8 Im h = {} and m mono implies m split mono, hence Ωm mono. 252 CMCS’01 Preliminary Version

Invariants of monadic coalgebras

Dragan Maˇsulovi´c 1

Institute of Mathematics University of Novi Sad Trg D. Obradovi´ca 4, 21000 Novi Sad, Yugoslavia

Abstract In this paper we consider invariants of computations described by monadic coalgebras, that is, coalgebras for a functor endowed with the structure of a monad. Following the idea of P¨oschel and R¨oßiger [8], we propose another concept of invariants of such coalgebras, namely, the one based on co-relations. We introduce the clone-theoretic apparatus for monadic coalgebras and show that co-relations can be taken for a general representation of their invariants. We then demonstrate that not only subuniverses, but arbitrary λ-simulations can be thought of as invariants of monadic coalgebras, and that the approach to invariants via λ-simulations is inferior in comparison to the one via co-relations. In some cases invariant co-relations uniquely determine the monadic coalgebra. Since the same does not hold in general, to every monadic coalgebra we associate a coalgebra for the same monad which emulates the original one, and has the pleasant property of being uniquely determined by its invariant co-relations.

Key Words and Phrases: coalgebras, monads, invariants, co-relations, clone theory

AMS Subj. Classiﬁcation (1991): 68Q05

1 Introduction

Coalgebras provide an elegant and uniﬁed apparatus for investigation of various models of both computation and data structures. A particularly intriguing approach to modeling computer programs formally was oﬀered by E. Moggi in [5,6]. The idea is to represent programs by coalgebras X → T (X), where T is a functor endowed with the structure of a monad, with the intuition that programs are to be thought of as mappings from data (elements of X)tocom- putations (elements of T (X)). The requirement that T be equipped with the

1 Email: [email protected] This is a preliminary version. The ﬁnal version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Maˇsulovic´ structure of a monad is a way of coping with the need to compose simpler programs into larger ones. Note that the carrier of a coalgebra is usually thought of as a set of internal states of the computation model under investigation. Following the idea of E. Moggi, in this paper we accept the other approach and think of it as the set of data. The view of programs as monadic coalgebras (that is, coalgebras for a functor endowed with the structure of a monad) has been widely accepted by the functional programming community, where it has led to elegant solutions to many practical problems. The use of monads in functional programs makes easier both implementation and formal description of some key concepts such as purely functional input/output, destructive arrays, lazy evaluation and modularity [10,11]. The formal investigation of invariant properties of computations has started long time ago and has played an important role in the development of computer science ever since. In this paper, we consider invariants of computations described by monadic coalgebras. Invariants of arbitrary coalgebras were introduced in [4,3] as subsets of the carrier closed with respect to the coalgebraic structure. Inspired by the research presented in [8], we propose another concept: invariant co-relations. We ﬁrst show that co-relations and monadic coalgebras can be bound together by means of a pair of standard clone-theoretic operators. The Galois connection which emerges from this construction shows that co-relations can be taken for a general representation of invariants of monadic coalgebras. We then demonstrate that not only subuniverses, but arbitrary λ-simulations can be thought of as invariants of such coalgebras, and that the approach to invariants via λ-simulations is inferior in comparison to the one via co-relations. As its main tool, this paper introduces clone-theoretic apparatus for monadic coalgebras. The “classical” clone theory [7] can be understood as a general theory of invariants of sets of operations. We say that an operation f preserves a relation S if          a1   a2   an   f(a1,...,an)            ,   ,...,  ∈ S implies   ∈ S.  b1   b2   bn   f(b1,...,bn)  ......

By the fact that f preserves S we actually mean that f has the property “encoded” by S.WealsosaythatS is an invariant of f. For example, “f preserves S0 := {0}”meansthatf(0,...,0) = 0, while “f preserves ≤”, where ≤ is a partial order, means that f is monotone. To every set of operations F we can associate the set Inv F of all invariants common to all the elements of F . Dually, to every set of relations Q we can associate the set Pol Q of all the operations having each of the properties encoded by relations from Q. The pair Pol, Inv forms a Galois connection 2 Maˇsulovic´ between sets of operations and sets of relations. Galois closed sets of operations are called clones of operations. Clones of operations can be thought of as maximal sets of operations with the given set of properties. It turns out that there is another, equivalent, characterization: clones of operations are composition closed sets of operations containing all trivial operations. Let us note that the emphasis in clone-theoretic investigations is not on properties of one operation, but on sets of operations having certain sets of properties. This paper is motivated by [8] and relies on notions and results presented there. In [8], a coalgebra is taken to be a pair X, F where X is a set and F asetofco-operations on X. In that context (ﬁrst proposed by K. Drbohlav in 1971, [2]) an n-ary co-operation is a mapping f : X → X + ...+ X (n times). Clones of co-operations understood in this way were introduced in [1], while the notion of co-relation (which is crucial for our purposes) and the standard clone-theoretic apparatus were introduced in [8]. The authors of [8] had a strong intuition that their results should somehow carry over to T -coalgebras. They expressed this opinion in the form of a problem posed in Remark 6.10. In this paper, we solve the problem for a class of monadic coalgebras.

2 Preliminaries

Co-operations and coalgebras. Let X be a set and T : Set →Set a functor. A T -co-operation is any mapping α : X → T (X). A T -coalgebra on X is any pair X, α where α is a T -co-operation. To keep the terminology simple, we shall omit the preﬁx “T -” and we shall not distinguish between coalgebras and co-operations – in the sequel, the term coalgebra will refer to both. If T is the identity functor, then T -coalgebras are just mappings X → X. For a set X let TX denote the set of all mappings X → X.Forf,g ∈ TX and x ∈ X let (f · g)(x):=(g ◦ f)(x):=g(f(x)). Let X, α and Y,β be coalgebras. A mapping X h✲ Y h : X → Y is a homomorphism between X, α and Y,β,insymbolsh : X, α→Y,β, if the adjacent α β diagram commutes. A bijective homomorphism is called ❄ ❄ ∼ T (h) isomorphism. We write X, α = Y,β to denote that T (X) ✲ T (Y ). the two coalgebras are isomorphic. We say that X, α is a subcoalgebra of Y,β if X ⊆ Y and the inclusion mapping iX : X → Y : x → x is a homomorphism. In that case we say that X is a subuniverse of Y,β. Let X, α be a coalgebra, λ>0 an ordinal and S ⊆ Xλ arelationon X.WesaythatS is a λ-simulation of X, α if there exists a coalgebra → λ γ : S T (S) such that πν is a homomorphism between S, γ and X, α λ λ for every ν<λ. Here, πν is the projection mapping πν ( xξ : ξ<λ)=xν. 2 2 In case of λ = 2, we write π1 and π2 instead of (more correct) π0 and π1, 3 Maˇsulovic´ respectively. Note that 1-simulations are precisely the subuniverses of the coalgebra, while 2-simulations are generally referred to as bisimulations on X, α. ArelationS ⊆ X × Y is called a bisimulation between coalgebras X, α and Y,β if there exists a coalgebra γ : S → T (S) such that π1 : S → X and π2 : S → Y are homomorphisms. SetT denotes the category whose objects are T -coalgebras, and whose morphisms are homomorphisms between T -coalgebras.

Monads. Let C be a category and let T : C→Can endofunctor. A C-monad is a triple T, µ, η where µ : T 2 → T and η :id→ T are natural transformations such that: T 3 Tµ✲ T 2 T ηT✲ T 2 ✛Tη T ❅ µT µ ❅ µ ❄ ❄ and id ❅❘ ❄ ✠ id

T 2 µ✲ T T

Recall that for every coalgebra α : X → T (X)wehaveµX ◦ T (α) ◦ ηX = α.

Monadic coalgebras. Let M := T, µ, η be a Set-monad. To emphasise that T carries the structure of a monad, coalgebras for functor T will be referred to as M- coalgebras. Also, the category SetT will be denoted by SetM.ForasetX,the pair M,X will be abbreviated to MX.

For a set X, let cAMX denote the set of all M-coalgebras on X.Note that for every f ∈ TX , the monounary algebra X, f is a coalgebra for the identity monad id, id, id. ∗ For α ∈ cAMX ,letα := µX ◦ T (α) ∈ TT (X) (the Kleisli star). Let ∗ TMX := {f ∈ TT (X) | (f ◦ ηX ) = f}.Thesuperposition of M-coalgebras α ∗ and β,insymbolsα·β, is deﬁned in the usual way: α·β := µX ◦T (β)◦α = β ◦α. It is clear that cAMX , ·,ηX is a monoid isomorphic to TMX , ·, idT (X) under the isomorphism α → α∗. k 0 For α ∈ cAMX and a nonnegative integer k,wedeﬁneα by: α := ηX and αk := α · ...· α (k times). A coalgebra α is called idempotent if α2 = α. idp Let SetM denote the full subcategory of SetM whose objects are idempotent coalgebras.

Submonoids of cAMX , ·,ηX will be referred to as monoids of coalgebras. For F ⊆ cAMX let MonMX F denote the least monoid of coalgebras containing F . 4 Maˇsulovic´

Co-relations. For a set Y let 4Y 6 denote the least inﬁnite cardinal exceeding |Y |: " ℵ , |Y | < ℵ 4Y 6 := 0 0 ℵξ+1, |Y | = ℵξ,ξ ≥ 0.

Let Y be a set and let λ>0beanordinal.Aλ-ary co-vector on Y is any mapping r : Y → λ.Aλ-ary co-relation on Y [8] is any set of λ-ary (λ) co-vectors. If σ is a λ-ary co-relation, we write ar(σ)=λ. Let cRY denote (λ) the set of all λ-ary co-relations on Y and let cRY = 0<λ<Y cRY . We say that f ∈ TY preserves S ∈ cRY [8] if r ◦ f ∈ S for all r ∈ S.For F ⊆ TY , let cInvY F := {S ∈ cRY | every f ∈ F preserves S} (see [8]). For Q ⊆ cRY let cEndY Q := {f ∈ TX | f preserves every S ∈ Q}.

Clones of co-relations.

Let ξ, λ and λi (i ∈ I) be ordinals such that ξ>0and0<λ,λi < 4Y 6, ∈ ∈ (λi) ∈ → → ∈ i I. Further, let Si cRY , i I,andletϕ : ξ λ and ϕi : ξ λi, i I, ∈ be arbitrary mappings. The1generalised superposition of Si : i I [8] is the ϕ { ◦ | → ∀ ∈ ◦ ∈ λ-ary co-relation defined by ϕi Si := ϕ r r : Y ξ and ( i I) ϕi r i∈I Si}. For a nonempty subset B ⊆ λ we define the λ-ary B-co-diagonal [8] by λ { | → ⊆ } ⊆ δB := r r : Y λ and r[Y ] B .WesaythatQ cRY is a clone of co-relations [8] if • λ 4 6 ∅ ⊆ Q contains all δB,where0<λ< Y and = B λ,and • Q is closed with respect to all generalised superpositions. We say that a clone Q of co-relations is complete if S ∈ Q for every ⊆ ∩ (λ) 4 6 S Q cRMX and every ordinal λ such that 0 <λ< Y . (It is easy to see that every clone of co-relations is closed with respect to arbitrary intersections; hence the name. What we here call complete clones of co-relations are referred to as 1-locally closed clones of co-relations in [8].) The next proposition is a slight modification of [8, Theorems 5.1 and 5.2]. Although the paper considers finitary co-relations only, the proofs of the two theorems easily generalise to cardinals less than 4Y 6. Thus, we have:

Proposition 2.1 Let Y be a nonempty set.

(i) M ⊆ TY is a transformation monoid if and only if M = cEndY cInvY M.

(ii) Q ⊆ cRY is a complete clone of co-relations if and only if Q =cInvY cEndY Q. 5 Maˇsulovic´

3 Monoids of coalgebras and clones of co-relations

In this section we extend the notions of co-relation, preservation, cEnd and cInv to coalgebras for a given monad. We characterise closed sets for the Galois connection cEnd, cInv, and give a clone-theoretic description of λ- simulations. We pay additional attention to 1-simulations.

The clone-theoretic toolbox. For a set X and a Set-monad M := T, µ, η let |MX| := |T (X)| and " ℵ , |MX| < ℵ 4MX6 := 0 0 ℵα+1, |MX| = ℵα,α≥ 0.

M (λ) A λ-ary co-relation on X is any λ-ary co-relation on T (X). Let cRMX de- M (λ) note the set of all λ-ary co-relations on X and let cRMX = 0<λ<MX cRMX . Let α ∈ cAMX be a coalgebra and let r : T (X) → λ be a co-vector. The action of α on r is the λ-ary co-vector α · r deﬁned by α · r := r ◦ µX ◦ T (α)= ∗ r ◦ α . We say that a coalgebra α ∈ cAMX preserves a co-relation σ and that the co-relation σ is an invariant of α if α · r ∈ σ whenever r ∈ σ.

For F ⊆ cAMX , let cInvMX F := {S ∈ cRMX | every α ∈ F preserves S}. For Q ⊆ cRMX let cEndMX Q := {α ∈ cAMX | α preserves every S ∈ Q}.

Characterisations. The pair cEnd, cInv forms a Galois connection between the lattices P(cAMX )andP(cRMX ). Our intention is to characterize the corresponding Galois closed sets of coalgebras and co-relations.

Lemma 3.1 Let α, β ∈ cAMX and F ⊆ cAMX . Further, let r be a co-vector, S ∈ cRMX and Q ⊆ cRMX .Then (i) (α · β) · r = α · (β · r). (ii) α preserves S if and only if α∗ preserves S. ∗ (iii) cInvT (X)(F )=cInvMX F . ∗ (iv) If F = cEndT (X) Q then F = cEndMX Q.

Proof. (i) follows by an easy calculation: (α · β) · r = r ◦ µX ◦ T (α · β)= 2 r ◦ µX ◦ T (µX ◦ T (β) ◦ α)=r ◦ µX ◦ T (µX ) ◦ T (β) ◦ T (α)=r ◦ µX ◦ T (β) ◦ µX ◦ T (α)=α·(r◦µX ◦T (β)) = α ·(β ·r), while (ii), (iii) and (iv) are immediate.✷

Proposition 3.2 Let F ⊆ cAMX . The following statements are equivalent:

(i) F is a monoid of coalgebras, i.e. F =MonMX F ,

(ii) F = cEndMX cInvMX F ,and

(iii) F = cEndMX Q for some Q ⊆ cRMX . 6 Maˇsulovic´

Proof. (ii) ⇒ (iii) is trivial. (iii) ⇒ (i) is an easy consequence of the Lemma 3.1 (i). As for (i) ⇒ (ii), note that F ∗ is a transformation monoid, so F ∗ = ∗ cEndT (X) cInvT (X)(F ) (Proposition 2.1). According to Lemma 3.1 (iii) and (iv), we have F = cEndMX cInvMX F , as required. ✷

The description of Galois closed sets of co-relations requires several technical prerequisits. Let λ := |T (X)|. For arbitrary bijection ϕ : T (X) → λ, ϕ ϕ { · | ∈ } deﬁne a λ-ary co-relation δMX by δMX := α ϕ α cAMX ,andlet { ϕ | } c∆MX := δMX ϕ is a bijection from T (X)toλ .

Lemma 3.3 (i) cEndT (X) c∆MX = TMX .

(ii) c∆MX ⊆ cInvT (X) TM. Proof. It is clear that (ii) follows immediately from (i). So let us show (i). ⊆ ∈ ϕ ∈ :Takeanyf cEndT (X) c∆MX and let δMX c∆MX be arbitrary. ϕ · ∈ ϕ ◦ ∈ ϕ Then f preserves δMX .Sinceϕ = ηX ϕ δMX ,wehavethatϕ f δMX . ∗ ∗ So, ϕ ◦ f = α · ϕ = ϕ ◦ α for some α ∈ cAMX .Butthenf = α ∈ TMX ∗ (since ϕ is bijective, and (cAMX ) = TMX ). ⊇ ∗ ∈ ϕ ∈ · ∈ ϕ :Takeanyα TMX and any δMX c∆MX .Letβ ϕ δMX be · · · · ∈ ϕ ✷ arbitrary. Then α (β ϕ)=(α β) ϕ δMX .

Proposition 3.4 Let Q ⊆ cRMX . The following are equivalent:

(i) Q is a complete clone of co-relations containing c∆MX ,

(ii) Q =cInvMX cEndMX Q,and

(iii) Q =cInvMX F for some F ⊆ cAMX . Proof. (ii) ⇒ (iii) is trivial.

(iii) ⇒ (i): Let Q =cInvMX F for some F ⊆ cAMX . According to ∗ ∗ Lemma 3.1 (iii), Q =cInvT (X)(F ). From F ⊆ TMX it follows that Q = ∗ cInvT (X)(F ) ⊇ cInvT (X) TMX ⊇ c∆MX . Therefore, Q contains c∆MX . Let us now show that Q is a complete clone of co-relations. In view of Proposition 2.1, it suﬃces to show that Q =cInvT (X) cEndT (X) Q. Aswehave ∗ ∗ already seen, Q =cInvT (X)(F ). So, Q =cInvT (X)(F )= ∗ cInvT (X) cEndT (X) cInvT (X)(F )=cInvT (X) cEndT (X) Q. Here, we used the fact that cInv, cEnd is a Galois connection, whence cInv cEnd cInv = cInv. (i) ⇒ (ii): Since Q is a complete clone of co-relations, according to Propo- sition 2.1, we conclude that Q =cInvT (X) cEndT (X) Q.LetG := cEndT (X) Q. Since Q ⊇ c∆MX ,wehavecEndT (X) Q ⊆ cEndT (X) c∆MX = TMX (Lemma ∗ 3.3). Now, let F := {f ◦ ηX | f ∈ G}.FromG ⊆ TMX it follows that G = F . ∗ So, Q =cInvT (X)(F ), whence Q =cInvMX F (Lemma 3.1). On the other ∗ hand F = G = cEndT (X) Q, whence F = cEndMX Q (the same lemma). Therefore, Q =cInvMX cEndMX Q. ✷

Thus, Galois closed sets of coalgebras and co-relations are, respectively, the monoids of coalgebras and certain complete clones of co-relations. We 7 Maˇsulovic´

η S ✲ T (S) S γ✲ T (S) S δ ✲ T (S)

λ λ λ λ λ λ πν T (πν ) πν T (πν ) πν T (πν ) ❄ ❄ ❄ ❄ ❄ ❄

X ηX✲ T (X) X α✲ T (X) X β✲ T (X). (a)(b)(c)

µ S γ✲ T (S) T (δ✲) T 2(S) ✲ T (S)

λ λ 2 λ λ πν T (πν ) T (πν ) T (πν ) ❄ ❄ ❄ ❄

X α✲ T (X) T (β✲) T 2(X) µX✲ T (X) (d)

Fig. 1. The four diagrams from the proof of Proposition 3.5 see here a nice parallel with invariants in the sense of [4]. It is a well known fact that the set of subuniverses of a coalgebra for a suﬃciently nice functor is closed with respect to arbitrary unions and intersections. The same holds for co-relations: the set of all invariant co-relations of a monadic coalgebra is a complete clone of co-relations, and hence, closed with respect to arbitrary unions and intersections of co-relations of the same arity. As an immediate consequence of this observation we have that, given α ∈ cAMX , for every ∈ (λ) S cRMX thereexiststheleastλ-ary invariant of α containing S,andthe greatest λ-ary invariant of α contained in S. If we denote the former by Sα and the latter by [S]α,thenSα = {β · r | β ∈ MonMX {α}, r ∈ S} and [S]α = {r ∈ S |rα ⊆ S}. Note also that the closure operator ·α resembles the operator ΓF (·)which plays the analogous role in [8] (see Deﬁnition 2.1 and Proposition 3.8 in case n =1).

Co-relations and λ-simulations. ¿From the clone-theoretic point of view, certain property is considered to be invariant if the set of all the objects having that property is composition closed and contains the trivial objects. The reason for this is simple. Com- position closed sets of objects can be described by means of relations – the standardised invariants. Thus, the property under consideration can also be represented by the standardised invariants, and, therefore, is an invariant itself. In view of that, we shall now show that arbitrary λ-simulations can also be considered as invariants of monadic coalgebras. Proposition 3.5 Let λ>0 be an ordinal and S ⊆ Xλ a relation on X.The set of all coalgebras α ∈ cAMX with the property that S is a λ-simulation of X, α forms a monoid of coalgebras. 8 Maˇsulovic´

Proof. Let F = {α ∈ cAMX | S is a λ-simulation of X, α}.Sinceη :id→ T is natural, the diagram (a) in Fig. 1 commutes i.e., ηX ∈ F .Now,letα, β ∈ F . There exist γ,δ : S → T (S) such that diagrams (b)and(c)inFig.1commute. Then diagram (d) in Fig. 1 commutes whence α · β ∈ F , µI ◦ T (δ) ◦ γ being the corresponding coalgebra structure S → T (S). ✷

Corollary 3.6 Let λ>0 be an ordinal and S ⊆ Xλ a relation on X.There exists a set of co-relations QI ⊆ cRMX such that S is a λ-simulation of X, α if and only if QI ⊆ cInvMX {α}.

Proof. Let F := {α ∈ cAMX | S is a λ-simulation of X, α}.ThenF is a monoid of coalgebras, whence F = cEndMX cInvMX F . Now, put QI := cInvMX F . ✷

The corollary says that, at least in principle, one can describe λ-simulations by co-relations. Thus, when speaking of properties of coalgebras, co-relations suﬃce and there is no need to consider λ-simulations. However, λ-simulations are much easier to work with, and we shall use them whenever appropriate. As an illustration, we now present a constructive description of 1-simulations (that is, invariants in the sense of [4]) by co-relations.

For S ⊆ X we deﬁne a binary co-relation Θ T (iS ) S T (S) ✲ T (X) as follows: r ∈ ΘS if and only if r is a mapping T (X) →{0, 1} which makes the adjacent dia- c0 r ∗ ❄ ❄ ( ) gram commute. Here, c0 is the constant map- → i{0} ping c0(x) = 0, and iS : S X is the inclusion {0} ✲ {0, 1} mapping.

Proposition 3.7 Let X, α be a coalgebra and S ⊆ X. S is a subuniverse of X, α if and only if ΘS ∈ cInvMX {α}.

Proof. ⇒: Suppose S is a subuniverse of X, α. There exists a mapping α : S → T (S) such that S, α is a subcoalgebra of X, α. By applying T to the corresponding diagram we obtain

T (S) T (iS✲) T (X)

T (α) T (α) ❄ ❄ (1) 2 T 2(S) T (i✲S ) T 2(X) 9 Maˇsulovic´

The naturality of µ yields

2 T 2(S) T (i✲S ) T 2(X)

µS µX ❄ ❄ (2)

T (S) T (iS✲) T (X)

Take any r ∈ ΘS.Thenr makes the diagram (∗) commute. By pasting (1), (2) and (∗)weobtain T (S) T (iS✲) T (X)

c0 ◦ µS ◦ T (α ) r ◦ µX ◦ T (α) ❄ ❄

i{0} {0} ✲ {0, 1} Since c0 ◦ µS ◦ T (α )=c0 and r ◦ µX ◦ T (α)=α · r,wehavethatα · r ∈ ΘS. This proves that α preserves ΘS.

⇐: Now suppose that ΘS ∈ cInvMX {α}.Wehavetofindβ : S → T (S) such that α◦iS = T (iS)◦β. To do so, it suffices to show that α[S] ⊆ jS[T (S)], where jS := T (iS). (Note that if this inclusion holds, then β : S → T (S): → −1 s jS (α(s)) will do.) Suppose to the contrary that there exists an s ∈ S such that α(s) ∈/ jS[T (S)]. Define r : T (X) →{0, 1} by " 1,x= α(s), r(x)= 0, otherwise.

Obviously, r ∈ ΘS.Letusshowthatα · r ∈/ ΘS.Sinceη :id→ T is natural and since T, µ, η is a monad, we have

S iS✲ X α✲ T (X)

❅ idT (X) ηS ηX ηT (X) ❅ ❄ ❄ ❄ ❅❘

T (S) T (iS✲) T (X) T (α✲) T 2(X) µX✲ T (X) i.e. µX ◦ T (α) ◦ T (iS) ◦ ηS = α ◦ iS.Now((α · r) ◦ T (iS) ◦ ηS)(s)=(r ◦ µX ◦ T (α) ◦ T (iS) ◦ ηS)(s)=(r ◦ α ◦ iS)(s)=r(α(s)) = 1. On the other hand, for all r ∈ Θ and all S S ηS✲ T (S) T (iS✲) T (X) s ∈ S we have (r ◦ T (iS) ◦ ηS)(s) = 0. Indeed, since r ∈ ΘS and ηS : S → T (S), the adjacent c0 r ❄ ❄ diagram commutes, whence (r◦T (iS)◦ηS)(s)= ◦ ◦ · ∈ i{0} (i{0} c0 ηS)(s) = 0. Therefore, α r / ΘS,which {0} ✲ {0, 1} contradicts the assumption ΘS ∈ cInvMX {α}. 10 Maˇsulovic´

This concludes the proof of α[S] ⊆ jS[T (S)]. ✷

Example 3.8 Let us see an example concerning Turing machines. Let Σ be an alphabet and let Tapes(Σ) be the set of all mapings t : Z →{} +Σ with the property that the set {m ∈ Z | t(m) = } is finite. Further, let X := Z × Tapes(Σ) be the set of configurations, where k, t represents the tape t with the head scanning the k-th cell. Then a Turing machine can be modelled by a coalgebra α : X → T (X), where T (X)=X+ + Xω (recall that X+ is the set of all nonempty finite sequences of elements of X). So, α(x)is the sequence of configurations representing the behavour of the machine on input x. This sequence is either finite or infinite, depending on whether the machine stops on the input or not. Let us now describe the monad for T where α · β will mean “start β on the outcome of α”. Let ηX (x)=x, a 1-element sequence. As for µ,letus first note that it operates on sequences of sequences which we shall depict as column vectors. For finite sequences of (finite or infinite) sequences we set

   a11 ,a12,...       a21 ,a22,...    µX  .  = a11,a21,...,an1,an2,... ,  .    an1,an2,... while for inﬁnite sequences we just collect the ﬁrst coordinates:

   a11 ,a12,...       a21 ,a22,...     .  µX  .  = a11,a21,...,an1,... .      a ,a ,...   n1 n2  . .

Then T, µ, η is a monad. For T -coalgebras α and β, α·β(x) has the following meaning: if α(x) terminates with the outcome y, apply β on y;ifnot,β is never applied (since α goes on for ever). Let σ be the set of all co-vectors r : T (X) →{0, 1} with the property that r[X+]={0}. Then it is not hard to see that σ is an invariant of a T -coalgebra α if and only if α terminates on all inputs. 11 Maˇsulovic´

4Characterising coalgebras by invariant co-relations

Invariants are often used to show that two systems are not equal or not isomorphic. In this section we are going to show that the concept of co-relation as a tool for encoding properties of monadic coalgebras is ﬁner than that of a λ-simulation. Namely, it may happen that two coalgebras cannot be distinguished by means of λ-simulations, but can easily be distinguished by means of invariant co-relations. We start with two examples. In the ﬁrst example, we present a pair of isomorphic coalgebras which have the same λ-simulations and distinct invariant co-relations. The second example shows that the same can happen with coalgebras that are not even bisimilar. After the examples, we investigate to what extent the knowledge of invariants of a monadic coalgebra determines the coalgebra. In some cases invariant co-relations uniquely determine the coalgebra, but this does not hold in general.

Example 4.1 Let X = {a, b, c, d, e, f} and consider the monad T, µ, η given by: T (X)=X + X (which we understand as {0, 1}×X), ηX (x)=0,x and

" 0,x, p, q∈{0, 0, 1, 1} µX (p, q,x)= 1,x, p, q∈{0, 1, 1, 0}.

Consider coalgebras X, α and X, β given by

    abcdef abc def α :=   and β :=   , 0b 0c 0a 1e 1f 1d 0b 0c 0a 1f 1d 1e

  abcdef where 0a abbreviates 0,a, etc. The mapping ϕ =   is an abcfed isomorphism between the two coalgebras. It is easy to check that for every ordinal λ>0 and every S ⊆ Xλ, S is a λ-simulation of X, α if and only if S is a λ-simulation of X, β. Therefore, these two coalgebras cannot be distinguished by means of λ-simulations.

Now, consider the 6-ary co-relation S := {r1, r2, r3} where 12 Maˇsulovic´

   0a 0b 0c 0d 0e 0f 1a 1b 1c 1d 1e 1f  r1 := , 012345012345    0a 0b 0c 0d 0e 0f 1a 1b 1c 1d 1e 1f  r2 := , and 120453120453    0a 0b 0c 0d 0e 0f 1a 1b 1c 1d 1e 1f  r3 := . 201534201534

An easy computation shows that α·r1 = r2, α·r2 = r3 and α·r3 = r1, whence S ∈ cInvMX {α}. However S/∈ cInvMX {β} since    0a 0b 0c 0d 0e 0f 1a 1b 1c 1d 1e 1f  β · r1 := ∈/ S 120534120534

Example 4.2 Let X = {a, b} and consider the monad T, µ, η from Example 4.1. Let α and β be the following coalgebras:      ab  ab α := ηX = and β := . 0a 0b 1a 1b

Since ∅ is the only bisimulation between X, α and X, β, the two coalgebras are “coalgebraically unrelated”. However, it is easy to see that S ⊆ Xλ is a λ-simulation of X, α ifandonlyifitisaλ-simulation of X, β. Now consider the binary co-relation σ := {r} where r is the following co-vector:   0a 0b 1a 1b r :=   . 0011 It is easy to see that   0a 0b 1a 1b β · r =   ∈/ σ, 1100 whence σ/∈ cInvMX {β}. On the other hand, ηX preserves any co-relation, so σ ∈ cInvMX {α}.

k We say that α ∈ cAMX is singular if α = ηX and α = α for some integer k>1. Otherwise, α is called regular. For a singular α ∈ cAMX ,theleast integer k>1 for which αk = α will be called the singularity index of α.Ifs is 13 Maˇsulovic´ the singularity index of a singular coalgebra α,thens − 1 is called the period of α and will be denoted by ω(α).

Let f ∈ TY be a mapping. A nonempty subset C ⊆ Y is called a cycle of f of length l if C = {y0,...,yl−1} where yi = yj for i = j,andf(yi)=yi+1 (addition modulo l). Note that if f is singular and C is a cycle of f,then|C| divides the period of f.

Proposition 4.3 Let α and β be coalgebras such that cInvMX {α} = cInvMX {β}. (i) If at least one of α, β is regular, then α = β. (ii) If at least one of α, β is idempotent, then α = β.

Proof. (i) Suppose that α is a regular coalgebra. From cInvMX {α} = cInvMX {β} it follows that cEndMX cInvMX {α} = cEndMX cInvMX {β},and hence, by Proposition 3.2, MonMX {α} =MonMX {β}.Ifα = ηX ,then MonMX {α} = {ηX } so β = ηX , too. Now, suppose α = ηX and β = ηX . Then α = βk and β = αl for some positive integers k and l, whence α = αkl. Since α is regular, kl = 1. Therefore, k =1andα = β.

(ii) Suppose α is idempotent, and α = ηX and β = ηX .Asin(i), MonMX {α} =MonMX {β}.ButMonMX {α} = {ηX ,α} and thus β = α. ✷ The situation concerning non-idempotent singular coalgebras is not so clear. In only a few cases we can show that invariant co-relations determine the coalgebra uniquely. We present one such case.

Lemma 4.4 Let f,g ∈ TY be singular mappings such that cInvY {f} =cInvY {g}. (i) Let S ⊆ Xλ for some λ>0.Then:S is a λ-simulation of Y,f if and only if S is a λ-simulation of Y,g. (ii) For ∅ = C ⊆ X we have: C is a cycle of f if and only if C is a cycle of g. (iii) ω(f)=ω(g). (iv) ker f =kerg. (v) If ω(f)=2or ω(g)=2,thenf = g. Proof. (i) Let S ⊆ Xλ be a λ-simulation of Y,f. According to Corollary 3.6, there exists a set QI of co-relations such that for every h ∈ TY we have: S is a λ-simulation of h if and only if QI ⊆ cInvY {h}. Therefore, QI ⊆ cInvY {f} = cInvY {g},andsoS is a λ-simulation of g. (ii) Follows from (i) and the following simple observation: Let f be a singular mapping. Then C = ∅ is a cycle of f if and only if C is a subuniverse of Y,f and no proper subset of C has this property. (iii) Note that ω(f)=max{|C||C is a cycle of f}. The statement now follows from (ii).

(iv) Let x = y. It is easy to see that x, y∈ker f if and only if Sxy := {x, y} ∪ {z,z|z ∈ Y } is a bisimulation of Y,f and at most one of {x}, 14 Maˇsulovic´

{y} is a cycle of f. The statement now follows from (i) and (ii). (v) The statement follows from (ii) and (iv). ✷

Proposition 4.5 Let α and β be singular coalgebras such that cInvMX {α} = 3 3 cInvMX {β} and α = α or β = β .Thenα = β.

Proof. Suppose α = α3.Thenα∗ =(α∗)3. From Lemma 3.1 (iii) it follows ∗ ∗ ∗ ∗ that cInvT (X){α } =cInvT (X){β }.So,α = β by Lemma 4.4 (v), whence α = β. ✷

Example 4.6 We can easily provide an example of two coalgebras having the same invariant co-relations. Let X = {a, b, c},letf : X → X be the mapping a → b → c → a and g = f −1. Then id-coalgebras X, f and X, g have the same invariant co-relations.

5 Emulating coalgebras by idempotent coalgebras

We have seen in Section 4 that invariant co-relations need not characterise the coalgebra uniquely, and that they fail to do so in some rather irregular cases. We would, therefore, like to single out certain well-behaved representatives and show that every coalgebra can be somehow reduced to one of them. In our case, the well-behaved representatives are idempotent coalgebras. In this section we ﬁrst introduce the notion of emulation, and then show that for every monadic coalgebra there exists an idempotent coalgebra for the same monad which emulates the former one. Intuitively, to a coalgebra α : X → T (X) we shall associate a coalgebra α : Y → T (Y )insuchaway that α(some suitable representation of x) = some suitable representation of x, α(x). For example, if α were a model of a Turing machine, say M,thenα would correspond to a Turing machine with two tapes which ﬁrst copies the input data from tape 0 to tape 1, and then proceeds as machine M would have, operating on tape 1 only. Coalgebras of the form Y,α can be thought of as models of computations that follow a simple discipline of not destroying the input data. It is intuitively acceptable that, at a fairly low cost, every computation can be emulated by such a computation. Coalgebras corresponding to such computations are obviously idempotent and, therefore, have the pleasant property of being uniquely determined by their invariant co-relations. Formally, we shall show that the mapping X, α→Y,α is functorial, and, more over, that this is an embedding of the category SetM into its full subcategory spanned by idempotent coalgebras.

Deﬁnition 5.1 Let X and Y be arbitrary sets. 15 Maˇsulovic´

β We say that a coalgebra Y,β emulates a coalgebra Y ✲ T (Y ) X, α if there exist two mappings, an encoding mapping ✻ → → e d e : X Y and a decoding mapping d : T (Y ) T (X), ❄ such that d ◦ T (e)=idT (X) and α = d ◦ β ◦ e.Wealsosay that Y,β is an e, d-emulation of X, α. X α✲ T (X).

⊗T ⊗T For a set X,letX := X × T (X), let ∂X : X → X : x → x, ηX (x) ⊗T and define pX : T (X ) → T (X)bypX = µX ◦ T (π2). Finally, for a coalgebra α : X → T (X)letα : X⊗T → T (X⊗T ) be the coalgebra defined by α := ηX⊗T ◦π1,α◦π1.Notethatα(x, t)=ηX⊗T (x, α(x)), that is, the first component always contains the input data, while the second component contains the outcome of the computation.

Proposition 5.2 For every coalgebra X, α, α is a ∂X ,pX -emulaton of α. Moreover, α is idempotent.

Proof. Let us ﬁrst show that ∂X ,pX is indeed a pair of encoding-decoding mappings, i.e. that pX ◦ T (∂X )=idT (X): pX ◦ T (∂X )=µX ◦ T (π2) ◦ T (∂X )= µX ◦ T (π2 ◦ ∂X )=µX ◦ T (ηX )=idT (X). Next, let us show that α is a ∂X ,pX -emulaton of α: pX ◦ α ◦ ∂X (x)= pX ◦ α(x, ηX (x))=pX ◦ ηX⊗T (x, α(x))=µX⊗T ◦ T (π2) ◦ ηX⊗T (x, α(x))= π2(x, α(x))=α(x). Finally, let us show that α · α = α: α · α(x, t)=µX⊗T ◦ T (α) ◦ α(x, t)= µX⊗T ◦ T (α) ◦ ηX⊗T (x, α(x))=α(x, α(x))=ηX⊗T (x, α(x))=α(x, t).✷ idp ⊗T Proposition 5.3 Functor G : SetM →SetM given by G(X, α)=X , α idp and G(f)=f ×T (f) is an embedding of SetM into SetM (i.e. G is one-to-one and faithful). Proof. Let us show that for every homomorphism f : X, α→Y,β, G(f) is a homomorphism between X⊗T , α and Y ⊗T , β, i.e. that β ◦ G(f)= T (G(f)) ◦ α.Takeanyx, t∈X⊗T .Then

β ◦ (f × T (f))(x, t)=ηY ⊗T ◦π1,β◦ π1(f(x),T(f)(t))

= ηY ⊗T (f(x),β◦ f(x))

= ηY ⊗T (f(x),T(f) ◦ α(x)) (since f is a homomorphism)

= ηY ⊗T ◦ (f × T (f))(x, α(x))

= T (f × T (f)) ◦ ηX⊗T (x, α(x)) (since η is natural)

= T (f × T (f)) ◦ ηX⊗T ◦π1,α◦ π1(x, t) = T (f × T (f)) ◦ α(x, t).

Now, it is easy to see that G is indeed a functor, that it is one-to-one and faithful. It is well-deﬁned since coalgebras of the form α are idempotent. This 16 Maˇsulovic´

idp concludes the proof that G is an embedding of SetM into SetM . ✷

References

[1] Cs´ak´any, B.: Completness in coalgebras, Acta Sci. Math. 48(1985), 75–84

[2] Drbohlav, K.: On quasicovarieties, Acta Fac. Rerum Natur. Univ. Comenian. Math. Mimoriadne Cisloˇ (1971), 17–20

[3] Jacobs, B.: Mongruences and cofree coalgebras, in “Algebraic Methods and Software Technology”, Alagar, V. S. and Nivat. M., eds, Lect. Notes Comp. Sci. 936, Springer, Berlin 1995, 245–260

[4] Jacobs, B.: Invariants, bisimulations and the correctness of coalgebraic reﬁnements, Tech. Rep. CSI-R9704, Comput. Sci. Inst., Univ. of Nijmegen, 1997

[5] Moggi, E.: A category-theoretic account of program modules, Proc. Conf. Category Theory and Computer Science, Manchester, UK, Sept. 1989, Vol. 389 of Lecture Notes in Computer Science, Springer Verlag, 1989

[6] Moggi, E.: Computational lambda-calculus and monads, IEEE Symposium on Logic in Computer Science, Asilomar, California, June 1989

[7] P¨oschel, R., Kaluˇznin,L.A.:Funktionen- und Relationenalgebren, Ein Kapitel der Diskreten Mathematik, Deutscher Verlag der Wissenschaften, Berlin 1979

[8] P¨oschel, R., R¨oßiger, M.: A general Galois theory for cofunctions and corelations, Algebra Universalis 43 (2000), 331–345

[9] Rutten, J. J. M. M.: Universal coalgebra: A theory of systems, CWI Technical Report CS-R9652, 1996

[10] Wadler, P.: Comprehending Monads, Mathematical Structures in Computer Science, Vol. 2, pp. 461–493, Cambridge University Press, 1992

[11] Wadler, P.: The essence of functional programming, Proc. XIX ACM Symposium on Principles of Programming Languages, Santa Fe, New Mexico, 1992 (invited talk)

17 CMCS’01 Preliminary Version

Modal Languages for Coalgebras in a Topological Setting

Dirk Pattinson 1,2

Institut f¨ur Informatik, LMU M¨unchen

Abstract It is well know that the solution Z of a recursive domain equation, given by an endofunctor T , is the final T -coalgebra. This suggests a coalgebraic approach to obtain a logical representation of the observable properties of Z. The paper considers fibrations of frames and (modal) logics, arising through a set of predicate liftings. We discuss conditions, which ensure expressiveness of the resulting language (denotations of formulas determine a base of the frame over the final coalgebra). The framework is then instantiated with categories of domains, and we establish these conditions for a large class of locally continuous endofunctors. This can be seen as a first step towards a final perspective on Abramsky’s domain theory in logical form.

1 Introduction

Coalgebras have by now been recognised as models for state based systems, which are quite naturally speciﬁed using modal logic. There are several approaches [9,11,15,17,18] discussing (modal) logics for coalgebras on the category of sets and functions. The purpose of this paper is to generalise these approaches as to accommodate coalgebras on other computationally interesting categories, and categories of domains are the prime example. The motivation of this extension is twofold. Viewing coalgebras as transition systems, we would like to work with systems, whose transition structure is (the denotational semantics) of a programme. This necessitates to consider coalgebras for endofunctors on categories of domains. The second motivation for the present work is the observation, that the solution Z of a recursive domain equation, given by an endofunctor T , is precisely (the carrier of) the ﬁnal T -coalgebra. Taking Scott-open subsets o ⊆ Z as predicates on Z, we can use the methods of coalgebraic modal logic in order to obtain a syntactical representation of the underlying set of the

1 Research partially supported by the DFG Graduiertenkolleg “Logik in der Informatik” 2 Email: [email protected] This is a preliminary version. The final version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Pattinson frame O(Z) of Scott-open subsets o ⊆ Z. In order to obtain this representation, we reconstruct the topology on TX from the topology on X,givenany domain X. Since this can be done for a large class of domains (continuous and better), the resulting representation does not rely on the domains being bifinite (in contrast to [2]). The interpretation of the logics discussed is based on predicate liftings. To the author’s knowledge, the connection between predicate liftings and logics for coalgebras has first been made explicit in [11]. In contrast to the exposition in loc. cit. (and the subsequent paper [10]), we do not associate predicate liftings to a functor based on its syntactic structure. Instead, an axiomatic approach is taken. That is, we investigate properties of predicate liftings, which ensure expressiveness of the resulting language. This has the advantage of not restricting the class of signature functors for the coalgebras under consideration a priori, as well as enabling us to argue in terms of structural properties, and thus simplifying the proofs. We work with an arbitrary base category C, which comes equipped with a fibration p : E → C of posets (or alternatively with a functor P : Cop → Poset) and consider coalgebras for an endofunctor T : C → C.GivenaT -coalgebra −1 (C, γ : C → TC), we take the elements o ∈ EC = p (C) (or alternatively o ∈ P (C) under the above correspondence) as the properties of the system (C, γ). We show that every set PL of predicate liftings for T gives rise to a logical language L(PL). The interpretation of languages arising this way is shown to be stable under coalgebra-homomorphisms, and thus adequate for coalgebras. The main issue we are concerned with is the expressive power of languages arising in this way. The notion of expressiveness we consider is inspired by Moss’ coalgebraic logic [16], where it is shown, that every element of the final T -coalgebra can be characterised by a single formula. Not having elements available, the expressivity considerations are performed in a topological setting, that is, we assume that the fibre EC over every element C ∈ C is a frame (the partial order on EC behaves like a topology). Viewing the carrier Z of the terminal coalgebra of a set-endofunctor T : Set → Set as a topological space via the discrete topology, the property that a logic L has characteristic formulas translates to the fact that the set of denotations of formulas (wrt. the final coalgebra Z) is a base of the discrete topology on Z. We investigate this concept in an abstract setting and give sufficient conditions for the expressiveness of a language arising via a set of predicate liftings for T . The theory is subsequently instantiated to coalgebras for endofunctors on categories of domains. We show, that the abstract requirements are met by a large class of (locally continuous) endofunctors.

271 Pattinson

2 Preliminaries and Notation

Given an arbitrary category C, the notion of predicates or properties of relative to C is usually expressed via a posetal fibration p : E → C, where the fibres EC support the interpretation of (a fragment of) propositional logic. If p : E → C is a fibration, we denote the fibre over an object C ∈ C by EC and the reindexing (or substitution) functor induced by f : C → D ∈ C by ∗ f : ED → EC . Note that reindexing functors are determined uniquely in a posetal setting. We will be concerned with three different types of fibrations p : E → C in the sequel: The concept of predicate liftings is considered for fibrations of posets, that is, every fibre EC is a poset and reindexing preserves order. Via the Grothendieck construction [5,8], fibrations p : E → C of posets correspond to functors P : Cop → Poset. When looking at languages arising via a set of predicate liftings, we need additional structure in the fibres to interpret propositional connectives. We therefore consider fibrations of lattices. These are fibrations p : E → C of posets, where each fibre additionally has finite meets and joins, which are preserved by reindexing. In this setting, we interpret languages arising through a set of predicate liftings and show, that they are homomorphism-stable. Via Grothendieck, every fibration p : E → C of lattices corresponds uniquely to a functor P : Cop → Lat, with values in the category of lattices and join-and meet-preserving maps. The third type of fibration considered are fibrations of frames. These are fibrations of lattices, where every fibre additionally has infinite meets distributing over (finite) joins (ie. is a frame), which are required to be preserved under reindexing. As it can easily be seen, fibrations of frames correspond to functors P : Cop → Frm,whereFrm is the category of frames and maps which preserve infinite meets and finite joins. This will be the context in which the expressivity of the logics is examined.

3 Modal Logics and Predicate Liftings

We consider a posetal fibration p : E → C over an arbitrary base category C. Intuitively speaking, a predicate lifting for an endofunctor T : C → C maps predicates over C ∈ C (ie. elements of the poset EC ) to predicates over TC, subject to a naturality condition. Passing from p : E → C to the corresponding functor P : Cop → Poset, we obtain the following definition: Definition 3.1 (Predicate Lifting) Suppose T : C → C is an endofunctor and P : Cop → Poset.AP -predicate lifting for T is a natural transformation λ : P → P ◦ T op. By passing from functors P : Cop → Poset to fibrations p : E → C of posets, we obtain the following, alternative characterisation in terms of fibred functors 272 Pattinson

(see [8], where they are called morphisms of fibrations, or [5], Section 8.2 where they figure under the name cartesian functors). Proposition 3.2 (Predicate Liftings are Fibred Functors) Suppose T : C → C and the fibration p : E → C corresponds to P : Cop → Poset via the Grothendieck construction. Then there is a one-to-one correspondence between (i) P -predicate liftings for T (ii) Functors L : E → E such that (L, T ):p → p is a fibred functor. Proof Suppose λ : P → P ◦ T op is natural. Define L : E → E by L(E)= λ(pE)(E). Conversely, if (L, T ):p → p is fibred, let λ(C)(E)=L(E) for C ∈ C and E ∈ EC . Both constructions translate to morphisms in a straight forward way. ✷ We give some examples for predicate liftings: Example 3.3 (i) Suppose C has pullbacks and is well-powered, and T : C → C preserves monic arrows. Assigning the equivalence class of monos Tm : TM → TC to a subobject represented by a monic m : M → C extends to a S-predicate lifting for T (where S : Cop → Poset maps an object C ∈ C to the set of its subobjects, see eg. [8], Section 1.3). (ii) If p : E → C is a bicartesian fibration of posets and T : C → C is polynomial, then the logical predicate liftings of T , as considered in [7], is a predicate lifting in the sense of Definition 3.1. (iii) Suppose D ⊆ DCPO is a subcategory of the category of directed-complete partial orders and Scott-continuous functions. Consider the functor O : Dop → Poset, which maps an object D ∈ D to its Scott-topology O(D). If D is cartesian closed and TX = XD for some D ∈ D, then the assignment D o ⊆ X open → {f ∈ X | f(d0) ∈ o} defines an O-predicate lifting for T for all d0 ∈ D. The next section establishes the desired connection between logics for coalgebras and predicate liftings for endofunctors on a category endowed with a functor Cop → Lat.

3.1 Logics via Predicate Liftings In the case of modal logic interpreted wrt. Kripke models, both the interpretation of the modalities and the interpretation of propositional constants can be seen to arise via predicate liftings. Since Kripke models over a set P of propositional constants are coalgebras for the functor TX = P(X)×P(P ), we argue that a logic for coalgebras, which is interpreted wrt. predicate liftings, deserves the attribute “modal”. The relationship between modal logic and predicate liftings is the purpose of the next example, taken from [13]. Example 3.4 Suppose TX = P(X) ×P(P ), modelling Kripke models with afixedsetP of propositional constants and let 2 : Setop → Poset denote 273 Pattinson the contravariant powerset functor. The function [P ](X):2X → 2P(X)×P(P ) defined by [P ](X)(x)={(y, p) ∈P(X) ×P(P ) | y ⊆ x} is a 2-predicate lifting for T . Given a T -coalgebra γ : C →P(C) ×P(P ), the associated operator γ−1 ◦ [P ]:2C → 2C is the ✷-operator of modal logic: Indeed, given c ⊆ C,we obtain γ−1 ◦ [P ](c)={c ∈ C |∀c ∈ C.c → c =⇒ c ∈ c}, where we have written c → c for γ(c)=(c, p) ∧ c ∈ c. Also, given a propositional constant p ∈ P , the constant function [p ∈ P ] defined by [p ∈ P ](X)(x)={(c, p) ∈P(X) ×P(C) | p ∈ p} is a predicate lifting. Given γ : C →P(C) ×P(P ), then the (constant) operator γ−1 ◦ [p ∈ P ]:2C → 2C associated to [p ∈ P ] gives rise to the set of states, which validate p. More precisely, given any subset c ⊆ C,wehaveγ−1 ◦ [p ∈ P ](c)= {c ∈ C | c |= p},wherec |= p is a shorthand for γ(c)=(c, p) ∧ p ∈ p.

Note that predicate liftings can be used both to interpret atomic propositions and modalities. This leads us to consider logics, which are interpreted wrt. a set of predicate liftings of the signature functor. In order to interpret also propositional connectives, we need joins and meets in the fibres EC of the posetal fibration p : E → C. For a richer logical language, one could also include Heyting implication, if the posetal fibration supports its interpretation. We refrain from doing so, since implication is not needed to obtain the result regarding expressivity of coalgebraic modal logics. Suppose C has a terminal object 1 ∈ C, P : Cop → Lat is a functor (giving rise to a fibration p : E → C of lattices) and T : C → C is an endofunctor. Given a set PL of P -predicate liftings for T ,thelanguage L(PL) associated to PL is the least set of formulas containing • The elements p ∈ P (1) as atomic propositions • The formulas φ ∧ ψ and φ ∨ ψ for every pair (φ, ψ) of formulas, and • The formula [λ]φ,ifλ ∈PLand φ ∈L(PL). We blur the distinction between syntax and semantics on the level of propositional constants in order to avoid notational overhead regarding the interpretation of formulas. Note that since P (1) is a lattice, the language L(PL) contains true and false as atomic propositions. Given an T -coalgebra (C, γ : C → TC), the semantics of a formula φ ∈ L(PL) wrt. (C, γ) is then given as a predicate [[φ]] γ ∈ P (C)overthecarrier C ∈ C. The inductive definition is as follows: • For p ∈ P (1), we let [[p]] γ = P (!)(p), where ! denotes the unique morphism into the terminal object 1 ∈ C. • Propositional connectives are interpreted via joins and meets in P (C) • If φ ∈L(PL)andλ ∈PL,then[[[λ]φ]] γ = P (γ) ◦ λ(C)([[φ]] γ). The first property which we note about logics interpreted via predicate liftings, is that the interpretation is homomorphism-stable. 274 Pattinson

Proposition 3.5 (Stability under Coalgebra-Homomorphisms) Suppose P : Cop → Lat is a functor and 1 ∈ C.IfPL is a set of P -predicate liftings for T : C → C,then

[[ φ]] γ = P (f)([[φ]] δ) for all T -coalgebra morphisms f :(C, γ) → (D, δ) and all formulas φ ∈L(PL). Proof The claim is immediate for the propositional connectives and atomic propositions, since P (f) preserves (ﬁnite) joins and meets. Given a natural op transformation λ : P → P ◦ T ∈PL,wehavetoshowthatP (f)([[[λ]φ]] δ)= [[ [ λ]φ]] γ, given the induction hypothesis [[φ]] γ = P (f)([[φ]] δ). This is done by calculating

P (f)([[[λ]φ]] δ)=P (f) ◦ P (δ) ◦ λ(D)([[φ]] δ)

= P (δ ◦ f) ◦ λ(D)([[φ]] δ)

= P (Tf ◦ γ) ◦ λ(D)([[φ]] δ)

= P (γ) ◦ P (Tf) ◦ λ(D)([[φ]] δ)

= P (γ) ◦ λ(C) ◦ P (f)([[φ]] δ)

= P (γ) ◦ λ(C)([[φ]] δ)

=[[[λ]]]γ, using the fact that f is a homomorphism and the naturality of λ. ✷ In other words, the interpretation is invariant wrt. coalgebra-morphisms. When C = Set and P =2:Setop → Lat is the contravariant powerset functor, one derives the immediate corollary that bisimilar points (in the sense of Aczel and Mendler [3]) satisfy the same set of formulas. Also, when considering coalgebras for endofunctors T : D → D on subcategories D ⊆ DCPO (as in [19,20]), this result implies that (ordered) bisimilarity implies logical equivalence.

4Expressivity

This section investigates the expressive power of logics given by a set of predicate liftings for an endofunctor T : C → C. Wehaveseenintheprevious section, that the interpretation of a formula wrt. an arbitrary coalgebra (C, γ) can be recovered by the interpretation of φ wrt. the terminal coalgebra. The object of our study in this section is the class of predicates over the (carrier of) the terminal coalgebra, which can be denoted by modal formulas. For cardinality reasons, we cannot expect that all predicates over the carrier of the final coalgebra can be denoted by a formula for cardinality reasons. Consider for example the functor TX = L × X on the category of sets. It is well known (see [4]), that the carrier of the final coalgebra is Z = LN, the set of functions from the set N of natural numbers into L.IfL has cardinality greater than one, LN is uncountable and hence P(LN) is uncountable. How- 275 Pattinson ever, given a finite set of predicate liftings, we can only denote a countable set of subsets of Z. Two approaches seem feasible in this context: We can extend the language as to accommodate conjunctions and disjunctions of larger cardinality or we can restrict attention to “basic predicates”, which can be used to “approximate” all predicates available in the fibre over Z. Both approaches necessitate to move from fibrations of lattices to fibrations with “infinitary structure” in the fibres, to interpret infinitary conjunctions / disjunctions. This leads us to consider fibrations of frames in the sequel, the prime example being topological spaces, having open subsets as fibres. We do not work with fibrations of frames directly, but instead represent them as functors P : Cop → Frm,where Frm denotes the category of frames and maps preserving infinite meets and finite joins. A subset B of (the underlying/ set of) a frame L is called a base, if, for all l ∈ L we have that l = {l ∈B|l ≤ l}. Note that every frame has a base, namely its underlying set. A subset S of L is called a subbase of L, if the set {l1 ∧···∧lk | k ∈ N ∧ l1,...,lk ∈S}is a base of L. Clearly every base of L is also a subbase. We proceed to give conditions, which ensure that the set of denotations of formulas wrt. the final coalgebra (Z, ζ) form a base of the frame of predicates over Z. This will allow us to denote “basic predicates” on the terminal coalgebra by formulas of the logic. Extending the logic with infinitary disjunctions then yields a language, in which every predicate over Z has a denotation. In order to obtain a language with this property, two different conditions are needed: the set of predicate liftings has to be sufficiently complete (that is, we can encode enough information in the formulas of L(PL)). The second condition concerns the interplay between the signature functor T : C → C and the functor P : Cop → Frm describing predicates relative to objects C ∈ C.Since we construct the final T -coalgebra by means of an inverse limit construction, we require that P maps the limiting cone (in C) to a colimit (in Frm), that is, the predicates over the limit are determined by the predicates over the approximands. Definition 4.1 (Complete Sets of Predicate Liftings) Suppose P : Cop → Frmop and T : C → C. WecallasetPL of P -predicate liftings for T complete, if for all C ∈ C the set {λ(o) | o ∈B} λ∈PL is a subbase of P (TC) whenever B is a base of P (C). Logically speaking, this means that we can denote the elements of a base of P (TC)bymeansofabaseofP (C), the set PL of predicate liftings and finite conjunctions. For examples of complete sets of predicate liftings, we refer the reader to Section 5, where the situation in categories of domains is studied in more detail. 276 Pattinson

The second requirement concerns the well-behavedness of P wrt. T .Since we construct the terminal T -coalgebra as limit of the sequence

o ! oT ! o T 2! 1 T 1 T 21 T 31 ··· ,

n which we denote by (T 1)n∈N, we need that this limit exists and is preserved n by T . The fact that the predicates over the limit of (T 1)n∈N are determined n n by the predicates over the approximands T 1 means that the limit of (T 1)n∈N is preserved by P op : C → Frmop.Thisisthecontentof Definition 4.2 Suppose T : C → C is an endofunctor and P : Cop → Frm. We say that T satisfies the approximation requirement wrt. P ,if • n C has a terminal object 1 ∈ C and the limit L =Lim(T 1)n∈N exists in C, and • Both T and P op preserve this limit. Although these conditions are quite technical, they are present in the example motivating this study, that is for locally continuous endofunctors on categories of domains, and we establish them in the next section. Given a complete set of predicate liftings PL for a functor satisfying the approximation requirement, we can (finally) show that every “basic” predicate on the terminal coalgebra is the denotation of a formula of L(PL). Theorem 4.3 (Expressiveness of L(PL)) Suppose P : Cop → Frm and T : C → C satisfy the approximation requirement. If (Z, ζ) is the final T - coalgebra, then {[[ φ]] ζ | φ ∈L(PL)} is a base of P (Z) whenever PL is a complete set of P -predicate liftings for T . We postpone the proof to state the auxiliary

Lemma 4.4 Suppose D =((On)n∈N, (fnm : On → Om)n≤m∈N) is a diagram in → B Frm with colimiting cocone (O, (fn : On O)n∈N).If n is a base of On for ∈ N { | ∈B } all n ,then n∈N fn(b) b n is a subbase of O. Proof Let x ∈ O.SinceO is the vertex of the colimiting cone for D in Frm, we can represent x as 2 i ∧···∧ x = x1 xk(i) i∈I i ∈ { | ∈ } ∈ ≤ for some index set I and xj n∈N fn(o) o On for i I and j k(i). If i ∈ i B xj On,/ say, we can approximate xj by elements of the base n of On,that i i i ⊆B is, xj = Xj for some subset Xj n.Sincefn is a frame homomorphism for n ∈ N, the distributivity law of O entails the claim. ✷ Using the above lemma, we are ready for the

n Proof of Theorem 4.3 Suppose (Z, (pn : Z → T 1)n∈N) is a limiting cone of n the sequence (T 1)n∈N.SinceT preserves this limit, the induced isomorphism 277 Pattinson

∼ = ζ : Z → TZ is the ﬁnal coalgebra. We deﬁne a sequence of subsets (Ln)n∈N n of subsets of L(PL) and a sequence (Dn)n∈N where Dn ⊆ P (T 1) as follows: for n =0letD0 = L0 = P (1) (note that the elements of P (1) are the atomic propositions). Suppose 0

Ln = {[λ1]φ1 ∧···∧[λk]φk | λ1,...,λk ∈PL,φ1,...,φk ∈Ln−1}, and

n−1 n−1 Dn = {λ1(T )(d) ∧ ...λk(T 1)(dk) | λ1,...,λk ∈PL,d1,...,dk ∈Dn−1} where, in the ﬁrst case, we use ∧ to denote (syntactical) conjunction and in the second, to denote the meet in P (T n1). n It follows from completeness of PL,thateachsetDn is a base of P (T 1). WethinkofthesetsDn as the interpretations of the formulas at the level of the approximations of Z =LimT n1. This intuition we make now precise by deﬁning inductively a sequence of maps An : Ln →Dn such that for φ ∈Ln we have

[[ φ]] ζ = P (pn)(An(φ)). (1)

The deﬁnition of An is straightforward: Let A0 : L0 →D0 = id and for 0 ≤ n deﬁne

n−1 n−1 An([λ1]φ1 ∧···∧[λk]φk)=λ1(T )(An−1(φ1)) ∧···∧λk(T )(An−1(φk))

Equation (1) can then be proved by induction on n using the fact that T n preserves the limiting cone (Z, (pn : Z → T 1)n∈N) and hence

ζ−1 Z o TZ

pn−1 Tpn−1 n−1 o n T 1 T n−1! T 1 commutes for all 0

n is a base of P (Z), since Dn is a base of P (T 1) for each n. ✷ 278 Pattinson

5 Coalgebras in Categories of Domains

This section shows, that every locally continuous endofunctor on a subcategory D of DCPO⊥ satisfies the approximation requirement wrt. the functor O : Dop → Frm, mapping a dcpo D ∈ D to the frame of its Scott-open subsets. We then demonstrate, that a large (inductively defined) class of endofunctors on categories of domains admit a complete set of predicate liftings. For the remainder of the section, we consider a full subcategory D ⊆ DCPO of the category of directed-complete partial orders (dcpos) and Scott- continuous mappings, which is closed under bilimits and has a terminal object 1={⊥} ∈ D. The full subcategory of D consisting of pointed dcpos (that is, dcpos with least element) is denoted by D⊥; the non-full subcategory of D⊥ which contains strict (least element preserving) maps only is denoted by DCPO⊥!. We assume a basic knowledge of domain theory (see eg. [1], which is also the role model for the notation used). We begin with an auxiliary lemma, which we use in proving that every locally continuous functor on a category of dcpos satisfies the approximation requirement wrt. O : Cop → Frm.

Lemma 5.1 Suppose D =((Dn)n∈N, (pmn : Dm → Dm)n≤m∈N) is a diagram in DCPO such that pmn is a projection for all n ≤ m ∈ N. If L =(D, (pn : D → Dn)n∈N) is a limiting cone for D in DCPO,then O(L) is colimiting for O(D) in Frm.

Proof Since the pmn’s are projections, the limit of D can be computed via the bilimit construction, see [1], Theorem 3.3.7. In particular, the maps pn are projections for all n ∈ N. Denote the embedding associated to pn by → O → O D en : Dn D.If(Q, (qn : (Dn) Q)n∈N is a cocone/ for ( ), we obtain a O → o ◦ −1 o ✷ unique mediating morphism u : (D) Q by u( )= n∈N qn en ( ). We show that locally continuous endofunctors satisfy the approximation requirement in two settings: We consider functors T : D⊥ → D⊥ and functors T : D⊥! → D⊥!. This distinction is necessary for considering constructions like smash product and coalesced sum, which fail to be functorial in the non-strict case.

Theorem 5.2 Let C ∈{D⊥, D⊥!} and T : C → C locally continuous. Then T satisﬁes the approximation requirement wrt. O : Cop → Frm. Proof Since D contains only pointed domains, it is easy to see that the con- m n n necting morphisms pmn : T 1 → T 1 of the sequence (T 1)n∈N are projections. This is obvious for p10 : T 1 → 1 and follows by induction, since local continuity of T implies that T preserves projections. n By the limit-colimit coincidence, the bilimit L =(L, (pn : L → T 1)n∈N) n of the sequence (T 1)n∈N is a limiting cone in DCPO. By construction of bilimits, L is pointed, hence L is also limiting in C in case C = D⊥.Forthe case C = D⊥! note that projections between pointed dcpos are automatically 279 Pattinson strict, hence L qualiﬁes as cone in C. In order to see that L is limiting, note that mediating morphisms, constructed in the ambient category DCPO are strict. The preservation of this limit by T is standard, and can be proved along the lines of [1], Lemma 5.2.2. It follows from the last lemma, that Oop : C → Frmop preserves the limit L. ✷

It is worthwhile to note that the approximation property can also be established in the setting of complete metric spaces with non-expansive maps as morphisms, discussed in [19,20], justifying the generality of the approach presented. Proposition 5.3 Suppose CMS is the category of complete metric spaces with non-expansive maps. If T : CMS → CMS is locally contracting, then it satisﬁes the approximation property wrt. O : CMSop → Frm,whereO(M) is the set of open subsets of M. The ﬁnal coalgebra also arises as a limit in a category with embedding- projection pairs as arrows and is similar to the case of domains above. We now show, that functors obtained via constructions like lifting, cartesian product and function space admit a complete set PL of O-predicate liftings. For the same reason as above (functoriality of smash product and coalesced sum), we have to consider endofunctors on D⊥ and D⊥! separately. The general recipe for constructing a complete set of predicate liftings for T1 × T2, say, is to reconstruct the Scott topology on T1X × T2X by means of a representation of the Scott topology on T1X and T2X, obtained by a complete set of predicate liftings PLi for Ti (i =1, 2). The problematic constructions in this context are cartesian product and the function space construction, since the Scott topology O(D1 ×D2) on the cartesian product D1 ×D2 of two dcpos only coincides with the product of the Scott topologies O(D1)andO(D2)in case at least one of D1 and D2 are continuous. We therefore restrict attention to the case of full subcategories D ⊆ CONT of the category of continuous dcpos (see [1]). With the notable exception of the function space construction, it is straightforward to construct complete sets of predicate liftings for functors obtained by various domain constructions. We therefore begin with a representation of the Scott topology on the function space [D → E]={f : D → E | f Scott continuous}, in case both D and E are continuous and pointed. We refer to [1] for the notion of a base of a dcpo. In order to avoid confusion with bases of a topology, we explicitly speak of bases of dcpos and bases of topologies, when there is danger of confusion. If D and E are dcpos and d ⊆ D, e ⊆ E are subsets, we write d, e = {f ∈ [D → E] |∀d ∈ d.f(d) ∈ e} for the set of Scott continuous functions mapping all of d into e. The upper set generated by d ∈ D is denoted by d ↑= {d ∈ D | d d},where is the order on D.

280 Pattinson

Proposition 5.4 Suppose C ⊆ CONT⊥ is a full cartesian closed subcategory and let D, E ∈ C. Then the Scott topology on [D → E] is the compact open topology. If B ⊆ D is a base of the dcpo D,andB⊆O(E) is a base of the Scott topology O(E) on E,then

{d↑, o|d ∈ B, o ∈B} is a subbase of the Scott topology O([D → E]) on [D → E]. Proof (Sketch) By the classiﬁcation theorem of Jung [12], C is contained in the category of FS-domains or in the category of L-domains. Using the explicit representation of the objects of both, one can prove that Scott topology on [D → E] is contained in the compact open topology. The compact open topology on [D → E] is contained in the Scott topology for arbitrary dcpos D and E. Using Lemma 5.5, Chapter XII of [6], one shows that {d↑, o|d ∈ B, o ⊆ E Scott open} is a base of the compact open topology on [D → E]. The passage to the representation claimed is standard. ✷ We are now in the position to show the existence of a complete set of predicate liftings for an inductively generated class of functors. We denote the constant endofunctor with value D by KD and silently assume that D⊥ is closed under the constructions mentioned below.

Proposition 5.5 Every functor T : D⊥ → D⊥ belonging to the inductive class

D T ::= Id | KD | T⊥ | T1 × T2 | T (D ∈ D⊥) admits a complete set of O-predicate liftings.

Proof We just treat cartesian product and function space. Suppose PLi is a complete set of predicate liftings for Ti, i =1, 2. Denote the canonical × → { −1 ◦ | ∈{ } ∈PL} projection T1 T2 Ti by πi.Then πi λ i 1, 2 ,λ i is complete for T1 × T2, since the Scott-topology on the product of two continuous dcpos coincides with the product topology. d If PL is complete for T and D ∈ D⊥, consider the predicate lifting λ deﬁned by λd(X)(o ⊆ X)={f ∈ [D, TX] | f(d) ∈ λ(X)(o)} for λ ∈PLand d ∈ D. Naturality of λd is straightforward and Proposition 5.4 implies that {λd | λ ∈PL,d∈ B} is complete for T D,ifB is a base of the dcpo D. ✷

For functors T : D⊥! → D⊥!, it can easily be see that smash product and coalesced sum can also be included:

Proposition 5.6 Every functor T : D⊥! → D⊥! belonging to the inductive class

D T ::= Id | KD | T⊥ | T1 × T2 | T1 ⊗ T2 | T1 ⊕ T2 | T (D ∈ D⊥!) admits a complete set of O-predicate liftings. 281 Pattinson

6 Conclusions, Further and Related Work

We have demonstrated that the notion of predicate lifting smoothly generalises to posetal fibrations and gives rise to logical languages for coalgebras, which are bisimulation invariant. The needed to obtain the expressivity result (Theorem 4.3) are rather technical, and the hard part is to actually verify them, when it comes to examples. The similarity of ordered and metric domain equations could possibly be handled more uniformly by working in a more general, enriched framework [21]. To the author’s knowledge, logics for coalgebras on arbitrary categories have so far only been studied in [14], where a semantical framework for coalgebraic logics is discussed. In contrast to [14], we have presented languages which arise by means of an inductive definition, and are thus “more syntactic” in nature. Regarding the main example (coalgebras on categories of domains), logics ∼ describing observable properties on the solution TX = X of recursive domain equations have been discussed in some detail in [2]. Since the solution of a recursive domain equation given by a locally continuous endofunctor T is precisely the final coalgebra, Abramsky’s domain logic can also be used to formulate predicates on the final coalgebra. The work of Abramsky differs form the results presented here in that we view the logical language presented as modal logic, which is used to specify predicates on the final coalgebra. In view of the connection with Abramsky’s domain logic, the next step which has to be taken is to formalise a syntactical consequence relation and to investigate, under which conditions logics for coalgebras also serve as means to characterise solutions of recursive domain equations logically.

References

[1] S. Abramsky and A. Jung. Domain Theory. In S. Abramsky, D. Gabbay, and T. S. E. Maibaum, editors, Handbook of Logic in Computer Science, volume 3. Clarendon Press, 1994. [2] Samson Abramsky. Domain Theory in Logical Form. Annals of Pure and Applied Logic, 51:1–77, 1991. [3] P. Aczel and N. Mendler. A Final Coalgebra Theorem. In D. H. Pitt et al, editor, Category Theory and Computer Science, volume 389 of LNCS, pages 357–365. Springer, 1989. [4] Michael Barr. Terminal coalgebras in well-founded set theory. Theor. Comp. Sci., 114:229–315, 1993. Korrigendum in Theor. Comp. Sci., 124:189-192, 1993. [5] Francis Borceux. Handbook of Categorical Algebra, volume 2. Cambridge University Press, 1994. [6] James Dugundji. Topology. Allyn and Bacon, 1966. 282 Pattinson

[7] B. Jacobs and C. Hermida. Structural Induction and Coinduction in a Fibrational Setting. Information and Computation, 145:107–152, 1998.

[8] Bart Jacobs. Categorical Logic and Type Theory. Elsevier, 1998.

[9] Bart Jacobs. The temporal logic of coalgebras via Galois algebras. Technical Report CSI-R9906, Computing Science Institute, University of Nijmegen, 1999.

[10] Bart Jacobs. Many-Sorted Coalgebraic Modal Logic:a Model-theoretic Study. Technical Report CSI-R0020, Computing Science Institute, University of Nijmegen, 2000.

[11] Bart Jacobs. Towards a Duality Result in the Modal Logic of Coalgebras. In Horst Reichel, editor, Coalgebraic Methods in Computer Science (CMCS’2000), volume 33 of Electr. Notes in Theoret. Comp. Sci., 2000.

[12] Achim Jung. The Classiﬁcation of Continous Domains. In Logic in Computer Science, pages 35 – 40. IEEE Computer Society Press, 1990.

[13] A. Kurz and D. Pattinson. Coalgebras and Modal Logics for Parameterised Endofunctors. Technical report, CWI, 2000.

[14] Alexander Kurz. Coalgebras and Applications to Computer Science.PhD thesis, Universit¨at M¨unchen, April 2000.

[15] Alexander Kurz. Specifying Coalgebras with Modal Logic. Theor. Comp. Sci., 260, 2000.

[16] Lawrence Moss. Coalgebraic Logic. Annals of Pure and Applied Logic, 96:277– 317, 1999.

[17] Martin R¨oßiger. Coalgebras and Modal Logic. In Horst Reichel, editor, Coalgebraic Methods in Computer Science (CMCS’2000), volume 33 of Electr. Notes in Theoret. Comp. Sci., 2000.

[18] Martin R¨oßiger. From Modal Logic to Terminal Coalgebras. Theor. Comp. Sci., 260, 2000.

[19] J. Rutten and D. Turi. On the foundations of ﬁnal coalgebra semantics:non- well-founded sets, partial orders, metric spaces. In J. W. de Bakker, W. P. de Roever, and G. Rozenberg, editors, Sematics: Foundations and Applications, number 666 in Lect. Notes in Comp. Sci., pages 477 – 530. Springer, 1992.

[20] J. Rutten and D. Turi. On the foundations of ﬁnal coalgebra semantics:non- well-founded sets, partial orders, metric spaces. Math. Struct. in Comp. Sci., 8:481 – 540, 1998.

[21] Kim Ritter Wagner. Solving Recursive Domain Equations with Enriched Categories. PhD thesis, School of Computer Science, Carnegie Mellon University, July 1994.

283 CMCS’01 Preliminary Version

Bialgebraic Semantics and Recursion (Extended Abstract)

Gordon Plotkin 1,2

Division of Informatics, University of Edinburgh, King’s Buildings, Edinburgh EH9 3JZ, Scotland

In [9] a unifying framework was given for operational and denotational semantics. It uses bialgebras, which are combinations of algebras (used for syntax and denotational semantics) and coalgebras (used for operational semantics and solutions to domain equations). Here we report progress on the problem of adapting that framework to include recursion. A number of difficulties were encountered. An expected one was the need to treat orders in the general theory; much less expected was the need to give up defining bisimulations in terms of spans of functional bisimulations, and move to a relational view. Even so the outcome is not yet satisfactory because of well-known difficulties involved in prebisimulations. Our work can be compared to, e.g., that of [2]. The principal difference is that we aim, following [9], at a conceptual overview of the area using appropriate categorical tools. The main idea of [9] is to represent rules for operational semantics by a natural transformation:

ρ :Σ(X × BX) → BTX where Σ is the signature functor associated to an algebraic signature of the same name, B is a behaviour functor and T is the term monad associated to Σ. For suitable choices of these over the category of sets, image-ﬁnite sets of rules in GSOS format yield such natural transformations. Models of such rules are bialgebras

γ ΣX →α X → BX satisfying a suitable pentagonal condition. The category of these models has ∼ι an initial object consisting of the programming language ΣL = L and its op- γ erational semantics L → BL. Every model gives an adequate compositional

1 Email:[email protected] 2 This work has been done with the support of EPSRC grant GR/M56333:The Structure of Programming Languages:Syntax and Semantics. This is a preliminary version. The final version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Plotkin denotational semantics for the language L. Suppose now that the final coal- ∼σ gebra M = BM exists; it can be thought of as the solution to the “domain ∼ equation” X = BX in the category at hand. Then it automatically gives a final object which incorporates a semantic algebra ΣM →α M. One can model bisimulation by spans of coalgebra maps (first done in [3]). With this, under mild conditions, one has that the semantics given by the final coalgebra is fully abstract and that there is a greatest bisimulation which is a congruence. These conditions are that kernel pairs exist and that weak kernel pairs are preserved by B. Turning to recursion we need first to consider syntax. One way would be to add µ-terms µx.t to the usual algebraic terms over a given signature Σ, as in [2]. However, taken literally, we would then also have to incorporate an extension of the framework to handle binding operators, as in [5]. While that would be interesting, it would obscure the issues caused by recursion; we prefer to consider infinite terms, regarding the µ-terms as syntactic sugar for their unwinding. We therefore consider the functor Σ : CPPO → CPPO over the category of cppos and strict continuous functions given by: arity(f) Σ(D)= (D⊥ ) f∈Σ

Its initial algebra is the free continuous algebra with signature Σ. A natural choice of behaviour functor is

B(D)=P(A⊥ ⊗ D⊥) where A is a set of actions, and P(D) is the free continuous semilattice with a zero over D (on bifinite cppos this is the convex powerdomain with a zero). The final coalgebra of B is the solution to Abramsky’s domain equation for bisimulation [1]. Every image finite set of GSOS rules again gives rise to a natural transformation as above, but now over CPPO, and we may attempt to apply the general theory. However weak pullbacks are not preserved (because of convexity phenomena) and full abstraction fails in the sense that there are not enough bisimulations to equate processes with the same final semantics. One can instead develop a general theory for locally ordered categories (better: CPPO-enriched ones). One can define simulations (or ordered bisimulations) as spans of lax and and oplax coalgebra maps; it is shown in [4] that these include partial bisimulations [1] (also known as prebisimulations). There is then a new version of the general theory but now with the conditions that insertors exist and that weak insertors are preserved. Unfortunately, the weak preservation condition fails. This motivates us to give up considering bisimulations as spans and, instead, to work directly with relations. One introduces a category of relations, 285 Plotkin lifts the behaviour functor to the category and takes bisimulations to be coalgebras in the category of relations—see, e.g., [8]. Thus one asks that the coalgebra map is “logical,” preserving the relation. One can argue this is less arbitrary than the span view where one has to turn the relation itself into a coalgebra, and there is not necessarily a canonical way to do this (and in our case there may be no way!).

A ﬁrst attempt at a category for relations is CPPOa whose objects are pairs (D, R) of cppos and admissible relations on them, and whose morphisms are the strict continuous functions preserving the relations. The various functors involved have natural liftings to the category of relations: liftings of a functor F correspond to families FD of monotone actions on relations over D such that for any f :(D, Q) → (E,R),

−1 −1 FD(f (R)) ⊆ F (f) FE(R)

In the case of the powerdomain functor P there is a minimal such action which makes singleton, empty set and union logical. It is then natural to assume that ρ is logical, in the sense that it lives in CPPOa, and indeed this is the case when ρ arises from GSOS rules as above. One can then show that full abstraction holds if a partial converse to the above condition on actions holds, viz that

−1 −1 F (h) FM (=M ) ⊆ FL(h (=M )) where h is the unique coalgebra map from the operational semantics to the solution of the domain equation, and =M is the equality relation on M.How- ever, this is false and full abstraction again fails; intuitively this is because we are again trying to handle equality (bisimulation) when convexity phenomena are present. Instead one should use simulations, which we take to be admissible relations R such that: ≤ o R o ≤⊆ R and move to the subcategory CPPOs whose objects are pairs of cpppos and simulations. The condition now needed is that:

−1 −1 F (h) FM (≤M ) ⊆ FL(h (≤M )) where ≤M is the inequality relation on M. This holds under the condition that h preserves finiteness. Unfortunately that does not hold in general, though it does if the set of GSOS rules is compact in the sense of [2]. However we lack a conceptual proof of that assertion, and in any case we would prefer a general result. The essential difficulty is that an infinite process may be simulated by a set of processes rather than any single one. It may be possible to account for this by changing the kind of relations used, taking advantage of the scope of the fibrational approach to logical relations as discussed in, e.g., [7]. 286 Plotkin

As regards possible further developments, one evident possibility for future work is to combine recursion and value-passing (e.g. as in the work of [6]). More immediately it would be good to treat other examples than GSOS. In particular probabilistic computation presents a challenge with its need for continuous domain theory.

References

[1] S. Abramsky, A Domain Equation for Bisimulation, in Inf. and Comp.,, Vol. 92, pp. 161–218, 1991.

[2] L. Aceto and A. Ing´olfsd´ottir, CPO Models for GSOS Languages—Part I: Compact GSOS Languages, in Inf. and Comp., Vol. 129, No. 2, pp. 107–141, 1994.

[3] P. Aczel and N. Mendler, A Final Coalgebra Theorem, in Proc. CTCS ’89 (eds. D. H. Pitt, D. E. Ryeheard, P. Dybjer, A. M. Pitts and A. Poign´e), LNCS Vol. 389, pp. 357–365, Berlin:Springer-Verlag, 1989.

[4] M. P. Fiore, A Coinduction Principle for Recursive Data Types Based on Bisimulation, in Information and Computation, Vol. 127, No. 2, pp. 186–198, 1996.

[5] M. P. Fiore, G. D. Plotkin and D. Turi, Abstract Syntax and Variable Binding, in Proc. LICS ’99, pp. 193–202, Washington:IEEE Computer Society Press, 1999.

[6] M. P. Fiore and D. Turi Semantics of Name and Value Passing,toappearin Proc. LICS 2001.

[7] C. Hermida, and B. Jacobs, Structural Induction and Coinduction in a Fibrational Setting, Inf. & Comp., Vol. 145, No. 2, pp. 107–152, 1998.

[8] B. Jacobs, Mongruences and Cofree Coalgebras, in Algebraic Methodology and Software Technology (eds. V. S. Alagar and M. Nivat), LNCS Vol. 936, pp. 245–260, Berlin:Springer-Verlag, 1995.

[9] D. Turi and G. D. Plotkin, Towards a Mathematical Operational Semantics, in Proc. LICS 97, pp. 268–279, Washington:IEEE Press, 1997.

287 CMCS’01 Preliminary Version

From Algebras and Coalgebras to Dialgebras

Erik Poll and Jan Zwanenburg

University of Nijmegen Toernooiveld 1, 6525 ED Nijmegen, The Netherlands

Abstract This paper investigates the notion of dialgebra, which generalises the notions of algebra and coalgebra. We show that many (co)algebraic notions and results can be generalised to dialgebras, and investigate the essential diﬀerences between (co)algebras and arbitrary dialgebras.

1 Introduction

An algebra is a set X together with some functions that can be used to construct elements of X, i.e. functions fi that have X as output type,

fi : INi(X) → X, with INi a polynomial functor. Algebras are widely used in (theoretical) computer science. E.g. think of algebraic datatypes such as lists and trees, or algebraic speciﬁcations. A coalgebra is a set X together with some functions that can be used to observe elements of X, i.e. functions fi that have X as input type,

fi : X → OUTi(X), with OUTi a polynomial functor. Coalgebras can be used to describe various kinds of ‘dynamical systems’, e.g. automata, processes, or (labelled) transition systems [21]. Moreover, elements of coalgebras can viewed as objects in the sense of object-oriented (OO) programming [17], in which case the operations are viewed as methods. For an introduction to – and a comparison between – algebras and coalgebras we refer to [11]. Algebras and coalgebras are dual notions, and are in some intuitive sense ‘opposites’. However, this does not mean that algebra and coalgebra do not have certain things in common. Indeed, the standard example of an algebraic speciﬁcation, stacks, also occurs in the literature as an example of a coalgebraic speciﬁcation!

1 Email: {erikpoll,janz}@cs.kun.nl This is a preliminary version. The ﬁnal version will be published in Electronic Notes in Theoretical Computer Science URL: www.elsevier.nl/locate/entcs Poll and Zwanenburg

This paper investigates the notion of dialgebra, a straightforward generalisation of (co)algebra: a dialgebra is a set X together with some functions

fi : INi(X) → OUTi(X) with INi and OUTi polynomial functors. The name ‘dialgebra’ is taken from [4]. Clearly all algebras and coalgebras are dialgebras. An example of a dialgebra that is neither an algebra nor a coalgebra is a type Set of sets of natural numbers with operations empty : Set add : Set × Nat → Set elem : Set × Nat → bool union : Set × Set → Set min : Set → 1+Nat split : Set × Nat → Set × Set Here min could for instance return the minimum of a set, if the set is nonempty, and an element of a unit type 1 if the set is empty, and split(s,n) could split the set s in a pair of sets, one containing the elements of s smaller than n and the other containing the rest. This is not an algebra, because for instance split and min are not ‘algebraic’, nor is it a coalgebra, because for instance empty and the binary operation union are not ‘coalgebraic’. Many interesting examples of dialgebras that are not (co)algebras can be obtained simply by extending a coalgebra with an operation init :1→ X that yields some initial state. Including such an operation is a very natural thing to do for many examples of coalgebras. In particular, this is very natural when considering objects in the sense of OO: here using dialgebras rather than coalgebras makes it possible to account for the constructors as well as the methods of a class. In addition it becomes possible to account for so-called binary methods. We will show that most (co)algebraic notions can be deﬁned for the more general dialgebraic case, and investigate in how far properties of (co)algebras – in particular properties of invariants and bisimulations – can be generalised to arbitrary dialgebras, to get a better understanding of what the essential diﬀerences between algebras, coalgebras, and dialgebras are.

2 Mathematical Preliminaries

Throughout this paper, types will just be sets. Signatures are mappings from types to types, written as type expressions containing a type variable X. Deﬁnition 2.1 A signature Σ(X) is a type expression, possibly containing the type variable X, of the form

Σ(X) ::= X | C | Σ1(X)+Σ2(X) | Σ1(X) × Σ2(X) | Σ1(X) → Σ2(X), 289 Poll and Zwanenburg where C comes from a collection of base types. Here A × B is the Cartesian product, with projections π1 : A × B → A and π2 : A × B → B,andA + B the disjoint sum, with injections inl : A → A + B and inr : B → A + B.The functions [f1,f2]:A1 + A2 → B for fi : Ai → B,andg1,g2 : A → B1 × B2 for gi : A → Bi aredeﬁnedasusual. A polynomial signature is a signature of the form

F (X) ::= X | C | F1(X)+F2(X) | F1(X) × F2(X). N.B. We deliberately exclude constant exponents of the form C → F (X) as polynomial functors, because we have not been able to prove our most interesting result, Theorem 3.20, if we include these. Polynomial signatures are functors, and F (f):F (A) → F (B)isdefined in the usual way for f : A → B. The notions of predicate and relation lifting can be defined not just for polynomial signature but for all signatures: we can define predicate and relation lifting for arbitrary signatures. Definition 2.2 For predicates P and Q we define the predicates • pred (P × Q)(x) ⇐⇒ P (π1(x)) ∧ Q(π2(x)) • (P +pred Q)(x) ⇐⇒ (x = inl(x) ∧ P (x)) ∨ (x = inr(x) ∧ Q(x)) • (P →pred Q)(f) ⇐⇒ ∀ x ∈ A. P (x) ⇒ Q(f(x)), with A the domain of P . Definition 2.3 For relations R and S we define the relations • R +rel S = {(inl(x), inl(y)) | (x, y) ∈ R}∪{(inr(x), inr(y)) | (x, y) ∈ S} • rel R × S = {(x, y) | (π1(x),π1(y)) ∈ R ∧ (π2(x),π2(y)) ∈ S} • R →rel S = {(f,g) |∀(x, y) ∈ R. (f(x),g(y)) ∈ S} Definition 2.4 Let Σ(X) be an arbitrary signature. For a predicate P on X and a relation ∼⊆X × Y , the predicate Σpred(P )onΣ(X)andtherelation Σrel(∼) ⊆ Σ(X) × Σ(Y ) are defined, by induction on the structure of Σ(X), as • if Σ(X)=X then Σpred(P )=P and Σrel(∼)=∼, • pred if Σ(X)=C then Σ (P )=TrueC , the constant predicate ‘true’ on C, rel and Σ (∼)=IdC , the identity relation on C, • pred pred if Σ(X)=Σ1(X)+Σ2(X)thenΣ (P )=Σ1(P )+ Σ2(P )and rel rel Σ (∼)=Σ1(∼)+ Σ2(∼), • pred pred if Σ(X)=Σ1(X) × Σ2(X)thenΣ (P )=Σ1(P ) × Σ2(P )and rel rel Σ (∼)=Σ1(∼) × Σ2(∼), • pred pred if Σ(X)=Σ1(X) → Σ2(X)thenΣ (P )=Σ1(P ) → Σ2(P )and rel rel Σ (∼)=Σ1(∼) → Σ2(∼). Of course, if we identify predicates with subsets, then there is no difference between Σ and Σpred. The notion of relation lifting is not just used in the coalgebraic literature (e.g. [10]), but is much more widely used, notably for 290 Poll and Zwanenburg logical relations in the semantics of typed lambda calculus (see [12] for a comprehensive overview) and to formalise the notion of parametricity (e.g. see [14]). Lemma 2.5 Let Σ(X) be an arbitrary signature. Then pred (i) Σ (TrueX )=TrueΣ(X) rel (ii) Σ (IdX )=IdΣ(X) Proof. Induction on the structure of Σ(X). ✷ For polynomial signatures, there is a close connection between relation and function lifting: Lemma 2.6 For any polynomial signature F (X) graph(F (f)) = F rel(graph(f)) where graph(f) ⊆ A × B is the function f : A → B viewed as a relation. Proof. Induction on the structure of F (X). ✷ Lemma 2.7 Let F (X) be a polynomial signature, and let i range over I. Then (i) P ⊆ Q ⇒ F pred(P ) ⊆ F pred(Q) (ii) R ⊆ S ⇒ F rel(R) ⊆ F rel(S) . . (iii) F pred(P )=F pred( P ) .i i . i i (iv) F rel(R )=F rel( R ) if I is not empty. i i i i (v) F pred(P ) ⊆ F pred( P ) i i i i rel ⊆ rel (vi) i F (Ri) F ( i Ri) (vii) F rel(R; S)=F rel(R); F rel(S) Proof. All these can be proved by induction on the structure of F (X). Prop- erties 3. and 4. are also easy consequences of 1. and 2., respectively. ✷ None of the properties in Lemma 2.7 hold for arbitrary signatures. Note that we have stronger properties for intersection, (iii) and (iv), than for union, pred pred rel (v) and (vi). The properties i F (Pi)=F ( i Pi)and i F (Ri)= rel F ( i Ri) do not hold for all polynomial signatures F (X) (for counterexamples, take F (X)=X × X), but do hold for some polynomial signatures: Lemma 2.8 Let F (X) be a polynomial signature with at most one occurrence of X.Then (i) F pred(P )=F pred( P ) i i i i rel rel (ii) i F (Ri)=F ( i Ri) Proof. Induction on the structure of F (X). Of course, for F (X)withno occurrence of X – i.e. F (X) a constant – the property is trivial. ✷ 291 Poll and Zwanenburg

3 Dialgebras

Deﬁnition 3.1 A dialgebraic signature is a signature of the form

Σ(X)=Σ1(X) × ...× Σn(X), with each Σi(X) of the form

Σi(X)=INi(X) → OUTi(X), with INi(X)andOUTi(X) polynomial signatures. Σ(X) is called algebraic iff OUTi(X)=X for all i,andcoalgebraic iff INi(X)=X for all i. Throughout the remainder of this paper, Σ will be a dialgebraic signature of the form * * Σ(X)= Σi = INi(X) → OUTi(X). i∈I i∈I In examples Σ(X) will usually be a labelled product rather than unlabelled one; this is just syntactic sugar. Definition 3.2 AΣ-dialgebra is a pair (A, f) consisting of a set A and a function f ∈ Σ(A), i.e. f =(f1,...,fn)withfi ∈ Σi(A)=INi(A) → OUTi(A). For (co)algebraic signatures, this is just the definition of Σ-(co)algebra. An important difference between dialgebras and (co)algebras is that whereas an algebra with n operations fi : Fi(X) → X can be turned into a algebra with the single operation, namely [f1,...,fn]:F1(X)+...+ Fn(X) → X, and, similarly, a coalgebra with n operations fi : X → Fi(X) can be turned into a coalgebra with just one operation f1,...,fn,wecannot do something similar for dialgebras. A practical consequence is that most definitions and proofs for dialgebras have to be ‘point-wise’, quantifying over i. Definition 3.3 Let (A, f)and(B,g) be Σ-dialgebras. A Σ-homomorphism h :(A, f) → (B,g) is a function from A to B that preserves the operations, i.e.

OUTi(h) ◦ fi = gi ◦ INi(h) for all i. For dialgebras that are (co)algebras, this is the standard notion of (co)algebraic homomorphism.

Lemma 3.4 (i) The identity idA is a homomorphism from (A, f) to itself. (ii) Homomorphisms are closed under composition, i.e. if h :(A, f) → (A,f) and h :(A ,f ) → (A ,f ) then h ◦ h :(A, f) → (A ,f ). Proof. Easy. ✷ Lemma 2.7 listed some properties for polynomial signatures that do not hold for arbitrary signatures. For dialgebraic signatures, we can salvage some of the properties mentioned in Lemma 2.7: 292 Poll and Zwanenburg

Lemma 3.5 Let Σ(X) be a dialgebraic signature, and let i range over I.Then . . (i) Σpred(P ) ⊆ Σpred( P ) .i i . i i rel ⊆ rel (ii) i Σ (Ri) Σ ( i Ri) if I is not empty. (iii) Σrel(R); Σrel(S) ⊆ Σrel(R; S) Proof. The proofs are quite straightforward, using Lemma 2.7. We just give the proof of (i) for binary intersection; the others are similar. f ∈ Σ(P ∩ Q)

⇐⇒ ∀ j. fj ∈ Σj(P ∩ Q)

⇐⇒ ∀ j. fj ∈ INj(P ∩ Q) → OUTj(P ∩ Q)

⇐⇒ ∀ j. ∀x ∈ INj(P ∩ Q).fj(x) ∈ OUTj(P ∩ Q)

⇐⇒ ∀ j. ∀x ∈ INj(P ) ∩ INj(Q).fj(x) ∈ OUTj(P ) ∩ OUTj(Q) by Lemma 2.7(iii) (twice)

⇐∀j. (∀x ∈ INj(P ).fj(x) ∈ OUTj(P ))

∧(∀x ∈ INj(Q).fj(x) ∈ OUTj(Q))

⇐⇒ ∀ j. fj ∈ INj(P ) → OUTj(P ) ∧ fj ∈ INj(Q) → OUTj(Q)

⇐⇒ ∀ j. fj ∈ Σj(P ) ∩ Σj(Q) ⇐⇒ f ∈ Σ(P ) ∩ Σ(Q) ✷

3.1 Invariants and sub-dialgebras The notion of invariant is used in the literature both for algebras and for coalgebras. Intuitively, a predicate is an invariant if all the operations preserve it: Definition 3.6 A predicate P on A is an invariant for a Σ-dialgebra (A, f) iff f ∈ Σpred(P ). For dialgebras that are (co)algebras, this is the standard notion of (co)algebraic invariant. Given an invariant we can construct a sub-dialgebra: Definition 3.7 A sub-dialgebra of a Σ-dialgebra (A, f) is another Σ-dialgebra (A,f)withA ⊆ A. For dialgebras that are (co)algebras, this is the standard notion of sub-(co)algebra. For (A,f) to be a sub-dialgebra of (A, f) all the functions in f have to be ‘closed’ under the subset A of A. Lemma 3.8 Invariants are closed under intersection. Proof. Follows immediately from Lemma 3.5: if f ∈ Σ(P )andf ∈ Σ(Q) then f ∈ Σ(P ) ∩ Σ(Q) ⊆ Σ(P ∩ Q) by Lemma 3.5. ✷ As a consequence of this lemma, we can define the smallest invariant of a dialgebra as the intersection of all invariants. This strongest invariant expresses 293 Poll and Zwanenburg exactly the property of being ‘reachable’ by the dialgebra operations. Invariants of dialgebras are not always closed under union: Example 3.9 Let T (X)=X × X → X and consider the T -(di)algebra (Z, +). The predicates Neg(x)=x<0andPos(x)=x>0onZ are both invariants, but clearly their union is not, because the sum of a positive and a negative number may be 0. For coalgebras, however, we do have this property (e.g. see [21,11]): Lemma 3.10 For coalgebras, invariants are closed under union. A useful consequence of this lemma is that, for a coalgebra, given any property Φ there exist a largest invariant Φ ⊆ Φ, namely the union of all invariants that are subsets of Φ. We can slightly generalise Lemma 3.10. For this we first define

Deﬁnition 3.11 Σ(X) has no binary methods if none of the INi(X) has more than one occurrence of X. Clearly all coalgebras are dialgebras without binary methods. Many interesting examples of dialgebras without binary methods that are not coalgebras can be obtained in the way mentioned earlier, simply by extending a coalgebra with an operation that yields some initial state. Note that (counter)Example 3.9 involves a binary method. Binary methods are already notorious in the theoretical computer science literature on object oriented (OO) programming; see [2]. Some properties of coalgebras, that do not hold for all dialgebras, do hold for all dialgebras without binary methods, including: Lemma 3.12 For dialgebras without binary methods, invariants are closed under union.

Proof. Let Σ be a signature without binary methods. It suﬃces to prove pred ⊆ pred that i Σ (Pi) Σ ( i Pi). The crucial property of dialgebraic signature without binary methods needed to prove this is that, since there is at most one pred pred occurrence of X in the INi(X), i F (INi)=F ( i INi) by Lemma 2.8: f ∈ Σ(P ) ∪ Σ(Q)

⇐⇒ ∀ j. fj ∈ Σj(P ) ∪ Σj(Q)

⇐⇒ ∀ j. fj ∈ INj(P ) → OUTj(P ) ∨ fj ∈ INj(Q) → OUTj(Q)

⇐⇒ ∀ j. (∀x ∈ INj(P ).fj(x) ∈ OUTj(P ))

∨(∀x ∈ INj(Q).fj(x) ∈ OUTj(Q))

⇒∀j. ∀x ∈ INj(P ) ∪ INj(Q).fj(x) ∈ OUTj(P ) ∪ OUTj(Q)

⇐⇒ ∀ j. ∀x ∈ INj(P ∪ Q).fj(x) ∈ OUTj(P ) ∪ OUTj(Q)

since INj(P ∪ Q)=INj(P ) ∪ INj(Q) by Lemma 2.8

⇒∀j. ∀x ∈ INj(P ∪ Q).fj(x) ∈ OUTj(P ∪ Q)

since OUTj(P ) ∪ OUTj(Q) ⊆ OUTj(P ∪ Q) by Lemma 2.7 294 Poll and Zwanenburg

⇐⇒ ∀ j. fj ∈ INj(P ∪ Q) → OUTj(P ∪ Q)

⇐⇒ ∀ j. fj ∈ Σj(P ∪ Q) ⇐⇒ f ∈ Σ(P ∪ Q) ✷

3.2 Bisimulations, (partial) congruences, and quotient-dialgebras The notion of bisimulation plays an important role in the literature on coalgebras, as does the closely related notion of (partial) congruence in the literature on algebras. Definition 3.13 Arelation∼⊆A × B is a bisimulation between two Σ- dialgebras (A, f)and(B,g)iff(f,g) ∈ Σrel(∼). Dialgebras (A, f)and(B,g) are bisimilar if there exists a bisimulation between (A, f)and(B,g). For (co)algebras one has the property that (co)algebra homomorphisms are just functional bisimulations. This property also holds for dialgebras: Lemma 3.14 Let (A, f) and (B,g) be Σ-dialgebras and h be a function from A to B.Thenh : A → B is a homomorphism iff graph(h) ⊆ A × B is a bisimulation. Proof. Straightforward, using Lemma 2.6. ✷ Lemma 3.15 Bisimulations are closed under intersection and composition. Proof. Follows immediately from Lemma 3.5. ✷ Bisimulations between dialgebras are not always closed under union. For coalgebras this is a basic property (e.g. see [21,11]): Lemma 3.16 For coalgebras, bisimulations are closed under union. An important consequence of this property of coalgebras is that between any two coalgebras there exists a largest bisimulation, namely the union of all bisimulations. For arbitrary dialgebras we do not have this property. Lemma 3.17 For dialgebras without binary methods, bisimulations are closed under union. Proof. Similar to Lemma 3.12. ✷ Of special interest are bisimulations between a dialgebra and itself: Definition 3.18 Arelation∼⊆A×A is a (partial) congruence on a dialgebra (A, f) iff it is a bisimulation between (A, f) and itself, – i.e. (f,f) ∈ Σ(∼)– and it is a (partial) equivalence relation. In [21], for coalgebras, what we call a congruence here is called a bisimulation equivalence. Given a (partial) congruence we can construct a quotient- dialgebra: 295 Poll and Zwanenburg

Deﬁnition 3.19 Let ∼ be a (partial) congruence on (A, f). Then the quotient- dialgebra ([S]∼, [f]∼) is the Σ-dialgebra ([S]∼, [f]∼), where [S]∼ is the collection of ∼-equivalence classes, and [f]∼ is the family of functions on ∼-equivalence classes induced by f. As mentioned above, bisimulations are not closed under union. Congruences, however, are, in the following sense: ∈ Theorem 3.20 Let Rj be congruences on the Σ-dialgebra (A, f),forj J, ∅ ∗ J = .Then( j Rj) is also a congruence relation for (A, f). Proof. See the appendix. ✷ So, for any dialgebra there exists a largest congruence relation, namely (the transitive closure of) the union of all congruences. Intuitively, this is the notion of observational equality for that dialgebra. In Section 3.3 below, we discuss this diﬀerence between dialgebras and coalgebras – the existence of largest congruences vs largest bisimulations – in more detail. The property above does not hold for partial congruences: 2 Example 3.21 Let T (X)=X × X → Z and consider the T -(di)algebra (Z,f)wheref is the function f(x, y)=ify<0thenx else 23

The relations R = N × N and IdZ on Z are both bisimulations, i.e. (x, x) ∈ R ∧ (y, y) ∈ R =⇒ f(x, y)=f(x,y) (x, x) ∈ Id ∧ (y, y) ∈ Id=⇒ f(x, y)=f(x,y) However, neither R ∪ Id nor (R ∪ Id)∗ is a bisimulation, since for instance (4, 5) ∈ R ∪ Id and (−1, −1) ∈ R ∪ Id, but f(4, −1) = f(5, −1).

3.3 Coalgebras vs dialgebras: the problem with binary methods Moving from coalgebras to dialgebras we gain something, notably the possibility of having binary methods. The price for this is that some properties are lost, namely • the existence of final coalgebras, • the existence of unique largest bisimulation between any two coalgebras. These two properties are intimately connected, as the largest bisimulation relates precisely those elements that have the same image under the unique homomorphisms to the final coalgebra. The properties are useful because they provide a canonical notion of observational equality between elements of different coalgebras. Two different coalgebras (A, f)and(B,g)withthe

2 The property does hold for partial congruences if these all have the same domain; instead ∗ of the transitive and reflexive closure ( j Rj) one then has to consider the transitive closure + ( j Rj) . 296 Poll and Zwanenburg same signature Σ can be regarded as different implementations for classes with the same interface. The properties above then provide a canonical notion of equality between objects from these two classes: an object with an internal state a ∈ A – using implementation (A, f) – is observationally equal to an object with an internal state b ∈ B – using implementation (B,g)–iffa b, where a b is the greatest bisimulation between (A, f)and(B,g). Below a simple (counter)example to illustrate the fundamental problem with the union of bisimulations caused by a binary method: Example 3.22 Consider Σ(X)=(X × X → X) × (X → N)andtheΣ- dialgebras Z =(Z, (+,abs)) and L =(List, (++,length)) with List the set of lists over some type, ++ the concatenation operation, and abs : Z → N the function returning the absolute value.

The relations ∼1 and ∼2,

∼1 = {(z,l) | z = length(l)}∼2 = {(z,l) | z = −length(l)} are bisimulations between Z and L. However, ∼1 ∪∼2 is not a bisimulation between Z and L; for example, if l is some list with three elements, then

−3(∼1 ∪∼2)l ∧ 3(∼1 ∪∼2)l ⇒ −3+3(∼1 ∪∼2) l++l. The problem is that when the binary method (+ or ++) is used to observe an individual element (of Z or List, respectively), the operation + of Z offers different – more – observations than the operation ++ of L. Binary methods are notorious in the literature on the theoretical foundations of object-oriented programming: in the presence of binary methods defining a satisfactory notion of subtyping becomes a problem [2]. This problem is closely related to the fact that there is no canonical notion of ‘observational equivalence’ in the presence of binary methods. Indeed, the coalgebraic definition of subtyping (see [15]) crucially depends on the existence of final coalgebras, or the existence of unique largest bisimulation between any two coalgebras.

4Dialgebraic Speciﬁcation

Just like algebras provide the basis for algebraic specification, coalgebras have been used as the basis for coalgebraic specification, notably in the experimental specification language CCSL [8,20]. We now consider a notion of dialgebraic specification, defined in exactly the same way. Because of lack of space, we have to omit many details of definitions and proofs. For example, an equational dialgebraic specification for the dialgebraic signature of Set mentioned in the introduction could include union(s,empty) = s union(s,t) = union(t,s) elem(add(s,n),m) = (n=m or elem(s,m)) 297 Poll and Zwanenburg

elem(union(s,t),n) = elem(s,n) or elem(t,n) elem(empty,n) = false min(s) = inl(x) ⇐⇒ s = empty min(s) = inr(m) ⇒ elem(s,m) = true min(s) = inr(m) ∧ elem(s,n) = true ⇒ m ≤ n union(split(s,m)) = s . . A model of this dialgebraic speciﬁcation would be a dialgebra providing an implementation of all the operations for which these equations hold. However, instead of insisting that models satisfy the equations above, one could also only require that they satisfy the equations above up to some congruence relation. This weaker notion of model makes sense because intuitively a congruence relation provides a notion of ‘observational equivalence’. For example, consider a simple implementation of the speciﬁcation above using lists, in which union is implemented as concatenation ++, i.e. some dialgebra L =(ListNat, (...,++ ,...). This implementation does not satisfy union(s,t) = union (t,s) as concatenation is not commutative, but it does satisfy

union(s,t) ∼perm union (t,s) where ∼perm is the relation on lists that relates permutations. If ∼perm is a congruence for L,thenL can be regarded as a correct implementation of the specification. To formally introduce a notion of equational specification for dialgebras, we+ need to introduce some syntax. We fix a dialgebraic signature Σ(X)= → i∈I INi(X) OUTi(X) and use names opi for the operations of the dialgebra: Definition 4.1 The type expressions are given by

E ::= X | C | E1 + E2 | E1 × E2 | E1 → E2 where X stands for the carrier of the dialgebra, and C ranges over some collection of base types. The term expressions is given by E E E e ::= opi | x | c | λx .e| e(e) | inlE+E(e) | inrE+E(e) | π1(e) | π2(e) where xE is a variable of type E,andcE a constant of type E not containing X. We assume an inﬁnite number of variables for every type E. The collection of well-typed terms is deﬁned in the obvious way. The propositions are given by E Φ ::= ¬Φ | Φ1 ∧ Φ2 | e1 =E e2 |∀x . Φ where E is some type expression possibly containing X,ande1 and e2 are well-formed expressions of type E. 298 Poll and Zwanenburg

A dialgebraic speciﬁcation Φ is simply a closed proposition.

Note that the definition above allows more propositions than usually allowed in algebraic specification; for example, it allows universal quantifications over the type X → X. Definition 4.2 The interpretation of a type E,terme, and proposition Φ in the dialgebra (A, f), [[ E ]] A,f ,[[e ]] A,f ,and[[Φ]]A,f , are defined in the obvious way, by induction on the structure, interpreting X as A and opi :Σi(X)as fi :Σi(A). Because terms and propositions can contain free variables, we have to define the interpretation of terms and propositions wrt. an environment η, η η E [[ e ]] A,f and [[Φ ]]A,f , where the environment η assigns to every free variable x an interpretation in E(A). Definition 4.3 A dialgebra (A, f) satisfies specification Φ, written (A, f) |=

Φ, iff [[Φ ]]A,f is true. Now that we have a syntax, we can define a notion of observational equivalence: Definition 4.4 Two Σ-dialgebras (A, f)and(B,g)areobservationally equivalent iff for all closed expressions e of some closed type E (i.e. a type E not containing X) the interpretation of e in (A, f)isequaltoitsinterpretationin (B,g).

As one would expect, bisimilar dialgebras are observationally equivalent: Theorem 4.5 Let (A, f) and (B,g) be Σ-dialgebras. If (A, f) and (B,g) are bisimilar then they are observationally equivalent.

Proof. Let ∼ a bisimulation between (A, f)and(B,g). For environments η and ξ we write η ∼ ξ if (ρ(xE),ξ(xE)) ∈ Erel(∼) for all xE in their domains. η ξ ∈ rel ∼ Then we can prove that ([[ e ]] (A,f) , [[ e ]] (B,g)) E ( ) for all e : E and for all η ∼ ξ, by induction on the derivation of e. ✷

Behavioural Satisfaction We now consider a weaker notion of behavioural satisfaction of dialgebras satisfying specifications ‘up to some congruence relation’. First we define [[ Φ ]] A,f,, the interpretation of Φ in the dialgebra (A, f) wrt. a congruence : 3 η Definition 4.6 For a congruence on the dialgebra (A, f), [[Φ ]]A,f, is η defined as [[Φ ]]A,f , except that equality is interpreted as follows: η η η ∈ rel [[ e1 =E e1 ]] A,f, =([[e1 ]] A,f, , [[ e2 ]] A,f,) E ( )

3 One could also take a partial congruence, but then the semantics of types would have to be changed with [[ X ]] = dom(). 299 Poll and Zwanenburg

Note that if X does not occur in E, this simply reduces to η η η [[ e1 =E e1 ]] A,f, =([[e1 ]] A,f, =[[e2 ]] A,f,) A notion of behavioural satisfaction can now be defined as follows: Definition 4.7 Let (A, f) be a Σ-dialgebra and a congruence relation for | it. Then (A, f) satisfies Φ with respect to , written (A, f) = Φ, iff [[Φ ]]A,f, is true.

Theorem 4.8 (A, f) |= Φ iﬀ (A, f)/|=Φ

Proof. By induction on the structure of Φ we can prove that [[Φ ]]A,f, = [[ Φ ]] (A,f)/. To prove this we must first prove relations between [[ E ]] A,f and [[ E ]] (A,f)/,and[[e ]] A,f and [[ e ]] (A,f)/,namely[[E ]] A,f / =[[E ]] (A,f)/ and ✷ [[[ e ]] A,f ]E() =[[e ]] (A,f)/. Definitions and results similar to the ones above can be found in the literature for algebraic specifications, e.g. in [1,9,19]. We do not know of any similar definitions or results in the literature on coalgebras or coalgebraic specifications. However, given that the notion of ‘observability’ plays a much more central role in the coalgebraic setting than in the algebraic setting, we believe that a notion of behavioural satisfaction makes even more sense for coalgebraic specifications than for algebraic ones. More work would be needed to really exploit the opportunities offered by the notion of behavioural satisfaction when reasoning about specifications. In particular, one would want to establish that any consequences of a specification – in a particular logic – are not just valid for models satisfying the specification, but also for models behaviourally satisfying the specification. For algebras and algebraic specifications this idea is pursued in [1,9,19]. In a type-theoretic setting, this idea is illustrated in [16] and further investigated in [23]; the abstract data types considered here are more general than dialgebras.

5 Related Work

Several ways to combine algebras with coalgebras have been investigated over the past few years. One way of combining algebra with coalgebra is to consider pairs consisting of an algebra and a coalgebra, sometimes called bi-algebras. This is done in [7] and [3]. Dialgebras are clearly more general than algebra-coalgebra pairs. Using a algebra-coalgebra pair rules out operations f : IN(X) → OUT(X) where both INi(X) = X and OUTi(X) = X, for example a ‘partial’ binary operation f : X × X → 1+X. In [22] Tews introduces extended polynomial functors and coalgebras for these extended polynomial functors. This setting allows operations that are not possible in our dialgebra-setting, because a (restricted) use of → is possible in output types. For example, g : X → (C1 → X)+C2 is a coalgebra 300 Poll and Zwanenburg for some extended polynomial functor, but cannot be an operation of any dialgebra. However, whereas our notion of dialgebras subsumes algebras, the setting of [22] does not; this setting is still strictly coalgebraic and does not allow algebraic operations, not even one as simple as g : C → X.InOO terminology, the setting of [22] allows binary methods but not constructors. As for our dialgebras, for the coalgebras in [22] bisimulations turn out to be closed under intersection and composition, but not under union. It would be interesting to see if a result similar to Theorem 3.20, i.e. closure under union for congruences, could be proved for extended polynomial functors. Dialgebras and dialgebraic specifications can be regarded as special cases of the abstract data types and the specifications for abstract data types considered in a type-theoretic setting in [16,23]. Such a type-theoretic setting is also used in [5,6]. The crucial observation to link dialgebras with type theory is that dialgebras – and hence algebras and coalgebras – can be regarded as abstract datatypes. Abstract datatypes can be elegantly described in type theory using so-called existential types [13], and the logic for the notion of parametricity described in [14] then offers the expected proof rule for these existential types, namely that two implementations of an abstract datatype are equal if there exists a bisimulation between them. This type-theoretic setting allows much wilder signatures than the dialgebraic signatures considered in this paper. This suggests further generalisations, for example with • operations with higher-order types, e.g. map : SetNat × (Nat → Nat) → SetNat,or • polymorphic operations, i.e. operations with type parameters, e.g. polymorphicMap : SetNat →∀α.(Nat → α) → Setα . Finally, one of the referees drew our attention to [18], which introduces a notion of ‘nested sketches’ that support operations with arbitrarily structured in-and output types and seem more general than our dialgebras.

6 Conclusions

We have shown that the notion of dialgebra is a well-behaved generalisation of the notions of algebra and coalgebra. Dialgebras are more general than both algebras and coalgebras. The coalgebraic setting does for instance not allow ‘binary methods’, i.e. operations of type f : X × X → X, or operations returning an initial state i.e. operations of type f :1→ X. The algebraic setting does not allow operations with complicated return types, e.g. ‘partial’ operations f : X → 1+X. We have shown that many notions used in the fields of algebra and coalgebra are essentially identical, and can already be defined for the more general dialgebraic case: the (co)algebraic notions of homomorphism, invariant, bisim- 301 Poll and Zwanenburg ulation, and (partial) congruence can all be extended to dialgebras, preserving many of the essential properties. We have also shown that dialgebraic specification provide a generalisation of (co)algebraic specification, and indicated how the notion of behavioural satisfaction, used in the field algebraic specification, can be extended to dialgebraic specifications (and hence also to coalgebraic specifications). Given that (co)algebras are special cases of dialgebras, it is to be expected that some properties are lost when moving from algebras or coalgebras to dialgebras. Most obviously, we no longer have the existence of initial c.q. final models. However, as far as dialgebraic specifications are concerned losing these properties maybe is not too bad, given that one is usually interested in loose semantics anyway. In addition to this, useful properties of coalgebras that do not hold for arbitrary dialgebras are closure under union for invariants and bisimulations (Lemmas 3.10 and 3.16). The fact that we do not have these closure properties can be traced back to so-called binary methods, which are already notorious in the literature on object-oriented programming because of the problems they cause with subtyping (see [2]). For dialgebras without binary methods we do still have the properties that invariants and bisimulations are closed under union. For a given functor there exists a canonical notion of ‘observational equality’ between elements of different coalgebras for that functor. An important consequence of the fact that for arbitrary dialgebras bisimulations are not closed under union, is that for a dialgebraic signature such a notion may not exist, as discussed in Section 3.3. However, in the dialgebraic setting we do still have a canonical notion of equality between elements of a single dialgebra, thanks to Theorem 3.20.

Future Work Given the duality between algebras and coalgebras it is surprising that, whereas we did come across properties of coalgebras that do not hold for arbitrary dialgebras (namely closure properties for union, Lemmas 3.10 and 3.16), we did not come across (dual) properties of algebras that do not hold for arbitrary dialgebras. Carefully dualising Lemmas 3.10 and 3.16 might reveal such properties. Another direction for future work is to further investigate which notions and results from the fields of algebra and coalgebra – notably the well-developed field of algebraic specifications – could be generalised to dialgebras. Finally, the notion of dialgebra we have introduced is fairly ad-hoc and very syntactic. The motivation behind the definition of dialgebra was that it is the natural ‘unification’ of the definitions of algebra and coalgebra. It would be interesting to investigate more semantical characterisations of some 302 Poll and Zwanenburg notion of dialgebra, and to investigate in how far the restriction to polynomial functors could be relaxed, e.g. allowing the extended polynomial functors of [22].

Acknowledgments We would like to thank Bart Jacobs and Hendrik Tews for comments on earlier versions of this paper.

7 Appendix: Proof of Theorem 3.20

Lemma 7.1 Basic properties of the operations + and × on relations are: (i) (R + S); (R + S)=(R; R)+(S; S) (ii) (R × S); (R × S)=(R; R) × (S; S) (iii) R+ + S+ =(R + S)+ (iv) R+ × S+ =(R × S)+ if R and S are partially reflexive. Here R+ denotes the transitive closure of R,andR is partially reflexive iff ∀(x, y) ∈ R. (x, x) ∈ R ∧ (y, y) ∈ R.

Proof. Easy. ✷

Lemma 7.2 Let F be a polynomial signature. Then F (R+)=F (R)+ if R is partially reﬂexive.

Proof. Induction on the structure of F , using Lemmas 7.1(iii) and 7.1(iv).✷

Lemma 7.3 Let i and j range over I and J, respectively. (i) R + S = R + S if I and J are not empty. i i j j i,j i j × × (ii) i Ri j Sj = i,j Ri Sj

Lemma 7.4 Let F be a polynomial signature. Let Rj be a equivalence relation + ∈ ∅ ⊆ on A, for all j J, J = .ThenF j Rj j F (Rj) .

Proof. Induction on the structure of F . • F (X)=C or F (X)=X: trivial. 303 Poll and Zwanenburg

• F (X)=F1(X)+F2(X):

F1(Ri)+F2(Rj)

= F1(Ri; IdA)+F2(IdA; Rj)

=(F1(Ri); F1(IdA)) + (F2(IdA); F2(Rj)) by Lemma 2.7

=(F1(Ri)+F2(IdA)) ; (F1(IdA)+F2(Rj)) by Lemma 7.1(i)

⊆ (F1(Ri)+F2(Ri)) ; (F1(Rj)+F2(Rj))

since IdA ⊆ Ri,soFk(IdA) ⊆ Fk(Ri)

= F (Ri); F (Rj) so F R = F R + F R j j 1 j j 2 j j + + ⊆ F (R ) + F (R ) by IH j 1 j j 2 j + = F (R )+ F (R ) by Lemma 7.1(iii) j 1 j j 2 j + = F (R )+F (R ) by Lemma 7.3(i) i,j 1 i 2 j + ⊆ F (R ); F (R ) by the result above i,j i j + = ( F (R )) ; ( F (R )) by Lemma 7.1(i) j j j j + ⊆ ( j F (Rj)) by deﬁnition of +.

• F (X)=F1(X) × F2(X): Analogous. ✷

Lemma 7.5 (Closure of congruences under union) ∈ ∅ Let Rj be congruences on the Σ-dialgebra (A, f), for all j J, J = .Then + j Rj is a congruence on (A, f).

Proof. We just do the proof for binary union. Let R and S be congruences on (A, f), i.e. R and S are equivalence relations, (f,f) ∈ Σ(R), and (f,f) ∈ Σ(S). To prove: (f,f) ∈ Σ((R ∪ S)+), i.e. , - , - + + + (fi,fi) ∈ Σi((R ∪ S) )=INi (R ∪ S) → OUTi (R ∪ S) for all i. + + Let (x, x ) ∈ INi ((R ∪ S) ). To prove (fi(x),fi(x )) ∈ OUTi ((R ∪ S) ).

+ + INi ((R ∪ S) )=(INi(R ∪ S)) by Lemma 7.2 , - + + ⊆ (INi(R) ∪ INi(S)) by Lemma 7.4 + =(INi(R) ∪ INi(S)) by deﬁnition of +. 304 Poll and Zwanenburg

+ So (x, x ) ∈ (INi(R) ∪ INi(S)) , ie. there exist x1 ...,xn such that (x, x1), (x1,x2), ..., (xn,x) ∈ INi(R) ∪ INi(S) . Since R and S are bisimulations it then follows that (fi(x),fi(x1)), (fi(x1),fi(x2)), ..., (fi(xn),fi(x ))

∈ OUTi(R) ∪ OUTi(S)

⊆ OUTi(R ∪ S) by Lemma 2.7 and hence + (fi(x),fi(x )) ∈ (OUTi(R ∪ S)) + = OUTi ((R ∪ S) ) by Lemma 7.2 ✷

References

[1] Michel Bidoit, Rolf Hennicker, and Martin Wirsing. Behavioral and abstractor speciﬁcations. Science of Computer Programming, 25:149–186, 1995.

[2] Kim B. Bruce, Luca Cardelli, Giuseppe Castagna, the Hopkins Objects Group (Jonathan Eifrig, Scott Smith, Valery Trifonov), Gary T. Leavens, and Benjamin Pierce. On Binary Methods. Theory and Practice of Object Systems, 1(3):221–242, 1996.

[3] Corina Cˆırstea. An algebra-coalgebra framework for system speciﬁcation. In H. Reichel, editor, Coalgebraic Methods in Computer Science (CMCS’2000), number 19 in ENTCS, pages 81–112. Elsevier, Amsterdam, 2000.

[4] Tatsuya Hagino. A categorical programming language. PhD thesis, University of Edinburgh, 1987.

[5] Jo Hannay. Speciﬁcation Reﬁnement with System F. In J. Flum and M. Rodriguez-Artalejo, editors, Computer Science Logic (CSL’99), volume 1683 of LNCS, pages 530–545. Springer-Verlag, September 1999.

[6] Jo Hannay. A Higher Order Simulation Relation for System F. In FOSSACS’2000, volume 1784 of LNCS, pages 130–145. Springer-Verlag, 2000.

[7] R. Hennicker and A. Kurz. (ω, ξ)-logic:On the algebraic extension of coalgebraic speciﬁcations. In B. Jacobs and J. Rutten, editors, Coalgebraic Methods in Computer Science, number 19 in ENTCS, pages 195–212. Elsevier, Amsterdam, 1999.

[8] U. Hensel, M. Huisman, B. Jacobs, and H. Tews. Reasoning about classes in object-oriented languages:Logical models and tools. In Ch. Hankin, editor, European Symposium on Programming (ESOP), number 1381 in Lecture Notes in Computer Science, pages 105–121. Springer, 1998. 305 Poll and Zwanenburg

[9] Martin Hofmann and Donald Sannella. On behavioural abstraction and behavioural satisfaction in higher-order logic. Theoretical Computer Science, 167:3–45, 1996.

[10] Bart Jacobs. Invariants, bisimulations and the correctness of coalgebraic reﬁnements. In M. Johnson, editor, Algebraic Methodology and Software Technology (AMAST’97), LNCS, pages 276–291. Springer, 1997.

[11] Bart Jacobs and Jan Rutten. A tutorial on (co)algebras and (co)induction. EATCS Bulletin, 62:222–259, June 1997.

[12] John C. Mitchell. Foundations of Programming Languages. MIT Press, 1996.

[13] John C. Mitchell and Gordon D. Plotkin. Abstract types have existential type. ACM Trans. on Prog. Lang. and Syst., 10(3):470–502, 1988.

[14] Gordon Plotkin and Martin Abadi. A logic for parametric polymorphism. In Typed Lambda Calculi and Applications, volume 664 of Lecture Notes in Computer Science, pages 361–375, 1993.

[15] E. Poll. A coalgebraic semantics of subtyping. In H. Reichel, editor, Coalgebraic Methods in Computer Science (CMCS’2000), number 33 in ENTCS, pages 281– 299. Elsevier, 2000.

[16] E. Poll and J. Zwanenburg. A logic for abstract data types as existential types. In J.-Y. Girard, editor, Typed Lambda Calculi and Applications (TLCA’99), volume 1581 of LNCS, pages 310–324. Spinger, 1999.

[17] Horst Reichel. An approach to object semantics based on terminal co-algebras. Mathematical Structures in Computer Science, 5:129–152, 1995.

[18] Horst Reichel. Nested sketches. Technical Report ECS-LFCS-98-401, Edinburgh University, Laboratory for Foundations of Computer Science, 1998.

[19] Horst Reichel. Specification semantics. In E. Astesiano, H.-J. Kreowski, and B. Krieg-Brückner, editors, Algebraic Foundations of System Specifications, pages 131–158. Springer, 1998.

[20] J. Rothe, B. Jacobs, and H. Tews. The coalgebraic class speciﬁcation language CCSL. Journal of Universal Computer Science, 2001. To appear.

[21] J. Rutten. Universal coalgebra:a theory of systems. Theoretical Computer Science, 249:3–80, 2000.

[22] H. Tews. Coalgebras for binary methods. In H. Reichel, editor, Coalgebraic Methods in Computer Science (CMCS’2000), number 33 in ENTCS. Elsevier, Amsterdam, 2000.

[23] Jan Zwanenburg. Object-Oriented Concepts and Proof Rules: Formalization in Type Theory and Implementation in Yarrow. PhD thesis, Eindhoven University of Technology, 1999.

306