<<

Kleene Algebra and Kleene Algebra with Tests: An Introduction

Warsaw University, December 2015 Outline

Today: A Little History. Models and Axiomatizations. Still Today (if time): A digression on bisimulation. Tomorrow: Expressiveness, Completeness, Complexity. Saturday: The Coalgebraic Theory. Automata and Program Schematology.

Slide credits: Dexter Kozen Today – Models and Axiomatizations

Axiomatizations Salomaa’s axiomatization KA and KAT Conway’s R-algebras *-continuity closed (ω-complete semirings) complete idempotent semirings (S-algebras, quantales) ideal completion Models relational models language models trace models matrices over a KA or KAT Tomorrow – Completeness and Complexity

KAT and Hoare logic completeness for the equational theory completeness for the Hoare theory (reasoning under assumptions) completeness and incompleteness results for PHL complexity (PSPACE completeness) typed KA and KAT and relation to type theory Saturday – The Coalgebraic Theory

Kleene coalgebra (KC) and Kleene coalgebra with tests (KCT) relation to automata theory and program schematology the Brzozowski derivative minimization as finality automatic extraction of equivalence proofs and relations to proof-carrying code Kleene Algebra (KA) Kleene Algebra (KA)

Kleene algebra is an algebraic system that captures axiomatically the properties of a natural class of structures arising in logic and computer science. Named for Stephen Cole Kleene, who among his many other achievements, invented finite automata and regular expressions. Kleene algebra is the algebraic theory of these objects. It has many natural and useful interpretations. Stephen Cole Kleene (1909–1994) Kleene’s Theorem (1956)

0 1 0 (0 + 1(01∗0)∗1)∗ {multiples of 3 in binary} 1 0 1 a (ab)∗a = a(ba)∗ {a, aba, ababa,...} b

(a + b)∗ = a∗(ba∗)∗ a + b {all strings over {a, b}} Foundations of the Algebraic Theory

J. H. Conway. Regular Algebra and Finite Machines. Chapman and Hall, London, 1971 (out of print).

John Horton Conway (1937–) Kleene Algebra

Kleene algebras arise in various guises in many contexts: relational algebra, semantics and logics of programs, program analysis and compiler optimization, automata and theory, design and analysis of algorithms. Many authors have contributed to the development of Kleene algebra over the years: Anderaa, Archangelsky, Backhouse, Bloom, Boffa, Conway, Desharnais, Esik, Kleene, Krob, Kuich, Meyer, Möller, Pratt, Redko, Sakarovich, Salomaa, Stockmeyer, Struth to name a few. There are various competing axiomatizations, and one topic of our study will be to understand the relationships between these definitions. PDL (Fischer & Ladner 1979)

In program logic, KA formed an essential component of Propositional Dynamic Logic (PDL) (Fischer & Ladner 1979) along with Boolean algebra and modal logic. PDL is a theoretically appealing and practical system for reasoning about computation at the propositional level.

From a practical point of view, many arguments do not require the full power of PDL, but can be carried out in a purely equational subsystem using Kleene algebra. But the Boolean component is essential, as it is needed to model conventional programming constructs such as conditionals and while loops that rely on Boolean tests.

PDL subsumes proposition Hoare logic and is semantically well-grounded and deductively complete, but is complex to decide. We will define later a variant of Kleene algebra, called Kleene algebra with tests (KAT), for reasoning equationally with Kleene and Boolean constructs. Practical Applications in Program Verification

lazy caching and concurrency control (Cohen 1994) verifying low-level compiler optimizations (many authors) data restructuring operations in parallelizing compilers (Pingali 2001), pointer analysis (Möller 1997, 2000) and other kinds of static analysis Kleene Algebra

A Kleene algebra is an

(K, +, ·, ∗, 0, 1)

consisting of a K with distinguished operations and constants

operation intuition arity + addition, choice, join 2 · multiplication, sequential composition, meet 2 ∗ asterate, iteration 1 0 additive identity, fail, false 0 1 multiplicative identity, skip, true 0

satisfying certain axioms. The intuitive meaning of the operations depends on the model.

A term over this language is called a . The set of regular expressions over an alphabet Σ is denoted RExpΣ. Models of KA Language-Theoretic Models

Let Σ∗ denote the set of finite-length strings over a finite alphabet Σ, including the null string ε. For A, B ⊆ Σ∗:

A + B def= A ∪ B A · B def= {xy | x ∈ A, y ∈ B} def 0 = ∅ 1 def= {ε}.

Thus the operation ·, applied to two sets of strings A and B, produces the set of all strings obtained by concatenating a string from A with a string from B, in that order. The operator symbol · is often omitted, and we just write AB for A · B. Properties of +, ·, 0, 1

These operations have several agreeable properties: Associativity of + and ·, commutativity of + of +: A + A = A Left distributivity: A(B + C) = AB + AC Right distributivity: (A + B)C = AC + BC Additive identity: 0 + A = A + 0 = A Multiplicative identity: 1A = A1 = A Annihilation: 0A = A0 = 0. These are the laws of idempotent semirings. Asterate

Define the powers of A with respect to · inductively:

A0 def= {ε} An+1 def= A · An.

Then

n A = {x1 ··· xn | xi ∈ A, 1 ≤ i ≤ n}.

The unary operation ∗ on sets of strings is defined as follows:

∗ def [ n A = A = {x1 ··· xn | n ≥ 0, xi ∈ A, 1 ≤ i ≤ n}. n≥0

By convention, the of the of strings is ε; this is the case n = 0. Thus ε is always a member of A∗ for any A, including ∅. The operation ∗ is known as asterate. Language-Theoretic Models

∗ Any of 2Σ containing ∅ and {ε} and closed under the operations of ∪, ·, and ∗ is a Kleene algebra (but there are others!).

The algebra of regular sets over Σ, denoted RegΣ, is the smallest ∗ subalgebra of 2Σ containing all sets {a} for a ∈ Σ.

The standard interpretation is the unique homomorphism R : RExpΣ → RegΣ such that R(a) = {a}. Examples:

R(a∗b∗) = {anbm | n, m ≥ 0} R(a(ba)∗) = {a, aba, ababa, abababa,...} R((a + b)∗) = {all strings of a’s and b’s} Specification of Regular Sets

Regular sets can be specified by Regular expressions Finite automata Systems of linear inequalities (regular grammars) The equivalence of the first two representations was proved by Kleene (1956) and is known in this context as Kleene’s theorem. The equivalence of the third was argued by Chomsky (1956). Proofs can be found in any introductory text in automata and computability. Relational Models

Another useful interpretation involves binary relations on a set X .A on X is just a set of ordered pairs of elements of X . Thus a binary relation on X is a subset of X × X . The set of all binary relations on a set X forms a Kleene algebra, where + is interpreted as and · is interpreted as relational composition:

R + S def= R ∪ S R ◦ S def= {(x, z) | ∃y ∈ X (x, y) ∈ R and (y, z) ∈ S}. y RS>Z  Z  s Z x  Z~- z R ◦ S s s Relational Models

Here 0 is the empty relation ∅ and 1 is the identity relation:

def def 0 = ∅ 1 = {(x, x) | x ∈ X }. These are identities for ∪ and ◦, respectively.

Under +, ·, 0, and 1, the binary relations form an idempotent . ∗ as Reflexive Transitive

Recall that a relation R is reflexive if (x, x) ∈ R for all x ∈ X ; that is, if R includes the identity relation as a subset; transitive if (x, z) ∈ R whenever both (x, y) ∈ R and (y, z) ∈ R; in other words, R is transitive if R ◦ R ⊆ R. The smallest reflexive and transitive relation containing R is called the reflexive transitive closure of R and is denoted R∗. This coincides with the sum of all finite powers of R.

def [ R∗ = Rn, n≥0

where

R0 def= {(x, x) | x ∈ X } Rn+1 def= R ◦ Rn. ∗ as Reflexive Transitive Closure

Equivalently, there is an R∗ edge from x to z iff there is an R-path of length 0 or greater from x to z.

R - RRR R  HH ¨¨*HH s s Hj¨ s Hj R HH x s s Hj- z R∗ s s Relational Models

A relational Kleene algebra is any subset of 2X ×X closed under these operations. These models are useful in programming language semantics, because they can be used to represent the input/output relations of programs. Trace Models

A labeled transition system (LTS) is a set X of states along with a mapping π :Σ → 2X ×X , where Σ is a set of atomic actions.

A trace is an alternating sequence of states and atomic actions

s0 p0 s1 p1 ··· sn−1 pn−1 sn,

beginning and ending with a state, such that (si , si+1) ∈ π(pi ), 0 ≤ i ≤ n − 1.

p s1 1- s2 p2 ··· pn−2  HH ¨¨*HH p0 Hj¨ Hj s s s H pn−1 sn−1 H s0 s s Hj sn s s Trace Models

Two traces σ, τ can be fused to get στ if the last state of σ is the same as the first state of τ (last σ = first τ; the extra copy of the state is suppressed). This is called fusion product. If last σ 6= first τ, then στ does not exist.

Now build a Kleene algebra from sets of traces. For A, B ∈ 2{Traces}, define

A + B def= A ∪ B A · B def= {στ | σ ∈ A, τ ∈ B, στ exists} def 0 = ∅ 1 def= {s | s ∈ X } = {traces of length 0} def [ A∗ = An. n≥0 Axioms of KA

Idempotent Semiring Axioms

p + (q + r) = (p + q) + r p(qr) = (pq)r p + q = q + p 1p = p1 = p p + 0 = p p0 = 0p = 0 p + p = p p(q + r) = pq + pra ≤ b ⇐⇒def a + b = b (p + q)r = pr + qr

Axioms for ∗

1 + pp∗ ≤ p∗ q + px ≤ x ⇒ p∗q ≤ x 1 + p∗p ≤ p∗ q + xp ≤ x ⇒ qp∗ ≤ x Axioms of KA

Some basic facts about ≤: ≤ is a partial order (reflexive, antisymmetric, transitive – depends heavily on idempotence) 0 is the least element All operations are monotone with respect to ≤ (that is, if p ≤ q, then p + r ≤ q + r, pr ≤ qr, rp ≤ rq, and p∗ ≤ q∗ Significance of the ∗ Axioms

Axioms for ∗

1 + pp∗ ≤ p∗ q + px ≤ x ⇒ p∗q ≤ x

Axioms for ∗

q + pp∗q ≤ p∗qq + px ≤ x ⇒ p∗q ≤ x

p∗q is the least solution to q + px ≤ x Systems of Linear Inequalities

Theorem Any system of n linear inequalities in n unknowns has a unique least solution

q1 + p11x1 + p12x2 + ··· p1nxn ≤ x1 . .

qn + pn1x1 + pn2x2 + ··· pnnxn ≤ xn Matrices over a KA form a KA

 a b   e f   a + e b + f  + = c d g h c + g d + h

 a b   e f   ae + bg af + bh  · = c d g h ce + dg cf + dh

 0 0   1 0  0 = 1 = 0 0 0 1

∗  a b   (a + bd ∗c)∗ (a + bd ∗c)∗bd ∗  = c d (d + ca∗b)∗ca∗ (d + ca∗b)∗

b a d c Systems of Linear Inequalities

Theorem Any system of n linear inequalities in n unknowns has a unique least solution

q1 + p11x1 + p12x2 + ··· p1nxn ≤ x1 . .

qn + pn1x1 + pn2x2 + ··· pnnxn ≤ xn

q1 x1 x1 q2 x2 x2 . + P = p . ≤ . . ij . .

qn xn xn

Least solution is P∗q Matrices over a KA

Representation of finite automata Construction of regular expressions Solution of linear inequalities over a KA Connectivity and shortest path algorithms The min,+ Algebra (aka the Tropical Semiring)

The domain is R+ ∪ {∞}, where ∞ is an infinite element, with the usual definition of + and ≤ on R+ extended naturally to include ∞: x + ∞ = ∞ + x = ∞ + ∞ = ∞.

The Kleene operations are

K R + min · + 0 ∞ 1 0 ≤ ≥

and x ∗ = 1 (= the real number 0) for any x. v 3.7>Z 2.2  Z  s Z u  Z~- w 6.3 s s u v w u v w u 0 3.7 6.3 u 0 3.7 5.9 R = R∗ = v ∞ 0 2.2 v ∞ 0 2.2 w ∞ ∞ 0 w ∞ ∞ 0 Completeness

We will show next time that the KA axioms exactly characterize the equational theory of the standard interpretation, all language models, all relational models, all trace models. That is, any equation between regular expressions is true in all models of any of these classes if and only if it is derivable from the axioms by equational reasoning. Alternative Axiomatizations Salomaa’s Axiomatization (1966)

Salomaa (1966) was the first to axiomatize the equational theory of the regular sets and prove completeness. He presented two axiomatizations F1 and F2 for the algebra of regular sets and proved their completeness. Aanderaa (1965) independently presented a system similar to Salomaa’s F1. Backhouse (1975) gave an algebraic version of F1. Here is a brief account of these Arto Salomaa axiomatizations. (1934–) Salomaa’s Axiomatization (1966)

Salomaa’s systems are equational except for one rule of inference in each case that is sound under the standard interpretation R : RExpΣ → RegΣ, but not sound in general under other interpretations.

Recall the standard interpretation is the unique homomorphism R : RExpΣ → RegΣ mapping a to {a} for a ∈ Σ. Salomaa’s Axiomatization (1966)

Salomaa defined a regular expression s to have the empty word property (EWP) if ε ∈ R(s). He observed that the EWP can be characterized syntactically: a regular expression s has the EWP if either s = 1; s = t∗ for some t; s is a sum of two regular expressions, at least one of which has the EWP; or s is a product of two regular expressions, both of which have the EWP.

A simpler way to say this is that s has the EWP iff fε(s) = 1, where fε denotes the unique homomorphism fε : RExpΣ → {0, 1} such that fε(a) = 0, a ∈ Σ. Salomaa’s Axiomatization (1966)

Salomaa’s system F1 contains the rule u + st = t, s does not have the EWP . s∗u = t

This rule is sound under the standard interpretation R, but the proviso “s does not have the EWP” is not preserved under substititution, thus the rule is not valid under nonstandard interpretations. For example, if s, t, and u are the single letters a, b and c respectively, then the rule holds; but it does not hold after the substitution

a 7→ 1 b 7→ 1 c 7→ 0.

Another way to say this is that the rule must not be interpreted as a universal Horn formula. Salomaa’s system F2 is similar. In contrast, the axioms for Kleene algebra are all equations or equational implications in which the symbols are regarded as universally quantified, so substitution is allowed. Other Axiomatizations (in order of generality)

Complete semirings (S-algebras, quantales) (Conway 1971) arbitrary suprema & distributivity Closed semirings (ω-complete semirings) (Aho, Hopcroft, Ullman 1974) countable suprema & distributivity ∗-continuous KA

∗ n pq r = supn≥0 pq r

Same equational theory as KA ∗-Continuity

A KA is star-continuous if it satisfies

xy ∗z = sup xy nz, n≥0

where y 0 = 1, y n+1 = yy n, and sup refers to the supremum or least upper bound with respect to the natural order ≤. This property says that any possibly infinite set of the form {xy nz | n ≥ 0} has a least upper bound, and that least upper bound is xy ∗z. Every star-continuous idempotent semiring is a Kleene algebra (but not vice-versa). All naturally occurring Kleene algebras are ∗-continuous. ∗-Continuity

The ∗-continuity condition is actually an infinitary Horn formula (equational implication). It is equivalent to infinitely many inequalities

xy nz ≤ xy ∗z, n ≥ 0,

which together say that xy ∗z is an upper bound for all xy nz, n ≥ 0, along with the infinitary Horn formula ^ ( xy nz ≤ w) ⇒ xy ∗z ≤ w, n≥0

which says that it is the least such upper bound. ∗ n Another way to view it is as a combination of the statement y = supn y along with two infinitary distributivity properties that say that multiplication is continuous with respect to ∗-suprema. Closed Semirings Tarjan 1981, AHU 1975, Mehlhorn 1984

In the design and analysis of algorithms, a related family of structures called closed semirings form an important algebraic abstraction. They give a unified framework for deriving efficient algorithms for transitive closure and all-pairs shortest paths in graphs and constructing regular expressions from finite automata. Very fast algorithms for all these problems can be derived as special cases of a single general algorithm over a closed semiring. Closed Semirings Tarjan 1981, AHU 1975, Mehlhorn 1984

According to (Aho, Hopcroft, Ullman 1975), a closed semiring is an idempotent semiring equipped with a summation operator P defined on countable sequences (not sets) that satisfies infinitary associativity and distributivity. Infinitary idempotence and commutativity are not assumed. Also, the relation between the between finitary + and infinitary P is not explicitly mentioned, but can be inferred from the use of the notation x0 + x1 + x2 + ··· for the infinitary sum. The element x ∗ is defined to be 1 + x + x 2 + ··· . (In fact, the sole purpose of P seems to be to define ∗.) Closed Semirings Tarjan 1981, AHU 1975, Mehlhorn 1984

Infinitary associativity is defined as follows. If (xn | n ≥ 0) is any countable sequence of elements, then for any way of partitioning the P index set N into intervals, the sum i xi is the same as the sum of the sums of the intervals. If an interval is finite, then its sum is computed with +. If an interval is infinite, then its sum is computed with P. Note that any such partition must consist either of infinitely many finite intervals, or finitely many intervals, all of which are finite except the last, which is infinite. Infinitary distributivity says that X X x · ( yi ) · z = xyi z. i Closed Semirings Tarjan 1981, AHU 1975, Mehlhorn 1984

It is conjectured that AHU’s axiomatization does not imply infinitary commutativity. In particular, it is conjectured that

x0 + x1 + x2 + ··· = (x0 + x2 + x4 + ··· ) + (x1 + x3 + x5 + ··· )

is not provable. The axiomatization in (Mehlhorn 1984) postulates infinitary commutativity. Infinitary commutativity says that for any partition of the index set (not necessarily into intervals), the sum of the sums of the partition elements is the same as the sum of the original sequence. P Infinitary idempotence says that if all xi = x, then i xi = x. This does not follow from the axiomatizations of (AHU 1975, Mehlhorn 1984), nor does the equation x ∗∗ = x ∗. It can be shown that 0∗ = 1, but not that 1∗ = 1. ω-Complete Idempotent Semirings

Like closed semirings, ω-complete idempotent semirings are are defined in terms of a countable summation operator P, but the summation operator is applied to sets instead of sequences and is interpreted as a supremum operator. Every countable set A is assumed to have a supremum P A with respect to the natural order ≤. Also, for any countable set A, X X x · ( A) · z = xyz, y∈A

an infinite distributivity property. This definition is equivalent to closed semirings satisfying infinitary associativity, commutativity, idempotence, and distributivity. ω-Complete Idempotent Semirings

Defining ∗ by

def X x ∗ = x n, n≥0

one obtains a ∗-continuous Kleene algebra.

The KA of regular sets RegΣ is not ω-complete: if A is nonregular, the ∗ countable set {{x} | x ∈ A} has no supremum. However, 2Σ is ω-complete. Similarly, the family of all binary relations on a set is an ω-complete idempotent semiring under the usual relational operations. Complete Idempotent Semirings (Conway 1971)

Complete idempotent semirings (called S-algebras by Conway and quantales by others) are like ω-complete idempotent semirings, except that arbitrary sums, not just countable ones, are permitted. The operation ∗ is defined the same way, giving ∗-continuous Kleene algebras. According to Conway’s definition, P is defined not on sequences or sets, but on multisets. There is no cardinality restriction on the multiset. Technically, P is too big to be represented in Zermelo-Fraenkel set theory! However, idempotence says that the value that P takes on a given multiset is independent of the multiplicity of the elements, so P might as well be defined on . So defined, P A gives the supremum of A with respect to the order ≤. We should also assume the axiom P{a} = a, which Conway omitted. This was a minor oversight, as he clearly intended it. Relations Among Classes of Algebras

We have

S-algebras ⊆ closed semirings ⊆ ∗-continuous KAs ⊆ KAs,

all inclusions strict. Moreover, each ∗-continuous KA extends in a canonical way to a closed semiring, and each closed semiring to an S-algebra, by a construction known as ideal completion. Ideal Completion

Definition (Conway 1971) Let K be a ∗-continuous KA. A ∗-ideal is a subset I ⊆ K such that 0 ∈ I if x, y ∈ I , then x + y ∈ I if y ∈ I and x ≤ y then x ∈ I if xy nz ∈ I for all n ≥ 0, then xy ∗z ∈ I . Ideal Completion

Write for the ∗-ideal generated by a set A. Let K be a star-continuous Kleene algebra. Build a complete idempotent semiring Kb from the set of ∗-ideals of K. For any set A of ∗-ideals, define X [ A = < A>.

For any pair of ∗-ideals I , J, define

I · J = <{ab | a ∈ I , b ∈ J}>.

Define 1 = <{1}> and 0 = <{0}> = {0}. Theorem (Conway 1971) Kb is an S-algebra, and K embeds homomorphically in Kb via x 7→ <{x}>. Ideal Completion

This is a universal construction from ∗-continuous KAs to S-algebras. It is a left adjoint to the forgetful functor S-algebras → ∗-continuous KAs.

hb Kb S

complete forget

h K S 0 Ideal Completion

The construction can be factored through closed semirings. To go from ∗-continuous KAs to closed semirings, complete by countably generated ∗-ideals. To go from closed semirings to S-algebras, complete by ω-ideals. An ω-ideal is a subset A ⊆ C such that 0 ∈ A if B ⊆ A, B is countable, and sup B exists, then sup B ∈ A if y ∈ A and x ≤ y, then x ∈ A. Preview for Next Time

KAT and Hoare logic completeness for the equational theory completeness for the Hoare theory (reasoning under assumptions) completeness and incompleteness results for PHL complexity (PSPACE completeness) typed KA and KAT and relation to type theory Exercises

1 Prove the following theorems of Kleene algebra. Reason equationally using the KA axioms.

∗ ∗ ∗ 1 x = x x

∗ ∗∗ 2 x = x

∗ ∗ ∗ ∗ 3 (x + y) = x (yx )

∗ ∗ 4 (xy) x = x(yx)

∗ n−1 n ∗ 5 for all n ∈ N, x = (1 + x) (x ) ∗ ∗ 6 xy = yz ⇒ x y = yz .

2 A KA K is commutative if xy = yx for all x, y ∈ K. Prove that the following hold in all commutative KAs.

∗ ∗ ∗ ∗ ∗ 1 x y = (xy) (x + y )

2 Every regular expression over a finite alphabet Σ is equivalent to a ∗ ∗ ∗ sum of expressions of the form xy1 ... yn , where x, y1,..., yn ∈ Σ

3 Let RegΣ denote the Kleene algebra of regular sets over alphabet Σ. Prove that RegΣ is isomorphic to a relation algebra.