The Derivative of a Regular Type is its Type of One-Hole Contexts

Extended Abstract

Conor McBride

Abstract subtraction, an operation inherently troubled by the need to subtract only smaller things from larger, but as the in- version of a kind of addition (see figure 1). Polymorphic regular types are tree-like datatypes gen- erated by polynomial type expressions over a set of free variables and closed under least fixed point. The ‘equal-

ity types’ of Core ML can be expressed in this form. Given ¡ such a type expression with ¢ free, this paper shows a = + way to represent the one-hole contexts for elements of ¢ within elements of ¡ , together with an operation which

will plug an element of ¢ into the hole of such a context.

One-hole contexts are given as inhabitants of a regular ¡

type £¥¤ , computed generically from the syntactic struc- ture of ¡ by a mechanism better known as partial differ- Figure 1: a tree as a one-hole context ‘plus’ a subtree entiation. The relevant notion of containment is shown to be appropriately characterized in terms of derivatives and This paper exhibits a similar technique, defined more for- plugging in. The technology is then exploited to give the mally, which is generic for a large class of tree-like data one-hole contexts for sub-elements of recursive types in a structures—the ‘regular datatypes’. These are essentially manner similar to Huet’s ‘zippers’[Hue97]. the ‘equality types’ of core ML, presented as polynomial type expressions closed under least fixed point. In partic- ular, I will define and characterize two operations: 1 Introduction

§ on types, computing for each regular recursive Gerard´ Huet’s delightful paper ‘The Zipper’ [Hue97] de- datatype its type of one-hole contexts; fines a representation of tree-like data decomposed into a subtree of interest and its surroundings. He shows us in- § on terms, computing a ‘big’ term by plugging a formally how to equip a datatype with an associated type ‘’ term into a one-hole context. of ‘zippers’—one-hole contexts representing a tree with one subtree deleted. Zippers collect the subtrees forking The first of these operations is given by recursion on the off step by step from the path which starts at the hole and structure of the algebraic expressions which ‘name’ types. returns to the root. This type of contexts is thus indepen- As I wrote down the rules corresponding to the empty dent of the particular tree being decomposed or the sub- type, the unit type, sums and products, I was astonished tree in the hole. Decomposition is seen not as a kind of to find that Leibniz had beaten me to them by several cen- ¦ Department of Computer Science, University of Durham, turies: they were exactly the rules of differentiation I had [email protected] learned as a child.

1

1.1 Motivating Examples btree @¨ 2 btree

How can we describe the one-hole contexts of a recur- Now imagine similar journeys list ttree for ternary sive type? Huet suggests that we regard them as journeys trees: ‘backwards’ from hole to root, recording what we pass by

on the way. I propose to follow his suggestion, except that 3

ttree ¨ 1 ttree I will start at the root and work my way ‘forwards’ to the hole. Both choices have their uses: ‘backwards’ is better for ‘tree editing’ applications, but the ‘forwards’ approach Each step chooses one of three directions, and remembers is conceptually simpler. two bypassed trees, so

Consider such journeys within binary trees: 2

ttree ¨ 3 ttree © btree ¨ leaf node btree btree It looks remarkably like the type of steps is calculated by differentiating the original datatype. In fact, this is exactly At each point, we are either at the hole or we must step what is happening, as we shall see when we examine how further in—our journey is a list of steps, list btree for to compute the description of the one-hole contexts for some appropriate step type btree , but what? At each regular datatypes. step, we must record our binary choice of left or right, together with the btree we passed by. We may take

2 Presenting the Regular Types

btree ¨ bool btree

1 In order to give a precise treatment of ‘differentiating a We can define the ‘plugging in’ operation as follows datatype’, we must be precise about the expressions which

define them. In particular, the availability of fixed points

  

btree btree

  requires us to consider the binding of fresh type variables.

 btree

There is much activity in this area of research, but this is 

 not the place to rehearse its many issues. The approach

 

true  node 

 I take in this paper relativizes regular type expressions to

 ¥ false    node

finite sequences of available free names. I then explain

!"!#  

list btree btree how to interpret such expressions relative to an environ-

!"! 

$ btree ment which interprets those names.

% &

 

$ I choose to give inductive definitions in natural deduc-

 

(')'*!"! , !"!

 - + $ tion style. Although this requires more space than the datatype declarations of conventional programming lan- We might define btree more algebraically as the sum guages, they allow dependent families to be presented (i.e. ‘choice’) of the unique leaf with nodes pairing two much more clearly: even ‘nested’ types [BM98, BP99] subtrees—powers denote repeated product: must be defined uniformly over their indices, whilst the full notion of family allows constructors to apply only

2 at particular indices. I also give type signatures to de-

btree ¨ 1 btree fined functions in this way, preferring to present the uni- versal quantification inherent in their type dependency via Correspondingly, we might write schematic use of variables, rather than by complex and in-

1 >[email protected]/+B:C6 Huet would write .0/+12143537698;:=< scrutable formulae.

2

2.1 Sequences of Distinct Names family Reg D in figure 2. Firstly, we embed3 type vari- D ables D in Reg , then we give the building blocks for Let us presume the existence of an infinite2 set Name of polynomials and least fixed point.

names, equipped with a decidable equality. We may think D

of Name as being ‘string’. The set NmSeq may then be  NmSeq D defined to contain the finite sequences of distinct names. Reg  Set

Each such sequence can be viewed ‘forgetfully’ as a set. D



¢

D 

¢ Reg

D

¡ D

 

NmSeq 

D Reg

D O ¡ D

   Name 

Set Set 0 Reg Reg

O

¡ D



D D

  

 Reg

¢ ¢GF

D D ¡

NmSeq Name O

 

D=H

E

 

1 Reg Reg ¢

NmSeq NmSeq O

P D=H 

Reg ¢

P D

Q 

¢ . Reg

I shall always use names in a well-founded manner. The

P D=H D ¡ D

D

   ¢

‘explanation’ of some name ¢ from some sequence will Reg Reg Reg

P D ¡ D=H

O

 

© ¢ ¢

= Reg ¤ Reg involve only ‘prior’ names—intuitively, those which have O

already been explained. It is therefore important to define DGI

for any such pair, the restriction ¢ , being the sequence D Figure 2: descriptions of regular types

of names from prior to ¢ . D

D The last two constructors may seem mysterious. How-

 

NmSeq ¢

DGI ever, the redundancy they introduce will save us work 

¢ NmSeq when we come to interpret these descriptions of types.

E

I D

D=H We could choose to interpret only Reg , the closed de-

¢ ¢ 

KJML

I DGI

D=H scriptions, keeping the open types underwater like the

¢  ¢ ¢N dangerous bits of an iceberg. This would require us to substitute for free names whenever they become exposed, As equality on names is decidable, I shall freely allow for example, when expanding a fixed point. We would in names to occur nonlinearly in patterns. In order to re- turn be forced to take account of the equational properties

cover the disjointness of patterns without recourse to pri- of substitutions and prove closure properties with respect JML

oritizing them, I introduce the notation ¢ to mean ‘any to them. I dispense with substitution. J

except ¢ ’. Correspondingly, each clause of such a defi- nition holds (schematically) as an equation. I nonetheless The alternative I have adopted is to interpret open descrip-

tions in environments which explain their free names. The  write  to indicate a directed computation rule, reserving two unusual constructors above perform the rolesˆ of defi-

¨ for equational propositions. nition and weakening, respectively growing and shrinking the environment within their scope. Definition replaces 2.2 Describing the Regular Types substitution and weakening adds more free names without modifying ‘old’ descriptions. This avoids the propagation of substitutions through the syntactic structure and hence The regular types over given free names, or rather, the ex- concerns about capture—we shall only ever interpret the pressions which describe them are given by the inductive ‘value’ of a variable in its binding-time environment.

2By ‘infinite’, I mean that I can always choose a fresh name if I need to create a new binding. 3In fact I am abusing notation by suppressing a constructor symbol.

3

D ¡ D

 

2.3 Type Environments R Env Reg

% % & &

¡ 

R Set

D

% % & & I

A type environment for a given name sequence asso- 

 R ¢ RTSU¢

D

% % & & 

ciates each ¢ in with a type description over the prior

 R ¢

DGI

¢

% & & % % & &

names, . %

¡

 

R  R

% % & & % % & &

O

¡ ¡

 

 R

inl R inr

O O

D



% % & & % % & &

¡



NmSeq #

D



R  R

XZY % % & & X Y % % & &

O ¡

Env Set H

  

 R

R 1

O

& & % %

P P

E E



 Q

R © ¢ ¢

Env  = .

% % & &

P

 Q

D=H D D

  

R ¢

con  . R

¢ NmSeq Env Reg

H D=H

O



% % & & % % & &

H P ¡

 

¢ ¢

R = Env

O

R ¢  R

 =

% % & & % % & &

O

P H ¡

 

 R © ¢  R ¢

= = ¤

O O We may equip environments with operators for restriction (extracting the prefix prior to a given name) and projection

I Figure 3: the data in regular types ¢ (looking up a name’s associated type description). R is

the environment which explains the free names in RTSU¢ .

2.5 Examples of Regular Types

D D D D

   

¢ R ¢

R Env Env

I DGI DGI

 

¢ ¢ RTSV¢ ¢ R Env Reg Let us briefly examine some familiar types in this setting.

We have the unit type 1, so we can use to build the

H I

¢ ¢  R

R = booleans:

O

CJWL

H I I

¢ ¢  R ¢

R =

O



[]\^\_

D D



  

H 1 1 Reg any

X Y  % % & &

[]\+\@_

` a7bMc



¢ SV¢ 

R =

 R  Rd

O O KJML

H inl any

X Y  % % & &

eCfg0h

[]\+\@_

c



¢N SV¢  RTSV¢

R =

 R  Rd O inr any

We may use Q to build recursive datatypes like the natural 2.4 Interpreting the Descriptions D

numbers and our tree examples, again for any and R , as

a fresh bound variable ( ¢ below) can always be chosen: Now that we have the means to describe regular types,

we must say which data are contained by the type with

idjWk Q

 ¢ -¢

 . 1

 X Y

a given description, relative to an environment explaining cVa m

l

 

 con inl



h o

its free names—we must give a semantics to the syntax. bMnpo

% % & &

¡

  4  con inr

Figure 3 defines the interpretation R inductively.

Note that I have not supplied constructor symbols for the [

kVq*r¥r Q

s¢t u¢  ¢

 . 1

 XZY

g f*e

embedding rules corresponding to variable-lookup, def- v

c

 

 con inl

 X Y

inition and weakening. These apparent conflations of vWw

H

m¥xWc

% % & & % % & &

H

   

¥y con inr

¢ ¢ R

types, e.g. R = with , are harmless, as the

O O

types tell us which embeddings are operative.



kVkzq{r¥r Q

 ¢ s¢] ¢t |¢

4  . 1

 X Y fe

The ‘semantic brackets’ do not represent a meta-level operation; g

` c

 

they are intended to be a suggestive object-level syntax for a dependent  con inl

 X X Y Y

w

H H

` m¥xWc

  }   family of types. ¥}~ con inr

4 We may use weakening to define an operation which com- The effect of local type variables is thus reflected at their

putes list types: place of binding, rather than at their places of use.5

¢

_ C€ ¡

¡ The significance of the phrase ‘accounted for by the ’s

k Q

¡

 ¢ p¢

. 1 ¤

 XZY g

wW in ’ is shown by this simple example:

 

J

con inl g v

H H

m¥m

 X Y E

H

'‚'{ƒ„! ƒ!

¢ ¢

If we take R—¨ = =

¢  ¢ 

X Y

con inr eCf„g)h

H

` a7bWc ` a7bWc c

¤

Ž

ˆ ¤

we can derive “

X Y

eCf„g)h eCfg0h

H

c `7a bWc c

¤

ˆ Ž ¤

but not “

The finitely branching trees may thus be given by

eCfg0h

c ¢

The point is that has type ¢ on account of the in

J

_C €

kVq*r¥r Q k

this particular R ’s definition of , whereas, irrespective of

 ¢ ¢

 .

e

w

m¥xWc,† †

! !

R , we can always derive 

 con

X Y

H

¤

 

ˆ Ž 

¤ “

2.6 Subterm Orderings for Regular Types J

However, discharging the definition of in our example:

g

v

H

m¥m



E

‡‰ˆ(Š  ‡

R—¨ ¢

Let us now define an inductive relation, ¤ for Taking =

% % & & % % & &

¡

X Y 

eCfg0h

H

¤ ¤

`7a bWc c

` a7bWc =

 R R ¢

and , characterizing the subterms of a term Ž

¤

ˆ “’ “ ¡

¡ we can derive

X Y

eCf„g)h eCfg0h

H

¤ ¤

c `7a bWc c

in accounted for by ¢ ’s in . This relation characterizes =

Ž

¤

ˆ

¡ “ and “„’

’s roleˆ as a container of ¢ ’s. I omit the obvious ‘well-

formedness’ premises. Both ¢ ’s are accounted for by the indicated type, and the

derivations follow respectively by the two definition rules.

% % & &

 A similar phenomenon occurs if we try to specialize this

‡ R ¢

_C €

k

¤ ¢

ordering to the ¢ ’s contained by a , that is, to deter-

‡Tˆ ‡ ¤

mine when



‡Tˆ‹ ‡Tˆ(Š 

¤ ¤

¤ ”

. 1 †

!

 –˜Ž



“ “

¤

tˆ

‡Œˆ‹9^Š ‡Tˆ‹9^Š  ¤ ¤ inl inr

 Only the rule for fixed points applies, showing this derived

‡Œˆ‹ ‡Œˆ(Š 

¤ ¤

X Y X Y

H H 

 rule to be complete:

‡Tˆ‹ŽŠ  ‡Tˆ‹ŽŠ 

¤ ¤

¤ ¤ ”

= . ”

1 = . 1 †Z†

!

– –

  Ž Ž

¤

‡Œˆ(‘|’ “ “ ‘ 

“Uœ’ “ “ “

¤

™ˆ›š

† † ”

. ¤

!

‡Œˆ 

¤ ‘

“ con ˆ(ž Ÿ ¡5¢

¤ con

 

‡Tˆ‹ ˆ 

¤

‡Œˆ  ‡Œˆ(Š 

‘

¤ ¤ ‘ The premise can only be further simplified by the defini-

= = “

Š –

‹ ‹

¤ ¤

‡Tˆ ‡Œˆ  

‘]’ “ ‘|’ “

¤  ‡•ˆ tion rules: the direct rule finds only the head of the list in

¢ ; the indirect rule’s second premise finds only the tail in J

, and the first demands that we search for an ¢ within that Observe that subterms accounted for by an ¢ in a compos- tail. We thus derive (and show complete) the two rules we

ite type are characterized by a rule for each component of J P might have hoped for:

that type. In particular, the rules for a definition © = O

P 5It would be interesting to investigate a notion of subterm with a ¢

find subterms due to ¢ ’s in and ’s in respectively,

O

J

P

¤ :

the latter occurring within subterms due to ’s in . Note single rule for definitions capturing ‘ £ -subterms within -term , re- ¦

membering that ¥ really means ’: however, we would pay with extra 

that the latter rule exploits the fact that we may see as complexity in the rule for variables, and with the need to carry extra

% % & & J % % J9& &

H R

inhabiting both R and = . environmental information explicitly.

O O

5

¡

† ¤

! 6

ž§Ÿ2¡5¢

tˆ sion over the syntax of . It is defined in figure 4.

¤

 

† †

¤ ¤

')' ! '‚' !

™ˆ(ž Ÿ2¡5¢   ™ˆ(ž Ÿ2¡5¢ 

¤ ¤

D D ¡

 

¢ Reg

¡ D

 ¤

We may exploit this ‘containment’ relation to define the £ Reg

£¥¤¢  ¤ ˆ 1

usual subterm relation for recursive types, namely ” .

CJML

% % & &

P

‘

Q

£¥¤ ¢­ ¢

for R . , as follows: 0



£¥¤ 0 0



¡ ¡

¤ ¤ ¤

£ ® £ ¯£

 

O O

¤

ˆ  ‡Tˆ

¤ ”

. ‘

¤

£ 

‘ 1 0



‡Tˆ ¤ ‡Tˆ ¤  ‡

¡ ¡ ¡ ”

” . . con

‘ ‘

¤ ¤ ¤

£ ® £ °£

O O O

 J J J

P P P ²

Q Q^± Q

¤ ¤

£ ­ £ ©

% % J & &

P . . = .

 Q

J J

P P

²

 R Q

That is, to find a subterm of con . , either stop ±

© £

= .

J J  J

“ P P P

where you are, or move down one level to an ¢ contained

P

¤ ¤ ¤

­ £ © ¯£ © £ ©

£ = = =

O O O O

¡

ˆ

¤ “

by and repeat. ‘ cannot go further than one level,

P

¤

£ 

¤ 0

¡ ¡

because ¢ is free in !

¤ ¤

£  £

¤

[

kzq{r¥r “z³ With a little work, we can specialize these rules for “ to

Figure 4: partial differentiation 

™ˆ,¨ The first six lines are familiar from calculus. For one-hole ¢ª©¬«"«

contexts, they tell us that

tˆ  tˆ 

¨ ¨

 

vWw vWw

m¥xWc mxc

¢5©« « ¢5©« «

tˆ,¨ 9* ™ˆ,¨ 9

§

¢5©« « ¢ª©¬«"« ¢

an ¢ contains one in trivial surroundings

J

§ ¢ a other than ¢ contains no

3 One-Hole Contexts §

constants contain no ¢

¡ ¡

§

Let us now turn to the generic representation of one-hole we find an ¢ in an in either an or a

O O

contexts for recursive regular types. We shall first need to ¡

§

we find an ¢ in an either (left) in the , passing

O O ¡ see what ‘one step’ is. ¡

the , or (right) in the passing the O

The immediate recursive subterms in some con  of

% % & &

P

Q

¢  type R . are those subterms of which correspond

P Note that partial differentiation is independent of environ- 

to the occurrences of ¢ in —recall that has type

% % & &

P P

H ments: it is defined on the syntax of types alone. Just like

Q

¢ ¢

R = . . We therefore need to describe the one- P

P the subterm ordering, it takes no account of the possibil- J

hole contexts for ¢ ’s in ’s. Since may itself contain ¢ ity that some might, for some R , expand in terms of . fixed points, the primitive operation we need to define will

P However, partial differentiation is the basic tool by which compute for the type of one-hole contexts for any of its total differentiation is constructed, and this is exactly what free names. This operation is exactly partial differentia- we need when we go under a binder and become liable to tion. encounter local names which potentially do conceal ¢ ’s. The rule for definitions again handles local variables at 3.1 Partial Differentiation their place of binding, summing the types of contexts for

6

¡ It is depressing how few mathematics teachers deign to impart this The partial derivative of a regular type description with vital clue to calculus students, giving them rules but no method. I learned

respect to a free name ¢ is computed by structural recur- differentiation from the pattern matching algorithm given in [McB70].

6

Á Á

% % & & % % & &

P P P

Q

¢ R © ¢ R ¢ ¨

’s occurring directly in , and indirectly in an buried If =0 ¨ 0 then . 0 O

within a J . In conventional calculus, this is effectively the

¡Â ¡ o ‘chain rule’ extended to functions of two arguments o

Writing for the -fold product of ’s and for the

o o

´ ´{¼ ´ ´{¾

´ sum of 1’s, we can check (by induction on ), that for

 ¶   

oÄÉÅ

¸˜¿ ½ ¼¹¿ ¾

´ ´¹¸ ´ ´¹½ ´

 ·U|¨ ‡º» ‡º» z©

¤µ µ ¤ µ ¤

œKÀ œ

š š

Á

% % & &

 ¥Æ^Ç o

in the special case where is ¢ .

R £9¤„¢ p¢ ¨

Once we know how to differentiate definitions, we can  make the leap to fixed points. If we expand the fixed point o Viewed as a one-hole context for ¢ , the tells us which

and apply the ‘chain rule’, we obtain: ƏÇ

¢ ¢

¢ the hole is at, while the records the remaining ’s.

 J  J J

P P P

¢ ¢ Q

Q A more involved example finds an within a list of ’s

¤ ¤

­¨ £ © 

£ . = .

J J

P P

Q

¤

£ ©

¨ = .

Á

 J J

_ C€

J J  J

P P P

k Q

Q Q

£¥¤ ¢ £¥¤ s¢t 

¨

¤

© °£ 

£ = . . . 1

Á   J J

_ C€

²

Q^± k

“

¤

£ s¢t V© ¢

¨ . 1 =

  J J

_C €

²

k ±

s¢] 7V© ¢

It is tempting to solve this recursive equation with a least £ 1 =

Á

J J J

“ _C € ² _C € ²

Q^± k k ±

© ¢ -¢p© ¢t

fixed point, but does this give the correct type of one-hole ¨ . = =

J

Á  

P

_C € _ C€

Q

k k

¢

¢ ¢ contexts? Yes: every inside a . must be a piece of ¨ ‘payload data’ attached to some J -node buried at a finite

depth. Hence our journey takes us either to an ¢ at the

P We would indeed hope that a one-hole context for a list ¤ outermost node and stops—hence the £ in the ‘base

P element is a pair of lists—the prefix and the suffix!

case’—or to a subnode and onwards—hence the £ in

_ C€

“ k

the ‘step case’: it must stop eventually. We must weaken Amusingly, the following power series resembles ¢ ,

P ±

at ± , for is not free for . Our journey is clearly linear:

±

È

 È

the body of the fixed point is syntactically linear in . In- È

È,Ì

© ¢p©Í  $¢; -¢Ét $¢@Êt ËSVSzS9¨ for deed, the following set isomorphism holds, showing that ¢ our journey is a list of ‘steps’ with a ‘tip’:

and conventional calculus tells us that

² ²

% % & &^Á % %) & &

_ C€

¡ ¡

Q^± ± k

R R *

. ¨

O O

Î Î

È È

É

£

È,Ì È,Ì

¨

¢tÏ ¢Ï The rules for differentiating an explicit weakening simply £@¢ short-circuit the process if the name we seek is that being excluded.

3.3 Plugging In

¡

¤ ¢

3.2 Examples of Derivatives Given a one-hole context in £ and an , we should be ¡

able to construct a by plugging the ¢ in the hole. That Before we develop any more technology, let us check that is, we need an operation which behaves thus:

partial differentiation is giving us the kind of answers we

% % & & % % & & % % & & ¡

expect. Of course, the types we get back will contain lots ¡

¤

R £ R ¢ R

of ‘0 ’ and ‘1 ’, but the usual algebraic laws which sim-

Ð

¡

Ò Ò





‡ ‡ plify such expressions hold as set isomorphisms. It is also ¢^Ñ not hard to show that a recursive type with no base cases is empty:

7

% % & & % % & &

¡

ÒÓ 

R £¥¤ ‡ R ¢

Ð % % & &

¡ ¡

Ò  

¢^Ñ ‡ R

X YÔÐ



¢ ¢^Ñ ‡  ‡

Ð  Ð

¡

Ò  Ò 

¢^Ñ ‡  ¢^Ñ ‡

inl inl

O O

Ð  Ð

¡ ¡

Ò  Ò 

¢^Ñ ‡  ¢^Ñ ‡N

inr inr

O

X YÔÐ X Ð Y

H ¡ H

Ò  Ò 

¢^Ñ ‡  ¢^Ñ ‡ 

inl 

O O

X YÔÐ X Ð Y

H ¡ H ¡

 Ò   Ò 

¢^Ñ ‡  ¢^Ñ°‡

inr

O

 Ð J  Ð

P P

Ò Q  Ò 

¢^Ñ ‡  ¢+Ñ;‡

con inl  . con

 X Y Ð J  Ð J  Ð J

H{Õ P P Õ P

Ò ! Q  Ò  ! Q 

¢^Ñ ‡  Ñ ¢^Ñ°‡N7

con inr  . con .

Ð J Ð

P P

Ò  Ò 

¢^Ñ ‡  ¢+Ñ°‡

inl © =

O

X YÖÐ J Ð J  Ð

H P P

Ò Ò  Ò  Ò 

¤ ¤

¢^Ñ ‡  Ñ ¢+Ñ;‡

inr © =

O O

“ “

Ð

¡ ¡

ÒŒ×  Ò 

¢uØ-‡  ¢+Ñ°‡

¤ “V³

Figure 5: plugging in

¡

P D

 Q

How are we to define this operation? should tell us the ¢ . Reg

€¹Ùd[

P D

Q  Ò shape of the term we are trying to build; should tell us ¢ . Reg

which path to take and also supply the subterms corre-



_C € €zÙd[

P P P

k Q

sponding to the ‘off-path’ components of pairs. In effect, Q

 £9¤ © ¢ ¢  ¢

Ò . = . the operation proceeds by structural recursion on , but its

flow of control involves a primary case analysis on ¡ . Of

€¹Ù+[

P

¡

Q

¢ ¤ course, we need only consider cases where £ is not 0. Informally, an inhabitant of . looks like

The definition is in figure 5.

g

wW

Ò Ò Ò

')' ')'|ÝVÝzÝ '‚'

Ç Â

3.4 Subtrees in Recursive Regular Types É

¤

¢ ¢ The £ operation picks out ’s from containers for ’s. As suggested above, this gives us the tool we need to pick out A few brief calculations reassure us that

subtrees from our tree-like recursive datatypes, interpret- 

ing  as a container for the immediate subtrees of con .

¡

Á

% %  & & % % & &

€¹Ùd[ _ C€ _C € k

Hence our intuition that contexts for recursive type in- k

R ¢ R ¢

¨

¡

Á

& & & & % %  % %

€¹Ù+[ €zÙd[ €zÙd[

P P

Q Q

habit list can be made rigorous. The ‘step type’ for

 R ¢ R ¢ ¨

P hence . .

% % & &ßÁ % %  à & & Q

€zÙd[Þ[ _C € [

kVq*r¥r k kzq{r¥r

trees in ¢ . is

R R 

¨

Á

% % & & Câ

€¹Ù+[ _ C€

kzkVq*r¥r k kzkVq*r¥r

É

R RGá2á 5ã§ã

¨

Á

% % & & 

€¹Ùd[ä _ C€ _C €

kVq*r¥r k k kzq{r¥r

P P

Q

R R ƒ

á2á ã ã

¨

¤

© ¢ ¢

£ = .

€¹Ù+[ P

P The corresponding notion of ‘addition’ has the signature

Q Q ¢ The one-hole contexts for ¢ . thus inhabit .

given by7

% % & & % % & &

€¹Ùd[

Õ P P

!# Q  Q

¢ ‡ R ¢

R . .

% % & &

Õ P

 Q !

‡ R ¢

7 ¤ .

¤ ”

Abuse of notation—Ú Û¥Ü is really an operation on . ‘

8

Õ Õ

! ! Ò õ

For the ‘ ’ depicted above and an appropriate ‘ ‡ ’, we or for the direction. It is moreover the case that

Õ Õ

! Ò !

‡ ¤ expect ” . to be distinct ’s or ’s on the right give rise to distinct deriva- ‘ tions on the left, and vice versa. I omit the details for

reasons of space.

Ò Ò Ò

ÝVÝVÝ

Ç Â

con con con ‡ É

4 Towards an Implementation

Ð

P



¢^Ñ Of course, ¤ is defined by iterating over the ” . Recent extensions to the HASKELL type system, such as

8 ‘ list : Jansson and Jeuring’s POLYP have begun to realise the

potential of generic programming for the development

g

wW

‡  ‡

¤ of highly reusable code, instantiable for a wide class of

” .

  Ð 

Õ P

ҝ'‚' ! Ò  Òz

‘

 ‡  ¢^Ñ ‡ ¤

¤ con datatypes and characterised by equally generic theorems ”

” . . ‘

‘ [JJ97, BJJM98]. However, these systems show no sign of ¤ allowing operations like £ which compute types gener- Observe, for example, that {å„æ really behaves like ‘plus’,

¢ ically by recursion over a closed syntax of type expres-

whilst ¤ is effectively ‘append’. It is not hard to show

€¹Ù+[

P ž Ÿ ¡5¢ Q sions, crucially regarding type variables as concrete ob-

in general that ¢ . is a monoid under ‘append’ and P

Q jects.

¢ ¤ . that ” . is an action of that monoid on . ‘ In a theory supporting inductive families of datatypes [Dyb91], syntaxes such as Reg D can be rep-

3.5 Subterms and Derivatives resented as ordinary data which can then be used to index

% % & &

¡ £¥¤ the families like R which reflect them. becomes The relationship between the containment orderings and merely an operation on data, requiring no extension to

derivatives is an intimate one, and so too is that between the computational power of the theory. A programming €¹Ù+[ the subtree orderings and the types computed by . In language based on such a type theory, as envisaged in

effect, the containment ordering is exactly that induced [McB99], would seem to be a promising setting in which

P Q

by plugging in, while the subtree ordering on ¢ . is in- to implement the technology described in this paper. This

¤ duced by ” . . What we have done is to give a concrete work is well under way in the Computer-Assisted Reason- representation ‘ for the witnesses to these relational prop- ing Group at Durham. erties. We may prove the following theorems: The payoffs from such an implementation seem likely to be substantial, extending far beyond applications to edit-

Theorem (containment)

% % & & % % & &

¡ ing trees. A library of generic tools for working with con-

 

R ¢  R

For ‡ and : texts would allow us to define functions in terms of these

& & Ð % %

¡ ¡

Ý¹Ò  ÒÓ

¤

¢+Ñ;‡é¨ê ‡Œˆ(Š äç è R £ ¤ higher-level structures, manipulating data in large chunks, rather than one constructor at a time. Furthermore, in a

Theorem (subtree)

& & % %

P dependently typed setting, a richer structure on data is

 Q

R ¢

For ༠. : ipso facto a richer structure on the indices of datatypes.

& & % %

€zÙd[

P Õ Õ

Ý ! !# Q

¤

‡Œˆ ‡ä¨ê ç è R ¢ ¤

” .

. ” . ‘ ‘ It also seems highly desirable to extend the class of datatypes for which one-hole contexts can be manipulated The proofs of these theorems are easy inductions, on generically beyond the regular types, perhaps to include

derivations for the ë direction, and on the structure of indexed families themselves. A concrete representation

8Again, Huet would write of one-hole contexts for a syntax with binding has much

8ªîzï 8¬îVï

<>òì7/ . .0ì(ó ¤(Bu£˜ô 6C6

.)ì^121Uí36 con

ð ñ . ð,ñ . to offer both metaprogramming and metatheory.

9 5 Conclusions and Future Work lor series seems worthy of pursuit and is an active topic of research.

This paper has shown that the one-hole contexts for ele- In summary, the establishment of the connection between ments contained in a polymorphic regular type can be rep- contexts and calculus is but the first step on a long road— resented as the inhabitants of the regular type computed who knows where it will end? from the original by partial differentiation. This technol- ogy has been used to characterize data structures equiv- alent to Huet’s ‘zippers’—one-hole contexts for the sub- Acknowledgements trees of trees inhabiting arbitrary recursive types in that class. The operations which plug appropriate data into I feel very lucky to have stumbled on this interpretation of the holes of such contexts have also been exhibited. differentiation for datatypes with such potential for both While this connection seems unlikely to be a mere coinci- utility and fascination, and has been intriguing me since dence, it is perhaps surprising to find a use for the laws the day I made the connection (whilst changing trains of the infinitesimal calculus in such a discrete setting. at Shrewsbury). This work has benefited greatly from There is no obvious notion of ‘tangent’ or of ‘limit’ for hours spent on trains, and enthusiastic discussions with datatypes which might connect with our intuitions from many people, most notably Konstantinos Tourlas, James school mathematics. Neither can I offer at present any McKinna, Alex Simpson, Daniele Turi, Martin Hofmann, sense in which ‘integration’ might mean more than just Thorsten Altenkirch and, especially, Peter Hancock. ‘differentiation backwards’. One observation does, however, seem relevant: the syn- tactic operation of differentiating an expression with re- References

spect to ¢ generates an approximation to the change in value of that expression by summing the contributions [BJJM98] Roland Backhouse, Patrik Jansson, Johan

generated by varying each of the ¢ ’s in the expression in Jeuring, and . Generic turn. The derivative is thus the sum of terms correspond- programming—an introduction. In S. Doaitse

ing to each one-hole context for an ¢ in the expression. Sweierstra, Pedro R. Henriques, and Jose´ N. Perhaps the key to the connection can be found by fo- Oliveira, editors, Advanced Functional cusing not on what is being infinitesimally varied, but on Programming, Third International Summer what, for the sake of a linear approximation to the curve, School (AFP ’98); Braga, Portugal, volume is being kept the same. 1608 of LNCS, pages 28–115. Springer- Verlag, 1998. Apart from the implementation of this technology, and the development of a library of related generic utili- [BM98] Richard Bird and Lambert Meertens. Nested ties, this work opens up a host of fascinating theoretical datatypes. In Mathematics of Program Con- possibilities—one only has to open one’s old school text- struction, volume 1422 of LNCS, pages 52–67. books almost at random and ask ‘what does this mean for Springer-Verlag, 1998. datatypes?’. [BP99] Richard Bird and Ross Paterson. de Bruijn no- There must surely also be a relationship between this tation as a nested datatype. Journal of Func- work and Joyal’s general characterization of ‘species of tional Programming, 9(1):77–92, 1999.

structure’ in terms of their Taylor series[Joy87]. For reg-

¡

£NÉ ¢

ular types, ¤ makes a hole for a second in a one-hole [Dyb91] Peter Dybjer. Inductive sets and families in

¡ Â ¡

o

£¥¤ £

context from . More generally, ¤ gives all the - Martin-Lof¨ ’s type theory. In G. Huet and

¡ ouö

hole contexts for ¢ ’s in ’s in each of the orders with G. Plotkin, editors, Logical Frameworks. CUP, which they can be found. This strong resonance with Tay- 1991.

10 [Hue97] Gerard´ Huet. The Zipper. Journal of Func- tional Programming, 7(5):549–554, 1997. [JJ97] Patrik Jansson and Johan Jeuring. PolyP— a polytypic extension. In Proceedings of POPL ’97, pages 470–482. ACM, January 1997. [Joy87] Andre´ Joyal. Foncteurs analytiques et especes` de structures. In Combinatoire enumer´ ative, volume 1234 of LNM, pages 126–159. Springer-Verlag, 1987. [McB70] Fred McBride. Computer Aided Manipulation of Symbols. PhD thesis, Queen’s University of Belfast, 1970. [McB99] Conor McBride. Dependently Typed Func- tional Programs and their Proofs. PhD thesis, , 1999.

11