Parametric Types for Typed Attribute-Value Logic

Gerald Penn, Universität Tübingen, Kl. Wilhelmstr. 113, 72074 Tuebingen, Germany, [email protected]

Abstract

Parametric polymorphism has been combined with inclusional polymorphism to provide natural type systems for Prolog (DH88), HiLog (YFS92), and constraint resolution languages (Smo89), and, in linguistics, by HPSG-like grammars to classify lists and sets of linguistic objects (PS94), and by phonologists in representations of hierarchical structure (Kle91). This paper summarizes the incorporation of parametric types into the typed attribute-value logic of (Car92), thus providing a natural extension to the type system for ALE (CP96). Following (Car92), the concern here is not with models of feature terms themselves, but with how to compute with parametric types, and what different kinds of information one can represent relative to a signature with parametric types, than relative to a signature without them. This enquiry has yielded a more flexible interpretation of parametric types with several specific properties necessary to conform to their current usage by linguists and implementors who work with feature-based formalisms.

1 Motivation

Linguists who avail themselves of attribute-value logic normally choose whether to encode information with subtypes or features on the aesthetic basis of what seems intuitively to capture their generalizations better. Linguists working in LFG typically use one implicit type for objects that bear features, and other types (atoms) for only featureless objects. In HPSG, the situation is less clear, both historically (semantic relations, for example, used to be values of a RELN attribute, and are now subtypes of a more general semantic type), and synchronically (verbs, for example, are identified as (un)inverted and (non-)auxiliaries by two boolean-valued features, AUX and INV, whereas their form, e.g., finite, infinitive, gerund, is identified by a subtype of a single vform type). That it makes, or at least should make, no difference from a formal or implementational point of view which encoding is used has been argued elsewhere (Mos96; Pen-f).

HPSG's type system also includes parametric types, e.g., Figure 1, from (PS94). In contrast to the relative expressive potential of normal typing and features, the expressive potential of parametric types is not at all understood. In fact, parametric types have never been formalized in a feature logic or in a manner general enough to capture their use in HPSG parsing so that a comparison could even be drawn. This paper summarizes such a formalization,¹ based on the typed attribute-value logic of (Car92). This logic is distinguished by its strong interpretation of appropriateness, a set of conditions that tell us which features an object of a given type can have, and which types a feature's value can have. Its interpretation, total well-typedness, says that every feature structure must have an appropriate value for all and only the appropriate features of its type.

[Figure 1: A fragment of the HPSG type signature. Recoverable labels: word, phrase, elist, nelist(X) with HEAD:X and TAIL:list(X), and $\bot$.]

¹ The full version of this paper presents a denotational semantics of the logic described here.

Previous approaches have required that every parameter of a subtype should be a parameter of all of its supertypes, and vice versa; thus, it would not be

possible to encode Figure 1 because $\bot \sqsubseteq$ list(X), and if $\bot$ were parametric, then all other types would be.² The present one eliminates this restriction (Section 2) by requiring the existence of a simple most general type (which (Car92)'s logic requires anyway), which is then used during type-checking and inferencing to interpret new parameters. All previous approaches deal only with fixed-arity terms; and none but one uses a feature logic, with the one, CUF (Dor92), being an implementation that permits parametric lists only as a special case. The present approach (Section 4) provides a generalization of appropriateness that permits both unrestricted parametricity and incremental feature introduction.

² In this paper, the most general type will be called $\bot$.

In contrast to the other encoding trade-off, the use of parametric types in HPSG linguistics exhibits almost no variation. They are used almost exclusively for encoding lists (and, unconvincingly, sets), either with type arguments as they are posited in (PS94), or with general description-level arguments, e.g., list(LOCAL:CAT:HEAD:verb), the latter possibly arising out of the erroneous belief that parametric types are just "macro" descriptions for lists. Even in the former case, however, parametric types have as wide a range of potential application to HPSG as simple types and features do; and there is no reason why they cannot be used as prolifically once they are understood. To use an earlier example, auxiliary, inverted, and verb_form could all be parameters of a parametric type, verb. In fact, parametrically typed encodings yield more compact specifications than simply typed encodings because they can encode products of information in their parameters, like features. Unlike features, however, they can lend their parameters to appropriateness restrictions, thus refining the feature structures generated by the signature to a closer approximation of what is actually required in the grammar theory itself.

It is possible, however, to regard parametric type signatures³ as a shorthand for non-parametric signatures. The interpretation of parametric type hierarchies is introduced in Section 3 by way of establishing equivalent, infinite non-parametric counterparts. Section 5 considers whether there are any finite counterparts, i.e., whether in actual practice parametric signatures are only as expressive as non-parametric ones, and gives a qualified "yes."

³ By "signature," I refer to a partial order of types plus feature appropriateness declarations. The partial order itself, I shall refer to as a "type (inheritance) hierarchy."

In spite of this qualification, there is an easy way to compute with parametric types directly in an implementation, as described in Section 6. The two most common previous approaches have been to use the most general instance of a parametric type, e.g. nelist($\bot$), without its appropriateness, or manually to "unfold" a parametric type into a non-parametric sub-hierarchy that suffices for a fixed grammar (e.g. Figure 2). The former does not suffice even for fixed grammars because it simply disables type checking on feature values. The latter is error-prone, a nuisance, and subject to change with the grammar. As it happens, there is an automatic way to perform this unfolding.

[Figure 2: A manually unfolded sub-hierarchy.]

2 Parametric Type Hierarchies

Parametric types are not types. They are functions that provide access or a means of reference to a set of types (their image) by means of argument types, or "parameters" (their domain). Figure 1 has only unary functions; but in general, parametric types can be n-ary functions over n-tuples of types.⁴ This means that hierarchies that use parametric types are not "type" hierarchies, since they express a relationship between functions (we can regard simple types as nullary parametric types):

Definition 1: A parametric (type) hierarchy is a finite meet semilattice, $\langle P, \sqsubseteq_P \rangle$, plus a partial argument assignment function, $a_P : P \times P \times \mathit{Nat} \rightarrow \mathit{Nat} \cup \{0\}$, in which:

• $P$ consists of (simple and) parametric types (i.e. no ground instances of parametric types), including the simple most general type, $\bot$,
• For $p, q \in P$, $a_P(p, q, i)$, written $a_p^q(i)$, is defined iff $p \sqsubseteq_P q$ and $1 \le i \le \mathit{arity}(p)$, and
[...]

⁴ In this paper, "parametric type" will refer to such a function, written as the name of the function, followed by the appropriate number of "type variables," variables that range over some set of types, in parentheses, e.g. list(X). "Type" will refer to both "simple types," such as $\bot$ or elist; and "ground instances" of parametric types, i.e. types in the image of a parametric type function, written as the name of the function followed by the appropriate number of actual type parameters in parentheses, such as list($\bot$), set(psoa) or list(set($\bot$)). I will use letters t, u, and v to indicate types; capital letters to indicate type variables; capitalized words to indicate feature names; p, q, and r for names of parametric types; and g to indicate ground instances of parametric types, where the arguments do not need to be expressed.

[Figure 3: A subtype that inherits type variables from more than one supertype.]

[...] $a_b^d(2) = 3$, $a_c^d(2) = 2$, $a_c^d(1) = 0$, and $a_\bot$ and $a_e$ are undefined ($\uparrow$) for any pair in $P \times \mathit{Nat}$.

3 Induced Type Hierarchies

The relationship expressed between two functions by $\sqsubseteq_P$, informally, is one between their image sets under their domains,⁵ while each image set internally preserves the subsumption ordering of its domain. It is, thus, possible to think of a parametric type hierarchy as "inducing" a non-parametric type hierarchy, populated with the ground instances of its parametric types, that obeys both of these relationships.

⁵ One can restrict these domains with "parametric restrictions," a parallel to appropriateness restrictions on feature values. This abstract assumes that these domains are always the set of all types in the signature. This is the most expressive case of parametric types, and the worst case, computationally.

Definition 2: Given parametric type hierarchy, $\langle P, \sqsubseteq_P, a_P \rangle$, the induced (type) hierarchy, $\langle I(P), \sqsubseteq_I \rangle$, is defined such that:

• $I(P)$ is the smallest set, $I$, such that, for every parametric type, $p(X_1, \ldots, X_n) \in P$, and for every tuple, $(t_1 \ldots t_n) \in I^n$, $p(t_1, \ldots, t_n) \in I$.
• $p(t_1, \ldots, t_n) \sqsubseteq_I q(u_1, \ldots, u_m)$ iff $p \sqsubseteq_P q$, and, for all $1$ [...]

[Figure 5: Another possible induced hierarchy.]

[...] not shared the same type variable [...], then it would have induced the type hierarchy in Figure 5. In the hierarchy induced

by Figure 3, b(e,e) subsumes types d(e,Y,e), [...] d(c($\bot$,e),e,e), since $e \not\sqsubseteq_I c(\bot, e)$. Also, for any types, W, X, and Z, c(W,e) subsumes d(X,e,Z).

The present approach permits parametric types in the type signature, but only ground instances in a grammar relative to that signature. If one must refer to "some list" or "every list" within a grammar, for instance, one may use list($\bot$), while still retaining groundedness. An alternative to this approach would be to attempt to cope with type variable parameters directly within descriptions. From a processing perspective, this is problematic when closing such descriptions under total well-typing, as observed in (Car92). The most general satisfier of the description, list(X) $\wedge$ (HEAD:HEAD $\doteq$ TAIL:HEAD), for example, is an infinite feature structure of the infinitely parametric type, nelist(nelist(..., because X must be bound to nelist(X).

For which $P$ does it make sense to talk about unification in $I(P)$, that is, when is $I(P)$ a meet semilattice? We can generalize the usual notion of coherence from programming languages, so that a subtype can add, and in certain cases drop, parameters with respect to a supertype:

Definition 3: $\langle P, \sqsubseteq_P, a_P \rangle$ is semi-coherent if, for all $p, q \in P$ such that $p \sqsubseteq_P q$, all $1 \le i \le \mathit{arity}(p)$, $1 \le j \le \mathit{arity}(q)$:

• $a_p^p(i) = i$,
• either $a_p^q(i) = 0$ or for every chain, $p = p_1 \sqsubseteq_P p_2 \sqsubseteq_P \ldots \sqsubseteq_P p_n = q$, $a_p^q(i) = a_{p_{n-1}}^{p_n}(\ldots a_{p_1}^{p_2}(i) \ldots)$, and
• If $p \sqcup_P q \downarrow$, then for all $i$ and $j$ for which there is a $k \ge 1$ such that $a_p^{p \sqcup_P q}(i) = a_q^{p \sqcup_P q}(j) = k$, the set, $\{ r \mid p \sqcup_P q \sqsubseteq_P r \text{ and } (a_p^r(i) = 0 \text{ or } a_q^r(j) = 0) \}$ is empty or has a least element (with respect to $\sqsubseteq_P$).

Theorem 1: If $\langle P, \sqsubseteq_P, a_P \rangle$ is semi-coherent, then $\langle I(P), \sqsubseteq_I \rangle$ is a meet semilattice. In particular, $p(t_1, \ldots, t_n) \sqcup_I q(u_1, \ldots, u_m) = r(v_1, \ldots, v_s)$, where $p \sqcup_P q = r$, and, for all $1 \le k \le s$,

$$v_k = \begin{cases} t_i \sqcup_I u_j & \text{if there exist } i \text{ and } j \text{ such that } a_p^r(i) = k \text{ and } a_q^r(j) = k \\ t_i & \text{if such an } i \text{, but no such } j \\ u_j & \text{if such a } j \text{, but no such } i \\ \bot & \text{if no such } i \text{ or } j. \end{cases}$$

So $p(t_1, \ldots, t_n) \sqcup_I q(u_1, \ldots, u_m) \uparrow$ if $p \sqcup_P q \uparrow$, or there exist $i$, $j$, and $k \ge 1$ such that $a_p^r(i) = a_q^r(j) = k$, but $t_i \sqcup_I u_j \uparrow$.⁶

⁶ The proofs of these theorems can be found in the full version of this paper.

In the induced hierarchy of Figure 3, for example, b(e,$\bot$) $\sqcup_I$ b($\bot$,e) = b(e,e); b(e,e) $\sqcup_I$ c($\bot$) = d(e,$\bot$,e); and b(e,e) and b(c($\bot$),e) are not unifiable, as e and c($\bot$) are not unifiable.

The first two conditions of semi-coherence ensure that $a_P$, taken as a relation between pairs of pairs of types and natural numbers, is an order induced by the order, $\sqsubseteq_P$, where it is not, taken as a function, zero. The third ensures that joins are preserved even when a parameter is dropped ($a_P = 0$). Note that joins in an induced hierarchy do not always correspond to joins in a parametric hierarchy. In those places where $a_P = 0$, types can unify without a corresponding unification in their parameters. Such is the case in Figure 5, where every instance of list(X) ultimately subsumes nelist($\bot$). One may also note that induced hierarchies can have not only deep infinity, where there exist infinitely long subsumption chains, but broad infinity, where certain types can have infinite supertype (but never subtype) branching factors, as in the case of nelist($\bot$) or, in Figure 1, elist.

4 Appropriateness

So far, we have formally considered only type hierarchies, and no appropriateness. Appropriateness constitutes an integral part of a parametric type signature's expressive power, because the scope of its type variables extends to include it.

Definition 4: A parametric (type) signature is a parametric hierarchy, $\langle P, \sqsubseteq_P, a_P \rangle$, along with a finite set of features, $\mathit{Feat}_P$, and a partial (parametric) appropriateness function, $\mathit{Approp}_P : \mathit{Feat}_P \times P \rightarrow Q$, where $Q = \bigcup_{n \in \mathit{Nat}} Q_n$, and each $Q_n$ is the smallest set satisfying the equation, $Q_n = \{1, \ldots, n\} \cup \{ p(q_1, \ldots, q_k) \mid p \in P \text{ arity } k,\ q_i \in Q_n \}$, such that:

1. (Feature Introduction) For every feature $f \in \mathit{Feat}_P$, there is a most general parametric type $\mathit{Intro}(f) \in P$ such that $\mathit{Approp}_P(f, \mathit{Intro}(f))$ is defined
2. (Upward Closure / Right Monotonicity) For any $p, q \in P$, if $\mathit{Approp}_P(f, p)$ is defined and $p \sqsubseteq_P q$, then $\mathit{Approp}_P(f, q)$ is also defined and $\mathit{Approp}_P(f, p) \sqsubseteq_Q \mathit{Approp}_P(f, q)$, where $\sqsubseteq_Q$ is defined as $\sqsubseteq_{I(P)}$ with natural numbers interpreted as universally quantified variables (e.g. $a(1) \sqsubseteq_Q b(1)$ iff $\forall x \in I(P).\ a(x) \sqsubseteq_{I(P)} b(x)$)
3. (Parameter Binding) For every $p \in P$ of arity $n$, for every $f \in \mathit{Feat}_P$, if $\mathit{Approp}_P(f, p)$ is defined, then $\mathit{Approp}_P(f, p) \in Q_n$.

$\mathit{Approp}_P$ maps a feature and the parametric type for which it is appropriate to its value restriction on that parametric type. The first two conditions are the usual conditions on (Car92)'s appropriateness. The third says that the natural numbers in its image refer, by position, to the parametric variables of the appropriate parametric type; we can use one of these parameters wherever we would normally use a type. Notice that ground instances of parametric types are permitted as value restrictions, as are instances of parametric types whose parameters are bound to these parametric variables, as are the parametric variables themselves. The first is used in HPSG for features such as SUBCAT, whose value must be list(synsem); whereas the second and third are used in the appropriateness specification for nelist(X) in Figure 1. The use of parameters in appropriateness restrictions is what conveys the impression that ground instances of lists or other parametric types are more related to their parameter types than just in name.

It is also what prevents us from treating instances of parametric types in descriptions as instantiations of macro descriptions. These putative "macros" would be, in many cases, equivalent only to infinite descriptions without such macros, and thus would extend the power of the description language beyond the limits of HPSG's own logic and model theory. Lists in HPSG would be one such case, moreover, as they place typing requirements on every element of lists of unbounded length. Ground instances of parametric types are also routinely used in appropriate value restrictions, whose extension to arbitrary descriptions would substantially extend the power of appropriateness as well. This alternative is considered further in the full version of this paper.

A parametric signature induces a type hierarchy as defined above, along with the appropriateness conditions on its ground instances, determined by the substitution of actual parameters for natural numbers. Thus:

Theorem 2: If $\mathit{Approp}_P$ satisfies properties (1)-(3) in Definition 4, then $\mathit{Approp}_{I(P)}$ satisfies properties (1) and (2).

5 Signature Subsumption

Now that parametric type signatures have been formalized, one can ask whether parametric types really add something to the expressive power of typed attribute-value logic. There are at least two ways in which to formalize that question:

Definition 5: Two type signatures, $P$ and $Q$, are equivalent ($P \approx_S Q$) if there exists an order-isomorphism (w.r.t. subsumption) between the abstract totally well-typed feature structures of $P$ and those of $Q$.

Abstract totally well-typed feature structures are the "information states" generated by signatures. Formally, as (Car92) shows, they can either be thought of as equivalence classes of feature structures modulo alphabetic variants, or as pairs of a type assignment function on feature paths and a path equivalence relation. In either case, they are effectively feature structures without their "nodes," which only bear information insofar as they have a type and serve as the focus of potential instances of structure sharing among feature paths, where the traversal of two different paths from the same node leads to the same feature structure.

If, for every parametric signature $P$, there is a finite non-parametric $N$ such that $P \approx_S N$, then parametric signatures add no expressive power at all: their feature structures are just those of some non-parametric signatures painted a different color. This is still an open question. There is, however, a weaker but still relevant reading:

Definition 6: Type signature, $P$, subsumes signature $Q$ ($P \sqsubseteq_S Q$) if there exists an injection, $f$, from the abstract totally well-typed feature structures of $P$ to those of $Q$, such that:

• if $F_1 \sqcup_{AT(P)} F_2 \uparrow$, then $f(F_1) \sqcup_{AT(Q)} f(F_2) \uparrow$,
• otherwise, both exist and $f(F_1 \sqcup_{AT(P)} F_2) = f(F_1) \sqcup_{AT(Q)} f(F_2)$.

If for every parametric $P$, there is a finite non-parametric $N$ such that $P \sqsubseteq_S N$, then it is possible to embed problems (specifically, unifications) that we wish to solve from $P$ into $N$, solve them, and then map the answers back to $P$. In this reading, linguist users who want to think about their grammars with $P$ must accept no non-parametric imitations because $N$ may not have exactly the same structure of information states; but an implementor of a feature-based NLP system, for example, could secretly perform all of the work for those grammars in $N$, and no one would ever notice.

Under this reading, many parametrically typed encodings add no extra expressive power:

Definition 7: Parametric type hierarchy, $\langle P, \sqsubseteq_P, a_P \rangle$ is persistent if $a_P$ never attains zero.

Theorem 3: For any persistent parametric signature, $P$, there is a finite non-parametric signature, $N$, such that $P \sqsubseteq_S N$.

If elist in Figure 1 retained list(X)'s parameter, then HPSG's type hierarchy (without sets) would be persistent. This is not an unreasonable change to make. The encoding, however, requires the use of junk slots, attributes with no empirical significance whose values serve as workspace to store intermediate results.

There are at least some non-persistent $P$, including the portion of HPSG's type hierarchy explicitly introduced in (PS94) (without sets), that subsume a finite non-parametric $N$; but the encodings are far worse. It can be proven, for example, that for any such $P$, some of its acyclic feature structures must be encoded by cyclic feature structures in $N$; and the encoding cannot be injective on the equivalence classes induced by the types of $P$, i.e. some type in $N$ must encode the feature structures of more than one type from $P$. While parametric types may not be necessary for the grammar presented in (PS94) in the strict sense, their use in that grammar does roughly correspond to cases for which the alternative would be quite unappealing. Of course, parametric types are not the only extension that would ameliorate these encodings. The addition of relational expressions, functional uncertainty, or more powerful appropriateness restrictions can completely change the picture.

6 Finiteness

It would be ideal if, for the purposes of feature-based NLP, one could simply forget the encodings, unfold any parametric type signature into its induced signature at compile-time and then proceed as usual. This is not possible for systems that pre-compute all of their type operations, as the induced signature of any parametric signature with at least one non-simple type contains infinitely many types.⁷ On the other hand, at least some pre-compilation of type information has proven to be an empirical necessity for efficient processing.⁸ Given that one will only see finitely many ground instances of parametric types in any fixed theory, however, it is sufficient to perform some pre-compilation specific to those instances, which will involve some amount of unfolding. What is needed is a way of determining, given a signature and a grammar, what part of the induced hierarchy could be needed at run-time, so that type operations can be compiled only on that part.

⁷ With parametric restrictions (fn. 5), this is not necessarily the case.

⁸ Even in LFG, a sensible implementation will use de facto feature co-occurrence constraints to achieve much of the same effect.

One way to identify this part is to identify some set of ground instances (a generator set) that are necessary for computation, and close that set under $\sqcup_{I(P)}$:

Theorem 4: If $G \subseteq I(P)$ is finite, then the sub-algebra of $I(P)$ generated by $G$, $I(G)$, is finite.

$|I(G)|$ is exponential in $|G|$ in the worst case; but if the maximum parametric depth of $G$ can be bounded (thus bounding $|G|$), then it is polynomial in $|P|$, although still exponential in the maximum arity of $P$:

Definition 8: Given a parametric hierarchy, $P$, the parametric depth of a type, $t = p(t_1, \ldots, t_n) \in I(P)$, $\delta(t)$, is $0$ if $n = 0$, and $1 + \max_{1 \le i \le n} \delta(t_i)$ if $n > 0$.

So, for example, $\delta$(list(list(list($\bot$)))) $= 3$.

In practice, the maximum parametric depth should be quite low,⁹ as should the maximum arity. A standard closure algorithm can be used, although it should account for the commutativity and associativity of unification. One could also perform the closure lazily during processing to avoid a potentially exponential delay at compile-time. All of the work, however, can be performed at compile-time. One can easily construct a generator set: simply collect all ground instances of types attested in the grammar, or collect them and add all of the simple types, or add the simple types along with some extra set of types distinguished by the user at compile-time. The partial unfoldings like Figure 2 are essentially manual computations of $I(G)$.

⁹ With lists, so far as I am aware, the potential demand has only reached $\delta = 2$ (MSI98) in the HPSG literature to date.

Some alternatives to this approach are discussed in the full version of this paper. The benefit of this one is that, by definition, $I(G)$ is always closed under $\sqcup_{I(P)}$. In fact, $I(G)$ is the least set of types that is adequate for unification-based processing with the given grammar. Clearly, this method of sub-signature extraction can be used even in the absence of parametric types, and is a useful, general tool for large-scale grammar design and grammar re-use.

7 Conclusion

This paper presents a formal definition of parametric type hierarchies and signatures, extending (Car92)'s logic to the parametric case through equivalent induced non-parametric signatures. It also extends appropriateness to the common practice of giving the binding of parametric type variables scope over appropriate value restrictions.

Two formalizations of the notion of expressive equivalence for typed feature structures are also provided. While the question of $\approx_S$-equivalence remains to be solved, a weaker notion can be used to establish a practical result for understanding what parametric types actually contribute to the case of HPSG's type signature. A general method for generating sub-signatures is outlined, which, in the case of parametric type signatures, can be used to process with signatures that even have infinite equivalent induced signatures, avoiding equivalent encoding problems altogether.

Parametric type compilation is currently being implemented for ALE using the method given in Section 6.

References

(Car92) Carpenter, B., 1992. The Logic of Typed Feature Structures. Cambridge University Press.
(CP96) Carpenter, B., and Penn, G., 1996. Efficient Parsing of Compiled Typed Attribute Value Logic Grammars. In H. Bunt and M. Tomita, eds., Recent Advances in Parsing Technology, pp. 145-168. Kluwer.
(DH88) Dietrich, R. and Hagl, F., 1988. A Polymorphic Type System with Subtypes for Prolog. Proceedings of the 2nd European Symposium on Programming, pp. 79-93. Springer LNCS 300.
(Dor92) Dorna, M., 1992. Erweiterung der Constraint-Logiksprache CUF um ein Typsystem. Diplomarbeit, Universität Stuttgart.
(Kle91) Klein, E., 1991. Phonological Data Types. In S. Bird, ed., Declarative Perspectives on Phonology, pp. 127-138. Edinburgh Working Papers in Cognitive Science, 7.
(MSI98) Manning, C., Sag, I., and Iida, M., 1998. The Lexical Integrity of Japanese Causatives. To appear in G. Green and R. Levine eds., Studies in Contemporary Phrase Structure Grammar. Cambridge.
(Mos96) Moshier, M. A., 1995. Featureless HPSG. In P. Blackburn and M. de Rijke, eds., Specifying Syntactic Structures. CSLI Publications.
(Pen-f) Penn, G., forthcoming. Ph.D. Dissertation, Carnegie Mellon University.
(PS94) Pollard, C. and Sag, I., 1994. Head-Driven Phrase Structure Grammar. University of Chicago Press.
(Smo89) Smolka, G., 1989. Logic Programming over Polymorphically Order-Sorted Types. Ph.D. Dissertation, Universität Kaiserslautern.
(YFS92) Yardeni, E., Frühwirth, T. and Shapiro, E., 1992. Polymorphically Typed Logic Programs. In F. Pfenning, ed., Types in Logic Programming, pp. 63-90. MIT Press.
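As an illustration of the generator-set closure and the parametric depth measure discussed under Finiteness, the following is a minimal sketch, not the ALE implementation. It assumes a small hypothetical hierarchy in which elist retains list(X)'s parameter (the persistent variant mentioned in the text) and in which every subtype passes its parameters through unchanged, so unification of ground instances reduces to a join on type names plus componentwise unification of parameters. The names `SUBSUMES`, `join_names`, `unify`, `depth`, and `close` are invented for this sketch, and the closure loop is a naive worklist that does not exploit the commutativity or associativity of unification.

```python
# Ground types are nested tuples: ("bot",), ("list", ("sign",)), ...
# SUBSUMES is a hypothetical toy hierarchy: each name maps to the set of
# names it subsumes (itself plus everything more specific).
SUBSUMES = {
    "bot": {"bot", "sign", "word", "phrase", "list", "elist", "nelist"},
    "sign": {"sign", "word", "phrase"},
    "word": {"word"},
    "phrase": {"phrase"},
    "list": {"list", "elist", "nelist"},
    "elist": {"elist"},
    "nelist": {"nelist"},
}


def join_names(p, q):
    """Most general name subsumed by both p and q, or None if there is none."""
    common = SUBSUMES[p] & SUBSUMES[q]
    for m in common:
        if common <= SUBSUMES[m]:
            return m
    return None


def unify(t, u):
    """Unify two ground types; None means they are not unifiable.

    Assumption: parameters are passed through positionally, so two
    parametric types are unified parameter by parameter, and a simple
    type contributes no parameters of its own.
    """
    name = join_names(t[0], u[0])
    if name is None:
        return None
    t_args, u_args = t[1:], u[1:]
    if t_args and u_args:
        args = tuple(unify(a, b) for a, b in zip(t_args, u_args))
        if any(a is None for a in args):
            return None
    else:
        args = t_args or u_args
    return (name,) + args


def depth(t):
    """Parametric depth: 0 for simple types, else 1 + max over parameters."""
    args = t[1:]
    return 0 if not args else 1 + max(depth(a) for a in args)


def close(generators):
    """Close a finite generator set under unification (naive worklist)."""
    closed = set(generators)
    frontier = list(closed)
    while frontier:
        t = frontier.pop()
        for u in list(closed):
            v = unify(t, u)
            if v is not None and v not in closed:
                closed.add(v)
                frontier.append(v)
    return closed


G = {("list", ("sign",)), ("nelist", ("bot",)), ("elist", ("bot",)), ("word",)}
closure = close(G)
```

With this generator set, the closure adds only nelist(sign) and elist(sign), and no type in it exceeds parametric depth 1. A hierarchy that drops or reorders parameters would need the argument assignment function to be consulted inside `unify`, which this sketch deliberately omits.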
