<<

Implementing the of Japanese Numeral Classifiers

Emily M. Bender Melanie Siegel Department of Saarland University University of Washington Computational Linguistics Box 354340 PF 15 11 50 Seattle WA 98195-4340 D-66041 Saarbruck¨ en [email protected] [email protected]

Abstract originally developed as part of the Verbmobil project (Siegel, 2000) to handle spoken Japanese, and then While the sortal constraints associated extended to handle informal written Japanese (email with Japanese numeral classifiers are well- text; (Siegel and Bender, 2002)) and newspaper text. studied, less attention has been paid to Recently, it has been adapted to be consistent with the details of their syntax. We describe the LinGO Matrix (Bender et al., 2002). an analysis implemented within a broad- coverage HPSG that handles an intricate 2 Types of numeral classifiers set of numeral classifier construction types and compositionally relates each to an ap- Paik and Bond (2002) divide Japanese numeral clas- propriate semantic representation, using sifiers into five major classes: sortal, event, men- Minimal Recursion . sural, group and taxanomic, and several subclasses. The classes and subclasses can be differentiated ac- 1 Introduction cording to the semantic relationship between the classifiers and the they modify, on two lev- Much attention has been paid to the semantic aspects els: First, what properties of the modified mo- of Japanese numeral classifiers, in particular, the se- tivate the choice of the classifier, and second what mantic constraints governing which classifiers co- properties the classifiers of the nouns. As occur with which nouns (Matsumoto, 1993; Bond we are concerned here with the syntax and com- and Paik, 2000). Here, we focus on a more neglected positional semantics of numeral classifiers, we will aspect of this linguistic phenomenon, namely the focus only on the latter. Sortal classifiers, (kind, syntax of numeral classifiers: How they combine shape, and classifiers) serve to indi- with number names to create numeral classifier viduate the nouns they modify. Event classifiers phrases, how they modify head nouns, and how they quantify events, characteristically modifying can occur as stand-alone NPs. We find that there is rather than nouns. Mensural classifiers measure both broad similarity and differences in detail across some property of the entity denoted by the noun they different types of numeral classifiers in their syntac- modify (e.g., its length). NPs containing group clas- tic and semantic behavior. We present semantic rep- sifiers denote a group or set of individuals belonging resentations for two types of numeral classifiers, and to the type denoted by the noun. Finally, taxonomic describe how they can be constructed composition- classifiers force a kind or species reading on an NP. ally in an implemented broad-coverage HPSG (Pol- In this paper, we will treat the syntax and compo- lard and Sag, 1994) for Japanese. sitional semantics of sortal and mensural classifiers. The grammar of Japanese in question is JACY,1 However, we believe that our general analysis can

1http://www.dfki.uni-sb.de/ siegel/grammar- be extended to treat the full range of classifiers in download/JACY-grammar.html Japanese and similar languages. 3 Data: Constructions NumClPs can be modified by elements such as yaku ‘approximately’ (before the number name) or mo Internally, Japanese numeral classifier expressions ‘even’ (after the floated numeral classifiers). consist of a number name followed by a numeral The above examples illustrate the contexts with a classifier (1a,b,c). In this, they resemble date ex- sortal numeral classifier, but mensural numeral clas- pressions (1d).2 sifiers can also appear both as modifiers (3a) and as (1) a. juu mai b. juu en NPs in their own right (3b): 10 NumCl 10 yen (3) a. ni kiro no ringo wo katta c. juu kagetsu d. juu gatsu 2 NumCl (kg) GEN apple ACC bought 10 month 10 month ‘(I) bought two kilograms of apples.’ ‘10 months’ ‘October’ b. ni kiro wo katta 2 NumCl (kg) ACC bought In fact, both numeral classifiers and date expressions ‘(I) bought two kilograms.’ are tagged as numeral classifiers by the morpho- logical analyzer ChaSen (Asahara and Matsumoto, NumClPs serving as NPs can also appear as mod- 2000). However, date expressions do not have the ifiers of other nouns: same combinatoric potential (syntactic or semantic) as numeral classifiers. We thus give date expressions (4) a. san nin no deai wa 80 nen haru a distinct analysis, which we will not describe here. 3 NumCl GEN meeting TOP 80 year spring Externally, numeral classifier phrases (NumClPs) ‘The three’s meeting was in the spring of appear in at least four different contexts: alone, as ’80.’ anaphoric NPs (2a); preceding a head noun, linked b. ichi kiro no nedan ha hyaku en desu by the particle no (2b); immediately following a 1 kg GEN price TOP 100 yen COPULA head noun (2c); and ‘floated’, right after the asso- ‘The price of/for 1 kg is 100 yen.’ ciated noun’s case particle or right before the (2d). These constructions are distinguished prag- As a result, tokens following the syntactic pattern matically (Downing, 1996).3 of (2b) and (3a) are systematically ambiguous, al- though the non-anaphoric reading tends to be pre- (2) a. ni hiki wo kau ferred. 2 NumCl ACC raise Certain mensural classifiers can be followed by ‘(I) am raising two (small animals).’ the word han ‘half’: b. ni hiki no neko wo kau 2 NumCl GEN cat ACC raise (5) ni kiro han ‘(I) am raising two cats.’ two kg half c. neko ni hiki wo kau ‘two and a half kilograms’ cat 2 NumCl ACC raise In order to build their semantic representations com- ‘(I) am raising two cats.’ positionally, we make the numeral classifier (here, d. neko wo (ni hiki) ie de kiro) the head of the whole expression, and ni and cat ACC (2 NumCl) house LOC han its dependents. Kiro can then orchestrate the se- (ni hiki) kau mantic composition of the two dependents as well (2 NumCl) raise as the composition of the whole expression with the ‘(I) am raising two cats in my house.’ noun it modifies (see 6 below). 2Note that many of the time units are ambiguous with date Although they aren’t tagged as numeral classi- expressions, although some, like the one for months shown in fiers by ChaSen, we extended our analysis of mensu- (1), are distinguished. ral classifiers to certain elements that appear before 3Downing also notes NumClPs following the head noun with an intervening no. As this rare construction did not appear numbers, namely currency symbols (such as $), and in our data, we have not incorporated it into our account. prefixes like No. ‘number’ in (6). (6) kouza No. 1234 gou al., 2001). Abstracting away from handle con- account number 1234 number straints,4 illocutionary force, tense/aspect, and the ‘account number 1234’ unexpressed , the representation we build for (2b,c) is as in (8).5 Finally, we found that number names can some- times occur without numeral classifiers, either as (8) cat n rel(x), udef rel(x), card rel(x,“2”), modifiers of nouns or as anaphora: raise v rel(z,x)

(7) (kouza) 1234 wo tojitai This can be read as follows: A relation of raising

¡ ¡ (account) 1234 ACC close.volitional holds between (the unexpressed subject), and . ‘(I) want to close (account) 1234.’ denotes a cat entity, and is bound by an underspeci-

fied quantifier (as there is no explicit ). ¡ Due to space considerations, we won’t describe our is also an argument of a card rel (short for ‘cardi- analysis of such bare number names here. nal relation’), whose other argument is the constant 4 Data: Distribution value 2, meaning that there are in fact two cats referred to.6 We used ChaSen to segment and tag 10,000 para- For anaphoric numeral classifiers, the representa- graphs of the Mainichi Shinbun 2002 corpus. Of tion contains an underspecified noun relation, to be the resulting 490,202 words, 11,515 (2.35%) were resolved in further processing to a specific relation: tagged as numeral classifiers. 4,543 of those were (9) noun relation(x), udef rel(x), card rel(x,“2”), potentially time/date expressions, leaving 6,972 nu- raise v rel(z,x) meral classifiers, or 1.42% of the words. 203 ortho- graphically distinct numeral classifiers occur in the Mensural classifiers have somewhat more elab- corpus. The most frequent is nin (the numeral clas- orated semantic representations, which we treat as sifier for people) which occurs 1,675 times. similar to English measure NPs (Flickinger and We sampled 100 sentences tagged as containing Bond, 2003). On this analysis, the NumClP de-

numeral classifiers to examine the distribution of the notes the extent of some dimension or property constructions outlined in 3. These sentences con- of the modified N. This dimension or property is tained a total of 159 numeral classifier phrases and represented with an underspecified relation (un- the vast majority (128) were stand-alone NPs. This spec adj rel), and a degree rel relates the mea- contrasts with Downing’s (1996) study of 500 exam- sured amount to the underspecified rela- ples from modern works of fiction and spoken texts, tion.7 The underspecified adjective relation modi- where most of the occurrences are not anaphoric. fies the N in the usual way. This is illustrated in Furthermore, while our sample contains no exam- (10), which is the semantic representation assigned ples of the floated variety, Downing’s contains 96. to (3a).8 The discrepancy probably arises because Downing 4The potentially underspecified MRS representation of only included sortal numeral classifiers, and not any scope. other type. Another possible contributing factor is 5By convention, the predicate names for lexically con- the effect of genre. In future work we hope to study tributed relations reflect the orthography of the lexical items that introduce them. In this paper, we are using English translations the distribution of both the types of classifiers and of the predicate names for expository convenience. the constructions involving them in the Hinoki tree- 6We take it as implicit in this representation that uncount- bank (Bond et al., 2004). able nouns are individuated when they appear as arguments of a card rel. 7For clarity, we show the relation between the degree rel 5 Semantic Representations and the measure phrase by giving the index of the measure phrase a role in the degree rel. In the current implementation, One of our main goals in implementing a syntac- however, this relationship is represented with identity of han- tic analysis of numeral classifiers is to composition- dles (see (19)). ally construct semantic representations, and in par- 8The relationship between the degree rel and the un- spec adj rel is not entirely apparent in this abbreviated nota- ticular, Minimal Recursion Semantics (MRS) rep- tion. The first argument of the degree rel is in fact the predicate resentations (Copestake et al., 2003; Copestake et name of the unspec adj rel, and not the whole relation. numeral-classifier

obj-only- spr-obj- spr-only- mensural- individuating- anymod- noun-mod- num-cl-lex num-cl-lex num-cl-lex num-cl-lex num-cl-lex num-cl-lex num-cl-lex

num-cl-spr- num-cl-obj- num-cl-spr- num-cl-spr- num-cl-spr- only-meas-lex only-meas-lex obj-meas-lex only-ind-lex only-ind-nmod-lex en kiro $ nin ban

Figure 1: Type hierarchy under numeral-classifier

(10) kilogram n rel(x), udef rel(x), card rel(x,“2”), further composition, the INDEX and LTOP (local degree rel(unspec adj rel, x), unspec adj rel(y), top handle) of the modified element with the nu- apple n rel(y), udef rel(y), buy v rel(z,y) meral classifier’s own INDEX and LTOP, as these When mensural NumClPs are used anaphorically are intersective modifiers (Bender et al., 2002). The (3b), the element modified by the unspec adj rel constraints on the type num-cl head (not shown is an underspecified noun relation, analogously to here) ensure that numeral classifiers can modify only the case of sortal NumClPs used anaphorically: saturated NPs or PPs (i.e., NPs marked with a case postposition wo or ga), and that they only combine (11) kilogram n rel(x), udef rel(x), card rel(x,“2”), 9 degree rel(unspec adj rel, x), unspec adj rel(y), via intersective head-modifier rules.

noun relation(y), udef rel(y), buy v rel(z,y) (12) numeral-classifier :=

¡

¡

¡ num-cl head

6 Implementing an Analysis ¡

¡

¡

¦ 

...CAT.HEAD ¢ ...INDEX ¡

Our analysis consists of: (1) a lexical type hi- MOD £¥¤

¨ © 

...LTOP §

erarchy cross-classifying numeral classifiers along

¢ 

INDEX ¦ three dimensions (Fig 1), (2) a special lexical en-

...CONT.HOOK ¤

¨ try for no for linking NumClPs with nouns, (3) a LTOP § unary-branching phrase structure rules for promot- The constraints on the types spr-only-num-cl-lex, ing NumClPs to nominal constituents. obj-only-num-cl-lex and spr-obj-num-cl-lex account 6.1 Lexical types for the position of the numeral classifier with re- spect to the number name and for the potential pres- Fig 1 shows the lexical types for numeral classi- ence of han. Both the number name (a phrase of fiers, which are cross-classified along three dimen- head type int head) and han (given the distinguished sions: semantic relationship to the modified noun head value han head) are treated as dependents of (individuating or mensural), modificational possibil- the numeral classifier expression, but variously as ities (NPs or PPs: anymod/NPs: noun-mod), and re- specifiers or complements according to the type. In lationship to the number name (number name pre- the JACY grammar, specifiers immediately precede cedes: spr-only, number name precedes but may their heads, while complements are not required to take han: spr-obj, number name follows: obj-only). do so and can even follow their heads (in rare cases). Not all the possibilities in this space are instanti- Given all this, in the ordinary case (spr-only-num- ated (e.g., we have found no sortal classifiers which cl-lex), we treat the number name as the specifier of can take han), but we leave open the possibility that the numeral classifier. The other two cases involve we may find in future work examples that fill in the numeral classifiers taking complements: with no range of possibilities. specifier, in the case of pre-number unit expressions The constraint in (12) ensures that all numeral like the symbol $ (obj-only-num-cl-lex) and both a classifiers have the head type num-cl head, as re- quired by the unary phrase structure rule discussed 9Here and throughout, we have suppressed certain details

of the feature structures and abbreviated feature paths. Angle

in 6.4 below. Furthermore, it identifies two key  brackets with exclamation points inside (  ) indicate differ- pieces of semantic information made available for ence lists, used to enable list appends in unification. number-name specifier and the complement han in 6.2). However, numeral classifiers can appear af- the case of unit expressions appearing with han (spr- ter the noun (2c), modifying it directly. Some nu- obj-num-cl-lex).10 Finally, the type spr-obj-num-cl- meral classifiers can also ‘float’ outside the NP, lex does some semantic work as well, providing the either immediately after the case postposition or

plus rel which relates the value of the number name to the position before the verb (2d).11 While we ¡ to the “ ” contributed by han, and identifying the leave the latter kind of float to future work (see ARG1 of the plus rel with the XARG the SPR and 7), we handle the former by allowing most nu- COMPS so that they will all share an index argu- meral classifiers to appear as post-head modifiers of ment (eventually the index of the modified noun for PPs. Thus noun-mod-num-cl-lex further constrains sortal classifiers and of the measure noun relation the HEAD value of the element on the MOD list for mensural classifiers). The constraints which im- to be noun head, but anymod-num-cl-lex leaves it plement these aspects of our analysis are sketched in as inherited (noun-or-case-p head). This type does, (13)–(15). however, constrain the modifier to show up after the

(13) spr-only-num-cl-lex := head ([POSTHEAD right]), and further constrains

¡ ¡

the modified head to be [NUCL nucl plus], in order SUBJ null to rule out vacuous attachment ambiguities between

OBJ null

¢   ¢ ...VAL

¢ numeral classifiers attaching to the right and other

  SPR ...CAT.HEAD int head £ modifiers appearing to the left of the NP.

(14) obj-only-num-cl-lex := (16) noun-mod-num-cl-lex :=

¡ ¡

¥

SUBJ null ¢

£

§¦

...MOD  ...HEAD noun head

¢

£

¢  

¢ ...VAL OBJ ...CAT.HEAD int head 

SPR null  (17) anymod-num-cl-lex :=

¨ ¨

¢

£  MOD LOCAL.NUCL nucl plus 

(15) spr-obj-num-cl-lex := ...HEAD

¡

¡

POSTHEAD right © © ¡

¡ SUBJ null

¡

¡

¡

¡

¡ ¡

¡ ...CAT.HEAD han head ¡

¡ The final dimension of the classification captures



¡

¡

¦ 

OBJ ¢ LTOP

¡ ¡

...CONT.HOOK ¤ the semantic differences between sortal and mensu-

¡

¡

¨ §

XARG 

¡

...VAL ¡ ral numeral classifiers. The sortal numeral classifiers

¡

¡

¡

¡ ...CAT.HEAD int head 12

contribute no semantic content of their own. They

¡

¡

¤

¢ 

¡ SPR LTOP 

¢ are therefore constrained to have empty RELS and

¤

¡ ...CONT.HOOK

¨

§

XARG

¡ HCONS lists:

¡

¡

¡

¡

¡ plus-relation (18) individuating-num-cl-lex :=

¨

ARG1 §

  

...RELS  ! ! RELS ! !

¢ 

¤

¢ 

TERM1 ¤ ...CONT



 ¨

HCONS  ! !

©

TERM2 ¦

In the second dimension of the cross- In contrast, mensural numeral classifiers con- classification, anymod-num-cl-lex and noun-mod- tribute quite a bit of semantic information, and there- num-cl-lex constrain what the numeral classifier fore have quite rich RELS and HCONS values. As may modify, via the MOD value. shown in (19), the noun-relation is identified with When numeral classifiers appear before the head the lexical key relation value (LKEYS.KEYREL) so noun, they are linked to it with no, which medi- 11Those that can’t include expressions like gou in (i), cf. (ii): ates the modifier-modifiee relationship (see (2) and (i) kouza 1234 gou wo tojitai account 1234 number ACC close.volitional 10Because numeral classifiers are analyzed as taking post- ‘(I) want to close account number 1234.’ head complements in these two cases, the head type num- (ii) *kouza wo 1234 gou tojitai cl head is a subtype of init-head, which contrasts with final- 12The individuating function they serve we take to be implicit head. These types are used by the head-complement rules to in the linkage they provide between the card rel and the noun determine the order of the head and complements. relation. See note 6. that specific lexical entries of this type can easily on the MOD list of no shares its local top handle further specify it (e.g., kiro constraints its PRED to and index with the element on the MOD list of the be kilogram n rel). The type also makes reference NumClP (i.e., that no effectively inherits its comple- to the HOOK value so that the INDEX and LTOP ment’s MOD possibility). Even though (most) nu- (also the INDEX and LTOP of the modified noun, meral classifiers can either modify NPs or PPs, all see (12)) can be identified with the appropriate val- entries for no are independently constrained to only ues inside the RELS list. The length of the RELS list modify NPs, and only as pre-head modifiers. is left unbounded, because some mensural classifiers

also inherit from spr-obj-num-cl-lex, and therefore (20) nmod-numcl-p-lex :=

¡

¡

¡

¡ must be able to add the plus rel to the list. ¡

¡ num-cl head

¡

¡

£

¦ ¢

...COMPS ...HEAD 

¡ ...INDEX ¢ mensural-num-cl-lex := 

(19) ©

£ ¤ 

¡ MOD



¡

§ 

¨ ©

...LTOP

¡

¡ ¦

...LKEYS.KEYREL

¡

¡

¡

¡

¡

¨

¡

¡ ¡

¡ quant-relation

¡ ¦

¡ INDEX

¡

¡



¢ 

£ ¤

¡ §  ARG0 ...HEAD.MOD ...HOOK

¡ ! ,

§ ¨ ©

¡ LTOP

©

¡

¡

¤

RSTR

¡

¡

¡

¢ 

¡

 

¡ RELS ! !

¡ noun-relation

¤

¡ CONT

¡

  ¨

¢ 

HCONS ! !

¡

¦

¡ LBL ,

¡

¡

¡ §

ARG0

¡

¡

¡

¡

¡ RELS 6.3 Examples: NumClPs as Modifiers

degree-relation

¡

¡

¡

¢ 

¡

LBL ,

¡

¡ We illustrate our analysis with sample derivations,

¡

¡

DARG ¡

¡

¡

¡

...CONT ¡ displayed as trees with (abbreviated) rule names and

¡

¡

¡

¡ arg1-relation

¡ lexical types on the nodes. (21) corresponds to (2b),

¡

¡ ¢

LBL

¡

¡ 

, ... !

¡ (22) to (2c), and (23) to a shortened (2d).

¢ 

¡

¡

PRED unspec adj rel

¡



¡

¡

£

¡ ARG1

¡

¡

¡ (21) utterance-rule-decl-finite

¡

¡

¡ qeq

¡

¡

¢ 

¡

¤

 

¡ HCONS ! HARG !

head-comp

¡

LARG

head-comp V

¢ 

£



¢ INDEX

¤

HOOK

¢ ¨ LTOP head-adj-final-intersect P kau The types in the bottom part of the hierarchy in head-comp NP wo Fig 1 join the dimensions of classification. They also head-spr nmod-numcl-p-lex do a little semantic work, making the INDEX and neko LTOP of the modified noun available to their num- card-lex num-cl-spr-only-ind-lex no ber name argument, and, in the case of subtypes of mensural-num-cl-lex, they constrain the final length ni hiki of the RELS list, as appropriate. (22) utterance-rule-decl-finite

6.2 The linker no head-comp We posit a special lexical entry for no which me- head-comp V diates the relationship between NumClPs and the nouns they modify. In addition to the constraints that head-adj-first-intersect P it shares with other entries for no and other modifier- kau heading postpositions, this special no is subject to NP head-spr wo the constraints shown in (20). These specify that card-lex num-cl-spr-only-ind-lex no makes no semantic contribution, that it takes neko a NumClP as a complement, and that the element ni hiki (23) utterance-rule-decl-finite (25) utterance-rule-decl-finite

head-comp head-comp

head-adj-first-intersect V head-comp V

head-comp head-spr quantify-n-rule P kau kau NP P card-lex num-cl-spr-only-ind-lex nominal-numcl-rule wo

wo ni hiki head-spr neko card-lex num-cl-spr-only-ind-lex 6.4 Unary-branching phrase structure rule ni hiki We treat NumClPs serving as nominal constituents by means of an exocentric unary-branching rule.13 7 Future Work This rule specifies that the mother is a noun subcate- We have not yet implemented an analysis of pre- gorized for a determiner specifier (these constraints verbal floated NumClPs, but we sketch one here. are expressed on noun sc), while the daughter is The key is that NumClPs are treated as simple modi- a numeral classifier phrase whose valence is satu- fiers, not quantifiers. Therefore, they can attach syn- rated. Furthermore, it contributes (via its C-CONT, tactically to the verb, but semantically to one of its or constructional content feature) an underspecified arguments. In our HPSG analysis, the verb will have noun-relation which serves as the thing (semanti- unsaturated valence features, making the indices of cally) modified by the numeral classifier phrase. The its arguments ‘visible’ to any modifiers attaching to reentrancies required to represent this modification it. are implemented via the LTOP and INDEX features. There appear to be constraints on which argu- ments can ‘launch’ floated quantifiers, although their

(24) nominal-numcl-rule-type := ¡ exact nature is as yet unclear. Proposals include:

¡ HEAD ordinary noun head ¡

...CAT ¤ only nominals marked with the case particles ga or ¡

VAL noun sc ¨

¡

¡

¡ wo (Shibatani, 1978), only subjects or direct ob-

¡

 ¡

LTOP ¦

¡

¡

HOOK ¤ jects (Inoue, 1978), or c-command-based constraints

¡

¡ ¨

INDEX §

¡

¡

(Miyagawa, 1989). While there are exceptions to all

¡ 

C-CONT

¡ noun-relation

¡ of these generalizations, Downing (1996) notes that

¢ 

¢ 

¦

 

¡ RELS ! LBL !

¡ the vast majority of actually occurring cases satisfy

§

¡ ARG0

¡

¡ all of them, and further that it is primarily intransi-

¡

¡

¡ HEAD num-cl head

¡ tive subjects which participate in the construction.

¡

...CAT ¤

VAL saturated ¨

These observations will help considerably in re-

ARGS £

©



¢ 

¦



¢ LTOP ducing the ambiguity inherent in introducing an

¤

...CONT.HOOK

§ ¨ INDEX analysis of floated NumClPs. We could constrain floated NumClPs to only modify intransitive verbs This rule works for both sortal and mensural (semantically modifying the subject) or transitive NumClPs, as both are expecting to modify a noun. verbs (semantically modifying the object). Some ambiguity will remain, however, as the pre-verbal 6.5 Examples: NumClPs as Nouns and post-nominal positions often coincide. Again, we illustrate the interaction of these various Also missing from our analysis are the sortal con- constraints with an example derivation (25) for (2a). straints imposed by classifiers on the nouns they modify. In future work, we hope to merge this analy- 13In the analysis of number names used as NumClPs, we sis with an implementation of the sortal constraints, posit a second unary-branching rule. The mother of that rule (a NumClP) can then serve as the daughter of the rule discussed such as that of Bond and Paik (2000) . We be- here. lieve that such a merger would be extremely use- ful: First, the sortal constraints could be used to nar- Francis Bond and Kyoung-Hee Paik. 2000. Reusing an row down the possible of anaphoric uses of to generate numeral classifiers. In Coling NumClPs. Second, sortal constraints could reduce 2000, Saarbruck¨ en, Germany. ambiguity in NumClP+no+N strings, whenever they Francis Bond, Sanae Fujita, Chikara Hashimoto, Kaname could rule out the ordinary numeral classifier use, Kasahara, Shigeko Nariyama, Eric Nichols, Akira leaving the anaphoric interpretation (see (4) above). Ohtani, Takaaki Tanaka, and Shigeaki Amano. 2004. The Hinoki Treebank: A treebank for text understand- Third, sortal constraints will be crucial in generation ing. In IJC-NLP-2004. (Bond and Paik, 2000). Without them, we would propose an additional string for each sortal classifier Ann Copestake, Alex Lascarides, and Dan Flickinger. whenever a card rel appears in the input semantics, 2001. An algebra for semantic construction in constraint-based . In ACL 2001, Toulouse, most of which would in fact be unacceptable. Imple- France. menting sortal constraints could be simpler for gen- eration than for parsing, since we wouldn’t need to Ann Copestake, Daniel P. Flickinger, Ivan A. Sag, and Carl Pollard. 2003. Minimal Recursion Semantics. deal with varying inventories or metaphorical exten- An introduction. Under review. sions. Pamela Downing. 1996. Numeral Classi®er Systems: 8 Conclusion The Case of Japanese. John Benjamins, Philadelphia. Precision grammars require compositional seman- Dan Flickinger and Francis Bond. 2003. A two-rule analysis of measure noun phrases. In Stefan Muller¨ , tics. We have described an approach to the syntax editor, Proceedings of the 10th International Con- of Japanese numeral classifiers which allows us to ference on Head-Driven Phrase Structure Grammar, build semantic representations for strings contain- pages 111–121, Stanford CA. CSLI Publications. ing these prevalent elements – representations suit- Kazuko Inoue. 1978. Nihongo no Bunpou Housoku. able for applications requiring natural language un- Tasishuukan, Tokyo. derstanding, such as (semantic) machine translation and automated email response. Yo Matsumoto. 1993. Japanese numeral classifiers: A study of semantic categories and lexical organization. Acknowledgements Linguistics, 31:667–713. Shigeru Miyagawa. 1989. Structure and Case Marking This research was carried out as part a joint R&D in Japanese. Academic Press, New York. effort between YY Technologies and DFKI, and we are grateful to both for the opportunity. We would Kyounghee Paik and Francis Bond. 2002. Spatial repre- also like to thank Francis Bond, Dan Flickinger, sentation and shape classifiers in Japanese and Korean. In David I. Beaver, Luis D. Casillas Mart´ınez, Brady Z. Stephan Oepen, Atsuko Shimada and Tim Baldwin Clark, and Stefan Kaufmann, editors, The Construc- for helpful feedback in the process of developing tion of Meaning, pages 163–180. CSLI Publications, and implementing this analysis and Setsuko Shirai Stanford CA. for grammaticality judgments. This research was Carl Pollard and Ivan A. Sag. 1994. Head-Driven Phrase partly supported by the EU project DeepThought Structure Grammar. U of Chicago Press, Chicago. IST-2001-37836. Masayoshi Shibatani. 1978. Nihongo no Bunseki. Tasishuukan, Tokyo. References Melanie Siegel and Emily M. Bender. 2002. Efficient Masayuki Asahara and Yuji Matsumoto. 2000. Extended deep processing of Japanese. In Proceedings of the models and tools for high-performance part-of-speech 3rd Workshop on Asian Language Resources and Stan- tagger. In Coling 2000, Saarbruck¨ en, Germany. dardization, Coling 2002, Taipei. Emily M. Bender, Dan Flickinger, and Stephan Oepen. Melanie Siegel. 2000. HPSG analysis of Japanese. In 2002. The Grammar Matrix: An open-source starter- Wolfgang Wahlster, editor, Verbmobil: Foundations of kit for the rapid development of cross-linguistically Speech-to-Speech Translation. Springer, Berlin. consistent broad-coverage precision grammars. In Proceedings of the Workshop on Grammar Engineer- ing and Evaluation, Coling 2002, pages 8–14, Taipei.