Representing Honorifics Via Individual Constraints


Sanghoun Song
Division of Linguistics and Multilingual Studies
Nanyang Technological University
Singapore

Abstract

Within the context of grammar engineering, modelling honorifics has been regarded as one of the components for improving machine translation and anaphora resolution. Using the HPSG and MRS framework, this paper provides a computational model of honorifics. The present study incorporates the honorific information into the meaning representation system via Individual Constraints with an eye toward semantics-based processing.

1 Introduction

Honorific forms express the speaker’s social attitude to others and also indicate the social ranks of [...] language in a socially correct way, they have been studied in computational linguistics as well as theories of grammar. Particularly, using the honorific information improves anaphora resolution, and helps machine translation systems provide more natural-seeming output sentences (Mima et al., 1997; Siegel, 2000; Nariyama et al., 2005).

This paper provides a way of modelling honorifics within the formalism of grammar-based language processing. Building upon Head-driven Phrase Structure Grammar (Pollard and Sag, 1994, HPSG) and Minimal Recursion Semantics (Copestake et al., 2005, MRS), the present study suggests using Individual CONStraints (henceforth, ICONS) for representing honorifics from the perspective of multilingual processing.

This paper is structured as follows: Section 2 presents some background knowledge of the current study. Section 3 proposes using Individual Constraints for modelling the honorific system. Building upon the specification, Section 4 shows how honorific expressions can be translated across different honorific types of languages. Section 5 reports a small experiment to see if the current model contributes to semantics-based processing.

2 Background

2.1 Forms of Expressing Honorifics

A cross-linguistic survey reveals that there are three ways of expressing honorifics (Agha, 1994; Ide, 2005): (i) pronouns, (ii) inflection, and (iii) suppletives. Different languages use a different range of honorific systems, but it appears that there exists a hierarchy in the system of honorification, as presented in Table 1. Note that some languages (e.g. English) use no honorific forms.

Table 1: Honorification hierarchy
no forms < pronouns [...]
[...] Hindi, ... Japanese, Korean, ...

The most widespread linguistic phenomenon regarding honorific expressions can be found in the taxonomy of personal pronouns. In many languages, personal pronouns (particularly, second pronouns) are dualized, viz. ordinary (a.k.a. informal) forms and honorific (a.k.a. formal) forms. For example, Chinese employs two second personal pronouns: 你 nǐ and 您 nín. Both sentences provided in (1) convey a meaning like “What is your name?” in English.

(1) a. 你/您叫什么名字？
       nǐ/nín jiào shénme míngzi ?
       2.SG be.called what name PU
    b. #你/您贵姓？
       nǐ/nín guì xìng ?
       2.SG noble last.name PU [cmn]

(1a) is a plain way to ask someone’s name, in which both pronouns can be felicitously used. In contrast, (1b) is a way of asking in a courteous manner, in which the use of 你 nǐ is inappropriate.

Proceedings of the Grammar Engineering Across Frameworks (GEAF) Workshop, 53rd Annual Meeting of the ACL and 7th IJCNLP, pages 57–64, Beijing, China, July 26-31, 2015. © 2015 Association for Computational Linguistics

That is to say, the predicate in (1b) 贵姓 guìxìng is a marked expression in terms of honorification.

Some languages employ a more complicated honorific system. In Japanese, Korean, Javanese, Hindi, and some other languages, the inflectional paradigm is conditioned by the honorific relations between dialogue participants (Siegel, 2000; Ohtake and Yamamoto, 2001; Kim and Sells, 2007). For instance, in Japanese and Korean, if the subject is in the honorific form, the predicate is preferred to be in the honorific form, as exemplified in (2). Note that 先生 sensei ‘teacher’ is an honorific word, and the verbal form o+STEM+ni naru is used to signify honor to the subject.

(2) 先生は本をお読みになりました
    sensei wa hon o o-yomi ni nari mashi-ta
    teacher TOP book ACC HON-read become HON-PST
    ‘The teacher read a book.’ [jpn] (Dalrymple, 2001, p. 18)

Other elements can also be marked with respect to honorification. When non-subjects (e.g. objects and obliques) are honored, the canonical verbal form in Japanese is o+STEM+suru. When the speaker wants to express honor to the hearer in Japanese, the verbal ending masu is used, as shown in the last word of (2). On the other hand, the nominal inflectional system is also influenced by honorifics, as exemplified in (3).

(3) a. お元気ですか
       o-genki desu ka
       HON-good.health COP QUES
       ‘How are you (honored)?’
    b. 私は元気です
       watashi wa genki desu
       1.SG TOP good.health COP
       ‘I am fine.’ [jpn]

In (3a), the addressee is presumed to be an honored person, and thereby the prefix o- canonically co-occurs with genki ‘good.health’. In contrast, (3b) explains the speaker’s own status, and thereby the honorific marker o- does not appear.

Some languages, such as Korean and Japanese, make ample use of suppletive forms of honorification. For example, Japanese has three verbs for ‘eat’: 食べる taberu (neutral), 召し上がる meshiagaru ‘(An honoree) eats’, and 頂く itadaku ‘(An honorer) eats’. These suppletive forms can often be used with the inflectional forms mentioned before; for example, お-召し上がり-になる o-meshiagari-ni naru ‘(Someone highly respected) eats’ (Nariyama et al., 2005, p. 93-94). Therefore, a single expression can involve all three types of honorification, as shown in (4).

(4) 책을 드리셨습니다.
    chayk-ul tuli-si-ess-supni-ta
    book-ACC give(HON)-HON-PST-HON-DECL
    ‘(An honoree) gave a book (to another honoree).’ (The hearer is also an honoree.) [kor]

The verb in (4) contains three honorific forms for the object, the subject, and the addressee. The first one is the lexeme 드리- tuli-, a suppletive counterpart of 주- cwu- ‘give’. This verb implies that the receiver is respectable. The second one is the suffix -si-, which indicates that the subject is an honoree. The third one is the ending suffix -supni-, which indicates that the speaker expresses respect to the hearer.

There are also nominal suppletive forms. The different lexical items that denote the same referent sometimes indicate the relative degree of familiarity to the referent. For example, kinship terms in Japanese vary depending on the relationship between the speaker and the referent: When talking about the speaker’s own grandfather with others in a modest attitude, 祖父 sofu is normally used. When either denoting the other’s grandfather or addressing the speaker’s own grandfather in a friendly and informal way, お爺さん o-jii-san is normally used. This contrast shows that o-jii-san lexically involves honorific information, whereas sofu is neutral. On the other hand, because o-jii-san can be used to denote both the other’s grandfather and the speaker’s own grandfather, the honorific information has to be flexibly represented so as to cover the two potential relations.

In addition to the forms discussed hitherto, some particular constructions, such as passives and interrogatives, can serve to express honorification. However, the meaning is just pragmatically conveyed in this case. Such a construction is not a necessary condition but a sufficient condition for expressing honorifics. Not all passive sentences in Japanese necessarily involve an honorific relation. In contrast, if the o+STEM+ni naru form in Japanese is used, then the subject is presumed to be an honoree. Since the current work is exclusively concerned with honorific forms, these constructions are out of the scope of this paper.

2.2 Motivations

Honorifics have often been regarded as agreement phenomena just as the subject-predicate agreement in many European languages (Boeckx, 2006;

Kim et al., 2006). However, there is an opposing view to this (Choe, 2004; Bobaljik and Yatsushiro, 2006; Kim and Sells, 2007). One counterexample is provided in (5).

(5) 김선생님이 오(시)었다
    Kim-sensayng-nim-i o-(si)-ess-ta
    Kim-teacher-HON-NOM come-(HON)-PST-DECL
    ‘Teacher Kim came.’ [kor] (Choe, 2004, p. 546)

The subject of (5) contains an honorific form 님 nim, but the predicate only optionally takes the honorific marker 시 -si-, though the verb with the honorific marker sounds more natural. Along this line, the current study does not constrain honorification as a way of agreement.

There are also a couple of reasons for not following honorification-as-agreement. These reasons make it necessary to model honorifics as flexibly as possible.

First, honorification is a matter of tendency rather than restriction. Notice that tendency and restriction are not on a par with each other in grammar engineering. Corpus data provide more than a few cases in which a mismatch of honorific forms happens, as exemplified in (5). Grammar engineering systems must work robustly for even less frequent items if the forms appear in naturally occurring texts and unless they critically violate the principle of human language.

Second, honorification is a matter of acceptability rather than grammaticality. Acceptability is primarily concerned with appropriateness, whereas grammaticality concerns conformity to the linguistic rules mostly provided by linguists. Thus, acceptability distinguishes not grammatical and ungrammatical sentences, but felicitous and infelicitous ones. In a similar vein, Zaenen et al. (2004) argue that animacy is mainly relevant to acceptability: For instance, the choice between the Saxon genitive and the of-genitive in English is sensitive to animacy, but the difference has more to do with felicity. The same goes for honorification. The choice of honorific forms leads to a difference in acceptability which forms a continuous spectrum.

3 Individual Constraints on Honorifics

Minimal Recursion Semantics is the formalism employed to compute semantic compositionality in the present work. In addition, the current work employs ICONS (Individual CONStraints) in order to incorporate discourse-related phenomena into semantic representation of human language sentences. The representation method used in the present study (i.e. MRS+ICONS) has to do with not only semantic information incrementally gathered up to the parse tree, but also other components required to be accessed in the process of cross-lingual processing. MRS+ICONS enables us to model several discourse-related items within an intrasentential system (i.e. sentence-based processing). Notice that there exist several discourse-related items that can be at least partially resolved without seeing adjacent sentences. This can be conceptualized in the format of Dependency MRS (Copestake, 2009), as exemplified in (6).

(6) a. John likes himself.    [x1 eq x2]
       x1         x2
    b. John likes him.        [x1 neq x2]
       x1         x2

Himself in (6a) equals the subject John, while him in (6b) does not. The notation in the bracket in each example indicates the relationship between two individuals: equal and non-equal. That is to say, anaphora can be partially identified within an intrasentential domain via such a binary relation. There are some other phenomena that require contextual information in theory but can be partially resolved in practice in a way similar to (6), and honorification is one of them.

The current work represents honorifics as a binary relation between two individual elements. A set of honorific information is stored into a bag of constraints, and the value is only partially specified unless there is a clue to identify the honorific relation within the intrasentential context.

3.1 Comparison to Previous Approaches

On the one hand, MRS+ICONS makes honorification (basically pragmatic information) visible in semantic representation with an eye toward semantics-based language processing. In the previous HPSG-based studies, honorifics are treated as a typed feature structure under CTXT (ConTeXT). This local structure includes C-INDICES whose components are SPEAKER and ADDRESSEE (Siegel, 2000; Kim et al., 2006). In the LFG-based studies, honorification is regarded as an F-structure, given that it is one of the reliable tests to diagnose subjecthood (Dalrymple, 2001). Outside the scope of grammar-based deep processing, several studies make use of shallow processing techniques, such as POS-based pattern matching rules and regular expressions, for paraphrasing honorific expressions (Ohtake and Yamamoto, 2001, among others). In sum, no previous approach represents honorifics in a (near) logical form. In semantics-based processing, all components that have a part in transfer and generation must be accessible in the semantic representation.

On the other hand, the current model provides computational flexibility for handling honorification. Many previous studies on honorification employ a syntactic and/or semantic feature [HON bool] (Kim et al., 2006, among others). However, this feature is sometimes misleading for computational processing of honorifics for three reasons.

First, there exist more than a few mismatches between honorific forms in real texts written in Korean and Japanese (i.e. no honorification-as-agreement (Choe, 2004)). [HON bool] is too restrictive to analyze rather infelicitous but acceptable honorific expressions (§2.2). For example, the two types of second personal pronouns in Chinese are interchangeable in many cases as provided in (1a), and the use of the informal pronoun 你 nǐ in (1b) merely results in infelicity (not ungrammaticality). The current work deals with honorifics grounded upon the premise “parsing robustly, generating strictly” (Bond et al., 2008). All potential honorific forms can be parsed robustly and flexibly, but the generation outputs are made strictly and felicitously. Second, [HON bool] cannot fully reflect the fact that honorifics are sometimes ambiguous and the specific meaning can be incrementally resolved up to the parse tree (Kim and Sells, 2007). For example, お爺さん o-jii-san ‘elderly man’ in Japanese can be used either informally or formally, and the choice between them depends on syntactic configuration. The current work makes use of a type hierarchy to constrain honorifics (see Figure 1), which handles the potential ambiguity and identifies the meaning through unification of structures. Third, the Boolean feature is too crude to place different types of constraints on subjects, objects, and addressees. For instance, the verb of (4) (in Korean) includes three HON glosses, and they have different honorific relations. MRS+ICONS represents honorifics as a binary relation amongst individuals, such as speaker, hearer, and referents.

3.2 Fundamentals

MRS+ICONS is structured as shown in (7). The value type is icons whose components are IARG1 and IARG2. Since ICONS stands for a binary relation between two individuals, their value type is individual (a supertype of event and ref-ind).

(7) mrs+icons
    [ HOOK   hook [ GTOP         handle
                    LTOP         handle
                    INDEX        individual
                    XARG         individual
                    ICONS-KEY    icons
                    SPEAKER-KEY  ref-ind
                    HEARER-KEY   ref-ind ]
      RELS   diff-list
      HCONS  diff-list
      ICONS  <! ..., icons [ IARG1 individual
                             IARG2 individual ], ... !> ]

On the other hand, the HOOK structure, which keeps track of the features that need to be externally visible upon semantic composition, has three additional attributes, viz. ICONS-KEY, SPEAKER-KEY, and HEARER-KEY. These features function like a pointer in the compositional construction of the semantic structure. They are required to mark the constituent analyzed as the speaker or the hearer of an utterance and deliver the information up to the parse tree. In particular, first and second personal pronouns specify this value as their own index (§3.4).

CTXT (under local) includes C-INDICES just as Jacy does (Siegel, 2000), but the names are different as presented in the following AVM. Note that the counterpart of “speaker” must be “hearer”, and that of “addressee” must be “addressor”. The value type is ref-ind, because the speaker and the hearer are also referential individuals.

(8) local
    [ CAT   cat
      CONT  mrs
      CTXT  [ ACTIVATED  bool
              PRESUP     diff-list
              C-INDICES  [ SPEAKER  ref-ind
                           HEARER   ref-ind ] ] ]

The values of SPEAKER and HEARER remain underspecified until an utterance is established. The typed feature structure of utterance is presented in (9), in which SPEAKER-KEY and HEARER-KEY under CONT (i.e. mrs) are co-indexed with SPEAKER and HEARER under CTXT. Unless the SPEAKER-KEY and the HEARER-KEY are assigned a specific value during the construction of the parse tree, the values are still left underspecified. If the value is not specified until an utterance is built up, that means that the speaker and the hearer cannot be identified within the intrasentential domain.
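To make the shape of (7) concrete, the following is a minimal sketch in Python, assuming a plain dataclass encoding. The class names Icons, Hook and Mrs and the helper participants_identified are illustrative inventions for this sketch and do not correspond to the Jacy or ACE implementation; None stands in for an underspecified pointer value.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Icons:
    """One ICONS element: a typed binary relation between two individuals."""
    relation: str  # e.g. "addressor", "addressee", "higher"
    iarg1: str     # variable of the first individual, e.g. "x4"
    iarg2: str     # variable of the second individual, e.g. "x3"


@dataclass
class Hook:
    """Externally visible pointers; None models an underspecified value (assumption)."""
    index: str
    icons_key: Optional[str] = None
    speaker_key: Optional[str] = None
    hearer_key: Optional[str] = None


@dataclass
class Mrs:
    hook: Hook
    rels: List[str] = field(default_factory=list)
    icons: List[Icons] = field(default_factory=list)


def participants_identified(mrs: Mrs) -> bool:
    """True only if both dialogue participants were resolved inside the sentence."""
    return mrs.hook.speaker_key is not None and mrs.hook.hearer_key is not None


# A sentence in which no constituent was analyzed as speaker or hearer.
unresolved = Mrs(hook=Hook(index="e2"))
# A sentence in which both pointers received a value during composition.
resolved = Mrs(hook=Hook(index="e2", speaker_key="x3", hearer_key="x4"),
               icons=[Icons("higher", "x4", "x3")])

print(participants_identified(unresolved), participants_identified(resolved))
```

The point of the design is that ICONS is an open bag of relations rather than a single Boolean feature, so any number of honorific relations can accumulate during composition while the pointers stay empty until something fills them.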

(9) utterance
    [ UTTERED  +
      CONT     [ SPEAKER-KEY [1]
                 HEARER-KEY  [2] ]
      CTXT     [ SPEAKER [1]
                 HEARER  [2] ]
      ARGS     < sat-or-frag [ UTTERED  −
                               INDEX    [3] ] >
      ICONS    <! addressor [ IARG1 [1]
                              IARG2 [3] ],
                  addressee [ IARG1 [2]
                              IARG2 [3] ] !> ]

The utterance rule syntactically forms a non-branching root node, whose daughter is either a saturated sentence or a fragment (sat-or-frag). This pseudo phrase structure rule introduces two elements into the ICONS list, as shown at the bottom of (9). They are valued as addressor (i.e. speaker) and addressee (i.e. hearer). These ICONS elements play the key role to make dialogue participants visible in semantic representation. Their IARG1s are respectively co-indexed with SPEAKER and HEARER (i.e. [1] and [2]), and the IARG2s are commonly co-indexed with the semantic head’s INDEX of the utterance (i.e. [3]). The main reason why they have a relation to the semantic head is that it is necessary to resolve the speaker/hearer scope in quotations. For example, (10) contains two different discourse frames, viz. inner frame and outer frame.

(10) “You have been cruelly used,” said Holmes.

The two different frames may have different speakers and different hearers. For instance, the speaker in the inner frame of (10) is Holmes, while that in the outer frame is the narrator of the story. In other words, (10) includes two different utterances, and each introduces its own addressee and addressor elements into the ICONS list (i.e. four ICONS elements, in total).

3.3 Type Hierarchy

Going into the details, the type hierarchy of icons for honorification is sketched out in Figure 1.

    icons
      dialogue
        addressor
        addressee
      ...
      rank
        higher-or-int
          higher
          int
        lower-or-int
          int
          lower

Figure 1: Type hierarchy of icons

Regarding honorification, icons includes two immediate subtypes: namely, dialogue and rank. The former branches out into addressor and addressee, and the latter includes two levels of subtypes. Higher-or-int indicates that one individual is socially higher than the other or intimate to the other. Recall that お爺さん o-jii-san in Japanese can be canonically used when the referent is higher than the speaker (formal) or intimate to the speaker (less formal). The word itself has the [ICONS-KEY higher-or-int] feature, which can be further constrained by the value that the predicate assigns to the word. Honorification is normally relevant to which is “higher” than which, but the linguistic forms can sometimes be altered when talking to someone in the lower position. For instance, Korean employs six levels of imperative inflections conditioned by the relationship between the speaker and the hearer. Lower-or-int and lower work for this case. Finally, note that int inherits from both higher-or-int and lower-or-int.

3.4 Specifications

First, pronouns are specified with respect to the speaker and the hearer, as shown in (11).

(11) pr-1sg                        pr-2sg-hon
     [ STEM   < “我” >             [ STEM   < “您” >
       HOOK   [ INDEX [1]            HOOK   [ INDEX [1]
                S-KEY [1] ]                   S-KEY [2]
       ICONS  <! !> ]                         H-KEY [1] ]
                                     ICONS  <! higher [ IARG1 [1]
                                                        IARG2 [2] ] !> ]

The first personal pronoun has a co-index between its own INDEX and SPEAKER-KEY, and the ICONS list is empty because it does not contribute to honorification by itself. Likewise, the second personal pronouns link their INDEX to HEARER-KEY. If the pronoun is honorific, one ICONS element is introduced. Otherwise (e.g. 你 nǐ), the ICONS list is empty. The ICONS element of the right AVM indicates that the hearer [1] is higher than the speaker [2].

Second, several inflectional rules introduce an ICONS element as exemplified in (12) for the subject-honorific form and the addressee-honorific form in Japanese. The left AVM’s ICONS element represents that the subject [1] is higher than the speaker [3]. Likewise, the right AVM’s ICONS element specifies the relation between the hearer [2] and the speaker [1].
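The way the hierarchy in Figure 1 resolves an ambiguous value such as higher-or-int can be illustrated with a small sketch. It assumes that unification of two icons types amounts to finding their unique most general common subtype; the function names ancestors, subsumes and unify are hypothetical, and the subtypes elided by “...” in Figure 1 are left out.

```python
# Immediate supertypes of each type, following Figure 1 (elided subtypes omitted).
PARENTS = {
    "icons": [],
    "dialogue": ["icons"],
    "rank": ["icons"],
    "addressor": ["dialogue"],
    "addressee": ["dialogue"],
    "higher-or-int": ["rank"],
    "lower-or-int": ["rank"],
    "higher": ["higher-or-int"],
    "int": ["higher-or-int", "lower-or-int"],  # inherits from both
    "lower": ["lower-or-int"],
}


def ancestors(t):
    """Return the set containing t and all of its supertypes."""
    seen = {t}
    stack = [t]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumes(general, specific):
    """True if `general` is `specific` itself or one of its supertypes."""
    return general in ancestors(specific)


def unify(a, b):
    """Return the unique most general common subtype of a and b, or None."""
    common = [t for t in PARENTS if subsumes(a, t) and subsumes(b, t)]
    maximal = [t for t in common
               if not any(u != t and subsumes(u, t) for u in common)]
    return maximal[0] if len(maximal) == 1 else None


# o-jii-san carries higher-or-int; an honorific predicate narrows it to higher.
print(unify("higher-or-int", "higher"))
# Two disjunctive values are compatible only through int.
print(unify("higher-or-int", "lower-or-int"))
# Incompatible values do not unify.
print(unify("higher", "lower"))
```

Under this reading, a Boolean [HON bool] would have to commit to a value immediately, whereas the hierarchy lets a lexical item stay at higher-or-int until another constituent supplies a more specific type.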

(12) ninaru                                   masu
     [ STEM   < “に”, “なる” >                [ STEM   < “ます” >
       SUBJ   < [ INDEX [1]                     S-KEY  [1]
                  I-KEY [2] ] >                 H-KEY  [2]
       S-KEY  [3]                               ICONS  <! higher [ IARG1 [2]
       ICONS  <! [2] higher [ IARG1 [1]                            IARG2 [1] ] !> ]
                              IARG2 [3] ] !> ]

Third, the suppletive forms themselves do not introduce an ICONS element, but the ICONS-KEY is specified in order to place a partial constraint on polarity. This pointer value functions similarly to [HON bool], but operates more flexibly (§3.1). They are instantiated in (13). Note that お休み oyasumi is a suppletive counterpart of 寝る neru ‘sleep’ in Japanese.

(13) a. sofu                         neru
        [ STEM   < “祖父” >          [ STEM   < “寝る” >
          ICONS  <! !> ]               ICONS  <! !> ]
     b. ojiisan                      oyasumi
        [ STEM   < “お爺さん” >      [ STEM   < “お休み” >
          I-KEY  higher-or-int         I-KEY  higher
          ICONS  <! !> ]               ICONS  <! !> ]

While the neutral forms provided in (13a) are underspecified, the honorific forms in (13b) place a constraint on ICONS-KEY. Notably, お爺さん o-jii-san ‘elderly man’ in Japanese assigns higher-or-int to the ICONS-KEY, covering the ambiguity.

3.5 Sample Representation

The example sentence is illustrated in (14). Note that the nominative marker が ga and the two verbal ending forms are semantically empty.

(14) お爺さんがお休みになります
     o-jiisan ga o-yasumi ni nari masu
     HON-elderly.man NOM HON-sleep become HON
     ‘The elderly man is sleeping.’ (to an honoree) [jpn]

Therefore, only the underlined elements (ojiisan and oyasumi) are left in the semantic representation as shown in (15a). In addition, there are two invisible elements as provided in (15b), namely the speaker and the hearer. Recall that MRS+ICONS includes these invisible referential individuals into the semantics. These four individuals have four relations as presented in (15c), and they are added into the ICONS list.

(15) a. ojiisan  oyasumi
        x1       e2
     b. speaker: x3, hearer: x4
     c. [x3 addressor e2] (utterance)
        [x4 addressee e2] (utterance)
        [x4 higher x3] (ます masu)
        [x1 higher x3] (になる ni naru)

Each relation given in (15c) is in the format [α X β], which is read as “α has an X relation to β”. For instance, [x1 higher x3] means that x1 (i.e. the subject) is higher than x3 (i.e. the speaker). The first two relations are introduced when the utterance is built up (see (9)). The last two relations come from the verbal ending forms (see (12)). The MRS representation for (14) is provided in (16), in which IARG1 and IARG2 respectively correspond to α and β in the [α X β] format.

(16) [ INDEX  [2]
       RELS   < udef_q_rel [ LBL [4], ARG0 [3], RSTR [5], BODY [6] ],
                ojiisan_n_1_rel [ LBL [7], ARG0 [3] ],
                oyasumi_s_2_rel [ LBL [1], ARG0 [2], ARG1 [3] ] >
       HCONS  < qeq [ HARG [5], LARG [7] ] >
       ICONS  < addressor [ IARG1 [8], IARG2 [2] ],
                addressee [ IARG1 [9], IARG2 [2] ],
                higher [ IARG1 [9], IARG2 [8] ],
                higher [ IARG1 [3], IARG2 [8] ] > ]

The traditional representation (16) can be converted into a dependency graph for ease of exposition. In (17), the SPEAKER and HEARER nodes tentatively stand for the dialogue participants.

(17) [Dependency graph. Nodes: SPEAKER, ojiisan ‘elderly.man’, oyasumi ‘sleep’, HEARER. Arcs: addressor, addressee, higher, higher, ARG1.]

The solid line in (17) means that the relation is specified in the RELS list. The dotted line stands for the ICONS element. The relational value is labelled on the arrow, and the direction of the arrow indicates which individual is co-indexed with which IARG. For instance, the arrow from ojiisan to SPEAKER means the same as [x1 higher x3] presented in (15c) and the last ICONS element of (16).

4 Translating Honorifics

With respect to translating honorific expressions across languages, there are different types of translation strategies. Notice that paraphrasing is regarded as a specific type of translation (i.e. monolingual translation) in the current study, given that it is also carried out via the same procedure consisting of parsing, (transfer), and generation.
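Since every strategy discussed in this section operates on the ICONS list gathered during parsing, it may help to see that list assembled step by step. The sketch below rebuilds the four relations of (15c) for sentence (14); the function names utterance, masu and ninaru are hypothetical stand-ins for the utterance rule in (9) and the two inflectional rules in (12), and variables are plain strings.

```python
from typing import List, NamedTuple


class Icons(NamedTuple):
    """One ICONS element, printed in the [alpha X beta] format of (15c)."""
    relation: str
    iarg1: str
    iarg2: str

    def __str__(self) -> str:
        return f"[{self.iarg1} {self.relation} {self.iarg2}]"


def utterance(speaker: str, hearer: str, head_index: str) -> List[Icons]:
    """Utterance rule: link both dialogue participants to the semantic head."""
    return [Icons("addressor", speaker, head_index),
            Icons("addressee", hearer, head_index)]


def masu(hearer: str, speaker: str) -> Icons:
    """Addressee-honorific ending: the hearer is higher than the speaker."""
    return Icons("higher", hearer, speaker)


def ninaru(subject: str, speaker: str) -> Icons:
    """Subject-honorific form: the subject is higher than the speaker."""
    return Icons("higher", subject, speaker)


def analyse_14() -> List[Icons]:
    """Collect the ICONS list for (14) in the order given in (15c)."""
    subject, event, speaker, hearer = "x1", "e2", "x3", "x4"
    icons = utterance(speaker, hearer, event)
    icons.append(masu(hearer, speaker))
    icons.append(ninaru(subject, speaker))
    return icons


for element in analyse_14():
    print(element)
```

Running the sketch prints the four relations in the same order as (15c). In an actual grammar the order of composition differs (the inflectional rules apply before the utterance rule), but since ICONS is a bag of constraints the resulting set is the same.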

First, if both the source language and the target language have a complex honorific system (e.g. Japanese → Japanese), all ICONS elements gathered in the parsing stage persist in the transfer and generation stage. The four sentences in Japanese provided in (18) convey a meaning like “Did you/someone sleep?” in English, but the preference in the choice hinges on the social relation. The felicity condition is presented in Table 2.

Table 2: Choice of (18a-d)
                 higher subject   plain subject
higher hearer    (18a)            (18b)
plain hearer     (18c)            (18d)

(18) a. お休みになりましたか？
        o-yasumi ni nari mashi ta ka ?
        HON-sleep become HON PST QUES PU
     b. 寝ましたか？
        ne mashi ta ka ?
        sleep HON PST QUES PU
     c. お休みになった？
        o-yasumi ni nat ta ?
        HON-sleep become PST PU
     d. 寝た？
        ne ta ?
        sleep PST PU [jpn]

The monolingual translation (i.e. paraphrasing) is carried out as follows: First, if a suppletive form is used (e.g. oyasumi), the corresponding form in the generation output should be the same because the suppletive form is more informative and specific than the neutral form. If suppletives are converted into neutrals, loss of information happens. Notice that the suppletive forms normally have different PRED names as shown earlier in (15-17). For this reason, (18a-b) and (18c-d) are not interchangeable. Second, if there is an element in the ICONS list of the input MRS, the element persists in the output MRS in order not to lose a piece of information. In other words, a completely underspecified output for each ICONS element is not allowed in generation. For instance, both (18a) and (18c) use oyasumi (suppletive), but (18a) cannot be paraphrased into (18c) because (18a) has one more honorific relation in the ICONS list (i.e. the one introduced by masu). The same goes for (18b) and (18d): Since masu in (18b) makes the sentence more informative than (18d), (18b) cannot be paraphrased into (18d). Third, the opposite direction is acceptable. Even if a constituent introduces no ICONS element, the output can include an honorific constituent. For instance, (18c) can be paraphrased into (18a), and (18d) can be paraphrased into (18b). Translating from a less informative form to a more informative form is plausible because there is no discarded information.

Second, if the source language has rich honorifics, and the target language places an honorific constraint on only pronouns (e.g. Japanese → Chinese), the ICONS elements are selectively transferred: The elements not linked to pronouns are filtered out. In the opposite direction, the underspecified ICONS element in the input MRS can be resolved in the output as discussed above. For instance, the subject and the hearer in (19) are the honorific second pronoun 您 nín in Chinese. (19) cannot be translated into (18c-d), in which masu does not show up (see (12)). In contrast, all sentences given in (18) can be translated into (19) without loss of information. Recall that the current model analyzes the neutral form as underspecified (not [HON −]).

(19) 您睡了吗？
     nín shuì le ma ?
     2.SG(HON) sleep PERF QUES PU [cmn]

Third, if the source language employs rich honorification and the target language has no honorific form (e.g. English), the transfer system turns off ICONS. For example, (18a-d) are commonly translated into “Did you sleep?” in English. The other direction (e.g. English → Japanese) raises no problem because all underspecified elements are restored on the target language’s side.

5 Experiment

In order to verify whether the current model works for semantics-based processing, one experiment was conducted with ACE (http://sweaglesw.org/linguistics/ace). The HPSG grammar used for this experiment is Jacy (Siegel and Bender, 2002). The basic analysis for honorific expressions in Japanese discussed in this paper was implemented. The test set was the first 4,500 sentences in the Tanaka corpus (Tanaka, 2001). Using these resources, paraphrasing in Japanese (i.e. monolingual translation) was carried out with the 5-best option for parsing and 512MB memory capacity. After paraphrasing was completed, the two results (i.e. without or with ICONS) were compared, as provided in Table 3. The comparison was made with respect to (A) the average number of outputs, (B) the number of the items with end-to-end-success, and (C) the number of the items with exact-match-output out of (B).
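The paraphrasing condition that this experiment exercises (Section 4: suppletive PRED names are preserved, and every ICONS element of the input persists in the output) can be sketched as a simple subset check. This is only an illustration under simplifying assumptions: each sentence of (18) is reduced to one PRED name plus its rank relations, the addressor and addressee elements shared by all four sentences are left out, and the names Analysis, SENTENCES and can_paraphrase are hypothetical. It is not how ACE performs generation.

```python
from typing import FrozenSet, NamedTuple, Tuple

Icons = Tuple[str, str, str]  # (relation, IARG1, IARG2)


class Analysis(NamedTuple):
    pred: str                 # PRED name of the verb (suppletive or neutral)
    icons: FrozenSet[Icons]   # rank relations gathered during parsing


SUBJ_HON = ("higher", "x1", "x3")    # subject higher than speaker (ni naru)
HEARER_HON = ("higher", "x4", "x3")  # hearer higher than speaker (masu)

SENTENCES = {
    "18a": Analysis("oyasumi", frozenset({SUBJ_HON, HEARER_HON})),
    "18b": Analysis("neru", frozenset({HEARER_HON})),
    "18c": Analysis("oyasumi", frozenset({SUBJ_HON})),
    "18d": Analysis("neru", frozenset()),
}


def can_paraphrase(source: Analysis, target: Analysis) -> bool:
    """Allow a paraphrase only if no information from the source is lost."""
    same_pred = source.pred == target.pred
    icons_persist = source.icons <= target.icons  # subset test
    return same_pred and icons_persist


for src, tgt in [("18c", "18a"), ("18a", "18c"), ("18d", "18b"), ("18b", "18d")]:
    print(src, "->", tgt, can_paraphrase(SENTENCES[src], SENTENCES[tgt]))
```

The check is deliberately asymmetric: a less informative input may be realized by a more informative output, but not the reverse, which is one way of reading why the ICONS-enabled run yields more outputs per item.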

Table 3: Evaluation
         (A, #)    (B, %)    (C, %)
plain    132.67    50.38     59.64
ICONS    280.98    45.40     67.89

Table 3 shows that the current model aids in producing more precise outputs, as indicated in (C): The translation accuracy grows by 8.25 percentage points. The number of outputs (A) also grows because all potential forms of expressing honorifics are generated without loss of information: All ambiguous interpretations are generated as long as the information is provided in the semantic representation. The end-to-end-success rate (B) decreases, but it is mainly due to the memory limitation, not the model itself: If the size of generated outputs exceeds the given value of memory limitation (512MB in the current experiment), all the outputs are ignored in comparison. If a bigger value is chosen, this rate also increases, though it takes much longer to yield the outputs.

Acknowledgments

I am very grateful to Francis Bond, Jae-Woong Choe, Yasunari Harada, Jong-Bok Kim, Michael Wayne Goodman, David Moeljadi, and Zhenzhen Fan for their help and comments. This research was supported in part by the MOE Tier 2 grant That’s what you meant: a Rich Representation for Manipulation of Meaning (MOE ARC41/13).

References

Asif Agha. 1994. Honorification. Annual Review of Anthropology, pages 277–302.

Jonathan David Bobaljik and Kazuko Yatsushiro. 2006. Problems with Honorification-as-Agreement in Japanese: A Reply to Boeckx & Niinuma. Natural Language & Linguistic Theory, 24(2):355–384.

Cedric Boeckx. 2006. Honorification as Agreement. Natural Language & Linguistic Theory, 24(2):385–398.

Francis Bond, Eric Nichols, Darren Scott Appling, and Michael Paul. 2008. Improving Statistical Machine Translation by Paraphrasing the Training Data. In Proceedings of the International Workshop on Spoken Language Translation, pages 150–157, Hawaii.

Jae-Woong Choe. 2004. Obligatory Honorification and the Honorific Feature. Studies in Generative Grammar, 14(4):545–559.

Ann Copestake, Dan Flickinger, Carl Pollard, and Ivan A. Sag. 2005. Minimal Recursion Semantics: An Introduction. Research on Language & Computation, 3(4):281–332.

Ann Copestake. 2009. Slacker Semantics: Why Superficiality, Dependency and Avoidance of Commitment can be the Right Way to Go. In Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009), pages 1–9, Athens.

Mary Dalrymple. 2001. Lexical Functional Grammar. Academic Press, New York.

Sachiko Ide. 2005. How and Why Honorifics can Signify Dignity and Elegance. In Robin Tolmach Lakoff and Sachiko Ide, editors, Broadening the Horizon of Linguistic Politeness, pages 45–64. John Benjamins Publishing.

Jong-Bok Kim and Peter Sells. 2007. Korean Honorification: a Kind of Expressive Meaning. Journal of East Asian Linguistics, 16(4):303–336.

Jong-Bok Kim, Peter Sells, and Jaehyung Yang. 2006. Parsing Korean Honorification Phenomena in a Typed Feature Structure Grammar. In Luc Lamontagne and Mario Marchand, editors, Advances in Artificial Intelligence, pages 254–265. Springer.

Hideki Mima, Osamu Furuse, and Hitoshi Iida. 1997. A Situation-based Approach to Spoken Dialog Translation Between Different Social Roles. In Seventh International Conference on Theoretical and Methodological Issues in Machine Translation: TMI-97, pages 176–183, Santa Fe.

Shigeko Nariyama, Hiromi Nakaiwa, and Melanie Siegel. 2005. Annotating Honorifics Denoting Social Ranking of Referents. In Proceedings of the 6th International Workshop on Linguistically Interpreted Corpora (LINC-05), pages 91–100, Jeju.

Kiyonori Ohtake and Kazuhide Yamamoto. 2001. Paraphrasing Honorifics. In Workshop Proceedings of Automatic Paraphrasing: Theories and Applications, pages 13–20, Jeju.

Carl Pollard and Ivan A. Sag. 1994. Head-Driven Phrase Structure Grammar. The University of Chicago Press, Chicago, IL.

Melanie Siegel and Emily M. Bender. 2002. Efficient Deep Processing of Japanese. In Proceedings of the 3rd Workshop on Asian Language Resources and International Standardization, pages 1–8, Taipei.

Melanie Siegel. 2000. Japanese Honorification in an HPSG Framework. In Proceedings of the 14th Pacific Asia Conference on Language, Information and Computation, pages 289–300, Tokyo.

Yasuhito Tanaka. 2001. Compilation of a Multilingual Corpus. In Proceedings of the PACLING, pages 265–268, Kyushu.

Annie Zaenen, Jean Carletta, Gregory Garretson, Joan Bresnan, Andrew Koontz-Garboden, Tatiana Nikitina, M Catherine O’Connor, and Tom Wasow. 2004. Animacy Encoding in English: why and how. In Proceedings of the 2004 ACL Workshop on Discourse Annotation, pages 118–125, Stroudsburg.