Deep Linguistic Analysis for the Accurate Identification of Predicate
Total Page:16
File Type:pdf, Size:1020Kb
Deep Linguistic Analysis for the Accurate Identification of Predicate-Argument Relations Yusuke Miyao Jun'ichi Tsujii Department of Computer Science Department of Computer Science University of Tokyo University of Tokyo [email protected] CREST, JST [email protected] Abstract obtained by deep linguistic analysis (Gildea and This paper evaluates the accuracy of HPSG Hockenmaier, 2003; Chen and Rambow, 2003). parsing in terms of the identification of They employed a CCG (Steedman, 2000) or LTAG predicate-argument relations. We could directly (Schabes et al., 1988) parser to acquire syntac- compare the output of HPSG parsing with Prop- tic/semantic structures, which would be passed to Bank annotations, by assuming a unique map- statistical classifier as features. That is, they used ping from HPSG semantic representation into deep analysis as a preprocessor to obtain useful fea- PropBank annotation. Even though PropBank tures for training a probabilistic model or statistical was not used for the training of a disambigua- tion model, an HPSG parser achieved the ac- classifier of a semantic argument identifier. These curacy competitive with existing studies on the results imply the superiority of deep linguistic anal- task of identifying PropBank annotations. ysis for this task. Although the statistical approach seems a reason- 1 Introduction able way for developing an accurate identifier of Recently, deep linguistic analysis has successfully PropBank annotations, this study aims at establish- been applied to real-world texts. Several parsers ing a method of directly comparing the outputs of have been implemented in various grammar for- HPSG parsing with the PropBank annotation in or- malisms and empirical evaluation has been re- der to explicitly demonstrate the availability of deep ported: LFG (Riezler et al., 2002; Cahill et al., parsers. That is, we do not apply statistical model 2002; Burke et al., 2004), LTAG (Chiang, 2000), nor machine learning to the post-processing of the CCG (Hockenmaier and Steedman, 2002b; Clark et output of HPSG parsing. By eliminating the effect al., 2002; Hockenmaier, 2003), and HPSG (Miyao of post-processing, we can directly evaluate the ac- et al., 2003; Malouf and van Noord, 2004). How- curacy of deep linguistic analysis. ever, their accuracy was still below the state-of-the- Section 2 introduces recent advances in deep lin- art PCFG parsers (Collins, 1999; Charniak, 2000) in guistic analysis and the development of semanti- terms of the PARSEVAL score. Since deep parsers cally annotated corpora. Section 3 describes the de- can output deeper representation of the structure of tails of the implementation of an HPSG parser eval- a sentence, such as predicate argument structures, uated in this study. Section 4 discusses a problem in several studies reported the accuracy of predicate- adopting PropBank for the performance evaluation argument relations using a treebank developed for of deep linguistic parsers and proposes its solution. each formalism. However, resources used for the Section 5 reports empirical evaluation of the accu- evaluation were not available for other formalisms, racy of the HPSG parser. and the results cannot be compared with each other. In this paper, we employ PropBank (Kingsbury 2 Deep linguistic analysis and and Palmer, 2002) for the evaluation of the accu- semantically annotated corpora racy of HPSG parsing. In the PropBank, semantic Riezler et al. (2002) reported the successful applica- arguments of a predicate and their semantic roles tion of a hand-crafted LFG (Bresnan, 1982) gram- are manually annotated. Since the PropBank has mar to the parsing of the Penn Treebank (Marcus been developed independently of any grammar for- et al., 1994) by exploiting various techniques for malisms, the results are comparable with other pub- robust parsing. The study was impressive because lished results using the same test data. most researchers had believed that deep linguistic Interestingly, several studies suggested that the analysis of real-world text was impossible. Their identification of PropBank annotations would re- success owed much to a consistent effort to main- quire linguistically-motivated features that can be tain a wide-coverage LFG grammar, as well as var- ARG0-choose S FrameNet (Baker et al., 1998) are large English cor- NP-1 VP pora annotated with the semantic relations of words they VP VP in a sentence. Figure 1 shows an example of the did n’t have S annotation of the PropBank. As the target text of NP VP the PropBank is the same as the Penn Treebank, a ARG0-choose *-1 to VP ARG1-choose syntactic structure is given by the Penn Treebank. choose NP The PropBank includes additional annotations rep- REL-choose this particular moment resenting a predicate and its semantic arguments in a syntactic tree. For example, in Figure 1, REL de- notes a predicate, “choose”, and ARG represents Figure 1: Annotation of the PropBank its semantic arguments: “they” for the 0th argument (i.e., subject) and “this particular moment” for the ious techniques for robust parsing. 1st argument (i.e., object). However, the manual development of wide- Existing studies applied statistical classifiers to coverage linguistic grammars is still a difficult task. the identification of the PropBank or FrameNet an- Recent progress in deep linguistic analysis has notations. Similar to many methods of applying ma- mainly depended on the acquisition of lexicalized chine learning to NLP tasks, they first formulated grammars from annotated corpora (Xia, 1999; Chen the task as identifying in a sentence each argument and Vijay-Shanker, 2000; Chiang, 2000; Hocken- of a given predicate. Then, parameters of the iden- maier and Steedman, 2002a; Cahill et al., 2002; tifier were learned from the annotated corpus. Fea- Frank et al., 2003; Miyao et al., 2004). This ap- tures of a statistical model were defined as a pat- proach not only allows for the low-cost develop- tern on a partial structure of the syntactic tree output ment of wide-coverage grammars, but also provides by an automatic parser (Gildea and Palmer, 2002; the training data for statistical modeling as a by- Gildea and Jurafsky, 2002). product. Thus, we now have a basis for integrating Several studies proposed the use of deep linguis- statistical language modeling with deep linguistic tic features, such as predicate-argument relations analysis. To date, accurate parsers have been devel- output by a CCG parser (Gildea and Hockenmaier, oped for LTAG (Chiang, 2000), CCG (Hockenmaier 2003) and derivation trees output by an LTAG parser and Steedman, 2002b; Clark et al., 2002; Hocken- (Chen and Rambow, 2003). Both studies reported maier, 2003), and LFG (Cahill et al., 2002; Burke et that the identification accuracy improved by in- al., 2004). Those studies have opened up the appli- troducing such deep linguistic features. Although cation of deep linguistic analysis to practical use. deep analysis has not outperformed PCFG parsers in However, the accuracy of those parsers was still terms of the accuracy of surface structure, these re- below PCFG parsers (Collins, 1999; Charniak, sults are implicitly supporting the necessity of deep 2000) in terms of the PARSEVAL score, i.e., labeled linguistic analysis for the recognition of semantic bracketing accuracy of CFG-style parse trees. Since relations. one advantage of deep parsers is that they can out- However, these results do not directly reflect the put a sort of semantic representation, e.g. predicate- performance of deep parsers. Since these corpora argument structures, several studies have reported provide deeper structure of a sentence than surface the accuracy of predicate-argument relations (Hock- parse trees, they would be suitable for the evalua- enmaier and Steedman, 2002b; Clark et al., 2002; tion of deep parsers. In Section 4, we explore the Hockenmaier, 2003; Miyao et al., 2003). However, possibility of using the PropBank for the evaluation their evaluation employed a treebank developed for of an HPSG parser. a specific grammar formalism. Hence, those results cannot be compared fairly with parsers based on other formalisms including PCFG parsers. 3 Implementation of an HPSG parser At the same time, following the great success of machine learning approaches in NLP, many re- This study evaluates the accuracy of a general- search efforts are being devoted to developing vari- purpose HPSG parser that outputs predicate argu- ous annotated corpora. Notably, several projects are ment structures. While details have been explained underway to annotate large corpora with semantic in other papers (Miyao et al., 2003; Miyao et al., information such as semantic relations of words and 2004), in the remainder of this section, we briefly coreferences. review the grammar and the disambiguation model PropBank (Kingsbury and Palmer, 2002) and of our HPSG parser. S PHON “choose” arg head NP-1 VP HEAD verb arg HEAD noun head they VP VP SUBJ < SEM 1 > head mod head arg n’t have HEAD noun did S COMPS < > head SEM 2 NP VP arg REL choose *-1 tohead VP SEM ARG0 1 arg head ARG1 2 choose NP this particular moment HEAD verb Figure 3: Mapping from syntactic arguments to se- SUBJ < > COMPS < > subject-head mantic arguments HEAD noun HEAD verb 1 SUBJ < > SUBJ < > COMPS < > 2 head-comp HEAD verb HEAD verb we get lexical entries for all of the words in the tree they SUBJ < _ > SUBJ < 2 > head-comp (the bottom). did n’t have HEAD verb SUBJ < 1 > As shown in the figure, we can also obtain com- head-comp to HEAD verb plete HPSG derivation trees, i.e., an HPSG tree- SUBJ < 1 > head-comp bank. It is available for the machine learning of dis- choose HEAD noun SUBJ < > COMPS < > ambiguation models, and can also be used for the evaluation of HPSG parsing. this particular moment In an HPSG grammar, syntax-to-semantics map- HEAD verb pings are implemented in lexical entries.