Parse Tree Fragmentation of Ungrammatical Sentences
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16)

Homa B. Hashemi, Rebecca Hwa
Intelligent Systems Program, Computer Science Department, University of Pittsburgh
[email protected], [email protected]

Abstract

Ungrammatical sentences present challenges for statistical parsers because the well-formed trees they produce may not be appropriate for these sentences. We introduce a framework for reviewing the parses of ungrammatical sentences and extracting the coherent parts whose syntactic analyses make sense. We call this task parse tree fragmentation. In this paper, we propose a training methodology for fragmenting parse trees without using a task-specific annotated corpus. We also propose some fragmentation strategies and compare their performance on an extrinsic task – fluency judgments in two domains: English-as-a-Second Language (ESL) and machine translation (MT). Experimental results show that the proposed fragmentation strategies are competitive with existing methods for making fluency judgments; they also suggest that the overall framework is a promising way to handle syntactically unusual sentences.

1 Introduction

Syntactic parse trees are integral to many Natural Language Processing (NLP) applications. While state-of-the-art statistical parsers perform well on standard (newswire) benchmarks, their analyses for sentences from different domains are less reliable [Gildea, 2001; McClosky et al., 2010; Foster, 2010; Petrov et al., 2010; Foster et al., 2011]. Ungrammatical¹ sentences (or even awkward sentences that are technically grammatical) can be seen as special kinds of out-of-domain sentences; in some cases, it is not even clear whether a complete parse should be given to the sentence. A statistical parser trained on a standard treebank, however, often produces full, syntactically well-formed trees that are not appropriate for these sentences.

Rather than throwing out parses with low-confidence scores entirely or transforming the problematic sentences first (which may not always be possible), we believe that a valuable middle ground is to identify well-formed syntactic structures of those parts of the sentences that do make sense. The resulting reliable structures may still provide some useful information for downstream NLP applications such as information extraction (IE), machine translation (MT), and automatic evaluation of text (e.g., generated by MT or summarization systems or human second language learners); the omission of the problematic structures also helps to prevent models that learn from syntactic structures from degrading due to incorrect syntactic analysis.

One approach for obtaining these partially completed structures is to use shallow parsing [Abney, 1991; Sha and Pereira, 2003; Sun et al., 2008] to identify recognizable low-level constituents, but this excludes higher-level complex structures. Instead, we propose to review the full parse tree generated by a state-of-the-art parser and identify the parts of it that are plausible interpretations for the phrases they cover. We call these isolated parts of the parse tree fragments, and the process of breaking up the tree, parse tree fragmentation. Our approach differs from systems that produce partial parses (e.g., the Connexor parser) because we are not using a deterministic grammar building bottom up; we are re-interpreting the parse tree post hoc. Our task also differs from disfluency detection in spoken utterances, which focuses on removing extra fillers and repeated phrases [Honnibal and Johnson, 2014; Rasooli and Tetreault, 2013; Ferguson et al., 2015]; ungrammatical sentences written by non-native speakers or generated by machines have a wider range of error types, such as missing phrases and incorrect phrasal ordering.

In this paper, we present a framework for extracting parse tree fragments from problematic sentences. We develop a methodology for creating gold standard data for training and evaluating parse tree fragmentation methods. We propose two fragmentation algorithms and compare them in an empirical study. Finally, we perform an extrinsic evaluation to determine the potential utility of tree fragmentation to a downstream application. In particular, we choose the task of sentential fluency judgment, in which a system automatically predicts how “natural” a sentence might sound to a native-speaker human. We consider two domains: MT outputs and the writings of English-as-a-Second Language (ESL) students.

¹ In this context, a sentence is considered ungrammatical if all its words are valid in the language, but it contains grammatical or usage errors [Foster, 2007].
[Figure 1 (trees not reproduced): (a) Stanford parse tree; (b) Coherent fragments]
Figure 1: (a) An ungrammatical sentence gets a well-formed but inappropriate parse tree. (b) A set of coherent tree fragments that might be extracted from the full parse tree.

2 A Framework for Parse Tree Fragmentation

The goal of parse tree fragmentation is to take a sentence and its tree as input and extract from the tree a set of partial trees that are well-formed and appropriate for the phrases they cover. Before we explore methods for doing so, we need to address some more fundamental problems: What kinds of partial trees are considered well-formed and appropriate? How do we obtain enough examples of appropriate ways to fragment the trees? How should this task be evaluated?

2.1 Ideal Fragmentation

One factor that dictates how fragmentation should be done is how the fragments will be used in a downstream application. For example, a one-off slight grammar error (e.g., number agreement) probably will not greatly alter a parser's output. For the purpose of information extraction, this type of slight mismatch should probably be ignored; for the purpose of training future syntax-based computational models, on the other hand, more aggressive fragmentation may be necessary to filter out unwanted syntactic relationships.

Even assuming a particular choice of downstream application (sentential fluency judgment in our case), the ideal fragmentation may not be obvious, especially when the errors interact with each other. Consider the following output from a machine translation system that contains three problem areas (underlined):

The members of the vote opposes any him.

Figure 1a shows the Stanford parser's output for this sentence. The first problem is the subject noun phrase, even though the embedded NP and PP subtrees are well-formed. The second problem is a number disagreement between the subject and the verb. The third problem is an unusual bigram that causes the parser to group any and him into a sentential clause to serve as the object of the main verb.

Which fragments should be salvaged from this parse tree? Someone who thinks the sentence says "The members of the voting body oppose any proposal by him" might produce the fragment set shown in Figure 1b. On the other hand, if they think it says "No parliament members voted against him", they might have opted not to keep the PP intact.
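To make the cutting operation itself concrete, the sketch below hand-codes an approximation of the Figure 1a tree with nltk.Tree and detaches two subtrees in the spirit of Figure 1b. The tree string, the fragment helper, and the cut positions are all illustrative choices of ours, not the paper's algorithm or its actual output.

```python
# Illustrative only: a hand-coded approximation of the Figure 1a parse and a
# naive "cut these edges" helper; the paper's fragmentation methods decide
# where to cut automatically.
from nltk import Tree

# Approximate Stanford-style parse of "The members of the vote opposes any him."
parse = Tree.fromstring(
    "(S (NP (NP (DT The) (NNS members)) (PP (IN of) (NP (DT the) (NN vote)))) "
    "(VP (VBZ opposes) (S (NP (DT any) (PRP him)))))"
)

def fragment(tree, cuts):
    """Detach the subtrees at the given tree positions and return all fragments.

    Positions are removed deepest-first so shallower positions stay valid;
    cuts that share a parent would need extra index bookkeeping (omitted here).
    """
    fragments = []
    for pos in sorted(cuts, key=len, reverse=True):
        parent, child_index = tree[pos[:-1]], pos[-1]
        fragments.append(parent.pop(child_index))
    fragments.append(tree)  # whatever remains attached to the root
    return fragments

# Hypothetical cut set: split off the PP "of the vote" and the clause "any him"
# that the parser mis-attached as the object of "opposes".
for frag in fragment(parse.copy(deep=True), cuts=[(0, 1), (1, 1)]):
    print(frag)
```

Run on this example, the helper yields three fragments: the PP over "of the vote", the small clause over "any him", and the remaining skeleton "The members ... opposes"; whether that particular split is desirable is exactly the judgment discussed in the rest of this section.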
The Figure 1 example illustrates that fragmentation decisions are influenced by the amount of information we glean from the sentence. With only a sentence and an automatically generated tree for it, we may mentally error-correct the sentence in different ways. If we are also given an acceptable paraphrase of the sentence, the fragmentation task becomes more circumscribed because we now know the intended meaning. An example data source of this type is an MT evaluation corpus, which consists of machine-translated sentences and their corresponding human-translated references. Furthermore, if we have access not only to a closely worded paraphrase but also to an explanation for each change, the fragmentation decisions are purely deterministic (e.g., whenever a phrase is recommended for deletion, the tree over it is fragmented). An example data source of this type is an ESL learner's corpus, which consists of student sentences and their detailed corrections.

2.2 Developing a Fragmentation Corpus

Because the definition of an ideal fragmentation depends on multiple factors (e.g., the intended use and the context in which the original sentences were generated), this task is not well-suited for a large-scale human annotation project. Instead, we propose to develop our fragmentation corpus by leveraging the existing data sources mentioned above (an ESL learner's corpus and an MT evaluation corpus).

Pseudo Gold Fragmentation (PGold) An ESL learner's corpus in which every sentence has been hand-corrected by an English teacher is ideal for our purpose. We identified sentences that are marked as containing word-level mistakes: unnecessary, missing, or replacing word errors. Given the positions and error types, a fluent sentence can be reconstructed and reliably parsed. The parse tree of the fluent sentence can then be iteratively fragmented according to the error types that occur in the original sentence (see Figure 2). The resulting sets of fragments approximate an explicit, manually created fragmentation corpus; however, since a parser may make mistakes even on a fluent sentence, we call these fragments pseudo gold.

REference Fragmentation (REF) Even if we do not have detailed information about why certain parts of a sentence are problematic, we can construct an almost-as-good fragmentation if we have access to a fluent paraphrase of the original. We call this a reference sentence, borrowing the terminology …

[Figure 2 (partial, trees not reproduced): fragmentation of a tree over leaves l1, l2, l3; panel (a): Unnecessary word (l3) error]
[Figure 3 caption: Word N-gram features for the dotted edge]
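Returning to the PGold construction: as a rough sketch of its reconstruction step, the snippet below assumes each teacher annotation is a (token span, error type, correction) record. That format, the Edit class, and the example corrections are inventions of ours for illustration, and the subsequent parsing and per-error-type tree cutting (Figure 2) are not shown.

```python
# Illustrative sketch of rebuilding a fluent sentence from teacher edits,
# the first step of the PGold procedure; the annotation format is assumed.
from dataclasses import dataclass

@dataclass
class Edit:
    start: int        # index of the first token the edit touches
    end: int          # one past the last token (start == end for a missing word)
    kind: str         # "unnecessary", "missing", or "replacing"
    correction: str   # inserted/replacement text; "" for an unnecessary word

def reconstruct_fluent(tokens, edits):
    """Apply corrections right-to-left so earlier token indices stay valid."""
    tokens = list(tokens)
    for e in sorted(edits, key=lambda e: e.start, reverse=True):
        tokens[e.start:e.end] = e.correction.split()
    return tokens

# Invented corrections for the running example sentence.
learner = "The members of the vote opposes any him .".split()
edits = [
    Edit(5, 6, "replacing", "oppose"),
    Edit(6, 8, "replacing", "any proposal by him"),
]
print(" ".join(reconstruct_fluent(learner, edits)))
# -> The members of the vote oppose any proposal by him .
```

The fluent string would then be parsed, and the resulting tree cut around the leaves that the edits touch, following the per-error-type rules the paper illustrates in Figure 2.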