CMILLS: Adapting Features to Dependency Parsing

Chad Mills and Gina-Anne Levow
University of Washington
Guggenheim Hall, 4th Floor
Seattle, WA 98195, USA
[email protected], [email protected]

Abstract

We describe a system for semantic role labeling adapted to a dependency parsing framework. Verb arguments are predicted over nodes in a dependency parse tree instead of nodes in a phrase-structure parse tree. Our system participated in SemEval-2015 shared Task 15, Subtask 1: CPA parsing, and achieved an F-score of 0.516. We adapted features from prior semantic role labeling work to the dependency parsing paradigm, using a series of supervised classifiers to identify arguments of a verb and then assign syntactic and semantic labels. We found that careful feature selection had a major impact on system performance. However, sparse training data still led rule-based systems like the baseline to be more effective than learning-based approaches.

1 Introduction

We describe our submission to the SemEval-2015 Task 15, Subtask 1 on Corpus Pattern Analysis (Baisa et al. 2015). This task is similar to semantic role labeling, but with arguments based on nodes in dependency parses instead of a syntactic parse tree. The verb's arguments are identified and labeled with both their syntactic and semantic roles.

For example, consider the sentence "But he said Labour did not agree that Britain could or should abandon development, either for itself or for the developing world." This subtask involves taking that sentence and making the following determinations relative to the given verb "abandon":
• "Britain" is the syntactic subject of "abandon" and falls under the "Institution" semantic type
• "development" is the syntactic object of "abandon" and is of semantic type "Activity"

We organize the remainder of our paper as follows: Section 2 describes our system, Section 3 presents experiments, and Section 4 concludes.

2 System Description

Our system consists of a pipelined five-component system plus source data and resources. A system diagram is shown in Figure 1. A cascading series of MaxEnt classifiers is used to identify arguments, their syntactic labels, and then their semantic labels. Each token in an input sentence was a training example.

Sketch Engine (Kilgarriff 2014) was used to help with featurization. All sentences in the training data were parsed and POS tagged using the Stanford CoreNLP tools (Manning et al. 2014). This data was used to generate features which are then supplied to an Argument Identification Classifier (AIC) that identifies whether or not a particular token is one of the relevant verb's arguments.

For the tokens identified as arguments to the verb, a Syntax Classifier identifies the syntactic role of the token. This is done using a multi-class MaxEnt model with the same features as the AIC plus features derived from the AIC's predictions. A similar Semantics Classifier follows, taking the Syntax Classifier's features and output. Finally, a Semantics Consistency Heuristic Filter is applied to clean up some of the predictions, using a series of heuristics to ensure the system is outputting semantic predictions that are consistent with the syntax predictions for the same token.
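To make the pipeline concrete, the sketch below shows the control flow of the cascade under stated assumptions: the feature extractor, the three MaxEnt models, and the filter are passed in as callables with hypothetical interfaces (extract_features, predict, derived, filter_fn); this is an illustrative sketch, not the actual implementation or any particular ML library's API.

```python
# A minimal sketch of the cascade, assuming duck-typed classifier objects with
# hypothetical predict()/derived() methods (illustration only).

def run_pipeline(tokens, verb, extract_features, aic, syntax_clf, semantics_clf, filter_fn):
    """Return {token: (syntactic_label, semantic_label)} for the predicted arguments of `verb`."""
    syntax_preds, semantic_preds = {}, {}
    for token in tokens:
        aic_feats = extract_features(token, verb)             # Section 2.1 features
        if not aic.predict(aic_feats):                        # Argument Identification Classifier
            continue
        syn_feats = aic_feats + aic.derived(aic_feats)        # plus AIC score/class features
        syntax_preds[token] = syntax_clf.predict(syn_feats)   # Syntax Classifier
        sem_feats = syn_feats + syntax_clf.derived(syn_feats)
        semantic_preds[token] = semantics_clf.predict(sem_feats)  # Semantics Classifier
    semantic_preds = filter_fn(syntax_preds, semantic_preds)  # Semantics Consistency Heuristic Filter
    return {t: (syntax_preds[t], semantic_preds.get(t)) for t in syntax_preds}
```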


[Figure 1: system architecture diagram showing the components Task Data, Sketch Engine, Stanford Parser, Featurization, Argument Identification Classifier, Syntax Classifier, Semantics Classifier, and Semantics Consistency Heuristic Filter.]

Figure 1: System Architecture Diagram. The input data is parsed by the Stanford Parser and the argument heads are expanded using the Sketch Engine thesaurus. This data is then featurized and passed through three successive classifiers: the Argument Identification Classifier identifies verb arguments, the Syntax Classifier assigns syntax labels to the arguments, and the Semantics Classifier assigns semantic labels to the arguments. Finally, the Semantics Consistency Heuristic Filter eliminates some systematic errors in the Semantics Classifier.

2.1 Featurization

Many of the features used in our system were inspired by the system produced by Toutanova et al. (2008), which used many features from prior work. This was a top-performing system and we incorporated each of the features that applied to the dependency parsing framework adopted in this task. We then augmented this feature set with a number of novel additional features. Many of these were adaptations of Semantic Role Labeling (SRL) features from the phrase-structure to dependency parsing paradigm (Gildea and Jurafsky 2002, Surdeanu et al. 2003, Pradhan et al. 2004). Others were added to generalize better to unseen verbs, which is critical for our task.

Some of our features depend on having a phrase-structure parse node corresponding to the candidate dependency parse node. Since dependency parse nodes each correspond to a token in the sentence, the tokens corresponding to the candidate node and its descendants in the dependency parse tree were identified. Then, in the phrase-structure parse tree, the lowest ancestor of all of these tokens was taken to be the phrase-structure parse node best corresponding to the candidate dependency parse node.
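As an illustration of this mapping, the sketch below walks down a phrase-structure tree to find the lowest node whose span covers every token under the candidate dependency node. The tree representation (nodes with a `children` list and a leaf `token_index`) is an assumption made for the example, not the Stanford CoreNLP data structure.

```python
# Hypothetical phrase-structure node: internal nodes have a non-empty .children list;
# leaves have .children == [] and carry a .token_index into the sentence.

def covered_indices(ps_node):
    """Sentence token indices dominated by a phrase-structure node."""
    if not ps_node.children:
        return {ps_node.token_index}
    return set().union(*(covered_indices(child) for child in ps_node.children))

def phrase_node_for_dependency_node(ps_root, dep_token_indices):
    """Lowest phrase-structure node covering the candidate dependency node's token
    and all of its dependency-tree descendants (given as a set of token indices)."""
    node = ps_root
    while True:
        narrower = [c for c in node.children if dep_token_indices <= covered_indices(c)]
        if not narrower:
            return node   # no single child covers all target tokens: this is the lowest ancestor
        node = narrower[0]
```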

The baseline features included some inspired by Gildea and Jurafsky (2002):
• Phrase Type: the syntactic label of the corresponding node in the parse tree
• Predicate Lemma: lemma of the verb
• Path: the path in the parse tree between the candidate syntax node and the verb, including the vertical direction and syntactic parse label of each node (e.g. "--up-->S--down-->NP")
• Position: whether the candidate is before or after the verb in the sentence
• Voice: whether the sentence is in active or passive voice; due to sparse details in Gildea and Jurafsky this was based on tgrep search pattern heuristics found in Roland and Jurafsky (2001)
• Head Word of Phrase: the highest token in the dependency parse under the syntax parse tree node corresponding to the candidate token
• Sub-Cat CFG: the CFG rule corresponding to the parent of the verb, defined by the syntactic node labels of the parent and its children

Additional baseline features were obtained from Surdeanu et al. (2003) and Pradhan et al. (2004):
• First/Last Word/POS: for the syntactic parse node corresponding to the candidate node, this includes four separate features: the first word in the linear sentence order, its part of speech, the last word, and its part of speech
• Left/Right Sister Phrase-Type: the Phrase Type of each of the left and right sisters
• Left/Right Sister Head Word/POS: the word and POS of the head of the left and right sisters
• Parent Phrase-Type: the Phrase Type of the parent of the candidate parse node
• Parent POS/Head-Word: the word and part of speech of the parent of the parse node corresponding to the candidate node
• Node-LCA Partial Path: the Path between the candidate node and the lowest common ancestor of the candidate node and the verb
• PP Parent Head Word: the head word of the parent node in the syntax tree, if that parent is a prepositional phrase
• PP NP Head Word/POS: if the syntax parse node representing the candidate node is a PP, the head word and POS of the rightmost NP directly under the PP

Finally, baseline features that consisted entirely of pairs of already-mentioned features were also taken from Xue and Palmer (2004):
• Predicate Lemma & Path
• Predicate Lemma & Head Word of Phrase
• Predicate Lemma & Phrase Type
• Voice & Position
• Predicate Lemma & PP Parent Head Word

We added additional features adapted from the aforementioned features to generalize better given the sparse training data relative to other SRL tasks:
• Head POS of Phrase: the tagged POS of the Head Word of Phrase
• Head Lemma of Phrase: the lemma of the Head Word of Phrase
• First/Last Lemma: the lemma of the first and last word under the candidate parse node
• Left/Right Sister Head Lemma: the lemmas of the Left/Right Sister Head Words
• Parent Head Lemma: the lemma of the Parent Head Word
• PP Parent Head Lemma/POS: the lemma and part of speech of the PP Parent Head Word
• PP NP Head Lemma: the lemma of the PP NP Head Word
• Candidate CFG: the context-free grammar rule rooted at the syntax parse node corresponding to the candidate node (one step down from Sub-Cat CFG)

Additional features were added to extend these features or to adapt them to dependency parsing:
• Candidate DP CFG: a CFG-like expansion of the dependency parse of the candidate node plus children, each represented by its POS (e.g. "NNS->PRP$" or "NNS->DT JJ NNS")
• Sub-Cat DP CFG: a similar CFG expansion of the dependency parse of the parent of the verb
• First/Last DP Word/Lemma/POS: of all of the descendants of the candidate node in the dependency parse, inclusive, the first/last word/lemma/POS in the linear sentence order
• Dependency Path: the path in the dependency parse from the candidate node to the verb
• Dependency Node-LCA Partial Path: the path in the dependency parse from the candidate node to its lowest common ancestor with the verb
• Dependency Depth: the depth in the dependency parse of the candidate node
• Dependency Descendant Coverage: of all of the tokens under the candidate syntax parse node, the percentage of those also under the candidate node in the dependency parse tree. This measures the candidate syntax and dependency parse node alignment.

Additionally, due to the importance of the Predicate Lemma feature in prior SRL work and the need to generalize entirely to unseen verbs for evaluation in this task, we used Sketch Engine (Kilgarriff 2014) word sketches for each verb. A word sketch is obtained for each unseen test verb and the most similar verb from the training data is used as the Similar Predicate Lemma feature.

We use a novel similarity function to identify similar verbs. A word sketch for each verb v_i identifies an ordered set of n grammatical relations r_{1i}, r_{2i}, r_{3i}, ..., r_{ni} that tend to co-occur with v_i. These are relations like "object", "subject", prepositional phrases headed by "of", etc. The word sketch for each relation r_{ji} associated with v_i also includes a significance value s_i(r_{ji}). For a given verb v_i we calculate a directional similarity d_{ik} with verb v_k as:

d_{ik} = \sum_{j=1}^{n} (0.8)^{j-1} \left| s_i(r_{ji}) - s_k(r_{ji}) \right|

The term |s_i(r_{ji}) - s_k(r_{ji})| is defined as zero if the relation r_{ji} does not appear in both word sketches. The final similarity score u_{ik} between v_i and v_k is then:

u_{ik} = u_{ki} = \frac{d_{ik} + d_{ki}}{2}
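A minimal sketch of this computation is shown below, assuming each word sketch has already been retrieved as an ordered list of (relation, significance) pairs ranked as in the word sketch for v_i; under this definition a smaller score indicates closer word sketches. The data access itself is outside the sketch.

```python
def directional_similarity(sketch_i, sketch_k, decay=0.8):
    """d_ik: decayed sum of significance differences over v_i's ranked relations.
    Each sketch is an ordered list of (relation, significance) pairs; a relation
    missing from either sketch contributes zero to the sum."""
    sig_k = dict(sketch_k)
    total = 0.0
    for j, (relation, sig_i) in enumerate(sketch_i):   # j = 0 is the top-ranked relation
        if relation in sig_k:
            total += (decay ** j) * abs(sig_i - sig_k[relation])
    return total

def similarity(sketch_i, sketch_k):
    """u_ik = u_ki: the symmetrized score used to pick the most similar training verb."""
    return (directional_similarity(sketch_i, sketch_k) +
            directional_similarity(sketch_k, sketch_i)) / 2.0
```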
2.2 Classifiers

We used a series of three classifiers with similar features, each trained using the MALLET implementation of MaxEnt (McCallum 2002).

First, the AIC is a binary model predicting whether a given candidate token is an argument of the verb. In the dependency parsing framework used for this task, a single token in the dependency parse would represent a verbal argument. This was different from previous SRL tasks where a node in the parse tree was taken as the argument; this is more similar to identifying the headword of the phrase that is an argument rather than identifying the full phrase. Each token was treated as one example, with all of the features described in Section 2.1 calculated for each example. We filtered out features that did not appear at least five times in the training data, and trained with the default learning parameters.

Next, the multi-class Syntax Classifier uses the same features as the AIC plus a binary feature of the AIC's score rounded to the nearest tenth, the AIC's predicted class, and these last two combined. The labels predicted were the syntactic labels associated with the arguments in the train data.

Finally, the multi-class Semantics Classifier predicts the semantic label of the argument using the features from the Syntax Classifier plus its output score rounded to the nearest tenth as a binary feature, its output label, and these last two combined.
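The derived features passed from one classifier to the next can be realized as simple string-valued indicator features. The sketch below is one way to construct them; the feature names are illustrative and not necessarily those used in the submitted system.

```python
def upstream_features(prefix, score, predicted_label):
    """Indicator features derived from an upstream classifier in the cascade:
    its score rounded to the nearest tenth, its predicted label, and the two combined."""
    rounded = round(score, 1)
    return [
        f"{prefix}_score={rounded}",
        f"{prefix}_label={predicted_label}",
        f"{prefix}_score={rounded}&label={predicted_label}",
    ]

# Example: upstream_features("aic", 0.7342, "ARG")
# -> ['aic_score=0.7', 'aic_label=ARG', 'aic_score=0.7&label=ARG']
```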
2.3 Semantics Consistency Heuristic Filter

After running the classifiers, overgeneration by the semantic component was cleaned up using heuristics. Semantic predictions for tokens without a syntactic prediction were removed. For tokens with a syntactic but not a semantic label prediction, if the token appeared in the train data with a semantic label the most common one was taken; if not, the most prominent distributional synonym (determined by the Sketch Engine thesaurus) found in the training data that has a semantic label was used.
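A sketch of these two heuristics follows. The training-data label counts and the thesaurus lookup are assumed to be available as precomputed dictionaries; these inputs are hypothetical stand-ins, not the actual implementation.

```python
def consistency_filter(syntax_preds, semantic_preds, train_label_for_word, synonyms):
    """Apply the two clean-up heuristics described above.
    syntax_preds, semantic_preds: {token: label} predictions for one sentence.
    train_label_for_word: {word: most common semantic label in the train data}.
    synonyms: {word: [distributional synonyms, most prominent first]}."""
    # 1. Drop semantic predictions for tokens that received no syntactic prediction.
    filtered = {tok: lab for tok, lab in semantic_preds.items() if tok in syntax_preds}

    # 2. Back off for tokens with a syntactic prediction but no semantic one.
    for tok in syntax_preds:
        if tok in filtered:
            continue
        if tok in train_label_for_word:
            filtered[tok] = train_label_for_word[tok]
        else:
            for syn in synonyms.get(tok, []):
                if syn in train_label_for_word:
                    filtered[tok] = train_label_for_word[syn]
                    break
    return filtered
```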

3 Experiments

The system was evaluated using leave-one-out cross-validation on each verb in the train data. For the initial baseline configuration, only the features present in prior work were included, with a total of 31 feature classes. This configuration achieved an f-score of 0.238. The system was then run with our new features added, which outperformed the baseline by a relative 4% with an f-score of 0.248. In these cross-validation experiments, for each training example we used its Similar Predicate Lemma in place of its Predicate Lemma feature. This was a pessimistic assumption that we did not apply to the final system submitted for evaluation.¹ We suspect this explains why the final f-score on the test data was twice as good as that of the cross-validation experiments. The argument identification module performed well on its own with an f-score of 0.627, which is an upper bound on our overall system performance.

¹ Predicate Lemma is a critical feature in prior SRL work. In the test data, which only included unseen verbs, we used Sketch Engine data to identify the verb in the train data most similar to the verb in the test sentence, the Similar Predicate Lemma feature. In an attempt to mirror the features and avoid the possibility of cheating during our experiments, we repeated the same process during the cross-validation experiments, treating the other most similar verb in the training data as the Similar Predicate Lemma.

We used a hill-climbing heuristic search for the best possible subset of the available features. This was a time-consuming process that involved running cross-validation for each feature class being evaluated with our three-stage classifier, resulting in 63 classifiers being trained per iteration. All the feature removals or additions that improved performance were greedily accepted, yielding 22% feature churn. The best individual feature changes predicted 0.5% improvements to overall performance, but together they produced only a 0.9% improvement. We repeated this a second time but only made the five most valuable changes, yielding a 0.8% point improvement. We did not have time to continue this greedy search, leaving further performance gains from searching for the best collection of features unrealized. We ended our search with 39 feature classes included, with only 21 of these from the original set. Through the course of these experiments, 10 of the original feature classes were removed while 18 new feature classes were added in our best model.

A final series of experiments was used to heuristically improve the semantic component, which was significantly overgenerating. This yielded the Semantics Consistency Heuristic Filter, which results in a 5% improvement to the overall system performance.

The final results on the test data are shown in Table 1. The baseline system still outperformed all teams including ours. The baseline was a heuristic system that used two dependency parsers to be more robust to parsing errors. It mapped dependency parse relations to syntax output directly, with logic to handle conjunctions, passives, and other phenomena. Semantic labels were a mixture of hard-coded values for particular syntactic predictions and the most common value in the train data for the corresponding word or syntactic label.

Team                   F-score
Baseline               0.624
FANTASY                0.589
BLCUNLP                0.530
CMILLS (our system)    0.516

Table 1: Performance on Test Data. Systems were evaluated on predicting the syntactic and semantic labels for the arguments of seven test verbs not present in the train data. Each system was evaluated by independently measuring the f-scores of its syntactic and semantic label predictions on each verb, averaged together by verb and then across verbs to arrive at the final f-score.

4 Conclusion

The experiments suggest that more iterations of the search for the best possible collection of features could yield significant additional improvements in system performance. However, we ran out of time before being able to complete more iterations of the search. While we trailed the second-place system by only 1.4% in overall f-score, the first-place system was ahead by 7.3%, indicating significant improvements are still possible.

Additionally, the heuristic baseline outperformed all systems including ours, indicating that important patterns and intuitions were not encoded into features effectively. Given the sparsity of training data, it is possible that having more data could have also helped our approach based on pipelined classifiers.

In the future, we will evaluate using a single dev set instead of using cross-validation to reduce the computational cost of experiments. We were concerned about the sparse training data, but given the missed opportunity to further optimize the feature sets used by our models due to computational resource constraints, a single dev set could have been a much better approach. We would also like to use features from the semantic ontology rather than treating the semantic labels as unrelated tokens.

With our precision and recall within 2% of one another and relatively low, it would be challenging to reliably generate real-world lexical entries using this system, even with a delimited scope. However, approaches like this could be valuable in giving lexicographers a starting point to verify or modify, rather than starting from scratch.

This was a valuable learning experience, and while our efforts improved performance over our own baseline by nearly 12%, there is still plenty of room to improve, and we have a clear path to do so by incorporating more features and improving experimental design.

Acknowledgments

Thank you to the anonymous reviewers, Ismail El Maarouf, and Daniel Cer for their helpful comments. Any mistakes that remain are our own.

References

Baisa, Vit, et al. (2015). SemEval-2015 Task 15: A CPA dictionary-entry-building task. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). Denver, CO, USA, Association for Computational Linguistics.

Gildea, Daniel and Jurafsky, Daniel (2002). "Automatic Labeling of Semantic Roles." Computational Linguistics 28(3): 245-288.

Kilgarriff, Adam, et al. (2014). The Sketch Engine: ten years on. Lexicography: 1-30.

Manning, Christopher; Surdeanu, Mihai; Bauer, John; Finkel, Jenny; Bethard, Steven; and McClosky, David (2014). The Stanford CoreNLP Natural Language Processing Toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55-60.

McCallum, Andrew Kachites (2002). "MALLET: A Machine Learning for Language Toolkit." http://mallet.cs.umass.edu.

Pradhan, Sameer, et al. (2004). Shallow Semantic Parsing using Support Vector Machines. In HLT-NAACL.

Roland, Douglas and Jurafsky, Daniel (2002). "Verb sense and verb subcategorization probabilities." The lexical basis of sentence processing: Formal, computational, and experimental issues 4: 325-345.

Surdeanu, Mihai, et al. (2003). Using predicate-argument structures for information extraction. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, Volume 1, Association for Computational Linguistics.

Toutanova, Kristina, et al. (2008). "A Global Joint Model for Semantic Role Labeling." Computational Linguistics 34(2): 161-191.

Xue, Nianwen and Palmer, Martha (2004). Calibrating Features for Semantic Role Labeling. In EMNLP.
