Improving Semantic Parsing with Enriched Synchronous Context-Free Grammar

Junhui Li(1), Muhua Zhu(2), Wei Lu(3), Guodong Zhou(1)
(1) Natural Language Processing Lab, Soochow University, China — {lijunhui, gdzhou}@suda.edu.cn
(2) Alibaba Inc., Hangzhou, China — [email protected]
(3) Information Systems Technology and Design, Singapore University of Technology and Design — [email protected]

Abstract

Semantic parsing maps a sentence in natural language into a structured meaning representation. Previous studies show that semantic parsing with synchronous context-free grammars (SCFGs) achieves favorable performance over most other alternatives. Motivated by the observation that the performance of semantic parsing with SCFGs is closely tied to the translation rules, this paper explores extending translation rules with high quality and increased coverage in three ways. First, we introduce structure-informed non-terminals, which better guide the parsing in favor of well-formed structure, instead of using an uninformed non-terminal as in standard SCFGs. Second, we examine the difference between word alignments for semantic parsing and for statistical machine translation (SMT), so as to better adapt word alignment in SMT to semantic parsing. Finally, we address the unknown word translation issue via synthetic translation rules. Evaluation on the standard GeoQuery benchmark dataset shows that our approach achieves the state-of-the-art across various languages, including English, German and Greek.

NL:  What is the area of Seattle
MRL: answer(area_1(cityid('seattle', _)))
(a) before pre-processing

NL′:  what be the area of seattle
MRL′: answer@1 area_1@1 cityid@2 seattle@s _@0
(b) after pre-processing

Figure 1: Example of a sentence pair in NL and MRL.

1 Introduction

Semantic parsing, the task of mapping natural language (NL) sentences into a formal meaning representation language (MRL), has recently received a significant amount of attention, with various models proposed over the past few years. Consider the NL sentence paired with its corresponding MRL in Figure 1(a). Semantic parsing can be naturally viewed as a statistical machine translation (SMT) task, which translates a sentence in NL (i.e., the source language in SMT) into its meaning representation in MRL (i.e., the target language in SMT). Indeed, many attempts have been made to directly apply SMT systems (or methodologies) to semantic parsing (Papineni et al., 1997; Macherey et al., 2001; Wong and Mooney, 2006; Andreas et al., 2013). However, although recent studies (Wong and Mooney, 2006; Andreas et al., 2013) show that semantic parsing with SCFGs, which form the basis of most existing statistical syntax-based translation models (Yamada and Knight, 2001; Chiang, 2007), achieves favorable results, this approach still lags behind the most recent state-of-the-art. For details, please see the performance comparisons in Andreas et al. (2013) and Lu (2014).

The key issues behind the limited success of applying SMT systems directly to semantic parsing lie in the differences between semantic parsing and SMT: MRL is not a real natural language, and it has different properties. First, MRL is machine-interpretable and thus strictly structured, with the meaning representation in a nested structure of functions and arguments. Second, the two languages are intrinsically asymmetric, since each token in MRL carries specific meaning [1] while this does not hold in NL, where auxiliary words and some function words usually have no counterparts in MRL. Third and finally, expressions in NL are more flexible with respect to lexicon selection and token ordering. For example, since the NL sentences 'could you tell me the states that utah borders', 'what states does utah border', and 'utah borders what states' convey the same meaning, they should have the same expression in MRL.

[1] As seen in Section 2, delimiters, including parentheses and commas, which do not carry any meaning, will be removed in pre-processing and recovered in post-processing.

Motivated by the above observations, we believe that semantic parsing with standard SMT components is not an ideal approach. Alternatively, this paper proposes an effective, yet simple way to enrich the SCFG in hierarchical phrase-based SMT for better semantic parsing. Specifically, since translation rules play a critical role in SMT, we explore improving translation rule quality and increasing rule coverage in three ways. First, we enrich the non-terminal symbols so as to capture contextual and structural information. The enrichment of non-terminal symbols not only guides the translation in favor of well-formed structures, but is also beneficial to translation in general. Second, we examine the difference between word alignments for semantic parsing and for SMT, so as to better adapt word alignment in SMT to semantic parsing. Third, unlike most existing SMT systems, which keep unknown words untranslated and intact in the translation, we exploit the translation of unknown words via synthetic translation rules. Evaluation on the GeoQuery benchmark dataset shows that our approach obtains consistent improvements and achieves the state-of-the-art across various languages, including English, German and Greek.

2 Background: Semantic Parsing as Statistical Machine Translation

In this section, we present the framework of semantic parsing as SMT, which was proposed in Andreas et al. (2013).

Pre-Processing  Various semantic formalisms have been considered for semantic parsing. Examples include variable-free semantic representations (that is, the meaning representation for each utterance is tree-shaped), lambda calculus expressions, and dependency-based compositional semantic representations. In this work, we specifically focus on variable-free semantic representations, as shown in Figure 1. On the target side, we convert these meaning representations to series of strings similar to NL. To do so, we simply take a preorder traversal of every functional form and label every function with the number of arguments it takes. Figure 1(b) shows an example of a converted meaning representation, where each token has the format A@B: A is the symbol, while B is either s, indicating that the symbol is a string, or a number indicating the symbol's arity (constants, including strings, are treated as zero-argument functions). On the source side, we perform stemming (for English and German) and lowercasing to overcome data sparseness. Hereafter, we refer to the pre-processed NL and MRL as NL′ and MRL′, respectively.

Translation  Given a corpus of NL′ sentences paired with MRL′, we learn a semantic parser by adopting a string-to-string translation system. Typical components in such a translation system include word alignment between the source and the target languages, translation rule extraction, language model learning, parameter tuning, and decoding. For more details about each component, please refer to Chiang (2007). In the rest of this paper, we refer to the source language (side) as NL′ and the target language (side) as MRL′.

Post-Processing  We convert MRL′ back into MRL by recovering parentheses and commas, so as to reconstruct the corresponding tree structure in MRL. This can easily be done by examining each symbol's arity, which eliminates any possible ambiguity in the tree reconstruction: given any sequence of tokens in MRL′, we can always reconstruct the tree structure (if one exists). We call those translations that cannot be successfully converted ill-formed translations.
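To make the pre- and post-processing concrete, the following sketch (our own minimal reconstruction of the conversion described above, not the authors' code; all function names are illustrative) converts a variable-free MRL expression into the flat MRL′ token sequence by preorder traversal with arity labels, and recovers the bracketed form from the arities:

    import re

    def tokenize(mrl):
        # Symbols, quoted strings, parentheses and commas of the MRL form.
        return re.findall(r"'[^']*'|[\w.]+|[(),]", mrl)

    def parse(tokens, pos=0):
        # Recursive-descent parse of one functional form into (symbol, children).
        sym, pos = tokens[pos], pos + 1
        children = []
        if pos < len(tokens) and tokens[pos] == '(':
            pos += 1
            while tokens[pos] != ')':
                child, pos = parse(tokens, pos)
                children.append(child)
                if tokens[pos] == ',':
                    pos += 1
            pos += 1
        return (sym, children), pos

    def to_mrl_prime(tree):
        # Pre-processing: preorder traversal; quoted strings become A@s, all
        # other symbols A@arity (constants are zero-argument functions).
        sym, children = tree
        tag = 's' if sym.startswith("'") else str(len(children))
        toks = ['%s@%s' % (sym.strip("'"), tag)]
        for child in children:
            toks += to_mrl_prime(child)
        return toks

    def to_mrl(tokens, pos=0):
        # Post-processing: each token's arity says how many of the following
        # subtrees are its arguments, so brackets and commas are recoverable.
        sym, tag = tokens[pos].rsplit('@', 1)
        if tag == 's':
            return "'%s'" % sym, pos + 1
        args, pos = [], pos + 1
        for _ in range(int(tag)):
            arg, pos = to_mrl(tokens, pos)
            args.append(arg)
        return (sym if not args else '%s(%s)' % (sym, ', '.join(args))), pos

    tree, _ = parse(tokenize("answer(area_1(cityid('seattle', _)))"))
    toks = to_mrl_prime(tree)
    print(' '.join(toks))   # answer@1 area_1@1 cityid@2 seattle@s _@0
    print(to_mrl(toks)[0])  # answer(area_1(cityid('seattle', _)))

On an ill-formed MRL′ sequence, to_mrl would run out of tokens (or leave tokens unconsumed); a full implementation would catch this case to detect ill-formed translations.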

3 Semantic Parsing with Enriched SCFG

In this section, we present the details of our enriched SCFG for semantic parsing.

3.1 Enriched SCFG

In hierarchical phrase-based (HPB) translation models, synchronous rules take the form X → ⟨γ, α, ∼⟩, where X is the non-terminal symbol, γ and α are strings of lexical items and non-terminals on the source and target side, respectively, and ∼ indicates the one-to-one correspondence between non-terminals in γ and α. From an aligned phrase pair in Figure 2(a), for example, we can get the synchronous rule X → ⟨state X[1], state@1 X[1]⟩, where we use bracketed indices to indicate which non-terminal occurrences are linked by ∼. The fact that SCFGs in HPB models contain only one type of non-terminal symbol [2] is responsible for ill-formed translations (e.g., answer@1 state@1). To this end, we enrich the non-terminals to capture tree structure information, guiding the translation in favor of well-formed translations. The enrichment of non-terminals is two-fold: first, it can handle MRL with a nested structure, so as to guarantee well-formed translations; second, related studies in SMT have shown that introducing multiple non-terminal symbols in SCFGs benefits translation (Zollmann and Venugopal, 2006; Li et al., 2012).

Given a word sequence e_i^j from position i to position j in MRL′, we enrich the non-terminal symbol X to reflect the internal structure of the word sequence e_i^j. A correct translation rule selection therefore not only maps source terminals into target terminals, but is also constrained and guided by the structure information in the non-terminals. As mentioned earlier, we regard the nested structure in MRL′ as a function-argument structure, where each function takes one or more arguments as input while its return value serves as an argument to the enclosing function. As in Figure 1, the function cityid takes two arguments and returns an argument to the function area_1. For a word sequence e_i^j, we examine its completeness, which is defined as:

Definition 1. A word sequence e_i^j is regarded as complete if it satisfies 1) every function (if any) meets its argument requirement; and 2) it can serve as one argument to another function.

We use the symbol C to label word sequences which are complete. For an incomplete word sequence, we examine 1) the number of arguments it requires on the right to be complete; and 2) the arity of the function it requires on the left to be complete. The sequence is then labeled as (C\Fm)/An, indicating that it requires n arguments on the right and a function with arity m on the left. [3] We omit \Fm and /An if m = 0 and n = 0, respectively. [4]

Table 1(a) shows examples of phrase pairs in our enriched SCFG. For instance, the word sequence stateid@1 texas@s is complete and is thus labeled as C. Similarly, the word sequence next_to_2@1 requires one argument on the right side to be complete, and is labeled as C/A1 accordingly.

(a) Examples of phrase pairs in enriched SCFG:
  C     → ⟨texas, stateid@1 texas@s⟩
  C\F2  → ⟨seattle, seattle@s _@0⟩
  C/A1  → ⟨that border, next_to_2@1⟩
  C/A1  → ⟨state that border, state@1 next_to_2@1⟩
  C/A1  → ⟨state C/A1[1], state@1 C/A1[1]⟩

(b) Examples of glue rules in enriched SCFG:
  straight: C/A1 → ⟨C/A1[1] C/A1[2], C/A1[1] C/A1[2]⟩
    e.g., [state] [that border] ↔ [state@1] [next_to_2@1]
  inverted: C → ⟨C[1] C/A1[2], C/A1[2] C[1]⟩
    e.g., [state that border texas] [have the highest population] ↔
          [largest_one@1 population_1@1] [state@1 next_to_2@1 stateid@1 texas@s]

Table 1: Examples of translation rules in enriched SCFG; in the glue rule examples, brackets mark the first and second sub-phrases (underlined and underwaved in the original typesetting).

When extracting translation rules from the aligned datasets, we follow Chiang (2007), except that we use enriched non-terminal symbols rather than X. Each translation rule is associated with a set of translation model features {φ_i}, including the phrase translation probability p(α|γ) and its inverse p(γ|α), the lexical translation probability p_lex(α|γ) and its inverse p_lex(γ|α), and a rule penalty that learns the preference for longer or shorter derivations.

[2] In practice, the non-terminal symbol S is used in glue rules. However, this is not relevant to the present discussion.
[3] This is similar to the naming convention in combinatory categorial grammar (CCG) (Steedman, 2000).
[4] If m = 0, no function is needed on the left.
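The enriched label of an MRL′ span can be computed directly from Definition 1 by scanning the span in preorder and tracking open argument slots. The sketch below is our own reconstruction under the paper's labeling scheme; it assumes the span is a contiguous fragment of a preorder token sequence:

    def enriched_label(tokens):
        # Compute the enriched non-terminal label (Definition 1) of an MRL'
        # span, assumed to be a contiguous fragment of a preorder sequence.
        roots = 0  # number of root-level trees spanned so far
        need = 0   # argument slots still open in the current tree
        for tok in tokens:
            arity = 0 if tok.endswith('@s') else int(tok.rsplit('@', 1)[1])
            if need == 0:
                roots += 1   # token starts a new root-level tree
            else:
                need -= 1    # token fills one pending argument slot
            need += arity    # the token itself opens `arity` new slots
        # m: arity of the function required on the left (0 if the span is a
        # single tree); n: number of arguments still required on the right.
        m, n = (0 if roots == 1 else roots), need
        label = 'C'
        if m:
            label += '\\F%d' % m
            if n:
                label = '(%s)' % label
        if n:
            label += '/A%d' % n
        return label

    print(enriched_label(['stateid@1', 'texas@s']))    # C
    print(enriched_label(['seattle@s', '_@0']))        # C\F2
    print(enriched_label(['state@1', 'next_to_2@1']))  # C/A1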
Inverted Glue Rules  In SMT decoding (Chiang, 2007), if no rule (e.g., a rule whose left-hand side is X) can be applied, or if the length of the potential source span exceeds a pre-defined limit (e.g., 10, as in Chiang (2007)), a glue rule (either S → ⟨X[1], X[1]⟩ or S → ⟨S[1] X[2], S[1] X[2]⟩) is used to simply stitch two consecutive translated phrases together in a monotone way. Although this reduces computational and modeling challenges, it obviously prevents some reasonable translation derivations, because in certain cases the order of the phrases may be inverted on the target side. In this work, we additionally use an inverted glue rule, which combines two non-terminals in a swapped way. Each glue rule, either straight or inverted, contains only two non-terminal symbols and is associated with two features: the phrase translation probability p(α|γ) and a glue rule penalty. Table 1(b) shows examples of a straight and an inverted glue rule. Moreover, these glue rules can be applied to any two neighboring translation nodes whose non-terminal symbols match.
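The following toy sketch (our illustration, with hypothetical data structures) shows how straight and inverted glue rules combine two neighboring translation hypotheses, with the label-matching check and the swapped target order of the inverted rule; the two rules and examples mirror Table 1(b):

    from collections import namedtuple

    # A partial hypothesis: covered source phrase, its MRL' translation,
    # and its enriched non-terminal label.
    Hyp = namedtuple('Hyp', 'src tgt label')

    def apply_glue(left, right, rule):
        # rule = (left label, right label, result label, inverted?); the glue
        # rule applies only if both non-terminal labels match its pattern.
        lab_l, lab_r, result, inverted = rule
        if (left.label, right.label) != (lab_l, lab_r):
            return None
        tgt = (right.tgt + ' ' + left.tgt) if inverted else (left.tgt + ' ' + right.tgt)
        return Hyp(left.src + ' ' + right.src, tgt, result)

    # The two glue rules of Table 1(b).
    straight = ('C/A1', 'C/A1', 'C/A1', False)  # C/A1 -> <C/A1[1] C/A1[2], C/A1[1] C/A1[2]>
    inverted = ('C', 'C/A1', 'C', True)         # C -> <C[1] C/A1[2], C/A1[2] C[1]>

    a = Hyp('state', 'state@1', 'C/A1')
    b = Hyp('that border', 'next_to_2@1', 'C/A1')
    print(apply_glue(a, b, straight).tgt)  # state@1 next_to_2@1

    c = Hyp('state that border texas',
            'state@1 next_to_2@1 stateid@1 texas@s', 'C')
    d = Hyp('have the highest population',
            'largest_one@1 population_1@1', 'C/A1')
    print(apply_glue(c, d, inverted).tgt)
    # largest_one@1 population_1@1 state@1 next_to_2@1 stateid@1 texas@s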

3.2 Word Alignment for Semantic Parsing

Word alignment is an essential step for rule extraction in SMT: recognizing that wo shi in Chinese is a good translation of I am in English requires establishing a correspondence between wo and I, and between shi and am. In the SMT community, researchers have developed standard, proven alignment tools such as GIZA++ (Och and Ney, 2003), which can be used to train IBM Models 1–5. However, there is one fundamental problem with the IBM models (Brown et al., 1993): each word on one side can be traced back to exactly one particular word on the other side (or to the null token, which indicates that the word aligns to no word on the other side). Figure 2(a) shows an example of GIZA++ alignment output in the source-to-target direction, from which we can see that each source word aligns to exactly one target word. Since the alignment of multiple target words to one source word is common in SMT, a standard trick is to run IBM model training in both directions. The two resulting word alignments can then be symmetrized, for instance by taking the intersection or the union of the alignment points. For example, Figure 2(b) shows GIZA++ alignment output in the target-to-source direction, while Figure 2(c) shows the symmetrization result with the widely used grow-diag-final-and strategy.

Although symmetrization of word alignments works for SMT, can it be applied to semantic parsing? There are reasons to be doubtful. Word alignment for semantic parsing differs from alignment for SMT in several important aspects, at least including:

1. It is intrinsically asymmetric: within the semantic formalism used in this paper, NL′ is often longer than MRL′, and commonly contains words which have no counterpart in MRL′.

2. Little training data is available. SMT alignment models are typically trained in an unsupervised fashion, inducing lexical correspondences from massive quantities of sentence-aligned bitexts.

Consequently, the symmetrization of word alignments may not work perfectly for semantic parsing. According to the word alignment in Figure 2(c), a phrase extractor will generate the phrase pair ⟨have the highest, largest_one@1⟩, which is non-intuitive. By contrast, the more useful and general phrase pair ⟨highest, largest_one@1⟩ is typically excluded, because largest_one@1 aligns to all of have, the, and highest. Similarly, another useful phrase pair, ⟨texas, texas@s⟩, is prohibited, since texas aligns to both stateid@1 and texas@s.

Ideally, a new semantic parsing aligner should be able to capture such semantic equivalence. Unfortunately, we are not aware of any research on alignment for semantic parsing, possibly due to a paucity of high quality, publicly available data from which to learn. Instead of developing a new alignment algorithm for semantic parsing, we make use of all the alignments shown in Figure 2. That is to say, we triple the training data, with each sentence pair having three alignments, i.e., the two directional alignments and the symmetrized alignment. [5] The advantages are two-fold: first, considering more possible alignments increases the phrase coverage, especially when the training data is small; second, including the alignments from both directions alleviates the error propagation caused by mis-aligned stop words (e.g., be and the in NL′, and stateid@1 in MRL′). As a result, the phrase extractor will include both of the phrase pairs ⟨highest, largest_one@1⟩ and ⟨texas, texas@s⟩. Our experiments show that using the combination of all three alignments achieves better performance than using any single alignment, or any combination of two. Moreover, we found that we could achieve performance comparable even to that with manual alignment.

[5] Combining multiple alignments from different alignment models usually improves translation performance in SMT (Tu et al., 2012). However, our preliminary experiments showed that this did not yield further improvement in semantic parsing, which in turn also demonstrates the difference between alignments for semantic parsing and for SMT.
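The tripling itself is simple bookkeeping; a minimal sketch (the field layout is our assumption, not the authors' pipeline format) emits each training pair once per alignment before rule extraction:

    def triple_bitext(pairs, s2t, t2s, gdfa):
        # pairs[i] = (nl_tokens, mrl_tokens); s2t[i], t2s[i] and gdfa[i] are
        # sets of (source_index, target_index) links for sentence i.
        tripled = []
        for i, (nl, mrl) in enumerate(pairs):
            for links in (s2t[i], t2s[i], gdfa[i]):
                tripled.append((nl, mrl, links))  # same pair, three alignments
        return tripled  # rule extraction then runs over all three copies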

NL′:  what state that border texas have the highest population
MRL′: answer@1 largest_one@1 population_1@1 state@1 next_to_2@1 stateid@1 texas@s

(a) word alignment in the source-to-target direction
(b) word alignment in the target-to-source direction
(c) symmetrization of the word alignments using the grow-diag-final-and strategy

Figure 2: Example of a sentence pair with different alignments. (The individual alignment links of panels (a)–(c) are not recoverable from the extracted text.)

3.3 Synthetic Translation Rules for Unknown Word Translation

Most NLP tasks face the problem of unknown words, especially when only little training data is available. For example, we estimate that 5.7% of the sentences in the (English) test data in our experiments contain unknown words. Unknown words usually remain intact in the translation in most machine translation systems (Koehn et al., 2007; Dyer et al., 2010), with the result that certain translations cannot be converted back to tree structures. Observing that in semantic parsing the translation of a word falls into one of two categories, 1) a token in MRL, or 2) null (i.e., not translated at all), we generate synthetic translation rules for unknown word translation.

As a baseline, we simply skip unknown words, as in Kwiatkowski et al. (2010), by adding translation rules that translate them to null in MRL′. Each such rule is accompanied by one feature indicating that it is a translation rule for an unknown word.

Alternatively, taking advantage of publicly available resources, we generate synthetic translation rules for unknown words, pivoted by their semantically close words. Algorithm 1 illustrates the process. Given an unknown word wu, it generates synthetic rules in two steps: 1) finding the top n (e.g., 5, as in our experiments) close words via Word2Vec; [6] and 2) generating synthetic translation rules based on these close words. Note that the algorithm may generate a synthetic rule with null on the target side, since the lexical translation table derived from the aligned training data contains translations to null. Each synthetic translation rule for an unknown word is associated with the three features returned by the function generate_rule.

Algorithm 1: Generating synthetic translation rules for unknown words
  Input:  unknown word wu in the source language;
          source-side training data vocabulary W;
          lexical translation tables T1 and T2 (two directions)
  Output: synthetic translation rule set R for wu
  1. foreach word wi in W
  2.     si = sim(wu, wi)
  3. get the top n words WB = {wb1 ... wbn} with the highest si
  4. R = {}
  5. foreach wbi in WB
  6.     foreach tj such that ⟨wbi, tj⟩ is in T1 and T2
  7.         R = R ∪ generate_rule(wu, wbi, tj, T1, T2)
  8. return R

  sim: returns the similarity between wu and wi.
  generate_rule: returns the rule ⟨wu, tj⟩ with one feature indicating the similarity between wu and wbi, and two features indicating the lexical translation probabilities from wbi to tj and the other way around.

[6] Word2Vec is available at http://code.google.com/p/word2vec/. We use Word2Vec rather than other linguistic resources like WordNet because the approach can easily be adapted to other languages, provided there is enough monolingual data to train Word2Vec models.
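A runnable sketch of Algorithm 1, with the similarity function passed in as a callable (e.g., cosine similarity over Word2Vec vectors, such as gensim's KeyedVectors.similarity); the table layout and names are our illustrative assumptions:

    def synthetic_rules(wu, vocab, t1, t2, similarity, n=5):
        # t1[w][t] and t2[w][t]: lexical translation probabilities in the two
        # directions; similarity(a, b): e.g. Word2Vec cosine similarity.
        # Step 1: top-n words in the training vocabulary closest to wu.
        close = sorted(vocab, key=lambda wi: similarity(wu, wi), reverse=True)[:n]
        # Step 2: one rule <wu, tj> per target token tj (possibly null) that a
        # close pivot word wb translates to, carrying three features: the
        # wu-wb similarity and the two lexical probabilities of the pivot.
        rules = []
        for wb in close:
            for tj in set(t1.get(wb, {})) | set(t2.get(wb, {})):
                feats = (similarity(wu, wb),
                         t1.get(wb, {}).get(tj, 0.0),
                         t2.get(wb, {}).get(tj, 0.0))
                rules.append((wu, tj, feats))
        return rules

Deduplication of rules that share the same target token (e.g., keeping the highest-scoring pivot) is left out for brevity, as the paper does not specify it.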

4 Experimentation

In this section, we test our approach on the GeoQuery dataset, which is publicly available.

4.1 Experimental Settings

Data  The GeoQuery dataset consists of 880 questions paired with their corresponding tree structured semantic representations. Following the experimental setup in Jones et al. (2012), we use 600 question pairs to train and tune our SMT decoder, and evaluate on the remaining 280. Note that there is another version of the GeoQuery dataset in which the semantic representations are annotated with lambda calculus expressions, and which has been extensively studied (Zettlemoyer and Collins, 2005; Wong and Mooney, 2007; Liang et al., 2011; Kwiatkowski et al., 2013). Performance on the lambda calculus version is higher than on the tree structured version; however, results obtained on the two versions are not directly comparable.

SMT Setting  We use cdec (Dyer et al., 2010) as our HPB decoder. As mentioned above, 600 instances are used to train and tune the decoder. To get fair results, we split the 600 instances into 10 folds of 60 instances each. Each fold in turn is used as the tuning data, while the other 540 instances and the NP list are used as training data. [7] We use the IRSTLM toolkit (Federico et al., 2008) to train a 5-gram LM on the MRL′ side of the training data, using modified Kneser-Ney smoothing. We use MIRA (Chiang et al., 2008) to tune the parameters of the system to maximize BLEU (Papineni et al., 2002). When extracting translation rules from the aligned training data, we include both tight and untight phrases.

[7] The NP list is from the GeoQuery dataset in Jones et al. (2012); it contains MRs for every noun phrase that appears in the NL utterances of each language. As in Andreas et al. (2013), the NP list is included by appending all entries as extra training sentences with 50 times the weight of regular training examples, to ensure that they are learned as translation rules.

Evaluation  We use the standard evaluation criteria: both the predicted MRL and the gold standard are executed against the database to obtain their respective answers. Specifically, we convert a translation from MRL′ into MRL (if one exists). The translation is then considered correct if and only if its MRL retrieves the same answers as the gold standard MRL (Jones et al., 2012), allowing for a fair comparison between our systems and previous work. As in Jones et al. (2012), we report accuracy, i.e., the percentage of translations with correct answers, and F1, i.e., the harmonic mean of precision (the proportion of correct answers out of translations with an answer) and recall (the proportion of correct answers out of all translations). In what follows, we report performance scores and analysis numbers averaged over our 10 SMT models.
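Spelled out, the bookkeeping implied by these definitions is as follows (a trivial sketch with our own variable names); note that, under these definitions, accuracy coincides with recall:

    def geoquery_scores(n_total, n_answered, n_correct):
        # n_total: all test sentences; n_answered: translations convertible to
        # an MRL that returns an answer; n_correct: answers matching the gold.
        accuracy = 100.0 * n_correct / n_total          # reported as Acc.
        precision = n_correct / n_answered
        recall = n_correct / n_total                    # equals Acc. here
        f1 = 100.0 * 2 * precision * recall / (precision + recall)
        return accuracy, f1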
  SCFG          alignment            Acc.   F1
  non-enriched  src2tgt              75.0   82.5
                tgt2src              78.5   82.5
                gdfa                 77.5   83.5
                src2tgt + tgt2src    81.4   85.0
                src2tgt + gdfa       77.1   83.1
                tgt2src + gdfa       80.6   83.9
                all                  81.5   85.2
                gold                 82.4   86.2
  enriched      src2tgt              76.3   82.9
                tgt2src              82.0   85.2
                gdfa                 78.9   83.9
                src2tgt + tgt2src    82.6   85.9
                src2tgt + gdfa       78.8   83.7
                tgt2src + gdfa       83.1   86.1
                all                  82.9   86.1
                gold                 84.1   87.1

Table 2: Performance of our non-enriched and enriched SCFG systems with different alignment settings.

4.2 Experimental Results

Table 2 shows the results of the non-enriched and enriched SCFG systems over different alignment settings. In Table 2, src2tgt and tgt2src denote the source-to-target and target-to-source alignment directions, respectively; gdfa denotes symmetrization of the alignments with the grow-diag-final-and strategy; src2tgt+tgt2src denotes doubling the training data with each sentence pair having both the src2tgt and tgt2src alignments, and similarly for src2tgt+gdfa and tgt2src+gdfa; all denotes tripling the training data with each sentence pair having all three alignments. Finally, gold denotes using gold alignment. [8]

[8] We manually aligned the sentence pairs in NL′ and MRL′.

Effect of Enriched SCFG  From Table 2, we observe that the enriched SCFG systems outperform the non-enriched SCFG systems over all alignment settings, indicating the effect of enriching the non-terminals. In particular, for tgt2src alignment, enrichment obtains improvements of 3.5% in accuracy and 2.7% in F1.

As mentioned earlier, the non-enriched SCFG system may produce ill-formed translations, which cannot be converted back to a tree structure. One natural way to overcome this issue, as in Andreas et al. (2013), would be to simply filter the n-best translations until a well-formed one is found. However, we see very limited performance changes in accuracy and F1, suggesting that the effect of using n-best translations is very limited: after using n-best translations, the non-enriched SCFG system with all alignment obtains 82.0 in accuracy (increased from 81.5) and 84.5 in F1 (reduced from 85.2).

  alignment  Rec.   Pre.   F1
  src2tgt    96.3   68.8   80.3
  tgt2src    90.0   77.0   83.0
  gdfa       95.1   62.7   75.6

Table 3: Alignment performance.

                                enriched + gdfa
                                correct   wrong
  non-enriched + gdfa  correct    211       6
                       wrong       10      53
  enriched + all       correct    215      17
                       wrong        6      42

Table 4: Confusion matrices of three SMT systems on English test sentences.

Effect of Word Alignment  With respect to the performance over the different alignment settings, we make the following observations from Table 2:

• Semantic parsing is substantially sensitive to alignment. Surprisingly, gdfa alignment, which is widely adopted in SMT, is inferior to tgt2src alignment. As expected, src2tgt alignment achieves the worst performance.

• Thanks to the increased coverage, doubling the training data (the rows src2tgt+tgt2src, src2tgt+gdfa, and tgt2src+gdfa) usually outperforms the corresponding single alignments. Moreover, tripling the training data (the rows all) achieves slightly better performance than any way of doubling the training data. This is expected, since the gdfa alignment is actually derived from the src2tgt and tgt2src alignments, so doubling the training data with src2tgt and tgt2src already includes most of the alignment links in gdfa.

• Our approach of tripling the training data achieves performance comparable to that with gold alignment, suggesting that instead of developing a brand new alignment algorithm for semantic parsing, we can simply make use of GIZA++ alignment output.

In terms of the src2tgt, tgt2src and gdfa alignments, the trend of the results is consistent over both the non-enriched and enriched SCFG systems: the systems with tgt2src alignment work best, while the systems with src2tgt alignment work worst. We next look at the non-enriched SCFG systems to explore the behavioral differences among the three alignments. We examine the alignment accuracy against the gold alignment on the training data (excluding the NP list part). As shown in Table 3, src2tgt has the highest recall while tgt2src has the highest precision. This is partly because: 1) in src2tgt alignment, each source word aligns to exactly one particular target word (or the null token), resulting in frequent alignment errors for source-side words that have no counterpart on the target side (for example, the words the and be, which play functional rather than semantic roles in NL, align to 15 different target words); and 2) except for a few words on the target side, such as stateid@1 and all@0, which have strong occurrence patterns (e.g., stateid@1 is always followed by a state name), each target word has a counterpart on the source side.

To gain a clearer understanding of the individual contributions of the enriched non-terminals and of the multiple word alignments, Table 4 presents two confusion matrices showing the numbers of sentences that are correctly/wrongly parsed by three SMT systems on the English test sentences. For example, 211 sentences are correctly parsed by both the non-enriched and the enriched SCFG systems with gdfa alignment. Moving from the performance of the non-enriched SMT system with gdfa alignment to that of the enriched SMT system with all alignment, we observe that on average more than half of the improvement comes from using multiple word alignments, and the rest from using enriched non-terminals.

Effect of Unknown Word Translation  Since each of our SMT models is actually trained on 540 instances (plus the NP list), the rate of unknown words in the test data tends to be higher than in a system trained on the whole 600 instances. Based on the enriched SCFG system with all alignment, Table 5 shows the results of applying unknown word translation. Translating all unknown words into null obtains 2.4 points in accuracy over the system without it (85.3 vs. 82.9). However, the slight improvement in F1 (86.3 vs. 86.1) suggests that in many cases translating an unknown word into null is incorrect. Fortunately, our semantic approach is partially able to generate correct translation rules for those unknown words which do have a translation in MRL′. The effect of our approach is, in fact, highly dependent on the quality of the close words found via Word2Vec.

  System                        Acc.   F1
  no unknown word translation   82.9   86.1
  null: baseline                85.3   86.3
  semantic: ours                86.3   87.1

Table 5: Performance with unknown word translation.

With a manual examination of the test data, we found that 11 out of all 17 unknown words should be translated into a corresponding token in MRL. For 8 of them, the synthetic translation rule set returned by Algorithm 1 contains correct translation rules.

Effect across Different Languages  We have also tested our approach on the same dataset in three other languages. Since we are not aware of public resources for finding semantically close words in German, Greek and Thai, we translate unknown words into null for these three languages. Table 6 shows the performance over the four languages: our approach, including the enriched SCFG, tripling the training data with three alignments, and unknown word translation, obtains consistent improvements over all four languages.

Decoding Time Analysis  We analyze the effect of our approach on decoding time, which is closely related to the size of the phrase tables. First, splitting the non-terminal X into enriched ones increases the size of the phrase tables. [9] This is not surprising, since a phrase with non-terminal X (e.g., the X on the source side) may be further specified as multiple phrases with various non-terminals (e.g., the C, the C/A1, etc.). As a result, the average number of phrases per sentence (in the English test data, hereafter) increases from 453 to 893, while the decoding time of the SMT decoder increases from 0.11 to 0.19 seconds per sentence on average. Second, using multiple alignments also leads to larger phrase tables, as illustrated by the increase of the average number of phrases per sentence from 893 to 2055, while the decoding time moves from 0.19 to 0.38 seconds per sentence on average. Finally, finding similar words via Word2Vec is quite fast, since the search is bounded by the vocabulary size of our training set. Thanks to the small number of unknown words, adding unknown word translation rules has a very limited impact on the size of the phrase tables, and consequently a negligible effect on decoding time.

[9] In cdec, we generate a phrase table for each sentence.

5 Related Work

While there has been substantial work on semantic parsing, we focus our discussion on several approaches (the SCFG approach, the hybrid tree approach, and others) that target variable-free semantic representations.

WASP (Wong and Mooney, 2006) was strongly influenced by SMT techniques. Although WASP also used multiple non-terminal symbols in its SCFG to guarantee well-formed translations, our work differs from theirs in at least three ways. First, we use a different inventory of non-terminal symbols, derived from the MRL parses in the GeoQuery dataset. Second, to avoid the issues caused by word alignment between NL and MRL, we triple the training data with each sentence pair having multiple alignments, whereas WASP used a sequence of productions to represent the MRL before running GIZA++. Third, we use the typical features of HPB SMT (e.g., phrase translation probabilities, lexical translation probabilities, a language model feature, etc.), while WASP used rule identity features. SMT-SemParse (Andreas et al., 2013) adapted standard SMT components for semantic parsing; the present work is based on theirs, with all the extensions detailed in Section 3.

HYBRIDTREE+ (Lu et al., 2008) learned a synchronous generative model which simultaneously generates an NL sentence and an MRL tree. tsVB (Jones et al., 2012) used tree transducers, which are similar to the hybrid tree structures, to learn a generative process under a Bayesian framework. RHT (Lu, 2014) defined distributions over relaxed hybrid tree structures that jointly represent both sentences and semantics. Most recently, f-RHT (Lu, 2015) introduced constrained semantic forests to improve the RHT model.

SCISSOR (Ge and Mooney, 2005) augmented syntactic parse trees with semantic information and then performed integrated semantic and syntactic parsing of NL sentences. KRISP (Kate and Mooney, 2006) used string classifiers to label substrings of an NL sentence with entities from the meaning representation. UBL (Kwiatkowski et al., 2010) performed semantic parsing with an automatically induced CCG lexicon.

  System                                  English       German        Greek         Thai
                                          Acc.   F1     Acc.   F1     Acc.   F1     Acc.   F1
  non-enriched + gdfa                     77.5   83.5   66.0   74.9   65.6   74.1   65.4   72.4
  non-enriched + all                      81.5   85.2   72.1   76.8   75.2   80.5   72.7   76.4
  enriched + gdfa                         78.9   83.9   66.7   74.6   67.8   76.1   68.5   74.1
  enriched + all                          82.9   86.1   75.4   79.5   76.5   81.2   75.2   77.9
  enriched + all + unknown word transl.   86.3   87.1   79.1   80.3   80.5   81.6   76.3   77.9

Table 6: Performance for the multilingual GeoQuery test set.

  System          English       German        Greek         Thai
                  Acc.   F1     Acc.   F1     Acc.   F1     Acc.   F1
  WASP            71.1   77.7   65.7   74.9   70.7   78.6   71.4   75.0
  SMT-SemParse    80.5   81.8   68.9   71.8   69.1   72.3   70.4   70.7
  HYBRIDTREE+     76.8   81.0   62.1   68.5   69.3   74.6   73.6   76.7
  tsVB            79.3   79.3   74.6   74.6   75.4   75.4   78.2   78.2
  RHT             83.6   83.6   74.3   74.3   78.2   78.2   79.3   79.3
  f-RHT           86.8   86.8   75.7   75.7   79.3   79.3   80.7   80.7
  UBL             82.1   82.1   73.6   73.7   75.0   75.0   66.4   66.4
  this work       86.3   87.1   79.1   80.3   80.5   81.6   76.3   77.9

Table 7: Performance comparison for the multilingual GeoQuery test set. The performance of WASP, HYBRIDTREE+, tsVB and UBL is taken from Jones et al. (2012).

Table 7 shows the evaluation results of our system, as well as those of several other comparable related works which share the same experimental setup as ours. We can observe from Table 7 that semantic parsing with SMT components gives competitive performance when all the extensions described in Section 3 are used. Specifically, it significantly outperforms the semantic parser with standard SMT components (Andreas et al., 2013). Our approach reports the best accuracy and F1 scores on English, German, and Greek. While we are able to obtain an improvement on Thai, the performance is still lower than those of RHT and TREETRANS, probably because of the low quality of the word alignment output between this Asian language and MRL.

6 Conclusion and Future Work

In this paper, we have presented an enriched SCFG approach for semantic parsing which realizes the potential of the SMT approach. The performance improvement comes from the extension of translation rules with informative symbols and increased coverage. Such an extension shares a similar spirit with the generalization of a CCG lexicon for CCG-based semantic parsers (Kwiatkowski et al., 2011; Wang et al., 2014). Experiments on benchmark data have shown that our model is competitive with previous work and achieves state-of-the-art performance across several different languages.

Recently, research on semantic parsing in open domains with weakly (or un-) supervised setups, under settings where the goal is to optimize the performance of certain downstream NLP tasks such as question answering, has received a significant amount of attention (Poon and Domingos, 2009; Clarke et al., 2010; Berant et al., 2013; Berant and Liang, 2014). One direction of our future work is to extend the current framework to support the generation of synthetic translation rules from weaker signals (e.g., from question-answer pairs), rather than from aligned parallel data.

We have also noticed recent advances in tree-based SMT. Applying string-to-tree or tree-to-tree translation models (Yamada and Knight, 2001; Shen et al., 2008) to semantic parsing would naturally resolve the inconsistent semantic structure issue, though such models require additional information to generate tree labels on the target side. However, due to the constraint that each target phrase needs to map to a syntactic constituent, phrase tables in tree-based translation models usually suffer from a low-coverage issue, especially when the training data is small. Therefore, another direction of our future work is to explore the specific problems that emerge when employing tree-based SMT systems for semantic parsing, and to provide solutions to them.

Acknowledgments

The authors would like to thank Zhaopeng Tu, S. Matthew English and the three anonymous reviewers for providing helpful suggestions, and also acknowledge Jacob Andreas for help in running SMT-SemParse. Junhui Li and Guodong Zhou were supported partially by NSFC grants No.61273320, No.61331011, No.61305088 and No.61401295. Wei Lu was supported by SUTD grant SRG ISTD 2013 064.

References

Jacob Andreas, Andreas Vlachos, and Stephen Clark. 2013. Semantic parsing as machine translation. In Proceedings of ACL 2013, pages 47–52.

Jonathan Berant and Percy Liang. 2014. Semantic parsing via paraphrasing. In Proceedings of ACL 2014, pages 1415–1425.

Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang. 2013. Semantic parsing on Freebase from question-answer pairs. In Proceedings of EMNLP 2013, pages 1533–1544.

Peter E. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2):263–313.

David Chiang, Yuval Marton, and Philip Resnik. 2008. Online large-margin training of syntactic and structural translation features. In Proceedings of EMNLP 2008, pages 224–233.

David Chiang. 2007. Hierarchical phrase-based translation. Computational Linguistics, 33(2):201–228.

James Clarke, Dan Goldwasser, Ming-Wei Chang, and Dan Roth. 2010. Driving semantic parsing from the world's response. In Proceedings of CoNLL 2010, pages 18–27.

Chris Dyer, Adam Lopez, Juri Ganitkevitch, Jonathan Weese, Ferhan Ture, Phil Blunsom, Hendra Setiawan, Vladimir Eidelman, and Philip Resnik. 2010. cdec: A decoder, alignment, and learning framework for finite-state and context-free translation models. In Proceedings of ACL 2010 System Demonstrations, pages 7–12.

Marcello Federico, Nicola Bertoldi, and Mauro Cettolo. 2008. IRSTLM: An open source toolkit for handling large scale language models. In Proceedings of Interspeech 2008, pages 1618–1621.

Ruifang Ge and Raymond Mooney. 2005. A statistical semantic parser that integrates syntax and semantics. In Proceedings of CoNLL 2005, pages 9–16.

Bevan Jones, Mark Johnson, and Sharon Goldwater. 2012. Semantic parsing with Bayesian tree transducers. In Proceedings of ACL 2012, pages 488–496.

Rohit J. Kate and Raymond J. Mooney. 2006. Using string-kernels for learning semantic parsers. In Proceedings of ACL-COLING 2006, pages 913–920.

Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of ACL 2007 Demo and Poster Sessions, pages 177–180.

Tom Kwiatkowski, Luke Zettlemoyer, Sharon Goldwater, and Mark Steedman. 2010. Inducing probabilistic CCG grammars from logical form with higher-order unification. In Proceedings of EMNLP 2010, pages 1223–1233.

Tom Kwiatkowski, Luke Zettlemoyer, Sharon Goldwater, and Mark Steedman. 2011. Lexical generalization in CCG grammar induction for semantic parsing. In Proceedings of EMNLP 2011, pages 1512–1523.

Tom Kwiatkowski, Eunsol Choi, Yoav Artzi, and Luke Zettlemoyer. 2013. Scaling semantic parsers with on-the-fly ontology matching. In Proceedings of EMNLP 2013, pages 1545–1556.

Junhui Li, Zhaopeng Tu, Guodong Zhou, and Josef van Genabith. 2012. Using syntactic head information in hierarchical phrase-based translation. In Proceedings of WMT 2012, pages 232–242.

Percy Liang, Michael I. Jordan, and Dan Klein. 2011. Learning dependency-based compositional semantics. In Proceedings of ACL 2011, pages 590–599.

Wei Lu, Hwee Tou Ng, Wee Sun Lee, and Luke S. Zettlemoyer. 2008. A generative model for parsing natural language to meaning representations. In Proceedings of EMNLP 2008, pages 783–792.

Wei Lu. 2014. Semantic parsing with relaxed hybrid trees. In Proceedings of EMNLP 2014, pages 1308–1318.

Wei Lu. 2015. Constrained semantic forests for improved discriminative semantic parsing. In Proceedings of ACL-IJCNLP 2015, pages 737–742.

Klaus Macherey, Franz Josef Och, and Hermann Ney. 2001. Natural language understanding using statistical machine translation. In Proceedings of EuroSpeech 2001, pages 2205–2208.

Franz Josef Och and Hermann Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics, 29(1):19–51.

Kishore A. Papineni, Salim Roukos, and Todd Ward. 1997. Feature-based language understanding. In Proceedings of EuroSpeech 1997, pages 1435–1438.

Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In Proceedings of ACL 2002, pages 311–318.

Hoifung Poon and Pedro Domingos. 2009. Unsupervised semantic parsing. In Proceedings of EMNLP 2009, pages 1–10.

Libin Shen, Jinxi Xu, and Ralph Weischedel. 2008. A new string-to-dependency machine translation algorithm with a target dependency language model. In Proceedings of ACL 2008, pages 577–585.

Mark Steedman. 2000. The Syntactic Process. The MIT Press.

Zhaopeng Tu, Yang Liu, Yifan He, Josef van Genabith, Qun Liu, and Shouxun Lin. 2012. Combining multiple alignments to improve machine translation. In Proceedings of COLING 2012: Posters, pages 1249–1260.

Adrienne Wang, Tom Kwiatkowski, and Luke Zettlemoyer. 2014. Morpho-syntactic lexical generalization for CCG semantic parsing. In Proceedings of EMNLP 2014, pages 1284–1295.

Yuk Wah Wong and Raymond Mooney. 2006. Learning for semantic parsing with statistical machine translation. In Proceedings of HLT-NAACL 2006, pages 439–446.

Yuk Wah Wong and Raymond Mooney. 2007. Learning synchronous grammars for semantic parsing with lambda calculus. In Proceedings of ACL 2007, pages 960–967.

Kenji Yamada and Kevin Knight. 2001. A syntax-based statistical translation model. In Proceedings of ACL 2001, pages 523–530.

Luke S. Zettlemoyer and Michael Collins. 2005. Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars. In Proceedings of UAI 2005, pages 658–666.

Andreas Zollmann and Ashish Venugopal. 2006. Syntax augmented machine translation via chart parsing. In Proceedings of WMT 2006, pages 138–141.
