A Statistical Model with Forest-to-Tree Algorithm for Semantic Parsing

Zhihua Liao
College of Teacher Education
Center for Faculty Development
Hunan Normal University
Changsha, China
[email protected]

Yan Xie
English Department
Foreign Studies College
Hunan Normal University
Changsha, China
[email protected]

Abstract

In this paper, we propose a novel supervised model for parsing natural language sentences into their formal semantic representations. This model treats sentence-to-λ-logical-expression conversion within the framework of statistical machine translation with a forest-to-tree algorithm. To make this work, we transform the λ-logical expression structure into a form suitable for the mechanics of statistical machine translation and useful for modeling. We show that our model is able to yield new state-of-the-art results on both standard datasets with simple features.

1 Introduction

Semantic parsers convert natural language (NL) sentences to logical forms (LFs) through a meaning representation language (MRL). Recent research has focused on learning such parsers directly from corpora made up of sentences paired with logical meaning representations (Artzi and Zettlemoyer, 2011, 2013; Liao and Zhang, 2013; Liao et al., 2015b,a; Lu et al., 2008; Lu and Ng, 2011; Krishnamurthy, 2016; Kwiatkowski et al., 2010, 2011; Zettlemoyer and Collins, 2005, 2007, 2009). The goal is to learn a grammar that can map new, unseen sentences onto their corresponding meanings, or logical expressions.

While these algorithms usually work well on a specific semantic formalism, it is not clear how well they could be applied to a different semantic formalism. In this paper, we propose a novel supervised approach to the semantic parsing task using the framework of statistical machine translation with a forest-to-tree algorithm. This method integrates both lexical acquisition and surface realization in a single framework. Inspired by the probabilistic forest-to-string generation algorithm (Lu and Ng, 2011) and the work of Wong and Mooney (2006; 2007a; 2007b) and Wong (2007) on learning for semantic parsing with statistical machine translation, our semantic parsing framework consists of two main components. First, it contains a lexical acquisition component, which is based on phrase alignments between natural language sentences and linearized semantic parses, given by an off-the-shelf phrase alignment model trained on a set of training examples. The extracted transformation rules form a synchronous context-free grammar (SCFG), for which a probabilistic model is learned to resolve parse ambiguity. The second component estimates the parameters of a probabilistic model; the parametric model is based on maximum entropy. The probabilistic model is trained on the same set of training examples in an unsupervised manner.

This paper is structured as follows. Section 2 describes how we build the framework of statistical machine translation with a forest-to-tree algorithm to develop a semantic parser, and Section 3 discusses the decoder. Section 4 then presents our experiments and reports the results. Finally, we conclude in Section 5.

2 The Semantic Parsing Model

We now present the algorithm for semantic parsing, which translates NL sentences into LFs using a reduction-based λ-SCFG. It is based on an extended version of a reduction-based SCFG (Lu and Ng, 2011). Given a set of training sentences paired with their correct logical forms, the main learning task is to induce a set of reduction-based λ-SCFG rules, which we call a lexicon, together with a probabilistic model for derivations. A lexicon defines the set of derivations that are possible, so the induction of the probabilistic model first requires a lexicon.

Therefore, the learning task can be separated into two sub-tasks: (1) the induction of a lexicon; (2) the induction of a probabilistic model, namely a maximum-entropy model.

2.1 Lexical Acquisition

We first introduce the grammar. Next, we present the generative model for grammar induction used to acquire the grammar rules.

Figure 1: The joint generative process of both the λ-meaning tree and its corresponding natural language sentence.

Grammar: We use a weighted λ-SCFG. The grammar rules have the form τ → ⟨hω, pλ, ∼⟩, where τ is the type associated with the sequence hω, which consists of natural language words intermixed with types, and pλ is the λ-production. The symbol ∼ denotes the one-to-one correspondence between nonterminal occurrences in hω and pλ. In particular, the symbol ˆ denotes the one-to-one correspondence between terminal occurrences in ĥω and p̂λ, where ĥω is an NL phrase and p̂λ is the LF translation of ĥω. We allow a maximum of two nonterminal symbols in each synchronous rule (Lu and Ng, 2011). This makes the grammar a binary λ-SCFG.

Figure 3: A phrase alignment based on a λ-hybrid tree.

Grammar Induction: We adopt a generative model for λ-hybrid trees, developed by Lu and Ng (2011), which models the mapping from λ-sub-expressions to word sequences with a joint generative process. Figure 1 describes the generative process for a sentence together with its corresponding λ-meaning tree. The process results in a λ-hybrid tree¹ (Lu et al., 2008), and Figure 2 gives part of an example λ-hybrid tree. Grammar rules are extracted from the λ-hybrid trees, and we can use the same grammar for both parsing and generation: since an SCFG is fully symmetric with respect to both generated strings, the chart used for parsing can easily be adapted for efficient generation. We now show how to use the generative model for mapping natural language sentences to λ-expressions. First, the model finds the Viterbi λ-hybrid trees for all training instances, based on the learned parameters of the generative λ-hybrid tree model. Next, the model extracts grammar rules on top of these λ-hybrid trees. Specifically, we extract the following three types of synchronous grammar rules: λ-hybrid sequence rules, subtree rules, and two-level λ-hybrid sequence rules. Table 1 gives examples.

1. λ-hybrid sequence rules: these conventional rules are constructed from one λ-production and its corresponding λ-hybrid sequence.

2. Subtree rules: these rules are constructed from a complete subtree of the λ-hybrid tree. A mapping between a complete sub-expression and a contiguous sub-sentence can be acquired from each rule.

3. Two-level λ-hybrid sequence rules: these rules are constructed from a tree fragment with one of its grandchild subtrees being abstracted by its type only. They are constructed via substitution and reductions; Table 2 shows how such a rule is built from a tree fragment of the λ-hybrid tree in Figure 2.

To ground our discussion, we use the phrase alignment in Figure 3 as an example. To represent the logical form, we use its linearized parse, i.e., a list of MRL productions that generate the logical form in top-down, left-most order. Since the MRL grammar is unambiguous, every logical form has a unique linearized parse.

¹The internal nodes of the λ-hybrid tree are called λ-productions, which are the building blocks of a λ-forest. Each λ-production in turn has at most two child λ-productions. A λ-production has the form τa : πa / τb, where τa is the expected type after type evaluation of the terms to its right, πa is a λ-expression, and τb are the types of the child λ-productions.
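As a concrete illustration of such a linearized parse, the following sketch (our own illustrative code, not part of the described system; the nested-tuple encoding of logical forms and the helper name are assumptions) lists the productions of a small logical form in top-down, left-most order:

```python
# Minimal sketch: linearize a logical form into the top-down, left-most
# sequence of MRL productions that generate it. The nested-tuple encoding
# ('predicate', arg1, arg2, ...) is assumed purely for illustration.

def linearize(term):
    """Return the productions visited in top-down, left-most order."""
    if not isinstance(term, tuple):        # a constant or variable, e.g. 'miss_r'
        return [term]
    head, *args = term
    productions = [f"{head}/{len(args)}"]  # the production applied at this node
    for arg in args:                       # then recurse over children, left to right
        productions.extend(linearize(arg))
    return productions

# λx.loc(miss_r, x) ∧ state(x), written without the outer λ-binder:
lf = ("and", ("loc", "miss_r", "x"), ("state", "x"))
print(linearize(lf))
# ['and/2', 'loc/2', 'miss_r', 'x', 'state/2', 'x']
```

Because the parse is unique, the same production sequence is recovered for a logical form regardless of how the sentence itself is phrased.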

Figure 2: One example λ-hybrid tree for the sentence "give me the states bordering states that the mississippi runs through" together with its logical form λx0.state(x0) ∧ ∃x1.[loc(miss_r, x1) ∧ state(x1) ∧ next_to(x1, x0)].
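The λ-hybrid tree in Figure 2 is built from λ-productions of the form τa : πa / τb described in the footnote above. A minimal sketch of such a node, with field names that are our own assumptions rather than the authors' implementation, could look as follows:

```python
# Minimal sketch (not the authors' code) of a λ-production node in a
# λ-hybrid tree: a typed λ-expression with at most two child productions
# and the NL words emitted at this node.
from dataclasses import dataclass, field
from typing import List

@dataclass
class LambdaProduction:
    out_type: str                    # τa, the expected type after type evaluation
    expression: str                  # πa, the λ-expression at this node
    children: List["LambdaProduction"] = field(default_factory=list)  # at most two
    words: List[str] = field(default_factory=list)  # NL words generated at this node

# One of the productions appearing in Figure 2:
node = LambdaProduction(out_type="<e,t>", expression="λg.λf.λx.g(x) ∧ f(x)")
```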

Type 1 (λ-hybrid sequence rules):
⟨e,⟨e,t⟩⟩ → ⟨bordering, λy.λx.next_to(x, y)⟩
⟨e,t⟩ → ⟨⟨e,t⟩2 ⟨e,t⟩1, λg.λf.λx.g(x) ∧ f(x) / ⟨e,t⟩1 / ⟨e,t⟩2⟩

Type 2 (subtree rules):
⟨e,t⟩ → ⟨states that the mississippi runs through, λx.loc(miss_r, x) ∧ state(x)⟩
⟨e,t⟩ → ⟨that the mississippi runs through, λx.loc(miss_r, x)⟩

Type 3 (two-level λ-hybrid sequence rules):
⟨e,t⟩ → ⟨the states bordering ⟨e,t⟩1, λf.λx.state(x) ∧ ∃y.[f(y) ∧ next_to(y, x)] / ⟨e,t⟩1⟩
⟨e,t⟩ → ⟨states that e1 runs through, λy.λx.loc(y, x) ∧ state(x) / e1⟩

Table 1: Example synchronous rules that can be extracted from the λ-hybrid tree.
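To make the lexicon entries concrete, the sketch below (our own illustration; the class and field names are hypothetical, not the authors' implementation) shows how one of the extracted rules in Table 1 might be stored:

```python
# Minimal sketch of how an extracted reduction-based λ-SCFG rule could be
# stored in the lexicon.
from dataclasses import dataclass
from typing import List, Tuple, Union

@dataclass
class SynchronousRule:
    lhs: str                               # type of the left-hand side, e.g. "<e,t>"
    nl: List[Union[str, Tuple[str, int]]]  # words mixed with indexed nonterminal slots
    lf: str                                # λ-expression template on the LF side
    weight: float = 0.0                    # later set by the probabilistic model

# The type-3 rule from Table 1:
# <e,t> -> <states that e1 runs through, λy.λx.loc(y, x) ∧ state(x) / e1>
rule = SynchronousRule(
    lhs="<e,t>",
    nl=["states", "that", ("e", 1), "runs", "through"],
    lf="λy.λx.loc(y, x) ∧ state(x) / e#1",
)
```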

Tree fragment: ⟨e,t⟩2 : λg.λf.λx.g(x) ∧ f(x) / ⟨e,t⟩1 / ⟨e,t⟩2, with children ⟨e,t⟩2 : λx.state(x) (covering "states") and ⟨e,t⟩1 : λy.λx.loc(y, x) / e1 (covering "that ... runs through").

Source: states that e1 runs through
Target:
(substitution) λy0.[λg.λf.λx.g(x) ∧ f(x) / [λy.λx.loc(y, x) / y0] / λx.state(x)] / e1
⇒ (two β-conversions) λy0.[λf.λx.loc(y0, x) ∧ f(x) / λx.state(x)] / e1
⇒ (β-conversion) λy0.λx.loc(y0, x) ∧ state(x) / e1
⇒ (α-conversion) λy.λx.loc(y, x) ∧ state(x) / e1
Rule: ⟨e,t⟩ → ⟨states that e1 runs through, λy.λx.loc(y, x) ∧ state(x) / e1⟩

Table 2: Construction of a two-level λ-hybrid sequence rule through substitution and reductions from a tree fragment. Note that the subtree rooted by e1 : miss_r gets "abstracted" by its type e. The auxiliary variable y0 of type e is thus introduced to facilitate the construction process.
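The derivation in Table 2 can be mimicked with ordinary function application. The sketch below (our own illustration; the tuple encoding of logical predicates is an assumption) plugs the two child productions into the parent and lets Python's evaluation play the role of the β-reductions:

```python
# Minimal sketch: reproduce the Table 2 target by substitution followed by
# β-reduction, using Python lambdas as stand-ins for typed λ-calculus terms.

combine = lambda g: lambda f: lambda x: ("and", g(x), f(x))  # <e,t>2 : λg.λf.λx.g(x) ∧ f(x)
loc     = lambda y: lambda x: ("loc", y, x)                  # <e,t>1 : λy.λx.loc(y, x)
state   = lambda x: ("state", x)                             # <e,t>2 : λx.state(x)

# Substitution: plug the children into the parent, abstracting the grandchild
# of type e as the auxiliary variable y0.
two_level = lambda y0: combine(loc(y0))(state)

# After β-reduction this is λy0.λx.loc(y0, x) ∧ state(x), i.e. the LF side of
# the rule  <e,t> -> <states that e1 runs through, λy.λx.loc(y, x) ∧ state(x) / e1>.
print(two_level("miss_r")("x0"))
# ('and', ('loc', 'miss_r', 'x0'), ('state', 'x0'))
```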

We assume the alignment to be n-to-1, where each word is linked to at most one MRL production. Basically, a reduction-based λ-SCFG grammar rule and a phrase alignment (Koehn et al., 2003) can be extracted from a λ-hybrid tree in which logical variables are explicitly bound by λ-operators. These grammar rules are extracted in a bottom-up manner, starting with MRL productions at the leaves of the λ-hybrid tree. Rule extraction continues in this manner until the root of the λ-hybrid tree is reached.

2.2 A Maximum-Entropy Model

Once a lexicon is acquired, the next task is to learn a probabilistic model for the semantic parser. We propose a maximum-entropy model that defines a conditional probability distribution over derivations d given the observed NL string w. The maximum-entropy model is an exponential model:

\[ P_\lambda(d \mid w) = \frac{1}{Z_\lambda(w)} \exp\Big(\sum_i \lambda_i f_i(d)\Big) \]

where the conditional probability P_λ(d | w) is proportional to the product of weights λ_i assigned to each feature f_i. A feature represents a certain characteristic of a derivation; in this case, the features are the number of times each transformation rule is used in a derivation. The function Z_λ(w), called a partition function, is a normalizing factor such that the conditional probabilities sum to one over all derivations that yield w. A consequence is that the feature weights λ_i can be any real numbers. In a maximum-entropy model, the generation of unseen words can be modeled using an extra feature, f_*(d), whose value is the number of words being skipped. Additional features that correspond to domain-specific word classes can be used for more fine-grained smoothing. The fact that these features may interact with each other is not a concern.

3 Decoding

Decoding with the maximum-entropy model can be done as follows:

\[ m^* = m\Big(\arg\max_{d \in D(G|w)} P_\lambda(d \mid w)\Big) = m\Big(\arg\max_{d \in D(G|w)} \exp\sum_i \lambda_i f_i(d)\Big) = m\Big(\arg\max_{d \in D(G|w)} \sum_i \lambda_i f_i(d)\Big) \]

This can be done in cubic time with respect to sentence length using the Viterbi algorithm. An Earley chart is used for keeping track of all derivations that are consistent with the input. The maximum conditional likelihood criterion is used for estimating the maximum-entropy model parameters λ_i, i.e., the conditional likelihood of the correct logical forms given w is maximized. This criterion is chosen because it is much easier to work with, and it allows for a form of discriminative learning that focuses on separating good parses from bad ones. A Gaussian prior (σ² = 1) is used for regularizing the model. Since gold-standard derivations are not available in the training data, correct derivations must be treated as hidden variables. To find a set of parameters λ* that locally maximizes the conditional likelihood, we use a version of improved iterative scaling (IIS) coupled with EM, which has been used for estimating probabilistic unification-based grammars. Unlike the fully-supervised case, the conditional likelihood is not concave with respect to λ, so the estimation algorithm is sensitive to the initial parameters. To assume as little as possible, λ is initialized to zero. The estimation algorithm requires statistics that depend on all possible derivations for a sentence or a sentence-MR pair. While it is not feasible to enumerate all derivations, a variant of the Inside-Outside algorithm can be used for efficiently collecting the required statistics. Only rules that are used in the best parses for the training set are retained in the final lexicon, and all other rules are discarded (Wong and Mooney, 2006, 2007b,a; Wong, 2007). This heuristic, commonly known as the Viterbi approximation, is used to improve accuracy, under the assumption that rules used in the best parses are the most accurate.
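To make the model and the decoding rule above concrete, the following sketch (our own code, not the authors' decoder; the feature names and weights are hypothetical) scores candidate derivations by their rule-usage counts and selects the argmax, which is exactly the quantity m* picks out once the partition function is dropped:

```python
import math

# Minimal sketch: score candidate derivations d of a sentence w with the
# log-linear model of Section 2.2 and pick the most likely one. A derivation
# is represented only by its feature counts f_i(d), here rule-usage counts.

def score(weights, features):
    """Unnormalized log-linear score: sum_i lambda_i * f_i(d)."""
    return sum(weights.get(name, 0.0) * count for name, count in features.items())

def best_derivation(weights, derivations):
    """argmax_d P_lambda(d | w); the partition function Z_lambda(w) cancels."""
    return max(derivations, key=lambda feats: score(weights, feats))

def probability(weights, derivations, d):
    """P_lambda(d | w) with explicit normalization over all derivations of w."""
    z = sum(math.exp(score(weights, feats)) for feats in derivations)
    return math.exp(score(weights, d)) / z

d1 = {"rule:states": 1, "rule:loc": 1}
d2 = {"rule:states": 2, "skipped_word": 1}
weights = {"rule:states": 0.7, "rule:loc": 1.2, "skipped_word": -0.5}
print(best_derivation(weights, [d1, d2]) == d1)      # True
print(round(probability(weights, [d1, d2], d1), 3))  # ~0.731
```

In the actual system the set of derivations D(G|w) is of course not enumerated explicitly but packed in the Earley chart described above.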

4 Experimental Setup

This section describes our experimental setup and the comparison of results. We follow the setup of Zettlemoyer and Collins (2007) and Kwiatkowski et al. (2010; 2011), including datasets, initialization, and systems, as reviewed below. Finally, we report the experimental results.

Datasets: We evaluate on two benchmark closed-domain datasets. GeoQuery is made up of natural language queries to a database of geographical information, while ATIS contains natural language queries to a flight booking system (Zettlemoyer and Collins, 2007). The Geo880 dataset has been split into a training set of 600 pairs and a test set of 280 pairs. The Geo250 dataset is a subset of Geo880 and is used for 10-fold cross-validation experiments with the same splits of this subset. The ATIS dataset is split into a 5000-example development set and a 450-example test set.

Initialization: For learning our algorithm, we use Och and Ney's (2003; 2004) GIZA++ implementation of IBM Model 5 for training word alignment models. IBM Models 1-4 are used for initializing the model parameters during training.

Systems: We compare this performance to recently-published and directly-comparable results. For GeoQuery, they include ZC07 (Zettlemoyer and Collins, 2007), λ-WASP (Wong and Mooney, 2007b; Wong, 2007), UBL (Kwiatkowski et al., 2010) and FUBL (Kwiatkowski et al., 2011). For ATIS, we report results from ZC07, UBL and FUBL.

(a) The Geo250 test set
system               Rec.   Pre.   F1
λ-WASP               75.6   91.8   82.9
UBL                  81.8   83.5   82.6
FUBL                 83.7   83.7   83.7
SMTFOREST2STRING     85.0   88.5   86.8

(b) The Geo880 test set
system               Rec.   Pre.   F1
ZC07                 86.1   91.6   88.8
UBL                  87.9   88.5   88.2
FUBL                 88.6   88.6   88.6
SMTFOREST2STRING     89.6   91.8   90.7

Table 3: Performance of Exact Match on the different GeoQuery test sets.

system               Rec.   Pre.   F1
ZC07                 74.4   87.3   80.4
UBL                  65.6   67.1   66.3
FUBL                 81.9   82.1   82.0
SMTFOREST2STRING     84.2   90.3   87.3

Table 4: Performance of Exact Match on the ATIS development set.

(a) Exact Match
system               Rec.   Pre.   F1
ZC07                 84.6   85.8   85.2
UBL                  71.4   72.1   71.7
FUBL                 82.8   82.8   82.8
SMTFOREST2STRING     84.2   88.0   86.1

(b) Partial Match
system               Rec.   Pre.   F1
ZC07                 96.7   95.1   95.9
UBL                  78.2   98.2   87.1
FUBL                 95.2   93.6   94.6
SMTFOREST2STRING     96.0   96.8   96.4

Table 5: Performance of Exact and Partial Matches on the ATIS test set.

Results: Tables 3-5 present all the results on the GeoQuery and ATIS domains. In all cases, our system achieves state-of-the-art recall and precision when compared to directly comparable systems, and it significantly outperforms ZC07, λ-WASP, UBL and FUBL. A major advantage of our algorithm over the other systems is that it does not require any prior knowledge of the NL syntax. Hence it is straightforward to apply this algorithm to other NL sentences for which training data is available.
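For reference, the Rec./Pre./F1 columns in Tables 3-5 can be computed as in the short sketch below; the definitions are the ones conventionally used in this line of work (an assumption, since the paper does not restate them), and the counts are made up:

```python
# Minimal sketch of the evaluation metrics reported in Tables 3-5, under the
# assumed standard definitions: precision = correct / parsed,
# recall = correct / total, and F1 is their harmonic mean.

def prf(correct: int, parsed: int, total: int):
    precision = correct / parsed if parsed else 0.0
    recall = correct / total if total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Made-up counts, purely to show the computation:
p, r, f1 = prf(correct=84, parsed=90, total=100)
print(f"Rec. {100*r:.1f}  Pre. {100*p:.1f}  F1 {100*f1:.1f}")  # Rec. 84.0  Pre. 93.3  F1 88.4
```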

5 Conclusion

This paper presents a novel supervised method for semantic parsing which adopts the framework of statistical machine translation with a forest-to-tree algorithm. Experiments on both benchmark datasets (i.e., GeoQuery and ATIS) show that our method achieves state-of-the-art performance.

Acknowledgments

We are grateful to the anonymous reviewers for their valuable feedback on an earlier version of this paper. This research was supported in part by the Foreign Language Teaching Research Project of National Universities (grant no. 2015HN0009B) and the Social Science Foundation of Hunan Province for Youth Program (grant no. 14YBA260).

References

Yoav Artzi and Luke Zettlemoyer. 2011. Bootstrapping semantic parsers from conversations. In the Conference on Empirical Methods in Natural Language Processing (EMNLP).

Yoav Artzi and Luke Zettlemoyer. 2013. Weakly supervised learning of semantic parsers for mapping instructions to actions. Transactions of the Association for Computational Linguistics (TACL).

Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT).

Jayant Krishnamurthy. 2016. Probabilistic models for learning a semantic parser lexicon. In the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT).

Tom Kwiatkowski, Luke Zettlemoyer, Sharon Goldwater, and Mark Steedman. 2010. Inducing probabilistic CCG grammars from logical form with higher-order unification. In the Conference on Empirical Methods in Natural Language Processing (EMNLP). Cambridge, MA.

Tom Kwiatkowski, Luke Zettlemoyer, Sharon Goldwater, and Mark Steedman. 2011. Lexical generalization in CCG grammar induction for semantic parsing. In the Conference on Empirical Methods in Natural Language Processing (EMNLP). Edinburgh, UK.

Zhihua Liao, Qixian Zeng, and Qiyun Wang. 2015a. Semantic parsing via ℓ0-norm-based alignment. In Recent Advances in Natural Language Processing (RANLP). pages 355–361.

Zhihua Liao, Qixian Zeng, and Qiyun Wang. 2015b. A supervised semantic parsing with lexical extension and syntactic constraint. In Recent Advances in Natural Language Processing (RANLP). pages 362–370.

Zhihua Liao and Zili Zhang. 2013. Learning to map Chinese sentences to logical forms. In the 7th International Conference on Knowledge Science, Engineering and Management (KSEM). pages 463–472.

Wei Lu and Hwee Tou Ng. 2011. A probabilistic forest-to-string model for language generation from typed lambda calculus expressions. In the Conference on Empirical Methods in Natural Language Processing (EMNLP).

Wei Lu, Hwee Tou Ng, Wee Sun Lee, and Luke S. Zettlemoyer. 2008. A generative model for parsing natural language to meaning representations. In the Conference on Empirical Methods in Natural Language Processing (EMNLP).

Franz Josef Och and Hermann Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics 29(1):19–51.

Franz Josef Och and Hermann Ney. 2004. The alignment template approach to statistical machine translation. Computational Linguistics 30:417–449.

Yuk Wah Wong. 2007. Learning for Semantic Parsing and Natural Language Generation Using Statistical Machine Translation Techniques. Ph.D. thesis, Department of Computer Sciences, University of Texas at Austin, Austin, TX.

Yuk Wah Wong and Raymond J. Mooney. 2006. Learning for semantic parsing with statistical machine translation. In the Human Language Technology Conference of the North American Association for Computational Linguistics (NAACL).

Yuk Wah Wong and Raymond J. Mooney. 2007a. Generation by inverting a semantic parser that uses statistical machine translation. In the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT-07).

Yuk Wah Wong and Raymond J. Mooney. 2007b. Learning synchronous grammars for semantic parsing with lambda calculus. In the Conference of the Association for Computational Linguistics (ACL).

Luke S. Zettlemoyer and Michael Collins. 2005. Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars. In the 21st Conference on Uncertainty in Artificial Intelligence (UAI). pages 658–666.

Luke S. Zettlemoyer and Michael Collins. 2007. Online learning of relaxed CCG grammars for parsing to logical form. In the Conference on Empirical Methods in Natural Language Processing and the Conference on Computational Natural Language Learning (EMNLP-CoNLL). pages 678–687.

Luke S. Zettlemoyer and Michael Collins. 2009. Learning context-dependent mappings from sentences to logical form. In Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP). pages 976–984.