Arxiv:2009.14116V1 [Cs.CL] 29 Sep 2020

A Survey on Semantic Parsing from the Perspective of Compositionality Pawan Kumar Srikanta Bedathur Indian Institute of Technology Delhi, Indian Institute of Technology Delhi, New Delhi, Hauzkhas-110016 New Delhi, Hauzkhas-110016 [email protected] [email protected] Abstract compositionality: evident by the choice of the formal language and the construction mechanism and Different from previous surveys in seman- lexical variation: grounding the words/phrases to tic parsing (Kamath and Das, 2018) and appropriate KB entity/relation. knowledge base question answering(KBQA) (Chakraborty et al., 2019; Zhu et al., 2019; Language Compositionality: According to Hoffner¨ et al., 2017) we try to takes a dif- Frege’s principle of compositionality: “the ferent perspective on the study of semantic meaning of a whole is a function of the meaning parsing. Specifically, we will focus on (a) meaning composition from syntactical struc- of the parts. The study of formal semantics ture (Partee, 1975), and (b) the ability of for natural language to appropriately represent semantic parsers to handle lexical variation sentence meaning has a long history (Szab, 2017). given the context of a knowledge base (KB). Most semantic formalism follow the tradition of In the following section after an introduction Montague’s grammar (Partee, 1975) i.e. there is of the field of semantic parsing and its uses a one-to-one correspondence between syntax and in KBQA, we will describe meaning represen- semantics e.g. CCG (Zettlemoyer and Collins, tation using grammar formalism CCG (Steed- 2005). We will not be delving into the concept of man, 1996). We will discuss semantic composition using formal languages in2. In section3 semantic representation in formal language in this we will consider systems that uses formal lan- survey. guages e.g. λ-calculus (Steedman, 1996), λ- Lexical Variance: Lexical variation in human DCS (Liang, 2013). Section4 and5 consider semantic parser using structured-language for language is huge. Differences in the surface form logical form. Section6 is on different bench- of words in the natural language and the label of the mark dataset ComplexQuestions (Bao et al., corresponding entity/relation in the KB is mainly 2016) and GraphQuestions (Su et al., 2016) due to the polysemy. For example attend may be that can be used to evaluate semantic parser on referred to by label Education (Berant et al., 2013). their ability to answer complex questions that Similarly paraphrases of a sentence may have dif- are highly compositional in nature. ferent phrases to mean the same thing, e.g. ‘What 1 Introduction is your profession’, ‘What do you do for a living’. ‘What is your source of earning’ all these variation arXiv:2009.14116v1 [cs.CL] 29 Sep 2020 One of the main challenge in Knowledge Base may points to label profession (Berant and Liang, Question Answering (KBQA) is semantic parsing - 2014). the construction of a complete, formal, symbolic, The two challenges are elegantly summed up meaning representation (MR) of a sentence (Wong in a function p = f(a; b; R; K) by Mitchell and and Mooney, 2006). Most commonly used formal Lapata(2010) i.e. the combined meaning of sym- frameworks use a combination of λ-calculus and bol a and b is function of lexicon a and lexicon b first order logic (FOL) e.g. CCG (Zettlemoyer and under the syntactic relation R and the context K. Collins, 2005), λ-DCS (Liang, 2013). The logical- We propose that the context K be the knowledge expression further needs to be grounded in a knowl- base (KB). KBQA provides an appropriate ground edge base (KB), in the case of KBQA, the challenge for testing different semantic parsing approaches here is lexical variation. The two main challenges empirically. There are some KBQA systems which that any KBQA system has to tackle are language use semantic parser as a module in their pipeline (Reddy et al., 2014; Cheng et al., 2017) where the parse the natural language sentence into functional purview of semantic parsing is to get to the logical- composition of meaningful parts greatly affects its expression, and a downstream process takes up capacity (expressiveness). Such a system is known lexicon grounding or disambiguation. There are as Semantic Parser. some systems which don’t have such seperations e.g. SEMPRE(Berant et al., 2013). However, both 2.1 Formal Language type of systems do resort to some formal language First order predicate logic (FOPL) can be used to or intermediate logical form. We exclude KBQA represent meaning of natural language sentence, systems which use non-symbolic representation however it fails to represent some concepts in the (Cohen et al., 2020). natural language, e.g. “How many primes are less We describe here some terminology commonly than 10?” (Liang, 2016) - FOPL doesn’t have a used in the study of semantic parsing (Diefenbach function to count the number of elements. The for- et al., 2018; Kamath and Das, 2018). 1. Intermedi- mal semantics can use a higher-order language e.g. ate logical form: represents the complete meaning λ-calculus, say a higher order function count ex- of the natural language using formal language e.g. ists, that can count the number of elements in a set. λ-calculus (Zettlemoyer and Collins, 2005; Haki- Thus we can represent the previous questions as mov et al., 2015), λ-DCS (Liang, 2013) or struc- count(λx.prime(x) ^ less(x; 10)). Without go- tured language Yih et al.(2015); Hu et al.(2018). ing into further details of the formal language, let’s This is the main output of semantic parsing. 2. consider an example here showing compositional phrase-mapping: mapping phrases in the question use of λ-calculus to represent the meaning of a to their corresponding resource in the KB is re- complex sentence. E.g. the sentence “Those who quired to provide a real-word context to the inter- had children born in Seattle.” Liang(2013). mediate logical form. This process is also called grounding of the logical form, thus obtaining a λx.9y:children(x; y) ^ P laceOfBirth(y; Seattle) grounded logical form. 3. Disambiguation: of Take another example showing coordination in many resources obtained in the phrase-mapping λ-calculus ”Sqaure blue or round yellow pillow” process only a few will be right according to the (Artzi et al., 2013) which is represented as semantic of the natural language. 4. Query construction: Querying the KB-endpoint requires trans- λx.pillow(x) ^ (square(x) ^ blue(x))_ lation of the grounded logical form into a query lan- (round(x) ^ yellow(x)) guage e.g. SPARQL. Translation from grounded logical form to the query language is a determinis- Many semantic parsing systems use only part of the tic process. operators available in λ-calculus thus they are lim- ited in expressiveness by their choice of operators, 2 Compositionality in Formal Semantics not by the choice of formal language. Considering the compositionality in the natural 2.2 Structured language language in the sense that meaning of the whole A graph-structured logical form or a tree-structured sentence is constructed from meaning of its parts. logical form can also be used to represent the mean- According to Pelletier(2011) this is a composi- ing of a natural language sentence. tionality in the “functional sense”: something is compositional if it is a complex thing with some Graph-structured language E.g. Semantic property that can be defined in terms of a func- Query Graph(SQG) (Hu et al., 2018), has nodes tion of the same property of its parts (with due representing constant/values and edges represent- consideration to the way the parts are combined). ing relation. The edges could be seen as analo- In formal semantics the complex things is a syn- gous to the binary relation of the logical formalism. tactically complex sentence and the property of This definition of SQG directly corresponds to the interest is meaning, while combining the parts due many knowledge graphs K like dbpedia (Auer et al., consideration has to be given to how those parts 2007), freebase (Bollacker et al., 2008). With four are syntactically present in the complex sentence. primitive operations to manipulate a graph struc- The choice of a formal language to represent mean- ture: connect and merge that operate on pair of ing of a complex sentence in a system trying to nodes, expand and fold that operate on single node (Hu et al., 2018), and with higher-order functions 3 Logic Based Formalism attached to nodes (Yih et al., 2015), graph-structure makes for good candidate for logical form of a nat- Many semantic parsing systems use higher-order ural language sentence e.g. Figure1. formal logic to represent meaning of natural language sentence e.g. λ−calculus (Zettlemoyer and λ− less Collins, 2005), DCS (Berant et al., 2013; Be- 10 Prime Number Count rant and Liang, 2014). A first order predicate logic can only express simple natural language sentence Figure 1: logical form for “How many primes are less of type yes/no or one that seek set of elements ful- than 10?” as graph-structure filling a logical expression. To operate on set of elements the formal languages are augmented with Tree-structured language Logical languages higer-order function, eg. count(A) that would re- with tree-hierarchy can represent such hierarchy turn cardinality of set A. λ−Calculus Carpenter of natural language as well. FunQL Cheng et al. (1997) is a higher order functional language, it is (2017); Zelle and Mooney(1996a); Kate et al. more expressive, it can represent natural language (2005) is a variable free functional language en- constructs like count, superlative etc.

Arxiv:2009.14116V1 [Cs.CL] 29 Sep 2020

Semantic Parsing for Single-Relation Question Answering

Semantic Parsing with Semi-Supervised Sequential Autoencoders

SQL Generation from Natural Language: a Sequence-To- Sequence Model Powered by the Transformers Architecture and Association Rules

Semantic Parsing with Neural Hybrid Trees

Learning Logical Representations from Natural Languages with Weak Supervision and Back-Translation

A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

Neural Semantic Parsing Graham Neubig

Practical Semantic Parsing for Spoken Language Understanding

Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training

On the Use of Parsing for Named Entity Recognition

Neural Semantic Parsing for Syntax-Aware Code Generation

Learning to Learn Semantic Parsers from Natural Language Supervision