Universal Semantic Parsing

Siva Reddy†∗  Oscar Täckström‡  Slav Petrov‡  Mark Steedman††  Mirella Lapata††
†Stanford University  ‡Google Inc.  ††University of Edinburgh
[email protected], {oscart, slav}@google.com, {steedman, mlap}@inf.ed.ac.uk

Abstract

Universal Dependencies (UD) offer a uniform cross-lingual syntactic representation, with the aim of advancing multilingual applications. Recent work shows that semantic parsing can be accomplished by transforming syntactic dependencies to logical forms. However, this work is limited to English, and cannot process dependency graphs, which allow handling complex phenomena such as control. In this work, we introduce UDEPLAMBDA, a semantic interface for UD, which maps natural language to logical forms in an almost language-independent fashion and can process dependency graphs. We perform experiments on question answering against Freebase and provide German and Spanish translations of the WebQuestions and GraphQuestions datasets to facilitate multilingual evaluation. Results show that UDEPLAMBDA outperforms strong baselines across languages and datasets. For English, it achieves a 4.9 F1 point improvement over the state of the art on GraphQuestions.

arXiv:1702.03196v4 [cs.CL] 28 Aug 2017

1 Introduction

The Universal Dependencies (UD) initiative seeks to develop cross-linguistically consistent annotation guidelines as well as a large number of uniformly annotated treebanks for many languages (Nivre et al., 2016). Such resources could advance multilingual applications of parsing, improve comparability of evaluation results, enable cross-lingual learning, and more generally support natural language understanding.

Seeking to exploit the benefits of UD for natural language understanding, we introduce UDEPLAMBDA, a semantic interface for UD that maps natural language to logical forms, representing underlying predicate-argument structures, in an almost language-independent manner. Our framework is based on DEPLAMBDA (Reddy et al., 2016), a recently developed method that converts English Stanford Dependencies (SD) to logical forms. The conversion process is illustrated in Figure 1 and discussed in more detail in Section 2. Whereas DEPLAMBDA works only for English, UDEPLAMBDA applies to any language for which UD annotations are available.¹ Moreover, DEPLAMBDA can only process tree-structured inputs, whereas UDEPLAMBDA can also process dependency graphs, which allow handling complex constructions such as control. The different treatments of various linguistic constructions in UD compared to SD also require different handling in UDEPLAMBDA (Section 3.3).

Our experiments focus on Freebase semantic parsing as a testbed for evaluating the framework's multilingual appeal. We convert natural language to logical forms which in turn are converted to machine-interpretable formal meaning representations for retrieving answers to questions from Freebase. To facilitate multilingual evaluation, we provide translations of the English WebQuestions (Berant et al., 2013) and GraphQuestions (Su et al., 2016) datasets to German and Spanish. We demonstrate that UDEPLAMBDA can be used to derive logical forms for these languages using a minimal amount of language-specific knowledge. Aside from developing the first multilingual semantic parsing tool for Freebase, we also experimentally show that UDEPLAMBDA outperforms strong baselines across languages and datasets. For English, it achieves the strongest result to date on GraphQuestions, with competitive results on WebQuestions. Our implementation and translated datasets are publicly available at https://github.com/sivareddyg/udeplambda.

∗ Work done at the University of Edinburgh.
¹ As of v1.3, UD annotations are available for 47 languages at http://universaldependencies.org.

[Figure 1(a): The dependency tree for "Disney won an Oscar for the movie Frozen" in the Universal Dependencies formalism.]

2 DEPLAMBDA
Before describing UDEPLAMBDA, we provide an overview of DEPLAMBDA (Reddy et al., 2016), on which our approach is based. DEPLAMBDA converts a dependency tree to its logical form in three steps: binarization, substitution, and composition, each of which is briefly outlined below. Algorithm 1 describes the steps of DEPLAMBDA in lines 4-6, whereas lines 2 and 3 are specific to UDEPLAMBDA.

[Figure 1(b): The binarized s-expression for the dependency tree: (nsubj (nmod (dobj won (det Oscar an)) (case (det (compound Frozen movie) the) for)) Disney)]

[Figure 1(c): The composed lambda-calculus expression: λx.∃yzw. won(x_e) ∧ Disney(y_a) ∧ Oscar(z_a) ∧ Frozen(w_a) ∧ movie(w_a) ∧ arg1(x_e, y_a) ∧ arg2(x_e, z_a) ∧ nmod.for(x_e, w_a)]

[Figure 1: The mapping of a dependency tree to its logical form with the intermediate s-expression.]

Binarization  A dependency tree is first mapped to a Lisp-style s-expression indicating the order of semantic composition. Figure 1(b) shows the s-expression for the sentence Disney won an Oscar for the movie Frozen, derived from the dependency tree in Figure 1(a). Here, the sub-expression (dobj won (det Oscar an)) indicates that the logical form of the phrase won an Oscar is derived by composing the logical form of the label dobj with the logical form of the word won and the logical form of the phrase an Oscar, derived analogously. The s-expression can also be interpreted as a binarized tree with the dependency label as the root node, and the left and right expressions as subtrees. A composition hierarchy is employed to impose a strict traversal ordering on the modifiers of each head in the dependency tree. As an example, won has three modifiers in Figure 1(a), which according to the composition hierarchy are composed in the order dobj > nmod > nsubj. In constructions like coordination, this ordering is crucial to arrive at the correct semantics. Lines 7-17 in Algorithm 1 describe the binarization step.
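As an illustrative sketch (not the authors' implementation), binarization can be mimicked by sorting each head's modifiers with a composition hierarchy and flattening the tree into an s-expression. The hierarchy values below are assumptions chosen to reproduce the paper's dobj > nmod > nsubj ordering:

```python
# Toy binarization: flatten a dependency tree into a Lisp-style s-expression.
# Labels with lower priority values compose closer to the head.
HIERARCHY = {"dobj": 0, "det": 0, "nmod": 1, "case": 1, "nsubj": 2}

def binarize(head, children):
    """children: list of (label, subtree) pairs; subtree is (head, children)."""
    expr = head
    for label, (child_head, child_children) in sorted(
            children, key=lambda lc: HIERARCHY.get(lc[0], 0)):
        # Wrap the running expression and the binarized child under the label.
        expr = "({} {} {})".format(label, expr,
                                   binarize(child_head, child_children))
    return expr

# Simplified version of the paper's example (the "the movie Frozen" NP is
# collapsed to a single node here for brevity).
tree = ("won", [
    ("nsubj", ("Disney", [])),
    ("dobj", ("Oscar", [("det", ("an", []))])),
    ("nmod", ("Frozen", [("case", ("for", []))])),
])
print(binarize(*tree))
# (nsubj (nmod (dobj won (det Oscar an)) (case Frozen for)) Disney)
```

Sorting before flattening is what makes the traversal order deterministic, which matters for constructions like coordination.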

Substitution  Each symbol in the s-expression is substituted for a lambda expression encoding its semantics. Words and dependency labels are assigned different types of expressions. In general, words have expressions of the following kind:

ENTITY ⇒ λx. word(x_a);  e.g. Oscar ⇒ λx. Oscar(x_a)
EVENT ⇒ λx. word(x_e);  e.g. won ⇒ λx. won(x_e)
FUNCTIONAL ⇒ λx. TRUE;  e.g. an ⇒ λx. TRUE

Here, the subscripts ·a and ·e denote the types of individuals (Ind) and events (Event), respectively, whereas x denotes a paired variable (x_a, x_e) of type Ind × Event. Roughly speaking, proper nouns and adjectives invoke ENTITY expressions, verbs and adverbs invoke EVENT expressions, and common nouns invoke both ENTITY and EVENT expressions (see Section 3.3), while the remaining words invoke FUNCTIONAL expressions. DEPLAMBDA enforces the constraint that every s-expression is of the type η = Ind × Event → Bool, which simplifies the type system considerably.

Expressions for dependency labels glue the semantics of heads and modifiers to articulate predicate-argument structure. These expressions in general take one of the following forms:

COPY ⇒ λfgx.∃y. f(x) ∧ g(y) ∧ rel(x, y);  e.g. nsubj, dobj, nmod, advmod
INVERT ⇒ λfgx.∃y. f(x) ∧ g(y) ∧ rel_i(y, x);  e.g. amod, acl
MERGE ⇒ λfgx. f(x) ∧ g(x);  e.g. compound, appos, amod, acl
HEAD ⇒ λfgx. f(x);  e.g. case, punct, aux, mark

As an example of COPY, consider the lambda expression for dobj in (dobj won (det Oscar an)): λfgx.∃y. f(x) ∧ g(y) ∧ arg2(x_e, y_a). This expression takes two functions f and g as input, where f represents the logical form of won and g represents the logical form of an Oscar. The predicate-argument structure arg2(x_e, y_a) indicates that the arg2 of the event x_e, i.e. won, is the individual y_a, i.e. the entity Oscar. Since arg2(x_e, y_a) mimics the dependency structure dobj(won, Oscar), we refer to the expression kind evoked by dobj as COPY.

Expressions that invert the dependency direction are referred to as INVERT (e.g. amod in running horse); expressions that merge two subexpressions without introducing any relation predicates are referred to as MERGE (e.g. compound in movie Frozen); and expressions that simply return the parent expression semantics are referred to as HEAD (e.g. case in for Frozen). While this generalization applies to most dependency labels, several labels take a different logical form not listed here, some of which are discussed in Section 3.3. Sometimes the mapping of a dependency label to a lambda expression may depend on surrounding part-of-speech tags or dependency labels. For example, amod acts as INVERT when the modifier is a verb (e.g. in running horse), and as MERGE when the modifier is an adjective (e.g. in beautiful horse).² Lines 26-32 in Algorithm 1 describe the substitution procedure.

² We use Tregex (Levy and Andrew, 2006) for substitution mappings and Cornell SPF (Artzi, 2013) as the lambda-calculus implementation. For example, in running horse, the tregex /label:amod/=target < /postag:verb/ matches amod to its INVERT expression λfgx.∃y. f(x) ∧ g(y) ∧ amod_i(y_e, x_a).

Composition  The final logical form is computed by beta-reduction, treating expressions of the form (f x y) as the function f applied to the arguments x and y. For example, (dobj won (det Oscar an)) results in λx.∃z. won(x_e) ∧ Oscar(z_a) ∧ arg2(x_e, z_a) when the expression for dobj is applied to those for won and (det Oscar an). Figure 1(c) shows the logical form for the s-expression in Figure 1(b). The binarized s-expression is recursively converted to a logical form as described in lines 18-25 in Algorithm 1.

Algorithm 1: UDEPLAMBDA Steps

 1 Function UDepLambda(depTree):
 2   depGraph = Enhancement(depTree)          # see Figure 2(a) for a depGraph
 3   bindedTree = SplitLongDistance(depGraph)  # see Figure 2(b) for a bindedTree
 4   binarizedTree = Binarization(bindedTree)  # see Figure 1(b) for a binarizedTree
 5   logicalForm = Composition(binarizedTree)
 6   return logicalForm

 7 Function Binarization(tree):
 8   parent = GetRootNode(tree)
 9   {(label1, child1), (label2, child2), ...} = GetChildNodes(parent)
10   sortedChildren = SortUsingLabelHierarchy({(label1, child1), (label2, child2), ...})
11   binarizedTree.root = parent
12   for (label, child) ∈ sortedChildren:
13     temp.root = label
14     temp.left = binarizedTree
15     temp.right = Binarization(child)
16     binarizedTree = temp
17   return binarizedTree

18 Function Composition(binarizedTree):
19   mainLF = Substitution(binarizedTree.root)
20   if binarizedTree has left and right children:
21     leftLF = Composition(binarizedTree.left)
22     rightLF = Composition(binarizedTree.right)
23     mainLF = BetaReduce(mainLF, leftLF)
24     mainLF = BetaReduce(mainLF, rightLF)
25   return mainLF

26 Function Substitution(node):
27   logicalForms = [ ]
28   for (tregexRule, template) ∈ substitutionRules:
29     if tregexRule.match(node):
30       lf = GenLambdaExp(node, template)
31       logicalForms.add(lf)
32   return logicalForms
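The substitution and composition steps can be sketched in miniature with ordinary Python closures standing in for typed lambda terms (our simplification, not the paper's Tregex/SPF machinery): a logical form is a function from a variable name to a set of predicate strings, and beta-reduction is plain function application.

```python
# Toy substitution/composition: logical forms are functions from a paired
# variable name x to a set of predicate strings; fresh existential
# variables are drawn from a global counter.
import itertools

fresh = ("y{}".format(i) for i in itertools.count())

def entity(word):      # ENTITY: proper nouns
    return lambda x: {"{}({}a)".format(word, x)}

def event(word):       # EVENT: verbs
    return lambda x: {"{}({}e)".format(word, x)}

def functional(word):  # FUNCTIONAL: determiners etc. contribute nothing
    return lambda x: set()

def copy(rel):         # COPY: glue head and modifier with a relation
    def combine(f, g):
        def lf(x):
            y = next(fresh)
            return f(x) | g(y) | {"{}({}e,{}a)".format(rel, x, y)}
        return lf
    return combine

# (dobj won (det Oscar an)), with dobj copying as arg2:
won_an_oscar = copy("arg2")(event("won"), entity("Oscar"))
print(sorted(won_an_oscar("x")))
# ['Oscar(y0a)', 'arg2(xe,y0a)', 'won(xe)']
```

The set-of-predicates representation ignores the paper's η typing and quantifier scope, but it shows how applying the dobj expression to the forms for "won" and "an Oscar" yields the conjunction in Section 2.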
3 UDEPLAMBDA

We now introduce UDEPLAMBDA, a semantic interface for Universal Dependencies.³ Whereas DEPLAMBDA only applies to English Stanford Dependencies, UDEPLAMBDA takes advantage of the cross-lingual nature of UD to facilitate an (almost) language-independent semantic interface. This is accomplished by restricting the binarization, substitution, and composition steps described above to rely solely on information encoded in the UD representation. As shown in Algorithm 1, lines 4-6 are common to both DEPLAMBDA and UDEPLAMBDA, whereas lines 2 and 3 apply only to UDEPLAMBDA. Importantly, UDEPLAMBDA is designed not to rely on lexical forms in a language to assign lambda expressions, but only on information contained in dependency labels and postags.

However, some linguistic phenomena are language specific (e.g. pronoun-dropping) or lexicalized (e.g. every and the in English have different semantics, despite both being determiners) and are not encoded in the UD schema. Furthermore, some cross-linguistic phenomena, such as long-distance dependencies, are not part of the core UD representation. To circumvent this limitation, a simple enhancement step enriches the original UD representation before binarization takes place (Section 3.1). This step adds missing syntactic information and long-distance dependencies to the dependency tree, thereby creating a graph.

³ In what follows, all references to UD are to UD v1.3.
Whereas DEPLAMBDA is not able to handle graph-structured input, UDEPLAMBDA is designed to work with dependency graphs as well (Section 3.2). Finally, several constructions differ in structure between UD and SD, which requires different handling in the semantic interface (Section 3.3).

3.1 Enhancement

Both Schuster and Manning (2016) and Nivre et al. (2016) note the necessity of an enhanced UD representation to enable semantic applications. However, such enhancements are currently only available for a subset of languages in UD. Instead, we rely on a small number of enhancements for our main application—semantic parsing for question answering—with the hope that this step can be replaced by an enhanced UD representation in the future. Specifically, we define three kinds of enhancements: (1) long-distance dependencies; (2) types of coordination; and (3) refined question word tags. These correspond to line 2 in Algorithm 1.

First, we identify long-distance dependencies in relative clauses and control constructions. We follow Schuster and Manning (2016) and find these using the labels acl (relative) and xcomp (control). Figure 2(a) shows the long-distance dependency in the sentence Anna wants to marry Kristoff. Here, marry is provided with its missing nsubj (dashed arc).

Second, UD conflates all coordinating constructions to a single dependency label, conj. To obtain the correct coordination scope, we refine conj to conj:verb, conj:vp, conj:sentence, conj:np, and conj:adj, similar to Reddy et al. (2016).

Finally, unlike the PTB tags (Marcus et al., 1993) used by SD, the UD part-of-speech tags do not distinguish question words. Since these are crucial to question answering, we use a small lexicon to refine the tags for determiners (DET), adverbs (ADV) and pronouns (PRON) to DET:WH, ADV:WH and PRON:WH, respectively. Specifically, we use a list of 12 (English), 14 (Spanish) and 35 (German) words, respectively. This is the only part of UDEPLAMBDA that relies on language-specific information. We hope that, as the coverage of morphological features in UD improves, this refinement can be replaced by relying on morphological features, such as the interrogative feature (INT).

[Figure 2: The original and enhanced dependency trees for Anna wants to marry Kristoff: (a) with the long-distance dependency; (b) with variable binding via Ω.]

3.2 Graph Structures and BIND

To handle graph structures that may result from the enhancement step, such as those in Figure 2(a), we propose a variable-binding mechanism that differs from that of DEPLAMBDA. This is indicated in line 3 of Algorithm 1. First, each long-distance dependency is split into independent arcs, as shown in Figure 2(b). Here, Ω is a placeholder for the subject of marry, which in turn corresponds to Anna, as indicated by the binding of Ω via the pseudo-label BIND. We treat BIND like an ordinary dependency label with semantics MERGE and process the resulting tree as usual, via the s-expression:

(nsubj (xcomp wants (nsubj (mark (dobj marry Kristoff) to) Ω)) (BIND Anna Ω)),

with the lambda-expression substitutions:

wants, marry ∈ EVENT;  to ∈ FUNCTIONAL;
Anna, Kristoff ∈ ENTITY;
mark ∈ HEAD;  BIND ∈ MERGE;
xcomp = λfgx.∃y. f(x) ∧ g(y) ∧ xcomp(x_e, y_e).

These substitutions are based solely on unlexicalized context. For example, the part-of-speech tag PROPN of Anna invokes an ENTITY expression. The placeholder Ω has semantics λx. EQ(x, ω), where EQ(u, ω) is true iff u and ω are equal (have the same denotation), which unifies the subject variable of wants with the subject variable of marry.
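The effect of the EQ unification can be sketched as a small post-processing pass (our toy rendering, not a formal treatment of the lambda calculus): predicates are tuples, and each EQ(u, v) predicate licenses renaming u to v before the EQ itself is dropped.

```python
# Toy EQ-unification over a logical form represented as a set of
# predicate tuples, e.g. ("arg1", "y", "v") and ("EQ", "v", "x").
def simplify(predicates):
    """Apply EQ(u, v) renamings, then drop the EQ predicates themselves."""
    subst = {p[1]: p[2] for p in predicates if p[0] == "EQ"}

    def rename(arg):
        # Follow renaming chains such as v -> x.
        while arg in subst:
            arg = subst[arg]
        return arg

    return {(p[0],) + tuple(rename(a) for a in p[1:])
            for p in predicates if p[0] != "EQ"}

# Variables are untyped here for brevity (z ~ wants event, y ~ marry event).
lf = {("wants", "z"), ("Anna", "x"), ("arg1", "z", "x"),
      ("marry", "y"), ("xcomp", "z", "y"), ("arg1", "y", "v"),
      ("EQ", "v", "x"), ("Kristoff", "w"), ("arg2", "y", "w")}
print(("arg1", "y", "x") in simplify(lf))  # True
```

After the pass, Anna's variable fills the subject slot of both wants and marry, mirroring the simplification described next.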
After substitution and composition, we get:

λz.∃xywv. wants(z_e) ∧ Anna(x_a) ∧ arg1(z_e, x_a) ∧ EQ(x, ω) ∧ marry(y_e) ∧ xcomp(z_e, y_e) ∧ arg1(y_e, v_a) ∧ EQ(v, ω) ∧ Kristoff(w_a) ∧ arg2(y_e, w_a).

This expression may be simplified further by replacing all occurrences of v with x and removing the unification predicates EQ, which results in:

λz.∃xyw. wants(z_e) ∧ Anna(x_a) ∧ arg1(z_e, x_a) ∧ marry(y_e) ∧ xcomp(z_e, y_e) ∧ arg1(y_e, x_a) ∧ Kristoff(w_a) ∧ arg2(y_e, w_a).

This expression encodes the fact that Anna is the arg1 of the marry event, as desired. DEPLAMBDA, in contrast, cannot handle graph-structured input, since it lacks a principled way of generating s-expressions from graphs. Even given the above s-expression, BIND in DEPLAMBDA is defined in a way such that the composition fails to unify v and x, which is crucial for the correct semantics. Moreover, the definition of BIND in DEPLAMBDA does not have a formal interpretation within the lambda calculus, unlike ours.

3.3 Linguistic Constructions

Below, we highlight the most pertinent differences between UDEPLAMBDA and DEPLAMBDA, stemming from the different treatment of various linguistic constructions in UD versus SD.

Prepositional Phrases  UD uses a content-head analysis, in contrast to SD, which treats function words as heads of prepositional phrases. Accordingly, the s-expression for the phrase president in 2009 is (nmod president (case 2009 in)) in UDEPLAMBDA and (prep president (pobj in 2009)) in DEPLAMBDA. To achieve the desired semantics,

λx.∃y. president(x_a) ∧ president_event(x_e) ∧ arg1(x_e, x_a) ∧ 2009(y_a) ∧ prep.in(x_e, y_a),

DEPLAMBDA relies on an intermediate logical form that requires some post-processing, whereas UDEPLAMBDA obtains the desired logical form directly through the following entries:

in ∈ FUNCTIONAL;  2009 ∈ ENTITY;  case ∈ HEAD;
president = λx. president(x_a) ∧ president_event(x_e) ∧ arg1(x_e, x_a);
nmod = λfgx.∃y. f(x) ∧ g(y) ∧ nmod.in(x_e, y_a).

Other nmod constructions, such as possessives (nmod:poss), temporal modifiers (nmod:tmod) and adverbial modifiers (nmod:npmod), are handled similarly. Note how the common noun president evokes both entity and event predicates above.

Passives  DEPLAMBDA gives special treatment to passive verbs, identified by the fine-grained part-of-speech tags in the PTB tag set together with dependency context. For example, An Oscar was won is analyzed as λx. won.pass(x_e) ∧ Oscar(y_a) ∧ arg1(x_e, y_a), where won.pass represents a passive event. However, UD does not distinguish between active and passive forms.⁴ While the labels nsubjpass or auxpass indicate passive constructions, such clues are sometimes missing, such as in reduced relatives. We therefore opt not to have separate entries for passives, but aim to produce identical logical forms for active and passive forms when possible (for example, by treating nsubjpass as a direct object). With the following entries,

won ∈ EVENT;  an, was ∈ FUNCTIONAL;  auxpass ∈ HEAD;
nsubjpass = λfgx.∃y. f(x) ∧ g(y) ∧ arg2(x_e, y_a),

the lambda expression for An Oscar was won becomes λx. won(x_e) ∧ Oscar(y_a) ∧ arg2(x_e, y_a), identical to that of its active form. However, not having a special entry for passive verbs may have undesirable side-effects. For example, in the reduced-relative construction Pixar claimed the Oscar won for Frozen, the phrase the Oscar won ... will receive the semantics λx. Oscar(y_a) ∧ won(x_e) ∧ arg1(x_e, y_a), which differs from that of an Oscar was won. We leave it to the target application to disambiguate the interpretation in such cases.

⁴ UD encodes voice as a morphological feature, but most syntactic analyzers do not produce this information yet.

Long-Distance Dependencies  As discussed in Section 3.2, we handle long-distance dependencies evoked by clausal modifiers (acl) and control verbs (xcomp) with the BIND mechanism, whereas DEPLAMBDA cannot handle control constructions. For xcomp, as seen earlier, we use the mapping λfgx.∃y. f(x) ∧ g(y) ∧ xcomp(x_e, y_e). For acl we use λfgx.∃y. f(x) ∧ g(y), to conjoin the main clause and the modifier clause. However, not all acl clauses evoke long-distance dependencies; e.g. in the news that Disney won an Oscar, the clause that Disney won an Oscar is a subordinating conjunction of news. In such cases, we instead assign acl the INVERT semantics.

Questions  Question words are marked with the enhanced part-of-speech tags DET:WH, ADV:WH and PRON:WH, which are all assigned the semantics λx. ${word}(x_a) ∧ TARGET(x_a). The predicate TARGET indicates that x_a represents the variable of interest, that is, the answer to the question.
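The context-dependent label mappings above (amod as INVERT versus MERGE, nsubjpass treated as a direct object) can be sketched as a first-match rule table. This is our illustration of the idea; the actual system encodes such contexts as Tregex patterns, and the rule set below is a made-up fragment:

```python
# First matching rule wins; None means "any modifier POS".
RULES = [
    (("amod", "VERB"), "INVERT"),             # running horse
    (("amod", "ADJ"), "MERGE"),               # beautiful horse
    (("nsubjpass", None), ("COPY", "arg2")),  # passive subject as object
    (("nsubj", None), ("COPY", "arg1")),
    (("dobj", None), ("COPY", "arg2")),
    (("case", None), "HEAD"),
]

def label_semantics(label, modifier_pos=None):
    for (lab, pos), kind in RULES:
        if lab == label and pos in (None, modifier_pos):
            return kind
    # Default: COPY the dependency label itself as the relation name.
    return ("COPY", label)

print(label_semantics("amod", "VERB"))  # INVERT
print(label_semantics("nsubjpass"))     # ('COPY', 'arg2')
```

Because the keys are only dependency labels and POS tags, nothing in the table is tied to the lexicon of a particular language, which is the property UDEPLAMBDA relies on.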
3.4 Limitations

In order to achieve language independence, UDEPLAMBDA has to sacrifice semantic specificity, since in many cases the semantics is carried by lexical information. Consider the sentences John broke the window and The window broke. Although it is the window that broke in both cases, our inferred logical forms do not canonicalize the relation between broke and window. To achieve this, we would have to make the substitution of nsubj depend on lexical context, such that when window occurs as nsubj with broke, the predicate arg2 is invoked rather than arg1. UDEPLAMBDA does not address this problem, and leaves it to the target application to infer the context-sensitive semantics of arg1 and arg2. To measure the impact of this limitation, we present UDEPLAMBDASRL in Section 4.4, which addresses this problem by relying on semantic roles (Palmer et al., 2010).

Other constructions that require lexical information are quantifiers like every, some and most, negation markers like no and not, and intensional verbs like believe and said. UD does not have special labels to indicate these. We discuss how to handle quantifiers in this framework in the supplementary material.

Although in the current setup UDEPLAMBDA rules are hand-coded, the number of rules is only proportional to the number of UD labels, making rule-writing manageable.⁵ Moreover, we view UDEPLAMBDA as a first step towards learning rules for converting UD to richer semantic representations such as PropBank, AMR, or the Parallel Meaning Bank (Palmer et al., 2005; Banarescu et al., 2013; Abzianidze et al., 2017).

⁵ UD v1.3 has 40 dependency labels, and the number of substitution rules in UDEPLAMBDA is 61, with some labels having multiple rules, and some representing lexical semantics.

4 Cross-lingual Semantic Parsing

To study the multilingual nature of UDEPLAMBDA, we conduct an empirical evaluation on question answering against Freebase in three different languages: English, Spanish, and German. Before discussing the details of this experiment, we briefly outline the semantic parsing framework employed.

4.1 Semantic Parsing as Graph Matching

UDEPLAMBDA generates ungrounded logical forms that are independent of any knowledge base, such as Freebase. We use GRAPHPARSER (Reddy et al., 2016) to map these logical forms to their grounded Freebase graphs, via corresponding ungrounded graphs. Figures 3(a) to 3(c) show the ungrounded graphs corresponding to logical forms from UDEPLAMBDA, each grounded to the same Freebase graph in Figure 3(d). Here, rectangles denote entities, circles denote events, rounded rectangles denote entity types, and edges between events and entities denote predicates or Freebase relations. Finally, the TARGET node represents the set of values of x that are consistent with the Freebase graph, that is, the answer to the question.

[Figure 3: The ungrounded graphs for What language do the people in Ghana speak? (English), Welche Sprache wird in Ghana gesprochen? (German) and ¿Cuál es la lengua de Ghana? (Spanish), and the corresponding grounded Freebase graph.]

GRAPHPARSER treats semantic parsing as a graph-matching problem with the goal of finding the Freebase graphs that are structurally isomorphic to an ungrounded graph and ranking them according to a model. To account for structural mismatches, GRAPHPARSER uses two graph transformations: CONTRACT and EXPAND. In Figure 3(a) there are two edges between x and Ghana. CONTRACT collapses one of these edges to create a graph isomorphic to Freebase. EXPAND, in contrast, adds edges to connect the graph in the case of disconnected components. The search space is explored by beam search, and model parameters are estimated with the averaged structured perceptron (Collins, 2002) from training data consisting of question-answer pairs, using answer F1-score as the objective.
4.2 Datasets

We evaluate our approach on two public benchmarks of question answering against Freebase: WebQuestions (Berant et al., 2013), a widely used benchmark consisting of English questions and their answers, and GraphQuestions (Su et al., 2016), a recently released dataset of English questions with both their answers and grounded logical forms. While WebQuestions is dominated by simple entity-attribute questions, GraphQuestions contains a large number of compositional questions involving aggregation (e.g. How many children of Eddard Stark were born in Winterfell?) and comparison (e.g. In which month does the average rainfall of New York City exceed 86 mm?). The number of training, development and test questions is 2644, 1134, and 2032, respectively, for WebQuestions and 1794, 764, and 2608 for GraphQuestions.

To support multilingual evaluation, we created translations of WebQuestions and GraphQuestions to German and Spanish. For WebQuestions two professional annotators were hired per language, while for GraphQuestions we used a trusted pool of 20 annotators per language (with a single annotator per question). Examples of the original questions and their translations are provided in Table 1.

[Table 1: Example questions and their translations.]

WebQuestions
en  What language do the people in Ghana speak?
de  Welche Sprache wird in Ghana gesprochen?
es  ¿Cuál es la lengua de Ghana?
en  Who was Vincent van Gogh inspired by?
de  Von wem wurde Vincent van Gogh inspiriert?
es  ¿Qué inspiró a Van Gogh?

GraphQuestions
en  NASA has how many launch sites?
de  Wie viele Abschussbasen besitzt NASA?
es  ¿Cuántos sitios de despegue tiene NASA?
en  Which loudspeakers are heavier than 82.0 kg?
de  Welche Lautsprecher sind schwerer als 82.0 kg?
es  ¿Qué altavoces pesan más de 82.0 kg?

4.3 Implementation Details

Here we provide details on the syntactic analyzers employed, our entity resolution algorithm, and the features used by the grounding model.

Dependency Parsing  The English, Spanish, and German Universal Dependencies (UD) treebanks (v1.3; Nivre et al. 2016) were used to train part-of-speech taggers and dependency parsers. We used a bidirectional LSTM tagger (Plank et al., 2016) and a bidirectional LSTM shift-reduce parser (Kiperwasser and Goldberg, 2016). Both the tagger and parser require word embeddings. For English, we used GloVe embeddings (Pennington et al., 2014) trained on Wikipedia and the Gigaword corpus. For German and Spanish, we used SENNA embeddings (Collobert et al., 2011; Al-Rfou et al., 2013) trained on Wikipedia corpora (589M words German; 397M words Spanish).⁶ Measured on the UD test sets, the tagger accuracies are 94.5 (English), 92.2 (German), and 95.7 (Spanish), with corresponding labeled attachment parser scores of 81.8, 74.7, and 82.2.

Entity Resolution  We follow Reddy et al. (2016) and resolve entities in three steps: (1) potential entity spans are identified using seven handcrafted part-of-speech patterns; (2) each span is associated with potential Freebase entities according to the Freebase/KG API; and (3) the 10-best entity linking lattices, scored by a structured perceptron, are input to GRAPHPARSER, leaving the final disambiguation to the semantic parsing problem. Table 2 shows the 1-best and 10-best entity disambiguation F1-scores for each language and dataset.

[Table 2: Structured perceptron k-best entity linking accuracies on the development sets.]

k    WebQuestions        GraphQuestions
     en    de    es      en    de    es
1    89.6  82.8  86.7    47.2  39.9  39.5
10   95.7  91.2  94.0    56.9  48.4  51.6

Features  We use features similar to Reddy et al. (2016): basic features of words and Freebase relations, and graph features crossing ungrounded events with grounded relations, ungrounded types with grounded relations, and ungrounded answer type crossed with a binary feature indicating if the answer is a number. In addition, we add features encoding the similarity of ungrounded events and Freebase relations. Specifically, we used the cosine similarity of the translation-invariant embeddings of Huang et al. (2015).⁷

⁶ https://sites.google.com/site/rmyeid/projects/polyglot
⁷ http://128.2.220.95/multilingual/data/

4.4 Comparison Systems

We compared UDEPLAMBDA to four versions of GRAPHPARSER that operate on different representations, in addition to prior work.

SINGLEEVENT  This model resembles the learning-to-rank model of Bast and Haussmann (2015). An ungrounded graph is generated by connecting all entities in the question with the TARGET node, representing a single event. Note that this baseline cannot handle compositional questions, or those with aggregation or comparison.

DEPTREE  An ungrounded graph is obtained directly from the original dependency tree. An event is created for each parent and its dependents in the tree. Each dependent is linked to this event with an edge labeled with its dependency relation, while the parent is linked to the event with an edge labeled arg0. If a word is a question word, an additional TARGET predicate is attached to its entity node.

CCGGRAPH  This is the CCG-based semantic representation of Reddy et al. (2014). Note that this baseline exists only for English.

UDEPLAMBDASRL  This is similar to UDEPLAMBDA except that instead of assuming nsubj, dobj and nsubjpass correspond to arg1, arg2 and arg2, we employ semantic role labeling to identify the correct interpretation. We used the systems of Roth and Woodsend (2014) for English and German and Björkelund et al. (2009) for Spanish, trained on the CoNLL-2009 dataset (Hajič et al., 2009).⁸

⁸ The parser accuracies (%) are 87.33, 81.38 and 79.91 for English, German and Spanish, respectively.

4.5 Results

Table 3 shows the performance of GRAPHPARSER with these different representations. Here and in what follows, we use the average F1-score of predicted answers (Berant et al., 2013) as the evaluation metric. We first observe that UDEPLAMBDA consistently outperforms the SINGLEEVENT and DEPTREE representations in all languages.

[Table 3: F1-scores on the test data.]

Method          WebQuestions        GraphQuestions
                en    de    es      en    de    es
SINGLEEVENT     48.5  45.6  46.3    15.9  8.8   11.4
DEPTREE         48.8  45.9  46.4    16.0  8.3   11.3
CCGGRAPH        49.5  –     –       15.9  –     –
UDEPLAMBDA      49.5  46.1  47.5    17.7  9.5   12.8
UDEPLAMBDASRL   49.8  46.2  47.0    17.7  9.1   12.7

For English, performance is on par with CCGGRAPH, which suggests that UDEPLAMBDA does not sacrifice too much specificity for universality. With both datasets, results are lower for German compared to Spanish. This agrees with the lower performance of the syntactic parser on the German portion of the UD treebank. While UDEPLAMBDASRL performs better than UDEPLAMBDA on WebQuestions for English, we do not see large performance gaps in other settings, suggesting that GRAPHPARSER is either able to learn context-sensitive semantics of ungrounded predicates or that the datasets do not contain ambiguous nsubj, dobj and nsubjpass mappings. Finally, while these results confirm that GraphQuestions is much harder compared to WebQuestions, we note that both datasets predominantly contain single-hop questions, as indicated by the competitive performance of SINGLEEVENT on both datasets.

Table 4 compares UDEPLAMBDA with previously published models, which exist only for English and have been mainly evaluated on WebQuestions. These are either symbolic like ours (first block) or employ neural networks (second block). Results for models using additional task-specific training resources, such as ClueWeb09, Wikipedia, or SimpleQuestions (Bordes et al., 2015), are shown in parentheses. On GraphQuestions, we achieve a new state-of-the-art result with a gain of 4.8 F1 points over the previously reported best result. On WebQuestions we are 2.1 points below the best model using comparable resources, and 3.8 points below the state of the art. Most related to our work is the English-specific system of Reddy et al. (2016). We attribute the 0.8 point difference in F1-score to their use of the more fine-grained PTB tag set and Stanford Dependencies.

[Table 4: F1-scores on the English GraphQuestions and WebQuestions test sets (results with additional task-specific resources in parentheses).]

Method                                  GraphQ.  WebQ.
SEMPRE (Berant et al., 2013)            10.8     35.7
JACANA (Yao and Van Durme, 2014)        5.1      33.0
PARASEMPRE (Berant and Liang, 2014)     12.8     39.9
QA (Yao, 2015)                          –        44.3
AQQU (Bast and Haussmann, 2015)         –        49.4
AGENDAIL (Berant and Liang, 2015)       –        49.7
DEPLAMBDA (Reddy et al., 2016)          –        50.3
STAGG (Yih et al., 2015)                –        48.4 (52.5)
BILSTM (Türe and Jojic, 2016)           –        24.9 (52.2)
MCNN (Xu et al., 2016)                  –        47.0 (53.3)
AGENDAIL-RANK (Yavuz et al., 2016)      –        51.6 (52.6)
UDEPLAMBDA                              17.7     49.5
Our work also re- 1982; Dalrymple et al., 1995; Crouch and King, lates to literature on parsing multiple languages to a 2006), TAG (Kallmeyer and Joshi, 2003; Gardent common executable representation (Cimiano et al., and Kallmeyer, 2003; Nesson and Shieber, 2006), 2013; Haas and Riezler, 2016). However, existing and CCG (Baldridge and Kruijff, 2002; Bos et al., approaches still map to the target meaning represen- 2004; Artzi et al., 2015). Unlike existing semantic tations (more or less) directly (Kwiatkowksi et al., interfaces, UDEPLAMBDA uses dependency syn- 2010; Jones et al., 2012; Jie and Lu, 2014). tax, a widely available syntactic resource. A common trend in previous work on seman- 6 Conclusions tic interfaces is the reliance on rich typed feature We introduced UDEPLAMBDA, a semantic inter- structures or semantic types coupled with strong face for Universal Dependencies, and showed that type constraints, which can be very informative the resulting semantic representation can be used but unavoidably language specific. Instead, UDEP- for question-answering against a knowledge base LAMBDA relies on generic unlexicalized informa- in multiple languages. We provided translations of tion present in dependency treebanks and uses a benchmark datasets in German and Spanish, in the simple type system (one type for dependency labels, hope to stimulate further multilingual research on and one for words) along with a combinatory mech- semantic parsing and question answering in gen- anism, which avoids type collisions. Earlier at- eral. We have only scratched the surface when it tempts at extracting semantic representations from comes to applying UDEPLAMBDA to natural lan- dependencies have mainly focused on language- guage understanding tasks. 
In the future, we would specific dependency representations (Spreyer and like to explore how this framework can benefit ap- Frank, 2005; Simov and Osenova, 2011; Hahn and plications such as summarization (Liu et al., 2015) Meurers, 2011; Reddy et al., 2016; Falke et al., and machine reading (Sachan and Xing, 2016). 2016; Beltagy, 2016), and multi-layered depen- dency annotations (Jakob et al., 2010;B edaride´ Acknowledgements and Gardent, 2011). In contrast, UDEPLAMBDA derives semantic representations for multiple lan- This work greatly benefited from discussions with guages in a common schema directly from Univer- Michael Collins, Dipanjan Das, Federico Fancellu, sal Dependencies. This work parallels a growing Julia Hockenmaier, Tom Kwiatkowski, Adam interest in creating other forms of multilingual se- Lopez, Valeria de Paiva, Martha Palmer, Fernando mantic representations (Akbik et al., 2015; Vander- Pereira, Emily Pitler, Vijay Saraswat, Nathan wende et al., 2015; White et al., 2016; Evang and Schneider, Bonnie Webber, Luke Zettlemoyer, and Bos, 2016). the members of ILCC Edinburgh University, the We evaluate UDEPLAMBDA on semantic pars- Microsoft Research Redmond NLP group, the Stan- ing for question answering against a knowledge ford NLP group, and the UW NLP and Linguistics base. Here, the literature offers two main modeling group. We thank Reviewer 2 for useful feedback. paradigms: (1) learning of task-specific grammars The authors would also like to thank the Univer- that directly parse language to a grounded repre- sal Dependencies community for the treebanks and sentation (Zelle and Mooney, 1996; Zettlemoyer documentation. This research is supported by a and Collins, 2005; Berant et al., 2013; Flanigan Google PhD Fellowship to the first author. We ac- et al., 2014; Pasupat and Liang, 2015; Groschwitz knowledge the financial support of the European et al., 2015); and (2) converting language to a lin- Research Council (Lapata; award number 681760). 
References

Lasha Abzianidze, Johannes Bjerva, Kilian Evang, Hessel Haagsma, Rik van Noord, Pierre Ludmann, Duc-Duy Nguyen, and Johan Bos. 2017. The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations. In Proceedings of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Valencia, Spain, pages 242–247.

Alan Akbik, Laura Chiticariu, Marina Danilevsky, Yunyao Li, Shivakumar Vaithyanathan, and Huaiyu Zhu. 2015. Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling. In Proceedings of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing. Association for Computational Linguistics, Beijing, China, pages 397–407.

Rami Al-Rfou, Bryan Perozzi, and Steven Skiena. 2013. Polyglot: Distributed Word Representations for Multilingual NLP. In Proceedings of Computational Natural Language Learning. Sofia, Bulgaria, pages 183–192.

Yoav Artzi. 2013. Cornell SPF: Cornell Semantic Parsing Framework. arXiv:1311.3011 [cs.CL].

Yoav Artzi, Kenton Lee, and Luke Zettlemoyer. 2015. Broad-coverage CCG Semantic Parsing with AMR. In Proceedings of the Empirical Methods on Natural Language Processing. pages 1699–1710.

Jason Baldridge and Geert-Jan Kruijff. 2002. Coupling CCG and Hybrid Logic Dependency Semantics. In Proceedings of the Association for Computational Linguistics. pages 319–326.

Laura Banarescu, Claire Bonial, Shu Cai, Madalina Georgescu, Kira Griffitt, Ulf Hermjakob, Kevin Knight, Philipp Koehn, Martha Palmer, and Nathan Schneider. 2013. Abstract Meaning Representation for Sembanking. In Linguistic Annotation Workshop and Interoperability with Discourse. Sofia, Bulgaria, pages 178–186.

Hannah Bast and Elmar Haussmann. 2015. More Accurate Question Answering on Freebase. In Proceedings of the ACM International Conference on Information and Knowledge Management. pages 1431–1440.

Paul Bédaride and Claire Gardent. 2011. Deep Semantics for Dependency Structures. In Proceedings of the Conference on Intelligent Text Processing and Computational Linguistics. pages 277–288.

Islam Beltagy. 2016. Natural Language Semantics Using Probabilistic Logic. Ph.D. thesis, Department of Computer Science, The University of Texas at Austin.

Emily M. Bender, Dan Flickinger, Stephan Oepen, Woodley Packard, and Ann Copestake. 2015. Layers of Interpretation: On Grammar and Compositionality. In Proceedings of the International Conference on Computational Semantics. Association for Computational Linguistics, London, UK, pages 239–249.

Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang. 2013. Semantic Parsing on Freebase from Question-Answer Pairs. In Proceedings of the Empirical Methods on Natural Language Processing. pages 1533–1544.

Jonathan Berant and Percy Liang. 2014. Semantic Parsing via Paraphrasing. In Proceedings of the Association for Computational Linguistics. pages 1415–1425.

Jonathan Berant and Percy Liang. 2015. Imitation Learning of Agenda-Based Semantic Parsers. Transactions of the Association for Computational Linguistics 3:545–558.

Anders Björkelund, Love Hafdell, and Pierre Nugues. 2009. Multilingual Semantic Role Labeling. In Proceedings of Computational Natural Language Learning (CoNLL 2009): Shared Task. Association for Computational Linguistics, Boulder, Colorado, pages 43–48.

Antoine Bordes, Nicolas Usunier, Sumit Chopra, and Jason Weston. 2015. Large-scale simple question answering with memory networks. CoRR abs/1506.02075.

Johan Bos, Stephen Clark, Mark Steedman, James R. Curran, and Julia Hockenmaier. 2004. Wide-Coverage Semantic Representations from a CCG Parser. In Proceedings of the Conference on Computational Linguistics. pages 1240–1246.

Philipp Cimiano, Vanessa Lopez, Christina Unger, Elena Cabrio, Axel-Cyrille Ngonga Ngomo, and Sebastian Walter. 2013. Multilingual question answering over linked data (QALD-3): Lab overview. In Information Access Evaluation. Multilinguality, Multimodality, and Visualization. Springer, Valencia, Spain, volume 8138.

Michael Collins. 2002. Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. In Proceedings of the Empirical Methods on Natural Language Processing. pages 1–8.

Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. 2011. Natural language processing (almost) from scratch. Journal of Machine Learning Research 12:2493–2537.

Ann Copestake, Dan Flickinger, Carl Pollard, and Ivan A. Sag. 2005. Minimal Recursion Semantics: An Introduction. Research on Language and Computation 3(2-3):281–332.

Dick Crouch and Tracy Holloway King. 2006. Semantics via f-structure rewriting. In Proceedings of the LFG'06 Conference. CSLI Publications, page 145.

Mary Dalrymple, John Lamping, Fernando C. N. Pereira, and Vijay A. Saraswat. 1995. Linear Logic for Meaning Assembly. In Proceedings of Computational Logic for Natural Language Processing.

Kilian Evang and Johan Bos. 2016. Cross-lingual Learning of an Open-domain Semantic Parser. In Proceedings of the Conference on Computational Linguistics. The COLING 2016 Organizing Committee, Osaka, Japan, pages 579–588.

Tobias Falke, Gabriel Stanovsky, Iryna Gurevych, and Ido Dagan. 2016. Porting an Open Information Extraction System from English to German. In Proceedings of the Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, Texas, pages 892–898.

Jeffrey Flanigan, Sam Thomson, Jaime Carbonell, Chris Dyer, and Noah A. Smith. 2014. A Discriminative Graph-Based Parser for the Abstract Meaning Representation. In Proceedings of the Association for Computational Linguistics. pages 1426–1436.

Claire Gardent and Laura Kallmeyer. 2003. Semantic Construction in Feature-based TAG. In Proceedings of the European Chapter of the Association for Computational Linguistics. pages 123–130.

Matt Gardner and Jayant Krishnamurthy. 2017. Open-Vocabulary Semantic Parsing with both Distributional Statistics and Formal Knowledge. In Proceedings of Association for the Advancement of Artificial Intelligence.

Jonas Groschwitz, Alexander Koller, and Christoph Teichmann. 2015. Graph parsing with s-graph grammars. In Proceedings of the Association for Computational Linguistics. pages 1481–1490.

Carolin Haas and Stefan Riezler. 2016. A Corpus and Semantic Parser for Multilingual Natural Language Querying of OpenStreetMap. In Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, San Diego, California, pages 740–750.

Michael Hahn and Detmar Meurers. 2011. On deriving semantic representations from dependencies: A practical approach for evaluating meaning in learner corpora. In Proceedings of the International Conference on Dependency Linguistics (Depling 2011). Barcelona, pages 94–103.

Jan Hajič, Massimiliano Ciaramita, Richard Johansson, Daisuke Kawahara, Maria Antònia Martí, Lluís Màrquez, Adam Meyers, Joakim Nivre, Sebastian Padó, Jan Štěpánek, et al. 2009. The CoNLL-2009 shared task: Syntactic and semantic dependencies in multiple languages. In Proceedings of Computational Natural Language Learning: Shared Task. Association for Computational Linguistics, pages 1–18.

Kejun Huang, Matt Gardner, Evangelos Papalexakis, Christos Faloutsos, Nikos Sidiropoulos, Tom Mitchell, Partha P. Talukdar, and Xiao Fu. 2015. Translation Invariant Word Embeddings. In Proceedings of the Empirical Methods in Natural Language Processing. Lisbon, Portugal, pages 1084–1088.

Max Jakob, Markéta Lopatková, and Valia Kordoni. 2010. Mapping between Dependency Structures and Compositional Semantic Representations. In Proceedings of the Fifth International Conference on Language Resources and Evaluation.

Zhanming Jie and Wei Lu. 2014. Multilingual Semantic Parsing: Parsing Multiple Languages into Semantic Representations. In Proceedings of the Conference on Computational Linguistics. Dublin City University and Association for Computational Linguistics, Dublin, Ireland, pages 1291–1301.

Bevan Keeley Jones, Mark Johnson, and Sharon Goldwater. 2012. Semantic Parsing with Bayesian Tree Transducers. In Proceedings of the Association for Computational Linguistics. Association for Computational Linguistics, Stroudsburg, PA, USA, pages 488–496.

Laura Kallmeyer and Aravind Joshi. 2003. Factoring predicate argument and scope semantics: Underspecified semantics with LTAG. Research on Language and Computation 1(1-2):3–58.

Ronald M. Kaplan and Joan Bresnan. 1982. Lexical-functional grammar: A formal system for grammatical representation. Formal Issues in Lexical-Functional Grammar, pages 29–130.

Eliyahu Kiperwasser and Yoav Goldberg. 2016. Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations. Transactions of the Association for Computational Linguistics 4:313–327.

Jayant Krishnamurthy and Tom M. Mitchell. 2015. Learning a Compositional Semantics for Freebase with an Open Predicate Vocabulary. Transactions of the Association for Computational Linguistics 3:257–270.

Tom Kwiatkowksi, Luke Zettlemoyer, Sharon Goldwater, and Mark Steedman. 2010. Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification. In Proceedings of the Empirical Methods on Natural Language Processing. pages 1223–1233.

Tom Kwiatkowski, Eunsol Choi, Yoav Artzi, and Luke Zettlemoyer. 2013. Scaling Semantic Parsers with On-the-Fly Ontology Matching. In Proceedings of the Empirical Methods on Natural Language Processing. pages 1545–1556.

Roger Levy and Galen Andrew. 2006. Tregex and tsurgeon: tools for querying and manipulating tree data structures. In Proceedings of LREC. pages 2231–2234.

Fei Liu, Jeffrey Flanigan, Sam Thomson, Norman Sadeh, and Noah A. Smith. 2015. Toward Abstractive Summarization Using Semantic Representations. In Proceedings of the North American Chapter of the Association for Computational Linguistics. pages 1077–1086.

Mitchell P. Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini. 1993. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics 19(2):313–330.

Richard Montague. 1973. The Proper Treatment of Quantification in Ordinary English. In K.J.J. Hintikka, J.M.E. Moravcsik, and P. Suppes, editors, Approaches to Natural Language, Springer Netherlands, volume 49 of Synthese Library, pages 221–242.

Rebecca Nesson and Stuart M. Shieber. 2006. Simpler TAG Semantics Through Synchronization. In Proceedings of the 11th Conference on Formal Grammar. Center for the Study of Language and Information, Malaga, Spain, pages 129–142.

Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Yoav Goldberg, Jan Hajič, Christopher D. Manning, Ryan McDonald, Slav Petrov, Sampo Pyysalo, Natalia Silveira, Reut Tsarfaty, and Daniel Zeman. 2016. Universal Dependencies v1: A Multilingual Treebank Collection. In Proceedings of the Tenth International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), Paris, France.

Joakim Nivre et al. 2016. Universal dependencies 1.3. LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics, Charles University in Prague.

Martha Palmer, Daniel Gildea, and Paul Kingsbury. 2005. The Proposition Bank: An annotated corpus of semantic roles. Computational Linguistics 31(1):71–106.

Martha Palmer, Daniel Gildea, and Nianwen Xue. 2010. Semantic role labeling. Synthesis Lectures on Human Language Technologies 3(1):1–103.

Panupong Pasupat and Percy Liang. 2015. Compositional Semantic Parsing on Semi-Structured Tables. In Proceedings of the Association for Computational Linguistics. pages 1470–1480.

Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global Vectors for Word Representation. In Proceedings of the Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Doha, Qatar, pages 1532–1543.

Barbara Plank, Anders Søgaard, and Yoav Goldberg. 2016. Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss. In Proceedings of the Annual Meeting of the Association for Computational Linguistics. Berlin, Germany, pages 412–418.

Siva Reddy, Mirella Lapata, and Mark Steedman. 2014. Large-scale Semantic Parsing without Question-Answer Pairs. Transactions of the Association for Computational Linguistics 2:377–392.

Siva Reddy, Oscar Täckström, Michael Collins, Tom Kwiatkowski, Dipanjan Das, Mark Steedman, and Mirella Lapata. 2016. Transforming Dependency Structures to Logical Forms for Semantic Parsing. Transactions of the Association for Computational Linguistics 4:127–140.

Michael Roth and Kristian Woodsend. 2014. Composition of Word Representations Improves Semantic Role Labelling. In Proceedings of the Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, pages 407–413.

Mrinmaya Sachan and Eric Xing. 2016. Machine Comprehension using Rich Semantic Representations. In Proceedings of the Association for Computational Linguistics. Association for Computational Linguistics, Berlin, Germany, pages 486–492.

Sebastian Schuster and Christopher D. Manning. 2016. Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks. In Proceedings of the Tenth International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), Paris, France.

Kiril Simov and Petya Osenova. 2011. Towards Minimal Recursion Semantics over Bulgarian Dependency Parsing. In Proceedings of the International Conference Recent Advances in Natural Language Processing 2011. RANLP 2011 Organising Committee, Hissar, Bulgaria, pages 471–478.

Kathrin Spreyer and Anette Frank. 2005. Projecting RMRS from TIGER Dependencies. In Proceedings of the HPSG 2005 Conference. CSLI Publications.

Yu Su, Huan Sun, Brian Sadler, Mudhakar Srivatsa, Izzeddin Gur, Zenghui Yan, and Xifeng Yan. 2016. On Generating Characteristic-rich Question Sets for QA Evaluation. In Proceedings of the Empirical Methods in Natural Language Processing. Austin, Texas, pages 562–572.

Ferhan Türe and Oliver Jojic. 2016. Simple and Effective Question Answering with Recurrent Neural Networks. CoRR abs/1606.05029.

Lucy Vanderwende, Arul Menezes, and Chris Quirk. 2015. An AMR parser for English, French, German, Spanish and Japanese and a new AMR-annotated corpus. In Proceedings of the North American Chapter of the Association for Computational Linguistics: Demonstrations. Association for Computational Linguistics, Denver, Colorado, pages 26–30.

Aaron Steven White, Drew Reisinger, Keisuke Sakaguchi, Tim Vieira, Sheng Zhang, Rachel Rudinger, Kyle Rawlins, and Benjamin Van Durme. 2016. Universal Decompositional Semantics on Universal Dependencies. In Proceedings of the Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, Texas, pages 1713–1723.

Kun Xu, Siva Reddy, Yansong Feng, Songfang Huang, and Dongyan Zhao. 2016. Question Answering on Freebase via Relation Extraction and Textual Evidence. In Proceedings of the Association for Computational Linguistics. Association for Computational Linguistics, Berlin, Germany, pages 2326–2336.

Xuchen Yao. 2015. Lean Question Answering over Freebase from Scratch. In Proceedings of the North American Chapter of the Association for Computational Linguistics. pages 66–70.

Xuchen Yao and Benjamin Van Durme. 2014. Information Extraction over Structured Data: Question Answering with Freebase. In Proceedings of the Association for Computational Linguistics. pages 956–966.

Semih Yavuz, Izzeddin Gur, Yu Su, Mudhakar Srivatsa, and Xifeng Yan. 2016. Improving Semantic Parsing via Answer Type Inference. In Proceedings of the Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, Texas, pages 149–159.

Wen-tau Yih, Ming-Wei Chang, Xiaodong He, and Jianfeng Gao. 2015. Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base. In Proceedings of the Association for Computational Linguistics. pages 1321–1331.

John M. Zelle and Raymond J. Mooney. 1996. Learning to Parse Database Queries Using Inductive Logic Programming. In Proceedings of Association for the Advancement of Artificial Intelligence. pages 1050–1055.

Luke S. Zettlemoyer and Michael Collins. 2005. Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars. In Proceedings of Uncertainty in Artificial Intelligence. pages 658–666.

Universal Semantic Parsing: Supplementary Material

Siva Reddy† Oscar Täckström‡ Slav Petrov‡ Mark Steedman†† Mirella Lapata††
†Stanford University ‡Google Inc. ††University of Edinburgh
[email protected], {oscart,slav}@google.com, {steedman,mlap}@inf.ed.ac.uk

Abstract

This supplementary material to the main paper provides an outline of how quantification can be incorporated in the UDEPLAMBDA framework.

[Figure 1: The dependency tree for "Everybody wants to buy a house" and its enhanced variants: (a) the original dependency tree; (b) the enhanced dependency tree, in which BIND introduces an Ω node for the implied nsubj of "buy"; (c) the enhanced dependency tree with universal quantification, using the label nsubj:univ.]

1 Universal Quantification

Consider the sentence "Everybody wants to buy a house",1 whose dependency tree in the Universal Dependencies (UD) formalism is shown in Figure 1(a). This sentence has two possible readings: either (1) every person wants to buy a different house; or (2) every person wants to buy the same house. The two interpretations correspond to the following logical forms:

(1) ∀x. person(x_a) → [∃zyw. wants(z_e) ∧ arg1(z_e, x_a) ∧ buy(y_e) ∧ xcomp(z_e, y_e) ∧ house(w_a) ∧ arg1(y_e, x_a) ∧ arg2(y_e, w_a)] ;

(2) ∃w. house(w_a) ∧ (∀x. person(x_a) → [∃zy. wants(z_e) ∧ arg1(z_e, x_a) ∧ buy(y_e) ∧ xcomp(z_e, y_e) ∧ arg1(y_e, x_a) ∧ arg2(y_e, w_a)]) .

In (1), the existential variable w is in the scope of the universal variable x (i.e. the house is dependent on the person). This reading is commonly referred to as the surface reading. Conversely, in (2) the universal variable x is in the scope of the existential variable w (i.e. the house is independent of the person). This reading is also called the inverse reading. Our goal is to obtain the surface-reading logical form in (1) with UDEPLAMBDA. We do not aim to obtain the inverse reading, although this is possible with the use of Skolemization (Steedman, 2012).

In UDEPLAMBDA, lambda expressions for words, phrases and sentences are all of the form λx. ... . But from (1), it is clear that we need to express variables bound by quantifiers, e.g. ∀x, while still providing access to x for composition. This demands a change in the type system, since the same variable cannot be both lambda bound and quantifier bound — that is, we cannot have formulas of the form λx. ... ∀x. ... . In this material, we first derive the logical form for the example sentence using the type system from our main paper (Section 1.1) and show that it fails to handle universal quantification. We then modify the type system slightly to allow derivation of the desired surface-reading logical form (Section 1.2). This modified type system is a strict generalization of the original type system.2 Fancellu et al. (2017) present an elaborate discussion of the modified type system, and of how it handles negation scope and its interaction with universal quantifiers.

1 Example borrowed from Schuster and Manning (2016).
2 Note that this treatment has yet to be added to our implementation, which can be found at https://github.com/sivareddyg/udeplambda.

1.1 With Original Type System

We will first attempt to derive the logical form in (1) using the default type system of UDEPLAMBDA. Figure 1(b) shows the enhanced dependency tree for the sentence, where BIND has been introduced to connect the implied nsubj of "buy" (BIND is explained in Section 3.2 of the main paper). The s-expression corresponding to the enhanced tree is:

(nsubj (xcomp wants (mark (nsubj (dobj buy (det house a)) Ω) to)) (BIND everybody Ω)) .

With the following substitution entries,

wants, buy ∈ EVENT;
everybody, house ∈ ENTITY;
a, to ∈ FUNCTIONAL;
Ω = λx. EQ(x, ω);
nsubj = λfgx. ∃y. f(x) ∧ g(y) ∧ arg1(x_e, y_a);
dobj = λfgx. ∃y. f(x) ∧ g(y) ∧ arg2(x_e, y_a);
xcomp = λfgx. ∃y. f(x) ∧ g(y) ∧ xcomp(x_e, y_a);
mark ∈ HEAD;
BIND ∈ MERGE,

the lambda expression after composition becomes:

λz. ∃xywv. wants(z_e) ∧ everybody(x_a) ∧ arg1(z_e, x_a) ∧ EQ(x, ω) ∧ buy(y_e) ∧ xcomp(z_e, y_e) ∧ arg1(y_e, v_a) ∧ EQ(v, ω) ∧ house(w_a) ∧ arg2(y_e, w_a) .

This expression encodes the fact that x and v are in unification, and can thus be further simplified to:

(3) λz. ∃xyw. wants(z_e) ∧ everybody(x_a) ∧ arg1(z_e, x_a) ∧ buy(y_e) ∧ xcomp(z_e, y_e) ∧ arg1(y_e, x_a) ∧ house(w_a) ∧ arg2(y_e, w_a) .

However, the logical form (3) differs from the desired form (1). As noted above, UDEPLAMBDA with its default type system, where each s-expression must have the type η = Ind × Event → Bool, cannot handle quantifier scoping.

1.2 With Higher-order Type System

Following Champollion (2010), we make a slight modification to the type system. Instead of using expressions of the form λx. ... for words, we use either λf. ∃x. ... or λf. ∀x. ..., where f has type η. As argued by Champollion, this higher-order form makes quantification and negation handling sound and simpler in Neo-Davidsonian event semantics. Following this change, we assign the following lambda expressions to the words in our example sentence:

everybody = λf. ∀x. person(x) → f(x) ;
wants = λf. ∃x. wants(x_e) ∧ f(x) ;
to = λf. TRUE ;
buy = λf. ∃x. buy(x_e) ∧ f(x) ;
a = λf. TRUE ;
house = λf. ∃x. house(x_a) ∧ f(x) ;
Ω = λf. f(ω) .

Here "everybody" is assigned universal-quantifier semantics. Since the UD representation does not distinguish quantifiers, we need to rely on a small (language-specific) lexicon to identify them. To encode quantification scope, we enhance the label nsubj to nsubj:univ, which indicates that the subject argument of "wants" contains a universal quantifier, as shown in Figure 1(c).

This change of semantic type for words and s-expressions forces us to also modify the semantic type of dependency labels, in order to obey the single-type constraint of DEPLAMBDA (Reddy et al., 2016). Thus, dependency labels will now take the form λPQf. ..., where P is the parent expression, Q is the child expression, and the return expression is of the form λf. ... . Following this change, we assign the following lambda expressions to dependency labels:

nsubj:univ = λPQf. Q(λy. P(λx. f(x) ∧ arg1(x_e, y_a))) ;
nsubj = λPQf. P(λx. f(x) ∧ Q(λy. arg1(x_e, y_a))) ;
dobj = λPQf. P(λx. f(x) ∧ Q(λy. arg2(x_e, y_a))) ;
xcomp = λPQf. P(λx. f(x) ∧ Q(λy. xcomp(x_e, y_a))) ;
det, mark = λPQf. P(f) ;
BIND = λPQf. P(λx. f(x) ∧ Q(λy. EQ(y, x))) .

Notice that the lambda expression of nsubj:univ differs from that of nsubj. In the former, the lambda variables inside Q have wider scope than the variables in P (i.e. the universal-quantifier variable of "everybody" has scope over the event variable of "wants"), contrary to the latter.

The new s-expression for Figure 1(c) is

(nsubj:univ (xcomp wants (mark (nsubj (dobj buy (det house a)) Ω) to)) (BIND everybody Ω)) .

Substituting the modified expressions, and performing composition and simplification, leads to the expression:

(6) λf. ∀x. person(x_a) → [∃zyw. f(z) ∧ wants(z_e) ∧ arg1(z_e, x_a) ∧ buy(y_e) ∧ xcomp(z_e, y_e) ∧ house(w_a) ∧ arg1(y_e, x_a) ∧ arg2(y_e, w_a)] .

This expression is identical to (1) except for the outermost term λf. By applying (6) to λx. TRUE, we obtain (1), which completes the treatment of universal quantification in UDEPLAMBDA.

References

Lucas Champollion. 2010. Quantification and negation in event semantics. Baltic International Yearbook of Cognition, Logic and Communication 6(1):3.

Federico Fancellu, Siva Reddy, Adam Lopez, and Bonnie Webber. 2017. Universal Dependencies to Logical Forms with Negation Scope. arXiv preprint.

Siva Reddy, Oscar Täckström, Michael Collins, Tom Kwiatkowski, Dipanjan Das, Mark Steedman, and Mirella Lapata. 2016. Transforming Dependency Structures to Logical Forms for Semantic Parsing. Transactions of the Association for Computational Linguistics 4:127–140.

Sebastian Schuster and Christopher D. Manning. 2016. Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks. In Proceedings of the Tenth International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), Paris, France.

Mark Steedman. 2012. Taking Scope: The Natural Semantics of Quantifiers. MIT Press.
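To make the scope contrast between readings (1) and (2) concrete, the two logical forms can be evaluated against a tiny relational model. This is our own illustrative sketch (the model and function names are not part of UDEPLAMBDA):

```python
# A toy model: two people, each wanting to buy a *different* house.
persons = {"p1", "p2"}
houses = {"h1", "h2"}
# wants_to_buy[person] = the house that person wants to buy
wants_to_buy = {"p1": "h1", "p2": "h2"}

def surface_reading():
    """Reading (1): for every person there EXISTS a (possibly different) house.

    The existential is inside the universal, so the witness house may
    vary with the person.
    """
    return all(any(wants_to_buy[p] == h for h in houses) for p in persons)

def inverse_reading():
    """Reading (2): there EXISTS a single house that every person wants.

    The existential takes wide scope, so one house must work for everyone.
    """
    return any(all(wants_to_buy[p] == h for p in persons) for h in houses)
```

In this model the surface reading is true while the inverse reading is false, since no single house is wanted by everyone; setting wants_to_buy = {"p1": "h1", "p2": "h1"} would make both readings true, reflecting that (2) entails (1).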