Glue for Universal Dependencies

Matthew Gotham and Dag Haug

Centre for Advanced Study at the Norwegian Academy of Science and Letters

Oslo, 20 March 2018

Introduction

Universal Dependencies (UD) is a de facto annotation standard for cross-linguistic annotation of syntactic structure.
→ interest in deriving semantic representations from UD structures, ideally in a language-independent way

Our approach: adapt and exploit techniques from LFG + Glue semantics
  dependency structures ≈ f-structures
  LFG inheritance in UD (via Stanford dependencies)
  Glue offers a syntax-semantics interface where syntax can underspecify semantics
  postpone the need for language-specific lexical resources

Gotham & Haug (CAS) Glue for UD Oslo, 20 March 2018 2 / 53 Outline

1 Target representations

2 Introduction to Glue semantics

3 Universal Dependencies

4 Our pipeline

5 Evaluation and discussion


Target representations

Our target representations for sentence meanings are DRSs. The format of these DRSs is inspired by Boxer (Bos, 2008).

We assume discourse referents (drefs) of three sorts: entities (xn), eventualities (en) and propositions (pn).
The predicate ant means that its argument has an antecedent (it is a presupposed dref).
→ This also applies to the predicates beginning pron.
The connective ∂ marks presupposed conditions: it maps true to true and is otherwise undefined.
→ Unlike Boxer, which has separate DRSs for presupposed and asserted material.

Target representations: An example

(1) Abrams persuaded the dog to bark.

Us:    [x1 x2 e1 p1 | named(x1, abrams), ant(x2), ∂(dog(x2)), persuade(e1), agent(e1, x1), theme(e1, x2), content(e1, p1), p1 : [e2 | bark(e2), agent(e2, x2)]]

Boxer: [x1 e1 p1 | named(x1, abrams), persuade(e1), agent(e1, x1), theme(e1, x2), content(e1, p1), p1 : [e2 | bark(e2), agent(e2, x2)]], with the presupposed material in a separate DRS [x2 | dog(x2)] ( + theme(e1, x2) )

Target representations: Other running examples (taken from the CCS development suite)

(2) He hemmed and hawed.

[x1 e1 e2 | pron.he(x1), hem(e1), agent(e1, x1), haw(e2), agent(e2, x1)]

Target representations

(3) The dog they thought we admired barks.

[x1 x2 x3 e1 e2 p1 | ant(x1), ∂(dog(x1)), pron.they(x2), pron.we(x3), bark(e1), agent(e1, x1), ∂(think(e2)), ∂(agent(e2, x2)), ∂(content(e2, p1)), p1 : [e3 | admire(e3), agent(e3, x3), theme(e3, x1)]]

Target representations: Underlying logic

The Glue approach relies on meanings being put together by application and abstraction, so we need some form of compositional or λ-DRT for meaning construction.

someone ↦ λP.[x1 | person(x1)] ; P(x1)

Conceptually, we are assuming PCDRT (Haug, 2014), which has a definition of the ant predicate and (relatedly) a treatment of so-far-unresolved anaphora that doesn't require indexing. This specific assumption is not crucial, though.
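As a concrete illustration of the compositional-DRT idea, here is a minimal sketch in Python (purely illustrative; the class and the string-based condition encoding are assumptions, not the meaning language actually used in the pipeline) of a DRS datatype with a merge operation playing the role of ';', so that a meaning like the one for someone can be built by ordinary application and abstraction.

    # Minimal sketch of a compositional DRT fragment (illustrative only).

    class DRS:
        def __init__(self, refs=(), conds=()):
            self.refs = list(refs)      # discourse referents, e.g. ['x1']
            self.conds = list(conds)    # conditions as strings, e.g. ['person(x1)']

        def merge(self, other):         # the ';' operation of the slides
            return DRS(self.refs + other.refs, self.conds + other.conds)

        def __repr__(self):
            return f"[{' '.join(self.refs)} | {', '.join(self.conds)}]"

    # someone = λP. [x1 | person(x1)] ; P(x1)
    someone = lambda P: DRS(['x1'], ['person(x1)']).merge(P('x1'))

    # a simple property to apply it to: λx. [ | arrive(x)]
    arrive = lambda x: DRS([], [f'arrive({x})'])

    print(someone(arrive))    # [x1 | person(x1), arrive(x1)]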


Introduction to Glue semantics: What is Glue?

A theory of the syntax/semantics interface, originally developed for LFG, and now the mainstream in LFG (Dalrymple et al., 1993, 1999).
Has been applied to other frameworks: HPSG (Asudeh & Crouch, 2002), LTAG (Frank & van Genabith, 2001) and Minimalism (Gotham, 2018).
Interpretations of constituents are paired with formulae of a fragment of linear logic (Girard, 1987), and semantic composition is deduction in that logic, mediated by the Curry-Howard correspondence (Howard, 1980).

A crude characterisation would be that glue semantics is like categorial grammar and its semantics, but without the categorial grammar. (Crouch & van Genabith, 2000, 91)

Introduction to Glue semantics: An example

(4) Someone sees everything.

Two interpretations:
1 There is someone who sees everything. (surface scope, ∃ > ∀)
2 Everything is seen. (inverse scope, ∀ > ∃)

Q: Where is the ambiguity?

Introduction to Glue semantics: Montague Grammar (Montague, 1973; Dowty et al., 1981)

Ambiguity of syntactic derivation:

Surface scope:
  someone sees everything, 4
    someone
    sees everything, 5
      sees
      everything

Inverse scope:
  someone sees everything, 10, 0
    everything
    someone sees he0, 4
      someone
      sees he0, 5
        sees
        he0

Introduction to Glue semantics: Mainstream Minimalism (May, 1977, 1985; Heim & Kratzer, 1998)

Ambiguity of syntactic structure:

Surface scope: [S NP1 someone [S NP2 everything [S [NP t1] [VP [V sees] [NP t2]]]]]
Inverse scope: [S NP2 everything [S NP1 someone [S [NP t1] [VP [V sees] [NP t2]]]]]

Introduction to Glue semantics: Another way

What the approaches just mentioned have in common is the view that syntactic structure plus lexical semantics determines interpretation.
From this it follows that if a sentence is ambiguous, such as (4), then that ambiguity must be either lexical or syntactic.
The Glue approach is that syntax constrains what can combine with what, and how. (To this extent there is a similarity with Cooper storage (Cooper, 1983).)

Introduction to Glue semantics

Totally informal statement of what the constraints look like in (4):
  ⟦sees⟧ applies to A, then B, to form C.
  ⟦someone⟧ applies to (something that applies to B to form C) to form C.
  ⟦everything⟧ applies to (something that applies to A to form C) to form C.
There is more than one way to put ⟦someone⟧, ⟦sees⟧ and ⟦everything⟧ together, while obeying these constraints, to form C.
The different ways:
  give the different interpretations of (4), and
  correspond to different proofs from the same premises in linear logic.

Introduction to Glue semantics: The syntax-semantics interface according to Glue

syntactic structure + lexicon  --1-->  collection of linear logic premises  --2-->  linear logic proof(s)  --3-->  semantic interpretation(s)

1 Function, given by the Glue implementation
2 Relation, given by linear logic proof theory
3 Function, given by the Curry-Howard correspondence

Introduction to Glue semantics: Linear logic

Linear logic is often called a 'logic of resources' (Crouch & van Genabith, 2000, 5). The reason for this is that, in linear logic, for a sequent

  premise(s) ⊢ conclusion

to be valid, every premise in premise(s) must be 'used' exactly once. So for example,

  A ⊢ A   and   A, A ⊸ B ⊢ B,   but   A, A ⊬ A   and   A, A ⊸ (A ⊸ B) ⊬ B

(⊸ is linear implication)

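To make the resource discipline tangible, here is a small Python sketch (an illustration only, not the prover used later in the talk) that checks derivability in the ⊸ fragment using only the elimination rule and insisting that every premise is consumed exactly once; it reproduces the judgements above.

    # Sketch: linear resource accounting with only the elimination rule for
    # linear implication (-o). Every premise must be used exactly once.

    def impl(a, b):                 # represent A -o B as a nested tuple
        return ('-o', a, b)

    def derivable(premises, goal):
        """True iff goal follows from the multiset of premises, each used once."""
        def step(resources):
            if len(resources) == 1 and resources[0] == goal:
                return True
            for i, f in enumerate(resources):
                if isinstance(f, tuple) and f[0] == '-o':
                    _, ant, cons = f
                    rest = resources[:i] + resources[i+1:]
                    for j, g in enumerate(rest):
                        if g == ant and step(rest[:j] + rest[j+1:] + [cons]):
                            return True
            return False
        return step(list(premises))

    A, B = 'A', 'B'
    print(derivable([A, impl(A, B)], B))           # True:  A, A -o B |- B
    print(derivable([A, A], A))                    # False: a copy of A is left over
    print(derivable([A, impl(A, impl(A, B))], B))  # False: would need A twice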
Introduction to Glue semantics: Interpretation as deduction

In Glue, expressions of a meaning language (in this case, λ-DRT) are paired with formulae in a fragment of linear logic (the glue language), and steps of deduction carried out using those formulae correspond to operations performed on the meaning terms, according to the Curry-Howard correspondence.

Introduction to Glue semantics: Linear implication and functional types

Rules for ⊸ and their images under the Curry-Howard correspondence:

Elimination (⊸E): from f : X ⊸ Y and a : X, infer f(a) : Y.
  ... corresponds to application.
Introduction (⊸I,n): from a proof of f : Y using a hypothesis [v : X]n, infer λv.f : X ⊸ Y, discharging that hypothesis. Exactly one hypothesis must be discharged in the introduction step.
  ... corresponds to abstraction.

Propositions as types: type(X ⊸ Y) := type(X) → type(Y)

Introduction to Glue semantics: What you need from syntax

Label A: the object argument of sees (= everything)
Label B: the subject argument of sees (= someone)
Label C: the sentence as a whole (where someone takes scope, and where everything takes scope)

⇓

λQ.[x1 | person(x1)] ; Q(x1) : (B ⊸ C) ⊸ C        type (e → t) → t
λv.λu.[ | see(u, v)] : A ⊸ (B ⊸ C)                type e → (e → t)
λP.[ | [x1 | ] ⇒ P(x1)] : (A ⊸ C) ⊸ C             type (e → t) → t

Introduction to Glue semantics: Surface scope interpretation

⟦sees⟧ : A ⊸ (B ⊸ C)   (premise)        [z : A]   (hypothesis 1)
⟦sees⟧(z) : B ⊸ C   (⊸E)                [w : B]   (hypothesis 2)
⟦sees⟧(z)(w) : C   (⊸E)
λz.⟦sees⟧(z)(w) : A ⊸ C   (⊸I, discharging 1)
⟦everything⟧(λz.⟦sees⟧(z)(w)) : C   (⊸E, with ⟦everything⟧ : (A ⊸ C) ⊸ C)
λw.⟦everything⟧(λz.⟦sees⟧(z)(w)) : B ⊸ C   (⊸I, discharging 2)
⟦someone⟧(λw.⟦everything⟧(λz.⟦sees⟧(z)(w))) : C   (⊸E, with ⟦someone⟧ : (B ⊸ C) ⊸ C)

→β  [x1 | person(x1), [x2 | ] ⇒ [ | see(x1, x2)]]

Introduction to Glue semantics: Inverse scope interpretation

⟦sees⟧ : A ⊸ (B ⊸ C)   (premise)        [z : A]   (hypothesis 1)
⟦sees⟧(z) : B ⊸ C   (⊸E)
⟦someone⟧(⟦sees⟧(z)) : C   (⊸E, with ⟦someone⟧ : (B ⊸ C) ⊸ C)
λz.⟦someone⟧(⟦sees⟧(z)) : A ⊸ C   (⊸I, discharging 1)
⟦everything⟧(λz.⟦someone⟧(⟦sees⟧(z))) : C   (⊸E, with ⟦everything⟧ : (A ⊸ C) ⊸ C)

→β  [ | [x1 | ] ⇒ [x2 | person(x2), see(x2, x1)]]

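The two proofs correspond to two ways of composing the same three meanings. A small Python sketch (a toy extensional model rather than the λ-DRT terms used here; the model and names are invented for illustration) shows that the two orders of combination really do yield different truth conditions.

    # Toy model: everything is seen by someone, but nobody sees everything.
    DOM = {'a', 'b'}
    SEES = {('a', 'b'), ('b', 'a')}

    sees = lambda obj: lambda subj: (subj, obj) in SEES      # A -o (B -o C)
    someone = lambda scope: any(scope(x) for x in DOM)       # (B -o C) -o C
    everything = lambda scope: all(scope(y) for y in DOM)    # (A -o C) -o C

    # Surface scope (someone > everything), as in the first proof:
    surface = someone(lambda w: everything(lambda z: sees(z)(w)))
    # Inverse scope (everything > someone), as in the second proof:
    inverse = everything(lambda z: someone(lambda w: sees(z)(w)))

    print(surface, inverse)    # False True on this model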

Universal Dependencies: Theoretical considerations

Dependency grammars have severe expressivity constraints:
  the unique head constraint
  the overt token constraint
There are also some UD-specific choices:
  no argument/adjunct distinction
Some of this will be alleviated through enhanced dependencies, but those are not yet widely available.

Universal Dependencies: Coordination structure

Universal Dependencies: Control structure

Universal Dependencies: Relative clause structure

Universal Dependencies: No argument/adjunct distinction


Our pipeline: Overview

UD tree  --1-->  set of ⟨multiset of meaning constructors, rewritten tree⟩ pairs  --2-->  proof(s)  --3-->  DRS(s)

Traversal of the UD tree, matching each node against a rule file.
For each matched rule, a meaning constructor is produced, and then instantiated non-deterministically, possibly rewriting the UD tree in the process.
The result is a set of pairs ⟨M, T⟩ where M is a multiset of meaning constructors and T is a rewritten UD tree.
Each multiset is fed into a linear logic prover (by Miltiadis Kokkonidis) and beta reduction software (from Johan Bos' Boxer).

Our pipeline: Example

UD tree for "Peter arrived":
  arrived (pos=VERB, index=2), relation root
  arrived -nsubj-> Peter (pos=PROPN, index=1)

Rules matched during the traversal, and their instantiations:

pos = PROPN → λP.[x | named(x, :lemma:)] ; P(x) : (e↓ ⊸ t%R) ⊸ t%R
  instantiated: λP.[x | named(x, Peter)] ; P(x) : (e1 ⊸ t2) ⊸ t2

pos = VERB → λF.[e | :lemma:(e)] ; :DEP:(e) ; F(e) : (v↓ ⊸ t↓) ⊸ t↓
  with the nsubj dependent folded in: λx.λF.[e | arrive(e), nsubj(e, x)] ; F(e) : e↓nsubj ⊸ (v↓ ⊸ t↓) ⊸ t↓
  instantiated: λx.λF.[e | arrive(e), nsubj(e, x)] ; F(e) : e1 ⊸ (v2 ⊸ t2) ⊸ t2

relation = ROOT → λ_.[ | ] : v(↓) ⊸ t(↓)
  instantiated: λ_.[ | ] : v2 ⊸ t2

Resulting meaning constructors:
  λP.[x1 | named(x1, Peter)] ; P(x1) : (e1 ⊸ t2) ⊸ t2
  λx.λF.[e1 | arrive(e1), nsubj(e1, x)] ; F(e1) : e1 ⊸ (v2 ⊸ t2) ⊸ t2
  λ_.[ | ] : v2 ⊸ t2

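A minimal sketch (Python) of the rule-matching and instantiation step illustrated above. The node attributes, the pattern format and the placeholders other than :lemma: are simplifying assumptions for illustration, not the actual rule-file syntax; the verb rule, which also folds in its dependents, is omitted.

    # Sketch of pipeline step 1: match tree nodes against rule patterns and
    # instantiate meaning-constructor templates (illustrative only).

    TREE = [   # "Peter arrived", one dict per node
        {'index': 1, 'lemma': 'Peter',  'pos': 'PROPN', 'relation': 'nsubj', 'head': 2},
        {'index': 2, 'lemma': 'arrive', 'pos': 'VERB',  'relation': 'root',  'head': 0},
    ]

    RULES = [  # (pattern, meaning-constructor template)
        ({'pos': 'PROPN'},
         'lamP.[x|named(x,:lemma:)];P(x) : (e:down: -o t:head:) -o t:head:'),
        ({'relation': 'root'},
         'lam_.[|] : v:down: -o t:down:'),
    ]

    def matches(pattern, node):
        return all(node.get(k) == v for k, v in pattern.items())

    def instantiate(template, node):
        return (template.replace(':lemma:', node['lemma'])
                        .replace(':down:', str(node['index']))
                        .replace(':head:', str(node['head'])))

    for node in TREE:
        for pattern, template in RULES:
            if matches(pattern, node):
                print(instantiate(template, node))
    # lamP.[x|named(x,Peter)];P(x) : (e1 -o t2) -o t2
    # lam_.[|] : v2 -o t2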
Our pipeline: Interpretation in Glue

⟦arrived⟧ : e1 ⊸ (v2 ⊸ t2) ⊸ t2   (premise)        [y : e1]   (hypothesis 1)
⟦arrived⟧(y) : (v2 ⊸ t2) ⊸ t2   (⊸E)
⟦arrived⟧(y)(⟦root⟧) : t2   (⊸E, with ⟦root⟧ : v2 ⊸ t2)
λy.⟦arrived⟧(y)(⟦root⟧) : e1 ⊸ t2   (⊸I, discharging 1)
⟦Peter⟧(λy.⟦arrived⟧(y)(⟦root⟧)) : t2   (⊸E, with ⟦Peter⟧ : (e1 ⊸ t2) ⊸ t2)

λP.[x1 | named(x1, Peter)] ; P(x1) applied to λy.(λx.λF.[e1 | arrive(e1), nsubj(e1, x)] ; F(e1))(y)(λ_.[ | ])

→β  [x1 e1 | named(x1, Peter), arrive(e1), nsubj(e1, x1)]

Our pipeline: Control

(1) Abrams persuaded the dog to bark.

The persuade node, with its nsubj (Abrams), obj (dog) and xcomp (bark) dependents, yields the control meaning constructor

λP.λy.λx.λF.[e1 x1 x2 | persuade(e1), controldep(e1, x2), xcomp(e1, x1), obj(e1, y), nsubj(e1, x), x1 : P(x2)(λ_.[ | ])] ; F(e1)
  : (e↓xcomp nsubj ⊸ (v↓xcomp ⊸ t↓xcomp) ⊸ t↓xcomp) ⊸ e↓obj ⊸ e↓nsubj ⊸ (v↓ ⊸ t↓) ⊸ t↓

The ↓xcomp nsubj path is resolved by rewriting the tree: an empty node (*) is added as the nsubj of bark. After instantiation, the glue side becomes

  (e8 ⊸ (v6 ⊸ t6) ⊸ t6) ⊸ e4 ⊸ e1 ⊸ (v2 ⊸ t2) ⊸ t2

Our pipeline

⟦persuade⟧ : ((v6 ⊸ t6) ⊸ t6) ⊸ e4 ⊸ e1 ⊸ (v2 ⊸ t2) ⊸ t2   (premise)        ⟦bark⟧ : (v6 ⊸ t6) ⊸ t6   (premise)
⟦persuade⟧(⟦bark⟧) : e4 ⊸ e1 ⊸ (v2 ⊸ t2) ⊸ t2   (⊸E)
⟦persuade⟧(⟦bark⟧)(u) : e1 ⊸ (v2 ⊸ t2) ⊸ t2   (⊸E, with hypothesis 1 [u : e4])
⟦persuade⟧(⟦bark⟧)(u)(v) : (v2 ⊸ t2) ⊸ t2   (⊸E, with hypothesis 2 [v : e1])
⟦persuade⟧(⟦bark⟧)(u)(v)(⟦root⟧) : t2   (⊸E, with ⟦root⟧ : v2 ⊸ t2)
λu.⟦persuade⟧(⟦bark⟧)(u)(v)(⟦root⟧) : e4 ⊸ t2   (⊸I, discharging 1)
⟦the⟧(⟦dog⟧)(λu.⟦persuade⟧(⟦bark⟧)(u)(v)(⟦root⟧)) : t2   (⊸E, with ⟦the⟧(⟦dog⟧) : (e4 ⊸ t2) ⊸ t2)
λv.⟦the⟧(⟦dog⟧)(λu.⟦persuade⟧(⟦bark⟧)(u)(v)(⟦root⟧)) : e1 ⊸ t2   (⊸I, discharging 2)
⟦Abrams⟧(λv.⟦the⟧(⟦dog⟧)(λu.⟦persuade⟧(⟦bark⟧)(u)(v)(⟦root⟧))) : t2   (⊸E, with ⟦Abrams⟧ : (e1 ⊸ t2) ⊸ t2)

→β  [x1 x2 x3 e1 p1 | named(x1, abrams), ant(x2), ∂(dog(x2)), persuade(e1), nsubj(e1, x1), obj(e1, x2), controldep(e1, x3), xcomp(e1, p1), p1 : [e2 | bark(e2), nsubj(e2, x3)]]

Our pipeline: Relative clauses

(3) The dog they thought we admired barks.

Tree: barks -nsubj-> dog; dog -det-> the; dog -acl:relcl-> thought; thought -nsubj-> they; thought -ccomp-> admired; admired -nsubj-> we.

The relative clause (acl:relcl) contributes

λP.λV.λx.P(x) ; V(x)(λ_.[ | ])
  : (e↑ ⊸ t↑) ⊸ (e↓dep*dep{PType=Rel} ⊸ (v↓ ⊸ t↓) ⊸ t↓) ⊸ e↑ ⊸ t↑

The path ↓dep*dep{PType=Rel} is a functional uncertainty over chains of dependencies inside the relative clause. It is resolved non-deterministically by rewriting the tree: an empty node (*, index 9) is added as a dependent of admired, or of thought. The constructor then instantiates to

  (e2 ⊸ t2) ⊸ (e9 ⊸ (v4 ⊸ t4) ⊸ t4) ⊸ e2 ⊸ t2

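To illustrate how a functional-uncertainty path such as ↓dep*dep{PType=Rel} can be resolved against a tree, here is a small Python sketch (a generic illustration; the tree encoding is an assumption and this is not the authors' rewriting algorithm) that enumerates the nodes reachable from the head of the relative clause through chains of dependency arcs; each solution corresponds to one possible site for the added empty node.

    # Sketch: enumerate resolutions of a dep* path over a dependency tree,
    # i.e. all nodes reachable from a starting node via one or more arcs.

    CHILDREN = {   # "the dog they thought we admired barks"
        'barks':   [('nsubj', 'dog')],
        'dog':     [('det', 'the'), ('acl:relcl', 'thought')],
        'thought': [('nsubj', 'they'), ('ccomp', 'admired')],
        'admired': [('nsubj', 'we')],
    }

    def resolve_dep_star(start):
        """All (path, node) pairs reachable from start via one or more arcs."""
        results, stack = [], [(start, [])]
        while stack:
            node, path = stack.pop()
            for rel, child in CHILDREN.get(node, []):
                results.append((path + [rel], child))
                stack.append((child, path + [rel]))
        return results

    for path, node in resolve_dep_star('thought'):
        print(' '.join(path), '->', node)
    # nsubj -> they, ccomp -> admired, ccomp nsubj -> we: several candidate
    # attachment points inside the relative clause.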
Our pipeline: Other rules

relation = case; ↑↑{coarsePos = VERB} →
  lam(Y,(lam(X,drs([ ],[rel(:LEMMA:,Y,X) ])))) : e(↑) ⊸ v(↑↑) ⊸ t(↓)

relation = case →
  lam(Y,(lam(X,drs([ ],[rel(:LEMMA:,Y,X) ])))) : e(↑) ⊸ e(↑↑) ⊸ t(↓)

coarsePos = DET, lemma = a; ↑ cop { } →

relation = conj; det { } →
  lam(X,lam(Q,lam(C,lam(Y,app(app(C,drs([],[leq(X,Y)])),app(app(Q,C),Y))))))
  : e(↓) ⊸ ((t(↑) ⊸ t(↑) ⊸ t(↑)) ⊸ n(↑)) ⊸ (t(↑) ⊸ t(↑) ⊸ t(↑)) ⊸ n(↑)


Evaluation and discussion: Discussion of output

[x1 e1 | named(x1, Peter), arrive(e1), nsubj(e1, x1)]

What kind of θ-role is 'nsubj'? A syntactic name, lifted from the arc label. In and of itself, uninformative.
What we have in the DRS above is as much information as can be extracted from the UD tree alone, without lexical knowledge. Lexical knowledge in the form of meaning postulates such as (5) can be harnessed to further specify the meaning representation.

(5) ∀e∀x((arrive(e) ∧ nsubj(e, x)) → theme(e, x))

Applying (5) yields
[x1 e1 | named(x1, Peter), arrive(e1), theme(e1, x1)]

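A small Python sketch (illustrative only; the tuple encoding of conditions is an assumption) of how a meaning postulate like (5) can be applied as a rewrite on the output DRS, turning the syntactic nsubj condition into a thematic one.

    # Sketch: apply a postulate of the form
    #   forall e,x. (verb(e) & syn_role(e,x)) -> theta_role(e,x)
    # by rewriting DRS conditions, encoded as (predicate, args) tuples.

    def apply_postulate(conditions, verb, syn_role, theta_role):
        events = {args[0] for pred, args in conditions if pred == verb}
        return [(theta_role, args) if pred == syn_role and args[0] in events
                else (pred, args)
                for pred, args in conditions]

    drs = [('named', ('x1', 'Peter')), ('arrive', ('e1',)), ('nsubj', ('e1', 'x1'))]
    print(apply_postulate(drs, 'arrive', 'nsubj', 'theme'))
    # [('named', ('x1', 'Peter')), ('arrive', ('e1',)), ('theme', ('e1', 'x1'))]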
Evaluation and discussion

[x1 x2 x3 e1 p1 | ..., persuade(e1), obj(e1, x2), controldep(e1, x3), xcomp(e1, p1), p1 : [..., nsubj(e2, x3)]]

The persuade + xcomp meaning constructor has
  introduced an xcomp relation between the persuading event e1 and the proposition p1 that there is a barking event e2, and
  introduced an individual x3 as the nsubj of e2 and the controldep of e1.
But the information that persuade is an object control verb can again be encoded in a meaning postulate:

  ∀e∀x((persuade(e) ∧ controldep(e, x)) → obj(e, x))

This turns controldep(e1, x3) into obj(e1, x3). With thematic uniqueness, we get x2 = x3 in this case.
This blurs the distinction between lexical syntax and semantics.

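Continuing the previous sketch (same illustrative encoding): once the object-control postulate has rewritten controldep(e1, x3) to obj(e1, x3), thematic uniqueness can be implemented as a check that identifies two referents filling the same role of the same event.

    # Sketch: thematic uniqueness as referent identification. If two referents
    # fill the same role of the same event, equate the later one with the first.

    def unify_by_uniqueness(conditions, role):
        fillers, subst = {}, {}
        for pred, (e, x) in (c for c in conditions if c[0] == role):
            if e in fillers and fillers[e] != x:
                subst[x] = fillers[e]
            else:
                fillers[e] = x
        return subst

    conds = [('persuade', ('e1',)), ('obj', ('e1', 'x2')), ('obj', ('e1', 'x3'))]
    print(unify_by_uniqueness(conds, 'obj'))    # {'x3': 'x2'}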
VP/Sentence coordination: He hemmed and hawed

[x1 e2 e3 | pron.he(x1), hem(e2), nsubj(e2, x1), haw(e3)]

No way to distinguish V/VP/S coordination in DG because of the overt token constraint.
No argument sharing because of the unique head constraint (hence no nsubj condition for e3).

NP coordination: Abrams and/or Browne danced

or:  [e1 x2 x3 x4 | dance(e1), nsubj(e1, x2), named(x3, browne), named(x4, abrams), x3 ⊑ x2 ∨ x4 ⊑ x2]
and: [e1 x2 x3 x4 | dance(e1), nsubj(e1, x2), named(x3, browne), named(x4, abrams), x3 ⊑ x2, x4 ⊑ x2]

Evaluation and discussion: Argument/adjunct distinction

Kim relied on Sandy:   [e1 x2 x3 | rely(e1), named(x2, kim), named(x3, sandy), on(x3, e1)]
Kim left on Tuesday:   [e1 x2 x3 | leave(e1), named(x2, kim), named(x3, tuesday), on(x3, e1)]

Again, we will have to rely on meaning postulates to resolve the on relation to a thematic role in one case and a temporal relation in the other.

Evaluation and discussion: Evaluation

What we have so far is a proof of concept:
  tested on carefully crafted examples
  application of LFG techniques (functional uncertainties) to enrich underspecified UD syntax
  application of glue semantics to dependency structures
Very far from something practically useful:
  basic coverage of UD relations, except vocative, dislocated, clf, list, parataxis, orphan
  little or no work on interactions, special constructions, real data noise

Evaluation and discussion: Pros and cons of glue semantics

No need for binarization.
A flexible approach to scoping yields the different readings.
Hard to restrict unwanted/non-existing scopings.
Computing lots of uninteresting scope differences.

Evaluation and discussion: Unwanted scopings

λF.[e | arrive(e)] ; F(e) : (v1 ⊸ t1) ⊸ t1
λ_.[ | ] : v1 ⊸ t1

It is clear which DRS sentence-level operators (auxiliaries etc.) should target!
  modalities in the linear logic
  different types for the two DRSs

Evaluation and discussion: Efficient scoping

Two parameters: the level of scope, and the order of combination of quantifiers at each level.
We currently naively compute everything with a light-weight prover → obvious performance problems.
Disallow intermediate scopings?
Structure sharing across derivations (building on work in an LFG setting)

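A tiny Python sketch (illustrative arithmetic only) of why naive enumeration is costly: with both parameters free, the candidate scopings are permutations of the quantifiers at each level, and the counts multiply across levels.

    from itertools import permutations
    from math import factorial

    # One level with four scope-taking elements: 4! candidate orders.
    quantifiers = ['someone', 'everything', 'most dogs', 'two cats']
    print(len(list(permutations(quantifiers))), factorial(4))   # 24 24

    # Several levels (e.g. a matrix and an embedded clause): counts multiply.
    levels = [3, 2]            # number of quantifiers at each scope level
    readings = 1
    for n in levels:
        readings *= factorial(n)
    print(readings)            # 12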
Evaluation and discussion: Conclusions

Theoretical achievement: application of glue to dependency grammar.
Practical achievement: an interesting proof of concept.
But lots of work remains:
  support for partial proofs
  axiomatization of lexical knowledge
  ambiguity management

References

Asudeh, Ash & Richard Crouch. 2002. Glue Semantics for HPSG. In Frank van Eynde, Lars Hellan & Dorothee Beermann (eds.), Proceedings of the 8th International HPSG Conference. Stanford, CA: CSLI Publications.
Bos, Johan. 2008. Wide-coverage semantic analysis with Boxer. In Proceedings of the 2008 Conference on Semantics in Text Processing (STEP '08), 277–286. Stroudsburg, PA: Association for Computational Linguistics. http://dl.acm.org/citation.cfm?id=1626481.1626503.
Cooper, Robin. 1983. Quantification and syntactic theory (Studies in Linguistics and Philosophy 21). Dordrecht: D. Reidel.
Crouch, Richard & Josef van Genabith. 2000. Linear logic for linguists. ESSLLI 2000 course notes.
Dalrymple, Mary, Vineet Gupta, John Lamping & Vijay Saraswat. 1999. Relating resource-based semantics to categorial semantics. In Mary Dalrymple (ed.), Semantics and syntax in Lexical Functional Grammar, 261–280. Cambridge, MA: MIT Press.
Dalrymple, Mary, John Lamping & Vijay Saraswat. 1993. LFG semantics via constraints. In Steven Krauwer, Michael Moortgat & Louis des Tombe (eds.), EACL 1993, 97–105.
Dowty, David R., Robert E. Wall & Stanley Peters. 1981. Introduction to Montague semantics. Dordrecht: D. Reidel.
Frank, Anette & Josef van Genabith. 2001. GlueTag: Linear logic-based semantics for LTAG—and what it teaches us about LFG and LTAG. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG01 Conference. Stanford, CA: CSLI Publications.
Girard, Jean-Yves. 1987. Linear logic. Theoretical Computer Science 50(1). 1–101. doi:10.1016/0304-3975(87)90045-4.
Gotham, Matthew. 2018. Making Logical Form type-logical. Linguistics and Philosophy. doi:10.1007/s10988-018-9229-z. In press.
Haug, Dag Trygve Truslew. 2014. Partial dynamic semantics for anaphora: Compositionality without syntactic coindexation. Journal of Semantics 31. 457–511.
Heim, Irene & Angelika Kratzer. 1998. Semantics in Generative Grammar (Blackwell Textbooks in Linguistics 13). Oxford: Wiley-Blackwell.
Howard, W. A. 1980. The formulae-as-types notion of construction. In J. P. Seldin & J. R. Hindley (eds.), To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, 479–490. New York: Academic Press.
May, Robert. 1977. The grammar of quantification. Cambridge, MA: Massachusetts Institute of Technology dissertation.
May, Robert. 1985. Logical Form (Linguistic Inquiry Monographs 12). Cambridge, MA: MIT Press.
Montague, Richard. 1973. The proper treatment of quantification in ordinary English. In Patrick Suppes, Julius Moravcsik & Jaakko Hintikka (eds.), Approaches to Natural Language, 221–242. Dordrecht: D. Reidel.