
Computational semantics L4: Ambiguity and underspecification

Simon Dobnik [email protected]

April 11, 2019

Outline

Ambiguity in natural language

The computational problem with ambiguity

Underspecification

Lexical ambiguity

- Kim ran to the bank.
- Kim ran to the riverbank.
- Kim ran to the bank to get her money.
- Kim ran to the bank before it closed.

Syntactic ambiguity without semantic ambiguity

- NP → NP and NP
- Kim and Lee and Chris arrived early.

Parse 1, left-branching coordination:

    [S [NP [NP [NP Kim] and [NP Lee]] and [NP Chris]] [VP arrived early]]

Parse 2, right-branching coordination:

    [S [NP [NP Kim] and [NP [NP Lee] and [NP Chris]]] [VP arrived early]]

Both parses express the same meaning, so this syntactic ambiguity has no semantic effect.
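To see both bracketings mechanically, here is a minimal NLTK sketch; the toy grammar is my own illustration, not one of the grammar files referenced later in these slides:

    import nltk

    # Toy grammar in which NP coordination is structurally ambiguous.
    grammar = nltk.CFG.fromstring("""
        S -> NP VP
        NP -> NP CONJ NP | 'Kim' | 'Lee' | 'Chris'
        CONJ -> 'and'
        VP -> 'arrived' 'early'
    """)

    parser = nltk.ChartParser(grammar)
    # Prints two trees: left-branching [[Kim and Lee] and Chris]
    # and right-branching [Kim and [Lee and Chris]].
    for tree in parser.parse('Kim and Lee and Chris arrived early'.split()):
        print(tree)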

Syntactic ambiguity with semantic ambiguity

- NP → NP or NP
- Kim and Lee or Chris arrived early

Parse 1:

    [S [NP [NP [NP Kim] and [NP Lee]] or [NP Chris]] [VP arrived early]]

True if only Chris arrived early.

Parse 2:

    [S [NP [NP Kim] and [NP [NP Lee] or [NP Chris]]] [VP arrived early]]

False if only Chris arrived early.
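If we crudely treat each name as the proposition "X arrived early", the two truth conditions can be checked with plain Boolean connectives; this propositional simplification is mine, not the slides':

    # Scenario: only Chris arrived early.
    kim, lee, chris = False, False, True

    print((kim and lee) or chris)   # Parse 1: [[Kim and Lee] or Chris] -> True
    print(kim and (lee or chris))   # Parse 2: [Kim and [Lee or Chris]] -> False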

Anaphora

- Kim saw Lee and she smiled at him
- Kim_i saw Lee_j and she_i smiled at him_j
- Kim_i saw Lee_j and she_j smiled at him_i

Quantifier scope ambiguity

- a company representative interviews every new employee
- ∃x[company representative(x) ∧ ∀y[new employee(y) → interview(x, y)]]
- ∀y[new employee(y) → ∃x[company representative(x) ∧ interview(x, y)]]
- some surprising examples:
  - two boys ate two pizzas
  - most students read most books

Evaluating expressions with quantifiers

evaluating-quantifiers.ipynb or .py
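The notebook is not reproduced here, but the idea can be sketched with NLTK's model-checking API; the toy model (entities r1, r2, e1, e2 and the valuation) is my own assumption:

    from nltk.sem import Valuation, Model, Assignment

    # A toy model: two representatives, two employees, and who interviews whom.
    v = Valuation([
        ('rep', {'r1', 'r2'}),
        ('employee', {'e1', 'e2'}),
        ('interview', {('r1', 'e1'), ('r2', 'e2')}),
    ])
    m = Model(v.domain, v)
    g = Assignment(v.domain)

    # Wide-scope existential: one representative interviews every employee.
    print(m.evaluate('exists x.(rep(x) & all y.(employee(y) -> interview(x,y)))', g))  # False here
    # Wide-scope universal: every employee is interviewed by some representative.
    print(m.evaluate('all y.(employee(y) -> exists x.(rep(x) & interview(x,y)))', g))  # True here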

Outline

Ambiguity in natural language

The computational problem with ambiguity

Underspecification

How many readings?

- In most democratic countries most politicians can fool most of the people on almost every issue most of the time. (Hobbs, 1983)
- five scope-taking quantifiers, so 5! = 120 readings
- ... but no politician can fool all of the people all of the time

How do you disambiguate?

- it is not practical to ask users to disambiguate
- first you have to explain to the user what the ambiguity is...
- ... and then it is not clear that you can find enough unambiguous natural language sentences to express the different readings
- so the user has to know logic!

Outline

Ambiguity in natural language

The computational problem with ambiguity

Underspecification

Packing several meanings into a single representation

- finding all the readings is computationally inefficient
- ... and then you still have to figure out which of the meanings was meant
- underspecified meaning representations allow you to compute a single representation from which you can generate the specified meanings if necessary

Cooper storage

(Cooper, 1983)

The tree for "a representative interviews every employee", with each node's core translation followed by its store of ⟨quantifier, index⟩ pairs:

    S: interview(x1, x0)
       store: ⟨λP[∃x[rep(x) ∧ P(x)]], 1⟩, ⟨λP[∀x[employee(x) → P(x)]], 0⟩

      NP (a representative): λP[P(x1)]
         store: ⟨λP[∃x[rep(x) ∧ P(x)]], 1⟩

      VP (interviews + NP): λx[interview(x, x0)]
         store: ⟨λP[∀x[employee(x) → P(x)]], 0⟩

        NP (every employee): λP[P(x0)]
           store: ⟨λP[∀x[employee(x) → P(x)]], 0⟩

Retrieval

- interview(x1, x0)   ⟨λP[∃x[rep(x) ∧ P(x)]], 1⟩, ⟨λP[∀x[employee(x) → P(x)]], 0⟩
- λP[∃x[rep(x) ∧ P(x)]](λx1[interview(x1, x0)])   ⟨λP[∀x[employee(x) → P(x)]], 0⟩
- ∃x[rep(x) ∧ interview(x, x0)]   ⟨λP[∀x[employee(x) → P(x)]], 0⟩
- λP[∀x[employee(x) → P(x)]](λx0[∃x[rep(x) ∧ interview(x, x0)]])
- ∀y[employee(y) → ∃x[rep(x) ∧ interview(x, y)]]

Retrieval, contd.

- interview(x1, x0)   ⟨λP[∃x[rep(x) ∧ P(x)]], 1⟩, ⟨λP[∀x[employee(x) → P(x)]], 0⟩
- λP[∀x[employee(x) → P(x)]](λx0[interview(x1, x0)])   ⟨λP[∃x[rep(x) ∧ P(x)]], 1⟩
- ∀x[employee(x) → interview(x1, x)]   ⟨λP[∃x[rep(x) ∧ P(x)]], 1⟩
- λP[∃x[rep(x) ∧ P(x)]](λx1[∀x[employee(x) → interview(x1, x)]])
- ∃y[rep(y) ∧ ∀x[employee(x) → interview(y, x)]]
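Each retrieval step is just function application plus β-reduction, so the derivation above can be replayed with NLTK's logic module; a minimal sketch (the formulas follow the slides, the NLTK notation is mine):

    from nltk.sem import Expression

    read = Expression.fromstring

    # Retrieve the universal quantifier first, as in "Retrieval, contd.":
    quant_every = read(r'\P.all y.(employee(y) -> P(y))')
    core        = read(r'\x0.interview(x1, x0)')
    step = quant_every.applyto(core).simplify()
    print(step)    # all y.(employee(y) -> interview(x1,y))

    # Then retrieve the existential over the remaining free variable x1:
    quant_some = read(r'\P.exists x.(rep(x) & P(x))')
    final = quant_some.applyto(read(r'\x1.' + str(step))).simplify()
    print(final)   # exists x.(rep(x) & all y.(employee(y) -> interview(x,y)))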

Implementing Cooper storage

nltk_data/grammars/book_grammars/storage.fcfg
storage.fcfg (with my comments)
cooper-storage.ipynb or .py
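For reference, a corresponding NLTK session following the pattern of the nltk.sem.cooper_storage demo from the NLTK book (assumes the book_grammars data package is installed):

    from nltk.sem import cooper_storage as cs

    sentence = 'every girl chases a dog'
    trees = cs.parse_with_bindops(sentence,
                                  grammar='grammars/book_grammars/storage.fcfg')
    semrep = trees[0].label()['SEM']

    store = cs.CooperStore(semrep)   # split into a core and a quantifier store
    print(store.core)
    print(store.store)

    store.s_retrieve(trace=True)     # retrieve the stored binders in every order
    for reading in store.readings:   # one logical form per scope ordering
        print(reading)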

Quasi Logical Form (QLF) I

- Core Language Engine (CLE): (Alshawi and van Eijck, 1989), (Alshawi, 1992)
- Most doctors and some engineers read every article

    quant(exists, e, Ev(e),
          Read(e,
               term_coord(A, x,
                          qterm(most, plur, y, Doctor(y)),
                          qterm(some, plur, z, Engineer(z))),
               qterm(every, sing, v, Article(v))))

Quasi Logical Form (QLF) II

- resolved QLF

    quant(most, y, Doctor(y),
          quant(every, v, Article(v),
                quant(exists, e, Ev(e), Read(e,y,v))))
    &
    quant(some, z, Engineer(z),
          quant(every, v, Article(v),
                quant(exists, e, Ev(e), Read(e,z,v))))

Quasi Logical Form (QLF) III

- Mary expected him to introduce himself
- him: a_term(ref(pro, him, sing, [mary]), x, Male(x))
- himself: a_term(ref(refl, him, sing, [x,mary]), y, Male(y))

Quasi Logical Form (QLF) IV

- Does the unresolved QLF have a semantic interpretation?
- Can you do inference on unresolved QLFs?
- Do humans work with underspecified representations?

Hole semantics I

(Bos, 1996), (Blackburn and Bos, 2005); a useful brief discussion is in (Jurafsky and Martin, 2009), p. 629ff.

- a constraint-based approach
- a company representative interviews every new employee

    l1: ∃x[company representative(x) ∧ h1]
    l2: ∀y[new employee(y) → h2]
    l3: interview(x, y)

    constraints: l1 ≤ h0, l2 ≤ h0, l3 ≤ h1, l3 ≤ h2

- plugging l1 ↦ h0, l2 ↦ h1, l3 ↦ h2:
    ∃x[company representative(x) ∧ ∀y[new employee(y) → interview(x, y)]]
- plugging l2 ↦ h0, l1 ↦ h2, l3 ↦ h1:
    ∀y[new employee(y) → ∃x[company representative(x) ∧ interview(x, y)]]
- what is the interpretation of underspecified representations?

Implementing Hole semantics

nltk_data/grammars/sample_grammars/hole.fcfg
hole.fcfg (with my comments)
hole-semantics.ipynb or .py
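A sketch of the corresponding session: nltk.sem.hole exposes a hole_readings helper that parses with this sample grammar and enumerates the admissible pluggings (assuming the sample_grammars data package is installed and the grammar covers the example sentence):

    from nltk.sem.hole import hole_readings

    # Enumerate all pluggings licensed by the dominance constraints.
    for reading in hole_readings('every girl chases a dog'):
        print(reading)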

Minimal recursion semantics (MRS) I

(Copestake et al., 2005)

- every dog chases some white cat
- some(y, white(y) ∧ cat(y), every(x, dog(x), chase(x, y)))

    h1: every(x, h3, h4)
    h3: dog(x)
    h7: white(y)
    h7: cat(y)
    h5: some(y, h7, h1)
    h4: chase(x, y)

Minimal recursion semantics (MRS) II

- every(x, dog(x), some(y, white(y) ∧ cat(y), chase(x, y)))

    h1: every(x, h3, h5)
    h3: dog(x)
    h7: white(y)
    h7: cat(y)
    h5: some(y, h7, h4)
    h4: chase(x, y)

Minimal recursion semantics (MRS) III

- underspecified representation

    h1: every(x, h3, h8)
    h3: dog(x)
    h7: white(y)
    h7: cat(y)
    h5: some(y, h7, h9)
    h4: chase(x, y)

- can be specified by h8 = h5 and h9 = h4, or by h8 = h4 and h9 = h1
- Reading 1:

    h1: every(x, dog(x), h5)
    h5: some(y, cat(y), h4)
    h5: some(y, cat(y), chase(x, y))
    h1: every(x, dog(x), some(y, cat(y), chase(x, y)))

Minimal recursion semantics (MRS) IV

- Reading 2:

    h1: every(x, dog(x), h4)
    h1: every(x, dog(x), chase(x, y))
    h5: some(y, cat(y), h1)
    h5: some(y, cat(y), every(x, dog(x), chase(x, y)))

- again a question of interpretation

Summary

- natural languages are ambiguous
- this is a computational problem
- there is a large number of readings
- it is unclear how to disambiguate
- proposals for underspecified representations:
  - structural manipulation (storage, QLF)
  - constraint-based (hole semantics, MRS)
- it is unclear what the interpretation of underspecified representations is and whether you can reason with them appropriately

Further reading

* (Bird, Klein, and Loper, 2009): Section 3.7 Quantifier Scope Ambiguity and Section 4.5 Quantifier Ambiguity Revisited
* (Jurafsky and Martin, 2000): 18.3 Quantifier Scope Ambiguity and Underspecification
- (Blackburn and Bos, 2005): Chapter 3: Underspecified Representations (advanced)

* indicates basic reading

Acknowledgement

Some slides are based on slides by Robin Cooper.

References I

Alshawi, Hiyan. 1992. The Core Language Engine. ACL-MIT Press Series in Natural Language Processing. MIT Press, Cambridge, Mass.

Alshawi, Hiyan and Jan van Eijck. 1989. Logical forms in the Core Language Engine. In Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics, ACL '89, pages 25–32, Stroudsburg, PA, USA. Association for Computational Linguistics.

Bird, Steven, Ewan Klein, and Edward Loper. 2009. Natural Language Processing with Python. O'Reilly.

Blackburn, Patrick and Johan Bos. 2005. Representation and Inference for Natural Language: A First Course in Computational Semantics. CSLI Publications.

Bos, Johan. 1996. Predicate logic unplugged. Universität des Saarlandes.

Cooper, Robin. 1983. Quantification and Syntactic Theory, volume 21. D. Reidel, Dordrecht, Holland.

References II

Copestake, Ann, Dan Flickinger, Carl Pollard, and Ivan A. Sag. 2005. Minimal recursion semantics: an introduction. Research on Language and Computation, 3(2-3):281–332.

Jurafsky, Dan and James H. Martin. 2000. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall, Upper Saddle River, N.J.

Jurafsky, Dan and James H. Martin. 2009. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Pearson Prentice Hall, Upper Saddle River, N.J., 2nd edition.
