EACL 2014

14th Conference of the European Chapter of the Association for Computational Linguistics

Proceedings of the Workshop on Type Theory and Natural Language Semantics (TTNLS)

April 27, 2014
Gothenburg, Sweden

© 2014 The Association for Computational Linguistics

Order copies of this and other ACL proceedings from:

Association for Computational Linguistics (ACL)
209 N. Eighth Street
Stroudsburg, PA 18360
USA
Tel: +1-570-476-8006
Fax: +1-570-476-0860
[email protected]

ISBN 978-1-937284-74-9

Introduction

Type theory has been a central area of research in logic, the semantics of programming languages, and natural language semantics over the past fifty years. Recent developments in type theory have been used to reconstruct the formal foundations of natural language semantics. The treatments are generally intensional and polymorphic in character. They allow for structured, fine-grained encoding of information across a diverse set of linguistic domains.

The work in this area has opened up new approaches to modelling the relations between, inter alia, syntax, semantic interpretation, dialogue, inference, and cognition, from a largely proof theoretic perspective. The papers in this volume cover a wide range of topics on the application of type theory to modelling semantic properties of natural language.

TTNLS 2014 provides a forum for the presentation of leading-edge research in this fast-developing sub-field of computational linguistics. To the best of our knowledge, it is the first major conference on this topic hosted by the ACL.

We received a total of 13 relevant submissions, 10 of which were accepted for presentation. Each submission was reviewed by two members of our Programme Committee. We thank these people for their detailed and helpful reviews.

We are also including Aarne Ranta’s invited paper here. We are honoured that he has accepted our invitation to present this paper at TTNLS, and we look forward to his talk.

We would like to thank the organisers of EACL 2014 both for their financial assistance and their organisational support. We are also grateful to the Dialogue Technology Lab at the Centre for Language Technology of the University of Gothenburg, and to the Wenner-Gren Foundations for funding Shalom Lappin’s research visits to the University of Gothenburg, during which this workshop was organised.

Robin Cooper, Simon Dobnik, Shalom Lappin, and Staffan Larsson

University of Gothenburg and London

March 2014


Workshop Organisation

Organising Committee:
Robin Cooper (University of Gothenburg)
Simon Dobnik (University of Gothenburg)
Shalom Lappin (King’s College London)
Staffan Larsson (University of Gothenburg)

Programme Committee:
Krasimir Angelov (University of Gothenburg and Chalmers University of Technology)
Patrick Blackburn (University of Roskilde)
Stergios Chatzikyriakidis (Royal Holloway, University of London)
Stephen Clark (University of Cambridge)
Philippe de Groote (Inria Nancy - Grand Est)
Jan van Eijck (University of Amsterdam)
Raquel Fernández (University of Amsterdam)
Tim Fernando (Trinity College, Dublin)
Chris Fox (University of Essex)
Jonathan Ginzburg (Université Paris-Diderot (Paris 7))
Zhaohui Luo (Royal Holloway, University of London)
Uwe Mönnich (University of Tübingen)
Bruno Mery (LaBRI, Université Bordeaux 1)
Glyn Morrill (Universitat Politècnica de Catalunya, Barcelona)
Larry Moss (Indiana University)
Reinhard Muskens (Tilburg University)
Bengt Nordström (University of Gothenburg and Chalmers University of Technology)
Valeria de Paiva (Nuance Communications, Inc., Sunnyvale, California)
Carl Pollard (The Ohio State University)
Ian Pratt-Hartmann (University of Manchester)
Stephen Pulman (University of Oxford)
Matt Purver (Queen Mary, University of London)
Aarne Ranta (University of Gothenburg and Chalmers University of Technology)
Christian Retoré (LaBRI, Université Bordeaux 1)
Scott Martin (Nuance Communications, Inc., Sunnyvale, California)
Ray Turner (University of Essex)

Invited speaker: Aarne Ranta (University of Gothenburg and Chalmers University of Technology)


Table of Contents

Types and Records for Predication Aarne Ranta ...... 1

System with Generalized Quantifiers on Dependent Types for Anaphora Justyna Grudzinska and Marek Zawadowski ...... 10

Monads as a Solution for Generalized Opacity Gianluca Giorgolo and Ash Asudeh ...... 19

The Phenogrammar of Coordination Chris Worth ...... 28

Natural Language Reasoning Using Proof-Assistant Technology: Rich Typing and Beyond Stergios Chatzikyriakidis and Zhaohui Luo ...... 37

A Type-Driven Tensor-Based Semantics for CCG Jean Maillard, Stephen Clark and Edward Grefenstette ...... 46

From Natural Language to RDF Graphs with Pregroups Antonin Delpeuch and Anne Preller ...... 55

Incremental semantic scales by strings Tim Fernando ...... 63

A Probabilistic Rich Type Theory for Semantic Interpretation Robin Cooper, Simon Dobnik, Shalom Lappin and Staffan Larsson ...... 72

Probabilistic Type Theory for Incremental Dialogue Processing Julian Hough and Matthew Purver ...... 80

Propositions, Questions, and Adjectives: a rich type theoretic approach Jonathan Ginzburg, Robin Cooper and Tim Fernando ...... 89


Conference Program

Sunday, 27 April 2014

8:15 Registration

8:45 Opening remarks by Robin Cooper

(9:00 - 10:30) Session 1

9:00 Types and Records for Predication Aarne Ranta

10:00 System with Generalized Quantifiers on Dependent Types for Anaphora Justyna Grudzinska and Marek Zawadowski

10:30 Coffee break

(11:00 - 12:30) Session 2

11:00 Monads as a Solution for Generalized Opacity Gianluca Giorgolo and Ash Asudeh

11:30 The Phenogrammar of Coordination Chris Worth

12:00 Natural Language Reasoning Using Proof-Assistant Technology: Rich Typing and Beyond Stergios Chatzikyriakidis and Zhaohui Luo

12:30 Lunch

(14:00 - 15:30) Session 3

14:00 A Type-Driven Tensor-Based Semantics for CCG Jean Maillard, Stephen Clark and Edward Grefenstette

14:30 From Natural Language to RDF Graphs with Pregroups Antonin Delpeuch and Anne Preller

15:00 Incremental semantic scales by strings Tim Fernando

15:30 Coffee break

Sunday, 27 April 2014 (continued)

(16:00 - 17:30) Session 4

16:00 A Probabilistic Rich Type Theory for Semantic Interpretation Robin Cooper, Simon Dobnik, Shalom Lappin and Staffan Larsson

16:30 Probabilistic Type Theory for Incremental Dialogue Processing Julian Hough and Matthew Purver

17:00 Propositions, Questions, and Adjectives: a rich type theoretic approach Jonathan Ginzburg, Robin Cooper and Tim Fernando

17:30 Concluding remarks

Types and Records for Predication

Aarne Ranta
Department of Computer Science and Engineering, University of Gothenburg
[email protected]

Abstract

This paper studies the use of records and dependent types in GF (Grammatical Framework) to build a grammar for predication with an unlimited number of subcategories, also covering extraction and coordination. The grammar is implemented for Chinese, English, Finnish, and Swedish, sharing the maximum of code to identify similarities and differences between the languages. Equipped with a probabilistic model and a large lexicon, the grammar has also been tested in wide-coverage machine translation. The first evaluations show improvements in parsing speed, coverage, and robustness in comparison to earlier GF grammars. The study confirms that dependent types, records, and functors are useful in both engineering and theoretical perspectives.

1 Introduction

Predication is the basic level of syntax. In logic, it means building atomic formulas by predicates. In linguistics, it means building sentences by verbs. Categorial grammars (Bar-Hillel, 1953; Lambek, 1958) adapt logical predication to natural language. Thus for instance transitive verbs are categorized as (n\s/n), which is the logical type n → n → s with the information that one argument comes before the verb and the other one after. But most approaches to syntax and semantics, including (Montague, 1974), introduce categories as primitives rather than as function types. Thus transitive verbs are a category of its own, related to logic via a semantic rule. This gives more expressive power, as it permits predicates with different syntactic properties and variable word order (e.g. inversion in questions). A drawback is that a grammar may need a large number of categories and rules. In GPSG (Gazdar et al., 1985), and later in HPSG (Pollard and Sag, 1994), this is solved by introducing a feature called subcat for verbs. Verbs taking different arguments differ in the subcat feature but share otherwise the characteristic of being verbs.

In this paper, we will study the syntax and semantics of predication in GF, Grammatical Framework (Ranta, 2011). We will generalize both over subcategories (as in GPSG and HPSG), and over languages (as customary in GF). We use dependent types to control the application of verbs to legitimate arguments, and records to control the placement of arguments in sentences. The record structure is inspired by the topological model of syntax in (Diderichsen, 1962).

The approach is designed to apply to all languages in the GF Resource Grammar Library (RGL, (Ranta, 2009)), factoring out their typological differences in a modular way. We have tested the grammar with four languages from three families: Chinese, English, Finnish, and Swedish. As the implementation reuses old RGL code for all parts but predication, it can be ported to new languages with just a few pages of new GF code. We have also tested it in wide coverage tasks, with a probabilistic tree model and a lexicon of 60,000 lemmas.

We will start with an introduction to the abstraction mechanisms of GF and conclude with a summary of some recent research. Section 2 places GF on the map of grammar formalisms. Section 3 works out an example showing how abstract syntax can be shared between languages. Section 4 shows how parts of concrete syntax can be shared as well. Section 5 gives the full picture of predication with dependent types and records, also addressing extraction, coordination, and semantics. Section 6 gives preliminary evaluation. Section 7 concludes.

Proceedings of the EACL 2014 Workshop on Type Theory and Natural Language Semantics (TTNLS), pages 1–9, Gothenburg, Sweden, April 26-30 2014. © 2014 Association for Computational Linguistics

2 GF: an executive summary

GF belongs to a subfamily of categorial grammars inspired by (Curry, 1961). These grammars make a distinction between tectogrammar, which specifies the syntactic structures (tree-like representations), and phenogrammar, which relates these structures to linear representations, such as sequences of characters, words, or phonemes. Other formalisms in this family include ACG (de Groote, 2001) and Lambda grammars (Muskens, 2001).

GF inherits its name from LF, Logical Frameworks, which are type theories used for defining logics (Harper et al., 1993). GF builds on the LF called ALF, Another Logical Framework (Magnusson, 1994), which implements Martin-Löf’s higher-level type theory (first introduced in the preface of (Martin-Löf, 1984); see Chapter 8 of (Ranta, 1994) for more details). Before GF was introduced as an independent formalism in 1998, GF-like applications were built as plug-ins to ALF (Ranta, 1997). The idea was that the LF defines the tectogrammar, and the plug-in defines the phenogrammar. The intended application was natural language interfaces to formal proof systems, in the style of (Coscoy et al., 1995).

GF was born via two additions to the natural language interface idea. The first one was multilinguality: one and the same tectogrammar can be given multiple phenogrammars. The second addition was parsing: the phenogrammar, which was initially just linearization (generating strings from type theoretical formulas), was reversed to rules that parse natural language into type theory. The result was a method for translation, which combines parsing the source language with linearization into the target language. This idea was indeed suggested in (Curry, 1961), and applied before GF in the Rosetta project (Landsbergen, 1982), which used Montague’s analysis trees as tectogrammar.

GF can be seen as a formalization and generalization of Montague grammar. Formalization, because it introduces a formal notation for the linearization rules that in Montague’s work were expressed informally. Generalization, because of multilinguality and also because the type system for analysis trees has dependent types.

Following the terminology of programming language theory, the tectogrammar is in GF called the abstract syntax whereas the phenogrammar is called the concrete syntax. As in compilers and logical frameworks, the abstract syntax encodes the structure relevant for semantics, whereas the concrete syntax defines “syntactic sugar”.

The resulting system turned out to be equivalent to parallel multiple context-free grammars (Seki et al., 1991) and therefore parsable in polynomial time (Ljunglöf, 2004). Comprehensive grammars have been written for 29 languages, and later work has optimized GF parsing and also added probabilistic disambiguation and robustness, resulting in state-of-the-art performance in wide-coverage deep parsing (Angelov, 2011; Angelov and Ljunglöf, 2014).

3 Example: subject-verb-object sentences

Let us start with an important special case of predication: the subject-verb-object structure. The simplest possible rule is

fun PredTV : NP -> TV -> NP -> S

that is, a function that takes a subject NP, a transitive verb TV, and an object NP, and returns a sentence S. This function builds abstract syntax trees. Concrete syntax defines linearization rules, which convert trees into strings. The above rule can give rise to different word orders, such as SVO (as in English), SOV (as in Hindi), and VSO (as in Arabic):

lin PredTV s v o = s ++ v ++ o
lin PredTV s v o = s ++ o ++ v
lin PredTV s v o = v ++ s ++ o

where ++ means concatenation.

The above rule builds a sentence in one step. A more flexible approach is to do it in two steps: complementation, forming a VP (verb phrase) from the verb and the object, and predication proper that provides the subject. The abstract syntax is

fun Compl : TV -> NP -> VP
fun Pred : NP -> VP -> S

These functions are easy to linearize for the SVO and SOV orders:

lin Compl v o = v ++ o  -- SVO
lin Compl v o = o ++ v  -- SOV
lin Pred s vp = s ++ vp -- both

where -- marks a comment. However, the VSO order cannot be obtained in this way, because the two parts of the VP are separated by the subject. The solution is to generalize linearization from strings to records. Complementation can then return a record that has the verb and the object as separate fields. Then we can also generate VSO:

lin Compl v o = {verb = v ; obj = o}
lin Pred s vp = vp.verb ++ s ++ vp.obj

The dot (.) means projection, picking the value of a field in a record.

Records enable the abstract syntax to abstract away not only from word order, but also from whether a language uses discontinuous constituents. VP in VSO languages is one example. Once we enable discontinuous constituents, they turn out useful almost everywhere, as they enable us to delay the decision about linear order. It can then be varied even inside a single language, if it depends on syntactic context (as e.g. in German; cf. (Müller, 2004) for a survey).

The next thing to abstract away from is inflection and agreement. Given the lexicon

fun We, She : NP
fun Love : TV

we can build the abstract syntax tree

Pred We (Compl Love She)

to represent we love her. If we swap the subject and the object, we get

Pred She (Compl Love We)

for she loves us. Now, these two sentences are built from the same abstract syntax objects, but no single word is shared between them! This is because the noun phrases inflect for case and the verb agrees with the subject. In contrast to English, Chinese just reorders the words:

women ai ta - “we love her”
ta ai women - “she loves us”

Thus the above rules for SVO languages work as they are for Chinese. But in English, we must include case and agreement as features in the concrete syntax. Thus the linearization of an NP is a record that includes a table producing the case forms, and agreement as an inherent feature:

lin She = {
  s = table {
    Nom => "she" ;
    Acc => "her"
  } ;
  a = {n = Sg ; p = P3}
}

The agreement feature (field a) is itself a record, with a number and a gender. In other languages, case and agreement can of course have different sets of values.

Verbs likewise include tables that inflect them for different agreement features:

lin Love = {
  s = table {
    {n = Sg ; p = P3} => "loves" ;
    _ => "love"
  }
}

We can now define English linearization:

lin Compl v o = {s = table {a => v.s ! a ++ o.s ! Acc}}
lin Pred np vp = {s = np.s ! Nom ++ vp.s ! np.a}

using the same type of records for VP as for TV, and a one-string record for S. The Compl rule passes the agreement feature to the verb of the VP, and selects the Acc form of the object (with ! denoting selection from a table). The Pred rule selects the Nom form of the subject, and attaches to this the VP form selected for np.a, i.e. the agreement feature of the subject.

4 Generalized concrete syntax

To see the full power of GF, we now take a look at its type and module system. Figure 1 shows a complete set of grammar modules implementing transitive verb predication for Finnish and Chinese with a maximum of shared code.

The first module in Figure 1 is the abstract syntax Pred, where the fun rules are preceded by a set of cat rules defining the categories of the grammar, i.e. the basic types. Pred defines five categories: S, Cl, NP, VP, and TV. S is the top-level category of sentences, whereas Cl (clause) is the intermediate category of predications, which can be used as sentences in many ways: here, as declaratives and as questions.

The concrete syntax has corresponding lincat rules, which equip each category with a linearization type, i.e. the type of the values returned when linearizing trees of that category. The module PredFunctor in Figure 1 contains such rules. In lincat NP, the type Case => Str is the type of tables that produce a string as a function of a case, and Agr is the type of agreement features.

When a GF grammar is compiled, each lin rule is type checked with respect to the lincats of the categories involved, to guarantee that, for every

fun f : C1 -> ... -> Cn -> C

we have

lin f : C1* -> ... -> Cn* -> C*
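For a concrete instance of this typing condition (our own illustration, plugging in the English linearization types shown above; the figure itself gives the Finnish and Chinese variants):

```gf
-- the two-place predication rule from Section 3
fun Pred : NP -> VP -> S

-- with NP* = {s : Case => Str ; a : Agr}, VP* = {s : Agr => Str},
-- and S* = {s : Str}, its linearization must have the type
lin Pred : {s : Case => Str ; a : Agr} -> {s : Agr => Str} -> {s : Str}
```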

abstract Pred = {
  cat S ; Cl ; NP ; VP ; TV ;
  fun Compl : TV -> NP -> VP ;
  fun Pred : NP -> VP -> Cl ;
  fun Decl : Cl -> S ;
  fun Quest : Cl -> S ;
}

incomplete concrete PredFunctor of Pred = open PredInterface in {
  lincat S = {s : Str} ;
  lincat Cl = {subj,verb,obj : Str} ;
  lincat NP = {s : Case => Str ; a : Agr} ;
  lincat VP = {verb : Agr => Str ; obj : Str} ;
  lincat TV = {s : Agr => Str} ;
  lin Compl tv np = {verb = tv.s ; obj = np.s ! objCase} ;
  lin Pred np vp = {subj = np.s ! subjCase ; verb = vp.verb ! np.a ; obj = vp.obj} ;
  lin Decl cl = {s = decl cl.subj cl.verb cl.obj} ;
  lin Quest cl = {s = quest cl.subj cl.verb cl.obj} ;
}

interface PredInterface = {
  oper Case, Agr : PType ;
  oper subjCase, objCase : Case ;
  oper decl, quest : Str -> Str -> Str -> Str ;
}

instance PredInstanceFin of PredInterface = {
  oper Case = ... ; -- Nom | Acc | ...
  oper Agr = {n : Number ; p : Person} ;
  oper subjCase = Nom ; objCase = Acc ;
  oper decl s v o = s ++ v ++ o ;
  oper quest s v o = v ++ "&+ ko" ++ s ++ o ;
}

concrete PredFin of Pred = PredFunctor with (PredInterface = PredInstanceFin) ;

instance PredInstanceChi of PredInterface = {
  oper Case, Agr = {} ;
  oper subjCase, objCase = <> ;
  oper decl s v o = s ++ v ++ o ;
  oper quest s v o = s ++ v ++ o ++ "ma" ;
}

concrete PredChi of Pred = PredFunctor with (PredInterface = PredInstanceChi) ;

Figure 1: Functorized grammar for transitive verb predication.
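Figure 1 can be instantiated for further languages along the same lines. As a sketch (our own hypothetical PredInstanceEng, not part of the paper's code; a faithful treatment of English questions needs do-support, which the crude quest below ignores):

```gf
instance PredInstanceEng of PredInterface = {
  oper Case = ... ;                        -- Nom | Acc, elided as in the figure
  oper Agr = {n : Number ; p : Person} ;
  oper subjCase = Nom ; objCase = Acc ;
  oper decl s v o = s ++ v ++ o ;
  oper quest s v o = "does" ++ s ++ v ++ o ; -- crude: ignores do-support details
}

concrete PredEng of Pred = PredFunctor with (PredInterface = PredInstanceEng) ;
```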

where A* is the linearization type of A. Thus linearization is a homomorphism. It is actually an instance of denotational semantics, where the lincats are the domains of possible denotations.

Much of the strength of GF comes from using different linearization types for different languages. Thus English needs case and agreement, Finnish needs many more cases (in the full grammar), Chinese needs mostly only strings, and so on. However, it is both useful and illuminating to unify the types. The way to do this is by the use of functors, also known as parametrized modules.

PredFunctor in Figure 1 is an example; functors are marked with the keyword incomplete. A functor depends on an interface, which declares a set of parameters (PredInterface in Figure 1). A concrete module is produced by giving an instance to the interface (PredInstanceFin and PredInstanceChi).

The rules in PredFunctor in Figure 1 are designed to work for both languages, by varying the definitions of the constants in PredInterface. And more languages can be added to use it. Consider for example the definition of NP. The experience from the RGL shows that, if a language has case and agreement, its NPs inflect for case and have inherent agreement. The limiting case of Chinese can be treated by using the unit type {} (i.e. the record type with no fields) for both features. This would not be so elegant for Chinese alone, but makes sense in the code sharing context.

Discontinuity now appears as another useful generalization. With the lincat definition in PredFunctor, we can share the Compl rule in all of the languages discussed so far. In clauses (Cl), we continue on similar lines: we keep the subject, the verb, and the object on separate fields. Notice that verb in Cl is a plain string, since the value of Agr gets fixed when the subject is added.

The final sentence word order is created as the last step, when converting Cl into S. As Cl is discontinuous, it can be linearized in different orders. In Figure 1, this is used in Finnish for generating the SVO order in declaratives and VSO in questions (with an intervening question particle ko glued to the verb). It also supports the other word

orders of Finnish (Karttunen and Kay, 1985).

By using an abstract syntax in combination with unordered records, parameters, and functors for the concrete syntax, we follow a kind of a “principles and parameters” approach to language variation (Chomsky, 1981). The actual parameter set for the whole RGL is of course larger than the one shown here.

Mathematically, it is possible to treat all differences in concrete syntax by parameters, simply by declaring a new parameter for every lincat and lin rule! But this is both vacuous as a theory and an unnecessary detour in practice. It is more illuminating to keep the functor simple and the set of parameters small. If the functor does not work for a new language, it usually makes more sense to override it than to grow the parameter list, and GF provides a mechanism for this. Opposite to “principles and parameters”, this is “a model in which language-particular rules take over the work of parameter settings” (Newmeyer, 2004). A combination of the two models enables language comparison by measuring the amount of overrides.

5 The full predication system

So far we have only dealt with one kind of verbs, TV. But we need more: intransitive, ditransitive, sentence-complement, etc. The general verb category is a dependent type, which varies over argument type lists:

cat V (x : Args)

The list x : Args corresponds to the subcat feature in GPSG and HPSG. Verb phrases and clauses have the same dependencies. Syntactically, a phrase depending on x : Args has “holes” for every argument in the list x. Semantically, it is a function over the denotations of its arguments (see Section 5.3 below).

5.1 The code

Figure 2 shows the essentials of the resulting grammar, and we will now explain this code. The full code is available at the GF web site.

1. Argument lists and dependent categories. The argument of a verb can be an adjectival phrase (AP, become old), a clause (Cl, say that we go), a common noun (CN, become a president), a noun phrase (NP, love her), a question (QCl, wonder who goes), or a verb phrase (VP, want to go). The definition allows an arbitrary list of arguments. For example, NP+QCl is used in verbs such as ask (someone whether something).

What about PP (prepositional phrase) complements? The best approach in a multilingual setting is to treat them as NP complements with designated cases. Thus in Figure 2.5, the linearization type of VP has fields of type ComplCase. This covers cases and prepositions, often in combination. For instance, the German verb lieben (“love”) takes a plain accusative argument, folgen (“follow”) a plain dative, and warten (“wait”) the preposition auf with the accusative. From the abstract syntax point of view, all of them are NP-complement verbs. Cases and prepositions, and thereby transitivity, are defined in concrete syntax.

The category Cl, clause, is the discontinuous structure of sentences before word order is determined. Its instance Cl (c np O) corresponds to the slash categories S/NP and S/PP in GPSG. Similarly, VP (c np O) corresponds to VP/NP and VP/PP, Adv (c np O) to Adv/NP (prepositions), and so on.

2. Initial formation of verb phrases. A VP is formed from a V by fixing its tense and polarity. In the resulting VP, the verb depends only on the agreement features of the expected subject. The complement case comes from the verb’s lexical entry, but the other fields, such as the objects, are left empty. This makes the VP usable in both complementation and slash operations (where the subject is added before some complement).

VPs can also be formed from adverbials, adjectival phrases, and common nouns, by adding a copula. Thus was in results from applying UseAdv to the preposition (i.e. Adv/NP) in, and expands to a VP with ComplNP (was in France) and to a slash clause with PredVP (she was in).

3. Complementation, VP slash formation, reflexivization. The Compl functions in Figure 2.3 provide each verb phrase with its “first” complement. The Slash functions provide the “last” complement, leaving a “gap” in the middle. For instance, SlashCl provides the slash clause used in the question whom did you tell that we sleep. The Refl rules fill argument places with reflexive pronouns.

4. NP-VP predication, slash termination, and adverbial modification. PredVP is the basic NP-VP predication rule. With x = c np O, it becomes the rule that combines NP with VP/NP to form S/NP. SlashTerm is the GPSG “slash termination” rule.
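To see how these functions compose, here is a worked tree (our own example, assuming lexical entries she, it, him : NP and show : V (c np (c np O)), and some tense and polarity values tt : Temp, pos : Pol):

```gf
-- "she shows it to him", built inside-out
UseV (c np (c np O)) tt pos show                  -- : VP (c np (c np O))  show
SlashNP O (UseV (c np (c np O)) tt pos show) him  -- : VP (c np O)         show (X) to him
ComplNP O (SlashNP O (UseV (c np (c np O)) tt pos show) him) it
                                                  -- : VP O                show it to him
PredVP O she (ComplNP O (SlashNP O (UseV (c np (c np O)) tt pos show) him) it)
                                                  -- : Cl O                she shows it to him
```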

1. Argument lists and some dependent categories

cat Arg ; Args                     -- arguments and argument lists
fun ap, cl, cn, np, qcl, vp : Arg  -- AP, Cl, CN, NP, QCl, VP argument
fun O : Args                       -- no arguments
fun c : Arg -> Args -> Args        -- one more argument
cat V (x : Args)                   -- verb in the lexicon
cat VP (x : Args)                  -- verb phrase
cat Cl (x : Args)                  -- clause
cat AP (x : Args)                  -- adjectival phrase
cat CN (x : Args)                  -- common noun phrase
cat Adv (x : Args)                 -- adverbial phrase

2. Initial formation of verb phrases

fun UseV : (x : Args) -> Temp -> Pol -> V x -> VP x     -- loved (X)
fun UseAP : (x : Args) -> Temp -> Pol -> AP x -> VP x   -- was married to (X)
fun UseCN : (x : Args) -> Temp -> Pol -> CN x -> VP x   -- was a son of (X)
fun UseAdv : (x : Args) -> Temp -> Pol -> Adv x -> VP x -- was in (X)

3. Complementation, VP slash formation, reflexivization

fun ComplNP : (x : Args) -> VP (c np x) -> NP -> VP x                 -- love her
fun ComplCl : (x : Args) -> VP (c cl x) -> Cl x -> VP x               -- say that we go
fun SlashNP : (x : Args) -> VP (c np (c np x)) -> NP -> VP (c np x)   -- show (X) to him
fun SlashCl : (x : Args) -> VP (c np (c cl x)) -> Cl x -> VP (c np x) -- tell (X) that..
fun ReflVP : (x : Args) -> VP (c np x) -> VP x                        -- love herself
fun ReflVP2 : (x : Args) -> VP (c np (c np x)) -> VP (c np x)         -- show (X) to herself

4. NP-VP predication, slash termination, and adverbial modification

fun PredVP : (x : Args) -> NP -> VP x -> Cl x           -- she loves (X)
fun SlashTerm : (x : Args) -> Cl (c np x) -> NP -> Cl x -- she loves + X

5. The functorial linearization type of VP

lincat VP = {
  verb : Agr => Str * Str * Str ;      -- finite: would,have,gone
  inf : VVType => Str ;                -- infinitive: (not) (to) go
  imp : ImpType => Str ;               -- imperative: go
  c1 : ComplCase ;                     -- case of first complement
  c2 : ComplCase ;                     -- case of second complement
  vvtype : VVType ;                    -- type of VP complement
  adj : Agr => Str ;                   -- adjective complement
  obj1 : Agr => Str ;                  -- first complement
  obj2 : Agr => Str ;                  -- second complement
  objagr : {a : Agr ; objCtr : Bool} ; -- agreement used in object control
  adv1 : Str ;                         -- pre-verb adverb
  adv2 : Str ;                         -- post-verb adverb
  ext : Str ;                          -- extraposed element e.g. that-clause
}

6. Some functorial linearization rules

lin ComplNP x vp np = vp ** {obj1 = \\a => appComplCase vp.c1 np}
lin ComplCl x vp cl = vp ** {ext = that_Compl ++ declSubordCl cl}
lin SlashNP2 x vp np = vp ** {obj2 = \\a => appComplCase vp.c2 np}
lin SlashCl x vp cl = vp ** {ext = that_Compl ++ declSubordCl cl}

7. Some interface parameters

oper Agr, ComplCase : PType                -- agreement, complement case
oper appComplCase : ComplCase -> NP -> Str -- apply complement case to NP
oper declSubordCl : Cl -> Str              -- subordinate question word order

Figure 2: Dependent types, records, and parameters for predication.

…nation” rule.

5. The functorial linearization type of VP. This record type contains the string-valued fields that can appear in different orders, as well as the inherent features that are needed when complements are added. The corresponding record for Cl has similar fields with constant strings, plus a subject field.

6. Some functorial linearization rules. The verb-phrase expanding rules typically work with record updates, where the old VP is left unchanged except for a few fields that get new values. GF uses the symbol ** for record updates. Notice that ComplCl and SlashCl have exactly the same linearization rules; the difference comes from the argument list x in the abstract syntax.

7. Some interface parameters. The code in Figure 2.5 and 2.6 is shared by different languages, but it depends on an interface that declares parameters, some of which are shown here.

5.2 More constructions

Extraction. The formation of questions and relatives is straightforward. Sentential (yes/no) questions, formed by QuestCl in Figure 3.1, don't in many languages need any changes in the clause, but just a different ordering in final linearization. Wh questions typically put one interrogative (IP) in the focus, which may be in the beginning of the sentence even though the corresponding argument place in declaratives is later. The focus field in QCl is used for this purpose. It carries a Boolean feature saying whether the field is occupied. If its value is True, the next IP is put into the "normal" argument place, as in who loves whom.

Coordination. The VP conjunction rules in Figure 3.2 take care of both intransitive VPs (she walks and runs) and of verb phrases with arguments (she loves and hates us). Similarly, Cl conjunction covers both complete sentences and slash clauses (she loves and we hate him). Some VP coordination instances may be ungrammatical, in particular with inverted word orders. Thus she is tired and wants to sleep works as a declarative, but the question is not so good: ?is she tired and wants to sleep. Preventing this would need much more complex rules. Since the goal of our grammar is not to define grammaticality (as in formal language theory), but to analyse and translate existing texts, we opted for a simple system in this case (but did not need to do so elsewhere).

5.3 Semantics

The abstract syntax has straightforward denotational semantics: each type in the Args list of a category adds an argument to the type of denotations. For instance, the basic VP type is Ent -> Prop, and the type for an arbitrary subcategory VP x is

  (x : Args) -> Den x (Ent -> Prop)

where Den is a type family defined recursively over Args:

  Den : Args -> Type -> Type
  Den O t = t
  Den (c np xs) t = Ent -> Den xs t
  Den (c cl xs) t = Prop -> Den xs t

and so on for all values of Arg. The second argument t varies over the basic denotation types of VP, AP, Adv, and CN.

Montague-style semantics is readily available for all rules operating on these categories. As a logical framework, GF has the expressive power needed for defining semantics (Ranta, 2004). The types can moreover be extended to express selectional restrictions, where verb arguments are restricted to domains of individuals. Here is a type system that adds a domain argument to NP and VP:

  cat NP (d : Dom)
  cat VP (d : Dom)(x : Args)
  fun PredVP : (d : Dom) -> (x : Args) -> NP d -> VP d x -> Cl x

The predication rule checks that the NP and the VP have the same domain.

6 Evaluation

Coverage. The dependent type system for verbs, verb phrases, and clauses is a generalization of the old Resource Grammar Library (Ranta, 2009), which has a set of hard-wired verb subcategories and a handful of slash categories. While it covers "all usual cases", many logically possible ones are missing. Some such cases even appear in the Penn treebank (Marcus et al., 1993), requiring extra rules in the GF interpretation of the treebank (Angelov, 2011). An example is a function of type

  V (c np (c vp O)) -> VPC (c np O) -> VP (c np O)

which is used 12 times, for example in This is designed to get the wagons in a circle and defend the smoking franchise. It has been easy to write conversion rules showing that the old coverage is preserved. But it remains future work to see what new cases are covered by the increased generality.
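The recursion of the Den type family from Section 5.3 can be mirrored outside GF. The following Python sketch is ours, not part of the grammar: it encodes an Args list as a Python list of complement kinds and computes the denotation type of a verb subcategory as a string.

```python
def den(args, t):
    """Mirror of the GF type family Den : Args -> Type -> Type.
    `args` lists the complement kinds of a verb subcategory
    ('np' or 'cl'); `t` is the basic denotation type of the
    category, e.g. 'Ent -> Prop' for VP."""
    if not args:                 # Den O t = t
        return t
    head, rest = args[0], args[1:]
    if head == 'np':             # Den (c np xs) t = Ent -> Den xs t
        return 'Ent -> ' + den(rest, t)
    if head == 'cl':             # Den (c cl xs) t = Prop -> Den xs t
        return 'Prop -> ' + den(rest, t)
    raise ValueError('unknown argument kind: ' + head)

# A verb taking an NP and a clause complement:
print(den(['np', 'cl'], 'Ent -> Prop'))   # Ent -> Prop -> Ent -> Prop
```

Each 'np' in the argument list contributes an Ent argument and each 'cl' a Prop argument, exactly as in the recursive clauses above.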

1. Extraction.

  cat QCl (x : Args)   -- question clause
  cat IP               -- interrogative phrase

  fun QuestCl    : (x : Args) -> Cl x -> QCl x                -- does she love him
  fun QuestVP    : (x : Args) -> IP -> VP x -> QCl x          -- who loves him
  fun QuestSlash : (x : Args) -> IP -> QCl (c np x) -> QCl x  -- whom does she love

  lincat QCl = Cl ** {focus : {s : Str ; isOcc : Bool}}  -- focal IP, whether occupied

2. Coordination.

  cat VPC (x : Args)  -- VP conjunction
  cat ClC (x : Args)  -- Clause conjunction

  fun StartVPC : (x : Args) -> Conj -> VP x -> VP x -> VPC x  -- love or hate
  fun ContVPC  : (x : Args) -> VP x -> VPC x -> VPC x         -- admire, love or hate
  fun UseVPC   : (x : Args) -> VPC x -> VP x                  -- [use VPC as VP]
  fun StartClC : (x : Args) -> Conj -> Cl x -> Cl x -> ClC x  -- he sells and I buy
  fun ContClC  : (x : Args) -> Cl x -> ClC x -> ClC x         -- you steal, he sells and I buy
  fun UseClC   : (x : Args) -> ClC x -> Cl x                  -- [use ClC as Cl]

Figure 3: Extraction and coordination.

Multilinguality. How universal are the concrete syntax functor and interface? In the standard RGL, functorization has only been attempted for families of closely related languages, with Romance languages sharing 75% of syntax code and Scandinavian languages 85% (Ranta, 2009). The new predication grammar shares code across all languages. The figure to compare is the percentage of shared code (abstract syntax + functor + interface) of the total code written for a particular language (shared + language-specific). This percentage is 70 for Chinese, 64 for English, 61 for Finnish, and 76 for Swedish, when calculated as lines of code. The total amount of shared code is 760 lines. One example of overrides is negation and questions in English, which are complicated by the need of auxiliaries for some verbs (go) but not for others (be). This explains why Swedish shares more of the common code than English.

Performance. Dependent types are not integrated in current GF parsers, but checked by post-processing. This implies a loss of speed, because many trees are constructed just to be thrown away. But when we specialized dependent types and rules to nondependent instances needed by the lexicon (using them as metarules in the sense of GPSG), parsing became several times faster than with the old grammar. An analysis remains to be done, but one hypothesis is that the speed-up is due to fixing tense and polarity earlier than in the old RGL: when starting to build VPs, as opposed to when using clauses in full sentences. Dependent types made it easy to test this refactoring, since they reduced the number of rules that had to be written.

Robustness. Robustness in GF parsing is achieved by introducing metavariables ("question marks") when tree nodes cannot be constructed by the grammar (Angelov, 2011). The subtrees under a metavariable node are linearized separately, just like a sequence of chunks. In translation, this leads to a decrease in quality, because dependencies between chunks are not detected. The early application of tense and polarity is an improvement, as it makes verb chunks contain information that was previously detected only if the parser managed to build a whole sentence.

7 Conclusion

We have shown a GF grammar for predication allowing an unlimited variation of argument lists: an abstract syntax with a concise definition using dependent types, a concrete syntax using a functor and records, and a straightforward denotational semantics. The grammar has been tested with four languages and shown promising results in speed and robustness, also in large-scale processing. A more general conclusion is that dependent types, records, and functors are powerful tools both for computational grammar engineering and for the theoretical study of languages.

Acknowledgements. I am grateful to Krasimir Angelov and Robin Cooper for comments, and to the Swedish Research Council for support under grant nr. 2012-5746 (Reliable Multilingual Digital Communication).

References

K. Angelov and P. Ljunglöf. 2014. Fast statistical parsing with parallel multiple context-free grammars. In Proceedings of EACL-2014, Gothenburg.

K. Angelov. 2011. The Mechanics of the Grammatical Framework. Ph.D. thesis, Chalmers University of Technology.

Y. Bar-Hillel. 1953. A quasi-arithmetical notation for syntactic description. Language, 29:27–58.

N. Chomsky. 1981. Lectures on Government and Binding. Mouton de Gruyter.

Y. Coscoy, G. Kahn, and L. Thery. 1995. Extracting text from proofs. In M. Dezani-Ciancaglini and G. Plotkin, editors, Proc. Second Int. Conf. on Typed Lambda Calculi and Applications, volume 902 of LNCS, pages 109–123.

H. B. Curry. 1961. Some logical aspects of grammatical structure. In Roman Jakobson, editor, Structure of Language and its Mathematical Aspects: Proceedings of the Twelfth Symposium in Applied Mathematics, pages 56–68. American Mathematical Society.

Ph. de Groote. 2001. Towards Abstract Categorial Grammars. In Association for Computational Linguistics, 39th Annual Meeting and 10th Conference of the European Chapter, Toulouse, France, pages 148–155.

P. Diderichsen. 1962. Elementær dansk grammatik. Gyldendal, København.

G. Gazdar, E. Klein, G. Pullum, and I. Sag. 1985. Generalized Phrase Structure Grammar. Basil Blackwell, Oxford.

R. Harper, F. Honsell, and G. Plotkin. 1993. A Framework for Defining Logics. JACM, 40(1):143–184.

L. Karttunen and M. Kay. 1985. Parsing in a free word order language. In D. Dowty, L. Karttunen, and A. Zwicky, editors, Natural Language Parsing, Psychological, Computational, and Theoretical Perspectives, pages 279–306. Cambridge University Press.

J. Lambek. 1958. The mathematics of sentence structure. American Mathematical Monthly, 65:154–170.

J. Landsbergen. 1982. Machine translation based on logically isomorphic Montague grammars. In COLING-1982.

P. Ljunglöf. 2004. The Expressivity and Complexity of Grammatical Framework. Ph.D. thesis, Dept. of Computing Science, Chalmers University of Technology and Gothenburg University. http://www.cs.chalmers.se/~peb/pubs/p04-PhD-thesis.pdf.

L. Magnusson. 1994. The Implementation of ALF - a Proof Editor based on Martin-Löf's Monomorphic Type Theory with Explicit Substitution. Ph.D. thesis, Department of Computing Science, Chalmers University of Technology and University of Göteborg.

M. P. Marcus, B. Santorini, and M. A. Marcinkiewicz. 1993. Building a Large Annotated Corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313–330.

P. Martin-Löf. 1984. Intuitionistic Type Theory. Bibliopolis, Napoli.

R. Montague. 1974. Formal Philosophy. Yale University Press, New Haven. Collected papers edited by Richmond Thomason.

S. Müller. 2004. Continuous or Discontinuous Constituents? A Comparison between Syntactic Analyses for Constituent Order and Their Processing Systems. Research on Language and Computation, 2(2):209–257.

R. Muskens. 2001. Lambda Grammars and the Syntax-Semantics Interface. In R. van Rooy and M. Stokhof, editors, Proceedings of the Thirteenth Amsterdam Colloquium, pages 150–155, Amsterdam. http://let.uvt.nl/general/people/rmuskens/pubs/amscoll.pdf.

F. J. Newmeyer. 2004. Against a parameter-setting approach to language variation. Linguistic Variation Yearbook, 4:181–234.

C. Pollard and I. Sag. 1994. Head-Driven Phrase Structure Grammar. University of Chicago Press.

A. Ranta. 1994. Type Theoretical Grammar. Oxford University Press.

A. Ranta. 1997. Structures grammaticales dans le français mathématique. Mathématiques, informatique et Sciences Humaines, 138/139:5–56/5–36.

A. Ranta. 2004. Computational Semantics in Type Theory. Mathematics and Social Sciences, 165:31–57.

A. Ranta. 2009. The GF Resource Grammar Library. Linguistics in Language Technology, 2. http://elanguage.net/journals/index.php/lilt/article/viewFile/214/158.

A. Ranta. 2011. Grammatical Framework: Programming with Multilingual Grammars. CSLI Publications, Stanford.

H. Seki, T. Matsumura, M. Fujii, and T. Kasami. 1991. On multiple context-free grammars. Theoretical Computer Science, 88:191–229.

System with Generalized Quantifiers on Dependent Types for Anaphora

Justyna Grudzińska, Institute of Philosophy, University of Warsaw, Krakowskie Przedmieście 3, 00-927 Warszawa, [email protected]
Marek Zawadowski, Institute of Mathematics, University of Warsaw, Banacha 2, 02-097 Warszawa, [email protected]

Abstract

We propose a system for the interpretation of anaphoric relationships between unbound pronouns and quantifiers. The main technical contribution of our proposal consists in combining generalized quantifiers with dependent types. Empirically, our system allows a uniform treatment of all types of unbound anaphora, including the notoriously difficult cases such as quantificational subordination, cumulative and branching continuations, and donkey anaphora.

1 Introduction

The phenomenon of unbound anaphora refers to instances where anaphoric pronouns occur outside the syntactic scopes (i.e. the c-command domain) of their quantifier antecedents. The main kinds of unbound anaphora are regular anaphora to quantifiers, quantificational subordination, and donkey anaphora, as exemplified by (1) to (3) respectively:

(1) Most kids entered. They looked happy.
(2) Every man loves a woman. They kiss them.
(3) Every farmer who owns a donkey beats it.

Unbound anaphoric pronouns have been dealt with in two main semantic paradigms: dynamic semantic theories (Groenendijk and Stokhof, 1991); (Van den Berg, 1996); (Nouwen, 2003) and the E-type/D-type tradition (Evans, 1977); (Heim, 1990); (Elbourne, 2005). In the dynamic semantic theories pronouns are taken to be (syntactically free, but semantically bound) variables, and context serves as a medium supplying values for the variables. In the E-type/D-type tradition pronouns are treated as quantifiers. Our system combines aspects of both families of theories. As in the E-type/D-type tradition we treat unbound anaphoric pronouns as quantifiers; as in the systems of dynamic semantics context is used as a medium supplying (possibly dependent) types as their potential quantificational domains. Like Dekker's Predicate Logic with Anaphora and more recent multidimensional models (Dekker, 1994); (Dekker, 2008), our system lends itself to the compositional treatment of unbound anaphora, while keeping a classical, static notion of truth. The main novelty of our proposal consists in combining generalized quantifiers (Mostowski, 1957); (Lindström, 1966); (Barwise and Cooper, 1981) with dependent types (Martin-Löf, 1972); (Ranta, 1994).

The paper is organized as follows. In Section 2 we introduce informally the main features of our system. Section 3 sketches the process of English-to-formal language translation. Finally, sections 4 and 5 define the syntax and semantics of the system.

2 Elements of system

2.1 Context, types and dependent types

The variables of our system are always typed. We write x : X to denote that the variable x is of type X and refer to this as a type specification of the variable x. Types, in this paper, are interpreted as sets. We write the interpretation of the type X as ‖X‖.

Types can depend on variables of other types. Thus, if we already have a type specification x : X, then we can also have a type Y(x) depending on the variable x, and we can declare a variable y of type Y by stating y : Y(x). The fact that Y depends on X is modeled as a projection

  π : ‖Y‖ → ‖X‖.

So that if the variable x of type X is interpreted as an element a ∈ ‖X‖, Y(a) is interpreted as the fiber of π over a (the preimage of {a} under π)

  ‖Y(a)‖ = {b ∈ ‖Y‖ : π(b) = a}.
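The fiber construction can be computed literally as a preimage. This small sketch (the toy sets are ours, purely illustrative) models a dependent type by a set ‖Y‖ together with a projection onto ‖X‖:

```python
def fiber(Y, proj, a):
    """||Y(a)|| = { b in ||Y|| : proj(b) = a }: the preimage of {a}
    under the projection ||Y|| -> ||X||."""
    return {b for b in Y if proj(b) == a}

# Toy interpretation: ||X|| = {1, 2}; ||Y|| = pairs whose first
# component is the value of the projection.
Y = {(1, 'a'), (1, 'b'), (2, 'c')}

def proj(b):
    return b[0]

assert fiber(Y, proj, 1) == {(1, 'a'), (1, 'b')}
assert fiber(Y, proj, 2) == {(2, 'c')}
```

The fibers over distinct elements of ‖X‖ may differ in size, which is exactly what makes the type dependent.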

Proceedings of the EACL 2014 Workshop on Type Theory and Natural Language Semantics (TTNLS), pages 10–18, Gothenburg, Sweden, April 26-30 2014. © 2014 Association for Computational Linguistics

One standard natural language example of such a dependence of types is that if m is a variable of the type of months M, there is a type D(m) of the days of the month m. Such type dependencies can be nested, i.e., we can have a sequence of type specifications of the (individual) variables:

  x : X, y : Y(x), z : Z(x, y).

Context for us is a partially ordered sequence of type specifications of the (individual) variables and it is interpreted as a parameter set, i.e. as a set of compatible n-tuples of the elements of the sets corresponding to the types involved (compatible wrt all projections).

2.2 Quantifiers, chains of quantifiers

Our system defines quantifiers and predicates polymorphically. A generalized quantifier Q is an association to every set Z a subset of the power set of Z. If we have a predicate P defined in a context Γ, then for any interpretation of the context ‖Γ‖ it is interpreted as a subset of its parameter set. Quantifier phrases, e.g. every man or some woman, are interpreted as follows: ‖every_{m:man}‖ = {‖man‖} and ‖some_{w:woman}‖ = {X ⊆ ‖woman‖ : X ≠ ∅}.

The interpretation of quantifier phrases is further extended into the interpretation of chains of quantifiers. Consider an example in (2):

(2) Every man loves a woman. They kiss them.

Multi-quantifier sentences such as the first sentence in (2) are known to be ambiguous with different readings corresponding to how various quantifiers are semantically related in the sentence. To account for the readings available for such multi-quantifier sentences, we raise quantifier phrases to the front of a sentence to form (generalized) quantifier prefixes - chains of quantifiers. Chains of quantifiers are built from quantifier phrases using three chain-constructors: pack-formation rule (?, ..., ?), sequential composition ?|?, and parallel composition ?/?. The semantical operations that correspond to the chain-constructors (known as cumulation, iteration and branching) capture in a compositional manner cumulative, scope-dependent and branching readings, respectively.

The idea of chain-constructors and the corresponding semantical operations builds on Mostowski's notion of quantifier (Mostowski, 1957), further generalized by Lindström to a so-called polyadic quantifier (Lindström, 1966), see (Bellert and Zawadowski, 1989). To use a familiar example, a multi-quantifier prefix like ∀_{m:M}|∃_{w:W} is thought of as a single two-place quantifier obtained by an operation on the two single quantifiers, and it has as denotation:

  ‖∀_{m:M}|∃_{w:W}‖ = {R ⊆ ‖M‖ × ‖W‖ : {a ∈ ‖M‖ : {b ∈ ‖W‖ : ⟨a, b⟩ ∈ R} ∈ ‖∃_{w:W}‖} ∈ ‖∀_{m:M}‖}.

In this paper we generalize the three chain-constructors and the corresponding semantical operations to (pre-)chains defined on dependent types.

2.3 Dynamic extensions of contexts

In our system language expressions are all defined in context. Thus the first sentence in (2) (on the most natural interpretation where a woman depends on every man) translates (via the process described in Section 3) into a sentence with a chain of quantifiers in a context:

  Γ ⊢ ∀_{m:M}|∃_{w:W} Love(m, w),

and says that the set of pairs, a man and a woman he loves, has the following property: the set of those men that love some woman each is the set of all men. The way to understand the second sentence in (2) (i.e. the anaphoric continuation) is that every man kisses the women he loves rather than those loved by someone else. Thus the first sentence in (2) must deliver some internal relation between the types corresponding to the two quantifier phrases.

In our analysis, the first sentence in (2) extends the context Γ by adding new variable specifications on newly formed types for every quantifier phrase in the chain Ch = ∀_{m:M}|∃_{w:W} - for the purpose of the formation of such new types we introduce a new type constructor T. That is, the first sentence in (2) (denoted as ϕ) extends the context by adding:

  t_{ϕ,∀m} : T_{ϕ,∀m:M} ;  t_{ϕ,∃w} : T_{ϕ,∃w:W}(t_{ϕ,∀m})

The interpretations of types (that correspond to quantifier phrases in Ch) from the extended context Γ_ϕ are defined in a two-step procedure using inductive clauses through which we define ‖Ch‖ but in the reverse direction.
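The displayed denotation of ∀_{m:M}|∃_{w:W} can be checked mechanically on finite sets. In this sketch (toy model ours) membership of a relation R in the polyadic quantifier reduces to a single set comparison:

```python
def forall_exists(M, W, R):
    """Membership of a relation R (a subset of M x W) in the polyadic
    quantifier ||forall m:M | exists w:W||: collect the m whose fiber
    {w : (m, w) in R} is nonempty (i.e. lies in ||exists w:W||) and
    require that this set equals M (i.e. lies in ||forall m:M||)."""
    return {m for m in M if any((m, w) in R for w in W)} == M

men, women = {'a', 'b'}, {'x', 'y'}
assert forall_exists(men, women, {('a', 'x'), ('b', 'y')})  # every man loves someone
assert not forall_exists(men, women, {('a', 'x')})          # 'b' loves no one
```

The nesting of the two set comprehensions mirrors iteration: the inner fiber test is the ∃ condition, the outer equality the ∀ condition.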

Step 1. We define fibers of new types by inverse induction.

Basic step. For the whole chain Ch = ∀_{m:M}|∃_{w:W} we put:

  ‖T_{ϕ,∀m:M|∃w:W}‖ := ‖Love‖.

Inductive step.

  ‖T_{ϕ,∀m:M}‖ = {a ∈ ‖M‖ : {b ∈ ‖W‖ : ⟨a, b⟩ ∈ ‖Love‖} ∈ ‖∃_{w:W}‖}

and for a ∈ ‖M‖

  ‖T_{ϕ,∃w:W}(a)‖ = {b ∈ ‖W‖ : ⟨a, b⟩ ∈ ‖Love‖}

Step 2. We build dependent types from fibers.

  ‖T_{ϕ,∃w:W}‖ = ⋃_{a ∈ ‖T_{ϕ,∀m:M}‖} {a} × ‖T_{ϕ,∃w:W}(a)‖

Thus the first sentence in (2) extends the context Γ by adding the type T_{ϕ,∀m:M}, interpreted as ‖T_{ϕ,∀m:M}‖ (i.e. the set of men who love some women), and the dependent type T_{ϕ,∃w:W}(t_{ϕ,∀m}), interpreted for a ∈ ‖T_{ϕ,∀m:M}‖ as ‖T_{ϕ,∃w:W}(a)‖ (i.e. the set of women loved by the man a).

Unbound anaphoric pronouns are interpreted with reference to the context created by the foregoing text: they are treated as universal quantifiers, and newly formed (possibly dependent) types incrementally added to the context serve as their potential quantificational domains. That is, the unbound anaphoric pronouns they_m and them_w in the second sentence of (2) have the ability to pick up and quantify universally over the respective interpretations. The anaphoric continuation in (2) translates into:

  Γ_ϕ ⊢ ∀_{t_{ϕ,∀m}:T_{ϕ,∀m:M}} | ∀_{t_{ϕ,∃w}:T_{ϕ,∃w:W}(t_{ϕ,∀m})} Kiss(t_{ϕ,∀m}, t_{ϕ,∃w}),

where:

  ‖∀_{t_{ϕ,∀m}:T_{ϕ,∀m:M}} | ∀_{t_{ϕ,∃w}:T_{ϕ,∃w:W}(t_{ϕ,∀m})}‖ = {R ⊆ ‖T_{ϕ,∃w:W}‖ : {a ∈ ‖T_{ϕ,∀m:M}‖ : {b ∈ ‖T_{ϕ,∃w:W}(a)‖ : ⟨a, b⟩ ∈ R} ∈ ‖∀_{t_{ϕ,∃w}:T_{ϕ,∃w:W}(t_{ϕ,∀m})}(a)‖} ∈ ‖∀_{t_{ϕ,∀m}:T_{ϕ,∀m:M}}‖}

yielding the correct truth conditions: Every man kisses every woman he loves.

Our system also handles intra-sentential anaphora, as exemplified in (3):

(3) Every farmer who owns a donkey beats it.

To account for the dynamic contribution of modified common nouns (in this case common nouns modified by relative clauses) we include in our system *-sentences (i.e. sentences with dummy quantifier phrases). The modified common noun gets translated into a *-sentence (with a dummy-quantifier phrase f : F):

  Γ ⊢ f:F | ∃_{d:D} Own(f, d)

and we extend the context by dropping the specifications of variables: (f : F, d : D) and adding new variable specifications on newly formed types for every (dummy-) quantifier phrase in the chain Ch*:

  t_{ϕ,f} : T_{ϕ,f:F} ;  t_{ϕ,∃d} : T_{ϕ,∃d:D}(t_{ϕ,f}),

The interpretations of types (that correspond to the quantifier phrases in the Ch*) from the extended context Γ_ϕ are defined in our two-step procedure. Thus the *-sentence in (3) extends the context by adding the type T_{ϕ,f:F}, interpreted as ‖T_{ϕ,f:F}‖ (i.e. the set of farmers who own some donkeys), and the dependent type T_{ϕ,∃d:D}(t_{ϕ,f}), interpreted for a ∈ ‖T_{ϕ,f:F}‖ as ‖T_{ϕ,∃d:D}(a)‖ (i.e. the set of donkeys owned by the farmer a). The main clause translates into:

  Γ_ϕ ⊢ ∀_{t_{ϕ,f}:T_{ϕ,f:F}} | ∀_{t_{ϕ,∃d}:T_{ϕ,∃d:D}(t_{ϕ,f})} Beat(t_{ϕ,f}, t_{ϕ,∃d}),

yielding the correct truth conditions: Every farmer who owns a donkey beats every donkey he owns. Importantly, since we quantify over fibers (and not over ⟨farmer, donkey⟩ pairs), our solution does not run into the so-called 'proportion problem'.

Dynamic extensions of contexts and their interpretation are also defined for cumulative and branching continuations. Consider a cumulative example in (4):

(4) Last year three scientists wrote (a total of) five articles (between them). They presented them at major conferences.

Interpreted cumulatively, the first sentence in (4) translates into a sentence:

  Γ ⊢ (Three_{s:S}, Five_{a:A}) Write(s, a).

The anaphoric continuation in (4) can be interpreted in what Krifka calls a 'correspondence'
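The two-step procedure for (2) can be replayed on a toy model (the sets and relations are ours): Step 1 carves the fibers out of ‖Love‖, Step 2 glues them into a set of pairs, and the continuation checks universal quantification over both types.

```python
# Toy model (ours): the first sentence of (2) over finite sets.
M = {'a', 'b', 'c'}
W = {'x', 'y', 'z'}
love = {('a', 'x'), ('a', 'y'), ('b', 'z')}   # 'c' loves no one

# Step 1: fibers. T_all_m = men who love at least one woman;
# T_some_w(a) = the women loved by the man a.
T_all_m = {m for m in M if any((m, w) in love for w in W)}
def T_some_w(a):
    return {w for w in W if (a, w) in love}

# Step 2: glue the fibers into the dependent type, a set of pairs.
T_some = set()
for a in T_all_m:
    T_some |= {(a, w) for w in T_some_w(a)}

# 'They kiss them': every man in T_all_m kisses every woman in his
# own fiber -- not the women loved by someone else.
kiss = {('a', 'x'), ('a', 'y'), ('b', 'z')}
continuation_true = all((a, w) in kiss for a in T_all_m for w in T_some_w(a))
assert T_all_m == {'a', 'b'} and continuation_true
```

Because the continuation ranges over fibers rather than over all of M × W, a man with no loved women ('c' here) simply drops out of the domain.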

fashion (Krifka, 1996). For example, Dr. K wrote one article, co-authored two more with Dr. N, who co-authored two more with Dr. S, and the scientists that cooperated in writing one or more articles also cooperated in presenting these (and no other) articles at major conferences. In our system, the first sentence in (4) extends the context by adding the type corresponding to (Three_{s:S}, Five_{a:A}):

  t_{ϕ,(Three_s,Five_a)} : T_{ϕ,(Three_{s:S} ; Five_{a:A})},

interpreted as a set of tuples

  ‖T_{ϕ,(Three_{s:S},Five_{a:A})}‖ = {⟨c, d⟩ : c ∈ ‖S‖ & d ∈ ‖A‖ & c wrote d}

The anaphoric continuation then quantifies universally over this type (i.e. a set of pairs), yielding the desired truth-conditions: The respective scientists cooperated in presenting at major conferences the respective articles that they cooperated in writing.

3 English-to-formal language translation

We assume a two-step translation process.

Representation. The syntax of the representation language - for the English fragment considered in this paper - is as follows:

  S → Prd_n(QP_1, ..., QP_n);
  MCN → Prd_n(QP_1, ..., CN, ..., QP_n);
  MCN → CN;
  QP → Det MCN;
  Det → every, most, three, ...;
  CN → man, woman, ...;
  Prd_n → enter, love, ...;

Common nouns (CNs) are interpreted as types, and common nouns modified by relative clauses (MCNs) - as *-sentences determining some (possibly dependent) types.

Disambiguation. Sentences of English, contrary to sentences of our formal language, are often ambiguous. Hence one sentence representation can be associated with more than one sentence in our formal language. The next step thus involves disambiguation. We take quantifier phrases of a given representation, e.g.:

  P(Q_1 X_1, Q_2 X_2, Q_3 X_3)

and organize them into all possible chains of quantifiers in suitable contexts, with some restrictions imposed on particular quantifiers concerning the places in prefixes at which they can occur (a detailed elaboration of the disambiguation process is left for another place):

  (Q_1 x_1:X_1 | Q_2 x_2:X_2) / Q_3 x_3:X_3  P(x_1, x_2, x_3).

4 System - syntax

4.1 Alphabet

The alphabet consists of:

  type variables: X, Y, Z, ...;
  type constants: M, men, women, ...;
  type constructors: Σ, Π, T;
  individual variables: x, y, z, ...;
  predicates: P, P_0, P_1, ...;
  quantifier symbols: ∃, ∀, five, Q_1, Q_2, ...;
  three chain constructors: ?|?, ?/?, (?, ..., ?).

4.2 Context

A context is a list of type specifications of (individual) variables. If we have a context

  Γ = x_1 : X_1, ..., x_n : X_n(⟨x_i⟩_{i ∈ J_n})

then the judgement

  ⊢ Γ : cxt

expresses this fact. Having a context Γ as above, we can declare a type X_{n+1} in that context

  Γ ⊢ X_{n+1}(⟨x_i⟩_{i ∈ J_{n+1}}) : type

where J_{n+1} ⊆ {1, ..., n} such that if i ∈ J_{n+1}, then J_i ⊆ J_{n+1}, J_1 = ∅. The type X_{n+1} depends on the variables ⟨x_i⟩_{i ∈ J_{n+1}}. Now, we can declare a new variable of the type X_{n+1}(⟨x_i⟩_{i ∈ J_{n+1}}) in the context Γ

  Γ ⊢ x_{n+1} : X_{n+1}(⟨x_i⟩_{i ∈ J_{n+1}})

and extend the context Γ by adding this variable specification, i.e. we have

  ⊢ Γ, x_{n+1} : X_{n+1}(⟨x_i⟩_{i ∈ J_{n+1}}) : cxt

Γ′ is a subcontext of Γ if Γ′ is a context and a sublist of Γ. Let ∆ be a list of variable specifications from a context Γ, and ∆′ the least subcontext of Γ containing ∆. We say that ∆ is convex iff ∆′ − ∆ is again a context.

The variables the types depend on are always explicitly written down in specifications. We can think of a context as (a linearization of) a partially ordered set of declarations such that the declaration of a variable x (of type X) precedes the declaration of the variable y (of type Y) iff the type Y depends on the variable x.

The formation rules for both Σ- and Π-types are as usual.
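The dependency condition on J_{n+1} in Section 4.2 is easy to check mechanically. In this sketch (the dictionary encoding of a context is ours) J maps each index i to the set of indices of the variables the type X_i depends on:

```python
def well_formed(J):
    """Check the dependency condition of Section 4.2 on a context
    x1:X1, ..., xn:Xn, where J[i] is the set of indices of the
    variables the type Xi depends on: J[i] is a subset of {1,...,i-1};
    if j is in J[i] then J[j] is a subset of J[i] (dependencies are
    transitively closed); and J[1] is empty."""
    for i, Ji in J.items():
        if not all(j < i and J[j] <= Ji for j in Ji):
            return False
    return J.get(1, set()) == set()

# x : X, y : Y(x), z : Z(x, y) is a well-formed context:
assert well_formed({1: set(), 2: {1}, 3: {1, 2}})
# ...but z : Z(y) alone is not, since y's own dependency on x
# (index 1) is missing from z's dependency set:
assert not well_formed({1: set(), 2: {1}, 3: {2}})
```

The closure condition is what allows a context to be read as a partial order of declarations, as described at the end of Section 4.2.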

4.3 Language

Quantifier-free formulas. Here, we need only predicates applied to variables. So we write

  Γ ⊢ P(x_1, ..., x_n) : qf-f

to express that P is an n-ary predicate and the specifications of the variables x_1, ..., x_n form a subcontext of Γ.

Quantifier phrases. If we have a context Γ, y : Y(~x), ∆ and a quantifier symbol Q, then we can form a quantifier phrase Q_{y:Y(~x)} in that context. We write

  Γ, y : Y(~x), ∆ ⊢ Q_{y:Y(~x)} : QP

to express this fact. In a quantifier phrase Q_{y:Y(~x)}: the variable y is the binding variable and the variables ~x are indexing variables.

Packs of quantifiers. Quantifier phrases can be grouped together to form a pack of quantifiers. The pack of quantifiers formation rule is as follows:

  Γ ⊢ Qi_{yi:Yi(~xi)} : QP   for i = 1, ..., k
  --------------------------------------------
  Γ ⊢ (Q1_{y1:Y1(~x1)}, ..., Qk_{yk:Yk(~xk)}) : pack

where, with ~y = y_1, ..., y_k and ~x = ⋃_{i=1..k} ~x_i, we have that y_i ≠ y_j for i ≠ j and ~y ∩ ~x = ∅. In the so constructed pack: the binding variables are ~y and the indexing variables are ~x. We can denote such a pack Pc_{~y:~Y(~x)} to indicate the variables involved. A one-element pack will be denoted and treated as a quantifier phrase. This is why we denote such a pack as Q_{y:Y(~x)} rather than (Q_{y:Y(~x)}).

Pre-chains and chains of quantifiers. Chains and pre-chains of quantifiers have binding variables and indexing variables. By Ch_{~y:~Y(~x)} we denote a pre-chain with binding variables ~y and indexing variables ~x so that the type of the variable y_i is Y_i(~x_i) with ⋃_i ~x_i = ~x. Chains of quantifiers are pre-chains in which all indexing variables are bound. Pre-chains of quantifiers arrange quantifier phrases into N-free pre-orders, subject to some binding conditions. Mutually comparable QPs in a pre-chain sit in one pack. Thus the pre-chains are built from packs via the two chain-constructors of sequential composition ?|? and parallel composition ?/?. The chain formation rules are as follows.

1. Packs of quantifiers. Packs of quantifiers are pre-chains of quantifiers with the same binding variables and the same indexing variables, i.e.

  Γ ⊢ Pc_{~y:~Y(~x)} : pack
  -------------------------
  Γ ⊢ Pc_{~y:~Y(~x)} : p-ch

2. Sequential composition of pre-chains

  Γ ⊢ Ch1_{~y1:~Y1(~x1)} : p-ch,   Γ ⊢ Ch2_{~y2:~Y2(~x2)} : p-ch
  --------------------------------------------------------------
  Γ ⊢ Ch1_{~y1:~Y1(~x1)} | Ch2_{~y2:~Y2(~x2)} : p-ch

provided ~y2 ∩ (~y1 ∪ ~x1) = ∅; the specifications of the variables (~x1 ∪ ~x2) − (~y1 ∪ ~y2) form a context, a subcontext of Γ. In the so obtained pre-chain: the binding variables are ~y1 ∪ ~y2 and the indexing variables are ~x1 ∪ ~x2.

3. Parallel composition of pre-chains

  Γ ⊢ Ch1_{~y1:~Y1(~x1)} : p-ch,   Γ ⊢ Ch2_{~y2:~Y2(~x2)} : p-ch
  --------------------------------------------------------------
  Γ ⊢ Ch1_{~y1:~Y1(~x1)} / Ch2_{~y2:~Y2(~x2)} : p-ch

provided ~y2 ∩ (~y1 ∪ ~x1) = ∅ = ~y1 ∩ (~y2 ∪ ~x2). As above, in the so obtained pre-chain: the binding variables are ~y1 ∪ ~y2 and the indexing variables are ~x1 ∪ ~x2.

A pre-chain of quantifiers Ch_{~y:~Y(~x)} is a chain iff ~x ⊆ ~y. The following

  Γ ⊢ Ch_{~y:~Y(~x)} : chain

expresses the fact that Ch_{~y:~Y(~x)} is a chain of quantifiers in the context Γ.

Formulas, sentences and *-sentences. The formulas have binding variables, indexing variables and argument variables. We write ϕ_{~y:Y(~x)}(~z) for a formula with binding variables ~y, indexing variables ~x and argument variables ~z. We have the following formation rule for formulas

  Γ ⊢ A(~z) : qf-f,   Γ ⊢ Ch_{~y:~Y(~x)} : p-ch
  ---------------------------------------------
  Γ ⊢ Ch_{~y:~Y(~x)} A(~z) : formula

provided ~y is final in ~z, i.e. ~y ⊆ ~z and the variable specifications of ~z − ~y form a subcontext of Γ. In the so constructed formula: the binding variables are ~y, the indexing variables are ~x, and the argument variables are ~z.

A formula ϕ_{~y:Y(~x)}(~z) is a sentence iff ~z ⊆ ~y and ~x ⊆ ~y. So a sentence is a formula without free variables, neither individual nor indexing. The following

  Γ ⊢ ϕ_{~y:Y(~x)}(~z) : sentence

expresses the fact that ϕ_{~y:Y(~x)}(~z) is a sentence formed in the context Γ.

We shall also consider some special formulas that we call *-sentences. A formula ϕ_{~y:Y(~x)}(~z) is a *-sentence if ~x ⊆ ~y ∪ ~z but the set ~z − ~y is possibly

not empty, and moreover the type of each variable in ~z − ~y is constant, i.e., it does not depend on variables of other types. In such a case we consider the judgement

  Γ ⊢ ϕ_{~y:Y(~x)}(~z) : *-sentence

to express the fact that ϕ_{~y:Y(~x)}(~z) is a *-sentence formed in the context Γ.

Having formed a *-sentence ϕ we can form a new context Γ_ϕ, defined in the next section.

Notation. For semantics we need some notation for the variables in the *-sentence. Suppose we have a *-sentence

  Γ ⊢ Ch_{~y:Y(~x)} P(~z) : *-sentence

We define: (i) The environment of the pre-chain Ch: Env(Ch) = Env(Ch_{~y:~Y(~x)}) is the context defining the variables ~x − ~y; (ii) The binding variables of the pre-chain Ch: Bv(Ch) = Bv(Ch_{~y:~Y(~x)}) is the convex set of declarations in Γ of the binding variables in ~y; (iii) env(Ch) = env(Ch_{~y:~Y(~x)}) is the set of variables in the environment of Ch, i.e. ~x − ~y; (iv) bv(Ch) = bv(Ch_{~y:~Y(~x)}) is the set of binding variables ~y; (v) The environment of a pre-chain Ch′ in a *-sentence ϕ = Ch_{~y:Y(~x)} P(~z), denoted Env_ϕ(Ch′), is the set of binding variables in all the packs in Ch* that are <_ϕ-smaller than all packs in Ch′. Note Env(Ch′) ⊆ Env_ϕ(Ch′). If Ch′ = Ch1 | Ch2 is a sub-pre-chain of the chain Ch_{~y:Y(~x)}, then Env_ϕ(Ch2) = Env_ϕ(Ch1) ∪ Bv(Ch1) and Env_ϕ(Ch1) = Env_ϕ(Ch′).

4.4 Dynamic extensions

Suppose we have constructed a *-sentence in a context

  Γ ⊢ Ch_{~y:~Y(~x)} A(~z) : *-sentence

We write ϕ for Ch_{~y:~Y(~x)} A(~z).

We form a context Γ_ϕ by dropping the specifications of the variables ~z and adding one type and one variable specification for each pack in Packs_{Ch*}. Let Γ̌ denote the context Γ with the specifications of the variables ~z deleted. Suppose Φ ∈ Packs_{Ch*} and Γ′ is an extension of the context Γ̌ such that one variable specification t_{Φ′,ϕ} : T_{Φ′,ϕ} was already added for each pack Φ′ ∈ Packs_{Ch*} such that Φ′ <_{Ch*} Φ. Then we can extend Γ′ with a variable of the new type:

  ⊢ Γ′, t_{Φ,ϕ} : T_{Φ,ϕ}(⟨t_{Φ′,ϕ}⟩_{Φ′ ∈ Packs_{Ch*}, Φ′ <_{Ch*} Φ}) : cxt

The context obtained from Γ̌ by adding the new variables corresponding to all the packs in Packs_{Ch*} as above will be denoted by

  Γ_ϕ = Γ̌ ∪ T(Ch_{~y:~Y(~x)} A(~z)).

At the end we add another context formation rule

  Γ ⊢ Ch_{~y:Y(~x)} P(~z) : *-sentence
  ------------------------------------
  ⊢ Γ_ϕ : cxt

Then we can build another formula starting in the context Γ_ϕ. This process can be iterated. Thus in this system a sentence ϕ in a context Γ is constructed via a specifying sequence of formulas, with the last formula being the sentence ϕ. However, for lack of space we are going to describe here only one step of this process. That is, a sentence ϕ in a context Γ can be constructed via a specifying *-sentence ψ extending the context as follows

  Γ ⊢ ψ : *-sentence
  ------------------
  Γ_ψ ⊢ ϕ : sentence

For short, we can write

  Γ ⊢ Γ_ψ ⊢ ϕ : sentence

5 System - semantics

5.1 Interpretation of dependent types

The context Γ

  ⊢ x : X(...), ..., z : Z(..., x, y, ...) : cxt

gives rise to a dependence graph. A dependence graph DG_Γ = (T_Γ, E_Γ) for the context Γ has the types of Γ as vertices and an edge π_{Y,x} : Y → X for every variable specification x : X(...) in Γ and every type Y(..., x, ...) occurring in Γ that depends on x.
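Looking back at Section 4.4, the passage from Γ to Γ_ϕ can be sketched as a list operation. The flat encoding of contexts as (variable, dependencies) pairs and the t_phi naming scheme are ours, purely illustrative:

```python
def extend_context(ctx, z_vars, packs):
    """Form Gamma_phi from Gamma (Section 4.4, flat encoding ours):
    drop the specifications of the argument variables ~z, then append
    one fresh variable per pack, listed in the order <_Ch*, each
    depending on the variables added for the packs before it."""
    gamma_check = [(v, deps) for (v, deps) in ctx if v not in z_vars]
    added = []
    for pack in packs:
        t = 't_phi_' + pack            # hypothetical naming scheme
        gamma_check.append((t, list(added)))
        added.append(t)
    return gamma_check

# Example (2): drop m and w, add a type for each of the two packs.
ctx = [('m', []), ('w', ['m'])]
print(extend_context(ctx, {'m', 'w'}, ['all_m', 'some_w']))
# [('t_phi_all_m', []), ('t_phi_some_w', ['t_phi_all_m'])]
```

The output is again a well-ordered context: each new type depends only on variables declared before it, matching the condition Φ′ <_{Ch*} Φ.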

15 The dependence diagram for the context Γ is an Envϕ(Ch0) is a sub-context of Γ disjoint from the association : DG Set to every type X convex set Bv(Ch ) and Env (Ch ), Bv(Ch ) is k − k Γ → 0 ϕ 0 0 in T a set X and every edge π : Y X a sub-context of Γ. For ~a Env (Ch ) we de- Γ k k Y,x → ∈ k ϕ 0 k in E a function π : Y X , so that fine Bv(Ch ) (~a) to be the largest set such that Γ k Y,xk k k → k k k 0 k whenever we have a triangle of edges in EΓ, πY,x as before π : Z Y , π : Z X we have ~a Bv(Ch0) (~a) Envϕ(Ch0), Bv(Ch0) Z,y → Z,x → { } × k k ⊆ k k π = π π . k Z,xk k Y,xk ◦ k Z,yk Interpretation of quantifier phrases. If we have The interpretation of the context Γ, the param- a quantifier phrase eter space Γ , is the limit of the dependence dia- k k gram : DGΓ Set. More specifically, Γ Q : QP k − k → ` y:Y (~x) Γ = x : X(...), . . . , z : Z(. . . , x, y, . . .) = and ~a Env (Q ) , then it is interpreted k k k k ∈ k ϕ y:Y (~x) k as Q ( Y (~a)) ( Y (~a ~x)). ~a : dom(~a) = var(Γ), ~a(z) Z (~a env(Z)), k k k k ⊆ P k k d { ∈ k k d Interpretation of packs. If we have a pack of πZ,x (~a(z)) = ~a(x), for z : Z in Γ, x envZ quantifiers in the sentence ϕ k k ∈ } where var(Γ) denotes variables specified in Γ and P c = (Q1 ,...Qn ) env(Z) denotes indexing variables of the type Z. y1:Y1(~x1) yn:Yn(~xn) The interpretation of the Σ- and Π-types are as and ~a Env (P c) , then its interpretation with ∈ k ϕ k usual. the parameter ~a is

5.2 Interpretation of language P c (~a) = (Q ,...,Q ) (~a) = k k k 1y1:Y1(~x1) nyn:Yn(~xn) k Interpretation of predicates and quantifier sym- n bols. Both predicates and quantifiers are inter- A Yi (~a ~xi): πi(A) Qi ( Yi (~a ~xi), preted polymorphically. { ⊆ i=1 k k d ∈ k k k k d If we have a predicate P defined in a context Γ: Y for i = 1, . . . , n } x1 : X1, . . . , xn : Xn( xi i Jn]) P (~x): qf-f π i h i ∈ ` where i is the -th projection from the product. Interpretation of chain constructors. then, for any interpretation of the context Γ , it k k 1. Parallel composition. For a pre-chain of is interpreted as a subset of its parameter set, i.e. quantifiers in the sentence ϕ P Γ . k k ⊆ k k Quantifier symbol Q is interpreted as quantifier Ch 1~y1:Y~1(~x1) Q i.e. an association to every1 set Z a subset Ch0 = k k Ch2 ~ Q (Z) (Z). ~y2:Y2(~x2) k k ⊆ P Interpretation of pre-chains and chains of quan- and ~a Envϕ(Ch0) we define tifiers. We interpret QP’s, packs, pre-chains, and ∈ k k chains in the environment of a sentence Envϕ. Ch1 ~ ~y1:Y1(~x1) (~a) = A B : This is the only case that is needed. We could kCh2~y :Y~ (~x ) k { × interpret the aforementioned syntactic objects in 2 2 2 their natural environment Env (i.e. independently A Ch (~a ~x ) and 1~y1:Y~1(~x1) 1 of any given sentence) but it would unnecessarily ∈ k k d B Ch2 ~ (~a ~x2) complicate some definitions. Thus having a ( -) ~y2:Y2(~x2) ∗ ∈ k k d } sentence ϕ = Ch~y:Y (~x) P (~z) (defined in a con- 2. Sequential composition. For a pre-chain of text Γ) and a sub-pre-chain (QP, pack) Ch0, for quantifiers in the sentence ϕ ~a Env (Ch ) we define the meaning of ∈ k ϕ 0 k Ch0 = Ch Ch 1~y1:Y~1(~x1) 2~y2:Y~2(~x2) Ch0 (~a) | k k and ~a Envϕ(Ch0) we define Notation. Let ϕ = Ch P (~y) be a - ∈ k k ~y:Y~ ∗ sentence built in a context Γ, Ch0 a pre-chain used Ch1 ~ Ch2 ~ (~a) = in the construction of the ( )-chain Ch. Then k ~y1:Y1(~x1)| ~y2:Y2(~x2)k ∗ 1 This association can be partial. 
R Bv(Ch0) (~a): ~b Bv(Ch ) (~a): { ⊆ k k { ∈ k 1 k

16 ~c Bv(Ch ) (~a,~b): ~b,~c R then we define sets { ∈ k 2 k h i ∈ } ∈ ~ Ch2 ~ (~a, b) Ch1 ~ (~a) T (~a) Ch (~a) k ~y2:Y2(~x2)k } ∈ k ~y1:Y1(~x1)k } k ϕ,Chi k ∈ k ik Validity. A sentence for i = 1, 2 so that

~x : X~ Ch ~ P (~y) T (~a) = T (~a) T (~a) ` ~y:Y k ϕ,Ch0 k k ϕ,Ch1 k × k ϕ,Ch2 k is true under the above interpretation iff if such sets exist, and these sets ( Tϕ,Ch (~a)) are k i k undefined otherwise. P ( Y~ ) Ch Sequential decomposition. If we have k k k k ∈ k ~y:Y~ k

Ch0 = Ch1 ~ Ch2 ~ 5.3 Interpretation of dynamic extensions ~y1:Y1(~x1)| ~y2:Y2(~x2) Suppose we obtain a context Γϕ from Γ by the fol- then we put lowing rule Tϕ,Ch (~a) = ~b Bv(Ch1) (~a): Γ Ch A(~z): -sentence, k 1 k { ∈ k k ` ~y:Y~ (~x) ∗ ~ ~ T Γ : cxt ~c Bv(Ch2) (~a, b): b,~c ϕ,Ch (~a) ϕ { ∈ k k h i ∈ k 0 k } Ch2 (~a,~b) where ϕ is Ch~y:Y~ (~x) A(~z). Then ∈ k k } For ~b Bv(Ch1) we put ˇ T ∈ k k Γϕ = Γ (Ch~y:Y~ (~x) A(~z)). ∪ T (~a,~b) = ~c Bv(Ch ) (~a,~b): k ϕ,Ch2 k { ∈ k 2 k Γ From dependence diagram : DGΓ Set k − k → ~b,~c T (~a) we shall define another dependence diagram h i ∈ k ϕ,Ch0 k } Step 2. (Building dependent types from fibers.) Γϕ = : DGΓϕ Set If Φ is a pack in Ch , ~a Env (Φ) then we k − k k − k → ∗ ∈ k ϕ k put Thus, for Φ P ackCh∗ we need to define Γ ∈ T ϕ and for Φ < Φ we need to define Tϕ,Φ = ~a Tφ,Φ (~a): ~a Envϕ(Φ) , k Φ,ϕk 0 Ch∗ k k {{ }×k k ∈ k k T T [ T πTΦ,ϕ,t : Φ,ϕ Φ ,ϕ Φ0

T (~a) Bv(Ch0) (~a) 6 Conclusion k ϕ,Ch0 k ⊆ k k It was our intention in this paper to show that This is done using the inductive clauses through adopting a typed approach to generalized quan- which we have defined Ch∗ but in the reverse di- tification allows a uniform treatment of a wide ar- rection. ray of anaphoric data involving natural language The basic case is when Ch0 = Ch∗. We put quantification. T ( ) = P k ϕ,Chk ∅ k k Acknowledgments The inductive step. Now assume that the set The work of Justyna Grudzinska´ was funded by T (~a) is defined for ~a Env (Ch ) . the National Science Center on the basis of de- k ϕ,Ch0 k ∈ k ϕ 0 k Parallel decomposition. If we have cision DEC-2012/07/B/HS1/00301. The authors would like to thank the anonymous reviewers for Ch 1~y1:Y~1(~x1) valuable comments on an earlier version of this pa- Ch0 = Ch per. 2~y2:Y~2(~x2)


Monads as a Solution for Generalized Opacity

Gianluca Giorgolo (University of Oxford)
Ash Asudeh (University of Oxford and Carleton University)

{gianluca.giorgolo,ash.asudeh}@ling-phil.ox.ac.uk

Proceedings of the EACL 2014 Workshop on Type Theory and Natural Language Semantics (TTNLS), pages 19-27, Gothenburg, Sweden, April 26-30 2014. © 2014 Association for Computational Linguistics

Abstract

In this paper we discuss a conservative extension of the simply-typed lambda calculus in order to model a class of expressions that generalize the notion of opaque contexts. Our extension is based on previous work in the semantics of programming languages aimed at providing a mathematical characterization of computations that produce some kind of side effect (Moggi, 1989), and is based on the notion of monads, a construction in category theory that, intuitively, maps a collection of "simple" values and "simple" functions into a more complex value space, in a canonical way. The main advantages of our approach with respect to traditional analyses of opacity are the fact that we are able to explain in a uniform way a set of different but related phenomena, and that we do so in a principled way that has been shown to also explain other linguistic phenomena (Shan, 2001).

1 Introduction

Opaque contexts have been an active area of research in natural language semantics since Frege's original discussion of the puzzle (Frege, 1892). A sentence like (1) has a non-contradictory interpretation despite the fact that the two referring expressions Hesperus and Phosphorus refer to the same entity, the planet we know as Venus.

(1) Reza doesn't believe Hesperus is Phosphorus.

The fact that a sentence like (1) includes the modal believe has influenced much of the analyses proposed in the literature, and has linked the phenomenon with the notion of modality. In this paper we challenge this view and try to position data like (1) inside a larger framework that also includes other types of expressions.

We decompose examples like (1) along two dimensions: the presence or absence of a modal expression, and the way in which we multiply refer to the same individual. In the case of (1), we have a modal and we use two different co-referring expressions. Examples (2)-(4) complete the landscape of possible combinations:

(2) Dr. Octopus punched Spider-Man but he didn't punch Spider-Man.

(3) Mary Jane loves Peter Parker but she doesn't love Spider-Man.

(4) Reza doesn't believe Jesus is Jesus.

(2) is clearly a contradictory statement, as we predicate of Dr. Octopus that he has the property of having punched Spider-Man and its negation. Notice that in this example there is no modal and the exact same expression is used twice to refer to the object individual. In the case of (3) we still have no modal and we use two different but co-referring expressions to refer to the same individual. However in this case the statement has a non-contradictory reading. Similarly (4) has a non-contradictory reading, which states that, according to the speaker, Reza doesn't believe that the entity he (Reza) calls Jesus is the entity that the speaker calls Jesus (e.g., is not the same individual or does not have the same properties). This case is symmetrical to (3), as here we have a modal expression but the same expression is used twice to refer to the same individual.

If the relevant reading of (4) is difficult to get, consider an alternative kind of scenario, as in (5), in which the subject of the sentence, Kim, suffers from Capgras Syndrome¹ and thinks that Sandy is an impostor. The speaker says:

(5) Kim doesn't believe Sandy is Sandy.

Given the definition of Capgras Syndrome (in fn. 1), there is a clear non-contradictory reading available here, in which the speaker is stating that Kim does not believe that the entity in question, the one that the speaker (and other non-Capgras-sufferers) would call Sandy, is the entity that she associates with the name Sandy.

We propose an analysis of the non-contradictory cases based on the intuition that the apparently co-referential expressions are in fact interpreted using different interpretation functions, which correspond to different perspectives that are pitted against each other in the sentences. Furthermore, we propose that modal expressions are not the only ones capable of introducing a shift of perspective, but also that verbs that involve some kind of mental attitude of the subject towards the object have the same effect.

Notice that this last observation distinguishes our approach from one where a sentence like (3) is interpreted as simply saying that Mary Jane loves only one guise of the entity that corresponds to Peter Parker but not another one. The problem with this interpretation is that if it is indeed the case that different co-referring expressions simply pick out different guises of the same individual, then a sentence like (6) should have a non-contradictory reading, while this seems not to be the case.

(6) Dr. Octopus killed Spider-Man but he didn't kill Peter Parker.

While a guise-based interpretation is compatible with our analysis², it is also necessary to correctly model the different behaviour of verbs like love with respect to others like punch or kill. In fact, we need to model the difference between, for example, kill and murder, since murder does involve a mental attitude of intention and the corresponding sentence to (6) is not necessarily contradictory:

(7) Dr. Octopus murdered Spider-Man but he didn't murder Peter Parker.

The implementation of our analysis is based on monads. Monads are a construction in category theory that defines a canonical way to map a set of objects and functions that we may consider as simple into a more complex object and function space. They have been successfully used in the semantics of programming languages to characterise computations that are not "pure". By pure we mean code objects that are totally referentially transparent (i.e. that do not depend on external factors and return the same results given the same input independently of their execution context), and also that do not have effects on the "real world". In contrast, monads are used to model computations that for example read from or write to a file, that depend on some random process, or whose return value is non-deterministic.

In our case we will use the monad that describes values that are made dependent on some external factor, commonly known in the functional programming literature as the Reader monad.³ We will represent linguistic expressions that can be assigned potentially different interpretations as functions from interpretation indices to values. Effectively we will construct a different type of lexicon that does not represent only the linguistic knowledge of a single speaker but also her (possibly partial) knowledge of the language of other speakers. So, for example, we will claim that (4) can be assigned a non-contradictory reading because the speaker's lexicon also includes the information regarding Reza's interpretation of the name Jesus, and therefore makes it possible for the speaker to use the same expression, in combination with a verb such as believe, to actually refer to two different entities. In one case we will argue that the name Jesus is interpreted using the speaker's interpretation, while in the other case it is Reza's interpretation that is used.

Notice that we can apply our analysis to any natural language expression that may have different interpretations. This means that, for example, we can extend our analysis, which is limited to referring expressions here for space reasons, to other cases, such as the standard examples involving ideally synonymous predicates like groundhog and woodchuck (see, e.g., Fox and Lappin (2005)).

The paper is organised as follows: in section 2 we discuss the technical details of our analysis; in section 3 we discuss our analyses of the motivating examples; section 4 compares our approach with a standard approach to opacity; we conclude in section 5.

¹ From Wikipedia: "[Capgras syndrome] is a disorder in which a person holds a delusion that a friend, spouse, parent, or other close family member has been replaced by an identical-looking impostor."
² Indeed, one way to understand guises is as different ways in which we interpret a referring term (Heim, 1998).
³ Shan (2001) was the first to sketch the idea of using the Reader monad to model intensional phenomena.
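The proposal just outlined, representing contentious expressions as functions from interpretation indices to values, can be previewed with a small executable sketch. The index and entity names below are our own illustrative choices, not notation from the paper:

```python
# Toy preview (our encoding): a contentious name denotes a function from
# an interpretation index to an entity.

def hesperus(index):
    # Under Reza's index the name picks out "the evening star";
    # under the speaker's default index it is simply Venus.
    return "evening-star" if index == "reza" else "venus"

def phosphorus(index):
    return "morning-star" if index == "reza" else "venus"

# The two names co-refer for the speaker but not for Reza, which is what
# makes a non-contradictory reading of (1) possible.
print(hesperus("speaker") == phosphorus("speaker"))  # True
print(hesperus("reza") == phosphorus("reza"))        # False
```

The rest of the paper supplies the machinery that decides, compositionally, at which index each occurrence of such a name is evaluated.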

2 Monads and interpretation functions

To avoid introducing the complexities of the categorical formalism, we introduce monads as they are usually encountered in the computer science literature. A monad is defined as a triple ⟨♦, η, ⋆⟩. ♦ is what we call a functor, in our case a mapping between types and functions. We call the component of ♦ that maps between types ♦₁, and the one that maps between functions ♦₂. In our case ♦₁ will map each type to a new type that corresponds to the original type with an added interpretation index parameter. Formally, if i is the type of interpretation indices, then ♦₁ maps any type τ to i → τ. In terms of functions, ♦₂ maps any function f : τ → δ to a function f′ : (i → τ) → i → δ. ♦₂ corresponds to function composition:

    ♦₂(f) = λg.λi.f(g(i))    (8)

In what follows the component ♦₂ will not be used, so we will use ♦ as an abbreviation for ♦₁. This means that we will write ♦τ for the type i → τ.

η (pronounced "unit") is a polymorphic function that maps inhabitants of a type τ to inhabitants of its image under ♦, formally η : ∀τ. τ → ♦τ. Using the computational metaphor, η should embed a value in a computation that returns that value without any side-effect. In our case η should simply add a vacuous parameter to the value:

    η(x) = λi.x    (9)

⋆ (pronounced "bind") is a polymorphic function of type ∀τ.∀δ. ♦τ → (τ → ♦δ) → ♦δ, and acts as a sort of enhanced functional application.⁴ Again using the computational metaphor, ⋆ takes care of combining the side effects of the argument and the function and returns the resulting computation. In the case of the monad we are interested in, ⋆ is defined as in (10).

    a ⋆ f = λi.f(a(i))(i)    (10)

Another fundamental property of ⋆ is that, by imposing an order of evaluation, it provides us with an additional scoping mechanism distinct from standard functional application. This will allow us to correctly capture the multiple readings associated with the expressions under consideration.

We thus add two operators, η and ⋆, to the simply-typed lambda calculus, and the reductions work as expected for (9) and (10). These reductions are implicit in our analyses in section 3.

2.1 Compositional logic

For composing the meanings of linguistic resources we use a logical calculus adapted for the linear case⁵ from the one introduced by Benton et al. (1998). The calculus is based on a language with two connectives corresponding to our type constructors: ⊸, a binary connective that corresponds to (linear) functional types, and ♦, a unary connective that represents monadic types.

The logical calculus is described by the proof rules in figure 1. The rules come annotated with lambda terms that characterise the Curry-Howard correspondence between proofs and meaning terms. Here we assume a Lexical Functional Grammar-like setup (Dalrymple, 2001), where a syntactic and functional grammar component feeds the semantic component with lexical resources already marked with respect to their predicate-argument relationships. Alternatively we could modify the calculus to a categorial setting, by introducing a structural connective and using directional versions of the implication connective together with purely structural rules to control the compositional process.

We can prove that the Cut rule is admissible, therefore the calculus becomes an effective (although inefficient) way of computing the meaning of a linguistic expression.

A key advantage we gain from the monadic approach is that we are not forced to generalize all lexical entries to the "worst case". With the logical setup we have just described we can in fact freely mix monadic and non-monadic resources. For example, in our logic we can combine a pure version of a binary function with arguments that are either pure or monadic, as the following are all

⁴ We use for ⋆ the argument order as it is normally used in functional programming. We could swap the arguments and make it look more like standard functional application. Also, we write ⋆ in infix notation.
⁵ Linearity is a property that has been argued for in the context of natural language semantics by various researchers (Moortgat (2011), Asudeh (2012), among others).
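Definitions (8)-(10) transcribe directly into executable code. The sketch below is our own rendering, with interpretation indices encoded as strings and with illustrative entity names that are not from the paper:

```python
# eta and star for the paper's Reader-style monad: a monadic value of
# type ♦τ is a function from an interpretation index i to a value of type τ.

def eta(x):
    # (9): wrap a value with a vacuous index parameter
    return lambda i: x

def star(a, f):
    # (10): evaluate the argument at index i, then the continuation at the same i
    return lambda i: f(a(i))(i)

# A contentious name as a monadic value (our example entities) ...
def hesperus(i):
    return "evening-star" if i == "reza" else "venus"

# ... combined with a pure one-place function via star/eta:
sentence = star(hesperus, lambda x: eta("saw " + x))

print(sentence("speaker"))  # saw venus
print(sentence("reza"))     # saw evening-star
```

Because `star` threads one and the same index through both the argument and the continuation, the order in which monadic resources are evaluated determines which index each expression sees; this is the scoping mechanism exploited in section 3.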

    ------------- id
    x : A ⊢ x : A

    Γ ⊢ B    B, ∆ ⊢ C
    ----------------- Cut
    Γ, ∆ ⊢ C

    Γ, x : A ⊢ t : B
    ----------------- ⊸R
    Γ ⊢ λx.t : A ⊸ B

    ∆ ⊢ t : A    Γ, x : B ⊢ u : C
    ------------------------------- ⊸L
    Γ, ∆, y : A ⊸ B ⊢ u[y(t)/x] : C

    Γ ⊢ x : A
    -------------- ♦R
    Γ ⊢ η(x) : ♦A

    Γ, x : A ⊢ t : ♦B
    ------------------------- ♦L
    Γ, y : ♦A ⊢ y ⋆ λx.t : ♦B

Figure 1: Sequent calculus for a fragment of multiplicative linear logic enriched with a monadic modality, together with a Curry-Howard correspondence between formulae and meaning terms.

provable theorems in this logic:

    A ⊸ B ⊸ C, A, B ⊢ ♦C      (11)
    A ⊸ B ⊸ C, ♦A, B ⊢ ♦C     (12)
    A ⊸ B ⊸ C, A, ♦B ⊢ ♦C     (13)
    A ⊸ B ⊸ C, ♦A, ♦B ⊢ ♦C    (14)

In contrast, the following is not a theorem in the logic:

    A ⊸ B ⊸ C, I ⊸ A, I ⊸ B ⊬ I ⊸ C    (15)

In general, then, if we were to simply lift the type of the lexical resources whose interpretation may be dependent on a specific point of view, we would be forced to lift all linguistic expressions that may combine with them, thus generalizing to the worst case after all.

The monadic machinery also achieves a higher level of compositionality. In principle we could directly encode our monad using the → type constructor, with linear implication, ⊸, on the logical side. However this alternative encoding wouldn't have the same deductive properties. Compare the pattern of inferences we have for the monadic type, in (11)-(14), with the equivalent one for the simple type:

    A ⊸ B ⊸ C, A, B ⊢ C                     (16)
    A ⊸ B ⊸ C, I ⊸ A, B ⊢ I ⊸ C            (17)
    A ⊸ B ⊸ C, A, I ⊸ B ⊢ I ⊸ C            (18)
    A ⊸ B ⊸ C, I ⊸ A, I ⊸ B ⊢ I ⊸ I ⊸ C   (19)

In the case of the simple type, the final formula we derive depends in some non-trivial way on the entire collection of resources on the left-hand side of the sequent. In contrast, in the case of the monadic type, the same type can be derived for all configurations. What is important is that we can predict the final formula without having to consider the entire set of resources available. This shows that the compositionality of our monadic approach cannot be equivalently recapitulated in a simple type theory.

3 Examples

The starting point for the analysis of examples (1)-(4) is the lexicon in table 1. The lexicon represents the linguistic knowledge of the speaker, and at the same time her knowledge about other agents' grammars.

Most lexical entries are standard, since we do not need to change the type and denotation of lexical items that are not involved in the phenomena under discussion. So, for instance, logical operators such as not and but are interpreted in the standard non-monadic way, as is a verb like punch or kill. Referring expressions that are possibly contentious, in the sense that they can be interpreted differently by the speaker and other agents, instead have the monadic type ♦e. This is reflected in their denotation by the fact that their value varies according to an interpretation index. We use a special index σ to refer to the speaker's own perspective, and assume that this is the default index used whenever no other index is specifically introduced. For example, in the case of the name Spider-Man, the speaker is aware of his secret identity and therefore interprets it as another name for the individual Peter Parker, while Mary Jane and Dr. Octopus consider Spider-Man a different entity from Peter Parker.

We assume that sentences are interpreted in a model in which all entities are mental entities, i.e. that there is no direct reference to entities in the world, but only to mental representations. Entities are therefore relativized with respect to the agent that mentally represents them, where non-contentious entities are always relativized according to the speaker. This allows us to represent the
    WORD          DENOTATION                                    TYPE
    Reza          r_σ                                           e
    Kim           k_σ                                           e
    Dr. Octopus   o_σ                                           e
    Mary Jane     mj_σ                                          e
    Peter Parker  pp_σ                                          e
    not           λp.¬p                                         t → t
    but           λp.λq.p ∧ q                                   t → t → t
    is            λx.λy.x = y                                   e → e → t
    punch         λo.λs.punch(s)(o)                             e → e → t
    believe       λc.λs.λi.B(s)(c(κ(s)))                        ♦t → e → ♦t
    love          λo.λs.λi.love(s)(o(κ(s)))                     ♦e → e → ♦t
    Hesperus      λi. es_r if i = r; v_σ if i = σ               ♦e
    Phosphorus    λi. ms_r if i = r; v_σ if i = σ               ♦e
    Spider-Man    λi. sm_i if i = o or i = mj; pp_σ if i = σ    ♦e
    Jesus         λi. j_r if i = r; j_σ if i = σ                ♦e
    Sandy         λi. imp_k if i = k; s_σ if i = σ              ♦e

Table 1: Speaker's lexicon

fact that different agents may have distinct equivalencies between entities. For example, Reza in our model does not equate the evening star and the morning star, but the speaker equates them with each other and with Venus. Therefore, the speaker's lexicon in table 1 represents the fact that the speaker's epistemic model includes what the speaker knows about other agents' models, e.g. that Reza has a distinct denotation (from the speaker) for Hesperus, that Mary Jane has a distinct representation for Spider-Man, that Kim has a distinct representation for Sandy, etc.

The other special lexical entries in our lexicon are those for verbs like believe and love. The two entries are similar in the sense that they both take an already monadic resource and actively supply a specific interpretation index that corresponds to the subject of the verbs. The function κ maps each entity to the corresponding interpretation index, i.e. κ : e → i. For example, in the lexical entries for believe and love, κ maps the subject to the interpretation index of the subject. Thus, the entry for believe uses the subject's point of view as the perspective used to evaluate its entire complement, while love changes the interpretation of its object relative to the perspective of its subject. However we will see that the interaction of these lexical entries and the evaluation order imposed by ⋆ will allow us to let the complement of a verb like believe and the object of a verb like love escape the specific effect of forcing the subject point of view, and instead we will be able to derive readings in which the arguments of the verb are interpreted using the speaker's point of view.

Figure 2 reports the four non-equivalent readings that we derive in our system for example (1), repeated here as (20).⁶

(20) Reza doesn't believe that Hesperus is Phosphorus.

Reading (21) assigns to both Hesperus and Phosphorus the subject interpretation and results, after contextualising the sentence by applying it to the standard σ interpretation index, in the truth conditions in (25), i.e. that Reza does not believe that the evening star is the morning star. This

⁶ The logic generates six different readings but the monad we are using here has a commutative behaviour, so four of these readings are pairwise equivalent.

⟦believe⟧(⟦Hesperus⟧ ⋆ λx.⟦Phosphorus⟧ ⋆ λy.η(⟦is⟧(x)(y)))(⟦Reza⟧) ⋆ λz.η(⟦not⟧(z))    (21)
⟦Hesperus⟧ ⋆ λx.⟦believe⟧(⟦Phosphorus⟧ ⋆ λy.η(⟦is⟧(x)(y)))(⟦Reza⟧) ⋆ λz.η(⟦not⟧(z))    (22)
⟦Phosphorus⟧ ⋆ λx.⟦believe⟧(⟦Hesperus⟧ ⋆ λy.η(⟦is⟧(y)(x)))(⟦Reza⟧) ⋆ λz.η(⟦not⟧(z))    (23)
⟦Hesperus⟧ ⋆ λx.⟦Phosphorus⟧ ⋆ λy.⟦believe⟧(η(⟦is⟧(x)(y)))(⟦Reza⟧) ⋆ λz.η(⟦not⟧(z))    (24)

Figure 2: Non-equivalent readings for Reza doesn't believe Hesperus is Phosphorus.

reading would not be contradictory in an epistemic model (such as Reza's model) where the evening star and the morning star are not the same entity.

    ¬B(r)(es_r = ms_r)    (25)

In the case of (22) and (23) we get a similar effect, although here we mix the epistemic models, and one of the referring expressions is interpreted under the speaker perspective while the other is again interpreted under Reza's perspective. For these two readings we obtain respectively the truth conditions in (26) and (27).

    ¬B(r)(v_σ = ms_r)    (26)
    ¬B(r)(v_σ = es_r)    (27)

Finally for (24) we get the contradictory reading that Reza does not believe that Venus is Venus, as both referring expressions are evaluated using the speaker's interpretation index.

    ¬B(r)(v_σ = v_σ)    (28)

The different contexts for the interpretation of referring expressions are completely determined by the order in which we evaluate monadic resources. This means that, just by looking at the linear order of the lambda binders, we can check whether a referring expression is evaluated inside the scope of a perspective-changing operator such as believe, or if it is interpreted using the standard interpretation.

If we consider a case like sentence (2), we ought to get only a contradictory reading, as the statement is intuitively contradictory. Our analysis produces a single reading that indeed corresponds to a contradictory interpretation:

    ⟦Spider-Man⟧ ⋆ λx.⟦Spider-Man⟧ ⋆ λy.η(⟦but⟧(⟦punch⟧(⟦Dr. Octopus⟧)(x))(⟦not⟧(⟦punch⟧(⟦Dr. Octopus⟧)(y))))    (29)

The verb punch is not a verb that can change the interpretation perspective and therefore the potentially controversial name Spider-Man is interpreted in both instances using the speaker's interpretation index. The result is unsatisfiable truth conditions, as expected:

    punch(o_σ)(pp_σ) ∧ ¬punch(o_σ)(pp_σ)    (30)

In contrast a verb like love is defined in our lexicon as possibly changing the interpretation perspective of its object to that of its subject. Therefore in the case of a sentence like (3), we expect one reading where the potentially contentious name Spider-Man is interpreted according to the subject of love, Mary Jane. This is in fact the result we obtain. Figure 3 reports the two readings that our framework generates for (3).

⟦love⟧(η(⟦Peter Parker⟧))(⟦Mary Jane⟧) ⋆ λp.⟦love⟧(⟦Spider-Man⟧)(⟦Mary Jane⟧) ⋆ λq.η(⟦but⟧(p)(⟦not⟧(q)))    (31)
⟦love⟧(η(⟦Peter Parker⟧))(⟦Mary Jane⟧) ⋆ λp.⟦Spider-Man⟧ ⋆ λx.⟦love⟧(η(x))(⟦Mary Jane⟧) ⋆ λq.η(⟦but⟧(p)(⟦not⟧(q)))    (32)

Figure 3: Non-equivalent readings for Mary Jane loves Peter Parker but she doesn't love Spider-Man.

Reading (31) corresponds to the non-contradictory interpretation of sentence (3), where Spider-Man is interpreted according to Mary Jane's perspective and therefore is assigned an entity different from Peter Parker:

    love(mj_σ)(pp_σ) ∧ ¬love(mj_σ)(sm_mj)    (33)

Reading (32) instead generates unsatisfiable truth conditions, as Spider-Man is identified with Peter Parker according to the speaker's interpretation:

    love(mj_σ)(pp_σ) ∧ ¬love(mj_σ)(pp_σ)    (34)

Our last example, (4), repeated here as (35), is particularly interesting as we are not aware of previous work that discusses this type of sentence. The non-contradictory reading that this sentence has seems to be connected specifically to two different interpretations of the same name, Jesus, both under the syntactic scope of the modal believe.

(35) Reza doesn't believe Jesus is Jesus.

⟦believe⟧(⟦Jesus⟧ ⋆ λx.⟦Jesus⟧ ⋆ λy.η(⟦is⟧(x)(y)))(⟦Reza⟧) ⋆ λz.η(⟦not⟧(z))    (36)
⟦Jesus⟧ ⋆ λx.⟦Jesus⟧ ⋆ λy.⟦believe⟧(η(⟦is⟧(x)(y)))(⟦Reza⟧) ⋆ λz.η(⟦not⟧(z))    (37)
⟦Jesus⟧ ⋆ λx.⟦believe⟧(⟦Jesus⟧ ⋆ λy.η(⟦is⟧(x)(y)))(⟦Reza⟧) ⋆ λz.η(⟦not⟧(z))    (38)

Figure 4: Non-equivalent readings for Reza doesn't believe Jesus is Jesus.
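The same machinery separates the readings of figure 4. In the following sketch (again our own simplification) the belief operator B is reduced to the bare truth of its complement, which is enough to see which occurrence of Jesus is evaluated at which index:

```python
# Simplified simulation of readings (36) and (38) of figure 4.

def eta(x):
    return lambda i: x

def star(a, f):
    return lambda i: f(a(i))(i)

SIGMA = "sigma"
kappa = lambda entity: entity

def jesus(i):
    return "jesus-reza" if i == "reza" else "jesus-speaker"

is_ = lambda x: lambda y: x == y

def believe(c, s):
    # the monadic complement is evaluated at the subject's index
    return lambda i: c(kappa(s))

# (36): both occurrences inside believe -> both at Reza's index -> a tautology
# (so the negated sentence is contradictory).
r36 = believe(star(jesus, lambda x: star(jesus, lambda y: eta(is_(x)(y)))), "reza")

# (38): one occurrence escapes believe -> mixed indices -> False
# (so the negated sentence is satisfiable).
r38 = star(jesus, lambda x: believe(star(jesus, lambda y: eta(is_(x)(y))), "reza"))

print(r36(SIGMA), r38(SIGMA))  # True False
```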

Our system generates three non-equivalent readings, reported here in figure 4.⁷

Readings (36) and (37) correspond to two contradictory readings of the sentence: in the first case both instances of the name Jesus are interpreted from the subject perspective and therefore attribute to Reza the non-belief in a tautology; similarly in the second case, even though in this case the two names are interpreted from the perspective of the speaker. In contrast the reading in (38) corresponds to the interpretation that assigns two different referents to the two instances of the name Jesus, producing the truth conditions in (39), which are satisfiable in a suitable model.

    ¬B(r)(j_σ = j_r)    (39)

The analysis of the Capgras example (5), repeated in (40), is equivalent; the non-contradictory reading is shown in (41).

(40) Kim doesn't believe Sandy is Sandy.

    ¬B(k)(s_σ = imp_k)    (41)

We use imp_k as the speaker's representation of the "impostor" that Kim thinks has taken the place of Sandy. More generally, there are again three non-equivalent readings, including the one above, which are just those in figure 4, with ⟦Jesus⟧ replaced by ⟦Sandy⟧ and ⟦Reza⟧ replaced by ⟦Kim⟧.

⁷ Again, there are six readings that correspond to different proofs, but given the commutative behaviour of the Reader monad, the fact that equality is commutative, and the fact that we have in this case two identical lexical items, only three of them are non-equivalent readings.

4 Comparison with traditional approaches

In this section we try to sketch how a traditional approach to opaque contexts, such as one based on a de dicto/de re ambiguity with respect to a modal operator, would fare in the analysis of (4), our most challenging example.

To try to explain the two readings in the context of a standard possible worlds semantics, we could take (4) to be ambiguous with respect to a de dicto/de re reading. In the case of the de dicto reading (which corresponds to the non-satisfiable reading) the two names are evaluated under the scope of the doxastic operator believe, i.e. they both refer to the same entity that is assigned to the name Jesus in each accessible world. Clearly this is always the case, and so (4) is not satisfiable. In the case of the de re reading, we assume that the two names are evaluated at different worlds that assign different referents to the two names. One of these two worlds will be the actual world and the other one of the accessible worlds. The reading is satisfiable if the doxastic modality links the actual world with one in which the name Jesus refers to a different entity. Notice that for this analysis to work we need to make two assumptions: 1. that names behave as quantifiers with the property of escaping modal contexts, 2. that names can be assigned different referents in different worlds, i.e. we have to abandon the standard notion that names are rigid designators (Kripke, 1972). In contrast, in our approach we do not need to abandon the idea of rigid designation for names (within each

agent's model).

However, such an approach would present a number of rather serious problems. The first is connected with the assumption that names are scopeless. This is a common hypothesis in natural language semantics, and indeed if we model names as generalized quantifiers they can be proven to be scopeless (Zimmermann, 1993). But this is problematic for our example. In fact we would predict that both instances of the name Jesus escape the scope of believe. The resulting reading would bind the quantified individual to the interpretation of Jesus in the actual world. In this way we only capture the non-satisfiable reading. To save the scopal approach we would need to assume that names in fact are sometimes interpreted in the scope of modal operators.

One way to do this would be to set up our semantic derivations so that they allow different scopal relations between quantifiers and other operators. The problem with this solution is that for sentences like (4) we generate twelve different derivations, some of which do not correspond to valid readings of the sentence.

Even assuming that we find a satisfactory solution for these problems, the scopal approach cannot really capture the intuitions behind opacity in all contexts. Consider again (4) and assume that there are two views about Jesus: Jesus as a divine being and Jesus as a human being. Assume that Jesus is a human being in the actual world and that Reza is an atheist; then the only possible reading is the non-satisfiable one, as the referent for Jesus will be the same in the actual world and all accessible Reza-belief-worlds. The problem is that the scopal approach assumes a single modal model, while in this case it seems that there are two doxastic models under discussion, Reza's model and the speaker's model. In contrast, in our approach the relevant part of Reza's model is embedded inside the speaker's model, and interpretation indices indicate which interpretation belongs to Reza and which to the speaker.

Finally, an account of modality in terms of scopal properties is necessarily limited to cases in which modal operators are present. While this may be a valid position in the case of typical intensional verbs like seek or want, it is not clear how we could extend this approach to cases like (3), as the verb love has no clear modal connotation. Thus, the scopal approach would not be sufficiently general.

5 Conclusion

We started by discussing a diverse collection of expressions that share the common property of showing nontrivial referential behaviours. We have proposed a common analysis of all these expressions in terms of a combination of different interpretation contexts. We have claimed that the switch to a different interpretation context is triggered by specific lexical items, such as modal verbs, but also verbs that express some kind of mental attitude of the subject of the verb towards its object. The context switch is not obligatory, as witnessed by the multiple readings that the sentences discussed seem to have. We implemented our analysis using monads. The main idea of our formal implementation is that referring expressions that have a potential dependency on an interpretation context can be implemented as functions from interpretation indices to fully interpreted values. Similarly, the linguistic triggers for context switch are implemented in the lexicon as functions that can modify the interpretation context of their arguments. Monads allow us to freely combine these "lifted" meanings with standard ones, avoiding in this way the need to generalize our lexicon to the worst case. We have also seen how more traditional approaches, while capable of dealing with some of the examples we discuss, are not capable of providing a generalised explanation of the observed phenomena.

Acknowledgements

The authors would like to thank our anonymous reviewers for their comments. This research is supported by a Marie Curie Intra-European Fellowship from the European Commission under contract number 327811 (Giorgolo) and an Early Researcher Award from the Ontario Ministry of Research and Innovation and NSERC Discovery Grant #371969 (Asudeh).
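The paper's central idea, that referring expressions are functions from interpretation indices to fully interpreted values, can be illustrated with a small toy model. This is our own sketch, not the authors' implementation: indices are plain dictionaries, and the referents "sandy" and "imp_k" are illustrative placeholders.

```python
# Toy sketch (ours, not the authors' code) of referring expressions as
# functions from interpretation indices to values, i.e. Reader-monad-style
# computations over indices.

# Two interpretation indices: the speaker's and Kim's (the believer's).
speaker = {"Sandy": "sandy"}
kim     = {"Sandy": "imp_k"}   # Kim's "impostor" representation of Sandy

def name(n):
    """A name denotes a function from an interpretation index to a referent."""
    return lambda index: index[n]

sandy = name("Sandy")

# "Kim doesn't believe Sandy is Sandy": the non-contradictory reading
# evaluates one occurrence at the speaker's index and the other at Kim's,
# so the identity under the belief can fail.
mixed_identity = sandy(speaker) == sandy(kim)   # False: the satisfiable reading
same_identity  = sandy(kim) == sandy(kim)       # True: the tautologous reading
print(mixed_identity, same_identity)            # prints "False True"
```

The context switch triggered by believe amounts to choosing which index a given occurrence is evaluated at; since both choices are available, the sentence has multiple readings.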

References

Ash Asudeh. 2012. The Logic of Pronominal Resumption. Oxford Studies in Theoretical Linguistics. Oxford University Press, New York.

Nick Benton, G. M. Bierman, and Valeria de Paiva. 1998. Computational types from a logical perspective. Journal of Functional Programming, 8(2):177–193.

Mary Dalrymple. 2001. Lexical Functional Grammar. Academic Press, San Diego, CA.

Chris Fox and Shalom Lappin. 2005. Foundations of Intensional Semantics. Blackwell, Oxford.

Gottlob Frege. 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik, 100:25–50.

Gottlob Frege. 1952. On sense and reference. In Peter T. Geach and Max Black, editors, Translations from the Philosophical Writings of Gottlob Frege, pages 56–78. Blackwell, Oxford. Translation of Frege (1892).

Irene Heim. 1998. Anaphora and semantic interpretation: A reinterpretation of Reinhart's approach. In Uli Sauerland and Orin Percus, editors, The Interpretive Tract, volume 25 of MIT Working Papers in Linguistics, pages 205–246. MITWPL, Cambridge, MA.

Saul Kripke. 1972. Naming and necessity. In Donald Davidson and Gilbert Harman, editors, Semantics of Natural Language, pages 253–355. Reidel.

Eugenio Moggi. 1989. Computational lambda-calculus and monads. In LICS, pages 14–23. IEEE Computer Society.

Michael Moortgat. 2011. Categorial type logics. In Johan van Benthem and Alice ter Meulen, editors, Handbook of Logic and Language, pages 95–179. Elsevier, second edition.

Chung-chieh Shan. 2001. Monads for natural language semantics. In Kristina Striegnitz, editor, Proceedings of the ESSLLI-2001 Student Session, pages 285–298. 13th European Summer School in Logic, Language and Information.

Thomas Ede Zimmermann. 1993. Scopeless quantifiers and operators. Journal of Philosophical Logic, 22(5):545–561, October.

The Phenogrammar of Coordination

Chris Worth
Department of Linguistics
The Ohio State University
[email protected]

Abstract

Linear Categorial Grammar (LinCG) is a sign-based, Curryesque, relational, logical categorial grammar (CG) whose central architecture is based on linear logic. Curryesque grammars separate the abstract combinatorics (tectogrammar) of linguistic expressions from their concrete, audible representations (phenogrammar). Most of these grammars encode linear order in string-based lambda terms, in which there is no obvious way to distinguish right from left. Without some notion of directionality, grammars are unable to differentiate, say, subject and object for purposes of building functorial coordinate structures. We introduce the notion of a phenominator as a way to encode the term structure of a functor separately from its "string support". This technology is then employed to analyze a range of coordination phenomena typically left unaddressed by linear logic-based Curryesque frameworks.

1 Overview

Flexibility to the notion of constituency in conjunction with introduction (and composition) rules has allowed categorial grammars to successfully address an entire host of coordination phenomena in a transparent and compositional manner. While "Curryesque" CGs as a rule do not suffer from some of the other difficulties that plague Lambek CGs, many are notably deficient in one area: coordination. Lest we throw the baby out with the bathwater, this is an issue that needs to be addressed. We take the following to be an exemplary subset of the relevant data, and adopt a fragment methodology to show how it may be analyzed.

(1) Tyrion and Joffrey drank.
(2) Joffrey whined and sniveled.
(3) Tyrion slapped and Tywin chastised Joffrey.

The first example is a straightforward instance of noun phrase coordination. The second and third are both instances of what has become known in the categorial grammar literature as "functor coordination", that is, the coordination of linguistic material that is in some way incomplete. The third is particularly noteworthy as being an example of a "right node raising" construction, whereby the right argument Joffrey serves as the object of both of the higher NP-verb complexes. We will show that all three examples can be given an uncomplicated account in the Curryesque framework of Linear Categorial Grammar (LinCG), and that (2) and (3) have more in common than not.

Section 1 provides an overview of the data and central issues surrounding an analysis of coordination in Curryesque grammars. Section 2 introduces the reader to the framework of LinCG, and presents the technical innovations at the heart of this paper. Section 3 gives lexical entries and derivations for the examples in section 1, and section 4 discusses our results and suggests some directions for research in the near future, with references following.

1.1 Curryesque grammars and Linear Categorial Grammar

We take as our starting point the Curryesque (after Curry (1961)) tradition of categorial grammars, making particular reference to those originating with Oehrle (1994) and continuing with the Abstract Categorial Grammar (ACG) of de Groote (2001), Muskens (2010)'s Lambda Grammar (λG), Kubota and Levine's Hybrid Type-Logical Categorial Grammar (Kubota and Levine, 2012) and, to a lesser extent, the Grammatical Framework of Ranta (2004), and others. These dialects

Proceedings of the EACL 2014 Workshop on Type Theory and Natural Language Semantics (TTNLS), pages 28–36, Gothenburg, Sweden, April 26-30 2014. © 2014 Association for Computational Linguistics

of categorial grammar make a distinction between Tectogrammar, or "abstract syntax", and Phenogrammar, or "concrete syntax". Tectogrammar is primarily concerned with the structural properties of grammar, among them co-occurrence, case, agreement, tense, and so forth. Phenogrammar is concerned with computing a pre-phonological representation of what will eventually be produced by the speaker, and encompasses word order, morphology, prosody, and the like.

Linear Categorial Grammar (LinCG) is a sign-based, Curryesque, relational, logical categorial grammar whose central architecture is based on linear logic. Abbreviatory overlap has been a regrettably persistent problem, and LinCG is the same in essence as the framework varyingly called Linear Grammar (LG) and Pheno-Tecto-Differentiated Categorial Grammar (PTDCG), developed in Smith (2010), Mihalicek (2012), Martin (2013), Pollard and Smith (2012), and Pollard (2013). In LinCG, the syntax-phonology and syntax-semantics interfaces amount to noting that the logics for the phenogrammar, the tectogrammar, and the semantics operate in parallel. This stands in contrast to 'syntactocentric' theories of grammar, where syntax is taken to be the fundamental domain within which expressions combine, and then phonology and semantics are 'read off' of the syntactic representation. LinCG is conceptually different in that it has relational, rather than functional, interfaces between the three components of the grammar. Since we do not interpret syntactic types into phenogrammatical or semantic types, this allows us a great deal of freedom within each logic, although in practice we maintain a fairly tight connection between all three components. Grammar rules take the form of derivational rules which generate triples called signs, and they bind together the three logics so that they operate concurrently. While the invocation of a grammar rule might simply be, say, point-wise application, the ramifications for the three systems can in principle be different; one can imagine expressions which exhibit type asymmetry in various ways.

By way of example, one might think of 'focus' as an operation which has reflexes in all three aspects of the grammar: it applies pitch accents to the target string(s) in the phenogrammar (the difference between accented and unaccented words being reflected in the phenotype), it creates 'lowering' operators in the tectogrammar (that is, expressions which scope within a continuation), and it 'focuses' a particular meaningful unit in the semantics. A focused expression might share its tectotype ((NP ⊸ S) ⊸ S) with, say, a quantified noun phrase, but the two could have different phenotypes, reflecting the accentuation or lack thereof by placing the resulting expression in the domain of prosodic boundary phenomena or not. Nevertheless, the system is constrained by the fact that the tectogrammar is based on linear logic, so if we take some care when writing grammar rules, we should still find resource sensitivity to be at the heart of the framework.

1.2 Why coordination is difficult for Curryesque grammars

Most Curryesque CGs encode linear order in lambda terms, and there is no obvious way to distinguish 'right' from 'left' by examining the types (be they linear or intuitionistic).[1] This is not a problem when we are coordinating strings directly, as de Groote and Maarek (2007) show, but an analysis of the more difficult case of functor coordination remains elusive.[2] Without some notion of directionality, grammars are unable to distinguish between, say, subject and object. This would seem to predict, for example, that λs. s · SLAPPED · JOFFREY and λs. TYRION · SLAPPED · s would have the same syntactic category (NP ⊸ S in the tectogrammar, and St → St in the phenogrammar), and would thus be compatible under coordination, but this is generally not the case. What we need is a way to examine the structure of a lambda term independently of the specific string constants that comprise it. To put it another way, in order to coordinate functors, we need to be able to distinguish between what Oehrle (1995) calls their string support, that is, the string constants which make up the body of a particular functional term, and the linearization structure such functors impose on their arguments.

[1] A noteworthy exception is Ranta's Grammatical Framework (GF), explored in, e.g., Ranta (2004) and Ranta (2009). GF also makes distinctions between tectogrammar and phenogrammar, though it has a somewhat different conception of each.
[2] A problem explicitly recognized by Kubota (2010) in section 3.2.1.

2 Linear Categorial Grammar (LinCG)

Curryesque grammars separate the notion of linear order from the abstract combinatorics of linguistic expressions, and as such base their tectogrammars around logics other than bilinear logic; the Grammatical Framework is based on Martin-Löf type theory, and LinCG and its cousins ACG and λG use linear logic.[3] Linear logic is generally described as being "resource-sensitive", owing to the lack of the structural rules of weakening and contraction. Resource sensitivity is an attractive notion, theoretically, since it allows us to describe processes of resource production, consumption, and combination in a manner which is agnostic about precisely how resources are combined. Certain problems which have been historically tricky for Lambek categorial grammars (medial extraction, quantifier scope, etc.) are easily handled by LinCG.

[3] The exact status of the rule of η-conversion with respect to this framework is currently unclear, and since we do not make use of it, we omit its discussion here.

Since a full introduction to the framework is regrettably impossible given current constraints, we refer the interested reader to the references in section 1.1, which contain a more in-depth discussion of the potential richness of the architecture of LinCG. We do not wish to say anything new about the semantics or the tectogrammar of coordination in the current discussion, so we will expend our time fleshing out the phenogrammatical component of the framework, and it is to this topic that we now turn.

2.1 LinCG Phenogrammar

LinCG grammar rules take the form of tripartite inference rules, indicating what operations take place pointwise within each component of the signs in question. There are two main grammar rules, called application (App) for combining signs, and abstraction (Abs) for creating the potential for combination through hypothetical reasoning. Aside from the lexical entries given as axioms of the theory, it is also possible to obtain typed variables using the rule of axiom (Ax), and we make use of this rule in the analysis of right node raising found in section 3.4. While the tectogrammar of LinCG is based on a fragment of linear logic, the phenogrammatical and semantic components are based on higher order logic. Since we are concerned only with the phenogrammatical component here, we have chosen to simplify the exposition by presenting only the phenogrammatical part of the rules of application and abstraction:

(Ax)  f : A ⊢ f : A

(App)  from Γ ⊢ f : A → B and Δ ⊢ a : A, infer Γ, Δ ⊢ (f a) : B

(Abs)  from Γ, x : A ⊢ b : B, infer Γ ⊢ λx : A. b : A → B

We additionally stipulate the following familiar axioms governing the conversion and reduction of lambda terms:

⊢ λx : A. b = λy : A. [y/x]b   (α-conversion)
⊢ ((λx. b) a) = [a/x]b   (β-reduction)

As is common to any number of Curryesque frameworks, we encode the phenogrammatical parts of LinCG signs with typed lambda terms consisting of strings, and functions over strings.[4] We axiomatize our theory of strings in the familiar way:

⊢ ε : St
⊢ · : St → St → St
⊢ ∀stu : St. s · (t · u) = (s · t) · u
⊢ ∀s : St. ε · s = s = s · ε

The first axiom asserts that the empty string ε is a string. The second axiom asserts that concatenation, written ·, is a (curried) binary function on strings. The third axiom represents the fact that concatenation is associative, and the fourth, that the empty string is a two-sided identity for concatenation. Because of the associativity of concatenation, we will drop parentheses as a matter of convention.

[4] Although other structures have been proposed, e.g. the node sets found in Muskens (2001).

The phenogrammar of a typical LinCG sign will resemble the following (with one complication to be added shortly):

⊢ λs. s · SNIVELED : St → St

Since we treat St as the only base type, we will generally omit typing judgments in lambda terms when no confusion will result. Furthermore, we use SMALLCAPS to indicate that a particular constant is a string. So, the preceding lexical entry provides us with a function from some string s, to strings, which concatenates the string SNIVELED to the right of s.

2.1.1 Phenominators

The center of our analysis of coordination is the notion of a phenominator (short for pheno-combinator), a particular variety of typed lambda term. Intuitively, phenominators serve the same purpose for LinCG that bilinear (slash) types do for Lambek categorial grammars. Specifically,
they encode the linearization structure of a functor, that is, where arguments may eventually occur with respect to its string support. To put it another way, a phenominator describes the structure a functor "projects", in terms of linear order.

From a technical standpoint, we would like to define a phenominator as a closed monoidal linear lambda term, i.e. a term containing no constants other than concatenation and the empty string. The idea is that phenominators are the terms of the higher order theory of monoids, and they in some ways describe the abstract "shape" of possible string functions. For those accustomed to thinking of "syntax" as being word order, phenominators can be thought of as a kind of syntactic combinator. In practice, we will make use only of what we call the unary phenominators, the types of which we will refer to using the sort Φ (with ϕ used by custom as a metavariable over unary phenominators, i.e. terms whose type is in Φ). These are not unary in the strict sense, but they will have as their centerpiece one particular string variable, which will be bound with the highest scope. We will generally abbreviate phenominators by the construction with which they are most commonly associated: VP for verb phrases and intransitive verbs, TV for transitive verbs, DTV for ditransitive verbs, QNP for quantified noun phrases, and RNR for right node raising constructions. Here are examples of some of the most common phenominators we will make use of and the abbreviations we customarily use for them:

Phenominator          | Abbreviation
λs.s                  | (omitted)
λvs.s · v             | VP
λvst.t · v · s        | TV
λvstu.u · v · s · t   | DTV
λvP.(P v)             | QNP
λvs.v · s             | RNR

As indicated previously, the first argument of a phenominator always corresponds to what we refer to (after Oehrle (1995)) as the string support of a particular term. With the first argument dispensed with, we have chosen the argument order of the phenominators out of general concern for what we perceive to be fairly uncontroversial categorial analyses of English grammatical phenomena. That is, transitive verbs take their object arguments first, and then their subject arguments; ditransitives take their first and second object arguments, followed by their subject argument, etc. As long as the arguments in question are immediately adjacent to the string support at each successive application, it is possible to permute them to some extent without losing the general thrust of the analysis. For example, the choice to have transitive verbs take their object arguments first is insignificant.[5] Since strings are implicitly under the image of the identity phenominator λs.s, we will consistently omit this subscript.

[5] Since we believe it is possible to embed Lambek categorial grammars in LinCG, this fact reflects that the calculus we are dealing with is similar to the associative Lambek Calculus.

We will be able to define a function we call say, so that it will have the following property:

⊢ ∀s : St. ∀ϕ : Φ. say (ϕ s) = s

That is, say is a left inverse for unary phenominators.

The function say is defined recursively via certain objects we call vacuities. The idea of a vacuity is that it be in some way an "empty argument" to which a functional term may apply. If we are dealing with functions taking string arguments, it seems obvious that the vacuity on strings should be the empty string ε. If we are dealing with second-order functions taking St → St arguments, for example quantified noun phrases like everyone, then the vacuity on St → St should be the identity function on strings, λs.s. Higher vacuities than these become more complicated, and defining all of the higher-order vacuities is not entirely straightforward, as certain types are not guaranteed to have a unique vacuity. Fortunately, we can do it for any higher-order function taking as an argument another function under the image of a phenominator; in that case the vacuity on such a function is just the phenominator applied to the empty string.[6] The central idea is easily understood when one asks what, say, a vacuous transitive verb sounds like. The answer seems to be: by itself, nothing, but it imposes a certain order on its arguments. One practical application of this clause is in analyzing so-called "argument cluster coordination", where this definition will ensure that the argument cluster gets linearized in the correct manner. This analysis is regrettably just outside the scope of the current inquiry, though the notion of the phenominator can be profitably employed to provide exactly such an analysis by adopting and reinterpreting a categorial account along the lines of the one given in Dowty (1988).

[6] A reviewer suggests that this concept may be related to the "context passing representation" of Hughes (1995), and the association of a nil term with its continuation with respect to contexts is assuredly evocative of the association of the vacuity on a phenominator-indexed type with the continuation of ε with respect to a phenominator.

We formally define vacuities as follows:

vac_{St→St} =def λs.s
vac_{τϕ} =def (ϕ ε)

The reader should note that as a special case of the second clause, we have

vac_St = vac_{St_{λs.s}} = ((λs.s) ε) = ε

This in turn enables us to define say:

say_St =def λs.s
say_{τ1→τ2} =def λk : τ1 → τ2. say_{τ2} (k vac_{τ1})
say_{(τ1→τ2)ϕ} =def say_{τ1→τ2}

For an expedient example, we can apply say to our putative lexical entry from earlier, and verify that it will reduce to the string SNIVELED as desired:

say_{St→St} (λs. s · SNIVELED)
= (λk : St → St. say_St (k vac_St)) (λs. s · SNIVELED)
= say_St ((λs. s · SNIVELED) vac_St)
= say_St ((λs. s · SNIVELED) ε)
= say_St (ε · SNIVELED)
= say_St SNIVELED
= (λs.s) SNIVELED
= SNIVELED

2.1.2 Subtyping by unary phenominators

In order to augment our type theory with the relevant subtypes, we turn to Lambek and Scott (1986), who hold that one way to do subtyping is by defining predicates that amount to the characteristic function of the particular subtype in question, and then ensuring that these predicates meet certain axioms embedding the subtype into the supertype. We will be able to write such predicates using phenominators. A unary phenominator is one which has under its image a function whose string support is a single contiguous string. With this idea in place, we are able to assign subtypes to functional types in the following way.

For τ a (functional) type, we write τ_ϕ (with ϕ a phenominator) as shorthand for τ_ϕ′, where:

ϕ′ = λf : τ. ∃s : St. f = (ϕ s)

Then ϕ′ constitutes a subtyping predicate in the manner of Lambek and Scott (1986). For example, let τ = St → St and ϕ = λvs.s · v. Let us consider the following (putative) lexical entry (pheno only):

⊢ λs′. s′ · SNIVELED : (St → St)_VP

Then our typing is justified along the following lines:

τ_ϕ ::= (St → St)_VP
    ::= (St → St)_{λvs.s·v}
    ::= (St → St)_{λf:St→St. ∃t:St. f=(λvs.s·v t)}

So applying the subtyping predicate to the term in question, we have

(λf : St → St. ∃t : St. f = ((λvs.s·v) t)) (λs′. s′ · SNIVELED)
= ∃t : St. λs′. s′ · SNIVELED = ((λvs.s·v) t)
= ∃t : St. λs′. s′ · SNIVELED = λs. s · t
= ∃t : St. λs. s · SNIVELED = λs. s · t

which is true with t = SNIVELED, and the term is shown to be well-typed.

3 Analysis

The basic strategy underlying our analysis of coordination is that in order to coordinate two linguistic signs, we need to track two things: their linearization structure, and their string support. If we have access to the linearization structure of each conjunct, then we can check to see that it is the same, and the signs are compatible for coordination. Furthermore, we will be able to maintain this structure independent of the actual string support of the individual signs. Phenominators simultaneously allow us to check the linearization structure of coordination candidates and to reconstruct the relevant linearization functions after coordination has taken place. The function say addresses the second point. For a given sign, we can apply say to it in order to retrieve its string support. Then, we will be able to directly coordinate the resulting strings by concatenating them with a conjunction in between. Finally, we can apply the phenominator to the resulting string and retrieve the new linearization function, containing the entire coordinate structure as its string support.

3.1 Lexical entries

In LinCG, lexical entries constitute the (nonlogical) axioms of the proof theory. First we consider the simplest elements of our fragment, the phenos for the proper names Joffrey, Tyrion, and Tywin:

(4) a. ⊢ JOFFREY : St
    b. ⊢ TYRION : St
    c. ⊢ TYWIN : St

Next, we consider the intransitive verbs drank, sniveled and whined:

(5) a. ⊢ λs. s · DRANK : (St → St)_VP
    b. ⊢ λs. s · SNIVELED : (St → St)_VP
    c. ⊢ λs. s · WHINED : (St → St)_VP

Each of these is a function from strings to strings, seeking to linearize its 'subject' string argument to the left of the verb. They are under the image of the "verb phrase" phenominator λvs.s · v.

The transitive verbs chastised and slapped seek to linearize their first string argument to the right, resulting in a function under the image of the VP phenominator, and their second argument to the left, resulting in a string.

(6) a. ⊢ λst. t · CHASTISED · s : (St → St → St)_TV
    b. ⊢ λst. t · SLAPPED · s : (St → St → St)_TV

Technically, this type could be written (St → (St → St)_VP)_TV, but for the purposes of coordination, the present is sufficient. Each of these entries is under the image of the "transitive verb" phenominator λvst.t · v · s.

Finally, we come to the lexical entry schema for and:

(7) ⊢ λc1 : τ_ϕ. λc2 : τ_ϕ. ϕ ((say_{τϕ} c2) · AND · (say_{τϕ} c1)) : τ_ϕ → τ_ϕ → τ_ϕ

We note first that it takes two arguments of identical types τ, and furthermore that these must be under the image of the same phenominator ϕ. It then returns an expression of the same subtype.[7] This mechanism bears more detailed examination. First, each conjunct is subjected to the function say, which, given its type, will return the string support of the conjunct. Then, the resulting strings are concatenated to either side of the string AND. Finally, the phenominator of each argument is applied to the resulting string, creating a function identical to the linearization functions of each of the conjuncts, except with the coordinated string appearing in the relevant position.

[7] Since ϕ occurs within both the body of the term and the subtyping predicate, we note that this effectively takes us into the realm of dependent types. Making the type theory of the phenogrammar precise is an ongoing area of research, and we are aware that constraining the type system is of paramount importance for computational tractability.

3.2 String coordination

String coordination is direct and straightforward. Since string-typed terms are under the image of the identity phenominator, and since say is also defined to be the identity on strings, the lexical entry we obtain for and simply concatenates each argument string to either side of the string AND. We give the full term reduction here, although this version of and can be shown to be equal to the following:

⊢ λc1c2 : St. c2 · AND · c1 : St → St → St

Since our terms at times become rather large, we will adopt a convention where proof trees are given with numerical indexes instead of sequents, with the corresponding sequents following below (at times on multiple lines). We will from time to time elide multiple steps of reduction, noting in passing the relevant definitions to consider when reconstructing the proof.

1 2
3 4
6 5
7

1. ⊢ λc1 : St. λc2 : St. (λs.s) ((say_St c2) · AND · (say_St c1)) : St → St → St
2. ⊢ JOFFREY : St
3. ⊢ λc2 : St. (λs.s) ((say_St c2) · AND · (say_St JOFFREY))
   = λc2 : St. (λs.s) ((say_St c2) · AND · ((λs.s) JOFFREY))
   = λc2 : St. (λs.s) ((say_St c2) · AND · JOFFREY) : St → St
4. ⊢ TYRION : St
5. ⊢ (λs.s) ((say_St TYRION) · AND · JOFFREY) : St
   = (λs.s) (((λs.s) TYRION) · AND · JOFFREY) : St
   = (λs.s) (TYRION · AND · JOFFREY) : St
   = TYRION · AND · JOFFREY : St
6. ⊢ λs. s · DRANK : (St → St)_VP
7. ⊢ TYRION · AND · JOFFREY · DRANK : St

3.3 Functor coordination

Here, in order to understand the term appearing in each conjunct, it is helpful to notice that the following equality holds (with f a function from strings to strings, under the image of the VP phenominator):

f : (St → St)_VP ⊢ say_{(St→St)VP} f
= say_{St→St} f
= say_St (f vac_St)
= say_St (f ε)
= (λs.s) (f ε)
= (f ε) : St
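This reduction, and the coordination schema in (7) that relies on it, can be checked mechanically in a small Python model. This is our own illustrative sketch, not the paper's implementation: phenominator subscripts become explicit functions, · becomes space-separated concatenation, and say is simplified to feed ε until a string remains, which coincides with the type-directed definition only for the first-order cases used here.

```python
# Toy model (ours, not the paper's code) of phenominators, say, and the
# lexical schema for "and": λc1 c2. ϕ((say c2) · AND · (say c1)).

EPSILON = ""

def cat(*parts):
    """Concatenation ·, with EPSILON as a two-sided identity."""
    return " ".join(p for p in parts if p)

# Unary phenominators; the first argument v is the string support.
ID  = lambda v: v                    # λs.s, the identity phenominator (strings)
VP  = lambda v: lambda s: cat(s, v)  # λv s. s·v
RNR = lambda v: lambda s: cat(v, s)  # λv s. v·s

def say(f):
    """Retrieve the string support by feeding vacuities (here: ε)."""
    while callable(f):
        f = f(EPSILON)
    return f

def and_lex(phi):
    """Schema (7); c1 is consumed first and surfaces as the right conjunct."""
    return lambda c1: lambda c2: phi(cat(say(c2), "AND", say(c1)))

# String coordination: (1) Tyrion and Joffrey drank.
subj = and_lex(ID)("JOFFREY")("TYRION")
print(VP("DRANK")(subj))       # prints "TYRION AND JOFFREY DRANK"

# Functor (VP) coordination: (2) Joffrey whined and sniveled.
vp = and_lex(VP)(VP("SNIVELED"))(VP("WHINED"))
print(vp("JOFFREY"))           # prints "JOFFREY WHINED AND SNIVELED"

# Right node raising: (3) Tyrion slapped and Tywin chastised Joffrey.
rnr = and_lex(RNR)(lambda t: cat("TYWIN", "CHASTISED", t))(
                   lambda s: cat("TYRION", "SLAPPED", s))
print(rnr("JOFFREY"))  # prints "TYRION SLAPPED AND TYWIN CHASTISED JOFFREY"
```

Note that say(VP("SNIVELED")) evaluates to "SNIVELED", mirroring the displayed reduction: feeding the linearization function the empty string exposes its string support, after which the shared phenominator rebuilds a linearization function around the coordinated string.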

This says that to coordinate VPs, we will first need to reduce them to their string support by feeding their linearization functions the empty string. For the sake of brevity, this term reduction will be elided from steps 5 and 8 in the derivations below. Steps 2 and 6 constitute the hypothesizing and subsequent withdrawal of an 'object' string argument t0, as do steps 10 and 14 (s0). Formatting restrictions prohibit rule-labeling on the proof trees, so we note that these are each instances of the rules of axiom (Ax) and abstraction (Abs), respectively.

[Proof tree over steps 1–7 omitted.]

1. ⊢ λc1 : (St → St)_VP. λc2 : (St → St)_VP. λv s. s · v ((say_{(St → St)_VP} c2) · AND · (say_{(St → St)_VP} c1)) : (St → St)_VP → (St → St)_VP → (St → St)_VP
2. ⊢ λs. s · SNIVELED : (St → St)_VP
3. ⊢ λc2 : (St → St)_VP. λv s. s · v ((say_{(St → St)_VP} c2) · AND · (say_{(St → St)_VP} λs. s · SNIVELED))
     ⋮
     = λc2 : (St → St)_VP. λv s. s · v ((say_{(St → St)_VP} c2) · AND · SNIVELED) : (St → St)_VP → (St → St)_VP
4. ⊢ λs. s · WHINED : (St → St)_VP
5. ⊢ λv s. s · v ((say_{(St → St)_VP} λs. s · WHINED) · AND · SNIVELED)
     ⋮
     = (λv s. s · v) (WHINED · AND · SNIVELED)
     = λs. s · WHINED · AND · SNIVELED : (St → St)_VP
6. ⊢ JOFFREY : St
7. ⊢ JOFFREY · WHINED · AND · SNIVELED : St

3.4 Right node raising

In the end, 'right node raising' constructions prove only to be a special case of functor coordination. The key here is the licensing of the 'rightward-looking' functors, which are under the image of the phenominator λv s. v · s. As was the case with the 'leftward-looking' functor coordination example in section 3.3, this analysis is essentially the same as the well-known Lambek categorial grammar analysis originating in Steedman (1985) and continuing in Dowty (1988) and Morrill (1994). The difference is that we encode directionality in the phenominator, rather than in the type. Since our system includes function composition not as a rule but as a theorem, we will need to make use of hypothetical reasoning to permute the order of the string arguments in order to construct expressions with the correct structure.⁸

As was the case with the functor coordination example in section 3.3, applying say to the conjuncts passes them the empty string, reducing them to their string support, as shown here:

f : (St → St)_RNR ⊢ say_{(St → St)_RNR} f
  = say_{St → St} f
  = say_St (f vac_St)
  = say_St (f ε)
  = λs.s (f ε)
  = (f ε) : St

As before, this reduction is elided in the proof given below, occurring in steps 8 and 15.

[Proof tree over steps 1–17 omitted.]

1. ⊢ λs t. t · CHASTISED · s : (St → St → St)_TV
2. t0 : St ⊢ t0 : St
3. t0 : St ⊢ λt. t · CHASTISED · t0 : (St → St)_VP
4. ⊢ TYWIN : St
5. t0 : St ⊢ TYWIN · CHASTISED · t0 : St
6. ⊢ λt0. TYWIN · CHASTISED · t0 : (St → St)_RNR
7. ⊢ λc1 : (St → St)_RNR. λc2 : (St → St)_RNR. λv s. v · s ((say_{(St → St)_RNR} c2) · AND · (say_{(St → St)_RNR} c1)) : (St → St)_RNR → (St → St)_RNR → (St → St)_RNR
8. ⊢ λc2 : (St → St)_RNR. λv s. v · s ((say_{(St → St)_RNR} c2) · AND · (say_{(St → St)_RNR} λt0. TYWIN · CHASTISED · t0))
     ⋮
     = λc2 : (St → St)_RNR. λv s. v · s ((say_{(St → St)_RNR} c2) · AND · TYWIN · CHASTISED) : (St → St)_RNR → (St → St)_RNR

⁸ Regrettably, space constraints prohibit a discussion verifying the typing for the 'right node raised' terms. The reader can verify that the terms are in fact well-typed given the subtyping schema in section 2.1.2. It is possible to write inference rules that speak directly to the introduction and elimination of the relevant functional subtypes, but these are omitted here for the sake of brevity.

9. ⊢ λs t. t · SLAPPED · s : (St → St → St)_TV
10. s0 : St ⊢ s0 : St
11. s0 : St ⊢ λt. t · SLAPPED · s0 : (St → St)_VP
12. ⊢ TYRION : St
13. s0 : St ⊢ TYRION · SLAPPED · s0 : St
14. ⊢ λs0. TYRION · SLAPPED · s0 : (St → St)_RNR
15. ⊢ λv s. v · s ((say_{(St → St)_RNR} λs0. TYRION · SLAPPED · s0) · AND · TYWIN · CHASTISED)
      ⋮
      = (λv s. v · s) (TYRION · SLAPPED · AND · TYWIN · CHASTISED)
      = λs. TYRION · SLAPPED · AND · TYWIN · CHASTISED · s : (St → St)_RNR
16. ⊢ JOFFREY : St
17. ⊢ TYRION · SLAPPED · AND · TYWIN · CHASTISED · JOFFREY : St

4 Discussion

We have provided a brief introduction to the framework of Linear Categorial Grammar (LinCG). One of the primary strengths of categorial grammar in general has been its ability to address coordination phenomena. Coordination presents a particular problem for grammars which distinguish between structural combination (tectogrammar) and the actual linear order of the strings generated by such grammars (part of phenogrammar). Due to the inability to distinguish 'directionality' in string functors within a standard typed lambda calculus, a general analysis of coordination seems difficult.

We have elaborated LinCG's concept of phenogrammar by introducing phenominators, closed monoidal linear lambda terms. We have shown how the recursive function say provides a left inverse for unary phenominators, and we have defined a more general notion of an 'empty category' known as a vacuity, in terms of which say is defined. It is then possible to describe subtypes of functional types suitable to make the relevant distinctions. These technologies enable us to give analyses of various coordination phenomena in LinCG, extending the empirical coverage of the framework.

4.1 Future work

It is possible to give an analysis of argument cluster coordination using phenominators, instantiating the lexical entry for and with τ as the type (St → St → St → St)_DTV → (St → St)_VP and ϕ as λv P s. s · (P ε) · v, and using hypothetical reasoning. Regrettably, the necessity of brevity prohibits a detailed account here.

Given that phenominators provide access to the structure of functional terms which concatenate strings to the right and left of their string support, it is our belief that any Lambek categorial grammar analysis can be recast in LinCG by an algorithmic translation of directional slash types into phenominator-indexed functional phenotypes, and we are currently in the process of evaluating a potential translation algorithm from directional slash types to phenominators. This should in turn provide us with most of the details necessary to describe a system which emulates the HTLCG of Kubota and Levine (2012), which provides analyses of various gapping phenomena, greatly increasing the overall empirical coverage.

There are a number of coordination phenomena that require modifications to the tectogrammatical component. We would like to be able to analyze unlike category coordinations like rich and an excellent cook in the manner of Bayer (1996), as well as Morrill (1996), which would require the addition of some variety of sum types in the tectogrammar. Further muddying the waters is so-called "iterated" or "list" coordination, which requires the ability to generate coordinate structures containing a number of conjuncts with no coordinating conjunction, as in Thurston, Kim, and Steve.

It is our intent to extend the use of phenominators to analyze intonation as well, and we expect that they can be fruitfully employed to give accounts of focus, association with focus, contrastive topicalization, "in-situ" topicalization, alternative questions, and any number of other phenomena which are at least partially realized prosodically.

Acknowledgements

I am grateful to Carl Pollard, Bob Levine, Yusuke Kubota, Manjuan Duan, Gerald Penn, the TTNLS 2014 committee, and two anonymous reviewers for their comments. Any errors or misunderstandings rest solely on the shoulders of the author.
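The coordination mechanism analysed above lends itself to a small executable illustration. The Python sketch below is only an informal model of the idea, not the LinCG calculus itself: all helper names are invented, and space-separated concatenation stands in for the paper's · operator. Linearization functions are modelled as closures over strings, say recovers a functor's string support by feeding it the empty string (the vacuity), and coordination re-applies the shared phenominator to the string built around AND.

```python
# Informal model of LinCG-style coordination (illustration only).
# Strings model the type St; string functions model (St -> St) functors.

EPSILON = ""  # stands in for the vacuity vac_St (the empty string)

def concat(*parts):
    """Concatenation, written '.' in the paper; spaces mark word breaks."""
    return " ".join(p for p in parts if p)

def say(term):
    """Left inverse of a unary phenominator: a string is its own string
    support; a string function is reduced by applying it to EPSILON."""
    return term if isinstance(term, str) else term(EPSILON)

def coordinate(pheno, c1, c2):
    """Coordinate two conjuncts (in surface order) under the shared
    phenominator `pheno`: recover each string support, concatenate
    around AND, and re-apply the phenominator."""
    return pheno(concat(say(c1), "AND", say(c2)))

# Phenominators: identity (strings), leftward (VP), rightward (RNR).
ident = lambda v: v
vp    = lambda v: lambda s: concat(s, v)  # λv.λs. s · v
rnr   = lambda v: lambda s: concat(v, s)  # λv.λs. v · s

# String coordination (section 3.2): TYRION AND JOFFREY DRANK
subj  = coordinate(ident, "TYRION", "JOFFREY")
drank = vp("DRANK")
print(drank(subj))  # TYRION AND JOFFREY DRANK

# Functor coordination (section 3.3): JOFFREY WHINED AND SNIVELED
whined, sniveled = vp("WHINED"), vp("SNIVELED")
print(coordinate(vp, whined, sniveled)("JOFFREY"))

# Right node raising (section 3.4):
# TYRION SLAPPED AND TYWIN CHASTISED JOFFREY
slapped   = rnr("TYRION SLAPPED")
chastised = rnr("TYWIN CHASTISED")
print(coordinate(rnr, slapped, chastised)("JOFFREY"))
```

The three printed strings match the conclusions of the derivations in sections 3.2–3.4; the hypothetical-reasoning steps of the formal proofs have no counterpart here, since Python closures already abstract over their string arguments freely.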

References

Samuel Bayer. 1996. The coordination of unlike categories. Language, pages 579–616.

Haskell B. Curry. 1961. Some Logical Aspects of Grammatical Structure. In Roman O. Jakobson, editor, Structure of Language and its Mathematical Aspects, pages 56–68. American Mathematical Society.

Philippe de Groote and Sarah Maarek. 2007. Type-theoretic Extensions of Abstract Categorial Grammars. In Reinhard Muskens, editor, Proceedings of Workshop on New Directions in Type-Theoretic Grammars.

Philippe de Groote. 2001. Towards Abstract Categorial Grammars. In Association for Computational Linguistics, 39th Annual Meeting and 10th Conference of the European Chapter, Proceedings of the Conference, pages 148–155.

David Dowty. 1988. Type raising, functional composition, and non-constituent conjunction. In Richard T. Oehrle, Emmon Bach, and Deirdre Wheeler, editors, Categorial Grammars and Natural Language Structures, volume 32 of Studies in Linguistics and Philosophy, pages 153–197. Springer Netherlands.

John Hughes. 1995. The design of a pretty-printing library. In Advanced Functional Programming, pages 53–96. Springer.

Yusuke Kubota and Robert Levine. 2012. Gapping as like-category coordination. In D. Béchet and A. Dikovsky, editors, Logical Aspects of Computational Linguistics (LACL) 2012.

Yusuke Kubota. 2010. (In)flexibility of Constituency in Japanese in Multi-Modal Categorial Grammar with Structured Phonology. Ph.D. thesis, The Ohio State University.

J. Lambek and P. J. Scott. 1986. Introduction to Higher Order Categorical Logic. Cambridge University Press.

Scott Martin. 2013. The Dynamics of Sense and Implicature. Ph.D. thesis, The Ohio State University.

Vedrana Mihaliček. 2012. Serbo-Croatian Word Order: A Logical Approach. Ph.D. thesis, The Ohio State University.

Glyn V. Morrill. 1994. Type Logical Grammar. Kluwer Academic Publishers.

Glyn Morrill. 1996. Grammar and logic. Theoria, 62(3):260–293.

Reinhard Muskens. 2001. Lambda grammars and the syntax-semantics interface. In Proceedings of the Thirteenth Amsterdam Colloquium, pages 150–155. Universiteit van Amsterdam.

Reinhard Muskens. 2010. New Directions in Type-Theoretic Grammars. Journal of Logic, Language and Information, 19(2):129–136. DOI 10.1007/s10849-009-9114-9.

Richard Oehrle. 1994. Term-Labeled Categorial Type Systems. Linguistics and Philosophy, 17:633–678.

Dick Oehrle. 1995. Some 3-dimensional systems of labelled deduction. Logic Journal of the IGPL, 3(2-3):429–448.

Carl Pollard and E. Allyn Smith. 2012. A unified analysis of the same, phrasal comparatives, and superlatives. In Anca Chereches, editor, Proceedings of SALT 22, pages 307–325. eLanguage.

Carl Pollard. 2013. Agnostic Hyperintensional Semantics. Synthese. To appear.

Aarne Ranta. 2004. Grammatical Framework: A Type-Theoretical Grammar Formalism. Journal of Functional Programming, 14:145–189.

Aarne Ranta. 2009. The GF resource grammar library. Linguistic Issues in Language Technology, 2(1).

E. Allyn Smith. 2010. Correlational Comparison in English. Ph.D. thesis, The Ohio State University.

Mark Steedman. 1985. Dependency and coordination in the grammar of Dutch and English. Language, pages 523–568.

Natural Language Reasoning Using Proof-assistant Technology: Rich Typing and Beyond∗

Stergios Chatzikyriakidis
Dept of Computer Science, Royal Holloway, Univ of London, Egham, Surrey TW20 0EX, U.K.; Open University of Cyprus
[email protected]

Zhaohui Luo
Dept of Computer Science, Royal Holloway, Univ of London, Egham, Surrey TW20 0EX, U.K.
[email protected]

Proceedings of the EACL 2014 Workshop on Type Theory and Natural Language Semantics (TTNLS), pages 37–45, Gothenburg, Sweden, April 26-30 2014. © 2014 Association for Computational Linguistics

Abstract

In this paper, we study natural language inference based on the formal semantics in modern type theories (MTTs) and their implementations in proof assistants such as Coq. To this end, the type theory UTT with coercive subtyping is used as the logical language into which natural language semantics is translated, followed by the implementation of these semantics in the Coq proof assistant. Valid inferences are treated as theorems to be proven via Coq's proof machinery. We shall emphasise that the rich typing mechanisms in MTTs (much richer than those in the simple type theory as used in the Montagovian setting) provide very useful tools in many respects in formal semantics. This is exemplified via the formalisation of various linguistic examples, including conjoined NPs, comparatives, adjectives, as well as various linguistic coercions. The aim of the paper is thus twofold: a) to show that the use of proof-assistant technology indeed has the potential to be developed into a new way of dealing with inference, and b) to exemplify the advantages of having a rich typing system for the study of formal semantics in general and natural language inference in particular.

1 Introduction

Natural Language Inference (NLI), i.e. the task of determining whether an NL hypothesis can be inferred from an NL premise, has been an active research theme in computational semantics in which various approaches have been proposed (see, for example, (MacCartney, 2009) and some of the references therein). In this paper, we study NLI based on formal semantics in MTTs with coercive subtyping (Luo, 2012b) and its implementation in the proof assistant Coq (Coq, 2007).

A Modern Type Theory (MTT) is a dependent type theory consisting of an internal logic, which follows the propositions-as-types principle. This latter feature, along with the availability of powerful type structures, makes MTTs very useful for formal semantics. The use of MTTs for NL semantics has been proposed with exciting results as regards various issues of NL semantics, ranging from quantification and anaphora to adjectival modification, co-predication, belief and context formalization (Sundholm, 1989; Ranta, 1994; Boldini, 2000; Cooper, 2005; Fox and Lappin, 2005; Retoré, 2013; Ginzburg and Cooper, forthcoming; Luo, 2011a; Luo, 2012b; Chatzikyriakidis and Luo, 2012; Chatzikyriakidis and Luo, 2013a). Recently, there has been a systematic study of MTT semantics using Luo's UTT with coercive subtyping (type theory with coercive subtyping, henceforth TTCS) (Luo, 2010; Luo, 2011a; Luo, 2012b; Chatzikyriakidis and Luo, 2012; Chatzikyriakidis and Luo, 2013a; Chatzikyriakidis and Luo, 2013b; Chatzikyriakidis and Luo, 2014). This is the version of MTT used in this paper. More specifically, the paper concentrates on one of the key differences between MTTs and simple typed ones, i.e. rich typing. Rich typing will be shown to be a key ingredient for both formal semantics in general and the study of NLI in particular.

A proof assistant is a computer system that assists the users to develop proofs of mathematical theorems. A number of proof assistants implement MTTs. For instance, the proof assistant Coq (Coq, 2007) implements pCIC, the predicative Calculus of Inductive Constructions,¹ and supports some very useful tactics that can be used to help the users to automate (parts of) their proofs. Proof assistants have been used in various applications in computer science (e.g., program verification) and formalised mathematics (e.g., formalisation of the proof of the 4-colour theorem in Coq).

The above two developments, the use of MTT semantics on the one hand and the implementation of MTTs in proof assistants on the other, have opened a new research avenue: the use of existing proof assistants in dealing with NLI. In this paper, two different goals are to be achieved: a) on a more practical level, to show how proof-assistant technology can be used in order to deal with NLI, and b) on a theoretical level, to show the significance of rich typing for formal semantics and NLI in particular. These two different aspects of the paper will be studied on a par, by concentrating on a number of NLI cases (quite a lot, actually) that are adequately dealt with on a theoretical level via rich typing, and on the implementation of the account making use of rich type structures in Coq on a more practical level. We shall also consider how to employ dependent typing in the coercive subtyping framework to formalise linguistic coercions.

∗ This work is supported by the research grant F/07-537/AJ of the Leverhulme Trust in the U.K.
¹ pCIC is a type theory that is rather similar to UTT, especially after its universe Set became predicative since Coq 8.0. A main difference is that UTT does not have co-inductive types. The interested reader is directed to Goguen's PhD thesis.

2 Rich typing in MTTs

A Modern Type Theory (MTT) is a variant of a class of type theories in the tradition initiated by the work of Martin-Löf (Martin-Löf, 1975; Martin-Löf, 1984), which have dependent and inductive types, among others. We choose to call them Modern Type Theories in order to distinguish them from Church's simple type theory (Church, 1940) that is commonly employed within the Montagovian tradition in formal semantics.

Among the variants of MTTs, we are going to employ the Unified Theory of dependent Types (UTT) (Luo, 1994) with the addition of the coercive subtyping mechanism (see, for example, (Luo, 1999; Luo et al., 2012) and below). UTT is an impredicative type theory in which a type Prop of all logical propositions exists.² This stands as part of the study of linguistic semantics using MTTs, whose rich typing offers greater expressivity compared to classical simple typed systems as these are used in mainstream Montagovian semantics.

2.1 Type many-sortedness and CNs as types

In Montague semantics (Montague, 1974), the underlying logic (Church's simple type theory (Church, 1940)) can be seen as 'single-sorted' in the sense that there is only one type e of all entities. The other types, such as t of truth values and the function types generated from e and t, do not stand for types of entities. In this respect, there are no fine-grained distinctions between the elements of type e, and as such all individuals are interpreted using the same type. For example, John and Mary have the same type in simple type theories, the type e of individuals. An MTT, on the other hand, can be regarded as a 'many-sorted' logical system in that it contains many types, and as such one can make fine-grained distinctions between individuals and further use those different types to interpret subclasses of individuals. For example, one can have John : [[man]] and Mary : [[woman]], where [[man]] and [[woman]] are different types.

An important trait of MTT-based semantics is the interpretation of common nouns (CNs) as types (Ranta, 1994) rather than sets or predicates (i.e., objects of type e → t), as is the case within the Montagovian tradition. The CNs man, human, table and book are interpreted as types [[man]], [[human]], [[table]] and [[book]], respectively. Then, individuals are interpreted as being of one of the types used to interpret CNs. The interpretation of CNs as types is also a prerequisite in order for the subtyping mechanism to work. This is because, assuming CNs to be predicates, subtyping would go wrong given contravariance of function types.³

2.2 Subtyping

Coercive subtyping (Luo, 1999; Luo et al., 2012) provides an adequate framework to be employed for MTT-based formal semantics (Luo, 2010; Luo, 2012b).⁴ It can be seen as an abbreviation mechanism: A is a (proper) subtype of B (A < B) if
go wrong given contravariance of function types.3 Among the variants of MTTs, we are going to employ the Unified Theory of dependent Types 2.2 Subtyping (UTT) (Luo, 1994) with the addition of the co- Coercive subtyping (Luo, 1999; Luo et al., 2012) ercive subtyping mechanism (see, for example, provides an adequate framework to be employed (Luo, 1999; Luo et al., 2012) and below). UTT is for MTT-based formal semantics (Luo, 2010; Luo, an impredicative type theory in which a type P rop 2012b).4 It can be seen as an abbreviation mech- of all logical propositions exists.2 This stands anism: is a (proper) subtype of ( ) if as part of the study of linguistic semantics using A B A