
Linguists Who Use Probabilistic Models Love Them: Quantification in Functional Distributional Semantics

Guy Emerson
Department of Computer Science and Technology
University of Cambridge
[email protected]

Abstract

Functional Distributional Semantics provides a computationally tractable framework for learning truth-conditional semantics from a corpus. Previous work in this framework has provided a probabilistic version of first-order logic, recasting quantification as Bayesian inference. In this paper, I show how the previous formulation gives trivial truth values when a precise quantifier is used with vague predicates. I propose an improved account, avoiding this problem by treating a vague predicate as a distribution over precise predicates. I connect this account to recent work in the Rational Speech Acts framework on modelling generic quantification, and I extend this to modelling donkey sentences. Finally, I explain how the generic quantifier can be both pragmatically complex and yet computationally simpler than precise quantifiers.

1 Introduction

Model-theoretic semantics defines meaning in terms of truth, relative to model structures. In the simplest case, a model structure consists of a set of individuals (also called entities). The meaning of a content word is a predicate, formalised as a truth-conditional function which maps individuals to truth values (either truth or falsehood). Because of this precisely defined notion of truth, model theory naturally supports logic, and has become a prominent approach to formal semantics. For detailed expositions, see: Cann (1993); Allan (2001); Kamp and Reyle (2013).

Mainstream approaches to distributional semantics represent the meaning of a word as a vector (for example: Turney and Pantel, 2010; Mikolov et al., 2013; for an overview, see: Emerson, 2020b).

In contrast, Functional Distributional Semantics represents the meaning of a word as a truth-conditional function (Emerson and Copestake, 2016; Emerson, 2018). It is therefore a promising framework for automatically learning truth-conditional semantics from large datasets. In previous work (Emerson and Copestake, 2017b, §3.5, henceforth E&C), I sketched how this approach can be extended with a probabilistic version of first-order logic, where quantifiers are interpreted in terms of conditional probabilities. I summarise this approach in §2 and §3.

There are four main contributions of this paper. In §4.1, I first point out a problem with my previous approach. Quantifiers like every and some are treated as precise, but predicates are vague. This leads to trivial truth values, with every trivially false, and some trivially true.

Secondly, I show in §4.2–4.4 how this problem can be fixed by treating a vague predicate as a distribution over precise predicates.

Thirdly, in §5 I look at vague quantifiers and generic sentences, which present a challenge for classical (non-probabilistic) theories. I build on Tessler and Goodman (2019)'s account of generics using Rational Speech Acts, a Bayesian approach to pragmatics (Frank and Goodman, 2012). I show how generic quantification is computationally simpler than classical quantification, consistent with evidence that generics are a "default" mode of processing (for example: Leslie, 2008; Gelman et al., 2015).

Finally, I show in §6 how this probabilistic approach can provide an account of donkey sentences, another challenge for classical theories. In particular, I consider generic donkey sentences, which are doubly challenging, and which provide counter-examples to the claim that donkey pronouns are associated with universal quantifiers.

Taking the above together, in this paper I show how a probabilistic first-order logic can be associated with a neural network model for distributional semantics, in a way that sheds light on long-standing problems in formal semantics.
2 Generalised Quantifiers

Partee (2012) recounts how quantifiers have played an important role in the development of model-theoretic semantics, seeing a major breakthrough with Montague (1973)'s work, and culminating in the theory of generalised quantifiers (Barwise and Cooper, 1981; Van Benthem, 1984).

Ultimately, model theory requires quantifiers to give truth values to propositions. An example of a logical proposition is given in Fig. 1, with a quantifier for each logical variable. This also assumes a neo-Davidsonian approach to event semantics (Davidson, 1967; Parsons, 1990).

∀x picture(x) → ∃z ∃y tell(y) ∧ story(z) ∧ ARG1(y, x) ∧ ARG2(y, z)

Figure 1: A first-order logical proposition, representing the most likely reading of Every picture tells a story. Scope ambiguity is not discussed in this paper.

Equivalently, we can represent a logical proposition as a scope tree, as in Fig. 2. The truth of the scope tree can be calculated by working bottom-up through the tree. The leaves of the tree are logical expressions with free variables. They can be assigned truth values if each variable is fixed as an individual in the model structure. To assign a truth value to the whole proposition, we work up through the tree, quantifying the variables one at a time. Once we reach the root, all variables have been quantified, and we are left with a truth value.

Figure 2: A scope tree, equivalent to Fig. 1 above. Each non-terminal node is a quantifier, with its bound variable in brackets. Its left child is its restriction, and its right child its body. The root every(x) has restriction picture(x) and body a(z); a(z) has restriction story(z) and body ∃(y); ∃(y) has a trivially true restriction and body tell(y) ∧ ARG1(y, x) ∧ ARG2(y, z).

Each quantifier is a non-terminal node with two children – its restriction (on the left) and its body (on the right). It quantifies exactly one variable, called its bound variable. Each node also has free variables. For each leaf, its free variables are exactly the variables appearing in the logical expression. For each quantifier, its free variables are the union of the free variables of its restriction and body, minus its own bound variable. For a well-formed scope tree, the root has no free variables. Each node in the tree defines a truth value, given a fixed value for each free variable.

The truth value for a quantifier node is defined based on its restriction and body. Given values for the quantifier's free variables, the restriction and body only depend on the quantifier's bound variable. The restriction and body therefore each define a set of individuals in the model structure – the individuals for which the restriction is true, and the individuals for which the body is true. We can write these as R(v) and B(v), respectively, where v denotes the values of all free variables.

Generalised quantifier theory says that a quantifier's truth value only depends on two quantities: the cardinality of the restriction |R(v)|, and the cardinality of the intersection of the restriction and body |R(v) ∩ B(v)|. Table 1 gives examples.

Table 1: Classical truth conditions for precise quantifiers, in generalised quantifier theory.

  Quantifier   Condition
  some         |R(v) ∩ B(v)| > 0
  every        |R(v) ∩ B(v)| = |R(v)|
  no           |R(v) ∩ B(v)| = 0
  most         |R(v) ∩ B(v)| > ½ |R(v)|
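Table 1 can be read directly as a recipe for evaluating a precise quantifier over a finite model structure. The following minimal sketch (not from the paper; the individuals and predicates are invented for illustration) computes each quantifier's classical truth value from the restriction set R and body set B.

```python
# Classical generalised quantifiers over a finite set of individuals.
# The toy model structure below is invented, not taken from the paper.

def quantifier_truth(quantifier, restriction, body, individuals):
    """Evaluate a quantifier, given restriction and body as truth-conditional functions."""
    r_set = {x for x in individuals if restriction(x)}
    rb_set = {x for x in r_set if body(x)}
    if quantifier == "some":
        return len(rb_set) > 0
    if quantifier == "every":
        return len(rb_set) == len(r_set)
    if quantifier == "no":
        return len(rb_set) == 0
    if quantifier == "most":
        return len(rb_set) > 0.5 * len(r_set)
    raise ValueError(quantifier)

# Toy model structure: three pictures, two of which tell a story.
individuals = {"p1", "p2", "p3"}
is_picture = lambda x: True
tells_a_story = lambda x: x in {"p1", "p2"}

for q in ["some", "every", "no", "most"]:
    print(q, quantifier_truth(q, is_picture, tells_a_story, individuals))
```

Only the two cardinalities |R(v)| and |R(v) ∩ B(v)| are used, which is the point exploited in the probabilistic reformulation in §3.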
3 Generalised Quantifiers in Functional Distributional Semantics

Functional Distributional Semantics defines a probabilistic graphical model for distributional semantics. Importantly (from the point of view of formal semantics), this graphical model incorporates a probabilistic version of model theory.

This is illustrated in Fig. 3. The top row defines a distribution over situations, each situation being an event with two participants.¹ This generalises a model structure comprising a set of situations, as in classical situation semantics (Barwise and Perry, 1983). Each individual is represented by a pixie, a point in a high-dimensional space, which represents the features of the individual. Two individuals could be represented by the same pixie, and the space of pixies can be seen as a conceptual space in the sense of Gärdenfors (2000, 2014).

Figure 3: Probabilistic model theory, as formalised in Functional Distributional Semantics. Each node is a random variable. The plate (box in bottom row) denotes repetition of nodes. Top row: pixie-valued random variables X, Y, Z together represent a situation composed of three individuals. They are jointly distributed according to the semantic roles ARG1 and ARG2. Their joint distribution can be seen as a probabilistic model structure. Bottom row: each predicate r in the vocabulary V has a probabilistic truth-conditional function, which can be applied to each individual. This gives a truth-valued random variable for each individual for each predicate.

¹ For situations with different structures (multiple events or different numbers of participants), we can define a family of such graphical models. Structuring the graphical model in terms of semantic roles makes the simplifying assumption that situation structure is isomorphic to a semantic dependency graph such as DMRS (Copestake et al., 2005; Copestake, 2009). In the general case, the assumption fails. For example, the ARG3 of sell corresponds to the ARG1 of buy.

The bottom row of the graphical model defines a distribution over truth values, so that each predicate has some probability of being true of each individual. Each predicate can therefore be seen as a probabilistic truth-conditional function.

In this paper, I will not discuss learning such a model (for an up-to-date approach, see: Emerson, 2020a). Instead, the focus is on how we can manipulate a trained model, to move from single predicates to complex propositions.

In previous work (E&C), I sketched an account of quantification. The idea is to follow generalised quantifier theory, but with a truth-valued random variable for each node in the scope tree. Similarly to the classical case, the distributions for these nodes are defined bottom-up through the tree.

In the classical theory, we only need to know the cardinalities |R(v)| and |R(v) ∩ B(v)|. In fact, all the conditions in Table 1 can be expressed in terms of the ratio |R(v) ∩ B(v)| / |R(v)|. It therefore makes sense to consider the conditional probability P(b | r, v),² because this uses the same ratio, as shown in (1).

P(b | r, v) = P(r, b | v) / P(r | v)    (1)

² I use uppercase for random variables, lowercase for values. I abbreviate P(X = x) as P(x), and P(T = ⊤) as P(t). For example, P(b | r, v) means P(B = ⊤ | R = ⊤, V = v).

More precisely, B and R are truth-valued random variables for the body and restriction, and V is a tuple-of-pixies-valued random variable, with one pixie for each free variable. Intuitively, the truth of a quantified expression depends on how likely B is to be true, given that R is true.³

³ This would not seem to cover so-called cardinal quantifiers like one and two. Under Link (1983)'s lattice-theoretic approach, a model structure contains plural individuals, so numbers can be treated as normal predicates like adjectives.

Truth conditions for quantifiers can be defined in terms of P(b | r, v), as shown in Table 2. For these precise quantifiers, the truth value is deterministic – if the condition in Table 2 holds, the quantifier's random variable Q has probability 1 of being true, otherwise it has probability 0. However, taking a probabilistic approach means that we can naturally model vague quantifiers like few and many. I did not give further details on this point in E&C, but I will expand on this in §5.

Table 2: Truth conditions for precise quantifiers, in terms of the conditional probability of the body given the restriction (and given all free variables). These conditions mirror Table 1.

  Quantifier   Condition
  some         P(b | r, v) > 0
  every        P(b | r, v) = 1
  no           P(b | r, v) = 0
  most         P(b | r, v) > ½
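To make equation (1) and Table 2 concrete, the sketch below (my own illustration, with an invented two-pixie distribution and invented predicate probabilities) computes P(b | r) by marginalising over the pixie for the bound variable, and then checks the Table 2 conditions.

```python
# Probability that the body is true given that the restriction is true,
# following equation (1): P(b | r) = P(r, b) / P(r).
# The pixie distribution and predicate probabilities below are invented toys.

pixie_prob = {"x1": 0.5, "x2": 0.5}          # distribution over pixies for the bound variable
prob_restriction = {"x1": 0.9, "x2": 0.8}    # P(r | x): probabilistic restriction predicate
prob_body = {"x1": 0.7, "x2": 0.1}           # P(b | x): probabilistic body predicate

# Assuming the restriction and body truth values are conditionally independent
# given the pixie, as in the graphical model of Fig. 3.
p_r_and_b = sum(pixie_prob[x] * prob_restriction[x] * prob_body[x] for x in pixie_prob)
p_r = sum(pixie_prob[x] * prob_restriction[x] for x in pixie_prob)
p_b_given_r = p_r_and_b / p_r

# Deterministic truth conditions for precise quantifiers, mirroring Table 2.
print("P(b | r) =", round(p_b_given_r, 3))
print("some:", p_b_given_r > 0)
print("every:", p_b_given_r == 1)
print("no:", p_b_given_r == 0)
print("most:", p_b_given_r > 0.5)
```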
4 Quantification with Vague Predicates

Truth-conditional functions that give probabilities strictly between 0 and 1 are motivated for both practical and theoretical reasons. Practically, such a function can be implemented as a feedforward neural network with a final sigmoid unit (as used by E&C), whose output is never exactly 0 or 1. Theoretically, using intermediate probabilities of truth allows a natural account of vagueness (Goodman and Lassiter, 2015; Sutton, 2015, 2017).

However, as we will see in the following subsection, intermediate probabilities pose a problem for E&C's account of quantification.

4.1 Trivial Truth Values

Combining the conditions in Table 2 with vague predicates causes a problem, which can be illustrated with a simple example. Consider a model structure containing only a single individual, and consider only the single predicate red, which is true of this individual with probability p. Now consider the sentences (1) and (2).

(1) Everything is red.
(2) Something is red.

The body of each quantifier is simply the predicate red. For simplicity, we can assume that everything and something put no constraints on their restrictions. We need to calculate P(b | r, v). There are no free variables, and R is always true, so this is simply P(b). Because there is only one individual, this is simply the probability p.

This means that (1) is true iff p = 1, and (2) is false iff p = 0. However, we have seen above how predicates will never be true with probability exactly 0 or exactly 1. This means (1) is always false, and (2) is always true, even though we have assumed nothing about the individual!
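The problem can be seen in a couple of lines. Applying the Table 2 conditions directly to a vague predicate (the values of p below are arbitrary), every comes out false and some comes out true for any p strictly between 0 and 1.

```python
# Single individual, single vague predicate "red", true with probability p.
# Applying Table 2 directly to sentences (1) "Everything is red" and (2) "Something is red".
for p in [0.01, 0.5, 0.99]:    # any value strictly between 0 and 1 behaves the same way
    everything_red = (p == 1)  # "every" requires P(b | r) = 1
    something_red = (p > 0)    # "some" requires P(b | r) > 0
    print(p, "everything:", everything_red, "something:", something_red)
```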
4.2 Distributions over Precise Predicates

To avoid the problem in §4.1, we must only combine precise quantifiers with precise predicates (i.e. classical truth-conditional functions). To do this, we can view a vague predicate not as defining a probability of truth for each individual, but as defining a distribution over precise predicates. This induces a distribution for the quantifier.

Consider the example in §4.1. With probability p, red is a precise predicate that is true of the individual. In this case, both (1) and (2) are true. With probability 1−p, red is a precise predicate that is false of the individual. In this case, both (1) and (2) are false. Combining these cases, both (1) and (2) are true with probability p, which has avoided trivial truth values.

Formalising a vague predicate as a distribution over precise predicates was also argued for by Lassiter (2011). It can be seen as an improved version of supervaluationism (Fine, 1975; Kamp, 1975; Keefe, 2000, chapter 7), which avoids the problem of higher-order vagueness, as shown by Lassiter.

4.3 Probabilistic Scope Trees

To generalise the account in §4.2 to arbitrary scope trees (see §4.4) and vague quantifiers (see §5), it is helpful to introduce a graphical notation for probabilistic scope trees, illustrated in Fig. 4. This makes the E&C account easier to visualise. The improved proposal in this paper modifies how the distribution for each truth value node is defined.

Figure 4: A probabilistic scope tree. T1, T2, T3 correspond to non-terminal nodes in Fig. 2, going up through the tree. One random variable is marginalised out at a time, until T3 is no longer dependent on any variables.

For a classical scope tree, the truth of a quantifier node depends on its free variables, and is defined in terms of the extensions of its restriction and body, in a way that removes the bound variable. For a probabilistic scope tree, the distribution for a quantifier node is conditionally dependent on its free variables, and is defined in terms of the distributions for its restriction and body, marginalising out the bound variable. The distributions at the leaves of the tree are defined by predicates, inducing a distribution for each quantifier node as we work up through the tree.

Fig. 4 corresponds to Fig. 2, if we set α, β, γ to be picture, tell, story. The distributions for Tα,X, Tβ,Y, Tγ,Z are determined by the predicates. We have three quantifier nodes in the classical scope tree, and hence three additional truth value nodes in the probabilistic scope tree. We first define a distribution for T1, which represents the ∃(y) quantifier, and which depends on its free variables X and Z. It is true if, for situations involving the fixed pixies x and z, there is nonzero probability that they are the ARG1 and ARG2 of a telling-event pixie. Next, we define a distribution for T2, which represents the a(z) quantifier, and depends on the free variable X. It is true if, for situations involving the fixed pixie x and a story pixie z, there is nonzero probability that T1 is true. Finally, we define a distribution for T3, which represents the every(x) quantifier, and has no free variables. It is true if, for situations involving a picture pixie X, we are certain that T2 is true.
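The §4.2 idea can be sketched in a few lines. The example below (my own illustration of the single-individual case, with an arbitrary value of p) enumerates the two precise predicates that the vague predicate distributes over, and evaluates the quantifiers against each precise predicate separately.

```python
# Section 4.2: treat the vague predicate "red" (true with probability p) as a
# distribution over two precise predicates: one true of the single individual
# (with probability p) and one false of it (with probability 1 - p).
# The quantifiers are evaluated against each precise predicate separately.

p = 0.7  # arbitrary vague probability of truth

precise_predicates = [
    (p, lambda x: True),        # precise predicate on which red holds of the individual
    (1 - p, lambda x: False),   # precise predicate on which red does not hold
]

individuals = ["the_individual"]

prob_everything_red = 0.0
prob_something_red = 0.0
for weight, red in precise_predicates:
    if all(red(x) for x in individuals):
        prob_everything_red += weight
    if any(red(x) for x in individuals):
        prob_something_red += weight

# Both (1) and (2) are now true with probability p, rather than trivially false/true.
print(prob_everything_red, prob_something_red)
```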


Figure 5: The probabilistic scope tree in Fig. 4, explicitly showing random variables over precise functions.

4.4 Probabilistic Scope Trees with Vague Predicates as Distributions

In this section I show how to define the quantifier nodes in §4.3 so that they are nontrivial. To explicitly represent each vague truth-conditional function as a random variable over precise functions, we need to add a function node for each truth value node in the graphical model. For example, this transforms Fig. 4 into Fig. 5.

For a truth value node T that is a leaf of the scope tree (the second row of Fig. 5), the distribution P(t) over truth values follows the description in §4.2. A precise predicate π : X → {⊤, ⊥} maps pixies to truth values. Given π and a pixie x, the distribution for T is deterministic: T = π(x) with probability 1. A distribution Π over precise predicates π defines a vague predicate ψ, by marginalising out this distribution:⁴

P(t | x) = ψ(x) = E_π[π(x)]    (2)

⁴ I write expectations with a subscript to indicate the random variable being marginalised out. To write the expectation in (2) explicitly as a sum: E_π[π(x)] = Σ_π π(x) P(π).

More generally, a truth value node Q is dependent on its free variables V. We can represent this in terms of a precise function π : Xⁿ → {⊤, ⊥}, where n is the number of free variables. Given values v for the free variables, a distribution Π over precise predicates π defines a vague predicate ψ, by marginalising out this distribution:

P(q | v) = ψ(v) = E_π[π(v)]    (3)

What remains to be shown is that the E&C account of quantification (in §3) can be adapted so that a quantifier's distribution Π_Q over precise functions π_Q can be defined in terms of its restriction function π_R and body function π_B. This can be seen as probabilistic semantic composition: the aim is to combine two truth-conditional functions to produce a distribution over truth-conditional functions. This is illustrated by the nodes Π1, Π2, Π3 in Fig. 5, which are conditionally dependent on other function nodes (indicated by the purple edges), forming a probabilistic scope tree.

Expanding (3) so it is dependent on the restriction and body functions, we have (4). The aim is now to re-write the distribution for Q, using an adapted version of E&C, in order to derive π_Q in terms of π_R and π_B. As explained in §3, the E&C account defines Q using the conditional probability P(b | r, v). More precisely, P(q | v) = f_Q(P(b | r, v)) for some f_Q, such as those defined by Table 2. With vague functions now considered as distributions over precise functions, the conditional probability must be amended to P(b | r, v, π_R, π_B), as in (5), given precise functions π_R and π_B for the restriction and body. This can be re-written as a ratio of probabilities (corresponding to the classical sets), summing over possible values for the bound variable(s) U, as in (6). We can factorise out the distribution for U, according to the conditional dependence structure (illustrated in Fig. 5), as in (7). Finally, we can express R and B in terms of the functions π_R and π_B, and write the sum as an expectation, as in (8). Note that π_R and π_B take u ∪ v as an argument – by definition of a scope tree, if we combine a quantifier's bound and free variables, we get the free variables of its restriction and body. I have written u ∪ v rather than {u} ∪ v, to leave open the possibility that the quantifier has more than one bound variable, which will be relevant in §5.

P(q | v, π_R, π_B) = E_{π_Q | π_R, π_B}[π_Q(v)]    (4)
  = f_Q( P(b | r, v, π_R, π_B) )    (5)
  = f_Q( Σ_u P(b, r, u | v, π_R, π_B) / Σ_u P(r, u | v, π_R, π_B) )    (6)
  = f_Q( Σ_u P(u | v) P(r, b | u, v, π_R, π_B) / Σ_u P(u | v) P(r | u, v, π_R) )    (7)
  = f_Q( E_{u|v}[π_R(u ∪ v) π_B(u ∪ v)] / E_{u|v}[π_R(u ∪ v)] )    (8)

(8) gives a probability of truth, hence a vague function. Viewing it as a distribution over precise functions (as in §4.2), we finally have a definition of π_Q in terms of π_R and π_B. Concretely, π_Q returns truth iff (8) is above a threshold. A uniform distribution over thresholds in [0, 1] gives a distribution over such functions.

Abbreviating the notation, we can write (9). A quantifier's truth-conditional function depends on the restriction and body functions, marginalising out the bound variable. The ratio of expectations mirrors the classical ratio of cardinalities.

π_Q ∼ f_Q( E_u[π_R π_B] / E_u[π_R] )    (9)

We can now recursively define functions for quantifier nodes, given functions in the leaves. We can therefore see Fig. 4 as an abbreviated notation for Fig. 5. The dotted edges do not indicate conditional dependence of truth values, but conditional dependence of truth-conditional functions.
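The sketch below illustrates equations (8)–(9) for the case with no free variables: given precise restriction and body functions, the quantifier's probability of truth is f_Q applied to a ratio of expectations over the bound variable, and sampling a uniform threshold turns that probability back into a precise truth value. The pixie distribution and predicates are invented toys, and the Monte Carlo estimate is my own simplification, not the paper's implementation.

```python
import random

# Sketch of equations (8)-(9): the probability of truth for a quantifier node, given
# precise restriction and body functions, is f_Q applied to a ratio of expectations
# over the bound variable.  All distributions and predicates below are invented.

def quantifier_prob(f_Q, pi_R, pi_B, sample_u, n_samples=10000):
    """Monte Carlo estimate of f_Q( E_u[pi_R * pi_B] / E_u[pi_R] )."""
    num = 0.0
    den = 0.0
    for _ in range(n_samples):
        u = sample_u()   # sample the bound variable from P(u | v)
        r = pi_R(u)      # precise restriction: 0 or 1
        num += r * pi_B(u)
        den += r
    return f_Q(num / den) if den > 0 else f_Q(0.0)

def sample_precise_quantifier(prob_of_truth):
    """Turn a probability of truth into a precise truth value by sampling a uniform
    threshold in [0, 1], as described after equation (8)."""
    return prob_of_truth > random.random()

# f_Q for some precise quantifiers (step functions mirroring Table 2).
f_some = lambda ratio: 1.0 if ratio > 0.0 else 0.0
f_every = lambda ratio: 1.0 if ratio == 1.0 else 0.0
f_most = lambda ratio: 1.0 if ratio > 0.5 else 0.0

# Toy example: pixies are points in [0, 1]; the restriction holds below 0.8,
# and the body holds below 0.6, so about three quarters of the restriction satisfy the body.
sample_u = lambda: random.random()
pi_R = lambda u: 1 if u < 0.8 else 0
pi_B = lambda u: 1 if u < 0.6 else 0

for name, f_Q in [("some", f_some), ("every", f_every), ("most", f_most)]:
    print(name, quantifier_prob(f_Q, pi_R, pi_B, sample_u))
print("sampled precise 'most':", sample_precise_quantifier(quantifier_prob(f_most, pi_R, pi_B, sample_u)))
```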
5 Vague Quantifiers and Generics

While some, every, no, and most can be given precise truth conditions, other natural language quantifiers are vague. In particular, we can consider the terms few and many.⁵

⁵ Partee (1988) surveys work suggesting that few and many are ambiguous between a vague cardinal reading and a vague proportional reading. As mentioned in Footnote 3, we can treat cardinals as predicates rather than quantifiers.

Under a classical account (for example: Barwise and Cooper, 1981), many means that R(v) ∩ B(v) is large compared to R(v), but how large is underspecified; similarly, few means this ratio is small. The underspecification of a proportion can naturally be represented as a distribution. So, we can define the meaning of a vague generalised quantifier to be a function from P(b | r, v) to a probability of truth, as illustrated in Fig. 6.

Figure 6: Probabilities of truth for various quantifiers (some, every, no, and most in the top row; many and few in the bottom row). Each x-axis is P(b | r, v, π_R, π_B), and each y-axis is P(q | v, π_R, π_B), plotting the function f_Q in orange. All axes range from 0 to 1. Quantifiers in the bottom row are vague, requiring intermediate probabilities.

A particularly challenging case of natural language quantification involves generic sentences, such as: dogs bark, ducks lay eggs, and mosquitoes carry malaria. Generics are ubiquitous in natural language, but they are challenging for classical models, because the truth conditions seem to depend heavily on world knowledge and on the context of use (for discussion, see: Carlson, 1977; Carlson and Pelletier, 1995; Leslie, 2008).

While it is tempting to treat generic quantification as underspecification of a precise quantifier (for example: Herbelot, 2010; Herbelot and Copestake, 2011), this is at odds with evidence that generics are easier for children to acquire than precise quantifiers (Hollander et al., 2002; Leslie, 2008; Gelman et al., 2015), and also easier for adults to process (Khemlani et al., 2007).

In contrast, Tessler and Goodman (2019) analyse generic sentences as being semantically simple, with the complexity coming down to pragmatic inference. They use Rational Speech Acts (RSA), a Bayesian approach to pragmatics (Frank and Goodman, 2012; Goodman and Frank, 2016). In this framework, literal truth is separated from pragmatic meaning. Communication is viewed as a game where a listener has a prior belief about a situation, and a speaker wants to update the listener's belief. Given a truth-conditional function, a literal listener updates their belief by conditioning on truth, ruling out situations for which the function returns false. A pragmatic speaker who observes a situation can choose an utterance which is informative for a literal listener – in particular, the utterance which maximises a literal listener's posterior probability for the observed situation. A pragmatic listener can update their belief by conditioning on a pragmatic speaker's utterance.
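A minimal RSA chain can be written in a few lines. The discrete situations, prior, and literal meanings below are all invented for illustration, and the speaker here chooses utterances in proportion to informativeness (a soft version of the maximisation described above), so this is a sketch of the framework rather than the specific model of Tessler and Goodman (2019).

```python
# A minimal Rational Speech Acts sketch (Frank and Goodman, 2012), with an invented
# discrete prior and invented literal meanings; not the paper's own model.

situations = ["low", "medium", "high"]   # e.g. proportion of mosquitoes carrying malaria
prior = {"low": 0.90, "medium": 0.09, "high": 0.01}
utterances = ["generic", "silence"]

# Literal meaning: probability that the utterance is true of the situation.
literal_truth = {
    ("generic", "low"): 0.1, ("generic", "medium"): 0.7, ("generic", "high"): 1.0,
    ("silence", "low"): 1.0, ("silence", "medium"): 1.0, ("silence", "high"): 1.0,
}

def normalise(d):
    total = sum(d.values())
    return {k: v / total for k, v in d.items()}

def literal_listener(utterance):
    # Condition the prior on the utterance being true.
    return normalise({s: prior[s] * literal_truth[(utterance, s)] for s in situations})

def pragmatic_speaker(situation):
    # Choose utterances in proportion to how strongly they favour the observed situation.
    return normalise({u: literal_listener(u)[situation] for u in utterances})

def pragmatic_listener(utterance):
    # Condition on the pragmatic speaker having chosen this utterance.
    return normalise({s: prior[s] * pragmatic_speaker(s)[utterance] for s in situations})

print(pragmatic_listener("generic"))  # belief shifts away from "low", despite the strong prior
```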
Tessler and Goodman's insight is that this inference of pragmatic meanings can account for the behaviour of generic sentences. The literal meaning of a generic can be simple (it is more likely to be true as the proportion increases), but the pragmatic meaning can have a rich dependence on the world knowledge encoded in the prior over situations. For example, Mosquitoes carry malaria does not mean that all mosquitoes do (in fact, many do not) but it can be informative for the listener: as most animals never carry malaria, even a small proportion is pragmatically relevant.

Building on this, we could model the generic quantifier by setting f_Q as the identity function (the same as many in Fig. 6). From (8), the probability of truth is then as shown in (10). However, marginalising out Π_R and Π_B is computationally expensive, as it requires summing over all possible functions. We can approximate this by reversing the order of the expectations, and so marginalising out Π_R and Π_B before U, as shown in (11), where ψ_R and ψ_B are vague functions. Evaluating a vague function is computationally simple.

E_{π_R, π_B}[ E_{u|v}[π_R(u ∪ v) π_B(u ∪ v)] / E_{u|v}[π_R(u ∪ v)] ]    (10)
  ≈ E_{u|v}[ψ_R(u ∪ v) ψ_B(u ∪ v)] / E_{u|v}[ψ_R(u ∪ v)]    (11)

Abbreviating this, similarly to (9), we can write:

ψ_Q = E_u[ψ_R ψ_B] / E_u[ψ_R]    (12)

For precise quantifiers, using vague functions gives trivial truth values (discussed in §4.1), but for generics, (10) and (11) give similar probabilities of truth. To put it another way, a vague quantifier doesn't need precise functions. Modelling generics with (10) was driven by the intuition that generics are vague but semantically simple. The alternative in (11) is even simpler, because we only need to calculate E_{u|v} once in total, rather than once for each possible π_R and π_B. This would make generics computationally simpler than other quantifiers, consistent with the evidence that they are easier to acquire and to process.

In fact, (11) takes us back to E&C's conditional probability, as shown in (13).

ψ_Q(v) = Σ_u P(u | v) P(r | u, v) P(b | u, v) / Σ_u P(u | v) P(r | u, v) = P(b | r, v)    (13)

This means the logical inference proposed by Emerson and Copestake (2017a) can in fact be seen as generic quantification. This is illustrated in Fig. 7, which corresponds to a sentence like Rooms that have stoves are kitchens, if α, β, γ, δ are set to room, have, stove, kitchen.⁶

⁶ An example from RELPRON (Rimell et al., 2016).

Figure 7: Emerson and Copestake (2017a)'s logical inference, re-analysed as generic quantification. R is the restriction, the logical conjunction of Tα,X, Tβ,Y, and Tγ,Z, while Tδ,X is the body. Generic quantification gives P(q) = P(t_δ,X | t_α,X, t_β,Y, t_γ,Z), marginalising out all three bound variables (X, Y, and Z).

Not only does this approach to quantification deal with both precise and vague quantifiers in a uniform way, it can also explain why generics are easier to process than precise quantifiers.
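The computational simplification in (11)–(13) can be made concrete as follows: with f_Q as the identity and only the vague functions ψ_R and ψ_B needed, the generic's probability of truth is a single ratio of expectations over the bound variable, i.e. P(b | r). The vague predicates and pixie distribution below are invented toys, and the Monte Carlo estimate is my own simplification.

```python
import random

# Sketch of equations (11)-(13): the generic quantifier's probability of truth is
# E_u[psi_R * psi_B] / E_u[psi_R] = P(b | r), computed directly from vague functions,
# with no outer sum over precise functions.  All numbers below are invented.

def generic_prob(psi_R, psi_B, sample_u, n_samples=10000):
    num = 0.0
    den = 0.0
    for _ in range(n_samples):
        u = sample_u()
        r = psi_R(u)         # probability that the restriction is true of pixie u
        num += r * psi_B(u)  # joint probability, assuming independence given u
        den += r
    return num / den

# Toy vague predicates over pixies in [0, 1], loosely standing in for "duck" and "lays eggs".
psi_duck = lambda u: 0.9 if u < 0.3 else 0.05
psi_lays_eggs = lambda u: 0.6 if u < 0.3 else 0.2

print("probability of truth for the generic:", round(generic_prob(psi_duck, psi_lays_eggs, random.random), 2))
```

Compared with the sketch after (9), no sampling over precise functions is required here, which is the sense in which the generic is computationally simpler.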
6 Donkey Sentences

An example of a donkey sentence is shown in (3). They are challenging for classical semantic theories, because naive composition, shown in (4), leaves a variable (y) outside the scope of its quantifier (Geach, 1962). The tempting solution in (5) requires a universal quantifier for an indefinite (a donkey), which would be non-compositional.⁷

(3) Every farmer who owns a donkey feeds it.
(4) ∀x [farmer(x) ∧ ∃y [donkey(y) ∧ own(x, y)]] → feed(x, y)
(5) ∀x ∀y [farmer(x) ∧ donkey(y) ∧ own(x, y)] → feed(x, y)

⁷ For simplicity, (4) and (5) suppress event variables.

Kanazawa (1994), Brasoveanu (2008), and King and Lewis (2016) discuss how donkey sentences seem to admit multiple readings, which vary in the strength of their truth conditions, and which depend on both lexical semantics and the context of use.

This kind of dependence is exactly what Tessler and Goodman (2019) explained using RSA, so I will apply the same tools here.

As discussed in §5, generics are more basic than classical quantifiers, so I first consider generic donkey sentences, as illustrated in (6)–(8). An analysis of (3) is given in Appendix A.

(6) Farmers who own donkeys feed them.
(7) Linguists who use probabilistic models love them.
(8) Mosquitoes which bite birds infect them with malaria.

Example (8) shows it is inappropriate to use a universal quantifier: not all mosquitoes carry malaria, and not all bitten birds are infected (even if bitten by a malaria-carrying mosquito). However, this sentence still communicates that malaria is spread between birds by mosquitoes. This relies on pragmatic inference, from prior knowledge that most animals cannot spread malaria.

Despite the challenge for classical theories, generic donkey sentences can be straightforwardly handled by my proposed probabilistic approach. An example is shown in Fig. 8, which corresponds to (6), if α, β, γ, δ are set to farmer, own, donkey, feed. Intuitively, the more likely it is that a farmer owning a donkey implies the farmer feeding the donkey, the more likely it is for the sentence to be true. Given world knowledge and a discourse context, this can lead to a sharp threshold for being uttered, using RSA's pragmatic inference.

Figure 8: Analysis of a generic donkey sentence, using generic quantification. The quantifier node Q has restriction R (a logical conjunction) and body Tδ,W.

7 Related Work

Functional Distributional Semantics is related to other probabilistic semantic approaches. Goodman and Lassiter (2015) and Bernardy et al. (2018, 2019) represent meaning as a probabilistic program. This paper brings Functional Distributional Semantics closer to their work, because a probabilistic scope tree can be seen as a probabilistic program. An important practical difference is that Functional Distributional Semantics represents all predicates in the same way (as functions of pixies), allowing a model to be trained on corpus data.

Probabilistic TTR (Cooper, 2005; Cooper et al., 2015) also represents meaning as a probabilistic truth-conditional function. However, in this paper I have provided an alternative compositional semantics, in order to deal with vague quantifiers and generics. In principle, my proposal could be incorporated into a probabilistic TTR approach. Furthermore, although Cooper et al. (2015) discuss learning, they assume a richer input than available in distributional semantics.

Some hybrid distributional-logical systems exist (for example: Lewis and Steedman, 2013; Grefenstette, 2013; Herbelot and Vecchi, 2015; Beltagy et al., 2016), but these do not discuss challenging cases like generics and donkey sentences.

Explaining the multiple readings of donkey sentences using pragmatic inference has been proposed using non-probabilistic tools (for example: Champollion, 2016; Champollion et al., 2019). I have provided a concrete computational method to calculate such readings, in the same way that Tessler and Goodman (2019) have provided a concrete account of generics.

8 Conclusion

In this paper, I have presented a compositional semantics for both precise and vague quantifiers, in the probabilistic framework of Functional Distributional Semantics. I have re-interpreted previous work in this framework as performing generic quantification, building on the approach of Tessler and Goodman (2019). I have shown how generic quantification is computationally simpler than classical quantification, consistent with evidence that generics are a "default" mode of processing. Finally, I have presented examples of generic donkey sentences, which are doubly challenging for classical theories, but straightforward under my proposed approach.
Acknowledgements

This paper builds on chapter 7 of my PhD thesis (Emerson, 2018), and I would like to thank my PhD supervisor Ann Copestake, for her support, advice, and suggestions. I would also like to thank the anonymous reviewers, for pointing out areas that were unclear, and suggesting additional areas for discussion.

I am supported by a Research Fellowship at Gonville & Caius College, Cambridge.

References

Keith Allan. 2001. Natural language semantics. Blackwell.

Jon Barwise and Robin Cooper. 1981. Generalized quantifiers and natural language. Linguistics and Philosophy, 4(2):159–219.

Jon Barwise and John Perry. 1983. Situations and Attitudes. Massachusetts Institute of Technology (MIT) Press.

Islam Beltagy, Stephen Roller, Pengxiang Cheng, Katrin Erk, and Raymond J. Mooney. 2016. Representing meaning with a combination of logical and distributional models. Computational Linguistics, 42(4):763–808.

Jean-Philippe Bernardy, Rasmus Blanck, Stergios Chatzikyriakidis, and Shalom Lappin. 2018. A compositional Bayesian semantics for natural language. In Proceedings of the 1st International Workshop on Language, Cognition and Computational Models, pages 1–10.

Jean-Philippe Bernardy, Rasmus Blanck, Stergios Chatzikyriakidis, Shalom Lappin, and Aleksandre Maskharashvili. 2019. Bayesian Inference Semantics: A modelling system and a test suite. In Proceedings of the 8th Joint Conference on Lexical and Computational Semantics (*SEM), pages 263–272.

Adrian Brasoveanu. 2008. Donkey pluralities: plural information states versus non-atomic individuals. Linguistics and Philosophy, 31(2):129–209.

Ronnie Cann. 1993. Formal semantics: an introduction. Cambridge University Press.

Gregory N. Carlson. 1977. Reference to kinds in English. Ph.D. thesis, University of Massachusetts at Amherst.

Gregory N. Carlson and Francis Jeffry Pelletier, editors. 1995. The generic book. University of Chicago Press.

Lucas Champollion. 2016. Homogeneity in donkey sentences. In Proceedings of the 26th Semantics and Linguistic Theory Conference (SALT 26), pages 684–704.

Lucas Champollion, Dylan Bumford, and Robert Henderson. 2019. Donkeys under discussion. Semantics and Pragmatics, 12(1):1–45.

Robin Cooper. 2005. Austinian truth, attitudes and type theory. Research on Language and Computation, 3(2-3):333–362.

Robin Cooper, Simon Dobnik, Staffan Larsson, and Shalom Lappin. 2015. Probabilistic type theory and natural language semantics. Linguistic Issues in Language Technology (LiLT), 10.

Ann Copestake. 2009. Slacker semantics: Why superficiality, dependency and avoidance of commitment can be the right way to go. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 1–9.

Ann Copestake, Dan Flickinger, Carl Pollard, and Ivan A. Sag. 2005. Minimal Recursion Semantics: An introduction. Research on Language and Computation, 3(2-3):281–332.

Donald Davidson. 1967. The logical form of action sentences. In Nicholas Rescher, editor, The Logic of Decision and Action, chapter 3, pages 81–95. University of Pittsburgh Press. Reprinted in: Davidson (1980/2001), Essays on Actions and Events, Oxford University Press.

Guy Emerson. 2018. Functional Distributional Semantics: Learning Linguistically Informed Representations from a Precisely Annotated Corpus. Ph.D. thesis, University of Cambridge.

Guy Emerson. 2020a. Autoencoding pixies: Amortised variational inference with graph convolutions for Functional Distributional Semantics. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL).

Guy Emerson. 2020b. What are the goals of distributional semantics? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL).

Guy Emerson and Ann Copestake. 2016. Functional Distributional Semantics. In Proceedings of the 1st Workshop on Representation Learning for NLP (RepL4NLP), pages 40–52. Association for Computational Linguistics.
Guy Emerson and Ann Copestake. 2017a. Variational inference for logical inference. In Proceedings of the Conference on Logic and Machine Learning in Natural Language (LaML), pages 53–62. Centre for Linguistic Theory and Studies in Probability (CLASP).

Guy Emerson and Ann Copestake. 2017b. Semantic composition via probabilistic model theory. In Proceedings of the 12th International Conference on Computational Semantics (IWCS), pages 62–77. Association for Computational Linguistics.

Kit Fine. 1975. Vagueness, truth and logic. Synthese, 30(3-4):265–300.

Michael C. Frank and Noah D. Goodman. 2012. Predicting pragmatic reasoning in language games. Science, 336(6084):998–998.

Peter Gärdenfors. 2000. Conceptual spaces: The geometry of thought. Massachusetts Institute of Technology (MIT) Press.

Peter Gärdenfors. 2014. Geometry of meaning: Semantics based on conceptual spaces. Massachusetts Institute of Technology (MIT) Press.

Peter Thomas Geach. 1962. Reference and generality: An examination of some medieval and modern theories. Cornell University Press.

Susan A. Gelman, Sarah-Jane Leslie, Alexandra M. Was, and Christina M. Koch. 2015. Children's interpretations of general quantifiers, specific quantifiers and generics. Language, Cognition and Neuroscience, 30(4):448–461.

Noah D. Goodman and Michael C. Frank. 2016. Pragmatic language interpretation as probabilistic inference. Trends in Cognitive Sciences, 20(11):818–829.

Noah D. Goodman and Daniel Lassiter. 2015. Probabilistic semantics and pragmatics: Uncertainty in language and thought. In Shalom Lappin and Chris Fox, editors, The Handbook of Contemporary Semantic Theory, 2nd edition, chapter 21, pages 655–686. Wiley.

Edward Grefenstette. 2013. Towards a formal distributional semantics: Simulating logical calculi with tensors. In Proceedings of the 2nd Joint Conference on Lexical and Computational Semantics (*SEM), pages 1–10. Association for Computational Linguistics.

Aurélie Herbelot. 2010. Underspecified quantification. Ph.D. thesis, University of Cambridge.

Aurélie Herbelot and Ann Copestake. 2011. Formalising and specifying underquantification. In Proceedings of the 9th International Conference on Computational Semantics (IWCS), pages 165–174. Association for Computational Linguistics.

Aurélie Herbelot and Eva Maria Vecchi. 2015. Building a shared world: mapping distributional to model-theoretic semantic spaces. In Proceedings of the 20th Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 22–32. Association for Computational Linguistics.

Michelle A. Hollander, Susan A. Gelman, and Jon Star. 2002. Children's interpretation of generic noun phrases. Developmental Psychology, 38(6):883–894.

Hans A. W. Kamp. 1975. Two theories about adjectives. In Edward L. Keenan, editor, Formal Semantics of Natural Language, chapter 9, pages 123–155. Reprinted in: Stephen Davis and Brendan S. Gillon, editors (2004), Semantics: A Reader, chapter 26, pages 541–562; Kamp (2013), Meaning and the Dynamics of Interpretation: Selected Papers of Hans Kamp, pages 225–261.

Hans A. W. Kamp and Uwe Reyle. 2013. From Discourse to Logic: Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory, volume 42 of Studies in Linguistics and Philosophy. Springer.

Makoto Kanazawa. 1994. Weak vs. strong readings of donkey sentences and monotonicity inference in a dynamic setting. Linguistics and Philosophy, 17(2):109–158.

Rosanna Keefe. 2000. Theories of Vagueness. Cambridge Studies in Philosophy. Cambridge University Press.

Sangeet Khemlani, Sarah-Jane Leslie, Sam Glucksberg, and Paula Rubio Fernandez. 2007. Do ducks lay eggs? How people interpret generic assertions. In Proceedings of the Annual Meeting of the Cognitive Science Society, volume 29, pages 395–400.

Jeffrey C. King and Karen S. Lewis. 2016. Anaphora. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy, fall 2016 edition. Metaphysics Research Lab, Stanford University.

Daniel Lassiter. 2011. Vagueness as probabilistic linguistic knowledge. In Rick Nouwen, Robert van Rooij, Uli Sauerland, and Hans-Christian Schmitz, editors, Vagueness in Communication: Revised Selected Papers from the 2009 International Workshop on Vagueness in Communication, chapter 8, pages 127–150. Springer.

Sarah-Jane Leslie. 2008. Generics: Cognition and acquisition. The Philosophical Review, 117(1):1–47.

Mike Lewis and Mark Steedman. 2013. Combined distributional and logical semantics. Transactions of the Association for Computational Linguistics (TACL), 1:179–192.

Godehard Link. 1983. The logical analysis of plurals and mass terms: A lattice-theoretical approach. In Rainer Bäuerle, Christoph Schwarze, and Arnim von Stechow, editors, Meaning, Use and the Interpretation of Language, chapter 18, pages 303–323. Walter de Gruyter. Reprinted in: Paul Portner and Barbara H. Partee, editors (2002), Formal Semantics: The Essential Readings, chapter 4, pages 127–146.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In Proceedings of the 1st International Conference on Learning Representations (ICLR), Workshop Track.

Richard Montague. 1973. The proper treatment of quantification in ordinary English. In K. Jaakko J. Hintikka, Julius M. E. Moravcsik, and Patrick Suppes, editors, Approaches to Natural Language, number 49 in Synthese Library, chapter 10, pages 221–242. Kluwer Academic Publishers. Reprinted in: Paul Portner and Barbara H. Partee, editors (2002), Formal Semantics: The Essential Readings, chapter 1, pages 17–34.

Terence Parsons. 1990. Events in the Semantics of English: A Study in Subatomic Semantics. Current Studies in Linguistics. Massachusetts Institute of Technology (MIT) Press.

Barbara H. Partee. 1988. Many quantifiers. In Proceedings of the Eastern States Conference on Linguistics (ESCOL), pages 383–402. Ohio State University. Reprinted in: Partee (2004), Compositionality in Formal Semantics, pages 241–258, Blackwell.

Barbara H. Partee. 2012. The starring role of quantifiers in the history of formal semantics. In The Logica Yearbook 2012, pages 113–136. College Publications.

Laura Rimell, Jean Maillard, Tamara Polajnar, and Stephen Clark. 2016. RELPRON: A relative clause evaluation dataset for compositional distributional semantics. Computational Linguistics, 42(4):661–701.

Peter R. Sutton. 2015. Towards a probabilistic semantics for vague adjectives. In Henk Zeevat and Hans-Christian Schmitz, editors, Bayesian Natural Language Semantics and Pragmatics, chapter 10, pages 221–246. Springer.

Peter R. Sutton. 2017. Probabilistic approaches to vagueness and semantic competency. Erkenntnis.

Michael Henry Tessler. 2018. Communicating generalizations: Probability, vagueness, and context. Ph.D. thesis, Stanford University.

Michael Henry Tessler and Noah D. Goodman. 2019. The language of generalization. Psychological Review, 126(3):395–436.

Peter D. Turney and Patrick Pantel. 2010. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37:141–188.

Johan Van Benthem. 1984. Questions about quantifiers. The Journal of Symbolic Logic, 49(2):443–466.

[Figure: scope DAG for the classical donkey sentence analysis. Pixie-valued random variables W, X, Y, Z are linked by ARG1 and ARG2; truth value nodes Tα,X, Tβ,Y, Tγ,Z, Tδ,W are determined by the predicates; TRC and TDP are logical conjunctions; the quantifier nodes are QE1, QE2, Q∃, QGEN, and Q∀.]
A Classical Donkey Sentences

In this analysis of a classical donkey sentence, the donkey pronoun is associated with a generic quantifier, while all other quantifiers are precise. The generic quantifier allows the range of readings associated with donkey sentences.

The above figure corresponds to example (3), if α, β, γ, δ are set to farmer, own, donkey, feed. Intuitively, this analysis says that, if all farmers who own at least one donkey feed at least a proportion p of their donkeys, then this sentence is true with probability p.

The probability of truth gradually increases with the proportion p. Given world knowledge and a discourse context, this can lead to a sharp threshold proportion above which it is uttered, using pragmatic inference in the RSA framework. If distinguishing small proportions is pragmatically relevant, the weak reading becomes preferred. If distinguishing large proportions is pragmatically relevant, the strong reading becomes preferred.

I will now go over all nodes in the graph. Firstly, the distributions for Tα,X, Tβ,Y, Tγ,Z, Tδ,W are determined by the predicates.

The remaining truth value nodes are labelled for convenience. TRC and TDP are logical conjunctions (for the relative clause and donkey pronoun, respectively). The remaining five nodes are quantifier nodes, each quantifying one variable.

Note that Z is quantified twice (by Q∃ and QGEN). This would be surprising in a classical scope tree, but is not a problem here – marginalising out a random variable means that the quantifier node is not dependent on that variable, but the random variable is still part of the joint distribution, so it can be referred to by other nodes. Because of this double quantification, the scope "tree" is actually a scope DAG (directed acyclic graph).

QE1 and QE2 marginalise out the event variables, respectively Y and W, with trivially true restrictions and bodies Tβ,Y and Tδ,W, leaving free variables X and Z. They can be treated like some in Fig. 6. For given pixies x and z, QE1 is true if x owns z; QE2 is true if x feeds z.

Q∃ marginalises out Z, with TRC as restriction and QE1 as body, leaving the free variable X. It can be treated like some in Fig. 6. For a given x, it is true if x is a farmer and there is a donkey z such that QE1 is true.

QGEN also marginalises out Z, with TDP as restriction and QE2 as body, leaving the free variable X. It uses the generic quantifier, as in (11). For a given x, it considers donkeys z for which TRC is true; the probability of truth is the proportion of such z for which QE2 is true (out of donkeys owned by farmer x, the proportion fed by x).

Finally, Q∀ marginalises out X, with Q∃ as restriction and QGEN as body, leaving no free variables. It is treated as in Fig. 6. It is true if, whenever Q∃ is true of x, QGEN is true of x, considering QGEN as a distribution over precise functions.
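One way to read this node-by-node analysis is sketched below, for a toy finite model with invented ownership and feeding facts. For each farmer who owns at least one donkey, QGEN gives the proportion of their donkeys that they feed; Q∀ then requires QGEN to hold of every such farmer. Assuming the generic's uniform threshold is sampled once and shared across farmers (my reading of "considering QGEN as a distribution over precise functions", not an explicit statement in the appendix), the sentence's probability of truth is the minimum such proportion, matching the intuition that the sentence is true with probability p when every relevant farmer feeds at least a proportion p of their donkeys.

```python
# A toy reading of the appendix analysis of "Every farmer who owns a donkey feeds it".
# The individuals and the ownership/feeding facts are invented for illustration.

owns = {("f1", "d1"), ("f1", "d2"), ("f2", "d3")}
feeds = {("f1", "d1"), ("f2", "d3")}
farmers = {"f1", "f2"}
donkeys = {"d1", "d2", "d3"}

proportions = []
for x in farmers:
    owned = {z for z in donkeys if (x, z) in owns}
    if owned:  # restriction of Q_forall: farmers who own at least one donkey
        fed = {z for z in owned if (x, z) in feeds}
        proportions.append(len(fed) / len(owned))  # Q_GEN with the identity f_Q

# With a single shared threshold, "every farmer exceeds the threshold" holds with
# probability equal to the smallest proportion.
prob_true = min(proportions) if proportions else 1.0
print("probability of truth:", prob_true)  # 0.5 here: f1 feeds only one of two donkeys
```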