Semantics
David Beaver and Joey Frazee

1 Introduction

Semantics is concerned with meaning: what meanings are, how meanings are assigned to words, phrases and sentences of natural and formal languages, and how meanings can be combined and used for inference and reasoning. The goal of this chapter is to introduce computational linguists and computer scientists to the tools, methods, and concepts required to work on natural language semantics.

Semantics, while often paired with pragmatics, is nominally distinct. On a traditional view, semantics concerns itself with the compositional buildup of meaning from the lexicon to the sentence level, whereas pragmatics concerns the way in which contextual factors and speaker intentions affect meaning and inference (see, e.g., Potts, to appear in this volume). Although the semantics-pragmatics distinction is historically important, and continues to be widely adopted, in practice it is not clear-cut. Work in semantics inevitably involves pragmatics and vice versa. Furthermore, it is not a distinction of much relevance for applications in computational linguistics.

This chapter is organized as follows. In sections 2 and 2.2 we introduce foundational concepts and discuss ways of representing the meaning of sentences, and of combining the meaning of smaller expressions to produce sentential meanings. In section 3 we discuss the representation of meaning for larger units, especially with respect to anaphora, and introduce two formal theories that go beyond sentence meaning: Discourse Representation Theory and Dynamic Semantics. Then, in section 4 we discuss temporality, introducing event semantics and describing standard approaches to the semantics of time.
Section 5 concerns the tension between the surface-oriented statistical methods characteristic of mainstream computational linguistics and the more abstract methods typical of formal semantics, and includes discussion of phenomena for which it seems particularly important to utilize insights from formal semantics. Throughout the chapter we assume familiarity with basic notions from logic (propositional logic and first-order predicate logic), computer science and, to a lesser extent, computational linguistics (e.g., algorithms, parsing, syntactic categories and tree representations).

2 Sentential semantics

2.1 Representation & logical form

Aristotle and the medieval post-Aristotelian tradition apart, work on formal semantic representation only began in earnest with Boole’s (1854) semantics for propositional logic and Frege’s (1879) development of first-order predicate logic (FOPL). Frege’s work provided a precise and intuitive way of characterizing the meaning of sentences. The influence of this development has been vast, as seen in the fact that for many years, introductory courses on logic and semantics have commonly included exercises in translating sentences of natural language into statements of propositional or first-order logic. Thus, for example, (1a) might be represented as (1b):

(1) a. Fischer played a Russian.
    b. $\exists x\,(\mathsf{russian}(x) \land \mathsf{played}(\mathsf{Fischer}, x))$

Subtle typographical distinctions we have made between (1a) and (1b) relate to a crucial theme in the development of semantics, made completely explicit in the work of Tarski (1944). This is the distinction between object language and metalanguage. For example, the sentence in (1a) and all the words that make it up are expressions of the object language, the language from which we are translating. On the other hand, the translations are given in a semantic metalanguage. Thus far, the semantic metalanguage has been FOPL.
We use a sans serif font to mark expressions of the metalanguage, so that, for example, $\mathsf{played}$ is our way of representing in the semantic metalanguage the meaning of the English word played.

Representations like that in (1b), so-called Logical Forms (LFs), immediately provide a way of approaching a range of computational tasks. For example, consider a database application, say a database of information about chess tournaments. While it is far from obvious how to query a database with an arbitrary sentence of English, the problem of checking whether a sentence of FOPL is satisfied by a record in that database is straightforward. The task can therefore be split into two steps: (1) translate the English sentence into a sentence of FOPL, and (2) verify the sentence’s translation. Thus we can check whether (1a) holds in the database by breaking it down into these sub-problems.

We can also consider whether one sentence of natural language follows from another using a similar procedure. The computational process for deciding whether (2a) follows from (1a) could involve translating both sentences into FOPL, (1b,2b), and then verifying the inference with an automated theorem prover.

(2) a. Fischer played someone.
    b. $\exists x\,(\mathsf{played}(\mathsf{Fischer}, x))$

Here let us note immediately that while FOPL offers us a strategy for dealing with such problems, it is far from a complete solution. In the case of the above examples, we have reduced one problem (natural language inference) to two problems (translation into FOPL, and inference), neither of which is itself trivial. The simplicity of the sentences in examples (1) and (2) gives the appearance that translation into FOPL is easy, or uninteresting, but sentences often have multiple LFs as well as LFs that are not necessarily intuitive. This makes the particulars of the choice of an LF representation critical, as noted by Russell (1905) a century ago.
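The verification step and the inference step described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy database `chess_db`, the dictionary layout of a model, and the helper names are hypothetical, and the brute-force search over small models is only a stand-in for an automated theorem prover. It can refute an inference by finding a countermodel, but passing it does not amount to a proof.

```python
from itertools import chain, combinations, product

def exists(domain, pred):
    """Existential quantifier over a finite domain."""
    return any(pred(x) for x in domain)

def lf_1b(m):
    """(1b): some x is Russian and Fischer played x."""
    return exists(m["domain"],
                  lambda x: x in m["russian"] and (m["Fischer"], x) in m["played"])

def lf_2b(m):
    """(2b): Fischer played some x."""
    return exists(m["domain"], lambda x: (m["Fischer"], x) in m["played"])

# Hypothetical toy database, used only to illustrate the verification step.
chess_db = {
    "domain": {"fischer", "spassky", "larsen"},
    "Fischer": "fischer",
    "russian": {"spassky"},
    "played": {("fischer", "spassky"), ("fischer", "larsen")},
}

def powerset(items):
    """All subsets of items, as tuples."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

def small_models(size=2):
    """Every interpretation of russian and played over a domain of the given size."""
    domain = set(range(size))
    pairs = list(product(domain, repeat=2))
    for rus in powerset(domain):
        for pl in powerset(pairs):
            yield {"domain": domain, "Fischer": 0,
                   "russian": set(rus), "played": set(pl)}

def holds_in_all_small_models(premise, conclusion, size=2):
    """True if no model of the given size makes premise true and conclusion false."""
    return all(conclusion(m) for m in small_models(size) if premise(m))

print(lf_1b(chess_db))                            # True
print(holds_in_all_small_models(lf_1b, lf_2b))    # True
print(holds_in_all_small_models(lf_2b, lf_1b))    # False
```

The last line shows the asymmetry one would expect: a countermodel exists in which Fischer played someone who is not Russian, so (1a) does not follow from (2a).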
As regards the unintuitiveness of LFs, Russell argued that a definite description like The American in (3a) does not simply denote or refer to an individual (e.g., Bobby Fischer); rather, it is a quantificational element. So for Russell (3a) ought to be represented as in (3b).

(3) a. The American won.
    b. $\exists x\,(\mathsf{american}(x) \land \forall y\,(\mathsf{american}(y) \rightarrow x = y) \land \mathsf{won}(x))$

Instead of The American being Fischer (in the right context), it’s a quantifier that imposes the existence of an individual that (1) is, uniquely, American and (2) has the property denoted by whatever predicate it combines with (i.e., won in the above example). The question of whether this LF is a good representation for definite descriptions is a subject of debate, with many (from Strawson 1950 on) arguing that the full meaning of sentences like (3a) cannot be captured in classical FOPL at all.

Whatever position is taken about the meaning of definite descriptions, it is hard to escape Russell’s conclusion that surface forms bear a complex relationship to their LFs. This is especially evident when we look at sentences with multiple LFs. For example, a sentence like (4a) may exhibit a scope ambiguity whereby it may either have an LF like (4b) where the universal quantifier takes wide scope or an LF like (4c) where it takes narrow scope.

(4) a. Every Russian played an American.
    b. $\forall x\,(\mathsf{russian}(x) \rightarrow \exists y\,(\mathsf{american}(y) \land \mathsf{play}(x, y)))$
    c. $\exists y\,(\mathsf{american}(y) \land \forall x\,(\mathsf{russian}(x) \rightarrow \mathsf{play}(x, y)))$

The wide scope LF corresponds to the reading where every Russian played at least one American, but possibly different ones. The narrow scope LF corresponds to the reading where every Russian played the same American. The LFs show that this distinction can be represented in FOPL, but how do we get from the sentence (4a) to its LFs (4b) and (4c)?
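To make the difference between these LFs concrete, the following Python sketch evaluates (3b), (4b) and (4c) against two small finite models. The models, the entity names (`r1`, `a1`, and so on) and the dictionary layout are hypothetical and chosen only for illustration; the quantifier helpers simply enumerate a finite domain.

```python
def forall(domain, pred):
    """Universal quantifier over a finite domain."""
    return all(pred(x) for x in domain)

def exists(domain, pred):
    """Existential quantifier over a finite domain."""
    return any(pred(x) for x in domain)

def wide_scope(m):
    """(4b): for every Russian x there is some American y that x played."""
    return forall(m["domain"], lambda x: x not in m["russian"] or
                  exists(m["domain"],
                         lambda y: y in m["american"] and (x, y) in m["play"]))

def narrow_scope(m):
    """(4c): there is one American y whom every Russian played."""
    return exists(m["domain"], lambda y: y in m["american"] and
                  forall(m["domain"],
                         lambda x: x not in m["russian"] or (x, y) in m["play"]))

def the_american_won(m):
    """(3b): exactly one individual is American, and that individual won."""
    return exists(m["domain"], lambda x: x in m["american"] and
                  forall(m["domain"], lambda y: y not in m["american"] or x == y) and
                  x in m["won"])

# Hypothetical model 1: each Russian played a different American.
different_opponents = {
    "domain": {"r1", "r2", "a1", "a2"},
    "russian": {"r1", "r2"},
    "american": {"a1", "a2"},
    "play": {("r1", "a1"), ("r2", "a2")},
    "won": {"a1"},
}

# Hypothetical model 2: a single American, played by both Russians.
same_opponent = {
    "domain": {"r1", "r2", "a1"},
    "russian": {"r1", "r2"},
    "american": {"a1"},
    "play": {("r1", "a1"), ("r2", "a1")},
    "won": {"a1"},
}

print(wide_scope(different_opponents), narrow_scope(different_opponents))  # True False
print(wide_scope(same_opponent), narrow_scope(same_opponent))              # True True
print(the_american_won(different_opponents))  # False: two Americans, uniqueness fails
print(the_american_won(same_opponent))        # True
```

The first model separates the two readings of (4a): it satisfies (4b) but not (4c). It also shows the uniqueness condition of the Russellian analysis at work, since (3b) is false there even though an American won.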

2.2 Compositional semantics

Above we have sketched a common way of representing the meanings of sentences, via translation to FOPL, but we have provided no general method for deriving these translations. This raises two questions: (1) how should we represent the meanings of smaller expressions (e.g., verbs like played and noun-phrases like Every Russian), and (2) how should the meanings of smaller expressions combine to yield sentence representations?

In the examples so far, some parts of the original sentences have direct counterparts in LF, while others do not. For example, while Russian in (4a) has $\mathsf{russian}$ as its translation, the expression Every Russian has no counterpart in either of the sentence’s LFs (4b,4c). It’s tempting to say that the translation of Every Russian is “$\forall x\,(\mathsf{russian}(x) \rightarrow$”, but this is not a well-formed expression of FOPL and so it has no meaning in the usual semantics of FOPL. More troubling, though, is that FOPL on its own does not provide a method for deriving the meaning of an expression like “$\forall x\,(\mathsf{russian}(x) \rightarrow$” from the meaning of its parts, Every and Russian.

However, a method for assigning meanings to all syntactic units that make up a sentence, and for deriving the meaning of that sentence algorithmically from those parts, is available in Richard Montague’s (1973; 1970a; 1970b) seminal papers. Montague contended that there is no substantive difference between natural languages and formal languages (e.g., languages of philosophical logic such as FOPL, and programming languages) and that both can be analyzed in the same way (Montague 1970a,b; see also Halvorsen and Ladusaw 1979). As unlikely as this seems (indeed, Frege, Russell and others were drawn to FOPL thinking that natural language was too messy to be analyzable in precise terms), Montague (1970a,b) showed that significant fragments of English are directly interpretable in a precise and systematic fashion.
Translating an object language into a formal language for interpretation (see, e.g., Montague 1973) is usually more straightforward, however, and we will follow this strategy throughout the chapter.

Indirect interpretation is primarily achieved using the Lambda Calculus (Church, 1932). Object language terms are mapped to typed lambda terms which are subsequently used to assign meanings. For example, Russian is paired with the function $\mathsf{russian}$, which maps Boris Spassky to 1 or true and Bobby Fischer and Deep Blue to 0 or false. The type of this function is $\langle e,t \rangle$, the type of properties, which maps entities $e$ to truth-values $t$. Quantifiers like every are likewise analyzed as functions, but the corresponding function $\mathsf{every}$ is a mapping not from entities but from a pair of properties like $\mathsf{russian}$ and $\mathsf{won}$ to truth-values.
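The typing just described can be sketched in Python, with ordinary functions standing in for typed lambda terms. This is an illustrative approximation rather than the chapter’s formalism: the three-entity `DOMAIN`, the extension chosen for `won`, and the curried helper `every_curried` are all assumptions introduced here. Only the behavior of `russian` on Spassky, Fischer and Deep Blue is taken from the text.

```python
from typing import Callable

Entity = str
Property = Callable[[Entity], bool]                 # type <e,t>
Quantifier = Callable[[Property, Property], bool]   # a pair of properties to a truth value

# Hypothetical finite domain of entities.
DOMAIN = ["spassky", "fischer", "deep_blue"]

def russian(x: Entity) -> bool:
    """Maps Spassky to True, Fischer and Deep Blue to False, as in the text."""
    return x == "spassky"

def won(x: Entity) -> bool:
    """Hypothetical extension, chosen only for illustration."""
    return x == "fischer"

def every(restrictor: Property, scope: Property) -> bool:
    """True if every entity with the restrictor property has the scope property."""
    return all(scope(x) for x in DOMAIN if restrictor(x))

def every_curried(restrictor: Property) -> Callable[[Property], bool]:
    """Curried variant: applying it to one property gives a meaning for 'Every N'."""
    return lambda scope: every(restrictor, scope)

# A function of type <<e,t>,t>, a candidate meaning for "Every Russian".
every_russian = every_curried(russian)

print(russian("spassky"), russian("fischer"), russian("deep_blue"))  # True False False
print(every(russian, won))       # False
print(every_russian(russian))    # True
```

The curried variant is included because it gives the phrase Every Russian a function of its own, which is exactly what the ill-formed FOPL fragment discussed in section 2.2 could not provide.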