Probabilistic Pragmatics Explains Gradience and Focality in Natural
Total Page:16
File Type:pdf, Size:1020Kb
Probabilistic pragmatics explains gradience and focality in natural language quantification Bob van Tiela,b,1 , Michael Frankec , and Uli Sauerlandb aDonders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, 6525 AJ, Nijmegen, The Netherlands; bDepartment of Semantics and Pragmatics, Leibniz-Zentrum Allgemeine Sprachwissenschaft, 10117 Berlin, Germany; and cInstitute of Cognitive Science, Osnabruck¨ University, 49069 Osnabruck,¨ Germany Edited by Barbara H. Partee, University of Massachusetts at Amherst, Amherst, MA, and approved January 22, 2021 (received for review March 23, 2020) An influential view in philosophy and linguistics equates the based on empirical data. With this goal in mind, we focus on the meaning of a sentence to the conditions under which it is true. specific case of quantity words, such as “some” and “all.” But it has been argued that this truth-conditional view is too rigid and that meaning is inherently gradient and revolves around pro- Quantity Words totypes. Neither of these abstract semantic theories makes direct Quantification is a crucial aspect of human cognition that predictions about quantitative aspects of language use. Hence, supports generalization and underpins the communication of we compare these semantic theories empirically by applying prob- information about frequency and number (14, 15). Indeed, the abilistic pragmatic models as a link function connecting linguistic treatment of quantification is one of the centerpieces of the meaning and language use. We consider the use of quantity truth-conditional approach to meaning, leading to the develop- words (e.g., “some,” “all”), which are fundamental to human lan- ment of Generalized Quantifier Theory (GQT) (16, 17). guage and thought. Data from a large-scale production study suggest that quantity words are understood via prototypes. We GQT. GQT formalizes an intuitive approach to the meaning formulate and compare computational models based on the two of quantity words in set-theoretic terms. According to this views on linguistic meaning. These models also take into account approach, quantified statements express relations between sets. cognitive factors, such as salience and numerosity representation. More specifically, many quantity words—and all of the ones that Statistical and empirical model comparison show that the truth- we are concerned with here—express thresholds on the intersec- conditional model explains the production data just as well as the tion between two sets (18). For example, “Some S are P” means PSYCHOLOGICAL AND COGNITIVE SCIENCES prototype-based model, when the semantics are complemented that the sets S and P, denoted by the subject and predicate, by a pragmatic module that encodes probabilistic reasoning about have at least one element in common, i.e., jS \ Pj ≥ 1. Other the listener’s uptake. examples of GQ-theoretic definitions are: • “all”: jS \ Pj ≥ jSj language j quantifiers j semantics j pragmatics j probabilistic reasoning • “not all”: jS \ Pj < jSj • “no”: jS \ Pj ≤ 0 n influential tradition in philosophy equates the meaning • “most”: jS \ Pj > jS − Pj Aof a sentence to its truth conditions (1, 2). Accordingly, to know what a sentence means is to know in which circumstances For convenience, we will refer to jS \ Pj as the intersection set it is true or false. This truth-conditional approach to meaning size and to jSj as the total set size. GQT has enjoyed immense has given rise to the field of formal semantics, which uses tools theoretical success, explaining various otherwise puzzling from logic and set theory to formally spell out how the truth- conditional meaning of a sentence is built up from the meanings Significance of its constituents (3). Other authors have criticized the truth-conditional approach Theoretical linguistics postulates abstract structures that suc- (4, 5), drawing upon psychological research that shows that cate- cessfully explain key aspects of language. However, the pre- gories are built around prototypes that exemplify the most impor- cise relation between abstract theoretical ideas and empirical tant features of a category (6). The typicality of an exemplar data from language use is not always apparent. Here, we pro- is gradient and depends on similarity to prototypes. The pres- pose to empirically test abstract semantic theories through ence of prototype structure is sometimes construed as evidence the lens of probabilistic pragmatic modeling. We consider the that category membership—and consequently truth itself—is a historically important case of quantity words (e.g., “some,” matter of degree and depends on similarity to prototypes. This “all”). Data from a large-scale production study seem to sug- approach is called Prototype Theory (PT). gest that quantity words are understood via prototypes. But There is an important gap between linguistic meaning and lan- based on statistical and empirical model comparison, we show guage use, especially from the truth-conditional perspective. For that a probabilistic pragmatic model that embeds a strict example, many formal semanticists hold that “or” expresses logi- truth-conditional notion of meaning explains the data just as cal disjunction. However, utterances containing “or” also tend to well as a model that encodes prototypes into the meaning of imply that the speaker is unsure which disjunct is true (e.g., “He quantity words. is in Amsterdam or Berlin”). To explain such observations, Grice (7) outlined a theory of pragmatics connecting meaning and use Author contributions: B.v.T., M.F., and U.S. designed research; B.v.T., M.F., and U.S. per- based on the idea that speakers are rational, i.e., gear their con- formed research; B.v.T. and M.F. analyzed data; and B.v.T., M.F., and U.S. wrote the tributions toward reaching certain conversational goals. In the paper.y case at hand, a rational speaker who knows which disjunct is true The authors declare no competing interest.y would simply utter that disjunct (e.g., “He is in Amsterdam”) This article is a PNAS Direct Submission.y rather than the disjunction (8). Published under the PNAS license.y Recently, this type of Gricean reasoning has been formalized 1 To whom correspondence may be addressed. Email: [email protected] within probabilistic models that make precise predictions about This article contains supporting information online at https://www.pnas.org/lookup/suppl/ quantitative aspects of language use (9–13). Here, we leverage doi:10.1073/pnas.2005453118/-/DCSupplemental.y these probabilistic pragmatic models to test semantic theories Published February 22, 2021. PNAS 2021 Vol. 118 No. 9 e2005453118 https://doi.org/10.1073/pnas.2005453118 j 1 of 6 Downloaded by guest on October 4, 2021 observations about, e.g., the cognitive complexity of quantity vals of their GQT meanings. It has been argued that this type of words (19), their order of acquisition (20), the aptitude of gradience and focality is fundamentally incongruous with GQT people’s reasoning with quantified sentences (21), and the dis- (25–27)—or even with a bivalent truth-conditional approach to tribution of negative polarity items like “any” and “ever” (22). At meaning more generally (4, 5). Instead, it has been argued that the same time, however, it has been observed that there is no quantity words have a prototype-based semantics (25, 27); e.g., straightforward connection between the set-theoretic meanings the prototype for “most” may be situated at about 75% of the postulated by GQT and the way people use quantity words, as total set size, with the typicality of a situation decreasing with the we will show presently. distance from that prototype. Using Quantity Words. We conducted a large-scale study on the Outline. The goal of this paper is twofold. First, we show how production of quantity words in English (Experiment [Exp.] 1a). abstract semantic theories can be compared empirically with Each of 600 participants described 10 displays showing 432 cir- data-driven statistical modeling. Second, we show that GQT’s cles, which were either red or black. To describe these displays, truth-conditional account of quantity-word meaning offers an participants freely completed the sentence frame “— of the cir- equally compelling account of quantity-word production as a cles are red.” Each display varied the intersection set size, i.e., semantics based on prototypes, but only when it is complemented the number of red circles. To elicit quantity words that are by a probabilistic theory of language use. potentially vague and thus relevant to our purposes, the dis- The next section formalizes GQ-theoretic and PT-based plays had a large total set size, thereby likely triggering only semantics of quantity words. We then describe four speaker approximate representations of the true intersection set size. models that make probabilistic predictions about quantity-word Additionally, experimental instructions discouraged the use of choices as a function of the underlying semantic theory. numerical expressions like “fifty-two.” Fig. 1 A and B visualizes the production probabilities of the Semantics of Quantity Words 15 most frequently produced quantity words, all of which were A lexical meaning function L: M × T ! [0, 1] maps each pair of produced 50 times or more, as well as “all” and “none.” “All” quantity word m and state t to a truth value in the unit interval. and “none” were included because they are crucial for commu- Participants in Exp. 1 were presented with displays consisting of nication and historically salient, even though they were not very 432 circles, which were either red or black. There were thus 433 frequently produced in the experiment for the obvious reason possible intersection set sizes T = ft0, ::: , t432g, i.e., relevant that displays with uniformly colored dots occurred infrequently. states of the world. Taken together, these 17 quantity words made up 87% of the production data. GQT Semantics. According to GQT, many quantity words—and In line with earlier interpretation studies (23, 24), the results all of the ones that we are interested in—express thresholds of Exp.