Updates and Boolean Universals

Fausto Carcassi¹·* and Giorgio Sbardolini¹·*
¹Institute for Logic, Language and Computation, Universiteit van Amsterdam
*Co-first authors.

Abstract

Languages across the world have simple lexical terms that convey Boolean operators, such as conjunction and disjunction. Of the many logically possible operators of this kind, only three are in fact attested: and, or, nor. Moreover, there are patterns in the way they co-occur in lexical inventories, e.g. nor is not observed by itself. In this paper, we develop a proposal to explain the occurrence of lexical inventories in natural language based on previous work in update semantics. We present a bilateral extension of standard update semantics, which allows one to calculate how linguistic information can be accepted in a context but also rejected. Boolean operators are encoded as particular forms of update, on the basis of which we define a cognitively plausible measure of conceptual simplicity. We show that the attested operators are conceptually simpler than the remaining Booleans. Moreover, we show that the patterns in co-occurrence of Boolean operators can be explained by balancing conceptual simplicity and simplicity of using the language. Our perspective reveals a way that formal and cognitive constraints on dynamic aspects of the use of sentences may have contributed to shaping language evolution.

1 Introduction

English and expresses Boolean conjunction: the compound ‘p and q’ is true just in case p and q are both true. Assuming two truth values, True and False, there are 16 Boolean truth functions of two arguments, like conjunction. However, only a few of them are expressed in natural language by lexical primitives. For instance, English has morphologically simple lexical entries and, or, and nor, expressing ∧, ∨, and nor, respectively. Other Booleans can only be expressed compositionally. In English one can only express negated conjunction p nand q by the compound ‘not (both) p and q’. The lexicon in other languages patterns somewhat differently: in Iraqw there is no nor (Mous, 2004), there is no ∨ in Wari’ (Mauri, 2008; Everett and Kern, 1997), and no ∧ in Maricopa and Warlpiri (Gil, 1991; Bowler, 2015). On the other hand, no language has been observed to have a lexical primitive that expresses nand (Horn, 1972).

We present and discuss a novel explanation of these observations. We start from a version of update semantics in which the Boolean connectives can be encoded. Update semantics is a well-established approach to the study of semantic interpretation, with applications to phenomena such as anaphora resolution and default reasoning (Veltman, 1996; Heim, 1983). We now put it to new use in the study of language evolution, introducing what can in effect be regarded as a Dynamic Language of Thought. Specifically, we aim at accounting for the attested lexical inventories of Boolean operators. For example, in languages in which nor occurs, such as English and Italian, it never does so without conjunction and disjunction also being present.

In line with previous work, on our account language finds the best compromise between opposing pressures. The first pressure tends to minimize conceptual complexity. Boolean operators may be defined in the update system we introduce, according to the structure of complex updates. The conceptual complexity of a Boolean operator is assigned on the basis of the complexity of the update procedure by which the operator is encoded in the update system. Our first conclusion is that, with respect to the logic of updates, the naturally occurring Boolean functions tend to minimize the conceptual complexity of information processing.

The second pressure we discuss tends to minimize usage complexity, which can be thought of as the effort involved on average in expressing an observation. We measure usage complexity as the average length of a sentence expressing a combination of truth values in a given lexicon. We show that the naturally attested lexical inventories (English, Warlpiri, and so on) are different ways of optimizing the trade-off between conceptual and usage complexity.

The paper is structured as follows. We start by discussing the typological data in Section 1.1. Then, we introduce our Dynamic Language of Thought in Section 2. In Section 3, we use our Dynamic LOT to calculate a measure of conceptual complexity for the individual Boolean concepts. In Section 4, we apply this measure, together with a measure of usage complexity, to inventories of Boolean concepts, and show how they explain the lexicalization patterns. In Section 5, we review previous accounts of this phenomenon and compare them with ours.

1.1 Overview of the Evidence

Cross-linguistic evidence points to a few universal generalizations regarding the presence of lexically simple expressions that convey Boolean connectives. We assume a common understanding of the notion of ‘lexical simplicity’ as monomorphemicity, following Horn (1972, 1989) and Keenan and Stavi (1986). In this sense, English nor is lexically simple.¹

¹There is, however, a debate on whether nor and similar operators in English and other Germanic languages hide complex syntactic structure. See Sauerland (2000) and Zeijlstra (2011) for arguments, and Geurts (1996) and De Swart (2000) for counterarguments.

In some languages, such as Iraqw (Cushitic), nor is absent (Mous, 2004). In others, including several European languages, there are specialized lexical items for nor, along with conjunction and disjunction.

Many languages lexicalize conjunction and disjunction differently (Payne, 1985). In some languages, however, a single coordinating expression can take conjunctive or disjunctive meanings, depending on context. Examples are ASL (Davidson, 2013) and Japanese (Ohori, 2004; Sauerland et al., 2015). Bowler (2015) argues that the coordinator manu of Warlpiri (Pama-Nyungan) is a disjunction that can be pragmatically strengthened to a conjunction. Otherwise, there is no lexical conjunction in Warlpiri.

A relatively common strategy to express coordination is simple juxtaposition, as in ‘Would you like coffee, tea?’. In Maricopa (Yuman; Gil, 1991), juxtaposition expresses conjunction, and an optional evidential element leads to disjunctive interpretations. In Maricopa there is a lexical disjunction, and no conjunction.

The strategy of indicating uncertainty among alternatives to express disjunction (expressing ‘p or q’ by something like ‘Perhaps p, perhaps q’) is observed quite frequently. Examples include Kuskokwim (Athabaskan; Kibrik, 2004), Wari’ (Chapacuran; Mauri, 2008; Everett and Kern, 1997), and Aranda (Pama-Nyungan; Wilkins, 1989). These languages lack a lexical disjunction.

Several generalizations are apparent. For instance, nor is not attested unless ∨ and ∧ are, but ∧ and ∨ can each be present without the other. Moreover, there is no language that includes nand, confirming Horn (1972). Indeed, all remaining logically possible Boolean operators are missing.

1.2 The Language, Informally

We present now an informal summary of the Dynamic Language of Thought discussed below. The aim is to define a language that can encode a specific class of concepts, namely all the possible Boolean operators of two arguments or, in other words, all possible functions from two truth values to one truth value.

We assume three conceptual primitives for the Dynamic LOT. The first conceptual primitive is assertion. When a proposition p is asserted, the agent gives high plausibility to p being true. The second conceptual primitive is rejection. When a proposition p is rejected, the agent gives low plausibility to p being true. The third and last conceptual primitive is restriction, where the agent updates their model of the world so that only the propositions with high plausibility are true in the model.

All Boolean concepts can be defined using these three simple conceptual primitives. For instance, conjunction p ∧ q can be defined as the following process: (1) accept p, (2) restrict the world model, (3) accept q, (4) restrict the world model again. The end result of this process is a model of the world where both p and q are true. In order to define disjunction, high plausibility is attributed to p and to q before the world model is restricted.

Other operators may require a more complex procedure. For instance, in the procedure for p nand q, the agent has to keep in mind the original world model through various restrictions, to use it later in the process: (1) start with world model c, (2) accept p, (3) restrict the world model, (4) accept q, (5) restrict the world model again, (6) create a world model that is like the starting model c, but which excludes the world defined in step (5). In the world model resulting from this procedure, p and q are not both true, which is the meaning of p nand q. In Section 2, we formalize this intuitive explanation as a bilateral update system.
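As a rough illustration, the informal procedures above can be mimicked in a few lines of Python. This is a sketch under our own assumptions: worlds are (p, q) truth-value pairs, a "world model" is a plausibility ranking over worlds, and the function names (accept, restrict) are ours, not part of the proposal.

```python
def accept(ranking, prop):
    """Give higher plausibility to the worlds where the sentence is true."""
    return {w: r + 1 if w in prop else r for w, r in ranking.items()}

def restrict(ranking):
    """Keep only the most plausible worlds, resetting their ranks."""
    best = max(ranking.values())
    return {w: 0 for w, r in ranking.items() if r == best}

# Start with all four valuations of (p, q), equally plausible.
c = {w: 0 for w in [(1, 1), (1, 0), (0, 1), (0, 0)]}
P = {(1, 1), (1, 0)}   # worlds where p is true
Q = {(1, 1), (0, 1)}   # worlds where q is true

# Conjunction: accept p, restrict, accept q, restrict.
conj = restrict(accept(restrict(accept(c, P)), Q))
assert set(conj) == {(1, 1)}                    # p and q both true

# nand: run the conjunction procedure, then exclude its worlds from c.
nand = {w: 0 for w in c if w not in conj}
assert set(nand) == {(1, 0), (0, 1), (0, 0)}    # not both true
```

The nand procedure is the one that needs the original model c a second time, which is the source of its extra complexity in the formal system below.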

2 A Language of Thought for Boolean Connectives

According to a popular account, the information accepted by all participants in a conversation is stored in a context (Stalnaker, 1978). A declarative sentence carries information that modifies the context in which the sentence is uttered. This dynamic perspective on semantics underlies a rich paradigm of linguistic analysis that has been employed for the study of a variety of phenomena, including presupposition projection, modal subordination, default reasoning, and conditional and hypothetical reasoning (Veltman, 1996; Heim, 1983; Gillies, 2004).

We expand on standard update semantics by making a basic conceptual distinction. The informational content of a declarative sentence p can be “added” to the context c if p is asserted, or “removed” from it, if p is denied (Incurvati and Sbardolini, xb). Thus we distinguish the positive update c[+p] and the negative update c[−p] generated by p, using + and − as force indicators for assertion and denial. The resulting update system is called bilateral.

In this section we explain the logic of a bilateral update system, and how it can be used to encode the basic truth functions. For simplicity, we focus on propositional and binary uses of the connectives, as in simple conjunctions and disjunctions of declarative sentences. In the Appendix, we briefly compare the present system with Heim’s (1983) classical update semantics.²

2.1 Three Conceptual Primitives

The two simplest updates are the assertion and denial of an atomic sentence p in a context c, resulting in an updated context c′:

c′ = c[+p]        c′ = c[−p]

²In a classical update system, such as Veltman (1996) or Heim (1983), denial is treated as the assertion of a negation. This approach reduces the number of conceptual primitives, but obscures an important generalization, as will be apparent below. For logical and philosophical background to bilateralism, see Price (1990); Smiley (1996); Rumfitt (2000). For background on the bilateral update system we discuss here, see Incurvati and Sbardolini (xb,x).

Figure 1: A context {w1, w2, w3, w4}. Each world represents a combination of truth values for two atoms p and q. For example, p is false and q is true in w3.

If p is asserted, the information carried by the sentence is accepted. If p is denied, the same information is rejected. We assume an idealized but standard view of the dynamics of conversation, in which interlocutors are cooperative and fully informed.

The basic attitudes of acceptance and rejection can be understood as ordering the plausibility of possible worlds, or information states, broadly within the framework of Stalnaker (1978, 1999). For illustration, suppose that a context is a set of four worlds, in which all combinations of truth values for p and q are represented, as in Fig. 1. Worlds are ordered according to how plausible they are: a world is a more or less plausible representation of the actual state of affairs, given the interlocutors’ information during a conversation. We now describe algorithmic procedures corresponding to assertion and denial for updating an initial context with the information carried by a sentence. Initially, all worlds are ranked equally: all possibilities are considered equally plausible.

w1 ≈ w2 ≈ w3 ≈ w4

After an assertion of p, the information carried by p is accepted, and thus the p-worlds (see Fig. 1) move higher in the plausibility ordering.

w1 ≈ w2 > w3 ≈ w4

At this point a new context can be defined, by eliminating all worlds that are not highest in the ordering. The definition of a new context, in the current setting, is a restriction of the input context to the highest-ranking worlds. As a result, after a positive update with p in c = {w1, w2, w3, w4}, the context c[+p] = {w1, w2} does not contain information incompatible with p. On the other hand, if the information carried by p is rejected in the same initial c, the worlds in which p is true are ranked lower.

w1 ≈ w2 < w3 ≈ w4

By restriction, the new context includes only the highest-ranking worlds. Consequently, after a negative update with p, the context does not contain information compatible with p.

These update procedures show the interplay between three conceptual primitives: acceptance, rejection, and restriction. All updates are analyzed as combinations of these primitives, the first two governing the plausibility ordering, the third governing the definition of a new context. These will be the basic conceptual primitives for our study.

Compatibility and incompatibility are set-theoretic notions. Let ⟦p⟧^{M,g} be the semantic interpretation of a sentence p relative to a model M and a variable assignment g. (For ease of notation we will suppress superscripts henceforth.) Relative to a context c, the information compatible with p is c ∩ ⟦p⟧. In assertion, the worlds in c ∩ ⟦p⟧ are “promoted”, and in denial they are “demoted” with respect to the plausibility ordering. Since the update only keeps the highest-ranking worlds, the following equalities hold for atomic p in any context:

c[+p] = c ∩ ⟦p⟧
c[−p] = c \ ⟦p⟧

The idea that assertion updates the context by intersecting it with the semantic interpretation of the sentence is familiar from Stalnaker (1978) and Heim (1983). The idea that denial does so by subtraction is a natural extension.
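The two equalities, and the promote/demote-then-restrict procedure behind them, can be checked with a small sketch. The encoding of worlds as (p, q) pairs and the function name are ours, chosen for illustration.

```python
def update(c, prop, force):
    """Promote (+) or demote (-) the prop-worlds, then keep the top tier."""
    # A world is ranked 1 if its status w.r.t. prop matches the force sign.
    rank = {w: (1 if (w in prop) == (force == "+") else 0) for w in c}
    best = max(rank.values())
    return {w for w in c if rank[w] == best}

c = {(1, 1), (1, 0), (0, 1), (0, 0)}
P = {(1, 1), (1, 0)}              # [[p]]: worlds where p is true

assert update(c, P, "+") == c & P          # c[+p] = c ∩ [[p]]
assert update(c, P, "-") == c - P          # c[-p] = c \ [[p]]
```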

2.2 The Basic Connectors

We will now show how to express Boolean combinations of p and q in terms of update procedures. An update procedure encodes a Boolean formula just in case the result of the procedure on a context c is the same as intersecting c with the information carried by the formula.

Definition: Encoding. Let O be a Boolean operator of n arguments, ∗ a force sign (either + or −), and p1, ..., pn atomic sentences. An update procedure c[∗p1, ..., ∗pn] encodes a Boolean formula O(p1, ..., pn) just in case

c[∗p1, ..., ∗pn] = c ∩ ⟦O(p1, ..., pn)⟧

For example, a positive update by p, c[+p], encodes the atomic formula p. Another simple example is the encoding of ¬p, which is given by the update procedure c[−p]. This match is adequate, for c[−p] = c \ ⟦p⟧ = c ∩ ⟦¬p⟧. We proceed to analogous semantic characterizations for Boolean operators in the language of update functions, suitably generalized. We assume a standard formulation of the “static” interpretation function ⟦·⟧ relative to a possible world model M, as illustrated in Table 2.

In order to do this, we need a way to calculate how the information of multiple sentences modifies the initial context. There is a difference between adding or removing information from atoms p and q sequentially, and doing so not in sequence. This difference underlies different encodings for conjunction and disjunction.

⟦p ∧ q⟧ = ⟦p⟧ ∩ ⟦q⟧        ⟦p nor q⟧ = (W \ ⟦p⟧) ∩ (W \ ⟦q⟧)
⟦p ∨ q⟧ = ⟦p⟧ ∪ ⟦q⟧        ⟦p nand q⟧ = (W \ ⟦p⟧) ∪ (W \ ⟦q⟧)

Table 2: The standard interpretation we adopted for Boolean sentences, relative to an intensional model M with domain W. The table lists only four of the complete list of operators we considered, for illustrative purposes. The definition of the interpretation of the remaining operators is textbook.

For example, the information carried by p and q can be accepted in an input context c in a sequence, by accepting p first, and then q. In this case, an intermediate context is defined after the acceptance of p, relative to which q is then accepted. We write this ‘(c[+p])[+q]’, using round brackets to indicate the intermediate context. It is easy to verify that a sequence of positive updates by p and then by q encodes the conjunction of p and q: (c[+p])[+q] = (c ∩ ⟦p⟧) ∩ ⟦q⟧ = c ∩ ⟦p ∧ q⟧.

In terms of the plausibility ordering, consider again the context depicted in Fig. 1. Acceptance of p in c determines the ordering w1 ≈ w2 > w3 ≈ w4, on the basis of which a context c′ can be defined by restriction to the highest-ranking worlds. This is c′ = {w1, w2}. With respect to c′, q is then accepted, and thus w1 > w2. The context defined at this point only includes w1, in which p and q are both true. Hence Boolean conjunction can be expressed in the language of bilateral updates as a sequence of positive updates, first by p and then by q.³

Alternatively, p and q can be accepted simultaneously. In this case, there is no definition of an intermediate context after the acceptance of the first sentence, and a new context is defined once p and q are both accepted. We write this ‘c[+p, +q]’, indicating that p and q are accepted together in c. In terms of the plausibility ordering, after acceptance of p defines w1 ≈ w2 > w3 ≈ w4, no intermediate context is introduced, and so all worlds in the initial context are still “in play”, as it were. Acceptance of q at this point moves the q-worlds up in the ordering, yielding w1 ≈ w2 ≈ w3 > w4. A new context is finally defined that includes only the highest-ranking worlds, namely all but w4, in which p and q are both

³This is essentially Heim’s (1983) update rule for conjunction. For Heim (1983), the existence of an intermediate or local context matters to account for well-known dynamic phenomena, such as the perceived contrast in (1): 1. (a) They shot him and someone called the police. (b) Someone called the police and they shot him. The information carried by the second conjunct is added not to the initial context, but to the subordinate context generated by the first conjunct. This distinction is responsible for the felt causal asymmetry between calling the police and getting shot, or vice versa, in (1a) and (1b). As a matter of fact, the object language for our update system consists of atomic sentences and Boolean operators, hence does not contain presuppositional material (such as tense or the pronoun they, in example (1)). So, we do not predict dynamic effects such as the asymmetry of conjunction. The update system could be easily extended for this purpose. See the Appendix for more detail.

false. Thus, c[+p, +q] = c ∩ (⟦p⟧ ∪ ⟦q⟧) = c ∩ ⟦p ∨ q⟧. Boolean disjunction is therefore encoded by simultaneous (non-sequential) positive updates with p and q.⁴

We have shown how to write update rules that characterize negation, conjunction, and disjunction. Next, we consider negative updates. This is where we expect to find expressions in update notation that correspond to the semantic information of “negative” connectives like nor, and the unattested *nand.
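The two positive encodings can be checked directly with sets standing in for contexts. This is a sketch under our own encoding; the simultaneous update follows the ordering story above, where every world verifying at least one accepted sentence ends up in the top tier.

```python
c = {(1, 1), (1, 0), (0, 1), (0, 0)}
P = {(1, 1), (1, 0)}   # [[p]]
Q = {(1, 1), (0, 1)}   # [[q]]

def seq_pos(c, *props):
    """(c[+p])[+q]: restrict after each acceptance."""
    for prop in props:
        c = c & prop
    return c

def sim_pos(c, *props):
    """c[+p,+q]: promote all accepted worlds together, restrict once."""
    top = {w for w in c if any(w in prop for prop in props)}
    return top if top else c

assert seq_pos(c, P, Q) == c & (P & Q)   # encodes p AND q: {(1, 1)}
assert sim_pos(c, P, Q) == c & (P | Q)   # encodes p OR q: all but (0, 0)
```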

2.3 The Problem with *nand

From a set-theoretic perspective, there does not seem to be any asymmetry among conjunction, disjunction, and their negations, since all are definable in terms of complement, union, and intersection (see Table 2). For essentially the same reason, all Booleans are easily definable in classical approaches to update semantics such as Veltman’s (1996) and Heim’s (1983), whose metalanguage includes update functions as well as set-theoretic operations. However, an important difference emerges in the bilateral update system we have described.

Sequential and non-sequential positive updates encode conjunction and disjunction respectively. Corresponding negative updates may be obtained by simply inverting the force signs from acceptance to rejection. Information that a positive update places higher in the plausibility ordering is placed lower by the corresponding negative update. However, with negative updates sequencing does not matter semantically: (c[−p])[−q] is semantically equivalent to c[−p, −q], and so both of these update procedures encode the same Boolean formula. This is because (c \ ⟦p⟧) \ ⟦q⟧ = c \ (⟦p⟧ ∪ ⟦q⟧). Moreover, this negative update encodes negated disjunction, or ‘not-p and not-q’: hence c[−p, −q] = (c[−p])[−q] = c ∩ ⟦p nor q⟧. As a result, the familiar Aristotelian Square of Opposition reduces to a Triangle, depicted in Fig. 3.

With regard to the plausibility order on the four-world context of Fig. 1, rejection of p determines the ordering w1 ≈ w2 < w3 ≈ w4, in which the p-worlds stand below. If q is simultaneously rejected with no intermediate context, we have w1 ≈ w2 ≈ w3 < w4, and the new context will include only w4, since it is highest in the ordering. Alternatively, an intermediate context c′ = {w3, w4} is defined once p is rejected. Rejection of q in c′ yields w3 < w4, and thus the resulting final context is the same, namely {w4}. This is the world in which both p and q are false. Based on our primitives, the negative updates corresponding to conjunction and disjunction collapse onto negated disjunction nor. In order to encode nand in the current framework, more complexity must be added.
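The collapse of the negative updates can likewise be verified with sets, under the same hypothetical encoding as before.

```python
c = {(1, 1), (1, 0), (0, 1), (0, 0)}
P = {(1, 1), (1, 0)}   # [[p]]
Q = {(1, 1), (0, 1)}   # [[q]]

def seq_neg(c, *props):
    """(c[-p])[-q]: restrict after each rejection."""
    for prop in props:
        c = c - prop
    return c

def sim_neg(c, *props):
    """c[-p,-q]: demote all rejected worlds together, restrict once."""
    bottom = set().union(*props)
    rest = {w for w in c if w not in bottom}
    return rest if rest else c

# Sequencing does not matter for negative updates...
assert seq_neg(c, P, Q) == sim_neg(c, P, Q)
# ...and both encode negated disjunction: c ∩ [[p nor q]] = {(0, 0)}
assert seq_neg(c, P, Q) == {(0, 0)}
```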

4Formally speaking, the possibility of simultaneous updates with multiple sentences requires the update function c[·] to take objects of type (st)t as arguments, i.e. sets of propositions. This allows for a recursive definition of positive and negative updates, and it means that, in effect, the update function is a device for processing discourses, rather than single sentences taken one by one. For technical details, see Incurvati and Sbardolini (xa).

                      nor
          c[−p, −q] = (c[−p])[−q]

(c[+p])[+q]                    c[+p, +q]
     ∧                              ∨

Figure 3: The Triangle of Opposition. The horizontal axis separates positive updates (below) and negative updates (above). The vertical axis separates sequential updates (left) and non-sequential updates (right). This distinction collapses on the negative side, resulting in the absence of nand.

2.3.1 Local and global updates

Up to this point, we have only considered formulas in which the initial input context c appears just once. We can think of such formulas as local, in the sense that they define updates that progressively restrict the initial context, without having to keep track of previous stages of the procedure. In other words, an agent computing a local update only needs to keep track of the current state of the context. However, our language allows c to appear more than once in a formula. For instance, consider the following expression, in which a local update is embedded in a larger one:

c[−(c[+p])]

This formula expresses the following update (going from inside out): p is accepted in c, and a local context is defined, as indicated by the round brackets. Then the local context c[+p] is removed from the initial context c. The result is equivalent to a denial of p, but the crucial innovation is that c appears more than once: the formula defines a global operation. An agent calculating this update needs to keep track not only of the current state of the context, but also of a previous state. While the result is semantically equivalent to the local expression c[−p], global updates can express meanings that cannot be expressed locally.

2.3.2 nand as a global operator

We can now encode ‘p nand q’ as the following global update, corresponding to the rejection of a conjunction:

c[−(c[+p])[+q]]

Formula       Truth table   Update
p ∧ q         1000          (c[+p])[+q]
p ∨ q         1110          c[+p, +q]
p nor q       0001          (c[−p])[−q]
p nand q      0111          c[−(c[+p])[+q]]
p xor q       0110          (c[−(c[+p])[+q]])[−(c[−p])[−q]]
p ↔ q         1001          c[−(c[−(c[+p])[+q]])[−(c[−p])[−q]]]
p → q         1011          c[−(c[+p])[−q]]
p ← q         1101          c[−(c[−p])[+q]]
p ↛ q         0100          (c[+p])[−q]
p ↚ q         0010          (c[−p])[+q]
just p        1100          c[+p]
just q        1010          c[+q]
just not p    0011          c[−p]
just not q    0101          c[−q]
⊤             1111          c
⊥             0000          c[−c]

Table 4: For each binary Boolean connective, its truth table and the shortest update instruction that encodes it. In the Truth Table column, we assume by convention that the order of truth values is 1100 for p and 1010 for q.

The interpretation of this formula is consistent with the rules introduced above, though the procedure described is considerably longer. First, the initial context c is positively updated with p, creating a first context c′ = c[+p]. Then, c′ is positively updated with q, creating a second context c″. So far, this is the update procedure seen above for conjunction. As a final step, the original context c is updated by subtracting c″, which leaves everything but the conjunction of p and q. The resulting procedure is global, as the input c needs to be called more than once, and it is considerably more demanding than anything we have seen before, since three subordinate contexts are defined in order to complete the process. In practice, the plausibility ranking is manipulated by calculating contextual restrictions separately. Global updates are more complex procedures.
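With contexts modeled as sets, the global procedure for nand reduces to the conjunction procedure followed by one subtraction from the original input (a sketch, with our usual encoding of worlds):

```python
c = {(1, 1), (1, 0), (0, 1), (0, 0)}
P = {(1, 1), (1, 0)}   # [[p]]
Q = {(1, 1), (0, 1)}   # [[q]]

c1 = c & P             # c' = c[+p]
c2 = c1 & Q            # c'' = (c[+p])[+q], the conjunction procedure
nand = c - c2          # c[-(c[+p])[+q]]: subtract c'' from the original c

assert nand == {(1, 0), (0, 1), (0, 0)}   # p and q not both true
```

Note that c must still be available at the last step, after two restrictions have already been computed; this is exactly the memory demand the text attributes to global updates.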

2.4 Interim Conclusion

Table 4 lists all Boolean truth functions of two arguments, together with a corresponding expression in the language of bilateral updates.⁵ We have already discussed the first four lines. Almost all of the remaining update formulas are global, with the exception of the negated material conditionals ↛ and ↚.

⁵In some cases, more than one expression in the language of bilateral updates encodes the same truth function. This is the case, for example, with c[−p, −q] and (c[−p])[−q]. In Table 4, we have written the shortest such formula, according to the notion of cognitive complexity introduced below.

The last six lines of Table 4 list simple but trivial operators, in the sense that not both arguments p and q are needed to determine the truth value of the Boolean formula on the left. ⊤ and ⊥ return True and False for every argument, and the first and second projections just p and just q, as well as their negations, are simply redundant, given their arguments and negation (which we assume to be available in all languages we consider below).

Our aim in the first part of the paper has been to define the Boolean connectives in terms of a small set of cognitively plausible primitives. Based on previous work in update semantics, the language of bilateral updates is built with only three primitives: acceptance, rejection, and restriction to the “best” worlds in the context. All Boolean operators can be defined as combinations of these primitives, as Table 4 shows. Crucially, expressions which are similar in propositional logic turn out to be substantially different when expressed in our language. Specifically, this is the case for nor and nand, which in propositional logic are both negations of simple operators, while in our language they differ substantially. In the next part of the paper, we move on to the consequences of our characterization of Boolean connectives for their conceptual complexity and for the typological generalizations discussed above.

3 Conceptual Complexity and Typology

Three features could be relevant for measuring the complexity of an update procedure that encodes a Boolean concept. First, an update can be local or global, depending on whether the initial context is called once or more than once as input. Second, we consider polarity, i.e. whether the updates are positive (assertions) or negative (rejections). Considerable linguistic and cognitive evidence suggests that the category of “the negative” requires more cognitive effort for production and interpretation. Finally, an update process can be homogeneous or heterogeneous, corresponding to whether the operations on context are all of the same force or not. For example, the negated material conditional ↛ is encoded by (c[+p])[−q], an update that requires the agent to first accept, and then reject, some information. We look at these features in turn.
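Reading the formulas of Table 4 as strings, the three features can be read off with a naive check. This is a sketch: the string syntax mirrors the paper's notation, but the scoring function and its output format are ours.

```python
def features(formula):
    """Globality, polarity, and homogeneity of an update formula string."""
    return {
        "global": formula.count("c") > 1,        # context called more than once
        "negative": "-" in formula,              # contains at least one rejection
        "heterogeneous": "+" in formula and "-" in formula,  # mixed forces
    }

print(features("(c[+p])[+q]"))       # and: local, positive, homogeneous
print(features("c[-(c[+p])[+q]]"))   # nand: global, negative, heterogeneous
```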

3.1 Globality

In a global update, c appears more than once. Multiple appearances of c imply that an agent performing the update does not only work on the current state of the context, but needs to keep track of the original context in order to perform operations on it multiple times during the update. The need to keep track of several contexts at once would strain the agent’s memory. The assumption that global updates are cognitively more complex has the intended consequence that the attested connectives (∨, ∧, nor) are simplest with respect to this feature of complexity.

Previous work on the computational complexity of quantifiers provides some indirect evidence for the idea that global updates are more complex. An important class of quantifiers, the quantifiers definable in first-order logic (a class which includes, among others, Aristotelian quantifiers, cardinality quantifiers, and parity quantifiers), can be represented by finite automata (Van Benthem, 1986). Crucially, this class of quantifiers is characterized both by cognitive simplicity and by the fact that the corresponding automata are memoryless. On the other hand, proportional quantifiers such as ‘most’ and ‘less than half’ are cognitively more complex and require a type of automaton with memory (Mostowski, 1998). Szymanik and Zajenkowski (2010) confirm experimentally that, in the case of quantifiers, a semantic object requiring no memory is cognitively simpler than one requiring memory.⁶ Future work can explore the relation between these previous results on the computational and cognitive complexity of quantifiers and the complexity of Boolean connectives in a Dynamic Language of Thought.

The definition of globality could be extended in various ways. For instance, a more fine-grained definition could look at the number of occurrences of the global context in the update formula. This measure of globality is more fine-grained in that updates that, on the current definition, are simply both global (such as nand and xor in Table 4) would not be equivalent on the revised definition: xor would be more complex than nand. Another more fine-grained approach would have ∨ more complex than ∧. In some sense, the agent performing a simultaneous update by p and q does have to “keep in mind” the initial context somewhat longer. Further experimental work on the relative complexity of disjunction over conjunction could clarify this point; see e.g. Goodwin and Johnson-Laird (2013) and Singh et al. (2016) for some experimental evidence.

3.2 Polarity

The second feature we consider is the polarity of updates. Updates that contain a rejection have negative polarity; otherwise an update is positive. We claim that positive polarity updates are cognitively simpler than negative polarity updates. In the simple case of local updates represented in the Triangle of Fig. 3, polarity corresponds to the well-studied phenomenon of monotonicity (Barwise and Cooper, 1981; van Benthem, 1984). An n-ary Boolean operator O is upward monotonic in its i-th argument iff, if O(... p_i ...) is true, then O(... p_i′ ...) is true, for any p_i′ entailed by p_i. Vice versa, an operator is downward monotonic in its i-th argument just in case it reverses the direction of entailment on its argument. Consider the case of ↛ as an example. ↛ is upward monotonic in its left argument because if p ↛ q is true, then, since p ∨ q follows from p, (p ∨ q) ↛ q is true. On the other hand, ↛ is downward monotonic in its right argument because, if p ↛ q is true, then, since q follows from p ∧ q, p ↛ (p ∧ q) is true.

For local updates, polarity corresponds directly to monotonicity: operators are upward monotonic in arguments that are accepted, and downward monotonic in arguments that are rejected. For instance, in the update encoding p ↛ q the left argument is accepted, and the right argument is rejected.

The connection between polarity and monotonicity is significant, as monotonicity has been argued to play a role in determining the cognitive complexity of quantifiers. In particular, Geurts and van der Slik (2005) show with an experiment that upward monotonicity is cognitively simpler than downward monotonicity. With reference to this work on monotonicity, we give a simple measure of complexity to updates with respect to polarity. We rank updates which only contain acceptances as simpler than updates which contain one or more rejections. In a more fine-grained version of this measure, the total complexity could be a function of the number of acceptances and rejections. Future work could investigate whether polarity or monotonicity as such plays a role in determining the conceptual complexity of a Boolean operator.

6 See Szymanik (2016) for a review of computational complexity theory and quantification.
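The monotonicity profile described above for the negated-conditional operator (“p and not q”) can be verified by brute force. The sketch below is our own illustration, not the paper’s code; propositions over two atoms are represented as 4-bit truth tables, and the helper names `entails` and `nimpl` are ours.

```python
# Brute-force check of the monotonicity profile of "p and not q".
# A proposition over atoms p, q is a 4-bit truth table (an int in 0..15);
# "a entails b" means every world verifying a verifies b.

MASK = 0b1111  # the four valuations of the atoms p, q

def entails(a, b):
    """a entails b iff a's worlds are a subset of b's."""
    return a & ~b & MASK == 0

def nimpl(a, b):
    """Truth table of the negated conditional: a and not b."""
    return a & ~b & MASK

props = range(16)

# Upward monotone in the left argument: weakening it preserves truth.
up_left = all(entails(nimpl(a, b), nimpl(a2, b))
              for a in props for a2 in props for b in props
              if entails(a, a2))

# Downward monotone in the right argument: strengthening it preserves truth.
down_right = all(entails(nimpl(a, b), nimpl(a, b2))
                 for a in props for b in props for b2 in props
                 if entails(b2, b))

# For contrast: the operator is not downward monotone in its left argument.
down_left = all(entails(nimpl(a, b), nimpl(a2, b))
                for a in props for a2 in props for b in props
                if entails(a2, a))
```

Running this confirms the profile: upward in the accepted (left) argument, downward in the rejected (right) argument.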

3.3 Homogeneity

The third and final feature of updates that we claim plays a role in determining conceptual complexity is homogeneity. A homogeneous update is one that only contains acceptances or only contains rejections, while a heterogeneous update is one that contains a mix of acceptances and rejections. Our claim is that homogeneous updates are simpler than heterogeneous updates.

The connection between homogeneity and monotonicity discussed in the previous section further supports the idea that homogeneity plays a role in determining cognitive complexity. Experimental work by Geurts and van der Slik (2005) provides evidence that quantificational constructions with mixed monotonicity profiles in their arguments are more complex than constructions with uniform monotonicity properties in all their arguments. Given the correspondence between monotonicity and arguments marked by + and −, we assign a higher complexity cost to heterogeneous update procedures. Future experimental work on Booleans could confirm this hypothesis more directly. More evidence might also suggest more sophisticated definitions of homogeneity. For instance, an update could be minimally homogeneous if it has the same number of acceptances as rejections, and maximally homogeneous if it has only positive or only negative arguments. We leave to further work whether more sophisticated definitions would track cognitive differences.

3.4 Complexity Rank

On the basis of these distinctions, we assigned a complexity rank to each of the non-trivial Boolean operators listed in Table 4, plus negation. The ranks are reported in Table 5. The more features of an operator are complex according to the discussion above, the higher the rank of the operator. For instance, ∨ is local, homogeneous, and positive, which implies that it is simpler than any of the other operators besides conjunction. On the other hand, nor is local and homogeneous, but not positive, which implies it is not as simple as ∨ but simpler than e.g. nand, which is global, heterogeneous, and negative. Note that our rank does not make quantitative assumptions about the relative difficulty of the various features. Since all positive operators are homogeneous and local, and all homogeneous operators are local, we do not need to decide the relative rank of, e.g., two operators which are local but not homogeneous, or homogeneous but not local.

Formula   | Local | Homogeneous | Positive | Complexity Rank
¬p        |   X   |      X      |          |        2
p ∧ q     |   X   |      X      |    X     |        1
p ∨ q     |   X   |      X      |    X     |        1
p nor q   |   X   |      X      |          |        2
p nand q  |       |             |          |        4
p xor q   |       |             |          |        4
p ↔ q     |       |             |          |        4
p → q     |       |             |          |        4
p ← q     |       |             |          |        4
p ↛ q     |   X   |             |          |        3
p ↚ q     |   X   |             |          |        3

Table 5: Complexity-relevant features of each non-trivial Boolean connective from Table 4, including negation, along with their rank in terms of conceptual complexity.

In this section, we have defined a measure of cognitive complexity for updates based on independently motivated and general aspects of cognition. Previous literature has shown that languages tend to become as simple as possible. Accordingly, we expect connectives that score low in our complexity measure to also be lexicalized more. This is indeed what we see. The binary operators frequently attested in natural language, namely ∧ and ∨, are also the simplest in absolute terms according to our complexity measure, while nor is also relatively simpler than all other Booleans.
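The ranking in Table 5 follows a simple implicit rule: an operator’s rank is one plus the number of complexity-relevant features it lacks. A minimal reconstruction (our own sketch; the operator names and feature tuples are ours, copied from Table 5):

```python
# Feature profiles from Table 5: (local, homogeneous, positive).
# "nimpl"/"nimpl_r" stand for the negated conditionals; "impl"/"impl_r"
# for the plain conditionals.
FEATURES = {
    "not":     (True,  True,  False),
    "and":     (True,  True,  True),
    "or":      (True,  True,  True),
    "nor":     (True,  True,  False),
    "nand":    (False, False, False),
    "xor":     (False, False, False),
    "iff":     (False, False, False),
    "impl":    (False, False, False),
    "impl_r":  (False, False, False),
    "nimpl":   (True,  False, False),
    "nimpl_r": (True,  False, False),
}

def rank(op):
    """Complexity rank: 1 plus the number of missing features."""
    return 1 + sum(not f for f in FEATURES[op])
```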

4 The Evolution of Lexical Inventories

While the complexity ranks for Boolean operators defined in the previous section might be a promising first step, they cannot be the final answer to the question of lexicalization. First of all, not all languages include the whole set of binary operators on the lower end of the spectrum: ∧, ∨, nor. As we briefly reported in Section 1, some naturally attested options are Wari’ (Chapacuran), which only includes ∧, and Warlpiri (Pama-Nyungan), which only includes ∨. So there isn’t simply a hard cutoff point for operators with low complexity.

In this section, we shift perspective from single operators to lexical inventories, i.e. non-empty sets of Boolean operators. We define a measure of usage complexity for inventories, which encodes the effort required on average to express an observation. We then show that the attested inventories are those that find the best compromise between conceptual and usage complexity.

4.1 The Model

For each lexical inventory we consider, we model an ideal speaker who conveys observations using the operators in the lexical inventory of their language. An observation is a string of four binary digits, 0s or 1s, representing the distribution of truth values of two atoms p and q. The observation 1000, for example, represents the truth of both p and q. The complete list of observations is under the Truth Table column of Table 4. The ideal speaker is keen to express any Boolean distinction they observe. The speaker’s task is to express their observation by linguistic means. The speaker’s language for this task is the syntactic closure of p and q under the operators that belong to the speaker’s lexical inventory. We assume that the speaker will solve the task by means of the shortest possible formula, in order to minimize production effort.7

4.1.1 The set of inventories

We only consider lexical inventories that are expressively complete, i.e. that can express all observations by compositional combination, under the methodological assumption that all natural languages are expressively complete (see the discussion in von Fintel and Matthewson (2008)). For example, {¬, ∨} is expressively complete, and belongs to the list of languages we consider. Indeed, this is the lexical inventory that represents a language like Warlpiri in our model. If a lexical inventory I is expressively complete, then for each of the observations there is a formula in the language recursively built on the basis of I whose truth-conditions coincide with the string. Thus, for example, the lexical inventory {∧, ∨} is not expressively complete, since there is no way of combining atoms p and q with conjunctions and disjunctions to express, say, 0111.

The expressively complete inventories of one element are {nor} and {nand}. If we consider only the operators listed in Table 5, there are 36 expressively complete inventories of two elements. Expressively complete inventories of up to five elements, based on the operators in Table 5, amount to 958 languages. Of these, we only considered languages with negation. Lexical forms for sentential negation are present in all known natural languages (Horn, 1989), presumably for independent reasons, so this restriction is not implausible. Thus, the total number of languages we considered in our model is 382.

Importantly, our model is not limited to minimal inventories. For example, while the inventory {¬, ∨} is included, so are all its supersets of up to five elements. The total conceptual complexity of a language increases by having more lexical primitives

7 The code and minimal formulas for the model in this section are available at https://github.com/thelogicalgrammar/booleanExpressivistLOT.

than logically necessary to achieve expressive completeness, but more primitives might reduce the number of embeddings that are needed to express an observation, and therefore reduce usage complexity. An example is given by the observation 0001, comparing two expressively complete inventories, {¬, ∨} and {nor}. In the former case, the observation is straightforwardly expressed by ‘¬(p ∨ q)’. In the latter, by the unwieldy ‘q nor ((p nor q) nor (p nor q))’. The second formula is so complex as to be utterly impractical, and the notion of usage complexity defined below is intended to capture this intuition.
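Expressive completeness can be checked mechanically by closing {p, q} under an inventory’s operators over truth tables. The following is our own sketch (not the repository code; names are ours), again encoding propositions as 4-bit ints:

```python
from itertools import product

MASK = 0b1111
P, Q = 0b1100, 0b1010   # the paper's convention for the atoms p and q

OPS = {
    "not":  lambda a: ~a & MASK,
    "and":  lambda a, b: a & b,
    "or":   lambda a, b: a | b,
    "nor":  lambda a, b: ~(a | b) & MASK,
    "nand": lambda a, b: ~(a & b) & MASK,
}

def closure(inventory):
    """All truth tables expressible from p, q with the given operators."""
    tables = {P, Q}
    while True:
        new = set(tables)
        for name in inventory:
            f = OPS[name]
            arity = f.__code__.co_argcount
            for args in product(tables, repeat=arity):
                new.add(f(*args))
        if new == tables:       # fixed point reached
            return tables
        tables = new

def complete(inventory):
    """Expressively complete iff all 16 observations are reachable."""
    return closure(inventory) == set(range(16))
```

For instance, `complete(["not", "or"])` and `complete(["nor"])` hold, while `complete(["and", "or"])` fails, matching the discussion above.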

4.1.2 Computational limits

We introduced a couple of limitations to the model, due to limited computing power. The model calculates the shortest formula to express an observation, given a set of lexical primitives. The algorithm proceeds as follows. For a given inventory I, the model starts with two lists, one list S with the saturated expressions and one list U with the unsaturated expressions. The former is initialized with p and q, the latter with the unsaturated operators in I. At each step of the algorithm, a new list U′ is defined. For each element u in U, the first argument in u is saturated with each element of I ∪ {p, q}, producing a list L_u. For every element u′ in L_u, if u′ is saturated and an equivalent meaning is not already in S, u′ is added to S; if u′ is unsaturated, it is added to U′. When this operation has been repeated for all u ∈ U, U is set to U′ for the following step. Since saturated formulas are encountered in order of the number of operations they contain, S will end up containing the shortest formulas for each observation for I.8

The width of the search tree at a given step grows exponentially with the size of the lexical inventory, so that a single additional lexical element can explode the memory required for computing the shortest formula to express an observation. To counter this, we capped the size of inventories to 5 elements. This is somewhat arbitrary, but lexical inventories of at most 5 elements are more than enough for present purposes. As we have seen, lexical inventories in the Boolean domain in natural language seem to be at most as large as in English, namely {¬, ∧, ∨, nor}. No known language has more Boolean primitives than this. Moreover, we put a limit of 7 on the number of operators in a minimal formula, so that any observation which cannot be expressed with at most seven operators is automatically set to a generic approximation. The effect of this is to set an upper bound on the measure of complexity defined below. Only 38 languages in our sample required this approximation.
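A simplified variant of this search can be sketched as follows (our own illustration; the paper’s implementation, which works by argument saturation and handles symmetric operators, is at the repository cited in footnote 7). Formulas are enumerated by operator count, so the first formula reaching a truth table is minimal for it, and the 7-operator cap is built in:

```python
from itertools import product

MASK, P, Q = 0b1111, 0b1100, 0b1010

def minimal_formulas(ops, max_ops=7):
    """Map each reachable truth table to (operator_count, formula)."""
    best = {P: (0, "p"), Q: (0, "q")}
    by_size = {0: {P: "p", Q: "q"}}        # tables first reached at each size
    for n in range(1, max_ops + 1):
        layer = {}
        for name, (arity, fn) in ops.items():
            # distribute the remaining n-1 operators over the arguments
            for sizes in product(range(n), repeat=arity):
                if sum(sizes) != n - 1:
                    continue
                pools = [by_size[s] for s in sizes]
                for args in product(*(pool.items() for pool in pools)):
                    table = fn(*(t for t, _ in args))
                    if table not in best:
                        formula = f"{name}({', '.join(f for _, f in args)})"
                        best[table] = (n, formula)
                        layer[table] = formula
        by_size[n] = layer
    return best

OPS = {
    "not": (1, lambda a: ~a & MASK),
    "or":  (2, lambda a, b: a | b),
}
BEST = minimal_formulas(OPS)
```

With {¬, ∨}, all 16 observations are reached within the 7-operator cap, and e.g. the minimal formula for 0001 uses 2 operators, as in Table 6.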

8The implemented algorithm is slightly different to account for symmetric operators. This difference does not affect the present discussion.

4.1.3 Usage complexity of inventories

For each observation and each lexical inventory, we calculate the formulas of minimal length that express the observation in the vocabulary of that inventory. Since we only consider expressively complete inventories, for every observation there is always a shortest formula that expresses it.

As we mentioned, not all lexical inventories are equivalent when it comes to minimizing the length of formulas that are used to express an observation. We define the usage complexity of a lexical inventory I as the average, over all observations, of the length of the shortest formula that expresses the observation in I. The length of a formula is the number of operator occurrences it contains, i.e. the number of opening parentheses when formulas are written in the prefix notation of Table 6, with bare atoms (which thus have length 0). Formulas with numerous subformulas have higher usage complexity. Usage complexity gives us a compelling way of measuring how much effort is required to use a language built on a given expressively complete inventory, in terms of the average number of operators required to express an observation. For illustration, Table 6 reports the minimal formulas for all observations for the naturally attested languages, {¬, ∧, ∨, nor}, {¬, ∧, ∨}, {¬, ∧}, {¬, ∨}, including the numeric values we calculated for their usage complexity.9
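If length is measured as the number of operator occurrences (equivalently, of opening parentheses when atoms are written bare), the value reported in Table 6 for {¬, ∧, ∨, nor} can be reproduced directly. A sketch, with the minimal formulas transcribed from Table 6 in ASCII operator names:

```python
# One minimal formula per observation, inventory {not, and, or, nor}.
# Each operator occurrence contributes exactly one '(', so formula
# length = count of '('.
MINIMAL = [
    "and(p, not(p))",             # 0000
    "nor(p, q)",                  # 0001
    "and(q, not(p))",             # 0010
    "not(p)",                     # 0011
    "and(p, not(q))",             # 0100
    "not(q)",                     # 0101
    "nor(and(p, q), nor(p, q))",  # 0110
    "not(and(p, q))",             # 0111
    "and(p, q)",                  # 1000
    "or(and(p, q), nor(p, q))",   # 1001
    "q",                          # 1010
    "or(q, not(p))",              # 1011
    "p",                          # 1100
    "or(p, not(q))",              # 1101
    "or(p, q)",                   # 1110
    "or(p, not(p))",              # 1111
]

usage_complexity = sum(f.count("(") for f in MINIMAL) / len(MINIMAL)
```

This yields 1.5625, matching the Usage Complexity row of Table 6.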

4.1.4 Conceptual complexity of inventories

In order to compare the total conceptual complexity of different inventories, we need to calculate conceptual complexity scores from the complexity ranks of the previous section. The simplest hypothesis is that the complexity score of an operator corresponds to its rank in Table 5. Plausibly, then, complexity adds up: we calculate the complexity of an inventory as the sum of the complexity scores of its elements. This method is a straightforward way to incorporate our earlier discussion of the conceptual complexity of updates at a linguistic level.

Thus, the total cost of a language like English, which includes ¬, ∧, ∨, and nor, would be greater than the total cost of a hypothetical language whose lexicon includes only nand (at least, according to the scores based on Table 5). For, {nand} scores 4, while English scores 6, since negation and nor have value 2, while disjunction and conjunction have value 1. However, this point does not depend on our choice of numerical values: in principle, languages may prefer lexicalizing fewer but relatively complex operators to lexicalizing many but relatively simple ones, other things being equal. Why not spend more once, to avoid spending twice?
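The additive score just described is simply a sum of per-operator ranks, as in this sketch (ranks copied from Table 5; the dictionary is ours):

```python
# Per-operator conceptual complexity, taken from Table 5.
RANK = {"not": 2, "and": 1, "or": 1, "nor": 2, "nand": 4}

def conceptual_complexity(inventory):
    """Conceptual complexity of an inventory: sum of its operators' ranks."""
    return sum(RANK[op] for op in inventory)
```

For example, the English-like inventory {¬, ∧, ∨, nor} scores 2 + 1 + 1 + 2 = 6, while the hypothetical {nand} scores 4.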

9 We emphasize that usage complexity is not intended to capture the actual cognitive effort of a speaker of a language such as, say, Aranda (= {¬, ∧}). As discussed in the introduction, languages that lack disjunction tend not to express it by means of complicated compositional constructions, but rather to employ a modal strategy (e.g., ‘Perhaps p, perhaps q’). We have not considered modal operators. Intuitively, however, these alternative strategies have the effect of allowing a speaker to express an observation with minimal complexity.

Truth Table | {¬, ∧, ∨, nor} | {¬, ∧, ∨} | {¬, ∧} | {¬, ∨}
0000 | ∧(p, ¬(p)) | ∧(p, ¬(p)) | ∧(p, ¬(p)) | ¬(∨(p, ¬(p)))
0001 | nor(p, q) | ¬(∨(p, q)) | ∧(¬(p), ¬(q)) | ¬(∨(p, q))
0010 | ∧(q, ¬(p)) | ∧(q, ¬(p)) | ∧(q, ¬(p)) | ¬(∨(p, ¬(q)))
0011 | ¬(p) | ¬(p) | ¬(p) | ¬(p)
0100 | ∧(p, ¬(q)) | ∧(p, ¬(q)) | ∧(p, ¬(q)) | ¬(∨(q, ¬(p)))
0101 | ¬(q) | ¬(q) | ¬(q) | ¬(q)
0110 | nor(∧(p, q), nor(p, q)) | ∧(¬(∧(p, q)), ∨(p, q)) | ∧(¬(∧(p, q)), ¬(∧(¬(p), ¬(q)))) | ∨(¬(∨(p, ¬(q))), ¬(∨(q, ¬(p))))
0111 | ¬(∧(p, q)) | ¬(∧(p, q)) | ¬(∧(p, q)) | ∨(¬(p), ¬(q))
1000 | ∧(p, q) | ∧(p, q) | ∧(p, q) | ¬(∨(¬(p), ¬(q)))
1001 | ∨(∧(p, q), nor(p, q)) | ∨(¬(∨(p, q)), ∧(p, q)) | ∧(¬(∧(p, ¬(q))), ¬(∧(q, ¬(p)))) | ∨(¬(∨(p, q)), ¬(∨(¬(p), ¬(q))))
1010 | q | q | q | q
1011 | ∨(q, ¬(p)) | ∨(q, ¬(p)) | ¬(∧(p, ¬(q))) | ∨(q, ¬(p))
1100 | p | p | p | p
1101 | ∨(p, ¬(q)) | ∨(p, ¬(q)) | ¬(∧(q, ¬(p))) | ∨(p, ¬(q))
1110 | ∨(p, q) | ∨(p, q) | ¬(∧(¬(p), ¬(q))) | ∨(p, q)
1111 | ∨(p, ¬(p)) | ∨(p, ¬(p)) | ¬(∧(p, ¬(p))) | ∨(p, ¬(p))
Usage Complexity | 1.5625 | 1.75 | 2.5625 | 2.5625

Table 6: Minimal formulas for all possible observations for the four attested inventories. In the Truth Table column, we assume by convention that the order of truth values is 1100 for p and 1010 for q.

There is an important distinction to notice at this juncture: complexity scores, in contrast to the complexity ranks discussed above, assume quantitative relations between the conceptual complexity of the various features. Since the complexity of each operator is calculated as the sum of its complex features, and the complexity of an inventory is calculated as the sum of the complexity of the operators it contains, the total complexity of an inventory depends on the total number of global, heterogeneous, and negative updates it contains. The assumption is weaker than it might first appear, because, as will become clear in the next section, all that matters is the rank of complexity of the inventories rather than their absolute complexity value. We leave exploration of the consequences of assuming different complexity levels for different features to future work.

4.2 Results

Figure 6 shows the value of each inventory with respect to its conceptual and usage complexity. We expect those inventories to be lexicalized that find the best compromise between the two complexities. This corresponds to the idea of the Pareto frontier. A language is on the Pareto frontier iff, given its value on one of the measures, it cannot improve with respect to the other measure. In our case, inventories on the Pareto frontier cannot get a lower usage complexity given their conceptual complexity, and vice versa. The crucial result is that the four observed inventories are all on the Pareto frontier (red dots in Figure 6).

Both {¬, ∨} and {¬, ∧} have a low level of conceptual complexity, because in addition to negation they only have one of the simplest operators. However, they have high usage complexity, because a speaker attempting to precisely convey some of the observations would have to resort to highly nested expressions. Inventory {¬, ∨, ∧} has slightly higher conceptual complexity, because it requires the speaker to store three rather than two operators. However, it makes up for it by having a lower usage complexity. Importantly, some languages with the same conceptual complexity as {¬, ∨, ∧} have higher usage complexity, and therefore are correctly predicted not to be lexicalized. Finally, inventory {¬, ∨, ∧, nor} has still higher conceptual complexity because of the addition of nor, but lower usage complexity.

While all attested languages are on the Pareto frontier, not all languages on the Pareto frontier are attested. It is worth noting, however, that the attested languages are all and only the languages on the Pareto frontier below a certain level of conceptual complexity. This points to the possibility that a harder constraint on conceptual complexity is at play, which prevents the development of languages with conceptual complexity above a certain threshold.
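The Pareto computation itself is straightforward. The sketch below (our own illustration) uses the usage values from Table 6 and conceptual scores summed from Table 5 for the four attested inventories; the last entry is a hypothetical dominated inventory, with a usage value invented purely for illustration:

```python
# (conceptual complexity, usage complexity) per inventory.
LANGS = {
    "not,and,or,nor": (6, 1.5625),
    "not,and,or":     (4, 1.75),
    "not,and":        (3, 2.5625),
    "not,or":         (3, 2.5625),
    "not,and,nand":   (7, 1.75),   # hypothetical usage value; dominated
}

def dominated(x, others):
    """x is dominated if some y is at least as good on both dimensions
    and strictly better on at least one (i.e. y differs from x)."""
    return any(y[0] <= x[0] and y[1] <= x[1] and y != x for y in others)

frontier = {name for name, v in LANGS.items()
            if not dominated(v, [w for m, w in LANGS.items() if m != name])}
```

Note that {¬, ∧} and {¬, ∨} share the same point, so neither dominates the other; the hypothetical {¬, ∧, nand} is dominated by {¬, ∧, ∨} and drops off the frontier.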

Figure 6: Results of the model. The x-axis shows the usage complexity, i.e. the average number of operators needed to express an observation. The y-axis shows the conceptual complexity, i.e. the sum of the complexities of the operators in the lexical inventory. The figure shows each inventory in the model (note that multiple inventories can overlap, so that there are more languages than dots). The attested languages are shown in red, and the corresponding inventory is shown in text. Two of the attested languages overlap, {¬, ∧} and {¬, ∨}. The main result is that the attested languages are on the Pareto frontier: given their value of conceptual complexity, they cannot improve in terms of usage complexity, and vice versa.

5 Previous Literature

In this final section, we discuss previous accounts of lexicalization in the Boolean domain, and compare them with ours. Horn (1972) observed that nand is only expressed compositionally in natural language. He then proposed a Gricean explanation of this fact (see also Horn (1989)). Horn’s account has been revised by Katzir and Singh (2013) and Uegaki (ms), who kept its essential outline. Consider two sets of lexical entries X and Y that are equally expressive. Two general principles are assumed:

Gricean Condition: If Y ⊂ X, X cannot be lexicalized.

Negation Condition: If X contains more instances of negation than Y, X cannot be lexicalized.

The combination of the two Conditions amounts to the claim that languages tend to keep the size of the lexicon to a minimum, while minimizing the number of negations in it,

other things being equal. This setting can be applied to the case of Boolean operators. Suppose, for simplicity, that the relevant semantic space is given by the following four observations: S = {1000, 1110, 0111, 0001}. This space characterizes the Aristotelian Square of Oppositions in the Boolean domain for binary operators, and it’s a subset of the set of observations we have considered in our model. A language can express an observation either directly, through one of its lexical entries, or indirectly, by implicature. A language could cover S, then, by having a lexical entry for each observation in S. However, this would contradict the Gricean Condition, since the smaller set {and, nor, or} also suffices to cover S. To see why, notice that ‘p or q’ implicates by scalar reasoning ‘not both p and q’. Intuitively, the expressive function of nand can thus be carried out by a pragmatically enriched disjunction. In this sense, {and, nor, or} suffices to cover S. For similar reasons, {and, nor, nand} would also suffice to cover S. A scalar implicature would presumably exist from ‘p nand q’ to ‘p or q’, if nand existed, so that disjunction could be expressed pragmatically.

While the Gricean Condition excludes a system of four connectives for semantic space S, it leaves two possible sets of lexical entries that both cover S, namely {and, nor, or} and {and, nor, nand}. However, if nand is analysed as negated conjunction, then {and, nor, or} contains fewer instances of negation than {and, nor, nand}. This is because or contributes no negations while nand contributes (at least) one. Therefore, by the Negation Condition, {and, nor, nand} cannot be lexicalized.

As noted by Uegaki (ms) and discussed above, several other operators besides nand are not lexically realized, and several possible combinations of operators are not attested inventories. Uegaki (ms) extends the Gricean approach to account for these additional data. Drawing on previous literature (see e.g. Regier et al. (2015); Kemp et al. (2018); Carcassi et al. (2019)), Uegaki assumes that languages evolve under a combination of two pressures. The first pressure tends to produce languages that lexicalize conceptually simple operators (on this, we will say more below). The second pressure tends to produce languages that induce accurate communication between agents.

In Uegaki (ms)’s model of communication, observations are the same as in our model presented above. However, the speaker cannot produce compositionally obtained sentences, but only individual operators. As a consequence, signals can be compatible with multiple observations, and the listener has to guess between the possible observations compatible with a signal. Different languages can therefore differ in terms of how accurate communication is on average. Languages that lead on average to more successful communication are assumed to be more likely to evolve. In Uegaki (ms)’s model, agents are assumed to be pragmatic, and scalar implicatures can also be conveyed.

The picture of communication we presented in our model is different from the one assumed in much of the previous literature, including Uegaki (ms), Regier et al. (2015), and Kemp et al. (2018). Specifically, previous literature assumes a lazy speaker who uses the shortest possible signals, even when they underspecify the speaker’s intended meaning. In such a situation, it is important for the language to lead to accurate communication

despite the lazy speaker. On the other hand, we assume a maximally cooperative speaker who always sends signals that uniquely identify the intended meaning, even when this requires the construction of highly complicated signals. In such a situation, it is important for the language not to put excessive strain on the speaker’s calculation and utterance of signals. Communication in real languages lies between these two extremes: speakers sometimes produce signals that are somewhat underspecified with respect to the meaning they intend to communicate, and sometimes produce more costly signals that however uniquely identify the intended meaning.

Beyond these differences in methodology, there is at least one reason to be skeptical of Horn’s Gricean approach, and its extensions and refinements. The approach relies on the claim that nand contains one more negation than or, or, in Uegaki (ms)’s setting, the claim that ∨ is conceptually simpler than nand. Such claims license a crucial application of the Negation Condition, but they are problematic. For, why should nand contain more negations than or? It appears to be assumed that ∧, ∨, and ¬ are primitives of the Language of Thought that the theory uses to encode the Boolean operators. In fact, Katzir and Singh (2013) explicitly assume so. But there is a threat of circularity here, for what counts as conceptually simple, or as containing negations, crucially depends on what the primitives of the Language of Thought are. Suppose, for example, that nand and ¬ are the only primitives of the Language of Thought. Then or would contain negations, since ‘p or q’ would correspond to ‘¬p nand ¬q’. Moreover, nand would be the conceptually simplest connective. Gricean accounts lack justification for the claim that conjunction and disjunction are conceptually more primitive than nand. Crucially, such justification is not found in the observation that or and and are lexicalized while nand is not, for that would be circular. In contrast, our account does not claim that some operators are more fundamental than others, but rather depends on independently plausible conceptual primitives that are used to construct all the Boolean operators.

Lexical gaps similar to the lack of nand exist in semantically related domains, specifically the quantificational determiners. In particular, no natural language includes a lexical entry for negated universal nall (= ¬∀) (Horn, 1972). An account of the lexicalization pattern of quantificational determiners is presented in Enguehard and Spector (ms). The account is broadly Gricean, but different from Horn’s. In a version of the Rational Speech Act model (Goodman and Stuhlmüller, 2013; Goodman and Frank, 2016), Enguehard and Spector show that if (i) languages lexicalize more frequently used expressions, and (ii) some is more informative than nall, then languages lexicalize {all, no, some} rather than {all, no, nall}. Applied to the Boolean connectives, the account would rest on the assumption (ii’), parallel to (ii), that or is more informative than nand.

We find (ii) and (ii’) problematic. Enguehard and Spector defend the claim that some is more informative than nall by arguing that, for any two predicates A and B, “a randomly picked A is still more likely not to have property B than to have it” (p. 7). In other words, any two randomly picked predicates are likely to have an empty intersection. Therefore, higher priors are assigned to A and B having empty intersection than otherwise. Hence,

‘some A are B’ has lower prior probability than ‘nall A are B’, and so the former is relatively more informative than the latter. Since it’s more informative, it’s uttered more frequently. Hence some is expected to be lexicalized, but not nall.

This is an argument to assume favorable priors. It’s difficult to imagine how a similar argument could justify the requisite priors in the case of Boolean operators. Is it more likely that any two randomly picked propositions are both false than that one is true? It’s very hard to assess this claim. Maybe a different argument could be given, but such claims as (ii) and (ii’) are still dubious. For they are only apparently empirical claims. In fact, they are about the stipulated priors of an idealized rational agent. Assuming that such an agent has priors that favor some over nall is suspiciously circular, if the goal is to explain why some is lexicalized and not nall.

6 Conclusions

In this paper, we have made two contributions. First, we have developed a bilateral update semantics for Boolean connectives, which allows any Boolean connective to be encoded with only three simple primitives, namely acceptance, rejection, and restriction. We have proposed our logic as a Dynamic Language of Thought. Our second contribution is a new account of the lexicalization patterns of Boolean connectives in natural language based on the proposed Language of Thought. In particular, we have shown that a combination of two pressures can account for the typological data. The first pressure tends to minimize the total cognitive complexity of the operators in the lexical inventories. The second pressure tends to minimize how difficult it is for a speaker to precisely communicate an observation. We have shown that the attested lexical inventories are on the Pareto frontier of these two pressures, i.e. find the best compromise between them.

The work presented in this paper can be extended in various directions. First of all, the two measures of complexity developed above could be validated. The predicted cognitive complexity of the operators could be tested experimentally in a learning task similar to Feldman (2003). Our measure of usage complexity could be compared to corpus data to estimate how often speakers use compositional means to express observations and how often they use expressions compatible with multiple meanings. A second avenue for future research is to develop the model further. In particular, the bilateral semantics we presented above can formally be extended to the case of quantifiers (see Incurvati and Sbardolini (xb)), leading to a possible extension of this line of research to the study of lexicalization in the domain of quantificational determiners. We leave these possible developments to future research.

7 Acknowledgments

We would like to thank participants in the 2021 EXPRESS Workshop on ‘Non-Assertoric Speech Acts’ for comments and discussion. Special thanks to Luca Incurvati, Jakub Szymanik, and Wataru Uegaki. The first author acknowledges funding received from the European Research Council under the European Union’s Seventh Framework Programme (FP/2007–2013)/ERC Grant Agreement n. STG 716230 CoSaQ. The second author acknowledges the generous support of the European Research Council under the European Union’s Horizon 2020 Research and Innovation Programme (Grant Agreement n. 758540) within the project EXPRESS: From the Expression of Disagreement to New Foundations for Expressivist Semantics.

8 Appendix: Comparison with Heim

The bilateral update semantics presented in this paper is not the standard update semantics of Heim (1983). This is to be expected, if we are to give a logical analysis of lexical gaps, because Heim’s semantics has the expressive power to encode all Boolean operators. In Heim’s semantics, updates are unilateral: rejections are positive updates of negated formulas. For a propositional language {¬, ∧, ∨}, Heim updates are defined recursively as follows:

c[p] = c ∩ ⟦p⟧
c[¬φ] = c \ c[φ]
c[φ ∧ ψ] = (c[φ])[ψ]
c[φ ∨ ψ] = c[φ] ∪ (c[¬φ])[ψ]

There is only one (positive) update function, which in the atomic case coincides with ·[+·]. More importantly, the semantic clauses import set-theoretic operators: intersection, complement, and union. So the formula c[¬φ] ∪ (c[φ])[¬ψ] is a straightforward way to encode nand, which does not appear to be significantly more complex than, say, the formula Heim uses to encode disjunction. For these reasons, Heim’s update semantics does not seem a promising approach to explaining lexical gaps. Of course, it was never intended to be so.

In Incurvati and Sbardolini (xa), the recursive closure of bilateral updates is presented and discussed, so that the updates presented in this paper can be defined compositionally. As a result, update functions take sets of sentences as arguments, not single sentences one by one, as for Heim. For a set of atomic sentences Γ, let c[Γ] = c[Γ+] ∪ c[Γ−], with Γ+, Γ− possibly empty subsets of Γ. The clauses for recursion of the base logic characterizing the Triangle of Fig. 3 are the following:

c[+p_1, ..., +p_n] = c ∩ ⋃_{i≤n} ⟦p_i⟧
c[−p_1, ..., −p_n] = c \ ⋃_{i≤n} ⟦p_i⟧
c[+¬φ, Γ] = c[−φ, Γ]
c[−¬φ, Γ] = c[+φ, Γ]
c[+φ ∧ ψ, Γ] = (c[+φ])[+ψ] ∪ c[Γ]
c[−φ ∧ ψ, Γ] = (c[−φ])[−ψ] ∪ c[Γ]
c[+φ ∨ ψ, Γ] = c[+φ, +ψ, Γ]
c[−φ ∨ ψ, Γ] = c[−φ, −ψ, Γ]

By means of the following notions of Support and Validity, these clauses define a logic (Veltman, 1996; Willer, 2013).

Definition 1 (Support). c ⊨ φ iff c[+φ] = c.

Definition 2 (Validity). φ_1, ..., φ_n ⊨ ψ iff for every c, c[+φ_1] ... [+φ_n] ⊨ ψ.

Of course, the resulting logic is too weak to recapture Heim’s classical system, and only characterizes the sub-classical logic of Fig. 3. There is more than one way to recapture Heim’s classical system from here. In doing this, we no longer proceed under cautious considerations of cognitive complexity, and aim solely at reaching the full expressive power of classical logic.

Two updates need fixing: conjunction under negation (nand), and disjunction under negation. For, the clauses above validate both ¬(φ ∧ ψ) ≡ ¬φ ∧ ¬ψ and ¬(φ ∨ ψ) ≡ ¬φ ∨ ¬ψ. A practical method for recapturing Heim’s system is to introduce the exhaustification operator EXH (Chierchia et al., 2002). Define φ ∨̄ ψ := EXH(φ ∨ ψ). So ∨̄ is in effect an exclusive disjunction. The reader can check that the following bilateral clauses for positive and negative update are truth-conditionally adequate for exhaustified disjunction.

c[+φ ∨̄ ψ, Γ] = (c[−φ])[+ψ] ∪ c[+φ, Γ]
c[−φ ∨̄ ψ, Γ] = (c[+φ])[−ψ] ∪ c[−φ, Γ]

If negative updates are converted into positive updates with negated formulas, and with Γ = ∅, the first clause coincides with Heim’s update for disjunction. The second clause is truth-conditionally equivalent to the update we proposed above for nand. Importantly, exhaustified disjunction is not scope-insensitive with respect to negation, as ordinary disjunction is in the context of bilateral updates. Thus, replacing the clauses for asserted and negated disjunction in the proper places, we can recover Heim’s classical system.
c[+p1, …, pn] = c ∩ ⋃_{i∈n} ⟦pi⟧          c[−p1, …, pn] = c \ ⋃_{i∈n} ⟦pi⟧
c[+¬φ, Γ] = c[−φ, Γ]                      c[−¬φ, Γ] = c[+φ, Γ]
c[+φ ∧ ψ, Γ] = (c[+φ])[+ψ] ∪ c[Γ]         c[−φ ∧ ψ, Γ] = c[−φ ∨̄ ψ, Γ]
c[+φ ∨ ψ, Γ] = c[+φ ∨̄ ψ, Γ]               c[−φ ∨ ψ, Γ] = c[−φ, −ψ, Γ]

For further detail and proofs, see Incurvati and Sbardolini (xb,x).
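The behavior of the two systems can be sanity-checked computationally. The following sketch is not part of the formal system: the representation choices (worlds modeled as frozensets of true atoms, contexts as sets of worlds, and the convention that an empty signed-atom set contributes nothing to an update) are ours, chosen to make the clauses executable.

```python
from itertools import chain, combinations

ATOMS = ("p", "q")
# A world is the frozenset of atoms true at it; W is the set of all worlds.
W = {frozenset(s) for s in chain.from_iterable(
    combinations(ATOMS, r) for r in range(len(ATOMS) + 1))}

def ext(p):
    # Extension of atom p: the worlds where p holds.
    return {w for w in W if p in w}

# Formulas: ("at", name), ("not", f), ("and", f, g), ("or", f, g), ("xor", f, g),
# where "xor" stands in for exhaustified disjunction EXH(f or g).
def upd(c, items, classical=False):
    """Bilateral update c[items] for a tuple of signed formulas (sign, formula).

    classical=False follows the base clauses; classical=True reroutes negated
    conjunction and asserted disjunction through exhaustified disjunction."""
    items = tuple(items)
    for i, (sign, phi) in enumerate(items):
        if phi[0] == "at":
            continue
        pre, post = items[:i], items[i + 1:]
        if phi[0] == "not":
            flip = "-" if sign == "+" else "+"
            return upd(c, pre + ((flip, phi[1]),) + post, classical)
        a, b = phi[1], phi[2]
        if phi[0] == "and":
            if sign == "+":  # c[+f&g, G] = (c[+f])[+g] | c[G]
                return (upd(upd(c, (("+", a),), classical), (("+", b),), classical)
                        | upd(c, pre + post, classical))
            if classical:    # c[-f&g, G] = c[-EXH(f or g), G]
                return upd(c, pre + (("-", ("xor", a, b)),) + post, True)
            return (upd(upd(c, (("-", a),)), (("-", b),))
                    | upd(c, pre + post))
        if phi[0] == "or":
            if classical and sign == "+":  # c[+f or g, G] = c[+EXH(f or g), G]
                return upd(c, pre + (("+", ("xor", a, b)),) + post, True)
            return upd(c, pre + ((sign, a), (sign, b)) + post, classical)
        if phi[0] == "xor":
            if sign == "+":  # c[+EXH, G] = (c[-f])[+g] | c[+f, G]
                return (upd(upd(c, (("-", a),), classical), (("+", b),), classical)
                        | upd(c, (("+", a),) + pre + post, classical))
            #                c[-EXH, G] = (c[+f])[-g] | c[-f, G]
            return (upd(upd(c, (("+", a),), classical), (("-", b),), classical)
                    | upd(c, (("-", a),) + pre + post, classical))
    # Only signed atoms remain: c[G] = c[G+] | c[G-]; an empty half contributes
    # nothing (the edge-case convention adopted in this sketch).
    pos = [phi[1] for s, phi in items if s == "+"]
    neg = [phi[1] for s, phi in items if s == "-"]
    return ((c & set.union(*map(ext, pos)) if pos else set())
            | (c - set.union(*map(ext, neg)) if neg else set()))

P, Q = ("at", "p"), ("at", "q")
# Base logic: rejected conjunction collapses onto rejected disjunction (nor).
base_nand = upd(W, (("-", ("and", P, Q)),))
base_nor = upd(W, (("-", ("or", P, Q)),))
# Recaptured system: rejecting p&q keeps the three worlds where p&q fails,
# while asserted disjunction remains inclusive.
classical_nand = upd(W, (("-", ("and", P, Q)),), classical=True)
classical_or = upd(W, (("+", ("or", P, Q)),), classical=True)
```

In the base system, base_nand and base_nor coincide (both leave only the world where p and q are both false), reflecting the equivalences ¬(φ ∧ ψ) ≡ ¬φ ∧ ¬ψ and ¬(φ ∨ ψ) ≡ ¬φ ∨ ¬ψ noted above; in the recaptured system, classical_nand yields the classical negated conjunction and classical_or the classical inclusive disjunction.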

References

Barwise, J. and R. Cooper (1981). Generalized quantifiers and natural language. Linguistics and Philosophy 4(2), 159–219.

Bowler, M. (2015). Conjunction and disjunction in a language without ‘and’. Proceedings of SALT 24, 137.

Carcassi, F., M. Schouwstra, and S. Kirby (2019, July). The evolution of adjectival monotonicity. Proceedings of Sinn und Bedeutung 23(1), 219–230.

Chierchia, G., D. Fox, and B. Spector (2002). Scalar implicature as a grammatical phenomenon. In C. Maienborn, K. von Heusinger, and P. Portner (Eds.), Semantics: An International Handbook of Natural Language Meaning, Volume 3, pp. 23–42. Berlin: Mouton de Gruyter.

Davidson, K. (2013). ‘And’ or ‘or’: General use coordination in ASL. Semantics and Pragmatics 6, 1–44.

De Swart, H. (2000). Scope ambiguities with negative quantifiers. In K. von Heusinger and U. Egli (Eds.), Reference and Anaphoric Relations, pp. 109–132. Dordrecht: Kluwer.

Enguehard, É. and B. Spector (ms). Explaining gaps in the logical lexicon of natural languages. x x, x.

Everett, D. L. and B. Kern (1997). Wari’: The Pacaas Novos Language of Western Brazil. Descriptive Grammar Series. London: Routledge.

Feldman, J. (2003, February). A catalog of Boolean concepts. Journal of Mathematical Psychology 47(1), 75–89.

von Fintel, K. and L. Matthewson (2008). Universals in semantics. The Linguistic Review 25, 139–201.

Geurts, B. (1996). On no. Journal of Semantics 13, 67–86.

Geurts, B. and F. van der Slik (2005, February). Monotonicity and processing load. Journal of Semantics 22(1), 97–117.

Gil, D. (1991). Aristotle goes to Arizona and finds a language without ‘and’. In D. Zaefferer (Ed.), Semantic Universals and Universal Semantics, pp. 96–130. Berlin: Foris Publications.

Gillies, A. S. (2004). Epistemic conditionals and conditional epistemics. Noûs 38, 585–616.

Goodman, N. and M. Frank (2016). Pragmatic language interpretation as probabilistic inference. Trends in Cognitive Sciences 20, 818–829.

Goodman, N. and A. Stuhlmüller (2013). Knowledge and implicature: Modeling language understanding as social cognition. Topics in Cognitive Science 5, 173–184.

Goodwin, G. P. and P. N. Johnson-Laird (2013, March). The acquisition of Boolean concepts. Trends in Cognitive Sciences 17(3), 128–133.

Heim, I. (1983). On the projection problem for presuppositions. In P. Portner and B. H. Partee (Eds.), Formal Semantics: The Essential Readings, pp. 249–260. Blackwell.

Horn, L. (1972). On the semantic properties of the logical operators in English. Ph. D. thesis, UCLA.

Horn, L. (1989). A Natural History of Negation. University of Chicago Press.

Incurvati, L. and G. Sbardolini (xa). The Logic of Bilateral Updates. x x, x.

Incurvati, L. and G. Sbardolini (xb). Update Rules and Semantic Universals. x x, x.

Katzir, R. and R. Singh (2013). Constraints on the lexicalization of logical operators. Linguistics and Philosophy 36, 1–29.

Keenan, E. L. and J. Stavi (1986). A semantic characterization of natural language determiners. Linguistics and Philosophy 9, 253–326.

Kemp, C., Y. Xu, and T. Regier (2018). Semantic Typology and Efficient Communication. Annual Review of Linguistics 4, 109–128.

Kibrik, A. (2004). Coordination in Upper Kuskokwim Athabaskan. In M. Haspelmath (Ed.), Coordinating Constructions, pp. 537–553. Amsterdam: Benjamins.

Mauri, C. (2008). The irreality of alternatives: Towards a typology of disjunction. Studies in Language 32, 22–55.

Mostowski, M. (1998). Computational semantics for monadic quantifiers. Journal of Applied Non-Classical Logics 8, 107–121.

Mous, M. (2004). The grammar of conjunctive and disjunctive coordination in Iraqw. In M. Haspelmath (Ed.), Coordinating Constructions. Amsterdam/Philadelphia: John Benjamins.

Ohori, T. (2004). Coordination in mentalese. In M. Haspelmath (Ed.), Coordinating Constructions, pp. 41–66. Amsterdam: John Benjamins.

Payne, J. (1985). Negation. Language Typology and Syntactic Description 1, 197–242.

Price, H. (1990). Why ‘not’? Mind 99(394), 221–238.

Regier, T., C. Kemp, and P. Kay (2015). Word meanings across languages support efficient communication. In B. MacWhinney and W. O’Grady (Eds.), The Handbook of Language Emergence, pp. 237–263. Oxford: Wiley-Blackwell.

Rumfitt, I. (2000). Yes and no. Mind 109(436), 781–823.

Sauerland, U. (2000). No ‘no’: On the crosslinguistic absence of a determiner ‘no’.

Sauerland, U., A. Tamura, M. Koizumi, and J. M. Tomlinson (2015). Tracking down disjunction. In D. Bekki (Ed.), Preproceedings of Logic and Engineering of Natural Language Semantics, Volume 12, pp. 10. Tokyo: Ochanomizu University.

Singh, R., K. Wexler, A. Astle, D. Kamawar, and D. Fox (2016). Children interpret disjunction as conjunction: Consequences for the theory of scalar implicature. Natural Language Semantics 24(4), 60.

Smiley, T. (1996). Rejection. Analysis 56, 1–9.

Stalnaker, R. (1978). Assertion. Syntax and Semantics 9, 315–332.

Stalnaker, R. (1999). Context and Content. Oxford: Oxford University Press.

Szymanik, J. (2016). Quantifiers and Cognition: Logical and Computational Perspectives. Studies in Linguistics and Philosophy. Springer International Publishing.

Szymanik, J. and M. Zajenkowski (2010). Comprehension of simple quantifiers: Empirical evaluation of a computational model. Cognitive Science 34(3), 521–532.

Uegaki, W. (ms). *NAND and the communicative efficiency model. x x, x.

van Benthem, J. (1984). Questions about quantifiers. Journal of Symbolic Logic 49, 443–466.

van Benthem, J. (1986). Semantic automata. In J. van Benthem (Ed.), Essays in Logical Semantics, Studies in Linguistics and Philosophy, pp. 151–176. Dordrecht: Springer Netherlands.

Veltman, F. (1996). Defaults in update semantics. Journal of Philosophical Logic 25(3), 221–261.

Wilkins, D. P. (1989). Mparntwe Arrernte (Aranda): Studies in the structure and semantics of grammar.

Willer, M. (2013). Dynamics of epistemic modality. Philosophical Review 122, 45–92.

Zeijlstra, H. (2011). On the syntactically complex status of negative indefinites. The Journal of Comparative Germanic Linguistics 14, 111–138.
