
CHAPTER TWO: FUNCTIONAL TYPE LOGIC

2.1. Compositionality

We have given a compositional semantics for predicate logic: a semantics in which the meaning of any complex expression is a function of the meanings of its parts and of the way these meanings are put together. In predicate logic, the complex expressions are formulas, and their parts are formulas, terms (constants or variables), n-place predicates, and the logical constants.

As is well known to anybody who has gone through lists of exercises of the form 'translate into predicate logic', predicate logic does a much better job at representing the meanings of various types of sentences than at representing the meanings of their parts.

Take for example the following sentence:

(1) Some old man and every boy kissed and hugged Mary.

With the one-place predicate constants MAN, OLD, BOY and the two-place predicate constants KISSED and HUGGED, we can quite adequately represent (1) in predicate logic as (1'):

(1') ∃x[MAN(x) ∧ OLD(x) ∧ KISSED(x,m) ∧ HUGGED(x,m)] ∧ ∀y[BOY(y) → (KISSED(y,m) ∧ HUGGED(y,m))]

The problem is that, without an explicit and compositional semantics, representing (1) as (1') is a completely creative exercise, and it shouldn't be.

What we really need is a semantics that gives us the right semantic representation for sentence (1) (and hence the right meaning), and that tells us at the same time how this semantic representation is related to the meanings of the parts of (1). Our representation (1') has various parts (like the implication →) that don't correspond to any part of the sentence; some parts of the representation occur twice in different forms, while in the sentence there is only one part (kissed and hugged Mary); the representation contains sentence conjunction ∧, whereas the sentence contains NP conjunction and verb conjunction; OLD is separated from MAN in the representation, which is motivated by what the meaning should be, but nothing explains how it got separated; etc. What we really need is a logical language in which we can represent the meaning of sentence (1) and the meanings of its parts: OLD / MAN / SOME / OLD MAN / SOME OLD MAN / EVERY / BOY / EVERY BOY / AND / SOME OLD MAN AND EVERY BOY / KISSED / HUGGED / AND / KISSED AND HUGGED / MARY / KISSED AND HUGGED MARY / SOME OLD MAN AND EVERY BOY KISSED AND HUGGED MARY

The theory we will use to represent these meanings, and the operations that put them together to give the correct sentence meanings, is type theory, which is based on the two basic notions of function application (functional application) and function definition (lambda abstraction). This theory we will now formulate.


2.2. The syntax of functional type logic.

We will start by specifying the language of functional type logic and its semantics. For compactness' sake, I will give the whole language, including the lambda operator and its semantics, but we will ignore that part till later. The language of functional type logic gives us a rich theory of functions. Our theory of functions is typed. This means that for each function in the theory it is specified what its domain and range are, and functions are restricted to their domain and range. This means that the theory of functions is a theory of extensional functions, in the set-theoretic sense of the word:

A function f from set A into set B, f: A → B, is a set of ordered pairs in A × B such that:
1. for every a ∈ A there is a b ∈ B such that <a,b> ∈ f
2. for every a ∈ A and b,b' ∈ B: if <a,b> ∈ f and <a,b'> ∈ f then b = b'

The domain of f: dom(f) = {a ∈ A: ∃b ∈ B: <a,b> ∈ f}
The range of f: ran(f) = {b ∈ B: ∃a ∈ A: <a,b> ∈ f}

When we talk about, say, the function + of addition on the natural numbers as the same function as addition on the real numbers, we are using an intensional notion of function: extensionally, +: N × N → N and +: R × R → R are different functions, because their domains and ranges are different and because they are different sets of ordered pairs. They are the same function intensionally, in that these functions have the same relevant mathematical properties. Our theory of functions will be a theory of extensional functions, and that means that each function is extensionally specified as a set of ordered pairs with a fixed domain and a fixed range. This means that functions can only apply to arguments which are in their domain. Partly as a means to avoid the set-theoretic paradoxes, we carry this over into our logical language, which contains expressions denoting these functions. We assume that each functional expression comes with an indication of its type, which is in essence an indication of what the domain and range of its interpretation will be. The syntax of our language will only allow a functional expression f of a certain type to apply to an argument of the type of expressions that denote entities in the domain of the interpretation of f.

This is the basic idea of the functional language:
- Each expression is of a certain type and will be interpreted in a certain semantic domain corresponding to that type.
- A functional expression of a certain type can only apply to expressions which denote entities in the domain of the function which is the interpretation of that functional expression, and hence are of the type of the input for that function.
- The result of applying a functional expression to an expression of the correct input type is an expression of the output type, and hence is interpreted as an entity within the range of the interpretation of that functional expression.

We make this precise by defining the language of functional type logic TL. We first define all the possible types that we will have expressions of in the language:


Let e and t be two symbols. TYPE_TL is the smallest set such that:
1. e, t ∈ TYPE_TL
2. if a, b ∈ TYPE_TL then <a,b> ∈ TYPE_TL

e and t are our basic types. They are the only types in the theory that are not types of functions.
- e (entity) is the type of expressions that denote individuals. Thus e is the type of the terms of predicate logic.
- t (truth value) is the type of expressions that denote truth values. Thus t is the type of the formulas of predicate logic.
- For any types a and b, <a,b> is the type of expressions that denote functions from entities denoted by expressions of type a into entities denoted by expressions of type b. In short, <a,b> is the type of expressions that denote functions mapping a-entities onto b-entities.
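Since the definition is inductive, it is easy to mechanize. Here is a minimal sketch of the type inventory in Haskell (the datatype and the names below are mine, not part of TL):

```haskell
-- The types of TL: e, t, and <a,b> for any types a and b.
data Type = E | T | Fn Type Type
  deriving (Eq)

-- Render types in the angle-bracket notation of the text.
instance Show Type where
  show E        = "e"
  show T        = "t"
  show (Fn a b) = "<" ++ show a ++ "," ++ show b ++ ">"

-- Some of the types discussed below:
et, nounPhrase, determiner :: Type
et         = Fn E T             -- <e,t>: one-place predicates
nounPhrase = Fn et T            -- <<e,t>,t>: noun phrase interpretations
determiner = Fn et (Fn et T)    -- <<e,t>,<<e,t>,t>>: determiners
```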

This definition of types gives us a rich set of types of expressions. Since e and t are types, <e,t> is a type: the type of functional expressions denoting functions that map individuals (the denotations of expressions of type e) onto truth values (the denotations of expressions of type t, formulas). As we will see, this is the type of properties (or sets) of individuals.

Assuming <e,t> to be the type of property-denoting expressions, <<e,t>,<e,t>> is the type of expressions mapping properties onto properties: the corresponding functions take a function from individuals to truth values as input and deliver, similarly, a function from individuals into truth values as output. As we will see, this is just the right type for expressions like prenominal adjectives.

Since <e,t> is a type and t is a type, <<e,t>,t> is a type as well. This is the type of expressions that denote functions that themselves take a function from individuals into truth values as input and deliver a truth value as output. As we will see, this is the right type for interpreting noun phrases like every boy. Since <e,t> is a type and <<e,t>,t> is a type, <<e,t>,<<e,t>,t>> is also a type: the type of expressions that denote functions which take functions from individuals into truth values as input, and give noun phrase interpretations as output. This is the type of determiners.

We really have a rich theory of types. Many types are included in the theory that we may not have any use for, like the type <<t,e>,<t,e>>: the type of expressions that denote functions which map functions from truth values into individuals onto functions from truth values into individuals. It is unlikely that we can argue plausibly that there are natural language expressions that should be interpreted as functions of this type. To keep the definition of the types and of our language general and simple, we will nevertheless include such types in our logical language (they just won't be used in giving interpretations to natural language expressions). We will very soon give some ways of thinking about the types described above in a more intuitive or simple way. The above examples were only meant to show how the definition of types works.


Based on this definition, we now define the logical language TL. The language TL has the following logical constants: ¬, ∧, ∨, →, ∀, ∃, =, (, ), λ. As we can see, for ease we take over all the logical constants of predicate logic, and we add one more, λ, the lambda operator. From now on I will leave out the subscript TL in the definitions, indicating that a set is particular to the language; I take that to be understood.

Constants and variables: for each type a ∈ TYPE, we specify a set CON_a of non-logical constants of type a and a set VAR_a of variables of type a. As before, we can think of the non-logical constants of type a as the lexical items whose semantics we do not further specify.

For every a ∈ TYPE: CON_a = {c_1^a, c_2^a, ...}, a set of constants of type a (at most countably many).
For every a ∈ TYPE: VAR_a = {x_1^a, x_2^a, ...}, a set of variables of type a (countably many).

We will never write these type indices explicitly in the formulas; rather, we will make typographical conventions about the types of our expressions. For instance, I will typically use the following conventions:
c, j, m are constants of type e
x, y, z are variables of type e
φ, ψ are expressions of type t
P, Q are variables of type <e,t>
But generally I will indicate the typographical conventions where needed.

So we have as many constants as we want for each type (that is up to our choice), and we have countably many variables of each type.

With this given, we will specify the syntax of the language. We do that by specifying for each type a ∈ TYPE the set EXP_a of well-formed expressions of type a.

For each a ∈ TYPE, EXP_a is the smallest set such that:
1. Constants and variables: CON_a ∪ VAR_a ⊆ EXP_a
   Constants and variables of type a are expressions of type a.
2. Functional application: if α ∈ EXP_<a,b> and β ∈ EXP_a then (α(β)) ∈ EXP_b
3. Functional abstraction: if x ∈ VAR_a and β ∈ EXP_b then λxβ ∈ EXP_<a,b>
4. Connectives: if φ,ψ ∈ EXP_t then ¬φ, (φ ∧ ψ), (φ ∨ ψ), (φ → ψ) ∈ EXP_t
5. Quantifiers: if x ∈ VAR_a and φ ∈ EXP_t then ∀xφ, ∃xφ ∈ EXP_t
6. Identity: if α ∈ EXP_a and β ∈ EXP_a then (α=β) ∈ EXP_t
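Clauses 1-3 already determine a little type checker. A minimal sketch, using the Type datatype above (the Expr datatype, its constructor names, and typeOf are mine; connectives, quantifiers and identity would be handled analogously):

```haskell
data Expr = Con String Type        -- non-logical constant with its type
          | Var String Type        -- variable with its type
          | App Expr Expr          -- functional application (α(β))
          | Lam String Type Expr   -- functional abstraction λxβ

-- typeOf computes the type of an expression, if it is well formed.
typeOf :: Expr -> Maybe Type
typeOf (Con _ a) = Just a
typeOf (Var _ a) = Just a
typeOf (App f x) = case (typeOf f, typeOf x) of
  (Just (Fn a b), Just a') | a == a' -> Just b    -- clause 2
  _                                  -> Nothing   -- type mismatch
typeOf (Lam _ a body) = Fn a <$> typeOf body      -- clause 3
```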


As said before, we will ignore clause 3 of this definition for the moment. Focussing on clauses 4-6, we see that type logic contains predicate logic as a part. Clause 4, concerning the connectives, is just the same clause that we had in predicate logic (because FORM = EXP_t). Clause 5, for quantifiers, looks just the same as the corresponding clause in predicate logic, but note that in predicate logic we could only quantify over individual variables (that is, variables of type e), while in type logic we allow quantification over variables of any type: in type logic we have higher-order quantification. Similarly, while in predicate logic we can only form identity statements between terms (where TERM = EXP_e), in type logic we can form identity statements between any two expressions, as long as they have the same type. The resulting identity statement is a formula, an expression of type t.

As an aside: in predicate logic we only need the logical constants ¬ and ∧, predication (P(t1,...,tn)), one quantifier (∃) and identity, and we can define all the other logical constants. In type logic we only need application (α(β)), abstraction (λ) and identity (=), and we can define in type logic all the connectives and quantifiers we want. This means that clauses 4 and 5 are really only there for convenience.

Let us now focus on clause 2, the clause for functional application.

To see how this works let us specify some constants. Let us assume that we have the following constants:

MARY ∈ CON_e
   e is the type of individuals.
MAN, BOY ∈ CON_<e,t>
   <e,t> is the type of one-place predicates of individuals.
KISSED, HUGGED ∈ CON_<e,<e,t>>
   <e,<e,t>> is the type of two-place relations between individuals.
SOME, EVERY ∈ CON_<<e,t>,<<e,t>,t>>
   <<e,t>,<<e,t>,t>> is the type of determiners.
OLD ∈ CON_<<e,t>,<e,t>>
   <<e,t>,<e,t>> is the type of one-place predicate modifiers: these are functions that take a property and yield a property.
AND1 ∈ CON_<<<e,t>,t>,<<<e,t>,t>,<<e,t>,t>>>
   <<e,t>,t> is the type of noun phrase meanings. AND1 takes a noun phrase meaning and another noun phrase meaning, and produces a noun phrase meaning.
AND2 ∈ CON_<<e,<e,t>>,<<e,<e,t>>,<e,<e,t>>>>
   Here AND2 is conjunction of two-place relations: it takes two two-place relations and produces a two-place relation.


Let us now do some functional applications. HUGGED ∈ CON_<e,<e,t>>, a two-place predicate; hence (by clause 1): HUGGED ∈ EXP_<e,<e,t>>. Similarly, AND2 ∈ EXP_<<e,<e,t>>,<<e,<e,t>>,<e,<e,t>>>>.

We see that:

AND2 ∈ EXP_<<e,<e,t>>,<<e,<e,t>>,<e,<e,t>>>>, a type of the form <a,b>, with a = <e,<e,t>> and b = <<e,<e,t>>,<e,<e,t>>>.
HUGGED ∈ EXP_<e,<e,t>>, an expression of type a.
Hence, by clause 2: (AND2(HUGGED)) ∈ EXP_<<e,<e,t>>,<e,<e,t>>>, an expression of type b.

Next:

(AND2(HUGGED)) ∈ EXP_<<e,<e,t>>,<e,<e,t>>>, a type of the form <a,b>, with a = <e,<e,t>> and b = <e,<e,t>>.
KISSED ∈ EXP_<e,<e,t>>, an expression of type a.
Hence ((AND2(HUGGED))(KISSED)) ∈ EXP_<e,<e,t>>, an expression of type b.

We see that ((AND2(HUGGED))(KISSED)) is itself a two-place predicate. This makes it very suitable to serve as a representation for kissed and hugged (of course, we still would need to give AND2 the correct interpretation).

((AND2(HUGGED))(KISSED)) ∈ EXP_<e,<e,t>>, a type of the form <a,b>, with a = e and b = <e,t>.
MARY ∈ EXP_e, an expression of type a.
Hence (((AND2(HUGGED))(KISSED))(MARY)) ∈ EXP_<e,t>, an expression of type b.

(((AND2(HUGGED))(KISSED))(MARY)) is a one-place predicate; again, that is what we would expect kissed and hugged Mary to be.

OLD ∈ EXP_<<e,t>,<e,t>>
MAN ∈ EXP_<e,t>
Hence (OLD(MAN)) ∈ EXP_<e,t>

SOME ∈ EXP_<<e,t>,<<e,t>,t>>
(OLD(MAN)) ∈ EXP_<e,t>
Hence (SOME((OLD(MAN)))) ∈ EXP_<<e,t>,t>


EVERY ∈ EXP_<<e,t>,<<e,t>,t>>
BOY ∈ EXP_<e,t>
Hence (EVERY(BOY)) ∈ EXP_<<e,t>,t>

AND1 ∈ EXP_<<<e,t>,t>,<<<e,t>,t>,<<e,t>,t>>>
(EVERY(BOY)) ∈ EXP_<<e,t>,t>
Hence (AND1((EVERY(BOY)))) ∈ EXP_<<<e,t>,t>,<<e,t>,t>>

(AND1((EVERY(BOY)))) ∈ EXP_<<<e,t>,t>,<<e,t>,t>>
(SOME((OLD(MAN)))) ∈ EXP_<<e,t>,t>
Hence ((AND1((EVERY(BOY))))((SOME((OLD(MAN)))))) ∈ EXP_<<e,t>,t>

Assuming that <<e,t>,t> is the correct type for noun phrase interpretations, we have seen that all of (SOME((OLD(MAN)))) (some old man), (EVERY(BOY)) (every boy), and ((AND1((EVERY(BOY))))((SOME((OLD(MAN)))))) (some old man and every boy) are of that type.

Finally:
((AND1((EVERY(BOY))))((SOME((OLD(MAN)))))) ∈ EXP_<<e,t>,t>
(((AND2(HUGGED))(KISSED))(MARY)) ∈ EXP_<e,t>
Hence (((AND1((EVERY(BOY))))((SOME((OLD(MAN)))))) ((((AND2(HUGGED))(KISSED))(MARY)))) ∈ EXP_t
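As a check, the typeOf function sketched above can replay this whole derivation mechanically (the constant spellings and variable names are mine):

```haskell
-- The lexicon of this section, with the types given in the text.
mary, man, boy, kissed, hugged, some, every, old, and1, and2 :: Expr
mary   = Con "MARY"   E
man    = Con "MAN"    et
boy    = Con "BOY"    et
kissed = Con "KISSED" (Fn E et)
hugged = Con "HUGGED" (Fn E et)
some   = Con "SOME"   determiner
every  = Con "EVERY"  determiner
old    = Con "OLD"    (Fn et et)
and1   = Con "AND1"   (Fn nounPhrase (Fn nounPhrase nounPhrase))
and2   = Con "AND2"   (Fn (Fn E et) (Fn (Fn E et) (Fn E et)))

-- Some old man and every boy kissed and hugged Mary.
np, vp, s :: Expr
np = App (App and1 (App every boy)) (App some (App old man))  -- <<e,t>,t>
vp = App (App (App and2 hugged) kissed) mary                  -- <e,t>
s  = App np vp

-- ghci> typeOf s
-- Just t
```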

We can make this more perspicuous by introducing a notation convention:

Notation convention (infix notation): let R be an expression of type <a,<a,c>>, and α and β expressions of type a. Then ((R(α))(β)) is an expression of type c. We define: (β R α) := ((R(α))(β))

':=' means 'is by definition'. Thus (β R α) is notation that we add to type theory: it is not defined as part of the syntax of the language, but is just a way in which we make our expressions more readable. With infix notation applied to both AND1 and AND2, we can write the above expression as:

(((EVERY(BOY)) AND1 (SOME((OLD(MAN))))) ((KISSED AND2 HUGGED)(MARY)))

This is a well-formed expression of type logic of type t, a sentence, and it can be used as the representation of sentence (1). Of course, we haven't yet interpreted the expressions occurring in it, but we can note that, unlike in the predicate logical representation, in the above expression we can find a well-formed subexpression corresponding to each of the parts of sentence (1).

(1) Some old man and every boy kissed and hugged Mary.


2.3. The semantics of functional type logic.

Let us now move to the semantics of functional type logic. As before, we start by specifying the models for our type logical language TL. I will follow the practice here of including in the definition of the models only those aspects that vary from model to model.

A model for functional type logic is a pair M = <D,F>, where:
1. D, the domain of individuals, is a non-empty set.
2. F is the interpretation function for the non-logical constants.

This looks just like predicate logic. However, since we possibly have many more types of constants, it is in the specification of the interpretation function F that differences will come in. In the model we only specify the domain D, the domain of individuals. This is obviously not the only domain in which expressions are interpreted. We don't put the other domains in the definition of the model because, once D is specified, all the other domains are predictable; they are given in a domain definition:

Let M = <D,F> be a model for TL. For any type a ∈ TYPE_TL, we define D_a, the domain of type a:
1. D_e = D
2. D_t = {0,1}
3. D_<a,b> = (D_a → D_b)

So the domain of type e, D_e, is the domain of individuals given in the model M. The domain of type t, D_t, is the set of truth values {0,1}.

For any sets A and B: (A → B) = {f: f is a function from A into B}

Thus (A → B) is the set of all functions with domain A and range included in B. Hence D_<a,b> is the set of all functions with domain D_a (the set of entities in the domain of a) and with their range included in D_b (the set of entities in the domain of b). In other words, D_<a,b> is the set of all functions from a-entities into b-entities.

Thus:
D_<e,t> = (D_e → D_t) = (D → {0,1})
   The set of all functions from individuals into truth values.
D_<e,<e,t>> = (D_e → (D_e → D_t)) = (D → (D → {0,1}))
   The set of all functions that map each individual onto a function from individuals into truth values.
D_<<e,t>,t> = ((D_e → D_t) → D_t) = ((D → {0,1}) → {0,1})
   The set of all functions that map functions from individuals into truth values onto truth values.

D_<<e,t>,<<e,t>,t>> = ((D_e → D_t) → ((D_e → D_t) → D_t)) = ((D → {0,1}) → ((D → {0,1}) → {0,1}))
   The set of all functions that map functions from individuals into truth values onto functions that map functions from individuals into truth values onto truth values.


D_<<e,t>,<e,t>> = ((D_e → D_t) → (D_e → D_t)) = ((D → {0,1}) → (D → {0,1}))
   The set of all functions that map each function from individuals into truth values onto a function from individuals into truth values.
D_<t,t> = (D_t → D_t) = ({0,1} → {0,1})
   The set of all functions from truth values into truth values (= the set of all one-place truth functions).
D_<t,<t,t>> = (D_t → (D_t → D_t)) = ({0,1} → ({0,1} → {0,1}))
   The set of all functions from truth values into one-place truth functions (= as we will see, the set of all two-place truth functions).
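When D is finite, all these domains are finite, and their sizes follow directly from the domain definition: |D_e| = |D|, |D_t| = 2, and |D_<a,b>| = |D_b|^|D_a| (this anticipates Cantor's theorems below). A small sketch, using the Type datatype from before (domainSize is my name):

```haskell
-- Size of the domain D_a, given the number n of individuals in D.
domainSize :: Integer -> Type -> Integer
domainSize n E        = n
domainSize _ T        = 2
domainSize n (Fn a b) = domainSize n b ^ domainSize n a

-- ghci> domainSize 3 et          -- |D_<e,t>| = 2^3
-- 8
-- ghci> domainSize 3 nounPhrase  -- |D_<<e,t>,t>| = 2^(2^3)
-- 256
```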

The domain definition specifies a semantic domain for every possible type in TYPE. With that, we can specify the interpretation function F for the non-logical constants of the model:

A model for functional type logic TL is a pair M = <D,F>, where D is a non-empty set and: for every a ∈ TYPE_TL: F: CON_a → D_a

For every type a, F maps every constant c ∈ CON_a onto an entity F(c) ∈ D_a. So F interprets a non-logical constant of type a as an entity in the domain of a.

Let M = <D,F> be a model for TL. An assignment function (on M) is any function g such that: for every a ∈ TYPE_TL, for every x ∈ VAR_a: g(x) ∈ D_a

Thus an assignment function maps every variable of every type onto an entity of the domain corresponding to that type. Hence g maps a variable x ∈ VAR_e onto an individual g(x) in D, it maps a predicate variable P ∈ VAR_<e,t> onto some function g(P) in (D → {0,1}), etc.

We can now give the semantics for our functional type logical language by specifying, as before, ⟦α⟧^{M,g}, the interpretation of expression α in model M, relative to assignment function g:

The semantic interpretation: ⟦α⟧^{M,g}
For any a,b ∈ TYPE_TL:
1. Constants and variables.
   If c ∈ CON_a, then ⟦c⟧^{M,g} = F(c)
   If x ∈ VAR_a, then ⟦x⟧^{M,g} = g(x)
2. Functional application.
   If α ∈ EXP_<a,b> and β ∈ EXP_a, then: ⟦(α(β))⟧^{M,g} = ⟦α⟧^{M,g}(⟦β⟧^{M,g})
3. Functional abstraction.
   If x ∈ VAR_a and β ∈ EXP_b, then: ⟦λxβ⟧^{M,g} = h, where h is the unique function in (D_a → D_b) such that for every d ∈ D_a: h(d) = ⟦β⟧^{M,g[x/d]}


4. Connectives.
   If φ,ψ ∈ EXP_t, then:
   ⟦¬φ⟧^{M,g} = ¬(⟦φ⟧^{M,g})
   ⟦(φ ∧ ψ)⟧^{M,g} = ∧(⟦φ⟧^{M,g}, ⟦ψ⟧^{M,g})
   ⟦(φ ∨ ψ)⟧^{M,g} = ∨(⟦φ⟧^{M,g}, ⟦ψ⟧^{M,g})
   ⟦(φ → ψ)⟧^{M,g} = →(⟦φ⟧^{M,g}, ⟦ψ⟧^{M,g})
   Here on the right-hand side we find the following truth functions: ¬: {0,1} → {0,1}; ∧, ∨, → are functions from {0,1} × {0,1} into {0,1}, given by:
   ¬ = {<0,1>, <1,0>}
   ∧ = {<<1,1>,1>, <<1,0>,0>, <<0,1>,0>, <<0,0>,0>}
   ∨ = {<<1,1>,1>, <<1,0>,1>, <<0,1>,1>, <<0,0>,0>}
   → = {<<1,1>,1>, <<1,0>,0>, <<0,1>,1>, <<0,0>,1>}
5. Quantifiers.
   If x ∈ VAR_a and φ ∈ EXP_t, then:
   ⟦∀xφ⟧^{M,g} = 1 iff for every d ∈ D_a: ⟦φ⟧^{M,g[x/d]} = 1; 0 otherwise
   ⟦∃xφ⟧^{M,g} = 1 iff for some d ∈ D_a: ⟦φ⟧^{M,g[x/d]} = 1; 0 otherwise
6. Identity.
   If α,β ∈ EXP_a, then: ⟦(α=β)⟧^{M,g} = 1 iff ⟦α⟧^{M,g} = ⟦β⟧^{M,g}

A sentence is a formula without free variables.
⟦φ⟧^M = 1, φ is true in M, iff for every g: ⟦φ⟧^{M,g} = 1
⟦φ⟧^M = 0, φ is false in M, iff for every g: ⟦φ⟧^{M,g} = 0

Let X be a set of sentences and φ a sentence. X ⊨ φ, X entails φ, iff for every model M: if for every δ ∈ X: ⟦δ⟧^M = 1, then ⟦φ⟧^M = 1.

2.4. Boolean algebras.

Take the set {a,b,c}. Its powerset pow({a,b,c}) is {Ø, {a}, {b}, {c}, {a,b}, {a,c}, {b,c}, {a,b,c}}. The following picture shows how these sets are related to each other by the subset relation ⊆:

{a,b,c}

{a,b} {a,c} {b,c}

{a} {b} {c}

Ø

The line going up from {a} to {a,b} in the picture means that {a} is a subset of {a,b}. The lines are understood as transitive: there is also a line going up from {a} to {a,b,c}. And, though we don't draw it, we take it to be understood that the order is reflexive: every set has an (invisible) line going to itself.


It can easily be checked that the set pow({a,b,c}) is closed under the standard set-theoretic operations of complementation −, union ∪, and intersection ∩, meaning: for every X,Y ∈ pow({a,b,c}): −X, X∪Y, X∩Y ∈ pow({a,b,c}).

The structure <pow({a,b,c}), ⊆, −, ∪, ∩, Ø, {a,b,c}> is a Boolean algebra.

A powerset Boolean algebra is a structure <pow(A), ⊆, −, ∪, ∩, Ø, A> where: A is a set, ⊆ is the subset relation on pow(A), and −, ∪, ∩ are the standard set-theoretic operations on pow(A).

In general, Boolean algebras are structures that model what could be called naturalistic typed extensional part-of structures.

Think of me and my parts. Different things can be part of me in different senses. For instance, I can see myself as the whole of my body parts, but also as the whole of my achievements, and in many other ways.
- Typing restricts the notion of part-of to one of those senses. Thus the domain in which I am the whole of my body parts will not include my achievements as parts of me, even though, in another sense, these are part of me.
- Extensionality means that in the domain chosen I am not more and not less than the whole of my parts. This means that if I choose the domain in which I am the whole of my body parts, then I cannot express in that domain that really and truly I am more than the sum of my parts. If I want to express the latter, I must do that by relating the whole of my body parts in the body-part domain to me as an entity in a different domain in which I cannot be divided into body parts. This is how intensionality comes in: I can be at the same time indivisible me and the sum of my body parts, because I occur in different domains (or, alternatively, there is a cross-domain identity relation which identifies objects in different domains as me).
- Naturalistic means that each typed extensional part-of notion is naturalistic: in that structure, parts work the way parts naturalistically work.
The claim is, then, that Boolean algebras formalize this notion.

I define a successive series of notions.

A partial order is a pair B = <B, ⊑>, where B is a non-empty set and ⊑, the part-of relation, is a reflexive, transitive, antisymmetric relation on B.

Reflexive: for every b ∈ B: b ⊑ b
Transitive: for every a,b,c ∈ B: if a ⊑ b and b ⊑ c then a ⊑ c
Antisymmetric: for every a,b ∈ B: if a ⊑ b and b ⊑ a then a = b

In a partial order we define the notions of join, ⊔, and meet, ⊓. In a structure of parts, we think of the join as the sum of the parts, and of the meet as their overlap.

Let B = <B, ⊑> be a partial order and a,b ∈ B:

The join of a and b, (a ⊔ b), is the unique element of B such that:
1. a ⊑ (a ⊔ b) and b ⊑ (a ⊔ b).
2. For every c ∈ B: if a ⊑ c and b ⊑ c then (a ⊔ b) ⊑ c.


The meet of a and b, (a ⊓ b), is the unique element of B such that:
1. (a ⊓ b) ⊑ a and (a ⊓ b) ⊑ b.
2. For every c ∈ B: if c ⊑ a and c ⊑ b then c ⊑ (a ⊓ b).

The idea is:
- To find the join of a and b, you look up in the part-of structure at the set of all elements of which both a and b are part: {c ∈ B: a ⊑ c and b ⊑ c}. The join of a and b is the minimal element of that set.
- To find the meet of a and b, you look down in the part-of structure at the set of all elements which are part of both a and b: {c ∈ B: c ⊑ a and c ⊑ b}. The meet of a and b is the maximal element of that set.
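For a finite partial order these recipes can be carried out literally. A sketch in Haskell, representing the order by the full list of its pairs, reflexive pairs included (all names are mine; Nothing is returned when the required minimal or maximal element does not exist):

```haskell
import Data.List (nub)

type Order a = [(a, a)]   -- the part-of relation as a list of pairs

leq :: Eq a => Order a -> a -> a -> Bool
leq ord x y = (x, y) `elem` ord

elems :: Eq a => Order a -> [a]
elems ord = nub (map fst ord ++ map snd ord)

-- join: the minimal element of {c : a ⊑ c and b ⊑ c}, if there is one.
join :: Eq a => Order a -> a -> a -> Maybe a
join ord a b = case [c | c <- ubs, all (leq ord c) ubs] of
                 [m] -> Just m
                 _   -> Nothing
  where ubs = [c | c <- elems ord, leq ord a c, leq ord b c]

-- meet: the maximal element of {c : c ⊑ a and c ⊑ b}, if there is one.
meet :: Eq a => Order a -> a -> a -> Maybe a
meet ord a b = case [c | c <- lbs, all (\d -> leq ord d c) lbs] of
                 [m] -> Just m
                 _   -> Nothing
  where lbs = [c | c <- elems ord, leq ord c a, leq ord c b]
```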

Elements need not have a join or a meet in a partial order. A partial order in which any two elements do have both a join and a meet is called a lattice:

A lattice is a partial order B = <B, ⊑> such that: for every a,b ∈ B: (a ⊔ b), (a ⊓ b) ∈ B.

Not lattices:

[Diagrams I-III, not reproduced. I: a minimum 0 below two incomparable elements a and b. II: two incomparable elements a and b below a maximum 1. III: 0 below a and b; a and b both below c and d; c and d both below 1.]

In I, a and b do not have a join. In II, a and b do not have a meet. In III, the set {x ∈ B: a ⊑ x and b ⊑ x} = {c,d,1}. This set doesn't have a minimum, hence a and b don't have a join. Similarly, c and d don't have a meet.

Lattices:

[Diagrams IV-VII, not reproduced: four lattices. IV is the diamond and V is the pentagon, both discussed below.]

A partial order B = <B, ⊑> is bounded iff B has a maximum 1 and a minimum 0 in B.
1 is a maximum in B iff 1 ∈ B and for every b ∈ B: b ⊑ 1
0 is a minimum in B iff 0 ∈ B and for every b ∈ B: 0 ⊑ b

Obviously, then, a bounded lattice is a lattice which is bounded.

Fact: Every finite lattice is bounded.


Proof sketch: if B = {b1,...,bn} then 1 = (b1 ⊔ (b2 ⊔ (... ⊔ (bn−1 ⊔ bn)...))) and 0 = (b1 ⊓ (b2 ⊓ (... ⊓ (bn−1 ⊓ bn)...))).

Not every lattice is bounded, though. If we take Z, the integers ..., −1, 0, 1, ..., with the normal order ≤, and we define: (n ⊓ m) = the smallest one of n and m, and (n ⊔ m) = the biggest one of n and m, then we get a lattice, but not one that has a minimum or a maximum.

0 is part of everything. Conceptually this means that 0 doesn't count as a naturalistic part. Nevertheless, it is very convenient to admit 0 to the structure, because it simplifies proofs of structural insights immensely. But indeed we do not count 0 as a naturalistic part, because we define:

a and b have no part in common iff (a ⊓ b) = 0

We assume as minimal constraints on naturalistic extensional typed part-of structures that they be lattices with a minimum 0. And this means that we assume that in a part-of structure we can always carve out the sum of two parts; of overlapping parts, we can carve out the part in which they overlap; and for non-overlapping parts we conveniently set their overlap to 0.

Now we come to the naturalistic constraints. They concern what happens to the parts of an object when we cut it. When I say cut it, I mean cut it. In a naturalistic part-of structure, cutting should be real cutting, the way my daughter does it. It turns out that real cutting imposes requirements on part-of structures that not all part-of structures allowed so far can live up to. If in those structures cutting cannot mean real cutting, they cannot be regarded as naturalistic structures.

Naturalistic constraint on cutting 1:

Suppose we have an object d and a part c of d. [Picture: d with c inside it.] Now suppose we cut through d, into a left half a and a right half b. [Picture: d cut into a and b.]

This means that d = (a ⊔ b) and (a ⊓ b) = 0. If we cut in this way, where is c going to sit? And the intuitive answer is obvious: either c is part of a, and c goes to the a-side; or c is part of b, and c goes to the b-side; or c overlaps both a and b, and hence c itself gets cut into a part a1 of a and a part b1 of b, and this means that c = (a1 ⊔ b1) and (a1 ⊓ b1) = 0.


Structures that don't satisfy this naturalistic cutting intuition miss something crucial about what parts are and how you cut. We define:

Lattice B = <B, ⊑> is distributive iff for every a,b,c ∈ B: if c ⊑ (a ⊔ b), then either c ⊑ a, or c ⊑ b, or for some a1 ⊑ a and some b1 ⊑ b: c = (a1 ⊔ b1).

Lattices IV and V are not distributive. IV is called the diamond, V is called the pentagon.

The diamond is not distributive. [Picture: 0 below the three incomparable elements a, c, b, all below 1.] Here 1 = (a ⊔ b), but c is not part of a, nor of b, nor do a and b have parts a1 and b1, respectively, such that c is the sum of a1 and b1.

The pentagon is not distributive. [Picture: 0 below a and c; a below b; b and c below 1.] Here 1 = (a ⊔ c), but b is not part of a, not of c, and not the sum of any part of a and any part of c.

There is a very insightful theorem about distributivity, which says that, in essence, the pentagon and the diamond are the only possible sources of non-distributivity:

Theorem: A lattice is distributive iff you cannot find either the pentagon or the diamond as a substructure.

Standardly, distributivity is defined by one of the following two alternative definitions:

(1) Lattice B = <B, ⊑> is distributive iff for all a,b,c ∈ B: (a ⊓ (b ⊔ c)) = ((a ⊓ b) ⊔ (a ⊓ c))
(2) Lattice B = <B, ⊑> is distributive iff for all a,b,c ∈ B: (a ⊔ (b ⊓ c)) = ((a ⊔ b) ⊓ (a ⊔ c))

I gave the above definition because it brings out the conceptual naturalness more clearly. But it is not hard to prove that each of definitions (1) and (2) is equivalent to the one I gave.
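For finite lattices, definition (1) can be checked by brute force with the join and meet functions sketched earlier (a sketch; the name distributive is mine):

```haskell
-- Test (1): a ⊓ (b ⊔ c) == (a ⊓ b) ⊔ (a ⊓ c), for all a, b, c.
distributive :: Eq a => Order a -> Bool
distributive ord = and [ lhs a b c == rhs a b c
                       | a <- es, b <- es, c <- es ]
  where
    es = elems ord
    lhs a b c = join ord b c >>= meet ord a
    rhs a b c = do x <- meet ord a b
                   y <- meet ord a c
                   join ord x y
```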


We come to the second constraint on cutting:

Naturalistic constraint on parts and cutting 2: we go back to d and c. [Picture: d with part c.]

The second naturalistic intuition is very simple. It is that c can indeed be cut out of d, and that cutting it out means what it intuitively does: when you cut c out of d, you remove from the partial order all parts of d that have a part in common with c. This means that you will be left only with parts of d that didn't have a part in common with c. The naturalistic constraint is that what you are left with is a part e of d such that (c ⊔ e) = d and (c ⊓ e) = 0, and its parts. What this says is that the way my daughter cuts is like this: [Picture: d divided into c and e.]

The constraint thus regulates, in cutting, the relation between what you cut out and what is left. What is left when we cut out c we call the complement of c. We define:

Bounded lattice B = <B, ⊑> is complemented iff for every b ∈ B there is a c ∈ B such that (b ⊔ c) = 1 and (b ⊓ c) = 0.

In non-distributive bounded lattices, elements can have more than one complement. For instance, in the diamond both a and b are complements of c. In distributive bounded lattices, each element has at most one complement: each element has zero complements or one. This means that in a complemented distributive lattice each element has a unique complement.

Example of a distributive lattice that is not complemented:

[Picture: the three-element chain 0 ⊑ a ⊑ 1.]

a is a non-null proper part of 1 (a part that is smaller than 1 itself), and a does not have a complement. The notion of complement naturally generalizes from complements in 1 to complements in arbitrary non-null elements of B:


Let a,b ∈ B − {0}, and a ⊑ b.

The relative complement of a in b, (b − a), is the unique element of B such that (a ⊓ (b − a)) = 0 and (a ⊔ (b − a)) = b (if there is one).

The relative complement of a in b represents the natural notion of 'remainder'. The naturalistic intuition is that if a is part of b, but not all of b, then there is something left, and the maximal part of b that doesn't overlap a is a's relative complement in b.

All this motivates the assumption of complementation for naturalistic structures:

B = <B, ⊑> is a Boolean lattice iff B is a complemented distributive lattice.

Next we generalize the notions of meet and join:

Let B = <B, ⊑> be a partial order and X ⊆ B.

The join of X in B is the unique element ⊔X of B such that:
1. for every x ∈ X: x ⊑ ⊔X
2. for every c ∈ B: if for every x ∈ X: x ⊑ c, then ⊔X ⊑ c

The meet of X in B is the unique element ⊓X of B such that:
1. for every x ∈ X: ⊓X ⊑ x
2. for every c ∈ B: if for every x ∈ X: c ⊑ x, then c ⊑ ⊓X

Lattice B = <B, ⊑> is complete iff for every X ⊆ B: ⊔X, ⊓X ∈ B.

In a complete lattice not just finite joins and meets exist, but all joins and meets do.

Facts: Every finite lattice is complete. Every powerset lattice is complete.

We have already indicated above why the first fact holds: for finite lattices, ⊔X and ⊓X can be defined in terms of the two-place join and meet. A powerset lattice is complete because its domain is the set of all subsets of some set A, and for any X ⊆ pow(A), ∪X and ∩X are subsets of A, hence in pow(A).

Not every lattice is complete; indeed, not every Boolean lattice is complete. Example: let A be an infinite set. FIN(A) is the set of all finite subsets of A. COFIN(A) is the set of all subsets whose set-theoretic complement in A is finite: COFIN(A) = {X ⊆ A: A−X is finite}

(So if A = N, the set of natural numbers, N−{1,2,3} is cofinite, but E, the set of even numbers, is not.)

Let B = FIN(A) ∪ COFIN(A), and look at <B, ⊆>. This is a bounded distributive lattice with minimum Ø and maximum A, and it is complemented as well: for every X ∈ B: −(X) = A − X. Namely: if X ∈ B, then X ⊆ A, and X is finite or X is cofinite. If X is finite, then A − X is cofinite, hence A − X ∈ B. If X is cofinite, then A − X is finite, hence A − X ∈ B. This means that B is a Boolean lattice. But B is not complete. For example:

Let A = N. Every singleton set is finite, so {{n}: n is even} ⊆ FIN(A), hence {{n}: n is even} ⊆ B. But ∪{{n}: n is even} = E, the set of even numbers, and E is not in B: it is neither finite nor cofinite. So this set of elements of B has no join in B.

For practically all purposes of semantics we assume our structures to be complete. For complements, completeness has the intuitive consequence that:

Fact: every complete Boolean lattice is relatively complemented, which means that: for every a,b ∈ B − {0}: if a ⊑ b then (b − a) ∈ B.

(Proof: b(a) = ⊔{c ⊑ b: a ⊓ c = 0}. Since, {c ⊑ b: a ⊓ c = 0}  B, This join is in B, if B is complete.)

Notice that if we assume our structures to be complete distributive lattices, we can do with a very weak complementation condition to get Boolean lattices.

Let part(b) = {x ∈ B − {0}: x ⊑ b}.

Complementation in complete distributive lattices: if a ∈ part(b) and a ≠ b, then there is a c ∈ part(b) such that (a ⊓ c) = 0.

If you take out something from b (namely a), there is something left.

One more central notion is that of atomicity:

Let B = <B, ⊑> be a partial order with minimum 0, and let a ∈ B.
a is an atom in B iff a ≠ 0 and for every b ∈ B: if b ⊑ a then either b = 0 or b = a.
a is an atom in B if there is nothing in between 0 and a.

Let B = <B, ⊑> be a partial order with minimum 0. B is atomic iff for every b ∈ B − {0}: there is an atom a ∈ B such that a ⊑ b.

Thus, in an atomic partial order, every element has an atomic part. De facto this means that when you divide something into smaller and smaller parts, you always end up with atoms.

Facts: Every finite lattice is atomic. Every powerset lattice is atomic.

That every finite lattice is atomic is obvious when you think about it: in order for an element not to have any atomic parts, you need to find smaller and smaller parts all the way down, and that implies infinity. In a powerset lattice, the singleton sets are the atoms.


Not every infinite lattice is atomic. In fact, there exist infinite lattices, and for that matter infinite Boolean lattices, that are atomless: that have no atoms at all. Even stronger, there exist complete Boolean lattices that are atomless.

We have defined Boolean lattices. We now define Boolean algebras:

A Boolean algebra is a structure B = <B, ⊑, −, ⊓, ⊔, 0, 1> such that:
1. <B, ⊑> is a Boolean lattice.
2. −: B → B is the operation which maps every b ∈ B onto its complement −b.
3. ⊓: (B×B) → B is the operation which maps every a,b ∈ B onto their meet (a ⊓ b).
4. ⊔: (B×B) → B is the operation which maps every a,b ∈ B onto their join (a ⊔ b).
5. 0 ∈ B is the minimum of B and 1 ∈ B is the maximum of B.

Let A = <A, ⊑_A, −_A, ⊓_A, ⊔_A, 0_A, 1_A> and B = <B, ⊑_B, −_B, ⊓_B, ⊔_B, 0_B, 1_B> be Boolean algebras.

A and B are isomorphic iff there is an isomorphism from A into B.

An isomorphism from A into B is a function h: A → B such that:
1. h is a bijection, a one-one function from A onto B.
2. h preserves the structure from A onto B:
   a. for every a ∈ A: h(−_A a) = −_B h(a)
   b. for every a1,a2 ∈ A: h((a1 ⊓_A a2)) = (h(a1) ⊓_B h(a2))
   c. for every a1,a2 ∈ A: h((a1 ⊔_A a2)) = (h(a1) ⊔_B h(a2))
   d. h(1_A) = 1_B
   e. h(0_A) = 0_B

Fact: if h is an isomorphism from A into B, then the inverse function h^−1 is an isomorphism from B into A.

What does the notion of isomorphism tell us? Suppose that there is an isomorphism between A and an unknown structure B:

[Picture: A is the eight-element Boolean algebra with atoms a, b, c and co-atoms −a, −b, −c; B is an unknown structure; h is an isomorphism from A into B.]

The fact that h is a bijection tells us that B has as many elements as A:


[Picture: B shown with eight elements, as many as A.]

The fact that h maps 1 onto 1 and 0 onto 0 tells us that B has a maximum 1 (say j) and a minimum 0 (say k):

[Picture: B shown with a maximum 1 and a minimum 0, and six remaining elements.]

The fact that for each element x ∈ A, h(−x) = −h(x) tells us that also in B every element has a complement:

[Picture: the six middle elements of B come in complement pairs: d and −d, e and −e, f and −f.]

Now we start to construct. Map a, b and c onto three of these elements. They can't map onto 0 or 1, because we have a bijection. Also, we cannot map, say, a onto d and b onto −d. The reason is the following: (a ⊔ b) = −c. If h(a) = d and h(b) = −d, then (h(a) ⊔ h(b)) = (d ⊔ −d) = 1. But (h(a) ⊔ h(b)) = h((a ⊔ b)) = h(−c). So then h(−c) = 1. But then h is not a bijection (since also h(1) = 1). So if, say, we set h(a) = d, we cannot choose −d as the value for either b or c. So we set, say, h(b) = e. Then we cannot set the value for c to −e. So let's say we set h(c) = f.

Since (a ⊓ b) = 0, we know that (h(a) ⊓ h(b)) = h((a ⊓ b)) = h(0) = 0. Hence (d ⊓ e) = 0. The same argument shows that (d ⊓ f) = 0 and that (e ⊓ f) = 0. So we get:

[Picture: h maps the atoms a, b, c of A onto d, e, f in B.]


Now h(a) = h(a) = d h(b) = h(b) = e h(c) = h(c) = f

Since a t a = 1, we get: h(a) t h(b) = h(a t b) = h(1) = 1. Hence d t e = 1: The same argument shows that d t f = 1 and e t f = 1

So we get:

[Picture: h maps −a, −b, −c onto −d, −e, −f respectively.]

Now (a ⊔ b) = −c. Hence h((a ⊔ b)) = h(−c). Hence (h(a) ⊔ h(b)) = h(−c). Hence (d ⊔ e) = −f. We determine in the same way that (d ⊔ f) = −e and (e ⊔ f) = −d, and we get:

[Picture: the completed isomorphism h between A and B.]

We see that isomorphic Boolean algebras have the same Boolean structure. They differ at most in the naming of the elements, but not in the Boolean relations between the elements. The language in which the theory of Boolean algebras is formulated is extensional. This means that A and B are not literally the same Boolean algebra. But mathematicians (and we) think of Boolean algebras as more intensional entities. With the mathematicians, we will regard A and B as the same Boolean algebra.

In general, we do not distinguish between isomorphic structures: if A happens to be an algebra of sets (like a powerset) and B an algebra of, say, functions, they are the same structure if they are isomorphic. And that means that we are free to think of them as either sets or functions, whichever is more convenient. As we will see, this double perspective is one of the basic tenets of the functional type theory we have introduced.

Secondly, I have related A and B through an isomorphism h such that h(a) = d, h(b) = e and h(c) = f. But h is not the only isomorphism between A and B. If I set, as for h, h'(a) = d, but I set h'(b) = f and h'(c) = e, I get another isomorphism between A and B


(by working out the other values):

[Picture: the alternative isomorphism h', with h'(b) = f and h'(c) = e.]

I use the pictures to indicate here which element of A maps onto which element of B. While this B-picture differs from the previous B-picture (e in the middle versus f in the middle), the difference lies only in the pictures, not in the structure: in pictures of lattices, only the vertical ordering is real (since it corresponds to the order ⊑); the left-right order is only artistic license. If I check which elements are part of which elements in both B-pictures, I get twice the same literal structure, B.

The fact that there is usually more than one isomorphism between two structures means that we can let convenience determine which isomorphism we use when we think about these structures as the same thing. This is a second important tenet of the functional type theory, as we will see.

I will now discuss some theorems and facts that will give us a lot of insight into the structure of Boolean algebras.

Representation Theorem: Every complete atomic Boolean algebra is isomorphic to a powerset Boolean algebra.

Proof sketch: let B be a complete atomic Boolean algebra and let ATOM_B be the set of atoms in B. For every b ∈ B, let AT_b = {a ∈ ATOM_B: a ⊑ b}. So AT_b is the set of atoms below b, the set of b's atomic parts. Define a function h: B → pow(ATOM_B) by: for every b ∈ B: h(b) = AT_b. What you can prove is that h is an isomorphism between B and the powerset Boolean algebra over ATOM_B.

So every complete atomic Boolean algebra is isomorphic to the powerset Boolean algebra of its set of atoms. This is not very difficult to prove, if you have first proved the following basic fact about complete atomic Boolean algebras:

Fact: In a complete atomic Boolean algebra every element is the join of its atomic parts:

for every b ∈ B: b = ⊔AT_b.
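For finite orders, the atoms and the representation map h can be computed directly with the helpers sketched earlier (a sketch; the names and the extra zero parameter are mine):

```haskell
-- The atoms: non-zero elements with nothing strictly between them and 0.
atoms :: Eq a => Order a -> a -> [a]
atoms ord zero =
  [ x | x <- elems ord, x /= zero
      , all (\b -> b == zero || b == x)
            [ b | b <- elems ord, leq ord b x ] ]

-- h(b) = AT_b, the set of atomic parts of b.
atomicParts :: Eq a => Order a -> a -> a -> [a]
atomicParts ord zero b = [ x | x <- atoms ord zero, leq ord x b ]
```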

The representation theorem tells us that the complete atomic Boolean algebras are just the powerset Boolean algebras. Since we have already seen that every finite lattice is complete and atomic, and hence so is every finite Boolean algebra, it follows that:

Corollary: the finite Boolean algebras are exactly the finite powerset Boolean algebras.


More generally:

1. Each powerset Boolean algebra has cardinality 2^n, for some cardinality n (i.e. if |A| = n, then |pow(A)| = 2^n).
2. For each cardinality 2^n, there is exactly one powerset Boolean algebra, up to isomorphism.
3. The powerset Boolean algebras are, up to isomorphism, exactly the complete atomic Boolean algebras.

I will next discuss a theorem that gives a very deep insight into the structure of Boolean algebras.

The decomposition theorem: every Boolean algebra which has an atom can be decomposed into two non-overlapping isomorphic Boolean algebras.

Proof sketch: Let B be a Boolean algebra and a an atom in B. Look at the following sets:

B_a = {b ∈ B: a ⊑ b}, the set of all elements that a is part of.
B_−a = {b ∈ B: b ⊑ −a}, the set of all parts of −a.

It is not difficult to prove that B_a ∪ B_−a = B and B_a ∩ B_−a = Ø. This means that B_a and B_−a partition B. Now you can restrict the operations of B to B_a, and the result is a Boolean algebra B_a with a as 0. Similarly, B_−a forms a Boolean algebra B_−a with −a as 1. And you can prove that B_a and B_−a are isomorphic.

Example: pick atom a in A. Look at A_a and A_−a:

[Picture: the eight-element Boolean algebra A, with the four elements above a forming A_a and the four parts of −a forming A_−a.]

You see the two Boolean algebras sitting in the structure (each isomorphic to pow({a,b})). It doesn't matter which atom you use here. In fact, there is a further fact:

Fact: if B is a Boolean algebra and a and b are atoms in B, then B_a and B_b are isomorphic.

That means that, in a given Boolean algebra, if you stand at any of its atoms and you look up, the sky looks the same: you see the same structure (up to isomorphism).


So, we also get:

[Pictures: the same algebra A decomposed at atom b, and decomposed at atom c.]

And it doesn't stop here. If B_a has an atom, then by the same theorem both B_a and B_−a themselves each decompose into two non-overlapping isomorphic Boolean algebras; so, all in all, we have decomposed A into four isomorphic Boolean algebras. [Picture: A decomposed into four isomorphic parts.]

The inverse of the decomposition theorem is the composition theorem:

The Composition Theorem: let B1 and B2 be two Boolean algebras that have no element in common and that are isomorphic. Then we can define a Boolean algebra on B1 ∪ B2.

Proof sketch: note that |B1 ∪ B2| = |B1| + |B2|, which is a power of 2 if B1 and B2 are finite Boolean algebras (so that is promising, in light of the facts stated above).

Let h: B1 → B2 be the isomorphism. You define the order relation on B1 ∪ B2 in terms of the orders ⊑_B1, ⊑_B2 and h. Take the relation: ⊑_B1 ∪ ⊑_B2 ∪ {<a,h(a)>: a ∈ B1}. The only problem with this relation is that it isn't transitive, so take its transitive closure: the result of adding all pairs <a,b> such that for some c, <a,c> and <c,b> stand in the above relation. The resulting relation is the partial order on B1 ∪ B2. In the resulting partial order you can define the Boolean operations quite simply (e.g. if a ∈ B1, then −_{B1∪B2}(a) = h(−_{B1}(a)); if a ∈ B2, then −_{B1∪B2}(a) = h^−1(−_{B2}(a))).

This gives you a second procedure for generating all finite Boolean algebras (the first is as powersets), but this time one that actually gives you a procedure for plotting them.

Start with two one-element Boolean lattices, <{0},{<0,0>}> and <{1},{<1,1>}> (these are really degenerate Boolean algebras, since in them 0 = 1; we really start counting at 2), and the isomorphism h = {<0,1>}. Then the composition theorem tells us how to form the two-element Boolean algebra:

0

1-element Boolean algebra: a point.


[Picture: two points composed into a line.]

Two 1-element Boolean algebras give a 2-element Boolean algebra: a line.

Take two disjoint 2-element Boolean algebras and an isomorphism: turn the arrow into a line (the transitive closure adds <0,1>), and you get the 4-element Boolean algebra. [Picture: two lines composed into a square.]

Two 2-element Boolean algebras give a 4-element Boolean algebra: a square.

Take two disjoint 4-element Boolean algebras and an isomorphism, and form the 8-element Boolean algebra. [Picture: two squares composed into a cube.]

Two 4-element Boolean algebras give an 8-element Boolean algebra: a cube.

Take two disjoint 8-element Boolean algebras and an isomorphism, and form the 16-element Boolean algebra. [Picture: two cubes composed into a four-dimensional cube.]

Two 8-element Boolean algebras give a 16-element Boolean algebra: a four-dimensional cube.


The key to the relation between the theory of Boolean algebras and the functional type theory lies in the following theorem:

The Lifting Theorem: let B be a Boolean algebra and A a set. Define on the function space (A → B) the following relation ⊑: f ⊑ g iff for every a ∈ A: f(a) ⊑_B g(a). Then:
1. <(A → B), ⊑> is a Boolean lattice.
2. <(A → B), ⊑> is complete iff B is complete.
3. <(A → B), ⊑> is atomic iff B is atomic.

Thus, if B is a Boolean algebra, you can lift the Boolean structure from B onto the set of all functions from A into B: the lifted operations work pointwise.
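A sketch of this pointwise lifting for the case we care most about, B = the two-element algebra (Bool in Haskell): the lifted meet, join and complement on (A → Bool) are just pointwise conjunction, disjunction and negation (the function names are mine):

```haskell
-- Pointwise Boolean structure on (a -> Bool), lifted from Bool.
liftMeet, liftJoin :: (a -> Bool) -> (a -> Bool) -> (a -> Bool)
liftMeet f g = \x -> f x && g x   -- corresponds to set intersection
liftJoin f g = \x -> f x || g x   -- corresponds to set union

liftNot :: (a -> Bool) -> (a -> Bool)
liftNot f = not . f               -- corresponds to set complement
```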

Now we observe the following.

BOOL is the smallest set of types such that:
1. t ∈ BOOL
2. if a ∈ TYPE and b ∈ BOOL then <a,b> ∈ BOOL

The lifting theorem tells us that the domain of every Boolean type has the structure of a Boolean algebra. Actually, it tells us more: t is not just a Boolean type, but D_t is finite, hence the domain D_t forms a complete atomic Boolean algebra. And this means that the domain of every type in BOOL forms a complete atomic Boolean algebra.

With that we have proved:

Corollary: the domain of every Boolean type is isomorphic to a powerset Boolean algebra.

To apply this to the type theory, we now mention Cantor's cardinality theorems (we only need the first two of the three facts mentioned, since the third is trivial):

Cantor's Theorems:
1. |pow(A)| = 2^|A|
2. |(A → B)| = |B|^|A|
3. |(A × B)| = |A| × |B|

We are now ready to harvest the fruits of the discussion.


D_<e,t> is isomorphic to pow(D)

D_<e,t> = (D → {0,1}), and |(D → {0,1})| = |{0,1}|^|D| = 2^|D| = |pow(D)|. From this, we now know, the above statement follows. The domain of functions from individuals to truth values is, up to isomorphism, the domain of sets of individuals. Thus functions from individuals to truth values are, up to isomorphism, just sets of individuals. Which sets depends on the choice of the isomorphism.

The isomorphism we choose is the one that is maximally useful for conceptual and linguistic purposes:
CHAR: pow(D) → D_<e,t>
CHAR^−1: D_<e,t> → pow(D)
(CHAR^−1 = {<f,X>: X ∈ pow(D), f ∈ D_<e,t> and CHAR(X) = f})

CHAR is the function that maps every set onto its characteristic function.

CHAR^−1 is the isomorphism such that for every function f ∈ D_<e,t>: CHAR^−1(f) = {d ∈ D: f(d) = 1}

Thus, the function which maps every function onto the set characterized by it is an isomorphism (and so is the function that maps every set onto its characteristic function).
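In Haskell terms, for a finite domain of individuals given as a list (a sketch; the names are mine):

```haskell
-- CHAR: from a set (represented as a list) to its characteristic function.
char :: Eq a => [a] -> (a -> Bool)
char xs = \d -> d `elem` xs

-- CHAR^-1: from a characteristic function back to the set it characterizes,
-- given the finite domain of individuals.
charInv :: [a] -> (a -> Bool) -> [a]
charInv domain f = [ d | d <- domain, f d ]
```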

Example. Let D = {a,b,c}.

D a1 b1 {a,b,c} c1 a1 a1 a0 b1 b0 b1 {a,b} {a,c} {b,c} c0 c1 c1 a1 a0 a0 b0 b1 b0 {a} {b} {c} c0 c0 c1

a0 b0 Ø c0

The isomorphisms CHAR and CHAR^−1 map elements in the same position in the two cubes onto each other.


We could have taken another isomorphism, say the function that maps every f onto {d ∈ D: f(d) = 0}. While technically sound, that would only have been confusing for us. Why? Well, mainly because we are creatures of habit. We are so used to thinking of a predicate logical formula WALK(j) as expressing the truth conditions of the natural language sentence John walks, that a logic in which the formula WALK(j) would express the truth conditions of the natural language sentence John doesn't walk would bother us. Yet, technically, there wouldn't be anything wrong with such a logic, as long as the formula ¬WALK(j) would then express the truth conditions of the natural language sentence John walks. Technically there is nothing wrong, but practically such a logic would be dreadful. Our brain wants its calculating systems to be mnemonic, and gets dreadfully upset when the mnemonics is upside down (that is why, in a theory of individuals and events, I change the name of type e to d, so that I can use e for events). Practically useful systems are mnemonic, and that is the reason (and, given the availability of different isomorphisms, the only reason) that we make sure that the logical formula WALK(j) expresses that F_M(j) ∈ F_M(WALK), and not that F_M(j) ∉ F_M(WALK).

So the isomorphism we use is CHAR, and this means that we identify a function into truth values with the set of arguments that it maps onto 1.

D_<e,<e,t>> is isomorphic to pow(D×D)

D_<e,<e,t>> = (D → (D → {0,1}))
|D_<e,<e,t>>| = |(D → (D → {0,1}))| = (2^|D|)^|D|
|pow(D×D)| = 2^|D×D| = 2^(|D|×|D|)

Now (2^n)^m = 2^(n×m). You don't remember? Well, first show that 2^n × 2^m = 2^(n+m): that is just a question of counting 2's, since a product of n 2's followed by a product of m 2's is a product of n+m 2's. Given this: (2^n)^m is a product of m copies of 2^n, hence a product of m×n 2's, which is 2^(n×m).

So indeed, D_<e,<e,t>> is isomorphic to pow(D×D). That means that the domain of functions from individuals to functions from individuals to truth values is, up to isomorphism, just the domain of all two-place relations, where relations are sets of ordered pairs.

The difference between the two perspectives becomes maybe clearer if we look at a third set: ((D×D) → {0,1}). |((D×D) → {0,1})| = 2^(|D|×|D|), so this domain is also isomorphic to pow(D×D).


That's not a surprise, because we have just studied the relation between (A → {0,1}) and pow(A) above. ((D×D) → {0,1}) is just the set of all characteristic functions of the sets in pow(D×D). So, obviously, we can switch between a set of ordered pairs and the function that maps each ordered pair in that set onto 1 and each ordered pair not in that set onto 0.

Now look at ((D×D) → {0,1}) and (D → (D → {0,1})). A function in ((D×D) → {0,1}) takes an ordered pair in D×D and maps it onto a truth value. A function in (D → (D → {0,1})) takes an individual d in D and maps it onto a function f_d in (D → {0,1}). That function f_d takes an individual d' in D and maps it onto a truth value f_d(d'). What the isomorphism says is that the total effect of this is that, in two steps, an ordered pair of d' and d is mapped onto a truth value. In other words: a function in ((D×D) → {0,1}) is a relation which maps two arguments simultaneously onto a truth value.

An n-place relation takes n arguments.
- We can think of it as a set of n-tuples.
- We can also think of it as a function that takes all n arguments simultaneously and maps them, as an n-tuple, onto a truth value. An n-place relation interpreted as a function that takes all n arguments simultaneously we call an uncurried function.
- We can also think of it as a function that takes the n arguments one at a time, mapping the first argument onto an (n−1)-place function, then mapping the second argument onto an (n−2)-place function, ..., mapping the (n−1)-th argument onto a 1-place function, which maps the last argument onto a truth value. An n-place relation interpreted as a function that takes the n arguments one by one we call a curried function.
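Haskell wears this distinction on its sleeve: the Prelude functions curry and uncurry convert between the two perspectives. A sketch (the example relation and its extension are mine):

```haskell
-- An uncurried two-place function: takes both arguments at once, as a pair.
kissedU :: (String, String) -> Bool
kissedU (x, y) = (x, y) `elem` kissPairs
  where kissPairs = [("John", "Mary")]   -- hypothetical extension

-- curry turns it into a function that takes its arguments one at a time.
kissedC :: String -> String -> Bool
kissedC = curry kissedU

-- uncurry goes back: uncurry kissedC behaves exactly like kissedU.
```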

Though the topic is spicy, the terminology 'curried' has nothing to do with Indian food, but with the mathematician Haskell Curry, who in the nineteen thirties developed Combinatory Logic (which provides one of the characterizations of the notion of algorithmic function). Curry based Combinatory Logic on the technique of currying functions (turning n-place functions into functions that take their arguments one at a time). The actual technique of currying functions goes back to a paper by Schönfinkel in the nineteen twenties.

The importance of currying (and uncurrying) for semantics can hardly be overestimated. You see it directly when you consider the compositionality problem, the problem of finding interpretations for the constituents:


Go back to predicate logic. Take a formula like KISS(MARY, JOHN). The construction in the logical language of this formula is:

FORM

PRED2 TERM TERM

KISS MARY JOHN

Currying predicate logic means that we can take a function CURRY(KISS) which also takes two arguments, but one at a time. This could give us a construction tree of the following form:

FORM

PRED1 TERM

PRED2 TERM MARY

CURRY(KISS) JOHN

And given our considerations about semantic domains, the semantics of CURRY(KISS) could give you the same interpretation for the sentence. In fact, to get the same semantics as you have assigned to KISS in predicate logic, the only thing you need to make sure of is that CURRY maps F_M(KISS) onto the right function.

But, of course, this is eminently useful if we are dealing with the interpretation of natural language expressions that combine with their arguments one at a time, like verbs:

S

NP VP

Mary V NP

kiss John

The domains D_<e,<e,t>> and pow(D×D) are isomorphic. As before, there is more than one isomorphism between these domains. Which one we choose is determined by what we want from the theory, and again by the mnemonics of the logic.

The mnemonics of the theory of relations is, once again, determined by how we are used to writing expressions with relations in predicate logic. The mnemonics is given by the fact that we are used to regarding the truth conditions of the formula KISS(j,m) as adequate for those of the natural language expression John kisses Mary. This means that when we require that <j,m> ∈ F_M(KISS), we interpret the first element of the ordered pair as the kisser and the second element as the kissee.


This mnemonics is a given fact that we are not going to tinker with. (The mnemonics is, by the way, much less strong for three-place predicates. If I write a formula GIVE(a,b,c), it is clear to me (and that is the mnemonics) that a is the giver. But for the other two arguments I would not be able to tell you who's on second; that has to be fixed by convention.)

One third of the connection between D_<e,<e,t>> and pow(D×D) is going to be fixed by the mnemonic fact that when we say <j,m> ∈ F_M(KISS), we fix j as the kisser.

The second third is fixed by the mnemonics of type theory. The type <e,<e,t>> is interpreted as the domain (D → (D → {0,1})). We see two e's in this type, and the left-right order of the e's in the type encodes the order in which the arguments go into a function of this type:

<e,<e,t>>
  1  2

This is the fundamental mnemonics of functional type theory: a function of type <e,<e,t>> combines with an argument of type e (1 = first argument in) to give a function of type <e,t>. That one combines with an argument of type e (2 = second argument in) to give a truth value. This mnemonics was fixed by the definition of D_<a,b>.

As far as the functions of this type themselves go, we read the arguments from the inside out. A curried function f in domain (D → (D → {0,1})) takes two elements p,q ∈ D in order to get to a truth value. The first argument in gives you a function in (D → {0,1}): f(p) ∈ (D → {0,1}). The second argument in gives you a truth value in {0,1}: [f(p)](q). This means indeed that in curried functions we count from the inside out. This is not a mnemonics; this is the way the theory works:

Type: <e,<e,t>>
        1  2
Argument combination: [f(1)](2)

And this generalizes:

The type of three-place relations between individuals is:
<e,<e,<e,t>>>
  1  2  3

Again, the numbers indicate the order in which the arguments go into the function. If f is a function in domain D_<e,<e,<e,t>>>, which is (D → (D → (D → {0,1}))), then the order in which the arguments go in is, again, inside out:

Type: <e,<e,<e,t>>>
        1  2  3
Argument combination: [[f(1)](2)](3)


In general, for n-place functions:
Type: <e,<e,...<e,<e,t>>...>>
        1  2     n−1 n
Argument combination: [[...[[f(1)](2)]...](n−1)](n)

The final third of the connection between D_<e,<e,t>> and pow(D×D) is not fixed by mnemonics, but by what we are using the theory for. And this final third is precisely the choice of the isomorphism between the two domains.

For grammatical reasons, we semanticists want to first combine the interpretation of kiss with the interpretation of the object John, and then with the interpretation of the subject Mary. If we write KISS for the kissing relation in pow(D×D), we have already, following the standard mnemonics, interpreted <j,m> ∈ KISS as: j is the kisser and m is the kissee. If we call the isomorphism from pow(D×D) into (D → (D → {0,1})) CURRY, then these two concerns fix the isomorphism CURRY:

[[CURRY(KISS)](j)](m) = 1 iff <m,j> ∈ KISS

CURRY is the function from pow(D×D) into D_<e,<e,t>> such that: for all R ⊆ D×D, for all d1,d2 ∈ D: <d1,d2> ∈ R iff [[CURRY(R)](d2)](d1) = 1

Equivalently,

CURRY^−1 is the function from D_<e,<e,t>> into pow(D×D) such that: for every f ∈ D_<e,<e,t>>: CURRY^−1(f) = {<d1,d2> ∈ D×D: [f(d2)](d1) = 1}
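A sketch of this particular isomorphism in Haskell; note the flip: the second coordinate of the pair goes into the function first (the relation-as-list representation and the names are mine):

```haskell
-- CURRY: from a relation (set of pairs <d1,d2>) to a curried function
-- that takes d2 first and d1 second.
curryRel :: Eq a => [(a, a)] -> (a -> a -> Bool)
curryRel r = \d2 d1 -> (d1, d2) `elem` r

-- CURRY^-1: from a curried function back to the set of pairs,
-- given a finite domain of individuals.
curryInv :: [a] -> (a -> a -> Bool) -> [(a, a)]
curryInv dom f = [ (d1, d2) | d1 <- dom, d2 <- dom, f d2 d1 ]
```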

For the type logical language this means that if we see the following expression:

((KISS(JOHN))(MARY))

we should not be misled by the left-right order of the arguments in this expression to think that it means that John kisses Mary. It means that Mary kisses John. The same holds for the type:

<e, <e, t>>
 1   2

The numbers I have used here indicate the order in which the arguments go in, and not the order in which they will be sitting in the ordered pair. Given CURRY, the order in which the arguments go in is precisely the inverse of the order in which they sit in the ordered pair. The above formula corresponds to the predicate logical formula:

KISS( MARY, JOHN )

This may be confusing in the beginning, but there is nothing to be done about it; you need to get used to it.

There is nothing to be done about it, because:
1. We don't want to reinterpret KISS(MARY,JOHN) to mean that John kisses Mary. That would again be dreadfully confusing.
2. We don't want to change the type-domain relation, because, as you will find out, the power of type theory as a tool lies precisely in the fact that function-argument structure can be read off directly from the types.
3. We don't want to use another isomorphism than CURRY between these two domains, for linguistic reasons.

What we do do is introduce relational, uncurried notation as a notation convention in the language of type logic, because relational notation has one great advantage: it reduces the number of brackets.

Given all this we define:

Notation convention: Relational notation for <e,<e,t>>
Let α ∈ EXP<e,<e,t>> and β1,β2 ∈ EXPe. We define:
α(β1,β2) := ((α(β2))(β1))

The notation convention extends to all domains we assume to be linked by the isomorphism CURRY (that is, the isomorphism that works like CURRY works for <e,<e,t>>). This means that we understand, for α ∈ EXP<e,<e,<e,t>>>:

α(β1,β2,β3) := (((α(β3))(β2))(β1))

In fact, α can be an expression of type <b,<a,t>>, with β2 ∈ EXPb and β1 ∈ EXPa. Then too we can write, if we so want, ((α(β2))(β1)) as α(β1,β2). This applies, for instance, to verbs like believe. We will later introduce a type of propositions, which I will here call p. This will be the type of an expression like that the earth is flat. We will have a constant BELIEVE of type <p,<e,t>> and will form expressions like ((BELIEVE(β))(JOHN)), which we interpret as: John believes β. And, with CURRY, we will unproblematically write that in relational notation as BELIEVE(JOHN,β).
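Since the notation convention is just argument reversal, it can be captured by a single polymorphic combinator. Here is a small Haskell sketch (rel2, rel3 and the toy verb are my assumptions, not part of the text); because rel2 is polymorphic, it covers the mixed case <b,<a,t>>, like believe, for free.

```haskell
-- α(β1,β2) := ((α(β2))(β1)): feed the arguments to a curried function
-- in reverse order.
rel2 :: (b -> a -> t) -> a -> b -> t
rel2 f b1 b2 = f b2 b1

-- α(β1,β2,β3) := (((α(β3))(β2))(β1))
rel3 :: (c -> b -> a -> t) -> a -> b -> c -> t
rel3 f b1 b2 b3 = f b3 b2 b1

-- A toy curried verb where the object goes in first.
kiss :: String -> String -> Bool
kiss obj subj = subj == "Mary" && obj == "John"

main :: IO ()
main = print (rel2 kiss "Mary" "John")  -- KISS(MARY,JOHN): Mary kisses John -> True
```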

This notation convention is meant to be useful. That means that we apply it when it is useful to do so. If it is not useful, we don't apply it, or we apply something else.

For instance, we have assumed a constant OLD of type <<e,t>,<e,t>>. And we were able to form a formula like ((OLD(MAN))(JOHN)). Now D<<e,t>,<e,t>> is isomorphic to pow(D × pow(D)). This means that, with CURRY, we could write the above expression in relational notation: OLD(JOHN,MAN). There is nothing wrong with that, except that it is conceptually easier to think of OLD as a function from sets into sets than as a relation between individuals and sets. We switch from the functional perspective to the set-theoretic perspective when it helps us interpret the types: so <<e,t>,<e,t>> is the type of functions from sets into sets. That is easier than functions from functions into functions. And the relational perspective, relations between individuals and sets, is not really easier than that. So we choose the middle perspective.


There are a few domains where the isomorphism CURRY is not the most natural one. For those we will blatantly choose another isomorphism and another notation convention.

D<<e,t>,t> is isomorphic to pow(pow(D))

<<e,t>,t> is the type of generalized quantifiers. Since <e,t> is (up to isomorphism) the type of sets of individuals, <<e,t>,t> is the type of functions from sets of individuals to {0,1}: (pow(D) → {0,1}). Thus a generalized quantifier is a function that takes a set of individuals and maps it onto a truth value. But, of course, any such function f characterizes a set: {X ∈ pow(D): f(X)=1}. And by now you should see that this means that D<<e,t>,t> is isomorphic to pow(pow(D)). Thus, generalized quantifiers are sets of sets of individuals.
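The equivalence between the two perspectives is easy to see with a toy example. The following Haskell sketch (the domain and the particular quantifier are my assumptions) takes a generalized quantifier given as a function and computes the set of sets it characterizes.

```haskell
import Data.List (subsequences)

-- Toy domain of individuals, coded as numbers for brevity.
domain :: [Int]
domain = [1, 2, 3]

-- A generalized quantifier as a function from sets of individuals to Bool:
-- "every odd thing": true of X iff every odd element of the domain is in X.
gq :: [Int] -> Bool
gq xs = all (`elem` xs) [n | n <- domain, odd n]

-- The set it characterizes, {X ⊆ D : gq(X) = 1}: a set of sets.
charSets :: [[Int]]
charSets = [xs | xs <- subsequences domain, gq xs]

main :: IO ()
main = print charSets   -- [[1,3],[1,2,3]]
```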

D<<e,t>,<<e,t>,t>> is isomorphic to pow(pow(D) × pow(D))

Next, we look at the type <<e,t>,<<e,t>,t>> of determiners. A determiner like EVERY is a function which maps a set of individuals BOY onto a set of sets of individuals: (EVERY(BOY)). The latter is a function that maps a set of individuals WALK onto a truth value: ((EVERY(BOY))(WALK)). This means that EVERY is a function which takes a set of individuals (BOY), then takes another set of individuals (WALK), and gives a truth value. But that means that EVERY is a two-place relation between sets of individuals.

In Generalized Quantifier Theory, determiners are interpreted directly as relations between sets of individuals. We have formulas like

EVERY[BOY,WALK]

and EVERY is interpreted as the subset relation between sets of individuals. This is, of course, natural, but now we should note that the isomorphism this uses is not CURRY, but an isomorphism that, for lack of a better term, I will call CORIANDER:

On <e,<e,t>>, CORIANDER is the isomorphism from pow(D×D) into D<e,<e,t>> such that:

for every f ∈ D<e,<e,t>>: CORIANDER⁻¹(f) = {⟨d1,d2⟩ ∈ D×D: [f(d1)](d2) = 1}

We generalize this to other types in the same way as CURRY. So CORIANDER is the obvious isomorphism that we didn't want for type <e,<e,t>>, but it is the one we want for type <<e,t>,<<e,t>,t>>, and that is because determiners, as two-place relations, first combine with the nominal argument and then with the verbal argument. If we used CURRY, we would have to interpret EVERY as the superset relation, and write ((EVERY(BOY))(WALK)) as EVERY(WALK,BOY). But that is pathetic.


Instead, what we do is not use the relational notation introduced above, but a different relational notation (with square brackets):

EVERY[BOY,WALK] := ((EVERY(BOY))(WALK))
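In a finite toy model, the subset relation and its CORIANDER-style currying for EVERY look as follows. This is a Haskell sketch under my own assumptions (the toy sets, the helper subset); it is not from the text.

```haskell
-- Sets as lists; subset is the relational interpretation of EVERY.
subset :: Eq a => [a] -> [a] -> Bool
subset xs ys = all (`elem` ys) xs

-- CORIANDER-style currying: the nominal argument goes in first, so
-- EVERY[BOY,WALK] := ((EVERY(BOY))(WALK)) comes out as boy `subset` walk.
every :: Eq a => [a] -> ([a] -> Bool)
every boy = \walk -> boy `subset` walk

main :: IO ()
main = do
  let boy  = ["bill", "ben"]
      walk = ["bill", "ben", "mary"]
  print (every boy walk)          -- EVERY[BOY,WALK] = 1: every boy walks
```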

While both notation conventions are defined for any relational type, we see that their use is not completely trivial: they reflect a deeper conceptual difference between the semantics of determiners and the semantics of transitive verbs (indicated by the difference in mnemonics).

In general:

Type <a,t> is the type of functions from a-entities into truth values, and this is the type of sets of a-entities.

For some types the set perspective helps, for others it doesn't. Thus:
- <e,t> is usually the type of sets of individuals.
- <<e,t>,t> is sometimes the type of functions from sets of individuals to truth values, sometimes the type of sets of sets of individuals.
- <t,t> is the type of functions from truth values to truth values, the type of one-place truth functions. We don't often think of negation as the set {0}. The functional perspective is dominant here (see the sketch after this list).
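Since the two perspectives are linked by characteristic functions, the switch can be made mechanical. A brief Haskell sketch (the helper names toSet and toFun are mine), including negation seen, unusually, as the set {0}:

```haskell
-- From the function perspective (a -> Bool) to the set perspective,
-- relative to a finite domain, and back.
toSet :: [a] -> (a -> Bool) -> [a]
toSet dom f = filter f dom

toFun :: Eq a => [a] -> (a -> Bool)
toFun xs = (`elem` xs)

main :: IO ()
main = do
  print (toSet [False, True] not)   -- negation as a set: [False], i.e. {0}
  print (toFun [False] True)        -- and back as a function: False
```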

Type <a,<a,t>> is the type of functions from a-entities into functions from a-entities into truth values, and this is the type of relations between a-entities.

- <e,<e,t>> is usually the type of relations between individuals.
- <<e,t>,<<e,t>,t>> is either the type of functions from sets of individuals to sets of sets of individuals, or the type of relations between sets of individuals.
- <t,<t,t>> is the type of functions from pairs of truth values to truth values, two-place truth functions, rarely that of relations between truth values.

Type <b,<a,t>> is the type of functions from b-entities into functions from a-entities into truth values. These can be functions from b-entities into sets of a-entities, or relations between a-entities and b-entities (under CURRY).

Thus <p,<e,t>> is the type of relations between individuals and propositions. But <<e,t>,<e,t>> is the type of functions from sets of individuals to sets of individuals.

In the course of using type theory you will get used to reading types in their intended interpretation quite automatically. Again, technically, the above choices are arbitrary, and you should feel free to use a different isomorphic interpretation if that helps you more. Conceptually, the fact that we naturally think of, say, negation as a function and not at all as a set may well point to something much more fundamental about how our brains work.


2.5. Interpreting type logic.

Let us come back to our sentence (1):

(1) Some old man and every boy kissed and hugged Mary.

We gave the following type logical representation for this sentence:

(((AND1((EVERY(BOY))))((SOME((OLD(MAN)))))) ((((AND2(HUGGED))(KISSED))(MARY)))) ∈ EXPt

Of this, we specified the following parts:

((AND1((EVERY(BOY))))((SOME((OLD(MAN)))))) ∈ EXP<<e,t>,t> applies to:

(((AND2(HUGGED))(KISSED))(MARY)) ∈ EXP<e,t>

AND1  CON<<,t>,<<,t>,<,t>>> It applies first to: (EVERY(BOY))  EXP<,t> and then to: (SOME((OLD(MAN))))  EXP<,t>

EVERY, SOME  CON<,<,t>> BOY, (OLD(MAN))  EXP OLD  CON<, and MAN  CON AND2  CON<>,< >,>>> It applies first to HUGGED and then to KISSED, both constants of type > The result is: ((AND2(HUGGED))(KISSED))  EXP> which applies to MARY  CONe The resulting expression is a wellformed expression of type t. Let M = be a model and g an assignment function. Let us calculate the semantic interpretation of this expression: v(((AND1((EVERY(BOY))))((SOME((OLD(MAN)))))) ((((AND2(HUGGED))(KISSED))(MARY))))bM,g =

(v((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))bM,g (v(((AND2(HUGGED))(KISSED))(MARY)))bM,g) =

(v((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))bM,g ((v((AND2(HUGGED))(KISSED))bM,g(vMARYbM,g)))) =

(v((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))bM,g ((v((AND2(HUGGED))(KISSED))bM,g(F(MARY))))) =

47

(v((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))bM,g (((v(AND2(HUGGED))bM,g(vKISSEDbM,g))(F(MARY))))) =

(v((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))bM,g (((v(AND2(HUGGED))bM,g(F(KISSED)))(F(MARY))))) =

(v((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))bM,g ((((vAND2bM,g(vHUGGEDbM,g))(F(KISSED)))(F(MARY))))) =

(v((AND1((EVERY(BOY))))((SOME((OLD(MAN))))))bM,g ((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =

((v(AND1((EVERY(BOY))))bM,g(v(SOME((OLD(MAN))))bM,g)) ((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =

((v(AND1((EVERY(BOY))))bM,g((vSOMEbM,g(v(OLD(MAN))bM,g)))) ((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =

((v(AND1((EVERY(BOY))))bM,g((F(SOME)(v(OLD(MAN))bM,g)))) ((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =

((v(AND1((EVERY(BOY))))bM,g((F(SOME)((vOLDbM,g(vMANbM,g)))))) ((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =

((v(AND1((EVERY(BOY))))bM,g((F(SOME)((F(OLD)(F(MAN))))))) ((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =

(((vAND1bM,g(v(EVERY(BOY))bM,g))((F(SOME)((F(OLD)(F(MAN))))))) ((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =

(((F(AND1)(v(EVERY(BOY))bM,g))((F(SOME)((F(OLD)(F(MAN))))))) ((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =

(((F(AND1)((vEVERYbM,g(vBOYbM,g))))((F(SOME)((F(OLD)(F(MAN))))))) ((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY))))) =

(((F(AND1)((F(EVERY)(F(BOY)))))((F(SOME)((F(OLD)(F(MAN))))))) ((((F(AND2)(F(HUGGED)))(F(KISSED)))(F(MARY)))))

Believe it or not, this is a truth value. Why? Because F(MAN) is a function in D<e,t> and F(OLD) a function in D<<e,t>,<e,t>>, which means that F(OLD)(F(MAN)) is a function in D<e,t> again. Since F(SOME) is a function in D<<e,t>,<<e,t>,t>>, F(SOME)(F(OLD)(F(MAN))) is a function in D<<e,t>,t>. F(AND1) takes that function and the function F(EVERY)(F(BOY)) and gives you once again a function f ∈ D<<e,t>,t>.


By a similar argument, F(AND2) takes the functions F(HUGGED) and F(KISSED) and gives a function in D<e,<e,t>>. This function takes F(MARY) and gives a function g ∈ D<e,t>. And indeed f(g) ∈ Dt.
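To see the whole calculation go through, here is a worked Haskell sketch of sentence (1) in a small invented model (the model, the lexical facts, and all names are my assumptions; sets are coded as characteristic functions so that the Haskell types mirror the type-logical ones).

```haskell
-- Individuals; the toy domain is {John, Bill, Mary}.
data E = John | Bill | Mary deriving (Eq, Show, Enum, Bounded)

domain :: [E]
domain = [minBound .. maxBound]

type Pred = E -> Bool          -- <e,t>: sets as characteristic functions
type TV   = E -> E -> Bool     -- <e,<e,t>>: object argument goes in first
type GQ   = Pred -> Bool       -- <<e,t>,t>: generalized quantifiers

-- The interpretation function F for the constants (toy lexical facts):
man, boy :: Pred
man x = x == John
boy x = x == Bill

old :: Pred -> Pred            -- <<e,t>,<e,t>>; assumption: John is old
old p x = p x && x == John

kissed, hugged :: TV           -- kissed obj subj: subj kissed obj
kissed obj subj = obj == Mary && (subj == John || subj == Bill)
hugged = kissed                -- same extension in this toy model

some, every :: Pred -> GQ      -- <<e,t>,<<e,t>,t>>: nominal argument first
some  n p = any (\x -> n x && p x) domain
every n p = all (\x -> not (n x) || p x) domain

and1 :: GQ -> GQ -> GQ         -- conjoins two generalized quantifiers
and1 q1 q2 p = q1 p && q2 p

and2 :: TV -> TV -> TV         -- conjoins two transitive verbs
and2 v1 v2 obj subj = v1 obj subj && v2 obj subj

-- ((AND1(EVERY(BOY)))(SOME(OLD(MAN)))) (((AND2(HUGGED))(KISSED))(MARY))
sentence :: Bool
sentence = and1 (every boy) (some (old man)) (and2 hugged kissed Mary)

main :: IO ()
main = print sentence          -- True in this model
```

In this model some old man (John) and every boy (just Bill) kissed and hugged Mary, so the sentence comes out true, by exactly the function-argument steps traced above.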

This means that we have provided a solution to the compositionality problem. But we cannot stop here. We have interpreted every expression as a non-logical constant, and the semantics for our type logical language puts no constraints on the interpretation of the constants (except their typing). But that means that we still do worse than predicate logic, because predicate logic got the truth conditions of our sentence right, and so far we don't. The problem is that at the moment we are saying that every boy denotes some set of sets of individuals, but we are not saying which set. We need to constrain the interpretations. We can easily do so – type theory is rich enough for that. But we can do better, and let the logical theory do the work for us. We have equipped the type logical language with a function definition operator, the lambda operator. With it we can easily find expressions in the logical language that define exactly the meanings we want. To that we now turn.
