
The Compactness Theorem

Mike Prest Department of Mathematics Alan Turing Building University of Manchester Manchester M13 9PL UK [email protected]

July 19, 2017

Is $0.\dot{9} = 1$? That is, is $0.99999\cdots = 1$?

Theorem (Compactness Theorem v.1) If you want something and there’s no reason you can’t have it, then you can get it.

Set $\epsilon = 1 - 0.\dot{9}$. The conditions on $\epsilon$, for it to be nonzero, are: $\{\epsilon < 1/n : n \in \mathbb{Z}^+\} \cup \{\epsilon > 0\}$.
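The sense in which there is “no reason you can’t have it” can be checked concretely: any finite collection of these conditions is met by an ordinary rational number. The following is a minimal Python sketch of that check; the helper name `witness` is hypothetical and is introduced only for illustration.

```python
from fractions import Fraction

def witness(ns):
    """Return a rational eps with 0 < eps < 1/n for every n in the finite set ns.

    `witness` is a hypothetical helper, not something taken from the notes.
    """
    ns = list(ns)
    bound = max(ns) if ns else 1
    return Fraction(1, bound + 1)

# A finite selection of the conditions eps < 1/n, together with eps > 0.
conditions = [1, 2, 10, 1000]
eps = witness(conditions)
assert eps > 0
assert all(eps < Fraction(1, n) for n in conditions)
```

No single real number meets all of the conditions at once, which is why a larger structure is needed.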

There is a non-standard version, $\mathbb{R}^*$, of the reals, which has an infinitesimal (a solution, $\epsilon$, to all those conditions).

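In $\mathbb{R}^*$, we will have $0.\dot{9} < 1$.

$\mathbb{R}^*$ is an elementary extension of $\mathbb{R}$ (said otherwise, $\mathbb{R}$ is an elementary substructure of $\mathbb{R}^*$), meaning that, whenever $\varphi(x)$ is a formula (in $n$ free variables and with parameters from $\mathbb{R}$), the solution set, $\varphi(\mathbb{R})$, of $\varphi$ in $\mathbb{R}$ is the intersection of $\mathbb{R}^n$ with the solution set $\varphi(\mathbb{R}^*)$ in $\mathbb{R}^*$.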

We can “construct” $\mathbb{R}^*$ using ultraproducts. As a simpler case we’ll first construct an ultraproduct of finite fields. First we should define infinite products.

Suppose that $M_1, M_2, \dots, M_i, \dots$ are structures all of the same kind (all groups, or rings, or partially ordered sets, or...). Their product is, as a set, the product

$\prod_{i=1}^{\infty} M_i = M_1 \times M_2 \times \cdots \times M_i \times \cdots$

of their underlying sets. The elements of this set are the sequences

$a = (a_i) = (a_i)_i = (a_i)_{i \in \mathbb{Z}^+} = (a_1, a_2, \dots, a_i, \dots)$

with $a_i \in M_i$ for every $i$. The structure is defined on this set pointwise.

For example, applying this with the component structures being the integers mod $p$ for the various primes $p$, we get the structure $\prod_p \mathbb{Z}_p$ where $p$ ranges over the primes. The structure on this is the structure of a ring: there’s an addition and a multiplication, both defined coordinatewise, an identity $1 = (1_p)_p$ for the multiplication and an identity $0 = (0_p)_p$ for the addition, where $1_p$ denotes the image of $1 \in \mathbb{Z}$ in $\mathbb{Z}_p$ and similarly for $0$.

But the product, though a ring, is not a field. The aim is to produce a structure whose properties are some kind of “average” of the properties of the component structures. Since all those are fields, we should aim to produce a field.
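One way to see that the product is not a field is to exhibit zero divisors. The sketch below is only an illustration on a finite truncation of the product (the first four primes, an assumption made so that the elements can be stored as tuples); the same pattern of zeros and ones works in the full infinite product.

```python
# Finite stand-in for the product of the Z_p: only the first four primes.
PRIMES = [2, 3, 5, 7]

def add(a, b):
    """Coordinatewise addition, coordinate p computed mod p."""
    return tuple((x + y) % p for x, y, p in zip(a, b, PRIMES))

def mul(a, b):
    """Coordinatewise multiplication, coordinate p computed mod p."""
    return tuple((x * y) % p for x, y, p in zip(a, b, PRIMES))

zero = tuple(0 for _ in PRIMES)
one = tuple(1 % p for p in PRIMES)

a = (1, 0, 0, 0)   # 1 in the first coordinate, 0 elsewhere
b = (0, 1, 1, 1)   # 0 in the first coordinate, 1 elsewhere

assert a != zero and b != zero
assert mul(a, b) == zero      # two nonzero elements with product 0
assert add(a, b) == one
```

A field has no zero divisors, so some identification of elements is needed before a field can appear.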

We’re going to identify $a = (a_i)_i$ with $b = (b_i)_i$ if these agree on a “large” set of coordinates. “Large”??

Certainly $I$ itself should be a large set (equal elements should be identified) and the empty set $\emptyset$ should not be large (collapsing all elements together would not give an interesting result). If $J \subseteq I$ is large and $J \subseteq K \subseteq I$ then surely $K$ should also be large.

If we’re going to identify $a$ and $b$ and also identify $b$ and $c$ then we’re going to have to identify $a$ and $c$ (“identification” will be an equivalence relation). Setting $J = \{i \in I : a_i = b_i\}$ and $K = \{i \in I : b_i = c_i\}$ to be the sets where these pairs of elements agree, all we can really say about the set of coordinates where $a_i = c_i$ is that it contains $J \cap K$; so it looks as if we should require $J \cap K$ to be large. Let’s extract those conditions.

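Definition If $I$ is a set then a filter on $I$ is a collection $\mathcal{F}$ of subsets of $I$ such that:
• $I \in \mathcal{F}$;
• $\emptyset \notin \mathcal{F}$;
• if $J \subseteq K \subseteq I$ and $J \in \mathcal{F}$ then $K \in \mathcal{F}$;
• if $J, K \in \mathcal{F}$, then $J \cap K \in \mathcal{F}$.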
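On a small finite index set the four conditions can be checked by brute force. The following Python sketch does that; the function names `powerset` and `is_filter` are hypothetical, and the three-element index set is chosen only for illustration.

```python
from itertools import combinations

def powerset(I):
    """All subsets of I, as frozensets."""
    items = sorted(I)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter(F, I):
    """Check the four filter conditions for a collection F of subsets of I."""
    I = frozenset(I)
    F = set(F)
    if I not in F or frozenset() in F:
        return False
    # Upward closure: J in F and J <= K <= I implies K in F.
    if any(J <= K and K not in F for J in F for K in powerset(I)):
        return False
    # Closure under finite intersection.
    return all((J & K) in F for J in F for K in F)

I = {1, 2, 3}
contains_1 = [S for S in powerset(I) if 1 in S]   # every subset containing 1
assert is_filter(contains_1, I)

# Fails: {1, 2} and {2, 3} are present but their intersection {2} is not.
bad = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 2, 3})]
assert not is_filter(bad, I)
```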

Given a filter $\mathcal{F}$ on $I$, we define an equivalence relation $\sim$ on the product $\prod_{i \in I} M_i$ by $(a_i)_i \sim (b_i)_i$ iff $\{i \in I : a_i = b_i\} \in \mathcal{F}$.

Denote by $\prod_{i \in I} M_i/\mathcal{F}$ the set of equivalence classes, writing $a/{\sim}$ for the equivalence class of any element $a \in \prod_{i \in I} M_i$. We can then turn $\prod_{i \in I} M_i/\mathcal{F}$ into a structure of the same kind as the $M_i$, defining operations and relations pointwise. This structure is called the reduced product of the $M_i$ (with respect to the filter $\mathcal{F}$). If all the structures $M_i$ are copies of the same structure $M$ then we use the notation $M^I/\mathcal{F}$ and refer to this as a reduced power of $M$.

Let’s do this with the field example above, taking $\mathcal{F}$ to be the set of cofinite subsets of $\mathbb{Z}^+$ (those with a finite complement in $\mathbb{Z}^+$). We can then define the algebraic operations by setting

$(a_p)_p/{\sim} + (b_p)_p/{\sim} = (a_p + b_p)_p/{\sim}$

and

$(a_p)_p/{\sim} \times (b_p)_p/{\sim} = (a_p \times b_p)_p/{\sim}$.

It has to be checked that this is well-defined (e.g. that if $(a_p)_p \sim (a'_p)_p$ and $(b_p)_p \sim (b'_p)_p$ then $(a_p + b_p)_p \sim (a'_p + b'_p)_p$) but the conditions in the definition of a filter include what we need to do this.

We can then check that $(0_p)_p/{\sim}$ is the zero for addition and $(1_p)_p/{\sim}$ is the identity for multiplication and, indeed, that all the axioms for a commutative ring are satisfied by $\prod_p \mathbb{Z}_p/\mathcal{F}$. However, it’s still not a field.

Definition An ultrafilter $\mathcal{U}$ on a set $I$ is a filter on $I$ which satisfies the further equivalent conditions:
• for each $J \subseteq I$ either $J \in \mathcal{U}$ or $I \setminus J \in \mathcal{U}$;
• if $J \cup K \in \mathcal{U}$ then either $J \in \mathcal{U}$ or $K \in \mathcal{U}$;
• $\mathcal{U}$ is a maximal filter (meaning that any collection of subsets of $I$ which properly includes all the sets in $\mathcal{U}$ cannot be a filter).
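For a finite index set the ultrafilters can be listed exhaustively. The sketch below, with hypothetical helper names, enumerates every collection of subsets of a three-element set and keeps those satisfying the first of the equivalent conditions. Each one found consists of all the subsets containing one fixed element, so interesting ultrafilters need an infinite index set.

```python
from itertools import combinations

def powerset(I):
    """All subsets of I, as frozensets."""
    items = sorted(I)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter(F, subsets, I):
    """The four filter conditions, checked by brute force."""
    if I not in F or frozenset() in F:
        return False
    upward = all(K in F for J in F for K in subsets if J <= K)
    meets = all((J & K) in F for J in F for K in F)
    return upward and meets

def is_ultrafilter(F, subsets, I):
    """A filter containing exactly one of J and its complement, for every J."""
    return is_filter(F, subsets, I) and all(
        (J in F) != ((I - J) in F) for J in subsets)

I = frozenset({0, 1, 2})
subsets = powerset(I)

# Try all 2**8 collections of subsets of I.
ultrafilters = []
for r in range(len(subsets) + 1):
    for collection in combinations(subsets, r):
        F = set(collection)
        if is_ultrafilter(F, subsets, I):
            ultrafilters.append(F)

# The collections "all subsets containing i", one for each i in I.
generated_by_a_point = [{J for J in subsets if i in J} for i in sorted(I)]

assert len(ultrafilters) == 3
assert all(U in generated_by_a_point for U in ultrafilters)
```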

Let’s continue our example using an ultrafilter $\mathcal{U}$ and check that the quotient ring $F = \prod_p \mathbb{Z}_p/\mathcal{U}$ is a field: if $a/{\sim} = (a_p)_p/{\sim} \neq 0$ then $Z = \{p : a_p = 0\} \notin \mathcal{U}$. Since we have an ultrafilter, it follows that $I \setminus Z = \{p : a_p \neq 0\} \in \mathcal{U}$.

But whenever $a_p \neq 0$, $a_p$ has an inverse, $b_p$ say. Set $b = (b_p)_p$ (where, if $p \in Z$, set $b_p = 0$ say). Then $ab/{\sim} = 1$, so $a/{\sim}$ has a multiplicative inverse, $b/{\sim}$, as required.
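The construction of $b$ can be followed on a finite truncation of the product. In the Python sketch below, the choice of the first six primes and of the element given by the image of 6 in each $\mathbb{Z}_p$ are assumptions made purely for illustration. The product $ab$ agrees with 1 exactly on the coordinates outside $Z$.

```python
# Finite stand-in for the index set: the first six primes.
PRIMES = [2, 3, 5, 7, 11, 13]

# a is the image of 6 in each Z_p; it vanishes exactly at p = 2 and p = 3.
a = tuple(6 % p for p in PRIMES)
Z = {p for p, ap in zip(PRIMES, a) if ap == 0}

# b_p is the inverse of a_p where a_p != 0, and 0 (an arbitrary choice) on Z.
# pow(x, -1, p) computes a modular inverse (Python 3.8+).
b = tuple(pow(ap, -1, p) if ap != 0 else 0 for ap, p in zip(a, PRIMES))

ab = tuple((x * y) % p for x, y, p in zip(a, b, PRIMES))
agree_with_one = {p for p, c in zip(PRIMES, ab) if c == 1 % p}

assert Z == {2, 3}
assert agree_with_one == set(PRIMES) - Z
```

In the ultraproduct, $Z \notin \mathcal{U}$ means its complement is in $\mathcal{U}$, so $ab$ and $1$ are identified.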

We will consider only non-principal ultrafilters. (An ultrafilter on $I$ is principal if it consists of all the subsets of $I$ containing some fixed element $i_0 \in I$.) But are there any? If $I$ is any infinite set then the collection of all cofinite sets is a filter, sometimes called the Fréchet filter $\mathcal{F}_0$. If $\mathcal{U}$ is any ultrafilter containing $\mathcal{F}_0$ then (quick exercise) $\mathcal{U}$ cannot be principal. We do need to call on Zorn’s Lemma to give the existence of a maximal filter (that is, an ultrafilter) containing any given filter.

In the special case where all the ‘component’ structures are the same, $M$, say, there is a natural map from $M$ to any reduced product, $M^* = M^I/\mathcal{F}$, given by taking $a \in M$ to $(a)_i/{\sim}$ (the equivalence class of the constant sequence $(a)_i$). It’s easy to check that this is an embedding of $M$ into $M^*$, termed the diagonal embedding. So $M^*$ has a copy of the original structure sitting inside it. Nothing like this is the case when the components are all different - we saw an ultraproduct of finite fields giving a field of characteristic 0 (which cannot, therefore, contain any finite field).

Theorem (Łoś’ Theorem v.1) If all the component structures (or even just a “large” set of them) have a certain property, then their ultraproduct has that property.

Let’s come back to infinitesimals. We’ll take our index set $I$ to be the set of positive integers. For each $n \in I$, take the structure $M_n$ to be the reals $\mathbb{R}$. Choose (rather, apply Zorn’s Lemma to get) a non-principal ultrafilter $\mathcal{U}$ and form the corresponding ultrapower, $\mathbb{R}^*$. This contains the diagonally-embedded copy of the reals.

Consider the element $\epsilon = (1/n)_n/{\sim} \in \mathbb{R}^*$. Each component is $> 0$ so, from the way we define the ordering in the ultraproduct, $\epsilon > 0$. Also, given a positive integer $n$, for all but finitely many $k$, the $k$th component of $\epsilon$ is $< 1/n$; so, again by the definition of the relation in the ultraproduct, $\epsilon < 1/n$. Thus we have our infinitesimal, $\epsilon$, but we had to move to some “non-standard” version $\mathbb{R}^*$ of the reals to get it.
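The “all but finitely many $k$” step can be made concrete. The Python sketch below lists, within a finite window of indices (the window size is an arbitrary assumption, since the whole index set cannot be enumerated), the coordinates $k$ at which $1/k < 1/n$ fails. The helper name `exceptions` is hypothetical.

```python
from fractions import Fraction

def exceptions(n, window):
    """Indices k in 1..window where the k-th component 1/k is NOT < 1/n."""
    return [k for k in range(1, window + 1)
            if not Fraction(1, k) < Fraction(1, n)]

# For n = 5 the condition fails only at k = 1, ..., 5, however large the window.
assert exceptions(5, 1000) == [1, 2, 3, 4, 5]
assert exceptions(5, 10) == exceptions(5, 1000)
```

The exceptional set is $\{1, \dots, n\}$, so the set of coordinates where the condition holds is cofinite.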

Structures: functions, constants, relations
Equations: between terms involving variables

Definition Let $M$ be a structure. Let $x_1, \dots, x_n$ be variables (“unknowns”) and let $t$, $t'$ be two terms built up from these variables, using the algebraic operations and also allowing the constants, if there are any, to appear. We write $t(x)$ to display the variables. We refer to $t = t'$ as an equation and we define its solution set to be $\{a \in M^n : t(a) = t'(a)\}$.

Definition Suppose first that $M$ is a purely algebraic structure (meaning the “structure” is given by operations and constants - no relations). The definable subsets of (the various finite powers of) $M$ are the sets obtained as follows:
• the solution set of every equation $t = t'$ is a definable subset;
• the complement, $M^n \setminus D$, of any definable subset $D$ of $M^n$ is definable;
• the intersection of any two definable subsets of $M^n$ is definable (therefore, in view of the previous clause, their union also is definable);
• if $D$ is a definable subset of $M^n$ and $i$ is any of $\{1, \dots, n\}$ then the image of $D$ under projection along the $i$th axis, that is $\{(a_1, \dots, a_{i-1}, a_{i+1}, \dots, a_n) : \exists a \in M \text{ with } (a_1, \dots, a_{i-1}, a, a_{i+1}, \dots, a_n) \in D\}$, is a definable subset of $M^{n-1}$.

If the structure $M$ also has relations then we just add those in at the beginning, along with the solution sets of equations. This does make sense, since an $n$-ary relation on a set $M$ is, formally, a subset of $M^n$. For instance, a partial order “$<$” is treated formally as a set of pairs - exactly those pairs $(a, b)$ with $a < b$. So $\{(a, b) \in M^2 : a < b\}$ would be one of the basic definable subsets.
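In a finite structure the clauses of this definition can be carried out literally. The Python sketch below works in the ring of integers mod 5 (a choice made only for illustration) and uses hypothetical helper names. It builds the set of squares as a projection of the solution set of the equation $x \cdot x = y$, and then the non-squares as a complement. Coordinates are numbered from 0 in the code, whereas the definition numbers axes from 1.

```python
from itertools import product

M = range(5)   # the integers mod 5, as an illustrative structure

def solution_set(equation, n):
    """Tuples in M^n satisfying the given equation (a Python predicate)."""
    return {a for a in product(M, repeat=n) if equation(*a)}

def complement(D, n):
    """The complement of D inside M^n."""
    return set(product(M, repeat=n)) - D

def project(D, i):
    """Image of D under projection along coordinate i (0-based)."""
    return {a[:i] + a[i + 1:] for a in D}

# Solution set of the equation x * x = y, a subset of M^2.
graph_of_squaring = solution_set(lambda x, y: (x * x) % 5 == y, 2)

# Projecting along x gives {y : there exists x with x * x = y}.
squares = project(graph_of_squaring, 0)
non_squares = complement(squares, 1)

assert squares == {(0,), (1,), (4,)}
assert non_squares == {(2,), (3,)}
```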

Every definable set has a definition. If you have done a course on Predicate Logic, you will know that the definitions are given by formulas of a formal language, that is, the definable sets are those which are solution sets of formulas in the predicate language appropriate for M. Even if not, you might note that the operations we used are the “boolean” ones, of complement (“not” ¬) and intersection (“and” ∧) (and hence union (“or” ∨)), together with existential quantification = projection (“there exists” ∃) (and hence, using complements, universal quantification (“for all” ∀)).

Theorem (Łoś’ Theorem v.2.1) Suppose that $M^* = \prod_{i \in I} M_i/\mathcal{U}$ is an ultraproduct. Suppose that $\varphi$ is a formula (with free variables $x_1, \dots, x_n$). Then $a = (a^1, \dots, a^n)$ is in the solution set, $\varphi(M^*)$, of $\varphi$ in $M^*$, iff $\{i \in I : a_i \in \varphi(M_i)\}$ is in $\mathcal{U}$, where $a_i = (a^1_i, \dots, a^n_i)$ and $a^j = (a^j_i)_i/{\sim}$.

Theorem (Łoś’ Theorem v.2.2) Suppose that $M^* = \prod_{i \in I} M_i/\mathcal{U}$ is an ultraproduct.
(a) Suppose that $\varphi$ is a formula (with free variables $x_1, \dots, x_n$). Then $a = (a_i)_i/{\sim}$ is in the solution set, $\varphi(M^*)$, of $\varphi$ in $M^*$, iff $\{i \in I : a_i \in \varphi(M_i)\}$ is in $\mathcal{U}$.
(b) Suppose that $\sigma$ is a sentence. Then $\sigma$ is true in $M^*$ iff $\{i \in I : \sigma \text{ is true in } M_i\}$ is in $\mathcal{U}$.
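Part (b) can be illustrated with the sentences that express characteristic. For a prime $q$, let $\sigma_q$ say that $1 + 1 + \cdots + 1$ ($q$ times) equals $0$. The Python sketch below, restricted to the primes below 30 (an assumption made so that the index set can be enumerated) and using a hypothetical function `holds`, finds the set of $p$ for which $\sigma_q$ is true in $\mathbb{Z}_p$.

```python
# Finite stand-in for the index set: the primes below 30.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

def holds(q, p):
    """Whether the sentence 1 + 1 + ... + 1 (q times) = 0 is true in Z_p."""
    return q % p == 0

for q in [2, 3, 5, 7]:
    # sigma_q is true in Z_p only for p = q.
    assert [p for p in PRIMES if holds(q, p)] == [q]
```

Since the set where $\sigma_q$ holds is the single point $\{q\}$, and a non-principal ultrafilter contains no finite set, $\sigma_q$ is false in the ultraproduct for every prime $q$. That is the sense in which an ultraproduct of the $\mathbb{Z}_p$ over a non-principal ultrafilter has characteristic 0.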

Theorem (Compactness theorem, v.2a) Suppose that we have a set $\Sigma$ of sentences in a language appropriate for some specific kind of structure. Suppose that, for every finite subset $S$ of $\Sigma$, there is a structure $M_S$ of that kind which satisfies all the sentences in $S$. Then there is an ultraproduct of the $M_S$ which satisfies all the sentences in $\Sigma$.

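Proof. We take the index set $I$ to be the set of finite subsets $S$ of $\Sigma$, where the structure indexed by $S$ is $M_S$. For each $S \in I$, define $\langle S \rangle = \{S' \in I : S \subseteq S'\}$. Then the set $\mathcal{F} = \{J \subseteq I : J \supseteq \langle S \rangle \text{ for some } S \in I\}$ of subsets of $I$ which contain some set of the form $\langle S \rangle$ is a filter; closure under intersection follows from $\langle S \rangle \cap \langle T \rangle = \langle S \cup T \rangle$. By Zorn’s Lemma there is an ultrafilter $\mathcal{U}$ on $I$ (necessarily non-principal when $\Sigma$ is infinite) containing $\mathcal{F}$. Form the ultraproduct $M^* = \prod_{S \in I} M_S/\mathcal{U}$ and check that this does the job: for each $\sigma \in \Sigma$, the set $\{S \in I : \sigma \text{ is true in } M_S\}$ contains $\langle \{\sigma\} \rangle$, so it lies in $\mathcal{U}$, and Łoś’ Theorem then gives that $\sigma$ is true in $M^*$.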
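The key combinatorial fact behind the filter in this proof is that the intersection of the set of finite supersets of $S$ with the set of finite supersets of $T$ is the set of finite supersets of $S \cup T$. The Python sketch below verifies this by brute force on a four-element stand-in for $\Sigma$; the sentence names and the function name `up` are hypothetical placeholders.

```python
from itertools import combinations

# Hypothetical finite stand-in for the set of sentences.
SIGMA = ["s1", "s2", "s3", "s4"]

# Index set: all (finite) subsets of SIGMA.
I = [frozenset(c) for r in range(len(SIGMA) + 1)
     for c in combinations(SIGMA, r)]

def up(S):
    """The set of all S' in I with S a subset of S'."""
    return {T for T in I if S <= T}

for S in I:
    assert S in up(S)                       # each such set is non-empty
    for T in I:
        assert up(S) & up(T) == up(S | T)   # intersections stay of the same form
```

So the sets of this form are non-empty and closed under finite intersection, which is what is needed for their supersets to form a filter.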

Theorem (Compactness theorem, v.2b) Suppose that $M$ is a structure and that $\Phi$ is a set of formulas with free variables (among) $x = (x_1, \dots, x_n)$ (in a language appropriate for $M$). Suppose that, for every finite subset $S$ of $\Phi$, there is $a_S \in M^n$ which satisfies all the formulas in $S$. Then there is an ultrapower $M^*$ of $M$ and $a \in (M^*)^n$ which satisfies all the formulas in $\Phi$.

Proof. Similar to that above. In fact, it can be made into a special case by introducing $n$ new constant symbols to replace the variables $x_1, \dots, x_n$, so that a formula $\varphi(x)$ can be replaced by a sentence in this slightly enriched language, thus replacing $\Phi$ by a set of sentences, which can then be fed into the version we proved.

Sketch proof of Łoś’ Theorem: Suppose that $M^* = \prod_{i \in I} M_i/\mathcal{U}$ is an ultraproduct. The assertion is that, given a formula $\psi$,

$(*)$ for all $a$, we have $a \in \psi(M^*)$ iff $\{i \in I : a_i \in \psi(M_i)\}$ is in $\mathcal{U}$,

where $x = (x_1, \dots, x_n)$, $a = (a^1, \dots, a^n) \in (M^*)^n$, $a^j = (a^j_i)_i/{\sim}$ and $a_i = (a^1_i, \dots, a^n_i)$. This is proved “by induction on the complexity of $\psi$” (and with the prefix “for all $a$”).

Base cases: If $t$ and $t'$ are terms built from variables $y = (y_1, \dots, y_m)$ and $b = (b^1, \dots, b^m) \in (M^*)^m$, where $b^k = (b^k_i)_i/{\sim}$ and $b_i = (b^1_i, \dots, b^m_i)$, then $t(b) = t'(b)$ iff $\{i \in I : t(b_i) = t'(b_i)\} \in \mathcal{U}$. This statement, in turn, has to be proved by induction on complexity of terms (how they are built up from the variables and constant symbols by successively applying function symbols).

The induction steps have the following three forms.
• If $\psi$ and $\psi'$ are formulas and if each of these satisfies $(*)$, then so does the conjunction $\psi \wedge \psi'$ (the proof uses the closure of $\mathcal{U}$ under intersections).
• If $\psi$ is a formula which satisfies $(*)$, then so does the negation $\neg\psi$ (the proof of this uses that $\mathcal{U}$ is actually an ultrafilter).
• If $\psi$ is a formula which satisfies $(*)$ and $y$ is a variable then $\exists y\, \psi$ satisfies $(*)$.
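The conjunction and negation steps rest on two set identities, written out below as a sketch in the notation of the statement $(*)$ rather than as a full proof.

```latex
\begin{align*}
\{i \in I : a_i \in (\psi \wedge \psi')(M_i)\}
  &= \{i \in I : a_i \in \psi(M_i)\} \cap \{i \in I : a_i \in \psi'(M_i)\}, \\
\{i \in I : a_i \in (\neg\psi)(M_i)\}
  &= I \setminus \{i \in I : a_i \in \psi(M_i)\}.
\end{align*}
```

For the first, an intersection of two sets lies in $\mathcal{U}$ iff both sets do: one direction is closure under intersection, the other is closure under supersets. For the second, the complement of a set lies in $\mathcal{U}$ iff the set itself does not, which is exactly the ultrafilter condition.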
