Recursion Theory Notes, Fall 2011

Lecturer: Lou van den Dries

0.1 Introduction

Recursion theory (or: theory of computability) is a branch of mathematical logic studying the notion of computability from a rather theoretical point of view. This includes giving a lot of attention to what is not computable, or what is computable relative to any given, not necessarily computable, function. The subject is interesting on philosophical-scientific grounds because of the Church-Turing Thesis and its role in computer science, and because of the intriguing concept of Kolmogorov complexity. This course tries to keep in touch with how recursion theory actually impacts mathematics and computer science. This impact is small, but it does exist.

Accordingly, the first chapter of the course covers the basics: primitive recursion, partial recursive functions and the Church-Turing Thesis, arithmetization and the theorems of Kleene, the halting problem and Rice's theorem, recursively enumerable sets, selections and reductions, recursive inseparability, and index systems. (Turing machines are briefly discussed, but the arithmetization is based on a numerical coding of combinators.) The second chapter is devoted to the remarkable negative solution (but with positive aspects) of Hilbert's 10th Problem. This uses only the most basic notions of recursion theory, plus some elementary number theory that is worth knowing in any case; the coding aspects of natural numbers are combined here ingeniously with the arithmetic of the semiring of natural numbers. The last chapter is on Kolmogorov complexity, where concepts of recursion theory are used to define and study notions of randomness and information content. This also involves a bit of measure theory. The first and last chapter are largely based on a set of notes written by Christian Rosendal, and in the second chapter I follow mostly the treatment in Smorynski's excellent book Logical Number Theory I.

An obvious omission in the material above is that of relative computability. This topic merges with eﬀective descriptive set theory and deserves to be included, but there is only so much one can do in one semester. I have opted for a rather careful treatment of fewer topics.

Notation: N = {0, 1, 2,...} is the set of natural numbers, including 0, and a, b, c, d, e and k, l, m, n (sometimes with subscripts or accents) range over N.

Chapter 1

Basic Recursion Theory

In this chapter, x, y, z, sometimes with subscripts, range over N as well. (In later chapters, x, y, z can be something else.)

1.1 Primitive Recursion

We define F_d to be the set of all functions f : N^d → N, and put F := ∪_d F_d. We identify F_0 with N in the obvious way. For i = 1, …, d, the ith projection function P^i_d : N^d → N is given by P^i_d(x_1, …, x_d) = x_i. The successor function S : N → N is given by S(x) = x + 1. To denote functions, we sometimes borrow a notation from λ-calculus: if t(x_1, …, x_d) is an expression such that t(a_1, …, a_d) ∈ N for all a_1, …, a_d, then λx_1 … x_d. t(x_1, …, x_d) denotes the corresponding function

(a_1, …, a_d) ↦ t(a_1, …, a_d) : N^d → N.

For example, S = λx. x + 1 and P^i_d = λx_1 … x_d. x_i. Now we define substitution. For g ∈ F_n and f_1, …, f_n ∈ F_d, g(f_1, …, f_n) is the function λx_1 … x_d. g(f_1(x_1, …, x_d), …, f_n(x_1, …, x_d)) in F_d. For any g ∈ F_d and h ∈ F_{d+2}, there is a unique f ∈ F_{d+1} such that for all x_1, …, x_d, y:

f(x1, . . . , xd, 0) = g(x1, . . . , xd),

f(x1, . . . , xd, y + 1) = h(x1, . . . , xd, y, f(x1, . . . , xd, y)).

This function f is said to be obtained by primitive recursion from g and h.
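The schema above can be sketched in code. The following Python helper is our own illustration, not part of the notes; the names prim_rec, add and mult are ours. It builds f from g and h exactly as in the two displayed equations.

```python
# Primitive recursion as an operator: given g (d-ary) and h ((d+2)-ary),
# prim_rec(g, h) returns the unique f with
#   f(x1,...,xd, 0)   = g(x1,...,xd)
#   f(x1,...,xd, y+1) = h(x1,...,xd, y, f(x1,...,xd, y)).

def prim_rec(g, h):
    def f(*args):
        *xs, y = args
        val = g(*xs)            # f(xs, 0)
        for i in range(y):      # build f(xs, 1), ..., f(xs, y)
            val = h(*xs, i, val)
        return val
    return f

# Addition from g(x) = x, h(x, y, z) = z + 1, and multiplication from
# g(x) = 0, h(x, y, z) = z + x.
add = prim_rec(lambda x: x, lambda x, y, z: z + 1)
mult = prim_rec(lambda x: 0, lambda x, y, z: z + x)
```

These two instances are exactly the first two examples worked out in 1.1.2.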

1.1.1 Primitive Recursive Functions

Loosely speaking, the functions obtainable by the above procedures are the primitive recursive functions. More precisely,

Definition 1. The set PR of primitive recursive functions is the smallest subset of F such that, with PR_d := PR ∩ F_d,

(a) all constant functions in F belong to PR;

(b) S ∈ PR_1;

(c) P^i_d ∈ PR_d for i = 1, …, d;

(d) if g ∈ PR_n, f_1, …, f_n ∈ PR_d, then g(f_1, …, f_n) ∈ PR_d (closure under substitution);

(e) if g ∈ PR_d and h ∈ PR_{d+2} and f ∈ F_{d+1} is obtained by primitive recursion from g, h, then f ∈ PR_{d+1} (closure under primitive recursion).

Essentially all functions in F that arise in ordinary mathematical practice are primitive recursive. For example, the greatest common divisor function gcd is primitive recursive, but proving such facts will be much easier once we have some lemmas available.

In addition to functions, some sets will be called primitive recursive. Recall that the characteristic function of a set A ⊆ N^d is the function χ_A : N^d → N defined by χ_A(~x) = 1 if ~x ∈ A and χ_A(~x) = 0 if ~x ∈ N^d ∖ A.

Definition 2. A set A ⊆ N^d is said to be primitive recursive if its characteristic function χ_A is primitive recursive.

A set A ⊆ N^d is often construed as a d-ary relation on N, and so, when ~x ∈ A, we often write A(~x) instead of ~x ∈ A.

1.1.2 Basic Examples

• The addition operation + : N^2 → N is primitive recursive: let g = P^1_1 ∈ PR_1, h = S(P^3_3) ∈ PR_3, i.e. g(x) = x and h(x, y, z) = z + 1. Then + is obtained by primitive recursion from g, h:

x + 0 = x = g(x), x + (y + 1) = (x + y) + 1 = h(x, y, x + y).

• The multiplication operation · : N^2 → N is primitive recursive. Indeed, let g ∈ PR_1, h ∈ PR_3 be given by

g(x) = 0, h(x, y, z) = z + x (= +(P^3_3, P^1_3)(x, y, z)).

Then · is obtained by primitive recursion from g, h, since x · 0 = 0 = g(x) and x · (y + 1) = xy + x = h(x, y, xy).

• λxy. x^y is primitive recursive: x^0 = 1, x^{y+1} = x^y · x.

• The monus function ∸ : N^2 → N, where

x ∸ y = x − y if x ≥ y, and x ∸ y = 0 otherwise,

is primitive recursive. First, we observe that λx. x ∸ 1 is primitive recursive, as 0 ∸ 1 = 0 and (x + 1) ∸ 1 = x. Then ∸ itself is primitive recursive: x ∸ 0 = x, x ∸ (y + 1) = (x ∸ y) ∸ 1.

• sign : N → N, given by

sign(x) = 1 if x > 0, sign(x) = 0 if x = 0,

is primitive recursive, for sign(0) = 0, sign(x + 1) = 1.

• The set A = {(x, y) ∈ N^2 : x < y} ⊆ N^2 is primitive recursive since χ_A(x, y) = sign(y ∸ x).

• The diagonal ∆ = {(x, y) ∈ N^2 : x = y} ⊆ N^2 is primitive recursive:

χ_∆(x, y) = 1 ∸ ((x ∸ y) + (y ∸ x)).

• For A, B ⊆ N^d we have χ_∅ = 0, χ_{A∩B} = χ_A · χ_B, and χ_{¬A} = 1 ∸ χ_A, where ¬A := N^d ∖ A. Therefore, the primitive recursive subsets of N^d are the elements of a boolean algebra of subsets of N^d.

• Definition by cases: Suppose A ⊆ N^d and f, g ∈ F_d are primitive recursive. Then the function h : N^d → N given by

h(~x) = f(~x) for ~x ∈ A, h(~x) = g(~x) for ~x ∉ A,

is primitive recursive because h = f · χ_A + g · χ_{¬A}.

• Iterated sums/products: Let f ∈ F_{d+1} and let

g = λx_1 … x_d y. Σ_{i=0}^{y} f(x_1, …, x_d, i).

Note that g(~x, 0) = f(~x, 0) and g(~x, y + 1) = g(~x, y) + f(~x, y + 1). Thus if f is primitive recursive, so is g, and vice-versa. Similarly, if f is primitive recursive, so is

λx_1 … x_d y. Π_{i=0}^{y} f(x_1, …, x_d, i).

Notation. Let p(i) be a condition on i ∈ N: for each i ∈ N either p(i) holds, or p(i) does not hold. Then µi≤y p(i) denotes the least i ≤ y such that p(i) holds if there is such an i, and if no such i exists, then µi≤y p(i) := 0. For example, if A ⊆ N, then µi≤3 i ∈ A is one of the numbers 0, 1, 2, 3.

• Bounded search: Let A ⊆ N^{d+1}, and define f ∈ F_{d+1} by

f(~x, y) = µi≤y A(~x, i).

If A is primitive recursive, then so is f. This is because f(~x, 0) = 0 and

f(~x, y + 1) = f(~x, y) if Σ_{i=0}^{y} χ_A(~x, i) ≥ 1,
f(~x, y + 1) = y + 1 if Σ_{i=0}^{y} χ_A(~x, i) = 0 and A(~x, y + 1),
f(~x, y + 1) = 0 if Σ_{i=0}^{y} χ_A(~x, i) = 0 and ¬A(~x, y + 1).

More generally, if A ⊆ N^{d+1} and g ∈ F_{d+1} are primitive recursive, so is the function λ~x y. µi≤g(~x,y) A(~x, i) (this is a function in F_{d+1}).

• Bounded quantification: Let A ⊆ N^{d+1} and define:

∃≤A = {(~x, y) ∈ N^{d+1} : ∃i ≤ y A(~x, i)} ⊆ N^{d+1},
∀≤A = {(~x, y) ∈ N^{d+1} : ∀i ≤ y A(~x, i)} ⊆ N^{d+1}.

Since χ_{∃≤A}(~x, y) = sign(Σ_{i=0}^{y} χ_A(~x, i)) and χ_{∀≤A}(~x, y) = sign(Π_{i=0}^{y} χ_A(~x, i)), it follows that if A is primitive recursive, then so are ∃≤A and ∀≤A.
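As a quick illustration, bounded search and bounded quantification transcribe directly into Python. This is a sketch of ours: the notes work with characteristic functions of sets, which we replace here by a boolean-valued predicate p.

```python
# mu_{i<=y} p(i): the least i <= y with p(i), and 0 if there is none.
def mu_bounded(p, y):
    for i in range(y + 1):
        if p(i):
            return i
    return 0

# Characteristic functions of (exists i <= y) p(i) and (forall i <= y) p(i).
def exists_leq(p, y):
    return 1 if any(p(i) for i in range(y + 1)) else 0

def forall_leq(p, y):
    return 1 if all(p(i) for i in range(y + 1)) else 0
```

Note the convention of the notes: when no witness exists below the bound, mu_bounded returns 0 rather than failing.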

Exercise. Prove that gcd : N2 → N is primitive recursive. Here, gcd(0, 0) = 0, and gcd(x, y) is the greatest d such that d|x and d|y, for x, y not both zero.

1.1.3 Coding

Enumerate N^2 diagonally as follows:

(0,0) (1,0) (0,1) (2,0) (1,1) (0,2) (3,0) ···
  0    1    2    3    4    5    6   ···

where each pair in the top row corresponds to the number below it.

This bijection (x, y) ↦ |x, y| : N^2 → N is given by:

|x, y| = (x + y)(x + y + 1)/2 + y.

This bijection is primitive recursive: use that

|x, y| = µi≤f(x,y) (2i = f(x, y)),

with f : N^2 → N given by f(x, y) = (x + y)(x + y + 1) + 2y, so f is primitive recursive. Note also that x ≤ |x, y| and y ≤ |x, y| for all x, y.

Proposition 3. For d ≥ 1, we have primitive recursive functions

⟨ ⟩^d : N^d → N and ( )^d_1, …, ( )^d_d : N → N,

such that ⟨ ⟩^d is a bijection and ⟨(x)^d_1, …, (x)^d_d⟩^d = x for all x.

Proof. For d = 1, take both ⟨ ⟩^1 and ( )^1_1 as the identity function on N. For d = 2, take ⟨x, y⟩^2 = |x, y| and define ( )^2_1 and ( )^2_2 by bounded search:

(x)^2_1 := µi≤x (∃y ≤ x |i, y| = x), (x)^2_2 := µi≤x (∃y ≤ x |y, i| = x).

We now proceed by induction. Given ⟨ ⟩^d and ( )^d_1, …, ( )^d_d with the desired properties, and d ≥ 2, put

⟨x_1, …, x_d, y⟩^{d+1} = |⟨x_1, …, x_d⟩^d, y|, and

(x)^{d+1}_i = ((x)^2_1)^d_i for i = 1, …, d, (x)^{d+1}_{d+1} = (x)^2_2.

Note that the identities just displayed also hold for d = 1.
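The pairing function and the d-tuple coding of the proposition are easy to experiment with. Below is a Python sketch; the function names pair, unpair, tup, untup are ours. The inverses use bounded search, which works because x ≤ |x, y| and y ≤ |x, y|.

```python
# |x, y| = (x+y)(x+y+1)/2 + y, the diagonal pairing bijection N^2 -> N.
def pair(x, y):
    return (x + y) * (x + y + 1) // 2 + y

# Invert by bounded search: both components are at most n.
def unpair(n):
    for x in range(n + 1):
        for y in range(n + 1):
            if pair(x, y) == n:
                return (x, y)

# <x1>^1 = x1 and <x1,...,xd,y>^{d+1} = |<x1,...,xd>^d, y|.
def tup(*xs):
    code = xs[0]
    for y in xs[1:]:
        code = pair(code, y)
    return code

# Invert tup for a known arity d, peeling off the last component d-1 times.
def untup(n, d):
    ys = []
    for _ in range(d - 1):
        n, y = unpair(n)
        ys.append(y)
    return tuple([n] + ys[::-1])
```

The first test below reproduces the diagonal enumeration table above.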

1.1.4 More General Recursions

Let us consider first double recursions: Suppose g, g′ ∈ F_d and h, h′ ∈ F_{d+3} are primitive recursive. Let f, f′ ∈ F_{d+1} be given by:

f(~x, 0) = g(~x), f(~x, y + 1) = h(~x, y, f(~x, y), f′(~x, y)),
f′(~x, 0) = g′(~x), f′(~x, y + 1) = h′(~x, y, f(~x, y), f′(~x, y)).

Then f and f′ are primitive recursive. To see this, we use the above coding method and define t ∈ F_{d+1} by t(~x, 0) = |g(~x), g′(~x)|, and

t(~x, y + 1) = |h(~x, y, (t(~x, y))^2_1, (t(~x, y))^2_2), h′(~x, y, (t(~x, y))^2_1, (t(~x, y))^2_2)|.

Thus t is primitive recursive. We have f(~x, y) = (t(~x, y))^2_1 and f′(~x, y) = (t(~x, y))^2_2, so f and f′ are primitive recursive.

It turns out that if a function is defined recursively in terms of several of its previous values (the most general case being that f(~x, y + 1) is computed in terms of f(~x, 0), …, f(~x, y)), then it is still primitive recursive. To deal with

this situation, we use a Skolem trick. Given f ∈ F_{d+1}, define f̂ ∈ F_{d+1} by f̂(~x, 0) = f(~x, 0), and f̂(~x, y) = |f(~x, y), f̂(~x, y − 1)| for y > 0. So f̂(~x, y) encodes the values of f at (~x, 0), (~x, 1), …, (~x, y).

Let g ∈ F_d and h ∈ F_{d+2}, and define f ∈ F_{d+1} by

f(~x, 0) = g(~x) and f(~x, y + 1) = h(~x, y, f̂(~x, y)).

Claim. If g and h are primitive recursive, then so is f.

Proof. Assume g and h are primitive recursive. To obtain that f is primitive recursive, it suffices to show that f̂ is primitive recursive, because

f(~x, y) = f̂(~x, y) if y = 0, and f(~x, y) = (f̂(~x, y))^2_1 if y > 0.

But f̂(~x, 0) = g(~x), and

f̂(~x, y + 1) = |f(~x, y + 1), f̂(~x, y)| = |h(~x, y, f̂(~x, y)), f̂(~x, y)|,

so f̂ is primitive recursive, as desired.
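To see the Skolem trick in action, here is a Python sketch of ours (the names cov_rec and latest are hypothetical): the running code hist plays the role of f̂(~x, y), and h may recover any earlier value from it by unpairing.

```python
def pair(x, y):
    return (x + y) * (x + y + 1) // 2 + y

def unpair(n):
    for x in range(n + 1):
        for y in range(n + 1):
            if pair(x, y) == n:
                return (x, y)

# Recover f(xs, i) from fhat(xs, i), mirroring the case split in the Claim.
def latest(hist, i):
    return hist if i == 0 else unpair(hist)[0]

# Course-of-values recursion: f(xs, 0) = g(xs) and
# f(xs, y+1) = h(xs, y, fhat(xs, y)), where fhat encodes all earlier values.
def cov_rec(g, h):
    def f(*args):
        *xs, y = args
        hist = g(*xs)                  # fhat(xs, 0) = f(xs, 0)
        cur = hist
        for i in range(y):
            cur = h(*xs, i, hist)      # f(xs, i+1)
            hist = pair(cur, hist)     # fhat(xs, i+1) = |f(xs,i+1), fhat(xs,i)|
        return cur
    return f

# Example of ours: tri(0) = 0, tri(y+1) = tri(y) + (y+1), the triangular numbers.
tri = cov_rec(lambda: 0, lambda i, hist: latest(hist, i) + i + 1)
```

Here h only consults the latest value; a recursion using several previous values would simply unpair hist further.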

Exercise. Show that the Fibonacci sequence (Fn) deﬁned by F0 = 0,F1 = 1 and Fn = Fn−1 + Fn−2 for n ≥ 2, is primitive recursive.

For later use we also need an encoding of ﬁnite sequences of variable length such that the length of the sequence can be decoded from its code. We deﬁne

⟨m_1, …, m_d⟩ := ⟨m_1, …, m_d, d⟩^{d+1}.

Note that then

⟨m_1, …, m_d⟩ = ⟨n_1, …, n_e⟩ ⇐⇒ d = e and m_1 = n_1, …, m_d = n_d.

Define lh : N → N by lh(n) = (n)^2_2, so that lh(⟨n_1, …, n_d⟩) = d. It is clear that lh is primitive recursive. Let B ⊆ N be the set of all ⟨n_1, …, n_d⟩ for all d.

Claim. B = {b ∈ N : (b)^2_2 > 0} ∪ {0}.

To see this, note that if (b)^2_2 = d > 0, then b = |a, d| with a ∈ N, so a = ⟨n_1, …, n_d⟩^d for suitable n_1, …, n_d, hence b = ⟨n_1, …, n_d⟩ ∈ B. It remains to note that the only element ⟨n_1, …, n_d⟩ of B with d = 0 is ⟨0⟩^1 = 0.

It follows from the claim that B is primitive recursive. Next, we introduce a primitive recursive function (n, j) ↦ (n)_j : N^2 → N such that for n = ⟨n_1, …, n_d⟩ and 1 ≤ j ≤ d we have (n)_j = n_j, and thus (n)_j < n. First we define a primitive recursive f : N^2 → N by

f(n, 0) = n, f(n, i + 1) = (f(n, i))^2_1.

It is easy to check that for n = ⟨n_d, n_{d−1}, …, n_1⟩ and 1 ≤ i ≤ d we have

f(n, i) = ⟨n_d, …, n_i⟩^{d+1−i}.

Now define the primitive recursive function (n, j) ↦ (n)_j : N^2 → N by

(n)_j := (f(n, (lh(n) + 1) ∸ j))^2_2 for j ≠ 1, (n)_1 := f(n, lh(n)).

This function has the desired properties, as is easily verified.

1.2 Partial Recursive Functions

From the way we defined “primitive recursive”, it is clear that each primitive recursive function is computable in an intuitive sense. A standard diagonalization argument, however, yields a computable function f : N → N that is not primitive recursive. This argument goes as follows: one can effectively produce a list f_0, f_1, f_2, … of all functions in PR_1. (Any primitive recursive function N → N may appear infinitely often in this list.) Here, effective means that we have an algorithm/program that on any input (m, n) computes f_m(n). Now define f : N → N, f(n) := f_n(n) + 1.

Then f is computable in the intuitive sense, but f ≠ f_n for every n, so f cannot be primitive recursive. More generally, any effective method of characterizing the intuitive notion of (total) computable function N → N must fail by a similar diagonalization. It turns out that we can effectively characterize a more general notion of partial computable function N * N that does correspond to the intuitive notion of what is computable by an algorithm; the key point is that algorithms may fail to terminate on some inputs. Accordingly, we extend our notion of primitive recursive function to that of partial recursive function, essentially by allowing unbounded search.

Notation. For sets P, Q, we denote by f : P * Q a partial function f from P into Q, that is, a function from a set D(f) ⊆ P into Q; then

f(p) ↓ (in words: f converges at p) means that p ∈ D(f), and

f(p) ↑ (in words: f diverges at p) means that p ∈ P but p ∉ D(f).

Let f : N^{d+1} * N. Then λ~x. (µy f(~x, y) = 0) denotes the partial function g : N^d * N such that for ~x ∈ N^d:

g(~x) ↓ ⇐⇒ ∃y (∀i ≤ y f(~x, i) ↓, ∀i < y f(~x, i) ≠ 0, and f(~x, y) = 0),

and if g(~x) ↓, then g(~x) is the unique y witnessing the right-hand side in the above equivalence.

Let g : N^n * N and f_1, …, f_n : N^d * N. Then g(f_1, …, f_n) : N^d * N is given by:

g(f_1, …, f_n)(~x) ↓ ⇐⇒ f_1(~x) ↓, …, f_n(~x) ↓ and g(f_1(~x), …, f_n(~x)) ↓,

and if g(f_1, …, f_n)(~x) ↓, then g(f_1, …, f_n)(~x) = g(f_1(~x), …, f_n(~x)).

Given g : N^d * N, h : N^{d+2} * N there is clearly a unique f : N^{d+1} * N such that for all ~x ∈ N^d and y ∈ N:

• f(~x, 0) ↓ ⇐⇒ g(~x) ↓; if f(~x, 0) ↓, then f(~x, 0) = g(~x).

• f(~x, y + 1) ↓ ⇐⇒ f(~x, y) ↓ and h(~x, y, f(~x, y)) ↓; if f(~x, y + 1) ↓, then f(~x, y + 1) = h(~x, y, f(~x, y)).

This f is said to be obtained by primitive recursion from g, h.

Definition 4. The set of partial recursive functions is the smallest set of partial functions N^d * N for d = 0, 1, 2, … such that:

(a) the function O : N^0 → N with value 0 is partial recursive, and the successor function S : N → N is partial recursive;

(b) the functions P^i_d, for 1 ≤ i ≤ d, are partial recursive;

(c) whenever g : N^m * N, h_1, …, h_m : N^d * N are partial recursive, so is g(h_1, …, h_m);

(d) whenever g : N^d * N, h : N^{d+2} * N are partial recursive, then so is the function obtained from g, h by primitive recursion;

(e) whenever f : N^{d+1} * N is partial recursive, then so is

λ~x. (µy f(~x, y) = 0) : N^d * N.

A recursive function is a partial recursive function f : N^d → N, the notation here indicating that D(f) = N^d. A set A ⊆ N^d is recursive if χ_A : N^d → N is recursive. It is easy to check that primitive recursive functions are recursive.
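Unbounded search is the one genuinely partial ingredient. In the Python sketch below (ours; the name mu is hypothetical), the while loop may run forever, and that is exactly divergence.

```python
# mu y . (f(xs, y) = 0): least y with f(xs, y) = 0.  The search diverges
# (loops forever) if there is no such y, or if f itself diverges along
# the way -- this is how partial functions arise from total-looking code.
def mu(f):
    def g(*xs):
        y = 0
        while f(*xs, y) != 0:
            y += 1
        return y
    return g

# Example of ours: integer square root, a total recursive function,
# as isqrt(x) = mu y . ((y+1)^2 > x), coding "true" as 0.
isqrt = mu(lambda x, y: 0 if (y + 1) ** 2 > x else 1)
```

Feeding mu a test that never hits 0, e.g. mu(lambda x, y: 1), yields a partial function with empty domain.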

Lemma 5. Given f : N^d → N, the following are equivalent:

(a) f is recursive;

(b) graph(f) ⊆ N^{d+1} is recursive.

Proof. Use that χ_{graph(f)}(~x, y) = χ_∆(f(~x), y). In the other direction, use

f(~x) = µy (χ_{graph(f)}(~x, y) = 1).

By arguments similar to those for primitive recursive sets one obtains:

Theorem 6. The class of recursive subsets of N^d is closed under finite unions, intersections and complements. If A ⊆ N^{d+1} is recursive, so are ∃≤A and ∀≤A as subsets of N^{d+1}.

Exercise. Show that if f : N → N is a recursive bijection, then so is its inverse. Show that there is a primitive recursive bijection N → N whose inverse is not primitive recursive. (For the second problem, use that there is a recursive function N → N that is not primitive recursive.)

1.2.1 Turing Machines

Definition 7. A Turing machine consists of a biinfinite tape of successive boxes,

[picture: the tape, with the head, in state q_i, scanning one box]

and a head that can read, erase, write in the box in case it is blank, and move to the box on the left or the right. Moreover, there are given

• A finite alphabet Σ = {s_0, s_1, …, s_n} with n ≥ 1, s_i ≠ s_j when i ≠ j, with distinguished symbols s_0 := # for ‘blank’, and s_1 := 0.

• A finite set of internal states Q = {q_0, q_1, …, q_m} with m ≥ 1, q_i ≠ q_j for i ≠ j, with distinguished states q_0 (the initial state) and q_m := q_f (the final state).

• A finite set {I_1, …, I_p} of instructions, each of one of the following three types:

(a) q_a s_b s_c q_d: if in state q_a reading s_b, erase s_b, write s_c in its place, and enter state q_d;

(b) q_a s_b R q_d: if in state q_a reading s_b, move to the box on the right, and enter state q_d;

(c) q_a s_b L q_d: if in state q_a reading s_b, move to the box on the left, and enter state q_d.

This set of instructions should be complete: for each q_a ≠ q_f and each symbol s_b there is exactly one instruction of the form q_a s_b …, and there is no instruction of the form q_f ….

Formally, a Turing machine is just a triple (Σ, Q, {I_1, …, I_p}) as above.

Definition 8. Let M be a Turing machine with alphabet {#, 0, 1}. We say that M computes the partial function f : N^d * N if, for any input (x_1, …, x_d) presented as follows:

[picture: the head, in state q_0, at the left of a tape segment containing blocks of x_1, x_2, …, x_d ones separated by 0’s, the rest of the tape blank]

the machine, applying the instructions,

• either never enters q_f, and then f(~x) ↑,

• or eventually enters state q_f, and then f(~x) ↓, with its head and tape as follows:

[picture: the head, in state q_f, at a tape segment containing a block of f(x_1, …, x_d) ones, the rest of the tape blank]

Fact. Turing Computable = Partial Recursive.

Turing gave a compelling analysis of the intuitive concept of computability, in 1936, and this led him to identify it with the precise notion of Turing computability. Turing machines remain important as a model of computation in connection with complexity theory. But in the rest of this course we deal directly with partial recursive functions without using Turing machines.
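For concreteness, here is a minimal Turing machine simulator in Python. It is a sketch under our own encoding, not part of the notes: instructions form a dict from (state, scanned symbol) to one of the three instruction types of Definition 7, and the example machine is ours.

```python
BLANK = '#'

def run(instructions, tape, state='q0', pos=0, final='qf', max_steps=10_000):
    """Run a machine on a tape (dict position -> symbol) until it enters
    the final state; return the tape and head position.  A machine that
    never enters qf is a diverging computation; the step bound only
    keeps this demonstration finite."""
    tape = dict(tape)
    for _ in range(max_steps):
        if state == final:
            return tape, pos
        act = instructions[(state, tape.get(pos, BLANK))]
        if act[0] == 'write':          # type (a): overwrite, change state
            tape[pos], state = act[1], act[2]
        elif act[0] == 'R':            # type (b): move right
            pos, state = pos + 1, act[1]
        else:                          # type (c): move left
            pos, state = pos - 1, act[1]
    raise RuntimeError('no halt within step bound')

# Example machine of ours: sweep right over 0's and 1's and halt on the
# first blank.  It is complete: one instruction per (state, symbol) with
# state != qf, and none for qf.
M = {
    ('q0', '1'): ('R', 'q0'),
    ('q0', '0'): ('R', 'q0'),
    ('q0', BLANK): ('write', BLANK, 'qf'),
}
tape, pos = run(M, {0: '0', 1: '1', 2: '1', 3: '0'})
```

On the four-symbol tape above, the head ends on the first blank to the right of the input.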

1.3 Arithmetization and Kleene’s Theorems

We shall give numerical codes to the programs that compute partial recursive functions. First, we introduce formal expressions that specify these programs. These expressions will be called combinators and they are words on the alphabet with the following distinct symbols:

(a) the two symbols O and S,

(b) for each pair i, d such that 1 ≤ i ≤ d a symbol P^i_d,

(c) for each pair m, d a symbol S^m_d (a substitution symbol),

(d) for each d a symbol R_d (a primitive recursion symbol),

(e) for each d a symbol S_d (an unbounded search symbol).

Each combinator has a specific arity (a natural number) and each d-ary combinator f is a word on this alphabet, and has associated to it a partial recursive function f̂ : N^d * N. The definition is inductive:

(a) The word O of length 1 is a nullary combinator with associated function N^0 → N taking the value 0. The word S of length 1 is a unary combinator with associated function x ↦ x + 1 : N → N.

(b) For 1 ≤ i ≤ d, the word P^i_d is a d-ary combinator of length 1 with associated function (x_1, …, x_d) ↦ x_i : N^d → N.

(c) If g is an m-ary combinator and h_1, …, h_m are d-ary combinators, then S^m_d g h_1 … h_m is a d-ary combinator with associated function ĝ(ĥ_1, …, ĥ_m).

(d) If g is a d-ary combinator and h is a (d + 2)-ary combinator, then R_d g h is a (d + 1)-ary combinator whose associated function N^{d+1} * N is obtained by primitive recursion from ĝ and ĥ.

(e) If g is a (d + 1)-ary combinator, then S_d g is a d-ary combinator whose associated function is

λx_1 … x_d. (µy (ĝ(x_1, …, x_d, y) = 0)).

Next we associate inductively to each combinator g a number #g ∈ N:

(a) #O = ⟨1, 0⟩, #S = ⟨1, 1⟩,

(b) #P^i_d = ⟨2, i, d⟩,

(c) #(S^m_d f g_1 … g_m) = ⟨3, #f, #g_1, …, #g_m, d⟩,

(d) #(R_d g h) = ⟨4, #g, #h, d + 1⟩,

(e) #(S_d g) = ⟨5, #g, d⟩.

Let Co be the set of all combinators. Then # : Co → N is clearly injective. It is easy to check that the primitive recursive function α : N → N given by α(n) := (n)_{lh(n)} has the property that if n = #g with g ∈ Co, then α(n) is the arity of g.

Lemma 9. The set #Co is primitive recursive.

Proof. With B as in subsection 1.1.4, consider the following subsets of B:

(a) B_1 := {⟨1, 0⟩, ⟨1, 1⟩};

(b) B_2 := {n ∈ B : (n)_1 = 2, lh(n) = 3, 1 ≤ (n)_2 ≤ (n)_3};

(c) B_3 is the set of all n ∈ B such that (n)_1 = 3 and such that m := α((n)_2) satisfies lh(n) = m + 3, α((n)_3) = α((n)_4) = ··· = α((n)_{m+2}) = (n)_{lh(n)};

(d) B_4 := {n ∈ B : (n)_1 = 4, lh(n) = 4, α((n)_3) = α((n)_2) + 2 = (n)_4 + 1};

(e) B_5 := {n ∈ B : (n)_1 = 5, lh(n) = 3, α((n)_2) = (n)_3 + 1}.

Note that these sets B_i are primitive recursive. Let A := #Co, so χ_A(n) = 0 if n ∉ B, and also if n ∈ B ∖ (B_1 ∪ ··· ∪ B_5). It remains to describe χ_A on the B_i. Clearly, χ_A(n) = 1 for n ∈ B_1 ∪ B_2. For n ∈ B_3 and m := α((n)_2) we have

χ_A(n) = Π_{i=2}^{m+2} χ_A((n)_i).

For n ∈ B_4 we have χ_A(n) = χ_A((n)_2) χ_A((n)_3), and for n ∈ B_5 we have χ_A(n) = χ_A((n)_2).

Note that for any partial recursive function φ : N^d * N, there are infinitely many d-ary combinators f such that f̂ = φ. This is because for any d-ary combinator f we have f̂ = ĝ where g := S^1_d P^1_1 f.

We have encoded programs (more precisely, combinators) by numbers, and next we shall encode terminating computations using these programs.

Given a d-ary combinator f and ~x = (x_1, …, x_d), we also write f(~x) instead of f̂(~x). Let f be a combinator and f(~x) = y. The latter means that ~x is in the domain of f̂ and f̂(~x) = y. Then we assign to the triple (f, ~x, y) a number ct(f, ~x, y) ∈ N, called the computation tree of f at input ~x. This assignment is defined inductively:

(a) if f is O, S, or P^i_d, then, with ~x = (x_1, …, x_d),

ct(f, ~x, y) := ⟨#f, ~x, y⟩ := ⟨#f, x_1, …, x_d, y⟩;

(b) if f is S^m_d g h_1 … h_m, then, with ~x = (x_1, …, x_d), h_1(~x) = u_1, …, h_m(~x) = u_m, ~u = (u_1, …, u_m), and g(~u) = y = f(~x),

ct(f, ~x, y) := ⟨#f, ~x, ct(h_1, ~x, u_1), …, ct(h_m, ~x, u_m), ct(g, ~u, y), y⟩;

(c) if f is R_d g h, then, with ~x = (x_1, …, x_d, x_{d+1}) and ~x_d := (x_1, …, x_d),

ct(f, ~x, y) := ⟨#f, ~x_d, 0, ct(g, ~x_d, y), y⟩ if x_{d+1} = 0,
ct(f, ~x, y) := ⟨#f, ~x_d, n + 1, ct(h, (~x_d, n, f(~x_d, n)), y), y⟩ if x_{d+1} = n + 1;

(d) if f is S_d g, then, with ~x = (x_1, …, x_d),

ct(f, ~x, y) := ⟨#f, ~x, ct(g, (~x, 0), g(~x, 0)), …, ct(g, (~x, y), g(~x, y)), y⟩.

Lemma 10. The map

(f, ~x, y) ↦ ct(f, ~x, y) : {(f, ~x, y) : f a combinator, f(~x) = y} → N

is injective, and its image T ⊆ N is primitive recursive.

Proof. As before.

Recall that we introduced the primitive recursive function

α : N → N, α(z) := (z)_{lh(z)}.

Theorem 11 (Kleene). For each d we have a primitive recursive T_d ⊆ N^{d+2} such that for every d-ary combinator f and all ~x ∈ N^d:

• if f(~x) ↓, then T_d(#f, ~x, z) for a unique z, and f(~x) = α(z) for this z;

• if f(~x) ↑, then T_d(#f, ~x, z) for no z.

Proof. Define T_d ⊆ N^{d+2} as follows:

T_d(e, ~x, z) ⇐⇒ z is the computation tree of some d-ary combinator f with #f = e at input ~x = (x_1, …, x_d)

⇐⇒ T(z), (z)_1 = e, (z)_2 = x_1, …, (z)_{d+1} = x_d, α(e) = d.

Thus, T_d is primitive recursive. If f is a d-ary combinator and f(~x) = y, then clearly T_d(#f, ~x, z) for a unique z, namely z = ct(f, ~x, y), and y = (z)_{lh(z)} = α(z) for this z. If f is a d-ary combinator and f(~x) ↑, then there is no z with T_d(#f, ~x, z).

Corollary 12. Suppose φ : N^d * N is partial recursive. Then φ = f̂ for some d-ary combinator f with exactly one occurrence of an unbounded search symbol.

Proof. Take any d-ary combinator g such that φ = ĝ. Then for all x_1, …, x_d,

φ(x_1, …, x_d) ≃ α(µz (T_d(#g, x_1, …, x_d, z))).

[Note: The symbol ≃ indicates that either both sides are defined and equal, or both sides are undefined.] Thus φ = f̂ for some d-ary combinator f in which S_d occurs exactly once, and no other unbounded search symbol occurs.

Definition 13. φ^{(d)}_e = λx_1 … x_d. α(µz (T_d(e, x_1, …, x_d, z))) : N^d * N.

Corollary 14. Each φ^{(d)}_e is partial recursive, and for each partial recursive φ : N^d * N there is an e such that φ = φ^{(d)}_e.

Another consequence is that the recursive functions, as defined earlier, are exactly the computable functions, as defined in MATH 570.

Tracing back the definition of T_d in terms of combinators and computation trees we see that if e = #f with f a d-ary combinator, then φ^{(d)}_e = f̂, while if e ≠ #f for all d-ary combinators f, then φ^{(d)}_e is the partial function N^d * N with empty domain. This fact is needed to prove:

Lemma 15. Let d ≥ 1 be given. Then there is a primitive recursive function ρ : N^2 → N (depending on d) such that for all x_1, …, x_d,

φ^{(d)}_e(x_1, …, x_d) ≃ φ^{(d−1)}_{ρ(e, x_1)}(x_2, …, x_d).

Before giving the proof we note that for all a, d there is a d-ary combinator c^d_a whose associated function is the constant (total) function N^d → N taking the value a. Indeed, one can construct c^d_a in such a way that, for any given d, the map a ↦ #(c^d_a) : N → N is primitive recursive. We leave this construction as an exercise to the reader.

Proof. Let f be a d-ary combinator, d ≥ 1. Then

f_a := S^d_{d−1} f c^{d−1}_a P^1_{d−1} ··· P^{d−1}_{d−1}

is a (d − 1)-ary combinator such that for all x_2, …, x_d,

f̂_a(x_2, …, x_d) ≃ f̂(a, x_2, …, x_d).

Note that

#(f_a) = ⟨3, #f, #(c^{d−1}_a), ⟨2, 1, d − 1⟩, …, ⟨2, d − 1, d − 1⟩, d − 1⟩.

Thus the primitive recursive function ρ : N^2 → N defined by

ρ(e, a) := ⟨3, e, #(c^{d−1}_a), ⟨2, 1, d − 1⟩, …, ⟨2, d − 1, d − 1⟩, d − 1⟩

has the property that if f is any d-ary combinator with #f = e, then f_a is a (d − 1)-ary combinator with #(f_a) = ρ(e, a), while if there is no d-ary combinator f with #f = e, then there is no (d − 1)-ary combinator g with #g = ρ(e, a). It is clear from the remark preceding the lemma that this function ρ has the desired properties.

If we have to indicate the dependence on d, we let ρ_d be a function ρ as in the lemma. The next consequence has the strange name of “s-m-n theorem”.

Corollary 16. Given m, n, there is a primitive recursive s^m_n : N^{m+1} → N such that for all e, x_1, …, x_m, y_1, …, y_n:

φ^{(m+n)}_e(x_1, …, x_m, y_1, …, y_n) ≃ φ^{(n)}_{s^m_n(e, x_1, …, x_m)}(y_1, …, y_n).

Proof. Take s^0_n := id_N, and put

s^{m+1}_n(e, x_1, …, x_{m+1}) := ρ_{n+1}(s^m_{n+1}(e, x_1, …, x_m), x_{m+1}).
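Informally, the s-m-n theorem says that fixing some arguments of a program can itself be done effectively on the level of programs: from an index e and values x_1, …, x_m one computes an index of the specialized function. In everyday programming this is partial application; the Python analogue (a loose analogy of ours, not the construction in the proof) is functools.partial.

```python
from functools import partial

# A 3-ary "program" (our example).
def phi(x, y, z):
    return 100 * x + 10 * y + z

# Analogue of s^2_1: fix the first two arguments, obtaining a "program"
# for a unary function.  The s-m-n theorem says this passage can be
# carried out primitive recursively on numerical indices of programs.
phi_34 = partial(phi, 3, 4)
```

Where partial manipulates closures, the corollary manipulates code numbers, but the specification is the same.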

The next result corresponds to the existence of a universal computer:

Corollary 17. There is a partial recursive ψ : N^2 * N such that for all d, e and ~x ∈ N^d we have

φ^{(d)}_e(~x) ≃ ψ(e, ⟨~x⟩).

Proof. Exercise.

The next three results are due to Kleene, and are collectively referred to as the Recursion Theorem. They look a bit strange, but are very useful.

Theorem 18. Let f : N^{d+1} * N be partial recursive. Then there exists an e_0 such that for all ~x ∈ N^d,

f(e_0, ~x) ≃ φ^{(d)}_{e_0}(~x).

15 d+1 1 Proof. Deﬁne partial recursive g : N * N by g(e, ~x) ' f(sd(e, e), ~x). Take a (d+1) d such that g = φa . Then for all ~x ∈ N ,

1 (d+1) (d) f(sd(a, a), ~x) ' g(a, ~x) ' φa (a, ~x) ' φ 1 (~x). sd(a,a)

1 Thus e0 = sd(a, a) works.

From now on, φ_e := φ^{(1)}_e, so φ_0, φ_1, φ_2, … is an enumeration of the set of partial recursive functions N * N. Note also that the function (e, x) ↦ φ_e(x) : N^2 * N is partial recursive.

Theorem 19. Let h : N → N be recursive. Then there exists an e_0 such that

φ_{e_0} = φ_{h(e_0)}.

Proof. Define a partial recursive f : N^2 * N by f(e, x) ≃ φ_{h(e)}(x). By the previous theorem we get e_0 ∈ N such that for all x,

φ_{e_0}(x) ≃ f(e_0, x) ≃ φ_{h(e_0)}(x),

so φ_{e_0} = φ_{h(e_0)}.

Theorem 20. There is a primitive recursive β : N → N such that for each e with φ_e total,

φ_{β(e)} = φ_{φ_e(β(e))}.

This is a uniform version of the previous theorem: given recursive h : N → N with index e, that is, h = φ_e, the number e_0 = β(e) satisfies φ_{e_0} = φ_{h(e_0)}.

Proof. We claim that β : N → N defined by β(k) = s^1_1(s^1_2(b, k), s^1_2(b, k)) does the job, where b is an index of the partial recursive function g : N^3 * N defined by g(e, x, y) ≃ φ_{φ_e(s^1_1(x, x))}(y). Verification of claim:

φ_{β(e)}(y) ≃ φ_{s^1_1(s^1_2(b, e), s^1_2(b, e))}(y)
≃ φ^{(2)}_{s^1_2(b, e)}(s^1_2(b, e), y)
≃ φ^{(3)}_b(e, s^1_2(b, e), y)
≃ g(e, s^1_2(b, e), y)
≃ φ_{φ_e(s^1_1(s^1_2(b, e), s^1_2(b, e)))}(y)
≃ φ_{φ_e(β(e))}(y).

1.4 The Halting Problem and Rice’s Theorem

Instead of saying that A ⊆ N^d is recursive, we also say that A is decidable. Assuming the Church-Turing Thesis, for A to be decidable means to have an effective procedure for deciding membership in A, that is, an algorithm that decides for any input ~x ∈ N^d whether ~x ∈ A.

Theorem 21 (Turing). The halting problem,

K_0 := {(e, n) : φ_e(n) ↓},

is undecidable.

Proof. Suppose otherwise. Then the set {e : φ_e(e) ↓} is decidable. Define f : N * N by

f(e) = 0 if φ_e(e) ↑, f(e) ↑ if φ_e(e) ↓.

Note that f is partial recursive since f(e) ≃ µy (y = 0 & φ_e(e) ↑), and the condition φ_e(e) ↑ is decidable by assumption. So we have an e_0 such that f = φ_{e_0}. But f(e_0) ↑ ⇐⇒ φ_{e_0}(e_0) ↓, that is,

φ_{e_0}(e_0) ↑ ⇐⇒ φ_{e_0}(e_0) ↓,

a contradiction.

This proof also shows that the set

K := {e : φ_e(e) ↓}

is undecidable. Most natural undecidability results in mathematics can be reduced to the halting problem, although the reduction is not always obvious.

Let A be a set of partial recursive functions N * N. In many cases it is natural to ask whether there is an effective procedure for deciding, for any unary combinator f, whether the associated partial function f̂ belongs to A. For example, is it decidable whether a unary combinator defines

(a) a partial function with nonempty domain?

(b) a partial function with finite domain?

(c) a partial function with finite image?

(d) a total function?

and so on. Assuming the Church-Turing Thesis, the next theorem of Rice says that all such questions have a negative answer. To understand this reading of Rice’s theorem, note that one can effectively compute from any unary combinator f its index e = #f, and in the other direction, we can decide for any given e whether e is the index of a unary combinator, and if so, find such a combinator. So these decision problems about unary combinators translate to equivalent decision problems about natural numbers.

Theorem 22 (Rice). Let A be a nonempty proper subset of the set of all partial recursive functions N * N. Then {e : φ_e ∈ A} is undecidable.

Proof. Suppose {e : φ_e ∈ A} is decidable. Take a, b such that φ_a ∈ A and φ_b ∉ A. Define partial recursive f : N^2 * N by

f(e, n) ≃ φ_a(n) if φ_e ∉ A,

f(e, n) ≃ φ_b(n) if φ_e ∈ A.

By the recursion theorem we have an e such that φ_e = f(e, ·). If φ_e ∉ A, then φ_e = f(e, ·) = φ_a ∈ A. If φ_e ∈ A, then φ_e = f(e, ·) = φ_b ∉ A. Thus we have a contradiction.

Of course, if A is empty or A is the entire set of partial recursive functions N * N, then {e : φ_e ∈ A} is empty or equal to N, so the restrictions on A in Rice’s theorem cannot be dropped.

Corollary 23. The set {(a, b) : φ_a = φ_b} is undecidable.

Proof. If {(a, b) : φ_a = φ_b} were decidable, so would be the set {a : φ_a = φ_0}. By Rice’s Theorem, the latter set is undecidable, and thus the former must be undecidable.

Here is another typical application of the recursion theorem:

There is an e such that D(φ_e) = {e}. To get such an e, let ψ : N^2 * N be the partial recursive function given by

ψ(x, y) ≃ 0 if x = y, ψ(x, y) ↑ if x ≠ y.

The recursion theorem gives an e such that ψ(e, ·) = φ_e. Then D(φ_e) = {e}.

1.5 Recursively Enumerable Sets

Definition 24. A set A ⊆ N is called recursively enumerable, abbreviated r.e. (or computably enumerable, abbreviated c.e.), if A = ∅ or A = f(N) for some recursive f : N → N.

Proposition 25. Let A ⊆ N. The following are equivalent:

(a) A is recursively enumerable;

(b) A = Im(f) for some partial recursive f : N * N;

(c) There is a primitive recursive S ⊆ N2 such that for all x, x ∈ A ⇐⇒ ∃y S(x, y);

(d) A = D(f) for some partial recursive function f : N * N;

(e) A = ∅, or A = f(N) for some primitive recursive function f : N → N.

Proof. The case that A = ∅ is trivial, so we assume that A is non-empty. The implication (a) =⇒ (b) is clear. For (b) =⇒ (c), let A = Im(φe). Then

y ∈ A ⇐⇒ ∃x∃z (T1(e, x, z) & y = α(z)) ⇐⇒ ∃n (T1(e, (n)1, (n)2) & y = α((n)2)).

Now use that the set

{(y, n) : T1(e, (n)1, (n)2) & y = α((n)2)} is primitive recursive. For (c) =⇒ (d), let S be as in (c). Define a partial recursive function f : N * N by f(x) ' µy S(x, y). Then A = D(f). For (d) =⇒ (e), let A = D(φe). Then for all x, x ∈ A ⇐⇒ ∃z T1(e, x, z).

Pick a ∈ A and define f : N → N by

f(n) = (n)1 if T1(e, (n)1, (n)2),
f(n) = a otherwise.

Then f is primitive recursive with f(N) = A. It is clear that (e) =⇒ (a).

If A ⊆ N is recursive, then A is recursively enumerable: use (d) above with f : N * N defined by

f(x) ' 1 if x ∈ A,
f(x) ' ↑ otherwise.

For arbitrary d we call a set A ⊆ Nd recursively enumerable if there is a partial recursive function f : Nd * N such that A = D(f).

Lemma 26. Let A ⊆ Nd. Then A is recursively enumerable iﬀ A = π(S) for some primitive recursive set S ⊆ Nd+1, where π : Nd+1 → Nd is given by π(x1, . . . , xd, y) = (x1, . . . , xd). We leave the proof as an exercise. The reader should check that the lemma still holds when “primitive recursive” is replaced by “recursive”.

Lemma 27. If A, B ⊆ Nd are recursively enumerable, so are A ∪ B and A ∩ B. If C ⊆ Nm+n is recursively enumerable, then so is π(C) ⊆ Nm, where

π : Nm+n → Nm, π(x1, . . . , xm, y1, . . . , yn) = (x1, . . . , xm).

If A ⊆ Nd and B ⊆ Ne are recursively enumerable, so is A × B ⊆ Nd+e.

Proof. Suppose A′, B′ ⊆ Nd+1 are primitive recursive such that

A = {~x ∈ Nd : ∃y A′(~x, y)}, B = {~x ∈ Nd : ∃y B′(~x, y)}.

Then A ∪ B = {~x ∈ Nd : ∃y (A′(~x, y) or B′(~x, y))}. Since A′ ∪ B′ is primitive recursive, A ∪ B is recursively enumerable. Also,

A ∩ B = {~x ∈ Nd : ∃y (A′(~x, (y)1) & B′(~x, (y)2))}, hence A ∩ B is recursively enumerable. Let C′ ⊆ Nm+n+1 be primitive recursive and

C = {(~x, ~y) ∈ Nm+n : ∃z C′(~x, ~y, z)}. Then

π(C) = {~x ∈ Nm : ∃y C′(~x, (y)1, . . . , (y)n+1)},

hence π(C) is recursively enumerable. The assertion on cartesian products of r.e. sets is left as an exercise.

Exercise. Show that if A ⊆ Nd+1 is recursively enumerable, then so is ∀≤A.

Some notation: For i = 1, . . . , n, let fi : Nm * N be such that D(fi) = D(fj) for i, j ∈ {1, . . . , n}. This yields the partial map f = (f1, . . . , fn) : Nm * Nn. For A ⊆ Nm and B ⊆ Nn we set

f(A) := f(A ∩ D(f)) = {f(~x) : ~x ∈ A ∩ D(f)},
f −1(B) := {~x ∈ D(f) : f(~x) ∈ B}.

We call f partial recursive if each fi is partial recursive.

Lemma 28. For f : Nm * Nn as above, f is partial recursive iff graph(f) ⊆ Nm+n is recursively enumerable.

Proof. We assume n = 1 since the general case follows easily from this case. Suppose f = φc^(m). Then for all (~x, y) ∈ Nm+1,

(~x, y) ∈ graph(f) ⇐⇒ ∃z (Tm(c, ~x, z) & α(z) = y).

For the converse, suppose graph(f) ⊆ Nm+1 is recursively enumerable. Take recursive S ⊆ Nm+2 such that for all (~x, y) ∈ Nm+1, (~x, y) ∈ graph(f) ⇐⇒ ∃z S(~x, y, z). Then f(~x) ' (µn S(~x, (n)1, (n)2))1.

Corollary 29. Let f : Nm * Nn be partial recursive, and let A ⊆ Nm and B ⊆ Nn be recursively enumerable. Then f(A) ⊆ Nn and f −1(B) ⊆ Nm are recursively enumerable.

Proof. Use that for all ~y ∈ Nn,

~y ∈ f(A) ⇐⇒ ∃~x (A(~x) & (~x, ~y) ∈ graph(f)),

and that projecting a recursively enumerable set to Nn yields a recursively enumerable set. Also, for all ~x ∈ Nm,

~x ∈ f −1(B) ⇐⇒ ∃~y ∈ Nn (B(~y) & (~x, ~y) ∈ graph(f)),

showing that f −1(B) is recursively enumerable.

It will be convenient to approximate φe by functions φe,s with finite domains. Here s ∈ N and φe,s : N * N is given by

φe,s(x) ' φe(x) if ∃z ≤ s T1(e, x, z),
φe,s(x) ' ↑ otherwise.

Note that if T1(e, x, z), then x ≤ z, so D(φe,s) ⊆ {0, . . . , s}. We also use the following traditional notations:

We := D(φe),We,s := D(φe,s).

Note that (We)e∈N is an enumeration of the set of recursively enumerable subsets of N, and that We = {n : (e, n) ∈ K0}.

Corollary 30. With these notations, we have:

(a) The halting set

K0 := {(e, n) ∈ N2 : φe(n) ↓} = {(e, n) : ∃z T1(e, n, z)} is recursively enumerable;

(b) K := {e ∈ N : φe(e) ↓} = {e ∈ N : ∃z T1(e, e, z)} is recursively enumerable;

(c) We,s ⊆ We,s+1, and ⋃_{s∈N} We,s = We;

(d) The set {(e, n, s) ∈ N3 : n ∈ We,s} = {(e, n, s) ∈ N3 : ∃z ≤ s T1(e, n, z)} is primitive recursive.

Theorem 31. Let B ⊆ N be recursively enumerable. Then there is a primitive recursive function β : N → N such that for all n, B(n) ⇐⇒ K(β(n)).

Proof. Let B = D(φb). Define a partial recursive function f : N2 * N by f(x, y) ' φb(x). Take c such that f = φc^(2), and put β(n) := s_1^1(c, n). Then for all n,

B(n) ⇐⇒ φb(n) ↓ ⇐⇒ f(n, β(n)) ↓ ⇐⇒ φc^(2)(n, β(n)) ↓ ⇐⇒ φ_{β(n)}(β(n)) ↓ ⇐⇒ K(β(n)).

The significance of this result is that deciding membership in any recursively enumerable subset of N reduces computably to deciding membership in K.
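The stage sets We,s above can be illustrated by bounding the steps of any concrete interpreter instead of bounding a witness z for T1. The following Python sketch is only an analogy: the toy program and its step count are assumptions of this illustration, not the notes' combinator machinery.

```python
# Toy illustration of the stage approximations W_{e,s}: we bound the number of
# steps of a step-counting interpreter for one fixed "program".

def run_bounded(x, s):
    """Run the toy program on input x for at most s steps.
    The program halts after x steps exactly when x is even; on odd x it
    would run forever, so every bounded run reports 'no halt yet'."""
    for step in range(s + 1):
        if x % 2 == 0 and step >= x:
            return x // 2        # halted within s steps
    return None                  # no halt within s steps

def W(s):
    """Finite stage: the inputs <= s on which the program halts within
    s steps (compare D(phi_{e,s}) being a subset of {0, ..., s})."""
    return {x for x in range(s + 1) if run_bounded(x, s) is not None}

# The stages increase and exhaust the domain, here the even numbers:
assert W(3) <= W(4) <= W(10)
assert W(10) == {0, 2, 4, 6, 8, 10}
```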

1.6 Selections & Reductions

Theorem 32 (Selection Theorem). Let A ⊆ Nd+1 be recursively enumerable. Then there is a partial recursive function f : Nd * N such that for all ~x ∈ Nd, (a) f(~x) ↓ ⇐⇒ ∃y A(~x,y), (b) f(~x) ↓ =⇒ A(~x,f(~x)).

Proof. Take a primitive recursive S ⊆ Nd+2 such that for all (~x, y) ∈ Nd+1, A(~x, y) ⇐⇒ ∃z S(~x, y, z). Define f : Nd * N by f(~x) ' (µn S(~x, (n)1, (n)2))1. Then f has the desired properties.

An f as in the theorem above is called a partial recursive selector for A.

Theorem 33 (Reduction Theorem). Let A, B ⊆ N be recursively enumerable. Then there are recursively enumerable sets C,D ⊆ N such that C ⊆ A, D ⊆ B, C ∩ D = ∅, and C ∪ D = A ∪ B.

Proof. Let E := (A × {0}) ∪ (B × {1}) ⊆ N2. Then E is recursively enumerable. Take a partial recursive selector f : N * N for E. Let C = f −1(0), D = f −1(1). Then C and D have the desired properties.

Theorem 34 (Post’s Theorem). A set A ⊆ N is recursive iﬀ A and N \ A are recursively enumerable.

Proof. The left-to-right direction is clear. Assume that A and N \ A are recursively enumerable. Let

E := ((N \ A) × {1}) ∪ (A × {0}).

By assumption, E ⊆ N2 is recursively enumerable. Take a partial recursive selector f : N * N for E. Then D(f) = N, so f is a recursive function. Since f = χN\A, the set N \ A is recursive, and hence so is A.

Post's Theorem is easy to explain by the Church-Turing Thesis: Suppose A and N \ A are recursively enumerable and nonempty, and let f, g : N → N computably enumerate these sets. Since the union of these two sets is N, every natural number must appear in exactly one of the sequences f(0), f(1), f(2), . . . and g(0), g(1), g(2), . . . . Hence the computable sequence

f(0), g(0), f(1), g(1), f(2), g(2),... enumerates all of N, and thus provides a computable way to determine whether or not a given n is in A.
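The interleaved search just described is directly programmable. In this Python sketch the set A and its two enumerations f, g are illustrative assumptions (A = multiples of 3); the point is only that the zig-zag scan through f(0), g(0), f(1), g(1), . . . always terminates with the right answer.

```python
# Deciding membership in A by interleaving two computable enumerations,
# as in the explanation of Post's Theorem. The concrete A is an assumption.

def f(k):               # enumerates A = {0, 3, 6, ...}
    return 3 * k

def g(k):               # enumerates N \ A = {1, 2, 4, 5, 7, 8, ...}
    return 3 * (k // 2) + 1 + (k % 2)

def in_A(n):
    """Decide n in A by scanning f(0), g(0), f(1), g(1), ...; the search
    is total because the two enumerations jointly cover N."""
    k = 0
    while True:
        if f(k) == n:
            return True
        if g(k) == n:
            return False
        k += 1

assert [in_A(n) for n in range(7)] == [True, False, False, True, False, False, True]
```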

The sets A, B ⊆ N are called recursively separable if there is a recursive set R ⊆ N such that A ⊆ R and B ∩ R = ∅ (so A and B are disjoint). Such an R is said to recursively separate A from B.

Theorem 35. There are disjoint recursively enumerable sets A, B ⊆ N that cannot be recursively separated.

Proof. Let A = {e : φe(e) ' 0}, and B = {e : φe(e) ' 1}. Then A and B are disjoint and recursively enumerable. Assume that R ⊆ N recursively separates A and B. Let e0 ∈ N satisfy

χR = φe0 . Then

e0 ∈ R ⇐⇒ φe0(e0) ' 1 ⇐⇒ e0 ∈ B =⇒ e0 ∉ R,

e0 ∉ R ⇐⇒ φe0(e0) ' 0 ⇐⇒ e0 ∈ A =⇒ e0 ∈ R, a contradiction.

1.7 Trees and Recursive Inseparability

We recall that 2N is the set of all functions β : N → {0, 1}, that is, the set of all characteristic functions of subsets of N, and we think of β ∈ 2N as the inﬁnite sequence β(0), β(1), β(2),... of zeros and ones. Graphically we represent 2N as the inﬁnite binary tree, suggested in the picture below. For example, a sequence 0, 1, 1, 0, 0,... represents the branch in the tree that travels from the top down via “left, right, right, left, left, . . . ” , with “0” representing “go left” and “1” representing “go right”.

Figure 1.1: The complete inﬁnite binary tree

We also let 2<N be the set of all finite sequences ~s = (s0, . . . , sn−1) such that si ∈ {0, 1} for i < n; note that this includes the empty sequence for n = 0. A binary tree is a set D ⊆ 2<N that is closed under taking initial segments. We call such a D recursive if the set of codes

{⟨s0, . . . , sn−1⟩ : (s0, . . . , sn−1) ∈ D} ⊆ N

is recursive.

Figure 1.2: A binary tree

Lemma 36. The full binary tree 2<N is recursive.

Proof. Let A := {⟨s0, . . . , sn−1⟩ : (s0, . . . , sn−1) ∈ 2<N}. Then

x ∈ A ⇐⇒ x ∈ B ∧ ∀i ≤ lh(x) (i = 0 ∨ (x)i = 0 ∨ (x)i = 1).

Thus A is recursive.

Lemma 37 (König). Every infinite binary tree has an infinite branch.

Proof. Let D be an infinite binary tree. We define β : N → {0, 1} inductively such that β is an infinite branch of D. Put

β(0) = 0 if there are infinitely many ~s ∈ D such that s0 = 0, and β(0) = 1 otherwise;

and for n > 0,

β(n) = 0 if there are infinitely many ~s ∈ D such that |~s| > n, sn = 0 and si = β(i) for i < n, and β(n) = 1 otherwise.

Induction on n shows that for each n there are infinitely many ~s ∈ D such that |~s| > n and β(0) = s0, . . . , β(n) = sn. It follows that β is an infinite branch of D. (It is actually the “leftmost” such branch.)

Theorem 38. There exists an infinite recursive binary tree with no recursive infinite branch.

Proof. Let A0, A1 ⊆ N be disjoint and recursively inseparable r.e. sets. Thus we have recursive sets S0, S1 ⊆ N2 such that for all m,

A0(m) ⇐⇒ ∃n S0(m, n)

A1(m) ⇐⇒ ∃n S1(m, n)

Next, define D ⊆ 2<N as follows: for ~s = (s0, . . . , sm−1) ∈ 2<N,

~s ∈ D ⇐⇒ ∀k < m [(∃n < m (k, n) ∈ Si) ⇒ sk = i] for i = 0, 1.

It is easy to check that D is a recursive binary tree. To see that D is infinite we show how to construct, for any m, a sequence of length m in D: the sequence (s0, . . . , sm−1) defined such that for k < m, if there exists an n < m with (k, n) ∈ S1, then sk = 1 and otherwise sk = 0, satisfies our requirements.

Next we show that for any infinite branch β : N → {0, 1} of D, the set R ⊆ N such that χR = β is not recursive. Specifically, we show that R separates A0 and A1, and thus cannot be recursive. Let k ∈ Ai and take n such that (k, n) ∈ Si. Consider any sequence (β(0), . . . , β(m − 1)) where m > k, n; note that (β(0), . . . , β(m − 1)) ∈ D by assumption. Then by definition β(k) = i, so k ∈ A0 ⇒ k ∉ R and k ∈ A1 ⇒ k ∈ R, so R ∩ A0 = ∅ and A1 ⊆ R. If R were recursive this would contradict that A0 and A1 are not recursively separable.
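As a sanity check that the tree D of this proof is recursive: deciding whether a prefix of length m lies in D only requires evaluating S0, S1 on arguments below m. A genuinely inseparable pair cannot be written down concretely, so the decidable toy sets S0, S1 below are assumptions made purely to exercise the membership test.

```python
# Membership in the tree D of Theorem 38 is decidable: a prefix s of length m
# is tested against S_0, S_1 on arguments below m only. Toy decidable sets
# stand in for S_0, S_1 (here S_i "certifies" k at stage n = k when k = i mod 2).

def S(i, k, n):
    return n == k and k % 2 == i

def in_D(s):
    m = len(s)
    for k in range(m):
        for i in (0, 1):
            # if some n < m certifies k for S_i, the prefix must set s_k = i
            if any(S(i, k, n) for n in range(m)) and s[k] != i:
                return False
    return True

assert in_D(())                      # the empty sequence is in every tree
assert in_D((0, 1, 0))               # s_k = k mod 2 obeys all constraints
assert not in_D((1, 1, 0))           # k = 0 is certified for S_0, yet s_0 = 1
```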

1.8 Indices and Enumeration

In this section ψ = {ψe^n} is an index system, that is, for each n,

{ψe^n : e = 0, 1, . . .} = {f : Nn * N : f is partial recursive}.

Our earlier work shows that {φe^(n)} is an index system.

Definition 39. We say that ψ satisfies enumeration if for each n there is an a such that

ψa^{n+1}(e, ~x) ' ψe^n(~x) for all (e, ~x) ∈ N1+n;

equivalently, λe~x.ψe^n(~x) : N1+n * N is partial recursive, for each n. We say that ψ satisfies parametrization if for all m, n there is a recursive s : N1+m → N such that

ψ_{s(e,~x)}^n(~y) ' ψe^{m+n}(~x, ~y) for all (e, ~x, ~y) ∈ N1+m+n.

We say that ψ is acceptable if for each n there are recursive fn, gn : N → N such that for all e

ψe^n = φ_{fn(e)}^{(n)} and φe^{(n)} = ψ_{gn(e)}^n.

The index system {φe^{(n)}} is trivially acceptable, and earlier results show that it satisfies enumeration and parametrization. We put ψe := ψe^1.

Proposition 40. ψ is acceptable if and only if ψ satisﬁes enumeration and parametrization.

Proof. Assume ψ is acceptable and let {fn, gn}n witness this as in the definition. To prove enumeration, fix n and take c such that φk^{(n)}(~x) ' φc^{(n+1)}(k, ~x) for all (k, ~x) ∈ N1+n. Then for all e, ~x,

ψe^n(~x) ' φ_{fn(e)}^{(n)}(~x) ' φc^{(n+1)}(fn(e), ~x).

The right hand side is partial recursive as a function of (e, ~x), so we have b such that

φc^{(n+1)}(fn(e), ~x) ' φb^{(n+1)}(e, ~x) ' ψ_{g_{n+1}(b)}^{n+1}(e, ~x) for all (e, ~x).

So for a = g_{n+1}(b), we have ψe^n(~x) ' ψa^{n+1}(e, ~x) for all e, ~x.

To prove parametrization, fix m, n. Then enumeration gives a such that for all (e, ~x, ~y) ∈ N1+m+n,

ψe^{m+n}(~x, ~y) ' ψa^{1+m+n}(e, ~x, ~y) ' φ_{f_{1+m+n}(a)}^{(1+m+n)}(e, ~x, ~y) ' φ_{s_n^{1+m}(f_{1+m+n}(a), e, ~x)}^{(n)}(~y) ' ψ_{gn(s_n^{1+m}(f_{1+m+n}(a), e, ~x))}^{n}(~y),

so ψ satisfies parametrization.

Conversely, assume ψ satisfies enumeration and parametrization. Fix n, and take a such that ψa^{1+n}(e, ~x) ' ψe^n(~x) for all (e, ~x) ∈ N1+n. Also, take b such that φb^{(1+n)}(e, ~x) ' ψa^{1+n}(e, ~x) for all e, ~x, hence

φ_{s_n^1(b,e)}^{(n)}(~x) ' ψa^{1+n}(e, ~x) ' ψe^n(~x) for all e, ~x.

Then fn : N → N given by fn(e) = s_n^1(b, e) satisfies

ψe^n = φ_{fn(e)}^{(n)}.

Likewise we can get a recursive gn : N → N such that for all e we have φe^{(n)} = ψ_{gn(e)}^n.

We need an effective form of the recursion theorem for acceptable ψ.

Corollary 41. Assume ψ is acceptable. There is a recursive β : N → N so that for any recursive h : N → N and any d with h = ψd the number e = β(d) satisfies ψe = ψh(e).

Proof. Repeat the proofs of Kleene's recursion theorems 18, 19, and 20, with the system ψ instead of {φe^{(n)}}, using that these proofs only rely on enumeration and parametrization.

Lemma 42 (Padding Lemma). Assume ψ is acceptable. Then we can eﬀectively construct, for any a, inﬁnitely many b such that ψa = ψb.

Proof. Given a and finite D ⊆ N such that ψd = ψa for all d ∈ D, we shall construct e ∉ D with ψe = ψa. Define partial recursive g : N2 * N by

g(e, x) ' ψa(x) if e ∉ D,
g(e, x) ' ↑ if e ∈ D.

Enumeration and parametrization give a recursive f : N → N such that for all (e, x) ∈ N2, ψf(e)(x) ' g(e, x)

The recursion theorem yields e0, effectively from a, D, with ψe0 = ψf(e0). If e0 ∉ D, then ψe0 = ψa and we are done. Suppose e0 ∈ D; then ψa = ψe0, hence

D(ψa) = D(ψe0) = ∅. So it remains to construct effectively an index e ∉ D with D(ψe) = ∅. As before, we get a recursive h : N → N such that

ψh(i)(x) ' 0 if i ∈ D,
ψh(i)(x) ' ↑ if i ∉ D.

The recursion theorem gives e1 with ψh(e1) = ψe1 . If e1 ∈ D, then

ψa = ψe1 = ψh(e1), a function with nonempty domain,

contradicting D(ψa) = ∅. So e1 ∉ D, hence D(ψe1) = ∅, and we are done.

Theorem 43. The index system ψ is acceptable if and only if for each n, there is a recursive bijection h : N → N such that ψe^n = φ_{h(e)}^{(n)} for all e.

Proof. Such a bijection for each n witnesses the acceptability condition for ψ. Conversely, suppose ψ is acceptable. We shall construct h as in the theorem for n = 1, and we leave it to the reader to deal with arbitrary n. We have recursive functions f, g : N → N such that ψd = φf(d) and φe = ψg(e) for all d, e. We construct h in stages where at each stage n we have a partial function hn : N * N with |D(hn)| = n.

Stage 0: Take h0 with empty domain.

Stage 2n + 1: Take the least d ∉ D(h2n) and use f and the padding lemma for φ to obtain effectively an e ∉ Im(h2n) such that ψd = φe. Define h2n+1 as the extension of h2n with D(h2n+1) = D(h2n) ∪ {d} and h2n+1(d) = e.

Stage 2n + 2: Take the least e ∉ Im(h2n+1) and use g and the padding lemma to obtain effectively a d ∉ D(h2n+1) such that ψd = φe. Define h2n+2 as the extension of h2n+1 with D(h2n+2) = D(h2n+1) ∪ {d} and h2n+2(d) = e.

Then h = ⋃_n hn has the desired properties.

Chapter 2

Hilbert’s 10th Problem

2.1 Introduction

Recursively enumerable sets occur outside pure recursion theory, most strikingly in Hilbert's 10th Problem, hereafter denoted H10:

Question. Is there an algorithm that decides for any given polynomial

f(X1,...,Xn) ∈ Z [X1,...,Xn] whether the equation

f(X1,...,Xn) = 0 (2.1) has a solution in Zn? A solution of (2.1) in Zn (or integer solution of (2.1)) is n a tuple (x1, . . . , xn) ∈ Z such that f(x1, . . . , xn) = 0. Likewise we deﬁne what we mean by a solution of (2.1) in Nn (or natural number solution). The answer to H10 is known since 1970: there is no such algorithm. It is still open whether such an algorithm exists when n is ﬁxed to be 2. Observation. The following are equivalent:

(a) There is an algorithm that decides for any f(X1,...,Xn) ∈ Z [X1,...,Xn] whether (2.1) has a solution in Zn.

(b) There is an algorithm that decides for any f(X1, . . . , Xn) ∈ Z[X1, . . . , Xn] whether (2.1) has a solution in Nn.

Proof. Note that (2.1) has a solution in Zn if and only if one of the 2^n equations

f(q1X1, . . . , qnXn) = 0,

with all qi ∈ {1, −1}, has a solution in Nn. This yields (b) =⇒ (a).
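The sign-variant reduction behind (b) =⇒ (a) can be checked by brute force on a tiny example; the polynomial f and the search bound below are assumptions of the demonstration, not part of the proof.

```python
# f = 0 has an integer solution iff one of the 2^n sign variants
# f(q_1 X_1, ..., q_n X_n) = 0 has a natural-number solution.

from itertools import product

def f(x, y):
    return x + 2 * y + 5        # solvable in Z^2, e.g. (x, y) = (-5, 0)

def has_N_solution(g, bound=10):
    # bounded search over N^2; the bound is ad hoc but enough here
    return any(g(x, y) == 0 for x in range(bound) for y in range(bound))

# No solution in N^2 (f is strictly positive there) ...
assert not has_N_solution(f)
# ... but some sign variant has one, witnessing solvability in Z^2:
assert any(has_N_solution(lambda x, y, q=q, r=r: f(q * x, r * y))
           for q, r in product((1, -1), repeat=2))
```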

For the other direction we use Lagrange’s four-square theorem. By this theorem, (2.1) has a solution in Nn if and only if the equation

f(X11^2 + X12^2 + X13^2 + X14^2, . . . , Xn1^2 + Xn2^2 + Xn3^2 + Xn4^2) = 0

has a solution in Z4n.

In the following account of H10, we use the Observation above to restrict attention to natural number solutions for equations of type (2.1). Thus from now on in this chapter, u, x, y, z (sometimes with subscripts) range over N. This is in addition to the convention that a, b, c, d, e, k, l, m, n range over N. Let A1, . . . , Am, X1, . . . , Xn be distinct polynomial indeterminates, A = (A1, . . . , Am), X = (X1, . . . , Xn), and f(A, X) ∈ Z[A, X]. We consider f(A, X) as defining the family of diophantine equations

f(~a, X) = 0, ~a ∈ Nm.

Definition 44. The diophantine set defined by f(A, X) is

{~a ∈ Nm : ∃~x (f(~a, ~x) = 0)}.

A diophantine set in Nm is a diophantine set defined by some polynomial f(A, X) ∈ Z[A, X], where X = (X1, . . . , Xn) and n can be arbitrary.

Lemma 45. Diophantine sets are recursively enumerable.

Proof. Let f = f(A, X) ∈ Z [A, X] be as before, and take g, h ∈ N [A, X] such that f = g − h. Then

{(~a, ~x) ∈ Nm+n : f(~a, ~x) = 0} = {(~a, ~x) ∈ Nm+n : g(~a, ~x) = h(~a, ~x)},

so {(~a, ~x) ∈ Nm+n : f(~a, ~x) = 0} is primitive recursive. Thus the set

{~a ∈ Nm : ∃~x (f(~a, ~x) = 0)}

is recursively enumerable.

Suppose we have a nonrecursive diophantine set D ⊆ Nm. Then, assuming the Church-Turing Thesis, the answer to H10 is negative. To see why, take f(A, X) ∈ Z [A, X] as above such that

D = {~a ∈ Nm : f(~a, X1, . . . , Xn) = 0 has a solution in Nn}.

Since D is not recursive, the Church-Turing Thesis says that there cannot exist an algorithm deciding for any ~a ∈ Nm as input, whether the equation f(~a, X) = 0 has a solution in Nn. In particular, there cannot exist an algorithm as demanded by H10. The following is the key result. It gives much more than just a nonrecursive diophantine set.

Main Theorem [Matiyasevich 1970]. The diophantine sets in Nm are exactly the recursively enumerable sets in Nm.

It follows that there is no algorithm as required in H10 (assuming the Church-Turing Thesis). Suggestive results in the direction of Matiyasevich's theorem were obtained by M. Davis, H. Putnam, and J. Robinson in the 50's and 60's. Basically, they reduced the problem to showing that the set

{(a, b) : b = 2^a} ⊆ N2

is diophantine. This reduction is given in the next section.

2.2 Reduction to Exponentiation

Lemma 46. The following sets are diophantine.

(a) {(a, b): a = b} ⊆ N2, (b) {(a, b): a ≤ b} ⊆ N2, (c) {(a, b): a < b} ⊆ N2,

(d) {(a, b): a|b} ⊆ N2, (e) {(a, b, c): a ≡ b mod c} ⊆ N3, (f) {(a, b, c): c = |a, b|} ⊆ N3,

(g) {a : a ≠ 2^n for all n} ⊆ N.

Proof. Just note the following equivalences:

(a) a = b ⇐⇒ a − b = 0;

(b) a ≤ b ⇐⇒ ∃x a + x = b;

(c) a < b ⇐⇒ ∃x a + x + 1 = b; (d) a|b ⇐⇒ ∃x ax = b; (e) a ≡ b mod c ⇐⇒ ∃x∃y a − b = cx − cy; (f) c = |a, b| ⇐⇒ 2c = (a + b)(a + b + 1) + 2a;

(g) a ≠ 2^n for all n ⇐⇒ ∃x∃y (2x + 3)y = a.
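Equivalences (f) and (g) are easy to confirm numerically for small values; the search ranges in this Python sketch are ad hoc assumptions, chosen large enough for the inputs tested.

```python
# Brute-force sanity checks for two equivalences in Lemma 46 (small ranges only).

# (f): the pairing equation 2c = (a + b)(a + b + 1) + 2a determines exactly
# one c for each pair (a, b), so c = |a, b| is a well-defined function.
for a in range(20):
    for b in range(20):
        sols = [c for c in range(1000) if 2 * c == (a + b) * (a + b + 1) + 2 * a]
        assert len(sols) == 1

# (g): a is not a power of 2 iff a = (2x + 3)y for some x, y.
def not_power_of_two(a):
    return any((2 * x + 3) * y == a for x in range(a + 1) for y in range(a + 1))

powers = {1, 2, 4, 8, 16, 32, 64}
assert all(not_power_of_two(a) != (a in powers) for a in range(1, 100))
```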

Lemma 47. Being diophantine is preserved under certain operations:

(a) If D,E ⊆ Nm are diophantine, so are D ∪ E,D ∩ E ⊆ Nm.

(b) If D ⊆ Nm+n is diophantine, so is π(D) ⊆ Nm where π : Nm+n → Nm is deﬁned by π(x1, . . . , xm, y1, . . . , yn) = (x1, . . . , xm).

(c) A set D ⊆ Nm is diophantine if and only if D is existentially definable in the ordered semiring (N; 0, 1, +, ·, ≤).

Proof.

(a) Let D, E ⊆ Nm be diophantine, and take f, g ∈ Z[A, X] such that D = {~a : ∃~x f(~a, ~x) = 0}, E = {~a : ∃~x g(~a, ~x) = 0}.

Then clearly

D ∪ E = {~a : ∃~x f(~a, ~x)g(~a, ~x) = 0}, D ∩ E = {~a : ∃~x f(~a, ~x)^2 + g(~a, ~x)^2 = 0}.

(b) Let D ⊆ Nm+n be diophantine, and take f ∈ Z [A, B, X] such that

D = {(~a,~b): ∃~xf(~a,~b, ~x) = 0}

Then π(D) = {~a : ∃~b ∃~xf(~a,~b, ~x) = 0}.

(c) For (⇒) use that for each f ∈ Z[A, X] there exist g, h ∈ N[A, X] such that f = g − h. For the (⇐) direction, note that for any quantifier-free formula in the language of (N; 0, 1, +, ·, <) you can get rid of inequalities, negations, disjunctions, and conjunctions at the cost of introducing extra existentially quantified variables by means of the following equivalences:

x ≠ y ⇔ ∃z (x + z + 1 = y ∨ y + z + 1 = x),
x = 0 ∨ y = 0 ⇔ xy = 0,
x = 0 ∧ y = 0 ⇔ x^2 + y^2 = 0,
x < y ⇔ ∃z (x + z + 1 = y).

Definition 48. A diophantine function f : Nm → N is a function whose graph is a diophantine set in Nm+1. For example, addition and multiplication are diophantine functions on N2, and so is the monus function. Also gcd : N2 → N is diophantine:

gcd(a, b) = c ⇐⇒ ∃x, y ((ax − by = c ∨ by − ax = c) ∧ c|a ∧ c|b).
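The displayed definition of gcd can be tested by bounded brute force. The bound a + b + 2 on the Bézout witnesses x, y is an assumption of this sketch (it suffices for the small inputs checked, by the usual bounds on Bézout coefficients).

```python
# Bounded brute-force check of the diophantine definition of gcd,
# compared against math.gcd on small inputs (c >= 1 assumed).

from math import gcd

def gcd_dioph(a, b, c):
    bound = a + b + 2                       # ad hoc bound on the witnesses
    bezout = any(a * x - b * y == c or b * y - a * x == c
                 for x in range(bound) for y in range(bound))
    return bezout and a % c == 0 and b % c == 0

for a in range(13):
    for b in range(13):
        for c in range(1, 13):
            assert gcd_dioph(a, b, c) == (gcd(a, b) == c)
```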

Lemma 49. Let f : Nm → N, R ⊆ Nm and g1, . . . , gm : Nn → N be diophantine. Then the function f(g1, . . . , gm) : Nn → N, and the relation R(g1, . . . , gm) ⊆ Nn defined by

R(g1, . . . , gm)(~a) ⇐⇒ R(g1(~a), . . . , gm(~a)),

are diophantine.

Proof. Exercise.

Some terminology. A polynomial zero set in Nn is a set {~x ∈ Nn : f(~x) = 0} where f(X) ∈ Z[X]. A projection of a set R ⊆ Nn is a set S = π(R) ⊆ Nm with m ≤ n and π : Nn → Nm given by π(x1, . . . , xn) = (x1, . . . , xm). Thus polynomial zero sets are primitive recursive, and the diophantine sets are just the projections of polynomial zero sets.

The bounded universal quantification of a set R ⊆ Nn+2, denoted ∀≤R, is the subset of Nn+1 defined by: (∀≤R)(~x, y) ⇐⇒ ∀i≤y R(~x, y, i). Let Σ be the smallest subset of ⋃_m P(Nm) that contains all polynomial zero sets and is closed under taking projections and bounded universal quantifications.

Proposition 50. Σ = ⋃_m {S : S is a recursively enumerable set in Nm}.

Before we can prove this we need to know that Σ is closed under some elementary operations like taking cartesian products and intersections. This is basically an exercise in predicate logic, which we now carry out. Instead of N and its ordering we may consider in this exercise any nonempty set A with a binary relation M on A. Till further notice we let a, b, x, y, sometimes with indices, range over A, with vector notation like ~a used to denote a tuple (a1, . . . , an) of the appropriate length n. For R ⊆ A^{n+2} we define the bounded universal quantification ∀M R ⊆ A^{n+1} of R by

(∀M R)(~a, b) :⇐⇒ ∀yMb R(~a, b, y) :⇐⇒ ∀y (y M b ⇒ R(~a, b, y)).

For n ≥ m, let πm^n : A^n → A^m be given by πm^n(x1, . . . , xn) = (x1, . . . , xm). Taking cartesian products commutes with projecting:

D ⊆ A^d, m ≤ n, S ⊆ A^n =⇒ D × πm^n(S) = π_{d+m}^{d+n}(D × S).

It also commutes with bounded universal quantification:

D ⊆ A^d, R ⊆ A^{n+2} =⇒ ∀M(D × R) = D × ∀M R.

For λ : {1, . . . , m} → {1, . . . , n} and S ⊆ A^m we define

λ(S) := {(x1, . . . , xn) : (xλ(1), . . . , xλ(m)) ∈ S} ⊆ A^n.

Let for each n a collection Cn of subsets of A^n be given such that:

• A^0 ∈ C0 and ∆ := {(x, y) : x = y} ∈ C2;

• C, D ∈ Cn =⇒ C ∩ D ∈ Cn;

• λ : {1, . . . , m} → {1, . . . , n} injective, C ∈ Cm =⇒ λ(C) ∈ Cn.

With m = 0 in the last condition we obtain A^n ∈ Cn for all n. More generally, the inclusion map {1, . . . , m} → {1, . . . , m + n} shows that if C ∈ Cm, then C × A^n ∈ Cm+n. Using also permutations of coordinates we see that if D ∈ Cn, then A^m × D ∈ Cm+n, and thus by intersecting C × A^n and A^m × D,

C ∈ Cm, D ∈ Cn =⇒ C × D ∈ Cm+n.

Also, if 1 ≤ i < j ≤ n, then ∆_{i,j}^n := {(x1, . . . , xn) : xi = xj} ∈ Cn. An example of a family (Cn) with the properties above is obtained by taking Cn to be the set of polynomial zero sets in Nn, with (A, M) = (N, ≤).

Let C := ⋃_n Cn, a subset of the disjoint union ⋃_n P(A^n), and define Σ(C) to be the smallest subset of ⋃_n P(A^n) that includes C, and is closed under projection and bounded universal quantification. (In the example above for (A, M) = (N, ≤) this gives Σ(C) = Σ.) Below we show that the conditions imposed on C are inherited by Σ(C). The main issue is to deal with the fact that bounded universal quantification does not commute with some coordinate reindexings given by maps λ : {1, . . . , m} → {1, . . . , n}.

Lemma 51. If D ∈ C, and S ∈ Σ(C), then D × S ∈ Σ(C).

Proof. Let D ∈ Cd. Define the subset Σ′ of Σ(C) to have as elements the S ∈ Σ(C) such that D × S ∈ Σ(C). Note that C ⊆ Σ′. Since projection and bounded universal quantification commute with the operation of taking the cartesian product with D as first factor, it follows that Σ′ is closed under projection and bounded universal quantification. Thus Σ′ = Σ(C).

Lemma 52. If S ∈ Σ(C), S ⊆ A^m, then:

(1) C ∩ S ∈ Σ(C) for all C ∈ Cm;

(2) λ(S) ∈ Σ(C) for all injective λ : {1, . . . , m} → {1, . . . , n}.

Proof. Define the subset Σ′ of Σ(C) to have as its elements the S ⊆ A^m, m = 0, 1, 2, . . . , such that S ∈ Σ(C) and D ∩ λ(S) ∈ Σ(C) for all D ∈ Cn and injective λ : {1, . . . , m} → {1, . . . , n}. Then C ⊆ Σ′. Next we show that Σ′ is closed under projection. Let m ≤ n and S ⊆ A^n, S ∈ Σ′. To show that πm^n(S) ∈ Σ′, let λ : {1, . . . , m} → {1, . . . , k} be injective and D ∈ Ck. Then for ~a = (a1, . . . , ak) ∈ A^k and with ~x = (x1, . . . , x_{n−m}) ranging over A^{n−m},

(D ∩ λ(πm^n(S)))(~a) ⇔ D(~a) & (πm^n(S))(aλ(1), . . . , aλ(m))

⇔ D(~a) & ∃~x S(aλ(1), . . . , aλ(m), ~x)
⇔ ∃~x (D(~a) & S(aλ(1), . . . , aλ(m), ~x))
⇔ ∃~x (D(~a) & (µS)(~a, ~x)),

where µ : {1, . . . , n} → {1, . . . , k + n − m} is given by µ(j) = λ(j) for 1 ≤ j ≤ m, µ(m + j) = k + j for 1 ≤ j ≤ n − m.

Since S ∈ Σ′, the set (D × A^{n−m}) ∩ µS belongs to Σ(C), and so does

D ∩ λ(πm^n(S)) = πk^{k+n−m}((D × A^{n−m}) ∩ µS).

Thus Σ′ is closed under projection. It remains to show that Σ′ is closed under bounded universal quantification, so let R ⊆ A^{m+2}, R ∈ Σ′. To get ∀M R ∈ Σ′, let λ : {1, . . . , m + 1} → {1, . . . , n} be injective and D ∈ Cn. Then for S = D ∩ λ(∀M R) and ~a = (a1, . . . , an),

S(~a) ⇔ D(~a) & (∀M R)(aλ(1), . . . , aλ(m+1))
⇔ D(~a) & ∀yMaλ(m+1) R(aλ(1), . . . , aλ(m+1), y)
⇔ ∃x (D(~a) & x = aλ(m+1) & ∀yMx R(aλ(1), . . . , aλ(m), x, y))
⇔ ∃x ∀yMx (D(~a) & x = aλ(m+1) & (µR)(~a, x, y)),

where µ : {1, . . . , m + 2} → {1, . . . , n + 2} is given by µ(j) = λ(j) for 1 ≤ j ≤ m, µ(m + 1) = n + 1, µ(m + 2) = n + 2.

Arguing as before this gives S ∈ Σ(C), and so Σ′ is closed under bounded universal quantification.

Corollary 53. If S, T ∈ Σ(C), then S × T ∈ Σ(C). If S, T ∈ Σ(C) and also S, T ⊆ A^n, then S ∩ T ∈ Σ(C).

Proof. Let S ⊆ A^n and S ∈ Σ(C). Define Σ′ to be the subset of Σ(C) whose elements are the T ∈ Σ(C) with S × T ∈ Σ(C). It follows from Lemma 51 and (2) of Lemma 52 that C ⊆ Σ′. Also, projection and bounded universal quantification commute with taking cartesian products with S as first factor, so Σ′ is closed under projection and bounded universal quantification. Hence Σ′ = Σ(C), which gives the first claim of the corollary. Now, with also T ⊆ A^n and T ∈ Σ(C), we have

S ∩ T = πn^{2n}(∆_{1,n+1}^{2n} ∩ · · · ∩ ∆_{n,2n}^{2n} ∩ (S × T)),

and so S ∩ T ∈ Σ(C) by (1) of Lemma 52.

We have now shown that Σ(C) inherits the conditions that we imposed on C. In particular, with (A, M) = (N, ≤), it follows that Σ is closed under cartesian products and intersections. We now return to the setting where variables like a, b, x, y range over N and give the proof of Proposition 50:

Proof. For ⊆, note that the polynomial zero sets are primitive recursive, and hence recursively enumerable. Now use that the class of recursively enumerable sets is closed under taking projections and bounded universal quantifications. For ⊇ we use the familiar inductive definition of “recursive function” to show that these functions are in Σ, that is, their graphs are. It is clear that this holds for the initial functions (addition, multiplication, . . . ). Assume inductively that

the graphs of f : Nn → N and g1, . . . , gn : Nm → N belong to Σ. Then with ~a = (a1, . . . , am) and ~x = (x1, . . . , xn) we have the equivalence

f(g1, . . . , gn)(~a) = b ⇐⇒ ∃~x (g1(~a) = x1 & . . . & gn(~a) = xn & f(~x) = b),

which shows that the graph of f(g1, . . . , gn) belongs to Σ. Next, assume that the graph of g : Nn+1 → N is in Σ, that for all ~x ∈ Nn there is y such that g(~x, y) = 0, and that f : Nn → N is given by f(~x) = µy (g(~x, y) = 0). Then

f(~x) = y ⇐⇒ g(~x, y) = 0 & ∀i<y g(~x, i) ≠ 0,

which shows that the graph of f belongs to Σ.

Lemma 54. If the function 2^x is diophantine, so are x^y, (x choose y) and x!.

Proof. Trivially, 2^{xy} ≡ x mod 2^{xy} − x, so (2^{xy})^y ≡ x^y mod 2^{xy} − x, that is,

2^{xy^2} ≡ x^y mod 2^{xy} − x.

As x^y < 2^{xy} − x for y > 1, we have x^y = rem(2^{xy^2}, 2^{xy} − x) for y > 1. This expression shows that if 2^x is diophantine, so is x^y.
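The identity x^y = rem(2^{xy^2}, 2^{xy} − x) for y > 1 is easy to confirm numerically (Python's ** and % play the roles of exponentiation and rem):

```python
# Numerical check of x^y = rem(2^(x*y*y), 2^(x*y) - x) for y > 1, small ranges.

for x in range(1, 8):
    for y in range(2, 6):
        assert x ** y == 2 ** (x * y * y) % (2 ** (x * y) - x)
```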

For (x choose y), note that 2^x = (1 + 1)^x = Σ_{y=0}^{x} (x choose y), so (x choose y) < 2^x for x > 0. Also, we have (1 + d)^x = Σ_{y=0}^{x} (x choose y) d^y, which gives the base d expansion of (1 + d)^x when d ≥ 2^x. In other words, for all x > 0, 0 ≤ y ≤ x and all n:

(x choose y) = n ⇐⇒ ∃c, d (d = 2^x, n < d, c < d^y, (1 + d)^x ≡ c + n d^y mod d^{y+1}).

This equivalence shows that if 2^x is diophantine, then (x choose y) is diophantine.
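The digit-extraction equivalence can likewise be confirmed numerically: with d = 2^x, the y-th base-d digit of (1 + d)^x is exactly (x choose y).

```python
# With d = 2^x, the coefficient of d^y in the base-d expansion of (1 + d)^x
# is the binomial coefficient (x choose y); extracted by mod and division.

from math import comb

for x in range(1, 10):
    d = 2 ** x
    for y in range(x + 1):
        assert (1 + d) ** x % d ** (y + 1) // d ** y == comb(x, y)
```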

For x!, note that if y > x > 1 then

(y choose x) = y(y − 1) · · · (y − x + 1)/x! = y^x (1 − 1/y) · · · (1 − (x−1)/y)/x!.

An easy induction on n shows that for 0 ≤ ε1, . . . , εn ≤ 1 we have

∏_{i=1}^{n} (1 − εi) ≥ 1 − Σ_{i=1}^{n} εi.

For x > 1 and y ≥ 2x^2 this gives

x! ≤ y^x/(y choose x) = x! · 1/((1 − 1/y) · · · (1 − (x−1)/y)) ≤ x! (1 + 2x^2/y) = x! + 2x^2 x!/y,

using that (1 − 1/y) · · · (1 − (x−1)/y) ≥ 1 − x^2/y ≥ 1/2.

Hence, if 2x^2 x!/y < 1, then x! = qu(y^x, (y choose x)). But 2x^2 x!/y < 1 for y ≥ 2x^{x+2}, and thus x! = qu(y^x, (y choose x)) for y = 2x^{x+2}, x > 1. Therefore, x! is diophantine if 2^x is.

We are going to reduce the proof of the Main Theorem to showing that 2^x is a diophantine function. This reduction can be viewed as an elaborated version of Gödel's coding lemma, which we review here for the reader's convenience, since its proof is suggestive of what comes later.

Lemma 55. Let b1, . . . , bn ≤ B ∈ N. Then there is a unique β ∈ N such that

β < ∏_{i=1}^{n} (1 + i · n!B) and bi = rem(β, 1 + i · n!B) for i = 1, . . . , n.

Proof. The numbers 1 + i · n!B for i = 1, . . . , n are pairwise coprime: if p were a common prime factor of 1 + i · n!B and 1 + j · n!B with 1 ≤ i < j ≤ n, then p | (j − i) · n!B; since j − i < n, this yields p | n!B, and then p | 1, a contradiction. Therefore, by the Chinese Remainder Theorem, there exists β ∈ N such that

β ≡ bi mod 1 + i · n!B for i = 1, . . . , n,

and such a β is uniquely determined modulo ∏_{i=1}^{n} (1 + i · n!B). Since bi < 1 + i · n!B for i = 1, . . . , n, the unique β < ∏_{i=1}^{n} (1 + i · n!B) with β ≡ bi mod 1 + i · n!B for i = 1, . . . , n has the desired property.
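Gödel's β can be computed exactly as in the proof; the following Python sketch builds it by the standard Chinese Remainder construction (Python's three-argument pow supplies the modular inverses, and the sample data are arbitrary assumptions).

```python
# Godel's beta for concrete data, via the Chinese Remainder construction.

from math import factorial, prod

def godel_beta(bs, B):
    """The unique beta below prod_i (1 + i*n!*B) with beta = b_i modulo
    1 + i*n!*B for i = 1, ..., n, where bs = [b_1, ..., b_n], each b_i <= B."""
    n = len(bs)
    mods = [1 + (i + 1) * factorial(n) * B for i in range(n)]
    M = prod(mods)
    # standard CRT formula; pow(a, -1, m) is the inverse of a modulo m,
    # which exists because the moduli are pairwise coprime (see the proof)
    return sum(b * (M // m) * pow(M // m, -1, m) for b, m in zip(bs, mods)) % M

bs, B = [3, 1, 4, 1], 4
beta = godel_beta(bs, B)
mods = [1 + (i + 1) * factorial(4) * B for i in range(4)]
assert beta < prod(mods)
assert [beta % m for m in mods] == bs    # each b_i is recovered as rem(beta, ...)
```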

It will be convenient to fix some notation in the remainder of this section:

~a := (a1, . . . , am), ~b := (b1, . . . , bn), ~v := (v1, . . . , vn),

~v ≤ y :⇐⇒ v1 ≤ y, . . . , vn ≤ y.

We showed that, for each m, the recursively enumerable sets in Nm are exactly the subsets of Nm that are in Σ. Hence, to obtain the Main Theorem, it suﬃces to prove: If D ⊆ Nm+2 is diophantine, then so is ∀≤D ⊆ Nm+1. We ﬁrst make a small further reduction. Let the diophantine set D ⊆ Nm+2 be given by the equivalence

D(~a, x, u) ⇐⇒ ∃~v F(~a, x, u, ~v) = 0, where F(A, X, U, V) ∈ Z[A, X, U, V], A = (A1, . . . , Am), V = (V1, . . . , Vn) and where A1, . . . , Am, X, U, V1, . . . , Vn are distinct polynomial indeterminates. Then for all ~a, x we have:

(∀≤D)(~a, x) ⇐⇒ ∀u≤x D(~a, x, u)

⇐⇒ ∀u≤x ∃~vF (~a,x, u,~v) = 0

⇐⇒ ∃y ∀u≤x ∃~v ≤ y F (~a,x, u,~v) = 0.

Thus it remains to show that for any F (A, X, U, V ) ∈ Z [A, X, U, V ], the set

{(~a, x, y) : ∀u≤x ∃~v≤y F(~a, x, u, ~v) = 0} ⊆ Nm+2

is diophantine.

The next result goes under the name of Bounded Quantifier Theorem. Besides the polynomial indeterminates in (A, X, U, V), we let Y be an extra indeterminate which we allow also to occur in F:

Theorem 56. Suppose F (A, X, Y, U, V ) in Z[A, X, Y, U, V ] and G(A, X, Y ) in N[A, X, Y ] are polynomials such that for all ~a,x, y and all u ≤ x and ~v ≤ y, G(~a,x, y) > |F (~a,x, y, u,~v)| + 2x + y + 2.

Then the following equivalence holds for all ~a,x, y, with g := G(~a,x, y):

∀u≤x ∃~v≤y F(~a, x, y, u, ~v) = 0 ⇐⇒

∃~b [ (b1 choose y+1) ≡ · · · ≡ (bn choose y+1) ≡ F(~a, x, y, g! − 1, ~b) ≡ 0 mod (g!−1 choose x+1) ]

Before we start the proof, note that for any F ∈ Z[A, X, Y, U, V] we obtain a G(A, X, Y) ∈ N[A, X, Y] as in the hypothesis of the theorem as follows: let F* ∈ N[A, X, Y, U, V] be obtained from F by replacing each coefficient with its absolute value, and put

G(A, X, Y) := F*(A, X, Y, X, Y, …, Y) + 2X + Y + 3.

Proof. We shall refer to the equivalence in the Bounded Quantifier Theorem as the BQ-equivalence. Let ~a, x, and y be given and put g := G(~a, x, y). Then

(g!−1 choose x+1) = (g!−1)(g!−2)···(g!−x−1) / (1·2···(x+1))
= ((g!−1)/1)·((g!−2)/2)···((g!−x−1)/(x+1))
= (g!/1 − 1)(g!/2 − 1)···(g!/(x+1) − 1).

Since g > x + 2, all x + 1 factors in this last product are in N^{>1}.

Claim 1. Each prime factor of (g!−1 choose x+1) is greater than g.

This is because g ≥ 2x + 2, so every prime p ≤ g divides g!/(u+1) for each u ≤ x, so no prime p ≤ g divides (g!−1 choose x+1).

Claim 2. The factors g!/1 − 1, g!/2 − 1, …, g!/(x+1) − 1 are pairwise coprime.

To see why, suppose p is a prime factor of g!/i − 1 and of g!/j − 1, with 1 ≤ i < j ≤ x + 1. Then p > g by Claim 1, but p | g! − i and p | g! − j, so p | j − i, contradicting 0 < j − i ≤ g.

For each u ≤ x, take a prime factor p_u of g!/(u+1) − 1. Then

p_u > g,  g!/(u+1) − 1 ≡ 0 mod p_u,

so g! ≡ u + 1 mod p_u, hence g! − 1 ≡ u mod p_u. Thus for all ~b and all u ≤ x:

F(~a, x, y, g!−1, ~b) ≡ F(~a, x, y, u, rem(b_1, p_u), …, rem(b_n, p_u)) mod p_u.

Now assume that for each u ≤ x we have natural numbers v_{u1}, v_{u2}, …, v_{un} ≤ y such that F(~a, x, y, u, v_{u1}, …, v_{un}) = 0. For i = 1, …, n, the Chinese Remainder Theorem provides b_i < (g!−1 choose x+1) such that

b_i ≡ v_{ui} mod g!/(u+1) − 1  for all u ≤ x.

In other words,

g!/(u+1) − 1 | b_i(b_i − 1)···(b_i − y)  for i = 1, …, n and all u ≤ x.

Then by Claim 2, (g!−1 choose x+1) | b_i(b_i − 1)···(b_i − y). By Claim 1 and g > y + 1, all prime factors of (g!−1 choose x+1) are greater than y + 1. So

(g!−1 choose x+1) | b_i(b_i − 1)···(b_i − y)/(y+1)! = (b_i choose y+1).

Hence

(b_1 choose y+1) ≡ ··· ≡ (b_n choose y+1) ≡ 0 mod (g!−1 choose x+1).

For u ≤ x, (g!/(u+1) − 1)(u + 1) = g! − (u + 1) = (g! − 1) − u, so

g! − 1 ≡ u mod g!/(u+1) − 1, hence

F(~a, x, y, g!−1, ~b) ≡ F(~a, x, y, u, ~v_u) ≡ 0 mod g!/(u+1) − 1.

Thus, by Claim 2, F(~a, x, y, g!−1, ~b) ≡ 0 mod (g!−1 choose x+1). This proves the forward direction of the BQ-equivalence.

For the converse, let ~b be such that

(b_1 choose y+1) ≡ ··· ≡ (b_n choose y+1) ≡ F(~a, x, y, g!−1, ~b) ≡ 0 mod (g!−1 choose x+1).

Then for 1 ≤ i ≤ n and u ≤ x:

(b_i choose y+1) ≡ 0 mod p_u, so p_u | b_i(b_i − 1)···(b_i − y),

hence p_u | b_i − k for some k ≤ y, which gives rem(b_i, p_u) ≤ y. Hence

|F(~a, x, y, u, rem(b_1, p_u), …, rem(b_n, p_u))| < g < p_u | (g!−1 choose x+1),

and thus F(~a, x, y, u, rem(b_1, p_u), …, rem(b_n, p_u)) = 0, as desired.

Lemma 54 and the Bounded Quantifier Theorem reduce the proof of the Main Theorem to showing that the function 2^x is diophantine. This will be achieved by exploiting subtle properties of solutions of Pell equations. In the next section we derive the relevant facts about these equations. We finish this section with a digression on exponentially diophantine equations. Readers who want to take the shortest path to a proof of the Main Theorem can skip this.

Exponentially diophantine sets. Some results above are conditional on the function 2^x being diophantine. A nice way to eliminate the conditional nature of these results, without proving outright that 2^x is diophantine, is as follows. Call a set E ⊆ N^m exponentially diophantine if for some polynomial

F(A, X, Y) ∈ Z[A, X, Y], with A = (A_1, …, A_m), X = (X_1, …, X_n), Y = (Y_1, …, Y_n), the following equivalence holds for all ~a ∈ N^m:

E(~a) ⇐⇒ ∃~x F(~a, ~x, 2^{~x}) = 0,

where 2^{~x} := (2^{x_1}, …, 2^{x_n}) for ~x = (x_1, …, x_n) ∈ N^n. Clearly, if E ⊆ N^m is diophantine, then E is exponentially diophantine. Note that Lemma 45 goes

through with “exponentially diophantine” instead of “diophantine”. So does Lemma 47, where in part (c) the ordered semiring (N; 0, 1, +, ·, ≤) is replaced by the ordered exponential semiring (N; 0, 1, +, ·, exp, ≤) with exp(x) := 2^x. A function f : N^m → N is said to be exponentially diophantine if its graph (as a subset of N^{m+1}) is exponentially diophantine. Then Lemma 49 goes through with “exponentially diophantine” instead of “diophantine”. The proof of Lemma 54 shows (unconditionally) that the functions x^y, (x choose y), and x! are exponentially diophantine. We are now approaching a proof of the following unconditional result.

Theorem 57. For each set S ⊆ N^m: S is recursively enumerable ⇐⇒ S is exponentially diophantine.

As stated, this is of course a consequence of the Main Theorem to be proved in the next section, but it seems worth deriving this exponential version using just the facts we have already established. Moreover, this exponential version holds in an enhanced form that we briefly touch on at the end of this digression. One might try to prove Theorem 57 by an exponential analogue of the BQ-Theorem. However, the proof of that theorem doesn't go through, since it can happen that x ≡ y mod m but 2^x ≢ 2^y mod m. Instead we shall focus on improving Proposition 50, which says that any given recursively enumerable set can be obtained from a polynomial zero set by a finite sequence of projections and bounded universal quantifications. The main point is to show that bounded universal quantification is needed only once in such a description of a given recursively enumerable set. This is an old result due to Martin Davis, the Davis Normal Form:

Proposition 58. Given any recursively enumerable set S ⊆ N^d, there is a diophantine set D ⊆ N^{d+m+2} with the property that for all ~a ∈ N^d,

S(~a) ⇐⇒ ∃x_1 ··· ∃x_m ∃x ∀u≤x D(~a, x_1, …, x_m, x, u).

Proof. We first show how to interchange the order of bounded universal quantification and existential quantification. Let S ⊆ N^{m+3}. Using the Chinese Remainder Theorem to code finite sequences, as in the proof of Lemma 55, we have for all (~a, b) ∈ N^{m+1}:

∀u≤b ∃x S(~a, b, u, x) ⇐⇒ ∃y, z ∀u≤b S(~a, b, u, rem(y, 1 + (u+1)z)).

Next, we show how to contract two successive bounded universal quantifications into a single bounded universal quantification, at the cost of an extra existential quantification. Let L, R : N → N be the functions such that n = |L(n), R(n)| for all n. It is easy to check that L and R are diophantine. Let now S ⊆ N^{m+4}.

Then for all (~a, b, c) ∈ N^{m+2}:

∀u≤b ∀v≤c S(~a, b, c, u, v)
⇐⇒ ∀x≤|b,c| [(L(x) ≤ b & R(x) ≤ c) ⇒ S(~a, b, c, L(x), R(x))]
⇐⇒ ∃y ∀x≤y [y = |b, c| & ((L(x) ≤ b & R(x) ≤ c) ⇒ S(~a, b, c, L(x), R(x)))]

Now an inductive argument using Proposition 50 yields the desired result.

The BQ-Theorem in combination with Proposition 58 and the facts mentioned earlier on exponentially diophantine sets and functions shows that any recursively enumerable set S ⊆ N^d is exponentially diophantine. This concludes the proof of Theorem 57.

2.3 Pell Equations

Below we assume that d ∈ N is not a square, that is,

d ∉ sq(N) := {n² : n = 0, 1, 2, …} = {0, 1, 4, 9, …}.

So 2, 3, 5, 6, 7, 8, 10, … are the possible values of d. Then

X² − dY² = 1  (P)

is called a Pell equation. Note that (P) has the solution (1, 0), which we call the trivial solution. Given any solution (a, b), we have the factorization

a² − db² = (a + b√d)(a − b√d) = 1,

which yields (a + b√d)^n (a − b√d)^n = 1, and taking a′, b′ such that

(a + b√d)^n = a′ + b′√d,  (a − b√d)^n = a′ − b′√d,

we see that (a′, b′) is also a solution of (P).

Example. Consider the case d = 2. Then X² − 2Y² = 1 has the non-trivial solution (3, 2), and

(3 + 2√2)² = 9 + 12√2 + 8 = 17 + 12√2,
(3 − 2√2)² = 9 − 12√2 + 8 = 17 − 12√2,

yielding another solution (17, 12).
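As a quick sketch (added for illustration), the multiplication rule above generates a whole sequence of solutions from a given one, using integer arithmetic only:

```python
def pell_solutions(d, x1, y1, count):
    # Repeatedly multiply x + y*sqrt(d) by x1 + y1*sqrt(d):
    # (x + y√d)(x1 + y1√d) = (x*x1 + d*y*y1) + (x*y1 + x1*y)√d.
    sols, x, y = [], 1, 0          # (1, 0) is the trivial solution
    for _ in range(count):
        sols.append((x, y))
        x, y = x * x1 + d * y * y1, x * y1 + x1 * y
    return sols
```

Starting from (3, 2) for d = 2 this yields (1, 0), (3, 2), (17, 12), (99, 70), …, each satisfying x² − 2y² = 1.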

A key fact in connection with H10 is that the solutions of (P) in N² form a sequence (x_n, y_n), n = 0, 1, 2, …, where x_n and y_n grow roughly as an exponential function of n. Our short term goal is to establish this fact.

Lemma 59. G := {x ± y√d : x² − dy² = 1} is a subgroup of the multiplicative group R^{>0} of positive real numbers.

Proof. First, 1 = 1 + 0·√d, so 1 ∈ G. We have x² − dy² = (x + y√d)(x − y√d), so if x² − dy² = 1, then x − y√d = 1/(x + y√d). Also,

(x_1 + y_1√d)(x_2 + y_2√d) = (x_1x_2 + dy_1y_2) + (x_1y_2 + x_2y_1)√d,
(x_1 − y_1√d)(x_2 − y_2√d) = (x_1x_2 + dy_1y_2) − (x_1y_2 + x_2y_1)√d.

Hence, if x_1² − dy_1² = x_2² − dy_2² = 1, then x² − dy² = 1, where

x := x_1x_2 + dy_1y_2,  y := ±(x_1y_2 + x_2y_1).

Note that the x + y√d ∈ G are ≥ 1 and the x − y√d ∈ G are ≤ 1.

Corollary 60. Suppose (P) has a nontrivial solution in N². Let (x_1, y_1) be such a solution with minimal x_1 > 1. Then the solutions of (P) in N² are exactly the (x_n, y_n) ∈ N² given by

(x_1 + y_1√d)^n = x_n + y_n√d,  n = 0, 1, 2, …

Proof. Put r := x_1 + y_1√d, so r is the least element of G that is greater than 1. We have 1 < r < r² < ··· and r^n → ∞ as n → ∞. Let (x, y) be a solution of (P). Take the unique n such that r^n ≤ x + y√d < r^{n+1}. Then 1 ≤ (x + y√d)r^{−n} < r. By the previous lemma, (x + y√d)r^{−n} ∈ G, so (x + y√d)r^{−n} = 1, that is, x + y√d = r^n, so x = x_n and y = y_n.

If there is a non-trivial solution of (P) in N², then (x_1, y_1) as in Corollary 60 is called the minimal solution of (P). Note that in this corollary, n = 0 yields the trivial solution, and n = 1 the minimal solution. We record two rules for solutions of (P), assuming there is a non-trivial solution of (P) in N². First, the addition rules:

x_{m+n} = x_m x_n + d y_m y_n,  y_{m+n} = x_m y_n + x_n y_m,

which follow from the definitions of x_n and y_n, and the doubling rules, which follow from the addition rules:

x_{2n} = 2x_n² − 1,  y_{2n} = 2x_n y_n.

Our next goal is Lagrange's theorem that non-trivial solutions of (P) in N² do exist. First a lemma.

Lemma 61. Let ξ ∈ R^{>1} be irrational. Then there are infinitely many (x, y) with y > 0 such that |ξ − x/y| < 1/y².

The obvious bound would be 1/y, so the lemma says that there are much better rational approximations to ξ, in terms of the size of the denominator. One can show that replacing y² in the lemma by y^r with a fixed real number r > 2 would result in a false statement for ξ = √2.

Proof. Let n ≥ 1; we claim that there are x, y > 0 such that y ≤ n and

|ξ − x/y| < 1/(ny) ≤ 1/y².

For y = 0, 1, 2, …, n, we have yξ = x + ε(y), where x ∈ N and 0 ≤ ε(y) < 1. Divide the interval [0, 1) into the n subintervals

[0, 1/n), [1/n, 2/n), …, [(n−1)/n, 1).

By the pigeonhole principle, two of the n + 1 numbers ε(0), ε(1), …, ε(n) must lie in the same subinterval. Thus we have y_1 < y_2, both in {0, 1, …, n}, such that |ε(y_1) − ε(y_2)| < 1/n. This gives x_1 ≤ x_2 such that

y_1ξ = x_1 + ε(y_1),  y_2ξ = x_2 + ε(y_2).

Now set y = y_2 − y_1 and x = x_2 − x_1. We get yξ = x + ε(y_2) − ε(y_1). So

|yξ − x| < 1/n, and thus |ξ − x/y| < 1/(ny).

We have x > 0 because ξ > 1 and 1/(ny) < 1. For each n ≥ 1, take x(n), y(n) ∈ N^{>0} such that y(n) ≤ n and

|ξ − x(n)/y(n)| < 1/(n·y(n)) ≤ 1/y(n)².

As n goes to infinity, x(n)/y(n) → ξ, so the set

{(x(n), y(n)) : n = 1, 2, 3,...} is inﬁnite.
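A small numerical sketch (an illustration, not from the notes): searching denominators y ≤ 100 for ξ = √2 already produces several approximations of the quality promised by the lemma.

```python
import math

def dirichlet_hits(xi, ymax):
    # Collect fractions x/y with |xi - x/y| < 1/y^2.  Taking x = round(xi*y)
    # suffices, since any other x is at least as far from xi*y.
    hits = []
    for y in range(1, ymax + 1):
        x = round(xi * y)
        if x > 0 and abs(xi - x / y) < 1 / y**2:
            hits.append((x, y))
    return hits
```

For √2 the list contains the continued-fraction convergents (1, 1), (3, 2), (7, 5), (17, 12), (41, 29), (99, 70).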

Theorem 62. The equation (P) has a nontrivial solution in N².

This theorem is due to Lagrange, but we give Dirichlet's proof.

Proof. If x, y ∈ N^{>0} and |√d − x/y| < 1/y², then

0 < |d − x²/y²| = |√d − x/y| · (√d + x/y) < (1/y²)(1 + 2√d),

so

0 < |x² − dy²| < 1 + 2√d.

By the previous lemma there is therefore a nonzero integer λ with |λ| < 1 + 2√d such that the equation

X² − dY² = λ  (P_λ)

has infinitely many solutions (x, y) with x, y ∈ N^{>0}. Consider two such solutions (x_1, y_1) and (x_2, y_2) of (P_λ). Then:

x_1² − dy_1² = (x_1 + y_1√d)(x_1 − y_1√d) = λ,
x_2² − dy_2² = (x_2 + y_2√d)(x_2 − y_2√d) = λ,

so

((x_1 + y_1√d)/(x_2 + y_2√d)) · ((x_1 − y_1√d)/(x_2 − y_2√d)) = 1.

Now we remove square roots from denominators:

(x_1 + y_1√d)/(x_2 + y_2√d) = (x_1 + y_1√d)(x_2 − y_2√d)/λ = ((x_1x_2 − dy_1y_2) + (x_2y_1 − x_1y_2)√d)/λ,

(x_1 − y_1√d)/(x_2 − y_2√d) = (x_1 − y_1√d)(x_2 + y_2√d)/λ = ((x_1x_2 − dy_1y_2) − (x_2y_1 − x_1y_2)√d)/λ.

Thus, setting

s := (x_1x_2 − dy_1y_2)/λ,  t := (x_2y_1 − x_1y_2)/λ,

we get s² − dt² = 1. The problem is that s and t might not be integers. However, (P_λ) has infinitely many solutions (x, y), so by the pigeonhole principle we can choose (x_1, y_1) and (x_2, y_2) as above so that x_1 ≡ x_2 mod λ and y_1 ≡ y_2 mod λ, and (x_1, y_1) ≠ (x_2, y_2). Then

x_1x_2 − dy_1y_2 ≡ x_1² − dy_1² ≡ 0 mod λ, and x_2y_1 − x_1y_2 ≡ 0 mod λ,

so s, t ∈ Z. It remains to show that (s, t) does not give the trivial solution of (P). Suppose towards a contradiction that t = 0, so s = ±1. Then x_2y_1 − x_1y_2 = 0, so x_2/x_1 = y_2/y_1. We can assume x_2 > x_1; set r := x_2/x_1 > 1. Then

x_2² − dy_2² = r²(x_1² − dy_1²) = r²λ ≠ λ,

a contradiction. So (|s|, |t|) is a nontrivial solution of (P) in N².

Although this proof appears non-constructive, one can actually derive from it an explicit upper bound on the minimal solution: our use of the pigeonhole principle does not really need infinitely many solutions to (P_λ), but only more than λ², and we have |λ| < 1 + 2√d.

The minimal solution of a Pell equation X² − dY² = 1 varies rather wildly with d. For our purpose we can restrict attention to a situation where the minimal solution is obvious: let a ≥ 2 and consider the Pell equation

X² − (a² − 1)Y² = 1.  (P_a)

Its minimal solution is (a, 1). Now define x_a(n), y_a(n) ∈ N by

x_a(n) + y_a(n)√(a² − 1) = (a + √(a² − 1))^n.

By previous results, {(x_a(n), y_a(n)) : n = 0, 1, 2, …} is the set of all solutions of (P_a) in N². We now list several rules, most of which follow trivially from the earlier addition and doubling rules. First of all, we have the addition rules:

x_a(m + n) = x_a(m)x_a(n) + (a² − 1)y_a(m)y_a(n),

y_a(m + n) = x_a(m)y_a(n) + x_a(n)y_a(m),

next, the doubling formulas:

x_a(2n) = 2x_a(n)² − 1,

y_a(2n) = 2x_a(n)y_a(n),

the simultaneous recursion equations:

x_a(0) = 1,  x_a(n + 1) = a·x_a(n) + (a² − 1)y_a(n),
y_a(0) = 0,  y_a(n + 1) = x_a(n) + a·y_a(n),

and the course-of-values recursion:

x_a(0) = 1,  x_a(1) = a,  x_a(n + 2) = 2a·x_a(n + 1) − x_a(n),
y_a(0) = 0,  y_a(1) = 1,  y_a(n + 2) = 2a·y_a(n + 1) − y_a(n).

One can prove the course-of-values recursion, for example, as follows:

x_a(n + 2) = a·x_a(n + 1) + (a² − 1)y_a(n + 1)
= a·x_a(n + 1) + (a² − 1)[x_a(n) + a·y_a(n)]
= a·x_a(n + 1) + (a² − 1)x_a(n) + a[x_a(n + 1) − a·x_a(n)]
= 2a·x_a(n + 1) − x_a(n).

In addition we have a result about growth:

(2a − 1)^n ≤ y_a(n + 1) < (2a)^n,  2n ≤ y_a(n) for n ≥ 2.

Exercise. Prove the growth result.

Finally, we state some congruence rules: for a, b ≥ 2 we have

y_a(n) ≡ n mod (a − 1),

y_a(n) ≡ y_b(n) mod (a − b).

Proof. For the first rule, we have y_a(0) = 0 ≡ 0 mod (a − 1) and y_a(1) = 1 ≡ 1 mod (a − 1). Thus we can apply induction:

y_a(n + 2) ≡ 2y_a(n + 1) − y_a(n) ≡ 2(n + 1) − n ≡ n + 2 mod (a − 1).

For the second result, we note that it holds for n = 0, 1, and also that a ≡ b mod (a − b). Thus,

y_a(n + 2) ≡ 2b·y_a(n + 1) − y_a(n) mod (a − b),

which shows that y_a(n) satisfies the same recursion as y_b(n) modulo (a − b).
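Both congruence rules, together with the recursion behind them, are easy to check numerically; the following sketch (added for illustration) computes y_a(n) by the course-of-values recursion:

```python
def y(a, n):
    # y_a(0) = 0, y_a(1) = 1, y_a(n+2) = 2a*y_a(n+1) - y_a(n)
    u, v = 0, 1
    for _ in range(n):
        u, v = v, 2 * a * v - u
    return u

# Checks: y_a(n) ≡ n mod (a - 1), and y_a(n) ≡ y_b(n) mod (a - b).
```

For a = 2 this gives the sequence 0, 1, 4, 15, 56, …, and for instance y_3(n) agrees with n modulo 2 for all n.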

Lemma 63 (Periodicity). Suppose m, n > 0 and ya(n) ≡ 0 mod m. Then for all k, l, if k ≡ l mod 2n, then ya(k) ≡ ya(l) mod m.

Thus the remainder rem(ya(k), m) is periodic as a function of k, with period dividing 2n, where the period is the least p ∈ N>0 such that

rem(ya(k), m) = rem(ya(k + lp), m) for all k, l.

Proof. Just observe the following equalities and congruences:

y_a(k + 2n) = x_a(k)y_a(2n) + x_a(2n)y_a(k)
= 2x_a(k)x_a(n)y_a(n) + (2x_a(n)² − 1)y_a(k)
≡ (2x_a(n)² − 1)y_a(k) mod m
≡ (2[(a² − 1)y_a(n)² + 1] − 1)y_a(k) mod m
≡ y_a(k) mod m.
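Numerically the periodicity is easy to observe; a sketch for illustration, with y_a computed by the course-of-values recursion:

```python
def y(a, n):
    # course-of-values recursion for y_a
    u, v = 0, 1
    for _ in range(n):
        u, v = v, 2 * a * v - u
    return u

def remainders(a, m, length):
    # the sequence rem(y_a(k), m) for k = 0, 1, ..., length - 1
    return [y(a, k) % m for k in range(length)]
```

For a = 2 we have y_2(3) = 15, so with m = 5 and n = 3 the lemma predicts that rem(y_2(k), 5) is periodic with period dividing 6.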

We note also that for any m > 0 there is an n > 0 with y_a(n) ≡ 0 mod m. This is because for d := m²(a² − 1) the Pell equation X² − dY² = 1 has a non-trivial solution, say (x, y); then (x, my) is a nontrivial solution of (P_a).

Next we prove some “step-down” lemmas. Roughly speaking, these provide more precise information about the period in the sequence of remainders rem(y_a(k), m) when m has a special form. In addition, they provide information about n from information about y_a(n).

Lemma 64 (First Step-Down Lemma). Let m, n > 0. Then:

y_a(m) | y_a(n) ⇐⇒ m | n,
y_a(m)² | y_a(n) ⇐⇒ m·y_a(m) | n.

Proof. Assume first that m | n, so n = mk, k > 0. Then

x_a(n) + y_a(n)√(a² − 1) = (a + √(a² − 1))^{mk} = ((a + √(a² − 1))^m)^k = (x_a(m) + y_a(m)√(a² − 1))^k.

Expanding the right-hand side by the binomial theorem and comparing coefficients of √(a² − 1), we get

(∗)  y_a(n) = k·x_a(m)^{k−1}·y_a(m) + y_a(m)³·(integer).

In particular, (∗) yields that y_a(m) divides y_a(n), proving the backwards direction of the first equivalence. Conversely, assume that y_a(m) | y_a(n). Towards a contradiction, suppose m ∤ n. Then n = qm + r with q, r ∈ N and 0 < r < m. By the addition formula,

y_a(n) = y_a(qm + r) = x_a(qm)y_a(r) + x_a(r)y_a(qm).

From y_a(m) | y_a(qm) we obtain y_a(m) | x_a(qm)y_a(r). But y_a(qm) and x_a(qm) are coprime, so y_a(m) and x_a(qm) are coprime. Hence y_a(m) | y_a(r), which contradicts 0 < y_a(r) < y_a(m). This proves the first equivalence. For the second, first note that both sides of the equivalence imply m | n, so we can assume n = mk with k ∈ N^{>0}. Then we again apply (∗):

y_a(m)² | y_a(n) ⇐⇒ y_a(m)² | k·x_a(m)^{k−1}·y_a(m)
⇐⇒ y_a(m) | k·x_a(m)^{k−1}
⇐⇒ y_a(m) | k
⇐⇒ m·y_a(m) | n.
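The divisibility equivalences can be spot-checked with a short script (illustrative only):

```python
def y(a, n):
    # course-of-values recursion for y_a
    u, v = 0, 1
    for _ in range(n):
        u, v = v, 2 * a * v - u
    return u

def check_first_step_down(a, bound):
    # y_a(m) | y_a(n)  iff  m | n,  for 0 < m, n < bound
    for m in range(1, bound):
        for n in range(1, bound):
            assert (y(a, n) % y(a, m) == 0) == (n % m == 0)
```

For example, for a = 2 and m = 2 we have y_2(2) = 4, and 16 = y_2(2)² divides y_2(n) exactly when 8 = 2·y_2(2) divides n.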

Lemma 65 (Second Step-Down Lemma). Let n ≥ 1 and suppose that y_a(k) ≡ y_a(l) mod x_a(n). Then k ≡ ±l mod 2n.

Proof. Let i, j ∈ N. Then the following congruences hold:

y_a(2n + i) ≡ −y_a(i) mod x_a(n),
y_a(2n − i) ≡ y_a(i) mod x_a(n)  for 0 ≤ i ≤ n,
y_a(4n + i) ≡ y_a(i) mod x_a(n),
y_a(4n − i) ≡ −y_a(i) mod x_a(n)  for 0 ≤ i ≤ n.

We prove the first of these congruences, leave the second as an exercise, and note that the third and fourth follow easily from applications of the first and second. The proof of the first congruence is as follows.

y_a(2n + i) = x_a(2n)y_a(i) + x_a(i)y_a(2n)
≡ x_a(2n)y_a(i) mod x_a(n)
≡ (2x_a(n)² − 1)y_a(i) mod x_a(n)
≡ −y_a(i) mod x_a(n).

The second line here uses x_a(n) | y_a(2n), which follows from the doubling formula. Next we have two incongruences to the effect that the congruences above are in some sense best possible:

y_a(i) ≢ ±y_a(j) mod x_a(n)  for 0 ≤ i < j ≤ n,

y_a(i) ≢ −y_a(i) mod x_a(n)  for 0 < i ≤ n.

To prove these incongruences we assume a > 2, and leave the somewhat special case a = 2 to the reader. Then

x_a(n) = √((a² − 1)y_a(n)² + 1) > y_a(n)√(a² − 1) > 2y_a(n).

The ﬁrst incongruence now follows because for 0 ≤ i < j ≤ n,

x_a(n) > 2y_a(n) ≥ y_a(j) ± y_a(i).

The second incongruence holds because if 0 < i ≤ n, then

x_a(n) > 2y_a(n) ≥ 2y_a(i) > 0.

Now put i = rem(k, 4n), j = rem(l, 4n). Then

ya(i) ≡ ya(j) mod xa(n).

If i = j, then k ≡ l mod 4n, so k ≡ l mod 2n. If i ≠ j, then a close look at the four initial congruences and the two incongruences above yields i ≡ −j mod 2n, so k ≡ −l mod 2n.

2.4 Proof of the Main Theorem

We are now ready to show that {(a, n, y) : a ≥ 2, y = y_a(n)} ⊆ N³ is diophantine. In the next result and its proof, all variables range over N.

Theorem 66. Let a, y, n ≥ 2. Then the following are equivalent:

• y = y_a(n);
• there are x, u, v, s, t, b such that

(i) 2n ≤ y
(ii) x² − (a² − 1)y² = 1
(iii) v ≥ 1 & u² − (a² − 1)v² = 1
(iv) b ≥ 2 & s² − (b² − 1)t² = 1
(v) b ≡ a mod u
(vi) b ≡ 1 mod y
(vii) t ≡ y mod u
(viii) t ≡ n mod y
(ix) y² | v.

Proof. Let x, u, v, s, t, b be as in (i)–(ix). Then take k, l, m such that

y = y_a(k),  u = x_a(l),  t = y_b(m).

We shall prove that k = n, so y = y_a(n). By (iii), l ≥ 1. By the congruence rules we have y_b(m) ≡ y_a(m) mod (b − a), and so by (v),

y_b(m) ≡ y_a(m) mod x_a(l).

By (vii) we have y_b(m) ≡ y_a(k) mod x_a(l), so

y_a(m) ≡ y_a(k) mod x_a(l).

By the Second Step-Down Lemma,

m ≡ ±k mod 2l.

By (ix), y_a(k)² | y_a(l), so by the First Step-Down Lemma, k·y_a(k) | l, so y | l, and thus

m ≡ ±k mod y.  (2.2)

By the congruence rules, y_b(m) ≡ m mod (b − 1), so

y_b(m) ≡ m mod y  by (vi),
y_b(m) ≡ n mod y  by (viii),

hence n ≡ m mod y, and thus by (2.2),

n ≡ ±k mod y.  (2.3)

By (i) and the earlier growth result on solutions of (P_a) we have 2n ≤ y and 2k ≤ y, which in view of (2.3) yields n = k, so y = y_a(n), as desired.

For the converse, assume y_a(n) = y. Then (ii) holds with x = x_a(n). Take

l := n·y_a(n),  u := x_a(l),  v := y_a(l).

Then (iii) holds. By the First Step-Down Lemma,

y² = y_a(n)² | y_a(l) = v ≥ 1,

and (ix) holds. By (iii), u and v are coprime, and y | v, so u and y are coprime. By the Chinese Remainder Theorem we get b ≥ 2 such that

b ≡ a mod u,  b ≡ 1 mod y.

Then (v) and (vi) hold. Now take s := x_b(n), t := y_b(n). Then (iv) holds, and the congruence rules yield (vii) and (viii).

Corollary 67. The function x ↦ 2^x is diophantine.

Proof. Let a ≥ 2. Then

(2a − 1)^n ≤ y_a(n + 1) ≤ (2a)^n,  (4a − 1)^n ≤ y_{2a}(n + 1) ≤ (4a)^n.

Therefore

2^n (1 − 1/(4a))^n = (4a − 1)^n/(2a)^n ≤ y_{2a}(n + 1)/y_a(n + 1) ≤ (4a)^n/(2a − 1)^n = 2^n (1 + 1/(2a − 1))^n.

For fixed n, with a → ∞,

(1 + 1/(2a − 1))^n → 1,  (1 − 1/(4a))^n → 1.

So for some B(n) ∈ N^{>0}, whenever a > B(n), then

|y_{2a}(n + 1)/y_a(n + 1) − 2^n| < 1/2,

hence for all a > B(n) and all m,

2^n = m ⇐⇒ 2·|y_{2a}(n + 1) − m·y_a(n + 1)| < y_a(n + 1).
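For illustration (a sketch, not part of the proof), this equivalence can be tested numerically for a comfortably large a, computing y_a by the course-of-values recursion:

```python
def y(a, n):
    # y_a(0) = 0, y_a(1) = 1, y_a(n+2) = 2a*y_a(n+1) - y_a(n)
    u, v = 0, 1
    for _ in range(n):
        u, v = v, 2 * a * v - u
    return u

def detects_power(m, n, a):
    # the right-hand side: 2*|y_2a(n+1) - m*y_a(n+1)| < y_a(n+1)
    return 2 * abs(y(2 * a, n + 1) - m * y(a, n + 1)) < y(a, n + 1)
```

With a = 10^9 the test accepts m = 2^n and rejects its neighbours for small n.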

So it only remains to show that there is a diophantine function B : N → N^{>0} such that for all n and all a > B(n),

|y_{2a}(n + 1)/y_a(n + 1) − 2^n| < 1/2.

We have

2^n (1 − 1/(4a))^n ≥ 2^n (1 − n/(4a)) = 2^n − n·2^n/(4a),

so if 4a > 2n·2^n, then 2^n (1 − 1/(4a))^n > 2^n − 1/2. Also,

2^n (1 + 1/(2a − 1))^n ≤ 2^n (1 + n·2^n/(2a − 1)) = 2^n + n·2^{2n}/(2a − 1),

so if 2a − 1 > 2n·2^{2n}, then 2^n (1 + 1/(2a − 1))^n < 2^n + 1/2. As 5^n ≤ y_3(n + 1) ≤ 6^n, we can take B(n) := 2n·y_3(n + 1)² + 1. Then B(n) > 2n·2^{2n}, so B is a diophantine function as desired.

The main theorem has now been established. It gives much more than a negative solution to H10.

Exercise. Show that there is no algorithm determining for any system

P1(X1,...,Xn) = ··· = Pm(X1,...,Xn) = 0

with all P_i ∈ Z[X_1, …, X_n] of degree ≤ 2, whether the system has a solution in Z^n. Using this, show that there is no algorithm determining for any given equation P(X_1, …, X_n) = 0 with P ∈ Z[X_1, …, X_n] of degree ≤ 4 whether the equation has a solution in Z^n.

The answer to H10 is not even known for polynomial equations in two variables. There is, however, an algorithm to decide for any given P ∈ Z[X, Y] whether P(X, Y) = 0 has infinitely many integer solutions. This follows, with some extra work, from a deep theorem due to C.L. Siegel (1929). There is also an algorithm, based on other work by Siegel, to decide whether a given polynomial equation P(X_1, …, X_n) = 0 with P ∈ Z[X_1, …, X_n] of degree 2 has an integer solution. It is not known whether H10 has a positive answer when we allow rational solutions instead of integer solutions. On the other hand, H10 does have a positive answer when we allow real solutions, or p-adic solutions; this follows from the stronger result that the elementary theory of the real field is decidable (Tarski), and likewise for the elementary theory of each p-adic field (Ax–Kochen & Ershov).

Chapter 3

Kolmogorov Complexity

This is a theory about randomness using concepts of recursion theory. The idea is that a long finite string of 0's and 1's is random if any program to compute the string is about as long as the string itself. Instead of natural numbers it is more convenient to work with finite strings of 0's and 1's, that is, elements of 2^{<N}.

In words, tr is the unique bijection 2^{<N} → N that enumerates the strings in order of increasing length, and strings of the same length lexicographically: tr(∅) = 0, tr(0) = 1, tr(1) = 2, tr(00) = 3, and so on. We call a function f : (2^{<N})^d ⇀ 2^{<N} partial recursive if the function g : N^d ⇀ N with

g(tr(x_1), …, tr(x_d)) ≃ tr(f(x_1, …, x_d))

is partial recursive. Note: 2^{|σ|} − 1 ≤ tr(σ) ≤ 2^{|σ|+1} − 2.
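A sketch of this coding (assuming tr is the length-lexicographic bijection, which is consistent with the stated bounds): prepend a 1 and read the result as a binary numeral.

```python
def tr(sigma):
    # tr("") = 0, tr("0") = 1, tr("1") = 2, tr("00") = 3, ...
    return int("1" + sigma, 2) - 1

def tr_inv(n):
    # inverse: write n + 1 in binary and drop the leading "1"
    return bin(n + 1)[3:]
```

One checks directly that 2^|σ| − 1 ≤ tr(σ) ≤ 2^(|σ|+1) − 2 for this coding.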

3.1 Plain Kolmogorov Complexity

Let f : 2^{<N} ⇀ 2^{<N} be partial recursive. The Kolmogorov complexity of a string σ with respect to f is

C_f(σ) = min{|ρ| : f(ρ) = σ}  if there is ρ ∈ D_f such that f(ρ) = σ,
C_f(σ) = ∞  otherwise.

We say that σ is Kolmogorov random with respect to f if C_f(σ) ≥ |σ|. Let U : 2^{<N} × 2^{<N} ⇀ 2^{<N} be a universal partial recursive function, so that

{U(π, ·) : π ∈ 2^{<N}}

is the set of all partial recursive functions 2^{<N} ⇀ 2^{<N}.

Define the partial recursive function u : 2^{<N} ⇀ 2^{<N} by

u(0^{|π|}1πρ) ≃ U(π, ρ),  u(σ) ↑ if σ does not have the form 0^{|π|}1πρ.

Then 1 ≤ C_u(σ) < ∞ for all σ: the left inequality holds because u(∅) ↑, and the right inequality holds because U(π, ·) = id_{2^{<N}} for some π.

Lemma 68. For all partial recursive f : 2^{<N} ⇀ 2^{<N} and all σ,

C_u(σ) ≤ C_f(σ) + const(f).

Here and later const(f) denotes a constant in N depending only on f. Likewise, const denotes an absolute constant in N. (Of course, the possible values of const may vary with the context.)

Proof. Let π be such that f = U(π, ·). If ρ ∈ D_f and f(ρ) = σ, then u(0^{|π|}1πρ) = σ and |0^{|π|}1πρ| = |ρ| + 2|π| + 1. Thus for all σ:

C_u(σ) ≤ C_f(σ) + 2|π| + 1.

Definition 69. C := C_u : 2^{<N} → N.

This complexity function C depends on the initial choice of U, but only in an inessential way: if U′ is another universal partial recursive function and u′ is defined from U′ as u was from U, then |C_u(σ) − C_{u′}(σ)| ≤ const for all σ, by Lemma 68.

Lemma 70. (a) C(x) ≤ |x| + const;

(b) C(xx) ≤ |x| + const;

(c) if h : 2^{<N} → 2^{<N} is recursive, then

C(h(x)) ≤ C(x) + const(h).

Proof. For f = id_{2^{<N}} we have C(x) ≤ C_f(x) + const(f) = |x| + const, which proves (a). For recursive h : 2^{<N} → 2^{<N}, put f := h ∘ u; then

C(h(x)) ≤ C_f(h(x)) + const(f) ≤ C_u(x) + const(f) = C(x) + const(h).

This yields (c), and (b) is a special case of (c). Call x random if C(x) ≥ |x|. For each n, there is a random x with |x| = n. This follows from the pigeonhole principle:

|{τ : |τ| < n}| = 2⁰ + 2¹ + ··· + 2^{n−1} = 2^n − 1, and |{τ : |τ| = n}| = 2^n,

so the partial map

σ ↦ u(σ) : {σ : |σ| < n} ⇀ {x : |x| = n}

cannot be surjective. More precisely, we have the following result.

Lemma 71. For all n and all k ∈ {0, …, n},

|{x : |x| = n, C(x) ≥ n − k}| > 2^n (1 − 1/2^k).

Proof. We have |{τ : |τ| < n − k}| = 2⁰ + 2¹ + ··· + 2^{n−k−1} = 2^{n−k} − 1, so

|{x : |x| = n, C(x) ≥ n − k}| ≥ 2^n − (2^{n−k} − 1) > 2^n (1 − 1/2^k).

Thus C(x) ≥ 90 for more than 99.9% of the strings x of length 100.
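The counting argument can be made concrete by brute force with a toy decompressor in place of u (everything here, both f and toy_C, is a made-up illustration, not the actual universal machine):

```python
from itertools import product

def strings(n):
    return ["".join(b) for b in product("01", repeat=n)]

def f(rho):
    # toy decompressor: "1" + w decodes to w, "0" + w decodes to ww
    if rho.startswith("1"): return rho[1:]
    if rho.startswith("0"): return rho[1:] * 2
    return None                      # f is undefined on the empty string

def toy_C(sigma, max_len):
    # least |rho| <= max_len with f(rho) = sigma, or None if there is none
    for l in range(max_len + 1):
        for rho in strings(l):
            if f(rho) == sigma:
                return l
    return None
```

Among the 64 strings of length 6, exactly the 8 strings of the form ww are compressible for this f (toy_C = 4 < 6); the remaining 56 are random with respect to f, comfortably more than the single random string the pigeonhole bound guarantees.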

Lemma 72 (Martin-Löf). Let k ∈ N be given. Then any sufficiently long z has an initial segment x with C(x) < |x| − k.

Proof. Let f : 2^{<N} → 2^{<N} be the recursive function given by f(σ) := tr^{−1}(|σ|)σ, and take ∆ ∈ N such that C < C_f + ∆. Let z be so long that, with ρ the initial segment of z of length ∆ + k, we have |z| ≥ |ρ| + tr(ρ). Let σ consist of the next tr(ρ) bits of z after ρ, and put x := ρσ, an initial segment of z. Then f(σ) = x, so C_f(x) ≤ |σ|, and

C(x) < C_f(x) + ∆ ≤ |σ| + ∆ ≤ (|ρ| − (∆ + k)) + |σ| + ∆ = |ρ| + |σ| − k = |x| − k.

Corollary 73. For all sufficiently long random z, there are x, y such that z = xy, but C(x) + C(y) < C(z).

Proof. Take k ∈ N such that C(σ) ≤ |σ| + k for all σ. Let z be random and so long that it has an initial segment x with C(x) < |x| − k. Take y with z = xy. Then

C(x) + C(y) < (|x| − k) + (|y| + k) = |x| + |y| = |z| ≤ C(z).

Call a string σ compressible if C(σ) < |σ|. Then the Martin-Löf lemma says that any sufficiently long string has compressible initial segments, and any sufficiently long random string can be decomposed into two strings the sum of whose complexities is strictly less than that of the original string. Thus, the decomposition itself stores information additional to that contained in the strings separately. The logarithms below are with respect to base 2, so log 2^n = n.

Lemma 74. C(xy) ≤ C(x) + C(y) + 2 log C(x) + const.

Proof. Define the partial recursive function f : 2^{<N} ⇀ 2^{<N} by

f(0^{|ρ|}1ρστ) ≃ u(σ)u(τ)  whenever tr(ρ) = |σ|.

(Reading the input from the left, f first recovers ρ, hence |σ| = tr(ρ), and can then split the rest of the input into σ and τ.) Given x and y, take σ and τ of minimal length with u(σ) = x and u(τ) = y, and put ρ := tr^{−1}(|σ|), so that |σ| = C(x), |τ| = C(y), and |ρ| ≤ log(C(x) + 1). Then f(0^{|ρ|}1ρστ) = xy and

|0^{|ρ|}1ρστ| = |σ| + |τ| + 2|ρ| + 1 ≤ C(x) + C(y) + 2 log(C(x) + 1) + 1,

hence

C(xy) ≤ C_f(xy) + const(f) ≤ C(x) + C(y) + 2 log C(x) + const.

The next result has a striking consequence for the incompleteness of any given axiomatic framework for mathematics, as we discuss after the proof.

Proposition 75. The set A := {x : C(x) ≥ |x|/2} of half-random strings is immune; that is, A has no infinite recursively enumerable subset. In particular, A is not recursively enumerable, and thus C : 2^{<N} → N is not recursive.

By Lemma 71, almost all strings are half-random: the probability that a string of length n is half-random goes to 1 very rapidly as n → ∞. As a consequence of Proposition 75, however, there are only finitely many σ for which ZFC can prove that σ is half-random, assuming of course that ZFC is consistent. (Here ZFC is the standard axiomatic set theory in which essentially all of mathematics can be formally derived, including the entire contents of this course.) The same remains true for any recursively axiomatized consistent extension of ZFC.

3.2 Preﬁx-Free Sets

We say that σ and τ are comparable if σ is an initial segment of τ or τ is an initial segment of σ. We call a set A ⊆ 2^{<N} prefix-free if any two distinct elements of A are incomparable. We consider the Cantor space 2^N of infinite binary sequences, with the sets

N_σ := {α ∈ 2^N : α(i) = σ(i) for all i < |σ|}

as a basis for its topology. Note that N_∅ = 2^N, and that σ and τ are incomparable iff N_σ ∩ N_τ = ∅. Let µ be the probability measure on the σ-algebra of Borel subsets of 2^N (the σ-algebra on 2^N generated by the N_σ) given by µ(N_σ) = 2^{−|σ|}.

Lemma 76. If A ⊆ 2^{<N} is prefix-free, then Σ_{σ∈A} 2^{−|σ|} ≤ 1.

Proof. Since A is prefix-free, the sets N_σ with σ ∈ A are pairwise disjoint, so

Σ_{σ∈A} 2^{−|σ|} = Σ_{σ∈A} µ(N_σ) = µ(⋃_{σ∈A} N_σ) ≤ µ(2^N) = 1.

Theorem 77 (Kraft-Chaitin). Let (d_i)_{i∈N} be a sequence of natural numbers such that Σ_{i=0}^∞ 2^{−d_i} ≤ 1. Then there is an injective sequence (σ_i)_{i∈N} such that |σ_i| = d_i for all i and {σ_i : i ∈ N} is prefix-free.

Proof. We can assume that d_0 ≤ d_1 ≤ d_2 ≤ ··· . Then we choose σ_0, σ_1, σ_2, … successively by always choosing the left-most possibility. For example, when d_0, d_1, d_2, d_3 are 2, 3, 3, 4, this is illustrated by the following picture.

[Figure: a binary tree in which σ_0, σ_1, σ_2, σ_3 are chosen left-most at depths 2, 3, 3, 4.]

The assumption that Σ_{i=0}^∞ 2^{−d_i} ≤ 1 guarantees that we can always continue this process. We make this precise as follows. Take σ_0 = 0⋯0 with |σ_0| = d_0. Suppose inductively that at stage n we have pairwise incomparable σ_0, …, σ_n satisfying |σ_i| = d_i for i = 0, …, n and such that all α with tr(α|d_n) ≤ tr(σ_n) belong to N_{σ_0} ∪ ··· ∪ N_{σ_n}. Then there is a σ with |σ| = d_n and tr(σ) > tr(σ_n); otherwise all α would belong to N_{σ_0} ∪ ··· ∪ N_{σ_n}, yielding

1 = µ(N_{σ_0} ∪ ··· ∪ N_{σ_n}) = Σ_{i=0}^n 2^{−d_i} < Σ_{i=0}^∞ 2^{−d_i} ≤ 1,

a contradiction. Let σ satisfy |σ| = d_n and tr(σ) = tr(σ_n) + 1. Define σ_{n+1} := σ0⋯0 with |σ_{n+1}| = d_{n+1}. It is easy to verify that σ_0, …, σ_{n+1} are pairwise incomparable, and that all α with tr(α|d_{n+1}) ≤ tr(σ_{n+1}) belong to N_{σ_0} ∪ ··· ∪ N_{σ_{n+1}}.

This proof contains an inconstructive step, namely the initial rearrangement of the d_i in increasing order. It is desirable to have a more constructive proof which avoids this.

Constructive proof. We make the inductive assumption that at stage n we have σ_0, …, σ_n with |σ_i| = d_i for i = 0, …, n, pairwise incomparable, and that in addition we have τ_1^n, …, τ_p^n, p ≥ 1, such that

|τ_1^n| < ··· < |τ_p^n|,  2^N = N_{σ_0} ⊔ ··· ⊔ N_{σ_n} ⊔ N_{τ_1^n} ⊔ ··· ⊔ N_{τ_p^n},

where the square union signs indicate a disjoint union. Here p depends on n. For n = 0, the inductive assumption is realized as follows: note that d_0 > 0 and put σ_0 = 0⋯0 with |σ_0| = d_0. If d_0 = 1, set p = 1 and τ_1^0 = 1; if d_0 ≥ 2, set p = d_0 and τ_i^0 = 0⋯01 with |τ_i^0| = i.

Given stage n, we construct σ_{n+1} at stage n + 1 as follows. We are given d_{n+1}. If |τ_i^n| = d_{n+1} for some i ∈ {1, …, p}, then we set σ_{n+1} := τ_i^n, so

2^N = N_{σ_0} ⊔ ··· ⊔ N_{σ_n} ⊔ N_{σ_{n+1}} ⊔ N_{τ_1^n} ⊔ ··· ⊔ N_{τ_{i−1}^n} ⊔ N_{τ_{i+1}^n} ⊔ ··· ⊔ N_{τ_p^n}.

We put

τ_j^{n+1} = τ_j^n  if 1 ≤ j ≤ i − 1,  τ_j^{n+1} = τ_{j+1}^n  if i ≤ j ≤ p − 1.

Now suppose that |τ_i^n| ≠ d_{n+1} for all i ∈ {1, …, p}. It is easy to check that |τ_1^n| < d_{n+1}. Take i ∈ {1, …, p} maximal such that |τ_i^n| < d_{n+1}. Set k := d_{n+1} − |τ_i^n|. Take ρ_1, …, ρ_k, σ_{n+1} such that ρ_j = τ_i^n 0⋯01 with |ρ_j| = |τ_i^n| + j (so |ρ_k| = d_{n+1}) for 1 ≤ j ≤ k, and σ_{n+1} = τ_i^n 0⋯0 with |σ_{n+1}| = d_{n+1}. Then

N_{τ_i^n} = N_{ρ_1} ⊔ ··· ⊔ N_{ρ_k} ⊔ N_{σ_{n+1}}.

Thus replacing τ_i^n in the sequence τ_1^n, …, τ_p^n by ρ_1, …, ρ_k yields a sequence τ_1^{n+1}, …, τ_{p+k−1}^{n+1} with the desired properties.

This constructive proof of the Kraft-Chaitin Theorem yields:

Proposition 78. Let i ↦ d_i : N → N be recursive such that Σ_{i=0}^∞ 2^{−d_i} ≤ 1. Then there is a recursive injective map i ↦ σ_i : N → 2^{<N} such that |σ_i| = d_i for all i and {σ_i : i ∈ N} is prefix-free.

Here a function f : N → 2^{<N} is said to be recursive if the function i ↦ tr(f(i)) : N → N is recursive.
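The constructive proof translates directly into code; the following greedy assignment is my own rendering of it (the list `free` plays the role of the τ_i), added for illustration:

```python
def kraft_chaitin(ds):
    # Assign prefix-free codewords sigma_i with |sigma_i| = ds[i]:
    # `free` holds strings tau whose cones N_tau partition the part of
    # Cantor space not yet covered by the codewords chosen so far.
    free, out = [""], []
    for d in ds:
        cands = [t for t in free if len(t) <= d]
        assert cands, "Kraft inequality violated"
        tau = max(cands, key=len)          # deepest usable free cone
        free.remove(tau)
        out.append(tau + "0" * (d - len(tau)))
        # the siblings tau 0...0 1 stay free, one per intermediate depth
        free += [tau + "0" * j + "1" for j in range(d - len(tau))]
    return out
```

For example, kraft_chaitin([2, 3, 3, 4]) returns ['00', '010', '011', '1000'], and the lengths need not be sorted in advance.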

3.3 Prefix-Free Complexity

A partial recursive function f : 2^{<N} ⇀ 2^{<N} is called prefix-free if its domain D_f is prefix-free. Recall that we have an enumeration (φ_e) of the set

{f : N ⇀ N : f is partial recursive},

with finite approximations φ_{e,s} such that

D(φ_{e,s}) ⊆ {0, …, s},  φ_e = ⋃_s φ_{e,s}.

This yields an enumeration (φ̂_e) of

{f : 2^{<N} ⇀ 2^{<N} : f is partial recursive},

and for each s a function φ̂_{e,s} : 2^{<N} ⇀ 2^{<N} with

D(φ̂_{e,s}) ⊆ {σ : tr(σ) ∈ {0, …, s}},  φ̂_e = ⋃_s φ̂_{e,s}.

From the φ̂_{e,s} we obtain functions ψ_{e,s} : 2^{<N} ⇀ 2^{<N} with prefix-free domain by accepting the elements of the domains one at a time, in the order in which they first appear (ties broken by code), and discarding any element that is comparable with a previously accepted one. So ψ_{e,0} ⊆ ψ_{e,1} ⊆ ψ_{e,2} ⊆ ··· . Put

ψ_e := ⋃_s ψ_{e,s} : 2^{<N} ⇀ 2^{<N}.

A bit of thought shows:

(a) each ψ_e is partial recursive and prefix-free;

(b) (ψ_e) is an enumeration of

{f : 2^{<N} ⇀ 2^{<N} : f is partial recursive and prefix-free};

(c) there is a recursive ι : N → N such that ψ_e = φ̂_{ι(e)} for all e.

Define Ψ : 2^{<N} ⇀ 2^{<N} by

Ψ(0^e 1σ) ≃ ψ_e(σ),  Ψ(ρ) ↑ if ρ does not have the form 0^e 1σ.

Note: Ψ is partial recursive, prefix-free, surjective, and C_Ψ(σ) ∈ N^{>0} for all σ.

Lemma 79. Let f : 2^{<N} ⇀ 2^{<N} be partial recursive and prefix-free. Then

C_Ψ ≤ C_f + const(f).

Proof. Take e such that f = ψ_e. Then C_Ψ ≤ C_f + (e + 1).

Definition 80. The prefix-free complexity of a string σ is

K(σ) := C_Ψ(σ) ∈ N^{>0}.

Lemma 81. There are infinitely many σ such that K(σ) > |σ| + log |σ|.

Proof. Suppose K(σ) ≤ |σ| + log |σ| for all but finitely many σ. Then, with σ ranging over the nonempty strings, we have

Σ_σ 2^{−K(σ)} ≥ Σ_σ 2^{−(|σ| + log |σ|)} + const
= Σ_{n=1}^∞ Σ_{|σ|=n} 2^{−(n + log n)} + const
= Σ_{n=1}^∞ 2^n·2^{−(n + log n)} + const
= Σ_{n=1}^∞ 1/n + const = ∞.

Since Ψ is surjective, we have an injective map σ ↦ σ′ : 2^{<N} → D(Ψ) such that Ψ(σ′) = σ and |σ′| = K(σ). Hence

    1 ≥ Σ_{ρ∈D(Ψ)} 2^{−|ρ|} ≥ Σ_σ 2^{−|σ′|} = Σ_σ 2^{−K(σ)} = ∞,

a contradiction.
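For a fixed toy machine f, the quantity C_f(σ) = min{|ρ| : f(ρ) ≃ σ} can be found by brute-force search in length order. K itself is not computable, so this only illustrates the definition; the search bound is an assumption of the sketch:

```python
from itertools import product

def C(f, sigma, max_len=12):
    """Least |rho| with f(rho) == sigma, searching rho in length order."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            rho = "".join(bits)
            if f(rho) == sigma:
                return n
    return None   # no description found within the search bound
```

For the identity machine, C is just the length of σ; a machine that doubles its input halves the cost of squares of strings.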

The only fact about log that we used is that Σ_{n=1}^∞ 2^{−log n} = ∞. The same proof yields the following generalization:

Lemma 82. If f : N^{>0} → R^{>0} satisfies Σ_{n=1}^∞ 2^{−f(n)} = ∞, then there are infinitely many σ such that K(σ) > |σ| + f(|σ|).

Let n∗ ∈ 2^{<N} denote the string with tr(n∗) = n, so n ↦ n∗ is the inverse of the coding bijection tr : 2^{<N} → N.

Proposition 83. K(σ) < |σ| + 2 log |σ| + const for nonempty σ.

Proof. It suffices to find a prefix-free partial recursive f : 2^{<N} ⇀ 2^{<N} such that C_f(σ) ≤ |σ| + 2 log |σ| + const for all nonempty σ.

We shall construct such an f via the effective version of the Kraft-Chaitin Theorem, using Σ_{k=0}^∞ 1/(10(k+1)^2) ≤ 1. Let n_k := ⌊2^{k+1}/(10(k+1)^2)⌋ ∈ N. Note that n_k → ∞ as k → ∞. Define d_i ∈ N by

    d_i = 1 for 0 ≤ i < n_0,

    d_i = 2 for n_0 ≤ i < n_0 + n_1,

    d_i = 3 for n_0 + n_1 ≤ i < n_0 + n_1 + n_2,

    ⋮

    d_i = k + 1 for n_0 + n_1 + ··· + n_{k−1} ≤ i < n_0 + n_1 + ··· + n_k.

Then

    Σ_{i=0}^∞ 2^{−d_i} = n_0 2^{−1} + n_1 2^{−2} + ···
                = Σ_{k=0}^∞ n_k 2^{−(k+1)}
                ≤ Σ_{k=0}^∞ (2^{k+1}/(10(k+1)^2)) · 2^{−(k+1)}
                = Σ_{k=0}^∞ 1/(10(k+1)^2)
                ≤ 1.

The function i ↦ d_i : N → N is recursive, so by the Kraft-Chaitin Theorem we have an injective recursive map i ↦ ρ_i : N → 2^{<N} such that |ρ_i| = d_i for all i, and {ρ_i : i ∈ N} is prefix-free. Now define f : {ρ_i : i ∈ N} → 2^{<N} by f(ρ_n) = n∗. Note that f : 2^{<N} ⇀ 2^{<N} is prefix-free partial recursive. Let m ≥ 1 and put k := m + 2 log m + c − 1, with c ∈ N to be determined. Then

    n_k ≥ 2^{k+1}/(10(k+1)^2) − 1
        ≥ 2^{m+2 log m+c}/(10(m + 2 log m + c + 1)^2) − 1
        ≥ 2^m · 2^{2 log m} · 2^c/(10(m + 2 log m + c + 1)^2) − 1
        ≥ 2^m · m^2 · 2^c/(10(m + 2 log m + c + 1)^2) − 1
        ≥ 2^{m+1} for c = 10.

Hence n_0 + ··· + n_k ≥ 2^{m+1}, so every n with |n∗| ≤ m lies in one of the first k + 1 blocks, that is, d_n ≤ k + 1 = m + 2 log m + 11. Therefore, if 0 < |σ| ≤ m, we obtain ρ_i such that

    |ρ_i| ≤ m + 2 log m + 11 and f(ρ_i) = σ.

It follows that C_f(σ) ≤ |σ| + 2 log |σ| + 11 for all nonempty σ.
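The block lengths n_k and the resulting sequence (d_i) from this proof can be generated directly, and the Kraft sum checked exactly with rational arithmetic (the function name is mine):

```python
from fractions import Fraction

def code_lengths(N):
    """First N terms of the sequence (d_i): block k contributes
    n_k = floor(2^(k+1) / (10 (k+1)^2)) copies of the value k + 1."""
    ds, k = [], 0
    while len(ds) < N:
        n_k = (2 ** (k + 1)) // (10 * (k + 1) ** 2)
        ds.extend([k + 1] * n_k)
        k += 1
    return ds[:N]
```

Since n_k = 0 for k < 9, the first assigned length is 10; after that the blocks grow roughly like 2^k/k^2, which is what makes d_n ≤ |n∗| + 2 log |n∗| + O(1) possible.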

Theorem 84. K(σ) ≤ |σ| + K(|σ|∗) + const.

Proof. Since Ψ is prefix-free, each string σ has at most one initial segment ρ ∈ D(Ψ). So we can define a prefix-free f : 2^{<N} ⇀ 2^{<N} by

    f(σ) ≃ τ :⟺ ∃ρ (σ = ρτ and Ψ(ρ) ≃ |τ|∗).

Then f is partial recursive, so K(τ) = C_Ψ(τ) ≤ C_f(τ) + const. We claim that C_f(τ) ≤ |τ| + K(|τ|∗). To see this, take ρ of minimal length such that Ψ(ρ) ≃ |τ|∗. Then |ρ| = K(|τ|∗), and f(ρτ) ≃ τ. Hence

    C_f(τ) ≤ |ρτ| = |τ| + |ρ| = |τ| + K(|τ|∗).

From this claim, we obtain

    K(τ) ≤ C_f(τ) + const ≤ |τ| + K(|τ|∗) + const.

3.4 Information Content Measures

Here and below we let k range over N.

Definition 85. An information content measure (icm) is a partial function I : 2^{<N} ⇀ N such that

(a) Σ_{σ∈D(I)} 2^{−I(σ)} ≤ 1;

(b) the set {(σ, k) : σ ∈ D(I), I(σ) ≤ k} ⊆ 2^{<N} × N is recursively enumerable.
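For a finite partial function, condition (a) can be checked exactly with rational arithmetic; condition (b) is automatic for finite functions. A throwaway helper (not from the notes):

```python
from fractions import Fraction

def kraft_sum(I):
    """Sum of 2^(-I(sigma)) over the (finite) domain of I, given as a dict
    mapping strings to natural numbers; condition (a) asks for <= 1."""
    return sum(Fraction(1, 2 ** v) for v in I.values())
```

For example, {"0" ↦ 1, "1" ↦ 2, "00" ↦ 2} has Kraft sum exactly 1 and so is a (finite) icm.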

The next two results establish a close connection between partial recursive prefix-free functions and icm's, almost a one-to-one correspondence.

Lemma 86. Let ψ : 2^{<N} ⇀ 2^{<N} be partial recursive and prefix-free, and define I_ψ : 2^{<N} ⇀ N by D(I_ψ) := Im(ψ) and I_ψ(σ) := C_ψ(σ). Then I_ψ is an icm.

Proof.

    Σ_{σ∈D(I_ψ)} 2^{−I_ψ(σ)} = Σ_{σ∈Im(ψ)} 2^{−C_ψ(σ)} ≤ Σ_{ρ∈D(ψ)} 2^{−|ρ|} ≤ 1.

Also,

    σ ∈ D(I_ψ), I_ψ(σ) ≤ k ⟺ ∃ρ (|ρ| ≤ k and ψ(ρ) ≃ σ).

Therefore, the set

    {(σ, k) : σ ∈ D(I_ψ), I_ψ(σ) ≤ k} ⊆ 2^{<N} × N

is recursively enumerable.

For ψ = Ψ, this yields I_ψ = K, so K is an icm. Conversely:

Lemma 87. Let I be an icm with infinite domain D(I). Then there is a prefix-free partial recursive map ψ : 2^{<N} ⇀ 2^{<N} such that

(a) Im(ψ) = D(I);

(b) I(σ) + 1 = C_ψ(σ) for all σ ∈ D(I).

Proof. Take a recursive map i ↦ (σ_i, k_i) : N → 2^{<N} × N with image

    {(σ, k) : σ ∈ D(I), I(σ) ≤ k}.

We construct a set A ⊆ {(σ, k + 1) : σ ∈ D(I), I(σ) ≤ k} in stages.

• Stage 0: Put (σ_0, k_0 + 1) in A.

• Stage s + 1: Compute (σ_{s+1}, k_{s+1}), see whether k_{s+1} < k_i for all i ≤ s with σ_i = σ_{s+1}, and if so, put (σ_{s+1}, k_{s+1} + 1) in A.

Note: σ ∈ D(I) ⟺ ∃d (σ, d) ∈ A: the direction ⇐ is clear, and ⇒ holds in the more precise form that if σ ∈ D(I), then min{d : (σ, d) ∈ A} = I(σ) + 1. Put A(σ) := {d : (σ, d) ∈ A}. Then, for σ ∈ D(I),

    Σ_{d∈A(σ)} 2^{−d} ≤ 2^{−(I(σ)+1)} + 2^{−(I(σ)+2)} + ··· ≤ 2 · 2^{−(I(σ)+1)} = 2^{−I(σ)},

so

    Σ_{(σ,d)∈A} 2^{−d} = Σ_{σ∈D(I)} Σ_{d∈A(σ)} 2^{−d} ≤ Σ_{σ∈D(I)} 2^{−I(σ)} ≤ 1.

By construction, A is r.e. and infinite, so there is an injective recursive map i ↦ (σ′_i, d_i) : N → 2^{<N} × N with image A. By the last inequality we have Σ_{i=0}^∞ 2^{−d_i} ≤ 1, so the recursive version of Kraft-Chaitin yields a recursive injective map i ↦ ρ_i : N → 2^{<N} such that |ρ_i| = d_i for all i, and {ρ_i : i ∈ N} is prefix-free. Now define ψ : 2^{<N} ⇀ 2^{<N} to have domain D(ψ) = {ρ_i : i ∈ N} and ψ(ρ_i) = σ′_i for all i. Then ψ is prefix-free partial recursive and Im(ψ) = D(I). It remains to check that (b) is satisfied: Let σ ∈ D(I) = Im(ψ). Then

    C_ψ(σ) = min{d_i : i ∈ N, σ = σ′_i} = min{d : (σ, d) ∈ A} = I(σ) + 1.
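The stage construction of A can be simulated on a finite enumeration of pairs (σ_i, k_i); taking the minimum over A then recovers I(σ) + 1 as in (b). A sketch with hypothetical names:

```python
def build_A(pairs):
    """Put (sigma, k+1) into A exactly when k is strictly below every
    value previously enumerated for sigma (the stage s+1 test)."""
    A, best = [], {}
    for sigma, k in pairs:
        if sigma not in best or k < best[sigma]:
            best[sigma] = k
            A.append((sigma, k + 1))
    return A
```

Each string σ receives a strictly decreasing sequence of second components, which is why Σ_{d∈A(σ)} 2^{−d} ≤ 2^{−I(σ)} in the proof.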

The next lemma follows from K being an icm with D(K) = 2^{<N}.

Lemma 88. The set

    {(m, n, k) : Σ_{|σ|=m} 2^{−K(σ)} ≥ n/2^k} ⊆ N^3

is recursively enumerable.

Proof. Just note that Σ_{|σ|=m} 2^{−K(σ)} ≥ n/2^k if and only if there is a tuple (n_σ)_{|σ|=m} of natural numbers such that

    Σ_{|σ|=m} 2^{−n_σ} ≥ n/2^k and K(σ) ≤ n_σ for each σ with |σ| = m.

Lemma 89. There is an enumeration (I_e)_{e∈N} of the set of icm's such that

    {(σ, e, k) ∈ 2^{<N} × N × N : σ ∈ D(I_e), I_e(σ) ≤ k}

is recursively enumerable.

Proof. Note that we have a recursive bijection

    (σ, n) ↦ ⟨tr(σ), n⟩ : 2^{<N} × N → N.

Using this bijection, the sets W_e and their finite approximations W_{e,s} yield an enumeration (V_e) of the set of recursively enumerable subsets of 2^{<N} × N and a family (V_{e,s}) of finite subsets of 2^{<N} × N with the following properties:

(a) V_{e,s} ⊆ {(σ, n) : ⟨tr(σ), n⟩ ≤ s};

(b) |V_{e,0}| ≤ 1, V_{e,s} ⊆ V_{e,s+1}, and ⋃_s V_{e,s} = V_e;

(c) {(σ, n, e, s) ∈ 2^{<N} × N^3 : (σ, n) ∈ V_{e,s}} is recursive.

Define η_{e,s} : 2^{<N} ⇀ N by η_{e,s}(σ) := μn [(σ, n) ∈ V_{e,s}], so D(η_{e,s}) is finite, D(η_{e,s}) ⊆ D(η_{e,s+1}), and η_{e,s+1}(σ) ≤ η_{e,s}(σ) for all σ ∈ D(η_{e,s}). Moreover, the partial map

    (e, s, σ) ↦ η_{e,s}(σ) : N^2 × 2^{<N} ⇀ N

is partial recursive and has recursive domain. Next, define I_{e,s} : 2^{<N} ⇀ N by recursion on s: I_{e,0} := η_{e,0}; I_{e,s+1} := η_{e,s+1} if η_{e,s+1} is an icm, and I_{e,s+1} := I_{e,s} otherwise. Identifying functions with their graphs, it is easy to check that each I_{e,s} is a finite icm, that D(I_{e,s}) ⊆ D(I_{e,s+1}) and I_{e,s+1}(σ) ≤ I_{e,s}(σ) for all σ ∈ D(I_{e,s}), and that

    (e, s, σ) ↦ I_{e,s}(σ) : N^2 × 2^{<N} ⇀ N

is partial recursive and has recursive domain. In particular, the set

    {(σ, n, e, s) ∈ 2^{<N} × N^3 : I_{e,s}(σ) ≃ n}

is recursive. Define I_e : 2^{<N} ⇀ N by

    D(I_e) = ⋃_s D(I_{e,s}),   I_e(σ) = min{I_{e,s}(σ) : σ ∈ D(I_{e,s})}.

Given any icm I, take e such that Ve = {(σ, k): σ ∈ D(I),I(σ) ≤ k}, and note that then I = Ie. It follows easily that (Ie) is an enumeration of the set of icms as required by the lemma.

Fix an enumeration (I_e) of the set of icm's as in the lemma above. This allows us to define K̂ : 2^{<N} → N by

    K̂(σ) := min{I_e(σ) + e + 1 : I_e(σ) ↓}.

Lemma 90. K̂ is an icm.

Proof. Condition (a) in the definition of icm holds for K̂:

    Σ_σ 2^{−K̂(σ)} ≤ Σ_{e=0}^∞ Σ_{σ∈D(I_e)} 2^{−(I_e(σ)+e+1)}
               = Σ_{e=0}^∞ 2^{−(e+1)} Σ_{σ∈D(I_e)} 2^{−I_e(σ)}
               ≤ Σ_{e=0}^∞ 2^{−(e+1)} = 1.

Condition (b) is satisfied by K̂, since

    K̂(σ) ≤ k ⟺ ∃e (σ ∈ D(I_e) and I_e(σ) + e + 1 ≤ k).

This icm K̂ is also called the minimal icm. We can use it to prove results about K because of the following fact.

Corollary 91. |K − K̂| ≤ const.

Proof. We already know that K̂ ≤ K + const. Since K̂ is an icm, Lemma 87 gives a prefix-free partial recursive ψ : 2^{<N} ⇀ 2^{<N} such that C_ψ = K̂ + 1. Hence, K(σ) ≤ C_ψ(σ) + const = K̂(σ) + 1 + const.

Theorem 92 (Counting Theorem). We have

(a) |{σ : |σ| = n, K(σ) < n + K(n∗) − k}| ≤ 2^{n−k+const};

(b) n + K(n∗) − const ≤ max{K(σ) : |σ| = n} ≤ n + K(n∗) + const.

Proof. Assuming (a) for the moment, we show how (b) follows. Let c ∈ N be a value of const for which (a) holds. Then, taking k = c + 1 in (a),

    |{σ : |σ| = n, K(σ) < n + K(n∗) − c − 1}| ≤ 2^{n−1} < 2^n.

So we get σ with |σ| = n such that K(σ) ≥ n + K(n∗) − c − 1. This gives the lower bound in (b). The upper bound is given by Theorem 84.

To prove (a), note first that

    1 ≥ Σ_σ 2^{−K(σ)} = Σ_{n=0}^∞ Σ_{|σ|=n} 2^{−K(σ)} = Σ_ρ Σ_{|σ|=tr(ρ)} 2^{−K(σ)}.

Take I(ρ) ∈ [1, ∞) with 2^{−I(ρ)} = Σ_{|σ|=tr(ρ)} 2^{−K(σ)}, that is,

    I(ρ) = − log Σ_{|σ|=tr(ρ)} 2^{−K(σ)}.

Then Σ_ρ 2^{−I(ρ)} ≤ 1, so I is like an icm, except that I may have values not in N. So we modify I to Î : 2^{<N} → N by setting Î(ρ) := ⌈I(ρ)⌉. Then Î ≥ I, so Σ_ρ 2^{−Î(ρ)} ≤ 1. We claim that Î is an icm. For condition (b), note that

    Î(ρ) ≤ k ⟺ I(ρ) ≤ k ⟺ 2^{−I(ρ)} ≥ 2^{−k} ⟺ Σ_{|σ|=tr(ρ)} 2^{−K(σ)} ≥ 2^{−k}.

Using for example Lemma 88, it follows that the set

    {(ρ, k) : Î(ρ) ≤ k} = {(ρ, k) : Σ_{|σ|=tr(ρ)} 2^{−K(σ)} ≥ 2^{−k}}

is recursively enumerable. This finishes the proof of the claim. This claim and earlier results yield a c ∈ N such that K(σ) ≤ Î(σ) + c for all σ. Then

    K(n∗) ≤ Î(n∗) + c ≤ I(n∗) + c + 1 = − log Σ_{|σ|=n} 2^{−K(σ)} + c + 1.

Therefore,

    2^{−K(n∗)+c+1} ≥ Σ_{|σ|=n} 2^{−K(σ)}.

Let F := {σ : |σ| = n, K(σ) < n + K(n∗) − k}, and assume |F| ≥ 2^{n−k+c+1}. Then by the above,

    2^{−K(n∗)+c+1} ≥ Σ_{|σ|=n} 2^{−K(σ)} ≥ Σ_{σ∈F} 2^{−K(σ)} > 2^{n−k+c+1} · 2^{−n−K(n∗)+k} = 2^{−K(n∗)+c+1},

a contradiction. Likewise, we obtain the following.

Proposition 93. |{σ : |σ| = n, K(σ) < n − k}| ≤ 2^{n−K(n∗)−k+const}.

Proof. Take c as in the previous proof. Let F := {σ : |σ| = n, K(σ) < n − k}. Assume |F| > 2^{n−K(n∗)−k+c+1}. Then

    2^{−K(n∗)+c+1} ≥ Σ_{|σ|=n} 2^{−K(σ)} ≥ Σ_{σ∈F} 2^{−K(σ)} > 2^{n−K(n∗)−k+c+1} · 2^{−(n−k)} = 2^{−K(n∗)+c+1},

a contradiction.

3.5 Martin-Löf Randomness

A set T ⊆ N × 2^{<N} is a Martin-Löf test if T is recursively enumerable and, with T(n) := {σ : (n, σ) ∈ T}, we have μ(⋃_{σ∈T(n)} N_σ) ≤ 2^{−n} for every n.

A set B ⊆ 2^N is said to have effective measure zero if there is a Martin-Löf test T such that B ⊆ ⋂_n ⋃_{σ∈T(n)} N_σ. Note that μ(B) = 0 for such B, and that every Martin-Löf test T determines a largest subset of 2^N of effective measure zero witnessed by T, namely

    N_T := ⋂_n ⋃_{σ∈T(n)} N_σ.
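For example, any computable α has effective measure zero as a singleton: the sketch below (helper names are mine) puts α|(n+1) alone into T(n), so μ(⋃_{σ∈T(n)} N_σ) = 2^{−(n+1)} ≤ 2^{−n} and {α} = N_T; in particular no computable sequence can be 1-random.

```python
def cylinder_measure(prefix_free_strings):
    """mu of the union of the cylinders N_sigma over a prefix-free set."""
    return sum(2.0 ** -len(s) for s in prefix_free_strings)

def ml_test_for(alpha):
    """alpha: function n -> n-th bit of a computable sequence.
    Returns the test n -> T(n), where T(n) = { alpha|(n+1) }."""
    def T(n):
        return ["".join(str(alpha(i)) for i in range(n + 1))]
    return T
```

Each level of the test is a single cylinder, which makes the measure bound immediate.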

We say that α passes the Martin-Löf test T if α ∉ N_T. We say that α is 1-random if it passes every Martin-Löf test. We are going to show that there is a Martin-Löf test such that passing it is equivalent to passing all Martin-Löf tests. Such a Martin-Löf test is said to be universal. Note that if T is a universal Martin-Löf test, then N_T is the largest subset of 2^N of effective measure zero; in particular, this set is independent of the choice of universal Martin-Löf test. This is somewhat remarkable, since for, say, Lebesgue measure on R, there is obviously no largest subset of R of measure zero.

Theorem 94 (Martin-Löf). There exists a universal Martin-Löf test.

Proof. Let R ⊆ N × N × 2^{<N} be a recursively enumerable set such that for every recursively enumerable S ⊆ N × 2^{<N} there is an e with S = {(n, σ) : R(e, n, σ)}; by padding, such an index e can be taken arbitrarily large. Take a recursive f : N → N × N × 2^{<N} with image R, and put

    R_t = {f(0), f(1), ..., f(t)} ⊆ R, t ∈ N.

Then R_0 ⊆ R_1 ⊆ ···, R = ⋃_{t∈N} R_t and

    R_t(e, n, σ) ⟺ ∃s ≤ t [f(s) = (e, n, σ)],

so the set

    {(t, e, n, σ) : R_t(e, n, σ)} ⊆ N × N × N × 2^{<N}

is recursive. Write R_t(e, n) := {σ : R_t(e, n, σ)}. Define T ⊆ N × 2^{<N} by

    σ ∈ T(k) :⟺ there are n ≥ k + 1 and t ∈ N such that σ ∈ R_t(n, n) and μ(⋃_{τ∈R_t(n,n)} N_τ) ≤ 2^{−n}.

It is rather clear from the Church-Turing Thesis that T is recursively enumerable. Note also that T(0) ⊇ T(1) ⊇ T(2) ⊇ ··· . Put B^n_t := ⋃_{τ∈R_t(n,n)} N_τ, so B^n_0 ⊆ B^n_1 ⊆ ··· . Moreover,

    μ(⋃_{σ∈T(k)} N_σ) = μ(⋃_{n≥k+1} ⋃ {B^n_t : μ(B^n_t) ≤ 2^{−n}})
               ≤ Σ_{n≥k+1} μ(⋃ {B^n_t : μ(B^n_t) ≤ 2^{−n}})
               ≤ Σ_{n≥k+1} 2^{−n}
               = 2^{−k}.

Thus T is a Martin-Löf test. To see that T is universal, let S ⊆ N × 2^{<N} be any Martin-Löf test and let k ∈ N. Take an index e ≥ k + 1 with S = {(n, σ) : R(e, n, σ)}. Then R_t(e, e) ⊆ S(e) for all t, with μ(B^e_t) ≤ μ(⋃_{σ∈S(e)} N_σ) ≤ 2^{−e}, so S(e) ⊆ T(k). Hence

    N_S ⊆ ⋃_{σ∈S(e)} N_σ ⊆ ⋃_{σ∈T(k)} N_σ.

As k was arbitrary, N_S ⊆ N_T, so every α that passes T also passes S.

Definition 95. α is LGC-random (Levin-Gács-Chaitin random) if there is a c ∈ N such that K(α|n) > n − c for all n.

It wouldn’t be interesting to replace K by C in this definition: by Martin-Löf's compressibility lemma, given any α and any c ∈ N, there exist infinitely many n such that C(α|n) < n − c.

Theorem 96 (Schnorr).

    α is LGC-random ⟺ α is 1-random.

Proof (of ⇐). Define T ⊆ N × 2^{<N} by T(k) := {σ : K(σ) ≤ |σ| − k}. Then T is recursively enumerable, and

    μ(⋃_{σ∈T(k)} N_σ) ≤ Σ_{σ∈T(k)} 2^{−|σ|} ≤ Σ_{σ∈T(k)} 2^{−K(σ)−k} ≤ 2^{−k} Σ_σ 2^{−K(σ)} ≤ 2^{−k}.

Thus T is a Martin-Löf test. Now suppose α is 1-random. Then α passes T, so we obtain k such that α ∉ ⋃_{σ∈T(k)} N_σ, that is,

    K(α|n) > n − k for all n,

so α is LGC-random. For the converse we need:

Lemma 97. Let T ⊆ N × 2^{<N} be a Martin-Löf test. Then there is a Martin-Löf test T̂ such that

(a) ⋃_{σ∈T̂(k)} N_σ = ⋃_{σ∈T(k)} N_σ for each k (so in particular N_T̂ = N_T);

(b) T̂(k) ⊆ 2^{<N} is prefix-free for each k.

Proof. Let f : N → N × 2^{<N} be recursive with image T. We construct finite sets T̂_s by recursion on s, starting with T̂_0 := {f(0)}. Given T̂_s, let f(s + 1) = (k, σ), and let (k, τ_1), ..., (k, τ_n)

be the distinct elements of T̂_s whose first component is k. Note that it is possible to find σ_1, ..., σ_m such that

    N_σ \ (N_{τ_1} ∪ ··· ∪ N_{τ_n}) = N_{σ_1} ⊔ ··· ⊔ N_{σ_m},

where the union on the right is disjoint, and then we take

    T̂_{s+1} = T̂_s ∪ {(k, σ_1), ..., (k, σ_m)}.

For T̂ we take the union of this sequence of sets:

    T̂ = ⋃_{s∈N} T̂_s.

Then T̂ is an ML-test satisfying (a) and (b). Note also the following property of ML-tests: if T is an ML-test and σ ∈ T(n), then 2^{−|σ|} ≤ 2^{−n}, and hence |σ| ≥ n.

We now continue with the other direction of Schnorr's theorem.

Proof of ⇒ in Schnorr's theorem. We prove the contrapositive. Suppose α is not 1-random; take an ML-test T such that α fails T. That is,

    α ∈ ⋂_n ⋃_{σ∈T(n)} N_σ.

By Lemma 97, we can assume that T(k) is prefix-free for all k. Define

I : 2^{<N} ⇀ N by

    D(I) := {σ : σ ∈ T(2n) for some n ≥ 1},
    I(σ) := min{|σ| − n : σ ∈ T(2n), n ≥ 1} for σ ∈ D(I).

Then

    Σ_{σ∈D(I)} 2^{−I(σ)} ≤ Σ_{n≥1} Σ_{σ∈T(2n)} 2^{−(|σ|−n)} = Σ_{n≥1} 2^n Σ_{σ∈T(2n)} 2^{−|σ|} ≤ Σ_{n≥1} 2^n 2^{−2n} = Σ_{n≥1} 2^{−n} = 1.

Thus one of the two conditions for I to be an icm is satisfied; we leave it as an exercise to check that the other condition is satisfied as well. So I is an icm. By earlier results we have, for σ ∈ D(I),

    K(σ) ≤ I(σ) + const(I) = min{|σ| − n : σ ∈ T(2n), n ≥ 1} + const(I).

Now let c ∈ N; it remains to show that K(α|k) < k − c for some k. Consider any n ≥ 1. Because α ∈ ⋃_{σ∈T(2n)} N_σ, we can find σ ∈ T(2n) such that σ is an initial segment of α. Set k = |σ|, so α|k = σ, hence k = |σ| ≥ 2n, σ ∈ D(I), and

    K(α|k) = K(σ) ≤ |σ| − n + const(I) = k − n + const(I) < k − c

provided n > const(I) + c. Now choose n to satisfy this inequality.

3.6 Solovay Randomness

The definition of Solovay randomness proceeds from a corresponding notion of a Solovay test.

Definition 98. A Solovay test is a recursively enumerable set S ⊆ 2^{<N} such that Σ_{σ∈S} 2^{−|σ|} < ∞. We say that α passes the Solovay test S if only finitely many initial segments of α belong to S, and that α is Solovay random if it passes every Solovay test.

Theorem 99 (Solovay). α is Solovay random ⟺ α is 1-random.

Proof. For the direction (⇒), let T be a Martin-Löf test; we show that a Solovay random α passes T. By Lemma 97 we can assume that each T(n) is prefix-free. Put S := {σ : σ ∈ T(n) for some n}. Then

    Σ_{σ∈S} 2^{−|σ|} ≤ Σ_n Σ_{σ∈T(n)} 2^{−|σ|} ≤ Σ_n 2^{−n} = 2 < ∞,

and since

    σ ∈ S ⟺ ∃n ((n, σ) ∈ T),

it is clear that S is recursively enumerable, so S is a Solovay test. Let α be Solovay random. Then α passes S, so there are only finitely many initial segments of α in S. Take m such that α|n ∉ S for all n > m. So for n > m we have α ∉ ⋃_{σ∈T(n)} N_σ, because |σ| ≥ n for all σ ∈ T(n). Thus

    α ∉ ⋂_n ⋃_{σ∈T(n)} N_σ,

that is, α passes T.

As to the direction (⇐), let α be 1-random, and let S be a Solovay test. We are going to show that α passes S. Removing finitely many elements from S, we arrange that

    Σ_{σ∈S} 2^{−|σ|} ≤ 1.

Define U_k ⊆ 2^N by

    U_k := {β : there are at least 2^k many σ ∈ S with β ∈ N_σ}.

Define T ⊆ N × 2^{<N} by letting T(k) be the set of all σ that have at least 2^k many initial segments (σ itself included) in S. Then β ∈ U_k is equivalent to there being at least 2^k many initial segments of β in S, and thus equivalent to β ∈ ⋃_{σ∈T(k)} N_σ. So U_k = ⋃_{σ∈T(k)} N_σ. Hence

    β passes S ⟺ β ∉ ⋂_k ⋃_{σ∈T(k)} N_σ.

So, assuming T is an ML-test, we have that α passes T, which implies that α passes S, as promised.

To show that T is, indeed, an ML-test, we leave it as an exercise to check that T is recursively enumerable. It remains to show that μ(U_k) ≤ 2^{−k} for every k. Take finite subsets S_t of S, with t ∈ N, such that

    S_0 ⊆ S_1 ⊆ S_2 ⊆ ..., and S = ⋃_t S_t.

Put

    U^t_k := {β : there are at least 2^k many initial segments of β in S_t},

so

    U^0_k ⊆ U^1_k ⊆ U^2_k ⊆ ..., and U_k = ⋃_t U^t_k.

It suffices to show that for all t, k,

    μ(U^t_k) ≤ 2^{−k}.

Fix t and let σ_1, ..., σ_n be the distinct elements of S_t. Then

    ∫ (χ_{N_{σ_1}}(β) + ··· + χ_{N_{σ_n}}(β)) dμ(β) ≥ ∫_{β∈U^t_k} (χ_{N_{σ_1}}(β) + ··· + χ_{N_{σ_n}}(β)) dμ(β) ≥ 2^k μ(U^t_k),

and also

    ∫ (χ_{N_{σ_1}}(β) + ··· + χ_{N_{σ_n}}(β)) dμ(β) = Σ_{i=1}^n ∫ χ_{N_{σ_i}}(β) dμ(β) = Σ_{i=1}^n μ(N_{σ_i}) = Σ_{i=1}^n 2^{−|σ_i|} ≤ Σ_{σ∈S} 2^{−|σ|} ≤ 1.

Thus μ(U^t_k) ≤ 2^{−k} as required.

As a consequence of the Theorem and the argument for the (⇒) direction, the set S = ⋃_n T(n) with T a universal ML-test is a universal Solovay test in the sense that passing S is equivalent to passing all Solovay tests.

We use the following notation for the initial segment relation on 2^{<N}:

    ρ ⊆ σ :⟺ ρ is an initial segment of σ.
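The passage from a Solovay test S to the levels T(k) used in the proof of Theorem 99 — σ ∈ T(k) iff σ has at least 2^k initial segments (itself included) in S — can be sketched for a finite S; restricting attention to σ ∈ S does not change ⋃_{σ∈T(k)} N_σ, since the longest prefix of any witness already lies in S (function name is mine):

```python
def T_from_S(S, k):
    """sigma in T(k) iff at least 2^k prefixes of sigma (incl. sigma) lie in S;
    restricted to sigma in S, which yields the same union of cylinders."""
    S = set(S)
    return {s for s in S
            if sum(1 for i in range(1, len(s) + 1) if s[:i] in S) >= 2 ** k}
```

A sequence fails S exactly when it lands in every level of this derived test.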

Theorem 100 (Miller and Yu).

    α is 1-random ⟺ Σ_n 2^{n−K(α|n)} < ∞.

Proof. (⇐) We prove the contrapositive; assume α is not 1-random, so α is not LGC-random, hence K(α|n) < n for infinitely many n, and thus n − K(α|n) > 0 for infinitely many n. Therefore,

    Σ_n 2^{n−K(α|n)} = ∞.

To prove the other direction (⇒), first note that

    Σ_{|σ|=m} Σ_{n≤m} 2^{n−K(σ|n)} = Σ_{|σ|=m} Σ_{ρ⊆σ} 2^{|ρ|−K(ρ)} = Σ_{|ρ|≤m} 2^{m−|ρ|} · 2^{|ρ|−K(ρ)} = 2^m Σ_{|ρ|≤m} 2^{−K(ρ)} ≤ 2^m,

where we use that K is an icm. Let p ∈ N^{≥1} in what follows. Then

    |{σ : |σ| = m, Σ_{n≤m} 2^{n−K(σ|n)} ≥ p}| ≤ 2^m/p,

so for each m,

    μ({α : Σ_{n≤m} 2^{n−K(α|n)} ≥ p}) ≤ 1/p.

We introduce the following subsets of 2^N:

    U_{p,m} := {α : Σ_{n≤m} 2^{n−K(α|n)} ≥ p},

so U_{p,0} ⊆ U_{p,1} ⊆ U_{p,2} ⊆ ... . In addition, define U_p := ⋃_{m∈N} U_{p,m}; note that μ(U_p) ≤ 1/p.

Now let T ⊆ N × 2^{<N} be given by T(k) := {σ : Σ_{n≤|σ|} 2^{n−K(σ|n)} ≥ 2^k}. Then T is recursively enumerable, and ⋃_{σ∈T(k)} N_σ = U_{2^k}. To see why this holds, note the following equivalences:

    α ∈ ⋃_{σ∈T(k)} N_σ ⟺ ∃m (α|m ∈ T(k)) ⟺ ∃m (Σ_{n≤m} 2^{n−K(α|n)} ≥ 2^k) ⟺ α ∈ U_{2^k}.

Hence

    μ(⋃_{σ∈T(k)} N_σ) = μ(U_{2^k}) ≤ 1/2^k,

which proves that T is a ML-test. Assume now that α is 1-random. Then we get k such that α ∉ U_{2^k}, and so α ∉ U_{2^k,m} for each m, that is, Σ_{n≤m} 2^{n−K(α|n)} < 2^k for each m, and thus Σ_n 2^{n−K(α|n)} ≤ 2^k < ∞.

We proved earlier that there are infinitely many σ such that

K(σ) > |σ| + log |σ|.

The following corollary is much stronger.

Corollary 101. Suppose f : N → R satisfies Σ_n 2^{−f(n)} = ∞. Given any 1-random α, there are infinitely many n such that K(α|n) > n + f(n).

Proof. Let α be 1-random. If K(α|n) ≤ n + f(n) for almost all n, then also n − K(α|n) ≥ −f(n) for almost all n. So

    Σ_n 2^{n−K(α|n)} ≥ Σ_n 2^{−f(n)} + const = ∞,

which contradicts Theorem 100.

As an example for the corollary just proved, let f take the value f(n) = log n for n > 0. Then

    Σ_n 2^{−f(n)} ≥ Σ_{n=1}^∞ 1/n = ∞,

so if α is 1-random, then there are infinitely many n with K(α|n) > n + log n.
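The rearrangement Σ_{|σ|=m} Σ_{n≤m} 2^{n−K(σ|n)} = 2^m Σ_{|ρ|≤m} 2^{−K(ρ)} used in the proof of Theorem 100 is pure double counting (each ρ of length ≤ m is a prefix of exactly 2^{m−|ρ|} strings of length m). It can be verified numerically for any stand-in length function L, since K itself is not computable (helper names are mine):

```python
from itertools import product

def strings(n):
    """All binary strings of length n."""
    return ["".join(b) for b in product("01", repeat=n)]

def lhs(L, m):
    # sum over sigma of length m, and over all prefixes sigma|n, n <= m
    return sum(2.0 ** (n - L(s[:n]))
               for s in strings(m) for n in range(m + 1))

def rhs(L, m):
    # the same sum reorganized by the prefix rho, |rho| <= m
    return 2.0 ** m * sum(2.0 ** -L(r)
                          for k in range(m + 1) for r in strings(k))
```

With L(ρ) = |ρ| + 1 (a legitimate icm up to the Kraft bound), both sides evaluate to 2^m (m + 1)/2.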

3.7 Martingales

A martingale is a function f : 2^{<N} → [0, ∞) such that

    f(σ) = (f(σ0) + f(σ1))/2 for all σ ∈ 2^{<N}.
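The fairness condition can be checked level by level for concrete betting strategies. Below, all_on_zero bets the entire capital on the next bit being 0; both names are mine, a sketch only:

```python
def is_martingale(f, depth):
    """Check f(s) == (f(s+'0') + f(s+'1')) / 2 on all strings up to depth."""
    level = [""]
    for _ in range(depth):
        if any(f(s) != (f(s + "0") + f(s + "1")) / 2 for s in level):
            return False
        level = [s + b for s in level for b in "01"]
    return True

def all_on_zero(s):
    """Capital after |s| rounds of betting everything on 0 each round."""
    return float(2 ** len(s)) if "1" not in s else 0.0
```

all_on_zero doubles along 000... and is wiped out by the first 1; it succeeds exactly on the all-zeros sequence, in the sense of Definition 107 below.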

Observation 102. If f and g are martingales, then so are f + g and rf for r ∈ [0, ∞). Note also that if f is a martingale with f(∅) = 0, then f(σ) = 0 for all σ.

Observation 103. Let f be a martingale and σ ≤ τ; then 2^{−|σ|} f(σ) ≥ 2^{−|τ|} f(τ), that is, f(σ) ≥ (1/2^{|τ|−|σ|}) f(τ). For τ = σ0 and τ = σ1 the inequality reduces to f(σ) ≥ ½ f(τ), which holds trivially. The general case follows by induction on |τ| − |σ|.

Theorem 104 (First Kolmogorov Inequality). Let f be a martingale, σ a string, and let X ⊆ {τ : σ ≤ τ} be prefix-free. Then

    2^{−|σ|} f(σ) ≥ Σ_{τ∈X} 2^{−|τ|} f(τ).

Proof. We can assume that X is finite. The theorem holds for X = ∅ and when |X| = 1. Let n ≥ 1, assume inductively that the inequality holds for all X of size at most n, and suppose that |X| = n + 1. Let π ∈ 2^{<N} be of maximal length such that π ≤ τ for all τ ∈ X, and put X_i := {τ ∈ X : πi ≤ τ} for i = 0, 1. By the maximality of π, both X_0 and X_1 are nonempty, so each has size at most n, and X = X_0 ∪ X_1. Applying the inductive hypothesis to (π0, X_0) and (π1, X_1), and then Observation 103, we get

    Σ_{τ∈X} 2^{−|τ|} f(τ) = Σ_{τ∈X_0} 2^{−|τ|} f(τ) + Σ_{τ∈X_1} 2^{−|τ|} f(τ)
                    ≤ 2^{−|π0|} f(π0) + 2^{−|π1|} f(π1)
                    = 2^{−|π|} (f(π0) + f(π1))/2
                    = 2^{−|π|} f(π)
                    ≤ 2^{−|σ|} f(σ),

which concludes the proof.

Recall from the section on prefix-free sets that for each Y ⊆ 2^{<N} there is a prefix-free X ⊆ Y such that ⋃_{τ∈Y} N_τ = ⋃_{τ∈X} N_τ.

Theorem 105 (Second Kolmogorov Inequality). Let f be a martingale, r ∈ (0, ∞), and put S_r(f) := {τ : f(τ) ≥ r}. Then

    μ(⋃_{τ∈S_r(f)} N_τ) ≤ f(∅)/r.

Proof. Take a prefix-free X ⊆ S_r(f) such that

    ⋃_{τ∈S_r(f)} N_τ = ⋃_{τ∈X} N_τ.

Then

    μ(⋃_{τ∈S_r(f)} N_τ) = μ(⋃_{τ∈X} N_τ) = Σ_{τ∈X} 2^{−|τ|}.

Applying the first Kolmogorov inequality (with σ = ∅) gives

    r · μ(⋃_{τ∈S_r(f)} N_τ) = Σ_{τ∈X} 2^{−|τ|} r ≤ Σ_{τ∈X} 2^{−|τ|} f(τ) ≤ f(∅).

Definition 106. A martingale f is said to be effective if there is a recursive F : 2^{<N} × N → Q^{≥0} such that F(σ, s) ≤ F(σ, s + 1) for all σ and s, and f(σ) = lim_{s→∞} F(σ, s) for all σ.

Here we code a nonnegative rational q as the unique pair (a, b) ∈ N^2 such that a and b are coprime, b ≠ 0 and q = a/b.

Definition 107. A martingale f succeeds on α if lim sup_{n→∞} f(α|n) = ∞, that is, f(α|n) takes arbitrarily large values.

These definitions allow us to characterize 1-randomness as follows.

Theorem 108 (Schnorr). α is 1-random ⟺ no effective martingale succeeds on α.

Proof. We do the forward direction here; the backward direction is proved in a stronger form in the next proposition. Let f be an effective martingale given by F. Take q ∈ Q^{>0} such that the martingale g = qf satisfies g(∅) ≤ 1. Define T ⊆ N × 2^{<N} by

    T(k, σ) :⟺ g(σ) > 2^k.

It is easy to see that T is recursively enumerable. In addition,

    μ(⋃_{σ∈T(k)} N_σ) ≤ μ(⋃_{σ∈S_{2^k}(g)} N_σ) ≤ g(∅)/2^k ≤ 1/2^k,

so T is an ML-test. Now we have the following equivalences:

    α fails T ⟺ α ∈ ⋂_k ⋃_{σ∈T(k)} N_σ
          ⟺ for every k there is n with g(α|n) > 2^k
          ⟺ lim sup_{n→∞} g(α|n) = ∞
          ⟺ lim sup_{n→∞} f(α|n) = ∞.

This concludes the proof of ⇒. For the converse, see below.
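The test T(k) = {σ : g(σ) > 2^k} extracted from a martingale in this proof can be probed directly: for a concrete strategy we can ask for the first prefix on which the capital crosses a threshold (a sketch; names are mine):

```python
def first_crossing(g, alpha_bits, threshold):
    """Least prefix length at which g exceeds threshold, else None."""
    prefix = ""
    for b in alpha_bits:
        prefix += b
        if g(prefix) > threshold:
            return len(prefix)
    return None

def bet_on_zero(s):
    """Doubles along 000...; loses everything on the first 1."""
    return float(2 ** len(s)) if "1" not in s else 0.0
```

A sequence fails T exactly when, for every k, some prefix crosses 2^k, i.e. the capital is unbounded along the sequence.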

Proposition 109. Let T be an ML-test. Then there is an effective martingale d_T such that for all α, if α fails T, then lim_{k→∞} d_T(α|k) = ∞.

Proof. By Lemma 97 we can assume that T(n) is prefix-free for all n. Define the martingale d_σ by

    d_σ(τ) = 1 if σ ⊆ τ,
    d_σ(τ) = 2^{|τ|−|σ|} if τ ⊆ σ,
    d_σ(τ) = 0 if σ, τ are incomparable.
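The atomic martingale d_σ can be written out and its fairness spot-checked (a sketch; the name is mine):

```python
def d_sigma(sigma, tau):
    """1 on extensions of sigma, 2^(|tau|-|sigma|) on prefixes of sigma,
    0 when sigma and tau are incomparable."""
    if tau.startswith(sigma):
        return 1.0
    if sigma.startswith(tau):
        return 2.0 ** (len(tau) - len(sigma))
    return 0.0
```

Along a prefix τ of σ, exactly one child of τ stays comparable with σ, which is where the factor 2 in the fairness condition comes from.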

Define d_T : 2^{<N} → [0, ∞] by

    d_T(τ) = Σ_{σ∈⋃_n T(n)} d_σ(τ).

Note that

    d_T(τ) = (d_T(τ0) + d_T(τ1))/2,

so d_T is a martingale if d_T(∅) ≠ ∞ (in which case all values of d_T are finite). We have

    d_T(∅) = Σ_{σ∈⋃_n T(n)} d_σ(∅) = Σ_{σ∈⋃_n T(n)} 2^{−|σ|} ≤ Σ_n Σ_{σ∈T(n)} 2^{−|σ|} ≤ Σ_n 2^{−n} = 2 < ∞,

so d_T is a martingale. The map (σ, τ) ↦ d_σ(τ) into Q^{≥0} is recursive and ⋃_n T(n) is recursively enumerable, from which it follows easily that the infinite sum d_T of the d_σ is an effective martingale.

Suppose α fails T, that is,

    α ∈ ⋂_n ⋃_{σ∈T(n)} N_σ.

Then for all n there is k ≥ n with α|k ∈ T(n). We claim that, given any m, we have d_T(α|k) ≥ m for all sufficiently large k. To see why this claim is true, let m be given. Beginning with n = 0, take k_0 so that α|k_0 ∈ T(0). Then (using n = k_0 + 1) take k_1 > k_0 with α|k_1 ∈ T(k_0 + 1). Proceeding in this fashion, we get an increasing sequence k_0 < k_1 < ... with α|k_i ∈ T(k_{i−1} + 1) for each i ≥ 1. Thus for k ≥ k_m + 1 each α|k_i with i < m contributes 1 to the sum forming d_T(α|k), so d_T(α|k) ≥ m for all k ≥ k_m + 1. This proves the claim.

The ⇐ direction of Schnorr's theorem now follows by considering the contrapositive.

Corollary 110. There is a universal effective martingale d, that is, d is an effective martingale such that for all α the following are equivalent:

(i) α is 1-random,

(ii) lim supn→∞ d(α|n) < ∞,

(iii) lim infn→∞ d(α|n) < ∞.

Proof. Let T be a universal ML-test, and let d = d_T be the corresponding effective martingale. Then (i) ⇒ (ii) by Schnorr's theorem, (ii) ⇒ (iii) is trivial, and (iii) implies that α passes T by the proposition, giving (i).
