
and the four-square theorem

Background on Hamilton’s Quaternions

We have the complex numbers $\mathbb{C} = \{a + bi : a, b \in \mathbb{R}\}$ and their integral analogue, $\mathbb{Z}[i] = \{a + bi : a, b \in \mathbb{Z}\}$, the Gaussian integers, and we can draw a diagram accordingly.

[Diagram: $\mathbb{Z} \subset \mathbb{R} \subset \mathbb{C}$ and $\mathbb{Z} \subset \mathbb{Z}[i] \subset \mathbb{C}$.]

A higher dimensional, but analogous, scenario was discovered by the Irish mathematician William Rowan Hamilton in 1843 whilst crossing Dublin’s Broom Bridge on his way into town.

Definition 1. The Hamilton Quaternions are given by
\[ \mathbb{H} = \{a + bi + cj + dk : a, b, c, d \in \mathbb{R}\} \]
where
\[ i^2 = j^2 = k^2 = -1, \qquad ij = k = -ji, \qquad jk = i = -kj, \qquad ki = j = -ik, \]
and they are a ring.

We would like these to have an integer analogue, and one possibility is
\[ \mathbb{Z}[i, j, k] = \{a + bi + cj + dk : a, b, c, d \in \mathbb{Z}\}. \]
Note that the sum, difference, or product of things in $\mathbb{Z}[i, j, k]$ returns things in $\mathbb{Z}[i, j, k]$. We can extend our diagram accordingly.

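[Diagram: $\mathbb{Z} \subset \mathbb{R} \subset \mathbb{C} \subset \mathbb{H}$ and $\mathbb{Z} \subset \mathbb{Z}[i] \subset \mathbb{Z}[i, j, k] \subset \mathbb{H}$, with $\mathbb{Z}[i] \subset \mathbb{C}$.]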

Examples: We have $1 + i + j + k \in \mathbb{Z}[i, j, k]$, and clearly the sum
\[ (1 + i + j + k) + (1 + i + j + k) = 2 + 2i + 2j + 2k \]
and the difference
\[ (1 + i + j + k) - (1 + i + j + k) = 0 \]
are in $\mathbb{Z}[i, j, k]$. A bit more work shows that the product
\begin{align*}
(1 + i + j + k)(1 + i + j + k) &= 1 + i + j + k + i - 1 + ij + ik + j + ji - 1 + jk + k + ki + kj - 1 \\
&= -2 + 2i + 2j + 2k
\end{align*}
is also in $\mathbb{Z}[i, j, k]$. So from our theorem yesterday, we know $\mathbb{Z}[i, j, k]$ is a subring of $\mathbb{H}$.

The element $1 + i + j + k$ can be thought of as the point one “unit out” in each direction in $\mathbb{R}^4$. We can’t fully visualize this (damn human brains), but it might help to think about it in three dimensions.
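The product in the example above can be checked mechanically. The following is a minimal sketch, assuming a quaternion $a + bi + cj + dk$ is stored as the tuple `(a, b, c, d)`; the function name `qmul` is an illustrative choice and not part of the notes.

```python
def qmul(p, q):
    """Hamilton product of p and q, where (a, b, c, d) stands for a + bi + cj + dk.

    The formulas come from expanding the product and applying
    i^2 = j^2 = k^2 = -1, ij = k = -ji, jk = i = -kj, ki = j = -ik.
    """
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # coefficient of i
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # coefficient of j
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # coefficient of k
    )


I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

q = (1, 1, 1, 1)      # 1 + i + j + k
square = qmul(q, q)   # (-2, 2, 2, 2), i.e. -2 + 2i + 2j + 2k
```

Because the inputs are integer tuples, every output coordinate is again an integer, which is the closure under products noted above.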

Recall, the distance of the point $a + bi + cj + dk$ from the origin is just
\[ \sqrt{(a - 0)^2 + (b - 0)^2 + (c - 0)^2 + (d - 0)^2} = \sqrt{a^2 + b^2 + c^2 + d^2}. \]

Definition 2. We define a norm on $\mathbb{Z}[i, j, k]$ by
\[ \operatorname{norm}(a + bi + cj + dk) = a^2 + b^2 + c^2 + d^2, \]
noting that this is just the square of the distance of the point from the origin. This norm is multiplicative, that is,
\[ \operatorname{norm}(q_1 q_2) = \operatorname{norm}(q_1)\operatorname{norm}(q_2). \]
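Multiplicativity of the norm can be spot-checked numerically. This is only a sanity check on random integer quaternions, not a proof; it is a sketch assuming the tuple representation `(a, b, c, d)` for $a + bi + cj + dk$, with `qmul` and `norm` as illustrative names.

```python
import random


def qmul(p, q):
    """Hamilton product, with (a, b, c, d) standing for a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def norm(q):
    """Square of the distance from the origin: a^2 + b^2 + c^2 + d^2."""
    return sum(x * x for x in q)


random.seed(0)  # fixed seed so the check is repeatable
all_multiplicative = True
for _ in range(1000):
    q1 = tuple(random.randint(-9, 9) for _ in range(4))
    q2 = tuple(random.randint(-9, 9) for _ in range(4))
    if norm(qmul(q1, q2)) != norm(q1) * norm(q2):
        all_multiplicative = False
```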

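Examples: We can use this norm to determine when an integer in $\mathbb{Z}[i, j, k]$ is prime. Since
\[ \operatorname{norm}(-2 + 2i + 2j + 2k) = 16 \]
is composite, this element may be decomposed; indeed, it equals $(1 + i + j + k)^2$. On the other hand,
\[ \operatorname{norm}(-2 + i + j + k) = 7 \]
is prime, so in any factorization one of the factors must have norm 1, and the element can’t be decomposed. More generally, for $\beta \in \mathbb{Z}[i, j, k]$, we have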

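• For $q_1, q_2 \in \mathbb{Z}[i, j, k]$, we have
\[ \operatorname{norm}(\beta q_1 - \beta q_2) = \operatorname{norm}(\beta(q_1 - q_2)) = \operatorname{norm}(\beta)\operatorname{norm}(q_1 - q_2). \]

• Thus multiplying $\mathbb{Z}[i, j, k]$ by $\beta$ has the effect of stretching every squared distance by a factor of $\operatorname{norm}(\beta)$, that is, every distance by a factor of $\sqrt{\operatorname{norm}(\beta)}$ (note, this is a real value).

• This also means all angles are unchanged. So in particular,
\[ \beta i, \quad \beta j, \quad \beta k \]
are all still perpendicular, and just shifted away from the origin.

• This means multiplication by $\beta$ sends a “cube” in $\mathbb{Z}[i, j, k]$ to a “cube” in $\mathbb{Z}[i, j, k]$, possibly just stretching out the sides and rotating.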

in the quaternions

To begin, we recall the division algorithm in the integers. We would like something similar to exist in the quaternions.

The Division Algorithm. For any $a, b \in \mathbb{Z}$ with $b > 0$ there exist unique integers $q$ and $r$ with $0 \le r < b$ such that $a = bq + r$.

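Suppose $\alpha, \beta \in \mathbb{Z}[i, j, k]$ with $\beta \neq 0$, and we want to write
\[ \alpha = \beta\mu + \rho, \quad \text{where } \operatorname{norm}(\rho) < \operatorname{norm}(\beta). \]
Then we would want to compute the remainder
\[ \rho = \alpha - \beta\mu, \]
whose norm measures the distance from $\alpha$ to the nearest “corner” in the multiplication-by-$\beta$ grid. But if $\alpha$ is at the center of a cube, then
\[ \alpha = \beta\left(\mu + \frac{1}{2} + \frac{i}{2} + \frac{j}{2} + \frac{k}{2}\right), \]
which may still be an element of $\mathbb{Z}[i, j, k]$. And consequently,
\[ \operatorname{norm}(\alpha - \beta\mu) = \operatorname{norm}\left(\beta\left(\frac{1}{2} + \frac{i}{2} + \frac{j}{2} + \frac{k}{2}\right)\right) = \operatorname{norm}(\beta)\left(\frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4}\right) = \operatorname{norm}(\beta), \]
so the remainder is not smaller than $\beta$. Suppose instead we include all of the “center points” in the grid; this would be achieved by including
\[ \frac{1 + i + j + k}{2} \]
in the integers.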

Definition 3. The most practical integer analogue inside $\mathbb{H}$ is the Hurwitz integers, which are the set
\[ \mathbb{Z}\left[\frac{1 + i + j + k}{2}, i, j, k\right] = \left\{a + b\,\frac{1 + i + j + k}{2} + ci + dj + ek : a, b, c, d, e \in \mathbb{Z}\right\}. \]
The Hurwitz integers are a ring, clearly an additive group, and now we know they also have the division algorithm!

• A Hurwitz integer is called a Hurwitz prime if it is not the product of two Hurwitz integers of smaller norm.

• Note that every Hurwitz integer can be expressed as a+bi+cj+dk where 2a, 2b, 2c, 2d ∈ Z.
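The effect of adding the center points can be illustrated with a small division-with-remainder routine. This is a sketch under stated assumptions: quaternions are tuples of exact fractions, the quotient sits on the right of $\beta$ (so $\alpha = \beta\mu + \rho$), and the names `nearest_hurwitz` and `hurwitz_divmod` are hypothetical helpers rather than anything defined in the notes. The quotient is taken to be the Hurwitz integer nearest to $\beta^{-1}\alpha$, which is either the coordinate-wise nearest integer point or the coordinate-wise nearest center point.

```python
from fractions import Fraction
from math import floor


def qmul(p, q):
    """Hamilton product, with (a, b, c, d) standing for a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)


def norm(q):
    return sum(x * x for x in q)


def qsub(p, q):
    return tuple(x - y for x, y in zip(p, q))


def nearest_hurwitz(x):
    """Closest Hurwitz integer to x: best of the integer and center-point candidates."""
    ints = tuple(Fraction(round(t)) for t in x)                # nearest point of Z[i, j, k]
    halves = tuple(floor(t) + Fraction(1, 2) for t in x)       # nearest "center point"
    return min((ints, halves), key=lambda m: norm(qsub(x, m)))


def hurwitz_divmod(alpha, beta):
    """Return (mu, rho) with alpha = beta*mu + rho; beta must be nonzero."""
    alpha = tuple(Fraction(t) for t in alpha)
    beta = tuple(Fraction(t) for t in beta)
    n = norm(beta)
    x = tuple(t / n for t in qmul(conj(beta), alpha))          # beta^{-1} * alpha
    mu = nearest_hurwitz(x)
    rho = qsub(alpha, qmul(beta, mu))
    return mu, rho


# alpha = i + k sits at the center of a cube of the grid generated by beta = 1 + i.
alpha, beta = (0, 1, 0, 1), (1, 1, 0, 0)
mu, rho = hurwitz_divmod(alpha, beta)
# Restricting the quotient to Z[i, j, k] (here the corner mu = 0) leaves a
# remainder whose norm equals norm(beta), so it is not smaller than beta.
corner_remainder_norm = norm(qsub(alpha, qmul(beta, (0, 0, 0, 0))))
```

In this example the Hurwitz quotient is $\frac{1 + i + j + k}{2}$ and the remainder is $0$, whereas the best remainder available with a quotient in $\mathbb{Z}[i, j, k]$ has norm $2 = \operatorname{norm}(\beta)$.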

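[Diagram: $\mathbb{Z} \subset \mathbb{R} \subset \mathbb{C} \subset \mathbb{H}$ and $\mathbb{Z} \subset \mathbb{Z}[i] \subset \mathbb{Z}\left[\frac{1 + i + j + k}{2}, i, j, k\right] \subset \mathbb{H}$, with $\mathbb{Z}[i] \subset \mathbb{C}$.]

Examples: Primality can still be checked using norms, for example
\[ \operatorname{norm}\left(3 + \frac{1 + i + j + k}{2}\right) = \operatorname{norm}\left(\frac{7}{2} + \frac{i}{2} + \frac{j}{2} + \frac{k}{2}\right) = \frac{49}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} = \frac{52}{4} = 13, \]
and 13 is prime, so this is a Hurwitz prime!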

Using the quaternions to prove the four square theorem

Definition 4. Any $a + bi \in \mathbb{C}$ has a conjugate
\[ \overline{a + bi} = a - bi, \]
and similarly, any $a + bi + cj + dk \in \mathbb{H}$ has a conjugate
\[ \overline{a + bi + cj + dk} = a - bi - cj - dk. \]

The conjugate satisfies some properties, among them

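• $\overline{r} = r$ for $r \in \mathbb{R}$.

• $\overline{q_1 q_2} = \overline{q_2}\,\overline{q_1}$.

• $q_1 \overline{q_1} = \operatorname{norm}(q_1)$.

Theorem 1. If $p$ is an ordinary prime but not a Hurwitz prime then
\[ p = a^2 + b^2 + c^2 + d^2 \quad \text{where } 2a, 2b, 2c, 2d \in \mathbb{Z}. \]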

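Proof. Suppose that $p$ is an ordinary prime but not a Hurwitz prime, so $p$ has a nontrivial Hurwitz factorization
\[ p = (a + bi + cj + dk)\gamma, \]
where $2a, 2b, 2c, 2d \in \mathbb{Z}$ and $\gamma$ is a Hurwitz integer. Then conjugating both sides,
\[ p = \overline{p} = \overline{(a + bi + cj + dk)\gamma} = \overline{\gamma}(a - bi - cj - dk). \]
But now,
\begin{align*}
p^2 = p\overline{p} &= (a + bi + cj + dk)\gamma\overline{\gamma}(a - bi - cj - dk) \\
&= (a + bi + cj + dk)(a - bi - cj - dk)\gamma\overline{\gamma} \\
&= (a^2 + b^2 + c^2 + d^2)\gamma\overline{\gamma},
\end{align*}
where the second step uses that $\gamma\overline{\gamma} = \operatorname{norm}(\gamma)$ is real and so commutes with every quaternion. Both factors on the right are positive integers, and neither is 1 because the factorization is nontrivial. Since $p$ is prime, it must follow that $p = a^2 + b^2 + c^2 + d^2$.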

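Theorem 2. If $p$ is an ordinary prime but not a Hurwitz prime then $p = a^2 + b^2 + c^2 + d^2$ where $a, b, c, d \in \mathbb{Z}$.

Proof. From Theorem 1 we know $p = a^2 + b^2 + c^2 + d^2$ where $a + bi + cj + dk$ is a Hurwitz integer. If $a, b, c, d$ are integers we are done, so suppose they are all half integers (halves of odd integers). Then
\[ a + bi + cj + dk = \omega + a' + b'i + c'j + d'k \]
where $a', b', c', d'$ are even integers and
\[ \omega = \frac{\pm 1 \pm i \pm j \pm k}{2} \]
for a suitable choice of signs. Notice that $\operatorname{norm}(\omega) = 1$, so in particular $\omega\overline{\omega} = \overline{\omega}\omega = 1$. But now
\begin{align*}
p &= (a + bi + cj + dk)\overline{(a + bi + cj + dk)} \\
&= (a + bi + cj + dk)(a - bi - cj - dk) \\
&= (\omega + a' + b'i + c'j + d'k)\,\overline{(\omega + a' + b'i + c'j + d'k)} \\
&= (\omega + a' + b'i + c'j + d'k)\,\overline{\omega}\,\omega\,\overline{(\omega + a' + b'i + c'j + d'k)} \\
&= (\omega + a' + b'i + c'j + d'k)\overline{\omega}\;\overline{(\omega + a' + b'i + c'j + d'k)\overline{\omega}}.
\end{align*}
But focus on the first factor for a moment:
\[ (\omega + a' + b'i + c'j + d'k)\overline{\omega} = 1 + (a' + b'i + c'j + d'k)\overline{\omega} = A + Bi + Cj + Dk \]
where $A, B, C, D \in \mathbb{Z}$, because $a', b', c', d'$ are even and the coordinates of $\overline{\omega}$ are $\pm\frac{1}{2}$. Consequently,
\[ p = (A + Bi + Cj + Dk)\overline{(A + Bi + Cj + Dk)} = A^2 + B^2 + C^2 + D^2. \]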
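The passage from half-integer coordinates to integer coordinates can be traced on a concrete quaternion. The sketch below assumes the input has all four coordinates equal to halves of odd integers and uses exact fractions; `integer_associate` is a hypothetical helper name. It picks the signs of $\omega$ so that the difference has even integer coordinates, then multiplies on the right by $\overline{\omega}$.

```python
from fractions import Fraction


def qmul(p, q):
    """Hamilton product, with (a, b, c, d) standing for a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)


def norm(q):
    return sum(x * x for x in q)


def integer_associate(q):
    """For q with all coordinates half-odd, return q * conj(omega).

    omega = (+-1 +- i +- j +- k)/2 is chosen coordinate by coordinate so that
    q - omega has even integer coordinates: the sign is + when 2t = 1 (mod 4)
    and - when 2t = 3 (mod 4).
    """
    half = Fraction(1, 2)
    omega = tuple(half if int(2 * t) % 4 == 1 else -half for t in q)
    return qmul(q, conj(omega))


# 3 = (3/2)^2 + (1/2)^2 + (1/2)^2 + (-1/2)^2 with half-integer coordinates
q = (Fraction(3, 2), Fraction(1, 2), Fraction(1, 2), Fraction(-1, 2))
r = integer_associate(q)  # integer coordinates with the same norm, here 3
```

Since $\operatorname{norm}(\overline{\omega}) = 1$ and the norm is multiplicative, the result has the same norm as the input, which is why the sum of four squares is preserved.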

If every odd prime $p$ is the sum of 4 squares, then our work is done, since $2 = 1^2 + 1^2 + 0^2 + 0^2$ and a product of two sums of four squares is again a sum of four squares:
\[ (a^2 + b^2 + c^2 + d^2)(e^2 + f^2 + g^2 + h^2) = l^2 + m^2 + s^2 + t^2. \]
This was shown by Euler in 1748. Notice that was nearly 100 years before Hamilton!
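One way to produce $l, m, s, t$ explicitly is to read them off a quaternion product, since the norm is multiplicative. The following sketch assumes that approach; the function name `four_square_product` is illustrative, and Euler's own derivation predates quaternions.

```python
def four_square_product(x, y):
    """Given x = (a, b, c, d) and y = (e, f, g, h), return (l, m, s, t) with
    (a^2 + b^2 + c^2 + d^2)(e^2 + f^2 + g^2 + h^2) = l^2 + m^2 + s^2 + t^2.

    The outputs are the coordinates of the quaternion product
    (a + bi + cj + dk)(e + fi + gj + hk).
    """
    a, b, c, d = x
    e, f, g, h = y
    return (
        a * e - b * f - c * g - d * h,
        a * f + b * e + c * h - d * g,
        a * g - b * h + c * e + d * f,
        a * h + b * g - c * f + d * e,
    )


def sum_of_squares(v):
    return sum(t * t for t in v)


# 3 = 1 + 1 + 1 + 0 and 7 = 4 + 1 + 1 + 1, so 21 should be a sum of four squares.
three, seven = (1, 1, 1, 0), (2, 1, 1, 1)
twenty_one = four_square_product(three, seven)  # (0, 4, 2, 1): 0 + 16 + 4 + 1 = 21
```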

Lemma 1. If $p$ is an odd prime then there are integers $m$ and $l$ such that $p$ divides $1 + m^2 + l^2$.

Proof. Let $p = 2n + 1$ where $n \in \mathbb{Z}$. Then for any $m, l \in \{0, 1, \ldots, n\}$, we have $m + l < p$ and $|m - l| < p$, and so
\[ m^2 \equiv l^2 \bmod p \;\Rightarrow\; (m + l)(m - l) \equiv 0 \bmod p \;\Rightarrow\; m = l. \]
This means there are $n + 1$ incongruent choices for $l^2 \bmod p$, and $n + 1$ incongruent choices of $-1 - m^2 \bmod p$. But there are only $2n + 1$ equivalence classes mod $p$, so for some choice of $m, l$ we must have $l^2 \equiv -1 - m^2 \bmod p$; in other words, $p \mid 1 + m^2 + l^2$.
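The pigeonhole argument is constructive enough to turn into a short search. This is a minimal sketch, assuming the input is an odd prime; `find_m_l` is a hypothetical name. It tabulates $l^2 \bmod p$ for $0 \le l \le n$ and then looks for an $m$ in the same range whose value $-1 - m^2 \bmod p$ appears in the table.

```python
def find_m_l(p):
    """For an odd prime p, return (m, l) with 0 <= m, l <= (p - 1)/2
    such that p divides 1 + m^2 + l^2."""
    n = (p - 1) // 2
    # The n + 1 residues l^2 mod p for l = 0, ..., n are pairwise distinct.
    square_root_of = {(l * l) % p: l for l in range(n + 1)}
    for m in range(n + 1):
        target = (-1 - m * m) % p
        if target in square_root_of:
            return m, square_root_of[target]
    # Unreachable for odd primes by the pigeonhole argument in the proof.
    raise ValueError("no solution found; is p an odd prime?")


m, l = find_m_l(7)  # m = 2, l = 3, since 1 + 4 + 9 = 14 = 2 * 7
```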

Theorem 3. Every natural number is the sum of four squares.

Proof. Suppose $p$ is an odd prime, so $p = 2n + 1$. Then Lemma 1 (a classical result of Lagrange) shows that $p \mid 1 + l^2 + m^2$ for some integers $l$ and $m$. But
\[ 1 + l^2 + m^2 = (1 + li + mj)(1 - li - mj). \]
If $p$ were a Hurwitz prime, then $p$ would have to divide $(1 + li + mj)$ or $(1 - li - mj)$. But
\[ \frac{1}{p} + \frac{l}{p}i + \frac{m}{p}j \quad \text{and} \quad \frac{1}{p} - \frac{l}{p}i - \frac{m}{p}j \]
are both not Hurwitz integers, since $\frac{1}{p}$ is neither an integer nor a half integer. Hence $p$ is not a Hurwitz prime, and therefore, by Theorem 2, $p$ is the sum of 4 squares. Now the claim follows from Euler’s identity.
