
Codes Over Z4 2.1 the Puzzle What Is the Largest Binary Code of Length 16 Having Minimum Distance 6? It Turns out That No Such Code Can Have More Than 256 Codewords


MT5821 Advanced Combinatorics

2 Codes over Z4

2.1 The puzzle

What is the largest binary code of length 16 having minimum distance 6? It turns out that no such code can have more than 256 codewords. This suggests that there might be a linear code of dimension 8; but in fact it can be shown that no such code exists, the largest dimension of a linear code with these parameters being 7. The problem of constructing such a code (necessarily non-linear) was posed by Robinson, and solved by Nordstrom (who was a high-school student who happened to be in the audience of Robinson's lecture). This code is now known as the Nordstrom–Robinson code. It, and the 7-dimensional linear code, were constructed in Exercise 1.6.

The Nordstrom–Robinson code has several remarkable properties. A code $C$ of length $n$ is said to be distance-invariant if, for any $i \in \{0,1,\ldots,n\}$ and any $c \in C$, the number $A_i(c)$ of codewords at distance $i$ from $c$ depends only on $i$, not on $c$. Any linear code is distance-invariant: subtracting $c$ from every codeword doesn't change distances, so $A_i(c) = A_i(0)$ is the number of words of weight $i$. It turns out that the Nordstrom–Robinson code, though non-linear, is distance-invariant.

In a distance-invariant code, we can talk about the distance enumerator

$$F(X,Y) = \sum_{i=0}^{n} A_i(c)\, X^{n-i} Y^{i},$$

since this does not depend on the choice of $c$. If the code is linear, it is just the weight enumerator of the code. For the Nordstrom–Robinson code, the distance enumerator is

$$F(X,Y) = X^{16} + 112X^{10}Y^{6} + 30X^{8}Y^{8} + 112X^{6}Y^{10} + Y^{16}.$$

Now some calculation shows that

$$F(X,Y) = \frac{1}{256}\, F(X+Y,\, X-Y).$$

If the code were linear and self-dual, this would be the MacWilliams theorem. But the code is not linear, so self-duality does not make sense!

In the early 1970s, the puzzle deepened. Two families of non-linear codes were found by Kerdock and Preparata respectively: these are the Kerdock codes $K_n$ and the Preparata codes $P_n$. Each of $K_n$ and $P_n$ has length $4^{n+1}$, and $K_1 = P_1$ is the Nordstrom–Robinson code.

• Each is better than any linear code with the same length and minimum distance.
• Each is distance-invariant, and their distance enumerators satisfy

$$F_{P_n}(X,Y) = \frac{1}{|K_n|}\, F_{K_n}(X+Y,\, X-Y)$$

(which implies the same equation with $P$ and $K$ reversed).

In other words, although they are non-linear, they behave like a dual pair of linear codes! The mystery remained for two decades, during which time some eminent mathematicians speculated that it was just coincidence. Finally, it was resolved by five authors, Hammons, Kumar, Calderbank, Sloane and Solé, in 1995.

As well as resolving the mystery, these authors did more. For a linear code, encoding is very simple; decoding is not so simple, but we have general methods such as syndrome decoding to help. For non-linear codes, on the face of it, encoding and (especially) decoding are very hard. The authors showed how linear methods could be used to implement all these non-linear codes. This is important because, as noted, they are better than comparable linear codes.
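
The identity $F(X,Y) = \frac{1}{256} F(X+Y, X-Y)$ for the Nordstrom–Robinson enumerator can be checked mechanically. The following Python sketch is an illustration only and is not part of the original notes; the names `A` and `macwilliams_transform` are chosen here for convenience. It expands each term $(X+Y)^{n-i}(X-Y)^{i}$ with binomial coefficients and collects the coefficient of $X^{n-k}Y^{k}$.

```python
from math import comb

# Distance enumerator of the Nordstrom-Robinson code, stored as {i: A_i},
# where A_i is the coefficient of X^(16-i) Y^i.
N = 16
A = {0: 1, 6: 112, 8: 30, 10: 112, 16: 1}


def macwilliams_transform(coeffs, n):
    """Coefficients of F(X+Y, X-Y), where F = sum_i A_i X^(n-i) Y^i.

    Entry k of the result is the coefficient of X^(n-k) Y^k.
    """
    out = [0] * (n + 1)
    for i, a in coeffs.items():
        for k in range(n + 1):
            term = 0
            for j in range(k + 1):
                # j factors of (-Y) come from (X-Y)^i,
                # k-j factors of Y come from (X+Y)^(n-i).
                term += (-1) ** j * comb(i, j) * comb(n - i, k - j)
            out[k] += a * term
    return out


transformed = macwilliams_transform(A, N)
size = sum(A.values())  # number of codewords, F(1, 1)

# Divide by the number of codewords and drop the zero coefficients.
rescaled = {k: c // size for k, c in enumerate(transformed) if c != 0}
divisible = all(c % size == 0 for c in transformed)
print(rescaled == A and divisible)
```

If the identity holds, every transformed coefficient is divisible by the number of codewords and `rescaled` coincides with `A`.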

2.2 The Gray map

The solution involves codes over the alphabet $\mathbb{Z}_4$, the integers mod 4. We regard the four elements of $\mathbb{Z}_4$ as being arranged around a circle, and define the distance $d_L$ between two of them as the number of steps apart they are: for example, $d_L(1,3) = 2$, but $d_L(0,3) = 1$. Now we replace the Hamming distance between two words $a = (a_1,\ldots,a_n)$ and $b = (b_1,\ldots,b_n)$ of $\mathbb{Z}_4^n$ by the Lee distance, defined by

$$d_L(a,b) = \sum_{i=1}^{n} d_L(a_i,b_i).$$

Similarly the Lee weight of $a$ is $\mathrm{wt}_L(a) = d_L(a,0)$.

Now, if $C$ is a $\mathbb{Z}_4$-linear code, that is, an additive subgroup of $\mathbb{Z}_4^n$, then the Lee weight enumerator of $C$ is given by

$$LW_C(X,Y) = \sum_{c \in C} X^{2n-\mathrm{wt}_L(c)}\, Y^{\mathrm{wt}_L(c)}.$$

(Note that the maximum possible Lee weight of a word of length $n$ is $2n$.) It turns out that there is a version of MacWilliams' Theorem connecting the Lee weight enumerators of a $\mathbb{Z}_4$-linear code $C$ and its dual $C^\perp$ (with respect to the natural inner product).

The set $\mathbb{Z}_4$, with the Lee metric $d_L$, is isometric to the set $\mathbb{Z}_2^2$ with the Hamming metric, under the Gray map $\gamma$, defined by

$$\gamma(0) = 00, \quad \gamma(1) = 01, \quad \gamma(2) = 11, \quad \gamma(3) = 10.$$

[Figure: on the left, the elements 0, 1, 2, 3 of $\mathbb{Z}_4$ arranged around a circle; on the right, the corresponding points 00, 01, 11, 10 of $\mathbb{Z}_2^2$ at the corners of a square.]

The crucial property to observe is that the Lee distance between points on the left (the number of steps round the circle) is equal to the Hamming distance between the corresponding pairs on the right.
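
This property is small enough to check exhaustively. The sketch below is an illustration in Python, not part of the original notes; the helper names `lee_distance` and `hamming_distance` are chosen here. It compares the two distances for all 16 ordered pairs of elements of $\mathbb{Z}_4$.

```python
# Gray map on single elements of Z4, as defined in the text.
GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}


def lee_distance(a, b):
    """Number of steps between a and b around the circle Z4."""
    d = (a - b) % 4
    return min(d, 4 - d)


def hamming_distance(u, v):
    """Number of positions in which the binary tuples u and v differ."""
    return sum(x != y for x, y in zip(u, v))


# The Gray map is an isometry if the two distances agree on every pair.
isometric = all(
    lee_distance(a, b) == hamming_distance(GRAY[a], GRAY[b])
    for a in range(4)
    for b in range(4)
)
print(isometric)
```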

Digression. A Gray code is an arrangement of the $2^n$ binary $n$-tuples in a sequence so that any two consecutive terms in the sequence have Hamming distance 1. So, for example, $(00,01,11,10)$ is a Gray code. Gray codes exist for all possible lengths $n$. If $(v_1,\ldots,v_{2^n})$ is a Gray code for $n$-tuples, then we obtain a Gray code for $(n+1)$-tuples as

$$(v_1 0,\ v_2 0,\ \ldots,\ v_{2^n} 0,\ v_{2^n} 1,\ \ldots,\ v_2 1,\ v_1 1).$$

Indeed, we see that this construction gives us a circular sequence, since the Hamming distance between the first and last terms is also 1.
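
The inductive construction can be written out directly. The following Python sketch is illustrative only; the function names `gray_code` and `is_circular_gray` are chosen here and do not come from the notes. It starts from the empty tuple and repeatedly appends 0 to the sequence and 1 to its reversal.

```python
def gray_code(n):
    """Gray code for binary n-tuples via the reflect-and-append step.

    Given (v_1, ..., v_m), the next level is
    (v_1 0, ..., v_m 0, v_m 1, ..., v_1 1).
    """
    seq = [()]  # the single 0-tuple
    for _ in range(n):
        seq = [v + (0,) for v in seq] + [v + (1,) for v in reversed(seq)]
    return seq


def is_circular_gray(seq):
    """True if consecutive terms, including last-to-first, differ in one place."""
    m = len(seq)
    return all(
        sum(x != y for x, y in zip(seq[k], seq[(k + 1) % m])) == 1
        for k in range(m)
    )


for n in range(1, 7):
    seq = gray_code(n)
    print(n, len(seq), len(set(seq)) == 2 ** n, is_circular_gray(seq))
```

Note that this construction appends the new digit on the right, so for $n = 2$ it yields a Gray code different from the example $(00,01,11,10)$, though equally valid.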

Gray codes are used in analog-to-digital conversion. If you write the integers from 0 to $2^n - 1$ in base 2, at various points in the sequence the distance between consecutive terms can be large. If we take a reading when the quantity being measured lies between two such values, then any of the changing digits could be read incorrectly. (Imagine taking a reading from the odometer or mileage gauge of a car as it was changing from 39999 to 40000: you might get any one of 32 values between 30000 and 49999.) If instead we use a Gray code, the possible error is restricted to one digit, and we will get one of the values on either side of the true reading.

Now we extend the definition of the Gray map to map from $\mathbb{Z}_4^n$ to $\mathbb{Z}_2^{2n}$ by

$$\gamma(a_1,\ldots,a_n) = (\gamma(a_1),\ldots,\gamma(a_n)).$$

It is easily seen that $\gamma$ is an isometry from $\mathbb{Z}_4^n$ (with the Lee metric) to $\mathbb{Z}_2^{2n}$ (with the Hamming metric).

The Gray map is non-linear, so the image of a $\mathbb{Z}_4$-linear code $C$ is usually a non-linear binary code. But the isometry property shows that $\gamma(C)$ is necessarily distance-invariant (since the linear code $C$ is), and that its distance enumerator is equal to the Lee weight enumerator of $C$. Thus, taking a $\mathbb{Z}_4$-linear code and its dual, and applying the Gray map, we obtain a pair of formally dual non-linear binary codes.

Hammons et al. show that, if this procedure is applied to the $\mathbb{Z}_4$ analogue of the extended Hamming codes and their duals, then the Preparata and Kerdock codes are obtained. Thus, the mystery is explained. (There is a small historical inaccuracy in this statement. They obtained, not the original Preparata codes, but another family of codes with the same weight enumerators, hence just as good in practice.)

Here is an example of a linear $\mathbb{Z}_4$ code whose Gray map image is non-linear. The code $C$ is spanned by 011 and 112. So the 16 words are

000, 011, 022, 033, 112, 123, 130, 101,
220, 231, 202, 213, 332, 303, 310, 321,

and their Gray map images are

000000, 000101, 001111, 001010, 010111, 011110, 011000, 010001,
111100, 111001, 110011, 110110, 101011, 100010, 100100, 101101.

So the Gray map image of the code contains 000101 and 010111, but not their sum 010010.

But the Lee weight enumerator of the $\mathbb{Z}_4$-code and the Hamming weight enumerator of its Gray map image are both $X^6 + 6X^4Y^2 + 9X^2Y^4$, equal as they should be.

There is a temporary web page describing the Nordstrom–Robinson code at

http://en.wikipedia.org/wiki/User:Nmonje
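
The example of the code spanned by 011 and 112 can be reproduced by brute force. The Python sketch below is an illustration under the definitions above; helper names such as `span_z4`, `gray` and `lee_weight` are chosen here. It lists the 16 codewords, applies the Gray map, tallies Lee and Hamming weights, and tests whether the sum of two image words lies in the image.

```python
from collections import Counter
from itertools import product

GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}
GENERATORS = [(0, 1, 1), (1, 1, 2)]


def span_z4(generators):
    """All Z4-linear combinations of the given generator words."""
    n = len(generators[0])
    return {
        tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % 4 for i in range(n))
        for coeffs in product(range(4), repeat=len(generators))
    }


def gray(word):
    """Gray map applied coordinate by coordinate."""
    return tuple(bit for a in word for bit in GRAY[a])


def lee_weight(word):
    """Lee weight: each coordinate contributes its distance from 0 on the circle."""
    return sum(min(a, 4 - a) for a in word)


code = span_z4(GENERATORS)
image = {gray(c) for c in code}

# Weight distributions as {weight: number of words}.
lee_counts = dict(Counter(lee_weight(c) for c in code))
hamming_counts = dict(Counter(sum(u) for u in image))

# Binary sum of the images of the two generators.
u, v = gray((0, 1, 1)), gray((1, 1, 2))
binary_sum = tuple(x ^ y for x, y in zip(u, v))

print(len(code), lee_counts == hamming_counts, binary_sum in image)
```

Here `lee_counts` maps each Lee weight $w$ to the coefficient of $X^{6-w}Y^{w}$ in the Lee weight enumerator.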

2.3 Chains of binary codes

Another approach to $\mathbb{Z}_4$-linear codes is via a representation as pairs of $\mathbb{Z}_2$-linear codes. Let $C$ be a $\mathbb{Z}_4$-linear code. We construct binary codes $C_1$ and $C_2$ as follows. $C_1$ is obtained just by reading the words of $C$ modulo 2; and $C_2$ is obtained by selecting the words of $C$ in which all coordinates are even, and replacing the entries 0 and 2 mod 4 by 0 and 1 mod 2.

To state the next theorem, we must define a more general weight enumerator associated with a $\mathbb{Z}_4$-linear code $C$. This is the symmetrised weight enumerator of $C$, defined as follows:

$$SW_C(X,Y,Z) = \sum_{c \in C} X^{n_0(c)}\, Y^{n_2(c)}\, Z^{n_{13}(c)},$$

where $n_0(c)$ is the number of coordinates of $c$ equal to 0; $n_2(c)$ the number of coordinates equal to 2; and $n_{13}(c)$ the number of coordinates equal to 1 or 3. Since these coordinates contribute respectively 0, 2, and 1 to the Lee weight, we have

$$LW_C(X,Y) = SW_C(X^2, Y^2, XY).$$
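
As a worked illustration (not part of the original notes), take the code spanned by 011 and 112 from Section 2.2. Sorting its 16 codewords by the triple $(n_0, n_2, n_{13})$ gives the symmetrised weight enumerator, and the substitution recovers the Lee weight enumerator found earlier:

```latex
% C is the Z4-linear code spanned by 011 and 112 (16 codewords).
% Number of codewords with each triple (n_0, n_2, n_13):
%   (3,0,0): 1    (1,0,2): 6    (1,2,0): 3    (0,1,2): 6
\begin{align*}
SW_C(X,Y,Z)       &= X^3 + 6XZ^2 + 3XY^2 + 6YZ^2, \\
SW_C(X^2,Y^2,XY)  &= X^6 + 6X^2(XY)^2 + 3X^2Y^4 + 6Y^2(XY)^2 \\
                  &= X^6 + 6X^4Y^2 + 9X^2Y^4 = LW_C(X,Y).
\end{align*}
```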

Theorem 2.1 The pair $(C_1,C_2)$ of binary codes associated with a $\mathbb{Z}_4$-linear code $C$ satisfies

(a) $C_1 \subseteq C_2$;

(b) $|C| = |C_1| \cdot |C_2|$;

(c) $W_{C_1}(X,Y) = SW_C(X,X,Y)/|C_2|$ and $W_{C_2}(X,Y) = SW_C(X,Y,0)$.

Proof (a) If $v \in C$, then doubling $v$ gives a word with all coordinates even; the corresponding word in $C_2$ is obtained by reading $v$ mod 2. So $C_1 \subseteq C_2$.

(b) $C_1$ is the image of $C$ under the natural homomorphism from $\mathbb{Z}_4^n$ to $\mathbb{Z}_2^n$ which simply reads each coordinate mod 2, and $C_2$ is naturally bijective with the kernel of this map; so $|C| = |C_1| \cdot |C_2|$.

The proof of (c) is an exercise.

We call a pair $(C_1,C_2)$ of binary linear codes with $C_1 \subseteq C_2$ a chain of binary codes. Every chain of binary codes arises from a $\mathbb{Z}_4$-linear code in the manner of the theorem. For suppose that binary codes $C_1$ and $C_2$ are given with $C_1 \subseteq C_2$. Let

$$C = \{v_1 + 2v_2 : v_1 \in C_1,\ v_2 \in C_2\},$$

where the elements 0 and 1 of $\mathbb{Z}_2$ are identified with 0 and 1 in $\mathbb{Z}_4$ for this construction. Then the preceding construction applied to $C$ recovers $C_1$ and $C_2$. So every chain of codes arises from a $\mathbb{Z}_4$-linear code.

However, the correspondence fails to be bijective, and many important properties are lost. For example, the two $\mathbb{Z}_4$-codes $\{000,110,220,330\}$ and $\{000,112,220,332\}$ give rise to the same pair of binary codes (with $C_1 = C_2 = \{000,110\}$) but have different symmetrised weight enumerators (and so different Lee weight enumerators). The problem of describing all $\mathbb{Z}_4$-linear codes arising from a given chain has not been solved. It resembles in some ways the "extension problem" in group theory.
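
The two codes in this example can be compared by direct computation. The Python sketch below is illustrative only; the function names `chain` and `swe` are chosen here. It derives $(C_1, C_2)$ from each code as described before Theorem 2.1 and tabulates the symmetrised weight enumerator as a count of codewords for each triple $(n_0, n_2, n_{13})$.

```python
from collections import Counter


def chain(code):
    """Binary codes (C1, C2) associated with a Z4-linear code.

    C1: all codewords read mod 2.
    C2: codewords with all coordinates even, with 0 -> 0 and 2 -> 1.
    """
    c1 = {tuple(a % 2 for a in w) for w in code}
    c2 = {tuple(a // 2 for a in w) for w in code if all(a % 2 == 0 for a in w)}
    return c1, c2


def swe(code):
    """Symmetrised weight enumerator as {(n0, n2, n13): number of codewords}."""
    return dict(
        Counter((w.count(0), w.count(2), w.count(1) + w.count(3)) for w in code)
    )


CODE_A = {(0, 0, 0), (1, 1, 0), (2, 2, 0), (3, 3, 0)}
CODE_B = {(0, 0, 0), (1, 1, 2), (2, 2, 0), (3, 3, 2)}

chain_a, chain_b = chain(CODE_A), chain(CODE_B)
same_chain = chain_a == chain_b
same_swe = swe(CODE_A) == swe(CODE_B)

# Theorem 2.1(b): |C| = |C1| * |C2| for both codes.
sizes_ok = all(
    len(code) == len(c1) * len(c2)
    for code, (c1, c2) in ((CODE_A, chain_a), (CODE_B, chain_b))
)
print(same_chain, same_swe, sizes_ok)
```

In this encoding the first code has enumerator $X^3 + XY^2 + 2XZ^2$ and the second $X^3 + XY^2 + 2YZ^2$.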

Exercises

2.1. Show that the Gray map image of the $\mathbb{Z}_4$ code spanned by 1111 and 0123 is a (linear) self-dual [8,4,4] code.

2.2. Prove that the Nordstrom–Robinson code as defined in Exercise 1.6 is distance-invariant and has the claimed weight enumerator.

2.3. Prove Theorem 2.1(c). Verify the conclusion directly for the two codes in the example following the theorem. Construct the images of these two codes under the Gray map.

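2.4. Show that the $\mathbb{Z}_4$-linear code with generator matrix

$$\begin{pmatrix}
1 & 3 & 1 & 2 & 1 & 0 & 0 & 0 \\
1 & 0 & 3 & 1 & 2 & 1 & 0 & 0 \\
1 & 0 & 0 & 3 & 1 & 2 & 1 & 0 \\
1 & 0 & 0 & 0 & 3 & 1 & 2 & 1
\end{pmatrix}$$

is equal to its dual and has Lee weight enumerator

$$X^{16} + 112X^{10}Y^{6} + 30X^{8}Y^{8} + 112X^{6}Y^{10} + Y^{16}.$$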

(This is the code whose Gray map image is the Nordstrom–Robinson code.)
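
A brute-force computation can serve as a sanity check on the statement of Exercise 2.4, though it does not replace a proof. The Python sketch below is an illustration, with names chosen here. It assumes the standard fact that $|C| \cdot |C^\perp| = 4^n$ for a $\mathbb{Z}_4$-linear code of length $n$, so that a self-orthogonal code with $4^{n/2}$ words is self-dual.

```python
from collections import Counter
from itertools import product

# Rows of the generator matrix from Exercise 2.4.
G = [
    (1, 3, 1, 2, 1, 0, 0, 0),
    (1, 0, 3, 1, 2, 1, 0, 0),
    (1, 0, 0, 3, 1, 2, 1, 0),
    (1, 0, 0, 0, 3, 1, 2, 1),
]
N = 8

# All Z4-linear combinations of the rows.
code = {
    tuple(sum(c * row[i] for c, row in zip(coeffs, G)) % 4 for i in range(N))
    for coeffs in product(range(4), repeat=len(G))
}


def dot(u, v):
    """Natural inner product on Z4^n."""
    return sum(x * y for x, y in zip(u, v)) % 4


# Orthogonality of the generators extends to the whole code by linearity.
self_orthogonal = all(dot(g, h) == 0 for g in G for h in G)

# Assumption: |C| * |C^perp| = 4^N, so |C|^2 = 4^N forces C = C^perp.
self_dual = self_orthogonal and len(code) ** 2 == 4 ** N

# Lee weight distribution as {weight: number of codewords}.
lee_counts = dict(Counter(sum(min(a, 4 - a) for a in w) for w in code))
print(len(code), self_dual, sorted(lee_counts.items()))
```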

2.5. Prove that, for any $a,b \in \mathbb{Z}_4$, we have

$$\gamma(a+b) = \gamma(a) + \gamma(b) + (\gamma(a) + \gamma(-a)) * (\gamma(b) + \gamma(-b)),$$

where $*$ denotes componentwise product: $(a,b) * (c,d) = (ac,bd)$. Hence prove that a (not necessarily linear) binary code $C$ is equivalent to the Gray map image of a linear $\mathbb{Z}_4$ code if and only if there is a fixed-point-free involutory permutation $\sigma$ of the coordinates such that, for all $u,v \in C$, we have

$$u + v + (u + u^\sigma) * (v + v^\sigma) \in C,$$

where $*$ is the componentwise product of binary vectors of arbitrary length. (Define $\sigma$ so that, if $u = \gamma(c)$, then $u^\sigma = \gamma(-c)$; this permutation interchanges the two coordinates corresponding to each coordinate of the $\mathbb{Z}_4$ code.)

2.6. Construct Gray codes of length $n$ over the alphabet $\{0,1,\ldots,b-1\}$ for any base $b$. Can you do a little more: arrange the $n$-tuples in a sequence so that any two consecutive tuples in the sequence differ in only one position, and have $x$ and $x+1$ there, in either order, for some $x$?

If you are familiar with the Tower of Hanoi puzzle, can you explain its connection with a Gray code?
