
Coding Theory

RWTH Aachen, SS 2019 · Universität Duisburg-Essen, WS 2011 · RWTH Aachen, SS 2006

Jürgen Müller

Contents

I ...... 1
0 Introduction ...... 1
1 Parity check codes ...... 1
2 Information theory ...... 7

II ...... 14
3 Block codes ...... 14
4 Linear codes ...... 18
5 Bounds for codes ...... 23
6 Hamming codes ...... 31
7 Weight enumerators ...... 37

III ...... 38
8 Cyclic codes ...... 39
9 BCH codes ...... 45
10 Minimum distance of BCH codes ...... 53
11 Quadratic residue codes ...... 59

IV ...... 70
12 Exercises to Part I (in German) ...... 70
13 Exercises to Part II (in German) ...... 73
14 Exercises to Part III (in German) ...... 80
15 References ...... 85

I

0 Introduction

(0.1) The basic model of communication is that of sending information between communication partners, Alice and Bob say, who communicate through some channel, which might be anything as, for example, a line, radio, an audio compact disc, a keyboard, and so on. This leads to the following questions:

• Information theory. What is information? How much information can be sent through a channel per time unit?

• Coding theory. The channel might be noisy, that is information might be changed randomly when sent through the channel. How is it possible to recover a sufficiently large fraction of the original information from distorted data?

• Cryptography. The channel might be insecure, that is information which is intended to be kept private to Alice and Bob might be caught by an opponent, Oscar say, or even be changed deliberately by Oscar. How can this be prevented?

(0.2) Alphabets. A finite set X such that |X| ≥ 2 is called an alphabet, its elements are called letters or symbols. A finite sequence w = [x1, ..., xn] consisting of n ∈ N letters xi ∈ X is called a word over X of length l(w) = n. The empty sequence ε is called the empty word, and we let l(ε) := 0. Let X^n be the set of all words of length n ∈ N0, and let X^* := ∐_{n∈N0} X^n. For v, w ∈ X^* let vw ∈ X^* be their concatenation. We have εv = v = vε and (uv)w = u(vw), for u, v, w ∈ X^*. Hence X^* is a monoid, the free monoid over X, and the length function l: X^* → (N0, +): w ↦ l(w) is a monoid homomorphism.

To describe numbers, typically alphabets Zq := {0, ..., q−1} for q ∈ N are used, for example q = 10. In computer science, the alphabet Z2 = {0, 1}, whose elements are called binary digits or bits, the alphabet Z2^8, whose elements are called bytes, and the hexadecimal alphabet {0, ..., 9, A, B, C, D, E, F} being in bijection with Z16 are used. For written texts the alphabet {a, ..., z} being in bijection with Z26, and the American Standard Code for Information Interchange (ASCII) alphabet being in bijection with Z128 are used. Then, to transmit information, it has to be encoded into words over an alphabet X suitable for the chosen channel, and words have to be decoded again after transmission. Thus, a code is just a subset ∅ ≠ C ⊆ X^*; then C is interpreted as the set of all words representing sensible information.

1 Parity check codes

Parity check codes are used to detect typing errors. They are not capable of correcting errors, and thus are used whenever the data can easily be reentered

Table 1: Typing errors.

error                   example      frequency
single                  a → b          79.0%
adjacent transposition  ab → ba        10.2%
jump transposition      abc → cba       0.8%
twin                    aa → bb         0.6%
jump twin               aca → bcb       0.3%
phonetic                a0 → 1a         0.5%
random                                  8.6%

again. Typical typing errors and their frequencies are given in Table 1; an example of a phonetic error is replacing ‘thirty’ by ‘thirteen’.

(1.1) Example: The ISBN [1968, 2007]. The International Standard Book Number is used to identify books, where up to the year 2006 the standard was ISBN-10, which from the year 2007 on has been replaced by ISBN-13. The ISBN-10 is formed as follows:

The alphabet is Z11, where 10 is replaced by the Roman letter X, and words [x1; x2, ..., x6; x7, ..., x9; x10] ∈ Z10^9 × Z11 have length 10, where X might possibly occur only as a last letter. Here x1, ..., x9 are information symbols, where x1 is the group code, x1 ∈ {0, 1} referring to English, x1 = 2 referring to French, and x1 = 3 referring to German, [x2, ..., x6] is the publisher code, [x7, ..., x9] is the title code, and x10 is a check symbol fulfilling x10 = Σ_{i=1}^{9} i·xi ∈ Z11. Hence a valid ISBN-10 is an element of the Z11-subspace {[x1, ..., x10] ∈ Z11^10; Σ_{i=1}^{10} i·xi = 0 ∈ Z11} ≤ Z11^10. From 2007 on the ISBN-13 is used: After a 3-letter prefix, being a country code 978 or 979 referring to 'bookland', the first 9 letters of the ISBN-10 are taken, and then a check symbol is added such that the EAN standard is fulfilled, see (1.2).

For example, a valid ISBN-10 is '1-58488-508-4': We have 1·1 + 2·5 + 3·8 + 4·4 + 5·8 + 6·8 + 7·5 + 8·0 + 9·8 = 246 = 4 ∈ Z11. The corresponding ISBN-13 is '978-1-58488-508-5'; indeed we have 9 + 3·7 + 8 + 3·1 + 5 + 3·8 + 4 + 3·8 + 8 + 3·5 + 0 + 3·8 = 145 = 5 = −5 ∈ Z10.
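The ISBN-10 check condition can be sketched in a few lines; the function names are mine, not part of the notes:

```python
def isbn10_check_digit(info):
    """Check symbol x10 = sum_{i=1}^{9} i*x_i mod 11 for the 9 information digits."""
    return sum(i * x for i, x in enumerate(info, start=1)) % 11

def isbn10_is_valid(isbn):
    """Valid iff sum_{i=1}^{10} i*x_i = 0 in Z_11, where the letter 'X' stands for 10."""
    xs = [10 if c == 'X' else int(c) for c in isbn if c not in '- ']
    return len(xs) == 10 and sum(i * x for i, x in enumerate(xs, start=1)) % 11 == 0
```

For the example above, the 9 information digits of '1-58488-508-4' yield the check symbol 4.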

(1.2) Example: The EAN [1977]. The International Article Number (EAN), formerly European Article Number, is formed as follows: The alphabet is Z10, and words [x1, ..., x3; x4, ..., x7; x8, ..., x12; x13] ∈ Z10^13 have length 13. Here x1, ..., x12 are information symbols, where [x1, ..., x3] is the country code, x1 = 4 referring to Germany, [x4, ..., x7] is the company code, [x8, ..., x12] is the article code, and x13 is a check symbol fulfilling x13 = −Σ_{i=1}^{6} (x_{2i−1} + 3·x_{2i}) ∈ Z10. Hence a valid EAN is an element of the set

{[x1, ..., x13] ∈ Z10^13; Σ_{i=1}^{6} (x_{2i−1} + 3·x_{2i}) + x13 = 0 ∈ Z10} ⊆ Z10^13. The bar code printed on goods is formed as follows: Each bar is either black or white, and has width 1, 2, 3 or 4. Each letter is encoded by 4 bars of alternating colors, whose widths add up to 7; see Table 2, where 0 and 1 stand for white and black, respectively. The odd and even type codes for each letter start with white and end with black, where for the even type codes the width patterns are just those for the odd type codes read backwardly. The negative type code for each letter starts with black and ends with white, using the same width pattern as for the odd type code, which hence amounts to just reading the even type code backwardly. In the odd type code for each letter the widths of the black bars add up to an odd number, while in the even type code these sums are even.

An EAN is depicted as follows: There is a prefix 101, then the letters x2, . . . , x7 are depicted by odd and even type codes, then there is an infix 01010, then the letters x8, . . . , x12 are depicted by negative type codes, and finally there is a postfix 101. The choice of the odd and even type codes for x2, . . . , x7 is determined by x1; see Table 2, where − and + stand for odd and even, respectively. Since the even type codes, that is the negative type codes read backwardly, are disjoint from the odd type codes, this allows to read bar codes in either direction and to swap data if necessary, or to read data in two halves. For example, ‘4-901780-728619’ indeed yields 4 + 3 · 9 + 0 + 3 · 1 + 7 + 3 · 8 + 0 + 3 · 7 + 2 + 3 · 8 + 6 + 3 · 1 = 121 = 1 = −9 ∈ Z10, hence this is a valid EAN. The pattern [odd, even, odd, odd, even, even] for x1 = 4 yields:

101 0001011 0100111 0011001 0111011 0001001 0100111 01010 1000100 1101100 1001000 1010000 1100110 1110100 101
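The EAN-13 check symbol of (1.2), which also covers the ISBN-13 example from (1.1), can be sketched as follows (the function name is mine):

```python
def ean13_check_digit(info):
    """x13 = -(sum_{i=1}^{6} (x_{2i-1} + 3*x_{2i})) mod 10, for the 12 information digits."""
    s = sum(x if i % 2 == 0 else 3 * x for i, x in enumerate(info))
    return -s % 10
```

For the 12 information digits of '4-901780-72861' this yields the check symbol 9, matching the example above.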

(1.3) Example: The IBAN [2007]. The German version of the International Bank Account Number (IBAN) is formed as follows: Words [x1, x2; x3, x4; x5, ..., x12; x13, ..., x22] ∈ Z26^2 × Z10^20 have length 22, where actually x1, x2 are Latin letters, and we identify the Latin alphabet with Z26 by letting A ↦ 0, B ↦ 1, ..., Z ↦ 25. Here x1, x2; x5, ..., x12; x13, ..., x22 are information symbols, where [x1, x2] is the country code, for Germany being DE ↦ [3, 4], followed by the 8-digit bank identification number [x5, ..., x12], and the 10-digit bank account number [x13, ..., x22], where the latter is possibly filled up by leading zeroes; the word [x5, ..., x22] is also called the Basic Bank Account Number (BBAN).

Finally, [x3, x4] are check symbols fulfilling the following condition: The concatenation v := x5 ··· x22 (x1+10)(x2+10) x3 x4 ∈ Z10^24 can be considered as a non-negative integer having 24 decimal digits, where Z26 + 10 = {10, ..., 35}. Then v is a valid IBAN if v ≡ 1 (mod 97).

Table 2: EAN bar code.

letter  odd widths  odd code  negative code  even widths  even code  odd/even
0       3211        0001101   1110010        1123         0100111    − − − − − −
1       2221        0011001   1100110        1222         0110011    − − + − + +
2       2122        0010011   1101100        2122         0010011    − − + + − +
3       1411        0111101   1000010        1141         0100001    − − + + + −
4       1132        0100011   1011100        2311         0011101    − + − − + +
5       1231        0110001   1001110        1321         0111001    − + + − − +
6       1114        0101111   1010000        4111         0000101    − + + + − −
7       1312        0111011   1000100        2131         0010001    − + − + − +
8       1213        0110111   1001000        3121         0001001    − + − + + −
9       3112        0001011   1110100        2113         0010111    − + + − + −

Hence allowing for the alphabet Z97, containing the digits Z10 as a subset, the check condition can be rephrased as (Σ_{i=5}^{22} xi·10^{28−i}) + (x1+10)·10^4 + (x2+10)·10^2 + x3·10 + x4 = 1 ∈ Z97. Thus check symbols x3, x4 ∈ Z10 can always be found, where [x3, x4] ∉ {[0, 0], [0, 1], [9, 9]} for uniqueness. Letting w := [10^{24−i} ∈ Z97; i ∈ {1, ..., 18}] ∈ Z97^18, that is

w = [56, 25, 51, 73, 17, 89, 38, 62, 45, 53, 15, 50, 5, 49, 34, 81, 76, 27],

and x1 := 3 and x2 := 4, entailing (x1+10)·10^4 + (x2+10)·10^2 = 62 ∈ Z97, we infer that the valid IBAN can be identified with the set {[x3, ..., x22] ∈ Z10^20; (x3·10 + x4) + Σ_{i=1}^{18} wi·x_{i+4} = 36 ∈ Z97}. For example, given the bank identification number '390 500 00' and the fictitious bank account number '0123456789', we get the BBAN '3905 0000 0123 4567 89'. For the latter we get Σ_{i=1}^{18} wi·x_{i+4} = 65 ∈ Z97, thus the check condition yields x3·10 + x4 = 68 ∈ Z97, so that we get the IBAN 'DE68 3905 0000 0123 4567 89'.
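A sketch of the IBAN check computation of (1.3), via the rearrangement into a 24-digit number and the condition v ≡ 1 (mod 97); the helper names are mine:

```python
def iban_check_digits(country, bban):
    """Check symbols x3x4 such that bban + country-as-digits + x3x4 is ≡ 1 (mod 97)."""
    letters = ''.join(str(ord(c) - ord('A') + 10) for c in country)  # Z26 + 10
    return 98 - int(bban + letters + '00') % 97

def iban_is_valid(iban):
    s = iban.replace(' ', '')
    rearranged = s[4:] + s[:4]   # move country code and check symbols to the end
    num = ''.join(str(ord(c) - ord('A') + 10) if c.isalpha() else c for c in rearranged)
    return int(num) % 97 == 1
```

For the country 'DE' and the BBAN '390500000123456789' this returns the check symbols 68, as in the example.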

(1.4) Parity check codes over Zq. Let q ≥ 2 be the modulus, let n ∈ N, and let the weights w := [w1, ..., wn] ∈ Zq^n be fixed. Then v := [x1, ..., xn] ∈ Zq^n is called valid if v·w^tr := Σ_{i=1}^{n} xi·wi = 0 ∈ Zq.

a) We consider single errors: Let v := [x1, ..., xn] ∈ Zq^n be valid and let v' := [x1, ..., x'j, ..., xn] ∈ Zq^n such that x'j ≠ xj for some j ∈ {1, ..., n}. Then we have v'·w^tr = (v' − v)·w^tr = (x'j − xj)·wj ∈ Zq, hence x'j ≠ xj is detected if and only if x'j·wj ≠ xj·wj ∈ Zq. Thus all single errors are detected if and only if all weights wj ∈ Zq are chosen such that the map µ_{wj}: Zq → Zq: x ↦ x·wj is injective, or equivalently bijective.

For y ∈ Zq the map µy: Zq → Zq: x ↦ x·y is injective if and only if y ∈ Zq^* := {z ∈ Zq; gcd(z, q) = 1}, the group of units of Zq: Let d := gcd(y, q) ∈ N. If d > 1 then we have q/d ≠ 0 ∈ Zq and (q/d)·y = 0 = 0·y ∈ Zq, hence µy is not injective. Since by the Euclidean Algorithm there are Bézout coefficients s, t ∈ Z such that d = ys + qt ∈ Z, if d = 1 then ys = d = 1 ∈ Zq, thus from x·y = x'·y ∈ Zq we get x = xys = x'ys = x' ∈ Zq, implying that µy is injective. For example, for the non-prime modulus q = 10 used in the EAN we get

µ1 = id_{Z10} and µ3 = (0)(1, 3, 9, 7)(2, 6, 8, 4)(5) ∈ S_{Z10}, hence the weight tuple w = [1, 3, ..., 1, 3, 1] ∈ (Z10^*)^13 allows to detect all single errors. For the prime modulus q = 11 used in the ISBN-10 we have Z11^* = Z11 \ {0}, hence again the weight tuple w = [1, ..., 10] ∈ (Z11^*)^10 allows to detect all single errors. A similar consideration for the IBAN, using the prime modulus q = 97, shows that the weight tuple for the BBAN allows to detect all single errors.

b) We consider adjacent transposition errors for n ≥ 2: Let v := [x1, ..., xn] ∈ Zq^n be valid and let v' := [x1, ..., x_{j+1}, x_j, ..., xn] ∈ Zq^n such that x_{j+1} ≠ x_j for some j ∈ {1, ..., n−1}. Then we have v'·w^tr = (v' − v)·w^tr = (x_j − x_{j+1})·(w_{j+1} − w_j) ∈ Zq. Thus all adjacent transposition errors are detected if and only if the weights fulfill w_{j+1} − w_j ∈ Zq^* for all j ∈ {1, ..., n−1}.

Since for the EAN we have w_{j+1} − w_j ∈ {2, 8} ⊆ Z10 \ Z10^*, for j ∈ {1, ..., 12}, adjacent transposition errors are not necessarily detected. Since for the ISBN-10 we have w_{j+1} − w_j = 1 ∈ Z11^*, for j ∈ {1, ..., 9}, all adjacent transposition errors are detected; thus in this respect the transition from ISBN-10 to ISBN-13 is not an improvement. Similarly, since for the BBAN the adjacent weights in Z97 are pairwise distinct, all adjacent transposition errors are detected.
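The two detection criteria of (1.4) are easy to test mechanically; a minimal sketch (the function names are mine):

```python
from math import gcd

def detects_all_single_errors(weights, q):
    """All single errors are detected iff every weight is a unit mod q, cf. (1.4)(a)."""
    return all(gcd(w % q, q) == 1 for w in weights)

def detects_all_adjacent_transpositions(weights, q):
    """All adjacent transpositions are detected iff w_{j+1} - w_j is a unit mod q, cf. (1.4)(b)."""
    return all(gcd((weights[j + 1] - weights[j]) % q, q) == 1
               for j in range(len(weights) - 1))
```

Applied to the EAN weights [1, 3, 1, 3, ..., 1] over Z10 the second test fails, while the ISBN-10 weights [1, ..., 10] over Z11 pass both.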

(1.5) Parity check codes over arbitrary groups. a) Let G be a finite group, let n ∈ N, and let maps πi: G → G for i ∈ {1, ..., n} be fixed. Then [x1, ..., xn] ∈ G^n is called valid if x1^{π1} ··· xn^{πn} = 1. For example, letting G := Zq and πi := µ_{wi}, where wi ∈ Zq for i ∈ {1, ..., n}, we recover the parity check codes in (1.4); here additionally the πi are group homomorphisms.

We consider single errors: Let v := [x1, ..., xn] ∈ G^n be valid and let v' := [x1, ..., x'j, ..., xn] ∈ G^n such that x'j ≠ xj for some j ∈ {1, ..., n}. Let yi := xi^{πi} ∈ G for i ∈ {1, ..., n}, and y'j := (x'j)^{πj}. Then v' is valid if and only if y1 ··· y_{j−1}·y'j·y_{j+1} ··· yn = 1 = y1 ··· yn, which by multiplying from the left by y1^{−1}, ..., y_{j−1}^{−1}, and from the right by yn^{−1}, ..., y_{j+1}^{−1}, is equivalent to (x'j)^{πj} = y'j = yj = xj^{πj}. Hence we conclude that all single errors are detected if and only if πj is injective, or equivalently bijective, for all j ∈ {1, ..., n}.

b) Let πi be injective for all i ∈ {1, ..., n}. We consider adjacent transposition errors for n ≥ 2: Let v := [x1, ..., xn] ∈ G^n be valid and let v' := [x1, ..., x_{j+1}, x_j, ..., xn] ∈ G^n such that x_{j+1} ≠ x_j for some j ∈ {1, ..., n−1}. Let yi := xi^{πi} ∈ G for i ∈ {1, ..., n} and y'j := x_{j+1}^{πj} ∈ G and y'_{j+1} := x_j^{π_{j+1}} ∈ G. Then v' is valid if and only if y1 ··· y_{j−1}·y'j·y'_{j+1}·y_{j+2} ··· yn = 1 = y1 ··· yn, which by multiplying from the left by y1^{−1}, ..., y_{j−1}^{−1}, and from the right by yn^{−1}, ..., y_{j+2}^{−1}, is equivalent to x_{j+1}^{πj}·x_j^{π_{j+1}} = y'j·y'_{j+1} = yj·y_{j+1} = xj^{πj}·x_{j+1}^{π_{j+1}}. Writing g := xj^{πj} ∈ G and h := x_{j+1}^{πj} ∈ G and letting τj := πj^{−1}·π_{j+1}, we conclude that all adjacent transposition errors are detected if and only if g·h^{τj} ≠ h·g^{τj} for

(1.6) Complete maps. We briefly digress into group theory, inasmuch as the above leads to the following definition: Given a finite group G, a bijective map σ: G → G is called complete if the map τ: G → G: g ↦ g^σ·g is bijective again.

a) This is related to the above as follows: Let G be abelian. Then, given a bijective map τ: G → G, the condition g·h^τ ≠ h·g^τ, for all g ≠ h ∈ G, is equivalent to g^{τ−id} ≠ h^{τ−id}, for all g ≠ h ∈ G, that is σ := τ − id_G: G → G: g ↦ g^τ·g^{−1} is bijective as well. Hence, for a parity check code over G with respect to bijections πj, for j ∈ {1, ..., n}, the associated maps σj := πj^{−1}·π_{j+1} − id_G, for j ∈ {1, ..., n−1}, are complete. Conversely, given a complete map σ: G → G, we may let πj := (σ + id_G)^j for j ∈ {1, ..., n}. This shows that there is a parity check code over G, for n ≥ 2, which detects all single errors and all adjacent transposition errors if and only if G has a complete map.

Hence the question arises when G has a complete map. Here, we restrict ourselves to the case of cyclic groups, that is let G := Zq. We show that a complete map on Zq exists if and only if q is odd: If q is odd, then both σ = id_{Zq} and τ = σ + id_{Zq} = µ2 are bijective; recall that indeed 2 ∈ Zq^* in this case. If q is even, then assume that σ and τ := σ + id_{Zq} are bijective; then on the one hand Σ_{x∈Zq} x = 0 + q/2 + Σ_{i=1}^{q/2−1} (i + (q−i)) = q/2 ≠ 0 ∈ Zq, while on the other hand Σ_{x∈Zq} x = Σ_{x∈Zq} x^τ = Σ_{x∈Zq} x^{σ+1} = (Σ_{x∈Zq} x^σ) + (Σ_{x∈Zq} x) = (Σ_{x∈Zq} x) + (Σ_{x∈Zq} x) = q/2 + q/2 = 0 ∈ Zq, a contradiction. In particular, there is no parity check code over Z10 which detects all single errors and all adjacent transposition errors; thus it is not surprising that the EAN does not detect all adjacent transposition errors.

Moreover, if πi = µ_{wi} where wi ∈ Zq^* for i ∈ {1, ..., n}, then we get τj = πj^{−1}·π_{j+1} = µ_{wj}^{−1}·µ_{w_{j+1}}, for j ∈ {1, ..., n−1}, and τj − id_{Zq} = µ_{wj}^{−1}·(µ_{w_{j+1}} − µ_{wj}) = µ_{wj}^{−1}·µ_{w_{j+1}−wj} is bijective if and only if µ_{w_{j+1}−wj} is bijective, or equivalently w_{j+1} − wj ∈ Zq^*, as already seen in (1.4). Note that for the ISBN-10 we have πi = µi: Z11 → Z11, for i ∈ {1, ..., 10}, thus τj = µj^{−1}·µ_{j+1} and τj − id_{Z11} = µj^{−1}·µ_{(j+1)−j} = µj^{−1}·µ1 = µj^{−1}, for j ∈ {1, ..., 9}.

b) The general group theoretic question concerning the existence of complete maps is answered as follows: The Hall–Paige Conjecture [1955] says that a finite group G has a complete map if and only if G has a trivial or a non-cyclic Sylow 2-subgroup. This has recently been proved in a series of papers: Firstly, [Hall, Paige, 1955] already showed the necessity of the condition given, and its sufficiency for the alternating groups. Next, [Dalla-Volta, Gavioli, 2001] showed that a minimal counterexample is almost simple or has a center of even order. Then [Wilcox, 2009] showed that a minimal counterexample is actually simple, and that simple groups of Lie type (excluding the Tits group) possess complete maps. This reduced the problem, by the classification of finite simple groups, to the sporadic simple groups. Now [Evans, 2009] showed that the Tits group and the sporadic simple groups (excluding the Janko group J4) possess

Table 3: Elements of D10.

x       no.   element   permutation
0  A     1    id        ()
1  D     2    α         (1, 2, 3, 4, 5)
2  G     3    α²        (1, 3, 5, 2, 4)
3  K     4    α³        (1, 4, 2, 5, 3)
4  L     5    α⁴        (1, 5, 4, 3, 2)
5  N     6    β         (2, 5)(3, 4)
6  S     7    αβ        (1, 5)(2, 4)
7  U     8    α²β       (1, 4)(2, 3)
8  Y     9    α³β       (1, 3)(4, 5)
9  Z    10    α⁴β       (1, 2)(3, 5)

complete maps. Finally, [Bray, 2018] showed that J4 possesses complete maps. Recall that any non-abelian simple group has a non-cyclic Sylow 2-subgroup.
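The criterion for cyclic groups can be checked by brute force for small moduli; a sketch (feasible only for tiny q, since it runs through all q! permutations):

```python
from itertools import permutations

def has_complete_map(q):
    """Search for a bijective sigma on Z_q such that g -> sigma(g) + g is bijective too."""
    for sigma in permutations(range(q)):
        tau = [(g + sigma[g]) % q for g in range(q)]
        if len(set(tau)) == q:      # tau is bijective, so sigma is complete
            return True
    return False
```

In accordance with the result above, this returns True precisely for the odd moduli among q = 2, ..., 7.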

(1.7) Example: Serial numbers. Let D10 be the dihedral group of order 10, that is the symmetry group of the plane equilateral pentagon; up to isomorphism there are precisely two groups of order 10, the cyclic group Z10 and the non-abelian group D10. Numbering the vertices of the pentagon counterclockwise, the elements of D10 := ⟨α, β⟩ ≤ S5 are as given in Table 3. Using the numbering of the elements given there, let τ := (1, 2, 6, 9, 10, 5, 3, 8)(4, 7) ∈ S_{D10}. Then it can be checked that g·h^τ ≠ h·g^τ for all g ≠ h ∈ D10.

The serial numbers on the former German currency Deutsche Mark (DM) are formed as follows: The alphabet is X := {0, ..., 9, A, D, G, K, L, N, S, U, Y, Z}, and words [x1, ..., x10; x11] ∈ X^11 have length 11, where x1, ..., x10 are information symbols and x11 is a check symbol. Replacing xi ∈ X by x̄i ∈ D10 as indicated in Table 3, a word is valid if x̄1^τ · x̄2^{τ²} ··· x̄10^{τ^{10}} · x̄11 = id ∈ D10. For example, for GG0184220N0 we get elements [3, 3, 1, 2, 9, 5, 3, 3, 1, 6; 1], hence [3^τ, 3^{τ²}, 1^{τ³}, 2^{τ⁴}, 9^{τ⁵}, 5^{τ⁶}, 3^{τ⁷}, 3^{τ⁸}, 1^{τ⁹}, 6^{τ^{10}}; 1] = [8, 1, 9, 5, 1, 9, 5, 3, 2, 10; 1], and it can be checked that the product of the associated elements equals id ∈ D10.
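The serial-number check of (1.7) can be sketched as follows, with the tables transcribed from Table 3; that permutations act on the right and products are read left to right is my assumption, chosen to match the worked example:

```python
ALPHABET = "0123456789ADGKLNSUYZ"
NUM = {c: i % 10 + 1 for i, c in enumerate(ALPHABET)}   # symbol -> element number 1..10

# elements of D10 as permutations of {1,...,5}: ELT[k][i-1] is the image of i
ELT = {1: (1, 2, 3, 4, 5), 2: (2, 3, 4, 5, 1), 3: (3, 4, 5, 1, 2),
       4: (4, 5, 1, 2, 3), 5: (5, 1, 2, 3, 4), 6: (1, 5, 4, 3, 2),
       7: (5, 4, 3, 2, 1), 8: (4, 3, 2, 1, 5), 9: (3, 2, 1, 5, 4),
       10: (2, 1, 5, 4, 3)}

# tau = (1,2,6,9,10,5,3,8)(4,7) on the element numbers
TAU = {1: 2, 2: 6, 6: 9, 9: 10, 10: 5, 5: 3, 3: 8, 8: 1, 4: 7, 7: 4}

def mul(p, q):
    """Compose left to right: (p*q)(i) = q(p(i))."""
    return tuple(q[p[i] - 1] for i in range(5))

def serial_is_valid(word):
    """Check x1^tau x2^tau^2 ... x10^tau^10 * x11 = id in D10."""
    prod = ELT[1]
    for pos, c in enumerate(word, start=1):
        k = NUM[c]
        if pos <= 10:
            for _ in range(pos):    # apply tau^pos to the element number
                k = TAU[k]
        prod = mul(prod, ELT[k])
    return prod == ELT[1]
```

With these conventions, serial_is_valid('GG0184220N0') indeed holds.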

2 Information theory

(2.1) Information. Let X be an alphabet, and let µ: P(X) → R≥0 be a probability distribution, that is i) µ(X) = 1, and ii) µ(A ∪ B) = µ(A) + µ(B) for all A, B ⊆ X such that A ∩ B = ∅.

a) To model the information content of a symbol x ∈ X, we use the frequency of its occurrence, which is given by µ. We restrict ourselves to possible elementary events x ∈ X, that is µ(x) > 0. Then the information content of x should be the smaller the more often x occurs. Moreover, for independent events their information contents should add up, while the associated probabilities multiply. Hence letting I := {a ∈ R; 0 < a ≤ 1}, this motivates to let an information measure be a strictly decreasing continuous map ι: I → R such that ι(ab) = ι(a) + ι(b) for all a, b ∈ I. Then the information content of a possible elementary event x ∈ X, by abusing notation, is given as ι(x) := ι(µ(x)).

We show that information measures are unique up to normalization: Given an information measure ι, we consider the continuous map η: R≤0 → R: a ↦ ι(exp(a)), which hence fulfills η(a + b) = ι(exp(a + b)) = ι(exp(a)·exp(b)) = ι(exp(a)) + ι(exp(b)) = η(a) + η(b) for all a, b ∈ R≤0. Letting α := −η(−1) ∈ R, we get η(−n) = −αn for all n ∈ N0, and from that η(−n/m) = −α·n/m for all n ∈ N0 and m ∈ N; hence, η being continuous, we infer η(a) = αa for all a ∈ R≤0. Thus from ι(exp(a)) = η(a) = αa = α·ln(exp(a)), for all a ∈ R≤0, we get ι(a) = α·ln(a), for all a ∈ I. Since ι is strictly decreasing we have α < 0. Conversely, for any α < 0 the map I → R≥0: a ↦ α·ln(a) indeed is an information measure. Hence it remains to normalize:

The information content of a binary digit from the alphabet Z2, carrying the uniform distribution, is set to be 1, hence 1 = ι(1/2) = α·ln(1/2), that is α = −1/ln(2). Thus henceforth we let ι(a) = −ln(a)/ln(2) = −log2(a) = log2(1/a), for all a ∈ I.

b) The average information content or entropy of X = {x1, ..., xq} is the expected value H(X) = H(µ) := −Σ_{i ∈ {1,...,q}, µX(i)>0} µX(i)·log2(µX(i)) ∈ R≥0 of the information content. Since lim_{a→0+} (a·log2(a)) = lim_{a→∞} (−log2(a)/a) = 0, the function I → R≥0: a ↦ −a·log2(a) can be continuously extended to I ∪ {0}. Thus we may let H(X) = −Σ_{i=1}^{q} µX(i)·log2(µX(i)), saying that impossible elementary events do not contribute to the average information content.

If X carries the uniform distribution, we get H(X) = −(1/q)·Σ_{i=1}^{q} log2(1/q) = log2(q) = log2(|X|); actually we always have H(X) ≤ log2(|X|), with equality if and only if X carries the uniform distribution, see Exercise (12.10). Moreover, we have H(X) = 0 if and only if all summands in the defining sum are zero, or equivalently we have either µX(i) = 0 or log2(µX(i)) = 0, for all i ∈ {1, ..., q}, the latter case being equivalent to µX(i) = 1; since Σ_{i=1}^{q} µX(i) = 1 this in turn is equivalent to µX(i) = 1 for a unique i ∈ {1, ..., q}, and µX(j) = 0 for j ≠ i, that is µ is concentrated in xi for some i ∈ {1, ..., q}.
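The entropy just defined can be evaluated directly; a minimal sketch (the function name is mine):

```python
from math import log2

def entropy(dist):
    """H(mu) = -sum_i mu(i) * log2(mu(i)); impossible elementary events contribute 0."""
    return -sum(p * log2(p) for p in dist if p > 0)
```

For instance, entropy on the uniform distribution over q symbols gives log2(q), while a distribution concentrated in one symbol gives 0.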

For example, we consider X := Z2 = {0, 1} with elementary probabilities µX(0) = α and µX(1) = 1 − α for some 0 ≤ α ≤ 1. Then we get average information content H(µ) = H(α) := −α·log2(α) − (1−α)·log2(1−α). Differentiating yields ∂/∂α H(α) = −log2(α) + log2(1−α) = log2((1−α)/α), for all 0 < α < 1. Since H(0) = H(1) = 0 and H(α) > 0, for all 0 < α < 1, we infer that H(α) has a unique maximum for α = 1 − α = 1/2, where H(1/2) = 1. Thus the average information content of the symbols in Z2 is maximized if and only if Z2 carries the uniform distribution.

The relevance of these notions is elucidated by the First Main Theorem of information theory, Shannon's Theorem on source coding [1948]:

(2.2) Theorem: Shannon [1948]. Let h: X → (Z2)^* \ {ε} be an injective and prefix-free encoding, that is for all v ∈ im(h) and w ∈ (Z2)^* \ {ε} we have vw ∉ im(h). Then for the average length of the code words in h(X) we have

Σ_{i=1}^{q} µX(i)·l(h(xi)) ≥ H(X).

Replacing X by X^n, where the symbols in words are chosen independently, the lower bound can be approximated with arbitrary precision, for n → ∞. ]

The inequality in Shannon's Theorem can be interpreted as follows: We consider the set (Z2)^* \ {ε} of possible code words. For the set Z2^n of words of length n ∈ N, carrying the uniform distribution, which is obtained from the uniform distribution on Z2 by choosing the symbols in the words independently, we get µ(w) = 1/2^n, for all w ∈ Z2^n, thus ι(w) = −log2(1/2^n) = log2(2^n) = n = l(w); thus summing over Z2^n also yields H(Z2^n) = −2^n · (1/2^n) · log2(1/2^n) = log2(2^n) = n. Hence the left hand side of the inequality is the average information content of the genuine code words h(X), with respect to the uniform distribution on Z2^n for all n ∈ N, and Shannon's Theorem says that this cannot possibly be strictly smaller than the average information content of the original alphabet X. In order to approximate the lower bound with arbitrary precision, for n → ∞, for example Huffman encoding can be used, see Exercises (12.8) and (12.9).
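Huffman encoding, as just mentioned, builds such a prefix-free code greedily by repeatedly merging the two least frequent subtrees; the following is a standard construction and not the notes' Exercise (12.8) verbatim:

```python
import heapq

def huffman_code(freqs):
    """Return a prefix-free binary code {symbol: word} for the given frequencies."""
    heap = [(f, i, {s: ''}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f0, _, c0 = heapq.heappop(heap)      # the two least frequent subtrees
        f1, _, c1 = heapq.heappop(heap)
        merged = {s: '0' + w for s, w in c0.items()}
        merged.update({s: '1' + w for s, w in c1.items()})
        count += 1
        heapq.heappush(heap, (f0 + f1, count, merged))
    return heap[0][2]
```

For the dyadic distribution (1/2, 1/4, 1/8, 1/8) the resulting average code length 1.75 meets the entropy lower bound of (2.2) exactly.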

(2.3) Noisy channels. a) The standard model for this is the binary channel: The data consists of elements of X := Z2, sent with probability distribution µX, and being distorted by the channel, so that the received elements of Y := Z2 carry the probability distribution µY, where the noise is described by the conditional distribution µY|X, thus µY(j) = Σ_{i∈X} µX(i)·µY|i(j), for j ∈ Y.

The symmetric binary channel with error probability 0 ≤ p ≤ 1/2 is given by µY|i(i+1) = p and µY|i(i) = 1 − p, for i ∈ X. In other words, µY|X is given by the transition matrix

M(p) := [ 1−p    p  ]
        [  p    1−p ],

entailing µY = [µY(0), µY(1)] = [µX(0), µX(1)]·M(p) = [µX(0)(1−p) + µX(1)p, µX(0)p + µX(1)(1−p)].

In particular, the quiet binary channel is given by p = 0, that is µY|i(j) = δij, hence we have µY = µX; and the completely noisy binary channel is given by p = 1/2, that is µY|i(j) = 1/2 for i ∈ X and j ∈ Y, hence µY is the uniform distribution, independently of µX.

If X carries the uniform distribution, then we have µY = [1/2, 1/2]·M(p) = [1/2, 1/2], saying that Y carries the uniform distribution as well, independently of p. Conversely, if 0 ≤ p < 1/2, then we have

M(p)^{−1} = 1/(1−2p) · [ 1−p   −p  ]
                       [ −p    1−p ]  ∈ GL2(R),

hence if Y carries the uniform distribution, then µX = [1/2, 1/2]·M(p)^{−1} = [1/2, 1/2], that is X necessarily carries the uniform distribution as well.

b) For decoding purposes we hasten to describe the transition matrix M̃(p) describing the conditional probability µX|Y as well: Bayes's Theorem says that µX|j(i)·µY(j) = µX×Y(i, j) = µX(i)·µY|i(j), for i ∈ X and j ∈ Y. Hence we have M̃(p)^tr · diag[µY(j); j ∈ Y] = diag[µX(i); i ∈ X] · M(p). Assuming that µY(j) ≠ 0 for all j ∈ Y, this yields

M̃(p) = [ µX(0)(1−p)/µY(0)    µX(1)p/µY(0)     ]
        [ µX(0)p/µY(1)        µX(1)(1−p)/µY(1) ],

where µY(0) = µX(0)(1−p) + µX(1)p and µY(1) = µX(0)p + µX(1)(1−p). In particular, if X carries the uniform distribution, thus Y carrying the uniform distribution as well, then we have M̃(p) = M(p).

If µY(j) = 0 for some j ∈ Y, then we necessarily have p = 0, thus µX(j) = 0, entailing that µX|k(j) = 0 for all k ∈ X. Hence, if µY(0) = 0 or µY(1) = 0, then

M̃(0) = [ 0  1 ]         M̃(0) = [ 1  0 ]
        [ 0  1 ]   and           [ 1  0 ],

respectively, thus M̃(0) = lim_{p→0+} M̃(p).
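The forward and reverse transition data of (2.3) can be sketched in a few lines; the function names are mine:

```python
def bsc_output_dist(mu_x, p):
    """mu_Y = mu_X · M(p) for the symmetric binary channel."""
    return (mu_x[0] * (1 - p) + mu_x[1] * p,
            mu_x[0] * p + mu_x[1] * (1 - p))

def bsc_reverse_matrix(mu_x, p):
    """M~(p) describing mu_{X|Y}, via Bayes's Theorem; assumes mu_Y(j) != 0."""
    mu_y = bsc_output_dist(mu_x, p)
    return [[mu_x[0] * (1 - p) / mu_y[0], mu_x[1] * p / mu_y[0]],
            [mu_x[0] * p / mu_y[1], mu_x[1] * (1 - p) / mu_y[1]]]
```

For the uniform input distribution one checks numerically that M̃(p) = M(p), as stated above.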

(2.4) Capacity. a) We still consider a noisy channel working over an alphabet X = Y, with associated probability distributions µX and µY , respectively.

Given an elementary event j ∈ Y, the conditional distribution µX|j describes the probability distribution on the sent symbols in X provided j is received. Then for µX|j we get H(X|j) = −Σ_{i∈X} µX|j(i)·log2(µX|j(i)), describing the average information content of X which is not revealed by seeing j. Hence the average conditional information content or conditional entropy H(X|Y) = Σ_{j∈Y} µY(j)·H(X|j) = −Σ_{j∈Y} Σ_{i∈X} µY(j)·µX|j(i)·log2(µX|j(i)) is the average information content of X which is not revealed by Y, that is which is lost by transport through the channel. Thus the capacity C(X|Y) := H(X) − H(X|Y) is the average information content being transported through the channel; it can be shown that we always have H(X|Y) ≤ H(X), with equality if and only if X and Y are independent, see Exercise (12.11).

Using Bayes's Theorem, saying that µX|j(i)·µY(j) = µX×Y(i, j), we get

H(X×Y) = −Σ_{i∈X} Σ_{j∈Y} µX×Y(i, j)·log2(µX×Y(i, j))
       = −Σ_{i∈X} Σ_{j∈Y} µY(j)·µX|j(i)·(log2(µY(j)) + log2(µX|j(i)))
       = −Σ_{j∈Y} (µY(j) · Σ_{i∈X} µX|j(i)·log2(µX|j(i)))
         −Σ_{j∈Y} (µY(j)·log2(µY(j)) · Σ_{i∈X} µX|j(i))
       = Σ_{j∈Y} µY(j)·H(X|j) − Σ_{j∈Y} µY(j)·log2(µY(j))
       = H(X|Y) + H(Y).

This implies C(X|Y) = H(X) − H(X|Y) = H(X) + H(Y) − H(X×Y) = H(Y) − H(Y|X) = C(Y|X), saying that the capacity of the channel is independent of the direction of information transport.

b) For the symmetric binary channel with error probability 0 ≤ p ≤ 1/2, working over the alphabet X = Z2 = Y, using the transition matrix M(p) describing µY|X, we obtain H(Y|X) = −p·log2(p) − (1−p)·log2(1−p), thus the capacity is C(p) = H(Y) − H(Y|X) = H(Y) + p·log2(p) + (1−p)·log2(1−p). In particular, for the quiet channel we have µY = µX and C(0) = H(Y) = H(X), saying that the average information content of X is completely transported through the channel; and for the completely noisy channel the alphabet Y carries the uniform distribution in any case, hence we have H(Y) = 1 and C(1/2) = 1 + log2(1/2) = 0, saying that no information is transported through the channel.

In general, we have 0 ≤ H(Y) ≤ 1, where the maximum H(Y) = 1 is attained precisely for the uniform distribution on Y. Hence the maximum capacity is given as Cmax(p) = 1 + p·log2(p) + (1−p)·log2(1−p). Moreover, if 0 ≤ p < 1/2 this is equivalent to X carrying the uniform distribution, in other words if and only if H(X) = 1, saying that the maximum capacity of the channel is attained if and only if X has maximum average information content.
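The capacity formula can be evaluated numerically; a minimal sketch:

```python
from math import log2

def binary_entropy(p):
    """H(p) = -p*log2(p) - (1-p)*log2(1-p), continuously extended to p in {0, 1}."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def c_max(p):
    """Maximum capacity Cmax(p) = 1 + p*log2(p) + (1-p)*log2(1-p) of the channel."""
    return 1.0 - binary_entropy(p)
```

This gives Cmax(0) = 1 and Cmax(1/2) = 0, with the capacity decreasing strictly in between.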

(2.5) Redundancy. The idea of error correcting channel coding, to be used for noisy channels, is to add redundancy. For example, letting X be an alphabet, words over X consisting of k ∈ N0 information symbols are encoded into words of length k ≤ n ∈ N by adding n − k check symbols. More generally, any subset ∅ ≠ C ⊆ X^n is called a block code of length n ∈ N and order m := |C| ∈ N, where we no longer distinguish between information and check symbols. In particular, if X = Fq is the field with q elements, then any Fq-subspace C ≤ Fq^n is called a linear code of dimension k := dim_Fq(C) ∈ N0; specifically, C is called binary and ternary, if q = 2 and q = 3, respectively.

Letting q := |X|, and assuming the uniform distribution on X^n, the relative information content of a word in C, as compared to viewing it as an element of X^n, is given as the information rate

ρ(C) = ρ_{X^n}(C) := H(C)/H(X^n) = (−log2(1/|C|)) / (−log2(1/|X^n|)) = log2(|C|)/log2(|X^n|) = log_q(|C|)/n = log_q(m)/n.

We have 0 ≤ ρ(C) ≤ 1, where ρ(C) = 1 if and only if C = X^n, that is no redundancy is added; thus the larger ρ(C) the better C is, as far as information transport is concerned. For example, allowing for arbitrary words in X^k to start with as information symbols, and adding n − k check symbols, we obtain ρ(X^k) = k·log_q(|X|)/n = k/n; similarly, if X = Fq and C ≤ Fq^n, then we have ρ(C) = log_q(|C|)/n = dim_Fq(C)/n.

(2.6) Maximum likelihood decoding. a) If words are sent through a noisy channel, they are susceptible to random errors, where we assume that errors occurring in distinct positions in a word are independent of each other, that is we have a channel without memory. Hence if a word is received, the question arises how to decode it again: We again consider the binary symmetric channel with error probability 0 ≤ p < 1/2, working over the alphabet X = Z2 = Y. We assume that X carries the uniform distribution, or equivalently that Y carries the uniform distribution, so that the transition matrix describing µX|Y is M(p).

Let ∅ ≠ C ⊆ X^n be a code, for some n ∈ N. If the word v ∈ Y^n is received, then it is decoded to some c ∈ C which has maximum probability to be the word sent, that is µ_{X^n|v}(c) = max{µ_{X^n|v}(w) ∈ R; w ∈ C}; hence this is called maximum likelihood (ML) decoding. This is turned into combinatorics as follows:

For x = [x1, ..., xn] ∈ X^n and y = [y1, ..., yn] ∈ X^n we let d(x, y) := |{i ∈ {1, ..., n}; xi ≠ yi}| ∈ {0, ..., n} be their Hamming distance. Since distinct positions are considered to be independent, for all w ∈ X^n we have µ_{X^n|v}(w) = p^{d(v,w)}·(1−p)^{n−d(v,w)} = (1−p)^n · (p/(1−p))^{d(v,w)}. If 0 < p < 1/2, then from 0 < p/(1−p) < 1 we infer that the function R≥0 → R>0: a ↦ (p/(1−p))^a is strictly decreasing; if p = 0, then we have µ_{X^n|v}(w) = δ_{v,w}. Thus we choose c ∈ C having minimum distance to v ∈ Y^n, that is d(v, c) = min{d(v, w) ∈ N0; w ∈ C}, being called nearest neighbor decoding. In practice, although complete decoding is desired, if this does not determine c uniquely, we revert to partial decoding by unique nearest neighbor decoding, and mark v as an erasure.

b) If c ∈ C is sent, let 0 ≤ γc ≤ 1 be the probability that c is not recovered, and the expected value γ(C) := (1/|C|) · Σ_{c∈C} γc is called the average error probability of C; thus the smaller γ(C) the better C is, as far as erroneous decoding is concerned. The question whether there are good codes, in the sense of having a large information rate and a small average error probability at the same time, is generally answered by the Second Main Theorem of information theory, Shannon's Theorem on channel coding [1948], which we proceed to prove. But note that the proof is a pure existence proof giving no clue at all how to actually find good codes. We need a lemma first:

(2.7) Lemma: Chernoff inequality. Let X be an alphabet with probability distribution µ, and let X_1,...,X_n, for n ∈ N, be independent random variables with values in {0,1}, such that µ(X_i = 1) = p, for 0 ≤ p ≤ 1. Then X := Σ_{i=1}^{n} X_i is binomially distributed, that is we have µ(X = d) = \binom{n}{d}·p^d·(1−p)^{n−d}, for d ∈ {0,...,n}, and for 0 ≤ ε ≤ 1 we have µ(X ≥ (1+ε)pn) ≤ e^{−ε²pn/3}.

Proof. The first assertion is a matter of counting. Next, for t > 0 we have the special case t·µ(X ≥ t) = Σ_{x∈X} µ(x)·t·δ_{X(x)≥t} ≤ Σ_{x∈X} µ(x)·X(x)·δ_{X(x)≥t} ≤ Σ_{x∈X} µ(x)·X(x) = E(X) of the Markov inequality; note that here we only use that X has non-negative values. Now, for t ∈ R and i ∈ {1,...,n} we have

E(exp(tX_i)) = µ(X_i = 0)·exp(0) + µ(X_i = 1)·exp(t) = 1 + p(exp(t) − 1), hence we obtain E(exp(tX)) = E(exp(t·Σ_{i=1}^{n} X_i)) = E(Π_{i=1}^{n} exp(tX_i)) = Π_{i=1}^{n} E(exp(tX_i)) = (1 + p(exp(t) − 1))^n, which using the convexity of the exponential function yields E(exp(tX)) ≤ exp((exp(t) − 1)pn). Thus we get

µ(X ≥ (1+ε)pn) = µ(exp(tX) ≥ exp(t(1+ε)pn)) ≤ exp(−t(1+ε)pn)·E(exp(tX)) ≤ exp(−t(1+ε)pn + (exp(t) − 1)pn).

Letting t := ln(1+ε) this yields µ(X ≥ (1+ε)pn) ≤ exp(pn(ε − (1+ε)ln(1+ε))). Finally, the Taylor expansion (1+ε)ln(1+ε) = ε + ε²/2 − ε³/6 + ε⁴/12 − ···, being an alternating series with terms of decreasing absolute value for 0 ≤ ε ≤ 1, entails (1+ε)ln(1+ε) ≥ ε + ε²/2 − ε³/6 ≥ ε + ε²/3, and hence µ(X ≥ (1+ε)pn) ≤ exp(−ε²pn/3) as asserted. ]

(2.8) Theorem: Shannon [1948]. Keeping the setting of (2.6), let 0 ≤ p < 1/2 and 0 < ρ < 1 + p·log₂(p) + (1−p)·log₂(1−p) = C_max(p). Then for any ε > 0 there is a code ∅ ≠ C ⊆ X^n, for some n ∈ N, such that ρ(C) ≥ ρ and γ(C) < ε.

Proof. If p = 0, then we may take C = X, fulfilling ρ(X) = 1 and γ(X) = 0; hence we may assume that p > 0. For n ∈ N let m := 2^{⌈ρn⌉} ∈ N and Γ_n := {C ⊆ X^n; |C| = m}; note that |Γ_n| = \binom{2^n}{m} > 0 for n ≫ 0. Hence for C ∈ Γ_n we have ρ(C) = log₂(m)/n = ⌈ρn⌉/n ≥ ρ. Now let C_0 ∈ Γ_n such that γ(C_0) is minimal.

Hence γ(C_0) is bounded above by the expected value E_{Γ_n}(γ(C)), subject to codes C ∈ Γ_n being chosen according to the uniform distribution on Γ_n. If some word in X^n is sent, then the probability that the received word contains precisely d ∈ {0,...,n} errors is given by the binomial distribution β(d) = \binom{n}{d}·p^d·(1−p)^{n−d}. Thus Chernoff's inequality yields µ(β ≥ (1+a)np) ≤ exp(−a²np/3), for any 0 ≤ a ≤ 1. For n ∈ N let 0 < a_n < 1 such that a_n → 0 and n·a_n² → ∞, for n → ∞; for example we may let a_n ∼ 1/ln(n). Letting δ = δ_n := ⌊(1+a_n)np⌋ yields µ(β > δ) ≤ exp(−a_n²np/3) < ε, for n ≫ 0. Moreover, we have δ/n → p < 1/2, for n → ∞; hence we have δ < n/2 − 1, for n ≫ 0.
The equation n^n = (d + (n−d))^n ≥ \binom{n}{d}·d^d·(n−d)^{n−d} yields \binom{n}{d} ≤ n^n/(d^d·(n−d)^{n−d}), for d ∈ {0,...,n}. Letting B_δ(v) := {w ∈ X^n; d(v,w) ≤ δ} be the sphere or ball with radius δ around v ∈ X^n, the unimodality of binomial coefficients yields, for n ≫ 0,

b = b_δ := |B_δ(v)| = Σ_{d=0}^{δ} \binom{n}{d} < (n/2)·\binom{n}{δ} ≤ n^{n+1}/(2·δ^δ·(n−δ)^{n−δ}) = n/(2·(δ/n)^δ·(1−δ/n)^{n−δ}).

If C ∈ Γ_n, then we decode by unique nearest neighbor decoding with respect to δ, that is, given v ∈ Y^n, if C ∩ B_δ(v) = {c} then we decode v to c, otherwise we mark v as an erasure. For c ∈ C let χ_c: Y^n → {0,1} be defined by χ_c(v) := 1 if d(v,c) ≤ δ, and χ_c(v) := 0 if d(v,c) > δ. Let φ_c: Y^n → N_0 be defined by
φ_c(v) := 1 − χ_c(v) + Σ_{c'∈C\{c}} χ_{c'}(v) = |C ∩ B_δ(v)| + 1, if d(v,c) > δ, and = |(C\{c}) ∩ B_δ(v)|, if d(v,c) ≤ δ;
thus in particular we have φ_c(v) = 0 if and only if C ∩ B_δ(v) = {c}. Hence for c ∈ C we have γ_c ≤ Σ_{v∈Y^n} µ_{Y^n|c}(v)·φ_c(v), where moreover we have Σ_{v∈Y^n} µ_{Y^n|c}(v)·(1 − χ_c(v)) = Σ_{v∈Y^n\B_δ(c)} µ_{Y^n|c}(v) = µ(β > δ) < ε, for n ≫ 0. Thus we get
γ(C) = (1/m)·Σ_{c∈C} γ_c < ε + (1/m)·Σ_{[c,c']∈C², c≠c'} Σ_{v∈Y^n} µ_{Y^n|c}(v)·χ_{c'}(v).

Hence, averaging over all \binom{2^n}{m} many m-subsets C ∈ Γ_n, distinct code words being chosen uniformly and independently, since any 2-subset of X^n is contained in precisely \binom{2^n−2}{m−2} of these m-subsets, we get

E_{Γ_n}(γ(C)) < ε + (1/m)·(m(m−1)/(2^n(2^n−1)))·Σ_{[c,c']∈(X^n)², c≠c'} Σ_{v∈Y^n} µ_{Y^n|c}(v)·χ_{c'}(v).

For all v ∈ Y^n fixed we have Σ_{c∈X^n} χ_c(v) = b as well as Σ_{c∈X^n} µ_{Y^n|c}(v) = Σ_{c∈X^n} p^{d(v,c)}·(1−p)^{n−d(v,c)} = Σ_{d=0}^{n} \binom{n}{d}·p^d·(1−p)^{n−d} = 1. Thus we get

γ(C_0) ≤ E_{Γ_n}(γ(C)) < ε + (m−1)b/(2^n − 1) < ε + mb/2^n < ε + 2^{⌈ρn⌉−n−1}·n/((δ/n)^δ·(1−δ/n)^{n−δ}).

This entails log₂(γ(C_0) − ε)/n < (⌈ρn⌉ − n − 1 + log₂(n))/n − (δ/n)·log₂(δ/n) − (1 − δ/n)·log₂(1 − δ/n) → ρ − 1 − p·log₂(p) − (1−p)·log₂(1−p) < 0, for n → ∞. Hence there is α > 0 such that γ(C_0) − ε < 2^{−nα}, for n ≫ 0. ]

II

3 Block codes

(3.1) Hamming distance. a) Let X be an alphabet such that q := |X|. Letting v = [x_1,...,x_n] ∈ X^n and w = [y_1,...,y_n] ∈ X^n, then d(v,w) := |{i ∈ {1,...,n}; x_i ≠ y_i}| ∈ {0,...,n} is called their Hamming distance; recall that we have already used this for q = 2 in (2.6).
The Hamming distance defines a discrete metric on X^n: We have positive definiteness d(v,w) ∈ R_{≥0}, where d(v,w) = 0 if and only if v = w; we have symmetry d(v,w) = d(w,v); and the triangle inequality holds: Letting u := [z_1,...,z_n] ∈ X^n, from {i ∈ {1,...,n}; x_i ≠ z_i} = {i ∈ {1,...,n}; y_i = x_i ≠ z_i} ∪ {i ∈ {1,...,n}; y_i ≠ x_i ≠ z_i} ⊆ {i ∈ {1,...,n}; y_i ≠ z_i} ∪ {i ∈ {1,...,n}; x_i ≠ y_i} we get d(v,u) ≤ d(v,w) + d(w,u).
An isometry of X^n is a map φ: X^n → X^n such that d(v,w) = d(v^φ, w^φ), for all v,w ∈ X^n; it follows from positive definiteness that any isometry is injective, hence is bijective. Thus the set of all isometries of X^n forms a group, called the isometry group of X^n. In particular, given permutations π_i ∈ S_q, for all i ∈ {1,...,n}, this yields an isometry [π_1,...,π_n] acting component-wise, and any permutation in S_n induces an isometry by permuting the components. Hence the isometry group contains a subgroup isomorphic to the semidirect product S_q^n ⋊ S_n, thus in particular acts transitively on X^n.
b) Let X = F_q be the field with q elements, and let 0_n := [0,...,0] ∈ F_q^n. For v = [x_1,...,x_n] ∈ F_q^n let wt(v) := d(v, 0_n) ∈ {0,...,n} be the Hamming weight of v, and let supp(v) := {i ∈ {1,...,n}; x_i ≠ 0} be the support of v; hence we have wt(v) = |supp(v)|.
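The metric axioms above can be confirmed by exhaustive search over a small alphabet; a minimal Python check (our code, not part of the notes):

```python
from itertools import product

def d(v, w):
    """Hamming distance on X^n: number of differing positions."""
    return sum(1 for a, b in zip(v, w) if a != b)

# Brute-force check of the metric axioms over X = {0, 1, 2}, n = 3.
X, n = (0, 1, 2), 3
words = list(product(X, repeat=n))
# positive definiteness and symmetry
assert all((d(v, w) == 0) == (v == w) for v in words for w in words)
assert all(d(v, w) == d(w, v) for v in words for w in words)
# triangle inequality d(v, u) <= d(v, w) + d(w, u)
assert all(d(v, u) <= d(v, w) + d(w, u)
           for v in words for w in words for u in words)
```

The triangle inequality check runs over all 27³ triples, mirroring the set-theoretic argument position by position.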

An F_q-linear isometry of F_q^n is called a linear isometry, and the group I_n(F_q) ≤ GL_n(F_q) of all linear isometries is called the linear isometry group of F_q^n. We determine I_n(F_q):
We have d(v+u, w+u) = d(v,w), for all u,v,w ∈ F_q^n, thus we have d(v,w) = d(v−w, 0_n) = wt(v−w). Since I_n(F_q) fixes 0_n ∈ F_q^n, we infer wt(v^φ) = wt(v) for all v ∈ F_q^n. Hence for the i-th unit vector e_i = [0,...,0,1,0,...,0] ∈ F_q^n, where i ∈ {1,...,n}, we have e_i^φ = x_i·e_{iπ}, where π ∈ S_n and x_i ∈ F_q^* := F_q \ {0}. Thus φ is described by a monomial matrix diag[x_1,...,x_n]·P_π ∈ GL_n(F_q), where P_π ∈ GL_n(F_q) is the permutation matrix associated with π ∈ S_n. Since conversely any invertible diagonal matrix and any permutation matrix, and thus any monomial matrix, gives rise to a linear isometry, we conclude that I_n(F_q) ≤ GL_n(F_q) is the subgroup of monomial matrices, hence I_n(F_q) ≅ (F_q^*)^n ⋊ S_n. In particular, I_n(F_q) acts transitively on {v ∈ F_q^n; wt(v) = i}, for all i ∈ {0,...,n}.

(3.2) Minimum distance. a) Let X be an alphabet such that q := |X|, and let ∅ ≠ C ⊆ X^n be a block code of length n ∈ N and order m := |C| ∈ N; if m = 1 then C is called trivial. Codes C, C' ⊆ X^n are called equivalent, if there is an isometry φ such that C^φ = C'; the automorphism group of C is the group of all isometries φ such that C^φ = C.
If C is non-trivial, then d(C) := min{d(v,w) ∈ N; v ≠ w ∈ C} ∈ {1,...,n} is called the minimum distance of C; we have d(X^n) = 1, and if C is trivial we let d(C) := ∞. If d(C) = d then C is called an (n,m,d)-code over X. Hence equivalent codes have the same minimum distance.
For any non-trivial (n,m,d)-code C we have the Singleton bound [1964] log_q(m) ≤ n − d + 1: We consider the map α: X^n → X^{n−d+1}: [x_1,...,x_n] ↦ [x_1,...,x_{n−d+1}]; since for any v ≠ w ∈ C we have d(v,w) ≥ d, we infer that the restriction α|_C: C → X^{n−d+1} is injective, thus m = |C| ≤ q^{n−d+1}. Note that in the above argument we can choose any n − d + 1 components instead. If we have equality d − 1 = n − log_q(m), then C is called a maximum distance separable (MDS) code; in particular, X^n is the only MDS code such that d = 1.
b) Let X = F_q, and let C ≤ F_q^n be a linear code of dimension k ∈ N_0 and minimum distance d(C) = d; then C is called an [n,k,d]-code over F_q. In particular C is an (n, q^k, d)-code, and the Singleton bound for k ≥ 1 reads d − 1 ≤ n − k.

Moreover, if C is non-trivial then wt(C) := min{wt(v) ∈ N; 0_n ≠ v ∈ C} ∈ {1,...,n} is called the minimum weight of C; if C is trivial we let wt(C) := ∞. Then we have wt(C) = d, that is the minimum distance and the minimum weight of C coincide: We may assume that C is non-trivial, that is k ≥ 1; since wt(v) = d(v, 0_n) ≥ d for all 0_n ≠ v ∈ C, we have wt(C) ≥ d; conversely, for all v ≠ w ∈ C we have 0_n ≠ v − w ∈ C and d(v,w) = d(v−w, 0_n) = wt(v−w) ≥ wt(C), hence we have d ≥ wt(C) as well.
Codes C, C' ≤ F_q^n are called linearly equivalent, if there is φ ∈ I_n(F_q) such that C^φ = C'; the linear automorphism group of C is the subgroup Aut_{F_q}(C) := {φ ∈ I_n(F_q); C^φ = C} ≤ I_n(F_q). Hence linearly equivalent codes have the same dimension and the same minimum weight.

(3.3) Error correction. a) Let X be an alphabet such that q := |X|. For n ∈ N and r ∈ N_0 let B_r(v) := {w ∈ X^n; d(v,w) ≤ r} be the sphere or ball with radius r around v ∈ X^n. Hence independently of v ∈ X^n we have |B_r(v)| = Σ_{d=0}^{min{r,n}} |{w ∈ X^n; d(v,w) = d}| = Σ_{d=0}^{min{r,n}} \binom{n}{d}·(q−1)^d ∈ N_0; recall that we have already used this for q = 2 in (2.8).
Let C be an (n,m,d)-code over X. Then C is called (e,f)-error detecting, for some e ∈ {0,...,n} and f ∈ {e,...,n}, if B_f(v) ∩ B_e(w) = ∅ for all v ≠ w ∈ C; in particular, if f = e + 1 then C is called f-error detecting, and if e = f then C is called e-error correcting. Thus, if C is (e,f)-error detecting then it is e-error correcting, and if C is e-error correcting then it is e-error detecting.
Hence for u ∈ X^n we have u ∈ B_f(v), for some v ∈ C, if and only if u is obtained from v by at most f errors. In this case, u has distance at least e + 1 from any v ≠ w ∈ C; thus it is detected that an error has occurred. If at most min{f, e+1} errors have occurred, the number of errors is detected, but u might not be uniquely nearest neighbor decodable. If at most e errors have occurred, then u is correctly decoded by unique nearest neighbor decoding.
Moreover, if C is e-error correcting, then since any element of X^n is contained in at most one sphere B_e(v), where v ∈ C, we have the Hamming bound [1960] or sphere packing bound m·Σ_{i=0}^{e} \binom{n}{i}·(q−1)^i ≤ q^n = |X^n|.
b) If C is trivial, then C is n-error correcting; hence let C be non-trivial. Then d ∈ N is related to the error correction and detection properties of C as follows:
The code C is (e,f)-error detecting if and only if e + f ≤ d − 1: Let e + f be as large as possible, that is e + f = d − 1. Assume that there are v ≠ w ∈ C such that u ∈ B_f(v) ∩ B_e(w) ≠ ∅; then we have d(v,w) ≤ d(v,u) + d(u,w) ≤ f + e = d − 1, a contradiction; hence C is (e,f)-error detecting. Conversely, let v ≠ w ∈ C such that d(v,w) = d; then B_{f+1}(v) ∩ B_e(w) ≠ ∅ and B_f(v) ∩ B_{e+1}(w) ≠ ∅, thus C is neither (e, f+1)-error detecting, nor (e+1, f)-error detecting if e < f.
In particular, C is f-error detecting if and only if 2f ≤ d, and C is e-error correcting if and only if 2e + 1 ≤ d. In other words, C always is ⌊(d−1)/2⌋-error correcting, and if d is even then C moreover is (d/2)-error detecting; note that ⌊(d−1)/2⌋ = (d−1)/2 if d is odd, but ⌊(d−1)/2⌋ = d/2 − 1 if d is even.
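The equivalence "C is (e,f)-error detecting if and only if e + f ≤ d − 1" can be confirmed by brute force for a small example; the following Python sketch (our code and naming) checks it for the binary repetition code of length 5, where d = 5:

```python
from itertools import combinations, product

def d(v, w):
    return sum(1 for a, b in zip(v, w) if a != b)

def is_ef_detecting(code, e, f, n, q=2):
    """Check B_f(v) and B_e(w) are disjoint for all v != w in the code,
    by scanning all q^n words of X^n (brute force)."""
    for u in product(range(q), repeat=n):
        near_f = [v for v in code if d(u, v) <= f]
        near_e = [w for w in code if d(u, w) <= e]
        if any(v != w for v in near_f for w in near_e):
            return False
    return True

n = 5
code = [(0,) * n, (1,) * n]                     # repetition code, d = 5
dmin = min(d(v, w) for v, w in combinations(code, 2))
for e in range(n + 1):
    for f in range(e, n + 1):
        assert is_ef_detecting(code, e, f, n) == (e + f <= dmin - 1)
print("(e,f)-detection matches e + f <= d - 1 for d =", dmin)
```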

Example. Let C := {v ∈ Z_q^n; v·w^{tr} = 0 ∈ Z_q} be a parity check code over Z_q, for q ≥ 2 and n ≥ 2 and weights w ∈ (Z_q^*)^n. Since w_n ∈ Z_q^*, for all [x_1,...,x_{n−1}] ∈ Z_q^{n−1} there is a unique x_n ∈ Z_q such that v := [x_1,...,x_{n−1},x_n] ∈ C. Hence the information rate of C is ρ(C) = log_q(|C|)/n = (n−1)/n = 1 − 1/n.
Moreover, for x_{n−1} ≠ x'_{n−1} ∈ Z_q there is x'_n ∈ Z_q such that [x_1,...,x'_{n−1},x'_n] ∈ C as well, implying that d(C) ≤ 2. Since w_i ∈ Z_q^*, for all i ∈ {1,...,n}, from (1.4) we infer that we have [x_1,...,x_{i−1},x'_i,x_{i+1},...,x_n] ∉ C, whenever x_i ≠ x'_i ∈ Z_q. This says that B_1(v) ∩ C = {v}, entailing that C has minimum distance d(C) = 2, and thus is 0-error correcting and 1-error detecting.

(3.4) Covering radius. a) Let X be an alphabet such that q := |X|, and let C be an (n,m,d)-code over X. The minimum c ∈ {0,...,n} such that X^n = ∪_{v∈C} B_c(v) is called the covering radius c(C) of C. Hence we have c(C) = 0 if and only if C = X^n, and we have c(C) = n if and only if C is trivial. Letting C be non-trivial, the covering radius is related to the minimum distance as follows:
If d is odd, letting e := (d−1)/2 we have B_e(v) ∩ B_e(w) = ∅ for all v ≠ w ∈ C, hence since B_e(v) \ B_{e−1}(v) ≠ ∅, we conclude that c(C) ≥ e. If d is even, letting f := d/2 we have B_f(v) ∩ B_{f−1}(w) = ∅ for all v ≠ w ∈ C, hence since B_f(v) \ B_{f−1}(v) ≠ ∅, we conclude that c(C) ≥ f.
b) If c(C) = e := ⌊(d−1)/2⌋, that is X^n = ⨆_{v∈C} B_e(v), then C is called perfect. In this case we have unique nearest neighbor decoding for any element of X^n, but C incorrectly decodes any word containing at least e + 1 errors. In other words, C is perfect if and only if the Hamming bound is an equality, that is we have m·Σ_{i=0}^{e} \binom{n}{i}·(q−1)^i = q^n. Hence C is perfect with d = 1 if and only if C = X^n.
As for the existence of perfect codes in general, if d is even then c(C) ≥ f = d/2 and B_f(v) ∩ B_f(w) ≠ ∅, for all v ≠ w ∈ C such that d(v,w) = d, imply that there are no perfect codes in this case. The picture changes if d is odd, but still perfect codes are rare; in particular fulfilling the Hamming bound is by far not sufficient, as (3.5) below shows.
If d is odd and c(C) = e + 1 = (d+1)/2, or d is even and c(C) = f = d/2, the code C is called quasi-perfect; in this case we do not have unique nearest neighbor decoding for all elements of X^n, and C incorrectly decodes any word containing at least e + 2 respectively f + 1 errors, and possibly some words containing e + 1 respectively f errors.

Example. The repetition code C := {[x,...,x] ∈ X^n; x ∈ X}, for n ∈ N, has information rate ρ(C) = log_q(|C|)/n = 1/n. Its minimum distance is d(C) = n, hence C is ⌊(n−1)/2⌋-error correcting, and if n is even it is (n/2)-error detecting.
In particular, if X = F_2 then we have C = {0_n, 1_n} ≤ F_2^n, where we abbreviate 1_n := [1,...,1] ∈ F_2^n. Since for all v ∈ F_2^n we have d(v, 0_n) ≤ ⌊n/2⌋ or d(v, 1_n) ≤ ⌊n/2⌋, we get c(C) = (n−1)/2 if n is odd, that is C is perfect, and c(C) = n/2 if n is even, that is C is quasi-perfect.

Example. Let C := ⟨[1,0,0,0,1,1], [0,1,0,1,0,1], [0,0,1,1,1,0]⟩_{F_2} ≤ F_2^6, hence X = F_2, and the elements of C consist of the rows of the following matrix:

. . . . . .
1 . . . 1 1
. 1 . 1 . 1
. . 1 1 1 .
1 1 . 1 1 .
1 . 1 1 . 1
. 1 1 . 1 1
1 1 1 . . .

∈ F_2^{8×6}.

Thus we have the minimum distance d = d(C) = wt(C) = 3, that is C is a [6,3,3]-code, which hence is 1-error correcting. Using q = 2 and e = (d−1)/2 = 1, the Hamming bound yields 2³·(\binom{6}{0} + \binom{6}{1}) = 56 < 64 = 2⁶, thus C is not perfect. Hence the covering radius is c(C) ≥ e + 1 = 2, and we show that actually c(C) = (d+1)/2 = 2, saying that C is quasi-perfect:
Let u ∈ F_2^6. Since the Hamming distance is translation invariant, by adding a suitable element of C we may assume that u = [0, 0, 0, ∗, ∗, ∗]. Since the permutation (1,2,3)(4,5,6) ∈ S_6 induces a linear isometry leaving C invariant, we may assume that u ∈ {0_6, [0,0,0,1,0,0], [0,0,0,1,1,0], [0,0,0,1,1,1]}. Now we observe that u ∈ B_2(v) for some v ∈ C. ]
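Both invariants of this example, the minimum distance 3 and the covering radius 2, can be recomputed by exhaustive search over F_2^6; a short Python sketch (our code, not part of the notes):

```python
from itertools import product

def d(v, w):
    return sum(1 for a, b in zip(v, w) if a != b)

# Span of the three generators of the [6,3,3]-code over F_2.
gens = [(1, 0, 0, 0, 1, 1), (0, 1, 0, 1, 0, 1), (0, 0, 1, 1, 1, 0)]
code = {tuple(sum(c * g for c, g in zip(coeffs, col)) % 2
              for col in zip(*gens))
        for coeffs in product((0, 1), repeat=3)}

dmin = min(d(v, w) for v in code for w in code if v != w)
cov = max(min(d(u, v) for v in code) for u in product((0, 1), repeat=6))
print(dmin, cov)  # -> 3 2, so the code is 1-error correcting, quasi-perfect
```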

(3.5) Theorem: Tietäväinen, van Lint [1973]. Let C ⊆ F_q^n be a perfect (n, m, 2e+1)-code, where e ∈ N; in particular we have C ≠ F_q^n and n ≥ 3.
a) If e ≥ 2, then C is equivalent to a linear code, and linearly equivalent to
i) the binary repetition [n,1,n]-code {0_n, 1_n}, where n ≥ 5 is odd;
ii) the binary Golay [23,12,7]-code G_23, see (11.7);
iii) the ternary Golay [11,6,5]-code G_11, see (11.8).

b) If e = 1, then n = (q^k − 1)/(q − 1) for some k ≥ 2, and m = q^{n−k}. If C is linear, then it is linearly equivalent to the Hamming [n, n−k, 3]-code H_k, see (6.2). ]

In particular, for q = 2 and k = 2 we recover the binary repetition [3,1,3]-code {0_3, 1_3}. Actually, there are non-linear codes having the parameters of Hamming codes, and their classification still is an open problem.

4 Linear codes

(4.1) Generator matrices. Let C ≤ F_q^n be a linear code of length n ∈ N over F_q, and let k := dim_{F_q}(C) ∈ {0,...,n}. A matrix G ∈ F_q^{k×n} whose rows form an F_q-basis of C is called a generator matrix of C; hence we have C = im(G) = {vG ∈ F_q^n; v ∈ F_q^k}; in particular, for k = 0 we have G ∈ F_q^{0×n}, and for k = n we may choose the identity matrix G = E_n ∈ F_q^{n×n}. Then v ∈ F_q^k is encoded into vG ∈ F_q^n, and conversely w ∈ C = im(G) ≤ F_q^n is decoded by solving the system of F_q-linear equations [X_1,...,X_k]·G = w ∈ F_q^n, which since rk_{F_q}(G) = k has a unique solution.

Since rk_{F_q}(G) = k, by Gaussian row elimination and possibly column permutation G can be transformed into standard form [E_k | A] ∈ F_q^{k×n}, where A ∈ F_q^{k×(n−k)}. Row operations leave the row space C of G invariant; and column permutations amount to permuting the positions of symbols, thus transform C into a linearly equivalent code. Hence in this case [x_1,...,x_k] ∈ F_q^k is encoded into [x_1,...,x_k; y_1,...,y_{n−k}] ∈ F_q^n, where [y_1,...,y_{n−k}] = [x_1,...,x_k]·A ∈ F_q^{n−k}. Thus the first k symbols can be considered as information symbols, and the last n − k symbols as check symbols; since information and check symbols can be distinguished like this, C is called separable. Moreover, the projection C → F_q^k: [z_1,...,z_n] ↦ [z_1,...,z_k] onto the first k positions is a bijection; hence C is called systematic on the information symbols.
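Encoding with a generator matrix in standard form [E_k | A] amounts to copying the information symbols and appending the check symbols [x_1,...,x_k]·A; a minimal Python sketch over F_2 (function name ours; the particular A below is the one of the binary Hamming [7,4]-code treated in the next example):

```python
# Systematic encoding with a standard-form generator matrix [E_k | A] over F_2.
def encode(x, A):
    """x: list of k information bits; A: k x (n-k) matrix over F_2."""
    k, r = len(A), len(A[0])
    checks = [sum(x[i] * A[i][j] for i in range(k)) % 2 for j in range(r)]
    return x + checks

A = [[0, 1, 1], [1, 0, 1], [1, 1, 0], [1, 1, 1]]
print(encode([1, 0, 1, 1], A))  # -> [1, 0, 1, 1, 0, 1, 0]
```

Decoding a valid codeword is then mere truncation to the first k positions, which is what "systematic on the information symbols" expresses.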

(4.2) Check matrices. a) Let F_q be the field with q elements. For n ∈ N let ⟨·,·⟩: F_q^n × F_q^n → F_q: [[x_1,...,x_n],[y_1,...,y_n]] ↦ x·y^{tr} = Σ_{i=1}^{n} x_i y_i be the standard F_q-bilinear form on F_q^n. This form is symmetric and non-degenerate, and we have ⟨vM, w⟩ = ⟨v, wM^{tr}⟩ for all v, w ∈ F_q^n and M ∈ F_q^{n×n}.
For a code C ≤ F_q^n, the orthogonal space C^⊥ := {v ∈ F_q^n; ⟨v,w⟩ = 0 ∈ F_q for all w ∈ C} ≤ F_q^n with respect to the standard F_q-bilinear form is called the associated dual code. Letting k := dim_{F_q}(C) ∈ {0,...,n}, we have dim_{F_q}(C^⊥) = n − k. Moreover, we have C ≤ (C^⊥)^⊥, and from dim_{F_q}((C^⊥)^⊥) = n − (n − k) = k = dim_{F_q}(C) we get (C^⊥)^⊥ = C. If C ≤ C^⊥ then C is called weakly self-dual, and if C = C^⊥ then C is called self-dual; in the latter case we have n − k = dim_{F_q}(C^⊥) = dim_{F_q}(C) = k, thus n = 2k is even.
b) If G ∈ F_q^{k×n} is a generator matrix of C, then we have C^⊥ = {v ∈ F_q^n; Gv^{tr} = 0 ∈ F_q^{k×1}} = {v ∈ F_q^n; vG^{tr} = 0 ∈ F_q^k}. Hence if H ∈ F_q^{(n−k)×n} is a generator matrix of C^⊥, then C = (C^⊥)^⊥ = {v ∈ F_q^n; vH^{tr} = 0 ∈ F_q^{n−k}} = ker(H^{tr}) ≤ F_q^n. Thus H is called a check matrix of C, and instead of using a generator matrix the code C can also be defined by a check matrix. In particular, for k = n we have H ∈ F_q^{0×n}, and for k = 0 we may choose the identity matrix H = E_n ∈ F_q^{n×n}.
If G = [E_k | A] ∈ F_q^{k×n} is in standard form, then H = [−A^{tr} | E_{n−k}] ∈ F_q^{(n−k)×n} is a generator matrix of C^⊥, also being called a standard check matrix for C: We have rk_{F_q}(H) = n − k, and, since G^{tr} consists of the blocks E_k and A^{tr} stacked on top of each other, HG^{tr} = −A^{tr}·E_k + E_{n−k}·A^{tr} = 0 ∈ F_q^{(n−k)×k}.
Duality interacts with linear equivalence nicely, inasmuch as a code C' ≤ F_q^n is linearly equivalent to C if and only if its dual (C')^⊥ is linearly equivalent to C^⊥: It suffices to show one direction. If C' is linearly equivalent to C, then there is M ∈ I_n(F_q) such that C' = C·M. Then we have C'·(HM^{−tr})^{tr} = C·M·M^{−1}·H^{tr} = {0}. Thus HM^{−tr} ∈ F_q^{(n−k)×n}, having rank rk_{F_q}(HM^{−tr}) = n − k, is a check matrix for C', entailing that (C')^⊥, having HM^{−tr} as a generator matrix, is linearly equivalent to C^⊥, which has H as a generator matrix.

Example. Let the binary Hamming code H ≤ F_2^7 be given by the following generator matrix G ∈ F_2^{4×7}, or equivalently its standard form G′:

     . . . 1 1 1 1               1 . . . . 1 1
     . 1 1 1 1 . .               . 1 . . 1 . 1
G :=                  and G′ :=
     1 1 . . 1 1 .               . . 1 . 1 1 .
     1 1 1 1 1 1 1               . . . 1 1 1 1

A check matrix H ∈ F_2^{3×7}, or equivalently its standard form H′, is given as

     . . . 1 1 1 1               . 1 1 1 1 . .
H := . 1 1 . . 1 1   and H′ :=   1 . 1 1 . 1 .
     1 . 1 . 1 . 1               1 1 . 1 . . 1

We have k = dim_{F_2}(H) = 4, that is m = |H| = 2⁴ = 16. From inspecting the elements of H, as given by the rows of the matrices below, or from (4.3) below, we get d = d(H) = 3, thus H is a [7,4,3]-code. Moreover, we have m·Σ_{i=0}^{(d−1)/2} \binom{7}{i} = 16·(1 + 7) = 128 = 2⁷ = |F_2^7|, thus by the Hamming bound we conclude that H is perfect; see also (3.5) and (6.3).

......  1 .... 1 1 ... 1 1 1 1 1 .. 1 1 ..     .. 1 . 1 1 . 1 . 1 . 1 . 1     .. 1 1 .. 1 1 . 1 1 . 1 .     . 1 .. 1 . 1 1 1 .. 1 1 .     . 1 . 1 . 1 . 1 1 . 1 .. 1     . 1 1 .. 1 1 1 1 1 .... . 1 1 1 1 .. 1 1 1 1 1 1 1

(4.3) Theorem. Let C ≤ F_q^n be a non-trivial [n,k,d]-code, let F_q ⊆ F be a field extension, and let H ∈ F^{(n−k)×n} be a generalized check matrix of C, that is C = ker(H^{tr}) ∩ F_q^n is the subfield subcode of the code over F having check matrix H; thus for F = F_q the matrix H is just a check matrix of C.

Then any (d−1)-subset of the columns of H is F_q-linearly independent, and there is some F_q-linearly dependent d-subset of the columns of H. In particular, for F = F_q we recover the Singleton bound d − 1 ≤ rk_{F_q}(H) = n − k for linear codes.

Proof. Let d' ∈ N such that any (d'−1)-subset of the columns of H is F_q-linearly independent, while there is some F_q-linearly dependent d'-subset of the columns of H. Let 0_n ≠ v ∈ F_q^n such that wt(v) ≤ d' − 1. Then vH^{tr} ∈ F^{n−k} is a non-trivial F_q-linear combination of at most d' − 1 rows of H^{tr}. Since the latter are F_q-linearly independent we have vH^{tr} ≠ 0, hence v ∉ C, implying that d ≥ d'.

Conversely, picking an F_q-linearly dependent d'-subset of the columns of H, there is 0_n ≠ v ∈ F_q^n such that wt(v) ≤ d' and vH^{tr} = 0 ∈ F^{n−k}, thus we have v ∈ C, implying that d ≤ d'. ]

(4.4) Syndrome decoding. Let C ≤ F_q^n, where n ∈ N, be given by a check matrix H ∈ F_q^{(n−k)×n}, where k := dim_{F_q}(C). Then for v ∈ F_q^n let vH^{tr} ∈ F_q^{n−k} be the syndrome of v with respect to H.
We consider the quotient F_q-vector space F_q^n/C, whose elements v̄ := v + C = {v + w ∈ F_q^n; w ∈ C} ∈ F_q^n/C, for v ∈ F_q^n, are called cosets with respect to C. Since C = ker(H^{tr}) ≤ F_q^n and rk_{F_q}(H) = n − k, by the homomorphism theorem we have F_q^n/C = F_q^n/ker(H^{tr}) ≅ im(H^{tr}) = F_q^{n−k} as F_q-vector spaces. Thus the syndromes are in natural bijection with the cosets with respect to C; in particular there are |F_q^n/C| = |F_q^{n−k}| = q^{n−k} syndromes.
To decode a possibly erroneous vector v ∈ F_q^n, we proceed as follows: Let w ∈ C be the codeword sent, and let u := v − w ∈ F_q^n be the associated error vector, thus we have d(v,w) = wt(u). Moreover, for the syndromes we have vH^{tr} = (w + u)H^{tr} = uH^{tr} ∈ F_q^{n−k}, that is we have v̄ = ū ∈ F_q^n/C. Hence v is uniquely nearest neighbor decodable if and only if the coset v + C ⊆ F_q^n possesses a unique element u ∈ F_q^n of minimum weight, and in this case v ∈ F_q^n is decoded to v − u ∈ F_q^n.
The elements of the coset v + C ⊆ F_q^n having minimum weight are called the coset leaders of v̄ ∈ F_q^n/C. In general coset leaders are not unique. Still, the zero vector 0_n ∈ F_q^n always is the unique coset leader of the coset 0̄_n ∈ F_q^n/C; and if C is e-error correcting, for some e ∈ {0,...,n}, and the coset v + C ⊆ F_q^n possesses an element u ∈ F_q^n of weight wt(u) ≤ e, then u is the unique coset leader of v̄ ∈ F_q^n/C. In practice, coset leaders for all syndromes in im(H^{tr}) = F_q^{n−k} are computed once and stored into a table, so that coset leaders are then found by computing syndromes and subsequent table lookup. Finding coset leaders algorithmically, being called the decoding problem for linear codes, is an NP-hard problem; hence the aim is to find codes having fast decoding algorithms.
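The table-lookup procedure just described can be sketched over F_2 in a few lines; the following Python code (function names ours) builds a coset leader table for the Hamming [7,4,3]-code with the check matrix H of (4.2) and corrects a single error:

```python
from itertools import product

def syndrome(v, H):
    """Syndrome v * H^tr over F_2."""
    return tuple(sum(a * b for a, b in zip(v, row)) % 2 for row in H)

def leader_table(H, n):
    """Map each syndrome to a coset element of minimal weight, by
    enumerating all words of F_2^n in order of increasing weight."""
    table = {}
    for v in sorted(product((0, 1), repeat=n), key=sum):
        table.setdefault(syndrome(v, H), v)
    return table

def decode(v, H, table):
    u = table[syndrome(v, H)]                 # coset leader = error estimate
    return tuple((a - b) % 2 for a, b in zip(v, u))

# Check matrix of the binary Hamming [7,4,3]-code.
H = [(0, 0, 0, 1, 1, 1, 1), (0, 1, 1, 0, 0, 1, 1), (1, 0, 1, 0, 1, 0, 1)]
table = leader_table(H, 7)
print(decode((1, 0, 1, 1, 0, 1, 1), H, table))  # -> (1, 0, 1, 1, 0, 1, 0)
```

Since this code is 1-error correcting, every coset contains a unique word of weight at most 1, so the table has one leader per syndrome and any single error is corrected.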

Example: Parity check codes. Let C = {v ∈ F_q^n; vw^{tr} = 0 ∈ F_q} ≤ F_q^n be defined by the check matrix 0_n ≠ H := [w] = [w_1,...,w_n] ∈ F_q^{1×n}, where up to linear equivalence we may assume that w_n = 1, that is H is in standard form; hence we have k = dim_{F_q}(C) = n − 1. Thus the standard generator matrix is given as G := [E_{n−1} | y^{tr}] ∈ F_q^{(n−1)×n}, where y := −[w_1,...,w_{n−1}] ∈ F_q^{n−1}. Hence [x_1,...,x_{n−1}] ∈ F_q^{n−1} is encoded into [x_1,...,x_{n−1}; −Σ_{i=1}^{n−1} x_i w_i] ∈ F_q^n, and for v = [x_1,...,x_n] ∈ F_q^n we get the syndrome vH^{tr} = Σ_{i=1}^{n} x_i w_i ∈ F_q. From (4.3) we get d(C) ∈ {1,2}, where for n ≥ 2 we have d(C) = 2 if and only if w_i ≠ 0 for all i ∈ {1,...,n}.
In particular, for q = 2 and w = 1_n ∈ F_2^n, any [x_1,...,x_{n−1}] ∈ F_2^{n−1} is encoded into [x_1,...,x_{n−1}; Σ_{i=1}^{n−1} x_i] ∈ F_2^n, which indeed amounts to adding a parity

check symbol; and v = [x_1,...,x_n] ∈ F_2^n has syndrome vH^{tr} = Σ_{i=1}^{n} x_i ∈ F_2, hence we have v ∈ C if and only if wt(v) ∈ N_0 is even, thus C is called the binary even-weight code of length n. For n ≥ 2 we have d = d(C) = 2, thus C is a binary [n, n−1, 2]-code. Indeed, it is the unique one: Any such code has a check matrix in F_2^{1×n} without zero entries, thus being equal to 1_n. Since any v ∈ F_2^n has distance at most 1 from an element of C, the covering radius equals c(C) = 1 = d/2, thus C is quasi-perfect.
We have F_2^n = (0_n + C) ∪̇ (v + C), where v ∈ F_2^n is any vector such that wt(v) is odd, corresponding to the syndromes 0 ∈ F_2 and 1 ∈ F_2, respectively; thus any vector of weight 1 is a coset leader of the coset F_2^n \ C, which hence is not uniquely nearest neighbor decodable.

Example: Repetition codes. Let C be given by the standard generator matrix G := [1_n] ∈ F_q^{1×n}; hence we have k = dim_{F_q}(C) = 1 and d(C) = n. Thus the standard check matrix is H := [−1_{n−1}^{tr} | E_{n−1}] ∈ F_q^{(n−1)×n}. Then x ∈ F_q is encoded into [x,...,x] ∈ F_q^n, and for v = [x_1,...,x_n] ∈ F_q^n we get the syndrome vH^{tr} = [x_2 − x_1,...,x_n − x_1] ∈ F_q^{n−1}.
In particular, for q = 2 we have C = {0_n, 1_n} ≤ F_2^n; recall that C is perfect if n is odd, and quasi-perfect if n is even. The repetition code is the unique binary [n,1,n]-code, and is weakly self-dual if and only if ⟨1_n, 1_n⟩ = 0, which holds if and only if n is even. The generator matrix G = [1_n] ∈ F_2^{1×n} is a check matrix of the even-weight code of length n, hence the latter coincides with C^⊥.
Any v = [x_1,...,x_n] ∈ F_2^n has syndrome [x_2 + x_1,...,x_n + x_1] ∈ F_2^{n−1}; and the coset affording the syndrome w ∈ F_2^{n−1} equals [0 | w] + C ⊆ F_2^n, where wt([0 | w]) = wt(w) and wt([0 | w] + 1_n) = n − wt(w). Thus, given v, computing its syndrome and finding coset leaders yields the following decoding algorithm:
For n odd, coset leaders are uniquely given as [0 | w] if wt(w) ≤ (n−1)/2 =: e, and as [0 | w] + 1_n if wt(w) ≥ (n+1)/2 = e + 1; in both cases the coset leaders have weight at most e. Thus if wt(v) ≤ e and x_1 = 0, then v has syndrome [x_2,...,x_n], and is decoded to v + [0 | x_2,...,x_n] = 0_n; if x_1 = 1, then v has syndrome [x_2 + 1,...,x_n + 1], and is decoded to v + ([0 | x_2 + 1,...,x_n + 1] + 1_n) = 0_n; if wt(v) ≥ e + 1 and x_1 = 0, then v is decoded to v + ([0 | x_2,...,x_n] + 1_n) = 1_n; if x_1 = 1, then v is decoded to v + [0 | x_2 + 1,...,x_n + 1] = 1_n.
For n even, coset leaders are uniquely given as [0 | w] if wt(w) ≤ n/2 − 1 =: e, and as [0 | w] + 1_n if wt(w) ≥ n/2 + 1 = e + 2, where in both cases the coset leaders have weight at most e; but for wt(w) = n/2 = e + 1 we have wt([0 | w]) = wt([0 | w] + 1_n) = wt(w), in which case coset leaders are not unique. Thus if wt(v) ≤ e and x_1 = 0, then v is decoded to v + [0 | x_2,...,x_n] = 0_n; if x_1 = 1, then v is decoded to v + ([0 | x_2 + 1,...,x_n + 1] + 1_n) = 0_n; if wt(v) ≥ e + 2 and x_1 = 0, then v is decoded to v + ([0 | x_2,...,x_n] + 1_n) = 1_n; if x_1 = 1, then v is decoded to v + [0 | x_2 + 1,...,x_n + 1] = 1_n; but if wt(v) = e + 1 then v is not uniquely nearest neighbor decodable.

5 Bounds for codes

(5.1) Theorem: Plotkin bound [1960]. Let C be a non-trivial (n,m,d)-code over an alphabet X such that q := |X|. Then we have m·(d − n·(q−1)/q) ≤ d. In particular, if we have equality then d(v,w) = d for all v ≠ w ∈ C, that is C is an equidistant code. Note that the above inequality is fulfilled for all m ∈ N whenever d/n ≤ (q−1)/q, hence giving no obstruction at all in this case.

Proof. We compute two estimates of Δ := Σ_{[v,w]∈C², v≠w} d(v,w) ∈ N. Firstly, since d(v,w) ≥ d for all v ≠ w ∈ C, we get Δ ≥ m(m−1)d.

Secondly, letting X = {x_1,...,x_q}, let m_{ij} ∈ N_0 be the number of occurrences of the symbol x_i, for i ∈ {1,...,q}, in position j ∈ {1,...,n} of the various words in C. Hence we have Σ_{i=1}^{q} m_{ij} = m, and the Cauchy-Schwarz inequality, applied to the tuples [m_{ij}; i ∈ {1,...,q}] ∈ R^q and 1_q ∈ R^q, yields m² = (Σ_{i=1}^{q} m_{ij})² ≤ q·Σ_{i=1}^{q} m_{ij}², thus Σ_{i=1}^{q} m_{ij}² ≥ m²/q. Now, there are m_{ij} words in C having entry x_i at position j, and m − m_{ij} words having a different entry there. This accounts for all contributions to Δ, hence we get Δ = Σ_{j=1}^{n} Σ_{i=1}^{q} m_{ij}(m − m_{ij}) = Σ_{j=1}^{n} (m² − Σ_{i=1}^{q} m_{ij}²) ≤ Σ_{j=1}^{n} m²(1 − 1/q) = nm²·(q−1)/q.
Thus we get m(m−1)d ≤ Δ ≤ nm²·(q−1)/q, entailing (m−1)d ≤ nm·(q−1)/q. In particular, equality implies that Δ = m(m−1)d, thus C is equidistant. ]

(5.2) Theorem: Gilbert bound [1952]. a) Let X be an alphabet such that q := |X|, and let n, m, d ∈ N such that 2 ≤ m ≤ q^n. If m·|B_{d−1}(·)| = m·Σ_{i=0}^{d−1} \binom{n}{i}·(q−1)^i ≤ q^n, then there is an (n,m,d')-code over X such that d' ≥ d.
b) Let n, k, d ∈ N such that k ≤ n. If |B_{d−1}(0_n)| = Σ_{i=0}^{d−1} \binom{n}{i}·(q−1)^i < q^{n−k+1}, then there is an [n,k,d']-code over F_q such that d' ≥ d.

Proof. a) We construct a suitable code successively, starting from a singleton set C ⊆ X^n; recall that C has infinite minimum distance. As long as |C| < m we have |C|·|B_{d−1}(·)| < q^n = |X^n|, hence ∪_{v∈C} B_{d−1}(v) ⊂ X^n is a proper subset of X^n. Thus there is w ∈ X^n having distance at least d from all elements of C. Hence, since C has minimum distance at least d, this also holds for C ∪̇ {w}.
b) If k = 1, then we have d ≤ n, and thus the repetition [n,1,n]-code is as desired; hence we may assume that k ≥ 2. Then by induction there is an [n, k−1, d']-code C ≤ F_q^n such that d' ≥ d. Since q^{k−1}·|B_{d−1}(0_n)| < q^n, we conclude that ∪_{v∈C} B_{d−1}(v) ⊂ F_q^n is a proper subset of F_q^n. Thus there is w ∈ F_q^n having distance at least d from all elements of C. Hence we have wt(w) ≥ d, and xw ∈ F_q^n has the same distance properties from C, for all x ∈ F_q^*. Thus we have C ∩ ⟨w⟩_{F_q} = {0_n}, and we let Ĉ := C ⊕ ⟨w⟩_{F_q} ≤ F_q^n. Hence we have dim_{F_q}(Ĉ) = k, and d(xw + v, x'w + v') = d((x − x')w, v' − v) ≥ d, for all x, x' ∈ F_q and v, v' ∈ C, entails that Ĉ has minimum distance at least d, saying that Ĉ is as desired. ]
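The greedy construction in part a) translates directly into a small search; a Python sketch (our code and naming), run here for n = 7, d = 3 over F_2, where |B_2(·)| = 1 + 7 + 21 = 29:

```python
from itertools import product

def d(v, w):
    return sum(1 for a, b in zip(v, w) if a != b)

def gilbert_greedy(n, dmin, q=2):
    """Greedily collect words pairwise at distance >= dmin, mirroring
    the successive construction in the proof of the Gilbert bound."""
    code = []
    for w in product(range(q), repeat=n):
        if all(d(w, c) >= dmin for c in code):
            code.append(w)
    return code

C = gilbert_greedy(7, 3)
print(len(C))  # the greedy process cannot stop before |C| * 29 >= 2^7
```

The greedy process only halts once the balls B_{d−1} around the chosen words cover all of X^n, so the resulting code always satisfies |C|·|B_{d−1}(·)| ≥ q^n; the Gilbert bound is the corresponding existence guarantee.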

(5.3) Theorem: Griesmer bound [1960]. a) If C ≤ F_q^n is an [n,k,d]-code such that k ≥ 2, then there is an [n−d, k−1, d*]-code C* ≤ F_q^{n−d} such that d* ≥ ⌈d/q⌉.
b) If C ≤ F_q^n is a non-trivial [n,k,d]-code, then we have Σ_{i=0}^{k−1} ⌈d/q^i⌉ ≤ n.

Proof. We use the Helgert-Stinaff construction [1973]: Let G ∈ F_q^{k×n} be a generator matrix of C. Up to linear equivalence we may assume that the first row of G is [1_d | 0_{n−d}], that is G consists of the blocks [1_d | 0_{n−d}] and [G** | G*], where G** ∈ F_q^{(k−1)×d} and G* ∈ F_q^{(k−1)×(n−d)}; note that k ≥ 2 implies that d < n. We show that the residual code C* ≤ F_q^{n−d} generated by the rows of G* is an [n−d, k−1, d*]-code such that d* ≥ ⌈d/q⌉:
We first show that rk_{F_q}(G*) = k − 1: Assume to the contrary that rk_{F_q}(G*) ≤ k − 2; then up to row operations we may assume that [G** | G*] contains a row [w | 0_{n−d}], for some w ∈ F_q^d. If w = x·1_d ∈ F_q^d for some x ∈ F_q, then we have rk_{F_q}(G) < k, a contradiction; otherwise we have 0_n ≠ [w − x·1_d | 0_{n−d}] ∈ C for all x ∈ F_q, but wt(w − x·1_d) < d for some x ∈ F_q, a contradiction as well.
We show that the minimum weight wt(C*) ∈ N is bounded below by ⌈d/q⌉: Let 0_{n−d} ≠ v ∈ C*, and let [w | v] ∈ C for some w ∈ F_q^d. Then for some x ∈ F_q there are at least ⌈d/q⌉ entries of w equal to x, hence wt(w − x·1_d) ≤ d − ⌈d/q⌉. Since 0_n ≠ [w − x·1_d | v] ∈ C has weight at least d, we conclude that wt(v) ≥ ⌈d/q⌉.
b) This now follows by induction on k, the case k = 1 being trivial: For k ≥ 2 we have n − d ≥ Σ_{i=0}^{k−2} ⌈d*/q^i⌉ ≥ Σ_{i=0}^{k−2} ⌈d/q^{i+1}⌉, thus n ≥ ⌈d/q⁰⌉ + Σ_{i=0}^{k−2} ⌈d/q^{i+1}⌉ = Σ_{i=0}^{k−1} ⌈d/q^i⌉. ]
We have a couple of immediate comments: Firstly, the Griesmer bound for linear codes improves the (non-linear) Plotkin bound: The former entails n ≥ d·Σ_{i=0}^{k−1} q^{−i} = d·(q^k − 1)/(q^{k−1}(q − 1)), or equivalently nq^{k−1}(q − 1) ≥ d(q^k − 1), that is q^k·(d − n·(q−1)/q) ≤ d, which for m = q^k is the Plotkin bound.
Secondly, if C is an MDS code such that k ≥ 2, then the Griesmer bound entails d ≤ q: Assume to the contrary that d > q; then ⌈d/q⌉ ≥ 2, and using d = n − k + 1 we get n ≥ Σ_{i=0}^{k−1} ⌈d/q^i⌉ ≥ d + 2 + Σ_{i=2}^{k−1} 1 = d + 2 + (k − 2) = d + k = n + 1, a contradiction. This shows that the minimum distance of MDS codes is severely restricted; recall that we have observed a similar phenomenon for perfect codes.
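The Griesmer sum is easy to evaluate; the following Python sketch (function name ours) computes the lower bound on the length and checks two codes from these notes against it:

```python
# Griesmer lower bound on the length of an [n, k, d]-code over F_q:
# n >= sum_{i=0}^{k-1} ceil(d / q^i), computed with exact integer ceilings.
def griesmer_length(k, d, q):
    return sum((d + q ** i - 1) // (q ** i) for i in range(k))

print(griesmer_length(4, 3, 2))   # -> 7: the Hamming [7,4,3]-code meets
                                  #    the bound with equality
print(griesmer_length(6, 5, 3))   # -> 11: so does the ternary Golay
                                  #    [11,6,5]-code
```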

(5.4) Theorem: Gilbert–Varshamov bound [1952, 1957]. Let n, k, d ∈ N such that k ≤ n. If |B_{d−2}(0_{n−1})| = Σ_{i=0}^{d−2} \binom{n−1}{i}·(q−1)^i < q^{n−k}, where we let B_{−1}(0_{n−1}) := ∅, then there is an [n, k, d′]-code over F_q such that d′ ≥ d.

Proof. We may assume that d ≥ 2. For k = n the above inequality is fulfilled if and only if d = 1, hence we have k < n. We construct an F_q-generating set B_n := {v_1, ..., v_n} ⊆ F_q^{n−k} such that any (d−1)-subset of B_n is F_q-linearly independent; then [v_1^tr, ..., v_n^tr] ∈ F_q^{(n−k)×n} is the check matrix of a code as desired. We proceed by induction to find subsets B_{n−k} ⊂ B_{n−k+1} ⊂ ... ⊂ B_n, where B_{n−k} := {v_1, ..., v_{n−k}} ⊆ F_q^{n−k} is an F_q-basis. For j ∈ {n−k, ..., n−1}, the number of vectors in F_q^{n−k} being F_q-linear combinations of at most d − 2 elements of B_j is at most Σ_{i=0}^{d−2} \binom{j}{i}·(q−1)^i ≤ Σ_{i=0}^{d−2} \binom{n−1}{i}·(q−1)^i < q^{n−k}, hence there is v_{j+1} ∈ F_q^{n−k} such that any (d−1)-subset of B_{j+1} := B_j ∪ {v_{j+1}} is F_q-linearly independent. ]
We again have an immediate comment, saying that the Gilbert–Varshamov bound for linear codes improves the linear Gilbert bound: Given the inequality Σ_{i=0}^{d−1} \binom{n}{i}·(q−1)^i < q^{n+1−k}, the latter ensures the existence of an [n, k, d′]-code such that d′ ≥ d, while the former even ensures the existence of an [n+1, k, d″]-code C such that d″ ≥ d + 1, which indeed is an improvement, inasmuch as the punctured code C• ≤ F_q^n, since d″ ≥ d + 1 ≥ 2, has F_q-dimension k, and has minimum distance at least d″ − 1 ≥ d. Note that here we have already borrowed the concept of punctured codes. We will freely use the concepts of punctured, shortened and extended codes in the sequel; the details on these constructions are given in (6.1) below.
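The proof is constructive, and for small binary parameters the greedy choice of check-matrix columns can be carried out directly. The sketch below is our own illustration, not part of the notes: columns live in F_2^{n−k} and are encoded as integers (bit masks); a candidate column is accepted if it is not an XOR of at most d−2 columns chosen so far, which is exactly the independence condition used in the proof.

```python
from itertools import combinations

def gv_check_matrix(n, k, d):
    """Greedily pick n nonzero vectors of F_2^(n-k), as bit masks, such
    that any (d-1)-subset is linearly independent, following the
    Gilbert-Varshamov argument; returns None if the greedy choice fails."""
    m = n - k
    cols = []
    for cand in range(1, 2**m):
        # cand must not be an XOR of at most d-2 of the chosen columns
        spanned = {0}
        for r in range(1, min(d - 1, len(cols) + 1)):
            for sub in combinations(cols, r):
                acc = 0
                for c in sub:
                    acc ^= c
                spanned.add(acc)
        if cand not in spanned:
            cols.append(cand)
            if len(cols) == n:
                return cols
    return None

H = gv_check_matrix(13, 4, 5)
print(H is not None and len(H) == 13)   # True
# no XOR of 4 distinct columns vanishes (sizes 1-3 are guaranteed by the
# construction), so the code with this check matrix has minimum distance >= 5
ok = all(a ^ b ^ c ^ e != 0 for a, b, c, e in combinations(H, 4))
print(ok)                               # True
```

The counting argument of the proof guarantees that for [13, 4, 5] over F_2 the greedy search never gets stuck, since 299 < 2^9.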

(5.5) Optimal codes. Let F_q be the field with q elements, being kept fixed. For n, d ∈ N such that d ≤ n let
K_q(n, d) := max{k ∈ N; there is an [n, k, d′]-code over F_q such that d′ ≥ d};
note that the existence of the repetition [n, 1, n]-code entails that K_q(n, d) ≤ n is well-defined. In particular, for d = 1, the [n, n, 1]-code F_q^n shows that K_q(n, 1) = n; for d = n, since any code C ≤ F_q^n such that d(C) = n is 1-dimensional, we conclude that K_q(n, n) = 1.

We have K_q(n, d) = max{k ∈ N; there is an [n, k, d]-code over F_q}, as well as K_q(n, d+1) ≤ K_q(n, d): Let C be an [n, k, d+1]-code, where we may assume that there is [x_1, ..., x_n] ∈ C such that x_n ≠ 0 and having minimal weight d + 1 ≥ 2. Then the punctured code C• ≤ F_q^{n−1} is an [n−1, k, d]-code. Hence, adding an n-th component consisting of zeroes only, we get the [n, k, d]-code {[x_1, ..., x_{n−1}, 0] ∈ F_q^n; [x_1, ..., x_{n−1}] ∈ C•} ≤ F_q^n, proving both assertions.
A code C ≤ F_q^n such that d(C) = d and dim_{F_q}(C) = K_q(n, d) is called optimal.
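For very small parameters K_q(n, d) can be determined by exhaustive search. The sketch below (our own illustration, binary only, names ours) tries all candidate generator row sets of each dimension; it is only feasible for tiny n, which already hints at why determining K_q(n, d) in general is hard.

```python
from itertools import combinations

def K2(n, d):
    """K_2(n, d) by brute force: largest k such that some k independent
    vectors of F_2^n (as bit masks) span a code of minimum weight >= d."""
    best = 0
    for k in range(1, n + 1):
        found = False
        for rows in combinations(range(1, 2**n), k):
            words = {0}
            for r in rows:
                words |= {w ^ r for w in words}
            # len(words) == 2**k means the chosen rows are independent
            if len(words) == 2**k and \
                    min(bin(w).count("1") for w in words if w) >= d:
                found = True
                break
        if found:
            best = k
        else:
            break
    return best

print(K2(5, 1))   # 5
print(K2(5, 3))   # 2
print(K2(4, 3))   # 1
```

For instance K_2(5, 3) = 2, in accordance with the Hamming bound 2^{5−k} ≥ 6.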

The bounds discussed above, and repeated below, provide bounds for K_q(n, d), where (i)–(iv) are upper ones, while (v)–(vii) are lower ones; recall that (i)–(iii) and (v) also hold for non-linear codes. Thus any linear code attaining one of the upper bounds mentioned is optimal; for example, MDS codes and perfect codes are so, but for these classes of codes d is severely restricted. Actually, K_q(n, d) is not at all known precisely, and improving the bounds for K_q(n, d), aiming at determining K_q(n, d) exactly, and finding and classifying optimal codes continue to be major problems of combinatorial coding theory.

Upper bounds on k ≥ 1 for an [n, k, d]-code over F_q to possibly exist are:

i) Singleton bound: k ≤ n − d + 1;
ii) Hamming bound: Σ_{i=0}^{⌊(d−1)/2⌋} \binom{n}{i}·(q−1)^i ≤ q^{n−k};
iii) Plotkin bound: q^k·(d − n·(q−1)/q) ≤ d;
iv) Griesmer bound: Σ_{i=0}^{k−1} ⌈d/q^i⌉ ≤ n.
Lower bounds on k for an [n, k, d′]-code over F_q such that d′ ≥ d to exist are:
v) Gilbert bound: Σ_{i=0}^{d−1} \binom{n}{i}·(q−1)^i ≤ q^{n−k};
vi) linear Gilbert bound: Σ_{i=0}^{d−1} \binom{n}{i}·(q−1)^i < q^{n+1−k};
vii) Gilbert–Varshamov bound: Σ_{i=0}^{d−2} \binom{n−1}{i}·(q−1)^i < q^{n−k}.
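These seven bounds can be tabulated mechanically. The sketch below (function names are ours) computes, for given n, d, q, the largest k allowed by the Singleton, Hamming and Griesmer bounds and the largest k guaranteed by the Gilbert, linear Gilbert and Gilbert–Varshamov bounds, reproducing the numbers of the example for q = 2, n = 13, d = 5:

```python
from math import comb

def ball(n, r, q):
    """|B_r(0_n)| = sum_{i=0}^{r} binom(n, i) * (q-1)^i."""
    return sum(comb(n, i) * (q - 1)**i for i in range(r + 1))

def upper_bounds(n, d, q=2):
    """Largest k allowed by Singleton, Hamming and Griesmer."""
    singleton = n - d + 1
    hamming = max(k for k in range(1, n + 1)
                  if ball(n, (d - 1) // 2, q) <= q**(n - k))
    griesmer = max(k for k in range(1, n + 1)
                   if sum(-(-d // q**i) for i in range(k)) <= n)
    return singleton, hamming, griesmer

def lower_bounds(n, d, q=2):
    """Largest k guaranteed by Gilbert, linear Gilbert, Gilbert-Varshamov."""
    gilbert = max((k for k in range(1, n + 1)
                   if ball(n, d - 1, q) <= q**(n - k)), default=0)
    lin_gilbert = max((k for k in range(1, n + 1)
                       if ball(n, d - 1, q) < q**(n + 1 - k)), default=0)
    gv = max((k for k in range(1, n + 1)
              if ball(n - 1, d - 2, q) < q**(n - k)), default=0)
    return gilbert, lin_gilbert, gv

print(upper_bounds(13, 5))   # (9, 6, 6)
print(lower_bounds(13, 5))   # (2, 3, 4)
```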

Example. For q = 2, n = 13 and d = 5 we get the following:
i) The Singleton bound yields k ≤ 9; ii) the Hamming bound yields 2^{13−k} ≥ Σ_{i=0}^{2} \binom{13}{i} = 92, that is k ≤ 6; iv) the Griesmer bound yields Σ_{i=0}^{k−1} ⌈5/2^i⌉ = 5 + 3 + 2 + 1 + 1 + 1 + ... ≤ 13, that is k ≤ 6.
iii) The Plotkin bound, since 5 ≤ 13/2, does not apply immediately. But, noting that we may assume that k ≥ 4 by the lower bounds to follow, shortening a binary [13, k, 5]-code four times (and possibly puncturing and adding zero components) yields a [9, k−4, 5]-code, for which the Plotkin bound yields 2^{k−4}·(5 − 9/2) ≤ 5, that is k ≤ 7. We can do even better: if we first extend a binary [13, k, 5]-code to a [14, k, 6]-code, and then shorten it three times to obtain an [11, k−3, 6]-code, then the Plotkin bound yields 2^{k−3}·(6 − 11/2) ≤ 6, that is k ≤ 6.
Conversely, v) the Gilbert bound yields 2^{13−k} ≥ Σ_{i=0}^{4} \binom{13}{i} = 1093, that is k ≥ 2; vi) the linear Gilbert bound yields 2^{14−k} > 1093, that is k ≥ 3; vii) the Gilbert–Varshamov bound yields 2^{13−k} > Σ_{i=0}^{3} \binom{12}{i} = 299, that is k ≥ 4.
Hence there is a binary [13, 4, 5]-code, and there is no binary [13, 7, 5]-code, but it is left open whether a binary [13, 5, 5]-code or a binary [13, 6, 5]-code exists. We show that K_2(13, 5) = 5, so that the bounds given are not sharp; we note that actually there is a (unique, optimal, non-linear) binary (13, 2^6, 5)-code, which via shortening is related to the (optimal, non-linear) binary Nadler (12, 2^5, 5)-code:
Assume that there is a binary [13, 6, 5]-code. Then by the Helgert–Stinaff construction there is a residual [8, 5, d]-code, where d ≥ 3. But now the Hamming bound 9 = Σ_{i=0}^{1} \binom{8}{i} ≤ Σ_{i=0}^{⌊(d−1)/2⌋} \binom{8}{i} ≤ 2^{8−5} = 8 yields a contradiction. Hence there is no binary [13, 6, 5]-code.
But there is a binary [13, 5, 5]-code: In view of the Helgert–Stinaff construction we begin with a binary [8, 4, 4]-code, namely the extended Hamming code C* := Ĥ_3 ≤ F_2^8, see also (6.3), which has generator and check matrices as follows:
        [ . . . 1 1 1 1 . ]            [ . . . 1 1 1 1 . ]
G* :=   [ . 1 1 1 1 . . . ]    H* :=   [ . 1 1 . . 1 1 . ]
        [ 1 1 . . 1 1 . . ]            [ 1 . 1 . 1 . 1 . ]
        [ 1 1 1 1 1 1 1 1 ]            [ 1 1 1 1 1 1 1 1 ]

As any 3-subset of columns of H* is F_2-linearly independent, we conclude that

indeed d(C*) = 4. Now let C ≤ F_2^{13} be the code generated by

        [ 1 1 1 1 1 . . . . . . . . ]
        [ . . 1 . . . . . 1 1 1 1 . ]
G :=    [ . . . 1 . . 1 1 1 1 . . . ]    ∈ F_2^{5×13}.
        [ . . . . 1 1 1 . . 1 1 . . ]
        [ . . . . . 1 1 1 1 1 1 1 1 ]

We show that d(C) = 5: Let 0 ≠ v = [v′ | v″] ∈ C, where v′ ∈ F_2^5 and v″ ∈ F_2^8. If v′ = 0, then we have v″ = 1_8, hence wt(v) = 8; if v″ = 0, then we have v′ = 1_5, hence wt(v) = 5; if both v′ ≠ 0 and v″ ≠ 0, then we have wt(v) = wt(v′) + wt(v″) ≥ 1 + 4 = 5. ]
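The claim can also be verified by brute force. The sketch below (ours) hard-codes the rows of the generator matrix G given above, enumerates all 2^5 codewords of their span over F_2, and computes the minimum weight:

```python
from itertools import product

# rows of the generator matrix G of the [13, 5, 5]-code given above
G = [
    [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
]

codewords = set()
for coeffs in product((0, 1), repeat=5):
    word = [0] * 13
    for c, row in zip(coeffs, G):
        if c:
            word = [a ^ b for a, b in zip(word, row)]
    codewords.add(tuple(word))

print(len(codewords))                              # 32, hence dim(C) = 5
print(min(sum(w) for w in codewords if any(w)))    # 5, hence d(C) = 5
```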

(5.6) Asymptotic bounds. a) We consider the question how good codes might be asymptotically, for n ≫ 0. Since the error correction capabilities of a non-trivial [n, k, d]-code C ≤ F_q^n, which are governed by its minimum distance d, should grow proportionally with respect to its length n, we let δ(C) := d/n ≤ 1 be the relative minimum distance of C; recall that ρ(C) = k/n ≤ 1 is the information rate of C.
For 0 ≤ δ ≤ 1 we let κ_q(δ) := limsup_{n→∞} (1/n)·K_q(n, ⌈δn⌉), that is
κ_q(δ) = limsup_{n→∞} max{ k/n ∈ R; there is an [n, k, d]-code such that d/n ≥ δ }.

In other words, 0 ≤ κ_q(δ) ≤ 1 is largest such that there is a sequence of codes of unbounded length whose relative minimum distance approaches δ from above, and whose information rate tends towards κ_q(δ).

Hence κ_q(δ) is decreasing, where for δ = 0 from K_q(n, 1) = n we get κ_q(0) = 1, while for δ = 1 from K_q(n, n) = 1 we get κ_q(1) = 0. Again, the bounds (i)–(iv) above provide upper bounds for κ_q(δ), while (v)–(vii) give lower bounds for κ_q(δ), but in general κ_q(δ) is not known precisely. We proceed to derive the associated asymptotic bounds explicitly; they are depicted for q = 2 in Table 4. In order to do so, we need a few preparations:
b) For 0 < α ≤ (q−1)/q let H_q(α) := α·log_q(q−1) − α·log_q(α) − (1−α)·log_q(1−α); since lim_{α→0+} H_q(α) = 0 we extend H_q(α) continuously to α = 0 by letting H_q(0) := 0. We have H_q((q−1)/q) = ((q−1)/q)·log_q(q−1) − ((q−1)/q)·log_q((q−1)/q) − (1/q)·log_q(1/q) = 1.
Differentiating yields ∂/∂α H_q(α) = log_q(q−1) + log_q((1−α)/α), for 0 < α ≤ (q−1)/q. Hence we have ∂/∂α H_q(α) = 0 if and only if α/(1−α) = q − 1, or equivalently α = (q−1)/q. Since ∂/∂α H_q(α) is strictly decreasing, such that ∂/∂α H_q(α) → ∞ for α → 0+, we conclude that ∂/∂α H_q(α) > 0 for all 0 < α < (q−1)/q, implying that H_q(α) is strictly increasing and strictly concave for 0 ≤ α ≤ (q−1)/q.

Note that for q = 2 we recover the conditional entropy H(Y|X) = −p·log_2(p) − (1−p)·log_2(1−p) = H_2(p) of a symmetric binary channel with error probability 0 ≤ p ≤ 1/2. Actually, H_q(p) is the conditional entropy of a symmetric q-ary channel with error probability 0 ≤ p ≤ (q−1)/q, for which reason H_q is also called the associated entropy function.

Lemma. Let 0 ≤ δ < (q−1)/q, and let d_n ∈ N such that d_n/n → δ for n → ∞. Then

(1/n)·log_q(|B_{d_n}(0_n)|) = (1/n)·log_q( Σ_{i=0}^{d_n} \binom{n}{i}·(q−1)^i ) → H_q(δ) for n → ∞.

Proof. Let 0 ≤ i ≤ j ≤ n·(q−1)/q. Then we have \binom{n}{i}·(q−1)^i ≤ \binom{n}{j}·(q−1)^j: The assertion is equivalent to Π_{s=i+1}^{j} s/(n+1−s) = (j!/i!)·((n−j)!/(n−i)!)^{−1}·... , more precisely to \binom{n}{i}/\binom{n}{j} = Π_{s=i+1}^{j} s/(n+1−s) ≤ (q−1)^{j−i}. The factors of the product are increasing with s increasing, hence the left hand side is bounded above by (j/(n+1−j))^{j−i}. Now, from j ≤ n·(q−1)/q we get j + j(q−1) = jq ≤ n(q−1) ≤ (n+1)(q−1), implying j ≤ (n+1−j)(q−1), thus j/(n+1−j) ≤ q−1. Hence we conclude that the left hand side indeed is bounded above by (q−1)^{j−i}.
This yields \binom{n}{j}·(q−1)^j ≤ Σ_{i=0}^{j} \binom{n}{i}·(q−1)^i ≤ (j+1)·\binom{n}{j}·(q−1)^j for j ≤ n·(q−1)/q. In particular, this applies to j := d_n for n ≫ 0. Since (1/n)·log_q(n·(q−1)/q) → 0 for n → ∞, it suffices to show that L_n := (1/n)·log_q( \binom{n}{j}·(q−1)^j ) → H_q(δ).
To this end, note that Stirling's formula lim_{n→∞} n!·e^n/(n^n·√(2πn)) = 1 implies that n! = (n/e)^n·√(2πn)·(1 + o(1)) for n → ∞, where o(1): N → R fulfills o(1) → 0 for n → ∞. Thus we get log_q(n!) = (n + 1/2)·log_q(n) − n·log_q(e) + log_q(√(2π)) + o(1). From this we obtain log_q( \binom{n}{j}·(q−1)^j ) = (n + 1/2)·log_q(n) − (j + 1/2)·log_q(j) − (n − j + 1/2)·log_q(n−j) + j·log_q(q−1) + o(1). This entails L_n = log_q(n) − (j/n)·log_q((j/n)·n) − ((n−j)/n)·log_q(((n−j)/n)·n) + (j/n)·log_q(q−1) + o(1) = −(j/n)·log_q(j/n) − ((n−j)/n)·log_q((n−j)/n) + (j/n)·log_q(q−1) + o(1). Since j/n = d_n/n → δ for n → ∞, this yields L_n → δ·log_q(q−1) − δ·log_q(δ) − (1−δ)·log_q(1−δ) = H_q(δ). ]
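The convergence in the Lemma can be observed numerically. The sketch below (an illustration of ours, for q = 2) compares (1/n)·log_2 |B_{⌊δn⌋}(0_n)| with H_2(δ) for a moderately large n:

```python
from math import comb, log2

def H2(a):
    """Binary entropy H_2(a) = -a*log2(a) - (1-a)*log2(1-a)."""
    return 0.0 if a in (0.0, 1.0) else -a * log2(a) - (1 - a) * log2(1 - a)

def ball_rate(n, delta):
    """(1/n) * log2 |B_d(0_n)| over F_2, with d = floor(delta * n)."""
    d = int(delta * n)
    # exact big-integer arithmetic; math.log2 accepts arbitrarily large ints
    return log2(sum(comb(n, i) for i in range(d + 1))) / n

n, delta = 2000, 0.3
print(abs(ball_rate(n, delta) - H2(delta)) < 0.01)   # True
```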

(5.7) Asymptotic upper bounds. Let C be a non-trivial [n, k, d]-code. The Singleton bound says that ρ(C) = k/n ≤ 1 + 1/n − d/n = 1 + 1/n − δ(C). Hence, given 0 ≤ δ ≤ 1, for any code such that δ(C) ≥ δ we have ρ(C) ≤ 1 + 1/n − δ. Thus for n → ∞ we get the asymptotic Singleton bound κ_q(δ) ≤ 1 − δ. Note that it is superseded by the asymptotic Plotkin bound given below.

Theorem: Asymptotic Hamming bound. We have κ_q(δ) ≤ 1 − H_q(δ/2).

Proof. We may assume that δ < 1. Since for a ≥ 0 we have 2·⌈a/2⌉ ≤ ⌈a⌉ + 1, we may weaken the condition d ≥ ⌈δn⌉ by assuming that d ≥ 2·⌈(δ/2)·n⌉ − 1. The Hamming bound q^k·|B_{⌊(d−1)/2⌋}(0_n)| ≤ q^n yields k ≤ n − log_q(|B_{⌊(d−1)/2⌋}(0_n)|) ≤ n − log_q(|B_{⌈(δ/2)·n⌉−1}(0_n)|), that is ρ(C) = k/n ≤ 1 − (1/n)·log_q(|B_{⌈(δ/2)·n⌉−1}(0_n)|). Thus, since (1/n)·(⌈(δ/2)·n⌉ − 1) → δ/2 for n → ∞, and δ/2 < 1/2 ≤ (q−1)/q, by the Lemma in (5.6) we get κ_q(δ) ≤ 1 − H_q(δ/2). ]
Next, we consider the Plotkin bound, which yields the asymptotic result to follow. The Griesmer bound, although for specific cases often being better than the Plotkin bound, yields the same asymptotic bound; indeed, asymptotically there is no loss in applying the estimate used to show in the first place that the Griesmer bound implies the Plotkin bound.

Theorem: Asymptotic Plotkin bound. We have κ_q(δ) = 0 for (q−1)/q ≤ δ ≤ 1, and κ_q(δ) ≤ 1 − (q/(q−1))·δ for 0 ≤ δ ≤ (q−1)/q.

Proof. We may assume that δ ∉ {0, (q−1)/q}. The Plotkin bound q^k·(d − n·(q−1)/q) ≤ d can be written as q^k·(d/n − (q−1)/q) ≤ d/n, in other words q^k·(δ(C) − (q−1)/q) ≤ δ(C). For δ(C) > (q−1)/q we have q^k ≤ δ(C)/(δ(C) − (q−1)/q). Thus if (q−1)/q < δ ≤ δ(C) then k is bounded above, implying that ρ(C) = k/n → 0, hence κ_q(δ) = 0.
Let now δ(C) < (q−1)/q, and we may assume that d ≥ 2. Letting n′ := ⌊q(d−1)/(q−1)⌋, we get 1 ≤ n′ ≤ q(d−1)/(q−1) ≤ n(d−1)/d < n and d − n′·(q−1)/q = d − ((q−1)/q)·⌊q(d−1)/(q−1)⌋ ≥ d − ((q−1)/q)·(q(d−1)/(q−1)) = 1. Shortening n − n′ times we get an [n′, k′, d′]-code where k′ ≥ k − (n − n′) and d′ ≥ d ≥ 2. Hence there also is an [n′, k′, d]-code, which by the Plotkin bound fulfills q^{k′} ≤ d/(d − n′·(q−1)/q) ≤ d. Thus we have q^{k−(n−n′)} ≤ q^{k′} ≤ d, hence k ≤ n − n′ + log_q(d), thus ρ(C) = k/n ≤ 1 − n′/n + (1/n)·log_q(d). Hence, if 0 < δ < (q−1)/q, then lim_{n→∞} n′/n = lim_{n→∞} (1/n)·⌊q(d−1)/(q−1)⌋ = (q/(q−1))·lim_{n→∞} (d−1)/n = (q/(q−1))·δ, and since (1/n)·log_q(d) → 0 anyway, we infer that κ_q(δ) ≤ 1 − (q/(q−1))·δ. ]

(5.8) Asymptotic lower bounds. We provide an asymptotic lower bound, which is based on the Gilbert-Varshamov bound. The weaker estimates given by the Gilbert bound and the linear Gilbert bound yield the same asymptotic bound; indeed, the estimates used in the proof given below show that essentially the Gilbert bound is used.

Theorem: Asymptotic Gilbert–Varshamov bound. For 0 ≤ δ ≤ (q−1)/q we have κ_q(δ) ≥ 1 − H_q(δ).

Proof. We may assume that 0 < δ < (q−1)/q and that d := ⌈δn⌉ ≥ 2. Let k ∈ N be maximal such that |B_{d−2}(0_{n−1})| < q^{n−k}; then by the Gilbert–Varshamov bound there exists an [n, k, d′]-code C such that d′ ≥ d. By maximality we have q^{k+1} ≥ q^n/|B_{d−2}(0_{n−1})|, and thus we have ρ(C) + 1/n = (k+1)/n = (1/n)·log_q(q^{k+1}) ≥ 1 − (1/n)·log_q(|B_{d−2}(0_{n−1})|) ≥ 1 − (1/n)·log_q(|B_d(0_n)|), which, since d/n → δ for n → ∞, by the Lemma in (5.6) entails κ_q(δ) ≥ 1 − H_q(δ). ]

Table 4: Asymptotic bounds for q = 2.
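The q = 2 curves depicted in Table 4 can be evaluated directly. The sketch below (ours) tabulates the asymptotic Singleton, Hamming and Plotkin upper bounds and the Gilbert–Varshamov lower bound, and checks that the lower bound never exceeds any upper bound:

```python
from math import log2

def H2(a):
    """Binary entropy function."""
    return 0.0 if a in (0.0, 1.0) else -a * log2(a) - (1 - a) * log2(1 - a)

def bounds(delta):
    """Asymptotic bounds on kappa_2(delta), for 0 <= delta <= 1/2:
    (Singleton, Hamming, Plotkin upper bounds; Gilbert-Varshamov lower)."""
    singleton = 1 - delta
    hamming = 1 - H2(delta / 2)
    plotkin = 1 - 2 * delta
    gv = 1 - H2(delta)
    return singleton, hamming, plotkin, gv

for delta in (0.1, 0.2, 0.3, 0.4):
    s, h, p, g = bounds(delta)
    assert g <= min(s, h, p)    # GV lies below every upper bound
    print(f"delta={delta}: GV {g:.3f} <= best upper {min(s, h, p):.3f}")
```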

(5.9) Remark. a) We mention a few further, better asymptotic bounds: The asymptotic Elias–Bassalygo bound [1967], being based on an improvement of the strategy used to prove the Plotkin bound, says that for 0 ≤ δ ≤ (q−1)/q we have κ_q(δ) ≤ 1 − H_q( (q−1)/q − √( ((q−1)/q)·((q−1)/q − δ) ) ); in particular, for q = 2 we get κ_q(δ) ≤ 1 − H_2( (1 − √(1 − 2δ))/2 ). This improves the Hamming bound and the Plotkin bound, and was the best asymptotic upper bound at that time.

For q = 2 (and δ_0 < δ, where δ_0 ≈ 0.15) the latter is superseded by the asymptotic McEliece–Rodemich–Rumsey–Welch bound [1977], based on the linear programming bound [Delsarte, 1973], saying that for 0 ≤ δ ≤ 1/2 we have κ_q(δ) ≤ H_2( 1/2 − √(δ(1−δ)) ), and being the best asymptotic upper bound known.
On the other side, the asymptotic Gilbert–Varshamov bound was long considered to be the best possible asymptotic lower bound. But using algebraic-geometric codes, which in turn are based on the idea of Goppa codes [1981], Tsfasman–Vladut–Zink [1982] provided a series of linear codes over F_{p^2}, where p ≥ 7, which exceed the asymptotic Gilbert–Varshamov bound. These are still the asymptotically best codes known.
b) Finally, a similar approach works for non-linear codes: Let X be a fixed alphabet such that q := |X|. For n, d ∈ N such that d ≤ n let

M(n, d) := max{m ∈ N; there is an (n, m, d′)-code over X such that d′ ≥ d},

and for 0 ≤ δ ≤ 1 let µ(δ) := limsup_{n→∞} (1/n)·log_q(M(n, ⌈δn⌉)), that is

µ(δ) = limsup_{n→∞} max{ log_q(m)/n ∈ R; there is an (n, m, d)-code such that d/n ≥ δ }.

Since the Singleton bound, the Hamming bound, the Plotkin bound and the Gilbert bound are all non-linear bounds, the asymptotic bounds given above also hold for non-linear codes. Similarly, the Elias-Bassalygo bound and the McEliece-Rodemich-Rumsey-Welch bound hold for non-linear codes.

6 Hamming codes

(6.1) Modifying codes. Let C be an [n, k, d]-code over F_q, with generator matrix G ∈ F_q^{k×n} and check matrix H ∈ F_q^{(n−k)×n}.
a) i) Puncturing by deleting the n-th component, for n ≥ 2, yields the code C• := {[x_1, ..., x_{n−1}] ∈ F_q^{n−1}; [x_1, ..., x_n] ∈ C} ≤ F_q^{n−1}. Hence, using the F_q-linear map F_q^n → F_q^{n−1}: [x_1, ..., x_n] ↦ [x_1, ..., x_{n−1}], having kernel ⟨[0, ..., 0, 1]⟩_{F_q} ≤ F_q^n, shows that k − 1 ≤ k• := dim_{F_q}(C•) ≤ k; in particular, if d ≥ 2, or x_n = 0 for all [x_1, ..., x_n] ∈ C, then k• = k. Moreover, if C• is non-trivial, then for its minimum distance we have d − 1 ≤ d• ≤ d; in particular, if x_n = 0 for all [x_1, ..., x_n] ∈ C, then d• = d.
ii) Extending by adding a check symbol yields Ĉ := {[x_1, ..., x_n, x_{n+1}] ∈ F_q^{n+1}; [x_1, ..., x_n] ∈ C, Σ_{i=1}^{n+1} x_i = 0} ≤ F_q^{n+1}. Hence Ĉ has check matrix
Ĥ := [ H 0_{n−k}^tr ; 1_n 1 ] ∈ F_q^{(n+1−k)×(n+1)},
thus k̂ := dim_{F_q}(Ĉ) = k.

If C is non-trivial, then Cb has minimum distance d ≤ db ≤ d + 1: Since any (d − 1)-subset of columns of H is Fq-linearly independent, any (d − 1)-subset of columns of Hb is Fq-linearly independent as well, and since there is an Fq- linearly dependent d-subset of columns of H, there is an Fq-linearly dependent (d + 1)-subset of columns of Hb.

In particular, for q = 2 the additional condition corresponding to the row [1_n, 1] in Ĥ amounts to wt(v) ∈ N_0 being even, for all v ∈ Ĉ; hence if C is non-trivial then d̂ is even, implying d̂ = d + 1 for d odd, and d̂ = d for d even.
Puncturing the extended code Ĉ ≤ F_q^{n+1} we recover (Ĉ)• = {[x_1, ..., x_n] ∈ F_q^n; [x_1, ..., x_n, x_{n+1}] ∈ Ĉ} = {[x_1, ..., x_n] ∈ F_q^n; [x_1, ..., x_n] ∈ C} = C ≤ F_q^n.
b) i) Expurgating by throwing away codewords yields C′ := {[x_1, ..., x_n] ∈ C; Σ_{i=1}^{n} x_i = 0} ≤ C ≤ F_q^n. Hence we have k − 1 ≤ k′ := dim_{F_q}(C′) ≤ k, and for the minimum distance of C′ we have d′ ≥ d. If k′ = k − 1 then C′ has check matrix H′ := [ H ; 1_n ] ∈ F_q^{(n−k+1)×n}; in particular, if 1_n ∈ C then we have 1_n ∉ C′ if and only if gcd(q, n) = 1, thus in this case k′ = k − 1.
In particular, for q = 2 we have C′ := {v ∈ C; wt(v) ∈ N_0 even} ≤ C, being called the even-weight subcode of C; hence if C′ is non-trivial then d′ is even, and we have k′ = k − 1 if and only if C contains elements of odd weight.
ii) Augmenting by adding new codewords yields C̃ := ⟨C, 1_n⟩_{F_q} ≤ F_q^n. Hence

we have k ≤ k̃ := dim_{F_q}(C̃) ≤ k + 1, and for the minimum distance of C̃ we have d̃ ≤ d. We have k̃ = k + 1 if and only if 1_n ∉ C; in this case C̃ has generator matrix G̃ := [ G ; 1_n ] ∈ F_q^{(k+1)×n}.

In particular, for q = 2 we have C̃ = C ∪ (1_n + C) ≤ F_2^n, consisting of the elements of C and their complements. If C is non-trivial, since wt(1_n + v) = n − wt(v) for all v ∈ F_2^n, we get d̃ = min{d, n − δ}, where δ := max{wt(v) ∈ N_0; 1_n ≠ v ∈ C}; if 1_n ∈ C we have 1_n + C = C and thus δ = n − d and d̃ = d anyway.
If 1_n ∈ C, then augmenting the expurgated code C′ ≤ C ≤ F_q^n yields C′ ≤ (C′)~ = ⟨C′, 1_n⟩_{F_q} ≤ C, hence (C′)~ = C if gcd(q, n) = 1, and (C′)~ = C′ if gcd(q, n) > 1. Moreover, if gcd(q, n) > 1, then we have ⟨1_n⟩_{F_q} = (⟨1_n⟩_{F_q})′, and thus expurgating the augmented code C̃ ≤ F_q^n yields (C̃)′ = (⟨C, 1_n⟩_{F_q})′ = ⟨C′, 1_n⟩_{F_q} = (C′)~.
c) i) Shortening by taking a cross section, for n ≥ 2, is the concatenation of n-expurgation and subsequent puncturing, yielding the code C◦ := {[x_1, ..., x_{n−1}] ∈ F_q^{n−1}; [x_1, ..., x_n] ∈ C, x_n = 0} = (C^{(n)})• ≤ F_q^{n−1}, where C^{(n)} := {[x_1, ..., x_n] ∈ C; x_n = 0} ≤ F_q^n. Hence we have k − 1 ≤ k◦ := dim_{F_q}(C◦) ≤ k, where we have k◦ = k − 1 if and only if C^{(n)} < C, that is if and only if there is [x_1, ..., x_n] ∈ C such that x_n ≠ 0. For the minimum distance of C◦ we have d◦ ≥ d. If k◦ = k − 1, then C◦ is an [n−1, k−1, d◦]-code, amounting to deleting an information symbol, which has check matrix H◦ ∈ F_q^{(n−k)×(n−1)} obtained from H by deleting column n.
ii) Lengthening is the concatenation of augmentation and subsequent extension, yielding (C̃)^ ≤ F_q^{n+1}. Thus we have k ≤ dim_{F_q}((C̃)^) ≤ k + 1, with equality k if and only if C̃ = C, that is if and only if 1_n ∈ C. For the minimum distance of (C̃)^ we have d((C̃)^) ≤ d + 1. If dim_{F_q}((C̃)^) = k + 1, then (C̃)^ is an [n+1, k+1, d((C̃)^)]-code, thus amounting to adding an information symbol.
Thus, shortening the extended code Ĉ ≤ F_q^{n+1} yields (Ĉ)◦ = {[x_1, ..., x_n] ∈ F_q^n; [x_1, ..., x_{n+1}] ∈ Ĉ, x_{n+1} = 0} = {[x_1, ..., x_n] ∈ C; Σ_{i=1}^{n} x_i = 0} = C′. In particular, shortening the lengthened code (C̃)^ ≤ F_q^{n+1} yields ((C̃)^)◦ = (C̃)′.
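Over F_2 these modifications are easy to experiment with when a code is represented simply as the set of its codewords. The sketch below (our own illustration, not from the notes) implements puncturing, extending, expurgating, augmenting and shortening, and checks the identities (Ĉ)• = C and (Ĉ)◦ = C′ on a binary [7, 4, 3] Hamming code:

```python
def extend(C):       # add an overall parity check symbol
    return {v + (sum(v) % 2,) for v in C}

def puncture(C):     # delete the last component
    return {v[:-1] for v in C}

def expurgate(C):    # keep the even-weight codewords only
    return {v for v in C if sum(v) % 2 == 0}

def augment(C):      # adjoin the all-ones word 1_n
    n = len(next(iter(C)))
    return C | {tuple((a + 1) % 2 for a in v) for v in C}

def shorten(C):      # cross section: keep x_n = 0, then puncture
    return {v[:-1] for v in C if v[-1] == 0}

# a binary [7, 4, 3] Hamming code, given by three parity equations
H = {(d1, d2, d3, d4, d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4)
     for d1 in (0, 1) for d2 in (0, 1) for d3 in (0, 1) for d4 in (0, 1)}

wt = lambda C: min(sum(v) for v in C if any(v))
print(wt(H), wt(extend(H)))                 # 3 4
print(puncture(extend(H)) == H)             # True
print(shorten(extend(H)) == expurgate(H))   # True
```

Since 1_7 lies in the Hamming code, augmenting it returns the code itself, in line with the remarks above.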

(6.2) Hamming codes [1950]. Let P^{k−1}(F_q) := {⟨v⟩_{F_q} ≤ F_q^k; 0 ≠ v ∈ F_q^k} be the (k−1)-dimensional projective space over F_q, where k ≥ 2, and let n := |P^{k−1}(F_q)| = (q^k − 1)/(q − 1) ≥ 3. Let H_k ∈ F_q^{k×n} have columns being in bijection with P^{k−1}(F_q); note that H_k is unique up to linear equivalence. Thus we have rk_{F_q}(H_k) = k, and any 2-subset of columns of H_k is F_q-linearly independent, while there are F_q-linearly dependent 3-subsets of columns.
Let the Hamming code H_k ≤ F_q^n be defined by having check matrix H_k; hence H_k is unique up to linear equivalence. Thus H_k is an [n, n−k, 3]-code, which is perfect, since q^{n−k}·Σ_{i=0}^{1} \binom{n}{i}·(q−1)^i = q^{n−k}·(1 + n(q−1)) = q^{n−k}·q^k = q^n.

Conversely, any [n, n−k, 3]-code C is linearly equivalent to H_k: Let H ∈ F_q^{k×n} be a check matrix of C; then any 2-subset of columns of H is F_q-linearly independent, that is, the columns of H generate pairwise distinct 1-dimensional subspaces of F_q^k, and hence the n = |P^{k−1}(F_q)| columns are in bijection with P^{k−1}(F_q).
We may choose the columns of H_k ∈ F_q^{k×n} according to the q-ary representations of the integers i ∈ {1, ..., q^k − 1} having 1 as their highest digit. This allows for a fast decoding algorithm: Since H_k is perfect of minimum distance 3, and hence 1-error correcting, the non-trivial coset leaders are precisely the vectors x·e_1, ..., x·e_n ∈ F_q^n of weight 1, where x ∈ F_q^* and e_i ∈ F_q^n is the i-th unit vector. Given v = w + x·e_i ∈ F_q^n \ H_k, where w ∈ H_k, the associated syndrome is v·H_k^tr = x·e_i·H_k^tr ∈ F_q^k, that is, the x-fold of the i-th row of H_k^tr, which can be translated back into the q-ary representation of an integer with highest digit 1 and a scalar, revealing both the position of the error and how to correct it.
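For q = 2 the decoding scheme just described is particularly simple: the syndrome, read as a binary number, is exactly the position of the error. A sketch of ours, with the columns of H_k ordered by the binary representation of the column index as in (6.3) below:

```python
def hamming_check_matrix(k):
    """H_k over F_2: column i (1-based) is the binary representation of i."""
    n = 2**k - 1
    return [[(i >> (k - 1 - row)) & 1 for i in range(1, n + 1)]
            for row in range(k)]

def syndrome_decode(v, k):
    """Correct at most one bit error in the received word v (list of bits)."""
    H = hamming_check_matrix(k)
    s = [sum(h * x for h, x in zip(row, v)) % 2 for row in H]
    pos = int("".join(map(str, s)), 2)   # syndrome read as a binary number
    w = list(v)
    if pos:                              # pos is the 1-based error position
        w[pos - 1] ^= 1
    return w

# flip one bit of a codeword of the [7, 4, 3] Hamming code and decode;
# c is a codeword since columns 1, 2, 3 of H_3 sum to zero
c = [1, 1, 1, 0, 0, 0, 0]
H = hamming_check_matrix(3)
assert all(sum(h * x for h, x in zip(row, c)) % 2 == 0 for row in H)
r = list(c); r[4] ^= 1                   # error in position 5
print(syndrome_decode(r, 3) == c)        # True
```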

(6.3) Binary Hamming codes. Keeping the notation of (6.2), let q := 2. Ordering the columns of H_k ∈ F_2^{k×n}, where k ≥ 2 and n = 2^k − 1, according to the binary representation of i ∈ {1, ..., n}, and letting H_1 := [1] ∈ F_2^{1×1}, we recursively get
[0_k^tr | H_k] = [ 0_{2^{k−1}} 1_{2^{k−1}} ; [0_{k−1}^tr | H_{k−1}] [0_{k−1}^tr | H_{k−1}] ] ∈ F_2^{k×2^k};

... 1 1 1 1 . 1 1 for example, we get H = and H = . 1 1 .. 1 1 . 2 1 . 1 3   1 . 1 . 1 . 1

The code H_2 is a [3, 1, 3]-code, hence is the repetition code; the [7, 4, 3]-code H_3 has been considered in (4.2). For k ≥ 2, all rows of H_k have weight 2^{k−1}, which is even, hence we have 1_n ∈ H_k. For k ≥ 3 we get H_k·H_k^tr = 0 ∈ F_2^{k×k}, that is H_k^⊥ ≤ H_k: Writing H_k = [w_1; ...; w_k] ∈ F_2^{k×n} with rows w_i, and using the standard F_2-bilinear form, we get ⟨w_i, w_i⟩ = wt(w_i) = 2^{k−1} = 0 ∈ F_2 for all i ∈ {1, ..., k}, as well as ⟨w_i, w_j⟩ = 2^{k−2} = 0 ∈ F_2 for all i ≠ j.
We apply the modifications described in (6.1), see Table 5:

i) Extending H_k ≤ F_2^n yields the extended Hamming code Ĥ_k ≤ F_2^{n+1}, an [n+1, n−k, 4]-code, and puncturing Ĥ_k yields (Ĥ_k)• = H_k again; note that Ĥ_2 is a [4, 1, 4]-code, hence is the repetition code. For k ≥ 3 the associated check matrix Ĥ_k ∈ F_2^{(k+1)×(n+1)} fulfills Ĥ_k·Ĥ_k^tr = 0 ∈ F_2^{(k+1)×(k+1)}, that is (Ĥ_k)^⊥ ≤ Ĥ_k; in particular, since dim((Ĥ_3)^⊥) = 4 = dim(Ĥ_3), we conclude that (Ĥ_3)^⊥ = Ĥ_3 is a self-dual [8, 4, 4]-code.
ii) Expurgating H_k ≤ F_2^n yields the even-weight Hamming code H_k′ ≤ F_2^n. Since 1_n ∈ H_k and n is odd, we conclude that H_k′ is an [n, n−k−1, d′]-code, where d′ ≥ 3, and augmenting H_k′ yields (H_k′)~ = H_k again. While H_2′ ≤ F_2^3 is the trivial code, for k ≥ 3 we have d′ = 4:

Table 5: Modified binary Hamming and simplex codes.

Here extending/puncturing pass between H_k and Ĥ_k, expurgating/augmenting pass between H_k and H_k′, and lengthening/shortening pass between H_k′ and Ĥ_k; dualising interchanges Ĥ_k and R_k. Analogously, on the right hand side, extending/puncturing pass between R_k• and R_k, expurgating/augmenting pass between R_k• and S_k, and lengthening/shortening pass between S_k and R_k.

Since d′ ≥ 3 is even, it suffices to show that d′ ≤ 4: We choose an F_2-linearly dependent 4-subset of columns of H_k such that all its 3-subsets are F_2-linearly independent; for example, we may take the following columns, extended by zeroes below:

.. 1 1 . 1 . 1 . 1 .. 1

Summing up these columns yields 0 ∈ F_2^{k×1}. Hence summing up the associated columns of H_k′, whose last row consists of 1_n, yields 0 ∈ F_2^{(k+1)×1}, saying that H_k′ contains an element of weight 4. ]
iii) Shortening Ĥ_k ≤ F_2^{n+1} yields (Ĥ_k)◦ = H_k′ ≤ F_2^n again, and lengthening H_k′ ≤ F_2^n yields ((H_k′)~)^ = Ĥ_k ≤ F_2^{n+1} again.

(6.4) Simplex codes. a) For k ≥ 2 and n := (q^k − 1)/(q − 1) let H_k ∈ F_q^{k×n} be as in (6.2). Then the code S_k := H_k^⊥ ≤ F_q^n, having H_k as a generator matrix, is called the associated simplex code. We show that all 0 ≠ v ∈ S_k have weight wt(v) = q^{k−1}; in particular, S_k is an equidistant [n, k, q^{k−1}]-code:

Let {v_1, ..., v_k} ⊆ S_k be the F_q-basis given by the rows of H_k. Then there is 0 ≠ [x_1, ..., x_k] ∈ F_q^k such that v = Σ_{i=1}^{k} x_i·v_i ∈ F_q^n. The j-th entry of v, for j ∈ {1, ..., n}, is zero if and only if 0 = ⟨v, e_j⟩ = Σ_{i=1}^{k} x_i·⟨v_i, e_j⟩. This amounts to saying that [⟨v_1, e_j⟩, ..., ⟨v_k, e_j⟩] ∈ F_q^k is an element of U := ⟨[x_1, ..., x_k]⟩_{F_q}^⊥ ≤ F_q^k. Now [⟨v_1, e_j⟩, ..., ⟨v_k, e_j⟩] ∈ F_q^k coincides with the j-th row of H_k^tr; hence dim_{F_q}(U) = k − 1 shows that there are precisely (q^{k−1} − 1)/(q − 1) such rows. Thus we have wt(v) = n − (q^{k−1} − 1)/(q − 1) = (q^k − 1)/(q − 1) − (q^{k−1} − 1)/(q − 1) = q^{k−1}. ]
Note that Σ_{i=0}^{k−1} ⌈q^{k−1}/q^i⌉ = Σ_{i=0}^{k−1} q^{k−i−1} = Σ_{i=0}^{k−1} q^i = (q^k − 1)/(q − 1) = n and q^k·(q^{k−1} − n·(q−1)/q) = q^{k−1} show that S_k fulfills the Griesmer bound and the Plotkin bound; recall that the latter fact also implies that S_k is an equidistant code.
b) Conversely, any [n, k, q^{k−1}]-code C ≤ F_q^n is linearly equivalent to S_k: Let

d⊥ ∈ N be the minimum distance of C⊥ ≤ F_q^n. We show that d⊥ ≥ 3; then, since dim_{F_q}(C⊥) = n − k and any [n, n−k, 3]-code is perfect, we conclude that d⊥ = 3, and hence C⊥ is linearly equivalent to the Hamming code H_k = S_k^⊥:
To this end, in turn, let H ∈ F_q^{(n−k)×n} be a check matrix of C, hence H is a generator matrix of C⊥ ≤ F_q^n. Firstly, assume that d⊥ = 1; then we may assume that [0, ..., 0, 1] ∈ C⊥ ≤ F_q^n. Thus any word in C has zero n-th entry, implying that the shortened code C◦ is an [n−1, k, q^{k−1}]-code. Hence the Griesmer bound implies n − 1 ≥ Σ_{i=0}^{k−1} ⌈q^{k−1}/q^i⌉ = (q^k − 1)/(q − 1) = n, a contradiction.
Secondly, assume that d⊥ = 2; then we may assume that [0, ..., 0, 1, 1] ∈ C⊥ ≤ F_q^n. Thus any word in C is of the form [∗, ..., ∗, −x, x] ∈ F_q^n, for some x ∈ F_q, implying that any word in the shortened code C◦ has zero (n−1)-st entry, entailing that the doubly shortened code C◦◦ is an [n−2, k◦◦, d◦◦]-code, where k − 1 ≤ k◦◦ ≤ k and d◦◦ ≥ q^{k−1}. Hence this time the Griesmer bound implies n − 2 ≥ Σ_{i=0}^{k◦◦−1} ⌈d◦◦/q^i⌉ ≥ Σ_{i=0}^{k◦◦−1} ⌈q^{k−1}/q^i⌉ = (q^k − q^{k−k◦◦})/(q − 1) = n − (q^{k−k◦◦} − 1)/(q − 1) ≥ n − 1, a contradiction again. ]
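The equidistance of S_k is easy to confirm for small binary cases. The sketch below (ours) spans S_k from the rows of H_k, with columns ordered as in (6.3), and collects the weights of all nonzero codewords:

```python
def simplex_weights(k):
    """Set of weights of nonzero codewords of the binary simplex code S_k."""
    n = 2**k - 1
    # rows of H_k: row r carries bit r (MSB first) of the column index i
    rows = [[(i >> (k - 1 - r)) & 1 for i in range(1, n + 1)]
            for r in range(k)]
    words = {tuple([0] * n)}
    for row in rows:
        words |= {tuple(a ^ b for a, b in zip(w, row)) for w in words}
    return {sum(w) for w in words if any(w)}

for k in (2, 3, 4):
    print(k, simplex_weights(k))   # {2}, then {4}, then {8}
```

Every nonzero codeword of S_k indeed has weight 2^{k−1}.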

(6.5) Binary simplex codes and Reed–Muller codes. a) Keeping the notation of (6.4), let q := 2; hence for k ≥ 2 we have n = 2^k − 1, thus S_k is a [2^k − 1, k, 2^{k−1}]-code. For example, the elements of S_2 = ⟨[0,1,1], [1,0,1]⟩_{F_2} = {0_3, [0,1,1], [1,0,1], [1,1,0]} ≤ F_2^3 can be seen as the vertices of a tetrahedron inscribed into a cube; note that S_2 is the even-weight [3, 2, 2]-code.
We apply the modifications described in (6.1), see Table 5: Since all elements of S_k have even weight, we get S_k′ = S_k and Ŝ_k = {[v, 0] ∈ F_2^{n+1}; v ∈ S_k}.
i) Dualising the even-weight Hamming code H_k′ yields a code which has H_k′ := [ H_k ; 1_n ] ∈ F_2^{(k+1)×n} as a generator matrix, thus (H_k′)^⊥ = S̃_k ≤ F_2^n is the augmented simplex code. Since for all 0 ≠ v ∈ S_k we have wt(v) = 2^{k−1} and wt(1_n + v) = n − 2^{k−1} = 2^{k−1} − 1, we conclude that S̃_k has minimum distance 2^{k−1} − 1, thus is a [2^k − 1, k+1, 2^{k−1} − 1]-code. Moreover, expurgating S̃_k yields (S̃_k)′ = (S_k ∪ (1_n + S_k))′ = S_k again.
ii) Dualising the extended Hamming code Ĥ_k yields the (first order) Reed–Muller code R_k := (Ĥ_k)^⊥ ≤ F_2^{n+1}, having generator matrix Ĥ_k = [ H_k 0_k^tr ; 1_n 1 ] ∈ F_2^{(k+1)×(n+1)}. Hence R_k = (S̃_k)^ is obtained by extending S̃_k, that is by lengthening S_k. Since R_k is an extension of S̃_k, it has minimum distance 2^{k−1}, thus is a [2^k, k+1, 2^{k−1}]-code. Moreover, shortening R_k yields R_k◦ = ((S̃_k)^)◦ = (S̃_k)′ = S_k again. In particular, for k = 2 we infer that R_2 is the even-weight [4, 3, 2]-code, while for k = 3 we get the extended Hamming [8, 4, 4]-code R_3 = (Ĥ_3)^⊥ = Ĥ_3.
Note that Σ_{i=0}^{k} ⌈2^{k−1}/2^i⌉ = 1 + Σ_{i=0}^{k−1} 2^i = 2^k shows that R_k fulfills the Griesmer bound; but since 2^{k−1} − 2^k·(1/2) = 0 the Plotkin bound yields no information.

iii) The punctured Reed–Muller code is R_k• = ((S̃_k)^)• = S̃_k, hence is a [2^k − 1, k+1, 2^{k−1} − 1]-code. Moreover, extending R_k• yields (R_k•)^ = (S̃_k)^ = R_k again. In particular, for k = 2 we infer that R_2• is a [3, 3, 1]-code, thus R_2• = F_2^3; for k = 3 we get the Hamming [7, 4, 3]-code R_3• = (Ĥ_3)• = H_3.
b) Conversely, any binary [2^k, k+1, 2^{k−1}]-code C is linearly equivalent to R_k: We proceed by induction on k ≥ 1, letting R_1 := F_2^2 be the unique [2, 2, 1]-code. Let k ≥ 2. Since C fulfills the Griesmer bound, it cannot possibly possess a zero component, hence the shortened code C◦ is a [2^k − 1, k, d◦]-code, where d◦ ≥ 2^{k−1}. Since any [2^k − 1, k, 2^{k−1}]-code fulfills the Griesmer bound Σ_{i=0}^{k−1} ⌈2^{k−1}/2^i⌉ = Σ_{i=0}^{k−1} 2^i = 2^k − 1 as well, we conclude that d◦ = 2^{k−1}, and thus we may assume that C◦ = S_k. Note that this also shows that shortening with respect to any component yields a code linearly equivalent to the simplex code, implying that any word in C \ {0_{2^k}, 1_{2^k}} has weight 2^{k−1}. We show that indeed 1_{2^k} ∈ C; then we have C = Ŝ_k + ⟨1_{2^k}⟩_{F_2} = (S̃_k)^ = R_k:
Let G = [ 1_{2^{k−1}} 0_{2^{k−1}} ; ∗ G* ] ∈ F_2^{(k+1)×2^k} be a generator matrix of C, and let C* ≤ F_2^{2^{k−1}} be the residual [2^{k−1}, k, d*]-code generated by G* ∈ F_2^{k×2^{k−1}}, where d* ≥ 2^{k−1}/2 = 2^{k−2}. Since any [2^{k−1}, k, 2^{k−2}]-code fulfills the Griesmer bound Σ_{i=0}^{k−1} ⌈2^{k−2}/2^i⌉ = 1 + Σ_{i=0}^{k−2} 2^i = 2^{k−1}, we conclude that d* = 2^{k−2}, and thus we may assume by induction that C* = R_{k−1}; note that for k = 2 we indeed see that C* is a [2, 2, 1]-code. Hence we have 1_{2^{k−1}} ∈ C*, and thus C contains a word v = [w | 1_{2^{k−1}}] for some w ∈ F_2^{2^{k−1}}. If w ≠ 0, then wt(v) > 2^{k−1}, hence v = 1_{2^k}; if w = 0, then v + [1_{2^{k−1}} | 0_{2^{k−1}}] = 1_{2^k} ∈ C. In either case 1_{2^k} ∈ C. ]

The Reed–Muller [2^k, k+1, 2^{k−1}]-codes R_k are merely the first ones in the series of higher order Reed–Muller codes [1954], which in turn belong to the class of geometric codes, being based on finite geometries, and having a rich algebraic structure. Moreover, higher order Reed–Muller codes have been generalized to arbitrary finite fields [Kasami, Lin, Peterson, 1968].

In particular, the Reed–Muller codes R_k are Hadamard codes, being defined by Hadamard matrices of Sylvester type, see Exercise (13.29), and thus have a fast decoding algorithm. From a practical point of view, together with their large relative minimum distance δ(R_k) = 2^{k−1}/2^k = 1/2, this outweighs their low information rate ρ(R_k) = (k+1)/2^k, making them suitable for very noisy channels.

In particular, the Reed–Muller [32, 6, 16]-code R_5 was used in the 'Mariner' expeditions to planet Mars [1969–1976]: The 6 binary information symbols are used to encode picture data based on dots on a grey-scale with 2^6 = 64 steps, where R_5 has a low information rate of ρ(R_5) = 6/32 ≈ 0.2, but is able to correct up to ⌊(16−1)/2⌋ = 7 errors.
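The generator matrix of R_k given in (6.5) makes this concrete. The sketch below (ours) builds R_5 from the rows of [ H_5 0^tr ; 1_31 1 ], enumerates all 2^6 codewords, and confirms the [32, 6, 16] parameters:

```python
from itertools import product

def reed_muller_first_order(k):
    """First-order Reed-Muller code R_k as a set of codewords, generated
    by the rows of [H_k | 0^tr ; 1_n | 1] over F_2."""
    n = 2**k - 1
    rows = [[(i >> (k - 1 - r)) & 1 for i in range(1, n + 1)] + [0]
            for r in range(k)]
    rows.append([1] * (n + 1))
    words = set()
    for coeffs in product((0, 1), repeat=k + 1):
        w = [0] * (n + 1)
        for c, row in zip(coeffs, rows):
            if c:
                w = [a ^ b for a, b in zip(w, row)]
        words.add(tuple(w))
    return words

R5 = reed_muller_first_order(5)
print(len(R5))                               # 64, so the dimension is 6
print(min(sum(w) for w in R5 if any(w)))     # 16
print((16 - 1) // 2)                         # corrects up to 7 errors
```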

7 Weight enumerators

(7.1) Weight enumerators. Let X be an alphabet and let C ⊆ X^n be a code. For i ∈ N_0 let w_i = w_i(C) := |{v ∈ C; wt(v) = i}| ∈ N_0. Hence we have w_0 ≤ 1, and w_i = 0 for i ∈ {1, ..., wt(C) − 1}, as well as w_{wt(C)} ≥ 1, and w_i = 0 for i ≥ n + 1, and Σ_{i=0}^{n} w_i = |C|.
Let {X, Y} be indeterminates. The homogeneous generating function A_C := Σ_{i=0}^{n} w_i·X^i·Y^{n−i} = Σ_{v∈C} X^{wt(v)}·Y^{n−wt(v)} ∈ C[X, Y] associated with the sequence [w_0, w_1, ..., w_n] is called the (Hamming) weight enumerator of C. Hence A_C is homogeneous of total degree n and has non-negative rational integers as its coefficients. By dehomogenizing, that is specializing X ↦ X and Y ↦ 1, we obtain the (ordinary) generating function A_C(X, 1) = Σ_{i=0}^{n} w_i·X^i = Σ_{v∈C} X^{wt(v)} ∈ C[X].

Example. For the trivial code C := {0_n} ≤ F_q^n we get A_C = Y^n. For the code C^⊥ = F_q^n we get A_{C^⊥} = Σ_{i=0}^n (n choose i)(q − 1)^i X^i Y^{n−i} = (Y + (q − 1)X)^n. Thus we have A_{C^⊥}(X, Y) = A_C(Y − X, Y + (q − 1)X), in accordance with (7.2) below.
For the binary repetition code C := {0_n, 1_n} ≤ F_2^n we get A_C = Y^n + X^n. For the binary even-weight code C^⊥ = (F_2^n)^0 ≤ F_2^n we get A_{C^⊥} = Σ_{i=0}^{⌊n/2⌋} (n choose 2i) X^{2i} Y^{n−2i} = (1/2) · Σ_{i=0}^n (n choose i) X^i Y^{n−i} + (1/2) · Σ_{i=0}^n (−1)^i (n choose i) X^i Y^{n−i} = (1/2) · ((Y + X)^n + (Y − X)^n), thus A_{C^⊥}(X, Y) = (1/2) · A_C(Y − X, Y + X), in accordance with (7.2) below.
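For a concrete small code the coefficients w_i can be obtained by direct enumeration; a minimal sketch (helper names are ours), applied to the binary even-weight code of length 4:

```python
from itertools import product

def weight_distribution(code, n):
    """Return [w_0, ..., w_n], the coefficient list of A_C(X, 1)."""
    w = [0] * (n + 1)
    for v in code:
        w[sum(1 for x in v if x != 0)] += 1
    return w

# binary even-weight code of length 4; A_C(X,1) = (1/2)((1+X)^4 + (1-X)^4)
n = 4
even = [v for v in product([0, 1], repeat=n) if sum(v) % 2 == 0]
w = weight_distribution(even, n)
# expected: 1 + 6X^2 + X^4
```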

(7.2) Theorem: MacWilliams [1963]. Let C ≤ F_q^n be a linear code such that k := dim_{F_q}(C) ∈ N_0, and let C^⊥ ≤ F_q^n be its dual. Then for the associated weight enumerators we have q^k · A_{C^⊥}(X, Y) = A_C(Y − X, Y + (q − 1)X) ∈ C[X, Y]. In particular, if C = C^⊥ is self-dual, then for the weight enumerators we have A_C(X, Y) = A_C((Y − X)/√q, (Y + (q − 1)X)/√q) ∈ C[X, Y]; recall that in this case k = n/2.

Proof. i) Let χ: F_q → C^* := C \ {0} be a character of F_q, that is a group homomorphism from the additive group (F_q, +) to the multiplicative group (C^*, ·); note that there always is the trivial character 1: F_q → C^*: a ↦ 1. Let V be a C-vector space, and let ω: F_q^n → V be a map. Then the map χω: F_q^n → V: v ↦ Σ_{w∈F_q^n} χ(⟨v, w⟩)ω(w) is called the discrete Fourier transform or Hadamard transform of ω with respect to χ. We show that for any character χ ≠ 1 we have Σ_{v∈C} χω(v) = q^k · Σ_{w∈C^⊥} ω(w):
For the left hand side we obtain Σ_{v∈C} χω(v) = Σ_{v∈C} Σ_{w∈F_q^n} χ(⟨v, w⟩)ω(w) = Σ_{w∈C^⊥} Σ_{v∈C} χ(⟨v, w⟩)ω(w) + Σ_{w∈F_q^n\C^⊥} Σ_{v∈C} χ(⟨v, w⟩)ω(w) ∈ V, where since χ(⟨v, w⟩) = χ(0) = 1 ∈ C^* the first summand evaluates to |C| · Σ_{w∈C^⊥} ω(w) ∈ V, which coincides with the right hand side of the above equation. Hence we have to show that the second summand vanishes:
Given w ∈ F_q^n \ C^⊥, the map C → F_q: v ↦ ⟨v, w⟩ is F_q-linear and non-zero, hence surjective, and thus for any a ∈ F_q we have |{v ∈ C; ⟨v, w⟩ = a}| = |C|/|F_q| = q^{k−1}.

Thus the second summand evaluates to q^{k−1} · (Σ_{a∈F_q} χ(a)) · (Σ_{w∈F_q^n\C^⊥} ω(w)) ∈ V, hence it suffices to show that Σ_{a∈F_q} χ(a) = 0 ∈ C: Since χ ≠ 1, there is b ∈ F_q such that χ(b) ≠ 1; then we have χ(b) · Σ_{a∈F_q} χ(a) = Σ_{a∈F_q} χ(a + b) = Σ_{a∈F_q} χ(a), implying (χ(b) − 1) · Σ_{a∈F_q} χ(a) = 0 ∈ C.
ii) Let now V := C[X, Y]_n ≤ C[X, Y] be the C-vector space of all homogeneous polynomials of total degree n, including the zero polynomial, and let ω: F_q^n → C[X, Y]_n: v ↦ X^{wt(v)} Y^{n−wt(v)}. Moreover, let δ: F_q → {0, 1} be defined by δ(0) = 0, and δ(a) = 1 for a ≠ 0. Thus for any character χ ≠ 1 the associated discrete Fourier transform is given, for v = [x_1, …, x_n] ∈ F_q^n, as

χω(v) = Σ_{w∈F_q^n} χ(⟨v, w⟩) X^{wt(w)} Y^{n−wt(w)}
      = Σ_{[y_1,…,y_n]∈F_q^n} χ(Σ_{i=1}^n x_i y_i) X^{Σ_{i=1}^n δ(y_i)} Y^{Σ_{i=1}^n (1−δ(y_i))}
      = Σ_{[y_1,…,y_n]∈F_q^n} Π_{i=1}^n χ(x_i y_i) X^{δ(y_i)} Y^{1−δ(y_i)}
      = Π_{i=1}^n Σ_{a∈F_q} χ(a x_i) X^{δ(a)} Y^{1−δ(a)}.

If x_i = 0 then χ(a x_i) = χ(0) = 1 ∈ C^* shows that the associated factor equals Σ_{a∈F_q} X^{δ(a)} Y^{1−δ(a)} = Y + (q − 1)X ∈ C[X, Y]. If x_i ≠ 0 then, using Σ_{a∈F_q} χ(a) = 0 ∈ C again, the associated factor evaluates as Σ_{a∈F_q} χ(a x_i) X^{δ(a)} Y^{1−δ(a)} = Y + (Σ_{a∈F_q^*} χ(a x_i)) · X = Y + (Σ_{a∈F_q^*} χ(a)) · X = Y − χ(0) · X = Y − X ∈ C[X, Y]. Thus we get χω(v) = (Y − X)^{wt(v)} (Y + (q − 1)X)^{n−wt(v)} ∈ V.
In conclusion we have q^k · A_{C^⊥}(X, Y) = q^k · Σ_{w∈C^⊥} ω(w) = Σ_{v∈C} χω(v) = Σ_{v∈C} (Y − X)^{wt(v)} (Y + (q − 1)X)^{n−wt(v)} = A_C(Y − X, Y + (q − 1)X) ∈ V. ]
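The vanishing character sum used in part i) of the proof can be checked numerically for a prime field, taking χ(a) = exp(2πia/p); a small sketch (the choice p = 5 is ours):

```python
import cmath

p = 5                                       # prime field F_5

def chi(a):
    """A non-trivial additive character of F_p."""
    return cmath.exp(2j * cmath.pi * (a % p) / p)

total = sum(chi(a) for a in range(p))       # should vanish, since chi != 1
```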

Example. For k ≥ 2, the simplex code S_k ≤ F_q^n is an equidistant [n, k, q^{k−1}]-code, where n := (q^k − 1)/(q − 1), hence A_{S_k} = Y^n + (q^k − 1) X^{q^{k−1}} Y^{(q^{k−1}−1)/(q−1)} ∈ C[X, Y]. Thus for the associated Hamming code H_k = S_k^⊥ we get A_{H_k}(X, Y) = (1/q^k) · A_{S_k}(Y − X, Y + (q − 1)X) ∈ C[X, Y].
In particular, for q = 2 we have n = 2^k − 1, and hence A_{S_k} = Y^n + n X^{(n+1)/2} Y^{(n−1)/2} ∈ C[X, Y], which yields A_{H_k} = (1/(n+1)) · ((Y + X)^n + n (Y − X)^{(n+1)/2} (Y + X)^{(n−1)/2}). Dehomogenizing, that is specializing X ↦ X and Y ↦ 1, yields Σ_{i=0}^n w_i(H_k) X^i = A_{H_k}(X, 1) = (1/(n+1)) · ((1 + X)^n + n (1 − X)^{(n+1)/2} (1 + X)^{(n−1)/2}) ∈ C[X].
For example we have A_{H_2}(X, 1) = 1 + X^3, showing again that H_2 is the binary repetition code, and A_{H_3}(X, 1) = 1 + 7X^3 + 7X^4 + X^7, and A_{H_4}(X, 1) = 1 + 35X^3 + 105X^4 + 168X^5 + 280X^6 + 435X^7 + 435X^8 + 280X^9 + 168X^{10} + 105X^{11} + 35X^{12} + X^{15}.
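The dehomogenized formula for A_{H_k}(X, 1) can be evaluated with elementary integer polynomial arithmetic; the following sketch (helper names are ours) reproduces the coefficient lists above for n = 7 and n = 15:

```python
def poly_mul(p, q):
    """Product of polynomials given as coefficient lists (index i = X^i)."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def poly_pow(p, e):
    r = [1]
    for _ in range(e):
        r = poly_mul(r, p)
    return r

def hamming_enumerator(n):
    """Coefficients of (1/(n+1)) * ((1+X)^n + n (1-X)^{(n+1)/2} (1+X)^{(n-1)/2})."""
    a = poly_pow([1, 1], n)
    b = poly_mul(poly_pow([1, -1], (n + 1) // 2), poly_pow([1, 1], (n - 1) // 2))
    return [(x + n * y) // (n + 1) for x, y in zip(a, b)]
```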

III

8 Cyclic codes

n (8.1) Cyclic codes. Let C ≤ Fq be a linear code of length n ∈ N over the field Fq. If for all [a0, . . . , an−1] ∈ C we have [an−1, a0, . . . , an−2] ∈ C as well, that is if P(1,...,n) ∈ AutFq (C), then C is called cyclic.

Example. The repetition code C := {[a, …, a] ∈ F_q^n; a ∈ F_q} and the associated dual code, the parity check code C^⊥ := {[a_0, …, a_{n−1}] ∈ F_q^n; Σ_{i=0}^{n−1} a_i = 0}, are cyclic; note that in both cases the full symmetric group S_n is a subgroup of Aut_{F_q}(C). A generator matrix of C is given as G := [1, …, 1] ∈ F_q^n, and a generator matrix of C^⊥, that is a check matrix of C, is given by the (n − 1) × n band matrix

H := [ [1, −1, 0, …, 0], [0, 1, −1, 0, …, 0], …, [0, …, 0, 1, −1] ] ∈ F_q^{(n−1)×n},

whose i-th row has entry 1 in position i and −1 in position i + 1.

Example. The binary Hamming code H ≤ F_2^7, whose elements are given in (4.2), is not cyclic, but applying P_{(3,4)(5,7,6)} ∈ I_n(F_2) yields the linearly equivalent code C ≤ F_2^7, whose elements are given by the rows of the following matrices, which hence is cyclic:

 ......   1 1 1 .. 1 .   1 1 . 1 ...   . 1 1 1 .. 1       . 1 1 . 1 ..   1 . 1 1 1 ..       .. 1 1 . 1 .   . 1 . 1 1 1 .       ... 1 1 . 1   .. 1 . 1 1 1       1 ... 1 1 .   1 .. 1 . 1 1       . 1 ... 1 1   1 1 .. 1 . 1  1 . 1 ... 1 1 1 1 1 1 1 1

Moreover, a generator matrix G ∈ F_2^{4×7} and a check matrix H ∈ F_2^{3×7} of C are

G := [ 1 1 . 1 . . . ]        H := [ 1 . 1 1 1 . . ]
     [ . 1 1 . 1 . . ]             [ . 1 . 1 1 1 . ]
     [ . . 1 1 . 1 . ]             [ . . 1 . 1 1 1 ]
     [ . . . 1 1 . 1 ]
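The stated properties of C can be verified by brute force from these matrices; a sketch (the helper span_f2 is ours):

```python
from itertools import product

G = [(1,1,0,1,0,0,0), (0,1,1,0,1,0,0), (0,0,1,1,0,1,0), (0,0,0,1,1,0,1)]
H = [(1,0,1,1,1,0,0), (0,1,0,1,1,1,0), (0,0,1,0,1,1,1)]

def span_f2(rows):
    """All F_2-linear combinations of the given rows."""
    n = len(rows[0])
    return {tuple(sum(c * r[i] for c, r in zip(cs, rows)) % 2 for i in range(n))
            for cs in product([0, 1], repeat=len(rows))}

C = span_f2(G)
shifted = {v[-1:] + v[:-1] for v in C}       # cyclic shift of every codeword
```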

(8.2) Cyclic codes as ideals. Let F_q[X] be the polynomial ring over F_q in the indeterminate X. For 0 ≠ f = Σ_{i=0}^d a_i X^i ∈ F_q[X], where d ∈ N_0 and a_i ∈ F_q such that a_d ≠ 0, let deg(f) := d denote its degree and let lc(f) := a_d denote its leading coefficient; if lc(f) = 1 then f is called monic. Moreover, F_q[X] is an F_q-algebra, and a Euclidean ring with respect to polynomial division, hence in particular is a principal ideal domain.
For n ∈ N let ⟨X^n − 1⟩ = {(X^n − 1) · f; f ∈ F_q[X]} ⊴ F_q[X] be the principal ideal generated by X^n − 1 ∈ F_q[X], and let ‾: F_q[X] → F̄_q[X]: f ↦ f̄ := f + ⟨X^n − 1⟩ be the natural epimorphism of F_q-algebras, where F̄_q[X] := F_q[X]/⟨X^n − 1⟩ is the associated quotient F_q-algebra.
Then polynomial division shows that F̄_q[X] = {f̄; f ∈ F_q[X], f = 0 or deg(f) < n}, hence dim_{F_q}(F̄_q[X]) = n, and ν: F_q^n → F̄_q[X]: [a_0, …, a_{n−1}] ↦ Σ_{i=0}^{n−1} a_i X̄^i is an isomorphism of F_q-vector spaces. Under ν, a linear code C ≤ F_q^n is cyclic if and only if ν(C) ⊴ F̄_q[X] is an ideal; in this case there is a unique monic divisor g | X^n − 1 ∈ F_q[X] such that ν(C) = ⟨ḡ⟩, called the generator polynomial of C.

Example. For g = 1 ∈ F_q[X] we get ⟨1̄⟩ = F̄_q[X], thus C = F_q^n, while g = X^n − 1 ∈ F_q[X] yields ⟨X̄^n − 1̄⟩ = {0̄} ≤ F̄_q[X], thus C = {0} ≤ F_q^n.

The repetition code C := ⟨[1, …, 1]⟩_{F_q} ≤ F_q^n corresponds to ⟨ḡ⟩ = ⟨ḡ⟩_{F_q} ⊴ F̄_q[X], where g := Σ_{i=0}^{n−1} X^i = (X^n − 1)/(X − 1) ∈ F_q[X] is the associated monic generator polynomial. We consider the dual code, the parity check code C^⊥:
The code C^⊥ = {[a_0, …, a_{n−1}] ∈ F_q^n; Σ_{i=0}^{n−1} a_i = 0} ≤ F_q^n corresponds to {Σ_{i=0}^{n−1} a_i X̄^i ∈ F̄_q[X]; Σ_{i=0}^{n−1} a_i = 0} ⊴ F̄_q[X]. Now for f := Σ_{i=0}^{n−1} a_i X^i ∈ F_q[X] we have Σ_{i=0}^{n−1} a_i = 0 if and only if f(1) = 0, which holds if and only if X − 1 | f. In other words, C^⊥ corresponds to {f̄ ∈ F̄_q[X]; X − 1 | f} = ⟨X̄ − 1̄⟩, that is C^⊥ has generator polynomial X − 1 ∈ F_q[X].

Example. The non-zero elements of the Hamming code H ≤ F_2^7, up to the linear equivalence applied in (8.1), consist of the cyclic shifts of [1, 1, 0, 1, 0, 0, 0] and its complement [0, 0, 1, 0, 1, 1, 1], together with 1_7. We get ν([1, 1, 0, 1, 0, 0, 0]) = X^3 + X + 1 and ν([0, 0, 1, 0, 1, 1, 1]) = X^6 + X^5 + X^4 + X^2 = X^2(X + 1)(X^3 + X + 1) and ν(1_7) = Σ_{i=0}^6 X^i = (X^3 + X + 1)(X^3 + X^2 + 1), implying that ν(H) = ⟨ḡ⟩ ⊴ F̄_2[X], with generator polynomial g := X^3 + X + 1 ∈ F_2[X].
In accordance with (8.3) below this can also be read off from the generator matrix of H given above. Moreover, we have the check polynomial h := (X^7 + 1)/g = (X + 1)(X^3 + X^2 + 1) = X^4 + X^2 + X + 1 ∈ F_2[X], hence H^⊥ has generator polynomial h^* := X^4 + X^3 + X^2 + 1 ∈ F_2[X], as can also be seen from the check matrix of H given above.

(8.3) Theorem. Let C ≤ F_q^n be a cyclic code with generator polynomial g = Σ_{i=0}^k g_i X^i ∈ F_q[X] of degree k := deg(g) ∈ {0, …, n}. Let h = Σ_{i=0}^{n−k} h_i X^i ∈ F_q[X] such that X^n − 1 = gh ∈ F_q[X]; hence we have deg(h) = n − k.
a) Then we have dim_{F_q}(C) = n − k, and a generator matrix of C is given by the (n − k) × n band matrix

G := [ [g_0, g_1, …, g_k, 0, …, 0], [0, g_0, …, g_{k−1}, g_k, 0, …, 0], …, [0, …, 0, g_0, …, g_{k−1}, g_k] ] ∈ F_q^{(n−k)×n},

whose i-th row carries the coefficients of g in positions i, …, i + k.

b) The dual code C^⊥ ≤ F_q^n is cyclic as well, and is generated by the reversed polynomial h^* := X^{n−k} · h(X^{−1}) = Σ_{i=0}^{n−k} h_i X^{n−k−i} = Σ_{i=0}^{n−k} h_{n−k−i} X^i ∈ F_q[X] associated with h; hence h is called a check polynomial of C. Thus a generator matrix of C^⊥, that is a check matrix of C, is given by the k × n band matrix

H := [ [h_{n−k}, h_{n−k−1}, …, h_0, 0, …, 0], [0, h_{n−k}, …, h_1, h_0, 0, …, 0], …, [0, …, 0, h_{n−k}, …, h_1, h_0] ] ∈ F_q^{k×n},

whose j-th row carries the coefficients of h^* in positions j, …, j + (n − k).

Note that, by reversing the order of the columns, the cyclic codes generated by h and h∗ are linearly equivalent.

Proof. a) For any v ∈ C we have ν(v) = ḡ f̄ ∈ F̄_q[X] for some f ∈ F_q[X]. Let f = qh + r ∈ F_q[X], where q, r ∈ F_q[X] such that r = 0 or deg(r) < deg(h) = n − k. Then we have gf − gr = gqh = (X^n − 1)q ∈ ⟨X^n − 1⟩ ⊴ F_q[X], which implies ν(v) = ḡ r̄ ∈ F̄_q[X]. Thus since dim_{F_q}(C) = dim_{F_q}(⟨ḡ⟩) = dim_{F_q}(F̄_q[X]) − dim_{F_q}(F̄_q[X]/⟨ḡ⟩) = n − k we conclude that {ḡ, ḡX̄, …, ḡX̄^{n−k−1}} ⊆ ⟨ḡ⟩ = ν(C) is an F_q-basis, which consists of the images under ν of the rows of G.
b) From gh = X^n − 1 we conclude that h_0 ≠ 0, hence we have deg(h^*) = n − k, and from g^* h^* = (X^k · g(X^{−1})) · (X^{n−k} · h(X^{−1})) = X^n · (gh)(X^{−1}) = (gh)^* = (X^n − 1)^* = 1 − X^n ∈ F_q[X], we infer that h^* | X^n − 1 ∈ F_q[X]. Hence H is a generator matrix of a cyclic code with generator polynomial h^*, having dimension rk_{F_q}(H) = k = dim_{F_q}(C^⊥). Thus it suffices to show that the rows of H are orthogonal to the rows of G:

For i ∈ {1, …, n − k} the i-th row of G is v_i := [0, …, 0, g_0, …, g_k, 0, …, 0] ∈ F_q^n, where g_0 is the i-th entry, and for j ∈ {1, …, k} the j-th row of H is w_j := [0, …, 0, h_{n−k}, …, h_0, 0, …, 0] ∈ F_q^n, where h_{n−k} is the j-th entry. Thus letting g_l := 0 for l > k and l < 0, and h_l := 0 for l > n − k and l < 0, we have ⟨v_i, w_j⟩ = Σ_{l=1}^n g_{l−i} h_{n−k+j−l} ∈ F_q. Since 1 − i ≤ 0 and n − i ≥ k, and (n − k + j) − 1 ≥ n − k and (n − k + j) − n ≤ 0, the latter sum equals the coefficient of X^{(n−k+j)−i} in gh = X^n − 1 ∈ F_q[X]. Since 1 ≤ n − k + j − i ≤ n − 1, from that we conclude ⟨v_i, w_j⟩ = 0. ]

(8.4) Cyclic redundancy check (CRC) codes [Peterson, 1961]. Let C ≤ F_q^n be cyclic with generator polynomial g = Σ_{i=0}^k g_i X^i ∈ F_q[X] of degree deg(g) = k and associated generator matrix G ∈ F_q^{(n−k)×n}. Since g_k ≠ 0 the matrix G can be transformed by Gaussian row elimination to [A | E_{n−k}] ∈ F_q^{(n−k)×n}, for some A ∈ F_q^{(n−k)×k}; this does not affect the cyclicity of C.
Using the generator matrix [A | E_{n−k}], a word v = [a_0, …, a_{n−k−1}] ∈ F_q^{n−k} is encoded into w = [b_0, …, b_{k−1}; a_0, …, a_{n−k−1}] ∈ F_q^n, where we have to find [b_0, …, b_{k−1}] ∈ F_q^k: We have ν(w) = ν([b_0, …, b_{k−1}]) + X̄^k · ν(v) ∈ F̄_q[X]. Polynomial division yields X^k · Σ_{i=0}^{n−k−1} a_i X^i = qg + r, for q, r ∈ F_q[X] such that r = 0 or deg(r) < deg(g) = k. Since w ∈ C we have g | ν(w) = (qg + r)‾ + ν([b_0, …, b_{k−1}]), thus r + Σ_{i=0}^{k−1} b_i X^i = 0. Hence ν([b_0, …, b_{k−1}]) is the remainder of the shifted polynomial −X^k · ν(v) upon polynomial division by g.
Error detection, which is the typical application, runs as follows: Given w ∈ F_q^n, we have w ∈ C if and only if g | ν(w) ∈ F̄_q[X]. Again polynomial division yields ν(w) = (qg + r)‾, for q, r ∈ F_q[X] such that r = 0 or deg(r) < deg(g) = k. Hence we have w ∈ C if and only if r = 0, and in this case w = [b_0, …, b_{k−1}; b_k, …, b_{n−1}] ∈ C is decoded to [b_k, …, b_{n−1}] ∈ F_q^{n−k}. We discuss a few types of errors:
i) A burst error of length l ∈ {0, …, n} is an error vector u = [c_0, …, c_{n−1}] ∈ F_q^n such that c_i ≠ 0 only if i ∈ {j, …, j + l − 1} ⊆ Z_n, for some j ∈ Z_n. Then C detects all burst errors of length l ≤ k; in particular, if k ≥ 1 all single errors are detected: We may assume that u = [0, …, 0, c_0, …, c_{l−1}, 0, …, 0] ∈ F_q^n, where c_0 occurs at position j ∈ {1, …, n}, then ν(u) = X̄^{j−1} · Σ_{i=0}^{l−1} c_i X̄^i ∈ F̄_q[X], hence from gcd(g, X) = 1 and deg(g) = k we infer g ∤ ν(u), thus u ∉ C.
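The encoding and detection steps can be sketched over F_2 with coefficient lists (index i holds the coefficient of X^i; function names are ours), here instantiated with a degree-5 generator polynomial:

```python
def poly_rem(bits, g):
    """Remainder modulo g over F_2; bits[i] is the coefficient of X^i,
    and g is monic of degree len(g) - 1 (g[-1] == 1)."""
    r = list(bits)
    k = len(g) - 1
    for i in range(len(r) - 1, k - 1, -1):
        if r[i]:
            for j, gj in enumerate(g):
                r[i - k + j] ^= gj
    return r[:k]

def crc_encode(data, g):
    """Codeword [b_0..b_{k-1}; a_0..a_{n-k-1}]: over F_2 the check part is
    the remainder of X^k * nu(data) upon division by g."""
    k = len(g) - 1
    return poly_rem([0] * k + list(data), g) + list(data)

def crc_check(word, g):
    """w is a codeword iff g divides nu(w), i.e. the remainder vanishes."""
    return not any(poly_rem(list(word), g))

g = [1, 0, 1, 0, 0, 1]               # X^5 + X^2 + 1
w = crc_encode([1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0], g)
```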
ii) We show that we have C = C^0 ≤ F_q^n, the latter denoting the expurgated code, if and only if X − 1 | g ∈ F_q[X]: We have C = C^0 if and only if for all w = [b_0, …, b_{n−1}] ∈ C we have ν(w)(1) = Σ_{i=0}^{n−1} b_i = 0, that is X − 1 | ν(w) for all w ∈ C, in other words ⟨ḡ⟩ = ν(C) ⊆ ⟨X̄ − 1̄⟩ ⊴ F̄_q[X].
In particular, for q = 2 we have X + 1 | g ∈ F_2[X] if and only if C is an even-weight code. In this case, if u = [c_0, …, c_{n−1}] ∈ F_2^n is an error vector of odd weight, then we have ν(u)(1) = Σ_{i=0}^{n−1} c_i = wt(u) ≠ 0 ∈ F_2, hence g ∤ ν(u), saying that C detects all errors of odd weight.
iii) Letting q = 2, for a double error occurring in positions i < j ∈ {1, …, n}, we have the error vector u = e_i + e_j ∈ F_2^n, hence ν(u) = X̄^{i−1}(X̄^{j−i} + 1) ∈ F̄_2[X]. Thus, since gcd(g, X) = 1 and 1 ≤ j − i ≤ n − 1, all double errors are detected if and only if g ∤ X^m + 1 for all m ∈ {1, …, n − 1}, that is if and only if g has an n-primitive divisor; recall that an irreducible polynomial f ∈ F_q[X] such that f | X^n − 1, but f ∤ X^m − 1 for all m ∈ {1, …, n − 1}, is called n-primitive.

Example. Actually, CRC codes over F_2 are used throughout information technology. In particular, polynomial division over F_2 is extremely fast, on a machine level just consisting of bit shifts and 'exclusive or' (XOR) commands.
For example, this is used in the Universal Serial Bus (USB) [≥1996] data transmission standard: The 'CRC-5-USB' polynomial X^5 + X^2 + 1 ∈ F_2[X] is used to add 5 check bits to 'token' packets consisting of 11 information bits, making up a code of length 16, that is 2 Bytes; note that X^5 + X^2 + 1 is 31-primitive. Similarly, for 'data' packets, having length up to 1023 Bytes, the 'CRC-16' polynomial X^{16} + X^{15} + X^2 + 1 = (X + 1)(X^{15} + X + 1) ∈ F_2[X] is used; note that X^{15} + X + 1 is the lexicographically smallest irreducible polynomial of degree 15, and is 32767-primitive, where 2^{15} − 1 = 32767 = 7 · 31 · 151.

(8.5) Example: The RWTH-ID [Bunsen, J.M., 2007]. Identity management is a task which all large organizations dealing with many customers are faced with. The aim is to associate an identity number with any customer, in order to uniquely identify them. It should have the following properties: The set of available numbers should be large enough; the number should not convey any further information about the customer in question; the number should be easy to remember for human beings; and it should be possible to detect simple transmission errors.
To create identity numbers, an alphabet X consisting of 32 alpha-numerical symbols, decimal digits and capital Latin letters, is used; in order to avoid mixing up symbols, the letters I, J, O and V, resembling 1, 0 and U, respectively, are not allowed. Thus using 5 information symbols, we obtain a set of |X|^5 = 32^5 = 33 554 432 ∼ 3 · 10^7 words over X, to which we add a single check symbol, yielding identity numbers being words of length 6. To ease remembering identity numbers, these are written as two words of length three each, connected by a hyphen, for example SL8-BRX.
By source coding, X is encoded into the elements of F_2^5 as given in Table 6. Thus we get a linear binary code D ≤ F_2^30 of length 6 · 5 = 30 and dimension dim_{F_2}(D) = 5 · 5 = 25. To ease practical implementation, and to achieve the desired error detection properties, namely to be able to detect single errors and adjacent transposition errors, we aim at choosing D as a shortened cyclic code. To this end, we look for a suitable cyclic code C ≤ F_2^n of length n ≥ 30 and dimension dim_{F_2}(C) = n − 5; then D ≤ F_2^30 such that dim_{F_2}(D) = 25 is obtained by (n − 30)-fold shortening with respect to the last component; note that the encoding and decoding algorithms are not affected by shortening like this. Thus we look for a suitable (monic) generator polynomial g ∈ F_2[X] of degree k := deg(g) = 5, dividing the polynomial X^n + 1 ∈ F_2[X].
We consider the relevant error types: A single error yields a burst error of length at most 5, hence any such error is detected by any cyclic code with the above parameters. Moreover, an adjacent transposition error yields an error vector u = [0, …, 0; c_0, …, c_4; c_0, …, c_4; 0, …, 0] ∈ F_2^n, where [c_0, …, c_4] ∈ F_2^5. Hence we have ν(u) = X̄^j (X̄^5 + 1) · Σ_{i=0}^4 c_i X̄^i ∈ F̄_2[X], where the leftmost c_0 occurs in entry j ∈ {0, …, n − 1}. Hence all adjacent transposition errors are detected if and only if g ∤ ν(u) for all error vectors as above. This in turn holds if and only if gcd(g, X^5 + 1) = 1; note that we have the factorization X^5 + 1 = (X + 1)(X^4 + X^3 + X^2 + X + 1) ∈ F_2[X].
Since gcd(g, X) = 1 = gcd(g, X + 1) we conclude that g cannot possibly have a linear factor, thus either g is the product of two irreducible polynomials of degrees 2 and 3, respectively, or g is irreducible of degree 5. Now X^2 + X + 1 is the unique irreducible polynomial of degree 2, and X^3 + X + 1 and X^3 + X^2 + 1 are those of degree 3, hence leading to the candidate polynomials (X^2 + X + 1)(X^3 + X + 1) = X^5 + X^4 + 1 and (X^2 + X + 1)(X^3 + X^2 + 1) = X^5 + X + 1. Moreover, as further candidates there are the six irreducible polynomials of degree 5.
For n := 30 we find X^30 + 1 = (X^15 + 1)^2, where X^15 + 1 = (X + 1)(X^2 + X + 1)(X^4 + X +

Table 6: The alphabet of the RWTH-ID.

0 00000 8 01000 G 10000 R 11000 1 00001 9 01001 H 10001 S 11001 2 00010 A 01010 K 10010 T 11010 3 00011 B 01011 L 10011 U 11011 4 00100 C 01100 M 10100 W 11100 5 00101 D 01101 N 10101 X 11101 6 00110 E 01110 P 10110 Y 11110 7 00111 F 01111 Q 10111 Z 11111

1)(X^4 + X^3 + 1)(X^4 + X^3 + X^2 + X + 1) ∈ F_2[X], which does not have any of the above candidates as a divisor. But for n := 31 = 2^5 − 1 we find that X^31 + 1 = (X + 1) · (X^5 + X^2 + 1) · (X^5 + X^3 + 1) · (X^5 + X^3 + X^2 + X + 1) · (X^5 + X^4 + X^2 + X + 1) · (X^5 + X^4 + X^3 + X + 1) · (X^5 + X^4 + X^3 + X^2 + 1) ∈ F_2[X] is the product of X + 1 and all six irreducible polynomials of degree 5. Hence any of the latter is a suitable generator polynomial; since 31 is a prime all of them are 31-primitive. For the RWTH-ID the 'CRC-5-USB' polynomial g := X^5 + X^2 + 1 is used; let C ≤ F_2^31 be the associated cyclic code and D := C° ≤ F_2^30.
For example, for L8BRX from Table 6 we get
h = (1 + X^3 + X^4) + X^5 · X + X^10 · (X + X^3 + X^4) + X^15 · (1 + X) + X^20 · (1 + X + X^2 + X^4) ∈ F_2[X],

where polynomial division of X^5 · h by g yields the remainder 1 + X + X^4 ∈ F_2[X], which belongs to the symbol S, saying that SL8-BRX is a valid identity number.
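This computation can be replayed end-to-end; a sketch encoding F_2-polynomials as Python integers (bit i is the coefficient of X^i; Table 6 lists each symbol's alphabet index in binary, written as c_0 c_1 c_2 c_3 c_4; all helper names are ours):

```python
ALPHABET = "0123456789ABCDEFGHKLMNPQRSTUWXYZ"    # Table 6: I, J, O, V omitted

def sym_poly(s):
    """Polynomial c_0 + c_1 X + ... + c_4 X^4 of a symbol, as a bitmask."""
    v = ALPHABET.index(s)                        # Table 6 value, c_0 is its top bit
    return sum(((v >> (4 - i)) & 1) << i for i in range(5))

def pmod(a, b):
    """Remainder of a modulo b for F_2-polynomials encoded as bitmasks."""
    while a and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def check_symbol(word, g=0b100101):              # 'CRC-5-USB': X^5 + X^2 + 1
    h = 0
    for i, s in enumerate(word):                 # h = sum_i X^(5i) * nu(symbol_i)
        h ^= sym_poly(s) << (5 * i)
    r = pmod(h << 5, g)                          # remainder of X^5 * h modulo g
    return ALPHABET[sum(((r >> i) & 1) << (4 - i) for i in range(5))]
```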

9 BCH codes

(9.1) Roots of unity. Let F_q be the field with q elements, and let n ∈ N such that gcd(q, n) = 1. We consider the polynomial X^n − 1 ∈ F_q[X]. Since n ≠ 0 ∈ F_q we have gcd(∂/∂X (X^n − 1), X^n − 1) = gcd(nX^{n−1}, X^n − 1) = 1 ∈ F_q[X], implying that X^n − 1 ∈ F_q[X] is square-free, that is the product of pairwise distinct monic irreducible polynomials.

We aim at describing its factorization more precisely: Let F_q ⊆ F be algebraically closed. Since X^n − 1 ∈ F[X] still is square-free, we conclude that it splits into pairwise distinct linear factors; thus letting V_n := V(X^n − 1) ⊆ F^* be its set of zeroes, we have |V_n| = n and X^n − 1 = Π_{ζ∈V_n} (X − ζ) ∈ F[X].

Since whenever ζ, ζ′ ∈ V_n we also have ζ^{−1}ζ′ ∈ V_n, we conclude that V_n is a finite subgroup of F^*, hence by Artin's Theorem is cyclic. Thus there is a primitive n-th root of unity ζ_n ∈ V_n, that is an element of multiplicative order n, so that V_n = {ζ_n^i ∈ F^*; i ∈ Z_n}. Hence we have a group isomorphism Z_n → V_n: i ↦ ζ_n^i, the left hand side being an additive group. Moreover, ζ_n^i ∈ F^* has order min{j ∈ N; ζ_n^{ij} = 1 ∈ F^*} = min{j ∈ N; n | ij} = n/gcd(i, n); in particular ζ_n^i ∈ V_n is a primitive n-th root of unity if and only if i ∈ Z_n^*.

Let F_q(ζ_n) ⊆ F be the field generated by ζ_n over F_q. Then F_q(ζ_n) is the splitting field of X^n − 1, hence F_q ⊆ F_q(ζ_n) is a Galois extension. Thus its automorphism group Γ := Aut_{F_q}(F_q(ζ_n)) has finite order |Γ| = [F_q(ζ_n) : F_q] := dim_{F_q}(F_q(ζ_n)), hence |F_q(ζ_n)| = |F_q|^{[F_q(ζ_n):F_q]} = q^{|Γ|}, that is F_q(ζ_n) = F_{q^{|Γ|}} ⊆ F.

The group Γ is cyclic, generated by the Frobenius automorphism ϕ_q: F_q(ζ_n) → F_q(ζ_n): a ↦ a^q, having order |ϕ_q| = min{i ∈ N; ϕ_q^i = id_{F_q(ζ_n)}} = min{i ∈ N; ζ_n^{q^i} = ζ_n} = min{i ∈ N; q^i = 1 ∈ Z_n}. Thus |Γ| = |ϕ_q| coincides with the order of q ∈ Z_n^*. Identifying Z_n → V_n, the group Γ acts on Z_n by ϕ_q: Z_n → Z_n: i ↦ iq; the Γ-orbits in Z_n are called cyclotomic cosets.
The monic divisors g | X^n − 1 ∈ F_q(ζ_n)[X] are uniquely determined by their sets of zeroes V(g) ⊆ V_n as g = Π_{ζ∈V(g)} (X − ζ) ∈ F_q(ζ_n)[X]. Since Fix_{F_q(ζ_n)}(Γ) = F_q, we have g ∈ F_q[X] if and only if V(g) is a union of Γ-orbits. Thus the monic irreducible divisors of X^n − 1 ∈ F_q[X], being called cyclotomic polynomials, are given by the Γ-orbits on V_n as µ_i := Π_{ζ∈(ζ_n^i)^Γ} (X − ζ) ∈ F_q[X].
The polynomial µ_i ∈ F_q[X] is the minimum polynomial of ζ_n^i ∈ F_q(ζ_n) over F_q, hence we have [F_q(ζ_n^i) : F_q] = deg(µ_i) = |(ζ_n^i)^Γ| | |Γ| = [F_q(ζ_n) : F_q]. Thus we have equality deg(µ_i) = |Γ| if and only if F_q(ζ_n^i) = F_q(ζ_n); in particular this holds whenever ζ_n^i is a primitive n-th root of unity, that is i ∈ Z_n^*. Note that Γ acts semi-regularly on the set of primitive n-th roots of unity, but not necessarily transitively.

Example. For q := 2 and n := 7 we find that 2 ∈ Z_7^* has order 3, thus F_2(ζ_7) = F_8 and ϕ_2 ∈ Aut_{F_2}(F_8) has order 3. The Γ-orbits on V_7 are given by V_7 = {1} ∪ {ζ_7^i; i ∈ O′} ∪ {ζ_7^i; i ∈ O′′} ⊆ F_8, where the associated cyclotomic cosets are O′ := {1, 2, 4} and O′′ := {3, 5, 6}, thus X^7 + 1 = (X + 1) · Π_{i∈O′}(X + ζ_7^i) · Π_{i∈O′′}(X + ζ_7^i) = µ_0 µ_1 µ_3 = (X + 1) · (X^3 + X + 1) · (X^3 + X^2 + 1) ∈ F_2[X], where the latter two factors are irreducible in F_2[X]; note that here we do not specify which of the factors has ζ_7 as a zero.
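The cyclotomic cosets in this and later examples are easy to enumerate mechanically; a sketch (the helper name is ours):

```python
def cyclotomic_cosets(q, n):
    """The orbits of i -> i*q (mod n) on Z_n, assuming gcd(q, n) = 1."""
    seen, cosets = set(), []
    for i in range(n):
        if i not in seen:
            orbit, j = set(), i
            while j not in orbit:
                orbit.add(j)
                j = j * q % n
            cosets.append(sorted(orbit))
            seen |= orbit
    return cosets
```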

n (9.2) Zeroes of cyclic codes. Let C ≤ Fq , where gcd(q, n) = 1, be the cyclic code with generator polynomial g ∈ Fq[X] of degree k ∈ {0, . . . , n}. Let V(C) := V(g) ⊆ Vn be the set of zeroes of C; hence |V(C)| = deg(g) = k. Recall that the monic divisors of Xn − 1 are in bijection with the Γ-invariant subsets of Vn, where Γ := AutFq (Fq(ζn)).

Hence any subset V ⊆ Vn whose smallest Γ-invariant superset equals V(C), that

is V(C) = ∪_{ζ∈V} ζ^Γ ⊆ V_n, is called a defining set of C, and C is called the code associated with V; in particular, V(C) is the unique maximal defining set of C.
For v ∈ F_q^n we have v ∈ C if and only if g | ν(v) ∈ F̄_q[X], which since g ∈ F_q(ζ_n)[X] is square-free is equivalent to V(g) ⊆ V(ν(v)) ⊆ F, where F is an algebraic closure of F_q, that is V(g) is a subset of the set of zeroes of ν(v), which in turn is equivalent to V ⊆ V(ν(v)) for any defining set V of C; in other words, we have C = {[c_0, …, c_{n−1}] ∈ F_q^n; Σ_{i=0}^{n−1} c_i ζ^i = 0 ∈ F_q(ζ_n) for all ζ ∈ V}.
For the dual code C^⊥ ≤ F_q^n we have V(C^⊥) = {ζ ∈ V_n; ζ^{−1} ∉ V(C)}: Letting h ∈ F_q[X] such that gh = X^n − 1 ∈ F_q[X], we have h = Π_{ζ∈V_n\V(C)} (X − ζ) ∈ F_q(ζ_n)[X], thus h^* = Π_{ζ∈V_n\V(C)} (X − ζ)^* = Π_{ζ∈V_n\V(C)} ((−ζ) · (X − ζ^{−1})) ∈ F_q(ζ_n)[X], hence V(C^⊥) = V(h^*) = {ζ ∈ V_n; ζ^{−1} ∉ V(C)}.

(9.3) Theorem. a) Let C ≤ F_q^n, where gcd(q, n) = 1, be the cyclic code with set of zeroes V(C) = {ζ_n^{a_1}, …, ζ_n^{a_k}} ⊆ V_n, where {a_1, …, a_k} ⊆ Z_n and |V(C)| = k ∈ {0, …, n}. Then the Delsarte matrix [1975]

H = H(V(C)) := [ζ_n^{(j−1)a_i}]_{ij} =
[ 1  ζ_n^{a_1}  ζ_n^{2a_1}  …  ζ_n^{(n−1)a_1} ]
[ 1  ζ_n^{a_2}  ζ_n^{2a_2}  …  ζ_n^{(n−1)a_2} ]
[ ⋮ ]
[ 1  ζ_n^{a_k}  ζ_n^{2a_k}  …  ζ_n^{(n−1)a_k} ]
∈ F_q(ζ_n)^{k×n}

is a generalized check matrix of C, that is we have C = ker(H^tr) ∩ F_q^n. Moreover, if V ⊆ V(C) is a defining set of C, then we have C = ker(H(V)^tr) ∩ F_q^n.
b) If V(C) contains a consecutive set {ζ_n^a, ζ_n^{a+b}, …, ζ_n^{a+(δ−2)b}}, where a ∈ Z_n and b ∈ Z_n^* and δ ∈ {2, …, n}, then C has minimum distance at least δ.

Proof. a) We have to show that ker(H^tr) ∩ F_q^n = C: The submatrix of H consisting of the first k columns is a Vandermonde matrix associated with the pairwise distinct roots of unity {ζ_n^{a_1}, …, ζ_n^{a_k}}, hence is invertible. Thus we infer rk_{F_q(ζ_n)}(H) = k, hence we have dim_{F_q(ζ_n)}(ker(H^tr)) = n − k, which implies dim_{F_q}(ker(H^tr) ∩ F_q^n) ≤ n − k; note that, whenever S ⊆ F_q^n, Gaussian elimination shows dim_{F_q}(⟨S⟩_{F_q}) = dim_{F_q(ζ_n)}(⟨S⟩_{F_q(ζ_n)}).
Since dim_{F_q}(C) = n − k it suffices to show that C ≤ ker(H^tr): Let g = Σ_{i=0}^k g_i X^i := Π_{ζ∈V(C)} (X − ζ) ∈ F_q(ζ_n)[X] be the monic generator polynomial of C. For the i-th row v_i = [0, …, 0, g_0, …, g_k, 0, …, 0] ∈ F_q^n of the generator matrix of C associated with g, where i ∈ {1, …, n − k}, and the j-th row w_j ∈ F_q(ζ_n)^n of H, where j ∈ {1, …, k}, we get ⟨v_i, w_j⟩ = Σ_{l=0}^k g_l ζ_n^{(i+l−1)a_j} = ζ_n^{(i−1)a_j} · Σ_{l=0}^k g_l (ζ_n^{a_j})^l = ζ_n^{(i−1)a_j} · g(ζ_n^{a_j}) = 0 ∈ F_q(ζ_n), hence v_i ∈ ker(H^tr).
Finally, for any row w ∈ F_q(ζ_n)^n of H there is a row u ∈ F_q(ζ_n)^n of H(V) such that w = u^{q^i} = ϕ_q^i(u), for some i ∈ N_0, where ϕ_q ∈ Γ is applied component-wise. Since for v ∈ F_q^n we have v^q = v, we from ⟨v, w⟩ = ⟨v, u^{q^i}⟩ = ⟨v^{q^i}, u^{q^i}⟩ = ⟨v, u⟩^{q^i} ∈ F_q(ζ_n) infer that v ∈ ker(H^tr) ∩ F_q^n if and only if v ∈ ker(H(V)^tr) ∩ F_q^n.
b) Since b ∈ Z_n^* we conclude that ζ_n^b ∈ V_n is a primitive n-th root of unity as well. Letting c := ab^{−1} ∈ Z_n, we observe that {ζ_n^a, ζ_n^{a+b}, …, ζ_n^{a+(δ−2)b}} = {(ζ_n^b)^c, (ζ_n^b)^{c+1}, …, (ζ_n^b)^{c+δ−2}}. Hence we may assume that b = 1. We consider the rows of H corresponding to the consecutive set, that is the matrix H({ζ_n^a, …, ζ_n^{a+δ−2}}) ∈ F_q(ζ_n)^{(δ−1)×n}, and show that any (δ − 1)-subset of its columns is F_q(ζ_n)-linearly independent:

Picking columns {j1, . . . , jδ−1} ⊆ {1, . . . , n} yields the square matrix

H̃ :=
[ ζ_n^{(j_1−1)a}        …  ζ_n^{(j_{δ−1}−1)a}       ]
[ ζ_n^{(j_1−1)(a+1)}    …  ζ_n^{(j_{δ−1}−1)(a+1)}   ]
[ ⋮ ]
[ ζ_n^{(j_1−1)(a+δ−2)}  …  ζ_n^{(j_{δ−1}−1)(a+δ−2)} ]
∈ F_q(ζ_n)^{(δ−1)×(δ−1)}.

Hence we have

H̃ =
[ 1                   …  1                   ]
[ ζ_n^{j_1−1}         …  ζ_n^{j_{δ−1}−1}     ]
[ ⋮ ]
[ ζ_n^{(j_1−1)(δ−2)}  …  ζ_n^{(j_{δ−1}−1)(δ−2)} ]
· diag[ζ_n^{(j_1−1)a}, …, ζ_n^{(j_{δ−1}−1)a}],

where the left hand factor is a Vandermonde matrix associated with the pairwise distinct roots of unity {ζ_n^{j_1−1}, …, ζ_n^{j_{δ−1}−1}}, thus is invertible. ]

(9.4) Hamming codes again. We show that, generically, Hamming codes are linearly equivalent to cyclic codes: Let F_q be the field with q elements, let k ≥ 2 such that gcd(k, q − 1) = 1, and let n := (q^k − 1)/(q − 1); note that gcd(q, n) = 1, and that the condition on k always holds for q = 2. Let C ≤ F_q^n be the cyclic code with defining set {ζ_n}, that is V(C) = (ζ_n)^Γ ⊆ V_n, in other words having g = µ_1 ∈ F_q[X] as a generator polynomial. Then C is an [n, n − k, 3]-code, thus is linearly equivalent to the Hamming code H_k ≤ F_q^n:
We show that the order of q ∈ Z_n^* equals k: We have n | q^k − 1, hence the order in question divides k; assume that n | q^l − 1 for some l ∈ {1, …, k − 1}, then we have q^k − 1 | (q − 1)(q^l − 1) = q^{l+1} − q^l − q + 1, thus q^k ≤ q^k − 2q + 2 ≤ q^k − 2, a contradiction. This implies (ζ_n)^Γ = {ζ_n, ζ_n^q, …, ζ_n^{q^{k−1}}}, thus we have deg(µ_1) = k, implying that dim_{F_q}(C) = n − k.
We show that C has minimum distance d(C) = 3: Since q^{n−k} · Σ_{i=0}^1 (n choose i)(q − 1)^i = q^{n−k} · (1 + n(q − 1)) = q^{n−k} q^k = q^n, the Hamming bound yields d(C) ≤ 3. Moreover, V(C) contains the consecutive set {ζ_n, ζ_n^q} of length 2 and step size q − 1. Now n = (q^k − 1)/(q − 1) = Σ_{i=0}^{k−1} q^i ≡ k (mod q − 1) implies gcd(n, q − 1) = gcd(k, q − 1) = 1, hence we have q − 1 ∈ Z_n^*, entailing d(C) ≥ 3. ]

(9.5) BCH codes [Bose, Ray-Chaudhuri, 1960; Hocquenghem, 1959].
a) A cyclic code C ≤ F_q^n, where gcd(q, n) = 1, associated with a (genuinely) consecutive set {ζ_n^a, …, ζ_n^{a+δ−2}} ⊆ V_n of length δ − 1, where a ∈ Z_n and δ ∈ {1, …, n + 1}, is called a BCH code of designed distance δ. Note that C might be a BCH code with respect to consecutive sets of varying lengths (or varying step sizes, which amounts to changing the chosen primitive n-th root of unity); the largest designed distance occurring is called the Bose distance.
Note that δ = 1 if and only if V(C) = ∅, that is C = F_q^n, thus d(C) = 1; and that δ = n + 1 implies V(C) = V_n, that is C = {0}, thus d(C) = ∞. Hence, in all cases, for the minimum distance of C we have the BCH bound d(C) ≥ δ.
If n = q^{|Γ|} − 1, that is the multiplicative group F_q(ζ_n)^* = F_{q^{|Γ|}}^* is generated by the primitive element ζ_n = ζ_{q^{|Γ|}−1}, then C is called primitive. If a = 1, that is the consecutive set considered is {ζ_n, …, ζ_n^{δ−1}}, then C is called a narrow sense BCH code; in particular, for δ = n we get V(C) = {ζ_n, …, ζ_n^{n−1}} = V_n \ {1} = V((X^n − 1)/(X − 1)), hence C is the repetition code.
b) We comment on the parameters of BCH codes. Firstly, we consider the dimension of BCH codes: We have n − dim_{F_q}(C) = |V(C)| = |∪_{i=0}^{δ−2} (ζ_n^{a+i})^Γ| ≤ (δ − 1) · |Γ|, hence in general we have dim_{F_q}(C) ≥ n − (δ − 1) · |Γ|. More specially, if q = 2 and C is a narrow sense BCH code, then from µ_i = µ_{2i} ∈ F_2[X], for all i ∈ Z_n, we get V(C) = ∪_{i=1}^{δ−1} (ζ_n^i)^Γ = ∪_{i=1}^{⌊δ/2⌋} (ζ_n^{2i−1})^Γ ⊆ V_n, which yields the better estimate dim_{F_2}(C) ≥ n − ⌊δ/2⌋ · |Γ|.
Secondly, we consider the true minimum distance of BCH codes, which in general might be strictly larger than the Bose distance, as the examples given below show. But we have Peterson's Theorem [1967], saying that if C is a narrow sense BCH code of designed distance δ | n, then C has minimum distance δ:
Let n = lδ, where l ∈ N. Then we have X^n − 1 = (X^l − 1) · Σ_{i=0}^{δ−1} X^{il} ∈ F_q[X].
Since ζ_n^{il} ≠ 1 ∈ F_q(ζ_n), for all i ∈ {1, …, δ − 1}, we conclude that {ζ_n, …, ζ_n^{δ−1}} ∩ V(X^l − 1) = ∅. Thus we have {ζ_n, …, ζ_n^{δ−1}} ⊆ V(Σ_{i=0}^{δ−1} X^{il}) ⊆ V_n, which implies that V(C) = ∪_{i=1}^{δ−1} (ζ_n^i)^Γ ⊆ V(Σ_{i=0}^{δ−1} X^{il}). Hence we have [1, 0, …, 0; …; 1, 0, …, 0] = Σ_{i=0}^{δ−1} e_{il} ∈ C, which has weight δ. ]

Example. For q = 2 and n = 2^k − 1, where k ∈ {2, 3, 4}, we get the following: We have X^3 + 1 = µ_0 µ_1 = (X + 1)(X^2 + X + 1) ∈ F_2[X] and X^7 + 1 = µ_0 · µ_1 µ_3 = (X + 1) · (X^3 + X + 1)(X^3 + X^2 + 1) ∈ F_2[X]; see (9.1). Moreover, we have X^15 + 1 = µ_0 µ_5 µ_3 · µ_1 µ_7 = (X + 1)(X^2 + X + 1)(X^4 + X^3 + X^2 + X + 1) · (X^4 + X + 1)(X^4 + X^3 + 1) ∈ F_2[X], where µ_5 = Π_{i∈{5,10}} (X − ζ_15^i) = Π_{i∈{1,2}} (X − ζ_3^i) and µ_3 = Π_{i∈{3,6,9,12}} (X − ζ_15^i) = Π_{i∈{1,2,3,4}} (X − ζ_5^i) and µ_1 = Π_{i∈{1,2,4,8}} (X − ζ_15^i) and µ_7 = Π_{i∈{7,11,13,14}} (X − ζ_15^i) = Π_{i∈{1,2,4,8}} (X − ζ_15^{−i}).
Hence we have the associated narrow sense primitive binary BCH codes C as given in Table 7, where we indicate the Bose distance δ, the monic generator polynomial g ∈ F_2[X], the union O of cyclotomic cosets associated with V(C),

Table 7: Narrow sense primitive binary BCH codes.

n = 7:                                  n = 3:
δ  g          O           dim  d        δ  g        O        dim  d
1  1          ∅           7    1        1  1        ∅        3    1
3  µ_1        {1, 2, 4}   4    3        3  µ_1      {1, 2}   1    3
7  µ_1µ_3     {1, …, 6}   1    7        4  µ_1µ_0   Z_3      0    ∞
8  µ_1µ_3µ_0  Z_7         0    ∞

n = 15:
δ   g                O                                 dim  d
1   1                ∅                                 15   1
3   µ_1              {1, 2, 4, 8}                      11   3
5   µ_1µ_3           {1, 2, 3, 4, 6, 8, 9, 12}         7    5
7   µ_1µ_3µ_5        {1, 2, 3, 4, 5, 6, 8, 9, 10, 12}  5    7
15  µ_1µ_3µ_5µ_7     {1, …, 14}                        1    15
16  µ_1µ_3µ_5µ_7µ_0  Z_15                              0    ∞

the dimension dimF2 (C) = n − deg(g) = n − |O|, and the minimum distance d. Note that in all cases given we observe that δ = d, which is explained by (10.3) below, together with Peterson’s Theorem.
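The dimensions in Table 7 equal n − |O|; a sketch computing |O| as a union of cyclotomic cosets (helper names are ours):

```python
def coset(q, n, i):
    """Cyclotomic coset of i modulo n with respect to q."""
    orbit, j = set(), i % n
    while j not in orbit:
        orbit.add(j)
        j = j * q % n
    return orbit

def bch_dimension(q, n, delta):
    """n - |O| for the narrow sense BCH code of designed distance delta,
    where O is the union of the cosets of 1, ..., delta - 1."""
    O = set().union(*(coset(q, n, i) for i in range(1, delta)))
    return n - len(O)
```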

(9.6) Reed-Solomon codes [1954]. a) We consider the first case of primitive BCH codes, that is n := q − 1. We have F_q^* = ⟨ζ_{q−1}⟩, hence [F_q(ζ_{q−1}) : F_q] = 1, thus Γ is trivial, and X^{q−1} − 1 = Π_{i=0}^{q−2} (X − ζ_{q−1}^i) ∈ F_q[X]. A primitive BCH code C ≤ F_q^{q−1} is called a Reed-Solomon code. Thus V(C) coincides with the defining consecutive set V := {ζ_{q−1}^a, …, ζ_{q−1}^{a+δ−2}} = ζ_{q−1}^{a−1} · {ζ_{q−1}, …, ζ_{q−1}^{δ−1}} ⊆ F_q^*, where a ∈ Z_{q−1} and δ ∈ {1, …, q} is the designed distance of C.

Thus we have k := dim_{F_q}(C) = (q − 1) − (δ − 1) = q − δ. Hence if k ≥ 1, that is δ < q, then from the Singleton and BCH bounds we get δ − 1 = (q − 1) − k ≥ d − 1 ≥ δ − 1, where d := d(C) ∈ N is the minimum distance of C, showing that d = δ, implying that C is an MDS [q − 1, q − δ, δ]-code. We describe a fast encoding procedure for C: We observe first that V^⊥ := V(C^⊥) = F_q^* \ V^{−1} = {ζ_{q−1}^i ∈ F_q^*; i ∈ Z_{q−1} \ (−a − {0, …, δ − 2})} = {ζ_{q−1}^i ∈ F_q^*; i ∈ (−a + {1, …, q − δ}) ⊆ Z_{q−1}}; in particular C^⊥ is a Reed-Solomon code again. Hence the (conventional) check matrix H(V^⊥) = [ζ_{q−1}^{(i−a)(j−1)}]_{ij} ∈

F_q^{(q−δ)×(q−1)} of C^⊥ ≤ F_q^{q−1} is a generator matrix of C, where we abbreviate ζ := ζ_{q−1}:

            [ 1   1           1             ...   1                   ]
            [ 1   ζ           ζ^2           ...   ζ^{q−2}             ]
H(V^⊥)  =   [ 1   ζ^2         ζ^4           ...   ζ^{2(q−2)}          ]  ·  diag([ζ^{(q−a)(j−1)}]_j)
            [ .   .           .                   .                   ]
            [ 1   ζ^{q−δ−1}   ζ^{2(q−δ−1)}  ...   ζ^{(q−2)(q−δ−1)}    ]

Thus for v := [a_0, …, a_{q−δ−1}] ∈ F_q^{q−δ} the associated codeword v · H(V^⊥) ∈ F_q^{q−1} is given as follows: Letting w_i := [ζ_{q−1}^{(i−a)(j−1)}]_j ∈ F_q^{q−1} be the i-th row of H(V^⊥), for i ∈ {1, …, q − δ}, we get Σ_{i=0}^{q−δ−1} a_i w_{i+1} = [Σ_{i=0}^{q−δ−1} a_i (ζ_{q−1}^{j−1})^{i−(a−1)}]_j = [Σ_{i=0}^{q−δ−1} a_i (ζ_{q−1}^{j−1})^{i+(q−a)}]_j = [(X^{q−a}ν(v))(ζ_{q−1}^{j−1})]_j ∈ F_q^{q−1}. Hence we have C = ⟨w_1, …, w_{q−δ}⟩_{F_q} = {[(X^{q−a}ν(v))(ζ_{q−1}^{j−1})]_j ∈ F_q^{q−1}; v ∈ F_q^{q−δ}} ≤ F_q^{q−1}. In particular, for narrow sense Reed-Solomon codes, that is a = 1, we obtain C = {[ν(v)(1), ν(v)(ζ_{q−1}), …, ν(v)(ζ_{q−1}^{q−2})] ∈ F_q^{q−1}; v ∈ F_q^{q−δ}} ≤ F_q^{q−1}, saying that for the above generator matrix the codeword associated with v ∈ F_q^{q−δ} is obtained by evaluating the polynomial ν(v) ∈ F_q[X] at the elements of F_q^*.
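The evaluation encoder can be sketched in a toy case. The specific data is our own choice, not from the text: q = 7 (so F_q = Z_7), the primitive root ζ_6 := 3, narrow sense (a = 1), and designed distance δ = 3, giving a [6, 4, 3] Reed-Solomon code:

```python
from itertools import product

q, zeta, delta = 7, 3, 3          # hypothetical toy parameters; 3 generates F_7^*
n, k = q - 1, q - delta           # [6, 4] Reed-Solomon code over F_7

def encode(v):
    """Evaluate nu(v) = sum v[i] X^i at the points 1, zeta, ..., zeta^(n-1)."""
    points = [pow(zeta, j, q) for j in range(n)]
    return [sum(v[i] * pow(x, i, q) for i in range(k)) % q for x in points]

# Brute force over all 7^4 messages: every nonzero codeword has weight >= 3,
# so d = delta = 3, matching d = n - k + 1 (the code is MDS, as shown above).
weights = [sum(c != 0 for c in encode(v))
           for v in product(range(q), repeat=k) if any(v)]
```

The constant message [1, 0, 0, 0] encodes to the all-ones word, and [0, 1, 0, 0] encodes to the list of evaluation points themselves.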

Hence for C := GRS_k(α, v) we have dim_{F_q}(C) ≤ k. Moreover, since any 0 ≠ f ∈ F_q[X]

(9.7) Example: The Audio Compact Disc. The Red Book Standard [1982], nowadays called DIN EN 60908, for the compact disc digital audio (CD-DA) system has been developed by the companies 'Sony' and 'Philips'. The amplitude of the analog audio data is sampled at a frequency of 44.1 kHz. By the Nyquist-Shannon Theorem frequencies up to half of the sampling frequency can be encoded and decoded, thus here up to ∼22 kHz. To prevent moiré artifacts, the analog signal has to run through a low pass (anti-aliasing) filter before digitization. The analog signal is encoded using 16-bit pulse code modulation (PCM). Hence, using 2^8 = 256 symbols instead of only the symbols 0 and 1, that is Bytes instead of bits, a stereo audio signal sample needs 4 Byte. Thus digitization produces 4 · 44100 Byte/s = 176400 Byte/s = 1411200 bit/s. Given the running time of 74 min, this yields a total of 74 · 60 · 176400 Byte = 783216000 Byte ∼ 783 MB. Now 6 samples form a word of 24 Byte = 192 bit, being called a frame. These are encoded using a cross-interleaved Reed-Solomon code (CIRC), which essentially works as follows: First, using an outer [28, 24, 5]-code C_2, which is shortened from the narrow sense Reed-Solomon [255, 251, 5]-code over F_256, words of length 24 are encoded into words of length 28. Then an interleaver with offset four is applied: Codewords [x_{i1}, …, x_{in}] ∈ F_q^n, for i ∈ Z, are written diagonally into a matrix and are read out column-wise, as the following scheme with offset one shows:

  [ ···  x_{i,1}   x_{i+1,1}  ··· ]
  [ ···  x_{i,2}   x_{i+1,2}  ··· ]
  [          ..          ..       ]
  [ ···  x_{i,n}   x_{i+1,n}  ··· ]

Next, an inner [32, 28, 5]-code C_1, again shortened from the narrow sense Reed-Solomon [255, 251, 5]-code over F_256, encodes words of length 28 into words of length 32. Finally, a further Byte is added containing subchannel information, yielding words of total length 33.

The idea of this encoding scheme is as follows: The code C_1 is 2-error correcting, where single C_1 errors are corrected, while words with two errors (typically) are marked as erasures. The resulting words of length 28 are de-interleaved, leading to a distribution of erasures, called C_2 errors. The code C_2 is 2-error correcting as well, hence is able to correct four erasures. Hence, given g ∈ N consecutive erasures, that is columns of the above scheme, due to offset four any diagonally written codeword is affected in at most ⌈g/4⌉ positions. Thus burst errors, which for example result from surface scratches, with a loss of up to 16 words can be corrected this way. Still remaining CU errors are treated by interpolation, and finally oversampling is applied against aliasing. The data is stored as a spiral track of pits moulded into a polycarbonate layer. The pits are 100 nm deep, 500 nm wide, and at least 850 nm long; the regions between pits are called lands. The data is read by a 780 nm solid state laser, where a pit-land or a land-pit change is read as a 1, and 0 otherwise. This technique requires that between two read 1's there must be at least 2 and at most 10 read 0's. This is achieved by eight-to-fourteen modulation (EFM), where each Byte, that is each 8-bit word, is replaced by a 14-bit word, using table lookup. Then a suitable 3-bit merging word is added between two 14-bit words. Finally, a 3 Byte synchronization word is added, together with another 3-bit merging word. The synchronization word does not occur elsewhere in the bit stream, hence can be used to detect the beginning of a frame. Hence a frame consists of 33 · (14 + 3) + (24 + 3) bit = 588 bit, which amounts to an information rate of 192/588 = 16/49 ∼ 0.33, hence a bit rate of (588/192) · 1411200 bit/s = 4321800 bit/s = 540225 Byte/s, and a total of 74 · 60 · 540225 Byte = 2398599000 Byte ∼ 2.4 GB.
Moreover, a burst error of 16 words of length 32 Byte is contained in 16 · 588 bit = 9408 bit; since a bit needs some 300 nm of track length, this amounts to some 9408 · 300 nm = 2822400 nm ∼ 2.8 mm.
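The arithmetic of this example can be re-checked mechanically; a bookkeeping sketch using only the quantities quoted above:

```python
from fractions import Fraction

audio_byte_rate = 4 * 44100                   # 176400 Byte/s of raw PCM audio
audio_bit_rate = 8 * audio_byte_rate          # 1411200 bit/s
audio_total = 74 * 60 * audio_byte_rate       # 783216000 Byte for 74 min

frame_bits = 33 * (14 + 3) + (24 + 3)         # EFM frame: 588 bit
info_rate = Fraction(192, frame_bits)         # 192 audio bits per frame: 16/49

channel_bit_rate = audio_bit_rate * frame_bits // 192   # 4321800 bit/s on disc
channel_byte_rate = channel_bit_rate // 8               # 540225 Byte/s
channel_total = 74 * 60 * channel_byte_rate             # 2398599000 Byte

burst_nm = 16 * frame_bits * 300              # 16-frame burst, ~2.8 mm of track
```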

10 Minimum distance of BCH codes

We have already seen that the BCH bound for the minimum distance of a BCH code is not necessarily sharp. We now proceed in two opposite directions: Firstly, we improve on the idea behind the BCH bound in order to obtain better bounds. Secondly, in the narrow sense primitive case, we provide sufficient criteria ensuring that the BCH bound is actually sharp.

(10.1) Van-Lint-Wilson bound [1986]. We need a definition first: Letting A = [a_{ij}]_{ij} ∈ F_q^{r×n} and B = [b_{i'j}]_{i'j} ∈ F_q^{s×n}, where n ∈ N and r, s ∈ N_0, let A ∗ B := [a_{ij}b_{i'j}]_{(i−1)s+i',j} ∈ F_q^{rs×n}, where i ∈ {1, …, r} and i' ∈ {1, …, s}. Note that A ∗ B in general does not have full rank, even if A and B have.
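The column-wise product A ∗ B can be illustrated directly (toy integer matrices, our own data; row (i, i') of the result is the entrywise product of row i of A and row i' of B):

```python
def star(A, B):
    """A * B: all entrywise products of a row of A with a row of B.
    Rows appear in the order (i-1)*s + i', so the result is rs x n."""
    return [[ra[j] * rb[j] for j in range(len(A[0]))]
            for ra in A for rb in B]

A = [[1, 2, 3],
     [4, 5, 6]]          # r = 2, n = 3
B = [[1, 0, 1],
     [1, 1, 0]]          # s = 2
C = star(A, B)           # a 4 x 3 matrix
```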

Theorem. Let v ∈ C := ker((A ∗ B)^tr) ≤ F_q^n, and let A_J ∈ F_q^{r×|J|} and B_J ∈ F_q^{s×|J|} be the submatrices of A and B, respectively, consisting of the columns in J := supp(v) ⊆ {1, …, n}. Then we have rk_{F_q}(A_J) + rk_{F_q}(B_J) ≤ |J| = wt(v).

Proof. Let v = [x_1, …, x_n], where we may assume that J = {1, …, n}, that is x_j ≠ 0 for all j ∈ {1, …, n}. Letting B' := B · diag[x_1, …, x_n] = [b_{i'j}x_j]_{i'j} ∈ F_q^{s×n}, we have rk_{F_q}(B) = rk_{F_q}(B') ∈ N_0. Moreover, the condition v · (A ∗ B)^tr = [Σ_{j=1}^n a_{ij}b_{i'j}x_j]_{(i−1)s+i'} = 0 ∈ F_q^{rs} can be rewritten as A · B'^tr = 0 ∈ F_q^{r×s}. In other words, the row space of A is contained in the orthogonal space of the row space of B', hence we have rk_{F_q}(A) ≤ n − rk_{F_q}(B') = n − rk_{F_q}(B). ]

Corollary. If for all ∅ ≠ I ⊆ {1, …, n} such that |I| ≤ d − 1, for some d ∈ N, we have rk_{F_q}(A_I) + rk_{F_q}(B_I) > |I|, then C has minimum distance at least d.

(10.2) Theorem: Roos bound [1983]. For n ∈ N let V' ⊆ V_n ⊆ F_q(ζ_n), where gcd(q, n) = 1, be a consecutive set of length δ − 1, where δ ∈ {2, …, n + 1}. Moreover, let ∅ ≠ V'' ⊆ V_n be any subset such that there is a consecutive subset of V_n containing V'' and having length |V''| + δ − 2. Then the cyclic code C ≤ F_q^n associated with V := V' · V'' ⊆ V_n has minimum distance at least δ − 1 + |V''|; note that we recover the BCH bound from V'' = {1}.

Proof. Since the matrices H(V') ∗ H(V'') ∈ F_q(ζ_n)^{(|V'|·|V''|)×n} and H(V) ∈ F_q(ζ_n)^{|V|×n} have the same set of rows, we have C = ker(H(V)^tr) ∩ F_q^n = ker((H(V') ∗ H(V''))^tr) ∩ F_q^n. We aim at applying the van-Lint-Wilson bound: For ∅ ≠ I ⊆ {1, …, n}, by the BCH bound we have rk_{F_q(ζ_n)}(H(V')_I) = |I| if |I| ≤ δ − 1, and thus rk_{F_q(ζ_n)}(H(V')_I) ≥ δ − 1 if |I| ≥ δ. Since we always have rk_{F_q(ζ_n)}(H(V'')_I) ≥ 1, we get rk_{F_q(ζ_n)}(H(V')_I) + rk_{F_q(ζ_n)}(H(V'')_I) > |I| if |I| ≤ δ − 1. This settles the case |V''| = 1. Hence let now |V''| ≥ 2 and |I| ≥ δ: Let V'' ⊆ W ⊆ V_n be a consecutive set, again by the BCH bound yielding rk_{F_q(ζ_n)}(H(W)_I) ≥ |I| if |I| ≤ |W|. Since deleting the rows of H(W) corresponding to W \ V'' yields H(V''), we infer rk_{F_q(ζ_n)}(H(V'')_I) ≥ |I| − |W| + |V''| if |I| ≤ |W|. This yields rk_{F_q(ζ_n)}(H(V')_I) + rk_{F_q(ζ_n)}(H(V'')_I) ≥ δ − 1 + |I| − |W| + |V''| if δ ≤ |I| ≤ |W|. Here, the right hand side exceeds |I| if and only if δ ≤ |I| ≤ |W| ≤ |V''| + δ − 2, in which case C has minimum distance at least |I| + 1. Now the assertion follows from choosing |I| = |W| = |V''| + δ − 2. ]

Corollary: Hartmann, Tzeng [1972]. Let V' ⊆ V_n be a consecutive set of length δ − 1, where δ ∈ {2, …, n + 1}, and let ∅ ≠ V'' ⊆ V_n be a generalized consecutive set of some step size in Z_n^* such that |V''| ≤ n + 2 − δ. Then the cyclic code associated with V := V' · V'' has minimum distance at least δ − 1 + |V''|.

Proof. The set V'' can be extended to a consecutive set with the same step size and having length |V''| + δ − 2. Recall that the rank estimates for Delsarte matrices also hold for generalized consecutive sets. ]

Example. Let q := 2 and n := 35. Then 2 ∈ Z_35^* has order 12, and the cyclotomic cosets are given as

{0} ∪ {5, 10, 20} ∪ {15, 25, 30} ∪ {7, 14, 21, 28} ∪ {1, 2, 4, 8, 9, 11, 16, 18, 22, 23, 29, 32} ∪ {3, 6, 12, 13, 17, 19, 24, 26, 27, 31, 33, 34}.

Let C ≤ F_2^{35} be the cyclic code associated with g := µ_1µ_5µ_7 ∈ F_2[X]. Hence V(C) is given by {1, 2, 4, 5, 7, 8, 9, 10, 11, 14, 16, 18, 20, 21, 22, 23, 28, 29, 32} ⊆ Z_35, thus for the minimum distance of C the BCH bound yields d(C) ≥ 6. But C is also associated with O := {7, 8, 9, 10, 11} ∪ {20, 21, 22, 23}, which letting O' := {7, 8, 9, 10} and O'' := {0, 1, 13} can be written as O = O' + O'' ⊆ Z_35. In order to apply the Roos bound with δ = 5, entailing d(C) ≥ δ − 1 + |O''| =

7, we have to embed O'' into a consecutive set of length |O''| + δ − 2 = 6: Since 3 ∈ Z_35^* we conclude that ζ_35^3 ∈ V_35 is a primitive 35-th root of unity as well; recall that the rank estimates of check matrices associated with consecutive sets do not depend on a particular choice of a primitive root of unity. Thus we indeed get 3 · O'' = {0, 3, 4} ⊆ {0, …, 5} ⊆ Z_35. To show conversely that d(C) ≤ 7, we choose µ_1 := X^{12} + X^{11} + X^{10} + X^8 + X^5 + X^4 + X^3 + X^2 + 1 ∈ F_2[X], which entails µ_3 = X^{12} + X^{10} + X^9 + X^8 + X^7 + X^4 + X^2 + X + 1, as well as µ_5 = X^3 + X + 1 and µ_15 = X^3 + X^2 + 1, while µ_7 = X^4 + X^3 + X^2 + X + 1 anyway. This yields g = µ_1µ_5µ_7 = X^{19} + X^{15} + X^{14} + X^{13} + X^{12} + X^{10} + X^9 + X^7 + X^6 + X^2 + 1. It turns out that g | f := X^{28} + X^{16} + X^{14} + X^{11} + X^7 + X + 1, or equivalently f(ζ_35) = f(ζ_35^5) = f(ζ_35^7) = 0. Hence f ∈ F_2[X] corresponds to a codeword of weight 7. ]
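The divisibility claim above can be verified by machine; a sketch, where encoding F_2[X] as integer bitmasks (bit i ↔ X^i) is our own convention: the product µ_1µ_5µ_7 reproduces g, and adding to X^28 its remainder modulo g produces a multiple of g of weight 7, which is the codeword exhibited above.

```python
def pmul(a, b):
    """Carry-less product in F_2[X], polynomials as coefficient bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def pmod(a, b):
    """Remainder of a modulo b in F_2[X]."""
    while a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def poly(*exps):
    """The F_2[X] polynomial with the given exponents, e.g. poly(3,1,0) = X^3+X+1."""
    v = 0
    for e in exps:
        v |= 1 << e
    return v

mu1 = poly(12, 11, 10, 8, 5, 4, 3, 2, 0)
mu5 = poly(3, 1, 0)
mu7 = poly(4, 3, 2, 1, 0)
g = pmul(pmul(mu1, mu5), mu7)        # the generator polynomial of C
f = poly(28) ^ pmod(poly(28), g)     # X^28 plus its remainder: a multiple of g
```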

Example. Let q := 2 and n := 127 = 2^7 − 1. Then 2 ∈ Z_127^* has order 7, and the cyclotomic cosets are given as
{0} ∪ {1, 2, 4, 8, 16, 32, 64} ∪ {3, 6, 12, 24, 48, 65, 96}
∪ {5, 10, 20, 33, 40, 66, 80} ∪ {7, 14, 28, 56, 67, 97, 112}
∪ {9, 17, 18, 34, 36, 68, 72} ∪ {11, 22, 44, 49, 69, 88, 98}
∪ {13, 26, 35, 52, 70, 81, 104} ∪ {15, 30, 60, 71, 99, 113, 120}
∪ {19, 25, 38, 50, 73, 76, 100} ∪ {21, 37, 41, 42, 74, 82, 84}
∪ {23, 46, 57, 75, 92, 101, 114} ∪ {27, 51, 54, 77, 89, 102, 108}
∪ {29, 39, 58, 78, 83, 105, 116} ∪ {31, 62, 79, 103, 115, 121, 124}
∪ {43, 45, 53, 85, 86, 90, 106} ∪ {47, 61, 87, 94, 107, 117, 122}
∪ {55, 59, 91, 93, 109, 110, 118} ∪ {63, 95, 111, 119, 123, 125, 126}.

Let C^⊥ ≤ F_2^{127} be the narrow sense primitive BCH code with designed distance 11. Hence C^⊥ is associated with the cyclotomic cosets of {1, 3, 5, 7, 9} ⊆ {1, …, 10}, thus V(C^⊥) has cardinality 35 and is given by

O^⊥ := {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 17, 18, 20, 24, 28, 32, 33, 34, 36, 40, 48, 56, 64, 65, 66, 67, 68, 72, 80, 96, 97, 112} ⊆ Z_127.

Let C := (C^⊥)^⊥ ≤ F_2^{127}. Hence V(C) is given by O := Z_127 \ (−O^⊥), that is O = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14; 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29; 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46; 48, 49, 50, 51, 52, 53, 54; 56, 57, 58; 64, 65, 66, 67, 68, 69, 70; 72, 73, 74, 75, 76, 77, 78; 80, 81, 82, 83, 84, 85, 86; 88, 89, 90; 92; 96, 97, 98; 100, 101, 102; 104, 105, 106; 108; 112; 114; 116}.

Thus C has minimal defining set Q := {0, 1, 3, 5, 7, 9, 11, 13, 19, 21, 23, 27, 29, 43}, where V(C) has cardinality 92. This implies O^⊥ ⊆ O, thus C ≤ C^⊥ is weakly self-dual. We proceed to determine the minimum distance d of C:

i) The BCH bound yields d ≥ 16 and the Singleton bound yields d ≤ 127 − 92 + 1 = 36. Moreover, we show that d is divisible by 4: To this end, we consider the localization F_2[X^{±1}] := F_2[X]_X ⊆ F_2(X), and let ¯ : F_2[X^{±1}] → F_2(ζ_127)[X]/⟨X^{127} − 1⟩ ≅ ∏_{i∈Z_127} F_2(ζ_127)[X]/⟨X − ζ_127^i⟩ be the extension of the natural epimorphism, where the latter isomorphism results from the Chinese Remainder Theorem. Now, from O^⊥ ⊆ O we get O ∪ (−O) = O ∪ (Z_127 \ O^⊥) = Z_127, thus for any i ∈ Z_127 we have g(ζ_127^i)·g(ζ_127^{−i}) = 0, entailing that g(X)·g(X^{−1}) = 0 in F_2(ζ_127)[X]/⟨X^{127} − 1⟩.

It suffices to show that for any v = [a_0, …, a_126] ∈ C we have 4 | wt(v) =: s: Let f := ν(v) = Σ_{i=0}^{126} a_i X^i ∈ F_2[X], then f(X^{−1}) = Σ_{j=0}^{126} a_j X^{−j} ∈ F_2[X^{±1}]. From g | f ∈ F_2[X] we infer that f(X)·f(X^{−1}) = Σ_{k=0}^{126} (Σ_{j∈Z_127} a_{k+j}a_j) X^k = 0 in F_2[X]/⟨X^{127} − 1⟩. Thus for all k ∈ Z_127 we have Σ_{j∈Z_127} a_{k+j}a_j = 0 ∈ F_2. In other words, letting J := supp(v) ⊆ Z_127, this says that |{[i, j] ∈ J^2; i = j + k}| is even. For k = 0 this yields that |J| = s is even. Since 127 is odd, for k ≠ 0 we have −k ≠ k ∈ Z_127, hence 4 | |{[i, j] ∈ J^2; i ∈ {j ± k}}|, and thus 4 | |{[i, j] ∈ J^2; i ≠ j}| = s^2 − s = s(s − 1), implying that 4 | s.

ii) We aim at applying the Roos bound with δ = 15: Letting O' := {0, …, 13} and O'' := {0, 1, 16, 32, 33} we get O' + O'' = {0, …, 14} ∪ {16, …, 29} ∪ {32, …, 46} ⊆ O, hence C is associated with O' + O''. Moreover, since 8 ∈ Z_127^* we conclude that ζ_127^8 ∈ V_127 is a primitive 127-th root of unity as well. Thus from 8 · O'' = {0, 1, 2, 8, 10} ⊆ {0, …, 17} ⊆ Z_127, where 18 = |O''| + δ − 2, the Roos bound yields d ≥ δ − 1 + |O''| = 19. Hence we get d ≥ 20.

iii) We aim at applying the van-Lint-Wilson bound: Let P' := {16, …, 29} ∪ {32, …, 44} ⊆ O and P'' := {0, −16, −15}. Then we get P' + P'' = {0, …, 14} ∪ {16, …, 29} ∪ {32, …, 44} ⊆ O' + O'' ⊆ O, hence C is associated with P' + P''. We consider the check matrix H(P') ∈ F_2(ζ_127)^{27×127}: Since the sets {16, …, 29} and {16, …, 44} are consecutive of length 14 and 29, respectively, for ∅ ≠ I ⊆ {1, …, 127} we get rk_{F_2(ζ_127)}(H(P')_I) ≥ |I| if |I| ≤ 14, and rk_{F_2(ζ_127)}(H(P')_I) ≥ 14 if |I| = 15, and rk_{F_2(ζ_127)}(H(P')_I) ≥ |I| − 2 if 16 ≤ |I| ≤ 29. We consider the check matrix H(P'') ∈ F_2(ζ_127)^{3×127}: From 8 · P'' = {−1, 0, 7} ⊆ {−1, …, 7}, we get rk_{F_2(ζ_127)}(H(P'')_I) ≥ |I| if |I| ≤ 2, and rk_{F_2(ζ_127)}(H(P'')_I) ≥ 2 if 3 ≤ |I| ≤ 7, and rk_{F_2(ζ_127)}(H(P'')_I) ≥ |I| − 6 if 8 ≤ |I| ≤ 9, and rk_{F_2(ζ_127)}(H(P'')_I) = 3 if |I| ≥ 10; note that |{−1, …, 7} \ {−1, 0, 7}| = 6. Thus we have rk_{F_2(ζ_127)}(H(P')_I) + rk_{F_2(ζ_127)}(H(P'')_I) > |I| whenever |I| ≤ 29, hence the van-Lint-Wilson bound yields d ≥ 30. Hence we get d ≥ 32.

iv) Finally, to show conversely that d ≤ 32, we choose µ_1 := X^7 + X + 1 ∈ F_2[X], and thus fix all the polynomials µ_i ∈ F_2[X] for i ∈ Z_127. Then the generator

polynomial g := ∏_{i∈Q} µ_i ∈ F_2[X] of C turns out to be

g = X^92 + X^91 + X^89 + X^86 + X^85 + X^84 + X^83 + X^80
  + X^77 + X^76 + X^74 + X^73 + X^71 + X^67 + X^65 + X^62
  + X^60 + X^58 + X^56 + X^53 + X^52 + X^51 + X^49 + X^48
  + X^47 + X^46 + X^45 + X^43 + X^39 + X^38 + X^36 + X^35
  + X^34 + X^32 + X^28 + X^24 + X^23 + X^19 + X^18 + X^17
  + X^16 + X^11 + X^10 + X^6 + X^4 + X^3 + X + 1.

Letting f ∈ F_2[X] be given as

f = X^99 + X^98 + X^97 + X^96 + X^94 + X^87 + X^83 + X^79
  + X^78 + X^75 + X^72 + X^69 + X^62 + X^61 + X^59 + X^57
  + X^56 + X^54 + X^51 + X^49 + X^48 + X^45 + X^42 + X^28
  + X^26 + X^25 + X^22 + X^17 + X^13 + X^10 + X^3 + 1,

it turns out that g | f, or equivalently f(ζ_127^i) = 0 for all i ∈ Q. Hence f corresponds to a codeword of weight 32. ]

(10.3) Theorem. Let n := q^s − 1, for some s ∈ N, and C ≤ F_q^n be a narrow sense primitive BCH code of designed distance δ = q^t − 1, for some t ∈ {1, …, s}. Then C has minimum distance δ.

Proof. We show that C contains an element of weight δ:

i) Since q ∈ Z_n^* has order s, we have |Γ| = s, from which we infer that ζ_n^Γ = {ζ_n, ζ_n^q, …, ζ_n^{q^{s−1}}} ⊆ V_n. Hence the elements ζ_n, ζ_n^q, …, ζ_n^{q^{t−1}} are pairwise distinct, implying the invertibility of the associated Vandermonde matrix

                                  [ 1   ζ_n             ···   ζ_n^{t−1}           ]
A := [ζ_n^{(j−1)q^{i−1}}]_{ij} =  [ 1   ζ_n^q           ···   ζ_n^{(t−1)q}        ]  ∈ F_q(ζ_n)^{t×t}.
                                  [ .   .                     .                   ]
                                  [ 1   ζ_n^{q^{t−1}}   ···   ζ_n^{(t−1)q^{t−1}}  ]

Hence the system of linear equations [X_0, …, X_{t−1}] · A = −[1, ζ_n^{q^t}, …, ζ_n^{(t−1)q^t}] has a unique solution [a_0, …, a_{t−1}] ∈ F_q(ζ_n)^t. Let f := X^{q^t} + Σ_{i=0}^{t−1} a_i X^{q^i} ∈ F_q(ζ_n)[X] be the associated q-linearized polynomial of degree q^t; note that we have f(ax + y) = af(x) + f(y) ∈ F_q(ζ_n), for all x, y ∈ F_q(ζ_n) and a ∈ F_q. By construction we have (ζ_n^j)^{q^t} + Σ_{i=0}^{t−1} a_i (ζ_n^j)^{q^i} = 0, for all j ∈ {0, …, t−1}, hence S := {1, ζ_n, …, ζ_n^{t−1}} ⊆ V(f) consists of zeroes of f. Moreover, S ⊆ F_q(ζ_n) is F_q-linearly independent: Let [b_0, …, b_{t−1}] ∈ F_q^t such that Σ_{j=0}^{t−1} b_j ζ_n^j = 0 ∈ F_q(ζ_n); thus from b_j^q = b_j we get Σ_{j=0}^{t−1} b_j ζ_n^{jq^i} = 0, for all i ∈ {0, …, t−1}, that is A · [b_0, …, b_{t−1}]^tr = 0 ∈ F_q(ζ_n)^{t×1}, implying that [b_0, …, b_{t−1}] = 0. Thus letting V := ⟨S⟩_{F_q} ≤ F_q(ζ_n) we have |V| = q^t, and from f being F_q-linear we infer that V ⊆ V(f) consists of zeroes of f, implying that V = V(f), hence f splits over F_q(ζ_n) as f = ∏_{c∈V} (X − c) ∈ F_q(ζ_n)[X].

ii) We need an auxiliary construction: Let 𝒳 := {X_1, …, X_m} be indeterminates, where m ∈ N. For k ∈ {0, …, m} let e_{m,k} ∈ F_q[𝒳] be the associated elementary symmetric polynomial of degree k, for example e_{m,0} = 1 and e_{m,1} = Σ_{i=1}^m X_i and e_{m,2} = Σ_{1≤i<j≤m} X_iX_j and e_{m,m} = X_1 ⋯ X_m, and for k ∈ N let p_{m,k} := Σ_{i=1}^m X_i^k ∈ F_q[𝒳] be the associated power sum polynomial. Letting h := ∏_{i=1}^m (1 − X_iX) = Σ_{j=0}^m (−1)^j e_{m,j} X^j ∈ F_q[𝒳][X] ⊆ F_q((𝒳, X)), we have ∂h/∂X = Σ_{j=1}^m (−1)^j j e_{m,j} X^{j−1}, and the product rule yields

∂h/∂X = −Σ_{j=1}^m ( X_j · ∏_{i∈{1,…,m}\{j}} (1 − X_iX) )
       = −h · Σ_{j=1}^m X_j/(1 − X_jX)
       = −h · Σ_{j=1}^m Σ_{k≥0} X_j^{k+1} X^k
       = −h · Σ_{k≥0} p_{m,k+1} X^k,

implying Σ_{i=1}^m (−1)^{i−1} i e_{m,i} X^{i−1} = (Σ_{j=0}^m (−1)^j e_{m,j} X^j) · (Σ_{k≥1} p_{m,k} X^{k−1}) ∈ F_q[𝒳][[X]]. Thus we get the Newton identities Σ_{j=1}^i (−1)^{j−1} e_{m,i−j} p_{m,j} = i e_{m,i} for i ∈ {1, …, m}, as well as Σ_{j=i−m}^i (−1)^{j−1} e_{m,i−j} p_{m,j} = 0 for i ≥ m + 1; the former can also be written as (−1)^{i−1} p_{m,i} = i e_{m,i} + Σ_{j=1}^{i−1} (−1)^j e_{m,i−j} p_{m,j}.

iii) Now let m := q^t. Evaluating e_{m,i} ∈ F_q[𝒳], for i ∈ {0, …, m}, at the elements of V ⊆ F_q(ζ_n), we get f = X^{q^t} + Σ_{i=0}^{t−1} a_i X^{q^i} = ∏_{c∈V} (X − c) = Σ_{j=0}^m (−1)^{m−j} e_{m,m−j}(V) X^j ∈ F_q(ζ_n)[X]. This implies that e_{m,i}(V) ≠ 0 possibly only for i = m − q^j, where j ∈ {0, …, t}. Since q | m, we thus have i e_{m,i}(V) = 0 for all i ∈ {0, …, m − 2}. Evaluating p_{m,i} ∈ F_q[𝒳] at the elements of V as well, from the Newton identities we get by induction on i ∈ N that Σ_{c∈V\{0}} c^i = Σ_{c∈V} c^i = p_{m,i}(V) = 0, for all i ∈ {1, …, m − 2}. Since |F_q(ζ_n)^*| = q^s − 1 = n, the map Z_n → F_q(ζ_n)^* : i ↦ ζ_n^i is a bijection. Let v = [x_0, …, x_{n−1}] ∈ {0, 1}^n ⊆ F_q^n be such that V \ {0} = {ζ_n^i ∈ F_q(ζ_n); i ∈ Z_n, x_i = 1}. Then we have wt(v) = |V| − 1 = m − 1 = δ, and we get ν(v)(ζ_n^j) = Σ_{i∈Z_n, x_i=1} (ζ_n^j)^i = Σ_{i∈Z_n, x_i=1} (ζ_n^i)^j = Σ_{c∈V\{0}} c^j = 0, for j ∈ {1, …, δ − 1}, which since C is associated with {ζ_n, …, ζ_n^{δ−1}} ⊆ V_n implies that v ∈ C. ]
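The Newton identities can be spot-checked over Z by evaluating at sample values (the data (2, 3, 5, 7) is our own choice; for i > m both sides vanish, since e_{m,k} = 0 for k > m):

```python
from itertools import combinations
from math import prod

xs = (2, 3, 5, 7)      # hypothetical sample values for X_1, ..., X_m
m = len(xs)

def e(k):
    """e_{m,k} evaluated at xs; 0 for k > m since there are no k-subsets."""
    return sum(prod(c) for c in combinations(xs, k))

def p(k):
    """Power sum p_{m,k} evaluated at xs."""
    return sum(x ** k for x in xs)

def newton_lhs(i):
    """sum_{j=1}^{i} (-1)^(j-1) e_{m,i-j} p_{m,j}; equals i*e_{m,i}."""
    return sum((-1) ** (j - 1) * e(i - j) * p(j) for j in range(1, i + 1))
```

For example, newton_lhs(2) = e_1 p_1 − p_2 = 17² − 87 = 202 = 2·e_2.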

(10.4) Corollary. Let C ≤ F_q^n be a narrow sense primitive BCH code of designed distance δ ∈ {1, …, n}. Then C has minimum distance at most qδ − 1.

Proof. Let n = q^s − 1, and let t ∈ {1, …, s} such that q^{t−1} ≤ δ ≤ q^t − 1. Since C is associated with {ζ_n, …, ζ_n^{δ−1}}, it contains the code associated with {ζ_n, …, ζ_n^{q^t−2}}. Since the latter has minimum distance q^t − 1, the minimum distance of C is bounded above by q^t − 1 = q · q^{t−1} − 1 ≤ qδ − 1. ]

Finally, we just mention the following theorem, whose proof requires further tools we do not have at our disposal here:

(10.5) Theorem. Let C be a non-trivial narrow sense primitive binary BCH code. Then the minimum distance of C is odd. ]

11 Quadratic residue codes

(11.1) Quadratic residues. We collect a few number theoretic facts.
a) Let p be an odd prime. Then by Artin's Theorem Z_p^* is a cyclic group of even order p − 1. Hence the set of squares Q_p := {i^2 ∈ Z_p^*; i ∈ Z_p^*} ≤ Z_p^* is the unique subgroup of Z_p^* of index 2, and consists of the elements of Z_p^* of order dividing (p−1)/2. Let N_p := Z_p^* \ Q_p be the set of non-squares in Z_p^*; hence we have |Q_p| = |N_p| = (p−1)/2. For i ∈ Z_p^* let the Legendre symbol be defined as (i/p) := 1 if i ∈ Q_p, and (i/p) := −1 if i ∈ N_p. Hence for i, j ∈ Z_p^* we have (ij/p) = (i/p)·(j/p), thus (·/p) : Z_p^* → {±1} is a group homomorphism with kernel Q_p.
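Euler's criterion (a standard fact not proved here) evaluates the Legendre symbol as (i/p) ≡ i^{(p−1)/2} (mod p); a minimal sketch, choosing p = 23 so that the sets match the example in (11.7):

```python
def legendre(i, p):
    """Legendre symbol via Euler's criterion: i^((p-1)/2) mod p is 0, 1 or p-1."""
    r = pow(i, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

p = 23
Qp = {i for i in range(1, p) if legendre(i, p) == 1}    # the squares
Np = {i for i in range(1, p) if legendre(i, p) == -1}   # the non-squares
```

This reproduces |Q_p| = |N_p| = (p−1)/2 and the multiplicativity of (·/p).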

Lemma. We have (−1/p) = (−1)^{(p−1)/2}.

Proof. From X^2 − 1 = (X − 1)(X + 1) ∈ Z_p[X] we infer that −1 ∈ Z_p^* is the unique primitive second root of unity. Hence −1 ∈ Z_p^* is a square if and only if Z_p^* has an element of order 4, which by Artin's Theorem is equivalent to p ≡ 1 (mod 4), the latter being equivalent to (−1)^{(p−1)/2} = 1. ]

Now let q ≠ p be a prime. Then by Dirichlet's Theorem, given p there are infinitely many q such that (q/p) = 1. Moreover, by the Quadratic Reciprocity Law [Gauss, 1796], saying that (q/p)·(p/q) = (−1)^{(p−1)(q−1)/4} whenever q is odd, and (2/p) = (−1)^{(p^2−1)/8}, given q there are infinitely many p such that (q/p) = 1.

For the particular cases of q ∈ {2, 3} we get: We have (2/p) = 1 if and only if p ≡ ±1 (mod 8); thus we additionally have (−1/p) = 1 if and only if p ≡ 1 (mod 8). We have (3/p) = 1 if and only if p ≡ ±1 (mod 12); thus we additionally have (−1/p) = 1 if and only if p ≡ 1 (mod 12).
b) Let still p be an odd prime, and let q ≠ p be a prime. Then for the associated Gaussian sum, being defined as γ_p := Σ_{i∈Z_p^*} (i/p) ζ_p^i ∈ F_q(ζ_p), we have:

Lemma. i) We have γ_p^2 = (−1/p) · p ∈ F_q, in particular γ_p ≠ 0.
ii) If (q/p) = 1, then we have γ_p ∈ F_q.

Proof. i) We have γ_p^2 = Σ_{i,j∈Z_p^*} (i/p)(−j/p) ζ_p^{i−j} = (−1/p) · Σ_{i,j∈Z_p^*} (ij/p) ζ_p^{i−j}. Since multiplication with j ∈ Z_p^* is a bijection on Z_p^*, we get γ_p^2 = (−1/p) · Σ_{i,j∈Z_p^*} (ij^2/p) ζ_p^{(i−1)j} = (−1/p) · Σ_{i,j∈Z_p^*} (i/p) ζ_p^{(i−1)j}. Next, using Σ_{i∈Z_p^*} (i/p) = 0 we get γ_p^2 = (−1/p) · Σ_{i∈Z_p^*} ((i/p) · Σ_{j∈Z_p} (ζ_p^{i−1})^j). Finally, from X^p − 1 = (X − 1) · Σ_{i=0}^{p−1} X^i ∈ F_q[X] we conclude that Σ_{j∈Z_p} (ζ_p^i)^j = 0 whenever i ∈ Z_p^*, hence in the above outer sum only the case i = 1 remains, yielding γ_p^2 = (−1/p) · p.
ii) Recalling that Aut_{F_q}(F_q(ζ_p)) = ⟨φ_q⟩, we show that γ_p^q = γ_p: We have γ_p^q = Σ_{i∈Z_p^*} (i/p)^q ζ_p^{iq} = Σ_{i∈Z_p^*} (i/p) ζ_p^{iq}; note that this is trivial for q = 2. From (iq/p) = (i/p) we get γ_p^q = Σ_{i∈Z_p^*} (iq/p) ζ_p^{iq} = Σ_{i∈Z_p^*} (i/p) ζ_p^i = γ_p. ]
Note that in the latter case it depends on the chosen minimum polynomial µ_1 ∈ F_q[X] of ζ_p which square root of (−1/p) · p equals γ_p ∈ F_q.
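The identity γ_p^2 = (−1/p)·p can be checked numerically for the classical complex Gaussian sum, with ζ_p = e^{2πi/p} ∈ C instead of a root of unity over F_q; that this analogue satisfies the same identity is a standard fact, and using it here is our own choice:

```python
import cmath

def legendre(i, p):
    """Legendre symbol via Euler's criterion."""
    r = pow(i, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_sum(p):
    """Complex Gaussian sum: sum of (i/p) * zeta_p^i over i in Z_p^*."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(i, p) * zeta ** i for i in range(1, p))
```

For p ≡ 1 (mod 4) the square is numerically p, and for p ≡ −1 (mod 4) it is −p.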

(11.2) Quadratic residue codes. a) Let p be an odd prime, and let q ≠ p be a prime such that (q/p) = 1. Since (iq/p) = (i/p), for all i ∈ Z_p^*, we conclude that both {ζ_p^i ∈ F_q(ζ_p); i ∈ Q_p} and {ζ_p^i ∈ F_q(ζ_p); i ∈ N_p} are Γ-invariant. Hence letting ρ_p := ∏_{i∈Q_p} (X − ζ_p^i) ∈ F_q(ζ_p)[X] and η_p := ∏_{i∈N_p} (X − ζ_p^i) ∈ F_q(ζ_p)[X], we infer that both ρ_p and η_p have coefficients in F_q. Thus we get ρ_pη_p = ∏_{i∈Z_p^*} (X − ζ_p^i) = (X^p − 1)/(X − 1) = Σ_{i=0}^{p−1} X^i ∈ F_q[X], hence (X − 1) · ρ_pη_p = X^p − 1. This gives rise to the following quadratic residue (QR) codes: Let Q^p ≤ F_q^p and N^p ≤ F_q^p be the cyclic codes having generator polynomial ρ_p and η_p, respectively. Hence we have dim_{F_q}(Q^p) = dim_{F_q}(N^p) = p − (p−1)/2 = (p+1)/2. Moreover, let (Q^p)' ≤ F_q^p and (N^p)' ≤ F_q^p be the associated expurgated codes, that is having generator polynomials (X − 1) · ρ_p and (X − 1) · η_p, respectively; recall that for v = [a_0, …, a_{n−1}] ∈ F_q^n the condition Σ_{i=0}^{n−1} a_i = 0 ∈ F_q is equivalent to ν(v)(1) = (Σ_{i=0}^{n−1} a_i X^i)(1) = 0, that is X − 1 | ν(v) ∈ F_q[X]. Hence we have dim_{F_q}((Q^p)') = dim_{F_q}((N^p)') = p − (p+1)/2 = (p−1)/2.

b) For C^p ∈ {Q^p, N^p}, the associated extended quadratic residue code is defined as Ĉ^p := {[a_0, …, a_p] ∈ F_q^{p+1}; [a_0, …, a_{p−1}] ∈ C^p, a_p = (εγ_p/p) · Σ_{i=0}^{p−1} a_i}, for ε ∈ {±1}; note that the two choices yield linearly equivalent codes. The reason for twisting the original definition of an extended code will become clear in (11.5) below. In particular, for q = 2 we have γ_p = 1, so that Ĉ^p = {[a_0, …, a_p] ∈ F_2^{p+1}; [a_0, …, a_{p−1}] ∈ C^p, Σ_{i=0}^p a_i = 0} ≤ F_2^{p+1} coincides with the conventional extended code anyway; for q = 3 we have γ_p ∈ {±1}, so that ε can be chosen such that Ĉ^p = {[a_0, …, a_p] ∈ F_3^{p+1}; [a_0, …, a_{p−1}] ∈ C^p, Σ_{i=0}^p a_i = 0} ≤ F_3^{p+1} coincides with the conventional extended code.

(11.3) Example. For q := 2 and p := 7 we find that 2^3 ≡ 1 (mod 7), hence 2 ∈ Z_7^* has order 3, thus F_2(ζ_7) = F_8 and φ_2 ∈ Γ := Aut_{F_2}(F_8) has order 3. We conclude that 2 ∈ Q_7, that is (2/7) = 1. Hence the Γ-orbits on V_7 are V_7 = {1} ∪ {ζ_7^i; i ∈ Q_7} ∪ {ζ_7^i; i ∈ N_7}, where Q_7 := {1, 2, 4} and N_7 := {3, 5, 6}. Thus we have X^7 + 1 = (X + 1) · ∏_{i∈Q_7} (X + ζ_7^i) · ∏_{i∈N_7} (X + ζ_7^i) = µ_0µ_1µ_3 ∈ F_8[X]. Actually, we have X^7 + 1 = (X + 1) · g'g'' ∈ F_2[X], where g' := X^3 + X + 1 ∈ F_2[X] and g'' := X^3 + X^2 + 1 ∈ F_2[X], hence the latter are both irreducible; see (9.1). Let C ≤ F_2^7 be the cyclic code generated by g'; since (g')* = g'' the code generated by g'' is linearly equivalent to C. Hence the even-weight subcode C' ≤ C has generator polynomial (X + 1) · g' = X^4 + X^3 + X^2 + 1. Moreover, C has check polynomial h := (X + 1) · g'' = X^4 + X^2 + X + 1, thus C^⊥ has generator polynomial h* = (X + 1)* · (g'')* = (X + 1) · g', showing that C^⊥ = C' ≤ C.

Choosing a primitive 7-th root of unity over F_2 having minimum polynomial g', we conclude that C is a QR code of type Q; the code generated by g'' then is the associated QR code of type N. Moreover, extending yields Ĉ ≤ F_2^8; by puncturing Ĉ again we recover (Ĉ)• = C. Hence by (11.5) below we again conclude that C^⊥ = C' ≤ C, and that (Ĉ)^⊥ = Ĉ, that is Ĉ is self-dual. We determine the minimum distance of C and Ĉ; see also (9.4): The code C can be considered as a narrow sense BCH code with irreducible generator polynomial µ_1 = g' and associated cyclotomic coset Q_7, hence has Bose distance δ = 3, so that the BCH bound yields d(C) ≥ 3. The Hamming bound 2^{7−3} · Σ_{i=0}^1 (7 choose i) = 2^4 · (1 + 7) = 2^4 · 8 = 2^7 is fulfilled for e := 1 = (3−1)/2, showing that d(C) = 3, and hence d(Ĉ) = 4. Hence C is a perfect [7, 4, 3]-code, and thus linearly equivalent to the Hamming code H_3; thus the code C' is a [7, 3, 4]-code, linearly equivalent to the even-weight Hamming code H_3'; and Ĉ is a quasi-perfect self-dual [8, 4, 4]-code, linearly equivalent to the extended Hamming code Ĥ_3; recall that due to self-duality the latter equals the Reed-Muller code R_3; see also (6.3) and (6.5).
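The parameters [7, 4, 3] can be confirmed by brute force; a sketch, where encoding F_2[X] as integer bitmasks (bit i ↔ X^i) is our own convention:

```python
def cyc_mul(a, b, n=7):
    """Multiply in F_2[X]/(X^n - 1), polynomials as coefficient bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    while r >> n:                          # fold X^n back to 1
        r = (r & ((1 << n) - 1)) ^ (r >> n)
    return r

gp = 0b1011                                # g' = X^3 + X + 1
code = {cyc_mul(m, gp) for m in range(16)} # all 2^4 multiples of g'
weights = sorted(bin(c).count("1") for c in code)
```

The resulting weight distribution 1 + 7z^3 + 7z^4 + z^7 is exactly that of the Hamming code H_3, so d(C) = 3.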

(11.4) Theorem. Let p be an odd prime, and q ≠ p a prime such that (q/p) = 1.
a) Then Q^p and N^p are linearly equivalent, and so are (Q^p)' and (N^p)', as well as Q̂^p and N̂^p with either choice of ε.
b) Let v ∈ Q^p \ (Q^p)'. Then for d := wt(v) we have the square root bound d^2 ≥ p. Moreover, if p ≡ −1 (mod 4) then we even have d^2 − d + 1 ≥ p, and for q = 2 we have d ≡ 3 (mod 4).

Proof. a) Let j ∈ N_p. Then from (ij/p) = (i/p)·(j/p) = −(i/p), for all i ∈ Z_p^*, we conclude that the permutation π : Z_p → Z_p : i ↦ ij interchanges the sets Q_p and N_p, while 0 is kept fixed. We consider the linear isometry induced by letting π permute the components of F_q^p: For v := [a_0, …, a_{p−1}] ∈ F_q^p we get v^π = [a_{iπ^{−1}}; i ∈ Z_p] ∈ F_q^p, in other words ν(v^π) = Σ_{i∈Z_p} a_{iπ^{−1}} X^i = Σ_{i∈Z_p} a_i X^{iπ} = Σ_{i∈Z_p} a_i X^{ij} ∈ F_q[X]. Thus for all

k ∈ Z_p^* we get ν(v^π)(ζ_p^k) = Σ_{i∈Z_p} a_i ζ_p^{ijk} = Σ_{i∈Z_p} a_i (ζ_p^{kj})^i = ν(v)(ζ_p^{kj}) ∈ F_q(ζ_p), showing that ζ_p^k ∈ V(ν(v^π)) if and only if ζ_p^{kj} ∈ V(ν(v)). Hence we have v ∈ Q^p if and only if v^π ∈ N^p. Finally, since the linear equivalence between Q^p and N^p is induced by a permutation of components, it induces a linear equivalence between (Q^p)' and (N^p)' and a linear equivalence between Q̂^p and N̂^p.
b) We have ρ_p | ν(v) ∈ F_q[X], but (X − 1) ∤ ν(v). Applying π again we get η_p | ν(v^π), but (X − 1) ∤ ν(v^π) ∈ F_q[X]. Hence we have Σ_{i=0}^{p−1} X^i = ρ_pη_p | ν(v)ν(v^π), but X^p − 1 = (X − 1) · ρ_pη_p ∤ ν(v)ν(v^π). Let w ∈ F_q^p be the vector associated with ν(v)ν(v^π) ∈ F_q[X]. Then we have w ≠ 0, and since Σ_{i=0}^{p−1} X^i generates the repetition subcode of F_q^p we conclude that w = [a, …, a] for some a ≠ 0, hence wt(w) = p. Now for ν(v)ν(v^π) we get d^2 products of non-zero coefficients of ν(v) and ν(v^π), respectively. Hence ν(v)ν(v^π) has at most d^2 non-zero coefficients, thus d^2 ≥ p.

If p ≡ −1 (mod 4), that is (−1/p) = −1, then we may take j = −1, thus π : Z_p → Z_p : i ↦ −i. Then d of the above products belong to the constant coefficient of ν(v)ν(v^π), hence the latter has at most d^2 − d + 1 non-zero coefficients. Finally, for q = 2, if two products belonging to the i-th coefficient of ν(v)ν(v^π) cancel, where i ∈ Z_p^*, then there are two products belonging to the (−i)-th coefficient cancelling as well. Hence we get p ≡ d^2 − d + 1 (mod 4), thus d(d − 1) ≡ 2 (mod 4), which since d is odd implies d ≡ 3 (mod 4). ]

(11.5) Theorem. Let p be an odd prime, and q ≠ p a prime such that (q/p) = 1.
a) If p ≡ −1 (mod 4), then we have (Q^p)^⊥ = (Q^p)', as well as (Q̂^p)^⊥ = Q̂^p with either choice of ε.
b) If p ≡ 1 (mod 4), then we have (Q^p)^⊥ = (N^p)', as well as (Q̂^p)^⊥ = N̂^p with opposite choices of ε.

Proof. i) We first consider (Q^p)^⊥: Recall that Q^p has generator polynomial ρ_p. Now we have ρ_p* = ∏_{i∈Q_p} (X − ζ_p^i)* = ∏_{i∈Q_p} (−ζ_p^i) · ∏_{i∈Q_p} (X − ζ_p^{−i}) ∈ F_q(ζ_p)[X]. Moreover, we have (−i/p) = (−1/p)·(i/p) for all i ∈ Z_p^*. Hence if p ≡ −1 (mod 4), then (−i/p) = −(i/p) implies ∏_{i∈Q_p} (X − ζ_p^{−i}) = ∏_{i∈N_p} (X − ζ_p^i) = η_p, thus ρ_p* ∼ η_p ∈ F_q[X]. Now (Q^p)^⊥ has generator polynomial (X − 1)* · η_p* ∼ (X − 1) · ρ_p, so that (Q^p)^⊥ = (Q^p)'. If p ≡ 1 (mod 4), then (−i/p) = (i/p) implies ∏_{i∈Q_p} (X − ζ_p^{−i}) = ∏_{i∈Q_p} (X − ζ_p^i) = ρ_p, thus ρ_p* ∼ ρ_p and hence η_p* ∼ η_p. Now (Q^p)^⊥ has generator polynomial (X − 1)* · η_p* ∼ (X − 1) · η_p, so that (Q^p)^⊥ = (N^p)'.

ii) We now consider (Q̂^p)^⊥: Let G' ∈ F_q^{((p−1)/2)×p} be a generator matrix of (Q^p)'. We consider the vector 1_p ∈ F_q^p: We have ν(1_p) = Σ_{i=0}^{p−1} X^i = ρ_pη_p ∈ F_q[X], entailing that 1_p ∈ Q^p. But since X − 1 ∤ ν(1_p) we have 1_p ∉ (Q^p)'. Hence we recover Q^p by augmenting (Q^p)' again, thus

G = [ G'  ]  ∈ F_q^{((p+1)/2)×p}
    [ 1_p ]

is a generator matrix of Q^p. Now, for [a_0, …, a_{p−1}] ∈ (Q^p)' we have Σ_{i=0}^{p−1} a_i = 0, hence we get [a_0, …, a_{p−1}, 0] ∈ Q̂^p, and for 1_p ∈ Q^p we get v := [1, …, 1, εγ_p] ∈ Q̂^p. Hence

Ĝ = [ G'    0^tr  ]  ∈ F_q^{((p+1)/2)×(p+1)},   where 0 ∈ F_q^{(p−1)/2},
    [ 1_p   εγ_p  ]

is a generator matrix of Q̂^p. If p ≡ −1 (mod 4), then we have ⟨Q^p, (Q^p)'⟩ = {0} and ⟨v, v⟩ = p + γ_p^2 = (1 + (−1/p)) · p = 0 ∈ F_q. This shows that Q̂^p ≤ (Q̂^p)^⊥, hence dim_{F_q}(Q̂^p) = (p+1)/2 = dim_{F_q}((Q̂^p)^⊥) entails equality. If p ≡ 1 (mod 4), then we still have ⟨Q^p, (N^p)'⟩ = {0}, but now we get ⟨[1, …, 1, γ_p], [1, …, 1, −γ_p]⟩ = p − γ_p^2 = (1 − (−1/p)) · p = 0 ∈ F_q. This shows that Q̂^p ≤ (N̂^p)^⊥ with opposite choices of ε, hence dim_{F_q}(Q̂^p) = (p+1)/2 = dim_{F_q}((N̂^p)^⊥) entails equality. ]

(11.6) Automorphisms of QR codes. Let p be an odd prime, let q ≠ p be a prime such that (q/p) = 1, and let Ĉ^p ≤ F_q^{p+1} be an extended QR code. Let Aut_{F_q}(Ĉ^p) ≤ I_{p+1}(F_q) ≅ (F_q^*)^{p+1} ⋊ S_{p+1} be its linear automorphism group, and let G := Out_{F_q}(Ĉ^p) ≤ S_{p+1} be the induced group of component permutations. Then we have the Gleason-Prange Theorem, saying that PSL_2(F_p) ≤ G, where PSL_2(F_p) acts by its natural permutation representation on P^1(F_p). In particular, G acts 2-fold transitively on the components of F_q^{p+1}. Unfortunately, we are not in a position to prove this here.

It turns out that G = PSL_2(F_p), except in the following cases: i) For [q, p] = [2, 7] we get G ≅ AGL_3(F_2) ≅ C_2^3 : SL_3(F_2) ≅ C_2^3 : PSL_2(F_7), see (11.3); ii) for [q, p] = [2, 23] we get G ≅ M_24, see (11.7); iii) for [q, p] = [3, 11] we get G ≅ M_12, see (11.8).

Proposition. For the associated QR codes we have d((C^p)^0) = d(C^p) + 1. Hence we have d(Ĉ^p) = d(C^p) + 1 as well. Moreover, the assertions of the square root bound (11.4) hold for d = d(C^p).

Proof. Let v = [a_0, ..., a_{p−1}] ∈ (C^p)^0 such that wt(v) = d((C^p)^0). Then we have v̂ := [a_0, ..., a_{p−1}, 0] ∈ Ĉ^p, and hence by the transitivity of Out_{F_q}(Ĉ^p) there is w = [a'_0, ..., a'_{p−1}] ∈ C^p \ (C^p)^0 such that ŵ := [a'_0, ..., a'_{p−1}, a'_p] ∈ Ĉ^p, where the multisets of entries of v̂ and ŵ coincide, and a'_p ≠ 0. Since wt(w) = wt(v) − 1 = d((C^p)^0) − 1 we conclude that d(C^p) ≤ d((C^p)^0) − 1 ≤ p − 1.

Conversely, let v = [a_0, ..., a_{p−1}] ∈ C^p such that wt(v) = d(C^p) ≤ p − 1, and let v̂ := [a_0, ..., a_{p−1}, a_p] ∈ Ĉ^p. Again by the transitivity of Out_{F_q}(Ĉ^p) there is w = [a'_0, ..., a'_{p−1}] ∈ (C^p)^0 such that ŵ := [a'_0, ..., a'_{p−1}, 0] ∈ Ĉ^p, where the multisets of entries of v̂ and ŵ coincide. Since wt(w) = wt(ŵ) = wt(v̂) ≤ wt(v) + 1 = d(C^p) + 1 we conclude that d((C^p)^0) ≤ d(C^p) + 1. ]

(11.7) Example: Binary Golay codes [1949]. Let q := 2 and p := 23. We find that 2^11 ≡ 1 (mod 23), thus 2 ∈ Z_23^* has order 11, thus F_2(ζ_23) = F_{2^11} and ϕ_2 ∈ Γ := Aut_{F_2}(F_2(ζ_23)) has order 11. Moreover, we conclude that 2 ∈ Q_23, that is (2/23) = 1. Hence the Γ-orbits on V_23 are V_23 = {1} ∪̇ {ζ_23^i; i ∈ Q_23} ∪̇ {ζ_23^i; i ∈ N_23}, where Q_23 = {1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18} and N_23 = {5, 7, 10, 11, 14, 15, 17, 19, 20, 21, 22}.
Thus we have X^23 + 1 = (X + 1) · ∏_{i∈Q_23} (X + ζ_23^i) · ∏_{i∈N_23} (X + ζ_23^i) = μ_0 μ_1 μ_5 ∈ F_2(ζ_23)[X]. Actually we have X^23 + 1 = (X + 1) · g' g'' ∈ F_2[X], where g' := X^11 + X^9 + X^7 + X^6 + X^5 + X + 1 ∈ F_2[X] and g'' := X^11 + X^10 + X^6 + X^5 + X^4 + X^2 + 1 ∈ F_2[X], hence the latter are both irreducible.
Let the binary Golay code G_23 ≤ F_2^23 be the cyclic code generated by g', the associated generator matrix G ∈ F_2^{12×23} being given in Table 8; since (g')^* = g'' the code generated by g'' is linearly equivalent to G_23. Hence the even-weight subcode G_23^0 ≤ G_23 has generator polynomial (X + 1) · g' = X^12 + X^11 + X^10 + X^9 + X^8 + X^5 + X^2 + 1. Moreover, G_23 has check polynomial h := (X + 1) · g'' = X^12 + X^10 + X^7 + X^4 + X^3 + X^2 + X + 1, thus G_23^⊥ has generator polynomial h^* = (X + 1)^* · (g'')^* = (X + 1) · g', showing that G_23^⊥ = G_23^0; the associated check matrix H ∈ F_2^{11×23} is also given in Table 8.
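The arithmetic behind this example is small enough to verify directly. An illustrative sketch (in Python, not part of the original notes), encoding F_2-polynomials as bit masks, checks that Q_23 is the set of squares mod 23, that it is closed under multiplication by 2 (so that the corresponding product of linear factors has coefficients fixed by ϕ_2), and that (X + 1) · g' · g'' = X^23 + 1:

```python
# quadratic residues mod 23 and the cyclotomic coset structure
Q23 = sorted({i * i % 23 for i in range(1, 23)})
assert Q23 == [1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18]
assert {2 * i % 23 for i in Q23} == set(Q23)  # 2 lies in Q23 and fixes it

# polynomials over F_2 as integers: bit i <-> coefficient of X^i
def pmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a  # addition in F_2[X] is XOR
        a <<= 1
        b >>= 1
    return r

g1 = sum(1 << i for i in [11, 9, 7, 6, 5, 1, 0])   # g'
g2 = sum(1 << i for i in [11, 10, 6, 5, 4, 2, 0])  # g''
assert pmul(pmul(0b11, g1), g2) == (1 << 23) | 1   # (X+1) g' g'' = X^23 + 1
```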

Choosing a primitive 23-rd root of unity over F_2 having minimum polynomial g', we conclude that G_23 is a QR code of type Q; the code generated by g'' then is the associated QR code of type N. Hence we again conclude that G_23^⊥ = G_23^0 ≤ G_23. Moreover, extending yields the extended binary Golay code G_24 := Ĝ_23 ≤ F_2^24; by puncturing G_24 again we recover G_24^• = (Ĝ_23)^• = G_23. We conclude that G_24^⊥ = G_24, that is G_24 is self-dual.

We determine the minimum distance of G_23 and G_24: The code G_23 can be considered as a narrow sense BCH code with irreducible generator polynomial μ_1 = g' and associated cyclotomic coset Q_23, hence has Bose distance δ = 5, so that the BCH bound yields d(G_23) ≥ 5. (The square root bound yields d(G_23) ≥ √23 and d(G_23) ≡ 3 (mod 4), hence d(G_23) ≥ 7. But since we have not proven this, we argue differently.) Now the generator polynomial of G_23^0 shows that G_23^0 has an F_2-basis consisting of vectors of weight 8. Thus the extended code G_24 = Ĝ_23 also has an F_2-basis consisting of vectors of weight 8.
We now make use of the following observation: Let C ≤ C^⊥ ≤ F_2^n be a weakly self-dual code, where n ∈ N. Since for any v, w ∈ F_2^n we have supp(v + w) = (supp(v) \ supp(w)) ∪̇ (supp(w) \ supp(v)), we conclude that wt(v + w) = wt(v) + wt(w) − 2·|supp(v) ∩ supp(w)|.

Table 8: Generator and check matrices for G23.

 1 1 ... 1 1 1 . 1 . 1 ......   . 1 1 ... 1 1 1 . 1 . 1 ......     .. 1 1 ... 1 1 1 . 1 . 1 ......     ... 1 1 ... 1 1 1 . 1 . 1 ......     .... 1 1 ... 1 1 1 . 1 . 1 ......     ..... 1 1 ... 1 1 1 . 1 . 1 ......     ...... 1 1 ... 1 1 1 . 1 . 1 .....     ...... 1 1 ... 1 1 1 . 1 . 1 ....     ...... 1 1 ... 1 1 1 . 1 . 1 ...   ...... 1 1 ... 1 1 1 . 1 . 1 ..     ...... 1 1 ... 1 1 1 . 1 . 1 .  ...... 1 1 ... 1 1 1 . 1 . 1

 1 . 1 .. 1 .. 1 1 1 1 1 ......   . 1 . 1 .. 1 .. 1 1 1 1 1 ......     .. 1 . 1 .. 1 .. 1 1 1 1 1 ......     ... 1 . 1 .. 1 .. 1 1 1 1 1 ......     .... 1 . 1 .. 1 .. 1 1 1 1 1 ......     ..... 1 . 1 .. 1 .. 1 1 1 1 1 .....     ...... 1 . 1 .. 1 .. 1 1 1 1 1 ....     ...... 1 . 1 .. 1 .. 1 1 1 1 1 ...   ...... 1 . 1 .. 1 .. 1 1 1 1 1 ..     ...... 1 . 1 .. 1 .. 1 1 1 1 1 .  ...... 1 . 1 .. 1 .. 1 1 1 1 1

Moreover, we have ⟨v, w⟩ = |supp(v) ∩ supp(w)| ∈ F_2. Hence if v, w ∈ C are such that 4 | wt(v) and 4 | wt(w), then, since |supp(v) ∩ supp(w)| is even, we conclude that 4 | wt(v + w) as well.
Thus we have 4 | wt(v) for all v ∈ G_24, hence 4 | d(G_24). Since d(G_24) ≥ d(G_23) ≥ 5 this implies that d(G_24) ≥ 8, and thus d(G_23) ≥ d(G_24) − 1 ≥ 7. Now the Hamming bound 2^{23−11} · ∑_{i=0}^{3} (23 choose i) = 2^12 · (1 + 23 + 253 + 1771) = 2^12 · 2^11 = 2^23 is fulfilled with equality for e := 3 = (7−1)/2, showing that d(G_23) = 7 and d(G_24) = 8. Hence G_23 is a perfect [23, 12, 7]-code, G_23^0 is a [23, 11, 8]-code, and G_24 is a quasi-perfect [24, 12, 8]-code.
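The sphere-packing computation just used can be replayed in a few lines (an illustrative Python sketch, not part of the original notes):

```python
from math import comb

# sphere-packing count for the binary Golay code: 2^12 words, radius-3 balls
ball = sum(comb(23, i) for i in range(4))  # |B_3(0)| in F_2^23
assert ball == 1 + 23 + 253 + 1771 == 2**11
assert 2**12 * ball == 2**23               # the balls partition F_2^23: perfect
```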

It can be shown that G_23 is the unique binary [23, 12, 7]-code, up to linear equivalence; its linear automorphism group Aut_{F_2}(G_23) ≤ I_23(F_2) ≅ S_23 is isomorphic to the second largest sporadic simple Mathieu group M_23, having order 10200960 ∼ 1.0·10^7. It can be shown that G_24 is the unique binary [24, 12, 8]-code, up to linear equivalence; its linear automorphism group Aut_{F_2}(G_24) ≤ I_24(F_2) ≅ S_24 is isomorphic to the largest sporadic simple Mathieu group M_24, having order 244823040 ∼ 2.4·10^8. Finally, we only mention that the binary Golay codes are intimately related to Steiner systems, in particular the Witt systems, occurring in algebraic combinatorics; see Exercises (14.24) and (14.25), where we give a very brief indication.

(11.8) Example: Ternary Golay codes [1949]. Let q := 3 and p := 11. We find that 3^5 ≡ 1 (mod 11), hence 3 ∈ Z_11^* has order 5, thus F_3(ζ_11) = F_{3^5} and ϕ_3 ∈ Γ := Aut_{F_3}(F_3(ζ_11)) has order 5. Moreover, we conclude that 3 ∈ Q_11,

that is (3/11) = 1. Hence the Γ-orbits on V_11 are V_11 = {1} ∪̇ {ζ_11^i; i ∈ Q_11} ∪̇ {ζ_11^i; i ∈ N_11}, where Q_11 := {1, 3, 4, 5, 9} and N_11 := {2, 6, 7, 8, 10}.
Thus we have X^11 − 1 = (X − 1) · ∏_{i∈Q_11} (X − ζ_11^i) · ∏_{i∈N_11} (X − ζ_11^i) = μ_0 μ_1 μ_2 ∈ F_3(ζ_11)[X]. Actually we have X^11 − 1 = (X − 1) · g' g'' ∈ F_3[X], where g' := X^5 − X^3 + X^2 − X − 1 ∈ F_3[X] and g'' := X^5 + X^4 − X^3 + X^2 − 1 ∈ F_3[X], hence the latter are both irreducible.
Let the ternary Golay code G_11 ≤ F_3^11 be the cyclic code generated by g' ∈ F_3[X], the associated generator matrix G ∈ F_3^{6×11} being given in Table 9; since (g')^* = −g'' the code generated by g'' is linearly equivalent to G_11. Hence G_11^0 ≤ G_11 has generator polynomial (X − 1) · g' = X^6 − X^5 − X^4 − X^3 + X^2 + 1. Moreover, G_11 has check polynomial h := (X − 1) · g'' = X^6 + X^4 − X^3 − X^2 − X + 1, thus G_11^⊥ has generator polynomial h^* = (X − 1)^* · (g'')^* ∼ (X − 1) · g', showing that G_11^⊥ = G_11^0; the associated check matrix H ∈ F_3^{5×11} is also given in Table 9.

Choosing a primitive 11-th root of unity over F_3 having minimum polynomial g', we conclude that G_11 is a QR code of type Q; the code generated by g'' then is the associated QR code of type N. Hence we again conclude that G_11^⊥ = G_11^0 ≤ G_11. Moreover, extending yields the extended ternary Golay code G_12 := Ĝ_11 ≤ F_3^12; by puncturing G_12 again we recover G_12^• = (Ĝ_11)^• = G_11. We conclude that G_12^⊥ = G_12, that is G_12 is self-dual.

We determine the minimum distance of G_11 and G_12: The code G_11 can be considered as a (wide sense) BCH code with irreducible generator polynomial μ_1 = g' and associated cyclotomic coset Q_11, hence has Bose distance δ = 4, so that the BCH bound yields d(G_11) ≥ 4. (The square root bound also yields d(G_11) ≥ √11, hence d(G_11) ≥ 4, but this we have not proven.)
We now make use of the following observation: Let C ≤ C^⊥ ≤ F_3^n be a weakly self-dual code, where n ∈ N. Now for any v = [a_1, ..., a_n] ∈ F_3^n we have ⟨v, v⟩ = ∑_{i=1}^{n} a_i^2 = |supp(v)| = wt(v) ∈ F_3. Hence for v ∈ C we have 3 | wt(v).
Thus we have 3 | wt(v) for all v ∈ G_12, hence 3 | d(G_12). Since d(G_12) ≥ d(G_11) ≥ 4 this implies that d(G_12) ≥ 6, and thus d(G_11) ≥ d(G_12) − 1 ≥ 5. Now the Hamming bound 3^{11−5} · ∑_{i=0}^{2} (11 choose i) · 2^i = 3^6 · (1 + 11·2 + 55·4) = 3^6 · 3^5 = 3^11 is fulfilled with equality for e := 2 = (5−1)/2, showing that d(G_11) = 5 and d(G_12) = 6. Hence G_11 is a perfect [11, 6, 5]-code, G_11^0 is an [11, 5, 6]-code, and G_12 is a quasi-perfect [12, 6, 6]-code.
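Again the sphere-packing count can be checked directly (an illustrative Python sketch, not part of the original notes):

```python
from math import comb

# sphere-packing count for the ternary Golay code: 3^6 words, radius-2 balls
ball = sum(comb(11, i) * 2**i for i in range(3))  # |B_2(0)| in F_3^11
assert ball == 1 + 22 + 220 == 3**5
assert 3**6 * ball == 3**11                       # perfect [11, 6, 5]-code
```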

It can be shown that G_11 is the unique ternary [11, 6, 5]-code, up to linear equivalence; its linear automorphism group Aut_{F_3}(G_11) ≤ I_11(F_3) ≅ {±1}^11 ⋊ S_11 is isomorphic to the direct product C_2 × M_11 of a cyclic group of order 2 with the smallest sporadic simple Mathieu group M_11, having order 7920. It can be shown that G_12 is the unique ternary [12, 6, 6]-code, up to linear equivalence; its linear automorphism group Aut_{F_3}(G_12) ≤ I_12(F_3) ≅ {±1}^12 ⋊ S_12 is isomorphic to the non-split two-fold central extension 2.M_12 of the second smallest sporadic simple Mathieu group M_12, having order 95040.

Table 9: Generator and check matrices for G11.

 2 2 1 2 . 1 .....   . 2 2 1 2 . 1 ....     .. 2 2 1 2 . 1 ...  6×11 G :=   ∈ F3  ... 2 2 1 2 . 1 ..     .... 2 2 1 2 . 1 .  ..... 2 2 1 2 . 1  1 . 1 2 2 2 1 ....   . 1 . 1 2 2 2 1 ...    5×11 H :=  .. 1 . 1 2 2 2 1 ..  ∈ F3    ... 1 . 1 2 2 2 1 .  .... 1 . 1 2 2 2 1

(11.9) Example: Football pool '13er-Wette'. To describe the outcome of a soccer match, we identify 'home team wins' with 1, 'guest team wins' with 2, and 'draw' with 0. Hence the outcome of n ∈ N matches can be considered as an element of F_3^n. Now the task is to bet on the outcome of these matches, and the more guesses are correct the higher the reward is. The football pool currently in use in Germany is based on n = 13; in the years 1969–2004 it was based on n = 11, and in Austria n = 12 is used.
• According to the German 'Lotto' company, it is realistic to assume that 10^6 gamblers participate. Betting on a certain outcome actually costs 0.50 €, hence there are 500 000 € at stake. From this, 60% are handed back to the winners, who have at least 10 correct guesses, according to the schedule below. Assuming independent and uniformly distributed guesses, we get the following winning probabilities and quotas, where the latter are obtained from the total rewards by dividing by the associated expected number of winners; these figures indeed fit nicely with the officially published quotas:

hits   %    reward/€   probability · 3^13           winners    quota/€
 13    21   105 000    1                               0.63    167403.91
 12    12    60 000    2 · (13 choose 1) = 26         16.31      3679.21
 11    12    60 000    2^2 · (13 choose 2) = 312     195.69       306.60
 10    15    75 000    2^3 · (13 choose 3) = 2288   1435.09        52.26

To facilitate analysis, we assume that a single bet on a certain outcome costs 1 €. This yields the following figures, with reward rates p_i, probabilities μ_i and

quotas q_i := p_i/μ_i, entailing an expected reward rate of ∑_{i=0}^{3} μ_i q_i = ∑_{i=0}^{3} p_i = 3/5:

i   p_i      q_i/3^13            μ_i · 3^13
0   21/100   21/100              1
1   3/25     (1/26) · (3/25)     2 · (13 choose 1) = 26
2   3/25     (1/104) · (1/25)    2^2 · (13 choose 2) = 312
3   3/20     (1/2288) · (3/20)   2^3 · (13 choose 3) = 2288
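The figures in the tables above can be reproduced directly (an illustrative Python sketch, not part of the original notes; 10^6 gamblers as assumed above):

```python
from math import comb

N = 3**13                                        # number of possible outcomes
counts = [comb(13, i) * 2**i for i in range(4)]  # outcomes with i wrong guesses
assert counts == [1, 26, 312, 2288]

gamblers, rewards = 10**6, [105000, 60000, 60000, 75000]
winners = [gamblers * c / N for c in counts]     # expected numbers of winners
quotas = [r / w for r, w in zip(rewards, winners)]
assert [round(w, 2) for w in winners] == [0.63, 16.31, 195.69, 1435.09]
targets = [167403.91, 3679.21, 306.60, 52.26]    # officially published style
assert all(abs(q - t) < 0.02 for q, t in zip(quotas, targets))
```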

• To launch a systematic attack, we look for codes having not too many elements and small covering radius. Thus the best candidates are perfect e-error correcting codes of length n, for some e ∈ N_0. In this case, the Hamming bound implies that |B_e(0_n)| = ∑_{i=0}^{e} |B_i(0_n) \ B_{i−1}(0_n)| = ∑_{i=0}^{e} (n choose i) · 2^i is a 3-power. For n = 13 we get the following cardinalities:

e 0 1 2 3 4 5 6 7 |Be(013)| 1 27 339 2627 14067 55251 165075 384723

e 8 9 10 11 12 13 |Be(013)| 714195 1080275 1373139 1532883 1586131 1594323
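The table can be reproduced, together with the search for 3-powers among the ball sizes (an illustrative Python sketch, not part of the original notes):

```python
from math import comb

def is_power_of_3(s):
    while s % 3 == 0:
        s //= 3
    return s == 1

# |B_e(0_13)| = sum of binom(13, i) * 2^i for i <= e, as in the table above
sizes = [sum(comb(13, i) * 2**i for i in range(e + 1)) for e in range(14)]
assert sizes[:4] == [1, 27, 339, 2627] and sizes[13] == 3**13
# only the trivial cases and e = 1 give a 3-power: the Hamming code fits
assert [e for e in range(14) if is_power_of_3(sizes[e])] == [0, 1, 13]
```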

Hence, not surprisingly, next to the trivial code and all of F_3^13, we only find the case e = 1, into which the 1-error correcting perfect ternary Hamming [13, 10, 3]-code H_3 fits; note that indeed the projective space P^2(F_3) has (3^3 − 1)/(3 − 1) = 13 elements. Thus with 3^10 = 59049 bets, out of a total of 3^13 = 1594323, it is possible to guess at least 13 − 1 = 12 of the outcomes correctly. More precisely, assuming again that outcomes are independent and uniformly distributed, we get the following winning probabilities:
Given an outcome v ∈ F_3^13, we have to count the codewords w ∈ H_3 such that d(v, w) = wt(v − w) = i, for i ∈ {0, ..., 3}, which amounts to counting the elements of the coset v + H_3 ∈ F_3^13/H_3 having weight i. Averaging over the 3^{13−10} = 3^3 cosets yields an expected reward of (1/3^3) · ∑_{v∈F_3^13/H_3} ∑_{i=0}^{3} |{w ∈ v + H_3; wt(w) = i}| · q_i = (1/3^3) · ∑_{i=0}^{3} q_i · (∑_{v∈F_3^13/H_3} |{w ∈ v + H_3; wt(w) = i}|) = (1/3^3) · ∑_{i=0}^{3} q_i · |{w ∈ F_3^13; wt(w) = i}| = (1/3^3) · ∑_{i=0}^{3} q_i · (μ_i · 3^13) = 3^10 · ∑_{i=0}^{3} p_i. Since we have to place 3^10 bets, we get an expected reward rate of ∑_{i=0}^{3} p_i = 3/5, which is precisely as good as random guessing. Note that this is independent of the choice of the reward rates p_i, and that we have not used that H_3 is perfect.
• Another strategy, using expert knowledge, is as follows: If the outcome of two matches is known for sure, then we may puncture with respect to these components, and end up in F_3^11. For n = 11 we get the following cardinalities:

e 0 1 2 3 4 5 6 |Be(011)| 1 23 243 1563 6843 21627 51195

e 7 8 9 10 11 |Be(011)| 93435 135675 163835 175099 177147

Hence, next to the trivial code and all of F_3^11, we only find the case e = 2, into which the 2-error correcting perfect ternary Golay [11, 6, 5]-code G_11 fits. Thus with 3^6 = 729 bets, out of a total of 3^11 = 177147, next to the above expert knowledge, it is possible to guess at least 13 − 2 = 11 of the outcomes correctly. More precisely, assuming that the unsure outcomes are independent and uniformly distributed, and assuming independent and uniformly distributed guesses, we get the following winning probabilities μ'_i:

i   p_i      q_i/3^13            μ'_i · 3^11
0   21/100   21/100              1
1   3/25     (1/26) · (3/25)     2 · (11 choose 1) = 22
2   3/25     (1/104) · (1/25)    2^2 · (11 choose 2) = 220
3   3/20     (1/2288) · (3/20)   2^3 · (11 choose 3) = 1320

We get an expected reward rate of ∑_{i=0}^{3} μ'_i q_i = ∑_{i=0}^{3} (μ'_i/μ_i) · p_i = 3^2 · ∑_{i=0}^{3} ((11 choose i)/(13 choose i)) · p_i = 3^2 · (p_0 + (11/13) · p_1 + (55/78) · p_2 + (15/26) · p_3) = 3^2 · 251/520 = 2259/520 ∼ 4.34. Hence this is a sensible winning strategy, only depending on expert knowledge, but not on using a code.
Using the code G_11, averaging over the 3^{11−6} = 3^5 cosets in F_3^11/G_11 yields an expected reward of (1/3^5) · ∑_{v∈F_3^11/G_11} ∑_{i=0}^{3} |{w ∈ v + G_11; wt(w) = i}| · q_i = (1/3^5) · ∑_{i=0}^{3} q_i · |{w ∈ F_3^11; wt(w) = i}| = (1/3^5) · ∑_{i=0}^{3} q_i · (μ'_i · 3^11) = 3^6 · ∑_{i=0}^{3} μ'_i q_i. Since we have to place 3^6 bets, we get an expected reward rate of ∑_{i=0}^{3} μ'_i q_i, which is precisely as good as random guessing.
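Both expected reward rates can be checked with exact rational arithmetic (an illustrative Python sketch, not part of the original notes):

```python
from fractions import Fraction as F
from math import comb

p = [F(21, 100), F(3, 25), F(3, 25), F(3, 20)]           # reward rates
mu  = [F(comb(13, i) * 2**i, 3**13) for i in range(4)]   # 13 random matches
mu_ = [F(comb(11, i) * 2**i, 3**11) for i in range(4)]   # 2 outcomes known
q = [p[i] / mu[i] for i in range(4)]                     # quotas

assert sum(m * qq for m, qq in zip(mu, q)) == F(3, 5)    # blind betting
rate = sum(m * qq for m, qq in zip(mu_, q))              # expert betting
assert rate == F(2259, 520)                              # about 4.34
```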

IV

12 Exercises to Part I (in German)

(12.1) Aufgabe: Quellencodierung.
Alice kann vier verschiedene Nachrichten senden:
00: 'Die Börse ist sehr fest.'   11: 'Sollen wir verkaufen?'
01: 'Die Kurse fallen.'          10: 'Helft uns gegensteuern!'
Diese werden jeweils durch zwei Bits wie angegeben codiert. Bob empfängt die Dauernachricht ... 001100110011 ... Welche Nachrichten will Alice schicken?

(12.2) Aufgabe: Arithmetischer Prüfzeichencode.
Es sei C := {[x_1, ..., x_10] ∈ Z_11^10; ∑_{i=1}^{10} i·x_i = 0 ∈ Z_11} ≤ Z_11^10 der Code des ISBN-10-Standards. Kann C Zwillingsfehler und Sprungzwillingsfehler erkennen?

(12.3) Aufgabe: Geometrischer Prüfzeichencode.
Für a ∈ Z_11 sei C_a := {[x_1, ..., x_10] ∈ Z_11^10; ∑_{i=1}^{10} a^i·x_i = 0 ∈ Z_11} ≤ Z_11^10. Man untersuche, in Abhängigkeit von a, wann C_a Einzelfehler, Drehfehler, Zwillingsfehler und Sprungzwillingsfehler erkennen kann.

(12.4) Aufgabe: Kontonummern-Code.
a) Für x ∈ N_0 sei Q(x) ∈ N_0 die Quersumme von x bezüglich der Dezimaldarstellung. Man zeige: Faßt man Z_10 als Teilmenge von N_0 auf, so erhält man eine Bijektion Z_10 → Z_10: x ↦ Q(2x).
b) Ein häufig verwendeter Prüfzeichencode bei der Bildung von Kontonummern ist C := {[x_1, ..., x_{2n}] ∈ Z_10^{2n}; ∑_{i=1}^{n} (Q(2x_{2i−1}) + x_{2i}) = 0 ∈ Z_10} ⊆ Z_10^{2n}, für n ∈ N. Kann C Einzelfehler und Drehfehler erkennen?

(12.5) Aufgabe: International Bank Account Number.
Man betrachte die gültige IBAN 'DE68 3905 0000 0123 4567 89'. Angenommen, an der führenden Stelle der Kontonummer wird statt '0' eine '9' eingetippt, so daß die BBAN nun '3905 0000 9123 4567 89' lautet. Welche möglichen weiteren Tippfehler gibt es, so daß eine so entstehende doppelt fehlerhafte BBAN nicht anhand der Prüfzeichen 'DE68' erkannt werden kann?

(12.6) Aufgabe: Prüfzeichencodes über Gruppen.
Man betrachte einen Prüfzeichencode über der endlichen Gruppe G, bezüglich der bijektiven Abbildungen π_i: G → G, für i ∈ {1, ..., n} und n ∈ N. Welche Bedingungen müssen die Abbildungen π_i jeweils erfüllen, damit Zwillingsfehler und Sprungzwillingsfehler erkannt werden?

(12.7) Aufgabe: Prüfzeichencodes über Diedergruppen.
Für n ≥ 2 seien D_2n := ⟨a, b; a^n = 1, b^2 = 1, b^{−1}ab = a^{−1}⟩ die Diedergruppe der Ordnung 2n, und τ: D_2n → D_2n gegeben durch a^i ↦ a^{−i−1} und a^i·b ↦ a^i·b, für i ∈ {0, ..., n−1}. Man zeige: Die Abbildung τ ist wohldefiniert und bijektiv. Ist n ungerade, so gilt g·h^τ ≠ h·g^τ für alle g ≠ h ∈ D_2n.

(12.8) Aufgabe: Huffman-Codierung.
Zur (optimalen) binären Quellencodierung kann man folgenden rekursiven Algorithmus benutzen: Es seien X = {x_1, ..., x_q} ein Alphabet mit Wahrscheinlichkeitsverteilung μ, und μ_k := μ(x_k) für k ∈ {1, ..., q}. Weiter seien j ≠ i ∈ {1, ..., q} mit μ_i = min{μ_1, ..., μ_q} und μ_j = min({μ_1, ..., μ_q} \ {μ_i}). Dann werden x_i ↦ 0 und x_j ↦ 1 codiert. Nun benutzt man das Alphabet X' := ({x_1, ..., x_q} \ {x_i, x_j}) ∪̇ {x_{ij}}, mit unveränderten Wahrscheinlichkeiten für x_k, für k ∈ {1, ..., q} \ {i, j}, und Wahrscheinlichkeit μ_{ij} := μ_i + μ_j für x_{ij}, um rekursiv Präfixe zu den bereits gefundenen Codierungen zu bestimmen.
a) Man zeige: Die obige Huffman-Codierung ergibt eine injektive Funktion h: X → (Z_2)^* \ {ε}, und der Code h(X) ⊆ (Z_2)^* ist präfix-frei. Außerdem zeige man: Für die mittlere Wortlänge gilt H(X) ≤ ∑_{i=1}^{q} μ_i · l(h(x_i)) ≤ H(X) + 1, wobei die untere Schranke genau dann angenommen wird, wenn μ_i = 2^{−e_i} ist, wobei e_i ∈ N_0 für alle i ∈ {1, ..., q}.
b) Nun trage X die Gleichverteilung, und es sei k := ⌊log_2(q)⌋ ∈ N. Man zeige: Die zugehörige Huffman-Codierung besteht aus Wörtern der Länge k und k+1, und hat die mittlere Wortlänge k + 2 − 2^{k+1}/q. Was passiert im Falle q = 2^k? Was fällt beim Vergleich mit H(X) auf?
c) Etwa erhält man für das Alphabet Z_4 mit den angegebenen Wahrscheinlichkeiten die folgende Huffman-Codierung:

i   μ_i   h(i) ∈ (Z_2)^*
0   0.4   0
1   0.3   11
2   0.2   101
3   0.1   100

Man bestimme die Entropie H(Z_4) und die mittlere Wortlänge ∑_{i∈Z_4} μ_i · l(h(i)) der zugehörigen Huffman-Codierung. Außerdem betrachte man das Alphabet Z_4^2, wobei die beiden Positionen unabhängig seien, und bestimme ebenso H(Z_4^2), die zugehörige Huffman-Codierung und ihre mittlere Wortlänge. Was fällt auf?
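Der rekursive Algorithmus aus (12.8) läßt sich etwa wie folgt skizzieren (illustrierende Python-Skizze, nicht Teil der Aufgabe; der Hilfsname huffman_lengths ist frei gewählt). Für das Beispiel aus c) ergeben sich die Codewortlängen [1, 2, 3, 3] und die mittlere Wortlänge 1.9:

```python
import heapq

def huffman_lengths(probs):
    """Codewortlaenge jedes Symbols unter binaerer Huffman-Codierung."""
    # Heap-Eintraege: (Wahrscheinlichkeit, Tie-Break, Symbole des Teilbaums)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    length = [0] * len(probs)
    count = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)  # die beiden seltensten Teilbaeume
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:                # Verschmelzen verlaengert um ein Bit
            length[i] += 1
        heapq.heappush(heap, (p1 + p2, count, s1 + s2))
        count += 1
    return length

mu = [0.4, 0.3, 0.2, 0.1]
lengths = huffman_lengths(mu)
assert lengths == [1, 2, 3, 3]  # wie in der Tabelle: 0, 11, 101, 100
assert abs(sum(m * l for m, l in zip(mu, lengths)) - 1.9) < 1e-12
```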

(12.9) Aufgabe: Huffman-Codierung.
a) Man schreibe ein GAP-Programm zur Berechnung der Huffman-Codierung des Alphabets X := {a, ..., z} bezüglich der angegebenen relativen Häufigkeiten der Buchstaben in der englischen Sprache. Man bestimme die Entropie H(X) und die mittlere Wortlänge der zugehörigen Huffman-Codierung.

X  Z_26  μ       X  Z_26  μ
a   0    0.082   n  13    0.067
b   1    0.015   o  14    0.075
c   2    0.028   p  15    0.019
d   3    0.043   q  16    0.001
e   4    0.127   r  17    0.060
f   5    0.022   s  18    0.063
g   6    0.020   t  19    0.091
h   7    0.061   u  20    0.028
i   8    0.070   v  21    0.010
j   9    0.002   w  22    0.023
k  10    0.008   x  23    0.001
l  11    0.040   y  24    0.020
m  12    0.024   z  25    0.001
b) Man schreibe ein GAP-Programm, mit dem man aus der Huffman-Codierung eines Texts den ursprünglichen Text zurückgewinnt.
c) Man codiere und decodiere damit den folgenden Text aus 314 Buchstaben, wobei Leerzeichen und Satzzeichen ignoriert werden können. Wieviele Bit hat die Huffman-Codierung dieses Texts?

the almond tree was in a tentative blossom, the days were longer, often ending with magnificent evenings of corrugated pink skies, the hunting season was over, with hounds and guns put away for six months, the vineyards were busy again, as the well organized farmers treated their vines, and the more lackadaisical neighbors hurried to do the pruning they should have done in november

(12.10) Aufgabe: Entropie.
Es sei X ein Alphabet mit Wahrscheinlichkeitsverteilung μ_X. Man zeige: Es gilt H(X) ≤ log_2(|X|), mit Gleichheit genau für die Gleichverteilung.

Hinweis. Man benutze die Jensen-Ungleichung für log_2: R_{>0} → R.

(12.11) Aufgabe: Bedingte Entropie.
Es seien X und Y Alphabete mit Wahrscheinlichkeitsverteilungen μ_X und μ_Y.
a) Für die gemeinsame Verteilung auf X × Y zeige man: Es gilt H(X × Y) ≤ H(X) + H(Y), mit Gleichheit genau dann, wenn μ_X und μ_Y unabhängig sind.
b) Daraus folgere man: Es gilt H(X|Y) ≤ H(X). Wann gilt Gleichheit?

Hinweis. Man benutze die Jensen-Ungleichung für log_2: R_{>0} → R.

(12.12) Aufgabe: Symmetrischer binärer Kanal.
Man bestimme die maximale Kapazität eines symmetrischen binären Kanals mit Fehlerwahrscheinlichkeit 1/2 ≤ p ≤ 1.

(12.13) Aufgabe: ML-Decodierung.
Man betrachte einen symmetrischen binären Kanal mit Fehlerwahrscheinlichkeit 0 ≤ p < 1/2; typische Werte sind etwa p = 10^{−e} für e ∈ {1, 2, 3}. Über diesen Kanal sollen die Wörter in F_2^3 übertragen werden, wobei Gleichverteilung auf F_2^3 angenommen und ML-Decodierung verwendet werde.
a) Wie groß ist die Fehlerwahrscheinlichkeit γ(F_2^3), wenn die Wörter ohne Redundanz, also mit Informationsrate ρ(F_2^3) = 1, gesendet werden?
b) Nun werde der Code C_0 ≤ F_2^6 mit Generatormatrix G_0 := [E_3 | E_3] ∈ F_2^{3×6}, also mit Informationsrate ρ(C_0) = 1/2, verwendet. Wie groß ist die Fehlerwahrscheinlichkeit γ(C_0)?
c) Schließlich werde der Code C ≤ F_2^6 mit Generatormatrix
G :=
1 . . . 1 1
. 1 . 1 . 1
. . 1 1 1 .   ∈ F_2^{3×6},
also ebenfalls mit Informationsrate ρ(C) = 1/2, verwendet. Wie groß ist nun die Fehlerwahrscheinlichkeit γ(C)?
Hinweis. Man betrachte die Kugeln B_r(c) ⊆ F_2^6, für c ∈ C und r ∈ {0, 1, 2}.

(12.14) Aufgabe: ML-Decodierung.
Zur Datenübertragung durch einen symmetrischen binären Kanal mit Fehlerwahrscheinlichkeit 0 ≤ p < 1/2 werde ein (7, 2^4, 3)-Code C (etwa ein Hamming-Code) verwendet, wobei Gleichverteilung auf C angenommen und ML-Decodierung verwendet werde. Man bestimme die Fehlerwahrscheinlichkeit γ(C).

13 Exercises to Part II (in German)

(13.1) Aufgabe: Hamming-Abstand.
a) Es seien n ∈ N und v, w ∈ F_2^n mit d := d(v, w) ∈ N_0. Für r, s ∈ N_0 betrachte man A := {u ∈ F_2^n; d(v, u) = r, d(w, u) = s}, und es sei t := (d + r − s)/2. Man zeige: Ist t ∉ Z, so ist A = ∅; ist t ∈ Z, so ist |A| = (d choose t) · (n−d choose r−t).
b) Es seien n ∈ N und v, w, x, y ∈ F_2^n mit paarweisem Abstand d ∈ N. Man zeige: Der Abstand d ist gerade; und es gibt genau ein u ∈ F_2^n mit d(v, u) = d(w, u) = d(x, u) = d/2. Gilt notwendig auch d(y, u) = d/2?

(13.2) Aufgabe: Auslöschungen.
Es sei C ein nicht-trivialer Block-Code. Geht bei der Datenübertragung ein Eintrag eines Wortes verloren, so wird er als Auslöschung markiert; dies ist also ein Fehler mit bekannter Position. Für e, g ∈ N_0 zeige man: Der Code C kann genau dann gleichzeitig e Fehler und g Auslöschungen korrigieren, wenn 2e + g + 1 ≤ d(C) gilt.

(13.3) Aufgabe: Minimaldistanz.
Es sei C ein binärer Code der Länge n ∈ N und Minimaldistanz d ≥ 3. Man zeige: Es gilt |C| ≤ 2^n/(n+1); und ist n gerade, so gilt |C| ≤ 2^n/(n+2). Kann die Schranke auch für n ungerade verbessert werden?
Hinweis. Man zähle {[v, w] ∈ C × F_2^n; d(v, w) = 2} mittels Double-Counting.

(13.4) Aufgabe: Nichtlineare Codes.
a) Aus Singleton-, Hamming- und Plotkin-Schranke folgere man: Für einen binären (6, 9, d)-Code gilt d ≤ 3. Man gebe einen binären (6, 9, 2)-Code an.
b) Man zeige: Es gibt keinen binären (6, 9, 3)-Code.

(13.5) Aufgabe: Parameter für Codes.
Für n ∈ N betrachte man einen (n, q^{n−3}, 3)-Code über einem Alphabet mit q Elementen. Man zeige: Es gilt n ≤ q^2 + q + 1.

(13.6) Aufgabe: Perfekte Codes.
Es sei C ⊆ F_2^n ein perfekter Code mit Minimaldistanz 7. Man zeige: Es ist n ∈ {7, 23}. Man bestimme C für den Fall n = 7.

(13.7) Aufgabe: Hamming-Schranke.
Man zeige: Die Parameter q := 2, n := 90, m := 2^78 und d := 5 erfüllen die Hamming-Schranke, aber es gibt keinen binären (n, m, d)-Code.
Hinweis. Ist C ⊆ F_2^n solch ein Code, so betrachte man A := {v = [x_1, ..., x_n] ∈ F_2^n; x_1 = x_2 = 1, wt(v) = 3} und B := {w = [y_1, ..., y_n] ∈ C; y_1 = y_2 = 1, wt(w) = 5}, und zähle {[v, w] ∈ A × B; v·w^tr = 1} mittels Double-Counting.

(13.8) Aufgabe: Selbstduale Codes.
a) Es sei C ≤ F_q^n ein Code mit Kontrollmatrix H = [A | E_{n−k}] ∈ F_q^{(n−k)×n} in Standardform, wobei k = dim_{F_q}(C) und A ∈ F_q^{(n−k)×k}. Man zeige: Der Code C ist genau dann selbstdual, wenn 2k = n und A·A^tr = −E_{n−k} gelten.
b) Es sei p ∈ N eine Primzahl. Man gebe selbstduale F_p-lineare Codes der Längen n = 4 und n = 8 an.
Hinweis zu b). Man unterscheide die Fälle p = 2 sowie p ≡ ±1 (mod 4).

(13.9) Aufgabe: Gewichtssumme.
Es sei C ≤ F_q^n ein Code mit k := dim_{F_q}(C) ≥ 1, so daß eine Generatormatrix für C keine Nullspalte enthalte. Man zeige: Es gilt ∑_{v∈C} wt(v) = n(q−1)q^{k−1}.

(13.10) Aufgabe: Systematische Codes.
Es sei C ≤ F_q^n ein Code mit Generatormatrix G ∈ F_q^{k×n}. Man zeige: Eine k-elementige Teilmenge von Spalten von G ist genau dann F_q-linear unabhängig, wenn C systematisch auf den zugehörigen Koordinaten ist.

(13.11) Aufgabe: MDS-Codes und Dualität.
Es sei C < F_q^n ein nicht-trivialer [n, k, d]-Code. Man zeige die Äquivalenz der folgenden Aussagen:
i) C ist ein MDS-Code.
ii) C^⊥ ist ein MDS-Code.
iii) C ist systematisch auf allen k-elementigen Teilmengen von Koordinaten.
iv) In jeder Generatormatrix für C sind alle k-elementigen Teilmengen von Spalten F_q-linear unabhängig.
v) In jeder Kontrollmatrix für C sind alle (n−k)-elementigen Teilmengen von Spalten F_q-linear unabhängig.

(13.12) Aufgabe: Syndrom-Decodierung.
Man betrachte den Hamming-Code H ≤ F_2^7, der durch die folgende Generatormatrix definiert wird:
G :=
1 . . . . 1 1
. 1 . . 1 . 1
. . 1 . 1 1 .
. . . 1 1 1 1   ∈ F_2^{4×7}.
Man bestimme Syndrome, zugehörige Nebenklassenführer und decodiere
i) [1, 1, 0, 0, 1, 1, 0], ii) [1, 1, 1, 0, 1, 1, 0], iii) [1, 1, 1, 1, 1, 1, 0].
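Eine Skizze der Syndrom-Decodierung für den Code aus (13.12) (illustrierende Python-Skizze, nicht Teil der Aufgabe; angenommen wird die Kontrollmatrix H = [A^tr | E_3] zur Generatormatrix G = [E_4 | A]; da der Code ein Hamming-Code ist, genügen Nebenklassenführer vom Gewicht ≤ 1):

```python
# Kontrollmatrix H = [A^tr | E_3] zu G = [E_4 | A] aus (13.12)
H = [[0, 1, 1, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]

def syndrome(v):
    return tuple(sum(r[i] * v[i] for i in range(7)) % 2 for r in H)

# Nebenklassenfuehrer: das Nullwort und die sieben Woerter vom Gewicht 1
leaders = {syndrome([0] * 7): [0] * 7}
for j in range(7):
    e = [0] * 7
    e[j] = 1
    leaders[syndrome(e)] = e

def decode(v):
    e = leaders[syndrome(v)]  # Fehlerschaetzung zum naechsten Codewort
    return [(a + b) % 2 for a, b in zip(v, e)]

assert decode([1, 1, 0, 0, 1, 1, 0]) == [1, 1, 0, 0, 1, 1, 0]  # i) Codewort
assert decode([1, 1, 1, 0, 1, 1, 0]) == [1, 1, 0, 0, 1, 1, 0]  # ii)
assert decode([1, 1, 1, 1, 1, 1, 0]) == [1, 1, 1, 1, 1, 1, 1]  # iii)
```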

(13.13) Aufgabe: Syndrom-Decodierung.
Es sei C ≤ F_2^7 der durch die folgende Generatormatrix definierte Code:
G :=
1 . . . 1 . 1
. 1 . . 1 . 1
. . 1 . . 1 1
. . . 1 . 1 1   ∈ F_2^{4×7}.
Man bestimme Syndrome, zugehörige Nebenklassenführer und decodiere
i) [1, 1, 0, 1, 0, 1, 1], ii) [0, 1, 1, 0, 1, 1, 1], iii) [0, 1, 1, 1, 0, 0, 0].

(13.14) Aufgabe: Eindeutige Decodierbarkeit.
Es sei C ≤ F_2^10 der durch die folgende Generatormatrix definierte Code:
G :=
1 . . . . . . 1 1 .
. 1 . . . . 1 1 . .
. . 1 . . 1 . 1 . .
. . . 1 . 1 1 . . .
. . . . 1 1 1 1 . .   ∈ F_2^{5×10}.

Man zeige: Alle Vektoren in F_2^10 sind eindeutig decodierbar. Man bestimme Minimaldistanz und Überdeckungsradius von C.

(13.15) Aufgabe: Äquidistante Codes.
Es sei C ⊆ F_2^16 ein binärer Code mit wt(v) = 6 und d(v, w) = 8 für alle v ≠ w ∈ C. Man zeige: Es gilt |C| ≤ 16. Gibt es einen solchen Code mit |C| = 16?

(13.16) Aufgabe: MDS-Codes.
Es sei C ≤ F_q^n ein [n, k, d]-MDS-Code mit k ≥ 2. Man zeige: Es gilt d ≤ q.

(13.17) Aufgabe: Griesmer-Schranke.
a) Man zeige: Die Parameter q := 3, n := 14, k := 4 und d := 9 erfüllen die Griesmer-Schranke, aber es gibt keinen ternären [14, 4, 9]-Code.
b) Man zeige: Die Parameter q := 2, n := 15, k := 8 und d := 5 erfüllen die Griesmer-Schranke, aber es gibt keinen binären [15, 8, 5]-Code.
Hinweis. Man betrachte jeweils den residualen Code.

(13.18) Aufgabe: Optimale Codes.
Für n, d ∈ N sei K_2(n, d) := max{k ∈ N; es gibt einen binären [n, k, d]-Code}. Man zeige, daß K_2(n, 2d−1) = K_2(n+1, 2d) und K_2(n+1, d) ≤ K_2(n, d) + 1 gelten.

(13.19) Aufgabe: Best-Code.
Ziel ist es, zu zeigen, daß optimale nicht-lineare Codes besser sein können als optimale lineare Codes, aber daß es schwierig sein kann, solche zu finden:
a) Man zeige: Für einen binären [10, k, 4]-Code gilt k ≤ 5. Man gebe einen binären [10, 5, 4]-Code an; also gilt K_2(10, 4) = 5.
b) Es sei C_0 ≤ F_2^10 der von [G | G] ∈ F_2^{3×10} erzeugte Code, wobei
G :=
1 . . . 1
. 1 . 1 1
. . 1 1 .   ∈ F_2^{3×5}.

Es seien v := [1, 0, 0, 0, 0; 0, 0, 1, 0, 0] ∈ F_2^10 und π := (1, 2, 3, 4, 5)(6, 7, 8, 9, 10) ∈ S_10, sowie C := (v + C_0)^{⟨π⟩} ⊆ F_2^10 der nicht-lineare Best-Code [1978]. Man zeige: C ist ein (10, 40, 4)-Code. (Man kann zeigen, daß C optimal ist.)

(13.20) Aufgabe: Erweiterte Codes.
Es seien C_1 und C_2 die durch die folgenden Generatormatrizen G_1 bzw. G_2 definierten ternären Codes:
G_1 :=
1 . 1 . .
. 1 . 1 1   ∈ F_3^{2×5}
und
G_2 :=
1 . 2 . .
. 1 . 2 2   ∈ F_3^{2×5}.

Man bestimme die Minimaldistanz von C_1 und C_2 sowie von Ĉ_1 und Ĉ_2.

(13.21) Aufgabe: Modifikation von Codes.
Man wende die Konstruktionen Punktierung, Erweiterung, Bereinigung, Augmentierung, Verkürzung und Verlängerung auf die binären Prüfzeichen- und Wiederholungscodes an, und bestimme die Parameter der so erhaltenen Codes.

(13.22) Aufgabe: Schwach selbstduale binäre Codes.
Es sei C ≤ F_2^n ein schwach selbstdualer Code.
a) Man zeige: Es ist C = C^0, wobei C^0 ≤ F_2^n den bereinigten Code bezeichne. Es gilt genau dann 4 | wt(v) für alle v ∈ C, wenn dies für eine F_2-Basis von C gilt. Ist n ungerade, so ist 4 | wt(v) für alle v ∈ C.
b) Es sei C ein [n, (n−1)/2, d]-Code, für n ∈ N ungerade. Man zeige: Es gilt C < C^⊥ = C̃, wobei C̃ ≤ F_2^n den augmentierten Code bezeichne.
c) Es sei C selbstdual. Man zeige: Es gilt C = C^⊥ = C̃.

(13.23) Aufgabe: Gerades Gewicht.
Man zeige: Gibt es einen binären (n, m, d)-Code mit d ∈ N gerade, so gibt es auch einen binären (n, m, d)-Code, dessen Elemente alle gerades Gewicht haben.

(13.24) Aufgabe: Summen und Produkte von Codes.
Es seien C ein nicht-trivialer [n, k, d]-Code sowie C' ein nicht-trivialer [n', k', d']-Code über F_q, mit Generatormatrizen G ∈ F_q^{k×n} bzw. G' ∈ F_q^{k'×n'}. Man zeige die folgenden Aussagen, und untersuche, was jeweils passiert, wenn man andere Generatormatrizen wählt:
a) Ist k = k', so erzeugen die Zeilen der Matrix [G | G'] ∈ F_q^{k×(n+n')} einen [n+n', k, d'']-Code mit d'' ≥ d + d'; er wird Verkettung von C und C' genannt.
b) Die Zeilen der Matrix
[ G  . ]
[ .  G' ]   ∈ F_q^{(k+k')×(n+n')}
erzeugen einen [n+n', k+k', min{d, d'}]-Code; er wird direkte Summe von C und C' genannt.
c) Ist n = n', so erzeugen die Zeilen der Matrix
[ G  G ]
[ .  G' ]   ∈ F_q^{(k+k')×(2n)}
einen [2n, k+k', min{2d, d'}]-Code; er wird Plotkin-Summe von C und C' genannt.
d) Die Menge der Matrizen in F_q^{n×n'}, deren Spalten bzw. Zeilen Elemente von C bzw. C' sind, ist ein [nn', kk', dd']-Code, der G ⊗ G' ∈ F_q^{(kk')×(nn')} als Generatormatrix besitzt; er wird direktes Produkt von C und C' genannt.

(13.25) Aufgabe: Produktcodes.
Es sei C ≤ F_2^{8×16} das direkte Produkt der erweiterten binären Hamming-Codes Ĥ_3 und Ĥ_4; also ist C ein [128, 44, 16]-Code und 7-fehlerkorrigierend. In der Tat hat C aber viel bessere Fehlerkorrektureigenschaften: Angenommen, bei der Übertragung eines Codeworts geschehen Fehler genau in den 14 (zufällig ausgewählten, dem Empfänger nicht bekannten) Positionen [2, 7, 19, 24, 27, 32, 45, 51, 53, 76, 82, 86, 96, 121]; dabei werden die Einträge der Matrizen zeilenweise durchnumeriert. Man zeige, daß das empfangene Wort eindeutig decodiert werden kann.

(13.26) Aufgabe: Punktierter Simplex-Code.
Durch Punktieren konstruiere man aus dem binären [31, 5, 16]-Simplex-Code einen [21, 5, 10]-Code. Wie verhält er sich zur Griesmer-Schranke?

(13.27) Aufgabe: Hamming- und Simplex-Codes.
Für k ≥ 2 seien H_k und S_k die zugehörigen Hamming- bzw. Simplex-Codes über F_q. Wie verhalten sich jeweils die Informationsrate und die relative Minimaldistanz dieser Codes für k → ∞?

(13.28) Aufgabe: Erweiterte Hamming-Codes.
Es seien k ≥ 2 und C ein binärer [2^k, 2^k − k − 1, 4]-Code. Man zeige: C ist linear äquivalent zum erweiterten Hamming-Code Ĥ_k.

(13.29) Aufgabe: Hadamard-Codes.
a) Eine Matrix H ∈ R^{n×n}, für n ∈ N, die Einträge in {±1} hat und H·H^tr = n·E_n erfüllt, heißt Hadamard-Matrix; falls zudem die Einträge in der ersten Zeile und Spalte sämtlich positiv sind, so heißt H normalisiert.
Man zeige: Ist H ∈ R^{n×n} eine Hadamard-Matrix mit n ≥ 3, so ist n ≡ 0 (mod 4). Außerdem zeige man: Sind H ∈ R^{n×n} und H' ∈ R^{n'×n'} Hadamard-Matrizen, so ist H ⊗ H' ∈ R^{(nn')×(nn')} ebenfalls eine Hadamard-Matrix.
b) Ersetzt man in einer normalisierten Hadamard-Matrix H ∈ R^{n×n} den Eintrag 1 bzw. −1 durch 0 ∈ F_2 bzw. 1 ∈ F_2, so erhält man die zugehörige binäre Hadamard-Matrix. Die Zeilen der binären Matrizen zu H und −H bilden den zugehörigen binären Hadamard-Code A. Verkürzt man A bezüglich der ersten Komponente, so erhält man den verkürzten Hadamard-Code A°.
Man zeige: A ist ein (n, 2n, n/2)-Code, und A° ist ein (n − 1, n, n/2)-Code. Was ist ihr Überdeckungsradius? Wie verhalten sie sich zur Plotkin-Schranke?
c) Nun seien
H_2 = H_2^{⊗1} := [ 1  1 ; 1  −1 ] ∈ R^{2×2}
und H_2^{⊗(k+1)} := H_2^{⊗k} ⊗ H_2 ∈ R^{2^{k+1}×2^{k+1}}, für k ∈ N. Man zeige: H_2^{⊗k} ist eine normalisierte Hadamard-Matrix; sie wird auch Sylvester-Matrix genannt.
Weiter zeige man: Die zugehörigen Codes A_k und A_k° sind linear; also ist A_k ein [2^k, k + 1, 2^{k−1}]-Code, und A_k° ist ein [2^k − 1, k, 2^{k−1}]-Code. Daraus folgere man: Für k ≥ 2 ist A_k linear äquivalent zum Reed-Muller-Code R_k, und A_k° ist linear äquivalent zum Simplex-Code S_k. Welche Codes erhält man für k ≤ 3?
d) Man zeige, daß die folgende Matrix eine Hadamard-Matrix ist. Sind die zugehörigen binären Hadamard- und verkürzten Hadamard-Codes linear?

 111111111111   1 −1 −1 1 −1 −1 −1 1 1 1 −1 1     1 1 −1 −1 1 −1 −1 −1 1 1 1 −1     1 −1 1 −1 −1 1 −1 −1 −1 1 1 1     1 1 −1 1 −1 −1 1 −1 −1 −1 1 1     1 1 1 −1 1 −1 −1 1 −1 −1 −1 1  12×12   ∈ R  1 1 1 1 −1 1 −1 −1 1 −1 −1 −1     1 −1 1 1 1 −1 1 −1 −1 1 −1 −1     1 −1 −1 1 1 1 −1 1 −1 −1 1 −1     1 −1 −1 −1 1 1 1 −1 1 −1 −1 1     1 1 −1 −1 −1 1 1 1 −1 1 −1 −1  1 −1 1 −1 −1 −1 1 1 1 −1 1 −1

(13.30) Aufgabe: Gewichtsverteilungen.
Es sei C ≤ F_2^n mit Gewichtsverteilung A_C ∈ C[X,Y].
a) Man gebe die Gewichtsverteilungen des erweiterten Codes Ĉ, des bereinigten Codes C^0 und des augmentierten Codes C̃ an.
b) Ein f ∈ C[X] mit f^∗ = f heißt Palindrom. Man gebe eine notwendige und hinreichende Bedingung dafür an, daß A_C(X, 1) ∈ C[X] ein Palindrom ist.

(13.31) Aufgabe: Selbstduale binäre Codes.
a) Für einen selbstdualen binären [16, 8, d]-Code zeige man: Es gilt d ∈ {2, 4, 6}.
b) Für d ∈ {2, 4} gebe man jeweils einen selbstdualen binären [16, 8, d]-Code zusammen mit seiner Gewichtsverteilung an.
c) Man gebe die Gewichtsverteilung eines selbstdualen binären [16, 8, 6]-Codes an. Gibt es solch einen Code?
Hinweis zu a). Man benutze die Griesmer-Schranke.

(13.32) Aufgabe: Gewichtsverteilungen binärer Hamming-Codes.
a) Es seien k ≥ 2 und n := 2^k − 1, und A_{H_k} = Σ_{i=0}^n w_i X^i Y^{n−i} ∈ C[X,Y] die Gewichtsverteilung des binären Hamming-Codes H_k. Man zeige: Die Koeffizienten erfüllen für i ≥ 2 die Rekursion i·w_i + w_{i−1} + (n − i + 2)·w_{i−2} = binom(n, i−1), mit Anfangsbedingung w_0 = 1 und w_1 = 0.
b) Man bestimme die Gewichtsverteilungen der Codes Ĥ_k und H_k^0.
c) Man bestimme die Gewichtsverteilungen der Reed-Muller-Codes R_k und R_k^•.
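Die Rekursion aus a) läßt sich direkt nach w_i auflösen: w_i = (binom(n, i−1) − w_{i−1} − (n − i + 2)·w_{i−2}) / i. Eine kleine Python-Skizze zur Probe, etwa für k = 3:

```python
from math import comb

def hamming_weights(k):
    # Gewichtsverteilung des binären Hamming-Codes H_k über die Rekursion
    # i*w_i + w_{i-1} + (n - i + 2)*w_{i-2} = binom(n, i-1), w_0 = 1, w_1 = 0
    n = 2**k - 1
    w = [1, 0]
    for i in range(2, n + 1):
        w.append((comb(n, i - 1) - w[i - 1] - (n - i + 2) * w[i - 2]) // i)
    return w

print(hamming_weights(3))  # [1, 0, 0, 7, 7, 0, 0, 1]
```

Für k = 3 ergibt sich genau die bekannte Gewichtsverteilung des [7, 4, 3]-Codes H_3 mit Summe 16 = |H_3|.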

(13.33) Aufgabe: Gewichtsverteilungen von MDS-Codes.
Es sei C ≤ F_q^n ein MDS-Code der Dimension k ∈ N mit Gewichtsverteilung w_i := |{v ∈ C; wt(v) = i}| ∈ N_0, für i ∈ {0, …, n}.
a) Man zeige: Es gilt w_0 = 1, sowie w_i = 0 für i ∈ {1, …, n − k}, und für i ∈ {0, …, k − 1} gilt w_{n−i} = Σ_{j=i}^{k−1} (−1)^{j−i} · binom(j, i) · binom(n, j) · (q^{k−j} − 1).
b) Für k ≥ 2 folgere man daraus: Es gilt n ≤ q + k − 1.

Hinweis zu a). Es sei C_I := {[x_1, …, x_n] ∈ C; x_i = 0 für alle i ∈ I} ≤ C, für I ⊆ {1, …, n}. Man bestimme |{[I, v]; |I| = j, v ∈ C_I \ {0}}| mittels Double-Counting, für j ∈ {0, …, k − 1}.
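Zur Probe kann man die Formel aus a) an einem kleinen MDS-Code auswerten. Die folgende Python-Skizze benutzt als frei gewähltes Beispiel den ternären Paritätscode {(a, b, a+b)} ≤ F_3^3, einen [3, 2, 2]-MDS-Code, und vergleicht die Formel mit direkter Abzählung:

```python
from math import comb
from itertools import product

q, n, k = 3, 3, 2
# frei gewähltes Beispiel: Paritätscode {(a, b, a+b)} <= F_3^3, ein [3,2,2]-MDS-Code
code = [(a, b, (a + b) % q) for a, b in product(range(q), repeat=2)]
counts = [0] * (n + 1)
for v in code:
    counts[sum(x != 0 for x in v)] += 1

def w_formula(i):
    # Formel aus a): w_{n-i} = Summe_{j=i}^{k-1} (-1)^{j-i} binom(j,i) binom(n,j) (q^{k-j} - 1)
    return sum((-1)**(j - i) * comb(j, i) * comb(n, j) * (q**(k - j) - 1)
               for j in range(i, k))

print(counts, [w_formula(i) for i in range(k)])  # [1, 0, 6, 2] [2, 6]
```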

14 Exercises to Part III (in German)

(14.1) Aufgabe: Verallgemeinerte zyklische Codes.
Es seien n ∈ N und 0 ≠ a ∈ F_q. Ein Code C ≤ F_q^n heißt a-zyklisch, falls mit [x_0, …, x_{n−1}] ∈ C auch stets [a·x_{n−1}, x_0, …, x_{n−2}] ∈ C ist. Man gebe eine Korrespondenz von der Menge der a-zyklischen Codes der Länge n zu den Idealen in einem geeigneten Quotienten des Polynomrings F_q[X] an.
Man zeige weiter: Mit C, C' ≤ F_q^n sind stets auch C + C' ≤ F_q^n und C ∩ C' ≤ F_q^n a-zyklisch; man gebe jeweils ein Generatorpolynom an.

(14.2) Aufgabe: Modifikation zyklischer Codes.
Es sei C ≤ F_q^n ein zyklischer Code mit Generatorpolynom g ∈ F_q[X]. Welche der Konstruktionen Punktierung, Erweiterung, Bereinigung, Augmentierung, Verkürzung und Verlängerung ergeben wieder einen zyklischen Code? In diesen Fällen gebe man jeweils ein Generatorpolynom an.

(14.3) Aufgabe: Konstruktion zyklischer Codes.
Es seien C ≤ F_q^n und C' ≤ F_q^{n'} nicht-triviale zyklische Codes mit Generatorpolynomen g ∈ F_q[X] bzw. g' ∈ F_q[X].
a) Es sei ggT(n, n') = 1. Man zeige: Das direkte Produkt von C und C' ist ebenfalls zyklisch; man gebe ein Generatorpolynom an.
b) Es seien q := 2 und n = n' ungerade, und es gelte g | g'. Man zeige: Die Plotkin-Summe von C und C' ist linear äquivalent zu einem zyklischen Code mit Generatorpolynom g·g' ∈ F_2[X].

(14.4) Aufgabe: Fehlerbündel.
Es sei C ≤ F_q^n ein zyklischer Code mit Generatorpolynom g ∈ F_q[X]. Welche Fehlerbündel beliebiger Länge kann C erkennen?

(14.5) Aufgabe: CRC-Codes.
Man schreibe GAP-Programme zur CRC-Codierung und -Decodierung. Welche Eingabedaten müssen zur Verfügung gestellt werden? Was ist die Ausgabe? Welche Konsistenztests sollten gemacht werden?
Man wende die Programme auf den CRC-Code H ≤ F_2^7 mit Generatorpolynom X^3 + X + 1 ∈ F_2[X] an: Man codiere die Vektoren i) [0, 0, 0, 1], ii) [0, 0, 1, 1], iii) [0, 1, 1, 1], iv) [1, 1, 1, 1], und bestimme, welche der folgenden Vektoren in H liegen: i) [1, 1, 0, 0, 1, 1, 0], ii) [0, 1, 0, 1, 1, 1, 0], iii) [1, 0, 0, 0, 1, 0, 1].
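Der Kern der CRC-Codierung ist Polynomdivision in F_2[X]. Das folgende Python-Fragment ist eine minimale Skizze anstelle der verlangten GAP-Programme (Bitmasken-Darstellung und Funktionsnamen sind frei gewählt): Es codiert systematisch mit g = X^3 + X + 1 und prüft die Codewort-Zugehörigkeit über den Rest modulo g.

```python
def polymod(a, g):
    # Rest der Division von a durch g in F_2[X]; Polynome als Bitmasken, Bit i <-> X^i
    dg = g.bit_length() - 1
    while a.bit_length() > dg:
        a ^= g << (a.bit_length() - 1 - dg)
    return a

def crc_encode(m, g):
    # systematische CRC-Codierung: c = m*X^deg(g) + (m*X^deg(g) mod g), also g | c
    dg = g.bit_length() - 1
    shifted = m << dg
    return shifted ^ polymod(shifted, g)

g = 0b1011                       # X^3 + X + 1
c = crc_encode(0b1001, g)        # eine Nachricht der Länge 4
assert polymod(c, g) == 0        # jedes Codewort hat Rest 0 modulo g
print(bin(c))  # 0b1001110
```

Ein sinnvoller Konsistenztest ist etwa, daß alle 16 codierten Nachrichten Rest 0 modulo g haben.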

(14.6) Aufgabe: RWTH-ID.
Man schreibe GAP-Programme zur Erzeugung und Verifikation von RWTH-IDs, unter Verwendung der Programme aus Aufgabe (14.5), des Generatorpolynoms X^5 + X^2 + 1 ∈ F_2[X] und der Quellencodierung in Tabelle 6.

(14.7) Aufgabe: Kreisteilungspolynome.
Es seien F_q der Körper mit q Elementen und n ∈ N.
a) Es sei m ∈ N. Man zeige: Es ist genau dann X^m − 1 ein Teiler von X^n − 1 ∈ F_q[X], wenn m ein Teiler von n ist.
b) Es seien p die Charakteristik von F_q und F ein algebraischer Abschluß von F_q. Wie hängen die Faktorisierungen von X^{pn} − 1 und X^n − 1 ∈ F_q[X] zusammen? Wie hängen die Nullstellenmengen V(X^{pn} − 1) und V(X^n − 1) ⊆ F zusammen?

(14.8) Aufgabe: Klassifikation zyklischer Codes.
a) Man bestimme alle zyklischen binären Codes der Länge n ∈ {1, …, 32} durch Angabe jeweils eines Generatorpolynoms. Außerdem bestimme man jeweils die zugehörige Nullstellenmenge.
b) Man untersuche diese Codes hinsichtlich Dualität und mengentheoretischer Inklusion. Für n ≤ 10 gebe man jeweils auch die Minimaldistanz an.

(14.9) Aufgabe: Reversible Codes.
Ein zyklischer Code C ≤ F_q^n mit Generatorpolynom g ∈ F_q[X] heißt reversibel, falls mit [x_0, …, x_{n−1}] ∈ C stets auch [x_{n−1}, …, x_0] ∈ C ist.
a) Man zeige die Äquivalenz der folgenden Aussagen: i) C ist reversibel. ii) Es gilt g^∗ = a·g ∈ F_q[X] für ein a ∈ F_q. iii) Mit ζ ∈ V(g) ist stets auch ζ^{−1} ∈ V(g).
b) Nun seien ggT(q, n) = 1 und −1 ∈ Z_n^∗ eine q-Potenz. Man zeige: Jeder zyklische Code C ≤ F_q^n ist reversibel.

(14.10) Aufgabe: Hamming-Codes als zyklische Codes.
Für k ≥ 2 sei H_k ≤ F_q^n der zugehörige Hamming-Code, wobei n := (q^k − 1)/(q − 1). Ist H_k auch für ggT(k, q − 1) > 1 linear äquivalent zu einem zyklischen Code?

(14.11) Aufgabe: Zyklische binäre Codes gerader Länge.
Für k ≥ 3 sei H_k ≤ F_2^n der zugehörige Hamming-Code, wobei n := 2^k − 1. Man zeige: Der bereinigte verkürzte Code (H_k^◦)^0 ≤ F_2^{n−1} ist linear äquivalent zu einem zyklischen [n − 1, n − k − 2, 4]-Code.
Hinweis. Man betrachte die Plotkin-Summe von H_{k−1}^0 mit einem Even-Weight-Code.

(14.12) Aufgabe: Simplex-Codes.
Für k ≥ 2 und n := 2^k − 1 sei S_k ≤ F_2^n der zugehörige binäre Simplex-Code, der definiert ist als der Dualcode des binären Hamming-Codes H_k ≤ F_2^n. Man zeige: Der Code S_k ist linear äquivalent zu einem zyklischen Code. Man bestimme die Nullstellenmenge V(S_k), und gebe damit neue Beweise für die Aussagen dim_{F_2}(S_k) = k und d(S_k) = 2^{k−1} an.

(14.13) Aufgabe: BCH-Codes.
Es sei C ≤ F_q^n ein BCH-Code. Ist der duale Code C^⊥ ebenfalls ein BCH-Code?

(14.14) Aufgabe: Dimension binärer BCH-Codes.
Es sei C ≤ F_2^n ein primitiver BCH-Code im engeren Sinne der Länge n := 2^k − 1, für k ≥ 1, und Entwurfsdistanz δ mit ⌊δ/2⌋ ≤ 2^{⌈k/2⌉−1}. Man zeige: Es gilt dim_{F_2}(C) = n − ⌊δ/2⌋ · k.
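Die Dimension läßt sich unabhängig davon über Kreisteilungsklassen bestimmen: Die Nullstellenmenge des primitiven BCH-Codes im engeren Sinne wird von den 2-Kreisteilungsklassen der Exponenten 1, …, δ−1 modulo n erzeugt. Eine Python-Skizze zur Probe (der Funktionsname ist frei gewählt):

```python
def bch_dimension(k, delta):
    # Dimension des primitiven binären BCH-Codes im engeren Sinne der Länge
    # n = 2^k - 1 zur Entwurfsdistanz delta: n minus Mächtigkeit der Vereinigung
    # der 2-Kreisteilungsklassen der Exponenten 1, ..., delta-1 modulo n
    n = 2**k - 1
    defining = set()
    for s in range(1, delta):
        c = s % n
        while c not in defining:
            defining.add(c)
            c = (2 * c) % n
    return n - len(defining)

# Probe der behaupteten Formel dim = n - floor(delta/2)*k für n = 31, also k = 5
print(bch_dimension(5, 5), 31 - (5 // 2) * 5)  # 21 21
```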

(14.15) Aufgabe: Binäre BCH-Codes.
Man bestimme die Dimensionen der primitiven binären BCH-Codes im engeren Sinne der Länge 31.

(14.16) Aufgabe: Ternäre BCH-Codes.
Man bestimme die maximale Dimension eines primitiven ternären BCH-Codes der Länge 26 und der Entwurfsdistanz 5.

(14.17) Aufgabe: Reed-Solomon-Codes.
a) Es sei C ≤ F_q^{q−1} ein Reed-Solomon-Code. Man zeige: Ist 1 ∉ V(C), so ist der erweiterte Code Ĉ ≤ F_q^q ein MDS-Code.
b) Es sei C ≤ F_q^n ein primitiver BCH-Code. Man zeige: Es gibt einen endlichen Körper F_q ⊆ F und einen Reed-Solomon-Code D ≤ F^n, so daß C = D ∩ F_q^n gilt.

(14.18) Aufgabe: Roos-Schranke.
Es seien n := 2^k − 1, für ein k ≥ 3, und C ≤ F_2^n ein zyklischer Code. Man zeige:
a) Ist {ζ_n, ζ_n^5} ⊆ V(C), so hat C Minimaldistanz d ≥ 4.
b) Ist {ζ_n, ζ_n^{−1}} ⊆ V(C), so hat C Minimaldistanz d ≥ 5, und der bereinigte Code C^0 ≤ F_2^n hat Minimaldistanz d^0 ≥ 6.

(14.19) Aufgabe: van-Lint-Wilson-Schranke.
a) Es seien n := 2^{2k} + 1, für ein k ≥ 1, und C ≤ F_2^n der zyklische Code zu {ζ_n}. Man zeige: Der Code C ist reversibel und hat Minimaldistanz d ≥ 5.
b) Es sei C ≤ F_2^{31} der zyklische Code zu {ζ_{31}, ζ_{31}^5, ζ_{31}^7}. Man zeige: Der Code C hat Minimaldistanz d ≥ 7. Gilt Gleichheit?

(14.20) Aufgabe: QR-Codes.
a) Man bestimme die Minimaldistanz des binären QR-Codes der Länge 47.
b) Man bestimme alle perfekten 1-fehlerkorrigierenden QR-Codes.

(14.21) Aufgabe: Erweiterte QR-Codes.
Man gebe jeweils einen selbstdualen binären [32, 16, 8]-Code und einen selbstdualen binären [48, 24, 12]-Code an.

(14.22) Aufgabe: Golay-Codes.
Für den erweiterten ternären Golay-Code G_12 ≤ F_3^{12} und den erweiterten binären Golay-Code G_24 ≤ F_2^{24} bestimme man die zugehörigen residualen Codes.

(14.23) Aufgabe: Gewichtsverteilungen der Golay-Codes.
Man bestimme die Gewichtsverteilungen der ternären Golay-Codes G_11 und G_12, sowie der binären Golay-Codes G_23 und G_24.

(14.24) Aufgabe: Steiner-Systeme.
Ein Steiner-System S(t, k, v), wobei t, k, v ∈ N mit t ≤ k ≤ v, ist eine Menge P der Kardinalität v, deren Elemente Punkte genannt werden, zusammen mit einer Menge B von k-elementigen Teilmengen von P, die Blöcke genannt werden, so daß jede t-elementige Menge von Punkten in genau einem Block enthalten ist.
a) Man zeige: Für s ∈ {0, …, t} ist jede s-elementige Menge von Punkten in genau λ_s ∈ N Blöcken enthalten, wobei λ_s · binom(k−s, t−s) = binom(v−s, t−s). Daraus folgere man: Die Anzahl b = |B| ∈ N der Blöcke ist gegeben durch b · binom(k, t) = binom(v, t); jeder Punkt gehört zu genau r ∈ N Blöcken, wobei bk = vr; und für t = 2 gilt r(k−1) = v−1.
b) Man zeige: Gibt es ein Steiner-System S(t, k, v) mit t ≥ 2, so gibt es auch ein Steiner-System S(t−1, k−1, v−1).
Steiner-Systeme mit t = 1 oder t = k sind uninteressant (warum?), und solche mit t = 2 sind leicht zu finden, wie wir sogleich sehen werden. Das ist viel schwieriger für t ≥ 3, und vermutlich gibt es gar keine für k > t ≥ 6. Unten werden wir sehen, daß es sporadische Steiner-Systeme mit t = 4 und t = 5 gibt.
c) Es sei A^2(F_q) := F_q^2 die affine Ebene über F_q; dabei heißen die Teilmengen w + ⟨v⟩_{F_q} ⊆ F_q^2, wobei v, w ∈ F_q^2 mit v ≠ 0, affine Geraden. Man zeige: A^2(F_q) bildet zusammen mit den affinen Geraden ein Steiner-System S(2, q, q^2); man bestimme die Parameter λ_s, b und r.
d) Es sei P^2(F_q) := {⟨v⟩_{F_q} ≤ F_q^3; 0 ≠ v ∈ F_q^3} die projektive Ebene über F_q; dabei heißen die Teilräume ⟨v, w⟩_{F_q} ≤ F_q^3, wobei ⟨v⟩_{F_q} ≠ ⟨w⟩_{F_q} ∈ P^2(F_q), projektive Geraden. Man zeige: P^2(F_q) bildet zusammen mit den projektiven Geraden ein Steiner-System S(2, q+1, q^2+q+1); man bestimme die Parameter λ_s, b und r. Außerdem stelle man die Fano-Ebene P^2(F_2) graphisch dar.
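Die Fano-Ebene P^2(F_2) läßt sich leicht aufzählen, und damit die Steiner-Eigenschaft S(2, 3, 7) samt der Parameter b und r aus a) nachprüfen. Die folgende Python-Skizze ist als Illustration gedacht:

```python
from itertools import combinations, product

# Punkte der Fano-Ebene P^2(F_2): die 7 von 0 verschiedenen Vektoren in F_2^3
points = [p for p in product([0, 1], repeat=3) if any(p)]
# projektive Geraden: {u, v, u+v} für je zwei verschiedene Punkte u, v
blocks = {frozenset([u, v, tuple((a + b) % 2 for a, b in zip(u, v))])
          for u, v in combinations(points, 2)}

# Steiner-Eigenschaft S(2, 3, 7): jedes Punktepaar liegt in genau einem Block
assert all(sum((u in B) and (v in B) for B in blocks) == 1
           for u, v in combinations(points, 2))

# Parameter aus a): b * binom(3, 2) = binom(7, 2), b*k = v*r, r*(k-1) = v-1
b = len(blocks)                          # Anzahl der Blöcke
r = sum(points[0] in B for B in blocks)  # Blöcke durch einen festen Punkt
print(b, r)  # 7 3
```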

(14.25) Aufgabe: Codes und Steiner-Systeme.
Der Zusammenhang zwischen binären Codes und Steiner-Systemen wird wie folgt hergestellt: Es sei C ≤ F_2^n ein [n, k, d]-Code mit d ∈ N, und für P := {1, …, n} sei B := {supp(v) ⊆ P; v ∈ C, wt(v) = d} die Menge der Träger der Codewörter minimalen positiven Gewichts.
a) Man zeige: Der Code C ≤ F_2^n ist genau dann perfekt mit d = 2e + 1, wenn B die Menge der Blöcke eines Steiner-Systems S(e + 1, 2e + 1, n) bildet. In diesem Falle führt der erweiterte Code Ĉ ≤ F_2^{n+1} zu einem Steiner-System S(e + 2, 2e + 2, n + 1). Wie hängen diese Steiner-Systeme zusammen?

b) Daraus folgere man: Der Hamming-Code H_k und der erweiterte Code Ĥ_k, für k ≥ 2, führen zu Steiner-Systemen S(2, 3, 2^k − 1) bzw. S(3, 4, 2^k). Der Golay-Code G_23 und der erweiterte Code G_24 führen zu Steiner-Systemen S(4, 7, 23) bzw. S(5, 8, 24); diese werden auch Witt-Systeme genannt.
c) Daraus bestimme man erneut die Gewichtsverteilung von G_24.
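Für c) liefert die Relation b · binom(k, t) = binom(v, t) aus Aufgabe (14.24) bereits die Blockanzahl des Witt-Systems S(5, 8, 24), also den Koeffizienten w_8 der Gewichtsverteilung von G_24. Eine kurze Probe in Python:

```python
from math import comb

# Witt-System S(5, 8, 24): b * binom(8, 5) = binom(24, 5)
b = comb(24, 5) // comb(8, 5)
print(b)  # 759: Anzahl der Oktaden, also w_8 der Gewichtsverteilung von G_24
```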

15 References

[1] E. Assmus, J. Key: Designs and their codes, Cambridge Tracts in Mathematics 103, Cambridge University Press, 1992.
[2] E. Berlekamp: Algebraic coding theory, McGraw-Hill, 1968.
[3] A. Betten, H. Fripertinger, A. Kerber, A. Wassermann, K. Zimmermann: Codierungstheorie, Springer, 1998.
[4] D. Jungnickel: Codierungstheorie, Spektrum Verlag, 1995.
[5] F. MacWilliams, N. Sloane: The theory of error-correcting codes, North-Holland Mathematical Library 16, North-Holland, 1986.
[6] R. Schulz: Codierungstheorie, Vieweg, 1991.
[7] D. Stinson: Cryptography, theory and practice, 3rd edition, CRC Press Series on Discrete Mathematics and its Applications 36, CRC Press, 2006.
[8] J. van Lint: Introduction to coding theory, 2nd edition, Graduate Texts in Mathematics 86, Springer, 1991.
[9] W. Willems: Codierungstheorie, de Gruyter, 1999.