Lattices, Linear Codes, and Invariants, Part II
Noam D. Elkies
In Part I of this article we introduced "lattices" $L \subset \mathbf{R}^n$ and described some of their relations with other branches of mathematics, focusing on "theta functions". Those ideas have a natural combinatorial analogue in the theory of "linear codes". This theory, though much more recent than the study of lattices, is more accessible, and its numerous applications include the construction and analysis of many important lattices.

A lattice $L$ is a special kind of subgroup of the additive group $\mathbf{R}^n$; in Part I we associated to $L$ a theta function, which is a generating function for the lengths of the vectors of $L$. A linear code $C$, to be defined below, is a subgroup of a finite additive group $F^n$. Analogous to the theta functions of lattices are "weight enumerators" of codes, which are generating functions for the coordinates of elements of $C$. We shall see how most of the uses and properties of theta functions that we described in Part I have counterparts in the setting of weight enumerators. In particular, just as the theta function $\theta_L(\tau)$ associated to a "self-dual lattice" $L$ is invariant under certain fractional linear transformations of the variable $\tau$, we shall encounter "self-dual codes" $C$ with weight enumerators invariant under certain linear transformations of the variables. This invariance yields much information about $C$, and about an associated self-dual lattice $L_C$ and its theta function.

As was true of Part I, there is little if any of the mathematics described herein for whose discovery I can claim credit. The one possible exception is the use of theta functions near the end to relate the occurrences of the groups $SL_2(\mathbf{Z}/p\mathbf{Z})$ and $SL_2(\mathbf{Z})$ in the functional equations for weight enumerators and lattices respectively; I know of no published statement of this observation, though it may well be common knowledge in some circles. All other results and ideas I have attributed when their source is known to me, and I apologize in advance for any mis- or missing attribution.

Sphere Packing in Hamming Space: Error-correcting Codes

Most of the problems and ideas concerning sphere packing have natural, and often more tractable, discrete analogues in the theory of error-correcting codes (ECCs). Though easier in many ways than the sphere-packing problem, the theory of ECCs is much more recent: its systematic development began only with R. W. Hamming's 1947 paper.¹ In our digital age, ECCs are ubiquitous wherever there is information to reliably store or transmit; applications range from the familiar compact disc in one's computer or stereo to close-up photographs of Jovian moons sent back to Earth with a transmitter too weak to power a lightbulb. The theoretical study of ECCs has also led to surprisingly varied and deep mathematics, including for instance orthogonal polynomials and arithmetic algebraic geometry. In the present exposition we concentrate on the connections with sphere packings, particularly as regards theta functions and their discrete analogues.

Noam D. Elkies is professor of mathematics at Harvard University. His e-mail address is elkies@math.harvard.edu.

Part I of this article (concerning lattices, lattice packings of spheres, theta functions, and modular forms) appeared in the November 2000 issue of the Notices, pages 1238–1245.

¹ Hamming's work is discussed in the memorial article about him in the September 1998 Notices.
NOTICES OF THE AMS, VOLUME 47, NUMBER 11
To see ECCs as discrete sphere packings, we must specify the metric space containing our spheres. The space will be $F^n$, i.e., the set of ordered $n$-tuples of elements of a finite set $F$, for some choice of $F$ and $n$. In the ECC context, such $n$-tuples are often called words of length $n$, with letters drawn from an alphabet $F$. To assure that $|F^n| > 1$ (so that a word contains some information), we assume $|F| > 1$ and $n > 0$. The set $F^n$ itself is called Hamming space and is given a metric structure by the Hamming distance $d(\cdot,\cdot)$ defined thus: The distance between words $a = (a_1, \ldots, a_n)$ and $b = (b_1, \ldots, b_n)$ is

$$d(a,b) := \#\{\, i \mid 1 \le i \le n,\ a_i \ne b_i \,\}.$$

That is, $d(a,b)$ is the number of single-letter changes ("errors") one must make to get from $a$ to $b$. A code is then just a nonempty subset $C \subseteq F^n$. The minimal (nonzero) distance of a code is

$$\delta(C) := \min_{a,b \in C,\ a \ne b} d(a,b).$$

In the degenerate case $|C| = 1$, we set $\delta(C) = n + 1$.

In the standard application to error-resistant data storage or transmission, the words that can be stored or transmitted are those in $C$, rather than arbitrary words in $F^n$. A code $C$ with minimal distance $\delta$ is then said to detect $\delta - 1$ errors, and to correct $\lfloor (\delta - 1)/2 \rfloor$ errors. This is because if some information is stored or sent as a word of $C$, and this word is changed in at least one but at most $\delta - 1$ coordinates, the resulting word cannot be in $C$, and thus cannot be the word originally intended; moreover, if at most $\lfloor (\delta - 1)/2 \rfloor$ letters were changed, then the original word is still uniquely determined, else by the triangle inequality $C$ would contain two words at distance at most $\delta - 1$. Equivalently, an $e$-error-correcting code is a code $C$ such that the (closed) Hamming spheres $B_e(a) := \{x \in F^n \mid d(a,x) \le e\}$ of common radius $e$ about the codewords $a \in C$ are pairwise disjoint. Such a code detects $2e$ errors and has minimal distance at least $2e + 1$.

The basic problem of error-correcting codes is: what is the maximum size of a code if its length, the size of the alphabet, and a lower bound on the minimal distance are given? That is, how efficiently can we encode information while detecting or correcting a given number of errors? This is also a natural mathematical problem, asking how many points can be packed into $F^n$ without any two coming closer than $\delta$ to each other. Naturally, the smaller $\delta$ is, the larger $C$ can get, and conversely. At the extreme ends are $C = F^n$ itself (maximal efficiency but no error correction), and $|C| = 1$ (maximal $\delta$ but no information). Other simple examples are the repetition code, consisting of the $|F|$ words all of whose letters are the same (with $\delta = n$), and the parity-check code, for which $F$ is $\{0,1\}$ and $C$ consists of the $2^{n-1}$ words $a$ with an even number of 1's (equivalently, such that $\sum_{i=1}^n a_i$ is even; here $\delta = 2$). A more interesting example is the extended Hamming code $H_8$, in which $F = \{0,1\}$ again, $n = 8$, and $H_8$ consists of the words

00000000, 00001111, 00110011, 00111100,
01010101, 01011010, 01100110, 01101001,

and their ones' complements

11111111, 11110000, 11001100, 11000011,
10101010, 10100101, 10011001, 10010110.

Thus $|H_8| = 16$, and one may check that $\delta = 4$. All these codes are known to have maximal size for their respective $F$, $n$, and $\delta$. We shall again encounter each of these important codes, and especially $H_8$, several times.

Linear Codes

The discrete analogue of a lattice is a linear code. A linear code is a vector subspace of Hamming space $F^n$. For $F^n$ to have the structure of a vector space, $F$ must be a (finite) field, such as $\mathbf{Z}/p\mathbf{Z}$ for some prime $p$.² For instance, if in the previous paragraph $F$ is always chosen to be a field (so that in particular $\{0,1\}$ is identified with $\mathbf{Z}/2\mathbf{Z}$), and $\{0\}$ is chosen for the single-word code, then each of the codes in that paragraph is linear. Henceforth $F$ will denote a finite field.

The powerful framework of linear algebra makes it possible to work with codes much larger than would be accessible otherwise, since a linear code of size $|F|^k$ can be specified by only $k$ basis words. The minimal distance also simplifies somewhat: for any $a, b \in F^n$, we have $d(a,b) = d(a-b, 0)$; if $a, b$ are in a linear code, then so is $a - b$, whence $\delta$ is the minimum of $d(c,0)$ over all nonzero $c \in C$. For any $c \in F^n$, the distance $d(c,0)$ is the number of nonzero coordinates of $c$, a number known as the Hamming weight of $c$ and denoted by $\mathrm{wt}(c)$. Thus $\delta$ for a linear code is the minimum (nonzero Hamming) weight. The basic problem for linear ECCs thus becomes: what is the maximum dimension of a linear code if its length, the size of the alphabet, and a lower bound on the minimum weight are given?

As is the case for sphere packings, for most values of $(|F|, n, \delta)$ one cannot at present hope to solve the problem exactly or even asymptotically, only to give upper and lower bounds. Many of the bounds for codes are analogous to sphere-packing

² It is a fundamental theorem of E. H. Moore that a set of $q > 1$ elements can be given the structure of a field if and only if $q$ is a power of a prime; in the vast majority of "real-world" applications of ECCs, $q$ is a power of 2, often either 2 itself or $256 = 2^8$, but other choices of $q$ are also important in the mathematical theory. A finite field of $q$ elements is often called "GF($q$)"; the "GF" stands for "Galois field", in tribute to Galois who first studied finite fields in this generality.
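The claims made on this page about $H_8$ (that $|H_8| = 16$, that $\delta = 4$, and that the code is linear over $\mathbf{Z}/2\mathbf{Z}$, so that $\delta$ equals the minimum nonzero weight) can be checked mechanically. The following Python sketch is illustrative only: the helper names (`hamming_distance`, `minimal_distance`, `is_linear_binary`, `min_weight`) are assumptions introduced here and do not come from the article, and words are represented as strings over {0, 1}.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Number of coordinates in which the words a and b differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def minimal_distance(code):
    """Least distance between distinct codewords; n + 1 when |C| = 1."""
    words = list(code)
    if len(words) == 1:
        return len(words[0]) + 1
    return min(hamming_distance(a, b) for a, b in combinations(words, 2))


def complement(word):
    """Ones' complement of a binary word."""
    return "".join("1" if ch == "0" else "0" for ch in word)


def add_mod2(a, b):
    """Coordinatewise sum of two binary words in Z/2Z."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


def is_linear_binary(code):
    """A nonempty binary code is linear iff it is closed under addition."""
    words = set(code)
    return all(add_mod2(a, b) in words for a in words for b in words)


def min_weight(code):
    """Minimum Hamming weight over the nonzero codewords."""
    return min(w.count("1") for w in code if "1" in w)


# The eight words listed in the text, followed by their ones' complements.
H8_HALF = [
    "00000000", "00001111", "00110011", "00111100",
    "01010101", "01011010", "01100110", "01101001",
]
H8 = H8_HALF + [complement(w) for w in H8_HALF]

delta = minimal_distance(H8)
print(len(H8), delta, (delta - 1) // 2)      # prints: 16 4 1
print(is_linear_binary(H8), min_weight(H8))  # prints: True 4
```

The pairwise search in `minimal_distance` takes on the order of $|C|^2$ comparisons, while `min_weight` inspects each codeword once; for a linear code the two agree, which is the simplification described in the text.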
DECEMBER 2000, NOTICES OF THE AMS
bounds. For instance, the Minkowski bound becomes the Gilbert-Varshamov bound, and can be proved in the same simple way: increase $C$ by one word at a time subject to the condition that no two words come closer than $\delta$. Once no room is left, the open balls of radius $\delta$ centered at $C$ must cover $F^n$. Each of these balls contains the same number of points, say $V_\delta$. Thus $|C| \ge |F|^n / V_\delta$. Warning: the formula

$$V_\delta = \sum_{r=0}^{\delta-1} \binom{n}{r} (|F| - 1)^r$$

for the size of an open ball of radius $\delta$ in Hamming space is more complicated than the formula for the volume of a ball in Euclidean space. Thus the resulting lower bound $V_e/V_{2e}$ for the packing density of radius-$e$ Hamming spheres does not simplify to $2^{-n}$. One can, however, use Stirling's formula to obtain asymptotic formulas for $V_e/V_{2e}$ for large $n$.

In the setting of sphere packings, we reported in Part I that the Minkowski bound also holds for lattice packings, and indeed for average lattice packings. Likewise here the Gilbert-Varshamov bound holds for linear codes, and indeed for a positive proportion of linear codes. (Here there are only finitely many $C$ for each choice of $(F, n, \dim C)$, so there is no difficulty in defining this average.) It is even rather easy to prove that a randomly chosen code satisfies the Gilbert-Varshamov bound with positive probability. But, again as with lattice packings, it is easy to choose a "random" code that almost certainly comes within a small factor of satisfying the Gilbert-Varshamov bound, but hard to prove such an estimate because no computationally feasible way is known to find the minimal weight of a general linear code.

For linear codes over a fixed field $F$, unlike sphere packings, one can do much better than random for arbitrarily large $n$, as long as $|F| = q_1^2$ for an integer $q_1 \ge 7$ and $\delta/n \in (a(q_1), b(q_1))$ for certain $a(q_1), b(q_1) \in (0,1)$. These are the famed "Goppa codes", obtained by applying V. D. Goppa's construction to "modular curves" over $F$; again considerable machinery from number theory and algebraic geometry is needed to prove that these codes improve on the Gilbert-Varshamov bound. These matters are discussed in [TV], esp. pages 350ff. But when $(|F|, \delta/n)$ is outside the range where Goppa codes are known to improve on the Gilbert-Varshamov bound, we face the same embarrassment that afflicted us in the sphere-packing context. For instance, if $F = \mathbf{Z}/2\mathbf{Z}$ and $\delta = n/4$ for some large $n$, it is known that there exist linear codes whose dimension is at least $(\frac{3}{4}\log_2 3 - 1 - o(1))\,n = (.1887\ldots - o(1))\,n$, but we cannot exhibit one and prove that it works, even though a randomly chosen code of that dimension has minimum weight $\ge n/4$ with probability almost 1.

Weight Enumerators

Once more in analogy with the sphere-packing situation, we can gain some understanding of the difficult problem of the minimal weight of a linear code by introducing generating functions for the much harder problem of the distribution of the weights or coordinates of all the codewords. These generating functions are called weight enumerators, and are homogeneous polynomials of degree $n$ in several variables. The complete weight enumerator is a polynomial $\mathrm{cwe}_C$ with integer coefficients in $|F|$ variables $X_a$, with $a \in F$ and $X_a \in \mathbf{C}$. It is defined as follows:

$$\mathrm{cwe}_C(X) := \sum_{c \in C} \prod_{i=1}^{n} X_{c_i}.$$

That is, $\mathrm{cwe}_C$ is the polynomial whose $\prod_{a \in F} X_a^{n_a}$ coefficient is the number of codewords with $n_0$ zero coordinates, $n_1$ coordinates of 1, etc. Other weight enumerators can be obtained as specializations of $\mathrm{cwe}_C$. For instance, for the weight distribution of $C$ we are concerned only with the number of zero and nonzero coordinates in each codeword. We thus introduce the Hamming weight enumerator $\mathrm{hwe}_C(X,Y)$. This is the homogeneous polynomial of degree $n$ in two variables obtained from $\mathrm{cwe}_C$ by setting $X_a = Y$ for each nonzero $a \in F$ and by setting $X_0 = X$. Equivalently,

$$\mathrm{hwe}_C(X,Y) := \sum_{c \in C} X^{n - \mathrm{wt}(c)}\, Y^{\mathrm{wt}(c)} = \sum_{m=0}^{n} N_m(C)\, X^{n-m} Y^m,$$

where $N_m(C)$ is the number of codewords of weight $m$.

We illustrate these definitions with the weight enumerators of the codes introduced earlier:
• if $C = \{0\}$ then $\mathrm{cwe}_C(X) = X_0^n$ and $\mathrm{hwe}_C(X,Y) = X^n$;
• if $C = F^n$ then $\mathrm{cwe}_C(X) = (\sum_{a \in F} X_a)^n$ and $\mathrm{hwe}_C(X,Y) = (X + (|F| - 1)Y)^n$;
• if $C$ is the repetition code in $F^n$ then $\mathrm{cwe}_C(X) = \sum_{a \in F} X_a^n$ and $\mathrm{hwe}_C(X,Y) = X^n + (|F| - 1)Y^n$.

If $F = \mathbf{Z}/2\mathbf{Z}$ then the complete and Hamming weight enumerators are the same polynomial with the variables differently named; we call this polynomial simply $W_C$. We then have:
• the weight enumerator of the parity-check code of length $n$ is $((X+Y)^n + (X-Y)^n)/2$;
• the extended Hamming code $H_8$ has weight enumerator $W_{H_8}(X,Y) = X^8 + 14X^4Y^4 + Y^8$.

In Part I of this article, we associated to any lattice $L \subset \mathbf{R}^n$ a similar generating function $\Theta_L$ encoding the squared lengths of lattice vectors. This generating function, called the theta function of $L$, is defined by
$$\Theta_L(z) := \sum_{x \in L} z^{(x,x)} = 1 + \sum_{m>0} N_m(L)\, z^m.$$

We noted various identities satisfied by theta functions. Weight enumerators of codes satisfy analogous identities. Like theta functions, they are multiplicative: the direct sum of any two linear codes $C_1 \subseteq F^{n_1}$ and $C_2 \subseteq F^{n_2}$ is a linear code $C_1 \oplus C_2 \subseteq F^{n_1+n_2}$, with complete weight enumerator given by

$$\mathrm{cwe}_{C_1 \oplus C_2}(X) = \mathrm{cwe}_{C_1}(X)\, \mathrm{cwe}_{C_2}(X).$$

The same relation thus holds for the Hamming weight enumerators of $C_1$, $C_2$, and $C_1 \oplus C_2$. The

(This identity can be proved even when $F$ is a finite field other than $\mathbf{Z}/p\mathbf{Z}$; the $\mathrm{cwe}_C$ identity also has a generalization to arbitrary finite fields.) Note that the weight enumerators we obtained for $\{0\}$ and $F^n$, and for the repetition and parity-check codes over $\mathbf{Z}/2\mathbf{Z}$, do in fact satisfy these identities; so does $W_{H_8}(X,Y) = X^8 + 14X^4Y^4 + Y^8$, which is multiplied by $2^4$ under the linear transformation $(X,Y) \mapsto (X+Y, X-Y)$.

Applications of the MacWilliams identity include upper bounds on $|C|$ for given $n$ and $\delta$. The coefficients $N_m(C)$ of $\mathrm{hwe}_C$ are constrained on the one hand by $N_m(C) \ge 0$, $N_0(C) = 1$, and $N_m(C) = 0$ for positive $m <$