Downloaded from orbit.dtu.dk on: Oct 07, 2021

Polynomial weights and code constructions

Massey, J; Costello, D; Justesen, Jørn

Published in: I E E E Transactions on Information Theory

Publication date: 1973

Document Version Publisher's PDF, also known as Version of record

Link back to DTU Orbit

Citation (APA): Massey, J., Costello, D., & Justesen, J. (1973). weights and code constructions. I E E E Transactions on Information Theory, 19(1), 101-110.

General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

 Users may download and print one copy of any publication from the public portal for the purpose of private study or research.  You may not further distribute the material or use it for any profit-making activity or commercial gain  You may freely distribute the URL identifying the publication in the public portal

If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-19, NO. 1, JANUARY 1973 101

With no = 4, p = 10, n can be increasedto 16 * 40 = 640 also indebted to the referees for their valuable comments bits, the number of redundant symbols r = 160, k = and suggestions. 640 - 160 = 480 and REFERENCES 111W. W. Peterson, Error-Correcting Codes. New York: M.I.T. Press and Wiley, 1961. [21W. W. Peterson and E. J. Weldon, Jr., Error-Correcting Codes, r 7. lo-lo. 2nd ed. Cambridge, Mass.: M.I.T. Press, 1972. r31 E. R. Berlekamp, Algebraic . New York: McGraw-Hill? 1968. To correct all phased burst errors with A = 3 using the 141 V. D. Kolesmk and E. T. Mironchikov, Decoding of Cyclic Codes. (in Russian). Moscow: Svyaz, 1968. Reed-Solomon code over GF(230) we would need r = 61 R. G. Gallager, Low-Density Parity-Check Codes. Cambridge, 7 * 30 = 210 redundant bits, but the block length in this Mass.: M.I.T. Press, 1963. @I J. L. Massey, Threshold Decoding. Cambridge, Mass.: M.I.T. case can be much longer. Computer implementation of Press, 1963. operations over GF(23) is substantially simpler than over [71 G. D. Forney, Jr., Concatenated Codes. Cambridge, Mass.: M.I.T. Press, 1966. GF(230). [81S. I. Samoylenko, Error-Correcting Codes (in Russian). Moscow: Nauka, 1966. ACKNOWLEDGMENT 191J. K. Wolf, “Adding two information symbols to certain nonbinary BCH codes and some applications,” Bell Syst. Tech. J., vol. 48, The author expresseshis appreciation to Prof. J. K. Wolf Sept. 1969. rw S. Lin, An Zntroduction To Error-Correcting Codes. Englewood for very helpful discussions and criticism. The author is Cliffs, N.J.: Prentice-Hall, 1970.

Polynomial Weights and Code Constructions

JAMES L. MASSEY, DANIEL J. COSTELLO, JR., AND J0RN JUSTESEN

Absrruci-For any nonzero element c of a general tinite field GF(q), I. INTRODUCTION it is shown that the (x - c)*, i = 0,1,2,. . ., have the “weight-retaining” property that any linear combination of these N THIS paper, it is shown that the polynomials polynomials with coefficients in GF(q) has Hamming weight at least I (x - c)‘, i = 0,1,2,.-a, where c is any nonzero element as great as that of the minimum degree polynomial included. This of GF(q), have the fundamental property, which we term fundamental property is then used as the key to a variety of code con- “weight-retaining,” that any linear combination of these structions including 1) a simplified derivation of the binary Reed-Muller GF(q) codes and, for any prime p greater than 2, a new extensive class of p-ary polynomials with coefficientsin has Hamming weight “Reed-Muller codes,” 2) a new class of “repeated-root” cyclic codes at least as great as that of the minimum degreepolynomial that are subcodes of the binary Reed-Muller codes and can be very included. This is proved separately in Section II-A for the simply instrumented, 3) a new class of constacyclic codes that are case where GF(q) has characteristic p = 2 since the binary subcodesof the p-ary “Reed-Muller codes,” 4) two new classes of binary case has a simplicity lacking in general and since it has the convolutional codes with large “free distance” derived from known binary cyclic codes, 5) two new classes of long constraint length binary most interesting applications. The applications to 1) Reed- convolutional codes derived from 2’-ary Reed-Solomon codes, and Muller codes, 2) a new class of “repeated-root” binary 6) a new class of q-ary “repeated-root” constacyclic codes with an cyclic codes, 3) two new classes of binary convolutional algebraic decoding algorithm. codes derived from binary cyclic codes, and 4) two new classesof binary convolutional codes derived from Reed- Manuscript received January 27, 1972. This work was supported Solomon codes are given in Sections II-B, II-C, II-D, and in part by NASA Grant NGL 15-004-026 and by National Science II-E, respectively. Foundation Grant GK-5265. J. L. Massey was on leave at the Laboratory for Communication In Section III-A, we give a new class of “constacyclic” Theory, Technical University of Denmark, Lyngby, Denmark. He is codes over fields GF(q) with characteristic p greater than 2 now with the University of Notre Dame, Notre Dame; Ind. D. J. Costello, Jr., is with the Department of Electrical Engineering, that are “maximum distance separable” and haye a simple Illinois Institute of Technology, Chicago, 111. algebraic decoding algorithm. This class of codes is then J. Justesen is with the Laboratory for Communication Theory, Technical University of Denmark, Lyngby, Denmark. employed in Section III-B as the basis for an inductive proof 102 IEEE TRANSACTIONS ON INFORMATION THEORY, JANUARY 1973 of the weight-retaining property for a general so that with the aid of (4) we have W[P(x)] = 2 W[P,(x)] 2 GF(q). Finally, in Section III-C, we give a new p-ary W[(x + c)~,~,], as was to be shown. Conversely, suppose generalization of the Reed-Muller codes and a new class that PO(x) # 0. We then have from (3) of constacyclic subcodes of these p-ary codes having the same minimum distance as the parent codes. w[%41 2 wEPdx>l (5) since any nonzero terms in P,(x) cancelled by the addition II. THE BINARY CASE of c’“P,(x) must reappear as nonzero terms in x2”P,(x). A. The Weight-Retaining Property in Fields of Since P,,(x) has degreeless than 2”, the induction hypothesis Characteristic Two gives W[P,(x)] 2 W[(x + ~)~m’n],which, together with (5), Let c be a nonzero element of the finite field GF(q) where yields (I), and the theorem is proved. q = 2’ for some integer r. Since the polynomials (X + c)‘, We remark that, for the special case r = 1, Theorem 1.1 i = 0,1,2;. ., include exactly one polynomial of each degree, is equivalent to showing that the binary 2” x 2” matrix they are a basis for the vector space of all polynomials whose (i + 1)th row is the sequence of coefficients in over GF(2’) and hence every polynomial P(x) over GF(2’)’ (x + l)‘, i.e., the (i + 1)th row of Pascal’s triangle reduced can be expresseduniquely as a linear combination of these modulo 2, has the property that any sum of its rows has polynomials. Hereafter, let W[P(x)] denote the Hamming Hamming weight at least as great as the uppermost row weight of P(x), i.e., the number of its nonzero coefficients. included in the sum. For n = 3, this matrix is The following theorem relates W[P(x)] to the expansion of P(x) in the above basis. riooooooo- Theorem 1.1: Let I be any finite nonempty set of non- 11000000 negative integers with least integer imin and let 10100000 11110000 P(X) = C bi(X + C)i 10001000 iel 11001100 where c and each bi is a nonzero element of GF(2’). Then 10101010 W[P(x)] 2 W[(x + c)i,in]. (1) 1 1 1 1 1 1 1 1.

Proof: We proceed by induction on the greatest These matrices are of some importance in switching theory integer im,, in I. A simple check shows that (1) holds for where Preparata [l] has pointed out other interesting imax< 22. We suppose then that (1) holds for i,,,,, < 2” and properties of these matrices, including the fact that they are show that (1) holds for i,,, < 2”+l. self-inverse. Partition Z into the sets I,, and Ii, where I0 contains those B. Binary Reed-Muller Codes and only those i in I such that i < 2”. Then We shall make frequent use of the following fact. C bi(X + C)i = (X + C)2” iz, bi(X + C)ie2”p Lemma 1: Let c be a nonzero element of a finite field ieli GF(q) with characteristic p, and let i be a nonnegative which, upon writing PI(x) for the summation on the right- integer with radix-p form [im- r, * * *,il,iO]. Then hand side which is a polynomial of degree less than 2”, m-l becomes W[(X + C>i] = n (ij + 1) (64 j=O C bi(X + C)' = X'"P,(X) + C'"P,(X). (2) isI or, for the particular case where p = 2, Similarly, write P,(x) as the polynomial of degreeless than 2” W[(x + l)i] = 2”(‘) (6b) given by P,(X) = C bi(X + C)'. where w(i) is the number of l’s in the radix-2 form of i. islo To prove this lemma, we first note that W[(x + c)~] is From (2) and the definitions of I, and I, we then have just the number of integers k, 0 I k I i, such that the binomial coefficient (i) is nonzero modulo p. But, by a P(x) = [PO(X) + ?P,(x)] + xZ”P,(x). (3) theorem of Lucas [2, p. 1131, Suppose first that P,(x) = 0. Then, from (3), we have W[P(x)] = 2W[P,(x)]. Since PI(x) has degree less than (7) 2”, we have from the induction hypothesis where the ij and the kj are the digits in the radix-p forms of w [P,(x)] 2 W[(x + c)imin-2”]. (4) i and k, respectively and where, by convention, a binomial But also coefficient whose ,lower member exceeds its upper is zero. W[(x + c)imq = W[(x + c)Z”(x + C)i-in-2n] It then follows from (7) that there are exactly ij + I choices for kj, namely 0,1,2,. . . ,ij, such that the corre- + c”“)(x + C)im~n-2n] = W[(x2” sponding binomial coefficient in (7) is nonzero module p. = 2W[(x + C)i-in-2n] Thus (6a) follows and the lemma is proved. MASSEY et al. : POLYNOMIAL WEIGHTS AND CODE CONSTRUCTIONS 103

We are now in position to use Theorem 1.1 for a simple W[g(x)] = 2” = d. [Hereafter we call a in derivation of the binary Reed-Muller codes.Let m and u be which the irreducible factors and hence the roots of g(x) any two positive integers such that u < y11.Consider the have multiplicity greater than one a repeated-root cyclic binary matrix G with n = 2” columns whose rows contain code.] We have then proved the following. the sequenceof coefficients in (x + l)i for all i such that Theorem 2: For 2 I u I m, the (n = 2m, k = 2m-U’1 - 1) i < IZand w(i) 2 u. The number of such i, i.e., the number binary repeated-root cyclic code generated by g(x) = of rows of G, is just (x + 1)nfk is a subcode of the uth order binary Reed- Muller code’ having the same minimum distance d = 2” as the parent code. Although the repeated-root cyclic codes of Theorem 2 For example, with m = 3 and u = 2, we have are generally inferior to comparable Bose-Chaudhuri- Hocquenghem (BCH) codes, they have two interesting 1 1 1 1 1 1 1 1 properties that might recommend their use in certain 10101010 practical applications. G= 11001100’ First, very simple syndrome-forming and decoding 11110000 I I circuitry is possible for the repeated-root codes. The We then take G to be the generator matrix of an (n,k) circuitry utilizes logical elements corresponding to the binary parity-check code. It follows from Theorem 1.l factor (x + 1) in combinations that lend themselves to that every nonzero codeword, i.e., every sum of one or implementation with integrated circuits. Consider the more rows of G, has Hamming weight at least 2”, since our circuit of Fig. l(a). It may be readily checked that if a choice of G ensures that such a sum corresponds to a polynomial P(x) = P,,-r-x”-’ + * . * + P,x f PO is read sum of polynomials (x + 1)’ for which w(i,,,J 2 u and into this circuit with higher degreecoefficients leading, then hence, by Lemma 1, W[(x -t l)imin] 2 2”. Moreover, the contents some rows of G have Hamming weight exactly 2” so the s(x) = so + six + .-* + Sn-k-lXn-k-l minimum distance of the code is d = 2”. This code is precisely the uth order Reed-Muller code of length n = 2” when Pi is the only nonzero coefficient will be Pj(x + 1)’ . and, in fact, the rows that we have chosen for G are the mod (x”-“); hence, by linearity, the response for general same as those chosen by Reed [3] (except for a trivial P(x) will be reversal of each row). s(x) = P(x -t- 1) mod (xnTk) The evaluation of k and d for the binary Reed-Muller codes as given here is a substantial simplification of past [where here and hereafter we write P(x) mod Q(x) for the arguments. remainder when the polynomial P(x) is divided by the polynomial Q(x).] In particular, when P(x) = f(x) -t- e(x) C. A New Class of Binary Repeated-Root Cyclic Codes where f(x) = a(x)g(x) = a(x)(x + l)nmk is a codeword For 2 I u I m, consider the binary polynomial in the repeated-root code and e(x) = e, + e,x + * . . + e,-l~n-l is the channel error-pattern, we have g(x) = (x + I)’ 03) where s(x) = e(x + 1) mod (x”-~) i = 2” - 2m-u+1 + 1. (9) n-l The radix-2 form of i is then = Jo ej(x + 1)” mod (x”-~), (11) ~im-l,* * *,i,,i,J = 11 1 . . . 1 0 0 * . . 0 I] which shows that s(x) dependsonly on the error pattern and w is thus a true syndrome. Supposethen that one has realized where the run of O’s has length nz - u. We note that a logical function F that forms the decoding estimate 2”-l < i < 2”, which implies that g(x) divides xn + 1 &,,-1 of the leading error digit e,- t from the syndrome s(x). for IZ = 2” (since then x” + 1 = (x + I>“) but for no After further i shifts of the logic in Fig. l(a), the contents of smaller n. Hence, g(x) generates a binary (n,k) cyclic the syndrome register become code with n = 2” and n - k = i, which, from (9), gives n-1 k = 2”-U+l - 1. The rows of the generator matrix G “s(x)” = C ejMi(x + l)j mod (x”-~) for this cyclic code may be chosen as g(x)(x + I)’ for j=O forj = O,l;.+ ,k - I, or equivalently as (x + I)j for i < [where we understande-,, = en-,,] so that the samefunction j < n. But it follows from (10) that w(j) 2 w(i) for F will then be forming the corresponding estimate a,,- i- 1 i I j < n = 2” sincethe radix-2 form ofj must have u - 1 of e,,- i- 1. Thus, reminiscent of the technique first proposed leading l’s and at least one 1 in its last m - u + 1 positions. by Meggitt [4] f or cyclic codes, a complete decoder for Thus the rows of 6 are a subsetof the rows of G as given in the repeated-root code can be implemented as shown in Section 1I-B. Hence the cyclic code generatedby g(x) is a Fig. l(b). The connection shown dotted in this decoder is subcode of the uth order binary Reed-Muller code having included if it is desired to “remove” the effect of correctly the same minimum distance as the parent code since decodederror digits from the syndromeso that the syndrome 104 IEEE TRANSACTIONS ON INFORMATION THEORY, JANUARY 1973

%I Sl %-k-l any nonzero element c of GF(2’), and any nonnegative L . . . integers n and N, --P: -r!?i wrp(x>w + 4”l (4 2 W[(x + c)“] * W[P(x) mod (x” + c)]. (13) Proof: Letting P(x) = Q,(Y) + xQ2(x”) + * * * + x”- ‘Q,(Y), we have

W[P(X)(X” + c)“] = i W[Qi(X”)(X” + c)“]. (14) i=l Now, identifying x” on the right-hand side of (14) with x in Theorem 1.2, we obtain

mod-2 unit adder --D- delay W[P(X>(X +” cl”1 2 i$l w [QLc)I* W[“l * i$l W[Qi(c)I cyclic codes of Theorem 2. (b) Complete decoder. = W[(x+C)N]*Wt xi- 'Qi(c)] [ i=l contents must be all O’s after successful decoding of the complete block. and (13) now follows upon noting that the last summation is Second, since the repeated-root codes are subcodes of the just the polynomial P(x) mod (x” + c). Reed-Muller codes having the same minimum distance as To describe convolutional codes of rate R = l/o, we the parent codes, Reed’s majority logic decoding algorithm resurrect a notation used by Massey [6]. The sequence [3] can be used for a simple realization of the decoding io,il,i2,. . . of information bits is described by its D-transform function Fin Fig. l(b). For instance, for the (8,3) repeated- root code with d = 22 = 4, it may be readily checked that Z(D) = i. + i,D + i2D2 + . . . sa, s4 and so + s3 + s4 form a set of three parity checks and the convolutional code is defined by the polynomial orthogonal on e,- 1 = e7 [5] so that F may be realized to conform with the decoding rule: & = 0 if none or one of G(D) = G,(D”) + DG,(D”) + . . . + D”-lG,(D”). (15) these checks have value I, & = 1 if all three have value 1, and detection of 2 or more errors announced if two of (The component polynomials Gj(D) are now commonly these checks have value 1. called the “code-generating polynomials” of the con- volutional code [5].) If M is the maximum of the degrees D. Construction of Binary Convolutional Codes from of the code-generating polynomials, then M is called the Binary Cyclic Codes memory of the code and nA = (M + 1)~ is called the As we now proceed to show, Theorem 1.1 provides the constraint length. The encoded sequence is the sequence key for using known cyclic codes to construct convolutional to,t,,t,, . . * whose D-transform is given by codes with large “free distance.” To facilitate this discussion, T(D) = I(D”)G(D), (16) we first recast Theorem 1.1 into the following two equivalent forms. which of course is a polynomial whenever I(D) is a poly- Theorem 1.2: For any polynomial Q(x) over GF(2’), nomial. any nonzero element c of GF(2’), and any nonnegative The free distance d,,,, of the convolutional code is the integer N, minimum of W[T(D)] taken over all I(D) # 0. The code is said to be catastrophic, or to exhibit catastrophic error WCQMx+ 4”l 2 W[ j 2 0. Applying Theorem 1.3 W[IT(D)1 to the second term on the right in (23), we obtain 4(i-j)-Z(Dn + 1)41+2] 2 W[(D + 1)2if’] . W[P(D)h(D)2(j-i)-’ mod (D” + l)]. W[WldD) The first factor on the right is at least two; the argument 2 W[(D + 1)4j+2]. W[P(D)g(D)4(i-j’-2 mod (D" + l)]. 106 IEEETRANSACTIONSONINFORMATIONTHEORY,JANUARY 1973

TABLE I cyclic code such that d, N d,,. In Table II we list several SOMEBINARYCONVOLUTIONALCODESOBTAINEDFROMTHE CONSTRUCTION GIVENINTHEOREM 3. binary (i.e., r = 1) convolutional codes obtained from Theorem 4 for R = + and indicate the specific cyclic code Cyclic Code Employed d free ~~~~~-~~~~ _ _~~ used in the construction. Comparison of Tables I and II show (7, 4) QR code, d, = 3, d,, = 4 23 that the R = + codes obtained from Theorem 4 are often 23 (7, 3) QR code, d, = 4, d,, = 3 24 superior to the R = 4 codes obtained from Theorem 3 k-4 (or at least the lower bound on d,,,, is larger.) (15, 8) BCH code, d, = 4, d,, = 5 24 24 To obtain an indication of the quality of the convolutional (17, 8) QR code, d, = 6, dh = 5 26 codes obtained by the above constructions, the nA = 40 26 (15, 5) BCH code, d, = 7, d,, = 4 >7 and R = f code of Table I was compared with the nA = 40 >7 and R = + Bahl-Jelinek “complementary” code [ 1l] and (23, ;2) QR code, d, = 7, dh = 8 27 L7 with the nA = 40 and R = + Massey-Costello “quick- (23, 11) QR code, d, = 8, dh = 7 28 look-in” code [lo] in Fano-alogirthm [12] sequential 28 (31, 11) BCH code, d, = 11, d,, = 6 211 decoding for simulated binary symmetric channels and an 211 additive white Gaussian noise channel. In an extensive (47, 24) QR code, d, = 11, dh = 12 211 211 simulation when the computational cutoff rate Rcompof (47, 23) QR code, d, = 12, d,, = 11 212 the channel was near the code rate +, it was found that the 112 (63, 30) BCH code, d, = 13, d,, 2 8 213 Table I code was slightly inferior in undetected error 213 probability to the Bahl-Jelinek code but was significantly (63, 24) BCH code, d, = 15, d,, 2 8 215 215 superior to the Massey-Costello code. In erasure prob- (79, 40) QR code, d, = 15, dh = 16 215 ability, the Table I code was inferior to the Massey- 215 (79, 39) QR code, dg = 16, d,, = 15 216 Costello code but superior to the Bahl-Jelinek code. It seems 216 reasonable then that the codes obtained from Theorems 3 (89, 44) QR code, d, = 18, dh = 17 218 218 and 4 will be competitive with the best known codes of (103, 52) QR code, $ = 19, dh = 20 219 other constructions. 219 (103, 51) QR code, d, = 20, dh = 19 220 It should be evident that the generality of Theorem 1.3 220 admits the construction of many new classesof convolutional codes by mixing g(x) and h(x) from several codes, etc. We The first factor on the right is at least 2 and the second at leave such extensions to the reader. It should also be re- least d,, since the argument is a nonzero codeword in the marked that the binary (r = 1) R = 3 special case of cyclic code generated by g(x). Hence the second term on the Theorem 1.3 was independently found by Rudolph and right in (23) is at least 2d9. A similar argument shows that Miczo [13] with a very different argument. the first term is at least d, so that E. Construction of Binary Convolutional Codes from Reed-Solomon Codes W[T(D)] r 3d,. (24) By choosing g(x) in Theorem 3 (or Theorem 4) to be the By an entirely similar argument when j > i 2 0, we have generator polynomial of a Reed-Solomon (RS) 2’-ary code [2, p. 3101, we can construct some surprisingly good W[T(D)] 2 3d,. (25) binary convolutional codes for very long constraint lengths. Finally, suppose that i = j 2 0. Applying Theorem .3 To obtain binary codes, each digit of the RS code is repre- to the first term on the right in (23), we obtain sented as a binary r-tuple so that the R = 3 2’-ary con- volutional code with one information digit and two encoded @‘l?‘(DMD>2(D+ ”1)4 ’] digits per subblock becomes an R = + binary code with r 2 W[(D + 1)4i] . W[P(Djg(Dj2 mod (D” + l)] information bits and 2r encoded bits per subblock. The constraint length nAb of the binary code is r times the 2 d,. constraint length nA of the 2’-ary code. For an (n = 2’ - 1,k) RS code, d, = n - k + 1 and A similar argument shows the second term on the right in d,, = k + 1. Thus the best bound on d,,,, in Theorem 3 (23) to be bounded below by d,, so that we have results from the choice k = Ln/3J where L. J denotes the integer part of the enclosed expression. For this choice one W[T(D)] 2 d, + dh. (26) obtains The theorem now follows from (24), (29, and (26). 4,,, 2 I@ + 4)/3] and Again we note that the lower bound on d,,,, provided by r(n - k + l), n - k odd; Theorem 4 is independent of m and hence of the rate R of nAb = r(n - k + 2), n - k even, the convolutional code derived from the cyclic code. Hence the bound will be tightest for m = 1, i.e., for R = a. so that The best convolutional codes are obtained by selecting a (27) MASSEY et al.: POLYNOMIAL WEIGHTS AND CODE CONSTRUCTIONS 107

TABLE II n - k, which divides x” - cp, generates an (n, k) con- SOMEBINARY CONVOLUTIONALCODES OBTAINED FROM THE CONSTRUCTIONGIVENINTHEOREM 4. stacyclic code whose codewords are all the multiples of g(x) .- having degreeless than n. The code is cyclic if and only if Cyclic Code Employed R nA drr.. c = 1 and is negacyclic [2, p. 2111 if and only if c = - 1. (7, 3) QR code, d, = 4, d,, = 3 f >7 The following theorem gives a new class of repeated-root (17, 8) QR code, d, = 6, dh = 5 :: 211 (23, 11) QR code, d, = 8, d,, = 7 g 215 constacyclic codes and in its proof we develop an algebraic (41, 20) QR code, d, = 10, dh = 9 j E 219 decoding algorithm for thesecodes. The cyclic codes in this (47, 23) QR code, d, = 12, dh = 11 223 (63, 30) BCH code, d, = 13, dh L 8 $ 2: 221 class have been given earlier by Assmus and Mattson [16] (79, 39) QR code, d, = 16, dh = 1 S & 231 and by Berman [17]. (89, 44) QR code, dg = 18, dh = 17 $ 9”; 235 (103,51) QR code, d, = 20, dh = 19 & 108 239 Theorem 5: The polynomial g(x) = (x - c)“- k for 1 I k < p generatesa pr-ary (n = p,k) constacyclic code with d = n - k + 1 (i.e., a maximum distance separable Inequality (27), for r = 10, shows that the free distance of code [2, p. 3091.) these binary codes is still at least 10 percent of the con- Proof: We note first that (x - c)” = xp - cp so that straint length for nAb 1: 7000. g(x) divides xp - cp and hence generates a constacyclic Even better codes can be obtained by adding a single code of length IZ = p. Moreover, by Lemma 1, W[g(x)] = parity digit to the r-tuple used to represent the digits of p-k+l=n-k+lsothatd~n-kfl. GF(2’). In this case,the binary code still has only r informa- Letf(x) be a codeword in this constacyclic code and let tion bits per subblock but the subblock length is increasedto e(x) = e, + e,x + * . . + ep- ixp- ’ be the channel error 2(r + 1) and hencethe rate of the binary code is reduced to pattern. The same analysis as led to (11) above now shows that after f (x) + e(x) is read into the circuit of Fig. 2 with R=I r (28) higher degreeterms leading then the resultant syndrome is 2r+l’ given by which approaches) for large r. Since there are at least two s(x) = e(x + c) mod (xp- k, VW nonzero bits in each nonzero digit of GF(2’) in this new representation, one has for the binary code or equivalently p-1 dfree2 2LPn + 4)/3J S(X) = C ej(x + c)j mod (x"-~). WI j=O and also From (30b), we see that the syndrome s(x) = so + (r + l)(n - k + l), n - k odd; SIX + . . . + Sp-k-tXP-k-l can be expressedas nAb = 1(r + I)(n - k + 2), n - k even, Oli

weight-retaining property given in Theorem 1.1 for p = 2 holds in general. We find it more convenient to prove first the p-ary analog of Theorem 1.2 and thereafter to deduce the p-ary analog of Theorem 1.1. Theorem 6.2: For any polynomial Q(x) over GF(p’), any nonzero c in GF(pr), and any nonnegative integer N, W[Q(-W - c>“l 2 WC“l* W-Q(41. (36) Fig. 2. Syndrome-forming circuit for the p’-ary repeated-robt constacyclic codes of Theorem 5. Proof: In what follows we shall make frequent use of the fact that for any i and any polynomial P(x), W[P(x)] 2 it may be written W[P(x) mod (xi - c)]. We first show that (36) holds for N c p. If Q(c) = 0, then (36) holds trivially. If Q(c) # 0, then Q(x) is not divisible 44 = El yjij by (x - c) so that Q(x)@ - c)~ mod (xp - cp) is a non- where Yj # 0 is the “modified” value of the jth error (and zero codeword in the constacyclic code generated by g(x) = is related to the true error value ei, as et, = Yjc- ‘j) and (x - c)~ and hence, by Theorem 5, has Hamming weight at where we define Xj = ij as the location of the jth error. least N + 1. Thus Equation (34) may then be rewritten as W[Q(x)(x - c)“] 2 W[Q(x)(x - c)” mod (xp - cp>] 2 N + 1 = W[(x - c)“] si = i YjX], Oli(x- c>“] 2 (K + 1) jgo WIPj(Cpi)I, Use of the additive group of GF(p), rather than the multi- plicative group, as the error locations results both in the which, upon invoking (38), may be written as natural inclusion of the “0” position and also in a reduction in the number of multiplications required for decoding by Wt-Q(x)(x- 4”l the iterative algorithm. 2 (K + l)lV[Q(x)(x - c)~ mod (xp’ - cp’)]. (40) B. The Weight-Retaining Property of (x - c)~ over Now also GF(p’) Q(x)(x - c)~ mod (xp’ - cp’) Again we remark that the polynomials (x - c)‘, = Q(x)(x - c)” mod (x - c)~’ i = 0,1,2;**, form a basis for the vector space of all polynomials over GF(p’). We now propose to show that the = [Q(x) mod (x - c)Pi-“1(x - c)~. (41) MASSEY et al. : POLYNOMIAL WEIGHTS AND CODE CONSTRUCTIONS 109

But L < pi so that (36) may be applied to the right-hand TABLE III A SHORT TABLE OF THE NEW CLASS OF U-AKY REED-MULLER CODES side of (41) to give GIVEN IN SECTION iII-C.

W[Q(x)(x - c)~ mod (xp’ - cp’)] p n k d P n k d 4 2 w ctx - 47 . w~wi (42) : where we have recognizedthat Q(x) mod (x - c)” evaluated 1 at x = c is just Q(c). From (40), (41), and (42), we obtain :4 20

W[Q(x)tx - c)“] 2 (K + 1)W [(x - c>“] . W[QWl, :: 27 11 8 25 which, with the aid of Lemma 1 and the facts that L -=zKp ’ z: 10 9 :: 10 and N = Kp’ + L, may finally be written as 27 i :i 27 1 27 : WQ(x)tx - 4”l L W[“l. W[Q@)l, 25 4 3 which is (36), and the theorem is proved. ;: 1 We now have as a trivial consequenceof Theorem 6.2: Theorem 6.1: Let Z be any nonempty finite set of non- The codes in Table III have in most instances the same negative integers with least integer i,,,in and let parameters n, k, and d as do the extended p-ary Reed- Muller codes as given by Kasami et al. [20] (which we P(X) = C bi(X - c)i hereafter call KLP codes) whenever the more sparse KLP iel codes exist; i.e., the codes of Table III generally have the where c and each bi are nonzero elements of GF(pr). same k but n and d both one larger than the corresponding Then KLP code. However, the 5-ary (25, 15) code in Table III has d = 6, whereasd is only 4 for the 5-ary (24, 15) KLP W[P(x)] 2 W[(x - c)i=q. (43) code, so that the “p-ary Reed-Muller codes” as given here This theorem follows from Theorem 6.2 by noting that are not merely permutations of the (more sparse) KLP codes. P(x) = Q(x)(x - c)imin where Q(c) = bimin # 0. We also remark that one can obtain repeated-root Theorem 6.1 is the desiredp-ary analog of Theorem 1.1. constacyclic subcodes of the p-ary Reed-Muller codes Although we shall make no further use of it, we now state given here that are analogous to the cyclic codes of Section for completenessthe p-ary analog of Theorem 1.3, which II-C. The generator polynomial of the constacyclic code is follows from Theorem 6.2 preciselyas Theorem 1.3 followed chosen as g(x) = (x - c)ndk, where k = pm-“+’ - 1. from Theorem 1.1. These constacyclic subcodeshave the same minimum dis- tance d = 2~“~’ as their parent p-ary Reed-Muller codes. Theorem 6.3: For any polynomial P(x) over GF(p’), The constacyclic code is cyclic if and only if c = 1 and is any nonzero element c of GF(p’), and any nonnegative negacyclic if and only if c = - 1. integers n and N, IV. CONCLUDING REMARKS W[P(x)(x” - c)“] In the foregoing, it has been shown that the weight- 2 W[(x - c)“] . W[P(x) mod (x” - c)]. retaining properties of the polynomials (x - c)’ admit of exploitation in the construction of both block codes and C. A New Class of p-ary “Reed-Muller” Codes convolutional codes. This generality is somewhat sur- Proceeding analogously to Section II-B, let m be any prising, and it seemssafe to say that not all the consequences positive integer and consider the matrix G with n = pm of the weight-retaining property have beenuncovered in this columns, whose rows are the sequencesof coefficients of paper. (x - c)~for all i < n such that W[(x -- c)~] 2 d, where d is We wish to remark that the discovery by the secondauthor some integer chosen such that equality holds for at least of the binary R = 3 codes of Theorem 3 was the initial one such i. Given n and d, the k corresponding integers i impetus for the research reported here. The binary con- can be found with the aid of Lemma 1. For simplicity, volutional codesof Theorem 3 for m = 2’ and of Theorem 4 one would usually take c = 1 or c = - I, but this is not for m = 1 were earlier described orally by the first two necessary.It follows immediately from Theorem 6.1 that authors [21]. In the original manuscript of this paper, the G is the generator matrix of an (n = p”,k) p-ary code with results were obtained only for prime fields GF(p)’ rather than minimum distance d. We call these codes “p-ary Reed- GF(p’) as appearshere. The most significant consequenceof Muller codes” because of their similarity to the binary this generalization was the subsequent discovery of the Reed-Muller codes as formulated in Section II-B. A short long constraint length convolutional codes in Section II-E list of these codes is given in Table III. by Justesen. 110 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-19, NO. 1, JANUARY 1973

ACKNOWLEDGMENT communication systems,” IEEE Trans. Commun. Technol., vol. COM-19, pp. 751-772, Oct. 1971. The authors are grateful to Dr. G. David Forney, Jr., of [lOI J. L. Massey and D. J. Costello, Jr., “Nonsystematic convolu- tional codes for sequential decoding in space applications,” the Codex Corporation for his keen interest in and en- IEEE Trans. Comman. Technol., vol. COM-19, pp. 806-813, couragement of this research and for his suggestion of the Oct.-__. 1971.__ _. t111 L. R. Bahl and F. Jelinek, “Rate 4 convolutional codes with simple proof of Lemma 1. (An alternative proof of Lemma complementary generators,” IEEE Trans. Inform. Theory, vol. 1 may be found in Berman [17].) IT-17. vu. 718-727. Nov. 1971. [121 R. M. Fano, “A heuristic discussion of probabilistic decoding,” IEEE Trans. Inform. Theory, vol. IT-9, pp. 64-74, Apr. 1963. REFERENCES [131 L. D. Rudolph and A. Miczo, “Some results on the distance properties of convolutional codes,” Syracuse Univ., Syracuse, [I] F. P. Preparata, “State-logic relations for autonomous sequential N.Y.. Final Reo. NSF Grant GK-4737. Oct. 1970. networks,” IEEE Trans. Electron. Comput., vol. EC-13, pp. 542- I141 B. P\ieumann, _” Distance properties bf convolutional codes,” 548, Oct. 1964. S.M. thesis, Dep. Elec. Eng., Mass. Inst. Technol., Cambridge, [2] E. R. Berlekamp, Algebraic Coding Theory. New York: Aug. 1968. McGraw-Hill. 1968. [151 D. J. Costello, Jr., “Construction of convolutional codes for [3] I. S. Reed, “A class of multiple-error-correcting codes and the sequential decoding,” Dep. Elec. Eng., Univ. Notre Dame, decoding scheme,” IRE Trans. Inform. Theory, vol. PGIT-4, pp. Notre Dame. Ind.. Tech. Ren. EE-692. Aue. 1969. 38-49, Sept. 1954. [161 E. F. Assm&, Jr:, and H.-F. Mattson, jr., “New 5-designs,” [4] J. E. Meggitt, “Error correcting codes and their implementation J. Combinatorial Theory, vol. 6, pp. 122-151, 1969. for data transmission systems,” IRE Trans. Inform. Theory, vol. [171 S. D. Berman, “On the theory of group codes,” Kibernetika, IT-7, pp. 234-244, Oct. 1961. vol. ?, no. 1, pp. 31-39, 1967. [5] J. L. Massey, Threshold Decoding. Cambridge, Mass.: M.I.T. [I81 J. Rlordan, An Introduction to Combinatorial Analysis. New Press, 1963. York: Wilev. v. 33. 1958. [6] -, “Majority decoding of convolutional codes,” Res. Lab. [191 G. D. Foiie$, Jd, Concatenated Codes. Cambridge, Mass.: Electron. Mass. Inst. Technol., Cambridge, Quart. Prog. Rep. M.T.T. Press. 1966.

64.I_. DU. 183-188. Jan. 15. 1962. PO1T. Kasami, S. Lin, and W. W. Peterson, “New generalizations of [7] J. L. Massey and M. ‘K. Sain, “Inverses of linear sequential the Reed-Muller codes,” IEEE Trans. Inform. Theory, vol. circuits,” IEEE Trans. Comput., vol. C-17, pp. 330-337, Apr. 1968. IT-14, pp. 189-198, Jan. 1968. [8] G. D. Forney, Jr., “Convolutional codes I : Algebraic structure,” WI D. J. Costello, Jr., and J. L. Massey, “Constructing good con- IEEE Trans. Inform. Theory, vol. IT-16, pp. 720-738, Nov. 1970. volutional codes from cyclic block codes,” presented at IEEE [9] A. J. Viterbi, “Convolutional codes and their performance in Int. Symp. Information Theory, Asilomar, Calif., Jan. 1972.

Correspondence

A General Approach to Linear Mean-Square Estimation is an easily implemented linear operation on the realizations of Problems the observation process. The estimate is also approximated arbitrarily closely in the stochastic mean-square sense by an STAMATIS CAMBANIS explicitly given linear integral operation on the observation process. The significance of this estimation procedure is that it Abstract-An explicit and easily implemented solution is given to the provides a general solution to linear ‘mean-square estimation general problem of linear mean-square estimation of a signal or system problems that can be implemented in a straightforward way. process based upon noisy observations, under the assumption that the The basis for the approach taken in this correspondence is auto- and cross-correlation functions of the signal and the observation processes are known. Also a number of specific estimation problems are some structural properties of second-order processes presented briefly discussed. in [l 1. These properties are summarized in Section II. In Section III the general linear mean-square estimation problem is for- I. INTRODUCTION mulated and solved. A number of estimation problems for which In this correspondence the linear mean-square estimation of the estimation procedure presented in Section III is applicable a signal or system process on the basis of noisy observations is are discussed in Section IV. considered. A general solution is obtained under the assumption It should be remarked that the use of stochastic-process only that the autocorrelation of the observation process and the representations in the solution of linear mean-square estimation cross-correlation between the observation and the signal pro- problems is well known and widely used; see for instance [4] cesses are known. No assumptions are made about the con- and further references mentioned in the following. In the light tinuous, stationary, or Gaussian properties of the signal or of the of prior work on this subject the contribution of this corre- observation processes, or about the nature of the observation spondence is twofold: i) the linear mean-square estimation interval. The estimate is expressed in a series, each term of which problem is solved for all signal or system processes and observa- tion processes that are measurable and of second order and for all observation intervals; in contrast, all solutions previously Manuscript received May 4, 1971; revised April 12, 1972. This work was supported by the Air Force Office of Scientific Research under Grant reported in the literature apply to more restrictive classes of AFOSR-68-1415. processes and observation intervals; and ii) the estimate can be The author is with the Department of Statistics, University of North Carolina at Chapel Hill, Chapel Hill, N.C. 27514. implemented in a straightforward way and a significant freedom