DECEMBER 1, 2003

The binary reflected Gray code is optimal for M-PSK

Erik Agrell, Johan Lassing, Erik G. Ström, and Tony Ottosson

Abstract: This paper concerns the problem of selecting a binary labeling for the signal constellation in an M-PSK communication system. Gray labelings are discussed and the original work by Frank Gray is analyzed. As is noted, the number of distinct Gray labelings that result in different bit-error probability grows rapidly with increasing constellation size. By introducing a recursive Gray labeling construction method called expansion, the paper answers the natural question of which labeling, among all possible constellation labelings, gives the lowest possible average probability of bit errors. Under certain assumptions on the channel values, the answer is that the labeling proposed by Gray, the binary reflected Gray code, is the optimal labeling for M-PSK systems, which has, surprisingly, never been proved before.

Index Terms: binary reflected Gray code, constellation labeling, phase shift keying, pulse amplitude modulation, quadrature amplitude modulation, average distance spectrum

I. INTRODUCTION

This paper concerns the problem of selecting a binary labeling for the signal vectors in an M-ary phase shift keyed (M-PSK) communication system that will minimize the probability of bit errors. As was shown in [1], the average bit error probability (BER) of coherent M-PSK (M is assumed to be a power of two) is given by

$$P_b = \frac{1}{m} \sum_{k=1}^{M-1} \bar{d}(k)\, P(k), \qquad (1)$$

where $m = \log_2 M$ and $P(k)$ is the probability that the received signal is displaced into the decision region of a symbol $k$ steps counter-clockwise away from the transmitted signal. The main focus in this paper is on the function $\bar{d}(k)$, the average distance spectrum (ADS) of a binary labeling of the symbols in a signal constellation. This function is the average number of bits that differ in symbols separated by $k$ steps, averaged over all $M$ symbols. The probabilities $P(k)$, given for example by the expressions in [2, p. 201], are not functions of the constellation labeling, so the BER dependence on the labeling is captured entirely by $\bar{d}(k)$. For most channels of interest, the function $P(k)$ decreases rapidly with $k$, making it reasonable to choose a labeling that assigns binary patterns to the constellation symbols in such a way that adjacent patterns differ in a single bit. Such labelings are known as Gray labelings.

The problem of evaluating the average BER of M-PSK modulation schemes has been studied extensively in the literature. In [3–6], approximate and exact values of the BER for certain values of M are given and in [1] the exact values are given for all M. Common to all these references is the use of the binary reflected Gray labeling of the constellation symbols. The binary reflected labeling was suggested by Frank Gray in a patent from 1953 [7] as a means of reducing the coding error in a pulse code communication system. The system referred to by Gray can be viewed as an analog-to-digital converter, in which an analog signal controls the deflection of a sweeping electron beam. The electron beam sweeps during each sampling interval over a row of a coding mask, which allows the electrons to pass in certain slots while blocking them in others. Electrons that actually hit the collector anode give rise to an output current, while this current is essentially zero if the electrons are blocked. This system converts an analog signal into a signal that is essentially a string of binary digits.

The solution proposed by Gray addresses three main issues with this system: first, reducing the distortion of the decoded analog signal arising from a small error in the deflection of the electron beam; second, simplifying the manufacturing of the coding mask by making the smallest apertures of the coding mask larger; and third, improving the timing properties of the recovery circuitry. Gray solves this problem by simply listing the binary numbers in a different order, so that adjacent numbers differ in a single bit position. This approach solves the first and most important issue, giving a small decoding error (a single bit) for small errors in the beam deflection. In addition, the particular mapping Gray proposes also doubles the size of the smallest apertures of the coding mask. Gray calls the proposed mapping the reflected binary code, due to its recursive construction method (see Section III-A). Gray identifies the trivial operations defined in Section II below, but his treatment only concerns Gray labelings with the symmetric properties imposed by the recursive reflection construction method.

In most references in the literature the binary reflected labeling proposed by Gray is referred to simply as 'the Gray' labeling, without further specification. However, for $m > 3$ there exist several Gray labelings that have different ADS, and as $m$ increases the number of such labelings rapidly becomes very large [8–10]. To find the labeling that gives the lowest possible BER, it is necessary to consider the entire class of binary labelings having the Gray property. Only a fraction of the labelings in this class can be generated from the recursive methods defined below, but we will show that it is possible to generate the optimum labeling by these recursions. For illustration, Table I gives two binary labelings having the Gray property along with their respective ADS. By comparing the ADS of the two labelings it is seen that, from (1), these labelings will indeed result in different BER.

The authors are with the Communication Systems group at the department of Signals and Systems, Chalmers University of Technology, Gothenburg, Sweden (e-mail: [email protected]).

The natural approach to finding the labeling that minimizes (1) is to make sure that the chosen labeling results in a $\bar{d}(k)$ that grows slowly with $k$. To be more precise, we will address the problem of finding the optimal labeling under the assumption that the channel values $P(k)$ decay sufficiently quickly with $k$ to make the minimization of the BER equivalent to sequential minimization of the components of the ADS. Such channels will be called "well-behaved" in the rest of this text. Under this assumption, considering two labelings $a$ and $b$ with ADS $\bar{a}(k)$ and $\bar{b}(k)$, respectively, the labeling $a$ will result in a lower BER according to (1) if and only if

$$\bar{a}(i) = \bar{b}(j), \quad 0 \le i$$ [...]

Definition 3 (Average Distance Spectrum): The average distance spectrum (ADS), $\bar{d}(k)$, of a binary labeling $C = (c_0, c_1, \ldots, c_{M-1})$ is the average number of bit positions that differ in codewords separated by $k$ steps, averaged cyclically over all the codewords, i.e.,

$$\bar{d}(k) \triangleq \frac{1}{M} \sum_{i=0}^{m-1} \sum_{l=0}^{M-1} \left| [c_l]_i - [c_{(l+k) \bmod M}]_i \right| \qquad (2)$$

for all $k \in \mathbb{Z}$, where the notation $[c_l]_i$ denotes the bit in position $i$ of $c_l$.
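Definition 3 translates directly into a few lines of code. The following is a minimal sketch, assuming that a labeling is represented as a Python list of equal-length bit strings; the function name `average_distance_spectrum` and this representation are illustrative choices, not part of the paper.

```python
from fractions import Fraction


def average_distance_spectrum(labels):
    """Return [d(0), ..., d(M-1)] as in (2) for a cyclic list of bit strings.

    labels: list of M equal-length strings over {'0', '1'} (assumed format).
    """
    M = len(labels)
    spectrum = []
    for k in range(M):
        total = 0
        for l in range(M):
            first = labels[l]
            second = labels[(l + k) % M]  # cyclic separation of k steps
            total += sum(x != y for x, y in zip(first, second))
        spectrum.append(Fraction(total, M))  # exact average over M codewords
    return spectrum


# The only Gray code of order 2 (up to trivial operations).
gray2 = ["00", "01", "11", "10"]
ads2 = average_distance_spectrum(gray2)
print([str(d) for d in ads2])  # ['0', '1', '2', '1']
```

Exact fractions are used so that spectra can be compared component by component without rounding concerns.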

TABLE I
TWO DIFFERENT BINARY MAPPINGS $a$ AND $b$, BOTH HAVING THE GRAY PROPERTY, AND THEIR RESPECTIVE AVERAGE DISTANCE SPECTRA $\bar{a}(k)$ AND $\bar{b}(k)$.

  a      b      k    \bar{a}(k)   \bar{b}(k)
  0000   0000    0   0            0
  0001   0001    1   1            1
  0101   0011    2   2            2
  0100   0010    3   2.375        2
  0110   0110    4   2.5          2
  0010   0111    5   2.5          2.5
  1010   0101    6   2.5          3
  1110   0100    7   2.125        2.5
  1100   1100    8   2            2
  1101   1101    9   2.125        2.5
  1111   1111   10   2.5          3
  0111   1110   11   2.5          2.5
  0011   1010   12   2.5          2
  1011   1011   13   2.375        2
  1001   1001   14   2            2
  1000   1000   15   1            1
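The comparison made with Table I can be reproduced numerically. The sketch below recomputes the two spectra from the listed codewords and then evaluates (1) for both labelings. The channel values used here, $P(k) = r^{\min(k, M-k)}$ with $r = 0.01$, are purely hypothetical and chosen only to decay quickly with $k$; they are not the M-PSK expressions of [2]. The helper names `ads` and `ber` are likewise illustrative.

```python
def ads(labels):
    """Average distance spectrum (2) of a cyclic list of bit strings."""
    M = len(labels)
    return [
        sum(
            sum(x != y for x, y in zip(labels[l], labels[(l + k) % M]))
            for l in range(M)
        ) / M
        for k in range(M)
    ]


def ber(spectrum, P, m):
    """Evaluate (1): P_b = (1/m) * sum_{k=1}^{M-1} d(k) P(k)."""
    return sum(spectrum[k] * P[k] for k in range(1, len(spectrum))) / m


# Labelings a and b of Table I.
a = ("0000 0001 0101 0100 0110 0010 1010 1110 "
     "1100 1101 1111 0111 0011 1011 1001 1000").split()
b = ("0000 0001 0011 0010 0110 0111 0101 0100 "
     "1100 1101 1111 1110 1010 1011 1001 1000").split()

ads_a, ads_b = ads(a), ads(b)

# Hypothetical, rapidly decaying channel values (an assumption, see above).
M, m, r = 16, 4, 0.01
P = [r ** min(k, M - k) for k in range(M)]

ber_a, ber_b = ber(ads_a, P, m), ber(ads_b, P, m)
print(ads_a[3], ads_b[3])  # 2.375 2.0
print(ber_b < ber_a)       # True
```

The first component in which the two spectra differ is $k = 3$, where $b$ has the smaller value, so for a sufficiently fast decay of $P(k)$ labeling $b$ yields the lower BER.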

As was noted in the introduction, for a given order $m$, the number of Gray codes that do not have equivalent ADS is usually very large. Only a fraction of these labelings can be generated from the recursive methods proposed below. However, we show in this paper that it is possible to generate the optimum labeling by these recursions.

A. Construction by labeling reflection

To generate a labeling of order $m$ from a labeling of order $m - 1$ by means of reflection we proceed as follows. To the labeling of order $m - 1$, denoted by $C_{m-1} = (c_0, c_1, \ldots, c_{M/2-1})$, we append a sequence of $M/2$ vectors formed by repeating the codewords of $C_{m-1}$ in reverse order: $(c_0, \ldots, c_{M/2-1}, c_{M/2-1}, \ldots, c_0)$. To this new sequence of binary vectors, an extra coordinate is added to each vector from the left. This extra coordinate is 0 for the first half of the $M$ vectors and 1 for the second half. The sequence $C_m$ so obtained consists of distinct codewords, so it is a labeling, and $C_m$ is said to be obtained by reflection of $C_{m-1}$. Labeling reflection is possible for $m \ge 2$, and is illustrated in Figure 1.

Fig. 1. Construction of Gray code of order $m = 3$ from a Gray code of order $m = 2$ by means of reflection.

If $C_{m-1}$ is a Gray code, then so is $C_m$, which proves that Gray codes of any order exist. The originally proposed Gray code [7], which is still the most commonly encountered Gray code in communications, can be defined as follows.

Definition 8 (Binary reflected Gray code): The labeling $G_m$ obtained by $m - 1$ recursive reflections of the trivial labeling $G_1 = (0, 1)$ is the binary reflected Gray code of order $m$, for any $m \ge 1$.

B. Construction by labeling expansion

The second method of construction we will consider is termed labeling expansion. To generate a labeling $C_m$ from a labeling $C_{m-1}$ by expansion we do the following: from $C_{m-1} = (c_0, c_1, \ldots, c_{M/2-1})$, repeat each codeword once to obtain a new sequence of $M$ vectors $(c_0, c_0, c_1, c_1, \ldots, c_{M/2-1}, c_{M/2-1})$. Now, an extra coordinate is added to each codeword from the right, taken in turn from the vector $(0, 1, 1, 0, 0, 1, 1, 0, \ldots, 0, 1, 1, 0)$ of length $M$. Labeling expansion is possible for $m \ge 2$, and the procedure is illustrated in Figure 2.

If $C_{m-1}$ is a Gray code, then so is $C_m$. By induction, it is possible to verify that $m - 1$ recursive expansions of the trivial labeling $G_1 = (0, 1)$ lead to a Gray code in which the codewords correspond to the same path on $Q_m$ as the BRGC.

IV. OPTIMALITY OF THE BINARY REFLECTED GRAY CODE

The main result of this paper is captured by the following theorem, which will be proved in Section IV-B.
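Before the theorem is stated, the two constructions of Section III can be made concrete in code. The following is a minimal sketch, assuming labelings are lists of bit strings with the leftmost character as the leftmost coordinate; the function names `reflect`, `expand`, `brgc` and `is_cyclic_gray` are illustrative and not taken from the paper.

```python
def reflect(code):
    """Labeling reflection (Section III-A): append the reversed list,
    then prefix 0 to the first half and 1 to the second half."""
    return ["0" + c for c in code] + ["1" + c for c in reversed(code)]


def expand(code):
    """Labeling expansion (Section III-B): repeat each codeword once,
    then append bits taken in turn from 0, 1, 1, 0, 0, 1, 1, 0, ..."""
    pattern = "0110"
    doubled = [c for c in code for _ in range(2)]
    return [c + pattern[n % 4] for n, c in enumerate(doubled)]


def brgc(m):
    """Binary reflected Gray code of order m (Definition 8)."""
    code = ["0", "1"]  # trivial labeling G_1
    for _ in range(m - 1):
        code = reflect(code)
    return code


def is_cyclic_gray(code):
    """True if codewords are distinct and cyclically adjacent ones differ in one bit."""
    M = len(code)
    distinct = len(set(code)) == M
    adjacent = all(
        sum(x != y for x, y in zip(code[l], code[(l + 1) % M])) == 1
        for l in range(M)
    )
    return distinct and adjacent


print(brgc(3))          # ['000', '001', '011', '010', '110', '111', '101', '100']
print(expand(brgc(2)))  # ['000', '001', '011', '010', '110', '111', '101', '100']
```

In this representation, repeated expansion of $(0, 1)$ reproduces the BRGC codeword for codeword, which is one way to see the remark that both recursions trace the same path on $Q_m$.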

Fig. 2. Construction of Gray code of order $m = 3$ from a Gray code of order $m = 2$ by means of expansion.

Theorem 1 (Optimality of BRGC for M-PSK): The binary reflected Gray code of order $m$ is the optimal labeling for $2^m$-PSK, in the sense of Definition 5. The labeling is unique up to trivial operations.

A. The first components of the ADS

In order to prove Theorem 1 we will rely on the following theorem, which relates the ADS of a labeling $C_{m-1}$ of order $m - 1$ to the ADS of its expanded labeling $C_m$. The proof is given in the Appendix.

Theorem 2 (Recursion for ADS of an Expanded Labeling): Let $\bar{d}_m(k)$ for $m \ge 2$ denote the ADS of the labeling $C_m$ obtained from expansion of a labeling $C_{m-1}$ having an ADS $\bar{d}_{m-1}(k)$. Then, for all integers $k$, the distance spectra of $C_m$ and $C_{m-1}$ satisfy

$$\bar{d}_m(4k) = \bar{d}_{m-1}(2k) \qquad (3)$$
$$\bar{d}_m(4k+2) = \bar{d}_{m-1}(2k+1) + 1 \qquad (4)$$
$$\bar{d}_m(2k+1) = \tfrac{1}{2}\bar{d}_{m-1}(k) + \tfrac{1}{2}\bar{d}_{m-1}(k+1) + \tfrac{1}{2} \qquad (5)$$

with $\bar{d}_1(i) = 0$ for even $i$ and $\bar{d}_1(i) = 1$ for odd $i$.

The following two lemmas give the first components of the ADS. They will be used in the proof of Lemma 5 and are proved in the Appendix.

Lemma 3: For any Gray code $C_m$ with $m \ge 3$, the ADS satisfies $\bar{d}(1) = 1$, $\bar{d}(2) = 2$, and $\bar{d}(3) \ge 2$.

Lemma 4: Expanding a Gray code of order $m - 1 \ge 2$ results in a Gray code of order $m$ having $\bar{d}(3) = 2$. Conversely, all Gray codes $C_m$ with $m \ge 3$ and $\bar{d}(3) = 2$ can be constructed by expanding a Gray code of order $m - 1$, possibly followed by trivial operations.

B. Proof of the optimality of BRGC for M-PSK

We now address the problem of which particular labeling will give the slowest increasing ADS among all possible labelings, or more precisely, which labeling has the optimum ADS in the sense of Definition 5. According to the discussion in Section I, a mapping with optimum ADS will asymptotically result in the lowest possible BER for the M-PSK system. We will show in the following that the binary reflected Gray code (BRGC) is the unique labeling, up to trivial operations, with optimum ADS.

Lemma 5: If $C_{m-1}$ is an optimal labeling of order $m - 1$, with $m \ge 2$, then an optimal labeling $C_m$ of order $m$ is obtained by expanding $C_{m-1}$. The optimal labeling $C_m$ is unique up to trivial operations.

Proof: The lemma is trivial for $m = 2$, since, up to trivial operations, only one Gray code of order 2 exists. From Lemmas 3 and 4, any optimal labeling $C_m$ for $m \ge 3$ can be constructed by expanding a labeling $C_{m-1}$ and applying trivial operations. Hence, the ADS of $C_m$ satisfies (3)–(5). Since, for all integers $i$, $\bar{d}_m(2i-1)$ and $\bar{d}_m(2i)$ are increasing functions of $\bar{d}_{m-1}(i)$, and independent of $\bar{d}_{m-1}(j)$ for $j > i$, sequential minimization of $\bar{d}_m(1), \bar{d}_m(2), \ldots$ is equivalent to sequential minimization of $\bar{d}_{m-1}(1), \bar{d}_{m-1}(2), \ldots$. Since $C_{m-1}$ is optimal by assumption, this proves that $C_m$ is also an optimal labeling. □

The proof of the main theorem now follows straightforwardly from Lemma 5.

Proof of Theorem 1: The BRGC of order $m$ can be obtained by $m - 1$ recursive expansions of the trivial labeling $(0, 1)$. The proof of optimality for the BRGC is trivial for $m = 1$. By induction and Lemma 5, optimality of the BRGC is guaranteed for $m \ge 2$. □

V. DISCUSSION AND CONCLUSION

We have addressed the problem of finding which constellation labeling produces the lowest possible BER among all possible labelings. The search is done under the assumption of a well-behaved channel, for which the channel values $P(k)$ decay quickly enough to ensure that sequential minimization of the components of the ADS yields the minimum BER. We have shown that the best way of labeling an M-PSK constellation under this assumption is by using the binary reflected Gray code.

The relevance of this discussion and the proof can be verified by consulting almost any widely used textbook on communications in which the problem of calculating the average BER of systems using M-PSK is treated. In most cases, the BRGC is used, but referred to simply as 'the Gray' labeling, and the fact that a wealth of different Gray labelings exist, together with their impact on the BER, is often neglected. The proofs in this paper validate the assumption of the BRGC for M-PSK constellation labeling and allow for a clearer presentation of the topic of BER calculation for this type of communication system.

APPENDIX
PROOFS OF THEOREMS AND LEMMAS

Proof of Theorem 2: The average distance spectrum (ADS) of any binary periodic sequence $b_l$ with period $P$ is defined, for all integers $k$, as

$$\bar{\delta}(b, k) \triangleq \frac{1}{P} \sum_{l=0}^{P-1} |b_l - b_{l+k}|$$
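As a small worked instance of this definition, consider the period-4 sequence $b = (0, 1, 1, 0)$, which is also the sequence of appended bits used in labeling expansion. With $P = 4$, the definition gives

$$\bar{\delta}(b, 1) = \tfrac{1}{4}\left(|0-1| + |1-1| + |1-0| + |0-0|\right) = \tfrac{1}{2}, \qquad \bar{\delta}(b, 2) = \tfrac{1}{4}\left(|0-1| + |1-0| + |1-0| + |0-1|\right) = 1.$$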

Now, from $b_l$ we form another sequence $u_l = (\ldots, b_{-1}, b_{-1}, b_0, b_0, b_1, b_1, \ldots)$, $u_l$ being simply an upsampled version of $b_l$, where each element of $b_l$ is repeated once. The sequence $u_l$ is a binary, periodic sequence with period $P' = 2P$, satisfying $u_{2l} = u_{2l+1} = b_l$, for all integers $l$. For this new sequence we have

$$\bar{\delta}(u, k) = \frac{1}{P'} \sum_{l=0}^{P'-1} |u_l - u_{l+k}| = \frac{1}{2P} \sum_{l=0}^{2P-1} |u_l - u_{l+k}|$$

By rearranging terms in the second sum we obtain

$$\bar{\delta}(u, k) = \frac{1}{2P} \left[ \sum_{l=0}^{P-1} |u_{2l} - u_{2l+k}| + \sum_{l=0}^{P-1} |u_{2l+1} - u_{2l+1+k}| \right]$$

For $k = 2i$, where $i$ is an integer, we have

$$\bar{\delta}(u, 2i) = \frac{1}{2P} \left[ \sum_{l=0}^{P-1} |u_{2l} - u_{2l+2i}| + \sum_{l=0}^{P-1} |u_{2l+1} - u_{2l+1+2i}| \right]$$
$$= \frac{1}{2P} \left[ \sum_{l=0}^{P-1} |b_l - b_{l+i}| + \sum_{l=0}^{P-1} |b_l - b_{l+i}| \right]$$
$$= \frac{1}{2P} \left[ P\,\bar{\delta}(b, i) + P\,\bar{\delta}(b, i) \right] = \bar{\delta}(b, i) \qquad (6)$$

and, similarly, for $k = 2i + 1$, we have

$$\bar{\delta}(u, 2i+1) = \frac{1}{2P} \left[ \sum_{l=0}^{P-1} |u_{2l} - u_{2l+2i+1}| + \sum_{l=0}^{P-1} |u_{2l+1} - u_{2l+2i+2}| \right]$$
$$= \frac{1}{2P} \left[ \sum_{l=0}^{P-1} |b_l - b_{l+i}| + \sum_{l=0}^{P-1} |b_l - b_{l+i+1}| \right]$$
$$= \frac{1}{2P} \left[ P\,\bar{\delta}(b, i) + P\,\bar{\delta}(b, i+1) \right]$$
$$= \tfrac{1}{2}\bar{\delta}(b, i) + \tfrac{1}{2}\bar{\delta}(b, i+1) \qquad (7)$$

Consider now the ADS $\bar{d}_m(k)$ of a labeling $C_m$ obtained by expanding a labeling $C_{m-1}$ with ADS $\bar{d}_{m-1}(k)$. Denoting the contribution to the ADS from coordinate $i$ of all codewords with

$$\bar{d}_m^{(i)}(k) \triangleq \frac{1}{M} \sum_{l=0}^{M-1} \left| [c_l]_i - [c_{(l+k) \bmod M}]_i \right|, \quad \forall k \in \mathbb{Z}$$

we have from (2) for the ADS of $C_m$

$$\bar{d}_m(k) = \sum_{i=0}^{m-1} \bar{d}_m^{(i)}(k) = \sum_{i=1}^{m-1} \bar{d}_m^{(i)}(k) + \bar{d}_m^{(0)}(k) \triangleq \bar{\nu}_m(k) + \bar{d}_m^{(0)}(k)$$

where $i = 0$ corresponds to the last coordinate in the codewords. Now, we observe that the term $\bar{\nu}_m(k)$ is simply the ADS of the list of binary strings that results from simply repeating each codeword of $C_{m-1}$ once. By noting that the modulo operator in (2) can be removed without affecting the result, by instead considering the periodic repetition of the codewords, we can use (6) and (7) above to obtain, for all integers $k$,

$$\bar{\nu}_m(2k) = \bar{d}_{m-1}(k)$$
$$\bar{\nu}_m(2k+1) = \tfrac{1}{2}\bar{d}_{m-1}(k) + \tfrac{1}{2}\bar{d}_{m-1}(k+1) \qquad (8)$$

To obtain the desired result, we note that the term $\bar{d}_m^{(0)}(k)$ is the ADS of the (periodically repeated) sequence $(0, 1, 1, 0)$. For this sequence we have trivially, for all integers $k$,

$$\bar{d}_m^{(0)}(4k) = 0$$
$$\bar{d}_m^{(0)}(4k+2) = 1$$
$$\bar{d}_m^{(0)}(2k+1) = 1/2 \qquad (9)$$

Combining (8) and (9) we have

$$\bar{d}_m(4k) = \bar{d}_{m-1}(2k)$$
$$\bar{d}_m(4k+2) = \bar{d}_{m-1}(2k+1) + 1$$
$$\bar{d}_m(2k+1) = \tfrac{1}{2}\bar{d}_{m-1}(k) + \tfrac{1}{2}\bar{d}_{m-1}(k+1) + \tfrac{1}{2}$$

which completes the proof of Theorem 2. □

Proof of Lemma 3: Since, for a Gray code $C_m$, all adjacent codewords differ in a single position, we have $\bar{d}_m(1) = 1$. Codewords separated by two steps can either differ in 0 or 2 positions, and since all codewords are distinct, $\bar{d}_m(2) = 2$ for $m \ge 2$. To show $\bar{d}_m(3) \ge 2$ for $m \ge 3$, we start by rewriting the equation for the ADS given by (2) as

$$\bar{d}(k) = \frac{1}{M} \sum_{l=0}^{M-1} \sum_{i=0}^{m-1} \left| [c_l]_i - [c_{(l+k) \bmod M}]_i \right| = \frac{1}{M} \sum_{l=0}^{M-1} d(l, k), \quad \forall k \in \mathbb{Z} \qquad (10)$$

where $d(l, k)$ is the number of bits that differ between $c_l$ and $c_{(l+k) \bmod M}$. From this we show that no two consecutive terms $d(l, 3)$ and $d(l+1, 3)$ in (10) can both be 1 for $m \ge 3$. To see this, consider any sequence of five consecutive codewords, which, without loss of generality, can be taken to be $(c_0, c_1, c_2, c_3, c_4)$. Since $c_1$ and $c_3$ differ in two positions, there are exactly two points in $Q_m$ that are adjacent to both $c_1$ and $c_3$. One of these points is $c_2$; the other may be $c_0$ or $c_4$, but not both. Assume that $|c_0 - c_3| = 1$, so that $d(0, 3) = 1$. Since $c_1$ and $c_4$ are separated by an odd number of steps this implies that $|c_1 - c_4| \ge 3$, so that necessarily $\bar{d}(3) \ge 2$.

Proof of Lemma 4: The first statement of the lemma follows immediately from (5).

For the second statement of the lemma, we know from the proof of Lemma 3 that, for any Gray code of order $m \ge 3$, the sequence $d(l, 3)$ for $l = 0, 1, \ldots$, consists of odd positive integers such that no two consecutive values are both 1. Hence, the only sequence that results in $\bar{d}(3) = 2$ is $(1, 3, 1, 3, \ldots)$ (or $(3, 1, 3, 1, \ldots)$, which will not be further considered, since it corresponds simply to a cyclic shift of the codewords). This means that the codeword pairs $\{c_0, c_3\}, \{c_2, c_5\}, \{c_4, c_7\}, \ldots$ are adjacent vertices of $Q_m$, while the codeword pairs $\{c_1, c_4\}, \{c_3, c_6\}, \{c_5, c_8\}, \ldots$ differ in three coordinates. Since $|c_i - c_{i+3}| = 1$ for any even $0 \le i \le M - 4$, $(c_i, c_{i+1}, c_{i+2}, c_{i+3})$ forms a square in $Q_m$. Hence, $c_{i+1} - c_i$ and $c_{i+2} - c_{i+3}$ are equal, or

$$\Delta \triangleq c_1 - c_0 = c_2 - c_3 = \ldots = c_{M-2} - c_{M-1}.$$

The difference vector $\Delta$ has only one nonzero position, say, position $j$. By performing trivial operations on $C_m$ we can obtain a code $C'_m$ for which

$$\Delta' \triangleq c'_1 - c'_0 = c'_2 - c'_3 = \ldots = c'_{M-2} - c'_{M-1} = 0 \ldots 01,$$

without affecting its ADS. We now partition the codewords of $C'_m$ according to the value of the rightmost bit. This creates two subsets $(c'_0, c'_3, c'_4, c'_7, c'_8, \ldots, c'_{M-1})$ and $(c'_1, c'_2, c'_5, c'_6, c'_9, \ldots, c'_{M-2})$, which represent two cycles on $Q_m$. By picking any of the two subsets and puncturing the rightmost bit in this subset (which geometrically corresponds to a projection orthogonal to $\Delta'$, so that the two subsets become identical after puncturing) we generate a cyclic Gray code of order $m - 1$. It is easily verified that expanding this labeling using the procedure given in Section III-B yields $C'_m$. □
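The recursion of Theorem 2 and the puncturing step in the proof of Lemma 4 can both be checked numerically on a small case. The sketch below expands labeling $a$ of Table I (a Gray code of order 4 that is not the BRGC), verifies (3)–(5) for all shifts, confirms $\bar{d}(3) = 2$ for the expanded code, and recovers the parent labeling by keeping the codewords whose rightmost bit is 0 and deleting that bit. The helper names `expand`, `ads_at` and `satisfies_recursion` are illustrative assumptions, and a check on one labeling is of course not a proof.

```python
from fractions import Fraction


def expand(code):
    """Labeling expansion (Section III-B) on a list of bit strings."""
    doubled = [c for c in code for _ in range(2)]
    return [c + "0110"[n % 4] for n, c in enumerate(doubled)]


def ads_at(code, k):
    """ADS component d(k) from (2); k may be any integer (taken mod M)."""
    M = len(code)
    total = sum(
        sum(x != y for x, y in zip(code[l], code[(l + k) % M]))
        for l in range(M)
    )
    return Fraction(total, M)


def satisfies_recursion(parent):
    """Check (3), (4) and (5) between a labeling and its expansion."""
    child = expand(parent)
    half = Fraction(1, 2)
    for k in range(len(child)):
        if ads_at(child, 4 * k) != ads_at(parent, 2 * k):
            return False
        if ads_at(child, 4 * k + 2) != ads_at(parent, 2 * k + 1) + 1:
            return False
        expected = half * ads_at(parent, k) + half * ads_at(parent, k + 1) + half
        if ads_at(child, 2 * k + 1) != expected:
            return False
    return True


# Labeling a of Table I.
a = ("0000 0001 0101 0100 0110 0010 1010 1110 "
     "1100 1101 1111 0111 0011 1011 1001 1000").split()

child = expand(a)
# Puncturing: keep codewords ending in 0 and drop the rightmost bit.
punctured = [c[:-1] for c in child if c[-1] == "0"]

print(satisfies_recursion(a))  # True
print(ads_at(child, 3))        # 2
print(punctured == a)          # True
```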

REFERENCES

[1] J. Lassing, E. G. Ström, T. Ottosson, and E. Agrell, "Computation of the exact bit error rate of coherent M-ary PSK with Gray code bit mapping," IEEE Transactions on Communications, vol. 51, no. 11, pp. 1758–1760, Nov. 2003.
[2] M. K. Simon and M. S. Alouini, Digital Communications over Fading Channels: A Unified Approach to Performance Analysis, John Wiley & Sons, 2000.
[3] P. J. Lee, "Computation of the bit error rate of coherent M-ary PSK with Gray code bit mapping," IEEE Transactions on Communications, vol. COM-34, no. 5, pp. 488–491, May 1986.
[4] M. I. Irshid and I. S. Salous, "Bit error probability for coherent M-ary PSK systems," IEEE Transactions on Communications, vol. COM-39, no. 3, pp. 349–352, Mar. 1991.
[5] M. K. Simon, S. M. Hinedi, and W. C. Lindsey, Digital Communication Techniques, Prentice Hall, Englewood Cliffs, New Jersey, 1994.
[6] J. Lu, K. B. Letaief, J. C.-I. Chuang, and M. L. Liou, "M-PSK and M-QAM BER computation using signal-space concepts," IEEE Transactions on Communications, vol. 47, no. 2, pp. 181–184, Feb. 1999.
[7] F. Gray, "Pulse code communications," U. S. Patent No. 2632058, Mar. 1953.
[8] E. N. Gilbert, "Gray codes and paths on the n-cube," Bell System Technical Journal, vol. 37, pp. 815–826, May 1958.
[9] J. E. Ludman, "Gray code generation for MPSK signals," IEEE Transactions on Communications, vol. COM-29, no. 10, pp. 1519–1522, Oct. 1981.
[10] C. E. Sundberg, "Bit error probability properties of Gray-coded M.P.S.K signals," IEE Electronics Letters, vol. 11, no. 22, pp. 542–544, Oct. 1975.
[11] R. D. Wesel, X. Liu, J. M. Cioffi, and C. Komninakis, "Constellation labeling for linear encoders," IEEE Transactions on Information Theory, vol. 47, no. 6, pp. 2417–2431, Sept. 2001.
[12] J. Gross and J. Yellen, Graph Theory and Its Applications, CRC Press, Boca Raton, 1999.