Short Low-Rate Non-Binary Turbo Codes

Gianluigi Liva, Balázs Matuz, Enrico Paolini, Marco Chiani

G. Liva and B. Matuz are with the Institute of Communications and Navigation of the Deutsches Zentrum für Luft- und Raumfahrt (DLR), 82234 Wessling, Germany (e-mail: {gianluigi.liva, balazs.matuz}@dlr.de).
E. Paolini and M. Chiani are with CNIT, DEIS/WiLAB, University of Bologna, 47521 Cesena (FC), Italy (e-mail: {e.paolini, marco.chiani}@unibo.it).

Abstract: A serial concatenation of an outer non-binary turbo code with different inner binary codes is introduced and analyzed. The turbo code is based on memory-1 time-variant recursive convolutional codes over high order fields. The resulting codes possess low rates and capacity-approaching performance, thus representing an appealing solution for spread spectrum communications. The performance of the scheme is investigated on the additive white Gaussian noise channel with coherent and noncoherent detection via density evolution analysis. The proposed codes compare favorably with other low-rate constructions in terms of complexity/performance trade-off. Low error floors and performance close to the sphere packing bound are achieved down to small block sizes ($k = 192$ information bits).

I. INTRODUCTION

Low-rate codes have been widely considered in the context of spread spectrum communications [1], [2]. Some of the most successful and powerful coding schemes are based on Hadamard-Walsh sequences, either for orthogonal modulation [2], [3] or as component codes for concatenated schemes [2], [4]–[6]. For instance, a low-rate coding scheme consisting of the concatenation of an outer rate-1/3 convolutional code with an inner Hadamard code, leading to a coding rate of 1/32, was selected for the uplink of the IS-95(A) standard [2], [3].

Iteratively-decodable codes able to approach the Shannon limit at low coding rates have been introduced in the past [6]–[8]. However, most of them suffer either from high error floors [6], [7] or from visible losses compared with the sphere packing bound (SPB) [9] when the code dimension $k$ is within a few hundred bits [8]. Very low-rate low-density parity-check (LDPC) codes over moderate order fields $\mathbb{F}_q$ (e.g., with $q = 2^m$, $6 \le m \le 8$) possessing decoding thresholds close to channel capacity have been recently proposed in [10]. The codes of [10], also referred to as multiplicative repeat (MR)-LDPC codes, rely on the repetition of the codeword symbols and their multiplication by non-zero coefficients of $\mathbb{F}_q$. They can be described as the serial concatenation of an outer LDPC code over $\mathbb{F}_q$ and binary inner codes with dimension $m$.

In this paper, a novel low-rate scheme is presented based on the concatenation of inner algebraic codes having good distance properties with the recently-introduced non-binary turbo codes of [11] as outer codes, where the inner code dimension $m$ matches the turbo code field order $q = 2^m$. The proposed concatenation can in principle be also applied to ultra-sparse non-binary LDPC codes. The concatenated code performance is first analyzed by means of density evolution (DE) for both coherent and noncoherent detection, showing that the decoding thresholds lie within 0.5 dB from the Shannon limit in the coherent case, for a wide range of coding rates ($1/96 \le R \le 1/3$). Remarkably, a similar result is achieved in the noncoherent detection framework as well.

We then focus on the specific case where the inner code is either an order-$q$ Hadamard code or a first order length-$q$ Reed-Muller (RM) code, due to their simple fast Hadamard transform (FHT)-based decoding algorithms. The proposed scheme can thus be seen either as (i) a serial concatenation of an outer $\mathbb{F}_q$-based turbo code with an inner Hadamard/RM code with antipodal signalling, or (ii) as a coded modulation $\mathbb{F}_q$-based turbo/LDPC code with $q$-ary (bi-) orthogonal modulation. The soft demodulation and the non-binary trellis/check node soft-input soft-output (SISO) blocks can both be efficiently implemented thanks to the order-$q$ FHT. This allows full hardware reuse at the decoder.

The proposed construction performs within 0.8 dB from the SPB in the short block regime over an additive white Gaussian noise (AWGN) channel with coherent detection. Remarkably, low error floors are achieved. We further compare the obtained performance with that of the IS-95(A) standard with iterative demodulation/decoding [12], observing gains of 2 dB or more. We also simulate the concatenated scheme performance on the AWGN channel with noncoherent detection, for which the phase of the channel is assumed to be blockwise constant [13]. Again, a large gain w.r.t. the IS-95(A) standard can be observed.

II. CODE STRUCTURE AND DECODING

We consider an $(n_O, k_O)$ outer code $\mathcal{C}_O$ over $\mathbb{F}_q$, with $q = 2^m$, where $n_O$ and $k_O$ denote the block length and the code dimension in terms of field symbols. The coding rate is given by $R_O = k_O/n_O$. Due to their excellent performance in the short block length regime, we restrict our analysis to outer codes being non-binary turbo codes based on memory-1 recursive convolutional codes [11], for which iterative decoding can be efficiently implemented thanks to FHTs [14] with complexity $O(q \log q)$. The encoder structure is depicted in Figure 1. The information word $\mathbf{u}$ of $k_O$ symbols in $\mathbb{F}_q$ is input to a rate-1, memory-1 time-variant recursive systematic convolutional (RSC) tail-biting encoder. The first set of parity symbols $\mathbf{p}^{(1)}$ is obtained as

$$p_i^{(1)} = g_i^{(1)} u_i + f_i^{(1)} p_{i-1}^{(1)} \quad \forall i \in [0, k_O - 1], \tag{1}$$

with $p_{-1}^{(1)} = p_{k_O-1}^{(1)}$ properly initialized. Here, $p_i^{(1)} \in \mathbb{F}_q$ and $g_i^{(1)}, f_i^{(1)} \in \mathbb{F}_q \setminus \{0\}$. The second set of parity symbols $\mathbf{p}^{(2)}$ is obtained in a similar way after permuting the symbols of $\mathbf{u}$ according to the interleaving rule $i \mapsto \pi(i)$ (for details see [11]). The codeword is given by $\mathbf{w} = [\mathbf{u}|\mathbf{p}^{(1)}|\mathbf{p}^{(2)}]$. The code length is $n_O = 3k_O$ symbols and the coding rate is $R_O = 1/3$.

Fig. 1. Encoder for the non-binary turbo codes of [11].

The non-binary turbo code is serially concatenated [15] with an inner $(n_I, k_I)$ binary linear block code $\mathcal{C}_I$, where both block length and code dimension are expressed in bits. For the proposed concatenated scheme it is assumed that $k_I = m$. The coding rate of the inner code is $R_I = k_I/n_I$. Since each symbol at the output of the outer encoder is mapped onto a codeword of $\mathcal{C}_I$, the overall block length in bits is $n = n_O \times n_I$. The overall rate for the concatenated code $\mathcal{C}$ is $R = R_O \times R_I = k/n$, where $k = k_O \times m$ is the input block size of the outer turbo code in bits. Note that $\mathcal{C}$ is linear upon proper choice of the mapping $\mathcal{C}_I(\beta): \mathbb{F}_q \mapsto \mathcal{C}_I$ between $\mathbb{F}_q$ symbols and codewords in $\mathcal{C}_I$. More specifically, linearity is achieved by multiplying the $m$-bit binary vector representation of the encoded symbols by the generator matrix of the inner code. The minimum distance for the concatenated code $\mathcal{C}$ is $d \ge d_O \times d_I$, where $d_O$ and $d_I$ are the minimum distances of the outer and inner codes.

Let us denote by $\mathbf{w} = [w_0\ w_1\ \ldots\ w_{n_O-1}]$ the $n_O$-symbol codeword of the outer turbo code and by $\mathbf{c}$ the codeword at the output of the concatenated encoder. Then, $\mathbf{c}$ can be partitioned into $n_O$ segments of $n_I$ bits each, $\mathbf{c} = [\mathbf{c}_0\ \mathbf{c}_1\ \ldots\ \mathbf{c}_{n_O-1}]$, where the generic $t$-th segment $\mathbf{c}_t$ is associated with the $t$-th symbol at the output of the turbo encoder. Clearly, $\mathbf{c}_t \in \mathcal{C}_I$. The bits of $\mathbf{c}$ are finally mapped onto a binary antipodal modulation, producing the modulated vector $\mathbf{x} = 1 - 2\mathbf{c}$. As for $\mathbf{c}$, also $\mathbf{x}$ is partitioned into $n_O$ segments, $\mathbf{x} = [\mathbf{x}_0\ \mathbf{x}_1\ \ldots\ \mathbf{x}_{n_O-1}]$.

A. Coherent Detection

We first consider transmission over the AWGN channel under the assumption of coherent detection. The received signal $\mathbf{y}$ is given by

$$\mathbf{y} = \mathbf{x} + \mathbf{n}$$

with $n_i \sim \mathcal{N}(0, \sigma^2)$. Also for $\mathbf{y}$ and $\mathbf{n}$ we adopt the same partitioning of $\mathbf{c}$ and $\mathbf{x}$, with $\mathbf{y}_t$ denoting the channel output for $\mathbf{x}_t$ and $\mathbf{n}_t$ the corresponding noise samples. Decoding is performed in two stages. For each received segment $\mathbf{y}_t$, the conditional probability mass function (PMF) $P_t(\beta)$ is evaluated for each $\beta \in \mathbb{F}_q$, where

$$P_t(\beta) := \Pr\{w_t = \beta \,|\, \mathbf{y}_t\} \tag{2}$$

represents the probability that the symbol associated with $\mathbf{y}_t$ is $\beta$, given the observation of $\mathbf{y}_t$. Due to the mapping $\mathcal{C}_I(\beta): \mathbb{F}_q \mapsto \mathcal{C}_I$, (2) can be rewritten as $P_t(\beta) = \Pr\{\mathbf{x}_t = \mathbf{x}_t^{(\beta)} \,|\, \mathbf{y}_t\}$, where $\mathbf{x}_t^{(\beta)} = 1 - 2\mathbf{c}_t^{(\beta)}$ and $\mathbf{c}_t^{(\beta)} = \mathcal{C}_I(\beta)$. Under the assumption of $\beta$ uniformly distributed over $\mathbb{F}_q$, the PMF $P_t(\beta)$ fulfills

$$P_t(\beta) \propto \exp\left(\frac{1}{\sigma^2} \langle \mathbf{x}_t^{(\beta)}, \mathbf{y}_t \rangle\right), \tag{3}$$

where $\langle \mathbf{x}_t^{(\beta)}, \mathbf{y}_t \rangle$ denotes the inner product (correlation) between $\mathbf{x}_t^{(\beta)}$ and $\mathbf{y}_t$. The PMFs $P_t(\beta)$, $t = 0, \ldots, n_O - 1$, are then input to the iterative decoder operating on the factor graph for the outer turbo code. Note that usually, when binary outer codes are employed, a marginalization is performed after computing the channel conditional probabilities to derive bit-wise probabilities [12]. This leads to a loss of information that may be partially recovered by iterating decoding between the inner and the outer code [12]. Alternatively, symbol-based decoding of the outer code may be performed by merging sections of its trellis representation [16], thus avoiding the need of marginalizing the probabilities $P_t(\beta)$. This allows skipping the iteration between the outer and the inner decoder, at the expense of a higher outer code decoding complexity. We will see that the proposed concatenated scheme, despite working symbol-wise, allows keeping a relatively simple outer decoder, with a complexity lower than that of [16].

B. Noncoherent Detection

We consider next a blockwise noncoherent channel with AWGN [13]. We assume the phase to be constant over blocks of $n_I$ channel bits, i.e., over each inner codeword. The received signal associated with the $t$-th turbo code symbol is

$$\mathbf{y}_t = \mathbf{x}_t e^{j\theta_t} + \mathbf{n}_t.$$

The noise samples are modeled as complex, circularly-symmetric Gaussian random variables, $\mathcal{CN}(0, 2\sigma^2)$, and $\theta_t$ is uniformly distributed, $\theta_t \sim \mathcal{U}[0, 2\pi[$. We further assume the phases of different blocks to be independent. For each $\mathbf{y}_t$, the conditional PMF $P_t(\beta)$ is evaluated for each $\beta \in \mathbb{F}_q$. Due to the mapping $\mathcal{C}_I(\beta): \mathbb{F}_q \mapsto \mathcal{C}_I$ and averaging over the distribution of $\theta_t$, (2) can be rewritten as

$$P_t(\beta) = \frac{1}{2\pi} \int_0^{2\pi} \Pr\{\mathbf{x}_t^{(\beta)} \,|\, \mathbf{y}_t, \theta_t\}\, d\theta_t. \tag{4}$$

Under the assumption of $\beta$ uniformly distributed over $\mathbb{F}_q$, the PMF $P_t(\beta)$ can be easily evaluated as [12]

$$P_t(\beta) \propto I_0\left(\frac{1}{\sigma^2} \left|\langle \mathbf{x}_t^{(\beta)}, \mathbf{y}_t \rangle\right|\right), \tag{5}$$

where $I_0$ is the modified Bessel function of the first kind and order zero.

III. CODE DESIGN AND DENSITY EVOLUTION ANALYSIS

For a given target coding rate, the choice of the inner code shall be based on the concatenated code minimum distance and on the iterative decoding threshold of the overall scheme.
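As a quick check of the rate bookkeeping of Section II ($k = k_O \times m$, $n = n_O \times n_I$, $R = R_O \times R_I = k/n$), the following minimal Python sketch computes the concatenated code parameters for a given inner code length. The function name `concatenated_parameters` is ours and purely illustrative; the example inner codes are taken from the legend of Figure 2 and the block size $k = 192$ bits from the abstract.

```python
from fractions import Fraction


def concatenated_parameters(k_bits, m, n_inner, outer_rate=Fraction(1, 3)):
    """Return (k_O, n_O, n, R) for an outer code over F_{2^m} with rate
    `outer_rate`, concatenated with an (n_inner, m) binary inner code.

    Illustrative helper (not from the paper): it only applies
    k = k_O * m, n = n_O * n_I and R = k / n.
    """
    if k_bits % m:
        raise ValueError("k must be a multiple of m (k = k_O * m)")
    k_outer = k_bits // m                      # information symbols over F_q
    n_outer = Fraction(k_outer) / outer_rate   # outer block length in symbols
    if n_outer.denominator != 1:
        raise ValueError("outer block length is not an integer")
    n_outer = int(n_outer)
    n_bits = n_outer * n_inner                 # each symbol -> one inner codeword
    return k_outer, n_outer, n_bits, Fraction(k_bits, n_bits)


if __name__ == "__main__":
    examples = [
        ("F256, no inner code", 8, 8),
        ("F64,  (64, 6) Hadamard", 6, 64),
        ("F256, (128, 8) Reed-Muller", 8, 128),
        ("F256, (256, 8) Hadamard", 8, 256),
    ]
    for label, m, n_inner in examples:
        k_o, n_o, n, rate = concatenated_parameters(192, m, n_inner)
        print(f"{label}: k_O={k_o}, n_O={n_o}, n={n}, R={rate}")
```

Treating the uncoded case as an inner "code" with $n_I = m$ is a convenience of this sketch; it reproduces $R = R_O = 1/3$.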
To achieve large minimum distances, inner codes with good distance properties will be considered. Next we provide a DE analysis based on the Monte Carlo method for selected combinations of outer and inner codes. The decoding thresholds are evaluated in terms of $E_b/N_0$, where $E_b$ denotes the energy per information bit and $N_0$ is the one-sided noise power spectral density. The inner codes that have been selected for the DE analysis are Hamming, Hadamard and RM codes. In some cases, shortening has been applied to fit the inner code dimension to the outer code field size. We additionally considered a large minimum distance code with dimension tailored to the turbo code field $\mathbb{F}_{256}$, i.e., the (33, 8) code of [17] with minimum distance 14.

[Figure 2: coding rate $R$ (logarithmic axis) versus $E_b/N_0$ [dB]. Legend: capacity (per dimension), coherent detection; decoding threshold, MR-LDPC, $\mathbb{F}_{256}$ (coherent); decoding threshold, MR-LDPC, $\mathbb{F}_{256}$ (noncoherent). Labeled points: $\mathbb{F}_{256}$, no inner code; $\mathbb{F}_{256}$, (12, 8) S. Hamming; $\mathbb{F}_{256}$, (33, 8, 14) code; $\mathbb{F}_{256}$, (64, 7) Reed-Muller; $\mathbb{F}_{64}$, (64, 6) Hadamard; $\mathbb{F}_{256}$, (128, 8) Reed-Muller; $\mathbb{F}_{256}$, (256, 8) Hadamard.]

Fig. 2. Iterative decoding thresholds for various concatenations. The '◦' marker denotes the coherent detection case, whereas the other marker refers to the noncoherent detection case. The decoding thresholds for the MR-LDPC ensemble over $\mathbb{F}_{256}$ are provided as reference.

Figure 2 reports the decoding thresholds for different concatenations, under coherent and noncoherent detection. For the coherent detection case, the unconstrained-input channel capacity is also depicted. Considering an outer turbo code over $\mathbb{F}_{256}$, code rates from 1/3 to 1/96 are obtained by selecting different inner codes (no inner code is used for $R = 1/3$). As can be seen, the gap to capacity is within ∼0.5 dB. For comparison, the decoding thresholds of the MR-LDPC codes of [10] are also provided for different coding rates. The results nearly match those of the concatenated scheme. However, as will be discussed in the next section, when restricting to Hadamard and RM inner codes, the decoding complexity of the proposed solution is lower than that of [10]. We also provide a result on an $\mathbb{F}_{64}$ turbo code in concatenation with the (64, 6) Hadamard code. We can observe that by lowering the field order $q$ the gap to capacity increases.

A similar analysis has been performed for the noncoherent detection case. As Figure 2 reports, the iterative decoding thresholds are shifted by ∼2.4 dB for all codes in comparison with the coherent detection case. Only the $\mathbb{F}_{64}$ turbo code in concatenation with the (64, 6) Hadamard code is affected by a larger loss, of around 2.7 dB. Nevertheless, the results are quite promising considering that no phase information is available at the receiver. Note that the decoding threshold under noncoherent detection for the turbo code ensemble over $\mathbb{F}_{256}$ is at $E_b/N_0 \simeq 2.28$ dB. We computed the normalized average mutual information (AMI) $I_N$ for a blockwise noncoherent channel with phase constant over blocks of $n_I = 8$ symbols, with the constraint of antipodal mapping,

$$I_N := \frac{I(\mathbf{x}_t; \mathbf{y}_t)}{n_I} = \frac{1}{n_I}\, \mathbb{E}\left[\log_2 \frac{p(\mathbf{y}_t|\mathbf{x}_t)}{p(\mathbf{y}_t)}\right], \tag{6}$$

where

$$p(\mathbf{y}_t|\mathbf{x}_t) = \frac{1}{(2\pi\sigma^2)^{n_I}} \exp\left(-\frac{\|\mathbf{x}_t\|^2}{2\sigma^2} - \frac{\|\mathbf{y}_t\|^2}{2\sigma^2}\right) I_0\left(\frac{1}{\sigma^2} \left|\langle \mathbf{x}_t, \mathbf{y}_t \rangle\right|\right).$$

Under the assumption of $\mathbf{x}_t$ distributed uniformly over $\{\pm 1\}^{n_I}$, $I_N$ equals $R = 1/3$ for $E_b/N_0 \simeq 2.06$ dB. Remarkably, the rate-1/3 scheme possesses a decoding threshold that is only ∼0.22 dB away from the limit given by (6). Similarly, the limit provided by (6) for coding rate $R = 1/24$ and $n_I = 64$ is at $E_b/N_0 \simeq 0.84$ dB, while the decoding threshold for the rate $R = 1/24$ scheme in Figure 2 is at $E_b/N_0 \simeq 1.42$ dB, only about 0.6 dB away.

IV. TURBO CODES OVER HIGH ORDER FIELDS WITH ORTHOGONAL AND BI-ORTHOGONAL MODULATION

We focus on the concatenation with Hadamard and first order RM codes for two compelling reasons. First, they achieve low coding rates with performance close to capacity, as emphasized by the DE analysis. Second, Hadamard and first order RM codes can be decoded via FHT. This allows an efficient implementation of the inner decoder, also with the possibility of reusing the hardware employed by the turbo decoder. Hadamard codes and first order RM codes with antipodal modulation are examples of orthogonal and bi-orthogonal codes, respectively [3]. Therefore, they lead to the same error probabilities as any other (bi-) orthogonal signal set of the same order, such as pulse position modulation (PPM) and bi-orthogonal PPM. In the following we will refer to Hadamard and first order RM codes as orthogonal and bi-orthogonal codes to emphasize that the achieved results hold in general when (bi-) orthogonal modulations are used. A derivation of the decoding complexity of the proposed scheme is provided next, followed by a discussion on how the overall coding rate of the scheme can be flexibly adjusted.

A. On the Decoding Complexity

The antipodal representation of an order-$2^m$ Hadamard code can be obtained as follows. Starting from the order-2 Hadamard matrix,

$$\mathbf{H}_2 = \begin{bmatrix} +1 & +1 \\ +1 & -1 \end{bmatrix}, \tag{7}$$

the order-$2^m$ Sylvester-type Hadamard matrix is obtained by iterating the Kronecker product [18, Ch. 14]

$$\mathbf{H}_{2^m} = \mathbf{H}_2 \otimes \mathbf{H}_{2^{m-1}}. \tag{8}$$

The order-$2^m$ Hadamard code modulated sequences correspond to the rows of $\mathbf{H}_{2^m}$. The first order, length-$2^m$ RM code modulated sequences correspond to the rows of the matrix [...]

[...] match the outer code field order with the inner code dimension. Hence, $k_P = m$ and $n_P = k_I$. By selecting a precode with rate $R_P = k_P/n_P > 1$, the inner code can be a Hadamard/RM code of dimension $k_I$ [...]
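The link between the Sylvester construction in (7)-(8) and the soft demodulation rule (3) can be illustrated with a short sketch. If the modulated inner codeword for symbol $\beta$ is taken to be row $\beta$ of $\mathbf{H}_{2^m}$ (a labeling we assume here for illustration, since the mapping between field symbols and rows is not specified above), then all $q$ correlations $\langle \mathbf{x}_t^{(\beta)}, \mathbf{y}_t \rangle$ are obtained with a single FHT using $q \log_2 q$ additions and subtractions. The code below is plain Python, covers only the orthogonal (Hadamard) case with coherent detection, and all function names are ours.

```python
import math


def sylvester_hadamard(m):
    """Order-2^m Sylvester-type Hadamard matrix, built by iterating
    H_{2^m} = H_2 (Kronecker) H_{2^{m-1}} as in (8)."""
    h = [[1]]
    for _ in range(m):
        # [[H, H], [H, -H]] is the Kronecker product H_2 (x) H
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def fht(y):
    """Fast Hadamard transform: returns H_{2^m} * y (natural ordering)
    using q*log2(q) additions/subtractions instead of q^2."""
    out = list(y)
    q = len(out)
    if q == 0 or q & (q - 1):
        raise ValueError("length must be a power of two")
    step = 1
    while step < q:
        for start in range(0, q, 2 * step):
            for i in range(start, start + step):
                a, b = out[i], out[i + step]
                out[i], out[i + step] = a + b, a - b   # butterfly
        step *= 2
    return out


def symbol_pmf_coherent(y, sigma2):
    """Normalized PMF P_t(beta) proportional to exp(<x^(beta), y> / sigma^2),
    cf. (3), assuming x^(beta) is row beta of the Hadamard matrix
    (illustrative labeling)."""
    corr = fht(y)                      # all q correlations at once
    peak = max(corr)                   # subtract the maximum for numerical stability
    weights = [math.exp((c - peak) / sigma2) for c in corr]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    H = sylvester_hadamard(3)
    y = [float(v) for v in H[5]]       # noiseless observation of row 5
    pmf = symbol_pmf_coherent(y, sigma2=0.5)
    print(max(range(len(pmf)), key=pmf.__getitem__))
```

For the bi-orthogonal case the negated correlations would have to be considered as well; that extension is not shown here.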

[Figure 3: CER/BER versus $E_b/N_0$ [dB].]

Fig. 3. CER and BER for low-rate turbo codes over $\mathbb{F}_{64}$ and $\mathbb{F}_{256}$ with orthogonal modulation and coherent detection. The BER of the IS-95(A) up-link scheme is provided as reference. Information block of 192 bits.

[Figure 4: CER/BER versus $E_b/N_0$ [dB]. Legend: $\mathbb{F}_{256}$ TC, CER; $\mathbb{F}_{256}$ TC, BER; $\mathbb{F}_{64}$ TC, CER; $\mathbb{F}_{64}$ TC, BER; BER, IS95, MAP inner; BER, IS95, IT.]

Fig. 4. CER and BER for low-rate turbo codes over $\mathbb{F}_{64}$ and $\mathbb{F}_{256}$ with orthogonal modulation and noncoherent detection. The BER of the IS-95(A) up-link scheme is provided as reference. Information block of 192 bits.

[Figure 5: CER versus $E_b/N_0$ [dB], coherent and noncoherent detection, $R$ = 1/96, 1/48, 1/32, 1/24, 1/15.]

Fig. 5. CER for several low-rate codes over $\mathbb{F}_{64}$ and $\mathbb{F}_{256}$, with information block of 192 bits. Coherent and noncoherent detection.

VI. CONCLUSIONS

This paper investigates the design of non-binary turbo codes in concatenation with inner linear block codes for very low coding rates. A DE analysis has been provided for coherent and noncoherent detection, showing decoding thresholds close to the Shannon limit. When the inner codes are chosen to be Hadamard or first order RM codes, a simple decoder implementation is possible, which employs FHTs for decoding both the inner and the outer code. Codeword error rates within 0.8 dB of the SPB have been obtained for the proposed schemes, while no floors have been detected down to error rates as low as CER $\simeq 10^{-6}$.

REFERENCES

[1] A. J. Viterbi, “Spread spectrum communications–Myths and realities,” IEEE Commun. Mag., vol. 17, no. 3, pp. 11–18, 1979.
[2] A. J. Viterbi, CDMA: Principles of Spread Spectrum Communication. Redwood City, CA, USA: Addison Wesley Longman Publishing, 1995.
[3] R. Gallager, Principles of digital communication. Cambridge University Press, 2008.
[4] A. J. Viterbi, “Orthogonal tree codes for communication in the presence of white Gaussian noise,” IEEE Trans. Commun., vol. 15, no. 2, pp. 238–242, 1967.
[5] A. J. Viterbi, “Very low rate convolution codes for maximum theoretical performance of spread-spectrum multiple-access channels,” IEEE J. Sel. Areas Commun., vol. 8, no. 4, pp. 641–649, May 1990.
[6] L. Ping, W. Leung, and K. Wu, “Low-rate turbo-Hadamard codes,” IEEE Trans. Inf. Theory, vol. 49, no. 12, pp. 3213–3224, 2003.
[7] P. Komulainen and K. Pehkonen, “Performance evaluation of superorthogonal turbo codes in AWGN and flat Rayleigh fading channels,” IEEE J. Sel. Areas Commun., vol. 16, no. 2, pp. 196–205, 1998.
[8] G. Liva, W. E. Ryan, and M. Chiani, “Quasi-cyclic generalized LDPC codes with low error floors,” IEEE Trans. Commun., vol. 56, no. 1, pp. 49–57, Jan. 2008.
[9] C. Shannon, “Probability of error for optimal codes in a Gaussian channel,” Bell System Tech. J., vol. 38, pp. 611–656, 1959.
[10] K. Kasai, D. Declercq, C. Poulliat, and K. Sakaniwa, “Multiplicatively repeated nonbinary LDPC codes,” IEEE Trans. Inf. Theory, vol. 57, no. 10, pp. 6788–6795, 2011.
[11] G. Liva, E. Paolini, S. Scalise, and M. Chiani, “Turbo codes based on time-variant memory-1 convolutional codes over Fq,” in Proc. IEEE Int. Conf. on Communications, Kyoto, Japan, Jun. 2011.
[12] R. Herzog, A. Schmidbauer, and J. Hagenauer, “Iterative decoding and despreading improves CDMA-systems using M-ary orthogonal modulation and FEC,” in Proc. IEEE Int. Conf. on Communications, Montreal, Canada, 1997, pp. 909–913.
[13] M. Peleg and S. Shamai, “On the capacity of the blockwise incoherent MPSK channel,” IEEE Trans. Commun., vol. 46, no. 5, pp. 603–609, May 1998.
[14] J. Berkmann and C. Weiss, “On dualizing trellis-based APP decoding algorithms,” IEEE Trans. Commun., vol. 50, no. 11, pp. 1743–1757, Nov. 2002.
[15] G. D. Forney Jr., Concatenated Codes. Cambridge, MA: M.I.T. Press, 1966.
[16] M. Bingeman and A. Khandani, “Symbol-based turbo codes,” IEEE Commun. Lett., vol. 3, no. 10, pp. 285–287, 1999.
[17] T. Helleseth and Ø. Ytrehus, “How to find a [33,8,14] code,” Univ. of Bergen, Bergen, Norway, Report in Informatics 41, 1989.
[18] F. MacWilliams and N. Sloane, The Theory of Error-Correcting Codes. North Holland Mathematical Library, 1977.