On the Bit Error Rate of Repeated Error-Correcting Codes
Weihao Gao and Yury Polyanskiy

W.G. is with the Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, 100084 China (e-mail: [email protected]). Y.P. is with the Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA 02139 USA (e-mail: [email protected]). This material is based upon work supported by the National Science Foundation under Grant No. CCF-13-18620.

Abstract—Classically, error-correcting codes are studied with respect to performance metrics such as minimum distance (combinatorial) or probability of bit/block error over a given stochastic channel. In this paper, a different metric is considered. It is assumed that the block code is used to repeatedly encode user data. The resulting stream is subject to adversarial noise of given power, and the decoder is required to reproduce the data with the minimal possible bit-error rate. This setup may be viewed as combinatorial joint source-channel coding. Two basic results are shown for the achievable noise-distortion tradeoff: the optimal performance for decoders that are informed of the noise power, and global bounds for decoders operating in complete oblivion (with respect to the noise level). The general results are applied to the Hamming $[7,4,3]$ code, for which it is demonstrated (among other things) that no oblivious decoder exists that attains optimality for all noise levels simultaneously.

I. INTRODUCTION

Suppose a very large chunk of data is encoded via a fixed error-correcting block code whose block length is significantly smaller than the total data volume. The data in turn is affected by noise of a high level, thus not permitting the errors to be corrected perfectly. What is the best achievable tradeoff between the noise level and the (post-decoding) bit-error rate?

Such a situation may arise, for example, in the forensic analysis of a severely damaged optical, magnetic or flash drive. We note that there are two different scenarios, depending on whether the noise level $\delta \in [0,1]$ (the fraction of bits flipped) is known to the decoder or not. The second case presents an additional challenge, as a priori it is not clear whether a given error-correcting code admits a universal decoder that is simultaneously optimal for all noise levels (in the sense of minimizing the bit-error rate).

In this paper we characterize the tradeoffs for both cases. The general theory is applied to the example of the Hamming $[7,4,3]$ code, uncovering the following basic effects:
1) The known converse bound ($r_0^{**}$ in [1]) is not tight.
2) No single decoder is (even asymptotically) optimal for all $\delta$. In particular, there does not exist a decoder achieving $r_0^{**}$ at all points.
3) For the (practical) case of small $\delta$, the optimal decoder is not the minimum-distance one.

We emphasize that the last observation suggests that conventional decoders of block codes should not be used in cases of significant defect densities.

We proceed to discussing the basic framework and some known results.

A. General Setting of Joint Source-Channel Coding

The aforementioned problem may alternatively be formalized as combinatorial (or adversarial) joint source-channel coding (JSCC), as proposed in [1]. The gist of it for the binary source and symmetric channel (BSSC) can be summarized by the following definition.

Definition 1: Consider a pair of maps $f : \mathbb{F}_2^k \to \mathbb{F}_2^n$ (encoder) and $g : \mathbb{F}_2^n \to \mathbb{F}_2^k$ (decoder). The distortion-noise tradeoff is the non-decreasing right-continuous function

$$D(f, g; \delta) \triangleq \frac{1}{k} \max_{e : |e| \le \delta n}\; \max_{x \in \mathbb{F}_2^k} |x + g(f(x) + e)|, \qquad \delta \in [0, 1],$$

where $|\cdot|$ denotes the Hamming weight. The tradeoff for the optimal decoder is denoted

$$D(f; \delta) \triangleq \min_g D(f, g; \delta), \qquad \delta \in [0, 1].$$

A pair $(f, g)$ is called a $(k, n, D, \delta)$-JSCC if $D(f, g; \delta) \le D$.

Note that the definition characterizes $D(f; \delta)$ as the smallest distortion attainable for a given encoder, provided the decoder knows $\delta$ and can adapt to it. Shortly, we will also address the case when $\delta$ is unknown to the decoder (see the concept of an asymptotic decoder curve below).
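To make Definition 1 concrete, the following brute-force sketch (an illustration added here; the 3-fold repetition encoder and majority-vote decoder are toy choices, not examples from the paper) evaluates $D(f, g; \delta)$ at $\delta = t/n$ by enumerating all messages and all error patterns of weight at most $t$:

```python
from itertools import combinations

def hamming_weight(x):
    return bin(x).count("1")

# Toy small code (illustrative choice): k = 1, n = 3 repetition.
k, n = 1, 3

def f(x):   # encoder F_2^1 -> F_2^3: repeat the single bit three times
    return 0b111 if x else 0b000

def g(y):   # decoder F_2^3 -> F_2^1: majority vote
    return 1 if hamming_weight(y) >= 2 else 0

def distortion(f, g, k, n, t):
    """Brute-force D(f, g; delta) of Definition 1 at delta = t/n:
    (1/k) max over messages x and error patterns |e| <= t of |x + g(f(x)+e)|."""
    worst = 0
    for x in range(2 ** k):
        for w in range(t + 1):                     # all error weights up to t
            for pos in combinations(range(n), w):  # all patterns of that weight
                e = sum(1 << p for p in pos)
                # |x + g(f(x) + e)| over F_2 is the weight of the XOR
                worst = max(worst, hamming_weight(x ^ g(f(x) ^ e)))
    return worst / k

for t in range(n + 1):
    print(f"delta = {t}/{n}: D = {distortion(f, g, k, n, t)}")
# Majority voting corrects any single flip (D = 0 at delta = 1/3),
# but two flips already invert the vote (D = 1 at delta = 2/3).
```

The same exhaustive scan is feasible for any small code, since only $2^k$ messages and $\sum_{w \le t} \binom{n}{w}$ error patterns need to be examined.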
In this paper we focus on a particular special case of encoders obtained via repetition of a single "small code", cf. [1]. Formally, fix an arbitrary encoder given by the mapping $f : \mathbb{F}_2^u \to \mathbb{F}_2^v$ (a small code). If there are at most $t$ errors in a block of length $v$, $t \in [0, v]$, the performance of the optimal decoder (knowing $t$) is given by the non-decreasing right-continuous function

$$r_0(t) \triangleq \max_{y \in \mathbb{F}_2^v} \mathrm{rad}\!\left(f^{-1} B_v(y, t)\right), \qquad (1)$$

where

$$B_n(x, \alpha) \triangleq \{x' \in \mathbb{F}_2^n : |x' - x| \le \alpha\}$$

is a Hamming ball of (possibly non-integral) radius $\alpha$ and

$$\mathrm{rad}(S) = \min_{x \in \mathbb{F}_2^n} \max_{y \in S} |y - x|$$

is the radius of the smallest Hamming ball enclosing the set $S$. Consider also an arbitrary decoder $g : \mathbb{F}_2^v \to \mathbb{F}_2^u$ and its performance curve

$$r_g(t) \triangleq \max_{|e| \le t}\; \max_{x \in \mathbb{F}_2^u} |g(f(x) + e) + x|. \qquad (2)$$

Clearly $r_g(t) \ge r_0(t)$.

From a given code $f$ we may construct a longer code $f^{\oplus L}$ by repetition, obtaining an $\mathbb{F}_2^k \to \mathbb{F}_2^n$ code, where $Lu = k$ and $Lv = n$:

$$f^{\oplus L}(x_1, \ldots, x_L) \triangleq (f(x_1), \ldots, f(x_L)).$$

This yields a sequence of codes with bandwidth expansion factor $\rho = \frac{n}{k} = \frac{v}{u}$. We want to find the achieved distortion $D(\delta)$ as a function of the maximum crossover fraction $\delta$ of the adversarial channel.

Theorem 1 ([1]): The asymptotic distortion achievable by the repetition construction satisfies

$$\liminf_{L \to \infty} D(f^{\oplus L}; \delta) \ge \frac{1}{u}\, r_0^{**}(\delta v). \qquad (3)$$

A block-by-block decoder $g^{\oplus L}$ achieves

$$\lim_{L \to \infty} D(f^{\oplus L}, g^{\oplus L}; \delta) = \frac{1}{u}\, r_g^{**}(\delta v), \qquad (4)$$

where $r_0^{**}$ and $r_g^{**}$ are the upper concave envelopes of $r_0$ and $r_g$, respectively.
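Equation (1) is directly computable for any small code by exhaustive search. The sketch below (an added illustration; the generator matrix is one standard choice for the Hamming $[7,4,3]$ code, which the text does not fix) evaluates $r_0(t)$ at integer $t$ by forming each preimage $f^{-1} B_v(y, t)$ and taking its Chebyshev radius:

```python
import numpy as np
from itertools import product

# One standard generator matrix for the Hamming [7,4,3] code; this is an
# assumption -- the paper does not fix a particular generator.
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
u, v = 4, 7

msgs = [np.array(m) for m in product([0, 1], repeat=u)]
codewords = [(m @ G) % 2 for m in msgs]

def rad(S):
    """Chebyshev radius in Hamming space: min over centers c in F_2^u
    of max over s in S of |s - c|."""
    return min(max(int(np.sum(s ^ np.array(c))) for s in S)
               for c in product([0, 1], repeat=u))

def r0(t):
    """Equation (1) at integer t: max over y of rad(f^{-1} B_v(y, t))."""
    worst = 0
    for y in product([0, 1], repeat=v):
        y = np.array(y)
        # f^{-1} B_v(y, t): messages whose codeword lies within distance t of y
        S = [m for m, c in zip(msgs, codewords) if int(np.sum(c ^ y)) <= t]
        if S:                       # the preimage may be empty when t = 0
            worst = max(worst, rad(S))
    return worst

print([r0(t) for t in range(v + 1)])
# Since the code corrects one error, r0(0) = r0(1) = 0; the curve then grows
# with t and saturates at u once the preimage covers all messages.
```

Because the Hamming code is perfect with minimum distance 3, every $y$ has a unique codeword within distance 1, which is why the computed curve starts with $r_0(0) = r_0(1) = 0$.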
Below we extend and refine these prior results. Namely, in Section II we show how to compute the limit in (3) exactly (correcting a previous version in [2]). In Section III we present upper and lower bounds for the case of $\delta$ not known at the decoder. Finally, in Section IV we demonstrate our findings on the example of the (repetition of the) Hamming $[7,4,3]$ code.

II. DECODER KNOWS $\delta$

A. Optimal performance curve: correction to [2]

The asymptotic performance of the repetition construction is given by:

Theorem 2: Fix a small code $f : \mathbb{F}_2^u \to \mathbb{F}_2^v$ and consider the repetition construction. The limit

$$D(f^{\oplus\infty}; \delta) \triangleq \lim_{L \to \infty} D(f^{\oplus L}; \delta) \qquad (5)$$

exists and is a non-negative concave continuous function of $\delta \in [0, 1]$ given by

$$D(f^{\oplus\infty}; \delta) = \max_{P_Y} \min_{P_{\hat S | Y}} \max_{P_{S | Y, \hat S} : (*)} \frac{1}{u}\, \mathbb{E}\left[|S - \hat S|\right], \qquad (6)$$

where $P_Y$ ranges over all distributions on $\mathbb{F}_2^v$, $P_{\hat S | Y}$ ranges over all Markov kernels $\mathbb{F}_2^v \to \mathbb{F}_2^u$, and $P_{S | Y, \hat S}$ ranges over Markov kernels $\mathbb{F}_2^v \times \mathbb{F}_2^u \to \mathbb{F}_2^u$ satisfying

$$(*) \qquad \mathbb{E}\left[|f(S) - Y|\right] \le \delta v.$$

Note that once existence of the limit is proven, concavity follows immediately. Indeed, for any $L_1 + L_2 = L$, integers $s_i$ and $y_i \in \mathbb{F}_2^{v L_i}$ with $i = 1, 2$, we have

$$B_{vL}(y_1 \oplus y_2,\, s_1 + s_2) \supset B_{vL_1}(y_1, s_1) \oplus B_{vL_2}(y_2, s_2).$$

Applying $(f^{\oplus L})^{-1}$ and taking $\mathrm{rad}$, we get from additivity of the radius [3, Section II]:

$$D\!\left(f^{\oplus(L_1+L_2)}; \frac{s_1 + s_2}{L_1 + L_2}\right) \ge \frac{L_1}{L_1 + L_2}\, D\!\left(f^{\oplus L_1}; \frac{s_1}{L_1}\right) + \frac{L_2}{L_1 + L_2}\, D\!\left(f^{\oplus L_2}; \frac{s_2}{L_2}\right).$$

Since the $s_i$ are arbitrary, by taking the limit $L \to \infty$ of both sides, concavity of $D(f^{\oplus\infty}; \delta)$ follows. Concavity in turn implies continuity.

We complete the proof by showing existence of the limit and formula (6). To that end, first we expand the definition of radius in (7). Second, we represent vectors in $\mathbb{F}_2^{vL}$ as $\mathbb{F}_2^v$-valued vectors of length $L$, and similarly for $\mathbb{F}_2^{uL}$. Then the expression entirely equivalent to (7) is the following:

$$D(f^{\oplus L}; \delta) = \max_{P_Y} \min_{P_{\hat S | Y}} \max_{P_{S | Y, \hat S} : (*)} \frac{1}{u}\, \mathbb{E}\left[|S - \hat S|\right], \qquad (8)$$

with the optimizations satisfying the same constraints as in (6), with the following additions:
1) $P_Y(b) \in \frac{1}{L}\mathbb{Z}$ for every $b \in \mathbb{F}_2^v$;
2) $P_{\hat S | Y}(a | b) \in \frac{1}{L P_Y(b)}\mathbb{Z}$ for every $a \in \mathbb{F}_2^u$;
3) $P_{S | Y, \hat S}(a' | b, a) \in \frac{1}{L P_Y(b) P_{\hat S | Y}(a | b)}\mathbb{Z}$ for every $a' \in \mathbb{F}_2^u$.

Note that since the expectations appearing in constraint $(*)$ and in (6) are continuous functions of $P_Y$, $P_{\hat S | Y}$ and $P_{S | Y, \hat S}$, we may additionally impose the constraint $P_{\hat S Y}(a, b) \ge \frac{1}{\sqrt{L}}$. This guarantees that in the integrality constraint 3) the denominator is a large integer for each $(a, b)$. Consequently, an arbitrary kernel $P_{S | Y, \hat S}$ can be approximated with precision of order $\frac{1}{\sqrt{L}}$ by kernels satisfying constraint 3). Hence, in the limit as $L \to \infty$ the inner maximization in (8) can be performed without verifying integrality condition 3). A similar argument applies to $P_{\hat S | Y}$ and $P_Y$. Overall, this is a standard exercise in approximating joint distributions by $L$-types; see [4, Chapter 1].

III. DECODER DOES NOT KNOW $\delta$

A. Asymptotic decoder curves

Definition 2: A non-decreasing right-continuous function $r : [0, v] \to [0, u]$ is called an asymptotic decoder curve (a.d.c.) for a given small code $f : \mathbb{F}_2^u \to \mathbb{F}_2^v$ if there exists a sequence of integers $L_j$ and of decoders $g_j : \mathbb{F}_2^{v L_j} \to \mathbb{F}_2^{u L_j}$ such that

$$D\!\left(f^{\oplus L_j}, g_j; \frac{t}{v}\right) \to \frac{1}{u}\, r(t) \qquad (9)$$

for all $t \in [0, v]$ that are points of continuity of $r$.
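By (4), every fixed small-code decoder $g$, applied block by block, realizes the a.d.c. $r = r_g^{**}$ in (9). As a final illustration (again an added sketch, reusing the assumed generator matrix from above), the code below computes $r_g$ for a minimum-distance decoder with deterministic tie-breaking and extracts the vertices of its upper concave envelope; comparing the result with $r_0$ from the previous sketch is one numerical way to probe effect 3 from the Introduction:

```python
import numpy as np
from itertools import product

# Same assumed Hamming [7,4,3] setup as in the previous sketch.
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
u, v = 4, 7
msgs = [np.array(m) for m in product([0, 1], repeat=u)]
codewords = [(m @ G) % 2 for m in msgs]

def g_md(y):
    """Minimum-distance decoder (ties broken by message order)."""
    return msgs[int(np.argmin([np.sum(c ^ y) for c in codewords]))]

def r_g(t, g):
    """Equation (2): max over |e| <= t and messages x of |g(f(x)+e) + x|."""
    worst = 0
    for x, c in zip(msgs, codewords):
        for e in product([0, 1], repeat=v):
            e = np.array(e)
            if e.sum() <= t:
                worst = max(worst, int(np.sum(g(c ^ e) ^ x)))
    return worst

def concave_envelope_vertices(points):
    """Vertices of the upper concave envelope of (t, r) points,
    i.e. the upper convex hull (monotone-chain scan)."""
    hull = []
    for p in sorted(points):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # drop the middle point if it lies on or below the chord
            if (y2 - y1) * (p[0] - x1) <= (p[1] - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

curve = [(t, r_g(t, g_md)) for t in range(v + 1)]
print("r_g :", curve)
print("r_g** vertices:", concave_envelope_vertices(curve))
```

The envelope vertices suffice to describe $r_g^{**}$, since the upper concave envelope of a finite point set is piecewise linear between the vertices of its upper convex hull.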