IN PREPARATION FOR SUBMISSION TO IEEE TRANS. ON COMMUNICATIONS, DEC. 2011

Finding the Exhaustive List of Small Fully Absorbing Sets and Designing the Corresponding Low Error-Floor Decoder

Gyu Bum Kyung, Member, IEEE, and Chih-Chun Wang, Member, IEEE

Abstract—This work provides an efficient exhaustive search algorithm for finding all small fully absorbing sets (FASs) of any arbitrary low-density parity-check (LDPC) code. The proposed algorithm is based on the branch-&-bound principle for solving NP-complete problems. In particular, given any LDPC code, the problem of finding all FASs of size less than t is formulated as an integer programming problem, for which a new branch-&-bound algorithm is devised with new node selection and tree-trimming mechanisms. The resulting algorithm is capable of finding all FASs of size ≤ 6 for LDPC codes of length ≤ 1000. When limiting the FASs of interest to those with the number of violated parity-check nodes ≤ 2, the proposed algorithm is capable of finding all such FASs of size ≤ 13 for LDPC codes of length ≤ 1000. The resulting exhaustive list of small FASs is then used to devise a new efficient post-processing low-error-floor LDPC decoder. The numerical results show that by exploiting the exhaustive list of small FASs, the proposed post-processing decoder can significantly lower the error-floor performance of a given LDPC code. For various example codes of length ≤ 3000, the proposed post-processing decoder lowers the error floor by a couple of orders of magnitude when compared to the standard belief propagation decoder and by an order of magnitude when compared to other existing low error-floor decoders.

Index Terms—Branch-&-bound algorithms, error floor, fully absorbing sets (FASs), low-density parity-check (LDPC) codes.

This work was supported by NSF Grant CCF-0845968. Part of this paper was presented at the IEEE International Symposium on Information Theory (ISIT), Austin, TX, USA, Jun. 13–18, 2010. G. B. Kyung is with the Samsung Advanced Institute of Technology, Samsung Electronics, Yongin, Korea, and C.-C. Wang is with the Center for Wireless Systems and Applications (CWSA), School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, USA (e-mail: [email protected] and [email protected]).

I. INTRODUCTION

Recently, low-density parity-check (LDPC) codes [1] have received much attention thanks to their near-Shannon-limit performance and to the ultra-efficient, high-performance belief propagation (BP) decoders. When compared to the prohibitively costly optimal maximum likelihood (ML) decoders, the suboptimality of BP leads to performance degradation both in the waterfall region [2] and in the error-floor region [3]. Generally, the frame-error-rate (FER) error floor of modern capacity-achieving error control codes (including LDPC codes and turbo codes) is at around 10^-4 to 10^-6, which is detrimental to many important applications that require a very low error floor (10^-12 to 10^-15), such as digital storage devices and optical communications [4]. In the case of turbo codes, the error floor is mainly caused by low-weight codewords. For comparison, the error-floor performance of LDPC codes is related to the error-prone substructures (EPSs) such as stopping sets (SSs) [5], near-codewords [6], trapping sets (TSs) [3], pseudocodewords [7], instantons [8], and absorbing sets (ASs) [9]. Since the EPSs of LDPC codes generally are not valid codewords, the error floor caused by the BP decoder can be lowered by the optimal ML decoder or by an improved BP decoder [10], [11].

The characterization and (exhaustive/inexhaustive) enumeration of EPSs are important for estimating the error-floor performance of LDPC codes using importance sampling [3], [12], [13]. Due to the inherent complexity of the exhaustive search problem, several inexhaustive TS search algorithms have been proposed and used to estimate the error-floor performance. In particular, Richardson estimated the error floor of LDPC codes by an inexhaustive list of TSs and by evaluating their contribution to the error floor [3]. Similarly, [14] used FPGA simulation to obtain a partial list of the dominant TSs and then found all TSs that have the same structure as the dominant TSs by computer search. Reference [15] searched for the dominant TSs by applying importance sampling to each 6-bit cycle. Also, [16] obtained the dominant TSs using the tree structure and the error impulse method. Reference [17] found an inexhaustive list of the dominant elementary TSs that have the smallest number of unsatisfied check nodes based on the exhaustive list of cycles. Reference [18] proposed an inexhaustive TS enumeration method by augmenting the Tanner graph.

Finding the minimum distance of an arbitrary linear code is known to be NP-hard [19], [20]. On the other hand, it is worth noting that NP-hardness is a worst-case statement, which may not predict the average case for the (carefully structured) LDPC codes used in digital communication systems.¹ Moreover, the NP-hardness results focus on the asymptotic regime in which the input size N → ∞. It is thus still possible that an algorithm for solving an NP-hard problem has exponential complexity but a very small coefficient, for which the complexity is still very manageable for practical values of N. Unlike other NP-hard problems, for which we are interested in input sizes as large as possible, in the case of finding the minimum distance of a given error control code we are generally interested only in codes of moderate length ≤ 10^4 due to practical buffer size and delay constraints. As a result, it is of critical value to devise relatively efficient algorithms that are capable of finding the minimum distance for codes of moderate length [22].

¹For example, finding the minimum distance of an array-based LDPC code has smaller complexity than that of an arbitrary code [21].

For LDPC codes, the BP decoder converges to the EPSs instead of the minimum-weight codewords. In this work, we are interested in finding an exhaustive list of all small fully absorbing sets (FASs), which are special types of EPSs first introduced in [9]. Finding all small EPSs is a non-trivial task. For example, the problem of determining the minimum size of the SSs is NP-complete [23], and NP-hardness was later proven for determining the minimum size of the k-out TSs as well [24]. Reference [25] further generalized the NP-hardness result and showed that even the problem of approximating the minimum size of the "cover SSs" and several different kinds of TSs is NP-hard. Again, the above NP-hardness results focus on worst-case analysis in the asymptotic regime and thus do not preclude the existence of good algorithms that are capable of exhaustively searching for small EPSs for codes of moderate length (≤ 2000). It is worth noting that although both the problem of finding the minimum Hamming distance and that of finding the minimum EPS size are NP-hard, for any given linear code it generally takes longer to find the minimum Hamming distance than to find the minimum size of EPSs, since the size of a minimum EPS is usually much smaller than the size of a minimum-weight codeword. For example, empirically, the minimum Hamming distance of a randomly chosen variable-degree-3 LDPC code of length ≈ 10^3 is approximately 20 [22], while the minimum size of TSs or SSs of the same code is typically 3 to 5. This fact, together with the recent efficient algorithms for finding the minimum distance of LDPC codes of moderate lengths ≤ 2000 [22], is one of the major motivations of this work.

There are existing algorithms that find the minimal size of EPSs and/or establish an exhaustive list of small EPSs. Reference [24] proposed the first such exhaustive search algorithm for small SSs and k-out TSs based on tree-pruning techniques. A more efficient exhaustive SS search algorithm was later proposed in [26] based on extended iterative decoding and linear programming techniques. For codes of special structure, exhaustive search may be implemented based on combinatorial methods. For example, for LDPC codes with column degree 3, [27] found all small TSs by counting the cycles of minimum length and their possible unions. Other combinatorial exhaustive search methods for ASs of structured LDPC codes can be found in [9], [28].

In parallel with estimating the error floors, several works were proposed to lower the error floor of LDPC codes by improving the suboptimal BP decoder. Reference [29] suggested averaged BP decoding to prevent the errors from being trapped during decoding. In contrast with modifying the BP decoder, a different approach to lowering the error floor is post-processing decoding. Based on a partial TS list, [15] showed that the error floor can be significantly lowered by checking whether the unsatisfied check nodes after BP decoding match those of a TS in the inexhaustive TS list. A bit-guessing approach after decoding failure was developed in [30] to lower the error floor. Reference [10] improved the node selection algorithm of bit guessing and used multi-stage decoding. Reference [11] proposed a bi-mode syndrome-erasure decoder using iterative erasure decoding and a soft erasing technique on the log-likelihood ratio (LLR) messages. Reference [11] also proposed a generalized LDPC decoder to lower the error floor of LDPC codes, which combined several problematic check node processors into a generalized constraint node and applied the BCJR algorithm [31]. Reference [14] used post-processing decoding, which lowers the error floor by increasing the reliability of the messages from unsatisfied check nodes and decreasing the reliability of the messages from mis-satisfied check nodes.

In this work, we first propose an efficient algorithm to find all small FASs of any arbitrary LDPC code that may not have any algebraic structure. We then use the exhaustive list of small FASs to develop a new post-processing decoder that lowers the error floor by a couple of orders of magnitude when compared to the standard BP decoder. Our exhaustive search is based on the branch-&-bound principle, which was previously used in [24] and [26] for exhaustively finding SSs. During the bounding step, we formulate a new integer programming (IP) problem to decide the minimum size of FASs under given constraints and solve the problem using the branch-&-cut algorithm, which can be viewed as a generalization of [22], which finds the minimum Hamming distance using IP solvers. Note that an IP-based approach was also used by [32] to approximate the minimum Hamming distance, which is conceptually quite different from the exhaustive search algorithms in this work and in the existing literature [22], [24], [26]. For numerical evaluations, we applied the proposed algorithms to find the exhaustive lists of small FASs for various benchmark LDPC codes in the literature, which were previously unobtainable by the existing approaches. In all our experiments, our scheme is capable of finding all FASs of size ≤ 6 for LDPC codes of length ≤ 1000. When limiting the FASs of interest to those with the number of unsatisfied parity-check nodes ≤ 2, the proposed algorithm is capable of finding all such FASs of size ≤ 13 for LDPC codes of length ≤ 1000.

In the second part of this work, a new post-processing decoder is proposed that takes full advantage of the exhaustive list of small FASs from our search algorithm. In contrast with the existing results based on partial TS lists, the knowledge of the exhaustive list of FASs enables us to better identify the problematic variable and check nodes. A new soft, LLR-based post-processing decoder is then devised accordingly. Numerical results on benchmark regular and irregular LDPC codes, including the progressive edge growth codes PEGR504, PEGR1008, PEGI504, and PEGI1008 [33], the MacKay regular codes M816 and M1008, the algebraically constructed Tanner (155, 64, 20) code [34], and the Margulis code with p = 11 and length 2640 [35], show that for all these codes the error floor can be lowered by a couple of orders of magnitude when compared to the standard BP decoder, and that our post-processing decoder based on the exhaustive small-FAS list significantly outperforms the state-of-the-art decoders.

The remainder of this paper is organized as follows. Notations and definitions are included in Section II. In Section III, we describe an exhaustive search algorithm for all small FASs of any given LDPC code. In Section IV, we use the resulting exhaustive list to design a new post-processing decoder that substantially lowers the error floor of LDPC codes. Section V gives the numerical results for the proposed search algorithm and the performance of the new post-processing decoder. Section VI concludes the paper.

II. NOTATIONS AND DEFINITIONS

An LDPC code can be defined by an m × n sparse parity check matrix H, where n is the codeword length and m is the number of parity check equations. For convenience, we denote a code by its H matrix. Also, a code H can be represented by its bipartite Tanner graph, which has two sets of nodes: the variable nodes V = {v_1, ..., v_n} and the check nodes C = {c_1, ..., c_m}. Let N(v) ⊆ C and N(c) ⊆ V denote the sets of neighboring nodes of a variable node v and a check node c, respectively. Also, let d_v = |N(v)| and d_c = |N(c)| be the corresponding node degrees.

For any A ⊆ V, let E_A ⊆ C and O_A ⊆ C be the sets of check nodes that are connected to A an even number (≥ 0) and an odd number of times, respectively. We also define E_A(v) = E_A ∩ N(v) as the set of check nodes in E_A that are connected to v. Similarly, we define O_A(v) = O_A ∩ N(v). The definitions of the AS and its variants [9] can then be described as follows.

Definition 1 (AS and FAS [9]): A subset A ⊆ V is an AS if |E_A(v)| > |O_A(v)| for all v ∈ A. A subset A ⊆ V is a FAS if |E_A(v)| > |O_A(v)| for all v ∈ V.

That is, a subset A ⊆ V is an AS if for every variable node v in A the number of satisfied check node neighbors (those with even degrees in the induced subgraph) is larger than the number of unsatisfied check node neighbors (those with odd degrees in the induced subgraph). A subset A ⊆ V is a FAS if the above condition is satisfied for all variable nodes in V. Intuitively, a FAS is a "stationary TS" under the majority-vote-based bit-flipping decoder [36]. That is, if all bits in A are in error, and none of the other bits are, then a majority-vote bit-flipping decoder cannot correct any of the erroneous bits in A. Decoding thus fails even though we may still have some unsatisfied check node equations.

An AS A is also called an (s, t) AS if s = |A| and t = |O_A|. In [37], it has been shown empirically that most EPSs contain an AS or a FAS, even when using the standard BP decoder on the additive white Gaussian noise channel (AWGNC) instead of the majority-vote-based bit-flipping decoder on the binary symmetric channel. One intuitive explanation is that BP decoding can be viewed as a soft version of the majority-vote-based bit-flipping decoder. Therefore, ASs and FASs are likely to contribute to decoding errors even under BP decoding.

Comparison of ASs (FASs) to other EPSs: The definition of ASs is different from the operational definitions of TSs [3], instantons [8], and pseudocodewords [7], and is also different from the graphical definitions of SSs, k-out TSs, and near-codewords. More explicitly, a TS is defined as the set of bits that are not correct after decoding [3]. The instantons are special configurations that contribute most to the error rates [8]. Pseudocodewords are the codewords corresponding to the M-cover of the given code [7]. The graphical definition of ASs is closer to those of SSs, k-out TSs, and near-codewords. That is, a subset A ⊆ V is called a SS if the induced subgraph of A has no degree-1 check nodes [5], which is also the trapping set of BP decoders under binary erasure channels. A subset A ⊆ V is called a k-out TS if the induced subgraph of A has exactly k check nodes of degree one [24]. A SS can be viewed as a 0-out TS. The rationale behind k-out TSs is that check nodes of degree one are able to carry the correct extrinsic messages from the outside variable nodes when all bits of A are in error. A subset A ⊆ V is called an (s, t) near-codeword [6] if the induced subgraph of A has exactly s variable nodes and exactly t check nodes of odd degree. By definition, one can easily prove that any (s, t) AS is always an (s, t) near-codeword, but not vice versa. One can thus view an (s, t) AS as a special kind of near-codeword that is likely to be detrimental to BP decoding due to being a TS under the majority-vote-based bit-flipping decoder.

III. A NEW EXHAUSTIVE SEARCH ALGORITHM FOR SMALL FASS

In this section, we propose a new exhaustive search algorithm for all small FASs of LDPC codes. We set up an IP problem as the bounding step of the main branch-&-bound algorithm. In addition, we also provide a few techniques to further improve the efficiency of the algorithm.

A. The Main Branch-&-Bound Algorithm

Let n ≜ |V| denote the total number of variable nodes. For each n-dimensional vector s ∈ {0, 1, ∗}^n, the i-th coordinate being ∗, the "unconstrained symbol," means that the value of the i-th variable node is not fixed to "0" or "1." We sometimes say that the i-th coordinate is an unconstrained position of s if the value of its i-th coordinate is ∗. We say that a ternary vector s_1 = (s_11, ..., s_1n) ∈ {0, 1, ∗}^n is compatible to another vector s_0 = (s_01, ..., s_0n) ∈ {0, 1, ∗}^n if s_1i = s_0i for all i satisfying s_0i ≠ ∗. For example, s_1 = (1, 0, 0, 1, 0) is compatible to s_0 = (∗, 0, ∗, 1, 0). Let Supp(s) be the set of variable nodes corresponding to the 1's in a vector s, which is called the support set of the vector s. Conversely, for any set A ⊆ V, Inc(A) is the corresponding binary incidence vector, which has 1's in the positions of all v ∈ A and 0's in the other positions V\A. In addition, we use A to denote a collection of FASs. The main branch-&-bound algorithm is described as follows. Our search algorithm follows from [26], which proposed an efficient exhaustive SS search algorithm based on extended iterative decoding and linear programming techniques.

Algorithm 1 An exhaustive search algorithm for FASs.
1: Input: the parity check matrix H and two integers s_max and t_max.
2: Set S ← {(∗, ∗, ..., ∗)} and set A ← ∅.
3: while S ≠ ∅ do
4:   Take one vector s_0 from S, and let S ← S\s_0.
5:   if |Supp(s_0)| ≤ s_max then
6:     if Supp(s_0) is a FAS and |O_Supp(s_0)| ≤ t_max then
7:       A ← A ∪ {Supp(s_0)}.
8:     end if
9:     Compute a lower bound b such that b ≤ |A| for all A ⊆ V satisfying simultaneously the following three conditions: (i) A is a FAS, (ii) |O_A| ≤ t_max, and (iii) Inc(A) is compatible to s_0.
10:    if b ≤ s_max then
11:      Choose one unconstrained position of s_0 and generate two new vectors s_1 and s_2 by setting the unconstrained position of s_0 to 1 and 0, respectively.
12:      S ← S ∪ {s_1, s_2}.
13:    end if
14:  end if
15: end while
16: Output: A is the exhaustive list of all (s, t) FASs with s ≤ s_max and t ≤ t_max.

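As a concrete (and deliberately naive) illustration, the FAS test of Definition 1 and the skeleton of Algorithm 1 can be sketched as follows. This is a toy stand-in, not the paper's implementation: the trivial bound b = |Supp(s_0)| replaces the IP-based bound of Line 9 (so every branch survives), all function names are illustrative, and H is a dense 0/1 list of rows.

```python
def odd_checks(H, A):
    """O_A: check nodes joined to the variable set A an odd number of times."""
    return {j for j, row in enumerate(H) if sum(row[i] for i in A) % 2 == 1}

def is_fas(H, A):
    """Definition 1: A is a FAS iff every variable node v (inside or outside A)
    has strictly more even-parity than odd-parity check neighbors,
    i.e. 2*|O_A(v)| < d_v."""
    O = odd_checks(H, A)
    for v in range(len(H[0])):
        nbrs = [j for j, row in enumerate(H) if row[v]]
        if 2 * sum(j in O for j in nbrs) >= len(nbrs):
            return False
    return True

def fas_search(H, s_max, t_max):
    """Branch-&-bound skeleton of Algorithm 1 over ternary vectors in {0,1,*}^n."""
    n = len(H[0])
    found = set()
    stack = [('*',) * n]                        # S <- {(*,...,*)}
    while stack:                                # while S is nonempty
        s0 = stack.pop()
        supp = {i for i, ch in enumerate(s0) if ch == '1'}
        if len(supp) > s_max:                   # Line 5
            continue
        if supp and is_fas(H, supp) and len(odd_checks(H, supp)) <= t_max:
            found.add(frozenset(supp))          # Lines 6-7
        b = len(supp)                           # Line 9 (trivial stand-in bound)
        if b <= s_max and '*' in s0:            # Lines 10-12: branch on first '*'
            k = s0.index('*')
            stack += [s0[:k] + ('0',) + s0[k+1:], s0[:k] + ('1',) + s0[k+1:]]
    return found
```

For the 3-variable cycle code H = [[1,1,0],[0,1,1],[1,0,1]], the only FAS of size ≤ 3 with t_max = 0 is the full set {v_1, v_2, v_3} (the all-ones codeword); no single variable or pair qualifies.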
The intuition behind Algorithm 1 is as follows. We start searching from a vector s_0 in Line 4. Lines 5 to 14 ensure that we search for all FASs that are compatible to the chosen vector s_0. If the condition in Line 5 is satisfied, it means that it is still possible to have some FASs that are compatible to s_0, and the search continues. Otherwise, we discard such s_0 and choose another vector from S. In Line 6, we check whether s_0 itself is a FAS in the search range (s_max, t_max). If so, we store Supp(s_0) in the output list A. The bounding step in Line 9 is critical to the search efficiency. The reason is that if the lower bound b is strictly larger than the search range s_max, then there is no FAS in the search range that is also compatible to s_0. We simply discard such s_0 and move on to the next vector in S. If b ≤ s_max, we proceed to the branching step in Lines 11 and 12. Namely, we choose one unconstrained position of s_0, generate two new vectors accordingly, and store them back into S. Algorithm 1 continues until the set S is empty. The output of Algorithm 1 is the list of all FASs within the search range (s_max, t_max).

B. The Bounding Step in Line 9

A good lower bound in Line 9 helps the program quickly eliminate unnecessary branches and thus speeds up the exhaustive search. Given a to-be-searched vector s_0, we use an IP solver to find a lower bound b in Line 9. In our IP problem, the integer variables are denoted by x_{v_i}, w_{c_j}, and z_{c_j}, where the subscripts are the variable nodes v_i ∈ V and check nodes c_j ∈ C. For notational simplicity, we use x_i, w_j, and z_j as shorthand for x_{v_i}, w_{c_j}, and z_{c_j} when there is no ambiguity.

Minimize ∑_{v_i ∈ V} x_i    (1)

subject to

  ∑_{v_i ∈ N(c_j)} x_i = 2 w_j + z_j for all c_j ∈ C    (2)
  ∑_{c_j ∈ N(v_i)} z_j ≤ ⌊(d_{v_i} − 1)/2⌋ for all v_i ∈ V    (3)
  ∑_{v_i ∈ V} x_i ≥ 1    (4)
  ∑_{c_j ∈ C} z_j ≤ t_max    (5)
  x_i ∈ {0, 1} for all the unconstrained positions of s_0    (6)
  for all other x_i, set the value to that of the i-th coordinate of s_0    (7)
  w_j is a non-negative integer for all c_j ∈ C    (8)
  z_j ∈ {0, 1} for all c_j ∈ C    (9)

In this IP problem, x_i decides whether v_i is part of a FAS. Equation (2) uses z_j to capture the resulting parity of c_j. Equation (3) follows from the definition of FASs. Equation (4) is used to eliminate the size-0 FAS. Equation (5) is used to decide the search range by limiting the number of check nodes of odd degree in a FAS. The objective function (1) minimizes the size of the FAS satisfying the given constraints. With the given constraint vector s_0, (7) hardwires the corresponding positions of x_i to 0 or 1 depending on the locations of 0's and 1's in s_0.

C. Implementation

We solve the IP problem in the previous subsection by CPLEX 10.2 [38], a commercial optimization software package that uses LP relaxation and the branch-&-cut principle to solve IP or mixed IP problems.² In our implementation, we define a "branch node limit" and a "time limit" to regulate the number of LP relaxations involved. That is, if the number of branching nodes exceeds our predefined node limit or if the elapsed time exceeds the predefined time limit, we stop the branch-&-cut process of the current IP problem, use the current objective value of the LP relaxation as our lower bound b, and proceed to Line 10. The rationale behind introducing the node and time limits is that computing the exact optimal value of the IP problem in (1) to (5) generally takes too much time. Since we are interested only in a lower bound, any objective value of the corresponding relaxed LP problem is sufficient as a loose lower bound. The node and time limits provide a tradeoff between the time it takes to find the lower bound and how tight the lower bound is. In our implementation, we use 5–20 as the branch node limit and 1–5 seconds as the time limit.

²To further improve the algorithm, we add additional user-defined cuts, which were first proposed in [22] and greatly reduce the number of branching nodes to prevent the out-of-memory problem.
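The model (1)–(9) is mechanical to assemble from H. A minimal, solver-agnostic sketch follows: the paper hands the model to CPLEX, whereas here it is simply returned as plain matrices over the stacked variable vector [x_1..x_n, w_1..w_m, z_1..z_m], with `None` marking an unbounded w_j; all names are illustrative.

```python
def build_fas_ip(H, s0, tmax):
    """Assemble the IP (1)-(5) with the hardwiring (7) in the generic form
        minimize c.v  s.t.  A_eq.v == b_eq,  A_ub.v <= b_ub,  lo <= v <= hi,
    over v = [x_1..x_n, w_1..w_m, z_1..z_m]; s0 is a string over '0','1','*'."""
    m, n = len(H), len(H[0])
    N = n + 2 * m
    c = [1] * n + [0] * (2 * m)                              # objective (1)
    A_eq, b_eq = [], []
    for j in range(m):                                       # (2): sum_i H[j][i] x_i - 2 w_j - z_j = 0
        row = list(H[j]) + [0] * (2 * m)
        row[n + j], row[n + m + j] = -2, -1
        A_eq.append(row); b_eq.append(0)
    A_ub, b_ub = [], []
    for i in range(n):                                       # (3): sum_{j in N(v_i)} z_j <= floor((d_i-1)/2)
        d = sum(H[j][i] for j in range(m))
        A_ub.append([0] * (n + m) + [H[j][i] for j in range(m)])
        b_ub.append((d - 1) // 2)
    A_ub.append([-1] * n + [0] * (2 * m)); b_ub.append(-1)   # (4): sum_i x_i >= 1
    A_ub.append([0] * (n + m) + [1] * m); b_ub.append(tmax)  # (5): sum_j z_j <= tmax
    lo = [0] * N
    hi = [1] * n + [None] * m + [1] * m                      # (8)-(9): w_j >= 0 integer, z_j in {0,1}
    for i, ch in enumerate(s0):                              # (7): hardwire constrained x_i
        if ch != '*':
            lo[i] = hi[i] = int(ch)
    return c, A_eq, b_eq, A_ub, b_ub, lo, hi
```

Dropping the integrality requirements on x, w, z (i.e., solving the pure LP over these matrices) gives exactly the kind of loose lower bound b that the node/time limits above settle for.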

As with any branch-&-bound algorithm, the efficiency depends dramatically on the branching policy in Line 11, which chooses an unconstrained location at which to perform the branching step. We use the following branching policy. During the LP relaxation, the values of x_i are chosen from the interval [0, 1] rather than from {0, 1}. For any soft x_i values found in the LP relaxation, let z′_j be the distance of sum_j ≜ ∑_{v_i ∈ N(c_j)} x_i to the closest even number, i.e., z′_j ≜ |sum_j − 2⌊0.5 + 0.5 sum_j⌋|. For any given V_1 ⊆ V, let N(V_1) ≜ ∪_{v_i ∈ V_1} N(v_i) be the check nodes adjacent to V_1. Similarly, for any C_1 ⊆ C, let N(C_1) ≜ ∪_{c_j ∈ C_1} N(c_j) be the variable nodes adjacent to C_1. For any given vector s_0, let B be the set of variable nodes corresponding to the unconstrained positions in s_0. Given s_0, we compute ξ_k as follows for each unconstrained neighbor v_k ∈ N(N(Supp(s_0))) ∩ B:

ξ_k ≜ min( max( ∑_{c_j ∈ N(v_k)} z′_{j,x_k=0} − ⌊(d_{v_k} − 1)/2⌋, 0 ), max( ∑_{c_j ∈ N(v_k)} z′_{j,x_k=1} − ⌊(d_{v_k} − 1)/2⌋, 0 ) )    (10)

where z′_{j,x_k=0} and z′_{j,x_k=1} are the z′_j values computed when hardwiring the x_k value to 0 and 1, respectively. A larger value of ξ_k means that it is more likely that using x_k as the branching choice will violate (3), so that the IP solver can quickly refine the solution. Among all v_k ∈ N(N(Supp(s_0))) ∩ B, we choose the one with the largest ξ_k as the branching position in Line 11.

[Fig. 1. Illustration of the branching policy.]

To illustrate, consider the simple example in Fig. 1. There are 11 variable nodes v_0 to v_10. Consider a to-be-searched vector s_0 = (1, 0, ∗, 0, 0, 0, ∗, ∗, ∗, ∗, ∗); the variable values x_0 to x_10 of the bounding step (based on LP relaxation) are also provided in Fig. 1. Since Supp(s_0) = {v_0}, the unconstrained positions are B = {v_2, v_6, v_7, v_8, v_9, v_10}, N(Supp(s_0)) = {c_1, c_2, c_3}, and N(N(Supp(s_0))) = {v_0, v_1, v_2, v_3, v_4, v_5, v_6}. As a result, the metric ξ_k is computed for the variable nodes in N(N(Supp(s_0))) ∩ B = {v_2, v_6}, which are the red circles in Fig. 1. For example, the computation of ξ_k for v_6 (k = 6) is as follows. We first compute z′_{j,x_k=0} and z′_{j,x_k=1} for c_1, c_5, and c_6, the check node neighbors of v_6. In particular, for c_1 we have z′_{1,x_6=0} = x_0 + x_4 + 0 = 1.0 + 0.0 + 0.0 = 1.0 and z′_{1,x_6=1} = 2 − (1.0 + 0.0 + 1) = 0. Similarly, we can compute the (z′_{j,x_k=0}, z′_{j,x_k=1}) values for the other check nodes c_5 and c_6, and we have (z′_{5,x_6=0}, z′_{5,x_6=1}) = (0.1, 0.9) and (z′_{6,x_6=0}, z′_{6,x_6=1}) = (0.4, 0.6). Thus,

ξ_6 = min( max(1.0 + 0.1 + 0.4 − 1, 0), max(0.0 + 0.9 + 0.6 − 1, 0) ) = 0.5.

For v_2, we compute (z′_{3,x_2=0}, z′_{3,x_2=1}) = (1, 0) and (z′_{4,x_2=0}, z′_{4,x_2=1}) = (0.8, 0.2) for its neighbors c_3 and c_4, and obtain ξ_2 = min( max(1.0 + 0.8, 0), max(0 + 0.2, 0) ) = 0.2, where ⌊(d_{v_2} − 1)/2⌋ = 0. Therefore, we choose x_6, which has the largest ξ_k, as the next branching node.
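Assuming the soft values x from the LP relaxation are available as a plain list, z′_j and the metric (10) can be sketched as follows (illustrative names; H is a dense 0/1 list of rows):

```python
import math

def zprime(s):
    """z'_j of the branching policy: distance from s to the closest even integer."""
    return abs(s - 2 * math.floor(0.5 + 0.5 * s))

def xi(H, x, k):
    """Branching metric xi_k of (10) for variable node k, given soft LP values x."""
    checks = [j for j, row in enumerate(H) if row[k]]
    thresh = (len(checks) - 1) // 2          # floor((d_vk - 1)/2), as in (3)
    def side(bit):                           # hardwire x_k to bit and re-sum
        tot = sum(zprime(sum(H[j][i] * x[i] for i in range(len(x)) if i != k) + bit)
                  for j in checks)
        return max(tot - thresh, 0.0)
    return min(side(0.0), side(1.0))
```

The candidate with the largest xi(H, x, k) over k ∈ N(N(Supp(s_0))) ∩ B is then picked as the branching position.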
For comparison, in our implementation, if we choose the branch location randomly, it takes 45 hours 43 minutes to find all FASs with s_max = t_max = 5 for the PEGR504 code [39]. However, it takes only 6 hours 55 minutes using the branching policy described in this subsection.

We now discuss the complexity of the proposed search algorithm. The complexity greatly depends on the search limit (s_max, t_max). In Table I, we provide the execution time in seconds to find all (s_max, t_max) FASs of M816 using the proposed algorithm.

D. Other Potential Methods for Further Enhancing the Efficiency

In this subsection, we list two other methods that may further improve the efficiency of the branch-&-bound search algorithm. Both of them have an effect similar to that of the node/time limits, trading the tightness of the lower bound for faster execution time per branch.

1) LP relaxation for the integer variables: It is worth noting that, when using the LP relaxation to solve the IP problem, there is no need to identify the optimal integer solution, as we are only interested in a lower bound b as in Line 9. Therefore, instead of feeding CPLEX the IP problem in (1) to (5), we can directly feed CPLEX an LP relaxation or a mixed IP problem to compute a loose lower bound.

2) Use only a subset of check node equations: To further enhance the efficiency of computing a lower bound b, we can further relax the constraints (2) of the IP problem. Namely, instead of finding a solution satisfying (2) for all c_j, we find a relaxed solution that satisfies (2) only for those c_j that are within 3–5 hops of a variable node in Supp(s_0).³ This approach was first used in [26] for finding SSs. However, for the case of finding FASs, considering a partial list of c_j improves the speed of the bounding step in Line 9 but also results in a much looser lower bound b, which increases the total number of branching steps in Line 11. In our implementation, we notice that this partial-list relaxation is more suitable when finding FASs of very small size, s_max ≤ 5. When targeting moderate-sized FASs, s_max ≥ 6, it is more efficient to use the full list of constraints.

³Any parity check node c is always an odd number of hops away from a variable node, since the underlying graph is bipartite.

E. Comparison with the Existing Works

References [24] and [26] focus on exhaustive search algorithms for SSs and TSs. They have structures similar to our work in the sense that all three works are based on the branch-&-bound algorithm. On the other hand, this work is the first to focus on the FAS, and we will later show how to use the exhaustive list to design the corresponding low error-floor post-processing decoder. In [24], for codes of length n ≃ 500, SSs of size ≤ 13 and TSs of size ≤ 11 can be exhaustively enumerated. In [26], for the Tanner155 code all SSs of size at most 18 can be enumerated in about 1 minute, and SSs of size ≤ 23 can be enumerated for codes of length n ≃ 500.

TABLE I
THE TIME IN SECONDS TO FIND (s_max, t_max) FASs FOR M816 USING THE PROPOSED ALGORITHM.

          s_max=3  s_max=4  s_max=5  s_max=6  s_max=7  s_max=8  s_max=9  s_max=10  s_max=11
t_max=1      -        -        -        -       66       73       78      2340     11640
t_max=2      -        -        -       89      562     13680    96400      -         -
t_max=3      -      135      8062    32340      -        -        -        -         -
t_max=4      -     7223    108944      -        -        -        -        -         -

can be enumerated for codes of lengths n ≤ 500. Also, for structured LDPC codes like the IEEE 802.16e LDPC codes, the algorithm is able to find SSs of size ≤ 28 even for an LDPC code of length n = 2304. Our algorithm is capable of finding all FASs of size ≤ 6 for LDPC codes of length ≤ 1000. When limiting the FASs of interest to those with the number of violated parity-check nodes ≤ 2, the proposed algorithm is capable of finding all such FASs of size ≤ 13 for LDPC codes of lengths ≤ 1000.

In the existing literature, [27] is the only other method that finds the exhaustive list of small FASs. Since non-exhaustive search algorithms are generally much faster than an exhaustive search scheme, we limit the following complexity comparison to the results in [27]. The approach of [27] is drastically different from the work proposed in this paper. More explicitly, [27] is based on counting cycles to search for ASs or FASs. By focusing exclusively on regular LDPC codes with column weight three and by taking advantage of the corresponding topological structures, the exhaustive search algorithm in [27] is several orders of magnitude faster than the proposed work, which trades off computational efficiency against the types of LDPC codes over which the FASs can be exhaustively searched (our work is capable of searching any given LDPC code, not only column-weight-3 codes). For example, [27] reported that the time to exhaustively search the FASs of size ≤ 5 for codes of lengths ≤ 1000 was about five minutes. For comparison, our algorithm can find FASs with (s_max = 5, t_max = 3) in roughly 2.5 hours for M816 (see Table I for detailed data). Whether the results in [27] can be generalized to regular LDPC codes with column weight four and to irregular LDPC codes remains an open problem. Recently, another work finding TSs of LDPC codes has been proposed in [17] and its extended version [40]. That algorithm is able to find dominant TSs more efficiently by using other EPSs, such as cycles, as inputs. For example, TSs with (s_max = 13, t_max = 3) can be enumerated for PEGR504 in 10 minutes, but it remains an open question whether the TS list generated by [17] is exhaustive or not.

IV. LOW ERROR-FLOOR POST-PROCESSING DECODING

In this section, we propose a new post-processing decoder using the exhaustive list of FASs obtained from the algorithm described in Section III, which lowers the error floor under the AWGNC. Let m^(l)_{v_i→c_j} be the LLR message from a variable node v_i to a check node c_j under a standard BP decoder, which is computed by [41]

    m^(l)_{v_i→c_j} = m^(0)_{v_i} + Σ_{c_k ∈ N(v_i)\c_j} m^(l−1)_{c_k→v_i},    (11)

where m^(0)_{v_i} = (2/σ²) y_i is the initial LLR message, y_i is the received symbol, σ² is the noise variance, and l denotes the number of decoding iterations. Similarly, let m^(l)_{c_j→v_i} be the LLR message from a check node c_j to a variable node v_i under BP decoding, which is computed by

    m^(l)_{c_j→v_i} = 2 tanh⁻¹( Π_{v_k ∈ N(c_j)\v_i} tanh( m^(l)_{v_k→c_j} / 2 ) ).    (12)

Also, let m^(l)_{v_i} be the LLR message of v_i after l decoding iterations, which is computed by

    m^(l)_{v_i} = m^(0)_{v_i} + Σ_{c_k ∈ N(v_i)} m^(l−1)_{c_k→v_i}.    (13)

Let L^(0) = (m^(0)_{v_1}, …, m^(0)_{v_n}) and L^(l) = (m^(l)_{v_1}, …, m^(l)_{v_n}) denote the initial LLR vector and the final LLR vector after l iterations of BP, respectively. Let Y^(0) and Y^(l) denote the hard decisions based on the signs of L^(0) and L^(l), respectively.

The newly available knowledge about the exhaustive FAS list is critical to the design of our post-processing decoder. More explicitly, suppose that the BP decoder indeed converges to a small FAS. The receiver can only observe which parity-check equations are not satisfied but does not know the detailed positions of the error bits. However, since the decoder has an exhaustive list of small FASs, the decoder can check each FAS one by one and see whether any of the small FASs has a pattern of "unsatisfied" check nodes that matches the observed pattern of "unsatisfied" parity-check equations. Since the list is exhaustive, each small FAS in the list will lead to a unique matching pattern. The decoder can then simply flip the decoded bit values in those positions and decode the true codeword. The error floor can thus be lowered.

Unfortunately, this simple compare-&-flip decoding does not work for the AWGNC, as the errors are likely caused by some harmful combination of LLR values rather than by explicit bit flipping. Moreover, for AWGNCs, the number of unsatisfied check nodes of Y^(0) and Y^(l) is usually much larger than that of any single small FAS. Namely, for AWGNCs the BP decoder generally converges to a union of multiple small FASs instead of a single small FAS. In summary, the post-processing decoder needs to take into account the soft, LLR-based nature of AWGNCs and consider the union of multiple small FASs.

The main idea of our post-processing decoder is to generate a new LLR vector L_new based on the exhaustive list of small FASs, L^(0), L^(l), Y^(0), and Y^(l). The new LLR vector L_new is then used as the new initial LLR messages to rerun the BP decoder.
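To make the update rules (11)–(13) concrete, the following sketch runs a few flooding BP iterations on a toy Tanner graph. The parity-check matrix, received symbols, and noise variance below are made-up illustrative values, not taken from this paper.

```python
import math

# Toy Tanner graph (made-up): H[j][i] = 1 iff check c_j is connected to v_i.
H = [[1, 1, 0, 1, 0],
     [0, 1, 1, 0, 1],
     [1, 0, 1, 1, 0]]
M, N = len(H), len(H[0])
y = [0.8, -0.2, 1.1, 0.9, -0.5]   # made-up received symbols
sigma2 = 0.5                       # made-up noise variance

m0 = [2.0 * yi / sigma2 for yi in y]                   # initial LLRs m^(0)_{v_i}
m_vc = [[m0[i] for i in range(N)] for _ in range(M)]   # m_{v_i -> c_j}
m_cv = [[0.0] * N for _ in range(M)]                   # m_{c_j -> v_i}

for _ in range(10):
    # Check-to-variable update, Eq. (12): tanh rule over N(c_j) \ v_i.
    for j in range(M):
        for i in range(N):
            if H[j][i]:
                prod = 1.0
                for k in range(N):
                    if H[j][k] and k != i:
                        prod *= math.tanh(m_vc[j][k] / 2.0)
                prod = max(min(prod, 1 - 1e-12), -1 + 1e-12)  # keep atanh finite
                m_cv[j][i] = 2.0 * math.atanh(prod)
    # Variable-to-check update, Eq. (11): sum over N(v_i) \ c_j.
    for j in range(M):
        for i in range(N):
            if H[j][i]:
                m_vc[j][i] = m0[i] + sum(m_cv[k][i]
                                         for k in range(M) if H[k][i] and k != j)

# Aggregate LLRs, Eq. (13), and hard decisions from their signs.
m_l = [m0[i] + sum(m_cv[j][i] for j in range(M) if H[j][i]) for i in range(N)]
Y = [0 if llr >= 0 else 1 for llr in m_l]
```

The tanh-rule product in (12) is clamped away from ±1 so that the inverse hyperbolic tangent stays finite; production decoders work in the same LLR domain but typically with numerically safer kernels.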

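For channels where errors are literal bit flips, the compare-&-flip idea described above can be sketched as follows. The small parity-check matrix and the FAS-list entries are hypothetical placeholders, chosen only so that the syndrome-matching step is visible; they are not claimed to be actual FASs of a real code.

```python
# Hypothetical toy code; each list entry pairs a variable-node set with its
# pattern of unsatisfied checks O_A (consistent with H below by construction).
H = [[1, 1, 0, 1, 0],
     [0, 1, 1, 0, 1],
     [1, 0, 1, 1, 0],
     [0, 0, 1, 0, 1]]
fas_list = [
    (frozenset({1}), frozenset({0, 1})),
    (frozenset({2, 4}), frozenset({2})),
    (frozenset({3}), frozenset({0, 2})),
]

def syndrome(word):
    """Indices of unsatisfied parity checks for a binary hard decision."""
    return frozenset(j for j, row in enumerate(H)
                     if sum(h * b for h, b in zip(row, word)) % 2 == 1)

def compare_and_flip(word):
    """If the syndrome matches the O_A of a listed set, flip those bits."""
    s = syndrome(word)
    for var_set, o_a in fas_list:
        if s == o_a:
            return [b ^ 1 if i in var_set else b for i, b in enumerate(word)]
    return word  # no match: leave the word unchanged
```

Because the list is assumed exhaustive, a matching unsatisfied-check pattern identifies the error positions uniquely, which is exactly why the scheme fails on the AWGNC, where the observed pattern is usually a union of several such patterns.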
More explicitly, we find a set of suitable FASs in a soft way, take into account the unions of multiple FASs, and then compute L_new accordingly. The construction of L_new can be fine-tuned by changing the values of the parameters (β_th, d, α, ∆_α, l′).⁴ The detailed description of the algorithm is as follows.

Algorithm 2: A post-processing decoding algorithm for the AWGNC
1: Input: the parity-check matrix H, the exhaustive list of FASs 𝒜, (L^(0), L^(l)), (Y^(0), Y^(l)), and the tuning parameters (β_th, d, α, ∆_α, l′).
2: if Y^(l), the output of BP, is not a codeword then
3:   Set the binary vector U ← Y^(0).
4:   Run BP for 1 iteration based on L^(0) and denote the resulting check-to-variable messages by m^(1)_{c_j→v_i}.
5:   Calculate the soft parity message m_{c_j} for all c_j ∈ C after the 1-iteration BP decoding.
6:   For each A ∈ 𝒜, compute
       β_A = Σ_{c_j ∈ O_A} ( 1 + 1{c_j ∈ O_Supp(Y^(0)) ∩ O_Supp(Y^(l))} ) · m_{c_j},    (14)
     where 1{·} is the indicator function.
7:   Sort 𝒜 according to β_A in ascending order, that is, β_{A_1} ≤ β_{A_2} ≤ ⋯, and set k ← 1.
8:   while β_{A_k} < β_th do
9:     Set the LLR vector L_new ← L^(0).
10:    Set A ← ∪_{i=k}^{k+d−1} A_i, E ← ∪_{i=k}^{k+d−1} E_{A_i}, and O ← ∪_{i=k}^{k+d−1} O_{A_i}.
11:    Compute the following message scaling function f(c_j, v_i) for all edges (c_j, v_i). In the case that v_i ∈ A_{k+h} for some h ∈ {0, ⋯, d−1}, set f(c_j, v_i) = α − h∆_α if c_j ∈ O ∩ O_Supp(U); and set f(c_j, v_i) = 1/(α − h∆_α) if c_j ∈ E \ O_Supp(U). Otherwise, set f(c_j, v_i) = 1.
12:    Denote the i-th component of L_new by m_{v_i}. For those v_i ∈ A, replace the corresponding m_{v_i} by m^(0)_{v_i} + Σ_{c_j ∈ N(v_i)} f(c_j, v_i) m^(1)_{c_j→v_i}.
13:    Using the computed L_new vector as the initial LLR vector, run BP decoding for l′ iterations and let Ȳ be the binary decision after l′ iterations.
14:    If Ȳ is a valid codeword, then RETURN Ȳ as the final output.
15:    k ← k + 1.
16:  end while
17: end if
18: RETURN Y^(l).

⁴ One such choice is β_th = 3, d = 4, α = 1.5, ∆_α = 0.1, and l′ = 5.

The detailed explanation of the above algorithm is as follows. Line 2 focuses only on the case in which the BP decoder fails to generate a valid codeword, and it is only in this case that we perform post-processing decoding. Lines 3 and 9 state that we use the initial messages Y^(0) and L^(0) to initialize the U and L_new vectors. Line 4 prepares the m^(1)_{c_j→v_i} values that will be used later in the algorithm. To compute the soft parity message in Line 5, we first recall that for standard BP the check-node message is computed by (12). The soft-parity message m_{c_j} in Line 5 is computed similarly by considering all neighbors of c_j:

    m_{c_j} = 2 tanh⁻¹( Π_{v_k ∈ N(c_j)} tanh( m^(1)_{v_k→c_j} / 2 ) ).    (15)

In Line 6, we sum up the m_{c_j} values for c_j ∈ O_A. The more negative the sum is, the more likely it is that those odd-parity c_j's are caused by errors in the FAS A. To emphasize the repeated observations from both Y^(0) and Y^(l), those c_j ∈ O_Supp(Y^(0)) ∩ O_Supp(Y^(l)) are given a weight of 2 instead of 1 when computing β_A in (14). This β_A is then used to sort the exhaustive FAS list 𝒜. A_1, having the smallest (most negative) β_{A_1}, is then the most likely FAS candidate in our post-processing decoder. We then search over a sliding window of d A_i's, from A_k to A_{k+d−1}, for k = 1, 2, ⋯. To limit the number of trials, we require β_{A_k} < β_th so as to avoid searching over too many FASs.

For each window, we use an algorithm that is an improved version of the increase-and-decrease (ID) algorithm proposed in [14]. The main idea of the ID algorithm is to increase the check-to-variable (c_j-to-v_i) LLRs that contain correcting power (those emitting from an unsatisfied check node and entering an erroneous bit) while decreasing the c_j-to-v_i LLRs that emit from a mis-satisfied check node (one connected to a non-zero but even number of erroneous bits) and enter an erroneous bit v_i. However, without the knowledge of the exhaustive FAS list, the existing method in [14] knows only which c_j is unsatisfied; it knows neither which v_i ∈ N(c_j) is actually in error and needs stronger correcting power, nor which check node is mis-satisfied. In contrast, our scheme takes advantage of both the c_j and the v_i information from our exhaustive FAS list. More explicitly, if v_i ∈ A_{k+h} and c_j ∈ O ∩ O_Supp(U), then it is likely that c_j is indeed an unsatisfied check node and we would like to use a scaling factor f(c_j, v_i) > 1 in order to correct the erroneous bit v_i. Similarly, if v_i ∈ A_{k+h} and c_j ∈ E \ O_Supp(U), then it is likely that c_j is a "mis-satisfied" check node and we would like to use a scaling factor f(c_j, v_i) < 1 to weaken its impact on the erroneous bit v_i. In Line 11, the scaling factor further takes into account the effect of the h value. Namely, we impose greater changes on those f(c_j, v_i) with v_i ∈ A_{k+h} for smaller h. (Recall that, with the sorting step in Line 7, A_k is a more likely candidate than A_{k+d−1}.) If v_i is in several A_{k+h}'s with different h's, we use the smallest h when computing f(c_j, v_i). Line 12 replaces part of the L_new vector by the newly increased/decreased messages. Line 15 means that if no correct codeword can be found by processing A_k to A_{k+d−1}, we repeat the same procedure but now focus on A_{k+1} to A_{k+d}.

The following simple example illustrates the proposed post-processing decoder. Suppose d = 1, α = 1.5, ∆_α = 0.1, β_th = −0.5, and l′ = 5, and consider a small code of 5 variable nodes and 4 check nodes as depicted in Fig. 2. We also assume the initial LLR messages L^(0) are −1.2, −0.5, 1.0, −0.4, and 0.6, and that after l iterations the final LLR messages L^(l) are 2.4, 4.2, 1.2, 2.2, and −2.6, which are listed on top of each variable node in Fig. 2.
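Lines 5–7 of Algorithm 2, i.e., the soft parities (15) and the score (14), can be sketched as below. The one-iteration messages, the odd-parity sets O_A of each candidate FAS, and the observed sets O_Supp(Y^(0)) and O_Supp(Y^(l)) are all made-up placeholders.

```python
import math

# Made-up 1-iteration variable-to-check messages m^(1)_{v_k -> c_j}; the list
# m1[c_j] holds the messages arriving from every neighbor of check node c_j.
m1 = {0: [1.8, -0.4, 2.1], 1: [0.9, 1.2], 2: [-0.3, 0.7, 1.5], 3: [0.2, -1.1]}

def soft_parity(msgs):
    """Soft-parity message m_{c_j} of Eq. (15): tanh rule over ALL neighbors."""
    prod = 1.0
    for m in msgs:
        prod *= math.tanh(m / 2.0)
    prod = max(min(prod, 1 - 1e-12), -1 + 1e-12)  # keep atanh finite
    return 2.0 * math.atanh(prod)

m_c = {cj: soft_parity(msgs) for cj, msgs in m1.items()}

# Hypothetical odd-parity sets O_A of three candidate FASs and the observed
# unsatisfied-check sets O_Supp(Y^(0)) and O_Supp(Y^(l)).
fas_odd = {"A1": {2}, "A2": {1, 3}, "A3": {0}}
o_y0, o_yl = {0, 3}, {2, 3}

def beta(o_a):
    """Score beta_A of Eq. (14): weight 2 for checks unsatisfied in both Y's."""
    both = o_y0 & o_yl
    return sum((2.0 if cj in both else 1.0) * m_c[cj] for cj in o_a)

# Line 7: sort candidates by ascending beta (most negative = most likely).
ranked = sorted(fas_odd, key=lambda a: beta(fas_odd[a]))
```

A negative soft parity indicates a check that "leans" unsatisfied, so the candidate whose odd-parity set collects the most negative mass is tried first.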

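The scaling rule of Line 11 and the L_new update of Line 12 can be sketched as follows. The window membership h_of, the combined sets O and E, the set O_Supp(U), and all message values are hypothetical placeholders.

```python
# Tuning parameters (one choice suggested in the text).
alpha, delta_alpha = 1.5, 0.1

# Hypothetical window A_k .. A_{k+d-1}: smallest h per variable node, plus the
# combined odd/even check-node sets O, E and the observed set O_Supp(U).
h_of = {3: 0, 7: 1}            # v3 is in A_k, v7 is in A_{k+1}
O, E = {2, 5}, {1, 4}          # made-up unions over the window
o_supp_u = {2, 6}

def f(cj, vi):
    """Message scaling factor f(c_j, v_i) from Line 11 of Algorithm 2."""
    if vi in h_of:
        h = h_of[vi]
        if cj in O and cj in o_supp_u:            # likely truly unsatisfied
            return alpha - h * delta_alpha        # strengthen correcting power
        if cj in E and cj not in o_supp_u:        # likely mis-satisfied
            return 1.0 / (alpha - h * delta_alpha)  # weaken its influence
    return 1.0

# Line 12: for v_i in the window, rebuild its entry of L_new from the scaled
# 1-iteration check-to-variable messages (all values here are made-up).
m0 = {3: -0.4, 7: 0.6}                               # initial LLRs m^(0)_{v_i}
m1_cv = {3: {2: 1.1, 1: -0.8}, 7: {5: 0.9, 4: 0.5}}  # m^(1)_{c_j -> v_i}
L_new = {vi: m0[vi] + sum(f(cj, vi) * m for cj, m in msgs.items())
         for vi, msgs in m1_cv.items()}
```

Note how v7, sitting in a less likely candidate (h = 1), receives a milder adjustment (factor 1.4 instead of 1.5), matching the h-dependence described in the text.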
Fig. 2. An example of the new post-processing decoding.

The corresponding soft parity messages are computed by (15) and listed under each check node. We first use Algorithm 1 to compute an exhaustive list of FASs with the search range being (s_max, t_max) = (2, 2). There are exactly three FASs 𝒜 = {A_1, A_2, A_3} within the search range, and they are depicted in Fig. 2. For each FAS A_i, i ∈ {1, 2, 3}, we use red and blue check nodes to represent the check nodes in O_{A_i} and E_{A_i}, respectively. We then calculate β_{A_i} by (14). For example, for the first FAS A_1, since O_{A_1} = {c_2}, Supp(Y^(0)) = {v_0, v_1, v_3}, O_Supp(Y^(0)) = {c_0, c_3}, Supp(Y^(l)) = {v_4}, and O_Supp(Y^(l)) = {c_2, c_3}, we have β_{A_1} = m_{c_2} = 0.06. Similarly, we can compute β_{A_2} = m_{c_1} + 2 · m_{c_3} = −0.05 + 2 · (−0.05) = −0.15 and β_{A_3} = m_{c_0} = −0.05. Since a smaller (more negative) β_{A_i} indicates that FAS A_i is more likely to be the error pattern, we sort these three FASs according to β_{A_i} and select A_2 as our first candidate in this example. We then adjust the LLR values of edges from the check nodes using the message scaling function f(c_j, v_i) and α = 1.5, as described in Line 11. For example, we compute f(c_3, v_3) = α − h∆_α = 1.5 − 0 × 0.1 = 1.5 since c_3 ∈ O ∩ O_Supp(Y^(0)) and v_3 ∈ A_2. As a result, we "strengthen" the LLR message m^(1)_{c_3→v_3} by the factor f(c_3, v_3) = 1.5 in order to increase the correcting power from the unsatisfied check node c_3. Similarly, since c_2 ∈ E \ O_Supp(Y^(0)), we compute f(c_2, v_3) = 1/(α − h∆_α) = 1/1.5. We then "weaken" the LLR message m^(1)_{c_2→v_3} by the factor f(c_2, v_3) = 1/1.5 in order to decrease the wrong LLR from the mis-satisfied check node c_2. After all check-to-variable messages are updated, we apply additional l′ = 5 iterations of BP decoding and see whether the new BP iterations can converge to a true codeword.

Note that the proposed algorithm is based on the initial LLR L^(0) and the initial decision Y^(0) (through the initialization of L_new and U in Lines 9 and 3, respectively). One can easily extend the proposed algorithm by considering the final LLR L^(l) and the final converged decision Y^(l) as well. That is, we replace Line 3 by U ← Y^(l), replace Line 9 by L_new ← L^(l), delete Line 4, and compute Lines 5 and 12 directly from the m^(l)_{c_j→v_i} messages after l BP iterations. In our implementation, we concatenate these two algorithms (one based on L^(0) and one based on L^(l), respectively) in sequence, i.e., we run the L^(0)-based scheme first and, if it succeeds, stop. If it does not succeed, we then continue with the L^(l)-based scheme. This concatenation further enhances the performance, as the usage of L^(l) is able to catch some errors that are not decodable by the original L^(0)-based scheme. In our experiments, the performance of the algorithm is quite robust, and any parameters in the ranges β_th ∈ [0, 10], d ∈ [1, 10], α ∈ [1.5, 2], ∆_α ∈ [0, 0.2], and l′ ∈ [1, 20] lead to good error-floor performance.

V. NUMERICAL RESULTS

A. Results of the Exhaustive FAS Search Algorithm

We report the results of applying the proposed algorithms to three classes of codes: randomly generated regular codes, randomly generated irregular codes, and codes generated by algebraic methods.

1) Randomly Constructed Regular LDPC Codes: We apply Algorithm 1 to four randomly generated regular (3, 6) LDPC codes from [39]. PEGR504 and PEGR1008 are of lengths 504 and 1008 and are constructed by the PEG algorithm [33]. M816 and M1008 are regular (3, 6) LDPC codes of lengths 816 and 1008. The results are listed in Table II. The first column of Table II specifies the code of interest, the second column specifies the search range of the FASs, and the following columns provide detailed information about the FAS list. For example, our search algorithm shows that for PEGR504 there are only two types of FASs satisfying s ≤ 14 and t = 1: the (9, 1) and the (13, 1) FASs. There is no (11, 1), no (12, 1), and no (14, 1) FAS in PEGR504. For these codes, our algorithm is capable of finding all small FASs when setting the search range (s_max, t_max) to (6, 6). Moreover, if we restrict ourselves to searching for a smaller t_max = 2, Algorithm 1 finds all⁵ FASs of size ≤ 13. For example, Algorithm 1 shows that there are exactly 14 different types of FASs in PEGR504 with s + t ≤ 13 (see the first seven rows of the super-row corresponding to PEGR504 in Table II). If we further assume that the most detrimental FASs have only ≤ 2 unsatisfied check nodes, then Algorithm 1 shows that, with a cycle-avoiding construction, PEGR504 has only 1 + 1 + 4 + 6 + 26 = 38 such FASs, and the minimal size among them is s = 8 with t = 2.

Without taking advantage of any special structures of regular (3, 6) codes, Algorithm 1 is actually computationally slower than the degree-3-centric result in [27] when searching for small FASs with s_max ≤ 5. On the other hand, our scheme can effectively find much larger FASs, such as the (10, 2) FASs (see Table II), for which there are no analytical exhaustive search results in the existing literature [27]. It is worth noting that, as an offline analysis tool, searching for FASs is generally a one-time task. Using an Intel Core2 2.4GHz CPU with 2GB memory, the running time to generate the FAS tables for these four codes ranges from 5 to 48 hours, which is within a reasonable time frame for most code design and

⁵ It can be proven that there are no (s, 2) FASs when s is an odd number for regular LDPC codes with column weight 3.

TABLE II
THE NUMBER OF (s, t) FASS FOR DIFFERENT LDPC CODES. FOR EACH CODE, THE FIRST COLUMN GIVES THE (s, t) SEARCH CONDITION AND THE REMAINING ENTRIES LIST EACH (s, t) VALUE FOUND TOGETHER WITH THE NUMBER OF SUCH FASS.

PEGR504:
  (s ≤ 13, t = 0): no FASs
  (s ≤ 14, t = 1): (9, 1): 1;  (13, 1): 1
  (s ≤ 13, t = 2): (8, 2): 4;  (10, 2): 6;  (12, 2): 26
  (s ≤ 10, t = 3): (5, 3): 14;  (7, 3): 47;  (9, 3): 146
  (s ≤ 9, t = 4): (4, 4): 760;  (6, 4): 849;  (8, 4): 2270
  (s ≤ 8, t = 5): (5, 5): 10156;  (7, 5): 22430
  (s ≤ 7, t = 6): (6, 6): 66352

PEGR1008:
  (s ≤ 13, t ≤ 2): no FASs
  (s ≤ 8, t = 3): no FASs
  (s ≤ 7, t = 4): (4, 4): 2;  (6, 4): 1
  (s ≤ 6, t = 5): (5, 5): 11236
  (s ≤ 7, t = 6): (6, 6): 85845

M816:
  (s ≤ 13, t = 0): no FASs
  (s ≤ 14, t = 1): no FASs
  (s ≤ 13, t = 2): (4, 2): 3;  (6, 2): 2;  (8, 2): 1;  (10, 2): 15;  (12, 2): 15
  (s ≤ 8, t = 3): (3, 3): 126;  (5, 3): 86;  (7, 3): 108
  (s ≤ 7, t = 4): (4, 4): 1350;  (6, 4): 2236
  (s ≤ 6, t = 5): (5, 5): 7286
  (s ≤ 7, t = 6): (6, 6): 68124

M1008:
  (s ≤ 13, t = 0): no FASs
  (s ≤ 14, t = 1): no FASs
  (s ≤ 13, t = 2): (4, 2): 6;  (6, 2): 5;  (8, 2): 3;  (10, 2): 4;  (12, 2): 6
  (s ≤ 8, t = 3): (3, 3): 153;  (5, 3): 8388;  (7, 3): 111
  (s ≤ 7, t = 4): (4, 4): 1130;  (6, 4): 1668
  (s ≤ 6, t = 5): (5, 5): 8388
  (s ≤ 7, t = 6): (6, 6): 72486

Tanner155:
  (s ≤ 15, t = 0): no FASs
  (s ≤ 16, t = 1): no FASs
  (s ≤ 15, t = 2): (8, 2): 465;  (10, 2): 1395;  (12, 2): 930;  (14, 2): 5115
  (s ≤ 12, t = 3): (5, 3): 155;  (9, 3): 930;  (11, 3): 5270
  (s ≤ 11, t = 4): (8, 4): 1395;  (10, 4): 17670
  (s ≤ 10, t = 5): (5, 5): 1860;  (7, 5): 6975;  (9, 5): 33945
  (s ≤ 9, t = 6): (6, 6): 13330;  (8, 6): 72850
  (s ≤ 8, t = 7): (7, 7): 53475

Margulis2640:
  (s ≤ 11, t ≤ 2): no FASs
  (s ≤ 8, t = 3): no FASs
  (s ≤ 7, t = 4): (4, 4): 1320
  (s ≤ 6, t = 5): (5, 5): 11088

PEGI504:
  (s ≤ 11, t = 0): no FASs
  (s ≤ 11, t = 1): (7, 1): 2;  (8, 1): 8;  (9, 1): 16;  (10, 1): 22;  (11, 1): 39
  (s ≤ 8, t = 2): (2, 2): 230;  (3, 2): 219;  (4, 2): 208;  (5, 2): 198;  (6, 2): 205;  (7, 2): 271;  (8, 2): 458
  (s ≤ 5, t = 3): (3, 3): 1517;  (4, 3): 3867;  (5, 3): 6888
  (s ≤ 4, t = 4): (4, 4): 26927
  (s ≤ 4, t = 5): (4, 5): 3322

PEGI1008:
  (s ≤ 10, t = 0): no FASs
  (s ≤ 10, t = 1): (7, 1): 1;  (8, 1): 4;  (9, 1): 8;  (10, 1): 14
  (s ≤ 8, t = 2): (2, 2): 458;  (3, 2): 439;  (4, 2): 420;  (5, 2): 404;  (6, 2): 387;  (7, 2): 403;  (8, 2): 519
  (s ≤ 5, t = 3): (3, 3): 3041;  (4, 3): 7737;  (5, 3): 13843
  (s ≤ 4, t = 4): (4, 4): 105897

analysis purposes. See Table I for a detailed computation-time comparison.

2) Algebraically Constructed LDPC Codes: We also apply Algorithm 1 to the algebraically constructed (155, 64, 20) Tanner code (Tanner155) [34] and to Margulis2640 of p = 11 and length 2640 [35]. Our algorithm exhaustively finds all FASs with the search range (s_max, t_max) set to (7, 7) and to (5, 5), respectively. See Table II. In addition, if we use a smaller t_max = 2, we find all FASs of size ≤ 15 for Tanner155 and of size ≤ 11 for Margulis2640. In [24], the exhaustive list of minimal 2-out TSs is found for Tanner155, which is identical to the exhaustive list of (8, 2) FASs obtained by our algorithm.

Our algorithm also exhaustively finds FASs of larger (s, t) values that were previously not found in [24]. For example, there are 930 (12, 2) FASs in Tanner155, and all (12, 2) FASs can be generated from the following representatives

0 37 52 57 60 61 72 73 81 93 136 153,
0 37 43 57 60 61 64 72 89 93 136 145,
0 37 43 52 57 61 64 81 86 93 145 153,
0 24 54 57 61 66 80 93 113 122 130 139,
0 4 41 57 61 63 86 97 104 118 127 141,
0 4 8 18 20 41 57 61 67 97 104 129,

0 7 9 24 28 54 57 61 87 93 113 149, and
0 11 37 48 61 69 78 93 97 105 127 150

by the automorphisms discussed in [34]. For the Margulis2640 code, Algorithm 1 shows that there is no FAS of types (6, 4), (7, 3), (8, 2), (9, 1), (10, 2), and (11, 1).

3) Randomly Constructed Irregular LDPC Codes: The proposed algorithm can also be applied to irregular LDPC codes. Nonetheless, irregular LDPC codes usually contain many variable nodes of degree 2. By the definition of FASs, if a degree-2 node v is in a FAS A, then v must satisfy |O_A(v)| < 0.5 d_v = 1, which implies |O_A(v)| = 0. This constraint limits the FAS search to sets of erroneous variable nodes for which no degree-2 node is connected to any unsatisfied check node. Such a condition is generally too restrictive, and the resulting FAS search algorithm would automatically exclude too many meaningful error-prone patterns. The following lemma illustrates the restrictiveness of the original FAS definition when applied to irregular LDPC codes that have a substantial number of degree-2 nodes.

Lemma 1: If ∪_{v ∈ V_2} N(v) = C, where V_2 is the set of variable nodes of degree 2, then there is no (s, t) FAS for any t > 0. That is, all FASs are of type (s, 0) and are actually codewords. The problem of finding the minimum FASs is thus equivalent to that of finding the minimum codewords.

The proof of Lemma 1 is relegated to Appendix A. Note that the condition ∪_{v ∈ V_2} N(v) = C is satisfied for all dual-diagonal LDPC codes [42], for which Lemma 1 shows that the traditional definition of FASs [9] is too restrictive: the only EPS that can be a FAS must also be a codeword. To resolve this restriction, for irregular LDPC codes we relax the definition of a FAS to |E_A(v)| ≥ |O_A(v)| for all v ∈ V of degree 2 and keep the original constraint |E_A(v)| > |O_A(v)| for all v of degree ≥ 3. By modifying (3) of our lower-bounding IP problem accordingly, we can apply Algorithm 1 to irregular codes as well and exhaustively identify the FASs under the new relaxed definition. We apply Algorithm 1 to PEGI504 and PEGI1008, two randomly constructed irregular LDPC codes in [39]. Table II lists all the corresponding FASs satisfying simultaneously s + t ≤ 8 and s ≥ t.

B. Post-Processing Decoding

In Fig. 3, we plot the FER performance of our new post-processing decoder for PEGR504 and PEGI504 over the AWGNC. We use l = 200 and l = 100 iterations for PEGR504 and PEGI504, respectively, and use the parameters β_th = 3, d = 4, α = 1.5, ∆_α = 0.1, and l′ = 20. We use all FASs listed in Table II as the input 𝒜. For comparison, we also plot the best existing result on PEGR504 [10], denoted by '[Varnica07]'. As seen in Fig. 3, Algorithm 2 outperforms the bit-guessing scheme in [10] and lowers the error floor of classic BP decoding by one to two orders of magnitude for both regular and irregular PEG-type LDPC codes. The performance gap between normal BP decoding and our post-processing decoding using FASs is wider in the high signal-to-noise-ratio (SNR) regime, as seen in Fig. 3 and the later Figs. 4 and 5. Note that in the high-SNR regime the number of additional post-processing iterations is generally much smaller, because the error pattern in the high-SNR regime is usually a small FAS (or very close to a small FAS). Therefore, Algorithm 2 works very efficiently.

The existing ID algorithm [14] is not very effective without the knowledge of the exhaustive list of small FASs. As can be seen in Fig. 3, the performance improvement of the approach in [14] is almost non-existent, which is in sharp contrast to the algorithm proposed in this work. When applying [14] to PEGR504, we notice that the algorithm in [14] works only when the error of the final decoding is precisely a small FAS. However, this happens only for regular LDPC codes and in the very high SNR regime. For irregular codes, BP decoding does not converge to small FASs even in the high-SNR region. As a result, even for the regular (2048, 1723) code, the error-floor improvement of [14] is negligible for moderate-to-high SNR (with FER between 10^−4 and 10^−7) and is significant only for very high SNR (FER ≈ 10^−12). For comparison, Algorithm 2 works for both regular and irregular codes over the entire SNR range. It is worth mentioning that when the final decoding error is precisely a small FAS, we can simply use our exhaustive list to match and flip the erroneous bits.

Fig. 3. The FER curves of the proposed decoder compared to the decoders in [10], [14]. 'BP (l)' is the performance of standard BP decoding with l iterations for both codes. '[Zhang08]' is the performance of [14] and '[Varnica07]' is the performance of [10] on PEGR504. The results of our proposed decoder are denoted by 'New'. (a) PEGR504. (b) PEGI504.

In Fig. 4, we also show the FER performance of our new post-processing decoder for M816 and M1008 with l = 50 iterations.
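As an aside to the FAS definitions used above, the per-node condition |O_A(v)| < |E_A(v)|, and its degree-2 relaxation for irregular codes, can be checked mechanically for a candidate variable-node set. The toy adjacency below is made-up, and the sketch tests only the nodes inside the candidate set (the absorbing-set condition); the additional "fully" requirement on nodes outside the set is omitted here.

```python
# Toy Tanner graph (made-up): neighbors[v] = checks touching variable v.
neighbors = {0: [0, 1, 2], 1: [0, 3], 2: [1, 3, 4], 3: [2, 4]}

def odd_even_counts(A):
    """For each v in A, count its odd-parity (O_A) and even-parity (E_A)
    neighboring checks, where a check's parity is the number of its
    neighbors inside A."""
    in_a = set(A)
    check_parity = {}
    for v in in_a:
        for c in neighbors[v]:
            check_parity[c] = check_parity.get(c, 0) + 1
    out = {}
    for v in in_a:
        odd = sum(1 for c in neighbors[v] if check_parity[c] % 2 == 1)
        out[v] = (odd, len(neighbors[v]) - odd)
    return out

def is_fas_candidate(A, relaxed_deg2=False):
    """Original rule: |O_A(v)| < |E_A(v)| for every v in A.
    Relaxed rule (for irregular codes): allow equality at degree-2 nodes."""
    for v, (odd, even) in odd_even_counts(A).items():
        if relaxed_deg2 and len(neighbors[v]) == 2:
            if even < odd:
                return False
        elif odd >= even:
            return False
    return True
```

On this toy graph, the set {0, 1, 2, 3} leaves every check with even parity and passes the test, while {0, 1} fails because v0 sees more unsatisfied than satisfied checks.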

In Fig. 5, we apply Algorithm 2 to Margulis2640 and compare its performance with that of the three existing post-processing decoders in [11]. For this code, we use l = 50 iterations of BP decoding and use β_th = 3, d = 5, α = 1.5, ∆_α = 0.1, and l′ = 20 in Algorithm 2. We employ all FASs of Margulis2640 with s_max = t_max = 5. The proposed decoder outperforms the state-of-the-art bi-mode and generalized-LDPC decoders in [11].

Fig. 4. The FER curves of the proposed decoder. The results of our proposed decoder are denoted by 'New' and normal BP decoding with 50 iterations is denoted by 'BP (50)'. (a) M816. (b) M1008.

Fig. 5. The FER curves of the proposed decoder for Margulis2640 compared to the decoders in [11]. We use 'GLDPC1' and 'GLDPC3' as shorthand for the two 'generalized' decoders in [11], and 'Bimode' denotes the bi-mode decoder in [11]. The results of our proposed decoder are denoted by 'New'.

It is worth mentioning that, with the sorting step in Line 7 of Algorithm 2, we often do not need to go through the entire FAS list. Since the complexity of Algorithm 2 is proportional to the number of FASs used, the complexity is kept low. For example, in our experiments, the average number of FASs used in Algorithm 2 is 13.56 at 2.3 dB for Margulis2640. Half of the time, the number of FASs used by Algorithm 2 is only 1, which shows the efficiency of the proposed decoder when compared to the results in [11]. Moreover, this is post-processing decoding, which imposes zero overhead when BP is successful. When the original BP is not successful (usually FER ≤ 10^−5 in the high-SNR regime of interest), the post-processing decoder imposes additional complexity that is proportional to the number of FASs. In sum, the average complexity is nearly identical to that of the original BP. For example, if we use the original BP with l = 50 iterations, we have FER = 1.3 × 10^−5 at 2.3 dB for Margulis2640. In our experiment, there is approximately 1 decoding error in 76800 frames. The total number of regular BP iterations for the 76799 error-free frames is roughly 532908 (assuming the BP stops when all parity-check equations are satisfied). For the single erroneous frame, the number of iterations is calculated as l + (the average number of FASs used) × l′ = 50 + 13.56 × 20 ≈ 322, which is negligible compared to the 532908 normal BP iterations. Note that the decoder in [10] is also a post-processing decoder, and its average complexity should be comparable to that of our proposed algorithm. On the other hand, the approach in [11] modifies the decoder itself, and the resulting decoding algorithm has almost twice the average complexity, as analyzed in [11]. The complexity cost of the bi-mode decoder in [11] is low, but the corresponding performance improvement is also insignificant, as can be seen in Fig. 5.

VI. CONCLUSIONS

We have proposed an efficient algorithm that finds all small FASs of any given LDPC code using a branch-&-bound algorithm. The bounding step of the exhaustive search algorithm is formulated as a new integer programming problem, and several techniques are devised to further enhance its efficiency. We have then used the resulting exhaustive list to design a new post-processing decoder that substantially lowers the error floor of any LDPC code. By taking advantage of the exhaustive FAS list, the post-processing decoder can be made more effective, and results on example LDPC codes show that our approach lowers the error floor by a couple of orders of magnitude when compared to the standard BP decoder and outperforms the existing state-of-the-art post-processing decoders.

APPENDIX A
PROOF OF LEMMA 1

Proof: We need to show that Inc(A) is always a codeword, i.e., |O_A| = 0, for every FAS A. Suppose not. Then there exists a FAS A with |O_A| > 0. Choose arbitrarily a c′ ∈ O_A. Since ∪_{v ∈ V_2} N(v) = C, there exists at least one variable node v′ ∈ V_2 that is connected to c′. Thus |O_A(v′)| ≥ 1, which in turn implies that |O_A(v′)| ≥ |E_A(v′)| because |O_A(v′)| + |E_A(v′)| = 2. However, this contradicts the definition of a FAS, which requires |O_A(v)| < |E_A(v)| for all v ∈ V. By contradiction, the proof is complete.

REFERENCES

[1] R. G. Gallager, Low-Density Parity-Check Codes. Cambridge, MA: MIT Press, 1963.
[2] D. Burshtein and G. Miller, "An efficient maximum-likelihood decoding of LDPC codes over the binary erasure channel," IEEE Trans. Inform. Theory, vol. 50, no. 11, pp. 2837–2844, Nov. 2004.
[3] T. Richardson, "Error floors of LDPC codes," in Proc. 41st Annu. Allerton Conf. on Commun., Contr., and Computing, Monticello, IL, Oct. 2003, pp. 1427–1435.
[4] I. B. Djordjevic, S. Sankaranarayanan, S. K. Chilappagari, and B. Vasic, "Low-density parity-check codes for 40-Gb/s optical transmission systems," IEEE Jour. Select. Topics in Quan. Elec., vol. 12, no. 4, pp. 555–562, Jul./Aug. 2006.
[5] C. Di, D. Proietti, E. Telatar, T. Richardson, and R. Urbanke, "Finite length analysis of low-density parity-check codes," IEEE Trans. Inform. Theory, vol. 48, no. 6, pp. 1570–1579, Jun. 2002.
[6] D. MacKay and M. Postol, "Weaknesses of Margulis and Ramanujan–Margulis low-density parity-check codes," Electronic Notes in Theoretical Computer Science, vol. 74, 2003.
[7] P. Vontobel and R. Koetter, "Graph-cover decoding and finite-length analysis of message-passing iterative decoding of LDPC codes," Dec. 2005, preprint arXiv:cs.IT/0512078.
[8] M. Stepanov and M. Chertkov, "Instanton analysis of low-density parity-check codes in the error-floor regime," in Proc. IEEE Int'l. Symp. Inform. Theory, Seattle, WA, Jul. 2006, pp. 552–556.
[9] L. Dolecek, Z. Zhang, V. Anantharam, M. Wainwright, and B. Nikolic, "Analysis of absorbing sets and fully absorbing sets of array-based LDPC codes," IEEE Trans. Inform. Theory, vol. 56, no. 1, pp. 181–201, Jan. 2010.
[10] N. Varnica, M. P. C. Fossorier, and A. Kavcic, "Augmented belief propagation decoding of low-density parity-check codes," IEEE Trans. Commun., vol. 55, no. 7, pp. 1308–1317, Jul. 2007.
[11] Y. Han and W. E. Ryan, "Low-floor decoders for LDPC codes," IEEE Trans. Commun., vol. 57, no. 6, pp. 1663–1673, Jun. 2009.
[12] E. Cavus, C. L. Haymes, and B. Daneshrad, "Low BER performance estimation of LDPC codes via application of importance sampling to trapping sets," IEEE Trans. Commun., vol. 57, no. 7, pp. 1886–1888, Jul. 2009.
[13] L. Dolecek, P. Lee, Z. Zhang, V. Anantharam, B. Nikolic, and M. Wainwright, "Predicting error floors of structured LDPC codes: deterministic bounds and estimates," IEEE Jour. Select. Areas in Commun., vol. 27, no. 6, pp. 908–917, Aug. 2009.
[14] Z. Zhang, L. Dolecek, B. Nikolic, V. Anantharam, and M. Wainwright, "Lowering LDPC error floors by postprocessing," in Proc. IEEE Glob. Telecom. Conf., Dec. 2008, pp. 1–6.
[15] E. Cavus and B. Daneshrad, "A performance improvement and error floor avoidance technique for belief propagation decoding of LDPC codes," in Proc. IEEE Int'l. Symp. on Pers., Indoor and Mobile Radio Comm., Berlin, Germany, Sept. 2005, pp. 2386–2390.
[16] C. A. Cole, S. G. Wilson, E. K. Hall, and T. R. Giallorenzi, "A general method for finding low error rates of LDPC codes," Feb. 2008, preprint arXiv:cs.IT/0605051v1.
[17] M. K. Dehkordi and A. H. Banihashemi, "An efficient algorithm for finding dominant trapping sets of LDPC codes," in Proc. 6th Int'l. Symp. Turbo Codes and Iterative Inform. Processing, Brest, France, Sept. 2010, pp. 444–448.
[18] S. Abu-Surra, D. DeClercq, D. Divsalar, and W. E. Ryan, "Trapping set enumerators for specific LDPC codes," in Proc. Inform. Theory and App. Workshop, La Jolla, CA, USA, Jan. 2010.
[19] E. R. Berlekamp, R. J. McEliece, and H. C. A. van Tilborg, "On the inherent intractability of certain coding problems," IEEE Trans. Inform. Theory, vol. 24, no. 3, pp. 384–386, May 1978.
[20] A. Vardy, "The intractability of computing the minimum distance of a code," IEEE Trans. Inform. Theory, vol. 43, no. 6, pp. 1757–1766, Nov. 1997.
[21] K. Yang and T. Helleseth, "On the minimum distance of array codes as LDPC codes," IEEE Trans. Inform. Theory, vol. 49, no. 12, pp. 3268–3271, Dec. 2003.
[22] A. Keha and T. M. Duman, "Minimum distance computation of LDPC codes using a branch and cut algorithm," IEEE Trans. Commun., vol. 58, no. 4, pp. 1072–1079, Apr. 2010.
[23] K. Krishnan and P. Shankar, "On the complexity of finding stopping set size in Tanner graphs," in Proc. 40th Conf. Inform. Sciences and Systems, Princeton, NJ, Mar. 2006.
[24] C.-C. Wang, S. R. Kulkarni, and H. V. Poor, "Finding all small error-prone substructures in LDPC codes," IEEE Trans. Inform. Theory, vol. 55, no. 5, pp. 1976–1998, May 2009.
[25] A. McGregor and O. Milenkovic, "On the hardness of approximating stopping and trapping sets," IEEE Trans. Inform. Theory, vol. 56, no. 4, pp. 1640–1650, Apr. 2010.
[26] E. Rosnes and Ø. Ytrehus, "An efficient algorithm to find all small-size stopping sets of low-density parity-check matrices," IEEE Trans. Inform. Theory, vol. 55, no. 9, pp. 4167–4178, Sept. 2009.
[27] S. K. Chilappagari, S. Sankaranarayanan, and B. Vasic, "Error floors of LDPC codes on the binary symmetric channel," in Proc. IEEE Int'l. Conf. on Commun., Istanbul, Turkey, Jun. 2006, pp. 1089–1094.
[28] C. Schlegel and S. Zhang, "On the dynamics of the error floor behavior in regular LDPC codes," in Proc. IEEE Inform. Theory Workshop, Taormina, Sicily, Italy, Oct. 2009, pp. 243–247.
[29] S. Ländner and O. Milenkovic, "Algorithmic and combinatorial analysis of trapping sets in structured LDPC codes," in Proc. 2005 Int'l. Conf. Wireless Networks, Commun., Mobile Comp., Jun. 2005, pp. 630–635.
[30] H. Pishro-Nik and F. Fekri, "Improved decoding algorithms for low-density parity-check codes," in Proc. 3rd Int'l. Symp. Turbo Codes and Related Topics, Brest, France, Aug. 2003.
[31] L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, "Optimal decoding of linear codes for minimizing symbol error rate," IEEE Trans. Inform. Theory, vol. 20, no. 2, pp. 284–287, Mar. 1974.
[32] X. Y. Hu, M. P. C. Fossorier, and E. Eleftheriou, "On the computation of the minimum distance of low-density parity-check codes," in Proc. IEEE Int'l. Conf. on Commun., Paris, France, Jun. 2004, pp. 767–771.
[33] X. Hu, M. Fossorier, and E. Eleftheriou, "Regular and irregular progressive edge-growth Tanner graphs," IEEE Trans. Inform. Theory, vol. 51, no. 1, pp. 386–398, Jan. 2005.
[34] R. Tanner, D. Sridhara, A. Sridharan, T. Fuja, and D. Costello, "LDPC block and convolutional codes based on circulant matrices," IEEE Trans. Inform. Theory, vol. 50, no. 12, pp. 2966–2984, Dec. 2004.
[35] G. A. Margulis, "Explicit constructions of graphs without short cycles," Combinatorica, vol. 2, no. 1, pp. 71–78, 1982.
[36] D. Burshtein, "On the error correction of regular LDPC codes using the flipping algorithm," IEEE Trans. Inform. Theory, vol. 54, no. 2, pp. 517–530, Feb. 2008.
[37] Z. Zhang, L. Dolecek, B. Nikolic, V. Anantharam, and M. Wainwright, "Investigation of error floors of structured low-density parity-check codes via hardware simulation," in Proc. IEEE Glob. Telecom. Conf., San Francisco, CA, Oct.–Nov. 2006.
[38] ILOG CPLEX 10.2 User's Manual. Sunnyvale, CA: ILOG SA, 2007.
[39] D. J. C. MacKay, "Encyclopedia of sparse graph codes," [Online]. Available: http://www.inference.phy.cam.ac.uk/mackay/codes/data.html.
[40] M. Karimi and A. H. Banihashemi, "An efficient algorithm for finding dominant trapping sets of LDPC codes," Aug. 2011, preprint arXiv:1108.4478v1.
[41] T. J. Richardson and R. L. Urbanke, "The capacity of low-density parity-check codes under message-passing decoding," IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 599–618, Feb. 2001.
[42] H. Jin, A. Khandekar, and R. McEliece, "Irregular repeat-accumulate codes," in Proc. 2nd Int'l. Symp. Turbo Codes and Related Topics, Brest, France, Sept. 2000, pp. 1–8.