Wireless Communications and Mobile Computing

Error Control Codes for Next-Generation Communication Systems: Opportunities and Challenges

Lead Guest Editor: Zesong Fei
Guest Editors: Jinhong Yuan and Qin Huang


Copyright © 2018 Hindawi. All rights reserved.

This is a special issue published in “Wireless Communications and Mobile Computing.” All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Editorial Board

Javier Aguiar, Spain; Oscar Esparza, Spain; Maode Ma, Singapore; Ghufran Ahmed, Pakistan; Maria Fazio, Italy; Imadeldin Mahgoub, USA; Wessam Ajib, Canada; Mauro Femminella, Italy; Pietro Manzoni, Spain; Muhammad Alam, China; Manuel Fernandez-Veiga, Spain; Álvaro Marco, Spain; Eva Antonino-Daviu, Spain; Gianluigi Ferrari, Italy; Gustavo Marfia, Italy; Shlomi Arnon, Israel; Ilario Filippini, Italy; Francisco J. Martinez, Spain; Leyre Azpilicueta, Mexico; Jesus Fontecha, Spain; Davide Mattera, Italy; Paolo Barsocchi, Italy; Luca Foschini, Italy; Michael McGuire, Canada; Alessandro Bazzi, Italy; A. G. Fragkiadakis, Greece; Nathalie Mitton, France; Zdenek Becvar, Czech Republic; Sabrina Gaito, Italy; Klaus Moessner, UK; Francesco Benedetto, Italy; Óscar García, Spain; Antonella Molinaro, Italy; Olivier Berder, France; Manuel García Sánchez, Spain; Simone Morosi, Italy; Ana M. Bernardos, Spain; L. J. García Villalba, Spain; Kumudu S. Munasinghe, Australia; Mauro Biagi, Italy; José A. García-Naya, Spain; Enrico Natalizio, France; Dario Bruneo, Italy; Miguel Garcia-Pineda, Spain; Keivan Navaie, UK; Jun Cai, Canada; A.-J. García-Sánchez, Spain; Thomas Newe, Ireland; Zhipeng Cai, USA; Piedad Garrido, Spain; Wing Kwan Ng, Australia; Claudia Campolo, Italy; Vincent Gauthier, France; Tuan M. Nguyen, Vietnam; Gerardo Canfora, Italy; Carlo Giannelli, Italy; Petros Nicopolitidis, Greece; Rolando Carrasco, UK; Carles Gomez, Spain; Giovanni Pau, Italy; Vicente Casares-Giner, Spain; Juan A. Gomez-Pulido, Spain; R. Pérez-Jiménez, Spain; Luis Castedo, Spain; Ke Guan, China; Matteo Petracca, Italy; Ioannis Chatzigiannakis, Greece; Antonio Guerrieri, Italy; Nada Y. Philip, UK; Lin Chen, France; Daojing He, China; Marco Picone, Italy; Yu Chen, USA; Paul Honeine, France; Daniele Pinchera, Italy; Hui Cheng, UK; Sergio Ilarri, Spain; Giuseppe Piro, Italy; Ernestina Cianca, Italy; Antonio Jara, Switzerland; Vicent Pla, Spain; Riccardo Colella, Italy; Xiaohong Jiang, Japan; Javier Prieto, Spain; Mario Collotta, Italy; Minho Jo, Republic of Korea; R. C. Pryss, Germany; Massimo Condoluci, Sweden; Shigeru Kashihara, Japan; Sujan Rajbhandari, UK; Daniel G. Costa, Brazil; Dimitrios Katsaros, Greece; Rajib Rana, Australia; Bernard Cousin, France; Minseok Kim, Japan; Luca Reggiani, Italy; Telmo Reis Cunha, Portugal; Mario Kolberg, UK; Daniel G. Reina, Spain; Igor Curcio, Finland; Nikos Komninos, UK; Abusayeed Saifullah, USA; Laurie Cuthbert, Macau; Juan A. L. Riquelme, Spain; Jose Santa, Spain; Donatella Darsena, Italy; Pavlos I. Lazaridis, UK; Stefano Savazzi, Italy; Pham Tien Dat, Japan; Tuan Anh Le, UK; Hans Schotten, Germany; André de Almeida, Brazil; Xianfu Lei, China; Patrick Seeling, USA; Antonio De Domenico, France; Hoa Le-Minh, UK; Muhammad Z. Shakir, UK; Antonio de la Oliva, Spain; Jaime Lloret, Spain; Mohammad Shojafar, Italy; Gianluca De Marco, Italy; M. López-Benítez, UK; Giovanni Stea, Italy; Luca De Nardis, Italy; M. López-Nores, Spain; E. Stevens-Navarro, Mexico; Liang Dong, USA; Javier D. S. Lorente, Spain; Zhou Su, Japan; Mohammed El-Hajjar, UK; Tony T. Luo, Singapore; Luis Suarez, Russia; V. Syrjälä, Finland; Reza Monir Vaghefi, USA; Jie Yang, USA; Hwee Pink Tan, Singapore; J. F. Valenzuela-Valdés, Spain; Sherali Zeadally, USA; Pierre-Martin Tardif, Canada; Aline C. Viana, France; Jie Zhang, UK; Mauro Tortonesi, Italy; Enrico M. Vitucci, Italy; Meiling Zhu, UK; Federico Tramarin, Italy; Honggang Wang, USA

Contents

Error Control Codes for Next-Generation Communication Systems: Opportunities and Challenges
Zesong Fei, Jinhong Yuan, and Qin Huang
Editorial (2 pages), Article ID 2643205, Volume 2018 (2018)

Efficient Quantization with Linear Index Coding for Deep-Space Images
Rehan Mahmood, Zulin Wang, and Qin Huang
Research Article (13 pages), Article ID 6387214, Volume 2018 (2018)

Superposition Coded Modulation Based Faster-Than-Nyquist Signaling
Shuangyang Li, Baoming Bai, Jing Zhou, Qingli He, and Qian Li
Research Article (10 pages), Article ID 4181626, Volume 2018 (2018)

A Novel Design of Downlink Control Information Encoding and Decoding Based on Polar Codes
Ce Sun, Zesong Fei, Jiqing Ni, Wei Zhou, and Dai Jia
Research Article (7 pages), Article ID 5957320, Volume 2018 (2018)

Performance Analysis of CRC Codes for Systematic and Nonsystematic Polar Codes with List Decoding
Takumi Murata and Hideki Ochiai
Research Article (8 pages), Article ID 7286909, Volume 2018 (2018)

Adding a Rate-1 Third Dimension to Parallel Concatenated Systematic Polar Code: 3D Polar Code
Zhenzhen Liu, Kai Niu, Chao Dong, and Jiaru Lin
Research Article (6 pages), Article ID 8928761, Volume 2018 (2018)

Construction and Decoding of Rate-Compatible Globally Coupled LDPC Codes
Ji Zhang, Baoming Bai, Xijin Mu, Hengzhou Xu, Zhen Liu, and Huaan Li
Research Article (14 pages), Article ID 4397671, Volume 2018 (2018)

Construction of Quasi-Cyclic LDPC Codes Based on Fundamental Theorem of Arithmetic
Hai Zhu, Liqun Pu, Hengzhou Xu, and Bo Zhang
Research Article (9 pages), Article ID 5264724, Volume 2018 (2018)

Code-Hopping Based Transmission Scheme for Wireless Physical-Layer Security
Liuguo Yin and Wentao Hao
Research Article (12 pages), Article ID 7063758, Volume 2018 (2018)

Research and Implementation of Rateless Spinal Codes Based Massive MIMO System
Liangliang Wang, Xiang Chen, and Hongzhou Tan
Research Article (9 pages), Article ID 6101853, Volume 2018 (2018)

Design and Analysis of Adaptive Message Coding on LDPC Decoder with Faulty Storage
Guangjun Ge and Liuguo Yin
Research Article (13 pages), Article ID 7658093, Volume 2018 (2018)

A CCM-Based OFDM System with Low PAPR for Sparse Source
Qinbiao Yang, Zulin Wang, and Qin Huang
Research Article (7 pages), Article ID 8923478, Volume 2018 (2018)

Hindawi
Wireless Communications and Mobile Computing
Volume 2018, Article ID 2643205, 2 pages
https://doi.org/10.1155/2018/2643205

Editorial
Error Control Codes for Next-Generation Communication Systems: Opportunities and Challenges

Zesong Fei,1 Jinhong Yuan,2 and Qin Huang3

1 School of Information and Electronics, Beijing Institute of Technology, Beijing 10081, China
2 School of Electrical Engineering and , University of New South Wales, Sydney, NSW 2052, Australia
3 School of Electronic and Information Engineering, Beihang University, Beijing 100191, China

Correspondence should be addressed to Zesong Fei; [email protected]

Received 3 October 2018; Accepted 3 October 2018; Published 2 December 2018

Copyright © 2018 Zesong Fei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Error control codes are widely applied in modern communication systems to improve the bandwidth-power efficiency and the reliability of data transmissions. Modern error control codes have attracted the interest of scholars and industry partners since Turbo codes were invented. For example, Turbo codes have been used in the cellular mobile systems. Nowadays, LDPC codes and the polar codes are adopted in the standard. The recent development on the theoretic framework of new channel coding theorem for finite code length will provide guidelines for future practical error control codes designs.

In the age of IoT, everything will be connected via communication links. It is expected that the next-generation communication systems need to support many scenarios such as wireless communications, optical communications, distributed storage systems, V2X networks, and sensor networks. These scenarios will impose new requirements on the communication systems, ranging from lower complexity encoder/decoder, lower delay or latencies, ultrareliable transmission at rates close to the Shannon capacity, low energy consumptions, etc. In addition to the communication systems, error control codes also find emerging applications in security, flash memories, and deep-space probing.

This special issue is a collection of 11 papers which explore the performance of error control codes for the next generation communication systems and discuss the opportunities and challenges that they will face.

“Superposition Coded Modulation Based Faster-Than-Nyquist Signaling”, by S. Li et al., presented a transmission scheme of faster-than-Nyquist signaling combined with superposition coded modulation. The proposed scheme requires a lower SNR than orthogonal signaling with a larger alphabet to achieve a wide range of spectral efficiencies.

“A Novel Design of Downlink Control Information Encoding and Decoding Based on Polar Codes”, by C. Sun et al., proposed a novel design on downlink control information encoding and decoding. The proposed scheme could support dynamic configuration of transmission modes with low complexity. It is shown that the proposed scheme can comply with the false alarm rate target of 5G standard.

“Performance Analysis of CRC Codes for Systematic and Nonsystematic Polar Codes with List Decoding”, by T. Murata and H. Ochiai, studied the effect of the length and generator polynomials of CRC codes on frame error rate performance of polar codes with list decoding. The authors found that the length of CRC will affect the performance significantly, while the generator polynomials will only affect nonsystematic polar codes.

“Adding a Rate-1 Third Dimension to Parallel Concatenated Systematic Polar Code: 3D Polar Code”, by Z. Liu et al., proposed a three-dimensional polar code scheme to improve the error floor performance of parallel concatenated systematic polar codes. The key idea of the proposed scheme is that it makes the best use of the characteristic of parallel concatenation and serial concatenation.

“Construction and Decoding of Rate-Compatible Globally Coupled LDPC Codes”, by J. Zhang et al., presented a family of rate-compatible globally coupled LDPC codes which provide more flexibility in code rate and guarantee the structural property of algebraic construction. The authors also proposed a modified low complexity decoding scheme.

“Construction of Quasi-Cyclic LDPC Codes Based on Fundamental Theorem of Arithmetic”, by H. Zhu et al., studied the construction of Quasi-Cyclic LDPC codes based on an arbitrary given expansion factor. The constructed codes perform well in AWGN channels when iterative decoding algorithms are used.

“Code-Hopping Based Transmission Scheme for Wireless Physical-Layer Security”, by L. Yin and W. Hao, proposed a code-hopping based secrecy transmission scheme that uses dynamic nonsystematic LDPC codes and automatic repeat-request (ARQ) mechanism to jointly encode and encrypt source messages. In this scheme, source message is jointly encoded and encrypted by a parity-check matrix which is dynamically selected from a set of LDPC matrices based on the shared secret key. The authors showed that the key is difficult for the eavesdropper to generate, and the security gap is small.

“Research and Implementation of Rateless Spinal Codes Based Massive MIMO System”, by L. Wang et al., proposed a spinal codes-aided massive MIMO scheme and a multilevel puncturing and dynamic block-size allocation scheme. The proposed schemes can approach the maximum achievable rate, and a comparable MIMO testbed is established to demonstrate the effectiveness of the proposed scheme.

“Design and Analysis of Adaptive Message Coding on LDPC Decoder with Faulty Storage”, by G. Ge and L. Yin, discussed the impacts of message errors on LDPC decoders with unreliable memories. A scheme based on the magnitude level of messages was also proposed to improve the robustness.

“A CCM-Based OFDM System with Low PAPR for Sparse Source”, by Q. Yang et al., introduced compressive coded modulation scheme to restrain peak-to-average power ratio in OFDM systems, which becomes severe when the source is sparse.

“Efficient Quantization with Linear Index Coding for Deep-Space Images”, by R. Mahmood et al., proposed a modified quantization with a linear index coding scheme to improve its spectral efficiency and robustness. The proposed scheme efficiently removes the redundant bit-planes for spectrally efficient linear index coding of images. A multipass decoding scheme is also proposed which provides better gain by using extrinsic information.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Acknowledgments

The editors thank all of the authors for their submissions to this special issue. We are also grateful to the anonymous reviewers for their valuable comments to improve the quality of the articles. We hope that this special issue will further encourage research interests and engineering practice in the area of Error Control Codes.

Zesong Fei
Jinhong Yuan
Qin Huang

Hindawi
Wireless Communications and Mobile Computing
Volume 2018, Article ID 6387214, 13 pages
https://doi.org/10.1155/2018/6387214

Research Article
Efficient Quantization with Linear Index Coding for Deep-Space Images

Rehan Mahmood,1,2 Zulin Wang,1,3 and Qin Huang1

1 School of Electronic and Information Engineering, Beihang University, Beijing 100191, China
2 Institute of Space Technology, Islamabad, Pakistan
3 Collaborative Innovation Center of Geospatial Technology, Wuhan 430079, China

Correspondence should be addressed to Qin Huang; [email protected]

Received 24 November 2017; Revised 14 May 2018; Accepted 10 September 2018; Published 11 October 2018

Academic Editor: Gonzalo Vazquez-Vilar

Copyright © 2018 Rehan Mahmood et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Due to the inevitable propagation delay involved in deep-space communication systems, a very high cost is associated with the retransmission of erroneous segments. The quantization with linear index coding (QLIC) scheme is known to provide compression along with robust transmission of deep-space images, and thus the likelihood of retransmissions is significantly reduced. This paper aims to improve its spectral efficiency as well as robustness. First, multiple quantization refinement levels per transmitted source block of QLIC are proposed to increase spectral efficiency. Then, iterative multipass decoding is introduced to jointly decode the subsource symbol-planes. It achieves better PSNR of the reconstructed image as compared to the baseline one-pass decoding approach of QLIC.

1. Introduction

Deep-space communication is more challenging than its near-Earth counterpart due to the associated huge inter-transceiver propagation delay [1, 2]. The inevitable propagation delay increases the likelihood of larger fluctuations in channel SNR during transmission, e.g., due to antenna pointing errors, atmospheric conditions, etc. [3, 4]. The error correcting codes defined in [5, 6] provide excellent performance for both near-Earth and deep-space communication systems. However, the postdecoding bit-error rate (BER) increases dramatically when the channel SNR degrades slightly below the decoding threshold of the code selected for transmission. The retransmission of erroneous frames is possibly required with a low rate code, if the postdecoding BER is higher than the value acceptable by the target application. Therefore, visually acceptable reconstruction quality of the images transmitted from deep-space despite moderate-to-high BER is considered as a desirable feature.

It is well-known that the state-of-the-art image compression standards (JPEG2000 and ICER) are sensitive to postdecoding residual errors [7–9] due to their embedded fixed-to-variable length entropy coders. Even a single error after the channel decoder makes the rest of the bitstream unsuitable for decoding. Although this catastrophic propagation of errors is spatially confined by partitioning the image into segments, the complete loss of certain segments may reduce the visual quality of the reconstructed image. Considering the strict integrity requirement of deep-space scientific images, retransmission of the lost segments is necessary.

In order to provide error resilient source transmission, a vast number of schemes are proposed in the literature under the category of joint source-channel coding (JSCC) [10–18]. A comprehensive survey of such schemes aiming towards robust transmission of images can be found in [19–24] and references therein. Particularly, this paper is interested in the JSCC based linear index coding scheme, which is known to reconstruct better visual quality images on mismatched deep-space channels. The scheme, referred to as quantization with linear index coding (QLIC), is proposed in [24] for the transmission of deep-space images using Raptor codes [25]. It replaces the concatenation of entropy coding and channel coding with a single linear encoding map, and channel codes are used to provide both source compression and error protection.

Indeed, QLIC is able to withstand significant SNR fluctuations while still preserving the visual quality of the image, but its spectral efficiency is inferior to the currently used deep-space image transmission system [26, 27]. Therefore, this paper is aimed towards enhancing the spectral efficiency as well as the robustness of QLIC. Firstly, the proposed encoder utilizes multiple refinement levels to efficiently reduce the redundant symbol-planes, thus increasing the overall spectral efficiency. Then, we show that the multipass decoding of QLIC provides significant gain, if the information which maximizes the virtual channel capacity is utilized in the subsequent decoding passes. Our iterative decoding provides better PSNR of the reconstructed image as compared to the baseline one-pass decoding.

On the encoder side, QLIC [24] determines the minimum number of symbol-planes from partitioned subbands in order to achieve a target distortion. The baseline QLIC uses the same block length $K$ for partitioned subbands as well as for the channel codeword. On one hand, large $K$ is required for capacity achieving performance of channel codes. On the other hand, small $K$ is necessary to efficiently identify the redundant transform coefficients. As a result, the spectral efficiency is compromised in both the cases. This paper proposes to partition the subbands in small block lengths while still using large block length for channel coding. Consequently, the proposed arrangement decreases the entropy rate of symbol-planes and thus less budget is required for a given channel capacity. Simulation results show that the proposed approach reduces the transmission overhead by up to 40% while still preserving the robustness feature of QLIC. Besides, it is shown that the degree distribution of the Raptor codes can be optimized by only considering the virtual correlation channel between the symbol-planes.

On the decoder side, QLIC [24] utilizes a one-way virtual correlation channel among the symbol-planes. Consequently, higher significant symbol-planes provide their decoding decisions to the subsequent lower significant symbol-planes in a multistage manner. In this paper, we show that lower significant symbol-planes can also provide new extrinsic information to the higher significant symbol-planes by executing multiple decoding passes. In fact, different from [28, 29], only the reliably recovered symbol of a lower level is used to provide effective extrinsic information, if its corresponding symbol of higher level was recovered with low reliability in the previous decoding pass. Therefore, we propose to only utilize the information from those combinations of the symbol-planes which result in the maximum capacity of the observed virtual correlation channel. As a result, the decoding performance as well as the quality of the reconstructed image will improve with the multipass decoder. It is shown in Section 4.2 that multipass decoding provides a gain of up to 1.5 dB in terms of PSNR of the reconstructed image as compared to the baseline one-pass decoding approach.

The main contributions of this paper are summarized below:

(i) An efficient quantization scheme is proposed for QLIC to assign quantization precision to the transformed image considering relatively small blocks. The proposed scheme achieves better overall transmission bandwidth efficiency.

(ii) A multipass decoding scheme is proposed to utilize the redundancy left in the lower level symbol-planes in order to improve the BER of the higher level symbol-planes. The improved BER of the higher levels eventually results in improved reconstruction quality.

The rest of this paper is organized as follows: In Section 2, we present the background of QLIC and notations used in the paper. The proposed idea of multiple refinement levels per subblock is discussed in Section 3 whereas the details of the multipass decoding approach are described in Section 4. Section 5 concludes the paper.

2. Quantization with Linear Index Coding

In this section, we briefly outline the QLIC scheme and the optimization problem, the solution of which is used in Section 3 for the assignment of multiple refinement levels to each source block. We also introduce the baseline encoding-decoding approach and the notations used throughout the paper, which are consistent with the notations in [24, 31].

In classical separated source-channel coding systems the aim of the source coding is to reduce the redundancy from the source as much as possible and then the error protection is provided by the channel codes. In this way, a controlled amount of redundancy is added to the transmitted data stream in order to protect it against channel induced errors. However, in QLIC, channel codes are used to provide compression as well as error protection. The source data after quantization is arranged into bit-planes and the bit-planes are directly mapped to the channel codewords. Consequently, the rate-budget assigned to each encoded bit-plane is directly proportional to the conditional entropy rate of the bit-plane given the higher level bit-planes. The direct mapping of the bit-planes has been shown to be more robust against channel induced errors [24].

Let us consider that an image after a $W$ level biorthogonal discrete wavelet transform (DWT) is divided into $s = 2^{2W}$ parallel source components with block length $K$ equal to the length of the LL0 subband, where LL0 represents the low pass subband after the 2D wavelet transform. In fact, LL0 is the subsampled version of the original image after DWT. Let $\mathbf{z}^{(i)} = (z_1^{(i)}, \ldots, z_K^{(i)})$ represent the sequence of the $i$th source transform coefficients, where $i = 1, \ldots, s$. All the operations in the context of QLIC are essentially the same for every source component except for the LL0 subband. Therefore, similar to [24], we do not consider the encoding and decoding of the LL0 subband and assume that it is available error-free at the receiver. However, in Sections 3.3 and 4.2, we consider the transmission overhead of the LL0 subband in comparing the simulation results of the proposed enhancements with the baseline QLIC scheme.

Let $Q_p : \mathbb{R} \mapsto \{A, B, C\}^p$, for $p = 1, \ldots, P$, be an embedded dead-zone uniform scalar quantizer (DZUSQ) applied to the transform coefficients $\mathbf{z}$, where $p = P$ is the highest level of refinement and $p = 1$ is the lowest one. The alphabets and decision regions of DZUSQ are shown in Figure 1. We denote $D_{Q,p}^{(i)}$ as the quantization distortion of the $i$th source component at the $p$th quantization level and $\mathbf{u} = Q(\mathbf{z})$ as the block of ternary quantization indices arranged in a two-dimensional $P \times K$ array. The $p$th row of $\mathbf{u}$, denoted by $\mathbf{u}^{(p)} = (u_{p,1}, \ldots, u_{p,K})$, is referred to as the $p$th “symbol-plane.” All the symbol-planes from 1 to $p$ are included for a refinement level of $p$. According to the chain rule of entropy, the entropy rate of every subsource is $H^{(i)} = \sum_{p=1}^{P} H_p^{(i)}$, where

$$H_p = \frac{1}{K} H\left(\mathbf{u}^{(p)} \mid \mathbf{u}^{(1)}, \ldots, \mathbf{u}^{(p-1)}\right), \tag{1}$$

for $p = 1, \ldots, P$, is the entropy rate of each symbol-plane in bits/source symbol.

Figure 1: Dead-zone uniform scalar quantizer (DZUSQ) $Q_p$. [At the lower precision level the decision regions are A, B (dead-zone), and C; at the higher precision level, regions A and C are each split into A and C, and the dead-zone B is split into A, B, and C.]

Let $r_i(d)$ denote the operational rate-distortion (R-D) function of the $i$th source component with respect to the mean-square error (MSE), where $d = (1/K)\,\mathbb{E}[\lVert \mathbf{z} - \hat{\mathbf{z}} \rVert^2]$ and $\hat{\mathbf{z}}$ is the estimate of $\mathbf{z}$ after transmission using a suitable source-channel code. $r_i(d)$ is then given by the lower convex envelope of the following set of points considering the concatenation of various refinement levels of the quantizer and ideal entropy coding:

$$\left(\sum_{j=1}^{p} H_j^{(i)},\; D_{Q,p}^{(i)}\right), \quad p = 0, \ldots, P, \tag{2}$$

where $D_{Q,0}^{(i)} = \sigma_i^2$ by definition. Since $r_i(d)$ is a piecewise linear and convex function, it is possible to represent it with a family of straight lines $a_{i,p} d + b_{i,p}$ obtained by joining the consecutive R-D points in (2).

Let us consider that a successive refinement source-channel code exists which encodes all the subsources up to a particular refinement level such that the overall distortion is given by

$$D = \frac{1}{s} \sum_{i=1}^{s} v_i d_i, \tag{3}$$

where $\{v_i\}$ is a set of nonnegative weights. The nonnegative weights are due to the fact that MSE distortion in the pixel domain is not equal to the MSE distortion in the transform domain for a biorthogonal transform. Further, if the source-channel code transmits the data in $N$ channel uses, the resulting transmission bandwidth efficiency for the corresponding scheme is given by $b = N/(sK)$. This notation is analogous to the “bit per pixel” (bpp) concept used in image coding and we will use $b$ to compare the spectral efficiency of our proposed enhancement in Section 3.3 with the baseline scheme.

Consequently, the refinement level $P$ which is necessary for every subsource to achieve an overall distortion $D$ is then the result of the following weighted MSE distortion linear program:

$$\text{minimize} \quad \frac{1}{s} \sum_{i=1}^{s} v_i d_i \tag{4}$$

$$\text{subject to:} \quad \frac{1}{s} \sum_{i=1}^{s} R_i \le bC, \qquad D_{Q,P}^{(i)} \le d_i \le \sigma_i^2, \;\forall i, \qquad R_i \ge a_{i,p} d_i + b_{i,p}, \;\forall i, p. \tag{5}$$

The above program considers idealized channel codes; however, practical channel codes require an overhead $\theta$ for successful decoding. The modified set of R-D points for the practical channel codes is given by

$$\left(\sum_{j=1}^{p} H_j^{(i)} \left(1 + \theta_j^{(i)}\right),\; D_{Q,p}^{(i)}\right), \quad p = 0, \ldots, P, \tag{6}$$

where the overhead $\theta$ can be determined experimentally for a particular family of channel codes. This modified set of R-D points is then used to determine $a_{i,p}$ and $b_{i,p}$ in (4), which are in fact the coefficients of the straight line.
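The entropy rates in (1) and the R-D points in (2) can be estimated directly from the quantization indices. The following Python sketch is one possible way to do so and is not taken from the paper: it assumes that the columns of the index array are i.i.d., so that the per-symbol conditional entropy can be estimated from empirical counts, and the function names `plane_entropy_rates` and `rd_points` are hypothetical.

```python
import math
from collections import Counter


def plane_entropy_rates(planes):
    """Empirical H_p = H(U_p | U_1..U_{p-1}) in bits/source symbol per plane.

    `planes` is a list of P equal-length sequences; planes[p-1][k] is u_{p,k}.
    Columns are treated as i.i.d. draws (an assumption of this sketch).
    """
    K = len(planes[0])
    rates = []
    prev_joint = 0.0  # joint entropy of the planes seen so far
    for p in range(1, len(planes) + 1):
        # Empirical joint distribution of the first p symbols of every column.
        counts = Counter(tuple(plane[k] for plane in planes[:p]) for k in range(K))
        joint = -sum(c / K * math.log2(c / K) for c in counts.values())
        # Chain rule: conditional entropy is the increase in joint entropy.
        rates.append(joint - prev_joint)
        prev_joint = joint
    return rates


def rd_points(rates, distortions):
    """Points (sum_{j<=p} H_j, D_{Q,p}) for p = 0..P.

    `distortions` holds P + 1 values, starting with D_{Q,0} (the source variance).
    """
    points, total = [(0.0, distortions[0])], 0.0
    for h, d in zip(rates, distortions[1:]):
        total += h
        points.append((total, d))
    return points


if __name__ == "__main__":
    rates = plane_entropy_rates(["AABB", "ACAC"])
    print(rates)
    print(rd_points(rates, [4.0, 1.0, 0.25]))
```

The distortion values passed to `rd_points` are placeholders chosen for illustration; in practice they would be measured per refinement level for each source component.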

Figure 2: (a) Multilayer encoding approach of QLIC. (b) Baseline one-pass multistage decoder. The higher significant bit-planes provide decoding decisions to the lower significant bit-planes. [Panel (a): the nonbinary quantization indices are split into bit-planes $\mathbf{u}^{(1)}, \ldots, \mathbf{u}^{(P)}$; each bit-plane of $K$ punctured systematic bits is encoded into $N_p$ parity bits, with $N_1 \propto H(\mathbf{u}^{(1)})$ and $N_P \propto H(\mathbf{u}^{(P)} \mid \mathbf{u}^{(1)}, \ldots, \mathbf{u}^{(P-1)})$, and $N_1, \ldots, N_P$ are transmitted over the channel. Panel (b): each decoding stage combines the intrinsic information from the channel with the estimate from the source bias.]
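Since the rate budget of each encoded bit-plane is proportional to its conditional entropy rate, the budget and the resulting bandwidth efficiency $b = N/(sK)$ can be tabulated with a few lines of code. The sketch below is illustrative only: the proportionality constant $K(1+\theta)/C$, with $C$ a channel capacity in bits per channel use and $\theta$ a single code overhead, is an assumption of this example, and `parity_budget` and `bandwidth_efficiency` are hypothetical names.

```python
import math


def parity_budget(entropy_rates, K, capacity, overhead=0.0):
    """Channel symbols N_p per bit-plane.

    Assumes N_p = ceil(K * H_p * (1 + overhead) / capacity); this constant of
    proportionality is an assumption of the sketch, not a formula from the paper.
    """
    return [math.ceil(K * h * (1.0 + overhead) / capacity) for h in entropy_rates]


def bandwidth_efficiency(budgets_per_component, K):
    """b = N / (s K): channel uses per source sample over all s components."""
    s = len(budgets_per_component)
    N = sum(sum(budget) for budget in budgets_per_component)
    return N / (s * K)


if __name__ == "__main__":
    # Two source components; the second one needs no bit-planes at all.
    first = parity_budget([0.5, 0.25], K=4096, capacity=0.5)
    second = parity_budget([0.0, 0.0], K=4096, capacity=0.5)
    print(first, second, bandwidth_efficiency([first, second], K=4096))
```

A larger overhead raises every $N_p$ and therefore $b$, which is why short block lengths, with their larger overhead, can cancel a gain in pure compression.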

Te solution of the above program is then used to Consequently, the coded nodes receive intrinsic information quantize each subsource up to a refnement level of �.In from the channel. Te systematic nodes, however, receive the order to use binary Raptor codes, the symbol-planes are message decomposed into bit-planes by considering ternary to binary alphabets mapping, the details of which are given in [31]. �(��,� =0|�̂1,�,...,�̂�−1,�) Terefore, in the rest of the paper we use the term “bit- ��,� = log , (7) planes” instead of “symbol-planes.” Te bit-planes of each �(��,� =1|�̂1,�,...,�̂�−1,�) block u are directly encoded to the channel codeword with multilayer encoding as shown in Figure 2(a). Te systematic where �̂1,�,...,�̂�−1,� denote the hard decisions obtained bits are punctured and only coded bits are transmitted. On from the already decoded higher signifcant bit-planes. the decoder side, the decoding is performed in a multistage Te sof information can also be used instead of hard manner starting from the highest signifcant bit-plane as decisions; however, no signifcant improvement in perfor- shown in Figure 2(b). Each �th stage decoder makes use of mance was observed as mentioned in [11]. Te errors, if any, (�) (1) (�−1) thesourceprobabilitymodelPr(u | u ,...,u ) which in the higher signifcant bit-planes will eventually feed the is already conveyed to the decoder as a part of the header. lower levels with false log-likelihood ratios (LLRs), though Wireless Communications and Mobile Computing 5 the error propagation is not as catastrophic as for the variable- 1024/2W to-fxed length decoding. Both hard and sof reconstruction are possible considering the posterior probability Pr(��,� | y), where y is the channel output. In the rest of the paper, we refer to this multistage decoder as the one-pass baseline multistage decoder. I II 3. 
Multiple Refinement Levels Diferent from JPEG2000 and ICER which use sophisticated context based probability models to derive the entropy coder, the pure compression performance of QLIC is derived by the solution of optimization problem in (4). Te solution IV III identifes the refnement levels � for every subsource block to achieve a target overall distortion. Te � bit-planes are then directly encoded to the channel codewords for transmission. Te pure compression performance of QLIC, i.e., using ideal channel codes, is inferior to the compression performance Figure 3: Binary map of a single subsource of an image from Mars provided by the state-of-the-art image compression systems. exploration rover. Te nonzero quantization indices are represented Tis diference, however, may become signifcant in practical with white pixels. transmission scenario due to the overhead associated with the nonideal performance of fnite-length channel codes. Te high level and low level transform coefcients are 50 usually not uniformly distributed within the subband due tovariousimagefeatures.Techoiceof�,corresponding 48 � � to �/2 ×�/2 portion of the subband, is thus critical for 46 better solution of the optimization problem, where �×� are the dimensions of the image. Since modern block codes 44 tend to be capacity achieving asymptotically, a large � favors increased spectral efciency due to relatively low �. Terefore, 42 a relatively larger block length of � = 16384 is used in the numerical results of [24, 31]. (dB) PSNR 40 However, large block length is associated with a serious drawback. Te optimization problem to fnd out the refne- 38 ment levels considers each subsource as a single unit. Conse- quently, the solution assigns the same refnement level to all 36 the transform coefcients within a subsource. However, due to diferent spatial location of high and low level coefcients 34 within a subsource, this assignment of refnement levels is 0 0.2 0.4 0.6 0.8 1 usually not optimal. 
For example, let us consider a 128 × 128 bpp length quantized subsource of an image taken from Mars K=4096 (Diferent Δ quantizer) exploration rover as shown in Figure 3. A refnement level K=4096 (Same Δ quantizer) K=16384 of � is used for quantization. Te zero values are represented with black pixels whereas the rest of the values are represented Figure 4: Te pure compression performance of QLIC using ideal with white pixels. Let us focus on quadrant II of the image. channel codes for an image taken from Mars exploration rover. Te Very few pixels in quadrant II have nonzero values and are compression performance for short block lengths is superior to that quantized with a refnement level of �. However, it is likely of the large block lengths. that if these values are quantized with a refnement level lower than �, it will not signifcantly afect the reconstruction quality of the image. In that case, the conditional entropy rate at 0.86 bpp for � = 4096, whereas 0.92 bpp is required in of the bit-planes also decreases which increases the overall case of � = 16384 to achieve the same PSNR. Although the spectral efciency. pure compression performance for � = 4096 is better, the encoding of bit-planes with short block length channel codes 3.1. Proposed Encoding Approach. A possible solution is to requires signifcant overhead for convergence. Te improved use small block lengths to fnd the quantization refnement compression performance is thus easily overwhelmed by the levels. Figure 4 compares the pure compression performance accumulatedoverheadofthechannelcodes. of QLIC for block lengths of � = 16384 and � = 4096.Itis Another possible solution is to combine the promising clear from Figure 4 that the pure compression performance features of both the block lengths; i.e., use small � to fnd the for � = 4096 is superior; e.g., PSNR of 48 dB is achieved solution of the optimization problem whereas use large block 6 Wireless Communications and Mobile Computing

[Figure 5 panels (a) and (b): arrangement of the quantization alphabets (A, B, C, and zeros) at refinement levels p = 1, 2, 3 for a source block of length K, shown without partitioning of the source block into segments and with partitioning of the source block into a 1st, 2nd, 3rd, and 4th segment.]

Figure 5: (a) Typical index assignment after reduced refinement level. (b) Proposed approach of index assignment.
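The observation made for quadrant II in Figure 3 can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions: it uses the first-order empirical entropy of the ternary symbols as a stand-in for the conditional entropy rate of the bit-planes, the block of symbols is invented, and the helper names `empirical_entropy` and `zero_sparse_segments` are hypothetical. It replaces the few nonzero symbols of a nearly empty segment with the zero symbol "B" and compares the entropy before and after.

```python
import math
from collections import Counter


def empirical_entropy(symbols):
    """First-order empirical entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def zero_sparse_segments(symbols, segment_len, max_nonzero, zero="B"):
    """Set a whole segment to `zero` when it holds at most `max_nonzero` nonzero symbols."""
    out = []
    for start in range(0, len(symbols), segment_len):
        seg = symbols[start:start + segment_len]
        nonzero = sum(1 for s in seg if s != zero)
        if nonzero <= max_nonzero:
            out.extend([zero] * len(seg))
        else:
            out.extend(seg)
    return out


# Invented toy block: four segments of eight ternary symbols, "B" is the zero symbol.
# The second segment is almost empty, like quadrant II in Figure 3.
block = list("ABBCBBAB" "BBBBABBB" "BCBBABBC" "BBABBCBB")
pruned = zero_sparse_segments(block, segment_len=8, max_nonzero=1)

h_before = empirical_entropy(block)
h_after = empirical_entropy(pruned)
print(f"entropy before: {h_before:.4f} bits/symbol, after: {h_after:.4f} bits/symbol")
```

In this toy case the zero symbol is already the most frequent one, so moving more probability mass onto it can only lower the first-order entropy. The actual scheme decides the refinement level per segment through the optimization problem (4), not through a fixed sparsity threshold.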

length channel codes for transmission. A trivial approach is to further partition each subsource into a power-of-2 number of segments of equal length. The optimization problem (4) is then solved by feeding all the segments of all the subsources as input. The obtained refinement levels $p$ are eventually used to quantize each segment, thus generating $p$ bit-planes per segment. The segments of the same subsource can then be concatenated together to make the overall block length $K$. For example, with four segments per subsource, $K = 16384$ corresponds to a segment length of 4096. Consequently, each bit-plane can be encoded and transmitted using a channel code of block length $K$.

The above scheme is straightforward but unable to improve the spectral efficiency in practice. Every segment associated with a subsource may have a different range of transform coefficients. Therefore, the quantization step size $\Delta$, which depends on the dynamic range of the transform coefficients in every segment, is different. In other words, the quantization of every segment may result in a different codebook. Thus, the quantization index assignment is also different even for coefficients of the same value that belong to different segments of the same subsource. Consequently, the overall entropy rate increases after concatenation of all the segments of the subsource as compared to the case when the same refinement level is used by considering the subsource as a whole, i.e., composed of all the segments. Since the rate budget assigned to each bit-plane is dependent on its entropy rate, the overall spectral efficiency decreases. In fact, the resulting spectral efficiency is even worse as compared to the large block length case.

Therefore, we propose to use the overall dynamic range of the transform coefficients of a subsource to determine the quantization step size $\Delta$ for every segment. In other words, the codebook for all the segments remains the same. Although this approach is suboptimal in the sense that a wider $\Delta$ is used, we observed that it works quite well as input to the optimization problem (4). The solution of the optimization problem may identify segments to be quantized with lower $p$ as compared to the case when optimization is performed by considering the whole subsource as a single segment. As shown in Figure 4, although the pure compression in this case is reduced relative to the previous arrangement, the conditional entropy rate of the bit-planes is also decreased. Less budget is therefore required for encoding, which ultimately increases the overall spectral efficiency. The conditional entropy rate of the bit-planes is decreased due to the fact that certain quantization symbols are replaced with "0" ("B" in our assignment of quantization alphabets). This process is equivalent to the one shown in Figure 5(a), which depicts that the 2nd segment is originally to be quantized with $p = 3$ but later reduced to $p = 1$.

The above solution gets rid of redundant symbol-planes quite effectively and thus increases the overall spectral efficiency. However, there are also some associated drawbacks. Let us assume that $p = 3$ is the subsource refinement level whereas $p_{\mathrm{seg}} = 1$ is the refinement level of one of its segments, obtained with the help of the procedure explained in the earlier paragraphs. Figure 5(a) shows the arrangement of the quantization alphabets without partitioning the source block into segments as well as with partitioning. The red square in the figure shows the quantization alphabets which become insignificant due to the partitioning of the source block; i.e., they are represented with "zeros." In other words, it is necessary to quantize the 2nd segment with $p = 3$ without partitioning of the source block, but after partitioning its quantization precision requirement is reduced to $p = 1$. Consequently, in the overall arrangement of the quantization alphabets of the $K$-length source block, e.g., "A A" appears as if it is reduced in precision with reference to the other segments. Therefore, it is necessary to convey information to the receiver about its updated refinement level. This overhead becomes significant particularly in case of short segments. Since lower bit-planes are more prone to errors in case of mismatched channel SNR, the quantization alphabets of the 2nd segment are more susceptible to channel noise in the partitioning case although they originally correspond to a higher significance level. The quantization symbols with decreased significance are thus more vulnerable to errors and the robustness of QLIC is compromised.

In order to overcome this issue, we propose to update the arrangement of the quantization alphabets as shown in Figure 5(b). This arrangement is robust in the sense that it preserves the significance of the higher significant bit-planes. Consequently, it alleviates the problem associated with the multistage decoding in case of mismatched channel capacity. Added to that, it is no longer required to convey the information about the updated refinement levels to the receiver. The reason can be quickly explained by considering the ternary quantization alphabets and the dead-zone uniform scalar quantizer (DZUSQ) such that "B" is the quantization alphabet which corresponds to 0. By design [31], the quantizer cells corresponding to alphabets "A" and "C" can only be subdivided into "A" and "C" for further refinement. Therefore, the assignment of "B" (0) to the higher refinement level where the lower refinement level is alphabet "A" or "C" is illegal and can easily be spotted by the dequantizer.

3.2. Robustness of Raptor Codes. The multilayer encoding approach of QLIC directly maps the symbol-planes to the channel codewords as shown in Figure 2. The rate budget $N_p$ assigned to each bit-plane is in proportion to the conditional entropy rate $H_p$ of the bit-plane given the higher significant bit-planes and is given by

$$N_p = K\left(\frac{H_p}{C} + \delta_p\right), \quad (8)$$

where $\delta_p > 0$ is a small rate margin associated with the particular family of channel codes used for encoding. The layered encoding uses systematic channel codes such that the systematic symbols are punctured before transmission and only $N_p$ parity bits are transmitted. Since $N_p$ is dependent on $H_p$, a discrete set of coding rates may result in reduced bandwidth efficiency as it is difficult to generate matching code rates for every possible $H_p$. However, this issue can be alleviated with the use of rateless codes, which have the added advantage of generating a virtually infinite number of coding rates.

The nonuniversality of Raptor codes is well-known for general noisy channels [32]. Due to multilayer encoding, every bit-plane observes a composite channel at the decoder. The parity symbols observe a physical channel with capacity $C$ whereas the systematic symbols observe a virtual correlation channel with capacity $(1 - H_p)$. For capacity achieving performance, the output degree distribution of Raptor codes must be optimized for the observed composite channel. This is achieved by finding the matching distribution for various $(H_p, C)$ pairs on a sufficiently fine grid as proposed in [24]. However, it is quite impractical to optimize the output degree distribution for every $(H_p, C)$ pair. Even on a grid with uniform spacing of 0.01 bits/sec for both $H_p$ and $C$, the number of output degree distributions will be in the thousands. Therefore, in the following we show that the output degree distribution of the Raptor codes optimized for a particular $(H_p, C)$ pair is robust to the variations in $C$. According to [33], this is particularly true when the joint decoding of the constituent codes (LT and LDPC) is performed.

Inspired by the work of [33], we numerically compute the decoding threshold of systematic binary Raptor codes for the case of the composite channel, i.e., by considering both the virtual correlation channel and the physical channel. In order to observe the effect of channel capacity $C$ on a degree distribution optimized for a particular $(H_p, C)$ pair, we fix $H_p$ and numerically find the threshold for different values of $C$. If the decoding is successful, we increase the code rate and repeat the process with a new ensemble of Raptor codes and a different noise realization. Similarly, if the decoding fails, we decrease the code rate and repeat the same process. The consistency of the threshold is then established by repeating the same process a large number of times and averaging the results.

[Figure 6 plot: decoding threshold versus channel capacity, with one curve for each design value of $H_p$: 0.1, 0.25, and 0.5.]

Figure 6: The decoding threshold of Raptor codes for different design channel capacities. The output degree distribution for all plots is optimized for SNR of 3 dB. It appears that there is no big difference in the decoding thresholds of Raptor codes at various channel SNRs.

The decoding thresholds of the Raptor code degree distributions optimized for the $(H_p, C)$ pairs (0.1, 0.72), (0.25, 0.72), and (0.5, 0.72) are shown in Figure 6. The numerical results show that there is no big difference in the decoding threshold of Raptor codes, optimized for a particular physical channel capacity, over the range of channel capacities relevant in deep-space

Figure 7: Image taken from Mars exploration rover: MER1 [30].

[Figure 8 axes: bandwidth expansion (b, bpp) versus SNR (dB).]
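The threshold search of Section 3.2 (raise the code rate after a successful decoding, lower it after a failure, repeat many times, and average) can be sketched as an up-down procedure. The sketch below is a minimal illustration, not the simulation used for Figure 6: the Raptor decoder is replaced by a hypothetical stand-in, `make_toy_decoder`, whose success probability falls smoothly as the rate passes an assumed true threshold, and all names, step sizes, and numeric values are assumptions chosen for the example.

```python
import math
import random


def estimate_threshold_rate(decode_trial, start_rate, step, n_trials, burn_in, rng):
    """Up-down search: raise the rate after a success, lower it after a failure,
    then average the rates visited after a burn-in period."""
    rate = start_rate
    visited = []
    for trial in range(n_trials):
        if trial >= burn_in:
            visited.append(rate)
        if decode_trial(rate, rng):
            rate += step  # decoding succeeded: try a higher code rate
        else:
            rate -= step  # decoding failed: fall back to a lower code rate
    return sum(visited) / len(visited)


def make_toy_decoder(true_rate, width):
    """Hypothetical stand-in for one Raptor decoding attempt (new code ensemble and
    new noise realization per call): success probability is a logistic function of rate."""
    def decode_trial(rate, rng):
        z = (true_rate - rate) / width
        z = max(-50.0, min(50.0, z))  # keep exp() well inside floating-point range
        return rng.random() < 1.0 / (1.0 + math.exp(-z))
    return decode_trial


rng = random.Random(0)
estimate = estimate_threshold_rate(
    make_toy_decoder(true_rate=0.60, width=0.01),
    start_rate=0.30, step=0.005, n_trials=5000, burn_in=500, rng=rng,
)
print(f"estimated threshold rate: {estimate:.3f}")
```

With a symmetric success curve the visited rates settle around the point where success and failure are equally likely, which is why averaging after the burn-in recovers the assumed threshold. In a real experiment `decode_trial` would run joint belief-propagation decoding over the composite channel for a fixed $H_p$ and a chosen $C$.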

communication. Therefore, it is not necessary to optimize the degree distribution of Raptor codes for every value of $C$. It is reasonable to perform the optimization for every $H_p$ at a single design channel capacity, which leads to system simplification. The design channel capacity may correspond to the middle of the range of relevant channel SNRs. In fact, in all the numerical results, we used the degree distribution of Raptor codes optimized for SNR of 3 dB. As an example, the degree distribution of the LT part of the Raptor codes optimized for the $(H_p, C)$ pair (0.1, 0.72) is given by

$$\Omega(x) = 0.0043x + 0.4856x^{2} + 0.1341x^{3} + 0.1607x^{4} + 0.10976x^{5} + 0.0140x^{8} + 0.0547x^{16} + 0.0015x^{41} + 0.0355x^{42}. \quad (9)$$

A high rate regular LDPC code with degree (2,100) and rate 0.98 is used as a precode in all the Raptor codes.

3.3. Spectral Efficiency Comparison. In this subsection, we present the simulation results depicting the spectral efficiency of QLIC using the multiple refinement levels approach. We also compare these results with the numerical simulations of [24] in order to highlight the improved spectral efficiency of our proposed approach. The test image used in all the simulation results is from Mars exploration rover and referred to as MER1. The image is shown in Figure 7. It is the same image used in [24]. It is an uncoded black-and-white (BW) image and bears a resolution of 1024x1024 pixels with 12-bit pixel value. The image is compressed using a 3-level Cohen–Daubechies–Feauveau (CDF) 9/7 wavelet transform as defined in the JPEG2000 standard for lossy compression [8]. The transform coefficients $\mathbf{z}$ are then quantized with a uniform dead-zone scalar quantizer, producing ternary quantization indices. The ternary quantization indices are further decomposed into binary indices to facilitate the use of binary Raptor codes [31]. Considering the resolution of the image, it is divided into 64 subsources of 128x128 pixels each, to make it compatible with the packet length defined in CCSDS recommendations [26], i.e., $K = 16384$. Every subsource is further subdivided into four segments to apply the multiple refinement levels concept. A target PSNR of 49 dB is defined for the reconstructed image. The LL0 subband [8] is not transmitted using the QLIC approach. Instead, it is transmitted along with the header using a rate 1/3 channel code, which results in negligible overhead as explained in [24]. We included this overhead in the simulation results although it accounts for only 0.004 in $b$ (bit-plane). We reproduce the results of [24] using binary Raptor codes for comparison purposes. Figure 8 compares the spectral efficiency of the baseline as well as the proposed QLIC approach achieved at various values of channel SNR. The following comments are in order:

(i) The (⬦)-curve corresponds to the spectral efficiency achieved by the baseline QLIC scheme using binary Raptor codes. Similar to the numerical results of [24], the degree distribution of the Raptor codes is optimized by considering both the virtual correlation channel and the physical channel on a sufficiently fine scale. Therefore, this curve serves as a benchmark to compare the spectral efficiency of the proposed scheme.

(ii) The (+)-curve corresponds to the spectral efficiency for the case when a small block length is used in the encoding process. The length of each segment is 4096; however, the quantization step size $\Delta$ is assigned according to the dynamic range of the transform coefficients within the segment. As explained earlier, due to the increased randomness of the data the resulting spectral efficiency is even worse as compared to the baseline scheme.

(iii) The (I)-curve corresponds to the spectral efficiency achieved by the proposed approach. We would like to highlight that, in order to obtain the spectral efficiency at various values of channel SNR, only Raptor codes optimized for 3 dB SNR are used. The gap between the proposed scheme (I)-curve and the baseline scheme (⬦)-curve is significant in terms of transmission efficiency. For example, 2.16 bpp is required by the baseline scheme at SNR of 2 dB. However, only 2.08 bpp is required by the proposed scheme in order to achieve the same performance. This corresponds to 40% reduced transmission overhead with reference to the performance of infinite length channel codes at 2 dB SNR denoted by (∗) in Figure 8.

(iv) The (∗)-curve shows the bandwidth expansion for the infinite length channel codes using EXIT charts.

[Figure 8 legend: Binary Raptor (Baseline); Binary Raptor (Different Δ quantizer); Binary Raptor (Proposed scheme); Infinite-length binary Raptor (EXIT).]

Figure 8: The spectral efficiency of QLIC at various SNR levels for MER-B image. The plots depict that the proposed QLIC scheme with multiple quantization precision per source block possesses superior spectral efficiency.

4. Multipass Decoding

In this section, we first establish the importance of the observed virtual correlation channel in multipass decoding. Then, we present the analysis of multipass decoding. It reveals that only certain symbols recovered in the first decoding pass can provide effective extrinsic information in the subsequent decoding passes. In particular, a reliably recovered symbol of a lower level can only provide effective extrinsic information if its corresponding symbol of the higher level was recovered with low reliability in the previous decoding pass. Therefore, we propose to utilize information from only those combinations of the bit-planes which result in the higher extrinsic information. The block diagram of the proposed multipass decoder is shown in Figure 9.

[Figure 9 block diagram: the $P$ bit-planes $\mathbf{u}^{(1)}, \ldots, \mathbf{u}^{(P)}$ with $N_1, \ldots, N_P$ parity bits are sent over the channel; the decoder side includes a second decoding pass.]

Figure 9: Proposed multipass decoding approach.

4.1. Proposed Decoding Approach. During decoding, every level of the multistage decoder observes a composite channel. A fraction $K/(K + N_p)$ of the output nodes in the decoding bipartite graph, namely, the systematic output nodes (output nodes in accordance with the Raptor codes), is connected to the virtual correlation channel with capacity $C_V$. The remaining fraction of output nodes is connected to the physical channel with capacity $C_{\mathrm{phy}}$. A higher ratio of the former fraction to the latter indicates more dependence on the decoding decisions of higher significant bit-planes than on the physical transmission channel output $\mathbf{y}^{(p)}$.

The overall observed channel capacity at the $p$th decoding stage can be given by $C^{(p)}_{\mathrm{total}} = C^{(p)}_{\mathrm{phy}} + C^{(p)}_{V}$, where $C^{(p)}_{V} = (1 - H_p)$. Since the assignment of the parity budget $N_p$ is probably different for every bit-plane, the channel capacity $C^{(p)}_{\mathrm{total}}$ observed by each bit-plane for convergence is also different. Therefore, in order to compare the overall observed channel capacity at various decoding levels, we introduce the normalized average channel capacity $\mathcal{C}$ in the following, considering the BER of the higher levels.

Let the multistage decoder at every decoding level provide the estimate $\hat{\mathbf{u}}^{(p)}$ of $\mathbf{u}^{(p)}$ after running sufficient iterations of the belief-propagation algorithm. Further, $P_c^{(p)} = \Pr(\hat{\mathbf{u}}^{(p)} = \mathbf{u}^{(p)})$ is the probability that the decoding decisions of a level are correct; the probabilities of the higher $(p - 1)$ decoding levels enter the capacity observed at level $p$. The normalized bit-level average channel capacity $\mathcal{C}^{(p)}$ observed at every $p$th level can then be defined as

$$\mathcal{C}^{(p)} = \left(\frac{C_{\mathrm{phy}} \times N_p}{K} - \delta_{\mathrm{phy}}\right) + (1 - H_p) \times \prod_{i=1}^{p-1} P_c^{(i)}, \quad (10)$$
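A small numeric sketch may help in reading the normalized capacity in (10). It is an illustration under stated assumptions: the function name and argument names are hypothetical, the numeric values are invented, and the parity budget is taken as exactly $K H_p / C_{\mathrm{phy}}$ (zero rate margin) so that the matched-channel case evaluates to 1.

```python
import math


def normalized_capacity(c_phy, n_parity, k, delta_phy, h_p, p_correct_higher):
    """Evaluate (10): physical-channel term plus virtual-channel term.

    p_correct_higher holds the probabilities of correct decisions at the
    higher (p - 1) levels; it is empty for the first bit-plane.
    """
    physical = c_phy * n_parity / k - delta_phy
    virtual = (1.0 - h_p) * math.prod(p_correct_higher)
    return physical + virtual


K = 16384
C_PHY = 0.72   # assumed nominal physical capacity
H_P = 0.25     # assumed conditional entropy rate of the bit-plane
N_P = K * H_P / C_PHY  # assumed parity budget with zero rate margin

matched = normalized_capacity(C_PHY, N_P, K, 0.0, H_P, [1.0])
degraded = normalized_capacity(C_PHY, N_P, K, 0.05, H_P, [0.9])
print(f"matched channel: {matched:.3f}, degraded channel: {degraded:.3f}")
```

The degraded case shows the two separate loss mechanisms in (10): the capacity degradation reduces the physical term directly, while decision errors at the higher levels shrink the virtual-channel term.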

[Figure 10 diagram: codewords for $\mathbf{u}^{(1)}$ and $\mathbf{u}^{(2)}$, each with $K$ systematic nodes and $N_1$ or $N_2$ parity bits, linked by virtual channels with capacities $1 - H_2$ and $1 - H_1^2$. Legend: reliably recovered source nodes of $\mathbf{u}^{(1)}$ after first decoding pass; reliably recovered source nodes of $\mathbf{u}^{(2)}$ after first decoding pass; reliably recovered source nodes of $\mathbf{u}^{(2)}$ different from $\mathbf{u}^{(1)}$; new reliably recovered source nodes of $\mathbf{u}^{(1)}$ after second decoding pass.]

Figure 10: Multipass decoding approach depicting the decoding of two correlated sources $\mathbf{u}^{(1)}$ and $\mathbf{u}^{(2)}$.

In (10), $\delta_{\mathrm{phy}}$ denotes the degradation in channel capacity from its nominal value. $\mathcal{C}^{(p)}$ becomes 1 provided that $\delta_{\mathrm{phy}} = 0$ and $\prod_{i=1}^{p-1} P_c^{(i)} = 1$.

Now let us consider the transmission using QLIC for $P = 2$ such that $\mathbf{u}^{(1)}$ and $\mathbf{u}^{(2)}$ are the bit-planes to be transmitted over a binary-input output-symmetric channel with nominal capacity $C_{\mathrm{phy}}$. The rate budget assigned to $\mathbf{u}^{(1)}$ is proportional to the marginal entropy rate of $\mathbf{u}^{(1)}$, whereas $\mathbf{u}^{(2)}$ is assigned a rate budget in proportion to its conditional entropy rate given $\mathbf{u}^{(1)}$. $\mathbf{y}^{(1)}$ and $\mathbf{y}^{(2)}$ are received corresponding to the transmission of $\mathbf{u}^{(1)}$ and $\mathbf{u}^{(2)}$, respectively, over a mismatched transmission channel with capacity $(C_{\mathrm{phy}} - \delta_{\mathrm{phy}})$. The baseline multistage decoder first decodes $\mathbf{y}^{(1)}$ and outputs the estimate $\hat{\mathbf{u}}^{(1)}$. The decoding of $\mathbf{y}^{(2)}$ then follows such that $\hat{\mathbf{u}}^{(1)}$ is available and LLRs according to (7) are provided to the systematic nodes of $\mathbf{u}^{(2)}$.

The normalized average channel capacities $\mathcal{C}^{(1)}_{\to 1}$ and $\mathcal{C}^{(2)}_{\to 1}$ observed by $\mathbf{u}^{(1)}$ and $\mathbf{u}^{(2)}$, respectively, during the first decoding pass (baseline multistage decoder), are given by

$$\mathcal{C}^{(1)}_{\to 1} = \left(\frac{C_{\mathrm{phy}} \times N_1}{K} - \delta_{\mathrm{phy}}\right) + (1 - H_1) \times P_c^{(0)},$$
$$\mathcal{C}^{(2)}_{\to 1} = \left(\frac{C_{\mathrm{phy}} \times N_2}{K} - \delta_{\mathrm{phy}}\right) + (1 - H_2) \times P_c^{(1)}, \quad (11)$$

where $P_c^{(0)} = 1$ due to the independent decoding of $\mathbf{u}^{(1)}$.

Let $\mathcal{A}^{(p)} = \{k : k \in \mathcal{K}\}$ denote the indices of the reliably recovered nodes of $\hat{\mathbf{u}}^{(p)}$, where $\mathcal{K} = \{k : k = 1, \ldots, K\}$ is an index set which labels the $K$ systematic nodes of the decoding bipartite graph. Now consider that every decoding stage of the multistage decoder is equipped with a genie-aided decoder which is capable of correctly identifying the reliably recovered source nodes after sufficient iterations of the BP algorithm. Consequently, it is possible to provide only the true LLRs $\{L_{p,k} : k \in \mathcal{A}^{(1)}\}$ to the corresponding systematic nodes of $\mathbf{u}^{(2)}$. The LLRs $\{L_{p,k} : k \in \mathcal{K} : k \notin \mathcal{A}^{(1)}\}$ are assumed to be erased and thus set to zero. Obviously, such a decoder will restrict the propagation of errors to the subsequent decoding stages. Further, this genie-aided decoder is somewhat similar to the case when the soft information recovered from the higher decoding levels is used to scale $L_{p,k}$. Indeed, for practical multipass decoding, we used this latter approach.

The decoded output $\hat{\mathbf{u}}^{(1)}$ for any arbitrary value $\delta_{\mathrm{phy}} > 0$ can only be improved by integrating new extrinsic information at the systematic nodes of the decoding bipartite graph. In other words, $P_c^{(1)}$ can only be increased by enhancing the capacity of the virtual channel $C_V^{(1)}$ by executing a second decoding pass and utilizing the extrinsic information available from the decoded output $\hat{\mathbf{u}}^{(2)}$. Now let us consider the following two hypothetical scenarios for the second decoding pass of $\mathbf{u}^{(1)}$ with the assumption that $|\mathcal{A}^{(1)}| > |\mathcal{A}^{(2)}|$, where $|\cdot|$ is the cardinality of the set. This assumption is true in case of the layered QLIC encoding approach.

(i) Scenario 1: $\mathcal{A}^{(2)} \cap \mathcal{A}^{(1)} = \mathcal{A}^{(2)}$; i.e., the reliably recovered node indices of $\hat{\mathbf{u}}^{(2)}$ are exactly the same as those of $\hat{\mathbf{u}}^{(1)}$ in the first decoding pass. Now if a second decoding pass is executed for $\mathbf{u}^{(1)}$, new extrinsic information from $\hat{\mathbf{u}}^{(2)}$ is available only for those systematic nodes of $\mathbf{u}^{(1)}$ which were already recovered reliably in the first decoding pass as shown in Figure 10. Effectively, negligible improvement in $C_V^{(1)}$ is observed and hence no further improvement in the decoding performance of $\mathbf{u}^{(1)}$ is possible in the subsequent decoding passes.

(ii) Scenario 2: now consider that all the nodes recovered in the first decoding pass of $\mathbf{u}^{(2)}$ are different from the recovered nodes of $\mathbf{u}^{(1)}$; i.e., $\mathcal{A}^{(2)} \cap \mathcal{A}^{(1)} = \emptyset$. In this case, new extrinsic information from $\hat{\mathbf{u}}^{(2)}$ is available corresponding to the nodes of $\mathbf{u}^{(1)}$ which were not recovered in the first decoding pass. Consequently, a second decoding pass of $\mathbf{u}^{(1)}$ is likely to increase $P_c^{(1)}$ as compared to the first decoding pass.

Consequently, let us define $P_d^{(2)} = \Pr(\mathcal{A}^{(2)} \cap \bar{\mathcal{A}}^{(1)})$ as the probability with which the reliably recovered node indices of $\hat{\mathbf{u}}^{(2)}$ are different from the reliably recovered node indices of $\hat{\mathbf{u}}^{(1)}$ after the first decoding pass. We can estimate $P_d^{(p)}$ for the case of two correlated sources as

$$P_d^{(p)} = \left(P_c^{(p)} \left(1 - P_c^{(p-1)}\right)\left(1 - H_p\right)\right). \quad (12)$$

The updated normalized average capacity observed by $\mathbf{u}^{(1)}$ in the second decoding pass is thus given by

$$\mathcal{C}^{(1)}_{\to 2} = \left(\frac{C_{\mathrm{phy}} \times N_1}{K} - \delta_{\mathrm{phy}}\right) + \left((1 - H_1) P_c^{(0)}\right) + \left((1 - H_1^{2}) P_d^{(2)}\right), \quad (13)$$

where $H_1^{2}$ is the conditional entropy rate of $\mathbf{u}^{(1)}$ w.r.t. $\mathbf{u}^{(2)}$. Consequently, if $((1 - H_1^{2}) P_d^{(2)}) > 0$, then $\mathcal{C}^{(1)}_{\to 2} > \mathcal{C}^{(1)}_{\to 1}$ and the decoding performance of $\mathbf{u}^{(1)}$ will be improved. Further, $\mathcal{C}^{(2)}_{\to 2}$ is given by

$$\mathcal{C}^{(2)}_{\to 2} = \left(\frac{C_{\mathrm{phy}} \times N_2}{K} - \delta_{\mathrm{phy}}\right) + \left((1 - H_2) P_c^{(1)}\right). \quad (14)$$

According to (13), the multipass decoding gain is dependent not only on the correlation among the bit-planes but also on the probability $P_d$. However, contrary to the genie-aided decoding case, it is not possible to identify the reliably recovered nodes for practical decoders. Therefore, the practical solution is to use the reliability information which is implicit in the soft decisions. The reliability information can then be used to scale the LLRs of the systematic nodes in the subsequent decoding passes. Therefore, different from [24], decoding with soft decisions is necessary for the multipass approach. In the following we use the soft-decision estimates for the practical case.

The extrinsic LLR $L^{(p)}_{\mathrm{ext}}$ for the $p$th bit-plane in each decoding pass can be represented as

$$L^{(p)}_{\mathrm{ext}} = \frac{P(\hat{u}_{p,k} = 0)}{P(\hat{u}_{p,k} = 1)}, \quad (15)$$

where $P(\hat{u}_{p,k} = 0)$ and $P(\hat{u}_{p,k} = 1)$ can be expressed with reference to all the bit-planes except $p$ as follows:

$$P(\hat{u}_{p,k} = 0) = P\left(u_{p,k} = 0 \mid \hat{u}_{(\mathcal{X}_1,k)}, \ldots, \hat{u}_{(\mathcal{X}_n,k)}\right) \times P\left(\hat{u}_{(\mathcal{X}_1,k)}, \ldots, \hat{u}_{(\mathcal{X}_n,k)}\right) \quad (16)$$

and

$$P(\hat{u}_{p,k} = 1) = P\left(u_{p,k} = 1 \mid \hat{u}_{(\mathcal{X}_1,k)}, \ldots, \hat{u}_{(\mathcal{X}_n,k)}\right) \times P\left(\hat{u}_{(\mathcal{X}_1,k)}, \ldots, \hat{u}_{(\mathcal{X}_n,k)}\right), \quad (17)$$

where $\mathcal{X} = \{i : i \in \mathcal{P} : i \neq p\}$, $n = |\mathcal{X}|$, and $\mathcal{P} = \{1, \ldots, P\}$ is an index set which labels all the bit-planes. The multipass decoding thus improves the decoding performance by providing $L^{(p)}_{\mathrm{ext}}$ to each bit-plane according to (15). However, the performance improvement even by using the soft decisions is not very significant as shown in Figure 11 ((∗)-curve). We explain the reason in the following and then improve the multipass decoding.

[Figure 11 plot: PSNR (dB) versus SNR (dB), with SNR decreasing from 3 dB to 2 dB. Legend: One-pass; One-pass with improved refinement levels; Multi-pass; Multi-pass with improved refinement levels; Multi-pass with conventional refinement levels.]

Figure 11: Reconstructed PSNR of MER1 when channel SNR varies from its nominal value of 3 dB.

In case of more than two bit-planes, different amounts of correlation exist between the combinations of the bit-planes. For example, in case of 3 bit-planes, $P(\mathbf{u}^{(1)} \mid \mathbf{u}^{(2)})$, $P(\mathbf{u}^{(1)} \mid \mathbf{u}^{(3)})$, and $P(\mathbf{u}^{(1)} \mid \mathbf{u}^{(2)}, \mathbf{u}^{(3)})$ are known at the decoder for the most significant bit-plane. Due to the multistage decoding, certain symbols are recovered with very low reliability in case of mismatched channel SNR. Particularly, this is true for bit-planes with a high conditional entropy rate $H_p$. Added to that, the layered encoding approach makes the lower significant bit-planes more susceptible to channel noise. This is mainly because the bit-planes are transmitted over a combination of the virtual and the physical channel and are thus affected by the noise on both channels.

Therefore, it is likely that including symbols from every bit-plane without any criterion in calculating the extrinsic information may decrease $L^{(p)}_{\mathrm{ext}}$. This issue becomes more significant as the number of bit-planes increases. Therefore, excluding such low reliability symbols from the calculation of extrinsic information may increase the multipass decoding performance. Indeed, it results in a higher capacity of the virtual correlation channel observed by the $p$th bit-plane as compared to the former case. However, it is difficult to define a threshold to identify the unreliable symbols. Therefore, in order to get the maximum benefit of multipass decoding for every $p$th bit-plane, we propose to calculate (15) for all the $2^{(P-1)} - 1$ combinations of the bit-planes belonging to the set $\mathcal{X}$. Then, the maximum value of $|L^{(p)}_{\mathrm{ext}}|$ is used as the extrinsic information. In the next subsection, we show that this approach outperforms the general approach.

4.2. Robustness Comparison. In this subsection, we compare the simulation results for the baseline one-pass and the multipass decoding approach. A simulation setup similar to the one explained in Section 3.2 is used. Figure 11 shows the reconstructed PSNR of the MER1 image. Originally, the rate budget assigned to the image corresponds to the nominal channel SNR of 3 dB. However, as the SNR decreases from its nominal value, the reconstruction quality of the image also decreases in terms of PSNR. The following comments are in order with reference to Figure 11:

(i) The (I)-curve shows the performance of the one-pass decoding scheme as the channel degrades from its nominal value. The PSNR degrades gradually as the channel SNR decreases, similar to the results of [24]. The baseline QLIC encoder is used to generate these results.

(ii) The (∗)-curve corresponds to the reconstructed PSNR of the image at various values of channel SNR by using the multipass decoding approach. Three iterations of the multipass decoder are executed to generate these results. It appears that multipass decoding provides gain over the one-pass decoding approach. Particularly, the gain is significant for low values of mismatched channel SNR. For example, the decoding gain is 1.5 dB when the channel degrades by 0.3 dB from its nominal value. Similar to the (I)-curve, the baseline encoder is used to generate these simulation results.

(iii) The (⬦)-curve corresponds to the typical multipass decoding, i.e., without using the approach proposed in the previous subsection. The multipass decoding provides a maximum gain of 0.6 dB at SNR of 2.8 dB. The gain decreases sharply in case of further channel degradation. This is due to the reason explained earlier that the symbols recovered with low reliability are unable to provide significant extrinsic information in subsequent decoding passes.

(iv) The (+)-curve is similar to the (I)-curve. However, the proposed multiple refinement level encoder is used. The simulation results confirm that the robustness of QLIC is retained by using the multiple refinement levels per subsource.

(v) The (×)-curve corresponds to the multipass decoding and the proposed multiple refinement encoder. The multipass decoding performance is similar to the (∗)-curve. It is shown that the multirefinement level approach outperforms the multipass decoding approach.

5. Conclusion

In this paper, we propose to efficiently remove the redundant bit-planes for spectrally efficient linear index coding of images. Further, the bit-planes are arranged to preserve their significance, and hence similar robustness performance is achieved even with higher spectral efficiency. Then, multipass decoding is used to iteratively decode the bit-planes. We show that the multipass decoding provides better gain by using extrinsic information from selected bit-planes.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by National Natural Science Foundation of China under Grant 61471022, and NSAF under Grant U1530117.

References

[1] G. Maral and M. Bousquet, Satellite Communications Systems: Systems, Techniques and Technology, John Wiley & Sons, 2011.
[2] J. Taylor, Deep Space Communications, Jet Propulsion Laboratory, California Institute of Technology, 2014.
[3] J. L. Massey, "Deep-space communications and coding: A marriage made in heaven," in Advanced Methods for Satellite and Deep Space Communications, pp. 1–17, Springer, 1992.
[4] T. De Cola, E. Paolini, G. Liva, and G. P. Calzolari, "Reliability options for data communications in the future deep-space missions," Proceedings of the IEEE, vol. 99, no. 11, pp. 2056–2074, 2011.
[5] CCSDS, "TM synchronization and channel coding," Tech. Rep. 131.0-B-3 Blue Book, Consultative Committee for Space Data Systems (CCSDS), 2017.
[6] K. S. Andrews, D. Divsalar, S. Dolinar, J. Hamkins, C. R. Jones, and F. Pollara, "The development of turbo and LDPC codes for deep-space applications," Proceedings of the IEEE, vol. 95, no. 11, pp. 2142–2156, 2007.
[7] O. Y. Bursalioglu, M. Fresia, G. Caire, and H. V. Poor, "Lossy joint source-channel coding using raptor codes," International Journal of Digital Multimedia Broadcasting, vol. 2008, Article ID 124685, 18 pages, 2008.
[8] C. Christopoulos, A. Skodras, and T. Ebrahimi, "The JPEG2000 still image coding system: an overview," IEEE Transactions on Consumer Electronics, vol. 46, no. 4, pp. 1103–1127, 2002.
[9] A. Kiely and M. Klimesh, "The ICER progressive wavelet image compressor," Interplanetary Network Progress Report, vol. 42, no. 155, pp. 1–46, 2003.
[10] O. Y. Bursalioglu, M. Fresia, G. Caire, and H. V. Poor, "Lossy multicasting over binary symmetric broadcast channels," IEEE Transactions on Signal Processing, vol. 59, no. 8, pp. 3915–3929, 2011.
[11] M. Fresia and G. Caire, "A linear encoding approach to index assignment in lossy source-channel coding," Institute of Electrical and Electronics Engineers Transactions on Information Theory, vol. 56, no. 3, pp. 1322–1344, 2010.
[12] Z. Yang, S. Zhao, X. Ma, and B. Bai, "A new joint source-channel coding scheme based on nested lattice codes," IEEE Communications Letters, vol. 16, no. 5, pp. 730–733, 2012.
[13] M. Fresia, F. Perez-Cruz, H. Poor, and S. Verdu, "Joint source and channel coding," IEEE Signal Processing Magazine, vol. 27, no. 6, pp. 104–113, 2010.
[14] J. Hagenauer, "Source-Controlled Channel Decoding," IEEE Transactions on Communications, vol. 43, no. 9, pp. 2449–2457, 1995.
[15] I. Shahid and P. Yahampath, "Distributed joint source-channel coding using unequal error protection LDPC codes," IEEE Transactions on Communications, vol. 61, no. 8, pp. 3472–3482, 2013.
[16] J. Li, S. Lin, K. Abdel-Ghaffar, W. E. Ryan, and J. Costello, "Integrated code design for a joint source and channel LDPC coding scheme," in Proceedings of the IEEE Information Theory and Applications Workshop, pp. 1–9, San Diego, Calif, USA, 2017.
[17] P. Wang, L. Yin, and J. Lu, "An efficient helicopter-satellite communication scheme based on check-hybrid LDPC coding," Tsinghua Science and Technology, vol. 599, pp. 10–26, 2018.
[18] L. Yin and W. Hao, "Code-hopping based transmission scheme for wireless physical-layer security," Wireless Communications and Mobile Computing, vol. 2018, Article ID 7063758, 12 pages, 2018.
[19] J. Kliewer and N. Gortz, "Iterative source-channel decoding for robust image transmission," in Proceedings of the IEEE International Conference on Acoustics Speech and Signal Processing ICASSP-02, Minneapolis, Minn, USA, April 1993.
[20] L. Xu, L. Wang, S. Hong, and H. Wu, "New results on radiography image transmission with unequal error protection using protograph double LDPC codes," in Proceedings of the 2014 8th International Symposium on Medical Information and Communication Technology (ISMICT), pp. 1–4, Firenze, Italy, April 2014.
[21] J. Kliewer, N. Goertz, and A. Mertins, "Iterative source-channel decoding with Markov random field source models," IEEE Transactions on Signal Processing, vol. 54, no. 10, pp. 3688–3701, 2006.
[22] R. Hamzaoui, V. Stanković, and Z. Xiong, "Optimized error protection of scalable image bit streams," IEEE Signal Processing Magazine, vol. 22, no. 6, pp. 91–107, 2005.
[23] A. Gabay, M. Kieffer, and P. Duhamel, "Joint source-channel coding using real BCH codes for robust image transmission," IEEE Transactions on Image Processing, vol. 16, no. 6, pp. 1568–1583, 2007.
[24] O. Y. Bursalioglu, G. Caire, and D. Divsalar, "Joint source-channel coding for deep-space image transmission using rateless codes," IEEE Transactions on Communications, vol. 61, no. 8, pp. 3448–3461, 2013.
[25] A. Shokrollahi, "Raptor codes," IEEE Transactions on Information Theory, vol. 52, no. 6, pp. 2551–2567, 2006.
[26] CCSDS, "Low density parity check codes for use in near-earth and deep space applications," Tech. Rep. 131.1-O-2 Orange Book, Consultative Committee for Space Data Systems (CCSDS).
[27] S. P. Protocol, "Recommendation for space data system standards," Tech. Rep. 133.0-B-1. Blue Book, CCSDS, 2003.
[28] R. Mahmood, Q. Huang, and W. Zulin, "A novel decoding method for linear index joint source-channel coding schemes," in Proceedings of the 13th International Bhurban Conference on Applied Sciences and Technology, IBCAST 2016, pp. 595–600, Islamabad, Pakistan, January 2016.
[29] R. Mahmood, Z. Wang, and Q. Huang, "Multi-pass decoding for the robust transmission of deep-space images," in Proceedings of the 2017 IEEE 85th Vehicular Technology Conference (VTC Spring), pp. 1–5, Sydney, NSW, Australia, June 2017.
[30] "Images, mars exploration program," https://mars.nasa.gov/.
[31] O. Y. Bursalioglu, G. Caire, and D. Divsalar, "Joint source-channel coding for deep space image transmission using rateless codes," in Proceedings of the 2011 Information Theory and Applications Workshop (ITA), pp. 1–10, La Jolla, Calif, USA, February 2011.
[32] O. Etesami and A. Shokrollahi, "Raptor codes on binary memoryless symmetric channels," IEEE Transactions on Information Theory, vol. 52, no. 5, pp. 2033–2051, 2006.
[33] A. Venkiah, C. Poulliat, and D. Declercq, "Jointly decoded raptor codes: analysis and design for the biawgn channel," EURASIP Journal on Wireless Communications and Networking, vol. 2009, Article ID 657970, 2009.

Hindawi Wireless Communications and Mobile Computing Volume 2018, Article ID 4181626, 10 pages https://doi.org/10.1155/2018/4181626

Research Article

Superposition Coded Modulation Based Faster-Than-Nyquist Signaling

Shuangyang Li,1,2 Baoming Bai,1 Jing Zhou,3 Qingli He,1 and Qian Li1

1State Key Laboratory of ISN, Xidian University, Xi’an, China
2Science and Technology on Communication Networks Laboratory, Shijiazhuang, China
3Department of EEIS, University of Science and Technology of China, Hefei, China

Correspondence should be addressed to Baoming Bai; [email protected]

Received 24 November 2017; Accepted 31 March 2018; Published 16 May 2018

Academic Editor: Luca Reggiani

Copyright © 2018 Shuangyang Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

A structure of faster-than-Nyquist (FTN) signaling combined with superposition coded modulation (SCM) is considered. The so-called FTN-SCM structure is able to achieve the constrained capacity of FTN signaling and requires only a low detection complexity. By deriving a new observation model suitable for FTN-SCM, we offer the power allocation based on a proper detection method. Simulation results show that, at any given spectral efficiency, the bit error rate (BER) curve of FTN-SCM lies clearly outside the minimum signal-to-noise ratio (SNR) boundary of orthogonal signaling with a larger alphabet. The achieved data rates are also close to the maximum data rates of the given shaping pulse.

1. Introduction

With the demand for and the growth of advanced signal processing capabilities at base stations, the need for efficient backhauling solutions to transmit a large amount of data increases significantly. Thus, as one of the most important parts of deploying the fifth-generation (5G) cellular network, more efficient backhauling techniques need to be applied [2]. Conventionally, the capacity of networks is enlarged by consuming more time/bandwidth/spatial resources. However, this solution may not always be possible for practical reasons. Hence, as an alternative method to gain more capacity, FTN signaling has recently received a lot of attention. An overview of FTN signaling for 5G communication systems was provided in [3].

FTN signaling is an extension of traditional linear modulation and a classical way of nonorthogonal signal transmission, which was first proposed by Mazo in 1975 [4]. Mazo discovered that, with a sinc pulse as the shaping pulse, the minimum squared Euclidean distance of binary phase shift keying (BPSK) modulated pulses remains the same even when the symbol rate is, to some extent, higher than the Nyquist criterion. His work indicates that roughly 25% more bits could be transmitted in the same bandwidth compared to Nyquist signaling, with almost the same error performance over additive white Gaussian noise (AWGN) channels. Recently, Rusek et al. proved that FTN signaling is able to bring more degrees of freedom (DoF) over the AWGN channel [5, 6] compared to orthogonal signaling. As a result, a higher spectral efficiency is expected for FTN signaling and, indeed, fascinating simulation results have already been reported. In [7], a precoded FTN system with quadrature phase shift keying (QPSK) modulation was presented, which, as simulation results imply, requires lower SNR to reach BER < $10^{-5}$ compared to that of the constrained capacity of 8-PSK for orthogonal signaling with the same spectral efficiency. However, there is still no signaling method in the literature that is able to outperform orthogonal signaling constrained by a larger alphabet at any preferred spectral efficiency. The reason for this problem lies in the complexity of maximum-likelihood (ML) detection for FTN signals, which grows exponentially with the size of the alphabet and with the number of taps of intersymbol interference (ISI), respectively. When the system requires high spectral efficiency, conventional FTN signaling systems need either an alphabet with a larger size or a compression factor of a smaller value to meet the requirements. Consequently, the required ML detection complexity becomes prohibitively high and a suboptimal detection method has to be utilized, which in turn degrades the performance. Hence, in this paper, we attempt to solve this issue by considering SCM [8–13].

SCM is a special case of multilevel coding (MLC) [8], which offers an excellent solution to transmissions with severe interference. With the use of the fast Fourier transform- (FFT-) based technique proposed in [9], the detection complexity of an SCM system is $O(\log N_{\text{frame}})$, where $N_{\text{frame}}$ denotes the frame length [10]. Moreover, with a proper Gaussian assumption, the optimization for SCM systems is easier than that of conventional bit interleaved coded modulation (BICM) systems [10]. SCM has also been proven to have promising performance over a variety of channels [11, 12]. More advantages of SCM can be found in [10] and the references therein.

The idea of combining SCM with FTN signaling first appeared in [14], where FTN signals are treated as the sum of several orthogonal signals with different time delays; thus it allows successive interference cancellation (SIC) detection at the receiver. However, in [1], it has been proven that the aforementioned structure cannot bring any gain in terms of DoF. Hence, a so-called “full-FTN” structure has been proposed in [1] along with its proof of achieving the constrained capacity of FTN signaling. In this structure, the signals are viewed as the sum of several FTN signals of the same compression factor and SIC is also utilized to reduce the detection complexity. Different from traditional FTN signaling, to gain a higher spectral efficiency, this structure utilizes more layers rather than a small compression factor. Since, with SIC detection, the detection complexity grows linearly with the number of layers and exponentially with the number of ISI taps, the overall detection complexity of this structure is normally very low. On the other hand, since at each layer the symbol rate still exceeds the signal bandwidth, the DoF gain of FTN signaling is maintained. However, this structure still lacks a well-designed equalizer to perform SIC, because the common equalizers for FTN signaling, such as the one in [15], require a whitening filter in the receiver. This is rather difficult and even impossible when the FTN signal, at each layer, is corrupted by both the colored noise and the signals from other layers. Hence, a new observation model is needed, which allows the SIC and the detection for each individual layer at the same time. It should also be noted that the combination of FTN signaling and SCM bypasses the obstacle of designing the channel code in terms of different compression factors. Similar to traditional SCM, an identical code can be utilized for all layers of FTN-SCM, which makes the design and implementation of an FTN-SCM system very easy. By simply adjusting the number of superposition layers and the power allocation, FTN-SCM is able to provide a wide range of spectral efficiencies with excellent performance.

The main contributions of this paper are summarized in the following:

(1) We adapt the idea from [1] and provide a more generalized FTN-SCM scheme.

(2) A new channel observation model suitable for FTN-SCM is introduced.

(3) The detection method and the corresponding power allocation for FTN-SCM are also discussed.

(4) Simulation results show that, for BER < $10^{-5}$, FTN-SCM requires lower SNR than that of orthogonal signaling with a larger alphabet at any given spectral efficiency.

The rest of this paper is organized as follows. The diagram of FTN-SCM is provided in Section 2. Then, the new channel observation is derived in Section 3. In Section 4, the detection method is described, along with the power allocation derivation. Our numerical results are presented in Section 5, and finally a brief conclusion is provided in Section 6.

2. System Model

The transmitter structure of FTN-SCM is illustrated in Figure 1. Assume that the sequence $\mathbf{u}$ carrying information bits is separated into $K$ substreams, namely, $\mathbf{u}_0, \mathbf{u}_1, \ldots, \mathbf{u}_{K-1}$. Each subsequence of $\mathbf{u}$, say $\mathbf{u}_k$, is then encoded by its corresponding encoder generating the codeword $\mathbf{c}_k$ of length $N$. $\mathbf{c}'_k$ is the permuted version of $\mathbf{c}_k$, which is afterward modulated in the form of BPSK with an average symbol energy $E_k = P_k \tau T$, where $P_k$ is a pregiven power, $\tau$ is the compression factor, and $T$ is the symbol time. $\mathbf{x}_k = \{x_k[1], x_k[2], \ldots, x_k[n], \ldots, x_k[N]\}^{\mathrm{T}}$ represents the modulated symbols at the $k$th layer, which are then superposed directly with the modulated symbols from other layers. The transmitted symbol sequence $\mathbf{x}$ is obtained once the superposition is finished, where the $n$th symbol of $\mathbf{x}$ is given as $x[n] = \sum_{k=0}^{K-1} x_k[n]$. The FTN modulator shapes the transmitted signal $s(t)$ for the given input $\mathbf{x}$ based on a certain $T$-orthogonal pulse $h(t)$. Without loss of generality, an FTN-SCM signal can be expressed as

$$s(t) = \sum_{n=1}^{N} x[n]\, h(t - n\tau T) = \sum_{k=0}^{K-1} \sum_{n=1}^{N} x_k[n]\, h(t - n\tau T). \tag{1}$$

A brief diagram of an FTN-SCM signal is given in Figure 2, where $K = 2$ and $\tau = 0.5$. As shown in the figure, the pulse of each individual symbol is interfered by the pulses from both the current layer and the other layers. It should be mentioned that, in this case, a symbol rate that is higher than the Nyquist criterion is maintained at each layer. Thus, the capacity gain of FTN signaling is preserved. Note that a different power assignment for each layer is not the only way of performing SIC in the receiver; similar to orthogonal signaling, choosing codes of different rates for each layer may also do the work.

Figure 3 illustrates the receiver structure of FTN-SCM. In this paper, as we only focus on AWGN channels, the received signal $r(t)$ is presented as $r(t) = s(t) + n(t)$, where $n(t)$ has one-sided power spectral density (PSD) $N_0$. Let $g_l$

u c c x 0 0 0  0 Encoder-0 Interleaver-0 E0(1 − 2c0[n])

c u1 c1 1  x1 Encoder-1 Interleaver-1 E1(1 − 2c1[n]) u x FTN s(t) S/P  modulator ......

c uK−1 cK−1 K−1  xK−1 Encoder-(K − 1) Interleaver-(K − 1) EK−1(1 − 2cK−1[n])

Figure 1: Te transmitter structure of FTN-SCM.

Figure 2: A brief diagram of an FTN-SCM signal with $K = 2$ and $\tau = 0.5$. (Symbols $x_0[n]$ and $x_1[n]$ of the two layers are drawn on a common time axis marked $0, T/2, T, 3T/2, \ldots, nT, (n+1)T$.)

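Figure 3: The receiver structure of FTN-SCM. (Block diagram: $r(t)$ passes through the matched filter to give $\mathbf{y}$; for each layer $k = 0, \ldots, K-1$, an "interference calculation & SIC" block feeds "detector and decoder-$k$", which outputs $\mathbf{u}_k$ and updates $\mathbf{x}_k$.)

represent the autocorrelation function samples of $h(t)$. We have

$$g_l = \int_{-\infty}^{\infty} h(t)\, h^{*}(t - l\tau T)\, \mathrm{d}t, \quad -L_{\mathrm{I}} \le l \le L_{\mathrm{I}}, \tag{2}$$

where $(\cdot)^{*}$ denotes the complex conjugation and $L_{\mathrm{I}}$ is the length of the finite ISI taps. The output of the matched filter is denoted as $\mathbf{y}$. We thus have

$$\mathbf{y} = \mathbf{G}\mathbf{x} + \boldsymbol{\eta}, \tag{3}$$

where $\mathbf{G}$ is a Toeplitz matrix given as

$$\mathbf{G} = \begin{pmatrix}
1 & g_{-1} & \cdots & g_{-L_{\mathrm{I}}} & 0 & 0 & 0 & \cdots \\
g_{1} & 1 & g_{-1} & \cdots & g_{-L_{\mathrm{I}}} & 0 & 0 & \cdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & \ddots & & \\
g_{L_{\mathrm{I}}} & \cdots & g_{1} & 1 & g_{-1} & \cdots & g_{-L_{\mathrm{I}}} & \cdots \\
0 & g_{L_{\mathrm{I}}} & \cdots & g_{1} & 1 & g_{-1} & \cdots & \cdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots &
\end{pmatrix} \tag{4}$$

and $\boldsymbol{\eta}$ is the colored-noise vector with zero mean and the covariance matrix $\mathrm{E}[\boldsymbol{\eta}\boldsymbol{\eta}^{\mathrm{H}}] = (N_0/2)\mathbf{G}$. Here, $\mathrm{E}[\cdot]$ is the expectation operator and $(\cdot)^{\mathrm{H}}$ is the Hermitian (conjugate) transpose.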
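As a rough numerical illustration of the matched-filter model (3), the following sketch builds a truncated Toeplitz matrix $\mathbf{G}$ and a superposed symbol vector, then checks that subtracting the contribution of the other layers leaves exactly the contribution of one layer. It is only a sketch under stated assumptions: a unit-energy RRC shaping pulse (whose autocorrelation samples are those of a raised-cosine pulse), a noiseless channel, and perfect estimates of the other layers. The function names, the BPSK mapping with amplitude $\sqrt{P_k}$, and all parameter values are illustrative choices, not the authors' implementation.

```python
import numpy as np

def raised_cosine(t, alpha, T=1.0):
    """Raised-cosine pulse, i.e. the autocorrelation of a unit-energy RRC pulse."""
    t = np.asarray(t, dtype=float)
    denom = 1.0 - (2.0 * alpha * t / T) ** 2
    singular = np.isclose(denom, 0.0)
    # Use the known limit value at t = +/- T / (2 * alpha) instead of dividing by zero.
    limit = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * alpha))
    value = np.sinc(t / T) * np.cos(np.pi * alpha * t / T) / np.where(singular, 1.0, denom)
    return np.where(singular, limit, value)

def isi_matrix(n_symbols, tau, alpha, isi_taps):
    """Toeplitz matrix with entries g_{n-m}, set to zero beyond isi_taps lags."""
    lags = np.arange(n_symbols)[:, None] - np.arange(n_symbols)[None, :]
    g = raised_cosine(lags * tau, alpha)
    return np.where(np.abs(lags) <= isi_taps, g, 0.0)

def superpose(code_bits, powers):
    """Map each layer's bits to BPSK with amplitude sqrt(power) and add the layers."""
    layers = np.sqrt(np.asarray(powers))[:, None] * (1.0 - 2.0 * np.asarray(code_bits))
    return layers, layers.sum(axis=0)

def remove_other_layers(y, G, layer_estimates, k):
    """Subtract the matched-filter contribution of every layer except layer k."""
    others = np.delete(layer_estimates, k, axis=0).sum(axis=0)
    return y - G @ others

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(2, 16))           # two layers, 16 coded bits each (hypothetical)
layers, x = superpose(bits, powers=[0.6714, 0.3286])
G = isi_matrix(n_symbols=16, tau=2.0 / 3.0, alpha=0.3, isi_taps=4)
y = G @ x                                         # noiseless matched-filter output
residual = remove_other_layers(y, G, layers, k=1)
print(np.allclose(residual, G @ layers[1]))
```

Because the model is linear in $\mathbf{x}$, the output splits into one term per layer, which is what makes layer-by-layer interference cancellation possible; with imperfect estimates or noise, the residual would contain an additional error term.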

Without loss of generality, the detection can start at the first layer, and after the detection of each layer, the estimation of the current layer inputs, say $\hat{\mathbf{x}}_k$, is stored for the interference calculation of the following layers. Note that, after one complete sweep over all layers, the updated estimation can be reused to perform other sweeps, which is able to further improve the interference calculation. Normally, three complete sweeps would be enough for FTN-SCM systems.

3. Channel Observation Model

We consider the minimum distance detector in this paper. Based on the receiver structure, for detecting the $k$th layer, we have

$$\hat{\mathbf{x}}_k = \arg\min_{\mathbf{x}_k} \int_{-\infty}^{\infty} \Bigl| r(t) - \sum_{\substack{j=0 \\ j \ne k}}^{K-1} \hat{s}_j(t) - s_k(t) \Bigr|^{2} \mathrm{d}t, \tag{5}$$

where

$$\hat{s}_j(t) = \sum_{n=1}^{N} \hat{x}_j[n]\, h(t - n\tau T) \tag{6}$$

represents the estimation of the signal of the $j$th layer. By expanding the equation and dropping the irrelevant terms, (5) yields

$$\hat{\mathbf{x}}_k = \arg\max_{\mathbf{x}_k} \int_{-\infty}^{\infty} \Bigl\{ \operatorname{Re}\Bigl\{ \Bigl( r(t) - \sum_{\substack{j=0 \\ j \ne k}}^{K-1} \hat{s}_j(t) \Bigr) s_k^{*}(t) \Bigr\} - \frac{1}{2}\, s_k(t)\, s_k^{*}(t) \Bigr\}\, \mathrm{d}t, \tag{7}$$

where $\operatorname{Re}(\cdot)$ represents the real part of a complex number. By switching the order of summation and integration, (7) can be further derived as

$$\hat{\mathbf{x}}_k = \arg\max_{\mathbf{x}_k} \operatorname{Re}\Bigl\{ \sum_{n=1}^{N} x_k^{*}[n] \cdot \int_{-\infty}^{\infty} \Bigl( r(t) - \sum_{\substack{j=0 \\ j \ne k}}^{K-1} \hat{s}_j(t) \Bigr) h^{*}(t - n\tau T)\, \mathrm{d}t \Bigr\} - \frac{1}{2} \int_{-\infty}^{\infty} s_k(t)\, s_k^{*}(t)\, \mathrm{d}t. \tag{8}$$

Tus, with respect to the matched flter outputs, we get in which ��[�] is assumed to be Gaussian, and its variance 2 is denoted as ��[�]. Hence, the derivation of the channel � observation model is complete. x̂ = ∑ {(� [�] −� [�])�∗ [�]} � arg maxx Re � � � �=1

� � 4. Detection Method and Power Allocation 1 ∗ − ∑ ∑�� [�] �� [�] ��−�, 2 �=1 �=1 To detect FTN-SCM signal, a well-designed equalizer that accepts nonwhite noise is necessary. Tus, we choose the (9) original method from [18] as the method detecting each layer. Te method has been proven to ofer promising performance where based on the Ungerboeck observation model [19]. As an � � � extension of the traditional -algorithm BCJR ( -BCJR) �−1 I � � [�] = ∑ ∑ �̂ [� + �] � . algorithm, the detection method selects the best states not � � � (10) only based on the current symbol but also considering the �=0 �=−�I �=� ̸ infuence of some “future” symbols. At each trellis section, for each possible state ��, the method calculates the metrics � It is obvious that (9) enables the implementation of the of the path v =�1 that induces �� and all possible paths from Viterbi algorithm [16]. Similarly, as for sof-in sof-out (SISO) the section �+1till �+�that are extended from ��.However, algorithms, for example, the BCJR algorithm [17], (9) implies new concerns arise due to the presence of other layers. In the recursive probabilistic factorization of the form thefollowing,weaimtooferaperformanceanalysisfor the detection of each layer and we further give the power � 1 allocation of each layer. �(y | x , a )∝∏ { � � exp 2 �=1 �0/2 + �� [�] We believe slightly abusing the notation is acceptable. (11) We henceforth use �� and �� in place of ��[�] and ��[�], � 1 I {� ,� ,...,� } ⋅ {�∗ [�] ⋅(� −� [�] − � � [�] − ∑� � [�−�])}} , respectively, and then the sequence of 1 2 � can Re � � � 2 0 � � � �� �=1 be represented as 1 . Without loss of generality, we assume Wireless Communications and Mobile Computing 5

�� equiprobably taking values in the alphabet. Hence, in our where case, for detecting the �th layer, based on the description in 0 0 0 ⋅⋅⋅ 0 [18] and the aforementioned observation model, the metric V�+� =��+� +��+� ��+� of path 1 1 1 with a random error pattern 1 . . . . ( . . . . ) is represented as ( ) ( 0 0 0 ⋅⋅⋅ 0 ) �+� �+� H �+� �+� 1 � �+��2 ( ) �(V )=Re {(V ) (� −� )− �V � ( ) 1 1 1 1 � 1 � ( �−� 0 0 ⋅⋅⋅ 0 ) 2 ( I ) (12) T = ( ) . (17) H H ( �−(� −1) �−� 0 ⋅⋅⋅ 0 ) −(V�+�) G (V�+�)+(V�+�) ��+�}, ( I I ) 1 L(�+�)×(�+�) 1 1 1 ( ) ( . . ) ( . dd . ) 2 where ‖⋅‖ is the norm operator and GL is the lower (�+�)×(�+�) �−2 �−3 ⋅⋅⋅ �−� 0 triangular matrix with zero main diagonal of the size (�+�)× I �−1 �−2 ⋅⋅⋅ �−(� −1) �−� (� + �) that is denoted as ( I I ) 10000⋅⋅⋅ In the following, we focus on the detection performance � 1000⋅⋅⋅ of each stage. Since the detection at each stage is interfered by 1 the signals from the other stages, it is necessary to make sure ( ) ( . ) that the algorithm is still able to ofer a correct detection, for ( . d 100⋅⋅⋅) G = ( ) . which we ofer the following theorem. L(�+�)×(�+�) ( ) (13) ( �� ⋅⋅⋅ �1 10⋅⋅⋅) ( I ) Teorem 1 (correct detection criterion). In FTN-SCM sys- 0�� ⋅⋅⋅ �1 1 d I tems, the �th stage can be successfully detected without the pres- . 2 . ence of noise, if the maximum variance of �[�],say���� = ( . ddd) 2 max{�� [�], 1 ≤ � ≤ �},satisfes ≜(�+�)×(�+�+�) ≜ As we defne size A I and size B � I � � 1 (� + �) × (� + �), by substituting (3) into (12), we have √2�2 ∑ �� � < √� ���2 ��� � �� 2 � ��� �=−� H I �+� �+� �+� �+�+�I �(V1 )=Re {(�1 +�1 ) GA�1 (18) � −� I � � −2√� �� ∑ � �� � , 1 � �+� �+��2 � � −(�+�)� − �� +� � �=1 2 � 1 1 � (14) 2 H �+� �+� �+� �+� where (����) represents the minimum squared Euclidean −(� +� ) GL (� +� ) 1 1 B 1 1 distance of BPSK modulated FTN signals with normalized ��(�) �+� �+� H �+� �+� �+� H �+� signal energy, i.e. 
BPSK modulated FTN signal satisfes ∞ � 2 +(�1 +�1 ) GB�1 +(�1 +�1 ) �1 }, ∫ |� (�)| =1 −∞ . in which �� represents the accuracy of the estimation and is Proof. Te proof is given in Appendix A. given as Clearly, Teorem 1 is a sufcient condition for the �th �−1 stage being successfully detected. Tus, we have proved � = ∑ � [�] − �̂ [�] , � � � (15) that the algorithm is able to provide a correct detection, �=0 �=� ̸ as long as (18) is satisfed. It is straightforward to ofer the power allocation based on (18). However, this may not be a 2 with the variance �� [�]. good choice for three reasons. Firstly, the derivation for (18) Furthermore, we consider the diference of the metrics involves the scaling of inequalities; thus the power allocation of two individual erroneous paths. Defne the two paths as basedon(18)isnotthebest.Secondly,thealgorithmoperates �+� �+� �+� ��+� �+� ��+� on a reduced ISI trellis, where the certain error patterns V1 =�1 +�1 and V 1 =�1 +� 1 and further defne �+� that have a larger metric may not be included during the ��+� =�� −��+� 1 1 1 .Aferseveralmanipulations[18],we detection. Tirdly, practically speaking, Teorem 1 is not the obtain necessary condition for the successful detection. Since error- �+� H �+�+� correcting codes are normally implemented in FTN-SCM �(V� )−�(V�+�)= {(��+�) T� I 1 1 Re 1 �+�+1 systems which helps in the detection in a certain level, thus the power allocation should also take the infuence of the �+� H �+� 1 �+� corresponding codes into account. −(�1 ) GB (�1 + �1 )} (16) 2 All these three reasons suggest that the power allocation H H may not necessarily be derived in such a strict way. Hence, + {(��+�) G ��+� +(��+�) ��+�}, Re 1 B 1 1 1 we slightly adjust the detection criterion by calculating 6 Wireless Communications and Mobile Computing

Table 1: Power allocation for the simulations in Figure 4.

K    P1/P     P2/P     P3/P     P4/P     P5/P     P6/P     P7/P
2    0.6714   0.3286   -        -        -        -        -
3    0.5774   0.2837   0.1389   -        -        -        -
4    0.5403   0.2655   0.1304   0.0638   -        -        -
5    0.5237   0.2573   0.1264   0.0621   0.0304   -        -
6    0.5159   0.2535   0.1246   0.0612   0.0301   0.0147   -
7    0.5122   0.2517   0.1237   0.0608   0.0299   0.0147   0.0072
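A quick arithmetic check on Table 1 can make its structure easier to see. The sketch below only re-reads the tabulated values (the dictionary name and helper function are illustrative): it verifies that each row sums to one up to rounding, in line with the normalized power constraint, and computes the ratio between the powers of adjacent layers, which stays close to a factor of two in every row.

```python
# Rows of Table 1: normalized layer powers for K = 2, ..., 7.
TABLE_1 = {
    2: [0.6714, 0.3286],
    3: [0.5774, 0.2837, 0.1389],
    4: [0.5403, 0.2655, 0.1304, 0.0638],
    5: [0.5237, 0.2573, 0.1264, 0.0621, 0.0304],
    6: [0.5159, 0.2535, 0.1246, 0.0612, 0.0301, 0.0147],
    7: [0.5122, 0.2517, 0.1237, 0.0608, 0.0299, 0.0147, 0.0072],
}

def adjacent_ratios(powers):
    """Ratio between the power of each layer and the power of the next layer."""
    return [a / b for a, b in zip(powers, powers[1:])]

for K, powers in TABLE_1.items():
    total = sum(powers)
    ratios = [round(r, 2) for r in adjacent_ratios(powers)]
    print(K, round(total, 4), ratios)
```

The nearly constant ratio suggests that each layer is given roughly twice the power of the layer detected after it, although the table itself is the only source for this observation.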

� � the probability of �(�� |�1 ,�1 ,��−1) instead of �(�� | In the following, we ofer a power allocation with respect � � �1 ,�1 ), where the detection algorithm is assumed with �= to individual error-correcting codes. Without loss of gener- � 1 and ��−1 is the correct state that is preserved at the (�−1)th ality, we assume that the code at the th layer successfully section. Tus, at �th section, the log likelihood ratio (LLR) of recovers the information sequence at SNR =�� over the the input �(��) canbeobtainedbythefollowingtheorem. FTNchannel.Meanwhile,accordingto(21),thesignal-to- interference plus noise ratio (SINR) for the �th layer is Teorem 2 (the correct tail path). We claim that v is the defned as correct tail path (CTP) if and only if the last � elements are �+� T �(�2 � = 0 � � [�]) correct, which is �+1 .Tenat th section, for the CTPs ≜ � SINR� �(�2 [�])+�/2 v and v of the states �+ and �− that are induced by the correct � 0 (23) state ��−1,wehave � �� = � . 2 2 �I � � � � � �−1 � � �0/2 + ∑ ��−�� ���� + ∑ ��−�� (∑ ����) �(� =+√� �� | � ,� ,� ) �=�+1 � � �=−�I � � �=�+1 �(� )≜ � � 1 1 �−1 � ln � � �(�� =−√���� | �1 ,�1 ,��−1) Terefore, to successfully decode ��, SINR� ≥�� must hold. (19) Tus, we have � � �(�+ |� ,� ,��−1) 1 1 � � �2 = ln ∝�(k) −�(k ). � (� /2 + ∑� �� � (∑�−1 � ��)) �(� |��,��,� ) � 0 �=−� � −�� �=�+1 � − 1 1 �−1 � ≥ I . � 2 (24) �I � � �� (1 − �� ∑�=�+1 ��−�� ) Proof. Te proof is given in Appendix B. Hence, by noticing the natural power assignment constraint With Teorem 2, it is possible to evaluate the error event �+� that �0 +�1 +⋅⋅⋅+��−1 =1,therequired�� for all � can be rate (EER) of each layer. We assume the two CTPs are V = 1 obtained recursively starting from �=�−1.Tenumerical ��+� V��+� =��+� +��+� ��+� = [0,...,0,� ,0, 1 and 1 1 1 with 1 � results based on the above power allocation are demonstrated T �+� ...,0] , respectively. Tus, for the error event �≜�1 ,we in the next section. have ��+� �+� 5. 
Numerical Results � (�) =�(�(V 1 )−�(V1 )>0) We choose the root raised cosine (RRC) (with roll-of factor �I � �2 � = 0.3 and a time-truncation to ±15� around � =0)(without =�(� ( ∑ � � )−�� � +� � � −� �+� � �� � � loss of generality, we assume �=1)astheshapingpulse �=�+1 (20) ℎ(�). Meanwhile, the outer code is chosen as the code rate � �=1/3asymmetric in [18], with the gener- � (�) ≜ [1(1+�+�2)/(1 + �2)] +�� ( ∑ �−���+�)>0). ator polynomial 1 and 3 2 3 �=−�I �2(�) ≜ [1(1+�+�)/(1 + � +� )].Aswetransmit 20000 information bits per layer, we have � = 60010 as By considering the Gaussian assumption, and the steep the codeword length, wherein 10 redundant bits are included decrease of � function, (20) can be simplifed as to terminate the trellis. Te two-dimensional normalized spectral efciency is defned as � = 2��/�(1 + �). � �� Te simulation results of FTN-SCM with �=2to 7 and � (�) ≃�(√ � ), �2 (21) �=2/3are plotted in Figure 4, wherein the power allocation is shown in Table 1. Te parameters for the detection method are chosen as �=4and �=2. Tere are 50 Turbo iterations where betweentheFTNandTurbocodeateachlayerand3complete

�I � sweeps in total. Figure 5 shows the corresponding achieved 2 �0 � �2 � �2 2 � = + ∑ ��−�� ���� + ∑ ��−�� �� [�] . (22) data rate (for details on the data rate, please refer to [5]) at 2 <10−5 �=�+1 �=−�I BER .Asfguresimply,theBERresultsofFTN-SCM Wireless Communications and Mobile Computing 7

Figure 4: BER results of FTN-SCM with $K$ layers and $\tau = 2/3$ compared to constrained capacities of orthogonal signaling, where $\rho$ (bits/s/Hz) represents the two-dimensional normalized spectral efficiency. (Axes: BER from $10^{0}$ to $10^{-6}$ versus $E_b/N_0$ (dB). Curves: $K = 1$, $\rho = 0.77$; $K = 2$, $\rho = 1.54$; $K = 3$, $\rho = 2.31$; $K = 4$, $\rho = 3.08$; $K = 5$, $\rho = 3.85$; $K = 6$, $\rho = 4.62$; $K = 7$, $\rho = 5.38$. Vertical limit lines are labeled 8-PSK, 16-QAM, 32-QAM, 64-QAM, 128-QAM, and 256-QAM; the limit values legible in the source are 2.89, 4.54, 6.66, 9.15, 11.41, and 14.02 dB.)
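The spectral efficiencies attached to the curves in Figure 4 can be reproduced with a few lines of arithmetic. The sketch below assumes that the definition in Section 5 reads $2KR/(\tau(1+\alpha))$ with code rate $R = 1/3$, compression factor $\tau = 2/3$, and roll-off factor $\alpha = 0.3$; the function and variable names are illustrative. It also recomputes the codeword length from 20000 information bits per layer plus the 10 termination bits mentioned in the text.

```python
def spectral_efficiency(num_layers, code_rate, tau, rolloff):
    """Two-dimensional normalized spectral efficiency in bits/s/Hz (assumed definition)."""
    return 2.0 * num_layers * code_rate / (tau * (1.0 + rolloff))

# Parameters taken from the simulation setup described in Section 5.
CODE_RATE = 1.0 / 3.0
TAU = 2.0 / 3.0
ROLLOFF = 0.3

efficiencies = [round(spectral_efficiency(K, CODE_RATE, TAU, ROLLOFF), 2) for K in range(1, 8)]
for K, value in zip(range(1, 8), efficiencies):
    print(f"K={K}: {value} bits/s/Hz")

# 20000 information bits at rate 1/3, plus 10 bits that terminate the trellis.
codeword_length = 20000 * 3 + 10
print(codeword_length)
```

Under this reading the computed values agree with the labels in Figure 4, and the efficiency grows linearly with the number of layers because every layer uses the same code rate and compression factor.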

Figure 5: Achieved data rate (BER < $10^{-5}$) compared to both the constrained and unconstrained capacity of orthogonal signaling as well as the ultimate capacity of RRC with $\alpha = 0.3$. (Axes: data rate in bits per $T$ seconds, from 0 to 10, versus $E_b/N_0$ for BER < $10^{-5}$ in dB, from −2 to 16. Annotations: unachievable region; maximum data rate for RRC with $\alpha = 0.3$; capacity for RRC with $\alpha = 0.3$, orthogonal signaling; points for $K = 2$ to $K = 7$. Legend: achieved via FTN-SCM; constrained capacity with equiprobable constellation points, orthogonal signaling.)

lie clearly outside the constrained capacity boundary with a larger alphabet of orthogonal signaling. We also observe that FTN-SCM is able to achieve a wide range of spectral efficiencies with a simple binary modulation format at each layer. In order to better illustrate the performance of FTN-SCM, we make a comparison between our method and the method in [1]. The BER results of the two methods are given in Figure 6, where the simulation parameters are the same as the case of $K = 2$ in Figure 4, except that the power allocation for the method in [1] is given as $P_1/P = 0.6705$ and $P_2/P = 0.3295$. Note that, in [1], the authors utilize an optimal FTN equalizer in order to gain a very good performance. However, such equalizers are normally impractical, especially when there are a lot of layers; for example, $K = 7$. On the other hand, for BER < $10^{-5}$, our method only needs no more than 0.1 dB to achieve the same performance as the optimal result, but with much less complexity, which indicates that our method exhibits a better trade-off between performance and complexity than the method in [1]. For a detailed complexity comparison, please refer to [18]. It should be mentioned that the BER results can be improved by choosing a better outer code or a better detection method, which is a future topic for us.

6. Conclusion

In this paper, we considered the FTN-SCM structure. Based on the transceiver structure, we derived a new observation model and further offered the power allocation with respect

� to the detection method of each layer. Simulation results where k � and v� represent the �th path of the probability � show that, in a wide range of spectral efciencies, FTN-SCM calculation for �� =� and �� =�,respectively.Withoutloss � requires lower SNR than that for orthogonal signaling with a of generality, we assume that k � and v� havethesameerror �+� larger alphabet. It should be noted that the proposed scheme pattern ��+1 . Tus, by considering (16), (A.1) can be further is easy to be extended to nonbinary modulation cases and simplifed as other types of channels.

�(� =��,��,��) Appendix � 1 1 ∝2� ln �(� =�,��,��) A. Proof of Theorem 1 � 1 1 H �+� 1 ×{ {(��) T�� I − �2 (��)} (A.2) According to [18], the probabilities of the correct state �� =� Re 1 �+�+1 1 � � 2 and the wrong state �� =� with the error sequence �1 satisfy the following equation: � H � �+� � H � + Re {(�1) G �1 +(�1) �1 }} , � � � 2� �(�� =�,�1 ,�1 ) ∝ ∑�(k� )−�(k ), (A.1) ln �(� =�,��,��) � � � 1 1 �=1 where

00⋅⋅⋅0 . ( . ) ( ) ( . ) ( . ) ( 0 . ) T� = ( ) , ( �−� 0 ) ( I ) ( ) ( �−(� −1) �−� d ) ( I I ) . . d 0 � � ⋅⋅⋅ � ( −(�+1) −(�+2) −�I ) (A.3) 1� ⋅⋅⋅ � � ⋅⋅⋅ � 0⋅⋅⋅0 −1 −� −�−1 −�I � 1� ⋅⋅⋅ � � ⋅⋅⋅ � 0 ⋅⋅⋅ 1 −1 −� −�−1 −�I ( ) ( �2 �1 1�−1 ⋅⋅⋅ �−� �−�−1 ⋅⋅⋅ �−� 0 ) ( I ) � ( ) G = ( dd d d d ) , ( ) ( dddddd) ⋅⋅⋅ 0 � ⋅⋅⋅ � 1� ⋅⋅⋅ � � �I 1 −1 −� −�−1 0⋅⋅⋅0� ⋅⋅⋅ � 1� ⋅⋅⋅ � ( �I 1 −1 −� )

2 � � H � 1 2 � and � (�1)=(�1) G�×��1, representing the squared − Re {min {� (�1)}} �� Euclidean distance between the erroneous path and the 2 1 correct path at the current stage. As �[�] is assumed to be 2 � H � �+� Gaussian with variance �� [�],theright-handsideof(A.2)can + Re {max {(�1) G �1 }}} . �� be upper-bounded in the noiseless regime by 1 (A.4)

H �+� 1 By noticing the fact that �� ∈{−√2����, 0, 2√����},the 2� ×{ {(��) T�� I − �2 (��)} Re 1 �+�+1 2 1 right-hand side of (A.4) can further be extended as

� H � �+� � H �+� + {(� ) G � }} < 2 � � � I Re 1 1 2 ×{Re { max {(�1) T ��+�+1}} � �+�I �1 ,��+�+1

H � � �+�I 1 2 � ×{Re { max {(�1) T ��+�+1}} − { {� (� )}} � �+�I Re min� 1 �1 ,��+�+1 2 �1 Wireless Communications and Mobile Computing 9

Figure 6: BER result of FTN-SCM with $K = 2$ compared to the result of the method in [1]. (Axes: BER from $10^{-1}$ to $10^{-6}$ versus $E_b/N_0$ (dB) from 2 to 2.4. Legend: FTN-SCM; Method in [13].)

�(� |��,��,� ) � H � �+� � + 1 1 �−1 � + Re {max {(�1) G �1 }}} < 2 ∝ exp [� (k�)−�(k �)] �� � � 1 �(�− |�1 ,�1 ,��−1) (B.2) � −� exp [� (k1)−�(k�)] + exp [� (k2)−�(k�)] ⋅ ⋅ ⋅ { I 1 × . � � 2 � � � � × � �� (2 ∑ � �� � − � ) exp [� (k 1)−�(k �)] + exp [� (k 2)−�(k �)] ⋅ ⋅ ⋅ { � � −(�+�)� 2 min { �=1 Without loss of generality, we consider �(v�)−�(v�),wherev� � I } is not the CTP. Recall (16); we have √ 2 � � + 2�max���� ∑ ����} . �+� H �� �+�+� �=−� I I } �(k�)−�(k�)=Re {(��+1 ) T ��+�+1 (A.5) �+� H 1 �+� −(��+1 ) GB ( ��+1 )} (B.3) Tis completes the proof of Teorem 1. 2 �+� H �+� �+� H �+� + Re {(� ) G�×�� +(� ) � }, B. Proof of Theorem 2 �+1 �+1 �+1 �+1 where Clearly, since ��−1 is the correct state at section (� − 1) and the states in the trellis are Markovian, the LLR of the input T�� �� is determined by the probabilities of the states �+ and � � ⋅⋅⋅ � 0 ⋅⋅⋅ 0 �−. According to the description in [18], in our case, the −� −(�+1) −�I . . probability of �� =�follows . . (B.4) =( . dd. ). 2� � ⋅⋅⋅ � � ⋅⋅⋅ � 0 −2 −� −(�+1) −�I �(� =�|��,��,� )∝∑ [� (k )] , (B.1) � 1 1 �−1 exp � �−1 �−2 ⋅⋅⋅ �−� �−(�+1) ⋅⋅⋅ �−� �=1 I �+� �+� �+� �+� Since �1 is no longer part of the equation, the only vari- where v� =�1 +�1 ,with�1 = [0,...,0,��,��+1, �+� T able in the equation is ��+1 .Tus,asallthecombinationsof ...,��+�] ,representingthe�th possible path that is extended �+� � areincludedin(B.2),wecansafelydrawtheconclusion from �� =�. Te calculation implies a process of generating �+1 � the marginal probability from all joint probabilities, as 2 that combinationsarealltakenintoaccount. [� (k )−�(k )] + [� (k )−�(k )] ⋅ ⋅ ⋅ � exp 1 � exp 2 � For derivation brevity, we use v and v representing the � � � � exp [� (k 1)−�(k �)] + exp [� (k 2)−�(k �)] ⋅ ⋅ ⋅ paths extended from �+ and �−,respectively.Wefurther (B.5) require that the same subscript � represents the same error =1. � pattern. 
Tus, it is fair to assume that k� and v � are the CTPs from states �+ and �−,respectively.Hence,weobtain Tis completes the proof of Teorem 2. 10 Wireless Communications and Mobile Computing

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 61771364, 91438101, and 61601346, by the Science and Technology on Communication Networks Laboratory under Grant 614210403070717, by the Fundamental Research Funds for the Central Universities under Grant WK2100060020, by the Key Research Program of Frontier Sciences of CAS under Grant QYZDY-SSW-JSC003, by the China Postdoctoral Science Foundation under Grant 2015M580819, and by the Shaanxi Province Postdoctoral Science Foundation.

Hindawi Wireless Communications and Mobile Computing Volume 2018, Article ID 5957320, 7 pages https://doi.org/10.1155/2018/5957320

Research Article A Novel Design of Downlink Control Information Encoding and Decoding Based on Polar Codes

Ce Sun,1 Zesong Fei,1 Jiqing Ni,2 Wei Zhou,2 and Dai Jia1

1 Beijing Institute of Technology, Beijing 100081, China
2 China Mobile Research Institute, Beijing 100081, China

Correspondence should be addressed to Zesong Fei; [email protected]

Received 25 November 2017; Revised 1 March 2018; Accepted 1 April 2018; Published 13 May 2018

Academic Editor: Luca Reggiani

Copyright © 2018 Ce Sun et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In legacy long term evolution (LTE) networks, multiple transmission modes are defined to cater to diverse wireless environments and improve the spectrum utilization. However, constrained by the user equipment (UE) processing capability for blind detection of downlink control information (DCI), only two transmission modes are allowed to be configured to a UE simultaneously. In recent 5G standardization, polar codes have supplanted tail-biting convolutional codes (TBCC) as the channel coding scheme for DCI. Motivated by the successive decoding property of polar codes, a novel design of DCI encoding and decoding is proposed in this paper. The proposed scheme can support dynamic configuration of transmission modes while decreasing the complexity of blind detection. Evaluation results from link-level simulations show that the performance loss compared to the conventional encoding/decoding scheme is generally negligible and that the proposed scheme can comply with the false alarm rate (FAR) target of 5G standardization.

1. Introduction

In the long term evolution (LTE) system, the transmitter and the receiver communicate with each other using different transmission schemes. Each transmission scheme corresponds to a transmission mode (TM) [1]. In LTE, multiple TMs are defined to cater to diverse wireless environments and improve the spectrum utilization. The LTE system supports nine TMs, which differ in the structure of the antenna mapping, the reference signal used for demodulation, and the feedback type [2].

The downlink control information (DCI) transmitted by the base station to users is used to schedule the downlink/uplink data transmission and to convey essential configurations [3]. Specifically, different DCI formats correspond to different transmission modes, and the length of a DCI format varies with the system configuration. Constrained by the UE processing capability for blind detection of DCI [4], only two transmission modes are allowed to be configured to a UE simultaneously.

Polar codes have been adopted as the channel coding scheme of the control channel in next-generation communication networks [5]. They are based on the polarization theory and can achieve the capacity of an arbitrary binary-input discrete memoryless channel (B-DMC) [6]. Furthermore, the complexity of polar codes is low with some optimized decoding algorithms, such as the low-complexity list successive cancellation (LCLSC) decoding algorithm [7]. Blind detection of polar codes has been studied in [8]; that work focuses on fitting within the 5G parameters. A low-complexity blind-detection algorithm for polar-encoded frames is proposed in [9]. That scheme decreases the complexity of polar decoding in blind detection, whereas our scheme decreases the complexity of the blind detection process itself.

Unlike the tail-biting convolutional code (TBCC), which is the coding scheme in LTE, all existing decoders of polar codes are based on the successive cancellation (SC) decoder [10], which decodes the bits in a given order. Taking advantage of this successive property, the decoding process can be paused after the first several bits have been decoded and then continued according to the values of those bits. Based on the successive property of polar decoders, a novel design of DCI encoding and decoding is proposed in this paper. The proposed scheme can support dynamic configuration of transmission modes while decreasing the complexity of blind detection.

The rest of this paper is organized as follows. In Section 2, we introduce the foundation of the proposed scheme. The DCI design scheme is proposed in Section 3. In Section 4, we analyze the complexity and give simulation results. Finally, Section 5 concludes the paper.

2. Preliminary

In this section, we briefly introduce polar codes and the DCI design of the LTE system; these are the foundation of the proposed scheme.

2.1. Polar Codes. Polar codes are based on channel polarization theory, which is described as follows.

Theorem 1. For any B-DMC $W$, the channels $\{W_N^{(i)}\}$ polarize in the sense that, for any fixed $\delta \in (0, 1)$, as $N$ goes to infinity through powers of two, the fraction of indices $i \in \{1, \ldots, N\}$ for which $I(W_N^{(i)}) \in (1 - \delta, 1]$ goes to $I(W)$ and the fraction for which $I(W_N^{(i)}) \in [0, \delta)$ goes to $1 - I(W)$, where $N$ is the length of the code word, which is equal to the number of polarized subchannels, $W_N^{(i)}$ denotes the $i$th of the $N$ subchannels, and $I(\cdot)$ denotes the channel capacity.

According to Theorem 1, we place the information bits in the set of subchannels for which $I(W_N^{(i)}) \in (1 - \delta, 1]$ and place the frozen bits in the other subchannels to construct the information block $u$. Before placing the information bits and frozen bits, we calculate the reliability of the $N$ subchannels and decide which subchannels are good enough to carry information bits. The common algorithms to calculate the reliability include the algorithm based on Bhattacharyya parameters [11], density evolution (DE) [12], and Gaussian approximation (GA) [13]. Then $u$ is sent into the polar encoder. The polar encoding is denoted as $x_1^N = u_1^N \mathbf{G}_N$, where $x_1^N = (x_1, x_2, \ldots, x_N)$ is the code word, $u_1^N = (u_1, u_2, \ldots, u_N)$ is the information block, and $\mathbf{G}_N$ is the generator matrix of order $N$. The recursive definition of $\mathbf{G}_N$ is given by

$$\mathbf{G}_N = \mathbf{B}_N \mathbf{F}_2^{\otimes n}, \qquad \mathbf{F}_2 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \tag{1}$$

where $n = \log_2 N$ and $\mathbf{B}_N$ is a permutation matrix.

2.2. Successive Cancellation Decoder. All the existing decoders of polar codes are based on the successive cancellation (SC) decoder. After receiving $y_1^N$, the SC decoder generates its decision $\hat{u}_1^N$ by computing

$$\hat{u}_i \triangleq \begin{cases} 0, & \text{if } i \in \mathcal{A}^c, \\ h_i(y_1^N, \hat{u}_1^{i-1}), & \text{if } i \in \mathcal{A}, \end{cases} \tag{2}$$

where

$$h_i(y_1^N, \hat{u}_1^{i-1}) \triangleq \begin{cases} 0, & \text{if } \dfrac{P(0 \mid \hat{u}_1^{i-1}, y_1^N)}{P(1 \mid \hat{u}_1^{i-1}, y_1^N)} \ge 1, \\ 1, & \text{otherwise,} \end{cases} \tag{3}$$

and $y_1^N = \mathbf{h} x_1^N + \mathbf{n}$ denotes the received message, $\mathbf{h}$ is the channel matrix, and $\mathbf{n}$ denotes the channel noise.

From the above formulas, we can see that the polar decoder decodes the information bit by bit from $u_1$ to $u_N$. When the $i$th bit $u_i$ is to be decoded, it is decided as zero if $u_i$ is a frozen bit; otherwise, $u_i$ is decided by (2) using the previous decisions $\hat{u}_1, \hat{u}_2, \ldots, \hat{u}_{i-1}$ and the received values $y_1, y_2, \ldots, y_N$. This property of the polar decoder is called the successive property; it makes it possible to suspend the decoding process once $u_1, u_2, \ldots, u_{i-1}$ have been decoded.

2.3. DCI Design. According to the latest MIMO-related progress in the 3rd-generation partnership project (3GPP), only one code word (CW) is transmitted for 1 to 4 layers and two CWs are transmitted for 5 to 8 layers. Thus, the actual number of transmission layers can implicitly indicate the number of CWs. Moreover, compared to the 1-CW case, 2 CWs add an additional block of bit fields to the DCI, possibly containing MCS/RV/NDI, and CBGTI/CBGFI if CBG-based transmission is configured, as shown in Figure 1. This is where the difference between DCI payload sizes mainly arises. Consequently, DCI formats with 1- to 4-layer transmission can strive to have the same DCI payload size, and so can the DCI formats with 5- to 8-layer transmission. Different numbers of transport layers correspond to different DCI formats. The UE does not know which DCI format is selected by the base station; therefore, blind detection is needed. Constrained by the UE processing capability for blind detection of DCI, only two DCI formats are allowed to be configured to a UE simultaneously. The UE attempts to decode the information with one DCI format; if it cannot decode correctly, it attempts to decode the information with the other DCI format.

3. Proposed Scheme

Based on the above discussion, a potential design is to add an explicit rank indicator (RI) field to the DCI and to use this field to implicitly indicate the DCI payload size during the decoding process. Specifically, the RI, which has a fixed length, can be decoded first, and then the number of CWs and possibly the DCI payload size (this depends on the detailed DCI content) can be implicitly identified.

This design is feasible from a technical perspective, as polar codes have been adopted for PDCCH [1]. Based on the successive property of the polar decoder, the RI can be decoded first, before the fields related to the CWs. Motivated by this property, the number of CWs can be signaled dynamically by the RI. If the number of transmission layers exceeds a threshold (e.g., 4-layer transmission based on current agreements), the decoder continues the decoding process based on a long bit length; otherwise, it continues based on a short bit length. Such a DCI format and decoding design is feasible and decreases the blind decoding complexity at the UE side.

In this section, the proposed scheme is described in detail. The encoding and decoding of polar codes are adjusted according to the proposed scheme.

3.1. Encoding. Since each scheduled transmission may contain 1 or 2 CWs, we define two DCI formats with different

1 CW: ··· | RI | MCS | NDI | RV | CBGTI | CBGFI | Resource allocation | HARQ related | ···
2 CWs: ··· | RI | CW #1: MCS, NDI, RV, CBGTI, CBGFI | CW #2: MCS, NDI, RV, CBGTI, CBGFI | Resource allocation | HARQ related | ···

Figure 1: An illustrative DCI format for 1 CW and 2 CWs.
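The layer-to-CW rule of Section 2.3 and the threshold decision outlined in Section 3 can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the assumption that the RI field carries the number of layers minus one as an unsigned integer (most significant bit first) is ours, since the bit-level encoding of the RI is not specified here.

```python
def codewords_for_layers(layers):
    """Number of CWs implied by the number of layers (1-4 -> 1 CW, 5-8 -> 2 CWs)."""
    if not 1 <= layers <= 8:
        raise ValueError("layers must be in the range 1..8")
    return 1 if layers <= 4 else 2


def ri_bits_to_layers(ri_bits):
    """Assumed RI encoding (hypothetical): value = layers - 1, MSB first."""
    value = 0
    for bit in ri_bits:
        value = (value << 1) | bit
    return value + 1


def select_dci_format(ri_bits, threshold=4):
    """Return 'long' if the signalled layer count exceeds the threshold, else 'short'."""
    layers = ri_bits_to_layers(ri_bits)
    return "long" if layers > threshold else "short"
```

With the example threshold of 4 layers, a decoded RI that signals 5 or more layers leads to the long format (2 CWs), and any smaller value leads to the short format (1 CW).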

(a) Information bits index and frozen bits index of long DCI format with D information bits
(b) Information bits index and frozen bits index of conventional polar codes with D information bits
(c) Information bits index and frozen bits index of short DCI format with S information bits
(Legend: frozen bits index; information bits index; information bits index from long DCI format)

Figure 2: The information bits index and the frozen bits index.
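The index selection that Figure 2 depicts, and that Section 3.1 describes step by step, can be sketched as follows. The sketch takes the reliability ordering as given (any of the methods cited in Section 2.1 could supply it) and rests on two interpretations of ours: the "first I elements" of the long information set are taken in decoding (index) order, and the short format draws its S positions from all subchannels not used by the RI. The function name and 0-based indexing are illustrative.

```python
def build_info_sets(reliability_order, D, I, S):
    """Return (ri_positions, long_set, short_set) as sorted lists of 0-based indices.

    reliability_order: subchannel indices, most reliable first (assumed given).
    D, S: control-information lengths of the long and short DCI formats.
    I: length of the RI field shared by both formats.
    """
    if S > D:
        raise ValueError("the short format must not exceed the long format")
    # Long format: the (D + I) most reliable subchannels, listed in decoding order.
    long_set = sorted(reliability_order[:D + I])
    # RI field: the first I positions of the long set, identical in both formats.
    ri_positions = long_set[:I]
    ri_lookup = set(ri_positions)
    # Short format: RI positions plus the S most reliable remaining subchannels.
    remaining = [i for i in reliability_order if i not in ri_lookup]
    short_set = sorted(ri_positions + remaining[:S])
    return ri_positions, long_set, short_set


# Toy example with N = 8 and a reliability order chosen for illustration.
order = [7, 6, 5, 3, 4, 2, 1, 0]
ri, long_set, short_set = build_info_sets(order, D=3, I=2, S=2)
```

Because the RI positions precede every other information position in both sets, a successive decoder sees the same frozen/information pattern for both formats up to the last RI bit, which is what allows it to pause there.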

lengths. Assume that the long DCI format is $D$ bits and the short DCI format is $S$ bits. Both DCI formats are encoded into $N$-bit code words for PDCCH. The RI field in the DCI is utilized to implicitly indicate the DCI payload size. The base station sets a threshold for the RI. If the RI exceeds the threshold, the base station transmits the long DCI format; otherwise, it transmits the short DCI format, consistent with the decoding rule in Section 3.2. The encoding procedures for the two DCI formats are described as follows.

Long DCI Format. The base station knows which DCI format should be transmitted and encodes the DCI format according to its length. Step 1: the reliability of the $N$ polarized subchannels is calculated. The transmitter selects the $(D + I)$ most reliable subchannels, denoted as $\mathcal{A}_{(D+I)} = [a_1, a_2, \ldots, a_{(D+I)}]$. Step 2: the transmitter maps the $(D + I)$-bit information onto the selected $\mathcal{A}_{(D+I)}$; the first $I$ bits of the $(D + I)$-bit information are the RI field and the remaining $D$ bits are the control information. Step 3: the transmitter maps the frozen bits (usually an all-zero sequence) onto the other $N - (D + I)$ subchannels to construct an $N$-bit sequence $u$. Last, the sequence $u$ is input into the polar encoder.

Short DCI Format. Step 1 for the short DCI format is the same as that for the long DCI format. In step 2, instead of selecting the most reliable $(S + I)$ subchannels from the set, the transmitter selects the first $I$ elements from the set $\mathcal{A}_{(D+I)}$ as the RI field, which is the same as in the long DCI format. It then selects the most reliable $S$ subchannels from the remaining set as the information set $\mathcal{A}_S$. The information bits of the short DCI format are mapped onto the selected $S$ subchannels and the frozen bits are mapped onto the other $N - (S + I)$ subchannels.

The construction of the sequence $u$ is shown in Figure 2. Figure 2(a) shows the sequence $u$ of the long DCI format. Figure 2(b) shows the sequence $u$ of conventional polar codes with $D$ information bits. Figure 2(c) shows the sequence $u$ of the short DCI format. The black blocks are information bits selected by the reliability of subchannels, the white blocks are the frozen bits, and the red blocks are the information bits selected by the long DCI format.

The values of the threshold parameters are not fixed; they depend on the practical situation. When the channel condition is good, two CWs can be transmitted in one block and small values can be set; otherwise, large values are set.

3.2. Decoding. The proposed decoding process is generally based on the SCL decoder with an added step called pause-and-judge (PJ). The details of the PJ step are described as follows.

Based on the successive decoding property, the polar decoder decodes the information from $u_1$ to $u_N$ bit by bit. That is, the polar decoder can pause when $u_1, u_2, \ldots, u_i$ have been decoded and proceed with other operations.

In the proposed scheme, the user receives a message from the base station without knowing which DCI format is used. The coded information of the long DCI format differs from that of the short DCI format in the selection of the information bit set, but the first $I$ elements, which carry the RI field, are the same. The

user decodes the information from $u_1$ with the SCL algorithm. When the bits up to the last RI bit have been decoded, the decoding is paused, and the user selects the most reliable path to obtain the RI bits and judges their value. If it is bigger than the threshold, the user continues decoding according to the long information set $\mathcal{A}_{(D+I)}$. Otherwise, the user continues decoding according to the short information set $\mathcal{A}_S$. The decoding process is shown in Figure 3.

Figure 3: The proposed SCL decoding with L decoding paths. (The decoder first processes the I information layers and some frozen layers, then asks whether the value of RI is bigger than the threshold; the "Yes" and "No" branches continue with D and S information layers, respectively, each with L decoding nodes. Legend: frozen nodes; decoding information nodes; nondecoding information nodes.)

4. Analysis and Simulation Results

4.1. Complexity of Blind Detection. In the LTE system, the blind detection of PDCCH can be divided into two parts: detection of the search space and detection of the DCI format. The DCI can be placed in various valid locations, which form the so-called search space. There are 22 candidate search spaces, and two DCI formats are allowed to be configured to a UE simultaneously. The UE does not know which search space and which DCI format are used. Therefore, the maximum number of blind detection attempts for PDCCH is $22 \times 2 = 44$. Our work removes the part of the blind detection that is used to determine the DCI format, so the maximum number of blind detection attempts for PDCCH is decreased to 22.

With the proposed scheme, the DCI format used by the base station is determined through the dynamic configuration of the DCI format rather than through repeated decoding attempts. That is, we can save up to 50% of the complexity of blind detection.

4.2. Block Error Rate (BLER). In this section, we provide numerical examples to illustrate that the proposed scheme

has lower complexity and that the degradation caused by dynamic indication of the code word number is generally negligible. When the DCI format uses the long information set, the selection of the information set in the proposed scheme is the same as that of common polar codes, so there is hardly any BLER performance loss. Therefore, only the BLER of the short information set with the $I$-bit RI field needs to be simulated.

We consider 216-, 432-, and 864-bit DCI payloads, with the length of the short information ranging from 60 bits to 85 bits. Note that $M = 216/432/864$ corresponds to aggregation level 2/4/8 with 1/4 RS density per resource element group (REG), respectively, and the mother code lengths are $N = 256/512/1024$, respectively. The proposed scheme is compared with common polar codes. The simulation results are shown in Figures 4–6. Figure 4 shows that, with a shorter encoded bit length ($M = 216$), dynamic indication causes a small BLER performance degradation of less than 0.1 dB at a BLER of $10^{-2}$. Figures 5 and 6 show that, with a longer encoded bit length ($M = 432/864$), the performance loss is generally negligible. A larger aggregation level is more likely to be used for a larger DCI payload size (e.g., a DCI format with 2 CWs) to ensure the reliable reception of NR-PDCCH at the UE side, and in that case the BLER performance is hardly impacted by dynamic indication.

Figure 4: The BLER performance for proposed scheme with N = 256, M = 216, and various TBS. (BLER versus SNR (dB); curves for TBS = 60, 65, 70, 75, 80, and 85, each with and without the proposed scheme.)

Figure 5: The BLER performance for proposed scheme with N = 512, M = 432, and various TBS. (BLER versus SNR (dB); curves for TBS = 60, 65, 70, 75, 80, and 85, each with and without the proposed scheme.)

4.3. False Alarm Rate (FAR). The false alarm rate has been one of the most important measures of channel coding in 3GPP. The FAR is defined as $\mathrm{FAR} = N_u / N_t$, where $N_u$ denotes the number of undetected erroneous packets and $N_t$ denotes the total number of packets. Figures 7 and 8 show that the FAR of the proposed scheme over various $(K, M)$ pairs can comply with the FAR target of $1.5 \times 2^{-21}$.

5. Conclusion

In this paper, we proposed the dynamic configuration of the DCI format based on polar codes. In the encoder of the proposed scheme, the RI field is used to indicate the number of MCS fields and other related fields. In the decoder, the pause-and-judge step is added to the SCL scheme. Analysis and simulation results illustrate that the proposed scheme can reduce the complexity of blind detection and that the degradation caused by dynamic indication of the code word number is generally negligible.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Figure 6: The BLER performance for proposed scheme with N = 1024, M = 864, and various TBS. (BLER versus SNR (dB); curves for TBS = 65, 70, 75, 80, and 85, each with and without the proposed scheme.)

Figure 7: FAR evaluation results for M = 432. (FAR versus SNR; the target 1.5 ∗ 2^−21 and curves for 36-432, 40-432, 44-432, 48-432, 52-432, 60-432, 64-432, and 68-432.)

Figure 8: FAR evaluation results for M = 864. (FAR versus SNR; the target 1.5 ∗ 2^−21 and curves for 84-864, 88-864, 92-864, 96-864, 100-864, 120-864, and 140-864.)

Acknowledgments

This work is supported in part by the National Natural Science Foundation of China under Grant no. 61371075 and Beijing Municipal Science and Technology Project under Grant D171100006317001.

References

[1] R. Bajracharya, R. Shrestha, Y. B. Zikria, and S. W. Kim, “LTE or LAA: Choosing Network Mode for My Mobile Phone in 5G Network,” in Proceedings of the 2017 IEEE 85th Vehicular Technology Conference (VTC Spring), pp. 1–4, Sydney, NSW, June 2017.
[2] G. Pecoraro, S. Di Domenico, E. Cianca, and M. De Sanctis, “LTE signal fingerprinting localization based on CSI,” in Proceedings of the 2017 IEEE 13th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), pp. 1–8, Rome, October 2017.
[3] B. Tahir and M. Rupp, “New construction and performance analysis of Polar codes over AWGN channels,” in Proceedings of the 24th International Conference on Telecommunications, ICT 2017, Cyprus, May 2017.
[4] H. Mzoughi, F. Zarai, M. S. Obaidat, and L. Kamoun, “3GPP LTE-Advanced Congestion Control Based on MIH Protocol,” IEEE Systems Journal, vol. 11, no. 4, pp. 2345–2355, 2017.
[5] E. Arikan, “Channel polarization: a method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” Institute of Electrical and Electronics Engineers Transactions on Information Theory, vol. 55, no. 7, pp. 3051–3073, 2009.
[6] C. Cao, Z. Fei, J. Yuan, and J. Kuang, “Low complexity list successive cancellation decoding of polar codes,” IET Communications, vol. 8, no. 17, pp. 3145–3149, 2014.
[7] K. Niu, K. Chen, J. Lin, and Q. T. Zhang, “Polar codes: Primary concepts and practical decoding algorithms,” IEEE Communications Magazine, vol. 52, no. 7, pp. 192–203, 2014.
[8] C. Condo, S. A. Hashemi, and W. J. Gross, “Blind Detection with Polar Codes,” IEEE Communications Letters, vol. 21, no. 12, pp. 2550–2553, 2017.
[9] P. Giard, A. Balatsoukas-Stimming, and A. Burg, “Blind detection of polar codes,” in Proceedings of the 2017 IEEE International Workshop on Signal Processing Systems (SiPS), pp. 1–6, Lorient, October 2017.
[10] S. A. Hashemi, C. Condo, and W. J. Gross, “Fast Simplified Successive-Cancellation List Decoding of Polar Codes,” in Proceedings of the 2017 IEEE Wireless Communications and Networking Conference Workshops, WCNCW 2017, USA, March 2017.
[11] Q. Zhang, A. Liu, X. Pan, and K. Pan, “CRC Code Design for List Decoding of Polar Codes,” IEEE Communications Letters, vol. 21, no. 6, pp. 1229–1232, 2017.
[12] R. Mori and T. Tanaka, “Performance of polar codes with the construction using density evolution,” IEEE Communications Letters, vol. 13, no. 7, pp. 519–521, 2009.
[13] P. Trifonov, “Efficient design and decoding of polar codes,” IEEE Transactions on Communications, vol. 60, no. 11, pp. 3221–3227, 2012.

Hindawi Wireless Communications and Mobile Computing Volume 2018, Article ID 7286909, 8 pages https://doi.org/10.1155/2018/7286909

Research Article Performance Analysis of CRC Codes for Systematic and Nonsystematic Polar Codes with List Decoding

Takumi Murata and Hideki Ochiai

Department of Electrical and Computer Engineering, Yokohama National University, 79-5 Tokiwadai, Hodogaya, Yokohama, Kanagawa 240-8501, Japan

Correspondence should be addressed to Takumi Murata; [email protected]

Received 24 November 2017; Revised 22 February 2018; Accepted 20 March 2018; Published 8 May 2018

Academic Editor: Qin Huang

Copyright © 2018 Takumi Murata and Hideki Ochiai. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Successive cancellation list (SCL) decoding of polar codes is an effective approach that can significantly outperform the original successive cancellation (SC) decoding, provided that proper cyclic redundancy-check (CRC) codes are employed at the stage of candidate selection. Previous studies on CRC-assisted polar codes mostly focus on improving the decoding algorithms and their implementation, and little attention has been paid to the CRC code structure itself. For CRC-concatenated polar codes with a CRC code as their outer code, the use of a longer CRC code reduces the information rate, whereas the use of a shorter CRC code may reduce the error detection probability, thus degrading the frame error rate (FER) performance. Therefore, CRC codes of proper length should be employed in order to optimize the FER performance for a given signal-to-noise ratio (SNR) per information bit. In this paper, we investigate the effect of CRC codes on the FER performance of polar codes with list decoding in terms of the CRC code length as well as its generator polynomials. Both the original nonsystematic and systematic polar codes are considered, and we also demonstrate that CRC codes behave differently depending on whether the inner polar code is systematic or not.

1. Introduction Various types of decoding algorithms have been proposed for polar codes, such as successive cancellation list (SCL) Polar codes, proposed by Arıkan [1], are known to achieve the [5, 6], [7], and linear program [8]. In par- symmetric capacity for any given binary-input memoryless ticular,itiswellknownthattheSCLdecodercansignifcantly channels (B-MCs) with low complexity at both encoder and improve the performance of the polar codes. Furthermore, decoder. Polar codes are characterized by channel polariza- basedontheobservationthatthecorrectcodewordisinthe tion whichiscausedbychannelsplittingandchannelcom- list but not necessarily the most likely one, Tal and Vardy bining, and information bits are transmitted over good chan- proposed the use of error detecting codes, such as cyclic nels. To identify those channels accurately, frst Arıkan pro- redundancy-check (CRC) codes, in order to identify the cor- posed a technique which recursively updates mutual infor- rect codeword from the list. Te efectiveness of such CRC- mation of the channels. However, complete estimation of assisted SCL decoder has been subsequently recognized by mutual information can be performed over the binary erasure several other researchers [9, 10]. Sofware implementation of channel only. Terefore, several researchers proposed chan- its fast decoder has been also developed in [11]. nel estimation techniques such as density evolution [2], Gaus- Te performance of the polar codes concatenated with sian approximation [3], and channel upgrading/degrading CRC code as its outer code, together with the list decoding [4]. Teir performance, however, turn out to be inferior to and CRC code detection (or list-CRC decoding [11]), depends modern capacity approaching codes such as turbo and low- on the length of the CRC code and its generator polynomial. 
density parity-check (LDPC) codes when they are compared under the constraint of the same block length.

To date, however, little attention has been paid to the structure of CRC codes themselves. Most of the preceding studies employ CRC codes with 16 bits or longer, and their length has been determined empirically. Otherwise, perfect error detection capability is assumed for simulation purposes, assuming that ideal CRC codes are employed. Nevertheless, if the block length of the code is short, the relative redundancy associated with CRC codes becomes dominant and thus reduces the overall efficiency of the concatenated code. The CRC code design for convolutional codes has been recently studied in [12], where the optimal CRC codes are identified by developing the equivalent composite convolutional codes based on combination of the original convolutional codes and CRC codes. More recently, an optimal CRC search algorithm for polar codes by using SCL decoder is developed in [13]. However, the best CRC code length for the concatenated polar coding system has remained unknown. In this work, therefore, we attempt to analyze the effect of the CRC code length on the resulting frame error rate (FER) performance of the concatenated polar codes with list-CRC decoding. We first show that the miss-detection error probability of the CRC codes, which depends on their length, directly leads to degradation in terms of the FER performance. We also demonstrate that even if the length of CRC codes is identical, the performance of the list-CRC decoding may be affected by the generator polynomials of CRC codes employed.

Systematic polar coding, also proposed by Arıkan [14], is an effective approach to improve the performance of the conventional polar codes in terms of bit error rate (BER). However, the price of the systematic polar codes is their additional processing complexity that requires the SC decoding at the encoder side. Later, a simpler encoding scheme has been proposed in [15], where the conventional polar encoding is employed twice, instead of the SC decoder. Other efficient encoding schemes are proposed more recently in [16], where the three encoding implementations are described based on the trade-off between time and space complexity. In this work, we also address the CRC code property requirement for systematic polar codes. Based on the fact that the distance spectrum property of the systematic polar codes is different from that of the nonsystematic ones [17], we demonstrate that the CRC code structure suitable for systematic polar codes is different from that for the original nonsystematic codes.

This paper is organized as follows. In Section 2, the system model considered throughout the paper is described. Section 3 develops the relationship between the CRC code length and the resulting FER performance of the concatenated polar codes with the list-CRC decoder. In Section 4, the analytical results developed are compared with the corresponding simulations. Finally, Section 5 concludes the paper. We note that our initial results on the conventional nonsystematic polar codes are reported in the conference paper [18]. This paper is its considerable extension including the systematic polar codes and their distance spectrum properties.

2. System Model

We consider the binary polar codes concatenated with a binary CRC code at the transmitter, together with their SCL decoding at the receiver. Information bits of length $k$ are first encoded by CRC encoder to generate $K = k + r$ bits, where the parity bits of length $r$ are appended by the cyclic encoder for the purpose of correct information identification at the decoder. This output sequence (i.e., CRC codeword) is encoded by the inner polar encoder to generate the binary codeword of length $N$. This is modulated by binary phase-shift keying (BPSK) and transmitted over an additive white Gaussian noise (AWGN) channel. At the receiver, the likelihood is calculated from the received signal and then passed to the list-CRC decoder, where $L$ survived paths (codeword candidates) from the list decoder are tested by CRC detector starting from the most likely path. When the candidate codeword is determined to be correct by the CRC detector, the corresponding $k$ information bits are considered as the transmitted information bits.

2.1. Polar Codes. Following the notation of [1], let $W: \mathcal{X} \to \mathcal{Y}$ denote a binary-input memoryless channel (B-MC) with input alphabet $\mathcal{X} = \{0, 1\}$ and output alphabet $\mathcal{Y}$. We denote the corresponding channel transition probability as $W(y \mid x)$, where $x \in \mathcal{X}$ and $y \in \mathcal{Y}$.

The polar encoding is denoted by $x_1^N = u_1^N G_N$, where $x_1^N = (x_1, x_2, \ldots, x_N) \in \{0,1\}^N$ is the codeword, $u_1^N = (u_1, u_2, \ldots, u_N) \in \{0,1\}^N$ is the information block, and $G_N$ is the generator matrix of the polar codes, which is the mapping function of $\mathcal{X}^N \to \mathcal{X}^N$ [1]. The generator matrix is represented by the following recursive form:

$$G_N = B_N F_2^{\otimes n}, \qquad F_2 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \tag{1}$$

where $n = \log_2 N$, $F_2^{\otimes n}$ denotes the $n$th Kronecker power of $F_2$, and $B_N$ is the bit-reversal permutation matrix.

Polar codes combine the input binary channels at the encoder and then split them at the decoder. After the channel combining and splitting processes, the set of binary-input coordinate channels, referred to as bit channels and commonly denoted by $W_N^{(i)}: \mathcal{X} \to \mathcal{Y}^N \times \mathcal{X}^{i-1}$ for $1 \le i \le N$, can be expressed as

$$W_N^{(i)}(y_1^N, u_1^{i-1} \mid u_i) \triangleq \sum_{u_{i+1}^N \in \mathcal{X}^{N-i}} \frac{1}{2^{N-1}} W_N(y_1^N \mid u_1^N), \qquad W_N(y_1^N \mid u_1^N) = \prod_{j=1}^{N} W(y_j \mid x_j), \tag{2}$$

with $x_1^N = u_1^N G_N$, where the set of the vectors $(y_1^N, u_1^{i-1})$ corresponds to the output of the channel $W_N^{(i)}$ and $u_i$ is its input.

Due to the channel combining and splitting operation, the mutual information of the bit channel, $I(W_N^{(i)})$, polarizes to either 0 or 1. Polar codes select $K$ out of $N$ bit positions as information bits based on their channel reliabilities. We denote the set of the selected channels for information transmission as $\mathcal{A} \subset \{1, 2, \ldots, N\}$. Its complement $\mathcal{A}^c$ contains the frozen bits that are known to the receiver a priori and thus are not transmitted. The code rate of the polar codes is thus given by $K/N$.

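The polarization of the bit channels and the choice of the information set $\mathcal{A}$ can be made concrete on the binary erasure channel (BEC), where the erasure probabilities of the bit channels follow a simple recursion. This is a sketch under an assumption made only for illustration: the BEC is not the channel considered in this paper, which uses the AWGN channel, and the names `bec_erasure_probs` and `select_info_set` are hypothetical.

```python
def bec_erasure_probs(n, eps):
    """Erasure probabilities of the N = 2**n bit channels of a BEC(eps).

    One polarization step maps an erasure probability z to the pair
    (2z - z^2, z^2); the list is ordered by bit-channel index 1..N.
    """
    z = [eps]
    for _ in range(n):
        nxt = []
        for zi in z:
            nxt.append(2 * zi - zi * zi)  # degraded ("minus") channel
            nxt.append(zi * zi)           # upgraded ("plus") channel
        z = nxt
    return z


def select_info_set(z, K):
    """Return the 1-based indices of the K most reliable bit channels."""
    order = sorted(range(1, len(z) + 1), key=lambda i: z[i - 1])
    return sorted(order[:K])


if __name__ == "__main__":
    z = bec_erasure_probs(3, 0.5)
    for i, zi in enumerate(z, start=1):
        print("bit channel", i, "mutual information", 1 - zi)
    print("information set for K = 4:", select_info_set(z, 4))
```

The total mutual information is preserved by each step, so the values $1 - z_i$ always sum to $N(1 - \epsilon)$ while individually drifting toward 0 or 1.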
2.2. Concatenated Encoder. As an outer code, the CRC sequence of length $r$ bits is added to the information sequence. This CRC code is of rate $k/(k + r)$ and thus the effective rate of the polar code is $K/N = (k + r)/N$, but only $k$ bits represent information. Therefore, increasing $r$ for better error detection performance may lead to increased code rate and thus reduction of the error correcting capability of the inner polar code. Hence, the redundancy introduced by the CRC code should be designed carefully, especially when the block length $N$ is relatively short.

2.3. List-CRC Decoding. Polar codes with SC decoding are proved to achieve the channel capacity of any symmetric binary-input memoryless channel. For a given channel index $i$ with $1 \le i \le N$, the estimated bit $\hat{u}_i$ that corresponds to $u_i$ is expressed as

$$\hat{u}_i = \begin{cases} \arg\max_{u_i \in \{0,1\}} W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} \mid u_i), & \text{for } i \in \mathcal{A}, \\ 0, & \text{for } i \in \mathcal{A}^c, \end{cases} \tag{3}$$

where $y_1^N = (y_1, y_2, \ldots, y_N)$ is the received symbol vector and $\hat{u}_1^{i-1} = (\hat{u}_1, \hat{u}_2, \ldots, \hat{u}_{i-1})$ is the previously estimated bit vector with respect to the $i$th bit.

SCL decoding searches a limited number of paths that correspond to the input bits in a tree diagram; that is, it retains at most $L$ candidates in parallel [5, 6] for some positive integer $L$ that represents the list size.

When the $L$ most likely paths are determined at the final stage, the path that holds the highest likelihood is chosen as the most reliable path and its information sequence is tested by CRC detector (which is referred to as a CRC test in what follows). If the CRC test indicates that the most likely candidate is incorrect, the candidate with the second highest likelihood will be tested. This process is iterated until a candidate that passes the CRC test is found, or all the $L$ candidates are tested.

Although the use of CRC codes should improve the performance, increasing the length of CRC bits should reduce the efficiency as mentioned in the previous subsection. On the other hand, if the CRC code length is not sufficient, it may erroneously detect the incorrect candidate as the correct one. Therefore, there should be a trade-off in the length of CRC code, which will be elaborated in the next section.

2.4. Systematic Encoding. For a systematic code, each codeword can be explicitly formed by the information bits and parity bits. In many conventional block codes, a systematic encoding structure is adopted for its practical advantage upon retrieval of information bits from the decoded codeword. On the other hand, the original polar codes described in Section 2.1 have a nonsystematic structure. However, it has been shown that systematic polar coding [14] outperforms the original polar codes in terms of BER even though the FER performance remains identical.

For systematic polar encoding, there are largely two encoding approaches. The first approach is the basic technique for general linear codes where the parity bits are identified from the generator matrix, and the second one is to employ the SC decoder as a part of encoding process, which is briefly described as follows: let $x_1^N = \{x_{\mathcal{A}}, x_{\mathcal{A}^c}\}$ be a codeword vector, where $x_{\mathcal{A}}$ denotes a part of the codeword corresponding to information index and $x_{\mathcal{A}^c}$ denotes the corresponding parity bits. First, the information sequence $u_1^K$ is set as $x_{\mathcal{A}}$ and all the bits in $x_{\mathcal{A}^c}$ are set as erased bits, and then decoding of the codeword is performed by the SC decoder. Consequently, we obtain a temporary information sequence $u'^N_1 = \{u'_{\mathcal{A}}, u'_{\mathcal{A}^c}\}$, and then $x_{\mathcal{A}^c}$ is determined by encoding the temporary sequence $u'^N_1$.

3. Performance Analysis of List-CRC Decoding

The exact performance analysis of SC decoding is challenging due to correlation between codeword bits. For a binary erasure channel (BEC), Parizi and Telatar have shown that the correlations between the erasure events decay fast and thus the union bound on the frame error probability becomes tight as the codeword length increases [19]. More recently, Shuval and Tal have derived an improved lower bound for a binary memoryless symmetric channel based on the correlation between the codeword bits [20].

In this section, we analyze the performance of the list-CRC decoding for a given length of CRC bits $r$, provided that the statistical distribution of the correct candidate in the list is available. We note that the distribution itself is difficult to analyze, and thus we simply assume that it is obtained by resorting to Monte-Carlo simulation based on the conventional list decoding without concatenation of error detecting codes.

3.1. Ideal Decoding Error Probability. Let $\mathcal{C}_p \subset \mathcal{X}^N$ denote the $(N, k + r)$ polar code as our inner code and let $\mathcal{C}_c \subset \mathcal{X}^{k+r}$ denote a set of the codewords encoded by $r$-bit CRC encoder as our outer code.

For a given information sequence $d_1^k \in \mathcal{X}^k$, $u_1^{k+r} \in \mathcal{C}_c$ represents the corresponding CRC codeword. Likewise, we denote the codeword of polar codes by $x_1^N \in \mathcal{C}_p$.

For a given received sequence $y_1^N \in \mathcal{Y}^N$, the estimated codeword of list decoder $\hat{u}_1^{k+r}$ is expressed by

$$\hat{u}_1^{k+r} = \arg\max_{u_{\mathcal{A}} \in \mathcal{C}_c} W_N(y_1^N \mid u_{\mathcal{A}}, u_{\mathcal{A}^c}), \tag{4}$$

where $u_{\mathcal{A}}$ and $u_{\mathcal{A}^c}$ are the vectors corresponding to information and frozen bits, respectively, and the resulting estimated information sequence $\hat{d}_1^k$ is denoted by

$$\hat{d}_1^k = I_c(\hat{u}_1^{k+r}), \tag{5}$$

where $I_c(\cdot)$ is the decoding operation of $\mathcal{C}_c$ which maps $\mathcal{X}^{k+r} \to \mathcal{X}^k$.

Let $L$ denote the number of the candidate codewords generated by the list decoder, let $\mathcal{L} = \{1, 2, \ldots, L\}$ be a set of indices in the list, and let $\ell^\star$ denote the list index which corresponds to the correct codeword. Here, without loss of generality we assume that smaller indices correspond to more likely candidates. If the correct codeword is not in the list, that

is, $\ell^\star \notin \mathcal{L}$, we assume that $\ell^\star > L$. Moreover, let $P_{\ell^\star}(\gamma, \ell) = \Pr\{\ell^\star = \ell \mid \gamma\}$ denote the distribution of $\ell^\star$, that is, the probability that the index $\ell^\star$ agrees with $\ell$ at a given SNR $\gamma = E_b/N_0$ over an AWGN channel.

If we assume the ideal case where the correct codeword in the list can be identified without error, the ideal decoding error probability $P_{e,\mathrm{id}}(\gamma, L)$ is expressed as

$$P_{e,\mathrm{id}}(\gamma, L) = 1 - \sum_{\ell=1}^{L} P_{\ell^\star}(\gamma, \ell). \tag{6}$$

3.2. Undetected Error Probability of CRC Codeword. The CRC test may fail with a certain probability $P_{\mathrm{ud}}$ (i.e., undetected error probability of the CRC codeword). As $P_{\mathrm{ud}}$ increases, the error correcting performance of the concatenated polar codes may be degraded.

In general, the undetected error probability of the code $\mathcal{C}_c$ over a binary symmetric channel (BSC) with crossover probability $\epsilon \in [0, 1/2]$ is given by [21]

$$P_{\mathrm{ud}}(\mathcal{C}_c, \epsilon) = \sum_{i=1}^{k+r} A_i \epsilon^i (1 - \epsilon)^{k+r-i}, \tag{7}$$

where $A_i$ ($1 \le i \le k + r$) is the weight distribution of $\mathcal{C}_c$. We note that the CRC code $\mathcal{C}_c$ may be called good if the relationship

$$P_{\mathrm{ud}}(\mathcal{C}_c, \epsilon) \le P_{\mathrm{ud}}\left(\mathcal{C}_c, \tfrac{1}{2}\right) = \frac{2^k - 1}{2^{k+r}} \approx \frac{1}{2^r} \tag{8}$$

holds for all $\epsilon$ [21]. The upper bound of (8) may also be achieved if all the binary sequences of length $k + r$, that is, all the elements of $\mathcal{X}^{k+r}$, may appear in the list with equal probability [22] (in other words, the candidates in the list are modeled as completely random strings [22]).

The major challenge in analyzing the probability $P_{\mathrm{ud}}$ in the case of polar codes with list-CRC decoding is that, due to the correlation among the codewords, the binary sequences in the list may also be correlated. Nevertheless, as mentioned in [22], the upper bound of $P_{\mathrm{ud}}$ in (8) would be a good estimate of the probability of the undetected error, provided that the minimum distance of CRC code is much lower than that of the candidates in the list.

3.3. Approximate Upper Bound on Decoding Error Probability. Given the above undetected error probability model of CRC code, the probability of correct detection by the CRC test is lower bounded as

$$P_{\mathrm{cd}} = 1 - P_{\mathrm{ud}} \ge 1 - \frac{2^k - 1}{2^{k+r}}. \tag{9}$$

On the $L$-candidate list decoding, if the $\ell$th estimate codeword is correct, all the codewords up to the $(\ell - 1)$th list should be incorrect. In other words, all the $(\ell - 1)$ codewords must be correctly detected as invalid codewords by the CRC test and each probability is bounded by (9). Hence, the correct decoding probability $P_c(\gamma, L)$ is expressed as

$$\begin{aligned} P_c(\gamma, L) &= P_{\ell^\star}(\gamma, 1) + \sum_{\ell=2}^{L} P_{\ell^\star}(\gamma, \ell) \prod_{\ell'=1}^{\ell-1} \Pr\{\hat{u}_1^{k+r}(\ell') \notin \mathcal{C}_c \mid \ell^\star = \ell\} \\ &\ge P_{\ell^\star}(\gamma, 1) + \sum_{\ell=2}^{L} P_{\ell^\star}(\gamma, \ell) \prod_{\ell'=1}^{\ell-1} \left\{1 - \frac{2^k - 1}{2^{k+r} - (\ell' - 1)}\right\}, \end{aligned} \tag{10}$$

where $\hat{u}_1^{k+r}(\ell)$ is the $\ell$th estimate vector of polar codes in the list and the term $2^{k+r} - (\ell' - 1)$ corresponds to the number of remaining binary sequences of length $k + r$ prior to the $\ell'$th CRC test. Since $2^{k+r} \gg L$, we may express

$$P_c(\gamma, L) \gtrsim \sum_{\ell=1}^{L} P_{\ell^\star}(\gamma, \ell) \left(1 - 2^{-r}\right)^{\ell-1}. \tag{11}$$

Thus, the approximate upper bound of the decoding error probability for the list CRC-concatenated system $P_e(\gamma, L)$ with the candidates of completely random strings is expressed as

\begin{align} P_e(\gamma, L) = 1 - P_c(\gamma, L) &\lesssim 1 - \sum_{\ell=1}^{L} P_{\ell^\star}(\gamma, \ell) \left(1 - 2^{-r}\right)^{\ell-1} \tag{12} \\ &\approx 1 - \sum_{\ell=1}^{L} P_{\ell^\star}(\gamma, \ell) \left\{1 - (\ell - 1) 2^{-r}\right\} \notag \\ &= P_{e,\mathrm{id}}(\gamma, L) + \underbrace{2^{-r} \sum_{\ell=1}^{L} P_{\ell^\star}(\gamma, \ell)(\ell - 1)}_{\triangleq\, \Delta(\gamma, L, r)}. \tag{13} \end{align}

We observe that the first term in (13) corresponds to the ideal error performance when CRC test is perfect and is derived in (6), whereas the second term $\Delta(\gamma, L, r)$ in (13) is the performance degradation associated with the imperfect CRC code, which can be reduced by increasing the redundant bit length $r$. Therefore, we should select $r$ such that $\Delta(\gamma, L, r)$ is negligible compared to $P_{e,\mathrm{id}}(\gamma, L)$.

4. Simulation Results

In this section, we investigate the FER performance of the CRC-concatenated nonsystematic and systematic polar codes with list-CRC decoding through various simulations with emphasis on the difference in the CRC code length as well as its generator polynomials. Throughout simulations, we employ BPSK for modulation and the channel model is AWGN. The polar codes simulated are designed according to [2] with design-SNR $R E_b/N_0 = -1.5917$ dB (as proposed in [23]), where $R$ is the rate of the inner polar codes. The list size

of the SCL decoding is chosen as $L = 32$. We also consider the fixed spectral efficiency scenario where the code rate of the entire system is 1/2. This is achieved by setting the parameters of polar codes as $(N, N/2 + r)$ with $N$ and $r$ representing the code length of the polar code and the redundant bits imposed by the CRC code, respectively. Therefore, increasing CRC code length (for better list-CRC decoding performance) leads to an increment of the code rate of polar codes with no effective increase of information bits, which in turn reduces the error correction capability. This setting should therefore reveal the trade-off relationship.

Figure 1: FER performance of list-CRC decoding with various lengths of CRC code with the code length $N = 512$.
[Plot: FER versus $E_b/N_0$ (dB). Curves: CRC-4 (0x9), CRC-8 (0xA6), CRC-10 (0x327), CRC-16 (0x8810).]

Figure 2: FER performance of list-CRC decoding with various lengths of CRC with the code length $N = 2048$.
[Plot: FER versus $E_b/N_0$ (dB). Curves: CRC-4 (0x9), CRC-8 (0xA6), CRC-10 (0x327), CRC-16 (0x8810).]

4.1. Comparison of CRC Code Length. We first examine the trade-off between the CRC code length and the code rate of polar codes. In what follows, we describe the generator polynomial of CRC encoders by the hexadecimal representation; for example, the notation 0x8810 for 16-bit CRC encoder indicates 1000 1000 0001 0000 in binary representation, which corresponds to $x^{16} + x^{12} + x^5$, and addition of the implicit "+1" term specifies the generator polynomial of $g(x) = x^{16} + x^{12} + x^5 + 1$ [24].

Figures 1 and 2 compare the FER performance of the conventional nonsystematic polar codes with code lengths $N = 512$ and 2048, respectively, where the CRC code length $r$ is chosen from 4, 8, 10, and 16. In this example, only the results with the CRC polynomials that are found to be best among those compared for each $r$ are shown.

From the results with $N = 512$ shown in Figure 1, because of the loss in terms of code rate, the case of CRC-16, which is a commonly adopted scenario in the study of CRC-concatenated polar codes, turns out to be inferior to that of CRC-10. Therefore, the negative effect of the rate loss associated with the CRC code length becomes dominant compared to the improvement achieved by the reduction of the undetected error probability $P_{\mathrm{ud}}$ for the short polar codes. On the other hand, when $N = 2048$, the performance gap between CRC-16 and CRC-10 becomes smaller as observed in Figure 2. This stems from the fact that as $N$ increases the effect of the relative loss in the code rate becomes negligibly small.

For both cases, the performance loss due to the miss-detection of CRC codes becomes noticeable when $r$ is small.

4.2. Performance Comparison of Analytical and Simulation Results. We next examine the validity of the FER upper bound expression given by (12). We first obtain the statistical distribution of the correct paths $\ell^\star$ by Monte-Carlo simulation (without concatenating CRC code). This has been done by counting the locations of the correct paths through the simulation of $10^7$ trials. We note that the precise FER performance depends on the particular realization of the polar codes, and hence its theoretical characterization should be challenging. It is thus left as future work.

4.2.1. Comparison with Generator Polynomials for Nonsystematic Polar Codes. We first consider the case of $N = 2048$ and $r = 8$ for nonsystematic polar codes. We have selected two specific groups of generator polynomials of CRC codes: (1) all the eighth-order primitive polynomials and (2) all the seventh-order primitive polynomials multiplied by $(x + 1)$. The multiplication of the factor $(x + 1)$ makes the CRC detector capable of detecting all the odd Hamming weight errors and thus is found to be preferable in some applications. Also, the cases where we insert a random interleaver after CRC encoder are considered, where the block diagram of this particular scenario is depicted in Figure 3.

The results for the generator polynomials consisting of all the eighth-order primitive polynomials are compared in Figure 4. We observe that without interleaving, some specific CRC polynomials may perform poorly compared to the analytical results. This may be caused by the fact that bit error patterns due to the SC decoder, namely, the sequences in the list, and the distance property of the CRC codes do not match well for this system. However, with interleaving, the resulting performance becomes similar and comparable with

the analytical result (derived based on the assumption of the list with completely random strings).

Figure 3: A block diagram of the CRC-concatenated polar codes with an interleaver.
[Diagram: $k$ bits, CRC encoder, $K = k + r$ bits, interleaver $\Pi$, polar encoder, $N$ bits, channel $W$, SCL decoder, $L$ candidate sequences, deinterleaver $\Pi^{-1}$, CRC decoder.]

Figure 4: Comparison of FER performances between the analytical bound and full simulations for list-CRC decoding with the CRC codes based on the eighth-order primitive polynomials. The dashed line represents the analytical result. The dotted and solid lines correspond to the simulation results with and without interleaving, respectively.
[Plot: FER versus $E_b/N_0$ (dB). Curves: Analytical, Interleaved, 0x8E, 0x95, 0x96, 0xA6, 0xAF, 0xB1, 0xB2, 0xB4, 0xB8, 0xC3, 0xC6, 0xD4, 0xE1, 0xE7, 0xF3, 0xFA.]

Figure 5: Comparison of FER performances between the analytical bounds and full simulations for list-CRC decoding with CRC codes based on the seventh-order primitive polynomials multiplied by $(x + 1)$. The dashed line represents the analytical result. The dotted and solid lines correspond to the simulation results with and without interleaving, respectively.
[Plot: FER versus $E_b/N_0$ (dB). Curves: Analytical, Interleaved, 0x8C, 0x83, 0x89, 0x97, 0x98, 0xA1, 0xAE, 0xBF, 0xC2, 0xC8, 0xCD, 0xD3, 0xD9, 0xE0, 0xE5, 0xEA, 0xF4, 0xFE.]

Table 1: The number of even and odd weight inputs with code length $N = 512$, code rate $R = 1/2$, and list size $L = 2^{20}$.

| Output weight | $d_{\min} = 8$ | $d_{\min} = 8$ | $d = 16$ | $d = 16$ |
| Input weight | Odd | Even | Odd | Even |
| Nonsystematic | 1 | 63 | 3470 | 114297 |
| Systematic | 32 | 32 | 55840 | 62114 |

The results for the seventh-order primitive polynomials multiplied by $(x + 1)$ are compared in Figure 5. In this case, all the simulation results without insertion of interleaver are inferior to the analytical bound as opposed to the results in Figure 4. Insertion of interleavers may mitigate this gap, but we still observe that the performance achieved by the CRC codes designed in this manner will be poor compared to the cases shown in Figure 4. Therefore, the introduction of the term $(x + 1)$ of the CRC polynomial, which makes all the odd-weight errors detectable, may not be effective in the case of the investigated nonsystematic polar codes. The reason for this behavior will be elucidated through the comparison with the systematic polar codes in what follows.

4.2.2. Comparison with Generator Polynomials for Systematic Polar Codes. We next consider the case of systematic polar coding, where all the parameters are chosen identically to the previous results with nonsystematic encoding. The results for the eighth-order primitive polynomials are compared in Figure 6. These results show that the concatenated polar coding systems, both with and without interleaver, agree with the analytical result.

The results for the seventh-order primitive polynomials multiplied by $(x + 1)$ are compared in the case of systematic polar codes in Figure 7. In contrast to the results of nonsystematic polar codes shown in Figure 5, the systematic polar codes are not affected by whether the factor $(x + 1)$ is present or not in the generator polynomial of the CRC codes. This performance difference between two encoding approaches can be explained by the input-output weight enumerator function (IOWEF) of polar codes.

In Table 1, for the codeword of the polar codes with the first and second lowest Hamming weights (i.e., $d = 8$ and 16 in our case), the corresponding numbers of the odd and even Hamming weight information sequences are compared. The

IOWEF is calculated by the method based on SCL decoding with generation of huge list size as proposed in [17], and the associated parameters here are set as $N = 512$, $R = 1/2$, and $L = 2^{20}$. The result of the conventional nonsystematic polar codes in the table indicates that the even-weight input sequence is dominant when the minimum Hamming weight codeword is generated, whereas the numbers of odd and even-weight input sequences are equal in the case of systematic polar codes. As the SNR increases, the candidate codewords are dominated by those with the minimum Hamming weight. Therefore, for the nonsystematic polar codes where the even-weight errors occur frequently, the use of the CRC polynomials with the factor $(x + 1)$, which are capable of detecting all the odd-weight errors, becomes ineffective. On the other hand, since the systematic polar codes have balanced minimum Hamming weights of even and odd, the inclusion of the factor $(x + 1)$ may not affect the undetected error probability.

Figure 6: Comparison of FER performances between the analytical bound and full simulations for systematic polar codes and list-CRC decoding with the CRC codes based on the eighth-order primitive polynomials. The dotted and solid lines correspond to the simulation results with and without interleaving, respectively.
[Plot: FER versus $E_b/N_0$ (dB). Curves: Analytical, Interleaved, 0x8E, 0x95, 0x96, 0xA6, 0xAF, 0xB1, 0xB2, 0xB4, 0xB8, 0xC3, 0xC6, 0xD4, 0xE1, 0xE7, 0xF3, 0xFA.]

Figure 7: Comparison of FER performances between the analytical bounds and full simulations for systematic polar codes and list-CRC decoding with the CRC codes based on the seventh-order primitive polynomials multiplied by $(x + 1)$. The dotted and solid lines correspond to the simulation results with and without interleaving, respectively.
[Plot: FER versus $E_b/N_0$ (dB). Curves: Analytical, Interleaved, 0x8C, 0x83, 0x89, 0x97, 0x98, 0xA1, 0xAE, 0xBF, 0xC2, 0xC8, 0xCD, 0xD3, 0xD9, 0xE0, 0xE5, 0xEA, 0xF4, 0xFE.]

5. Conclusion

In this work, we have investigated the performance of list-CRC decoder for CRC-concatenated polar codes. In particular, we have focused on the relationship between the FER performance and CRC code length by deriving the analytical bound of the FER. Furthermore, the effect of the generator polynomials of CRC codes on the polar code structure has been analyzed. Specifically, it has been shown first that there is a trade-off relationship between the FER performance and CRC code length when the overall code rate of the concatenated system is identical. In general, if the CRC is longer than necessary for a given target FER, performance degradation is expected due to the relative increase in the code rate of the inner polar codes. Second, it has been shown that the performance of the nonsystematic polar codes depends on the generator polynomials of CRC codes, and that employing the term $(x + 1)$, which makes all the odd-weight errors detectable, is not effective from the viewpoint of their distance spectrum properties. This is in contrast to the systematic polar codes, where the performance is robust against the structure of the CRC codes.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by MIC/SCOPE (Grant no. 155003003) and by JSPS KAKENHI (Grant no. JP16H02345).

References

[1] E. Arıkan, "Channel polarization: a method for constructing capacity-achieving codes for symmetric binary-input memoryless channels," IEEE Transactions on Information Theory, vol. 55, no. 7, pp. 3051–3073, 2009.

[2] R. Mori and T. Tanaka, "Performance of polar codes with the construction using density evolution," IEEE Communications Letters, vol. 13, no. 7, pp. 519–521, 2009.
[3] P. Trifonov, "Efficient design and decoding of polar codes," IEEE Transactions on Communications, vol. 60, no. 11, pp. 3221–3227, 2012.
[4] I. Tal and A. Vardy, "How to construct polar codes," IEEE Transactions on Information Theory, vol. 59, no. 10, pp. 6562–6582, 2013.
[5] I. Tal and A. Vardy, "List decoding of polar codes," in Proceedings of the IEEE International Symposium on Information Theory (ISIT), pp. 1–5, 2011.
[6] I. Tal and A. Vardy, "List decoding of polar codes," IEEE Transactions on Information Theory, vol. 61, no. 5, pp. 2213–2226, 2015.
[7] E. Arıkan, "A performance comparison of polar codes and Reed-Muller codes," IEEE Communications Letters, vol. 12, no. 6, pp. 447–449, 2008.
[8] N. Goela, S. B. Korada, and M. Gastpar, "On LP decoding of polar codes," in Proceedings of the IEEE Information Theory Workshop (ITW '10), pp. 1–5, Dublin, Ireland, September 2010.
[9] K. Niu and K. Chen, "CRC-aided decoding of polar codes," IEEE Communications Letters, vol. 16, no. 10, pp. 1668–1671, 2012.
[10] B. Li, H. Shen, and D. Tse, "An adaptive successive cancellation list decoder for polar codes with cyclic redundancy check," IEEE Communications Letters, vol. 16, no. 12, pp. 2044–2047, 2012.
[11] G. Sarkis, P. Giard, A. Vardy, C. Thibeault, and W. J. Gross, "Fast List Decoders for Polar Codes," IEEE Journal on Selected Areas in Communications, vol. 34, no. 2, pp. 318–328, 2016.
[12] C.-Y. Lou, B. Daneshrad, and R. D. Wesel, "Convolutional-code-specific CRC code design," IEEE Transactions on Communications, vol. 63, no. 10, pp. 3459–3470, 2015.
[13] Q. Zhang, A. Liu, X. Pan, and K. Pan, "CRC Code Design for List Decoding of Polar Codes," IEEE Communications Letters, vol. 21, no. 6, pp. 1229–1232, 2017.
[14] E. Arıkan, "Systematic polar coding," IEEE Communications Letters, vol. 15, no. 8, pp. 860–862, 2011.
[15] G. Sarkis, P. Giard, A. Vardy, C. Thibeault, and W. J. Gross, "Fast polar decoders: algorithm and implementation," IEEE Journal on Selected Areas in Communications, vol. 32, no. 5, pp. 946–957, 2014.
[16] H. Vangala, Y. Hong, and E. Viterbo, "Efficient algorithms for systematic polar encoding," IEEE Communications Letters, vol. 20, no. 1, pp. 17–20, 2016.
[17] Z. Liu, K. Chen, K. Niu, and Z. He, "Distance spectrum analysis of polar codes," in Proceedings of the 2014 IEEE Wireless Communications and Networking Conference, WCNC 2014, pp. 490–495, Turkey, April 2014.
[18] T. Murata and H. Ochiai, "On design of CRC codes for polar codes with successive cancellation list decoding," in Proceedings of the 2017 IEEE International Symposium on Information Theory, ISIT 2017, pp. 1868–1872, Germany, June 2017.
[19] M. B. Parizi and E. Telatar, "On the correlation between polarized BECs," in Proceedings of the 2013 IEEE International Symposium on Information Theory, ISIT 2013, pp. 784–788, Turkey, July 2013.
[20] B. Shuval and I. Tal, "A lower bound on the probability of error of polar codes over BMS channels," https://arxiv.org/abs/1701.01628v2.
[21] T. Kløve and V. I. Korzhik, Error Detecting Codes, Kluwer Academic Publishers, Boston, MA, USA, 1995.
[22] D. P. Bertsekas and R. G. Gallager, Data Networks, Prentice Hall, 2nd edition, 1992.
[23] H. Vangala, E. Viterbo, and Y. Hong, "A comparative study of polar code constructions for the AWGN channel," https://arxiv.org/abs/1501.02473.
[24] P. Koopman and T. Chakravarty, "Cyclic Redundancy Code (CRC) polynomial selection for embedded networks," in Proceedings of the 2004 International Conference on Dependable Systems and Networks, pp. 145–154, July 2004.

Hindawi
Wireless Communications and Mobile Computing
Volume 2018, Article ID 8928761, 6 pages
https://doi.org/10.1155/2018/8928761

Research Article Adding a Rate-1 Third Dimension to Parallel Concatenated Systematic Polar Code: 3D Polar Code

Zhenzhen Liu, Kai Niu, Chao Dong, and Jiaru Lin

Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China

Correspondence should be addressed to Zhenzhen Liu; [email protected]

Received 24 November 2017; Revised 8 March 2018; Accepted 27 March 2018; Published 3 May 2018

Academic Editor: Zesong Fei

Copyright © 2018 Zhenzhen Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In this paper, a three-dimensional polar code (3D-PC) scheme is proposed to improve the error floor performance of parallel concatenated systematic polar code (PCSPC). The proposed 3D-PC is constructed by serially concatenating the PCSPC with a rate-1 third dimension, where only a fraction $\lambda$ of parity bits of PCSPC are extracted to participate in the subsequent encoding. It takes full advantage of the characteristics of parallel concatenation and serial concatenation. In addition, the convergence behavior of 3D-PC is analyzed by the extrinsic information transfer (EXIT) chart. The convergence loss between PCSPC ($\lambda = 0$) and different $\lambda$ provides the reference for choosing the value of $\lambda$ for 3D-PC. Finally, the simulation results confirm that the proposed 3D-PC scheme lowers the error floor.

1. Introduction

The novel concept of parallel concatenated systematic polar code (PCSPC) was first put forward in [1]. The PCSPC scheme consists of two systematic polar codes (SPCs) [2]. It has a performance advantage with respect to the original SPC. In [3], the extrinsic information transfer (EXIT) charts of SPCs of different lengths have been given. As an extension of the above EXIT chart results, the convergence behavior of PCSPC can be analyzed. It can be observed that SPC with larger code length leads to narrower opening. Therefore, it is difficult for PCSPC with large code length SPCs to converge at low error rate. The motivation of our work is to solve this problem.

As we know, there is an error floor for turbo code (TC) at block error rate (BLER) around $10^{-5}$ [4]. In order to improve the performance of TC in the error floor region, three-dimensional turbo code (3D-TC) has been studied in [5–7]. The 3D-TC scheme was proposed by serially concatenating a rate-1 cyclic recursive systematic convolutional (CRSC) code to conventional TC. It is important to note that only a fraction $\lambda$ of parity bits from TC are extracted to participate in the encoding again. Compared with conventional TC, the 3D-TC scheme has larger minimum distance. Therefore, 3D-TC improves the error floor performance greatly. In addition, the influence of $\lambda$ of 3D-TC on convergence threshold and minimum distance has been researched in [6, 7].

It is known from the literature that serial concatenated code has larger minimum distance with respect to parallel concatenated code; however, its convergence threshold is worse than that of parallel concatenation [8]. Meanwhile, inspired by the idea in [7], a 3D polar code (3D-PC) scheme is proposed to improve the error floor performance of PCSPC in this paper. It makes full use of the features of parallel concatenation and serial concatenation. 3D-PC is constituted by adding a rate-1 CRSC code to PCSPC, and only a fraction $\lambda$ of parity bits of PCSPC are sent to the third encoder. Moreover, the convergence behavior of 3D-PC is analyzed by EXIT chart method [9]. It can be utilized to guide the choice of $\lambda$, which is an important parameter that affects the performance of 3D-PC. Simulation results corroborate the effectiveness of the 3D-PC scheme in improving the low error rate performance.

The paper is organized as follows. Section 2 reviews systematic polar code and EXIT chart. The 3D-PC scheme is proposed in Section 3. In Section 4, convergence analysis of 3D-PC is presented. The simulation results are shown in Section 5. Section 6 concludes this paper.

2. Preliminaries

2.1. Systematic Polar Code. Polar code is a capacity-achieving channel code which was proposed by Arıkan in [10]. Given code length $N$ and code rate $R = K/N$, the reliabilities of the $N$ subchannels can be obtained by the Gaussian approximation method [11] or other construction algorithms. Then the $K$ subchannels with high reliability are used to transmit information bits, and the other $N-K$ subchannels are utilized to deliver frozen bits. Let the set $\mathcal{A} \subset \{1,\ldots,N\}$ denote the indexes of those $K$ high-reliability subchannels. Supposing that the input sequence $\mathbf{v} = (v_1,\ldots,v_N)$ is given, the codeword $\mathbf{x}$ of the polar code can be obtained by

$$\mathbf{x} = \mathbf{v}\mathbf{G}_N = \mathbf{v}\,(\mathbf{B}_N \mathbf{F}^{\otimes n}), \tag{1}$$

where $\mathbf{G}_N$ is the generator matrix, $\mathbf{B}_N$ denotes the bit-reversal permutation matrix, $\otimes n$ denotes the $n$-th Kronecker product, $n = \log_2 N$, and $\mathbf{F} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$.

Since the input source sequence $\mathbf{v}$ can be decomposed into two parts $\mathbf{v}_{\mathcal{A}}$ and $\mathbf{v}_{\mathcal{A}^c}$, the codeword $\mathbf{x}$ in (1) can be written as

$$\mathbf{x} = \mathbf{v}\mathbf{G}_N = \mathbf{v}_{\mathcal{A}}\mathbf{G}_{\mathcal{A}} + \mathbf{v}_{\mathcal{A}^c}\mathbf{G}_{\mathcal{A}^c}, \tag{2}$$

where $\mathbf{v}_{\mathcal{A}} \subset \mathbf{v}$ is the information bits, $\mathcal{A}^c = \{1,\ldots,N\} \setminus \mathcal{A}$ denotes the complement of $\mathcal{A}$, and $\mathbf{G}_{\mathcal{A}}$ consists of the rows of $\mathbf{G}_N$ with indices in $\mathcal{A}$.

Systematic polar code is constructed based on polar code [2]. Assume that the $K$-element set $\mathcal{B}$ denotes the indexes of the systematic bits; then $\mathbf{x}_{\mathcal{B}}$ denotes the systematic bits and $\mathbf{x}_{\mathcal{B}^c}$ the check bits. Equation (2) can be rewritten as

$$\mathbf{x}_{\mathcal{B}} = \mathbf{v}_{\mathcal{A}}\mathbf{G}_{\mathcal{A}\mathcal{B}} + \mathbf{v}_{\mathcal{A}^c}\mathbf{G}_{\mathcal{A}^c\mathcal{B}}, \tag{3}$$

$$\mathbf{x}_{\mathcal{B}^c} = \mathbf{v}_{\mathcal{A}}\mathbf{G}_{\mathcal{A}\mathcal{B}^c} + \mathbf{v}_{\mathcal{A}^c}\mathbf{G}_{\mathcal{A}^c\mathcal{B}^c}, \tag{4}$$

where $\mathbf{G}_{\mathcal{A}\mathcal{B}}$ denotes the submatrix of $\mathbf{G}_N$ with row indexes in $\mathcal{A}$ and column indexes belonging to $\mathcal{B}$.

As to SPC, the systematic bits $\mathbf{x}_{\mathcal{B}}$ are known, and $\mathbf{v}_{\mathcal{A}^c}$ are also known and set to zero; thus $\mathbf{v}_{\mathcal{A}}$ can be calculated according to (3):

$$\mathbf{v}_{\mathcal{A}} = (\mathbf{x}_{\mathcal{B}} - \mathbf{v}_{\mathcal{A}^c}\mathbf{G}_{\mathcal{A}^c\mathcal{B}})(\mathbf{G}_{\mathcal{A}\mathcal{B}})^{-1} = \mathbf{x}_{\mathcal{B}}(\mathbf{G}_{\mathcal{A}\mathcal{B}})^{-1}. \tag{5}$$

Further, the check bits $\mathbf{x}_{\mathcal{B}^c}$ can be computed by (4):

$$\mathbf{x}_{\mathcal{B}^c} = \mathbf{x}_{\mathcal{B}}\left[(\mathbf{G}_{\mathcal{A}\mathcal{B}})^{-1}\mathbf{G}_{\mathcal{A}\mathcal{B}^c}\right]. \tag{6}$$

Here, the codeword $\mathbf{x}$ of SPC is achieved.

2.2. EXIT Chart. EXIT chart [9] is an efficient convergence analysis tool for the iterative decoding structure. It tracks the average mutual information of constituent decoders. We use $X$ and $A$ to denote the transmitted bits and the corresponding a priori information, respectively. $A$ is modeled as an independent Gaussian random variable with the following expression:

$$A = \mu_A X + n_A \tag{7}$$

with

$$\mu_A = \frac{\sigma_A^2}{2}, \tag{8}$$

where $n_A$ is a Gaussian random variable with mean zero and variance $\sigma_A^2$. Under the above assumption, the mutual information between the transmitted bits $X$ and the a priori information $A$ can be written as

$$I_A = I(X;A) = 1 - \int_{-\infty}^{+\infty} \frac{e^{-(\xi-\sigma_A^2/2)^2/(2\sigma_A^2)}}{\sqrt{2\pi}\,\sigma_A}\,\log_2\!\left(1+e^{-\xi}\right)d\xi. \tag{9}$$

Assume that extrinsic information is denoted by $E$. The mutual information between $X$ and $E$ is calculated as

$$I_E = I(X;E) = \frac{1}{2}\sum_{x=-1,1}\int_{-\infty}^{+\infty} p_E(\xi \mid X=x)\,\log_2\frac{2\,p_E(\xi \mid X=x)}{p_E(\xi \mid X=-1)+p_E(\xi \mid X=+1)}\,d\xi, \tag{10}$$

where $p_E(\xi \mid X = x)$ is the probability distribution function given the condition $X = x$. It can be obtained by Monte Carlo simulation.

3. Proposed 3D Polar Code Scheme

3.1. Encoding Structure. In short, 3D-PC scheme can be regarded as a concatenation of the inner code and the outer code, PCSPC. The encoding structure of 3D-PC is illustrated in Figure 1. First of all, the input information sequence $\mathbf{u}$ with length $K$ is encoded by the parallel concatenated systematic polar encoder. The component encoders of PCSPC are written as $C_a$ and $C_b$, respectively. Both of them are systematic polar encoders. We use $\mathbf{x}_a$ and $\mathbf{x}_b$ to denote the parity bit sequences of $C_a$ and $C_b$, respectively. Further, the codeword $\mathbf{x}_{PC}$ can be obtained by taking the bits from $\mathbf{x}_a$ and $\mathbf{x}_b$ alternately. The fraction $\lambda$ of $\mathbf{x}_{PC}$ is interleaved by the interleaver $\Pi_c$ and sent to the postencoder $C_c$ for encoding, where $\lambda$ is named the permeability rate. The codeword $\mathbf{x}_c$ is output by the postencoder $C_c$. The parity bits chosen for encoding follow a certain puncturing pattern $\mathbf{p}$ with length $2/\lambda$. The fraction $1-\lambda$ of $\mathbf{x}_{PC}$ is passed to the channel directly, denoted by $\mathbf{x}_{ch}$. The patterns $\mathbf{p}$ and $\bar{\mathbf{p}}$ are complementary. Furthermore, the final codeword $\mathbf{x}$ of 3D-PC with code length $N_{3D}$ is obtained by combining the input sequence $\mathbf{u}$, the parity sequence $\mathbf{x}_{ch}$, and the parity sequence $\mathbf{x}_c$. Here the code rate of 3D-PC is calculated by $R = K/N_{3D} = 1/3$. In order to achieve a higher code rate, it is necessary to puncture some parity bits from $\mathbf{x}_{ch}$ or $\mathbf{x}_c$. Since $\mathbf{x}_c$ contains more information, $\mathbf{x}_{ch}$ is first taken into consideration.

For complexity and performance reasons, the selected $C_c$ encoder should meet some requirements: its decoder is as simple as possible, its decoder inputs soft information and outputs soft information, and its decoder should not
introduce too much error [5]. As a result, a rate-1 cyclic recursive systematic convolutional encoder with generator polynomial $g(D) = 1/(1 + D^2)$ is selected as the encoder $C_c$ [6].

Figure 1: 3D polar encoder.

In the literature [5, 6], the interleavers $\Pi$ and $\Pi_c$ have been well designed to increase the minimum distance, because the design of the interleaver has a great influence on the performance of TC. Since the effect of the interleaver on polar code is not so obvious, a random interleaver is considered in this 3D polar encoding structure for convenience.

In this paper, a regular puncturing pattern is applied to $\mathbf{p}$. A fraction $\lambda$ of the positions in each period of $\mathbf{p}$ are ones, and when $\mathbf{p}$ is applied to $\mathbf{x}_{PC}$, the bits of $\mathbf{x}_{PC}$ corresponding to the positions of the ones are not punctured. For example, assume that $\lambda = 1/4$ and $\mathbf{p} = [11000000]$; then every fourth bit of $\mathbf{x}_a$ and $\mathbf{x}_b$ is extracted and sent to $C_c$ for encoding again. According to the relationship between $\mathbf{p}$ and $\bar{\mathbf{p}}$, it is easy to obtain $\bar{\mathbf{p}} = [00111111]$. If we apply $\bar{\mathbf{p}}$ to $\mathbf{x}_{PC}$, then the bits which are reserved are sent to the channel.

3.2. Decoding Structure. In general, a concatenated code can be decoded by the iterative decoding structure. The decoding diagram of 3D-PC is shown in Figure 2. The sequence $\mathbf{y}$ is received from the channel and is demultiplexed into three parts, $\mathbf{y}_u$, $\mathbf{y}_{ch}$, and $\mathbf{y}_c$. The corresponding channel logarithm likelihood ratios (LLRs) are denoted by $\Psi_u$, $\Psi_{ch}$, and $\Psi_c$. Later they participate in the subsequent decoding. The decoders $C_a^{-1}$, $C_b^{-1}$, and $C_c^{-1}$ correspond to the encoders $C_a$, $C_b$, and $C_c$, respectively.

First, $\Psi_c$ from the channel and $\Gamma_c$ from $C_a^{-1}$ and $C_b^{-1}$ are fed to $C_c^{-1}$ for decoding. Then the extrinsic information $\Lambda_c$ is deinterleaved, combined with $\Psi_{ch}$, and demultiplexed into two parts, $\Psi_a$ and $\Psi_b$. The obtained $\Psi_a$ and $\Psi_b$ are regarded as channel LLRs of parity bits and assist $C_a^{-1}$ and $C_b^{-1}$ in decoding, respectively. For the outer decoder, the extrinsic information related to $\mathbf{u}$ is exchanged between $C_a^{-1}$ and $C_b^{-1}$ because both the input information of $C_a$ and that of $C_b$ are from $\mathbf{u}$. Additionally, the extrinsic information, $\Xi_a$ and $\Xi_b$, of the parity bits which is output by $C_a^{-1}$ and $C_b^{-1}$ goes through the following operations: multiplex, puncture, and interleave. Then the extrinsic LLR information $\Gamma_c$ is obtained and delivered to $C_c^{-1}$ as a priori information at the next iteration. The extrinsic LLR information of part of the parity bits, $\mathbf{x}_p$, is exchanged between the inner decoder $C_c^{-1}$ and the outer decoder as framed in Figure 2. The exchange procedure is terminated when the given out-loop iteration number is reached, and the decision is made from the LLR information of the outer decoder.

Since it is necessary to exchange extrinsic information between $C_a^{-1}$ and $C_b^{-1}$, the decoder adopted should meet the soft-in-soft-out (SISO) requirement. As to the decoding of SPC, there are two SISO decoding algorithms, belief propagation (BP) decoding [12] and soft cancellation (SCAN) decoding [13]. Therefore, the BP decoder and the SCAN decoder can be considered for the decoders $C_a^{-1}$ and $C_b^{-1}$.

As to the decoding of tail-biting convolutional code, the optimal algorithm is the maximum a posteriori probability (MAP) decoding algorithm, but its complexity is very high. Two suboptimal MAP decoding algorithms have been proposed for tail-biting convolutional code, tail-biting BCJR (TB-BCJR) and A3 [14]. Afterwards, a lower-complexity MAP algorithm has been presented to decode tail-biting convolutional code [15]. Therefore, the TB-BCJR algorithm, the A3 algorithm, and the low-complexity MAP algorithm can be chosen as the candidate schemes for the $C_c^{-1}$ decoder.

4. Convergence Behavior Analysis

In this part, EXIT chart is utilized to analyze the convergence threshold of 3D-PC. In Figure 3, the simplified decoding structure for the calculation of EXIT chart is given. In Figure 3, $I_{A,\mathrm{inner}}$ denotes the average mutual information between $A(\mathbf{u}_c)$ and $\mathbf{u}_c$, $I_{E,\mathrm{inner}}$ denotes the average mutual information between $E(\mathbf{u}_c)$ and $\mathbf{u}_c$, $I_{A,\mathrm{outer}}$ denotes the average mutual information between $A(\mathbf{x}_p)$ and $\mathbf{x}_p$, and $I_{E,\mathrm{outer}}$ denotes the average mutual information between $E(\mathbf{x}_p)$ and $\mathbf{x}_p$. The detailed calculation processes of the EXIT chart curve are presented as follows:

(1) Given the signal to noise ratio (SNR), $\mathbf{u}_c$, and $0 \le I_{A,\mathrm{inner}} \le 1$, the a priori information $A(\mathbf{u}_c)$ can be obtained by the assumed model [9] and is sent to the inner decoder $C_c^{-1}$.

(2) Monte Carlo simulation based on $C_c^{-1}$ is performed to get the distributions $p_E$ of (10).

(3) Then $I_{E,\mathrm{inner}}$ is calculated by substituting $p_E$ into (10).

(4) Traverse $I_{A,\mathrm{inner}}$ at a certain step size over the interval $[0, 1]$ and calculate the corresponding $I_{E,\mathrm{inner}}$. Then the curve which depicts the relation between $I_{A,\mathrm{inner}}$ and $I_{E,\mathrm{inner}}$ is obtained.
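Step (1) relies on the Gaussian a priori model of (7)-(9). As an illustration only, the following Python sketch evaluates (9) by numerical integration, inverts it by bisection to find the $\sigma_A$ matching a target $I_{A,\mathrm{inner}}$, and then draws a priori LLRs according to (7) and (8). The function names `j_function`, `j_inverse`, and `a_priori_llrs`, the integration range, and the bisection bounds are assumptions made for this sketch and are not taken from the paper.

```python
import math
import random


def j_function(sigma_a, steps=2000):
    """Evaluate I_A of eq. (9) for a given sigma_A (trapezoidal rule)."""
    if sigma_a <= 0.0:
        return 0.0
    mu = sigma_a ** 2 / 2.0                      # eq. (8)
    lo, hi = mu - 10.0 * sigma_a, mu + 10.0 * sigma_a
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        xi = lo + i * h
        pdf = math.exp(-((xi - mu) ** 2) / (2.0 * sigma_a ** 2))
        pdf /= math.sqrt(2.0 * math.pi) * sigma_a
        # log2(1 + exp(-xi)), written so that exp never overflows
        z = -xi
        log_term = (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / math.log(2.0)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * pdf * log_term
    return 1.0 - total * h


def j_inverse(i_a, tol=1e-6):
    """Find sigma_A with j_function(sigma_A) ~= i_a by bisection.

    Uses the fact that I_A grows monotonically with sigma_A; the upper
    bound of 20 is an assumed search limit.
    """
    lo, hi = 0.0, 20.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if j_function(mid) < i_a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def a_priori_llrs(bits_pm1, i_a, rng):
    """Draw A = mu_A * x + n_A (eq. (7)) for symbols x in {+1, -1}."""
    sigma_a = j_inverse(i_a)
    mu = sigma_a ** 2 / 2.0
    return [mu * x + rng.gauss(0.0, sigma_a) for x in bits_pm1]


if __name__ == "__main__":
    rng = random.Random(0)
    bits = [rng.choice((1, -1)) for _ in range(8)]
    print(a_priori_llrs(bits, 0.5, rng))
```

Sweeping the target mutual information over $[0, 1]$ with such a routine and feeding the resulting LLRs to a decoder would correspond to the traversal in step (4).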

Figure 2: The iterative decoding structure of 3D-PC.

Figure 3: Simplified decoding structure for EXIT chart analysis.
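Steps (2) and (3) of the procedure estimate the densities in (10) from simulated decoder outputs. A minimal sketch of such an estimate is given below, assuming the extrinsic LLR samples have already been collected separately for transmitted symbols $+1$ and $-1$. The function name `mutual_information_from_llrs` and the fixed-width histogram are illustrative choices, not the authors' implementation.

```python
import math


def mutual_information_from_llrs(llr_plus, llr_minus, bins=100):
    """Histogram estimate of I_E in eq. (10).

    llr_plus / llr_minus: extrinsic samples observed when X = +1 / X = -1.
    Each histogram holds probability masses, so the bin width cancels
    inside the logarithm and the integral becomes a sum over bins.
    """
    lo = min(min(llr_plus), min(llr_minus))
    hi = max(max(llr_plus), max(llr_minus))
    width = (hi - lo) / bins or 1.0   # guard against identical samples

    def histogram(samples):
        counts = [0] * bins
        for s in samples:
            k = min(int((s - lo) / width), bins - 1)
            counts[k] += 1
        return [c / len(samples) for c in counts]

    p_plus = histogram(llr_plus)
    p_minus = histogram(llr_minus)

    total = 0.0
    for a, b in zip(p_plus, p_minus):
        for p in (a, b):
            if p > 0.0:
                total += 0.5 * p * math.log2(2.0 * p / (a + b))
    return total


if __name__ == "__main__":
    # Perfectly separated samples carry one full bit of information.
    print(mutual_information_from_llrs([5.0] * 10, [-5.0] * 10))
```

With few samples the histogram estimate is biased, so the number of bins and the number of simulated frames need to be chosen together.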

Likewise, $I_{E,\mathrm{outer}}$ can be obtained by the above processes. The differences are that the decoder for Monte Carlo simulation is the outer decoder rather than $C_c^{-1}$, $I_{A,\mathrm{outer}}$ is given, and the transmitted bits are $\mathbf{x}_p$ instead of $\mathbf{u}_c$.

Figure 4 gives the EXIT chart of 3D-PC with two configurations, $\{\lambda = 1/4, R = 1/3, \mathrm{SNR} = 3.41\}$ and $\{\lambda = 1/2, R = 1/3, \mathrm{SNR} = 4.8\}$. The EXIT chart curves of the outer code and the inner code are denoted by solid curves and dashed curves, respectively. From Figure 4, it can be seen that there is an opening between the EXIT chart curves of the inner code and the outer code for both configurations. Since the curves in each pair do not intersect, the decoding of 3D-PC can reach convergence. In general, the EXIT chart curves can be depicted for varying SNR. The convergence threshold is the SNR at which the tunnel between the pair of EXIT chart curves is very narrow. As to 3D-PC with $\lambda = 1/4$ and $R = 1/3$, the convergence threshold is 3.4 dB. Table 1 lists the convergence thresholds of 3D-PC under different $\lambda$. The number of simulation frames for Monte Carlo simulation is $1.0 \times 10^4$.

Table 1: Convergence thresholds of 3D-PC.

λ            1         1/2       1/4       1/8       0
Threshold    6.60 dB   4.60 dB   3.40 dB   3.16 dB   2.60 dB

From Table 1, it can be observed that the convergence threshold increases with the increase of $\lambda$. Compared with the best convergence threshold when $\lambda$ is 0, the convergence loss under $\lambda = 1/8$ and $\lambda = 1/4$ is relatively small. Therefore, those two configurations are adopted for 3D-PC.

Figure 4: EXIT chart of $\lambda = 1/4$, $R = 1/3$, 3D-PC, SNR = 3.41 and of $\lambda = 1/2$, $R = 1/3$, 3D-PC, SNR = 4.8. (Horizontal axis: $I_A(C_c)$; $I_E(\mathrm{PCSPC})$. Vertical axis: $I_E(C_c)$; $I_A(\mathrm{PCSPC})$.)

5. Simulation Results

In Figure 5, the BLER performance of 3D-PC is given. The underlying channel is additive white Gaussian noise (AWGN)
channel. The input block size is set to $K = 4096$. The code rate of the component SPC is 1/2. However, it is noteworthy that the output of the component SPC is $K$ parity bits, and the total code rate of 3D-PC is $R = 1/3$. The interleavers $\Pi$ and $\Pi_c$ used for simulation are random interleavers. The internal iteration number of the outer decoder is 1 and the iteration number between the inner decoder and the outer decoder is equal to 6. In addition, the SCAN decoding algorithm is utilized for the decoding of SPC and the CRSC code is decoded by low-complexity MAP decoding [15]. Different permeability rates $\lambda$ are set for the 3D-PC scheme, such as 1/4 and 1/8.

Figure 5: Performance comparison between PCSPC and 3D-PC. For both PCSPC and 3D-PC, the input information length and code rate are $K = 4096$ and $R = 1/3$, respectively. (BLER versus $E_b/N_0$ in dB; curves: PCSPC, 3D-PC with $\lambda = 1/8$, 3D-PC with $\lambda = 1/4$.)

As a comparison scheme, the performance of PCSPC is also given in Figure 5. The constituent codes are SPCs with code rate 1/2 and code length 8192. Under this configuration, the total code rate of PCSPC is 1/3, which is the same as that of 3D-PC. The SCAN decoding algorithm is applied to decode the component codes. For fair comparison, the total iteration numbers between the PCSPC component decoders are required to be the same for both the conventional PCSPC and the proposed 3D-PC scheme. Therefore the outer loop number between the two constituent decoders is equal to 6.

By observing Figure 5, it can be found that the performance in the waterfall region is lost for 3D-PC with respect to PCSPC. This phenomenon is in accordance with the analysis in Section 4. That is, the convergence threshold becomes larger with the increase of $\lambda$. In addition, 3D-PC has better BLER performance than PCSPC at low error rates. For PCSPC, the error floor phenomenon begins at about BLER $2 \times 10^{-4}$. However, the error floor does not appear around BLER $2 \times 10^{-4}$ for 3D-PC. In other words, the error floor is lowered by the proposed 3D-PC scheme. The reason may be that 3D-PC has a larger minimum distance compared with PCSPC.

In addition to performance, complexity is also important. As to the conventional PCSPC [1], the computation complexity is written as

$$\Theta_P = O(2IN\log N), \tag{11}$$

where $I$ is the iteration number between the component decoders and $N$ is the code length of the component systematic polar code. For the proposed 3D scheme, it includes not only the complexity of the PCSPC decoder but also the complexity of the tail-biting convolutional code decoder [15]. Comprehensively, the complexity is about

$$\Theta_{3D} = O\!\left(J\left(2N\log N + 4 \times 2^{v} \times \lambda N\right)\right), \tag{12}$$

where $J$ is the out-loop iteration number, $v$ is the memory of the tail-biting convolutional code, $N$ is the code length of the component polar code, and $\lambda$ is the permeability rate. In (12), $2N\log N$ and $4 \times 2^{v} \times \lambda N$ denote the complexity of the outer decoder and the inner decoder in one outer iteration, respectively. Since the inner iteration number between the PCSPC component decoders is 1, the complexity of the outer decoder is $2N\log N$ according to (11). As to the Log-MAP algorithm, the complexity can be regarded as the metric updates in the trellis nodes. Corresponding to (12), 4 denotes the metric updates per trellis node, $2^{v}$ is the number of states, and $\lambda N$ denotes the input information length of the tail-biting convolutional code, which can be known from the 3D polar encoder (refer to Section 3).

In this paper, $I$ and $J$ are set the same to ensure that the total iteration number between the PCSPC component decoders is the same. Moreover, the increased complexity is $4 \times 2^{v} \times \lambda N \times J$, which is brought by the inner decoder. Since the memory $v$ of the tail-biting convolutional code we use is small and $0 \le \lambda \le 1$, the additional complexity of the proposed scheme is small compared with the complexity of the conventional PCSPC decoder. Here, we adopt the parameter configurations in this paper to give a specific example. Assume that $N = 8192$, $I = J = 6$, $v = 2$, and $\lambda = 1/4$; then, with the logarithm taken to base 2, $\Theta_P = 1277952$ and $\Theta_{3D} = 1474560$ are obtained by (11) and (12). Hence, compared to the complexity of the original PCSPC, the additional complexity of 3D polar code is about 15%.

6. Conclusion

In this paper, 3D-PC is presented to lower the error floor of PCSPC. It makes the best use of the characteristics of parallel concatenation and serial concatenation. The simulation results verify the effectiveness of 3D-PC. In addition, EXIT chart is utilized to analyze the convergence threshold of 3D-PC under different permeability rate configurations. The obtained convergence thresholds can guide the choice of the permeability rate of 3D-PC.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 61771066), the National Natural Science Foundation of China (no. 61671080), and the National Science and Technology Major Project (no. 2017ZX03001004).

References

[1] D. Wu, A. Liu, Y. Zhang, and Q. Zhang, “Parallel concatenated systematic polar codes,” Electronics Letters, vol. 52, no. 1, pp. 43–45, 2016.
[2] E. Arikan, “Systematic polar coding,” IEEE Communications Letters, vol. 15, no. 8, pp. 860–862, 2011.
[3] Q. Zhang, A. Liu, Y. Zhang, and X. Liang, “Practical Design and Decoding of Parallel Concatenated Structure for Systematic Polar Codes,” IEEE Transactions on Communications, vol. 64, no. 2, pp. 456–466, 2016.
[4] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error-correcting coding and encoding: turbo-codes,” in Proceedings of the IEEE International Conference on Communications, pp. 1064–1070, Geneve, Switzerland, May 1993.
[5] C. Berrou, A. Graell i Amat, Y. O. C. Mouhamedou, C. Douillard, and Y. Saouter, “Adding a rate-1 third dimension to turbo codes,” in Proceedings of the 2007 IEEE Information Theory Workshop, ITW 2007, pp. 156–161, USA, September 2007.
[6] C. Berrou, A. Graell i Amat, Y. Ould-Cheikh-Mouhamedou, and Y. Saouter, “Improving the distance properties of turbo codes using a third component code: 3D turbo codes,” IEEE Transactions on Communications, vol. 57, no. 9, pp. 2505–2509, 2009.
[7] E. Rosnes and A. Graell i Amat, “Performance analysis of 3-D turbo codes,” IEEE Transactions on Information Theory, vol. 57, no. 6, pp. 3707–3720, 2011.
[8] S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara, “Serial concatenation of interleaved codes: performance analysis, design, and iterative decoding,” IEEE Transactions on Information Theory, vol. 44, no. 3, pp. 909–926, 1998.
[9] S. T. Brink, “Convergence behavior of iteratively decoded parallel concatenated codes,” IEEE Transactions on Communications, vol. 49, no. 10, pp. 1727–1737, 2001.
[10] E. Arıkan, “Channel polarization: a method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” IEEE Transactions on Information Theory, vol. 55, no. 7, pp. 3051–3073, 2009.
[11] P. Trifonov, “Efficient design and decoding of polar codes,” IEEE Transactions on Communications, vol. 60, no. 11, pp. 3221–3227, 2012.
[12] E. Arikan, “A performance comparison of polar codes and Reed-Muller codes,” IEEE Communications Letters, vol. 12, no. 6, pp. 447–449, 2008.
[13] U. U. Fayyaz and J. R. Barry, “Low-complexity soft-output decoding of polar codes,” IEEE Journal on Selected Areas in Communications, vol. 32, no. 5, pp. 958–966, 2014.
[14] J. B. Anderson and S. M. Hladik, “Tailbiting MAP decoders,” IEEE Journal on Selected Areas in Communications, vol. 16, no. 2, pp. 297–302, 1998.
[15] P. Wijesinghe, U. Gunawardana, and R. Liyanapathirana, “Low complexity MAP decoding of tailbiting convolutional codes,” in Proceedings of the 2010 International Conference on Signal Processing and Communications, SPCOM 2010, India, July 2010.

Hindawi
Wireless Communications and Mobile Computing
Volume 2018, Article ID 4397671, 14 pages
https://doi.org/10.1155/2018/4397671

Research Article

Construction and Decoding of Rate-Compatible Globally Coupled LDPC Codes

Ji Zhang,1,2 Baoming Bai,1 Xijin Mu,1 Hengzhou Xu,3 Zhen Liu,1 and Huaan Li1

1 State Key Laboratory of Integrated Service Networks, Xidian University, Xi'an, China
2 School of Mathematics and Statistics, Henan University of Science and Technology, Luoyang, China
3 School of Network Engineering, Zhoukou Normal University, Zhoukou, China

Correspondence should be addressed to Baoming Bai; [email protected]

Received 24 November 2017; Revised 2 February 2018; Accepted 28 February 2018; Published 2 May 2018

Academic Editor: Zesong Fei

Copyright © 2018 Ji Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This paper presents a family of rate-compatible (RC) globally coupled low-density parity-check (GC-LDPC) codes, which is constructed by combining the algebraic construction method and graph extension. Specifically, the highest rate code is constructed using the algebraic method and the codes of lower rates are formed by successively extending the graph of the higher rate codes. The proposed rate-compatible codes provide more flexibility in code rate and guarantee the structural property of algebraic construction. It is confirmed, by numerical simulations over the AWGN channel, that the proposed codes have better performances than their counterpart GC-LDPC codes formed by the classical method and exhibit an approximately uniform gap to the capacity over a wide range of rates. Furthermore, a modified two-phase local/global iterative decoding scheme for GC-LDPC codes is proposed. Numerical results show that the proposed decoding scheme can reduce the unnecessary cost of the local decoder at low and moderate SNRs, without any increase in the number of decoding iterations in the global decoder at high SNRs.

1. Introduction

Globally coupled low-density parity-check (GC-LDPC) codes, which were proposed by Li et al. in [1–4], are a special type of LDPC codes designed for correcting random symbol errors and bursts of errors or erasures. They have a different structure from the conventional LDPC block codes and the spatially coupled LDPC (SC-LDPC) codes [5–12]. From the perspective of the Tanner graph, a GC-LDPC code uses a group of global check nodes (CNs) to couple (or connect) a set of disjoint Tanner graphs called local graphs. We refer to such codes as CN-based globally coupled LDPC (CN-GC-LDPC) codes. Due to the special structure, CN-GC-LDPC codes not only perform well on both the additive white Gaussian noise (AWGN) channel and the binary erasure channel (BEC) but also are effective for correcting burst erasures.

For time-varying channels, from the wireless communication theory, we need to adapt the rate according to the available channel state information (CSI); such an error control strategy is referred to as rate adaptability [13]. Rate-compatible (RC) channel codes with incremental redundancy are often used in conjunction with the HARQ strategy [14–20]. Most recently, RC-LDPC codes have been adopted by the 3rd Generation Partnership Project (3GPP) as the channel coding scheme for the 5G enhanced mobile broadband (eMBB) data channel [21]. Such codes with a wide range of rates and block lengths are a family of nested codes which can be interpreted as a graph extension of high-rate codes [17, 22, 23].

Unlike the SC-LDPC codes, which have some conventional design methods for RC-LDPC codes, such as puncturing variable nodes from the codes with low rate and extending variable nodes to the codes with high rate, the classical GC-LDPC codes are mostly restricted to an invariant code rate [11, 20, 22–25]. The existing construction methods of GC-LDPC codes are not flexible enough for RC code design.

In this paper, we present a family of RC CN-GC-LDPC codes. The proposed construction is based on combining the algebraic construction method and graph extension. The highest rate code, which can be called the mother code, is constructed using the algebraic method. The codes of lower rates are formed by successively extending the graph
of the higher rate. Compared to the original CN-GC-LDPC codes, the proposed family of RC CN-GC-LDPC codes not only provides more flexibility in code rate and allows more efficient encoding but also guarantees the structural property of algebraic construction. Examples show that the proposed codes perform well and exhibit an approximately uniform gap to the capacity over a wide range of rates.

Furthermore, localization is a significant advantage of CN-GC-LDPC codes which allows us to process a consecutive sequence by some local sequences. If a decoder uses a two-phase local/global iterative scheme, its decoding latency and memory requirements are much less than those of the classical decoding scheme without performance degradation [1, 2, 26, 27]. However, the local decoder performs plenty of useless operations at low and moderate SNRs, which causes unnecessary cost in the decoder. In light of this, we present a modified two-phase local/global iterative decoding scheme for CN-GC-LDPC codes which reduces the unnecessary cost of the local decoder at low and moderate SNRs.

The remainder of this paper is organized as follows. In Section 2, we give a brief introduction to CN-GC-LDPC codes. In Section 3, we introduce the concept of graph extension through a four-edge-type LDPC code and propose a simple construction method of the four-edge-type LDPC code family. In Section 4, we present a modified two-phase local/global iterative decoding scheme for CN-GC-LDPC codes. Numerical results for RC GC-LDPC codes are presented in Section 5. Finally, Section 6 concludes this paper.

2. Background

2.1. CN-Based QC-GC-LDPC Codes. A CN-based QC-GC-LDPC code $C_{gc}$ is constructed using a base matrix $\mathbf{B}_{gc}$ which consists of two subarrays as shown:

$$\mathbf{B}_{gc} = [b_{i,j}]_{0 \le i < mt+s,\ 0 \le j < nt} = \begin{bmatrix} \mathbf{B}_0 & & & & \\ & \mathbf{B}_1 & & & \\ & & \ddots & & \\ & & & \mathbf{B}_i & \\ & & & & \ddots \\ & & & & & \mathbf{B}_{t-1} \\ \multicolumn{6}{c}{\mathbf{X}_{gc}} \end{bmatrix}. \tag{1}$$

The upper subarray of $\mathbf{B}_{gc}$ is a $t \times t$ diagonal array with $m \times n$ matrices $\mathbf{B}_i$ on its main diagonal and the lower part is an $s \times nt$ matrix $\mathbf{X}_{gc}$, where $0 \le i \le t-1$. So, $\mathbf{B}_{gc}$ is an $(mt+s) \times nt$ matrix.

Consider a finite field GF($q$) and let $\alpha$ be a primitive element of GF($q$). Let $\mathbf{H}_{gc,cn}$ denote the parity-check matrix of the CN-based QC-GC-LDPC code. We can use the base matrix $\mathbf{B}_{gc}$ and $(q-1)$-fold dispersion to form $\mathbf{H}_{gc,cn}$ as follows: for each entry $b_{i,j}$ in $\mathbf{B}_{gc}$, if $b_{i,j} = 0$, then replace $b_{i,j}$ by the $(q-1) \times (q-1)$ zero matrix (ZM); otherwise, assume $b_{i,j} = \alpha^{l}$ with $0 \le l < q-1$, and then replace $b_{i,j}$ by a $(q-1) \times (q-1)$ circulant permutation matrix (CPM) whose first row has a single 1-component at the $l$th element. The resultant $\mathbf{H}_{gc,cn}$ is an $(mt+s)(q-1) \times nt(q-1)$ matrix over GF(2). The null space of $\mathbf{H}_{gc,cn}$ gives a CN-based QC-GC-LDPC code.

From the perspective of graph, let $\mathcal{G}_i$ be the Tanner graph associated with $\mathbf{B}_i$ for $0 \le i \le t-1$. We call $\mathcal{G}_i$ a local Tanner graph. The different $\mathcal{G}_i$ are pairwise disjoint (for $0 \le i, j \le t-1$ and $i \ne j$, $\mathcal{G}_i$ and $\mathcal{G}_j$ are disjoint). The $s$ global CNs connect all these local graphs, composing the Tanner graph associated with $\mathbf{B}_{gc}$, denoted as $\mathcal{G}_{gc}$, as shown in Figure 1 (in this figure, the edge labels are ignored). We see that the $s$ global CNs provide the only connections between any two disjoint local Tanner graphs.

Figure 1: The Tanner graph of CN-based QC-GC-LDPC codes.

2.2. The Basic Method of Code Construction. Li et al. presented two types of CN-based QC-GC-LDPC codes and their construction methods in [1, 2, 4]. In this subsection, we briefly review the construction method for the first type of QC-GC-LDPC codes in [1]. Let GF($q$) be a nonbinary (NB) field with $q$ elements, where $q$ is a power of a prime. Let $\alpha$ be a primitive element of GF($q$). Then, the powers of $\alpha$, namely, $\alpha^0, \alpha^1, \alpha^2, \ldots, \alpha^{q-2}$, give all the nonzero elements of GF($q$) and $\alpha^{q-1} = \alpha^0 = 1$. We construct the following matrix $\mathbf{B}_{qc}$ over GF($q$) using the method in [2]:

$$\mathbf{B}_{qc} = \begin{bmatrix} \alpha^{0}-1 & \alpha-1 & \cdots & \alpha^{q-2}-1 \\ \alpha^{q-2}-1 & \alpha^{0}-1 & \cdots & \alpha^{q-3}-1 \\ \vdots & \vdots & \ddots & \vdots \\ \alpha-1 & \alpha^{2}-1 & \cdots & \alpha^{0}-1 \end{bmatrix}. \tag{2}$$

Suppose $q-1$ can be factored as the product of two positive integers $l$ and $r$; that is, $q-1 = lr$. The matrix $\mathbf{B}_{qc}$, which has a cyclic structure, is then an $lr \times lr$ square matrix. The first $l$ rows in $\mathbf{B}_{qc}$ comprise an $l \times lr$ matrix, denoted by $\mathbf{W}_0$. Then, $\mathbf{W}_0$ can be partitioned into $r$ submatrices of size $l \times l$, denoted by $\mathbf{W}_{0,0}, \mathbf{W}_{0,1}, \ldots, \mathbf{W}_{0,r-1}$. Based on the cyclic structure of $\mathbf{B}_{qc}$, we can obtain the next $l$ rows in $\mathbf{B}_{qc}$ by cyclically shifting all the rows of $\mathbf{W}_0 = [\mathbf{W}_{0,0}, \mathbf{W}_{0,1}, \ldots, \mathbf{W}_{0,r-1}]$ by $l$ positions to the right, denoted by $\mathbf{W}_1 = [\mathbf{W}_{0,r-1}, \mathbf{W}_{0,0}, \ldots, \mathbf{W}_{0,r-3}, \mathbf{W}_{0,r-2}]$. And so, the matrix $\mathbf{B}_{qc}$ can be partitioned into $r$ submatrices of size $l \times lr$, denoted by $\mathbf{W}_0, \mathbf{W}_1, \ldots, \mathbf{W}_i, \ldots, \mathbf{W}_{r-1}$, where $\mathbf{W}_i = [\mathbf{W}_{0,r-i}, \ldots, \mathbf{W}_{0,r-1}, \mathbf{W}_{0,0}, \ldots, \mathbf{W}_{0,r-i-1}]$, $0 < i \le r-1$. Hence, we can form the following block-cyclic structured matrix $\mathbf{B}_{W}$ from $\mathbf{B}_{qc}$:

$$\mathbf{B}_{W} = \begin{bmatrix} \mathbf{W}_{0,0} & \mathbf{W}_{0,1} & \cdots & \mathbf{W}_{0,r-1} \\ \mathbf{W}_{0,r-1} & \mathbf{W}_{0,0} & \cdots & \mathbf{W}_{0,r-2} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{W}_{0,1} & \mathbf{W}_{0,2} & \cdots & \mathbf{W}_{0,0} \end{bmatrix}. \tag{3}$$

For $0 \le j < r$ and $1 \le m, n \le l$, we take an $m \times n$ submatrix $\mathbf{R}_{0,j}$ from $\mathbf{W}_{0,j}$. For the submatrix $\mathbf{R}_{0,j}$, we restrict the locations of its entries in $\mathbf{W}_{0,j}$ to be identical (i.e., for $0 \le j_1, j_2 < r$, the entries of $\mathbf{R}_{0,j_1}$ and $\mathbf{R}_{0,j_2}$ are taken from identical locations in $\mathbf{W}_{0,j_1}$ and $\mathbf{W}_{0,j_2}$). Then, we form the following $r \times r$ array $\mathbf{B}_{R}$ of $m \times n$ submatrices over GF($q$) with a block-cyclic structure:

$$\mathbf{B}_{R} = \begin{bmatrix} \mathbf{R}_{0,0} & \mathbf{R}_{0,1} & \cdots & \mathbf{R}_{0,r-1} \\ \mathbf{R}_{0,r-1} & \mathbf{R}_{0,0} & \cdots & \mathbf{R}_{0,r-2} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{R}_{0,1} & \mathbf{R}_{0,2} & \cdots & \mathbf{R}_{0,0} \end{bmatrix}. \tag{4}$$

We denote the set of rows in the first row block of $\mathbf{B}_{R}$ by $\Gamma = [\mathbf{R}_{0,0}, \mathbf{R}_{0,1}, \ldots, \mathbf{R}_{0,r-1}]$. So, $\mathbf{B}_{R}$ is a submatrix of $\mathbf{B}_{W}$. In forming the $r \times r$ array $\mathbf{B}_{R}$ of $m \times n$ submatrices over GF($q$) given by (4), there are $l-m$ rows in each row-block and $l-n$ columns in each column-block of $\mathbf{B}_{W}$ which are unused. We denote the set of $r(l-m)$ unused rows of $\mathbf{B}_{W}$ by $\Pi_1$. So, there are $r$ sections which have a total of $l$ components in each row of $\Pi_1$. For each section in a row, we remove the $l-n$ components which are not used in forming the array $\mathbf{B}_{R}$ from $\mathbf{B}_{W}$. We denote the resulting set of $r(l-m)$ rows by $\Pi_1^{*}$. The rows in set $\Pi_1^{*}$ and the rows in set $\Gamma$ are disjoint. Let $s$ and $t$ be two positive integers with $1 \le s \le r(l-m)$ and $1 \le t \le r$. We remove the last $r-t$ sections from $\Pi_1^{*}$ and take $s$ rows from the remaining sections of $\Pi_1^{*}$ to form an $s \times nt$ matrix $\mathbf{X}_{gc}$ over GF($q$). By taking the $t \times t$ diagonal array from the main diagonal of $\mathbf{B}_{R}$ and appending the matrix $\mathbf{X}_{gc}$ to the bottom of it, we form the following base matrix of a QC-GC-LDPC code:

$$\mathbf{B}_{gc,cn} = \begin{bmatrix} \mathbf{R}_{0,0} & & & \\ & \mathbf{R}_{0,0} & & \\ & & \ddots & \\ & & & \mathbf{R}_{0,0} \\ \multicolumn{4}{c}{\mathbf{X}_{gc}} \end{bmatrix}. \tag{5}$$

Each entry in the upper subarray of $\mathbf{B}_{gc,cn}$ is equal to $\mathbf{R}_{0,0}$, which is an $m \times n$ matrix, and such a subarray is called the local part. Note that we can use $t$ different member matrices in the set $\{\mathbf{R}_{0,0}, \mathbf{R}_{0,1}, \ldots, \mathbf{R}_{0,r-1}\}$ as the matrices on the main diagonal of the $t \times t$ upper subarray of $\mathbf{B}_{gc,cn}$ as well. We call the lower subarray of $\mathbf{B}_{gc,cn}$ the global part. The $(q-1)$-fold dispersion of $\mathbf{B}_{gc,cn}$ results in an $(mt+s) \times nt$ array $\mathbf{H}_{gc}$ of $(q-1) \times (q-1)$ CPMs and/or ZMs. The null space of $\mathbf{H}_{gc}$ gives a CN-based QC-GC-LDPC code whose Tanner graph has a girth of at least 6.

Let $R_{gc}$ be the design rate of a $(d_v, g_v; d_c, g_c)$ binary CN-GC-LDPC code, where $d_v$ and $g_v$ denote the VN (variable node) degrees of the local part and the global part, respectively, and $d_c$ and $g_c$ denote the CN degrees of the local part and the global part, respectively. Then, $R_{gc} = 1 - (d_v + g_v)(mt + s)/(d_c mt + g_c s)$, $1 \le d_v \le m$, $1 \le g_v \le s$, $1 \le d_c \le n$, and $1 \le g_c \le nt$. For $d_v = m$, $g_v = s$, $d_c = n$, and $g_c = nt$, $R_{gc}$ is equal to $1 - m/n - s/nt$.

Example 1. Consider the prime field GF(127) for the code construction. Suppose we choose two sets of parameters, $l = 42$, $r = 3$, $m = 5$, $n = 40$, $t = 3$, $s = 1$ and $l = 21$, $r = 6$, $m = 3$, $n = 21$, $t = 6$, $s = 1$. Based on the construction method described above, we form two binary matrices of size $2016 \times 15120$ and $2394 \times 15876$, respectively. We denote those two matrices as $\mathbf{H}_{gc,1}(126, 126)$ and $\mathbf{H}_{gc,2}(126, 126)$. $\mathbf{H}_{gc,1}(126, 126)$ has two column weights, 5 and 6, and two row weights, 39 and 120. The null space of $\mathbf{H}_{gc,1}(126, 126)$ gives a (15120, 13104) CN-based QC-GC-LDPC code $C_1$ with a rate of 0.8667. The Tanner graph of $C_1$ contains 3,165,498 cycles of length 6 and 545,198,472 cycles of length 8. $\mathbf{H}_{gc,2}(126, 126)$ has two column weights, 3 and 4, and two row weights, 21 and 125. The null space of $\mathbf{H}_{gc,2}(126, 126)$ gives a (15876, 13494) CN-based QC-GC-LDPC code $C_2$ with a rate of 0.85. The Tanner graph of $C_2$ contains 198,828 cycles of length 6 and 21,715,639 cycles of length 8.

3. Rate-Compatible GC-LDPC Codes

In this section, we first introduce the concept of graph extension through a four-edge-type LDPC code. Then, we present a construction method of RC GC-LDPC codes.

3.1. Four-Edge-Type LDPC Codes. The Tanner graph of a four-edge-type LDPC code is illustrated in Figure 2. We partition
0 1 t−1

0 1 s−1

CN q CN q Mother s m(0) Extended s e(0)

VN p VN p Mother s m(0) Extended s e(0)

the VNs into $t$ sections, denoted by $\mathbf{p}(0), \mathbf{p}(1), \ldots, \mathbf{p}(t-1)$, each containing the same number of VNs. Every $\mathbf{p}(j)$ is then split into two parts: the mother VNs $\mathbf{p}_m(j)$ of length $N_m(j)$ and the extended VNs $\mathbf{p}_e(j)$ of length $N_e(j)$; that is, $\mathbf{p}(j) = [\mathbf{p}_m(j), \mathbf{p}_e(j)]$. We denote the corresponding coded symbols of the VNs as a vector $\mathbf{v}$ of size $N$. At the $j$th section, we denote the symbol vectors corresponding to $\mathbf{p}_m(j)$ and $\mathbf{p}_e(j)$ as $\mathbf{v}_m(j) = [v_{m,l}(j)]_{0 \le l < N_m(j)}$ and $\mathbf{v}_e(j) = [v_{e,l}(j)]_{0 \le l < N_e(j)}$, respectively. Then, we denote by $\mathbf{q}_g$ the $M_g$ CNs in the lower part, which are called global CNs. In the upper part, we partition the CNs into $t$ sections, denoted by $\mathbf{q}(0), \mathbf{q}(1), \ldots, \mathbf{q}(t-1)$, each containing the same number of CNs. We split each section of such CNs into two parts: the mother CNs $\mathbf{q}_m(j)$ of length $M_m(j)$ and the extended CNs $\mathbf{q}_e(j)$ of length $M_e(j)$, where $0 \le j < t$. We refer to the edges connecting $\mathbf{p}_m(j)$ and $\mathbf{q}_m(j)$ as the type 1 edges, the edges connecting $\mathbf{p}_m(j)$ and $\mathbf{q}_e(j)$ as the type 2 edges, the edges connecting $\mathbf{p}_e(j)$ and $\mathbf{q}_e(j)$ as the type 3 edges, and the edges connecting $\mathbf{p}_m(j)$ and $\mathbf{q}_g$ as the type 4 edges.

Figure 2: The Tanner graph of four-edge-type LDPC codes.

From Figure 2, we can see that the mother VNs are connected not only to the global CNs by the type 4 edges but also to the mother CNs and the extended CNs by the type 1 edges and the type 2 edges, respectively. However, the extended VNs, which are equal in number to the extended CNs, are connected only to the extended CNs, by the type 3 edges. This means that we can form GC-LDPC codes by connecting the mother CNs and the global CNs to the mother VNs in the graph. By adding new nodes and then adding new edges of type 2 and type 3, we can continuously form new GC-LDPC codes. Note that the type 2 edges are the only edges connecting the mother nodes and the extended nodes, so they are quite different from the type 1 edges. Since there are four different types of edges in the Tanner graph, we refer to such LDPC codes as four-edge-type LDPC codes.

For $0 \le j < t$, the submatrices which correspond to the four types of edges are denoted by

$$
\begin{aligned}
\mathbf{H}_m(j) &= [h_{m,k,l}(j)]_{0 \le k < M_m(j),\, 0 \le l < N_m(j)}, \\
\mathbf{H}_{m\to e}(j) &= [h_{m\to e,k,l}(j)]_{0 \le k < M_e(j),\, 0 \le l < N_m(j)}, \\
\mathbf{H}_e(j) &= [h_{e,k,l}(j)]_{0 \le k < M_e(j),\, 0 \le l < N_e(j)}, \\
\mathbf{H}_{m\to g}(j) &= [h_{m\to g,k,l}(j)]_{0 \le k < M_g,\, 0 \le l < N_m(j)}.
\end{aligned} \tag{6}
$$

Then, the parity-check matrix $\mathbf{H}_{\mathrm{FET}}$ of four-edge-type LDPC codes can be represented by

$$
\mathbf{H}_{\mathrm{FET}} =
\begin{bmatrix}
\mathbf{H}_m(0) & \mathbf{0} & & & & & \\
\mathbf{H}_{m\to e}(0) & \mathbf{H}_e(0) & & & & & \\
 & & \mathbf{H}_m(1) & \mathbf{0} & & & \\
 & & \mathbf{H}_{m\to e}(1) & \mathbf{H}_e(1) & & & \\
 & & & & \ddots & & \\
 & & & & & \mathbf{H}_m(t-1) & \mathbf{0} \\
 & & & & & \mathbf{H}_{m\to e}(t-1) & \mathbf{H}_e(t-1) \\
\mathbf{H}_{m\to g}(0) & \mathbf{0} & \mathbf{H}_{m\to g}(1) & \mathbf{0} & \cdots & \mathbf{H}_{m\to g}(t-1) & \mathbf{0}
\end{bmatrix}. \tag{7}
$$
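The block layout in (7) can be illustrated numerically. The following is a minimal sketch in plain Python, assuming binary blocks stored as lists of rows; the helper names (`zeros`, `hcat`, `assemble_h_fet`) and the toy block contents are hypothetical and chosen only for illustration, not taken from the construction in this paper.

```python
def zeros(rows, cols):
    """All-zero block of the given size."""
    return [[0] * cols for _ in range(rows)]

def hcat(*blocks):
    """Place blocks with equal row counts side by side."""
    return [sum((b[r] for b in blocks), []) for r in range(len(blocks[0]))]

def assemble_h_fet(H_m, H_me, H_e, H_mg):
    """Arrange per-section blocks as in (7).

    H_m[j]  : M_m(j) x N_m(j) block (type 1 edges)
    H_me[j] : M_e(j) x N_m(j) block (type 2 edges)
    H_e[j]  : M_e(j) x N_e(j) block (type 3 edges)
    H_mg[j] : M_g    x N_m(j) block (type 4 edges)
    """
    t = len(H_m)
    widths = [len(H_m[j][0]) + len(H_e[j][0]) for j in range(t)]
    rows = []
    for j in range(t):
        # Local matrix of section j: [H_m 0; H_me H_e].
        e_cols = len(H_e[j][0])
        local = hcat(H_m[j], zeros(len(H_m[j]), e_cols)) + hcat(H_me[j], H_e[j])
        left, right = sum(widths[:j]), sum(widths[j + 1:])
        rows += hcat(zeros(len(local), left), local, zeros(len(local), right))
    # Global part: extended VNs get all-zero columns.
    m_g = len(H_mg[0])
    global_part = hcat(*[hcat(H_mg[j], zeros(m_g, len(H_e[j][0]))) for j in range(t)])
    return rows + global_part

# Toy instance with t = 2 sections (all sizes and entries are illustrative).
H_m = [[[1, 1, 0], [0, 1, 1]], [[1, 0, 1], [1, 1, 0]]]   # 2 x 3 each
H_me = [[[1, 0, 1]], [[0, 1, 1]]]                         # 1 x 3 each
H_e = [[[1]], [[1]]]                                      # 1 x 1 each
H_mg = [[[1, 1, 1]], [[1, 1, 1]]]                         # M_g = 1
H_fet = assemble_h_fet(H_m, H_me, H_e, H_mg)
for row in H_fet:
    print(row)
```

In this toy instance the columns of the extended VNs are zero in the last (global) row, which mirrors the statement that the extended VNs are not connected to the global CNs.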

Analogous to GC-LDPC codes, we call the upper subarray the local part and the lower subarray the global part. We extract $\mathbf{H}_m(j)$, $\mathbf{H}_{m\to e}(j)$, and $\mathbf{H}_e(j)$ from $\mathbf{H}_{\mathrm{FET}}$ to form an $M_{\mathrm{local}}(j) \times N_{\mathrm{local}}(j)$ local matrix over GF($q$) as follows:

$$
\mathbf{H}_{\mathrm{local}}(j) =
\begin{bmatrix}
\mathbf{H}_m(j) & \mathbf{0} \\
\mathbf{H}_{m\to e}(j) & \mathbf{H}_e(j)
\end{bmatrix}, \tag{8}
$$

where $\mathbf{0}$ is an all-zero matrix, $M_{\mathrm{local}}(j) = M_m(j) + M_e(j)$, and $N_{\mathrm{local}}(j) = N_m(j) + N_e(j)$ for $0 \le j < t$. Accordingly, $\mathbf{H}_{\mathrm{local}}(j)$ and $\mathbf{H}_{\mathrm{FET}}$ each have a corresponding base matrix. Then, the four-edge-type LDPC code satisfies the following parity-check equations:

$$
[\mathbf{v}_m(j) \;\; \mathbf{v}_e(j)]\, \mathbf{H}_{\mathrm{local}}^{T}(j) = \mathbf{0}, \quad 0 \le j < t, \qquad
\sum_{j=0}^{t-1} \mathbf{v}_m(j)\, \mathbf{H}_{m\to g}^{T}(j) = \mathbf{0}. \tag{9}
$$

We denote the LDPC code given by the null space of $\mathbf{H}_{\mathrm{FET}}$ as $\mathcal{C}_{\mathrm{FET}}$ and the LDPC code given by the null space of $\mathbf{H}_{\mathrm{local}}(j)$ as $\mathcal{C}_{\mathrm{local}}(j)$, where $0 \le j < t$. Notice that $\mathbf{H}_e(j)$ is square and has full rank for $0 \le j < t$, and the entries above $\mathbf{H}_e(j)$ have to be all zero to achieve rate compatibility; thus, $\mathcal{C}_{\mathrm{local}}(j)$ is a RC-LDPC code [16]. We extract $\mathbf{H}_m(j)$ and $\mathbf{H}_{m\to g}(j)$ from $\mathbf{H}_{\mathrm{FET}}$ to form an $(M_g + \sum_j M_m(j)) \times \sum_j N_m(j)$ matrix $\mathbf{A}$ over GF($q$) as follows:

$$
\mathbf{A} =
\begin{bmatrix}
\mathbf{H}_m(0) & & & \\
 & \mathbf{H}_m(1) & & \\
 & & \ddots & \\
 & & & \mathbf{H}_m(t-1) \\
\mathbf{H}_{m\to g}(0) & \mathbf{H}_{m\to g}(1) & \cdots & \mathbf{H}_{m\to g}(t-1)
\end{bmatrix}. \tag{10}
$$

Then, based on (8) and (9), we have

$$
[\mathbf{v}_m(0) \;\; \mathbf{v}_m(1) \;\; \cdots \;\; \mathbf{v}_m(t-1)]\, \mathbf{A}^{T} = \mathbf{0}. \tag{11}
$$

This means that $\mathbf{H}_{\mathrm{FET}}$ is a RC parity-check matrix obtained progressively by extension, in order of decreasing rate [17]. Furthermore, all the mother VNs are connected to $\mathbf{q}_g$ and there are no edges connecting $\mathbf{p}(j)$ and $\mathbf{q}(j')$ for $j \ne j'$; thus, the four-edge-type LDPC codes can be viewed as a type of RC CN-based QC-GC-LDPC codes.

3.2. Construction of Rate-Compatible GC-LDPC Codes. We now present a method to construct RC GC-LDPC codes. Because the extended VNs are not connected to $\mathbf{q}_g$, the construction of RC GC-LDPC codes can be realized by successively extending the graph of the local part. The successive extension of the parity-check matrices of the local part in the $j$th section is shown in Figure 3.

Figure 3: Parity-check matrix extension in the $j$th section of local part.

Assume that we intend to construct a family of four-edge-type LDPC codes containing $f$ member codes $\mathcal{C} = \{\mathcal{C}_{\mathrm{FET},0}, \mathcal{C}_{\mathrm{FET},1}, \ldots, \mathcal{C}_{\mathrm{FET},f-1}\}$. The corresponding target rate set $\mathcal{R} = \{R_{\mathrm{FET},0}, R_{\mathrm{FET},1}, \ldots, R_{\mathrm{FET},f-1}\}$ is given. We denote by $\mathbf{H}_{\mathrm{FET},i}$ the $M_{\mathrm{FET},i} \times N_{\mathrm{FET},i}$ parity-check matrix of $\mathcal{C}_{\mathrm{FET},i}$, where $0 \le i < f$. At the local part of $\mathbf{H}_{\mathrm{FET},i}$, there are $t$ submatrices on the main diagonal of size $M_{\mathrm{local},i}(j) \times N_{\mathrm{local},i}(j)$, and we denote each submatrix as $\mathbf{H}_{\mathrm{local},i}(j)$, where $0 \le i < f$ and $0 \le j < t$. So, $\mathbf{H}_{\mathrm{local},0}(j)$ is the mother matrix of $\mathbf{H}_{\mathrm{local},1}(j)$. The code rate of $\mathcal{C}_{\mathrm{FET},0}$ is

$$
R_{\mathrm{FET},0} = 1 - \frac{\mathrm{Rank}(\mathbf{H}_{\mathrm{FET},0})}{\sum_j N_{\mathrm{local},0}(j)}. \tag{12}
$$

Suppose $\mathbf{H}_{\mathrm{FET},0}$ has full rank:

$$
R_{\mathrm{FET},0} = 1 - \frac{M_g + \sum_j M_{\mathrm{local},0}(j)}{\sum_j N_{\mathrm{local},0}(j)}. \tag{13}
$$

By adding $M_{e,1}(j)$ new rows and $N_{e,1}(j)$ new columns to the mother matrix $\mathbf{H}_{\mathrm{local},0}(j)$, we obtain an extended matrix $\mathbf{H}_{\mathrm{local},1}(j)$. The upper part of $\mathbf{H}_{\mathrm{local},1}(j)$ is composed of the mother matrix $\mathbf{H}_{\mathrm{local},0}(j)$ and an all-zero matrix of size $M_{\mathrm{local},0}(j) \times N_{e,1}(j)$. The lower part of $\mathbf{H}_{\mathrm{local},1}(j)$ is composed of a matrix $\mathbf{H}_{m\to e,1}(j)$ of size $M_{e,1}(j) \times N_{\mathrm{local},0}(j)$ and a matrix $\mathbf{H}_{e,1}(j)$ of size $M_{e,1}(j) \times N_{e,1}(j)$. Thus, the $t$ updated submatrices on the main diagonal of $\mathbf{H}_{\mathrm{FET},0}$ form the local part of $\mathbf{H}_{\mathrm{FET},1}$. Then, we partition the global part of $\mathbf{H}_{\mathrm{FET},0}$ into $t$ submatrices of size $M_g \times N_{\mathrm{local},0}(j)$ and we denote each submatrix as $\mathbf{H}_{m\to g,0}(j)$, where $0 \le j < t$. By appending an $M_g \times N_{e,1}(j)$ all-zero matrix to the right of $\mathbf{H}_{m\to g,0}(j)$, we form an extended matrix $\mathbf{H}_{m\to g,1}(j)$, where $0 \le j < t$. We combine the extended submatrices to compose the global part of $\mathbf{H}_{\mathrm{FET},1}$. Suppose $\mathbf{H}_{\mathrm{FET},1}$ has full rank; we have

$$
R_{\mathrm{FET},1} = 1 - \frac{M_g + \sum_j M_{\mathrm{local},1}(j)}{\sum_j N_{\mathrm{local},1}(j)}
= 1 - \frac{M_g + \sum_j M_{\mathrm{local},0}(j) + \sum_j M_{e,1}(j)}{\sum_j N_{\mathrm{local},0}(j) + \sum_j N_{e,1}(j)}. \tag{14}
$$

Recursively, for the code $\mathcal{C}_{\mathrm{FET},i+1}$, we can obtain its parity-check matrix $\mathbf{H}_{\mathrm{FET},i+1}$ from the previously generated parity-check matrix $\mathbf{H}_{\mathrm{FET},i}$ for $\mathcal{C}_{\mathrm{FET},i}$, $0 \le i \le f-2$. Suppose $\mathbf{H}_{\mathrm{FET},i+1}$ has full rank; the rate of the code $\mathcal{C}_{\mathrm{FET},i+1}$ satisfies

$$
R_{\mathrm{FET},i+1} = 1 - \frac{M_g + \sum_j M_{\mathrm{local},i+1}(j)}{\sum_j N_{\mathrm{local},i+1}(j)}. \tag{15}
$$

In particular, it is not necessary for $\mathbf{H}_{\mathrm{FET},1}$ to be a full-rank matrix for constructing the RC GC-LDPC codes. Using elementary row and column operations on $\mathbf{H}_{\mathrm{FET},i}$, we have

$$
\begin{bmatrix}
\mathbf{H}_{\mathrm{FET},i-1} & \mathbf{0} \\
\mathbf{H}_{\mathrm{ME},i} & \mathrm{diag}\big(\mathbf{H}_{e,i}(0), \mathbf{H}_{e,i}(1), \ldots, \mathbf{H}_{e,i}(t-1)\big)
\end{bmatrix}, \tag{16}
$$

where

$$
\mathbf{H}_{\mathrm{ME},i} =
\begin{bmatrix}
\mathbf{H}_{m\to e,i}(0) & & & \\
 & \mathbf{H}_{m\to e,i}(1) & & \\
 & & \ddots & \\
 & & & \mathbf{H}_{m\to e,i}(t-1)
\end{bmatrix} \tag{17}
$$

and $0 < i \le f-1$. Consider $\mathbf{H}_{e,i}(j)$ as a nonsingular matrix, where $0 \le j \le t-1$; the rank $\mathrm{Rank}(\mathbf{H}_{\mathrm{FET},i})$ of $\mathbf{H}_{\mathrm{FET},i}$ can be written as

$$
\mathrm{Rank}(\mathbf{H}_{\mathrm{FET},i}) = \mathrm{Rank}(\mathbf{H}_{\mathrm{FET},i-1}) + \sum_{j=0}^{t-1} M_{e,i}(j)
= \mathrm{Rank}(\mathbf{H}_{\mathrm{FET},0}) + \sum_{k=1}^{i} \sum_{j=0}^{t-1} M_{e,k}(j), \tag{18}
$$

where $0 < i \le f-1$. It follows that $\mathbf{H}_{\mathrm{FET},i}$ is a full-rank matrix if and only if $\mathbf{H}_{\mathrm{FET},0}$ is a full-rank matrix. The construction of the parity-check matrix of the RC GC-LDPC codes is described in detail in Algorithm 1.

Furthermore, the RC CN-based QC-GC-LDPC codes with extension structure allow more efficient encoding. In particular, for $0 \le i \le f-1$ and $0 \le j \le t-1$, if $\mathbf{H}_{e,i}(j)$ is an identity matrix, the encoding of such RC CN-based QC-GC-LDPC codes is more efficient: after encoding the mother VNs, the encoding of the extended VNs only involves XOR operations on the precode output symbols.

Example 2. Consider the prime field GF(193) for code construction. Let the six parameters of the construction method described in Section 2 be 64, 3, 5, 27, 3, and 1. Based on that method, we form a 3072 × 15552 binary matrix, referred to below as the Section 2 matrix. It has one column weight, 6, and two row weights, 27 and 81. Based on base graph 1 in the 5G standard, we construct a masking matrix $\mathbf{D}_{\mathrm{local}}$ of size 960 × 5184 for each submatrix on the main diagonal at the local part of the Section 2 matrix [28]. In particular, the first 2 × 192 columns of $\mathbf{D}_{\mathrm{local}}$ are punctured columns. The degree distribution for $\mathbf{D}_{\mathrm{local}}$ is

$$
\lambda(x) = 0.0127 + 0.0759x + 0.7975x^{2} + 0.0506x^{3} + 0.0633x^{4}, \qquad
\rho(x) = 0.0380x^{2} + 0.962x^{18},
$$

where $\lambda(x)$ and $\rho(x)$ are the variable and check degree distributions from the edge perspective. We can construct a 3072 × 15552 matrix $\mathbf{H}_{\mathrm{FET},0}$ by using $\mathbf{D}_{\mathrm{local}}$ to mask each submatrix on the main diagonal at the local part of the Section 2 matrix. Suppose $f = 20$. By applying graph extension, we obtain the protomatrix of $\mathbf{H}_{m\to e,i}(j)$ and $\mathbf{H}_{e,i}(j)$, where $1 \le i \le 19$ and $0 \le j \le 2$. In particular, $\mathbf{p}_e(j)$ and $\mathbf{q}_e(j')$ are connected by one edge if and only if $j = j'$. In the terminology of protograph construction, lifting the protograph of $\mathbf{H}_{m\to e,i}(j)$ and $\mathbf{H}_{e,i}(j)$ is equivalent to dispersing their base matrix [11]. Then, we can use the method in [29] to find circulants for the protomatrix of $\mathbf{H}_{m\to e,i}(j)$ and $\mathbf{H}_{e,i}(j)$. Based on the construction method described in Algorithm 1, we construct a family of four-edge-type LDPC codes.

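Input: the construction parameters and the target rate set $\mathcal{R}$.
Output: $\mathbf{H}_{\mathrm{FET}}$.
(1) Use the construction method described in Section 2 to form an $M_{\mathrm{FET},0} \times N_{\mathrm{FET},0}$ matrix $\mathbf{H}_{\mathrm{FET},0}$, which specifies a GC-LDPC code $\mathcal{C}_{\mathrm{FET},0}$ with rate $R_{\mathrm{FET},0}$.
(2) for $i = 0 : f-2$ do
(3)   Use $\mathbf{H}_{\mathrm{FET},i}$ as the mother matrix.
(4)   for $j = 0 : t-1$ do
(5)     Generate a matrix $\mathbf{H}_{m\to e,i+1}(j)$ of size $M_{e,i+1}(j) \times N_{\mathrm{local},i}(j)$.
(6)     Generate a matrix $\mathbf{H}_{e,i+1}(j)$ of size $M_{e,i+1}(j) \times N_{e,i+1}(j)$ which is square and has full rank.
(7)   for $j = 0 : t-1$ do
(8)     Compose the parity-check matrix $\mathbf{H}_{\mathrm{local},i+1}(j)$ of the form
        $\mathbf{H}_{\mathrm{local},i+1}(j) = \begin{bmatrix} \mathbf{H}_{\mathrm{local},i}(j) & \mathbf{0} \\ \mathbf{H}_{m\to e,i+1}(j) & \mathbf{H}_{e,i+1}(j) \end{bmatrix}$.
(9)   for $j = 0 : t-1$ do
(10)    Compose the matrix $\mathbf{H}_{m\to g,i+1}(j)$ of the form $\mathbf{H}_{m\to g,i+1}(j) = [\mathbf{H}_{m\to g,i}(j) \;\; \mathbf{0}]$, where $\mathbf{0}$ is an all-zero matrix of size $M_g \times N_{e,i+1}(j)$.
(11)  Compose the matrix $\mathbf{H}_{\mathrm{FET},i+1}$ of the form
        $\mathbf{H}_{\mathrm{FET},i+1} = \begin{bmatrix} \mathbf{H}_{\mathrm{local},i+1}(0) & & \\ & \ddots & \\ & & \mathbf{H}_{\mathrm{local},i+1}(t-1) \\ \mathbf{H}_{m\to g,i+1}(0) & \cdots & \mathbf{H}_{m\to g,i+1}(t-1) \end{bmatrix}$
      and update $i$: $i = i + 1$.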

Algorithm 1: Algorithm for constructing RC GC-LDPC codes.
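Steps (4) to (11) of Algorithm 1 can be sketched on a toy binary example. The snippet below is a minimal illustration in plain Python, not the construction used in Example 2: it assumes a small hand-written mother matrix instead of the Section 2 construction, draws $\mathbf{H}_{m\to e,i+1}(j)$ at random instead of using protograph lifting and the circulant search of [29], and takes $\mathbf{H}_{e,i+1}(j)$ to be an identity matrix (the special case mentioned for efficient encoding). All helper names are hypothetical.

```python
import random

def zeros(r, c):
    return [[0] * c for _ in range(r)]

def identity(n):
    return [[int(a == b) for b in range(n)] for a in range(n)]

def hcat(*blocks):
    """Place blocks with equal row counts side by side."""
    return [sum((b[r] for b in blocks), []) for r in range(len(blocks[0]))]

def extend_once(local, glob, m_e, rng):
    """Steps (4)-(10): extend every section by m_e rows and m_e columns."""
    new_local, new_glob = [], []
    for H_loc, H_g in zip(local, glob):
        n_loc = len(H_loc[0])
        # Step (5): random placeholder for H_{m->e} (an assumption of this sketch).
        H_me = [[rng.randint(0, 1) for _ in range(n_loc)] for _ in range(m_e)]
        # Step (6): square, full-rank H_e; here simply an identity matrix.
        H_e = identity(m_e)
        # Step (8): [H_local 0; H_me H_e].
        new_local.append(hcat(H_loc, zeros(len(H_loc), m_e)) + hcat(H_me, H_e))
        # Step (10): append all-zero columns to the global submatrix.
        new_glob.append(hcat(H_g, zeros(len(H_g), m_e)))
    return new_local, new_glob

def assemble(local, glob):
    """Step (11): block-diagonal local part stacked on the global part."""
    widths = [len(H[0]) for H in local]
    rows = []
    for j, H in enumerate(local):
        rows += hcat(zeros(len(H), sum(widths[:j])), H,
                     zeros(len(H), sum(widths[j + 1:])))
    return rows + hcat(*glob)

def gf2_rank(matrix):
    """Rank over GF(2) by Gaussian elimination on integer bitmasks."""
    rows = [int("".join(map(str, r)), 2) for r in matrix]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

# Toy mother code: t = 2 sections of size 2 x 4 and M_g = 1 global row.
rng = random.Random(0)
local = [[[1, 1, 0, 1], [0, 1, 1, 1]], [[1, 0, 1, 1], [1, 1, 0, 1]]]
glob = [[[1, 1, 1, 1]], [[1, 1, 1, 1]]]
family = [assemble(local, glob)]
for _ in range(3):  # three extension steps with M_e = N_e = 1 per section
    local, glob = extend_once(local, glob, m_e=1, rng=rng)
    family.append(assemble(local, glob))

for i, H in enumerate(family):
    n = len(H[0])
    print(i, len(H), n, round(1 - gf2_rank(H) / n, 4))  # member, rows, columns, rate as in (12)
```

Because each $\mathbf{H}_e$ block is nonsingular, the rank grows by exactly the number of added rows at every step, as stated in (18), so the code dimension stays fixed while the rate decreases.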

We refer to the $i$th member of this family as $\mathcal{C}_{\mathrm{FET},i}$, where $0 \le i \le 19$. Note that the four-edge-type LDPC code is constructed based on a finite field, which guarantees the structural property of the algebraic construction, and the Tanner graphs of such codes have a girth of at least 6. The parameters of this family of four-edge-type LDPC codes are summarized in Table 1, and the diagram of its matrix is shown in Figure 4.

Table 1: Parameters of a RC GC-LDPC code.

Member                         Size             Code rate   Length (bits)
$\mathbf{H}_{\mathrm{FET},0}$    3072 × 15552     0.8667      14400
$\mathbf{H}_{\mathrm{FET},1}$    3648 × 16128     0.8333      14976
$\mathbf{H}_{\mathrm{FET},2}$    4224 × 16704     0.8025      15552
$\mathbf{H}_{\mathrm{FET},4}$    5376 × 17856     0.7471      16704
$\mathbf{H}_{\mathrm{FET},8}$    7680 × 20160     0.6566      19008
$\mathbf{H}_{\mathrm{FET},19}$   14016 × 26496    0.4924      25344

The first 2 × 192 columns of $\mathbf{H}_{\mathrm{local},i}(j)$ and $\mathbf{H}_{m\to g,i}(j)$ are punctured columns, where $0 \le j \le 2$ and $i = 0, 1, 2, 4, 8, 19$.

Figure 4: Extension structure of parity-check matrix of the RC GC-LDPC codes in Example 2. (The two leading circulant block columns of each section are punctured.)

Example 3. In order to improve the flexibility of the code rate, we can puncture parity bits from a family of four-edge-type LDPC codes. Based on $\mathbf{H}_{\mathrm{FET},1}$ presented in Example 2, for instance, we set the last 98 columns of $\mathbf{H}_{e,1}(j)$ as punctured columns, where $0 \le j \le 2$. Then, we obtain a (14682, 12480) GC-LDPC code with a rate of 0.85 and denote it as $\mathcal{C}_3$.

4. Local/Global Two-Phase Decoding Scheme

In the classical iterative decoding scheme, the decoder includes all the VNs in a block and performs all decoding operations in one phase [30]. We refer to such an iterative decoding scheme as the one-phase iterative scheme. In contrast to the classical iterative decoding scheme, Li et al. devised a two-phase local/global iterative scheme for CN-GC-LDPC codes [1, 2]. Taking advantage of the cascading structure of the local part, we can split the local part into $t$ independent sections, and each section can use an independent decoder at the local phase. If all sections of the local part are successfully decoded and the locally decoded codeword satisfies the parity-check constraints in the global part, the locally decoded codeword is delivered to the user. If it does not, the global decoder starts to process the received codeword from the local decoder. In a good channel environment, we only need to use a group of ($t$ or fewer) local decoders to process a group of ($t$ or fewer) consecutive received sections in parallel. This means that, for a GC-LDPC code, we can process a consecutive sequence as several local sequences. The advantage of this decoding scheme is that the decoder has lower latency and smaller memory requirements. We refer to such a scheme as the normal two-phase local/global iterative scheme. However, in a bad channel environment, we find that the local decoder performs plenty of useless operations, which causes unnecessary cost for the decoder. Therefore, we present a modified two-phase local/global iterative decoding scheme for CN-GC-LDPC codes.

4.1. Modified Local/Global Two-Phase Iterative Decoding Scheme. Let $\mathbf{z} = (\mathbf{z}_0, \mathbf{z}_1, \ldots, \mathbf{z}_{t-1})$ be the received sequences. Firstly, in the local phase of decoding, the $t$ received sequences are decoded by $t$ independent decoders with the maximum iteration number $I_{\mathrm{max1}}$. If all the sections are successfully decoded and the locally decoded codeword satisfies the parity-check constraints in the global part, the locally decoded codeword is delivered to the user. If one of the decoders fails to decode a received section, we switch the decoding from the local phase to the global phase. If all sections are successfully decoded, but the locally decoded codeword does not satisfy the parity-check constraints in the global part, we save the decoded information (LLRs) and return to the local decoder, with the maximum iteration number of the local decoders set to $I_{\mathrm{max2}}$. Then, if one of the decoders fails to decode a received section or the locally decoded codeword does not satisfy the parity-check constraints in the global part, we switch the decoding from the local phase to the global phase.

In the global phase of decoding, a global decoder is activated. It processes the received vector $\mathbf{z}$ with the channel information and the combined decoded information (LLRs) of the successfully decoded sections as inputs. The diagram of the modified two-phase local/global iterative decoding scheme is illustrated in Figure 5.

Figure 5: Modified local/global two-phase iterative decoding scheme.

We define the order of decoding complexity as the number of operations required per information bit and denote the order of decoding complexity for the normal two-phase decoding algorithm as $\mathcal{O}_{\mathrm{normal}}$. Suppose the number of operations for one iteration of the global part is $C_g$ and the number of operations for one iteration in the $j$th section of the local part is $C_l(j)$, where $0 \le j \le t-1$. As stated in [1–3], we have

$$
\mathcal{O}_{\mathrm{normal}} = \frac{1}{K}\left( I_g C_g + \sum_{j=0}^{t-1} I_l(j)\, C_l(j) \right), \tag{19}
$$

where $K$ is the length of the information bits, $I_l(j)$ ($0 \le I_l(j) \le I_{\max}$) is the number of iterations involving updates of variables in the $j$th section of the local part, and $I_g$ is the number of iterations involving updates of variables in the global part. Then, the order of decoding complexity of the modified two-phase decoding algorithm, which is denoted as $\mathcal{O}_{\mathrm{modified}}$, can be summarized as

$$
\mathcal{O}_{\mathrm{modified}} = \frac{1}{K}\left( \sum_{j=0}^{t-1} C_l(j)\,\big(I_{l1}(j) + I_{l2}(j)\big) + I_g C_g \right), \tag{20}
$$

where $I_{l1}(j)$ ($0 \le I_{l1}(j) \le I_{\mathrm{max1}}$) and $I_{l2}(j)$ ($0 \le I_{l2}(j) \le I_{\mathrm{max2}}$) are the numbers of iterations involving updates of variables in the $j$th section of the local part when the maximum iteration number is $I_{\mathrm{max1}}$ and $I_{\mathrm{max2}}$, respectively. As can be seen, in a bad channel environment, few sections are successfully decoded at the first local phase, and each successfully decoded section performs no more than $I_{\mathrm{max1}}$ iterations. So, we have

$$
\mathcal{O}_{\mathrm{normal}} \approx \frac{1}{K}\left( I_g C_g + \sum_{j=0}^{t-1} C_l(j)\, I_{\max} \right), \qquad
\mathcal{O}_{\mathrm{modified}} \approx \frac{1}{K}\Big( I_g C_g + C_l(0)\, I_{\mathrm{max1}}(0) \Big). \tag{21}
$$

Considering $I_{\mathrm{max1}}(0) \ll I_{\max}$, we have $\mathcal{O}_{\mathrm{modified}} \ll \mathcal{O}_{\mathrm{normal}}$. Moreover, in a good channel environment, most successfully decoded sections satisfy the parity-check constraints in the global part, and no more than $I_{\mathrm{max1}} + I_{\mathrm{max2}}$ iterations are required in each successfully decoded section. For $I_{\max} = I_{\mathrm{max1}} + I_{\mathrm{max2}}$, we have $\mathcal{O}_{\mathrm{modified}} \approx \mathcal{O}_{\mathrm{normal}}$.

5. Numerical Results

In this section, we first present the simulation performance of RC GC-LDPC codes over the AWGN channel. Then, we compare the decoding complexity of the different decoding schemes presented in Section 4. The decoding latency of the different decoding schemes is also discussed.

5.1. Error-Correcting Performance. We now provide the simulated BER and BLER performances of RC GC-LDPC codes over the AWGN channel with QPSK signaling. Unless otherwise specified, all the simulations are performed using the belief propagation (BP) algorithm with a maximum iteration number of 50. The BER and BLER performances for different code rates are plotted in Figure 6 together with the corresponding Shannon limits. The iterative decoding thresholds achieved by the proposed RC GC-LDPC codes are summarized in Table 2. It can be seen that the gaps between the iterative decoding thresholds and the Shannon limits are small, below 0.5 dB for every member listed in Table 2. Figure 7 depicts the BER and BLER performances of Examples 1, 2, and 3. We then see that the proposed RC GC-LDPC codes are better than the QC-GC-LDPC codes formed by the classical method in Section 2.

Figure 6: The BER and BLER performances for different code rates of $\mathcal{C}_{\mathrm{FET},0}$, $\mathcal{C}_{\mathrm{FET},2}$, $\mathcal{C}_{\mathrm{FET},4}$, $\mathcal{C}_{\mathrm{FET},8}$, and $\mathcal{C}_{\mathrm{FET},19}$ over the AWGN channel with QPSK signaling. (Axes: BER/BLER versus $E_b/N_0$ in dB; the Shannon limit of each code is also shown.)

Table 2: Parameters of a RC GC-LDPC code.

Member                         Code rate   Protograph threshold (dB)   Shannon limit (dB)   Gap to capacity (dB)
$\mathbf{H}_{\mathrm{FET},0}$    0.8667      3.1919                      2.7381               0.4538
$\mathbf{H}_{\mathrm{FET},1}$    0.8333      2.7292                      2.3167               0.4125
$\mathbf{H}_{\mathrm{FET},2}$    0.8025      2.4967                      2.0491               0.4476
$\mathbf{H}_{\mathrm{FET},4}$    0.7471      2.0461                      1.5853               0.4608
$\mathbf{H}_{\mathrm{FET},8}$    0.6566      1.4321                      0.9605               0.4716
$\mathbf{H}_{\mathrm{FET},19}$   0.4924      0.4879                      0.1435               0.3444

Figure 7: The BER and BLER performances of QC-GC-LDPC codes of $\mathcal{C}_1$, $\mathcal{C}_2$, $\mathcal{C}_{\mathrm{FET},0}$, and $\mathcal{C}_3$ over the AWGN channel with QPSK signaling. (Axes: BER/BLER versus $E_b/N_0$ in dB.)

5.2. Decoding Complexity. Figures 8 and 9 depict the average iteration number of $\mathcal{C}_1$ and $\mathcal{C}_{\mathrm{FET},0}$, respectively, with the one-phase, normal two-phase local/global, and modified two-phase local/global iterative schemes based on the BP decoding algorithm. For both $\mathcal{C}_1$ and $\mathcal{C}_{\mathrm{FET},0}$, the maximum iteration number of the one-phase iterative scheme is set to 50. The maximum iteration numbers in the local decoder and the global decoder of the normal two-phase local/global iterative scheme are 50 and 100, respectively. For the modified two-phase local/global iterative scheme, $I_{\mathrm{max1}}$, $I_{\mathrm{max2}}$, and the maximum iteration number in the global decoder are 30, 20, and 50, respectively, for $\mathcal{C}_1$, and 60, 40, and 100, respectively, for $\mathcal{C}_{\mathrm{FET},0}$. Based on Figures 8 and 9, we conclude that the normal two-phase local/global iterative scheme requires a significantly higher number of iterations than the modified two-phase local/global iterative scheme at the local phase and needs approximately the same number of iterations as the modified scheme at the global phase, especially at low and moderate SNRs. At high SNRs, the two-phase local/global iterative scheme requires a smaller number of iterations than the one-phase iterative scheme without a performance penalty.

Figure 8: Average iteration number of one-phase, normal two-phase local/global, and modified two-phase local/global iterative scheme based on BP decoding algorithm with $\mathcal{C}_1$. (Axes: average iteration number versus $E_b/N_0$ in dB.)

Figure 9: Average iteration number of one-phase, normal two-phase local/global, and modified two-phase local/global iterative scheme based on BP decoding algorithm with $\mathcal{C}_{\mathrm{FET},0}$. (Axes: average iteration number versus $E_b/N_0$ in dB.)

Figures 10 and 11 depict the decoding complexity of $\mathcal{C}_1$ and $\mathcal{C}_{\mathrm{FET},0}$, respectively, with the one-phase, normal two-phase local/global, and modified two-phase local/global iterative schemes based on the BP decoding algorithm. All operations associated with modulo-2 arithmetic have been neglected, as is conventionally done. The decoding complexity associated with the BP algorithm is evaluated based on the forward and backward recursions proposed in [7]. For $\mathcal{C}_1$, the total complexity associated with one iteration of BP consists of 877,338 real multiplications, 104,328 real divisions, and 282,366 real additions at the global phase. At the local phase, it consists of 237,258 real multiplications, 29,736 real divisions, and 79,002 real additions in each local decoder. For $\mathcal{C}_{\mathrm{FET},0}$, the total complexity associated with one iteration of BP consists of 559,872 real multiplications, 76,608 real divisions, and 198,720 real additions at the global phase. At the local phase, it consists of 129,984 real multiplications, 20,352 real divisions, and 20,352 real additions in each local decoder. Based on Figures 10 and 11, we conclude that the normal two-phase local/global iterative scheme requires significantly more operations than the modified two-phase local/global iterative scheme at low and moderate SNRs.

Figure 10: Decoding complexity of $\mathcal{C}_1$ with one-phase, normal two-phase local/global, and modified two-phase local/global iterative scheme based on BP decoding algorithm. (Axes: number of additions, multiplications, and divisions versus $E_b/N_0$ in dB.)

Figure 11: Decoding complexity of $\mathcal{C}_{\mathrm{FET},0}$ with one-phase, normal two-phase local/global, and modified two-phase local/global iterative scheme based on BP decoding algorithm. (Axes: number of additions, multiplications, and divisions versus $E_b/N_0$ in dB.)

5.3. Decoding Latency. The decoding delay in a data transmission system is defined in [31] as the delay incurred in receiving the coded bits before decoding takes place plus the ensuing decoder processing delay. In this paper, we assume that all schemes being compared have approximately the same decoding complexity and that the decoder processing time is negligible. For the one-phase iterative scheme, no information symbols are decoded until an entire block is received. Thus, the maximum decoding delay experienced by an information bit when an LDPC code is used with the one-phase iterative scheme is the arrival time of one incoming block. Suppose its number of iterations is $I_{\mathrm{total}}$. Then, the total decoding latency in received symbols and the total number of soft received values that must be stored in the decoder memory at any given time (decoding latency for short) can be represented by $I_{\mathrm{total}} N$, where $N$ is the total number of symbols. For the two-phase local/global iterative scheme, all information symbols are assigned to $t$ local decoders. Suppose the number of iterations in the global phase is $I_{\mathrm{global}}$, and the number of iterations of the $j$th local decoder is $I_j$. So, $I_{\mathrm{global}} N + \sum_{j=0}^{t-1} I_j N_{\mathrm{local}}$ represents the decoding latency of the two-phase local/global iterative scheme. By using $t$ local decoders in fully parallel local phase decoding, the maximum decoding delay experienced by an information bit is the arrival time of each incoming block in the local decoder. Then, the decoding latency reduces to $I_{\mathrm{global}} N + \max\{I_j N_{\mathrm{local}},\ 0 \le j \le t-1\}$. Note that $I_{\mathrm{global}}$ decreases as the SNR increases. At moderate and high SNRs, the decoding latency approaches $\max\{I_j N_{\mathrm{local}},\ 0 \le j \le t-1\}$ as $I_{\mathrm{global}} \to 0$. Since $N_{\mathrm{local}} \ll N$ and $I_{\mathrm{total}} \approx \max\{I_j,\ 0 \le j \le t-1\}$, we have $\max\{I_j N_{\mathrm{local}},\ 0 \le j \le t-1\} \ll I_{\mathrm{total}} N$. This means that the latency and memory requirements of the two-phase local/global iterative scheme are much lower than those of the one-phase iterative scheme when the channel environment is good.

6. Conclusion

In this paper, we introduced graph extension through a four-edge-type LDPC code and presented a family of RC GC-LDPC codes, which are constructed by combining an algebraic method with graph extension. It was shown that the proposed family of RC CN-GC-LDPC codes can provide more flexibility in code rate and guarantee the structural property of the algebraic construction. Numerical simulations over the AWGN channel confirm that the proposed RC GC-LDPC codes outperform their counterpart QC-GC-LDPC codes formed by the method in [1, 2] in terms of waterfall performance and exhibit an approximately uniform gap to capacity over a wide range of rates. Moreover, we presented a modified two-phase local/global iterative scheme which can reduce the unnecessary cost of local decoders at low and moderate SNRs.

Conflicts of Interest [12] M. Zhang, Z. Wang, Q. Huang, and S. Wang, “Time-Invariant Quasi-Cyclic Spatially Coupled LDPC Codes Based on Pack- Te authors declare that there are no conficts of interest ings,” IEEE Transactions on Communications,vol.64,no.12,pp. regarding the publication of this paper. 4936–4945, 2016. [13] D. Tse and P.Viswanath, Fundamentals of Wireless Communica- tion, Cambridge University Press, Cambridge, UK, 2005. Acknowledgments [14] J. Hagenauer, “Rate-Compatible Punctured Convolutional Tis work was supported in part by the National Natural Codes (RCPC Codes) and their Applications,” IEEE Transac- tions on Communications, vol. 36, no. 4, pp. 389–400, 1988. Science Foundation of China under Grants 61771364 and 61701368, the Joint Funds of the National Natural Science [15] J. Ha, J. Kim, and S. W. McLaughlin, “Rate-compatible punc- Foundation of China under Grant U1504601, and Youth turing of low-density parity-check codes,” Institute of Electrical and Electronics Engineers Transactions on Information Teory, Foundation of the Henan University of Science and Technol- vol.50,no.11,pp.2824–2836,2004. ogy (2014QN030). [16] G. Yue, X. Wang, and M. Madihian, “Design of rate-compatible irregular repeat accumulate codes,” IEEE Transactions on Com- References munications,vol.55,no.6,pp.1153–1163,2007. [17] N. Jacobsen and R. Soni, “Design of rate-compatible irregular [1]J.Li,S.Lin,K.Abdel-Ghafar,W.E.Ryan,andD.J.Costello, LDPC codes based on edge growth and parity splitting,”in Pro- “Globally coupled LDPC codes,” in Proceedings of the 2016 ceedings of the 2007 IEEE 66th Vehicular Technology Conference, Information Teory and Applications Workshop, ITA 2016,La VTC 2007-Fall, pp. 1052–1056, USA, October 2007. Jolla, CA, USA, February 2016. [18]C.-H.HsuandA.Anastasopoulos,“CapacityachievingLDPC [2] J. Li, S. Lin, K. Abdel-Ghafar, W. E. Ryan, and D. J. 
Costello Jr., codes through puncturing,” Institute of Electrical and Electronics LDPC Code Designs, Constructions, and Unifcation, Cambridge Engineers Transactions on Information Teory,vol.54,no.10,pp. University Press, 2017. 4698–4706, 2008. [19]T.V.Nguyen,A.Nosratinia,andD.Divsalar,“Tedesignof [3] J. Li, K. Liu, S. Lin, and K. Abdel-Ghafar, “Reed-Solomon based rate-compatible protograph LDPC codes,” IEEE Transactions on globally coupled quasi-cyclic LDPC codes,”in Proceedings of the Communications,vol.60,no.10,pp.2841–2850,2012. 2017 Information Teory and Applications Workshop, ITA 2017, San Diego, CA, USA, February 2017. [20] X. Mu, C. Shen, and B. Bai, “ACombined Algebraic-and Graph- BasedMethodforConstructingStructuredRC-LDPCCodes,” [4] J. Li, K. Liu, S. Lin, and K. Abdel-Ghafar, “Reed-solomon based IEEE Communications Letters,vol.20,no.7,pp.1273–1276,2016. nonbinary globally coupled LDPC codes: Correction of random [21] Document 3GPP R1-1613710 3GPP TSG RAN WG1 Meeting 87, errors and bursts of erasures,” in Proceedings of the 2017 IEEE 3GPP, Nov. 2016, http://www.3gpp.org/fp/tsg ran/WG1 RL1/ International Symposium on Information Teory, ISIT 2017,pp. TSGR1 87/Docs/R1-1613710.zip. 381–385, Aachen, Germany, June 2017. [22]Z.Si,R.Tobaben,andM.Skoglund,“Rate-compatibleLDPC [5] R. G. Gallager, “Low-Density Parity-Check Codes,” IRE Trans- convolutional codes achieving the capacity of the BEC,” Institute actions on Information Teory,vol.8,no.1,pp.21–28,1962. of Electrical and Electronics Engineers Transactions on Informa- [6] D. J. C. MacKay and R. M. Neal, “Near Shannon limit per- tion Teory,vol.58,no.6,pp.4021–4029,2012. formance of low density parity check codes,” IEEE Electronics [23] W.Hou, S. Lu, and J. Cheng, “Rate-compatible spatially coupled Letters,vol.32,no.18,pp.1645-1646,1996. LDPC codes via repeat-accumulation extension,”in Proceedings [7] D. J. 
MacKay, “Good error-correcting codes based on very of the 2014 8th International Symposium on Turbo Codes and sparse matrices,” Institute of Electrical and Electronics Engineers Iterative Information Processing, ISTC 2014, pp. 87–91, Bremen, Transactions on Information Teory,vol.45,no.2,pp.399–431, Germany, August 2014. 1999. [24]H.Zhou,D.G.M.Mitchell,N.Goertz,andD.J.Costello, “Robust rate-compatible punctured LDPC convolutional [8] S. Kudekar, T. J. Richardson, and R. L. Urbanke, “Tresh- codes,” IEEE Transactions on Communications,vol.61,no.11, old saturation via spatial coupling: why convolutional LDPC pp. 4428–4439, 2013. ensembles perform so well over the BEC,” Institute of Electrical [25] W. Nitzold, G. P. Fettweis, and M. Lentmaier, “Rate-compatible and Electronics Engineers Transactions on Information Teory, spatially-coupled LDPC code ensembles with nearly-regular vol. 57, no. 2, pp. 803–834, 2011. degree distributions,” in Proceedings of the IEEE International [9]D.J.CostelloJr.,L.Dolecek,T.E.Fuja,J.Kliewer,D.G. Symposium on Information Teory, ISIT 2015,pp.41–45,Hong M. Mitchell, and R. Smarandache, “Spatially coupled sparse Kong, June 2015. codes on graphs: Teory and practice,” IEEE Communications [26] S. Lin and D. J. Costello Jr., Error Control Coding: Fundamentals Magazine,vol.52,no.7,pp.168–176,2014. and Applications,Prentice-Hall,UpperSaddleRiver,NJ,USA, [10]D.G.Mitchell,M.Lentmaier,andJ.Costello,“Spatiallycou- 2nd edition, 2004. pled LDPC codes constructed from protographs,” Institute of [27] W. Ryan and S. Lin, ChannelCodes:ClassicalandModern, Electrical and Electronics Engineers Transactions on Information Cambridge University Press, New York, NY, USA, 2009. Teory,vol.61,no.9,pp.4866–4889,2015. [28] Document 3GPP R1-1711982 3GPP TSG RAN WG1 Meeting [11] K. Liu, M. El-Khamy, and J. 
Lee, “Finite-length algebraic AH NR2, 3GPP, June 2017, http://www.3gpp.org/fp/tsg ran/ spatially-coupled quasi-cyclic LDPC codes,” IEEE Journal on WG1 RL1/TSGR1 AH/NR AH 1706/Docs/R1-1711982.zip. Selected Areas in Communications,vol.34,no.2,pp.329–344, [29] H.Xu,D.Feng,R.Luo,andB.Bai,“Constructionofquasi-cyclic 2016. LDPC codes via masking with successive cycle elimination,” 14 Wireless Communications and Mobile Computing

Hindawi
Wireless Communications and Mobile Computing
Volume 2018, Article ID 5264724, 9 pages
https://doi.org/10.1155/2018/5264724

Research Article

Construction of Quasi-Cyclic LDPC Codes Based on Fundamental Theorem of Arithmetic

Hai Zhu,1 Liqun Pu,2 Hengzhou Xu,1 and Bo Zhang1

1School of Network Engineering, Zhoukou Normal University, Zhoukou, China
2School of Mathematics and Statistics, Zhengzhou University, Zhengzhou, China

Correspondence should be addressed to Hai Zhu; zhu [email protected] and Hengzhou Xu; [email protected]

Received 23 November 2017; Revised 5 February 2018; Accepted 7 March 2018; Published 15 April 2018

Academic Editor: Qin Huang

Copyright © 2018 Hai Zhu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Quasi-cyclic (QC) LDPC codes play an important role in 5G communications and have been chosen as the standard codes for the 5G enhanced mobile broadband (eMBB) data channel. In this paper, we study the construction of QC LDPC codes based on an arbitrary given expansion factor (or lifting degree). First, we analyze the cycle structure of QC LDPC codes and give the necessary and sufficient condition for the existence of short cycles. Based on the fundamental theorem of arithmetic in number theory, we divide the integer factorization into three cases and present three classes of QC LDPC codes accordingly. Furthermore, a general construction method of QC LDPC codes with girth of at least 6 is proposed. Numerical results show that the constructed QC LDPC codes perform well over the AWGN channel when decoded with iterative algorithms.

1. Introduction

Low-density parity-check (LDPC) codes [1] are a class of modern channel codes. Because they approach the Shannon capacity and admit iterative decoding algorithms of low complexity, LDPC codes have attracted great interest from both industry and academia. For various specific communication systems [2–4], LDPC codes have been well designed and chosen as the standard codes. As an important scenario of 5G communications, the enhanced mobile broadband (eMBB) data channel has adopted the LDPC coding scheme [5], and the LDPC codes were recently determined after several rounds of discussions [6–12]. However, the other two scenarios of 5G communications, that is, ultrareliable and low latency communications (URLLC) and massive machine-type communication (mMTC), have no candidate channel coding at present. The promising coding techniques for 5G communication systems are turbo codes, binary/nonbinary LDPC codes, spatially coupled (SC) LDPC codes [13], block Markov superposition transmission (BMST) [14], and polar codes. Comparisons of their encoding/decoding complexity, performance, spectral efficiency, and robustness can be found in [15]. Recently, some low-complexity decoding algorithms for these modern channel codes have been proposed [16, 17]. These significant works can facilitate and accelerate the application of these modern coding techniques in 5G communications. According to the definition and description of URLLC and mMTC provided by ITU-R [18], these two scenarios require low latency and high reliability. That is, short data packet communication with no visible error floor down to a block error rate (BLER) of $10^{-5}$ should be considered. Research results [19] show that LDPC codes perform well in both the waterfall and the error-floor regions. Moreover, LDPC codes are robust [15, 20], so their good performance can also be obtained over various channels. Hence, LDPC coding remains strongly competitive for URLLC and mMTC applications.

LDPC codes can be divided into two major classes: (1) random-like codes constructed by computer search under efficient algorithms [21, 22] and (2) structured codes constructed from algebraic tools, combinatorial structures, and graphs, such as finite geometries [23], finite fields [24], balanced incomplete block designs (BIBDs) [20], resolvable group divisible designs (RGDDs) [25], and protographs [26, 27]. Research results show that well-designed algebraic-based LDPC codes have no error floor at bit error rates (BERs) down to $10^{-15}$ [28]. To facilitate implementation, LDPC codes usually have some special structure, such as a diagonal structure or a quasi-cyclic (QC) structure. In general, QC LDPC codes [29] have the advantages of low-complexity encoding and decoding [30, 31], easy hardware implementation [32], and good iterative performance [33], and they have therefore attracted comprehensive attention.

In order to support many data packets of various lengths in the eMBB scenario of 5G communications, the designed 5G LDPC codes are chosen as rate-compatible (RC) QC LDPC codes. Notice that the number of expansion factors (or lifting degrees) of 5G QC LDPC codes is small. On the other hand, some encoding algorithms [34] are only suitable for QC LDPC codes with certain expansion factors (or lifting degrees). Furthermore, the encoding and decoding of QC LDPC codes whose expansion factors (or lifting degrees) are powers of two can be easily implemented by linear shift registers. Hence, it is interesting to construct QC LDPC codes from an arbitrary given expansion factor (or lifting degree).

In this paper, we focus on the construction of QC LDPC codes from given expansion factors (or lifting degrees). We first introduce the fundamental theorem of arithmetic in number theory and divide the integer factorization into three categories. By analyzing the cycle structure of QC LDPC codes, we present three classes of QC LDPC codes based on three families of integers. Furthermore, a general construction of QC LDPC codes with girth of at least 6 based on the fundamental theorem of arithmetic is proposed. Finally, in order to show the good performance of our constructed QC LDPC codes, numerical simulation results are provided.

The rest of this paper is organized as follows. Section 2 introduces the fundamentals of number theory and the definitions, basic concepts, and cycle structure of QC LDPC codes. Section 3 presents three classes of QC LDPC codes and a general construction method. Numerical results are also provided in this section. Finally, Section 4 concludes this paper.

2. Preliminaries

2.1. Fundamentals of Number Theory

Theorem 1. Every composite number, which is greater than one, factors uniquely as a product of prime numbers.

This theorem is the well-known fundamental theorem of arithmetic in number theory, and a proof appears in Gauss's Disquisitiones Arithmeticae [36]. It is also called the unique factorization theorem. That is, every integer greater than one is either prime itself or the product of prime numbers, and this product is unique up to the order of the factors. For example, $1400 = 14 \times 100 = 2 \times 7 \times 2 \times 2 \times 5 \times 5 = 2^3 \times 5^2 \times 7$. The theorem is twofold: first, 1400 can be written as a product of primes, and second, no matter how this is done, there will always be three 2s, two 5s, one 7, and no other primes in this product. Hence, a given integer $P \ge 2$ can be represented by the unique product

$$P = q_1^{n_1} \times q_2^{n_2} \times \cdots \times q_m^{n_m}, \tag{1}$$

where $q_1, q_2, \ldots, q_m$ are prime numbers.

The following three lemmas and one theorem are useful for constructing QC LDPC codes with girth of at least 6.

Lemma 2. Let $q$ be a prime, and let $n_1$ and $n_2$ be two positive integers. If $a$ and $b$ are two integers with $1 \le a \le q^{n_1} - 1$ and $1 \le b \le q^{n_2} - 1$, then $ab \not\equiv 0 \pmod{q^{n_1+n_2}}$.

Proof. Since $a$ and $b$ are two integers with $1 \le a \le q^{n_1} - 1$ and $1 \le b \le q^{n_2} - 1$, then

$$1 = 1 \times 1 \le a \times b \le (q^{n_1} - 1)(q^{n_2} - 1) < q^{n_1+n_2}. \tag{2}$$

That is, $ab \not\equiv 0 \pmod{q^{n_1+n_2}}$.

Lemma 3. Let $q_1$ and $q_2$ be two different primes, and let $n_1$ and $n_2$ be two positive integers. If $a$ and $b$ are two integers with $1 \le a \le q_1^{n_1} - 1$ and $1 \le b \le q_2^{n_2} - 1$, then $ab \not\equiv 0 \pmod{q_1^{n_1} q_2^{n_2}}$.

Proof. Since $a$ and $b$ are two integers with $1 \le a \le q_1^{n_1} - 1$ and $1 \le b \le q_2^{n_2} - 1$, then

$$1 \le a \times b \le (q_1^{n_1} - 1)(q_2^{n_2} - 1) < q_1^{n_1} q_2^{n_2}. \tag{3}$$

That is, $ab \not\equiv 0 \pmod{q_1^{n_1} q_2^{n_2}}$.

Lemma 4. Let $P = q_1^{n_1} q_2^{n_2} \cdots q_m^{n_m}$ be a positive integer, where $q_1, q_2, \ldots, q_m$ are $m$ different primes and $n_1, n_2, \ldots, n_m$ are $m$ positive integers. If $a = q_s^{n_s}$ and $b = q_t^{n_t}$ with $1 \le s, t \le m$ and $s \ne t$, then $ab \not\equiv 0 \pmod{P}$.

Proof. Since $a = q_s^{n_s}$ and $b = q_t^{n_t}$ with $1 \le s, t \le m$ and $s \ne t$, then

$$1 \le a \times b \le q_s^{n_s} q_t^{n_t} < q_1^{n_1} q_2^{n_2} \cdots q_m^{n_m} = P. \tag{4}$$

That is, $ab \not\equiv 0 \pmod{P}$.

Theorem 5. Let $P$ be a positive integer. If $a$ and $b$ are two positive integers and $ab < P$, then $ab \not\equiv 0 \pmod{P}$.

Proof. Since $a$ and $b$ are two positive integers and $ab < P$, then

$$1 \le ab < P. \tag{5}$$

That is, $ab \not\equiv 0 \pmod{P}$.

2.2. QC LDPC Codes and Their Associated Tanner Graphs. A $(J, L)$-regular quasi-cyclic (QC) LDPC code [29] of length $LP$ can be completely specified by the null space of the following matrix over GF(2):

$$\mathbf{H} = \begin{bmatrix} \mathbf{I}(p_{0,0}) & \mathbf{I}(p_{0,1}) & \cdots & \mathbf{I}(p_{0,L-1}) \\ \mathbf{I}(p_{1,0}) & \mathbf{I}(p_{1,1}) & \cdots & \mathbf{I}(p_{1,L-1}) \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{I}(p_{J-1,0}) & \mathbf{I}(p_{J-1,1}) & \cdots & \mathbf{I}(p_{J-1,L-1}) \end{bmatrix}, \tag{6}$$

where, for $0 \le j \le J-1$ and $0 \le k \le L-1$, $\mathbf{I}(p_{j,k})$ is a $P \times P$ circulant permutation matrix (CPM) with a one at column $(r + p_{j,k}) \pmod{P}$ for row $r$, $0 \le r \le P-1$, and zero elsewhere.
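The definition in (6) can be illustrated with a minimal Python sketch. It builds each CPM with its single one in column (r + shift) mod size of row r and stacks the blocks into a binary matrix. The function names `cpm` and `expand` and the small 2 x 3 shift array are illustrative assumptions, not taken from the paper.

```python
def cpm(shift, size):
    """size x size circulant permutation matrix: row r has its single 1
    in column (r + shift) mod size, as in the definition after (6)."""
    return [[1 if c == (r + shift) % size else 0 for c in range(size)]
            for r in range(size)]


def expand(shifts, size):
    """Assemble the binary matrix H = [I(shifts[j][k])] block by block."""
    H = []
    for row_of_shifts in shifts:
        blocks = [cpm(s, size) for s in row_of_shifts]
        for r in range(size):
            # Concatenate row r of every block in this block-row.
            H.append([bit for block in blocks for bit in block[r]])
    return H


# Illustrative 2 x 3 array of shifts with CPM size 5 (assumed values).
shifts = [[0, 0, 0], [0, 1, 2]]
H = expand(shifts, 5)
```

With these assumed values, H is a 10 x 15 matrix in which every row has weight 3 and every column has weight 2, matching the (J, L) = (2, 3) regular structure.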

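Figure 1: Tanner graph of H. The panels (a) and (b) show the matrix
$$\mathbf{H} = \begin{bmatrix} 1&1&1&0&0&0 \\ 1&0&0&1&1&0 \\ 0&1&0&1&0&1 \\ 0&0&1&0&1&1 \end{bmatrix}$$
and its Tanner graph (check nodes 0 to 3, variable nodes 0 to 5) with a 6-cycle marked.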

It is clear that $\mathbf{I}(0)$ represents the $P \times P$ identity matrix. Notice that the parameter $P$ is referred to as the expansion factor (or lifting degree) [37]. It can be easily observed that the positions of the nonzero elements in $\mathbf{H}$ are uniquely determined by the following matrix, called the permutation shift matrix or exponent matrix:

$$\mathbf{P} = \begin{bmatrix} p_{0,0} & p_{0,1} & \cdots & p_{0,L-1} \\ p_{1,0} & p_{1,1} & \cdots & p_{1,L-1} \\ \vdots & \vdots & \ddots & \vdots \\ p_{J-1,0} & p_{J-1,1} & \cdots & p_{J-1,L-1} \end{bmatrix}. \tag{7}$$

That is, there is a one-to-one correspondence between $\mathbf{P}$ and $\mathbf{H}$.

An LDPC code is commonly described by a bipartite graph known in graph theory as a Tanner graph [38]. The Tanner graph of $\mathbf{H}$, denoted by $\mathcal{G}(V, C)$, consists of a set $V$ of variable nodes (the $LP$ code symbols of a code word) and a set $C$ of check nodes (the $JP$ local check-sum constraints on the code symbols). An edge in $\mathcal{G}(V, C)$ connects the variable node $v$ to the check node $c$ if and only if the element at column $v$ and row $c$ of $\mathbf{H}$ is nonzero. A cycle is formed by a sequence of vertices (or edges) in $\mathcal{G}(V, C)$ which starts and ends at the same vertex (or edge) and contains every other vertex (or edge) at most once. A cycle of length $l$ is called an $l$-cycle for short, and the length of the shortest cycle is called the girth of $\mathcal{G}(V, C)$ (or of the LDPC code). As an example, Figure 1 shows the Tanner graph of $\mathbf{H}$ and an associated 6-cycle.

In graph theory, the biadjacency matrix $\mathbf{A} = [a_{xy}]$ of a bipartite graph $\mathcal{G}(X, Y)$ can be constructed as follows. The rows of $\mathbf{A}$ are labeled by the $|X|$ vertices in $X$ and the columns are labeled by the $|Y|$ vertices in $Y$. The element $a_{xy}$ at the row labeled by the vertex $x \in X$ and the column labeled by the vertex $y \in Y$ is 1 if and only if there exists an edge between the vertices $x$ and $y$, and 0 otherwise. In fact, for an LDPC code given by the null space of $\mathbf{H}$, $\mathbf{H}$ is the biadjacency matrix of its Tanner graph $\mathcal{G}(V, C)$.

Moreover, the isomorphism theory of QC LDPC codes was proposed in [39–41] based on the isomorphism of graphs in graph theory. According to the isomorphism of QC LDPC codes, the parity-check matrix in (6) can be simplified as the following matrix:

$$\mathbf{H} = \begin{bmatrix} \mathbf{I}(0) & \mathbf{I}(0) & \cdots & \mathbf{I}(0) \\ \mathbf{I}(0) & \mathbf{I}(p_{1,1}) & \cdots & \mathbf{I}(p_{1,L-1}) \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{I}(0) & \mathbf{I}(p_{J-1,1}) & \cdots & \mathbf{I}(p_{J-1,L-1}) \end{bmatrix}. \tag{8}$$

That is, $p_{j,0} = p_{0,k} = 0$ for $0 \le j \le J-1$, $0 \le k \le L-1$. Equivalently, its exponent matrix is

$$\mathbf{P} = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & p_{1,1} & \cdots & p_{1,L-1} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & p_{J-1,1} & \cdots & p_{J-1,L-1} \end{bmatrix}. \tag{9}$$

That is why the elements in the first row and first column of the exponent matrix $\mathbf{P}$ are usually set to 0 in the literature [29, 42]. Hence, we only consider such $\mathbf{H}$ and $\mathbf{P}$ in the following discussions.

2.3. Cycle Structure of QC LDPC Codes. Consider a QC LDPC code $\mathcal{C}$ given by the null space of $\mathbf{H}$ in (8). It can be seen from [29] that a cycle in the Tanner graph of $\mathcal{C}$ is associated with a family of ordered CPMs in $\mathbf{H}$. As shown in [29], a $2l$-cycle in the Tanner graph of the code $\mathcal{C}$ (or $\mathbf{H}$) is represented by an ordered sequence of CPMs

$$\mathbf{I}(p_{j_0,k_0}), \mathbf{I}(p_{j_1,k_0}), \mathbf{I}(p_{j_1,k_1}), \mathbf{I}(p_{j_2,k_1}), \mathbf{I}(p_{j_2,k_2}), \ldots, \mathbf{I}(p_{j_{l-1},k_{l-1}}), \mathbf{I}(p_{j_0,k_{l-1}}), \mathbf{I}(p_{j_0,k_0}), \tag{10}$$

where $j_l = j_0$, $k_l = k_0$, $0 \le j_i \le J-1$, $j_{i-1} \ne j_i$, $0 \le k_i \le L-1$, and $k_{i-1} \ne k_i$ for $1 \le i \le l$. The above sequence can be simplified as

$$\mathbf{I}(p_{j_0,k_0}), \mathbf{I}(p_{j_1,k_1}), \mathbf{I}(p_{j_2,k_2}), \ldots, \mathbf{I}(p_{j_{l-1},k_{l-1}}). \tag{11}$$

It can be seen that such a $2l$-cycle corresponds to the elements $p_{j_0,k_0}, p_{j_1,k_1}, p_{j_2,k_2}, \ldots, p_{j_{l-1},k_{l-1}}$ in the exponent matrix $\mathbf{P}$. Furthermore, short cycles of QC LDPC codes can be determined by the elements of $\mathbf{P}$ [39, 40].
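The Tanner-graph notions of cycle and girth from Section 2.2 can be checked numerically on the small matrix of Figure 1. The sketch below is an illustration only: it uses a generic breadth-first search from every node (a standard graph technique, not a method from the paper), and the function name `tanner_girth` is an assumption.

```python
from collections import deque


def tanner_girth(H):
    """Length of the shortest cycle in the Tanner graph of H, or None if acyclic.
    Variable nodes are numbered 0..n-1 and check nodes n..n+m-1."""
    m, n = len(H), len(H[0])
    adj = [[] for _ in range(n + m)]
    for c in range(m):
        for v in range(n):
            if H[c][v]:
                adj[v].append(n + c)
                adj[n + c].append(v)
    best = None
    for start in range(n + m):
        dist = {start: 0}
        parent = {start: None}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif parent[u] != w:
                    # A non-tree edge closes a cycle through the BFS tree.
                    length = dist[u] + dist[w] + 1
                    if best is None or length < best:
                        best = length
    return best


# The 4 x 6 matrix shown in Figure 1.
H_fig1 = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]
girth = tanner_girth(H_fig1)
```

Any two rows of this matrix share exactly one column, so the graph has no 4-cycle, and the search finds a shortest cycle of length 6, in line with the 6-cycle marked in Figure 1.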

Let $g$ be the girth of the code $\mathcal{C}$. It can be seen from [43] that, for $g \le 2l \le 2g - 2$, the necessary and sufficient condition for the existence of a $2l$-cycle in the Tanner graph of the code $\mathcal{C}$ (or $\mathbf{H}$) can be generalized as follows:

$$\sum_{i=0}^{l-1} \left( p_{j_i,k_i} - p_{j_{i+1},k_i} \right) = 0 \pmod{P} \tag{12}$$

with $j_0 = j_l$, $k_0 = k_l$, $j_i \ne j_{i+1}$, and $k_i \ne k_{i+1}$. Note that, for $2l \ge 2g$, (12) is a necessary condition for the existence of a $2l$-cycle in the Tanner graph of the code $\mathcal{C}$ (or $\mathbf{H}$) but not a sufficient one. Here we give a counterexample. Consider a $2l$-cycle ($l \ge g$) whose cycle structure is given in Figure 2. Clearly, (12) is satisfied. Let $p_{j_i,k_i} = p_{j_{i+l},k_{i+l}}$ for $0 \le i \le l-1$, and $j_0 = j_{2l}$, $k_0 = k_{2l}$, $j_i \ne j_{i+1}$, $k_i \ne k_{i+1}$. According to (12), we have

$$\sum_{i=0}^{2l-1} \left( p_{j_i,k_i} - p_{j_{i+1},k_i} \right) = 2 \sum_{i=0}^{l-1} \left( p_{j_i,k_i} - p_{j_{i+1},k_i} \right) = 0 \pmod{P}, \tag{13}$$

where $j_0 = j_l = j_{2l}$, $k_0 = k_l = k_{2l}$, $j_i \ne j_{i+1}$, and $k_i \ne k_{i+1}$ for $0 \le i \le 2l-1$. That is, the ordered elements $p_{j_0,k_0}, p_{j_1,k_0}, p_{j_1,k_1}, \ldots, p_{j_0,k_{2l-1}}$ (i.e., $p_{j_0,k_0}, p_{j_1,k_0}, p_{j_1,k_1}, \ldots, p_{j_0,k_{l-1}}, p_{j_0,k_0}, p_{j_1,k_0}, p_{j_1,k_1}, \ldots, p_{j_0,k_{l-1}}$) make (12) hold, but they do not determine a $4l$-cycle. A visual representation is depicted in Figure 2. Therefore, (4) in [29] and (3) in [39] are not applicable to cycles with lengths larger than $2g - 2$.

Figure 2: The structure of a $2l$-cycle and the nonexistence of the $4l$-cycle.

3. Construction of Quasi-Cyclic LDPC Codes with Girth of at Least 6

As discussed above, there exists a one-to-one correspondence between the exponent matrix $\mathbf{P}$ and the parity-check matrix $\mathbf{H}$ of a QC LDPC code. Hence, the construction of a QC LDPC code is equivalent to the design of its exponent matrix $\mathbf{P}$. In this section, we present three classes of QC LDPC codes with girth of at least 6 and then give a general construction of QC LDPC codes with girth of at least 6 based on an arbitrary integer.

First, we design the exponent matrix $\mathbf{P}$ in (9) as follows:

$$\mathbf{P} = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & p_{1,1} & \cdots & p_{1,L-1} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & p_{J-1,1} & \cdots & p_{J-1,L-1} \end{bmatrix}, \tag{14}$$

where $p_{j,k} = j \times k \pmod{P}$ for $1 \le j \le J-1$, $1 \le k \le L-1$. Second, we replace the 0s and $p_{j,k}$ in the designed exponent matrix $\mathbf{P}$ with the CPMs $\mathbf{I}(0)$ and $\mathbf{I}(p_{j,k})$ of the same size $P \times P$, respectively, and obtain a $J \times L$ array $\mathbf{H}$ of $P \times P$ CPMs. This array is a $JP \times LP$ matrix over GF(2) with column and row weights $J$ and $L$, respectively. The null space of this matrix gives a $(J, L)$-regular QC LDPC code.

Remark 6. As shown in [44], girth and short cycles play an important role in the design of LDPC codes. If the $(J, L)$-regular QC LDPC code constructed above does not have good iterative performance, we can replace some CPMs in the array $\mathbf{H}$ with zero matrices (ZMs) of the same size to reduce the number of short cycles and possibly enlarge the girth. This replacement is called masking. On the other hand, if the desired QC LDPC codes are shorter than $LP$ or require much higher code rates, then we can take a $J' \times L'$ subarray of the designed array $\mathbf{H}$, where $J' \le J$ and $L' \le L$. This subarray can be obtained in two steps: (1) choose the first $J'$ row-CPMs of the designed array $\mathbf{H}$; (2) select $L'$ column-CPMs from the $L$ column-CPMs of the designed array $\mathbf{H}$. In this paper, both the masking technique and the selection method in [43] are employed to construct (or further optimize) QC LDPC codes.

3.1. Three Classes of QC LDPC Codes with Girth of at Least 6. Based on (12), we can see that the Tanner graph of the designed array $\mathbf{H}$ contains a 4-cycle if and only if the following equation is satisfied:

$$\sum_{i=0}^{1} \left( p_{j_i,k_i} - p_{j_{i+1},k_i} \right) = (j_0 - j_1)(k_0 - k_1) = 0 \pmod{P}, \tag{15}$$

where $j_0 \ne j_1$ and $k_0 \ne k_1$. It can be observed that the existence of 4-cycles in the Tanner graph of the designed array $\mathbf{H}$ is related to $P$. According to the fundamental theorem of arithmetic, the values of $P$ can be divided into three categories, and three classes of QC LDPC codes with girth of at least 6 are proposed. In all numerical simulations in the following examples, binary phase shift keying (BPSK), the additive white Gaussian noise (AWGN) channel, and the sum-product algorithm (SPA) are assumed.

3.1.1. The Case of $P = q^n$. Let $P = q^n$, where $q$ is a prime and $n$ is a positive integer, and let $n = n_1 + n_2$, where $n_1, n_2$ are two positive integers and $n_1 \le n_2$. Consider $J = q^{n_1}$ and $L = q^{n_2}$. Since $1 \le j_0, j_1 \le q^{n_1} - 1$, $1 \le k_0, k_1 \le q^{n_2} - 1$, $j_0 \ne j_1$, and $k_0 \ne k_1$, then $1 \le j_0 - j_1 \le q^{n_1} - 1$ and $1 \le k_0 - k_1 \le q^{n_2} - 1$, where the calculation is taken modulo $q^{n_1}$ and modulo $q^{n_2}$, respectively. Hence, (15) is not satisfied according to Lemma 2. That is, the Tanner graph of the designed array $\mathbf{H}$ has no 4-cycles, and the constructed QC LDPC codes have girth of at least 6.

Example 7. Consider $P = 256 = 2^8$. Let $J = 2^2$ and $L = 2^6$. According to (14), we can obtain the exponent matrix $\mathbf{P}$ of size $4 \times 64$. By employing the method in [43], we select the first 4 rows and the 2nd, 16th, 19th, 35th, 50th, 55th, 62nd, and 63rd columns of $\mathbf{P}$ and construct a $4 \times 8$ array $\mathbf{H}$ of $256 \times 256$ CPMs by replacing the elements of the selected submatrix with the corresponding CPMs. By using the matrix

$$\mathbf{M}_1 = \begin{bmatrix} 1&1&0&1&1&1&1&0 \\ 1&0&1&1&0&1&1&1 \\ 1&1&1&0&1&0&1&1 \\ 0&1&1&1&1&1&0&1 \end{bmatrix} \tag{16}$$

to mask $\mathbf{H}$, a $1024 \times 2048$ matrix with column and row weights 3 and 6, respectively, is obtained. This matrix gives a $(3, 6)$-regular $(2048, 1024)$ QC LDPC code. The bit error rates (BERs) of this code decoded by the SPA (5, 10, 20, and 50 iterations) are shown in Figure 3. Also shown in Figure 3 is the performance of the $(3, 6)$-regular $(2048, 1024)$ algebraic QC LDPC code constructed based on finite fields [35]. This comparable code is constructed from the prime field GF(257), and the CPM size of its parity-check matrix is therefore $256 \times 256$.

Figure 3: The bit error performance of the proposed $(3, 6)$-regular $(2048, 1024)$ QC LDPC code and the comparable $(3, 6)$-regular $(2048, 1024)$ algebraic QC LDPC code [19] in Example 7. The decoding algorithm is the SPA with 5, 10, 20, and 50 iterations. (Plot: BER versus $E_b/N_0$ in dB, with one curve per code and iteration count.)
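The construction in (14), the 4-cycle test in (15), and the selection and masking steps of Example 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, masked positions are represented by `None`, and the column indices and mask are copied from Example 7.

```python
def exponent_matrix(J, L, P):
    """Exponent matrix of (14): entry (j, k) is j*k mod P,
    so the first row and first column are all zero."""
    return [[(j * k) % P for k in range(L)] for j in range(J)]


def has_four_cycle(E, P):
    """Test condition (15) on every 2 x 2 sub-block of E.
    Entries equal to None stand for masked positions (zero matrices)."""
    J, L = len(E), len(E[0])
    for j0 in range(J):
        for j1 in range(j0 + 1, J):
            for k0 in range(L):
                for k1 in range(k0 + 1, L):
                    block = (E[j0][k0], E[j1][k0], E[j1][k1], E[j0][k1])
                    if None in block:
                        continue
                    a, b, c, d = block
                    if (a - b + c - d) % P == 0:
                        return True
    return False


def select_and_mask(E, columns, mask):
    """Keep the given 0-based columns, then mask entries where mask is 0."""
    return [[E[j][k] if mask[j][i] else None for i, k in enumerate(columns)]
            for j in range(len(mask))]


P = 256
full = exponent_matrix(4, 64, P)
# Columns listed in Example 7 (1-based in the text, 0-based here).
columns = [c - 1 for c in (2, 16, 19, 35, 50, 55, 62, 63)]
M1 = [
    [1, 1, 0, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 0, 1, 1],
    [0, 1, 1, 1, 1, 1, 0, 1],
]
masked = select_and_mask(full, columns, M1)
no_4_cycles = not has_four_cycle(full, P) and not has_four_cycle(masked, P)
```

For contrast, `exponent_matrix(5, 65, 256)` violates $JL \le P$, and rows 0 and 4 with columns 0 and 64 then satisfy (15), so `has_four_cycle` reports a 4-cycle for it.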

Notice that the exponent matrix and masking matrix of this comparable code are

$$\mathbf{P}_1 = \begin{bmatrix} 179 & 75 & 202 & 52 & 116 & 24 & 15 & 176 \\ 23 & 179 & 75 & 202 & 52 & 116 & 24 & 15 \\ 25 & 23 & 179 & 75 & 202 & 52 & 116 & 24 \\ 162 & 25 & 23 & 179 & 75 & 202 & 52 & 116 \end{bmatrix}, \quad \mathbf{M}_{4 \times 8} = \begin{bmatrix} 1&0&1&0&1&1&1&1 \\ 0&1&0&1&1&1&1&1 \\ 1&1&1&1&1&0&1&0 \\ 1&1&1&1&0&1&0&1 \end{bmatrix}, \tag{17}$$

respectively. It can be observed that these two codes have similar performance when decoded using the SPA with various numbers of iterations. It is well known that algebraic-based LDPC codes have fast decoding convergence [19, 45, 46]. The SPA decoding of the proposed LDPC code also converges fast, as shown in Figure 3. We can see that the performance gap between 20 and 50 iterations is less than 0.15 dB at a BER of $10^{-6}$ and less than 0.25 dB at a BER of $10^{-7}$; hence this code achieves a fast rate of decoding convergence.

3.1.2. The Case of $P = q_1^{n_1} q_2^{n_2}$. Let $P = q_1^{n_1} q_2^{n_2}$, where $q_1, q_2$ are two different prime numbers and $n_1, n_2$ are two positive integers. Assume $J = \min\{q_1^{n_1}, q_2^{n_2}\}$ and $L = \max\{q_1^{n_1}, q_2^{n_2}\}$. Without loss of generality, $q_1^{n_1} < q_2^{n_2}$ is assumed. Since $1 \le j_0, j_1 \le q_1^{n_1} - 1$, $1 \le k_0, k_1 \le q_2^{n_2} - 1$, $j_0 \ne j_1$, and $k_0 \ne k_1$, then $1 \le j_0 - j_1 \le q_1^{n_1} - 1$ and $1 \le k_0 - k_1 \le q_2^{n_2} - 1$, where the calculation is taken modulo $q_1^{n_1}$ and modulo $q_2^{n_2}$, respectively. Hence, (15) is not satisfied according to Lemma 3. That is, the Tanner graph of the designed array $\mathbf{H}$ does not contain 4-cycles, and the girth of the constructed QC LDPC codes is at least 6.

Example 8. Consider $P = 72 = 2^3 \times 3^2 = 4 \times 18$. Since $12 < 18$, let $J = 2^2$ and $L = 18$. According to (14), we can obtain the exponent matrix $\mathbf{P}$ of size $4 \times 18$. By employing the method in [43], we select the first 4 rows and the 1st, 2nd, 3rd, 4th, 5th, 6th, 12th, 13th, 15th, 16th, 17th, and 18th columns of $\mathbf{P}$ and construct a $4 \times 12$ array $\mathbf{H}$ of $72 \times 72$ CPMs by replacing the elements of the selected submatrix with the corresponding CPMs. By using the matrix

$$\mathbf{M}_2 = \begin{bmatrix} 1&1&1&1&1&0&0&0&1&1&1&1 \\ 0&1&0&0&1&1&1&1&1&1&1&1 \\ 1&1&1&1&1&0&1&1&1&1&0&0 \\ 1&0&1&1&0&1&1&1&1&0&1&1 \end{bmatrix} \tag{18}$$

to mask $\mathbf{H}$, a $288 \times 864$ matrix with column and row weights 3 and 9, respectively, is obtained. This matrix gives a $(3, 9)$-regular $(864, 576)$ QC LDPC code. The bit/word error rates (BERs/WERs) of this code decoded by the SPA with 50 iterations are shown in Figure 4. Also shown in Figure 4 is the performance of the $(3, 9)$-regular $(864, 576)$ algebraic QC LDPC code constructed from the finite field GF(73) [35]. The exponent and masking matrices of this algebraic QC LDPC code are

$$\mathbf{P}_2 = \begin{bmatrix} 49 & 41 & 21 & 40 & 29 & 71 & 39 & 3 & 58 & 61 & 65 & 52 \\ 23 & 49 & 41 & 21 & 40 & 29 & 71 & 39 & 3 & 58 & 61 & 65 \\ 24 & 23 & 49 & 41 & 21 & 40 & 29 & 71 & 39 & 3 & 58 & 61 \\ 57 & 24 & 23 & 49 & 41 & 21 & 40 & 29 & 71 & 39 & 3 & 58 \end{bmatrix}, \quad \mathbf{M}_{4 \times 12} = \begin{bmatrix} 1&0&1&0&1&0&1&1&1&1&1&1 \\ 0&1&0&1&0&1&1&1&1&1&1&1 \\ 1&1&1&1&1&1&1&0&1&0&1&0 \\ 1&1&1&1&1&1&0&1&0&1&0&1 \end{bmatrix}, \tag{19}$$

respectively. Notice that the CPM size of this algebraic code is $72 \times 72$. It can be observed that these two codes also have similar performance. Moreover, the cycle distributions of these two codes are given in Table 1. We can see that although the proposed code has fewer shortest cycles than the algebraic QC LDPC code, it has many more cycles of length 8. That is why the proposed code does not perform better than the algebraic QC LDPC code in the high-SNR region.

Table 1: The cycle distributions of two (864, 576) QC LDPC codes in Example 8.

Code                  4-cycles  6-cycles  8-cycles  10-cycles  12-cycles
Proposed code         0         288       12852     110736     1514772
Algebraic code [35]   0         360       8316      109800     1402308

Figure 4: The error performance of the proposed $(3, 9)$-regular $(864, 576)$ QC LDPC code and the comparable $(3, 9)$-regular $(864, 576)$ algebraic QC LDPC code constructed based on finite field GF(73) [35] in Example 8. (Plot: BER/WER versus $E_b/N_0$ in dB for the algebraic and proposed codes.)

3.1.3. The Case of $P = q_1^{n_1} q_2^{n_2} \cdots q_m^{n_m}$. Let $P = q_1^{n_1} q_2^{n_2} \cdots q_m^{n_m}$, where $q_1, q_2, \ldots, q_m$ are $m$ different prime numbers and $n_1, n_2, \ldots, n_m$ are $m$ positive integers. Without loss of generality, we assume $q_s^{n_s} < q_t^{n_t}$, where $1 \le s, t \le m$ and $s \ne t$. Consider $J = q_s^{n_s}$ and $L = q_t^{n_t}$. Since $1 \le j_0, j_1 \le q_s^{n_s} - 1$, $1 \le k_0, k_1 \le q_t^{n_t} - 1$, $j_0 \ne j_1$, and $k_0 \ne k_1$, then $1 \le j_0 - j_1 \le q_s^{n_s} - 1$ and $1 \le k_0 - k_1 \le q_t^{n_t} - 1$, where the calculation is taken modulo $q_s^{n_s}$ and modulo $q_t^{n_t}$, respectively. Hence, (15) is not satisfied according to Lemma 4. That is, the Tanner graph of the designed array $\mathbf{H}$ does not have 4-cycles, and the constructed QC LDPC codes have girth of at least 6.

Example 9. Consider $P = 105 = 3 \times 5 \times 7$. Since $10 < 21$, let $J = 5$ and $L = 21$. According to (14), we can obtain the exponent matrix $\mathbf{P}$ of size $5 \times 21$. By employing the method in [43], we select the first 5 rows and the 1st, 2nd, 3rd, 6th, 7th, 13th, 16th, 17th, 20th, and 21st columns of $\mathbf{P}$ and construct a $5 \times 10$ array $\mathbf{H}$ of $105 \times 105$ CPMs by replacing the elements of the selected submatrix with the corresponding CPMs. By using the matrix

$$\mathbf{M}_3 = \begin{bmatrix} 1&0&0&1&1&1&0&0&1&1 \\ 1&1&1&0&0&1&0&1&0&1 \\ 0&0&1&1&1&0&1&1&0&1 \\ 0&1&1&1&0&1&1&0&1&0 \\ 1&1&0&0&1&0&1&1&1&0 \end{bmatrix} \tag{20}$$

to mask $\mathbf{H}$, a $525 \times 1050$ matrix with column and row weights 3 and 6, respectively, is obtained. This matrix gives a $(3, 6)$-regular $(1050, 525)$ QC LDPC code of girth 8. For comparison, we also present the simulation of the $(3, 6)$-regular $(1050, 525)$ LDPC code constructed with the progressive edge-growth (PEG) algorithm [22]. The bit/word error rates (BERs/WERs) of these two codes decoded with the SPA (50 iterations) are shown in Figure 5. It can be seen that although these two codes have similar performance in the waterfall region, the proposed code performs better than the PEG-LDPC code in the high-SNR region.

Figure 5: The error performance of the proposed $(3, 6)$-regular $(1050, 525)$ QC LDPC code and the comparable $(3, 6)$-regular $(1050, 525)$ QC LDPC code constructed based on the PEG algorithm [22] in Example 9. (Plot: BER/WER versus $E_b/N_0$ in dB for the PEG and proposed codes.)

3.2. A General Construction of QC LDPC Codes from an Arbitrary Positive Integer. For a given positive integer $P$, we in general find two positive integers $J$ and $L$ such that $JL \le P$ and $J, L \ge 3$. Assume $J \le L$. Consider $J = a$ and $L = b$, where $1 \le a, b \le P$ and $a \ne b$. Since $1 \le j_0, j_1 \le J - 1$, $1 \le k_0, k_1 \le L - 1$, $j_0 \ne j_1$, and $k_0 \ne k_1$, then $1 \le j_0 - j_1 \le J - 1$ and $1 \le k_0 - k_1 \le L - 1$, where the calculation is taken modulo $J$ and modulo $L$, respectively. Hence, (15) is not satisfied according to Theorem 5. That is, the Tanner graph of the designed array $\mathbf{H}$ does not have 4-cycles, and the constructed QC LDPC codes have girth of at least 6.

Example 10. Consider $P = 127 > 4 \times 31$, and let $J = 4$, $L = 31$. According to (14), we can obtain the exponent matrix $\mathbf{P}$ of size $4 \times 31$. By employing the method in [43], we select the first 4 rows and the 1st, 2nd, 6th, 7th, 22nd, 26th, 29th, and 31st columns of $\mathbf{P}$ and construct a $4 \times 8$ array $\mathbf{H}$ of $127 \times 127$ CPMs by replacing the elements of the selected submatrix with the corresponding CPMs. By using the method in [43], we design a masking matrix, that is,

$$\mathbf{M}_4 = \begin{bmatrix} 1&0&1&1&1&1&0&1 \\ 0&1&1&1&1&1&1&0 \\ 1&1&0&1&0&1&1&1 \\ 1&1&1&0&1&0&1&1 \end{bmatrix}; \tag{21}$$

to mask $\mathbf{H}$, a $508 \times 1016$ matrix with column and row weights 3 and 6, respectively, is obtained. This matrix gives a $(3, 6)$-regular $(1016, 508)$ QC LDPC code of girth 8. For comparison, we also construct a $(3, 6)$-regular $(1016, 508)$ QC LDPC code based on the partial geometry [28]. Note that the exponent matrix of this code is

$$\mathbf{P}_3 = \begin{bmatrix} 2 & 83 & 33 & 46 & 36 & 94 & 42 & 86 \\ 109 & 15 & 84 & 94 & 57 & 43 & 3 & 115 \\ 112 & 76 & 70 & 36 & 111 & 57 & 66 & 117 \\ 31 & 80 & 67 & 78 & 50 & 60 & 16 & 63 \end{bmatrix}, \tag{22}$$

and the masking matrix is also $\mathbf{M}_{4 \times 8}$ in Example 7. The bit/word error performance of these two codes decoded by the SPA with 50 iterations is shown in Figure 6. It can be seen that these two codes have similar performance. We can also observe from Figure 6 that, for the proposed QC LDPC code, there are no error floors in the BER curves down to BER $= 2.27 \times 10^{-7}$ and in the WER curves down to WER $= 3.5 \times 10^{-6}$.

Figure 6: The bit error performance of the proposed $(3, 6)$-regular $(1016, 508)$ QC LDPC code and the comparable $(3, 6)$-regular $(1016, 508)$ QC LDPC code constructed from partial geometry [28] in Example 10. (Plot: BER/WER versus $E_b/N_0$ in dB for the partial geometry and proposed codes.)

4. Conclusion

In this paper, based on the fundamental theorem of arithmetic, we presented a method for constructing QC LDPC codes with girth of at least 6 from an arbitrary integer. According to the integer factorization, we divided the integers into three categories and then constructed three classes of QC LDPC codes. Furthermore, a general construction of QC LDPC codes with girth of at least 6 was proposed. Numerical results show that the constructed QC LDPC codes have good performance over the AWGN channel and converge fast under iterative decoding. In other words, for an arbitrary integer $P$ ($\ge 6$), we can easily construct QC LDPC codes whose parity-check matrices consist of several CPMs and/or zero matrices of size $P \times P$, and the proposed method ensures that the resultant QC LDPC codes have girth of at least 6. Moreover, the proposed QC LDPC codes perform as well as the algebraic QC LDPC codes.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 61103143, the Joint Funds of the National Natural Science Foundation of China under Grant U1504601, the Key Scientific and Technological Project of Henan under Grants 162102310589, 172102310124, and 182102310867, the Key Scientific Research Projects of Henan Educational Committee under Grant 18B510022, and the School-Based Program of Zhoukou Normal University under Grant ZKNUB2201705.

References

[1] R. G. Gallager, “Low-Density Parity-Check Codes,” IRE Transactions on Information Theory, vol. 8, no. 1, pp. 21–28, 1962.

[2] IEEE Standard, “Air Interface for Fixed Broadband Wireless Access Systems,” IEEE Standard P802.16e/D1, 2005.
[3] European Telecommunications Standards Institute, Digital Video Broadcasting (DVB), European Telecommunications Standards Institute, Sophia Antipolis, France, 2009.
[4] CCSDS, “Short Block Length LDPC Codes for TC Synchronization and Channel Coding,” CCSDS 231.1-O-1, 2015.
[5] 3GPP, “Document 3GPP chairman’notes 3GPP TSG RAN WG1 meeting 87,” 2016, https://www.3gpp.org.
[6] 3GPP, “Document 3GPP chairman’notes 3GPP TSG RAN WG1 meeting AH NR1,” 2017, https://www.3gpp.org.
[7] 3GPP, “Document 3GPP chairman’notes 3GPP TSG RAN WG1 meeting 88,” 2017, https://www.3gpp.org.
[8] 3GPP, “Document 3GPP chairman’notes 3GPP TSG RAN WG1 meeting 88bis,” 2017, https://www.3gpp.org.
[9] 3GPP, “Document 3GPP chairman’notes 3GPP TSG RAN WG1 meeting 89,” 2017, https://www.3gpp.org.
[10] 3GPP, “Document 3GPP chairman’notes 3GPP TSG RAN WG1 meeting AH NR2,” 2017, https://www.3gpp.org.
[11] 3GPP, “Document 3GPP R1-1711982 3GPP TSG RAN WG1 meeting AH NR2,” 2017, https://www.3gpp.org.
[12] 3GPP, “Document 3GPP R1-1712254 3GPP TSG RAN WG1 meeting 90,” 2017, https://www.3gpp.org.
[13] M. Zhang, Z. Wang, Q. Huang, and S. Wang, “Time-Invariant Quasi-Cyclic Spatially Coupled LDPC Codes Based on Packings,” IEEE Transactions on Communications, vol. 64, no. 12, pp. 4936–4945, 2016.
[14] X. Ma, K. Huang, and B. Bai, “Systematic block Markov superposition transmission of repetition codes,” IEEE Transactions on Information Theory, vol. 64, no. 3, pp. 1604–1620, 2018.
[15] B. Bai, “Nonbinary LDPC coding for 5G communication systems,” in Proceedings of the 10th International Conference on Information, Communications and Signal Processing (ICICS’15), pp. 2–4, Singapore, 2015.
[16] S. Wang, Q. Huang, and Z. Wang, “Symbol flipping decoding algorithms based on prediction for non-binary LDPC codes,” IEEE Transactions on Communications, vol. 65, no. 5, pp. 1913–1924, 2017.
[17] Q. Huang, L. Song, and Z. Wang, “Set Message-Passing Decoding Algorithms for Regular Non-Binary LDPC Codes,” IEEE Transactions on Communications, 2017.
[18] 3GPP, “Study on scenarios and requirements for next generation access technologies,” Technical Report (TR) 38.913, 2016.
[19] W. Ryan and S. Lin, Channel Codes: Classical and Modern, Cambridge University Press, New York, NY, USA, 2009.
[20] L. Lan, Y. Y. Tai, S. Lin, B. Memari, and B. Honary, “New constructions of quasi-cyclic LDPC codes based on special classes of BIBD’s for the AWGN and binary erasure channels,” IEEE Transactions on Communications, vol. 56, no. 1, pp. 39–48, 2008.
[21] T. Tian, C. Jones, J. D. Villasenor, and R. D. Wesel, “Construction of irregular LDPC codes with low error floors,” in Proceedings of the International Conference on Communications (ICC’03), pp. 3125–3129, 2003.
[22] X.-Y. Hu, E. Eleftheriou, and D. M. Arnold, “Regular and irregular progressive edge-growth Tanner graphs,” Institute of Electrical and Electronics Engineers Transactions on Information Theory, vol. 51, no. 1, pp. 386–398, 2005.
[23] Q. Diao, Y. Y. Tai, S. Lin, and K. Abdel-Ghaffar, “LDPC codes on partial geometries: construction, trapping set structure, and puncturing,” Institute of Electrical and Electronics Engineers Transactions on Information Theory, vol. 59, no. 12, pp. 7898–7914, 2013.
[24] S. Song, B. Zhou, S. Lin, and K. Abdel-Ghaffar, “A unified approach to the construction of binary and nonbinary quasi-cyclic LDPC codes based on finite fields,” IEEE Transactions on Communications, vol. 57, no. 1, pp. 84–93, 2009.
[25] H. Xu, D. Feng, C. Sun, and B. Bai, “Construction of LDPC codes based on resolvable group divisible designs,” in Proceedings of the International Workshop on High Mobility Wireless Communications (HMWC’15), pp. 111–115, 2015.
[26] D. Divsalar, S. Dolinar, C. R. Jones, and K. Andrews, “Capacity-approaching protograph codes,” IEEE Journal on Selected Areas in Communications, vol. 27, no. 6, pp. 876–888, 2009.
[27] D. G. Mitchell, R. Smarandache, and J. Costello, “Quasi-cyclic LDPC codes based on pre-lifted protographs,” Institute of Electrical and Electronics Engineers Transactions on Information Theory, vol. 60, no. 10, pp. 5856–5874, 2014.
[28] Q. Diao, J. Li, S. Lin, and I. F. Blake, “New classes of partial geometries and their associated LDPC codes,” Institute of Electrical and Electronics Engineers Transactions on Information Theory, vol. 62, no. 6, pp. 2947–2965, 2016.
[29] M. P. Fossorier, “Quasi-cyclic low-density parity-check codes from circulant permutation matrices,” Institute of Electrical and Electronics Engineers Transactions on Information Theory, vol. 50, no. 8, pp. 1788–1793, 2004.
[30] Z. Li, L. Chen, L. Zeng, S. Lin, and W. H. Fong, “Efficient encoding of quasi-cyclic low-density parity-check codes,” IEEE Transactions on Communications, vol. 54, no. 1, pp. 71–81, 2006.
[31] J. Li, K. Liu, S. Lin, and K. Abdel-Ghaffar, “Decoding of quasi-cyclic LDPC codes with section-wise cyclic structure,” in Proceedings of the IEEE Information Theory and Applications Workshop (ITA’14), pp. 1–10, Calif, USA, 2014.
[32] F. Cai, X. Zhang, D. Declercq, S. K. Planjery, and B. Vasic, “Finite alphabet iterative decoders for LDPC codes: optimization, architecture and analysis,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 61, no. 5, pp. 1366–1375, 2014.
[33] H. Liu, Q. Huang, G. Deng, and J. Chen, “Quasi-cyclic representation and vector representation of RS-LDPC Codes,” IEEE Transactions on Communications, vol. 63, no. 4, pp. 1033–1042, 2015.
[34] Q. Huang, L. Tang, S. He, Z. Xiong, and Z. Wang, “Low-complexity encoding of quasi-cyclic codes based on Galois Fourier transform,” IEEE Transactions on Communications, vol. 62, no. 6, pp. 1757–1767, 2014.
[35] J. Li, K. Liu, S. Lin, and K. Abdel-Ghaffar, “Algebraic quasi-cyclic ldpc codes: Construction, low error-floor, large girth and a reduced-complexity decoding scheme,” IEEE Transactions on Communications, vol. 62, no. 8, pp. 2626–2637, 2014.
[36] C. F. Gauss and A. A. Clarke, Disquisitiones arithmeticae (Second, corrected edition), Springer, New York, NY, USA, 1966.
[37] J. Li, K. Liu, S. Lin, K. Abdel-Ghaffar, and W. E. Ryan, “An unnoticed strong connection between algebraic-based and protograph-based LDPC codes, Part I: Binary case and interpretation,” in Proceedings of the Information Theory and Applications Workshop (ITA’15), pp. 36–45, San Diego, Calif, USA, 2015.
[38] R. M. Tanner, “A recursive approach to low complexity codes,” Institute of Electrical and Electronics Engineers Transactions on Information Theory, vol. 27, no. 5, pp. 533–547, 1981.
[39] A. Tasdighi, A. H. Banihashemi, and M.-R. Sadeghi, “Efficient search of girth-optimal QC-LDPC codes,” Institute of Electrical

and Electronics Engineers Transactions on Information Teory, vol.62,no.4,pp.1552–1564,2016. [40]C.Sun,H.Xu,D.Feng,andB.Bai,“(3,L)quasi-cyclicLDPC codes: Simplifed exhaustive search and designs,”in Proceedings of the 9th International Symposium on Turbo Codes and Iterative Information Processing (ISTC’16), pp. 271–275, Brest, France, 2016. [41] H. Xu, C. Chen, M. Zhu, B. M. Bai, and B. Zhang, “Nonbinary LDPC cycle codes: Efcient search, design, and code optimiza- tion,” Science China Information Sciences, http://engine.scichina .com/doi/10.1007/s11432-017-9271-6. [42] S. Zhao and X. Ma, “Construction of high-performance array- based non-binary LDPC codes with moderate rates,” IEEE Communications Letters,vol.20,no.1,pp.13–16,2016. [43] H. Xu, D. Feng, R. Luo, and B. Bai, “Construction of quasi-cyclic LDPC codes via masking with successive cycle elimination,” IEEE Communications Letters,vol.20,no.12,pp.2370–2373, 2016. [44] H. Xu and B. Bai, “Superposition Construction of Q-Ary LDPC Codes by Jointly Optimizing Girth and Number of Shortest Cycles,” IEEE Communications Letters,vol.20,no.7,pp.1285– 1288, 2016. [45] Q. Huang, K. Liu, and Z. Wang, “Low-density arrays of circulant matrices: Rank and row-redundancy, and QC-LDPC codes,” in Proceedings of the 2012 IEEE International Symposium on Information Teory, ISIT 2012,pp.3073–3077,USA,July2012. [46]H.Xu,D.Feng,C.Sun,andB.Bai,“Algebraic-basednonbinary ldpc codes with fexible feld orders and code rates,” China Communications,vol.14,no.4,pp.111–119,2017. Hindawi Wireless Communications and Mobile Computing Volume 2018, Article ID 7063758, 12 pages https://doi.org/10.1155/2018/7063758

Research Article
Code-Hopping Based Transmission Scheme for Wireless Physical-Layer Security

Liuguo Yin 1,2 and Wentao Hao2,3

1 Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China 2EDA Laboratory, Research Institute of Tsinghua University in Shenzhen, Shenzhen, China 3School of Aerospace Engineering, Tsinghua University, Beijing 100084, China

Correspondence should be addressed to Liuguo Yin; [email protected]

Received 23 November 2017; Revised 9 February 2018; Accepted 28 February 2018; Published 3 April 2018

Academic Editor: Zesong Fei

Copyright © 2018 Liuguo Yin and Wentao Hao. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Due to the broadcast and time-varying natures of wireless channels, traditional communication systems that provide data encryption at the application layer suffer from many challenges such as error diffusion. In this paper, we propose a code-hopping based secrecy transmission scheme that uses dynamic nonsystematic low-density parity-check (LDPC) codes and an automatic repeat-request (ARQ) mechanism to jointly encode and encrypt source messages at the physical layer. In this scheme, secret keys at the transmitter and the legitimate receiver are generated dynamically from the source messages that have been transmitted successfully. During the transmission, each source message is jointly encoded and encrypted by a parity-check matrix, which is dynamically selected from a set of LDPC matrices based on the shared dynamic secret key. As for the eavesdropper (Eve), the uncorrectable decoding errors prevent her from generating the same secret key as the legitimate parties. Thus, she cannot select the correct LDPC matrix to recover the source message. We demonstrate that our scheme can be compatible with traditional cryptosystems and enhance the security without sacrificing the error-correction performance. Numerical results show that the bit error rate (BER) of Eve approaches 0.5 as the number of transmitted source messages increases and the security gap of the system is small.

1. Introduction

Information security and reliability are two crucial issues in wireless communications. Traditionally, communication systems correct transmission errors at the physical layer based on channel codes and cope with eavesdropping at the application layer based on cryptographic algorithms. In practical scenarios, there will be residual errors in the decoded messages due to the time-varying nature of wireless channels, which may cause severe error diffusion in the decryption. In addition, with the rapid increase of the eavesdropper's computing power, these computational-complexity based encryption algorithms, such as A5/1 in GSM, will be easier to break.

Alternatively, the schemes based on physical-layer security aim to tackle these two crucial issues at the physical layer. Shannon [1] first studied secure communication from an information theoretic perspective in which a preshared secret key between the legitimate parties is used to encrypt the source message. To avoid the key agreement and exploit the inherent randomness of wireless channels, Wyner [2] presented the degraded wiretap channel model in which a transmitter wants to send a secret message to a legitimate receiver through the main channel. This message is also perceived by an eavesdropper through the degraded wiretap channel. The secrecy capacity is defined as the supremum of all the achievable secure and reliable transmission rates. Then, Wyner's original work was generalized to broadcast channels [3] and Gaussian channels [4]. Moreover, the secrecy capacity of fading wiretap channels [5], MIMO wiretap channels [6], and multiuser wiretap channels [7, 8] has been derived in the literature. In these works, the equivocation of Eve is a widely accepted metric for security, which is defined as the conditional entropy of the source message given her noisy observation [9].

Many coding techniques are applied to wiretap channels to make the secrecy transmission rate approach the secrecy capacity, in other words, for the equivocation of Eve to approximate the entropy of the source message. For binary erasure wiretap channels, Thangaraj et al. [10] proposed a coding technique based on the dual of LDPC codes and showed that the secrecy capacity can be achieved by this technique. For symmetric discrete memoryless wiretap channels, Andersson et al. [11] proved that nested polar codes can achieve the whole rate-equivocation region. In addition, this coding technique is further applied to relay-eavesdropper channels [12], block fading channels [13], and multiuser channels [14]. These schemes are really effective when the code length is sufficient, but may be difficult to implement in practical systems.

When we consider the design of practical coding schemes, another valuable metric is the BER [15, 16]. In fact, it is difficult for the eavesdropper to recover any information from the decoded message when she experiences a BER of about 0.5 and the errors are randomly distributed. The security gap is defined as the quality difference between Bob's and Eve's channels required to achieve a sufficient level of physical-layer security, while ensuring that Bob reliably receives the information [17]. In [17], punctured systematic LDPC codes were exploited to obtain a small security gap. Furthermore, a nonsystematic solution based on scrambled systematic LDPC codes was proposed in [18]. It was proved that the achievable security gap of the scrambled scheme is smaller than that of the punctured method. In [19], scrambling, concatenation, and hybrid automatic repeat-request (HARQ) were combined to reduce the security gap even further. In addition, dynamic LDPC codes are used to enhance the security of the communication system [20]. And protograph LDPC codes [21, 22] can also be used to guarantee the security of the transmission.

In this paper, we propose a scheme based on code-hopping for secrecy transmission over wireless wiretap channels. In the proposed scheme, with ARQ mechanism, the transmitter and the legitimate receiver can select the source messages in real time to distill the secret key. This secret key is then mapped into the parity-check matrix of LDPC codes, which is used to encode the source message. As for the eavesdropper, the uncorrectable decoding errors prevent her from generating the same secret key as the transmitter and the legitimate receiver. Therefore, she cannot obtain the correct parity-check matrix to recover the source message. Theoretical analysis demonstrates that it is difficult for the eavesdropper to generate the same secret key as the legitimate parties. Simulation results show that the BER of Eve approaches 0.5 as the number of transmissions increases and the security gap of the system is small.

The remainder of the paper is organized as follows. We introduce our system model and the design of the encoder and decoder in Section 2. In Section 3, the dynamic secret key generation algorithm is proposed and the security of the secret key is well examined. In Section 4, we construct a large number of parity-check matrices of LDPC codes based on the technique we called structured-random protograph expanding. Encoder and decoder implementation of structured-random LDPC codes are discussed in Section 5. In Section 6, we analyze the reliability and the security performance of our scheme. And some numerical results are given in Section 7. Finally, concluding remarks are provided in Section 8.

2. The Proposed Secrecy Transmission Scheme

In this section, we will first introduce the wiretap channel model with public feedback and the concept of security gap. Then, we will propose our secrecy transmission scheme along with the design of encoder and decoder.

Figure 1: Wiretap channel model with public feedback. (The encoder of the transmitter, Alice, maps $m_i$ to $x_i$; the decoder of the receiver, Bob, observes $y_i$ over the main channel and returns $f_i$ over the feedback channel; the decoder of the eavesdropper, Eve, observes $z_i$ over the wiretap channel.)

2.1. System Model. As shown in Figure 1, for $i = 1, 2, \ldots$, message $m_i$ is a sequence of uncoded bits and the length of $m_i$ is $K$. A transmitter named Alice wants to send $m_i$ to a legitimate receiver named Bob through the main channel, but her transmission is also perceived by an eavesdropper named Eve through the wiretap channel. To keep $m_i$ as secret as possible, Alice encodes each length-$K$ message $m_i$ to a length-$N$ codeword $x_i$ by her encoder. The corresponding received codewords by Bob and Eve are denoted by $y_i$ and $z_i$, which are recovered by the decoder as $\hat{m}_i$ and $\tilde{m}_i$, respectively. Additionally, in our model, Bob can use a public feedback channel to inform Alice whether the current codeword is decoded successfully with a feedback signal $f_i$. If a decoding error occurs at Bob, Alice will retransmit the source message until Bob successfully recovers it or the number of retransmissions reaches the maximum. Taking into account the application in practical scenarios, both channels are assumed to be Gaussian or fading channels:

\begin{align}
y_i &= h_i^B x_i + n_i^B, \notag \\
z_i &= h_i^E x_i + n_i^E, \tag{1}
\end{align}

where $h_i^B$ and $h_i^E$ are the fading coefficients, which are equal to one for the Gaussian scenario and follow a certain distribution for the fading scenario, and $n_i^B$ and $n_i^E$ are zero mean Gaussian noise; $n_i^B \sim \mathcal{N}(0, \sigma_B^2)$ and $n_i^E \sim \mathcal{N}(0, \sigma_E^2)$.

Let $P_e^B$ and $P_e^E$ denote the average BER of Bob and Eve, respectively. As shown in Figure 2, to guarantee the reliability, $P_e^B$ should be lower than a given threshold $P_{\text{err,max}}^B$ ($\approx 0$). And to achieve the confidentiality, $P_e^E$ should be larger than a given threshold $P_{\text{err,min}}^E$ ($\approx 0.5$). Particularly, if $P_e^E$ is close to 0.5 and the errors are randomly distributed, Eve cannot extract any information from the decoded messages. Based on this observation, the reliability and security of the transmission are guaranteed if conditions (2) and (3) can be satisfied [17], respectively:

$$P_e^B \le P_{\text{err,max}}^B = f_e\left(\mathrm{SNR}_{\min}^B\right), \tag{2}$$

$$P_e^E \ge P_{\text{err,min}}^E = f_e\left(\mathrm{SNR}_{\max}^E\right), \tag{3}$$

where $\mathrm{SNR}_{\min}^B$ is the lowest signal-to-noise ratio at Bob to guarantee reliability, $\mathrm{SNR}_{\max}^E$ is the highest signal-to-noise ratio at Eve to guarantee security, and $f_e(\cdot)$ denotes the BER as the function of SNR. Then, the security gap is defined as follows [17]:

$$S_g = \mathrm{SNR}_{\min}^B - \mathrm{SNR}_{\max}^E, \tag{4}$$

where the SNRs are expressed in decibels (dB). Without sacrificing the error-correcting performance of the transmission system, our design targets are making the BER of Eve approach 0.5 and reducing the security gap as much as possible.

Figure 2: Security gap established by the BER curve. (Axes: BER versus SNR, with the thresholds $P_{\text{err,min}}^E$ and $P_{\text{err,max}}^B$, the security region, the reliable region, and the security gap between $\mathrm{SNR}_{\max}^E$ and $\mathrm{SNR}_{\min}^B$ marked.)

2.2. Design of the Coding Scheme. To exploit the inherent randomness of wireless channels and the uncorrectable decoding errors of Eve, our scheme is implemented such that the secret keys are distilled from the un-retransmitted source messages, which are then used to generate the parity-check matrices of LDPC codes. During the transmission, the source messages are encoded and decoded by these dynamic parity-check matrices. The block diagrams of the encoder and the decoder are illustrated in Figures 3 and 4, respectively.

Figure 3: Block diagram of the encoder. Note that the $Z^{-1}$ block denotes a delay unit. (Blocks: secret key generator with inputs $m_{i-1}$ and $f_{i-1}$ and output $K_i$; parity-check matrix generator with output $\mathbf{H}_i$; LDPC encoder mapping $m_i$ to $x_i$.)

Figure 4: Block diagram of the decoder. Note that the $Z^{-1}$ block denotes a delay unit. (Blocks: LDPC decoder mapping $y_i$ to $\hat{m}_i$; integrity check producing $f_i$; secret key generator with inputs $\hat{m}_{i-1}$ and $f_{i-1}$ and output $K_i$; parity-check matrix generator with output $\mathbf{H}_i$.)

In the encoder of Alice, the secret key $K_i$ is updated dynamically according to the received feedback signal $f_{i-1}$ and the source message $m_{i-1}$. If $f_{i-1} = \text{ACK}$, then $K_i$ will be updated according to $m_{i-1}$. If $f_{i-1} = \text{NACK}$, $K_i$ will remain unchanged. The detailed procedure of key update will be discussed in Section 3. Then, the secret key $K_i$ will be used to generate the parity-check matrix of LDPC codes as follows:

$$\mathbf{H}_i = f_H(K_i), \tag{5}$$

where $f_H(\cdot)$ is the mapping from the secret key to the parity-check matrix. Each source message $m_i$ will be encoded by the corresponding $\mathbf{H}_i$.

In the decoder of Bob, the integrity of the decoded source message $\hat{m}_i$ will be checked. If $\hat{m}_i$ is recovered without errors, the public feedback signal $f_i = \text{ACK}$; otherwise, $f_i = \text{NACK}$. Instead of using the syndrome of the decoded codeword to determine the correctness of $\hat{m}_i$, we use the cyclic redundancy check (CRC) algorithm to perform the integrity check. This is because when the decoded codeword converges to another valid codeword of $\mathbf{H}_i$, the method based on the syndrome cannot detect errors. As for the symmetric key $K_i$ and the parity-check matrix $\mathbf{H}_i$, they will be generated as in Alice's encoder.

3. Dynamic Secret Key Generation Scheme

In this section, we will introduce the dynamic secret key generation algorithm and the mathematical rationales behind it. With this algorithm, Alice and Bob can select the appropriate source messages during the transmission and then distill the secret key based on the universal hashing family.

3.1. Automatic Source Message Selection. In this subsection, we will show how Alice and Bob select appropriate source messages in real time during the transmission, which are then hashed into the dynamic secret key. We define $S_i^A$ and $S_i^B$ as the source message set that is used to generate the secret key $K_i$ at Alice and Bob, respectively. To give Alice and Bob an advantage over Eve, only un-retransmitted source messages will be included in $S_i^A$ and $S_i^B$. Before the communication begins, $S_0^A = S_0^B = (s_{0,0}, s_{0,1}, \ldots, s_{0,n-1})$, where $n$ is the number of source messages in the set and $s_{0,j}$ is the publicly agreed initialized binary vector of length $K$, $j = 0, 1, \ldots, n-1$.

Figure 5: The process of automatic source message selection. (Rows: the messages sent by the transmitter, Alice; the ACK/NACK feedback; the messages at the receiver, Bob. The legend distinguishes un-retransmitted from retransmitted source messages.)

As illustrated in Figure 5, during the transmission, Alice transmits a source message $m_i$ and waits for the corresponding feedback signal $f_i$ before transmitting any new source message. If the received feedback signal $f_i = \text{NACK}$, $S_{i+1}^A$ will remain unchanged compared to $S_i^A$:

\begin{align}
S_{i+1}^A &= (s_{i+1,0}, s_{i+1,1}, \ldots, s_{i+1,n-1}) \notag \\
&= (s_{i,0}, s_{i,1}, \ldots, s_{i,n-1}). \tag{6}
\end{align}

If the received feedback signal $f_i = \text{ACK}$, $S_{i+1}^A$ will be updated as follows:

\begin{align}
S_{i+1}^A &= (s_{i+1,0}, \ldots, s_{i+1,n-2}, s_{i+1,n-1}) \notag \\
&= (s_{i,1}, \ldots, s_{i,n-1}, m_i). \tag{7}
\end{align}

As for Bob, if he recovers the source message successfully, he will also update the set $S_{i+1}^B$ in the same way and send a feedback signal $f_i = \text{ACK}$. If he fails, he will keep $S_{i+1}^B = S_i^B$ and send a feedback signal $f_i = \text{NACK}$. This strategy guarantees that $S_i^A = S_i^B = S_i$.

Because there are $n$ elements in total in $S_i$ and the length of each element is $K$ bits, the space complexity of storing $S_i$ is $O(nK)$. The update of $S_i$ is similar to that of a queue. In the update process, the first element in $S_i$ will be removed and discarded. The second element in $S_i$ will be moved to the first location and so on. As for the new element, that is, the source message that has been successfully transmitted, it will be moved to the last location. Considering that the length of each element is $K$ bits, only $K$ additional bits of space are needed to store the element that is being moved. Therefore, the space complexity of updating $S_i$ is $O(K)$.

It is very difficult for Eve to reproduce $S_i$. She must eavesdrop on not only every source message, but also all of the feedback signals. Whenever the eavesdropper has uncertainty about $S_i$, the uncertainty is reflected in the corresponding secret key.
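The queue-like update in (6) and (7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `init_message_set` and `update_message_set`, the all-zero initial vectors, and the toy values `n = 3` and `K = 4` are hypothetical and do not come from the paper. The sketch mirrors (6) and (7) literally, so deciding which transmissions count as un-retransmitted is left to the caller.

```python
from collections import deque

ACK, NACK = "ACK", "NACK"


def init_message_set(n, K):
    """Return a set of n publicly agreed vectors of K bits (all-zero here, by assumption)."""
    return deque([(0,) * K for _ in range(n)], maxlen=n)


def update_message_set(message_set, message, feedback):
    """Apply (6) on NACK (no change) and (7) on ACK (drop oldest, append newest)."""
    if feedback == ACK:
        # A full deque with maxlen discards its leftmost element on append.
        message_set.append(tuple(message))
    return message_set


# Toy run: Alice and Bob see the same messages and feedback,
# while Eve decodes the third message with one bit error.
alice = init_message_set(n=3, K=4)
bob = init_message_set(n=3, K=4)
eve = init_message_set(n=3, K=4)

events = [
    ((1, 0, 1, 1), (1, 0, 1, 1), ACK),
    ((0, 1, 1, 0), (0, 1, 1, 0), NACK),  # not acknowledged, so never stored
    ((1, 1, 0, 0), (1, 1, 0, 1), ACK),   # Eve's copy differs in the last bit
]
for sent, eve_decoded, feedback in events:
    update_message_set(alice, sent, feedback)
    update_message_set(bob, sent, feedback)  # Bob only sends ACK after a correct decode
    update_message_set(eve, eve_decoded, feedback)

print(alice == bob, alice == eve)  # True False
```

A single undetected decoding error at Eve is enough to make her set differ from the one shared by Alice and Bob, which is the effect the scheme relies on when the set is later hashed into the key.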

3.2. Secret Key Distillation. In this subsection, we will introduce how to distill a secret key from the source message set $S_i$. Our target is retaining as much of the eavesdropper's information loss as possible in the secret key. The theory of universal hash family (UHF) provides a powerful solution for us. A UHF is a family of functions such that the random mapping obtained by uniformly choosing a function from this family is almost invertible [23]. In other words, regardless of the actual input distribution, by choosing a function uniformly at random from a universal hash family, the expected hash output distribution will be close to uniform. In our considered scenario, $S_i$ is hashed into a secret key $K_i$ by using a function $f_{\text{key}}$ that is selected from the universal hash function family $\mathcal{U}$. And the conditional distribution of $K_i$ given the eavesdropper's knowledge about $S_i$ can be close to the uniform distribution. Because a nearly uniform distribution means nearly maximum entropy, the eavesdropper knows almost nothing about $K_i$. Based on the generalized result from [24], the security of $K_i$ can be evaluated by

\begin{align}
H(K_i \mid \mathcal{U}, S_i^E = s_i) &\ge H_2(K_i \mid \mathcal{U}, S_i^E = s_i) \tag{8} \\
&\ge \ell - \log_2\left(1 + 2^{\ell - \ell_c}\right) \tag{9} \\
&\ge \ell - \frac{2^{\ell - \ell_c}}{\ln 2}, \tag{10}
\end{align}

where $S_i^E = s_i$ is the eavesdropper's knowledge about $S_i$, $\ell$ is the length of $K_i$ in bits, and $H_2(\cdot)$ is the Rényi entropy of order 2 [24]. When Eve's observation $S_i^E = s_i$ satisfies $H_2(S_i \mid S_i^E = s_i) \ge \ell_c$ with probability at least $(1 - \delta)$, formula (9) can be generalized as

$$H(K_i \mid \mathcal{U}, S_i^E) \ge (1 - \delta)\left(\ell - \log_2\left(1 + 2^{\ell - \ell_c}\right)\right). \tag{11}$$

Formulas (9) and (11) show that if the length of the secret key $\ell$ does not exceed $\ell_c$, $K_i$ is secure because on average Eve will have less than one bit of information about $K_i$. And $\ell_c$ can be estimated as follows:

$$\ell_c \le H_2(S_i \mid S_i^E = s_i). \tag{12}$$

It is noteworthy that (9) and (11) are averaged over all uniform choices of hash functions. It is possible that, for some specific values of $\mathcal{U}$, $H(K_i \mid \mathcal{U}, S_i^E)$ is not negligible when $\ell \le \ell_c$. However, this occurs with negligible probability [24].

Because of the randomness of the wireless channel, it is impossible for Eve to recover each source message in $S_i$. $H_2(S_i \mid S_i^E = s_i)$ can be calculated as follows [24]:

$$H_2(S_i \mid S_i^E = s_i) = -L \cdot \log_2\left(\left(1 - P_e^E\right)^2 + \left(P_e^E\right)^2\right), \tag{13}$$

where $L = nK$ is the length of $S_i$ in bits.

Figure 6: The number of secret key bits we can distill from each source message bit versus Eve's BER. (Axes: secret key bits, 0 to 1, versus BER of Eve, $10^{-5}$ to $10^{0}$.)

Figure 6 illustrates the relationship between Eve's BER and the number of secret key bits we can distill from each source message bit. We can see that, with the increase of Eve's BER, we can distill more secret key bits on average from each source message bit. It shows that the eavesdropper's information loss is retained in the secret key.

In our considered wiretap channel model, Eve's BER can be calculated from Bob's maximum BER and the security gap:

\begin{align}
P_e^E &= f_e(\mathrm{SNR}_E) = f_e(\mathrm{SNR}_B - S_g) \notag \\
&= f_e\left(f_e^{-1}\left(P_e^B\right) - S_g\right). \tag{14}
\end{align}

Then, we can calculate $\ell_c$ as follows:

\begin{align}
\ell_c &= -L \cdot \log_2\left(\left(1 - P_e^E\right)^2 + \left(P_e^E\right)^2\right) \tag{15} \\
&= -L \cdot \log_2\left[\left(1 - f_e\left(f_e^{-1}\left(P_e^B\right) - S_g\right)\right)^2 + \left(f_e\left(f_e^{-1}\left(P_e^B\right) - S_g\right)\right)^2\right]. \tag{16}
\end{align}

Considering that $L = nK$, according to (16), we can choose the value of $n$ as follows:

$$n = \frac{-\ell_c}{K \log_2\left[\left(1 - f_e\left(f_e^{-1}\left(P_e^B\right) - S_g\right)\right)^2 + \left(f_e\left(f_e^{-1}\left(P_e^B\right) - S_g\right)\right)^2\right]}. \tag{17}$$

3.3. Implementation of UHF. In this subsection, we will show how to implement the universal hash function $f_{\text{key}}(\cdot)$ in
practical scenarios. A Toeplitz matrix is a matrix in which each descending diagonal from left to right is constant and is a kind of UHF that can be implemented with low complexity [25]. In our proposed scheme, we try to generate the secret key $K_i$ with length $\ell$ from the source message set $S_i$ with length $L$. The corresponding Toeplitz matrix is as follows:

$$\mathbf{T} = \begin{bmatrix}
a_{\ell} & a_{\ell+1} & \cdots & a_{\ell+L-2} & a_{\ell+L-1} \\
a_{\ell-1} & a_{\ell} & \cdots & a_{\ell+L-3} & a_{\ell+L-2} \\
\vdots & a_{\ell-1} & \ddots & \vdots & \vdots \\
a_2 & \vdots & \ddots & a_L & a_{L+1} \\
a_1 & a_2 & \cdots & a_{L-1} & a_L
\end{bmatrix}, \tag{18}$$

where $a_1, a_2, \ldots, a_{\ell}, \ldots, a_{\ell+L-1}$ are randomly generated elements over GF(2). The secret key can be generated by multiplying $\mathbf{T}$ and $S_i$:

$$K_i = f_{\text{key}}(S_i) = \mathbf{T} \times (S_i)^T. \tag{19}$$

The computational complexity of (19) is $O(L^2)$. To reduce the computational complexity, we can use the improved algorithm based on the fast Fourier transform (FFT) [26]. Based on the Toeplitz matrix $\mathbf{T}$, we can obtain a new circular matrix $\mathbf{T}_c$ as follows:

$$\mathbf{T}_c = \begin{bmatrix} \mathbf{T} & \mathbf{R}_1 \\ \mathbf{R}_2 & \mathbf{R}_3 \end{bmatrix} = \operatorname{Circu}\left(a_{\ell}, a_{\ell+1}, \ldots, a_{\ell+L-1}, a_1, \ldots, a_{\ell-1}\right), \tag{20}$$

where $\mathbf{R}_1$, $\mathbf{R}_2$, and $\mathbf{R}_3$ are the submatrices defined in [26], which make the extended matrix $\mathbf{T}_c$ a circular matrix. $\operatorname{Circu}(\cdot)$ denotes the circular matrix, which can be represented by its first row.

Then, we generate a new vector $\hat{S}_i = (S_i, \mathbf{0})$ by combining $S_i$ with a zero vector $\mathbf{0}$, where the length of $\hat{S}_i$ equals the number of columns of $\mathbf{T}_c$. The secret key can be generated by multiplying $\mathbf{T}_c$ and $\hat{S}_i$, which can be calculated using the FFT-based method:

$$K_i = \mathbf{T}_c \times \hat{S}_i^T = \mathbf{F}^{-1}\left(\mathbf{F}(\mathbf{T}_c(1)) \circ \mathbf{F}(\hat{S}_i)\right), \tag{21}$$

where $\mathbf{F}(\cdot)$ is the Fourier transform and $\mathbf{F}^{-1}(\cdot)$ is the inverse, $\mathbf{T}_c(1)$ is the first row of $\mathbf{T}_c$, and $\circ$ denotes the operation that multiplies the corresponding elements in the vectors. The computational complexity of (21) is $O(L \log L)$.

4. Design of Structure-Random LDPC Codes

In this section, we will show how to construct a large number of parity-check matrices of LDPC codes based on the technique we called structured-random protograph expanding. A protograph is a Tanner graph with a relatively small number of nodes [27], which can be used to construct the parity-check matrix of LDPC codes. Because systematic codes directly expose the secret message bits, all of the information bits will be punctured and the $N$ parity bits will be transmitted. We use the code doping method in [28] to design and optimize our protograph to ensure that the iterative decoding of the designed LDPC codes can be triggered successfully. Figure 7 shows our optimized protograph $G = (V, C, E)$ for a rate-1/2 nonsystematic LDPC code. We denote $V$ as the set of variable nodes $\{v_1, v_2, \ldots, v_6\}$, $C$ as the set of check nodes $\{c_1, c_2, \ldots, c_4\}$, and $E$ as the set of edges $\{e_1, e_2, \ldots, e_{16}\}$. In the designed protograph, we will puncture the information nodes denoted by $v_1$ and $v_2$ among all the variable nodes to avoid systematic transmission.

Figure 7: Protograph $G$ for the nonsystematic LDPC code. The rate is 1/2.

To guarantee the convergence of the belief propagation (BP) decoding algorithm, the connection relationship of the check node $c_4$ is specially designed. In our designed protograph, the check node $c_4$ is connected to only one punctured variable node $v_2$. Equivalently, we can use a base parity-check matrix $\mathbf{H}_{B,0}$ with size $4 \times 6$ to represent this protograph:

$$\mathbf{H}_{B,0} = \begin{bmatrix}
1 & 1 & 0 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 & 1 \\
3 & 1 & 1 & 1 & 0 & 0 \\
0 & 1 & 1 & 2 & 0 & 0
\end{bmatrix}. \tag{22}$$

A "copy-and-permute" operation can be applied to the protograph $G$ to obtain a large derived Tanner graph. We define $t$ as the expanding factor; the "copy-and-permute" operation first makes $t$ copies of the protograph $G$ and then permutes the endpoints of each edge among the $t$ variable nodes and $t$ check nodes connected to the set of $t$ edges copied from the same edge of the original protograph. After this operation, we can obtain a large Tanner graph, where the $t$ copies of the original protograph are connected to each other. Equivalently, we can expand each element of value $w$ in the base matrix $\mathbf{H}_{B,0}$ to a $t \times t$ matrix with $w$ ones in each row or column. As a result, we can obtain a large matrix with size $4t \times 6t$.

Because random permutation is not easy to describe and implement efficiently, in our scheme, we adopt the structured type of permutation, such as cyclic permutation. In other words, we expand each element of value $w$ in the base matrix $\mathbf{H}_{B,0}$ to $t \times t$ circulant permutation matrices $\mathbf{I}_t(k)$. As a result, the expanded parity-check matrix will become a $t$-circulant matrix.
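As a minimal sketch of this circulant expansion, the Python below expands the base matrix of (22) with $t = 4$. Several things here are assumptions made only for illustration: the function names `circulant_block` and `expand_base_matrix`, the treatment of an entry of value $w$ as the superposition of $w$ circulant permutation matrices with distinct shifts, and the arbitrary shift rule `(j + 2 * k + e) % t`. The shifts used in the paper are optimized or key-driven and are not reproduced here.

```python
def circulant_block(t, shifts):
    """Return a t x t block that superimposes one circulant permutation per shift."""
    if len({s % t for s in shifts}) != len(shifts):
        raise ValueError("shifts must be distinct modulo t")
    block = [[0] * t for _ in range(t)]
    for shift in shifts:
        for row in range(t):
            # Row `row` of a circulant permutation has its single one at column row + shift.
            block[row][(row + shift) % t] = 1
    return block


def expand_base_matrix(base, t, shifts):
    """Replace base[j][k] = w by a t x t block with w ones per row and per column."""
    expanded = []
    for j, base_row in enumerate(base):
        blocks = [circulant_block(t, shifts[j][k]) for k in range(len(base_row))]
        for row in range(t):
            expanded.append([bit for block in blocks for bit in block[row]])
    return expanded


# Base matrix of (22); an entry w needs w distinct shifts, and 0 needs none.
H_B0 = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 1],
    [3, 1, 1, 1, 0, 0],
    [0, 1, 1, 2, 0, 0],
]
t = 4
shifts = [
    [[(j + 2 * k + e) % t for e in range(w)] for k, w in enumerate(row)]
    for j, row in enumerate(H_B0)
]

H = expand_base_matrix(H_B0, t, shifts)
row_weights = [sum(row) for row in H]
column_weights = [sum(column) for column in zip(*H)]
print(len(H), len(H[0]))  # 16 24
```

Each block row of the result keeps the row weight of the corresponding base-matrix row, and each block column keeps the corresponding column weight, which is a quick sanity check that no edge of the protograph was lost or duplicated in the expansion.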

To construct a large number of parity-check matrices of LDPC codes, it is not enough to expand the protograph $G$ with just one single stage. Therefore, we develop a structured-random protograph expanding technique. This technique expands the protograph $G$ with $p > 1$ stages. We denote $t_1, t_2, \ldots, t_p$ as the expanding factors for stages $1, 2, \ldots, p$, respectively. The total expanding factor $t$ can be calculated as $t = t_1 t_2 \cdots t_p$. Finally, the base matrix $\mathbf{H}_{B,0}$ is expanded to the parity-check matrix $\mathbf{H}_{B,p}$.

(i) Structured expanding: in the procedure of structured expanding, we expand the protograph $G$ in the first $p - 1$ stages to avoid parallel edges, short cycles, and low-weight codewords. As a result, all the nonzero elements in $\mathbf{H}_{B,p-1}$ will be equal to 1.

(ii) Random expanding: in the procedure of random expanding, we expand $\mathbf{H}_{B,p-1}$ in the $p$th stage based on the value of the dynamic secret key $K_i$. For each zero element in $\mathbf{H}_{B,p-1}$, we will expand it by a $t_p \times t_p$ zero matrix $\mathbf{0}_{t_p \times t_p}$. For each nonzero element, we will expand it by a $t_p \times t_p$ circulant permutation matrix $\mathbf{I}_{t_p}(k)$. The total number of nonzero elements is $q = |E| t / t_p$.

As for the parameters that are used in the procedure of structured expanding, that is, all the shift values and expanding factors, they are constant and will be shared between Alice and Bob publicly in advance. Now, we rewrite the dynamic secret key $K_i$ as a vector:

$$K_i = (k_{i,0}, k_{i,1}, \ldots, k_{i,j}, \ldots, k_{i,q-1}), \tag{23}$$

where each element $k_{i,j} \in [\![0, t_p - 1]\!]$ is represented by $\log_2 t_p$ bits. Regarding the parameters that are used in the procedure of random expanding, that is, all the random shift values, they are controlled by the dynamic secret key $K_i$, whose length is required to be $\ell = q \log_2 t_p$ bits.

After expanding the protograph $G$ with $p > 1$ stages, the $4 \times 6$ base matrix $\mathbf{H}_{B,0}$ is expanded to an $N \times (N + K)$ parity-check matrix $\mathbf{H}_i$, where $N = 4t$ and $K = 2t$. As mentioned above, $\mathbf{H}_i$ is a $t_p$-circulant matrix and can be written as $\mathbf{H}_i = [\mathbf{A}(K_i), \mathbf{B}(K_i)]$ such that

\begin{align}
\mathbf{A}(K_i) &= \left[\mathbf{A}_{jk}^{w}\right]_{4 \times 2} = \begin{bmatrix}
\mathbf{A}_{11}^{1} & \mathbf{A}_{12}^{1} \\
\mathbf{A}_{21}^{1} & \mathbf{A}_{22}^{1} \\
\mathbf{A}_{31}^{3} & \mathbf{A}_{32}^{1} \\
\mathbf{0} & \mathbf{A}_{42}^{1}
\end{bmatrix}, \notag \\
\mathbf{B}(K_i) &= \left[\mathbf{B}_{jk}^{w}\right]_{4 \times 4} = \begin{bmatrix}
\mathbf{0} & \mathbf{0} & \mathbf{B}_{13}^{1} & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{B}_{24}^{1} \\
\mathbf{B}_{31}^{1} & \mathbf{B}_{32}^{1} & \mathbf{0} & \mathbf{0} \\
\mathbf{B}_{41}^{1} & \mathbf{B}_{42}^{2} & \mathbf{0} & \mathbf{0}
\end{bmatrix}. \tag{24}
\end{align}

The first $K = 2t$ nodes are punctured as information nodes among all the $N + K = 6t$ variable nodes.

4.1. An Example. In this subsection, we construct a large number of nonsystematic (2048, 1024) LDPC codes via $p = 3$ stages. The total expanding factor is $t = t_1 t_2 t_3 = 4 \times 4 \times 32 = 512$. With the factor $t_1 = 4$, the first stage aims to separate all the parallel edges. With the factor $t_2 = 4$, the second stage aims to avoid the existence of cycles of length 4. With the factor $t_3 = 32$, the third stage aims to randomly expand all the $|E| t / t_3 = 256$ edges. Finally, we get a set of parity-check matrices $\mathcal{H} = \{\mathbf{H}(\mathbf{r}) : \mathbf{r} \in [\![0, 2^{256} - 1]\!]\}$.

During the transmission, we randomly select a parity-check matrix for each source message. The number of decoding iterations is limited to 63. In Figure 8, we show the average BER of the structured-random nonsystematic (2048, 1024) LDPC codes with different numbers of retransmissions $r$.

5. Encoder and Decoder Implementation of Structured-Random LDPC Codes

5.1. Encoder Implementation. To implement the encoder of structured-random LDPC codes, we need to derive the nonsystematic generator matrix $\mathbf{G}_i$ according to the parity-check matrix $\mathbf{H}_i$. According to (24), $\mathbf{G}_i$ can be derived by

\begin{align}
\mathbf{G}_i &= \left(\mathbf{B}(K_i)^{-1} \cdot \mathbf{A}(K_i)\right)^T = \begin{bmatrix}
(\mathbf{G}_i)_{11} & (\mathbf{G}_i)_{12} & (\mathbf{G}_i)_{13} & (\mathbf{G}_i)_{14} \\
(\mathbf{G}_i)_{21} & (\mathbf{G}_i)_{22} & (\mathbf{G}_i)_{23} & (\mathbf{G}_i)_{24}
\end{bmatrix} \notag \\
&= \begin{bmatrix}
(\mathbf{A}_{31}^{3})^T \mathbf{D}_1 & (\mathbf{A}_{31}^{3})^T \mathbf{D}_3 & (\mathbf{A}_{11}^{1})^T \mathbf{B}_{13}^{1} & (\mathbf{A}_{21}^{1})^T \mathbf{B}_{24}^{1} \\
(\mathbf{A}_{32}^{1})^T \mathbf{D}_1 \oplus (\mathbf{A}_{42}^{1})^T \mathbf{D}_2 & (\mathbf{A}_{32}^{1})^T \mathbf{D}_3 \oplus (\mathbf{A}_{42}^{1})^T \mathbf{C} & (\mathbf{A}_{12}^{1})^T \mathbf{B}_{13}^{1} & (\mathbf{A}_{22}^{1})^T \mathbf{B}_{24}^{1}
\end{bmatrix}, \tag{25}
\end{align}

where $\mathbf{D}_1 = \left(\mathbf{I} \oplus \mathbf{B}_{31}^{1} (\mathbf{B}_{41}^{1})^T \mathbf{C} (\mathbf{B}_{32}^{1})^T\right) \mathbf{B}_{31}^{1}$, $\mathbf{D}_2 = \mathbf{C} (\mathbf{B}_{32}^{1})^T \mathbf{B}_{31}^{1}$, $\mathbf{D}_3 = \mathbf{B}_{31}^{1} (\mathbf{B}_{41}^{1})^T \mathbf{C}$, and $\mathbf{C} = \left((\mathbf{B}_{42}^{2})^T \oplus (\mathbf{B}_{32}^{1})^T \mathbf{B}_{31}^{1} (\mathbf{B}_{41}^{1})^T\right)^{-1}$.

The multiplication between $m_i$ and $\mathbf{G}_i$ can be calculated in blocks:

process is the same. Therefore, the decoder implementation complexity of structured-random LDPC codes will be the same as that of quasi-cyclic LDPC codes with a fixed parity-check matrix.
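The random expanding step and the quasi-cyclic structure it preserves can be illustrated with a minimal sketch. It assumes a toy binary base matrix, a toy expanding factor of 4, and a key given as a bit string; the helper names `circulant_permutation`, `expand`, and `key_to_shifts` are hypothetical and are not the paper's parameters or implementation.

```python
def circulant_permutation(size, shift):
    """Return the size x size identity matrix with every row cyclically shifted by `shift`."""
    return [[1 if col == (row + shift) % size else 0 for col in range(size)]
            for row in range(size)]


def key_to_shifts(key_bits, size, count):
    """Split a bit string into `count` shift values of log2(size) bits each."""
    width = size.bit_length() - 1  # log2(size) when size is a power of two
    return [int(key_bits[i * width:(i + 1) * width], 2) for i in range(count)]


def expand(base, size, shifts):
    """Expand a binary base matrix into a quasi-cyclic matrix.

    Zero entries become size x size zero blocks; nonzero entries become
    circulant permutation blocks whose shifts are read from `shifts`
    in row-major order.
    """
    shift_iter = iter(shifts)
    H = [[0] * (len(base[0]) * size) for _ in range(len(base) * size)]
    for bi, base_row in enumerate(base):
        for bj, entry in enumerate(base_row):
            if entry == 0:
                continue
            block = circulant_permutation(size, next(shift_iter) % size)
            for r in range(size):
                for c in range(size):
                    H[bi * size + r][bj * size + c] = block[r][c]
    return H


base = [[1, 0, 1],
        [0, 1, 1]]                                   # toy base matrix (assumption)
size = 4                                             # toy expanding factor, 2 key bits per edge
shifts = key_to_shifts("00011011", size, count=4)    # one shift per nonzero entry
H = expand(base, size, shifts)
```

Changing the key changes only the shift values inside the nonzero blocks; the positions of the blocks, and therefore the row and column weights that a quasi-cyclic decoder architecture relies on, stay the same.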

6. Performance Analysis

BER −4 In this section, we will analyze the security and reliabil- 10 ity performances of our proposed scheme. As shown in the previous section, we can construct a large number of 10−5 nonsystematic LDPC codes that have good error-correction −6 performance. Terefore, we can guarantee that Bob’s BER 10 � �� will be lower than the given threshold by utilizing these 10−7 nonsystematic LDPC codes. It guarantees the reliability of 0 0.5 1 1.5 2 2.5 the transmission. We will analyze the security of our scheme Eb/N0 (dB) in two aspects: the complexity when Eve tries to crack the dynamic secret key and Eve’s average BER during the whole Uncoded r = 2 r = 0 r = 3 transmission. r = 1 Diferent from the traditional cryptosystems that have to distribute the secret key before communication begins, our Figure 8: Te average BER of the nonsystematic (2048, 1024) LDPC � codes with diferent maximum retransmission number �. scheme generates the secret key � dynamically from the source message set ��. During the transmission, an event which is referred to as synchronization error may happen. �� ⋅ G� =[(��)11 (��)12] � ∈ N � Tat is, there exists an index TH ,suchthat �TH is not correctly decoded by Eve, but �� is successfully recovered (G ) (G ) (G ) (G ) TH � 11 � 12 � 13 � 14 � ×[ ] (26) by Bob. At this moment, Eve’s source message set �+1 will (G�)21 (G�)22 (G�)23 (G�)24 bediferentfromAlice’sandBob’ssourcemessageset��+1. Terefore, Eve cannot generate the same secret key as Alice =[(� ) (� ) (� ) (� ) ], � 11 � 12 � 13 � 14 and Bob. As analyzed in Section 3, universal hash function makes where (��)11 =(��)11(G�)11 +(��)12(G�)21, (��)12 = the conditional distribution of �� close to the uniform (��)11(G�)12 +(��)12(G�)22, (��)13 =(��)11(G�)13 + distribution as follows: (��)12(G�)23,and(��)14 =(��)11(G�)14 +(��)12(G�)24.Te multiplication between (��)�� and (G�)��́ ́ can be further 1 �(� =� |� =�)≈� �,∀�∈�. divided as in [29]. 
For example, to multiply by (G�)12 can be � � � � �� � � � (27) 3 � � �� divided into four steps by successively multiplying by (A31) , 1 1 � � B31, (B41) ,andC . Because all those submatrices are circu- From the information theoretic perspective, (27) means that lant, all the required multiplications in the encoding process the conditional entropy of �� is close to its self-information can be fnished in �(�) time. Te additional computational � � �(� |� =�)≈�(�)= �� � . complexity is from the inversion operation to derive C.In � � � � log2 � �� (28) [30], authors have shown the inversion of a binary matrix �(�) Terefore, the computational complexity of Eve to crack a canbefnishedin time by using a parallel hardware |� | � 2 � =2 architecture. Terefore, the encoding process can be fnished dynamic secret key is approximated to .Evenif in �(�) time. Eve cracks the secret key by the exhaustive search, the similar Consider that the size of C is about 1/8 of the size of synchronization error may happen again and she has to repeat the cracking process. G�. Tus, the designed encoder for structured-random LDPC 1/8 To evaluate the probability that the synchronization error codes will increase by of the storage compared to the � (⋅) traditional encoder for QC-LDPC codes with a fxed parity- happens, we denote � as the frame error rate (FER) as the function of SNR. Bob’s FER and Eve’s FER can be expressed check matrix [31, 32]. � � as �� =��(SNR�) and �� =��(SNR�),respectively.And \ �TH is distributed geometrically; �TH ∼�(�0),where�0 = 5.2. Decoder Implementation. As for the decoder of struc - � � tured-random LDPC codes, it can be extended from the (1 − �� )�� .Tus,theprobabilitydistributionof�TH can be conventional decoder of quasi-cyclic LDPC codes with a fxed calculated as parity-check matrix [33, 34]. Tis is because the parity-check � � �−1 � � matrix H� of structured-random LDPC codes is also quasi- Pr (�TH =�) = [1−(1−�� ) �� ] (1−�� ) �� . 
(29) cyclic as shown in Section 4. Te only diference is that the shif values of the circulant permutation matrices in H� will As analyzed above, it is difcult for Eve to generate the be updated according to the dynamic secret key ��. When the same secret key as Alice and Bob once the synchronization shif values are successfully updated, the iterative decoding error happens. In other words, Eve cannot generate the Wireless Communications and Mobile Computing 9

correct parity-check matrix to decode the source message with index $i_{TH}$. To evaluate Eve's BER during the whole transmission, we can divide the source messages that Eve fails to recover into two categories. The first category contains the source messages that Eve fails to recover before the synchronization error happens. The messages in the first category are decoded by Eve using the correct parity-check matrix. The number of messages in the first category, $N_1(i_{TH})$, obeys the binomial distribution $N_1(i_{TH}) \sim B(i_{TH}-1, p_1)$, where $p_1 = P_f^B P_f^E / (1 - (1 - P_f^B) P_f^E)$. Thus, the average of $N_1(i_{TH})$ can be calculated as

$$\bar{N}_1(i_{TH}) = (i_{TH}-1)\,p_1 = \frac{(i_{TH}-1)\,P_f^B P_f^E}{1-(1-P_f^B)\,P_f^E}. \tag{30}$$

And the average number of error bits in each error message can be calculated as

$$N_{ER} = \frac{P_e(\mathrm{SNR}_E)\cdot K}{P_f(\mathrm{SNR}_E)}. \tag{31}$$

For the messages in the second category, half of their bits are wrong, because Eve cannot generate the same parity-check matrix as Alice and Bob. Finally, Eve's BER can be calculated as

$$P_e^E = \sum_{i=1}^{N} \frac{N_{ER}\,\bar{N}_1(i) + 0.5K\,(N-i+1)}{N K}\,\Pr(i_{TH}=i) + \frac{N_{ER}\,\bar{N}_1(N+1)}{N K}\,\Pr(i_{TH}\ge N+1). \tag{32}$$

Based on (30), $P_e^E$ can be further calculated as

$$\begin{aligned}
P_e^E &= \sum_{i=1}^{N} \frac{N_{ER}\,(i-1)\,p_1 + 0.5K\,(N-i+1)}{N K}\,\Pr(i_{TH}=i) + \frac{N_{ER}\,N\,p_1}{N K}\,\Pr(i_{TH}\ge N+1) \\
&\ge \sum_{i=1}^{N} \frac{N_{ER}\,(i-1)\,p_1 + 0.5K\,(N-i+1)}{N K}\,\Pr(i_{TH}=i) \\
&= \frac{0.5K\,N \sum_{i=1}^{N}\Pr(i_{TH}=i)}{N K} + \frac{(N_{ER}\,p_1 - 0.5K)\sum_{i=1}^{N}(i-1)\,\Pr(i_{TH}=i)}{N K} \\
&\ge 0.5\,\bigl(1-\Pr(i_{TH}\ge N+1)\bigr) - \frac{0.5\sum_{i=1}^{\infty} i\,\Pr(i_{TH}=i)}{N} \\
&= 0.5\,\bigl(1-\Pr(i_{TH}\ge N+1)\bigr) - 0.5\,\frac{\bar{i}_{TH}}{N} \\
&= 0.5\left(1-(1-p_0)^{N} - \frac{\bar{i}_{TH}}{N}\right),
\end{aligned} \tag{33}$$

where $\bar{i}_{TH}$ is defined as follows:

$$\bar{i}_{TH} = \sum_{i=1}^{\infty} i\,\Pr(i_{TH}=i) = \frac{1}{p_0} = \frac{1}{(1-P_f^B)\,P_f^E}. \tag{34}$$

Finally, $P_e^E$ can be lower bounded as follows:

$$P_e^E \ge 0.5\left(1-(1-p_0)^{N} - \frac{1}{p_0 N}\right) \xrightarrow{N\to\infty} 0.5. \tag{35}$$

From the above analysis, we can see that Eve's BER $P_e^E$ will approach 0.5 when the number of transmitted messages goes to infinity. In addition, when the security gap of the system increases, $(1 - P_f^B)$ and $P_f^E$ will increase, and thus $p_0$ will increase. Therefore, we can make Eve's BER $P_e^E$ approach 0.5 faster by increasing the security gap of the system.

7. Simulation Results

In this section, we will evaluate the performance of our proposed scheme by Monte-Carlo simulations.

Figure 9 illustrates the number of secret key bits we can distill on average from each transmitted source message bit. In the regions with very low or very high $E_b/N_0$, we can see that the number of secret key bits decreases. The reasons are as follows: in the region with very low $E_b/N_0$, retransmission happens frequently and the proportion of un-retransmitted source messages is small; in the region with very high $E_b/N_0$, the BER of Eve is very low and therefore the number of secret key bits we can distill on average from each source message bit is small. In addition, we can see that the more the channel of Eve is degraded compared to that of Bob, the more secret key bits we can distill.

Figure 9: The number of secret key bits we can distill on average from each transmitted source message bit using (2048, 1024) nonsystematic LDPC codes. (Axes: secret key bits versus Eb/N0 of Bob; curves: degraded by 0.5 dB, degraded by 0.2 dB, nondegraded.)
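The lower bound in (35) can be evaluated numerically to see how quickly it approaches 0.5. The sketch below is only an illustration: the function and argument names are hypothetical, and the frame error rates are made-up values chosen to show the trend, not simulation results from the paper.

```python
def sync_error_probability(fer_bob, fer_eve):
    """p0: probability that Bob decodes a message correctly while Eve fails on it."""
    return (1.0 - fer_bob) * fer_eve


def mean_sync_index(fer_bob, fer_eve):
    """Mean of the geometric distribution in (29), i.e. 1 / p0 as in (34)."""
    return 1.0 / sync_error_probability(fer_bob, fer_eve)


def eve_ber_lower_bound(fer_bob, fer_eve, num_messages):
    """Evaluate 0.5 * (1 - (1 - p0)^N - 1 / (p0 * N)), the form of the bound in (35)."""
    p0 = sync_error_probability(fer_bob, fer_eve)
    return 0.5 * (1.0 - (1.0 - p0) ** num_messages - 1.0 / (p0 * num_messages))


# Hypothetical frame error rates for Bob and Eve (assumptions, not measured values).
fer_bob, fer_eve = 1e-3, 1e-2
bounds = [eve_ber_lower_bound(fer_bob, fer_eve, n) for n in (1_000, 10_000, 100_000)]
```

With these assumed values the bound grows monotonically with the number of transmitted messages and stays below 0.5, which matches the qualitative behaviour described in the analysis. A larger p0 (a larger security gap) shrinks both correction terms and makes the bound tighten sooner.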

Figure 10: The BER of the eavesdropper versus the number of transmitted source messages when N = 1, 2, ..., 15000. (Axes: BER versus N; curves: Eve with security gaps of 0 dB, 0.2 dB, and 0.5 dB.)

Figure 11: BER of our framework for the Gaussian wiretap channel using (2048, 1024) nonsystematic LDPC codes for different security gaps when (a) N = 1000 and (b) N = 10000 source messages are transmitted with maximum retransmission number r = 3. (Axes: BER versus Eb/N0 in dB; curves: Bob (r = 3) and Eve with security gaps of 0 dB, 0.2 dB, and 0.5 dB.)

In Figure 10, the BER of Eve versus the number of transmitted source messages $N$ is plotted for different security gaps. We can see from Figure 8 that Bob's BER will be lower than $10^{-6}$ in four conditions: $r = 0$ and $E_b/N_0 = 2.2$ dB, $r = 1$ and $E_b/N_0 = 1.7$ dB, $r = 2$ and $E_b/N_0 = 1.5$ dB, or $r = 3$ and $E_b/N_0 = 1.4$ dB. Therefore, to guarantee the reliability of the transmission (Bob's maximum BER below $10^{-6}$), the quality of the main channel can be fixed to $E_b/N_0 = 1.7$ dB and the maximum retransmission number can be fixed to $r = 1$. For different security gaps, we can see that the BER of Eve will always approach 0.5 as the number of transmitted source messages increases. This is owing to the fact that the secret keys generated by Eve are the same as the keys generated by Alice and Bob before the first synchronization error happens. Therefore, she can recover the corresponding source messages successfully. After the first synchronization error happens, Eve can no longer decode the following source messages, because the uncorrected decoding errors prevent her from generating the correct secret key. Thus, as the number of transmitted source messages increases, Eve's average BER will approach 0.5.

In Figure 11(a), the BER curves of Bob and Eve are plotted when $N = 1000$ for different security gaps. The maximum retransmission number is fixed to $r = 3$. If Bob's BER

threshold is set to below $10^{-6}$ and Eve's BER threshold is set to 0.49, a security gap of 0 dB can be achieved. In Figure 11(b), the BER curves of Bob and Eve are plotted when $N = 10000$. We can see that the security gap can be further reduced to lower than 0 dB. It means that security of the source message can be guaranteed even when the wiretap channel is better than the main channel. We can see that the security gap of our scheme is really small and can be improved by increasing $N$.

8. Conclusions

In this paper, we have proposed a secrecy transmission scheme based on code-hopping to encrypt and encode the source messages at the physical layer for wireless communications. First, we present a dynamic secret key generation algorithm based on ARQ mechanism. With this algorithm, Alice and Bob can distill the secret keys from the un-retransmitted source messages based on the universal hash families. Second, we present a structured-random LDPC codes design algorithm. Based on this algorithm, we generate a large amount of parity-check matrices of LDPC codes. During the transmission, Alice and Bob dynamically select the parity-check matrices of LDPC codes to encode and recover the source messages based on the dynamic secret keys. Theoretical analysis demonstrates that it is difficult for Eve to generate the same secret key as Alice and Bob. Simulation results show that the BER of Eve will approach 0.5 as the number of transmitted source messages increases and the security gap of our system is small.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC, 91538203), the New Strategic Industries Development Projects of Shenzhen City (JCYJ20150403155812833), and the Joint Research Foundation of the General Armaments Department and the Ministry of Education of China (6141A02033322).

References

[1] C. E. Shannon, “Communication theory of secrecy systems,” Bell Labs Technical Journal, vol. 28, pp. 656–715, 1949.
[2] A. D. Wyner, “The wire-tap channel,” Bell Labs Technical Journal, vol. 54, no. 8, pp. 1355–1387, 1975.
[3] I. Csiszar and J. Korner, “Broadcast channels with confidential messages,” IEEE Transactions on Information Theory, vol. 24, no. 3, pp. 339–348, 1978.
[4] S. K. Leung-Yan-Cheong, “On a Special Class of Wiretap Channels,” IEEE Transactions on Information Theory, vol. 23, no. 5, pp. 625–627, 1977.
[5] P. K. Gopala, L. Lai, and H. El Gamal, “On the secrecy capacity of fading channels,” IEEE Transactions on Information Theory, vol. 54, no. 10, pp. 4687–4698, 2008.
[6] F. Oggier and B. Hassibi, “The secrecy capacity of the MIMO wiretap channel,” IEEE Transactions on Information Theory, vol. 57, no. 8, pp. 4961–4972, 2011.
[7] E. Ekrem and S. Ulukus, “The secrecy capacity region of the Gaussian MIMO multi-receiver wiretap channel,” IEEE Transactions on Information Theory, vol. 57, no. 4, pp. 2083–2114, 2011.
[8] A. S. Mansour, R. F. Schaefer, and H. Boche, “The individual secrecy capacity of degraded multi-receiver wiretap broadcast channels,” IEEE International Conference on Communications, pp. 4181–4186, 2015.
[9] A. Yener and S. Ulukus, “Wireless physical-layer security: lessons learned from information theory,” Proceedings of the IEEE, vol. 103, no. 10, pp. 1814–1825, 2015.
[10] A. Thangaraj, S. Dihidar, A. R. Calderbank, S. W. McLaughlin, and J.-M. Merolla, “Applications of LDPC codes to the wiretap channel,” IEEE Transactions on Information Theory, vol. 53, no. 8, pp. 2933–2945, 2007.
[11] M. Andersson, V. Rathi, R. Thobaben, J. Kliewer, and M. Skoglund, “Nested polar codes for wiretap and relay channels,” IEEE Communications Letters, vol. 14, no. 8, pp. 752–754, 2010.
[12] B. Duo, P. Wang, Y. Li, and B. Vucetic, “Secure transmission for relay-eavesdropper channels using polar coding,” IEEE International Conference on Communications, pp. 2197–2202, 2014.
[13] H. Si, O. O. Koyluoglu, and S. Vishwanath, “Hierarchical polar coding for achieving secrecy over state-dependent wiretap channels without any instantaneous CSI,” IEEE Transactions on Communications, vol. 64, no. 9, pp. 3609–3623, 2016.
[14] Y.-P. Wei and S. Ulukus, “Polar Coding for the General Wiretap Channel with Extensions to Multiuser Scenarios,” IEEE Journal on Selected Areas in Communications, vol. 34, no. 2, pp. 278–291, 2016.
[15] Z. Chen, L. Yin, and J. Lu, “Hamming distortion based secrecy systems: To foil the eavesdropper with finite shared key,” IEEE Communications Letters, vol. 19, no. 5, pp. 711–714, 2015.
[16] P. Wang, L. Yin, and J. Lu, “An efficient helicopter-satellite communication scheme based on check-hybrid LDPC coding,” Tsinghua Science and Technology, 2018.
[17] D. Klinc, J. Ha, S. W. McLaughlin, J. Barros, and B.-J. Kwak, “LDPC codes for the Gaussian wiretap channel,” IEEE Transactions on Information Forensics and Security, vol. 6, no. 3, pp. 532–540, 2011.
[18] M. Baldi, M. Bianchi, and F. Chiaraluce, “Non-systematic codes for physical layer security,” IEEE Information Theory Workshop, pp. 1–5, 2010.
[19] M. Baldi, M. Bianchi, and F. Chiaraluce, “Coding with scrambling, concatenation, and HARQ for the AWGN wire-tap channel: A security gap analysis,” IEEE Transactions on Information Forensics and Security, vol. 7, no. 3, pp. 883–894, 2012.
[20] Z. Chen, L. Yin, Y. Pei, and J. Lu, “CodeHop: physical layer error correction and encryption with LDPC-based code hopping,” Science China Information Sciences, vol. 59, no. 10, Article ID 102309, pp. 1–15, 2016.
[21] Y. Fang, G. Bi, Y. L. Guan, and F. C. M. Lau, “A survey on protograph LDPC codes and their applications,” IEEE Communications Surveys & Tutorials, vol. 17, no. 4, pp. 1989–2016, 2015.
[22] Y. Fang, S. C. Liew, and T. Wang, “Design of distributed protograph LDPC Codes for multi-relay coded-cooperative networks,” IEEE Transactions on Wireless Communications, pp. 7235–7251, 2017.
[23] H. Tyagi and A. Vardy, “Universal hashing for information-theoretic security,” Proceedings of the IEEE, vol. 103, no. 10, pp. 1781–1795, 2015.
[24] C. H. Bennett, G. Brassard, C. Crépeau, and U. M. Maurer, “Generalized privacy amplification,” IEEE Transactions on Information Theory, vol. 41, no. 6, part 2, pp. 1915–1923, 1995.
[25] M. Hayashi, “Exponential decreasing rate of leaked information in universal random privacy amplification,” IEEE Transactions on Information Theory, vol. 57, no. 6, pp. 3989–4001, 2011.
[26] M. Hayashi and T. Tsurumaru, “More efficient privacy amplification with less random seeds via dual universal hash function,” IEEE Transactions on Information Theory, vol. 62, no. 4, pp. 2213–2232, 2016.
[27] J. Thorpe, “Low-density parity-check (LDPC) codes constructed from protographs,” IPN progress report, vol. 42, no. 154, pp. 42–154, 2003.
[28] S. Ten Brink and G. Kramer, “Design of repeat-accumulate codes for iterative detection and decoding,” IEEE Transactions on Signal Processing, vol. 51, no. 11, pp. 2764–2772, 2003.
[29] T. J. Richardson and R. L. Urbanke, “Efficient encoding of low-density parity-check codes,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 638–656, 2001.
[30] A. Bogdanov, M. C. Mertens, C. Paar, J. Pelzl, and A. Rupp, “Smith - a parallel hardware architecture for fast Gaussian elimination over GF(2),” in Workshop on Special-Purpose Hardware for Attacking Cryptographic Systems, 2006.
[31] Z. Li, L. Chen, L. Zeng, S. Lin, and W. H. Fong, “Efficient encoding of quasi-cyclic low-density parity-check codes,” IEEE Transactions on Communications, vol. 54, no. 1, pp. 71–81, 2006.
[32] Q. Huang, L. Tang, S. He, Z. Xiong, and Z. Wang, “Low-complexity encoding of quasi-cyclic codes based on Galois Fourier transform,” IEEE Transactions on Communications, vol. 62, no. 6, pp. 1757–1767, 2014.
[33] Y.-L. Ueng, B.-J. Yang, C.-J. Yang, H.-C. Lee, and J.-D. Yang, “An efficient multi-standard LDPC decoder design using hardware-friendly shuffled decoding,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 60, no. 3, pp. 743–756, 2013.
[34] Q. Huang, L. Song, and Z. Wang, “Set message-passing decoding algorithms for regular non-binary LDPC codes,” IEEE Transactions on Communications, vol. 65, no. 12, pp. 5110–5122, 2017.

Hindawi
Wireless Communications and Mobile Computing
Volume 2018, Article ID 6101853, 9 pages
https://doi.org/10.1155/2018/6101853

Research Article

Research and Implementation of Rateless Spinal Codes Based Massive MIMO System

Liangliang Wang,1,2 Xiang Chen,1,2 and Hongzhou Tan1

1 School of Electronic and Information Technology, Sun Yat-sen University, Guangzhou 510006, China 2Key Lab of EDA, Research Institute of Tsinghua University in Shenzhen (RITS), Shenzhen 518075, China

Correspondence should be addressed to Xiang Chen; [email protected]

Received 23 November 2017; Revised 1 February 2018; Accepted 28 February 2018; Published 2 April 2018

Academic Editor: Zesong Fei

Copyright © 2018 Liangliang Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The potential performance gains promised by massive multi-input and multioutput (MIMO) rely heavily on the access to accurate channel state information (CSI), which is difficult to obtain in practice when channel coherence time is short and the number of mobile users is huge. To make the system with imperfect CSI perform well, we propose a rateless codes-aided massive MIMO scheme, with the aim of approaching the maximum achievable rate (MAR) as well as improving the achieved rate over that based on the fixed-rate codes. More explicitly, a recently proposed family of rateless codes, called spinal codes, are applied to massive MIMO systems, where the spinal codes bring the benefit of approximately achieving the MAR with sufficiently large encoding block size. In addition, the multilevel puncturing and dynamic block-size allocation (MPDBA) scheme is proposed, where the block sizes are determined by user MAR to curb the average retransmission delay for successfully decoding the messages, which further enhances the system retransmission efficiency. Multilevel puncturing, which is MAR dependent, narrows the gap between the system MAR and the related achieved rate. Theoretical analysis is provided to demonstrate that spinal codes with the MPDBA can guarantee the system retransmission efficiency as well as achieved rate, which are also verified by numerical simulations. Finally, a simplified but comparable MIMO testbed with 2 transmit antennas and 2 single-antenna users, based on NI Universal Software Radio Peripheral (USRP) and LabVIEW communication toolkits, is built up to demonstrate the effectiveness of our proposal in realistic wireless channels, which is easy to be extended to massive MIMO scenarios in future.

1. Introduction

Massive multi-input and multioutput (MIMO), achieving high spectral efficiency and low power consumption, has been widely regarded as a promising technique for 5G wireless communication systems. However, its benefits rely heavily on the accuracy of the channel state information (CSI) used to perform the multiuser precoding. Unfortunately, the collection of accurate CSI is costly because of the short channel coherence time and the huge number of mobile users. In fact, pilot contamination is an essential factor resulting in imperfect CSI, and pilot contamination appears to have a more profound effect than in classical MIMO [1, 2]. Therefore, a critical question is how to improve the throughput performance of massive MIMO systems under imperfect CSI. Traditional fixed-rate codes, such as LDPC [3] and Turbo codes [4], may suffer from significant throughput loss resulting from the rate mismatching under inaccurate CSI. Therefore, developing a resilient transmission scheme for massive MIMO in the presence of imperfect CSI is of great importance.

In fact, for multiuser cellular systems, where the base station is not equipped with a large number of antennas, it has been proved that rateless codes perform well under inaccurate CSI in [5, 6]. Therefore, a rateless codes based transmission scheme for massive MIMO is considered in this paper. When CSI is not available at the transmitter, rateless codes with adaptive code rates perform well; some related works about rateless codes, such as spinal codes, strider codes, and raptor codes, have been presented in [7–9]. In [7], the authors proved that spinal codes outperform LDPC codes as well as strider and raptor codes in fading channels with inaccurate CSI. Therefore, spinal codes are integrated into the resilient transmission scheme, where spinal encoders encode users' original messages into multiple infinite streams of symbols, which are transmitted continuously through a number of (re)transmissions, called passes, until the senders receive the acknowledgement (ACK) indicating successful decoding from the receivers. Thus, the system will benefit from the multi-retransmission diversity and achieve a robust performance. Once the encoding block sizes for spinal codes are allowed to be sufficiently large, users will gradually approach their MAR with the maximum-likelihood (ML) decoding algorithm [10].

For classical MIMO, nonlinear precoding techniques, such as dirty-paper-coding (DPC) [11], vector perturbation (VP) [12], and lattice-aided [13], have better performance. However, with antennas increasing at the base station in massive MIMO, linear precoders, such as zero forcing (ZF) [14], have limited throughput loss compared with nonlinear precoders [15]. ZF has low complexity and is more practical for massive MIMO. Therefore, ZF is considered in our work to mitigate multiuser interferences.

However, in order to guarantee the system retransmission efficiency (measured in bits per symbol per second), reducing the retransmission delay, which is equivalent to reducing the pass number, should be considered before spinal codes are integrated into massive MIMO. Reference [16] studied this problem from the link layer; however, the user-schedule scheme involved in the link layer will bring some extra complexity. To that end, an easily implemented scheme in the physical layer is proposed in this paper. From [10], it is known that the pass number is proportional to the block size while being inversely proportional to the maximum achievable rate (MAR). Therefore, once we initialize the encoders for all users with a large enough static block size for MAR-approaching purpose, the users with lower MARs, located at the cell edge or trapped in a deep fading channel, will significantly enlarge the average pass number for decoding, which further degrades the system retransmission efficiency performance. Because of this, we developed a multilevel puncturing and dynamic block-size allocation (MPDBA) scheme, where the MARs are obtained as a priori knowledge to determine the block sizes dynamically, which can reduce the pass number as well as retransmission delay. A different puncturing method is implemented for different MARs, which is proved to guarantee the system retransmission efficiency and achieved rate. The numerical simulation results show that spinal codes with MPDBA can make massive MIMO with imperfect CSI work reliably with efficient retransmission.

Moreover, in order to make the spinal codes based massive MIMO practical, we also consider building up a system with 2 transmit antennas and 2 single-antenna users, where NI USRP and LabVIEW communication toolkits are involved as the hardware and software platforms, respectively. In this implementation demo system, the retransmission efficiency of our proposal is verified in fading environments. This demo can also be quickly extended to massive MIMO scenarios by massive hardware scale expansion.

The rest of this paper is organized as follows: Section 2 presents the system model. The efficient transmission scheme for the system with spinal codes is proposed in Section 3. Section 4 shows some simulation results to illustrate the benefits of the proposed scheme, and Section 5 presents some details about the USRP based spinal codes MIMO demo system. Finally, conclusions are drawn in Section 6.

2. System Model

Throughout this paper, uppercase boldface letters are used to denote matrices. $(\cdot)^H$ and $(\cdot)^T$ represent the Hermitian transpose and transpose, respectively. $E[\cdot]$ denotes the expectation operator. $\mathrm{Tr}(\cdot)$ stands for trace, $[\cdot]$ is the round to integer operator, and $\lceil\cdot\rceil$ is the round up to integer operator.

An $N \times M$ downlink massive MIMO system is shown in Figure 1, where the single-cell system has a base station with $M$ antennas and $N$ single-antenna users, $M \gg N$. For user $n$, $n = 1, 2, \ldots, N$, $U_n$ is used to denote an unexpected message. The spinal encoder [7], illustrated in Figure 2, divides $U_n$, denoted by message $i$, $i = 1, 2, \ldots$, into $B$ blocks with size $k_n(i)$ bits; a hash function uses each block to generate a sequence of spine values; then the modulated symbols are yielded by a mapper and pseudorandom number generator (RNG) function, which uses spine values as its seeds. After encoding, ZF maps the modulated symbols to the precoded symbols, which are transmitted over fading channels. If decoding succeeds, the receiver will send the ACK to the corresponding sender through the feedback channel; otherwise, the sender will generate more redundant symbols to transmit in the following passes until receiving the ACK.

Figure 1: Spinal codes based massive MIMO systems. (Block diagram: for each user n = 1, ..., N, a spinal encoder maps the message U_n to symbols S_{n,l}; the precoder produces the transmit signals X_{1,l}, ..., X_{M,l}; each user's spinal decoder recovers the message from the received signal Y_{n,l} and returns an ACK over the feedback channel.)
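The spinal encoding procedure described above (a hash chain over message blocks producing spine values, which then seed an RNG that emits one symbol per block in every pass) can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 stands in for the hash function, Python's `random.Random` stands in for the RNG, and the "mapper" simply returns an integer symbol index. None of these choices are taken from the paper, and the function names are hypothetical.

```python
import hashlib
import random


def spine_values(message_bits, block_size, initial=b"\x00"):
    """Chain a hash over consecutive blocks: v_b = hash(v_{b-1}, block_b)."""
    values = []
    state = initial
    for start in range(0, len(message_bits), block_size):
        block = message_bits[start:start + block_size]
        state = hashlib.sha256(state + block.encode("ascii")).digest()
        values.append(state)
    return values


def encode_passes(message_bits, block_size, num_passes, levels=16):
    """Generate `num_passes` passes, each carrying one symbol index per spine value.

    Every spine value seeds its own generator, and each generator keeps
    producing fresh symbols in later passes, which is what makes the code rateless.
    """
    generators = [random.Random(v) for v in spine_values(message_bits, block_size)]
    return [[g.randrange(levels) for g in generators] for _ in range(num_passes)]


# Toy message of 16 bits split into B = 4 blocks of 4 bits (assumed sizes).
passes = encode_passes("1011001110001111", block_size=4, num_passes=3)
```

Because each spine value depends on all earlier blocks, changing an early bit changes every later spine value and therefore every later symbol stream, while changing only the last block leaves the earlier spine values untouched. In a full system the sender would keep calling the generators for additional passes until the ACK arrives.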

estimation schemes. According to the channel estimation Message Message ··· Message ··· ̂ 1 2 I model described in [14], H� canbegivenby ̂ H� = H� +��ΔH�, (3) Block Block ··· Block 1 2 B �×� where ΔH� ∈ C denotes the channel estimation error Vn,0 Vn,1 Vn,2 Vn,B−1 matrix, with zero mean and unit variance i.i.d. complex Hash Hash ··· Hash function function function Gaussian random variables uncorrelated with that of H�,and Vn,1 Vn,2 Vn,B �� is a nonzero parameter to measure the quality of channel RNG & RNG & ··· RNG & mapper mapper mapper estimation and is appropriately chosen depending on the channel dynamics. Since the focus of this paper is to study the performance of spinal codes, �� isassumedtobeknown S S ··· S Pass 1 n,1,1 n,2,1 n,B,1 at transmitter. Pass 2 Sn,1,2 Sn,2,2 ··· Sn,B,2 Based on (3), the ZF precoding matrix can be expressed ...... as

Pass  Sn,1, Sn,2, ··· Sn,B, � W� =�� (H� +��ΔH�) ...... (4) � −1 ⋅[(H� +��ΔH�)(H� +��ΔH�) ] , Figure 2: Structure of spinal encoder. substituting (4) into (2), the �th symbol vector at �th pass is given by Given ��(�) ≥ 1 is the pass number for user � to � � �= −1 ̂� ̂ ̂� −1 successfully decode the message .At th pass, for Y�� = S�� +�� N�� −��ΔH�H� (H�H� ) S��, (5) 1,2,...,��(�),allusers’modulatedsymbolsaremappedto �×� S� =(S1�, S2�,...,S��) S�� = ̂� ̂ ̂� −1 a matrix of symbols . where ��ΔH�H (H�H ) S�� is the additional interference (� ,� ,...,� )� � = 1,2,...,� � � 1�� 2�� ��� ,for , follows Gaussian item brought by the channel estimation error. (S S�)=� � distribution and Tr �� �� �, � is total transmit power. Once �=��(�), spinal decoder for user � collects enough Assuming CSI is prior knowledge obtained by channel mutual information to retrieve the message � successfully [7]. estimation methods at transmitter, and downlink MIMO Ten the user current data rate (measured in bits per symbol) channel remains ergodic and stationary during � symbols is achieved by �×� periods, S� is then precoded to transmit signal X� ∈ C � (�) by � �� (�) = , (6) �� (�) X� =��W�S�, (1) based on (6), the average achieved rate of spinal codes W = Ĥ�(Ĥ Ĥ�)−1 based massive MIMO is given by �achieved = lim�→∞(1/ where � � � � denotes the ZF precoding matrix ∞ � ̂ �×� �) ∑ ∑ ��(�). for �th pass, H� ∈ C denotes the estimated channel �=1 �=1 � √ � Based on (5), assuming that �� is the average signal matrix, and �� = �/Tr(W�W� ) is power control factor used � power for the desired symbols ���� ∈ S��,wehave∑�=1 ��� = Ŵ (X X�)=� � to normalize � such that Tr �� �� �. 
Tr(S��S�� )=��.Denoting X� is then transmitted through a MIMO fading channel, �×� −1 and the received signal matrix Y� ∈ C is given by ̂ −1 ̂� ̂ ̂� N�� =�� N�� −��ΔH�H� (H�H� ) S�� (7) −1 −1 Y� =�� H�X� +�� N�, (2) ̂ asthereceivednoise,thecovarianceforN�� can be proved to 1/2 be where H� = D� G� denotes the MIMO channel matrix, �×� �×� �[N̂ N̂�]=�−2I + �−1� �2, G� ∈ C and D� ∈ R represent the complex small- �� �� � � � � � (8) scale and large-scale fading coefcients matrix, respectively. ̂ ̂� Assuming that G� has zero mean and unit variance inde- where �� = diag(�1�,...,���) is the singular value of H�H� pendent and identically distributed (i.i.d.) complex Gaussian and is approximated by entries, large-scale fading coefcients, denoted as D� = 2 diag(�1�,�2�,...,���), are the same for diferent antennas but �� ≈�(D� +�� I�). (9) �×� user-dependent. N� ∈ C denotes the noise matrix with i.i.d. complex Gaussian random variables with zero mean and Proof. Assuming that �≫�in massive MIMO system, let ̂ ̂� unit variance. Ten the average transmit SNR is ��/�. the singular value decomposition (SVD) of matrix H�H� be ̂ ̂ ̂� � IfperfectCSIisavailableattransmitter,thenH� = H�H� = U���V� ,whereU� and V� are unitary matrix and �� H�. However, imperfect CSI always arises in any practical is a diagonal matrix. 4 Wireless Communications and Mobile Computing

̂ −1 ̂� ̂ ̂� −1 upper upper Given N�� =�� N�� −��ΔH�H� (H�H� ) S��,thecovari- Terefore, we can obtain �� (1 − �� (1 + �(1))/��(�)) ≤ ̂ � (�) ≤ �upper ance of N�� canbecomputedas � � , which can be further expressed as upper 2 � −2 � (� ) (1+�(1)) �[N̂ N̂ ]=� �[N N ] upper � �� �� � �� �� 0≤�� −�� (�) ≤ . (15) �� (�) 2 ̂� ̂ ̂� −1 � ̂ ̂� −1 +� �[ΔH�H (H�H ) S��S (H�H ) upper � � � �� � (10) From (15), there exists �1 to satisfy �� −��(�) ≤ upper 2 upper �1((�� ) /��),andwecanobtain�� −��(�) = ⋅ Ĥ ΔH�]=�−2I + �−1� �2, upper 2 � � � � � � � �((�� ) /��),where�(⋅) denotes the upper bound of upper �� −��(�). ̂� ̂ ̂� −1 � whereweusethefactthat�[ΔH�H� (H�H� ) S��N�� ]=0and � Tis lemma indicates that when block size ��(�) = �max, �[N��N�� ]=I�. where �max denotes the sufciently large block size in this In massive MIMO, when �≫�, �→∞,the upper � paper, the gap between ��(�) and �� can be arbitrarily rows of G� are asymptotically orthogonal; that is, G�G� ≈ upper small such that each user can approach � with ��(�),and �I Ĥ Ĥ� =(D1/2G +�ΔH )(D1/2G + � �.Hencewehave � � � � � � � � spinal codes based massive MIMO approaches �upper as well. � 2 2 ��ΔH�) ≈�(D� +�� I�) and �� ≈�(D� +�� I�). When the small large-scale fading coefcients ��� deteri- upper orate �� in (11), based on (12), ��(�) = �max will enlarge Basedon(8)and(9),theaverageMARforuser� is given ��(�) signifcantly. If each pass occupies the constant time by � (measured in seconds), then the larger ��(�) makes the average retransmission delay spent on decoding the message ��2� � be costly, and the average system retransmission efciency, �upper =�[ (1+ � �� )] . � log2 2 2 2 −1 (11) defned as �+�� ���� (��� +�� ) ∞ � 1 �� (�) �= lim ∑ ∑ , (16) Tus,theupperboundofergodicachievablerateformassive �→∞ � �=1 �=1 ��� (�) MIMO with ZF under imperfect CSI is expressed as �upper = � upper � (�) = � (�)/�� (�) ∑�=1 �� . will be limited as well. We use � � � to denote current retransmission efciency for each user. ����� Lemma 1. 
If ��(�) is sufciently large and ��(�) ≥ �� , Te greedy idea is an efective way to achieve a preferable then there exists (�1(�), �2(�),...,��(�)) for users to achieve system retransmission efciency across diferent SNRs. Tat ����� ����� 2 � (�) � �� −��(�) = �((�� ) /��(�)). is,wetrytomakesurethat � for each message is better, which will further achieve a satisfed value for �.When��(�) Proof. From [10], ��(�) can be expressed as is ideal, to decrease ��(�) is an efective way to enhance ��(�). Terefore, based on (12), ��(�) can be dynamically upper �� (�) determined by �� to reduce ��(�). �� (�) =⌈ upper ⌉+�(1) . (12) �� 3. Multilevel Puncturing and Dynamic Substituting (12) into (6), the corresponding data rate is Block-Size Allocation Scheme rewritten as � (�) In this section, a multilevel puncturing and dynamic block- � (�) = � . � ⌈� (�) /�upper⌉+�(1) (13) size allocation scheme is proposed. For each user, the � � dynamic block-size allocation problem is formulated as [� (�)/�upper]≤�(�)/�upper ∗ Based on the fact that � � � � ,wehave �� (�) = arg min �� (�) , upper (17) �� ≤��(�)<�max � (�) � (�) = � �∗(�) � � ⌈� (�) /�upper⌉+�(1) where � is the desired block size for message . � � However, we cannot obtain the solution for (17) by solving � (�) (12) directly, which includes an unexpected constant item = � �(1) [� (�) /�upper]+1+�(1) .Instead,weusethecumulativedistributionfunction � � � (�) �∗(�) (14) (CDF) of pass numbers � to determine � .Figure3 � (�) presents the CDF of pass numbers corresponding to the ≥ � upper diferent block sizes and SNRs, from which we obtain that, �� (�) /�� +1+�(1) under diferent SNRs, smaller block sizes guarantee the �upper system to decode successfully with less pass number and high = � , upper probability. Terefore, we choose 1+�� (1+�(1)) /�� (�) ∗ upper upper upper �� (�) =⌈�� ⌉ (18) when ��(�)→∞, �� (1 + �(1))/��(�) → 0, �� /(1 + upper upper upper �� (1 + �(1))/��(�)) ≈ �� (1 − �� (1 + �(1))/��(�)). 
for each spinal encoder for encoding the message �. Wireless Communications and Mobile Computing 5
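A small numerical sketch may help make Lemma 1 concrete. It assumes the reading of (12) and (13) in which the pass count is the ceiling of the block size over the MAR plus a constant, and the achieved rate is the block size divided by the pass count. The names `block_size`, `mar`, and `overhead` (a fixed stand-in for the O(1) term) are illustrative and do not come from the paper.

```python
import math


def pass_count(block_size, mar, overhead=1):
    """Passes needed under the assumed form of (12): ceil(K / R_upper) + c."""
    return math.ceil(block_size / mar) + overhead


def achieved_rate(block_size, mar, overhead=1):
    """Rate under the assumed form of (13): block size divided by pass count."""
    return block_size / pass_count(block_size, mar, overhead)


def rate_gap(block_size, mar, overhead=1):
    """Gap between the MAR and the achieved rate."""
    return mar - achieved_rate(block_size, mar, overhead)


def gap_bound(block_size, mar, overhead=1):
    """Upper bound on the gap in the form of (15): R_upper^2 * (1 + c) / K."""
    return mar ** 2 * (1 + overhead) / block_size


if __name__ == "__main__":
    mar = 3.7  # hypothetical MAR in bits/symbol, chosen only for illustration
    # The first block size is the dynamic choice ceil(R_upper) from (18).
    for block_size in (math.ceil(mar), 8, 64, 512):
        print(block_size,
              pass_count(block_size, mar),
              round(rate_gap(block_size, mar), 4),
              round(gap_bound(block_size, mar), 4))
```

Under these assumptions the gap shrinks roughly in proportion to the inverse of the block size while the pass count grows with it, which is the trade-off that the dynamic block-size choice in (18) targets.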

1 Proof. From [10], ��(�) under MPDBA scheme is given by

0.9 ∗ �� (�) �sub,� (�) 0.8 �� (�) =⌈ upper ⌉−�sub,� (�) +1+�(1) . (21) �� 0.7 0.6 Substituting (21) into (20), the current data rate under MPDBA scheme is achieved as 0.5 CDF �∗ (�) � (�) 0.4 � (�) = � sub,� , � ∗ upper (22) ⌈� (�) � (�) /�� ⌉+�(1) 0.3 � sub,� ∗ upper 0.2 when �� (�) = ⌈�� ⌉, (22) can be further written by 0.1 �� (�) 0

0 1 2 3 4 5 6 7 8 9 upper

10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 [� ]� (�) +� (�) (23) = � sub,� sub,� . Pass number necessary for decoding upper upper upper [[�� ]�sub,� (�) /�� +�sub,� (�) /�� ]+1+�(1) OJJ?L Kn(I) = Rn , SNR = 10 dB OJJ?L Basedonthefactthat Kn(I) = Rn , SNR = 20 dB upper Kn(I) = KG;R, SNR = 10 dB [� ]� (�) � (�) � sub,� + sub,� −1 Kn(I) = KG;R, SNR = 20 dB upper upper �� �� Figure 3: CDF curves of pass number needed for successfully upper upper [�� ]�sub,� (�) �sub,� (�) decoding under diferent SNRs, �max >� �� = 0.1. � , <[ upper + upper ] (24) �� �� [�upper]� (�) � (�) � (�) = �∗(�) � sub,� sub,� When � � , according to the proof of Lemma ≤ upper + upper , 1,thegapbetweentheMARandtheachievedrateisgivenby �� �� upper upper we can obtain �� −�� (�) =�(�� ) , (19) upper [�� ]�sub,� (�) +�sub,� (�) which indicates that the dynamic block-size scheme will upper upper upper [�� ]�sub,� (�) /�� +�sub,� (�) /�� +1+�(1) cause an unavoidable achieved-rate performance loss at upper higher �� .Toreducethelossasmuchaspossible,a ≤�� (�) (25) multilevel puncturing method is implemented for spinal [�upper]� (�) +� (�) codes. < � sub,� sub,� . [�upper]� (�) /�upper +� (�) /�upper +�(1) Assume message � with � bits is encoded by spinal � sub,� � sub,� � ∗ encoder with desired encoding block size ��(�) = �� (�) and yields �=�/��(�) coded symbols. If these � symbols From (25), we have �=1 transmitted in the frst pass, that is, ,cannotbe �upper� (1) used to retrieve message successfully, then, from the second � <�upper −� (�) � (�)(2+�(1)) � � transmit pass, that is, �>1, each � symbol is transmitted sub,� � (�) �/� (�)� (�) (26) within sub,� subpasses with � sub,� symbols �upper (1+�(1)) transmitted at each subpass. < � . 
� (�) When �=�sub,�(�), � bits can be retrieved successfully sub,� �/� (�) + (� (�) − 1)(�/� (�)� (�)) by � � � sub,� symbols, and the upper � From (26), there exist �2 and �3 to satisfy �2(�� /�sub,�(�)) ≤ user currentratein(6)canberewrittenas upper upper �� −��(�) ≤ �3(�� /�sub,�(�)).Tuswehave �� (�) �sub,� (�) � (�) = , �upper � � (�) +� (�) −1 (20) upper � sub,� � �� −�� (�) =�( ) , (27) �sub,� (�) where �sub,�(�)∈{1,2,...,�}and is determined according to where �(⋅) is used to denote the lower and upper bound of the practical cases. upper �� −��(�). 1+�(1)<� (�) ≤ Teorem 2. Consider that the symbols of MPDBA based From (21), the pass number satisfes � ∗ upper �sub,�(�)/� +2+�(1) spinal codes with parameter �� (�), �� are transmitted over a � ; therefore we have MIMO channel with imperfect CSI, then the system can achieve ����� ����� ����� 2 �sub,� (�) �� −��(�) = �(�� /����,�(�)) with ��(�) ≥ �((�� ) / � (�) =�( ). � �upper (28) ����,�(�)). � 6 Wireless Communications and Mobile Computing

Based on (27) and (28), there exist $c_4$ and $c_5$ such that the minimum achieved data rate and the maximum pass number can be given by $R_n^{\mathrm{upper}} - c_4 (R_n^{\mathrm{upper}}/L_{\mathrm{sub},n}(I))$ and $c_5 (L_{\mathrm{sub},n}(I)/R_n^{\mathrm{upper}})$, respectively. Therefore, the lower bound of $K_n(I)$ is given by

$$K_n^{\mathrm{low}}(I) = \frac{1}{c_5}\,\frac{(R_n^{\mathrm{upper}})^2}{L_{\mathrm{sub},n}(I)} - \frac{c_4}{c_5}\,\frac{(R_n^{\mathrm{upper}})^2}{(L_{\mathrm{sub},n}(I))^2}, \tag{29}$$

which can be written as $K_n^{\mathrm{low}}(I) = \Theta((R_n^{\mathrm{upper}})^2/L_{\mathrm{sub},n}(I))$.

This theorem indicates that, for the transmission scheme in massive MIMO with MPDBA based spinal codes, we can choose a proper $L_{\mathrm{sub},n}(I)$ (i.e., lower $L_{\mathrm{sub},n}(I)$ for lower $R_n^{\mathrm{upper}}$ and higher $L_{\mathrm{sub},n}(I)$ for higher $R_n^{\mathrm{upper}}$) for different $R_n^{\mathrm{upper}}$ such that the gap between the achieved rate and the MAR is limited and better retransmission efficiency is guaranteed under different $L_{\mathrm{sub},n}(I)$.

4. Simulation Results and Analysis

In this section, some numerical simulation results are illustrated to verify the performance of the MPDBA based spinal codes scheme for a 4×64 MIMO system, $K_{\max} = 20$.

Figure 4 compares the average achieved rates of massive MIMO based on spinal codes and LDPC codes, respectively. The LDPC codes are from 802.16e, decoded with a powerful decoder. We can obtain that spinal codes outperform LDPC codes across all SNRs. Moreover, the simple decoder structure of spinal codes also avoids the demapping complexity caused by the different constellation mapping operations of LDPC codes.

Figure 4: Average rates achieved by spinal codes and LDPC codes based massive MIMO, $\sigma_e = 0.1$. (Axes: rate in bits/symbol/user versus transmit SNR in dB. Curves: Shannon bound; spinal codes with $K_n(I) = K_n^*(I)$, $L_{\mathrm{sub},n} = 8$; LDPC with rate 1/2 BPSK, rate 1/2 QPSK, rate 1/2 16 QAM, rate 3/4 16 QAM, rate 2/3 64 QAM, rate 3/4 64 QAM, and rate 5/6 64 QAM.)

In Figure 5, we compare the average achieved rates of spinal codes based massive MIMO with $K_n(I) = K_{\max}$ and $K_n(I) = K_n^*(I)$, respectively. Massive MIMO can approach its MAR gradually by enlarging the block size, which verifies the conclusion in Lemma 1. We also obtain that the achieved-rate performance of spinal codes with $K_n(I) = K_n^*(I)$ follows the conclusions in Theorem 2; that is, a larger $R_n^{\mathrm{upper}}$ will cause a bigger achieved-rate performance loss, especially for lower $L_{\mathrm{sub},n}(I)$.

Figure 5: Average rates achieved by spinal codes based massive MIMO, $K_{\max} > K_n^*$, $L_{\mathrm{sub},n}(I) = 8$. (Axes: rate in bits/symbol/user versus SNR in dB. Curves: $R_n^{\mathrm{upper}}$, $K_n(I) = K_n^*(I)$, and $K_n(I) = K_{\max}$, each for $\sigma_e = 0$ and $\sigma_e = 0.1$.)

Figure 6 shows that we can enlarge $L_{\mathrm{sub},n}(I)$ to minimize the gap between the MARs and the rates achieved by spinal codes with MPDBA. For lower $R_n^{\mathrm{upper}}$, a smaller $L_{\mathrm{sub},n}(I)$ can achieve a satisfying achieved-rate result. For higher $R_n^{\mathrm{upper}}$, a larger $L_{\mathrm{sub},n}(I)$ is necessary for spinal codes to achieve the MAR.

Figure 7 shows that the application of MPDBA for spinal encoders will enhance the system retransmission efficiency. Different proper $L_{\mathrm{sub},n}(I)$ for the MPDBA based spinal codes can achieve preferable retransmission efficiency, which also indicates that once the system achieves better retransmission efficiency, there will be more proper $L_{\mathrm{sub},n}(I)$ to choose from to obtain the better achieved-rate performance illustrated in Figure 6. The numerical results in Figures 6 and 7 verify Theorem 2; that is, spinal codes with MPDBA can guarantee the system reliability as well as retransmission efficiency.

5. Simplified Spinal Codes Based MIMO Demo System with NI USRP 2920

This section will present some details about making spinal codes based massive MIMO practical from theory. Some works about implementation of massive MIMO prototyping

system have been presented in [17], where large-scale antennas are used to verify the significant gains in data rates. A rapid prototyping system architecture is proposed in [18], where the proposed system has high scalability in terms of the number of antennas. In order to verify the performance of spinal codes based MIMO with imperfect CSI, equipping large-scale antennas is unnecessary and also too expensive. Therefore, we first only consider a system with 2 transmit antennas and 2 single-antenna users as a simplified and comparable case of the massive MIMO. We use NI Universal Software Radio Peripheral (USRP, here we use USRP 2920 as the basic module) and LabVIEW communication toolkits to build up the testbed, which is used to verify the benefits presented in our theoretical analysis. Owing to the decoupled design of software and hardware, the simplified MIMO can be easily deployed and expanded to a massive MIMO system with few extra operations.

Figure 6: The effects of $L_{\mathrm{sub},n}(I)$ on average rates achieved by spinal codes based massive MIMO. (Axes: rate in bits/symbol/user versus SNR in dB. Curves: $R_n^{\mathrm{upper}}$, $L_{\mathrm{sub},n}(I) = 8$, and $L_{\mathrm{sub},n}(I) = 16$, each for $\sigma_e = 0$ and $\sigma_e = 0.1$.)

Figure 7: Average system retransmission efficiency, $K_{\max} > K_n^*(I)$, $\sigma_e = 0.1$. (Axes: average rate for each pass in bits/symbol/pass versus SNR in dB. Curves: $K_n(I) = K_n^*(I)$ and $K_n(I) = K_{\max}$, each for $L_{\mathrm{sub},n}(I) = 8$ and $L_{\mathrm{sub},n}(I) = 16$.)

5.1. Simplified MIMO System Architecture. The MIMO system architecture is shown in Figure 8. In our demo system, specific MIMO cables are used to synchronize the clock sources between the 2 transmitters and between the 2 receivers, respectively [19]. Then, they are connected to one host computer through an Ethernet switch. By using a single host computer, the CSI can be obtained perfectly from the uplink to perform precoding, and additional CSI errors can also be added.

We consider a Time Division Duplexing (TDD) system, where two users separately send orthogonal pilots to the base station in the first time slot, and then the received pilots at the base station are used to estimate the uplink channel matrix. The base station then calculates the precoding matrix with the estimated CSI (in general imperfect CSI, as presented above in this paper) and precodes the symbols after the parallel spinal encoders, as Figure 1 shows. Precoded symbols are then transmitted through the downlink channel to users in the second time slot. Each user tries to retrieve the original messages from the received symbols and then sends an ACK to the base station if the messages were decoded successfully. Figure 9 illustrates the user interface in the host computer. Three parts are mainly included: parameter configuration (located on the left), retrieved-message display (in the middle), and numerical results display (on the right).

The decoupled design of software and hardware makes it easy to customize a USRP as transmitter or receiver. USRP supports the input of a synchronization signal from external clock sources, so more than 2 well-functioning USRPs can be synchronized to build up a system with a large number of transmit antennas and receivers. Therefore, both the function designs and the devices support the scalability from a 2×2 MIMO system to a massive MIMO system.

5.2. Radio Frequency Calibration. In our implemented TDD system, as Figure 8 shows, the uplink and downlink channels are assumed to be reciprocal. However, real channels are not reciprocal due to the differences in transmitter and receiver hardware. An internal calibration method [20] is considered in our work, where we first find a reference radio; then if we know the calibration coefficient between each of two radios and the reference radio, we can derive the direct calibration coefficient between them. With these calibration coefficients, the real channels will be derived.

5.3. Demo System's Signal Frame Structure Design. Figure 10 gives a simple frame structure design based on TDD mode for the 2×2 demo system. At the uplink scheduling slot, we use the preamble to obtain the start point of data and synchronize the system frequency; the orthogonal pilots are transmitted through the uplink sound channel (UL-SCH) to obtain the CSI.

Figure 8: Simplified 2×2 MIMO demo system layout. (Diagram: USRP Tx1 and USRP Tx2 linked by a MIMO cable; USRP Rx1 and USRP Rx2 linked by a MIMO cable; a Gigabit Ethernet switch connected to the host computer by a network cable.)

Figure 9: Simplified 2×2 MIMO demo system's user interface in the host computer.

At the downlink scheduling slot, users' messages are precoded and transmitted through the downlink traffic channel (DL-TCH). UGI and DGI are guard intervals in the uplink and downlink channel, respectively.

Figure 10: Data frame structure. (Uplink scheduling: Preamble, UL-SCH, UGI. Downlink scheduling: DL-TCH, DGI.)

5.4. Signal Detection Implementation Test Results. Due to the limitations of complicated field test conditions, we first test the above demo system's performance in fading channels using the channel simulator in the LabVIEW software environment. The uplink and downlink signals are truly transmitted by radio. However, because the transmit and receive antennas are so close together, we have to pass the signals through fading channels in the LabVIEW software environment. The detailed parameters of the fading channels are presented in Table 1. Since the fading channel remains stationary, we use the real achieved rate to replace the MAR to obtain the dynamic block size.

Table 1: Detailed parameters of the system.

Parameter               Value
Sample rate (MHz)       1
Number of multipaths    3
Path delay (us)         0, 0.01, 0.02
Path power (dB)         0, −0.9, −1.7

The retransmission times of our proposed MPDBA and of the fixed block-size scheme are compared in Table 2. From this table, it can be seen that MPDBA clearly reduces the retransmission times in different SNR scenarios compared with the fixed block-size scheme. It also helps to verify the retransmission efficiency improvements of the proposed spinal codes based MIMO scheme in fading channels.

Table 2: Retransmission times (pass number) of 2×2 MIMO.

Block size                      SNR = 0 dB    SNR = 10 dB    SNR = 20 dB
Fixed $K_n(I) = 8$              22            13             10
Dynamic $K_n(I) = K_n^*(I)$     4             3              3

6. Conclusions

The application of spinal codes in massive MIMO with imperfect CSI has great benefits. Before spinal codes are integrated into the system, an efficient transmission scheme for massive MIMO with spinal codes is proposed, where the maximum achievable rate (MAR) is used to determine the block size dynamically to reduce the retransmission delay spent on decoding the messages, and a multilevel puncturing scheme is considered for different MARs to limit the gap between the MAR and the achieved rate. Some theoretical analyses are provided to prove that the multilevel puncturing and dynamic block-size allocation (MPDBA) based spinal codes can guarantee the system achieved rate as well as retransmission efficiency. Numerical simulation results illustrate that spinal codes with MPDBA can improve the achieved rate efficiently over that based on the fixed-rate codes as well as enhancing the system retransmission efficiency with limited achieved-rate performance degradation. Therefore, the proposed method can guarantee the spinal codes based massive MIMO system to be reliable and practical. Finally, we implement a simplified MIMO testbed over NI's USRP platform to verify the proposal's performance improvements, which can help to understand our theoretical analysis and can be easily extended to massive MIMO scenarios.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work has been supported by Science, Technology and Innovation Commission of Shenzhen Municipality (JCYJ20160429170032960), National Natural Science Foundation of China (no. 61501527), State's Key Project of Research and Development Plan (no. 2016YFE0122900-3), the Fundamental Research Funds for the Central Universities, Guangdong Science and Technology Project (no. 2016B010126003), and 2016 Major Project of Collaborative Innovation in Guangzhou (no. 201604046008).

References

[1] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Transactions on Wireless Communications, vol. 9, no. 11, pp. 3590–3600, 2010.
[2] J. Hoydis, S. Ten Brink, and M. Debbah, “Massive MIMO in the UL/DL of cellular networks: how many antennas do we need?” IEEE Journal on Selected Areas in Communications, vol. 31, no. 2, pp. 160–171, 2013.
[3] R. G. Gallager, Low-density parity-check codes, Springer International Publishing, 1960.
[4] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near shannon limit error-correcting coding and decoding: Turbo-codes. 1,” in Proceedings of the IEEE International Conference on Communications, pp. 1064–1070, 1993.
[5] Y. Sun, C. E. Koksal, K.-H. Kim, and N. B. Shroff, “Scheduling of Multicast and Unicast Services under Limited Feedback by using Rateless Codes,” in Proceedings of the IEEE INFOCOM, pp. 1671–1679, Toronto, Canada, 2014.
[6] Y. Sun, C. E. Koksal, S.-J. Lee, and N. B. Shroff, “Network control without CSI using rateless codes for downlink cellular systems,” in Proceedings of the IEEE INFOCOM, pp. 1016–1024, 2013.
[7] J. Perry, P. A. Iannucci, K. Fleming, H. Balakrishnan, and D. Shah, “Spinal codes,” ACM SIGCOMM Computer Communication Review, vol. 42, no. 42, pp. 49–60, 2012.
[8] A. Gudipati and S. Katti, “Automatic rate adaptation and collision handling,” ACM SIGCOMM Computer Communication Review, vol. 41, no. 4, pp. 158–169, 2011.
[9] O. Etesami and A. Shokrollahi, “Raptor codes on binary memoryless symmetric channels,” IEEE Transactions on Information Theory, vol. 52, no. 5, pp. 2033–2051, 2006.
[10] H. Balakrishnan, P. Iannucci, J. Perry, and D. Shah, “De-randomizing shannon: The design and analysis of a capacity-achieving rateless code,” arXiv preprint arXiv:1206.0418, 2012.
[11] M. H. M. Costa, “Writing on dirty paper,” IEEE Transactions on Information Theory, vol. 29, no. 3, pp. 439–441, 1983.
[12] B. M. Hochwald, C. B. Peel, and A. L. Swindlehurst, “A vector-perturbation technique for near-capacity multiantenna multiuser communication-part ii: perturbation,” IEEE Transactions on Communications, vol. 53, no. 3, pp. 537–544, 2005.
[13] C. Windpassinger, R. F. H. Fischer, and J. B. Huber, “Lattice-reduction-aided broadcast precoding,” IEEE Transactions on Communications, vol. 52, no. 12, pp. 2057–2060, 2004.
[14] C. Wang, E. K. S. Au, R. D. Murch, W. H. Mow, R. S. Cheng, and V. Lau, “On the performance of the MIMO zero-forcing receiver in the presence of channel estimation error,” IEEE Transactions on Wireless Communications, vol. 6, no. 3, pp. 805–810, 2007.
[15] F. Rusek, D. Persson, B. K. Lau et al., “Scaling up MIMO: opportunities and challenges with very large arrays,” IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 40–60, 2013.
[16] P. Iannucci, J. Perry, H. Balakrishnan, and D. Shah, “No symbol left behind: A link-layer protocol for rateless codes,” in Proceedings of the International Conference on Mobile Computing and Networking, pp. 17–27, 2012.
[17] X. Yang, W. J. Lu, N. Wang, K. Nieman, S. Jin, and H. Zhu, Design and implementation of a tdd-based 128-antenna massive mimo prototyping system, 2016.
[18] X. Yang, Z. Huang, B. Han et al., “RaPro: A Novel 5G Rapid Prototyping System Architecture,” IEEE Wireless Communications Letters, vol. 6, no. 3, pp. 362–365, 2017.
[19] T. Zhao, Design and implementation of MIMO system based on USRP, Nanjing University of Posts and Telecommunications, June 2016.
[20] S. Clayton, H. Yu, A. Narendra et al., “Argos: practical many-antenna base stations,” in Proceedings of the International Conference on Mobile Computing and Networking, vol. 11, pp. 53–64, 2012.

Hindawi
Wireless Communications and Mobile Computing
Volume 2018, Article ID 7658093, 13 pages
https://doi.org/10.1155/2018/7658093

Research Article Design and Analysis of Adaptive Message Coding on LDPC Decoder with Faulty Storage

Guangjun Ge1,2 and Liuguo Yin 2,3

1 School of Aerospace, Tsinghua University, Beijing, China
2 EDA Laboratory, Research Institute of Tsinghua University in Shenzhen, Shenzhen, China
3 School of Information Science and Technology, Tsinghua University, Beijing, China

Correspondence should be addressed to Liuguo Yin; [email protected]

Received 23 November 2017; Revised 9 February 2018; Accepted 19 February 2018; Published 22 March 2018

Academic Editor: Qin Huang

Copyright © 2018 Guangjun Ge and Liuguo Yin. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Unreliable message storage severely degrades the performance of LDPC decoders. This paper discusses the impacts of message errors on LDPC decoders and schemes improving the robustness. Firstly, we develop a discrete density evolution analysis for faulty LDPC decoders, which indicates that protecting the sign bits of messages is effective enough for finite-precision LDPC decoders. Secondly, we analyze the effects of quantization precision loss for static sign bit protection and propose an embedded dynamic coding scheme by adaptively employing the least significant bits (LSBs) to protect the sign bits. Thirdly, we give a construction of Hamming product code for the adaptive coding and present low complexity decoding algorithms. Theoretic analysis indicates that the proposed scheme outperforms the traditional triple modular redundancy (TMR) scheme in decoding both threshold and residual errors, while Monte Carlo simulations show that the performance loss is less than 0.2 dB when the storage error probability varies from $10^{-3}$ to $10^{-4}$.

1. Introduction

Low-Density Parity-Check (LDPC) codes are widely used in space communications due to their capacity-approaching capabilities [1]. The outstanding performance of LDPC is based on the soft-decoding algorithms [2], which consume a large amount of memory. However, the radiation environment gives rise to memory faults when LDPC decoders are used in spacecraft [3]. Such unreliable storage will severely degrade the performance of LDPC codes. Thus, it is important to consider the robustness of LDPC decoders utilizing unreliable memories.

There are studies on the effects of unreliable hardware on LDPC decoders. Varshney considered the thresholds and residual errors of LDPC codes with faulty Gallager A decoding in the earlier stage [4]. Extended studies on faulty Gallager B decoders were then developed in [5–7]. Besides these bit flipping decoding algorithms, the belief propagation (BP) decoding of LDPC on noisy hardware was studied in [8, 9], where infinite-precision messages with additive Gaussian noise were considered. Finite-precision messages for the min-sum decoding of LDPC were studied in [10–12]. It was shown that quantizing messages with more bits was not always beneficial for LDPC decoders with hardware errors.

In general, the existing works treated each finite-precision message as an integer, while this paper discusses the different impacts of the individual bits of the finite-precision message. We develop a discrete density evolution analysis for LDPC decoders with faulty messages. It indicates that the sign bits of the messages play the most important role in the decoding performance of LDPC codes, which means that protecting the sign bits is efficient enough. To protect the sign bit inside each quantized message, the traditional method is the static triple modular redundancy (TMR) scheme as applied in [13]. However, since two quantization bits are occupied for protecting the sign bit, the TMR scheme is not always beneficial for various storage error levels due to the loss of quantization precision. Analyzing the convergence process of LDPC decoding, together with the results in [12, 14], shows that when the magnitude of a message is small, the precision bits, that is, the least significant bits (LSBs), are nonnegligible for decoding performance, while when the message has a large magnitude value, the sign bit becomes even more critical for the residual errors.

Basedontheaforementionedobservations,wepropose CRAM an adaptive embedded coding scheme for the unreliable . . . m mc . messages to achieve a robust LDPC decoder. First, we M . ΠH(,$0#) . c2 . put the messages into packages by taking advantage of the parallel message architecture of the quasi-cyclic (QC) m2c l LDPC decoders. Te structure of message package permits ··· more efcient block coding schemes for the sign bits other CNU m (M/l) than simple TMR method. Ten, two LSBs are adaptively 2c VRAM employed for sign bits protection based on the magnitude mc2 level of message package. Moreover, we introduce a con- m struction of Hamming product code for the adaptive coding, ··· N/l which has a multistage coding structure and outstanding VNU( ) m error-correcting capability. We also discuss low complexity r LRAM iterative decoding algorithms for the Hamming product md code. Both theoretical analysis and Monte Carlo simulations mo demonstrate that the proposed adaptive message coding scheme outperforms the TMR scheme in decoding both Figure 1: Te partially parallel architecture of LDPC decoders. thresholds and residual errors for various storage error levels. Te paper is organized as follows. In Section 2, the system models are introduced. Section 3 presents the discrete implement LDPC decoders on integrated circuits, all of the density evolution analysis on unreliable LDPC decoders. messages will be quantized into bits. Existing studies [20] Te adaptive message coding scheme and construction of have shown that 4–6 bits’ quantization on messages can pro- Hamming product code are proposed in Section 4. We give vide ideal compromise between complexity and performance the decoding algorithms for adaptive Hamming product for LDPC decoders. Among the quantized bits, one is used codes in Section 5. Monte Carlo simulations are provided in for the sign, while the rest are used for the magnitude value. Section 6. Section 7 concludes the entire paper. 2.2. 
Error Model of Memory. For the existing studies on 2. System Models LDPC decoders with faulty hardware, there are several widely accepted error models. As shown in Figure 2, where the 2.1. LDPC Decoder. Te hardware architecture of the QC- model shown as Figure 2(a) is adopted in [4, 5, 7], the model LDPC decoder is shown in Figure 1, which consists of shown as Figure 2(b) is adopted in [8, 9]. Tese models are Π interleaver ( �(LDPC)), variable node units (VNU), check both assumed to connect the error-free operation results with node units (CNU), and data bufers (RAM). Since the matrix error channels, such as the binary symmetric channel (BSC) of QC-LDPC is divided into subblocks, the decoders are or the additive white Gaussian noise (AWGN) channel. always implemented with the partially parallel architecture However, these two error models still have limitations for [15–17], which means the messages of each subblock are practical LDPC decoders. For example, the BSC error model calculatedbythesameVNUorCNUnodeinthepipeline is mostly utilized in bit fipping decoding algorithms, such as operations. Te constraint of the LDPC code is executed by Gallager A decoding and Gallager B decoding, which make the interleaver, which is used to deliver the messages between moresenseintheoreticalanalysis.TeAWGNerrormodel theVNUandCNUbasedontheparity-checkmatrixof is adopted in infnite-precision sof-decoding algorithms, LDPC in various decoding algorithms [18, 19]. To execute the wherethemessagesareincontinuousdomainandassumed BP decoding of LDPC, the decoder frstly obtains the log- tobeaddedwithGaussiannoisebythefaultyhardware. likelihood (LLR) �� from the channel. Ten, VNU and CNU In this paper, we consider the practical LDPC decoders, perform iterative computations, where the internal messages where fnite-precision decoding algorithm is utilized. Fol- �V2� and ��2V are produced. 
Specifcally, in VNU, lowing the studies in [11, 12], we assume a quantized BSC model for the storage errors, as shown in Figure 3. In � (�) =� + ∑ � � (�−1) , V2� � � 2V (1) the quantized BSC error model, the decoding messages are ��∈N(V)\� quantizedintobits,eachofwhichisassumedtopassaBSC error channel. Te BSC errors for diferent bits are assumed while, in CNU, to be independent, and the error ratios are assumed to be the same. Te cross-over parameter � of the BSC channel is � � (�) � (�) =2 −1 [ ∏ ( V 2� )] , the fipping probability of the RAM cell, which is relevant �2V tanh tanh 2 (2) [V�∈N(�)\V ] to the radiation level and the service duration. As shown in Figure 1, there are 3 memories for the message storage: where N(V) and N(�) are defned as the sets of nodes the LLR message storage, the V2C message storage, and the connected to node V and node �, respectively. Tese messages C2V message storage. In this paper, we assume the same bit are stored in memories during the decoding process. To fipping probability �0 for all message memories. Wireless Communications and Mobile Computing 3
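The quantized BSC storage model described above can be sketched in a few lines. This is a minimal illustration, assuming a sign-magnitude representation with the sign bit first; the function names, the `width` and `flip_prob` parameters, and the 4-bit example are hypothetical and are not taken from the paper.

```python
import random


def to_bits(value, width):
    """Sign-magnitude bits, sign bit first (1 means negative)."""
    magnitude = abs(value)
    if magnitude >= 1 << (width - 1):
        raise ValueError("magnitude does not fit in width - 1 bits")
    sign = 1 if value < 0 else 0
    return [sign] + [(magnitude >> k) & 1 for k in range(width - 2, -1, -1)]


def from_bits(bits):
    """Inverse of to_bits."""
    magnitude = 0
    for bit in bits[1:]:
        magnitude = (magnitude << 1) | bit
    return -magnitude if bits[0] else magnitude


def store_and_read(value, width, flip_prob, rng):
    """Write a quantized message to faulty memory and read it back.

    Every stored bit passes through its own independent BSC with the
    same crossover probability flip_prob.
    """
    bits = to_bits(value, width)
    noisy = [bit ^ int(rng.random() < flip_prob) for bit in bits]
    return from_bits(noisy)


if __name__ == "__main__":
    rng = random.Random(2018)
    reads = [store_and_read(5, 4, 0.05, rng) for _ in range(10)]
    print(reads)  # mostly 5, with occasional corrupted values
```

A flip of the sign bit turns +5 into −5, whereas a flip of the last magnitude bit only turns 5 into 4, which gives some intuition for why the individual bits of a message are treated differently in the rest of the paper.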

Figure 2: Existing faulty storage models. (a) BSC between write and read. (b) Additive noise.

[Figure 3 diagram: each bit of a stored message (Bit 0 to Bit b) exchanged between the CNU/VNU and the RAM passes through its own BSC.]

Figure 3: The quantized BSC model for the unreliable memories.

3. Analysis on LDPC Decoder with Unreliable Messages

3.1. Discrete Density Evolution. In this section, we define a discrete density evolution method for the analysis of finite-precision BP decoding of LDPC codes, which gives the decoding thresholds and residual error ratios of LDPC decoders with different message protection schemes.

It has been proved by Varshney [4] that the symmetry conditions of density evolution still hold for faulty LDPC decoders with symmetric hardware errors. Therefore, we can apply the discrete density evolution to finite-precision LDPC decoders with the BSC storage models by assuming that all-zero sequences are transmitted. In this paper, only regular $(d_v, d_c)$ LDPC codes are considered for the sake of simplicity.

In the density evolution analysis, we define $\mathbf{P}_x^l = \{p_1, p_2, \ldots, p_{2^q-1}\}$ as the probability mass function (PMF) of the corresponding message $x$ at iteration $l$, where $q$ is the number of quantization bits and $p_i$ is the probability of the $i$th quantization symbol. For example, $\mathbf{P}_L$ is the PMF vector of the message $m_L$ shown in Figure 1, which is the LLR from the channel, while $\mathbf{P}_{v2c}^l$ is the PMF of message $m_{v2c}$ at the $l$th decoding iteration.

Since the codewords are assumed to be all-zero sequences, to initialize the discrete density evolution, $\mathbf{P}_L$ is calculated as

$$\mathbf{P}_L = \left\{ p_i = \int_{t_{i-1}}^{t_i} \frac{1}{\sqrt{2\pi}\cdot 2/\sigma} \exp\left(-\frac{(x + 2/\sigma^2)^2}{8/\sigma^2}\right) dx \right\}, \tag{3}$$

where $t_i$ is the value of the $i$th quantization symbol, $t_0 = -\infty$, and $t_{2^q-1} = +\infty$. Meanwhile, $\mathbf{P}_{c2v}$ is initialized as

$$\mathbf{P}_{c2v}^{0} = \{0, \ldots, 0, p_{2^{(q-1)}} = 1, 0, \ldots, 0\}. \tag{4}$$

After the initialization, the density evolution executes its iterations. Firstly, in the VNU nodes,

$$\mathbf{P}_{v}^{l} = \mathbf{P}_L \otimes \left(\mathbf{P}_{c2v}^{l-1}\right)^{\otimes (d_v - 1)}. \tag{5}$$

It is worth noting that, after the convolution operations, we shall combine the extra elements of $\mathbf{P}_v^l$ so as to ensure a length of $2^q - 1$. Secondly, in the CNU nodes, the magnitude values of messages are mapped into the log-domain by the function $\phi(x) = -\log(\tanh(x/2))$. The corresponding PMF of the magnitude values is mapped by $\Lambda(\mathbf{P}_{v2c}^l)$. Further define

$$\Gamma_{v2c}^{l} = \left[\operatorname{sign}(\mathbf{P}_{v2c}^{l}), \Lambda(\mathbf{P}_{v2c}^{l})\right]^{\otimes (d_c - 1)}, \tag{6}$$

and thus the output PMF of the CNU is updated by

$$\mathbf{P}_{c}^{l} = \operatorname{sign}(\Gamma_{v2c}^{l}) \cdot \Lambda(\operatorname{abs}(\Gamma_{v2c}^{l})). \tag{7}$$

Similarly, the extra elements of $\Gamma_{v2c}^l$ shall be combined after the convolution operations.

Finally, after the maximum iterations, the decoding decision is made in the VNU nodes, where the PMF is calculated as

$$\mathbf{P}_{D}^{l} = \mathbf{P}_L \otimes \left(\mathbf{P}_{c2v}^{l-1}\right)^{\otimes d_v}, \tag{8}$$

and the probability of residual error $p_e$ can be obtained by

$$p_e^l = \frac{1}{2} \cdot \mathbf{P}_D^l(2^{q-1}) + \sum_{i=2^{(q-1)}+1}^{2^q-1} \mathbf{P}_D^l(i). \tag{9}$$

Above is the conventional discrete density evolution method for finite-precision LDPC decoders. However, this paper considers the issue of message storage errors, which means that each message suffers a transformation of its PMF outside the nodes. In the following, we model the PMF transformation of unreliable messages in density evolution.

[Figure 4 plot: threshold of Eb/N0 (dB) for the (4, 32) LDPC code versus RAM bit flipping probability (0 to 3, ×10^{-4}); curves: no bits protected, 1 highest bit protected, 2 highest bits protected, 3 highest bits protected, 4 highest bits protected.]

Figure 4: The thresholds of protecting various message bits.

Define $\mathbf{E} = \{\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_q\}$ as the quantization bit error vector, where $\varepsilon_k$ is the error probability of the $k$th quantization bit ($\varepsilon_1$ for the sign bit, $\varepsilon_q$ for the LSB). For example, we can set $\varepsilon_1 = \varepsilon_2 = \cdots = \varepsilon_q = \varepsilon_0$ for the VRAM and CRAM error models described in Section 2.2, where all quantized bits experience the same error probability. Further, define the PMF transfer matrix $\Pi(\mathbf{E})$, which is calculated as follows:

$$\Pi(\mathbf{E}) = \left\{ \pi_{ij} = \prod_{k=1}^{q} \delta_k(i, j) \right\}, \quad (i, j = 1, 2, \ldots, 2^q - 1), \tag{10}$$

where $\pi_{ij}$ is the transfer probability from the $i$th quantization symbol $t_i$ to the $j$th one $t_j$, and $\delta_k(i, j) = 1 - \varepsilon_k$ if $t_i$ and $t_j$ have the same bit in the $k$th quantization position; otherwise $\delta_k(i, j) = \varepsilon_k$. Since $\delta_k(i, j) = \delta_k(j, i)$, $\Pi(\mathbf{E})$ is a symmetric matrix. As a result, the PMF transformations between the RAM's input and output can be described as

$$\begin{aligned}
\mathbf{P}'_{L} &= \mathbf{P}_{L} \cdot \Pi(\mathbf{E}(\varepsilon_0)), \\
\mathbf{P}_{v2c}^{l} &= \mathbf{P}_{v}^{l} \cdot \Pi(\mathbf{E}(\varepsilon_0)), \\
\mathbf{P}_{c2v}^{l} &= \mathbf{P}_{c}^{l} \cdot \Pi(\mathbf{E}(\varepsilon_0)).
\end{aligned} \tag{11}$$
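The transfer matrix of (10) and the storage step of (11) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it indexes all 2^q stored bit patterns rather than the 2^q - 1 quantization symbols (how the two zero patterns are merged is not specified here), and the names `transfer_matrix` and `read_back` as well as the example probabilities are hypothetical.

```python
def transfer_matrix(eps):
    """Bit-flip transfer matrix for a q-bit stored word.

    eps[0] is the flip probability of the sign bit (MSB) and eps[-1]
    that of the LSB. Entry [i][j] is the probability that bit pattern i
    is read back as bit pattern j when every bit passes through its own BSC.
    Assumption: rows and columns index all 2**q bit patterns.
    """
    q = len(eps)
    size = 1 << q
    matrix = []
    for i in range(size):
        row = []
        for j in range(size):
            diff = i ^ j  # bit positions in which the two patterns differ
            prob = 1.0
            for k, e in enumerate(eps):
                flipped = (diff >> (q - 1 - k)) & 1
                prob *= e if flipped else 1.0 - e
            row.append(prob)
        matrix.append(row)
    return matrix


def read_back(pmf, matrix):
    """PMF after storage: the row vector pmf multiplied by the matrix."""
    size = len(pmf)
    return [sum(pmf[i] * matrix[i][j] for i in range(size)) for j in range(size)]


# Hypothetical 3-bit example with an error-free (protected) sign bit.
pi = transfer_matrix([0.0, 1e-3, 1e-3])
stored = [0.0] * 8
stored[0b011] = 1.0  # the pattern 011 is written with certainty
print(read_back(stored, pi))
```

Because each factor depends only on whether a bit differs, the matrix is symmetric and every row sums to one, which matches the symmetry noted for (10).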

We could set various error vectors $\mathbf{E}$ for corresponding protection schemes for unreliable messages in discrete density evolution, which will give the asymptotic performance of different protection schemes.

3.2. Analysis on Various Bit Errors for Finite-Precision Messages. Based on the discrete density evolution method defined in Section 3.1, a threshold analysis is provided to demonstrate the various effects of finite-precision message bits. It is shown that the sign bits have the most influence on the decoding thresholds of LDPC codes.

We execute the discrete density evolution on a (4, 32) regular LDPC code with a 6-bit quantized decoder in this paper. To analyze the various effects of quantized bits, we set $\mathbf{E} = \{0, \varepsilon_0, \varepsilon_0, \varepsilon_0, \varepsilon_0, \varepsilon_0\}$ for the one-highest-bit-protected memory error model, which means the sign bit is assumed to be error-free. Similarly, several $n$-highest-bits-protected models can be defined, where $\mathbf{E} = \{0, \ldots, 0, \varepsilon_0, \ldots, \varepsilon_0\}$. The decoding thresholds are obtained by the discrete density evolution, as shown in Figure 4. We can see that if the sign bit is protected, the threshold will not be severely affected, while additional protection on the extra bits provides little gain.

3.3. Triple Modular Redundancy Scheme for Sign Bit Protection. For LDPC decoders, the cost of protecting every message bit is overwhelming. However, as mentioned before, it is not necessary, since the sign bits are demonstrated to be the most important. Thus, following the idea of unequal error protection [21], we can simply protect the sign bits to keep the complexity low. In this section, we will firstly introduce the traditional TMR protection scheme for the sign bits and then discuss its advantages and disadvantages.

As in [13], TMR has been applied in protecting the messages for LDPC decoders on unreliable hardware. However, TMR costs two extra bits for protecting the sign bit, while the messages are typically quantized into only 4 to 6 bits [20]. As a result, if we maintain the quantity of the message quantization bits under the constraint of complexity, introducing TMR will bring a loss of quantization precision, which is not always beneficial for various storage error ratios. In the following, using the discrete density evolution method described in Section 3.1, we will analyze the performance of the TMR scheme for the sign bit protection. We set $q = 6$ for the quantity of quantization bits, which is adopted for most practical LDPC decoders. Moreover, to model the storage error of the TMR-protected messages, the error vector is set to be $\mathbf{E} = \{3\varepsilon_0^2 - 2\varepsilon_0^3, \varepsilon_0, \varepsilon_0, \varepsilon_0\}$, which is actually quantized with 4 bits, while the error vector of the unprotected LDPC decoder is set to be $\mathbf{E} = \{\varepsilon_0, \varepsilon_0, \varepsilon_0, \varepsilon_0, \varepsilon_0, \varepsilon_0\}$. The results of discrete density evolution under different storage error ratios $\varepsilon_0$ are shown in Figure 5.

From the analysis, it can be observed that when the storage error ratio is high (e.g., $\varepsilon_0 = 10^{-3}$), the LDPC decoder without protection cannot work anymore, while the TMR-protected one can work with a dramatic degradation of decoding threshold. However, when the storage error ratio is low enough, for $\varepsilon_0 = 10^{-4}$ and $\varepsilon_0 = 10^{-5}$, due to the loss of quantization precision, the TMR-protected LDPC decoders show disadvantages in decoding threshold compared with the unprotected ones. Nevertheless, the TMR protection scheme still has its advantages: we can observe that TMR-protected decoders have lower decoding residual errors for all levels of storage error ratios.

3.4. Existing Adaptive Messages Coding Scheme. We noticed that a similar adaptive coding scheme for approximate computing with faulty storage has been proposed in [22]. In [22], an adaptive message coding scheme on faulty min-sum LDPC decoders is mentioned. In detail, when the messages were written into RAMs, if the MSB was 1, the last two LSBs were neglected, while the corresponding memory addresses were used for a (3, 1) repetition coding on the sign bit. Otherwise, the messages would be stored in the RAMs directly. When the LDPC decoder reads a message from the RAMs, the MSB

was checked. If the MSB was read as 1, a decoding on the (3, 1) code was executed to obtain the sign bit, while the last two LSBs were selected from {0, 1} randomly. Otherwise, the messages were assigned with the read values.

[Figure 5 plots: bit error ratio versus Eb/N0 (dB), 3 to 4.4 dB, at RAM error ratios of (a) $10^{-3}$, (b) $10^{-4}$, and (c) $10^{-5}$; curves: No scheme, TMR, No RAM errors.]

Figure 5: Performance analysis under different storage error ratios.

The aforementioned scheme makes full use of the LSBs in the messages. It efficiently protects the unreliable messages without using any storage redundancy. However, there are some disadvantages to this protection scheme. Firstly, the adaptive coding is executed inside a single message, which is typically quantized with no more than 7 bits for reasons of complexity [20]. Consequently, there are not enough bits for efficient coding schemes. For example, when the number of quantization bits is from 4 to 6, only simple repetition codes can be utilized, and this scheme is even inapplicable when the messages are quantized into fewer than 4 bits. Secondly, whether the adaptive coding is executed or not is totally based on the MSB, which is also subject to the storage errors. In such a case, the decoding of the adaptive code may be incorrectly executed, which further degrades the performance of the sign bit protection.

We demonstrate the exact error-correcting performance of the sign bits for this coding scheme as follows. In the first case where the MSB is 1, the encoding will be executed. If the MSB is read correctly, the (3, 1) code will be properly decoded

with an output error ratio of $3\varepsilon^2 - 2\varepsilon^3$. If the MSB is read in error, the decoding will be neglected, which results in an error rate of $\varepsilon$ for the sign bit. That means that the expectation of the error rate for the sign bit is

$$p_e(\mathrm{MSB} = 1) = (1 - \varepsilon) \times (3\varepsilon^2 - 2\varepsilon^3) + \varepsilon \times \varepsilon = 4\varepsilon^2 - 5\varepsilon^3 + 2\varepsilon^4. \tag{12}$$

In the second case where the MSB is 0, the error rate should similarly be calculated with two cases, which is derived by

$$p_e(\mathrm{MSB} = 0) = (1 - \varepsilon) \times \varepsilon + \varepsilon \times \left[\frac{1}{4}(1 - \varepsilon) + \frac{3}{4}\varepsilon\right] = \frac{5}{4}\varepsilon - \frac{1}{2}\varepsilon^2. \tag{13}$$

Unfortunately, since the storage error probability $\varepsilon$ is small enough, when the MSB is 1, this coding scheme cannot achieve the error-correcting capability of the (3, 1) repetition code, while when the MSB is 0, the error probability of the sign bit is even higher than the one without protection.

4. Adaptive Message Coding Scheme

In this section, we firstly present the architecture of the proposed adaptive coding scheme. Then, a specific construction of Hamming product code for the adaptive strategy is provided. Next, we analyze the performance of the proposed scheme theoretically.

4.1. Protecting Sign Bits Utilizing LSBs Adaptively. As analyzed in Section 3.4, protecting the sign bits of unreliable messages by occupying extra bits is not always the best scheme. The degradation of decoding threshold is mainly caused by the loss of quantization precision. However, we notice that quantization precision affects decoding performance specifically when the magnitude of the message is small; that is, when the message has a large magnitude, the LSBs are less important. On the other hand, with the convergence of the LDPC decoding process, the sign bits of messages are shown to have significant effects on the decoding residual errors when most messages have large magnitudes. Based on these observations, while referencing the idea of adaptation as in [23], we introduce an adaptive scheme for protecting the sign bits of unreliable messages. The basic idea is that when the message magnitude is small, the LSBs are used for maintaining quantization precision, while when the magnitude is large enough, the storage space for LSBs is used to protect sign bits to ensure a lower residual error.

What is more, existing studies execute protection on each single message, where only simple coding schemes (such as TMR) can be utilized. However, we notice that LDPC decoders are usually implemented with a partially parallel architecture, as described in Section 2.1. In other words, a group of messages is produced simultaneously. This inspires us to put the sign bits into packages so that we can introduce efficient block coding schemes instead of the traditional TMR.

[Figure 6 diagram: the original LDPC decoder (CNU, VNU, VRAM/CRAM, LRAM, input, output, iterations) beside the adaptive message coding structure, in which embedded encoders/decoders (ENC/DEC) sit at the RAM interfaces and operate on the magnitudes, signs, and LSBs of messages 1 to k.]

Figure 6: Structure of adaptive package coding.

As shown in Figure 6, the structure of our proposed adaptive coding scheme is described as follows: first, put $k$ concurrently produced messages into a package. Then, define $a_1$ and $a_2$ as the adaptive thresholds, where $0 < a_1 < a_2 < a_{\max}$ ($a_{\max}$ is the maximum absolute value of quantization). Next, when the messages are written into RAMs, for each message package, calculate the average magnitude value $a$ of the $k$ messages. Based on the value of $a$, the adaptive coding is divided into 3 stages as below.
Next, we analyze the performance of the proposed concurrently produced messages into a package. Ten, defne scheme theoretically. �1 and �2 as the adaptive thresholds, where 0<�1 <�2 < � � max ( max is the maximum absolute value of quantization). 4.1.ProtectingSignBitsUtilizingLSBsAdaptively. As analyzed Next, when the messages are written into RAMs, for each in Section 3.4, protecting the sign bits of unreliable messages message package, calculate the average magnitude value � of by occupying extra bits is not always the best scheme. the � messages. Based on the value of �, the adaptive coding Te degradation of decoding threshold is mainly caused is divided into 3 stages as below. Wireless Communications and Mobile Computing 7

(i) If $0 \le a < a_1$, all LSBs of the message package are reserved for quantizing messages.

(ii) If $a_1 \le a < a_2$, the storage space for the last $k$ LSBs is occupied for coding on the $k$ sign bits with a code rate of 1/2.

(iii) If $a_2 \le a \le a_{\max}$, the storage space for the last $2k$ LSBs is occupied for coding on the $k$ sign bits with a code rate of 1/3.

Reversibly, when the messages are read from RAMs, adaptive decoding is executed based on the value of $a$. If the storage of LSBs has been occupied for the sign bits, the LSBs of messages are randomly assigned.

4.2. Construction of Adaptive Hamming Product Code. In this section, we will give a specific code construction for the adaptive coding scheme.

To adaptively protect the sign bits of message packages, the ideal code should have a multistage coding structure, as well as low coding complexity and appropriate block length. We introduce Hamming product codes as the adaptive package codes based on the following advantages. First, the product codes are constructed by several subcodes, whose coding process can easily be designed as multistage. Second, Hamming codes have the simplest decoders and encoders among all of the block codes, which only consist of several basic logic gates. Moreover, as the data is usually operated in bytes, where each byte contains 8 bits, in order to make the package codes suitable for the data operations, we choose the modified Hamming product code, which is $(n, k) = (48, 16)$. It is worth noting that some other short algebraic codes could be adopted to constitute the product code, such as Gray codes in [24], at the cost of complexity.

As shown in Figure 7, the dark points are the sign bits in one message package and the white points are the LSBs. The row and column subcodes are both (8, 4) Hamming codes. For such multistage (48, 16) Hamming product codes with package size $k = 16$, the first coding stage is that both row and column subcodes are inactive when $0 \le a < a_1$, while the second coding stage is executed by only activating the row subcodes when $a_1 \le a < a_2$. The third coding stage is executed by activating all subcodes when $a_2 \le a \le a_{\max}$.

Figure 7: The (48, 16) Hamming product code.

4.3. Theoretical Analysis. In this section, we will also utilize the discrete density evolution method to analyze our proposed adaptive package coding scheme. As mentioned before, we should deduce the error vector $\mathbf{E}$ for the proposed scheme.

As defined in Section 4.1, since the stage of adaptive coding is based on the average magnitude values of message packages, we should firstly calculate the PMF of the summation of magnitude values in one message package. First, the PMF of message magnitudes $|\mathbf{P}|$ is derived as

$$|\mathbf{P}| = \left\{\mathbf{P}(2^{(q-1)}),\ \mathbf{P}(2^{(q-1)} - 1) + \mathbf{P}(2^{(q-1)} + 1),\ \ldots\right\}. \tag{14}$$

Then, the PMF of the summation of $k$ magnitudes can be obtained by

$$\mathbf{P}_{\mathrm{sum}} = (|\mathbf{P}|)^{\otimes k}. \tag{15}$$

Based on the PMF of magnitude summation, we can obtain the probabilities of the 3 stages of adaptive coding, respectively, by

$$\begin{aligned}
p_{a0} &= \sum_{\mathrm{mag}(i) = 0}^{k \cdot a_1} \mathbf{P}_{\mathrm{sum}}(i), \\
p_{a1} &= \sum_{\mathrm{mag}(i) = k \cdot a_1}^{k \cdot a_2} \mathbf{P}_{\mathrm{sum}}(i), \\
p_{a2} &= \sum_{\mathrm{mag}(i) = k \cdot a_2}^{k \cdot a_{\max}} \mathbf{P}_{\mathrm{sum}}(i),
\end{aligned} \tag{16}$$

where $\mathrm{mag}(i)$ is the corresponding magnitude value of the $i$th element of $\mathbf{P}_{\mathrm{sum}}$. Next, we need to calculate the error ratio for each stage of the adaptive Hamming product codes. For $(n, k, d)$ block codes with a raw error ratio of $\varepsilon_0$, the error upper bound is derived by

$$p_{\mathrm{blk}}(n, k, d, \varepsilon_0) = \sum_{i = \lfloor (d+1)/2 \rfloor}^{n} \binom{n}{i} (\varepsilon_0)^i (1 - \varepsilon_0)^{n-i}. \tag{17}$$

Based on (17), the error vector for the first coding stage is $\mathbf{E}_{a0} = (\varepsilon_0, \varepsilon_0, \varepsilon_0, \varepsilon_0, \varepsilon_0, \varepsilon_0)$, while for the second stage it is $\mathbf{E}_{a1} = (p_{\mathrm{blk}}(8, 4, 4, \varepsilon_0), \varepsilon_0, \varepsilon_0, \varepsilon_0, \varepsilon_0, 0.5)$, and for the third stage it is $\mathbf{E}_{a2} = (p_{\mathrm{blk}}(48, 16, 7, \varepsilon_0), \varepsilon_0, \varepsilon_0, \varepsilon_0, 0.5, 0.5)$. As a result, the eventual error vector is obtained by

$$\mathbf{E} = p_{a0} \cdot \mathbf{E}_{a0} + p_{a1} \cdot \mathbf{E}_{a1} + p_{a2} \cdot \mathbf{E}_{a2}. \tag{18}$$

We set $a_1 = 0.4 a_{\max}$ and $a_2 = 0.8 a_{\max}$ and analyze the performance of our proposed scheme with $\varepsilon_0 = 10^{-3}$, $\varepsilon_0 = 10^{-4}$, and $\varepsilon_0 = 10^{-5}$. Firstly, we simulate with the parameter

[Figure 8 plots: bit error ratio versus Eb/N0 (dB), 3 to 4.4 dB, at storage error ratios of (a) $10^{-3}$, (b) $10^{-4}$, and (c) $10^{-5}$; curves: No protection, TMR, Proposed, No RAM errors.]

Figure 8: Performance analysis on adaptive package coding scheme with $(d_v, d_c) = (4, 32)$.
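The block error bound of (17) and the mixing step of (18) from Section 4.3 can be checked numerically. The following is a minimal sketch, assuming the stage probabilities are already known; in the analysis they come from (16), whereas the values used here are placeholders. The function names are hypothetical.

```python
from math import comb


def p_blk(n, k, d, eps0):
    """Bound of (17): probability of at least floor((d+1)/2) raw bit
    errors among n bits. The dimension k is kept only to mirror the
    notation of (17); it does not enter the sum."""
    t = (d + 1) // 2
    return sum(comb(n, i) * eps0**i * (1.0 - eps0) ** (n - i)
               for i in range(t, n + 1))


def stage_error_vectors(eps0):
    """Six-entry error vectors of the three coding stages (sign bit first)."""
    e_a0 = [eps0] * 6
    e_a1 = [p_blk(8, 4, 4, eps0)] + [eps0] * 4 + [0.5]
    e_a2 = [p_blk(48, 16, 7, eps0)] + [eps0] * 3 + [0.5, 0.5]
    return e_a0, e_a1, e_a2


def mixed_error_vector(stage_probs, eps0):
    """Weighted sum of (18); stage_probs = (p_a0, p_a1, p_a2)."""
    vectors = stage_error_vectors(eps0)
    return [sum(p * v[pos] for p, v in zip(stage_probs, vectors))
            for pos in range(6)]


# Placeholder stage probabilities, chosen only for illustration.
print(mixed_error_vector((0.2, 0.3, 0.5), 1e-3))
```

As a sanity check, `p_blk(3, 1, 3, eps0)` reduces to $3\varepsilon_0^2 - 2\varepsilon_0^3$, the sign-bit error ratio used for TMR in Section 3.3.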

$(d_v, d_c) = (4, 32)$, whose results are shown in Figure 8. Then, as a comparison, we set $(d_v, d_c) = (4, 8)$ to verify the effects of different row weights (or coding rates) on the performance, the results of which are shown in Figure 9. We can see that, with the proposed adaptive package coding scheme, both the decoding threshold and the residual error are significantly improved. What is more, our proposed scheme is effective at different coding rates.

5. Decoding of Hamming Product Codes

The Hamming product code we have introduced has an outstanding minimum distance characteristic. However, its error-correcting capability can only be achieved under maximum likelihood (ML) decoding, which has high complexity and is not practical for LDPC message protection. In this section, we will discuss specific decoding algorithms for the Hamming product code, which achieve good performance with low complexity.

5.1. Iterative Decoding of the Hamming Product Code. For the (8, 4) Hamming subcode, the minimum Hamming distance is 4; that is, it can correct any one-bit error in one block. But when there are two error bits, the decoder can only declare a block error without locating the error bits. Based on the Hamming codes' capabilities of both error correcting and detecting, we define two states for the output bits of the Hamming product decoder: the fixed bits and the erasure bits. The decoding algorithm of the Hamming product code is described as follows:

(i) Iterative step: the row subcodes and the column subcodes execute their decoding algorithms iteratively. During the decoding, if the Hamming decoder cannot locate the error bits, keep the block unchanged;

[Figure 9 plots: bit error ratio versus Eb/N0 (dB), 2 to 3.4 dB, at storage error ratios of (a) $10^{-3}$, (b) $10^{-4}$, and (c) $10^{-5}$; curves: No protection, TMR, Proposed, No RAM errors.]

Figure 9: Performance analysis on adaptive package coding scheme with $(d_v, d_c) = (4, 8)$.

otherwise, update the block. After several iterations (we set it to 2 iterations here), stop the iterative decoding.

(ii) Decision step: firstly, the error detecting is executed by the Hamming decoders. Define $R \subset \{1, 2, 3, 4\}$ as the set of indices where the row subcodes detect block errors. Similarly, define the index set $C$ for the column subcodes. Then, declare the bits located at $(i, j)$ in the $4 \times 4$ information bit matrix as erasure bits, where $i \in R$ and $j \in C$. The rest of the bits are declared as fixed bits.

We utilize a low-order approximation method to evaluate the performance of the Hamming product code with the proposed decoding algorithm. Since two states are defined for the output bits, we use two parameters to describe the performance: the bit erasure ratio $r_p$ and the bit error ratio $r_e$. Both parameters can be approximately derived from the most likely error patterns. The probability of the $i$th-order error pattern can be derived as

$$P(i) = \varepsilon_0^{\,i} (1 - \varepsilon_0)^{48 - i}, \tag{19}$$

where $\varepsilon_0$ is the bit flipping probability of the RAM. As listed in Table 1, the error patterns with orders up to 4 are analyzed. Apparently, if there are no more than two bit errors in a block of the Hamming product code, no erasure or error bit exists. Therefore, only the error patterns with 3rd order and 4th order are adopted to deduce the approximate

Table 1: Low-order error pattern analysis.

Error pattern order    Bit erasures total    Bit errors total
1                      0                     0
2                      0                     0
3                      256                   16
4                      13008                 1680
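The totals in Table 1 translate into per-bit ratios by dividing by the 16 information bits and weighting by the probability of one error pattern of the given order. The sketch below illustrates that calculation, assuming a 48-bit block and independent bit flips; the function names are hypothetical, and the TMR expression is the $3\varepsilon^2 - 2\varepsilon^3$ majority-vote error ratio quoted earlier in the paper.

```python
# Totals from Table 1: order -> (bit erasures total, bit errors total).
TABLE1 = {3: (256, 16), 4: (13008, 1680)}
INFO_BITS = 16    # information (sign) bits per package
BLOCK_BITS = 48   # code length of the Hamming product code


def pattern_probability(order, eps0):
    """Probability of one specific error pattern with `order` flipped bits."""
    return eps0**order * (1.0 - eps0) ** (BLOCK_BITS - order)


def low_order_ratios(eps0, orders=(3, 4)):
    """Approximate (bit erasure ratio, bit error ratio) from the listed orders."""
    erasure = sum(TABLE1[o][0] / INFO_BITS * pattern_probability(o, eps0)
                  for o in orders)
    error = sum(TABLE1[o][1] / INFO_BITS * pattern_probability(o, eps0)
                for o in orders)
    return erasure, error


def tmr_error_ratio(eps0):
    """Sign-bit error ratio of a (3, 1) repetition code with majority voting."""
    return 3 * eps0**2 - 2 * eps0**3


for eps0 in (2e-3, 8e-3, 16e-3):
    print(eps0, low_order_ratios(eps0), tmr_error_ratio(eps0))
```

With both orders included, the coefficients become 16 and 813 for erasures and 1 and 105 for errors, since 13008/16 = 813 and 1680/16 = 105.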

performance. Using the 3rd order only, the error performance can be derived by

$$\begin{aligned}
\tilde{r}_p(3) &= \frac{256}{16} \times P(3) = 16\varepsilon_0^3 (1 - \varepsilon_0)^{45}, \\
\tilde{r}_e(3) &= \frac{16}{16} \times P(3) = \varepsilon_0^3 (1 - \varepsilon_0)^{45}.
\end{aligned} \tag{20}$$

While using both the 3rd order and 4th order, the performance is derived as follows:

$$\begin{aligned}
\tilde{r}_p(3, 4) &= \frac{256}{16} \times P(3) + \frac{13008}{16} \times P(4) = 16\varepsilon_0^3 (1 - \varepsilon_0)^{45} + 813\varepsilon_0^4 (1 - \varepsilon_0)^{44}, \\
\tilde{r}_e(3, 4) &= \frac{16}{16} \times P(3) + \frac{1680}{16} \times P(4) = \varepsilon_0^3 (1 - \varepsilon_0)^{45} + 105\varepsilon_0^4 (1 - \varepsilon_0)^{44}.
\end{aligned} \tag{21}$$

Meanwhile, Monte Carlo simulations for the iterative decoding of the Hamming product code are provided. The performance curves are shown in Figure 10. We can see that $\tilde{r}_p(3, 4)$ and $\tilde{r}_e(3, 4)$ are very close to the results of Monte Carlo simulations. Moreover, the Hamming product code outperforms the traditional TMR scheme dramatically.

[Figure 10 plot: bit erasure/error ratio versus RAM bit flipping probability (2 to 16, ×10^{-3}); curves: TMR, Monte Carlo $r_p$, Monte Carlo $r_e$, Low-order $r_p(3)$, Low-order $r_e(3)$, Low-order $r_p(3, 4)$, Low-order $r_e(3, 4)$.]

Figure 10: The low-order approximation of $r_p$ and $r_e$.

5.2. Enhanced Decoding of the Hamming Product Code. In fact, the performance of the Hamming product code can be further improved at the expense of complexity for the iterative decoding. In this section, an enhanced decoding scheme is proposed to obtain better performance by introducing more decision logics.

As analyzed in Section 5.1, the 3rd-order error pattern has the most influence on the decoding performance of the Hamming product code. Specifically, the most typical error patterns for decoding erasure and error are depicted in Figure 11. Without loss of generality, we can assume that the row subcodes are decoded first in the iterative step. If there are three check bit errors (denoted by the dark points in Figure 11(a)) in one column subcode, one of the column subcode's information bits (denoted by the point marked with an asterisk) will be incorrectly decoded. In such a case, when it comes to the decision step, this incorrect information bit, together with the other three incorrect check bits, will constitute a valid codeword for the column subcode. Thus, it will result in a decoding error. Similarly, as shown in Figure 11(b), if there are three errors (denoted by the dark points) located, respectively, in the row subcode and the column subcode, they will simultaneously disable the decoding of the row and column subcodes. In such a case, the error in the intersection position will be declared as an erasure bit according to the decision logic of the aforementioned decoding algorithm.

As a matter of fact, the decoding erasure bits and error bits caused by the 3rd-order error patterns all occur in ways similar to those mentioned above. Based on this analysis, an enhanced decoding scheme is proposed by adding the following two decision logics in the decision step:

(i) 3rd-order error bit decision: after the error detecting, if the index sets $R = \{i\}$ and $C = \{j\}$ are both single-element sets, the bit located at $(i, j)$ is flipped and declared as a fixed bit.

(ii) 3rd-order erasure bit decision: if one bit is decoded into different values by the row subcode and the column subcode, it is declared as an erasure bit.

The performance of the enhanced decoding scheme is shown in Figure 12. There is an improvement for both $r_p$ and $r_e$. It should be noted that the additional decision logics are only used to cope with the 3rd-order error patterns. In fact, more decision logics for the higher-order error patterns can be introduced to obtain better performance.

5.3. Decoding Complexity of the Hamming Product Code. Another key issue is the complexity of decoding the Hamming product code compared with the traditional TMR scheme. As TMR only consumes a majority decision logic module to decode the duplicate check, it is generally believed that introducing advanced long block codes will definitely increase the hardware complexity. However, in this section, based on the Field Programmable Gate Array (FPGA) implementation, we will see that the hardware complexity of the Hamming product code can be even lower than that of the TMR scheme in some cases. Moreover, there will be a flexible tradeoff between hardware

Figure 11: The 3rd-order error patterns: (a) error pattern; (b) erasure pattern.

consumption and decoding delay for the Hamming product code.

[Figure 12 plot: bit erasure/error ratio versus RAM bit flipping probability (2 to 16, ×10^{-3}); curves: TMR, Monte Carlo normal $r_e$, Monte Carlo normal $r_p$, Monte Carlo enhanced $r_e$, Monte Carlo enhanced $r_p$.]

Figure 12: The performance of the enhanced decoding scheme.

In applications, LDPC encoders and decoders are mostly implemented on FPGA, which is reconfigurable and widely adopted in communication systems. A major difference between the FPGA and the Application Specific Integrated Circuit (ASIC) is the structure of the combinational logic circuit. For FPGA, the combinational logic is not composed of actual logic gates. Instead, it is based on a structure called the Lookup Table (LUT), which is actually a small block of RAM. The input of the combinational logic is connected to the RAM's address, and the logical output is presynthesized and stored into the RAM. Thus an arbitrary logical operation can be implemented by looking up the stored output for each input logic combination. Conventional FPGAs are mostly equipped with 4-input and 6-input LUTs. As a result, the TMR decision is actually processed by a 4-input LUT on FPGA.

Next, we will compare the consumption of LUTs for the TMR and proposed schemes. In our proposed adaptive message coding scheme, every 16 messages are grouped into one package. Consequently, the corresponding consumption for the TMR scheme is 16 4-input LUTs in total. Comparatively, the consumption of the proposed scheme is shown in Figure 13. We can see that, for the (8, 4) Hamming encoder, only four 4-input LUTs are required, while, for the decoder, four 4-input LUTs are utilized to generate the correctors (syndromes), and then each decoded information bit is output through a 6-input LUT by logically processing the correctors and the original value. To sum up, the total consumption is eight 4-input and four 6-input LUTs for the (8, 4) Hamming encoder and decoder. Based on these analyses, the hardware complexity of the proposed Hamming product code is no more than that of the TMR scheme, and roughly speaking it even consumes fewer resources on FPGA.

Actually, in our iterative decoding algorithm for the Hamming product codes, the cost of improving the error-correcting performance for unreliable messages is decoding delay instead of hardware complexity. As the subcodes of the Hamming product code are decoded iteratively, the decoding of each message package will occupy a certain number of clocks. Thus if the LDPC decoder is poor in timing margin, the iterative decoding of the Hamming product code will severely degrade the decoding throughput. Fortunately, the subcodes of the Hamming product code can be decoded in parallel, which means we can compress the decoding clocks by parallel processing with multiple Hamming decoders. In this case, there can be a flexible tradeoff between hardware complexity and decoding delay. The specific consumption of space-time resources for various arrangements is shown in Table 2.

6. Simulations

In this section, Monte Carlo simulations are executed on finite-length LDPC codewords. We utilize the (8176, 7154) LDPC code defined by CCSDS in [25], which is publicly

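[Figure 13 diagram: LUT-level structure of the TMR scheme (one 4-input LUT, ×16), the (8, 4) Hamming encoder (four 4-input LUTs), and the (8, 4) Hamming decoder (four 4-input LUTs followed by four 6-input LUTs).]

Figure 13: The implementation structure of various schemes based on LUT.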
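The behaviour of the (8, 4) Hamming subcode whose LUT structure appears in Figure 13 (correct any single error, only detect a double error) can be mimicked in software. The sketch below is illustrative only: the paper does not give a generator or parity-check matrix, so the bit ordering and the `COLUMNS` arrangement are assumptions, and `encode` and `decode` are hypothetical names.

```python
# Assumed parity-check columns for the bits [d0, d1, d2, d3, p0, p1, p2];
# the eighth bit p3 is an overall parity bit.
COLUMNS = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1),
           (1, 0, 0), (0, 1, 0), (0, 0, 1)]


def encode(data):
    """Map 4 information bits to an 8-bit codeword [d0..d3, p0..p3]."""
    d0, d1, d2, d3 = data
    p0 = d0 ^ d1 ^ d3
    p1 = d0 ^ d2 ^ d3
    p2 = d1 ^ d2 ^ d3
    p3 = d0 ^ d1 ^ d2  # equals the XOR of the seven bits above
    return [d0, d1, d2, d3, p0, p1, p2, p3]


def decode(word):
    """Return (information bits, block_error_detected)."""
    syndrome = tuple(
        sum(word[i] & COLUMNS[i][row] for i in range(7)) % 2
        for row in range(3)
    )
    overall = sum(word) % 2  # 1 when an odd number of bits flipped
    fixed = list(word)
    if overall == 1:
        # Treated as a single error: locate it through the syndrome.
        if syndrome != (0, 0, 0):
            fixed[COLUMNS.index(syndrome)] ^= 1
        # A zero syndrome means only the overall parity bit flipped.
        return fixed[:4], False
    if syndrome != (0, 0, 0):
        # Even number of errors: detected but not locatable,
        # so the block is kept unchanged.
        return fixed[:4], True
    return fixed[:4], False


codeword = encode([1, 0, 1, 1])
codeword[2] ^= 1  # inject a single bit error
print(decode(codeword))
```

In this arrangement each parity bit depends on three information bits, so every encoder output fits a 4-input LUT, in line with the count of four 4-input LUTs stated for the encoder.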

available and has outstanding performance. In the simulations, the messages are quantized into 6 bits, while the maximum number of iterations is set to 15. The communication channel is assumed to be the additive white Gaussian noise (AWGN) channel. To demonstrate the effectiveness of our proposed scheme under various storage error levels, the flipping probability of the BSC model is set from $\varepsilon_0 = 10^{-3}$ to $\varepsilon_0 = 10^{-4}$. We compare the adaptive message coding scheme (labeled as "proposed") with both the traditional TMR scheme (labeled as "TMR") and the one without protection (labeled as "no sch"). The results are shown in Figure 14. We can see that when $\varepsilon_0 = 10^{-3}$, the proposed scheme has a gain of 0.2 dB compared to the TMR scheme, while the unprotected one cannot even work. When $\varepsilon_0 = 10^{-4}$, the proposed scheme still outperforms the other schemes.

Table 2: Tradeoff between complexity and decoding delay.

Serial number    Consumption of LUTs              Decoding clocks
1                4 inputs × 4 + 6 inputs × 4      8
2                4 inputs × 8 + 6 inputs × 8      4
3                4 inputs × 16 + 6 inputs × 16    2

[Figure 14 plot: bit error ratio (BER), $10^{-1}$ to $10^{-6}$, versus Eb/N0 (dB), 3.8 to 5.4 dB; curves: No err, No sch(1e-3), No sch(1e-4), TMR(1e-3), TMR(1e-4), Proposed(1e-3), Proposed(1e-4).]

Figure 14: Simulations on the (8176, 7154) LDPC code with message storage errors.

7. Conclusion

This paper considered the challenge of implementing LDPC decoders on unreliable memories. We explored the effects of various message bits on finite-precision LDPC decoders and introduced an effective adaptive coding scheme based on the magnitude level of messages. We put the messages into packages and proposed a Hamming product code to adaptively correct the sign bits, and we discussed two low-complexity decoding algorithms. The discrete density evolution analysis showed that the proposed scheme outperforms the traditional TMR scheme in both decoding threshold and residual errors under various storage error levels. Moreover, Monte Carlo simulations showed that the proposed scheme could obtain a gain of at least 0.3 dB over the static TMR scheme when the storage error probability was from $10^{-3}$ to $10^{-4}$.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC 91538203), the new strategic industries development projects of Shenzhen City (JCYJ20150403155812833), and the Beijing Innovation Center for Future Chips, Tsinghua University.

References

[1] K. S. Andrews, D. Divsalar, S. Dolinar, J. Hamkins, C. R. Jones, and F. Pollara, "The development of turbo and LDPC codes for deep-space applications," Proceedings of the IEEE, vol. 95, no. 11, pp. 2142–2156, 2007.
[2] Q. Huang, Q. Xiao, L. Quan, Z. Wang, and S. Wang, "Trimming soft-input soft-output viterbi algorithms," IEEE Transactions on Communications, vol. 64, no. 7, pp. 2952–2960, 2016.
[3] D. Stone, A. Lindenmoyer, G. French et al., "NASA's approach to commercial cargo and crew transportation," Acta Astronautica, vol. 63, no. 1-4, pp. 192–197, 2008.
[4] L. R. Varshney, "Performance of LDPC codes under faulty iterative decoding," IEEE Transactions on Information Theory, vol. 57, no. 7, pp. 4427–4444, 2011.
[5] F. Leduc-Primeau and W. J. Gross, "Faulty Gallager-B decoding with optimal message repetition," in Proceedings of the 2012 50th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2012, pp. 549–556, USA, October 2012.
[6] C. H. Huang, Y. Li, and L. Dolecek, "Gallager B LDPC decoder with transient and permanent errors," IEEE Transactions on Communications, vol. 62, no. 1, pp. 15–28, 2014.
[7] S. M. S. Tabatabaei Yazdi, H. Cho, and L. Dolecek, "Gallager B decoder on noisy hardware," IEEE Transactions on Communications, vol. 61, no. 5, pp. 1660–1673, 2013.
[8] C. H. Huang, Y. Li, and L. Dolecek, "Noisy belief propagation decoder," in Proceedings of the 48th Asilomar Conference on Signals, Systems and Computers, ACSSC 2015, pp. 2111–2115, USA, November 2014.
[9] C. H. Huang, Y. Li, and L. Dolecek, "Belief propagation algorithms on noisy hardware," IEEE Transactions on Communications, vol. 63, no. 1, pp. 11–24, 2015.
[10] E. Dupraz, D. Declercq, B. Vasic, and V. Savin, "Analysis and Design of Finite Alphabet Iterative Decoders Robust to Faulty Hardware," IEEE Transactions on Communications, vol. 63, no. 8, pp. 2797–2809, 2015.
[11] C. K. Ngassa, V. Savin, and D. Declercq, "Min-Sum-based decoders running on noisy hardware," in Proceedings of the 2013 IEEE Global Communications Conference, GLOBECOM 2013, pp. 1879–1884, USA, December 2013.
[12] A. Balatsoukas-Stimming and A. Burg, "Density evolution for min-sum decoding of LDPC codes under unreliable message storage," IEEE Communications Letters, vol. 18, no. 5, pp. 849–852, 2014.
[13] M. May, M. Alles, and N. Wehn, "A case study in reliability-aware design: A resilient LDPC code decoder," in Proceedings of the Design, Automation and Test in Europe, DATE'08, pp. 456–461, 2008.
[14] C. H. Huang, Y. Li, and L. Dolecek, "Adaptive error correction coding scheme for computations in the noisy min-sum decoder," in Proceedings of the IEEE International Symposium on Information Theory, ISIT'15, pp. 1906–1910, June 2015.
[15] Q. Li, X. Qu, L. Yin, and J. Lu, "Generalized Low-Density Parity-Check coding scheme with Partial-Band Jamming," Tsinghua Science and Technology, vol. 19, no. 2, pp. 203–210, 2014.
[16] Z. Chen, L. Yin, Y. Pei, and J. Lu, "CodeHop: physical layer error correction and encryption with LDPC-based code hopping," Science China Information Sciences, vol. 59, no. 10, Article ID 102309, 2016.
[17] P. Wang, L. Yin, and J. Lu, "An efficient helicopter-satellite communication scheme based on check-hybrid ldpc coding," Tsinghua Science and Technology, pp. 10–26599, 2018.
[18] Q. Huang, L. Song, and Z. Wang, "Set Message-Passing Decoding Algorithms for Regular Non-Binary LDPC Codes," IEEE Transactions on Communications, 2017.
[19] Q. Huang, M. Zhang, Z. Wang, and L. Wang, "Bit-reliability based low-complexity decoding algorithms for non-binary LDPC codes," IEEE Transactions on Communications, vol. 62, no. 12, pp. 4230–4240, 2014.
[20] J. Lee and J. Thorpe, "Memory-efficient decoding of LDPC codes," in Proceedings of the International Symposium on Information Theory, ISIT 2005, pp. 459–463, Adelaide, Australia, September 2005.
[21] J. Huang, Z. Fei, C. Cao, M. Xiao, and D. Jia, "On-Line Fountain Codes with Unequal Error Protection," IEEE Communications Letters, vol. 21, no. 6, pp. 1225–1228, 2017.
[22] C. H. Huang, Y. Li, and L. Dolecek, "ACOCO: Adaptive Coding for Approximate Computing on Faulty Memories," IEEE Transactions on Communications, vol. 63, no. 12, pp. 4615–4628, 2015.
[23] X. Ji, J. Xu, Y. L. Che, Z. Fei, and R. Zhang, "Adaptive Mode Switching for Cognitive Wireless Powered Communication Systems," IEEE Wireless Communications Letters, vol. 6, no. 3, pp. 386–389, 2017.
[24] L. Wang, Z. Wang, Q. Huang, and M. Zhang, "Balanced Gray Codes with Flexible Lengths," IEEE Communications Letters, vol. 20, no. 5, pp. 894–897, 2016.
[25] CCSDS, "Low density parity check codes for use in near-earth and deep space applications," CCSDS, 2011.

Hindawi
Wireless Communications and Mobile Computing
Volume 2018, Article ID 8923478, 7 pages
https://doi.org/10.1155/2018/8923478

Research Article

A CCM-Based OFDM System with Low PAPR for Sparse Source

Qinbiao Yang,1 Zulin Wang,1,2 and Qin Huang 1

1 Electronic and Information Engineering, Beihang University, Beijing 100191, China
2 Collaborative Innovation Center of Geospatial Technology, Wuhan 43079, China

Correspondence should be addressed to Qin Huang; [email protected]

Received 24 November 2017; Revised 6 February 2018; Accepted 19 February 2018; Published 21 March 2018

Academic Editor: Michael McGuire

Copyright © 2018 Qinbiao Yang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Orthogonal frequency division multiplexing (OFDM) usually suffers from a high peak-to-average power ratio (PAPR). As shown in this paper, the PAPR becomes even more severe for a sparse source because such a source produces many identical nonzero frequency-domain OFDM symbols. This paper therefore introduces compressive coded modulation (CCM) to restrain the PAPR by reducing the number of identical nonzero frequency-domain symbols for a sparse source. As a result, the proposed CCM-based OFDM system, together with iterative clipping and filtering, can efficiently restrain the high PAPR for a sparse source. Simulation results show that it outperforms the traditional OFDM system by about 4 dB when the source sparsity is 0.1.

1. Introduction

Owing to its high data rate, orthogonal frequency division multiplexing (OFDM) [1, 2] has been widely applied in the 4G mobile communication system, Wi-Fi, and some military communication systems [3–7]. However, OFDM signals may have a high peak-to-average power ratio (PAPR). Because of the nonlinearity of the power amplifier (PA), a high PAPR leads to signal distortion and restricts the implementation of the OFDM system [8, 9].

To solve this problem, many PAPR reduction techniques have been proposed in recent years. These techniques can be classified into three categories [10]: multiple signaling and probabilistic techniques [11–13], coding techniques [14], and signal distortion techniques [15–19]. In general, the first two categories, that is, the multiple signaling and probabilistic techniques and the coding techniques, do not increase the bit error rate (BER), but they involve high computational complexity and data rate loss [10]. In contrast, signal distortion techniques reduce the PAPR by distorting the transmitted OFDM signal before it passes through the PA. Although signal distortion techniques slightly increase the BER, they have lower computational complexity and do not result in data rate loss. Thus, in order to achieve a high data rate in OFDM systems, we adopt signal distortion techniques for PAPR reduction.

Among signal distortion techniques, clipping and filtering is the most popular way to reduce the PAPR because of its low complexity and moderate signal distortion. It clips the OFDM signal to a predefined threshold and uses a filter to restrain the out-of-band radiation. However, the filtering operation may cause peak regrowth. Hence, iterative clipping and filtering (ICF) [15] has been proposed to suppress the peak regrowth. Some variants of ICF have been developed to increase the convergence rate, for example, simplified ICF (SICF) [16], optimized ICF (OICF) [19], and simplified optimized ICF (SOICF) [20]. After only several iterations, the SOICF is able to achieve sufficient PAPR reduction with less BER degradation and low complexity.

A traditional OFDM system usually uses M-ary Phase Shift Keying (MPSK) or M-ary Quadrature Amplitude Modulation (MQAM) for the bit-to-symbol mapping. If the source is sparse, these bit-to-symbol mapping operations generate more identical nonzero frequency-domain OFDM symbols, which results in a high PAPR. The PAPR therefore becomes even more severe in the case of a sparse source. As a result, ICF and its variants may fail to obtain sufficient PAPR reduction.

This paper focuses on the PAPR reduction of OFDM for a sparse source. The key idea is to convert the source bits into frequency-domain OFDM symbols by compressive coded modulation (CCM) [21] instead of traditional MPSK or MQAM. Usually, the CCM is used for seamless rate adaptation by adopting a special random projection (RP) to generate symbols from source bits. However, we verify that the RP symbols of the CCM for a sparse source concentrate at zero and that the RP symbols contain far fewer nonzero symbols than MPSK and MQAM symbols do. As a result, we propose using CCM together with SOICF to efficiently reduce the PAPR of OFDM for a sparse source. Simulation results show that the proposed CCM-based OFDM system, together with the SOICF method, has almost the same performance in terms of PAPR reduction as a traditional OFDM system regardless of source sparsity. Furthermore, the proposed OFDM system outperforms the traditional OFDM system by about 4 dB when the source sparsity is 0.1.

The rest of this paper is organized as follows. Section 2 briefly introduces the necessary background. Section 3 presents the related theoretical analysis and the proposed OFDM system. Numerical and simulation results are provided in Section 4. The conclusion is given in Section 5.

2. Background

In this section, we first review the traditional OFDM system and its PAPR problem. Then, the CCM is briefly described.

2.1. Traditional OFDM System. As shown in Figure 1, a traditional OFDM system usually uses MPSK or MQAM to convert the source bits into the frequency-domain OFDM symbols. Then, the IFFT generates the discrete-time OFDM signal. Furthermore, we adopt the SOICF method to restrain the PAPR. Finally, the optimized signal is transmitted by the RF transmitter, which includes oversampling, upconversion, and the PA. In the receiver, the source bits can be recovered by the FFT and the corresponding demodulation.

Figure 1: Diagram of the traditional OFDM system. (Transmitter: source bits → MPSK/MQAM → IFFT → SOICF → RF transmitter → channel. Receiver: RF receiver → FFT → MPSK/MQAM demodulation → received bits.)

Let us define $\mathbf{b} = (b_1, b_2, \ldots, b_K)^T \in \{0, 1\}^K$ as the source vector, where $T$ indicates the transpose of the vector. Through the bit-to-symbol mapping of $\mathbf{b}$, the frequency-domain OFDM symbols with $N$ subcarriers can be written as $\mathbf{X} = [X(0), X(1), \ldots, X(N-1)]$, and the discrete-time signal can be obtained by

$$x(n) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X(k) \cdot e^{j(2\pi nk/LN)} = \sqrt{N} \cdot \mathrm{IFFT}(X(k))_{LN}, \quad n = 0, 1, \ldots, LN-1, \tag{1}$$

where $L$ is the oversampling factor and the oversampling operation is implemented by zero padding at the end of $\mathbf{X}$. For the signal $x(n)$, the PAPR is defined as the ratio of the maximum power to the average power and can be formulated as

$$\mathrm{PAPR} = \frac{\max\left[|x(n)|^2\right]}{E\{|x(n)|^2\}}, \tag{2}$$

where $E\{|x(n)|^2\}$ denotes the average power of $x(n)$. As mentioned in [22], if the elements of $\mathbf{X}$ are statistically independent and identically distributed random variables, the real and imaginary parts of $x(n)$ are Gaussian random variables with zero mean and the same variance $\sigma^2$ when $N > 64$. Consequently, the amplitude of $x(n)$, that is, $|x(n)|$, is a Rayleigh random variable, and the peak power of $x(n)$ is sometimes much larger than its average power. In other words, the OFDM signal has a high PAPR, which degrades the system's performance.

2.2. CCM Scheme. The CCM scheme is a compressive coded modulation for seamless rate adaptation. Its core is a special random projection code (RPC) and the corresponding decoding algorithm. Because source compression is embedded into the modulation, the CCM scheme brings a significant throughput gain when the source is sparse. Moreover, the decoding algorithm performs joint decoding based on the received RP symbols, and the number of received RP symbols can be adjusted at a fine granularity. The scheme is therefore usually used for seamless rate adaptation. The key process of the CCM scheme is as follows:

(i) Design the desired RPC.
(ii) Use the RPC and the source bits to generate the RP symbols.
(iii) Map every two consecutive RP symbols into a wireless symbol with I and Q components.
(iv) Transmit the wireless symbols through carrier modulation and a digital-to-analog converter (DAC).
(v) Recover the transmitted source bits in the receiver through the RPC-Belief Propagation (BP) [21] decoding algorithm.

Specifically, unlike traditional bit-to-symbol mappings, the RP implements the bit-to-symbol mapping and channel protection at the same time. The schematic diagram of the RP is presented in Figure 2.
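As a quick numerical check of the PAPR definition of Section 2.1, the following minimal Python sketch evaluates the sum in (1) directly and then applies (2). It is only an illustration under stated assumptions: the function names `oversampled_ofdm_signal` and `papr_db`, the block length of 8 subcarriers, and the oversampling factor of 4 are hypothetical choices for this example and are not the simulation settings of this paper. A direct summation is used instead of an FFT so that the code depends on the standard library only.

```python
import cmath
import math


def oversampled_ofdm_signal(X, L=4):
    """Samples x(n), n = 0..L*N-1, obtained by evaluating the sum in (1)."""
    N = len(X)
    total = L * N
    return [
        sum(X[k] * cmath.exp(2j * math.pi * n * k / total) for k in range(N))
        / math.sqrt(N)
        for n in range(total)
    ]


def papr_db(X, L=4):
    """PAPR of (2) in dB: peak power divided by average power of x(n)."""
    power = [abs(v) ** 2 for v in oversampled_ofdm_signal(X, L)]
    return 10 * math.log10(max(power) / (sum(power) / len(power)))


# Identical nonzero symbols on all N subcarriers add coherently at n = 0,
# so the peak power is N times the average power.
identical = [1 + 0j] * 8
# A single active subcarrier gives a constant-envelope signal (ratio of 1).
single = [1 + 0j] + [0j] * 7
print(papr_db(identical), papr_db(single))
```

The two test vectors are the extreme cases: many identical nonzero symbols give the largest peak, whereas a mostly zero symbol vector does not. The sampled peak only approximates the continuous-time peak, which is why an oversampling factor larger than 1 is used.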

Figure 2: Illustration of random projection (source bits → RP symbols → modulation constellation).
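The projection illustrated in Figure 2 can be sketched in a few lines of Python. This is a simplified illustration, not the RPC design of [21]: each RP symbol is formed as a weighted sum of source bits whose positions are drawn uniformly at random, which stands in for one sparse row of the generator matrix, and the raw integer RP symbols are paired into I and Q components without any amplitude scaling. The helper names `rp_symbols`, `to_wireless_symbols`, and `sparse_bits` are hypothetical; the weight set {±1, ±2, ±4, ±4}, the 480 source bits, and the 1920 RP symbols are taken from the simulation settings of this paper.

```python
import random

WEIGHTS = [1, -1, 2, -2, 4, -4, 4, -4]  # weight set {±1, ±2, ±4, ±4}


def rp_symbols(bits, num_symbols, weights=WEIGHTS, rng=random):
    """Each RP symbol is a weighted sum of len(weights) distinct source bits."""
    symbols = []
    for _ in range(num_symbols):
        # Assumption: uniformly random bit positions model one sparse row of G.
        positions = rng.sample(range(len(bits)), len(weights))
        symbols.append(sum(w * bits[i] for w, i in zip(weights, positions)))
    return symbols


def to_wireless_symbols(symbols):
    """Pair consecutive RP symbols into I + jQ wireless symbols."""
    return [
        complex(symbols[i], symbols[i + 1])
        for i in range(0, len(symbols) - 1, 2)
    ]


def sparse_bits(length, p, rng=random):
    """Source bits with P(b = 1) = p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


rng = random.Random(0)
for p in (0.5, 0.3, 0.1):
    s = rp_symbols(sparse_bits(480, p, rng), 1920, rng=rng)
    # Fraction of zero-valued RP symbols and the first two wireless symbols.
    print(p, s.count(0) / len(s), to_wireless_symbols(s)[:2])
```

Running the loop prints, for each source sparsity, the fraction of RP symbols that are exactly zero, which can be compared with the statement in the Introduction that the RP symbols of a sparse source concentrate at zero.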

As illustrated in Figure 2, source bits are converted to RP symbols by a weighted sum operation. Then, every two RP symbols are mapped into one wireless symbol. Each constellation point represents one wireless symbol.

The RPC bit-to-symbol mapping converts the source vector $\mathbf{b}$ into a series of RP symbols $s_1, s_2, \ldots, s_M$ by a weighted sum operation. Define $W = \{w_1, w_2, \ldots, w_D\} \in \mathbb{Z}$, $|W| = D$, as the weight set; the $m$th RP symbol can be generated by

$$s_m = \sum_{d=1}^{D} w_d \cdot b_{n_{md}}, \tag{3}$$

where $b_{n_{md}}$ is an element of the source vector $\mathbf{b}$ and $n_{md}$ is the index of the bit weighted by $w_d$ for generating symbol $s_m$. Based on (3), the RPC can be regarded as a low-density generator matrix $\mathbf{G}$. There are only $D$ nonzero entries in every row of $\mathbf{G}$, and the weights $w_d$ are located in the columns $n_{md}$ randomly. Denote $\mathbf{s} = (s_1, s_2, \ldots, s_M)^T$ as the symbol vector; then the RP symbols can be formulated as

$$\mathbf{s} = \mathbf{G} \cdot \mathbf{b}. \tag{4}$$

Generally, to achieve a better transmission rate for the CCM scheme, the key requirement in selecting the weight set is that the entropy of the RP symbols generated by the weighted sum operation be large. In addition, the low-density characteristic of $\mathbf{G}$ should be guaranteed. The specific design principles of $W$ and $\mathbf{G}$ can be found in [21].

To generate the waveform for the RF transmitter, the RP symbols have to be mapped to the amplitudes of sinusoid signals. Let $s_{\min}$ and $s_{\max}$ be the minimum and maximum values in $\mathbf{s}$, and suppose that $A$ is the maximum amplitude of the sinusoid signals. The mapping is thus a linear projection from $[s_{\min}, s_{\max}]$ to $[-A, A]$, and the actual amplitude after mapping is

$$a_m = -A + \frac{2A}{s_{\max} - s_{\min}} (s_m - s_{\min}). \tag{5}$$

Eventually, the wireless symbols are transmitted to the wireless channel through carrier modulation and the DAC, where each wireless symbol is composed of two consecutive amplitudes, that is, $a_m + j \cdot a_{m+1}$.

In the receiver, through demodulation and the inverse of the projection from $[s_{\min}, s_{\max}]$ to $[-A, A]$, we obtain the noisy RP symbols from the wireless channel. Furthermore, using the RPC-BP decoding algorithm, we can recover the source bits from the noisy RP symbols.

3. Proposed CCM-Based OFDM System

From Section 2.1, we know that the OFDM signal is the summation of the products of the entries in $\mathbf{X}$ and the multicarrier signals. According to (1), if some elements of $\mathbf{X}$ have the same sign and the carrier signals have the same phase, the signal obtained by summation will have a large peak amplitude; otherwise, the signal will have a peak amplitude around the average.

The source sparsity can be modelled by unequal probabilities of the appearance of 0's and 1's [23]. In this paper, we denote the probability $P(b = 1) = p$ as the source sparsity, that is, the probability of the appearance of 1. In the traditional OFDM system, if the source sparsity $p$ is small, that is, the source bits contain a large portion of 0's, the traditional bit-to-symbol mapping operations, for example, MPSK and MQAM, may generate a lot of identical nonzero symbols. This may cause a higher PAPR than that of a nonsparse source. For instance, Figure 3 shows the PAPR complementary cumulative distribution function (CCDF) [10] curves for different source sparsity. The CCDF of the PAPR denotes the probability that the PAPR exceeds a certain value. It is shown that the PAPR increases as the source becomes sparse.

Figure 3: PAPRs for different p with QPSK-based OFDM and L = 8 (CCDF versus PAPR in dB; curves for p = 0.5, 0.3, and 0.1).

The above analysis shows that the high PAPR for a sparse source in the traditional OFDM system is caused by the identical nonzero symbols generated by conventional bit-to-symbol mapping operations. Therefore, we would like to use a new bit-to-symbol mapping that produces fewer identical nonzero symbols for a sparse source. Below, we verify that the CCM produces more zero symbols but fewer identical nonzero symbols for a sparse source. This feature of the CCM promises a low PAPR for a sparse source.

According to (3), all the generated RP symbols are integers, including zero, and are distributed from the minimal value to the maximal value of the sum of the weights. Moreover, the distribution of the RP symbol $s_m$ is determined by the weight set and the source sparsity $p$. Considering $W = \{w_1, w_2, \ldots, w_D\} \in \mathbb{Z}$, $|W| = D$, and $w_1 = -w_2$, $w_3 = -w_4$, ..., $w_{D-1} = -w_D$, the RP symbol can be written as

$$s_m = w_1 \cdot b_{n_{m1}} + w_2 \cdot b_{n_{m2}} + \cdots + w_{D-1} \cdot b_{n_{m(D-1)}} + w_D \cdot b_{n_{mD}} = w_1 (b_{n_{m1}} - b_{n_{m2}}) + w_3 (b_{n_{m3}} - b_{n_{m4}}) + \cdots + w_{D-1} (b_{n_{m(D-1)}} - b_{n_{mD}}). \tag{6}$$

Since the probability distribution function (PDF) of $b_{n_{md}}$ is

$$P(b_{n_{md}} = 0) = 1 - p, \qquad P(b_{n_{md}} = 1) = p, \tag{7}$$

let $b_\Delta$ denote $(b_{n_{m(d-1)}} - b_{n_{md}})$; we can obtain the PDF of $b_\Delta$ as follows:

$$P(b_\Delta) = \begin{cases} p(1-p), & b_\Delta = -1, \\ p^2 + (1-p)^2, & b_\Delta = 0, \\ p(1-p), & b_\Delta = 1. \end{cases} \tag{8}$$

From (8), we can prove that $P(b_\Delta = 0)$ is always the maximal value among all values regardless of $p$, and the maximal value becomes large as $p$ becomes small. That is to say, when $D = 2$, the appearance probability of the zero symbol, that is, $P(s_m = 0)$, dominates the PDF of $s_m$. Moreover, since the PDF of $s_m$ is a joint PDF of every $b_\Delta$, the PDF of $s_m$ would be similar to the case of $D = 2$ when $D$ becomes large. For instance, if $W = \{\pm 1, \pm 2, \pm 4, \pm 4\}$, the RP symbols take values from $[-11, +11]$, and they are generated by the following equation:

$$s_m = (+1) \cdot b_{n_{m1}} + (-1) \cdot b_{n_{m2}} + (+2) \cdot b_{n_{m3}} + (-2) \cdot b_{n_{m4}} + (+4) \cdot b_{n_{m5}} + (-4) \cdot b_{n_{m6}} + (+4) \cdot b_{n_{m7}} + (-4) \cdot b_{n_{m8}}. \tag{9}$$

Figure 4 illustrates the RP symbol distribution for different source sparsity $p$.

Figure 4: RP symbols distribution for different p (appearance probability versus RP symbol value from −12 to 12; curves for p = 0.5, 0.3, and 0.1).

In Figure 4, we can see that the zero symbol always dominates the distribution regardless of the source sparsity $p$ in the case of $W = \{\pm 1, \pm 2, \pm 4, \pm 4\}$. For other weight sets, the same result as in Figure 4 is obtained. This result also verifies the above analysis.

Therefore, the appearance probability of the zero symbol always dominates the PDF of $s_m$ regardless of $p$, and if $p$ becomes small, $P(s_m = 0)$ becomes large. That is to say, there are many zero symbols among the RP symbols. Moreover, the sparser the source is, the more zero symbols there are.

Based on the above analysis, we use CCM together with SOICF to efficiently reduce the PAPR of OFDM for a sparse source. Figure 5 illustrates the diagram of the CCM-based OFDM system.

Figure 5: Diagram of the CCM-based OFDM system. (Transmitter: source bits → CCM → IFFT → SOICF → RF transmitter → channel. Receiver: RF receiver → FFT → RPC-BP decoder → received bits.)

As shown in Figure 5, we adopt the CCM scheme in the OFDM system, so the elements of $\mathbf{X}$ are the RP symbols. Because the values of the RP symbols are dominated by the zero value, the large peak amplitude of the summation signal caused by many identical elements of $\mathbf{X}$ is efficiently restrained in the CCM-based OFDM system. As the source sparsity becomes small, the number of zero RP symbols increases. Hence, a sparse source does not cause a high PAPR in the CCM-based OFDM system. Figure 6 depicts the PAPR in the CCM-based OFDM system for different $p$ with $W = \{\pm 1, \pm 2, \pm 4, \pm 4\}$, $K = 480$, $N = 128$, and $L = 8$.

Figure 6: PAPRs of the CCM-based OFDM system for different p (CCDF versus PAPR in dB; curves for p = 0.5, 0.3, and 0.1).

From Figure 6, it can be seen that the PAPR of the CCM-based OFDM system is not sensitive to the source sparsity. As the source becomes sparse, no additional PAPR increase occurs.

4. Performance Evaluation

In the traditional OFDM system and the proposed CCM-based OFDM system, we adopt the SOICF method to evaluate the PAPR reduction performance for different source sparsity. Furthermore, the BER performance is evaluated for different source sparsity over the Additive White Gaussian Noise (AWGN) channel.

For the traditional OFDM system, we consider the QPSK bit-to-symbol mapping operation and 128 subcarriers [20]. For the proposed CCM-based OFDM system, we consider a generator matrix $\mathbf{G}$ with 1920 rows and 480 columns [21]. Since the PAPRs with different weight sets are similar, we take the weight set $W = \{\pm 1, \pm 2, \pm 4, \pm 4\}$, which has a high achievable transmission rate, to evaluate the PAPR reduction and BER performance. We consider the source sparsity $p$ to be 0.5, 0.3, and 0.1 [21]. For the SOICF method, we define $A_{\mathrm{th}}$ as the predefined threshold and $\gamma = A_{\mathrm{th}}/\sqrt{P_{\mathrm{av}}}$ as the clipping ratio and set $\gamma = 2.1$ dB, where $P_{\mathrm{av}}$ denotes the average power of the signal before clipping. The oversampling factor $L = 8$ is able to provide a sufficiently accurate approximation of the PAPR [24], and thus we set $L = 8$ in our simulations.

The PAPR reduction performances for the various source sparsities are shown in Figures 7(a), 7(b), and 7(c), respectively. Specifically, when the source is nonsparse, for example, $p = 0.5$, the PAPR reduction performances of the traditional system and the proposed system are almost identical. When the source becomes sparse, for example, $p = 0.3$ and $p = 0.1$, the traditional OFDM system fails to restrain its PAPR because of the much higher PAPR for a sparse source. In the CCM-based OFDM system, however, the PAPR of the signal is not influenced by the sparse source, so a much lower PAPR is still obtained, and there is a performance gap of around 9 dB between the two systems when $p$ is 0.1.

Figure 7: PAPR reduction with different source sparsity and iterations (CCDF versus PAPR in dB): (a) p = 0.5; (b) p = 0.3; (c) p = 0.1. Each panel shows QPSK + OFDM and CCM + OFDM, unclipped and with 1, 2, and 3 iterations of SOICF.

Figure 8 shows the BER curves for the different OFDM systems and source sparsity after 3 iterations of SOICF. As illustrated in Figure 8, the proposed OFDM system has almost the same BER performance as the traditional OFDM system with a nonsparse source. Owing to the signal distortion caused by PAPR reduction, the BER performance of the traditional OFDM system begins to degrade as the source becomes sparse, and the deterioration is about 2 dB at $p = 0.1$. In the proposed OFDM system, however, the BER performance is improved by the channel protection of the CCM scheme and becomes better as the source becomes sparser. In particular, the proposed OFDM system outperforms the traditional OFDM system by about 4 dB when the source sparsity is 0.1. In addition, in the case of $W = \{\pm 1, \pm 2, \pm 4, \pm 4\}$, $K = 480$, and $p = 0.1$, the proposed system is able to achieve an average rate of 8.5 bit/s/Hz and a peak rate of about 12 bit/s/Hz for seamless rate adaptation [21].

5. Conclusion

In this paper, we have presented a CCM-based OFDM system to reduce the PAPR for a sparse source. By using the CCM scheme in the traditional OFDM system, the generated frequency-domain OFDM symbols are concentrated at zero, and the sparser the source is, the more zero symbols there are. The zero symbols do not bring an additional PAPR increase. Thus, the proposed CCM-based OFDM system, together with iterative clipping and filtering, can efficiently restrain the high PAPR for a sparse source. Finally, the simulation results show that the proposed OFDM system has almost the same performance in terms of PAPR reduction as the traditional OFDM system regardless of source sparsity. Moreover, the BER performance of the proposed OFDM system is improved by the use of the CCM scheme. In particular, the proposed OFDM system outperforms the traditional OFDM system by about 4 dB when the source sparsity is 0.1. In addition, the proposed CCM-based OFDM system also maintains the characteristic of seamless rate adaptation.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Figure 8: BER performance for different source sparsity (BER versus Eb/N0 in dB; curves for QPSK + OFDM and CCM + OFDM, unclipped at p = 0.5 and with SOICF at p = 0.5, 0.3, and 0.1).

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant 61471022) and NSAF (Grant U1530117).

References

[1] L. J. Cimini, “Analysis and simulation of a digital mobile channel using orthogonal frequency division multiplexing,” IEEE Transactions on Communications, vol. 33, no. 7, pp. 665–675, 1985.
[2] J. A. C. Bingham, “Multicarrier modulation for data transmission: an idea whose time has come,” IEEE Communications Magazine, vol. 28, no. 5, pp. 5–14, 1990.
[3] C. Ru, L. Yin, J. Lu, and C. W. Chen, “UEP video transmission based on dynamic resource allocation in MIMO OFDM system,” in Proceedings of the 2007 IEEE Wireless Communications and Networking Conference (WCNC ’07), pp. 310–315, China, March 2007.
[4] I. Koffman and V. Roman, “Broadband wireless access solutions based on OFDM access in IEEE 802.16,” IEEE Communications Magazine, vol. 40, no. 4, pp. 96–103, 2002.
[5] Z. Fei, C. Xing, N. Li, Y. Han, D. Danev, and J. Kuang, “Power allocation for OFDM-based cognitive heterogeneous networks,” Science China Information Sciences, vol. 56, no. 4, pp. 1–10, 2013.
[6] Z. Chen, L. Yin, Y. Pei, and J. Lu, “CodeHop: physical layer error correction and encryption with LDPC-based code hopping,” Science China Information Sciences, vol. 59, no. 10, Article ID 102309, 2016.
[7] P. Wang, L. Yin, and J. Lu, “An efficient helicopter-satellite communication scheme based on check-hybrid ldpc coding,” Tsinghua Science and Technology, 2018.
[8] J. Joung, C. K. Ho, K. Adachi, and S. Sun, “A survey on power-amplifier-centric techniques for spectrum- and energy-efficient wireless communications,” IEEE Communications Surveys & Tutorials, vol. 17, no. 1, pp. 315–333, 2015.
[9] G. Wunder, R. F. H. Fischer, H. Boche, S. Litsyn, and J.-S. No, “The PAPR problem in OFDM transmission: new directions for a long-lasting problem,” IEEE Signal Processing Magazine, vol. 30, no. 6, pp. 130–144, 2013.
[10] Y. Rahmatallah and S. Mohan, “Peak-to-average power ratio reduction in ofdm systems: a survey and taxonomy,” IEEE Communications Surveys & Tutorials, vol. 15, no. 4, pp. 1567–1592, 2013.
[11] J. Ji, G. Ren, and H. Zhang, “PAPR reduction of sc-fdma signals via probabilistic pulse shaping,” IEEE Transactions on Vehicular Technology, vol. 64, no. 9, pp. 3999–4008, 2015.
[12] S.-H. Wang, W.-L. Lin, B.-R. Huang, and C.-P. Li, “PAPR reduction in OFDM systems using active constellation extension and subcarrier grouping techniques,” IEEE Communications Letters, vol. 20, no. 12, pp. 2378–2381, 2016.
[13] H. Wang, X. Wang, L. Xu, and W. Du, “Hybrid PAPR reduction scheme for FBMC/OQAM systems based on multi data block PTS and TR methods,” IEEE Access, vol. 4, no. 99, pp. 4761–4768, 2016.
[14] S. Shu, D. Qu, L. Li, and T. Jiang, “Invertible subset QC-LDPC codes for PAPR reduction of ofdm signals,” IEEE Transactions on Broadcasting, vol. 61, no. 2, pp. 290–298, 2015.
[15] J. Armstrong, “Peak-to-average power reduction for OFDM by repeated clipping and frequency domain filtering,” IEEE Electronics Letters, vol. 38, no. 5, pp. 246-247, 2002.
[16] L. Wang and C. Tellambura, “A simplified clipping and filtering technique for PAR reduction in OFDM systems,” IEEE Signal Processing Letters, vol. 12, no. 6, pp. 453–456, 2005.
[17] R. J. Baxley, C. Zhao, and G. T. Zhou, “Constrained clipping for crest factor reduction in OFDM,” IEEE Transactions on Broadcasting, vol. 52, no. 4, pp. 570–575, 2006.
[18] J. Tong, L. Ping, Z. Zhang, and V. K. Bhargava, “Iterative soft compensation for OFDM systems with clipping and superposition coded modulation,” IEEE Transactions on Communications, vol. 58, no. 10, pp. 2861–2870, 2010.
[19] Y.-C. Wang and Z.-Q. Luo, “Optimized iterative clipping and filtering for PAPR reduction of OFDM signals,” IEEE Transactions on Communications, vol. 59, no. 1, pp. 33–37, 2011.
[20] X. Zhu, W. Pan, H. Li, and Y. Tang, “Simplified approach to optimized iterative clipping and filtering for PAPR reduction of OFDM signals,” IEEE Transactions on Communications, vol. 61, no. 5, pp. 1891–1901, 2013.
[21] H. Cui, C. Luo, J. Wu, C. W. Chen, and F. Wu, “Compressive coded modulation for seamless rate adaptation,” IEEE Transactions on Wireless Communications, vol. 12, no. 10, pp. 4892–4904, 2013.
[22] T. Jiang, M. Guizani, H.-H. Chen, W. Xiang, and Y. Wu, “Derivation of PAPR distribution for OFDM wireless systems based on extreme value theory,” IEEE Transactions on Wireless Communications, vol. 7, no. 4, pp. 1298–1305, 2008.
[23] G. Caire, S. Shamai, A. Shokrollahi, and S. Verdú, “Fountain codes for lossless ,” in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 68, pp. 1–18, 2005.
[24] C. Tellambura, “Computation of the continuous-time PAR of an OFDM signal with BPSK subcarriers,” IEEE Communications Letters, vol. 5, no. 5, pp. 185–187, 2001.