
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 62, NO. 4, APRIL 2016, p. 1541

Constructions and Decoding of Cyclic Codes Over b-Symbol Read Channels

Eitan Yaakobi, Member, IEEE, Jehoshua Bruck, Fellow, IEEE, and Paul H. Siegel, Fellow, IEEE

Abstract—Symbol-pair read channels, in which the outputs of the read process are pairs of consecutive symbols, were recently studied by Cassuto and Blaum. This new paradigm is motivated by the limitations of the reading process in high-density data storage systems. They studied error correction in this new paradigm, specifically, the relationship between the minimum Hamming distance of an error-correcting code and the minimum pair distance, which is the minimum Hamming distance between symbol-pair vectors derived from codewords of the code. It was proved that for a linear cyclic code with minimum Hamming distance dH, the corresponding minimum pair distance is at least dH + 3. In this paper, we show that, for a linear cyclic code with a minimum Hamming distance dH, the minimum pair distance is at least dH + ⌈dH/2⌉. We then describe a decoding algorithm, based upon a bounded distance decoder for the cyclic code, whose symbol-pair error correcting capabilities reflect the larger minimum pair distance. Finally, we consider the case where the read channel output is a larger number, b ≥ 3, of consecutive symbols, and we provide extensions of several concepts, results, and code constructions to this setting.

Index Terms—Codes for storage media, cyclic codes, symbol pairs.

Manuscript received April 8, 2015; revised September 29, 2015; accepted January 13, 2016. Date of publication January 27, 2016; date of current version March 16, 2016. This work was supported in part by the ISEF Foundation, the Lester Deutsch Fellowship, the University of California Laboratory Fees Research Program under Award 09-LR-06-118620-SIEP, in part by the National Science Foundation under Grant CCF-1116739 and Grant CCF-1405119, in part by the Center for Magnetic Recording Research, University of California at San Diego, and in part by the NSF Expeditions in Computing Program under Grant CCF-0832824. E. Yaakobi was supported by the Electrical Engineering Department, California Institute of Technology, Pasadena, CA, USA. P. H. Siegel was supported by the Technion, within the Fellowship from the Lady Davis Foundation, and by a Viterbi Research Fellowship. This paper was presented at the 2012 IEEE International Symposium on Information Theory [19].

E. Yaakobi is with the Department of Computer Science, Technion–Israel Institute of Technology, Haifa 32000, Israel (e-mail: [email protected]).

J. Bruck is with the Department of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125 USA (e-mail: [email protected]).

P. H. Siegel is with the Center for Magnetic Recording Research, Department of Electrical and Computer Engineering, University of California at San Diego, La Jolla, CA 92093 USA (e-mail: [email protected]).

Communicated by Y. Mao, Associate Editor for Coding Techniques.

Digital Object Identifier 10.1109/TIT.2016.2522434

I. INTRODUCTION

THE traditional approach in information theory to analyzing noisy channels involves parsing a message into individual information units, called symbols. Even though in many works the error correlation and interference between the symbols are studied, the process of writing and reading is usually assumed to be performed on individual symbols. However, in some of today's emerging storage technologies, as well as in some proposed for the future, this is no longer an accurate assumption, and symbols can only be written and read in possibly overlapping groups. This brings us to study a model, recently proposed by Cassuto and Blaum [1], for channels whose outputs are overlapping pairs of symbols.

The rapid progress in high-density data storage technologies paved the way for high-capacity storage with reduced price. However, since the bit size at high densities is small, it is a challenge to successfully read the individual bits recorded on the storage medium; for more details, see [1]. The symbol-pair read channel model studied in [1], and later by Cassuto and Litsyn in [2], mimics the reading process of such storage technologies. In that model, the outputs produced by a sequence of read operations are (possibly corrupted) overlapping pairs of adjacent symbols, called pair-read symbols. For example, if the recorded sequence is (010), then in the absence of any noise the output of the symbol-pair read channel would be [(01), (10), (00)]. In this new paradigm, the errors are no longer individual symbol errors but, rather, symbol-pair errors, where in a symbol-pair error at least one of the symbols is erroneous. The main task now becomes combating these symbol-pair errors by designing codes with large minimum symbol-pair distance.

The results in [1] and [2] addressed several fundamental questions regarding the pair distance, as well as the construction and decoding of codes with pair-error correction capability. Finite-length and asymptotic bounds on code sizes were also derived. These were extended in [3] and [4], where constructions of maximum distance separable codes for the symbol-pair metric were considered, and in [5], where the authors studied syndrome decoding of symbol-pair codes. The paradigm of the symbol-pair channel studied in these prior works can be generalized to b-symbol read channels, where the result of a read operation is a consecutive sequence of b > 2 symbols. In essence, we receive b estimates of the same stored sequence. This insight connects the symbol-pair problem to the sequence reconstruction problem, which was first introduced by Levenshtein [8]–[10]. In the sequence reconstruction scenario, the same codeword is transmitted over multiple channels. Then, a decoder receives all channel outputs, which are assumed to be different from each other, and outputs an estimate of the transmitted codeword. The original motivation did not come from data storage but rather from other domains, such as molecular biology and chemistry, where the amount of redundancy in the information is too low and thus the only way to combat errors is by repeatedly transmitting the same message. However, this model is very relevant for the advanced storage technologies mentioned above, as well as for any other context where the stored information is read multiple times. Furthermore, we note that the model proposed by Levenshtein was recently studied and generalized, with applications to associative memories [18].

In the channel model described by Levenshtein, all channels are (almost) independent of each other, as it is only guaranteed that the channel outputs are all different. Assuming that the transmitted message c belongs to a code with minimum Hamming distance dH and that the number of errors in every channel can be strictly greater than ⌊(dH − 1)/2⌋, Levenshtein studied the minimum number of channels that are necessary to construct a successful decoder. The corresponding value for the Hamming metric (as well as other distance metrics) was studied in [9]; extensions to distance metrics over permutations, e.g., [6], [7], and error graphs [11] have also been considered. Recently, the analogous problem has been addressed for the Grassmann graph and for permutations under Kendall's τ distance [20], and an information-theoretic study motivated by applications related to DNA sequencing was carried out for a special case of a channel with deletions [12], [13].

More specifically, for the Hamming distance, the following result was proved in [9]. Assume the transmitted word belongs to a code with minimum Hamming distance dH and the number of errors, t, in every channel is greater than ⌊(dH − 1)/2⌋. Then, in order to construct a successful decoder, the number of channels has to be greater than

sum_{i=0}^{t−⌈dH/2⌉} C(n − dH, i) · sum_{k=i+dH−t}^{t−i} C(dH, k),

where C(·, ·) denotes a binomial coefficient. For example, if t = ⌊(dH − 1)/2⌋ + 1, i.e., only one more than the error correction capability, then the number of channels has to be at least 2t + 1. Note that if t > ⌊(dH − 1)/2⌋ + 1, then this number is at least on the order of the message length n. This disappointing result is a consequence of the arbitrary errors that may occur in every channel. In practice, especially for storage systems, we can take advantage of the fact that the errors are more constrained in number in order to improve the error correction capability.

In the symbol-pair read channel, there are in fact two channels. If the stored information is x = (x0, ..., xn−1), then the corresponding pair-read vector of x is

π(x) = [(x0, x1), (x1, x2), ..., (xn−2, xn−1), (xn−1, x0)],

and the goal is to correct a large number of the so-called symbol-pair errors. With symbol alphabet Σ, the pair distance dp(x, y) between two vectors x and y is the Hamming distance over the symbol-pair alphabet Σ × Σ between their respective pair-read vectors, that is, dp(x, y) = dH(π(x), π(y)). Accordingly, the minimum pair distance of a code C is defined as dp(C) = min_{x,y∈C, x≠y} dp(x, y). In [1], it was shown that for a linear cyclic code with minimum Hamming distance dH, the minimum pair distance, dp, satisfies dp ≥ dH + 3. Our main contribution is the stronger result that

dp ≥ dH + ⌈dH/2⌉.

According to [1], this permits the correction of ⌊(dp − 1)/2⌋ symbol-pair errors. Thus, in contrast to Levenshtein's results on independent channels, on the symbol-pair read channel we can correct a large number of symbol-pair errors. In order to exploit this potentially much larger minimum pair distance guarantee, we explicitly construct a decoder, based upon a bounded distance decoder of the given linear cyclic code, that can correct a number of symbol-pair errors up to the decoding radius corresponding to this bound.

We then address the general paradigm of channels that sense some prescribed number, b > 2, of consecutive symbols on each read. First, some of the results on the symbol-pair read channel are generalized. Next, we study properties of codes for the b-symbol read channel that are constructed by interleaving b component codes. Finally, we examine the b-distance of two specific families of codes, namely the codebook Σ^n and the linear cyclic Hamming codes.

The rest of the paper is organized as follows. In Section II, we formally review the symbol-pair read channel and some of its basic properties. In Section III, we show that linear cyclic codes can correct a large number of symbol-pair errors, and in Section IV, a decoding algorithm for such codes is given. Section V generalizes some of the results on the symbol-pair read channel to b-symbol read channels, where b > 2. Finally, Section VI concludes the paper.

II. DEFINITIONS AND BASIC PROPERTIES

In this section, we review the symbol-pair read channel model introduced in [1]. If a length-n vector is stored in the memory, then its pair-read vector is also a length-n vector in which every entry is a pair of cyclically consecutive symbols in the stored vector. More formally, if x = (x0, ..., xn−1) ∈ Σ^n is a length-n vector over some alphabet Σ, then the symbol-pair read vector of x, denoted by π(x), is defined to be

π(x) = [(x0, x1), (x1, x2), ..., (xn−2, xn−1), (xn−1, x0)].

Note that π(x) ∈ (Σ × Σ)^n, and for x, y ∈ Σ^n, π(x + y) = π(x) + π(y). We will focus on binary vectors, so Σ = {0, 1}, and unless stated otherwise, all indices are taken modulo n. The all-zeros and all-ones vectors are denoted by 0 and 1, respectively. The Hamming distance between two vectors x and y is denoted by dH(x, y). Similarly, the Hamming weight of a vector x is denoted by wH(x). The pair distance between x and y is denoted by dp(x, y) and is defined to be

dp(x, y) = dH(π(x), π(y)).

Accordingly, the pair weight of x is wp(x) = wH(π(x)). A symbol-pair error in the i-th symbol of π(x) changes at least one of the two symbols (xi, xi+1). Note that the following connection between the pair distance and the pair weight holds.

Proposition 1: For all x, y ∈ Σ^n, dp(x, y) = wp(x + y).

Proof: Note that for x, y ∈ Σ^n,

dp(x, y) = dH(π(x), π(y)) = wH(π(x) + π(y)) = wH(π(x + y)) = wp(x + y).

III. THE PAIR DISTANCE OF CYCLIC CODES

The goal of this section is to show that linear cyclic codes provide large minimum pair distance. In order to do so, we first give a method to determine the pair weight of a vector x. A similar characterization of the pair weight was proved in [1] (Theorem 2). The key observation is that if xi = 1, then two symbol-pairs in π(x), the (i − 1)-st and the i-th symbol-pairs, must be nonzero.

Lemma 1: For any x ∈ Σ^n,

wp(x) = wH(x) + wH(x')/2,

where x' = (x0 + x1, x1 + x2, ..., xn−1 + x0).

A first observation on the connection between the Hamming distance and the pair distance was proved in [1].

Proposition 2 [1]: Let x, y ∈ Σ^n be such that 0 < dH(x, y) < n. Then

dH(x, y) + 1 ≤ dp(x, y) ≤ 2 · dH(x, y).

From Proposition 2, if 0 < dH(C) < n, then dH(C) + 1 ≤ dp(C) ≤ 2 · dH(C). The next theorem states the stronger lower bound for linear cyclic codes.

Theorem 1: Let C be a linear cyclic code with minimum Hamming distance dH(C), 0 < dH(C) < n. Then

dp(C) ≥ dH(C) + ⌈dH(C)/2⌉.
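The bound dp(C) ≥ dH(C) + ⌈dH(C)/2⌉ of Theorem 1 can be checked concretely on a small code. The sketch below (Python; the construction and names are ours) brute-forces the [7,4] binary cyclic Hamming code generated by the cyclic shifts of g = 1101000, i.e., g(x) = 1 + x + x^3. Using Proposition 1 and linearity, dp(C) is the minimum pair weight of a nonzero codeword; here the bound dH + ⌈dH/2⌉ = 3 + 2 = 5 is met with equality:

```python
from itertools import product

def pair_weight(x):
    """w_p(x): number of cyclically consecutive pairs that are not (0, 0)."""
    n = len(x)
    return sum(1 for i in range(n) if (x[i], x[(i + 1) % n]) != (0, 0))

# Generator matrix: the first four cyclic shifts of g = 1101000.
g = [1, 1, 0, 1, 0, 0, 0]
G = [g[-i:] + g[:-i] for i in range(4)]

# All 16 codewords of the [7,4] cyclic Hamming code.
codewords = [[sum(m[j] * G[j][i] for j in range(4)) % 2 for i in range(7)]
             for m in product([0, 1], repeat=4)]

d_H = min(sum(c) for c in codewords if any(c))           # minimum Hamming distance
d_p = min(pair_weight(c) for c in codewords if any(c))   # minimum pair distance

assert d_H == 3
assert d_p == 5    # = d_H + ceil(d_H / 2): the bound of Theorem 1 is tight here
```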

Next, assume that x = 1 is a codeword in C, in which case wp(x) = n. We now show that if the dimension of C is greater than one, then dH(C) ≤ ⌊2n/3⌋. This will imply that

dH(C) + ⌈dH(C)/2⌉ ≤ ⌊2n/3⌋ + ⌈n/3⌉ = n = wp(x),

from which the theorem will follow.

Since the dimension of C exceeds one, it contains at least three distinct non-zero codewords. Choose two of them, x1 and x2. If it were true that dH(C) > ⌊2n/3⌋, then it would follow that wH(x1), wH(x2) ≥ ⌊2n/3⌋ + 1, implying that

wH(x1 + x2) ≤ 2n − (2 · ⌊2n/3⌋ + 2) ≤ ⌊2n/3⌋,

which is a contradiction. Therefore, we conclude that dH(C) ≤ ⌊2n/3⌋, completing the proof.

IV. DECODING

From Theorem 1 we conclude that linear cyclic codes have large minimum pair distance, thereby permitting the correction of a large number of symbol-pair errors. It is therefore of interest to construct efficient decoders for these codes, which is the topic of this section. First, note that since these codes are linear, it was shown in [5] how a modified version of syndrome decoding can be used in order to correct symbol-pair errors within the error correction capability of the codes. However, since syndrome decoding suffers exponential complexity, our goal is to provide more efficient decoders whose complexity is of the same order as that of classical decoders for cyclic codes.

Let C be a linear cyclic code with minimum distance dH(C) = 2t + 1. Assume there is a decoder for C that can correct up to t errors. We will show how to use this decoder in the design of a decoder for the code π(C) which corrects up to t0 = ⌊(3t + 1)/2⌋ symbol-pair errors.

We assume that the dimension of C is greater than one. This condition implies, according to the proof of Theorem 1, that dH(C) ≤ ⌊2n/3⌋; that is, 2t + 1 ≤ ⌊2n/3⌋, or t ≤ (⌊2n/3⌋ − 1)/2 < n/3. It is straightforward to verify that t0 < n/2.

We define the decoder as a mapping DC : Σ^n → C ∪ {F}, where F denotes a decoder failure. For a received word y ∈ Σ^n we write DC(y) = ĉ ∈ C ∪ {F}. If c ∈ C is the transmitted word and dH(c, y) ≤ t, then it is guaranteed that ĉ = c. However, if dH(c, y) > t, then either ĉ is a codeword different from c, whose Hamming distance from the received word y is at most t, i.e., dH(ĉ, y) ≤ t, or ĉ = F, indicating that no such codeword exists.

We now introduce another code that will play a role in the symbol-pair decoder design. The double-repetition code of C is the code defined by

C2 = {(c, c) : c ∈ C}.

Note that its length is 2n and its minimum Hamming distance satisfies dH(C2) = 2 · dH(C). The code C2 can correct up to 2t errors, and we assume that it has a decoder DC2 : Σ^n × Σ^n → C ∪ {F}. Every codeword in C2 consists of two identical codewords from C and thus, for simplicity of notation, we have assumed that when (c, c) is a codeword at distance no more than 2t from a received word (y1, y2), the decoder DC2 returns c ∈ C. We defer the explicit design of the decoder DC2 to the end of the section.

Consider a codeword c ∈ C and let π(c) ∈ π(C) be its corresponding symbol-pair vector. Let y = π(c) + e be a received word, where e ∈ (Σ × Σ)^n is the error vector, with weight wH(e) ≤ t0 = ⌊(3t + 1)/2⌋. We will describe a decoder Dπ : (Σ × Σ)^n → {0, 1}^n that can correct the error e. The received vector has the form

y = [(y0,0, y0,1), (y1,0, y1,1), ..., (yn−1,0, yn−1,1)].

We define three related vectors:

yL = (y0,0, ..., yn−1,0),
yR = (y0,1, ..., yn−1,1),
yS = yL + yR = (y0,0 + y0,1, ..., yn−1,0 + yn−1,1).

Since the vector y suffers at most t0 symbol-pair errors, the vectors yL and yR each have at most t0 errors as well. One can think of yL as a noisy version of c = (c0, c1, ..., cn−1), and of yR as a noisy version of the left-cyclic shift of c, (c1, ..., cn−1, c0). Therefore, yL and the right-cyclic shift of yR, yR(1) = (yn−1,1, y0,1, ..., yn−2,1), can be viewed as two noisy versions of the same codeword c. Furthermore, the vector yS has at most t0 errors with respect to the codeword c' = (c0 + c1, ..., cn−1 + c0). In general, the codeword c' does not uniquely determine the value of c. However, we will now show that, in this setting, it does. This result, which we will make use of in the development of the decoding algorithm Dπ, is proved in the following lemma.

Lemma 2: In the symbol-pair read channel setting described above, if the codeword c' ∈ C is successfully recovered, then we can uniquely determine the codeword c.

Proof: Since the word c' is decoded successfully, we know the value of

c' = (c'0, c'1, ..., c'n−1) = (c0 + c1, c1 + c2, ..., cn−1 + c0).

The codeword c satisfies ci = c0 + sum_{j=0}^{i−1} c'j. Hence, if we define c̄ = [c̄0, ..., c̄n−1] by c̄0 = 0 and, for 1 ≤ i ≤ n − 1, c̄i = sum_{j=0}^{i−1} c'j, then the codeword c is either c̄ or c̄ + 1, depending on the value of c0. The distance between yL and c is at most t0, and dH(c̄, c̄ + 1) = n. Recalling that t0 < n/2, whichever of c̄ and c̄ + 1 differs from c is at distance at least n − t0 > t0 from yL. Hence, if dH(yL, c̄) < dH(yL, c̄ + 1), then c = c̄; otherwise, c = c̄ + 1. In either case, we can recover the codeword c.

For convenience, we denote the codeword c obtained from the codeword c' by the method of Lemma 2 as (c')*; that is, (c')* = c.

The number of symbol-pair errors in the vector y is at most t0. Each symbol-pair error corresponds to one or two bit errors in the symbol-pair. We let E1 be the number of single-bit symbol-pair errors and E2 be the number of double-bit symbol-pair errors, where E1 + E2 ≤ t0. Thus, the number of errors in yS is E1, and the number of errors in (yL, yR(1)) is E1 + 2E2.

The following result will also play a role in the validation of the symbol-pair error decoding algorithm.

Lemma 3: If c ∈ C, y = π(c) + e, and wH(e) ≤ t0, then either DC(yS) = c' or DC2((yL, yR(1))) = c.

Proof: If E1 ≤ t, then the decoder DC(yS) is successful. Otherwise, E1 ≥ t + 1 and E2 ≤ t0 − (t + 1), so the number of errors in (yL, yR(1)) satisfies

E1 + 2E2 ≤ t0 + E2 ≤ 2t0 − (t + 1) ≤ 2t.

This implies that the decoder DC2((yL, yR(1))) is successful.

From Lemma 3, we know that at least one of the two decoders, DC and DC2, succeeds. However, it is not obvious which one of them does, and the main task of the algorithm underlying the decoder mapping Dπ, which we now describe, is to identify the successful decoder. Given a received vector y, the decoder output Dπ(y) = ĉ is calculated as follows.

Decoder Dπ:
Step 1. c1 = DC(yS), e1 = dH(c1, yS).
Step 2. c2 = DC2(yL, yR(1)), e2 = dH((c2, c2), (yL, yR(1))).
Step 3. If c1 = F or wH(c1) is odd, then ĉ = c2.
Step 4. If e1 ≤ ⌊(t + 2)/2⌋, then ĉ = c1*.
Step 5. If e1 > ⌊(t + 2)/2⌋, let e1 = ⌊(t + 2)/2⌋ + a, where 1 ≤ a ≤ ⌈t/2⌉ − 1.
a) If e2 ≤ t0 + a, then ĉ = c2;
b) otherwise, ĉ = c1*.

The correctness of the decoder is proved in the next theorem.

Theorem 2: The decoder output satisfies Dπ(y) = ĉ = c.

Proof: According to Lemma 3, at least one of the two decoders in Steps 1 and 2 succeeds. Steps 3–5 help to determine which of the two decoders succeeds.

Step 3: Since yS is a noisy version of the codeword c', the decoding operation in Step 1 attempts to decode c', which, we recall, has even weight. If either c1 = F or the Hamming weight of c1 is odd, then this decoding operation necessarily fails, implying that the decoding operation in Step 2 was successful. If we reach Steps 4 and 5, then wH(c1) must be even.

Step 4: We now show that if e1 ≤ ⌊(t + 2)/2⌋, then E1 ≤ ⌊(t + 2)/2⌋ as well, and therefore the decoding operation in Step 1 succeeded. In order to see this, first note that the minimum Hamming distance of the code C is 2t + 1. If there is a miscorrection in Step 1, then the word yS is miscorrected to some codeword of even weight, while the weight of the error vector found in Step 1, e1, is at most ⌊(t + 2)/2⌋. Since the minimum Hamming distance of the code C is 2t + 1, the number E1 of actual errors in yS satisfies

E1 ≥ 2t + 2 − ⌊(t + 2)/2⌋ = ⌈3t/2⌉ + 1 > t0,

contradicting the fact that the number of errors in yS is at most t0. Therefore, the condition on e1 implies that the decoding operation c1 = DC(yS) = c' succeeds, and according to Lemma 2, we can conclude that ĉ = c1* = (c')* = c.

Step 5: We are left with the case where e1 > ⌊(t + 2)/2⌋. Since e1 ≤ t, we can write e1 = ⌊(t + 2)/2⌋ + a, where 1 ≤ a ≤ ⌈t/2⌉ − 1.

Assume first that the decoding in Step 2 fails. Then, according to Lemma 3, the decoding operation c1 = DC(yS) succeeds, implying that

E1 = e1 = ⌊(t + 2)/2⌋ + a.

The value of E2 then satisfies

E2 ≤ t0 − e1 = ⌊(3t + 1)/2⌋ − ⌊(t + 2)/2⌋ − a ≤ t − a,

and the total number of errors in (yL, yR(1)) is

E1 + 2E2 = (E1 + E2) + E2 ≤ t0 + t − a = ⌊(5t + 1)/2⌋ − a.

Since the decoder DC2((yL, yR(1))) fails and the minimum distance of C2 is 4t + 2, it follows that the weight of the error vector in Step 2, e2, must satisfy

e2 ≥ 4t + 2 − (E1 + 2E2) ≥ 4t + 2 − ⌊(5t + 1)/2⌋ + a ≥ t0 + a + 1.

Hence, we conclude that if e2 ≤ t0 + a, then the decoder in Step 2 must succeed.

Alternatively, assume that the decoding in Step 1 fails. As in Step 4, this means that the number of errors E1 in yS is at least

E1 ≥ 2t + 2 − (⌊(t + 2)/2⌋ + a) = t0 − (a − 1).

Since E1 + E2 ≤ t0, it follows that E2 satisfies 0 ≤ E2 ≤ a − 1, and E1 + 2E2, the total number of errors in (yL, yR(1)), satisfies

t0 − (a − 1) ≤ E1 + 2E2 = (E1 + E2) + E2 ≤ t0 + a − 1.

Thus, the decoding operation DC2((yL, yR(1))) succeeds, and

t0 − (a − 1) ≤ e2 ≤ t0 + a − 1.

Hence, if e2 > t0 + a, then the decoder in Step 1 must succeed. That completes the explanation of the assignments in a) and b) of Step 5.

We demonstrate the decoding algorithm in the following example.

Example 2: In this example we choose the code C to be the cyclic binary triple-error-correcting BCH code of length 15, so dH(C) = 7, dp(C) ≥ 7 + ⌈7/2⌉ = 11, t = 3, and t0 = 5. Thus, the code can correct five symbol-pair errors. Assume the stored word is the all-zeros codeword 0, and let y be the received vector

y = (00, 11, 10, 00, 00, 00, 11, 00, 10, 00, 00, 11, 00, 00, 00).

Then,

yL = (0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0),
yR = (0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0),
yS = (0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0),

and

yR(1) = (0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0).

In Step 1 of the decoding algorithm we calculate

c1 = DC(yS), e1 = dH(c1, yS).

Since yS suffered two errors and the decoder DC can decode at most three errors, we get c1 = 0 and e1 = 2. In Step 2 we calculate

c2 = DC2(yL, yR(1)), e2 = dH((c2, c2), (yL, yR(1))).

The word (yL, yR(1)) suffered eight errors, and since the code C2 has minimum distance 14, the decoder DC2 can successfully correct at most six errors. Hence, the output c2 is either F, indicating that there is no codeword at distance at most six from (yL, yR(1)), or some codeword c2 such that e2 ≤ 6. The condition in Step 3 fails, but the condition in Step 4 holds, because e1 = 2 ≤ ⌊(3 + 2)/2⌋ = 2. Therefore, we conclude that the decoder in Step 1 succeeds, and we can decode the codeword as ĉ = 0* = 0. Note that the operation 0* can result in either 0 or 1, but we can eliminate the word 1, as its distance from the received word is too large.

As another example, consider the received vector y given by

y = (10, 00, 01, 00, 10, 00, 00, 00, 00, 10, 00, 00, 00, 10, 00),

with associated vectors

yL = (1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0),
yR = (0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
yS = (1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0),
yR(1) = (0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0).

The word yS suffered five errors, so in Step 1 the decoded word c1 is either the failure symbol F or some codeword of weight seven or eight, at distance either two or three from yS, respectively. Let us assume for this example that c1 is a codeword of weight eight and e1 = 3. The input word (yL, yR(1)) suffered five errors; therefore, in Step 2 the decoding operation DC2(yL, yR(1)) succeeds, so c2 = 0 and e2 = 5. Now the conditions in Steps 3 and 4 do not hold, so Step 5 will determine which decoder succeeds. First, we see that a = 1, and so e2 = 5 < 5 + 1 = t0 + a. Hence, the condition in Step 5a) holds and we conclude that the second decoder succeeds, i.e., ĉ = 0.

To complete the decoder presentation, we return to the construction of the decoder DC2. This decoder receives two vectors, y1 = (y1,0, ..., y1,n−1) and y2 = (y2,0, ..., y2,n−1). Each is a noisy version of some codeword c ∈ C, and the goal is to correct a total of 2t errors in the two vectors. We define the vector y = (y0, ..., yn−1) such that for all 0 ≤ i ≤ n − 1, yi = y1,i if y1,i = y2,i, and otherwise yi = ? to indicate an erasure. If the number of errors in y is τ and the number of erasures is ρ, then we have 2τ + ρ ≤ 2t = dH(C) − 1, which is within the error and erasure correcting capability of C. We are left only with the problem of defining a decoder that corrects errors and erasures for cyclic codes; for that, we refer the reader to, for example, [14], [16]. Alternatively, we can treat the code C2 as a concatenated code where the inner code is simply the repetition code of length two. A general technique for decoding concatenated codes is described in [15, Ch. 12].

A symbol-pair error can change the value of either a single bit or both bits in a pair-read symbol. However, the maximum number of symbol-pair errors of each kind may be known in advance. For example, a symbol-pair error which corrupts only a single bit can be the result of a bit that was written erroneously to the medium, while both bits may be in error as a result of reading noise. Thus, we consider codes that distinguish between these two types of errors. Specifically, we say that a code C is a (t1, t2) symbol-pair error-correcting code if it can correct up to t1 single-bit symbol-pair errors and up to t2 double-bit symbol-pair errors.

Our discussion of (t1, t2) symbol-pair error-correcting codes uses as a point of departure a given binary linear cyclic code C with minimum Hamming distance dH(C) and a decoder DC. The next theorem provides a condition on dH(C) that implies that the code C is a (t1, t2) symbol-pair error-correcting code.

Theorem 3: A binary linear cyclic code C of length n and minimum distance dH(C) is a (t1, t2) symbol-pair error-correcting code if

dH(C) ≥ min {t1 + 2t2 + 1, 2t1 + 1}

and t1 + t2 < n/2.

Proof: We examine two different approaches to correct such errors.

1) The first approach uses the decoder DC2 of the double-repetition code of C, as introduced earlier in this section. This decoder will need to correct t1 + 2t2 errors, and therefore

dH(C2) = 2 · dH(C) ≥ 2(t1 + 2t2) + 1,

or

dH(C) ≥ t1 + 2t2 + 1.

2) The second approach uses the decoder DC that we applied to the vector yS, also previously described. This decoder is required to correct t1 errors, and thus dH(C) ≥ 2t1 + 1. Note that, according to Lemma 2, the condition t1 + t2 < n/2 guarantees a successful recovery of the stored codeword c based upon the decoding of c'.

We conclude that if the condition in the theorem statement holds, we can choose either of these two approaches as the basis for a successful decoder.

From Theorem 3, we conclude that knowing the values of t1 and t2 significantly simplifies the decoder operation, since we know which of the two decoders to apply in order to correct the symbol-pair errors. However, this knowledge may also reduce the lower bound on the required minimum Hamming distance of the code C, and thus the required code redundancy. To see this, assume that we construct a (t1, t2) symbol-pair error-correcting code using a linear cyclic code C which corrects t1 + t2 symbol-pair errors. Then, according to Theorem 1, its minimum Hamming distance dH(C) is required to satisfy

dH(C) + ⌈dH(C)/2⌉ ≥ 2(t1 + t2) + 1.   (2)

On the other hand, according to Theorem 3, dH(C) need only be greater than or equal to min {t1 + 2t2 + 1, 2t1 + 1}. It is straightforward to verify that, for any two non-negative integers t1 and t2, this lower bound on dH(C) is no greater than the one imposed by (2). In the next section, we extend some of our results on the symbol-pair read channel to b-symbol read channels, with b ≥ 3.

We refer to the elements of πb(x) as b-symbols. The b-symbol distance between x and y, denoted by db(x, y), is defined as

db(x, y) = dH(πb(x), πb(y)).

For simplicity, we sometimes refer to this as the b-distance. Similarly, we define the b-weight of the vector x as wb(x) = wH(πb(x)). As was the case for the symbol-pair distance, it is easy to see that db(x, y) = wb(x + y) for all x, y. For convenience of notation, if the i-th symbol πb(x)i of πb(x) is not the all-zero b-tuple, we will write πb(x)i ≠ 0.

The following proposition is a natural generalization of Proposition 2.

Proposition 3: Let x, y ∈ Σ^n be such that 0 < dH(x, y) ≤ n − (b − 1). Then

dH(x, y) + b − 1 ≤ db(x, y) ≤ b · dH(x, y).

Since C is a linear cyclic code, if c is a codeword then so is c'. Thus, we can find an index k such that xk = xk+1 = ··· = xk+b−1 = 0 and xk+b = 1. Therefore, for all k + 1 ≤ i ≤ k + b − 1, we have πb(x)i ≠ 0. Furthermore, for all i such that xi = 1, we also have πb(x)i ≠ 0. It follows that

wb(x) ≥ wH(x) + b − 1.
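The b-symbol read vector and the b-weight can be prototyped directly from these definitions. Below is a minimal Python sketch (function names are ours) that also illustrates the identity db(x, y) = wb(x + y) and the lower bound wb(x) ≥ wH(x) + b − 1 for a vector containing b consecutive zeros followed by a one:

```python
def b_read(x, b):
    """pi_b(x): length-n vector of cyclically consecutive b-tuples."""
    n = len(x)
    return [tuple(x[(i + j) % n] for j in range(b)) for i in range(n)]

def b_weight(x, b):
    """w_b(x): number of b-symbols of pi_b(x) that are not all-zero."""
    return sum(1 for s in b_read(x, b) if any(s))

def b_distance(x, y, b):
    """d_b(x, y): Hamming distance between pi_b(x) and pi_b(y)."""
    return sum(1 for s, u in zip(b_read(x, b), b_read(y, b)) if s != u)

# b = 2 recovers the symbol-pair read vector.
assert b_read([0, 1, 0], 2) == [(0, 1), (1, 0), (0, 0)]

# d_b(x, y) = w_b(x + y), as in the symbol-pair case.
x = [0, 1, 1, 0, 0, 0, 1, 0]
zero = [0] * 8
assert b_distance(x, zero, 4) == b_weight(x, 4) == 8

# A vector with b consecutive zeros followed by a one illustrates
# w_b(z) >= w_H(z) + b - 1: here w_H = 2 and w_4 = 6 >= 2 + 3.
z = [1, 0, 1, 0, 0, 0, 0, 0]
assert b_weight(z, 4) == 6
```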

V. E XTENSIONS TO b-SYMBOL READ CHANNEL Our next goal is to suitably generalize Lemma 1, giving a In this section, we study the b-symbol read channel, useful characterization of wb(x) for arbitrary b  3. In order 3  b < n,whereb symbols are sensed in each read operation. to do this, we introduce for every x ∈ n an auxiliary vector First, we formally define the b-symbol read channel model x ∈ n, obtained from x by inverting every sequence of b −2 and introduce the b-symbol distance function. We then prove or fewer consecutive zeros in x. More formally, we define x some of their basic properties, along the lines of those of as follows. If (xi , xi+1,...,xi+k , xi+k+1) = (1, 0,...,0, 1) the symbol-pair read channel model and symbol-pair distance. for some 0  i  n − 1andk  b − 2, then x j = 1 − x j = 1 Next, we turn to constructions of codes for the b-symbol for i + 1  j  i + k. For all other values of j, x j = x j . read channel. Generalizing results in [1], we analyze the Example 3: Assume b = 4andletx = (0, 1, 1, 0, 0, b-symbol distance properties of codes obtained by interleaving 0, 1, 0).Then b component codes, and then describe a decoding algorithm  = ( , , , , , , , ). that decodes up to the decoding radius. Finally, we study x 1 1 1 0 0 0 1 1 b-symbol properties of two specific families of codes, namely Note that the sequence of consecutive zeros beginning at n the codes corresponding to the entire space of codewords, , position 3 has length b − 1 = 3 and therefore these zeros = m − and the linear cyclic Hamming codes of length n 2 1, remain unchanged in x, while the sequence of cyclically  m 3. consecutive zeros beginning at position 7 has length 2 < b − 1 = 3 and therefore these zeros are changed to ones. A. Basic Properties Now, we state and prove the generalization of Lemma 1 for b  3. 
For b  3, the b-symbol read vector corresponding to the Lemma 4: For any x ∈ n and positive integer b  3, x = (x , x ,...,x − ) ∈ n vector 0 1 n 1 is defined as w ()   w ( ) = w () + ( − ) · H x . b n b x H x b 1 πb(x) =[(x0,...,xb−1),...,(xn−1, x0,...,xb−2)]∈  . 2 1548 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 62, NO. 4, APRIL 2016
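To make these definitions concrete, here is a small Python sketch (our own illustration, not part of the paper; all function names are ours) that computes the b-symbol read vector π_b(x), the b-weight, the auxiliary vector x′, and its cyclic derivative x″, and checks the Lemma 4 identity on the paper's running examples:

```python
# Illustrative sketch of the Section V-A definitions (our own code).

def pi_b(x, b):
    """b-symbol read vector: n cyclic windows of length b."""
    n = len(x)
    return [tuple(x[(i + j) % n] for j in range(b)) for i in range(n)]

def w_b(x, b):
    """b-weight: number of non-zero b-symbols of x."""
    return sum(1 for s in pi_b(x, b) if any(s))

def invert_short_zero_runs(x, b):
    """Auxiliary vector x': invert every cyclic run of <= b-2 zeros
    that is bounded by ones on both sides."""
    n = len(x)
    xp = list(x)
    for i in range(n):
        if x[i] != 1:
            continue
        k = 0  # length of the zero run starting right after position i
        while k < n - 1 and x[(i + 1 + k) % n] == 0:
            k += 1
        if 1 <= k <= b - 2 and x[(i + 1 + k) % n] == 1:
            for j in range(1, k + 1):
                xp[(i + j) % n] = 1
    return xp

def derivative(x):
    """x'': cyclic derivative (x0+x1, x1+x2, ..., x_{n-1}+x0), as in (1)."""
    n = len(x)
    return [x[i] ^ x[(i + 1) % n] for i in range(n)]

b = 4
x = [0, 1, 1, 0, 0, 0, 1, 0]            # Example 3
print(invert_short_zero_runs(x, b))      # -> [1, 1, 1, 0, 0, 0, 1, 1]

x = [1, 0, 0, 0, 1, 0, 0, 1]            # Example 4
xp = invert_short_zero_runs(x, b)        # x'  = [1, 0, 0, 0, 1, 1, 1, 1]
xpp = derivative(xp)                     # x'' = [1, 0, 0, 1, 0, 0, 0, 0]
# Lemma 4: w_b(x) = w_H(x') + (b-1) * w_H(x'')/2
print(w_b(x, b), sum(xp) + (b - 1) * sum(xpp) // 2)   # -> 8 8
```

Running this reproduces the numbers w_H(x′) = 5, w_H(x″) = 2, and w_4(x) = 8 of the examples above.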

Proof: Let us first show that w_b(x) = w_b(x′). The only positions j for which π_b(x)_j and π_b(x′)_j differ are those for which π_b(x)_j = (x_j, ..., x_{j+b−1}) contains a zero bit in a length-(k + 2) sequence of the form

    (x_i, x_{i+1}, ..., x_{i+k}, x_{i+k+1}) = (1, 0, ..., 0, 1),

where k ≤ b − 2. The zero bits x_{i+1}, ..., x_{i+k} appear in the j-th symbol of π_b(x) for i − b + 2 ≤ j ≤ i + k. However, since x_i = x_{i+k+1} = 1, in this range of values of j, we see that both π_b(x)_j ≠ 0 and π_b(x′)_j ≠ 0. For all other positions j, the corresponding bits of π_b(x)_j and π_b(x′)_j are the same and thus π_b(x)_j = 0 if and only if π_b(x′)_j = 0.

Next, we determine the value of w_b(x′), making use of the fact that any sequence of consecutive zeros in x′ has length at least b − 1. Let

    S_0 = {i : π_b(x′)_i ≠ 0, x′_i = 1},
    S_1 = {i : π_b(x′)_i ≠ 0, x′_i = 0, x′_{i+1} = 1},
    ...
    S_{b−2} = {i : π_b(x′)_i ≠ 0, x′_i = ··· = x′_{i+b−3} = 0, x′_{i+b−2} = 1},
    S_{b−1} = {i : π_b(x′)_i ≠ 0, x′_i = ··· = x′_{i+b−2} = 0, x′_{i+b−1} = 1}.

Clearly, w_H(x′) = |S_0| and, since S_j ∩ S_ℓ = ∅ for all 0 ≤ j < ℓ ≤ b − 1, we also have

    w_b(x′) = |∪_{i=0}^{b−1} S_i| = Σ_{i=0}^{b−1} |S_i|.

Let us now show that |S_1| = |S_ℓ| for all 2 ≤ ℓ ≤ b − 1. If i ∈ S_1 then (x′_i, x′_{i+1}) = (0, 1). Since there is no sequence of fewer than b − 1 consecutive zeros in x′,

    (x′_{i−(b−2)}, ..., x′_i, x′_{i+1}) = (0, ..., 0, 1),

and thus i − (ℓ − 1) ∈ S_ℓ. Hence, |S_ℓ| ≥ |S_1|. For the opposite inequality, note that if i ∈ S_ℓ, ℓ ≥ 1, then

    (x′_i, x′_{i+1}, ..., x′_{i+ℓ−1}, x′_{i+ℓ}) = (0, ..., 0, 1).

Therefore (x′_{i+ℓ−1}, x′_{i+ℓ}) = (0, 1), so i + ℓ − 1 ∈ S_1, implying that |S_1| ≥ |S_ℓ|. Hence, |S_1| = |S_ℓ| for all 2 ≤ ℓ ≤ b − 1.

Recall that the vector x″ is defined according to (1) to be the vector

    x″ = (x′_0 + x′_1, x′_1 + x′_2, ..., x′_{n−1} + x′_0),

and hence, as in the proof of Lemma 1, |S_1| = w_H(x″)/2, so we can conclude that

    w_b(x) = Σ_{ℓ=0}^{b−1} |S_ℓ| = w_H(x′) + (b − 1) · w_H(x″)/2.

Example 4: Assume b = 4 and let x = (1, 0, 0, 0, 1, 0, 0, 1). Then w_H(x) = 3,

    π_4(x) = [1000, 0001, 0010, 0100, 1001, 0011, 0110, 1100],

and, therefore, w_4(x) = 8. It is easy to verify that the inequalities in Proposition 3 hold. We also see that

    x′ = [1, 0, 0, 0, 1, 1, 1, 1]  and  x″ = [1, 0, 0, 1, 0, 0, 0, 0],

so w_H(x′) = 5 and w_H(x″) = 2. The relationship in Lemma 4 is clearly seen to hold.

In analogy to the definition of symbol-pair codes, we define the b-symbol read code of a code C over Σ to be the code π_b(C) = {π_b(c) : c ∈ C} over Σ_b. The minimum b-distance of C, d_b(C), is given by d_b(C) = d_H(π_b(C)), where the Hamming distance is over the alphabet Σ_b. Referring to Proposition 3, we see that if 0 < d_H(C) ≤ n − (b − 1), then

    d_H(C) + b − 1 ≤ d_b(C) ≤ b · d_H(C).    (3)

A b-symbol error in the i-th symbol of π_b(c) changes at least one of the b symbols in the vector π_b(c)_i = (x_i, ..., x_{i+b−1}). Using the same reasoning as in the proof of Proposition 3 in [1], we see that a code C can correct any t b-symbol errors if d_b(C) ≥ 2t + 1.

In the next section, we study the b-distance of a code constructed by interleaving. We note that an extension of Theorem 1 is not straightforward to derive in this case. Namely, if c is a codeword in a linear cyclic code C, then the vectors c′ and c″ do not necessarily belong to the code C, and hence it is not possible to use Lemma 4 to derive a bound on the minimum b-distance of a binary cyclic code.

B. Code Construction by Interleaving

The interleaving scheme studied in [1] generates codes C that satisfy d_p(C) = 2 d_H(C). We will next show how this construction can be generalized for arbitrary b ≥ 3, so that we generate codes which satisfy d_b(C) = b · d_H(C). This result also proves that the upper bound on the minimum b-distance stated in (3) is tight. The standard notation (n, M, d) will be used to denote the parameters of a binary code of length n, size M, and minimum distance d. Given a collection of b codes C_0, ..., C_{b−1}, where for 0 ≤ i ≤ b − 1 the code C_i has parameters (n, M_i, d_i), their interleaved code C is a (bn, Π_{i=0}^{b−1} M_i, min_{0≤i≤b−1} {d_i}) code defined as follows:

    C = {(c_{0,0}, ..., c_{b−1,0}, c_{0,1}, ..., c_{b−1,1}, ..., c_{0,n−1}, ..., c_{b−1,n−1}) : c_i = (c_{i,0}, ..., c_{i,n−1}) ∈ C_i, for 0 ≤ i ≤ b − 1}.

Theorem 4: Let C_0, ..., C_{b−1} be a set of b binary codes with respective parameters (n, M_i, d_i), for 0 ≤ i ≤ b − 1. Then their interleaved code C satisfies

    d_b(C) = b · d_H(C) = b · min_{0≤i≤b−1} {d_i}.

Proof: Every codeword c ∈ C has the form

    c = (c_{0,0}, ..., c_{b−1,0}, c_{0,1}, ..., c_{b−1,1}, ..., c_{0,n−1}, ..., c_{b−1,n−1}),

obtained by interleaving the codewords c_i = (c_{i,0}, ..., c_{i,n−1}) ∈ C_i, 0 ≤ i ≤ b − 1. If c ≠ 0, then c_i ≠ 0 for some 0 ≤ i ≤ b − 1. The symbols in c_i are separated by at least b positions from one another in c and, therefore, there are at least b · w_H(c_i) symbols in π_b(c) that are non-zero. That is,

    w_b(c) ≥ b · w_H(c_i) ≥ b · min_{0≤i≤b−1} {d_i} = b · d_H(C).

The opposite inequality follows from the fact that there is a codeword c ∈ C such that w_H(c) = d_H(C) and, according to Proposition 3,

    w_b(c) ≤ b · w_H(c) = b · d_H(C).

We conclude that d_b(C) = b · d_H(C), as claimed.

According to the remark at the end of the previous subsection, Theorem 4 implies that the interleaved code C can correct any ⌊(d_b(C) − 1)/2⌋ b-symbol errors. We now describe a decoding algorithm for C that achieves this decoding radius.

For a length-n vector c = (c_0, ..., c_{n−1}) we define the length-bn vector

    (c)_b = (c_0, ..., c_0, ..., c_{n−1}, ..., c_{n−1}),

obtained by repeating b times each bit in c. Now, for each component code C_i, 0 ≤ i ≤ b − 1, of the interleaved code C, we let (C_i)_b be the length-bn code

    (C_i)_b = {(c)_b : c ∈ C_i}.

If C_i is an (n, M_i, d_i) code, then (C_i)_b is a (bn, M_i, b·d_i) code. We assume that the code C_i has a bounded distance decoder D_i that corrects errors of weight up to the decoding radius. Since the code (C_i)_b can be interpreted as a concatenation of an outer code C_i and an inner b-repetition code, we can also assume that (C_i)_b has a decoder (D_i)_b that can correct up to ⌊(b·d_i − 1)/2⌋ errors; for more details on constructing the decoder (D_i)_b from the decoder D_i, we again refer the reader to [15, Ch. 12].

Assume that the stored codeword c ∈ C is

    c = (c_{0,0}, ..., c_{b−1,0}, c_{0,1}, ..., c_{b−1,1}, ..., c_{0,n−1}, ..., c_{b−1,n−1}),

where c_i = (c_{i,0}, c_{i,1}, ..., c_{i,n−1}) ∈ C_i for all 0 ≤ i ≤ b − 1. Let π_b(c) be the b-symbol read vector of c, and let y be the length-bn received vector. We represent y as

    y = (y_0, ..., y_{bn−1}),

where y_j = (y_{j,0}, ..., y_{j,b−1}), for 0 ≤ j ≤ bn − 1, and we make the assumption that

    d_H(y, π_b(c)) ≤ ⌊(d_b(C) − 1)/2⌋.

If we index the bit positions in c from 0 to bn − 1, the bit c_{i,j} in c lies in position jb + i, for 0 ≤ i ≤ b − 1, 0 ≤ j ≤ n − 1. Each bit c_{i,j} is read b times, corresponding to the components y_{jb+i−(b−1),b−1}, ..., y_{jb+i−1,1}, and y_{jb+i,0} that belong to the b-symbols y_{jb+i−(b−1)}, ..., y_{jb+i−1}, and y_{jb+i}, respectively.

Next, we combine these estimates of c_{i,j} into a binary vector y′_{i,j}, for 0 ≤ i ≤ b − 1, 0 ≤ j ≤ n − 1, where

    y′_{i,j} = (y_{jb+i−(b−1),b−1}, ..., y_{jb+i−1,1}, y_{jb+i,0}).

Finally, the vector y, treated as a binary vector, is partitioned into the following b vectors, each of length bn bits,

    y′_i = (y′_{i,0}, y′_{i,1}, ..., y′_{i,n−1}),

for 0 ≤ i ≤ b − 1.

Lemma 5: For 0 ≤ i ≤ b − 1,

    d_H(y′_i, (c_i)_b) ≤ ⌊(d_b(C) − 1)/2⌋,

where d_H denotes the Hamming distance over the binary alphabet Σ.

Proof: First, note that, for 0 ≤ i ≤ b − 1, every y′_i is a noisy version of the codeword

    (c_i)_b = (c_{i,0}, ..., c_{i,0}, ..., c_{i,n−1}, ..., c_{i,n−1}) ∈ (C_i)_b.

Furthermore, every b-symbol error in y can change at most one of the bits in every y′_i and thus the binary Hamming distance between (c_i)_b and y′_i is at most ⌊(d_b(C) − 1)/2⌋, that is,

    d_H(y′_i, (c_i)_b) ≤ ⌊(d_b(C) − 1)/2⌋.

Finally, from Theorem 4, we have

    ⌊(d_b(C) − 1)/2⌋ = ⌊(b · d_H(C) − 1)/2⌋ ≤ ⌊(b · d_i − 1)/2⌋,

for all i with 0 ≤ i ≤ b − 1, which implies that the decoder (D_i)_b can successfully decode the word y′_i. Thus, for 0 ≤ i ≤ b − 1, the codeword c_i is successfully decoded and, therefore, so is the codeword c.

C. Codes With Small Minimum Hamming Distance

In this section we study the minimum b-distance of two special classes of codes. The first class corresponds to the "complete" codebooks Σ^n and the second to the linear cyclic Hamming codes.

Given a length-n vector over Σ_b, the majority decoder estimates each bit x_i by the majority value among the b reads of that bit, one in each of the b b-symbols that contain it, or "?" if b is even and the numbers of zeros and ones are equal. The following lemma provides the minimum b-distance of the code Σ^n and proves that the majority decoder can be used to decode up to the b-distance decoding radius.

Lemma 6: Let C = Σ^n. For all b ≥ 3, the minimum b-distance satisfies d_b(C) = b and the majority decoder can correct ⌊(b−1)/2⌋ b-symbol errors.

Proof: Assume x is a non-zero vector, and let x′ be as defined above. Then x′ ≠ 0 and therefore w_H(x′) ≥ 1. If x′ ≠ 1, then w_H(x″) ≥ 2. Lemma 4 then implies that

    w_b(x) ≥ 1 + (b − 1) · 2/2 = b.

If x′ = 1, then w_b(x) = n, and the inequality follows from the fact that b < n. We conclude that d_b(C) = b by noting that the b-symbol read vector of a weight-1 word in C has b-weight b.

If there are ⌊(b−1)/2⌋ symbol errors in the received version of π_b(x), then for every 0 ≤ i ≤ n − 1, the bit x_i of the vector x is in error in at most ⌊(b−1)/2⌋ of the b symbols that provide an estimate of that bit. Thus, the majority decoder can be used to recover every bit x_i and, therefore, the vector x.

The following lemma considers the b-distance of linear cyclic Hamming codes, providing a generalization of a result in [1].

Lemma 7: If C is the linear cyclic Hamming code of length n = 2^m − 1 and b + 2 ≤ m, then d_b(C) = 2b + 1.

Proof: Let x be a non-zero codeword. We first show that w_b(x) ≥ 2b + 1. Clearly, w_H(x′) ≥ w_H(x) ≥ 3. Assume that x′ ≠ 1, so w_H(x″) is a positive even integer. If w_H(x″) ≥ 4, then according to Lemma 4, we have

    w_b(x) ≥ 3 + (b − 1) · 4/2 = 2b + 1.

If w_H(x″) = 2, then x′ contains a single sequence of consecutive ones, whose length ℓ must satisfy ℓ ≥ m. Otherwise, if ...
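The majority decoding argument of Lemma 6 can be sketched in code. The following is our own illustrative implementation, not the paper's; for simplicity it breaks even-b ties arbitrarily to 0 instead of outputting "?". Each bit x_i appears in the b consecutive b-symbols π_b(x)_{i−(b−1)}, ..., π_b(x)_i, so with at most ⌊(b−1)/2⌋ b-symbol errors a majority vote over those b reads recovers every bit:

```python
# Majority decoder for C = Sigma^n (illustrative sketch of Lemma 6).

def pi_b(x, b):
    """b-symbol read vector: n cyclic windows of length b."""
    n = len(x)
    return [tuple(x[(i + j) % n] for j in range(b)) for i in range(n)]

def majority_decode(y, b):
    """y: list of n b-tuples (a noisy b-symbol read vector)."""
    n = len(y)
    x_hat = []
    for i in range(n):
        # the b estimates of bit x_i: component j of symbol (i - j) mod n
        votes = [y[(i - j) % n][j] for j in range(b)]
        x_hat.append(1 if 2 * sum(votes) > b else 0)  # ties (b even) -> 0
    return x_hat

b, x = 5, [1, 0, 1, 1, 0, 0, 1]
y = [list(s) for s in pi_b(x, b)]
y[2][0] ^= 1            # corrupt two b-symbols: floor((b-1)/2) = 2 errors
y[4] = [1, 1, 1, 1, 1]
print(majority_decode([tuple(s) for s in y], b) == x)   # -> True
```

Since the two corrupted symbols contribute at most two of the five votes for any bit, the majority is always correct, matching the ⌊(b−1)/2⌋ = 2 error-correction radius of the lemma.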

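Returning to the interleaving construction of Section V-B, Theorem 4 can also be sanity-checked numerically. The following script is our own illustration (the component code and parameters are chosen only for the example): it interleaves b = 3 copies of the (3, 2, 3) binary repetition code and verifies that the interleaved code attains the upper bound in (3), namely d_b(C) = b · d_H(C) = 9:

```python
# Numerical check of Theorem 4 on a tiny interleaved code (our own sketch).
from itertools import product

def pi_b(x, b):
    """b-symbol read vector: n cyclic windows of length b."""
    n = len(x)
    return [tuple(x[(i + j) % n] for j in range(b)) for i in range(n)]

def d_b(x, y, b):
    """b-distance: Hamming distance between the b-symbol read vectors."""
    return sum(s != t for s, t in zip(pi_b(x, b), pi_b(y, b)))

def interleave(codewords):
    """c_{0,0}, ..., c_{b-1,0}, c_{0,1}, ..., c_{b-1,n-1}."""
    b, n = len(codewords), len(codewords[0])
    return [codewords[i][j] for j in range(n) for i in range(b)]

b = 3
rep3 = [(0, 0, 0), (1, 1, 1)]                 # (3, 2, 3) repetition code
C = [interleave(cs) for cs in product(rep3, repeat=b)]

min_db = min(d_b(u, v, b) for u in C for v in C if u != v)
print(min_db)        # -> 9, i.e. b * d_H(C) = 3 * 3
```

The exhaustive minimization is feasible here only because the code has 8 codewords; the point is simply that every pair of distinct codewords is at b-distance at least b · min d_i, as Theorem 4 asserts.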
ACKNOWLEDGEMENT The authors thank Yuval Cassuto for helpful discussions on the symbol-pair read channel and Ryan Gabrys for bringing reference [17] to their attention. They also thank two anony- mous reviewers and the Associate Editor Prof. Yongyi Mao for their valuable comments and suggestions.

REFERENCES

[1] Y. Cassuto and M. Blaum, "Codes for symbol-pair read channels," IEEE Trans. Inf. Theory, vol. 57, no. 12, pp. 8011–8020, Dec. 2011.
[2] Y. Cassuto and S. Litsyn, "Symbol-pair codes: Algebraic constructions and asymptotic bounds," in Proc. IEEE Int. Symp. Inf. Theory, St. Petersburg, Russia, Jul./Aug. 2011, pp. 2348–2352.
[3] Y. M. Chee, H. M. Kiah, and C. Wang, "Maximum distance separable symbol-pair codes," in Proc. IEEE Int. Symp. Inf. Theory, Cambridge, MA, USA, Jul. 2012, pp. 2886–2890.
[4] Y. M. Chee, L. Ji, H. M. Kiah, C. Wang, and J. Yin, "Maximum distance separable codes for symbol-pair read channels," IEEE Trans. Inf. Theory, vol. 59, no. 11, pp. 7259–7267, Nov. 2013.
[5] M. Hirotomo, M. Takita, and M. Morii, "Syndrome decoding of ..."

Eitan Yaakobi (S'07–M'12) is an Assistant Professor at the Computer Science Department at the Technion — Israel Institute of Technology. He received the B.A. degrees in computer science and mathematics, and the M.Sc. degree in computer science from the Technion — Israel Institute of Technology, Haifa, Israel, in 2005 and 2007, respectively, and the Ph.D. degree in electrical engineering from the University of California, San Diego, in 2011. Between 2011 and 2013, he was a postdoctoral researcher in the Department of Electrical Engineering at the California Institute of Technology. His research interests include information and coding theory with applications to non-volatile memories, associative memories, data storage and retrieval, and voting theory. He received the Marconi Society Young Scholar Award in 2009 and the Intel Ph.D. Fellowship in 2010–2011.

Jehoshua Bruck (S'86–M'89–SM'93–F'01) is the Gordon and Betty Moore Professor of computation and neural systems and electrical engineering at the California Institute of Technology (Caltech). His current research interests include information theory and systems and the theory of computation in nature.

Dr. Bruck received the B.Sc. and M.Sc. degrees in electrical engineering from the Technion — Israel Institute of Technology, in 1982 and 1985, respectively, and the Ph.D. degree in electrical engineering from Stanford University, in 1989.

His industrial and entrepreneurial experiences include working with IBM Research, where he participated in the design and implementation of the first IBM parallel computer; cofounding and serving as Chairman of Rainfinity (acquired in 2005 by EMC), a spin-off company from Caltech that created the first virtualization solution for Network Attached Storage; as well as cofounding and serving as Chairman of XtremIO (acquired in 2012 by EMC), a start-up company that created the first scalable all-flash enterprise storage system.

Dr. Bruck is a recipient of the Feynman Prize for Excellence in Teaching, the Sloan Research Fellowship, the National Science Foundation Young Investigator Award, the IBM Outstanding Innovation Award, and the IBM Outstanding Technical Achievement Award.

Paul H. Siegel (M'82–SM'90–F'97) received the S.B. and Ph.D. degrees in mathematics from the Massachusetts Institute of Technology (MIT), Cambridge, in 1975 and 1979, respectively. He held a Chaim Weizmann Postdoctoral Fellowship at the Courant Institute, New York University. He was with the IBM Research Division in San Jose, CA, from 1980 to 1995. He joined the faculty at the University of California, San Diego in July 1995, where he is currently Professor of Electrical and Computer Engineering in the Jacobs School of Engineering. He is affiliated with the Center for Magnetic Recording Research, where he holds an endowed chair and served as Director from 2000 to 2011. His primary research interests lie in the areas of information theory and communications, particularly coding and modulation techniques, with applications to digital data storage and transmission.

Prof. Siegel was a member of the Board of Governors of the IEEE Information Theory Society from 1991 to 1996 and again from 2009 to 2014. He served as Co-Guest Editor of the May 1991 Special Issue on "Coding for Storage Devices" of the IEEE TRANSACTIONS ON INFORMATION THEORY. He served the same TRANSACTIONS as Associate Editor for Coding Techniques from 1992 to 1995, and as Editor-in-Chief from July 2001 to July 2004. He was also Co-Guest Editor of the May/September 2001 two-part issue on "The Turbo Principle: From Theory to Practice" and the February 2016 issue on "Recent Advances in Capacity Approaching Codes" of the IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS.

Prof. Siegel was co-recipient, with R. Karabed, of the 1992 IEEE Information Theory Society Paper Award and shared the 1993 IEEE Communications Society Leonard G. Abraham Prize Paper Award with B. H. Marcus and J. K. Wolf. With J. B. Soriaga and H. D. Pfister, he received the 2007 Best Paper Award in Signal Processing and Coding for Data Storage from the Data Storage Technical Committee of the IEEE Communications Society. He was the 2015 Padovani Lecturer of the IEEE Information Theory Society. Prof. Siegel is an IEEE Fellow and a member of the National Academy of Engineering.