Quantum serial turbo-codes

David Poulin∗, Jean-Pierre Tillich†, and Harold Ollivier‡ ∗ Center for the Physics of Information, California Institute of Technology, Pasadena, CA 91125, USA. † INRIA, Equipe Secret, Domaine de Voluceau BP 105, F-78153 Le Chesnay cedex, France. ‡ Perimeter Institute for Theoretical Physics, Waterloo, ON, N2J 2W9, Canada.

Abstract— We present a theory of quantum serial turbo- all decoded by a belief propagation algorithm which, albeit codes, describe their iterative decoding algorithm, and study sub-optimal, has been shown to have astonishing performance their performances numerically on a depolarization channel. even at rates very close to the channel capacity. Moreover, Our construction offers several advantages over quantum LDPC codes. First, the Tanner graph used for decoding is free of 4- the randomness involved in the code design can facilitate the cycles that deteriorate the performances of iterative decoding. analysis of their average performance. Indeed, probabilistic Secondly, the iterative decoder makes explicit use of the code’s codes are in many aspect related to quench-disordered physical degeneracy. Finally, there is complete freedom in the code design systems, so standard statistical physics tools can be called into in terms of length, rate, memory size, and interleaver choice. play [46], [29]. We define a quantum analogue of a state diagram that provides an efficient way to verify the properties of a quantum and quantum error correction [41], , and in particular its recursiveness and the [44], [4], [17], [23] are much younger theories and differ from presence of catastrophic error propagation. We prove that all their classical cousins in many aspects. For instance, there recursive quantum convolutional encoder have catastrophic error exists a quantum analogue of the Shannon channel capacity propagation. In our constructions, the convolutional codes have called the quantum channel capacity [12], [40], [26], which thus been chosen to be non-catastrophic and non-recursive. While the resulting families of turbo-codes have bounded minimum sets the maximum rate at which quantum information can be distance, from a pragmatic point of view the effective minimum sent over a noisy quantum channel. Contrarily to the classical distances of the codes that we have simulated are large enough not case, we do not know how to efficiently compute its value for to degrade the iterative decoding performance up to reasonable channels of practical significance, except for quite peculiar word error rates and block sizes. With well chosen constituent channels such as the quantum erasure channel where it is convolutional codes, we observe an important reduction of the word error rate as the code length increases. equal to one minus twice the erasure probability [3]. For the depolarizing channel – the quantum generalization of the I.INTRODUCTION binary symmetric channel – random codes do not achieve the For the fifty years that followed Shannon’s landmark pa- optimal transmission rate in general. Instead, they provide a per [39] on , the primary goal of the field lower bound on the channel capacity, often referred to as the of was the design of practical coding schemes hashing bound. In fact, coding schemes have been designed to that could come arbitrarily close to the channel capacity. reliably transmit information on a depolarization channel in a Random codes were used by Shannon to prove the existence noise regime where the hashing bound is zero [14], [42]. of codes approaching the capacity – in fact he proved that The stabilizer formalism [17] is a powerful method in the overwhelming majority of codes are good in this sense. which a quantum code on n qubits can be seen as classical For symmetric channels this can even be achieved by linear linear codes on 2n bits, but with a parity-check matrix whose codes. Unfortunately, decoding a is an NP-hard rows are orthogonal relative to a symplectic inner product. problem [5], so they have no practical relevance. Making the Moreover, a special class of stabilizer codes, called CSS codes decoding problem tractable thus requires the use of codes with after their inventors [8], [43], can turn any pair of dual classical

arXiv:0712.2888v2 [quant-ph] 10 Jun 2009 even more structure. linear code into a quantum code with related properties. The The first few decades were dominated by algebraic coding stabilizer formalism and the CSS construction allow to import theory. Codes such as Reed-Solomon codes [38] and Bose- a great deal of knowledge directly from the classical theory, Chaudhuri-Hocquenghem codes [21], [7] use the algebraic and one may hope to use them to leverage the power of structure of finite fields to design codes with large minimal probabilistic coding to the quantum domain. In particular, one distances that have efficient minimal distance decoders. The may expect that, as in the classical case, quantum analogues most satisfying compromise nowadays is instead obtained of LDPC codes or turbo-codes could perform under iterative from families of codes (sometimes referred to as “probabilistic decoding as well as random quantum codes, i.e. that they could codes”) with some element of randomness but sufficiently come arbitrarily close to the hashing bound. structured to be suitable for iterative decoding. They display For this purpose, it is also necessary to design a good good performances for a large class of error models with a iterative decoding algorithm for quantum codes. For a special decoding algorithm of reasonable complexity. The most promi- class of noise models considered here – namely Pauli noise nent families of probabilistic codes are Gallager’s low density models – it turns out that a version of the classical belief parity-check (LDPC) codes [16] and turbo-codes [6]. They are propagation algorithm can be applied. For CSS codes in particular, each code in the pair of dual codes can be decoded Pauli channels, quantum turbo-codes can be decoded similarly independently as a classical code. However, this is done at the to classical serial turbo-codes. One of the motivation behind cost of neglecting some correlations between errors that impact this work was to overcome some of the problems faced by the coding scheme’s performances. For some class of stabilizer quantum LDPC codes. For instance, graphical representation codes, the classical belief propagation can be improved to of serial quantum turbo-codes do not necessarily contain 4- exploit the coset structure of degenerate errors which improve cycles. Moreover, there is complete freedom in the code pa- the code’s performances. This is the case for concatenated rameters. Both of these points are related to the fact that there block codes [35] and the turbo-codes we consider here, but are basically no restrictions on the choice of the interleaver we do not know how to exploit this feature for LDPC codes used in the concatenation. An other advantage over LDPC for instance. Finally, a quantum belief propagation algorithm codes is that the decoder makes explicit use of the coset was recently proposed [25] to enable iterative decoding of structure associated to degenerate errors. more general (non-Pauli) noise models. As in the classical Despite these features, the iterative decoding performance of case, quantum belief propagation also ties in with statistical the turbo-code considered in [33] was quite poor, much poorer physics [20], [24], [25], [36]. in fact that results obtained from quantum LDPC codes. The We emphasize that a fast decoding algorithm is crucial in purpose of the present article is to discuss in length several quantum information theory. In the classical setting, when issues omitted in [33], to provide a detailed description of the error correction codes are used for communication over a decoding algorithm, to suggest much better turbo-codes than noisy channel, the decoding time translate directly into com- the one proposed there, and, most importantly, to address the munication delays. This has been the driving motivation to issue of catastrophic error propagation for recursive quantum devise fast decoding schemes, and is likely to be important in convolutional encoders. the quantum setting as well. However, there is an important Non-catastrophic and recursive convolutional encoders are additional motivation for efficient decoding in the quantum responsible for the great success of parallel and serial classical setting. Quantum computation is likely to require active sta- turbo-codes. In a serial concatenation scheme, an inner convo- bilization. The decoding time thus translates into computation lutional code that is recursive yields turbo-code families with delays, and most importantly in error suppression delays. If unbounded minimum distance [22], while non-catastrophic errors accumulate faster than they can be identified, quantum error propagation is necessary for iterative decoding conver- computation may well become infeasible: fast decoding is an gence. The last point can be circumvented in several ways (by essential ingredient to fault-tolerant computation (see however doping for instance, see [45]) and some of these tricks can be [13]). adapted to the quantum setting, but are beyond the scope of The first attempts at obtaining quantum analogues of LDPC this paper. codes [28], [9], [19] have not yielded results as spectacular The proof [22] that serial turbo-codes have unbounded as their classical counterpart. This is due to several reasons. minimal-distance carries almost verbatim to the quantum First there are issues with the code design. Due to the setting. Thus, it is possible to design quantum turbo-codes orthogonality constraints imposed on the parity-check matrix, with polynomially large minimal distances. However, we will it is much harder to construct quantum LDPC codes than demonstrate that all recursive quantum convolutional encoders classical ones. In particular, constructing the code at random have catastrophic error propagation. This phenomenon is re- will certainly not do. The CSS construction is of no help lated to the orthogonality constraints which appear in the since random sparse classical codes do not have sparse duals. quantum setting and to the fact that quantum codes are in In fact, it is still unknown whether there exist families of a sense coset codes. As a consequence, such encoders are not quantum LDPC codes with non-vanishing rate and unbounded suitable for (standard) serial turbo-codes schemes. minimum distance. Moreover, all known construction seem In our constructions, the convolutional codes are therefore to suffer from a poor minimum distances for reasons which chosen to be non-catastrophic and non-recursive, so there is are not always fully understood. Second, there are issues no guarantee that the resulting families of turbo-codes have a with the decoder. The Tanner graph associated to a quantum minimum distance which grows with the number of encoded LDPC code necessarily contains many 4-cycles which are qubits. Despite these limitations, we provide strong numerical well known for their negative effect on the performances of evidence that their error probability decreases as we increase iterative decoding. Moreover, quantum LDPC codes are by the block size at fixed rate – and this up to rather large block definition highly degenerate but their decoder does not exploit sizes. In other words, from a pragmatic point of view, the this property: rather it is impaired by it [37]. minimum distances of the codes that we have simulated are On the other hand, generalizing turbo-codes to the quantum large enough not to degrade the iterative decoding performance setting first requires a quantum analogue of convolutional up to moderate word error rates (10−3 −10−5) and block sizes codes. These have been introduced in [10], [11], [31], [32] (102 − 104). and followed by further investigations [15], [18], [1]. Quan- The style of our presentation is motivated by the intention tum turbo-codes can be obtained from the interleaved serial to accommodate a readership familiar with either classical concatenation of convolutional codes. This idea was first turbo-codes or quantum information science. This unavoidably introduced in [33]. There, it was shown that, on memoryless implies some redundancy and the expert reader may want to skip some sections, or perhaps glimpse at them to pick up the k, 1lk denotes the k × k identity matrix. Similarly, since H notation. In particular, the necessary background from classical has rank n − k, there exists a n × (n − k) matrix H−1 over −1 coding theory and convolutional codes is presented in the F2 satisfying HH = 1ln−k. next section using the circuit language of quantum information Lemma 1: The right inverses H−1 and G−1 can always be science. This framework is somewhat unconventional – block chosen such that (H−1)T G−1 = 0. codes are defined using reversible matrices rather than parity- Proof: Let B = (H−1)T G−1. The substitution H−1 → check or generating matrices, convolutional codes are defined H−1 + GT BT preserves the property HH−1 = 1l and fulfills via a reversible seed transformation instead of a linear filter the desired requirement. built from shift registers and feed-back lines – yet requires We will henceforth assume that the right inverses H−1 and little departure from standard presentations. The benefit is a G−1 are chosen to fulfill the condition of Lemma 1. very smooth transition between classical codes and quantum To study the analogy between classical linear binary codes codes, which are the subject of Sec. III. Whenever possible, k and stabilizer codes, we view a rate n classical linear code and the definitions used in the quantum setting directly mirror its encoding in a slightly unconventional fashion. We specify those established in the classical setting. The other benefit the encoding by an n × n invertible encoding matrix V over of this framework is that it permits to generate all quantum F2. The code space is defined as convolutional codes straightforwardly without being hassled  k by the orthogonality constraint. In fact, the codes we describe C = c = (c : 0n−k)V | c ∈ F2 , (2) are in general not of the CSS class. Section IV uses the circuit representation to define quantum where we use the following notation. convolutional codes and their associated state diagram. The Notation 1: For an n-tuple a ∈ A n and an m-tuple b ∈ state diagram is an important tool to understand the properties A m over some alphabet A , we denote by a : b the n + m- of a convolutional code. In particular, the detailed analysis tuple formed by the concatenation of a followed by b. of the state diagram of recursive convolutional encoders per- Given the generator matrix G and parity check matrix H of formed in Sec. IV-E will lead to the conclusion that they all a code, the encoding matrix V can be fixed to have catastrophic error propagation. Section V is a detailed  G  presentation of the iterative decoding procedure used for V = . (3) (H−1)T quantum turbo-codes. Finally, our numerical results on the codes’ word error rate and spectral properties are presented This matrix is invertible: at Sec. VI. V −1 = G−1,HT  (4) II.CLASSICALPRELIMINARIES −1 The main purpose of this section is to introduce a circuit and satisfies VV = 1ln following Lemma 1. Clearly, the n 2 representation of convolutional encoders which simplifies the encoding matrix V : F2 → F2 specifies both the code space generalization of several crucial notions to the quantum set- and the encoding. The output b = aV of the encoding matrix ting. For instance, it allows to define in a straightforward way V is in the code space if and only if the input is of the form k a state diagram for the quantum analogue of a convolutional a = (c : 0n−k) where c ∈ F2 . This follows from the equalities code which arises naturally from this circuit representation. aV = cG = c ∈ C and (c : s)VHT = s. This state diagram will be particularly helpful for defining The encoding matrix also specifies the syndrome associated and studying fundamental issues related to turbo-codes such as to each error. When transmitted on a bit-flip channel, a recursiveness and non-catastrophicity of the constituent convo- codeword c will result in the message m = c + p for some n lutional encoders. The circuit representation is also particularly p ∈ F2 . The error p can be decomposed into an error syndrome n−k k −1 well suited to present the decoding algorithm of quantum s ∈ F2 and a logical error l ∈ F2 as pV = (l : s). This convolutional codes. is conveniently represented by the circuit diagram shown at Fig. 1, in which time flows from left to right. In such diagrams, A. Linear block codes the inverse V −1 is obtained by reading the circuit from right to A classical binary linear code C of dimension k and length left, running time backwards. This circuit representation is at n can be specified by a full-rank (n − k) × n parity-check the core of our construction of quantum turbo-codes, it greatly matrix H over F2: simplifies all definition and analysis. A probability distribution P(p) on the error p incurred dur- C = {c | HcT = 0}. (1) ing transmission induces a probability distribution on logical Alternatively, the code can be specified by fixing the encoding transformation and syndromes k of each information word c ∈ F2 through a linear mapping c 7→ c = cG k ×n G P(l, s) = P(p) . (5) for some full-rank generator matrix over −1 T p=(l:s)V F2 that satisfies GH = 0. Since G has rank k, there exists an n × k matrix over F2 that we denote by a slight abuse of We call P(l, s) the pullback of the probability P(p) through −1 −1 n−k k notation by G satisfying GG = 1lk where for any integer the gate V . Maximum likelihood decoding lML : F2 → F2 k ... l n p w n n k V i 2 p s − − k i 1 li 1 U − − n k m si 1 − − pi Fig. 1. Circuit representation of encoder (l : s)V = p. Slashed wires with li integer superscript j indicate a j-bit input/output. The l-bit input are called the si logical bits, the other (n − k)-bit input are called syndrome or stabilizer bits, pi+1 n li+1 and the n-bit output are the physical bits. The string p ∈ F2 is a codeword if and only if s = 0 . si+1 wi+1 n−k ...

Fig. 2. Circuit diagram of a convolutional encoder with seed transformation consists in identifying the most likely logical transformation l U. given the syndrome s

1 lML(s) = argmaxlP(l|s) (6) p 2 ... p where the conditional probability is defined the usual way f0 f1 fm 1 fm − P(l, s) l w1 w2 ... wm P(l|s) = . (7) P 0 q1 qm 1 qm l0 P(l , s) − ... Similarly, we can define the bit-wise maximum likelihood de- i n−k coder lML : F2 → F2 which performs a local optimization Fig. 3. Representation of convolutional encoder as a linear filter. The labels on each logical bit f and q take value 0 and 1 and indicate respectively the absence or presence of the associated wire. Although linear, this transformation is not invertible. i i lML(s) = argmaxli P(l |s), (8) where the marginal conditional probability is defined the usual way is shown at Fig. 3. An other common description of this X P(l|s) = P(l1, . . . lk|s). (9) encoder would be in terms of its rational transfer function which related the D-transform of the output p(D) to that of l1,...li−1,li+1,...lk the input l(D). Remember that the D-transform of a bit stream B. Convolutional codes P i x1 : x2 : x3 : ... is given by x(D) = i xiD . For the code We define now a convolutional code as a linear code whose of Fig. 3, the output’s D-transforms are encoder V has the form shown at Fig. 2. The circuit is built p1(D) = l(D) (10) from repeated uses of a linear invertible seed transformation f + f D + ... + f Dm U : n+m → n+m shifted by n bits. In this circuit, 2 0 1 m F2 F2 p (D) = m l(D) (11) particular attention must be paid to the order of the inputs 1 + q1D + ... + qmD as they alternate between syndrome bits and logical bits. The where the inverse is the Laurent series defined by long total number of identical repetition is called the duration of division. The code can also be specified by the recursion the code and is denoted N. The m bits that connect gates relation from consecutive “time slices” are called memory bits. The j j−1 encoding is initialized by setting the first m memory bits to wi = wi−1 for j > 1 m w0 = 0m. There are several ways to terminate the encoding, X w1 = l + q wj but we here focus on a padding technique. This simply consists i i j i−1 k t i = j=1 in setting the logical bits of the last time slices m m N + 1,N + 2,...N + t equal to li = 0k, where t is a free 2 X j X j p = f0( qjw + li) + fjw parameter independent of N. The rate of the code is thus i i−1 i−1 j=1 j=1 k/n + O(1/N). m Note that in this diagram, we use a subscript to denote the X j = f0li + (fj + f0qj)wi−1. different elements of a stream. For instance, pi denotes the n- j=1 bit output string at time i. The jth bits of pi would be denoted j j by a subscript as pi , or simply p when the particular time i These definitions are in fact equivalent to the circuit of is clear from context. This convention will be used throughout Fig. 2 with the seed transformation U specified by Fig. 4. Note the paper. that we can assume without lost of generality that fm = 1 or This definition of convolutional code differs at first sight qm = 1 (or both), and these two cases lead to different seed from the usual one based on linear filters built from shift transformations. The generalization to arbitrary linear filters is register and feed-back lines. An example of a linear filter for straightforward. In terms of matrices, the seed transformation a rate 1/2 (systematic and recursive) convolutional encoder associated to this convolutional code encodes the relation 1 ...... 1 wi 1 q pi − 1 f1 the code considered in the above example is most easily seen 2 w ...... 2 1 i 1 q2 pi from Fig. 3: p is a copy of the input l. Systematic encoders ... − f2 1 m 1 wi are used to avoid catastrophic error propagation. This term will w − ...... i 1 qm 1 2 − − fm 1 be defined formally in the quantum setting, but it essentially m − wi w ...... i 1 qm 3 means that an error affecting a finite number of physical bits − wi ... li ...... is mapped to a logical transformation on an infinite number f0 s ...... m i wi of logical bits by the encoder inverse. Catastrophic encoders 1 ...... 1 wi 1 q pi cannot be used directly in standard turbo-code schemes. The − f1 1 2 ...... 2 wi 1 q pi problem is that the first iteration of iterative decoding does not ... − f2 2 1 provide information on the logical bits. This is due to the fact m 1 wi wi −1 ...... − fm 1 qm 1 2 that as the length of the convolutional encoder tends to infinity − − wi m ...... wi 1 fm f0 3 and in the absence of prior information about the value of the − wi ...... 1 l ... i logical bits, the logical after decoding tends to 2 . s ...... m i wi A recursive encoder has an infinite impulsive response: on input l of 1, it creates an output of infinite Fig. 4. Seed transformation circuit for convolutional code of Fig 3. Top: weight for a code of infinite duration N. Recursiveness is also Case fm = 1. Bottom: Case qm = 1. The labels f and q take value 0 and 1 related to the presence of feed-back in the encoding circuit, and indicate respectively the absence or presence of the associated gate. Both circuits are entirely built from controlled-nots, and are therefore invertible. which is easily understood from the linear filter of Fig. 3. P i P i As its name indicates, the controlled-not acts by negating the target bit ⊕ if Except when the polynomial qiD factors fiD , an and only if the control bit • is in state 1. encoder with feed-back will be recursive. It is essential to use as constituent recursive convolutional codes in classical turbo- codes schemes to obtain families of turbo-codes of unbounded (pi : wi) = (wi−1 : li : si)U with U given by minimum distance and with performances which improve with n m the block size.  z}|{ z}|{  µP µM }m III.QUANTUM MECHANICSAND QUANTUM CODES U =  ΛP ΛM }k . (12) In this section, we review some basic notions of quantum ΣP ΣM }n−k mechanics, the stabilizer formalism, and the decoding problem where for quantum codes. In Sec. III-B, stabilizer codes are defined     q1 the usual way, as subspaces of the Hilbert space stabilized 0 f1 + f0q1  q2 1lm−1  by an Abelian subgroup of the Pauli . We detail in  . .    µP = . . , µM = . , Sec. III-C how these codes are decoded. Even if a stabilizer    .  0 f + f q   code is a continuous space, it can be defined and studied m 0 m q 0 0 0 m by using only discrete objects (parity-check matrix, encoding ΛP = (1, f0), and ΛM = (1 0m−1). The two other components matrix, syndrome) which are quite close to classical linear depend on whether fm = 1 or qm = 1. In the former case codes. We discuss in Sec. III-D the relations between such ΣP = (0, f0) and ΣM = (1 0m−1) while in the latter case quantum codes and classical linear codes but also highlight ΣP = (0, 1) and ΣM = (0m). the crucial distinctions between them. Particular emphasis is Not only does the circuit of Fig. 2 produce the same put on the role of the encoder because it is a crucial ingredient encoding as the linear filter of Fig. 3, it also has the same for our definition of quantum turbo-codes. The encoder also memory states. More precisely, the value contained in the jth provides an intuitive picture for the logical cosets, which are shift register at time i in Fig. 3 is equal to the value of the an important distinction between classical codes and quantum jth memory bit between gate i and i + 1 on Fig. 2. This is stabilizer codes. important because it allows to define the state diagram (see Sec. IV-B) directly from the circuit diagram Fig. 4. A. Qubits and the Pauli group Of particular interest are systematic recursive encoders that A qubit is a physical system whose state is described by are defined as follows. a unit-length vector in a two-dimensional Hilbert space. The Definition 1 (Systematic encoder): An encoder is system- two vectors of a given orthonormal basis are conventionally atic when the input stream is a sub-stream of the output stream. denoted by |0i and |1i. We identify the Hilbert space with Definition 2 (Recursive encoder): A convolutional encoder C2 in the usual way with the help of such a basis. The state is recursive when its rational transfer function involves gen- of a system comprising n qubits is an unit-length vector in uine Laurent series (as opposed to simple polynomials). the tensor product of n two-dimensional Hilbert spaces. It Systematic encoders copy the input stream in clear in one is a space of dimension 2n which can be identified with n of the output stream. Typically they have transfer functions of (C2)⊗n ' C2 . It has a basis given by all tensor products j j j the form p (D) = l (D) for j = 1, . . . , k and arbitrary p (D) of the form |x1i ⊗ · · · ⊗ |xni, where the xi ∈ {0, 1} and the j j for j > k, so pi is a copy of li . The systematic character of inner product between two basis elements |x1i ⊗ · · · ⊗ |xni and |y1i ⊗ · · · ⊗ |yni is the product of the inner products of corresponds to bitwise addition over F2×F2. As a consequence |xii with the corresponding |yii. In other words, this basis is effective Pauli operators can be represented by binary couples. orthonormal. It will be convenient to use the following notation We will henceforth make use of the following representation

Notation 2: I ↔ (0, 0) (13) X ↔ (1, 0) (14) |0ni , |0i ⊗ · · · ⊗ |0i . | {z } Y ↔ (1, 1) (15) n times The error model we consider in this paper is a Pauli- Z ↔ (0, 1) (16) memoryless channel which is defined with the help of the ∼ n three Pauli matrices Note that Gn = G1 and we will either view, depending on the context, an element P ∈ G as an n-tuple (P i)n with 0 1 0 −i 1 0  n i=1 X = , Y = , Z = . entries in G or as 2n-tuple with entries in obtained by 1 0 i 0 0 −1 1 F2 replacing each Pi by its corresponding binary representation. These matrices anti-commute with each other and satisfy the Gn is generated by the Xi and Zi, and we introduce the following multiplication table following notation. x z × X Y Z Notation 4: For P in Gn, we denote by P and P the X I iZ −iY only elements of Gn satisfying: Y −iZ I iX 1) P = P x + P z, and Z iY −iX I 2) P x ∈ {I,X}n,P z ∈ {I,Z}n. An important property of Gn is that any pair of elements where I denotes the 2 × 2 identity matrix. The action of P, Q either commutes or anti-commutes. This leads to the these operators on the state of a qubit is obtained by right definition of an inner product “?” for elements P = (Pi)1≤i≤n multiplication |ψi → P|ψi, with |ψi viewed as an element of n and Q = (Q ) of G such that P?Q = P P ?Q 2. i 1≤i≤n n i=1 i i C mod 2. Here, P ?Q = 1 if P 6= Q , P 6= I and Q 6= I; These matrices generate the Pauli group which is readily i i i i i i G1 and P ?Q = 0 otherwise. seen to be the set i i Fact 1: P, Q ∈ Gn commute if and only if [P] ? [Q] = 0. {±I, ±iI, ±X, ±iX, ±Y, ±iY, ±Z, ±iZ}. This product can also be defined with the help of the following matrix which will appear again later in the definition They also form all the errors which may affect one qubit in of symplectic matrices. our error model. If we have an n-qubit system, then the errors Notation 5: which may affect it belong to the Pauli group Gn over n qubits which is defined by Λn , 1ln ⊗ X. By viewing now elements of Gn as binary 2n-tuples we have: ⊗n Gn = G1 = {P1 ⊗ · · · ⊗ Pn| ∈ {±1, ±i}, Pi ∈ {I, X, Y, Z}} Definition 4 (Inner product): Define the inner product ? : G × G → by P?Q = P Λ QT . This group is generated by i and the set of X ’s and Z ’s for n n F2 n i i G is an -vector space and we use the ? inner product to i = 1, 2, . . . , n which are defined by: n F2 define the orthogonal space of a subspace of G as follows. Notation 3: n Definition 5 (Orthogonal subspace): Let V be a subset of i − 1 times n − i times ⊥ z }| { z }| { Gn. We define V by Xi , I ⊗ · · · ⊗ I ⊗X ⊗ I ⊗ · · · ⊗ I ⊥ i − 1 times n − i times V ,{P ∈ Gn : P?Q = 0 for every Q ∈ V }. z }| { z }| { V ⊥ G Zi I ⊗ · · · ⊗ I ⊗Z ⊗ I ⊗ · · · ⊗ I is always a subspace of n and if the space spanned by , ⊥ In quantum mechanics two states are physically indistin- V is of dimension t, then V is of dimension 2n − t. guishable if they differ by a multiplicative constant. This From the fact that two states are indistinguishable if they motivates the definition another group of errors, called the differ by a multiplicative constant, a Pauli error may only be effective Pauli group, obtained by taking the quotient of Gn specified by its effective Pauli group equivalence to which by {±I, ±iI}. it belongs. A very important quantum error model is the Definition 3 (Effective Pauli group): The effective Pauli depolarizing channel. It is in a sense the quantum analogue group Gn on n qubits is the set of equivalence classes [P] for of the binary symmetric channel. P in Gn, where the equivalence class [P] is the set of elements Definition 6 (Depolarizing channel): The depolarizing of Gn which differ from P by a multiplicative constant. We channel on n qubits of error probability p is an error model will also use the notation I ,[I],X ,[X],Y ,[Y],Z ,[Z] and where all the errors which occur belong to Gn and the Xi = [Xi],Zi = [Zi]. probability that a particular element P is chosen is equal to n−w(P ) p w(P ) All the effective Pauli groups Gn are Abelian. (G1, +) is (1 − p) 3 where the weight w(P ) of a Pauli isomorphic to (F2 × F2, +) where the group operation of G1 error is given by Notation 6: w(P ) is the number of coordinates of P which Clearly then, a Clifford transformation on n qubits can be 2n differ from I. specified by its associated encoding matrix V on F2 together In other words, the coordinates of the error are chosen indepen- with a collection of 2n phases. This shows that Clifford trans- dently: there is no error on a given coordinate with probability formations are efficiently specifiable as claimed. It can readily 1 − p and there is an error on it of type X,Y or Z each with be verified that the rows of V , denoted Vi i = 1, 2,..., 2n, p probability 3 . are equal to † B. Stabilizer codes: Hilbert space perspective V2i−1 = [VXiV ] = XiV, (18) † A quantum protecting a system of k V2i = [VZiV ] = ZiV. (19) qubits by embedding them in a larger system of n qubits is k 2 ⊗n Since conjugation by a unitary matrix V does not change the a 2 dimensional subspace C of (C ) . We say that it is a k commutation relations, the above equations implies that the quantum code of length n and rate n . It can be specified by 2n 2n encoding matrix is a symplectic matrix, whose definition is a unitary transformation V : C → C : recalled below. n 2k o C = |ψi = V(|ψi ⊗ |0n−ki) | |ψi ∈ C . (17) Definition 10 (Symplectic transformation): A n-qubit sym- plectic transformation is a 2n × 2n matrix U over F2 that This definition directly reflects Eq. (2). As in the classical satisfies case, the matrix V specifies not only the code but also the UΛ U T = Λ . encoding, that is the particular embedding ( 2)⊗k → ( 2)⊗n. n n C C By definition, symplectic transformation are invertible and An importance distinction however is that in the quantum preserve the inner product ? between n-qubit Pauli group case, the dimension of the matrix V is exponential in the elements. Conversely, every symplectic matrices always cor- number of qubits n. To obtain an efficiently specifiable code, respond to a (non-unique) Clifford transformation. we choose V from a subgroup of the unitary group over A stabilizer code is thus a quantum code specified by ( 2)⊗n called the Clifford group. In fact, not only are Clifford C Eq. (17), but with V in the Clifford group. The code C (but transformations over n qubits efficiently specifiable, they can not the encoding) can equivalently be specified with n − k also be implemented efficiently by a quantum circuit involving independent mutually commuting elements of Gn of order 2 only O(n2) elementary quantum gates on 1 and 2 qubits (see as follows: Theorem 10.6 in [30] for instance). Definition 11 (Stabilizer code): The stabilizer code C as- Definition 7 (Clifford transformation and Clifford group): sociated to the stabilizer set {H , i = 1..n − k}, where the A Clifford transformation over n qubits is a unitary transform i Hi’s are independent mutually commuting elements of Gn of V over ( 2)⊗n which leaves the Pauli group over n qubits C order 2 and different from −1, is the subspace of ( 2)⊗n of globally invariant by conjugation C elements stabilized by the Hi’s, that is † VGnV = Gn. C = {|ψi | Hi|ψi = |ψi, 1 ≤ i ≤ n − k}. (20) The set of Clifford transformations is a group and is called This is the usual definition of stabilizer codes. The Hi play the Clifford group over n qubits. a role analogous to the rows of the parity-check matrix of a This definition naturally leads to the action of the Clifford classical linear code, and this connection will be formalized group on elements of the Pauli group. in Subsection III-D. To see the equivalence between this † Definition 8 (Action of Clifford transformation on Pauli): definition and Eq. (17), set Hi = VZk+iV . These operators A Clifford transformation V acts on the Pauli group as are independent and of order 2 since they are conjugate to the Zi which are independent and of order 2. Now, consider Gn → Gn a |ψi ∈ C as defined in Eq. (17). For all Hi, we have P 7→ P0 = VPV† † Hi|ψi = VZi+kV V(|ψi ⊗ |0n−ki) (21) It also acts on the effective Pauli group by the mapping [P] 7→ = V(|ψi ⊗ Z |0 i) = |ψi, (22) [P0]. i n−k The last mapping is F2-linear and there is a square binary where we used the fact that Z|0i = |0i. Hence, |ψi satisfies matrix V of size 2n which is such that the condition of Def. 11. Conversely, for any state |ψi ∈ C [VPV†] = [P]V. according to Def. 11, we have † † This matrix will be called the encoding matrix. Zk+iV |ψi = V Hi|ψi (23) Definition 9 (Encoding matrix): The encoding matrix V = V†|ψi, (24) associated to an encoding operation V, which is a Clifford † transformation over n qubits, is the binary matrix V of size which implies that the k + ith qubit of V |ψi must be in state |0i. Since this holds for all i = 1, 2, . . . , n − k, we conclude 2n × 2n such that for any P ∈ Gn we have that the two definitions are equivalent. This equivalence has [VPV†] = [P]V. the following consequence: Fact 2: A stabilizer code of length n associated to n − k encoding we obtain the state independent generators H is of dimension 2k. i † † Since X, Y, Z are all of order 2, all the generators of order V P|ψi = V PV(|ψi ⊗ |0n−ki) 2 in Gn are of the form ±P where P is a tensor product of n = (L|ψi) ⊗ (S|0n−ki), matrices all chosen among the set {I, X, Y, Z}. Thus, we can where L belongs to Gk and S = αS1 ⊗ · · · ⊗ Sn−k belongs specify the generators Hi of the stabilizer code by giving only the associated effective Pauli group elements together with to Gn−k (and the Si’s to {I, X, Y, Z}). Notice that S|0n−ki is a sign for each generator. Changing the sign of a stabilizer equal to |s1i ⊗ · · · ⊗ |sn−ki where  ∈ {±1, ±i} and generator changes the code, but not its properties1. More s = 0 if S ∈ {I, Z}, (26) precisely, the set of Pauli errors which can be corrected by i i such a code does not depend on the signs which have been si = 1 otherwise. (27) chosen. Hence, we can specify a family of “equivalent” codes Measuring the n − k last qubits reveals s . . . s which by specifying instead of the H ’s the set of H [H ] = 1 n−k j j , j is the analogue of a classical syndrome. This motivates the VZ . It is important to note that these elements have to j+k following definition. be orthogonal: the fact that the H ’s commute translate into i Definition 12 (Error syndrome): The syndrome s(P) asso- the orthogonality condition H ?H = 0. Thus, the H span a i j i ciated to a Pauli error P is the binary vector (s ) linear space called the stabilizer space, that we denote C(I) i 1≤i≤n−k defined by Equations (26) and (27). for reasons that will become apparent later. Note that the syndrome s(P ) can be obtained from the H ’s Thus, in analogy with classical linear codes, a stabilizer i (which are defined as in the previous subsection by H = code (or more precisely an equivalent class thereof) can be i [VZ V†] = Z V ) by efficiently specified by an encoding matrix V on 2n. This k+i k+i F2 Proposition 1: matrix also provides an efficient description of the encoding up to a set of phases. There is another analogy with a s(P) = ([P] ?Hi)1≤i≤n−k. classical encoding matrix that will be crucial for our definition −1 Proof: si(P) is equal to [P]V ?Zi+k by definition. of quantum turbo-codes. Assume that we concatenate two Since symplectic transformations preserve the symplectic inner stabilizer codes and that these codes are encoded by Clifford −1 product we deduce that si(P) = ([P]V ) ?Zi+k = [P] ? transformations. The result of the concatenation is also a Zi+kV = [P] ?Hi. stabilizer code (because Clifford transformations form a group) This proposition motivates the following definition of a and the resulting encoding matrix is just the product of the two parity-check matrix of a stabilizer code encoding matrices of each constituent code. This reflects the Definition 13 (Parity-check matrix): The parity-check ma- fact that the encoding matrices provide a representation of the trix H of a quantum code with stabilizer set {H1,...,Hn−k} Clifford group. is the binary matrix of size (n − k) × 2n with rows Fact 3: Let V1 and V2 be two Clifford transformations over H1,...,Hn−k. n qubits with encoding matrices V1 and V2 respectively. Then V2V1 is a Clifford transformation with encoding matrix V1V2. k Proof: Consider the Clifford transformation V , V2V1. L n It suffices to verify the statement on a generating set of the n k V P S − Pauli group:

 † h † †i Fig. 5. Circuit representation of encoder (L : S)V = P . The operator VX V = V V X V V n−k i 2 1 i 1 2 P ∈ Gn is a codeword (has trivial syndrome) if and only if S ∈ {I,Z} . h †i = V1XiV1 V2 (25) The calculation of the syndrome depends only on the = XiV1V2 effective Pauli error P = [P]. As we did for classical errors † Equation (25) uses the fact that V1XiV1 belongs to Gn. The in Sec. II-A, it will be convenient to decompose the error −1 same kind of result holds for the Zi’s and this completes the as PV = (L : S), with L ∈ Gk and S ∈ Gn−k. Like proof. in the classical case, this is conveniently represented by the circuit diagram of Fig. 5. At this point however, the analogy C. Decoding with the classical case partially breaks down. As described in Section II-A, in the classical setting a bit-flip error p can be When transmitted on a Pauli channel, an encoded state decomposed as pV −1 = (l : s). In that case, s is the error |ψi = V(|ψi ⊗ |0 i) (where |ψi belongs to ( 2)⊗k) will n−k C syndrome and is therefore known. Decoding then consists in result in a state P|ψi for some P ∈ Gn. Upon inverting the identifying the most likely l given knowledge of s. In the quantum case however, S is only partially determined by 1This is strictly true for Pauli channels which are considered here. For a general noise model, error correcting properties may actually depend on the the error syndrome s(P ). Indeed, we can decompose S as sign of the stabilizer generators. S = Sx + Sz (c.f. Notation 4), and notice that from (26) and (27), s(P ) reveals only Sx. More precisely, we have the these qubits. It corresponds to the probability that the error x x x following relation for the i-th component Si of S belongs to the coset C(L, S ) which is therefore equal to x P(L, Sx) Si = X if si = 1 x P(L|S ) = P 0 x . (28) x 0 P(L ,S ) Si = I otherwise. L with the probability P(L, Sx) is the pullback of P(P ) through Hence, two physical errors P = (L : Sx + Sz)V and P 0 = x 0z z 0z the encoding matrix (L : S + S )V = P + (Ik : S + S )V have the same error 2 x X syndrome S , so cannot be distinguished. However, they P(L, Sx) = P(P ) . (29) P =(L:Sx+Sz )V −1 also yield the same logical transformation L, so they can be Sz ∈{I,Z}n−k corrected by the same operation (namely applying L = L−1 again). Therefore, they cannot and need not be distinguished Similarly to the classical setting, maximum likelihood de- by the error syndrome: such errors are called degenerate. This coding consists in identifying the most likely logical transfor- z mation L given the syndrome Sx. More formally: reflects the fact that all errors of the form P = (Ik : S )V (with Sz ∈ {I,Z}n−k) have zero syndrome but do not need Definition 17 (Maximum likelihood decoder): The n−k to be corrected. We denote such kind of errors by maximum likelihood decoder LML : {I,X} → Gk Definition 14 (Harmless undetected errors): The set of er- is defined by z z rors P of the form P = (Ik : S )V where S ranges over L (Sx) = argmax P(L|Sx) n−k ML L (30) {I,Z} is called the set of harmless undetected errors. The classical MAP decoding (or bit-wise decoding) has also All the other errors of zero syndrome (and which are therefore a quantum analogue undetected) have a non trivial action on the k first qubits Definition 18 (Qubit-wise maximum likelihood decoder): after inverting the encoding transformation. This motivates the i The qubit-wise maximum likelihood decoder LML : following definition n−k {I,X} → G1 is defined by Definition 15 (Harmful undetected errors): The set of er- z z i x i x rors P of the form P = (L : S )V where S ranges over LML(S ) = argmaxLi P(L |S ) (31) {I,Z}n−k and L is different from I is called the set of k where the marginal conditional probability is defined the usual harmless undetected errors. z z way Note that the set of errors of the form (Ik : S )V with S n−k X in {I,Z} is also the subgroup spanned by the rows V2i for P(Li|Sx) = P(L1,...Lk|Sx). (32) i ∈ {k +1, . . . , n}, or what is the same, the subgroup spanned L1,...Li−1,Li+1,...Lk by the Hi , Zi+kV for i ∈ {1, . . . , n − k}. In other words Equation (29) differs from its classical analogue Eq. (5) by Proposition 2: The set of harmless undetected errors is a summation over Sz which reflects the coset structure of equal to C(I). the code. Aside from this distinction, the maximum-likelihood This fact that there are errors which do no need to be decoders are defined as in the classical case. corrected has an important consequence. Contrarily to the classical setting where the most likely error satisfying the D. Comparison between stabilizer codes and classical linear measured syndrome is sought, in the quantum case, we look codes for the most likely coset of C(I) satisfying the measured One of the main advantage of the stabilizer formalism is syndrome. Such a coset is the set of errors of the form that it allows to discretize a seemingly continuous problem by Definition 16 (Logical coset): Given an encoding matrix studying the effect of Pauli errors (which are discrete) on the V , the logical coset C(L, Sx) associated to the logical trans- continuous code subspace. By classifying these errors, discrete x formation L ∈ Gk and to the syndrome S (belonging to quantities such as error syndromes or parity-check matrices {I,X}n−k) is defined as arise naturally. In other words, stabilizer codes share many

x z x z n−k analogies with classical linear codes, but there are also some C(L, S ) = {P = (L : S + S )V | S ∈ {I,Z} } fundamental differences. Let us summarize these analogies and = (L : Sx)V + C(I). differences here. We assume in what follows that the relevant

x quantum quantities are defined for a stabilizer code C of length When S = In−k we simply write C(L) instead of k n and rate n . C(L, In−k). Syndrome and parity-check matrix. The parity-check matrix What replaces the classical probability that a given information H is a binary matrix of size (n − k) × 2n. It differs from a sequence has been sent given a measured syndrome is in classical parity-check matrix in two respects: the quantum case the probability P(L|Sx) that applying the 1) Its rows H must be orthogonal with respect to the ?- transformation L−1 = L to the k first qubits after performing i product, the inverse of the encoding operation corrects the error on 2) The syndrome s(P ) of a Pauli error P in Gn is defined 2By a slight abuse of terminology, we use the one-to-one correspondence with the help of the ?-product (rather than by matrix x between s and S to refer to both quantities as the error syndrome. multiplication): s(P ) = (Hi ?P )1≤i≤n−k. Encoding matrix. It is a binary matrix V of size 2n × 2n Information symbols. There is in the quantum setting a natural and must be a symplectic matrix (and any symplectic matrix notion of information sequence corresponding to a Pauli error is the encoding matrix of a certain stabilizer code). Because P which consists in taking the element L in Gk such that −1 T it is a symplectic matrix V = ΛnV Λn, it plays a role there exists an S in Gn−k for which (L : S)V = P . analogous to both the classical encoding matrix and its inverse. Like the classical encoding matrix Eq. (3), it contains a IV. QUANTUMTURBO-CODES generator matrix as a sub-matrix. Like the inverse of the In this section, we describe quantum turbo-codes obtained classical encoding matrix Eq. (4), it also contains a parity from interleaved serial concatenation of quantum convolu- check matrix as a sub-matrix. The parity check matrix is tional codes. This first requires the definition of quantum formed of rows V2(k+1),V2(k+2),...,V2n while the generating convolutional codes. We will define them through their circuit matrix consists of rows V1,V2,...,V2k. The remaining rows representation as in [31] rather than through their parity-check V2k+1,V2(k+1)+1,...,V2n−1 are sometimes referred to as matrix as in [15], [18], [1]: this allows to define in a natural “pure errors” [34]. Indeed, taking the rows of V as generators way the state diagram and is also quite helpful for describing of Gn, the syndrome associated to an element of Gn depends the decoding algorithm. only on its pure error component. Hence, their classical analogue is the matrix (H−1)T appearing in the classical A. Quantum convolutional codes encoding matrix Eq. (4). A quantum convolutional encoder can be defined quite The encoding matrix V is associated to a (continuous) succinctly as a stabilizer code with encoding matrix V given unitary encoding transformation V. Like in the classical case, by the circuit diagram of Fig. 6. The circuit is built from the natural decoding process consisting in inverting V and repeated uses of the seed transformation U shifted by n qubits. measuring the last n − k qubits, which yields a syndrome In this circuit, particular attention must be paid to the order that is associated to a parity check matrix. of the inputs as they alternate between stabilizer qubits and Code. We may define the discrete stabilizer code as in the logical qubits. This is a slight deviation from the convention classical setting as the set of errors with zero syndrome, that established in the previous section, and it is convenient to is introduce the following notation to label the different qubits Definition 19 (Discrete stabilizer code): The discrete sta- appearing in the encoding matrix of a quantum stabilizer code. bilizer code C associated to the stabilizer set {Hi, i = 1..n−k}, where the Hi’s are independent mutually orthogonal Definition 20: The positions corresponding to L are called elements of Gn, is the subspace of Gn orthogonal to the Hi, the logical positions and the positions corresponding to S are that is called the syndrome positions. The total number of identical repetition of the seed transforma- C = {P ∈ Gn | Hi ?P = 0, 1 ≤ i ≤ n − k}, (33) tion U is called the duration of the code and is denoted N. The m qubits that connect gates from consecutive time slices are ⊥ or more succinctly C = C(I) . called memory qubits. The encoding is initialized by setting Codewords. There is an important difference between the clas- the first m memory qubits in the |0mi state. To terminate the sical setting and the quantum setting here. Since all elements encoding, set the k information qubits of the last t time slices of a coset of C(I) have the same effect on , we make no C in the |0ki state, where t is a free parameter independent of distinction between the elements of such cosets. Therefore N. The rate of the code is thus kN/(n(N + t) + m) which is the codewords in the quantum setting are grouped in cosets of the form k/n + O(1/N) for fixed t. of C(I). Note that all elements of the coset C(I) are the Formally, a quantum convolutional code can be defined as analogue of the zero codeword. With the notation introduced follows. in the previous subsection we have Definition 21 (Quantum convolutional encoder): Let n, k, [ m, and t be integers defining the parameters of the code, C = C(L). (34) and N the duration of the encoding. Let U be an (n + m)- L∈G k qubit symplectic matrix called the seed transformation. The Minimum distance. In the classical setting, the minimum encoding matrix V of the quantum convolutional encoder is a distance of a linear code is the smallest Hamming weight symplectic matrix over m + n(N + t) qubits given by of a non-zero codeword. This definition carries over to the V = U U ...U quantum setting with the coset C(I) playing the role of the [1...n+m] [n+1...2n+m] [(N+t−1)n+1...(N+t)n+m] N+t zero codeword. Thus, the minimal distance of a code is the Y minimum weight w(P ) of an element P of C − C(I). With = U[(i−1)n+1..in+m] this definition of the minimum distance d, it is straightforward i=1 to check that the number of errors which are corrected by where [a..b] stands for the integer interval {a, a + x a decoder which outputs the coset C(L, S ) containing the 1, . . . , b} and where U[(i−1)n+1..in+m] acts on an element x element P of lowest weight and satisfying the syndrome S (P1,...,Pm+n(N+t)) ∈ Gm+n(N+t) such that its image d−1 (P 0,...,P 0 ) satisfies: (P 0 ,...,P 0 ) = is equal to b 2 c. 1 m+n(N+t) (i−1)n+1 in+m S0 The Sj for i ∈ {N+1,...,N+t−1} are obtained by recursion P1 L1 U on i: S1 −1 P2 (Mi−1 : Si) ,(Pi : Mi)U (36) L2 S2 and the Mi−1, Li, Si for i in {1,...,N} are obtained from the recursion PN LN −1 (Mi−1 : Li : Si) (Pi : Mi)U (37) SN , PN+1 SN+1 Finally, set . . . S0 = M0. (38) PN+t 1 − SN+t 1 Any Clifford transformation U on n+m qubits can be used − { as a seed transformation and defines a convolutional code. It P SN+t N+t will be useful to decompose U into blocks of various sizes

} 2n 2m Fig. 6. Circuit diagram of a quantum convolutional encoder with seed  z}|{µ z}|{µ } transformation U. The superscript indicating the number of qubits per wire P M 2m are omitted for clarity, and can be found on Fig. 7. U =  ΛP ΛM }2k . (39) ΩP ΩM }2(n−k) Just like in the classical case, this definition of quantum Mi 1 n convolutional code can easily be seen to be equivalent to the − k Pi ones that have previously appeared in the literature [15], [18], Li U n k m [1]. In particular, the D-transform associated to the code can Si − Mi easily be obtained from the sub-matrices of V appearing in Eq. (39). However, these concepts will not be important for Fig. 7. Seed transformation circuit. our analysis. Our definition of convolutional code is stated in terms of their encoding matrix V . From this perspective, convolutional codes are ordinary, albeit very large, stabilizer codes. How- (P(i−1)n+1,...,Pin+m)U and all other Pi are given by 0 ever, there are important aspects of convolutional codes that Pi = Pi. The syndrome symbols correspond to the positions S distinguish them from generic stabilizer codes. belonging to [1..m] ∪ i∈[1..N][(i − 1)n + m + k + 1..in + m] ∪ S [(i − 1)n + m..in + m]. As mentioned in the previous section, stabilizer codes have i∈[N+1..(N+t)] in general encoding circuits using a number of elements It will be convenient to decompose an element P in proportional to the square of the number of physical qubits. G as P = (P : P : ··· : P ) where the P n(N+t)+m 1 2 N+t i Convolutional codes have by definition circuit complexity that belong to G for i in {1, 2,...,N +t−1} and P belongs n N+t scales linearly with N for fixed m: each application of the to G . This decomposition directly reflects the structure of n+m seed transformation U requires a constant number of gates, the output wires appearing on the right-hand-side of the circuit and this transformation is repeated N + t times. diagram of Fig. 6. The most important distinction however has to do with the Similarly, we will decompose the Pauli-stream obtain by decoding complexity. The maximum-likelihood decoder of a applying the inverse encoder to P as stabilizer code consists in an optimization over the logical K −1 cosets, of which there are 4 where K denotes the the number (S0 : L1 : S1 : ··· : LN : SN : SN+1 : ··· : SN+t) , PV , of encoded qubits. Without any additional structure on V , maximum-likelihood decoding is an NP-hard problem [5]. where S0 belongs to Gm, the Li’s all belong to Gk, the Si’s Quantum convolutional codes on the other hand have decoding belong to Gn−k for i in {1,...,N} and the SN+j’s belong complexity that scales linearly with K. The algorithm that to Gn for j in {1, . . . , t}. This decomposition directly reflects accomplishes this task will be described in details in Sec. V. the structure of the input wires appearing on the left-hand-side of the circuit diagram of Fig. 6. B. State diagram

While the Pj are related to the Lj and Sj via a matrix V We will now define some properties of convolutional codes of dimension 2(N + t)n + 2m, the convoluted structure of V that will play important roles in the analysis of the perfor- can be exploited to recursively compute this transformation mance of turbo-codes. Most of these definitions rely on the without the need to manipulate objects of size increasing the state diagram of a convolutional code, which is defined with N. This requires the introduction of auxiliary memory similarly as in the classical case. variables Mj ∈ Gm. The recursion is initialized by setting Definition 22 (State diagram): The state diagram of an en- coder with seed transformation U and parameters (n, k, m) is −1 m (MN+t−1 : SN+t) , PN+tU . (35) a directed multi-graph with 4 vertices called memory-states, 0 I,II (Z,II) each labeled by a M ∈ Gm. Two vertices M and M are (I,IZ) linked by an edge M → M 0 with label (L, P ) if and only if (Z,ZI)

) ) z n−k (I,ZI) Z there exists L ∈ Gk, P ∈ Gn and a S ∈ {I,Z} such (Z,IZ)

I I,ZZ Z,ZZ ( that ( 0 z ) ( P : M = (M : L : S )U, . (40) X,YI Y,IY ) (Y ( ,XZ ) X,ZX ) ) ) ) ( ) ) ) ) The labels L and P are referred to as the logical label and ) ,XI ,YZ ,IX ,ZY Y Y X,XI X,IX Y Y X,YZ X,ZY ( ( ( ( ( ( ( ( physical label of the edge respectively. X,IY ) ( (Y ,ZX Y,YI ) ) ( Thus, the state diagram represents partial information about ) (X,XZ the transformation (M : L : Sz) → (P : M 0) generated ) U ) by the seed transformation . Partial information because all X (I,XY) Y I,XX Z,XX ( information about Sz is discarded. Note that Sz ∈ {I,Z}n−k, (Z,YX) ( so the state diagram only contains information about the (I,YX) (Z,XY) streams of Pauli operators that remain in the set of codewords (Z,YY) (I,YY) C. The restriction on the Sz input can be lifted if we instead consider the effective seed transformation Fig. 9. State diagram for the seed transformation shown at Fig. 8.

2n 2m  z}|{ z}|{  µP µM }2m Definition 24 (Zero-physical-weight cycle): A zero- Ueff ,  ΛP ΛM }2k . (41) physical-weight cycle is a closed path in the state diagram ΣP ΣM }n−k that uses only edges with zero-physical-weight. where the matrix [ΣP :ΣM] is obtained by removing every In the state diagram of Fig. 9 for example, there are two second row from the matrix [ΩP :ΩM] (i.e. the rows which zero-physical-weight cycles corresponding to the transforma- represent the action on the Xi). This definition will be conve- tions (I : I : I)U = (II : I) and (Z : Z : Z)U = (II : Z). nient for later analysis. Definition 25 (Non-catastrophic encoder): An encoder C The state diagram of the seed transformation represented is non-catastrophic if and only if the only cycles in its state at Fig. 8 is shown at Fig. 9. For instance, the self-loop at I diagram with physical-weight 0 have logical weight 0. labeled (I,II) represents the trivial fact that (I : I : I)U = We see for instance that the state diagram of our running (II : I). The edge from Y to I labeled (Y,XZ) represents example is catastrophic due to the presence of the self-loop the transformation (Y : Y : I)U = (XZ : I), and so on. with label (Z,II) at state Z: this cycle has physical-weight 0 and logical weight 1. To understand the consequences of a Mi 1 catastrophic seed transformation, consider the act of inverting − Pi the encoding transformation of the associated convolutional L i } encoder. This is done by running the circuit of Fig. 6 back- Si Mi wards. Suppose that a single Y error affected the transmitted qubits. More specifically, at time i, 1 ≤ i ≤ N, there is Fig. 8. Seed transformation for an n = 1, k = 1, and m = 1 a Y on the lower physical wire of the seed transformation quantum convolutional code. It corresponds to a unitary transform which maps 2 |ai ⊗ |bi ⊗ |ci to |ai ⊗ |a + bi|a + b + ci for a, b, c ∈ {0, 1}. Therefore of Fig. 8 (i.e. Pi = Y ) and everything else is I. Since −1 the seed transformation U acts as follows on the Zi and Xi: Z1U = (IY : I)U = (Z : Y : X), this will result in a Z in (Z,I,I)U = (Z,I,I),X U = (X,X,X),Z U = (Z,Z,I),X U = 1 2 2 the memory qubit Mi−1, a Y in the logical qubit Li, and a X (I,X,X),Z3U = (I,Z,Z),X3U = (I,I,X). in the stabilizer qubit Si. The Si = X triggers a non-trivial syndrome, which signals the presence of an error. Moreover, The state diagram is crucial for analyzing the properties of because of the self-loop at M = Z that has non-zero logical- the associated code, and also for defining some of its essential weight but zero physical-weight, this error will continue to features. Here, we give some definitions based on the state propagate without triggering additional syndrome bits, while diagram that will be important in our analysis. creating Z’s in Li−1 and Mi−2, and in Li−2 and Mi−3, and so Definition 23 (Path): A path in the state diagram is a se- on. Thus, an error of finite physical-weight results in an error quence of vertices M1,M2,... such that Mi → Mi+1 belongs of unbounded logical-weight, and a finite syndrome. This is to the state diagram. the essence of catastrophic error propagation. Each element of C is naturally associated to a path in the state Catastrophic encoders may have large minimal distances, diagram, which corresponds to the memory states visited upon but perform poorly under iterative decoding. All the codes its encoding. The physical- and logical-weight of a codeword we have considered in our numerical simulations were non- can be obtained by adding the corresponding weights of the catastrophic. In fact, they even satisfied a stronger condition: edges in the path associated to the codeword. More generally, Definition 26 (Completely non-catastrophic code): A com- we will refer to the weight of a path as the sum of the weight pletely non-catastrophic code is such that the only loop in its of its edges. state diagram with physical-weight zero is the self-loop at Im. In the classical setting, non-catastrophicity is insured for to perform a Clifford transformation on each qubit which instance by the use of systematic encoders. For such encoders, amounts to permute X,Y and Z. More precisely: the logical string c is contained as a substring of the encoded Definition 30 (Quantum interleaver): A quantum string c = cV . Systematic quantum encoding can be obtained interleaver Π of size N is an N-qubit symplectic by setting the first k columns of ΛP = 1lk and the first k transformation composed of a π of the N columns of ΣP = µP = 0 (c.f. Eq. (41)). However, this qubit registers and a tensor product of single-qubit symplectic would imply that the stabilizers act trivially on the first k transformation. It acts as follows by multiplication on the output qubits, resulting in a minimal distance equal to 1. right on GN : We conclude that it is not possible to design a systematic (P ,...,P ) 7→ (P K ,...,P K ) quantum encoder with minimal distance greater than 1. Thus, 1 N π(1) 1 π(N) N non-catastrophicity is a condition that needs to be built in by where K1,...,KN are some fixed symplectic matrices acting hand. Fortunately, it can be efficiently verified directly on the on G1. state diagram and we have made great use of this fact. It follows that interleavers preserves the weight of N-Pauli In the classical setting, turbo-codes can be designed with a streams. An interleaved serial concatenation of two quantum minimal distance that grows polynomially with N when the encoders has three basic components: inner code is recursive. Recall that recursive means that the 1) An outer code encoding kOut qubits by embedding them encoder has an infinite impulsive response: when a single 1 in a register of nOut qubits, with encoder V Out, is inputed at any logical wire of the encoding circuit Fig. 2 2) An inner code encoding kIn qubits by embedding them and every other input is 0, the resulting output has infinite in a register of nIn qubits, with encoder V Out and which weight for a code of infinite duration. This definition can be is such that kIn = nOut, generalized to the quantum setting. 3) A quantum interleaver Π of size N = nOut = kIn. Definition 27 (Quasi-recursive encoder): Consider execut- The resulting encoding matrix of the interleaved concate- ing the encoding circuit of Fig. 6 on an input containing a nated code is a symplectic matrix V acting on G In such that single non-identity Pauli operators on a logical wire, with all n other inputs set to I. The corresponding encoder V is quasi- V = V 0OutΠ0V In, recursive when the resulting output has infinite weight when 0Out 0 with the action of V and Π on G In being defined by the code has infinite duration N. n However, it can be verified that this notion of recursiveness (L : SOut : SIn)V 0Out = ((L : SOut)V Out : SIn) (42) is too weak to derive a good lower bound on the minimal Out In distance of turbo-codes. This departure from the theory of for (L : S : S ) ∈ GkOut × GnOut−kOut × GnIn−kIn , and classical codes stems from the fact that quantum codes are (L0 : SIn)Π0 = (L0Π: SIn) (43) coset codes. As in the classical case, the proper definition 0 of a recursive encoder demands that it generates an infinite for L ∈ GnOut . These relations are summarized at Fig. 10. kOut impulsive response. The novelty comes from the fact that The rate of the concatenated code is equal to nIn = kOut kIn this must be true for every elements in the coset associated nOut nIn , that is the product of the rates of the inner code to the impulsive logical input: not only must the encoded and the outer code. version of Xi, Yi, and Zi have weight growing with the A serial quantum turbo-code is obtained from this inter- Out In duration of the code N, but so must every elements of C(Xi), leaved concatenation scheme by choosing V and V as C(Yi), and C(Zi). Formally, we can define recursive quantum quantum convolutional encoders. convolutional encoders in two steps: Definition 28 (Admissible path): A path in the state dia- Out Π gram is admissible if and only if its first edge is not part L In In Out P V

of a zero physical-weight cycle. V SOut SIn Definition 29 (Recursive encoder): A recursive encoder is such that any admissible path with logical-weight 1 starting from a vertex belonging to a zero physical-weight loop does Fig. 10. Circuit diagram for a turbo encoder. not contain a zero physical-weight loop. Once again, this property can be directly and efficiently tested given the seed transformation of the convolutional code D. Figure of merit by constructing its state diagram. There are a number of in-equivalent ways of characterizing the performance of a code. We might use the minimum C. Interleaved serial concatenation distance, but a quantity that is more informative is the weight Quantum turbo-codes are obtained from a particular form enumerator, which counts the number of undetected harmful of interleaved concatenation of quantum convolutional codes. errors of each weight. For a convolutional code however, as Interleaving is slightly more complex in the quantum setting the number K of encoded qubits tends to infinity, the weight since in addition to permuting the qubits it is also possible enumerator will be infinite. Indeed, because of the translational invariance of the encoding circuit, finite error patterns come in code has a minimal distance that grows polynomially with an infinite number of copies obtained by translation. Instead, their length and offer good iterative decoding performances. we can consider the distance spectrum of a non-catastrophic More precisely, random serial turbo-codes have a minimum Out d∗ −2 encoder, which is defined as follows. Out distance which is typically of order N d∗ when the inner Definition 31 (Distance spectrum): The distance spectrum encoder is recursive, where N is the length of the concatenated (F (w)) of a non-catastrophic convolutional encoder is a w≥0 code and dOut the free distance of the outer code [22]. That sequence for which F (w) is the number of admissible paths ∗ the encoder be non-catastrophic is important to obtain good in the state diagram starting and ending in memory states that iterative decoding performances. are part of zero-weight cycles, and with physical-weight w This result and its proof would carry over the quantum and logical weight greater than 0. setting almost verbatim with our definition of recursive en- An other relevant quantity is the distance spectrum for logical- coders. The quantum case is slightly more subtle due to the weight-one elements of C, which is defined similarly. coset structure of the code. Unfortunately, such encoders do Definition 32 (Logical-weight-one distance spectrum): not exist: The distance spectrum for logical-weight-one codewords Theorem 1: Quantum convolutional recursive encoders are F (w) of a non-catastrophic convolutional encoder is the 1 catastrophic. number of admissible paths in the state diagram starting and This result is perhaps surprising since the notions of catas- ending in memory states that are part of zero-weight cycles, trophic and recursive are quite distinct in the classical setting. and with physical-weight w and logical weight 1. Nonetheless, the stringent symplectic constraints imposed to It can easily be seen that the minimum distance of a turbo- the seed transformation U gives rises to a conflicting relation code obtained from the concatenation of two convolutional between them. The proof of Theorem 1 is rather involved. codes is no greater than dOut∗dIn where d = min {F (w) > ∗ 1 1 w 1 Here we present its main steps and leave the details to the 0} and the free minimal distance is d = min {F (w) > 0}. ∗ w appendix. The free distance is defined similarly to the classical case The proof involves manipulation of the rows of the effective by the smallest weight of a harmful undetected error in the encoding matrix Eq. (41) and for that reason, it is more ap- convolutional code with infinite time duration. It is so-to- 2n propriate to view effective Pauli operators as elements of F2 . speak a kind of typical minimal distance for convolutional The proof proceeds directly by demonstrating that the state codes, ignoring finite-size effects. To maximize the minimum diagram of any recursive convolutional encoder contains a distance of the turbo-code, we must use outer codes with large directed cycle with zero physical-weight and non-zero logical free distances d∗ and inner encoders with large value of d1. weight. We first need a characterization of the memory states Recursive encoders for instance have d1 proportional to N, M that can be part of a zero physical-weight cycle. We break and therefore serve as ideal inner codes. However, it happens this into three steps. First, we characterize the set of states that we cannot use recursive encoders as inner codes as we that are the endpoint of edges in the state diagram with zero will see in the next section. Hence, a good rule of thumb is to physical-weight edges. In other words, we want to find all use inner encoders that minimize the value of F1(w) at small 0 2m possible values for the memory element M in F2 such that w, and similarly use an outer code which minimizes the value 2m n−k 2k there exist M ∈ F2 , S ∈ F2 , and L ∈ F2 such that of F (w) at small w. These will result in a turbo-code with a 0 (M : L : S)U = (02n : M ). distance spectrum that is small at low distances. eff Lemma 2: Given a seed transformation U, let S to be the Finally, given an error model, the word error rate (WER) 2m subspace of F2 spanned by the rows of ΣM. The set of and qubit error rate (QER) provide a good operational figure ⊥ endpoints of edges with zero physical label is equal to S of merit. The QER is the probability that an individual ⊥ and conversely, any M in S is the end vertex in the state logical qubit is incorrectly decoded. In other words, the QER diagram of exactly one edge of zero physical-weight. represents the fraction of logical qubits that have errors after If the state diagram contains a zero physical-weight cycle, the decoding. The WER is the probability that at least one it is therefore necessarily supported on the subset of vertices qubit in the block is incorrectly decoded. We expect in general S ⊥. However, edges of zero physical-weight with endpoints QER  WER. The WER is thus a much more strenuous vertices in S ⊥ may originate from vertices outside S ⊥. Such figure of merit that the QER. For instance, if N qubits are edges are not part of zero-physical-weight cycles. The next encoded in N/k block codes for some constant k, then as step is thus to characterize the set of endpoints of edges N increases, the WER approaches 1 exponentially while the with zero physical label and starting point in S ⊥. Since in QER remains constant. As we will see, turbo-codes have a the absence of other inputs each time interval modifies the completely different behavior. In general, we will be interested memory state by M → MµM, we intuitively expect this set in the WER averaged over the choice of interleaver Π. ⊥ to be S0 , where S0 is the smallest subspace containing S and E. Recursive convolutional encoders are catastrophic stable by µM. This is confirmed by the following lemma. Lemma 3: Given a seed transformation U, let In the classical setting, non-catastrophic and recursive con- ∞ volutional encoders are of particular interest. When used as the X i S0 , SµM. (44) inner encoder of a concatenated coding scheme, the resulting i=0 0 ⊥ For any element M of S0 , there exists a unique element M Armed with this result, we are now in a position to prove the ⊥ in S0 , such that there is an edge of physical-weight 0 from main result of this section. M to M 0. Proof: (of Theorem 1 ) Consider a recursive quantum This lemma narrows down the set of vertices in the state convolutional encoder and its associated kernel graph. By diagram that can support zero physical-weight cycles. In Lemma 7, this graph has at least one edge with non-zero particular, we can define a sub-graph of the state diagram logical weight. Let us say that it goes from M0 to M1. From ⊥ obtained from the vertex set S0 and directed edges with trivial Lemma 4, we can follow a directed path of arbitrary length l physical labels. This subgraph is guaranteed to have constant with (M0,M1) as starting edge: in-degree 1 for all its vertices, but some of its vertices may M → M → · · · → M → M . have no outgoing edges. These would definitely not be part 0 1 t−1 t of a cycle. To ensure that all vertices in the subgraph have a If the length of the path is greater than the number of vertices positive number of outgoing edges, we must once more restrict of the graph it must contain at least twice the same vertex. i its set of vertices. The (left) nullspace of µM’s will play a Moreover, M0 must be part of this cycle. Otherwise, we would fundamental role. have a path of the form M0 → M1 → ...Mj → Mj+1 → 2m Notation 7: Let µ be a linear mapping from F2 to itself. · · · → Ml = Mj with j > 0. In this case, Mj would have We denote by Null(µ) the (left) nullspace of µ, that is in-degree 2 which is impossible. In other words, there is a 2m directed cycle in the state diagram with zero physical-weight Null(µ) = {M ∈ F2 |Mµ = 02m}. P∞ i and non-zero logical weight. The corresponding convolutional Notation 8: Let N0 , i=1 Null(µM) and V0 = S0 + N0. Let G be a sub-graph of the state diagram obtained from the encoder is therefore catastrophic. ⊥ vertex set V0 and edges with trivial physical label. This graph V. DECODING is called the kernel graph of the quantum convolutional code This section describes the decoding procedure for turbo- with seed-transformation U. codes operated on memoryless Pauli channels. With an n- By replacing the vertex set S ⊥ by V ⊥, our goal was to 0 0 qubit memoryless Pauli channel, errors are elements of G eliminate any vertex with no outgoing edge. This turns out to n distributed according to a product distribution P(P : P : ... : be successful as shown by the following lemma. 1 2 Lemma 4: The kernel graph has constant in-degree 1 and Pn) = f1(P1)f2(P2) . . . fn(Pn). The depolarizing channel positive out-degree for any vertex. described in Section III is a particular example of such a Thus, any cycle with zero physical-weight must be sup- channel where all fj are equal. We note that our algorithm can ported on the kernel graph of the seed transformation. The be extended to non-Pauli errors using the belief propagation next step in order to prove Theorem 1 is to demonstrate algorithm of [25], but leave this generalization for a future that when the quantum convolutional encoder is recursive, its paper. The decoding algorithm we present is an adaptation to the quantum setting of the usual “soft-input soft-output” corresponding kernel graph G does not only consist of the single zero vertex with a self-loop attached to it, corresponding algorithm used to decode serial turbo-codes (see [2]). It differs from the classical version in several points. to the trivial relation (02m : 02k : 0n−k)Ueff = (02n : 02m). Lemma 5: The kernel graph of a recursive quantum convo- 1) As explained in Subsection III-C, for decoding a quan- lutional encoder has strictly more than one vertex. tum code we do not consider the state of the qubits This result is an essential distinction between the quantum and directly (which belong to a continuous space and which the classical case. In the classical case, when the memory state cannot be measured without being disturbed) but instead is non-zero, it is always possible to create a non-zero physical consider the Pauli error (which is discrete) that has af- output for instance by copying the state of the memory at the fected the quantum state. Decoding consists in inferring output. But this is not possible quantum-mechanically. the transformation that has affected the state rather than Before proving that G contains a cycle with non-zero inferring what the state should be. logical weight, we will first prove that it contains at least one 2) Decoding a quantum code is related to classical “syn- edge with non-zero logical weight. For this purpose, let us drome decoding” (see [27, chapter 47]) with the caveat characterize the subset of edges with zero physical-weight and that errors differing by a combination of the rows of zero logical weight. the parity-check matrix act identically on the codewords. Lemma 6: Given a seed transformation U, let L be the Thus, maximum-likelihood decoding consists in identi- 2m subspace of F2 spanned by the rows of ΛM and ΣM. The fying the most likely error coset given the syndrome. set of endpoints of edges of zero physical and logical weight The coset with largest probability can differ from the is equal to L⊥. one containing the most likely Pauli error. This result and its proof are structurally similar to Lemma 2, 3) We cannot assume as in the classical case that the soft- except that S has been replaced by L. From this, we conclude: input soft-output decoder of the convolutional quantum code starts at the zero-state and ends at the zero-state. Lemma 7: The kernel graph of a recursive quantum con- This is related to the fact that the memory is described volutional encoder contains an edge with non-zero logical in terms of the Pauli error that has affected the qubits weight. rather than reflecting a property of the encoded state. In Out x Instead, we perform a measurement which reveals partial P (L) 1 P (P S ) Π− | information (the X component) about the first memory In x Out In P (L S ) P (P ) element. P (P ) SISOIn | Π SISOOut POut(L Sx) | Let us now describe how each constituent convolutional (Sx)In x Out code is decoded with a soft-input soft-output decoder. (S ) A. Decoding of convolutional codes Fig. 11. Information flow in the iterative turbo decoding procedure. As stated in Definition 18, qubit-wise maximum likelihood consists in finding the logical operator Li that maximizes x the marginal conditional probability P(Li|S ). We call the algorithm actually follows the same logic. As explained in algorithm that computes this probability – but without re- Sec. III-C, the probability on L and S is obtained from the turning the Li that optimizes it – a soft-input soft-output pullback of P(P ) through the encoder V (c.f. Eq. (29)). For (SISO) decoder. The purpose of this section is to explain how a convolutional code, this pullback can be decomposed into such a decoder can be implemented efficiently for quantum elementary steps, each step passing through a single seed convolutional codes. transformation U and intermediate probabilities on We choose to base our presentation solely on the circuit the memory variables. description of the code. Our algorithm is essentially equivalent In addition to the procedure just outlined, the decoder to a sum-product algorithm operated on the trellis of the code must also update the probability P(L) obtained from the [33]. However, the novelties of quantum codes listed above re- pullback of P(P ) conditioned on the value of the observed quire some crucial modifications of the trellis-based decoding. syndrome. This operation is slightly more subtle, and requires We find that these complication are greatly alleviated when not only the pullback of probabilities through the circuit, decoding is formulated directly in terms of the circuit. but also their push-forward (propagating from the left to the Since the distinction between trellis-based and circuit-based right-hand-side of the circuit). For that reason, the decoding decoding are technical rather conceptual, we will present algorithm presented at Algorithm 1 will consist of three steps, the procedure in details and omit its derivation from first a backward pass (Algorithm 2), a forward pass (Algorithm principles. As usual, when operated on a memoryless Pauli 3), and a local update (Algorithm 4). As indicated by their channel, the whole procedure is nothing but Bayesian updating names, these respectively perform a pullback of probabilities, of probabilities. a push-forward of probabilities, and finally an operation that Consider a quantum convolutional code with parameters combines these two probabilities into the final result. (n, k, m, t), seed transformation U and duration N as shown Our description of these algorithms make use of the follow- at Fig 6. We use the same notation as in Subsection IV-A ing notation: and denote by V the associated encoding matrix. Let us recall S (Si)0≤i≤N+t (45) that it maps Gn(N+t)+m to itself. As in Subsection IV-A we , decompose an element P in Gn(N+t)+m (i.e. an error on the S≤i , (Sj)0≤j≤i (46) channel) as P = (P1 : P2 : ··· : PN+t) where the Pi’s belong S>i , (Sj)i

INPUTS: j P(Pi ) for i ∈ [N + t], j ∈ [n], (and j ∈ [n + m] when i = N + t) From physical noise model j P(Li ) for i ∈ [N], j ∈ [k] From turbo decoder Sx From syndrome measurement OUTPUTS: j x P(Pi |S ) for i ∈ [N + t], j ∈ [n] (and j ∈ [n + m] when i = N + t), j x P(Li |S ) for i ∈ [N], j ∈ [k] ALGORITHM: backward pass forward pass local update

Algorithm 2: Backward pass

INPUTS: Same as SISO algorithm OUTPUTS: x P(Mi|S>i) for i ∈ [N + t]. ALGORITHM: {Initialization: P(Mn+t) is given directly by the physical noise model.} for all γ ∈ Gm do m n+j j P(Mn+t = γ) ← Πj=1P(PN+t = γ ) end for {Recursion: first t steps} for i = N + t − 1 to N + 1 do

x X h x i P(Mi|S>i) ∝ P(Pi+1 = (Mi : σ)UP )P(Mi+1 = (Mi : σ)UM |S>i+1) x x σ∈Gn:σ =Si end for {Recursion: last N steps} for i = N to 1 do x X h x i P(Mi|S>i) ∝ P(Li = λ)P(Pi+1 = (Mi : λ : σ)UP )P(Mi+1 = (Mi : λ : σ)UM |S>i+1)

λ∈Gk x x σ∈Gn−k:σ =Si end for

In j In j P (Li ) is the uniform distribution. The distribution P (Pi ) are the single-qubit symplectic transformations that appear in is given directly by the channel model. The only output which the quantum interleaver Π. This yields the output distributions Out j x j x x is used in the following step is the output distribution on the P (Pi |S ) and P(Li |S ) (again, S only refers to the logical variables given the syndrome measured on the inner part of the syndrome attached to the outer code). This step is In j x x ˆ code: P (Li |S ) (S really refers to the part of the syndrome terminated by estimating the most likely error coset L, setting measured for the inner code and not to the whole syndrome, ˆj n Out j x o but we do not attach a “In” to it to avoid cumbersome Li = argmaxγ P (Li = γ|S ) . notation). Then, the outer code is decoded with the SISO algorithm, To iterate this procedure, use the output probability Out j x using as input distribution for the logical variables, as in the P (Pi |S ) as information on the logical variables of the previous case, the uniform distribution. The input distribution inner code: in other words, set as input distribution for inner Out j SISO decoding of the physical variables P (Pi ) is deduced from the logical output distribution of the inner decoder: In j Out jπ iπ x P (Li = γ) = P (Piπ = γKjπ |S ), Out jπ In j j x P (P π = γ) = P (L K = γ|S ), i i i and the distribution of the physical variables are set by the π π π π j where i and j are such that (i , j ) = π(i, j), and the Ki physical channel as before. This is represented by the feedback Algorithm 3: Forward pass

INPUTS: Same as SISO algorithm OUTPUTS: x P(Mi|S≤i) for i ∈ {0,...,N + t − 1}. ALGORITHM: {Initialization: } for all γ ∈ Gm do x x if γ = S0 then x 1 P(M0 = γ|S0 ) ← 2m else x P(M0 = γ|S0 ) ← 0 end if end for {Recursion: } for i = 1 to N + t + 1 do x X h x  i P(Mi|S≤i) ∝ P(Li = λ)P (Pi = (µ : λ : σ)UP ) P Mi−1 = µ|S≤i−1

µ∈Gm,λ∈Gk x x σ∈Gn−k:σ =Si Mi=(µ:λ:σ)UM end for

Algorithm 4: Local update

INPUTS: Same as SISO algorithm x P(Mi|S>i) for i ∈ [N + t] From backward pass x P(Mi|S≤i) for i ∈ {0,...,N + t − 1} From forward pass OUTPUTS: j x P(Pi |S ) for i ∈ [N + t], j ∈ [n] (and j ∈ [n + m] for i = N + t) j x P(Li |S ) for i ∈ [N] and j ∈ [k] ALGORITHM: for i = 1 to N + t do x X h x  x i P(Li|S ) ∝ P(Li)P Mi−1 = µ|S≤i−1 P (Pi = (µ : Li : σ)UP ) P (Mi = (µ : Li : σ)Um|S>i)

µ∈Gm x x σ∈Gn−k:σ =Si x X h x x i P(Pi|S ) ∝ P(Pi)P(Li = λ)P(Mi−1 = µ|S≤i−1)P (Mi = (µ : λ : σ)UM |S>i)

µ∈Gm,λ∈Gk x x σ∈Gn−k:σ =Si Pi=(µ:λ:σ)UP end for {Marginalization: } j x x Compute P(Li |S ) from P(Li|S ) j x x Compute P(Pi |S ) from P(Pi|S ) loop on Fig. 11 where information from the outer decoder is that although PIn(P ) is memoryless, the induced channel π Out Out j returned to the inner decoder. P (P ) = on the outer code obtained from P (Piπ ) = PIn(LjKj|Sx) is not. The decoder ignores this fact and only This procedure can be repeated an arbitrary number i i POut(P j) POut(P ) of times, with each iteration yielding an estimate of the uses the marginals i of . This is the price maximum-likelihood decoder of the outer code. The iterations to pay for an efficient decoding algorithm. can be halted after a fixed number of rounds, or when the VI.RESULTS estimate does not vary from one iteration to the next. Although the decoding scheme is exact for both constituent codes, the The convolutional codes we used for our construction of overall turbo-decoding is sub-optimal. The reason for this is turbo-codes are for the most part generated at random. That w U(3,1,3) U(3,1,4) U(2,1,4) w U(3,1,4) U(3,1,4) U(2,1,4) 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 2 0 0 0 2 0 0 0 3 0 0 0 3 0 0 0 4 0 0 0 4 1 0 0 5 0 0 0 5 11 0 6 6 2 0 0 6 47 11 82 7 4 3 0 7 265 70 442 8 8 0 2 8 1 275 324 3 379 9 16 7 0 9 6 397 1 596 24 074 10 35 0 3 10 31 785 7 773 174 997 11 70 34 2 11 160 311 40 971 1 253 748 12 143 0 0 12 801 232 206 959 9 033 087 13 295 156 2 14 634 0 10 TABLE II 15 1 362 586 12 DISTANCE SPECTRUM F (w) 16 2 802 0 37 17 5 714 2 827 38 18 11 526 0 121 19 23 674 11 430 86 20 48 817 0 280 on the interleaver, which we chose completely at random. In TABLE I all cases, there are most likely lower weight codewords than DISTANCE SPECTRUM F1(w) OFLOGICAL-WEIGHT-ONECODEWORDS the estimate provided by these lower bounds, but those are atypical. On the other hand, the codes are not decoded with a minimum distance decoder, so even a true large minimal distance does not imply low WER. is, we first generate a random seed transformation U of desired The WER of a quantum turbo-code on a depolarization dimensions. Using its state diagram, we then test whether the channel can be estimated using Monte Carlo methods. An corresponding encoder is catastrophic, and if so we reject it error P ∈ GN is generated randomly according to the channel and start over. Non-catastrophicity is the only criterion that model probability distribution. The syndrome associated to we systematically imposed. this error is evaluated, and based on its value, the decoding As a first sieve among the randomly generated non- algorithm (see Sec. V) is executed. The decoding algorithm catastrophic seed transformations, we can study their distance outputs an error estimate P 0. If P − P 0 ∈ C(I), the decoding spectrums and make some heuristic test based on it. Example is accepted, otherwise it is rejected. In other words, the of good seed transformations obtained from this procedure are decoding is accepted only if all K encoded qubits are correctly U(3,1,3) = {2085, 926, 2053, 1434, 910, 3943, 1484, 2881, 3212, recovered. The WER is then the fraction of rejected decodings. 2250, 68, 331}, The WERs as a function of the depolarizing probability p U(3,1,4) = {13159, 10335, 13127, 6554, 10319, 14441, 10625, are shown for a selection of codes on Fig. 12-14. Perhaps the 5835, 832, 13893, 11916, 11329, 8204, 5570}, most striking features of those curves is the existence of a U(2,1,4) = {610, 3323, 760, 1591, 2500, 942, 2290, 794, 1535, pseudo-threshold value of p below which the WER decreases 2202, 2859, 809}, as the number of encoded qubits is increases. Since the codes where the binary symplectic encoding matrix is specified have a bounded minimal distance, this is not a true threshold in by its list of rows and each row is given by the integer the sense that as we keep increasing the number of encoded corresponding to the binary entry. The subscript on the qubits, the WER should start to increase. However, we see encoders specify its parameters (n, k, m). Hence, the first that for modest sizes K of up to 4000, this effect is not 1 two codes have rate 3 but differ by the size of their memory. observed. We do see however that the improvement appears 1 The third code has a higher rate of 2 . The first few values to be saturating around these values. The pseudo-threshold is of the distance spectrum of logical-weight-one codewords for particularly clear for the seed transformation U(3,1,3), where these quantum convolutional code are given at Table I, while it is approximately 0.098, and for the seed transformation the distance spectrum of all codewords are listed at Table II. U(2,1,4) where it is approximately 0.067. Its value for the seed Based on those values, we conclude that the turbo-codes transformation U(3,1,4) is not as clear, but seams to be between obtained from concatenation of code using seed transformation 0.95 and 0.11. U(3,1,4) with itself has a minimal distance no greater that 6 × These values should be compared with the hashing bound, 4 = 24 1 . Similarly, the codes obtained by the concatenation whose value is approximately 0.16024 for a rate 9 code and U 1 of (3,1,4) with itself has minimal distance no greater than 0.12689 for rate 4 . We can also compare with the results 7 × 6 = 42, and the one obtained from the concatenation of obtained from LDPC codes in [28, Figure 10] by evaluating U(2,1,4) with itself has 8 × 6 = 48. the depolarizing probability p at which the WER drops below −4 1 These are upper bounds on the minimal distance and do 10 . For a rate 4 , this threshold was achieved at pth ≈ 0.033 2 not translate directly into the performance of the code. On the (note the convention fm = 3 p) for LDPC codes while the one hand, the actual minimal distance of a turbo-code depends turbo-code shown at Fig. 14 has pth ≈ 0.048. It should also be noted that this improved threshold is achieved with a smaller 1 block size than that used for the LDPC in [28]; a larger block 250 should further improve this result. 500 0.1 1000

1 50 0.01 100 0.1 250 500 0.001 2000 0.01 4000 0.0001

0.001 0.00001 0.04 0.045 0.05 0.055 0.06 0.065 0.07 0.075 0.08

0.0001 Fig. 14. WER vs depolarizing probability p for the quantum turbo- code obtained from the concatenation of the convolutional code with seed 0.00001 0.08 0.09 0.1 0.11 0.12 0.13 transformation U(2,1,4) with itself, for different number of encoded qubits K. Each constituent convolutional code has m = 4 qubits of memory and 1 1 have rate 2 , so the rate of the turbo-code is 4 . Fig. 12. WER vs depolarizing probability p for the quantum turbo- code obtained from the concatenation of the convolutional code with seed transformation U(3,1,3) with itself, for different number of encoded qubits K. Each constituent convolutional code has m = 3 qubits of memory and cases, the slope of the WER increases with the memory size. 1 1 have rate 3 , so the rate of the turbo-code is 9 .

(1,4) 0.01 (2,4) 1 (3,4) 50 (4,4) 100 (1,5) 0.1 250 (2,5) 500 (3,5) 2000 0.001 4000 0.01

0.001

0.0001 0.0001 0.08 0.085 0.09 0.095 0.1

0.00001 0.08 0.09 0.1 0.11 0.12 0.13 Fig. 15. WER vs depolarizing probability p for a quantum turbo-code 1 encoding K = 100 qubits and rate 9 with different memory configurations (mIn, mOut). Fig. 13. WER vs depolarizing probability p for the quantum turbo- code obtained from the concatenation of the convolutional code with seed transformation U(3,1,4) with itself, for different number of encoded qubits K. Each constituent convolutional code has m = 4 qubits of memory and 1 1 VII.CONCLUSION have rate 3 , so the rate of the turbo-code is 9 . In this article, we have presented a detailed theory of As expected, changing the rate of the code directly affects quantum serial turbo-codes based on the interleaved serial the value of the pseudo threshold. This is seen by comparing concatenation of quantum convolutional codes. The descrip- either of Figs. 12 or 13 to Fig. 14. The effect of the memory tion and analysis of these codes was greatly simplified by the size is however less obvious. Comparing Fig. 12 and 13, it use of a circuit representation of the encoder. In particular, appears that the effect of a larger memory is to sharpen the this representation provides a simple definition of the state slope of the WER profile below the pseudo threshold for fixed diagram associated to a quantum convolutional code, and K. In other words, the main impact of the memory size is enables a simple and intuitive derivation of their efficient not in the value of the pseudo threshold, but rather in the decoding algorithm. effectiveness of the error suppression below that threshold. By a detailed analysis of the state diagram, we have shown This conclusion is somewhat supported by Fig. 15 where the that all recursive convolutional encoders have catastrophic WER is plotted for a variety of memory configurations. In all error propagation. Recursive convolutional encoders can be constructed and yield serial turbo-codes with polynomial min- and is therefore spanned by the rows of Ueff. This implies that 2m n−k 2k imal distances. However, they offer extremely poor iterative there exist M ∈ F2 , S ∈ F2 , and L ∈ F2 such that 0 decoding performances due to their unavoidable catastrophic (M : L : S)Ueff = (02n : M ). Furthermore, it can be noticed error propagation. The encoders we have used in our con- from the fact that the rows of Ueff are independent, that if such structions are thus chosen to be non-catastrophic and non- an (M : L : S) exists, it is unique. recursive. While the resulting codes have bounded minimal The proof of Lemma 3 requires a straightforward Fact and distance, we have found that they offer good iterative decoding a Lemma. 0 2m performances over a range of block sizes and word error rates Fact 5: For any M,M ∈ F2 we have that are of practical interest. 0 0 0 Compared to quantum LDPC codes, quantum turbo-codes (MµP : MµM) ? (M µP : M µM) = M?M offer several advantages. On the one hand, there is complete Proof: This is straightforward consequence of the or- freedom in the code design in terms of length, rate, memory thogonality relations satisfied by the first 2m rows of U. 2m 0 size, and interleaver choice. The freedom in the interleaver Lemma 8: Let T ∈ F2 and let M be such that there 2m n−k 2k is crucial since it is the source of the randomness that is exist M ∈ F2 , S ∈ F2 , and L ∈ F2 such that (M : L : 0 responsible for the success of these codes. On the other S)Ueff = (02n : M ). We have hand, the graphical representation of turbo-codes is free of 4- M 0 ? T µ = M?T. (48) cycles that deteriorate the performances of iterative decoding. M Finally, the iterative decoder makes explicit use of the code’s Proof: We observe that degeneracy. This feature is important because turbo-codes, like 0 0 M ? T µM = (02n : M ) ? (T µP : T µM) LDPC codes, have low-weight stabilizers and are hence greatly = (Mµ + LΛ + SΣ : degenerate. P P P In future work, we hope to surmount the obstacle of catas- MµM + LΛM + SΣM) ? (T µP : T µM) trophic error propagation. A concrete avenue is the generalized = (MµP : MµM) ? (T µP : T µM) stabilizer formalism of operator quantum error correction [34], which could circumvent the conclusions of our theorem established in the context of subspace stabilizer codes. Doping where the last equation follows from the fact that any row [45] is an other possibility that we will investigate. of [ΛP :ΛM] or [ΣP :ΣM] is orthogonal to all the rows of Acknowledgments — DP is supported in part by the Gordon [µP : µM]. From this, we conclude that and Betty Moore Foundation through Caltech’s Center for the 0 Physics of Information, by the National Science Foundation M ? T µM = M?T. under Grant No. PHY-0456720, and by the Natural Sciences and Engineering Research Council of Canada. Proof: (of Lemma 3) Since M 0 ∈ S ⊥, there exist by 2m n−k 2k APPENDIX Lemma 2, M ∈ F2 , S ∈ F2 , and L ∈ F2 such that (M : L : S)U = (0 : M 0). Let T ∈ . Using Lemma 8 To prove Lemma 2, we first establish some simple facts: eff 2n S0 2n+2m we obtain Fact 4: The subspace of F2 orthogonal to all the rows of Ueff is the space spanned by the rows of its submatrix M 0 ? T µ = M?T 2n+2m M [ΣP :ΣM]. Similarly, the subspace of F2 orthogonal to   0 µP µM Notice now that M ? T µM = 0 since T µM ∈ S0. From this all the rows of is the space spanned by the rows ⊥ ΣP ΣM M?T = 0. This shows that M belongs to S0 too. The unicity Λ Λ  of M is a consequence of Lemma 2. of P M . ΣP ΣM The following lemma is used in the proof of Lemma 4. 2n+2m 2m Proof: The subspace V of F2 orthogonal to all the Lemma 9: Let µ be a linear mapping from F2 to itself. 2m rows of Ueff is of dimension 2n + 2m − (2m + 2k + n − k) = Let V be a subspace of F2 such that µ(V ) ⊂ V and which n − k. We observe now that the rows of [ΣP :ΣM] are all contains the null space of any positive power of µ. Then for ⊥ 0 ⊥ independent and all orthogonal to the rows of Ueff. They form any M in V there exists M in V such that for any T in 2m therefore a basis of V . This finishes the proof of the first F2 : statement. The second one is obtained by similar arguments. M?T = M 0 ? T µ. Proof: We are first going to prove this statement in the 0 2m Proof: (of Lemma 2 ) Let M ∈ F2 be such that there case 2m n−k 2k ∞ exist M ∈ F2 , S ∈ F2 , and L ∈ F2 such that (M : L : 0 0 [ t S)Ueff = (02n : M ). Notice now that (02n : M ) is spanned V = Null(µ ). t=1 by the rows of Ueff and is therefore orthogonal to all the rows 0 ⊥ of the matrix [ΣP :ΣM]. This implies that M belongs to S . This is a subspace of 2m since the Null(µt)’s are nested sets: 0 0 F2 Conversely, any row vector of the form (02n : M ) with M ⊥ 2 t belonging to S is orthogonal to all the rows of [ΣP :ΣM] Null(µ) ⊂ Null(µ ) ⊂ · · · ⊂ Null(µ ) ⊂ .... t t 2m Let us consider the space Im(µ ) generated by the rows of µ . For such a t, any M in F2 can be expressed as a sum M = 2m 2 Pt i t Since F2 ⊃ Im(µ) ⊃ Im(µ ) ⊃ ... there must exist a posi- N + i=0 TiµM, where N is in Null(µM) and the Ti’s all t t+1 t tive t such that Im(µ )) = Im(µ ). In this case, µ(Im(µ )) = belong to S, i.e. they are of the form Ti = SiΣM for some t t n−k Im(µ ). This implies that the restriction of µ to Im(µ ) is a Si ∈ F2 . Consider now a finite path starting at the origin t t one-to-one mapping and that Null(µ ) ∩ Im(µ ) = {02m}. with logical weight 1 and non-zero physical-weight. We denote t t 2m Since dim(Null(µ )) + dim(Im(µ )) = 2m, we can form a by M its endpoint (which is viewed as an element in F2 ). 2m t+1 basis (T1,...Tl,Tl+1,...,T2m) of F2 such that (T1,...,Tl) We decompose MµM as explained before spans Null(µt) and (T ,...,T ) spans Im(µt). Moreover, l+1 2m t all the Null(µv)’s are equal for v greater than or equal to t. X Mµt+1 = N + S Σ µi This follows directly from the fact that the Im(µv)’s are all M t−i M M i=0 equal in this case. This can be checked by using the relations v v k dim(Null(µ )) + dim(Im(µ )) = 2m. From these equalities, where the Si’s belong to F2 . The path of length t which starts we deduce that dim(Null(µt)) = dim(Null(µt+1)) = ··· = at M and which corresponds to the sequence of pairs of logical v v dim(Null(µ )) = ... . The Null(µ )’s are nested sets and transformations/stabilizer transformations (02k : S0) → (02k : t t+1 v therefore Null(µ ) = Null(µ ) = ··· = Null(µ ) = ... . S1) → · · · → (02k : St) will go from point M to This implies that V = Null(µt). We define U for i in i t 2m X {l + 1,..., 2m} as the unique element in F2 such that Mµt+1 + S Σ µi = N. U µ = T . There exists a unique M 0 such that M t−i M M i i i=0 0 M ?Ti = 0 for i ∈ {1, . . . , l} (49) By extending this path by feeding in t zero transformations 0 t M ?Ti = M?Ui for i ∈ {l + 1,..., 2m} (50) (02k : 0n−k) we go from vertex N to µ (N) which is equal to 02m by definition. This path may then continue by feeding This M 0 belongs to V ⊥ by Equation (49). Note now that we in additional zero transformations and will stay at the zero have defined M 0 in such a way that M?T coincides with vertex forever. This contradicts the fact that the quantum code 0 M ?T µ over the basis (T1,...,Tl,Ul+1,...,U2m). Therefore, is recursive. by linearity of the ? product, we have M?T = M 0 ? T µ for 0 2m Proof: (of Lemma 6) Let M be an element of F2 for all T in 2m. 2m n−k F2 which there exist M ∈ F2 and S ∈ F2 , such that (M : 2k 0 0 The general case is direct consequence of this particular 0 : S)Ueff = (02n : M ). (02n : M ) is spanned by the 0 case. We define M similarly by Equations (49) and (50) and rows of [µP : µM] and [ΣP :ΣM]. By Fact 5, this implies 0 ⊥ 0 it is readily checked that M belongs to V . that (02n : M ) is orthogonal to all the rows of the matrices 0 ⊥ Proof: (of Lemma 4) We know from Lemma 3 that for [ΛP :ΛM] and [ΣP :ΣM]. Hence M should belong to L . On 0 ⊥ ⊥ 0 0 ⊥ any element M in S0 , there exists a unique M in S0 such that the other hand, any (02n : M ) for which M belongs to L there is an edge of zero physical-weight in the state diagram is orthogonal to all the rows of [ΛP :ΛM] and [ΣP :ΣM] and 0 which goes from M to M . To prove that the kernel graph is therefore spanned by the rows of [µP : µM] and [ΣP :ΣM]. has constant in-degree 1 we just have to prove that when M 0 ⊥ ⊥ belongs to the subset V0 of S0 the corresponding M also Proof: (of Lemma 7) This amounts to prove that there 0 belongs to this subset. Since for any T ∈ N0 we have M ?T = exists a vertex in the kernel graph which does not belong to 0 and since is stable by applying µ to the left we obtain ⊥ ⊥ N0 M L . The set of vertices of the kernel graph is V0 . Therefore, 0 for a such a T , M?T = M ? T µM = 0. This shows that M we need to find an element of L that is not in V0. In particular, ⊥ ⊥ also belongs to N0 which shows that M belongs to V0 . we would be done if there existed a row of ΛM which does On the other hand, by applying Lemma 9 with V = V0, we not belong to V0. know that for any vertex M of the kernel graph, there is an Assume the opposite. Let t be the integer such that V0 = M 0 belonging also to V ⊥ such that for any T in 2m: t Pt i 2m 0 F2 Null(µM) + i=0 SµM. Then, for every L in F2 of weight n−k 0 1 and any integer k, there exists S0,S1,...,St in F2 and M?T = M ? T µM. t a N in Null(µM) such that 0 Note that given such an M there is a unique M which t satisfies the aforementioned equality for all T . Therefore M k X i LΛMµM = N + St−iΣMµM. is necessarily the starting vertex of the unique directed edge i=0 of physical-weight 0 having as endpoint M 0. Proof: (of Lemma 5) We just have to prove that the set Consider a finite path of non-zero physical-weight and logical 2m weight 1 starting at the origin and ending at a vertex M. V0 is not equal to the whole space F2 . We proceed by 2m Assume that this path corresponds to the sequence of pairs contradiction. Assume that V0 = F2 . Notice now that there exists a finite number t such that of logical/stabilizer inputs t (02k : S0) → (02k : S1) → · · · → t X i V0 = Null(µM) + SµM. (02k : Si−1) → (L : Si) → (02k : Si+1) → · · · → (02k : Su), i=0 (i.e. the only time where the logical transformation is non-zero [13] D. P. DiVincenzo and P. Aliferis, Effective fault-tolerant quantum is at time i and is equal to L which is assumed to be of weight computation with slow measurements, Phys. Rev. Lett., 98, (2007) p. 020501. 1 ). The final memory state would then be [14] D. P. DIVINCENZO, P. W. SHOR, AND J.A.SMOLIN, Quantum-channel u capacity of very noisy channels, Phys. Rev. A, 57 (1998), p. 830. u−i X i [15] J. G. D. FORNEY,M.GRASSL, AND S.GUHA, Convolutional and tail- M = LΛMµ + Su−iΣMµ . (51) M M biting quantum error-correcting codes, IEEE Transactions on Informa- i=0 tion Theory, 53 (2007) p. 865. Since, by assumption, the rows of ΛM are in V0, there exists [16] R. G. GALLAGER, Low Density Parity Check Codes, M.I.T. Press, S0 ,...,S0 in n−k and N 0 in Null(µt ) such that Cambridge, Massachusetts, 1963. 0 t F2 M [17] D. GOTTESMAN, Stabilizer codes and quantum error correction, PhD u t thesis, California Institute of Technology, Pasadena, CA, 1997, quant- u+t+1−i X i+t+1 0 X 0 i ph/9705052. LΛMµM + Su−iΣMµM = N + St−iΣMµM. [18] M. GRASSLAND M.ROTTELER¨ , Non-catastrophic encoders and en- i=0 i=0 (52) coder inverses for quantum convolutional codes, in Proc. ISIT, IEEE (2006) p. 1109. Thus, if we extend the path by the sequence of inputs [19] M. HAGIWARA AND H.IMAI, Quantum quasi-cyclic ldpc codes, 2007, 0 0 0 quant-ph0701020. (02k : S0) → (02k : S1) → · · · → (02k : St), [20] M. HASTINGS, Quantum belief propagation: An algorithm for thermal 0 quantum systems, Phys. Rev. B, 76 (2007) p. 201102. we arrive at the vertex M which satisfies [21] A. HOCQUENGHEM, Codes correcteurs d’erreurs, Chiffres, 2 (1959), t p. 147. 0 t+1 X 0 i [22] N. KAHALEAND R.URBANKE, On the minimum distance of parallel M = Mµ + S ΣMµ M t−i M and serially concatenated codes, in Proc. IEEE Int. Symp. Info. Theo. i=0 (ISIT’98), 1998, p. 31. u u+t+1−i X i+t+1 [23] E. KNILLAND R.LAFLAMME, Theory of quantum error-correcting = LΛMµM + Su−iΣMµM codes, Phys. Rev. A, 55 (1997), p. 900. i=0 [24] C. LAUMANN,A.SCARDICCHIO, AND S.SONDHI, Cavity method for t quantum spin glasses on the Bethe lattice, 2007, arXiv:0706.4391. X + S0 Σ µi [25] M. LEIFERAND D.POULIN, Quantum graphical models and belief t−i M M propagation, Ann. Phys., 323 (2007) p. 1899. i=0 [26] S. LLOYD, Capacity of the noisy quantum channel, Phys. Rev. A, 55 0 = N (1997), p. 1613. [27] D. J. C. MACKAY, Information Theory, Inference and Learning Algo- Extending this whole sequence by adding t zero transforma- rithms, Cambridge University Press, Cambridge, UK, October 2003. tions (02k : 0n−k) will bring this path back to the origin [28] D. J. C. MACKAY,G.MITCHISON, AND P. L. MCFADDEN, Sparse 0 t graph codes for quantum error-correction, IEEE Trans. Info. Theor., 50 since N in in the kernel of µM. Once at the origin, then (2004), p. 2315. encoder can remain in that state forever without any additional [29] M. MEZARDAND´ A.MONTANARI, Constraint Satisfaction Networks in physical output. This implies that the code is non recursive, Physics and Computation, Clarendon Press, 2007. and completes the proof. [30] M. A. NIELSENAND I.L.CHUANG, Quantum Computation and Quantum Information, Cambridge University Press, Cambridge, UK, REFERENCES 2000. [31] H. OLLIVIERAND J.-P. TILLICH, Description of a quantum convolu- [1] S. A. ALY,A.KLAPPENECKER, AND P. K. SARVEPALLI, On quantum tional code, Phys. Rev. Lett., 91 (2003), p. 177902. and classical BCH codes, IEEE Trans. Info. Theor., 53 (2007), p. 1183. [32] , Quantum convolutional codes: fundamentals, 2004, quant- [2] S. BENEDETTO,D.DIVSALAR,G.MONTORSI, AND F. POLLARA, ph/0401134. Serial concatenation of interleaved codes: performance analysis, design, [33] , Trellises for stabilizer codes: definition and uses, Phys. Rev. A, and iterative decoding, IEEE Trans. Info. Theor., 44 (1998). 74 (2006), p. 032304. [3] C. H. BENNETT, D. P. DIVINCENZO, AND J.A.SMOLIN, Capacities of [34] D. POULIN, Stabilizer formalism for operator quantum error correction, quantum erasure channels, Phys. Rev. Lett., 78 (1997), pp. 3217–3220. Phys. Rev. Lett., 95 (2005), p. 230504. [4] C. H. BENNETT, D. P. DIVINCENZO,J.A.SMOLIN, AND W. K. [35] D. POULIN, Optimal and efficient decoding of concatenated quantum WOOTTERS, Mixed state entanglement and quantum error-correcting block codes, Phys. Rev. A, 74 (2006), p. 052333. codes, Phys. Rev. A, 54 (1996), p. 3824. [36] D. POULINAND E.BILGIN, Belief propagation algorithm for computing [5] E. R. BERLEKAMP,R.J.MCELIECE, AND H. VAN TILBORG, On the correlation functions in finite-temperature quantum many-body systems inherent intractability of certain coding problems, IEEE Trans. Info. on loopy graphs, Phys. Rev. A 77 (2008) p. 52318. Theor., 24 (1978), p. 384. [37] D. POULINAND Y.-J. CHUNG, On the iterative decoding of sparse [6] C. BERROU,A.GLAVIEUX, AND P. THITIMAJSHIMA, Near shannon quantum codes Quant. Info. and Comp. 8 (2008) p. 986. limit error-correcting coding and decoding, in ICC’93, Geneve,` Switzer- [38] I. S. REEDAND G.SOLOMON, Polynomial codes over certain finite land, May 1993, pp. 1064–1070. fields, J. SIAM, 8 (1960), p. 300. [7] R. C. BOSEAND D.K.RAY-CHAUDHURI, On a class of error- correcting binary group codes, Info. Contr., 3 (1960). [39] C. E. SHANNON, A mathematical theory of communication, Bell System Tech., 27 (1948), pp. 379, 623. [8] A. R. CALDERBANKAND P. W. SHOR, Good quantum error-correcting codes exist, Phys. Rev. A, 54 (1996), pp. 1098–1105, quant-ph/9512032. [40] P. SHOR, Msri workshop on quantum computation. [9] T. CAMARA,H.OLLIVIER, AND J.-P. TILLICH, A class of quantum http://www.msri.org/publications, 2002. LDPC codes: construction and performances under iterative decoding, [41] P. W. SHOR, Scheme for reducing decoherence in quantum computer in Proceedings of ISIT 2007, Nice: IEEE, June 2007, pp. 811–815. memory, Phys. Rev. A, 52 (1995), p. 2493. [10] H. F. CHAU, Good quantum convolutional error correction codes and [42] G. SMITHAND J.A.SMOLIN, Degenerate coding for Pauli channels, their decoding algorithm exist, 1998, quant-ph/9806032. 2006, quant-ph/0604107. [11] , Quantum convolutional correcting codes, Phys. Rev. A, 58 [43] A. M. STEANE, Error correcting codes in quantum theory, Phys. Rev. (1998), p. 905. Lett., 77 (1996), p. 793. [12] I. DEVETAK, The private classical capacity and quantum capacity of a [44] , Simple quantum error correcting codes, Phys. Rev. A, 54 quantum channel, IEEE Trans. Info. Theor., 51 (2005), p. 44. (1996), p. 4741. [45] S. TEN BRINK, Designing iterative decoding schemes with the extrinsic information transfer chart, AEU Int. J. Electron. Commun., 54 (2000), p. 389. [46] J. S. YEDIDIA, Advanced mean field methods: theory and practice, MIT Press, 2001, ch. An idiosyncratic journey beyond mean field theory, p. 21.

David Poulin received a Ph.D. in Physics from the University of Waterloo in 2004. He has been a Postdoctoral Fellow at The University of Queensland in 2005 and at Caltech during the years 2006-2008. He joined the Physics department of the Universite´ de Sherbrooke in 2008 where he is currently an Assistant Professor. His research interests include the theory of quantum error correction, quantum algorithms, and numerical methods for the simulation of quantum many-body systems.

Jean-Pierre Tillich (M’06) was born in Mulhouse, France, in 1966. He received the Engineer degree from Ecole´ des Mines de Paris, Paris, France, in 1989 and the Ph.D. degree in computer science from Ecole´ Nationale Suprieure des Tlcom- munications (ENST), Paris, in 1994. From 1997 to 2003, he was an Assistant Professor at the University Paris XI. He is now a Researcher at the Institut de Recherche en Informatique et Automatique (INRIA), Rocquencourt, Le Chesnay, France. His research interests include classical and quantum coding theory, cryptography, and graph theory.

Harold Ollivier holds a Ph.D. from Ecole´ Polytechnique received in 2004 for his work on quantum foundations, deco- herence and error correction. He joined Perimeter Institute as a Postdoctoral Fellow until 2006 where he further developped new quantum error correction schemes. He later joined the French Ministry for Finance and Economy where he was in charge of venture capital policies. He now manages the Institut Louis Bachelier and the Fondation du Risque, two non-profit strutures dedicated to funding academic research in finance, insurance, and risk management.