Positivity and nonadditivity of quantum capacities using generalized erasure channels

Vikesh Siddhu∗ and Robert B. Griffiths† Department of Physics, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, U.S.A. Date: 1 March 2020

Abstract

The positivity and nonadditivity of the one-letter quantum capacity (maximum coherent information) $Q^{(1)}$ is studied for two simple examples of complementary channel pairs $(\mathcal{B}, \mathcal{C})$. They are produced by a process we call gluing, which combines two or more channels to form a composite. (We discuss various other forms of gluing, some of which may be of interest for applications outside those considered in this paper.) An amplitude-damping qubit channel with damping probability $0 \le p \le 1$ glued to a perfect channel is an example of what we call a generalized erasure channel, characterized by an erasure probability $\lambda$ along with $p$. A second example, using a phase-damping rather than amplitude-damping qubit channel, results in the dephrasure channel of Leditzky et al. [Phys. Rev. Lett. 121, 160501 (2018)]. In both cases we find the global maximum and minimum of the entropy bias or coherent information, which determine $Q^{(1)}(\mathcal{B}_g)$ and $Q^{(1)}(\mathcal{C}_g)$, respectively, and the ranges in the $(p, \lambda)$ parameter space where these capacities are positive or zero, confirming previous results for the dephrasure channel. The nonadditivity of $Q^{(1)}(\mathcal{B}_g)$ for two channels in parallel occurs in a well-defined region of the $(p, \lambda)$ plane for the amplitude-damping case, whereas for the dephrasure case we extend previous results to additional values of $p$ and $\lambda$ at which nonadditivity occurs. For both cases, $Q^{(1)}(\mathcal{C}_g)$ shows a peculiar behavior: when $p = 0$, $\mathcal{C}_g$ is an erasure channel with erasure probability $1 - \lambda$, so $Q^{(1)}(\mathcal{C}_g)$ is zero for $\lambda \le 1/2$. However, for any $p > 0$, no matter how small, $Q^{(1)}(\mathcal{C}_g)$ is positive, though it may be extremely small, for all $\lambda > 0$. Despite the simplicity of these models we still lack an intuitive understanding of the nonadditivity of $Q^{(1)}(\mathcal{B}_g)$ and the positivity of $Q^{(1)}(\mathcal{C}_g)$.

Contents

1 Introduction

2 Preliminaries

3 Glued Isometries and Channels

4 Generalized Erasure Channel Pair

5 Applications
  5.1 Generalized Erasure using Qubit Amplitude Damping Channel
  5.2 Generalized Erasure with Qubit Dephasing Channel
  5.3 Incomplete Erasure Channel

6 Summary and Conclusions

A Appendix. Concatenation and antidegradable channels

B Appendix. Asymptotic estimates of $Q^{(1)}(\mathcal{B}_g)$ and $Q^{(1)}(\mathcal{C}_g)$

arXiv:2003.00583v1 [quant-ph] 1 Mar 2020


1 Introduction

Understanding the capacity of a noisy quantum channel to transmit information is a central and challenging problem in quantum information theory. In contrast to the case of a classical channel, one can define several capacities for a quantum channel, among them the capacity to transmit classical information [1, 2], the private capacity [3], and, the subject of the present paper, the quantum capacity, a measure of its ability to transmit quantum information. The asymptotic capacity $C$ of a classical channel was shown by Shannon [4] to be equal to the mutual information between input and output, when maximized over probability distributions of the input. An analog of this mutual information for a quantum channel $\mathcal{B}$ is the entropy bias or coherent information $\Delta(\mathcal{B}, \rho)$ [5], the difference of the von Neumann entropies of the outputs of $\mathcal{B}$ and its complementary channel $\mathcal{C}$ for a given input density operator $\rho$. Maximizing this over $\rho$ yields a non-negative real number, the single-letter quantum capacity (sometimes also called the channel coherent information) $Q^{(1)}(\mathcal{B})$. (A quantum channel $\mathcal{B}$ of the kind considered here is always a member of a complementary pair of channels $(\mathcal{B}, \mathcal{C})$, with $\mathcal{C}$ the complement of $\mathcal{B}$ and vice versa, generated by a single isometry as discussed in Sec. 2.) A significant difference between quantum and classical channels is that when two classical channels are placed in parallel the capacity $C$ of the combination is simply the sum of the individual capacities, whereas in the quantum case when channel $\mathcal{B}$ is placed in parallel with $\mathcal{B}'$ one has only an inequality:

$$Q^{(1)}(\mathcal{B} \otimes \mathcal{B}') \ge Q^{(1)}(\mathcal{B}) + Q^{(1)}(\mathcal{B}'). \tag{1}$$

This inequality can be strict, i.e., $Q^{(1)}$ can be nonadditive [6]. Nonadditivity makes it difficult to calculate the asymptotic quantum capacity $Q(\mathcal{B})$ of a channel $\mathcal{B}$, the limit as $n \to \infty$ of $Q^{(1)}(\mathcal{B}^{\otimes n})/n$ [3, 7, 8]. In addition, due to nonadditivity the asymptotic capacity $Q(\mathcal{B} \otimes \mathcal{B}')$ of two quantum channels $\mathcal{B}$ and $\mathcal{B}'$ used in parallel may be greater than $Q(\mathcal{B}) + Q(\mathcal{B}')$ [9, 10], which implies that the asymptotic $Q$, unlike its classical counterpart $C$, does not completely capture a channel's ability to transmit quantum information.

The mathematical or physical principles behind nonadditivity are at present not well understood. Simple examples of nonadditivity are hard to construct. One source of difficulty is finding the global maximum of a function $\Delta(\mathcal{B}, \rho)$ which in general is not a concave function of $\rho$. If both $\mathcal{B}$ and $\mathcal{B}'$ are channels that are either degradable or antidegradable (see comments at the end of Sec. 2) it is known that (1) is an equality, and therefore $Q(\mathcal{B}) = Q^{(1)}(\mathcal{B})$, and $Q(\mathcal{B} \otimes \mathcal{B}') = Q(\mathcal{B}) + Q(\mathcal{B}')$ [11, 12]. For an antidegradable channel $Q^{(1)}(\mathcal{B}) = Q(\mathcal{B}) = 0$, and the same is true for entanglement-binding channels [13]. But apart from special cases such as these it is in general not easy to determine whether $Q^{(1)}$ or $Q$ is positive or zero [14, 15].

For the case of two identical channels in parallel, $\mathcal{B}' = \mathcal{B}$, a simple example of nonadditivity has recently been constructed by Leditzky et al. [16] using what they call the dephrasure channel. To show nonadditivity they first find $Q^{(1)}(\mathcal{B})$, and then make a guess or ansatz $\hat{\rho}_2$ for a bipartite input density operator for which $\Delta(\mathcal{B}^{\otimes 2}, \hat{\rho}_2)$, a lower bound for $Q^{(1)}(\mathcal{B}^{\otimes 2})$, is larger than $2Q^{(1)}(\mathcal{B})$. The ansatz approach can be extended to $n$ identical channels in parallel in an obvious way to look for cases where $\Delta(\mathcal{B}^{\otimes n}, \hat{\rho}_n)$ (and hence $Q^{(1)}(\mathcal{B}^{\otimes n})$) exceeds $nQ^{(1)}(\mathcal{B})$.
This approach has been successfully applied for $n \ge 5$ to the qubit depolarizing channel [6] where $Q^{(1)}$ is known, and to other qubit Pauli channels [17, 18] where $Q^{(1)}$ is believed to be known.

Our exploration of some of these issues begins with a general procedure for combining several quantum channels to form a new channel through a process we call gluing. It differs from the familiar procedures of placing channels in parallel or series, and it puts together in a single overall structure concepts such as subchannels and convex combinations of channels. A particular type of gluing results in what we call a block diagonal channel pair, an instance of which is the much-studied and well-understood erasure channel [19] with erasure probability $0 \le \lambda \le 1$, whose complement is also an erasure channel. The erasure channel can be regarded as the result of gluing together two perfect channels as discussed in Sec. 4. When one of the perfect channel pairs is replaced by an arbitrary complementary channel pair $(\mathcal{B}_1, \mathcal{C}_1)$ the result is a generalized erasure channel pair $(\mathcal{B}_g, \mathcal{C}_g)$. The $\mathcal{B}_g$ channel can be viewed as a concatenation of $\mathcal{B}_1$ with an erasure channel, and $\mathcal{C}_g$ as an "incomplete erasure" channel.

We study two cases of such generalized erasure channel pairs. In the first, $\mathcal{B}_1$ is a qubit-to-qubit amplitude-damping channel, as is its complement $\mathcal{C}_1$. In the second, $\mathcal{B}_1$ is a qubit-to-qubit phase-damping channel whose complement is a measure-and-prepare channel; here $\mathcal{B}_g$ is the dephrasure channel. In both cases the qubit channel pair $(\mathcal{B}_1, \mathcal{C}_1)$ depends on a parameter $0 \le p \le 1$, and thus $(\mathcal{B}_g, \mathcal{C}_g)$ depends on two parameters, $p$ and the erasure probability $\lambda$. For all values of these parameters we compute $Q^{(1)}(\mathcal{B}_g)$ and $Q^{(1)}(\mathcal{C}_g)$ by performing a global optimization and find the $(p, \lambda)$ values for which they are positive. The dependence of $Q^{(1)}(\mathcal{C}_g)$ on these parameters is rather surprising (see Fig. 4 and the accompanying discussion) and worth further study.

In both the amplitude-damping and phase-damping cases we find nonadditivity, a strict inequality in (1), when both $\mathcal{B}$ and $\mathcal{B}'$ are $\mathcal{B}_g$. Our results in the amplitude-damping case indicate that nonadditivity occurs over a well-defined region in the space $(p, \lambda)$ of parameters, as shown in Fig. 2. For the phase-damping case, where $\mathcal{B}_g$ is the dephrasure channel, our numerical results confirm and also extend the region of nonadditivity identified in [16], but without finding its precise boundaries. In addition we have carried out a limited exploration of higher-order nonadditivity by using various ansatzes, but without finding any improvement over the two-letter results.

The remainder of this paper is structured as follows. Section 2 contains preliminary definitions and notation: in particular our use of isometries to construct a channel pair, and the use of projective decompositions of the identity (PDIs) to identify orthogonal subspaces. Definitions of the entropy bias $\Delta(\mathcal{B}, \rho)$, the single-letter quantum capacity $Q^{(1)}(\mathcal{B})$ of a channel $\mathcal{B}$, and (anti)degradable channels are also found in this section. Various gluing procedures for combining two or more channels are discussed at some length in Sec. 3. The particular procedure that yields a block diagonal channel pair (see eq. (23)) is employed in Sec. 4 to define a generalized erasure channel pair. The amplitude-damping case is discussed in Sec. 5.1, and the phase-damping (dephrasure) case in Sec. 5.2. The surprising positivity of $Q^{(1)}$ in the "incomplete erasure" situation found in both cases is discussed in Sec. 5.3. A concluding Sec. 6 contains a summary of our results and an indication of some unsolved problems that deserve further study. It is followed by two appendices devoted to some technical issues and details.

2 Preliminaries

A quantum channel (completely positive trace-preserving map) can always be constructed using an isometry $J$,
$$J : \mathcal{H}_a \mapsto \mathcal{H}_b \otimes \mathcal{H}_c; \quad J^\dagger J = I_a, \tag{2}$$
mapping the Hilbert space $\mathcal{H}_a$ representing the channel's input to a subspace of $\mathcal{H}_b \otimes \mathcal{H}_c$, where $\mathcal{H}_b$ and $\mathcal{H}_c$ represent the direct and complementary channel outputs. The isometry preserves inner products, and this is ensured by the condition $J^\dagger J = I_a$, where $I_a$ is the identity operator on $\mathcal{H}_a$. We assume that the dimensions $d_a$, $d_b$ and $d_c$ of the three Hilbert spaces are finite and satisfy $d_a \le d_b d_c$, but are otherwise unrestricted. The isometry results in a pair of quantum channels with superoperators
$$\mathcal{B}(A) = \mathrm{Tr}_c(J A J^\dagger), \quad \mathcal{C}(A) = \mathrm{Tr}_b(J A J^\dagger), \tag{3}$$

that map $\hat{\mathcal{H}}_a$, the space of operators on $\mathcal{H}_a$, to the operator spaces $\hat{\mathcal{H}}_b$ and $\hat{\mathcal{H}}_c$, respectively. Given the superoperator $\mathcal{B}$, the corresponding isometry $J$, and thus the superoperator $\mathcal{C}$, is uniquely determined up to a unitary acting on $\mathcal{H}_c$.∗ Likewise, $\mathcal{C}$ determines $\mathcal{B}$ up to a unitary on $\mathcal{H}_b$. One refers to $\mathcal{C}$ as the complement of $\mathcal{B}$, or $\mathcal{B}$ as the complement of $\mathcal{C}$, and the two together as a complementary pair of channels.

We shall be concerned with orthogonal subspaces of $\mathcal{H}_a$, $\mathcal{H}_b$ and $\mathcal{H}_c$, and it is convenient to represent the subspaces using a projective decomposition of the identity (PDI), a collection of mutually orthogonal projectors that sum to the identity. Thus a PDI $\{P_j\}$ of $\mathcal{H}_a$ is a set of projectors such that
$$P_j = P_j^\dagger = P_j^2, \quad P_i P_j = \delta_{ij} P_j, \quad \text{and} \quad \sum_j P_j = I_a. \tag{4}$$
Using it one can define the $j$'th subspace $\mathcal{H}_{aj}$ of $\mathcal{H}_a$ as
$$\mathcal{H}_{aj} = P_j \mathcal{H}_a; \tag{5}$$

that is to say, the collection of all $|\psi\rangle$ such that $P_j|\psi\rangle = |\psi\rangle$, which means they are orthogonal to any $|\phi\rangle$ in $\mathcal{H}_{ak}$ with $k \ne j$. Thus $\mathcal{H}_a$ is a direct sum,
$$\mathcal{H}_a = \bigoplus_j \mathcal{H}_{aj}, \tag{6}$$
of these orthogonal subspaces. Similarly, a PDI $\{Q_j\}$ can be used to partition $\mathcal{H}_b$ into subspaces $\mathcal{H}_{bj} = Q_j \mathcal{H}_b$, and $\{R_j\}$ to partition $\mathcal{H}_c$ into $\mathcal{H}_{cj} = R_j \mathcal{H}_c$.

∗ This assumes the complementary output space $\mathcal{H}_c$ is as small as possible, which is to say it is the support of $\mathcal{C}(I_a)$. If one allows $\mathcal{H}_c$ to have higher dimension than the support, "unitary" must be replaced with "partial isometry"; see Sec. 5.2 in [20].

The coherent information or entropy bias of a channel $\mathcal{B}$ with complement $\mathcal{C}$ for an input density operator $\rho$ in $\hat{\mathcal{H}}_a$ is
$$\Delta(\mathcal{B}, \rho) = S(\mathcal{B}(\rho)) - S(\mathcal{C}(\rho)), \tag{7}$$
where $S(\rho) = -\mathrm{Tr}(\rho \log_2 \rho)$ is the von Neumann entropy of $\rho$ (in base 2). Since the entropy bias is the difference of two entropy functions, each of which is concave in $\rho$, the bias itself need not be concave or convex in $\rho$. The one-letter quantum capacity, or channel coherent information, for each channel in the $(\mathcal{B}, \mathcal{C})$ pair is given by
$$Q^{(1)}(\mathcal{B}) = \max_\rho \Delta(\mathcal{B}, \rho), \quad Q^{(1)}(\mathcal{C}) = -\min_\rho \Delta(\mathcal{B}, \rho). \tag{8}$$
The channel $\mathcal{B}$ is said to be degradable and $\mathcal{C}$ antidegradable if there exists a quantum channel $\mathcal{D}$ such that $\mathcal{C} = \mathcal{D} \circ \mathcal{B}$, i.e., if the output of $\mathcal{B}$ is made the input of $\mathcal{D}$ the result is $\mathcal{C}$. The entropy bias $\Delta(\mathcal{B}, \rho)$ of a degradable channel is concave [21] in $\rho$, so $Q^{(1)}(\mathcal{B})$, which for a degradable channel equals $Q(\mathcal{B})$ (see comments in Sec. 1), can be computed easily, and of course $Q(\mathcal{C}) = 0$.
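The definitions in (3), (7) and (8) translate directly into a few lines of linear algebra. The following is a minimal sketch, not code from the paper: it assumes NumPy, stores $J$ as a matrix whose row index runs over the product basis $|b\rangle|c\rangle$ of $\mathcal{H}_b \otimes \mathcal{H}_c$, and uses a hypothetical helper `random_isometry` purely to produce a test input.

```python
import numpy as np

def channel_pair(J, A, db, dc):
    """Return (B(A), C(A)) of eq. (3); J has shape (db*dc, da), rows ordered as (b, c)."""
    M = (J @ A @ J.conj().T).reshape(db, dc, db, dc)  # indices (b, c, b', c')
    B = np.einsum('bcdc->bd', M)  # partial trace over H_c
    C = np.einsum('bcbd->cd', M)  # partial trace over H_b
    return B, C

def entropy(rho):
    """von Neumann entropy in bits; eigenvalues below 1e-12 are treated as zero."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log2(w)))

def entropy_bias(J, rho, db, dc):
    """Delta(B, rho) = S(B(rho)) - S(C(rho)), eq. (7)."""
    B, C = channel_pair(J, rho, db, dc)
    return entropy(B) - entropy(C)

def random_isometry(rng, d_out, d_in):
    """Hypothetical helper: an isometry taken from the reduced QR factor of a random matrix."""
    X = rng.normal(size=(d_out, d_in)) + 1j * rng.normal(size=(d_out, d_in))
    Q, _ = np.linalg.qr(X)  # columns of Q are orthonormal, so Q^dagger Q = I
    return Q

rng = np.random.default_rng(1)
da, db, dc = 2, 2, 3
J = random_isometry(rng, db * dc, da)
rho = np.diag([0.7, 0.3]).astype(complex)
B, C = channel_pair(J, rho, db, dc)
print("J^dagger J = I_a:", np.allclose(J.conj().T @ J, np.eye(da)))
print("Tr B(rho), Tr C(rho):", np.trace(B).real, np.trace(C).real)
print("entropy bias:", entropy_bias(J, rho, db, dc))
```

Estimating $Q^{(1)}(\mathcal{B})$ from (8) then amounts to maximizing `entropy_bias` over density operators, which is the hard part when the bias is not concave.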

3 Glued Isometries and Channels

There are various ways of combining quantum channels and their corresponding isometries. A concatenation of two channels in which the output of the first becomes the input of the second corresponds to the concatenation of the two isometries. When two channels or channel pairs are placed in parallel, the input space of the combination is the tensor product of the two input spaces, likewise the direct and complementary output spaces are the corresponding tensor products, and the isometry for the combination is the tensor product of the individual isometries. But in addition isometries can be combined in such a way that one or more of the input, direct output and complementary output spaces are subspaces of larger Hilbert spaces, a process which we refer to as gluing. The idea will become clear from the following examples.

Consider a collection of isometries
$$J_j : \mathcal{H}_{aj} \mapsto \mathcal{H}_{bj} \otimes \mathcal{H}_{cj} \tag{9}$$
in the notation of Sec. 2, where the $\mathcal{H}_{aj}$ are either distinct orthogonal subspaces of $\mathcal{H}_a$, or all equal to $\mathcal{H}_a$, and the same for the $\mathcal{H}_{bj}$ and $\mathcal{H}_{cj}$; see (5) and (6). We shall, in what follows, assume the convenient, but not absolutely necessary, condition:
$$J_j^\dagger J_k = 0 \quad \text{for } j \ne k. \tag{10}$$
Finally, the overall isometry $J : \mathcal{H}_a \mapsto \mathcal{H}_b \otimes \mathcal{H}_c$, obtained from gluing the collection of isometries in (9), is given by a sum
$$J := \sum_j \nu_j J_j, \tag{11}$$
where the $\nu_j$ are positive numbers. The condition $J^\dagger J = I_a$ for $J$ to be an isometry is then
$$\sum_j \nu_j^2 J_j^\dagger J_j = I_a. \tag{12}$$

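The isometry $J$ in (11) defines a pair of channels $\mathcal{B}$ and $\mathcal{C}$ through (3). Perhaps the simplest example of gluing is when each $J_j$ is simply $J$ applied to the subspace $\mathcal{H}_{aj} = P_j \mathcal{H}_a$, while $\mathcal{H}_{bj} = \mathcal{H}_b$ and $\mathcal{H}_{cj} = \mathcal{H}_c$, independent of $j$, and thus
$$J_j = J P_j, \quad J_j^\dagger J_j = P_j J^\dagger J P_j = P_j I_a P_j = P_j, \tag{13}$$
where the projector $P_j$ is the identity operator on $\mathcal{H}_{aj}$. Then with every $\nu_j = 1$ in (11) one has
$$J = \sum_j J_j = \sum_j J P_j = J I_a. \tag{14}$$
The corresponding subchannels $\mathcal{B}_j$ and $\mathcal{C}_j$ are given by the expressions
$$\mathcal{B}_j(A) = \mathcal{B}(P_j A P_j), \quad \mathcal{C}_j(A) = \mathcal{C}(P_j A P_j), \tag{15}$$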

in terms of the superoperators $\mathcal{B}$ and $\mathcal{C}$ for the full channel and its complement. In general $\mathcal{B}(A)$ will not equal $\sum_j \mathcal{B}_j(A)$, because the latter maps all "off-diagonal" parts, $P_j A P_k$ for $j \ne k$, of the operator $A$ to zero; similarly, $\mathcal{C}(A)$ will in general not be the sum of the $\mathcal{C}_j(A)$.

Another example of gluing arises given a collection of isometries $J_j : \mathcal{H}_a \mapsto \mathcal{H}_b \otimes \mathcal{H}_{cj}$; that is, the direct channels have the same input and output spaces $\mathcal{H}_a$ and $\mathcal{H}_b$, whereas the complementary channels map to orthogonal subspaces $\mathcal{H}_{cj} = R_j \mathcal{H}_c$ of $\mathcal{H}_c$. If we write (11) in the form
$$J = \sum_j \sqrt{p_j}\, J_j, \tag{16}$$
i.e., $\nu_j = \sqrt{p_j}$, where the $p_j > 0$ are any set of probabilities that sum to 1, then
$$J_j = R_j J / \sqrt{p_j}, \tag{17}$$
and it is easily checked that (10) and (12) are satisfied. It is straightforward to show that the $\mathcal{B}$ channel resulting from this gluing of the $\mathcal{H}_{cj}$ spaces is given by
$$\mathcal{B}(A) = \sum_j p_j \mathcal{B}_j(A), \tag{18}$$
thus a weighted sum or convex combination of the $\mathcal{B}_j$ channels corresponding to the different $J_j$ isometries. Furthermore, any convex combination $\sum_j p_j \mathcal{B}_j$ of channels with a common input space $\mathcal{H}_a$ and output space $\mathcal{H}_b$ can be constructed in this manner by gluing together the different $\mathcal{H}_{cj}$ output spaces of the complementary channels. But in general there is no simple relationship between the complementary channel $\mathcal{C}$ and the different $\mathcal{C}_j$.

One can combine the two previous examples and glue both the input spaces $\mathcal{H}_{aj}$ and the complementary output spaces $\mathcal{H}_{cj}$ of isometries $J_j : \mathcal{H}_{aj} \mapsto \mathcal{H}_b \otimes \mathcal{H}_{cj}$, which have a common direct output space $\mathcal{H}_b$. With $\nu_j = 1$ in (11) one has
$$J = \sum_j J_j, \quad J_j = R_j J P_j. \tag{19}$$
Again (10) is obviously satisfied. A simple calculation shows that
$$\mathcal{B}(A) = \sum_j \mathcal{B}(P_j A P_j) = \sum_j \mathcal{B}_j(A), \tag{20}$$
and that $\mathcal{B}_j$ and $\mathcal{C}_j$ satisfy (15). But now $\mathcal{B}$ (as well as $\mathcal{B}_j$ and $\mathcal{C}_j$) maps "off-diagonal" parts, $P_j A P_k$ for $j \ne k$, of the operator $A$ to zero. There is in general no connection between $\mathcal{C}$ and the $\mathcal{C}_j$ analogous to (20).

Of particular interest for what follows later is a block diagonal channel pair obtained from gluing both the direct and complementary outputs of the isometries $J_j : \mathcal{H}_a \mapsto \mathcal{H}_{bj} \otimes \mathcal{H}_{cj}$, with $\mathcal{H}_{bj} = Q_j \mathcal{H}_b$ and $\mathcal{H}_{cj} = R_j \mathcal{H}_c$. As in (16) one writes
$$J = \sum_j \sqrt{p_j}\, J_j, \quad J_j = Q_j R_j J / \sqrt{p_j}. \tag{21}$$
It is then straightforward to show that

$$Q_j \mathcal{B}(A) Q_k = \delta_{jk}\, p_j \mathcal{B}_j(A), \quad R_j \mathcal{C}(A) R_k = \delta_{jk}\, p_j \mathcal{C}_j(A), \tag{22}$$

and as a consequence,
$$\mathcal{B}(A) = \bigoplus_j p_j \mathcal{B}_j(A), \quad \mathcal{C}(A) = \bigoplus_j p_j \mathcal{C}_j(A). \tag{23}$$
Instead of $\bigoplus$, one could have used $\sum$ in (23) to indicate that both $\mathcal{B}$ and $\mathcal{C}$ are convex sums of $\{\mathcal{B}_j\}$ and $\{\mathcal{C}_j\}$ respectively; however, using $\bigoplus$ rather than $\sum$ serves to emphasize that the output spaces of the $\{\mathcal{B}_j\}$ are mutually orthogonal, as are the output spaces of the $\{\mathcal{C}_j\}$. Thus the outputs of each channel in the $(\mathcal{B}, \mathcal{C})$ pair are in separate, mutually orthogonal blocks, whence our name "block diagonal" channel pair. In addition the $\mathcal{B}$ and $\mathcal{C}$ blocks are correlated: if in a particular run the output of the $\mathcal{B}$ channel falls in a particular block $Q_j$ (as could be determined by a suitable measurement), the $\mathcal{C}$ channel output will be in the corresponding block $R_j$. This means the entropy of the $\mathcal{B}$ output is given by
$$S(\mathcal{B}(\rho)) = \sum_j p_j S(\mathcal{B}_j(\rho)) + h(p), \tag{24}$$
where $h(p) = -\sum_j p_j \log_2 p_j$ is the Shannon entropy of the probability distribution $\{p_j\}$ (in base 2). There is an analogous expression for the entropy of the $\mathcal{C}$ output. Thus the output entropy in each case is the weighted sum of the output entropies of the individual channels plus a "classical" term $h(p)$. This classical term cancels when one computes the entropy bias, (7), which is given by
$$\Delta(\mathcal{B}, \rho) = \sum_j p_j \Delta(\mathcal{B}_j, \rho). \tag{25}$$
These considerations suggest a simple physical picture: the channel $\mathcal{B}$ can be obtained by randomly applying with probability $p_j$ the channel $\mathcal{B}_j$ to the input $\hat{\mathcal{H}}_a$, with the output going to $\hat{\mathcal{H}}_{bj}$. Thus $\mathcal{B}$ is a convex combination of the $\mathcal{B}_j$ if one regards each of these as a map into the full operator space $\hat{\mathcal{H}}_b$. A similar comment applies to $\mathcal{C}$ as a convex combination of the $\mathcal{C}_j$.

It is possible to glue the inputs and both the direct and complementary outputs in a construction called the direct sum of channels. Let the corresponding isometries be $J_j : \mathcal{H}_{aj} \mapsto \mathcal{H}_{bj} \otimes \mathcal{H}_{cj}$, with $\mathcal{H}_{aj} = P_j \mathcal{H}_a$, $\mathcal{H}_{bj} = Q_j \mathcal{H}_b$, $\mathcal{H}_{cj} = R_j \mathcal{H}_c$. Thus the corresponding channels are completely independent of each other, with distinct input and output spaces. With $J$ the sum of these isometries one has:
$$J = \sum_j J_j; \quad J_j = (Q_j \otimes R_j) J P_j. \tag{26}$$
A straightforward calculation shows that $J$ gives rise to channels
$$\mathcal{B}(A) = \bigoplus_j \mathcal{B}_j(P_j A P_j), \quad \mathcal{C}(A) = \bigoplus_j \mathcal{C}_j(P_j A P_j). \tag{27}$$
Once again there is a block-diagonal structure with correlated blocks, but the physical picture is a bit different from (23). The channel $\mathcal{B}$ acts on a density operator $\rho$ by projecting it to the subspace $\mathcal{H}_{aj}$ with probability $\mathrm{Tr}(P_j \rho)$ (thus the "off-diagonal" $P_j \rho P_k$ parts of $\rho$ for $j \ne k$ always map to zero), and then applying the channel $\mathcal{B}_j$. An analogous interpretation holds for $\mathcal{C}$. This direct sum construction of a channel $\mathcal{B}$ has been studied in [22] where it was used in simplifying the nonadditivity conjecture of the Holevo capacity of a quantum channel. The gluing picture reveals that $\mathcal{C}$, the complement of $\mathcal{B}$, is also a direct sum, and the two channels $\mathcal{B}$ and $\mathcal{C}$ have correlated blocks.

If an isometry $J$ has been produced by gluing other isometries together in the manner indicated above, one can recover the originals by a process of slicing $J$, in which projectors corresponding to the different PDIs are placed to the left and right of $J$. For any map $J$ from $\mathcal{H}_a$ to $\mathcal{H}_b \otimes \mathcal{H}_c$, not necessarily an isometry, one can define a collection of operators

$$K_{jkl} := (Q_k \otimes R_l) J P_j, \tag{28}$$
using PDIs $\{P_j\}$, $\{Q_k\}$, and $\{R_l\}$, which need not have the same number of projectors, on $\mathcal{H}_a$, $\mathcal{H}_b$, and $\mathcal{H}_c$, respectively. It is obvious that $J$ is the sum of all of the $K_{jkl}$, but even if $J$ is an isometry, the individual $K_{jkl}$ will in general not be isometries or proportional to isometries; that is, $K_{jkl}^\dagger K_{jkl}$ will not be proportional to $P_j$. Only for special choices of the isometry $J$ and the PDIs, as in the examples considered above, will the operators resulting from slicing be proportional to isometries.
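The block diagonal gluing of (21) and the resulting additivity of the entropy bias over blocks, (25), can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: it assumes NumPy, orders the rows of each isometry as $(b, c)$, draws two random isometries with the hypothetical helper `random_isometry`, and places each $J_j$ in its own pair of output blocks.

```python
import numpy as np

def entropy(rho):
    """von Neumann entropy in bits; eigenvalues below 1e-12 are treated as zero."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log2(w)))

def entropy_bias(J, rho, db, dc):
    """S(B(rho)) - S(C(rho)) for an isometry J of shape (db*dc, da), rows ordered (b, c)."""
    M = (J @ rho @ J.conj().T).reshape(db, dc, db, dc)
    B = np.einsum('bcdc->bd', M)  # trace over H_c
    C = np.einsum('bcbd->cd', M)  # trace over H_b
    return entropy(B) - entropy(C)

def random_isometry(rng, d_out, d_in):
    """Hypothetical helper: reduced QR factor of a random complex matrix."""
    X = rng.normal(size=(d_out, d_in)) + 1j * rng.normal(size=(d_out, d_in))
    Q, _ = np.linalg.qr(X)
    return Q

def glue_block_diagonal(isos, dims, probs):
    """J = sum_j sqrt(p_j) J_j with J_j mapping into the block H_bj (x) H_cj, as in (21)."""
    da = isos[0].shape[1]
    db = sum(d[0] for d in dims)
    dc = sum(d[1] for d in dims)
    J = np.zeros((db, dc, da), dtype=complex)
    ob = oc = 0  # offsets of the current block inside H_b and H_c
    for Jj, (dbj, dcj), pj in zip(isos, dims, probs):
        J[ob:ob + dbj, oc:oc + dcj, :] = np.sqrt(pj) * Jj.reshape(dbj, dcj, da)
        ob += dbj
        oc += dcj
    return J.reshape(db * dc, da), db, dc

rng = np.random.default_rng(0)
dims = [(2, 2), (3, 1)]   # (d_bj, d_cj) for each block; an arbitrary choice
probs = [0.7, 0.3]
isos = [random_isometry(rng, b * c, 2) for b, c in dims]
J, db, dc = glue_block_diagonal(isos, dims, probs)

X = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rho = X @ X.conj().T
rho = rho / np.trace(rho).real  # a random qubit density operator

lhs = entropy_bias(J, rho, db, dc)
rhs = sum(p * entropy_bias(Jj, rho, b, c) for p, Jj, (b, c) in zip(probs, isos, dims))
print("glued J is an isometry:", np.allclose(J.conj().T @ J, np.eye(2)))
print("Delta(B, rho) and sum_j p_j Delta(B_j, rho):", lhs, rhs)
```

The two printed bias values should agree to numerical precision, since the classical term $h(p)$ in (24) appears in both output entropies and cancels.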

4 Generalized Erasure Channel Pair

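An erasure channel pair is an example of a block diagonal channel with two blocks, where the isometries in (21) are given by
$$J_1 |\psi\rangle_a = |\psi\rangle_{b1} |f\rangle_{c1}, \quad J_2 |\psi\rangle_a = |e\rangle_{b2} |\psi\rangle_{c2}. \tag{29}$$
Here $\mathcal{H}_{b1}$ and $\mathcal{H}_{c2}$ are isomorphic to $\mathcal{H}_a$, whereas $\mathcal{H}_{c1}$ and $\mathcal{H}_{b2}$ are one-dimensional Hilbert spaces consisting of multiples of the normalized kets $|f\rangle_{c1}$ and $|e\rangle_{b2}$, respectively. The isometry $J_1$ generates a perfect channel pair $(\mathcal{I}, \mathcal{T})$, where the identity channel $\mathcal{I}$ maps any operator $A$ in $\hat{\mathcal{H}}_a$ to the same operator $A$ in $\hat{\mathcal{H}}_{b1}$, while the trace channel $\mathcal{T}$ maps $A$ to $\mathrm{Tr}(A)[f]$ in $\hat{\mathcal{H}}_{c1}$. In the same way $J_2$ generates the perfect channel pair $(\mathcal{T}, \mathcal{I})$ mapping $\hat{\mathcal{H}}_a$ to $\hat{\mathcal{H}}_{b2}$ and $\hat{\mathcal{H}}_{c2}$, respectively. Gluing these perfect channel pairs together using $p_1 = 1 - \lambda$ and $p_2 = \lambda$ in (21) results in the channel pair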

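$$\mathcal{B}_e(A) = \mathcal{E}^{\lambda}(A) = (1 - \lambda) A \oplus \lambda \mathrm{Tr}(A)[e], \tag{30}$$
$$\mathcal{C}_e(A) = \mathcal{E}^{1-\lambda}(A) = (1 - \lambda) \mathrm{Tr}(A)[f] \oplus \lambda A, \tag{31}$$
where $[e]$ and $[f]$ employ the abbreviation, used here and later, $[\psi] = |\psi\rangle\langle\psi|$ for the projector corresponding to a normalized ket $|\psi\rangle$. The channel $\mathcal{E}^{\lambda}$ is the erasure channel with erasure probability $\lambda$, and $\mathcal{E}^{1-\lambda}$ is its complement. The subspaces on the left and right side of $\oplus$ can be interchanged; the order used in (30) and (31) reflects the correlations discussed following (23): if $A$ occurs in the output of the direct channel, $[f]$ will be present in the complementary output.

It is easily shown that the entropy bias, (7), of $\mathcal{B}_e = \mathcal{E}^{\lambda}$ takes the form
$$\Delta(\mathcal{E}^{\lambda}, \rho) = (1 - 2\lambda) S(\rho). \tag{32}$$
Its maximum for $\lambda \ge 1/2$ is zero, and for $\lambda \le 1/2$ it is $(1 - 2\lambda) \log_2 d_a$, attained using $\rho = I_a/d_a$ proportional to the identity operator on $\mathcal{H}_a$. If $\mathcal{E}^{\lambda}$ for $\lambda \le 1/2$ is followed by $\mathcal{E}^{\mu}$ with $\mu = (1 - 2\lambda)/(1 - \lambda)$, the resulting channel $\mathcal{E}^{\mu} \circ \mathcal{E}^{\lambda} = \mathcal{E}^{1-\lambda}$ is the complement of $\mathcal{E}^{\lambda}$. Hence for $\lambda \le 1/2$, $\mathcal{E}^{\lambda}$ is degradable, while for $\lambda \ge 1/2$ it is antidegradable. Consequently $Q$ and $Q^{(1)}$ are identical for an erasure channel, and
$$\begin{aligned} Q^{(1)}(\mathcal{B}_e) &= Q(\mathcal{B}_e) = Q(\mathcal{E}^{\lambda}) = \max\{1 - 2\lambda, 0\} \log_2 d_a, \\ Q^{(1)}(\mathcal{C}_e) &= Q(\mathcal{C}_e) = Q(\mathcal{E}^{1-\lambda}) = \max\{2\lambda - 1, 0\} \log_2 d_a. \end{aligned} \tag{33}$$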
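As a numerical illustration of (32), the sketch below builds the glued erasure isometry for an input of dimension $d$ and compares the entropy bias with $(1 - 2\lambda)S(\rho)$. It is a minimal check under our own conventions, not the authors' code: NumPy is assumed, the flag states occupy the last basis vector of $\mathcal{H}_b$ and the first basis vector of $\mathcal{H}_c$, and the function names are hypothetical.

```python
import numpy as np

def entropy(rho):
    """von Neumann entropy in bits; eigenvalues below 1e-12 are treated as zero."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log2(w)))

def erasure_isometry(d, lam):
    """Glued isometry for the erasure pair; H_b and H_c each have dimension d + 1."""
    J = np.zeros((d + 1, d + 1, d))
    for a in range(d):
        J[a, 0, a] = np.sqrt(1 - lam)  # input passes to the direct output, flag in H_c
        J[d, 1 + a, a] = np.sqrt(lam)  # input passes to the complement, flag in H_b
    return J.reshape((d + 1) ** 2, d)

def erasure_bias(d, lam, rho):
    """Delta(E^lambda, rho) computed from the isometry via partial traces."""
    J = erasure_isometry(d, lam)
    M = (J @ rho @ J.conj().T).reshape(d + 1, d + 1, d + 1, d + 1)
    B = np.einsum('bcdc->bd', M)  # direct output
    C = np.einsum('bcbd->cd', M)  # complementary output
    return entropy(B) - entropy(C)

d, lam = 3, 0.3
rng = np.random.default_rng(2)
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = X @ X.conj().T
rho = rho / np.trace(rho).real  # a random density operator on H_a

print("numerical bias:     ", erasure_bias(d, lam, rho))
print("(1 - 2 lam) S(rho): ", (1 - 2 * lam) * entropy(rho))
print("bias at rho = I/d:  ", erasure_bias(d, lam, np.eye(d) / d))
print("(1 - 2 lam) log2 d: ", (1 - 2 * lam) * np.log2(d))
```

For $\lambda \le 1/2$ the last two printed values correspond to the capacity in (33).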

We define the generalized erasure channel pair as one in which $J_2$ is the same as in (29) and corresponds to a perfect channel, but $J_1$ is replaced by any isometry from $\mathcal{H}_a$ to $\mathcal{H}_{b1} \otimes \mathcal{H}_{c1}$, where the dimensions of $\mathcal{H}_{b1}$ and $\mathcal{H}_{c1}$ are arbitrary (except that the product cannot be less than the dimension of $\mathcal{H}_a$). The result is a channel pair
$$\mathcal{B}_g(A) = (1 - \lambda) \mathcal{B}_1(A) \oplus \lambda \mathrm{Tr}(A)[e], \tag{34}$$
$$\mathcal{C}_g(A) = (1 - \lambda) \mathcal{C}_1(A) \oplus \lambda A, \tag{35}$$
where $(\mathcal{B}_1, \mathcal{C}_1)$ is the channel pair generated by $J_1$. The form of $\mathcal{B}_g$ in (34) means that it either erases its input with probability $\lambda$, or else sends it through $\mathcal{B}_1$ into the output subspace $\mathcal{H}_{b1}$ of $\mathcal{H}_b$. Similarly, $\mathcal{C}_g$ with probability $\lambda$ sends its input unchanged to $\mathcal{H}_{c2}$, or else sends it through $\mathcal{C}_1$ to $\mathcal{H}_{c1}$. When $\mathcal{C}_1$ is the trace channel $\mathcal{T}$ that completely erases its input, $\mathcal{C}_g$ is an erasure channel $\mathcal{C}_e$ with erasure probability $1 - \lambda$. But in general $\mathcal{C}_1$ need not erase completely, so we call $\mathcal{C}_g$ an "incomplete erasure" channel.

The $\mathcal{B}_g$ channel can be obtained by concatenating $\mathcal{B}_1$ with a suitable erasure channel, placed either before or after $\mathcal{B}_1$:
$$\mathcal{B}_g(A) = \tilde{\mathcal{B}}_1 \circ \mathcal{E}^{\lambda}(A) = \tilde{\mathcal{E}}^{\lambda} \circ \mathcal{B}_1(A), \tag{36}$$
where the tildes denote modified superoperators. The superoperator $\tilde{\mathcal{E}}^{\lambda}$ is an erasure channel whose input space is identical to the output space of $\mathcal{B}_1$, which need not have the same dimension as its input space. The superoperator $\tilde{\mathcal{B}}_1$ is the same as $\mathcal{B}_1$ except that when it is applied to $[e]$ in (30) it maps $[e]$ to the corresponding $[e]$ in (34), and maps any "off-diagonal" operator $|e\rangle\langle\alpha|$ or $|\alpha\rangle\langle e|$, with $|\alpha\rangle$ any element of $\mathcal{H}_a$, to zero. Given these definitions it is straightforward to check the validity of (36). There is no analog of (36) for $\mathcal{C}_g(A)$. Since the concatenation of a channel with an antidegradable channel always results in an antidegradable channel (see App. A for a simple proof), $\mathcal{B}_g$ is antidegradable when either $\mathcal{B}_1$ or $\mathcal{E}^{\lambda}$ is antidegradable (the latter happens when $\lambda \ge 1/2$).

A channel obtained by concatenating two channels has a (single-letter) quantum capacity no larger than that of either of the channels being concatenated [5, 23, 24]. Thus, from (36) it follows that $Q^{(1)}(\mathcal{B}_g) \le \min\{Q^{(1)}(\mathcal{B}_1), Q^{(1)}(\mathcal{E}^{\lambda})\}$ and $Q(\mathcal{B}_g) \le \min\{Q(\mathcal{B}_1), Q(\mathcal{E}^{\lambda})\}$. The channel $\mathcal{C}_e$ in (31) can be obtained by concatenating the output of $\mathcal{C}_g$ with a channel that traces out operators on $\mathcal{H}_{c1}$ to a fixed pure state $[f]$ and does nothing to $\mathcal{H}_{c2}$, thus $Q^{(1)}(\mathcal{C}_e) \le \min\{Q^{(1)}(\mathcal{C}_g), \log d_a\}$ and $Q(\mathcal{C}_e) \le \min\{Q(\mathcal{C}_g), \log d_a\}$.
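For later use it may help to spell out the entropy bias of the generalized erasure pair. The following short step is our own restatement and uses only (25), (34) and (35): the second block is the perfect pair $(\mathcal{T}, \mathcal{I})$, whose entropy bias is $S(\mathcal{T}(\rho)) - S(\rho) = -S(\rho)$, so that with $p_1 = 1 - \lambda$ and $p_2 = \lambda$

$$\Delta(\mathcal{B}_g, \rho) = (1 - \lambda)\, \Delta(\mathcal{B}_1, \rho) - \lambda\, S(\rho).$$

Setting $\mathcal{B}_1 = \mathcal{I}$, for which $\Delta(\mathcal{I}, \rho) = S(\rho)$, recovers (32).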

5 Applications

5.1 Generalized Erasure using Qubit Amplitude Damping Channel

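The isometry $J_1 : \mathcal{H}_a \to \mathcal{H}_{b1} \otimes \mathcal{H}_{c1}$ defined by
$$J_1 |0\rangle_a = |0\rangle_{b1} |1\rangle_{c1}, \quad J_1 |1\rangle_a = \sqrt{1 - p}\, |1\rangle_{b1} |1\rangle_{c1} + \sqrt{p}\, |0\rangle_{b1} |0\rangle_{c1}, \tag{37}$$
with $0 \le p \le 1$, and $|0\rangle$ and $|1\rangle$ the usual orthonormal basis kets for a qubit, defines a channel pair $(\mathcal{B}_1, \mathcal{C}_1)$ in which $\mathcal{B}_1$ is an amplitude-damping channel with $p$ the probability that the input state $[1]_a$ decays to the output state $[0]_{b1}$. Similarly, $\mathcal{C}_1$ is an amplitude-damping channel with decay probability $1 - p$ if one interchanges $|0\rangle_{c1}$ and $|1\rangle_{c1}$ in (37). The Bloch vector parametrization for a qubit density operator,
$$\rho(\mathbf{r}) = \frac{1}{2}(I + \mathbf{r} \cdot \vec{\sigma}) := \frac{1}{2}(I + x\sigma_x + y\sigma_y + z\sigma_z), \tag{38}$$
where $I$ is the identity and $(\sigma_x, \sigma_y, \sigma_z)$ the three Pauli matrices, provides a convenient way to represent $\mathcal{B}_1$ and $\mathcal{C}_1$ as maps carrying $\mathbf{r}$ to Bloch vectors
$$\mathbf{r}_b = (\sqrt{1 - p}\, x, \sqrt{1 - p}\, y, (1 - p)z + p) \quad \text{and} \quad \mathbf{r}_c = (\sqrt{p}\, x, -\sqrt{p}\, y, p - pz - 1), \tag{39}$$
respectively. See Fig. 1(a) for a convenient way to visualize this channel pair.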
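The Bloch-vector maps in (39) follow from (37) and (3) by direct calculation, and the sketch below checks them numerically for one input. It is a minimal illustration, not the authors' code: NumPy is assumed, the rows of $J_1$ are ordered as $2b + c$ for $|b\rangle_{b1}|c\rangle_{c1}$, and the function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]])
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def rho_of(r):
    """Qubit density operator with Bloch vector r, eq. (38)."""
    x, y, z = r
    return 0.5 * (I2 + x * SX + y * SY + z * SZ)

def bloch(rho):
    """Bloch vector (Tr rho sigma_x, Tr rho sigma_y, Tr rho sigma_z)."""
    return np.array([np.trace(rho @ S).real for S in (SX, SY, SZ)])

def amp_damp_isometry(p):
    """J_1 of eq. (37); row index is 2*b + c for |b>_{b1}|c>_{c1}."""
    J = np.zeros((4, 2), dtype=complex)
    J[0 * 2 + 1, 0] = 1.0             # J_1|0> = |0>_b |1>_c
    J[1 * 2 + 1, 1] = np.sqrt(1 - p)  # J_1|1> = sqrt(1-p) |1>_b |1>_c
    J[0 * 2 + 0, 1] = np.sqrt(p)      #          + sqrt(p) |0>_b |0>_c
    return J

def output_bloch_vectors(p, r):
    """Bloch vectors of B_1(rho(r)) and C_1(rho(r)) obtained by partial traces."""
    J = amp_damp_isometry(p)
    M = (J @ rho_of(r) @ J.conj().T).reshape(2, 2, 2, 2)  # indices (b, c, b', c')
    B = np.einsum('bcdc->bd', M)
    C = np.einsum('bcbd->cd', M)
    return bloch(B), bloch(C)

p = 0.3
r = np.array([0.3, -0.4, 0.5])
rb, rc = output_bloch_vectors(p, r)
x, y, z = r
print("r_b numerical:", rb)
print("r_b from (39):", [np.sqrt(1 - p) * x, np.sqrt(1 - p) * y, (1 - p) * z + p])
print("r_c numerical:", rc)
print("r_c from (39):", [np.sqrt(p) * x, -np.sqrt(p) * y, p - p * z - 1])
```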

[Figure 1 graphic: two Bloch-sphere panels, (a) amplitude damping and (b) phase damping, drawn along the z axis with poles [0] and [1]; labels include z = p, z = 2p − 1, z = p − 1, [φ0] and [φ1].]

Figure 1: Enclosed inside a Bloch sphere (in black): ellipsoids representing the locus of Bloch vectors rb (in red) and rc (in blue) for p = 0.3. Square brackets indicate projectors, e.g., [1] projects on |1i. (a) Amplitude damping: rb and rc defined in (39). (b) Phase damping: rb and rc defined in (50).

Let $(\mathcal{B}_g, \mathcal{C}_g)$ be the generalized erasure channel pair resulting from inserting $\mathcal{B}_1$ and $\mathcal{C}_1$ in (34) and (35). The entropy bias $\Delta(\mathcal{B}_g, \rho(\mathbf{r}))$ of $\mathcal{B}_g$ at $\rho(\mathbf{r})$ is a real-valued function of $\mathbf{r} = (x, y, z)$, whose maximum and minimum over $\mathbf{r}$ with $|\mathbf{r}| \le 1$ give $Q^{(1)}(\mathcal{B}_g)$ and $-Q^{(1)}(\mathcal{C}_g)$, respectively; see (8). Finding these extrema is simplified by the rotational symmetry of $\Delta(\mathcal{B}_g, \rho(\mathbf{r}))$ about the $z$ axis (see Fig. 1(a)), which means that for a fixed $z$ it is a function of $x^2 + y^2$, so one can set $y = 0$. In addition, with $y = 0$, $\Delta(\mathcal{B}_g, \rho(\mathbf{r}))$ for a fixed $x^2 + z^2$ is monotone increasing in $z$ for $p \le 1/2$, and monotone decreasing for $p \ge 1/2$. Thus one can also set $x = 0$ and look for its maximum or minimum as a function of the single parameter $-1 \le z \le 1$.

The range of the two parameters $p$ and $\lambda$ for which $Q^{(1)}(\mathcal{B}_g)$ is greater than 0 can be determined as follows. For $1/2 \le p \le 1$, $\mathcal{B}_1$ is antidegradable [25], while for $1/2 \le \lambda \le 1$, $\mathcal{E}^{\lambda}$ is antidegradable. In either case $\mathcal{B}_g$, the concatenation of these two channels (see (36) and the following discussion), is antidegradable, and $Q^{(1)}(\mathcal{B}_g) = Q(\mathcal{B}_g) = 0$. Thus $Q^{(1)}(\mathcal{B}_g)$ can only be positive when both $p$ and $\lambda$ are less than 1/2. At $p = 0$, $\mathcal{B}_1$ is a perfect channel and $\mathcal{B}_g$ an erasure channel, so $Q^{(1)}(\mathcal{B}_g) = 1 - 2\lambda$ for $\lambda \le 1/2$ (see (33) with $d_a = 2$). For $\lambda = 0$, $\mathcal{B}_g$ is just the amplitude-damping channel, which is degradable with a positive $Q^{(1)}$ for $0 \le p < 1/2$.

For other values of $\lambda$ and $p$ between 0 and 1/2, the numerical maximization of $\Delta(\mathcal{B}_g, \rho(\mathbf{r}))$ together with an asymptotic analysis as $z$ approaches 1 (App. B) shows that $Q^{(1)}(\mathcal{B}_g)$ is positive for $\lambda$ in the interval
$$0 \le \lambda < \lambda_0(p) = (1 - 2p)/(2 - 2p), \tag{40}$$
(see Fig. 2), is zero for $\lambda \ge \lambda_0(p)$, and as
$$\delta\lambda = \lambda_0(p) - \lambda \tag{41}$$
tends to zero has the asymptotic form
$$Q^{(1)}(\mathcal{B}_g) \simeq a(p)\, \delta\lambda\, \exp[-b(p)/\delta\lambda], \tag{42}$$
where $a(p)$ and $b(p)$ are positive functions of $p$. The exponentially rapid decrease of $Q^{(1)}(\mathcal{B}_g)$ due to $\delta\lambda$ in the denominator of the exponent makes a direct numerical study difficult when $\delta\lambda$ is very small. For $p = 1/4$ and $5 \times 10^{-3} < \delta\lambda < 10^{-1}$ we find good agreement between our numerical values and (42).
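A rough numerical version of this maximization can be written in closed form along the $z$ axis. The sketch below is ours, not the authors' code, and rests on two assumptions: it takes the reduction to $x = y = 0$ from the symmetry argument above as given, and it uses the block decomposition (25) with (34) and (35), which for the diagonal input $\rho = (1 - u)[0] + u[1]$ (Bloch vector $(0, 0, 1 - 2u)$) gives $\Delta(\mathcal{B}_g, \rho) = (1 - \lambda)[h_2((1 - p)u) - h_2(pu)] - \lambda h_2(u)$ with $h_2$ the binary entropy. A uniform grid is used for simplicity, so values very close to $\lambda_0(p)$, where (42) is exponentially small, are not resolved.

```python
import numpy as np

def h2(q):
    """Binary entropy in bits for an array q, with h2(0) = h2(1) = 0."""
    q = np.clip(np.asarray(q, dtype=float), 0.0, 1.0)
    out = np.zeros_like(q)
    m = (q > 0) & (q < 1)
    out[m] = -q[m] * np.log2(q[m]) - (1 - q[m]) * np.log2(1 - q[m])
    return out

def bias_on_axis(u, p, lam):
    """Delta(B_g, rho) for rho = (1-u)[0] + u[1], amplitude-damping case."""
    return (1 - lam) * (h2((1 - p) * u) - h2(p * u)) - lam * h2(u)

def q1_Bg(p, lam, n=200001):
    """Grid estimate of Q^(1)(B_g), maximizing over the z axis only."""
    u = np.linspace(0.0, 1.0, n)
    return float(np.max(bias_on_axis(u, p, lam)))

def lambda0(p):
    """Threshold of eq. (40)."""
    return (1 - 2 * p) / (2 - 2 * p)

p = 0.25
print("lambda_0(p):", lambda0(p))
for lam in (0.0, 0.1, 0.2, 0.4):
    print("lambda =", lam, " grid estimate of Q1(B_g):", q1_Bg(p, lam))
```

For $p = 0.25$ the threshold is $\lambda_0 = 1/3$; the estimate is clearly positive at $\lambda = 0.2$ and vanishes at $\lambda = 0.4$, consistent with (40).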

Figure 2: For a given $p$, $Q^{(1)}(\mathcal{B}_g)$ is zero for $\lambda \ge \lambda_0(p)$, and positive for $\lambda < \lambda_0(p)$. It is nonadditive at the 2-letter level for $\lambda_1(p) < \lambda < \lambda_0(p)$.

For all strictly positive $p$ and $\lambda$, $Q^{(1)}(\mathcal{C}_g)$ is positive, see Sec. 5.3, so its complementary channel $\mathcal{B}_g$ cannot be degradable, and hence $Q^{(1)}(\mathcal{B}_g)$ might conceivably be nonadditive. A convenient measure of nonadditivity for $n$ copies of this channel placed in parallel is
$$\delta_n := Q^{(1)}(\mathcal{B}_g^{\otimes n})/n - Q^{(1)}(\mathcal{B}_g). \tag{43}$$
One says that nonadditivity occurs at the $n$-letter level if $n$ is the smallest integer for which $\delta_n > 0$. We have found numerical evidence for nonadditivity at the 2-letter level for $\lambda$ in the range
$$\lambda_1(p) < \lambda < \lambda_0(p), \quad \text{thus} \quad 0 < \delta\lambda < \delta\lambda_1(p) := \lambda_0(p) - \lambda_1(p), \tag{44}$$

where $\lambda_1(p)$, determined numerically, is shown in Fig. 2. In particular for $\delta\lambda$ between 0 and $\delta\lambda_1(p)$, an input density operator for $\mathcal{B}_g^{\otimes 2}$ of the form
$$\sigma = (1 - \epsilon)[00] + \epsilon[\phi], \quad |\phi\rangle = (|01\rangle + |10\rangle)/\sqrt{2}, \tag{45}$$
with $0 < \epsilon < 1$ chosen to maximize $\Delta(\mathcal{B}_g^{\otimes 2}, \sigma)$, gives a larger maximum value of $\Delta(\mathcal{B}_g^{\otimes 2})$ than the product density operator
$$\tau = \rho_m \otimes \rho_m, \quad \rho_m = (1 - z)[0] + z[1], \tag{46}$$
with $0 < z < 1$ chosen to maximize $\Delta(\mathcal{B}_g, \rho(\mathbf{r}))$ for a single channel. When $\delta\lambda$ is sufficiently small, the asymptotic behavior of $\delta_2$ (App. B) is of the form (42), but with different choices for $a(p)$ and $b(p)$. For larger $\delta\lambda$ (see Fig. 3 for $p = 0.25$) it rises to a maximum and then falls to zero with a finite slope at $\delta\lambda = \delta\lambda_1(p)$. While we can be quite confident of nonadditivity in the region $\lambda_1(p) < \lambda < \lambda_0(p)$, that $\delta_2$ is actually zero outside this range is less certain, since an input density operator different from (45) could conceivably give a maximum value of $\Delta(\mathcal{B}_g^{\otimes 2})$ larger than $2Q^{(1)}(\mathcal{B}_g)$, even though we have found no indication of this in our numerical studies.
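The comparison between the ansatz (45) and the product input (46) can be set up numerically along the following lines. This is a hedged sketch rather than the authors' procedure: it assumes NumPy, builds the glued isometry for (34) and (35) with $\mathcal{H}_b$ of dimension 3 and $\mathcal{H}_c$ of dimension 4 under our own basis ordering, and uses hypothetical function names throughout. The function `delta2_estimate` maximizes over a user-supplied grid only, so it is an estimate and not a rigorous bound; close to $\lambda_0(p)$ the single-letter maximum may sit at very small $z$ and a fine logarithmic grid is needed.

```python
import numpy as np

def entropy(rho):
    """von Neumann entropy in bits; eigenvalues below 1e-14 are treated as zero."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-14]
    return float(-np.sum(w * np.log2(w)))

def generalized_erasure_isometry(p, lam):
    """Glued isometry for (34)-(35) with B_1 amplitude damping; shape (3*4, 2).
    H_b = H_b1 (+) span{|e>} has dimension 3, H_c = H_c1 (+) H_c2 has dimension 4."""
    J = np.zeros((3, 4, 2))
    s = np.sqrt(1 - lam)
    J[0, 1, 0] = s                   # sqrt(1-lam) J_1|0> = |0>_b1 |1>_c1
    J[1, 1, 1] = s * np.sqrt(1 - p)  # sqrt(1-lam) J_1|1>, first term of (37)
    J[0, 0, 1] = s * np.sqrt(p)      # sqrt(1-lam) J_1|1>, second term of (37)
    J[2, 2, 0] = np.sqrt(lam)        # sqrt(lam) J_2|0> = |e>_b2 |0>_c2
    J[2, 3, 1] = np.sqrt(lam)        # sqrt(lam) J_2|1> = |e>_b2 |1>_c2
    return J.reshape(12, 2)

def bias_one_copy(p, lam, rho):
    J = generalized_erasure_isometry(p, lam)
    M = (J @ rho @ J.conj().T).reshape(3, 4, 3, 4)
    B = np.einsum('bcdc->bd', M)
    C = np.einsum('bcbd->cd', M)
    return entropy(B) - entropy(C)

def bias_two_copies(p, lam, sigma):
    J = generalized_erasure_isometry(p, lam)
    J2 = np.kron(J, J)  # rows ordered (b1, c1, b2, c2), columns (a1, a2)
    M = (J2 @ sigma @ J2.conj().T).reshape(3, 4, 3, 4, 3, 4, 3, 4)
    B = np.einsum('abcdebgd->aceg', M).reshape(9, 9)    # trace over c1 and c2
    C = np.einsum('abcdafch->bdfh', M).reshape(16, 16)  # trace over b1 and b2
    return entropy(B) - entropy(C)

def sigma_ansatz(eps):
    """Two-qubit input of eq. (45)."""
    phi = np.array([0.0, 1.0, 1.0, 0.0]) / np.sqrt(2)
    s = eps * np.outer(phi, phi)
    s[0, 0] += 1 - eps
    return s

def delta2_estimate(p, lam, grid):
    """Grid estimate of delta_2 from (43) using (45) and (46); not a rigorous bound."""
    one = max(bias_one_copy(p, lam, np.diag([1 - z, z])) for z in grid)
    two = max(bias_two_copies(p, lam, sigma_ansatz(e)) for e in grid)
    return two / 2 - max(one, 0.0)

if __name__ == "__main__":
    p = 0.25
    lam = (1 - 2 * p) / (2 - 2 * p) - 0.02  # delta-lambda = 0.02, an arbitrary choice
    grid = np.logspace(-12, 0, 400)
    print("grid estimate of delta_2:", delta2_estimate(p, lam, grid))
```

A useful sanity check on such an implementation is that a product input gives exactly twice the single-copy bias, so any positive estimate must come from the correlations in $\sigma$.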

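Figure 3: A plot of $\delta_2 = Q^{(1)}(\mathcal{B}_g^{\otimes 2})/2 - Q^{(1)}(\mathcal{B}_g)$ versus $\delta\lambda = \lambda_0(p) - \lambda$ at $p = 0.25$ shows $\delta_2$ is positive for $0 < \delta\lambda < 0.0406$ and attains a maximum value $\simeq 5.27 \times 10^{-3}$.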

If $\delta_2$ is positive it is easy to show that $\delta_n$ is positive for all $n > 2$, and cannot be much smaller than $\delta_2$. As direct numerical searches become exponentially more difficult with increasing $n$, it is customary to make a guess or ansatz $\rho_n$ for the input density operator, which may depend upon a small number of parameters, and maximize $\Delta(\mathcal{B}_g^{\otimes n}, \rho_n)$ over these parameters, see (8), to obtain a lower bound for $Q^{(1)}(\mathcal{B}_g^{\otimes n})$. When $n$ is even the pair ansatz consists in dividing the $n$ channels into $n/2$ pairs and employing the optimizing density operator $\sigma$ defined above as the input for each pair; this yields a lower bound $\delta_2$ for $\delta_n$. When $n$ is odd, use $\sigma$ for each of $(n - 1)/2$ pairs, and for the remaining channel the density operator that gives rise to $Q^{(1)}(\mathcal{B}_g)$; the resulting lower bound is a bit less than $\delta_2$. In the literature [6, 16–18, 26] various other ansatzes have been proposed, including the Z-diagonal ansatz, a particular case of which is the repetition ansatz. Our numerical studies for $n = 3$, 4 and 5 using these and others motivated by the functional form of $\sigma$ have not found any that improve on the pair ansatz.

5.2 Generalized Erasure with Qubit Dephasing Channel

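The isometry $J_1 : \mathcal{H}_{a1} \to \mathcal{H}_{b1} \otimes \mathcal{H}_{c1}$, with each space a qubit (dimension 2), giving rise to the dephasing channel $\mathcal{B}_1$ and its complement $\mathcal{C}_1$ can be written in the form
$$J_1 |0\rangle_a = |0\rangle_{b1} |\phi_0\rangle_{c1}, \quad J_1 |1\rangle_a = |1\rangle_{b1} |\phi_1\rangle_{c1}, \tag{47}$$
where
$$|\phi_0\rangle_{c1} = \sqrt{1 - p}\, |+\rangle_{c1} + \sqrt{p}\, |-\rangle_{c1}, \quad |\phi_1\rangle_{c1} = \sqrt{1 - p}\, |+\rangle_{c1} - \sqrt{p}\, |-\rangle_{c1}; \quad |\pm\rangle := (|0\rangle \pm |1\rangle)/\sqrt{2}, \tag{48}$$
with $0 \le p \le 1$ the dephasing probability. Interchanging $p$ with $1 - p$ in (48) is equivalent to applying the unitary $\sigma_z = [0] - [1]$ to both $\mathcal{H}_{b1}$ and $\mathcal{H}_{c1}$, so for our purposes we can limit $p$ to the range $0 \le p \le 1/2$. The superoperators for the dephasing channel $\mathcal{B}_1$ and its complement $\mathcal{C}_1$ are
$$\mathcal{B}_1(A) = p Z A Z^\dagger + (1 - p) A, \quad \mathcal{C}_1(A) = \langle 0|A|0\rangle [\phi_0] + \langle 1|A|1\rangle [\phi_1], \tag{49}$$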

10 where Z|0ia1 = |0ib1 and Z|1ia1 = −|1ib1. One can think of C1 as first measuring the input in the {|0i, |1i} basis, and for measurement outcome i preparing the channel output [φi]. Such a measure-and-prepare or entanglement breaking channel is antidegradable and thus has zero quantum capacity [27]. The channel B1 and its complement C1 map ρ(r) in (38) to qubit density operators with Bloch vectors p rb = ((1 − 2p)x, (1 − 2p)y, z), and rc = (1 − 2p, 0, 2 p(1 − p) z), (50) respectively (see Fig.1(b)). Inserting B1 and C1 defined in (49) in (34) and (35) yields the generalized erasure channel pair (Bg, Cg). The channel Bg is the same as the dephrasure channel studied in [16], where it was defined using the second equality in (36). These authors showed that the global maximum or minimum of ∆(Bg, ρ(r)) occurs along r = (x, 0, z). They also found that for any fixed p between 0 and 1/2, as λ increases from 0 a local maximum of ∆(Bg, ρ(r)) remains at x = z = 0 until λ reaches the value 1 − 2p − 2p(1 − p) ln[(1 − p)/p] j(p) = , (51) 2 − 4p − 2p(1 − p) ln[(1 − p)/p] at which point this maximum begins moving to positive z values, while x remains at 0. As λ increases further, the local maximum of ∆(Bg, ρ(r)) goes to zero at λ equal to (1 − 2p)2 g(p) = , (52) 1 + (1 − 2p)2 and remains zero for λ > g(p). We strengthen these results by showing that for any p and any λ between 0 and 1/2 the global maximum (1) Q (Bg) of ∆(Bg, ρ(r)) occurs along r = (0, 0, z), where it agrees with the local maximum found in [16], while (1) the global minimum, equal to − Q (Cg) (see (8)) occurs on the line r = (x, 0, 0). This follows from noting 2 2 that by rotational symmetry, see Fig.1, the entropy associated with rb is a function of x + y , whereas that 2 2 associated with rc depends only on z, so one can set y = 0. Next, for a fixed x + z , ∆(Bg, ρ(r)) is a convex function of z symmetric about z = 0. 
Thus the global maximum of ∆(Bg, ρ(r)) occurs along r = (0, 0, z), and its global minimum along r = (x, 0, 0). The study in [16] used a repetition ansatz,

$$\hat\rho_2(\eta) = \eta[00] + (1-\eta)[11], \tag{53}$$

with $\eta$ chosen appropriately between zero and one to maximize $\Delta(\mathcal{B}_g^{\otimes 2}, \hat\rho_2(\eta))$, and showed that $Q^{(1)}(\mathcal{B}_g)$ is nonadditive at the two-letter level for some values of $p$ between 0 and 1/2, and some values of $\lambda$ in the range $j(p) < \lambda < g(p)$. We have extended these results by using a different ansatz

$$\rho(\zeta) = \{(1+\zeta)[00] + [11] + (1-\zeta)[01] + [10]\}/4, \tag{54}$$

and varying $\zeta$ between $-1$ and $+1$ to maximize $\Delta(\mathcal{B}_g^{\otimes 2}, \rho(\zeta))$. Inserting this maximum in (43) gives a lower bound $\delta_2^*$ for $\delta_2$. We find that along the curve $\lambda = j(p)$, $0 < p < 1/2$, $\delta_2^*$ is positive and goes to zero at the two end points, whereas for a fixed $p$, $\delta_2^*$ rapidly goes to zero as $\lambda$ increases or decreases from $j(p)$. It remains an open question whether, by using an ansatz different from (54) or by some other method, the range of $p$ and $\lambda$ values for which $Q^{(1)}(\mathcal{B}_g)$ is nonadditive can be extended beyond those discussed here or in the previous study [16].
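The thresholds (51) and (52) and the single-letter maximum along $\mathbf{r} = (0, 0, z)$ are easy to evaluate numerically. The sketch below is illustrative only. It assumes the weighted-sum form of the entropy bias for a generalized erasure pair, $\Delta(\mathcal{B}_g, \rho) = (1-\lambda)[S(\mathcal{B}_1(\rho)) - S(\mathcal{C}_1(\rho))] - \lambda S(\rho)$, and uses the Bloch vectors in (50); all function names are ours.

```python
import numpy as np

def h2(x):
    """Binary entropy in bits, with the convention 0*log2(0) = 0."""
    x = np.clip(np.asarray(x, dtype=float), 0.0, 1.0)
    a = np.where(x > 0.0, x, 1.0)        # log2(1) = 0 covers x = 0
    b = np.where(x < 1.0, 1.0 - x, 1.0)  # log2(1) = 0 covers x = 1
    return -(x * np.log2(a) + (1.0 - x) * np.log2(b))

def j_threshold(p):
    """Eq. (51), valid for 0 < p < 1/2."""
    t = 2.0 * p * (1.0 - p) * np.log((1.0 - p) / p)
    return (1.0 - 2.0 * p - t) / (2.0 - 4.0 * p - t)

def g_threshold(p):
    """Eq. (52)."""
    s = (1.0 - 2.0 * p) ** 2
    return s / (1.0 + s)

def entropy_bias_z(z, p, lam):
    """Entropy bias for r = (0, 0, z), under the assumed weighted-sum form."""
    s_rho = h2((1.0 + z) / 2.0)
    s_b1 = s_rho  # by (50), r_b = (0, 0, z): dephasing leaves rho unchanged
    rc = np.sqrt((1.0 - 2.0 * p) ** 2 + 4.0 * p * (1.0 - p) * z ** 2)  # |r_c|
    s_c1 = h2((1.0 + rc) / 2.0)
    return (1.0 - lam) * (s_b1 - s_c1) - lam * s_rho

def q1_bg(p, lam, num=20001):
    """Grid estimate of Q1(Bg): the maximum over 0 <= z <= 1 (symmetric in z)."""
    z = np.linspace(0.0, 1.0, num)
    return max(0.0, float(np.max(entropy_bias_z(z, p, lam))))
```

A uniform grid in $z$ is adequate well inside the region $\lambda < g(p)$. Close to $\lambda = g(p)$ the maximum becomes exponentially small and sits very near $z = 1$, which is where the asymptotic estimates of App. B are the better tool.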

5.3 Incomplete Erasure Channel

As discussed following (35), $\mathcal{C}_g$ resembles an erasure channel $\mathcal{C}_e$, except that instead of completely erasing its input it sends it through a noisy channel $\mathcal{C}_1$. This “incomplete erasure” leads to an interesting effect shown in Fig. 4 for the amplitude damping case of Sec. 5.1. When $p = 0$, which means that $\mathcal{C}_1$ is the completely noisy trace channel $\mathcal{T}$, both $Q^{(1)}(\mathcal{C}_g)$ and $Q(\mathcal{C}_g)$ are exactly zero for $0 \le \lambda \le 1/2$. But as soon as $p$ is positive by the smallest amount, $Q^{(1)}(\mathcal{C}_g)$ is positive over the entire range $\lambda > 0$. As $p$ tends to 0, the analysis in App. B yields the asymptotic behavior

$$Q^{(1)}(\mathcal{C}_g) \simeq \frac{\lambda}{\ln 2}\exp[(p + \ln p)(1-\lambda)/\lambda] \tag{55}$$

for $0 < \lambda \le 1/2$; see the inset in Fig. 4. A very similar behavior is found when $\mathcal{C}_1$ is the complement of the phase-damping channel $\mathcal{B}_1$ discussed in Sec. 5.2, for which the corresponding asymptotic expression is

$$Q^{(1)}(\mathcal{C}_g) \simeq \frac{\lambda}{\ln 2}\exp\{[p + (1-2p)\ln p](1-\lambda)/\lambda\}. \tag{56}$$

Figure 4: Plot of $Q^{(1)}(\mathcal{C}_g)$ against $\lambda \in [0, 1]$ for various $p$ values. The inset shows $Q^{(1)}(\mathcal{C}_g)$ on a logarithmic scale for small positive $\lambda$ and $p$.
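The asymptotic expression (56) can be compared with a direct evaluation in the phase-damping case, since by Sec. 5.2 the minimum of the entropy bias lies on the line $\mathbf{r} = (x, 0, 0)$ and the Bloch vectors (50) are explicit there. The sketch below is illustrative only. It assumes the weighted-sum form $\Delta(\mathcal{B}_g, \rho) = (1-\lambda)[S(\mathcal{B}_1(\rho)) - S(\mathcal{C}_1(\rho))] - \lambda S(\rho)$ and a search window $10^{-9} \le 1 - x \le 10^{-3}$, which has to contain the minimum for the comparison to be meaningful; the function names are ours.

```python
import numpy as np

def h2(x):
    """Binary entropy in bits, with the convention 0*log2(0) = 0."""
    x = np.clip(np.asarray(x, dtype=float), 0.0, 1.0)
    a = np.where(x > 0.0, x, 1.0)
    b = np.where(x < 1.0, 1.0 - x, 1.0)
    return -(x * np.log2(a) + (1.0 - x) * np.log2(b))

def q1_cg_asymptotic(p, lam):
    """Asymptotic form (56), phase-damping case, small p."""
    expo = (p + (1.0 - 2.0 * p) * np.log(p)) * (1.0 - lam) / lam
    return lam / np.log(2.0) * np.exp(expo)

def q1_cg_numeric(p, lam, num=6001):
    """Minus the minimum of the entropy bias on r = (1 - eps, 0, 0).

    Assumption: the minimum lies in the window 1e-9 <= eps <= 1e-3.
    """
    eps = np.logspace(-9, -3, num)
    s_rho = h2(eps / 2.0)
    # By (50): |r_b| = (1-2p)(1-eps) and |r_c| = 1-2p on this line.
    s_b1 = h2((1.0 - (1.0 - 2.0 * p) * (1.0 - eps)) / 2.0)
    s_c1 = h2(p)
    bias = (1.0 - lam) * (s_b1 - s_c1) - lam * s_rho
    return float(-np.min(bias))
```

For example, `q1_cg_numeric(0.01, 0.25)` and `q1_cg_asymptotic(0.01, 0.25)` can be compared directly; a logarithmic grid is used because the minimizing value of $1 - x$ is itself exponentially small.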

In both cases, when $p$ is small $\mathcal{C}_1$ is not only noisy but antidegradable, so that its quantum capacity is exactly zero. Thus its ability to make $Q^{(1)}(\mathcal{C}_g)$ positive in the entire range $0 < \lambda \le 1/2$ comes as something of a surprise. These two examples might suggest that $Q^{(1)}(\mathcal{C}_g)$ is positive for all $\lambda > 0$ if $\mathcal{C}_1$ is any channel that is not completely noisy. But this is not the case. If $\mathcal{C}_1$ is an erasure channel with erasure probability $\mu$, $\mathcal{C}_g$ is an erasure channel with erasure probability

$$\epsilon = (1-\lambda)\mu, \tag{57}$$

and thus has zero capacity for $\epsilon \ge 1/2$. This means $Q^{(1)}(\mathcal{C}_g) = 0$ for

$$0 \le \lambda \le 1 - 1/(2\mu), \tag{58}$$

which is a finite interval for any $\mu$ greater than 1/2. (One way to derive (57) is to note that if one defines the transmission probability of an erasure channel as 1 minus the erasure probability, $\mathcal{B}_1$ has a transmission probability of $\mu$. Since $\mathcal{B}_g$ is the concatenation (36) of an erasure channel with transmission probability $\mu$ with one whose transmission probability is $1-\lambda$, it is an erasure channel with transmission probability equal to the product $\mu(1-\lambda)$. And this is the erasure probability of its complement $\mathcal{C}_g$.)
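The counterexample in (57) and (58) amounts to a few lines of arithmetic. The following is a minimal sketch with illustrative function names of our own; it assumes a qubit input and uses the known capacity $\max(0, 1 - 2\epsilon)$ of a qubit erasure channel with erasure probability $\epsilon$ [19].

```python
def cg_erasure_probability(lam, mu):
    """Erasure probability of Cg, Eq. (57), when C1 erases with probability mu."""
    return (1.0 - lam) * mu

def zero_capacity_lambda_max(mu):
    """Upper end of the interval (58); None when mu < 1/2 (no such interval)."""
    if mu < 0.5:
        return None
    return 1.0 - 1.0 / (2.0 * mu)

def qubit_erasure_capacity(eps):
    """Quantum capacity, in qubits per channel use, of a qubit erasure channel."""
    return max(0.0, 1.0 - 2.0 * eps)
```

With $\mu = 1$ the interval (58) is $0 \le \lambda \le 1/2$, which is the ordinary erasure channel recovered at $p = 0$; with $\mu = 3/4$ it shrinks to $0 \le \lambda \le 1/3$.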

6 Summary and Conclusions

Following a review in Sec. 2 of how an isometry gives rise to a quantum channel pair $(\mathcal{B}, \mathcal{C})$, the process of combining channels or channel pairs that we call gluing is presented in Sec. 3. Combining channels by placing them in parallel or series is of course well known. However, the gluing procedure, aside from particular cases like convex combinations and direct sums of channels, has so far as we know not been discussed earlier in the literature, and might well have some interesting applications in addition to those discussed in this paper. Our focus is on a type of gluing procedure that leads to what we call a block diagonal channel pair, see (21) and the discussion following it. What makes this procedure of combining channels particularly useful for the study of quantum channel capacities is that the entropy of the output of a block diagonal channel is a weighted sum of the entropies of the outputs of the individual channels in the combination, plus a “classical” term, (24). Thus the entropy bias or coherent information, the difference of the entropies of the outputs of a block diagonal channel pair $\mathcal{B}$ and $\mathcal{C}$ for a given input, is a similar weighted sum (with the “classical” term cancelling out), (25). In addition one has a simple physical picture: with some probability the input to a block diagonal channel pair is sent into one of several different channel pairs.

The quantum erasure channel with erasure probability $\lambda$, together with its complement, an erasure channel with erasure probability $1 - \lambda$, is an example of a block diagonal channel pair formed by gluing together two perfect channel pairs, as discussed in Sec. 4. Replacing one of the perfect channel pairs with an arbitrary pair $(\mathcal{B}_1, \mathcal{C}_1)$ results in a generalized erasure channel pair $(\mathcal{B}_g, \mathcal{C}_g)$. The channel $\mathcal{B}_g$ can be viewed as a concatenation of $\mathcal{B}_1$ with a suitable erasure channel, while one can think of $\mathcal{C}_g$ as an erasure channel with incomplete erasure.

In Sec. 5 we have analyzed two cases of generalized erasure channel pairs constructed using a pair $(\mathcal{B}_1, \mathcal{C}_1)$, with both $\mathcal{B}_1$ and $\mathcal{C}_1$ qubit-to-qubit channels. In the first case, Sec. 5.1, $\mathcal{B}_1$ and $\mathcal{C}_1$ are complementary amplitude damping channels. In the second case, Sec. 5.2, $\mathcal{B}_1$ is a phase-damping channel, with complement $\mathcal{C}_1$ a measure-and-prepare channel. This second case has been studied in [16]; some of our results confirm and extend the ones published there. In both cases the qubit channel pair depends on a single parameter $0 \le p \le 1$, which determines the amount of amplitude or phase damping of $\mathcal{B}_1$. Hence the corresponding generalized erasure channel is characterized by two parameters: $p$ and the erasure probability $\lambda$.

Both the amplitude and phase damping cases exhibit interesting, and to some extent unexpected, behavior. The nonadditive behavior of $Q^{(1)}(\mathcal{B}_g)$ for two identical channels in parallel is analyzed in Sec. 5.1 for the amplitude-damping case, starting with a global optimization, assisted by an asymptotic analysis, to yield accurate values of $Q^{(1)}(\mathcal{B}_g)$. In the $(p, \lambda)$ plane $Q^{(1)}(\mathcal{B}_g)$ is zero for $\lambda \ge \lambda_0(p)$, the upper curve in Fig. 2, and positive for $0 \le \lambda < \lambda_0(p)$. A numerical search for a positive $\delta_2 = Q^{(1)}(\mathcal{B}_g^{\otimes 2})/2 - Q^{(1)}(\mathcal{B}_g)$ suggests a plausible form (45) for the bipartite input density operator, and using this one finds a well-defined region in the $(p, \lambda)$ plane, lying between the two curves in Fig. 2, in which $\delta_2$ is positive. Figure 3 shows $\delta_2$ as a function of $\delta\lambda = \lambda_0(p) - \lambda$ for $p = 1/4$.

The nonadditivity of $Q^{(1)}(\mathcal{B}_g)$ for the phase-damping case (the dephrasure channel) was studied in [16]. Our global optimization results confirm their $Q^{(1)}(\mathcal{B}_g)$ calculation, showing that it is zero for $\lambda \ge g(p)$ and positive for $\lambda < g(p)$, and we have extended the range of $(p, \lambda)$ values over which nonadditivity occurs, without determining its full extent. Even when nonadditivity is absent for two identical channels in parallel it could be present for three or more. One very preliminary result in Sec. 5.1 for the multiple channel case suggests an interesting possibility: There might be channels for which $Q^{(1)}$ is nonadditive when two are placed in parallel, but thereafter no additional nonadditivity arises when a collection of such “double” channels are placed in parallel with one another.

The behavior of $Q^{(1)}$ of the incomplete erasure channel $\mathcal{C}_g$, discussed in Sec. 5.3, is also quite surprising. In both the amplitude and phase damping cases, when $p = 0$ the channel $\mathcal{C}_g$ becomes an erasure channel with erasure probability $1 - \lambda$, so that $Q^{(1)}(\mathcal{C}_g) = Q(\mathcal{C}_g) = 0$ for $\lambda \le 1/2$. However, as soon as $p$ is positive, $Q^{(1)}(\mathcal{C}_g)$ is greater than zero over the range $0 < \lambda \le 1$, see Fig. 4. It is surprising that “assisting” the erasure channel with a very noisy $\mathcal{C}_1$, which itself has zero quantum capacity, gives rise to this effect.
Not every noisy $\mathcal{C}_1$ provides such a dramatic improvement, and it would be interesting to determine which channels do so. While the behavior of $Q^{(1)}(\mathcal{C}_g)$ in Fig. 4 emerges quite clearly from the mathematics, we lack an intuitive explanation.

The results reported here could be extended in various ways. Two real numbers, $p$ and $q$, are needed to parametrize the family of channel pairs $(\mathcal{B}_1, \mathcal{C}_1)$ where both are qubit-to-qubit channels. The amplitude- and phase-damping cases discussed above correspond to different choices of $q$. It may be possible to extend the results in Sec. 5 to this larger family of channels; however the absence of certain symmetries that simplified the analysis in Sec. 5 might lead to complications. In addition, nonadditivity can, and undoubtedly does, occur in certain cases when two nonidentical $\mathcal{B}_g$ channels, with unequal choices for the parameter pair $(\lambda, p)$, are placed in parallel. We have no idea what might arise from a study of these, but analyzing what happens when the parameters $(\lambda, p)$ for one channel are varied while those for the other are held fixed might in some situations turn out to be simpler than studying the case in which the two sets of parameters are identical.

The main advantage of the generalized erasure approach for studying positivity and nonadditivity of $Q^{(1)}$ lies in the fact that when two channel pairs with a very simple structure, in our case the $(\mathcal{B}_1, \mathcal{C}_1)$ pair and the perfect channel pair, are glued together, this can give rise to new and interesting behavior not present in either of the separate components. One suspects that there are other instances of this sort worth exploring, and one can hope that analyzing them will yield additional insights into the behavior of the quantum capacity of noisy quantum channels, a very challenging but at the same time very important problem in quantum information theory that needs to be better understood. We hope our results, limited as they are, will make some contribution to this end.

Acknowledgments

This work used the Extreme Science and Engineering Discovery Environment (XSEDE) [28], which is supported by National Science Foundation grant number ACI-1548562. Specifically, it used the Bridges system [29], which is supported by NSF award number ACI-1445606, at the Pittsburgh Supercomputing Center (PSC).

A Appendix. Concatenation and antidegradable channels

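Figure 5 will be of assistance in understanding the following proof that the concatenation

$$\mathcal{B} = \mathcal{B}_2 \circ \mathcal{B}_1, \tag{59}$$

of two channels placed in series is antidegradable if either $\mathcal{B}_1$ or $\mathcal{B}_2$ is antidegradable. Let

$$J_1 : \mathcal{H}_a \to \mathcal{H}_{b1} \otimes \mathcal{H}_{c1}, \quad J_2 : \mathcal{H}_{b1} \to \mathcal{H}_{b2} \otimes \mathcal{H}_{c2}, \tag{60}$$

be the isometries that give rise to the channel pairs $(\mathcal{B}_1, \mathcal{C}_1)$ and $(\mathcal{B}_2, \mathcal{C}_2)$. Then $\mathcal{B}$ and its complement $\mathcal{C}$ are generated by the isometry $J : \mathcal{H}_a \to \mathcal{H}_b \otimes \mathcal{H}_c$, where

$$\mathcal{H}_b = \mathcal{H}_{b2}, \quad \mathcal{H}_c = \mathcal{H}_{c1} \otimes \mathcal{H}_{c2}, \quad J = (J_2 \otimes I_{c1}) \circ J_1, \tag{61}$$

with $I_{c1}$ the identity map on $\mathcal{H}_{c1}$. Thus the complement $\mathcal{C}$ of $\mathcal{B}$ maps $\hat{\mathcal{H}}_a$ to the tensor product $\hat{\mathcal{H}}_{c1} \otimes \hat{\mathcal{H}}_{c2}$, and its partial traces over these outputs are:

$$\mathrm{Tr}_{c1}[\mathcal{C}(A)] = \mathcal{C}_2 \circ \mathcal{B}_1(A), \quad \mathrm{Tr}_{c2}[\mathcal{C}(A)] = \mathcal{C}_1(A). \tag{62}$$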

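If $\mathcal{B}_1$ is antidegradable, i.e., $\mathcal{C}_1$ is degradable, there exists a degrading map $\mathcal{D}_1 : \hat{\mathcal{H}}_{c1} \to \hat{\mathcal{H}}_{b1}$, indicated by a dashed curve in Fig. 5 (ignore $\mathcal{D}_2$), such that for any $A$ in $\hat{\mathcal{H}}_a$,

$$\mathcal{D}_1 \circ \mathcal{C}_1(A) = \mathcal{B}_1(A). \tag{63}$$

Given $\mathcal{D}_1$, one can define a degrading map $\mathcal{D}$ that maps $\hat{\mathcal{H}}_c = \hat{\mathcal{H}}_{c1} \otimes \hat{\mathcal{H}}_{c2}$ to $\hat{\mathcal{H}}_b = \hat{\mathcal{H}}_{b2}$ by its action on the tensor product of operators $F_1 \in \hat{\mathcal{H}}_{c1}$ and $F_2 \in \hat{\mathcal{H}}_{c2}$,

$$\mathcal{D}(F_1 \otimes F_2) = \mathrm{Tr}(F_2) \cdot \mathcal{B}_2 \circ \mathcal{D}_1(F_1). \tag{64}$$

Then use linearity to extend this to the entire operator space $\hat{\mathcal{H}}_c$. Intuitively (Fig. 5) $\mathcal{D}$ “throws away” the $\hat{\mathcal{H}}_{c2}$ output, while mapping the $\hat{\mathcal{H}}_{c1}$ output to $\hat{\mathcal{H}}_{b1}$ before $\mathcal{B}_2$ is applied. Since $\mathcal{C}_1$ followed by $\mathcal{D}_1$ is the same as $\mathcal{B}_1$, $\mathcal{D} \circ \mathcal{C}$ is identical to $\mathcal{B} = \mathcal{B}_2 \circ \mathcal{B}_1$. Hence $\mathcal{C}$ is degradable and $\mathcal{B}$ antidegradable.

If instead of $\mathcal{B}_1$ we assume that $\mathcal{B}_2$ is antidegradable, the appropriate degrading map $\mathcal{D} : \hat{\mathcal{H}}_c \to \hat{\mathcal{H}}_b$ is obtained by “throwing away” the $\hat{\mathcal{H}}_{c1}$ output of $\mathcal{C}$ and applying the degrading map $\mathcal{D}_2$, see Fig. 5 and ignore $\mathcal{D}_1$, to the $\hat{\mathcal{H}}_{c2}$ output, with the result

$$\mathcal{D}(C_1 \otimes C_2) = \mathrm{Tr}(C_1) \cdot \mathcal{D}_2(C_2) \tag{65}$$

for operators $C_1 \in \hat{\mathcal{H}}_{c1}$ and $C_2 \in \hat{\mathcal{H}}_{c2}$. So again $\mathcal{D} \circ \mathcal{C}$ is identical to $\mathcal{B} = \mathcal{B}_2 \circ \mathcal{B}_1$, which means $\mathcal{C}$ is degradable and $\mathcal{B}$ antidegradable.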
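For completeness, the two identities $\mathcal{D} \circ \mathcal{C} = \mathcal{B}$ used above can be written out explicitly. By linearity, (64) is equivalent to $\mathcal{D}(F) = \mathcal{B}_2 \circ \mathcal{D}_1(\mathrm{Tr}_{c2} F)$ for any $F$ in $\hat{\mathcal{H}}_c$, and (65) to $\mathcal{D}(F) = \mathcal{D}_2(\mathrm{Tr}_{c1} F)$. In the second case we take $\mathcal{D}_2 \circ \mathcal{C}_2 = \mathcal{B}_2$, the analog of (63) for the pair $(\mathcal{B}_2, \mathcal{C}_2)$. Combining these with the partial traces in (62) gives, as a sketch of the missing steps:

```latex
\begin{align*}
\text{$\mathcal{B}_1$ antidegradable:}\quad
\mathcal{D}\circ\mathcal{C}(A)
  &= \mathcal{B}_2\circ\mathcal{D}_1\bigl(\mathrm{Tr}_{c2}[\mathcal{C}(A)]\bigr)
   = \mathcal{B}_2\circ\mathcal{D}_1\circ\mathcal{C}_1(A)
   = \mathcal{B}_2\circ\mathcal{B}_1(A), \\
\text{$\mathcal{B}_2$ antidegradable:}\quad
\mathcal{D}\circ\mathcal{C}(A)
  &= \mathcal{D}_2\bigl(\mathrm{Tr}_{c1}[\mathcal{C}(A)]\bigr)
   = \mathcal{D}_2\circ\mathcal{C}_2\circ\mathcal{B}_1(A)
   = \mathcal{B}_2\circ\mathcal{B}_1(A).
\end{align*}
```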

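Figure 5: A schematic diagram indicating the spaces $\hat{\mathcal{H}}_a$, $\hat{\mathcal{H}}_{b1}$, $\hat{\mathcal{H}}_b$, $\hat{\mathcal{H}}_{c1}$, and $\hat{\mathcal{H}}_{c2}$, and the channels $\mathcal{B}_1$, $\mathcal{B}_2$, $\mathcal{C}_1$, $\mathcal{C}_2$, $\mathcal{D}_1$, and $\mathcal{D}_2$ acting between these spaces.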

B Appendix. Asymptotic estimates of $Q^{(1)}(\mathcal{B}_g)$ and $Q^{(1)}(\mathcal{C}_g)$

Some use was made in Sec. 5 of asymptotic expressions for the single-letter capacities $Q^{(1)}$ in circumstances in which a straightforward numerical approach runs into difficulties because one is trying to find the maximum or minimum of a function

$$f(\epsilon) = \alpha\,\epsilon\ln(\epsilon) + \beta\,\epsilon, \tag{66}$$

where $\epsilon > 0$ is small, and $\alpha$ and $\beta$ are real numbers. If $\alpha$ is positive, $f(\epsilon)$ will be negative for sufficiently small $\epsilon$, and positive if $\alpha$ is negative. If $\alpha$ and $\beta$ are both positive, $f$ has a minimum at

$$\epsilon = \epsilon_m := \exp[-(1 + \beta/\alpha)], \tag{67}$$

where it takes the value

$$f(\epsilon_m) = -\alpha\,\epsilon_m = -\alpha\exp[-(1 + \beta/\alpha)]. \tag{68}$$

If both $\alpha$ and $\beta$ are negative, $f$ has a maximum rather than a minimum at (67), and the maximum value is again given by (68).

A first application of these formulas is to the amplitude damping case, Sec. 5.1, where for $\mathbf{r} = (0, 0, z)$ in the expression for $\rho(\mathbf{r})$ in (38)

$$f(\epsilon) = \Delta(\mathcal{B}_g, \rho(\mathbf{r})), \quad \epsilon = 1 - z \tag{69}$$

has the form (66) for small $\epsilon$, where as a function of $0 < \lambda < 1/2$ and $0 < p < 1/2$,

$$\alpha = [p(1-\lambda) + \lambda - 1/2]/\ln 2. \tag{70}$$

This is zero along the line $\lambda = \lambda_0(p)$, (40), and negative when $\delta\lambda = \lambda_0(p) - \lambda$ is positive. Hence $\Delta(\mathcal{B}_g, \rho(\mathbf{r}))$, and therefore its maximum $Q^{(1)}(\mathcal{B}_g)$, is greater than zero for sufficiently small $\delta\lambda > 0$. This is consistent with numerical results that indicate that $Q^{(1)}(\mathcal{B}_g)$ is zero for $\lambda \ge \lambda_0(p)$ and positive elsewhere.

One can work out the asymptotic form of $Q^{(1)}(\mathcal{B}_g)$ for small positive $\delta\lambda$ using (68) and

$$\alpha = \alpha_1(p)\,\delta\lambda, \quad \beta = \beta_0(p) + \beta_1(p)\,\delta\lambda, \tag{71}$$

where

$$\begin{aligned}
\alpha_1(p) &= -(1-p)/\ln 2, \\
\beta_0(p) &= [(p\ln p)/(1-p) - \ln(1-p)]/(4\ln 2), \\
\beta_1(p) &= (1-p)[2\beta_0(p) + 1 + 1/\ln 2].
\end{aligned} \tag{72}$$

Both $\alpha_1(p)$ and $\beta_0(p)$ are negative in the range of interest, $0 < p < 1/2$, so $\Delta(\mathcal{B}_g, \rho(\mathbf{r}))$ will have a maximum at

$$\epsilon_m \simeq K\exp[-\beta_0/(\alpha_1\,\delta\lambda)], \quad K = \exp[-1 - (\beta_1/\alpha_1)], \tag{73}$$

and thus $Q^{(1)}(\mathcal{B}_g)$, the maximum of $\Delta(\mathcal{B}_g, \rho(\mathbf{r}))$, has the asymptotic form

$$Q^{(1)}(\mathcal{B}_g) \simeq -\alpha_1\,\delta\lambda\,\epsilon_m \simeq a(p)\,\delta\lambda\exp[-b(p)/\delta\lambda], \tag{74}$$

for small $\delta\lambda$, where

$$a(p) := -\alpha_1 K, \quad b(p) := \beta_0/\alpha_1, \tag{75}$$

are positive functions whose $p$ dependence is determined by that of $\alpha$ and $\beta$. The final factor in (74) is exponentially small due to the $\delta\lambda$ in the denominator of the exponent. The approximation (74) is in reasonable agreement with direct numerical calculations for small $\delta\lambda$ at $p = 1/4$.

In the same way one can find the asymptotic behavior, for $0 < \lambda < 1/2$ and $p$ very small, of $Q^{(1)}(\mathcal{C}_g)$, equal to minus the minimum of $f(\epsilon) = \Delta(\mathcal{B}_g, \rho(\mathbf{r}))$ in (69). In this case $z$ is close to $-1$ and

$$\epsilon = 1 + z, \tag{76}$$

is a small quantity. Now $\alpha$ and $\beta$ are given by

$$\alpha = \lambda/(2\ln 2), \quad \beta = \beta_0(\lambda) + \beta_1(\lambda)\ln p + \beta_2(\lambda)\,p\ln p + \beta_3(\lambda)\,p + \cdots, \tag{77}$$

where

$$\beta_0 = -\alpha(1 + \ln 2), \quad \beta_1(\lambda) = -\alpha(1-\lambda)/\lambda, \quad \beta_2(\lambda) = 0, \quad \beta_3(\lambda) = \beta_1(\lambda). \tag{78}$$

Inserting these in (68) one arrives at the asymptotic formula (55) for $Q^{(1)}(\mathcal{C}_g) \simeq -f(\epsilon_m)$.

An asymptotic estimate for small $\delta\lambda$ of the nonadditivity of $Q^{(1)}(\mathcal{B}_g)$ at the 2-letter level, see (43), can be carried out assuming that $\epsilon$ in the input density operator $\sigma$, (45), is small, and looking for the maximum of

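$$\bar f(\epsilon) = \Delta(\mathcal{B}_g^{\otimes 2}, \sigma(\epsilon)) = \bar\alpha\,\epsilon\ln(\epsilon) + \bar\beta\,\epsilon. \tag{79}$$

It turns out that

$$\bar\alpha = 2\alpha, \quad \bar\beta = \bar\beta_0(p) + \bar\beta_1(p)\,\delta\lambda + \bar\beta_2(p)\,\delta\lambda^2, \tag{80}$$

where $\alpha$ is the single channel quantity defined in (71), and

$$\begin{aligned}
\bar\beta_0(p) &= \frac{(-2\ln 2)p + (4\ln 2)p^2 + p\ln p + (1 - p(1+2p))\ln(1+p) - 2(1-p)^2\ln(1-p)}{4\ln 2\,(1-p)^2}, \\
\bar\beta_1(p) &= \frac{2(1-2p) + 2p^2(1+\ln 2) + p\ln p - (1-p)^2\ln(1-p) - p(1+p)\ln(1+p)}{(1-p)\ln 2}, \\
\bar\beta_2(p) &= [p\ln(4p) - (1+p)\ln(1+p)]/\ln 2.
\end{aligned} \tag{81}$$

That $\bar\alpha = 2\alpha$ makes it convenient to consider the ratio

$$R = \frac{Q^{(1)}(\mathcal{B}_g^{\otimes 2})}{2Q^{(1)}(\mathcal{B}_g)} = \frac{\bar\epsilon_m}{\epsilon_m} = \exp[(\beta/\alpha)(1 - \bar\beta/(2\beta))], \tag{82}$$

and thus

$$Q^{(1)}(\mathcal{B}_g^{\otimes 2}) \simeq \bar a(p)\,\delta\lambda\exp[-\bar b(p)/\delta\lambda], \tag{83}$$

with

$$\bar a(p) = -2\alpha_1\exp[-1 - \bar\beta_1/(2\alpha_1)], \quad \bar b(p) = \bar\beta_0/(2\alpha_1). \tag{84}$$

Provided

$$\bar\beta_0/(2\beta_0) < 1, \tag{85}$$

a condition fulfilled for all $0 < p < 1/2$, $\bar b(p)$ is less than $b(p)$ and $R$ tends to $+\infty$ as $\delta\lambda$ goes to zero, so that as $\delta\lambda \to 0$,

$$\delta_2 = Q^{(1)}(\mathcal{B}_g^{\otimes 2})/2 - Q^{(1)}(\mathcal{B}_g) \simeq Q^{(1)}(\mathcal{B}_g^{\otimes 2})/2 \simeq \tfrac{1}{2}\,\bar a(p)\,\delta\lambda\exp[-\bar b(p)/\delta\lambda]. \tag{86}$$

In the case of the phase-damping channel, Sec. 5.2, similar asymptotic estimates are possible, where the small parameter is now

$$\delta\lambda = g(p) - \lambda, \tag{87}$$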

where $g(p)$ is defined in (52). For $Q^{(1)}(\mathcal{B}_g)$ the coefficients in (71) and (72) are given by

$$\begin{aligned}
\alpha_1(p) &= -[1 + (1-2p)^2]/(2\ln 2), \\
\beta_0(p) &= \frac{2p(1-p)\ln(4p(1-p))}{(1 + (1-2p)^2)\ln 2}, \\
\beta_1(p) &= [1 + \ln 2 - 2p(1-p)(1 - \ln(2p(1-p)))]/\ln 2.
\end{aligned} \tag{88}$$

These coefficients when inserted in (73) and (75) yield the asymptotic form (74).

Similarly, an asymptotic formula for $Q^{(1)}(\mathcal{C}_g)$ is obtained by employing in (77) the quantities

$$\alpha = \lambda/(2\ln 2), \quad \beta_0 = -\alpha(1 + \ln 2), \quad \beta_1(\lambda) = -\alpha(1-\lambda)/\lambda, \quad \beta_2(\lambda) = -2\beta_1(\lambda), \quad \beta_3(\lambda) = \beta_1(\lambda), \tag{89}$$

resulting in the asymptotic form (56).
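The formulas of this appendix for the amplitude-damping case lend themselves to a quick numerical check. The sketch below is illustrative only: the function names are ours, and the coefficient formulas are transcribed from (72) and the first line of (81). It evaluates the stationary point (67), the asymptotic form (74), and the ratio appearing in condition (85).

```python
import math

def f(eps, alpha, beta):
    """The model function (66)."""
    return alpha * eps * math.log(eps) + beta * eps

def eps_m(alpha, beta):
    """Stationary point (67) of f."""
    return math.exp(-(1.0 + beta / alpha))

# Amplitude-damping coefficients, Eq. (72), for 0 < p < 1/2.
def alpha1(p):
    return -(1.0 - p) / math.log(2.0)

def beta0(p):
    return (p * math.log(p) / (1.0 - p) - math.log(1.0 - p)) / (4.0 * math.log(2.0))

def beta1(p):
    return (1.0 - p) * (2.0 * beta0(p) + 1.0 + 1.0 / math.log(2.0))

def q1_bg_asymptotic(p, dlam):
    """Asymptotic form (74): a(p) * dlam * exp(-b(p) / dlam), with (73), (75)."""
    K = math.exp(-1.0 - beta1(p) / alpha1(p))
    a = -alpha1(p) * K
    b = beta0(p) / alpha1(p)
    return a * dlam * math.exp(-b / dlam)

def beta0_bar(p):
    """Two-letter coefficient, first line of (81)."""
    ln2 = math.log(2.0)
    num = (-2.0 * ln2 * p + 4.0 * ln2 * p ** 2 + p * math.log(p)
           + (1.0 - p * (1.0 + 2.0 * p)) * math.log(1.0 + p)
           - 2.0 * (1.0 - p) ** 2 * math.log(1.0 - p))
    return num / (4.0 * ln2 * (1.0 - p) ** 2)

def condition_85(p):
    """Left-hand side of (85), which must be below 1 for b_bar(p) < b(p)."""
    return beta0_bar(p) / (2.0 * beta0(p))
```

With `alpha = alpha1(p) * dlam` and `beta = beta0(p) + beta1(p) * dlam`, the value `f(eps_m(alpha, beta), alpha, beta)` coincides with `q1_bg_asymptotic(p, dlam)`, because (73) is just (67) with (71) inserted. Agreement with the true $Q^{(1)}(\mathcal{B}_g)$ additionally requires that $\epsilon_m$ be small.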

References

[1] A. S. Holevo. The capacity of the quantum channel with general signal states. IEEE Transactions on Information Theory, 44(1):269–273, Jan 1998. doi:10.1109/18.651037.
[2] Benjamin Schumacher and Michael D. Westmoreland. Sending classical information via noisy quantum channels. Phys. Rev. A, 56:131–138, Jul 1997. doi:10.1103/PhysRevA.56.131.
[3] I. Devetak. The private classical capacity and quantum capacity of a quantum channel. IEEE Transactions on Information Theory, 51(1):44–55, Jan 2005. doi:10.1109/TIT.2004.839515.
[4] C. E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27(3):379–423, 1948. doi:10.1002/j.1538-7305.1948.tb01338.x.
[5] Benjamin Schumacher and M. A. Nielsen. Quantum data processing and error correction. Phys. Rev. A, 54:2629–2635, Oct 1996. doi:10.1103/PhysRevA.54.2629.
[6] David P. DiVincenzo, Peter W. Shor, and John A. Smolin. Quantum-channel capacity of very noisy channels. Phys. Rev. A, 57:830–839, Feb 1998. doi:10.1103/PhysRevA.57.830.
[7] Seth Lloyd. Capacity of the noisy quantum channel. Phys. Rev. A, 55:1613–1622, Mar 1997. doi:10.1103/PhysRevA.55.1613.
[8] Peter W. Shor. Quantum error correction, Nov 2002. URL http://www.msri.org/workshops/203/schedules/1181.
[9] Graeme Smith and Jon Yard. Quantum communication with zero-capacity channels. Science, 321(5897):1812–1815, 2008. doi:10.1126/science.1162242.
[10] Fernando G. S. L. Brandão, Jonathan Oppenheim, and Sergii Strelchuk. When does noise increase the quantum capacity? Phys. Rev. Lett., 108:040501, Jan 2012. doi:10.1103/PhysRevLett.108.040501.
[11] I. Devetak and P. W. Shor. The capacity of a quantum channel for simultaneous transmission of classical and quantum information. Communications in Mathematical Physics, 256(2):287–303, 2005. doi:10.1007/s00220-005-1317-6.
[12] F. Leditzky, N. Datta, and G. Smith. Useful states and entanglement distillation. IEEE Transactions on Information Theory, 64(7):4689–4708, July 2018. doi:10.1109/TIT.2017.2776907.
[13] Pawel Horodecki, Michal Horodecki, and Ryszard Horodecki. Binding entanglement channels. J. Mod. Opt., 47:347–354, 2000, quant-ph/9904092.
[14] Toby Cubitt, David Elkouss, William Matthews, Maris Ozols, David Pérez-García, and Sergii Strelchuk. Unbounded number of channel uses may be required to detect quantum capacity. Nature Communications, 6:6739, Mar 2015. doi:10.1038/ncomms7739.
[15] Graeme Smith and John A. Smolin. Detecting incapacity of a quantum channel. Phys. Rev. Lett., 108:230507, Jun 2012. doi:10.1103/PhysRevLett.108.230507.
[16] Felix Leditzky, Debbie Leung, and Graeme Smith. Dephrasure channel and superadditivity of coherent information. Phys. Rev. Lett., 121:160501, Oct 2018. doi:10.1103/PhysRevLett.121.160501.
[17] Graeme Smith and John A. Smolin. Degenerate quantum codes for Pauli channels. Phys. Rev. Lett., 98:030501, Jan 2007. doi:10.1103/PhysRevLett.98.030501.
[18] Jesse Fern and K. Birgitta Whaley. Lower bounds on the nonzero capacity of Pauli channels. Phys. Rev. A, 78:062335, Dec 2008. doi:10.1103/PhysRevA.78.062335.
[19] Charles H. Bennett, David P. DiVincenzo, and John A. Smolin. Capacities of quantum erasure channels. Phys. Rev. Lett., 78:3217–3220, Apr 1997. doi:10.1103/PhysRevLett.78.3217.
[20] Mark M. Wilde. From classical to quantum Shannon theory. arXiv:1106.1445v8, Jul 2019, arxiv.org/abs/1106.1445v8.
[21] J. Yard, P. Hayden, and I. Devetak. Capacity theorems for quantum multiple-access channels: classical-quantum and quantum-quantum capacity regions. Information Theory, IEEE Transactions on, 54(7):3091–3113, Jul 2008. doi:10.1109/TIT.2008.924665.
[22] Motohisa Fukuda and Michael M. Wolf. Simplifying additivity problems using direct sum constructions. Journal of Mathematical Physics, 48(7):072101, 2007. doi:10.1063/1.2746128.
[23] G. Smith and J. A. Smolin. Additive extensions of a quantum channel. In Information Theory Workshop, 2008. ITW ’08. IEEE, pages 368–372, May 2008. doi:10.1109/ITW.2008.4578688.
[24] Sumeet Khatri, Kunal Sharma, and Mark M. Wilde. Information-theoretic aspects of the generalized amplitude damping channel. arXiv:1903.07747, Mar 2019, arxiv.org/abs/1903.07747.
[25] Michael M. Wolf and David Pérez-García. Quantum capacities of channels with small environment. Phys. Rev. A, 75:012303, Jan 2007. doi:10.1103/PhysRevA.75.012303.
[26] Johannes Bausch and Felix Leditzky. Quantum codes from neural networks. New Journal of Physics, 22(2):023005, Feb 2020. doi:10.1088/1367-2630/ab6cdd.
[27] Michael Horodecki, Peter W. Shor, and Mary Beth Ruskai. Entanglement breaking channels. Reviews in Mathematical Physics, 15(06):629–641, 2003. doi:10.1142/S0129055X03001709.
[28] J. Towns, T. Cockerill, M. Dahan, I. Foster, K. Gaither, A. Grimshaw, V. Hazlewood, S. Lathrop, D. Lifka, G. D. Peterson, R. Roskies, J. R. Scott, and N. Wilkins-Diehr. XSEDE: Accelerating scientific discovery. Computing in Science & Engineering, 16(5):62–74, Sept.-Oct. 2014. doi:10.1109/MCSE.2014.80.
[29] Nicholas A. Nystrom, Michael J. Levine, Ralph Z. Roskies, and J. Ray Scott. Bridges: A uniquely flexible HPC resource for new communities and data analytics. In Proceedings of the 2015 XSEDE Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure, XSEDE ’15, pages 30:1–30:8, New York, NY, USA, 2015. ACM. doi:10.1145/2792745.2792775.
