Quantum rate distortion, reverse Shannon theorems, and source-channel separation

Nilanjana Datta, Min-Hsiu Hsieh, and Mark M. Wilde

Abstract: We derive quantum counterparts of two key theorems of classical information theory, namely, the rate distortion theorem and the source-channel separation theorem. The rate-distortion theorem gives the ultimate limits on lossy data compression, and the source-channel separation theorem implies that a two-stage protocol consisting of compression and channel coding is optimal for transmitting a memoryless source over a memoryless channel. In spite of their importance in the classical domain, there has been surprisingly little work in these areas for quantum information theory. In the present paper, we prove that the quantum rate distortion function is given in terms of the regularized entanglement of purification. We also determine a single-letter expression for the entanglement-assisted quantum rate distortion function, and we prove that it serves as a lower bound on the unassisted quantum rate distortion function. This implies that the unassisted quantum rate distortion function is non-negative and generally not equal to the coherent information between the source and distorted output (in spite of Barnum's conjecture that the coherent information would be relevant here). Moreover, we prove several quantum source-channel separation theorems. The strongest of these are in the entanglement-assisted setting, in which we establish a necessary and sufficient condition for transmitting a memoryless source over a memoryless quantum channel up to a given distortion.

Index Terms: quantum rate distortion, reverse Shannon theorem, quantum Shannon theory, quantum data compression, source-channel separation

Nilanjana Datta and Min-Hsiu Hsieh are with the Statistical Laboratory, University of Cambridge, Wilberforce Road, Cambridge CB3 0WB, United Kingdom. The contribution of M.-H. H. was mainly done when he was with the Statistical Laboratory, University of Cambridge. Now he is with Centre for Quantum Computation and Intelligent Systems (QCIS), Faculty of Engineering and Information Technology (FEIT), University of Technology, Sydney (UTS), PO Box 123, Broadway NSW 2007, Australia. Mark M. Wilde is with the School of Computer Science, McGill University, Montréal, Québec, Canada H3A 2A7.

arXiv:1108.4940v3 [quant-ph] 19 Aug 2012

I. INTRODUCTION

Two pillars of classical information theory are Shannon's data compression theorem and his channel capacity theorem [49], [21]. The former gives a fundamental limit to the compressibility of classical information, while the latter determines the ultimate limit on classical communication rates over a noisy classical channel. Modern communication systems exploit these ideas in order to make the best possible use of communication resources.

Data compression is possible due to statistical redundancy in the information emitted by sources, with some signals being emitted more frequently than others. Exploiting this redundancy suitably allows one to compress data without losing essential information. If the data which is recovered after the compression-decompression process is an exact replica of the original data, then the compression is said to be lossless. The simplest example of an information source is a memoryless one. Such a source can be characterized by a random variable $U$ with probability distribution $\{p_U(u)\}$, and each use of the source results in a letter $u$ being emitted with probability $p_U(u)$. Shannon's noiseless coding theorem states that the entropy $H(U) \equiv -\sum_u p_U(u) \log_2 p_U(u)$ of such an information source is the minimum rate at which we can compress signals emitted by it [49], [21].

The requirement of a data compression scheme being lossless is often too stringent a condition, in particular for the case of multimedia data, i.e., audio, video and still images, or in scenarios where insufficient storage space is available. Typically a substantial amount of data can be discarded before the information is sufficiently degraded to be noticeable. A data compression scheme is said to be lossy when the decompressed data is not required to be identical to the original one, but instead recovering a reasonably good approximation of the original data is considered to be good enough.

The theory of lossy data compression, which is also referred to as rate distortion theory, was developed by Shannon [50], [11], [21]. This theory deals with the tradeoff between the rate of data compression and the allowed distortion. Shannon proved that, for a given memoryless information source and a distortion measure, there is a function $R(D)$, called the rate-distortion function, such that, if the maximum allowed distortion is $D$, then the best possible compression rate is given by $R(D)$. He established that this rate-distortion function is equal to the minimum of the mutual information $I(U;\hat{U}) := H(U) + H(\hat{U}) - H(U,\hat{U})$ over all possible stochastic maps $p_{\hat{U}|U}(\hat{u}|u)$ that meet the distortion requirement on average:
\[
R(D) = \min_{p(\hat{u}|u)\,:\,\mathbb{E}\{d(U,\hat{U})\} \le D} I(U;\hat{U}). \tag{1}
\]
In the above, $d(U,\hat{U})$ denotes a suitably chosen distortion measure between the random variable $U$ characterizing the source and the random variable $\hat{U}$ characterizing the output of the stochastic map.

Whenever the distortion $D = 0$, the above rate-distortion function is equal to the entropy of the source. If $D > 0$, then the rate-distortion function is less than the entropy, implying that fewer bits are needed to transmit the source if we allow for some distortion in its reconstruction.

Alongside these developments, Shannon also contributed the theory of reliable communication of classical data over classical channels [49], [21]. His noisy channel coding theorem gives an explicit expression for the capacity of a memoryless classical channel, i.e., the maximum rate of reliable communication through it. A memoryless channel $\mathcal{N}$ is one for which there is no correlation in the noise acting on successive inputs, and it can be modelled by a stochastic map $\mathcal{N} \equiv p_{Y|X}(y|x)$. Shannon proved that the capacity of such a channel is given by
\[
C(\mathcal{N}) = \max_{p_X(x)} I(X;Y).
\]
Any scheme for error correction typically requires the use of redundancy in the transmitted data, so that the receiver can perfectly distinguish the received signals from one another in the limit of many uses of the channel.

Given all of the above results, we might wonder whether it is possible to transmit an information source $U$ reliably over a noisy channel $\mathcal{N}$, such that the output of the information source is recoverable with an error probability that is asymptotically small in the limit of a large number of outputs of the information source and uses of the noisy channel. An immediate corollary of Shannon's noiseless and noisy channel coding theorems is that reliable transmission of the source is possible if the entropy of the source is smaller than the capacity of the channel:
\[
H(U) \le C(\mathcal{N}). \tag{2}
\]
The scheme to demonstrate sufficiency of (2) is for the sender to take the length $n$ output of the information source, compress it down to $nH(U)$ bits, and encode these $nH(U)$ bits into a length $n$ sequence for transmission over the channel. As long as $H(U) \le C(\mathcal{N})$, Shannon's noisy channel coding theorem guarantees that it is possible to transmit the $nH(U)$ bits over the channel reliably such that the receiver can decode them, and Shannon's noiseless coding theorem guarantees that the decoded $nH(U)$ bits can be decompressed reliably as well in order to recover the original length $n$ output of the information source (all of this is in the limit as $n \to \infty$). Given that the condition in (2) is sufficient for reliable communication of the information source, is it also necessary? Shannon's source-channel separation theorem answers this question in the affirmative [49], [21].

The most important implication of the source-channel separation theorem is that we can consider the design of compression codes and channel codes separately: a two-stage encoding method is just as good as any other method, whenever the source and channel are memoryless. Thus we should consider data compression and error correction as independent problems, and try to design the best compression scheme and the best error correction scheme. The source-channel separation theorem guarantees that this two-stage encoding and decoding with the best data compression and error correction codes will be optimal.

Now what if the entropy of the source is greater than the capacity of the channel? Our best hope in this scenario is to allow for some distortion in the output of the source such that the rate of compression is smaller than the entropy of the source. Recall that whenever $D > 0$, the rate-distortion function $R(D)$ is less than the entropy $H(U)$ of the source. In this case, we have a variation of the source-channel separation theorem which states that the condition $R(D) \le C(\mathcal{N})$ is both necessary and sufficient for the reliable transmission of an information source over a noisy channel, up to some amount of distortion $D$ [21]. Thus, we can consider the problems of lossy data compression and channel coding separately, and the two-stage concatenation of the best lossy compression code with the best channel code is optimal.

Considering the importance of all of the above theorems for classical information theory, it is clear that theorems in this spirit would be just as important for quantum information theory. Note, however, that in the quantum domain, there are many different information processing tasks, depending on which type of information we are trying to transmit and which resources are available to assist the transmission. For example, we could transmit classical or quantum data over a quantum channel, and such a transmission might be assisted by entanglement shared between sender and receiver before communication begins.

There have been many important advances in the above directions (some of which are summarized in the recent text [57]). Schumacher proved the noiseless quantum coding theorem, demonstrating that the von Neumann entropy of a quantum information source is the ultimate limit to the compressibility of information emitted by it [45]. Hayashi et al. have also considered many ways to compress quantum information, a summary of which is available in Ref. [30].

Quantum rate distortion theory, that is, the theory of lossy quantum data compression, was introduced by Barnum in 1998. He considered a symbol-wise entanglement fidelity as a distortion measure [4] and, with respect to it, defined the quantum rate distortion function as the minimum rate of data compression, for any given distortion. He derived a lower bound on the quantum rate distortion function, in terms of a well-known entropic quantity, namely the coherent information. The latter can be viewed as one quantum analogue of mutual information, since it is known to characterize the quantum capacity of a channel [38], [52], [23], just as the mutual information characterizes the capacity of a classical channel. It is this analogy, and the fact that the classical rate distortion function is given in terms of the mutual information, that led Barnum to consider the coherent information as a candidate for the rate distortion function in the quantum realm. He also conjectured that this lower bound would be achievable.

Since Barnum's paper, there have been a few papers in which the problem of quantum rate distortion has either been addressed [25], [20], or mentioned in other contexts [60], [31], [40], [39]. However, not much progress has been made in proving or disproving his conjecture. In fact, in the absence of a matching upper bound, it is even unclear how good Barnum's bound is, given that the coherent information can be negative, as was pointed out in [25], [20].

There are also a plethora of results on information transmission over quantum channels. Holevo [32], Schumacher, and Westmoreland [48] provided a characterization of the classical capacity of a quantum channel. Lloyd [38], Shor [52], and Devetak [23] proved that the coherent information of a quantum channel is an achievable rate for quantum communication over that channel, building on prior work of Nielsen and coworkers [47], [46], [6], [5] who showed that its regularization is an upper bound on the quantum capacity (note that the coherent information of a quantum channel is always non-negative because it involves a maximization over all inputs to the channel). Bennett et al. proved that the mutual information of a quantum channel is equal to its entanglement-assisted classical capacity [10] (the capacity whenever the sender and receiver are given a large amount of shared entanglement before communication begins).

In Ref. [10], the authors also introduced the idea of a reverse Shannon theorem, in which a sender and receiver simulate a noisy channel with as few noiseless resources as possible (later papers rigorously proved several quantum reverse Shannon theorems [1], [12], [8]). Although such a task might initially seem unmotivated, they used a particular reverse Shannon theorem to establish a strong converse for the entanglement-assisted classical capacity (see footnote 1). Interestingly, the reverse Shannon theorems can also find application in rate distortion theory [60], [31], [40], [39], and as such, they are relevant for our purposes here.

In this paper, we prove several important quantum rate distortion theorems and quantum source-channel separation theorems. Our first result in quantum rate distortion is a complete characterization of the rate distortion function in an entanglement-assisted setting (see footnote 2). This result really only makes sense in the communication paradigm (and not in a storage setting), where we give the sender and receiver shared entanglement before communication begins, in addition to the uses of the noiseless qubit channel. The idea here is for a sender to exploit the shared entanglement and a minimal amount of classical or quantum communication in order for the receiver to recover the output of the quantum information source up to some distortion. Our main result is a single-letter formula for the entanglement-assisted rate distortion function, expressed in terms of a minimization of the input-output mutual information over all quantum operations that meet the distortion constraint. This result implies that the computation of the entanglement-assisted rate distortion function for any quantum information source is a tractable convex optimization program. It is often the case in quantum Shannon theory that the entanglement-assisted formulas end up being formally analogous to Shannon's classical formulas [10], [28], and our result here is no exception to this trend.

We next consider perhaps the most natural setting for quantum rate distortion, in which a compressor tries to compress a quantum information source so that a decompressor can recover it up to some distortion $D$ (this setting is the same as Barnum's in Ref. [4]). This setting is most natural whenever sufficient quantum storage is not available, but we can equivalently phrase it in a communication paradigm, where a sender has access to many uses of a noiseless qubit channel and would like to minimize the use of this resource while transmitting a quantum information source up to some distortion. We prove that the quantum rate distortion function is given in terms of a regularized entanglement of purification [55] in this case. In spite of our characterization being an intractable, regularized formula, our result at the very least shows that the quantum rate distortion function is always non-negative, demonstrating that Barnum's conjecture from Ref. [4] does not hold since his proposed rate-distortion function can become negative. Furthermore, we prove that the entanglement-assisted quantum rate distortion function is a single-letter lower bound on the unassisted quantum rate distortion function (one might suspect that this should hold because additional resources such as shared entanglement should only be able to improve compression rates). This bound implies that the coherent information between the source and distorted output is not relevant for unassisted quantum rate distortion, in spite of Barnum's conjecture that it would be.

We finally prove three source-channel separation theorems that apply to the transmission of a classical source over a quantum channel, the transmission of a quantum source over a quantum channel, and the transmission of a quantum source over an entanglement-assisted quantum channel, respectively. The first two source-channel separation theorems are single-letter, in the sense that they do not involve any regularised quantities, whenever the Holevo capacity or the coherent information of the channel are additive, respectively. The third theorem is single-letter in all cases because the entanglement-assisted quantum capacity is given by a single-letter expression for all quantum channels [2], [10]. We also prove a related set of source-channel separation theorems that allow for some distortion in the reconstruction of the output of the information source. From these theorems we infer that it is best to search for the best quantum data compression protocols [16], [13], [9], [3], [42], [43], the best quantum error-correcting codes [51], [19], [18], [41], [44], [37], and the best entanglement-assisted quantum error-correcting codes [17], [33], [36], [58] independently of each other whenever the source and channel are memoryless. The theorems then guarantee that combining these protocols in a two-stage encoding and decoding is optimal.

We structure this paper as follows. We first overview relevant notation and definitions in the next section. Section III introduces the information processing task relevant for quantum rate distortion and then presents all of our quantum rate distortion results in detail. Section IV presents our various quantum source-channel separation theorems for memoryless sources and channels. Finally, we conclude in Section V and discuss important open questions.

Footnote 1: A strong converse demonstrates that the error probability asymptotically approaches one if the rate of communication is larger than capacity. This is in contrast to a weak converse, which only demonstrates that the error probability is bounded away from zero under the same conditions.

Footnote 2: One might consider these entanglement-assisted rate distortion results to be part of the "quantum reverse Shannon theorem folklore," but Ref. [8] does not specifically discuss this topic.

II. NOTATION AND DEFINITIONS

Let $\mathcal{H}$ denote a finite-dimensional Hilbert space and let $\mathcal{D}(\mathcal{H})$ denote the set of density matrices or states (i.e., positive operators of unit trace) acting on $\mathcal{H}$. Let $\rho_A \in \mathcal{D}(\mathcal{H}_A)$ denote the state characterizing a memoryless quantum information source, the subscript $A$ being used to denote the underlying quantum system. We refer to it as the source state. Let $|\psi^\rho_{RA}\rangle \in \mathcal{H}_R \otimes \mathcal{H}_A$ denote its purification, that is,
\[
\psi^\rho_{RA} = |\psi^\rho_{RA}\rangle\langle\psi^\rho_{RA}|
\]
is a pure state density matrix of a larger composite system $RA$, such that its restriction on the system $A$ is given by $\rho_A$, i.e., $\rho_A := \mathrm{Tr}_R\,\psi^\rho_{RA}$, with $\mathrm{Tr}_R$ denoting the partial trace over the Hilbert space $\mathcal{H}_R$ of a purifying reference system $R$. The pure state $|\psi^\rho_{RA}\rangle$ is entangled if $\rho$ is a mixed state. The von Neumann entropy of $\rho_A$, and hence of the source, is defined as
\[
H(A)_\rho \equiv -\mathrm{Tr}\{\rho \log \rho\}. \tag{3}
\]
The quantum mutual information of a bipartite state $\omega_{AB}$ is defined as
\[
I(A;B)_\omega \equiv H(A)_\omega + H(B)_\omega - H(AB)_\omega.
\]
The coherent information $I(A\rangle B)_\sigma$ of a bipartite state $\sigma_{AB}$ is defined as follows:
\[
I(A\rangle B)_\sigma := H(B)_\sigma - H(AB)_\sigma. \tag{4}
\]

In quantum information theory, the most general mathematical description of any allowed physical operation is given by a completely positive trace-preserving (CPTP) map, which is a map between states. We let $\mathrm{id}_A$ denote the trivial (or identity) CPTP map which keeps the state of a quantum system $A$ unchanged, and we let $\mathcal{N} \equiv \mathcal{N}^{A\to B}$ denote the CPTP map
\[
\mathcal{N}^{A\to B} : \mathcal{D}(\mathcal{H}_A) \mapsto \mathcal{D}(\mathcal{H}_B).
\]

The entanglement of purification of a bipartite state $\omega_{AB}$ is a measure of correlations [55], having an operational interpretation as the entanglement cost of creating $\omega_{AB}$ asymptotically from ebits, while consuming a negligible amount of classical communication. It is equivalent to the following expression:
\[
E_p(\omega_{AB}) \equiv \min_{\mathcal{N}_E} H\big((\mathrm{id}_B \otimes \mathcal{N}_E)(\mu_{BE}(\omega))\big),
\]
where $\mu_{BE}(\omega) = \mathrm{Tr}_A\{\phi^\omega_{ABE}\}$, $\phi^\omega_{ABE}$ is some purification of $\omega_{AB}$, and the minimization is over all CPTP maps $\mathcal{N}_E$ acting on the system $E$. (The original definition in Ref. [55] is different from the above, but one can check that the definition given here is equivalent to the one given there.)

In this paper we make use of resource inequalities (see e.g., [26]) to express information-processing tasks as inter-conversions between resources. Let $[c \to c]$ denote one forward use of a noiseless classical bit channel, $[q \to q]$ one forward use of a noiseless qubit channel, and $[qq]$ one ebit of shared entanglement (a Bell state). A simple example of a resource inequality is entanglement distribution:
\[
[q \to q] \ge [qq],
\]
meaning that Alice can consume one noiseless qubit channel in order to generate one ebit between her and Bob. Teleportation is a more interesting way in which all three resources interact [7]:
\[
2[c \to c] + [qq] \ge [q \to q].
\]
The above resource inequalities are finite and exact, but we can also express quantum Shannon theoretic protocols as resource inequalities. For example, the resource inequality for the protocol achieving the entanglement-assisted classical capacity of a quantum channel is as follows:
\[
\langle \mathcal{N} \rangle + H(A)\,[qq] \ge I(A;B)\,[c \to c].
\]
The meaning of the above resource inequality is that there exists a protocol exploiting $n$ uses of a memoryless quantum channel $\mathcal{N}$ and $nH(A)$ ebits in order to transmit $nI(A;B)$ classical bits from sender to receiver. The resource inequality becomes exact in the asymptotic limit $n \to \infty$ because it is possible to show that the probability of an error in decoding these classical bits approaches zero as $n \to \infty$ [10].

III. QUANTUM RATE-DISTORTION

A. The Information Processing Task

The objective of any quantum rate distortion protocol is to compress a quantum information source such that the decompressor can reconstruct the original state up to some distortion. Like Barnum [4], we consider the following distortion measure $d(\rho,\mathcal{N})$ for a state $\rho_A \in \mathcal{D}(\mathcal{H}_A)$ with purification $|\psi^\rho_{RA}\rangle$ and a quantum operation $\mathcal{N} \equiv \mathcal{N}^{A\to B}$:
\[
d(\rho,\mathcal{N}) = 1 - F_e(\rho,\mathcal{N}), \tag{5}
\]
where $F_e$ is the entanglement fidelity of the map $\mathcal{N}$:
\[
F_e(\rho,\mathcal{N}) \equiv \langle\psi^\rho_{RA}|(\mathrm{id}_R \otimes \mathcal{N}^{A\to B})(\psi^\rho_{RA})|\psi^\rho_{RA}\rangle. \tag{6}
\]
The entanglement fidelity is not only a natural distortion measure, but it also possesses several analytical properties which prove useful in our analysis.

The state $\rho^n := (\rho_A)^{\otimes n} \in \mathcal{D}(\mathcal{H}_A^{\otimes n})$ characterizes $n$ successive outputs of a memoryless quantum information source. A source coding (or compression-decompression) scheme of rate $R$ is defined by a block code, which consists of two quantum operations: the encoding and decoding maps. The encoding $\mathcal{E}_n$ is a map from $n$ copies of the source space to a subspace $\tilde{\mathcal{H}}_{Q^n} \subset \mathcal{H}_A^{\otimes n}$ of dimension $2^{nR}$:
\[
\mathcal{E}_n : \mathcal{D}(\mathcal{H}_A^{\otimes n}) \to \mathcal{D}(\tilde{\mathcal{H}}_{Q^n}),
\]
and the decoding $\mathcal{D}_n$ is a map from the compressed subspace to an output Hilbert space $\mathcal{H}_A^{\otimes n}$:
\[
\mathcal{D}_n : \mathcal{D}(\tilde{\mathcal{H}}_{Q^n}) \to \mathcal{D}(\mathcal{H}_A^{\otimes n}).
\]
The average distortion resulting from this compression-decompression scheme is defined as [4]:
\[
d(\rho,\mathcal{D}_n \circ \mathcal{E}_n) \equiv \sum_{i=1}^{n} \frac{1}{n}\, d(\rho,\mathcal{F}_n^{(i)}),
\]
where $\mathcal{F}_n^{(i)}$ is the "marginal operation" on the $i$-th copy of the source space induced by the overall operation $\mathcal{F}_n \equiv \mathcal{D}_n \circ \mathcal{E}_n$, and is defined as
\[
\mathcal{F}_n^{(i)}(\rho) \equiv \mathrm{Tr}_{A_1,A_2,\cdots,A_{i-1},A_{i+1},\cdots,A_n}[\mathcal{F}_n(\rho^{\otimes n})]. \tag{7}
\]
The quantum operations $\mathcal{E}_n$ and $\mathcal{D}_n$ define an $(n,R)$ quantum rate distortion code.

For any $R, D \ge 0$, the pair $(R,D)$ is said to be an achievable rate distortion pair if there exists a sequence of $(n,R)$ quantum rate distortion codes $(\mathcal{E}_n,\mathcal{D}_n)$ such that
\[
\lim_{n\to\infty} d(\rho,\mathcal{D}_n \circ \mathcal{E}_n) \le D. \tag{8}
\]
The quantum rate distortion function is then defined as
\[
R^q(D) = \inf\{R : (R,D) \text{ is achievable}\}.
\]

In the communication model, if the sender and receiver have unlimited prior shared entanglement at their disposal, then the corresponding quantum rate distortion function is denoted as $R^q_{eac}(D)$ or $R^q_{eaq}(D)$, depending on whether the noiseless channel between the sender and the receiver is classical or quantum. Figure 1 depicts the most general protocols for unassisted and assisted quantum rate distortion coding.

[Figure 1: two panels, (a) and (b). Each shows a Reference system $R$, Alice applying the encoding $\mathcal{E}$ to system $A$, and Bob applying the decoding $\mathcal{D}$ to produce system $B$. In panel (b), Alice and Bob additionally hold the shares $T_A$ and $T_B$ of an entangled state.]

Fig. 1. The most general protocols for (a) unassisted and (b) assisted quantum rate distortion coding. In (a), Alice acts on the tensor power output of the quantum information source with a compression encoding $\mathcal{E}$. She sends the compressed qubits over noiseless quantum channels (labeled by "id") to Bob, who then performs a decompression map $\mathcal{D}$ to recover the quantum data that Alice sent. In (b), the task is similar, though this time we assume that Alice and Bob share entanglement before communication begins.

B. Reverse Shannon Theorems and Quantum Rate-Distortion Coding

Before we begin with our main results, we first prove Lemma 1 below. This lemma is similar in spirit to Lemma 26 of Ref. [39] and Theorem 19 of Ref. [60], and like them, it shows that to generate a rate-distortion code, it suffices to simulate the action of a noisy channel on a source state such that the resulting output state meets the desired distortion criterion. Unlike them, however, it is specifically tailored to the entanglement fidelity distortion measure.

Lemma 1: Fix $\varepsilon > 0$ and $0 \le D < 1$. Consider a state $\rho_A$ with purification $|\psi^\rho_{RA}\rangle$ and a quantum channel $\mathcal{N} \equiv \mathcal{N}^{A\to B}$ for which $d(\rho,\mathcal{N}) \le D$. Let
\[
\omega_{RB} := (\mathrm{id}_R \otimes \mathcal{N})(\psi^\rho_{RA}).
\]
Furthermore, let $\{\mathcal{F}_n\}_n$ denote a sequence of quantum operations such that for $n$ large enough,
\[
\left\| \sigma_{R^nB^n} - \omega_{RB}^{\otimes n} \right\|_1 \le \varepsilon, \tag{9}
\]
where
\[
\sigma_{R^nB^n} := (\mathrm{id}_{R^n} \otimes \mathcal{F}_n)\big((\psi^\rho_{RA})^{\otimes n}\big).
\]
Then for $n$ large enough, the average distortion under the quantum operation $\mathcal{F}_n$ satisfies the bound
\[
d(\rho,\mathcal{F}_n) \le D + \varepsilon.
\]

Proof: Expressing $R^n = R_1R_2\cdots R_n$ and $B^n = B_1B_2\cdots B_n$, we have for any $1 \le i \le n$,
\[
\sigma_{R_iB_i} = (\mathrm{id}_R \otimes \mathcal{F}_n^{(i)})(\psi^\rho_{RA}). \tag{10}
\]
By monotonicity of the trace distance under partial trace, we have that
\[
\left\| \sigma_{R_iB_i} - \omega_{RB} \right\|_1 \le \left\| \sigma_{R^nB^n} - \omega_{RB}^{\otimes n} \right\|_1. \tag{11}
\]
Hence, the average distortion under the quantum operation $\mathcal{F}_n$ is given by
\[
\begin{aligned}
d(\rho,\mathcal{F}_n) &= \frac{1}{n}\sum_{i=1}^{n}\left(1 - F_e(\rho,\mathcal{F}_n^{(i)})\right) \\
&= \frac{1}{n}\sum_{i=1}^{n}\left(1 - \langle\psi^\rho_{RA}|\sigma_{R_iB_i}|\psi^\rho_{RA}\rangle\right).
\end{aligned} \tag{12}
\]
Recall the following inequality from Ref. [15]:
\[
\mathrm{Tr}\,P(A-B) \ge \mathrm{Tr}(A-B)_-, \tag{13}
\]
where $0 \le P \le I$ is any positive operator and $(A-B)_-$ denotes the negative spectral part of the operator $(A-B)$. We then have the following inequalities:
\[
\begin{aligned}
\langle\psi^\rho_{RA}|\sigma_{R_iB_i}|\psi^\rho_{RA}\rangle
&= \langle\psi^\rho_{RA}|\omega_{RB}|\psi^\rho_{RA}\rangle + \mathrm{Tr}\big(\psi^\rho_{RA}(\sigma_{R_iB_i} - \omega_{RB})\big) \\
&\ge F_e(\rho,\mathcal{N}) + \mathrm{Tr}(\sigma_{R_iB_i} - \omega_{RB})_-,
\end{aligned} \tag{14}
\]
where the inequality follows from (13) and the definition of entanglement fidelity:
\[
\langle\psi^\rho_{RA}|\omega_{RB}|\psi^\rho_{RA}\rangle = F_e(\rho,\mathcal{N}).
\]
Hence, from (12), (14) and (11), we have
\[
\begin{aligned}
d(\rho,\mathcal{F}_n)
&\le \frac{1}{n}\sum_{i=1}^{n}\left[1 - F_e(\rho,\mathcal{N}) - \mathrm{Tr}(\sigma_{R_iB_i} - \omega_{RB})_-\right] \\
&\le \frac{1}{n}\sum_{i=1}^{n}\left[1 - F_e(\rho,\mathcal{N}) + \left\|\sigma_{R_iB_i} - \omega_{RB}\right\|_1\right] \\
&\le d(\rho,\mathcal{N}) + \left\|\sigma_{R^nB^n} - \omega_{RB}^{\otimes n}\right\|_1 \\
&\le D + \varepsilon,
\end{aligned} \tag{15}
\]
which concludes the proof of the lemma.

The above lemma illustrates a fundamental connection between quantum reverse Shannon theorems and quantum rate-distortion protocols. In particular, if a reverse Shannon theorem is available in a given context, then it immediately leads to a rate-distortion protocol. This is done simply by choosing the simulated channel to be the one which, when acting on the source state, yields an output state which meets the distortion criterion for the desired rate-distortion task. This is our approach in all of the quantum rate-distortion theorems that follow, and it was also the approach in Refs. [25], [60], [39].

There is, however, one caveat with the above approach. The reverse Shannon theorems often require extra correlated resources such as shared randomness or shared entanglement [10], [1], [8], [12], and the demands of a reverse Shannon theorem are much more stringent than those of a rate-distortion protocol. A reverse Shannon theorem requires the simulation of a channel to be asymptotically exact, whereas a rate-distortion protocol only demands that a source be reconstructed up to some average distortion constraint. The differences in these goals can impact resulting rates if sufficient correlated resources are not available [22].

In the entanglement-assisted setting considered in the next subsection, the assumption is that an unlimited supply of entanglement is available, and thus the entanglement-assisted quantum reverse Shannon theorem suffices for producing a good entanglement-assisted rate-distortion protocol. In the unassisted setting, no correlation is available, and exploiting the unassisted reverse Shannon theorem leads to rates that are possibly larger than necessary for the task of quantum rate distortion. Nevertheless, we still employ this approach and discuss the ramifications further in the forthcoming subsections.

C. Entanglement-Assisted Rate-Distortion Coding

1) Rate-Distortion with noiseless classical communication: The quantum rate distortion function, $R^q_{eac}(D)$, for entanglement-assisted lossy source coding with noiseless classical communication, is given by the following theorem.

Theorem 2: For a memoryless quantum information source defined by the density matrix $\rho_{A'}$, with a purification $|\psi^\rho_{AA'}\rangle$, and any given distortion $0 \le D < 1$, the quantum rate distortion function for entanglement-assisted lossy source coding with noiseless classical communication is given by
\[
R^q_{eac}(D) = \min_{\mathcal{N}\,:\,d(\rho,\mathcal{N}) \le D} I(A;B)_\omega, \tag{16}
\]
where $\mathcal{N} \equiv \mathcal{N}^{A'\to B}$ denotes a CPTP map,
\[
\omega_{AB} \equiv (\mathrm{id}_A \otimes \mathcal{N}^{A'\to B})(\psi^\rho_{AA'}),
\]
and $I(A;B)_\omega$ denotes the mutual information.

Proof: We first prove the converse (optimality). Consider the most general protocol for entanglement-assisted lossy source coding that acts on many copies ($\rho^{\otimes n}$) of the state $\rho \in \mathcal{D}(\mathcal{H}_A)$ (depicted in Figure 1(b)). We take a purification of $\rho$ as $|\psi^\rho_{RA}\rangle$. Let $\Phi_{T_AT_B}$ denote an entangled state, with the system $T_A$ being with Alice and the system $T_B$ being with Bob. Alice then acts on the state $\rho^{\otimes n}$ and her share $T_A$ of the entangled state with a compression map $\mathcal{E}_n \equiv \mathcal{E}_n^{A^nT_A\to W}$, where $W$ is a classical system of size $\approx 2^{nr}$, with $r$ being the rate of compression (in Figure 1(b), $W$ corresponds to the outputs of the noiseless quantum channels). Then Bob acts on both the classical system $W$ that he receives and his share $T_B$ of the entangled state with the decoding map $\mathcal{D}_n \equiv \mathcal{D}_n^{WT_B\to B^n}$. The final state should be such that it is distorted by at most $D$ according to the average distortion criterion in the limit $n \to \infty$ (8). With these steps in mind, consider the following chain of inequalities:
\[
\begin{aligned}
nr &\ge H(W) \\
&\ge H(W|T_B) \\
&\ge H(W|T_B) - H(W|R^nT_B) \\
&= I(W;R^n|T_B) \\
&= I(W;R^n|T_B) + I(R^n;T_B) \\
&= I(WT_B;R^n) \\
&\ge I(B^n;R^n).
\end{aligned}
\]
The first inequality follows because the entropy $nr$ of the uniform distribution is the largest that the entropy $H(W)$ can be. The second inequality follows because conditioning cannot increase entropy. The third inequality follows because $H(W|R^nT_B) \ge 0$ from the assumption that $W$ is classical. The first equality follows from the definition of mutual information, and the second equality follows from the fact that $R^n$ and $T_B$ are in a product state. The third equality is the chain rule for quantum mutual information. The final inequality is from quantum data processing. Continuing, we have
\[
\begin{aligned}
I(B^n;R^n) &\ge \sum_{i=1}^{n} I(B_i;R_i) \\
&\ge \sum_{i=1}^{n} R^q_{eac}\big(d(\rho,\mathcal{F}_n^{(i)})\big) \\
&= n\sum_{i=1}^{n} \frac{1}{n} R^q_{eac}\big(d(\rho,\mathcal{F}_n^{(i)})\big) \\
&\ge n R^q_{eac}\left(\sum_{i=1}^{n} \frac{1}{n}\, d(\rho,\mathcal{F}_n^{(i)})\right) \\
&\ge n R^q_{eac}(D).
\end{aligned} \tag{17}
\]
In the above, $\mathcal{F}_n^{(i)}$ is the marginal operation on the $i$-th copy of the source space induced by the overall operation $\mathcal{F}_n \equiv \mathcal{D}_n \circ \mathcal{E}_n$, and is given by (7). The first inequality follows from superadditivity of quantum mutual information (see Lemma 15 in the appendix). The second inequality follows from the fact that the map $\mathcal{D}_i \circ \mathcal{E}_i$ has distortion $d(\rho,\mathcal{D}_i \circ \mathcal{E}_i)$ and the information rate-distortion function is the minimum of the mutual information over all maps with this distortion. The last two inequalities follow from convexity of the quantum rate-distortion function $R^q_{eac}(D)$ (see Lemma 14 in the appendix), from the assumption that the average distortion of the protocol is no larger than the amount allowed:
\[
\sum_{i=1}^{n} \frac{1}{n}\, d(\rho,\mathcal{D}_i \circ \mathcal{E}_i) \le D,
\]

and from the fact that $R^q_{eac}(D)$ is non-increasing as a function of $D$ (see Lemma 14 in the appendix).

The direct part of Theorem 2 follows from the quantum reverse Shannon theorem, which states that it is possible to simulate (asymptotically perfectly) the action of a quantum channel $\mathcal{N}$ on an arbitrary state $\rho$, by exploiting noiseless classical communication and prior shared entanglement between a sender and receiver [10], [1], [8], [12]. The resource inequality for this protocol is
\[ I(A;B)_\omega \,[c \to c] + H(B)_\omega \,[qq] \geq \langle \mathcal{N} : \rho \rangle, \tag{18} \]
where the entropies are with respect to a state of the following form:
\[ |\omega_{ABE}\rangle \equiv U_{\mathcal{N}}^{A' \to BE} |\psi^\rho_{AA'}\rangle, \]
$|\psi^\rho_{AA'}\rangle$ is a purification of $\rho$, and $U_{\mathcal{N}}^{A' \to BE}$ is an isometric extension of the channel $\mathcal{N}^{A' \to B}$. Our protocol simply exploits this theorem. More specifically, for a given distortion $D$, we take $\mathcal{N}$ to be the CPTP map which achieves the minimum in the expression (16) of $R^q_{eac}(D)$. Then we exploit classical communication at the rate given in the resource inequality (18) to simulate the action of the channel $\mathcal{N}$ on the source state $\rho$. For any arbitrarily small $\varepsilon > 0$ and $n$ large enough, the protocol for the quantum reverse Shannon theorem simulates the action of the channel up to the constant $\varepsilon$ (in the sense of (9)). This allows us to invoke Lemma 1 to show that the resulting average distortion is no larger than $D + \varepsilon$.

The main reason that we can use the quantum reverse Shannon theorem as a “black box” for the purpose of quantum rate distortion is from our assumption of unlimited shared entanglement. It is likely that this protocol uses much more entanglement than necessary for the purpose of entanglement-assisted quantum rate distortion coding with classical channels, and it should be worthwhile to study the trade-off between classical communication and entanglement consumption in more detail, as previous authors have done in the context of channel coding [53], [34], [35], [59]. Such a study might lead to a better protocol for entanglement-assisted rate distortion coding and might further illuminate better protocols for other quantum rate distortion tasks.

We think that our protocol exploits more entanglement than necessary from considering what is known in the classical case regarding reverse Shannon theorems and rate-distortion coding [21], [10], [22]. First, as reviewed in (1), the classical mutual information minimized over all stochastic maps that meet the distortion criterion is equal to Shannon’s classical rate-distortion function [21]. Bennett et al. have shown that the classical mutual information is also equal to the minimum rate needed to simulate a classical channel whenever free common randomness is available [10]. Thus, a simple strategy for achieving the task of rate distortion is for the parties to choose the stochastic map that minimizes the rate distortion function and simulate it with the classical reverse Shannon theorem. But this strategy uses far more classical bits than necessary whenever sufficient common randomness is not available [22]. Meanwhile, we already know that the mutual information is achievable without any common randomness if the goal is rate distortion [21].

2) Rate-Distortion with noiseless quantum communication: The quantum rate distortion function, $R^q_{eaq}(D)$, for entanglement-assisted lossy source coding with noiseless quantum communication, is given by the following theorem.

Theorem 3: For a memoryless quantum information source defined by the density matrix $\rho_{A'}$, with a purification $|\psi^\rho_{AA'}\rangle$, and any given distortion $0 \leq D < 1$, the quantum rate distortion function for entanglement-assisted lossy source coding with noiseless quantum communication, is given by
\[ R^q_{eaq}(D) = \frac{1}{2} \min_{\mathcal{N} \,:\, d(\rho,\mathcal{N}) \leq D} I(A;B)_\omega, \tag{19} \]
where $\mathcal{N} \equiv \mathcal{N}^{A' \to B}$ denotes a CPTP map,
\[ \omega_{AB} \equiv (\mathrm{id}_A \otimes \mathcal{N}^{A' \to B})(\psi^\rho_{AA'}), \tag{20} \]
and $I(A;B)_\omega$ denotes its mutual information.

Proof: We first prove the converse (optimality). The setup is similar to that in the converse proof of Theorem 2, with the exception that $W$ is now a quantum system and we let $E$ denote the environment of the compressor. Consider the following chain of inequalities:
\begin{align*}
2nr &\geq 2H(W) \\
&= H(W) + H(R^n T_B E) \\
&\geq H(W) + H(R^n T_B E) - H(W R^n T_B E) \\
&= I(W; R^n T_B E) \\
&\geq I(W; R^n T_B) \\
&= I(W T_B; R^n) + I(W; T_B) - I(R^n; T_B) \\
&= I(W T_B; R^n) + I(W; T_B) \\
&\geq I(W T_B; R^n) \\
&\geq I(B^n; R^n). \tag{21}
\end{align*}
The first inequality is because the entropy $nr$ of the uniform distribution is the largest that the entropy $H(W)$ can be. The first equality follows from the fact that the state on systems $W R^n T_B E$ is pure. The second inequality follows by subtracting the positive quantity $H(W R^n T_B E)$. The second equality is from the definition of quantum mutual information. The third inequality is from quantum data processing (tracing over system $E$). The third equality is a useful identity for quantum mutual information. The fourth equality follows from $I(R^n; T_B) = 0$ since $R^n$ and $T_B$ are in a product state. The second-to-last inequality is from $I(W; T_B) \geq 0$, and the final inequality is from the quantum data processing inequality. The rest of the proof proceeds as in (17).

The direct part follows from a variant of the quantum reverse Shannon theorem known as the fully quantum reverse Shannon theorem (FQRS) [1], [24]. This theorem states that it is possible to simulate (asymptotically perfectly) the action of a channel $\mathcal{N}$ on an arbitrary state $\rho$, by exploiting noiseless quantum communication and prior shared entanglement between a sender and receiver. It has the following resource inequality:
\[ \frac{1}{2} I(A;B)_\omega \,[q \to q] + \frac{1}{2} I(B;E)_\omega \,[qq] \geq \langle \mathcal{N} : \rho \rangle, \tag{22} \]

where the entropies are with respect to a state of the following form:
\[ |\omega_{ABE}\rangle \equiv U_{\mathcal{N}}^{A' \to BE} |\psi^\rho_{AA'}\rangle, \tag{23} \]
$|\psi^\rho_{AA'}\rangle$ is a purification of $\rho$, and $U_{\mathcal{N}}^{A' \to BE}$ is an isometric extension of the channel $\mathcal{N}^{A' \to B}$. Our protocol exploits this theorem as follows. For a given distortion $D$, take $\mathcal{N}$ to be the map which realizes the minimum in the expression (19) of $R^q_{eaq}(D)$. Then we exploit quantum communication at the rate given in the resource inequality (22) to simulate the action of the channel $\mathcal{N}$ on the source state $\rho$. For any arbitrarily small $\varepsilon > 0$ and $n$ large enough, the protocol for the fully quantum reverse Shannon theorem simulates the action of the channel up to the constant $\varepsilon$ (in the sense of (9)). This allows us to invoke Lemma 1 to show that the resulting average distortion is no larger than $D + \varepsilon$.

We could have determined that the form of the entanglement-assisted quantum rate distortion function $R^q_{eaq}(D)$ in Theorem 3 follows easily from Theorem 2 by combining with teleportation. However, the above proof serves an important alternate purpose. A careful inspection of it reveals that the steps detailed in (21) for bounding the quantum communication rate still hold even if the system $T_B$ is trivial (in the case where there is no shared entanglement between the sender and receiver before communication begins). Thus, we obtain as a corollary that the entanglement-assisted quantum rate distortion function is a single-letter lower bound on the unassisted quantum rate distortion function. This makes sense operationally as well because the additional resource of shared entanglement should only be able to improve a rate distortion protocol.

Corollary 4: The entanglement-assisted quantum rate distortion function $R^q_{eaq}(D)$ in Theorem 3 bounds the unassisted quantum rate distortion function $R^q(D)$ from below:
\[ R^q(D) \geq R^q_{eaq}(D). \]

The above corollary firmly asserts that the coherent information $I(A \rangle B)$ of the state in (20) is not relevant for quantum rate distortion, in spite of Barnum’s conjecture that it would play a role [4]. That is, one might think that there should be some simple fix of Barnum’s conjecture, say, by conjecturing that the quantum rate distortion function would instead be $\max\{0, I(A \rangle B)\}$. The above lower bound asserts that this cannot be the case because half the mutual information is never smaller than the coherent information:
\[ \frac{1}{2} I(A;B) \geq \frac{1}{2} I(A;B) - \frac{1}{2} I(A;E) = I(A \rangle B). \]

D. Unassisted Quantum Rate-Distortion Coding

The quantum rate distortion function $R^q(D)$ for unassisted lossy source coding is given by the following theorem.

Theorem 5: For a memoryless quantum information source defined by the density matrix $\rho_A$, and any given distortion $0 \leq D < 1$, the quantum rate distortion function is given by
\[ R^q(D) = \lim_{k \to \infty} \frac{1}{k} \min_{\mathcal{N}^{(k)} \,:\, d(\rho^{\otimes k}, \mathcal{N}^{(k)}) \leq D} \left[ E_p(\rho^{\otimes k}, \mathcal{N}^{(k)}) \right], \tag{24} \]
where $\mathcal{N}^{(k)} : \mathcal{D}(\mathcal{H}_A^{\otimes k}) \to \mathcal{D}(\mathcal{H}_B^{\otimes k})$ is a CPTP map, and
\[ E_p(\rho, \mathcal{N}) \equiv E_p(\omega_{RB}) \tag{25} \]
denotes the entanglement of purification, with
\[ \omega_{RB} \equiv (\mathrm{id}_R \otimes \mathcal{N}^{A \to B})(\psi^\rho_{RA}). \tag{26} \]

Like its classical counterpart, lossy data compression includes lossless compression as a special case. If the distortion $D$ is set equal to zero in (24), then the state $\omega_{RB}$ becomes identical to the state $\psi^\rho_{RA}$. Equivalently, the quantum operation $\mathcal{N}$ is given by the identity map $\mathrm{id}_A$. Since the entanglement of purification is additive for tensor power states [55]:
\[ E_p\big((\psi^\rho_{RA})^{\otimes n}\big) = n E_p(\psi^\rho_{RA}) = n S(\rho_A), \]
we infer that, for $D = 0$, $R^q(D)$ reduces to the von Neumann entropy of the source, which is known to be the optimal rate for lossless quantum data compression [45].

To prove the achievability part of Theorem 5, we can simply exploit Schumacher compression [45] (which is a special type of reverse Shannon theorem). Alice feeds each output $A$ of the source into a CPTP map $\mathcal{N}$ that saturates the bound in (24) (for now, we do not consider the limit and set $k = 1$). This leads to a state of the form in (26), to which Alice can then apply Schumacher compression. This protocol is equivalent to the following resource inequality:
\[ H(B)_\omega \,[q \to q] \geq \langle \mathcal{N} : \rho \rangle. \tag{27} \]
We note that this is a simple form of an unassisted quantum reverse Shannon theorem.

Now, a subtle detail of the simulation idea is that we are interested in simulating the channel $\mathcal{N}^{A \to B}$ from Alice to Bob, and Alice can actually simulate an isometric extension $U_{\mathcal{N}}^{A \to BE}$ of the channel where Alice receives the system $E$ and just traces over it. Instead of simulating $U_{\mathcal{N}}^{A \to BE}$, however, we could consider Alice to simulate the isometry $U_{\mathcal{N}}^{A \to B E_B E_A}$ locally, Schumacher compressing the subsystems $B$ and $E_B$ so that Bob can recover them, while the subsystem $E_A$ remains with Alice. This leads to the following protocol for unassisted simulation:
\[ H(B E_B)_\omega \,[q \to q] \geq \langle \mathcal{N} : \rho \rangle. \]
The best protocol for unassisted channel simulation is therefore the one with the minimum rate of quantum communication, the minimum being taken over all possible isometries $V : E \to E_A E_B$. This rate can be no larger than the rate of quantum communication required for the original naive protocol in (27) since the latter is a special case in the minimization. This is the form of the unassisted quantum reverse Shannon theorem given in Ref. [8] and is related to a protocol considered by Hayashi [29].

One could then execute the above protocol by blocking $k$ of the states together and by having the distortion channel be of the form $\mathcal{N}^{(k)} : A^k \to B^{(k)}$, acting on each block of $k$ states. By letting $k$ become large, such a protocol leads to the following rate for unassisted communication:
\[ Q_{\min}(\rho, \mathcal{N}) = \lim_{k \to \infty} \frac{1}{k} \min_{V \,:\, E^{(k)} \to E_A E_B} H\big(B^{(k)} E_B\big). \tag{28} \]

The above quantity is equal to the entanglement of purification of the state $(\mathrm{id}_R \otimes \mathcal{N}^{A^k \to B^{(k)}})((\psi^\rho_{RA})^{\otimes k})$ [29], [8]:
\begin{align*}
\lim_{k \to \infty} \frac{1}{k} \min_{V \,:\, E^{(k)} \to E_A E_B} H\big(B^{(k)} E_B\big)
&= \lim_{k \to \infty} \frac{1}{k} \min_{\Lambda^{E^{(k)} \to E_B}} H\big(\Lambda^{E^{(k)} \to E_B}\big(U_{\mathcal{N}}^{A^k \to B^{(k)} E^{(k)}}(\rho_A^{\otimes k})\big)\big) \\
&= \lim_{k \to \infty} \frac{1}{k} E_p\big((\mathrm{id}_{R^k} \otimes \mathcal{N}^{A^k \to B^{(k)}})((\psi^\rho_{RA})^{\otimes k})\big).
\end{align*}
We are now in a position to prove Theorem 5.

Proof of Theorem 5: Fix the map $\mathcal{N}$ such that the minimization on the RHS of (24) is achieved. The quantum reverse Shannon theorem (in this case, Schumacher compression) states that it is possible to simulate such a channel $\mathcal{N}$ acting on $\rho$ with the amount of quantum communication equal to $E_p(\omega_{RB})$. Since the protocol simulates the channel up to some arbitrarily small positive $\varepsilon$, the distortion is no larger than $D + \varepsilon$ by invoking Lemma 1. This establishes that the rate $E_p(\omega_{RB})$ is achievable, that is, $R^q(D) \leq E_p(\omega_{RB})$. We can have a regularization as above to obtain the expression in the statement of the theorem.

The converse part of the theorem can be proved as follows. Figure 1(a) depicts the most general protocol for unassisted quantum rate-distortion coding. Let $E_1$ denote the environment of the encoder, and let $E_2$ denote the environment of the decoder, while $W$ again denotes the outputs of the noiseless quantum channels labeled by “id.” For any rate distortion code $(\mathcal{E}^{(n)}, \mathcal{D}^{(n)})$ of rate $r$ satisfying $d(\rho, \mathcal{D}^{(n)} \circ \mathcal{E}^{(n)}) \leq D$, we have
\begin{align*}
nr &\geq H(W) \\
&= H(E_2 B^n)_\omega \\
&\geq \min_{\Lambda_{E_1 E_2}} H\big((\mathrm{id}_{B^n} \otimes \Lambda_{E_1 E_2})(\omega_{B^n E_1 E_2})\big) \\
&= E_p\big((\mathrm{id}_{R^n} \otimes (\mathcal{D}^{(n)} \circ \mathcal{E}^{(n)}))((\psi^\rho_{RA})^{\otimes n})\big) \\
&\geq \min_{\mathcal{N}^{(n)} \,:\, d(\rho^{\otimes n}, \mathcal{N}^{(n)}) \leq D} E_p\big((\mathrm{id}_{R^n} \otimes \mathcal{N}^{(n)})((\psi^\rho_{RA})^{\otimes n})\big). \tag{29}
\end{align*}
The first inequality follows because the entropy of the maximally mixed state is larger than the entropy of any state on system $W$. The first equality follows because the isometric extension of the decoder maps $W$ isometrically to the systems $E_2$ and $B^n$. The second inequality follows because the entropy minimized over all CPTP maps on systems $E_1$ and $E_2$ can only be smaller than the entropy on $E_2 B^n$ (the identity map on $E_2$ and partial trace of $E_1$ is a CPTP map included in the minimization). The second equality follows from the definition of entanglement of purification. The third inequality follows by minimizing the entanglement of purification over all maps that satisfy the distortion criterion (recall that we assume our protocol satisfies this distortion criterion).

Our characterization of the unassisted quantum rate distortion task is unfortunately up to a regularization. It is likely that this regularized formula is blurring a better quantum rate-distortion formula, as has sometimes been the case in quantum Shannon theory [61]. This is due in part to our exploitation of the unassisted reverse Shannon theorem for the task of quantum rate distortion, and the fact that the goal of a reverse Shannon theorem is stronger than that of a rate distortion protocol, while no correlated resources are available in this particular setting (see the previous discussion after Theorem 2). It would be ideal to demonstrate that the regularization is not necessary, but it is not clear yet how to do so without a better way to realize unassisted quantum rate distortion. Nevertheless, the above theorem at the very least disproves Barnum’s conjecture because we have demonstrated that the quantum rate distortion function is always positive (due to the fact that entanglement of purification is positive [55]), whereas Barnum’s rate distortion function can become negative.[3] Furthermore, Corollary 4 provides a good single-letter, non-negative lower bound on the unassisted quantum rate distortion function, which is never smaller than Barnum’s bound in terms of the coherent information.

IV. SOURCE-CHANNEL SEPARATION THEOREMS

This last section of our paper consists of five important quantum source-channel separation theorems. The first two theorems apply whenever a sender wishes to transmit a memoryless classical source over a memoryless quantum channel, whereas the third applies when the information source to be transmitted is a quantum source. The second theorem deals with the situation in which some distortion is allowed in the transmission. All these three theorems are expressed in terms of single-letter formulas whenever the corresponding capacity formulas are single-letter.

The last two theorems correspond to the cases in which a quantum source is sent over an entanglement-assisted quantum channel, with and without distortion. The formulas in these are always single-letter, demonstrating that it is again the entanglement-assisted formulas which are in formal analogy with Shannon’s classical formulas.

A. Shannon’s source-channel separation theorem for quantum channels

Shannon’s original source-channel separation theorem applies to the transmission of a classical information source over a classical channel. Despite the importance of this theorem, it does not take into account that the carriers of information are essentially quantum-mechanical. So our first theorem is a restatement of Shannon’s source-channel separation theorem for the case in which a classical information source is to be reliably transmitted over a quantum channel.

Figure 2 depicts the scenario to which this first source-channel separation theorem applies. The most general protocol for sending the output of a classical information source over a quantum channel consists of three steps: encoding, transmission, and decoding. The sender first takes the outputs $U^n$ of the classical information source and encodes them with

[3] To see that Barnum’s proposed distortion function can become negative, consider the case of a maximally mixed qubit source, whose purification is the maximally entangled Bell state. Suppose that we allow the distortion to be as large as 3/4. Then a particular map satisfying the distortion criterion is the completely depolarizing map because it produces a tensor product of maximally mixed qubits, whose entanglement fidelity with the maximally entangled state is equal to 1/4. The coherent information of a tensor product of maximally mixed qubits is equal to its minimum value of $-1$.

some CPTP encoding map $\mathcal{E}^{U^n \to A^n}$, where the systems $A^n$ are the inputs to many uses of a noisy quantum channel $\mathcal{N}^{A \to B}$. The sender then transmits the systems $A^n$ over the quantum channels, and the receiver obtains the outputs $B^n$. The receiver finally performs some CPTP decoding map $\mathcal{D}^{B^n \to \hat{U}^n}$ to recover the random variables $\hat{U}^n$ (note that this decoding is effectively a POVM because the output systems are classical). If the scheme is any good for transmitting the source, then the following condition holds for any given $\varepsilon > 0$,
\[ \Pr\{\hat{U}^n \neq U^n\} \leq \varepsilon. \tag{30} \]

Fig. 2. The most general protocol for transmitting a classical information source over a memoryless quantum channel.

Theorem 6: The following condition is necessary and sufficient for transmitting the output of a memoryless classical information source, characterized by a random variable $U$, over a memoryless quantum channel $\mathcal{N} \equiv \mathcal{N}^{A' \to B}$, with additive Holevo capacity:
\[ H(U) \leq \chi^*(\mathcal{N}), \tag{31} \]
where
\begin{align*}
\chi^*(\mathcal{N}) &\equiv \max_{\rho_{XA}} I(X;B)_\rho, \\
\rho_{XB} &\equiv \sum_x p_X(x) \, |x\rangle\langle x|_X \otimes \mathcal{N}^{A \to B}(\rho_x^A).
\end{align*}

Proof: Sufficiency of (31) is a direct consequence of Shannon compression and Holevo-Schumacher-Westmoreland (HSW) coding. The sender first compresses the information source down to a set of size $\approx 2^{nH(U)}$. The sender then employs an HSW code to transmit any message in the compressed set over $n$ uses of the quantum channel. Reliability of the scheme follows from the assumption that $H(U) \leq \chi^*(\mathcal{N})$, the HSW coding theorem, and Shannon compression.

Necessity of (31) follows from reasoning similar to that in the proof of the classical source-channel separation theorem [21]. Fix $\varepsilon > 0$. We begin by assuming that there exists a good scheme that meets the criterion in (30). Consider the following chain of inequalities:
\begin{align*}
nH(U) &= H(U^n) \\
&= I(U^n; \hat{U}^n) + H(U^n | \hat{U}^n) \\
&\leq I(U^n; \hat{U}^n) + 1 + \Pr\{\hat{U}^n \neq U^n\} \, n \log |U| \\
&\leq I(U^n; B^n) + 1 + \varepsilon n \log |U| \\
&\leq \chi^*(\mathcal{N}^{\otimes n}) + 1 + \varepsilon n \log |U| \\
&= n \chi^*(\mathcal{N}) + 1 + \varepsilon n \log |U|. \tag{32}
\end{align*}
The first equality follows from the assumption that the classical information source is memoryless. The second equality is a simple identity. The first inequality follows from applying Fano’s inequality. The second inequality follows from the quantum data processing inequality and the assumption that (30) holds. The third inequality follows because $I(U^n; B^n)$ must be smaller than the maximum of this quantity over all classical-quantum states that can serve as an input to the tensor power channel $\mathcal{N}^{\otimes n}$. The final equality follows from the assumption that the Holevo capacity is additive for the particular channel $\mathcal{N}$. Thus, any protocol that reliably transmits the information source $U$ should satisfy the following inequality for sufficiently large $n$:
\[ H(U) \leq \chi^*(\mathcal{N}) + (1/n + \varepsilon \log |U|), \]
which converges to (31) as $n \to \infty$ and $\varepsilon \to 0$.

Remark 7: If the Holevo capacity is not additive for the channel, then the best statement of the source-channel separation theorem is in terms of the regularized quantity:
\[ H(U) \leq \chi^*_{reg}(\mathcal{N}), \]
where
\[ \chi^*_{reg}(\mathcal{N}) \equiv \lim_{n \to \infty} \frac{1}{n} \chi^*(\mathcal{N}^{\otimes n}), \]
but it is unclear how useful such a statement is because we cannot compute such a regularized quantity. (The above statement follows by applying all of the inequalities in the proof of Theorem 6 except the last one.)

What if the condition $H(U) > \chi^*(\mathcal{N})$ holds instead? We can prove a variant of the above source-channel separation theorem that allows for the information source to be reconstructed at the receiving end up to some distortion $D$. We obtain the following theorem:

Theorem 8: The following condition is necessary and sufficient for transmitting the output of a memoryless classical information source over a quantum channel with additive Holevo capacity (up to some distortion $D$):
\[ R(D) \leq \chi^*(\mathcal{N}), \tag{33} \]
where $R(D)$ is defined in (1).

Proof: Sufficiency of (33) follows from the rate distortion protocol and the HSW coding theorem. Specifically, the sender compresses the information source down to a set of size $2^{nR(D)}$ and then uses an HSW code to transmit any element of this set. The reconstructed sequence $\hat{U}^n$ at the receiving end obeys the distortion constraint $\mathbb{E}\{d(U, \hat{U})\} \leq D$, with $d(U, \hat{U})$ denoting a suitably defined distortion measure.
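To make the conditions (31) and (33) concrete, the sketch below evaluates them for one illustrative pair of source and channel. Neither choice comes from the paper; both are assumptions for illustration. The source is Bernoulli($p$) under Hamming distortion, whose classical rate-distortion function is the textbook expression $h(p) - h(D)$. The channel is the qubit depolarizing channel $\rho \mapsto (1-q)\rho + q I/2$, whose Holevo capacity is taken to be the known additive value $1 - h(q/2)$. The function names are hypothetical.

```python
import math

def h2(x):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def rate_distortion_bernoulli(p, D):
    """R(D) for a Bernoulli(p) source under Hamming distortion (assumed source)."""
    if D >= min(p, 1 - p):
        return 0.0
    return h2(p) - h2(D)

def holevo_capacity_depolarizing(q):
    """Holevo capacity of rho -> (1 - q) rho + q I/2 on a qubit (assumed channel)."""
    return 1.0 - h2(q / 2)

def separation_condition_holds(p, D, q):
    """Check R(D) <= chi*(N), i.e. condition (33); D = 0 recovers condition (31)."""
    return rate_distortion_bernoulli(p, D) <= holevo_capacity_depolarizing(q)

if __name__ == "__main__":
    p, q = 0.5, 0.1
    for D in (0.0, 0.1):
        print(f"D = {D}: R(D) = {rate_distortion_bernoulli(p, D):.4f}, "
              f"chi* = {holevo_capacity_depolarizing(q):.4f}, "
              f"condition holds: {separation_condition_holds(p, D, q)}")
```

For a uniform binary source and $q = 0.1$, the entropy of the source exceeds the Holevo capacity, so condition (31) fails at $D = 0$, while allowing a Hamming distortion of $D = 0.1$ brings $R(D)$ below the capacity and condition (33) is met.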

Necessity of (33) follows from the fact that
\[ nR(D) \leq I(U^n; \hat{U}^n), \tag{34} \]
and by applying the last four steps in the chain of inequalities in (32). A proof of (34) is available in (10.61-10.71) of Ref. [21].

B. Quantum source-channel separation theorem

We now prove a source-channel separation theorem which is perhaps more interesting for quantum computing/communication applications. Suppose that a sender would like to transmit a quantum information source faithfully over a quantum channel, such that the receiver perfectly recovers the transmitted quantum source in the limit of many copies of the source and uses of the channel. Figure 3 depicts the scenario to which our second source-channel separation theorem applies.

As before, we characterize a memoryless quantum information source by a density matrix $\rho_A \in \mathcal{D}(\mathcal{H}_A)$, and let $|\psi^\rho_{RA}\rangle \in \mathcal{H}_R \otimes \mathcal{H}_A$ denote its purification. The entropy of the source $H(A)_\rho$ is given by (3). Let $\mathcal{N}^{A' \to B}$ denote a memoryless quantum channel. Suppose Alice has access to multiple uses of the source, and she and Bob are allowed multiple uses of the quantum channel.

Since Alice needs to act on many copies of the state $\rho$, we instead suppose that she is acting on the $A$ systems of the tensor power state $|\psi^\rho_{RA}\rangle^{\otimes n}$. The most general protocol is one in which Alice performs some CPTP encoding map $\mathcal{E}_n \equiv \mathcal{E}^{A^n \to A'^n}$ on the $A$ systems of the state $|\psi^\rho_{RA}\rangle^{\otimes n}$, producing some output systems $A'^n$ which can serve as input to many uses of the quantum channel $\mathcal{N}^{A' \to B}$. Alice then transmits the $A'^n$ systems over the channels, leading to some output systems $B^n$ for Bob. Bob then acts on these systems with some decoding map $\mathcal{D}_n \equiv \mathcal{D}^{B^n \to \hat{A}^n}$. If the protocol is any good for transmitting the quantum information source, then the following condition should hold for any $\varepsilon > 0$ and sufficiently large $n$:
\[ \left\| (\psi^\rho_{RA})^{\otimes n} - \mathcal{D}_n\big(\mathcal{N}^{\otimes n}\big(\mathcal{E}_n((\psi^\rho_{RA})^{\otimes n})\big)\big) \right\|_1 \leq \varepsilon. \tag{35} \]
The relation between trace distance and entanglement fidelity [57] implies that
\[ F_e(\rho^{\otimes n}, \Lambda_n) \geq 1 - \varepsilon, \tag{36} \]
where $\Lambda_n$ is the composite map $\Lambda_n \equiv \mathcal{D}_n \circ \mathcal{N}^{\otimes n} \circ \mathcal{E}_n$.

We can now state our first variant of a quantum source-channel separation theorem.

Theorem 9: The following condition is necessary and sufficient for transmitting the output of a memoryless quantum information source, characterized by a density matrix $\rho_A$, over a quantum channel $\mathcal{N} \equiv \mathcal{N}^{A' \to B}$ with additive coherent information:
\[ H(A)_\rho \leq Q(\mathcal{N}), \tag{37} \]
where $H(A)_\rho$ is the entropy of the quantum information source, and $Q(\mathcal{N})$ is the coherent information of the channel $\mathcal{N}$:
\begin{align*}
Q(\mathcal{N}) &\equiv \max_{|\phi_{AA'}\rangle} I(A \rangle B)_\sigma, \\
\sigma_{AB} &\equiv \mathcal{N}^{A' \to B}(\phi_{AA'}).
\end{align*}

Fig. 3. The most general protocol for transmitting a quantum information source over a memoryless quantum channel.

Proof: Sufficiency of (37) follows from Schumacher compression and the direct part of the quantum capacity theorem [38], [52], [23]. Specifically, the sender compresses the source down to a space of dimension $\approx 2^{nH(R)}$ with the Schumacher compression protocol. She then encodes this subspace with a quantum error correction code for the channel $\mathcal{N}$. The condition in (37) guarantees that we can apply the direct part of the quantum capacity theorem, and combined with achievability of Schumacher compression, the receiver can recover the quantum information source with asymptotically small error in the limit of many copies of the source and many uses of the quantum channel.

Fix $\varepsilon > 0$ and note that $H(A)_\rho = H(R)_\psi$ since $\psi^\rho_{RA}$ is a pure state. Then the necessity of (37) follows from the chain of inequalities given below. Note that the subscripts denoting the states have been omitted for simplicity:
\begin{align*}
nH(A) &= nH(R) \\
&= H(R^n) \\
&\leq I(R^n \rangle B^n) + 2 + 4(1 - F_e) \, n \log |R| \\
&\leq I(R^n \rangle B^n M) + 2 + 4 \varepsilon n \log |R| \\
&= \sum_m p(m) \, I(R^n \rangle B^n)_{\rho_m} + 2 + 4 \varepsilon n \log |R| \\
&\leq Q(\mathcal{N}^{\otimes n}) + 2 + 4 \varepsilon n \log |R| \\
&= n Q(\mathcal{N}) + 2 + 4 \varepsilon n \log |R|. \tag{38}
\end{align*}
The first equality follows from the assumption that the initial state $|\psi^\rho_{RA}\rangle^{\otimes n}$ is a tensor power state. The first inequality follows from (7.34) of Ref. [6], a fundamental relation between the input entropy, the coherent information of a channel, and the entanglement fidelity of any quantum error correction code. Now, the encoding that Alice employs may in general be some CPTP encoding map (and not an isometry). However, Alice can simulate any such CPTP map by first performing an isometry and then a von Neumann measurement on the system not fed into the channel (the environment of the simulated CPTP map). Let $M$ denote the classical system resulting from measuring the environment of the simulated CPTP map. We can write the state after the channel acts as a classical-quantum state of the following form:

\[ \sum_m p(m) \, |m\rangle\langle m|_M \otimes \rho^m_{R^n B^n}. \]
Then the second inequality follows from the quantum data processing inequality and (36). The second equality follows because
\[ I(R^n \rangle B^n M) = I(R^n \rangle B^n | M) = \sum_m p(m) \, I(R^n \rangle B^n)_{\rho_m}, \]
whenever the conditioning system is classical [57]. The third inequality follows because the channel’s coherent information is never smaller than any individual $I(R^n \rangle B^n)_{\rho_m}$ (and thus never smaller than the average). The final equality follows from the assumption that the channel has additive coherent information (this holds for degradable quantum channels [27] and is suspected to hold for two-Pauli channels [54]). Thus, any protocol that reliably transmits the quantum information source should satisfy the following inequality
\[ H(R) \leq Q(\mathcal{N}) + (2/n + 4 \varepsilon \log |R|), \]
which converges to (37) as $n \to \infty$ and $\varepsilon \to 0$.

Remark 10: A similar comment as in Remark 7 holds whenever it is not known that the channel has additive coherent information.

C. Entanglement-assisted quantum source-channel separation theorem

Our final source-channel separation theorem applies to the scenario where Alice and Bob have unlimited prior shared entanglement. The statement of this theorem is that the entropy of the quantum information source being less than the entanglement-assisted quantum capacity of the channel [10], [26], [57] is both a necessary and sufficient condition for the faithful transmission of the source over an entanglement-assisted quantum channel. This theorem is the most powerful of any of the above because the formulas involved are all single-letter, for any memoryless source and channel.

Fig. 4. The most general protocol for transmitting a quantum information source over a memoryless, entanglement-assisted quantum channel.

Figure 4 depicts the scenario to which this last theorem applies. The situation is nearly identical to that of the previous section, with the exception that Alice and Bob have unlimited prior shared entanglement. Alice begins by performing some CPTP encoding map $\mathcal{E}_n \equiv \mathcal{E}^{A^n T_A \to A'^n}$ on the systems $A^n$ from the quantum information source and on her share $T_A$ of the entanglement, producing some output systems $A'^n$ which can serve as input to many uses of a quantum channel $\mathcal{N}^{A' \to B}$. Alice then transmits the $A'^n$ systems over the channels, leading to some output systems $B^n$ for Bob. Bob then acts on these systems and his share $T_B$ of the entanglement with some decoding map $\mathcal{D}_n \equiv \mathcal{D}^{B^n T_B \to \hat{A}^n}$. If the protocol is any good for transmitting the quantum information source, then the following condition should hold for any $\varepsilon > 0$ and sufficiently large $n$:
\[ \left\| (\psi^\rho_{RA})^{\otimes n} - \mathcal{D}_n\big(\mathcal{N}^{\otimes n}\big(\mathcal{E}_n((\psi^\rho_{RA})^{\otimes n} \otimes \Phi^{T_A T_B})\big)\big) \right\|_1 \leq \varepsilon, \tag{39} \]
where $\Phi^{T_A T_B}$ is the entangled state that they share before communication begins (it does not necessarily need to be maximally entangled). This leads to our final source-channel separation theorem:

Theorem 11: The following condition is necessary and sufficient for transmitting the output of a memoryless quantum information source, characterized by a density matrix $\rho_A$, over any entanglement-assisted quantum channel $\mathcal{N} \equiv \mathcal{N}^{A' \to B}$:
\[ H(A)_\rho \leq \frac{1}{2} I(\mathcal{N}), \tag{40} \]
where $H(A)_\rho$ is the entropy of the quantum information source, and
\begin{align*}
I(\mathcal{N}) &\equiv \max_{|\varphi_{AA'}\rangle} I(A;B)_\sigma, \\
\sigma_{AB} &\equiv \mathcal{N}^{A' \to B}(\varphi_{AA'}).
\end{align*}

Proof: Sufficiency of (40) follows from reasoning similar to that in the proof of Theorem 9. We just exploit Schumacher compression and the entanglement-assisted quantum capacity theorem [10], [26], [57].

Fix $\varepsilon > 0$ and note that $H(A)_\rho = H(R)_\psi$ since $\psi^\rho_{RA}$ is a pure state. Then necessity of (40) follows from the following chain of inequalities. Once again, the subscripts denoting the states have been omitted for simplicity:
\begin{align*}
2nH(R) &= 2H(R^n) \\
&\leq H(R^n) + I(R^n \rangle B^n T_B) + 2 + 4(1 - F_e) \, n \log |R| \\
&\leq I(R^n; B^n T_B) + 2 + 4 n \varepsilon \log |R| \\
&= I(R^n T_B; B^n) + I(R^n; T_B) - I(T_B; B^n) + 2 + 4 n \varepsilon \log |R| \\
&= I(R^n T_B; B^n) - I(T_B; B^n) + 2 + 4 n \varepsilon \log |R| \\
&\leq I(R^n T_B M; B^n) + 2 + 4 n \varepsilon \log |R| \\
&\leq \max_{\rho_{XAA'^n}} I(AX; B^n) + 2 + 4 n \varepsilon \log |R| \\
&= I(\mathcal{N}^{\otimes n}) + 2 + 4 n \varepsilon \log |R| \\
&= n I(\mathcal{N}) + 2 + 4 n \varepsilon \log |R|. \tag{41}
\end{align*}

The first inequality follows by applying the same reasoning as the first inequality in (38). The second inequality follows by applying $H(R^n) + I(R^n \rangle B^n T_B) = I(R^n; B^n T_B)$ and the fact that $1 - F_e \leq \varepsilon$ for a protocol satisfying (39). The second equality follows from a useful identity for quantum mutual information. The third equality follows from the assumption that systems $R^n$ and $T_B$ begin in a product state. The third inequality follows because $I(T_B; B^n) \geq 0$. The fourth inequality follows from the reasoning, similar to that used in the proof of Theorem 9, that Alice simulates an isometry and measures the environment (also exploiting the quantum data processing inequality). The next inequality follows because the state on $R^n T_B M B^n$ is a state of the form
\[ \sum_x p_X(x) \, |x\rangle\langle x|_X \otimes \mathcal{N}^{A'^n \to B^n}(\rho^x_{AA'^n}), \]
where we identify $R^n T_B$ with $A$, and $M$ with $X$. Thus, the information quantity $I(R^n T_B M; B^n)$ can never be larger than the maximum over all such states of that form. The second-to-last equality was proved in Refs. [59], [57]. The final equality follows from additivity of the channel’s quantum mutual information [2], [10], [57]. Thus, any entanglement-assisted protocol that reliably transmits the quantum information source should satisfy the following inequality
\[ H(A)_\rho \leq \frac{1}{2} I(\mathcal{N}) + (1/n + 2 \varepsilon \log |R|), \]
which converges to (40) as $n \to \infty$ and $\varepsilon \to 0$.

What if the condition $H(A)_\rho > \frac{1}{2} I(\mathcal{N})$ holds instead? We can prove a variant of the above source-channel separation theorem that allows for the information source to be reconstructed at the receiving end up to some distortion $D$. We obtain the following theorem:

Theorem 12: The following condition is necessary and sufficient for transmitting the output of a memoryless quantum information source over an entanglement-assisted quantum channel (up to some distortion $D$):
\[ R^q_{eaq}(D) \leq \frac{1}{2} I(\mathcal{N}), \tag{42} \]
where $R^q_{eaq}(D)$ is defined in (19).

Proof: Sufficiency of (42) follows from the entanglement-assisted rate distortion protocol from Theorem 3 and the entanglement-assisted quantum capacity theorem [10], [26]. That is, the sender compresses the information source down to a space of size $2^{n R^q_{eaq}(D)}$ and then uses an entanglement-assisted quantum code to transmit any state in this subspace. The reconstructed state at the receiving end obeys the distortion constraint.

Necessity of (42) follows from the fact that
\[ n R^q_{eaq}(D) \leq \frac{1}{2} I(R^n; \hat{A}^n), \tag{43} \]
by applying the quantum data processing inequality to get $I(R^n; \hat{A}^n) \leq I(R^n; B^n T_B)$, and finally by applying the last seven steps in the chain of inequalities in (41). A proof of (43) is available in (17) of the proof of Theorem 2.

V. CONCLUSION

We have proved several quantum rate-distortion theorems and quantum source-channel separation theorems. All of our quantum rate-distortion protocols employ the quantum reverse Shannon theorems [10], [1], [24], [8], [12]. This strategy works out well whenever unlimited entanglement is available, but it clearly leads to undesirable regularized formulas in the unassisted setting. Our quantum source-channel separation theorems demonstrate in many cases that a two-stage compression-channel-coding strategy works best for memoryless sources and for quantum channels with additive capacity measures. Again, our most satisfying result is in the entanglement-assisted setting, where the pleasing result is that the entanglement-assisted rate distortion function being less than the entanglement-assisted quantum capacity is both necessary and sufficient for transmission of a source over a channel up to some distortion.

The most important open question going forward from here is to determine better protocols for quantum rate distortion that do not rely on the reverse Shannon theorems. The differing goals of a reverse Shannon theorem and a rate distortion protocol are what lead to complications with regularization in Theorem 5.

Another productive avenue could be to explore scenarios where the unassisted quantum source-channel separation theorem does not apply. In the classical case, it is known that certain sources and channels without a memoryless structure can violate the source-channel separation theorem [56], and similar ideas would possibly demonstrate a violation for the quantum case. In the quantum case, however, it very well could be that certain memoryless sources and channels could violate source-channel separation, but we would need a better understanding of quantum capacity in the general case in order to determine definitively whether this could be so.

Other interesting questions are as follows: Does the entanglement-assisted quantum source-channel separation theorem apply if sender and receiver are given unlimited access to a quantum feedback channel, given what we already know about quantum feedback [14]? Can anything learned from source-channel separation for classical broadcast or wiretap channels be applied to figure out a more general characterization for quantum channels that are not degradable?

The authors thank Jonathan Oppenheim and Andreas Winter for useful discussions, Patrick Hayden for the suggestion to pursue a quantum source-channel separation theorem, and the anonymous referees for helpful suggestions. ND and MHH received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement number 213681. MMW acknowledges financial support from the MDEIE (Québec) PSR-SIIRI international collaboration grant and thanks the Centre for Mathematical Sciences at the University of Cambridge for hosting him for a visit.

APPENDIX

Lemma 13: For a fixed state $\rho$, the quantum mutual information is convex in the channel operation:
\[ I(A;B)_\omega \leq \sum_x p(x) \, I(A;B)_{\omega_x}, \]
where

where the entropies are with respect to the following state:

0 A →B ρ A1A2→B1B2 ω := (id )(ψ 0 ), θR R B B (φR A ϕR A ), AB AA 1 2 1 2 ≡ N 1 1 ⊗ 2 2 ⊗ N 0 ρ x A →B A1A2→B1B2 ωAB := (id x )(ψ 0 ), with some noisy channel, and φR A and ⊗ N AA N 1 1 X ϕR A being pure, bipartite states. p (x) x. (44) 2 2 N ≡ x N Proof: The inequality is equivalent to

Proof: It is possible to show that H (R1R2) + I (R1R2 B1B2) i I (A; B) = H (ρ) + H ( (ρ)) H ((I )(ψ)) , H (R1) + I (R1 B1) + H (R2) + I (R2 B2) . ω N − ⊗ N ≥ i i I (A; B) = H (ρ) + H ( x (ρ)) H ((I x)(ψ)) , Observing that H (R R ) = H (R ) + H (R ) because the ωx N − ⊗ N 1 2 1 2 state on R and R is product, the inequality is equivalent to and the desired inequality becomes 1 2 I (R1R2 B1B2) I (R1 B1) + I (R2 B2) , H (ρ) + H ( (ρ)) H ((I )(ψ)) i ≥ i i X N − ⊗ N which is in turn equivalent to p (x)[H (ρ) + H ( x (ρ)) H ((I x)(ψ))] . ≤ N − ⊗ N I (R1B1; R2B2) I (B1; B2) . x ≥ This inequality is equivalent to This last inequality follows from the quantum data processing inequality. H ( (ρ)) H ((I )(ψ)) N −X ⊗ N p (x)[H ( x (ρ)) H ((I x)(ψ))] , REFERENCES ≤ N − ⊗ N x [1] Anura Abeyesinghe, Igor Devetak, Patrick Hayden, and Andreas Winter. The mother of all protocols: Restructuring quantum information’s family which in turn is equivalent to convexity of coherent informa- tree. Proceedings of the Royal Society A, 465(2108):2537–2563, August tion, or equivalently, the quantum data processing inequality 2009. arXiv:quant-ph/0606225. for coherent information: [2] Christoph Adami and Nicolas J. Cerf. von Neumann capacity of noisy quantum channels. Physical Review A, 56(5):3470–3483, November I (A B) I (A BX) . 1997. i ≤ i [3] Dave Bacon, Isaac L. Chuang, and Aram W. Harrow. The quantum Schur and Clebsch-Gordan transforms: I. efficient qudit circuits. In Pro- q ceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Lemma 14: The quantum rate-distortion function Reac (D) Algorithms (SODA ’07), pages 1235–1244, New Orleans, Louisiana, is non-increasing and convex: 2007. Society for Industrial and Applied Mathematics. arXiv:quant- ph/0601001. q q [4] Howard Barnum. Quantum rate-distortion coding. Physical Review A, D1 < D2 R (D1) R (D2) , ⇒ eac ≥ eac 62(4):042309, September 2000. [5] Howard Barnum, Emanuel Knill, and Michael A. Nielsen. 
On quantum q fidelities and channel capacities. IEEE Transactions on Information R (λD1 + (1 λ) D2) eac − Theory, 46:1317–1329, 2000. q q [6] Howard Barnum, M. A. Nielsen, and Benjamin Schumacher. Information λR (D1) + (1 λ) R (D2) , ≤ eac − eac transmission through a noisy quantum channel. Physical Review A, where 0 λ 1. 57(6):4153–4175, June 1998. arXiv:quant-ph/9702049. ≤ ≤ [7] Charles H. Bennett, Gilles Brassard, Claude Crepeau,´ Richard Jozsa, Proof: The proof is similar to Barnum’s [4], which in Asher Peres, and William K. Wootters. Teleporting an unknown quantum q turn is similar to the one from Ref. [21]. Reac (D) is non- state via dual classical and Einstein-Podolsky-Rosen channels. Physical increasing because the domain of minimization becomes larger Review Letters, 70(13):1895–1899, March 1993. [8] Charles H. Bennett, Igor Devetak, Aram W. Harrow, Peter W. Shor, and after increasing D, which implies that the rate-distortion Andreas Winter. Quantum reverse Shannon theorem. December 2009. function can only become smaller. Let (R1,D1) and (R2,D2) arXiv:0912.5537. be two points on the information rate-distortion curve and [9] Charles H. Bennett, Aram W. Harrow, and Seth Lloyd. Universal quantum data compression via nondestructive tomography. Physical let 1 and 2 be the respective operations that achieve the Review A, 73(3):032336, March 2006. E E q minimum in the definition of Reac, respectively. Consider the [10] Charles H. Bennett, Peter W. Shor, John A. Smolin, and Ashish V. Thapliyal. Entanglement-assisted capacity of a quantum channel and the map λ λ 1 + (1 λ) 2. Under the assumption of a E ≡ E − E reverse Shannon theorem. IEEE Transactions on Information Theory, distortion function that is linear in the operation (such as the 48:2637–2655, 2002. entanglement fidelity), it follows that the distortion caused by [11] Toby Berger. Rate Distortion Theory: A Mathematical Basis for Data q Compression. Information and system sciences. Prentice Hall, 1971. 
λ is Dλ = λD1 + (1 λ) D2. We also have that Reac (Dλ) E − [12] Mario Berta, Matthias Christandl, and Renato Renner. The quantum re- is the minimum over all operations that have distortion Dλ verse Shannon theorem based on one-shot information theory. December 0 0 q AB A →B AA 2009. arXiv:0912.3805. so that Reac (Dλ) I (A; B)ω where ω λ (ψ ). Finally, we have that≤ the mutual information≡ is E convex in the [13] Kim Bostroem and Timo Felbinger. Lossless quantum data compression q and variable-length coding. Physical Review A, 65(3):032313, February operation (see Lemma 13) so that I (A; B)ω λReac (D1) + 2002. q ≤ (1 λ) Reac (D2). [14] Garry Bowen. Quantum feedback channels. IEEE Transactions in Lemma− 15 (Superadditivity of mutual information): Information Theory, 50(10):2429–2434, October 2004. arXiv:quant- The ph/0209076. mutual information is superadditive in the sense that [15] Garry Bowen and Nilanjana Datta. Beyond i.i.d. in quantum information theory. Proceedings of the 2006 IEEE International Symposium on In- I (R1R2; B1B2) I (R1; B1) + I (R2; B2) , formation Theory, pages 451–455, July 2006. arXiv:quant-ph/0604013. ≥ 15

[16] Samuel L. Braunstein, Christopher A. Fuchs, Daniel Gottesman, and Hoi-Kwong Lo. A quantum analog of Huffman coding. IEEE Transactions on Information Theory, 46(4):1644–1649, July 2000.
[17] Todd A. Brun, Igor Devetak, and Min-Hsiu Hsieh. Correcting quantum errors with entanglement. Science, 314(5798):436–439, October 2006.
[18] A. Robert Calderbank, Eric M. Rains, Peter W. Shor, and N. J. A. Sloane. Quantum error correction via codes over GF(4). IEEE Transactions on Information Theory, 44:1369–1387, 1998.
[19] A. Robert Calderbank and Peter W. Shor. Good quantum error-correcting codes exist. Physical Review A, 54(2):1098–1105, August 1996.
[20] Xiao-Yu Chen and Wei-Ming Wang. Entanglement information rate distortion of a quantum Gaussian source. IEEE Transactions on Information Theory, 54(2):743–748, February 2008.
[21] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. Wiley-Interscience, second edition, 2005.
[22] Paul Cuff. Communication requirements for generating correlated random variables. In Proceedings of the 2008 International Symposium on Information Theory, pages 1393–1397, Toronto, Ontario, Canada, July 2008. arXiv:0805.0065.
[23] Igor Devetak. The private classical capacity and quantum capacity of a quantum channel. IEEE Transactions on Information Theory, 51:44–55, January 2005.
[24] Igor Devetak. Triangle of dualities between quantum communication protocols. Physical Review Letters, 97(14):140503, October 2006.
[25] Igor Devetak and Toby Berger. Quantum rate-distortion theory for memoryless sources. IEEE Transactions on Information Theory, 48(6):1580–1589, June 2002. arXiv:quant-ph/0011085.
[26] Igor Devetak, Aram W. Harrow, and Andreas Winter. A resource framework for quantum Shannon theory. IEEE Transactions on Information Theory, 54(10):4587–4618, October 2008.
[27] Igor Devetak and Peter W. Shor. The capacity of a quantum channel for simultaneous transmission of classical and quantum information. Communications in Mathematical Physics, 256(2):287–303, 2005.
[28] Frederic Dupuis, Patrick Hayden, and Ke Li. A father protocol for quantum broadcast channels. IEEE Transactions on Information Theory, 56(6):2946–2956, June 2010.
[29] Masahito Hayashi. Optimal visible compression rate for mixed states is determined by entanglement of purification. Physical Review A, 73:060301, June 2006.
[30] Masahito Hayashi. Quantum Information: An Introduction. Springer, 2006.
[31] Patrick Hayden, Richard Jozsa, and Andreas Winter. Trading quantum for classical resources in quantum data compression. Journal of Mathematical Physics, 43(9):4404–4444, September 2002. arXiv:quant-ph/0204038.
[32] Alexander S. Holevo. The capacity of the quantum channel with general signal states. IEEE Transactions on Information Theory, 44:269–273, 1998.
[33] Min-Hsiu Hsieh, Todd A. Brun, and Igor Devetak. Entanglement-assisted quantum quasicyclic low-density parity-check codes. Physical Review A, 79(3):032340, March 2009.
[34] Min-Hsiu Hsieh and Mark M. Wilde. Entanglement-assisted communication of classical and quantum information. IEEE Transactions on Information Theory, 56(9):4682–4704, September 2010.
[35] Min-Hsiu Hsieh and Mark M. Wilde. Trading classical communication, quantum communication, and entanglement in quantum Shannon theory. IEEE Transactions on Information Theory, 56(9):4705–4730, September 2010.
[36] Min-Hsiu Hsieh, Wen-Tai Yen, and Li-Yi Hsu. High performance entanglement-assisted quantum LDPC codes need little entanglement. IEEE Transactions on Information Theory, 57(3):1761–1769, 2011. arXiv:0906.5532.
[37] Kenta Kasai, Manabu Hagiwara, Hideki Imai, and Kohichi Sakaniwa. Quantum error correction beyond the bounded distance decoding limit. July 2010. arXiv:1007.1778.
[38] Seth Lloyd. Capacity of the noisy quantum channel. Physical Review A, 55(3):1613–1622, March 1997.
[39] Zhicheng Luo. Topics in quantum cryptography, quantum error correction, and channel simulation. PhD thesis, University of Southern California, May 2009. Available from http://digitallibrary.usc.edu/.
[40] Zhicheng Luo and Igor Devetak. Channel simulation with quantum side information. IEEE Transactions on Information Theory, 55(3):1331–1342, March 2009. arXiv:quant-ph/0611008.
[41] David J. C. MacKay, Graeme Mitchison, and Paul L. McFadden. Sparse graph codes for quantum error-correction. IEEE Transactions on Information Theory, 50(10):2315, October 2004.
[42] Markus Müller, Caroline Rogers, and Rajagopal Nagarajan. Lossless quantum prefix compression for communication channels that are always open. Physical Review A, 79(1):012302, January 2009.
[43] Martin Plesch and Vladimír Bužek. Efficient compression of quantum information. Physical Review A, 81(3):032317, March 2010.
[44] David Poulin, Jean-Pierre Tillich, and Harold Ollivier. Quantum serial turbo-codes. IEEE Transactions on Information Theory, 55(6):2776–2798, June 2009.
[45] Benjamin Schumacher. Quantum coding. Physical Review A, 51(4):2738–2747, April 1995.
[46] Benjamin Schumacher. Sending entanglement through noisy quantum channels. Physical Review A, 54(4):2614–2628, October 1996.
[47] Benjamin Schumacher and Michael A. Nielsen. Quantum data processing and error correction. Physical Review A, 54:2629–2635, 1996.
[48] Benjamin Schumacher and Michael D. Westmoreland. Sending classical information via noisy quantum channels. Physical Review A, 56(1):131–138, July 1997.
[49] Claude E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379–423, 1948.
[50] Claude E. Shannon. Coding theorems for a discrete source with a fidelity criterion. IRE International Convention Records, 7:142–163, 1959.
[51] Peter W. Shor. Scheme for reducing decoherence in quantum computer memory. Physical Review A, 52(4):R2493–R2496, October 1995.
[52] Peter W. Shor. The quantum channel capacity and coherent information. In Lecture Notes, MSRI Workshop on Quantum Computation, 2002.
[53] Peter W. Shor. Quantum Information, Statistics, Probability (Dedicated to A. S. Holevo on the occasion of his 60th Birthday): The classical capacity achievable by a quantum channel assisted by limited entanglement. Rinton Press, Inc., 2004. arXiv:quant-ph/0402129.
[54] Graeme Smith and John A. Smolin. Degenerate quantum codes for Pauli channels. Physical Review Letters, 98(3):030501, 2007.
[55] Barbara M. Terhal, M. Horodecki, Debbie W. Leung, and David P. DiVincenzo. The entanglement of purification. Journal of Mathematical Physics, 43(9):4286–4298, 2002. arXiv:quant-ph/0202044.
[56] Sridhar Vembu, Sergio Verdú, and Yossef Steinberg. The source-channel separation theorem revisited. IEEE Transactions on Information Theory, 41(1):44–54, January 1995.
[57] Mark M. Wilde. From Classical to Quantum Shannon Theory. June 2011. arXiv:1106.1445.
[58] Mark M. Wilde and Min-Hsiu Hsieh. Entanglement boosts quantum turbo codes. October 2010. arXiv:1010.1256.
[59] Mark M. Wilde and Min-Hsiu Hsieh. The quantum dynamic capacity formula of a quantum channel. April 2010. arXiv:1004.0458.
[60] Andreas Winter. Compression of sources of probability distributions and density operators. August 2002. arXiv:quant-ph/0208131.
[61] Jon Yard, Patrick Hayden, and Igor Devetak. Capacity theorems for quantum multiple-access channels: Classical-quantum and quantum-quantum capacity regions. IEEE Transactions on Information Theory, 54(7):3091–3113, 2008.
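Supplementary note on the proof of Lemma 15. The chain of equivalences in that proof can be written out term by term. The sketch below assumes the standard definitions of quantum mutual information and coherent information, $I(R;B) = H(R) + H(B) - H(RB)$ and $I(R \rangle B) = H(B) - H(RB)$, which this section uses but does not restate; it is offered as an illustrative expansion rather than as part of the original argument.

```latex
% Assumed standard definitions (not restated in this section):
%   I(R;B)         = H(R) + H(B) - H(RB)
%   I(R\rangle B)  = H(B) - H(RB)
% so that I(R;B) = H(R) + I(R\rangle B).
\begin{align*}
% Step 1: rewrite each mutual information as entropy plus coherent information.
I(R_1R_2;B_1B_2) &\ge I(R_1;B_1) + I(R_2;B_2) \\
\Longleftrightarrow\quad
H(R_1R_2) + I(R_1R_2\rangle B_1B_2)
  &\ge H(R_1) + I(R_1\rangle B_1) + H(R_2) + I(R_2\rangle B_2) \\
% Step 2: the state on R_1 R_2 is product, so H(R_1R_2) = H(R_1) + H(R_2).
\Longleftrightarrow\quad
I(R_1R_2\rangle B_1B_2) &\ge I(R_1\rangle B_1) + I(R_2\rangle B_2) \\
% Step 3: expand the coherent informations into entropies.
\Longleftrightarrow\quad
H(B_1B_2) - H(R_1R_2B_1B_2)
  &\ge H(B_1) - H(R_1B_1) + H(B_2) - H(R_2B_2) \\
% Step 4: regroup the entropies into two mutual informations.
\Longleftrightarrow\quad
H(R_1B_1) + H(R_2B_2) - H(R_1R_2B_1B_2)
  &\ge H(B_1) + H(B_2) - H(B_1B_2) \\
\Longleftrightarrow\quad
I(R_1B_1;R_2B_2) &\ge I(B_1;B_2).
\end{align*}
```

The last line is an instance of the quantum data processing inequality because discarding $R_1$ from the first party and $R_2$ from the second are local operations, and these cannot increase the mutual information between the two parties.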