
Enhanced Communication With the Assistance of Indefinite Causal Order

Daniel Ebler,1,2 Sina Salek,1 and Giulio Chiribella3,4,2
1 Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong
2 HKU Shenzhen Institute of Research and Innovation, Kejizhong 2nd Road, Shenzhen, China
3 Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, Oxford OX1 3QD, United Kingdom
4 Canadian Institute for Advanced Research, CIFAR Program in Quantum Information Science, Toronto, ON M5G 1Z8

In quantum Shannon theory, the way information is encoded and decoded takes advantage of the laws of quantum mechanics, while the way communication channels are interlinked is assumed to be classical. In this Letter we relax the assumption that quantum channels are combined classically, showing that a quantum communication network where quantum channels are combined in a superposition of different orders can achieve tasks that are impossible in conventional quantum Shannon theory. In particular, we show that two identical copies of a completely depolarizing channel become able to transmit information when they are combined in a quantum superposition of two alternative orders. This finding runs counter to the intuition that if two communication channels are identical, using them in different orders should not make any difference. The failure of such intuition stems from the fact that a single noisy channel can be a random mixture of elementary, non-commuting processes, whose order (or lack thereof) can affect the ability to transmit information.


arXiv:1711.10165v2 [quant-ph] 21 Mar 2018

Introduction – Information theory, initiated by the seminal work of Claude Shannon [1], has given us a framework to understand the fundamental workings of communication, data storage and signal processing. Shannon's theory was originally formulated with the assumption that the carriers of information and the communication channels are classical. The data were represented by classical random variables and the communication channels were treated as stochastic transition matrices. However, the laws of nature are fundamentally quantum and one can take advantage of these laws to build a new model of information processing. This gave rise to quantum Shannon theory [2], where quantum features such as superposition and entanglement were used to enhance communication, increasing transmission rates [3], providing unconditional security [4], and introducing new means of information transmission [5], just to name a few examples. Nevertheless, quantum Shannon theory is still conservative, in that it assumes that the communication channels are combined in a well-defined configuration. In principle, quantum theory allows for new ways to combine communication channels, by connecting them in a quantum superposition of different configurations. In particular, quantum theory allows the order of application of channels to be entangled with a control system [6], a situation that is sometimes referred to as a quantum superposition of orders. Even more generally, quantum theory allows for exotic configurations that are not compatible with any underlying model where the order is definite [7]. Both features could emerge in a theory of quantum gravity [8, 9] and would offer enhancements in a number of tasks, such as testing properties of quantum channels [10, 11], playing non-local games [7], and reducing communication complexity [12]. In this Letter, we show that the ability to combine quantum channels in a superposition of orders can boost the rate of communication beyond the limits of conventional quantum Shannon theory.

FIG. 1: Fixed order vs superposition of orders. 1(a) A quantum particle, prepared in the state $\rho$, goes first through channel $\mathcal{N}_1$ and then through channel $\mathcal{N}_2$. This configuration is associated to the state $\rho_c = |1\rangle\langle 1|$ of a control qubit, in which the choice of order is encoded. 1(b) The quantum particle goes first through $\mathcal{N}_2$ and then through $\mathcal{N}_1$. This alternative configuration is associated to the qubit state $\rho_c = |0\rangle\langle 0|$. 1(c) The quantum SWITCH creates a superposition of the two configurations 1(a) and 1(b). It takes a control qubit in a superposition state, such as $\rho_c = |+\rangle\langle +|$, and correlates the order of the two channels with the state of the qubit.

Our result is based on a novel quantum primitive, called the quantum SWITCH [6]. The quantum SWITCH is an operation that takes two channels $\mathcal{N}_1$ and $\mathcal{N}_2$ as inputs and creates a new channel, which uses the
channels $\mathcal{N}_1$ and $\mathcal{N}_2$ in an order that is entangled with the state of a control qubit, thus generating a quantum superposition of two alternative orders. Fig. 1 illustrates different ways of combining the two channels $\mathcal{N}_1$ and $\mathcal{N}_2$, either in a definite order or in a quantum superposition of orders. In [6] it was shown that the quantum SWITCH cannot be realized by any circuit where the order of the two channels $\mathcal{N}_1$ and $\mathcal{N}_2$ is fixed. Likewise, the quantum SWITCH cannot be realized as a classical mixture of circuits using channels $\mathcal{N}_1$ and $\mathcal{N}_2$ in fixed orders [10]. An even broader sense in which the quantum SWITCH cannot be decomposed into quantum processes with definite order has been discussed in [13].

In this Letter, we introduce a quantum Shannon theoretic task where the quantum SWITCH enables two communicating parties to transmit information. We show that the entanglement of the control system with the order of application of two channels can be used to perform communication tasks that are impossible when the order is fixed, or even correlated with a classical variable. Surprisingly, the advantage can be achieved by switching two copies of the same channel, a phenomenon which we refer to as self-switching. Specifically, we show the advantage of switching two copies of the completely depolarising channel, which transforms every quantum state into the maximally mixed state. Clearly, neither of the two fixed configurations in Fig. 1 can be used to communicate information. Here we show that the entanglement of these two configurations with a control system can be used to communicate classical information, a phenomenon we call Causal Activation. This phenomenon sheds light on the fact that the properties of channels do not solely depend on the way they are constructed, but also on the way they are combined: two channels combined in a superposition of different orders behave very differently from the same channels combined in a fixed order.

One of the main pillars of Shannon theory is quantifying the capacity of channels to communicate information. Channel capacity theorems are of fundamental importance both for the theoretical characterisation of channels and for the experimental implementation of communication protocols. In this Letter, we derive an analytical expression for the Holevo capacity [14] of the two causally activated depolarizing channels. Quite counterintuitively, we find that the Holevo capacity is maximum for qubit channels and decreases with the dimension of the input.

Preliminaries – In this section we review the concepts and tools needed to understand our finding.

We use quantum channels to represent transmission lines in a quantum communication network. Mathematically, quantum channels are described by completely positive trace preserving (CPTP) maps. We will often use the Kraus decomposition, which allows one to represent the action of a channel $\mathcal{N}$ on a quantum state $\rho$ as $\mathcal{N}(\rho) = \sum_i K_i \rho K_i^\dagger$, where $\{K_i\}$ is a set of operators such that $\sum_i K_i^\dagger K_i = I$.

In a fixed causal structure channels can be composed either in parallel or in series. Two channels $\mathcal{N}_1$ and $\mathcal{N}_2$ composed in parallel are represented by the tensor product of the two channels, $\mathcal{N}_1 \otimes \mathcal{N}_2$. If used in series, the second channel simply acts on the output of the first channel, i.e. $\mathcal{N}_2 \circ \mathcal{N}_1$. More generally, the two channels can be connected in an arbitrary quantum circuit including intermediate operations.

In principle, however, the order does not have to be fixed. Two channels could be combined by the quantum SWITCH operation, ending up in a situation where their relative order is entangled with a control system. Let us denote the Kraus operators of the channel $\mathcal{N}_1$ as $\{K_i^{(1)}\}$ and those of $\mathcal{N}_2$ as $\{K_i^{(2)}\}$. The quantum SWITCH uses an auxiliary quantum system to control the order of the Kraus operators of the two channels in an indefinite causal manner. The Kraus operators of the overall quantum channel resulting from the switching of $\mathcal{N}_1$ and $\mathcal{N}_2$ are

\begin{equation}
W_{ij} = K_i^{(2)} K_j^{(1)} \otimes |0\rangle\langle 0|_c + K_j^{(1)} K_i^{(2)} \otimes |1\rangle\langle 1|_c , \tag{1}
\end{equation}

acting on a target state $\rho$ and a control state $\rho_c$. The action of the quantum SWITCH is then given by

\begin{equation}
\mathcal{S}(\mathcal{N}_1, \mathcal{N}_2)(\rho \otimes \rho_c) = \sum_{i,j} W_{ij} (\rho \otimes \rho_c) W_{ij}^\dagger . \tag{2}
\end{equation}

It is easy to check that the above definition is independent of the choice of Kraus operators for the channels $\mathcal{N}_1$ and $\mathcal{N}_2$. Mathematically, the quantum SWITCH is a higher-order operation [6]: it takes two channels $\mathcal{N}_1$ and $\mathcal{N}_2$ as input and creates a quantum channel $\mathcal{S}(\mathcal{N}_1, \mathcal{N}_2)$ as output. Specifically, this higher-order operation combines the two input channels in an order that depends on the state of the control qubit: if the qubit is in the state $\rho_c = |0\rangle\langle 0|$, the channel $\mathcal{S}(\mathcal{N}_1, \mathcal{N}_2)$ will return the state $\mathcal{N}_2\mathcal{N}_1(\rho) \otimes |0\rangle\langle 0|$; if the qubit is in the state $\rho_c = |1\rangle\langle 1|$, the channel will return the state $\mathcal{N}_1\mathcal{N}_2(\rho) \otimes |1\rangle\langle 1|$. When the qubit is in a superposition of $|0\rangle$ and $|1\rangle$, the channel returns a correlated state, which can be interpreted as the result of the input channels $\mathcal{N}_1$ and $\mathcal{N}_2$ acting on $\rho$ in a quantum superposition of two alternative orders.

Quantum Shannon theory with the assistance of the quantum SWITCH. In quantum Shannon theory, quantum channels represent communication resources. Hence, higher-order operations, like the quantum SWITCH, can be viewed as transformations of resources. Quantum Shannon theory can be cast in the form of a resource theory, by specifying a set of higher-order operations that are regarded as free [15]. A basic type of free operation maps an input channel $\mathcal{N}$ into an output channel $\mathcal{D} \circ \mathcal{N} \circ \mathcal{E}$, where $\mathcal{E}$ and $\mathcal{D}$ are two channels, representing encoding and decoding operations at the sender's and receiver's end, respectively. Another type is composition in parallel, whereby two channels $\mathcal{N}_1$ and $\mathcal{N}_2$ are combined into the channel $\mathcal{N}_1 \otimes \mathcal{N}_2$. Finally, it is also natural to consider scenarios where the sender sends the information to the receiver through a repeater, connected to the sender and the receiver through two channels $\mathcal{N}_1$ and $\mathcal{N}_2$, respectively. The corresponding type of free operation is composition in sequence, whereby two input channels $\mathcal{N}_1$ and $\mathcal{N}_2$ are combined into the output channel $\mathcal{N}_2 \circ \mathcal{R} \circ \mathcal{N}_1$, where $\mathcal{R}$ represents the operation performed by the repeater. Combining these three types of operations (possibly including free classical correlations), one obtains a resource theory, suitable to describe basic communication tasks involving a single sender and a single receiver.

We now define an extension of standard quantum Shannon theory that includes quantum superpositions of causal orders. We do this in the resource-theoretic framework, by adding the quantum SWITCH to the set of free operations. More precisely, we add the free operation that maps a pair of channels $\mathcal{N}_1$ and $\mathcal{N}_2$ to the new channel $\mathcal{N}'$ defined by $\mathcal{N}'(\rho) = \mathcal{S}(\mathcal{N}_1, \mathcal{N}_2)(\rho \otimes \rho_c)$, where $\mathcal{S}$ is defined as in Eq. (2) and $\rho_c$ is a fixed state of the control qubit. Note that the state of the control is part of the way the two channels are combined, and is not accessible to the sender: the sender cannot encode classical bits in the state of the control. The control is only accessible to the receiver, who can use it as an aid for decoding. We refer to the extended model as quantum Shannon theory with the assistance of the quantum SWITCH. Adding the quantum SWITCH is similar to what is done in other variants of quantum Shannon theory, where one adds free entanglement [3], free symmetric side-channels [16, 17], free no-signalling channels [18, 19] and such like.

In the following we will focus on the communication of classical information. Holevo [14], Schumacher, and Westmoreland [20] (HSW) proved that a single copy of any quantum channel $\mathcal{N}$ can communicate classical information at best at the rate $\chi(\mathcal{N}) := \max_{\{p_x, \rho_x\}} I(X;B)_\sigma$, where $I(X;B)_\sigma$ is the von Neumann mutual information, evaluated on a state of the form $\sigma := \sum_x p_x |x\rangle\langle x|_X \otimes \mathcal{N}(\rho_x)_B$ and maximized over all possible ensembles $\{p_x, \rho_x\}$. The quantity $\chi(\mathcal{N})$ is called the Holevo information and has been shown to be in general non-additive [21]. This means that there exist two channels $\mathcal{N}$ and $\mathcal{M}$ such that $\chi(\mathcal{N} \otimes \mathcal{M}) > \chi(\mathcal{N}) + \chi(\mathcal{M})$. Therefore, if many copies of a channel whose Holevo information is non-additive are available, the Holevo information is a lower bound for the capacity of quantum channels to communicate classical information. Operationally, the lower bound corresponds to the amount of information that can be transmitted if the sender uses only product states in the encoding.

One of the implications of the HSW theorem is that any quantum channel that is not constant can be used to communicate classical information. This is because for a channel $\mathcal{N}$ that is not constant, there exist at least two pure states $|\phi\rangle$ and $|\psi\rangle$ such that $\mathcal{N}(|\phi\rangle\langle\phi|) \neq \mathcal{N}(|\psi\rangle\langle\psi|)$. Using these two states with equal probability, it is immediate to see that the Holevo information is positive. On the other hand, the Holevo information of a constant channel is trivially zero. Even if the constant channel is used many times, none of the operations allowed in the standard model of quantum Shannon theory makes it possible to generate a channel that transmits information. In the following we will show that, in contrast, classical communication can become possible with the assistance of the quantum SWITCH.

Main Result – A completely depolarising channel $\mathcal{N}^D$ on a $d$-dimensional quantum system can be represented by uniform randomisation over $d^2$ orthogonal unitary operators $U_i$, such that its action on a state $\rho$ is

\begin{equation}
\mathcal{N}^D(\rho) = \frac{1}{d^2} \sum_{i=1}^{d^2} U_i \rho U_i^\dagger = \mathrm{Tr}[\rho]\, \frac{I}{d} . \tag{3}
\end{equation}

Therefore, according to Eq. (1), the overall quantum channel resulting from the quantum SWITCH of two completely depolarising channels has Kraus operators

\begin{equation}
W_{ij} = \frac{1}{d^2} \left( U_i U_j \otimes |0\rangle\langle 0|_c + U_j U_i \otimes |1\rangle\langle 1|_c \right) . \tag{4}
\end{equation}

Suppose that the control system is fixed to the state $\rho_c := |\psi_c\rangle\langle\psi_c|$, where $|\psi_c\rangle := \sqrt{p}\,|0\rangle + \sqrt{1-p}\,|1\rangle$. If the sender prepares the target system in the state $\rho$, then the receiver will get the output state

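\begin{equation}
\begin{aligned}
\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\rho \otimes \rho_c)
&= \frac{1}{d^4} \sum_{i,j} \Big( p\,|0\rangle\langle 0|_c \otimes U_i U_j \rho U_j^\dagger U_i^\dagger + (1-p)\,|1\rangle\langle 1|_c \otimes U_j U_i \rho U_i^\dagger U_j^\dagger \\
&\qquad\qquad + \sqrt{p(1-p)}\,|0\rangle\langle 1|_c \otimes U_i U_j \rho U_i^\dagger U_j^\dagger + \sqrt{p(1-p)}\,|1\rangle\langle 0|_c \otimes U_j U_i \rho U_j^\dagger U_i^\dagger \Big) \\
&= p\,|0\rangle\langle 0|_c \otimes \frac{I}{d} + (1-p)\,|1\rangle\langle 1|_c \otimes \frac{I}{d} + \frac{\sqrt{p(1-p)}}{d^2}\,|0\rangle\langle 1|_c \otimes \sum_j \mathrm{Tr}[U_j \rho]\, \frac{U_j^\dagger}{d} \\
&\qquad\qquad + \frac{\sqrt{p(1-p)}}{d^2}\,|1\rangle\langle 0|_c \otimes \sum_j \mathrm{Tr}[\rho U_j^\dagger]\, \frac{U_j}{d} \\
&= \big( p\,|0\rangle\langle 0|_c + (1-p)\,|1\rangle\langle 1|_c \big) \otimes \frac{I}{d} + \sqrt{p(1-p)}\, \big( |0\rangle\langle 1|_c + |1\rangle\langle 0|_c \big) \otimes \frac{\rho}{d^2} .
\end{aligned}
\tag{5}
\end{equation}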
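As a numerical sanity check of Eq. (5), the following minimal sketch builds the Kraus operators of Eq. (4) for a qubit ($d = 2$), taking the four Pauli matrices as the orthogonal unitaries $U_i$, and compares the resulting output with the closed form. The function names `switch_output` and `eq5_prediction` are illustrative choices, not part of the Letter, and the tensor ordering control ⊗ target follows the way Eq. (5) is written.

```python
import itertools
import numpy as np

# The four Pauli matrices: d^2 = 4 mutually orthogonal unitaries for d = 2.
PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
P0 = np.diag([1, 0]).astype(complex)  # |0><0| on the control
P1 = np.diag([0, 1]).astype(complex)  # |1><1| on the control


def control_state(p):
    """Pure control state sqrt(p)|0> + sqrt(1-p)|1> as a density matrix."""
    psi = np.array([np.sqrt(p), np.sqrt(1 - p)], dtype=complex)
    return np.outer(psi, psi.conj())


def switch_output(rho, p):
    """Sum_ij W_ij (rho_c (x) rho) W_ij^dagger with W_ij taken from Eq. (4)."""
    d = 2
    joint = np.kron(control_state(p), rho)  # ordering: control (x) target
    out = np.zeros_like(joint)
    for Ui, Uj in itertools.product(PAULIS, repeat=2):
        W = (np.kron(P0, Ui @ Uj) + np.kron(P1, Uj @ Ui)) / d**2
        out += W @ joint @ W.conj().T
    return out


def eq5_prediction(rho, p):
    """Closed form on the last line of Eq. (5)."""
    d = 2
    diag = np.diag([p, 1 - p]).astype(complex)
    off = np.sqrt(p * (1 - p)) * PAULIS[1]  # |0><1| + |1><0|
    return np.kron(diag, np.eye(d) / d) + np.kron(off, rho / d**2)
```

For any qubit density matrix `rho` and any `p` in [0, 1], `np.allclose(switch_output(rho, p), eq5_prediction(rho, p))` is expected to hold; the off-diagonal control blocks are the only place where `rho` survives.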

The first equality follows from Eq. (4). The second equality is the application of the depolarising channel in Eq. (3). Finally, the last equality follows from the fact that the operators $U_j$ form an orthonormal basis for the set of $d \times d$ matrices.

Eq. (5) shows that the quantum SWITCH of two depolarising channels has a clear dependence on the input state $\rho$. Therefore, the HSW theorem implies that we can communicate classical information at a non-zero rate.

The quantum SWITCH implements a transfer of information from the input system to the correlations between the output system and the control. Note that the information is not contained in the state of the system alone, nor in the state of the control alone: it is genuinely contained in the correlations. Note also that these correlations must be quantum: if the control decoheres in the basis $\{|0\rangle, |1\rangle\}$, the information is completely lost. In spite of this, the receiver does not need to perform entangled operations in the decoding. Instead, the receiver measures the control system in the Fourier basis $\{|+\rangle, |-\rangle\}$, obtaining the conditional states

\begin{equation}
\langle \pm |_c\, \mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\rho \otimes \rho_c)\, | \pm \rangle_c = \frac{I}{2d} \pm \sqrt{p(1-p)}\, \frac{\rho}{d^2} . \tag{6}
\end{equation}

Since these states still depend on $\rho$, the receiver can use this dependence to extract non-zero information from the target system. In terms of the Kraus operators (4), the postselection on the outcomes $+$ and $-$ generates the noisy channel generalisation of the quantum superpositions of time evolutions proposed by Aharonov and collaborators [22].

So far we have shown that the quantum SWITCH allows one to use depolarising channels to communicate at some non-zero rate. We now compute the optimal rate in the case of product encodings, by analytically calculating the maximum Holevo information over all input ensembles. We restrict our attention to the case where the control qubit is in the state $\rho_c = |+\rangle\langle +|$, because for this state the communication rate is the highest. The expression for the maximum Holevo information is derived in the Supplemental Material, where, in fact, we derive an even more general expression, valid for arbitrary depolarizing channels, sending an input state $\rho$ to an output state $q\rho + (1-q) I/d$, with $0 \le q \le 1$ a generic noise parameter. For the completely depolarizing channel ($q = 0$), we find the Holevo information to be

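\begin{equation}
\chi(\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)) = \log d + H(\tilde{\rho}_c) + \frac{d+1}{2d^2} \log \frac{d+1}{2d^2} + \frac{d-1}{2d^2} \log \frac{d-1}{2d^2} + 2(d-1)\, \frac{1}{2d} \log \frac{1}{2d} , \tag{7}
\end{equation}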

where $\tilde{\rho}_c = \tfrac{1}{2} |0\rangle\langle 0| + \tfrac{1}{2} |1\rangle\langle 1| + \tfrac{1}{2d^2} (|0\rangle\langle 1| + |1\rangle\langle 0|)$ is the reduced state of the control system. It should not be surprising that the entropy of the control system appears in the expression for the capacity of the switched depolarising channels. This is because the control system is a parameter describing the way in which the depolarising channels are combined.

Eq. (7) is the best rate at which one can communicate classical information by switching only two copies of the depolarising channel. However, if one has access to more copies of these channels, one may be able to communicate more by inputting states that are entangled across the channels.
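To get a feeling for the size of the effect, Eq. (7) can be evaluated directly. The sketch below assumes that the logarithms are taken in base 2, so that the rate is measured in bits; the function names are illustrative and not taken from the Letter.

```python
import math


def control_entropy(d):
    """Entropy of the reduced control state, whose eigenvalues are 1/2 +- 1/(2 d^2)."""
    lam_plus = 0.5 + 0.5 / d**2
    lam_minus = 0.5 - 0.5 / d**2
    return -(lam_plus * math.log2(lam_plus) + lam_minus * math.log2(lam_minus))


def holevo_switched(d):
    """Right-hand side of Eq. (7) for dimension d >= 2, in bits (base-2 logs assumed)."""
    a = (d + 1) / (2 * d**2)
    b = (d - 1) / (2 * d**2)
    c = 1 / (2 * d)
    return (math.log2(d) + control_entropy(d)
            + a * math.log2(a) + b * math.log2(b)
            + 2 * (d - 1) * c * math.log2(c))


for d in (2, 3, 4):
    print(d, holevo_switched(d))
```

Under this assumption the qubit case gives a rate of roughly 0.049 bits, and the value shrinks as `d` grows, in line with the statement that the Holevo information is largest for qubit channels.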

To show this would require a proof that the overall mapping generated by switching depolarising channels is not additive. Since this question is separate from the main point of this Letter, we leave this task for future investigation.

Conclusions and Discussion – In this work, we explored an extension of quantum Shannon theory where communication channels can be combined in a quantum superposition of orders. In this extended model, we showed that combining two completely depolarizing channels with the quantum SWITCH activates them, allowing the transmission of classical information. In contrast, no such activation is possible in the standard model, where the order is fixed or controlled by a classical random variable.

Strikingly, we showed that the Shannon theoretic advantage can be gained as a result of creating a superposition of a channel with another copy of itself. This result may seem paradoxical, because exchanging two uses of the same channel would not have any observable effect in any ordinary quantum circuit. The resolution of the paradox lies in the fact that noisy quantum channels can be seen as random mixtures of different processes, corresponding to different Kraus operators. The advantage of the self-switching arises because some of these processes do not commute with each other, and therefore a quantum control on the order offers a non-trivial resource. We observe that no self-switching effect arises for quantum channels that admit a Kraus decomposition consisting of mutually commuting operators.

Our results are an invitation to investigate a new paradigm of Shannon theory, where the order of the communication channels can be in a quantum superposition. This paradigm may find applications in future quantum communication networks. Consider a situation where a provider connects different communicating parties through a network of quantum channels. In this situation, the provider could opt to connect the channels in a superposition of alternative configurations, thereby boosting the communication rates between parties. Of course, every such application requires a careful analysis of the physical resources required for the implementation of the quantum SWITCH. While in this work we treated the quantum SWITCH as an abstract higher-order operation, there exist different ways in which this operation could be realized, including table-top photonic implementations [23, 24], implementations with ion traps [25] and superconducting circuits [26]. The practical extent of the advantage shown in our paper greatly depends on the resources required in each implementation. For example, Oreshkov [27] has recently analyzed the structure of the photonic implementations of [23, 24], showing that an essential ingredient is the ability to delocalize the input channels in time, coherently controlling when the environment interacts with the system. On the other hand, our result provides new motivation for the development of experimental techniques for the implementation of the quantum SWITCH.

We thank Philippe Grangier, Renato Renner, Aephraim Steinberg, Ognyan Oreshkov, Philip Walter, Giulia Rubino, Paolo Perinotti, Michal Sedlák, Matt Leifer and Christa Zoufal, for helpful and engaging discussions during the "Hong Kong Workshop on Quantum Information and Foundations 2018". This work is supported by the National Natural Science Foundation of China through grant 11675136, the Croucher Foundation, the Canadian Institute for Advanced Research (CIFAR), the Hong Kong Research Grant Council through grant 17326616, and the Foundational Questions Institute through grant FQXi-RFP3-1325. This publication was made possible through the support of a grant from the John Templeton Foundation. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation.

[1] C. E. Shannon, Bell System Technical Journal 27, 379 (1948).
[2] M. M. Wilde, Quantum information theory (Cambridge University Press, 2013).
[3] C. H. Bennett, P. W. Shor, J. A. Smolin, and A. V. Thapliyal, Physical Review Letters 83, 3081 (1999).
[4] C. H. Bennett and G. Brassard, in Conf. on Computers, Systems and Signal Processing (Bangalore, India, Dec. 1984) (1984), pp. 175–9.
[5] C. H. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres, and W. K. Wootters, Physical Review Letters 70, 1895 (1993).
[6] G. Chiribella, G. M. D'Ariano, P. Perinotti, and B. Valiron, Physical Review A 88, 022318 (2013).
[7] O. Oreshkov, F. Costa, and Č. Brukner, Nature Communications 3, 1092 (2012).
[8] L. Hardy, arXiv preprint gr-qc/0509120 (2005).
[9] L. Hardy, Quantum Reality, Relativistic Causality, and Closing the Epistemic Circle 73, 379 (2009).
[10] G. Chiribella, Physical Review A 86, 040301 (2012).
[11] M. Araújo, A. Feix, F. Costa, and Č. Brukner, New Journal of Physics 16, 093026 (2014).
[12] P. A. Guérin, A. Feix, M. Araújo, and Č. Brukner, Physical Review Letters 117, 100502 (2016).
[13] O. Oreshkov and C. Giarmatzi, New Journal of Physics 18, 093020 (2016).
[14] A. S. Holevo, IEEE Transactions on Information Theory 44, 269 (1998).
[15] B. Coecke, T. Fritz, and R. W. Spekkens, Information and Computation 250, 59 (2016).
[16] G. Smith and J. Yard, Science 321, 1812 (2008).
[17] G. Smith, J. A. Smolin, and A. Winter, IEEE Transactions on Information Theory 54, 4208 (2008).
[18] T. S. Cubitt, D. Leung, W. Matthews, and A. Winter, IEEE Transactions on Information Theory 57, 5509 (2011).
[19] R. Duan and A. Winter, IEEE Transactions on Information Theory 62, 891 (2016).
[20] B. Schumacher and M. D. Westmoreland, Physical Review A 56, 131 (1997).
[21] M. B. Hastings, Nature Physics 5, 255 (2009).
[22] Y. Aharonov, J. Anandan, S. Popescu, and L. Vaidman, Physical Review Letters 64, 2965 (1990).
[23] L. M. Procopio, A. Moqanaki, M. Araújo, F. Costa, I. A. Calafell, E. G. Dowd, D. R. Hamel, L. A. Rozema, Č. Brukner, and P. Walther, Nature Communications 6, 7913 (2015).
[24] G. Rubino, L. A. Rozema, A. Feix, M. Araújo, J. M. Zeuner, L. M. Procopio, Č. Brukner, and P. Walther, Science Advances 3, e1602589 (2017).
[25] N. Friis, V. Dunjko, W. Dür, and H. J. Briegel, Physical Review A 89, 030303 (2014).
[26] N. Friis, A. A. Melnikov, G. Kirchmair, and H. J. Briegel, Scientific Reports 5, 18036 (2015).
[27] O. Oreshkov, arXiv preprint arXiv:1801.07594 (2018).

Lower bound for the classical capacity of the Quantum SWITCH of two depolarizing channels

In this supplemental material we give the expression for the Holevo information of two depolarising channels, used in a Quantum SWITCH with a control qubit initially in the state $|+\rangle$. In order to do this, we first show how such an operation acts on a general quantum state. A generic depolarising channel can be represented as

\begin{align}
\mathcal{N}_q^D(\rho) &= q \cdot \rho + (1-q) \cdot \mathrm{Tr}[\rho]\, \frac{I}{d} \tag{8} \\
&= q \cdot \rho + \frac{1-q}{d^2} \sum_{i=1}^{d^2} U_i \rho U_i^\dagger , \tag{9}
\end{align}

where $\{U_i\}_{i=1}^{d^2}$ are unitary operators and form an orthonormal basis of the space of $d \times d$ matrices.

When two depolarizing channels are inserted into the Quantum SWITCH, one obtains a quantum channel $\mathcal{S}(\mathcal{N}_q^D, \mathcal{N}_q^D)$, whose Kraus operators are given by Eq. (4). Introducing the notation $U_0 := \frac{d\sqrt{q}}{\sqrt{1-q}} \cdot I$, we can express the Kraus operators as

\begin{equation}
W_{ij} = \frac{1-q}{d^2} \left( U_i U_j \otimes |0\rangle\langle 0|_c + U_j U_i \otimes |1\rangle\langle 1|_c \right) , \qquad \forall\, i, j \in \{0, 1, \dots, d^2\} . \tag{10}
\end{equation}

Let the control system be in the pure state $\rho_c = p\,|0\rangle\langle 0| + (1-p)\,|1\rangle\langle 1| + \sqrt{p(1-p)}\,(|0\rangle\langle 1| + |1\rangle\langle 0|)$. Using this control system, the action of the new channel $\mathcal{S}(\mathcal{N}_q^D, \mathcal{N}_q^D)$ on a generic state $\rho$ is

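\begin{equation}
\begin{aligned}
\mathcal{S}(\mathcal{N}_q^D, \mathcal{N}_q^D)(\rho \otimes \rho_c)
&= \frac{(1-q)^2}{d^4} \sum_{i,j=1}^{d^2} \Big( p\,|0\rangle\langle 0| \otimes U_i U_j \rho U_j^\dagger U_i^\dagger + (1-p)\,|1\rangle\langle 1| \otimes U_j U_i \rho U_i^\dagger U_j^\dagger \\
&\qquad\qquad + \sqrt{p(1-p)}\,|0\rangle\langle 1| \otimes U_i U_j \rho U_i^\dagger U_j^\dagger + \sqrt{p(1-p)}\,|1\rangle\langle 0| \otimes U_j U_i \rho U_j^\dagger U_i^\dagger \Big) \\
&\quad + \frac{q(1-q)}{d^2} \sum_{i=1}^{d^2} \Big( p\,|0\rangle\langle 0| \otimes U_i \rho U_i^\dagger + (1-p)\,|1\rangle\langle 1| \otimes U_i \rho U_i^\dagger \\
&\qquad\qquad + \sqrt{p(1-p)}\,|0\rangle\langle 1| \otimes U_i \rho U_i^\dagger + \sqrt{p(1-p)}\,|1\rangle\langle 0| \otimes U_i \rho U_i^\dagger \Big) \\
&\quad + \frac{q(1-q)}{d^2} \sum_{j=1}^{d^2} \Big( p\,|0\rangle\langle 0| \otimes U_j \rho U_j^\dagger + (1-p)\,|1\rangle\langle 1| \otimes U_j \rho U_j^\dagger \\
&\qquad\qquad + \sqrt{p(1-p)}\,|0\rangle\langle 1| \otimes U_j \rho U_j^\dagger + \sqrt{p(1-p)}\,|1\rangle\langle 0| \otimes U_j \rho U_j^\dagger \Big) \\
&\quad + q^2 \cdot \rho \otimes \rho_c .
\end{aligned}
\tag{11}
\end{equation}

The four contributions arise in the following way: first, we fix both $i \neq 0$ and $j \neq 0$. Second, we fix $j = 0$ and $i \neq 0$. Third, we swap the roles to $i = 0$ and $j \neq 0$. Finally, both $i = 0$ and $j = 0$ gives the last term. Since the randomization over the set of unitaries $U_{j \neq 0}$ completely depolarises the state, i.e. $\frac{1}{d^2} \sum_i U_i \rho U_i^\dagger = \frac{I}{d}$, the only non-trivial contribution is the first part, for which $i \neq 0$ and $j \neq 0$. However, this term is the same as Eq. (5) in the main manuscript. This gives the following output of the SWITCH with two depolarising channels

\begin{equation}
\begin{aligned}
\mathcal{S}(\mathcal{N}_q^D, \mathcal{N}_q^D)(\rho \otimes \rho_c)
&= (1-q)^2 \left[ \big( p\,|0\rangle\langle 0|_c + (1-p)\,|1\rangle\langle 1|_c \big) \otimes \frac{I}{d} + \sqrt{p(1-p)}\, \big( |0\rangle\langle 1|_c + |1\rangle\langle 0|_c \big) \otimes \frac{\rho}{d^2} \right] \\
&\quad + 2q(1-q) \left( \rho_c \otimes \frac{I}{d} \right) + q^2 \rho \otimes \rho_c .
\end{aligned}
\tag{12}
\end{equation}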
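Eq. (12) can be checked numerically in the qubit case. The sketch below is an illustration under stated assumptions: it implements the general quantum SWITCH of Eqs. (1) and (2) for arbitrary Kraus sets, uses the Pauli matrices as the unitaries $U_i$, and orders tensor factors as control ⊗ target throughout. The helper names (`depolarizing_kraus`, `quantum_switch`, `eq12_prediction`) are hypothetical.

```python
import numpy as np

PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
P0 = np.diag([1, 0]).astype(complex)  # |0><0| on the control
P1 = np.diag([0, 1]).astype(complex)  # |1><1| on the control


def depolarizing_kraus(q):
    """Kraus operators of rho -> q*rho + (1-q)*I/2, following Eq. (9) with d = 2."""
    d = 2
    return [np.sqrt(q) * PAULIS[0]] + [np.sqrt(1 - q) / d * U for U in PAULIS]


def quantum_switch(kraus1, kraus2, rho, rho_c):
    """Eqs. (1)-(2): channel 1 acts first when the control is |0>, channel 2 first for |1>."""
    joint = np.kron(rho_c, rho)
    out = np.zeros_like(joint)
    for K1 in kraus1:
        for K2 in kraus2:
            W = np.kron(P0, K2 @ K1) + np.kron(P1, K1 @ K2)
            out += W @ joint @ W.conj().T
    return out


def eq12_prediction(rho, rho_c, q, p):
    """Closed form of Eq. (12) for d = 2, written with the control as first factor."""
    d = 2
    diag = np.diag([p, 1 - p]).astype(complex)
    off = np.sqrt(p * (1 - p)) * PAULIS[1]
    switched = np.kron(diag, np.eye(d) / d) + np.kron(off, rho / d**2)
    return ((1 - q)**2 * switched
            + 2 * q * (1 - q) * np.kron(rho_c, np.eye(d) / d)
            + q**2 * np.kron(rho_c, rho))
```

Calling `quantum_switch(depolarizing_kraus(q), depolarizing_kraus(q), rho, rho_c)` with `rho_c` the pure control state of parameter `p` should agree with `eq12_prediction(rho, rho_c, q, p)` up to floating-point error.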

We can now proceed to computing the best rate at which this channel can communicate classical information. To this purpose, we compute the quantum mutual information of the classical-quantum state

\begin{equation}
\sigma_{XBC} := \sum_x p_x\, |x\rangle\langle x|_X \otimes \mathcal{S}(\mathcal{N}_q^D, \mathcal{N}_q^D)(\rho_A^x \otimes \rho_c) , \tag{13}
\end{equation}

where the subscripts denote explicitly the Hilbert spaces of the state. It has been shown that it is sufficient to optimise the Holevo information over classical-quantum states with conditional states that are pure (see e.g. Theorem 13.3.2 of [2]). The Holevo information is a lower bound of the classical capacity of a channel. Hence, in order to bound the classical capacity of the channel $\mathcal{S}(\mathcal{N}_q^D, \mathcal{N}_q^D)$, we find the upper limit of the Holevo information. To this end, we define the following extended input state with pure conditional states

\begin{equation}
\omega_{XIJAC} = \frac{1}{d^2} \sum_{x,i,j} p_x\, |x\rangle\langle x|_X \otimes |i\rangle\langle i|_I \otimes |j\rangle\langle j|_J \otimes X(i) Z(j)\, \Psi_A^x\, Z(j)^\dagger X(i)^\dagger \otimes \Phi_c , \tag{14}
\end{equation}

where systems $I$ and $J$ are two new classical registers and $\Psi_A^x$ and $\Phi_c$ are some pure states of the target and control systems, respectively. Besides, the probability distribution for the registers $I$ and $J$ was assumed to be uniform, leading to the pre-factor $1/d^2$. Here, $X(i)$ and $Z(j)$ denote the generalized Pauli operators acting on a vector $|l\rangle$ as

\begin{equation}
X(i)|l\rangle = |i \oplus l\rangle , \qquad Z(j)|l\rangle = e^{2\pi \mathrm{i}\, j l / d}\, |l\rangle . \tag{15}
\end{equation}
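The generalized Pauli operators of Eq. (15) are easy to construct explicitly. The following sketch (function names are illustrative) builds $X(i)$ and $Z(j)$ for arbitrary $d$, reading $\oplus$ as addition modulo $d$, and provides a `twirl` helper that averages a state over all $d^2$ products $X(i)Z(j)$, which is the uniform mixture appearing in Eq. (14).

```python
import numpy as np


def shift(i, d):
    """X(i)|l> = |(i + l) mod d>."""
    return np.roll(np.eye(d, dtype=complex), i, axis=0)


def clock(j, d):
    """Z(j)|l> = exp(2*pi*1j*j*l/d)|l>."""
    return np.diag(np.exp(2j * np.pi * j * np.arange(d) / d))


def weyl_operators(d):
    """All d^2 products X(i)Z(j), i, j = 0, ..., d-1."""
    return [shift(i, d) @ clock(j, d) for i in range(d) for j in range(d)]


def twirl(rho):
    """Uniform average of X(i)Z(j) rho Z(j)^dagger X(i)^dagger over all i, j."""
    d = rho.shape[0]
    return sum(U @ rho @ U.conj().T for U in weyl_operators(d)) / d**2
```

These operators satisfy $\mathrm{Tr}[U_a^\dagger U_b] = d\,\delta_{ab}$, so they are one possible choice for the orthogonal unitaries $U_i$ in Eqs. (3) and (9), and `twirl(rho)` returns the maximally mixed state for any normalized `rho`.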

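The mutual information $I(X;BC)_\sigma$ can be bounded as

\begin{equation}
\begin{aligned}
I(X;BC)_\sigma &= H(BC)_\sigma - H(BC|X)_\sigma \\
&\le H(BC)_\omega - H(BC|X)_\sigma \\
&= \log d + H(\tilde{\rho}_c) - H(BC|X)_\sigma .
\end{aligned}
\tag{16}
\end{equation}

The inequality follows from concavity of the von Neumann entropy. The last line is a direct consequence of

\begin{equation}
\mathrm{Tr}_{XIJ}\big[ (\mathcal{S}(\mathcal{N}_q^D, \mathcal{N}_q^D) \otimes \mathcal{I})(\omega_{XIJAC}) \big] = \frac{I_A}{d} \otimes \tilde{\rho}_c , \tag{17}
\end{equation}

with

\begin{equation}
\tilde{\rho}_c = (1-q)^2 \left( p\,|0\rangle\langle 0|_c + (1-p)\,|1\rangle\langle 1|_c + \frac{\sqrt{p(1-p)}}{d^2}\, \big( |0\rangle\langle 1|_c + |1\rangle\langle 0|_c \big) \right) + q(2-q)\, \rho_c . \tag{18}
\end{equation}

Above, we assumed that $\Phi_c$ is a general pure state of the control. In Eq. (16) the control therefore enters through $\tilde{\rho}_c$, after $\sigma$ has been replaced by $\omega$ using concavity. To further upper bound the mutual information, we analyse the conditional entropy $H(BC|X)_\sigma$. Define $\theta_{x,i,j} := \mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(X(i) Z(j)\, \Psi_A^x\, Z(j)^\dagger X(i)^\dagger \otimes \Phi_c)$ and $\gamma_{x,i,j} := X(i) Z(j)\, \mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\Psi_A^x \otimes \Phi_c)\, Z(j)^\dagger X(i)^\dagger$. We find that

\begin{equation}
\begin{aligned}
H(BC|XIJ)_\omega &= \frac{1}{d^2} \sum_{x,i,j} p_x\, H(BC)_{\theta_{x,i,j}} \\
&\overset{(\star)}{=} \frac{1}{d^2} \sum_{x,i,j} p_x\, H(BC)_{\gamma_{x,i,j}} \\
&= \sum_x p_x\, H(BC)_{\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\Psi_A^x \otimes \Phi_c)} \\
&= H(BC|X)_\sigma .
\end{aligned}
\tag{19}
\end{equation}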

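The second line follows directly from Eq. (12): since the operators $X(i)$ and $Z(j)$ only act on the system state and leave the control invariant, the states $\theta_{x,i,j}$ and $\gamma_{x,i,j}$ coincide. The third line is a consequence of the von Neumann entropy being invariant under isometric transformations. Using the chain of equalities above, we can further upper bound Eq. (16) as

\begin{equation}
\begin{aligned}
I(X;BC)_\sigma &\le \log d + H(\tilde{\rho}_c) - H(BC|XIJ)_\omega \\
&= \log d + H(\tilde{\rho}_c) - \sum_x p_x\, H(BC)_{\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\Psi^x \otimes \Phi_c)} \\
&\le \log d + H(\tilde{\rho}_c) - \min_x H(BC)_{\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\Psi^x \otimes \Phi_c)} \\
&\le \log d + H(\tilde{\rho}_c) - H^{\min}(\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)) .
\end{aligned}
\tag{20}
\end{equation}

The first line combines Eq. (16) with Eq. (19). The second inequality follows because the expectation value can never be smaller than the minimum value. In the last inequality, we defined the quantity $H^{\min}(\mathcal{N}) = \min_\varphi H(\mathcal{N}(\varphi))$ for a channel $\mathcal{N}$ and input state $\varphi$. In the remainder of this section we compute $H^{\min}(\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D))$. We start from Eq. (11) and denote the right hand side with the matrix

\begin{equation}
\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\rho \otimes \rho_c) = \begin{pmatrix} A & B \\ B & \tilde{A} \end{pmatrix} \tag{21}
\end{equation}

with

\begin{equation}
\begin{aligned}
A &= \big( (1-q)^2 + 2q(1-q) \big)\, p \cdot I/d + q^2 p \cdot \rho \\
\tilde{A} &= \big( (1-q)^2 + 2q(1-q) \big) (1-p) \cdot I/d + q^2 (1-p) \cdot \rho \\
B &= (1-q)^2 \sqrt{p(1-p)} \cdot \rho/d^2 + 2q(1-q) \sqrt{p(1-p)} \cdot I/d + q^2 \sqrt{p(1-p)} \cdot \rho .
\end{aligned}
\tag{22}
\end{equation}

In the following we set $p = 1/2$ and retrieve $\rho_c = |+\rangle\langle +|$ from the main text. This leads to $A = \tilde{A}$, such that the resulting matrix is of the form $\begin{pmatrix} A & B \\ B & A \end{pmatrix}$. In order to find the eigenvalues of $\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\rho \otimes \rho_c)_{p=1/2}$, we use the following Lemma.

Lemma 1. Consider a $2d \times 2d$ matrix $M = \begin{pmatrix} A & B \\ B & A \end{pmatrix}$ acting on the vector space $V = \mathbb{C}^d \oplus \mathbb{C}^d$. Let $V = W_1 \oplus W_2$, with $W_1 = \{(v, v) : v \in \mathbb{C}^d\}$ and $W_2 = \{(v, -v) : v \in \mathbb{C}^d\}$. If $W_1$ and $W_2$ are invariant under the action of $M$, then the eigenvalues of $M$ are the union of the eigenvalues of $A + B$ and $A - B$.

Therefore, it is sufficient to find the eigenvalues of the operators $A + B$ and $A - B$, which for $p = 1/2$ read

\begin{equation}
\begin{aligned}
A + B &= \left( \frac{(1-q)^2}{2} + 2q(1-q) \right) \frac{I}{d} + \left( q^2 + \frac{(1-q)^2}{2d^2} \right) \rho \\
A - B &= \frac{(1-q)^2}{2}\, \frac{I}{d} - \frac{(1-q)^2}{2d^2}\, \rho .
\end{aligned}
\tag{23}
\end{equation}

Since $I$ and $\rho$ commute, there exists an invertible operator $T$ such that

\begin{equation}
\begin{aligned}
A &= T\, \mathrm{diag}(\lambda_1^A, \lambda_2^A, \dots, \lambda_d^A)\, T^{-1} \\
B &= T\, \mathrm{diag}(\lambda_1^B, \lambda_2^B, \dots, \lambda_d^B)\, T^{-1} ,
\end{aligned}
\tag{24}
\end{equation}

where $\mathrm{spec}(A) = \{\lambda_i^A\}_{i=1}^d$ and $\mathrm{spec}(B) = \{\lambda_j^B\}_{j=1}^d$ denote the eigenvalues of $A$ and $B$ respectively. It follows that

\begin{equation}
(A \pm B) = T \left[ \mathrm{diag}(\lambda_1^A, \lambda_2^A, \dots, \lambda_d^A) \pm \mathrm{diag}(\lambda_1^B, \lambda_2^B, \dots, \lambda_d^B) \right] T^{-1} \tag{25}
\end{equation}

such that the eigenvalues of $A \pm B$ read $\{\lambda_i^A \pm \lambda_i^B\}_{i=1}^d$. Since $A$ and $B$ only contain the identity matrix and $\rho$, we do not need to worry about the ordering of the eigenvalues, which is determined by the choice of $T$. Hence, the spectra are given by

\begin{equation}
\begin{aligned}
\mathrm{spec}\big( \mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\rho \otimes \rho_c)_{p=1/2} \big) &= \mathrm{spec}(A + B) \cup \mathrm{spec}(A - B) \\
&= \{\lambda_i^+\}_{i=1}^d \cup \{\lambda_i^-\}_{i=1}^d ,
\end{aligned}
\tag{26}
\end{equation}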

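where the eigenvalues $\{\lambda_i^+\}_{i=1}^d$ and $\{\lambda_i^-\}_{i=1}^d$ read

\begin{equation}
\begin{aligned}
\{\lambda_i^+\}_{i=1}^d &= \left\{ \frac{(1-q)^2 + 4q(1-q)}{2d} + \left( q^2 + \frac{(1-q)^2}{2d^2} \right) \lambda_i^\rho \right\}_{i=1}^d \\
\{\lambda_i^-\}_{i=1}^d &= \left\{ \frac{(1-q)^2}{2d^2}\, (d - \lambda_i^\rho) \right\}_{i=1}^d .
\end{aligned}
\tag{27}
\end{equation}

Above, we denoted the eigenvalues of the system input as $\mathrm{spec}(\rho) := \{\lambda_i^\rho\}_{i=1}^d$. Finally, this yields for the minimum entropy of the channel $\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)$

\begin{equation}
\begin{aligned}
H^{\min}\big( \mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\rho \otimes \rho_c) \big) &= \min_\rho H\big( \mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\rho \otimes \rho_c) \big) \\
&= \min_\rho\, - \sum_i \left( \lambda_i^+ \log \lambda_i^+ + \lambda_i^- \log \lambda_i^- \right) .
\end{aligned}
\tag{28}
\end{equation}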
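Lemma 1 and the eigenvalue formulas of Eq. (27) can be cross-checked numerically. The sketch below builds the blocks $A$ and $B$ of Eq. (22) at $p = 1/2$ for a state `rho` with known eigenvalues and returns the spectrum predicted by Eq. (27), so that it can be compared with a direct diagonalization of the block matrix of Eq. (21). Function names are illustrative.

```python
import numpy as np


def switch_blocks(rho, q):
    """Blocks A and B of Eq. (22) evaluated at p = 1/2."""
    d = rho.shape[0]
    eye = np.eye(d)
    A = 0.5 * ((1 - q)**2 + 2 * q * (1 - q)) * eye / d + 0.5 * q**2 * rho
    B = (0.5 * (1 - q)**2 * rho / d**2
         + q * (1 - q) * eye / d
         + 0.5 * q**2 * rho)
    return A, B


def predicted_spectrum(rho_eigs, q):
    """Sorted union of the eigenvalues lambda_i^+ and lambda_i^- from Eq. (27)."""
    lam = np.asarray(rho_eigs, dtype=float)
    d = len(lam)
    plus = (((1 - q)**2 + 4 * q * (1 - q)) / (2 * d)
            + (q**2 + (1 - q)**2 / (2 * d**2)) * lam)
    minus = (1 - q)**2 / (2 * d**2) * (d - lam)
    return np.sort(np.concatenate([plus, minus]))
```

For a given `rho`, one can form `M = np.block([[A, B], [B, A]])` and compare `np.linalg.eigvalsh(M)` with `predicted_spectrum`; the same numbers should also come out as the union of the spectra of `A + B` and `A - B`, as Lemma 1 states.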

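Since the entropy $H(\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\rho \otimes \rho_c))$ is a concave function of the eigenvalues $\lambda_i^\rho$, it has no local minimum in the interior of the domain, and its minimum value is attained at the border of the interval $[0, 1]^{\times d}$. With the restriction that the eigenvalues have to sum up to one, the only valid points on the edges are

\begin{equation}
\lambda_i^\rho = 1 , \qquad \lambda_j^\rho = 0 \quad \forall\, j \neq i . \tag{29}
\end{equation}

We note that all $d$ of those points are equivalent. This shows that the optimal $\rho$ that minimizes $H(\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)(\rho \otimes \rho_c))$ is a pure state. Therefore, we finally obtain

\begin{equation}
\begin{aligned}
H^{\min}(\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)) = - \Bigg\{ & \frac{d + 1 + q(d-1)(2 + q(2d-1))}{2d^2} \log \frac{d + 1 + q(d-1)(2 + q(2d-1))}{2d^2} \\
& + (d-1)\, \frac{(1-q)^2 + 4q(1-q)}{2d} \log \frac{(1-q)^2 + 4q(1-q)}{2d} \\
& + \frac{(1-q)^2}{2d^2}\, (d-1) \log \left( \frac{(1-q)^2}{2d^2}\, (d-1) \right) \\
& + (d-1)\, \frac{(1-q)^2}{2d} \log \frac{(1-q)^2}{2d} \Bigg\} .
\end{aligned}
\tag{30}
\end{equation}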

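Therefore, we find the following upper bound for the mutual information

\begin{equation}
I(X;B) \le \log d + H(\tilde{\rho}_c) - H^{\min}(\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)) . \tag{31}
\end{equation}

It is left to find the state ensemble $\{p_x, \rho_x\}$ that achieves this bound. We directly find that picking the family $\{\rho_x\}$ as $d$ pure orthonormal states indeed satisfies Eq. (31) with equality. Altogether, the state $\omega_{XA} = \frac{1}{d} \sum_i |i\rangle\langle i| \otimes |i\rangle\langle i|$ is optimal. We find the chi-quantity as a function of the depolarising parameter $q$ to be

\begin{equation}
\begin{aligned}
\chi(\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)) &= \max_\rho I(X;B) \\
&= \log d + H(\tilde{\rho}_c) - H^{\min}(\mathcal{S}(\mathcal{N}^D, \mathcal{N}^D)) .
\end{aligned}
\tag{32}
\end{equation}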
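Eq. (32) can be evaluated for any noise parameter $q$ by combining the control state of Eq. (18) at $p = 1/2$ with the eigenvalues of Eq. (27) for a pure input. The sketch below assumes base-2 logarithms (rates in bits) and uses the convention $0 \log 0 = 0$; the function names are illustrative.

```python
import math


def entropy(eigs):
    """Shannon entropy in bits of a list of eigenvalues, with 0 log 0 = 0."""
    return -sum(x * math.log2(x) for x in eigs if x > 1e-15)


def control_entropy(q, d):
    """H of the control state in Eq. (18) at p = 1/2: eigenvalues 1/2 +- off-diagonal."""
    off = (1 - q)**2 / (2 * d**2) + q * (2 - q) / 2
    return entropy([0.5 + off, 0.5 - off])


def min_output_entropy(q, d):
    """H^min from Eq. (27), evaluated for a pure input (one eigenvalue 1, the rest 0)."""
    rho_eigs = [1.0] + [0.0] * (d - 1)
    base_plus = ((1 - q)**2 + 4 * q * (1 - q)) / (2 * d)
    slope_plus = q**2 + (1 - q)**2 / (2 * d**2)
    pref_minus = (1 - q)**2 / (2 * d**2)
    eigs = [base_plus + slope_plus * lam for lam in rho_eigs]
    eigs += [pref_minus * (d - lam) for lam in rho_eigs]
    return entropy(eigs)


def holevo(q, d):
    """Right-hand side of Eq. (32)."""
    return math.log2(d) + control_entropy(q, d) - min_output_entropy(q, d)
```

At `q = 0` this reduces to Eq. (7) of the main text, and at `q = 1` (noiseless channels) it returns `log2(d)`, as expected for a perfect $d$-dimensional channel.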