
Fluctuation-dissipation relations for thermodynamic distillation processes

Tanmoy Biswas1, A. de Oliveira Junior2, Michał Horodecki1, and Kamil Korzekwa2

1International Centre for Theory of Quantum Technologies, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland.
2Faculty of Physics, Astronomy and Applied Computer Science, Jagiellonian University, 30-348 Kraków, Poland.
May 26, 2021

The fluctuation-dissipation theorem is a fundamental result in statistical physics that establishes a connection between the response of a system subject to a perturbation and the fluctuations associated with observables in equilibrium. Here we derive its version within a resource-theoretic framework, where one investigates optimal quantum state transitions under thermodynamic constraints. More precisely, for a fixed transformation error, we prove a relation between the minimal amount of free energy dissipated in a thermodynamic distillation process and the free energy fluctuations of the initial state of the system. Our results apply to initial states given by either asymptotically many identical pure systems or an arbitrary number of independent energy-incoherent systems, and allow not only for a state transformation, but also for a change of Hamiltonian. The fluctuation-dissipation relations we derive enable us to find the optimal performance of thermodynamic protocols such as work extraction, information erasure and thermodynamically-free communication, up to second-order asymptotics in the number $N$ of processed systems. We thus provide a first rigorous analysis of these thermodynamic protocols for quantum states with coherence between different energy eigenstates in the intermediate regime of large but finite $N$.

arXiv:2105.11759v1 [quant-ph] 25 May 2021

1 Introduction

Thermodynamics has been remarkably successful, impacting the natural sciences and enabling technologies that range from coolers to spaceships. As a theory of macroscopic systems in equilibrium, it presents us with a compelling picture of what state transformations are allowed in terms of a small number of macroscopic quantities, such as work and entropy [1,2]. The drawback of the macroscopic description is that thermodynamics inevitably deals with average quantities, and as systems get smaller, fluctuations of these quantities become increasingly relevant, requiring a new description [3,4]. Going beyond the original scenario of equilibrium thermodynamics led scientists to investigate fluctuations around these averages and their impact on the system dynamics. This line of research dates back to Einstein and Smoluchowski, who derived the connection between fluctuations and dissipation effects for Brownian particles [5,6]. It is now well known that, near equilibrium, linear response theory provides a general proof of the fluctuation-dissipation theorem, which states that the response of a given system subject to an external perturbation is expressed in terms of the fluctuation properties of the system in thermal equilibrium [7,8]. The theoretical description underlying the fluctuation-dissipation relations is usually expressed in terms of the stochastic character of thermodynamic variables. This approach is strongly motivated since it is experimentally viable [9, 10].

On the other hand, a complementary approach is based on resource theories [11–14]. It aims to go beyond the thermodynamic limit and the assumption of equilibrium, and is often presented as an extension of statistical mechanics to scenarios with large fluctuations, the so-called single-shot regime [15, 16]. A natural question is then whether fluctuation-dissipation relations are present in such a resource-theoretic description. Although important insights have been obtained in trying to connect the information-theoretic and fluctuation theorem approaches [17, 18], they have, so far, not been explicitly related to dissipation. Resource-theoretic analysis of dissipation was performed independently [19–22], where the authors investigated irreversibility of thermodynamic processes due to finite-size effects. However, these results were obtained for the quasi-classical case of energy-incoherent states, and so they are not able to account for quantum effects that come into play when dealing with even smaller systems, when fluctuations around thermodynamic averages are no longer just thermal in their origin.

This work makes a step forward towards a genuinely quantum framework characterising optimal thermodynamic state transformations and links fluctuations with free energy dissipation. We investigate a special case of state interconversion processes known as thermodynamic distillations. These are thermodynamic processes in which a given initial quantum system is transformed, with a given transformation error, to a pure energy eigenstate of the final system. In particular, we focus on the initial system

consisting of $N$ non-interacting subsystems that are either energy-incoherent and non-identical (in different states and with different Hamiltonians), or pure and identical. Within this setting, our main results are given by two fluctuation-dissipation theorems. The main message behind both theorems is a precise relation between the free energy fluctuations of the initial state and the minimal amount of free energy dissipated in a thermodynamic distillation process for a given transformation error (see Box 1). The first theorem applies to an arbitrary number of independent energy-incoherent states, while the second one holds for asymptotically many identical pure states. Furthermore, our findings also provide new tools to study approximate transformations and corresponding asymptotic rates. Here, we not only extend previous distillation results [20] to non-identical systems, but also to genuinely quantum states in superposition of different energy eigenstates.

Box 1: Fluctuation-dissipation relation for thermodynamic distillation processes

In the optimal thermodynamic process of $\epsilon$-approximate transformation from many independent non-equilibrium systems into systems without fluctuations of free energy, the dissipated free energy satisfies
$$F^{\rm tot}_{\rm diss} = a(\epsilon)\,\sigma^{\rm tot}(F), \tag{1}$$
where $\sigma^{\rm tot}(F)$ is the free energy fluctuation in the initial state, and $a(\epsilon) = -\Phi^{-1}(\epsilon)$, with $\Phi^{-1}(x)$ being the inverse of the Gaussian cumulative distribution function.

[Plot: $a(\epsilon)$ as a function of $\epsilon \in [0, 1]$.]

We have proved such a statement for many independent systems in arbitrary incoherent states, as well as for many independent and identical systems in the same pure state. We conjecture that this is true for independent systems in arbitrary mixed states.

Our results allow for a rigorous study of important thermodynamic protocols. First of all, we extend the analysis of work extraction to the regime of not necessarily identical incoherent states and to pure states. By directly applying our main results, we obtain a second-order asymptotic expression for the optimal transformation error while extracting a given amount of work per copy of the initial subsystem. Moreover, we also verify the accuracy of the obtained expression by comparing it with the numerically optimised work extraction process. As a second application, we analyse the optimal cost of erasing $N$ independent bits prepared in an arbitrary state. In this case, we obtain the optimal transformation error of the erasure process as a function of invested work. The last application we consider is the optimal thermodynamically-free communication rate, i.e., the optimal encoding of information into a quantum system without using any extra thermodynamic resources. Applying our theorems gives us the optimal number of messages that can be encoded into a quantum system in a thermodynamically-free way, which we show to be directly related to the non-equilibrium free energy of the system. This result can be interpreted as the inverse of the Szilard engine, as in this process we use the ability to perform work to encode information. Furthermore, our results connect the fluctuations of free energy and the optimal average decoding error.

The paper is organised as follows. We start by recalling the resource-theoretic approach to thermodynamics in Section 2 and introducing the necessary concepts used in the applications. In Section 3, we state our two main results concerning the fluctuation-dissipation relation for incoherent and coherent states, discuss their thermodynamic interpretation and apply them to three thermodynamic protocols: work extraction, information erasure and thermodynamically-free communication. The derivation of the main results can be found in Section 4. Finally, we conclude with an outlook in Section 5.

2 Setting the scene

2.1 Thermodynamic distillation processes

In order to formally define the thermodynamic distillation process, we first need to identify the set of thermodynamically-free states and transformations. By definition, a state of the system that is in equilibrium with a thermal environment $E$ at inverse temperature $\beta$ is a free state. Therefore, for a system described by a Hamiltonian $H$, the only free state is given by the thermal Gibbs state
$$\gamma = \frac{e^{-\beta H}}{Z}, \qquad Z = \mathrm{Tr}\left(e^{-\beta H}\right). \tag{2}$$
The set of free transformations that we consider is given by thermal operations [11, 13, 23], which act on the system as
$$\mathcal{E}(\rho) = \mathrm{Tr}_{E'}\left(U\left(\rho \otimes \gamma_E\right)U^{\dagger}\right), \tag{3}$$
where $U$ is a joint unitary acting on the system and the thermal environment $E$ that is described by a Hamiltonian $H_E$ and is prepared in a thermal Gibbs state

$\gamma_E$ at inverse temperature $\beta$. Moreover, $U$ commutes with the total Hamiltonian of the system and bath, $[U, H \otimes \mathbb{1}_E + \mathbb{1} \otimes H_E] = 0$, and we discard any subsystem $E'$ of the joint system of the considered system and environment.

A thermodynamic distillation process is a thermodynamically free transformation from a general initial system described by a Hamiltonian $H$ and prepared in a state $\rho$, to a target system described by a Hamiltonian $\tilde{H}$ and in a state $\tilde{\rho}$ that is an eigenstate of $\tilde{H}$ (see Footnote 1). An $\epsilon$-approximate thermodynamic distillation process from $(\rho, H)$ to $(\tilde{\rho}, \tilde{H})$ is a thermal operation that transforms the initial system $(\rho, H)$ to the final system with Hamiltonian $\tilde{H}$ and in a state $\epsilon$ away from $\tilde{\rho}$ in the infidelity distance $\delta$,
$$\delta(\rho_1, \rho_2) := 1 - \left(\mathrm{Tr}\sqrt{\sqrt{\rho_1}\,\rho_2\,\sqrt{\rho_1}}\right)^2. \tag{4}$$
We say that $\rho$ is energy incoherent if it is a convex combination of eigenstates of $H$.

In this paper, we will study the distillation process from $N$ independent initial systems to arbitrary target systems, e.g., to $\tilde{N}$ independent target systems as illustrated in Fig. 1. In particular, we will be interested in the asymptotic behaviour for large $N$. Thus, our distillation setting is specified by a family of initial and target systems indexed by a natural number $N$. Each initial system $(\rho^N, H^N)$ consists of $N$ non-interacting subsystems with the total Hamiltonian $H^N$ and a state $\rho^N$ given by
$$H^N = \sum_{n=1}^{N} H^N_n, \qquad \rho^N = \bigotimes_{n=1}^{N} \rho^N_n, \tag{5}$$
while each target system is described by an arbitrary Hamiltonian $\tilde{H}^N$ and a state $\tilde{\rho}^N = |\tilde{E}^N_k\rangle\langle\tilde{E}^N_k|$, with $|\tilde{E}^N_k\rangle$ being some eigenstate $k$ of $\tilde{H}^N$.

Figure 1: Thermodynamic distillation process. The arrow depicts the existence of a thermal operation transforming $N$ independent initial systems to $\tilde{N}$ independent target systems. The colours representing the initial and target systems indicate that each subsystem is described by a different Hamiltonian and prepared in a different state.

A typical example of this setting is when initial and target systems are given by copies of independent and identical subsystems. More precisely, in this case, the family of initial systems is given by $H^N$ with $H^N_n = H$ and $\rho^N = \rho^{\otimes N}$, while the family of target systems is given by $\tilde{N}$ subsystems, each with a Hamiltonian $\tilde{H}$ and in a state $|\tilde{E}_k\rangle\langle\tilde{E}_k|$. One is then interested in the optimal distillation rate $N/\tilde{N}$ as $N$ tends to infinity. However, we will investigate a more general setting, allowing the subsystems to differ in both state and Hamiltonian, as long as the initial state is uncorrelated.

Footnote 1: In fact, all of our results apply to a slightly more general setting with target states being proportional to the Gibbs state on their support, e.g. for $\tilde{\rho} = \frac{\tilde{\gamma}_k}{\tilde{\gamma}_k + \tilde{\gamma}_l}|\tilde{E}_k\rangle\langle\tilde{E}_k| + \frac{\tilde{\gamma}_l}{\tilde{\gamma}_k + \tilde{\gamma}_l}|\tilde{E}_l\rangle\langle\tilde{E}_l|$, where $|\tilde{E}_i\rangle$ denotes the eigenstate of $\tilde{H}$ and $\tilde{\gamma}_i$ is its thermal occupation.

2.2 Work extraction

One of the manifestations of the second law of thermodynamics is that for a system interacting with a bath in thermal equilibrium, the maximum amount of work that it can perform (that can be extracted from the system) is bounded by the difference $\Delta F$ between its initial and final free energy. Traditionally, the free energy $F = U - S/\beta$ has been defined only for states at thermal equilibrium, with $U$ denoting the internal energy and $S$ the entropy of the system. However, taking into account its operational meaning, one can extend its definition to investigate also the case of non-equilibrium states. More precisely, the relative entropy,
$$D(\rho\|\gamma) := \mathrm{Tr}\left(\rho\left(\log\rho - \log\gamma\right)\right), \tag{6}$$
can be interpreted as a non-equilibrium generalisation of the free energy difference between a state $\rho$ and a thermal state $\gamma$. It quantifies the maximum amount of work that can be extracted on average from the system in an out-of-equilibrium state [24, 25].

Generally, work extraction protocols are based on controlling and changing the external parameters that define the Hamiltonian of the system [26, 27]. Within a resource-theoretic treatment [23, 28], however, we avoid using an external agent. Therefore, we explicitly model the ancillary battery system $B$, intending to transform it from an initial pure energy state to another pure energy state with higher energy, see Fig. 2. The battery is usually described by a Hamiltonian with a continuous spectrum, but we can as well choose a Hamiltonian with a discrete spectrum, as long as its energy differences coincide with the amount of work we want to extract. Without loss of generality, we focus on a two-level battery system described by a Hamiltonian $H_B$ with eigenstates $|0\rangle_B$ and $|1\rangle_B$ corresponding to energies $0$ and $W^N_{\rm ext}$, respectively. The possibility of extracting the amount of work equal to $W^N_{\rm ext}$ from $N$ subsystems described by a Hamiltonian $H^N$ and prepared in a state $\rho^N$ is then equivalent to the existence of a thermodynamic distillation process
$$\mathcal{E}\left(\rho^N \otimes |0\rangle\langle 0|_B\right) = |1\rangle\langle 1|_B, \tag{7}$$
from $(N+1)$ initial subsystems described by a Hamiltonian $H^N + H_B$ to a target subsystem with a Hamiltonian $H_B$. If only an $\epsilon$-approximate distillation with transformation

error $\epsilon_N$ is possible, then $\epsilon_N$ directly measures the quality of extracted work, i.e., with probability $1 - \epsilon_N$ we end up with a battery system in an excited state of energy $W^N_{\rm ext}$.

Figure 2: Work extraction process. Extraction of work $W^N_{\rm ext}$ from $N$ subsystems described by a Hamiltonian $H^N$ and prepared in a state $\rho^N$ can be seen as a particular case of a thermodynamic distillation process $\mathcal{E}$ involving a battery system $B$. The battery is modelled by a two-level system with energy levels $|0\rangle_B$ and $|1\rangle_B$ corresponding to energies $0$ and $W^N_{\rm ext}$, respectively. The initial system is given by the investigated $N$ subsystems with a battery in the ground state $|0\rangle_B$, while the target system is given just by the battery in the excited state $|1\rangle_B$.

2.3 Information erasure

The connection between information and thermodynamics is as old as the thermodynamic theory itself, going back to the thought experiment known as Maxwell's demon [29]. It suggests that if one has information about the particles' positions and momenta, one can reduce the entropy of a gas of particles without investing work, and thus violate the second law of thermodynamics. However, the recognition of the thermodynamic significance of information is perhaps best captured by Szilard's engine [30], a simple setup that converts information into work. As in the Maxwell's demon example, the Szilard engine can overcome the second law of thermodynamics whenever some information about the state of the system is available. During the resolution of this puzzle, it became clear that thermodynamics imposes physical constraints on information processing. In particular, the second law can be reformulated as a statement that no thermodynamic process can result solely in the erasure of information. Every time information is erased, the erasure process is accompanied by a fundamental heat cost, i.e., an entropy increase in the environment [31]. Alternatively, Landauer's principle [32] tells us that the erasure process has an unavoidable energetic cost, with the minimum possible amount of energy required to erase a completely unknown bit of information given by $\log 2/\beta$ (see Ref. [33] in this context, where a more nuanced view on the Szilard engine and Landauer erasure is presented).

Similarly to the case of work extraction, the erasure process can also be formulated as a particular type of thermodynamic distillation process. The $N$ bits of information that one wants to erase can be represented by $N$ two-level systems in a state $\rho^N$ with a trivial Hamiltonian. We also add the two-level battery system $B$ initially in an excited state $|1\rangle_B$ of energy $W^N_{\rm cost}$ to measure the energetic cost of erasure. Then, the erasure process resetting the state $\rho^N$ to a fixed state $|0\rangle^{\otimes N}$ is possible while investing $W^N_{\rm cost}$ work, if there exists the following distillation process:
$$\mathcal{E}\left(\rho^N \otimes |1\rangle\langle 1|_B\right) = |0\rangle\langle 0|^{\otimes N} \otimes |0\rangle\langle 0|_B, \tag{8}$$
with the initial and target Hamiltonians being identical. The transformation error quantifies the quality of erasure, and the process is illustrated in Fig. 3.

Figure 3: Information erasure. The $N$ bits of information to be erased are represented by $N$ subsystems in a state $\rho^N$ with a trivial Hamiltonian $H^N$. The process is performed by attaching a battery system $B$ in an excited state $|1\rangle_B$ with energy $W^N_{\rm cost}$, which measures the energetic cost of erasure. The erasure process resets the state $\rho^N$ to a fixed state $|0\rangle^{\otimes N}$, and de-excites the battery system.

2.4 Thermodynamically-free communication

Since thermodynamics is closely linked with information processing, one can also study thermodynamic aspects of communication. A traditional communication scenario in which Alice wants to encode and transmit classical information to Bob over a quantum channel consists of the following three steps [34]. First, she encodes a message $m \in \{1, \dots, M\}$ by preparing a quantum system in a state $\rho_m$. Then, she sends it to Bob via a noisy quantum channel $\mathcal{N}$. Finally, Bob decodes the original message by performing an optimal measurement on $\mathcal{N}(\rho_m)$. Crucially, in this standard scenario, both Alice and Bob are completely unconstrained, meaning that they can employ all encodings and decodings for free, and the only thing beyond their control is the noisy channel $\mathcal{N}$.

Recently, a modification of this scenario was introduced that allows one to quantify the thermodynamic cost of communication [35, 36]. More precisely, it is assumed for simplicity that Alice and Bob are connected via a noiseless channel, and Bob's decoding is still unconstrained. However, Alice is constrained to thermodynamically-free encodings, meaning that encoded states $\rho_m$ can only arise from thermal operations acting on a given initial state $\rho$, interpreted as an information carrier. Physically, this means that Alice obeys the second law of thermodynamics, in the sense that the encoding channel is constrained to use no thermodynamic resources other than what the information carrier $\rho$ initially has. We illustrate this process in Fig. 4.

Now, the central question is: what is the optimal number of messages $M(\rho, \epsilon_{\rm avg})$ that can be encoded into $\rho$ in

a thermodynamically-free way, so that the average decoding error is smaller than $\epsilon_{\rm avg}$? We will investigate the case when the information carrier is given by $N$ independent systems in a state $\rho^N$ and with a Hamiltonian $H^N$, as specified in Eq. (5). Then, instead of asking for $M(\rho^N, \epsilon_{\rm avg})$, we can equivalently ask for the optimal encoding rate:
$$R(\rho^N, \epsilon_{\rm avg}) := \frac{\log\left[M(\rho^N, \epsilon_{\rm avg})\right]}{N}. \tag{9}$$
As we will explain later in the paper, the optimal thermodynamically-free encodings (i.e., the ones that allow one to achieve the optimal rate $R$) can be chosen to be given by thermodynamic distillation processes. Through this connection and our results on optimal distillation processes, we will derive a second-order asymptotic expansion of $R(\rho^N, \epsilon_{\rm avg})$ valid for large $N$.

Figure 4: Thermodynamically-free encoding. The thermal encoding of information can be captured by a thermodynamic distillation process by considering $N$ independent subsystems in a state $\rho^N$ and with a Hamiltonian $H^N$ as an information carrier. The sender encodes a message $m \in \{1, \dots, M\}$ into it by applying a thermal operation $\mathcal{E}_m$, and the receiver decodes the original message by performing a measurement on $\mathcal{E}_m(\rho^N)$.

2.5 Information-theoretic notions and their thermodynamic interpretation

Finally, before we proceed to present our results, let us introduce the necessary information-theoretic quantities together with their thermodynamic interpretation. For any $d$-dimensional quantum state $\rho$, we define the relative entropy $D$ between $\rho$ and a thermal Gibbs state $\gamma$, together with the corresponding relative entropy variance $V$ and the function $Y$ related to relative entropy skewness [20, 37, 38]:
$$D(\rho\|\gamma) := \mathrm{Tr}\left(\rho\left(\log\rho - \log\gamma\right)\right), \tag{10a}$$
$$V(\rho\|\gamma) := \mathrm{Tr}\left(\rho\left(\log\rho - \log\gamma - D(\rho\|\gamma)\right)^2\right), \tag{10b}$$
$$Y(\rho\|\gamma) := \mathrm{Tr}\left(\rho\left|\log\rho - \log\gamma - D(\rho\|\gamma)\right|^3\right). \tag{10c}$$
It is clear from the above definitions that we are dealing with the average, variance and the absolute third moment of the random variable $\log\rho - \log\gamma$. As already mentioned, the average of this random variable, $D(\rho\|\gamma)$, can be interpreted as the non-equilibrium free energy of the system since
$$\frac{1}{\beta}D(\rho\|\gamma) = \mathrm{Tr}\left(\rho H\right) - \frac{S(\rho)}{\beta} + \frac{\log Z}{\beta}, \tag{11}$$
where
$$S(\rho) := -\mathrm{Tr}\left(\rho\log\rho\right) \tag{12}$$
is the von Neumann entropy. The higher moments can then be understood as fluctuations of the non-equilibrium free energy content of the system. This is most apparent for pure states $\rho = |\psi\rangle\langle\psi|$, as $V$ then simply describes energy fluctuations of the system:
$$\frac{1}{\beta^2}V(|\psi\rangle\langle\psi|\,\|\gamma) = \langle\psi|H^2|\psi\rangle - \langle\psi|H|\psi\rangle^2. \tag{13}$$
Moreover, as noted in Ref. [20], when $\rho = \gamma'$ is a thermal distribution at some different temperature $T' \neq T$, the expression for $V$ becomes
$$V(\gamma'\|\gamma) = \left(1 - \frac{T'}{T}\right)^2 \cdot \frac{c_{T'}}{k_B}, \tag{14}$$
where
$$c_{T'} = \frac{\partial}{\partial T'}\mathrm{Tr}\left(\gamma' H\right) \tag{15}$$
is the specific heat of the system in a thermal state at temperature $T'$, and $k_B$ is the Boltzmann constant.

Now, for the initial system $(\rho^N, H^N)$, we introduce the following notation for averaged free energy and free energy fluctuations:
$$\bar{F}^N := \frac{1}{\beta N}\sum_{n=1}^{N} D(\rho^N_n\|\gamma^N_n), \tag{16a}$$
$$\sigma^2(F^N) := \frac{1}{\beta^2 N}\sum_{n=1}^{N} V(\rho^N_n\|\gamma^N_n), \tag{16b}$$
$$\kappa^3(F^N) := \frac{1}{\beta^3 N}\sum_{n=1}^{N} Y(\rho^N_n\|\gamma^N_n). \tag{16c}$$
We also introduce
$$F^N_{\rm diss} := \frac{1}{\beta N}\left(\sum_{n=1}^{N} D(\rho^N_n\|\gamma^N_n) - D(\tilde{\rho}^N\|\tilde{\gamma}^N)\right), \tag{17}$$
which describes the average amount of free energy that is dissipated in the distillation process per subsystem of the initial system (note that if $F^N_{\rm diss}$ is negative then the free energy, instead of being dissipated, is added to the system). For the quantities introduced in Eqs. (16a)-(16c) and (17) we will drop the superscript $N$ to denote their value in the asymptotic limit $N \to \infty$, e.g. $\bar{F} := \lim_{N\to\infty}\bar{F}^N$.

Let us also make two final technical comments. First, we only consider families of initial systems for which the limits of $\sigma(F^N)$ and $\kappa(F^N)$ as $N \to \infty$ are well-defined and non-zero. Second, in what follows, we will use a shorthand notation with $\simeq$, $\lesssim$ and $\gtrsim$ denoting equalities and inequalities up to terms of order $o(1/\sqrt{N})$.
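To make the quantities of this subsection concrete, the following is a minimal numerical sketch, assuming energy-incoherent (diagonal) states so that $\rho$ and $\gamma$ reduce to probability vectors. It evaluates $D$, $V$ and $Y$ from Eqs. (10a)-(10c), checks the free energy identity of Eq. (11), and checks Eq. (14) with the specific heat of Eq. (15) estimated by a finite difference. The three-level spectrum, the populations, the temperatures and all function names are hypothetical choices made for illustration, and units with $k_B = 1$ are assumed.

```python
import math


def gibbs(energies, beta):
    """Thermal populations e^{-beta E}/Z and the partition function Z, Eq. (2)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], z


def moments(p, g):
    """(D, V, Y) of Eqs. (10a)-(10c) for populations p and thermal populations g."""
    support = [(pi, math.log(pi / gi)) for pi, gi in zip(p, g) if pi > 0]
    d = sum(pi * x for pi, x in support)
    v = sum(pi * (x - d) ** 2 for pi, x in support)
    y = sum(pi * abs(x - d) ** 3 for pi, x in support)
    return d, v, y


def free_energy_rhs(p, energies, beta):
    """Right-hand side of Eq. (11): Tr(rho H) - S(rho)/beta + log(Z)/beta."""
    _, z = gibbs(energies, beta)
    mean_energy = sum(pi * e for pi, e in zip(p, energies))
    entropy = -sum(pi * math.log(pi) for pi in p if pi > 0)
    return mean_energy - entropy / beta + math.log(z) / beta


def thermal_variance_check(energies, t_prime, t, h=1e-5):
    """Both sides of Eq. (14) with k_B = 1; c_{T'} is a central finite difference."""
    def mean_energy(temperature):
        pops, _ = gibbs(energies, 1.0 / temperature)
        return sum(q * e for q, e in zip(pops, energies))

    g_prime, _ = gibbs(energies, 1.0 / t_prime)
    g_bath, _ = gibbs(energies, 1.0 / t)
    lhs = moments(g_prime, g_bath)[1]
    heat_capacity = (mean_energy(t_prime + h) - mean_energy(t_prime - h)) / (2 * h)
    rhs = (1.0 - t_prime / t) ** 2 * heat_capacity
    return lhs, rhs


# Hypothetical example: a three-level system that is out of equilibrium.
energies = [0.0, 1.0, 2.5]
beta = 0.7
p = [0.5, 0.3, 0.2]
g, _ = gibbs(energies, beta)
d, v, y = moments(p, g)
lhs14, rhs14 = thermal_variance_check(energies, t_prime=2.0, t=1.0 / beta)
```

Dividing `d` by `beta` gives the non-equilibrium free energy of Eq. (11), while `v` and `y` are the single-subsystem contributions that enter the averages of Eqs. (16b) and (16c).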

3 Results

3.1 Fluctuation-dissipation relations

Our first main result connects the optimal transformation error $\epsilon_N$ of a thermodynamic distillation process from incoherent systems with the amount of free energy dissipated during that process and the free energy fluctuations of the initial state of the system.

Theorem 1 (Fluctuation-dissipation relation for incoherent states). For a distillation setting with energy-incoherent initial states, the transformation error $\epsilon_N$ of the optimal $\epsilon$-approximate distillation process in the asymptotic limit is given by
$$\lim_{N\to\infty}\epsilon_N = \lim_{N\to\infty}\Phi\left(-\frac{F^N_{\rm diss}}{\sigma(F^N)}\cdot\sqrt{N}\right), \tag{18}$$
where $\Phi$ denotes the cumulative normal distribution function. Moreover, for any $N$ there exists an $\epsilon$-approximate distillation process with the transformation error $\epsilon_N$ bounded by
$$\epsilon_N \leq \Phi\left(-\frac{F^N_{\rm diss}}{\sigma(F^N)}\cdot\sqrt{N}\right) + \frac{C\kappa^3(F^N)}{\sigma^3(F^N)}\cdot\frac{1}{\sqrt{N}}, \tag{19}$$
where $C$ is a constant from the Berry-Esseen theorem that is bounded by
$$0.4097 \leq C \leq 0.4748. \tag{20}$$

We prove the above theorem in Sec. 4.2, and here we will briefly discuss its scope and consequences. First, from Eq. (18) it is clear that if the amount of dissipated free energy per subsystem, $F^N_{\rm diss}$, vanishes faster than $1/\sqrt{N}$, then the optimal transformation error $\epsilon_N \to 1/2$. On the other hand, if $F^N_{\rm diss}$ vanishes slower than $1/\sqrt{N}$, then the error either vanishes (when the target system has lower free energy than the initial one) or approaches 1 (in the opposite case). Thus, the only non-trivial behaviour of the optimal transformation error happens when
$$F^N_{\rm diss} = \frac{\alpha}{\sqrt{N}} + o\left(\frac{1}{\sqrt{N}}\right) \tag{21}$$
for some constant $\alpha$ describing the level of free energy dissipation. For the sake of interpretation we may now write the error in terms of the total dissipated free energy $F^{\rm tot}_{\rm diss} = N F^N_{\rm diss}$ and the total free energy fluctuation $\sigma^{\rm tot}(F) = \sqrt{N}\sigma(F^N)$, to arrive at
$$\lim_{N\to\infty}\epsilon_N = \lim_{N\to\infty}\Phi\left(-\frac{F^{\rm tot}_{\rm diss}}{\sigma^{\rm tot}(F)}\right). \tag{22}$$
Then, both quantities scale as $\sqrt{N}$ (since $F^{\rm tot}_{\rm diss} \simeq \alpha\sqrt{N}$). Thus, the error is specified by the ratio between dissipated free energy and free energy fluctuations.

As a result, the same level of free energy dissipation in two optimal distillation processes will lead to a smaller transformation error for the process with the initial state exhibiting smaller free energy fluctuations. Alternatively, for two processes with the same optimal success probability, a distillation process from a state with smaller free energy fluctuations will lead to smaller free energy dissipation. As a particular example consider a battery-assisted distillation process, i.e. a thermodynamic transformation from $(\rho^N \otimes |1\rangle\langle 1|_B, H^N + H_B)$ to $(\tilde{\rho}^N \otimes |0\rangle\langle 0|_B, H^N + H_B)$, where the energy gap of the battery system $B$ is $W^N_{\rm cost}$. Now, the quality of transformation from $\rho^N$ to $\tilde{\rho}^N$ (measured by the transformation error $\epsilon_N$) depends on the amount of work $W^N_{\rm cost}$ that we invest into the process. As expected, to achieve $\epsilon \leq 1/2$, we need to invest at least the difference of free energies $[D(\tilde{\rho}^N\|\tilde{\gamma}^N) - D(\rho^N\|\gamma^N)]/\beta$. However, Theorem 1 tells us how much more work is needed to decrease the transformation error to a desired level: the more free energy fluctuations there were in $\rho^N$, the more work we need to invest.

Let us also compare Theorem 1 to the results presented in Ref. [20]. There, the authors studied the incoherent thermodynamic interconversion problem between identical copies of the initial system, $\rho^{\otimes N}$, and identical copies of the target system, $\tilde{\rho}^{\otimes\tilde{N}}$. Here, for the price of the reduced generality of the target state (it has to be an eigenstate of the target Hamiltonian), we obtained a three-fold improvement. First, our result applies to general independent systems, not only to identical copies. Second, the Hamiltonians of the initial and target systems can vary, which is particularly important for applications like work extraction or thermodynamically-free communication. Finally, we went beyond the second-order asymptotic result and found a single-shot upper bound on the optimal transformation error $\epsilon_N$, Eq. (19), that holds for any finite $N$. Thus, even in the finite $N$ regime, one can get a guarantee on the transformation error that approaches the asymptotically optimal value as $N \to \infty$.

Our second main result connects the optimal transformation error $\epsilon_N$ of a thermodynamic distillation process from $N$ identical copies of a pure quantum system with the amount of free energy dissipated during that process and the energy fluctuations of the initial state of the system. To formally state it, we first need to introduce a technical notion of a Hamiltonian with incommensurable spectrum. Given any two energy levels, $E_i$ and $E_j$, of such a Hamiltonian, there do not exist natural numbers $m$ and $n$ such that $mE_i = nE_j$. We then have the following result.

Theorem 2 (Fluctuation-dissipation relation for identical pure states). For a distillation setting with $N$ identical initial systems, each in a pure state $|\psi\rangle\langle\psi|$ and described by the same Hamiltonian $H$ with incommensurable spectrum, the transformation error $\epsilon_N$ of the optimal $\epsilon$-approximate distillation process in the asymptotic limit is given by
$$\lim_{N\to\infty}\epsilon_N = \lim_{N\to\infty}\Phi\left(-\frac{F^N_{\rm diss}}{\sigma(F^N)}\cdot\sqrt{N}\right), \tag{23}$$

6 ∆F battery state does not contribute to fluctuations σ and κ, 1 and that the difference between non-equilibrium free en- ergies of the ground and excited battery states is just the 0.75 N energy difference Wext. Then, Theorem1 tells us that, in the asymptotic limit, the optimal transformation error for

where $\Phi$ denotes the cumulative normal distribution function. Moreover, the result still holds if both the initial and target systems get extended by an ancillary system with an arbitrary Hamiltonian $H_A$, with the initial and target states being some eigenstates of $H_A$.

We prove the above theorem in Sec. 4.3, and here we will only add one comment to the previous discussion. Namely, since for a pure state the free energy fluctuations are just the energy fluctuations (recall Eq. (13)), and because in the considered scenario all pure states are identical, we have

$$\sigma^2(F^N) = \langle\psi|H^2|\psi\rangle - \langle\psi|H|\psi\rangle^2 =: \sigma^2(H). \tag{24}$$

Analogously to the incoherent case, the only non-trivial behaviour of the optimal transformation error happens when $F^N_{\mathrm{diss}}$ is of the form from Eq. (21). Thus, the optimal transformation error is specified by the ratio $\alpha/\sigma(H)$ between the level of dissipated free energy in the distillation process and energy fluctuations of the initial state.

3.2 Optimal work extraction

As the first application of our fluctuation-dissipation relations, we focus on work extraction process from a collection of $N$ non-interacting subsystems with Hamiltonians $H_n^N$ and in incoherent states $\rho_n^N$. As already described in Sec. 2.2, this is just a particular case of a thermodynamic distillation process. We only need to note that the pure

extracting the amount of work $w^N_{\mathrm{ext}} := W^N_{\mathrm{ext}}/N$ per copy of the initial subsystem is

$$\lim_{N\to\infty}\epsilon_N = \lim_{N\to\infty}\Phi\left(\frac{w^N_{\mathrm{ext}} - \bar{F}^N}{\sigma(F^N)}\cdot\sqrt{N}\right). \tag{25}$$

We thus clearly see that again, we have three cases dependent on the amount of dissipated work per subsystem, $(w^N_{\mathrm{ext}} - \bar{F}^N)$. To get the asymptotic error different from 0, 1 or 1/2, the extracted work $w^N_{\mathrm{ext}}$ has to be of the form

$$w^N_{\mathrm{ext}} = \bar{F}^N - \frac{\alpha}{\sqrt{N}} + o\left(\frac{1}{\sqrt{N}}\right), \tag{26}$$

for some constant $\alpha$. Combining the above two equations yields the following second-order asymptotic expression for the extracted work per copy of the system:

$$w_{\mathrm{ext}} \simeq \bar{F} + \frac{\sigma(F)}{\sqrt{N}}\,\Phi^{-1}(\epsilon). \tag{27}$$

Thus, for a fixed quality of extracted work measured by $\epsilon$, more work can be extracted from states with smaller free energy fluctuations (assuming that the average free energy $\bar{F}$ is fixed). This is a direct generalisation of the result obtained in Ref. [20] to a scenario with non-identical initial systems and with a cleaner interpretation of the error in the battery system. We present the comparison between our bounds and the numerically optimised work extraction processes in Fig. 5.

Figure 5: Optimal work extraction. Comparison between the asymptotic approximation, Eq. (27), for the optimal amount of extracted work $w_{\mathrm{ext}}$ as a function of transformation error $\epsilon$ (solid black line), and the actual value of $w^N_{\mathrm{ext}}$ as a function of $\epsilon_N$ (blue circles) obtained by explicitly solving the thermomajorisation conditions (see Sec. 4.1 for details). The inverse temperature of the thermal bath is chosen to be $\beta = 1$, while the initial system is composed of 100 two-level subsystems. The first 59 subsystems are described by the Hamiltonian corresponding to a thermal state $0.6|0\rangle\langle 0| + 0.4|1\rangle\langle 1|$, and the remaining 41 subsystems have the Hamiltonian leading to a thermal state $0.75|0\rangle\langle 0| + 0.25|1\rangle\langle 1|$. The initial state of the system is given by 59 copies of a state $0.9|0\rangle\langle 0| + 0.1|1\rangle\langle 1|$ and 41 copies of a state $0.7|0\rangle\langle 0| + 0.3|1\rangle\langle 1|$.

Similarly, by employing Theorem 2, we can investigate the optimal work extraction process from a collection of $N$ non-interacting subsystems with identical Hamiltonians $H$ and each in the same pure state $|\psi\rangle\langle\psi|$. We simply need to choose the ancillary system $A$ to be the battery $B$ with energy splitting $W^N_{\mathrm{ext}}$ and the initial and target states to be given by $|0\rangle_B$ and $|1\rangle_B$. Also, since all systems are in identical pure states and have the same Hamiltonian, we have

$$\bar{F}^N = \bar{F} = \langle H\rangle_\psi + \frac{\log Z}{\beta}, \tag{28a}$$
$$\sigma^2(F^N) = \sigma^2(F) = \langle H^2\rangle_\psi - \langle H\rangle_\psi^2, \tag{28b}$$

where we used a shorthand notation $\langle\cdot\rangle_\psi = \langle\psi|\cdot|\psi\rangle$. As a result, the optimal amount of work extracted per one copy of a pure quantum system up to second-order asymptotic expansion is given by:

$$w_{\mathrm{ext}} \simeq \langle H\rangle_\psi + \frac{\log Z}{\beta} + \sqrt{\frac{\langle H^2\rangle_\psi - \langle H\rangle_\psi^2}{N}}\,\Phi^{-1}(\epsilon). \tag{29}$$
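As a rough numerical illustration of Eq. (27), the following Python sketch evaluates the second-order estimate for the parameters quoted in the caption of Fig. 5. It assumes, consistently with Eqs. (28a) and (68)-(69) but not spelled out in this section, that $\bar{F} = \frac{1}{\beta N}\sum_n D(p_n\|\gamma_n)$ and $\sigma^2(F) = \frac{1}{\beta^2 N}\sum_n V(p_n\|\gamma_n)$; all function names are illustrative and not taken from the paper.

```python
from math import log, sqrt
from statistics import NormalDist


def rel_entropy(p, g):
    """Relative entropy D(p||g) in nats for classical distributions."""
    return sum(pi * log(pi / gi) for pi, gi in zip(p, g) if pi > 0)


def rel_entropy_variance(p, g):
    """Relative entropy variance V(p||g)."""
    d = rel_entropy(p, g)
    return sum(pi * (log(pi / gi) - d) ** 2 for pi, gi in zip(p, g) if pi > 0)


def second_order_work(systems, beta, eps):
    """Second-order estimate of the extracted work per subsystem, Eq. (27).

    `systems` is a list of (state, gibbs_state) pairs, one per subsystem.
    Assumption: F_bar = sum_n D_n / (beta N), sigma^2 = sum_n V_n / (beta^2 N).
    """
    n = len(systems)
    f_bar = sum(rel_entropy(p, g) for p, g in systems) / (beta * n)
    sigma = sqrt(sum(rel_entropy_variance(p, g) for p, g in systems) / n) / beta
    return f_bar + sigma / sqrt(n) * NormalDist().inv_cdf(eps)


# Parameters from the caption of Fig. 5 (beta = 1, 59 + 41 two-level systems).
fig5_systems = [((0.9, 0.1), (0.6, 0.4))] * 59 + [((0.7, 0.3), (0.75, 0.25))] * 41

for eps in (0.1, 0.5, 0.9):
    print(eps, second_order_work(fig5_systems, beta=1.0, eps=eps))
```

At $\epsilon = 1/2$ the second-order correction vanishes and the estimate reduces to $\bar{F}$; for $\epsilon < 1/2$ the estimate lies below $\bar{F}$, reflecting the dissipation needed to achieve a small error.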

3.3 Optimal cost of erasure

In order to obtain the optimal work cost of erasing $N$ two-level systems prepared in incoherent states $\rho_n^N$, we apply Theorem 1 analogously as in the previous section, but this time to the scenario described in Sec. 2.3. We then get the optimal transformation error in the erasure process given by

$$\lim_{N\to\infty}\epsilon_N = \lim_{N\to\infty}\Phi\left(\frac{\frac{1}{\beta}s^N - w^N_{\mathrm{cost}}}{\sigma(F^N)}\cdot\sqrt{N}\right), \tag{30}$$

where

$$s^N := \frac{1}{N}S(\rho^N) \tag{31}$$

is the average entropy of the initial state, and $w^N_{\mathrm{cost}} = W^N_{\mathrm{cost}}/N$ is the invested work cost per subsystem.

Using analogous reasoning as in the case of work extraction, we can now obtain the second-order asymptotics for the cost of erasure:

$$w_{\mathrm{cost}} \simeq \frac{s}{\beta} - \frac{\sigma(F)}{\sqrt{N}}\,\Phi^{-1}(\epsilon), \tag{32}$$

where $s := \lim_{N\to\infty}s^N$.

Let us make two brief comments on the above result. First, we only considered the application of the incoherent result, Theorem 1, as in the case of trivial Hamiltonians, the erasure of a pure state $|\psi\rangle\langle\psi|^{\otimes N}$ is free (because all unitary transformations are then thermodynamically-free). Of course, our results straightforwardly extend to non-trivial Hamiltonians, but we believe that the simple case we described above is most illustrative and recovers the spirit of the original Landauer’s erasure scenario. Second, since the maximally mixed initial state has vanishing free energy fluctuations, $\sigma(F^N) = 0$, we cannot directly apply our result (that relates fluctuations of the initial state to dissipation) to get the erasure cost of $N$ completely unknown bits of information. However, using the tools described in Sec. 4, it is straightforward to show that in this case, the exact expression (working for all $N$) for the erasure cost is given by

$$w^N_{\mathrm{cost}} = \frac{1}{\beta}\left(\log 2 + \frac{\log(1-\epsilon)}{N}\right). \tag{33}$$

Thus, for the case of zero error one recovers the Landauer’s cost of erasure [39].

3.4 Optimal thermodynamically-free communication rate

Finally, we now explain how our fluctuation-dissipation relations, Theorems 1 and 2, allow one to obtain the optimal thermodynamically-free encoding rate into a collection of $N$ identical subsystems in either incoherent or pure states. We simply choose the target system to be a single $M$-dimensional quantum system with a trivial Hamiltonian $\tilde{H} = 0$ that is prepared in any of the degenerate eigenstates of $\tilde{H}$. Note that the non-equilibrium free energy of such a target system is given by

$$\frac{1}{\beta}D(\tilde{\rho}^N\|\tilde{\gamma}^N) = \frac{1}{\beta}\log M. \tag{34}$$

Our theorems then tell us that in the asymptotic limit, the optimal transformation error $\epsilon$ in the considered distillation process is given by

$$\lim_{N\to\infty}\epsilon_N = \lim_{N\to\infty}\Phi\left(\frac{\frac{\log M}{N} - \beta\bar{F}^N}{\beta\sigma(F^N)}\cdot\sqrt{N}\right). \tag{35}$$

Rewriting the above, we get the following second-order asymptotic behaviour:

$$\frac{\log M}{N} \simeq \beta\bar{F} + \frac{\beta\sigma(F)}{\sqrt{N}}\,\Phi^{-1}(\epsilon). \tag{36}$$

Now, the distillation process above can be followed by unitaries that map between $M$ degenerate eigenstates of $\tilde{H}$ that we will simply denote $|1\rangle,\dots,|M\rangle$. Crucially, note that such unitaries are thermodynamically-free because they act in a fixed energy subspace. Such a protocol then allows one to encode $M$ messages into $M$ states $\sigma_i$, each one being $\epsilon$-close in infidelity to $|i\rangle$ for $i\in\{1,\dots,M\}$. Decoding the message using a measurement in the eigenbasis of $\tilde{H}$ then leads to the average decoding error $\epsilon_{\mathrm{avg}}$ satisfying:

$$1 - \epsilon_{\mathrm{avg}} := \frac{1}{M}\sum_{i=1}^{M}\langle i|\sigma_i|i\rangle = 1 - \epsilon, \tag{37}$$

so that $\epsilon_{\mathrm{avg}} = \epsilon$.

Using the communication protocol described above, we then get the following asymptotic lower bound on the optimal thermodynamically-free encoding rate into a state $\rho^N$ (recall Eq. (9)):

$$R(\rho^N,\epsilon_{\mathrm{avg}}) \gtrsim \beta\bar{F} + \frac{\beta\sigma(F)}{\sqrt{N}}\,\Phi^{-1}(\epsilon_{\mathrm{avg}}). \tag{38}$$

The above lower bound exactly matches the upper bound for $R(\rho^N,\epsilon_{\mathrm{avg}})$ recently derived in Ref. [36] for a slightly different scenario with $\rho_n^N = \rho$ and $H_n^N = H$ for all $n$, with $\tilde{H}^N = H^N$, and with Gibbs-preserving operations instead of thermal operations. However, the proof presented there can be easily adapted to work in the current case if we keep the first restriction, i.e., when the initial state is $\rho^N = \rho^{\otimes N}$ and all initial subsystems have equal Hamiltonians. We explain in detail how to adapt that proof in Appendix A, where we also explain what technical result concerning hypothesis testing relative entropy needs to be proven in order to make the proof also work when subsystems are not identical. Here we conclude that

$$R(\rho^{\otimes N},\epsilon_{\mathrm{avg}}) \simeq D(\rho\|\gamma) + \sqrt{\frac{V(\rho\|\gamma)}{N}}\,\Phi^{-1}(\epsilon_{\mathrm{avg}}), \tag{39}$$

where $\rho$ is either a pure or incoherent state.

The above result can be thermodynamically interpreted as the inverse of the Szilard engine. While the Szilard engine converts bits of information into work, the protocol studied here employs the free energy of the system (i.e., the ability to perform work) to encode bits of information. While the asymptotic result was recently proven in Ref. [35], here we proved that this relation is deeper as it also connects fluctuations of free energy to the optimal average decoding error.

4 Derivation of the results

In what follows, we first introduce the mathematical formalism used to study the incoherent distillation process. We then use it to prove Theorem 1. Finally, we also prove Theorem 2 by first mapping it to an equivalent incoherent problem and then using the formalism of incoherent distillations.

4.1 Incoherent distillation process

4.1.1 Distillation conditions via approximate majorisation

A state of a $d$-dimensional quantum system $\rho$ will be called energy-incoherent if it commutes with the Hamiltonian of the system, i.e., when it is block-diagonal in the energy eigenbasis. Such a state can be equivalently represented by a $d$-dimensional probability vector $\boldsymbol{p}$ given by the eigenvalues of $\rho$. Since the thermal Gibbs state $\gamma$ is energy-incoherent, it can be represented by a vector of thermal occupations $\boldsymbol{\gamma}$. Moreover, an energy eigenstate $|E_k\rangle\langle E_k|$ can be represented by a sharp state $\boldsymbol{s}_k$, with $(\boldsymbol{s}_k)_j = \delta_{jk}$.

In order to formulate the solution to the thermodynamic interconversion problem for incoherent states we will need two concepts: approximate majorisation and embedding. First, given two $d$-dimensional probability vectors $\boldsymbol{p}$ and $\boldsymbol{q}$, we say that $\boldsymbol{p}$ majorises $\boldsymbol{q}$, and write $\boldsymbol{p}\succ\boldsymbol{q}$, if and only if [40]

$$\forall k:\quad \sum_{j=1}^{k}p_j^{\downarrow} \geq \sum_{j=1}^{k}q_j^{\downarrow}, \tag{40}$$

where $\boldsymbol{p}^{\downarrow}$ denotes the vector $\boldsymbol{p}$ rearranged in a decreasing order. Moreover, we say that $\boldsymbol{p}$ $\epsilon$-post-majorises $\boldsymbol{q}$ [20], and write $\boldsymbol{p}\succ_\epsilon\boldsymbol{q}$, if $\boldsymbol{p}$ majorises $\boldsymbol{r}$ which is $\epsilon$-close in the infidelity distance to $\boldsymbol{q}$, i.e.,

$$1 - F(\boldsymbol{q},\boldsymbol{r}) \leq \epsilon, \qquad F(\boldsymbol{q},\boldsymbol{r}) := \left(\sum_{j=1}^{d}\sqrt{q_j r_j}\right)^2. \tag{41}$$

Second, we express the thermal distribution $\boldsymbol{\gamma}$ as a probability vector with rational entries,

$$\boldsymbol{\gamma} = \left[\frac{D_1}{D},\dots,\frac{D_d}{D}\right], \tag{42}$$

with $D$ and $D_k$ being integers. Now, the embedding map is defined as a transformation that sends a $d$-dimensional probability distribution $\boldsymbol{p}$ to a $D$-dimensional probability distribution $\hat{\boldsymbol{p}}$ in the following way [13]:

$$\hat{\boldsymbol{p}} = \Bigg[\underbrace{\frac{p_1}{D_1},\dots,\frac{p_1}{D_1}}_{D_1\ \text{times}},\dots,\underbrace{\frac{p_d}{D_d},\dots,\frac{p_d}{D_d}}_{D_d\ \text{times}}\Bigg]. \tag{43}$$

Observe that the embedded version of a thermal state $\boldsymbol{\gamma}$ is a maximally mixed state over $D$ states

$$\boldsymbol{\eta} := \frac{1}{D}[1,\dots,1]; \tag{44}$$

and the embedded version of a sharp state $\boldsymbol{s}_k$ is a flat state $\boldsymbol{f}_k$ that is maximally mixed over a subset of $D_k$ entries, with zeros otherwise:

$$\hat{\boldsymbol{s}}_k = \boldsymbol{f}_k := \frac{1}{D_k}\Bigg[\underbrace{0,\dots,0}_{\sum_{j=1}^{k-1}D_j},\underbrace{1,\dots,1}_{D_k},\underbrace{0,\dots,0}_{\sum_{j=k+1}^{d}D_j}\Bigg]. \tag{45}$$

We can now state the crucial theorem based on Ref. [13] and concerning thermodynamic interconversion for incoherent states.

Theorem 3 (Corollary 7 of Ref. [20]). For the initial and target system with the same thermal distribution $\boldsymbol{\gamma}$, there exists a thermal operation mapping between energy-incoherent states $\boldsymbol{p}$ and a state $\epsilon$-close to $\boldsymbol{q}$ in infidelity distance, if and only if $\hat{\boldsymbol{p}}\succ_\epsilon\hat{\boldsymbol{q}}$.

Despite the fact that in our case, we want to study the general case of initial and final systems with different Hamiltonians, with a little bit of ingenuity we can still use the above theorem. Namely, we consider a family of total systems composed of the first $N$ subsystems with initial Hamiltonians $H_n^N$, and the remaining part described by the target Hamiltonian $\tilde{H}^N$. We choose initial states of the total system on the first $N$ subsystems to be a general product of incoherent states $\boldsymbol{p}_n^N$, while the remaining part to be prepared in a thermal equilibrium state $\tilde{\boldsymbol{\gamma}}^N$ corresponding to $\tilde{H}^N$. Since Gibbs states are free, this setting is thermodynamically equivalent to having just the first $N$ systems with Hamiltonians $H_n^N$ and in states $\boldsymbol{p}_n^N$. Moreover, for target states of the total system, we choose thermal equilibrium states $\boldsymbol{\gamma}_n^N$ for the first $N$ subsystems, and sharp states $\tilde{\boldsymbol{s}}_k^N$ of the Hamiltonian $\tilde{H}^N$ for the remaining part. Again, this is thermodynamically equivalent to having just the system with Hamiltonian $\tilde{H}^N$ and in a state $\tilde{\boldsymbol{s}}_k^N$. Thus, employing Theorem 3, an $\epsilon$-approximate distillation process for incoherent states exists if and only if:

$$\left(\bigotimes_{n=1}^{N}\hat{\boldsymbol{p}}_n^N\right)\otimes\hat{\tilde{\boldsymbol{\gamma}}}^N \succ_\epsilon \left(\bigotimes_{n=1}^{N}\hat{\boldsymbol{\gamma}}_n^N\right)\otimes\hat{\tilde{\boldsymbol{s}}}_k^N. \tag{46}$$

This way, using a single fixed Hamiltonian, we can encode transformations between different Hamiltonians.
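To make the embedding map of Eq. (43) and the majorisation condition of Eq. (40) concrete, here is a minimal Python sketch. It uses exact rational arithmetic and checks only exact majorisation, not $\epsilon$-post-majorisation; the two-level thermal distribution $[3/4, 1/4]$ and the function names are hypothetical choices made for illustration.

```python
from fractions import Fraction


def embed(p, degeneracies):
    """Embedding map of Eq. (43): entry p_i is spread evenly over D_i slots."""
    return [Fraction(pi) / di for pi, di in zip(p, degeneracies) for _ in range(di)]


def majorises(p, q):
    """Check p > q in the sense of Eq. (40), padding the shorter vector with zeros."""
    n = max(len(p), len(q))
    ps = sorted(p, reverse=True) + [0] * (n - len(p))
    qs = sorted(q, reverse=True) + [0] * (n - len(q))
    total_p = total_q = 0
    for a, b in zip(ps, qs):
        total_p += a
        total_q += b
        if total_p < total_q:
            return False
    return True


# Hypothetical example: gamma = [3/4, 1/4], so D_1 = 3, D_2 = 1 and D = 4.
degeneracies = [3, 1]
gamma = [Fraction(3, 4), Fraction(1, 4)]
sharp_excited = [0, 1]  # sharp state s_2

print(embed(gamma, degeneracies))          # maximally mixed state eta, Eq. (44)
print(embed(sharp_excited, degeneracies))  # flat state f_2, Eq. (45)
print(majorises(embed(sharp_excited, degeneracies), embed(gamma, degeneracies)))
```

With floating-point inputs the prefix-sum comparison would need a small tolerance; rational entries avoid that issue here.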

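Let us introduce the following shorthand notation:

$$\hat{\boldsymbol{P}}^N := \bigotimes_{n=1}^{N}\hat{\boldsymbol{p}}_n^N, \qquad \hat{\boldsymbol{G}}^N := \bigotimes_{n=1}^{N}\hat{\boldsymbol{\gamma}}_n^N = \bigotimes_{n=1}^{N}\boldsymbol{\eta}_n^N. \tag{47}$$

Then, we can use the previous facts on the embedding map to conclude with the following statement: there exists an $\epsilon$-approximate thermodynamic distillation process from $N$ systems with Hamiltonians $H_n^N$ and in energy-incoherent states $\boldsymbol{p}_n^N$ to a system with a Hamiltonian $\tilde{H}^N$ and in a sharp energy eigenstate $\tilde{\boldsymbol{s}}_k^N$ if and only if

$$\hat{\boldsymbol{P}}^N\otimes\tilde{\boldsymbol{\eta}}^N \succ_\epsilon \hat{\boldsymbol{G}}^N\otimes\tilde{\boldsymbol{f}}_k^N. \tag{48}$$

4.1.2 Information-theoretic intermission

Before we proceed, we need to make a short intermission for a few important comments concerning information-theoretic quantities introduced in Eqs. (10a)-(10c). For incoherent states $\rho$ and $\gamma$ represented by probability vectors $\boldsymbol{p}$ and $\boldsymbol{\gamma}$, these simplify and take the following classical form:

$$D(\boldsymbol{p}\|\boldsymbol{\gamma}) := \sum_i p_i\log\frac{p_i}{\gamma_i}, \tag{49a}$$
$$V(\boldsymbol{p}\|\boldsymbol{\gamma}) := \sum_i p_i\left(\log\frac{p_i}{\gamma_i} - D(\boldsymbol{p}\|\boldsymbol{\gamma})\right)^2, \tag{49b}$$
$$Y(\boldsymbol{p}\|\boldsymbol{\gamma}) := \sum_i p_i\left|\log\frac{p_i}{\gamma_i} - D(\boldsymbol{p}\|\boldsymbol{\gamma})\right|^3. \tag{49c}$$

Moreover, by direct calculation, one can easily show that the above quantities are invariant under embedding, i.e., $D(\boldsymbol{p}\|\boldsymbol{\gamma}) = D(\hat{\boldsymbol{p}}\|\boldsymbol{\eta})$, and the same holds for $V$ and $Y$. Therefore

$$D(\boldsymbol{p}\|\boldsymbol{\gamma}) = D(\hat{\boldsymbol{p}}\|\boldsymbol{\eta}) = \log D - H(\hat{\boldsymbol{p}}), \tag{50a}$$
$$V(\boldsymbol{p}\|\boldsymbol{\gamma}) = V(\hat{\boldsymbol{p}}\|\boldsymbol{\eta}) = V(\hat{\boldsymbol{p}}), \tag{50b}$$
$$Y(\boldsymbol{p}\|\boldsymbol{\gamma}) = Y(\hat{\boldsymbol{p}}\|\boldsymbol{\eta}) = Y(\hat{\boldsymbol{p}}), \tag{50c}$$

where

$$H(\boldsymbol{p}) := \sum_i p_i(-\log p_i), \tag{51a}$$
$$V(\boldsymbol{p}) := \sum_i p_i(\log p_i + H(\boldsymbol{p}))^2, \tag{51b}$$
$$Y(\boldsymbol{p}) := \sum_i p_i\left|\log p_i + H(\boldsymbol{p})\right|^3, \tag{51c}$$

and note that $V(\boldsymbol{p}) = 0$ if and only if $\boldsymbol{p}$ is a flat state.

4.1.3 Optimal error for a distillation process

In order to transform the approximate majorisation condition from Eq. (48) into an explicit expression for the optimal transformation error, we will start from the following result proven by the authors of Ref. [20].

Lemma 4 (Lemma 21 of Ref. [20]). Let $\boldsymbol{p}$ and $\boldsymbol{q}$ be distributions with $V(\boldsymbol{q}) = 0$. Then

$$\min\{\epsilon\,|\,\boldsymbol{p}\succ_\epsilon\boldsymbol{q}\} = 1 - \sum_{i=1}^{\exp H(\boldsymbol{q})}p_i^{\downarrow}. \tag{52}$$

Applying the above lemma to Eq. (48) yields the following expression for the optimal error $\epsilon_N$:

$$\epsilon_N = 1 - \sum_{i=1}^{\exp[H(\hat{\boldsymbol{G}}^N)+H(\tilde{\boldsymbol{f}}_k^N)]}\left(\hat{\boldsymbol{P}}^N\otimes\tilde{\boldsymbol{\eta}}^N\right)_i^{\downarrow}. \tag{53}$$

Now, for an arbitrary distribution $\boldsymbol{p}$ and any flat state $\boldsymbol{f}$, we make two observations: the size of the support of $\boldsymbol{f}$ is simply $\exp(H(\boldsymbol{f}))$, and the entries of $\boldsymbol{p}\otimes\boldsymbol{f}$ are just the copied and scaled entries of $\boldsymbol{p}$. As a result, the sum of the $l$ largest elements of $\boldsymbol{p}$ can be expressed as

$$\sum_{i=1}^{l}p_i^{\downarrow} = \sum_{i=1}^{l\exp(H(\boldsymbol{f}))}(\boldsymbol{p}\otimes\boldsymbol{f})_i^{\downarrow}. \tag{54}$$

Inverting the above expression we can write

$$\sum_{i=1}^{l}(\boldsymbol{p}\otimes\boldsymbol{f})_i^{\downarrow} = \sum_{i=1}^{l\exp(-H(\boldsymbol{f}))}p_i^{\downarrow}, \tag{55}$$

where the summation with non-integer upper limit $x$ should be interpreted as:

$$\sum_{i=1}^{x}p_i := \sum_{i=1}^{\lfloor x\rfloor}p_i + (x - \lfloor x\rfloor)\,p_{\lceil x\rceil}. \tag{56}$$

Since $\tilde{\boldsymbol{\eta}}$ is a flat state, we conclude that

$$\epsilon_N = 1 - \sum_{i=1}^{\exp[H(\hat{\boldsymbol{G}}^N)+H(\tilde{\boldsymbol{f}}_k^N)-H(\tilde{\boldsymbol{\eta}}^N)]}(\hat{\boldsymbol{P}}^N)_i^{\downarrow}. \tag{57}$$

We see that the error depends crucially on partial ordered sums as above. To deal with this kind of sums, we introduce the function $\chi_{\boldsymbol{p}}$ defined implicitly by the following equation

$$\sum_{i=1}^{\chi_{\boldsymbol{p}}(l)}p_i^{\downarrow} = \sum_i\{p_i\,|\,p_i\geq 1/l\}. \tag{58}$$

In words: $\chi_{\boldsymbol{p}}(l)$ counts the number of entries of $\boldsymbol{p}$ that are not smaller than $1/l$. Now, we have the following lemma that will be crucial in proving our theorems.

Lemma 5. Every $d$-dimensional probability distribution $\boldsymbol{p}$ satisfies the following for all $l\in\{1,\dots,d\}$ and for all $\alpha\geq 1$:

$$\sum_{i=1}^{l}p_i^{\downarrow} \geq \sum_{i=1}^{\chi_{\boldsymbol{p}}(l)}p_i^{\downarrow}, \tag{59a}$$
$$\sum_{i=1}^{l}p_i^{\downarrow} \leq \sum_{i=1}^{\chi_{\boldsymbol{p}}(\alpha l)/c}p_i^{\downarrow}, \tag{59b}$$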

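where

$$c = \sqrt{\alpha}\sum_{i=\chi_{\boldsymbol{p}}(\sqrt{\alpha}l)}^{\chi_{\boldsymbol{p}}(\alpha l)}p_i^{\downarrow}. \tag{60}$$

Proof. The first inequality is very easily proven by observing that the number of entries larger than $1/l$, i.e., $\chi_{\boldsymbol{p}}(l)$, is bounded from above by $l$ due to normalisation. Now, to prove the second inequality, we start from the following observation:

$$\sum_{i=1}^{\chi_{\boldsymbol{p}}(\sqrt{\alpha}l)}\left(p_i^{\downarrow} - \frac{1}{\sqrt{\alpha}l}\right) \geq \sum_{i=1}^{\chi_{\boldsymbol{p}}(\alpha l)}\left(p_i^{\downarrow} - \frac{1}{\sqrt{\alpha}l}\right), \tag{61}$$

which comes from the fact that all extra terms on the right hand side of the above are negative by definition. By rearranging terms we arrive at

$$\chi_{\boldsymbol{p}}(\alpha l) - \chi_{\boldsymbol{p}}(\sqrt{\alpha}l) \geq cl, \tag{62}$$

which obviously implies

$$l \leq \frac{\chi_{\boldsymbol{p}}(\alpha l)}{c}. \tag{63}$$

4.2 Proof of Theorem 1

We start by introducing the following averaged entropic quantities for the total initial distribution $\hat{\boldsymbol{P}}^N$:

$$h_N := \frac{1}{N}H(\hat{\boldsymbol{P}}^N) = \frac{1}{N}\sum_{n=1}^{N}H(\hat{\boldsymbol{p}}_n^N) =: \frac{1}{N}\sum_{n=1}^{N}h_n^N, \tag{64a}$$
$$v_N := \frac{1}{N}V(\hat{\boldsymbol{P}}^N) = \frac{1}{N}\sum_{n=1}^{N}V(\hat{\boldsymbol{p}}_n^N) =: \frac{1}{N}\sum_{n=1}^{N}v_n^N, \tag{64b}$$
$$y_N := \frac{1}{N}Y(\hat{\boldsymbol{P}}^N) = \frac{1}{N}\sum_{n=1}^{N}Y(\hat{\boldsymbol{p}}_n^N) =: \frac{1}{N}\sum_{n=1}^{N}y_n^N. \tag{64c}$$

Note that the above $v_N$ and $y_N$ are, up to temperature rescaling, incoherent versions of $\sigma^2(F^N)$ and $\kappa^3(F^N)$ defined in Eqs. (16b)-(16c). We also define the function $l$:

$$l(z) := \exp\left(Nh_N + z\sqrt{Nv_N}\right). \tag{65}$$

We now rewrite the upper summation limit appearing in Eq. (57) employing the above function:

$$\exp[H(\hat{\boldsymbol{G}}^N) + H(\tilde{\boldsymbol{f}}_k^N) - H(\tilde{\boldsymbol{\eta}}^N)] = l(x) \tag{66}$$

so that

$$x = \frac{D(\hat{\boldsymbol{P}}^N\|\hat{\boldsymbol{G}}^N) - D(\tilde{\boldsymbol{f}}_k^N\|\tilde{\boldsymbol{\eta}}^N)}{\sqrt{V(\hat{\boldsymbol{P}}^N)}}. \tag{67}$$

This can be further transformed by employing the invariance of relative entropic quantities under embedding, Eqs. (50a)-(50b), to arrive at

$$x = \frac{\sum_{n=1}^{N}D(\boldsymbol{p}_n^N\|\boldsymbol{\gamma}_n^N) - D(\tilde{\boldsymbol{s}}_k^N\|\tilde{\boldsymbol{\gamma}}^N)}{\left(\sum_{n=1}^{N}V(\boldsymbol{p}_n^N\|\boldsymbol{\gamma}_n^N)\right)^{1/2}}, \tag{68}$$

which is precisely the argument of $\Phi$ appearing in the statement of Theorem 1 in Eq. (18):

$$x = \frac{F^N_{\mathrm{diss}}}{\sigma(F^N)}\cdot\sqrt{N}. \tag{69}$$

We conclude that with the above $x$ we can then rewrite the expression for the optimal transformation error, Eq. (57), as

$$\epsilon_N = 1 - \sum_{i=1}^{l(x)}(\hat{\boldsymbol{P}}^N)_i^{\downarrow}. \tag{70}$$

Next, we will find an upper bound for the error employing Eq. (59a):

$$\epsilon_N \leq 1 - \sum_{i=1}^{\chi_{\hat{\boldsymbol{P}}^N}(l(x))}(\hat{\boldsymbol{P}}^N)_i^{\downarrow} = 1 - \sum_i\left\{\hat{P}_i^N\,\middle|\,\hat{P}_i^N\geq\frac{1}{l(x)}\right\}. \tag{71}$$

In order to evaluate the above sum, consider $N$ discrete random variables $X_n$ taking values $-\log(\hat{\boldsymbol{p}}_n^N)_i$ with probability $(\hat{\boldsymbol{p}}_n^N)_i$, so that

$$\langle X_n\rangle = h_n^N, \tag{72a}$$
$$\langle(X_n - \langle X_n\rangle)^2\rangle = v_n^N, \tag{72b}$$
$$\langle|X_n - \langle X_n\rangle|^3\rangle = y_n^N, \tag{72c}$$

where the average $\langle\cdot\rangle$ is taken with respect to the distribution $\hat{\boldsymbol{p}}_n^N$. We then have the following

$$\begin{aligned}
\sum_i\left\{\hat{P}_i^N\,\middle|\,\hat{P}_i^N\geq\frac{1}{l(x)}\right\}
&= \sum_{i_1,\dots,i_N}\left\{\prod_{n=1}^{N}(\hat{\boldsymbol{p}}_n^N)_{i_n}\,\middle|\,\prod_{n=1}^{N}(\hat{\boldsymbol{p}}_n^N)_{i_n}\geq\frac{1}{l(x)}\right\}\\
&= \sum_{i_1,\dots,i_N}\left\{\prod_{n=1}^{N}(\hat{\boldsymbol{p}}_n^N)_{i_n}\,\middle|\,-\sum_{n=1}^{N}\log(\hat{\boldsymbol{p}}_n^N)_{i_n}\leq\log l(x)\right\}\\
&= \Pr\left[\sum_{n=1}^{N}X_n\leq Nh_N + x\sqrt{Nv_N}\right]\\
&= \Pr\left[\frac{\sum_{n=1}^{N}(X_n - \langle X_n\rangle)}{\sqrt{\sum_{n=1}^{N}\langle(X_n - \langle X_n\rangle)^2\rangle}}\leq x\right].
\end{aligned} \tag{73}$$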
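The probabilistic rewriting in Eq. (73) can be checked numerically in a simple special case. The Python sketch below assumes $N$ identical two-level systems with a trivial Hamiltonian (so that the embedding is trivial), evaluates the ordered mass exactly via the binomial distribution, and compares it with $\Phi(x)$ and with the Berry-Esseen bound of Eqs. (74)-(75). The chosen numbers ($p_1 = 0.1$, $N = 200$, $x = 0.5$) and the function names are illustrative assumptions only.

```python
from math import comb, log, sqrt
from statistics import NormalDist


def two_level_moments(p1):
    """Entropy H, variance V and third absolute moment Y of (1 - p1, p1), Eq. (51)."""
    p0 = 1.0 - p1
    h = -(p0 * log(p0) + p1 * log(p1))
    v = p0 * (log(p0) + h) ** 2 + p1 * (log(p1) + h) ** 2
    y = p0 * abs(log(p0) + h) ** 3 + p1 * abs(log(p1) + h) ** 3
    return h, v, y


def exact_ordered_mass(p1, n, x):
    """Left-hand side of Eq. (73) for n i.i.d. two-level systems.

    Returns the total mass of product-distribution entries that are >= 1 / l(x).
    """
    p0 = 1.0 - p1
    h, v, _ = two_level_moments(p1)
    log_l = n * h + x * sqrt(n * v)  # log l(x), Eq. (65)
    mass = 0.0
    for k in range(n + 1):  # k = number of systems found in the level with weight p1
        surprisal = -(k * log(p1) + (n - k) * log(p0))
        if surprisal <= log_l:
            mass += comb(n, k) * p1**k * p0 ** (n - k)
    return mass


def berry_esseen_bound(p1, n, c=0.4748):
    """Right-hand side of Eq. (74), using the upper value of C from Eq. (75)."""
    _, v, y = two_level_moments(p1)
    return c * y / sqrt(n * v**3)


exact = exact_ordered_mass(0.1, 200, 0.5)
gauss = NormalDist().cdf(0.5)
print(exact, gauss, berry_esseen_bound(0.1, 200))
```

The gap between the exact mass and $\Phi(x)$ shrinks as $N$ grows, which is the mechanism behind both the single-shot bound and the asymptotic statement of Theorem 1.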

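Now, the Berry-Esseen theorem [41, 42] tells us that

$$\left|\Pr\left[\frac{\sum_{n=1}^{N}(X_n - \langle X_n\rangle)}{\sqrt{\sum_{n=1}^{N}\langle(X_n - \langle X_n\rangle)^2\rangle}}\leq x\right] - \Phi(x)\right| \leq \frac{Cy_N}{\sqrt{Nv_N^3}}, \tag{74}$$

where $C$ is a constant that was bounded in Refs. [43, 44] by

$$0.4097 \leq C \leq 0.4748. \tag{75}$$

We thus have

$$\left|\sum_i\left\{\hat{P}_i^N\,\middle|\,\hat{P}_i^N\geq\frac{1}{l(x)}\right\} - \Phi(x)\right| \leq \frac{Cy_N}{\sqrt{Nv_N^3}}, \tag{76}$$

and so we conclude that the error $\epsilon_N$ is bounded from above by

$$\epsilon_N \leq \Phi\left(-\frac{F^N_{\mathrm{diss}}}{\sigma(F^N)}\cdot\sqrt{N}\right) + \frac{C\kappa^3(F^N)}{\sigma^3(F^N)}\cdot\frac{1}{\sqrt{N}}, \tag{77}$$

which proves the single-shot upper bound on transformation error, Eq. (19), presented in Theorem 1.

We now switch to proving the asymptotic behaviour of the optimal transformation error captured by Eq. (18). First, from Eq. (77), it is clear that if $\lim_{N\to\infty}v_N$ and $\lim_{N\to\infty}y_N$ are well-defined and non-zero (as we assume), then

$$\lim_{N\to\infty}\epsilon_N \leq 1 - \lim_{N\to\infty}\Phi\left(\frac{F^N_{\mathrm{diss}}}{\sigma(F^N)}\cdot\sqrt{N}\right). \tag{78}$$

Next, in order to lower bound the expression for the optimal error in the asymptotic limit we will apply Eq. (59b) with $\alpha = \exp(\delta\sqrt{N})$ and $\delta > 0$ to Eq. (70):

$$\epsilon_N \geq 1 - \sum_{i=1}^{\chi_{\hat{\boldsymbol{P}}^N}(e^{\delta\sqrt{N}}l(x))/c}(\hat{\boldsymbol{P}}^N)_i^{\downarrow} = 1 - \sum_{i=1}^{\chi_{\hat{\boldsymbol{P}}^N}(l(x+\delta))/c}(\hat{\boldsymbol{P}}^N)_i^{\downarrow}, \tag{79}$$

where

$$\begin{aligned}
c &= e^{\frac{\delta\sqrt{N}}{2}}\sum_{i=\chi_{\hat{\boldsymbol{P}}^N}(e^{\delta\sqrt{N}/2}l(x))}^{\chi_{\hat{\boldsymbol{P}}^N}(e^{\delta\sqrt{N}}l(x))}(\hat{\boldsymbol{P}}^N)_i^{\downarrow}
= e^{\frac{\delta\sqrt{N}}{2}}\sum_{i=\chi_{\hat{\boldsymbol{P}}^N}(l(x+\delta/2))}^{\chi_{\hat{\boldsymbol{P}}^N}(l(x+\delta))}(\hat{\boldsymbol{P}}^N)_i^{\downarrow}\\
&= e^{\frac{\delta\sqrt{N}}{2}}\left(\sum_i\left\{\hat{P}_i^N\,\middle|\,\hat{P}_i^N\geq\frac{1}{l(x+\delta)}\right\} - \sum_i\left\{\hat{P}_i^N\,\middle|\,\hat{P}_i^N\geq\frac{1}{l(x+\delta/2)}\right\}\right).
\end{aligned} \tag{80}$$

Using Eq. (76) we can bound the above expression from below as

$$c \geq e^{\frac{\delta\sqrt{N}}{2}}\left(\Phi(x+\delta) - \Phi(x+\delta/2) - \frac{2Cy_N}{\sqrt{Nv_N^3}}\right). \tag{81}$$

Now, for any finite $\delta > 0$ it is clear that there exists $N_0$ such that for all $N\geq N_0$ we have $c > 1$. From this and Eq. (79) we get that for $N\geq N_0$ we have

$$\begin{aligned}
\epsilon_N &\geq 1 - \sum_{i=1}^{\chi_{\hat{\boldsymbol{P}}^N}(l(x+\delta))}(\hat{\boldsymbol{P}}^N)_i^{\downarrow}
= 1 - \sum_i\left\{\hat{P}_i^N\,\middle|\,\hat{P}_i^N\geq\frac{1}{l(x+\delta)}\right\}\\
&\geq 1 - \Phi(x+\delta) - \frac{Cy_N}{\sqrt{Nv_N^3}},
\end{aligned} \tag{82}$$

where in the last line we used Eq. (76) again. It is thus clear that

$$\lim_{N\to\infty}\epsilon_N \geq 1 - \lim_{N\to\infty}\Phi(x+\delta) = \lim_{N\to\infty}\Phi(-x-\delta) \tag{83}$$

and, since it works for any $\delta > 0$, we conclude that

$$\lim_{N\to\infty}\epsilon_N \geq \lim_{N\to\infty}\Phi\left(-\frac{F^N_{\mathrm{diss}}}{\sigma(F^N)}\cdot\sqrt{N}\right). \tag{84}$$

Combining the above with the bound obtained in Eq. (78), we arrive at

$$\lim_{N\to\infty}\epsilon_N = \lim_{N\to\infty}\Phi\left(-\frac{F^N_{\mathrm{diss}}}{\sigma(F^N)}\cdot\sqrt{N}\right), \tag{85}$$

which completes the proof.

4.3 Proof of Theorem 2

The proof of Theorem 2 will be divided into three parts. First, we will show that a thermodynamic distillation process from a general state $\rho$ can be reduced to a distillation process from an incoherent state that is a dephased version of $\rho$. Employing this observation, we will recast the problem under consideration in terms of approximate majorisation and thermomajorisation as described in Sec. 4.1. Then, in the second part of the proof, we will derive the upper bound for the optimal transformation error $\epsilon_N$. Finally, in the third part, we will provide a lower bound for $\epsilon_N$ and show that it is approaching the derived upper bound in the asymptotic limit.

4.3.1 Reducing the problem to the incoherent case

The thermodynamic distillation problem under investigation is specified as follows. The family of initial systems consists of a collection of $N$ identical subsystems, each with the same Hamiltonian

$$H = \sum_{i=1}^{d}E_i|E_i\rangle\langle E_i|, \tag{86}$$

and an ancillary system with an arbitrary Hamiltonian $H_A$ (note that the ancillary system can always be ignored by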

simply choosing its dimension to be 1). The family of initial states is given by

$$\rho^N = \psi^{\otimes N}\otimes|E_0^A\rangle\langle E_0^A|, \tag{87}$$

where

$$\psi = |\psi\rangle\langle\psi|, \qquad |\psi\rangle = \sum_{i=1}^{d}\sqrt{p_i}\,e^{i\phi_i}|E_i\rangle, \tag{88}$$

is an arbitrary pure state and $|E_0^A\rangle$ is an eigenstate of $H_A$ with energy $E_0^A$. The family of target systems is composed of subsystems described by arbitrary Hamiltonians $\tilde{H}^N$ and a subsystem described by the Hamiltonian $H_A$. The family of target states is given by

$$\tilde{\rho}^N = |\tilde{E}_k^N\rangle\langle\tilde{E}_k^N|\otimes|E_1^A\rangle\langle E_1^A|, \tag{89}$$

where $|\tilde{E}_k^N\rangle$ is some eigenstate of $\tilde{H}^N$ and $|E_1^A\rangle$ is an eigenstate of $H_A$ with energy $E_1^A$. We are thus interested in the existence of a thermal operation $\mathcal{E}$ satisfying

$$\mathcal{E}(\psi^{\otimes N}\otimes|E_0^A\rangle\langle E_0^A|) = |\tilde{E}_k^N\rangle\langle\tilde{E}_k^N|\otimes|E_1^A\rangle\langle E_1^A|. \tag{90}$$

We now have the following simple, but very useful, lemma.

Lemma 6. Every incoherent state $\sigma$ achievable from a state $\rho$ through a thermal operation is also achievable from $\mathcal{D}(\rho)$, where $\mathcal{D}$ is the dephasing operation destroying coherence between different energy subspaces:

$$\exists\,\mathcal{E}:\ \mathcal{E}(\rho) = \sigma \iff \mathcal{E}(\mathcal{D}(\rho)) = \sigma. \tag{91}$$

Proof. First, for a given $\rho$ and incoherent $\sigma$, assume that there exists a thermal operation $\mathcal{E}$ such that $\mathcal{E}(\rho) = \sigma$. Now, employing the fact that every thermal operation is covariant with respect to time-translations [45], and using the fact that incoherent $\sigma$ by definition satisfies $\mathcal{D}(\sigma) = \sigma$, we get

$$\mathcal{E}(\mathcal{D}(\rho)) = \mathcal{D}(\mathcal{E}(\rho)) = \mathcal{D}(\sigma) = \sigma. \tag{92}$$

Likewise, the reverse implication holds by noting that the dephasing operation is a thermal operation.

Because the target state in our case is incoherent, we can use the above result to restate our problem as the existence of a thermal operation $\mathcal{E}$ satisfying

$$\mathcal{E}(\mathcal{D}(\psi^{\otimes N}\otimes|E_0^A\rangle\langle E_0^A|)) = |\tilde{E}_k^N\rangle\langle\tilde{E}_k^N|\otimes|E_1^A\rangle\langle E_1^A|. \tag{93}$$

Since

$$\mathcal{D}(\psi^{\otimes N}\otimes|E_0^A\rangle\langle E_0^A|) = \mathcal{D}(\psi^{\otimes N})\otimes|E_0^A\rangle\langle E_0^A|, \tag{94}$$

our problem further reduces to understanding the structure of the incoherent state $\mathcal{D}(\psi^{\otimes N})$. It is block-diagonal in the energy eigenbasis and can be diagonalised using thermal operations (since unitaries in a fixed energy subspace are free operations). After such a procedure, we end up with an incoherent state that is described by the probability distribution $\boldsymbol{P}^N$ over the multi-index set $\boldsymbol{k}$

$$P^N_{\boldsymbol{k}} = \binom{N}{k_1,\dots,k_d}\prod_{i=1}^{d}p_i^{k_i}. \tag{95}$$

Note that $P^N_{\boldsymbol{k}}$ specifies the probability of $k_1$ systems being in energy state $E_1$, $k_2$ systems being in energy state $E_2$, and so on; and that we made a technical assumption that energy levels are incommensurable, so that each vector $\boldsymbol{k}$ corresponds to a different value of total energy.

We have thus reduced the problem of thermodynamic distillation from pure states to thermodynamic distillation from incoherent states. More precisely, let us denote the sharp distributions corresponding to $|E_i^A\rangle$ by $\boldsymbol{s}_i^A$ and the corresponding flat states after embedding by $\boldsymbol{f}_i^A$. As before, we also use $\tilde{\boldsymbol{s}}_k^N$ and $\tilde{\boldsymbol{f}}_k^N$ to denote distributions related to the sharp state $|\tilde{E}_k^N\rangle$ and its corresponding flat state. The embedded Gibbs state corresponding to $H^N$ will be again denoted by $\hat{\boldsymbol{G}}^N$, however now it has an even simpler form than in Eq. (47), as the initial systems have identical Hamiltonians:

$$\hat{\boldsymbol{G}}^N = \hat{\boldsymbol{\gamma}}^{\otimes N} = \boldsymbol{\eta}^{\otimes N}. \tag{96}$$

Similarly, $\hat{\boldsymbol{P}}^N$ will be used to denote the embedded initial state (even though it now has a different form than in Eq. (47)):

$$\hat{P}^N_{\boldsymbol{k},g_{\boldsymbol{k}}} = \binom{N}{k_1,\dots,k_d}\prod_{i=1}^{d}\left(\frac{p_i}{D_i}\right)^{k_i}, \tag{97}$$

with

$$g_{\boldsymbol{k}}\in\left\{1,\dots,\prod_{i=1}^{d}D_i^{k_i}\right\} \tag{98}$$

indexing the degeneracy coming from embedding. With the notation set, our distillation problem can now be written as

$$\hat{\boldsymbol{P}}^N\otimes\boldsymbol{f}_0^A\otimes\tilde{\boldsymbol{\eta}} \succ_\epsilon \hat{\boldsymbol{G}}^N\otimes\boldsymbol{f}_1^A\otimes\tilde{\boldsymbol{f}}_k. \tag{99}$$

4.3.2 Upper bound for the transformation error

We begin by observing that our target distribution in Eq. (99) is flat, and so $V(\hat{\boldsymbol{G}}^N\otimes\boldsymbol{f}_1^A\otimes\tilde{\boldsymbol{f}}_k) = 0$. Thus, we can employ Lemma 4 and Eq. (55) to get the following expression for the optimal transformation error:

$$\epsilon_N = 1 - \sum_{j=1}^{L}(\hat{\boldsymbol{P}}^N)_j^{\downarrow} \tag{100}$$

where $L$ is given by

$$\begin{aligned}
L &= \exp[H(\hat{\boldsymbol{G}}^N) + H(\boldsymbol{f}_1^A) + H(\tilde{\boldsymbol{f}}_k) - H(\boldsymbol{f}_0^A) - H(\tilde{\boldsymbol{\eta}})]\\
&= \exp[H(\hat{\boldsymbol{G}}^N) - D(\tilde{\boldsymbol{f}}_k\|\tilde{\boldsymbol{\eta}}) - \beta(E_1^A - E_0^A)].
\end{aligned} \tag{101}$$
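For small $N$, the distribution over occupation vectors in Eq. (95) can be tabulated directly. The following Python sketch does so for a hypothetical three-level example with $N = 5$; the probabilities $[0.5, 0.3, 0.2]$ and the function name are assumptions chosen for illustration.

```python
from math import factorial, prod


def type_class_distribution(p, n):
    """Distribution P^N_k of Eq. (95): weight of each occupation vector k."""
    d = len(p)

    def compositions(total, parts):
        # All tuples of `parts` non-negative integers summing to `total`.
        if parts == 1:
            yield (total,)
            return
        for first in range(total + 1):
            for rest in compositions(total - first, parts - 1):
                yield (first,) + rest

    dist = {}
    for k in compositions(n, d):
        multinomial = factorial(n) // prod(factorial(ki) for ki in k)
        dist[k] = multinomial * prod(pi**ki for pi, ki in zip(p, k))
    return dist


# Hypothetical three-level example with N = 5 copies.
dist = type_class_distribution([0.5, 0.3, 0.2], 5)
print(len(dist), sum(dist.values()))
```

The number of occupation vectors grows only polynomially in $N$, which is what makes the normal approximation of this distribution a convenient tool for the asymptotic analysis.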

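Notice that in the current case $F^N_{\mathrm{diss}}$, defined in Eq. (17), is given by

$$\begin{aligned}
F^N_{\mathrm{diss}} &= \frac{1}{\beta N}\Big[D(\psi^{\otimes N}\|\gamma^{\otimes N}) + D(|E_0^A\rangle\langle E_0^A|\,\|\gamma_A) - D(|\tilde{E}_k\rangle\langle\tilde{E}_k|\,\|\tilde{\gamma}) - D(|E_1^A\rangle\langle E_1^A|\,\|\gamma_A)\Big]\\
&= \frac{1}{\beta N}\Big[ND(\psi\|\gamma) - D(\tilde{\boldsymbol{f}}_k\|\tilde{\boldsymbol{\eta}}) - \beta(E_1^A - E_0^A)\Big].
\end{aligned} \tag{102}$$

Using the above we can then rewrite $L$ as

$$\log L = \beta NF^N_{\mathrm{diss}} + H(\hat{\boldsymbol{G}}^N) - ND(\psi\|\gamma). \tag{103}$$

Now, employing Eq. (59a) and the above, we provide the upper bound for $\epsilon_N$:

$$\begin{aligned}
\epsilon_N &\leq 1 - \sum_{\boldsymbol{k},g_{\boldsymbol{k}}}\left\{\hat{P}^N_{\boldsymbol{k},g_{\boldsymbol{k}}}\,\middle|\,\hat{P}^N_{\boldsymbol{k},g_{\boldsymbol{k}}}\geq\frac{1}{L}\right\}\\
&= 1 - \sum_{\boldsymbol{k}}\left\{P^N_{\boldsymbol{k}}\,\middle|\,P^N_{\boldsymbol{k}}\geq\frac{\prod_{i=1}^{d}D_i^{k_i}}{L}\right\}\\
&= 1 - \sum_{\boldsymbol{k}}\left\{P^N_{\boldsymbol{k}}\,\middle|\,\log P^N_{\boldsymbol{k}}\geq\sum_{i=1}^{d}k_i\log D_i - \log L\right\}\\
&= 1 - \sum_{\boldsymbol{k}}\left\{P^N_{\boldsymbol{k}}\,\middle|\,\frac{\log P^N_{\boldsymbol{k}}}{N}\geq\sum_{i=1}^{d}\frac{k_i}{N}\log\gamma_i + D(\psi\|\gamma) - \beta F^N_{\mathrm{diss}}\right\}.
\end{aligned} \tag{104}$$

To simplify the calculation of the upper bound of $\epsilon_N$, we rewrite each $\boldsymbol{k}$ as a function of a multi-parameter vector $\boldsymbol{s}$ such that

$$\boldsymbol{k} = \boldsymbol{k}(\boldsymbol{s}) = N\boldsymbol{p} + \sqrt{N}\boldsymbol{s}, \tag{105}$$

with $\sum_{i=1}^{d}s_i = 0$. We then note that

$$D(\psi\|\gamma) = -\sum_{i=1}^{d}p_i\log\gamma_i \tag{106}$$

and so the condition in Eq. (104) can be rewritten as

$$\frac{\log P^N_{\boldsymbol{k}(\boldsymbol{s})}}{N} \geq \frac{1}{\sqrt{N}}\sum_{i=1}^{d}s_i\log\gamma_i - \beta F^N_{\mathrm{diss}} = -\frac{\beta}{\sqrt{N}}\sum_{i=1}^{d}s_iE_i - \beta F^N_{\mathrm{diss}}. \tag{107}$$

As we rigorously argue in Appendix B, the left-hand side of the above vanishes much quicker than the right-hand side when $N\to\infty$, leading to

$$\lim_{N\to\infty}\epsilon_N \leq 1 - \lim_{N\to\infty}\sum_{\boldsymbol{s}}\left\{P^N_{\boldsymbol{k}(\boldsymbol{s})}\,\middle|\,\frac{1}{\sqrt{N}}\sum_{i=1}^{d}s_iE_i\geq -F^N_{\mathrm{diss}}\right\}. \tag{108}$$

Our goal is then to calculate the sum of $P^N_{\boldsymbol{k}(\boldsymbol{s})}$ in the limit $N\to\infty$ subject to the following hyper-plane constraint

$$\frac{1}{\sqrt{N}}\,\boldsymbol{s}\cdot\boldsymbol{E} \geq -F^N_{\mathrm{diss}}, \tag{109}$$

where $\boldsymbol{E}$ is a vector of energies (eigenvalues of $H$). First, we approximate the multinomial distribution $\boldsymbol{P}^N$ specified in Eq. (95) by a multivariate normal distribution $\mathcal{N}^{(\boldsymbol{\mu},\boldsymbol{\Sigma})}$ with mean vector $\boldsymbol{\mu} = N\boldsymbol{p}$ and covariance matrix $\boldsymbol{\Sigma} = N(\mathrm{diag}(\boldsymbol{p}) - \boldsymbol{p}\boldsymbol{p}^T)$:

$$\begin{aligned}
\mathcal{N}^{(\boldsymbol{\mu},\boldsymbol{\Sigma})}_{\boldsymbol{k}(\boldsymbol{s})} &= \frac{1}{\sqrt{(2\pi)^d|\boldsymbol{\Sigma}|}}\exp\left(-\frac{1}{2}(\boldsymbol{k}-\boldsymbol{\mu})^T\boldsymbol{\Sigma}^{-1}(\boldsymbol{k}-\boldsymbol{\mu})\right)\\
&= \frac{1}{\sqrt{(2\pi)^d|\boldsymbol{\Sigma}|}}\exp\left(-\frac{1}{2}\boldsymbol{s}^TN\boldsymbol{\Sigma}^{-1}\boldsymbol{s}\right).
\end{aligned} \tag{110}$$

As we explain in Appendix C, such an approximation can always be made with an error approaching 0 as $N\to\infty$. Next, we standardise the multivariate normal distribution $\mathcal{N}^{(\boldsymbol{\mu},\boldsymbol{\Sigma})}$ using rotation and scaling transformations:

$$\boldsymbol{\Sigma} = \boldsymbol{\Theta}^T\sqrt{\boldsymbol{\Lambda}}\sqrt{\boldsymbol{\Lambda}}\boldsymbol{\Theta}, \tag{111}$$

where $\boldsymbol{\Lambda}$ is a diagonal matrix with the eigenvalues of $\boldsymbol{\Sigma}$ and $\boldsymbol{\Theta}$ is an orthogonal matrix with columns given by the eigenvectors of $\boldsymbol{\Sigma}$. We illustrate this process for a three-level system (so described by $s_1$ and $s_2$ since $\sum_i s_i = 0$) in Fig. 6. This rotation and scaling of co-ordinates allows us to write $\mathcal{N}^{(\boldsymbol{\mu},\boldsymbol{\Sigma})}$ as a product of univariate standard normal distributions $\phi(y_i)$:

$$\mathcal{N}^{(\boldsymbol{\mu},\boldsymbol{\Sigma})}_{\boldsymbol{k}(\boldsymbol{s}(\boldsymbol{y}))} = \frac{1}{\sqrt{(2\pi)^d|\boldsymbol{\Sigma}|}}\exp\left(-\frac{1}{2}\boldsymbol{y}^T\boldsymbol{y}\right) = \prod_{i=1}^{d}\phi(y_i), \tag{112}$$

where

$$\boldsymbol{y} = \sqrt{N}\left(\boldsymbol{\Theta}^T\sqrt{\boldsymbol{\Lambda}}\right)^{-1}\boldsymbol{s}. \tag{113}$$

We then can equivalently write the equation specifying the hyper-plane, Eq. (109), as

$$\left(\boldsymbol{\Theta}^T\sqrt{\boldsymbol{\Lambda}}\,\boldsymbol{y}\right)\cdot\boldsymbol{E} \geq -NF^N_{\mathrm{diss}}. \tag{114}$$

Observe that the standard normal distribution given in Eq. (112) is rotationally invariant about the origin. One can thus choose a coordinate system $\boldsymbol{x} = \{x_1,\dots,x_d\}$ by applying a suitable rotation $\boldsymbol{R}$ on $\boldsymbol{y} = \{y_1,\dots,y_d\}$, so that the hyper-plane specified in Eq. (109) becomes parallel to a particular coordinate axis $x_1$. Eq. (112) can then be rewritten in the following form

$$\mathcal{N}^{(\boldsymbol{\mu},\boldsymbol{\Sigma})}_{\boldsymbol{k}(\boldsymbol{s}(\boldsymbol{x}))} = \frac{1}{\sqrt{(2\pi)^d|\boldsymbol{\Sigma}|}}\exp\left(-\frac{1}{2}\boldsymbol{x}^T\boldsymbol{x}\right) = \prod_{i=1}^{d}\phi(x_i). \tag{115}$$

As we have

$$\boldsymbol{x} = \boldsymbol{R}\boldsymbol{y}, \tag{116}$$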

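we can use Eq. (116) and Eq. (113) to rewrite Eq. (109) as

$$\frac{1}{N}\left(\boldsymbol{\Theta}^T\sqrt{\boldsymbol{\Lambda}}\boldsymbol{R}^T\boldsymbol{x}\right)\cdot\boldsymbol{E} \geq -F^N_{\mathrm{diss}}. \tag{117}$$

To calculate the right hand side of the inequality given in Eq. (108) in the limit $N\to\infty$, we integrate Eq. (115) from $-\infty$ to $d_O$ along $x_1$, and from $-\infty$ to $+\infty$ along any other $x_i\neq x_1$, where $d_O$ is the signed distance of the hyper-plane given in Eq. (117) from the origin (see Fig. 6). This distance can be explicitly calculated as

$$d_O = \frac{F^N_{\mathrm{diss}}}{\sqrt{\frac{1}{N^2}\boldsymbol{E}\cdot\left(\boldsymbol{R}\sqrt{\boldsymbol{\Lambda}}\boldsymbol{\Theta}\right)^T\left(\boldsymbol{R}\sqrt{\boldsymbol{\Lambda}}\boldsymbol{\Theta}\right)\boldsymbol{E}}} = \frac{\sqrt{N}F^N_{\mathrm{diss}}}{\sqrt{\frac{1}{N}\boldsymbol{E}\cdot(\boldsymbol{\Sigma}\boldsymbol{E})}} = \frac{\sqrt{N}F^N_{\mathrm{diss}}}{\sigma(F^N)}, \tag{118}$$

where we have used the definition of $\sigma(F^N)$ from Eq. (16b) in the last line. Thus, the upper bound on $\epsilon_N$ in the limit $N\to\infty$ given in Eq. (108) can be calculated as

$$1 - \lim_{N\to\infty}\int_{-\infty}^{+\infty}dx_d\,\phi(x_d)\cdots\int_{-\infty}^{+\infty}dx_2\,\phi(x_2)\int_{-\infty}^{d_O}dx_1\,\phi(x_1) = 1 - \lim_{N\to\infty}\Phi(d_O) = \lim_{N\to\infty}\Phi\left(-\frac{F^N_{\mathrm{diss}}}{\sigma(F^N)}\cdot\sqrt{N}\right). \tag{119}$$

Figure 6: Standardising the bivariate normal distribution. The points with equal probability density for the bivariate normal distribution are represented by a red ellipse centred at the origin, and the black dashed line corresponds to the constraining hyper-plane. The upper bound on $\epsilon_N$ is given by the probability mass within the area depicted in grey. In order to calculate it, we first apply a rotation and scaling transformation, making the ellipse rotationally symmetric about the origin. Then, using the rotational symmetry of the standard bivariate normal distribution, one can rotate it such that the hyper-plane becomes parallel to $x_1$.

4.3.3 Lower bound for the error of transformation

We start by writing $L$ from Eq. (103) as

$$L = \exp\left(AN + x\sqrt{Nv_N}\right) =: L(x), \tag{120}$$

where

$$A = H(\boldsymbol{\eta}) - D(\psi\|\gamma), \qquad x = \frac{\sqrt{N}F^N_{\mathrm{diss}}}{\sigma(F^N)}. \tag{121}$$

In the previous section we have exactly calculated the right hand side of Eq. (104) in the limit $N\to\infty$ (see Eq. (119)). Using Eqs. (120)-(121), we can equivalently rewrite this as

$$\lim_{N\to\infty}\sum_{\boldsymbol{k},g_{\boldsymbol{k}}}\left\{\hat{P}^N_{\boldsymbol{k},g_{\boldsymbol{k}}}\,\middle|\,\hat{P}^N_{\boldsymbol{k},g_{\boldsymbol{k}}}\geq\frac{1}{L(x)}\right\} = \lim_{N\to\infty}\Phi(x), \tag{122}$$

where $x$ depends on $N$ as per Eq. (121). We shall use this result to derive the lower bound on $\epsilon_N$.

To prove the lower bound on transformation error $\epsilon_N$ we start with the exact expression, Eq. (100), and use the inequality from Eq. (59b). Taking $\alpha = \exp(\delta\sqrt{N})$ and $\delta > 0$ we then get

$$\epsilon_N \geq 1 - \sum_{i=1}^{\chi_{\hat{\boldsymbol{P}}^N}(e^{\delta\sqrt{N}}L(x))/c}(\hat{\boldsymbol{P}}^N)_i^{\downarrow} = 1 - \sum_{i=1}^{\chi_{\hat{\boldsymbol{P}}^N}(L(x+\delta))/c}(\hat{\boldsymbol{P}}^N)_i^{\downarrow}, \tag{123}$$

where $c$ can be evaluated similarly as before. So

$$\begin{aligned}
c &= e^{\delta\sqrt{N}/2}\sum_{i=\chi_{\hat{\boldsymbol{P}}^N}(e^{\delta\sqrt{N}/2}L(x))}^{\chi_{\hat{\boldsymbol{P}}^N}(e^{\delta\sqrt{N}}L(x))}(\hat{\boldsymbol{P}}^N)_i^{\downarrow}
= e^{\delta\sqrt{N}/2}\sum_{i=\chi_{\hat{\boldsymbol{P}}^N}(L(x+\delta/2))}^{\chi_{\hat{\boldsymbol{P}}^N}(L(x+\delta))}(\hat{\boldsymbol{P}}^N)_i^{\downarrow}\\
&= e^{\delta\sqrt{N}/2}\left(\sum_i\left\{\hat{P}_i^N\,\middle|\,\hat{P}_i^N\geq\frac{1}{L(x+\delta)}\right\} - \sum_i\left\{\hat{P}_i^N\,\middle|\,\hat{P}_i^N\geq\frac{1}{L(x+\delta/2)}\right\}\right).
\end{aligned} \tag{124}$$

Using Eq. (122), we see that the limiting behaviour of $c$ from Eq. (124) is given by

$$\lim_{N\to\infty}c = \left(\Phi(x+\delta) - \Phi(x+\delta/2)\right)\lim_{N\to\infty}e^{\delta\sqrt{N}/2}. \tag{125}$$

Thus, for any finite $\delta > 0$, there always exists $N_0$ such that for all $N\geq N_0$ we have $c > 1$. Combining this observation with Eq. (123) we obtain

$$\begin{aligned}
\lim_{N\to\infty}\epsilon_N &\geq 1 - \lim_{N\to\infty}\sum_{i=1}^{\chi_{\hat{\boldsymbol{P}}^N}(L(x+\delta))}(\hat{\boldsymbol{P}}^N)_i^{\downarrow}\\
&= 1 - \lim_{N\to\infty}\sum_i\left\{\hat{P}_i^N\,\middle|\,\hat{P}_i^N\geq\frac{1}{L(x+\delta)}\right\}\\
&= 1 - \lim_{N\to\infty}\Phi(x+\delta) = \lim_{N\to\infty}\Phi(-x-\delta),
\end{aligned} \tag{126}$$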

where the first equality in the last line follows from Eq. (122). Since the above inequality holds for any $\delta > 0$, taking the limit $\delta\to 0$ we conclude that

$$\lim_{N\to\infty}\epsilon_N \geq \lim_{N\to\infty}\Phi\left(-\frac{F^N_{\mathrm{diss}}}{\sigma(F^N)}\cdot\sqrt{N}\right). \tag{127}$$

Finally, combining the above with the bound obtained in Eq. (119), we have

$$\lim_{N\to\infty}\epsilon_N = \lim_{N\to\infty}\Phi\left(-\frac{F^N_{\mathrm{diss}}}{\sigma(F^N)}\cdot\sqrt{N}\right), \tag{128}$$

which completes the proof.

5 Outlook

In this paper, we have derived a version of the fluctuation-dissipation theorem for state interconversion under thermal operations, where for a fixed transformation error we have established the relation between fluctuations of free energy in the initial state of the system and average dissipation of free energy, i.e. the difference in free energy between the initial and final states. We addressed and solved the problem in two different scenarios: for initial states being either energy-incoherent or pure, with the target state in both cases being an energy eigenstate, and with the possibility to change the Hamiltonian in the process. For the case of finitely many independent but not necessarily identical energy-incoherent systems, we have provided the single-shot upper bound on the optimal transformation error as a function of average dissipated free energy and free energy fluctuations. Moreover, in the asymptotic regime we obtained the optimal transformation error up to second order asymptotic corrections, which extends previous results of Ref. [20] to the regime of non-identical initial systems and varying Hamiltonians. For the first time we have also performed the asymptotic analysis of the thermodynamic distillation process from quantum states that have coherence in the energy eigenbasis. As a result, we expressed the optimal transformation error from identical pure states up to second order asymptotic corrections as a function of average dissipated free energy and free energy fluctuations.

Our work can be naturally extended in the following directions. Firstly, one could generalise our analysis to arbitrary initial states. We indeed believe that an analogous result to ours will hold for such general mixed states with coherence. That is because dephasing into fixed energy subspaces leads to average free energy change of the order $O(\frac{\log N}{N})$, which is negligible compared to the second order asymptotic corrections of the order $O(1/\sqrt{N})$ that we focus on. In other words, the contribution of coherence to free energy vanishes faster with growing $N$ than the second order corrections we are interested in. Secondly, it would be extremely interesting to generalise the interconversion problem to arbitrary final states, and see how the interplay between the fluctuations of the initial and target states affects dissipation. For energy-incoherent initial and final states one can infer from Ref. [22] that appropriately tuned fluctuations can significantly reduce dissipation, however nothing is known for states with coherence. Unfortunately, since thermal operations are time-translation covariant, such that coherence and athermality form independent resources [45–47], it seems unlikely that the current approach can be easily generalised. Thirdly, one could try to extend our results on pure states to allow for non-identical systems and to derive a bound working for all $N$, not only for $N\to\infty$ (i.e., replace the proving technique based on central limit theorem by the one based on a version of Berry-Esseen theorem).

In our work we have also provided a number of physical applications of our fluctuation-dissipation theorems by considering several scenarios and explaining how our results can be useful to describe fundamental and well-known thermodynamic and information-theoretic processes. We derived the optimal value of extractable work in a thermodynamic distillation process as a function of the transformation error associated to the work quality. This could potentially be used to clarify the notion of imperfect work [27, 48, 49], and to construct a comparison platform allowing one to continuously distinguish between work-like and heat-like forms of energy. We have also shown how our results yield the optimal trade-off between the work invested in erasing $N$ independent bits prepared in arbitrary states, and the erasure quality measured by the infidelity distance between the final state and the fully erased state. This can of course be straightforwardly extended to higher-dimensional systems and arbitrary final erased state (not necessarily the ground state). Finally, we have investigated the optimal encoding rate into a collection of non-interacting subsystems consisting of energy-incoherent or pure states using thermal operations. We derived the optimal rate (up to second-order asymptotics) of encoding information with a given average decoding error and without spending thermodynamic resources. This provides an operational interpretation of the resourcefulness of athermal quantum states for communication scenarios under the restriction of using thermal operations.

We would also like to point out some possible technical extensions of our results. Firstly, we used infidelity as our quantifier of transformation error, but we expect that similar results could be derived using other quantifiers, e.g., the trace distance. Secondly, our investigations were performed in the spirit of small-deviation analysis (where we look for constant transformation error and total free energy dissipation of the order $O(\sqrt{N})$), but possibly other interesting fluctuation-dissipation relations could be derived within the moderate and large deviation regimes. Thirdly, our result for pure states is limited to Hamiltonians with incommensurable spectrum, but we believe this is just a technical nuisance that one should be able to get rid of. Lastly,

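within the framework of general resource theories, it might be possible to derive analogous fluctuation-dissipation relations, but with free energy replaced by a resource quantifier relevant for a given resource theory.

Acknowledgements

KK would like to thank Chris Chubb and Marco Tomamichel for helpful discussions. KK and AOJ acknowledge financial support by the Foundation for Polish Science through TEAM-NET project (contract no. POIR.04.04.00-00-17C1/18-00). TB and MH acknowledge support from the Foundation for Polish Science through IRAP project co-financed by EU within the Smart Growth Operational Programme (contract no. 2018/MAB/5).

A Optimality of the communication rate

The following derivation will closely follow the proof of Lemma 1 of Ref. [36]. Let us assume that for a system $(\rho^N, H^N)$ it is possible to encode $M$ messages in a thermodynamically-free way so that the average decoding error is $\epsilon$. It means that there exists a set of $M$ encoding thermal operations $\{\mathcal{E}_i\}_{i=1}^M$ and a decoding POVM $\{\Pi_i\}_{i=1}^M$ such that

$$1 - \epsilon = \frac{1}{M} \sum_{i=1}^{M} \mathrm{Tr}\left(\mathcal{E}_i(\rho^N)\,\Pi_i\right). \tag{129}$$

Note that every thermal operation $\mathcal{E}_i$ between the initial system $(\rho^N, H^N)$ and a target system $(\tilde{\rho}^N, \tilde{H}^N)$ preserves the thermal equilibrium state, i.e.,

$$\mathcal{E}_i(\gamma^N) = \tilde{\gamma}^N. \tag{130}$$

Now, let us introduce the following three states

$$\tau := \frac{1}{M} \sum_{i=1}^{M} |i\rangle\langle i| \otimes \mathcal{E}_i(\rho^N), \tag{131a}$$

$$\zeta := \frac{1}{M} \sum_{i=1}^{M} |i\rangle\langle i| \otimes \gamma^N, \tag{131b}$$

$$\tilde{\zeta} := \frac{1}{M} \sum_{i=1}^{M} |i\rangle\langle i| \otimes \tilde{\gamma}^N. \tag{131c}$$

The hypothesis testing relative entropy $D_H^\epsilon$ between $\tau$ and $\tilde{\zeta}$, defined by [50–52]

$$D_H^\epsilon(\tau\|\tilde{\zeta}) := -\log \inf \left\{ \mathrm{Tr}\left(Q\tilde{\zeta}\right) \,\middle|\, 0 \le Q \le \mathbb{1},\ \mathrm{Tr}(Q\tau) \ge 1 - \epsilon \right\}, \tag{132}$$

satisfies the following

$$D_H^\epsilon(\tau\|\tilde{\zeta}) \ge -\log \mathrm{Tr}\left(Q\tilde{\zeta}\right) \tag{133}$$

for

$$Q = \sum_{i=1}^{M} |i\rangle\langle i| \otimes \Pi_i. \tag{134}$$

This is because the above (potentially suboptimal) choice of $Q$ clearly satisfies $0 \le Q \le \mathbb{1}$ and also

$$\mathrm{Tr}(Q\tau) = \frac{1}{M} \sum_{i=1}^{M} \mathrm{Tr}\left(\mathcal{E}_i(\rho^N)\,\Pi_i\right) \ge 1 - \epsilon, \tag{135}$$

due to our assumption in Eq. (129). At the same time we have

$$\mathrm{Tr}\left(Q\tilde{\zeta}\right) = \frac{1}{M} \sum_{i=1}^{M} \mathrm{Tr}\left(\tilde{\gamma}^N \Pi_i\right) = \frac{1}{M}, \tag{136}$$

where the last equality uses the POVM completeness relation $\sum_{i=1}^M \Pi_i = \mathbb{1}$, so that

$$\log M \le D_H^\epsilon(\tau\|\tilde{\zeta}). \tag{137}$$

Next, we introduce the following encoding channel

$$\mathcal{F} := \sum_{i=1}^{M} |i\rangle\langle i| \otimes \mathcal{E}_i, \tag{138}$$

which satisfies

$$\mathcal{F}(\zeta) = \tilde{\zeta}. \tag{139}$$

Employing the data-processing inequality twice, we get the following sequence of inequalities:

$$\begin{aligned}
D_H^\epsilon(\tau\|\tilde{\zeta}) &= D_H^\epsilon\left( \mathcal{F}\left( \frac{1}{M} \sum_{i=1}^{M} |i\rangle\langle i| \otimes \rho^N \right) \,\middle\|\, \mathcal{F}(\zeta) \right) \\
&\le D_H^\epsilon\left( \frac{1}{M} \sum_{i=1}^{M} |i\rangle\langle i| \otimes \rho^N \,\middle\|\, \zeta \right) \\
&\le D_H^\epsilon\left( \rho^N \,\middle\|\, \gamma^N \right).
\end{aligned} \tag{140}$$

Combining this with Eq. (137), we arrive at

$$\log M \le D_H^\epsilon\left( \rho^N \,\middle\|\, \gamma^N \right). \tag{141}$$

Finally, for the case of identical initial subsystems, $\rho^N = \rho^{\otimes N}$ and $\gamma^N = \gamma^{\otimes N}$, we can use the known second-order asymptotic expansion of the hypothesis testing relative entropy [53],

$$\frac{1}{N} D_H^\epsilon(\rho^{\otimes N}\|\gamma^{\otimes N}) \simeq D(\rho\|\gamma) + \sqrt{\frac{V(\rho\|\gamma)}{N}}\, \Phi^{-1}(\epsilon), \tag{142}$$

leading to

$$\frac{\log M}{N} \lesssim D(\rho\|\gamma) + \sqrt{\frac{V(\rho\|\gamma)}{N}}\, \Phi^{-1}(\epsilon). \tag{143}$$

For the above proof to work also in the case of non-identical subsystems, one would need to prove the following asymptotic behaviour of the hypothesis testing relative entropy:

$$\frac{1}{N} D_H^\epsilon\left( \bigotimes_{n=1}^{N} \rho_n \,\middle\|\, \bigotimes_{n=1}^{N} \gamma_n \right) \simeq \frac{1}{N} \sum_{n=1}^{N} D(\rho_n\|\gamma_n) + \sqrt{\frac{\frac{1}{N}\sum_{n=1}^{N} V(\rho_n\|\gamma_n)}{N}}\, \Phi^{-1}(\epsilon). \tag{144}$$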
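To get a feel for the size of the second-order term in Eq. (143), the following minimal sketch evaluates the right-hand side for an energy-incoherent state. It assumes that $\rho$ and $\gamma$ are diagonal in the same basis, so that $D$ and $V$ reduce to the classical relative entropy and its variance, that logarithms are natural, and that $\Phi^{-1}$ is the inverse of the standard normal cumulative distribution function. The function names and the example populations are hypothetical and chosen only for illustration.

```python
from math import log, sqrt
from statistics import NormalDist


def relative_entropy(p, g):
    """Classical relative entropy D(p||g) in nats, for populations p and g."""
    return sum(pi * log(pi / gi) for pi, gi in zip(p, g) if pi > 0)


def relative_entropy_variance(p, g):
    """Variance V(p||g) of log(p_i / g_i) under the distribution p."""
    d = relative_entropy(p, g)
    return sum(pi * (log(pi / gi) - d) ** 2 for pi, gi in zip(p, g) if pi > 0)


def second_order_rate(p, g, n, eps):
    """Right-hand side of Eq. (143): D + sqrt(V / n) * Phi^{-1}(eps)."""
    d = relative_entropy(p, g)
    v = relative_entropy_variance(p, g)
    return d + sqrt(v / n) * NormalDist().inv_cdf(eps)


if __name__ == "__main__":
    # Illustrative qubit populations (assumed): p for rho, g for the thermal state.
    p, g = [0.9, 0.1], [0.5, 0.5]
    for n in (10, 100, 1000, 10000):
        print(n, second_order_rate(p, g, n, eps=0.1))
    print("first-order rate D:", relative_entropy(p, g))
```

For $\epsilon < 1/2$ the correction is negative, so the bound on $\log M / N$ approaches $D(\rho\|\gamma)$ from below as $N$ grows, with a gap that shrinks as $1/\sqrt{N}$.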

B Eliminating the logarithmic term

We start with the following lemma that will be needed to prove our claim.

Lemma 7. For a fixed value of $b > 0$ and any $s$ such that $\|s\| = \sqrt{\sum_{i=1}^{d} s_i^2} \le b$, we have

$$\log P^N_{k(s)} = O(\log N), \tag{145}$$

where $P^N_k$ and $k(s)$ are defined by Eqs. (95) and (105), respectively.

Proof. We start by using the definition,

$$P^N_{k(s)} = \binom{N}{k_1(s_1), \dots, k_d(s_d)}\, p_1^{k_1(s_1)} \cdots p_d^{k_d(s_d)}, \tag{146}$$

to write $\log P^N_{k(s)}$ as

$$\log P^N_{k(s)} = \log N! - \sum_{i=1}^{d} \log k_i(s_i)! + \sum_{i=1}^{d} k_i(s_i) \log p_i. \tag{147}$$

Employing the Stirling inequality,

$$\log \sqrt{N} + N \log N - N \le \log N! \le 1 + \log \sqrt{N} + N \log N - N, \tag{148}$$

we first provide a lower bound for $\log P^N_{k(s)}$,

$$\begin{aligned}
\log P^N_{k(s)} &\ge \left(\log \sqrt{N} + N \log N - N\right) + \sum_{i=1}^{d} k_i(s_i) \log p_i \\
&\quad - \sum_{i=1}^{d} \left(1 + \log \sqrt{k_i(s_i)} + k_i(s_i) \log k_i(s_i) - k_i(s_i)\right) \\
&= -\sum_{i=1}^{d} k_i(s_i) \log \frac{k_i(s_i)}{N p_i} - d + \frac{1}{2}\left(\log N - \sum_{i=1}^{d} \log k_i(s_i)\right).
\end{aligned} \tag{149}$$

Recall that $k_i(s_i) = N p_i + \sqrt{N} s_i$, which implies $\sum_{i=1}^{d} s_i = 0$. To simplify the above further, we lower bound the first term by employing the inequality $\log(1 + g) \le g$ for $g > -1$ in the following way:

$$\begin{aligned}
\sum_{i=1}^{d} k_i(s_i) \log \frac{k_i(s_i)}{N p_i} &= \sum_{i=1}^{d} k_i(s_i) \log\left(1 + \frac{s_i}{\sqrt{N} p_i}\right) \\
&\le \sum_{i=1}^{d} k_i(s_i) \frac{s_i}{\sqrt{N} p_i} = \sqrt{N} \sum_{i=1}^{d} \left(1 + \frac{s_i}{\sqrt{N} p_i}\right) s_i \\
&= \sum_{i=1}^{d} \frac{s_i^2}{p_i} \le \sum_{i=1}^{d} \frac{s_i^2}{p_{\min}} \le \frac{b^2}{p_{\min}},
\end{aligned} \tag{150}$$

where $p_{\min} = \min\{p_1, \dots, p_d\}$. Moreover, observing that

$$\log N \ge \log N - \sum_{i=1}^{d} \log k_i(s_i) \ge -(d-1) \log N, \tag{151}$$

we can conclude that

$$\log N - \sum_{i=1}^{d} \log k_i(s_i) = O(\log N). \tag{152}$$

Putting it together, we further simplify the bound given in Eq. (149) as

$$\log P^N_{k(s)} \ge -d - \frac{b^2}{p_{\min}} - \frac{(d-1)}{2} \log N = O(\log N). \tag{153}$$

Similarly, by employing the Stirling inequality from Eq. (148), we also prove an upper bound for $\log P^N_{k(s)}$ as follows

$$\begin{aligned}
\log P^N_{k(s)} &\le \left(1 + \log \sqrt{N} + N \log N - N\right) + \sum_{i=1}^{d} k_i(s_i) \log p_i \\
&\quad - \sum_{i=1}^{d} \left(\log \sqrt{k_i(s_i)} + k_i(s_i) \log k_i(s_i) - k_i(s_i)\right) \\
&= -\sum_{i=1}^{d} k_i(s_i) \log \frac{k_i(s_i)}{N p_i} + 1 + \frac{1}{2}\left(\log N - \sum_{i=1}^{d} \log k_i(s_i)\right).
\end{aligned} \tag{154}$$

Using the inequality $\log(1 + g) \ge \frac{g}{1+g}$ for $g > -1$, we have

$$\begin{aligned}
\sum_{i=1}^{d} k_i(s_i) \log \frac{k_i(s_i)}{N p_i} &= \sum_{i=1}^{d} k_i(s_i) \log\left(1 + \frac{s_i}{\sqrt{N} p_i}\right) \\
&\ge \sum_{i=1}^{d} k_i(s_i) \frac{s_i}{s_i + \sqrt{N} p_i} = \sqrt{N} \sum_{i=1}^{d} s_i = 0.
\end{aligned} \tag{155}$$

The above inequality together with Eq. (152) implies that $\log P^N_{k(s)} \le O(\log N)$, which completes the proof.

Using Lemma 7, we will now be able to prove our claim that is captured by the following result.

Lemma 8. The following limits are equal:

$$\lim_{N\to\infty} \sum_{s} P^N_{k(s)} \left\{ \frac{1}{N} \log\left(P^N_{k(s)}\right) + \frac{\beta}{\sqrt{N}} \sum_{i} s_i E_i + F^N_{\mathrm{diss}} \ge 0 \right\} = \lim_{N\to\infty} \sum_{s} P^N_{k(s)} \left\{ \frac{\beta}{\sqrt{N}} \sum_{i} s_i E_i + F^N_{\mathrm{diss}} \ge 0 \right\}. \tag{156}$$
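The two-sided bound obtained in the proof of Lemma 7, and the resulting decay of the $\frac{1}{N}\log P^N_{k(s)}$ term that Lemma 8 discards, can be checked numerically. The sketch below is only an illustration under stated assumptions: the distribution $p$, the displacement $s$ and the choice $N = 4m^2$ (which makes every $k_i = N p_i + \sqrt{N} s_i$ an integer) are hypothetical example values, and the helper names are not taken from the paper. The upper bound used is $1 + \frac{1}{2}\log N$, which follows from combining Eqs. (154), (155) and (151).

```python
from math import lgamma, log


def log_multinomial_prob(k, p):
    """log of the multinomial probability of counts k under probabilities p."""
    n = sum(k)
    log_coeff = lgamma(n + 1) - sum(lgamma(ki + 1) for ki in k)
    return log_coeff + sum(ki * log(pi) for ki, pi in zip(k, p))


def lemma7_bounds(n, s, p):
    """Lower bound of Eq. (153) and the upper bound 1 + (1/2) log N."""
    d = len(p)
    b_sq = sum(si ** 2 for si in s)  # take b = ||s||
    lower = -d - b_sq / min(p) - 0.5 * (d - 1) * log(n)
    upper = 1 + 0.5 * log(n)
    return lower, upper


def lemma7_table(ms, p=(0.5, 0.25, 0.25), s=(1.0, -0.5, -0.5)):
    """Rows (N, log P, lower, upper) for N = 4 m^2 and k_i = N p_i + sqrt(N) s_i."""
    rows = []
    for m in ms:
        n, root = 4 * m * m, 2 * m  # root = sqrt(N), exact by construction
        k = [round(n * pi + root * si) for pi, si in zip(p, s)]
        assert sum(k) == n and min(k) >= 1
        lower, upper = lemma7_bounds(n, s, p)
        rows.append((n, log_multinomial_prob(k, p), lower, upper))
    return rows


if __name__ == "__main__":
    for n, log_p, lower, upper in lemma7_table([2, 4, 8, 16, 32]):
        # The last column is the term that vanishes in Lemma 8.
        print(n, lower, log_p, upper, log_p / n)
```

Using `lgamma` instead of explicit factorials keeps the evaluation stable for large $N$, at the cost of floating-point rounding in the last digits.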

Proof. We start by introducing the following notation

$$\begin{aligned}
A(N) &:= \sum_{s} P^N_{k(s)} \left\{ \frac{1}{N} \log\left(P^N_{k(s)}\right) + \frac{\beta}{\sqrt{N}} \sum_{i} s_i E_i + F^N_{\mathrm{diss}} \ge 0 \right\}, \\
A(b, N) &:= \sum_{s} P^N_{k(s)} \left\{ \frac{1}{N} \log\left(P^N_{k(s)}\right) + \frac{\beta}{\sqrt{N}} \sum_{i} s_i E_i + F^N_{\mathrm{diss}} \ge 0 \ \text{such that}\ \|s\| \le b \right\}, \\
B(N) &:= \sum_{s} P^N_{k(s)} \left\{ \frac{\beta}{\sqrt{N}} \sum_{i} s_i E_i + F^N_{\mathrm{diss}} \ge 0 \right\}, \\
B(b, N) &:= \sum_{s} P^N_{k(s)} \left\{ \frac{\beta}{\sqrt{N}} \sum_{i} s_i E_i + F^N_{\mathrm{diss}} \ge 0 \ \text{such that}\ \|s\| \le b \right\}, \\
\Omega(b, N) &:= \sum_{s} P^N_{k(s)} \left\{ \text{such that}\ \|s\| \ge b \right\}.
\end{aligned} \tag{157}$$

Our goal is to show that

$$\lim_{N\to\infty} A(N) = \lim_{N\to\infty} B(N). \tag{158}$$

From the definition it follows that

$$A(N) - \Omega(b, N) \le A(b, N) \le A(N), \tag{159a}$$

$$B(N) - \Omega(b, N) \le B(b, N) \le B(N). \tag{159b}$$

Taking the limit $N \to \infty$ of Eqs. (159a) and (159b), we have

$$\lim_{N\to\infty} \left(A(N) - \Omega(b, N)\right) \le \lim_{N\to\infty} A(b, N) \le \lim_{N\to\infty} A(N), \tag{160a}$$

$$\lim_{N\to\infty} \left(B(N) - \Omega(b, N)\right) \le \lim_{N\to\infty} B(b, N) \le \lim_{N\to\infty} B(N). \tag{160b}$$

Now, let us define

$$\lim_{N\to\infty} \Omega(b, N) =: \epsilon(b). \tag{161}$$

As the multinomial distribution concentrates around the mean for $N \to \infty$, it follows that $\lim_{b\to\infty} \epsilon(b) = 0$. Therefore, taking the limit $b \to \infty$ in Eq. (160a) we have

$$\lim_{N\to\infty} A(N) \le \lim_{b\to\infty} \lim_{N\to\infty} A(b, N) \le \lim_{N\to\infty} A(N) \quad\Rightarrow\quad \lim_{b\to\infty} \lim_{N\to\infty} A(b, N) = \lim_{N\to\infty} A(N). \tag{162}$$

Analogously, taking the limit $b \to \infty$ in Eq. (160b) we can show that

$$\lim_{b\to\infty} \lim_{N\to\infty} B(b, N) = \lim_{N\to\infty} B(N). \tag{163}$$

Moreover, for any fixed $b$, by employing Lemma 7, we see that $\frac{1}{N} \log\left(P^N_{k(s)}\right) = O\left(\frac{\log N}{N}\right)$, which vanishes as $N \to \infty$. Therefore, we have

$$\lim_{N\to\infty} A(b, N) = \lim_{N\to\infty} B(b, N), \tag{164}$$

and so by taking the limit $b \to \infty$, we arrive at

$$\lim_{b\to\infty} \lim_{N\to\infty} A(b, N) = \lim_{b\to\infty} \lim_{N\to\infty} B(b, N). \tag{165}$$

Combining the above with Eqs. (162)-(163) we have

$$\lim_{N\to\infty} A(N) = \lim_{N\to\infty} B(N), \tag{166}$$

which completes the proof.

C Central limit theorem for multinomial distribution

Lemma 9. The multinomial distribution with mean $\mu = N p$ and a covariance matrix $\Sigma$ can be approximated in the asymptotic limit by a multivariate normal distribution $\mathcal{N}^{(\mu,\Sigma)}$ with mean $\mu$ and a covariance matrix $\Sigma$.

Proof. Assume $X_1, \dots, X_N$ are independent and identically distributed random vectors, each of them with the following distribution

$$\mathrm{Prob}(X = x) = \begin{cases} \prod_{i=1}^{d} p_i^{x_i} & \text{if } x \text{ is a unit vector}, \\ 0 & \text{otherwise}. \end{cases} \tag{167}$$

Then, the mean vector of $X$ is $p$ and the covariance matrix is $\frac{1}{N}\Sigma = \mathrm{diag}(p) - p p^{T}$. Define $S_N := X_1 + \dots + X_N$. Then

$$\mathrm{Prob}(S_N = k) = \binom{N}{k_1, \dots, k_d}\, p_1^{k_1} \cdots p_d^{k_d}. \tag{168}$$

We thus see that a multinomial distribution arises from a sum of independent and identically distributed random variables. Therefore, using the central limit theorem, we obtain that the distribution of $k$ approaches the distribution $\mathcal{N}^{(\mu,\Sigma)}$ arbitrarily well for $N \to \infty$, which completes the proof.

References

[1] C. Carathéodory, Math. Ann. 67, 355 (1909).
[2] R. Giles, Mathematical Foundations of Thermodynamics (Pergamon Press, 1964).
[3] U. Seifert, Eur. Phys. J. B 64, 423 (2008).
[4] U. Seifert, Rep. Prog. Phys. 75, 126001 (2012).
[5] A. Einstein, Ann. Phys. 324, 371 (1906).
[6] M. von Smoluchowski, Ann. Phys. 326, 756 (1906).
[7] R. Kubo, Rep. Prog. Phys. 29, 255 (1966).
[8] M. V. S. Bonança, Phys. Rev. E 78, 031107 (2008).
[9] T. B. Batalhão, A. M. Souza, L. Mazzola, R. Auccaise, R. S. Sarthour, I. S. Oliveira, J. Goold, G. De Chiara, M. Paternostro, and R. M. Serra, Phys. Rev. Lett. 113, 140601 (2014).
[10] S. An, J.-N. Zhang, M. Um, D. Lv, Y. Lu, J. Zhang, Z.-Q. Yin, H. T. Quan, and K. Kim, Nat. Phys. 11, 193 (2015).
[11] D. Janzing, P. Wocjan, R. Zeier, R. Geiss, and T. Beth, Int. J. Theor. Phys. 39, 2717 (2000).
[12] M. Horodecki and J. Oppenheim, Int. J. Mod. Phys. B 27, 1345019 (2013).
[13] F. G. S. L. Brandão, M. Horodecki, N. H. Y. Ng, J. Oppenheim, and S. Wehner, Proc. Natl. Acad. Sci. U.S.A. 112, 3275 (2015).
[14] M. Lostaglio, Rep. Prog. Phys. 82, 114001 (2019).
[15] O. C. O. Dahlsten, R. Renner, E. Rieper, and V. Vedral, New J. Phys. 13, 053015 (2011).
[16] N. Yunger Halpern, A. J. P. Garner, O. C. O. Dahlsten, and V. Vedral, Phys. Rev. E 97, 052135 (2018).
[17] A. M. Alhambra, L. Masanes, J. Oppenheim, and C. Perry, Phys. Rev. X 6, 041017 (2016).
[18] N. Y. Halpern, A. J. P. Garner, O. C. O. Dahlsten, and V. Vedral, New J. Phys. 17, 095003 (2015).
[19] H. Tajima and M. Hayashi, Phys. Rev. E 96, 012128 (2017).
[20] C. T. Chubb, M. Tomamichel, and K. Korzekwa, Quantum 2, 108 (2018).
[21] C. T. Chubb, M. Tomamichel, and K. Korzekwa, Phys. Rev. A 99, 032332 (2019).
[22] K. Korzekwa, C. T. Chubb, and M. Tomamichel, Phys. Rev. Lett. 122, 110403 (2019).
[23] M. Horodecki and J. Oppenheim, Nat. Commun. 4, 2059 (2013).
[24] M. Esposito, U. Harbola, and S. Mukamel, Rev. Mod. Phys. 81, 1665 (2009).
[25] J. M. R. Parrondo, J. M. Horowitz, and T. Sagawa, Nat. Phys. 11, 131 (2015).
[26] R. Alicki, J. Phys. A: Math. Gen. 12, L103 (1979).
[27] J. Åberg, Nat. Commun. 4, 1925 (2013).
[28] P. Skrzypczyk, A. J. Short, and S. Popescu, Nat. Commun. 5, 4185 (2014).
[29] J. Maxwell, Theory of Heat, Text-books of science (Longmans, Green, and Company, 1872).
[30] L. Szilard, Zeitschrift für Physik 53, 840 (1929).
[31] C. H. Bennett, Int. J. Theor. Phys. 21, 905 (1982).
[32] R. Landauer, IBM J. Res. Dev. 5, 183 (1961).
[33] R. Alicki and M. Horodecki, J. Phys. A: Math. Theor. 52, 204001 (2019).
[34] M. Wilde, Quantum Information Theory (Cambridge University Press, 2013).
[35] V. Narasimhachar, J. Thompson, J. Ma, G. Gour, and M. Gu, Phys. Rev. Lett. 122, 060601 (2019).
[36] K. Korzekwa, Z. Puchała, M. Tomamichel, and K. Życzkowski, Encoding classical information into quantum resources (2019), arXiv:1911.12373.
[37] M. Tomamichel and M. Hayashi, IEEE Trans. Inf. Theory 59, 7693 (2013).
[38] P. Boes, N. H. Y. Ng, and H. Wilming, The variance of relative surprisal as single-shot quantifier (2020), arXiv:2009.08391.
[39] R. Alicki, M. Horodecki, P. Horodecki, and R. Horodecki, Open Syst. Inf. Dyn. 11, 205 (2004).
[40] R. Bhatia, Matrix Analysis, Graduate Texts in Mathematics (Springer New York, 1996).
[41] A. C. Berry, Trans. Am. Math. Soc. 49, 122 (1941).
[42] C. G. Esseen, Ark. Mat. Astr. Fys. 49, 1 (1942).
[43] C. G. Esseen, Scand. Actuar. J. 1956, 160 (1956).
[44] I. Shevtsova, On the absolute constants in the Berry-Esseen type inequalities for identically distributed summands (2011), arXiv:1111.6554.
[45] M. Lostaglio, D. Jennings, and T. Rudolph, Nat. Commun. 6, 6383 (2015).
[46] M. Lostaglio, K. Korzekwa, D. Jennings, and T. Rudolph, Phys. Rev. X 5, 021001 (2015).
[47] P. Ćwikliński, M. Studziński, M. Horodecki, and J. Oppenheim, Phys. Rev. Lett. 115, 210403 (2015).
[48] M. P. Woods, N. H. Y. Ng, and S. Wehner, Quantum 3, 177 (2019).
[49] N. H. Y. Ng, M. P. Woods, and S. Wehner, New J. Phys. 19, 113005 (2017).
[50] L. Wang and R. Renner, Phys. Rev. Lett. 108, 200501 (2012).
[51] F. Buscemi and N. Datta, IEEE Trans. Inf. Theory 56, 1447 (2010).
[52] N. Datta, M. Mosonyi, M. Hsieh, and F. G. S. L. Brandão, IEEE Trans. Inf. Theory 59, 8014 (2013).
[53] K. Li, Ann. Stat. 42, 171 (2014).
