<<

PHYSICAL REVIEW X 8, 031013 (2018)

Causal Asymmetry in a Quantum World

† Jayne Thompson,1,* Andrew J. P. Garner,1,7 John R. Mahoney,2 James P. Crutchfield,2 Vlatko Vedral,3,1,4 and Mile Gu5,6,1, 1Centre for Quantum Technologies, National University of Singapore, 3 Science Drive 2, 117543, Singapore 2Complexity Sciences Center and Physics Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA 3Atomic and Laser Physics, University of Oxford, Clarendon Laboratory, Oxford OX1 3PU, United Kingdom 4Department of Physics, National University of Singapore, 3 Science Drive 2, Singapore 117543, Singapore 5School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 639673, Singapore 6Complexity Institute, Nanyang Technological University, Singapore 639673, Singapore 7Institute for Quantum Optics and Quantum Information, Austrian Academy of Sciences, Boltzmanngasse 3, A-1090 Vienna, Austria

(Received 30 November 2017; revised manuscript received 8 May 2018; published 18 July 2018)

Causal asymmetry is one of the great surprises in predictive modeling: The memory required to predict the future differs from the memory required to retrodict the past. There is a privileged temporal direction for modeling a stochastic process where memory costs are minimal. Models operating in the other direction incur an unavoidable memory overhead. Here, we show that this overhead can vanish when quantum models are allowed. Quantum models forced to run in the less-natural temporal direction not only surpass their optimal classical counterparts but also any classical model running in reverse time. This holds even when the memory overhead is unbounded, resulting in quantum models with unbounded memory advantage.

DOI: 10.1103/PhysRevX.8.031013 Subject Areas: Quantum Physics, Quantum Information

I. INTRODUCTION Consider a cannonball in free fall. To model its future How can we observe an asymmetry in the temporal order trajectory, we need only its current position and velocity. This of events when physics at the quantum level is time remains true even when we view the process in reverse time. symmetric? The source of time’s barbed arrow is a long- standing puzzle in foundational science [1–4]. Causal asymmetry offers a provocative perspective [5].Itasks how Occam’s razor—the principle of assuming no more (b) Retrodiction causes of natural things than are both true and sufficient to explain their appearances—can privilege one particular temporal direction over another. In other words, if we want to model a process causally—such that the model makes statistically correct future predictions based only on infor- mation from the past—what is the minimum past information (a) Prediction we must store? Are we forced to store more data if we model events in one particular temporal order over the other (see Fig. 1)? FIG. 1. A stochastic process can be modeled in either temporal order. (a) A causal model takes information available in the past x⃖ and uses it to make statistically accurate predictions about the process’s conditional future behavior PðX⃗jX⃖¼ x⃖Þ. (b) A retro- *[email protected] ’ † causal model replicates the system s behavior, as seen by an [email protected] observer who scans the outputs from right to left, encountering Xt 1 before Xt. Thus, it stores relevant future information x⃗,in Published by the American Physical Society under the terms of þ order to generate a statistically accurate retrodiction of the past the Creative Commons Attribution 4.0 International license. ⃖ ⃗ Further distribution of this work must maintain attribution to PðXjX ¼ x⃗Þ. Causal asymmetry implies a nonzero gap between the author(s) and the published article’s title, journal citation, the minimum memory required by any causal model Cþ and its and DOI. retrocausal counterpart C−.

2160-3308=18=8(3)=031013(15) 031013-1 Published by the American Physical Society JAYNE THOMPSON et al. PHYS. REV. X 8, 031013 (2018)

This exemplifies causal . There is no difference in obtained from the past [7]. Implementing it on a computer the amount of information we must track for prediction then gives us a statistically faithful simulation of the process’s versus retrodiction. However, this is not as obvious for more realizations. The simplest causal model for a process PðX;⃖ X⃗Þ complex processes. Take a glass shattering upon impact with is the model that minimizes H. the floor. In one temporal direction, the future distribution of The statistical complexity Cþ is defined as the H shards depends only on the glass’s current position, velocity, of this simplest model—it is the minimal amount of past and orientation. In the opposite direction, we may need to information needed to make statistically correct future track relevant information regarding each glass shard to infer predictions [13,14]. This measure is used to quantify the glass’s prior trajectory. Does this require more or less structure in diverse settings [15–17], including hidden information? This potential divergence is quantified in the variable models emulating quantum contextuality [18]. theory of computational mechanics [6].Itisnotonly Here, Cþ also fields thermodynamic significance, having generally nonzero, but it can also be unbounded. This been linked to the minimal heat dissipation in stochastic phenomenon implies that a simulator operating in the simulation and the minimal structure a device needs to fully “less-natural” temporal direction is penalized with poten- extract free energy from nonequilibrium environments tially unbounded memory overhead and is cited as a [19–22]. candidate source of time’s barbed arrow [5]. Causal asymmetry captures the discrepancy in statistical These studies assumed that all models are implemented complexity when a process is viewed in forward versus X using classical physics. Could the observed causal asym- reverse time [23]. Consider an observer that encounters tþ1 metry be a consequence of this classicality constraint? Here, before Xt. Their observations are characterized by the time- we first consider a particular stochastic process that is reversed stochastic process P− ¼ P−ðY;⃖Y⃗Þ where past and causally asymmetric. We determine the minimal information ⃖ ⃗ future are interchanged, such that Y ¼ … X1X0, while Y ¼ needed to model the same process in forward versus reverse X−1X−2… and Yt ¼ X− t 1 . A causal model for the time- time using quantum physics, and we prove that these ð þ Þ reversed process then corresponds to a retrocausal model for quantities exactly coincide. More generally, we present P X;⃖ X⃗ systematic methods to model any causally asymmetric the forward process ð Þ. It generates a statistically ⃖⃗ stochastic process quantum mechanically. Critically, the accurate retrodiction of the conditional past PðXjX ¼ x⃗Þ, using only information contained in the future x⃗. The resulting quantum models not only use less information − than any classical counterpart but also than any classical statistical complexity of this time-reversed process C model of the time-reversed process. Thus, quantum models (referred to as the retrodictive statistical complexity can field a memory advantage that always exceeds the for P) quantifies the minimal amount of causal information memory overhead incurred by causal asymmetry. Our work we must assign to model PðX;⃖ X⃗Þ in order of decreasing t. indicates that this overhead can emerge when imposing Causal asymmetry captures the divergence ΔC ¼ classical causal explanations. These results remain true even jC− − Cþj. When ΔC>0, a particular temporal direction in cases where causal asymmetry becomes unbounded. is privileged, such that modeling the process in the other temporal direction incurs a memory overhead of ΔC. Note that the definitions above are entropic measures and II. BACKGROUND thus take operational meaning at the i.i.d. limit—i.e., A. Framework modeling N instances of a stochastic process with statistical complexity Cþ requires NCþ bits of past information, in Consider a system that emits an output xt governed by the limit of large N. While this is the most commonly some random variable Xt at each discrete point in time t. This behavior can be described by a stochastic process adopted measure in computational mechanics, single-shot variants do exist. The topological state complexity Dþ is P—a joint probability distribution P X;⃖ X⃗ that correlates ð Þ particularly noteworthy [13]. It captures the minimum X⃖ …X X past behavior, ¼ −2 −1, with future expectations, number of dimensions (max entropy) Ξ must have to ⃗ X ¼ X0X1…. Each instance of the past x⃖¼ …x−2x−1 generate future statistics. A single-shot variant of causal exhibits a conditional future x⃗¼ x0x1… with probabil- asymmetry can thus be defined by the difference ΔD ¼ − ity PðX⃗¼ x⃗jX⃖¼ x⃖Þ. jD − Dþj between the topological state complexities of − Suppose that a model for this system can replicate this Pþ and P . Here, we focus on statistical complexity for future statistical behavior using only H bitsofpastinforma- clarity. However, many of our results also hold in this tion. Then, this model can be executed by encoding the past x⃖ single-shot regime. We return to this when relevant. into a state sðx⃖Þ ∈ S of a physical system Ξ of entropy H, such that repeated application of a systematic action M on Ξ B. Classical models sequentially generates x0;x1… governed by the conditional Prior studies of causal asymmetry assumed all models future PðX⃗jX⃖¼ x⃖Þ. The model is causal if, at each instance were classical. In this context, causal asymmetry can be of time, all the information Ξ contains about the future can be explicitly demonstrated using ε-machines, the provably

031013-2 CAUSAL ASYMMETRY IN A QUANTUM WORLD PHYS. REV. X 8, 031013 (2018)

(a) (b)

þ ⃖ ⃗ FIG. 2. (a) The ε-machine for the process Ph ðX; XÞ, created by flipping a biased coin and emitting outcome 2 when H → T, 0 when þ þ T → T, and 1 when T=H → H. This process has two causal states s1 and s0 , where the latter includes all pasts ending in either 0 or 2. (b) The time-reversed process P−ðY;⃖Y⃗Þ. Here, pasts ending in 0, 1, and 2 now all lead to qualitatively different future behavior and must − − − − be stored in distinct causal states s0 , s1 , and s2 , respectively, which occur with respective probabilities π0 ¼ðq − pqÞ=ðp þ qÞ, − þ − π1 ¼ π1 ¼ p=ðp þ qÞ, and π2 ¼ pq=ðp þ qÞ. optimal classical causal models [13,14]. This involves two causal states, corresponding to the states of the coin. þ dividing the set of pasts into equivalence classes, such that The statistical complexity hðπ1 Þ thus represents the x⃖ x⃖0 þ two pasts, and , lie in the same class if and only if they entropy of the biased coin, where π1 ¼½p=ðp þ qÞ is have coinciding future behavior, i.e., PðX⃗jX⃖¼ x⃖Þ¼ the probability the coin is in heads and hðxÞ¼−x log x − ⃗ ⃖ 1 − x 1 − x Pþ PðXjX ¼ x⃖0Þ. Instead of recording the entire past, an ð Þ logð Þ is the binary entropy. Furthermore, 0 þ − ε-machine records only which equivalence class x⃖ lies is clearly symmetric under time reversal (i.e., P0 ¼ P0 ) within—inducing an encoding function ε∶χ⃖→ S from the and thus trivially causally symmetric. space of pasts χ⃖ onto the space of equivalence classes Suppose we postprocess the output of the perturbed coin, S ¼fsig, known as causal states. At each time step, the replacing the first 0 of each consecutive substring of 0s machine operates according to a collection of transition with a 2 (for example, …1000110100… becomes x …1200112120… probabilities Tij: the probability that an ε-machine initially in ). This results in a new stochastic process, þ ⃖ ⃗ þ si will transition to sj while emitting output x. The classical Ph ðX; XÞ, called the heralding coin Ph , which also has two þ þ statistical complexity thus coincides with the amount of causal states, s1 ¼fx⃖jx−1 ¼ 1g and s0 ¼fx⃖jx−1 ≠ 1g.In þ ⃖ ⃗ information needed to store the current causal state fact, one can model Ph ðX; XÞ by perturbing the same biased X coin in a box and modifying it to output 2—instead of 0— Cþ − π π ; μ ¼ i log i ð1Þ when it transitions from heads to tails (see Fig. 2). Thus, the i heralding coin also has classical statistical complex- þ þ where πi is the probability the past lies within si. Note that ε- ity Cμ ¼ hðπ1 Þ. machines are also optimal with respect to the max entropy Its retrodictive statistical complexity, however, is higher. D − ⃖ ⃗ [24], such that the topological state complexity μ of a The time-reversed process Ph ðY;YÞ represents an alter- process is the logarithm of the number of causal states [13]. native postprocessing of the perturbed coin—replacing the Despite their provable optimality, ε-machines still appear to last 0 in each consecutive substring of 0s with a 2. Now, waste memory. The amount of past information they demand 0 can be followed by 0 or 2, while 1 can be followed by typically exceeds the amount the past contains about the anything, and 2 can only be followed by 1, inducing three ⃖ ⃗ s− y⃖y j future—the mutual information E ¼ IðX; XÞ. Observing an causal states j ¼f j −1 ¼ g (see Fig. 2). This immedi- ε-machine’s entire future is insufficient for deducing its ately establishes a difference in the number of distinct initial state. Some of the information it stores in the present is configurations needed for causal versus retrocausal model- þ never reflected in future statistics and is thus effectively ing. Indeed, Ph fields causal asymmetry, erased during operation. In general, this waste differs ΔC C− − Cþ 1 − π− h γ ; between prediction and retrodiction, inducing nonzero causal μ ¼ μ μ ¼ð 1 Þ ð Þ ð2Þ asymmetry. − − − − − where γ ¼ π2 =ð1 − π1 Þ and πj ¼ Ph ðy⃖∈ sj Þ. To under- þ stand this asymmetry, note that when modeling Ph ,we C. Examples need only know if the previous output was 1 (i.e., current We illustrate this by examples, starting with the per- state of the coin) to decide whether a 0 should be replaced − turbed coin. Consider a box containing a single biased coin. by a 2. To model Ph , however, one cannot simply look At each time step, the box is perturbed, causing the coin to into the “future” to see if the system will output 1 next. flip with probability p if it is in heads (0) and q if it is in Causal asymmetry thus captures the overhead required to tails (1). The coin’s state is then emitted as output. This accommodate this restriction. þ describes a stochastic process P0 . As only the last output is In general, causal asymmetry can be unbounded. In þ necessary for generating correct future statistics, P0 has Appendix D, we describe the class of n-m flower processes,

031013-3 JAYNE THOMPSON et al. PHYS. REV. X 8, 031013 (2018)

þ − where Cμ scales as Oðlog nÞ while Cμ scales as Oðlog mÞ. (a) Note that n and m can be adjusted independently, allowing construction of processes where ΔCμ >K for any given constant K. Setting m ¼ 2, for example, can yield a process þ − where Cμ can be made arbitrarily high, while Cμ ≤ log 3. When this occurs, the memory overhead incurred for modeling the process in the less-natural direction scales towards infinity. (b)

D. Quantum models A quantum causal model is described formally by an ordered tuple Q ¼ðf; Ω; MÞ, where Ω is a set of quantum states; f∶χ⃖→ Ω defines how each past x⃖is encoded into a state fðx⃖Þ¼jsx⃖i of a physical system Ξ; and M is a P X;⃖ X⃗ þ ⃖ ⃗ quantum measurement process. To model ð Þ, FIG. 3. Quantum circuits for generating (a) Ph ðX; XÞ and repeated applications of M on Ξ must generate correct − ⃖ ⃗ (b) Ph ðY;YÞ.Here,CU (black circle and line) is the standard ðw mod 2Þ ¯ conditional future behavior. In other words, application of control gate CU∶jwijψi → jwiU jψi. Meanwhile, CU M Ξ s x ¯ on a system in state j x⃖i must (i) generate an output (white circle, black line) is defined as CUjwijψi¼ P X x X⃖ x⃖ Ξ ðwþ1 mod 2Þ þ ⃖ ⃗ with probability ð 0 ¼ j ¼ Þ and (ii) transition into j0iU jψi. (a) To simulate Ph ðX; XÞ, we initialize a qubit f x⃖0 s x⃖0 xx⃖ þ a new state ð Þ¼j x⃖0 i, where ¼ , such that in state jsi i and an ancilla in state j0i. Executing the local unitary L M x ; …;x V 0 → sþ C -repeated applications of will generate 0 L−1 pj i j 0 i, followed by the two-qubit gate Vq ,where ⃖ þ with correct probability PðX0∶LjX ¼ x⃖Þ for any desired VqVpj0i¼js1 i, creates a suitable entangled state—such that a L ∈ Zþ [25]. The entropy of a model Q is given by computation basis measurement of the top qubit yields xt and the von Neumann entropy SðρÞ¼−Trðρ log ρÞ, where simultaneously collapses the bottom qubit into the causal state for P − ⃖ ⃗ ρ P X⃖ x⃖ s s the next time step. (b) To simulate Ph ðY;YÞ, we prepare state ¼ ð ¼ Þj x⃖ih x⃖j. Thus, the quantum statistical − ¯ þ js ij0ij0i as input. Execution of CU , where Upj0i¼ complexity Cq of a process can be computed by minimiz- iffiffiffiffiffiffiffiffiffiffi ffiffiffi p p1−p 0 pp 1 C U U 0 S ρ j iþ j i, followed by Uq,where q,satisfies qj i¼ ing ð Þ over all valid models [26]. pffiffiffi pffiffiffiffiffiffiffiffiffiffiffi This optimization is highly nontrivial. There exists no qj0iþ 1 − qj1i,andfinally,CX,whereX is the Pauli X systematic techniques for constructing optimal quantum operator, generates a suitable entangled state—such that measuring the first two qubits yields yt (provided we identify measurement models or for proving the optimality of a given candidate 00 → y 0 10 → y 1 01 → y 2 þ outcome t ¼ , t ¼ ,and t ¼ ) and col- model. To date, Cq has only been evaluated for the Ising lapses the remaining qubit into the quantum causal state for the next chain [25]. However, this process is symmetric under time time step. In either circuit, retaining only the state of Ξ (green circle) reversal, implying that ΔCμ is trivially zero. Nevertheless, at each time step is sufficient for generating statistically correct recent advances show multiple settings where quantum predictions or retrodictions. models outperform optimal classical counterparts [27–31]. In fact, for every stochastic process where the optimal show that when forced to model such a process in the less- þ classical models are wasteful (i.e., Cμ >E), it is always natural direction, the quantum advantage always exceeds possible to design a simpler quantum model [27]. Indeed, the memory overhead ΔCμ. þ þ sometimes the quantum memory advantage Cμ − Cq can be unbounded [32]. Could quantum models mitigate the A. The heralding coin þ memory overhead induced by causal asymmetry? Let Ph denote the heralding coin process. Here, we first þ − state the optimal quantum models of Ph and Ph . We then III. RESULTS outline how their optimality is established, leaving details We study this question via two complementary of the formal proof to Appendix B. The optimal causal Qþ approaches. The first is a case study of the heralding model has two internal states: pffiffiffiffiffiffiffiffiffiffiffi ffiffiffiffi coin—the aforementioned process that exhibits causal þ p js0 i¼ 1 − pj0iþ pj1i; asymmetry. We pioneer methods to establish its provably ffiffiffi pffiffiffiffiffiffiffiffiffiffiffi þ p optimal quantum causal and retrocausal models and thus js1 i¼ qj2iþ 1 − qj1i; ð3Þ produce a precise picture of how quantum mechanics ϵþ x⃖ sþ mitigates all present causal asymmetry. The second case with associated encoding function q ð Þ¼j i i if and only x⃖∈ sþ ϵþ x⃖ studies quantum modeling of arbitrary processes with if i . Given a qubit in state q ð Þ, Fig. 3 establishes the þ − causal asymmetry. Here, Cq and Cq cannot be directly sequential procedure that replicates expected future behav- þ ⃗ ⃖ evaluated but can nevertheless be bounded. In doing so, we ior, i.e., samples Ph ðXjX ¼ x⃖Þ.

031013-4 CAUSAL ASYMMETRY IN A QUANTUM WORLD PHYS. REV. X 8, 031013 (2018)

Q− ψ − ψ − ≤ s− s− Meanwhile, the optimal quantum retrocausal model constraints jh j j k ij Pjh j j k ij (see Lemma 2 of ϵ− y⃖ s− y⃖∈ s− σ π− ψ − ψ − has encoding function q ð Þ¼j i i if and only if i , Appendix A). LetP ¼ k j k ih k j, with eigenvalues − − − − − where λk, and ρ ¼ πk jsk ihsk j, with eigenvalues λk .In Lemma 4 of Appendix B, we prove that for all choices s− 0 ; − − j 0 i¼j i of jψ k i satisfying the fidelity constraint, λk majorizes λk. ffiffiffi pffiffiffiffiffiffiffiffiffiffiffi − − p Thus, ρ has minimal entropy among all valid retrocausal js1 i¼ qj0iþ 1 − qj1i; quantum models. − − js2 i¼j1i: ð4Þ Note that Qþ and Q exhibit different encoding func- tions (one maps onto two code words, the other onto three) y⃗ The associated procedure for sequential generation of as and invoke seemingly unrelated quantum circuits for − ⃗ ⃖ governed by Ph ðYjY ¼ y⃖Þ is outlined in Fig. 3. generating future statistics (see Fig. 3). Nevertheless, direct To establish optimality, we first invoke the causal-state computation yields correspondence: For any stochastic process with causal  pffiffiffi 1 þ c states fsig that occur with probability πi, there exists an Cþ C− h ; q ¼ q ¼ 2 ð5Þ optimal model Q ¼ðϵq; Ω; MÞ, where the elements of Ω s are in one-to-one correspondence with f ig (see Lemma 1 where c¼(p2ð1þ4ð1−qÞqÞ−2pqþq2)=ðpþqÞ2 and hð·Þ of Appendix A). Since the heralding coin process has two is the binary entropy. Thus, ΔCq ¼ 0 for all values of forward causal states, we can restrict our computation of p and q. This establishes our first result: Cþ Ω ψ þ ; ψ þ q to quantum models where ¼fj 0 i j 1 ig. Result 1. There exists stochastic processes that are þ − Moreover, we can showp thatffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi the data-processing inequality causally asymmetric (Cμ ≠ Cμ ) but exhibit no such asym- þ þ þ − implies jhψ 0 jψ 1 ij ≤ pð1 − qÞ ≡ F (see Lemma 2 of metry when modeled quantum mechanically (Cq ¼ Cq ). þ þ Appendix A). The monotonicity between jhψ 0 jψ 1 ij and This vanishing of causal asymmetry at the quantum level the entropy of the resulting model, together with the is not simply the result of saturating the bound given by E. þ þ þ − þ − observation that jhs0 js1 ij ¼ F, then implies optimality Figure 4 shows that EE—see Fig. 4(i)]. Ω ψ − ¼fj k igk¼0.::2 has three elements. The data processing Our results persist when considering minimal dimensions, inequality can then be used to establish the fidelity rather than minimal entropy required for causal modeling.

(a) (b) (c) (d) (e)

(f) (g) (h) (i) (j)

þ − þ − FIG. 4. Complexity of the heralding coin plotted against p and q. The figure illustrates E ≤ Cq ¼ Cq ≤ Cμ ≤ Cμ across all values of the parameter space (0 ≤ p, q ≤ 1). Panel (d) depicts the classical causal asymmetry ΔCμ, and panel (f) effectively demonstrates þ − Cq ¼ Cq and thus ΔCq ¼ 0.

031013-5 JAYNE THOMPSON et al. PHYS. REV. X 8, 031013 (2018)

þ Here, Ph requires only two causal states and thus can be Result 4. The quantum topological complexity Dq can þ modeled using a two-level system (Dμ ¼ log 2). However, be strictly less than the classical topological complexity Dμ. − Ph has three causal states. Modeling it thus requires a three- This solves an open question in quantum modeling— − level system (Dμ ¼ log 3). In contrast, the three quantum whether quantum mechanics allows for models that sim- − causal states of Ph can be embedded within a single qubit; ulate stochastic processes using not only reduced memory thus, the dynamics of the heralding coin can be modeled but also reduced dimensions. using a single qubit in either temporal direction. Therefore, These results have a particular impact when ΔCμ is this vanishing of causal asymmetry also applies in single- exceedingly large. Recall that in the case of the n − 2 min þ shot settings. flower process, Cμ ≤ log 3, while Cμ scales as Oðlog nÞ. min Our theorem then implies that Cq ≤ Cμ ≤ log 3. Thus, B. General processes we immediately identify a class of processes whose optimal classical models require a memory that scales as Oðlog nÞ We now study quantum mitigation of causal asymmetry and yet can be modeled quantum mechanically using a Cþ C− for general stochastic processes by bounding q and q single qutrit. min þ − from above. Let Cμ ¼ minðCμ ;Cμ Þ represent the mini- mum amount of information we need to classically model IV. FUTURE DIRECTIONS PðX;⃖ X⃗Þ when allowed to optimize over the temporal max þ − There are a number of potential relations between causal direction. Meanwhile, let Cq ¼ maxðCq ;Cq Þ be the minimal memory a quantum system needs when forced asymmetry and innovations on the and to model the process in the least favorable temporal retrodictive quantum theory. In this section, we survey direction. In Appendix C, we establish the following: some of these connections and highlight promising future Result 2. For any stochastic process P, research directions.

þ − þ − A. Retrodictive quantum mechanics maxðCq ;Cq Þ ≤ minðCμ ;Cμ Þ: ð6Þ Consider the evolution of an open quantum system that is þ − Equality occurs only if Cμ ¼ Cμ ¼ E, such that P is monitored continuously in time. Standard quantum trajec- causally symmetric. tory theory describes how the system’s internal state ρðtÞ Consider any causally asymmetric process P, such that evolves, encapsulating how our expectations of future modeling it in the less-favorable temporal direction incurs measurement outcomes are updated based on past obser- memory overhead ΔCμ. Result 2 implies that this overhead vations. Retrodictive quantum mechanics introduces the E t — can be entirely mitigated by quantum models. There exists effect matrix ð Þ a time-reversed analogue of the density ρ t – E t a quantum model that is not only provably simpler than its matrix ð Þ [35 37]. Note that ð Þ propagates backwards optimal classical counterpart but is also simpler than any through time, representing how our expectations of the past classical model of the time-reversed process P−. In Lemma change as we scan future measurement outcomes in time- ρ t 7 (see Appendix C), we show that such models can be reversed order. The original motivation was that ð Þ and E t systematically constructed and aligned with the simplest, ð Þ combined yield a more accurate estimate of the t ρ t currently known, quantum models—q-machines [33,34]. measurement statistics at time than ð Þ alone, allowing þ improved smoothing procedures [38–41]. As a corollary, causal asymmetry guarantees both Cq < þ − − While this framework and causal asymmetry differ in Cμ and Cq

031013-6 CAUSAL ASYMMETRY IN A QUANTUM WORLD PHYS. REV. X 8, 031013 (2018) generalizations of classical and quantum computational V. DISCUSSION mechanics to continuum time [43,44] and input-dependent Causal asymmetry captures the memory overhead regimes [21,45,46]. More generally, such developments incurred when modeling a stochastic process in one will enable a formal study of causal asymmetry in the temporal order versus the other. This induces a privileged quantum-trajectory formulation of open quantum systems. temporal direction when one seeks the simplest causal explanation. Here, we demonstrate a process where this B. Arrow of time in quantum measurement overhead is nonzero when using classical models and yet Related to such open systems are recent proposals for vanishes when quantum models are allowed. For arbitrary inferring an arrow of time from continuous measurement processes exhibiting causal asymmetry, we prove that [47]. These proposals consider continuously monitoring a quantum models forced to operate in a given temporal ρ quantum system initialized in state i, resulting in a order always require less memory than classical counter- r t P r t ρ measurement record ð Þ with some probability ½ ð Þj i. parts, even when the latter are permitted to operate in either Concurrently, the state of the system evolves through a temporal direction. The former result represents a concrete ρ t quantum trajectory ð Þ into some final configuration case where causal asymmetry vanishes in the quantum ρ T ρ ð Þ¼ f. The goal is to identify an alternative sequence regime. The latter implies that the more causally asym- of measurements, such that for at least one possible outcome metric a process, the greater the resource advantage of r0 t P r0 t ρ record ð Þ occurring with nonzero probability ½ ð Þj f, modeling it quantum mechanically. the trajectory rewinds. In other words, a system initially in Our results also hold when memory is quantified by max ρ ρ state f will evolve into i, passing through all intermediary entropy. They thus establish that quantum mechanics can states in time-reversed order. An arrow of time emerges as reduce the dimensionality needed to simulate a process 0 P½rðtÞjρi and P½r ðtÞjρf generally differ, such that one of beyond classical limits. Indeed, our results isolate families the two directions occurs with greater probability. An of processes whose statistical complexity grows without argument via Bayes’ theorem then assigns different prob- bound but can nevertheless be modeled exactly by a abilistic likelihoods towards whether ρðtÞ occurred in quantum system of bounded dimension. These features forward or reverse time. make such processes ideal for demonstrating the practical This framework provides a complementary perspective benefits of quantum models—allowing us to verify an to our results. It aims to reverse the trajectory of the arbitrarily large quantum advantage in single-shot regimes ’ ρ t system s internal state ð Þ, placing no constraints on the [24,48] and avoiding the need to measure von Neumann r t relation between the measurement statistics governing ð Þ entropy as in current state-of-the-art experiments [29]. r0 t and ð Þ. In contrast, causal asymmetry deals with revers- One compelling open question is the potential thermo- ing the observed measurement statistics (as described by dynamic consequences of causal asymmetry. In computa- some stochastic process P) while placing no restrictions on þ tional mechanics, Cμ has thermodynamical relevance in the the internal dynamics of the causal and retrocausal models contexts of prediction and pattern manipulation [19–22,49]. (the two models may even field different Hilbert space For instance, the minimum heat one must dissipate to dimensions, such as in the heralding coin example). generate future predictions based on only past observations We also observe some striking parallels. Both works start þ þ is given by W ¼ kBTðCμ − EÞ, where kB is Boltzmann’s out with some sequential data but no knowledge about diss constant, T is the environmental temperature, and the whether the sequence occurred in forward or reverse time. E Both ask the following question: Is there some sort of excess entropy is symmetric with respect to time reversal. asymmetry singling out one temporal direction over the Therefore, nonzero causal asymmetry implies that flipping other? In the emerging arrow of time from quantum the temporal order in which we ascribe predictions incurs ΔW k TΔC measurement, we are given a trajectory ρðtÞ, and asym- an energetic overhead of diss ¼ B μ. In processes ΔC metry arises from the difficulty (in terms of success where μ scales without bound, this cost may become min probability) of realizing this trajectory in forward versus prohibitive. Could our observation that ΔCq ≤ Cμ imply reverse time. Meanwhile, in causal asymmetry, we are that such energetic penalties become strongly mitigated given the observed measurement statistics, and an arrow of when quantum simulators are taken into account? time arises from the difference in resource costs needed to A second direction is to isolate what properties of realize these statistics causally in forward versus reverse quantum processing enable it to mitigate causal asymmetry. time. It would then be interesting to see if a similar In Appendix C, we establish that all deterministic processes argument via Bayes’ theorem can be adapted to causal are causally symmetric, such that Cμ ¼ Cq ¼ E (see asymmetry. If we suppose that more complex machines are Lemma 6 of Appendix C). Randomness is therefore less likely to exist in nature (e.g., due to dimensional or essential for causal asymmetry. Observe also that the entropic constraints), could we then argue whether a given provably optimal quantum causal and retrocausal models stochastic process is more likely to occur in one causal for the heralding coin both operated unitarily—such that direction versus the other? their dynamics are entirely deterministic (modulo

031013-7 JAYNE THOMPSON et al. PHYS. REV. X 8, 031013 (2018) measurement of outputs). Indeed, such unitary quantum Condition (i) guarantees that if a quantum model is models can always be constructed [34], and we conjecture initialized in state fðx⃖Þ, then the model’s future output that this unitarity implies causal symmetry. However, it X0 ¼ x will be statistically indistinguishable from the remains an open question as to whether the optimal output of the process itself. Condition (ii) ensures that quantum model is always unitary. the internal memory of the quantum model is updated to Insights here will ultimately help answer the big out- record the event X0 ¼ x, allowing the model to stay standing question of whether the quantum statistical com- synchronized with the sequence of outputs it has generated plexity ever displays asymmetry under time reversal. so far. Thus, a series of L repeated applications of M acting Identifying any process for which such asymmetry persists on Ξ generates output x0∶L ¼ x0…xL−1 with probability implies that Occam’s preference for minimal cause can ⃖ PðX0∶L ¼ x0∶LjX ¼ x⃖Þ and simultaneously transitions Ξ privilege a temporal direction in a fully quantum world. into the state fðxx⃖ 0∶LÞ. In the limit L → ∞, the model Proof that no such process exists would be equally exciting, produces a sequence of outputs x⃗¼ x0x1… with proba- indicating that causal asymmetry is a consequence of P X⃗X⃖ x⃖ enforcing all causal explanations to be classical in a bility ð j ¼ Þ. Q fundamentally quantum world. The entropy of a quantum model is given by

CqðQÞ¼SðρÞ¼−Trðρ log ρÞ; ðA1Þ ACKNOWLEDGMENTS P S ρ π ρ The authors appreciated the feedback and input received where ð·Þ is the von Neumann entropy, ¼ x⃖ x⃖ x⃖ for ⃖ from: Yang Chengran, Suen Whei Yeap, Liu Qing, Alec ρx⃖¼jsx⃖ihsx⃖j, and πx⃖¼ PðX ¼ x⃖Þ. Boyd, Varun Narasimhachar, Felix Binder, Thomas Elliott, Definition 2 Q is an optimal quantum model for a Howard Wiseman, Geoff Pryde, Nora Tischler, Farzad process PðX;⃖ X⃗Þ if, given any other model Q0,we 0 Ghafari, Chiara Marletto. This work was supported by have CqðQ Þ ≥ CqðQÞ. the National Research Foundation of Singapore and, in Consider a stationary stochastic process PðX;⃖ X⃗Þ, such particular, NRF Awards No. NRF-NRFF2016-02, P X P X L ∈ Zþ t ∈ Z that ð 0∶LÞ¼ ð t∶tþLÞ for any , . Let No. NRF-CRP14-2014-02, and No. RF2017-NRF- ⃖ ⃗ PðX; XÞ have causal states S ¼fsig each occurring with ANR004 VanQuTe, the John Templeton Foundation stationary probability πi. Define the conditional distribution Grants No. 52095 and No. 54914, Foundational ⃗ ⃗ ⃖ PiðXÞ¼PðXjX ¼ x⃖∈ siÞ as the future morph of causal Questions Institute Grant No. FQXi-RFP-1609, and s Physics of the Observer Grant (Observer-Dependent state i. We make use of the following two results derived in Ref. [25]. Complexity: The Quantum-Classical Divergence over ⃖ ⃗ ‘ ’ Lemma 1 (Causal-state correspondence). Let PðX; XÞ What Is Complex? ) No. FQXi-RFP-1614, the Oxford s Martin School, the Singapore Ministry of Education Tier 1 be a stochastic process with causal states f ig. There exists Q ϵ ; Ω; M Ω s RG190/17, and the U.S. Army Research Laboratory and an optimal model ¼ð q Þ, where ¼fj iig and ϵ x⃖ s x⃖∈ s the U.S. Army Research Office under Contracts qð Þ¼j ii if and only if i. No. W911NF-13-1-0390, No. W911NF-13-1-0340, and This implies that we can limit our search for optimal No. W911NF-18-1-0028. Much of the collaborative was models Q ¼ðf; Ω; MÞ to those whose internal states Ω ¼ also made possible by the “Interdisciplinary Frontiers of fjψ iig are in one-to-one correspondence with the classical Quantum and Complexity Science” workshop held in causal states. In addition, it can be shown that Ω must Singapore, funded by the John Templeton Foundation, satisfy the following constraint: ⃖ ⃗ the Centre for Quantum Technologies, and the Lee Lemma 2 (Maximum fidelity constraint). Let PðX; XÞ Foundation of Singapore. be a stochastic process with causal states fsig, and let Q ¼ðf;Ω; MÞ be a valid quantum model satisfying APPENDIX A: TECHNICAL DEFINITIONS fðx⃖Þ¼jψ ii iff x⃖∈ si. Then, jhψ ijψ jij ≤ Fij, where Fij ¼ P 1 P x⃗P x⃗ 2 We first introduce further technical notation and back- x⃗½ ið Þ jð Þ is the fidelity between the future morphs s s ground that will be used for subsequent proofs. of i and j. Definition 1 (Quantum causal model) Consider an These definitions assume that all elements of Ω are pure. ordered tuple Q ¼ðf; Ω; MÞ, where Ω is a set of quantum This is because computational mechanics considers only states; f∶χ⃖→ Ω is an encoding function that maps each x⃖ causal models—models whose internal states do not store onto a state fðx⃖Þ¼jsx⃖i of a physical system Ξ; and M is a more information about the future than what is available Q P X;⃖ X⃗ from the past. Specifically, let R be a random variable quantum process. is a quantum model for ð Þ if and ⃗ ⃖ only if, for any x⃖∈ χ⃖, whenever Ξ is prepared in fðx⃖Þ, governing the state of a model at t ¼ 0. Then, IðR; XjXÞ is subsequent application of M (i) generates an output x with known as the oracular information, and it represents the ⃖ R X⃗ probability PðX0 ¼ xjX ¼ x⃖Þ and (ii) transitions Ξ into a amount of extra information contains about the future 0 0 ⃖ new state fðx⃖Þ¼jsx⃖0 i, where x⃖¼ xx⃖ [25]. that is not contained in the past X. For causal models,

031013-8 CAUSAL ASYMMETRY IN A QUANTUM WORLD PHYS. REV. X 8, 031013 (2018)

⃗ ⃖ IðR; XjXÞ¼0 [8]. In Appendix E, we show that this allows jψ 0i¼j0i; pffiffiffiffiffiffiffiffiffiffiffiffi us to assume that all elements of Ω are pure without loss of ψ r θeiω 0 1 − r2eiα 1 r θ 2 ; generality. j 1i¼ sin j iþ j iþ cos j i jψ 2i¼j1i; ðB4Þ APPENDIX B: PROOFS OF OPTIMALITY θ ∈ 0; π=2 0 ≤ r ≤ 1 α; ω ∈ 0; 2π Here, we formally prove that the quantum models for the for some ½ ffiffiffiffiffiffiffiffiffiffiffiffi, ffiffiffiffiffiffiffiffiffiffiffi, ½ , such that pffiffiffi p 2 p heralding coin given in Eqs. (3) and (4) are optimal. r sin θ ≤ q and 1 −Pr ≤ 1 − q. − − − 1 — F P y⃗P y⃗ 2 Proof. Set ij ¼ y⃗½ i ð Þ j ð Þ . Explicitffiffiffiffiffiffiffiffiffiffiffi evalu- − pffiffiffi − − p 1. Optimality of the causal model ation yields F01 ¼ q, F02 ¼ 0, F12 ¼ 1 − q. By the þ − Let Ph denote the heralding coin process, with the maximum fidelity constraint, jhψ ijψ jij ≤ Fij. Thus, corresponding ε-machine depicted in Fig. 2(a). hψ 0jψ 2i¼0. Therefore, jψ 0i¼j0i and jψ 2i¼j1i up to þ þ þ þ Theorem 1. Consider Q ¼ðεq ; Ω ; M Þ, where a unitary rotation. We can then write jψ 1i in the form above þ þ þ εq ðx⃖Þ¼jsi i if and only if x⃖∈ si , with without loss of generality, as the coefficient of j2i can be made real and positive by choosing a suitable definition of pffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffi sþ 1 − p 0 p 1 ; basis element 2 . Meanwhile, constraints on ψ 1 ψ 2 ≤ j 0 i¼ j iþ j i pffiffiffiffiffiffiffiffiffiffiffi j i ffiffiffi jh jffiffiffi ij ffiffiffi pffiffiffiffiffiffiffiffiffiffiffi 1 − q ψ ψ ≤ pq r θ ≤ pq þ p ffiffiffiffiffiffiffiffiffiffiffiffi and jh 0j 1ij imply sin and js1 i¼ qj2iþ 1 − qj1i; ðB1Þ p pffiffiffiffiffiffiffiffiffiffiffi 1 − r2 ≤ 1 − q. □ Ωþ sþ ; sþ Mþ Our models described in Eq. (4)ffiffiffiffiffiffiffiffiffiffiffiffican be obtainedffiffiffiffiffiffiffiffiffiffiffi by ¼fj 0 i j 1 ig, and described by the quantum iω pffiffiffi p 2 iα p Qþ setting r sin θe ¼ q and 1 − r e ¼ 1 − q in circuit in Fig. 3(a). Here, is an optimal quantum model − þ Eq. (B4) (i.e., this corresponds to choosing jψ 0i¼js i, for Ph . 0 ψ s− ψ s− Proof.—We prove this by contradiction. Assume there j 1i¼j 1 i, and j 2i¼j 2 i). The subsequent lemma then þ exists some Q ¼ðf; Ω; MÞ such that CqðQÞ sþ sþ In other words, Q , as described by Eq. (4), is the lowest pqffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffið Þ qð Þ implies that jh 0j 1ij jh 0 j 1 ij ¼ p 1 − q entropy (optimal) model that satisfies the causal-state ð Þ. Meanwhile, Lemma 2 implies correspondence. X pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi − − 1 Proof.—By definition, CqðQ Þ¼Sðρ Þ for þ þ P ψ 0 ψ 1 ≤ P x⃗P x⃗ 2 p 1 − q : B2 − − − − − − − jh j ij ½ 0 ð Þ 1 ð Þ ¼ ð Þ ð Þ ρ ¼ iπi jsi ihsi j, where πi ¼ Ph ðy⃗∈ si Þ and the x⃗ − states jsi i are given in Eq. (4). We label the eigenvalues λ− λ− λ− Q □ of this state from largest to smallest by 0 , 1 , 2 . This is a contradiction. Thus, no such exists. ψ Meanwhile, by the above lemma, CqðQÞ¼Sðρ 1 Þ, where 2. Optimality of the retrocausal model ψ1 − − − − ρ ¼ π0 j0ih0jþπ2 j1ih1jþπ1 jψ 1ihψ 1j; Let Ph denote the time reversal of the heralding coin process, with the corresponding ε-machine in Fig. 2(b). − − − − and jψ 1i is described by Eq. (B4). We label the eigenvalues Q ε ; Ω ; M ψ ψ ψ Theorem 2. Define ¼ð q Þ, where ψ1 1 1 1 − − − of ρ from largest to smallest by λ0 , λ1 , λ2 . To establish εq ðy⃖Þ¼jsi i if and only if y⃖∈ si , with − − ψ that CqðQÞ ≥ CqðQ Þ, it is sufficient to show λ ≻λ 1, − where ≻ denotes majorization [51]. This is established by js i¼j0i; ψ ψ ψ 0 λ− ≥ λ 1 λ− λ− ≥ λ 1 λ 1 pffiffiffi pffiffiffiffiffiffiffiffiffiffiffi proving that (1) 0 0 and (2) 0 þ 1 0 þ 1 . s− q 0 1 − q 1 ; − ψ 1 j 1 i¼ j iþ j i We begin by establishing λ0 ≥ λ0 . By the minimax ψ − principle [52], the largest eigenvalue for ρ 1 is js2 i¼j1i; ðB3Þ ψ 1 ψ 1 Ω− s− M− λ0 ¼ max hxjρ jxi: ðB6Þ ¼fj i ig, and the measurement process given in x x 2 1 − − jh j ij ¼ Fig. 3(b). Here, Q is an optimal quantum model for Ph . Below, we break down the proof of this theorem into a Suppose that this maximum is attained for some jxi¼ series of small steps. Each step is phrased as a lemma. jxðt; ϕ; κ; ηÞi such that Lemma 3. Let Q ¼ðf; Ω; MÞ be a quantum model for − − pffiffiffiffiffiffiffiffiffiffiffiffi Ph satisfying fðy⃗Þ¼jψ ii iff y⃗∈ s . Then, up to a unitary i x t ϕeiη 0 eiκ 1 − t2 1 t ϕ 2 ; rotation, j i¼ sin j iþ j iþ cos j i ðB7Þ

031013-9 JAYNE THOMPSON et al. PHYS. REV. X 8, 031013 (2018) where ϕ ∈ ½0; π=2, 0 ≤ t ≤ 1, and η, κ ∈ ½0; 2π. where we have used the fact that 0 ≤ ϕ ≤ β ≤ π=2 We can assume the coefficient of j2i is real and positive and jβ − θ − dθj ≤ jϕ − θj ≤ π=2. Specifically, these two because Eq. (B6) remains unchanged when jxi → eiψ jxi. conditions imply sin β ≥ sin ϕ and cos ðβ − θ − dθÞ ≥ ψ0 ψ Substituting Eq. (B7) into Eq. (B6) yields cos ðϕ − θÞ ≥ 0. Thus, we have λ 1 ≥ λ 1 . ψ 0 0 0 To show λ− ≥ λ 1, we define y to be the state satisfying ψ − 2 − 2 − i ω−η 0 0 j i λ 1 π x 0 π x 1 π rt θ ϕe ð Þ ψ0 0 ¼ 0 jh j ij þ 2 jh j ij þ 1 j sin sin 1 ψ 0 pffiffiffiffiffiffiffiffiffiffiffiffipffiffiffiffiffiffiffiffiffiffiffiffi λ0 ¼hyjρ 1 jyi. In general, we can parametrize 2 þeiðα−κÞ 1 − r2 1 − t2 þ rt cos θ cos ϕj : ðB8Þ pffiffiffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffiffiffi jyi¼weiaj0iþ 1 − w2 sin ξeibj1iþ 1 − w2 cos ξj2i We defined jxðt; ϕ; κ; ηÞi to be the vector that maximizes Eq. (B8); thus, we have implicitly optimized over κ and η in for 0 ≤ w ≤ 1, ξ ∈ ½0; π=2, and a, b ∈ ½0; 2π. Using the Eq. (B8). This optimization will automatically set eiðα−κÞ ¼ same argument as in Eq. (B9), we can show a ¼ b ¼ 0 and iðω−ηÞ e ¼ 1 [since any two complex numbers c1, c2 ∈ C thus c c 2 ≤ c c 2 satisfy j 1 þ 2j ðj 1jþj 2jÞ ]. Using this and trigo- 0 λψ1 π−w2 π− 1 − w2 2ξ nometry identities to simplify Eq. (B8), we obtain 0 ¼ 0 þ 2 ð Þsin pffiffiffiffiffiffiffiffiffiffiffipffiffiffiffiffiffiffiffiffiffiffiffiffi − 2 pffiffiffi 2 ψ þ π j 1 − q 1 − w cos ðχ − ξÞþ qwj : λ 1 ¼ π−t2 sin2 ϕ þ π−ð1 − t2Þ 1 0 0 2 pffiffiffiffiffiffiffiffiffiffiffiffipffiffiffiffiffiffiffiffiffiffiffiffi 2 ðB13Þ π− rt ϕ − θ 1 − r2 1 − t2 : þ 1 j cos ð Þþ j ðB9Þ pffiffiffiffiffiffiffiffiffiffiffiffiffi y0 w 0 1 − w2 1 0 Define j i¼ j iþ j i. By mirroring the ψ 1 We now show that there always exists some λ0 such that analysis in Eq. (B12), we find 0 − ψ1 ψ1 λ0 ≥ λ0 ≥ λ0 . The maximum fidelity constraint implies ffiffiffi − 0 − 0 r θ ≤ pq dθ λ0 ≥ hy jρ jy i sinð Þ . Thus,ffiffiffi there exists some such that r θ dθ pq − 2 − 2 sinð þ Þ¼ (in particular, we choose the solution ¼ π0 w þ π2 ð1 − w Þ 0 < θ dθ ≤ π=2 pffiffiffiffiffiffiffiffiffiffiffipffiffiffiffiffiffiffiffiffiffiffiffiffi of this equation where þ ). Consider − 2 pffiffiffi 2 þ π1 j 1 − q 1 − w þ qwj ψ0 ψ 0 ψ0 λ 1 x ρ 1 x ; 1 0 ¼ max h j j i ≥ λ0 : ðB14Þ jhxjxij2¼1 ψ0 − − − 1 0 0 ψ 0 ψ ρ ¼ π0 j0ih0jþπ2 j1ih1jþπ1 jψ 1ihψ 1j; − 1 1 Together, the above results imply λ0 ≥ λ0 ≥ λ0 , establish- − ψ1 ing step (1), λ0 ≥ λ0 . where − − ψ 1 ψ 1 For step (2), we must show λ0 þ λ1 ≥ λ0 þ λ1 . pffiffiffiffiffiffiffiffiffiffiffiffi However, by construction, ρ− only spans a two-dimen- ψ 0 r θ dθ 0 1 − r2 1 r θ dθ 2 − − j 1i¼ sin ð þ Þj iþ j iþ cos ð þ Þj i sional Hilbert space; thus, we have λ0 þ λ1 ¼ 1. It follows ffiffiffi pffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffi − − ψ 1 ψ 1 p that λ0 þ λ1 ≥ λ0 þ λ1 . Together, these results imply ¼ qj0iþsin χ 1 − qj1iþcos χ 1 − qj2i; − ψ − λ ≻λ 1 and therefore CqðQ Þ ≤ CqðQÞ. □ − ðB10Þ By Lemma 1, Ph has an optimal quantum model that satisfies the causal-state correspondence. Meanwhile, by pffiffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffi for sin χ ¼ 1 − r2= 1 − q. Furthermore, let jx0i¼ Lemma 4, any Q satisfying the causal-state correspondence − − jxðt; β; 0; 0Þi be defined as must have CqðQÞ ≥ CqðQ Þ. It follows that Q is an − pffiffiffiffiffiffiffiffiffiffiffiffi optimal quantum model for Ph . jx0i¼t sin βj0iþ 1 − t2j1iþt cos βj2iðB11Þ APPENDIX C: PROOF OF RESULT 2 β π=2; ϕ dθ for ¼ minð þ Þ. Then, we have Here, we prove Result 2. To do this, we require some

ψ0 preliminary lemmas. The first connects the capacity for 1 0 ψ 0 0 λ0 ≥ hx jρ 1 jx i quantum models to improve upon their optimal classical − 2 2 − 2 counterparts with causal asymmetry. ¼ π0 t sin β þ π2 ð1 − t Þ pffiffiffiffiffiffiffiffiffiffiffiffipffiffiffiffiffiffiffiffiffiffiffiffi Lemma 5. If the classical and quantum statistical com- 2 − 2 2 P Cþ Cþ P þ π1 jrt cos ðβ − θ − dθÞþ 1 − r 1 − t j plexities of a process coincide, such that q ¼ μ , then þ − − 2 2 − 2 is causally symmetric and Cμ ¼ Cμ ¼ E. ≥ π0 t sin ϕ þ π2 ð1 − t Þ pffiffiffiffiffiffiffiffiffiffiffiffipffiffiffiffiffiffiffiffiffiffiffiffi Proof.—We first make use of the prior results, showing − 2 2 2 þ π1 jrt cos ðϕ − θÞþ 1 − r 1 − t j that whenever classical models waste information, more þ Cμ >E ψ 1 efficient quantum models exist [27]. Specifically, λ ; B12 þ þ þ þ ¼ 0 ð Þ if and only if Cq

031013-10 CAUSAL ASYMMETRY IN A QUANTUM WORLD PHYS. REV. X 8, 031013 (2018)

þ þ ⃖ ⃗ Cμ ¼ E. It is therefore sufficient to show that Cμ ¼ E Lemma 7. Let PðX; XÞ be a stationary stochastic − − ⃖ ⃗ implies Cμ ¼ E. process and P ðY;YÞ its time reversal, with q-machine þ ¯ þ ¯ − We prove this by contradiction. Assume Cμ ¼ E but complexity Cq and Cq , and q-machine state complexity − þ ⃗ ¯ þ ¯ − ¯ þ ¯ − ¯ þ ¯ − Cμ >E.Now,Cμ ¼ E implies HðS−1jXÞ¼0, where S−1 Dq and Dq , respectively. Then, Cq ¼ Cq and Dq ¼ Dq . is the random variable governing the causal state at t ¼ −1 Proof.—We first introduce some compact notation. Let x⃗ s ⃖ ⃗ ⃗ ⃖ [14]. Thus, given , we can find a unique i such that PðX ¼ x;⃖X ¼ x⃗Þ¼P↔ and, similarly, PðX ¼ x⃗jX ¼ x⃖Þ¼ ⃖ ⃗ x PðX ¼ x⃖jX ¼ x⃗Þ is only nonzero when x⃖∈ si. It follows ⃗ ⃖ Px⃗x⃖, PðX ¼ x⃗jS−1 ¼ siÞ¼Piðx⃗Þ,aswellasPðX ¼ x⃖Þ¼ τ x⃗P X⃗ x⃗ ≠ 0 j that the sets i ¼f j ið ¼ Þ g form a partitioning on Px⃖and Pðx⃖∈ siÞ¼πi. τ ∩ τ ∅ i ≠ j þ the space of all futures (i.e., i j ¼ for ). Now let jSi i denote the internal states of the q-machine for 0 ⃗ ⃖ P Furthermore, any two x⃖, x⃖∈ si satisfy PðXjX ¼ x⃖Þ¼ ⃖ ⃗ þ þ þ ¯ þ þ PðX; XÞ,suchthatρ ¼ iπijSi ihSi j and Cq ¼ Sðρ Þ. P X⃗X⃖ x⃖0 s ’ ð j ¼ Þ, by definition of i. Thus, Bayes theorem From existing work [28,33],weknowthat implies that the τi partition the future into equivalence S l S l S k S k ωþ l Pliml→∞h ið Þj jð Þi¼h ið Þj jð Þi. Thus, let ð Þ¼ x⃗∼ x⃗0 P X⃖X⃗ x⃗ P X⃖X⃗ x⃗0 þ þ þ þ classes if and only if ð j ¼ Þ¼ ð j ¼ Þ iπijSi ðlÞihSi ðlÞj and ω ¼ liml→∞ω ðlÞ such that [53]. Hence, τi constitute the retrocausal states. Bayes’ ¯ þ þ f g Cq ¼ Sðω Þ.Then, ⃖ ⃗ theorem also yields PðX ¼ x⃖jX ¼ x⃗∈ τiÞ ≠ 0 only when − ⃖ − X x⃖∈ si. This implies HðS−1jX ¼ x⃖Þ¼0, where S−1 governs þ ω ¼ lim πijSiðlÞihSiðlÞj; − l→∞ the retrocausal state at time t ¼ −1. Hence, Cμ ¼ E, which i is a contradiction. □ X X pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi π P x⃗P x⃗0 x⃗ x⃗0 ; It follows as a direct corollary of this result that causal ¼ i ið Þ ið Þ j ih j i x;⃗x⃗0 asymmetry vanishes for deterministic processes [i.e., proc- X X X qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ⃗ ⃖ esses where HðXjXÞ¼0]. P P P x⃗ x⃗0 ; ⃖ ⃗ ¼ x⃖ x⃗jx⃖ x⃗0jx⃖j ih j P X; X 0 Lemma 6. Any deterministic process ð Þ has i x⃖∈si x;⃗x⃗ ΔCμ ¼ 0. X qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 0 þ P P P 0 0 P 0 δ 0 x⃗ x⃗ : Proof.—Any deterministic process has E ¼ Cμ [14,54]. ¼ x⃗jx⃖ x⃖ x⃗jx⃖ x⃖ x;⃖x⃖j ih j ðC3Þ þ þ þ þ x;⃗x⃗0;x;⃖x⃖0 Since E ≤ Cq ≤ Cμ , it follows that E ¼ Cμ ¼ Cq ; thus, according to the above lemma, ΔCμ ¼ 0. □ Our next lemma makes use of q-machines [33], the Furthermore, the forward q-machine complexity is given by ¯ þ þ ¯ − simplest currently known quantum models. Consider a Cq ¼ Sðω Þ. A similar argument shows that Cq is given by ⃖ ⃗ ¯ − − process P ¼ PðX; XÞ whose classical ε-machine has a Cq ¼ Sðω Þ, where S s collection of causal states ¼f ig and transition proba- qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi x ⃖ ⃗ X bilities Tij. Let k denote the cryptic order of PðX; XÞ, ω− P P P P δ x⃖ x⃖0 : ¼ x⃖jx⃗ x⃗ x⃖0jx⃗0 x⃗0 x;⃗x⃗0 j ih j ðC4Þ defined as the smallest l such that HðSljX0∶∞Þ¼0 x;⃗x⃗0;x;⃖x⃖0 [28,33,54]. The q-machine of P has internal states jSii defined by a recursive relation Consider now the pure state

jSii¼jSiðl ¼ kÞi; where ðC1Þ X pffiffiffiffiffiffiffiffiffiffiffiffiffiffi jψiX;⃖X⃗ ¼ Pðx;⃖x⃗Þ jx;⃖x⃗i; ðC5Þ X qffiffiffiffiffiffi x;⃖x⃗ x jSiðlÞi ¼ Tij jxijSjðl − 1Þi; ðC2Þ xj which represents the quantum superposition, or the q sample [55], over all possible output strings of the stochastic process and jSið0Þi¼jii. The associated encoding function sat- ⃖ ⃗ ¯ PðX; XÞ, with associated density operator isfies fðx⃖Þ¼jSii whenever x⃖∈ si [28,33]. Let Cq ¼ SðρÞ be the q-machine complexity—the amount ofP information a X X X X qffiffiffiffiffiffiffiffiffiffiffiffiffi ρ π S S ρ P↔ P↔ x⃖ x⃗ x⃖0 x⃗0 : q-machine stores about the past, where ¼ i ij iih ij. X;⃖X⃗ ¼ x 0 x j ij ih jh j ðC6Þ ¯ 0 i;i0 x⃖∈s x⃖0∈s0 x;⃗x⃗0 Meanwhile, let the max entropy Dq ¼ log tr½ρ be the q- i i machine state complexity—the minimum dimensionality of Ξ þ − any quantum system capable of storing these internal We can verify that ω ¼ TrX⃖½ρX⃖X⃗ and ω ¼ TrX⃗½ρX⃖X⃗. − þ ¯ þ ¯ − states. Note that since q-machines are valid quantum Thus, Sðω Þ¼Sðω Þ, and therefore, Cq ¼ Cq .The þ ¯ þ − ¯ − þ ¯ þ models, Cq ≤ Cq and Cq ≤ Cq . Likewise, Dq ≤ Dq q-machine complexities of the forward and backward proc- − ¯ − and Dq ≤ Dq . We now establish that the q-machine for esses thus coincide. P and its time reversal P− have coinciding von Neumann Note that the ranks of ωþ and ω− must also and coinciding max entropies. coincide. Thus, an analogous argument establishes that

031013-11 JAYNE THOMPSON et al. PHYS. REV. X 8, 031013 (2018) log tr½ρþ0 ¼ log tr½ρ−0, indicating the two models also ¯ þ ¯ − have the same dimensionality. Therefore, Dq ¼ Dq . □ We now prove Result 2. Consider any stochastic process P. First, assume P is causally asymmetric, such that ΔCμ ≠ 0. þ − Note that this implies Cμ ;Cμ >E (by Lemma 5). ¯ þ ¯ − Meanwhile, Lemma 7 implies that Cq ¼ Cq . Thus, it is ¯ þ þ ¯ − − sufficient to show that Cq E. ¯ Note that for a general process, Cq 0 [56]. It was also previously established that whenever Cμ >E, we can find some hSið1ÞjSjð1Þi > 0 [27], as defined by Eq. (C1). It follows from the iterative ¯ construction that Si Sj > 0, and thus Cq E C¯ þ E C¯ −

031013-12 CAUSAL ASYMMETRY IN A QUANTUM WORLD PHYS. REV. X 8, 031013 (2018) governing the memory. RegroupingP the pasts into causal- [3] D. N. Page and W. K. Wootters, Evolution without Evolu- π I R; X⃗x⃖∈ s 0 tion: Dynamics Described by Stationary Observables, Phys. state equivalence classes yields si∈S i ð j iÞ¼ , π s Rev. D 27, 2885 (1983). where i is the probability that the past belongs to i. Thus, [4] H. Price, Time’s Arrow & Archimedes’ Point: New Direc- ⃗ IðR; Xjx⃖∈ siÞ¼0 for every si ∈ S. tions for the Physics of Time (Oxford University Press, New We have assumed some elements of Ω are mixed. In York, 1997). particular, suppose we have a specific ωi0 ¼ ω ∈ Ω with [5] J. P. Crutchfield, C. J. Ellison, and J. R. Mahoney, Time’s SðωÞ > 0, which occurs with probability πi0 ¼ π. Let x⃖be a Barbed Arrow: Irreversibility, Crypticity, and Stored In- particular past such that fðx⃖Þ¼ω, and let Ψ ¼fjψ kig be a formation, Phys. Rev. Lett. 103, 094101 (2009). [6] J. P. Crutchfield, Between Order, and Chaos , Nat. Phys. 8, set of pure states that form an unravelingP of ω. There must 17 (2012). exist some qk ∈ ½0; 1 such that ω ¼ kqkjψ kihψ kj.Now, [7] Formally, this implies that the conditional mutual informa- let OM be a quantum process that maps ω to a classical tion IðR; X⃗jX⃖Þ is zero, where R is the random variable X⃗ random variable governed by probability distribution governing the state of Ξ. In the literature, IðR; X⃗jX⃖Þ is called ⃗ ⃖ PðXjX ¼ x⃖Þ. By definition of a quantum model, this oracular information, as it captures the amount of informa- process can always be constructed by concatenations of tion a state of the memory s stores about the future that is not M acting on a physical system Ξ. contained in the past [8,9]. Note that this notion of causal Let A represent the state of Ξ, and let B be the random model differs from the recent developments in quantum variable that governs the resulting output of OM acting on causal models [10,11] and from foundation work on causal Ξ. Zero oracular information implies that A and B must be inference [12]. uncorrelated when conditioned on observing past x⃖. [8] J. P. Crutchfield, C. J. Ellison, R. G. James, and J. R. O ψ ψ O ψ ψ O ω Mahoney, Synchronization and Control in Intrinsic and Therefore, Mðj kih kjÞ¼ Mðj jih jjÞ¼ Mð Þ for Designed Computation: An Information-Theoretic Analysis ψ ψ ∈ Ψ all j ki, j ji . of Competing Models of Stochastic Computation, Chaos 20, Now, consider the entropy of Q. By concavity of 037105 (2010). entropy, [9] C. J. Ellison, J. R. Mahoney, R. G. James, J. P. Crutchfield, and J. Reichardt, Information in Irreversible      X X X Processes, Chaos 21, 037107 (2011). S πjωj ¼ S qk πjψ kihψ kjþ πjωj [10] J.-M. A. Allen, J. Barrett, D. C. Horsman, C. M. Lee, and R. j k ωj≠ω W. Spekkens, Quantum Common Causes and Quantum X  X  Causal Models, Phys. Rev. X 7, 031021 (2017). ≥ qkS πjψ kihψ kjþ πjωj [11] F. Costa and S. Shrapnel, Quantum Causal Modelling, New k ωj≠ω J. Phys. 18, 063032 (2016).  X  [12] J. Pearl, Causal Inference in Statistics: An Overview, Statistics Surveys 3, 96 (2009). ≥ mink S πjψ kihψ kjþ πjωj : ðE3Þ [13] J. P. Crutchfield and K. Young, Inferring Statistical Com- ωj≠ω plexity, Phys. Rev. Lett. 63, 105 (1989). [14] C. R. Shalizi and J. P. Crutchfield, Computational Mechan- Without loss of generality, we can assume that this ics: Pattern and Prediction, Structure and Simplicity, J. Stat. minimum is obtained for k ¼ 0. Let Ω00 ¼ðΩnωÞ ∪ Phys. 104, 817 (2001). fjψ 0ihψ 0jg be a set of internal states, where ω is replaced [15] W. M. Gonalves, R. D. Pinto, J. C. Sartorelli, and M. J. de 00 with jψ 0ihψ 0j, and define the encoding function f such Oliveira, Inferring Statistical Complexity in the Dripping that f00ðx⃖Þ¼fðx⃖Þ, except when fðx⃖Þ¼ω, whereby Faucet Experiment, Physica (Amsterdam) 257A, 385 00 f ðx⃖Þ¼jψ 0ihψ 0j. Define a new quantum model (1998). 00 00 00 00 [16] J. B. Park, J. W. Lee, J.-S. Yang, H.-H. Jo, and H.-T. Moon, Q ¼ðf ; Ω ; MÞ. Clearly, CqðQ Þ ≤ CqðQÞ. Ω00 Complexity Analysis of the Stock Market, Physica If any of the states in are still mixed, then by (Amsterdam) 379A, 179 (2007). repeating the above procedure, we can replace them with [17] P. Tino and M. Koteles, Extracting Finite-State Represen- 0 0 0 pure states, thereby constructing a model Q ¼ðf ; Ω ; MÞ tations from Recurrent Neural Networks Trained on Cha- 0 with pure internal states such that CqðQ Þ ≤ CqðQÞ. □ otic Symbolic Sequences, IEEE Trans. Neural Networks 10, 284 (1999). [18] A. Cabello, M. Gu, O. Gühne, and Z.-P. Xu, Optimal Classical Simulation of State-Independent Quantum Con- textuality, Phys. Rev. Lett. 120, 130401 (2018). [1] N. Linden, S. Popescu, A. J. Short, and A. Winter, Quantum [19] A. J. P. Garner, J. Thompson, V. Vedral, and M. Gu, Mechanical Evolution Towards Thermal Equilibrium, Phys. of Complexity and Pattern Manipulation, Rev. E 79, 061103 (2009). Phys. Rev. E 95, 042140 (2017). [2] S. Popescu, A. J. Short, and A. Winter, Entanglement and [20] K. Wiesner, M. Gu, E. Rieper, and V. Vedral, Information- the Foundations of Statistical Mechanics, Nat. Phys. 2, 754 theoretic lower bound on energy cost of stochastic compu- (2006). tation,inProceedings of the Royal Society of London

031013-13 JAYNE THOMPSON et al. PHYS. REV. X 8, 031013 (2018)

A: Mathematical, Physical and Engineering Sciences, [37] H. M. Wiseman, Weak Values, Quantum Trajectories, and Vol. 468 (The Royal Society, 2012) pp. 4058–4066. the Cavity-QED Experiment on Wave-Particle Correlation, [21] A. Cabello, M. Gu, O. Gühne, J.-Å. Larsson, and K. Phys. Rev. A 65, 032111 (2002). Wiesner, Thermodynamical Cost of Some Interpretations [38] D. Tan, S. J. Weber, I. Siddiqi, K. Mølmer, and K. W. of Quantum Theory, Phys. Rev. A 94, 052127 (2016). Murch, Prediction and Retrodiction for a Continuously [22] A. B. Boyd, D. Mandal, P. M. Riechers, and J. P. Crutch- Monitored Superconducting Qubit, Phys. Rev. Lett. 114, field, Transient Dissipation and Structural Costs of Physi- 090403 (2015). cal Information Transduction, Phys. Rev. Lett. 118, 220602 [39] T. Rybarczyk, B. Peaudecerf, M. Penasa, S. Gerlich, B. (2017). Julsgaard, K. Mølmer, S. Gleyzes, M. Brune, J. M. [23] Note that in computational mechanics literature, causal Raimond, S. Haroche et al., Forward-Backward Analysis asymmetry is often referred to as causal irreversibility of the Photon-Number Evolution in a Cavity, Phys. Rev. A [5]. Here, we choose the term causal asymmetry to avoid 91, 062116 (2015). confusion with standard notions of irreversibility used in the [40] S. J. Weber, A. Chantasri, J. Dressel, A. N. Jordan, physics community. K. W. Murch, and I. Siddiqi, Mapping the Optimal Route [24] R. Konig, R. Renner, and C. Schaffner, The Operational between Two Quantum States, Nature (London) 511, 570 Meaning of Min- and Max-Entropy, IEEE Trans. Inf. Theory (2014). 55, 4337 (2009). [41] P. Campagne-Ibarcq, L. Bretheau, E. Flurin, A. Auff`eves, F. [25] W. Y. Suen, J. Thompson, A. J. P. Garner, V. Vedral, and M. Mallet, and B. Huard, Observing Interferences between Past Gu, The Classical-Quantum Divergence of Complexity in and Future Quantum States in Resonance Fluorescence, the Ising Spin Chain, Quantum 1, 25 (2017). Phys. Rev. Lett. 112, 180402 (2014). [26] Note that these definitions implicitly assume we can only [42] Note that since ρðtÞ and EðtÞ vary over a continuum, the encode onto pure states Ω. We show in Appendix E that memory cost of tracking either is likely to be unbounded. even when we allow the encoding function to map directly Thus, one may need to modify present approaches. Existing onto mixed quantum states, it does not help—there is always approaches include the use of differential entropies [43] þ a pure state quantum causal model with entropy SðρÞ¼Cq and studying scaling as we track the process to greater (see Theorem 3). precision [32]. [27] M. Gu, K. Wiesner, E. Rieper, and V. Vedral, Quantum [43] S. Marzen and J. P. Crutchfield, Informational and Causal Mechanics Can Reduce the Complexity of Classical Models, Architecture of Continuous-Time Renewal Processes, J. Nat. Commun. 3, 762 (2012). Stat. Phys. 168, 109 (2017). [28] P. M. Riechers, J. R. Mahoney, C. Aghamohammadi, and J. [44] T. J. Elliott and M. Gu, Superior Memory Efficiency P. Crutchfield, Minimized State Complexity of Quantum- of Quantum Devices for the Simulation of Continuous- Encoded Cryptic Processes, Phys. Rev. A 93, 052317 Time Stochastic Processes, npj Quantum Inf. 4,18 (2016). (2018). [29] M. S. Palsson, M. Gu, J. Ho, H. M. Wiseman, and G. J. [45] N. Barnett and J. P. Crutchfield, Computational Mechanics Pryde, Experimentally Modeling Stochastic Processes with of Input–Output Processes: Structured Transformations Less Memory by the Use of a Quantum Processor, Sci. Adv. and the ϵ-Transducer, J. Stat. Phys. 161, 404 (2015). 3, e1601302 (2017). [46] J. Thompson, A. J. P. Garner, V. Vedral, and M. Gu, Using [30] R. Tan, D. R. Terno, J. Thompson, V. Vedral, and M. Gu, Quantum Theory to Simplify Input–Output Processes, npj Towards Quantifying Complexity with Quantum Mechan- Quantum Inf. 3, 6 (2017). ics, Eur. Phys. J. Plus 129, 191 (2014). [47] J. Dressel, A. Chantasri, A. N. Jordan, and A. N. Korotkov, [31] A. Monras and A. Winter, Quantum Learning of Arrow of Time for Continuous Quantum Measurement, Classical Stochastic Processes: The Completely Positive Phys. Rev. Lett. 119, 220507 (2017). Realization Problem, J. Math. Phys. (N.Y.) 57, 015219 [48] O. C. O. Dahlsten, Non-equilibrium Statistical Mechanics (2016). Inspired by Modern Information Theory, Entropy 15, 5346 [32] A. J. P. Garner, Q. Liu, J. Thompson, V. Vedral, and M. Gu, (2013). Provably Unbounded Memory Advantage in Stochastic [49] S. Still, D. A. Sivak, A. J. Bell, and G. E. Crooks, Thermo- Simulation Using Quantum Mechanics, New J. Phys. 19, dynamics of Prediction, Phys. Rev. Lett. 109, 120604 103009 (2017). (2012). [33] J. R. Mahoney, C. Aghamohammadi, and J. P. Crutchfield, [50] R. Jozsa and J. Schlienz, Distinguishability of States Occam’s Quantum Strop: Synchronizing and Compressing and von Neumann Entropy, Phys. Rev. A 62, 012301 Classical Cryptic Processes via a Quantum Channel, Sci. (2000). Rep. 6, 20495 (2016). [51] M. A. Nielsen and G. Vidal, Majorization and the Inter- [34] F. C. Binder, J. Thompson, and M. Gu, A Practical, Unitary conversion of Bipartite States, Quantum Inf. Comput. 1,76 Simulator for Non-Markovian Complex Processes, (2001). arXiv:1709.02375. [52] R. Courant and D. Hilbert, Methods of Mathematical [35] J. Dressel, Weak Values as Interference Phenomena, Phys. Physics (Interscience, New York, 1953), Vol. 1. Rev. A 91, 032116 (2015). P X⃖ x⃖X⃗ x⃗∈ τ P X⃗ [53] To derive this, simplyP write ð ¼ j ¼ iÞ¼f½ ð ¼ [36] S. Gammelmark, B. Julsgaard, and K. Mølmer, Past ⃖ ⃗ ⃖ ⃗ 0 x⃗jX ¼ x⃖ÞPðx⃖Þ=½ x⃖∈s PðX ¼ x⃗jX ¼ x⃖ÞPðx⃖Þg ¼ f½PðX ¼ x⃗j Quantum States of a Monitored System, Phys. Rev. Lett. P i ⃖ ⃗ 0 ⃖ X ¼ x⃖ÞPðx⃖Þ=½ ⃖ PðX ¼ x⃗jX ¼ x⃖ÞPðx⃖Þg. 111, 160401 (2013). x∈si

031013-14 CAUSAL ASYMMETRY IN A QUANTUM WORLD PHYS. REV. X 8, 031013 (2018)

[54] J. R. Mahoney, C. J. Ellison, R. G. James, and J. P. of the 35th Annual ACM Symposium on Theory of Comput- Crutchfield, How Hidden Are Hidden Processes? A Primer ing (ACM, New York, 2003), pp. 20–29. on Crypticity and Entropy Convergence, Chaos 21, 037112 [56] M. A. Nielsen and I. L. Chuang, Quantum Computation and (2011). Quantum Information (Cambridge University Press, [55] D. Aharonov and A. Ta-Shma, Adiabatic Quantum State Cambridge, England, 2010). Generation and Statistical Zero Knowledge,inProceedings

031013-15