A Bayesian Quasi-Probability Approach to Inferring the Past of Quantum Observables

A Bayesian quasi-probability approach to inferring the past of quantum observables

Mankei Tsang∗ Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, Singapore 117583 and Department of Physics, National University of Singapore, 2 Science Drive 3, Singapore 117551 (Dated: September 16, 2018) I describe a method of inferring the past of quantum observables given the initial state and the subsequent measurement results using Wigner quasi-probability representations. The method is proved to be compatible with logic for large subclasses of quantum systems, including those that involve incompatible observables and can still exhibit some quantum features, such as the uncertainty relation, measurement backaction, and entanglement.

I. INTRODUCTION issues associated with the weak value.

Applying classical reasoning to quantum systems often leads to paradoxes. Many of the paradoxes involve II. LOGICAL INFERENCE inferring the past of a quantum observable given initial conditions and subsequent measurement outcomes, such A. Classical as the path taken by a quantum particle in an interferometer [1–4]. The weak value proposed by Aharonov, Consider first the classical inference problem. Cox’s Albert, and Vaidman (AAV) [5] has been widely used ex- theorem [15, 16] states that the only consistent method of perimentally to provide such an inference [4,6–11], but assigning plausibilities to propositions is equivalent to the it often produces results that defy common sense; for ex- laws of probability, and the Bayes theorem in particular, ample, a weak value can go well outside of the spectrum if one assumes a set of desiderata that can be regarded of an observable and even become complex. as an extension of Aristotelian logic and summarized as It is often claimed that anomalous weak values are follows [16]: signatures of the non-classicality of a quantum system [4,6,7]. A recent work by Ferrie and Combes [12] shows, 1. Representation of degrees of plausibility by real however, that an estimator analogous to the weak value numbers. can also produce non-sense when applied to a classical coin. If the weak value does not even make sense classi- 2. Qualitative correspondence with common sense. cally, should we expect it to be any more meaningful for 3. Consistency. quantum systems? Here I describe an alternative method of inferring the In the rest of the paper I shall take the view that Cox’s past of a quantum system based on quasi-probability desiderata and logic are equivalent notions. Readers who representations. The method was first proposed in wish to challenge this definition of logic are urged to read Refs. [13, 14] and called quantum smoothing, but the fo- Refs. [15, 16], and until they have come up with a bet- cus there was the estimation of classical signals coupled ter definition, Cox’s theorem remains the most rigorous to quantum systems and there was no general proof of its logical foundation available for statistical inference. compatibility with logic. The main purpose of this paper To be specific, consider the inference problem depicted is to prove that, through the celebrated Cox’s theorem in Fig.1. Let x be a parameter that represents facts [15, 16] and Wigner functions [17], quantum smoothing known at an initial time, y be another parameter that

arXiv:1403.3353v2 [quant-ph] 1 Apr 2014 is compatible with logic for large subclasses of quantum represents facts known at a ﬁnal time, and λ be the systems, including those that involve incompatible ob- hidden variable whose value at the intermediate time is servables and can still exhibit some quantum features, to be inferred. Suppose that the predictive probability such as the uncertainty relation, measurement backac- function P1(λ|x) and the retrodictive likelihood function tion, and entanglement. This general proof is the key P2(y|λ, x) are given. This is a reasonable assumption, as feature of the proposed method and a signiﬁcant improve- causality implies that λ should depend on x, y should de- ment over previous approaches [3,4], the logicality of pend on both λ and x, and noise should introduce uncer- which is less obvious when incompatible observables are tainties to the dependencies. The posterior probability concerned. By applying quantum smoothing to a qubit function of λ conditioned on x and y is and the AAV gedanken experiment, I also demonstrate how my method can avoid some of the counter-intuitive P2(y|λ, x)P1(λ|x) P (λ|x, y) = P , (2.1) λ P2(y|λ, x)P1(λ|x) which is the logical assignment of plausibilities to values ∗ [email protected] of λ given known facts x and y. For example, the most 2

If W2(y|λ, x) and W1(λ|x) are both non-negative, the posterior function given by W (y|λ, x)W (λ|x) W (λ|x, y) = 2 1 (2.6) P (y|x) is also non-negative and compatible with the laws of probability and Cox’s desiderata. Hence, the logicality of quantum smoothing and the non-negativity of quasi- probability representations are equivalent notions. ? If W1(λ|x) and W2(y|λ, x) are restricted to be non- negative, it is known that no single pair of maps W1 and W2 can explain all predictions of quantum mechan- Inference ics [19, 20]. This is known as contextuality; without this property, quantum mechanics would be equivalent to a classical hidden variable model. The key point here is that many large and important subclasses of quantum systems, such as the odd-dimensional stabilizer quan- FIG. 1. A ﬂowchart depicting the inference problem. Solid tum computation model [21, 22] and the linear Gaus- arrows denote the direction of causality. The problem of sta- sian model for canonical observables [23–25], turn out tistical inference is to infer λ from x and y using the given to be perfectly described by non-negative W1(λ|x) and P1(λ|x) and P2(y|λ, x). W2(y|λ, x) if they are chosen to be the appropriate discrete or continuous Wigner representations of ρx and E(y|x). These models generally involve incompatible ob- likely λ can be determined from P (λ|x, y) by ﬁnding the λ servables and can still exhibit some quantum features, that maximizes the posterior and is called the maximum such as the uncertainty relation, measurement backac- a posteriori (MAP) estimate. In accordance with the tion, and entanglement. terminology in estimation theory [13, 14, 18], I refer to To prove the logicality of quantum smoothing for the the inference of a hidden variable at the intermediate aforementioned subclasses explicitly, I write Born’s rule time as smoothing. in the following way:

P (y|x) = tr E(y|x, tj)ρx(tj) (2.7)

B. Quantum for any time tj, j = 0, 1,...,J, where

ρx(tj) ≡ Ux(tj) ... Ux(t2)Ux(t1)ρx(t0), (2.8) I now ask how one can logically infer the past of a ∗ ∗ ∗ E(y|x, tj) ≡ Ux (tj+1) ... Ux (tJ−1)Ux (tJ )E(y|x, tJ ), quantum system. Suppose that the initial quantum state (2.9) given prior facts represented by x is ρx, and the positive operator-valued measure (POVM) for the measurement ρx(t0) is the initial density operator, E(y|x, tJ ) is the at the final time with outcome y is E(y|x), which, for final POVM, and Ux(tj) denotes a unitary operation with generality, can be adaptive and depend on x. Born’s rule unitary operator Ux(tj): gives † Ux(tj)ρ ≡ Ux(tj)ρUx(tj), (2.10) ∗ † P (y|x) = tr E(y|x)ρx. (2.2) Ux (tj)E ≡ Ux(tj)EUx(tj). (2.11) The unitary operations can also model open-system evo- To infer logically the value of quantum observables, de- lution and sequential and adaptive measurements if the noted by λ, at the intermediate time is to assign a non- Hilbert space is suitably dilated [26, 27]. If I choose W negative posterior probability function to their possible 1 to be a Wigner representation of ρ (t ): values through the Bayes theorem. Define a pair of maps x j that transform ρx and E(y|x) to quasi-probability func- W1(λ, tj|x) = W1ρx(tj), (2.12) tions of λ: and W2 to be W1 multiplied by a normalization factor N : W1ρx = W1(λ|x), (2.3)

W2E(y|x) = W2(y|λ, x), (2.4) W2(y|λ, x, tj) = W2E(y|x, tj) = NW1E(y|x, tj), (2.13) a fundamental property of the Wigner representation [19] and require that they obey Born’s rule in the following can be used to give way: X tr E(y|x, t )ρ (t ) = W (y|λ, x, t )W (λ, t |x), X j x j 2 j 1 j P (y|x) = tr E(y|x)ρx = W2(y|λ, x)W1(λ|x). (2.5) λ λ (2.14) 3 making the hidden-variable model consistent with Born’s C. Weak value as an estimate P rule. For continuous variables, λ should be replaced by an integral. In the weak-value approach, an additional weak mea- For quantum systems with odd dimensions, Gross [21, surement [26, 27, 29, 30] is made before the final mea- 22] showed that a pure state is a stabilizer state if and surement. To model that scenario, it is important not to only if the natural analog of the Wigner representation confuse the weak measurement outcome with the hidden variable to be inferred. The weak measurement outcome for odd dimensions is non-negative. Restricting ρx(t0) to stabilizer states and U (t ) to Clifford operations, which can be grouped with either x or y, and the Bayesian pro- x j tocol, if it exists, will give us a posterior distribution of λ transform stabilizer states to stabilizer states, W1(λ, tj|x) is non-negative at any time. If E(y|x, t ) is a stabilizer- for any number of trials. From the perspective of Cox’s J theorem, any method that deviates from the Bayesian state projection, W2(y|λ, x, tj) is non-negative as well, ∗ approach is illogical [16], and if we view the weak value since Ux is also a Clifford operation if Ux is one. Hence, the odd-dimensional stabilizer model, which consists of simply as an inference method that combines data in a stabilizer states, Clifford operations, and stabilizer-state special way to produce an estimate of the observable, it projections, can always be represented by non-negative should not surprise us that it can produce non-sense, unless it happens to agree with the Bayesian approach. No W1 and W2 at any time tj and permits logical smoothing inference. amount of heuristic reasoning can save the weak value from illogicality otherwise. Similarly, for canonical observables, such as the con- In the context of the quasi-probability framework, one tinuous positions and momenta of harmonic oscillators, can ask whether the weak value corresponds to an esti- it is well known that a state with a Gaussian Wigner mate arising from a posterior quasi-probability function. representation remains Gaussian if the Hamiltonian is The answer was provided by Johansen and Luis [31], who quadratic with respect to the canonical observables [23– showed that the weak value is equivalent to the condi- 25]. If E(y|x, tJ ) is a projective measurement with re- tional average using what they called the S distributions, spect to the canonical observables, and the unitary op- which can be negative or even complex. It is currently erations are restricted to those with quadratic Hamilto- unknown under what situation the S distributions are nians, W2(y|λ, x, tj) is also Gaussian. The linear Gaus- non-negative, so the logical foundation of the weak value sian model, which consists of Gaussian states, quadratic remains questionable. Hamiltonians, and canonical-observable measurements, can therefore be represented by non-negative and Gaus- sian W1 and W2 at all times and also permits logical D. Logical inference versus decision theory smoothing inference. For quantum systems that do not admit non-negative Besides the logical interpretation, one can also justify quasi-probability representations, the smoothing infer- Bayesian inference in a more utilitarian manner using ence necessarily violates Cox’s desiderata and can lead to decision theory [16, 32], which shows that Bayesian in- paradoxes. One example is shown in Ref. [14] for Hardy’s ference can minimize the expected error between an esti- paradox [28] using a discrete Wigner representation. It mate and the true value of a hidden variable. In this pa- must be emphasized that negative quasi-probabilities do per, I do not attach any decision-theoretic significance to not resolve any paradox; they simply mean that the infer- the quantum inference, as the past of a quantum observ- ence according to the hidden-variable model is illogical. able is usually not available for error evaluation because For illustrative and pedagogical purposes it is still useful of the no-cloning theorem. Any apparent illogicality that to report the quasi-probability functions for experiments, arises from negative quasi-probabilities exists only in the as they are more obvious indicators of non-classicality mind and can be attributed to the wrong model being than density matrices and POVMs. For a more sensible used for a quantum system; the experimenter only has estimate than the weak value, one can choose access to prior facts and measurement outcomes, which obey Born’s rule and hence the laws of probability and logic. MAP This kind of mental exercise is not entirely philosoph- λx,y ≡ arg max W (λ|x, y) (2.15) λ ical however. The smoothing inference provides an alternative way of thinking about when a quantum system follows classical logic internally and is therefore simulable as the MAP quasi-estimate, which is still one of the pos- by a classical computer. This question is central not only sible values of λ and will never become anomalously large to quantum computation and quantum simulation [33], or complex, thus avoiding some of the counter-intuitive but also to the implementation of quantum estimation features of the weak value. In the case of Hardy’s para- and control algorithms [13, 14, 27, 34]. MAP dox, Ref. [14] shows that λx,y indeed reproduces the The method presented here is naturally extensible to most likely paths suggested by classical reasoning, even the inference of classical signals coupled to quantum sys- though they still lead to logical contradictions. tems, a problem studied extensively in Refs. [13, 14, 34]. 4

In that case, it must be emphasized that, although the These are plotted in Fig.2. Note that the value of the ˆr quasi-probability functions can be used as an intermedi- observable can also be inferred by deﬁning ate and often convenient step, the end result is always consistent with both logic and decision theory. The hy- r = (q + p) mod 2. (3.8) brid smoothing method is known to be optimal and su- perior to conventional prediction or ﬁltering methods for Note also how the uncertainty relations among the three certain quantum waveform estimation problems [35–38], spin components are observed in phase space. analogous to the classical case [18, 39], whereas classical parameter estimation based on the weak-value approach often turns out to be suboptimal [40–43]. |0i |1i

III. DISCRETE WIGNER REPRESENTATION OF QUBITS 0.5 0.5 0 0 Odd-dimensional systems possess a natural and unique 1 1 definition of the discrete Wigner function [21, 22], but p 0 0 many equally qualified definitions can exist for even di- q mensions [44]. Here I consider the ones proposed by Feynman [45] and Wootters and co-workers [44, 46, 47]. |+i |−i Consider first a qubit, which can describe the path of a quantum particle in a two-arm interferometer, the polar- ization of a photon, or the spin of an electron for example. 0.5 0.5 Define the observables of interest as 0 0 1 1 1 qˆ ≡ (I − σ ) , pˆ ≡ (I − σ ) , rˆ ≡ (I − σ ) , 2 Z 2 X 2 Y (3.1) where σZ , σX , and σY are Pauli matrices and I is the i − i identity matrix. Define eigenstates ofq ˆ,p ˆ, andr ˆ asq ˆ|0i = | i | i 0|0i,q ˆ|1i = |1i,p ˆ|+i = 0|+i,p ˆ|−i = |−i,r ˆ|ii = 0|ii, and rˆ| − ii = | − ii. The discrete Wigner function proposed by Feynman [45] and Wootters [46] with respect to the 0.5 0.5 eigenvalues ofq ˆ andp ˆ is 0 0 1 W ρ = tr A(q, p)ρ = W (q, p|x), (3.2) 1 x 2 x 1 where FIG. 2. A discrete Wigner representation of some qubit A(q, p) states.

1 q p q+p = (−1) σZ + (−1) σX + (−1) σY + I . (3.3) For two qubits, one of the possible deﬁnitions of the 2 Wigner functions proposed by Gibbons et al. [44, 47] is In the phase-space matrix format, 1 W1(0, 1|x) W1(1, 1|x) W1ρx = tr A(q1, q2, p1, p2)ρx, (3.9) W1(q, p|x) ≡ . (3.4) 4 W1(0, 0|x) W1(1, 0|x) A(q1, q2, p1, p2) For example, 1 = (−1)q1 σ + (−1)p1 σ + (−1)q+p1 σ + I ⊗ 2 Z X Y 0.5 0 0 0.5 W |0ih0| = , W |1ih1| = , 1 1 0.5 0 1 0 0.5 (−1)q2 σ + (−1)p2 σ − (−1)q2+p2 σ + I . 2 Z X Y (3.5) (3.10) 0 0 0.5 0.5 W |+ih+| = , W |−ih−| = , 1 0.5 0.5 1 0 0 Remarkably, the function stays non-negative for many separable states, as shown in Fig.3, as well as some en- (3.6) tangled states, as shown in Fig.4. Unlike the case of odd 0 0.5 0.5 0 W |iihi| = , W | − iih−i| = . dimensions, the correspondence between non-negative 1 0.5 0 1 0 0.5 Wigner representations and the stabilizer model is not (3.7) as strong for even dimensions [48, 49]. For two qubits, 5

|0, 0i |0, 1i |1, 0i |1, 1i |+, +i |+, −i |−, +i |−, −i |i, ii |i, −ii | − i, ii | − i, −ii

|0, +i |0, −i |1, +i |1, −i |0, ii |0, −ii |1, ii |1, −ii |+, 0i |+, 1i |−, 0i |−, 1i

|+, ii |+, −ii |−, ii |−, −ii |i, 0i |i, 1i | − i, 0i | − i, 1i |i, +i |i, −i | − i, +i | − i, −i

FIG. 3. A discrete Wigner representation of separable two-qubit states. The non-zero elements are all equal to 1/4. there are 60 pure stabilizer states [50], but the Wigner hidden-variable model agree with Born’s rule according function considered here is non-negative for only 48 of to Eq. (2.5). Figs.2,3, and4 then also depict the Wigner them, as shown in Figs.3 and4, and becomes negative representations of projective measurements, normaliza- for the remaining 12. tion notwithstanding. Systems with initial states, state transitions, and measurements within this set of states − − |00i + |11i |00i |11i |01i + |10i |01i |10i with non-negative Wigner representations naturally per- mit logical smoothing inference.

IV. LOGICAL SMOOTHING AND COMPLEX WEAK VALUE

|00i + |01i + i|10i − i|11i |00i − |01i + i|10i + i|11i |00i + |01i − i|10i + i|11i |00i − |01i − i|10i − i|11i It is not diﬃcult to construct examples where the weak value does not make sense, while the Wigner functions provide a logical path for smoothing inference. Consider an initial state given by ρ = |0ih0|, (4.1) |00i − i|01i + |10i + i|11i |00i + i|01i + |10i − i|11i −|00i + i|01i + |10i + i|11i |00i + i|01i − |10i + i|11i x=0 ρx=1 = |1ih1|, (4.2) 0.5 0 W (q, p|0) = , (4.3) 1 0.5 0 0 0.5 W (q, p|1) = , (4.4) 1 0 0.5 FIG. 4. A discrete Wigner representation of some entangled two-qubit states. Note that the states denoted in the titles and a ﬁnalr ˆ measurement given by are unnormalized for brevity. E(y = i) = |iihi|, (4.5) For the measurement, the map on the POVM can be E(y = −i) = | − iih−i|, (4.6) chosen as 0 1 W (i|q, p) = , (4.7) 2 1 0 W2E(y|x) = NW1E(y|x), (3.11) 1 0 with N being the dimension, such that a fundamental W2(−i|q, p) = . (4.8) property of the Wigner representation [44, 46] makes the 0 1 6

W1 and W2 are consistent with Born’s rule and remain non-negative for all x and y, enabling logical inference of the past values of q, p, and r = (q + p) mod 2. Sup- pose x = 0 and y = i. The smoothing quasi-probability function becomes

0 0 W (q, p|x = 0, y = i) = , (4.9) 1 0 ? ? which infers with certainty that q = p = r = 0. This result concerning the three incompatible observables fol- Inference lows from a logical inference protocol but seems to be at odds with the uncertainty principle. The apparent con- flict can be resolved by realizing that, under the quasi- FIG. 5. A flowchart (quantum circuit diagram) depicting probability framework, the uncertainty relation holds the Aharonov-Albert-Vaidman gedanken experiment. Start- ing with an initial state ψ, inference of the hidden variable λ only for the predictive part W1, and a smoothing inference can violate the principle even if it is logical. The at times t1 and t2 is performed using a weak q measurement weak value ofp ˆ, on the other hand, is with outcome δz and a final projective p measurement with outcome ξ. hi|pˆ|0i 1 + i = , (4.10) hi|0i 2 the Kraus operator [26, 27]: which is complex and makes no sense as an estimator. 1 (δz − qδtˆ )2 One could take the real part or the absolute value of the K(δz) = exp − (5.1) weak value to make it look more reasonable, but again (2πδt)1/4 4δt there is no logical foundation that justifies such heuristic 1 δz2 δz δt ≈ exp − 1 + qˆ − qˆ2 , operations. (2πδt)1/4 4δt 2 8 This example can also be studied using the approach (5.2) of consistent histories [3]. The coherence functional is the central quantity in this approach and for the present where δt characterizes the measurement strength, as- example with x = 0 and y = i is defined as C(p, p0) and sumed to be unitless and 1. The time after the weak given by measurement and before the final measurement is denoted as t . The final measurement is an p projection: 1 2 C(0, 0) = tr |iihi|+ih+|0ih0|+ih+|iihi| = , (4.11) 4 E(ξ = +) = |+ih+|, (5.3) i C(0, 1) = tr |iihi|+ih+|0ih0|−ih−|iihi| = − , (4.12) E(ξ = −) = |−ih−|, (5.4) 4 i C(1, 0) = tr |iihi|−ih−|0ih0|+ih+|iihi| = , (4.13) such that Born’s rule is given by 4 1 P (ξ, δz|ψ) = tr |ξihξ|K(δz)|ψihψ|K†(δz). (5.5) C(1, 1) = tr |iihi|−ih−|0ih0|−ih−|iihi| = . (4.14) 4 It is important here not to confuse the measurement Although the off-diagonal components C(0, 1) and outcome δz with the hidden observable it is measuring. C(1, 0) are not zero, they are imaginary and still sat- The weak-value approach calculates the average of δz/δt isfy the “weak consistency condition” defined in Ref. [3]. for many trials and claims that it is an estimate of q, but The general relation between the consistent histories ap- such an approach cannot be justified unless it happens proach and the quasi-probability approach here is an in- to agree with a Bayesian estimate. The approach would teresting open problem but beyond the scope of this pa- work for an analogous classical problem because there per. exists a likelihood function describing the weak measurement such that averaging the outcomes over many tri- R ∞ als is akin to evaluating −∞ d(δz)P2(δz|q)(δz/δt) = q, V. THE AHARONOV-ALBERT-VAIDMAN which gives the true q. For the quantum problem, how- GEDANKEN EXPERIMENT ever, it is not obvious how such a non-negative likelihood function can be defined to justify the weak value as an To demonstrate the limitations of the quasi-probability estimate, unless the Kraus operator K(δz) happens to approach, consider the original experiment proposed by commute with all the other operators in the problem. AAV [5], as depicted in Fig.5. A spin-1/2 particle is Let’s focus on the case |ψi = |−i with the final result known to be in a pure state |ψi at time t1, A weakq ˆ ξ = +, which gives an infinite weak value [5]. The ξ = + measurement is then performed and can be modeled by final result is possible because of the small backaction 7 noise introduced by the weak measurement to p. Con- sider the predictive Wigner functions W1(λ, t1|ψ) and W λ, t − W λ, t −, δz W1(λ, t1|ψ, δz) before and after the weak measurement: 1( 1| ) 1( 2| )

1 1 1 1 1 W (λ, t |−) ≡ W |−ih−| = , (5.6) 1 1 1 2 0 0 0 0 K(δz)|−ih−|K†(δz) W1(λ, t2|−, δz) ≡ W1 −1 −1 tr(numerator) 1 1 1 1 δz 0.5 1.5 p 0 0 ∝ + q 0 0 2 −0.5 0.5 δt −0.5 −0.5 W , δz λ, t W λ, t + , (5.7) 2(+ | 1) 2(+| 2) 8 0.5 0.5 1 1 and the following retrodictive quasi-likelihood functions: 0 0 W (+, δz|λ, t ) ≡ W K†(δz)|+ih+|K(δz) 2 1 2 −1 −1 0 0 δz −0.5 0.5 ∝ + 1 1 2 0.5 1.5 δt 0.5 0.5 + , (5.8) 8 −0.5 −0.5 W (λ, t1|−, δz, +) W (λ, t2|−, δz, +) 0 0 W (+|λ, t ) ≡ W |+ih+| = . (5.9) 2 2 2 1 1 10 10 0 0 The smoothing quasi-probability functions W (λ, t1,2|ψ, δz, ξ) at times t1 and t2 are then −10 −10

1/2 − 2δz/δt 1/2 + 2δz/δt W (λ, t |−, δz, +) ≈ , 1 0 0 FIG. 6. The predictive (1st row), retrodictive (2nd row), and (5.10) smoothing (3rd row) quasi-probability functions√ before and 0 0 after the weak measurement for δt = 0.1 and δz = 0.1. Note W (λ, t |−, δz, +) ≈ . 2 1/2 − 2δz/δt 1/2 + 2δz/δt that the W1(λ, t2|−, δz), W2(+, δz|λ, t1), and W2(+|λ, t2) (5.11) plotted here are unnormalized. All the functions must be non-negative for the inference to conform to logic; the negativity shown here is a signature of non-classicality. All the predictive, retrodictive, and smoothing quasi- probability functions are plotted in Fig.6. There are a few interesting observations to be made: • The MAP quasi-estimate of q is

• Both W1(λ, t2|−, δz) and W2(+, δz|λ, t1) become   0, δz < 0, negative when δz 6= 0, rendering the inference il- qMAP = 1, δz > 0, (5.13) logical. The smoothing quasi-probability functions  ambiguous, δz = 0. still agree with common sense about p however, as they infer that p must be 1 at t1 because of the This makes sense, as the outcome δz of a q measure- initial state and 0 at t2 because of the ﬁnal mea- ment, however noisy, should constitute evidence surement outcome. that should persuade the observer one way or the other. Because of the negative quasi-probabilities, • The marginal smoothing distributions with respect the quasi-estimate should not be taken seriously as to q can be written as a logical estimator, but it is at least more sensible than the inﬁnite weak value or, say, the naive W (q, t |−, δz, +) ≈ W (q, t |−, δz, +) 1 2 conditional average: 1 δz 1 δz ≈ − 2 + 2 . (5.12) X 1 δz 2 δt 2 δt q¯ ≡ qW (q|−, δz, +) ≈ + 2 . (5.14) 2 δt q Even though the negativity of Wigner functions is limited, one sees here that the smoothing quasi- q¯ exceeds the range [0, 1] when a smoothing probabilities that result from them can have arbi- quasi-probability becomes negative, that is, when trarily negative values if |δz/δt| > 1/4. |δz/δt| > 1/4. Since δz ∈ (−∞, ∞) and is typically 8 √ on the order of δt, the magnitude ofq ¯ can become distributions generally involve incompatible observables. extremely large, like the weak value. This anolamy Another recent work by Chantasri, Dressel, and Jordan is simply another manifestation of the illogicality [59] also proposes a phase-space approach to the quantum that arises from the negative quasi-probabilities. smoothing problem, but their method is based on path integrals and its connection with more well known and useful quasi-probability functions is unclear. VI. OTHER RELATED WORK

The Bayesian inference of an intermediate quantum VII. CONCLUSION projective measurement outcome was first considered by Watanabe [51] and Aharonov, Bergmann, and Lebowitz The Wigner representations are currently some of the [52]. Yanagisawa first introduced the term quantum best tools for finding classical models of quantum systems smoothing and applied it to quantum non-demolition [17, 60], and negative Wigner quasi-probability is known (QND) observables, which are compatible observables in to be a necessary resource for quantum computation [61– the Heisenberg picture [53]. Refs. [13, 14, 34] focus on 63]. By equating the logicality of quantum smoothing the smoothing inference of classical stochastic waveforms and the non-negativity of quasi-probability representa- coupled to quantum systems under continuous measure- tions, I have also made a connection between the quan- ments, while more recent papers by Dressel, Agarwal, tum smoothing inference problem and the notion of con- and Jordan [54, 55] and Gammelmark, Julsgaard, and textuality [19, 20]. These connections suggest that the Mølmer [56] extend the theory to the inference of weak smoothing method based on logical inference and Wigner measurement outcomes. These results can be regarded as functions is a pretty good, if not the best, attempt at rec- sharing the same foundation, as projective or weak mea- onciling logic and quantum mechanics when one tries to surement outcomes and classical random variables can infer the past of a quantum system, and further progress all be modeled as QND observables in a suitably dilated along these lines will benefit multiple areas of quantum Hilbert space [57]. A collection of QND observables that information processing and quantum foundations. commute with each other at all times of interest are called a quantum-mechanics-free subsystem in Ref. [58] to em- phasize that they have no quantum feature. It is easy to show that QND observables always have consistent ACKNOWLEDGMENTS histories [3]. The inference of QND observables is always compatible Inspiring and helpful discussions with Joshua Combes, with decision theory, since they can be measured without Christopher Ferrie, Justin Dressel, and George Knee are any backaction and compared with the estimates for error gratefully acknowledged. This work is supported by evaluation, but the theory is not as general as the one the Singapore National Research Foundation under NRF proposed here and in Refs. [13, 14], as quasi-probability Grant No. NRF-NRFF2011-07.

[1] John Archibald Wheeler, “The “past” and the “delayed- [7] Adrian Cho, “Furtive approach rolls back the limits of choice” double-slit experiment,” in Mathematical Foun- quantum uncertainty,” Science 333, 690–693 (2011), and dations of Quantum Theory, edited by A. R. Marrow references therein. (Academic Press, New York, 1978) pp. 9–48. [8] J. P. Groen, D. Ristè,L. Tornberg, J. Cramer, P. C. [2] Asher Peres, “Time asymmetry in quantum mechanics: de Groot, T. Picot, G. Johansson, and L. DiCarlo, a retrodiction paradox,” Physics Letters A 194, 21–25 “Partial-measurement backaction and nonclassical weak (1994). values in a superconducting circuit,” Phys. Rev. Lett. [3] Robert B. Griffiths, Consistent Quantum Theory (Cam- 111, 090506 (2013). bridge University Press, Cambridge, 2003). [9] L. Vaidman, “Past of a quantum particle,” Phys. Rev. A [4] Yakir Aharonov and Daniel Rohrlich, Quantum Para- 87, 052104 (2013). doxes: Quantum Theory for the Perplexed (Wiley, New [10] A. Danan, D. Farfurnik, S. Bar-Ad, and L. Vaidman, York, 2005), and references therein. “Asking photons where they have been,” Phys. Rev. Lett. [5] Yakir Aharonov, David Z. Albert, and Lev Vaidman, 111, 240402 (2013). “How the result of a measurement of a component of the [11] P. Campagne-Ibarcq, L. Bretheau, E. Flurin, A. Auffèves, spin of a spin-1/2 particle can turn out to be 100,” Phys. F. Mallet, and B. Huard, “Observing interferences Rev. Lett. 60, 1351–1354 (1988). between past and future quantum states in resonance [6] Yutaka Shikano, “Theory of “Weak Value” and Quantum fluorescence,” ArXiv e-prints (2013), arXiv:1311.5605 Mechanical Measurements,” in Measurements in Quan- [quant-ph]. tum Mechanics, edited by M. R. Pahlavani (InTech, 2012) [12] Christopher Ferrie and Joshua Combes, “How the result Chap. 4, p. 75, and references therein. of a single coin toss can turn out to be 100 heads,” ArXiv e-prints (2014), 1403.2362. 9

[13] Mankei Tsang, “Optimal waveform estimation for clas- [35] Mankei Tsang, Howard M. Wiseman, and Carlton M. sical and quantum systems via time-symmetric smooth- Caves, “Fundamental quantum limit to waveform esti- ing,” Phys. Rev. A 80, 033840 (2009). mation,” Phys. Rev. Lett. 106, 090401 (2011). [14] Mankei Tsang, “Optimal waveform estimation for classi- [36] T. A. Wheatley, D. W. Berry, H. Yonezawa, D. Nakane, cal and quantum systems via time-symmetric smoothing. H. Arao, D. T. Pope, T. C. Ralph, H. M. Wiseman, II. Applications to atomic magnetometry and Hardy’s A. Furusawa, and E. H. Huntington, “Adaptive op- paradox,” Phys. Rev. A 81, 013824 (2010). tical phase estimation using time-symmetric quantum [15] Richard T. Cox, The Algebra of Probable Inference smoothing,” Phys. Rev. Lett. 104, 093601 (2010). (Johns Hopkins University Press, 1961). [37] Hidehiro Yonezawa, Daisuke Nakane, Trevor A. Wheat- [16] E. T. Jaynes, Probability Theory: The Logic of Science, ley, Kohjiro Iwasawa, Shuntaro Takeda, Hajime Arao, edited by G. Larry Bretthorst (Cambridge University Kentaro Ohki, Koji Tsumura, Dominic W. Berry, Timo- Press, Cambridge, 2003). thy C. Ralph, Howard M. Wiseman, Elanor H. Hunting- [17] Christopher Ferrie, “Quasi-probability representations of ton, and Akira Furusawa, “Quantum-enhanced optical- quantum theory with applications to quantum informa- phase tracking,” Science 337, 1514–1517 (2012). tion science,” Reports on Progress in Physics 74, 116001 [38] Kohjiro Iwasawa, Kenzo Makino, Hidehiro Yonezawa, (2011). Mankei Tsang, Aleksandar Davidovic, Elanor Hunting- [18] Dan Simon, Optimal State Estimation: Kalman, H Infin- ton, and Akira Furusawa, “Quantum-limited mirror- ity, and Nonlinear Approaches (Wiley, Hoboken, 2006). motion estimation,” Phys. Rev. Lett. 111, 163602 (2013). [19] Christopher Ferrie and Joseph Emerson, “Frame repre- [39] Harry L. Van Trees, Detection, Estimation, and Modu- sentations of quantum mechanics and the necessity of lation Theory, Part I. (John Wiley & Sons, New York, negativity in quasi-probability representations,” Journal 2001). of Physics A Mathematical General 41, 352001 (2008), [40] George C. Knee, G. Andrew D. Briggs, Simon C. Ben- arXiv:0711.2658 [quant-ph]. jamin, and Erik M. Gauger, “Quantum sensors based on [20] Robert W. Spekkens, “Negativity and contextuality are weak-value amplification cannot overcome decoherence,” equivalent notions of nonclassicality,” Phys. Rev. Lett. Phys. Rev. A 87, 012115 (2013). 101, 020401 (2008). [41] George C. Knee and Erik M. Gauger, “When amplifica- [21] David Gross, “Hudson’s theorem for finite-dimensional tion with weak values fails to suppress technical noise,” quantum systems,” Journal of Mathematical Physics 47, Phys. Rev. X 4, 011032 (2014). 122107 (2006), arXiv:quant-ph/0602001. [42] Saki Tanaka and Naoki Yamamoto, “Information ampli- [22] David Gross, “Non-negative Wigner functions in prime fication via postselection: A parameter-estimation per- dimensions,” Applied Physics B: Lasers and Optics 86, spective,” Phys. Rev. A 88, 042116 (2013). 367–370 (2007), quant-ph/0702004. [43] Christopher Ferrie and Joshua Combes, “Weak value am- [23] M. Hillery, R. F. O’Connell, M. O. Scully, and E. P. plification is suboptimal for estimation and detection,” Wigner, “Distribution functions in physics: Fundamen- Phys. Rev. Lett. 112, 040406 (2014). tals,” Phys. Rep. 106, 121–167 (1984). [44] Kathleen S. Gibbons, Matthew J. Hoffman, and [24] Samuel L. Braunstein and Peter van Loock, “Quantum William K. Wootters, “Discrete phase space based on information with continuous variables,” Rev. Mod. Phys. finite fields,” Phys. Rev. A 70, 062101 (2004). 77, 513–577 (2005). [45] Richard P. Feynman, “Negative probability,” in Quantum [25] Stephen D. Bartlett, Terry Rudolph, and Robert W. Implications: Essays in Honour of David Bohm, edited Spekkens, “Reconstruction of Gaussian quantum me- by F. David Peat and Basil Hiley (Routledge, London, chanics from Liouville mechanics with an epistemic re- 1987) pp. 235–248. striction,” Phys. Rev. A 86, 012103 (2012). [46] William K. Wootters, “A Wigner-function formulation of [26] Karl Kraus, States, Effects, and Operations: Fundamen- finite-state quantum mechanics,” Annals of Physics 176, tal Notions of Quantum Theory (Springer, Berlin, 1983). 1 – 21 (1987). [27] Howard M. Wiseman and Gerard J. Milburn, Quantum [47] William K. Wootters, “Picturing Qubits in Phase Measurement and Control (Cambridge University Press, Space,” eprint arXiv:quant-ph/0306135 (2003), quant- Cambridge, 2010). ph/0306135. [28] Lucien Hardy, “Quantum mechanics, local realistic theo- [48] Ernesto F. Galvão,“Discrete Wigner functions and quan- ries, and lorentz-invariant realistic theories,” Phys. Rev. tum computational speedup,” Phys. Rev. A 71, 042302 Lett. 68, 2981–2984 (1992). (2005). [29] Edward Brian Davies, Quantum Theory of Open Systems [49] Cecilia Cormick, Ernesto F. Galvão,Daniel Gottesman, (Academic Press, London, 1976). Juan Pablo Paz, and Arthur O. Pittenger, “Classicality [30] Vladimir B. Braginsky and Farid Ya. Khalili, Quantum in discrete Wigner functions,” Phys. Rev. A 73, 012301 Measurement (Cambridge University Press, Cambridge, (2006). 1992). [50] Scott Aaronson and Daniel Gottesman, “Improved sim- [31] Lars M. Johansen and Alfredo Luis, “Nonclassicality in ulation of stabilizer circuits,” Phys. Rev. A 70, 052328 weak measurements,” Phys. Rev. A 70, 052115 (2004). (2004). [32] James O. Berger, Statistical Decision Theory and [51] Satosi Watanabe, “Symmetry of physical laws. part III. Bayesian Analysis (Springer-Verlag, New York, 1980). prediction and retrodiction,” Rev. Mod. Phys. 27, 179– [33] Michael A. Nielsen and Isaac L. Chuang, Quantum Com- 186 (1955). putation and Quantum Information (Cambridge Univer- [52] Yakir Aharonov, Peter G. Bergmann, and Joel L. sity Press, Cambridge, 2000). Lebowitz, “Time symmetry in the quantum process of [34] Mankei Tsang, “Time-symmetric quantum theory of measurement,” Phys. Rev. 134, B1410–B1416 (1964). smoothing,” Phys. Rev. Lett. 102, 250403 (2009). 10

[53] M. Yanagisawa, “Quantum smoothing,” ArXiv e-prints Rev. A 88, 042110 (2013). (2007), arXiv:0711.3885 [quant-ph]. [60] Mankei Tsang, “Testing quantum mechanics: a statis- [54] J. Dressel, S. Agarwal, and A. N. Jordan, “Contextual tical approach,” Quantum Measurements and Quantum values of observables in quantum measurements,” Phys. Metrology 1, 84–109 (2013). Rev. Lett. 104, 240401 (2010). [61] Victor Veitch, Christopher Ferrie, David Gross, and [55] J. Dressel and A. N. Jordan, “Contextual-value approach Joseph Emerson, “Negative quasi-probability as a re- to the generalized measurement of observables,” Phys. source for quantum computation,” New Journal of Rev. A 85, 022123 (2012). Physics 14, 113011 (2012), arXiv:1201.1256 [quant-ph]. [56] Søren Gammelmark, Brian Julsgaard, and Klaus [62] Victor Veitch, Seyed Ali Hamed Mousavian, Daniel Mølmer, “Past quantum states of a monitored system,” Gottesman, and Joseph Emerson, “The Resource The- Phys. Rev. Lett. 111, 160401 (2013). ory of Stabilizer Computation,” ArXiv e-prints (2013), [57] Viacheslav P. Belavkin, “Nondemolition principle of arXiv:1307.7171 [quant-ph]. quantum measurement theory,” Foundations of Physics [63] A. Mari and J. Eisert, “Positive wigner functions render 24, 685–714 (1994). classical simulation of quantum computation eﬃcient,” [58] Mankei Tsang and Carlton M. Caves, “Evading quantum Phys. Rev. Lett. 109, 230503 (2012). mechanics: Engineering a classical subsystem within a quantum environment,” Phys. Rev. X 2, 031016 (2012). [59] A. Chantasri, J. Dressel, and A. N. Jordan, “Action principle for continuous quantum measurement,” Phys.