
Quantum Measurements and Metrology Mini Review • DOI: 10.2478/qmetro-2013-0007 • QMTR • 2013 • 84-109

Testing quantum mechanics: a statistical approach

Mankei Tsang1,2∗

1 Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, Singapore 117583
2 Department of Physics, National University of Singapore, 2 Science Drive 3, Singapore 117551

Received 8 August 2013
Accepted 28 November 2013

Abstract
As experiments continue to push the quantum-classical boundary using increasingly complex dynamical systems, the interpretation of experimental data becomes more and more challenging: when the observations are noisy, indirect, and limited, how can we be sure that we are observing quantum behavior? This tutorial highlights some of the difficulties in such experimental tests of quantum mechanics, using optomechanics as the central example, and discusses how the issues can be resolved using techniques from statistics and insights from quantum information theory.

Keywords

PACS: 42.50.Pq, 42.65.Ky, 42.65.Lm, 42.79.Hp © 2013 Mankei Tsang, licensee Versita Sp. z o. o.

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs license, which means that the text may be used for non-commercial purposes, provided credit is given to the author.

∗ E-mail: [email protected]

1. Introduction

Once thought to be a theory confined to the atomic domain, quantum mechanics is now being tested on increasingly macroscopic levels, thanks to technological advances and the ingenuity of experimentalists [1–6]. As experiments continue to push the quantum-classical boundary using increasingly complex dynamical systems, the interpretation of experimental data becomes more and more challenging: when the observations are noisy, indirect, and limited, how can we be sure that we are observing quantum behavior? The goal of this tutorial is to highlight some of the difficulties in such experimental tests of quantum mechanics and discuss how the issues can be resolved using techniques from statistics and insights from quantum information theory. Apart from quantum physicists, another target audience of this tutorial is statisticians and engineers, who might be interested to learn more about quantum physics and how statistics can be useful for the new generation of quantum experiments.

The tutorial starts off in rather basic and general terms, introducing the basic concepts of quantum mechanics in Sec. 2 and statistical hypothesis testing in Sec. 3. Sec. 4 is the centerpiece of this tutorial, discussing in detail why and how quantum mechanics should be tested. To illustrate the concepts in the context of recent experiments, optomechanics is used as the main example. Optomechanics refers to the physics of the interactions between optical beams and moving mechanical objects. A moving mirror, for example, imposes a position-dependent phase shift on an optical beam that it reflects. The motion of the mirror can then be inferred from measurements of the optical phase, while the change in momentum of the reflected optical beam also means that the mirror experiences a force, namely, radiation pressure. Optomechanics technology has advanced so rapidly in recent years [4–6] that quantum effects are becoming observable in mechanical devices with unprecedented sizes [7–10]. Such devices thus serve as promising testbeds for new concepts in macroscopic quantum mechanics [11]. Sec. 4.4 in particular studies the optomechanics experiment reported by Safavi-Naeini et al. [12, 13] and demonstrates how statistics can be applied to the experimental data. For the motivated reader, the Appendices also introduce some of the more advanced techniques in classical and quantum probability theory that can facilitate the experimental design and signal processing.

2. Quantum mechanics

2.1. Origin of quantum

The word “quantum” in quantum mechanics refers to the fact that certain physical quantities, such as energy and angular momentum, exist only in discrete levels, or quanta. This assumption, together with , are

84 Testing quantum mechanics: a statistical approach

able to explain many phenomena; for example,

1. Planck’s model of electromagnetic fields with discrete energy can explain the blackbody spectrum and, later by Einstein, the photoelectric effect.

2. Bohr’s model of bound electrons with discrete energy and orbital angular momentum can explain the spectral lines of hydrogen.

Despite its success, the seemingly ad-hoc nature of the quantal assumption motivated theorists to find a deeper model. The result is Schrödinger’s wave mechanics and Heisenberg’s matrix mechanics.

2.2. The Hilbert-space theory

The Schrödinger and Heisenberg pictures of quantum mechanics are equivalent theories, which are able to explain the quantal model as a consequence of deeper axioms based on Hilbert-space algebra. The central quantities of the theory are the quantum state, which is a complex vector denoted by $|\psi\rangle$, observables, which are Hermitian matrices, and a unitary matrix $U$ for time evolution.

The Hilbert-space theory produces many predictions beyond the quantal hypothesis. Perhaps the most outrageous one is the “uncertainty” relation, which states that the product of the variances of a pair of incompatible observables, such as the position and momentum of an electron, cannot be zero but is instead lower-bounded by a certain positive value. The word “uncertainty” is put in quotes because, at this stage, the “uncertainty” relation is nothing more than a mathematical statement in the Hilbert-space theory. Although Ehrenfest’s correspondence principle tells us that the Hilbert-space average of an observable obeys classical mechanics and gives us a rough sense of how observables correspond to physical quantities, it is unclear how the Hilbert-space variance is related to the common sense of uncertainty, which is best described using probability theory.

This problem becomes more apparent when one wishes to define the correlation of incompatible observables. Correlation is a well-defined concept in probability, but in the Hilbert-space theory its definition is ambiguous, with infinitely many ways of combining the observables that result in different Hilbert-space moments.

An even more troubling problem with the theory is how to test it in an experiment. In the Stern-Gerlach experiment, for example, an electron beam interacts with magnetic fields, before being detected on a screen. If we are to believe that the Hilbert-space theory is a fundamental theory that governs all the interacting objects involved in an experiment, then we must include in the Hilbert space not only the electrons, but also the magnetic field, the screen, the experimentalists themselves, and, by extension, the whole universe.

This viral nature of the Hilbert-space theory is nowadays taken more seriously among some theorists. On a pragmatic level, it makes the theory, by itself, impossible to test experimentally, as the experimentalists would have to take into account the universe, including themselves, every time they would like to generate a prediction from the Hilbert-space theory and perform an experiment to test it. To test the Hilbert-space theory, we must therefore find a way to divorce the test object from the rest of the universe and extract reproducible experimental results from the model. Fortunately, for experimentalists, the von Neumann measurement theory provides a way out.

2.3. Quantum probability

The von Neumann measurement theory provides a definition of quantum measurement with respect to an observable, known as the von Neumann measurement. The definition allows one to model a test object using a Hilbert space, but still describe the rest of the universe as an observer that follows the classical rules of probability.

The probabilities of measurement outcomes are determined from a Hilbert-space model using Born’s rule. Although each measurement outcome is random, the Born probability values are deterministic and can be estimated with increasing accuracy by repeated experiments. As the probabilities depend on the Hilbert-space model being assumed, one can then obtain asymptotically reproducible results that verify the validity of the Hilbert-space theory. The combined theory of Hilbert space and von Neumann measurement is referred to as the quantum probability theory.

With the quantum probability theory, the Hilbert-space moments and the uncertainty relation acquire operational meanings: one can define Hilbert-space averages in an unambiguous fashion by specifying the measurements and asking how the averages are related to the expected values for the measurements. Most importantly, the theory enables experimentalists to stay safely in the realm of classical logic and still test the Hilbert-space theory by considering smaller models.

We now have a quantum theory that predicts probabilities as verifiable deterministic numbers, but it is very clumsy to use, as it provides no rule that specifies which part of the experiment should be included in the Hilbert space and which part should be defined as the observer. This dichotomy is known as the Heisenberg cut. An empirical way of deciding on a cut is as follows:

1. Make a guess of how the cut should be made and compute the quantum probabilities based on the cut.


2. The validity of the cut can be checked by making a larger cut: include more experimental objects in a larger Hilbert space, do the calculation again, and see if the predictions match.

3. Alternatively, one can also attempt to find smaller cuts with smaller Hilbert spaces (by using certain tricks known as the open quantum system theory).

The arbitrariness of the cut is unsatisfactory to some, but we may take a pragmatic view of the cutting procedure as an algorithm for the scientific method. Without it, the very definition of scientific experiments is endangered.

Much like the Hilbert-space theory superseding the quantal hypothesis, there have been many proposals that claim to interpret or supersede the quantum probability theory. Until such theories provide distinguishable predictions, however, it is impossible to test them in an experiment.

The concepts discussed thus far can be found in many standard textbooks, for example, Ref. [14]. Appendices C–F present some of the more advanced concepts and methods in quantum probability theory.

3. Statistical hypothesis testing

3.1. Why bother?

How do we test a hypothesis that gives only probabilities of the measurement outcomes? An easy and by far the most common approach is to perform an experiment in many trials or for a very long time, and combine the outcomes into fewer numbers known as the test statistics, such as the mean, correlation, or power spectral density. The test statistics are then compared with the expected values according to the hypothesis.

To justify this averaging approach, one can appeal to the law of large numbers or the ergodic theorem for the convergence of the test statistics. Unfortunately, such laws are exact only for an infinite number of trials or an infinitely long time. These limits are called “asymptotically almost surely” in the lingo of probability theory, but they also imply that, in finite time, we can never be sure, and a way of characterizing the uncertainties is needed.

An analysis of experimental uncertainties is a standard prerequisite for publication nowadays, but it is often treated as an afterthought. Performing a statistical analysis with utmost rigor is not only a moral responsibility, but also has many benefits:

1. Experimental design. Before implementing an experiment, it can tell experimentalists how much information they can gain from a setup, such that the design can be rejected, adopted, or improved, saving time, effort, and money.

2. Signal processing. After the results are obtained, statistical signal processing techniques can be used to optimize their accuracy further and compute their errors.

3. Universality. By using standard error measures, it is easier to compare and communicate the significance of an experiment. This is especially important for multidisciplinary science and engineering applications.

4. Confidence. Statistics can provide a measure of confidence, such that the experimentalists and society in general can understand the value of the results and guard against the risks.

5. Fun. Statistics is a full-fledged scientific discipline in itself, and many scientists and engineers find it fun to learn and apply.

6. Insight. Learning about statistics may shed new light on the foundations of quantum probability theory.

The last point should especially incentivize quantum physicists to learn more about statistics.

3.2. Bayesian hypothesis testing

An intuitive approach to statistical hypothesis testing is known as Bayesian hypothesis testing, which computes the posterior probability $P(H_j|Y)$ that a hypothesis $H_j$ is true given the observation $Y$ via the Bayes theorem:

\[
P(H_j|Y) = \frac{P(Y|H_j)P(H_j)}{\sum_j P(Y|H_j)P(H_j)}, \tag{1}
\]

where $P(Y|H_j)$ is the probability of the observation predicted by a hypothesis $H_j$ and $P(H_j)$ is the prior probability. A common criticism of the Bayesian method is that the prior probabilities may imply subjective beliefs, but many definitions of objective priors have been proposed and are now widely accepted [15–17]. Some popular objective priors are reviewed in Sec. 3.6.

If one is uncomfortable with priors, one can avoid them by turning to frequentist methods. The significance of a frequentist test is much more difficult to comprehend and communicate to others, however, unlike the much more intuitive meaning of a posterior probability. For example, a popular frequentist significance measure is called the p-value, which is the probability that a test statistic would be more extreme than the experimentally obtained value if a hypothesis to be rejected is true.

At least one alternative is needed to compute the posterior probability distribution. If there is no obvious alternative


and one lacks the imagination to come up with one, it is possible to compare a hypothesis with reference alternatives based on more mathematical grounds [17]. Fortunately, for quantum tests, alternatives, such as classical mechanics and hidden-variable models, are abundant.

The rest of the tutorial will focus on the Bayesian theory. For critiques of frequentist methods, see Refs. [15–17].

3.3. Strength of an experiment

To judge the significance and value of an experiment, it is useful to quantify how strongly an experimental result may sway one’s opinion. For the simplest example, consider two hypotheses. The ratio of the posterior probabilities is

\[
\frac{P(H_1|Y)}{P(H_0|Y)} = \frac{P(Y|H_1)P(H_1)}{P(Y|H_0)P(H_0)} = \Lambda(Y)\frac{P(H_1)}{P(H_0)}, \tag{2}
\]
\[
\Lambda(Y) \equiv \frac{P(Y|H_1)}{P(Y|H_0)}. \tag{3}
\]

$\Lambda(Y)$ is called the likelihood ratio. It is used to update one’s prior beliefs about the two hypotheses, and can be understood as the strength of a given evidence $Y$ for one hypothesis against the other. An experiment shows strong evidence for $H_1$ against $H_0$ when $\Lambda(Y) \gg 1$ and vice versa when $\Lambda(Y) \ll 1$.

Unless the two hypotheses predict the same probability distribution, the likelihood ratio cannot be computed until some results are obtained. For experimental design, it is useful to know in advance how much the likelihood ratio is expected to rise or fall. One measure that quantifies this expected information is the relative entropy:

\[
D(P_1||P_0) \equiv \mathrm{E}\left[\ln\Lambda(Y)\,|\,H_1\right] \tag{4}
\]
\[
= \sum_Y P(Y|H_1)\ln\frac{P(Y|H_1)}{P(Y|H_0)}, \tag{5}
\]

where $\mathrm{E}$ denotes the expected value. To see why it is a sensible measure of information, consider $M$ independent trials with observations $Y_1, Y_2, \dots, Y_M$, each generating a likelihood ratio $\Lambda(Y_m)$. The collective log-likelihood ratio is

\[
\ln\Lambda(Y_1,\dots,Y_M) = \sum_{m=1}^{M}\ln\Lambda(Y_m). \tag{6}
\]

This means that, as $M \to \infty$, if the trials have identical probability distributions and $H_1$ is true,

\[
\ln\Lambda(Y_1,\dots,Y_M) \to M\,\mathrm{E}\left[\ln\Lambda(Y_m)\,|\,H_1\right] = M D(P_1||P_0). \tag{7}
\]

Since the relative entropy is always nonnegative, the ratio is expected to rise if $H_1$ is true. The same argument works also if $H_0$ is true and the log-likelihood ratio should fall, since $-M D(P_0||P_1) \le 0$. The relative entropies thus provide the experimentalist an idea of how the expected strength of an experiment increases with the number of trials. This rise of expected information is important, as it tells us that, even if each trial is uncertain, more evidence will get us closer to the truth.

For other operational meanings of the relative entropy, see Refs. [16, 18]. For multiple-hypothesis testing in general, an appealing measure of information gain is the mutual information; see Ref. [16].

3.4. Making decisions

For engineering applications, including communication, robotic control, and financial trading, the goal of hypothesis testing is not only to gain knowledge or convince skeptics, but also to make a decision on one hypothesis. We define a decision rule as $H_k(Y)$ and the penalty or cost incurred by a decision on $H_k$ when $H_j$ is true via the loss function $L(H_j, H_k)$. The expected loss is called the risk of a decision rule [15]:

\[
R(H_j) \equiv \sum_Y L(H_j, H_k(Y))P(Y|H_j). \tag{8}
\]

If we average the risk function over a prior, we obtain the so-called Bayes risk:

\[
R \equiv \sum_j R(H_j)P(H_j), \tag{9}
\]

which can also be written in terms of the posterior distribution as

\[
R = \sum_Y\left[\sum_j L(H_j, H_k(Y))P(H_j|Y)\right]P(Y). \tag{10}
\]

To minimize $R$, we can choose an $H_k(Y)$ that minimizes each of the square-bracketed terms in Eq. (10). This is equivalent to a decision rule that minimizes the posterior expected loss:

\[
\check{H}(Y) = \arg\min_{H_k(Y)}\sum_j L(H_j, H_k(Y))P(H_j|Y). \tag{11}
\]

This risk minimization serves as another motivation for the Bayesian approach. For example, minimizing the probability of making a wrong decision $P_e$, or the error probability for short, is equivalent to defining the loss function as

\[
L(H_j, H_k) = 1 - \delta_{jk}, \tag{12}
\]


and the optimal decision is to choose the hypothesis with the highest posterior probability $P(H_j|Y)$.

Except for a few special cases, the error probability is hard to compute exactly, but it can be sandwiched between a lower bound and an upper bound in the case of two hypotheses. For $P(H_0) = P(H_1) = 1/2$, the bounds are given by [19, 20]

\[
\frac{1}{2}\left\{1 - \sqrt{1 - \exp[-2C(0.5)]}\right\} \le \min_{H_k(Y)} P_e \le \frac{1}{2}\min_{0\le s\le 1}\exp[-C(s)], \tag{13}
\]

where $C(s)$ is known as the Chernoff information:

\[
C(s) \equiv -\ln\mathrm{E}\left[\Lambda^s(Y)\,|\,H_0\right], \tag{14}
\]

and $C(0.5)$ is called the Bhattacharyya distance. The Chernoff upper bound is useful for guaranteeing the testing accuracy, while the lower bound is more useful as a no-go theorem. The Chernoff information can be used to lower-bound the relative entropy as well:

\[
D(P_1||P_0) \ge \max_{0\le s\le 1}\frac{C(s)}{1-s}. \tag{15}
\]

Due to its decision-theoretic meaning for a finite number of trials and the asymptotic tightness of the upper bound in Eq. (13) [21], the Chernoff information is considered a more meaningful information measure than the relative entropy, although the former is often more difficult to compute.

For more details about decision theory, see Ref. [15]. For a discussion of decision theory in the context of scientific methods, see Ref. [17]. Shannon information theory should really be called communication theory and may be regarded as a branch of decision theory; see Ref. [18]. For the use of decision theory for engineering applications, see Refs. [20, 21].

3.5. Parameter estimation

Instead of considering just two hypotheses, let us consider the other extreme, where a continuum of hypotheses may be assumed, and rewrite the assumptions as a column vector of parameters $\theta$. The problem then becomes a parameter estimation problem. $\theta$ can be estimated by computing the posterior probability density $P(\theta|Y)$:

\[
P(\theta|Y) = \frac{P(Y|\theta)P(\theta)}{\int d\theta\,P(Y|\theta)P(\theta)}, \tag{16}
\]

where $P(\theta)$ is the prior probability density. As a measure of posterior uncertainty, a credible region for $\theta$ can be defined as the set $\Theta_c(Y)$ with a high posterior probability $P_c$ [16]:

\[
\int_{\theta\in\Theta_c} d\theta\,P(\theta|Y) = P_c, \tag{17}
\]

say, 95%. This allows us to dismiss the region outside $\Theta_c$ as improbable. Another common measure useful for defining error bars is the posterior mean and covariance matrix:

\[
\check\theta(Y) = \mathrm{E}(\theta|Y) = \int d\theta\,\theta P(\theta|Y), \tag{18}
\]
\[
\Pi(Y) = \mathrm{E}\left[(\theta-\check\theta)(\theta-\check\theta)^\top\,|\,Y\right] = \int d\theta\,(\theta-\check\theta)(\theta-\check\theta)^\top P(\theta|Y), \tag{19}
\]

where $\top$ denotes the matrix transpose.

A decision rule, called an estimator in this context, can also be obtained by specifying a loss function. For example, the mean-square error matrix is

\[
\Sigma \equiv \mathrm{E}\left[(\theta-\check\theta)(\theta-\check\theta)^\top\right] = \sum_Y\int d\theta\,(\theta-\check\theta)(\theta-\check\theta)^\top P(Y|\theta)P(\theta), \tag{20}
\]

which is minimized if we decide on the posterior mean. Like the error probability $P_e$, $\Sigma$ is usually difficult to compute exactly, so one often has to resort to approximations or bounds. The most popular information measure for parameter estimation is the Fisher information matrix $J(\theta)$, defined as

\[
J_{jk}(\theta) \equiv \sum_Y P(Y|\theta)\left[\frac{\partial}{\partial\theta_j}\ln P(Y|\theta)\right]\left[\frac{\partial}{\partial\theta_k}\ln P(Y|\theta)\right]. \tag{21}
\]

A useful identity is [22]

\[
J_{jk}(\theta) = 4\left[\frac{\partial^2}{\partial\theta_j\partial\theta_k}C(0.5;\theta,\theta')\right]_{\theta'=\theta}, \tag{22}
\]

where $C(0.5;\theta,\theta')$ is the Bhattacharyya distance given by Eq. (14) with $P(Y|H_0) = P(Y|\theta)$ and $P(Y|H_1) = P(Y|\theta')$.

The Fisher information determines general lower limits on the mean-square errors via the Cramér-Rao family of bounds [20, 23]. The Bayesian version is given by the following matrix inequality:

\[
\Sigma \ge \left(J + J_{\rm prior}\right)^{-1}, \tag{23}
\]
\[
J_{{\rm prior},jk} \equiv \int d\theta\,P(\theta)\left[\frac{\partial}{\partial\theta_j}\ln P(\theta)\right]\left[\frac{\partial}{\partial\theta_k}\ln P(\theta)\right], \tag{24}
\]
\[
J \equiv \int d\theta\,P(\theta)J(\theta), \tag{25}
\]


and is valid for any estimator. $J$ can then give us an idea of how accurate an experiment can be in resolving the parameters. An alternative family of lower bounds called the Ziv-Zakai bounds can also be computed using $C(0.5;\theta,\theta')$ and are often tighter than the Cramér-Rao bounds [23, 24].

3.6. Objective priors

For scientific tests, it is preferable to choose a prior distribution based on objective principles. One such principle is maximum entropy [17], which chooses the prior that maximizes the entropy $-\sum_j P(H_j)\ln P(H_j)$ in the presence of known constraints about $P(H_j)$. Justifications of this approach can be found in Refs. [17, 25]. For parameter estimation, a more popular choice is the Jeffreys prior [15, 16]:

\[
P(\theta) \propto \sqrt{\det J(\theta)}, \tag{26}
\]

where $J(\theta)$ is the Fisher information matrix given by Eq. (21). It has the advantage of giving the same probability measure $P(\theta)d\theta$ regardless of how the unknown parameters are defined.

One may also resort to decision theory and choose the so-called least favorable prior, which maximizes the Bayes risk given by Eq. (9) for the Bayes decision rule given by Eq. (11) [15]. It is the most conservative prior in the context of decision theory and has the advantage of producing a Bayes decision rule that coincides with the frequentist minimax rule [15], but it is often much more difficult to calculate than the other priors.

For more in-depth discussions of objective priors, see Refs. [15–17].

4. Quantum versus classical

4.1. Classical mechanics

Classical mechanics is a natural alternative hypothesis for quantum tests. Experiments and observations have verified its validity on a macroscopic level, such that one should assign a significant value for its prior probability. This prior cannot be too high either, as the quantum theory has also been well tested for simple systems, and many theorems rule out naive classical mechanics if the quantum theory is true. A “quantum versus classical” test is thus most interesting on a complexity level where the prior probabilities are comparable, if not equal.

Even if one does not personally believe in one of the theories on the level being tested, there are many reasons why the verification of a particular hypothesis is relevant to science and engineering:

1. Learning curve. Many people understand classical mechanics but quantum mechanics takes much more effort to learn. If classical mechanics is sufficient, the quantum model would be unnecessary for them.

2. Quantum simulation. Even if one knows quantum mechanics, solving it for macroscopic objects is still very hard. With current computers, classical mechanics can take far fewer resources to solve than known numerical methods for quantum mechanics.

3. Quantum computation. For a few problems, such as factoring large numbers, it has been suggested that a quantum computer can be superior to a classical one [26]. Quantum simulations might also be easier on a quantum computer. A test of quantum mechanics on a macroscopic level would shed light on the feasibility of a practical quantum computer.

4. Quantum information. Many limits on sensing and communication have been derived based on the quantum probability theory [24, 26–38], whereas classical mechanics is fundamentally deterministic. Emergent determinism would be good news for sensing near the quantum limits but bad news for quantum security protocols.

5. Quantum gravity. There are alternative theories about how gravity might modify quantum mechanics on a macroscopic level [11, 39, 40]. Such theories may be modeled using classical mechanics.

To clarify these issues, we should search for a classical mechanics model that is as close to the quantum theory as possible, such that, without an experiment, one has no evidence for one over the other, and the experiment can provide new and useful information that people do not already know.

To find “the most quantum” classical model, the correspondence principle is helpful in the first order, but becomes ambiguous when one attempts to relate higher-order Hilbert-space moments to classical statistics. To prevent prior intuition from limiting our imagination and cast a wider net, it is sometimes worthwhile to adopt a more abstract mathematical approach. The theory of quantum computation turns out to be useful in this way.

4.2. Classical simulability

One of the most general results about equivalent models from the quantum and classical theories is the Gottesman-Knill theorem [26] and its generalizations for continuous variables [41, 42]. The rough idea is that a certain class of models under the quantum probability theory is equivalent to classical hidden Markov models (HMM) [43], with


restrictions on the number of dimensions of the classical state space and the number of time steps. “Restrictions” is the key word here, as even the full quantum probability model can in principle be simulated on a classical computer, if one simply takes all the parameters that specify the quantum model and uses brute-force finite-element methods.

The classical simulability theorems are useful as no-go theorems: they rule out the necessity of the full quantum theory when the system can be described by a more succinct classical model. The hidden variables in such a model can correspond to incompatible observables; they obey uncertainty and measurement-disturbance relations via additional constraints on the probability distributions and system/observation noise sources.

The HMM is invaluable for classical estimation and control applications [43] and provides the proper foundation for any quantum versus classical debate. It is briefly reviewed in Appendices A–B, which also set the stage for the quantum probability theory that follows in Appendices C–F.

4.3. Testing the

Even for classically simulable systems, there are interesting quantum features to be tested. A test showing a modification of the uncertainty principle, for example, would be highly valuable to quantum gravity theory and relevant to quantum sensing applications, not to mention the Nobel prizes that are sure to follow, if the test is done with rigor and accuracy and leads to new physics.

Let us therefore focus on a classically simulable system in this section and use the HMM for all the hypotheses to be tested. Let $X$ be the hidden variables, and let us introduce additional parameters $\theta$ that define the HMM as follows:

\[
P(Y|H_j) = \int d\theta\,P(Y|\theta)P(\theta|H_j), \tag{27}
\]
\[
P(Y|\theta) = \int dX\,P(Y,X|\theta). \tag{28}
\]

For an optomechanics experiment, for instance, $X$ can include the canonical positions and momenta of optical and mechanical oscillators, while $\theta$ can include the resonance frequencies, the damping rates, the initial covariance matrix, and the system and observation noise power levels. This breaking down of a hypothesis into a hierarchy of more refined ones is very convenient for modeling and numerical analysis in practice. $H_j$ is then called a composite hypothesis.

For now, the hypotheses $H_j$ are assumed to differ only in their prior assumptions about $\theta$ according to $P(\theta|H_j)$. The quantum theory, for example, would manifest itself as inequalities that impose constraints on the allowable values of $\theta$, while alternative quantum gravity theories [11, 39, 40] may impose different constraints. Such hard constraints can be imposed by specifying a zero-probability set $\bar\Theta_j$:

\[
P(\theta|H_j) = 0 \quad\text{for}\quad \theta\in\bar\Theta_j. \tag{29}
\]

Other prior information, such as independent calibrations, can also be incorporated into $P(\theta|H_j)$.

A constructive strategy for composite hypothesis testing is as follows:

1. Compute $P(Y|\theta)$ for all plausible $\theta$, taking any advantage offered by the hidden structure in Eq. (28).

2. Combine $P(Y|\theta)$ into $P(Y|H_j)$ for each hypothesis, using the prior $P(\theta|H_j)$ and Eq. (27).

3. Compute the posterior probabilities $P(H_j|Y)$ using the Bayes theorem given by Eq. (1).

4. $P(Y|\theta)$ can also be used for parameter estimation without assuming any composite hypothesis.

A tutorial example of this Bayesian approach for optomechanics shall be presented in the next section.

If one is uncomfortable with any choice of prior, $P(Y|\theta)$ can also be used in frequentist tests. One example is the generalized likelihood-ratio test [21], which uses constrained maximum-likelihood estimates of $\theta$ in $P(Y|\theta)$ instead of the averaging.

4.4. An optomechanics example

4.4.1. Modeling

Consider the experiment on a cavity optomechanical system by Safavi-Naeini et al. [12, 13]. Let $\omega_a$ be the resonance frequency of an optical cavity mode and $\omega_b$ be that of a mechanical oscillator. A continuous-wave laser pump beam with detuned frequency $\omega_a - \omega_b$ is coupled into the system, causing a parametric interaction between the optical mode and the mechanical mode. The output optical field is then measured via heterodyne detection. The goal of the experiment is to infer properties of the mechanical oscillator motion from the noisy optical measurements.

Define $a(t)$ as the complex analytic signal of the optical mode field and $b(t)$ as that of the mechanical mode. By considering the Wigner representation of the output field, making appropriate rotating-wave approximations, and adding excess output noise for the heterodyne detection, we can obtain the following classical linear equations


of motion:

\[
\frac{da(t)}{dt} = igb(t) - \frac{\gamma_a}{2}a(t) + \sqrt{\gamma_a}A(t), \tag{30}
\]
\[
\frac{db(t)}{dt} = ig^*a(t) - \frac{\gamma_b}{2}b(t) + \sqrt{\gamma_b}B(t), \tag{31}
\]
\[
A_-(t) = \sqrt{\gamma_a}a(t) - A(t) + A'(t), \tag{32}
\]

where $g$ is an optomechanical coupling constant proportional to the field of the pump beam, $\gamma_a$ and $\gamma_b$ are the damping rates of the optical and mechanical modes, respectively, $A(t)$ is an optical input noise source, $B(t)$ is a mechanical noise source, $A_-(t)$ is the output field near $\omega_a$ to be measured by heterodyne detection, and $A'(t)$ is the excess output noise. These equations suggest that there is a coherent energy exchange between the optical and mechanical modes enabled by the pump.

The noise sources are assumed to be zero-mean, phase-insensitive, and uncorrelated with one another, with power levels defined by

\[
\mathrm{E}\left[A(t)A^*(t')\,|\,\theta\right] = S_A\delta(t-t'), \tag{33}
\]
\[
\mathrm{E}\left[B(t)B^*(t')\,|\,\theta\right] = S_B\delta(t-t'), \tag{34}
\]
\[
\mathrm{E}\left[A'(t)A'^*(t')\,|\,\theta\right] = S_A'\delta(t-t'). \tag{35}
\]

Steady-state initial conditions can also be assumed. The derivation of these classical equations of motion is a standard exercise in quantum optics [44]; similar derivations have been reported in Refs. [13, 45–47]. As discussed in Appendices B and 4, this model is equivalent to a continuous-time hidden Gauss-Markov model (HGMM) [43].

In another set of measurements, a blue-detuned pump beam with frequency $\omega_a + \omega_b$ is used instead. The equations of motion are

\[
\frac{da(t)}{dt} = igb^*(t) - \frac{\gamma_a}{2}a(t) + \sqrt{\gamma_a}A(t), \tag{36}
\]
\[
\frac{db(t)}{dt} = iga^*(t) - \frac{\gamma_b}{2}b(t) + \sqrt{\gamma_b}B(t), \tag{37}
\]
\[
A_+(t) = \sqrt{\gamma_a}a(t) - A(t) + A'(t). \tag{38}
\]

These equations suggest a two-mode parametric amplification mechanism that is different from the first experiment. Note that this hidden-variable model is similar to that for the first set of measurements. This is a result of using the Wigner representation. If the Sudarshan-Glauber or Husimi representations [44] had been used instead, the model would have to be modified more substantially, leading to needless complexity. The Wigner representation can be used with minimal changes for homodyne detection as well, so it is the best method at our disposal for deriving classical dynamical models with the least amount of contextuality; see Appendices 3 and 4 for further discussions about the Wigner representations.

For simplicity, we assume that the parameters have not drifted from those in the first set of measurements, and $g$, $\gamma_a$, $\gamma_b$, and $S_A'$ are so accurately determined prior to the experiments that they can be regarded as being known exactly. The only unknowns included in $\theta$ are the system noise power levels:

\[
\theta = \begin{pmatrix} S_A \\ S_B \end{pmatrix}, \tag{39}
\]

and we seek to perform hypothesis testing and parameter estimation based on the information gained about these parameters from the measurements.

4.4.2. Power spectral densities

Before discussing the statistical hypothesis testing method, let us first consider the expected infinite-time statistics. The most important ones are the power spectral densities:

\[
S_\pm(\omega|\theta) \equiv \lim_{T\to\infty}\frac{1}{T}\mathrm{E}\left[\left.\left|\int_0^T dt\,A_\pm(t)\exp(i\omega t)\right|^2\,\right|\,\theta\right]. \tag{40}
\]

It is easy to show that

\[
S_-(\omega|\theta) = S_A' + S_A + |\chi_-(\omega)|^2(S_B - S_A), \tag{41}
\]
\[
S_+(\omega|\theta) = S_A' + S_A + |\chi_+(\omega)|^2(S_B + S_A), \tag{42}
\]

where $\chi_\pm(\omega)$ are the transfer functions that depend on the other known parameters. Since

\[
\frac{4|g|^2}{\gamma_a\gamma_b} \ll 1 \tag{43}
\]

in the experiment, $|\chi_+(\omega)|^2 \approx |\chi_-(\omega)|^2$, and the asymmetry of the two spectra can be attributed to the presence of $S_A$, the input optical noise [45, 47].

Another statistic of interest is the steady-state mechanical energy:

\[
\lim_{t\to\infty}\mathrm{E}\left[|b(t)|^2\,|\,\theta\right] \approx S_B. \tag{44}
\]

With appropriate normalizations, the quantum theory will result in the following constraints:

\[
S_A \ge 0.5 \quad\text{and}\quad S_B \ge 0.5, \tag{45}
\]

91 Mankei Tsang

which are manifestations of the uncertainty principle for the optical and mechanical quadratures. Quantum gravity theories might violate or modify the uncertainty principle, resulting in different constraints [11, 39, 40]. For example, a quantum gravity theory may assume

$$S_B \ge 0.5 + \epsilon, \tag{46}$$

where $\epsilon$ is a parameter that depends on the mechanical mass [39].

4.4.3. Parallel Kalman filters

Statistics cannot be measured exactly in finite time, so let us turn to the Bayesian approach to characterize the uncertainties. Our first task is to compute $P(Y|\theta)$ for many points that cover the two-dimensional plane of $\theta = (S_A, S_B)^\top$. We can take advantage of the Gauss-Markov property of the model and numerically compute $P(Y|\theta)$ efficiently using the famous Kalman filter in a multiple-model approach [48]. The procedure is as follows:

1. For the first set of measurements and each $\theta$, define a normalized vectoral observation process as

$$y_t \doteq \sqrt{\frac{2}{S_A + S_A'}} \int_0^t d\tau \begin{pmatrix} \operatorname{Re} A_-(\tau) \\ \operatorname{Im} A_-(\tau) \end{pmatrix}, \tag{47}$$

such that the white noise in $y_t$ is normalized to give

$$dy_t\, dy_t^\top = I\, dt, \tag{48}$$

with $I$ being the identity matrix.

2. Pass $y_t$ through a Kalman filter that assumes the same $\theta$ and Eqs. (30)–(32). Specifically, let

$$Y_t \doteq \{y_\tau; 0 \le \tau \le t\} \tag{49}$$

be the observation record up to time $t$, and

$$x_t \doteq \begin{pmatrix} \operatorname{Re} a(t) \\ \operatorname{Im} a(t) \\ \operatorname{Re} b(t) \\ \operatorname{Im} b(t) \end{pmatrix} \tag{50}$$

be the state vector. The Kalman filter [31, 48–50] is an algorithm that determines the Gaussian posterior distribution $P(x_t|Y_t, \theta)$ given the past observation record $Y_t$ by computing its mean and covariance matrix (see Appendices 2 and 5 for the formulas).

3. Combine the outputs from the Kalman filter with $y_t$ to obtain $P(Y_-|\theta)$ for the first set of measurements. In continuous time, the formula is [51]

$$P(Y_-|\theta) = P_W(Y_-) \exp\left[\int_0^T dy_t^\top \mu_t(\theta) - \frac{1}{2} \int_0^T dt\, \mu_t^\top(\theta) \mu_t(\theta)\right], \tag{51}$$

where $P_W(Y_-)$ is the probability measure of a vectoral Wiener process (with zero increment mean and variance $dy_t\, dy_t^\top = I\, dt$), $\mu_t(\theta)$ are the filtering estimates of the following state variables:

$$\mu_t(\theta) \doteq \sqrt{\frac{2}{S_A + S_A'}} \begin{pmatrix} \operatorname{Re} \operatorname{E}[\sqrt{\gamma_a}\, a(t)|Y_t, \theta] \\ \operatorname{Im} \operatorname{E}[\sqrt{\gamma_a}\, a(t)|Y_t, \theta] \end{pmatrix}, \tag{52}$$

which can be extracted from the Kalman filter estimates $\operatorname{E}(x_t|Y_t, \theta)$, and the $dy_t$ integral is an Itō integral, that is, $dy_t$ should be the increment ahead of $t$ and $\mu_t$ should not depend on $dy_t$. Note that, in any computation of the posterior distribution of $\theta$, $P_W(Y_-)$ appears in both the numerator and denominator of the Bayes theorem and, being independent of $\theta$, cancels itself.

4. Repeat Steps 1–3 for the second set of measurements to obtain $P(Y_+|\theta)$, assuming Eqs. (36)–(38). $P(Y|\theta)$ is then $P(Y_+|\theta) P(Y_-|\theta)$.

5. Repeat Steps 1–4 for all possible $\theta$.

The tricky part is Step 5, as we need to set an appropriately large but fine grid that discretizes $\theta$ in practice. Fortunately, the Kalman filters can be computed in parallel for different values of $\theta$, so we can exploit parallel computing power to sweep many $\theta$ values, until $P(Y|\theta)$ becomes relatively smooth inside the considered region and negligible outside it.

4.4.4. Expected information

For a useful guide on how to construct the grid for the parallel Kalman filters and also how well the signal processing technique is expected to work, we can consult the information measures introduced in Sec. 3. Consider, for example, two hypotheses with precise assumed values for $\theta$. The problem then becomes a discrimination between two vectoral, complex, stationary, zero-mean, and Gaussian processes with power-spectral-density matrices

$$S_0 = \begin{pmatrix} S_-(\omega|\theta_0) & 0 \\ 0 & S_+(\omega|\theta_0) \end{pmatrix}, \tag{53}$$

$$S_1 = \begin{pmatrix} S_-(\omega|\theta_1) & 0 \\ 0 & S_+(\omega|\theta_1) \end{pmatrix}. \tag{54}$$
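The multiple-model procedure of Sec. 4.4.3 can be illustrated with a much simpler stand-in. The sketch below is not the optomechanical model: it assumes a hypothetical scalar, discrete-time Gauss-Markov model with one unknown noise power `q` playing the role of θ, and all names (`kalman_loglik`, `simulate`, `q_grid`) are illustrative. It runs one Kalman filter per grid point, accumulates ln P(Y|θ) from the innovations, and normalizes over the grid with a flat prior.

```python
import numpy as np


def kalman_loglik(y, a, q, r, m0=0.0, p0=1.0):
    """Return ln P(Y|theta) for an assumed scalar toy model:
        x[k] = a * x[k-1] + w[k],  w ~ N(0, q)
        y[k] = x[k] + v[k],        v ~ N(0, r)
    m0, p0 are the mean and variance of the state one step before y[0]."""
    m, p = m0, p0
    loglik = 0.0
    for yk in y:
        # Prediction step: propagate mean and variance through the dynamics.
        m, p = a * m, a * a * p + q
        # Innovation and its variance give the one-step likelihood.
        s = p + r
        e = yk - m
        loglik += -0.5 * (np.log(2.0 * np.pi * s) + e * e / s)
        # Update step: condition on the new observation.
        gain = p / s
        m, p = m + gain * e, (1.0 - gain) * p
    return loglik


def simulate(n, a, q, r, rng):
    """Draw an observation record from the same toy model."""
    x = 0.0
    y = np.empty(n)
    for k in range(n):
        x = a * x + rng.normal(scale=np.sqrt(q))
        y[k] = x + rng.normal(scale=np.sqrt(r))
    return y


rng = np.random.default_rng(0)
a_true, q_true, r_true = 0.9, 1.0, 0.5
y = simulate(5000, a_true, q_true, r_true, rng)

# One independent Kalman filter per grid point; these runs could be
# distributed over parallel workers since they do not share state.
q_grid = np.array([0.25, 0.5, 1.0, 2.0, 4.0])
loglik = np.array([kalman_loglik(y, a_true, q, r_true) for q in q_grid])

# Posterior on the grid under a flat prior; subtracting the maximum
# avoids overflow when exponentiating large log-likelihoods.
posterior = np.exp(loglik - loglik.max())
posterior /= posterior.sum()
```

Working with log-likelihoods and normalizing only at the end mirrors the cancellation of the θ-independent factor noted in Step 3: any factor common to all grid points drops out of the posterior.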


The relative entropy and the Chernoff information have the following long-time limits [52]:

$$\lim_{T\to\infty} \frac{D(P_1\|P_0)}{T} = \int \frac{d\omega}{2\pi} \left[\operatorname{tr} S_0^{-1}(S_1 - S_0) - \ln\left|S_0^{-1} S_1\right|\right], \tag{55}$$

$$\lim_{T\to\infty} \frac{C(s)}{T} = \int \frac{d\omega}{2\pi} \ln \frac{|(1-s) S_0 + s S_1|}{|S_0|^{1-s} |S_1|^s}, \tag{56}$$

where $|\cdot|$ is the determinant and the frequency integral should be applied to only positive frequencies if the processes are real. These expressions show that the information measures should increase linearly with time asymptotically. The increase of information with time is important, as it suggests that one can always compensate for a bad signal-to-noise ratio by increasing the measurement time.

For parameter estimation, the Chernoff information given by Eq. (56) can be used to compute the Cramér-Rao bound via Eq. (22) and the Ziv-Zakai bounds [23]. These parameter-estimation bounds are especially useful for setting the grid resolution for the parallel Kalman filters.

4.4.5. Hypothesis testing and parameter estimation

After the hard work of computing $P(Y|\theta) = P(Y|S_A, S_B)$, we can now test the composite hypotheses about the uncertainty principle by considering various $P(S_A, S_B|H_j)$. First consider the hypotheses used by Ref. [12]. One hypothesis assumes a classical model with equal spectra for Eqs. (41) and (42), meaning that $S_A = 0$, and the other one assumes a quantum model with $S_A = 0.5$. This implies

$$P(S_A, S_B|H_0) = \delta(S_A) P(S_B|H_0), \tag{57}$$

$$P(S_A, S_B|H_1) = \delta(S_A - 0.5) P(S_B|H_1). \tag{58}$$

The difference in the assumed optical noise powers $S_A$ can make the test favor one hypothesis over the other even if the data contain no significant information about the mechanical mode (see, however, Ref. [53] for a different opinion). It is obvious that one can infer a lot more information about $S_B$ from the measurements (as was done in Ref. [13]), and the hypotheses should make different assumptions about $S_B$, not $S_A$, if a test of the mechanics is intended.

Without any obvious choice of $P(S_A, S_B|H_j)$, we can also treat the problem as parameter estimation using an objective prior $P(S_A, S_B)$. The Jeffreys prior given by Eq. (26) can be approximated by considering Eq. (56) and using the identity in Eq. (22). The posterior distribution is then

$$P(S_A, S_B|Y) = \frac{P(Y|S_A, S_B) P(S_A, S_B)}{\int dS_A\, dS_B\, P(Y|S_A, S_B) P(S_A, S_B)}, \tag{59}$$

which can be plotted graphically for visual impact, and a credible region can be assigned according to Eq. (17). To claim a successful observation of zero-point mechanical motion, the whole credible region should be close to $S_B = 0.5$. When the mechanical oscillator is very close to absolute zero, the credible region may also be used to rule out modified uncertainty principles given by Eq. (46) by noting the values of $S_B$ that are well outside the credible region.

Using an atomic ensemble as the mechanical oscillator, Brahms et al. have performed an experiment [54] similar to the one we have studied. A careful analysis of this experiment is left as an exercise for the reader.

4.5. Caveat: systematic errors

The Bayesian approach can perform worse than expected if the model assumptions do not hold in practice. The errors due to wrong assumptions are commonly called systematic errors. Here is a list of possible sources:

1. Parameter uncertainties. In our optomechanics example in Sec. 4.4, we have assumed that some of the parameters, such as the resonance frequencies and the damping rates, are known exactly in advance. If not, one useful system identification method for prior calibration is the expectation-maximization (EM) algorithm, which is able to estimate most (not all) parameters of a homogeneous HMM [55]; see also Ref. [56] for an application of the EM algorithm to an optomechanics experiment. If the parameters cannot be estimated exactly in advance, they would need to be included in $\theta$.

2. Parameter drifts. A more serious problem occurs if the parameters are both unknown and drifting in time. Stationary statistics can no longer be assumed, and the Kalman filters cease to be optimal if the parameter drift is random. To deal with this, we have to take the parameters as part of the hidden state variables and perform nonlinear estimation. Optimal nonlinear estimation is extremely difficult to implement, but there exist many battle-tested approximations. Methods based on Kalman filters include the extended and unscented Kalman filters [50]. A notable example is the Gravity Probe B experiment, which relies on the unscented Kalman filter to perform the parameter estimation [57].

3. Parameter ambiguities. If there are too many unknown parameters, different combinations of the parameters may lead to the same $P(Y|\theta)$, and the data would not be able to distinguish such possibilities. Ignoring the alternatives may lead to serious actual errors and over-confidence in the estimates.


To avoid committing this error, minimizing the number of unknown parameters helps tremendously. For simple models, this can be done by considering similarity transformations [21, 58], a technique for finding equivalent models that give the same observation statistics and discovering parameter redundancies.

The use of similarity transformations is especially important for the EM algorithm [21], as the algorithm can be formulated to treat all parameters of a model as unknown and produce one set of estimates, ignoring all the other possibilities and giving one a false sense of certainty. If one is still left with too many parameters after careful considerations, independent calibrations and experiments to provide prior evidence for $P(\theta|H_j)$ would be needed to narrow down the unknowns further.

4. Model mismatch. Our model in Sec. 4.4 ignores the complication that the mechanical mode is coupled to another optical mode via laser cooling [12]. This means that Eqs. (31) and (37) are approximations. A higher-order HMM, that is, one with more state variables, is needed to model the actual situation more accurately, especially if there are other noticeable resonances in the data.

A more troubling implication for fundamental physics tests is that the mechanical noise $B(t)$ actually has a significant optical origin due to the laser cooling. If we already assume that an optical source must have $S_A \ge 0.5$, it would be inconsistent to assume that $S_B$ may go below 0.5. One needs to formulate the hypotheses much more carefully to avoid logical inconsistencies such as this.

Systematic errors are “unknown unknowns”: things we do not know we don’t know [59]. They are much harder to catch, and worse still, ignoring them may result in misplaced confidence in one’s estimates. To deal with such errors, it is a good idea in general to be conservative with the prior assumptions, use different inference algorithms to cross-check the results, and perform independent calibrations if possible.

4.6. Testing quantum jumps

Let us come back to the mechanical oscillator and study its energy dynamics. Under the linear model described in Sec. 4.4, the equation of motion for the analytic signal in the absence of measurements would be

$$\frac{db(t)}{dt} = -\frac{\gamma}{2} b(t) + \sqrt{\gamma} B(t), \tag{60}$$

where we have suppressed the $b$ subscripts for clarity and will also write $S = S_B$. Consider the mechanical energy. Under classical mechanics, it would be defined as

$$\varepsilon(t) = |b(t)|^2. \tag{61}$$

To derive an equation of motion for it, we should use stochastic calculus. From Itō calculus, the result is

$$d\varepsilon(t) = -\gamma\left[\varepsilon(t) - S\right] dt + \sqrt{2\gamma S \varepsilon(t)}\, dW(t), \tag{62}$$

where $dW(t)$ is a Wiener increment and models white noise [60].

An alternative representation of the dynamics is the forward Kolmogorov equation [60]:

$$\frac{\partial}{\partial t} P(\varepsilon, t) = \int d\varepsilon'\, A(\varepsilon|\varepsilon') P(\varepsilon', t), \tag{63}$$

also known as the Fokker-Planck equation or the master equation. The transition function $A(\varepsilon|\varepsilon')$, assuming the linear model (designated as $H_0$), is

$$A(\varepsilon|\varepsilon', H_0) = \gamma \frac{\partial}{\partial\varepsilon} \delta(\varepsilon - \varepsilon')(\varepsilon' - S) + \gamma S \frac{\partial^2}{\partial\varepsilon^2} \delta(\varepsilon - \varepsilon') \varepsilon'. \tag{64}$$

The steady-state distribution for $\varepsilon(t)$ is given by

$$P_{ss}(\varepsilon|H_0) = \frac{1}{S} \exp\left(-\frac{\varepsilon}{S}\right), \tag{65}$$

with moments

$$\operatorname{E}_{ss}[\varepsilon^m|H_0] = m!\, S^m. \tag{66}$$

For example, the mean and variance are

$$\bar\varepsilon_0 \doteq \operatorname{E}_{ss}[\varepsilon|H_0] = S, \tag{67}$$

$$\Delta\varepsilon_0^2 \doteq \operatorname{E}_{ss}\left[(\varepsilon - \bar\varepsilon_0)^2 \,\big|\, H_0\right] = S^2. \tag{68}$$

All the properties of the continuous energy model should be consistent with the statistics of homodyne or heterodyne detection in an optomechanics experiment; after all, all we have done is a change of variables.

Eq. (62) predicts a continuous energy, whereas the quantum theory can also result in a discrete energy model if we measure in the phonon-number basis. Experimental progress towards such a measurement in optomechanics is reported by Thompson et al. [61] and Sankey et al. [62].
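As a rough numerical check of the continuous energy model, one can simulate Eq. (60) and compare the sample mean and variance of ε(t) = |b(t)|² with Eqs. (67) and (68). The sketch below makes its own assumptions: it uses the exact one-step update of a complex Ornstein-Uhlenbeck process rather than a direct discretization of Eq. (62), and the function name `simulate_energy` and the parameter values are illustrative only.

```python
import numpy as np


def simulate_energy(n_steps, gamma, S, dt, rng):
    """Sample eps(t) = |b(t)|^2 for the linear model of Eq. (60).

    Assumes phase-insensitive complex white noise B(t) with power S,
    so that the steady-state value of E|b|^2 is S."""
    decay = np.exp(-0.5 * gamma * dt)
    # Noise amplitude that keeps E|b|^2 = S invariant under the update.
    sigma = np.sqrt(S * (1.0 - decay**2))
    # Complex Gaussian noise normalized to unit mean-square magnitude.
    noise = (rng.standard_normal(n_steps)
             + 1j * rng.standard_normal(n_steps)) / np.sqrt(2.0)
    # Start from a steady-state sample to avoid an initial transient.
    b = np.sqrt(S / 2.0) * (rng.standard_normal() + 1j * rng.standard_normal())
    eps = np.empty(n_steps)
    for k in range(n_steps):
        b = decay * b + sigma * noise[k]
        eps[k] = abs(b) ** 2
    return eps


rng = np.random.default_rng(1)
S = 2.0
eps = simulate_energy(200_000, gamma=1.0, S=S, dt=0.5, rng=rng)

mean_ratio = eps.mean() / S        # expected to be close to 1, Eq. (67)
var_ratio = eps.var() / S**2       # expected to be close to 1, Eq. (68)
```

The two ratios fluctuate around one for a finite record, which is a small reminder that statistics cannot be measured exactly in finite time.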


The discrete jumps mean that it is difficult to write an equation of motion that resembles Eq. (62), and it is more common to represent the dynamics just by the forward Kolmogorov equation given by Eq. (63). For the damped quantum harmonic oscillator, we have [63]

$$A(\varepsilon|\varepsilon', H_1) = \delta(\varepsilon - 1 - \varepsilon') \Gamma_+(\varepsilon') + \delta(\varepsilon + 1 - \varepsilon') \Gamma_-(\varepsilon') - \delta(\varepsilon - \varepsilon')\left[\Gamma_+(\varepsilon') + \Gamma_-(\varepsilon')\right], \tag{69}$$

where $\Gamma_\pm$ are the jumping rates:

$$\Gamma_\pm(\varepsilon') = \gamma (S \mp 0.5)(\varepsilon' \pm 0.5), \tag{70}$$

and $\varepsilon$ is restricted to discrete levels:

$$\varepsilon \in \{0.5, 1.5, \dots\}. \tag{71}$$

The steady state is

$$P_{ss}(\varepsilon|H_1) = \frac{1}{S + 0.5} \left(\frac{S - 0.5}{S + 0.5}\right)^{\varepsilon - 0.5}, \tag{72}$$

with the mean and variance given by

$$\bar\varepsilon_1 = S, \tag{73}$$

$$\Delta\varepsilon_1^2 = S^2 - 0.5^2. \tag{74}$$

In practice, $\varepsilon(t)$ is a hidden variable and observed indirectly, so a measurement model should be constructed to include observation noise and any measurement backaction effect.

Although Eq. (69) is also a classical HMM, it is radically different from the HGMM suitable for heterodyne and homodyne detection, meaning that, in order to reproduce the quantum theory, the classical models are contextual with respect to the type of measurement being performed.

Given the prior success of the linear model, the hypothesis given by Eq. (64) is a compelling alternative to Eq. (69). Evidence for the discrete energy model against the continuous alternative would be a more direct confirmation of the original quantal hypothesis and, together with the observation of the uncertainty principle, a convincing demonstration of the quantum probability theory for mechanics.

There are two ways of testing the discrete-energy hypothesis, both difficult but in different ways. One is to sample $\varepsilon(t)$ at very sparse time intervals, such that the samples can be assumed to be independent and identically distributed (i.i.d.), and the test becomes a simple one between the steady-state distributions given by Eqs. (65) and (72). The statistical analysis is relatively easy given the i.i.d. property, but the procedure is very inefficient, especially if the observation noise is high, as it throws away most of the data that can be obtained in between the sampling times.

A much more efficient method is to consider a continuous measurement of $\varepsilon(t)$ and perform hypothesis testing on the whole record and discriminate between Eqs. (64) and (69) directly, as proposed by Tsang [51]. The statistical analysis becomes much more complicated, however, as the observation processes are highly non-Gaussian. Stochastic calculus helps [51, 64], but analytic results are more difficult to obtain than the ones for the linear model in Sec. 4.4. We leave this interesting problem for future work.

Note that there are alternative approaches to testing quantum jumps that are not based on statistical hypothesis testing [65–69]. A critique of these methods is left as an exercise for the reader.

4.7. Contextuality

With the demonstrations of the uncertainty principle and the discrete energy, the quantum proposition would become a lot more attractive: we get two contextual classical models for the price of one. Yet the skeptics may still ask the following:

1. Is there a noncontextual classical model, beyond the representations we have considered, that can explain both phenomena, or all quantum phenomena in general?

2. Are two contextual models really that bad, if it means one can avoid the Hilbert-space theory?

To address the first question, we can appeal to the Bell-Kochen-Specker theorem, which is a no-go theorem that rules out the possibility of one noncontextual classical model to explain all quantum phenomena, if we impose certain restrictions on the classical state variables [14, 70]. Of course, it is still a fundamental open question whether an efficient classical description of quantum mechanics is possible if we relax the restrictions somewhat.

To address the second question, we can appeal to the power of quantum computation: it is known that linear bosonic dynamics, together with discrete-energy sources and measurements, is sufficient to perform universal quantum computation [71] and solve difficult problems [72] efficiently. This means that, if an experiment performs operations that require switching between the different contexts, naive contextual models can fail, as it is not even known if an efficient classical description exists at all.
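The steady-state claims for the discrete energy model in Sec. 4.6 can be checked numerically. The following sketch, with illustrative values γ = 1 and S = 2 and a truncation at 400 levels that is an assumption of this example, evaluates Eq. (72) on the levels of Eq. (71) and verifies normalization, the moments in Eqs. (73) and (74), and the balance between upward and downward jumps implied by the rates in Eq. (70).

```python
import numpy as np

gamma, S = 1.0, 2.0                      # illustrative values, not from the experiment
levels = 0.5 + np.arange(400)            # eps in {0.5, 1.5, ...}, truncated

# Steady-state distribution of Eq. (72).
p_ss = (1.0 / (S + 0.5)) * ((S - 0.5) / (S + 0.5)) ** (levels - 0.5)

# Jumping rates of Eq. (70): upper sign for upward, lower sign for downward jumps.
rate_up = gamma * (S - 0.5) * (levels + 0.5)
rate_down = gamma * (S + 0.5) * (levels - 0.5)

total = p_ss.sum()
mean = np.sum(levels * p_ss)
variance = np.sum((levels - mean) ** 2 * p_ss)

# In the steady state the probability flux from each level to the one above
# should equal the flux coming back down.
flux_up = p_ss[:-1] * rate_up[:-1]
flux_down = p_ss[1:] * rate_down[1:]
```

For S = 2 the variance is S² − 0.25 = 3.75, slightly below the value S² = 4 of the continuous model in Eq. (68); the difference grows in relative terms as S approaches 0.5.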


4.8. Nonlocality

Contextuality is a serious inconvenience for classical models, but we may also ask whether there are other more fundamental reasons for finally giving up on classical mechanics. Bell’s theorem and its generalizations try to provide one by pitting classical mechanics against special relativity: to reproduce the quantum theory, the classical hidden variables must be able to communicate at superluminal speeds [14]. Moreover, by providing explicit inequalities that classical models with local hidden variables must obey, the Bell theorems can be tested experimentally. The interested readers are referred to more knowledgeable sources [14, 70, 73, 74] on this topic; we emphasize only that statistical hypothesis testing methods can and should be applied to such tests, as proposed by Peres [75] and van Dam et al. [76].

Rather than focusing on the constraints and the no-go theorems, we might ask a more positive question: does the contextuality or the nonlocality of a quantum system confer any useful advantage in information processing applications? It is perhaps this question that inspired the emergence of quantum information science [26, 77], and it is perhaps a critical examination of this question that will ensure a sustainable development of the field.

5. Conclusion

As the take-home message, we conclude with the following quote:

We’ve learned from experience that the truth will come out. Other experimenters will repeat your experiment and find out whether you were wrong or right. Nature’s phenomena will agree or they’ll disagree with your theory. And, although you may gain some temporary fame and excitement, you will not gain a good reputation as a scientist if you haven’t tried to be very careful in this kind of work. And it’s this type of integrity, this kind of care not to fool yourself, that is missing to a large extent in much of the research in Cargo Cult Science.

... So I have just one wish for you—the good luck to be somewhere where you are free to maintain the kind of integrity I have described, and where you do not feel forced by a need to maintain your position in the organization, or financial support, or so on, to lose your integrity. May you have that freedom. —Richard P. Feynman [78]

Acknowledgments

It is impossible to list all the people who help shape the views expressed here through various forms of interactions, but surely the most important are the ones who have also paid me while I indulge in these issues, including Carl Caves, Jeff Shapiro, Seth Lloyd, Demetri Psaltis, the various funding agencies, and the National University of Singapore. I also thank Kurt Jacobs, who provided some insightful suggestions, and my group members, who are the main motivation for my writing this tutorial. This work is supported by the Singapore National Research Foundation under NRF Grant No. NRF-NRFF2011-07.

Appendix A: HIDDEN MARKOV MODELS (HMM)

An HMM expresses the probability function $P(Y)$ of an observed variable $Y$ in terms of a hidden variable $X$ as follows:

$$P(Y) = \sum_X P(Y, X), \tag{A1}$$

with assumptions about how the variables are related to each other.

In the following I define the most basic type of HMM with discrete time and discrete possibilities, following closely the treatment in Ref. [43].

1. State vector

At each time, the system of interest is in one of $N$ possible states. A possibility is denoted by $n$, $n = 1, \dots, N$. For example, with $D$ bits, $N = 2^D$, and each $n$ denotes a particular bit sequence. The state at time $k$ is represented by a vector

$$x_k \in \left\{e^0, \dots, e^{N-1}\right\} \tag{A2}$$

in state space, and the global state vector $X$ is

$$X = \{x_K, \dots, x_0\}. \tag{A3}$$

Here the superscripts are indices and should not be confused with powers; the meaning should be clear given the context. The unit vectors $e^n$ represent the different possibilities of a state. They are in an $N$-dimensional Euclidean state space:

$$e^0 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e^1 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \dots \tag{A4}$$


and orthogonal to each other in terms of the inner product:

$$\langle e^n, e^m \rangle = \delta^{nm} \doteq \begin{cases} 1, & n = m, \\ 0, & \text{otherwise}. \end{cases} \tag{A5}$$

2. State functions

It is important to emphasize that $x$ is an indicator of the possibility and does not carry any other physical property. A property of a state can be quantified by a value $F^n$ assigned to each possibility $n$. To write the value as a function $F(x)$, let

$$\langle e^n, x \rangle = 1_{e^n}(x) \in \{0, 1\} \tag{A6}$$

be the $n$th component of $x$, where $1_{e^n}(x)$ is an indicator function:

$$1_\Xi(x) \doteq \begin{cases} 1, & x \in \Xi, \\ 0, & \text{otherwise}. \end{cases} \tag{A7}$$

$\langle e^n, x \rangle$ is a binary (“yes-no”) variable that indicates whether $x$ is in state $n$. We can then write

$$F(x) = \sum_n F^n \langle e^n, x \rangle. \tag{A8}$$

Note the subtle difference between a function $F(x)$ and its possible values $F^n$. For example, the identity function is

$$I x = \sum_n e^n \langle e^n, x \rangle = x, \tag{A9}$$

and when multiplied by a matrix $A$,

$$A x = \sum_{n,m} A^{nm} e^n \langle e^m, x \rangle. \tag{A10}$$

We see that the definition of a state function here depends heavily on the assumption that the system is always in one and only one of the possible states.

3. Initial probability function

At time $k = 0$, a nonnegative probability $P_0^n$ is assigned to each $e^n$. The probability function of a state variable $x_0$ is written as

$$P(x_0) = \sum_n P_0^n \langle e^n, x_0 \rangle. \tag{A11}$$

The probability distribution can be extracted from the function by

$$P_0^n = P(x_0 = e^n). \tag{A12}$$

Note the subtle difference between the function $P(x_0)$ and the distribution $P_0^n$.

4. Markovianity

The state described by $X$ is hidden and inferred only through an observed variable $Y$. Similar to $X$, $Y$ can also be broken down into a series of observation state vectors

$$y_k \in \left\{f^1, \dots, f^M\right\}, \tag{A13}$$

$$Y = \{y_K, \dots, y_1\}, \tag{A14}$$

where $f^n$ are unit vectors similar to $e^n$. Define

$$Y_k \doteq \{y_k, \dots, y_1\}, \tag{A15}$$

$$X_k \doteq \{x_k, \dots, x_0\}. \tag{A16}$$

The Markov property assumes that $x_{k+1}$ and $y_{k+1}$ depend only on the previous $x_k$, such that

$$P(y_{k+1}, x_{k+1}|Y_k, X_k) = P(y_{k+1}, x_{k+1}|x_k), \tag{A17}$$

which leads to

$$P(Y, X) = P(y_K, x_K|x_{K-1}) \cdots P(y_1, x_1|x_0) P(x_0). \tag{A18}$$

It is common to assume that the system noise and the observation noise are independent:

$$P(y_{k+1}, x_{k+1}|x_k) = P(y_{k+1}|x_k) P(x_{k+1}|x_k), \tag{A19}$$

although this is often not the case in classical models of quantum optics.

We now have the complete specification of an HMM, and in principle we can use it to calculate any multi-time statistic by taking the expectation of any function $F(X, Y)$. For further details, more general HMM, and their applications, see Ref. [43].

5. Bayesian filtering

For simplicity, in the following we use the same notation to denote probability functions and probability distributions. For example, $P(x_k = e^n)$ is written as $P(x_k)$, and $\sum_{x_k} P(x_k)$ is taken to mean $\sum_n P(x_k = e^n)$.

Bayesian filtering is a signal-processing technique that computes the posterior distribution $P(x_k|Y_k)$ given the immediate past record $Y_k$. For an HMM, we can obtain a


recursive formula via the following:

$$P(x_{k+1}|Y_{k+1}) = \frac{P(x_{k+1}, Y_{k+1})}{P(Y_{k+1})} \tag{A20}$$

$$= \frac{\sum_{x_k} P(x_{k+1}, x_k, y_{k+1}, Y_k)}{P(Y_{k+1})} \tag{A21}$$

$$= \frac{\sum_{x_k} P(x_{k+1}, x_k, y_{k+1}|Y_k) P(Y_k)}{P(Y_{k+1})} \tag{A22}$$

$$= \frac{\sum_{x_k} P(y_{k+1}, x_{k+1}|x_k, Y_k) P(x_k|Y_k)}{P(y_{k+1}|Y_k)} \tag{A23}$$

$$= \frac{\sum_{x_k} P(y_{k+1}, x_{k+1}|x_k) P(x_k|Y_k)}{P(y_{k+1}|Y_k)}. \tag{A24}$$

In other words, $P(x_{k+1}|Y_{k+1})$ is obtained by starting from the initial condition $P(x_0)$, propagating $P(x_k|Y_k)$ forward using $P(y_{k+1}, x_{k+1}|x_k)$, and normalizing the resulting expression. Continuous-time limits of the filtering equation can be found in Refs. [79–82].

6. Bayesian smoothing

The goal of Bayesian smoothing is to compute the posterior distribution $P(x_k|Y)$ of the hidden state at a certain time in the past given the complete observation record $Y$. It is usually more accurate than filtering when $x_k$ is a stochastic process, as the future record can contain information about $x_k$ that is not available in the past, but it is less useful for real-time applications that require information about the current and the future, such as aircraft control and financial trading.

One method of smoothing is to split $Y$ into the past record $Y_k = \{y_k, \dots, y_1\}$ and the future record

$$\bar{Y}_k = Y \setminus Y_k = \{y_K, \dots, y_{k+1}\}. \tag{A25}$$

We then have

$$P(x_k|Y) = P(x_k|\bar{Y}_k, Y_k) \tag{A26}$$

$$= \frac{P(x_k, \bar{Y}_k|Y_k)}{P(\bar{Y}_k|Y_k)} \tag{A27}$$

$$= \frac{P(\bar{Y}_k|x_k, Y_k) P(x_k|Y_k)}{P(\bar{Y}_k|Y_k)} \tag{A28}$$

$$= \frac{P(\bar{Y}_k|x_k) P(x_k|Y_k)}{P(\bar{Y}_k|Y_k)}. \tag{A29}$$

In other words, $P(x_k|Y)$ is equal to the product $P(\bar{Y}_k|x_k) P(x_k|Y_k)$ with normalization. $P(x_k|Y_k)$ can be computed using filtering, while $P(\bar{Y}_k|x_k)$ is given by

$$P(\bar{Y}_k|x_k) = \sum_{x_{k+1}} P(\bar{Y}_{k+1}, y_{k+1}, x_{k+1}|x_k) \tag{A30}$$

$$= \sum_{x_{k+1}} P(\bar{Y}_{k+1}|y_{k+1}, x_{k+1}, x_k) P(y_{k+1}, x_{k+1}|x_k) \tag{A31}$$

$$= \sum_{x_{k+1}} P(\bar{Y}_{k+1}|x_{k+1}) P(y_{k+1}, x_{k+1}|x_k), \tag{A32}$$

which defines a backward-time recursion analogous to Eq. (A24), starting from the final-time condition $P(\bar{Y}_K) = 1$. Continuous-time limits of the smoothing equations can be found in Refs. [81–83].

7. Curse of dimensionality

The principal difficulty with implementing Bayesian filtering and smoothing in practice is that a probability distribution of $x_k$ is specified by $O(N)$ numbers, and $N$ grows exponentially with the degree of freedom $D$ in a system. This makes any direct computation of an $N$-dimensional probability distribution extremely expensive for large $D$, a problem known as the curse of dimensionality. One central goal of statistical inference research is to find algorithms that approximate a probability distribution using far fewer numbers (relative to $D$), finish in a reasonable time (relative to the number of time steps $K$ in the model), and still achieve acceptable estimation performances.

Appendix B: HIDDEN GAUSS-MARKOV MODELS (HGMM)

1. Discrete-time HGMM

Suppose now that $x_k$ and $y_k$ are vectors of unbounded continuous random variables:

$$x_k = \begin{pmatrix} x_k^{(0)} \\ x_k^{(1)} \\ \vdots \end{pmatrix} \in \mathbb{R}^D, \quad y_k = \begin{pmatrix} y_k^{(0)} \\ y_k^{(1)} \\ \vdots \end{pmatrix} \in \mathbb{R}^d. \tag{B1}$$

If the initial $P(x_0)$ and the transitional $P(y_{k+1}, x_{k+1}|x_k)$ are Gaussian:

$$P(x_0) \propto \frac{1}{\sqrt{\det \Sigma_0}} \exp\left[-\frac{1}{2} (x_0 - x_0')^\top \Sigma_0^{-1} (x_0 - x_0')\right], \tag{B2}$$

$$P(y_{k+1}, x_{k+1}|x_k) \propto \frac{1}{\sqrt{\det \Lambda_k}} \exp\left[-\frac{1}{2} (z_{k+1} - \bar{z}_k)^\top \Lambda_k^{-1} (z_{k+1} - \bar{z}_k)\right], \tag{B3}$$


with

$$z_{k+1} \doteq \begin{pmatrix} x_{k+1} \\ y_{k+1} \end{pmatrix}, \quad \bar{z}_k = \begin{pmatrix} A_k x_k + B_k u_k \\ C_k x_k \end{pmatrix}, \tag{B4}$$

$$\Lambda_k = \begin{pmatrix} Q_k & S_k \\ S_k^\top & R_k \end{pmatrix}, \tag{B5}$$

the model is known as a hidden Gauss-Markov model (HGMM), which has been extensively studied due to its analytic and computational tractability. A more common representation is to define zero-mean Gaussian system and observation noises as

$$w_k \doteq x_{k+1} - A_k x_k - B_k u_k, \tag{B6}$$

$$v_k \doteq y_{k+1} - C_k x_k, \tag{B7}$$

such that the equations of motion can be written as

$$x_{k+1} = A_k x_k + B_k u_k + w_k, \tag{B8}$$

$$y_{k+1} = C_k x_k + v_k, \tag{B9}$$

with noise statistics given by

$$\operatorname{E}(w_k) = \operatorname{E}(v_k) = 0, \tag{B10}$$

$$\operatorname{E}\left(w_k w_k^\top\right) = Q_k, \tag{B11}$$

$$\operatorname{E}\left(v_k v_k^\top\right) = R_k, \tag{B12}$$

$$\operatorname{E}\left(w_k v_k^\top\right) = S_k. \tag{B13}$$

2. Kalman filter

The Kalman filter [48–50] is an algorithm that computes the mean

$$x_k' \doteq \operatorname{E}(x_k|Y_k) \tag{B14}$$

and covariance matrix

$$\Sigma_k \doteq \operatorname{E}\left[(x_k - x_k')(x_k - x_k')^\top \,\big|\, Y_k\right] \tag{B15}$$

of the Gaussian posterior distribution given the immediate past observation record $Y_k$ for the HGMM. One trick of deriving the filter for nonzero $S_k$ is to rewrite Eqs. (B8) and (B9) as [48]

$$x_{k+1} = A_k x_k + B_k u_k + w_k + T_k (y_{k+1} - C_k x_k - v_k) \tag{B16}$$

$$= (A_k - T_k C_k) x_k + B_k u_k + T_k y_{k+1} + \xi_k, \tag{B17}$$

$$y_{k+1} = C_k x_k + v_k, \tag{B18}$$

where the redefined system noise is

$$\xi_k \doteq w_k - T_k v_k, \tag{B19}$$

which can be made independent of $v_k$ if we set

$$T_k = S_k R_k^{-1}, \tag{B20}$$

$$\operatorname{E}\left(\xi_k v_k^\top\right) = 0, \tag{B21}$$

$$\operatorname{E}\left(\xi_k \xi_k^\top\right) = Q_k - S_k R_k^{-1} S_k^\top. \tag{B22}$$

This allows us to apply the Kalman filter for uncorrelated noises to Eqs. (B17) and (B18). The result is

$$\Gamma_k = \Sigma_k C_k^\top \left(C_k \Sigma_k C_k^\top + R_k\right)^{-1}, \tag{B23}$$

$$x_k'^{+} \doteq \operatorname{E}(x_k|Y_{k+1}) = x_k' + \Gamma_k (y_{k+1} - C_k x_k'), \tag{B24}$$

$$\Sigma_k^+ \doteq \operatorname{E}\left[(x_k - x_k'^{+})(x_k - x_k'^{+})^\top \,\big|\, Y_{k+1}\right] = (I - \Gamma_k C_k) \Sigma_k, \tag{B25}$$

$$x_{k+1}' = A_k x_k'^{+} + B_k u_k + S_k R_k^{-1} (y_{k+1} - C_k x_k'^{+}), \tag{B26}$$

$$\Sigma_{k+1} = (A_k - S_k R_k^{-1} C_k) \Sigma_k^+ (A_k - S_k R_k^{-1} C_k)^\top + Q_k - S_k R_k^{-1} S_k^\top. \tag{B27}$$

The exceptional computational efficiency of the Kalman filter has made it the standard filtering algorithm in engineering; many practical filtering algorithms for non-Gaussian models, such as the extended and unscented Kalman filters [50], are based on HGMM approximations and extensions of the Kalman filter.

3. Rauch-Tung-Striebel (RTS) smoother

An HGMM smoother computes the mean and covariance of the Gaussian posterior distribution given the whole observation record $Y$:

$$\check{x}_k \doteq \operatorname{E}(x_k|Y), \tag{B28}$$

$$\Pi_k \doteq \operatorname{E}\left[(x_k - \check{x}_k)(x_k - \check{x}_k)^\top \,\big|\, Y\right]. \tag{B29}$$

The Rauch-Tung-Striebel (RTS) smoother [48, 50, 84] is the most convenient algorithm. It first runs the Kalman filter given by Eqs. (B23)–(B27) to obtain the set $\{x_k'^{+}, \Sigma_k^+, x_{k+1}', \Sigma_{k+1}; k = 0, 1, \dots, K-1\}$. Then, starting from

$$\check{x}_K = x_K', \quad \Pi_K = \Sigma_K, \tag{B30}$$

$$\check{x}_{K-1} = x_{K-1}'^{+}, \quad \Pi_{K-1} = \Sigma_{K-1}^+, \tag{B31}$$


the following formulas are iterated backward in time:

$$ \Upsilon_k = \Sigma_k^{+} (A_k - S_k R_k^{-1} C_k)^\top \Sigma_{k+1}^{-1}, \tag{B32} $$
$$ \check{x}_k = x_k^{\prime +} + \Upsilon_k \left( \check{x}_{k+1} - x_{k+1}' \right), \tag{B33} $$
$$ \Pi_k = \Sigma_k^{+} - \Upsilon_k \left( \Sigma_{k+1} - \Pi_{k+1} \right) \Upsilon_k^\top. \tag{B34} $$

For other forms of HGMM filters and smoothers, see Refs. [48, 50].

4. Continuous-time HGMM

Define time as

$$ t_k := t_0 + k\,\delta t, \tag{B35} $$

where $\delta t$ is the time interval between consecutive time steps. Suppose

$$ A_k - I = f_k\,\delta t + o(\delta t), \tag{B36} $$
$$ B_k = b_k\,\delta t + o(\delta t), \tag{B37} $$
$$ C_k = c_k\,\delta t + o(\delta t), \tag{B38} $$
$$ Q_k = q_k\,\delta t + o(\delta t), \tag{B39} $$
$$ R_k = r_k\,\delta t + o(\delta t), \tag{B40} $$
$$ S_k = s_k\,\delta t + o(\delta t), \tag{B41} $$

where $o(\delta t)$ denotes terms asymptotically smaller than $\delta t$. We can then define the continuous-time limit of an HGMM in terms of the following stochastic equations of motion:

$$ dx_t = f_t x_t\,dt + b_t u_t\,dt + dw_t, \tag{B42} $$
$$ dy_t = c_t x_t\,dt + dv_t, \tag{B43} $$

with noise properties given by

$$ E(dw_t) = E(dv_t) = 0, \tag{B44} $$
$$ E\left(dw_t\,dw_t^\top\right) = q_t\,dt, \tag{B45} $$
$$ E\left(dv_t\,dv_t^\top\right) = r_t\,dt, \tag{B46} $$
$$ E\left(dw_t\,dv_t^\top\right) = s_t\,dt. \tag{B47} $$

5. Kalman-Bucy filter

The continuous-time limit of the Kalman filter in Appendix B2 is known as the Kalman-Bucy filter [85]. It is given by [48, 50, 85]

$$ \Gamma_t = \left( \Sigma_t c_t^\top + s_t \right) r_t^{-1}, \tag{B48} $$
$$ dx_t' = \left( f_t x_t' + b_t u_t \right) dt + \Gamma_t \left( dy_t - c_t x_t'\,dt \right), \tag{B49} $$
$$ \frac{d\Sigma_t}{dt} = f_t \Sigma_t + \Sigma_t f_t^\top + q_t - \Gamma_t r_t \Gamma_t^\top. \tag{B50} $$

This limit is useful for deriving analytic results and simplifying the filter implementation, as the differential equations are easier to solve analytically.

6. Mayne-Fraser-Potter smoother

Although there exists a continuous-time version of the RTS smoother [84], a time-symmetric form of the optimal smoother due to Mayne [86] and Fraser and Potter [87] is more amenable to analytic calculations. It involves running the following filter, which has the same form as the Kalman-Bucy filter, backward in time:

$$ \Gamma_t = \left( \Phi_t c_t^\top + s_t \right) r_t^{-1}, \tag{B51} $$
$$ -dx_t'' = -\left( f_t x_t'' + b_t u_t \right) dt + \Gamma_t \left( dy_t - c_t x_t''\,dt \right), \tag{B52} $$
$$ -\frac{d\Phi_t}{dt} = -f_t \Phi_t - \Phi_t f_t^\top + q_t - \Gamma_t r_t \Gamma_t^\top, \tag{B53} $$

and combining the results with the forward Kalman-Bucy filter given by Eqs. (B48)–(B50) as follows:

$$ \Pi_t = \left( \Sigma_t^{-1} + \Phi_t^{-1} \right)^{-1}, \tag{B54} $$
$$ \check{x}_t = \Pi_t \left( \Sigma_t^{-1} x_t' + \Phi_t^{-1} x_t'' \right). \tag{B55} $$

The final-time conditions for Eqs. (B52) and (B53) should correspond to $\Phi_T^{-1} = 0$. In practice, one can solve for $\Phi_t^{-1}$ and $\Phi_t^{-1} x_t''$ instead of $\Phi_t$ and $x_t''$ to avoid the ill-defined final-time conditions [86, 87].

Appendix C: QUANTUM PROBABILITY THEORY

1. Hilbert space

Consider an $N$-dimensional Hilbert space spanned by an orthonormal basis

$$ \mathcal{B}_\phi = \left\{ \phi^0, \dots, \phi^{N-1} \right\}. \tag{C1} $$

For example, $N = 2^D$ for $D$ qubits. $\phi^n$ is a projection, and in the bra-ket notation, it would be written as

$$ \phi^n \equiv |\phi^n\rangle\langle\phi^n|, \tag{C2} $$

where the $\equiv$ sign here means different notations for the same quantity. The Hilbert-Schmidt inner product is written as

$$ \langle \phi^n, \phi^m \rangle \equiv \operatorname{tr}\left[ \left( |\phi^n\rangle\langle\phi^n| \right)^\dagger |\phi^m\rangle\langle\phi^m| \right] = \delta_{nm}. \tag{C3} $$

In classical probability theory, we assume that a state vector can only be one of the unit vectors in one basis in a Euclidean space. The key to quantum probability theory is that any basis in the Hilbert space can be used to specify the possibilities.
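As a small numerical illustration of Eq. (C3), the following sketch evaluates the Hilbert-Schmidt inner product between projections for a single qubit. The two bases and the helper names `projector` and `hs_inner` are assumptions chosen for illustration and do not come from the text. Within one basis the inner products reproduce $\delta_{nm}$, whereas projections taken from two different bases give values strictly between 0 and 1.

```python
import math

def projector(ket):
    """Return the projection |ket><ket| as a nested list of complex numbers."""
    return [[a * complex(b).conjugate() for b in ket] for a in ket]

def hs_inner(a, b):
    """Hilbert-Schmidt inner product tr(a^dagger b) of two square matrices."""
    n = len(a)
    return sum(complex(a[i][j]).conjugate() * b[i][j]
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
basis_z = [[1, 0], [0, 1]]    # illustrative orthonormal basis (assumption)
basis_x = [[s, s], [s, -s]]   # a second, rotated orthonormal basis (assumption)

# Inner products within one basis: the identity matrix, as in Eq. (C3).
same_basis = [[hs_inner(projector(u), projector(v)).real for v in basis_z]
              for u in basis_z]

# Inner products across the two bases: neither 0 nor 1.
cross_basis = [[hs_inner(projector(u), projector(v)).real for v in basis_x]
               for u in basis_z]
```

In this example every entry of `cross_basis` is close to 1/2, which is the kind of non-classical overlap that arises once more than one basis is allowed.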


2. Quantum state

Consider a basis $\mathcal{B}_\xi$. Similar to the classical case, we can define a state as one of its possibilities:

$$ \psi \in \mathcal{B}_\xi, \tag{C4} $$

such that the indicator function of $\psi$ with respect to $\mathcal{B}_\xi$ is

$$ 1_{\mathcal{B}_\xi}(\psi) = 1. \tag{C5} $$

$\psi$ is called a quantum state. In the bra-ket notation, we may write it as

$$ \psi \equiv |\psi\rangle\langle\psi|. \tag{C6} $$

Here $\psi \in \mathcal{B}_\xi$ implies that $\psi$ is compatible with the basis $\mathcal{B}_\xi$, meaning that the state becomes equivalent to a classical state if we restrict ourselves to state operations within this basis. Conversely, given any $\psi$, one can always find a compatible basis $\mathcal{B}_\xi$ in which $\psi$ is an element. This is a subtle but important point: it allows us to associate any quantum state $\psi$ with a classical state of reality in the context of a compatible basis.

The $n$th component of $\psi$ in a compatible basis is

$$ \langle \xi^n, \psi \rangle = 1_{\xi^n}(\psi) \in \{0, 1\}, \tag{C7} $$

which is a qualified indicator function like Eq. (A7), but for any basis in general, the inner product

$$ \langle \phi^n, \psi \rangle \equiv |\langle \phi^n | \psi \rangle|^2 \tag{C8} $$

has the following properties

$$ 0 \le \langle \phi^n, \psi \rangle \le 1, \tag{C9} $$
$$ \sum_n \langle \phi^n, \psi \rangle = 1, \tag{C10} $$

which hint at the role of $\langle \phi^n, \psi \rangle$ as a probability function.

3. Unitary maps

An important class of operations on a quantum state are the unitary maps, written as

$$ \mathcal{U}\psi \equiv U |\psi\rangle\langle\psi| U^\dagger, \tag{C11} $$

which models the transition from one state to another. The unitary operator $U$ is expressed as

$$ U = \sum_{n,m} U^{nm} |\xi^n\rangle\langle\xi^m| = \sum_m |\phi^m\rangle\langle\xi^m|, \tag{C12} $$
$$ |\phi^m\rangle = \sum_n U^{nm} |\xi^n\rangle = U |\xi^m\rangle, \tag{C13} $$

where $U^{nm}$ is the unitary matrix that defines $U$:

$$ U^{nm} = \langle \xi^n | U | \xi^m \rangle = \langle \xi^n | \phi^m \rangle. \tag{C14} $$

Note the subtle difference between an operator and a matrix.

A special class of unitary operators is the permutation, which simply assigns one ket to another in the same basis:

$$ U^{nm} = \delta_{n,\pi(m)}, \tag{C15} $$
$$ \langle \xi^n, \phi^m \rangle = |U^{nm}|^2 \in \{0, 1\}. \tag{C16} $$

If $\psi \in \mathcal{B}_\xi$, a permutation would stay in the same basis, and the transition becomes equivalent to a classical state transition.

In the other extreme, the Fourier-transform unitary assigns a state in one basis to another in a "maximally incompatible" basis:

$$ U^{nm} = \frac{1}{\sqrt{N}} \exp\left( \frac{i 2\pi n m}{N} \right), \tag{C17} $$
$$ \langle \xi^n, \phi^m \rangle = |U^{nm}|^2 = \frac{1}{N}, \tag{C18} $$

which is useful for quantum computation [26]. Two bases that satisfy Eq. (C18) are also called mutually unbiased [88].

To model continuous-time evolution, the unitary operator is expressed in terms of a Hamiltonian operator $H$ as

$$ U(t) = \exp(-iHt). \tag{C19} $$

4. von Neumann measurement

Similar to the classical case, we can define a conditional probability function with respect to two von Neumann measurements. A von Neumann measurement is defined with respect to a basis $\mathcal{B}_\phi$, with each outcome corresponding to a $\phi^n$. For one measurement in basis $\mathcal{B}_\phi$ followed by another in $\mathcal{B}_\xi$, the probability function of an outcome $\psi \in \mathcal{B}_\xi$, conditioned on the previous outcome $\psi_0 \in \mathcal{B}_\phi$, is

$$ P(\psi|\psi_0) = \langle \psi, \psi_0 \rangle \equiv |\langle \psi | \psi_0 \rangle|^2, \tag{C20} $$


which is Born's rule. We can also model time evolution before the final measurement by including a unitary map:

$$ P(\psi|\psi_0) = \langle \psi, \mathcal{U}\psi_0 \rangle. \tag{C21} $$

Eq. (C21) is quantum mechanics in a nutshell.

A trivial but powerful property of Eq. (C21) is unitary invariance:

$$ P(\psi|\psi_0) = \langle \mathcal{U}_0^* \psi, \mathcal{U}_0^* \mathcal{U} \psi_0 \rangle, \tag{C22} $$

where $\mathcal{U}_0$ is any unitary map. For example, if we let $\mathcal{U}_0 = \mathcal{U}$, and since the adjoint is the same as the inverse for a unitary, we obtain

$$ P(\psi|\psi_0) = \langle \mathcal{U}^* \psi, \psi_0 \rangle, \tag{C23} $$

which is the Heisenberg picture. Any new picture can be generated by choosing a $\mathcal{U}_0$, akin to a change of reference frame in relativity. The interaction picture is a useful example.

In principle, Eq. (C21) is all we need to compute quantum probabilities, but it is extremely difficult to do so in practice without further approximations if the degree of freedom $D$ is large. In the following, we consider the theoretical tools that can facilitate this task.

Appendix D: QUASIPROBABILITY FUNCTIONS

1. Quantum-mechanics-free model

If we restrict state operations (including the initial state, state transitions, and measurements) to unit vectors in one basis, then the quantum model becomes equivalent to a classical model without any quantum feature, such as the uncertainty relations or measurement invasiveness (this is called a quantum-mechanics-free model in Ref. [89]; see also Refs. [14, 90, 91]). It is, however, possible to relax this restriction significantly and still find a classical representation, if we incorporate probabilities. The next sections describe how this can be done via Wigner functions.

2. Mutually unbiased bases

To pick the Hilbert-space bases for classical modeling, we start with one, say,

$$ \mathcal{B}_q = \left\{ q^0, \dots, q^{N-1} \right\}, \tag{D1} $$

and try to find all the bases that are unbiased with $\mathcal{B}_q$ and each other according to Eq. (C18). We choose mutually unbiased bases mainly because of the mathematical symmetry; there are some practical benefits but we will not dwell on them for now.

If $N$ is a prime power, there are $R = N + 1$ such bases including $\mathcal{B}_q$ [88]. Let us focus on a prime $N$, and denote the mutually unbiased bases by

$$ \tilde{\mathcal{B}} = \{ \mathcal{B}_0, \mathcal{B}_1, \dots, \mathcal{B}_N \}, \quad \mathcal{B}_N = \mathcal{B}_q, \tag{D2} $$
$$ \mathcal{B}_r = \left\{ p_r^0, \dots, p_r^{N-1} \right\}, \quad r = 0, \dots, N-1. \tag{D3} $$

For $N = 2$, the bases simply consist of the eigenstates of the three Pauli operators. For the other primes, $\mathcal{B}_r$ can be constructed from $\mathcal{B}_q$ using the fractional Fourier transform. We assume that one is interested only in state operations with $\tilde{\mathcal{B}}$.

The next step is to map the composite basis $\tilde{\mathcal{B}}$ to a classical state space. A naive way would be to consider each $\mathcal{B}_r$ to be a separate object; for example, a qubit ($N = 2$) would be modeled as three classical bits that correspond to the three components. This is obviously not the most efficient representation, as the statistics of the classical bits must be correlated to model one qubit. In general, this naive approach would require an extremely large $N^{N+1}$-dimensional classical state space. Surprisingly, it turns out that an $N^2$-dimensional classical state space is sufficient, if we define an appropriate quasiprobability function in analogy with the Wigner function for continuous variables [92].

3. Discrete Wigner function

Let $z$ be a classical state in one of $N^2$ possibilities. The possibilities can be assigned to $N \times N$ points on a two-dimensional lattice known as the phase space. For each $q^n$, we assign a vertical line of classical states, denoted by the set $\lambda(q^n)$. For the Fourier-transform basis $\mathcal{B}_0$, the function $\lambda(p_0^n)$ also assigns a horizontal line of classical states for each $p_0^n$. Beyond the vertical and horizontal lines, the basic idea of Ref. [92] is to define tilted lines on a lattice appropriately and construct a function $\lambda(p_r^n)$ that provides a general mapping from any $p_r^n \in \tilde{\mathcal{B}}$ to a line of classical states in the phase space. The discrete Wigner function, defined via an operator $w(z)$ as

$$ W_0(z) := \langle w(z), \psi_0 \rangle, \tag{D4} $$

is then required to give the correct probability function that coincides with Born's rule in Eq. (C20) for any measurement in any $\mathcal{B}_r \in \tilde{\mathcal{B}}$:

$$ P(\psi|\psi_0) = \sum_z 1_{\lambda(\psi)}(z)\, W_0(z) \quad \text{for } \psi \in \tilde{\mathcal{B}}. \tag{D5} $$
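For a single qubit ($N = 2$), the requirement in Eq. (D5) can be checked numerically. The sketch below is only an illustration under stated assumptions: it parametrizes the state by its Bloch vector $(x, y, z)$ and uses one common convention for the $2 \times 2$ phase space, $W_0(q, p) = \frac{1}{4}[1 + (-1)^q z + (-1)^p x + (-1)^{q+p} y]$, which is not spelled out in the text above. The function names are hypothetical. Vertical lines are taken to correspond to the Pauli-$Z$ eigenstates, horizontal lines to the Pauli-$X$ eigenstates, and the two tilted lines to the Pauli-$Y$ eigenstates.

```python
import math

def qubit_wigner(x, y, z):
    """Discrete Wigner function on a 2x2 phase space (assumed convention).

    (x, y, z) is the Bloch vector of the initial state; keys are points (q, p).
    """
    return {(q, p): 0.25 * (1 + (-1) ** q * z + (-1) ** p * x
                            + (-1) ** (q + p) * y)
            for q in (0, 1) for p in (0, 1)}

def line_probabilities(W):
    """Sum W over vertical, horizontal and tilted lines, as in Eq. (D5)."""
    vertical = [W[(q, 0)] + W[(q, 1)] for q in (0, 1)]      # Pauli-Z outcomes
    horizontal = [W[(0, p)] + W[(1, p)] for p in (0, 1)]    # Pauli-X outcomes
    tilted = [sum(W[(q, p)] for q in (0, 1) for p in (0, 1)
                  if (q + p) % 2 == s) for s in (0, 1)]     # Pauli-Y outcomes
    return vertical, horizontal, tilted

# Initial state inside the composite basis: the +1 eigenstate of Pauli-Z.
W_z = qubit_wigner(0.0, 0.0, 1.0)
probs_z = line_probabilities(W_z)

# Initial state outside the composite basis: Bloch vector along -(1, 1, 1).
m = -1 / math.sqrt(3)
W_m = qubit_wigner(m, m, m)
probs_m = line_probabilities(W_m)
```

With the $Z$ eigenstate as the initial state, every entry of `W_z` is nonnegative and the vertical-line sums are 1 and 0, as Born's rule requires. For the second state the line sums are still valid probabilities, but `W_m[(0, 0)]` is negative, so the function can no longer be read as an ordinary probability distribution.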


A $w(z)$ that satisfies these properties for prime $N$ is reported in Ref. [92]. If $N$ is not a prime, it can be factored into a product of primes, and the procedure can be applied to a tensor product of smaller Hilbert spaces with the prime dimensions.

Any quantum state transition within $\tilde{\mathcal{B}}$ can be represented by an appropriate conditional probability function $W(z|z_0)$ in the classical state space, such that Eq. (C21) becomes

$$ P(\psi|\psi_0) = \sum_{z, z_0} 1_{\lambda(\psi)}(z)\, W(z|z_0)\, W_0(z_0) \quad \text{for } \psi \in \tilde{\mathcal{B}}. \tag{D6} $$

As long as $\psi_0$ is also in $\tilde{\mathcal{B}}$, $W_0(z)$ is nonnegative, and the quantum system can be modeled by a classical HMM with $N^2$ possible states. However, if $\psi_0$ is not in $\tilde{\mathcal{B}}$ or if $\mathcal{U}$ induces state transitions beyond the composite basis, then $W_0(z)$ or $W(z|z_0)$ may become negative somewhere, hence the name quasiprobability functions.

If $N$ is a prime power, the $N + 1$ mutually unbiased bases can be used to form the composite basis $\tilde{\mathcal{B}}$ directly, and alternative Wigner functions can be defined with respect to measurements in such bases without going through the composition; see Ref. [93]. For a discussion of the relationships between nonnegative quasiprobability functions, contextuality, and quantum information science in general, see Refs. [94–96].

4. Wigner function for continuous variables

The Wigner function was, of course, originally invented for unbounded continuous variables, such as the position and momentum of a mechanical oscillator and the quadratures of an optical field. Its properties and applications have been exhaustively studied; see, for example, Refs. [42, 44, 97].

The symmetry properties of the Wigner function are extremely powerful for modeling a large class of quantum operations with minimal contextuality. In particular, if

1. the initial Wigner function is Gaussian,

2. the Hamiltonian is at most quadratic with respect to the continuous-variable operators, such that the Heisenberg equations of motion for these operators are linear, and

3. the measurements can be modeled as von Neumann measurements of arbitrary linear combinations of the continuous variables,

the quantum observation statistics become equivalent to those of an HGMM described in Appendix B [31, 41, 42], and all the statistical methods valid for an HGMM are also applicable to such a quantum model. Sec. 4.4 is an example of how this correspondence can be exploited for the purpose of hypothesis testing and parameter estimation.

Appendix E: OPEN QUANTUM SYSTEMS

The concepts introduced in this section can also be found in many textbooks [26, 31, 63, 98].

1. Density operator

We would now like to incorporate more probabilities to model classical uncertainties in $\psi_0$. Suppose that $\psi_0$ depends on a classical hidden variable $j$, and the probability distribution for $j$ is $P_j$. $P(\psi)$ becomes

$$ P(\psi) = \sum_j P(\psi|\psi_0^j) P_j \tag{E1} $$
$$ \phantom{P(\psi)} = \sum_j \langle \psi, \psi_0^j \rangle P_j \tag{E2} $$
$$ \phantom{P(\psi)} = \langle \psi, \rho_0 \rangle, \tag{E3} $$

where

$$ \rho_0 = \sum_j P_j \psi_0^j \equiv \sum_j P_j |\psi_0^j\rangle\langle\psi_0^j| \tag{E4} $$

is called a density operator.

Another way of arriving at the density operator is to consider a larger Hilbert space as a tensor product of two smaller ones $A$ and $B$, with an initial state given by $\Psi_0$ and a final von Neumann projection given by

$$ \Psi \equiv \psi \otimes \psi_B, \tag{E5} $$
$$ P(\Psi|\Psi_0) = \langle \Psi, \Psi_0 \rangle_{AB} = \langle \psi \otimes \psi_B, \Psi_0 \rangle_{AB}. \tag{E6} $$

If we neglect the outcome $\psi_B$, the marginal probability function is

$$ P(\psi) = \sum_{\psi_B} P(\Psi|\Psi_0) = \langle \psi \otimes I_B, \Psi_0 \rangle_{AB} = \langle \psi, \rho_0 \rangle_A, \tag{E7} $$

where $I_B$ denotes the identity operator in $B$, and

$$ \rho_0 = \langle I_B, \Psi_0 \rangle_B \equiv \operatorname{tr}_B |\Psi_0\rangle\langle\Psi_0| \tag{E8} $$

turns out to have exactly the same properties as Eq. (E4).

The third and the most nontrivial way of arriving at a density operator is Gleason's theorem [99], which roughly states that, if $N \ge 3$ and we are given a probability function $P(\psi)$, then there always exists a positive-semidefinite


operator $\rho_0$ such that Eq. (E3) holds. The theorem is redundant, however, if we assume Born's rule, as we have already derived Eq. (E3) by other more constructive means and we do not really need the theorem to tell us that a $\rho_0$ exists.

It is common in quantum physics to call $\rho_0$ a quantum state; it is called a pure state when $\rho_0 = \psi_0^j \equiv |\psi_0^j\rangle\langle\psi_0^j|$ is a projection and a mixed state otherwise. This terminology is confusing and we shall avoid it here, as the density operator is different from the state concept in probability theory, as described in Appendix A1.

2. Positive operator-valued measure (POVM)

The von Neumann projection can be generalized to a more general notion of measurement called the POVM $E(y)$, where $y$ is an observation. The POVM is a positive-semidefinite operator that satisfies the completeness property:

$$ \sum_y E(y) = I, \tag{E9} $$

with $I$ denoting the identity operator. The probability function of $y$ is then given by

$$ P(y) = \langle E(y), \rho_0 \rangle. \tag{E10} $$

It can be shown that a POVM is equivalent to a von Neumann projection in a larger Hilbert space, but it is a convenient tool nonetheless to model partial measurements.

3. Time evolution

Instead of the unitary map, we can use a more general mathematical operation called a completely positive (CP) map to model dynamics that involve uncertainties:

$$ P = \langle E, \mathcal{V}\rho_0 \rangle = \langle \mathcal{V}^* E, \rho_0 \rangle, \tag{E11} $$

where the trace-preserving CP map $\mathcal{V}$ can be written in the Kraus representation as

$$ \mathcal{V}\rho_0 = \sum_j V_j \rho_0 V_j^\dagger. \tag{E12} $$

$V_j$ is called a Kraus operator, which satisfies the completeness property:

$$ \sum_j V_j^\dagger V_j = I. \tag{E13} $$

To model continuous-time evolution, a CP map can be written as

$$ \mathcal{V} = \exp(\mathcal{L}t), \tag{E14} $$

where $\mathcal{L}$ is known as the Lindblad generator.

A CP map can be used to describe the phenomenon of decoherence, which occurs when the system of interest interacts with another inaccessible system. The system of interest is then called an open system. Like the density operator and the POVM, it can be shown that a CP map is equivalent to unitary evolution in a larger Hilbert space that includes all the inaccessible subsystems.

4. Generalized measurements

For a series of generalized measurements, the probability function of the outcome can be written as

$$ P(Y) = P(y_K, \dots, y_1) \tag{E15} $$
$$ \phantom{P(Y)} = \langle E(y_K), \mathcal{W}(y_{K-1}) \dots \mathcal{W}(y_1) \rho_0 \rangle, \tag{E16} $$

where $\mathcal{W}(y_k)$ is a CP map with an observation $y_k$ at time $k$, describing both the dynamics and the probabilities of the observation. It reduces to a trace-preserving CP map $\mathcal{V}_k$ when summed over all possible outcomes:

$$ \sum_{y_k} \mathcal{W}(y_k) = \mathcal{V}_k. \tag{E17} $$

In principle, Eq. (E16) can also be modeled using Eq. (C21) in a larger Hilbert space through the principle of deferred measurement, but for numerical analysis a smaller Hilbert space is usually more desirable to alleviate the curse of dimensionality. Eq. (E16) may be regarded as a generalization of the classical HMM.

We have stressed repeatedly that the open quantum system theory is a reformulation of quantum probability theory and contains no new physics, but its value for fundamental physics should not be dismissed entirely. After all, Hamiltonian and Lagrangian mechanics were also merely reformulations of Newtonian mechanics, until quantum mechanics turned them into a centerpiece.

Appendix F: QUANTUM ESTIMATION

1. Quantum filtering

The goal of quantum filtering is to predict the future observation $y_{k+1}$ for any given $E(y_{k+1})$ using the past observations $Y_k = \{ y_k, \dots, y_1 \}$. Using Eq. (E16), the filtering


probability function becomes

$$ P(y_{k+1}|Y_k) = \frac{P(y_{k+1}, Y_k)}{P(Y_k)} \tag{F1} $$
$$ \phantom{P(y_{k+1}|Y_k)} = \frac{\langle E(y_{k+1}), \mathcal{W}(y_k) \dots \mathcal{W}(y_1) \rho_0 \rangle}{\langle I, \mathcal{W}(y_k) \dots \mathcal{W}(y_1) \rho_0 \rangle} \tag{F2} $$
$$ \phantom{P(y_{k+1}|Y_k)} = \langle E(y_{k+1}), \rho(Y_k) \rangle, \tag{F3} $$

where the posterior density operator, defined as

$$ \rho(Y_k) := \mathcal{C}\,\mathcal{W}(y_k) \dots \mathcal{W}(y_1) \rho_0, \quad \mathcal{C}\rho := \frac{\rho}{\langle I, \rho \rangle}, \tag{F4} $$

contains the sufficient statistics for filtering. Eq. (F4) is sometimes called the quantum Bayes theorem [63]. A useful way of computing Eq. (F4) is to find a classical HMM representation via quasiprobability functions and take advantage of existing classical algorithms. The Kalman filter is especially useful for quantum optomechanics and large atomic spin ensembles [31], as we have also seen from Sec. 4.4.

As pioneered by Belavkin [100], a continuous-time limit of Eq. (F4) can be defined using stochastic calculus to model observations with white noise [31]. See also Ref. [101] for an alternative mathematical treatment of continuous-time quantum filtering.

Quantum filtering can form the basis for quantum parameter estimation and hypothesis testing techniques; see, for example, Refs. [51, 102].

2. Quantum smoothing

Quantum smoothing is the estimation of $y_{k+1}$ using the past

$$ Y_k = \{ y_k, \dots, y_1 \}, \tag{F5} $$

as well as the future

$$ \bar{Y}_{k+1} = Y \setminus Y_{k+1} = \{ y_K, \dots, y_{k+2} \}, \tag{F6} $$

assuming that $y_{k+1}$ is missing. The conditional probability function is

$$ P(y_{k+1}|Y_k, \bar{Y}_{k+1}) = \mathcal{N} P(\bar{Y}_{k+1}, y_{k+1}, Y_k), \tag{F7} $$

where $\mathcal{N}$ is a normalization constant. We rewrite Eq. (E16) in the time-symmetric form in terms of Eq. (F4):

$$ P(Y) \propto \left\langle E(\bar{Y}_{k+1}), \mathcal{W}(y_{k+1}) \rho(Y_k) \right\rangle, \tag{F8} $$
$$ E(\bar{Y}_{k+1}) = \mathcal{W}^*(y_{k+2}) \dots \mathcal{W}^*(y_{K-1}) E(y_K). \tag{F9} $$

Eq. (F9) has the same structure as the filtering equation in Eq. (F4) and can be calculated by the same methods applied backwards in time. Hence

$$ P(y_{k+1}|Y_k, \bar{Y}_{k+1}) = \mathcal{N} \left\langle E(\bar{Y}_{k+1}), \mathcal{W}(y_{k+1}) \rho(Y_k) \right\rangle, \tag{F10} $$

and $\rho(Y_k)$ and $E(\bar{Y}_{k+1})$ provide the sufficient statistics for smoothing. Continuous-time limits of quantum smoothing can be found in Refs. [81, 82, 103, 104].

The omission of $y_{k+1}$ from the given observations may seem artificial, but this formulation can actually be used for quantum sensing of hidden classical waveforms. This is done by embedding a classical HMM in the quantum model and assuming that $\mathcal{W}(y_{k+1})$ is a perfect observation of the classical HMM [81, 82, 103]. Recent quantum optics experiments that used smoothing for waveform estimation are reported in Refs. [105–107].

The concept of quantum smoothing can be traced back to Aharonov et al. [108], who proposed the time-symmetric form given by Eq. (F10) for von Neumann measurements. The connection between this time-symmetric form and smoothing estimation was first made and studied by Tsang [81, 82, 103]. The presentation here follows a more recent work by Gammelmark et al. [104].

The curse of dimensionality also exists for quantum estimation, as the number of variables that specify a density matrix also grows exponentially with the degree of freedom. As quantum technologies become more complex and nonclassical, one can envision an increasing demand for efficient quantum filtering and smoothing algorithms for future signal processing and control applications.

References

[1] S. Haroche and J. M. Raimond, Exploring the Quantum: Atoms, Cavities, and Photons (Oxford Univ. Press, Oxford, 2006).
[2] S. Haroche, Rev. Mod. Phys. 85, 1083 (2013), URL http://link.aps.org/doi/10.1103/RevModPhys.85.1083.
[3] D. J. Wineland, Rev. Mod. Phys. 85, 1103 (2013), URL http://link.aps.org/doi/10.1103/RevModPhys.85.1103.
[4] T. J. Kippenberg and K. J. Vahala, Science 321, 1172 (2008), URL http://www.sciencemag.org/content/321/5893/1172.abstract.
[5] M. Aspelmeyer, S. Gröblacher, K. Hammerer, and N. Kiesel, J. Opt. Soc. Am. B 27, A189 (2010).
[6] M. Aspelmeyer, T. J. Kippenberg, and F. Marquardt, ArXiv e-prints (2013), 1303.0733.


[7] D. W. C. Brooks, T. Botter, S. Schreppler, T. P. Purdy, N. Brahms, and D. M. Stamper-Kurn, Nature 488, 476 (2012), ISSN 0028-0836, URL http://dx.doi.org/10.1038/nature11325.
[8] T. P. Purdy, R. W. Peterson, and C. A. Regal, Science 339, 801 (2013).
[9] A. H. Safavi-Naeini, S. Gröblacher, J. T. Hill, J. Chan, M. Aspelmeyer, and O. Painter, Nature 500, 185 (2013), ISSN 0028-0836, URL http://dx.doi.org/10.1038/nature12307.
[10] T. P. Purdy, P.-L. Yu, R. W. Peterson, N. S. Kampel, and C. A. Regal, Phys. Rev. X 3, 031012 (2013), URL http://link.aps.org/doi/10.1103/PhysRevX.3.031012.
[11] Y. Chen, Journal of Physics B: Atomic, Molecular and Optical Physics 46, 104001 (2013), URL http://stacks.iop.org/0953-4075/46/i=10/a=104001.
[12] A. H. Safavi-Naeini, J. Chan, J. T. Hill, T. P. M. Alegre, A. Krause, and O. Painter, Phys. Rev. Lett. 108, 033602 (2012), URL http://link.aps.org/doi/10.1103/PhysRevLett.108.033602.
[13] A. H. Safavi-Naeini, J. Chan, J. T. Hill, S. Gröblacher, H. Miao, Y. Chen, M. Aspelmeyer, and O. Painter, New Journal of Physics 15, 035007 (2013), 1210.2671.
[14] A. Peres, Quantum Theory: Concepts and Methods (Kluwer, New York, 2002).
[15] J. O. Berger, Statistical Decision Theory and Bayesian Analysis (Springer-Verlag, New York, 1980).
[16] J. M. Bernardo and A. F. M. Smith, Bayesian Theory (Wiley, Chichester, 2009), ISBN 9780470317716, URL http://books.google.com.sg/books?id=11nSgIcd7xQC.
[17] E. T. Jaynes, Probability Theory: The Logic of Science (Cambridge University Press, Cambridge, 2003).
[18] T. M. Cover and J. A. Thomas, Elements of Information Theory (Wiley, New York, 2006).
[19] T. Kailath, IEEE Transactions on Information Theory 15, 350 (1969), ISSN 0018-9448.
[20] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part I. (John Wiley & Sons, New York, 2001).
[21] B. C. Levy, Principles of Signal Detection and Parameter Estimation (Springer, New York, 2008).
[22] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part III: Radar-Sonar Signal Processing and Gaussian Signals in Noise (John Wiley & Sons, New York, 2001), ISBN 9780471463818, URL http://books.google.com/books?id=lc4YnId2yYoC.
[23] H. L. Van Trees and K. L. Bell, eds., Bayesian Bounds for Parameter Estimation and Nonlinear Filtering/Tracking (Wiley-IEEE, Piscataway, 2007).
[24] M. Tsang, Phys. Rev. Lett. 108, 230401 (2012), URL http://link.aps.org/doi/10.1103/PhysRevLett.108.230401.
[25] J. Shore and R. Johnson, IEEE Transactions on Information Theory 26, 26 (1980), ISSN 0018-9448.
[26] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, Cambridge, 2000).
[27] C. W. Helstrom, Quantum Detection and Estimation Theory (Academic Press, New York, 1976).
[28] A. S. Holevo, Statistical Structure of Quantum Theory (Springer-Verlag, Berlin, 2001).
[29] V. B. Braginsky and F. Y. Khalili, Quantum Measurement (Cambridge University Press, Cambridge, 1992).
[30] C. M. Caves, K. S. Thorne, R. W. P. Drever, V. D. Sandberg, and M. Zimmermann, Rev. Mod. Phys. 52, 341 (1980), URL http://link.aps.org/doi/10.1103/RevModPhys.52.341.
[31] H. M. Wiseman and G. J. Milburn, Quantum Measurement and Control (Cambridge University Press, Cambridge, 2010).
[32] M. G. A. Paris and J. Řeháček, eds., Quantum State Estimation (Springer-Verlag, Berlin, 2004).
[33] V. Giovannetti, S. Lloyd, and L. Maccone, Science 306, 1330 (2004), URL http://www.sciencemag.org/content/306/5700/1330.abstract.
[34] V. Giovannetti, S. Lloyd, and L. Maccone, Nature Photon. 5, 222 (2011), ISSN 1749-4885, URL http://dx.doi.org/10.1038/nphoton.2011.35.
[35] M. Tsang, H. M. Wiseman, and C. M. Caves, Phys. Rev. Lett. 106, 090401 (2011), URL http://link.aps.org/doi/10.1103/PhysRevLett.106.090401.
[36] M. Tsang, Phys. Rev. Lett. 107, 270402 (2011), URL http://link.aps.org/doi/10.1103/PhysRevLett.107.270402.
[37] M. Tsang and R. Nair, Phys. Rev. A 86, 042115 (2012), URL http://link.aps.org/doi/10.1103/PhysRevA.86.042115.
[38] M. Tsang, New Journal of Physics 15, 073005 (2013), URL http://dx.doi.org/10.1088/1367-2630/15/7/073005.
[39] I. Pikovski, M. R. Vanner, M. Aspelmeyer, M. S. Kim, and Č. Brukner, Nature Physics 8, 393 (2012), 1111.1979.
[40] M. P. Blencowe, Phys. Rev. Lett. 111, 021302 (2013), URL http://link.aps.org/doi/10.1103/PhysRevLett.111.021302.


[41] S. L. Braunstein and P. van Loock, Rev. Mod. Phys. 77, 513 (2005), URL http://link.aps.org/doi/10.1103/RevModPhys.77.513.
[42] S. D. Bartlett, T. Rudolph, and R. W. Spekkens, Phys. Rev. A 86, 012103 (2012), URL http://link.aps.org/doi/10.1103/PhysRevA.86.012103.
[43] R. J. Elliott, L. Aggoun, and J. B. Moore, Hidden Markov Models: Estimation and Control (Springer, New York, 1995).
[44] D. F. Walls and G. J. Milburn, Quantum Optics (Springer-Verlag, Berlin, 2008).
[45] F. Y. Khalili, H. Miao, H. Yang, A. H. Safavi-Naeini, O. Painter, and Y. Chen, Phys. Rev. A 86, 033840 (2012), URL http://link.aps.org/doi/10.1103/PhysRevA.86.033840.
[46] A. M. Jayich, J. C. Sankey, K. Børkje, D. Lee, C. Yang, M. Underwood, L. Childress, A. Petrenko, S. M. Girvin, and J. G. E. Harris, New Journal of Physics 14, 115018 (2012), 1209.2730.
[47] M. Tsang, ArXiv e-prints (2013), 1306.2699v1.
[48] Y. Bar-Shalom, R. Li, and T. Kirubarajan, Estimation with Applications to Tracking and Navigation (John Wiley & Sons, New York, 2001).
[49] R. E. Kalman, Journal of Basic Engineering 82, 35 (1960), ISSN 0098-2202, URL http://dx.doi.org/10.1115/1.3662552.
[50] D. Simon, Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches (Wiley, Hoboken, 2006), ISBN 9780470045336, URL http://books.google.com.sg/books?id=urhgTdd8bNUC.
[51] M. Tsang, Phys. Rev. Lett. 108, 170502 (2012), URL http://link.aps.org/doi/10.1103/PhysRevLett.108.170502.
[52] D. Kazakos and P. Papantoni-Kazakos, IEEE Transactions on Automatic Control 25, 950 (1980), ISSN 0018-9286.
[53] A. H. Safavi-Naeini and O. Painter, ArXiv e-prints (2013), 1306.5309.
[54] N. Brahms, T. Botter, S. Schreppler, D. W. C. Brooks, and D. M. Stamper-Kurn, Phys. Rev. Lett. 108, 133601 (2012), URL http://link.aps.org/doi/10.1103/PhysRevLett.108.133601.
[55] R. H. Shumway and D. S. Stoffer, Time Series Analysis and Its Applications (Springer, New York, 2006).
[56] S. Z. Ang, G. I. Harris, W. P. Bowen, and M. Tsang, New Journal of Physics 15, 103028 (2013), URL http://stacks.iop.org/1367-2630/15/i=10/a=103028.
[57] C. W. F. Everitt, D. B. DeBra, B. W. Parkinson, J. P. Turneaure, J. W. Conklin, M. I. Heifetz, G. M. Keiser, A. S. Silbergleit, T. Holmes, J. Kolodziejczak, et al., Phys. Rev. Lett. 106, 221101 (2011), URL http://link.aps.org/doi/10.1103/PhysRevLett.106.221101.
[58] K. Zhou, J. C. Doyle, and K. Glover, Robust and Optimal Control (Prentice Hall, Englewood Cliffs, 1996).
[59] Defense.gov news transcript: DoD news briefing - Secretary Rumsfeld and Gen. Myers, http://www.defense.gov/transcripts/transcript.aspx?transcriptid=2636 (2002).
[60] C. W. Gardiner, Stochastic Methods: A Handbook for the Natural and Social Sciences (Springer, Berlin, 2010), ISBN 9783642089626.
[61] J. D. Thompson, B. M. Zwickl, A. M. Jayich, F. Marquardt, S. M. Girvin, and J. G. E. Harris, Nature (London) 452, 72 (2008), 0707.1724.
[62] J. C. Sankey, C. Yang, B. M. Zwickl, A. M. Jayich, and J. G. E. Harris, Nature Physics 6, 707 (2010), 1002.4158.
[63] C. W. Gardiner and P. Zoller, Quantum Noise (Springer-Verlag, Berlin, 2004).
[64] M. Tsang, ArXiv e-prints (2013), 1310.0291.
[65] D. H. Santamore, A. C. Doherty, and M. C. Cross, Phys. Rev. B 70, 144301 (2004), URL http://link.aps.org/doi/10.1103/PhysRevB.70.144301.
[66] D. H. Santamore, H.-S. Goan, G. J. Milburn, and M. L. Roukes, Phys. Rev. A 70, 052105 (2004), URL http://link.aps.org/doi/10.1103/PhysRevA.70.052105.
[67] K. Jacobs, P. Lougovski, and M. Blencowe, Phys. Rev. Lett. 98, 147201 (2007), URL http://link.aps.org/doi/10.1103/PhysRevLett.98.147201.
[68] H. Miao, S. Danilishin, T. Corbitt, and Y. Chen, Phys. Rev. Lett. 103, 100402 (2009), URL http://link.aps.org/doi/10.1103/PhysRevLett.103.100402.
[69] A. A. Clerk, F. Marquardt, and J. G. E. Harris, Phys. Rev. Lett. 104, 213603 (2010), URL http://link.aps.org/doi/10.1103/PhysRevLett.104.213603.
[70] N. D. Mermin, Rev. Mod. Phys. 65, 803 (1993), URL http://link.aps.org/doi/10.1103/RevModPhys.65.803.
[71] E. Knill, R. Laflamme, and G. Milburn, Nature 409, 46 (2001), URL http://dx.doi.org/10.1038/35051009.
[72] S. Aaronson and A. Arkhipov, ArXiv e-prints (2010), 1011.3245.
[73] N. Brunner, D. Cavalcanti, S. Pironio, V. Scarani, and S. Wehner, ArXiv e-prints (2013), 1303.2849.


[74] C. Emary, N. Lambert, and F. Nori, ArXiv e-prints (2013), 1304.5133.
[75] A. Peres, Fortsch. Phys. 48, 531 (2000).
[76] W. van Dam, R. D. Gill, and P. D. Grunwald, IEEE Transactions on Information Theory 51, 2812 (2005), ISSN 0018-9448.
[77] C. M. Caves, ArXiv e-prints (2013), 1302.1864.
[78] R. P. Feynman, Engineering and Science 37, 10 (1974).
[79] R. S. Liptser and A. N. Shiryaev, Statistics of Random Processes: I. General Theory (Springer, Berlin, 2000), ISBN 9783540639282, URL http://books.google.com.sg/books?id=7An21SYEATsC.
[80] R. S. Liptser and A. N. Shiryaev, Statistics of Random Processes: II. Applications (Springer, Berlin, 2000), ISBN 9783540639282, URL http://books.google.com.sg/books?id=7An21SYEATsC.
[81] M. Tsang, Phys. Rev. A 80, 033840 (2009), URL http://link.aps.org/doi/10.1103/PhysRevA.80.033840.
[82] M. Tsang, Phys. Rev. A 81, 013824 (2010), URL http://link.aps.org/doi/10.1103/PhysRevA.81.013824.
[83] E. Pardoux, Stochastics 6, 193 (1982), URL http://www.tandfonline.com/doi/abs/10.1080/17442508208833204.
[84] H. E. Rauch, F. Tung, and C. T. Striebel, AIAA Journal 3, 1445 (1965), ISSN 0001-1452, URL http://dx.doi.org/10.2514/3.3166.
[85] R. E. Kalman and R. S. Bucy, Journal of Basic Engineering 83, 95 (1961), ISSN 0098-2202, URL http://dx.doi.org/10.1115/1.3658902.
[86] D. Q. Mayne, Automatica 4, 73 (1966), ISSN 0005-1098, URL http://dx.doi.org/10.1016/0005-1098(66)90019-7.
[87] D. Fraser and J. Potter, IEEE Transactions on Automatic Control 14, 387 (1969), ISSN 0018-9286.
[88] W. K. Wootters and B. D. Fields, Annals of Physics 191, 363 (1989), ISSN 0003-4916, URL http://www.sciencedirect.com/science/article/pii/0003491689903229.
[89] M. Tsang and C. M. Caves, Phys. Rev. X 2, 031016 (2012), URL http://link.aps.org/doi/10.1103/PhysRevX.2.031016.
[90] B. O. Koopman, Proceedings of the National Academy of Sciences 17, 315 (1931), URL http://www.pnas.org/content/17/5/315.short.
[91] J. Gough and M. James, IEEE Transactions on Automatic Control 54, 2530 (2009), ISSN 0018-9286.
[92] W. K. Wootters, Annals of Physics 176, 1 (1987), ISSN 0003-4916, URL http://www.sciencedirect.com/science/article/pii/000349168790176X.
[93] K. S. Gibbons, M. J. Hoffman, and W. K. Wootters, Phys. Rev. A 70, 062101 (2004), URL http://link.aps.org/doi/10.1103/PhysRevA.70.062101.
[94] C. Ferrie, Reports on Progress in Physics 74, 116001 (2011), URL http://stacks.iop.org/0034-4885/74/i=11/a=116001.
[95] V. Veitch, C. Ferrie, D. Gross, and J. Emerson, New Journal of Physics 14, 113011 (2012), 1201.1256.
[96] V. Veitch, S. A. H. Mousavian, D. Gottesman, and J. Emerson, ArXiv e-prints (2013), 1307.7171.
[97] M. Hillery, R. F. O'Connell, M. O. Scully, and E. P. Wigner, Phys. Rep. 106, 121 (1984), ISSN 0370-1573, URL http://www.sciencedirect.com/science/article/pii/0370157384901601.
[98] H. P. Breuer and F. Petruccione, The Theory of Open Quantum Systems (Oxford University Press, Oxford, 2002).
[99] A. M. Gleason, J. Math. Mech 6, 885 (1957).
[100] V. P. Belavkin, ArXiv Mathematical Physics e-prints (2007), arXiv:math-ph/0702079.
[101] L. Bouten, R. Van Handel, and M. James, SIAM Journal on Control and Optimization 46, 2199 (2007), URL http://epubs.siam.org/doi/abs/10.1137/060651239.
[102] J. Gambetta and H. M. Wiseman, Phys. Rev. A 64, 042105 (2001), URL http://link.aps.org/doi/10.1103/PhysRevA.64.042105.
[103] M. Tsang, Phys. Rev. Lett. 102, 250403 (2009), URL http://link.aps.org/doi/10.1103/PhysRevLett.102.250403.
[104] S. Gammelmark, B. Julsgaard, and K. Mølmer, Phys. Rev. Lett. 111, 160401 (2013), URL http://link.aps.org/doi/10.1103/PhysRevLett.111.160401.
[105] T. A. Wheatley, D. W. Berry, H. Yonezawa, D. Nakane, H. Arao, D. T. Pope, T. C. Ralph, H. M. Wiseman, A. Furusawa, and E. H. Huntington, Phys. Rev. Lett. 104, 093601 (2010), URL http://link.aps.org/doi/10.1103/PhysRevLett.104.093601.
[106] H. Yonezawa, D. Nakane, T. A. Wheatley, K. Iwasawa, S. Takeda, H. Arao, K. Ohki, K. Tsumura, D. W. Berry, T. C. Ralph, et al., Science 337, 1514 (2012), URL http://www.sciencemag.org/content/337/6101/1514.abstract.
[107] K. Iwasawa, K. Makino, H. Yonezawa, M. Tsang, A. Davidovic, E. Huntington, and A. Furusawa, Phys. Rev. Lett. 111, 163602 (2013), URL http://link.aps.org/doi/10.1103/PhysRevLett.111.163602.


[108] Y. Aharonov, P. G. Bergmann, and J. L. Lebowitz, Phys. Rev. 134, B1410 (1964), URL http://link.aps.org/doi/10.1103/PhysRev.134.B1410.
