<<

arXiv:1705.03636v1 [quant-ph] 10 May 2017 hoeia rfntoa nltcldtisi hsitouto.Th introduction. this in details analytical or theoretical steiett arxwe b xn notoomlbss eidentif we ) orthonormal an fixing (by when identity the is = Ω sacleto ( collection a as aefis yloiga h ahmtclysmlrdsuso sfollows as discussion simpler mathematically the v giv at obtain we looking can infinite by paper, one of first However, this case observables. ‘continuous’ of optimal also rest the involving the characterize case In general the properties. a in above, these outlined with properties fin optimality associated of the observables introductio define discrete this considering formally only In systems, observables setting given. simple optimal a are of within pre-processing lems review and extensive post- outcome with An the dealing or interrelated. state are initial its the they either or fro modifying nature by quantum informational obtained more or be a can classical observa of it th of The measurement determine a either observable. to completely noise reduced this to of enough types we decisive how different be matter may bet no it differentiate surement to or enough system powerful ot the be the of observa may than as observable better them The considered call teria: be we can thus observables and some measurement However, quantum a of comes one prtr on operators bounded esrmn of measurement srpeetda est matrix density number a as represented is M savrie,ltu rtcnie POVM a consider first us let advertised, As observab quantum for optimality of notions various these study We th describes (POVM) measure valued positive normalized A sampwihasgst ahsubset each to assigns which map a is { H Abstract. ASnmes 36.a 03.67.–a 03.65.Ta, a numbers: joint PACS discuss also We imp observables. states. quantum are pure optimal conditions of of set sufficient ments the and pre-processin within necessary inv of and complete We o characterization given, mixing, complete measurement. is post-processing, the a observables classical after Especially, by or caused before nature. noise system from the free quant W of are measured the state paper. of a this values or in the determine studied that are observables (POVMs) for measures valued operator itive x p 1 x , (denote i r[ tr = 2 x , . . . M RK HOO APSL N UAPKAPELLONP JUHA-PEKKA AND HAAPASALO THEODOR ERKKA ρ M M 1 aiu om fotmlt o unu bevbe ecie sn as described observables quantum for optimality of forms Various d , i N spromdadtesse si h iiilo nu)state input) or (initial the in is system the and performed is ] M dim = } ∈ 2 , . . . , PIA UNU OBSERVABLES QUANTUM OPTIMAL H nafiiedmninlqatmsse ihteascae Hilbert associated the with system quantum finite-dimensional a on [0 , iheeet ftemti matrix the of elements with ]i nepee stepoaiiyo etn noutcome an getting of probability the as interpreted is 1] H M N < fpstv semidefinite positive of ) ∞ ρ hti,apstv eient arxo rc ,adthe and 1, of matrix semidefinite positive a is, that , .Ti en htw ontne og nomeasure into go to need not do we that means This ). 1. Introduction X 1 fΩapstv matrix positive a Ω of M ihfiieymn aus(routcomes) (or values many finitely with d × d t ihpoaiitccertainty probabilistic with ity –matrices r-rcsigo quantum of pre-processing r n otpoesn clean post-processing and g M esacrigt ieetcri- different to according hers lal nih nti general this in insight aluable siaeosralswhich observables estigate h esrmn fwhich of measurement the m enaygvniiilstates initial given any ween d sdo informationally on osed ( dcaatrz observables characterize nd ,w prahteeprob- these approach we n, iecharacterizations give e POVM e dsqeta measure- sequential nd n e eut especially results new and C entoso optimality of definitions e lso unu system. quantum a of bles l a lob refrom free be also may ble dmninlssesand systems -dimensional .Asaeo h system the of state A ). t-iesoa quantum ite-dimensional esrmn antbe cannot measurement e n netgt how investigate and les M statistics. . ttsiso h out- the of statistics e tt fe h mea- the after state e M y ( X i H raie pos- ormalized A ¨ uhthat such = ) A M ¨ with a eviewed be can P ρ C Actually, . x x d i ∈ i P n the and X hna when i N M =1 i M so i 2 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨ that tr [ρM(X)] is the probability of getting an outcome belonging to the set X. Especially, M x = M . { i} i We fix a POVM M as above and study its different optimality criteria (in the categories of discrete POVMs in finite ). For that we will need another discrete POVM M′, or Cd′ (M1′ , M2′ ,..., MN′ ′ ), which acts in a d′-dimensional Hilbert space ′ = . Without restricting H ∼1 generality, we will assume that the matrices Mi and Mj′ are nonzero. Write M = mi λ ϕ ϕ = mi d d where the eigenvectors ϕ , k =1,..., m , i k=1 ik| ik ih ik| k=1 | ik ih ik| ik i form an orthonormalP set, the eigenvaluesP λik are positive (and bounded by 1), and dik = √λikϕik. We say that mi is the multiplicity of the outcome xi or the rank of Mi, and M is of rank 1 if mi = 1 for all i =1,...,N. Recall that rank-1 observables have many important properties [3, 21, 32, 33, 34, 35]. For example, their measurements break entanglement completely between the system and its environment [33]. One can define a (maximal) rank-1 refinement POVM 1 1 1 1 N M of M via Mik = dik dik , i = 1,...,N, k = 1,...,mi. Now M has N = i=1 mi | ih | mi 1 1 outcomes and pi = tr [ρMi]= k=1 tr [ρMik] (i.e. M is a relabeling of M ) thus showing thatP any 1 measurement of M can be viewedP as a measurement of M, the so-called complete measurement, since the value space of M1 ‘contains’ also the multiplicities k m of the measurement ≤ i outcomes xi of M, see [33, 34, 35] for further properties of complete measurements. Let then be a Hilbert space spanned by an eik where i = 1,...,N H⊕ 1 and k = 1,...,mi. Obviously, dim = N . Define a discrete normalized projection valued H⊕ mi measure (PVM) P =(P1,..., PN ) of via Pi = k=1 eik eik so that Pi is spanned by H⊕ | ih | NH⊕ the vectors eik, k = 1,...,mi, and we may write (theP ) = (Pi ). Define ⊕ i=1 ⊕ N mi H H an J : , J = eik dik for which J ∗PiJ = MLi. Hence, , J, P H → H⊕ i=1 k=1 | ih | H⊕ is a Na˘ımark dilation of M.2 NoteP thatP one can identify with a (closed) subspace J of , H H H⊕ equipped with the projection JJ ∗ from onto J , and we may briefly write = ⊥. H⊕ H H⊕ H ⊕ H Especially, any state ρ of can be viewed as a state JρJ ∗ of the bigger space . By using this interpretation, any measurementH of P in the subsystem’s state can be viewedH as⊕ a measurement of M via pi = tr [ρMi] = tr [JρJ ∗Pi]. Finally, we note that M is a PVM if and only if J is unitary (i.e. dik i,k is an orthonormal basis of ). In this case one can identify with { } H H⊕ H and P with M e.g. by setting eik = dik. Remark 1. Let , J, P be a (minimal) Na˘ımark dilation of the POVM M as above. Without H⊕ restricting generality, one can pick any orthonormal basis en ∞ of an infinite-dimensional { }n=1 Hilbert space and choose eik = ei ek so that becomes a (closed) subspace H∞ m⊗i ∈ H∞ ⊗H∞ H⊕ of and Pi = ei ei ek ek Pi′ IM where Pi′ = ei ei , i = 1,...,N, H∞ ⊗ H∞ | ih | ⊗ k=1 | ih | ≤ ⊗ | ih | constitutes a rank-1 PVM P′ in anPN-dimensional space N = lin ei i =1,...,N and IM = M H { | } ek ek is the identity operator of M = lin ei i =1,...,M where M = maxi N mi . k=1 | ih | H { | } ≤ { } InP addition, J can be interpreted as an isometry from into N M by the same formula N mi 3 H H ⊗ H J = i=1 k=1 ei ek dik and we have Mi = J ∗(Pi′ IM )J = ΦJ (Pi′ ) where ΦJ is a | ⊗ ih | ⊗ M (completelyP P positive) Heisenberg channel, ΦJ (B) = J ∗(B IM )J = s=1 As∗BAs where B is C 4 ⊗ an N N–matrix (i.e. B N ( )). P × ∈M 1 If, for instance, Mi = 0 then pi = 0 regardless of the state ρ so the outcome xi is never obtained and we may replace Ω by Ω xi and similarly remove all outcomes related to zero matrices. 2 \{ } The dilation is minimal, that is, the span of vectors PiJφ, i =1,...,N, φ , is the whole ⊕. Indeed, this N mi N mi ∈ H −1P H follows immediately from equation ψ = i=1 k=1 eik ψ eik = i=1 k=1 eik ψ λik iJdik where ψ ⊕. 3 ′ h | i h | i ∈ H Hence, N M , J, P IM is aP Na˘ımarkP dilation of M, whichP P is minimal if and only if mi = M for all H ⊗ H ⊗ i =1,...,N.  4 N Note that the Kraus operators As = ei dis are linearly independent, i.e. the Kraus decomposition i=1 | ih | of ΦJ is minimal. In addition, the correspondingP Schr¨odinger channel (ΦJ )∗ transforms a d d–state ρ to the ′ M ∗ ′ ′ ′ × N N–state ρ = AsρA and pi = tr [ρMi] = tr [ρ P ]= ei ρ ei holds. × s=1 s i h | | i P OPTIMAL QUANTUM OBSERVABLES 3

Davies and Lewis [11] introduced the concept of instrument which turned out to be crucial in developing quantum measurement theory since, besides measurement statistics, it also de- scribes the conditional state changes due to a quantum measuring process. For example, if the measurement outcome set is finite, Ω = x1,...,xN , then any (Schr¨odinger) instrument describing a measurement of M (with the{ outcomes} Ω), can be viewed as a collection of I N (completely positive) operations5 on (C) such that tr [ (ρ)] = 1 for any state ρ. Ii Md i=1 Ii Now transforms an input state ρ to a (nonnormalized) outputP state i(ρ) if xi is obtained. I I In addition, defines the measurement outcome probabilities pi and the corresponding POVM I N M via p = tr [ (ρ)] = tr [ρM ].6 Note that ρ (ρ) is a (Schr¨odinger) channel which i Ii i 7→ i=1 Ii transforms any state of the system to another stateP of the same system. More generally, a quan- tum channel is a completely positive trace-preserving (cptp) between state spaces associated to quantum systems (with possibly different Hilbert spaces , ′) so that channels transmit quantum information between different systems. Similarly,HwithH possibly different d d′ input and output spaces = C and ′ = C , one may also assume an initial state of to H ∼ H ∼ H transform into conditional states of ′ as a result of the measurement prompting to describe H the measurement through an instrument with Schr¨odinger operations i : d(C) d′ (C). Now we are ready to introduce the followingI six optimality criteria forI MM: →M (1a) M determines the future of the system (completely) if each instrument implementing M is nuclear (or preparatory), i.e., of the form (ρ)= p σ where σ ’s areI density matrices Ii i i i (of any fixed output Hilbert space ′) which do no depend on the input state ρ. If H the outcome xi is obtained with the nonzero probability pi = tr [ρMi] then the output system is in the ρ-independent state σi after the measurement. It can be shown that M determines the future if and only if M is of rank 1, i.e. each M is of the form d d i | i ih i| where di [21, 32]. ∈ H N ′ (1b) M is post-processing maximal (post-processing clean) if the condition Mi = j=1 pji′ Mj′ for all i (where (p′ ) is N ′ N–probability matrix and M′ =(M′ ,..., M′ ′ ) isP a POVM ji × 1 N M N M of the same Hilbert space ′ = ) implies that j′ = i=1 pij i for all j where (pij) H 7 H is N N ′–probability matrix. The condition Mi = P p′ M′ yields pi = tr [ρMi] = × j ji j j pji′ tr ρMj′ thus showing that, instead of measuringPM, one can measure M′ in the sameP state ρ and then classically post-process the data by using the matrix (pij′ ). Post- processing clean POVMs are free from this type of classical noise and it is easy to show that M is post-processing clean if and only if M is of rank 1 [13, Theorem 3.4]. (2a) M determines the past of the system if it is informationally complete, i.e. the mea- N surement outcome statistics (pi)i=1 determines the input state ρ, i.e. the condition tr [ρMi] = tr [ρ′Mi] for all i implies that ρ′ = ρ. Clearly, M determines the past of the system if and only if N d2 and any d d–matrix B can be written as a linear ≥ × combination of matrices Mi, i =1,...,N [5, Prop. 18.1]. We will construct later an in- formationally complete (extreme) rank-1 POVM M with the minimum number N = d2 of outcomes. Generally, an informationally complete POVM need not be rank-1 but if M is informationally complete then its rank-1 refinement M1 is also informationally complete [33].

5 A A∗ For any i, i(ρ) = s isρ is (a Kraus decomposition) and the dual (Heisenberg) operation is i(B) = ∗ A∗I A J i (B)= s isB is whereP B is any d d–matrix. I 6 × M A∗ A In otherP words, using the Kraus decompositions of the operations, i = s is is. 7 Recall that (pij ) is a probability (or stochastic or Markov) matrix if pij P0 and j pij = 1 for all i. The M′ M≥ M′ M numbers pij are transition probabilities and is said to be a smearing of if j =P i pij i holds. In this ′ case, M and M are jointly measurable, a joint observable being Nij = pij Mi. P 4 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨

(2b) M is extreme if it is an extremal point of the of all (discrete) POVMs of , M 1 M 1 M M M H i.e. = 2 ′ + 2 ′′ implies ′ = . Thus, extreme observables describe statistics of the pure quantum measurements, free from any classical randomness due to fluctuations in the measuring procedure (in the same way as pure states describe preparation procedures without classical randomness). One can show that M is extreme if and only if the matrices dik diℓ , i =1,...,N, k,ℓ =1,...,mi, are linearly independent [30, Theorem 2.4], [9, Theorem| ih | 2]. Especially, if M is rank-1, i.e. M = d d , d = 0, then it is i | i ih i| i 6 extreme if and only if the matrices Mi are linearly independent. In this case, it is informationally complete if and only if N = d2. Trivially, if M is extreme then its rank-1 refinement M1 is also extreme thus8 showing that any extreme informationally completely POVM is necessarily of rank 1. Finally, we note that PVMs are automatically extreme. (3a) M determines its values xi if each Mi is of (operator) 1, i.e. Mi has the eigenvalue 1 (with the unit eigenvector ϕi). In this case, for any outcome xj one can pick a state ρ = ϕj ϕj such that pi = tr [ρMi]= δij for all i, i.e. the observable M has the value xj in the| stateih ρ| with probabilistic certainty.9 Clearly, a rank-1 norm-1 POVM is a PVM10 and any PVM is of norm-1. (3b) M is pre-processing maximal (pre-processing clean) if the condition Mi = Φ(Mi′ ) for all i (where Φ : d′ (C) d(C) is a Heisenberg channel and M′ is a POVM on the possibly M →M d′ different Hilbert space ′ = C with N ′ = N outcomes) implies that M′ = Θ(M ) for all H ∼ i i i where Θ : (C) ′ (C) is some Heisenberg channel. The condition M = Φ(M′ ) Md →Md i i can be written in the form pi = tr [ρMi] = tr [ρΦ(Mi′ )] = tr [Φ (ρ)Mi′ ] so that to get the ∗ probabilities pi one can equally well measure M′ in the state Φ (ρ), i.e. M′ is ‘better’ measurement in this sense and M is obtained from it by adding∗ quantum noise in ρ (characterized by the channel Φ). Hence, pre-processing clean observables are free from this type of quantum noise. Since, using the Na˘ımark dilation ( , P,J) of M, H⊕ Mi = J ∗PiJ = Φ(Pi), where Φ is the (rank-1) isometry channel J ∗( )J, to show that · 11 M is pre-processing clean, one must find a channel Θ such that Pi = Θ(Mi) holds and 12 thus each Mi is of norm 1, i.e. M determines its values. We will show that, in finite dimensions, pre-processing clean POVMs are exactly norm-1 POVMs and exactly of the form Mi = Ei Fi where E is a PVM (Ei = 0 for all i) and F a POVM (Θ(Fi)=0 for all i) acting⊕ on orthogonal subspaces of6 . Hence, for any pre-processing clean POVM M, there exists a projection (onto a subspace)H such that the projected POVM E is projection valued. Especially, PVMs are pre-processing clean [31]. We have seen that ‘optimal observables’ must be of rank 1, see (1a) and (1b) above. More- over, observables satisfying (2a) or (2b) can be maximally refined into rank-1 observables which share the same optimality criteria as the original POVMs. If M is rank-1 then the map CN (c ,...,c ) N c M is surjective iff (2a) holds and injective iff (2b) holds. We ∋ 1 2 7→ i=1 i i will construct a POVM forP which all conditions (1a), (1b), (2a), and (2b) hold. PVMs are optimal observables in the sense of (3a) and (3b). In addition, the rank-1 refinement of a PVM is also projection valued (and of norm-1). However, it is easy to construct a norm-1 POVM

8 2 1 N 2 2 Since d = N = mi N and N d imply mi 1 and N = d . i=1 ≥ ≥ ≡ 9Note that this holdsP only in finite dimensions. Generally, a norm-1 (effect) operator can have a fully (i.e. no eigenvalues at all). However, even in such a case, for each j and any ε (0, 1), ∈ there is a state ρ = ϕj ϕj such that tr [ρMj ]=1 ε. 10This holds for discrete| ih observables.| As a counterexample,− consider the canonical phase observable which is rank-1 norm-1 ‘continuous’ POVM but not projection valued. 11 M P′ P′ Actually, Remark 1 shows that i =ΦJ ( i) must be connected to a rank-1 PVM via some channel. 12 1= Pi = Θ(Mi) Θ Mi = Mi 1 implies Mi = 1. k k k k≤k kk k k k≤ k k OPTIMAL QUANTUM OBSERVABLES 5 whose rank-1 refinement is not of norm 1. For example, in C3 (with the basis 0 , 1 , 2 ), one M 1 M 2 | i | i | i can define 2-valued norm-1 POVM 1 = 1 1 + 3 0 0 , 2 = 2 2 + 3 0 0 whose refine- M1 M1 1 | ih | 1 | ih | M | ihE | F | ih | E ment has an effect 12 = 3 0 0 of norm 3 . Note that i = i i where 1 = 1 1 , E | ih | F 1 ⊕ F 2 | ih | 2 = 2 2 constitutes a PVM in a 2-dimensional space, and 1 = 3 0 0 , 2 = 3 0 0 . To| conclude,ih | there are essentially two sorts of optimal observables: rank-1| ih | PVMs and| ih extreme| informationally complete POVMs. Since they are extreme (2b) and rank-1, i.e. post processing clean (1b), they are free from classical noise due to the mixing of measurement schemes or data processing. Moreover, they determine the future of the system (1a). It is easy to show that a pre-processing clean POVM (e.g. a PVM) cannot be informationally complete and vice versa,13 i.e. an informationally complete POVM is never free from quantum noise. Moreover, the determination of the past (2a) and the values (3a) are complementary properties. However, when one assumes that only a restricted class of states (related to a subspace ) can be determined completely then these complementary properties can be combinedH as ⊆ follows: H⊕ 2 One can pick a d -outcome extreme informationally complete rank-1 POVM Mi = di di , 2 | ih | i =1,...,d , and its minimal Na˘ımark dilation with the rank-1 PVM Pi = ei ei acting in a 2 d2 | ih | d -dimensional space = ⊥ with the orthogonal basis ei i=1 (recall that d = dim ). H⊕ H ⊕ H { } H Now, for any subsystem’s state ρ, one gets pi = di ρ di = ei JρJ ∗ ei and Pi is informationally complete only within the set of states of theh subspace| | i h.| Instead| i of measuring M one can H P prepare a state of ∼= J and then perform a measurement of to get probabilities pi and posterior statesHσ (seeH item (1a) above) since the nuclear instrument (ρ) = d ρ d σ i Ii h i| | ii i implementing M can be trivially extended to an instrument of P via i(ρ)= ei ρ ei σi where ρ is a state of . Below we study sequential and joint measurementsI I of optimalh | POV| i Ms with other observables.H⊕ N N ′ Let M = (Mi)i=1 and M′ = (Mj′ )j=1 be POVMs as in the beginning of this introduction, and let = ( i) be an instrument implementing M (i.e. any i : d(C) d′ (C) is of the I I A A M A A I M → M M form i(ρ) = s isρ is∗ and i = s is∗ is). Suppose then that one measures first in I 1 the state ρ (describedP by ) and thenP M′ in the transformed (conditional) state p− i(ρ) if I i I the outcome xi is obtained (with the probability pi > 0) in the first measurement of M. This sequential measurement can be described by a joint POVM J =(Jij) where Jij = i∗(Mj′ ) since 1 I the conditional probability is tr [ρJ ] = tr p− (ρ)M′ p . Hence, a sequential measurement ij i Ii j i of M and M′ can be interpreted as a joint measurement of M and the disturbed POVM M′′, M′′ = J = Φ(M′ ) of the same Hilbert space (here Φ = ∗ is the total Heisenberg j i ij j i Ii channelP of ). P I Indeed, any POVMs M = (M ) and M′′ = (M′′) (of the same Hilbert space ′′ = ) are i j H H jointly measurable if there exists a POVM N =(Nij) such that M and M′′ are the margins of N, N ′′ N i.e., Mi = j=1 Nij and Mj′′ = i=1 Nij. If , J, P is a (minimal) Na˘ımark dilation of M then 14 H⊕ it is easyP to show that Nij =PJ ∗PijJ where Pij is a (unique) positive semidefinite mi mi– 15 × matrix such that j Pij = Pi. Hence, for each i N, the map j Pij is a POVM with N ′′ ≤ 7→ N ′′ values so that thereP is a channel Φi such that Φi(Pj′ )= Pij, see Remark 1. Here P′ =(Pj′ )j=1 is 16 N ′′ a fixed rank-1 PVM acting in a minimal N ′′-dimensional Hilbert space ′′ = C . Define HN ∼ an instrument by ∗(B) = J ∗Φ (B)J, B ′′ (C). Clearly, N = ∗(P′ ) and, thus, any I Ii i ∈ MN ij Ii j

13Since a pre-processing clean POVM has at most N = d outcomes and an informationally complete POVM has at least N = d2 outcomes. ′′ 14 N N N ∗P P P Since ij j=1 ij = J iJ implies the existence of ij i (mi mi–identity matrix). 15 ≤ ≤ × This POVMP acts in the subspace Pi ⊕ and is normalized there. ′′ N H ′′ ′ 16 M N N ′ M N P P If, say, k = 0 then ik i′=1 i k = k yields ik = 0 and thus ik = 0 for all i N. Hence, k =0 ′′ ≤ ′′ ′′ ≤ ′′ and N is the number of nonzeroP effects M since usually we assume that M = 0 for all j =1,...,N . j j 6 6 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨ joint measurement of M and M′′ can be interpreted as a sequential measurement of M followed by a rank-1 PVM P′. Note that Mj′′ = Φ′(Pj′ ) where Φ′(B)= J ∗ i Φi(B)J. In conlusion, a sequential measurement of M and M′ definesP a joint obsevable J with the margins M and M′′ = Φ(M′). If we put N = J above, we see that this measurement of J can be interpreted as a new sequential measurement of M and a rank-1 PVM P′. In addition, M′′ = Φ′(P′). Thus, the latter observable in a sequential set-up can be assumed to be very optimal: free from both classical and quantum noise. Next we study how the optimality criteria (1a)—(3b) affect the joint measurability of an optimal observable with other observables. (1) If M is rank-1 (m 1) then any P is a 1 1–matrix, i.e. a number p , and N = i ≡ ij × ij ij pijMi is rank-1 (and a post-processing of M). Since (pij) is a probability matrix, also M′′ is a smearing (post-processing) of M, i.e. Mj′′ = i pijMi, see item (1b) above. Moreover, the M-compatible instrument is nuclear (1a)P and N = J = tr σ M′ M I ij ij i j i from where one can read the transition probabilities pij = tr σiMj′ (where the states σi determines completely) showing that, if one gets xi in the first M-measurement, I then the instrument ‘prepares’ the post-measurement state σi which is the input state for the second M′-measurement giving the probability distribution pij, j = 1,...,N ′, and tr [ρJij] = pipij where pi = tr [ρMi]. Hence, after a measurement of a rank-1 POVM there is no need to perform any extra measurements to get more information. It should be stressed that, even if Mj′′ = Φ′(Pj′ ), the entanglement breaking channel Φ′ 17 (associated to a nuclear instrument ′ of M) [32] adds so much quantum noise to the I rank-1 PVM P′ that it becomes the fuzzy version M′′ of M. Hence, in this sequential measurement, the latter observable M′′, which arises as the second marginal of the joint observable J, is obtained both through adding classical noise to the observable M

first measured (i.e., as a post-processing Mj′′ = i pijMi) and through adding quantum noise to the observable M′ actually measured afterP M in the form of the pre-processing Mj′′ = Φ(Mj′ )= i tr σiMj′ Mi. The same results naturally apply in the situation where we modify the measurementP   of the first observable M and measure some rank-1 PVM P′ after it to obtain M′′ as the second marginal. In this case M′′ is a classical smearing of the rank-1 M and a quantum smearing of the rank-1 PVM P′. (2) If M is informationally complete (2a) then N = J is also informationally complete.18 Hence, trivially, if already M determines the past, then its subsequent measurements cannot increase the (already maximal) state distinguishing power. Suppose now that N = J = ∗(M′ ) for some instrument measuring the informationally complete M ij ij Ii j I and some subsequently measured M′ giving rise to the second marginal Mj′′ = Φ(Mj′ ) for the total channel Φ = ∗. If we also assume that M′′ is informationally complete, i Ii i.e., we jointly measure twoP informationally complete observables in a sequential setting, then the Heisenberg channel Φ is surjective (in this finite-dimensional case) and the 19 corresponding Schr¨odinger channel Φ = i is injective. ∗ i I If M is extreme (2b) then N is the uniqueP joint POVM which has the margins M and 20 M′′. If, in addition, M is rank-1 then Nij = pijMi and Mj′′ = i pijMi and we have the chain of : N M′′ (pij) N. P 7→ 7→ 7→ 17 P′ ′ ′ M ′ Since j = ej ej one can define states σi = j pij ej ej and a nuclear instrument i(ρ) = tr [ρ i] σi ′∗ | ih′ M| ′∗ P′ M N | ih ′| ′∗ I ′ M (or i (B) = tr [Bσi] i) such that i ( j )= pij iP= ij and Φ (B)= i i (B)= i tr [Bσi] i. I I I ′′ 18 N M N N If one gets probabilites tr [ρ ij ] then one can solve ρ from the probabilitiesP pi =P tr [ρ i]= j=1 tr [ρ ij ]. 19 M′′ C M′′ Since is informationally complete, the map d( ) ρ tr ρ j j is injective, and,P since this map M′ ∋ 7→ is the composition of the maps Φ∗ and σ tr σM , as the first of these  maps, Φ∗ has to be injective. 7→ j j 20 N ∗P N′ ∗P′    M′′ N N N N′ Since, for ij = J ij J and ij = J ij J, the condition j = i=1 ij = i=1 ij can be written in the ∗ N ′ ′ ′ form J Dj J = 0 where Dj = =1(Pij P ). If M is extreme then PDj 0, i.e. PPij P , and N = N . ⊕i − ij ≡ ≡ ij OPTIMAL QUANTUM OBSERVABLES 7

(3) If M = P is a PVM, i.e. PkPℓ δkℓPk, (and thus = and J is the identity map) then N = P where each map≡j P is a (subnormalized)H⊕ H POVM which commutes ij ij 7→ ij with P, i.e. P P P P , since P P . Thus, any M′′ compatible with a PVM P ij k ≡ k ij ij ≤ i commutes with P. If, moreover, P is of rank-1 then Pij = pijPi. Note that N (or M′′) needs not to be a PVM or even of norm 1 (e.g. consider P1 = 1 1 , P2 = 2 2 = P22, 1 1 2 | ih | | ih | P = P = 1 1 = M′′, M′′ = 1 1 + 2 2 in C ). 11 12 2 | ih | 2 1 2 | ih | | ih | In this paper, we generalize the above results to the case of arbitrary observables (with sufficiently ‘nice’ value spaces) acting in separable Hilbert spaces. For example, consider the single-mode optical field with the Hilbert space spanned by the photon number states n , H | i n = 0, 1, 2,..., associated with the number operator N = a∗a = n∞=0 n n n where a = 1| ih | ∞ √n +1 n n+1 . Define the position and operators Q = (a + a) and P = n=0 P √2 ∗ i | ih | P(a∗ a) which, in the position representation, are the usual multiplication and differentiation √2 − operators, (Qψ)(x) = xψ(x) and (P ψ)(x) = i dψ(x)/dx (we set ~ = 1). Define the Weyl − 21 za∗ za operator (or the displacement operator of the complex ) D(z) = e − , z C. Let ∈ q, p R and z =(q + ip)/√2. Then ∈ ipQ iqP iqp/2 ipQ iqP iqp/2 iqP ipQ D(q,p)= D(z)= e − = e− e e− = e e− e , i.e., for all ψ = L2(R), (eipQψ)(x)= eipxψ(x), (eiqP ψ)(x)= ψ(x + q), and ∈ H ∼ iqp/2 ipx D(q,p)ψ (x)= e− e ψ(x q). − One can measure the following physically relevant POVMs: Rotated quadrature operators Q = (cos θ)Q + (sin θ)P where θ [0, 2π) so that Q = • θ ∈ 0 Q and Qπ/2 = P . In the position representation, the spectral measure of Q is the canonical spectral measure, [Q(X)ψ](x) = χ (x)ψ(x), X R (Borel set), so that X ⊆ the spectral measure (rank-1 PVM) of Qθ is Qθ(X) = R(θ)Q(X)R(θ)∗ where R(θ) = iθN inθ e = ∞ e n n is the (unitary) rotation operator. Rotated quadaratures can n=0 | ih | be measuredP by a balanced homodyne detector where the phase shift θ is caused by a phase shifter. A single Qθ cannot be informationally complete (as a PVM) but the whole measurement assemblage Qθ(X) θ∈[0,π) forms an informationally complete set of { } X⊆R effects. Actually, a rank-1 POVM G (Θ X) = 1 Q (X)dθ) determines the input ht × π Θ θ state completely (optical homodyne tomography, OHT).R Note that Ght(Θ X) 1 dθ< 1 if Θ [0, π) is not of ‘length’ π. k × k ≤ π Θ ⊆ TheR number operator N = ∞ n n n whose spectral measure (rank-1 PVM) is • n=0 | ih | n Nn = n n (an ideal photonP detector with the 100 % efficiency). An7→ unsharp| (rank-ih | ) number observable (POVM) • ∞ ∞ ǫ m n m n n N = ǫ (1 ǫ) − m m 7→ n  n  − | ih | mX=n (a nonideal photon detector with quantum efficiency ǫ [0, 1)). Now Nǫ is neither of ∈ ǫ norm-1 nor informationally complete, since it is commutative [4], and limǫ 1 Nn = Nn (by 00 = 1). → Covariant observables (POVMs) • 1 2 1 GS(Z)= D(z)SD(z)∗d z = D(q,p)SD(q,p)∗dqdp, Z C, π ZZ 2π ZZ ⊆

21Recall that Weyl operators are associated to a of the Heisenberg H, or to a C R2 projective representation of the additive group ∼= . 8 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨

where S is (essentially) the reference state of an eight-port (or double) homodyne de- tector. In practice GS can be viewed as a joint measurement of unsharp rotated quadra- tures (e.g., unsharp position and momentum). Moreover, GS is rank-1 if and only if S = ψ ψ where ψ is a . Note that G (Z) 1 d2z < 1 when the area | ih | k S k ≤ π Z of the set Z is small enough. R Covariant phase observables (POVMs) • ∞ 1 i(n m)θ dθ ΦC (Θ) = R(θ)CR(θ)∗dθ = Cnm e − n m , Θ [0, 2π), 2π Z Z 2π | ih | ⊆ Θ n,mX=0 Θ where C = ∞ C n m is a positive with the unit diagonal n,m=0 nm| ih | (Cnn 1), i.e.P a phase matrix. If Cnm 1 we get the canonical phase observable Φcan, ≡ ≡ ∞ i(n m)θ dθ Φcan(Θ) = e − n m , Θ [0, 2π), Z 2π | ih | ⊆ n,mX=0 Θ

whereas the margin of GS is called a phase space phase observable. Both can be measured by double homodyne detection [36]. Any phase observable is never projection valued and is a preprocessed version of the canonical phase given by a (Schur type) . Note that Φcan is not informationally complete (consider number states) but it is norm-1. However, Φcan is not pre-processing clean since its nontrivial effects cannot have eigenvalues [5, Theorem 8.2]. 1.1. Definitions and mathematical background. In this paper, N = 1, 2,... , i.e., 0 is 0 { } not included in the set of natural numbers. We define an empty sum ( ) to be equal to j=1 ··· zero. When is a Hilbert space, we denote by ( ) the algebra of boundedP linear operators on and byHI the unit element of this algebraL H (the identity operator on ); by ‘Hilbert space’H we alwaysH mean a complex Hilbert space. The inner product of any HilbertH space will be simply denoted by since the Hilbert space in question should always be clear from the context, and the innerh·|·i product is chosen to be linear in the second argument. By ( ), P2 H we denote the set of projections of , i.e., operators P ( ) such that P = P ∗ = P . An operator E ( ) is called effect ifH 0 E I holds. Especially,∈ L H any projection is an effect, the so-called∈L sharpH effect. We let ( ≤) stand≤ H for the set of trace-class operators on , i.e., tr [ T ] < for all T ( ). We denoteT H the set of positive trace-1 operators in ( ) byH ( ); in quantum| | ∞ ,∈ these T H normalized positive states of ( ) are identified withT H the physicalS H states of the system described by . Note that ( ) L H( ) consists of rank-1 projections ψ ψ (where ψ is a unit vector).H P H ∩ S H | Whenih | µ and ν∈are H positive measures on a measurable space (Ω, Σ) (where Ω = is a set and Σ is a σ-algebra of subsets of Ω) we say that µ is absolutely continuous with6 respect∅ to ν and denote µ ν if µ(X) = 0 whenever ν(X) = 0. When both µ ν and ν µ, we denote µ ν and say≪ that µ and ν are equivalent. ≪ ≪ ∼Let (Ω, Σ) be a measurable space and be a Hilbert space. A map M : Σ ( ) is said to be a normalized positive-operator-valuedH measure (POVM) if, for all ρ ( →L), theH set function X tr [ρM(X)], denoted hereafter by pM, is a probability measure or,∈ equivalently, S H M(X) 0 7→ ρ ≥ for all X Σ, M(Ω) = I , and, for any pairwise disjoint X1, X2,... Σ, one has M( X )=∈ M(X ) (ultra)weakly.H Denote the set of POVMs from Σ to ( ) by∈ Obs(Σ, ). ∪j j j j L H H When P(X)P ( ) for all X Σ for a POVM P : Σ ( ), we say that P is a normalized projection-valued∈ P H measure (PVM)∈ or a spectral measure.→L WeH extend the notions of and equivalence introduced above for scalar measures in the obvious way and thus may write, e.g., for a POVM M Obs(Σ, ) and a measure µ : Σ [0, ], M µ if M(X)=0 whenever µ(X) = 0 and, for another∈ POVMH N : Σ ( ), where→ ∞is some≪ Hilbert space, → L K K OPTIMAL QUANTUM OBSERVABLES 9

M N if N(X) = 0 implies M(X) = 0. We say that M is discrete if there exist distinct points ≪N N M N M N M xi i=1 Ω, N , and effects i i=1 ( ) such that = i=1 iδxi where δx { } ⊆ ∈ ∪ {∞} { } ⊆ LMH N M is a Dirac (point) measure concentrated on x. Now i=1 δxi . A discreteP observable ≪ N can naturally be identified with the effects Mi and we will useP the notation (Mi)i=1 for M if the outcomes x Ω are not relevant. Note that, if is separable, and M Obs(Σ, ), picking any state ρ ∈ ( ) which is faithful, i.e., tr [ρA] =H 0 implies A = 0 for any∈ positiveHA ( ) ∈ S H M ∈L H (or, equivalently, the of ρ is 0 ), we have M pρ . In quantum physics, POVMs are{ associated} in a one-to-one∼ fashion with observables of the system. The observables associated with PVMs are called sharp. In this view, the number pM(X) = tr [ρM(X)] [0, 1] is the probability of obtaining a value within the outcome set ρ ∈ X Σ when measuring M Obs(Σ, ) and the system being measured is in the ∈ ρ ( ). In realistic physical∈ experiments,H we measure only discrete observables which in many∈ cases S H can be thought as discretizations of continuous observables, i.e. for any M ∈ Obs(Σ, ) one can choose pairwise disjoint sets Xi Σ whose union is the whole Ω and define a discreteH POVM by M = M(X ) (with the outcome∈ set 1,...,N or N). In this case, one i i { } can replace Σ with the sub-σ-algebra generated by the sets Xi. Let and be C∗-. We say that a linear map Φ : is n-positive (n N) if the mapA B A → B ∈ n ( ) (a )n Φ(a ) ( ) Mn A ∋ ij i,j=1 7→ ij i,j=1 ∈Mn B defined between the n n-matrix algebras over the input and output algebras is positive. If Φ is n-positive for all n ×N, Φ is said to be completely positive. Suppose that and are unital (with units 1 and 1 ∈) in which case Φ is called unital if Φ(1 )=1 . For anyA unitalB 2-positive A B A B map Φ : one has the Schwarz inequality, Φ(a)∗Φ(a) Φ(a∗a) for all a . We further define CPA( →; B ) as the set of completely positive unital linear≤ maps Φ : ∈A( ) whenever A H A→L H is a unital C∗-algebra and is a Hilbert space. Suppose that and are von Neumann Aalgebras. We say that a positiveH map Φ : is normal, if for anyA increasingB (equivalently, decreasing) net (a ) of self-adjoint operators,A → B one has λ λ ⊆A

sup Φ(aλ) = Φ sup aλ , λ  λ  where sup bλ is the supremum (ultraweak ) of the increasing net (equivalently, with sup replaced by inf, the infimum, in the case of a decreasing net). Fix a C∗-algebra and a Hilbert space . For any completely positive map Φ : ( ), A H A→L H there is a Hilbert space , a unital ∗-representation π : ( ), and an isometry J : M A→L M such that Φ(a)= J ∗π(a)J for all a and the linear hull of vectors π(a)Jϕ, a , H→Mϕ , forms a dense subspace of . Such a∈A triple ( ,π,J) is called as a minimal Stinespring∈A ∈ H M M dilation for Φ and it is unique up to unitary equivalence, i.e., if ( ′, π′,J ′) is another minimal M dilation, then there is a U : ′ such that Uπ(a)= π′(a)U for all a M→M ∈A and UJ = J ′. Let and be Hilbert spaces. We call normal completely positive maps Φ : ( ) ( ) satisfyingH Φ(IK) I as operations. When Φ is in addition unital, i.e., Φ(I )L =KI ,→L we callH Φ as a channelK. For≤ H any normal linear map Φ : ( ) ( ) there exists aK (unique)H predual map Φ : ( ) ( ) such that L K →L H ∗ T H → T K tr [Φ (T )A] = tr [T Φ(A)] , T ( ), A ( ). ∗ ∈ T H ∈L K The version Φ : ( ) ( ) is said to be in the and the version Φ : ( ) ( ) isL saidK → to L beH in the Schr¨odinger picture. For a channel Φ, the Schr¨odinger∗ TchannelH → Φ T, whenK restricted onto ( ), describes how the system associated with transforms under Φ into∗ another system associatedS H with . H K 10 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨

Let (Ω, Σ) be a measurable space and and Hilbert spaces. We say that a map : ( ) Σ ( ) is a (Heisenberg) instrumentH ifK J L K × →L H (i) ( ,X): ( ) ( ) is an operation for all X Σ, (ii) J (·, Ω) isL aK channel,→L H and ∈ J · (iii) for any pairwise disjoint sequence X1, X2,... Σ and any A ( ), (A, jXj) = (A, X ) (ultra)weakly. ∈ ∈ L K J ∪ j J j For anyP instrument : ( ) Σ ( ), we define the predual (Scr¨odinger) instrument : ( ) Σ (J ), L K × → L H J∗ T H × → T K (T,X) = [ ( ,X) ](T ), T ( ), X Σ. J∗ J · ∗ ∈ T H ∈ Note that, for an instrument , the map (I , ) : Σ ( ) is a POVM. On the other hand, for any M Obs(Σ, )J and a HilbertJ spaceK · there→ L isH an instrument : ( ) ∈ H K M J L K × Σ ( ) such that M(X) = (I ,X) for all X Σ, i.e., pρ = tr [ (ρ, )]; we call an →M-instrument. L H In a measurementJ K of an observable∈ associated with a POVMJ∗ ·M, the systemJ transforms conditioned by registering an outcome x X. This conditional state transformation is given by the operation ( ,X) where is an M∈-instrument. The operator (ρ, X) is a J∗ · J M J∗ subnormalized state whose trace coincides with the probability pρ (X) of registering an outcome M M 1 in X. If pρ (X) > 0 then [pρ (X)]− (ρ, X) is the corresponding conditional state. J∗ 2. General structure of a quantum observable In this section, we analyse the structure of an observable with a general value space on a system described by a separable Hilbert space. We will refer to the results reviewed in this section several times on the course of this paper. dim Suppose that is a separable Hilbert space and let h = hn n=1H be an orthonormal (ON) basis of and H { } H Vh := linC h 1 n< dim +1 . { n | ≤ H } Note that Vh is dense in . Let Vh× be the algebraic antidual of the Vh, that is, H Vh× is the linear space consisting of all antilinear functions c : Vh C (antilinearity means → that c(αψ + βϕ) = αc(ψ)+ βc(ϕ) for all α, β C and ψ, ϕ Vh). By denoting cn = c(hn) ∈ ∈ dim one sees that Vh× can be identified with the linear space of formal c = n=1H cnhn where cn’s are arbitrary complex numbers. Hence, Vh Vh×. Denote the dualP pairing dim ⊆ H ⊆ ψ c := c(ψ) = n=1H ψ hn cn and c ψ := ψ c for all ψ Vh and c Vh×. Especially, h | i h | i h | i h | i dim ∈ ∈ c = h c . We sayP that a mapping c : Ω Vh×, x H c (x)h is (weak∗-)measurable n h n| i → 7→ n=1 n n if its components x cn(x) are measurable [23]. NoteP that, if c : Ω Vh× is weak∗- measurable then the7→ maps x ψ c(x) are measurable for all ψ . → H ⊆ 7→ h | i ∈ H Let (Ω, Σ) be a measurable space and denote a direct ⊕ (x) dµ(x) of separable H⊕ Ω H Hilbert spaces (x) such that dim (x)= m(x) N 0, ; hereRµ is a σ-finite nonnegative 22 H H ∈ ∪{ ∞} measure on (Ω, Σ) [12]. For each f L∞(µ), we denote briefly by fˆ the multiplicative ∈ (i.e. diagonalizable) (fψˆ )(x) := f(x)ψ(x) on . Especially, one has the canonical spectral measure Σ X P (X) :=χ ˆ ( ) (whereH⊕ χ is the characteristic ∋ 7→ ⊕ X ∈ L H⊕ X function of X Σ). We say that an operator D ( ) is decomposable if there is a weakly µ-measurable field∈ of operators Ω x D(x) ∈L H(x⊕) such that (Dψ)(x)= D(x)ψ(x) for ∋ 7→ ∈L H all ψ and µ-a.a x Ω; it is often denoted  ∈ H⊕ ∈ ⊕ D = D(x) dµ(x). ZΩ 22Note that µ can be a probability measure everywhere in this paper; any σ-finite measure is equivalent with a probability measure. OPTIMAL QUANTUM OBSERVABLES 11

We have the following theorem proved in [23, 31]: Theorem 1. Let M : Σ ( ) be a POVM and µ : Σ [0, ] a σ-finite measure such that →L H → ∞ M µ. Let h be an ON basis of . There exists a = ⊕ (x) dµ(x) (with ≪ H H⊕ Ω H m(x) dim ) such that, for all X Σ, R ≤ H ∈ (i) M(X) = J ∗ P (X)J where J : is a linear isometry such that the set of ⊕ ⊕ ⊕ ⊕ H → H⊕ linear combinations of vectors P (X′)J ϕ, X′ Σ, ϕ , is dense in . ⊕ ⊕ ∈ ∈ H H⊕ (ii) There are measurable maps dk : Ω Vh× such that, for all x Ω, the vectors dk(x) =0, k < m(x)+1, are linearly independent,→ and ∈ 6 m(x)

ϕ M(X)ψ = ϕ dk(x) dk(x) ψ dµ(x), ϕ,ψ Vh, h | i Z h | ih | i ∈ X Xk=1 (a minimal diagonalization of M). In addition, there exist measurable maps Ω x ∋ 7→ gℓ(x) Vh such that dk(x) gℓ(x) = δkℓ (the Kronecker delta). (iii) M is∈ a spectral measureh if| and onlyi if J is a unitary operator and thus can be identified with . ⊕ H⊕ H A minimal Stinespring dilation for a POVM M : Σ ( ) (viewed as a completely positive →L H map L∞(µ) ( ), f f dM, where µ is a probability measure such that M µ) is called →L H 7→ ≪ as a minimal Na˘ımark dilationR and it consists of a Hilbert space , an isometry J : , M H→M and a spectral measure P : Σ ( ) such that M(X)= J ∗P(X)J and the vectors P(X)Jϕ, X Σ, ϕ , span a dense→L subspaceM of . The above theorem tells that, whenever ∈ ∈ H M H is separable, one can choose = = ⊕ (x) dµ(x) and P to be the canonical spectral M H⊕ Ω H measure P . R ⊕ 2.1. Physical outcome spaces. It is reasonable to assume that a physically relevant outcome space (Ω, Σ) of an observable is regular or ‘nice’ enough. One can often suppose that Σ is countably generated, i.e. there exists a countable S Σ such that Σ is the smallest σ-algebra of Ω containing S. We will always consider any topological⊆ space T as a measurable space T, (T ) where (T ) is the Borel σ-algebra of T . Furthermore, we equip any subset S of T B B with its subspace and the corresponding Borel σ-algebra (S)= (T ) S. We have the following proposition [37, Proposition 3.2]: B B ∩ Proposition 1. A measurable space (Ω, Σ) is countably generated if and only if there exists a map f : Ω R such that → 1 (i) for all Y (R) the preimage f − (Y ) Σ (measurability) and ∈ B ∈ 1 (ii) for all X Σ there is Y (R) such that f − (Y )= X. ∈ ∈ B Recall that f satisfying (i) and (ii) is called exactly measurable. If (Ω, Σ) is countably generated and µ any σ-finite positive measure on Σ then L2(µ) and = ⊕ (x) dµ(x) are separable. H⊕ Ω H We say that (Ω, Σ) is nice23 if it is countably generated andR f : Ω R of the above proposition meets the additional condition → (iii) f(Ω) (R). ∈ B Note that in this case actually f(X) (R) for all X Σ [37, Lemma 4.1]. If, in addition, f is injective then the nice space (Ω,∈Σ) B is a standard∈ Borel space showing that nice spaces are generalizations of standard Borel spaces. Any Borel subset of a separable complete is a standard Borel space and, indeed, any standard Borel space is σ-isomorphic24 to

23In [37], nice spaces correspond to type -spaces. B 24A bijective map between two measurable spaces is a σ- if it is measurable and its inverse is also measurable. 12 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨ such a set or even to some compact metric space. Usually in physics, outcome spaces are finite-dimensional second countable Hausdorff which are (as locally compact spaces) standard Borel. One can think of nice spaces as standard Borel spaces without the separability property (recall that Σ is separable if x Σ for all x Ω). For any x Ω one can define an atom { }1 ∈ ∈ ∈ A := X Σ x X = f − ( f(x) ) [37, Lemma 3.1] so that a nice space is standard x { ∈ | ∈ } { } Borel ifT and only if Ax = x for all x Ω. Hence, atoms of nice spaces may have an ‘inner structure’ (compare to the{ case} of real world∈ atoms). Suppose that (Ω, Σ) is nice with an f satisfying (i)–(iii). Hence, f(Ω) (R) is a standard Borel space and, without restricting generality,25 we can assume that ∈ B f(Ω) = 1, 2,...,N , N N, or f(Ω) = N (discrete case), or • f(Ω) = R{ (continuous} case).∈ • 1 In the discrete case, we say that (Ω, Σ) is discrete and denote X = f − ( i )= A , i =1, 2,..., i { } xi so that Xi Xj = , i = j, thus showing that Σ is the set of all unions of sets Xi and the empty set . Moreover,∩ any∅ 6 observable M : Σ ( ) is discrete and, as earlier, can be identified ∅ N → L H with (Mi)i=1 where Mi = M(Xi). 3. Joint measurability and sequential measurements If quantum devices can be applied simultaneously on the same system, we say that they are compatible. Simultaneously measurable observables are called jointly measurable. Let us give formal definitions for these notions. Definition 1. Observables M : Σ ( ) with outcome spaces (Ω , Σ ), i =1, 2, are jointly i i →L H i i measurable if they are margins of a joint observable N : Σ1 Σ2 defined on the product σ-algebra Σ Σ (generated by sets X Y , X Σ , Y ⊗Σ ), i.e.,→ H 1 ⊗ 2 × ∈ 1 ∈ 2 M (X)= N(X Ω ), M (Y )= N(Ω Y ), X Σ ,Y Σ . 1 × 2 2 1 × ∈ 1 ∈ 2 Especially, M1 and M2 are jointly measurable if (and only if) they are functions or relabelings 1 of a third observable M Obs(Σ, ), i.e. for both i = 1, 2 one has M (X ) = M f − (X ) for ∈ H i i i i all Xi Σi where fi : Ωi Ω is a measurable function. Now a joint observable N is defined ∈ 1 → 1 by N(X Y ) = M f − (X) f − (Y ) for all X Σ and Y Σ . Note that, in this case, all × 1 ∩ 2 ∈ 1 ∈ 2 the three measurable spaces can be arbitrary [5, Chapter 11]. This implies that, in particular, any observable is jointly measurable with its relabelings. Definition 2. Similarly, we say that an observable M : Σ ( ) and a channel Φ : ( ) ( ) are compatible if there exists a joint instrument : →L( ) H Σ ( ) such thatL K → L H J L K × →L H M(X)= (I ,X), Φ(B)= (B, Ω), X Σ, B ( ). J K J ∈ ∈L K The above means, when M and Φ are compatible, there exists a measurement of M such that Φ is the unconditioned state transformation induced by the measurement. ∗It is useful to look at joint measurablility and compatibility from a more general perspective. Recall the definition of the set CP( ; ) of unital completely positive maps Φ : ( ). The following result, to which we willA oftenH refer, has been obtained, e.g., in [16]: A→L H Theorem 2. Let and be von Neumann algebras, a Hilbert space, and Ψ CP( ; ). A B H ∈ A⊗B H Define the map Ψ(1) CP( ; ), Ψ(1)(a)=Ψ(a 1 ) for all a , and pick a minimal dilation ( ,π,J) for Ψ . There∈ isA aH unique map E CP⊗ B( ; ) such∈A that M (1) ∈ B M Ψ(a b)= J ∗π(a)E(b)J, π(a)E(b)= E(b)π(a), a , b . ⊗ ∈A ∈ B If, additionally, Ψ is normal, then both π and E are normal.

25Since two standard Borel spaces are σ-isomorphic if and only if they have the same cardinality. OPTIMAL QUANTUM OBSERVABLES 13

Let us first analyse what the above means for two jointly measurable observables.

Theorem 3. Suppose that Mi : Σi ( ), i = 1, 2, are jointly measurable observables on a . Let ( , P,J) be any minimal Na˘ımark→ L H dilation for M . Fix a joint observable N for M H M 1 1 and M2. There is a unique POVM F : Σ2 ( ) such that P(X)F(Y ) = F(Y )P(X) for all X Σ and Y Σ and → L M ∈ 1 ∈ 2 N(X Y )= J ∗P(X)F(Y )J, X Σ ,Y Σ . × ∈ 1 ∈ 2 3.1. Connection between joint and sequential measurements. A special case of joint observables is sequential measurements where an initial observable M : Σ ( ) is first measured yielding some M-instrument : ( ) Σ ( ) with output Hilbert→ L H space . J L K × → L H K Then some observable M′ : Σ′ ( ) is measured. The conditional probability for obtaining →L K an outcome within Y Σ′ in the second measurement, conditioned by the first measurement observing a value in X∈ Σ, is ∈ tr [ (ρ, X)M′(Y )] = tr ρ M′(Y ),X J∗ J   when the system is initially in the state ρ ( ). For all spaces (Ω, Σ) and (Ω′, Σ′), the ∈ S H positive operator bimeasure (X,Y ) M′(Y ),X extends into a POVM on Σ Σ′ [28, 38]. 7→ J ⊗ In this case, the extension J : Σ Σ′ ( ) is a joint observable for the initial observable M ⊗ →L H and a distorted version M′′ = M′( ), Ω of the second observable. As shown in Section 1, any J · joint measurement of discrete observables  can be implemented as a sequential measurement; see also [18] for this fact and its generalizations in the case of discrete observables. Next we show that joint and sequential measurements are, in this sense, equivalent in a very general case. Whenever (Ω, Σ) is a measurable space and µ is a probability measure on Σ, we denote by P the canonical spectral measure on L2(µ), i.e., P (X)ψ (x)= χ (x)ψ(x) for all X Σ, µ µ X ∈ ψ L2(µ), and µ-a.a. x Ω. When is a Hilbert space we naturally identify L2(µ) with the∈L2-space of functions∈ Ω . K ⊗K →K Proposition 2. Suppose that (Ωi, Σi), i =1, 2, are countably generated measurable spaces and is a separable Hilbert space. Assume that Mi : Σi ( ), i = 1, 2, are jointly measurable observablesH with a joint observable N. There is a separable→ L H Hilbert space , an M -instrument K 1 : ( ) Σ ( ), and a POVM M′ : Σ ( ) such that J L K × 1 →L H 2 →L K (1) N(X Y )= (M′(Y ),X), X Σ ,Y Σ . × J ∈ 1 ∈ 2 Proof. Choose probability measures µ : Σ [0, 1] such that M µ , i =1, 2. Pick a minimal i → i ≪ i Na˘ımark dilation ( , P ,J ) of Theorem 1 for M1 where H⊕ ⊕ ⊕ ⊕ := (x) dµ1(x) ⊕ H ZΩ1 H is a direct integral space which is separable since (Ω1, Σ1) is countably generated. According to Theorem 3, there is a POVM F : Σ2 ( ) such that P (X)F(Y ) = F(Y )P (X) and → L H⊕ ⊕ ⊕ N(X Y )= J ∗ P (X)F(Y )J for all X Σ1 and Y Σ2. Let( , Q,K) be a minimal Na˘ımark × ⊕ ⊕ ⊕ ∈ ∈ M dilation for F. Again, is separable since is separable and Σ2 is countably generated. M H⊕ Fix X Σ1 and define FX : Σ2 ( ) by FX (Y )= P (X)F(Y ). Now FX (Y ) F(Y ) for ∈ →L H⊕˜ ⊕˜ ≤ all Y Σ2, so that one can define a unique P(X) ( ) by P(X)Q(Y )Kψ := Q(Y )KP (X)ψ, ∈ ∈L M 2 ⊕ Y Σ2 and ψ (see, e.g., a similar proof of [16, Proposition 2.1]). Clearly, P˜(X) = P˜(X), ∈ ∈ H⊕ P˜(X)Q(Y ) = Q(Y )P˜(X), and FX (Y ) = K∗P˜(X)Q(Y )K for all Y Σ2. Hence, X P˜(X) is a PVM. ∈ 7→ 14 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨

For all X Σ1 and Y Σ2, define the projection R(X,Y )= P˜(X)Q(Y ) ( ). Since N is a POVM, for∈ any ϕ , ϕ ∈ , X , X Σ , and Y ,Y Σ , the complex∈L bimeasureM 1 2 ∈ H 1 2 ∈ 1 1 2 ∈ 2 (X,Y ) Q(Y1)KP (X1)J ϕ1 R(X,Y )Q(Y2)KP (X2)J ϕ2 7→ h ⊕ ⊕ | ⊕ ⊕ i = ϕ N (X Y ) (X Y ) (X Y ) ϕ . h 1| × ∩ 1 × 1 ∩ 2 × 2 2i  extends into a complex measure on Σ1 Σ2. Using the minimality of the subsequent dilations, one finds that (X,Y ) ξ R(X,Y )ξ ⊗extends into a measure for all ξ . Thus (X,Y ) 7→ h | i ∈ M 7→ P˜(X)Q(Y ) extends into a PVM which we shall also denote by R. Since is separable, we may diagonalize R and thus identify with the direct integral space M M ⊕ = (x, y) d(µ1 µ2)(x, y), ⊕ M ZΩ1 Ω2 M × × where R operates as the canonical spectral measure. From now on, let us fix a separable infinite-dimensional Hilbert space so that we may define a decomposable isometry W : 2 M2∞ 2 := L (µ1 µ2) = L (µ1) L (µ2) , M→ M × ⊗M∞ ∼ ⊗ ⊗M∞   ⊕ W = W (x, y) d(µ1 µ2)(x, y), ZΩ1 Ω2 × × where W (x, y): (x, y) are . One may also define the decomposable M → M∞ 2 isometry W1 : , W1 = ⊕ W1(x) dµ1(x), where W1(x): (x) L (µ2) are H⊕ → M Ω1 H → ⊗M∞ isometries, and K := W KW ∗ R( ). 1 ∈L M Define the canonical spectral measure R := Pµ1 µ2 I ∞ of with the margin P : Σ1 × ⊗ M M → 2 ( ), P(X)= R(X Ω2)= Pµ1 (X) IL (µ ) I ∞ . It is simple to check that R(Z)W = W R(Z) L M × ⊗ 2 ⊗ M and P (X)W1∗ = W1∗P(X) for all Z Σ1 Σ2 and X Σ1. This means that ⊕ ∈ ⊗ ∈ ˜ P(X)K = W P(X)KW1∗ = WKP (X)W1∗ = K P(X) ⊕ 2 for all X Σ1. Thus, K = ⊕ K(x) dµ1(x) where K(x) L (µ2) . Define the ∈ Ω1 ∈ L ⊗M∞ isometry K˜ := WK = KW R= ⊕ K˜ (x) dµ(x) with the isometries K˜ (x) = K(x)W (x): 1 Ω1 1 2 (x) L (µ2) . R H For→µ -a.a. x ⊗MΩ ∞define the channel 1 ∈ 1 2 ˜ ˜ Tx : L (µ2) (x) , B Tx(B) := K(x)∗(B I ∞ )K(x). L →L H 7→ ⊗ M Since the field x K˜ (x) of isometries is measurable, one may define the channel 7→ 2 ⊕ T : L (µ2) ( ), B T (B) := Tx(B) dµ(x). L →L H⊕ 7→ Z  Ω Using the intertwining properties of the various isometries and POVMs we have, for all ϕ , X Σ , and Y Σ , ∈ H ∈ 1 ∈ 2

J ϕ P (X)T Pµ2 (Y ) J ϕ = (J ϕ)(x) Tx Pµ2 (Y ) (J ϕ)(x) dµ1(x) h ⊕ | ⊕ ⊕ i Z h ⊕ | ⊕ i  X  ˜ ˜ = K(x)(J ϕ)(x) Pµ2 (Y ) I ∞ K(x)(J ϕ)(x) dµ1(x) Z h ⊕ | ⊗ M ⊕ i X  = KJ˜ ϕ R(X Y )KJ˜ ϕ = KJ ϕ R(X Y )KJ ϕ h ⊕ | × ⊕ i h ⊕ | × ⊕ i = J ϕ P (X)F(Y )J ϕ = ϕ N(X Y )ϕ . h ⊕ | ⊕ ⊕ i h | × i

Hence, N(X Y )= J ∗ P (X)T Pµ2 (Y ) J for all X Σ1 and Y Σ2. × ⊕ ⊕ 2 ⊕ ∈ ∈ Define the instrument : L (µ2) Σ1 ( ) by (B,X) = J ∗ P (X)T (B)J , see J L 2 × → L H J ⊕ ⊕ ⊕ [32, Theorem 1]. The choices := L (µ2) and M′ := Pµ yield Equation (1).  K 2 OPTIMAL QUANTUM OBSERVABLES 15

4. Observables determining the future We now turn our attention to those observables which have the property that, no matter how we measure them, registering an outcome unequivocally determines the post-measurement state of the system under study. Combining Theorem 2 with Theorem 1, we obtain the following characterization [32, Theorem 1]: Theorem 4. Let (Ω, Σ) be a measurable space, a separable Hilbert space, and M : Σ ( ) an observable. Pick the minimal Na˘ımark dilationH ( , P ,J ) of Theorem 1 for →LM. LetH : ( ) Σ ( ) be an M-instrument. There isH a⊕ unique⊕ ⊕ channel T : ( ) ( ) definedJ L K by a× (weakly→ LµH-measurable) field x T of channels T : ( ) L(x)K, → L H⊕ 7→ x x L K →L H  ⊕ T (B)= Tx(B) dµ(x), ZΩ i.e., T (B)ψ (x)= Tx(B)ψ(x) for all B ( ), ψ , and µ-a.a. x Ω, such that ∈L K ∈ H⊕ ∈  (B,X)= J ∗ T (B)P (X)J , B ( ), X Σ. J ⊕ ⊕ ⊕ ∈L K ∈ Definition 3. Let an M Obs(Σ, ) be associated with the Na˘ımark dilation ( , P ,J ) of Theorem 1. If dim (x∈) = 1 forHµ-a.a. x Ω, we say that M is of rank 1. InH⊕ this⊕ case,⊕ 2 H ∈ = L (µ) and P = Pµ. H⊕ ⊕ Let the observable M of Theorem 1 be of rank 1. Also assume that : ( ) Σ ( ) is an M-instrument defined by the pointwise channels T : ( ) J (Lx)Kof× Theorem→ L H 4. x L K → L H Because of the rank-1 assumption, there are states σx ( ) such that Tx(B) = tr [σxB], x Ω, B ( ). It follows that is of the following type:∈ S K ∈ ∈L K J Definition 4. Let and be Hilbert spaces and (Ω, Σ) a measurable space. We say that an instrument : H ( ) KΣ ( ) is nuclear if there is a weakly µ-measurable26 field Ω x σ J( )L ofK states× such→ that L H ∋ 7→ x ∈ S K

(B,X)= tr [σxB] dM(x), X Σ, B ( ). J ZX ∈ ∈L K The term nuclear follows the terminology of Cycon and Hellwig [8]. The above definition means that, in the Schr¨odinger picture, a nuclear instrument has the form J M (ρ, X)= σx dpρ (x), ρ ( ), X Σ, J∗ ZX ∈ S H ∈ where the integral is defined weakly. Physically this means that a nuclear instrument prepares the quantum system into some post-measurement state which solely depends on the outcome registered, not on the pre-measurement state of the system. This is why also the name measure- and-prepare instrument could also be used. Thus, any measurement of a rank-1 observable is described by a nuclear instrument and registering a value fully determines the post-measurement state. This is to say, rank-1 observables determine the future of the system under measurement. In fact, also the contrary is true as the following result from [32] tells us. Theorem 5. An observable M : Σ ( ) is rank-1 if and only if each M-instrument : ( ) Σ ( ) is nuclear (where→ LisH any Hilbert space). J L K × →L H K The above result can be reformulated in the form that an observable determines the future if and only if it is of rank 1. The channel ( , Ω) associated with a measurement of a rank-1 observable is also seen to be entanglement breakingJ · [22].

26 In this case, all maps x tr [σxB], B ( ), are µ-measurable. 7→ ∈ L K 16 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨

1 1 Let M Obs(Σ, ) with vectors dk(x) of Theorem 1, Ω := N Ω, and let Σ be the product σ-algebra∈ of 2N andH Σ. Let µ1 : Σ1 [0, ] be the product measure× of the counting measure → ∞ and µ. Define d(k, x)= dk(x) if k < m(x) + 1 and d(k, x) = 0 if k > m(x). Then

1 1 1 1 1 (2) ϕ M (X )ψ = ϕ d(k, x) d(k, x) ψ dµ (k, x), ϕ,ψ Vh, X Σ , h | i ZX1 h | ih | i ∈ ∈ defines a rank-1 POVM M1 : Σ1 ( ); we say that M1 is a maximally refined version of M. 1 1 →L H 1 Since M(X)= M f − (X) where f : Ω Ω is a measurable function defined by f(k, x)= → 1 1 f(x) for all k N and x Ω,M is a relabeling of M . Note that the value space of M contains the multiplicities∈ (k, x), k∈ < m(x) + 1, of a measurement outcome x of M. Moreover, M and M1 are jointly measurable and M1 can be measured by performing a sequential measurement of M and some discrete ‘multiplicity’ observable [34]. We will see that the maximally refined version of an observable possesses many of the same optimality properties as the original observable meaning that we may freely assume the rank-1 property for these observables.

5. Post-processing and post-processing maximality Let us begin with a definition.

Definition 5. Let (Ω1, Σ1) and (Ω2, Σ2) be measurable spaces. Also assume that µ : Σ1 R is a positive measure. We say that a map β : Σ Ω R is a µ-weak Markov kernel [24]→ if 2 × 1 → (i) β(Y, ) : Ω1 R is µ-measurable for all Y Σ2, (ii) β(Y, x· ) 0→ for all Y Σ and µ-a.a. x Ω∈ , ≥ ∈ 2 ∈ 1 (iii) β(Ω2, x)=1 for µ-a.a. x Ω1, and (iv) for all pairwise disjoint ∈ Y ,Y ,... Σ , 1 2 ∈ 2 ∞ β ∞ Y , x = β(Y , x) ∪j=1 j j  Xj=1 for µ-a.a. x Ω . ∈ 1 If β( , x) is a probability measure for all x Ω1 and the maps β(Y, ) are measurable then β is simply· called a Markov kernel. ∈ ·

When µ1 is a probability measure on (Ω1, Σ1), µ1 µ, and β : Σ2 Ω1 R is a µ-weak Markov kernel, then the set function ≪ × →

Σ1 Σ2 (X,Y ) B(X,Y ) := β(Y, x) dµ1(x) [0, 1] × ∋ 7→ ZX ∈ 27 is a probability bimeasure with the marginal probability measures X B(X, Ω2) = µ1(X) β 7→ and Y B(Ω1,Y ) =: µ1 (Y ). As an immediate consequence of Carath´eodory’s extension theorem,7→ one gets the well-know result stating that if β is a Markov kernel then B extends into probability measure B : Σ1 Σ2 [0, 1], i.e., B(X Y )= B(X,Y ) for all X Σ1 and Y Σ2. β × → × ∈ ∈ Note that µ1 can be interpreted as a result of (classical) data processing represented by β. We call this data processing scene post-processing since the processing can be carried out after obtaining the data represented by the measure µ1. This data processing scheme generalizes to the case of POVMs in the following way.

Definition 6. Let M1 : Σ1 ( ) be an observable operating in the Hilbert space . We assume that there is a (probability)measure→ L H µ on (Ω, Σ) such that M µ. We say thatH an 1 ≪ 27 Recall that B : Σ1 Σ2 C is a bimeasure if B(X, ), X Σ1, and B( , Y ), Y Σ2, are (complex) measures. × → · ∈ · ∈ OPTIMAL QUANTUM OBSERVABLES 17 observable M2 : Σ2 ( ) is a post-processing of M1, if there is a µ-weak Markov kernel → L HM M β : Σ Ω R such that p 2 =(p 1 )β for all ρ ( ) or, equivalently, 2 × 1 → ρ ρ ∈ S H

M2(Y )= β(Y, x) dM1(x) (weakly) ZΩ1 for all Y Σ . We denote M = Mβ. ∈ 2 2 1 The above means that by measuring M1, we obtain all the information obtainable by measur- ing M2; we just have to process the data given by M1 classically with the fixed kernel β. Thus, M1 can give us at least the same amount of information on the quantum system as M2 modulo 1 classical data processing. Note that if M2 is a relabeling of M1, i.e. M2(Y )= M1(f − (Y )), then M = Mβ where β(Y, x)= χ (x) is a Markov kernel. 2 1 f −1(Y ) We may thus set up an information-content ‘order’ among observables [3, 13, 29] M M 2 ≤post 1 if there is a µ-weak Markov kernel β : Σ Ω R (where M µ) such that M = Mβ. We 2 × 1 → 1 ≪ 2 1 may also say that M1 and M2 are post-processing equivalent if there are weak Markov kernels β β γ and γ such that M2 = M1 and M1 = M2 . Recall that the ‘order’ post here may not actually be a partial order (because of the failure of transitivity); for situations≤ where this problem can be overcome and identification of canonical representatives of the resulting equivalence classes, see [27]. An observable M is post-processing maximal or post-processing clean if, for any observable M′ such that M post M′, one has M′ post M. The maximal observables have been characterized earlier in the case≤ of discrete outcomes≤ [13, Theorem 3.4]. We generalize this characterization for observables with nice outcome spaces. For that, we need the following proposition:

Proposition 3. Let (Ω1, Σ1) be nice, (Ω2, Σ2) countably generated, and B : Σ1 Σ2 [0, 1] a probability bimeasure. Denote µ = B( , Ω ). × → 1 · 2 (i) There exists a probability measure B : Σ Σ [0, 1] such that B(X Y )= B(X,Y ) 1 ⊗ 2 → × for all X Σ1 and Y Σ2. (ii) There exists∈ a Markov∈ kernel β : Σ Ω [0, 1] such that B(X,Y )= β(Y, x) dµ (x) 2 × 1 → X 1 for all X Σ1 and Y Σ2. R ∈ ∈ Proof. First we note that (i) holds in the case where (Ω1, Σ1) and (Ω2, Σ2) are standard Borel spaces [10, Lemma 4.2.1] showing that Lemma 12.1 of [37] holds even in the case where prob- ability measures on Σ Σ (i.e. joint probability measures) are replaced with probability 1 ⊗ 2 bimeasures on Σ1 Σ2. Manifestly the rest of the proof of Theorem 12.1 of [37] can be car- ried out by replacing× joint probability measures with probability bimeasures everywhere. This proves item (ii). Item (i) follows from (ii) by recalling the well-known fact that any Markov kernel defines a joint probability measure. 

Remark 2. Let (Ω1, Σ1) and (Ω2, Σ2) be as in Proposition 3, Mi Obs(Σi, ), i = 1, 2, β ∈ H M1 µ1, and M2 = M1 where β is a µ1-weak Markov kernel, i.e. M2 is a post-processing of M1. Since∼ β defines a probability bimeasure, we immediately get from Proposition 3 the following results: β′ There is a Markov kernel β′ such that M2 = M1 and β(Y, x)= β′(Y, x) for all Y Σ2 • and µ -a.a. x Ω . ∈ 1 ∈ 1 The POVMs M1 and M2 are jointly measurable, a joint observable N Obs(Σ1 Σ2, ) • being defined through N(X Y ) := β(Y, x)dM (x). ∈ ⊗ H × X 1 R 5.1. Joint measurements of rank-1 observables. For the results of the rest of this section, it is useful, as an interlude, to now turn our attention to joint-measurability issues of rank-1 observables. Let M : Σ ( ), i = 1, 2, be jointly measurable observables where M is of i i → L H 1 18 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨ rank 1. Let be separable and ( , P ,J ) be the minimal (diagonal) Na˘ımark dilation of M introducedH in Theorem 1 withH the⊕ vector⊕ ⊕ field x d (x) =: d(x) so that, for all X Σ , 1 7→ 1 ∈ 1

ϕ M1(X)ψ = ϕ d(x) d(x) ψ dµ1(x)= (J ϕ)(x)(J ψ)(x) dµ1(x) h | i ZX h | ih | i ZX ⊕ ⊕ C 2 where ϕ, ψ Vh, since (x) implies = L (µ1) and P = Pµ1 . According to Theorem ∈ H ≡ H⊕ ⊕ 3, there is a unique POVM F : Σ L2(µ ) such that P (X)F(Y ) = F(Y )P (X) for all 2 → L 1 µ1 µ1 X Σ1 and Y Σ2 and M2(Y ) = J ∗ F(Y )J for all Y Σ2. Hence, for any Y Σ2, there is ∈ ∈ ⊕ ⊕ ∈ ∈ a measurable function β(Y, ) : Ω R such that F(Y )η (x)= β(Y, x)η(x) for all η L2(µ ) · 1 → ∈ 1 and µ1-a.a. x Ω1. It is simple to check that the map β : Σ2 Ω1 R satisfies the conditions ∈ × → β (i)–(iv) of Definition 5 implying that β is a µ1-weak Markov kernel and M2 = M1 . Thus, we have [35]: Theorem 6. Let M : Σ ( ) be a rank-1 observable of a separable . Any observable → L H H M′ : Σ′ ( ) jointly measurable with M is a post-processing of M. →L H 5.2. Post-processing clean observables. The general form of post-processing clean observ- ables is claimed to have been solved in [2]. There are, however, some problems in the definition of post-processing the paper uses: Despite the author’s definition of post-processing involves, according to the terminology used here, weak Markov kernels, a kernel β is treated assuming that β( , x) is a measure for a.a. x. Moreover, we find the proofs of the main theorems dubious. That is· why we provide a new proof. We end up with the same characterization as in [2] though. The next theorem is an essential part of the characterization of post-processing clean observables given in Corollary 2.

Theorem 7. Let (Ωi, Σi), i = 1, 2, be measurable spaces, a separable Hilbert space, M1 Hβ ∈ Obs(Σ , ), and β : Σ Ω [0, 1] a Markov kernel. If M is of rank 1 then M is of rank 1. 1 H 2 × 1 → 1 1 Proof. Assume that µ1 is a probability measure on (Ω1, Σ1) such that M1 µ1. Clearly, β β ≪ M2 := M µ2 := µ (i.e. µ2(Y ) = β(Y, x) dµ1(x)). For any Hilbert-Schmidt operator 1 ≪ 1 Ω1 R ( ), Z Σ , i =1, 2, by the Radon-Nikod´ymR property of the , ∈L H ∈ i

R∗Mi(Z)R = mi(z) dµi(z), ZX where m : Ω ( ) is a weakly µ -measurable positive trace-class-valued function (which i i → L H i depends on R), see e.g. [23]. Requiring M2 to be rank-1 is equivalent with m2(y) being at most rank-1 almost everywhere. Fix now a Hilbert-Schmidt operator R and let mi be the corresponding densities of R∗M ( )R with respect to µ . Now i · i

(3) R∗M2(Y )R = β(Y, x)m1(x) dµ1(x) ZΩ1 for all Y Σ . Since β is a Markov kernel, the probability bimeasure (X,Y ) β(Y, x) dµ (x) ∈ 2 7→ X 1 extends into a probability measure µ : Σ1 Σ2 [0, 1] whose margins are µ1 Rand µ2. Because µ µ µ , there is a (nonnegative) density⊗ function→ ρ L1(µ µ ) such that ≪ 1 × 2 ∈ 1 × 2

µ(Z)= ρ d(µ1 µ2), Z Σ1 Σ2 ZZ × ∈ ⊗ and, hence, ρ(x, y)dµ (y)= β(Y, x) for all Y Σ and µ -a.a. x Ω . From Equation (3), Y 2 ∈ 2 1 ∈ 1 it now followsR

(4) m2(y)= ρ(x, y)m1(x) dµ1(x) ZΩ1 OPTIMAL QUANTUM OBSERVABLES 19 for µ -a.a. y Ω . Let now, for every y Ω , P (y) be the at most one-dimensional projection 2 ∈ 2 ∈ 2 onto the range of m2(y). This is a weakly measurable map. Multiplying (4) from both sides with P (y)⊥, one obtains for µ -a.a. y Ω 2 ∈ 2

0= ρ(x, y)P (y)⊥m1(x)P (y)⊥ dµ1(x). ZΩ1 Thus also

P (y)⊥m1(x)P (y)⊥ dµ(x, y)= ρ(x, y)P (y)⊥m1(x)P (y)⊥ d(µ1 µ2)(x, y)=0, ZΩ1 Ω2 ZΩ1 Ω2 × × × implying that ρ(x, y)P (y)⊥m (x)P (y)⊥ =0 for (µ µ )-a.a. (x, y) Ω Ω . 1 1 × 2 ∈ 1 × 2 Denote by N the set of those (x, y) Ω1 Ω2 such that ρ(x, y)P (y)⊥m1(x)P (y)⊥ = 0. Applying the Fubini theorem for the characteristic∈ × function χ , one finds that for µ 6 -a.a. N 1 x Ω , ρ(x, y)P (y)⊥m (x)P (y)⊥ = 0 for µ -a.a. y Ω . For µ -a.a. x Ω , there is y Ω ∈ 1 1 2 ∈ 2 1 ∈ 1 ∈ 2 such that ρ(x, y) > 0. Indeed, if E Σ1 is such that ρ(x, y) = 0 for all x E and y Ω2, it follows that 0 = ρ d(µ µ∈) = µ(E Ω ) = µ (E). Hence, µ -a.a.∈ x Ω ,∈ there E Ω2 1 2 2 1 1 1 × × × ∈ is y Ω such thatRP (y)⊥m (x)P (y)⊥ = 0 implying m (x) = P (y)m (x)P (y), i.e., m (x) is ∈ 2 1 1 1 1 at most rank-1 and, since this holds for any Hilbert-Schmidt operator R, we have that M1 is rank-1.  From Remark 2 and the theorem above we get:

Corollary 1. Suppose that (Ω1, Σ1) (resp. (Ω2, Σ2)) is a nice (resp. countably generated) mea- surable space and is a separable Hilbert space. Let M : Σ ( ), i =1, 2, be observables H i i →L H such that M2 is of rank 1. If M2 is a post-processing of M1 then M1 is of rank 1 as well. The next corollary gives an exhaustive characterization of post-processing clean observables with a nice value space. Especially, we find that such an observable is post-processing maximal if and only if it determines the future of the system under study. Corollary 2. Let (Ω, Σ) be a measurable space, a separable Hilbert space, and M Obs(Σ, ). If M is of rank-1 then it is post-processing clean.H The converse holds when (Ω, Σ)∈is nice. H Proof. Suppose first that M is rank-1 and µ M is a probability measure. Hence, according ∼ 2 to Theorem 1, M has a minimal Na˘ımark dilation (L (µ), Pµ,J ). If M is a post-processing of ⊕ an M˜ Obs(Σ˜, ) on some measurable space (Ω˜, Σ),˜ i.e. M = M˜ β˜ where β˜ is aµ ˜-weak Markov ∈ H kernel andµ ˜ M˜ , one can define a positive operator bimeasure ∼

(X,Y ) β˜(X,y)dM˜ (y)= J ∗ Pµ(X)F(Y )J 7→ ZY ⊕ ⊕ where now F is of the form F(Y )η (x) = β(Y, x)η(x) for all Y Σ,˜ η L2(µ) and µ-a.a. ∈ ∈ x Ω, and thus M˜ = Mβ, see Section 5.1 for details. ∈Assume now that M is post-processing clean. Let M1 : Σ1 ( ) be the rank-1 refinement of M defined in (2) from which M can be post-processed. Since→LMHis clean, M1 is also a post- processing of M. If (Ω, Σ) is nice then (Ω1, Σ1) is nice (thus countably generated) and Corollary 1 implies that M is rank-1 as well (i.e., M and M1 coincide). 

6. Observables determining the past In this section, we concentrate on observables that define the past of the system under study, i.e., those observables whose measurement outcome statistics completely determine the state of the system prior to the measurement. 20 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨

Let be a Hilbert space and (Ω, Σ) a measurable space. Let M Obs(Σ, ) and recall our earlierH definition pM = tr [ρM( )] for all ρ ( ). Note that the map∈ ρ pMHis an affine map ρ · ∈ S H 7→ ρ which is continuous with respect to the trace norm on ( ) and the total variation norm of probability measures. If this map is an injection, the naturalS H conclusion is that the observable M can separate all states; with different states of the system, the outcome statistics will always differ. How one can actually determine the state of the system prior to the measurement is not discussed here; the reader is redirected to [26] for this issue. This prompts the following definition: an observable M : Σ ( ) is informationally complete if for ρ, σ ( ), ρ = σ implies pM = pM. This injectivity→ L H extends to the whole ∈ S H 6 ρ 6 σ of ( ), and thus informational completeness of M is equivalent with the following: for any T T H( ), the condition tr [T M(X)] = 0 for all X Σ implies T = 0. From this we see that the∈ range T H ran M = M(X) X Σ of M has to be extensive∈ enough to separate the trace class ( ). Indeed, M is{ informationally| ∈ } complete if and only if the ultraweak of the linear T H hull of ran M (which coincides with the double commutant (ran M)′′) is the whole of ( ) [5, Proposition 18.1]. We can make the following important immediate observations: If LMHis in- formationally complete, its rank-1 refinement M1 is informationally complete as well, and any joint measurement of an informationally complete observable with some observable is also in- formationally complete. More generally, if a post-processing of an observable is informationally complete, then the post-processed observable is also informationally complete. To further quantify the informational content of an M Obs(Σ, ) in a Hilbert space , let us define for each ρ ( ) the set [ρ]M ( ) as the∈ set of thoseH states σ ( ) suchH that pM = pM. It is evident∈ S thatH M is informationally⊆ S H complete if and only if [ρ]M ∈= SρHfor all σ ρ { } ρ ( ). This definition can be generalized to the case of sets of observables (in the same Hilbert∈ S H space ): O H M M [ρ]O := σ ( ) p = p , M { ∈ S H | σ ρ ∀ ∈ O} One can say that a set of observables is informationally complete if [ρ]O = ρ for all ρ ( ). O { } ∈An S observableH M : Σ ( ) is said to be commutative if M(X)M(Y )= M(Y )M(X) for all X,Y Σ. Let (→L) beH a set of selfadjoint operators. We call the set of vectors ϕ ∈ L⊆L H ∈ H such that L1 Lnϕ = Lπ(1) Lπ(n)ϕ for any L1,..., Ln , any permutation π of 1,...,n , and any n ···N as the commutation··· domain of and denote∈L it by com . The following{ results} concerning∈ relationships between commutativityL and sharpness with informationalL completeness have been proven in [4]: Whenever dim 2 and M Obs(Σ, ) is commutative, M is not informationally • complete. H ≥ ∈ H A family of mutually commuting spectral measures is never informationally complete. • If P : Σ ( ) is a spectral measure and ρ ( ),[ρ]P = ρ if and only if ρ is pure • (a rank-1→L projection)H and there is an X Σ∈ such S H that ρ = P{(X}). If is an informationally complete set∈ of observables then dim com 1, where • O L ≤ = M ran M. L ∈O The followingS are examples on informationally complete observables and sets of observables:

The set Qθ θ [0,π) of the rotated quadratures introduced in Section 1 is informationally • complete{ [5,} Theorem∈ 18.1] 2 Equivalently with the above, the homodyne observable Ght : [0, π) R L (R) • 1 B × →L defined by G (Θ X)= π− Q (X) dθ is informationally complete.   ht × Θ θ The covariant phase space observableR GS introduced in Section 1 is informationally • complete if and only if the support28 of the function (q,p) tr [SD(q,p)] is R2 [25]. 7→ 28That is, the closure of the set of points (q,p) R2 such that tr [SD(q,p)] = 0. ∈ 6 OPTIMAL QUANTUM OBSERVABLES 21

6.1. Informational completeness within the set of pure states. Sometimes it is fruitful to consider informational completeness of an observable within a restricted set ( ) of states; we are, e.g., already guaranteed that the pre-measurement state ρ is withinP ⊆ S andH it is enough to only be able to discern between states in [6]. Thus we arrive at informationalP completeness of an M Obs(Σ, ) within meaningP that, whenever ρ, σ , ρ = σ, then pM = pM. When the set∈ consistsH of pure states,P we identify it with [ϕ] ϕ ∈ P, ϕ 6 ϕ ; ρ 6 σ P { | ∈ H | ih |∈P} here, for any ϕ , we denote [ϕ] := tϕ t T and T := z C z = 1 . We get the following result for∈ H the case where we have{ to| distinguish∈ } a pure{ stat∈ e from|| | other} pure states: Proposition 4. Let ( , P ,J ) be the minimal Na˘ımark dilation of Theorem 1 for an M Obs(Σ, ) in a separableH⊕ Hilbert⊕ ⊕ space . The observable M is informationally complete within∈ the set Hϕ ϕ ϕ , ϕ =1 of pureH states if and only if WJ ϕ / J whenever ϕ {| ih || ∈ H k k } ⊕ ∈ ⊕H ∈ H and W = ⊕ W (x) dµ(x) ( ) is a decomposable isometry such that WJ ϕ = tJ ϕ for all Ω ∈L H⊕ ⊕ 6 ⊕ t T. R ∈ Proof. Let ϕ, ψ be unit vectors. We have pM = pM if and only if (J ϕ)(x) = ϕ ϕ ψ ψ ⊕ ∈ H | ih | | ih | km(x) k (J ψ)(x) for µ-a.a. x Ω. One can construct measurable fields x en(x) n=1 (x), k ⊕ k ∈ 7→ { } ⊆ H x f (x) m(x) (x) of orthonormal bases such that 7→ { n }n=1 ⊆ H 1 (J ϕ)(x) − (J ϕ)(x), (J ϕ)(x) =0 e1(x) = k ⊕ k ⊕ ⊕ 6 ,  η(x) otherwise 1 (J ψ)(x) − (J ψ)(x), (J ψ)(x) =0 f1(x) = k ⊕ k ⊕ ⊕ 6  η(x) otherwise for µ-a.a. x Ω, where x η(x) (x) is a measurable field of unit vectors. Defining m∈(x) 7→ ∈ H W (x) := n=1 fn(x) en(x) we may set up the decomposable isometry (even unitary) W = | ih | M M ⊕ W (x)P dµ(x) such that J ψ = WJ ϕ if p = p holds. In reverse, it is simple to check Ω ⊕ ⊕ ϕ ϕ ψ ψ that,R whenever W is a decomposable isometry| ih | such| thatih | J ψ = WJ ϕ, then (J ϕ)(x) = M M ⊕ M ⊕ M k ⊕ k (J ψ)(x) for µ-a.a. x Ω, i.e., p ϕ ϕ = p ψ ψ . Thus, p ϕ ϕ = p ψ ψ if and only if k ⊕ k ∈ | ih | | ih | | ih | | ih | J ψ = WJ ϕ with a decomposable isometry W ( ). The claim immediately follows from⊕ this observation⊕ and by noting that ϕ ϕ = ∈ψ L ψH⊕if and only if J ψ = tJ ϕ for some t T. | ih | | ih | ⊕ ⊕  ∈ The above proposition implies the well-known fact stated earlier: a PVM in a separable Hilbert space cannot be informationally complete. In fact such a PVM P is not informationally complete even within the set of pure states. Indeed, the isometry J in the dilation of Theorem 1 for P is unitary, i.e., J = . ⊕ ⊕H H⊕ For another example, as well known, the canonical phase Φcan introduced in Section 1 is not informationally complete within the set of pure states. To see this using Proposition 4, let us 2 1 give the minimal Na˘ımark dilation of Theorem 1 for Φcan in the form (L [0, 2π), (2π)− dθ , Pcan,Jcan), 2 1 where Pcan is the canonical spectral measure of L [0, 2π), (2π)− dθ and 

∞  inθ J = ψ n , ψ (θ)= e− , 0 θ< 2π, n 0 N. can | n ih | n ≤ ∈{ } ∪ Xn=0 inθ Let n N. Since ψn(θ) = e− ψ0(θ) for all θ [0, 2π), defining the decomposable unitary ∈ inθ ∈ 2 1 operator W through (W ψ)(θ) = e− ψ(θ), ψ L [0, 2π), (2π)− dθ , θ [0, 2π), one has n n ∈ ∈ Jcan n = ψn = Wnψ0 = WnJcan 0 = tJcan 0 . This proves the claim.  Let| iM Obs(Σ, ) be an observable| i 6 | ini a separable Hilbert space with the minimal Na˘ımark dilation∈ ( H, P ,J ) of Theorem 1. Let us make a few observationsH and collect a couple conditions thatH⊕ guarantee⊕ ⊕ that M is not informationally complete within the set of pure states and a necessary and sufficient condition for this. 22 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨

If there exist nonzero vectors ϕ, ψ such that (J ϕ)(x) (J ψ)(x) =0 for µ-almost • all x Ω then M is not informationally∈ H completeh within⊕ the| set⊕ of purei states. To see this, fix∈ ϕ and ψ satisfying the above condition. Without restricting generality, assume 1/2 that ϕ =1= ψ . Hence, ϕ ψ = J ϕ J ψ = 0 and ϕ = 2− (ϕ ψ) are unit k k k k h | i h ⊕ | ⊕ i ± ± vectors for which ϕ+ = tϕ for all t T and (J ϕ+)(x) = (J ϕ )(x) for µ-a.a. x Ω, that is, pM 6 = p−M . ∈ k ⊕ k k ⊕ − k ϕ+ ϕ+ ϕ− ϕ− ∈ | ih | | ih | Note, however, that the existence of vectors ϕ and ψ of the above condition is not necessary for an informationally incomplete observable, a counterexample being the canonical phase: Assume that (J ϕ)(θ)(J ψ)(θ) = 0 for dθ-a.a. θ [0, 2π). Then can can ∈ either Jcanψ or Jcanϕ is zero since any Hardy function vanishing on a set of positive measure is identically zero. If there are disjoint sets Xi Σ and nonzero vectors ϕi such that M(Xi)ϕi = ϕi, • i = 1, 2, then M is not informationally∈ complete within the set of pure states. Indeed, let ϕi 0 and Xi, i =1, 2 be as above. Thus, P (Xi)J ϕi = J ϕi for all i =1, 2 implying∈H\{ that } ⊕ ⊕ ⊕

(J ϕ1)(x) (J ϕ2)(x) (J ϕ1)(x) (J ϕ2)(x) =0 |h ⊕ | ⊕ i|≤k ⊕ kk ⊕ k for µ-almost all x Ω so that M is not informationally complete within the set of pure states. ∈ For any decomposable isometry W = ⊕ W (x) dµ(x) ( ), define the operator • Ω ∈ L H⊕ ZW := J ∗ WJ . The observable M is informationallyR complete within the set of pure ⊕ ⊕ states if and only if, for any decomposable isometry W ( ), the operator ZW ∈ L H⊕ strictly decreases the norm (i.e., ZW ϕ < ϕ ) for any nonzero ϕ such that J ϕ is not an eigenvector of W . (Recallk thatk ank k isometry may not have∈ H any eigenvalues⊕ and if eigenvalues exist they belong to T.) To see this, note that, when ϕ and W ( ) is a decomposable isometry, the vector WJ ϕ = tJ ϕ is not in the∈ subspace H ∈L H⊕ ⊕ 6 ⊕ J = if and only if its norm genuinely decreases under the ‘projection’ J ∗ , i.e., ⊕H ∼ H ⊕ J ∗ WJ ϕ < ϕ . Thus we obtain the above as a reformulation of Proposition 4. Note k ⊕ ⊕ k k k that ZW 1 and, if WJ ϕ = tJ ϕ, then ZW ϕ = tϕ. k k ≤ ⊕ ⊕ T ⊕ Suppose that w : Ω is a µ-measurable function and W = Ω w(x)I (x)dµ(x). → M HM Denote Zw := ZW = Ω w(x) d (x). For the informational completenessR of within the set of pure states,R it is necessary that Zw be strictly norm decreasing in the way defined above for any T-valued measurable function w. We see immediately that this condition becomes also sufficient if M is of rank 1. Note that, if w is not a constant on a set of positive measure, then the corresponding W does not have any eigenvalues. For example, in the case of the canonical phase, Wn (n = 0) does not have any eigenvalues 2π inθ 6 and Z = e− dΦ (θ)= ∞ m+n m is an isometry. Often we are interested Wn 0 can m=0 | ih | in the stateR determination powerP of the rank-1 refinement of an observable which is why the rank-1 case is of particular importance. Let us take a closer look at a couple of examples utilizing the observations made above.

Example 1. Consider a phase space observable GS with some generating positive trace-1 operator S. Let us denote the closure of the range of S by . The dilation of Theorem 1 is given by the Hilbert space L2(R2) identified here with the correspondingK L2-space of (equivalence classes of) -valued functions,⊗K the canonical spectral measure P : (R2) L2(R2) , K ⊕ B → L ⊗K P (Z)η)(q,p) = χ (q,p)η(q,p) for all Z (R2), η L2(R2) , and a.a. (q,p ) R2, and ⊕ Z ∈ B ∈ ⊗K ∈ the isometry J : L2(R) L2(R2) , ⊕ → ⊗K 1 1/2 2 2 (J ϕ)(q,p)= S D(q,p)∗ϕ, ϕ L (R), (q,p) R . ⊕ √2π ∈ ∈ OPTIMAL QUANTUM OBSERVABLES 23

It follows that GS is informationally complete within the set of pure states if and only if, whenever R2 (q,p) W (q,p) ( ) is a weakly measurable field of isometries, the operator ∋ 7→ ∈L K 1 1/2 1/2 ZW = D(q,p)S W (q,p)S D(q,p)∗dqdp 2π ZR2 strictly decreases the norm of any nonzero vector ϕ such that J ϕ is not an eigenvector of W . Especially, if w : R2 T is measurable then Z = w(q,p⊕) dG (q,p) is strictly norm → w R2 S decreasing in the above sense if GS is informationally completeR (within the set of pure states). If S is of rank 1 (i.e., GS is rank-1) then GS is informationally complete within the set of pure states if and only if the operators Zw are strictly norm decreasing as above. Let S = rank S s ϕ ϕ be the spectral decomposition of S where ϕ L2(R) is a unit i=1 i| i ih i| i ∈ eigenvectorP associated to the eigenvalue si (0, 1] (and ϕi ϕj = δij, tr [S]= i si = 1). Pick R C ∈ h | i rank S 2 a representative x ϕi(x) from each class ϕi such that i=1 si ϕi(Px) < for all R ∋ 7→ ∈ R2 C | | ∞ x and define a positive semidefinite integral kernel KS : P by ∈ → rank S K (x, y) := s ϕ (x)ϕ (y), (x, y) R2. S i i i ∈ Xi=1 Indeed, K (x, y) 2 K (x, x)K (y,y) and K (x, x)dx = 1 by the monotone convergence | S | ≤ S S R S theorem. Now X := x R K (x, x) = 0R is essentially unique in the sense that, if K˜ is KS { ∈ | S 6 } S another integral kernel of S then XKS and XK˜ S differ in the set of zero. For any ϕ, ψ L2(R) and (q,p) R2 one gets ∈ ∈ 2π (J ϕ)(q,p) (J ψ)(q,p) = D(q,p)∗ϕ SD(q,p)∗ψ h ⊕ | ⊕ i h | i = (D(q,p) ϕ)(x)S(x, y)(D(q,p)∗ψ)(y)dxdy Z Z ∗ XKS XKS

ipx ipy = e ϕ(x + q)KS(x, y)e− ψ(y + q)dxdy q ZZYS,ϕ,ψ where Y q = X X x ϕ(x + q) =0 y ψ(y + q) =0 . If there exists an R > 0 S,ϕ,ψ KS × KS ∩{ | 6 }×{ | 6 } such that X [ R, R] is of measure zero (e.g. S = χ χ ) then it is easy to find ϕ KS \ − | [0,1] ih [0,1]| q R and ψ such that YS,ϕ,ψ is zero measurable for all q and, hence, GS is not informationally complete within the set of pure states by above observation.∈ To connect our analysis with earlier results, define a continuous square-integrable function

2 iqp/2 ipx Sˆ : R R, (q,p) Sˆ(q,p) := tr [D(q,p)S]= e e KS(x, x + q)dx. → 7→ ZR 1 iqp/2 ipx If Sˆ is integrable, K (x, x + q) = e− e− Sˆ(q,p)dp for all q R and a.a. x R. If, S 2π R ∈ ∈ additionally, X [ R, R] is of measureR zero for some R > 0, Sˆ(q,p) = 0 for all p R if KS \ − ∈ q > 2R but Sˆ need not be compactly supported (e.g. S = χ χ for which Sˆ(0,p) = | | | [0,1] ih [0,1]| i(1 eip)/p for all p = 0). Assume then that the of Sˆ is compact and thus contained − 6 in a rectangle [ R0, R0] [ R0, R0]. Now Sˆ is integrable and S(x, x + q) = 0 for (almost) − × − 2 all x and q such that q > R0. Immediately one finds unit vectors ϕ, ψ L (R) such that | | 2 ∈ (J ϕ)(q,p) (J ψ)(q,p) = 0 for all (q,p) R thus showing that GS is not informationally completeh ⊕ within| ⊕ the seti of pure states. Hence,∈ we have obtained Proposition 20(c) of [6] as a special case. Example 2. Let M : 2Ω ( ) be an observable with an at most countably infinite value → L H space Ω = x1, x2,... , and denote Mi := M( xi ) for all i =1, 2,.... Denote the closure of the range of M{ by for} each i and define the Hilbert{ } space := which is equipped with i Ki K i Ki L 24 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨ the canonical spectral measure P : 2Ω ( ) defined by P( x ) ϕ := ϕ for all i and → L K { i} j j i all ϕ . Moreover, define the isometry J : , ϕ JϕL= M1/2ϕ. The triple j j ∈ K H→K 7→ i i ( ,LP,J) is a minimal Na˘ımark dilation for M like the one presented in TheoremL 1. KThe observable M is informationally complete within the set of pure states if and only if, for any isometries W ( ) and any ϕ such that there is no t T such that i ∈ L Ki ∈ H ∈ W M1/2ϕ = tM1/2ϕ for all i, one has Z ϕ < ϕ where Z := M1/2W M1/2. This is a i i i k W k k k W i i i i direct consequence of our earlier observations by noting that, when PWi ( i) are isometries ∈L K and W := W , then WJϕ = tJϕ for some ϕ and t T if and only if W M1/2ϕ = tM1/2ϕ i i ∈ H ∈ i i i for all i. IfLM is of rank 1, this condition can be simplified: M is informationally complete within the set of pure states if and only if, for any (nonconstant) function i wi T and any nonzero 1/2 1/2 7→ ∈ ϕ (such that wiMi ϕ = tMi ϕ for all i), one has Zwϕ < ϕ , where Zw = i wiMi. ∈ H 6 dim i k k k k Finally, we note that, if ϕik K is an orthonormal basis of i for each i one canP define (lin- { }k=1 K early independent) vectors d := M1/2ϕ , k < dim +1, such that M1/2 = ϕ ϕ M1/2 = ik i ik Ki i k | ik ih ik| i ϕik dik , J = eik dik , and Pi := P( xi )= eik eik , whereP eik := δjiϕik; k | ih | i k | ih | { } k | ih | j compareP to SectionP 1. P P L

7. Extreme observables The relevant mathematical structures in quantum theory, sets of states, observables, channels, and instruments, are convex. For example, for observables M, M′ Obs(Σ, ) and p [0, 1], ∈ H ∈ one can determine a mixed observable, a convex combination pM + (1 p)M′ by − pM + (1 p)M′ (X)= pM(X)+(1 p)M′(X), X Σ. − − ∈ Such mixing of devices can be seen as classical noise produced by an imprecise implementation that produces a measurement of M with relative frequency p and something else otherwise. An element x K in a convex set K set is called extreme if, for any y, z K and p (0, 1), x = py + (1 p∈)z implies x = y = z. Thus extreme quantum devices are∈ free of classical∈ noise due to mixing.− The extreme elements of the set of states ( ) are the rank-1 projections ϕ ϕ , ϕ , ϕ = 1, called as pure states, whereas theS extremeH effects are projections. |Theih general| ∈ characterizations H k k of extremality for quantum devices follow ultimately from the following result [1]:

Theorem 8. Suppose that is a unital C∗-algebra and is a Hilbert space. Let Φ CP( ; ) and pick a minimal StinespringA dilation ( ,π,J) forHΦ. The map Φ is an extreme∈ pointA H of the convex set CP( ; ) if and only if theM map A H (ran π)′ D J ∗DJ ( ) ∋ 7→ ∈L H defined on the commutant of the range of π is an injection. We usually say shortly that an observable M : Σ ( ) is extreme if M is an extreme element of Obs(Σ, ). We may elaborate the above extremality→ L H characterization in the case of quantum observablesH [31]. Theorem 9. Let ( , P ,J ) be the minimal Na˘ımark dilation of Theorem 1 for an M Obs(Σ, ) in a separableH⊕ ⊕ Hilbert⊕ space . The observable M is extreme if and only if, for any∈ H H ⊕ decomposable operator D = Ω D(x) dµ(x) ( ), the condition J ∗ DJ =0 implies D =0. ∈L H⊕ ⊕ ⊕ R It is an immediate result of Theorem 9 that PVMs are extreme. This can also be proven directly by using the fact that projections are the extreme elements of the set of effects. Also, if (Ω, Σ) is nice and dim < then an extreme observable M Obs(Σ, ) is discrete. Indeed, using an exactly measurableH ∞ function f : Ω R such that∈f(Ω) H (R), we now obtain → ∈ B OPTIMAL QUANTUM OBSERVABLES 25

1 an extreme observable M f − Obs (R), which is supported on an at most countable ◦ ∈ B H set λ1,λ2,... R [15, Section 5]; see also [7]. This means that M is supported by the set { 1 } ⊂ 1 f − ( λ ) where f − ( λ ) are atoms of Σ. Below are some examples on extreme observables i { i} { i} whichS are not PVMs.

One can show that the phase space observable GS introduced in Section 1 is extreme • if and only if S is pure, S = ψ ψ , and (q,p) ψ D(q,p)ψ = 0 for all (q,p) R2 | ih | 7→ h | i 6 ∈ [20]. Hence, if GS is extreme then it is informationally complete (but the converse does not hold). Especially, when S = 0 0 , i.e., the generating state is the state, | ih | then we get the rank-1 informationally complete extreme observable G 0 0 . | ih | The canonical phase Φcan introduced in Section 1 is extreme [19]. • Fix a number m > 0 and denote byϕ ˆ the Fourier-Plancherel transformation of ϕ • L2(R); if ϕ L1(R) L2(R) we may write ∈ ∈ ∩ 1 ipx ϕˆ(p)= e− ϕ(x) dx, p R. √2π ZR ∈ The canonical time-of-arrival observable τ : (R) L2(R) for a free mass-m particle B →L moving in R defined through 

1 ∞ ∞ it 2 2 2m (p2 p1) ˆ ˆ ϕ τ(X)ψ = e − ϕˆ(p1)ψ(p2)+ ϕˆ( p1)ψ( p2) √p1p2 dp1 dp2 dt h | i 2πm Z Z Z − − X 0 0  for any X (R) and any vectors ϕ and ψ from the of rapidly decreasing functions,∈ is B extreme [17]. We see from Theorem 9 that if M is extreme so is its rank-1 refinement M1 [32]. The two first examples given above are of rank 1. The third example, however, is of rank 2. The rank-1 1 1,2 2 refinement of the canonical time observable is τ : 2{ } (R) L (R) , ⊗ B →L  ∞ ∞ it 2 2 1 1 (p p ) k k ϕ τ ( k X)ψ = e 2m 2− 1 ϕˆ ( 1) p1 ψˆ ( 1) p2 √p1p2 dp1 dp2 dt h | { } × i 2πm Z Z Z − − X 0 0   for all X (R), k = 1, 2, and all ϕ and ψ from the Schwartz space of rapidly decreasing functions.∈ Thus, B τ 1 is extreme too. Let us next consider an example where we show how to construct an observable in a separable Hilbert space with all the optimality properties discussed this far, i.e., a post-processing clean (rank-1) informationally complete extreme observable.

Example 3. Let N := N and Nd := 1, 2,...,d for each d N. Let d, d N , ∞ ∪ {∞} { } ∈ H ∈ ∞ be a d-dimensional Hilbert space with having the orthonormal basis n n∞=1. We assume H∞ N d{| i} that d1 d2 whenever d1 d2 so that, for any d , n n=1 is an orthonormal H ⊆ H ⊆ H∞ ≤ ∈ {| i} basis for d. For allHn, m N, pick some numbers p > 0 for which ∈ nm ∞ p := p < . nm ∞ n,mX=1 Let n, m N such that n < m. Define the following vectors: ∈ f := n , f := n + m , f := n i m , nn | i nm | i | i mn | i − | i so that f f = n n + n m + m n + m m , | nm ih nm| | ih | | ih | | ih | | ih | f f = n n + i n m i m n + m m | mn ih mn| | ih | | ih | − | ih | | ih | 26 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨ and thus n n = f f , | ih | | nn ih nn| 2 n m = f f f f f f | ih | | nm ih nm|−| nn ih nn|−| mm ih mm| i f f f f f f  , − | mn ih mn|−| nn ih nn|−| mm ih mm| 2 m n = f f f f f f  | ih | | nm ih nm|−| nn ih nn|−| mm ih mm| + i f f f f f f  . | mn ih mn|−| nn ih nn|−| mm ih mm|  Hence, for any d N , the linearly independent set29 ∈ ∞ := p f f ( ) n, m 0 are such # 0 2 ∈ H k k that I ϕkI = tr [S ] < . Note that k=1 k k I ∞ P 2 2 ϕkI = ϕkI − S ϕkI = ϕkI − pnm fnm ϕkI fnm ran S = ran S 0 = lin fnm (n, m) 0 k k I k k h | i ∈ I I { | ∈ I } (n,mX) ∈I # 0 2 and I ϕI − ϕI ϕI is the projection from onto the # –dimensional Hilbert space k=1 k k k | k ih k | Hd I0 ran SP0 . I 30 We assume next that # 0 = d so that fnm (n, m) 0 is a basis of d so that S is I { | ∈ I } H I of full rank and thus invertible. Let 2I be the power set of and define the following discrete I rank-1 POVM M : 2I ( ): →L Hd 1/2 1/2 Mnm := √pnmS− fnm √pnmS− fnm , (n, m) . | I ih I | ∈ I Since, for any complex numbers c such that sup c (n, m) < , nm {| nm|| ∈ I} ∞ 1/2 1/2 cnmMnm = cnmpnm S− fnm S− fnm =0 | I ih I | (n,mX) (n,mX) ∈I ∈I 1/2 1/2 1/2 1/2 if and only if (n,m) cnmpnm fnm fnm = S (n,m) cnmpnm S− fnm S− fnm S = ∈I | ih | I ∈I | I ih I | I 0 if and onlyP if cnm 0, the observable M is extremeP with N = # # 0 = d elements. ≡ 2 I ≥ I Automatically, d N d which must hold for any extreme rank-1 POVM. If = Nd Nd ≤ ≤ 2 I × (or = N N if d = ) then N = d and d is a basis of ( d) showing that M is also informationallyI × complete.∞ Note that, in the caseB N = d, we get aL PVMH M. When N runs from d to d2 the value determination ability weakens but state determination power increases.

29 If d = then the of ∞ is dense in ( ∞) with respect to the . 30 ∞ B L H For example, = Nd Nd and 0 = (n,n) n Nd (if d< ). I × I { | ∈ } ∞ OPTIMAL QUANTUM OBSERVABLES 27

7.1. Joint measurements of extreme observables. We now concentrate on joint measure- ments involving extreme observables. To this end, let us recall the sets CP( ; ) of completely A H positive unital maps defined on a unital C∗-algebra and taking values in a type-1 factor ( ). A L H Let and be von Neumann algebras. We say that Φ1 CP( ; ) and Φ2 CP( , ) are compatibleA ,B if there is a joint map Ψ CP( ; ) such∈ thatA ΦHcoincides with∈ theB marginH ∈ A ⊗ B H 1 Ψ(1) and Φ2 coincides with the margin Ψ(2), where

Ψ(1)(a)=Ψ(a 1 ), Ψ(2)(b) = Ψ(1 b), a , b . ⊗ B A ⊗ ∈A ∈ B The following result has been obtained in [16].

Theorem 10. Let and be von Neumann algebras and Φ1 CP( ; ) and Φ2 CP( ; ) be compatible. A B ∈ A H ∈ B H

(i) If Φ1 is extreme in CP( ; ) or Φ2 is extreme in CP( ; ) then they have a unique joint map Ψ CP( A; H). B H (ii) If both Φ and∈Φ areA extreme ⊗ B H then the unique joint map Ψ is extreme in CP( ; ). 1 2 A⊗B H (iii) If Φ1 or Φ2 is a *-representation (i.e., especially extreme), one has Φ1(a)Φ2(b) = Φ2(b)Φ1(a) for all a and all b and the unique joint map Ψ CP( ; ) is given by ∈A ∈ B ∈ A ⊗ B H Ψ(a b) = Φ (a)Φ (b), a , b . ⊗ 1 2 ∈A ∈ B Applying Theorem 10 to the case of an extreme observable and other measurement devices compatible with this observable, we obtain the following [16]:

Theorem 11. Let observables M : Σ ( ) and M′ : Σ′ ( ) be jointly measurable, and let Φ: ( ) ( ) be a channel compatible→L H with M. If M→Lis extremeH then L K →L H (i) the M-instrument : ( ) Σ ( ) such that (B, Ω) = Φ(B) for all B ( ) is unique and J L K × →L H J ∈L K (ii) the joint observable N : Σ Σ′ ( ) for M and M′ is unique. ⊗ →L H Moreover, if M is a PVM, (i)’ M(X)Φ(B) = Φ(B)M(X) for all X Σ and all B ( ) and the only M-instrument : ( ) Σ ( ) such that (∈, Ω) = Φ is given∈ L byK J L K × →L H J · (B,X)=Φ(B)M(X)= M(X)1/2Φ(B)M(X)1/2, B ( ), X Σ, J ∈L K ∈ and (ii)’ M(X)M′(Y )= M′(Y )M(X) for all X Σ and all Y Σ′ and the only joint observable ∈ ∈ N : Σ Σ′ ( ) for M and M′ is determined by ⊗ →L H 1/2 1/2 N(X Y )= M(X)M′(Y )= M(X) M′(Y )M(X) , X Σ,Y Σ′. × ∈ ∈ The above result essentially means that there is only one way in which an extreme observ- able can be measured if we fix the unconditioned state transformation associated with the measurement. Similarly, there is only one observable incorporating an extreme marginal ob- servable and some other fixed observable. The corresponding conditions for joint measurements involving PVMs, being from a special subclass of extreme observables, are even more stringent: compatibility or joint measurability with a PVM requires that the other measurement device (observable or channel) commutes with the PVM.

8. Observables determining their values We say that an observable M : Σ ( ) determines its values if for any set of its outcomes we may prepare the system into a state→L suchH that, in a measurement of M, the values obtained 28 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨ are approximately localized within the given set. Formally, this means that, for any X Σ such that M(X) = 0 and any ε (0, 1], there is a state ρ ( ) such that pM(X) > 1 ε∈. 6 ∈ ∈ S H ρ − Suppose that M : Σ ( ) determines its values and X Σ is such that M(X) > 0. We may evaluate for any ε→(0 L,H1] ∈ ∈ M M(X) = sup tr [ρM(X)] = sup pρ (X) > 1 ε k k ρ ( ) ρ ( ) − ∈S H ∈S H showing that M(X) = 1. Indeed, we see that this reasoning can easily be inverted: an observable determinesk k its values if and only if it has the norm-1 property, i.e., M(X) = 1 for all X Σ such that M(X) = 0. k k A more∈ stringent condition6 than the norm-1 property is the eigenvalue-1 property: M Obs(Σ, ) is an eigenvalue-1 observable if and only if, whenever X Σ is such that M(X) = 0,∈ then M(HX) has the eigenvalue 1. This means that for any X Σ∈ such that M(X) = 0 there6 M ∈ 6 is a state ρX ( ) “localized” in X in the sense that pρX (X) = 1, i.e., the approximate “ε-localization”∈ associated S H with norm-1 observables can be replaced with exact localization. Of course, we may always assume that ρX above is pure. Clearly, PVMs are eigenvalue-1 observables and there exist norm-1 POVMs which have not eigenvalue-1 property, e.g. the canonical phase Φcan. Consider then a norm-1 observable M Obs(Σ, ) in a finite-dimensional Hilbert space. Denote d = dim . Since any effect has∈ fully discreteH spectrum in finite dimensions, M has H the eigenvalue-1 property. For i = 1, 2, let the sets Xi Σ and unit vectors ϕi be such that M(X )ϕ = ϕ and X X = (we assume that M∈is not trivial). By using∈ the H minimal i i i 1 ∩ 2 ∅ Na˘ımark dilation for M, one gets ϕ1 ϕ2 = 0. Hence, there exist at most d pairwise disjoint sets X Σ such that M(X) =0. Ifh (Ω|, Σ)i is, e.g., nice we may conclude that M is discrete, i.e. ∈ 6 M N M there exist points xi Ω, i =1,...,N d, so that = i=1 (Axi )δxi where Axi is the atom ∈ ≤ R associated with xi. This result follows by using an exactlyP measurable function f : Ω such that f(Ω) (R) and results of [15, Section 5]. → Finally,∈ let B us recall that an eigenvalue-1 observable cannot be informationally complete even within the set of pure states. Indeed, it is easy show that this holds in arbitrary (nonseparable) Hilbert spaces. The question, whether a norm-1 observable can be informationally complete, remains to be answered.

9. Pre-processing and pre-processing maximality Let us start this section with a definition.

Definition 7. Let (Ω, Σ) be a measurable space and and ′ be Hilbert spaces. We say that H H an observable M′ : Σ ( ′) is a pre-processing of an observable M : Σ ( ) if there is a →L H →L H channel Φ : ( ) ( ′) such that M′(X) = Φ M(X) for all X Σ. L H →L H ∈ M′ M  The above definition means that p = p for all σ ( ′), that is, we may measure σ Φ∗(σ) ∈ S H M′ by first transforming the system with the channel Φ and then measuring M. The predual channel Φ is here seen as a form of quantum pre-processing that is used to process the incoming state carrying∗ quantum information before the measurement of M. Pre-processing gives naturally rise to a partial order within the class of observables with the fixed value space (Ω, Σ) and varying system’s Hilbert space [3]. We denote M′ M if H ≤pre M′ is a pre-processing of M by some channel. We may thus ask which are the pre-processing maximal or clean observables. Maximality of an M Obs(Σ, ) means that if M M′ with ∈ H ≤pre some observable M′ : Σ ( ′) such that M′ M (i.e. M′(X) = 0 exactly when M(X) = 0) → L H ∼ then M′ M. ≤pre OPTIMAL QUANTUM OBSERVABLES 29

In the following subsections, we characterize the pre-processing maximal observables first in the case of discrete outcomes and then for the general case. The reason for this division is that we may formulate the maximality in a tighter fashion for discrete observables. Let us first recall a result from [31]:

Theorem 12. Suppose that and ′ are separable Hilbert spaces, P : Σ ( ′) is a sharp observable (a PVM), and M :H Σ H( ) is some observable such that M →LP.H There exists a →L H ≪ channel Φ′ : ( ′) ( ) such that M(X) = Φ′ P(X) for all X Σ, i.e., M P. L H →L H ∈ ≤pre 9.1. Case of discrete observables. The theorem below characterizes pre-processing clean discrete observables. Theorem 13. Suppose that Ω is a finite or a countably infinite set and M :2Ω ( ) is an observable in a separable Hilbert space . Then M is pre-processing clean if and→L onlyH if it has the eigenvalue-1 property. H Proof. We give the proof for the ‘only if’ part for an observable with a more general value space (Ω, Σ) where Σ is countably generated, since this yields no extra complications. As- sume that M Obs(Σ, ) is pre-processing clean and µ is a probability measure on (Ω, Σ) such that M ∈ µ. AlsoH define the PVM P : Σ L2(µ) , P (X)ψ (x) = χ (x)ψ(x) ∼ µ → L µ X for all X Σ, all ψ L2(µ), and µ-a.a. x Ω. Hence,  since L2(µ) is separable, ac- ∈ ∈ ∈ 2 cording to Theorem 12, Pµ(X) = Φ M(X) where Φ : ( ) L (µ) is a channel and 2 L H → L thus, for all ρ L (µ) , X Σ, one has tr[ρPµ(X)] = tr [Φ (ρ)M(X)]. We define ρX := ∈ S ∈ ∗ r µ(X) 2 χ χ L2(µ) when µ(X) > 0 (or M(X) = 0). LetΦ (ρ )= λ ϕ ϕ , − X X  X n=1 n n n | rih | ∈ S 6 ∗ | ih | λn > 0, n=1 λn = 1, ϕn ϕm = δnm, be the spectral decomposition of the stateP Φ (ρX ). Now h | i r ∗ 1 = tr [ρPX Pµ(X)] = tr [Φ (ρX )M(X)] = n=1 λn ϕn M(X)ϕn implying that ϕn M(X)ϕn =1 ∗ 2 h | i h | i and thus M(Ω X)ϕn = ϕn M(ΩPX)ϕn = ϕn [I M(X)]ϕn =0 or M(Ω X)ϕn = k \ k h | \ i h | H − i \ M(Ω Xp) M(Ω X)ϕ =0 or M(X)ϕ =1 ϕ for all n

Proof. Suppose that RAR is a projection. We now have (I R)AR ∗ (I R)AR = RA(I R)AR =0 − − − implying that (I R)AR = 0, i.e., AR = RAR. We obtain − AR = RAR =(RAR)∗ =(AR)∗ = RA, proving the claim.  Let and be Hilbert spaces and Φ : ( ) ( ) a channel. There is a minimal projectionH R K ( ) such that Φ(R) = I ,L i.e.,H if→Q L K ( ) is such that Φ(Q) = I and Q R, then ∈Q P= HR. Thus, R can be calledK the support∈ P ofH Φ (indeed it is uniquely definedK by≤ Φ). Moreover, Φ(A) = Φ(RAR) for all A ( ) and, whenever A ( ) is an effect, Φ(A) = 0 if and only if RAR = 0 [5, Section 10.8].∈ L WeH now easily obtain the∈ L followingH result. Lemma 2. Let and be Hilbert spaces and Φ: ( ) ( ) a channel with the support projection R. IfHΦ(A) isK a projection for some effectL HA → L( K), then RAR is a projection, ∈ L H RA = AR, and A = RAR + R⊥AR⊥. Proof. Let A ( ), 0 A I and assume that Φ(RAR) is a projection. Using the Schwartz inequality∈ L H (applicable≤ especially≤ H to unital completely positive maps), Φ(RAR)=Φ(RAR)2 Φ(RARAR) ≤ implying Φ(RAR RARAR) 0. Since RAR RARAR, one finds that Φ(RAR RARAR)= 0. Since RAR −RARAR =≤R(A ARA)R ≥and 0 A ARA I , the properties− of the support projection− cited above imply− that RAR = ≤RARAR− , i.e.,≤RARH ( ). Lemma 1 ∈ P H yields AR = RA so that A = RAR + RAR⊥ + R⊥AR + R⊥AR⊥ = RAR + R⊥AR⊥.  Theorem 14. Suppose that the σ-algebra Σ 2Ω is countably generated and M : Σ ( ) is an observable operating in a separable Hilbert⊆ space . Then M is pre-processing→ clean L H if and only if there exist a closed subspace , a PVMH E : Σ ( ), and a POVM M ⊆ H → L M F : Σ ( ⊥) such that M E and →L M ∼ (5) M(X)= E(X) F(X), X Σ. ⊕ ∈ Proof. Suppose that M is pre-processing clean. Then M can be pre-processed with a channel Φ: ( ) L2(µ) into the observable P of the first part of the proof of Theorem 13, L H → L µ where µ M. We have by Lemma 2 that there is an R ( ) such that Φ(R) = IL2(µ) and ∼ ∈ P H M(X) = RM(X)R + R⊥M(X)R⊥, where RM(X)R is a projection, for all X Σ. Suppose ∈ that RM(X)R = 0 for some X Σ. Now Pµ(X)=Φ(RM(X)R) = 0 implying that µ(X) = 0. Hence M RM( )R; the contrary∈ is, of course, automatically satisfied. Thus, we may choose ≪ · = R and E(X)= RM(X)R and F(X)= R⊥M(X)R⊥ for all X Σ. MAssumeH then that the decomposition of M into E and F of the claim∈ exists. Clearly, E M ≤pre (with a rank-1 channel defined by the projection from onto ). If M′ : Σ ( ′) H M → L H is an observable such that M′ M E and M M′ then, according to Theorem 12, ∼ ∼ ≤pre M′ E M.  ≤pre ≤pre As also seen in the proof of Theorem 13, for M Obs(Σ, ) to be pre-processing clean, all the nonzero effects M(X) must have the eigenvalue∈ 1. ThisH is clearly satisfied by the direct sum form (5) since E is a PVM and, whenever E(X) = 0, M(X) = 0 as well. However, the contrary is problematic: if all nonzero M(X) have the eigenvalue 1, does it follow that we have the decomposition (5) with a fixed subspace ? This would mean that Theorem 13 extends plainly to general observables. We leave thisM as an open question. Moreover, in the finite- dimensional case, norm-1 observables also have the eigenvalue-1 property, which for (discrete) POVMs implies post-processing maximality; recall that now norm-1 POVMs are discrete if OPTIMAL QUANTUM OBSERVABLES 31 their outcome spaces are regular enough. An important infinite-dimensional and continuous counter example is the canonical phase observable Φcan which is of norm 1, but none of its effects Φcan(Θ) = I has the eigenvalue 1. Especially, Φcan is not pre-processing maximal. A different analysis6 H of pre-processing can result in remarkably different characterizations of pre-processing clean observables. For instance, the authors of [3] concentrate on finite- outcome observables on a fixed finite-dimensional Hilbert space . In this setting, an N-valued H observable M in is clean if for any N-valued observable M′ in such that there exists H H a channel Φ : ( ) ( ) with M = Φ(M′ ), i = 1,..., N, there also exists a channel L H → L H i i Ψ: ( ) ( ) such that Mi′ = Ψ(Mi), i = 1,..., N. With this definition, the set of cleanL observablesH → L H within the set of N-valued observables in are exactly those M such that H Mi = 1 for all i = 1,..., N, in the case where N dim . However, now also the case kN >k dim is possible and, in general, any rank-1 observable≤ H is clean. The difference in the definitionH of post-processing and clean observables of [3] and the corresponding definitions of this paper is that, in [3] one is restricted to using a single system within which to carry out pre-processing whereas in our analysis one is free to use any systems for pre-processing (no limitation to dimensionality of the Hilbert space from which one pre-processes). Remark 3. Norm-1 observables, and hence also pre-processing maximal observables as eigenvalue- 1 observables, are an example of so-called regular observables [5, Section 11.3]: An effect E ( ) is called regular if E I E and I E E or, equivalently, the spectrum of E extends∈ L H both above and below6≤ 1/2.H − For example,H − a rank-16≤ effect p ϕ ϕ , p (0, 1], ϕ , 1 | Mih | ∈ ∈ H ϕ = 1, is regular if and only if p> 2 (if dim > 1). An observable : Σ ( ) is regular kif Mk (X) is regular whenever 0 = M(X) = I .H There exist regular POVMs→L whichH are not of M 1 2 6 M 6 2H 1 norm-1 (e.g. 1 = 3 1 1 + 3 2 2 , 2 = 3 1 1 + 3 2 2 is regular but not norm-1). It is simple to check| ih that| whenever| ih | M : Σ | ih( |) is| regular,ih | its range ran M equipped with →L H the intersection M, ∧ran M(X) M M(Y ) := inf M(Z) M(Z) M(X), M(Y ) ,X,Y Σ, ∧ran { | ≤ } ∈ and the complementation ′ : M(X) M(X)′ = I M(X) is a Boolean algebra, i.e., especially 7→ H − M(X) M M(X)′ = 0 for all X Σ. In fact, the converse is true as well [14]: if M : Σ ( ) ∧ran ∈ →L H is an observable such that (ran M, ran M,′ ) described above is a Boolean algebra, then M is regular. Hence, a regular POVM M∧preserves the ‘classical’ Boolean logic between the Boolean algebras Σ and ran M. Whether a regular observable can be informationally complete remains to be seen. This is not possible in the finite-dimensional case. To see this, let us consider an N-valued observable N M = (Mi)i=1 in a d-dimensional (d < ) Hilbert space . Taking the trace on both sides of N ∞ H the equation I = Mi and assuming that M is informationally complete and regular, one H i=1 arrives at d = N Ptr [M ] > N/2 d2/2, where the first inequality follows from regularity and i=1 i ≥ the second fromP informational completeness. This is possible only if d = 1. The above no-go result can be alleviated by relaxing the requirement on informational com- pleteness within the set of all states. Let us consider an example in the two-dimensional Hilbert space. Fix the σx, σy, and σz which in the eigenbasis of σz take the form 0 1 0 i 1 0 σx = , σy = − , σz = .  1 0   i 0   0 1  − 3 Also denote b σ := bxσx + byσy + bzσz for any b = (bx, by, bz) R . We now define the · 1 ∈ three-valued observable M =(M , M , M ) by M =3− (I + a σ), i =1, 2, 3, where 1 2 3 i i · 1 √3 1 √3 a1 = (1, 0, 0), a2 = , , 0 , a3 = , , 0 .  − 2 2   − 2 − 2  32 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨

It is easily checked that M is an extreme rank-1 observable and the non-zero eigenvalue of Mi is 2/3 for each i =1, 2, 3. Thus, M is regular. Moreover, M is informationally complete within the restricted set of states ρ such that tr [ρσz] = 0. Thus, M is ‘more informationally complete’ than any PVM can be in the two-dimensional case. Indeed, whenever P =( d1 d1 , d2 d2 ) is a PVM, the maximal subset of states where P is informationally complete| is theih convex| | ih hull| of the states d d and d d parametrized by a single parameter p [0, 1] whereas one | 1 ih 1| | 2 ih 2| ∈ needs two parameters for the set of states ρ for which tr [ρσz] = 0.

10. Conclusions In this paper we have identified some important optimality properties of a quantum ob- servable represented mathematically as a POVM M: Determination of the past (the pre- measurement state of the system), i.e., informational completeness; freedom from dependence on more informational measurements from which the output data of M may be processed, i.e., post-processing maximality; freedom from interference of different measurement schemes, i.e., extremality; determination of values, i.e., whether for any outcome set X one can prepare the system in a state realizing a value from X with arbitrarily high accuracy; determination of the future, i.e., any measurement of M also works as a state preparator; and freedom from quantum noise, i.e., pre-processing maximality. We have investigated these properties and generalized results known for discrete observables for more general observables. We have also found connections between these conditions: Pre-processing maximality and determination of future are equivalent and both are characterized by the rank-1 property of M. Moreover, using a refinement procedure, one can replace an informationally complete (resp. extreme) M with an informationally complete (resp. extreme) rank-1 POVM M1. Determination of values is equivalent with the norm-1 property (i.e. M(X) = 1 for all outcome sets X, M(X) = I ) and using our characterizations, we immediatelyk k see that, when M is preprocessing clean,6 H it automatically defines its values. We may conclude that there are two major lines of optimality for quantum observables: On one hand, an observable may be informationally complete, and, at the same time, such an observable may also be free from all kinds of classical noise, i.e., it may be extreme and post-processing clean simultaneously. On the other hand, an observable may define its values, i.e., have the norm-1 property; especially, the observable subtype may be pre-processing clean. However, we are not aware of any norm-1 informationally complete observable; informational completeness requires properties that are strongly opposed to properties found in typical norm- 1 observables: unsharpness, noncommutativity, and nonlocalizability, namely the inability to prepare systems into states yielding particular outcomes in the measurement with certainty. Thus the two main optimality criteria, informational completeness (determination of past) and determination of values, at its strongest in eigenvalue-1 observables, appear as complementary properties of a quantum observable. However, an observable may be free from classical noise (extreme rank-1) as well as from quantum noise (pre-processing maximality) simultaneously, in which case the observable is forced to be a rank-1 PVM, which automatically defines its values and the future of the quantum system (only measurements of such observables are preparative) but fails to determine the past of the system. We have also discussed and reviewed results concerning joint and sequential measurements involving optimal observables. Especially, all the observables which are jointly measurable with a rank-1 observable M are smearings (post-processings) of M, and for a jointly measurable pair (M, M′) of observables, where either one of the observables is extreme, there exists a unique joint observable giving M and M′ as its margins. OPTIMAL QUANTUM OBSERVABLES 33

Acknowledgements The authors would like to thank T. Heinosaari for his remarks on the manuscript and Y. Kuramochi for turning the authors’ attention to Ref. [2]. E. H. also acknowledges financial support from the Japan Society for the Promotion of Science (JSPS) as an overseas postdoctoral fellow at JSPS.

References [1] W. B. Arveson, “Subalgebras of C∗-algebras,” Acta Math. 123, 141-224 (1969) [2] R. Beukema, “Maximality of positive operator-valued measures”, Positivity 10, 17-37 (2006) [3] F. Buscemi, M. Keyl, G. M. D’Ariano, P. Perinotti, and R. F. Werner, “Clean positive operator valued measures”, J. Math. Phys. 46, 082109 (2005) [4] P. Busch and P. Lahti, “The determination of the past and the future of a physical system in ”, Found. Phys. 19, 633-678 (1989) [5] P. Busch, P. Lahti, J.-P. Pellonp¨a¨a, and K. Ylinen, Quantum Measurement (Springer, 2016) [6] C. Carmeli, T. Heinosaari, J. Schultz, and A. Toigo, “Tasks and premises in quantum state determination”, J. Phys. A: Math. Theor. bf 47, 075302 (2014) [7] G. Chiribella, G. M. D’Ariano, and D. Schlingemann, “How continuous quantum measurements in finite dimensions are actually discrete”, Phys. Rev. Lett. 98, 190403 (2007) [8] H. Cycon and K.-E. Hellwig, “Conditional expectations in generalized probability theory”, J. Math. Phys. 18, 1154-1161 (1977) [9] G. M. D’Ariano, P. Lo Presti, and P. Perinotti, “Classical randomness in quantum measurements”, J. Phys. A: Math. Gen. 38, 5979 (2005) [10] E. B. Davies, Quantum Theory of Open Systems (Academic Press, London, 1976) [11] E. B. Davies and J. T. Lewis,“An operational approach to quantum probability”, Commun. Math. Phys. 17, 239-260 (1970) [12] J. Dixmier, Von Neumann Algebras (North-Holland, Amsterdam, 1981) [13] S.V. Dorofeev and J. de Graaf, “Some maximality results for effect-valued measures”, Indag. Math. (N. S.) 8(3), 349-369 (1997) [14] P. Dvureˇcenskij and S. Pulmannov´a, “Difference posets, effects, and quantum measurements”, Int. J. Theor. Phys. 33, 819-850 (1994) [15] E. Haapasalo, T. Heinosaari, and J.-P. Pellonp¨a¨a, “Quantum measurements on finite dimensional systems: relabeling and mixing,” Quantum Inf. Process. 11, 1751-1763 (2012) [16] E. Haapasalo, T. Heinosaari, and J.-P. Pellonp¨a¨a, “When do pieces determine the whole? Extreme marginals of a completely positive map”, Rev. Math. Phys. 26, 1450002 (2014) [17] E. Haapasalo and J.-P. Pellonp¨a¨a, “Extreme covariant quantum observables in the case of an Abelian group and a transitive value space”, J. Math. Phys. 52, 122102 (2011) [18] T. Heinosaari and T. Miyadera, “Universality of sequential quantum measurements”, Phys. Rev. A 91, 022110 (2015) [19] T. Heinosaari and J.-P. Pellonp¨a¨a, “The canonical phase is pure”, Phys. Rev. A 80, 040101 (R) (2009) [20] T. Heinosaari and J.-P. Pellonp¨a¨a, “Generalized coherent states and extremal positive operator valued measures”, J. Phys. A: Math. Theor. 45, 244019 (2012) [21] T. Heinosaari and M. M. Wolf, “Non-disturbing quantum measurements”, J. Math. Phys. 51, 092201 (2010) [22] A. S. Holevo, M. E. Shirokov, and R. F. Werner, “On the notion of entanglement in Hilbert spaces”, Russian Math. Surveys 60, 359-360 (2005) [23] T. Hyt¨onen, J.-P. Pellonp¨a¨a, and K. Ylinen, “Positive sesquilinear form measures and generalized eigenvalue expansions,” J. Math. Anal. Appl. 336, 1287-1304 (2007) [24] A. Jenˇcov´a, S. Pulmannov´a, and E. Vincecov´a, “Sharp and fuzzy observables on effect algebras”, Int. J. Theor. Phys. 47, 125-148 (2008) [25] J. Kiukas, P. Lahti, J. Schultz, and R. F. Werner, “Characterization of informational completeness for covariant phase space observables”, J. Math. Phys. 53, 102103 (2012) [26] J. Kiukas, J.-P. Pellonp¨a¨a, and J. Schultz, “State reconstruction formulas for the s-distributions and quadra- tures”, Rep. Math. Phys. 66, 55-84 (2010) [27] Y. Kuramochi, “Minimal sufficient positive-operator valued measure on a separable Hilbert space”, J. Math. Phys. 56, 102205 (2015) 34 ERKKATHEODORHAAPASALOANDJUHA-PEKKAPELLONPA¨ A¨

[28] P. Lahti and K. Ylinen, “Dilations of positive operator measures and bimeasures related to quantum mechanics”, Math. Slovaca 54, 169-189 (2004) [29] H. Martens and W. M. de Muynck, “Nonideal quantum measurements”, Found. Phys. 20, 255-281 (1990) [30] K. R. Parthasarathy, “Extremal decision rules in quantum hypothesis testing”, Infinite Dimens. Anal. 2, 557-568 (1999) [31] J.-P. Pellonp¨a¨a, “Complete characterization of extreme quantum observables in infinite dimensions”, J. Phys. A: Math. Theor. 44, 085304 (2011) [32] J.-P. Pellonp¨a¨a, “Quantum instruments: II. Measurement theory”, J. Phys. A: Math. Theor. 46, 025303 (2013) [33] J.-P. Pellonp¨a¨a, “Complete quantum measurements break entanglement”, Phys. Lett. A 376, 3495 (2012) [34] J.-P. Pellonp¨a¨a, “Complete measurements of quantum observables”, Found. Phys. 44, 71-90 (2014) [35] J.-P. Pellonp¨a¨a, “On coexistence and joint measurability of rank-1 quantum observables”, J. Phys. A: Math. Theor. 47, 052002 (2014) [36] J.-P. Pellonp¨a¨aand J. Schultz, “Measuring the canonical phase with phase-space measurements”, Phys. Rev. A 88, 012121 (2013) [37] C. Preston, “Some notes on standard Borel and related spaces” (2008). Available at https://arxiv.org/abs/0809.3066 [38] K. Ylinen, “Positive operator bimeasures and a noncommutative generalization”, Studia Math. 118, 157 (1996)

Department of Nuclear Engineering, Kyoto University, 6158540 Kyoto, Japan

Turku Centre for Quantum Physics, Department of Physics and Astronomy, University of Turku, FI-20014 Turku, Finland