Optimal quantum tomography

Alessandro Bisio, Giulio Chiribella, Giacomo M. D’Ariano, Stefano Facchini, and Paolo Perinotti

arXiv:1702.08751v1 [quant-ph] 28 Feb 2017

Abstract: The present short review article illustrates the latest theoretical developments on quantum tomography, regarding general optimization methods for both data-processing and setup. The basic theoretical tool is the informationally complete measurement. The optimization theory for the setup is based on the new theoretical approach of quantum combs.

Index Terms: Quantum Tomography, Quantum Process Tomography

I. INTRODUCTION

Fine calibration of apparatuses underlies any precise experiment, and the quest for precision and reliability is relentlessly increasing with the strict requirements of the new photonics, nanotechnology, and the new world of quantum information. The latter, in particular, depends crucially on the reliability of processes, sources and detectors, and on precise knowledge of all sources of noise, e.g. for error correction.

But what does it mean to calibrate a quantum device? It is a much harder task than calibrating a classical "scale". For example, for calibrating a photo-counter, we don't have standard sources with precise numbers of photons, the equivalent of the "standard weights" for the scale. Even worse, we never know for sure that all photons have actually been absorbed by the detector. The practical problem is then to perform a kind of quantum calibration to determine in a purely experimental manner (by relying on some well established measuring instruments) the quantum description of our device, without the need of a detailed theoretical knowledge of its inner functioning, be it a measuring apparatus, a quantum channel, a quantum gate, or a source of quantum states.

Here comes the powerful technique of quantum tomography. Originally invented for determining the quantum state of radiation (for recent reviews see the book [1] and e.g. Refs. [2], [3]), it soon became the universal measuring technique by which one can determine any ensemble average and measure the fine details of any quantum operation, channel, or measuring instrument, objects that before were just theoretical tools (for history and references see next section).

In the present short review article we will illustrate our latest theoretical developments on quantum tomography, consisting in a first systematic theoretical approach to optimization of both data-processing and setup. Therefore, apart from the historical excursus of the next section, where we mention the relevant contributions from other authors, the body of the paper focuses only on our theoretical work.

The basic tool of the theoretical approach is the informationally complete measurement [4] (see Refs. [5], [6] for applications in the present context), corresponding to the mathematical theory of operator bases. The optimization of data-processing [7] relies on the fact that as an operator basis the informationally complete measurement is typically linearly dependent, allowing different expansion coefficients, which can then be optimized according to specific criteria. The optimization theory for the setup [8], on the other hand, needs the new theory of quantum combs and quantum testers [9], novel powerful notions in quantum mechanics, which generalize those of quantum channel and of POVM (positive-operator-valued measure). These will be briefly reviewed in the section before conclusions. As the reader will see, the theoretical framework is sufficiently general and mature for a concrete optimization in the lab, i.e. accounting for realistic bounded resources, and this will be the direction of future development of the field.

II. HISTORICAL EXCURSUS

Quantum tomography is a relatively recent discipline. However, the possibility of "measuring the quantum state" has puzzled physicists in the last half century, since the earlier theoretical studies of Fano [10] (see also Pauli in Ref. [11]). That more than two observables (actually a complete set of them, a so-called quorum of observables [12], [13]) are needed for a complete determination of the quantum state was immediately clear [10]. However, in those years it was hard to devise concretely measurable observables other than position, momentum and energy (Royer pointed out that instead of measuring varying observables one can vary the state itself in a controlled way, and measure e.g. just its energy [16]). For this reason, the fundamental problem of determining the quantum state remained at the level of mere speculation for many years. The issue finally entered the realm of experiments only less than twenty years ago, after the pioneering experiments by Raymer's group [17], in the domain of quantum optics. Why quantum optics? Because in quantum optics, differently from particle physics, there is the unique opportunity of measuring all possible linear combinations of position and momentum of a harmonic oscillator, representing a single mode of the electromagnetic field. Such a measurement can be achieved by means of a balanced homodyne detector, which measures the quadrature $X_\phi = \frac{1}{2}\left(a^\dagger e^{i\phi} + a e^{-i\phi}\right)$ of a field mode at any desired phase $\phi$ with respect to the local oscillator (LO) [as usual $a$ denotes the annihilator of the field mode].

The first technique to reconstruct the density matrix from homodyne measurements, so called homodyne tomography, originated from the observation by Vogel and Risken [18] that the collection of probability distributions $\{p(x,\phi)\}_{\phi\in[0,\pi)}$ is just the Radon transform (i.e. the tomography) of the Wigner function $W$. Therefore, by a Radon transform inversion, one can obtain $W$, and from $W$ the matrix elements of the density operator $\rho$. This first method, however, works fine only for high numbers of photons or for almost classical states, whereas in the truly quantum regime it is affected by the smoothing needed for the Radon transform inversion. The main physical tool, however (i.e. using homodyning), was a perfectly good idea: one just needed to process the experimental data properly.

In Ref. [19] the first exact technique was given for measuring experimentally the matrix elements of $\rho$ in the photon-number representation, by just averaging functions of homodyne data. After that, the method was further simplified [20], and the feasibility for nonunit quantum efficiency $\eta < 1$ at detectors (above some bounds) was established. Further improvements in the numerical algorithms made the method so simple and fast that it could be implemented easily on small PCs, and the method became quite popular in the laboratories (for the earlier progress and improvements the reader can see the old review [21]). In the meanwhile there has been an explosion of interest in the subject of measuring quantum states, with hundreds of papers, both theoretical and experimental. The exact homodyne method has been implemented experimentally to measure the photon statistics of a semiconductor laser [22], and the density matrix of a squeezed vacuum [23]. The success of optical homodyne tomography has then stimulated the development of state reconstruction procedures for atomic beams [24], the experimental determination of the vibrational state of a molecule [25], of an ensemble of helium atoms [26], and of a single ion in a Paul trap [27], and different state reconstruction methods have been proposed (for an extensive list of references of these first pioneering years, see e.g. Ref. [28]).

Later the method of quantum homodyne tomography has been generalized to the estimation of an arbitrary observable of the field [29], with any number of modes [30], and to arbitrary quantum systems via group theory [31], [32], [33], and with a general method for unbiasing noise [31], [32]. Eventually, it was recognized that the general data-processing is just an application of the theory of operator expansions [34], [35], which led to identifying quantum tomography as an informationally complete measurement [36], a generalization of the concept of quorum of observables [12], [13].

State reconstruction was extended to the case where an incomplete measurement is performed. In this case the reconstruction of the full density matrix of the system is actually impossible, and one can only estimate the state that best fits the measured data applying Jaynes's maximum entropy principle (MaxEnt) [14]. When one has some non-trivial prior information the fit can be improved by minimizing the Kullback-Leibler distance from a given state which represents this a priori information [15].

At the same time, as an alternative to the averaging data-processing strategy of the original method [19], the possibility was recognized in Refs. [37], [38] of implementing a maximum likelihood strategy for reconstructing the diagonal of the density matrix, and later for the full matrix [39]. An advantage of the maximum likelihood strategy is that the density matrix is constrained to be positive, whereas positivity can be violated in the fluctuations of the averaging strategy. In addition, the maximum likelihood often allows one to reduce dramatically the number of experimental data needed for achieving the same statistical error, at the expense of a bias, which is however negligible in many cases of practical interest. However, there is a drawback: the need of estimating the full density matrix (the strategy is essentially a maximization of the joint probability of the full data-set over all possible density matrices, or Bayesian variations of such maximization accounting for prior knowledge [40]). This, on one side, requires a cutoff of the dimension of the Hilbert space when infinite (such as for the harmonic oscillator, as in homodyne tomography), thus introducing the mentioned bias; on the other side it has computational and memory complexities which increase exponentially with the number of systems for a joint tomography on multiple systems. On the contrary, the averaging strategy for any desired expectation value needs just to average a single function of the experimental outcome, without needing the full matrix, and this includes as a special case the evaluation of a single matrix element itself, hence without necessitating a dimensional cutoff.

Contemporary to this preliminary evolution of data-processing methods, there has been also a parallel evolution in the tomographic setup design. It was realized that it is possible to tomograph not only states, but also channels [41], [42], the so-called (standard quantum) process tomography (SQPT), based on the idea of tomographing the outputs of a channel corresponding to a set of input states making an operator basis for all density matrices. However, soon later it was recognized (first for the diagonal matrix elements in the number basis of an optical process [43], then in general for any channel [44], [45]) that the same process tomography can actually be achieved using just a single input state entangled (with maximal Schmidt number) with an ancilla, the so-called ancilla-assisted process tomography (AAPT), exploiting the "quantum parallelism" of the entangled input state which plays the role of a "superposition of all possible input states". This can have a great experimental advantage when the basis of states is not easily achievable experimentally, whereas the entangled state is, as in the case of homodyne tomography where it is easy to achieve such an entangled state from parametric down-conversion of vacuum, whereas it is hard to achieve photon-number states (see however Ref. [46], where a set of random coherent states has been proposed as a basis). As later proved in Ref. [47], and experimentally verified in Ref. [48], almost any joint system-ancilla state can be exploited for AAPT. On the other hand, the same AAPT has been extended to quantum operations and to measuring apparatuses [49], [50] (former theoretical proposals for calibration of detectors were published without ancilla [52], [53], and even ancilla-assisted [54]). Later, by another kind of quantum parallelism, it was recognized that one can also estimate the ensemble average of all operators of a quantum system by measuring only one fixed "universal" entangled observable on an extended Hilbert space [55], a truly universal observable. At this point the tomographic method had reached the stage in which a single fixed apparatus (single preparation of the input and single observable at the output) is needed, in principle reducing enormously the experimental complexity for joint tomography on many systems (complexity 1 versus exponential complexity).

After the first experimental SQPT by NMR [56], AAPT was experimentally proved in Refs. [57], [48], for quantum operations on photon polarization, exploiting spontaneous parametric downconversion in a nonlinear crystal as a source of entangled states.

As in the case of state tomography, the freedom in the choice of the experimental configuration poses the natural question of what is the optimal setup for a given figure of merit. In Ref. [58] the issue of minimizing the number of different experimental configurations needed for process tomography was raised again, and the so-called Direct Characterization of Quantum Dynamics (DCQD) setup was introduced for qubits, later generalized to arbitrary finite dimensional systems [59]. The proposed protocol starts from the expression of the Choi-Jamiołkowski operator (also called $\chi$-matrix) $\mathcal{C}(\rho) = \sum_{mn}\chi_{mn} A_m \rho A_n^\dagger$ of the quantum channel $\mathcal{C}$ operating on the input state $\rho$, choosing for the basis $\{A_m\}$ the shift-and-multiply group elements, and then uses techniques from error detection for the estimation of the parameters $\chi_{mn}$ from estimated error probabilities. The DCQD approach is interesting because of the interpretation of process tomography in terms of error detection; however, it does not provide any optimality argument in terms of number of experimental configurations, apart from a vague resource analysis [60]. A similar scheme was introduced in Ref. [61], where the authors provide a method for process tomography that allows one to separately reconstruct the Choi operator matrix elements in a fixed basis, based on Haar-distributed input state sampling. The authors exploit spherical 2-designs [62] in order to discretize the required averaging over the group SU(d).

In more recent years some experiments in the continuous-variable domain were performed both for process tomography [63] and for measurement calibration [64]; however, both experiments exploited the SQPT technique, while no AAPT experiments with continuous variable systems have been reported so far. Many tomographic experiments on different kinds of quantum systems have been performed, like atoms in optical lattices [65], cold ions in Paul traps [66], NMR probed molecules [67], solid state qubits [68], and quantum optic cavity modes interacting with atoms [69].

In the last decade the interest in quantum tomography grew very fast with the increasing number of applications in the hot field of quantum information, allowing testing the accuracy of state-preparation and calibration of quantum gates and measuring apparatuses. One should realize that the whole technology of quantum information crucially depends on the reliability of processes, sources and detectors, and on precise knowledge of sources of noise and errors. For example, all error correction techniques are based on the knowledge of the noise model, which is a prerequisite for an effective design of correcting codes [70], [71], [72], and quantum process tomography allows a reliable reconstruction of the noise and its decoherence free subspaces without recurring to prior assumptions on the noisy channels [73]. The increasingly high confidence in the tomographic technique, with the largest imaginable flexibility of data-processing, and its expansion outside the optical domain into the whole physical domain, grew the appetite of experimentalists and theoreticians, posing increasingly challenging problems. The relevant issues were now to establish the optimal tomographic setups and data-processing, and to minimize the physical resources, handling increasingly large numbers of quantum systems jointly. Regarding this last point, a relevant issue is the exponentially increasing dimension of the Choi operator of the quantum process versus the number of systems involved, and methods for safely neglecting irrelevant parameters in multiple qubit noise model reconstruction have been introduced [74] based on assumptions of qubit noise independence and Markovianity. In Ref. [75], methods to tackle the case of sparse Choi matrices are shown, expressing the minimum $\ell_1$-norm distance criterion in terms of a standard convex optimization problem. On the problem of optimizing data-processing, on the other hand, upper bounds on the minimal Hilbert-Schmidt distance between the estimated and the actual Choi-Jamiołkowski state have been derived [76] exploiting spherical 2-designs. It can be shown that minimizing such a distance is equivalent to minimizing the statistical error in the estimation of any ensemble average evolved by the channel. On the other hand, a systematic way of posing the problem of optimizing the data-processing is to fix a cost-function (depending on the purpose of the tomographic reconstruction), and minimize the average cost, the canonical procedure in quantum estimation theory [77].

The optimal data-processing for any measurement (in finite dimensions) for estimating the expectation of any observable with minimum error was derived in Ref. [7]. On the other hand, regarding the optimal setups, an approach based on the theory of quantum combs and quantum testers [9], [78] has been introduced, which allowed one to determine the optimal schemes (minimizing the statistical error in estimating expectation values) for all of the three kinds of tomography: state, process, and measurement [8] (quantum combs and quantum testers generalize the notion of channels and POVMs). The optimal setups use up to three ancillas, and need only a single input state (with bipartite entanglement only) and the measurement of a Bell basis, with variable local unitary shifts of the ancillas. Exploiting the same approach, incomplete process tomography has been addressed in Ref. [79] using "process entropy", the analogue of the max-entropy method [14] for process tomography.

III. METHODOLOGY

In the following we will treat linear operators $X$ from $\mathcal{H}_0$ to $\mathcal{H}_1$ as elements of a vector space, and the following formula is very useful

$$|X\rangle\!\rangle := \sum_{m=0}^{d_1-1}\sum_{n=0}^{d_0-1} X_{mn}\,|m\rangle_1 |n\rangle_0. \tag{1}$$

In Eq. (1) $d_i$ denotes the dimension of the Hilbert space $\mathcal{H}_i$, $\{|n\rangle_i\}$ are orthonormal bases for $\mathcal{H}_i$ for $i = 0, 1$, and $X_{mn}$ are the matrix elements of $X$ on the same orthonormal bases.

A general mathematical framework for quantum tomography was introduced in Refs. [34], [35], based on spanning sets of observables called quorums. Here we will review the more general approach based on informationally complete measurements [5], [6]. A POVM is a set of positive operators $\{P_l\}$ that add up to the identity. The method is based on operator expansions, and we will show how expanding operators on a POVM can be
used to reconstruct their expectation values on the state of the measured system. The aim of a tomographic reconstruction is to obtain the ensemble expectation of an operator $X$ by averaging some function $f_l[X]$ depending on the outcome of a suitable POVM $\{P_l\}$. We require the procedure to be unbiased, namely the reconstruction must be as follows

$$\langle X\rangle_\rho = \sum_l f_l[X]\,p(l|\rho), \qquad p(l|\rho) := \mathrm{Tr}[\rho P_l]. \tag{2}$$

Whatever notion of convergence one uses, the requirement of unbiasedness implies, by the polarization identity, that the following expansion for the operator $X$ holds

$$X = \sum_l f_l[X]\,P_l, \tag{3}$$

where the sum can be replaced by an integral in the case of a continuous outcome set (the expansion clearly is defined for a weakly convergent sum, meaning that Eq. (2) holds for all states $\rho$). The general reconstruction method consists in finding expansion coefficients $f_l[X]$, and then averaging them over the outcomes $l$. In this way one can define the expansion for general bounded operators $X$. Further extensions of the definition in Eq. (3) to unbounded operators can be obtained requiring the convergence of Eq. (2) for states $\rho$ in a dense set $\mathcal{S}$ (e.g. finite energy states). A particularly simple case is that of operators on finite dimensional Hilbert spaces, or of Hilbert-Schmidt operators in infinite dimensional spaces, since in these cases the space of operators is a Hilbert space itself, equipped with the Hilbert-Schmidt product $\langle\!\langle A|B\rangle\!\rangle := \mathrm{Tr}[A^\dagger B]$, and convergence of Eq. (3) can be defined in the Hilbert-Schmidt norm $\|X\| := \sqrt{\langle\!\langle X|X\rangle\!\rangle}$.

Clearly, the use of the formula in Eq. (2) for estimation of $\langle X\rangle_\rho$ (for all $\rho\in\mathcal{S}$ and for all $X$ such that $\mathrm{Tr}[\rho X]$ is defined on $\mathcal{S}$) is possible iff $\{P_l\}$ is a complete set in the space of linear operators. Such a POVM is called informationally complete [4]. For the sake of simplicity, in the following we will restrict attention to the case of Hilbert-Schmidt operators $X$. Eq. (3) defines a linear map $\Lambda$ from the vector space of coefficients $f := (f_l)$ to linear operators as follows

$$\Lambda f = \sum_l f_l P_l, \tag{4}$$

whose domain contains all the vectors $f$ such that the sum in Eq. (4) converges (either in Hilbert-Schmidt norm or weakly).

As we mentioned before, a reconstruction strategy requires a choice of coefficients $f[X]$ for any operator $X$, such that $\Lambda f[X] = X$. In algebraic terms, the choice corresponds to a generalized inverse $\Gamma$ of $\Lambda$ defined by $\Lambda\Gamma\Lambda = \Lambda$, so that $f[X] = \Gamma(X)$ [80]. When the set $\{P_l\}$ is not linearly independent the inverse $\Gamma$ is not unique, and this implies that one can choose the coefficients $f[X]$ according to some optimality criterion, as we will explain in Sec. VIII. Notice that by linearity, any inverse $\Gamma$ provides a dual spanning set $\{Q_l\}$ whose matrix elements are $(Q_l)^*_{mn} := f_l[E_{mn}]$, with $E_{mn} := |m\rangle\langle n|$, namely $f_l[X] = \mathrm{Tr}[Q_l^\dagger X]$.

As we will see in the next sections, for finite dimensional systems the theory of generalized inverses is sufficient for classifying all possible expansions and consequently deriving the optimal coefficients $f[X]$ for a fixed POVM $\{P_l\}$ [7], [84]. On the other hand, the full classification of inverses $\Gamma$ and consequent optimization is a still unsolved problem for infinite dimensional systems, for which alternative approaches are useful [83].

A. Frames

In this subsection we will review the relevant results in the theory of frames on Hilbert spaces, which is useful for dealing with POVMs on infinite dimensional systems [83], where a classification of all inverses $\Gamma$ is still lacking. The method for evaluating possible inverses provided in Refs. [34], [35] is an orthogonalization algorithm (similar to the customary Gram-Schmidt method) based on the assumption that the POVM is a frame [81] in the Hilbert space of Hilbert-Schmidt operators, namely that the two following inequalities hold for some constants $0 < a \le b < \infty$

$$a\,\|X\|^2 \le \sum_l |\langle\!\langle P_l|X\rangle\!\rangle|^2 \le b\,\|X\|^2. \tag{5}$$

Equivalently, $\{P_l\}$ is a frame iff its frame operator

$$F := \sum_l |P_l\rangle\!\rangle\langle\!\langle P_l|, \tag{6}$$

is bounded and invertible with bounded inverse. The theory of frames provides a (partial) classification of inverses $\Gamma$ in terms of dual frames for $\{P_l\}$, namely those frames $\{Q_l\}$ such that the following identity holds in the vector space of operators

$$\sum_l |P_l\rangle\!\rangle\langle\!\langle Q_l| = I. \tag{7}$$

While the orthogonalization method is effective in providing adequate coefficients $f[X]$ for the purpose of evaluating the expectation value of operators $X$, it may be inefficient in minimizing the statistical errors, since the orthogonalization would be equivalent to discarding experimental data. On the other hand, using the method of alternate duals of a frame allows one to use all experimental data in the most efficient way, according to any chosen criterion, such as minimizing the statistical error. We will now show how the method works. The canonical dual frame is defined as

$$|D_l\rangle\!\rangle := F^{-1}|P_l\rangle\!\rangle, \tag{8}$$

and it trivially satisfies Eq. (7). All alternate dual frames of a fixed frame $\{P_l\}$ are classified in Ref. [82], and they are given by the following expression

$$Q_l = D_l + Y_l - \sum_j \langle\!\langle D_l|P_j\rangle\!\rangle\, Y_j, \tag{9}$$

where $Y_l$ is arbitrary, provided that the sum $\sum_j \langle\!\langle D_l|P_j\rangle\!\rangle Y_j$ converges. It is clear from the definition in Eq. (7) that any dual frame $\{Q_l\}$ corresponds to an inverse $\Gamma$, via the identification $\Gamma(X) = f[X]$, with the coefficients given by

$$f_l[X] := \langle\!\langle Q_l|X\rangle\!\rangle. \tag{10}$$

For finite dimensions also the converse is true, namely any inverse $\Gamma$ provides a dual set $\{Q_l\}$ which is a frame. However, in the infinite dimensional case it is not guaranteed that all the dual sets corresponding to inverses $\Gamma$ are frames themselves.

The results in this subsection can be generalized to frames for bounded operators (for the theory of frames for Banach spaces, see Ref. [85]) by weakening the definition of convergence of the sums in Eqs. (6), (7), and (9).

IV. WHAT YOU NEED TO MEASURE FOR TOMOGRAPHY

As we mentioned in the previous section, the use of a detector whose statistics is described by an informationally complete POVM $\{P_l\}$ allows the reconstruction of any expectation value (including those of external products $|m\rangle\langle n|$, namely matrix elements in a fixed representation). Under the assumption that every repetition of the experiment is independent, it is indeed sufficient to find a set of coefficients $f[X]$, and to average it with the experimental frequencies $\nu_l := n_l/N$ ($n_l$ is the number of times outcome $l$ occurred, and $N$ is the total number of repetitions). The estimated expectation is then

$$\overline{X} := \sum_l \nu_l f_l[X] \simeq \langle X\rangle_\rho, \tag{11}$$

where the symbol $\simeq$ means that by the law of large numbers the l.h.s. converges in probability to the r.h.s.

A. Informationally complete measurements

Informationally complete measurements play a relevant role in foundations of quantum mechanics, constituting a kind of standard reference measurement with respect to which all quantum quantities are defined. They have been used as a tool to assess general foundational issues, such as in the proof of the quantum version of the de Finetti theorem [86]. One of the most popular examples of informationally complete measurement is the coherent-state POVM for harmonic oscillators, which is used in particular in quantum optics. Its probability distribution is the so-called Q-function (or Husimi function). Other examples are the quorums of observables, such as the set of quadratures of the harmonic oscillator, which was the first kind of informationally complete measurement considered for quantum tomography [18]. The use of the notion of informational completeness has also led to advancements on other relevant conceptual issues, such as the problem of joint measurements of non-commuting observables [87].

B. Quorums

A quorum of observables $\{X_\xi\}_{\xi\in\mathsf{X}}$ is a set of independent observables ($[X_\xi, X_{\xi'}] = 0$ only if $\xi = \xi'$), with spectral resolution $E_\xi(\mathrm{d}x)$ and spectrum $\mathsf{X}_\xi$, such that the statistics of their outcomes $x\in\mathsf{X}_\xi$ allows one to reconstruct average values of an arbitrary operator $X$ as follows

$$\langle X\rangle_\rho = \int_{\mathsf{X}} \mu(\mathrm{d}\xi)\,\langle f_\xi(X_\xi, X)\rangle_\rho, \tag{12}$$

where $\mu(\mathrm{d}\xi)$ is a probability measure on $\mathsf{X}$ and $f_\xi(x, X)$ is a complex function of $x\in\mathsf{X}_\xi$ called tomographic estimator, enjoying the following properties

• In order to have bounded variance in the estimation, $f_\xi(x, X)$ is square summable with respect to the measure $\mu(\mathrm{d}\xi)\langle E_\xi(\mathrm{d}x)\rangle_\rho$ for all $\rho$ in the set of interest $\mathcal{S}$, namely

$$\int_{\mathsf{X}} \mu(\mathrm{d}\xi) \int_{\mathsf{X}_\xi} \langle E_\xi(\mathrm{d}x)\rangle_\rho\, |f_\xi(x, X)|^2 < \infty, \tag{13}$$

for $X$ such that $|\mathrm{Tr}[\rho X]| < \infty$ and for all $\rho\in\mathcal{S}$.

• For a fixed $x$, $f_\xi(x, X)$ is linear in $X$, namely

$$f_\xi(x, aX + bY) = a f_\xi(x, X) + b f_\xi(x, Y), \qquad f_\xi(x, X^\dagger) = f_\xi(x, X)^*. \tag{14}$$

The problem of tomography is to find all possible correspondences $X \leftrightarrow f_\xi(x, X)$, namely all possible estimators. Usually quorums are obtained from observable spanning sets $\{F_\omega\}_{\omega\in\Omega}$, satisfying

$$X = \int_\Omega \mathrm{d}\omega\; c_\omega[X]\, F_\omega, \tag{15}$$

where the measure $\mathrm{d}\omega$ may be unnormalizable. However, this feature is usually due to redundancy of the set $\Omega$, which may be partitioned into sets $K_\xi$ of observables such that for all $F_\omega\in K_\xi$ one has $[F_\omega, X_\xi] = 0$ for a fixed observable $X_\xi$. The set $K_\xi$ then corresponds to the observable $X_\xi$ in the quorum, so that for $F_\omega\in K_\xi$ we can write $F_\omega := F_{(\xi,\kappa)} = f_\kappa(X_\xi)$. Under standard hypotheses $\mathrm{d}\omega$ can be decomposed as $\mu(\mathrm{d}\xi)\nu_\xi(\mathrm{d}\kappa)$, where $\nu_\xi(\mathrm{d}\kappa)$ is the measure on $K_\xi$ induced by $\mathrm{d}\omega$, and Eq. (15) can be rewritten as

$$X = \int_{\mathsf{X}} \mu(\mathrm{d}\xi) \int_{K_\xi} \nu(\mathrm{d}\kappa)\; c_{(\xi,\kappa)}[X]\, f_\kappa(X_\xi). \tag{16}$$

The last expression has the form of Eq. (12) with the choice of tomographic estimators provided by

$$f_\xi(x, X) := \int_{K_\xi} \nu(\mathrm{d}\kappa)\; c_{(\xi,\kappa)}[X]\, f_\kappa(x). \tag{17}$$

Notice that in the case of a quorum the possibility of optimizing the estimator depends on non-uniqueness of the estimator, which is equivalent to the existence of null functions, namely functions $n_\xi(x)$ such that

$$\int_{\mathsf{X}} \mu(\mathrm{d}\xi) \int_{\mathsf{X}_\xi} E_\xi(\mathrm{d}x)\, n_\xi(x) = 0. \tag{18}$$

C. Group tomography

In this subsection we will review the approach to quantum tomography based on group representations, which was introduced in Ref. [32], and then exploited in Refs. [33], [89]. The method exploits the following group theoretical identity, holding for unitary irreducible representations $U(g)$ of a unimodular group $G$

$$\int_G \mathrm{d}g\; U(g)\, X\, U(g)^\dagger = \mathrm{Tr}[X]\, I, \tag{19}$$

where $\mathrm{d}g$ is the invariant Haar measure of $G$ normalized to 1 [we recall that a group is unimodular when the left-invariant measure is equal to the right-invariant one]. In the following we will consider compact Lie groups (such as the rotation group or the group of unitary transformations), which are necessarily unimodular. However, the identity can be extended to square summable representations of noncompact unimodular groups [92], allowing for extension of group tomography to the noncompact group SU(1, 1) [90], [91], along with the Euclidean group on the complex plane (which is the case of homodyne tomography). We will exploit the following identities coming from the correspondence of Eq. (1)

$$A\otimes B\,|C\rangle\!\rangle = |A C B^T\rangle\!\rangle, \qquad \mathrm{Tr}_1\big[|A\rangle\!\rangle\langle\!\langle B|\big] = A^T B^*, \tag{20}$$

where $X^T$ and $X^*$ denote the transpose and the complex conjugate of $X$, respectively, on the bases of Eq. (1). Using Eqs. (19) and (20) one obtains $\int_G \mathrm{d}g\, |U(g)\rangle\!\rangle\langle\!\langle U(g)| = I$, which implies the following reconstruction formula

$$X = \int_G \mathrm{d}g\; \mathrm{Tr}[U^\dagger(g) X]\, U(g). \tag{21}$$

Under the hypothesis that the group manifold is connected, the exponential map $e^{i\psi\, \mathbf{n}\cdot\mathbf{T}}$ covers the whole group, $T_i$ denoting the generators of the Lie algebra representation and $\mathbf{n}$ being a normalized real vector. The integral can then be rewritten as follows

$$X = \int_\Psi \mu(\mathrm{d}\psi) \int_{S^n} \nu(\mathrm{d}\mathbf{n})\; \mathrm{Tr}\big[e^{-i\psi\,\mathbf{n}\cdot\mathbf{T}} X\big]\, e^{i\psi\,\mathbf{n}\cdot\mathbf{T}}. \tag{22}$$

By exchanging the two integrals over $\psi$ and $\mathbf{n}$, the integral over $\psi$ is evaluated analytically, whereas the integral over $\mathbf{n}$ is sampled experimentally. The practical problem is then to measure $\mathbf{n}\cdot\mathbf{T}$. A way is to start from a finite maximal set of commuting observables, say $\{T_\nu\}$ (these make the so-called Cartan abelian subalgebra of the Lie algebra), and achieve the observables of the quorum by evolving $T_\nu$ with the group $G$ of physical transformations in the Heisenberg picture, e.g. by preceding the $T_\nu$-detectors with an apparatus that performs the transformations of $G$. For example, for the group SU(2) the generators are the angular momentum components $J_i$, and a quorum is provided by the set of all angular momentum operators $\mathbf{J}\cdot\mathbf{n}$ on the sphere $\mathbf{n}\in S^2$ [89], which can be obtained by measuring $J_z$ after a rotation of the state.

The use of group representations provides also a tool for constructing covariant informationally complete POVMs. A covariant POVM with respect to the representation $U(g)$ of the group $G$ is a POVM with the following form

$$P(\mathrm{d}g) = \mathrm{d}g\; U(g)\,\xi\, U(g)^\dagger, \tag{23}$$

where $\xi \ge 0$ is called seed and must be such that $\int_G P(\mathrm{d}g) = I$. The informational completeness can be required through the invertibility condition for the frame operator in Eq. (6), which rewrites

$$F = \int_G \mathrm{d}g\; U(g)\otimes U(g)^*\, |\xi\rangle\!\rangle\langle\!\langle \xi|\, U(g)^\dagger\otimes U(g)^T. \tag{24}$$

A general classification of covariant informationally complete measurements has been given in Ref. [6].

V. METHODS OF DATA PROCESSING

Given a detector corresponding to an informationally complete POVM, one can use either the theory of generalized inverses or the theory of frames to find a suitable data processing to reconstruct all the parameters of a quantum state. However, the processing is usually not unique, and this feature leaves room for optimization. One can indeed choose a figure of merit and look for the processing that optimizes it for a fixed POVM. This step is mandatory for a fair comparison between two POVMs, and a comparison without optimization generally leads to a wrong choice of POVM. Before reviewing recent results on optimization of processing and POVMs in Sec. VIII, we summarize the main approaches to data-processing, along with the corresponding figures of merit.

A. The unbiased averaging method: tomography as indirect estimation

Quantum tomography can be regarded as a special case of indirect estimation [87], in which the informationally complete detector allows one to indirectly estimate without bias any expectation value. From this point of view, a very natural figure of merit in judging a data processing strategy is the statistical error in the reconstruction of expectations. The statistical error occurring when the processing in Eq. (3) is used has the following expression

$$\Delta(X)^2_{\rho,\nu} := \Big|\sum_l f_l[X]\,\nu_l - \langle X\rangle_\rho\Big|^2, \tag{25}$$

where the frequencies $\nu_l$ have a multinomial distribution $p_N(\nu|\rho) := \frac{N!}{\prod_l n_l!}\prod_l \mathrm{Tr}[\rho P_l]^{N\nu_l}$. Notice that this reconstruction is unbiased for any $N$, since averaging the reconstructed expectation in Eq. (11) over all possible experimental outcomes provides exactly $\langle X\rangle_\rho$. On the other hand, averaging the statistical error over all possible experimental outcomes provides the following expression

$$\Delta(X)^2_\rho := \frac{\sum_l |f_l[X]|^2\, p(l|\rho) - |\langle X\rangle_\rho|^2}{N}. \tag{26}$$

Finally, this quantity depends on the state $\rho$, and in order to remove this dependence we consider a Bayesian setting in which the measured state is assumed to be distributed according to a prior probability $p(\rho)$. Averaging the error over the prior distribution finally provides, up to the factor $1/N$,

$$\delta(X)_{\mathcal{E}} := \sum_l |f_l[X]|^2\, p(l|\rho_{\mathcal{E}}) - \overline{|\langle X\rangle|^2}_{\mathcal{E}}, \tag{27}$$

where $\rho_{\mathcal{E}} := \int \mathrm{d}\rho\, p(\rho)\,\rho$, and $\overline{f(\rho)}_{\mathcal{E}} := \int \mathrm{d}\rho\, p(\rho) f(\rho)$. In Refs. [93], [6], the expression in Eq. (27) was considered as a figure of merit for judging the quality of the reconstruction provided by the processing coefficients $f[X]$ with a fixed POVM $\{P_l\}$. In Sect. VIII we will show how the optimal processing [7] can be derived within this framework.

B. The maximum likelihood method

The unbiased averaging method can generally lead to expectations that are unphysical, e.g. violating the positivity of the density operator. This fact has led some authors to adopt data processing algorithms based on the maximum likelihood criterion, which allows one to constrain the estimated state to be physical [37], [38]. However, it actually does not make much difference if the deviation from the true value results in a physical or unphysical state: is it better to guess a physical state that is far from the true one, or to guess an unphysical one that is close to the true one? Indeed, as we have already discussed in Sect. II, the maximum likelihood is generally biased, and the physical constraint may result e.g. in the state being pure when instead the true state is mixed. A Bayesian variation of the maximum-likelihood method was proposed in Ref. [40], in order to avoid such a feature.

A comprehensive maximum-likelihood approach has been given in Ref. [39]. The likelihood is a functional $L[\rho]$ over the set of states that evaluates the probability that the state $\rho$ produces the experimental outcomes summarized by the frequencies $\nu$, and has the following expression

$$L[\rho] := \prod_l p(l|\rho)^{\nu_l} = \Big(\prod_l p(l|\rho)^{n_l}\Big)^{\frac{1}{N}}. \tag{28}$$

It is convenient to define the following functional, which is just the logarithm of $L[\rho]$

$$\mathcal{L}[\rho] := \frac{1}{N}\sum_l n_l \log p(l|\rho), \tag{29}$$

whose maximization is equivalent to the maximization of $L[\rho]$. The positivity constraint on $\rho$ is achieved by substituting it with $T^\dagger T$ in Eq. (29), thus defining a functional $\mathcal{L}'[T]$, and introducing a Lagrange multiplier to account for the condition $\mathrm{Tr}[T^\dagger T] = 1$. Eq. (29) provides a natural interpretation of the maximum likelihood criterion in terms of the Kullback-Leibler divergence $D(\nu\|p)$, where $p_l := p(l|\rho)$. Indeed, the Kullback-Leibler distance of the probability distribution $p$ from experimental frequencies $\nu$ has the following expression

[...]

Convergence is assured by convexity and differentiability of the functional to be maximized over the convex set of states. However, the derivatives of $\mathcal{L}[\rho]$ with respect to some of the parameters defining $\rho$ can be very small, so that very different values of the parameters will give almost the same likelihood, thus making it hard to judge whether the point reached at a given iteration step is a good approximation of the point corresponding to the maximum: in such a case the problem becomes numerically ill conditioned, with an extremely low convergence rate.

C. Unbiasing known noise

In this subsection we will show how the unbiased averaging method explained in Subsect. V-A can be applied also in the presence of a known noise disturbing the measurement, provided that the quantum channel describing the noise is invertible [32]. The unbiasing method is the following. Suppose that the noisy channel $\mathcal{N}$ (in the Heisenberg picture) affects the system before it is measured by the detector corresponding to the POVM $\{P_l\}$. Then the measured POVM is actually $\{\mathcal{N}(P_l)\}$, which for invertible $\mathcal{N}$ is still informationally complete. The reconstruction formula Eq. (3) then becomes

$$X = \mathcal{N}\mathcal{N}^{-1}(X) = \mathcal{N}\Big(\sum_l f_l[\mathcal{N}^{-1}(X)]\, P_l\Big) = \sum_l f_l[\mathcal{N}^{-1}(X)]\, \mathcal{N}(P_l).$$
(35) X νl D(ν p) = νl log , (30) l || pl l Using the statistics from the measurement of N (Pl) it P is then possible to unbias the noise by averaging{ } the and since S(ν) := l νl log νl is fixed, the minimization N of the distance is equivalent− to the maximization of functions f[N −1(X)]. In all known cases, the coefficients f [Z] are obtained as f [Z] = Tr[Q†Z] for a dual frame X 1 X l l l νl log pl = nl log pl [ρ]. (31) Ql , and consequently the coefficients for unbiasing are N ≡ L † −1 † l l { } −1 Tr[Ql N (Z)] = Tr[N∗ (Ql )Z], where N∗ denotes the The maximization over ρ with the positivity and normalization Schrodinger¨ picture of the channel N . As we will see in constraints can thus be interpreted as the choice of a physical the following, usually the procedure for unbiasing the noise state ρ such that its has the minimum increases the statistical error. For examples of noise-unbiasing Kullback-Leibler distance from the experimental frequencies. see Refs. [95], [94]. The statistical motivation for the maximum likelihood es- timator resides in the following argument. Given a family VI.THEQUANTUMSYSTEMS of probability distributions p(x; θ) in x, depending on a A. Qubits multidimensional parameter θ, the Fisher information matrix The case of a two-dimensional quantum system (qubit) is can be defined as follows the easiest example. Any operator on a qubit space can be ∂p(x; θ) ∂p(x; θ) written as F (θ)mn := . (32) 3 ∂θm ∂θn x 1 X X = (Tr[X]I + Tr[Xσ ]σ ), (36) 2 i i Upon defining the covariance matrix for an estimator θˆ as i=1 follows where σ are the Pauli matrices. The reconstruction of the ˆ ˆ i Σmn := (θm θm)(θn θn) x, (33) expectation X ρ can be obtained by measuring the ob- h − − i h i one has the Cramer-Rao´ bound servables σi (namely the POVM collecting their eigenstates, 1/3 ψi± ψi± ) and then averaging the function 1 | ih | Σ F (θ)−1, (34) 1 ≥ N f [X] = ( 3Tr[Xσ ] + Tr[X]). (37) i± 2 i ˆ ± which is independent of the estimator θ. 
It can be proved that Also noise unbiasing is particularly easy in this case. Consider when the bound is tight the maximum-likelihood estimator for example a depolarizing channel Dp acting in the Heisen- saturates asymptotically for large N. berg picture as The maximization of the functional [ρ] is a nonlinear L p convex programming problem, and can be solved numerically. Dp(X) = (1 p)X + Tr[X]I, (38) − 2 8 with 0 p < 1. The unbiased estimator is then A. Tomography of channels ≤ 3 1 A quantum channel describes the most general evolution fi±[X] = Tr[Xσi] + Tr[X]. (39) ±2(1 p) 2 that a quantum system can undergo. It must satisfy three main − requirements: linearity, complete positivity, and preservation The physical realization of a qubit in quantum optics is of trace (the physical motivation of complete positivity is the dual rail encoding involving two modes (typically two that the transformation must preserve positivity of states also different polarization in the same spatial mode) with the when applied locally to a bipartite system). Probabilistic logical states 0 L and 1 L corresponding to 0 1 and 1 0 , transformations—so-called quantum operations—enjoy linear- | i | i | i| i | i| i respectively. ity and complete positivity, but generally decrease the trace. The tomography of channels is strictly related to the pos- sibility of imprinting all the information about a quantum B. Continuous variables transformation on a quantum state [44], formally expressed by The term continuous variables in the literature has become the Choi-Jamiołkowski correspondence between a channel C : a synonym of quantum mechanics of a radiation mode (har- ( 0) ( 1) and a positive operator RC ( 1 0) monic oscillator) with creation and annihilation operators a LdefinedH → as L H ∈ L H ⊗ H † and a . 
A spanning set of observables for linear operators RC := (C I )( I I ), (42) on such system is the displacement representation D(α) := ⊗ | iihh | † ∗ I eαa −α a of the Weyl-Heisenberg group, parametrized by where I is the identity map and 0 0. The correspondence can be inverted as follows| ii ∈ H ⊗ H α C, for which the following identity holds ∈ T C (ρ) = Tr0[(I ρ )R ], (43) Z d2 α ⊗ C D(α) D(α) = I. (40) π and this implies that determining RC is equivalent to de- C | iihh | termining C . While complete positivity of C corresponds Notice that we use of the term observable to designate any to positivity of R , trace preservation corresponds to the normal operator X such that [X,X†] = 0, so that its C condition Tr1[R ] = I. The reconstruction of the channel real and imaginary parts (X† + X)/2 and i(X† X)/2, C C can then obtained preparing the maximally entangled state respectively, are simultaneously diagonalizable, and− unitary 2 1/d( I I ), applying the channel locally and then recon- operators like D(α) are indeed normal. The measure d α on | iihh | −1 structing the output state d RC . More generally it can be the Complex plane C is unnormalizable, and plays the role of shown that one can use any bipartite input state R as an input the measure d ω of Eq. (15). However, for α with argument state, as long as it is connected to the maximally entangled i|α|Xφ arg α = φ π/2 we have [D(α),Xφ] = [e ,Xφ] = 0, state 1/d( I I ) by an invertible channel [47]. Such a state is X −:= 1 (a†eiφ + ae−iφ) | iihh | where φ 2 are the field quadratures. called faithful. This situation is actually forced in the infinite Thus, we can take D(α) as the set Fω of Eq. (15), { } { } dimensional case, where the vector I is not normalizable, and the quadratures Xφ as the quorum observables Xξ. The | ii 2 and e. g. one can use as a faithful state the twin-beam R d α R π d φ R +∞ |k| † † integral can be separated as d k, and 2 a a a a C π 0 π −∞ 4 T (λ) = (1 λ ) λ λ [47], [48]. 
since the integral over d k is included in the definition of the − | | | iihh | estimators f (X ,X) as in Eq. (17), the remaining integral is φ φ B. Tomography of measurements the one on d φ which is bounded and can be sampled from a uniform distribution on [0, π). The homodyne technique then The statistics and dynamics of a general quantum mea- consists in measuring the informationally complete POVM surement are described by a quantum instrument, that is a d φ set of quantum operations such that P = is trace x x φ d x (where x φ are Dirac eigenvectors of the Ei i Ei E | ih | π | i preserving. Their Choi operators satisfy R = P R , and quadrature Xφ), for suitably sampled values of φ, and then E i Ei averaging the estimators. The final reconstruction formula is the POVM describing the statistics of the measurement is the following provided by Pi := Tr1[REi ]. Similarly to the case of quantum channels, one can reconstruct quantum operations, along with Z π Z ∞ d φ the whole instrument corresponding to a measurement [50]. X ρ = d x fφ(x, X) x x φ ρ, (41) h i 0 π −∞ h| ih | i The tomography of the POVM can be obtained also for measurements that destroy the system (such as in photo- R +∞ |k| † iφ ikx with fφ(x, X) = −∞ 4 d k Tr[D (ke )X]e . detection), exploiting the following argument introduced in Ref. [49]. If we consider a faithful state T , then measuring the POVM Pi on 1 we have the following conditional VII.TOMOGRAPHYOFDEVICES { } H state on 0 Since the publication of Refs. [41], [44] most of the efforts H Tr1[(Pi I)T ] ρi := ⊗ . (44) in quantum tomography were directed to the reconstruction Tr[(Pi I)T ] of devices, that consists in using the techniques for state ⊗ Tomographing ρ and collecting the statistics of outcomes i, reconstruction to the problem of characterizing the behavior i one can reconstruct Pi by inverting the map T (P ) = Tr[(P of a quantum device, like a channel [43], a quantum operation I)T ] as follows ⊗ [50] or a POVM [49]. 
In the following subsections we will −1 review the main issues of these techniques. Pi = Tr[(Pi I)T ]T (ρi). (45) ⊗ 9
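As a numerical illustration of the qubit formulas of Sec. VI-A, the following minimal sketch checks that averaging the estimators of Eqs. (37) and (39) over the outcome probabilities returns the exact expectation, with and without depolarizing noise. The function names and the test state and operator are hypothetical choices made only for this example; exact probabilities are used in place of sampled frequencies.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),      # sigma_x
    np.array([[0, -1j], [1j, 0]], dtype=complex),   # sigma_y
    np.array([[1, 0], [0, -1]], dtype=complex),     # sigma_z
]

def depolarize(X, p):
    """Heisenberg-picture depolarizing channel D_p of Eq. (38)."""
    return (1 - p) * X + 0.5 * p * np.trace(X) * I2

def estimator(X, i, sign, p=0.0):
    """f_{i,+-}[X] of Eq. (39); setting p = 0 gives Eq. (37)."""
    return sign * 3.0 / (2.0 * (1.0 - p)) * np.trace(X @ PAULI[i]) + 0.5 * np.trace(X)

def reconstructed_expectation(rho, X, p=0.0):
    """Sum over outcomes of f_l[X] times the probability of the noisy POVM D_p(P_l)."""
    total = 0.0
    for i, sigma in enumerate(PAULI):
        for sign in (+1, -1):
            P = (I2 + sign * sigma) / 6.0            # (1/3)|psi_{i,+-}><psi_{i,+-}|
            prob = np.trace(rho @ depolarize(P, p)).real
            total += estimator(X, i, sign, p) * prob
    return total

# Hypothetical test data: a mixed qubit state and a Hermitian operator.
rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
X = np.array([[1.0, 2.0 - 1.0j], [2.0 + 1.0j, -0.5]])
exact = np.trace(rho @ X)
print(abs(reconstructed_expectation(rho, X) - exact))         # noiseless case
print(abs(reconstructed_expectation(rho, X, p=0.3) - exact))  # depolarizing noise, unbiased
```

Both printed differences should vanish up to rounding. In an experiment the exact probabilities would be replaced by measured frequencies, and the factor $1/(1-p)$ in Eq. (39) is what enlarges the statistical error when the noise is unbiased.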

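The Choi-Jamiołkowski correspondence of Eqs. (42) and (43) can be illustrated numerically. The sketch below assumes a channel specified by Kraus operators (an amplitude-damping channel, chosen here only as a hypothetical example), builds $R_{\mathcal C}$, and verifies that Eq. (43) reproduces the action of the channel. The function names are illustrative and do not come from the original text.

```python
import numpy as np

def choi(kraus):
    """R_C = sum_a |K_a>><<K_a| on H_1 (x) H_0, with |K>> = sum_jm K_jm |j>|m>."""
    vecs = [K.reshape(-1) for K in kraus]
    return sum(np.outer(v, v.conj()) for v in vecs)

def apply_via_choi(R, rho, d_out, d_in):
    """Eq. (43): C(rho) = Tr_0[(I (x) rho^T) R_C]."""
    M = (np.kron(np.eye(d_out), rho.T) @ R).reshape(d_out, d_in, d_out, d_in)
    return np.einsum("ikjk->ij", M)      # partial trace over the input space H_0

def trace_out_output(R, d_out, d_in):
    """Tr_1[R_C]; equals the identity for a trace-preserving channel."""
    return np.einsum("ijik->jk", R.reshape(d_out, d_in, d_out, d_in))

# Hypothetical example: amplitude damping with damping probability gamma.
gamma = 0.3
KRAUS = [
    np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex),
    np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex),
]
R = choi(KRAUS)
rho = np.array([[0.6, 0.3j], [-0.3j, 0.4]])
direct = sum(K @ rho @ K.conj().T for K in KRAUS)

print(np.allclose(apply_via_choi(R, rho, 2, 2), direct))   # Eq. (43) reproduces the channel
print(np.allclose(trace_out_output(R, 2, 2), np.eye(2)))   # trace preservation
print(np.linalg.eigvalsh(R).min() >= -1e-12)               # complete positivity
```

In a tomographic experiment $R_{\mathcal C}$ would not be computed from Kraus operators but estimated as $d$ times the reconstructed output state obtained from the maximally entangled input.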
VIII. OPTIMIZATION

In this section we show the full optimization of quantum tomographic setups for finite-dimensional states, channels and measurements, according to the figure of merit defined in Eq. (27). Optimizing quantum tomography is a complex task that can be divided in two main steps. The first optimization stage involves a fixed detector, and only regards the data processing, namely the choice of the inverse $\Gamma$ used to determine the expansion coefficients $f[X]$ for a fixed $X$. As we will prove in the following, $\Gamma$ is independent of $X$, and only depends on the ensemble $\mathcal E$. The second stage consists in optimizing the average statistical error on a given set of observables with respect to the POVM, namely the detector itself.

A. Optimization of data-processing

In this section we review the data processing optimization, giving the full derivation in the case of state tomography. Optimizing the data processing means choosing the best $\Gamma$ according to the figure of merit. As proposed in section V-A, a natural figure of merit for the estimation of the expectation $\langle X\rangle_\rho$ of an observable $X$ is the average statistical error; this is given by the variance $\delta(X)_{\mathcal E}$ of the random variable $f_l[X]$ with probability distribution $\mathrm{Tr}[\rho_{\mathcal E}P_l]$, namely $\delta(X)_{\mathcal E}$ defined in Eq. (27). The only term in Eq. (27) that depends on $f[X]$ is $\sum_l|f_l[X]|^2p(l|\rho_{\mathcal E})$, which can be expressed as a norm in the space $\mathcal K$ of coefficients

$$\sum_l |f_l[X]|^2\,p(l|\rho_{\mathcal E}) = \|f[X]\|^2_\pi, \tag{46}$$

where $\|c\|^2_\pi := \sum_{lm}c_l^*\pi_{lm}c_m$, with

$$\pi_{lm} = \delta_{lm}\,p(l|\rho_{\mathcal E}). \tag{47}$$

It is now clear that minimizing the statistical error in Eq. (27) is equivalent to minimizing the norm $\|f[X]\|_\pi$. In terms of $\pi$ we define the minimum norm generalized inverse $\Gamma$: this is a generalized inverse that satisfies [84]

$$\pi\Gamma\Lambda = \Lambda^\dagger\Gamma^\dagger\pi. \tag{48}$$

$\Gamma$ has the property that for all $A\in\mathrm{Rng}(\Lambda)$, $f[A] = \Gamma(A)$ is a solution of the equation $\Lambda f[A] = A$ with minimum norm. Notice that the present definition of minimum norm generalized inverse requires that the norm is induced by a scalar product (in our case $a\cdot b := \sum_{lm}a_l^*\pi_{lm}b_m$). It can be shown that the minimum norm $\Gamma$ is unique and does not depend on $X$; the corresponding optimal dual is given by [7]

$$\Gamma = \Lambda^\ddagger - \big([(I-M)\pi(I-M)]^\ddagger\,\pi M\big)\Lambda^\ddagger, \tag{49}$$

where $M := \Lambda^\ddagger\Lambda$ and $\Lambda^\ddagger$ denotes the Moore-Penrose generalized inverse of $\Lambda$, satisfying $\Lambda^\ddagger\Lambda\Lambda^\ddagger = \Lambda^\ddagger$ and $\Lambda^\ddagger\Lambda = (\Lambda^\ddagger\Lambda)^\dagger$. We would like to stress that as long as the figure of merit can be expressed as a norm in $\mathcal K$ induced by a scalar product, the optimal processing represented by $\Gamma$ does not depend on $X$. Using the minimum of the expression in Eq. (46), the figure of merit can be rewritten as [96]

$$\delta(X)_{\mathcal E} = \langle\!\langle X|Y^{-1}|X\rangle\!\rangle - \overline{|\langle X\rangle|^2}_{\mathcal E}, \tag{50}$$

where we defined

$$Y = \sum_j\frac{|P_j\rangle\!\rangle\langle\!\langle P_j|}{\mathrm{Tr}[\rho_{\mathcal E}P_j]}. \tag{51}$$

B. Optimization of the setup

1) Short Review on Quantum Comb Theory: In this section we give a brief review of the general theory of quantum circuits, as developed in [9], [78], [97].

A quantum comb describes a quantum circuit board, namely a network of quantum devices with open slots in which variable subcircuits can be inserted. A board with $(N-1)$ open slots has $N$ input and output systems, labeled by even numbers from $0$ to $2N-2$ and by odd numbers from $1$ to $2N-1$, respectively, as in Fig. 1. The internal connections of the circuit board determine a causal structure, according to which the input system $m$ cannot influence the output system $n$ if $m>n$. Moreover, two circuit boards $C_1$ and $C_2$ can be connected by linking some outputs of $C_1$ with inputs of $C_2$, thus forming a new board $C_3 := C_1 * C_2$. We adopt the convention that wires that are connected are identified by the same label (see Fig. 2).

Fig. 1. A quantum comb with N slots. Information flows from left to right. The causal structure of the comb implies that the input system m cannot influence the output system n if m > n.

Fig. 2. Linking of two combs. We identify the wires with the same label.

The quantum comb associated to a circuit board with $N$ input/output systems is a positive operator acting on the Hilbert space $\mathcal H_{out}\otimes\mathcal H_{in}$, where $\mathcal H_{out} := \bigotimes_{j=0}^{N-1}\mathcal H_{2j+1}$ and $\mathcal H_{in} := \bigotimes_{j=0}^{N-1}\mathcal H_{2j}$, $\mathcal H_n$ being the Hilbert space of the $n$-th system. For a deterministic circuit board (i.e. a network of quantum channels) the causal structure is equivalent to the recursive normalization condition

$$\mathrm{Tr}_{2k-1}[R^{(k)}] = I_{2k-2}\otimes R^{(k-1)}, \qquad k=1,\dots,N, \tag{52}$$

where $R^{(N)} = R$, $R^{(0)} = 1$, $R^{(k)}\in\mathcal L\big(\bigotimes_{n=0}^{2k-1}\mathcal H_n\big)$, $\mathcal H_{2n}$ denoting the Hilbert space of the $n$-th input, and $\mathcal H_{2n+1}$ that of the $n$-th output. We call a positive operator $R$ satisfying Eq. (52) a deterministic quantum comb. We can also consider probabilistic combs, which are defined as the Choi-Jamiołkowski operators of probabilistic circuit boards (i.e. networks of quantum operations). A network containing measuring devices will then be described by a set of probabilistic combs $\{R_i\}$, where the index $i$ represents a classical outcome. The normalization of probabilities implies that the sum over all outcomes $R = \sum_iR_i$ has to be a deterministic quantum comb.

The connection of two circuit boards is represented by the link product of the corresponding combs $R_1$ and $R_2$, which is defined as

$$R_1 * R_2 = \mathrm{Tr}_{\mathcal K}[R_1^{\theta_{\mathcal K}}R_2], \tag{53}$$

$\theta_{\mathcal K}$ denoting partial transposition over the Hilbert space $\mathcal K$ of the connected systems (recall that we identify with the same labels the Hilbert spaces of connected systems). Note that Eq. (43), which gives the action of a channel $\mathcal C$ on a state $\rho$ in terms of the Choi operator $C$, can be rewritten using the link product as $\mathcal C(\rho) = C*\rho$. Moreover, when variable circuits with Choi operators $C_j\in\mathcal L(\mathcal H_{2j}\otimes\mathcal H_{2j-1})$, $j=1,\dots,N-1$, are inserted as inputs in the slots of the circuit board, one obtains as output the quantum operation $\mathcal C'$ given by

$$C' = R * C_1 * C_2 * \cdots * C_{N-1}. \tag{54}$$

According to the above equation, quantum combs describe all possible manipulations of quantum circuits, thus generalizing the notions of quantum channel and quantum operation to the case of transformations where the input is not a quantum system, but rather a set of quantum operations. An important example of such transformations is that of quantum testers, i.e. transformations that take circuits as the input and provide probabilities as the output. A tester is a set of probabilistic combs $\{\Pi_i\}$ with one-dimensional spaces $\mathcal H_0$ and $\mathcal H_{2N-1}$, with the sum $\Pi = \sum_i\Pi_i$ being a deterministic comb satisfying Eq. (52). When connecting the tester with another circuit board $R$ we obtain the probabilities $p_i = \Pi_i * R = \mathrm{Tr}[\Pi_i^{T}R]$, which, apart from the transpose (which can be reabsorbed into the definition of the tester), is nothing but the generalization of the Born rule for quantum networks. In the particular case of testers with a single slot, the tester is a set of probabilistic combs $\{\Pi_i\in\mathcal L(\mathcal H_{out}\otimes\mathcal H_{in})\}$, and its normalization becomes

$$\sum_i\Pi_i = I_{out}\otimes\sigma, \qquad \sigma\ge 0, \quad \mathrm{Tr}[\sigma] = 1. \tag{55}$$

When connecting a channel $\mathcal C$ to the tester, the latter provides the outcome $i$ with probability

$$p_i = \mathrm{Tr}[R_{\mathcal C}\Pi_i], \tag{56}$$

where $R_{\mathcal C}$ is the Choi operator of $\mathcal C$.

It is easy to see that every tester $\{\Pi_i\}$ can be realized with the following physical scheme: i) prepare the pure state $|\sqrt\sigma^{\,T}\rangle\!\rangle\in\mathcal H_{in}\otimes\mathcal H_{in}$, ii) apply the channel $\mathcal C$ on one side of the entangled state, iii) measure the joint POVM $\{P_i = \Pi^{-\frac12}\Pi_i\Pi^{-\frac12}\}$, where $\Pi^{-1}$ is the g-inverse of $\Pi$. With this scheme one has indeed

$$p_i = \mathrm{Tr}[P_i(\mathcal C\otimes\mathcal I)(|\sqrt\sigma^{\,T}\rangle\!\rangle\langle\!\langle\sqrt\sigma^{\,T}|)] = \mathrm{Tr}[\Pi_iR_{\mathcal C}]. \tag{57}$$

Tomographing a quantum transformation $\mathcal T$ means using a suitable tester $\{\Pi_i\}$ such that the expectation value of any other possible measurement can be inferred from the probability distribution $p_i = \mathrm{Tr}[R_{\mathcal T}\Pi_i]$. In order to achieve this task we have to require that $\{\Pi_i\}$ is an operator frame for $\mathcal L(\mathcal H_{out}\otimes\mathcal H_{in})$. This means that we can expand any operator on $\mathcal H_{out}\otimes\mathcal H_{in}$ as follows

$$A = \sum_l\langle\!\langle\Delta_l|A\rangle\!\rangle\,\Pi_l, \qquad A\in\mathcal B(\mathcal H_{out}\otimes\mathcal H_{in}), \tag{58}$$

where we use the fact that for all generalized inverses $\Gamma$ one has $f_l[X] = \mathrm{Tr}[\Delta_l^\dagger X]$, with $\{\Delta_l\}$ a possible dual spanning set of $\{\Pi_l\}$ satisfying the condition $\sum_i|\Pi_i\rangle\!\rangle\langle\!\langle\Delta_i| = I_{out}\otimes I_{in}$.

Optimizing the tomography of quantum transformations means minimizing the statistical error in the determination of the expectation of a generic operator $A$ as in Eq. (58). The optimization of the dual frame follows exactly the same lines as for state tomography and gives the same result of Eq. (50), provided that i) the POVM $\{P_i\}$ is replaced by the tester $\{\Pi_i\}$, and ii) the ensemble $\mathcal E$ becomes an ensemble $\mathcal E = \{R_k,p_k\}$ of possible transformations and the average state $\rho_{\mathcal E}$ becomes the average Choi operator $R_{\mathcal E}$.

2) Derivation of the optimized setup: In this section we address the problem of the optimization of the tester $\{\Pi_i\}$. A priori one can be interested in some observables more than in others, and this can be specified in terms of a weighted set of observables $\mathcal G = \{X_n,q_n\}$, with weight $q_n>0$ for the observable $X_n$. The optimal tester depends on the choice of $\mathcal G$, as we will prove in the following. We can assume that we have already optimized the data-processing, so that the minimum statistical error averaged over $\mathcal G$ is

$$\delta_{\mathcal E,\mathcal G} := \sum_n q_n\langle\!\langle X_n|Y^{-1}|X_n\rangle\!\rangle - \sum_n q_n\overline{|\langle X_n\rangle|^2}_{\mathcal E}. \tag{59}$$

Notice that only the first addendum of Eq. (59) depends on the tester, so we just have to minimize

$$\eta_{\mathcal E,\mathcal G} := \mathrm{Tr}[Y^{-1}G], \tag{60}$$

where $G = \sum_nq_n|X_n\rangle\!\rangle\langle\!\langle X_n|$.

In the following, for the sake of clarity we will consider $\dim(\mathcal H_1) = \dim(\mathcal H_2) =: d$, and focus on the "symmetric" case $G = I$; this happens for example when the set $\{X_n\}$ is an orthonormal basis whose elements are equally weighted. Moreover, we assume that the averaged channel of the ensemble $\mathcal E$ is the maximally depolarizing channel, whose Choi operator is $R_{\mathcal E} = d^{-1}I\otimes I$. Since $R_{\mathcal E}$ is invariant under the action of $SU(d)\times SU(d)$, we now show that it is possible to impose the same covariance also on the tester without increasing the value of $\eta_{\mathcal E,\mathcal G}$. Let us define

$$\Pi_{i,g,h} := (U_g\otimes V_h)\,\Pi_i\,(U_g^\dagger\otimes V_h^\dagger), \tag{61}$$

$$\Delta_{i,g,h} := (U_g\otimes V_h)\,\Delta_i\,(U_g^\dagger\otimes V_h^\dagger). \tag{62}$$

It is easy to check that $\Delta_{i,g,h}$ is a dual of $\Pi_{i,g,h}$. In fact, using the identity in Eq. (20), we have

$$\sum_i\int dg\,dh\,|\Pi_{i,g,h}\rangle\!\rangle\langle\!\langle\Delta_{i,g,h}| = \tag{63}$$

$$\int dg\,dh\,W_{gh}\Big(\sum_i|\Pi_i\rangle\!\rangle\langle\!\langle\Delta_i|\Big)W_{gh}^\dagger = I_{out}\otimes I_{in}, \tag{64}$$

where $dg$ and $dh$ denote the Haar measure normalized to unity, and $W_{gh} := (U_g\otimes V_h)\otimes(U_g^*\otimes V_h^*)$. Then we observe that the normalization of $\{\Pi_{i,g,h}\}$ gives

$$\sum_i\int dg\,dh\,\Pi_{i,g,h} = d^{-1}I\otimes I, \tag{65}$$

corresponding to $\sigma = d^{-1}I$ in Eq. (55), namely one can choose $\nu = d^{-1}|I\rangle\!\rangle\langle\!\langle I|$. It is easy to verify that the figure of merit for the covariant tester is the same as for the non covariant one, whence, without loss of generality, we optimize the covariant tester. The condition that the covariant tester is informationally complete with respect to the subspace of transformations to be tomographed will be verified after the optimization.

We note that a generic covariant tester is obtained by Eq. (61), with the operators $\Pi_i$ becoming seeds of the covariant POVM, and now being required to satisfy only the normalization condition

$$\sum_i\mathrm{Tr}[\Pi_i] = d \tag{66}$$

(the analogue of the covariant POVM normalization in [6], [98]). With the covariant tester and the assumptions $G = I$, $R_{\mathcal E} = d^{-1}I\otimes I$, Eq. (60) becomes

$$\eta_{\mathcal E,\mathcal G} = \mathrm{Tr}[\tilde Y^{-1}], \tag{67}$$

where

$$\tilde Y = \int dg\,dh\,\sum_i\frac{d\,|\Pi_{i,g,h}\rangle\!\rangle\langle\!\langle\Pi_{i,g,h}|}{\mathrm{Tr}[\Pi_{i,g,h}]} = \int dg\,dh\,W_{g,h}\,Y\,W_{g,h}^\dagger, \tag{68}$$

with $Y = \sum_i d\,|\Pi_i\rangle\!\rangle\langle\!\langle\Pi_i|/\mathrm{Tr}[\Pi_i]$. Using Schur's lemma we have

$$\tilde Y = P_1 + AP_2 + BP_3 + CP_4, \tag{69}$$

$$P_1 = \Omega_{13}\otimes\Omega_{24}, \qquad P_2 = (I_{13}-\Omega_{13})\otimes\Omega_{24},$$
$$P_3 = \Omega_{13}\otimes(I_{24}-\Omega_{24}), \qquad P_4 = (I_{13}-\Omega_{13})\otimes(I_{24}-\Omega_{24}),$$

having posed $\Omega = |I\rangle\!\rangle\langle\!\langle I|/d$ and

$$A = \frac{1}{d^2-1}\left\{\sum_i\frac{\mathrm{Tr}[(\mathrm{Tr}_2[\Pi_i])^2]}{\mathrm{Tr}[\Pi_i]} - 1\right\},$$
$$B = \frac{1}{d^2-1}\left\{\sum_i\frac{\mathrm{Tr}[(\mathrm{Tr}_1[\Pi_i])^2]}{\mathrm{Tr}[\Pi_i]} - 1\right\}, \tag{70}$$
$$C = \frac{1}{(d^2-1)^2}\left\{\sum_i\frac{d\,\mathrm{Tr}[\Pi_i^2]}{\mathrm{Tr}[\Pi_i]} - (d^2-1)(A+B) - 1\right\}.$$

One has

$$\mathrm{Tr}[\tilde Y^{-1}] = 1 + (d^2-1)\left(\frac1A + \frac1B + \frac{(d^2-1)}{C}\right). \tag{71}$$

Notice that if the ensemble of transformations is contained in a subspace $\mathcal V\subseteq\mathcal B(\mathcal H_2\otimes\mathcal H_1)$ the figure of merit becomes $\eta = \mathrm{Tr}[\tilde Y^\ddagger Q_{\mathcal V}]$. We now carry on the minimization for three relevant subspaces:

$$\mathcal Q = \mathcal B(\mathcal H_2\otimes\mathcal H_1), \qquad \mathcal C = \{R\in\mathcal Q,\ \mathrm{Tr}_2[R] = I_1\},$$
$$\mathcal U = \{R\in\mathcal Q,\ \mathrm{Tr}_2[R] = I_1,\ \mathrm{Tr}_1[R] = I_2\}, \tag{72}$$

corresponding respectively to quantum operations, general channels and unital channels. The subspaces $\mathcal C$ and $\mathcal U$ are invariant under the action of the group $\{W_{g,h}\}$ and thus the respective projectors decompose as

$$Q_{\mathcal C} = P_1 + P_2 + P_4, \qquad Q_{\mathcal U} = P_1 + P_4. \tag{73}$$

Without loss of generality we can assume the operators $\{\Pi_i\}$ to be rank one. In fact, suppose that $\Pi_i$ has rank higher than 1. Then it is possible to decompose it as $\Pi_i = \sum_j\Pi_{i,j}$ with $\Pi_{i,j}$ rank 1. The statistics of $\Pi_i$ can be completely recovered from that of $\Pi_{i,j}$ through a suitable post-processing. For the purpose of optimization it is then not restrictive to consider rank one $\Pi_i$, namely $\Pi_i = \alpha_i|\Psi_i\rangle\!\rangle\langle\!\langle\Psi_i|$, with $\sum_i\alpha_i = d$. Notice that all multiple seeds of this form lead to testers satisfying Eq. (66). In the three cases under examination, the figure of merit is then

$$\eta_{\mathcal Q} = \mathrm{Tr}[\tilde Y^{-1}] = 1 + (d^2-1)\left(\frac2A + \frac{(d^2-1)^2}{1-2A}\right),$$
$$\eta_{\mathcal C} = \mathrm{Tr}[\tilde Y^\ddagger Q_{\mathcal C}] = 1 + (d^2-1)\left(\frac1A + \frac{(d^2-1)^2}{1-2A}\right),$$
$$\eta_{\mathcal U} = \mathrm{Tr}[\tilde Y^\ddagger Q_{\mathcal U}] = 1 + (d^2-1)\left(\frac{(d^2-1)^2}{1-2A}\right), \tag{74}$$

where $0\le A = (d^2-1)^{-1}\big(\sum_i\alpha_i\mathrm{Tr}[(\Psi_i\Psi_i^\dagger)^2] - 1\big)\le\frac{1}{d+1}<\frac12$. The minimum can simply be determined by differentiating with respect to $A$, obtaining $A = 1/(d^2+1)$ for quantum operations, $A = 1/(\sqrt2(d^2-1)+2)$ for general channels and $A = 0$ for unital channels. The corresponding minimum for the figure of merit is

$$\eta_{\mathcal Q}\ge d^6+d^4-d^2,$$
$$\eta_{\mathcal C}\ge d^6+(2\sqrt2-3)d^4+(5-4\sqrt2)d^2+2(\sqrt2-1),$$
$$\eta_{\mathcal U}\ge(d^2-1)^3+1. \tag{75}$$

The same result for quantum operations and for unital channels has been obtained in [99] in a different framework.

These bounds are simply achieved by a single seed $\Pi_0 = d\,|\Psi\rangle\!\rangle\langle\!\langle\Psi|$, with

$$\mathrm{Tr}[(\Psi\Psi^\dagger)^2] = \frac{2d}{d^2+1}, \qquad \frac{\sqrt2(d^2-1)+1+d^2}{d\,(\sqrt2(d^2-1)+2)}, \qquad \frac1d, \tag{76}$$

respectively for quantum operations, general channels and unital channels, namely with

$$\Psi = [d^{-1}(1-\beta)I + \beta|\psi\rangle\langle\psi|]^{\frac12}, \tag{77}$$

where $\beta = [(d+1)/(d^2+1)]^{1/2}$ for quantum operations, $\beta = [(d+1)/(2+\sqrt2(d^2-1))]^{1/2}$ for general channels and $\beta = 0$ for unital channels, and $|\psi\rangle$ is any pure state. The informational completeness is verified if the operator

$$F = \int dg\,dh\,|\Pi_{0gh}\rangle\!\rangle\langle\!\langle\Pi_{0gh}| \tag{78}$$

is invertible, namely (see [6]) if, for every $i$,

$$\langle\!\langle\Psi|\langle\!\langle\Psi|\,P_i\,|\Psi\rangle\!\rangle|\Psi\rangle\!\rangle\neq 0, \tag{79}$$

which is obviously true for $\Psi$ defined in Eq. (77).
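The minimization leading to Eqs. (75)-(77) can be cross-checked numerically. The sketch below is only a consistency check under stated assumptions: it takes the figures of merit in the form $1+(d^2-1)\big(w/A+(d^2-1)^2/(1-2A)\big)$ with $w = 2, 1, 0$ for quantum operations, general channels and unital channels, evaluates them at the stated optimal $A$, and recomputes $A$ from the single seed of Eq. (77). All function names are hypothetical.

```python
import numpy as np

CASES = ("operations", "channels", "unital")

def eta(A, d, case):
    """Figure of merit of Eq. (74) as a function of A for the three subspaces."""
    k = d**2 - 1
    weight = {"operations": 2.0, "channels": 1.0, "unital": 0.0}[case]
    inverse_A_term = weight / A if weight else 0.0   # no 1/A term in the unital case
    return 1 + k * (inverse_A_term + k**2 / (1 - 2 * A))

def optimal_A(d, case):
    """Stationary points quoted after Eq. (74)."""
    k = d**2 - 1
    return {"operations": 1 / (d**2 + 1),
            "channels": 1 / (np.sqrt(2) * k + 2),
            "unital": 0.0}[case]

def eta_min(d, case):
    """Closed-form bounds of Eq. (75)."""
    s = np.sqrt(2)
    return {"operations": d**6 + d**4 - d**2,
            "channels": d**6 + (2 * s - 3) * d**4 + (5 - 4 * s) * d**2 + 2 * (s - 1),
            "unital": (d**2 - 1)**3 + 1}[case]

def seed_beta(d, case):
    """Values of beta quoted after Eq. (77)."""
    k = d**2 - 1
    return {"operations": np.sqrt((d + 1) / (d**2 + 1)),
            "channels": np.sqrt((d + 1) / (2 + np.sqrt(2) * k)),
            "unital": 0.0}[case]

def A_from_seed(d, beta):
    """A for the single seed Pi_0 = d|Psi>><<Psi| with Psi Psi^dagger from Eq. (77)."""
    psi = np.zeros(d)
    psi[0] = 1.0                                     # any pure state works
    PsiPsi = (1 - beta) / d * np.eye(d) + beta * np.outer(psi, psi)
    return (d * np.trace(PsiPsi @ PsiPsi) - 1) / (d**2 - 1)

for d in (2, 3):
    for case in CASES:
        A0 = optimal_A(d, case)
        print(d, case, eta(A0, d, case), eta_min(d, case), A_from_seed(d, seed_beta(d, case)))
```

For each dimension and case the second and third printed numbers should agree, and the last one should reproduce the optimal $A$, which confirms that the single seed of Eq. (77) attains the bounds of Eq. (75).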

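The optimal processing of Sec. VIII-A, Eq. (49), can also be verified on a toy example. The sketch below assumes a real, randomly generated overcomplete frame (the matrix `Lam`, whose columns play the role of vectorized POVM elements) and a diagonal weight `pi` as in Eq. (47). Both are hypothetical data, and the comparison with the standard weighted least-norm solution is an added consistency check, not part of the original derivation.

```python
import numpy as np

def min_norm_inverse(Lam, pi, rcond=1e-10):
    """Gamma of Eq. (49): Lam^+ - [(I-M) pi (I-M)]^+ pi M Lam^+, with M = Lam^+ Lam."""
    Lam_mp = np.linalg.pinv(Lam, rcond=rcond)        # Moore-Penrose inverse
    M = Lam_mp @ Lam
    N = np.eye(M.shape[0]) - M                       # projector on the kernel of Lam
    correction = np.linalg.pinv(N @ pi @ N, rcond=rcond) @ pi @ M @ Lam_mp
    return Lam_mp - correction

def pi_norm(c, pi):
    """Norm induced by the scalar product a.b = sum_lm a_l pi_lm b_m (real case)."""
    return float(np.sqrt(c @ pi @ c))

# Hypothetical toy data: 6 frame elements spanning a 3-dimensional real space.
rng = np.random.default_rng(0)
Lam = rng.normal(size=(3, 6))
p = rng.random(6) + 0.1
pi = np.diag(p / p.sum())                            # pi_lm = delta_lm p(l|rho_E)

Gamma = min_norm_inverse(Lam, pi)
a = rng.normal(size=3)                               # a generic target "operator"

print(np.allclose(Lam @ Gamma @ a, a))               # Gamma a solves Lam f = a
print(pi_norm(Gamma @ a, pi), pi_norm(np.linalg.pinv(Lam) @ a, pi))
```

The first norm printed should not exceed the second: the coefficients given by $\Gamma$ have a smaller weighted norm, hence a smaller statistical error in Eq. (27), than those given by the plain Moore-Penrose inverse.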
The same procedure can be carried out when the operator $G$ has the more general form $G = g_1P_1+g_2P_2+g_3P_3+g_4P_4$, where $P_i$ are the projectors defined in Eq. (69). In this case Eq. (71) becomes

$$\mathrm{Tr}[\tilde Y^{-1}G] = g_1 + (d^2-1)\left(\frac{g_2}{A}+\frac{g_3}{B}+\frac{(d^2-1)\,g_4}{C}\right), \tag{80}$$

which can be minimized along the same lines followed previously. $G$ has this form when optimizing measuring procedures of the following kind: i) preparing an input state randomly drawn from the set $\{U_g\rho U_g^\dagger\}$; ii) measuring an observable chosen from the set $\{U_hAU_h^\dagger\}$. With the same derivation, but keeping $\dim(\mathcal H_1)\neq\dim(\mathcal H_2)$, one obtains the optimal tomography for general quantum operations. The special case $\dim(\mathcal H_2) = 1$ (one has $P_3 = P_4 = 0$ in Eq. (69)) corresponds to optimal tomography of states, whereas the case $\dim(\mathcal H_1) = 1$ ($P_2 = P_4 = 0$) gives the optimal tomography of POVMs.

3) Experimental realization schemes: We now show how the optimal measurement can be experimentally implemented. Referring to Fig. 3, the bipartite system carrying the Choi operator of the transformation is indicated with the labels $S_1$ and $S_2$. We prepare a pair of ancillary systems $A_1$ and $A_2$ in the joint state $|\Psi\rangle\!\rangle\langle\!\langle\Psi|$, then we apply two random unitary transformations $U_1$ and $U_2$ to $S_1$ and $S_2$, and finally we perform a Bell measurement on the pair $A_1S_1$ and another Bell measurement on the pair $A_2S_2$. This experimental scheme realizes the continuous measurement by randomizing among a continuous set of discrete POVMs; this is a particular application of a general result proved in [101]. The proposed scheme is feasible using e.g. the Bell measurements experimentally realized in [100]. We note that choosing $|\Psi\rangle\!\rangle$ maximally entangled (as proposed for example in [60]) is generally not optimal, except for the unital case.

Fig. 3. Physical implementation of optimal quantum transformation tomography. The two measurements are Bell's measurements preceded by a random unitary. The state $|\Psi\rangle\!\rangle$ depends on the prior ensemble.

The experimental schemes for POVMs/states are obtained by removing the upper/lower branch for quantum operations, respectively. In the remaining branch the bipartite detector becomes mono-partite, performing a von Neumann measurement for the qudit, preceded by a random unitary in $SU(d)$. Moreover, for the case of POVMs, the state $|\Psi\rangle\!\rangle$ is missing, whereas, for state tomography, both bipartite states are missing. The optimal $\eta_{\mathcal E,\mathcal G}$ in Eq. (67) is given by $\eta = d^3+d^2-d$ in both cases (for state tomography compare with Ref. [76]).

REFERENCES

[1] M. G. A. Paris and J. Řeháček (Eds.), Quantum State Estimation, Lect. Notes Phys. 649 (Springer, Berlin-New York, 2004).
[2] G. M. D'Ariano, L. Maccone, and M. F. Sacchi, Homodyne tomography and the reconstruction of quantum states of light, in Quantum Information with Continuous Variables of Atoms and Light, Ed. by N. Cerf, G. Leuchs, and E. Polzik (Imperial College Press, London, 2007).
[3] G. M. D'Ariano, M. G. A. Paris, M. F. Sacchi, Quantum Tomography, Advances in Imaging and Electron Physics 128, 205-308 (2003).
[4] P. Busch, Informationally complete sets of physical quantities, Int. J. Theor. Phys. 30, 1217 (1991).
[5] G. M. D'Ariano, P. Perinotti, and M. F. Sacchi, Quantum Universal Detectors, Europhys. Lett. 65, 165 (2004).
[6] G. M. D'Ariano, P. Perinotti, and M. F. Sacchi, Informationally complete measurements and group representation, J. Opt. B: Quantum Semiclass. Opt. 6, S487-S491 (2004).
[7] G. M. D'Ariano and P. Perinotti, Optimal Data Processing for Quantum Measurements, Phys. Rev. Lett. 98, 020403 (2007).
[8] A. Bisio, G. Chiribella, G. M. D'Ariano, S. Facchini, and P. Perinotti, Optimal Quantum Tomography of States, Measurements, and Transformations, Phys. Rev. Lett. 102, 010404 (2009).
[9] G. Chiribella, G. M. D'Ariano, and P. Perinotti, Quantum Circuit Architecture, Phys. Rev. Lett. 101, 060401 (2008).
[10] U. Fano, Description of States in Quantum Mechanics by Density Matrix and Operator Techniques, Rev. Mod. Phys. 29, 74 (1957).
[11] W. Pauli, in Encyclopedia of Physics V (Springer, Berlin 1958) p. 17.
[12] W. Band and J. L. Park, The empirical determination of quantum states, Found. Phys. 1, 133 (1970).
[13] B. D'Espagnat, Conceptual Foundations of Quantum Mechanics (W. A. Benjamin, Mass. 1976).
[14] V. Bužek, Quantum tomography from incomplete data via MaxEnt principle, in Lect. Notes Phys. 649, 189 (2004).
[15] S. Olivares and M. G. A. Paris, Quantum estimation via the minimum Kullback entropy principle, Phys. Rev. A 76, 042120 (2007).
[16] A. Royer, Measurement of quantum states and the Wigner function, Found. Phys. 19, 3 (1989).
[17] D. T. Smithey, M. Beck, M. G. Raymer, and A. Faridani, Measurement of the Wigner distribution and the density matrix of a light mode using optical homodyne tomography - Application to squeezed states and the vacuum, Phys. Rev. Lett. 70, 1244 (1993).
[18] K. Vogel and H. Risken, Determination of quasiprobability distributions in terms of probability distributions for the rotated quadrature phase, Phys. Rev. A 40, 2847 (1989).
[19] G. M. D'Ariano, C. Macchiavello, and M. G. A. Paris, Detection of the density matrix through optical homodyne tomography without filtered back projection, Phys. Rev. A 50, 4298 (1994).
[20] U. Leonhardt, H. Paul and G. M. D'Ariano, Tomographic reconstruction of the density matrix via pattern functions, Phys. Rev. A 52, 4899 (1995).
[21] G. M. D'Ariano, Measuring Quantum States, in Quantum Optics and Spectroscopy of Solids, ed. by T. Hakioǧlu and A. S. Shumovsky (Kluwer Academic Publisher, Amsterdam, 1997), p. 175-202.
[22] M. Munroe, D. Boggavarapu, M. E. Anderson, and M. G. Raymer, Photon-number statistics from the phase-averaged quadrature-field distribution: Theory and ultrafast measurement, Phys. Rev. A 52, R924 (1995).
[23] S. Schiller, G. Breitenbach, S. F. Pereira, T. Müller and J. Mlynek, Quantum statistics of the squeezed vacuum by measurement of the density matrix in the number state representation, Phys. Rev. Lett. 77, 2933 (1996); G. Breitenbach, S. Schiller and J. Mlynek, Measurement of the quantum states of squeezed light, Nature 387, 471 (1997).
[24] U. Janicke and M. Wilkens, Tomography of Atom Beams, J. Mod. Opt. 42, 2183 (1995); S. Wallentowitz and W. Vogel, Reconstruction of the Quantum Mechanical State of a Trapped Ion, Phys. Rev. Lett. 75, 2932 (1995); S. H. Kienle, M. Freiberger, W. P. Schleich, and M. G. Raymer, in Experimental Metaphysics: Quantum Mechanical Studies for Abner Shimony, ed. by S. Cohen et al. (Kluwer, Lancaster, 1997) p. 121.
[25] T. J. Dunn, I. A. Walmsley and S. Mukamel, Experimental Determination of the Quantum-Mechanical State of a Molecular Vibrational Mode Using Fluorescence Tomography, Phys. Rev. Lett. 74, 884 (1995).
[26] C. Kurtsiefer, T. Pfau and J. Mlynek, Measurement of the Wigner function of an ensemble of helium atoms, Nature 386, 150 (1997).
[27] D. Leibfried, D. M. Meekhof, B. E. King, C. Monroe, W. M. Itano and D. J. Wineland, Experimental Determination of the Motional Quantum State of a Trapped Atom, Phys. Rev. Lett. 77, 4281 (1996).
[28] G. M. D'Ariano, M. Vasilyev, and P. Kumar, Self-homodyne tomography of a twin-beam state, Phys. Rev. A 58, 636 (1998).

[29] G. M. D’Ariano, Homodyning as universal detection, in Quantum Communication, Computing, and Measurement, edited by O. Hirota, A. S. Holevo, and C. M. Caves (Plenum Publishing, New York and London, 1997) p. 253.
[30] G. D’Ariano, P. Kumar, M. Sacchi, Universal homodyne tomography with a single local oscillator, Phys. Rev. A 61, 13806 (2000).
[31] G. M. D’Ariano, Latest developments in quantum tomography, in Quantum Communication, Computing, and Measurement, edited by P. Kumar, G. M. D’Ariano, and O. Hirota (Kluwer Academic/Plenum Publishers, New York and London, 2000) p. 137.
[32] G. M. D’Ariano, Universal quantum estimation, Phys. Lett. A 268, 151 (2000).
[33] G. Cassinelli, G. M. D’Ariano, E. De Vito, A. Levrero, Group Theoretical Quantum Tomography, J. Math. Phys. 41, 7940 (2000).
[34] G. M. D’Ariano, L. Maccone, and M. G. A. Paris, Orthogonality relations in Quantum Tomography, Phys. Lett. A 276, 25 (2000).
[35] G. M. D’Ariano, L. Maccone and M. G. A. Paris, Quorum of observables for universal quantum estimation, J. Phys. A: Math. Gen. 34, 93 (2001).
[36] E. Prugovečki, Information-theoretical aspects of quantum measurement, Int. J. Theor. Phys. 16, 321 (1977).
[37] Z. Hradil, Quantum-state estimation, Phys. Rev. A 55, R1561 (1997).
[38] K. Banaszek, Maximum-likelihood estimation of photon-number distribution from homodyne statistics, Phys. Rev. A 57, 5013 (1998).
[39] K. Banaszek, G. M. D’Ariano, M. G. A. Paris, M. F. Sacchi, Maximum-likelihood estimation of the density matrix, Phys. Rev. A 61, 010304(R) (2000).
[40] R. Blume-Kohout, Optimal, reliable estimation of quantum states, quant-ph/0611080.
[41] I. L. Chuang and M. A. Nielsen, Prescription for experimental determination of the dynamics of a quantum black box, J. Mod. Opt. 44, 2455 (1997).
[42] J. F. Poyatos, J. I. Cirac, and P. Zoller, Complete Characterization of a Quantum Process: The Two-Bit Quantum Gate, Phys. Rev. Lett. 78, 390 (1997).
[43] G. M. D’Ariano and L. Maccone, Measuring Quantum Optical Hamiltonians, Phys. Rev. Lett. 80, 5465 (1998).
[44] G. M. D’Ariano and P. Lo Presti, Quantum Tomography for Measuring Experimentally the Matrix Elements of an Arbitrary Quantum Operation, Phys. Rev. Lett. 86, 4195 (2001).
[45] D. Leung, Ph.D. thesis, Stanford University, comp-sci/0012017.
[46] M. F. Sacchi, Maximum-likelihood reconstruction of completely positive maps, Phys. Rev. A 63, 054104 (2001).
[47] G. D’Ariano and P. Lo Presti, Imprinting a complete information about a quantum channel on its output state, Phys. Rev. Lett. 91, 047902 (2003).
[48] J. B. Altepeter, D. Branning, E. Jeffrey, T. C. Wei, P. G. Kwiat, R. T. Thew, J. L. O’Brien, M. A. Nielsen, and A. G. White, Ancilla-Assisted Quantum Process Tomography, Phys. Rev. Lett. 90, 193601 (2003).
[49] G. M. D’Ariano, P. Lo Presti, and L. Maccone, Quantum Calibration of Measurement Instrumentation, Phys. Rev. Lett. 93, 250407 (2004).
[50] G. M. D’Ariano and P. Lo Presti, Characterization of Quantum Devices, Lect. Notes Phys. 649, 297-322 (Springer, Berlin-New York 2004).
[51] G. M. D’Ariano, M. G. A. Paris, and M. F. Sacchi, Quantum Tomographic Methods, Lect. Notes Phys. 649, 2-58 (Springer, Berlin-New York 2004).
[52] J. Fiurášek and Z. Hradil, Maximum-likelihood estimation of quantum processes, Phys. Rev. A 63, 020101(R) (2001).
[53] J. Fiurášek, Maximum-likelihood estimation of quantum measurement, Phys. Rev. A 64, 024102 (2001).
[54] A. Luis and L. L. Sanchez-Soto, Complete Characterization of Arbitrary Quantum Measurement Processes, Phys. Rev. Lett. 83, 3573 (1999).
[55] G. M. D’Ariano, Universal quantum observables, Phys. Lett. A 300, 1 (2002).
[56] A. M. Childs, I. L. Chuang, D. W. Leung, Realization of quantum process tomography in NMR, Phys. Rev. A 64, 012314 (2001).
[57] F. de Martini, A. Mazzei, M. Ricci, G. M. D’Ariano, Exploiting quantum parallelism of entanglement for a complete experimental quantum characterization of a single-qubit device, Phys. Rev. A 67, 062307 (2003).
[58] M. Mohseni and D. A. Lidar, Direct Characterization of Quantum Dynamics, Phys. Rev. Lett. 97, 170501 (2006).
[59] M. Mohseni and D. A. Lidar, Direct characterization of quantum dynamics: General theory, Phys. Rev. A 75, 062331 (2007).
[60] M. Mohseni, A. T. Rezakhani, and D. A. Lidar, Quantum-process tomography: Resource analysis of different strategies, Phys. Rev. A 77, 032322 (2008).
[61] A. Bendersky, F. Pastawski, J. P. Paz, Selective and Efficient Estimation of Parameters for Quantum Process Tomography, Phys. Rev. Lett. 100, 190403 (2008).
[62] P. Delsarte, J. M. Goethals, and J. J. Seidel, Spherical codes and designs, Geometriae Dedicata 6, 363 (1977).
[63] M. Lobino, D. Korystov, C. Kupchak, E. Figueroa, B. C. Sanders, and A. I. Lvovsky, Complete Characterization of Quantum-Optical Processes, Science 322, 563 (2008).
[64] J. S. Lundeen, A. Feito, H. Coldenstrodt-Ronge, K. L. Pregnell, C. Silberhorn, T. C. Ralph, J. Eisert, M. B. Plenio, I. A. Walmsley, Tomography of quantum detectors, Nature Physics 5, 27 (2008).
[65] S. H. Myrskog, J. K. Fox, M. W. Mitchell, and A. M. Steinberg, Quantum process tomography on vibrational states of atoms in an optical lattice, Phys. Rev. A 72, 013615 (2005).
[66] M. Riebe, M. Chwalla, J. Benhelm, H. Häffner, W. Hänsel, C. F. Roos, and R. Blatt, Quantum teleportation with atoms: quantum process tomography, New J. Phys. 9, 211 (2007).
[67] H. Kampermann and W. S. Veeman, Characterization of quantum algorithms by quantum process tomography using quadrupolar spins in solid-state nuclear magnetic resonance, J. Chem. Phys. 122, 214108 (2005).
[68] M. Howard, J. Twamley, C. Wittmann, T. Gaebel, F. Jelezko, and J. Wrachtrup, Quantum process tomography and Linblad estimation of a solid-state qubit, New J. Phys. 8, 33 (2006).
[69] M. Brune, J. Bernu, C. Guerlin, S. Deléglise, C. Sayrin, S. Gleyzes, S. Kuhr, I. Dotsenko, J. M. Raimond, and S. Haroche, Process Tomography of Field Damping and Measurement of Fock State Lifetimes by Quantum Nondemolition Photon Counting in a Cavity, Phys. Rev. Lett. 101, 240402 (2008).
[70] P. Shor, Scheme for reducing decoherence in quantum computer memory, Phys. Rev. A 52, R2493 (1995).
[71] A. M. Steane, Error Correcting Codes in Quantum Theory, Phys. Rev. Lett. 77, 793 (1996).
[72] E. Knill and R. Laflamme, Theory of quantum error-correcting codes, Phys. Rev. A 55, 900 (1997).
[73] M. W. Mitchell, C. W. Ellenor, R. B. A. Adamson, J. S. Lundeen, A. M. Steinberg, Quantum process tomography and the search for decoherence-free subspaces, in Quantum Information and Computation II, E. Donkor, A. R. Pirich, R. Andrew, H. E. Brandt eds., Proceedings of the SPIE, 5436, 223-231 (2004).
[74] J. Emerson, M. Silva, O. Moussa, C. Ryan, M. Laforest, J. Baugh, D. G. Cory, and R. Laflamme, Symmetrized Characterization of Noisy Quantum Processes, Science 317, 1893 (2007).
[75] R. L. Kosut, Quantum Process Tomography via L1-norm Minimization, arXiv:0812.4323.
[76] A. J. Scott, Tight informationally complete quantum measurements, J. Phys. A 39, 13507 (2006).
[77] C. W. Helstrom, Quantum detection and estimation theory (Academic Press, New York, San Francisco, London, 1976).
[78] G. Chiribella, G. M. D’Ariano, and P. Perinotti, Memory Effects in Quantum Channel Discrimination, Phys. Rev. Lett. 101, 180501 (2008).
[79] M. Ziman, Incomplete quantum process tomography and principle of maximal entropy, Phys. Rev. A 78, 032118 (2008).
[80] R. B. Bhapat, Linear Algebra and Linear Models (Springer-Verlag, New York, 2000).
[81] R. J. Duffin and A. C. Schaeffer, A class of nonharmonic Fourier series, Trans. Am. Math. Soc. 72, 341 (1952).
[82] S. Li, On general frame decompositions, Numer. Funct. Anal. Optim. 16, 1181 (1995).
[83] G. M. D’Ariano, M. F. Sacchi, Renormalized quantum tomography, arXiv:0901.2866.
[84] G. M. D’Ariano, D. F. Magnani, and P. Perinotti, Adaptive Bayesian and frequentist data processing for quantum tomography, Phys. Lett. A, doi:10.1016/j.physleta.2009.01.055.
[85] P. G. Casazza, D. Han, and D. Larson, Frames for Banach spaces, Contemp. Math. 247, 149-182 (1999).
[86] C. M. Caves, C. A. Fuchs, and R. Schack, Unknown quantum states: The quantum de Finetti representation, J. Math. Phys. 43, 4537 (2002).
[87] G. M. D’Ariano, P. Perinotti, and M. F. Sacchi, Quantum indirect estimation theory and joint estimates of all moments of two incompatible observables, Phys. Rev. A 77, 052108 (2008).
[88] G. M. D’Ariano, P. Perinotti, and M. F. Sacchi, Informationally complete measurements on bipartite quantum systems: Comparing local with global measurements, Phys. Rev. A 72, 042108 (2005).
[89] G. M. D’Ariano, L. Maccone, and M. Paini, Spin tomography, J. Opt. B 5, 77 (2003).
[90] G. M. D’Ariano, E. De Vito, and L. Maccone, SU (1,1) tomography, Phys. Rev. A 64, 033805 (2001).

[91] G. Chiribella, G. M. D’Ariano, and P. Perinotti, Applications of the group SU (1,1) for quantum computation and tomography, Laser Phys. 16, 1572 (2006).
[92] A. Grossmann, J. Morlet, and T. Paul, Transforms associated to square integrable group representations. I: General results, J. Math. Phys. 26, 2473 (1985).
[93] G. M. D’Ariano, P. Perinotti, and M. F. Sacchi, Optimization of Quantum Universal Detectors, in Squeezed States and Uncertainty Relations, ed. by H. Moya-Cessa, R. Jauregui, S. Hacyan, and O. Castanos (Rinton Press, Princeton, 2003) p. 86.
[94] G. M. D’Ariano, Tomographic methods for universal estimation in quantum optics, Scuola “E. Fermi” on Experimental Quantum Computation and Information, F. De Martini and C. Monroe ed. (IOS Press, Amsterdam, 2002) p. 385.
[95] G. M. D’Ariano and N. Sterpi, Robustness of Homodyne Tomography to Phase-Insensitive Noise, J. Mod. Optics 44, 2227 (1997).
[96] G. M. D’Ariano and P. Perinotti, Optimal estimation of ensemble averages from a quantum measurement, in Proceedings of the 8th Int. Conf. on Quantum Communication, Measurement and Computing, ed. by O. Hirota, J. H. Shapiro and M. Sasaki (NICT press, Japan, 2007), p. 327.
[97] G. Chiribella, G. M. D’Ariano, and P. Perinotti, Transforming quantum operations: quantum supermaps, Europhys. Lett. 83, 30004 (2008).
[98] A. S. Holevo, Probabilistic and Statistical Aspects of Quantum Theory, North Holland, Amsterdam, 1982.
[99] A. J. Scott, Optimizing quantum process tomography with unitary 2-design, J. Phys. A 41, 055308 (2008).
[100] P. Walther, A. Zeilinger, Experimental Realization of a Photonic Bell-State Analyzer, Phys. Rev. A 72, 010302(R) (2005).
[101] G. Chiribella, G. M. D’Ariano, D. M. Schlingemann, How continuous quantum measurements in finite dimension are actually discrete, Phys. Rev. Lett. 98, 190403 (2007).