Quantum Computation (CMU 15-859BB, Fall 2015) Lecture 22: Introduction to Quantum Tomography November 23, 2015 Lecturer: John Wright Scribe: Tom Tseng
1 Motivation
When conducting research in quantum physics, it is essential that, once we create or are otherwise in possession of a quantum state, we can determine what the state is with high accuracy. For instance, researchers have conducted various experiments that support that the quantum teleportation procedure, which we discussed in lecture 3, works. These experiments involve sending quantum states over long distances. But how does the receiving party verify that the state that they receive is the same as the one that was sent by the other party? This is a problem with significant practical importance and is precisely what tomography aims to solve.
2 The problem
2.1 Definition

Roughly speaking, tomography is the problem in which we are given an unknown mixed state ρ ∈ C^{d×d} and our goal is to “learn” what ρ is. We would like to take the input ρ, run it through some tomography algorithm, and output the classical d × d matrix describing ρ.
    quantum state ρ  →  [ Tomography algorithm ]  →  classical state: the d × d matrix (ρ_{i,j})_{i,j=1}^{d}
This is similar to hypothesis testing, which we have been discussing during the past few lectures. The difference is that here we have no prior information about what ρ is.

Actually, there is a problem with the above description: the goal is impossible to achieve. If we really had an algorithm that could completely learn everything about ρ, then we could replicate ρ, contradicting the no-cloning theorem.

What can we say about the state ρ, then? Not much. Suppose we measured ρ with respect to some basis {|v_1⟩, |v_2⟩, ..., |v_m⟩} and observed |v_i⟩. All we learn is that ⟨v_i|ρ|v_i⟩, the probability of observing |v_i⟩, is non-zero, and maybe we can further guess that ⟨v_i|ρ|v_i⟩ is somewhat large. This tells us little useful information about ρ.

Although no perfect tomography algorithm exists, we can make some adjustments to the conditions of the problem to make tomography more tractable. We saw last lecture that when using the Pretty Good Measurement for the hidden subgroup problem, we could generate several copies of the coset state in order to obtain measurement probabilities that were more favorable. That gives our first adjustment: we stipulate that we are given n ∈ N copies of ρ. In addition, the requirement that we output exactly ρ is too restrictive, so our second adjustment is that we only have to output some ρ̃ ∈ C^{d×d} that approximates ρ with high probability.
    quantum states ρ^{⊗n}  →  [ Tomography algorithm ]  →  classical state: the d × d matrix (ρ̃_{i,j})_{i,j=1}^{d}
This situation is more sensible. With n copies of ρ, perhaps we can learn enough information to make a fairly precise guess as to what ρ is. Experiments that test quantum teleportation are verified in this way. In 2012, a group of researchers teleported a quantum state between two islands 143 kilometers apart; the result was analyzed by applying quantum tomography to the state, which was sent many times over [MHS+12]. The above discussion motivates the following definition of tomography.
Definition 2.1. In the problem of quantum state tomography, the goal is to estimate an unknown d-dimensional mixed state ρ when given ρ^{⊗n}, where n, d ∈ N. Explicitly, we want a process with the following input and output.
Input: a tensor power ρ^{⊗n} of the unknown mixed state ρ ∈ C^{d×d}, and an error margin ε > 0.
Output: a state ρ̃ ∈ C^{d×d} such that d_tr(ρ, ρ̃) ≤ ε with high probability.
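The trace distance d_tr appearing in the output condition is half the sum of the absolute eigenvalues of the difference of the two density matrices. Here is a minimal sketch (not from the notes) of how one might compute it, illustrated on two single-qubit pure states:

```python
# Sketch: trace distance d_tr(rho, sigma) = (1/2) * ||rho - sigma||_1,
# computed from the eigenvalues of the (Hermitian) difference.
import numpy as np

def trace_distance(rho, sigma):
    """Half the sum of absolute eigenvalues of rho - sigma."""
    eigvals = np.linalg.eigvalsh(rho - sigma)  # difference is Hermitian
    return 0.5 * np.sum(np.abs(eigvals))

# Example: |0> vs |+> as density matrices.
psi = np.array([1.0, 0.0], dtype=complex)                # |0>
phi = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2)   # |+>
rho = np.outer(psi, psi.conj())
sigma = np.outer(phi, phi.conj())
print(trace_distance(rho, sigma))  # 1/sqrt(2) ≈ 0.7071
```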
2.2 Optimizing the input

If ρ were given to us in the form ρ^{⊗n} where n is unimaginably large, we could estimate ρ with great accuracy. However, in reality, preparing so many copies of ρ is expensive. The main question, then, is: given the constraint on the output, how small can n be? Here are a couple of results that shed some light on this question.
Fact 2.2. To solve the tomography problem, it is necessary to have n ∈ Ω(d²/ε²) and sufficient to have n ∈ O(d²/ε²); that is, n = Θ(d²/ε²).

Fact 2.3. If ρ is pure, then it is sufficient to have n ∈ O(d/ε²). It is likely also necessary to have n ∈ Ω(d/ε²); however, the best known lower bound is n ∈ Ω(d/(ε² log(d/ε))) [HHJ+15].

It makes sense that n need not be as large if ρ is pure, since ρ is then a rank-1 matrix that can be fully described as the outer product of a d-dimensional vector with itself.

At first glance, this looks promising: quadratic and linear complexity with respect to d seems efficient. However, an m-qubit state has dimension d = 2^m. Therefore, in the case of Fact 2.2, we would need n = Θ(4^m/ε²). This fast exponential growth in the size of the input is the biggest obstacle to quantum tomography at present.

In the remainder of this lecture, we will focus on pure state tomography and prove Fact 2.3. The general bound from Fact 2.2 is considerably more difficult to prove, so we will not discuss it; for more detail, consult the recent paper written by the instructors of this course [OW15].
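To get a feel for the exponential blow-up, here is some back-of-the-envelope arithmetic (a sketch; the Θ and O bounds hide constants, which we simply take to be 1 here):

```python
# Copies needed for an m-qubit state of dimension d = 2^m, per the scaling in
# Facts 2.2 and 2.3 (constants assumed to be 1 for illustration).
def copies_needed(m, eps, pure=False):
    d = 2 ** m
    return d / eps ** 2 if pure else d ** 2 / eps ** 2

for m in (1, 5, 10):
    print(m, copies_needed(m, 0.01), copies_needed(m, 0.01, pure=True))
```

Even at m = 10 qubits and ε = 0.01, the general bound already calls for on the order of 10^10 copies, while the pure-state bound needs roughly 10^7.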
3 Pure state tomography
3.1 Relation to hypothesis testing

In pure state tomography, we are given |ψ⟩ ∈ C^d in the form |ψ⟩^{⊗n}, and we want to output |ψ̃⟩ ∈ C^d such that d_tr(|ψ⟩⟨ψ|, |ψ̃⟩⟨ψ̃|) ≤ ε. When looking at pure states and their trace distance, we have the following fact.

Fact 3.1. For |ψ⟩, |ψ̃⟩ ∈ C^d, the trace distance of |ψ⟩ and |ψ̃⟩ satisfies the following equality:

    d_tr(|ψ⟩⟨ψ|, |ψ̃⟩⟨ψ̃|) = √(1 − |⟨ψ|ψ̃⟩|²).

Thus the goal is to make our guess |ψ̃⟩ have large inner product with |ψ⟩. How do we achieve that?

This problem resembles worst-case hypothesis testing. In worst-case hypothesis testing, we are given some state |ψ⟩^{⊗n} from some set {|v⟩^{⊗n}}_{|v⟩}, and we try to guess what |ψ⟩ is. The difference in tomography, of course, is that the set of possibilities is infinitely large and that we have some margin of error.

This problem is also similar to average-case hypothesis testing, which, as we saw last lecture, was related to worst-case hypothesis testing through the minimax theorem. In tomography, we are not given a probability distribution, so it is similar to the situation where we try to guess |ψ⟩^{⊗n} given a uniformly random |ψ⟩ ∼ {|v⟩^{⊗n}}_{unit |v⟩}. We saw that if we have an optimal algorithm for average-case hypothesis testing, then it is closely related to the optimal algorithm for worst-case hypothesis testing. Thus, to devise the algorithm for tomography, we can get away with just solving the average-case situation on the hardest distribution – the uniform distribution.
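Fact 3.1 is easy to check numerically. The sketch below (not from the notes) compares the trace distance, computed from the eigenvalues of the difference of the two density matrices, against the closed form √(1 − |⟨ψ|ψ̃⟩|²):

```python
# Numerical check of Fact 3.1 on a pair of random pure states in C^4.
import numpy as np

rng = np.random.default_rng(0)

def random_state(d):
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

d = 4
psi, phi = random_state(d), random_state(d)
diff = np.outer(psi, psi.conj()) - np.outer(phi, phi.conj())
lhs = 0.5 * np.sum(np.abs(np.linalg.eigvalsh(diff)))  # trace distance
rhs = np.sqrt(1 - np.abs(np.vdot(psi, phi)) ** 2)     # Fact 3.1's formula
print(lhs, rhs)  # the two values agree
```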
3.2 Infinitesimals

The strategy for tomography is that we will merely pretend we are given a probability distribution. With input |ψ⟩^{⊗n}, we will treat |ψ⟩ as a uniformly random unit vector in C^d. What does it mean to have a uniformly random unit vector? Since the state is uniformly random, the probability of measuring any particular |ψ⟩ ∈ C^d is infinitesimally small. We will write this as Pr[|ψ⟩] = dψ.

We need to choose a POVM suitable for tackling this probability distribution. Last lecture, we saw that a good way to deal with average-case hypothesis testing was to run the Pretty Good Measurement, in which, given an ensemble {(σ_i, p_i)}_i of density matrices and probabilities, we would choose a POVM whose elements resembled the states we knew were in the ensemble. Namely, we took each POVM element E_i to be p_i σ_i adjusted by a couple of S^{−1/2} factors. We will reappropriate that technique for our situation. The version of the Pretty Good Measurement for tomography is a POVM with elements resembling
    E_{|ψ⟩} ≈ dψ · |ψ⟩^{⊗n}⟨ψ|^{⊗n}.
This gives an infinite number of POVM elements, which means that the typical constraint that Σ_i E_i = I needs to be rethought. We instead consider a new constraint that

    ∫_{|ψ⟩} E_{|ψ⟩} = I,

where I is the identity operator on the span of the states |v⟩^{⊗n} (the space discussed further in Section 3.4).

This talk of infinitesimals and integrals might seem non-rigorous, but it is really just informal shorthand that will be notationally convenient for us. A more rigorous analysis could be achieved by instead taking finite-sized POVMs that approximate this infinite scheme and then considering the limit as the finite POVMs become more and more accurate (and consequently larger and larger in the number of elements).

(A natural question to ask is, “how would we actually implement measurement with a POVM of infinitely many elements?” We would have to take a finite-sized POVM that approximates the infinite POVM. This is not a big sacrifice; we do a similar discretization of randomness in classical algorithms whenever we implement a randomized algorithm that calls for a real number from the uniform distribution over [0, 1].)
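In simulations, one can sample the uniformly random (Haar-distributed) unit vectors this section treats informally by normalizing a vector of i.i.d. complex Gaussians; the rotational invariance of the Gaussian makes the resulting direction uniform on the unit sphere. This is a standard trick, not something spelled out in the notes:

```python
# Sampling a Haar-random unit vector in C^d by normalizing complex Gaussians.
import numpy as np

def haar_random_state(d, rng):
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(42)
psi = haar_random_state(3, rng)
print(np.linalg.norm(psi))  # 1.0 (up to floating point)
```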
3.3 The algorithm Now we are ready to present the algorithm for pure state tomography that will prove fact 2.3.
Pure State Tomography Algorithm

Input: We are given |ψ⟩^{⊗n}, where |ψ⟩ ∈ C^d. We know n and d, which are natural numbers.
Strategy: Measure with the POVM consisting of {E_{|v⟩}}_{unit vectors |v⟩ ∈ C^d}, where

    E_{|v⟩} = (n+d−1 choose d−1) · |v⟩^{⊗n}⟨v|^{⊗n} dv.
Our output: The measurement will give some vector |ψ̃⟩ ∈ C^d, which is what we will output.
Let’s do a quick sanity check by making sure our output is sensible. We have

    Pr[|ψ̃⟩] = tr(E_{|ψ̃⟩} |ψ⟩^{⊗n}⟨ψ|^{⊗n})
             = (n+d−1 choose d−1) · tr(|ψ̃⟩^{⊗n}⟨ψ̃|^{⊗n} |ψ⟩^{⊗n}⟨ψ|^{⊗n}) dψ̃
             = (n+d−1 choose d−1) · |⟨ψ|ψ̃⟩|^{2n} dψ̃,                          (1)

where the last equality follows from the cyclic property of the trace operator. Notice that |⟨ψ|ψ̃⟩| is 1 if |ψ̃⟩ is equal to |ψ⟩ (up to phase) and less than 1 otherwise. This signals that the probability of outputting a particular vector is greater if that vector has large inner product with the input state, which is what we want.

Before we analyze the accuracy of the algorithm, we need to first verify that ∫_{|ψ⟩} E_{|ψ⟩} = I.
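As a Monte Carlo sanity check on equation (1) (a sketch with illustrative parameters d = 2, n = 2, not from the notes): averaging |⟨ψ|ψ̃⟩|^{2n} over Haar-random unit vectors |ψ̃⟩ and multiplying by the binomial coefficient should give approximately 1, i.e., the output probabilities integrate to 1.

```python
# Estimate binom(n+d-1, d-1) * E[|<psi|psi_tilde>|^(2n)] over Haar-random
# psi_tilde; the result should be close to 1.
import numpy as np
from math import comb

rng = np.random.default_rng(1)
d, n, samples = 2, 2, 200_000

def haar_states(count, d):
    v = rng.normal(size=(count, d)) + 1j * rng.normal(size=(count, d))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

psi = haar_states(1, d)[0]
tildes = haar_states(samples, d)
overlaps = np.abs(tildes @ psi.conj()) ** (2 * n)
total = comb(n + d - 1, d - 1) * overlaps.mean()
print(total)  # ≈ 1
```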
3.4 The integral constraint

The presence of the binomial coefficient in the POVM elements might seem strange, but it is merely a scaling factor that makes the constraint ∫_{|ψ⟩} E_{|ψ⟩} = I hold. Why does this particular coefficient appear, though? Notice that the input is guaranteed to be of a certain form, namely, a tensor power with specific dimensions. Naturally, we want to understand the space in which these inputs live.
Definition 3.2. The vector space Sym(n, d) is span{|v⟩^{⊗n} : |v⟩ ∈ C^d}.

The notation stands for “symmetric,” since vectors of the form |v⟩^{⊗n} are, in a way, symmetric as a tensor product. As an abbreviation, we will simply refer to this space as Sym, with the n and d parameters being implied. The binomial coefficient comes from the dimension of this space.
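The dimension dim Sym(n, d) = (n+d−1 choose d−1) can be checked numerically (a sketch, not from the notes): stack many random tensor powers |v⟩^{⊗n} as rows of a matrix and compute its rank, which saturates at the dimension of their span.

```python
# Verify dim Sym(n, d) = binom(n+d-1, d-1) for n = 3, d = 2 (three qubits):
# the span of the vectors |v>^{tensor n} should have dimension binom(4, 1) = 4.
import numpy as np
from math import comb
from functools import reduce

rng = np.random.default_rng(0)
n, d = 3, 2

def tensor_power(v, n):
    return reduce(np.kron, [v] * n)

rows = []
for _ in range(20):  # more vectors than dim Sym, so the span saturates
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    rows.append(tensor_power(v / np.linalg.norm(v), n))

rank = np.linalg.matrix_rank(np.array(rows))
print(rank, comb(n + d - 1, d - 1))  # both 4
```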