<<

View metadata, citation and similar papers at core.ac.uk brought to you by CORE providedweek by ending Caltech Authors - Main PRL 116, 250501 (2016) PHYSICAL REVIEW LETTERS 24 JUNE 2016

Improved Classical Simulation of Quantum Circuits Dominated by Clifford Gates

Sergey Bravyi1 and David Gosset2 1IBM T.J. Watson Research Center, Yorktown Heights, New York 10598, USA 2Walter Burke Institute for Theoretical Physics and Institute for and Matter, California Institute of Technology, Pasadena, California 91125, USA (Received 9 March 2016; published 20 June 2016) We present a new algorithm for classical simulation of quantum circuits over the Clifford þ T gate set. The runtime of the algorithm is polynomial in the number of and the number of Clifford gates in the circuit but exponential in the number of T gates. The exponential scaling is sufficiently mild that the algorithm can be used in practice to simulate medium-sized quantum circuits dominated by Clifford gates. The first demonstrations of fault-tolerant quantum circuits based on 2D topological codes are likely to be dominated by Clifford gates due to a high implementation cost associated with logical T gates. Thus our algorithm may serve as a verification tool for near-term quantum computers which cannot in practice be simulated by other means. To demonstrate the power of the new method, we performed a classical simulation of a hidden shift with 40 qubits, a few hundred Clifford gates, and nearly 50 T gates.

DOI: 10.1103/PhysRevLett.116.250501

The path towards building a large-scale quantum com- When it comes to realizing a logical (encoded) circuit, puter will inevitably require verification and validation of non-Clifford gates pose a serious challenge for any fault- small quantum devices. One way to check that such a tolerant scheme based on 2D stabilizer codes [8,13] due to device is working properly is to simulate it on a classical the lack of topological protection [14,15]. Such non- computer. This becomes impractical at some point because Clifford gates can be implemented fault-tolerantly using the cost of classical simulation typically grows exponen- special single- resource states known as magic states tially with the size of a quantum system. With this [16]. The magic states must themselves be prepared using a fundamental limitation in mind it is natural to ask how fault tolerant protocol for “” [16], well we can do in practice. which is relatively resource intensive. For example, in the Simulation methods which store a complete description case of the surface code, the overhead associated with of an n-qubit quantum state as a complex vector of size 2n logical T gates exceeds that of any logical Clifford gate by are limited to a small number of qubits n ≈ 30 − 40.For orders of magnitude [17,18]. Thus it is likely that the first example, a state-of-the art implementation has been used to logical circuits demonstrated in the lab will be Clifford þ T simulate Shor’s factoring algorithm with 31 qubits and circuits dominated by Clifford gates. In this Letter we roughly half a million gates [1]. Using distributed compu- propose a new algorithm for classical simulation of such tation it is possible to simulate 40 qubit circuits [2].For circuits. Our algorithm could therefore serve as a verifica- certain restricted classes of quantum circuits it is possible to tion tool for near-term quantum computers. do much better [3–7]. Most significantly, the Gottesman- Let us now state our results. A Clifford þ T quantum Knill theorem allows efficient classical simulation of circuit of length m acting on n qubits is a unitary operator quantum circuits composed of gates in the so-called U ¼ Um U2U1, where each Uj is a one- or two-qubit Clifford group [3]. In practice this allows one to simulate gate from the set fH; S; T; CNOTg, where such circuits with thousands of qubits [1,4]. It also means       that a quantum computer will need to use gates outside of 1 11 10 10 H ¼ pffiffiffi ;S¼ ;T¼ ; the Clifford group in order to achieve useful speedups over 2 1 −1 0 i 0 eiπ=4 classical computation. The full power of quantum compu- tation can be recovered by adding a single non-Clifford gate and CNOT ¼j0ih0j ⊗ I þj1ih1j ⊗ X is the controlled- to the Clifford group. A simple choice is the single-qubit NOT gate. We shall write m ¼ c þ t, where c is the number T ¼j0ih0jþeiπ=4j1ih1j gate. The Clifford þ T gate set of Clifford gates (H; S; CNOT) and t is the number of T obtained in this way is a natural instruction set for small- gates also known as the T count. Applying U to the initial scale fault-tolerant quantum computers based on the sur- state j0⊗ni and measuring some fixed output register Q ⊆ 1; …;n face code [8,9], and has been at the center of a recent out f g in the 0,1 basis generates a random bit x w Q x renaissance in classical techniques for compiling quantum string of length ¼j outj. A string appears with circuits [10–12]. probability

0031-9007=16=116(25)=250501(5) 250501-1 © 2016 American Physical Society week ending PRL 116, 250501 (2016) PHYSICAL REVIEW LETTERS 24 JUNE 2016 P x 0⊗n U†Π x U 0⊗n ; T t ≤ 48 outð Þ¼h j ð Þ j i ð1Þ Clifford gates, and count . Specifically, we simulated a quantum algorithm which solves the hidden Π x Q x where ð Þ projects out onto the basis state j i and acts shift problem [21] for non-linear Boolean functions [22]. trivially on the remaining qubits. An instance of the hidden shift problem is defined by a pair 0 n Our main result is a classical algorithm for sampling the of oracle functions f, f ∶F 2 → f1g and a hidden shift x ϵ P s ∈ F n f output string from a distribution which is -close to out string 2. It is promised that is a bent (maximally with respect to the L1 norm. The algorithm has runtime nonlinear) function, that is, the Hadamard transform of f takes values 1. It is also promised that f0 is the shifted τ ¼ Oðwðw þ tÞðc þ tÞþwðn þ tÞ3 þ 2γtt3w4ϵ−5Þ; ð2Þ version of the Hadamard transform of f, that is, X 0 −n=2 x·y n where f ðx ⊕ sÞ¼2 ð−1Þ fðyÞ for all x ∈ F 2: ð4Þ n y∈F 2 γ ≤−2log2½cos ðπ=8Þ ≈ 0.228 ð3Þ Here ⊕ stands for the bitwise exclusive OR gate. The goal is a constant that depends on the implementation details. is to learn the hidden shift s by making as few queries to f Note that the runtime scales polynomially in all parameters and f0 as possible. The classical query complexity of this except for the T count. We expect the algorithm to be problem is known to be linear in n, see Theorem 8 of practical when the size of the output register w is small and Ref. [22]. In the quantum setting, f and f0 are given as the precision ϵ is not too small. For example, assuming that diagonal n-qubit unitary operators Of and Of0 such that w 1 ϵ 0 n the circuit outputs a single bit ( ¼ ), is a fixed constant, Ofjxi¼fðxÞjxi and Of0 jxi¼f ðxÞjxi for all x ∈ F 2.A and t ≤ n ≤ c, the runtime becomes quantum algorithm can learn s by making a single query to each of these oracles, as can be seen from the identity [22] τ ¼ Oðn3 þ ct þ 2γtt3Þ: ⊗n ⊗n ⊗n ⊗n jsi¼Uj0 i;U≡ H Of0 H OfH : ð5Þ The algorithm can be divided into independent subroutines 3 with a runtime Oðt Þ each and thus supports a large amount This hidden shift problem is ideally suited for our bench- of parallelism. We provide pseudocode for the main steps marking task for two reasons. First, the algorithm produces of the algorithm in the Supplemental Material [19]. a deterministic output, i.e., the output is a computational Since the simulation runtime is likely to be dominated by basis state jsi for some n-bit string s. Because of this we the term exponential in t, one may wish to minimize the achieve the most favorable runtime scaling in Eq. (2) since exponent γ in Eq. (2). This exponent is related to the each bit of s can be learned by calling the sampling ⊗t stabilizer rank [20] of t-qubit tensor product states jA i, algorithm with a single-qubit output register (w ¼ 1) and where jAi is a “magic state” a constant statistical error ϵ. Second, the T count of the algorithm can be easily controlled by choosing a suitable −1=2 iπ=4 jAi¼2 ðj0iþe j1iÞ: bent function. Indeed, the non-Oracle part of the algorithm consists only of Hadamard gates. We show that for a large Recall that a t-qubit state is called a stabilizer state if it has class of bent functions f (from the so-called Maiorana- V 0⊗t V the form j i, where is a composed of McFarland family) the oracles Of and Of0 can be con- Clifford gates. Stabilizer states form an overcomplete basis structed using Clifford gates and only a few T gates. in the Hilbert space of t qubits. Let χtðδÞ be the smallest The numerical simulations were performed for two ⊗t integer χ such that jA i can be approximated with an randomly generated instances of the hidden shift problem error at most δ by a linear combination of χ stabilizer with n ¼ 40 qubits. For each of these instances we states (here the approximating state jψi should satisfy simulated the quantum circuit for the hidden shift algo- ⊗t 2 jhA jψij ≥ 1 − δ). The runtime scaling in Eq. (2) holds rithm, i.e., the circuit implementing the unitary U described γt for any exponent γ such that χtðδÞ¼Oð2 Þ for any above. The T counts of the two simulated circuits are constant δ > 0 and all sufficiently large t. For simplicity t ¼ 40 and t ¼ 48, respectively. Since the hidden shift s is here we assumed that the precision parameter ϵ in Eq. (2) is known beforehand, we are able to verify correctness of the a constant. Below we propose a systematic method of simulation. Our results are presented in Fig. 1. As one can finding approximate stabilizer decompositions of jA⊗ti see from the plots, the output probability distribution of γt −1 which yields an upper bound χtðδÞ¼Oð2 δ Þ, where each qubit has most of its weight at the corresponding value γ ≈ 0.228, see Eq. (3). We conjecture that this upper bound of the hidden shift bit. Only the output probabilities for is tight. qubits 21; 22; …; 40 are shown because our algorithm We implemented our classical sampling algorithm in perfectly recovered the first half of the hidden shift bits MATLAB and used it to simulate a class of benchmark 1; 2; …; 20. This perfect recovery occurs due to the quantum circuits on n ¼ 40 qubits, with a few hundred special structure of the chosen bent functions. Further

250501-2 week ending PRL 116, 250501 (2016) PHYSICAL REVIEW LETTERS 24 JUNE 2016 1 1

0.8 0.8

0.6 0.6

0.4 0.4 Output probabilities Output probabilities 0.2 0.2

0 0 00001000010011100011 01101101000011111001 Hidden shift string Hidden shift string

FIG. 1. Output single-qubit probability distributions obtained by a classical simulation of the hidden shift quantum algorithm on n ¼ 40 qubits. Only one half of all qubits are shown (qubits 21; 22; …; 40). The final state of the algorithm is jsi¼Uj0⊗ni, where s is the hidden shift string to be found and U is a Clifford þ T circuit with the T count t ¼ 40 (left) and t ¼ 48 (right). In both cases the circuit U contains a few hundred Clifford gates. For each qubit the probability of measuring “1” in the final state is indicated in blue. The x-axis labels indicate the correct hidden shift bits. The entire simulation took several hours on a laptop computer. implementation details can be found in Section IV of the Suppose jθ1i; …; jθLi ∈ St are random independent stabi- Supplemental Material [19]. lizer states. Define a random variable Let us now describe two main ingredients of our sampling algorithm. The first ingredient is a subroutine d XL ξ θ ϕ 2: for estimating the norm of a linear combination of stabilizer ¼ L jh ij ij ð7Þ states. It takes as input a t-qubit state jϕi, a target error i¼1 parameter ϵ > 0, and a failure probability pf. The state jϕi From Eq. (6) one infers that the expected value of ξ is is given as a linear combination of χ stabilizer states, ξ¯ ¼ EðξÞ¼∥jϕi∥2 and the standard deviation of ξ is Xχ qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi rffiffiffiffiffiffiffiffiffiffiffi d − 1 jϕi¼ zajϕai; ϕa ∈ St: 2 −1 2 −1=2 2 σ ¼ d L ðM4 − M2Þ ¼ L ∥jϕi∥ : a¼1 d þ 1

−1=2 2 Here St is the set of all t-qubit stabilizer states. The For large t one has σ ≈ L ∥jϕi∥ . By the Chebyshev ξ ¯ −1=2 subroutine computes a real number which, with proba- inequality, Pr½jξ − ξj ≥ pf σ ≤ pf. Thus 2 bility at least 1 − pf, approximates the norm ∥jϕi∥ with ϵ O χt3ϵ−2p−1 relative error . It has running time ð f Þ. This ð1 − ϵÞ∥jϕi∥2 ≤ ξ ≤ ð1 þ ϵÞ∥jϕi∥2 improves upon the brute force method which has complex- ity Oðχ2t3Þ. The key idea is to approximate ∥jϕi∥2 by with the probability of at least 1 − pf provided that computing inner products between jϕi and randomly −1 −2 L ¼ pf ϵ . The inner product between any t-qubit stabi- chosen stabilizer states. Let jθi ∈ St be a random stabilizer O t3 state drawn from the uniform distribution. Define expect- lizer states can be computed classically inP time ð Þ, see θ ϕ χ z θ ϕ ation values Refs. [20,24]. The inner product h ij i¼ a¼1 ah ij ai 3 in Eq. (7) can be computed in time Oðχt Þ since jθii and ϕ t M ≡ E θ ϕ 2 M ≡ E θ ϕ 4: j ai are stabilizer states of qubits. Thus we compute an 2 θjh j ij and 4 θjh j ij 2 3 −2 −1 approximation to ∥jϕi∥ in time Oðχt ϵ pf Þ. We antici- pate that the above norm estimation method can be The set St is known to be a 2-design [23]. This implies that generalized to stabilizer states of qudits of prime dimension one may compute M2 and M4 by pretending that jθi is [25] and fermionic Gaussian states [26]. drawn from the Haar measure. Standard formulas for the The second ingredient of our simulation algorithm is a integrals over the unit sphere yield method for computing approximate stabilizer decomposi- A⊗t A 2 4 tions of j i. The magic state j i is equivalent to a state ∥jϕi∥ 2∥jϕi∥ t H ≡ π=8 0 π=8 1 M2 ¼ and M4 ¼ ; where d≡2 : ð6Þ j i cosð Þj iþsinð Þj i modulo Clifford gates d dðdþ1Þ and a global phase, jAi¼eiπ=8HS†jHi. Thus it suffices

250501-3 week ending PRL 116, 250501 (2016) PHYSICAL REVIEW LETTERS 24 JUNE 2016 to consider approximate stabilizer decompositions of contains c þ t þjyj gates. Since the gadgetized circuit is jH⊗ti. We have the identity equivalent to the original Clifford þ T circuit, we have

1 X ⊗n ⊗t † ⊗n ⊗t H⊗t x~ ⊗ x~ ⊗ ⊗ x~ h0 ⊗ A jVyðΠðxÞ ⊗ jyihyjÞVyj0 ⊗ A i j i¼ t j 1 2 tið8Þ P x ; 2ν outð Þ¼ † ð Þ x∈F t ⊗n ⊗t ⊗n ⊗t 2 h0 ⊗ A jVyðIn ⊗ jyihyjÞVyj0 ⊗ A i ð11Þ where j0~i ≡ j0i, j1~i ≡ Hj0i¼2−1=2ðj0iþj1iÞ, and ν ≡ cosðπ=8Þ. The right-hand side of Eq. (8) is a uniform t for any measurement outcomes y. Let jψi be a linear superposition of 2 nonorthogonal stabilizer states labeled −2t −1 t combination of χ ¼ Oðν δ Þ stabilizer states con- by elements of the vector space F 2. We construct an structed above such that jhψjA⊗tij2 ≥ 1 − δ. Replacing approximation jψi which is a uniform superposition of ⊗t t jA i by its approximation jψi in Eq. (11) we are led to states jx~1 ⊗ x~2 ⊗ ⊗ x~ti over a linear subspace of F 2. The dimension k of this subspace is chosen to be the unique consider a distribution 4 ≥ 2kν2tδ ≥ 2 δ positive integer satisfying , where is the ⊗n † ⊗n t y h0 ⊗ ψjVyðΠðxÞ ⊗ jyihyjÞVyj0 ⊗ ψi error tolerance. For any k-dimensional subspace L of F 2 we P x : outð Þ¼ ⊗n † ⊗n define a normalized state h0 ⊗ ψjVyðIn ⊗ jyihyjÞVyj0 ⊗ ψi 1 X ð12Þ jLi¼pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi jx~1 ⊗ x~2 ⊗ ⊗ x~tið9Þ 2kZ L ð Þ x∈L This distribution will in general depend on y since jψi is not P exactly equal to jA⊗ti. In Section II of the Supplemental Z L ≡ 2−jxj=2 where ð Þ x∈L . Here j · j denotes the Hamming Material we show that weight of a bit string. A simple computation shows that jLi ⊗t approximates jH i with error 1 X Py x − P x O ϵ t outð Þ outð Þ ¼ ð Þ k 2t 2 t 1 2 ν y∈f0;1g δðLÞ ≡ 1 − jhH⊗tjLij2 ¼ 1 − : ð10Þ ZðLÞ provided that δ ¼ Oðϵ2Þ. This shows that we may approx- k P O ϵ The error δðLÞ can be computed in time Oðt2 Þ since ZðLÞ imately sample from out [with error ð Þ] by first contains 2k terms. In Section III of the Supplemental selecting a t-bit string y uniformly at random and then Py O ϵ Material we show that by choosing Oð1=δÞ k-dimensional approximately sampling from out [with error ð Þ]. It Py subspaces L uniformly at random we obtain at least one remains to show how to approximately sample from out ⋆ ⋆ subspace L such that δðL Þ ≤ δ with high probability. We for a fixed y. Since the gadgetized circuit Vy contains only conclude that jhψjA⊗tij2 ≥ 1 − δ, where jψi ≡ ðHS†Þ⊗tjL⋆i Clifford gates we may use the standard Gottesman-Knill is a linear combination of χ ¼ 2k ¼ Oðν−2tδ−1Þ stabilizer theorem to compute t-qubit stabilizer groups G; H and states. Computing the approximation jψi takes time integers u, v such that Oðν−2ttδ−2Þ. We will see that this is negligible compared ⊗n † ⊗n −u with the overall runtime Eq. (2) of the sampling algorithm. h0 ⊗ ψjVyðΠð0Þ ⊗ jyihyjÞVyj0 ⊗ ψi¼2 hψjΠGjψi We are now ready to describe the algorithm for sampling ϵ P ð13Þ from a distribution -close to out. For simplicity here we restrict our attention to the case when the output register consists of a single qubit (w ¼ 1). We first transform the ⊗n † ⊗n −v h0 ⊗ ψjVyðΠð1Þ ⊗ jyihyjÞVyj0 ⊗ ψi¼2 hψjΠHjψi Clifford þ T circuit to be simulated by replacing each T gate by a certain well-known gadget [27], shown in Fig. 2, ð14Þ that contains only Clifford gates and a 0,1 measurement. The S gate is classically controlled by the measurement where ΠG, ΠH are projectors onto the code space of outcome. The gadget consumes one copy of the magic state stabilizer codes defined by G, H. This computation, which jAi. This gives an equivalent “gadgetized” circuit consist- is described in more detail in Sections I, II of the ing of Clifford gates and t single-qubit measurements, Supplemental Material, takes time acting on a nonstabilizer initial state that contains t copies 3 of jAi. Let Vy be the Clifford circuit on n þ t qubits τ1 ¼ Oðtðc þ tÞþðn þ tÞ Þ: corresponding to measurement outcomes described by a t-bit string y ¼ y1y2; …;yt. Each gadget with yj ¼ 0 Since we are considering the case where the output string x contributes a CNOT gate to Vy, whereas each gadget with is a single bit, the output probability distribution is y 1 S V V Py 0 ; 1 − Py 0 j ¼ contributes a CNOT and the gate to y. Thus y f outð Þ outð Þg, where

250501-4 week ending PRL 116, 250501 (2016) PHYSICAL REVIEW LETTERS 24 JUNE 2016

[1] D. Wecker and K. M. Svore, arXiv:1402.4467. [2] M. Smelyanskiy, N. P. D. Sawaya, and A. Aspuru-Guzik, arXiv:1601.07195. [3] D. Gottesman, arXiv:quant-ph/9807006. [4] S. Aaronson and D. Gottesman, Phys. Rev. A 70, 052328 (2004). [5] I. Markov and Y. Shi, SIAM J. Comput. 38, 963 (2008). T S FIG. 2. The -gate gadget. The Clifford gate is classically [6] M. Van den Nest, Quantum Inf. Comput. 10, 0258 controlled by the measurement outcome. (2010). [7] H. Pashayan, J. Wallman, and S. D. Bartlett, Phys. Rev. Lett. 2−u ψ Π ψ Py 0 h j Gj i : 115, 070501 (2015). outð Þ¼ −v −u ð15Þ 2 hψjΠHjψiþ2 hψjΠGjψi [8] S. Bravyi and A. Kitaev, arXiv:quant-ph/9811052. [9] A. G. Fowler, M. Mariantoni, J. M. Martinis, and A. N. We compute the expectation values in Eq. (15) with a small Cleland, Phys. Rev. A 86, 032324 (2012). relative error using the norm estimation subroutine [10] V. Kliuchnikov, D. Maslov, and M. Mosca, Quantum Inf. Comput. 13, 607 (2013). described above. Indeed, since the projector ΠG maps [11] P. Selinger, Quantum Inf. Comput. 15, 159 (2015). stabilizer states to stabilizer states, one can represent [12] N. J. Ross and P. Selinger, arXiv:1403.2975. Π ψ χ O ν−2tϵ−2 Gj i as a linear combination of ¼ ð Þ stabilizer [13] H. Bombin and M. A. Martin-Delgado, Phys. Rev. Lett. 97, 2 states. Thus one can estimate hψjΠGjψi¼∥ΠGjψi∥ with a 180501 (2006). relative error OðϵÞ and a failure probability OðϵÞ in time [14] S. Bravyi and R. König, Phys. Rev. Lett. 110, 170503 (2013). 3 −3 −2t 3 −5 91 τ2 ¼ Oðχt ϵ Þ¼Oðν t ϵ Þ: [15] F. Pastawski and B. Yoshida, Phys. Rev. A , 012305 (2015). −u 0 −v [16] S. Bravyi and A. Kitaev, Phys. Rev. A 71, 022316 Let ξ ¼ 2 hψjΠGjψið1 ϵÞ and ξ ¼ 2 hψjΠHjψið1 ϵÞ be the resulting approximations. The final step in the (2005). [17] A. Fowler, S. Devitt, and C. Jones, Sci. Rep. 3, 1939 (2013). algorithm is to sample a bit from the probability distribution [18] C. Jones, arXiv:1310.7290. p ; 1 − p p ξ= ξ ξ0 f 0 0g where 0 ¼ ð þ Þ [cf. Eq. (15)]. The [19] See Supplemental Material at http://link.aps.org/ ξ ξ0 approximation guarantees for , ensure that this distri- supplemental/10.1103/PhysRevLett.116.250501 for further O ϵ Py bution is ð Þ-close to out. The total runtime of the details on the implementation of our algorithm and numeri- algorithm is τ1 þ τ2 from which we recover the w ¼ 1 case cal simulations. of Eq. (2). [20] S. Bravyi, G. Smith, and J. Smolin, arXiv:1506.01396 Whereas here we focused on the case w ¼ 1,in [Phys. Rev. X (to be published)]. Section II of the Supplemental Material we describe the [21] W. van Dam, S. Hallgren, and L. Ip, SIAM J. Comput. 36, simulation algorithm for arbitrary w. Although this algo- 763 (2006). rithm can be used for sampling from the output distribution [22] M. Rötteler, in Proceedings of the 21st ACM-SIAM Symposium on Discrete Algorithms with a small statistical error, in general it cannot accurately (Society for Industrial and Applied Mathematics (SIAM), Philadelphia, 2010), compute individual probabilities of the output distribution. pp. 448–457. In the Supplemental Material we also present a different [23] C. Dankert, R. Cleve, J. Emerson, and E. Livine, Phys. Rev. algorithm which uses similar techniques to compute the A 80, 012304 (2009). P x ϵ output probabilities outð Þ with a relative error . [24] H. J. García, I. Markov, and A. Cross, Quantum Inf. Comput. 14, 683 (2014). D. G. acknowledges funding provided by the Institute for [25] E. Hostens, J. Dehaene, and B. De Moor, Phys. Rev. A 71, Quantum Information and Matter, an NSF Physics 042315 (2005). Frontiers Center (NFS Grant No. PHY-1125565) with [26] S. Bravyi, Quantum Inf. Comput. 5, 216 (2005). support of the Gordon and Betty Moore Foundation [27] X. Zhou, D. W. Leung, and I. L. Chuang, Phys. Rev. A 62, (GBMF-12500028). S. B. thanks Alexei Kitaev for helpful 052316 (2000). discussions and comments.

250501-5