The effect of noise correlations on randomized benchmarking
Harrison Ball,1, 2 Thomas M. Stace,3 Steven T. Flammia,1 and Michael J. Biercuk1, 2, ∗ 1ARC Centre of Excellence for Engineered Quantum Systems, School of Physics, The University of Sydney, NSW 2006 Australia 2Australian National Measurement Institute, West Lindfield, NSW 2070 3ARC Centre of Excellence for Engineered Quantum Systems, School of Physics and Mathematics, The University of Queensland, QLD 4072 Australia (Dated: February 25, 2016) Among the most popular and well studied quantum characterization, verification and validation techniques is randomized benchmarking (RB), an important statistical tool used to characterize the performance of physical logic operations useful in quantum information processing. In this work we provide a detailed mathematical treatment of the effect of temporal noise correlations on the outcomes of RB protocols. We provide a fully analytic framework capturing the accumulation of error in RB expressed in terms of a three-dimensional random walk in ”Pauli space.” Using this framework we derive the probability density function describing RB outcomes (averaged over noise) for both Markovian and correlated errors, which we show is generally described by a gamma distribution with shape and scale parameters depending on the correlation structure. Long temporal correlations impart large nonvanishing variance and skew in the distribution towards high-fidelity outcomes – consistent with existing experimental data – highlighting potential finite-sampling pitfalls and the divergence of the mean RB outcome from worst-case errors in the presence of noise correlations. We use the Filter- transfer function formalism to reveal the underlying reason for these differences in terms of effective coherent averaging of correlated errors in certain random sequences. We conclude by commenting on the impact of these calculations on the utility of single-metric approaches to quantum characterization, verification, and validation.
I. Introduction why such distortion of RB results can appear, or a clear under- Quantum verification and validation protocols are a vital standing of the impact of this observation on the applicability tool for characterizing quantum devices. These take many of RB. forms [1–8], but one of the most popular due to its efficiency is randomized benchmarking (RB). In this approach, devel- In this manuscript we examine the impact of relaxing the oped originally by Knill et al. [9], and expanded theoretically Markovian-error assumption by studying its effect on the dis- by various authors [10–13], the average error probability of tribution of measurement outcomes over randomizations. We a quantum gate (e.g. a bit flip) is estimated by implementing find that while all randomizations (meeting certain criteria) a randomly sampled gate sequence from the set of Clifford are valid within the RB framework, they exhibit starkly dif- operations, and measuring the difference between the ideal ferent susceptibility to error when those errors exhibit tempo- transformation and the actual result. Averaging over many ral correlations over multiple individual Clifford operations. randomized sequences yields information about the underly- We provide a detailed mathematical treatment of error accu- ing gate fidelity. mulation in RB, using a general model treated in the specific RB has become so important in the experimental commu- case of dephasing errors. We demonstrate how the reduction nity [9, 14–23] that despite the experimental complexity it is of fidelity for a particular sequence may be given a geomet- now common to simply quote a measured gate error, pRB, re- ric interpretation, taking the form of a random walk in a 3D sulting from a RB measurement for a particular experimental Cartesian coordinate system. The steps of this walk corre- system, and relate this number to tight bounds such as fault- spond to the appearance of Pauli X,ˆ Y,ˆ Zˆ errors derived from tolerance error thresholds in quantum error correction [24]. interleaved dephasing operators in the sequence of Clifford Of late, reported values of pRB have compared favorably with operations, and the overall statistics are determined by the un- these thresholds, and have been used to justify the scalability arXiv:1504.05307v2 [quant-ph] 23 Feb 2016 derlying correlations in the noise process. Our treatment in- of particular experimental platforms. Underlying this entire cludes both extremal cases of uncorrelated (Markovian) and approach is the subtle but central assumption that errors are systematic (DC) errors, as well as generic correlations inter- uncorrelated [9]: an assumption which is generally violated polating between these limiting cases and captured through a by a wide variety of realistic error processes with long tempo- noise power spectral density. Our results provide simple an- ral correlations [14, 25, 26]. Violation of this assumption has alytic forms for the probability density functions of fidelity been noted previously as a cause of increased variance of out- outcomes over RB randomizations as a gamma distribution, comes over randomizations, but until now there has not been building connections to engineering literature on failure anal- a quantitative means to understand the impact of such tem- ysis. We describe the impact of these observations on the poral correlations, a detailed physical mechanism to explain interpretation of RB measurement outcomes in various noise environments and highlight the disconnect between measured RB error and metrics relevant to fault-tolerance such as worst- case errors. ∗ Correspondence to: [email protected] 2
II. Randomized Benchmarking Procedure where angle brackets in both subscripts indicate averaging over both rows and columns (that is, all elements) of F (J). A common experimental implementation of single-qubit The quantity µˆ(J) is therefore an estimator of the true mean (J) RB involves measuring the fidelity of net operations com- fidelity µ ≡ hFiη,δ, formally obtained as the expectation posed from random sequences of Clifford operations. For a over all possible fidelity outcomes F defined on the support sequence of length J, the net operation is written as the prod- of the random variables η and δ. uct In the standard RB procedure, measurements of µˆ(J) for J increasing J are fitted to an exponential decay from which Y the mean gate error pRB is extracted as the decay constant. Sη ≡ Cˆη (1) j Implicit in this procedure, however, is the assumption that, j=1 for any random set of Clifford sequences, the resulting dis- of Clifford operators Cˆ (see Table III) indexed by the se- tribution of fidelity outcomes represents the underlying error ηj process fairly, and the total mean µˆ(J) is reasonably repre- quence η = (η1, η2, ..., ηJ ), where the ηj are random vari- ables uniformly sampled from the set {1, 2, ..., 24} labelling sentative of any individual sequence. That is, the distribution (J) (J) the elements of the Clifford group. Typically one makes the of values F i,h·i is symmetric about µˆ with small relative constraint on η that, absent any error processes, the ideal RB variance for any random set {η1, ..., ηk}. J (J) sequence performs the identity operation Sη ≡ I. Of the 24 In this paper we show µˆ to be an unbiased and effec- total sequence combinations for a given J, only a small sub- tive estimator only when the error process is truly Markovian. set of k 24J random Clifford sequences is implemented That is, it possesses no temporal correlations. We provide a in practice. Each sequence is repeated (with appropriate qubit detailed analytic calculation of the impact of realistic, corre- reinitialization) n times to build an ensemble of fidelity mea- lated noise on the distribution of outcomes over different ran- surements sampling and the underlying error process. These domizations. To examine this we introduce a physical model measurement outcomes are then averaged together to obtain compatible with the experimental procedure, and which per- useful information from projective qubit measurements, with mits accounting for the presence of noise correlations. the mean fidelity retained for each random sequence. We represent the space of measurement outcomes by the III. Physical model (k × n) matrix In this section we develop the mathematical framework used to model and investigate the impact of noise correla- δ δ ... δ 1 2 n tions on RB. Our model assumes a Unitary error process with η F F ...F 1 1,1 1,2 1,n temporal correlations, reflecting typical experimental condi- η2 F2,1 F2,2 ...F2,n tions where qubit rotations, e.g. from a fluctuating classical F (J) ≡ (2) . . . .. . field, dominate. This process is generically described by a . . . . . power spectral density (PSD) capturing the various correla- η F F ...F k k,1 k,2 k,n tion timescales. Our error model is fully general and considers universal (multi-axis) errors. For simplicity of technical pre- where element F ∈ [0, 1] is the measured fidelity when im- i,j sentation we treat single-axis dephasing in the main text and plementing the ith Clifford sequence η = (η , η , ..., η ) i i,1 i,2 i,J provide the full presentation of universal noise in Appendix in the presence of the jth realization of the error process, de- D. noted by δ . For clarity, we make the distinction between the j The main result of this section is the geometric interpre- measured fidelity F (e.g. obtained in experiment via pro- i,j tation of error accumulation in RB sequences in terms of a jective qubit measurements) and the calculated fidelity (see random walk in 3 dimensions (a step taken for each Clifford), Eq. 5) serving as a proxy for F in our analytic framework. i,j with step lengths and directionality inheriting the correlation The process of repeating each η and averaging the fidelity i structure of the error process. This result is independent of outcomes over the finite sample of noise realizations therefore any choice of representation of the Clifford group, though our corresponds to a measurement of the “noise-averaged” fidelity numeric simulations later used to verify it do rely on a partic- for the ith sequence, represented by the quantity ular representation (see Table III). n (J) 1 X A. Noise-averaged fidelity hFi F ≡ F , i ∈ {1, ..., k} (3) i,h·i n i,j j=1 We begin by introducing our metric to quantify the reduc- tion in fidelity for a given η and δ. We employ the trace fi- where the angle brackets h·i in the second subscript denote av- delity eraging over the columns of F (J). Similarly the mean fidelity 2 2 averaged over both Clifford sequences and errors is denoted 1 † ˜ 1 ˜ F (η, δ) ≡ Tr S Sη,δ = Tr Sη,δ (5) 2 η 4 k n (J) (J) 1 X X µˆ ≡ F = Fi,j (4) capturing the overlap between ideal, S , and noise-affected h·i,h·i kn η i=1 j=1 sequences, S˜η,δ, via the Hilbert-Schmidt inner product. For 3 inputs ηi and δj this provides us with a computational metric As we will show, the noise interaction “steers” the operator corresponding to measurements Fi,j obtained in experiment. product away from the identity gate performed by the ideal Consequently our proxy for the measured noise-averaged fi- sequence Sη and reduces the operational fidelity in some way (J) characteristic of the correlation structure of δ. delity F i,h·i takes the computational form
1 2 C. Analytic Expression for Sequence Fidelity ˜ hFiδ,n = Tr Sη,δ (6) 4 δ,n 1. Series expansion for Sequence Fidelity where h·iδ,n denotes an ensemble average over n realizations We begin by obtaining an approximation for hFi. As RB is of δ, and explicit reference to η and δ in the argument of F intended to estimate errors too small to resolve in a single gate has been dropped. Henceforth we also drop the subscripts implementation, we assume the noise strength per gate, σ, is and denote the calculated noise-averaged fidelity simply by small. Over long gate sequences the accumulation of errors hFi. In the following sections we proceed with our main task: may be quantified by Jσ2, which can be large for sufficiently 2 deriving the probability density function (PDF) of hFi, and its large J. Assuming Jσ . 1, however, we can make analytical dependence on the correlation structure of δ. progress by expressing the error unitaries in terms of a power series truncated at O σ4. The noise-affected sequence Eq. 7 B. Error model is therefore approximated by To reiterate, a Clifford sequence η implements the opera- QJ ˆ ˆ th J 2 3 4 ! tion Sη ≡ i=1 Cηi , where Cηi is the ηi Clifford generator. Y δj iδj δj S˜ ≈ + iδ Zˆ − Zˆ2 − Zˆ3 + Zˆ4 Cˆ (8) The random variables ηi are uniformly sampled from the set η,δ I j 2 6 24 ηj {1, 2, ..., 24}, and are assumed to be independent and iden- j=1 tically distributed (i.i.d.), subject to the technical constraint The full expansion of Eq. 8 generates products of order up Sη ≡ I. Unitary errors are implemented by interleaving to O QJ δ4 = O σ4J . Let ξ(n) denote the Sη with a sequence of stochastic qubit rotations, yielding the j=1 j k1,k2,...,km noise-affected operation m kρ O Q δ = O (σn) product due to cross-multiplying ρ=1 jρ m ˜ ˆ ˆ ˆ ˆ k1 ˆ k2 ˆ km P Sη,δ ≡ U1Cη1 U2Cη2 ...UJ CηJ . (7) terms like (δj1 Z) (δj2 Z) ...(δjm Z) , where ρ=1 kρ = n. Retaining terms only up to fourth order (n = 4), consis- This approach holds for arbitrary unitaries, Uj, enacting rota- tent with the order of our original Taylor approximation, we tions in any Cartesian direction; a full treatment for universal thereby obtain noise models appears in Appendix D. However, in the follow- ˜ (0) (1) (2) (2) (3) (3) (3) ing we restrict ourselves to the subset of unitaries enacting Sη,δ ≈ ξ + ξ1 + ξ1,1 + ξ2 + ξ1,1,1 + ξ2,1 + ξ3 a sequence of dephasing rotations, Uj ≡ exp(−iδjZˆ), pa- + ξ(4) + ξ(4) + ξ(4) + ξ(4) + O σ6 . (9) rameterized by the time-series vector δ = (δ1, δ2, ..., δJ ). We 1,1,1,1 1,1,2 2,2 4 assume the error process is wide-sense stationary (i.e. the for The first term in this expansion is identical to the ideal Clif- errors δ and δ the two-point correlation function hδ δ i = i j i j ford sequence, ξ(0) ≡ S . Higher-order terms capture suc- hδ δ i depends only on the time difference k = i − j), and η i i−k cessive error contributions and are composed of blocks, or has zero mean. ˆ Temporal noise correlations are established by introducing subsequences, within Sη interrupted by one or more Z er- correlations between the elements of δ. In this work we treat rors, indicated by the subscripts. To evaluate Eq. 5 we must obtain expressions for the quantities Q(n) ≡ three distinct cases k1,k2,...,km 1 (n) 2 2 Tr ξk ,k ,...,k . Detailed derivations for these terms are (a) Markovian process: Elements δj ∼ N (0, σ ) are i.i.d. 1 2 m Gaussian-distributed errors with zero mean and vari- given in Appendix C. ance variance σ2, and completely uncorrelated between 2. Mapping from Clifford Sequence to “Pauli space” distinct Clifford gates in any sequence Sη (correlation length 1). We compute the terms in the power series by relating a 2 given Clifford sequence to a random walk in “Pauli space”. (b) DC process: Elements δj ≡ δ ∼ N (0, σ ) are identical For illustrative purposes, we consider in detail only the over a given sequence Sη (maximally correlated, corre- lation length J), but are i.i.d. (uncorrelated) Gaussian- quadratic terms: distributed errors with zero mean and variance variance (2) X ˆ ˆ σ2 over different instances. ξ1,1 = − δjδkC1,j−1ZCj,k−1ZCk,J (10) j We now define the cumulative operators Km giving the prod- Consequently, R~ is a random walk in “Pauli space”. uct of the first m Clifford operations Taking half the trace of Eq. 9 and substituting in Eq. 20, (2) 1 the term Q cancels out in the expression for Tr S˜η,δ . K ≡ C = Cˆ ...Cˆ , (12) 2 2 m 1,m η1 ηm This further simplifies by observing that † −1 where K0 ≡ KJ ≡ I, and Km = Km is also a Clifford ( 1 k 0, k odd operator (since it is a product of Clifford operators). Any sub- Tr C1,j−1Zˆ Cj,J = (21) sequence Cjk therefore “factorizes” as 2 1, k even ˆ† ˆ† ˆ ˆ ˆ ˆ † This follows from (a) the cyclic property of the trace, (b) the Cjk = Cη ...Cη Cη1 ...Cηj−1 Cηj ...Cηk = Kj−1Kk (13) j−1 1 ˆk technical constraint Sη ≡ C1,J ≡ I, and (c) Z is either Z or allowing us to rewrite I depending on whether k is odd or even respectively. Using Eq. 21 we find Q(0) = 1 and Q(1) = Q(3) = Q(3) = 0, ˆ ˆ 1 1,2 3 C1,j−1ZCj,k−1ZCk,J = PjPk (14) leading to the simplified expression where the operators on the right hand side are defined by 1 1 Tr S˜ ≈ 1 − kR~ k2 + Q(3) (22) 2 η,δ 2 1,1,1 ˆ † ˆ ˆ ˆ Pm ≡ Km−1ZKm−1 ∈ {±X, ±Y, ±Z} (15) (4) (4) (4) (4) (4) + Q1,1,1,1 + Q1,1,2 + Q1,3 + Q2,2 + Q4 . for 0 ≤ m ≤ J, and are always signed Pauli operators (since Substituting this into Eq. 6 and retaining only terms up to the Clifford group is the normalizer of the Pauli group). We O σ4 we obtain may therefore express the operators Pm in the basis of Pauli operators as hFi ≈ 1 − hkR~ k2i + O(4), (23) P = x Xˆ + y Yˆ + z Zˆ (16) m m m m where where xm, ym, zm ∈ {0, ±1} subject to the constraint 1 D E D (4)E D (4)E 2 2 2 O(4) ≡ kR~ k4 + 2 Q + 2 Q . (24) |xm| + |ym| + |zm| = 1 (only one nonzero coefficient). 4 2,2 4 The unit vector defined by To a good approximation O(4) may be treated as a small cor- rˆm ≡ (xm, ym, zm), krˆmk = 1 (17) rection in the form of a small constant, with the statistical distribution properties residing in the leading-order term, the therefore points uniformly at random along one of the princi- random variable hkR~ k2i. In arriving at these expressions we ple Cartesian axes {±xˆ, ±yˆ, ±zˆ}, and maps the “direction” have used the assumption that the error process has zero mean, of the operator Pm in Pauli space. Thus we have constructed from which it follows only terms with noise random variables a map from a given random Clifford sequence of length J, raised to even powers, or those summed over terms raised to a set of J unit vectors each oriented at random along the to even powers, survive the ensemble average with the oth- cartesian axes. (3) (4) ers reducing to zero. In particular, hQ1,1,1i = hQ1,1,1,1i = With these insights the error contributions may be recast (4) (4) hQ1,1,2i = hQ1,3i = 0 (see Appendix C). into more convenient expressions by moving to vector nota- tion. In particular, taking the trace over Eq. 14 and using the cyclic composition properties of the Pauli matrices, we find IV. Probability Distribution Functions for RB Outcomes via a Geometric Interpretation of Error Accumulation 1 In the expressions above we see that the key metric cap- Tr (PjPk) = rˆj · rˆk. (18) 2 turing the reduction of fidelity in a randomized benchmarking From Eqs. 10, 14 and 18 we therefore obtain sequence is J (2) X Q = − δ δ rˆ · rˆ . (19) X 1,1 j k j k R~ ≡ δjrˆj (25) j Observing the quantity δ δ rˆ · rˆ is invariant under ex- 3 j k j k which may be interpreted as a random-walk in R “Pauli change of indices, we may recast the restricted sum over space” generated by adding J randomly-oriented steps along j < k into an unrestricted sum, picking up a residual term the principle Cartesian axes, with step lengths specified by (2) 1 PJ 2 Q2 = − 2 j=1 δj (see Appendix C), to obtain δ. Walks which terminate far from the origin correspond to sequences with large net infidelities while those which ulti- J (2) 1 (2) X mately end near the origin have small infidelities. These ob- Q = − kR~ k2 − Q , R~ ≡ δ rˆ . (20) 1,1 2 2 j j servations form a key contribution of this work, as we will j=1 show. 5 This geometric picture provides a unique insight into how to any given Cartesian axis. The probability of finding any error accumulates in RB sequences, and facilitates calcula- particular combination (nx, ny, nz) is therefore given by the tion of the distribution of hFi incorporating the effects of cor- multinomial distribution relations in the error process. In particular, the calculation nx ny nz of sequence fidelities maps onto the distribution of random J! 1 1 1 P (nx, ny, nz) = (29) walk, with possible correlations manifesting the random step nx!ny!nz! 3 3 3 lengths. For J 5, however, this is sufficiently peaked around In the following sections we explore how correlations in & n = J/3 that we may simply regard these values as fixed δ affect the distributions of the terms in Eq. 23, and hence x,y,z without significant error. In this case ∆ ∼ N 0, Jσ2/3 the probability density function (PDF) f (F ) of the noise- x,y,z hFi ~ 2 averaged fidelity hFi. Our calculations demonstrate that un- reduce to i.i.d. random variables. The distribution of kRk consequently reduces to chi-square distribution with 3 degrees der all error models treated here fhFi(F ) takes the form of a gamma distribution, which is well known in statistics and of freedom. It is more convenient, however, to express this in provides the significant benefit in that explicit analytic forms more general terms as a member of the two-parameter family are available for both the moments of the distribution and the of gamma distributions (see Eq. A5), of which the chi-square moment-generating functions (see Table I). Interestingly the is a special case. Specifically, we obtain gamma distribution is known to be useful for failure analysis 3 2Jσ2 and life-testing in engineering [27] which bears some similar- kR~ k2 ∼ Γ(α, β) , α = , β = (30) ity to the notion of error accumulation in RB. 2 3 α β A. PDF for Markovian Processes with shape parameter and scale parameter . An ensem- ble average over n independent noise realizations is therefore In the Markovian limit (i.e. uncorrelated noise), we assume specified by all noise random variables δj are i.i.d. Hence R~ corresponds n 2 to a J-length unbiased random walk with step lengths sam- ~ 2 1 X ~ 2 ~ 2 3 2Jσ 2 hkRk in = kRk , kRk ∼ Γ , (31) pled from the normal distribution N 0, σ . Since these n j j 2 3 step lengths are symmetrically distributed about zero, the j=1 distributions of the components of the walk vector δ rˆ = j j where the kR~ k2 are i.i.d. gamma-distributed random vari- (δ x , δ y , δ z ) are invariant with respect to the sign of the j j j j j j j ables. But the sample mean over n gamma-distributed ran- coefficients α in all Cartesian directions α ∈ {x, y, z}. Ig- j dom variables simply yields a rescaled gamma distribution noring the signs we therefore treat the the coefficients as bi- with α → nα and β → β/n (see Eq. A7). Consequently naries αj ∈ {0, 1}, where the zero event simply reduces the number of steps taken in that direction. Let 2 2 3n 2Jσ hkR~ k in ∼ Γ , (32) J 2 3n X nα ≡ |αj| , α ∈ {x, y, z}, nx + ny + nz = J (26) ~ 2 2 i=1 with expectation E[hkRk in] = Jσ and variance [hkR~ k2i ] = 2 J 2σ4n−1. count the total number of nonzero components in each V n 3 Cartesian direction over the sequence of walk vectors Higher-order contributions may be included by computing {rˆ1, rˆ2, ..., rˆJ }. Then the terms in O(4). For full derivations see Appendix E; here 2 ~ x x y y z z we sketch the result. From the known distribution of kR~ k R = δ1 + ... + δn , δ1 + ... + δn , δ1 + ... + δn (27) x y z in Eq E6, the PDF for kR~ k4 is given by a transformation al- where the superscripts in δα indicate summing only over the lowing us to compute the expectation and variance. Taking an j ensemble average over noise realizations, applying the cen- subset of δj for which the coefficients αj are nonzero. Thus we have tral limit theorem, and observing the narrowness of the re- sulting distribution compared to the leading-order term, we ~ 2 2 2 2 ~ 4 ~ 4 5 2 4 kRk = ∆x + ∆y + ∆z (28) make the approximation hkRk i ≈ E[kRk ] = 3 J σ . The (4) 1 4 remaining terms yield values hQ2,2i = 8 J(J − 1)σ and ∆ ≡ δα + δα + ... + δα where we have defined α 1 2 nα for (4) 1 4 2 hQ4 i = 8 Jσ . Substituting these into Eq. 24 we obtain the α ∈ {x, y, z}. Since all δj ∼ N 0, σ are i.i.d., the new (4) 2 2 4 2 4th order correction O = 3 J σ and the noise-averaged random variables ∆α ∼ N 0, nασ are also independent fidelity reduces to 2 and normally distributed, with zero mean and variance nασ . The distribution of the sum of squares of non-identical 2 hFi ≈ 1 − hkR~ k2i + J 2σ4 (33) Gaussians is generally complicated to write down, requiring n 3 a generalized chi-square distribution. To avoid this we make 2 the following modest approximation. Since the vectors rˆj are inheriting the gamma distribution of hkR~ k in described in 1 uniformly-distributed there is a 3 probability of being parallel Eq. 32. We linearly transform this expression to produce a 6 final PDF for noise-averaged fidelity in the Markovian regime: Markovian DC Block-correlated α 3 n 3 3 J/(M − 1) f (F ) ≡ ν(F )α−1e−ν(F )/ββ−α/Γ(α), (34) 2 2 2 hFi 2 2 2 2 2 2 β 3 Jσ /n 3 Jσ 3 (M − 1)σ 2 2 4 where Γ(x)is the gamma function, and the quantities ν(F ), α ν(F ) 1 − F + 3 J σ 1 − F 1 − F and β are defined in Table I. 2 2 2 4 2 2 E 1 − Jσ + 3 J σ 1 − Jσ 1 − Jσ 2 2 2 2 4 1 2 2 2 M−1 B. PDF for DC (quasi-static) processes M 1 − Jσ 1 − 3n + 3 J σ 1 − 3 Jσ 1 − Jσ 1 − 3 J 2 2 4 2 2 4 2 4 V 3 J σ /n 3 J σ 3 J(M − 1)σ In the DC limit (i.e. quasi-static noise) we assume all noise p p p S −2 2/3n −2 2/3 −2 2(M − 1)/3J random variables δj ≡ δ are identical (maximally correlated) over a given sequence Sη. However over separate instances δ 2 TABLE I. Scale and shape parameters for the respective gamma dis- is sampled from the normal distribution δ ∼ N 0, σ . Then tributions, and calculated moments for noise-averaged fidelity dis- ~ R corresponds to a J-step unbiased random walk with fixed tributions fhFi(F ). Moments obtained from standard gamma dis- step length δ directed along the Cartesian axes. In this case tribution after appropriate linear transformation specified by ν(F ): the noise random variables δ and Clifford-dependent random Expectation, E [hFi] = ν + F − αβ; Mode, M [hFi] = ν√+ F − (α − 1)β [hFi] = αβ2 [hFi] = −2/ α variables rˆj factorize, allowing us to express ; Variance, V ; Skew, S . J ~ ~ ~ X R = δV , V ≡ rˆj (35) ward to show that the PDF for R is j=1 3/2 2 3 2 −3r f (r) = 4πr e 2J . (38) where V~ ∈ R3 defines an unbiased random walk on a 3D lat- R 2πJ tice generated by adding J unit-length steps. Thus, in contrast with the Markovian case, here the random walk in Pauli space This expression describes a random walk of unit step length, is unaffected (step-by-step) by the noise interactions. Rather, and is derived assuming that rˆj is uniformly and continuously in a given run, the noise variables δ effectively scale the ran- sampled from all directions in R3, however Eq. 38 is a good dom walk V~ generated by the Clifford sequence, up to a sign. approximation for the PDF of a walk on a 3D lattice [28]. The 2 2 Since we are interested in the norm square kR~ k2 = δ2kV~ k2, distribution of the distance square kV~ k ≈ R is then given however, any sign dependence of δ vanishes. Performing a fi- by the transformation (see Eq. A3) nite ensemble average over n noise randomizations we there- 1 fore obtain hkR~ k2i = hδ2i kV~ k2, and Eq. 23 yields f (x) = f x−1/2 (39) n n kV~ k2 2x1/2 R 2 ~ 2 (4) 1 x hFi ≈ 1 − hδ inkV k + O (36) = xα−1 exp − (40) Γ(α)βα β In this case, O(4), includes a term 1 hδ4i kV~ k4 which is now 4 n where α = 3/2 and β = 2J/3, and Γ(x) is the gamma func- highly correlated with the leading-order contribution, so we tion; again this is a gamma distribution (see Eq. A5) with cannot use the expectation value as a proxy for the whole dis- shape parameter α and scale parameter β. Consequently tribution. However corrections from these terms are O(σ4), so 2 we ignore these terms, and formally study the limit Jσ 1. 3 2J Numerical evidence indicates that this approximation works kV~ k2 ∼ Γ , . (41) 2 3 well up to Jσ2 ∼ 1. 2 2 Since δ ∼ N 0, σ is normally distributed, δ is chi- Thus, to leading order, the PDF for hFi with DC noise square-distributed which is a special case of the gamma distri- is specified by the product of two independent gamma- 2 bution, δ ∼ Γ(α, β) with shape parameter α = 1/2 and scale distributed random variables. The closed-form expression can 2 parameter β = 2σ . Taking the sample mean over n inde- be calculated by direct integration (see Appendix F), however pendent gamma-distributed variables again yields a rescaled for moderate ensemble sizes n & 50 it is sufficient to approx- gamma distribution with α → nα and β → β/n 2 2 imate hδ in as strongly peaked around its mean, σ , such that hFi ≈ 1−σ2kV~ k2. In this case the PDF f (F ) reduces to a n 2σ2 hFi δ2 ∼ Γ , (37) ~ 2 n 2 n linear-transformed gamma distribution associated with kV k and possesses the same form as Eq. 34 but with different val- 2 2 2 ues of the parameters ν(F ), α and β, as given defined in Ta- with expectation E[hδ in] = σ and variance V[hδ in] = 4 ble I. This is a remarkable observation given the substantial 2σ . n differences in error correlations between these extreme limits. The random-walk behavior of kV~ k2 (Eq. 35) represents a C. PDF for Generically-Correlated Processes well-studied problem in diffusion statistics. Let the random variable R be the distance from the origin in a symmetric For both limiting cases of Markovian and DC noise cor- (Bernoulli) 3D random walk after J steps. It is straightfor- relations treated above we showed hFi is described, to first 7 order, by the gamma distribution with shape and scale pa- autocorrelation function rameters dependent on the correlation structure. Assuming continuity of the distribution we also expect hFi to be ap- Cδ(k) ≡ hδjδj+kij (44) proximately gamma-distributed for generic, intermediate cor- relation structures. We can show this formally for a spe- where the ensemble average is over the index j. Substituting cific class of correlated noise models, in which the noise these definitions into Eq. 25 the expectation and variance of ~ 2 is block-correlated; i.e. the error random variables δj are hkRk i are given by (see Appendix H) constant over blocks, or subsequences, of Clifford operators J−1 of fixed length M ≤ J, and there is no correlation be- 4 X = JC (0), = (J − k)(C (k))2. (45) tween distinct blocks. The full derivation of the PDF for this E δ V 3 δ case can be found in Appendix G. We find block-correlated k=1 noise yields a gamma-distributed fidelity, with parameters In experiment one typically has access to the PSD S(ω) rather α = 3J/2(M − 1), β = 2(M − 1)σ2/3, enabling us to in- than the autocorrelation function. However this easily maps to terpolate between the Markovian (M = 1) and DC (M = J) our framework via an inverse Fourier transform limits for arbitrary correlation length M (see Table I). Z ∞ While block-correlated noise is not stationary (i.e. it does −iωkτg Cδ(k) = S(ω)e dω. (46) not have a stationary power spectrum), it simply and explicitly −∞ captures the notion of a correlation length, M. The correla- tion length thus manifests itself in the distribution of fidelities. The PDF therefore has the same form as Eq. 34 but with α With these insights, for brevity we assume that generic noise and β given by Eq. 43, and ν(F ) = 1 − F . correlations also give rise to gamma-distributed infidelities. With these expressions we now have a complete analytic This is supported by the quantitative calculations and qualita- representation of the distribution of measured noise-averaged tive arguments in Appendix G, and by comparison with Monte fidelities over different RB sequences. In the next section we Carlo numerics. Consequently, the shape and scale factors can verify these results with numeric Monte Carlo simulations and be inferred from the mean and variance of the distribution, discuss the differences in distribution characteristics depend- which we calculate directly. ing on noise correlations. For simplicity, we assume the error process is sufficiently D. Comparing PDFs for Various Noise Correlations non-Markovian that the noise ensemble size n and the O σ4 contributions may be ignored without introducing large errors, The analytic forms for fhFi(F ) derived above serve as a as in the DC case. Thus, we write tool to analyze the impact of temporal noise correlations on RB experiments. In all cases (Markovian, DC, and Intermedi- hFi ≈ 1 − hkR~ k2i, hkR~ k2i ∼ Γ[α, β] . (42) ate) the PDF is Γ(α, β)-distributed, with differences captured in the values of the shape parameter α and scale parameter β. From the first two moments of the gamma distribution, the This result is derived from statistics of a 3D random walk. expectation and variance of the random-walk variable are Plotting the PDF fhFi(F ) in Fig. 1 immediately reveals E[hkR~ k2i] = αβ and V[hkR~ k2i] = αβ2, so the shape and substantial differences in the distribution of outcomes for the scale parameters are given by two limiting cases. While the distributions in both Marko- vian and DC cases yield the same value for the mean (the 2 α = E , β = V. (43) statistic currently used in RB protocols), the higher order mo- V E ments diverge significantly. For DC noise fhFi(F ) is skewed towards high-fidelities and possesses a variance significantly The task of defining the gamma distribution therefore reduces larger than that for Markovian errors of equivalent strength, ~ 2 to computing the expectation and variance of hkRk i, as a parameterized by the value of σ2. Averaging over a large function of the generic correlation structure of the error pro- noise ensemble results in the mode converging to the mean cess, δ. in the Markovian regime, but maintaining a fixed higher value An arbitrary time-series x(t) may be characterized by of fidelity for the DC regime. By comparison, the variance an autocorrelation function Cx(τ) = hx(t)x(t + τ)it and skew for Markovian noise diminish with increasing noise where the ensemble average is over all t, and τ is the averaging, but remain fixed and nonzero in the DC case. time difference between measurements. In this case, in- To compare our analytic PDFs with the true distributions voking the Wiener-Khintchine theorem, the power spectral- we directly simulate the fidelity outcomes associated with the density (PSD) is given by the Fourier transform S(ω) = ∞ metric in Eq. 5. For a given J, we generate an ensem- √1 R C (τ)eiωτ dτ. We make use of these relations by 2π −∞ x ble {η1, ..., ηk} of k random Clifford sequences. The first discretizing the time-series. Let the elements of δ be defined J − 1 elements in each sequence ηi are uniformly and in- by δj = x(tj) for tj/τg ∈ {1, 2, .., J} where τg is the time dependently sampled from the set {1, ..., 24}, with the final taken to perform a Clifford operation. The underlying error element chosen such that the total operator product performs process is thereby discretely “sampled” by the Clifford se- the identity Sηi ≡ I. For each sequence ηi we then generate quence Sη, and correlations between elements of δ separated an ensemble {δi,1, ..., δi,n} of n random and independent re- by a time interval of “k gates” are specified by the (discrete) alizations of the error process, where each δi,j is a sequence 8 0.0 0.5 1.0 f ( F) (Normalized) array reduces to a column vector containing k noise-averaged 150 Markovian recent experiments [17]. Agreement with analytics is good c) d) DC 2 (with no free parameters) for Jσ . 1, beyond which higher order error terms contribute to the distribution. ) 100 F ( J =500 For correlated noise, similarly good agreement is obtained. 40 -2 higher-frequency components (shorter correlation times) re- e) Markovian 10 500 duces the variance and partially restores symmetry. In the ) (ω) -3 10 F S limit of power spectra containing substantial high-frequency ( 20 -4 DC 10 250 noise, small shifts due to higher-order terms in Eq. 24 become sampling (experimentally achieved by repeating a fixed se- quence Sη and averaging over the resulting projective mea- (M) surements), and the required kmin . Increasing the noise aver- (M) aging, n, reduces the value kmin . Beyond the question of how well measurements performed in strongly correlated environments estimate pRB, there is un- certainty surrounding the breadth of applicability of this single proxy metric as a general quantum verification tool for quan- tum information. One key observation is that while the vari- ance and skew of the fhFi(F ) converge to zero for Markovian errors in the limit of infinite noise averaging (n → ∞), both remain fixed for DC noise. Accordingly our results provide direct evidence of the divergence between pRB and parameters FIG. 2. Filter functions {G(ω; η)} for ensemble of RB sequences. relevant to fault-tolerance [35] such as worst-case errors [36] Dimensionless angular frequency ω normalized to the duration of a in noise environments with strong temporal correlations; in bit-flip operation, τπ; low-frequency (ω < 1) noise susceptibility is the DC limit the worst-case error can be much larger than the captured by vertical offsets and slopes. Inset: Histogram of calcu- average error. −4 −1 lated fidelities for {G(ω; η)} using S(ω) ∝ δ(ω − 4π × 10 τπ ) Finally, the fact that some RB sequences are intrinsically for quasi-DC noise (scaled to correspond to σ = 0.015) and overlaid “blind” to correlated errors, highlights potential shortcomings f (F ) with hFi . in performing experimental gate optimization by maximizing µˆ(J) at fixed J; an operator may optimize an experimental parameter (to maximize the RB fidelity) in such a way to in- V. Discussion and Conclusion crease systematic errors in individual gates. Such issues may be partially mitigated by selecting subsets of valid RB se- Our primary observation is that the form of f (F ) – and hFi quences using the length of the random walk, ||R~ ||, in the DC in particular the moments of the distribution – can exhibit limit to ensure fidelities are not overestimated in a RB proce- strikingly different behaviors in different error regimes. This [hFi] ≡ [F] dure. Future work will explore the use of entropic measures to is true despite the fact that the expectation E E J ~ approximately converges for Markovian and DC cases for associate RB sequence structure with ||R|| without the need weak noise (see Table I). This has a variety of important im- to perform full calculations of the random walk. pacts in the application and interpretation of RB experiments We conclude that the interpretation and applicability of a for quantum information. measured RB outcome pRB can differ significantly depending on the nature of the underlying errors and measurement pa- For instance, due to the form of fhFi(F ) for low frequency (J) rameters experienced in a real experimental situation. This noise, the sample estimator µˆ obtained from an ensem- challenge can be partially mitigated through presentation of ble of k 24J instances of η can differ from the true ex- more complete datasets – specifically F i,h·i for each sequence pectation E[F]J , generally leading to overestimation of the – in order to assist readers making comparisons between re- mean fidelity [12, 13]. This risks systematically underesti- ported results. We believe the new insights our calculations mating pRB due to insufficient sampling over η. The differ- p (J) have revealed will help bound the utility of RB in quantum in- ence |µˆ −E[F]J | may be formally bounded using moment- formation settings and also help experimentalists ensure that generating functions for the gamma distribution (see Supple- measurements are not subject to hidden biases. mental Material) to provide a more inclusive bound than pre- vious approximations [12, 13]. Ensuring µˆ(J) falls within an Acknowledgments acceptable confidence interval, say within ±10% of E[F]J Acknowledgements: The authors thank J. M. Chow, J. (M) −4 Gambetta, and A. Yacoby for useful discussions on RB (and assuming σ = 0.015, or pRB ∼ 2 × 10 ) requires (M) measurements, and L. Viola for discussions on random- kmin > 9 randomizations be selected in the Markovian case, (DC) ized decoupling. Work partially supported by the ARC but kmin > 443 for DC case. These values increase rapidly Centre of Excellence for Engineered Quantum Systems as confidence bounds tighten. These sample sizes are much CE110001013, the Intelligence Advanced Research Projects larger than typically employed in experimental settings, but Activity (IARPA) through the ARO, the US Army Research may be partially relaxed when calculating pRB by fitting to Office under Contracts W911NF-12-R-0012, W911NF-14-1- measurements performed for multiple values of J. In the 0098 and W911NF-14-1-0103, and a private grant from H. & Markovian regime we also find a tradeoff between finite noise A. Harley. STF acknowledges support from an ARC Future Fellowship FT130101744. [1] James D F V, Kwiat P G, Munro W J and White A G 2001 Phys. [2] O’Brien J L, Pryde G J, Gilchrist A, James D F V, Langford Rev. A 64(5) 052312 N K, Ralph T C and White A G 2004 Phys. Rev. Lett. 93(8) 10 080502 Phys. Rev. Lett. 110(14) 147002 [3] Emerson J, Alicki R and Zyczkowski˙ K 2005 Journal of Optics [27] Papoulis A 1991 Probability, Random Variables, and Stochastic B: Quantum and Semiclassical Optics 7 S347 Processes (New York: McGraw Hill) [4] Lévi B, López C C, Emerson J and Cory D G 2007 Phys. Rev. [28] V~ actually describes a lattice walk and should strictly be de- A 75(2) 022314 scribed by a probability mass function (PMF). However this in- [5] Bendersky A, Pastawski F and Paz J P 2008 Phys. Rev. Lett. troduces unnecessary complications in the form of combinato- 100(19) 190403 rial expressions to account for the directional discretization, and [6] López C C, Lévi B and Cory D G 2009 Phys. Rev. A 79(4) in the interest of analytic simplicity it is entirely appropriate to 042328 make this continuous approximation. Doing so simply smooths [7] Barz S, Fitzsimons J F, Kashefi E and Walther P 2013 Nature over the discreteness of the PMF while retaining all essential Physics 9 727–731 distribution characteristics. [8] Magesan E and Cappellaro P 2013 Phys. Rev. A 88(2) 022127 [29] Soare A, Ball H, Hayes D, Zhen X, Jarratt M C, Sastrawan J, [9] Knill E, Leibfried D, Reichle R, Britton J, Blakestad R B, Jost Uys H and Biercuk M J 2014 Phys. Rev. A 89 042329 J D, Langer C, Ozeri R, Seidelin S and Wineland D J 2008 Phys. [30] Green T J, Sastrawan J, Uys H and Biercuk M J 2013 New J. Rev. A 77(1) 012307 Phys. 15 095004 [10] Magesan E, Gambetta J M and Emerson J 2011 Phys. Rev. Lett. [31] Soare A, Ball H, Hayes D, Sastrawan J, Jarratt M, McLoughlin 106(18) 180504 J, Zhen X, Green T and Biercuk M 2014 Nat. Phys. 10 1 [11] Magesan E, Gambetta J M, Johnson B R, Ryan C A, Chow [32] Green T J, Uys H and Biercuk M J 2012 Phys. Rev. Lett. 109 J M, Merkel S T, da Silva M P, Keefe G A, Rothwell M B, Ohki 020501 T A, Ketchen M B and Steffen M 2012 Phys. Rev. Lett. 109(8) [33] Ryan C A, Laforest M and Laflamme R 2009 New J. Phys 11 080505 013034 [12] Wallman J J and Flammia S T 2014 New Journal of Physics 16 [34] Viola L and Knill E 2005 Phys. Rev. Lett. 94(6) 060502 103032 [35] Gutiérrez M and Brown K R 2015 Phys. Rev. A 91(2) 022335 [13] Epstein J M, Cross A W, Magesan E and Gambetta J M 2014 (Preprint 1501.00068) Phys. Rev. A 89(6) 062321 [36] Sanders Y R, Wallman J J and Sanders B C 2015 [14] Biercuk M, Uys H, VanDevender A, Shiga N, Itano W M and arXiv:1501.04932 Bollinger J J 2009 Quantum Info & Comp. 9 920–949 [15] Olmschenk S, Chicireanu R, Nelson K D and Porto J V 2010 New Journal of Physics 12 113007 [16] Gaebler J P, Meier A M, Tan T R, Bowler R, Lin Y, Hanneke D, Jost J D, Home J P, Knill E, Leibfried D and Wineland D J 2012 Phys. Rev. Lett. 108(26) 260503 [17] Brown K R, Wilson A C, Colombe Y, Ospelkaus C, Meier A M, Knill E, Leibfried D and Wineland D J 2011 Phys. Rev. A 84(3) 030303 [18] Chow J M, Gambetta J M, Tornberg L, Koch J, Bishop L S, Houck A A, Johnson B R, Frunzio L, Girvin S M and Schoelkopf R J 2009 Phys. Rev. Lett. 102(9) 090502 [19] Kelly J, Barends R, Campbell B, Chen Y, Chen Z, Chiaro B, Dunsworth A, Fowler A, Hoi I C, Jeffrey E, Megrant A, Mu- tus J, Neill C, O’Malley P, Quintana C, Roushan P, Sank D, Vainsencher A, Wenner J, White T, Cleland A and Martinis J M 2014 Phys. Rev. Lett. 112(24) 240504 [20] Harty T P, Allcock D, Ballance C J, Guidoni L, Janacek H A, Linke N M, Stacey D N and Lucas D M 2014 Phys. Rev. Lett. 113(22) 220501 [21] Xia T, Lichtman M, Maller K, Carr A W, Piotrowicz M J, Isen- hower L and Saffman M 2015 Phys. Rev. Lett. 114(10) 100503 [22] Lu D, Li H, Trottier D A, Li J, Brodutch A, Krismanich A P, Ghavami A, Dmitrienko G I, Long G, Baugh J and Laflamme R 2015 Phys. Rev. Lett. 114(14) 140505 [23] Muhonen J T, Laucht A, Simmons S, Dehollain J P, Kalra R, Hudson F E, Freer S, Itoh K M, Jamieson D N, McCallum J C, Dzurak A S and Morello A 2015 Journal of Physics: Con- densed Matter 27 154205 [24] Nielsen M and Chuang I 2010 Quantum Computation and Quantum Information: 10th Anniversary Edn. (Cambridge: Cambridge University Press) [25] Bylander J, Gustavsson S, Yan F, Yoshihara F, Harrabi K, Fitch G, Cory D G, Nakamura Y, Tsai J S and Oliver W D 2011 Nat. Phys. 7 565 [26] Anton S M, Birenbaum J S, O’Kelley S R, Bolkhovsky V, Braje D A, Fitch G, Neeley M, Hilton G C, Cho H M, Irwin K D, Wellstood F C, Oliver W D, Shnirman A and Clarke J 2013 11 A. Mathematical Preliminaries 1. Linear transformation Let Y is a continuous, non-negative random variable described by probability density functions fY (y). If Z is the random variable defined by Z = αY + β, where α, β ∈ R and α 6= 0, then the probability density functions of Z is 1 z − β f (z) = f (A1) Z |α| Y α 2. Product distribution Let X and Y be to independent, continuous random variables described by probability density functions fX (x) and fY (y). Then the joint distribution of Z = XY is Z ∞ 1 fZ (z) = fX (x)fY (z/x)dx (A2) −∞ |x| 3. Strictly increasing function of a random variable Let X be an absolutely continuous non-negative random variable described by probability density function fX (x). Define the transformed random variable Y = g(X) where g(x) is a strictly increasing function. That is x1 > x2 ⇐⇒ g(x1) > g(x2), so that g(x) has a well-defined inverse g−1(x). Then the probability density functions of Y is dg−1(y) f (y) = f (g−1(y)) (A3) Y X dy 4. Sum of continuous random variables Let X and Y be two independent, continuous random variables described by probability density functionsfX (x) and fY (y). Then the probability density functions of the sum Z = X + Y is Z ∞ Z ∞ fZ (z) = fX (z − y)fY (y)dy = fY (z − x)fX (x)dx (A4) −∞ −∞ 5. Gamma distribution The gamma distribution describes a family of continuous probability distributions, related to beta distribution, and arising nat- urally in processes for which the waiting times between Poisson-distributed events are relevant. The common exponential distribution and chi-squared distribution are special cases. The gamma distribution is a two-parameter distribution, and there are a few different parametrizations in common use. Throughout this paper we parametrize the distribution in terms of its shape parameter α and a scale parameter β. Let X ∼ Γ(α, β) be a gamma-distributed random variable, then the probability density function is defined by 1 x f (x) = xα−1 exp − (A5) X Γ(α)βα β where Γ(x) is the gamma function. Table A 5 contains some relevant statistics. mean αβ 2 variance αβ√ skew 2/ α mode (α − 1)β TABLE II. Statistics of the gamma distribution Γ(α, β). 6. Distribution of sample mean of gamma-distributed random variables Let Xi ∼ Γ(α, β), i ∈ {1, ..., n}, be a set of n independent and identically distributed random variables sampled from the Pn gamma distribution with shape parameter α and scale parameter β. Then the sum Z = i=1 Xi follows the transformed gamma distribution Z ∼ Γ(nα, β) (A6) 12 The ensemble average Z/n has probability density function given by transforming nfZ (nx). Substituting into Eq. A5, and defining rescaled parameters α → nα and β → β/n, the distribution of the ensemble average hXiin is found to be hXiin ∼ Γ(nα, β/n) (A7) 7. Lindeberg-Levy central limit theorem 2 Let {X1,X2, ..., Xn} be a set of i.i.d. random variables with expectation E[Xi] = µ and variance V[Xi] = ρ < ∞. Then the 1 Pn sample average Sn = n i=1 Xi converges to the normal distribution √ 2 lim n(Sn − µ) ∼ N 0, ρ (A8) n→∞ 13 B. Clifford Group Representation for Single Qubit A unitary operation Cˆ is an element of the Clifford group if CPˆ Cˆ† = P (B1) where we have defined the Pauli group P = {±I, ±X,ˆ ±Y,ˆ ±Zˆ}. (B2) That is, the Clifford group is the normalizer of the Pauli group, where for every Pauli operation P ∈ P, there is another P 0 ∈ P such that CˆP Cˆ† = P 0. For a single qubit, the set of all such Cˆ may be thought of as rotations of the Bloch sphere that permute the orientation of ±X,ˆ ±Y,ˆ ±Zˆ in the Cartesian basis associated with the Pauli matrices, which we refer to as “Pauli space.” To obtain a clearer physical picture of these operations, consider associating Xˆ to any of the six Cartesian axes {±xˆ, ±yˆ, ±zˆ}. With this axis fixed, we may rotate the axes about Xˆ into four possible orientations while preserving XˆYˆ Zˆ right-handedness. This is the action of the Cˆ group: the group of rotational symmetries of the cube. We construct our representation as follows. Let Ri(θ) represent one of nine elementary unitaries generating a clockwise rotation (looking down the axis of rotation toward the origin) through angle θ ∈ {π, ±π/2} about axis i ∈ {x, y, z}. The three π rotations correspond to Rx,y,z(π) ≡ X,ˆ Y,ˆ Zˆ (B3) and we use the shorthand ± Ri ≡ Ri(±π/2), i ∈ {x, y, z} (B4) ± for the remaining six π/2 rotations. For example, the action of the operators Ri on the Pauli operators/axes is + ˆ ˆ ˆ ˆ ˆ ˆ Rx :(X, Y, Z) → (X, −Z, Y ) (B5) + ˆ ˆ ˆ ˆ ˆ ˆ Ry :(X, Y, Z) → (Z, Y, −X) (B6) + ˆ ˆ ˆ ˆ ˆ ˆ Rz :(X, Y, Z) → (−Y, X, Z) (B7) Products of these nine elementary operations generate a representation of the 24 elements of the single-qubit Clifford group as tabulated in Table III. We use this prescription to generate numerical simulations verifying our analytic calculations above. 14 # Gate Name Action on (X,ˆ Y,ˆ Zˆ) Minimal Sequence(s) Notes ˆ ˆ ˆ ˆ ˆ 2 + − C1 I (X, Y, Z) Xi , Ri Ri , i ∈ {1, 2, 3} identity Cˆ2 Xˆ (X,ˆ −Y,ˆ −Zˆ) Xˆ Cˆ3 Yˆ (−X,ˆ Y,ˆ −Zˆ) Yˆ π rotation Cˆ4 Zˆ (−X,ˆ −Y,ˆ Zˆ) Zˆ ˆ + ˆ ˆ ˆ + C5 Rx (X, −Z, Y ) Rx ˆ + ˆ ˆ ˆ + C6 Ry (Z, Y, −X) Ry +π/2 rotations ˆ + ˆ ˆ ˆ + C7 Rz (−Y, X, Z) Rz ˆ − ˆ ˆ ˆ − C8 Rx (X, Z, −Y ) Rx ˆ − ˆ ˆ ˆ − C9 Ry (−Z, Y, X) Ry −π/2 rotations ˆ − ˆ ˆ ˆ − C10 Rz (Y, −X, Z) Rz ˆ ˆ ˆ ˆ ˆ + C11 (−X, −Z, −Y ) ZRx ˆ ˆ ˆ ˆ ˆ − C12 (−X, Z, Y ) ZRx ˆ ˆ ˆ ˆ + ˆ C13 (−Y, −X, −Z) Rz X ˆ ˆ ˆ ˆ − ˆ C14 (Y, X, −Z) Rz X ˆ ˆ ˆ ˆ + + C15 (−Y, −Z, X) Rz Rx ˆ ˆ ˆ ˆ + − C16 (−Y, Z, −X) Rz Rx ˆ ˆ ˆ ˆ + − C17 (−Z, −X, Y ) Rx Rz ˆ ˆ ˆ ˆ ˆ − C18 (−Z, −Y, −X) ZRy ˆ ˆ ˆ ˆ + − C19 (−Z, X, −Y ) Rz Ry ˆ ˆ ˆ ˆ − + C20 (Z, −X, −Y ) Rz Ry ˆ ˆ ˆ ˆ ˆ + C21 H (Z, −Y, X) ZRy Hadamard ˆ ˆ ˆ ˆ − + C22 (Y, −Z, −X) Rz Rx ˆ ˆ ˆ ˆ + + C23 (Z, X, Y ) Rz Ry ˆ ˆ ˆ ˆ − − C24 (Y, Z, X) Rz Rx TABLE III. Representation of the Clifford group for a single qubit from products of elementary rotations. Relevant transformations of the coordinate system X,ˆ Y,ˆ Zˆ under the action of each Clifford shown in column 3. Minimal sequence of elementary operations needed to generate each Clifford shown in column 4. We also indicate how these geometric rotations map to logical operations of interest for quantum information where relevant. C. Approximating the Noise-Averaged Fidelity hFi Here we present the full derivation of the approximation to the noise-averaged fidelity hFi given in Eq. 23 of the main text. Let ξ(n) denote the O Qm δkρ = O (σn) term in the expansion of Eq. 8 due to cross-multiplying terms like k1,k2,...,km ρ=1 jρ m ˆ k1 ˆ k2 ˆ km P (δj1 Z) (δj2 Z) ...(δjm Z) , where ρ=1 kρ = n. Retaining only terms up to fourth order (n = 4), consistent with our Talyor approximation of the error unitaries, we thereby obtain ˜ (0) (1) (2) (2) (3) (3) (3) (4) (4) (4) (4) Sη,δ ≈ ξ + ξ1 + ξ1,1 + ξ2 + ξ1,1,1 + ξ2,1 + ξ3 + ξ1,1,1,1 + ξ1,1,2 + ξ2,2 + ξ4 (C1) 15 where (0) ξ = C1,J (C2) J (1) X ˆ ξ1 = (iδj)C1,j−1ZCj,J (C3) j=1 (2) X ˆ ˆ ξ1,1 = (iδj)(iδk)C1,j−1ZCj,k−1ZCk,J (C4) j (3) 1 X ξ = − (iδ )δ2C ZCˆ Zˆ2C (C7) 1,2 2 j k 1,j−1 j,k−1 k,J j (4) 1 X ξ = − (iδ )(iδ )δ2C ZCˆ ZCˆ Zˆ2C (C10) 1,1,2 2 j k l 1,j−1 j,k−1 k,l−1 l,J j (4) 1 X ξ = δ2δ2C Zˆ2C Zˆ2C (C13) 2,2 4 j k 1,j−1 j,k−1 k,J j (4) i X ξ = − (iδ )δ3C ZCˆ Zˆ3C (C14) 1,3 6 j k 1,j−1 j,k−1 k,J j 1 To evaluate Eq. 5 we must obtain an expression for Tr(S˜η,δ). By the linearity of the trace, we must therefore obtain expressions 2 for the quantities Q(n) ≡ 1 Tr ξ(n) for each of the terms above. This requires some manipulation. To begin we k1,k2,...,km 2 k1,k2,...,km highlight the following useful properties ( Z,ˆ k odd Zˆk = (C17) I, k even C1,J ≡ Sη ≡ I. (C18) In fact since C1,J = I, any cyclic permutation of subsequences Cjk also gives the identity. For instance C1,J = C1,j−1Cj,k−1Ck,J = Ck,J C1,j−1Cj,k−1 = Cj,k−1Ck,J C1,j−1 = I (C19) 16 and so on for any number of subsequences. Using the cyclic property of the trace, we therefore have ( 1 1 1 0, k odd Tr C ZˆkC = Tr C C Zˆk = Tr Zˆk = (C20) 2 1,j−1 j,J 2 j,J 1,j−1 2 1, k even The last equality uses Eq. C17, the property of Pauli matrices that Tr(Zˆ) = 0, and that the corresponding (2 × 2) identity has trace Tr(I) = 2. Now define the cumulative operators Km giving the product from the first through to the mth Clifford operator in the sequence ˆ ˆ Km ≡ C1,m = Cη1 ...Cηm ,K0 ≡ I ≡ KJ (C21) where each Km is some element of the Clifford group due to the closure property of groups under group composition. In this case any subsequence Cjk “factorizes” into the products K† K = Cˆ† ...Cˆ† Cˆ ...Cˆ Cˆ ...Cˆ = C j−1 k ηj−1 η1 η1 ηj−1 ηj ηk jk (C22) allowing us to rewrite C1,j−1ZCˆ j,k−1ZCˆ k,J = PjPk C1,j−1ZCˆ j,k−1ZCˆ k,l−1ZCˆ l,J = PjPkPl C1,j−1ZCˆ j,k−1ZCˆ k,l−1ZCˆ l,m−1ZCˆ m,J = PjPkPlPm ... where we define the Zˆ-conjugating operators ˆ † ˆ ˆ ˆ Pm ≡ Km−1ZKm−1 ∈ {±X, ±Y, ±Z}, 0 ≤ m ≤ J (C23) 17 and where Pm is always a signed Pauli matrix due to the property that the Clifford group is the normalizer of the Pauli group. Thus we find Q(0) = 1 (C24) (1) Q1 = 0 (C25) (2) 1 X Q = − δ δ Tr (P P ) (C26) 1,1 2 j k j k j (3) i X Q = − δ δ δ Tr (P P P ) (C28) 1,1,1 2 j k l j k l j (4) 1 X Q = δ δ δ2Tr (P P ) + δ δ2δ Tr (P P ) + δ2δ δ Tr (P P ) (C32) 1,1,2 4 j k l j k j k l j l j k l k l j (4) 1 X Q = δ2δ2 (C33) 2,2 4 j k j (4) 1 X Q = δ δ3Tr (P P ) (C34) 1,3 12 j k j k j The nonzero terms may be recast into more convenient expressions by moving to vector notation. We expand the operators Pm in the basis of Pauli operators, 2 2 2 Pm = xmXˆ + ymYˆ + zmZ,ˆ xm, ym, zm ∈ {0, ±1}, |xm| + |ym| + |zm| = 1. (C36) That is, only one nonzero coefficient xm, y,zm, equivalent to expressing the fact that they are sampled from the set {±X,ˆ ±Y,ˆ ±Zˆ}. The associated unit vector rˆm ≡ (xm, ym, zm), krˆmk = 1 (C37) therefore points uniformly at random along one of the principle Cartesian axes {±xˆ, ±yˆ, ±zˆ}, capturing the “direction” of the operator Pm in “Pauli space”. With these definitions we may derive 1 Tr (P P ) = rˆ · rˆ (C38) 2 j k j k 1 Tr (P P P ) = i (rˆ × rˆ ) · rˆ (C39) 2 j k l j k l 1 Tr (P P P P ) = (rˆ · rˆ )(rˆ · rˆ ) + (rˆ · rˆ )(rˆ · rˆ ) − (rˆ · rˆ )(rˆ · rˆ ) (C40) 2 j k l m j k l m j m k l j l k m 18 following directly from the trace and the cyclic composition properties of the Pauli matrices σˆiσˆj = iijkσˆk + δijI, where ijk is the Levi-Civita symbol, δij is the Kronecker delta and Einstein summation notation used. Consequently we obtain (2) X Q1,1 = − δjδkrˆj · rˆk (C41) j (4) 1 X Q = δ δ δ δ {(rˆ · rˆ )(rˆ · rˆ ) + (rˆ · rˆ )(rˆ · rˆ ) − (rˆ · rˆ )(rˆ · rˆ )} (C44) 1,1,1,1 2 j k l m j k l m j m k l j l k m j (4) 1 X Q = δ δ δ2rˆ · rˆ + δ δ2δ rˆ · rˆ + δ2δ δ rˆ · rˆ (C45) 1,1,2 2 j k l j k j k l j l j k l k l j (4) 1 X Q = δ δ3rˆ · rˆ (C46) 1,3 6 j k j k j (4) 1 X Q = δ2δ2 (C47) 2,2 4 j k j (2) We expect the major contribution to the distribution of hFi to reside in the term Q1,1 since, in the limit σ 1, higher order (n>2) (2) terms Q will be orders of magnitude smaller. Anticipating this, we recast Q1,1 in terms of an unrestricted sum to facilitate more straightforward analysis. Observing the quantity δjδkrˆj · rˆk is invariant under exchange of indexes we thus obtain J J J (2) 1 X X X Q = − δ δ rˆ · rˆ − δ2rˆ · rˆ (C49) 1,1 2 j k j k j j j j=1 k=1 j=1 J J 1 X 1 X = − δ δ rˆ · rˆ + δ2 (C50) 2 j k j k 2 j j,k j=1 J J ! 1 X X (2) = − δ rˆ · δ rˆ − Q (C51) 2 j j k k 2 j=1 k=1 1 = − kR~ k2 − Q(2) (C52) 2 2 where we define the resultant vector J X R~ ≡ δjrˆj. (C53) j=1 Hence 1 1 Tr S˜ ≈ 1 − kR~ k2 + Q(3) + Q(4) + Q(4) + Q(4) + Q(4) + Q(4). (C54) 2 η,δ 2 1,1,1 1,1,1,1 1,1,2 1,3 2,2 4 Since the above terms are all real, the fidelity is then obtained taking the square of Eq. C54 and truncating any cross terms higher than O σ4. Thus we obtain 1 2 F (η, δ) ≈ 1 − kR~ k2 + kR~ k2 + 2Q(3) + 2Q(4) + 2Q(4) + 2Q(4) + 2Q(4) + 2Q(4). (C55) 4 1,1,1 1,1,1,1 1,1,2 1,3 2,2 4 19 Averaging Eq. C55 over an ensemble of noise realizations δ then yields the approximate expression for hFi. However we make simplifying observation that only terms raised to even powers, or those summed over terms raised to even powers, survive the ensemble average. Moreover, since the vectors rˆm are uniformly distributed over the set {±xˆ, ±yˆ, ±zˆ}, summing over compositions (rˆj · rˆk) j < k < J (C56) (rˆj × rˆk) · rˆl j < k < l < J (C57) (rˆj · rˆk)(rˆl · rˆm) j < k < l < m < J (C58) will on average resolve to the zero vector. Hence, even in the presence of correlated noise random variables, it is appropriate to set D (3) E Q1,1,1 → 0 (C59) D (4) E Q1,1,1,1 → 0 (C60) D (4) E Q1,1,2 → 0 (C61) D (4)E Q1,3 → 0 (C62) In fact these quantities are random variables sharply (and symmetrically) peaked around 0. However since they are O σ4 terms, the spread of their distributions is very small relative to the spread of hkR~ k2i, which scales as O σ2. The noise- averaged fidelity to O σ4 therefore reduces to D E hFi ≈ 1 − kR~ k2 + O(4) (C63) where the final term in Eq. C63 consists only of O σ4 terms 1 D E D E D E O(4) ≡ kR~ k4 + 2 Q(4) + 2 Q(4) . (C64) 4 2,2 4 The remaining task is to study how correlations between the noise variables affect the distributions of the terms in Eq. C63, and hence the distribution of hFi. In particular how these distributions change depending on whether the noise falls into Markovian or quasi-static regimes. As stated in the main text, our key insight is the interpretation of the vector quantity R~ in terms of a 3D random walk generated by adding J randomly-oriented steps with step lengths specified by δ. The distribution of hFi then maps onto the distance square of this 3D random walk. 20 D. Universal Error Model Here we sketch a generalization of our analytic framework, showing its applicability to universal errors, beyond the (perhaps most important) case of dephasing specifically treated above. As in the main text, errors are implemented by interleaving Sη with a sequence of stochastic unitary rotations, yielding the noise-affected operation ˜ ˆ ˆ ˆ Sη,δ ≡ U1Cη1 U2Cη2 ...lUJ CηJ . (D1) However we now let the unitaries take the general form h ~ i h (z) (y) (z) i h (z) ˆ (y) ˆ (z) ˆi Uj ≡ exp −i∆j · σ ≡ exp −i ∆j σˆx + ∆j σˆy + ∆j σˆz ≡ exp −i ∆j X + ∆j Y + ∆j Z (D2) ˆ ˆ ˆ ~ (x) (y) (z) where σ ≡ (ˆσx, σˆy, σˆz) ≡ (X, Y, Z) denotes a vector of Pauli matrices and the vector ∆j = (∆j , ∆j , ∆j ) contains the error rotations induced about each Cartesian axis. That is, the unitary Uj causes the net rotation through an angle k∆~ jk about ˆ the axis δj ≡ ∆~ j/k∆~ jk on the Bloch sphere. In this case the error process indicates a list of 3-component vectors δ = (∆~ 1, ∆~ 2, ..., ∆~ J ). (D3) Assuming the weak-noise limit, quantified by the perturbative condition JE[k∆~ k2] < 1, we make the Taylor approximation (z) (y) (z) Uj ≈ I + i ∆j σˆx + ∆j σˆy + ∆j σˆz + ... (D4) In the main text we presented a complete treatment up to fourth order to account for both the leading- and higher-order contribu- tions to infidelity, and assuming a dephasing-only environment. Here we demonstrate rigorously how leading-order contributions to the fidelity take the same form when the noise model is universal, as described above. Once again, the leading order contribu- tion derives from the term like Eq. 10 but with a slight amendment: (2) X ˜(2) ξ1,1 = ξ1,1(µ, ν) (D5) α,β where ˜(2) X (µ) (ν) ξ1,1(µ, ν) = − ∆j ∆k C1,j−1σˆµCj,k−1σˆν Ck,J . (D6) j † Writing Cjk = Kj−1Kk, and using K0 ≡ KJ ≡ I, we therefore obtain † † † C1,j−1σˆµCj,k−1σˆν Ck,J = K0Kj−1σˆµKj−1Kk−1σˆν Kk−1KJ (D7) † = Kj−1σˆµKj−1Kk−1σˆν Kk−1 (D8) (µ) (ν) = Pj Pk (D9) where the operators in the last equality generalize the definition in Eq.15, namely (µ) † ˆ ˆ ˆ Pm ≡ Km−1σˆµKm−1 ∈ {±X, ±Y, ±Z}. (D10) (µ) Once again, 0 ≤ m ≤ J, and the Pm are always signed Pauli operators due to the property that the Clifford group is the normalizer of the Pauli group. We may therefore expand the operators in the basis of Pauli operators by writing (µ) (µ) ˆ (µ) ˆ (µ) ˆ Pm = xm X + ym Y + zm Z (D11) (µ) (µ) (µ) where xm , ym , zm ∈ {0, ±1} with only one nonzero coefficient. Since the Clifford sequences (and hence subsequences) are (µ) uniformly random, the operators Pm are uniformly random, independent of the choice of µ. The associated unit vector (µ) (µ) (µ) (µ) (µ) rˆm ≡ (xm , ym , zm ), krˆm k = 1 (D12) 21 therefore still points uniformly at random along one of the principle Cartesian axes {±xˆ, ±yˆ, ±zˆ}. Taking the trace over Eq. D7, using the cyclic composition properties of the Pauli matrices, and moving to vector notation we therefore obtain 1 Tr P(µ)P(ν) = rˆ(µ) · rˆ(ν). (D13) 2 j k j k Consequently (2) 1 (2) X (µ) (ν) (µ) (ν) Q˜ (µ, ν) ≡ Tr ξ˜ (µ, ν) = − ∆ ∆ rˆ · rˆ (D14) 1,1 2 1,1 j k j k j ˜(2) where leading order contribution derives from the sum of all such terms Q1,1(µ, ν) 3 (2) X ˜(2) Q1,1 = Q1,1(µ, ν). (D15) µ,ν=1 Taking an ensemble average over the noise realizations, we obtain 3 D (2)E X D ˜(2) E Q1,1 = Q1,1(µ, ν) (D16) µ,ν=1 3 X X D (µ) (ν)E (µ) (ν) = − ∆j ∆k rˆj · rˆk . (D17) µ,ν=1 j We now make a simplification in which we assume the errors in separate quadratures arises from distinct physical mechanisms (e.g. dephasing and depolarization noise arise from different mechanisms). In this case the correlation between separate error quadratures is zero, and D (µ) (ν)E ˜ ∆j ∆k = δµ,ν (D18) ˜ where δµ,ν is the Kronecker delta. With this assumption, we obtain 3 D (2)E X X D (µ) (µ)E (µ) (µ) Q1,1 = − ∆j ∆k rˆj · rˆk . (D19) µ=1 j (µ) (µ) (µ) (µ) Now the quantity ∆j ∆k rˆj · rˆk is invariant under exchange of lower indices, and we may recast the ordered sum as an unrestricted sum 3 J 2 D (2)E 1 X X D (µ) (µ)E (µ) (µ) X (µ) Q = − ∆ ∆ rˆ · rˆ − ∆ (D20) 1,1 2 j k j k j µ=1 j,k k=1 Defining ~ (µ) X (µ) (µ) R ≡ ∆j rˆi (D21) i=1 we obtain 3 D (2)E 1 X D (µ) E D (2)E Q = − kR~ k2 − Q (D22) 1,1 2 1,1 µ=1 22 where the residual term 3 J 2 (2) 1 X X (µ) Q = − ∆ (D23) 2 2 j µ=1 k=1 is independent of the Clifford sequence and cancels out in the full expansion of hFi, as in the case of the dephasing error model in the main text. The leading order contribution to the noise-averaged infidelity is therefore given by 3 X D (µ) E 1 − hFi ≈ kR~ k2 (D24) µ=1 (µ) where each of the terms hkR~ k2i are independent random variable inheriting the PDF of the random walk, and taking the (µ) same form presented in the main text. Consequently each of the hkR~ k2i are mutually independent, gamma-distributed random variables. In general, each of these random walks will be distinct, and the sum in Eq. D24 is over non-identical gamma- distributed random variables. The total PDF may be obtained from successive applications of Eq. A4 by direct integration. In the event that all noise quadratures follow the same gamma-distribution Γ(α, β), the total PDF simply reduces to the rescaled gamma distribution Γ(3α, β) by an application of Eq. A6. 23 E. PDF Derivation - Markovian Regime In the Markovian regime, we assume all noise random variables are i.i.d. Hence R~ corresponds to a J-length unbiased random walk with step lengths sampled from the normal distribution N 0, σ2. Since these step lengths are symmetrically distributed about zero, the distributions of the components of the walk vector δjrˆj = (δjxj, δjyj, δjzj) are invariant with respect to the sign of the coefficients αj in all Cartesian directions α ∈ {x, y, z}. Ignoring the signs we therefore treat the the coefficients as binaries αj ∈ {0, 1}, where the zero event simply reduces the number of steps taken in that direction. Let J X nα ≡ |αj| , α ∈ {x, y, z}, nx + ny + nz = J (E1) i=1 count the total number of nonzero components in each Cartesian direction over the sequence of walk vectors {rˆ1, rˆ2, ..., rˆJ }. Thus R~ = δx + ... + δx , δy + ... + δy , δz + ... + δz 1 nx 1 ny 1 nz (E2) α where the superscripts in δj indicate summing only over the subset of δj for which the coefficients αj are nonzero. Thus we have kR~ k2 = ∆2 + ∆2 + ∆2, ∆ ≡ δα + δα + ... + δα , α ∈ {x, y, z}. x y z α 1 2 nα (E3) δ ∼ N 0, σ2 S ≡ δα, δα, ...δα Since all j are i.i.d. in the Markovian regime, so too are all the random variables in set α 1 2 nα , α ∈ {x, y, z}. The distribution of their sum is therefore given by