Implementation of Gate Set Tomography on Quantum Hardware

Henrique Guimaraes˜ Silverio´ [email protected]

Instituto Superior Tecnico,´ Lisboa, Portugal November 2019

Abstract Finding and understanding errors in quantum devices is instrumental to their development. Among error characterization techniques, Gate Set Tomography (GST) stands as the most comprehensive proto- col. In this work, we implement a working GST protocol to characterize single- operations of selected IBM Q’s quantum devices. By benchmarking GST with simulated data, we find the optimal solutions to be degenerate, with the initial state and readout estimations changing correlatedly. To fix this, we propose a protocol modification which assumes a perfect initial state. The modified GST protocol produces accurate and consistent estimations, with precision limited mainly by sampling noise. In the application of GST to IBM Q devices, we observe that the average single-qubit gate quality is close to the limit of GST’s precision. The same no longer holds for the readout errors, whose larger values agree with IBM’s calibrations. We also characterize a qubit’s readout error over time and find that it often changes in the order of 1% − 2% per hour.

Keywords: , GST, qubit characterization, IBM Q

1. Introduction tude of the deviations that appear then speak for In recent years, the field of quantum computation the overall quality of the device. However, a mini- has benefited from substantial interest and invest- mal amount of information can be extracted using ment, which has been translated into the devel- this naive approach. Thus, a more systematic ap- opment of the first widely available prototypes of proach to hardware testing is needed, more specif- quantum computers, with up to 20 . How- ically one that tries to describe the individual ele- ever, these improvements are not yet enough to ments of the device to predict its behaviour on all take full advantage of most of the quantum algo- possible scenarios. rithms under development since the 1990s, in large Qubit characterization provides rigorous tools part due to technological challenges. Further im- for error diagnostics and assessment of the hard- proving current quantum computers relies heavily ware’s performance. The objective is to find out the on understanding the nature and origin of existing sources of errors so that they can be mitigated, or complications, to which extent an accurate method to reliably set an upper bound on the error of an of characterization is essential. The main focus of arbitrary algorithm’s outcome. The different ways this work will be the study, implementation and ap- to approach qubit characterization can be roughly 1 plication to real devices of state of the art charac- sorted by the amount of information they intend to terization techniques. extract from the procedure. Among the more pop- ular methods, Randomized Benchmarking (RB) [7] 1.1. The need for qubit characterization and Gate Set Tomography (GST) [2] stand at op- In the development of quantum devices, testing posite ends of this spectrum. the hardware’s performance is always an essen- tial part of the process. In a first iteration, this can RB is a process in which gate sequences of dif- be done through the computation of quantum al- ferent lengths are applied and the outcome com- gorithms whose outcomes are known; the magni- pared with the predicted value. By doing this, one attains a measurement of the average process fi- 1Namely the IBM Quantum Experience (IBM Q) quantum delity, which can then be used to extract the av- processors. erage gate fidelity [6]. Additionally, this process

1 can be extended to multi-qubit operations with only polynomial scaling with the number of qubits [10]. H In contrast with other techniques, RB is simple, ro- bust and gives reliable average gate fidelities for multi-qubit processes that are useful in ascertain-

ing the overall quality of the gates in a quantum State preparation Conditional State device. However, it can only provide the average Manipulation readout fidelity of specific gates, which makes it unsatisfac- Figure 1: Diagram representing the stages of a quantum com- tory as a standalone characterization technique. putation, with the example of a circuit√ that prepares the max- imally entangled state (|00i + h11|)/ 2. The state prepara- On the other hand, GST tries to achieve a com- tion represents the transformation of an arbitrary quantum state plete characterization of the system. With GST, the onto the |00i state and is usually omitted in quantum circuits. The circuit itself shows the implementation of a Hadamard gate goal is to achieve a description of the hardware that on the first qubit, which creates an equal superposition state on correctly predicts the outcome of any circuit. This that qubit, followed by a CNOT gate that flips the state of the feature makes GST much more informative than second qubit conditionally on the state of the first. In the end, RB, but has the serious drawbacks of being costly each qubit is individually measured in the computational basis, returning one of two possible basis states. in resources and unpractical for multiple qubits. In this work, the method of choice for the charac- terization of IBM’s quantum processors was GST, 2.2. Operations with Density Matrices for several reasons: First and foremost, because In this work, we adopt the density matrix formal- we were interested in achieving a description of the ism, which represents a system as an ensemble of system that was as detailed as possible. Secondly, quantum states with probability pi, {pi, |ψii}. It is due to the fact that RB has already been exten- defined as an operator sively applied to IBM’s systems, while GST results X remain, to our knowledge, either undone or unpub- ρ = pi |ψii hψi| (1) lished. Lastly, because the IBM Q is remotely ac- i cessible to the scientific community and has thus been subjected to extensive testing, there are mul- A density matrix, ρ, represents a physical system tiple reports of its poor performance when com- if Tr[ρ] = 1, ρ = ρ† and ρ is semi-definite positive pared with simulation results [5]. These reports (i.e. ρ  0) [14]. To describe changes to a system, made clear that understanding what is wrong and we use quantum operations, which are defined as trying to improve the current devices is a pressing maps acting on the system’s density matrix, ρ → issue, for which purpose GST is a far more useful Λ(ρ). Maps that describe physical processes must tool than RB. meet two basic requirements [6]:

1.2. Objectives 1. Trace preservation: If Tr[ρ] = 1, then Tr[Λ(ρ)] = 1 The principal focus of this work was to implement a working GST protocol capable of fully character- 2. Complete Positivity: For a composite system izing single-qubit operations of present-day quan- ρ  0, then (Λ(·) ⊗ I)ρ  0, where Λ only acts tum devices and testing its performance in both on part of ρ. simulated and real experimental data. Nonethe- less, rather than debugging IBM Q’s systems in For obeying the conditions above, a physical map particular, our main intention was to explore GST’s is often referred to as a CPTP (completely positive general capabilities, to uncover potential limitations trace-preserving) map. In fact, any physical map and to establish the extent of its applicability and can be conveniently written in its Kraus form [6]: usefulness in experimental settings. N X † 2. The principles of GST Λ(ρ) = KiρKi (2) 2.1. Model i=1 The computation model used describes quantum where N ≤ d2, with d = 2n being the Hilbert space algorithms as circuits, in which a sequence of dimension of the n-qubit space. The Ki are the quantum gates (i.e. operators) act on qubits (i.e. Kraus operators and they need only fulfill the com- quantum states). Any quantum computation can PN † pleteness condition, i=1 Ki Ki = I. then be divided into three stages: state prepara- tion, conditional manipulation and state readout. 2.3. Superoperator Formalism As an example, figure 1 depicts a typical quantum Despite its generality, the quantum operations for- circuit, subdivided into these three stages. malism lacks the ease of manipulation the Dirac

2 notation offers the state vector formalism. We cir- These definitions lead naturally to (see Section cumvent this issue with the superoperator formal- 2.2 of Ref.[6]): ism, which represents density operators ρ (of di- mension d × d) in the Hilbert space of dimension d |Λ(ρ)ii = RΛ|ρii (9) as vectors |ρii in the Hilbert-Schmidt space of di- R = R R (10) mension d2. In turn, quantum operations now con- Λ1◦Λ2 Λ1 Λ2 sist of d2 ×d2 matrices in this space. We define the Thus, we see that the superoperator formalism Hilbert-Schmidt inner product of two density oper- permits a quantum operation acting on a density ators A and B as matrix to be represented as a PTM acting on a state vector. A sequence of quantum operations † hhA|Bii = Tr[A B] (3) can also be represented as a simple multiplication of the respective PTMs. This formalism also supports the representation of some observable, E, since Physicality Constraints hEi = Tr[Eρ] = hhE|ρii (4) A physical quantum operation maps a density ma- trix ρ to another density matrix Λ(ρ). This im- Therefore, a generic observable E is represented plies that Λ(ρ) can be decomposed in the normal- hhE| as in the superoperator formalism. ized Pauli√ basis√ with real coefficients in the inter- val [−1/ d, 1/ d]. Considering the PTM’s param- Choice of basis eters definition in eq. (8), one concludes that all entries must fulfil Although superoperators can be written in any ba- sis of Hn, a particularly convenient choice is the (RΛ)ij ∈ [−1, 1] (11) normalized Pauli basis. This is a non-standard nor- malization in which the basis elements belong to Furthermore, a quantum operation needs to be a the set CPTP map in order to be a representation of a physical system. Therefore, we must constrain our n √ √ √ √ o⊗n PTM’s parameters in a way that ensures the repre- σ ∈ I/ 2, X/ 2,Y/ 2,Z/ 2 (5) k sented operation is both trace-preserving and com- with {I,X,Y,Z} being the standard Pauli matrices. pletely positive. Let us denote the basis elements in matrix form as Trace preservation: Taking into account that√ the Pauli matrices are traceless (i.e. Tr[σi] = dδ0i, σk and their superoperator counterparts as |kii. In this basis, a general density operator A is decom- since σ0 is the normalized identity matrix), a√ trace- posed as preserving map must verify Tr[Λ(σi)] = dδ0i, X which in turn implies that (RΛ)0i = Tr[σ0Λ(σi)] = A = akσk (6) δ0i. k Complete Positivity: Ensuring a given PTM rep- where ak = Tr[σkA] = hhk|Aii are the coefficients resents a CP quantum operation is, unfortunately, of the vector |Aii. With the usage of the normal- not so straightforward. To impose this condition, ized Pauli basis, one ensures that the basis states we rely on the fact that the Choi-Jamiolkowski ma- in the superoperator formalism obey the complete- trix, ρΛ, associated with a completely-positive map P P ness relation k|kiihhk| = k σkTr[σk·] = I. is positive semi-definite and takes the form [12]:

1 X T Pauli Transfer Matrix Representation ρΛ = (RΛ)ijσ ⊗ σi (12) d j ij The Pauli Transfer Matrix (PTM) representation al- lows us to define quantum operations as matrices Therefore, one needs to impose ρΛ  0, meaning in the superoperator formalism. For an arbitrary ρΛ is an hermitian matrix with non-negative eigen- quantum operation Λ, there will be an associated values. PTM R such that Λ 2.4. Qubit Characterization X We shall see that Gate Set Tomography (GST) [2] R = |jiihhj|R |kiihhk| (7) Λ Λ is the only reliable method of characterization in jk the presence of state preparation and measure- ment (SPAM) errors. However, in order to under- hhj|R |kii = Tr[σ Λ(σ )] = (R ) (8) Λ j k Λ jk stand GST, one should start with its predecessors: Later on, we represent a PTM, RΛ, solely by the Quantum State Tomography (QST) [8] and Quan- quantum operation, Λ, that it describes. tum Process Tomography (QPT) [4].

3 QST consists of determining the complete den- is then sity matrix of an unknown quantum state. Like  G = {G ,G ,G } = {},X π ,Y π (14) every other characterization procedure, QST re- 0 1 2 2 2 quires access to a large ensemble of identical π where X π and Y π represent a rotation around states — obtained, for instance, through repetition 2 2 2 of the same experiment. Given this ensemble, the the x or y axis, respectively. The above stated req- expectation value of different observables (usually uisites are verified if one can write all the required the Pauli matrices) can be determined by repeating SPAM gates using gates within G. In this case, one the same measurement and averaging over the re- has sults.  F = {F1,F2,F3,F4} = {},X π ,Y π ,X π X π QPT is similar to QST, though the goal is now 2 2 2 2 to characterize some unknown quantum process (15) instead of a quantum state. QPT requires repet- With G and F defined, one then conducts all the itive access to the process it wishes to describe; experiments and organizes their results in a tensor this process can be regarded as a quantum black of the form box, to which one can feed different inputs (ρin) pijk = hhE|FiGkFj|ρii (16) and read the corresponding outputs (ρout). The end goal is then to construct the quantum opera- where G ∈ G and F ,F ∈ F. Considering for tion that best maps each ρ to the measured ρ , k i j in out instance that E = |0i h0|, each experiment should which in turn requires QST to be performed on ev- be run N times and count the number of trials a ery output. |0i state was measured — here represented as By default, QPT does not take into account pos- [# |0i] ; with this, the experimental probability sible errors in the preparation of the input states, ijk p = [#|0i]ijk nor in the measurement of the corresponding out- can then be computed as ijk N . As a con- put; the protocol assumes that the input and out- sequence of the sampling being finite, each expec- tation value will have an accompanying sampling put states are known exactly, while this is usu- p ally not the case. In fact, most of the states are error, σijk = pijk(1 − pijk)/N. In possession of prepared using gates also included within the pro- all the experimental data, one then moves on to the cess QPT is supposed to characterize [2]. In this gate set estimation phase. way, QPT makes an incorrect assumption about the system that creates a self-consistency issue, 2.4.1 Linear Inversion Gate Set Tomography ultimately tainting the obtained results. (LGST) GST offers a solution to this problem by including Having the tensor of experimental points, one can the faulty gates in the estimation itself. In contrast start by reorganizing the information it contains. In- with QPT, GST needs to perform, simultaneously, troducing the completeness relation in the Pauli ba- an estimation of an entire set of gates and states sis, one reaches {|ρii, hhE|, G} (13) X where |ρii is the initial state, hhE| is a two-outcome pijk = hhE|Fi|riihhr|Gk|siihhs|Fj|ρii (17) measurement and G = {Gk} is the gate set, which rs X contains the quantum operations performed by the = Air(Gk)rsBsj (18) system. rs The first step in performing GST is the choice of = (AGkB)ij (19) the gate set, G, to be characterized, which needs to permit the system’s preparation and measure- With this construction, we then define ment in an informationally complete basis [3]. This can be achieved solely through individual gates in G˜k = AGkB (20) G, or through a composition of these. In either g =G˜ = AB (21) case, we define a state preparation and measure- 0 X ment (SPAM) gate set, F, whose gates are used |ρ˜ii = A|ρii = |iiihhE|Fi|ρii (22) to obtain all the starting states |ρiii = Fi|ρii and i hhE | = hhE|F X measurement projectors j j. The other hhE˜| = hhE|B = hhE|Fj|ρiihhj| (23) requirement is that both G and F include the ”null” j gate (represented as {}) which is, by definition, an operation that does nothing for no time. The stan- Note that we can estimate g — and thus isolate dard example of a minimal gate (i.e. one with as the product of matrices A and B — because we im- few gates as possible) for a single-qubit system set posed G0 = {} when choosing the gate set. Also,

4 noting that |ρ˜ii and hhE˜| are component-wise iden- form X 2 tical, the following relations can be established with LS(~t) = pijk − pˆijk(~t) (31) the acquired data: ijk produced the best results. The minimization itself (G˜ ) = hhE|F G F |ρii (24) should use a global optimization technique, as the k ij i k j cost function has multiple local minima. In this gij = hhE|FiFj|ρii (25) work, we use the Differential Evolution (DE) opti- |ρ˜iii = hhE|Fi|ρii = hhE˜|i (26) mization method [15, 16], which also allows us to enforce the physicality conditions One then simply inverts g to get g−1 = B−1A−1 and achieve the following estimations (RΛ)ij ∈ [−1, 1], ∀Λ ∈ G (32) ρ  0 (33) Gˆ = g−1G˜ = B−1G B (27) k k k E  0 (34) |ρˆii = g−1|ρ˜ii = B−1|ρii (28) I − E  0 (35) ˆ ˜ hhE| = hhE| = hhE|B (29) 1 X ρ = (R ) σT ⊗ σ  0, ∀Λ ∈ G (36) Λ d Λ ij j i This displays the gauge freedom in terms of B ij present in the results, which with the available ex- perimental data, we cannot measure. Moreover, with relative ease. Once the cost function minimum is reached, we take into account that Bsj = Tr[σsFj(ρ)] cannot be use the fact that the gradient at that point is null the identity, as this would require Fj(ρ) = σs, which is not a CPTP map. to propagate the sampling errors and thus deter- mine the uncertainties of the free parameters in ~t. One can, nonetheless, try to devise some meth- −1 ods of partially circumventing this gauge freedom This amounts to calculating a matrix D = −NM , problem. The most obvious one is to assume that whose entries are element-wise defined as the initial state and the SPAM gates are perfect, ∂tj thus enabling the explicit calculation of B, which Dmj = (37) ∂pm we call the exact gauge. A more refined approach ∂2 is to try to find the gauge that results in the closest Mji = C(tk, pl) (38) possible gate set to the ideal case [6], which we ∂tj∂ti call the optimized gauge. None of these result in ∂2 Nmi = C(tk, pl) (39) an accurate estimation, but they produce estima- ∂pm∂ti tions that serve as sufficiently good starting points for the next step, Maximum Likelihood Estimation where C(tk, pl) is the cost function, tk are the free (MLE). parameters and pl the flattened data tensor pijk, with the index l running through all the entries. N and M are calculated numerically to obtain D, from 2.4.2 Maximum Likelihood Estimation (MLE) which the uncertainties are calculated as MLE surpasses both shortcomings of LGST: for q one, it does not suffer from LGST’s gauge free- ~ε = (DT )◦2 ~σ◦2 (40) dom problem; secondly, it enforces the physical- ity constraints that LGST cannot. The MLE proce- where ~ε and ~σ are the errors in ~t and ~p written as dure employed is reasonably standard: one finds collumn vectors, and the symbol in the exponents the adequate parameters that yield expected re- stands for the Hadamard power, which exponenti- sults compatible with the experimental data. This ates each entry of a vector or matrix element-wise. is achieved through the minimization of a cost func- tion, subjected to the physicality constraints. To 2.5. State of the art GST start, one constructs the probability estimators The above-given description of GST represents a primitive version of the method, which its cre-

pˆijk(~t) = hhEˆ(~t)|Fˆi(~t)Gˆk(~t)Fˆj(~t)|ρˆ(~t)ii (30) ators have since improved [3]. The state of the art GST implementation surpasses its predeces- where each quantity with a ”hat” represents an es- sor in accuracy, precision, speed and capability. timation, as a function of the free parameters in ~t. Such improvements are achieved through a modi- Then, one minimizes the difference between the fied data-fitting protocol, in which new data points experimental points and the estimators through the are included iteratively. These points come from in- minimization of a cost function. In this work, we creasingly long sequences of gates, specially con- found a standard least-squares cost function of the structed to amplify existing errors in the gates.

5 Despite its superior performance, this process For LGST, we found, unsurprisingly, that the es- requires a severe increase in the size of the data timations did not obey the physicality conditions. set. In our case, we are restricted by the num- Nonetheless, we compared the closeness to the ber of experiments that IBM Q’s devices can run, exact results of estimations that used the exact and which impedes the acquisition of all the required optimized gauges and found the exact gauge to be data points. For this reason, we have to resort to a preferable when the initial state is expected to be more primitive version of GST that, on the flip side, perfect. can work with more modest data sets. Proceeding to MLE, we first realized that, when accounting for identical sequences in p , the 2.6. Error quantification ijk number of data points is smaller than the number The error of an operation quantifies its distance of free fitting parameters. To correct this flaw, we from its ideal form. In this work, we use the dia- extended the SPAM gate set to mond norm 0  F = {},X π ,Y π ,X π X π ,X π X π X π ,Y π Y π Y π 2 2 2 2 2 2 2 2 2 2 1 (44) ||A−B|| = max Tr (A⊗I)[ρ]−(B ⊗I)[ρ] (41) With the resulting larger data set, we performed ♦ 2 ρ a series of MLE trials on the data without sampling which measures the distance between the output noise to assess the method’s accuracy and con- states of two processes, A and B, maximized over sistency. Having found large deviations in ρˆ and all the possible input states. The distance between Eˆ among trials, we looked for correlations between output states is quantified through the trace dis- the deviations in these quantities using the Pear- tance [14], son correlation coefficient [17] as a quantifier. This 1 analysis produced the results in figure 2, which d(ρ, σ) = Tr|ρ − σ| (42) 2 shows the deviations in Eˆ, ρˆ and Iˆ are perfectly √ correlated. At the same time, none of these are where |A| ≡ A†A is the positive square root of correlated to deviations in the cost function value A†A. at convergence, ∆ , which led us to conclude As an alternative to the diamond norm, we also Cost that they can vary in a way that compensates one resort to the widely used average gate fidelity, de- another and does not influence the cost function fined as [9, 13, 11]: value. T Tr[RARB] + d FA,B = (43) {||(X )i|| }{||(Y )i|| } d(d + 1) {d(Ei, E0)} 2 2 {||Ii|| } {( Cost)i} where d is the system’s dimension (d = 2 in our 0.998 0.043 0.162 1.0 -0.073 {d( i, 0)} single-qubit case). While the diamond norm is an upper-bound on the gate error, the average gate 1.0 0.026 0.172 1.0 -0.082 fidelity is a measure of average gate quality, ob- {d(Ei, E0)} tained via another distance measure, the fidelity 0.5 [14]. Furthermore, what is commonly referred to 0.596 0.042 -0.189 {||(X ) || } 2 i

as average gate error is simply 1 − FA,B. 0.0

0.17 0.112 {||(Y )i|| } 3. Benchmarking GST 2 Before applying it to real data, we tested all as- 0.5

-0.05 pects of GST’s estimation procedure using simu- {||Ii|| } lated faulty data. Our set of meaningful quantities 1.0 was {|ρii, hhE|,X π ,Y π ,I}, which consists of the 2 2 Figure 2: Pearson correlation coefficients between the figures minimal gate set and the identity gate. We took of merit of a series of MLE trials, without sampling noise. ∆Cost these quantities’ ideal forms and introduced slight represents the difference between the cost function value at the deviations consistent with reported physical errors end of the minimization and the cost function value for the ex- in IBM Q’s devices. This faulty gate set mimicked a act solution. All values are obtained through comparison of the estimated quantities with the exact ones used in the creation real device, with which all the required experiments of the simulated points — symbolically explicit when the trace were performed to construct a data tensor like that distance is used (ρ0 and E0) or implicit in the diamond norm of eq. (16). We also added sampling noise by sim- notation and in ∆Cost. ulating the finite sampling process that takes place in real devices. In this way, we obtained a set of The fact that Eˆ, ρˆ and Iˆ can vary concordantly data points while knowing the gates that generated and without affecting the cost function value meant it, and could thus gauge GST’s ability to character- that they can assume incorrect values. To elim- ize a device correctly. inate these fluctuations, we fixed ρˆ and repeated

6 10 1 0.0186 ˆ Free anti-correlated with the deviations in I and the cost 0.00247 0.00264 0.00243 0.00226 0.000902 Fixed 10 3 function value at convergence, as depicted in fig- 5.73e-05 ure 4. This behaviour meant that ρˆ and Eˆ now de- 10 5 1.52e-06 viated from their ideal values to accommodate the 10 7 discrepancies induced by the sampling noise and 10 9 thus reach lower cost function values. Also oppos- 2.66e-11

10 11 ing previously observed behaviour is the fact that

10 13 the identity gate now changed to accommodate the sampling noise discrepancies whenever Eˆ and ρˆ 15 10 did not, though it did so less successfully consid- 4.87e-18 10 17 ering how it was positively correlated with ∆Cost. Cost d(E, E ) 0 ||X(2 )|| ||Y(2 )|| ||I|| As before, fixing ρˆ eliminated the correlations and Figure 3: MLE performance with free and fixed initial states, improved the accuracy and consistency of the in- without sampling noise. Presented here are the mean and stan- volved quantities. dard deviation of the labeled quantities, extracted from a sample of ten estimations. 4. Application to real devices Armed with a better understanding of GST, we em- ployed it to characterize IBM Q’s Melbourne and the MLE trials. The results in figure 3 show how Yorktown quantum processors. To do so, we gath- this alteration improves the accuracy and consis- ered the necessary data through a series of quan- tency of the affected quantities, which further sup- tum circuits, submitted remotely to run on IBM Q’s ports our hypothesis. devices via the IBM Q Experience website [1]. We When MLE was applied to data with sampling used the same gate set as that of section 3, with noise, we found that our methods only worked which 92 distinct experiments of the form of eq. systematically for around 25000 samples per data (16) were constructed. Each experiment was run point — for inferior sample sizes, the sampling er- consecutively three times in order to gather a large ror was too large and caused the optimization algo- enough sample size. In total, 276 quantum circuits rithm to misconverge. Additionally, the cost func- were run at the maximum number of shots (8192), tion value at convergence was always lower than taking around 40 minutes to collect the necessary the one the exact result would yield, which implied data for a single qubit’s characterization. This pro- that the algorithm managed to reach an estimation cess was repeated in close succession for every that described the noisy data better than the exact qubit in each device so that the experimental con- result. ditions changed as little as possible. Based on the observations of section 3, we fixed the initial state estimator in the analysis of real de- {||(X )i|| }{||(Y )i|| } {d(Ei, E0)} 2 2 {||Ii|| } {( ) } Cost i vices so that an adequate estimation of the readout error could be reached. This meant that we had to 0.999 0.004 -0.048 -0.954 -0.979 {d( , )} i 0 assume the qubits always started in their ground 1.0 state instead of characterizing it. -0.015 -0.067 -0.946 -0.978 {d(E , E )} i 0 4.1. Device characterizations 0.5 We performed complete characterizations of all se- 0.681 -0.144 -0.104 {||(X ) || } lected devices’ qubits. Although the quality of the 2 i characterizations varied among them, we found 0.0 that even the best characterizations, of which the -0.017 -0.086 {||(Y )i|| } π 2 X -gate estimation in figure 5 is an example, have 2 0.5 errors which, though small, still encompass the re- spective ideal value. This suggests that we are 0.922 {||I || } i near the threshold of single-qubit gate quality for 1.0 which this GST implementation can still detect de- Figure 4: Pearson correlation coefficients between the figures viations between the ideal and the real cases. of merit of a series of MLE trials, with sampling noise. Nonetheless, this GST implementation still proved itself to be rather useful in detecting errors Overall, we observed that the results with sam- in current IBM Q devices; this was especially true pling noise resembled those without it, apart from for readout errors, which were commonly at least an expectable decrease in quality. The correlated one order of magnitude higher than regular gate er- behaviour between the deviations in ρˆ and Eˆ was rors. As a distinct example, figure 6 shows a GST still present, though this time they were perfectly estimation of the measurement operator with the

7 PTM: X( /2) PTM: Y( /2)

1.00 1.00

1.0 0.0 0.0 0.0 1.01.0 0.0 0.0 0.0 0.0 0.0 0.0 0.75 1.0 0.0 0.0 0.0 0.75

0.50 0.50

-0.001 0.998 0.014 0.018 -0.001 -0.007 -0.022 0.998 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.25 0.25 ±0.001 ±0.003 ±0.003 ±0.003 ±0.005 ±0.001 ±0.003 ±0.002

0.00 0.00

-0.002 0.015 -0.019 -0.997 0.001 0.105 0.992 0.024 0.0 0.0 0.0 -1.0 ±0.000.04 ±0.000.03 ±0.0041.0 ±0.0020.0 0.25 ±0.001 ±0.003 ±0.001 ±0.004 0.25

0.50 0.50

-0.000 -0.014 0.996 -0.020 0.001 -0.992 0.106 -0.002 0.0 0.0 1.0 0.0 ±0.000.07 ±0.00-1.04 ±0.0020.0 ±0.0030.0 0.75 ±0.004 ±0.002 ±0.003 ±0.002 0.75

1.00 1.00 Ideal EstimatedExact Estimated

Figure 5: X π -gate’s PTM estimation, for IBM Q Melbourne’s Figure 7: Y π -gate’s PTM estimation, for IBM Q Yorktown’s 3th 2 2 11th qubit. In an ideal case, the purple entries would have value qubit. In an ideal case, the purple entries would have value ”1”, ”1”, the red entry ”-1” and the rest ”0”. The matrix’s entries, the red entry ”-1” and the rest ”0”. rij , are represented in the standard order, with i = 0, 1, 2, 3 indexing the matrix line and j = 0, 1, 2, 3 the column. 4.2. Comparison with IBM’s calibrations Measurement, E Having a complete qubit characterization, one can 1.00 also compute the error figures used by IBM to de- 0.75 scribe their devices. In this way, we compared our 0.76 (-0.003±0.003)- 1.0 0.0 results with IBM’s calibration data, which provided ±0.02 i (0.006±0.002) 0.50 values for the single-qubit gate and readout errors. 0.25

0.00 Id Error: IBM vs GST 0.0040 0.25 IBM = GST (-0.003±0.003)+ 0.23 0.0035 Qubit 0 0.0 0.0 i (0.006±0.002) ±0.02 0.50 Qubit 1 0.0030 Qubit 2 0.75 Qubit 3 0.0025 Qubit 4 1.00 Ideal Estimated 0.0020

0.0015 Figure 6: GST characterization of IBM Q Melbourne’s qubit 7’s measurement operator. In its ideal form, E = |0i h0|. The GST's estimation 0.0010

matrix’s entries, rij , are represented in the standard order, with 0.0005 i = 0, 1 indexing the matrix line and j = 0, 1 the column. 0.0000 0.0000 0.0002 0.0004 0.0006 0.0008 0.0010 IBM's value most substantial error out of all the characterized Figure 8: IBM vs GST: Identity gate errors in IBM Q Yorktown. The dashed grey line represents where the points should lie if qubits, where the deviations from the ideal case GST’s estimations and IBM’s values were identical. are unequivocal even though the uncertainties are one order of magnitude above those of figure 5. For the gate errors, we generally witnessed that Another feature particular to GST is its ability to GST’s precision tends to be too low for the order of determine the actual values of the ”ideally zero” pa- magnitude of the errors we are dealing with; even rameters of a PTM, in which sometimes significant for the points with smaller errors we did not see any deviations are concentrated. This is especially rel- trend, with the GST’s gate error estimation lying ei- evant because the widely used gate fidelity, from ther below, above or in agreement with those of eq. (43), only looks at the overlap between the two IBM, depending on the qubit. The exception to the PTM’s, which means that PTM values that would rule was the error of the identity gate in IBM Q York- be zero in the perfect case do not affect the gate town’s qubits, which we compare with IBM’s value error. Take for instance the Y π -gate estimation in in figure 8, and where we see that IBM’s gate error 2 figure 7: although the ”non-zero” parameters of the values show a tendency to be underestimated. PTM are close to their ideal values, there are sig- As for the readout errors, exemplified in figure 9, nificant deviations in the ”ideally zero” parameters. we noticed a much more consistent match between the GST’s estimations and IBM’s values. For IBM Q

8 Melbourne’s qubits, we confirmed that most of the Readout error evolution qubits had readout errors in the 2.5%−7.5% range, IBM 0.225 Calibration with qubit 7 having a significantly higher readout GST error than the rest — something we had already 0.200 remarked through figure 6, which exhibits a read- 0.175 out error that is even larger than the one provided 0.150 by IBM. 0.125 Readout error

0.100

Readout errors: IBM vs GST 0.075 IBM = GST 0.050 0.25 Qubit 0 0 10 20 30 40 50 Qubit 1 Time (h) Qubit 2 0.20 Qubit 3 Figure 10: Readout error of IBM Q Melbourne’s qubit 7 over Qubit 4 time. The grey boxes highlight when the calibrations were pre- 0.15 Qubit 5 Qubit 6 formed, while the red points are the readout error’s reported by Qubit 7 IBM at the time of the calibrations. 0.10 Qubit 8 Qubit 9 GST's estimation Qubit 10 0.05 Qubit 11 Qubit 12 5. Conclusions, findings and achievements Qubit 13 0.00 In this work, we set out to implement a fully func- 0.000 0.025 0.050 0.075 0.100 0.125 0.150 0.175 IBM's value tioning qubit characterization tool based on the Figure 9: IBM vs GST: Readout errors in IBM Q Melbourne. GST framework. This process involved not only the The dashed grey line represents where the points should lie if study and subsequent programming of the whole GST’s estimations and IBM’s values were identical. procedure, but also a thorough examination of its capabilities and shortcomings. In this regard, we found that the method can produce reliable esti- 4.3. Device performance over time mations, but only under a specific set of potentially Using GST, we chose to study IBM Q Melbourne’s demanding conditions. th 7 qubit’s behaviour over time, mainly to probe the Experimentally, we found that GST demands evolution of its readout error. several experiments to be performed (in our case, The results, taken over around 48 hours, are pre- we needed 92 distinct experiments) and each ex- sented in figure 10, where we observe how erratic periment to be repeated a large enough number of the qubit’s readout is over time and how far apart times (in our case, around 25000) to mitigate sam- its error can be from the values provided in IBM’s pling noise. This results, depending on the device, calibrations (in red). Although we cannot discern on considerably long data acquisition times (in our any specific pattern to these oscillations, these re- case, for the IBM Q devices, of around 40 minutes), sults show that the system’s readout error repeat- which leaves room for the system to drift signifi- edly changes in a rate of 1 − 2% per hour, which cantly throughout the data collection phase. If this is especially worrying when we consider that our is the case, then GST’s capabilities are compro- data collection takes a total of 40 minutes. mised by its inability to contemplate non-markovian These fluctuations have far-reaching implica- behaviour. tions, in the sense that they compromise the ac- When it comes to the estimation procedure, we curacy of the whole estimations; this becomes discovered that the global minima of the MLE particularly detrimental when considering its influ- cost functions were degenerate. This degener- ence on the accuracy of smaller quantities like the acy stemmed mainly from correlated deviations in single-qubit gate errors. In fact, what we observe the initial state and readout estimations that com- in this case is that the gate error estimations end pensated one another and thus produced no net up having error bars that are too large to observe change in the cost function value. This was found whether some type of fluctuations are happening to be a property endemic to the probability estima- to a significant extent — and so we refrain from tor of eq. (30), and thus present in all cost func- showing those points. We stress, however, that in tions that try to match estimations to experimental this particular case this is an unavoidable conse- points. Since this behaviour is particularly detri- quence that would arise in the results of even the mental to the accuracy of the correlated quantities, most sophisticated GST implementations, simply if a useful estimation of one is desired, it is neces- because we are dealing with measurements that sary to fix the other, thus forcing a new assumption drift significantly throughout the data acquisition, about the system to be made. which is something the physical model does not We did not attempt to replicate the state of the contemplate. art GST implementation because of its prohibitive

9 demands in data collection. Instead, we based our [6] D. Greenbaum. Introduction to Quantum Gate work on the more primitive implementation of ref. Set Tomography. 9 2015. [6], upon which we made some modifications and improvements. These included the incorporation [7] E. Knill, et al. Randomized benchmark- of an error estimation procedure and the adoption ing of quantum gates. Physical Review A, of differential evolution as a simple and efficient 77(1):012307, 1 2008. algorithm for global optimization. Moreover, we [8] D. Leibfried, et al. Experimental Determi- adapted the original method to account for entries nation of the Motional Quantum State of a in the data tensor that corresponded to identical Trapped Atom. Technical report, 1996. experiments and fixed the existence of more free parameters than experimental points by increasing [9] E. Magesan, R. Blume-Kohout, and J. Emer- the number of different experiments without intro- son. Gate fidelity fluctuations and quan- ducing extra gates. tum process invariants. Physical Review A, By applying our GST implementation to the IBM 84(1):012309, 7 2011. Q devices, we noticed that not all results were of equal precision, with some displaying larger uncer- [10] E. Magesan, J. M. Gambetta, and J. Emerson. tainties than others as a virtue of their cost function Scalable and Robust Randomized Bench- landscape at the convergence point. Usually, we marking of Quantum Processes. Physical Re- found that the average single-qubit gate quality of view Letters, 106(18):180504, 5 2011. IBM’s devices is close to the limit of GST’s capac- [11] E. Magesan, J. M. Gambetta, and J. Emer- ity to consistently and accurately detect deviations son. Characterizing quantum gates via ran- from their ideal form. This was not the case for the domized benchmarking. Physical Review A, readout errors, whose estimated values have ad- 85(4):042311, 4 2012. equate uncertainty values, stemming from the fact that readout errors are usually one to two orders of [12] S. T. Merkel, et al. Self-consistent quan- magnitude higher than single-qubit gate errors. tum process tomography. Physical Review A, Nonetheless, we found some instances in which 87(6):062119, 6 2013. GST’s estimations conflicted with IBM’s characteri- zation values. Many hypotheses can be formulated [13] M. A. Nielsen. A simple formula for the aver- to explain this, including ones that put into ques- age gate fidelity of a quantum dynamical op- tion either their or our characterization procedures. eration. Physics Letters A, 303(4):249–252, On the other hand, by characterizing a qubit with a 10 2002. particularly lousy readout over time, we found re- [14] M. A. Nielsen and I. L. Chuang. Quantum current fluctuations in the readout error, often in Computation and . Cam- the order of 1% − 2% per hour. Although it is diffi- bridge University Press, 2000. cult to say whether such drifts are present in the other qubits, smaller fluctuations would be suffi- [15] K. Price, R. M. Storn, and J. A. Lampinen. Dif- cient to explain the discrepancy between our and ferential Evolution: A Practical Approach to IBM’s characterizations. Global Optimization (Natural Computing Se- ries). Springer-Verlag, Berlin, Heidelberg, References 2005. [1] Active quantum devices & simulators - IBM Q Experience. [16] R. Storn and K. Price. Differential Evolution – [2] R. Blume-Kohout, et al. Robust, self- A Simple and Efficient Heuristic for global Op- consistent, closed-form tomography of quan- timization over Continuous Spaces. Journal tum logic gates on a trapped ion qubit. 10 of Global Optimization, 11(4):341–359, 1997. 2013. [17] G. Upton and I. Cook. A Dictionary of Statis- [3] R. Blume-Kohout, et al. Demonstration of tics. Oxford University Press, 2008. qubit operations below a rigorous fault toler- ance threshold with gate set tomography. Na- ture Communications, 8:14485, 2 2017. [4] I. L. Chuang and M. A. Nielsen. Prescription for experimental determination of the dynam- ics of a quantum black box. 10 1996. [5] P. J. Coles, et al. Imple- mentations for Beginners. 4 2018.

10