Fault-tolerant magic state preparation with flag qubits

Christopher Chamberland1,2 and Andrew Cross1

1 IBM T. J. Watson Research Center, Yorktown Heights, NY, 10598, United States
2 Institute for Quantum Computing and Department of Physics and Astronomy, University of Waterloo, Waterloo, Ontario, N2L 3G1, Canada

May 16, 2019

arXiv:1811.00566v2 [quant-ph] 14 May 2019

Christopher Chamberland: [email protected]
Andrew Cross: [email protected]

Magic state distillation is one of the leading candidates for implementing universal fault-tolerant logical gates. However, the distillation circuits themselves are not fault-tolerant, so there is additional cost to first implement encoded Clifford gates with negligible error. In this paper we present a scheme to fault-tolerantly and directly prepare magic states using flag qubits. One of these schemes requires only three ancilla qubits, even with noisy Clifford gates. We compare the physical qubit and gate cost of our scheme to the magic state distillation protocol of Meier, Eastin, and Knill (MEK), which is efficient and uses a small stabilizer circuit. For low enough noise rates, we show that in some regimes the overhead can be improved by several orders of magnitude compared to the MEK scheme which uses Clifford operations encoded in the codes considered in this work.

1 Introduction

Certain algorithms can be implemented efficiently on quantum computers, whereas the best known classical algorithms require superpolynomial resources [1,2]. At present, however, quantum devices are dramatically noisier than their classical counterparts. For all but the shortest depth quantum computations to succeed with high probability, operations will need to be performed with very low failure rates. Fault-tolerant quantum error correction provides one way to achieve the desired logical failure rates.

A straightforward way to implement fault-tolerant logical gates is to use transversal operations. However, by the Eastin-Knill theorem, for any error correcting code, there will always be at least one gate in a universal gate set which cannot be implemented fault-tolerantly using transversal operations [3]. In recent years, many proposals for fault-tolerant universal gate constructions have been introduced [4–9]. One of the earliest proposals, known as magic state distillation, uses resource states that, along with stabilizer operations, can simulate a non-Clifford operation [10, 11]. These resource states are called magic states since they can also be distilled using stabilizer operations. Despite considerable effort in alternative schemes, magic state distillation remains a leading candidate for universal fault-tolerant quantum computation.

Recently, very efficient distillation protocols have been developed which require substantially fewer resource states to achieve a desired target error rate of the magic state being distilled [12–15]. When studying magic state distillation protocols, it is often assumed that all Clifford operations can be implemented perfectly and that only the resource states can introduce errors into the protocol. However, with current quantum devices, two-qubit gates are amongst the noisiest components of the circuits. In practice, Clifford gates with very low failure rates can be achieved by performing encoded versions of the gates in large enough error correcting codes, such that the failure rates of the Clifford operations are negligible compared to the non-Clifford components of the circuit. However, the overhead cost associated with performing magic state distillation with encoded Clifford operations is often not considered (there are exceptions such as [16–19]). If the overhead cost of encoded Clifford operations is taken into account, it is not clear that magic state distillation schemes which minimize the resource state cost would achieve the lowest overhead when used in a quantum algorithm.

Recently, new schemes using flag qubits have been introduced to implement error correction protocols using the minimal number of ancilla qubits to measure the code's stabilizer generators [20–25]. The idea behind flag error correction is to use extra ancilla qubits which flag when $v \le t$ faults result in a data qubit error of weight greater than $v$ (here $t = \lfloor (d-1)/2 \rfloor$ where $d$ is the distance of the code). The flag information can then be used in addition to the error syndrome to deduce the data qubit error.

Accepted in Quantum 2019-05-14, click title to verify

In this paper, we propose a new scheme to fault-tolerantly prepare magic states which requires a minimal amount of extra ancilla qubits (in our case only one extra qubit). In particular, we consider a full circuit-level noise model (which includes noisy Clifford operations) where gates can be applied between any pair of qubits. In this model, we compare the overhead cost of our scheme to a magic state distillation scheme introduced by Meier, Eastin and Knill (MEK), due to its efficiency and small circuit size [26]. Using the same encoded Clifford operations in both schemes, we show that in some regimes the qubit and gate overhead cost of our scheme can be smaller by several orders of magnitude as compared to the MEK scheme. We note that currently, state-of-the-art surface-code-based magic state distillation schemes are given in [27, 28]. Furthermore, an efficient state injection scheme using the surface code is given in [29]. The surface-code-based magic state distillation schemes have been shown to achieve logical error rates of $10^{-11}$ for physical error rates of $10^{-3}$ and can thus tolerate larger physical error rates than the scheme proposed in this work. On the other hand, in low noise rate regimes ($10^{-5} \le p \le 10^{-4}$), we show that the qubit and gate overhead cost of our scheme is lower compared to distance-three surface code implementations of magic state distillation. For larger distances, a more thorough analysis of surface code implementations of magic state distillation for the error rates mentioned above is required to determine if our scheme has a smaller overhead. The scheme we numerically analyze in this work does require code concatenation for distances $d > 3$ and thus requires non-local connectivity.^1 However, the goal of this work is not solely to outperform the surface code, but instead to provide alternative magic state preparation schemes with low overhead. Indeed the proposed scheme in this work could be accessible to near term ion trap based architectures which could be amongst the first experiments to demonstrate the fault-tolerant implementation of logical non-Clifford gates.

We point out that a fault-tolerant magic state preparation scheme was previously considered by Aliferis, Gottesman and Preskill (AGP) in [30]. However, the scheme proposed by AGP requires the preparation and verification of large cat state ancillas to perform the logical measurements. Furthermore, Steane error correction, which requires a large number of extra ancilla qubits, was used for the error correction circuits. Our scheme does not require the preparation of large ancilla states, uses smaller error correction circuits, and has an improved threshold compared to the scheme proposed by AGP.

The paper is structured as follows. In Section 2 we introduce the basic notation and noise model that we use throughout the remainder of the manuscript. In Section 3 we describe our scheme for fault-tolerantly preparing magic states. We provide both an error detecting and an error correcting scheme. Proofs of fault-tolerance are given in Appendix A. In Section 4 we briefly review the MEK magic state distillation scheme and in Section 5 we compare the qubit and gate overhead costs of both schemes. Details of the overhead and numerical analysis are provided in Appendices C to E.

2 Basic notation and noise model

We define $\mathcal{P}_n^{(1)}$ to be the n-qubit Pauli group (i.e. n-fold tensor products of the identity $I$ and the three Pauli matrices $X$, $Y$, and $Z$ and all scalar multiples of them by $\pm 1$ and $\pm i$). The Clifford group is defined by $\mathcal{P}_n^{(2)} = \{U : \forall P \in \mathcal{P}_n^{(1)},\ U P U^\dagger \in \mathcal{P}_n^{(1)}\}$ and is generated by the single-qubit Hadamard and $Y(\frac{\pi}{2}) = e^{-i\frac{\pi}{4}Y}$ gates

$$H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad \text{and} \quad Y\!\left(\tfrac{\pi}{2}\right) = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \tag{1}$$

as well as $Z(\frac{\pi}{2}) = e^{-i\frac{\pi}{4}Z}$ and the two-qubit CNOT gate (which we write as $C_X$) with

$$C_X |a\rangle|b\rangle = |a\rangle|a \oplus b\rangle, \tag{2}$$

where $|a\rangle$ and $|b\rangle$ are computational basis states. In general, a controlled-$U$ gate is

$$C_U = \frac{1}{2}(I + Z) \otimes I + \frac{1}{2}(I - Z) \otimes U. \tag{3}$$

Figure 1: Circuit for simulating a $T$ gate using an $|H\rangle$ state along with $Y(\frac{\pi}{2})$ and $C_Y$ gates and a measurement in the $Y$-basis. The $Y(\frac{\pi}{2})$ gate is only applied if the measurement outcome is $+1$. The black rectangular box represents an idle location. The circuit for simulating $T^\dagger$ is obtained by replacing $Y(\frac{\pi}{2})$ with $Y(-\frac{\pi}{2})$ and applying this gate if the measurement outcome is $-1$.

The state

$$|H\rangle = \cos\frac{\pi}{8}|0\rangle + \sin\frac{\pi}{8}|1\rangle = T|0\rangle, \tag{4}$$

is the $+1$ eigenstate of the Hadamard operator and

$$T = e^{-i\frac{\pi}{8}Y} = \begin{pmatrix} \cos\frac{\pi}{8} & -\sin\frac{\pi}{8} \\ \sin\frac{\pi}{8} & \cos\frac{\pi}{8} \end{pmatrix} \tag{5}$$

does not belong to the Clifford group.^2

^1 In addition we show how our scheme can be extended to higher distance color codes, but these are not numerically analyzed in this work.

^2 Note that some references define the $T$ gate as $T = \frac{e^{i\pi/4}}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix}$, which is a Clifford gate, while others define $T = \mathrm{diag}(1, e^{i\pi/4})$, which is non-Clifford but Clifford-equivalent to the gate we have defined as $T$.
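The definitions in Eqs. (1) to (5) can be checked numerically. The following is a minimal sketch in plain Python, not part of the original analysis; the helper names (`matmul`, `kron`, `conj`, `controlled`) are introduced here for illustration only. It confirms that $H$ and $Y(\frac{\pi}{2})$ map Paulis to Paulis, that Eq. (3) with $U = X$ reproduces the CNOT of Eq. (2), that $|H\rangle = T|0\rangle$ is a $+1$ eigenstate of $H$, and that $T$ does not map $X$ to a Pauli.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    """Kronecker product; the first factor acts on the control qubit."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def lin(ca, a, cb, b):
    """Linear combination ca*a + cb*b."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a numerical tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def conj(u, p):
    """Conjugation U P U^dagger."""
    return matmul(matmul(u, p), dagger(u))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]        # Eq. (1)
Y90 = [[s, -s], [s, s]]      # Y(pi/2), Eq. (1)

c8, s8 = math.cos(math.pi / 8), math.sin(math.pi / 8)
T = lin(c8, I2, -1j * s8, Y)  # exp(-i pi Y / 8), Eq. (5)
H_STATE = [[c8], [s8]]        # |H> as a column vector, Eq. (4)

def controlled(u):
    """Controlled-U gate of Eq. (3)."""
    return lin(0.5, kron(lin(1, I2, 1, Z), I2), 0.5, kron(lin(1, I2, -1, Z), u))

CX = controlled(X)
```

Writing the gates as explicit matrices keeps the check independent of any quantum-computing library; the same assertions could equally be written with a numerical array package.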

We also write

$$|{-H}\rangle = Y|H\rangle \tag{6}$$

as the $-1$ eigenstate of the Hadamard operator.

The $|H\rangle$ state is an example of a magic state. Magic states can be distilled to produce reliable states from a larger number of noisy copies using only stabilizer operations [11]. The reliable magic states can be used, together with stabilizer operations, as resource states for universal quantum computation. In particular, as shown in Fig. 1, the $|H\rangle$ state, along with $Y(\frac{\pi}{2})$, $C_Y$, and a measurement in the $Y$-basis, can be used to simulate a $T$ gate. Note that some schemes choose to distill the state

$$|A_{\frac{\pi}{4}}\rangle \equiv \frac{1}{\sqrt{2}}\left(|0\rangle + e^{i\frac{\pi}{4}}|1\rangle\right) = e^{i\frac{\pi}{8}} H S^\dagger |H\rangle, \tag{7}$$

which is equivalent to the $|H\rangle$ state up to products of Clifford gates. For instance, see the $|A_{\frac{\pi}{4}}\rangle$ distillation scheme in [11].

Throughout this paper, we will use the following circuit-level noise model in all simulations:

1. With probability $p$, each single-qubit gate location is followed by a Pauli error drawn uniformly and independently from $\{X, Y, Z\}$.
2. With probability $p$, each two-qubit gate is followed by a two-qubit Pauli error drawn uniformly and independently from $\{I, X, Y, Z\}^{\otimes 2} \setminus \{I \otimes I\}$.
3. With probability $\frac{2p}{3}$, the preparation of the $|0\rangle$ state is replaced by $|1\rangle = X|0\rangle$. Similarly, with probability $\frac{2p}{3}$, the preparation of the $|+\rangle$ state is replaced by $|-\rangle = Z|+\rangle$.
4. With probability $p$, the preparation of the $|H\rangle$ state is replaced by $P|H\rangle$ where $P$ is a Pauli error drawn uniformly and independently from $\{X, Y, Z\}$.
5. With probability $\frac{2p}{3}$, any single-qubit measurement has its outcome flipped.
6. Lastly, with probability $p/100$, each idle gate location is followed by a Pauli error drawn uniformly and independently from $\{X, Y, Z\}$.

Our assumption of $p/100$ idle error is valid for systems whose gate errors are far from the limits set by coherence times. For example, trapped ion experiments have coherence times $T_2 = 2T_1$ on the order of 1–10 seconds with gate times on the order of 10 µs. This suggests idle error probabilities of $10^{-5}$ to $10^{-6}$, while two-qubit gate infidelities are on the order of $10^{-3}$ to $10^{-4}$ [31, 32]. On the other hand, the assumption may not hold in systems such as superconducting qubits, whose gates currently achieve infidelities near the coherence limit [33]. Regardless, the concrete schemes we analyze all use the same underlying quantum code family and Clifford gate implementations, so we expect comparisons between them to be robust.

In the following sections, all Clifford gates and resource states will be encoded in the Steane code (see Section 3) which, paired with flag qubits, will allow us to obtain encoded magic states with low overhead. The code distance will be increased through code concatenation. Since the encoded versions of the gates and states are implemented in a fault-tolerant way, the failure probability for each logical fault $E$ of a gate $G$ at a physical error rate $p$ resulting from the malignant event $\mathrm{mal}_E^{(1)}$ (where the superscript denotes the level of concatenation) can be upper bounded as

$$\Pr[\mathrm{mal}_E^{(1)} \mid G, p] \le \sum_{k=\lceil d/2 \rceil}^{L_G} c(k)\, p^k \equiv \Gamma_G^{(1)}, \tag{8}$$

where $c(k)$ denotes the number of weight-$k$ errors which can lead to a logical fault and $L_G$ is the total number of circuit locations in the logical gate $G$. At the first concatenation level, we performed Monte-Carlo simulations with the above noise model to estimate the coefficients $c(k)$ for each logical gate and encoded state.

As was shown in [34–36], Eq. (8) can be generalized to the level-$l$ concatenation level where each physical gate is replaced by its level-$(l-1)$ Rec^3 (see [30] for more details). The error rate at the $l$-th concatenation level can be bounded as

$$\Pr[\mathrm{mal}_E^{(l)} \mid G, p] \le \sum_{k=\lceil d/2 \rceil}^{L_G} c(k) \left(\Gamma_G^{(l-1)}\right)^k \equiv \Gamma_G^{(l)}. \tag{9}$$

In other words, to estimate the logical failure rate of a level-$l$ gate, the probability of failure for all physical gates $G$ in our level-$l$ simulation will be replaced by $\Pr[\mathrm{mal}_E^{(l-1)} \mid G, p]$ obtained from a level-$(l-1)$ Monte-Carlo simulation.

^3 Rectangles (Rec's) are encoded gates with trailing error correction units. Extended rectangles, which are encoded gates with leading and trailing error correction units, are abbreviated as exRec's.
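The Monte-Carlo simulations referred to above draw faults from the six-item circuit-level noise model listed earlier in this section. A minimal sketch of such a fault sampler is given below. The location labels (`gate1`, `gate2`, `prep0`, `prep+`, `prepH`, `meas`, `idle`) and the function name `sample_fault` are hypothetical names introduced here for illustration; the paper does not specify its simulation code.

```python
import itertools
import random

SINGLE_PAULIS = ["X", "Y", "Z"]
# The 15 non-identity two-qubit Paulis of item 2.
TWO_QUBIT_PAULIS = [a + b for a, b in itertools.product("IXYZ", repeat=2)
                    if a + b != "II"]
KINDS = ["gate1", "gate2", "prep0", "prep+", "prepH", "meas", "idle"]

def sample_fault(kind, p, rng):
    """Return the fault drawn at one circuit location, or None if fault-free.

    kind is one of the (hypothetical) labels in KINDS, p is the physical
    error rate, and rng is a random.Random instance.
    """
    if kind == "gate1":   # item 1: single-qubit gate
        return rng.choice(SINGLE_PAULIS) if rng.random() < p else None
    if kind == "gate2":   # item 2: two-qubit gate
        return rng.choice(TWO_QUBIT_PAULIS) if rng.random() < p else None
    if kind == "prep0":   # item 3: |0> replaced by |1> = X|0>
        return "X" if rng.random() < 2 * p / 3 else None
    if kind == "prep+":   # item 3: |+> replaced by |-> = Z|+>
        return "Z" if rng.random() < 2 * p / 3 else None
    if kind == "prepH":   # item 4: |H> replaced by P|H>
        return rng.choice(SINGLE_PAULIS) if rng.random() < p else None
    if kind == "meas":    # item 5: measurement outcome flipped
        return "flip" if rng.random() < 2 * p / 3 else None
    if kind == "idle":    # item 6: idle location, rate p/100
        return rng.choice(SINGLE_PAULIS) if rng.random() < p / 100 else None
    raise ValueError(f"unknown location kind: {kind}")
```

A full simulation would additionally propagate the sampled Paulis through the circuit and decode; that part depends on the circuits of Section 3 and is not sketched here.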

For the physical error rates we consider, higher order terms are found to be relatively small, so we only estimate the leading order terms. More details can be found in [34–36].

Table 1: Stabilizer generators and logical operators of the [[7, 1, 3]] Steane code and the [[4, 2, 2]] error detecting code.

                         [[7, 1, 3]] Steane code              [[4, 2, 2]] code
Stabilizer generators    g1 = XIXIXIX                         g1 = XXXX
                         g2 = IIIXXXX                         g2 = ZZZZ
                         g3 = IXXIIXX
                         g4 = ZIZIZIZ
                         g5 = IIIZZZZ
                         g6 = IZZIIZZ
Logical operators        $\overline{X} = X^{\otimes 7}$       $X_1 = XXII$, $X_2 = XIIX$
                         $\overline{Z} = Z^{\otimes 7}$       $Z_1 = ZIIZ$, $Z_2 = ZZII$

Figure 2: (a) Non-destructive measurement of the Hadamard operator. (b) The $C_H$ gate can be decomposed into the gates $T^\dagger$, $C_Z$ and $T$. As was shown in Fig. 1, the $T$ gates can be simulated using only Clifford gates and an $|H\rangle$ resource state.

3 Preparing magic states in the Steane code

One way to measure the Hadamard operator using only Clifford gates and an $|H\rangle$ resource state is shown in Fig. 2. To obtain resource states with high fidelity (which is required for universal quantum computation), one method is to encode them into an error detecting or error correcting code and perform several rounds of state distillation. If the physical error rate is below some threshold (which depends on the codes and distillation routine), the error rate can be exponentially suppressed with the number of distillation rounds. Another, more direct method is to fault-tolerantly prepare resource states in a large distance error correcting code. The distance is chosen to obtain the desired logical error suppression. In particular, in the presence of noisy Clifford gates, distillation routines require encoded Clifford operations which can substantially increase the qubit and gate overhead of the routine. We now show how to fault-tolerantly prepare an $|H\rangle$ state using the Steane code and flag qubits in order to achieve a lower gate and qubit overhead. We will compare the overhead requirements of our methods with those of the Meier-Eastin-Knill (MEK) distillation routine [26]. A review of the MEK distillation routine is given in Section 4.

The Steane code is a [[7, 1, 3]] CSS (Calderbank-Shor-Steane) code [37, 38] with stabilizer generators and logical operators given in Table 1. The Steane code has the property that all Clifford gates can be implemented transversally. In particular, the logical Hadamard operator is given by $\overline{H} = H^{\otimes 7}$. The fault-tolerant preparation of an encoded $|A_{\frac{\pi}{4}}\rangle$ state (see Eq. (7)) has been previously analyzed in [30]. However, Steane error correction (not to be confused with the Steane code) was used for the EC units and cat states were fault-tolerantly prepared in order to measure $\overline{T}\,\overline{X}\,\overline{T}^\dagger$.^4 The high cost of preparing cat states and implementing Steane EC's results in a large qubit overhead.

Instead of using Steane-EC circuits, in this work we consider an EC circuit recently introduced by Reichardt [25] for measuring the stabilizers of the Steane code with low circuit depth and which requires only three ancilla qubits (see Fig. 3). The small qubit overhead is achieved with the use of flag qubits which can detect events where errors from a single fault spread to an $X$ or $Z$ error of weight greater than one on the data. The fault-tolerant properties of the circuit are discussed in Appendix A.1.

Flag qubits can also be used to fault-tolerantly measure the logical Hadamard operator with only one extra ancilla qubit (and thus do not require the fault-tolerant preparation of cat states). In an error detection scheme, where the $|H\rangle$ state is rejected if errors are detected, the circuit in Fig. 5a can be used to measure the logical Hadamard operator. If a fault results in an $X$ or $Z$ data qubit error of weight greater than one, the flag qubit (ancilla prepared in the $|0\rangle$ state) will be measured as $-1$ and the $|H\rangle$ state will be rejected. The full error detecting scheme is shown in Fig. 5b.^5 Note that our scheme uses only 10 qubits and is shown to be fault-tolerant in Appendix A.2.

Error detection schemes have extra overhead arising from starting the process anew when a state is rejected (although they achieve much higher pseudo-thresholds).

^4 Note that the $T$ gates considered in this paper are Clifford-equivalent to the $T$ gates considered in [30].

^5 In both Fig. 5a and Fig. 5b, the controlled Hadamard gates are implemented as shown in Fig. 2 and, for concatenation levels greater than one, the $T$ gates are implemented as in Fig. 1. See Appendix C for more details.

Figure 3: Fault-tolerant error correction circuit introduced by Reichardt [25] for measuring the stabilizers of the [[7, 1, 3]] Steane code. The ancilla qubits act as flag qubits that flag if a single fault results in a data-qubit error of weight greater than one.
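To make concrete what measuring the stabilizers yields in the ideal case, the sketch below computes the six-bit syndrome of every weight-one Pauli error from the generators g1 to g6 of Table 1 and checks that each one is distinct and non-trivial. It is an illustration under the assumption of perfect measurements; it does not model the flag qubits or the circuit of Fig. 3, and the function names are introduced here only for the example.

```python
X_GENS = ["XIXIXIX", "IIIXXXX", "IXXIIXX"]   # g1, g2, g3 of Table 1
Z_GENS = ["ZIZIZIZ", "IIIZZZZ", "IZZIIZZ"]   # g4, g5, g6 of Table 1
GENS = X_GENS + Z_GENS
LOGICAL_X = "X" * 7
LOGICAL_Z = "Z" * 7

def anticommute(p, q):
    """True if Pauli strings p and q anticommute.

    Two Pauli strings anticommute when they carry different non-identity
    Paulis on an odd number of qubits.
    """
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 1

def syndrome(error):
    """Six-bit syndrome: bit i is 1 if the error anticommutes with GENS[i]."""
    return tuple(int(anticommute(error, g)) for g in GENS)

def single_qubit_errors(n=7):
    """All 3n weight-one Pauli errors on n qubits."""
    for q in range(n):
        for pauli in "XYZ":
            yield "I" * q + pauli + "I" * (n - q - 1)

# Lookup table from syndrome to weight-one error; a collision would shrink it.
SYNDROME_TABLE = {syndrome(e): e for e in single_qubit_errors()}
```

Because all 21 weight-one errors map to different syndromes, a single data-qubit error can be identified from the full six-bit syndrome, which is the property a distance-three code needs.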

Figure 4: Non-fault-tolerant circuit for preparing a logical $|H\rangle$ state encoded in the [[7, 1, 3]] Steane code. We label this circuit by $|\overline{H}\rangle_{\mathrm{nf}}$. If there are no faults then $|\overline{H}\rangle_{\mathrm{nf}} = \overline{T}|\overline{0}\rangle$.

Figure 5: (a) Flag fault-tolerant circuit for measuring the logical Hadamard operator of the [[7, 1, 3]] Steane code in an error detection scheme. If a single fault results in a data qubit error of weight greater than one, the circuit will flag (the $|0\rangle$ state will be measured as $-1$) and the $|H\rangle$ state preparation scheme will begin anew. Note that the controlled Hadamard gates are implemented as shown in Fig. 2, where the $T$ gates are implemented as in Fig. 1. (b) Full fault-tolerant error detection scheme for preparing an encoded $|H\rangle$ state. The EC circuit is given in Fig. 3 and the non-fault-tolerant circuit for preparing the encoded $|H\rangle$ state is given in Fig. 4.

Alternatively, in Fig. 6 we show how the $|H\rangle$ state can be fault-tolerantly prepared in an error correction scheme using three flag qubits. The details of the implementation and a proof of fault-tolerance are given in Appendix A.3. Due to the low pseudo-threshold of the error correction scheme, we will focus on the error detection scheme of Fig. 5 in the remainder of this manuscript.

The main motivation for concatenating the Steane code in order to increase the code distance is that finding flag circuits to measure high weight operators while maintaining error distinguishability is quite challenging [23]. However, if a flag circuit satisfying the desired fault-tolerance criteria is found for a small code, the same circuit can be used at level-$l$ where each gate is a level-$(l-1)$ gate. Additionally, if the data used in a quantum computation is encoded in the same code used to prepare the $|H\rangle$ state, the encoded $|H\rangle$ state does not need to be decoded and can be used to directly apply a logical $T$ gate.

Lastly, we point out that following the 2-flag circuit construction of [23], it is possible to obtain a fault-tolerant circuit to measure the logical Hadamard operator of the [[17, 1, 5]] color code.


Figure 6: (a) Flag fault-tolerant circuit for measuring the logical Hadamard operator of the [[7, 1, 3]] Steane code. The extra flag qubits are used to localize errors occurring near the fourth controlled-Hadamard gate since these cannot be distinguished from errors occurring at the other controlled-Hadamard gates. Note that in order to distinguish errors from faults at other locations, the full six-bit error syndrome must be considered. That is, we cannot correct $X$ and $Z$ errors separately as is usually done in CSS constructions. (b) Full fault-tolerant error correction scheme for preparing an encoded $|H\rangle$ state. The EC circuit is given in Fig. 3 and the non-fault-tolerant circuit for preparing the encoded $|H\rangle$ state is given in Fig. 4. If a flag occurs in the third $H_m^{f2}$ measurement, an extra round of EC is performed.

Figure 7: 2-flag circuit for measuring the logical Hadamard operator of the [[17, 1, 5]] color code. This circuit can be used in an error detection scheme for fault-tolerantly preparing an $|H\rangle$ state encoded in the [[17, 1, 5]] color code. If there are $v \le 2$ faults which cause a data qubit error of weight greater than $v$, at least one of the flag qubits will flag and the $|H\rangle$ state will be rejected.

The circuit is given in Fig. 7 and can be used in an error detection scheme analogous to that of Fig. 5. EC circuits for measuring the stabilizers of the [[17, 1, 5]] color code were obtained in [23] and require at most four ancilla qubits. One could also measure the stabilizers using standard topological methods at the cost of requiring $O(n)$ ancillas. This circuit could be useful if a higher distance is required prior to concatenation or if a hybrid state-preparation and magic state distillation scheme is used (see Section 5 for more details).

4 Meier-Eastin-Knill distillation circuits

In the MEK distillation protocol, $|H\rangle$ states are encoded in the [[4, 2, 2]] error detecting code, whose stabilizer generators and logical operators are given in Table 1. For the [[4, 2, 2]] code, the operator $H^{\otimes 4}$ is a valid encoded gate and performs the operation $H_1 H_2 \mathrm{SWAP}_{12}$ where $\mathrm{SWAP}_{12}$ swaps the two encoded qubits. The circuit in Fig. 8 performs an encoded measurement of $H_1 H_2$ on a pair of encoded $|H\rangle$ states and measures the stabilizers of the [[4, 2, 2]] code (note that the circuit applies a Hadamard gate to one of the two encoded qubits). The routine accepts the pair of $|H\rangle$ states if both the $H^{\otimes 4}$ and syndrome measurements are trivial. More details can be found in [26].

Suppose that the desired output failure probability of the input resource states is $O(p^{2^l})$. Note that a single fault in some of the two-qubit gates in Fig. 8 can result in multi-qubit errors on the data that go undetected.^6 Since all of the gates in Fig. 8 fail following the noise model described in Section 2, they must be encoded in another code (which we choose to be the level-$l$ concatenated Steane code) in order to achieve the desired logical error rate in the output resource states. To provide a fair comparison between the overhead costs of our state preparation scheme with flag qubits and the MEK scheme, all exRec's contain the EC unit of Fig. 3. These features will play an important role when evaluating the qubit and gate overhead of the MEK scheme.

Many magic state distillation protocols use a randomization called twirling [11] to diagonalize each magic state in a convenient basis. This greatly simplifies the analysis so that the acceptance probabilities and logical error rates can be found in closed form. However, twirling is not necessary in a physical implementation, since there is always another protocol without the randomization that uses the optimal choice of gates [39]. In our case, we observe that the logical error rates increase if we twirl, so we simply use the input states as they occur.

^6 This is due to the fact that magic state distillation circuits are fault-tolerant for noise models where errors are only introduced in resource states.

5 Resource overhead comparison

In this section we compare the overhead cost of our fault-tolerant magic state preparation scheme for preparing an encoded $|H\rangle$ state in the concatenated Steane code to the overhead cost of the MEK scheme for distilling $|H\rangle$ states (also encoded in the concatenated Steane code). In particular, we will compare the qubit and gate overhead cost of both schemes. In what follows, a level-$l$ state or gate will always correspond to its encoded version in $l$ concatenation levels of the Steane code.

For the MEK magic state distillation protocol, we consider two different approaches. In the first approach, physical $|H\rangle$ states are teleported into level-2 $|H\rangle$ states and a subsequent round of MEK (with level-2 Clifford gates) is applied to produce a state with a logical error rate well-approximated by the form $a p^2 + b p^4$ below the level-2 pseudo-threshold. The $O(p^2)$ term is dominant at low error rates, and the $O(p^4)$ term is dominant at error rates near the pseudo-threshold. To obtain further error suppression, we then teleport the distilled state into a level-3 state and perform a level-3 round of MEK to obtain a state with a logical error rate of the form $a' p^4 + b' p^8$. An additional round of MEK could be performed to obtain an $|H\rangle$ state with logical error rate $O(p^8)$. However, for the physical error rates considered in this work ($p \in [10^{-5}, 10^{-4}]$), we found that there is no advantage in doing so.

In the second approach, we consider a hybrid scheme where a level-2 $|H\rangle$ state is first prepared using our fault-tolerant flag preparation scheme. The state is then teleported into a level-3 state and a round of MEK is performed, resulting in an $|H\rangle$ state with logical error rate $O(p^8)$. All teleportation schemes are performed using the methods described in Appendix B.

The qubit and gate overhead results for the magic state preparation scheme with flag qubits and the MEK schemes are shown in Figs. 9 to 11. Details of the analysis leading to the plots are given in Appendices C and D. Comparing the results, it is clear that the qubit and gate overhead cost of the fault-tolerant $|H\rangle$ state preparation scheme of Fig. 5 is substantially smaller than that of schemes involving MEK. The primary reasons for this are the high pseudo-threshold of the error detection scheme as well as the small size of the circuits compared to the circuits used by MEK (where we used the teleportation scheme of Appendix B to inject states at a higher concatenation level). Further, note that the hybrid scheme (Fig. 11) has smaller qubit and gate overhead costs compared to the full MEK scheme (Fig. 10), although the overhead is still much larger than that of the error detection scheme using flag qubits.


Figure 8: (a) First half of the MEK distillation circuit. (b) Second half of the MEK distillation circuit. The circuit is the same as given in [26] but has been written in detail to illustrate the idle qubit locations (filled rectangles). We reuse the $|H\rangle$ ancilla qubit instead of doing gates in parallel in order to minimize the overhead cost. A total of four $C_H$ gates are required to measure the operator $H_1 H_2$.
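The statement that $H^{\otimes 4}$ is a valid encoded gate of the [[4, 2, 2]] code acting as $H_1 H_2 \mathrm{SWAP}_{12}$ can be checked directly on the operators of Table 1. The sketch below is an illustration only (the function names are introduced here): it conjugates Pauli strings by a Hadamard on every qubit, which exchanges $X$ and $Z$ and sends $Y$ to $-Y$, and compares the result with the action expected of $H_1 H_2 \mathrm{SWAP}_{12}$, namely $X_1 \to Z_2$, $X_2 \to Z_1$, $Z_1 \to X_2$, $Z_2 \to X_1$.

```python
STABILIZERS = ["XXXX", "ZZZZ"]                       # [[4,2,2]] code, Table 1
LOGICALS = {"X1": "XXII", "X2": "XIIX", "Z1": "ZIIZ", "Z2": "ZZII"}

# Action of H1 H2 SWAP12 on the logical Paulis: swap the two encoded
# qubits, then exchange X and Z on each of them.
EXPECTED = {"X1": "Z2", "X2": "Z1", "Z1": "X2", "Z2": "X1"}

_SWAP_XZ = str.maketrans("XZ", "ZX")

def hadamard_all(pauli):
    """Conjugate a Pauli string by H on every qubit.

    Returns (sign, string): X and Z are exchanged, and each Y contributes
    a factor of -1 because H Y H = -Y.
    """
    sign = (-1) ** pauli.count("Y")
    return sign, pauli.translate(_SWAP_XZ)

def preserves_stabilizer_group():
    """H on all four qubits maps the generator set {XXXX, ZZZZ} to itself."""
    return {hadamard_all(s) for s in STABILIZERS} == {(1, s) for s in STABILIZERS}

def acts_as_h1_h2_swap():
    """Check the logical action against H1 H2 SWAP12."""
    return all(hadamard_all(LOGICALS[src]) == (1, LOGICALS[dst])
               for src, dst in EXPECTED.items())
```

For this choice of logical operators the images agree exactly, so no multiplication by stabilizers is needed in the comparison.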

Figure 9: Qubit and gate overhead for our fault-tolerant magic state preparation scheme using flag qubits. We consider target logical failure rates ranging from $10^{-6}$ to $10^{-9}$. The different jumps in the curves indicate that an additional concatenation level is required in order to achieve the desired logical failure rate. Note that the y axis of the qubit overhead plot is not displayed on a log scale. This is to show the increase in overhead cost due to the probability of rejecting a state when implementing the error detection scheme.
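The jumps mentioned in the caption of Fig. 9 reflect how many concatenation levels the bound of Eq. (9) needs before it drops below the target. The sketch below iterates a leading-order version of that bound, $\Gamma^{(l)} = \sum_k c(k)\,(\Gamma^{(l-1)})^k$ with $\Gamma^{(0)} = p$, and returns the first level meeting a target. The coefficient $c(2) = 1000$ used in the usage note is a made-up placeholder, not a value from the paper, and the sketch ignores the acceptance probability of the error detection scheme.

```python
def gamma_levels(p, coeffs, max_level):
    """Iterate the leading-order bound of Eq. (9).

    coeffs maps the exponent k to the (assumed) coefficient c(k).
    Returns [Gamma^(0), ..., Gamma^(max_level)] with Gamma^(0) = p.
    Each value is capped at 1 since it bounds a probability.
    """
    gammas = [p]
    for _ in range(max_level):
        prev = gammas[-1]
        bound = sum(c * prev ** k for k, c in coeffs.items())
        gammas.append(min(1.0, bound))
    return gammas

def levels_needed(p, target, coeffs, max_level=10):
    """Smallest concatenation level l >= 1 with Gamma^(l) <= target.

    Returns None if the target is not reached within max_level levels,
    which is what happens above the pseudo-threshold of the bound.
    """
    gammas = gamma_levels(p, coeffs, max_level)
    for level in range(1, max_level + 1):
        if gammas[level] <= target:
            return level
    return None
```

With the placeholder `coeffs = {2: 1000.0}`, the bound has a pseudo-threshold of $10^{-3}$; `levels_needed(1e-4, 1e-9, coeffs)` then asks for one more level than `levels_needed(1e-5, 1e-9, coeffs)`, which is the kind of step that appears as a jump in an overhead curve.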

Figure 10: Qubit and gate overhead for the MEK protocol applied to level-2 and level-3 $|H\rangle$ states. In order to achieve the desired logical failure rate, we teleport a level-2 $|H\rangle$ state into a level-3 $|H\rangle$ state and apply one round of level-3 MEK. In a level-$l$ simulation, all gates in the MEK circuit are encoded at level-$l$.

Figure 11: Qubit and gate overhead for the hybrid scheme. First a level-2 $|H\rangle$ state is fault-tolerantly prepared using the methods of Fig. 5. Afterwards the state is teleported to a level-3 state and a round of level-3 MEK is performed.

Lastly, we point out that due to the lower logical failure rates of the error detection scheme using flag qubits, three concatenation levels were sufficient to achieve logical failure rates of $10^{-9}$ for larger physical error rates compared to MEK (the same is true for other target logical failure rates). For example, suppose one wants to achieve a logical failure rate $p_L \le 10^{-9}$. In our error detection scheme, one must use at least four concatenation levels for physical error rates $p > 6 \times 10^{-5}$. For the MEK scheme, one requires four concatenation levels for physical error rates $p > 4 \times 10^{-5}$. Hence for physical error rates $4 \times 10^{-5} \le p \le 6 \times 10^{-5}$, the error detection scheme will require fewer qubits by several orders of magnitude compared to MEK (see Figs. 9 and 10).

6 Conclusion

The high cost of universal fault-tolerant logic is a challenging and enduring problem. In this work, we propose an alternative to magic state distillation that is based on recently discovered flag fault-tolerance techniques. We have designed a flag-based magic state preparation scheme that reduces the number of qubits and operations needed to prepare a reliable magic state, and does so by orders of magnitude in some error regimes. Furthermore, the error-detection-based state preparation circuit at level-1 has a high pseudothreshold and uses a total of 10 qubits. The circuit accepts around 80% of the time and has logical error rates near $10^{-4}$ at physical error rates near $3 \times 10^{-3}$ (see Table 4), making it an interesting candidate for experimental consideration.

The magic state preparation scheme relies on a transversal Hadamard operator and a flag circuit for measuring it fault-tolerantly. So that the operator's weight does not grow, we concatenate the code with itself to achieve some low logical error rate. The threshold is ultimately that of the concatenated Steane code in all of the schemes we analyze. It would be interesting to account for more realistic constraints on qubit connectivity, and to consider how to extend these techniques to families of higher distance (non-concatenated) codes, but we leave this to future work.

Another interesting application of our work is its use in state injection schemes [29, 40]. Using the fault-tolerant circuits for distance-three and distance-five color codes in Figs. 5 and 7 (the w-flag circuits described in [23] could also be used for higher distance color codes), encoded magic states with logical failure rates of $O(p^2)$ and $O(p^3)$ can be produced. Lattice surgery techniques could then be performed to further distill the input magic states [40, 41]. We note that the state injection scheme in [29] produces encoded magic states with logical failure rate $O(p)$ and thus for low enough physical error rates, our approach could achieve lower logical failure rates.

If codes such as topological codes were used to perform the MEK protocol at high physical error rates, then magic state distillation might outperform the flag-based scheme, since a large number of concatenation levels would be necessary. However, the ideas we have presented may be broadly applicable to improve the pseudothreshold of magic state preparation circuits using codes other than the Steane code. In this direction, we have given a 2-flag circuit for fault-tolerantly measuring the logical Hadamard operator of the 17-qubit color code. If lower logical error rates are needed, states could then be used in a hybrid approach that takes advantage of the topological code threshold. Lastly, one could use the w-flag circuit construction and EC circuits of [23] to fault-tolerantly prepare magic states for the family of color codes on a hexagonal lattice.

7 Acknowledgements

C.C. acknowledges IBM for its hospitality where all of this work was completed. C.C. also acknowledges the support of NSERC through the PGS D scholarship. We thank Sergey Bravyi, Earl Campbell, Tomas Jochym-O'Connor, and Ted Yoder for useful discussions. We thank Earl Campbell for sharing observations about the MEK protocol.

References

[1] P.W. Shor. Algorithms for quantum computation: Discrete logarithms and factoring. Proceedings., 35th Annual Symposium on Foundations of Computer Science, pages 124–134, 1994. DOI: 10.1109/SFCS.1994.365700.
[2] Quantum Algorithm Zoo. https://math.nist.gov/quantum/zoo/. Last Accessed: 2018-10-31.
[3] Bryan Eastin and Emanuel Knill. Restrictions on transversal encoded quantum gate sets. Phys. Rev. Lett., 102:110502, Mar 2009. DOI: 10.1103/PhysRevLett.102.110502. URL https://link.aps.org/doi/10.1103/PhysRevLett.102.110502.
[4] E. Knill, R. Laflamme, and W. Zurek. Threshold Accuracy for Quantum Computation. arXiv e-prints, art. quant-ph/9610011, Oct 1996. URL https://ui.adsabs.harvard.edu/abs/1996quant.ph.10011K.
[5] Tomas Jochym-O'Connor and Raymond Laflamme. Using concatenated quantum codes for universal fault-tolerant quantum gates. Phys. Rev. Lett., 112:010505, 2014. DOI: 10.1103/PhysRevLett.112.010505. URL http://link.aps.org/doi/10.1103/PhysRevLett.112.010505.
[6] Adam Paetznick and Ben W. Reichardt. Universal fault-tolerant quantum computation with only transversal gates and error correction. Phys. Rev. Lett., 111:090505, Aug 2013. DOI: 10.1103/PhysRevLett.111.090505. URL https://link.aps.org/doi/10.1103/PhysRevLett.111.090505.
[7] Jonas T. Anderson, Guillaume Duclos-Cianci, and David Poulin. Phys. Rev. Lett., 113:080501, Aug 2014. DOI: 10.1103/PhysRevLett.113.080501.
[8] Héctor Bombín. Dimensional jump in quantum error correction. New Journal of Physics, 18(4):043038, apr 2016. DOI: 10.1088/1367-2630/18/4/043038. URL https://doi.org/10.1088%2F1367-2630%2F18%2F4%2F043038.
[9] Theodore J. Yoder, Ryuji Takagi, and Isaac L. Chuang. Universal fault-tolerant gates on concatenated stabilizer codes. Phys. Rev. X, 6:031039, Sep 2016. DOI: 10.1103/PhysRevX.6.031039. URL https://link.aps.org/doi/10.1103/PhysRevX.6.031039.
[10] E. Knill. Fault-Tolerant Postselected Quantum Computation: Schemes. arXiv e-prints, art. quant-ph/0402171, Feb 2004. URL https://ui.adsabs.harvard.edu/abs/2004quant.ph..2171K.
[11] Sergey Bravyi and Alexei Kitaev. Universal quantum computation with ideal clifford gates and noisy ancillas. Phys. Rev. A, 71:022316, Feb 2005. DOI: 10.1103/PhysRevA.71.022316. URL https://link.aps.org/doi/10.1103/PhysRevA.71.022316.
[12] Jeongwan Haah, Matthew B. Hastings, D. Poulin, and D. Wecker. Magic state distillation with low space overhead and optimal asymptotic input count. Quantum, 1:31, October 2017. ISSN 2521-327X. DOI: 10.22331/q-2017-10-03-31. URL https://doi.org/10.22331/q-2017-10-03-31.
[13] Jeongwan Haah and Matthew B. Hastings. Codes and Protocols for Distilling T, controlled-S, and Toffoli Gates. Quantum, 2:71, June 2018. ISSN 2521-327X. DOI: 10.22331/q-2018-06-07-71. URL https://doi.org/10.22331/q-2018-06-07-71.
[14] Matthew B. Hastings and Jeongwan Haah. Distillation with sublogarithmic overhead. Phys. Rev. Lett., 120:050504, Jan 2018. DOI: 10.1103/PhysRevLett.120.050504. URL https://link.aps.org/doi/10.1103/PhysRevLett.120.050504.
[15] Jeongwan Haah, Matthew B. Hastings, David Poulin, and Dave Wecker. Magic state distillation at intermediate size. Quantum Info. Comput., 18(1&2):97–165, February 2018. ISSN 0114-0140.
[16] Austin G. Fowler, Matteo Mariantoni, John M. Martinis, and Andrew N. Cleland. Surface codes: Towards practical large-scale quantum computation. Phys. Rev. A, 86:032324, Sep 2012. DOI: 10.1103/PhysRevA.86.032324.
[17] Hayato Goto. Step-by-step magic state encoding for efficient fault-tolerant quantum computation. Scientific Reports, (4):7501, 2014. DOI: 10.1038/srep07501.
[18] Joe O'Gorman and Earl T. Campbell. Quantum computation with realistic magic-state factories. Phys. Rev. A, 95:032338, Mar 2017. DOI: 10.1103/PhysRevA.95.032338. URL https://link.aps.org/doi/10.1103/PhysRevA.95.032338.
[19] Daniel Litinski. A Game of Surface Codes: Large-Scale Quantum Computing with Lattice Surgery. Quantum, 3:128, March 2019. ISSN 2521-327X. DOI: 10.22331/q-2019-03-05-128. URL https://doi.org/10.22331/q-2019-03-05-128.
[20] Theodore J. Yoder and Isaac H. Kim. The surface code with a twist. Quantum, 1:2, April 2017. ISSN 2521-327X. DOI: 10.22331/q-2017-04-25-2. URL https://doi.org/10.22331/q-2017-04-25-2.
[21] Rui Chao and Ben W. Reichardt. Quantum error correction with only two extra qubits. Phys. Rev. Lett., 121:050502, Aug 2018. DOI: 10.1103/PhysRevLett.121.050502. URL https://link.aps.org/doi/10.1103/PhysRevLett.121.050502.
[22] Rui Chao and Ben W. Reichardt. Fault-tolerant quantum computation with few qubits. arXiv:quant-ph/1705.05365, 2017. URL https://arxiv.org/abs/1705.05365.
[23] Christopher Chamberland and Michael E. Beverland. Flag fault-tolerant error correction with arbitrary distance codes. Quantum, 2:53, February 2018. ISSN 2521-327X. DOI: 10.22331/q-2018-02-08-53. URL https://doi.org/10.22331/q-2018-02-08-53.
[24] Theerapat Tansuwannont, Christopher Chamberland, and Debbie Leung. Flag fault-tolerant error correction, measurement, and quantum computation for cyclic CSS codes. arXiv e-prints, art. arXiv:1803.09758, Mar 2018. URL https://ui.adsabs.harvard.edu/abs/2018arXiv180309758T.
[25] Ben W. Reichardt. Fault-tolerant quantum error correction for steane's seven-qubit color code with few or no extra qubits. arXiv:quant-ph/1804.06995, 2018. URL https://arxiv.org/abs/1804.06995.
[26] Adam M. Meier, Bryan Eastin, and Emanuel Knill. Magic-state distillation with the four-qubit code. Quantum Info. Comput., 13(3-4):0195–0209, 2013.
[27] Austin G. Fowler and Craig Gidney. Low overhead quantum computation using lattice surgery. arXiv e-prints, art. arXiv:1808.06709, Aug 2018. URL https://ui.adsabs.harvard.edu/abs/2018arXiv180806709F.
[28] Craig Gidney and Austin G. Fowler. Efficient magic state factories with a catalyzed $|CCZ\rangle$ to $2|T\rangle$ transformation. Quantum, 3:135, April 2019. ISSN 2521-327X. DOI: 10.22331/q-2019-04-30-135. URL https://doi.org/10.22331/q-2019-04-30-135.
[29] Ying Li. A magic state's fidelity can be superior to the operations that created it. New Journal of Physics, 17(2):023037, feb 2015. DOI: 10.1088/1367-2630/17/2/023037. URL https://doi.org/10.1088%2F1367-2630%2F17%2F2%2F023037.
[30] Panos Aliferis, Daniel Gottesman, and John Preskill. Quantum accuracy threshold for concatenated distance-3 codes. Quantum Info. Comput., 6(2):97–165, March 2006. ISSN 1533-7146. URL http://dl.acm.org/citation.cfm?id=2011665.2011666.
[31] Colin J Trout, Muyuan Li, Mauricio Gutiérrez, Yukai Wu, Sheng-Tao Wang, Luming Duan, and Kenneth R Brown. Simulating the performance of a distance-3 surface code in a linear ion trap. New Journal of Physics, 20(4):043038, apr 2018. DOI: 10.1088/1367-2630/aab341. URL https://doi.org/10.1088%2F1367-2630%2Faab341.
[32] A. Bermudez, X. Xu, M. Gutiérrez, S. C. Benjamin, and M. Müller. Fault-tolerant protection of near-term trapped-ion topological qubits under realistic noise sources. arXiv e-prints, art. arXiv:1810.09199, Oct 2018. URL https://ui.adsabs.harvard.edu/abs/2018arXiv181009199B.
[33] Maika Takita, Andrew W. Cross, A. D. Córcoles, Jerry M. Chow, and Jay M. Gambetta. Experimental demonstration of fault-tolerant state preparation with superconducting qubits. Phys. Rev. Lett., 119:180501, Oct 2017. DOI: 10.1103/PhysRevLett.119.180501. URL https://link.aps.org/doi/10.1103/PhysRevLett.119.180501.
[34] Adam Paetznick and Ben W. Reichardt. Fault-tolerant ancilla preparation and noise threshold lower bounds for the 23-qubit golay code. Quantum Info. Comput., 12(11-12):1034–1080, November 2012. ISSN 1533-7146. URL http://dl.acm.org/citation.cfm?id=2481569.2481579.
[35] Christopher Chamberland, Tomas Jochym-O'Connor, and Raymond Laflamme. Thresholds for universal concatenated quantum codes. Phys. Rev. Lett., 117:010501, Jun 2016. DOI: 10.1103/PhysRevLett.117.010501. URL https://link.aps.org/doi/10.1103/PhysRevLett.117.010501.
[36] Christopher Chamberland, Tomas Jochym-O'Connor, and Raymond Laflamme. Overhead analysis of universal concatenated quantum codes. Phys. Rev. A, 95:022313, Feb 2017. DOI: 10.1103/PhysRevA.95.022313.
[37] A. R. Calderbank and Peter W. Shor. Good quantum error-correcting codes exist. Phys. Rev. A, 54:1098–1105, Aug 1996. DOI: 10.1103/PhysRevA.54.1098.
[38] Andrew W. Steane. Multiple-Particle Interference and Quantum Error Correction. Proc. Roy. Soc. Lond., 452:2551–2577, 1996. URL http://www.jstor.org/stable/52827.
[39] Earl Campbell and Dan Browne. On the structure of protocols for magic state distillation. Lecture Notes in Computer Science, 5906.
[40] Andrew J. Landahl and Ciaran Ryan-Anderson. Quantum computing by color-code lattice surgery. arXiv e-prints, art. arXiv:1407.5103, Jul 2014. URL https://ui.adsabs.harvard.edu/abs/2014arXiv1407.5103L.
[41] Christophe Vuillot, Lingling Lao, Ben Criger, Carmen García Almudéver, Koen Bertels, and Barbara M Terhal. Code deformation and lattice surgery are gauge fixing. New Journal of Physics, 21(3):033028, mar 2019. DOI: 10.1088/1367-2630/ab0199. URL https://doi.org/10.1088%2F1367-2630%2Fab0199.

[42] Daniel Gottesman. An introduction to quantum error correction and fault-tolerant quantum computation. Proceedings of Symposia in Applied Mathematics, 68:13–58, 2010. URL https://arxiv.org/abs/0904.2557.
[43] Emanuel Knill. Quantum computing with realistically noisy devices. Nature, 434(7029):39–44, 2005. DOI: 10.1038/nature03350.
[44] Panagiotis Panos Aliferis. Level reduction and the quantum threshold theorem. PhD thesis, California Institute of Technology, 2007.
[45] Qiskit: an open source quantum computing framework. https://github.com/Qiskit. Last Accessed: 2018-10-31.

A Proof of fault-tolerance for the magic state preparation schemes

In this appendix we provide proofs showing that our magic state preparation schemes for the $|H\rangle$ state are fault-tolerant. In what follows, a state-preparation scheme will be called fault-tolerant if the two conditions of the following definition are satisfied [42].

Definition 1. Fault-tolerant state preparation
For $t = \lfloor (d-1)/2 \rfloor$, a state-preparation protocol using a distance-$d$ stabilizer code $C$ is $t$-fault-tolerant if the following two conditions are satisfied:

1. If there are $s$ faults during the state-preparation protocol with $s \le t$, the resulting state differs from a codeword by an error of at most weight $s$.

2. If there are $s$ faults during the state-preparation protocol with $s \le t$, then ideally decoding the output state results in the same state that would be obtained from the fault-free state-preparation scheme.

In Definition 1, ideally decoding is equivalent to performing fault-free error correction. Now suppose $|\psi\rangle$ is the encoded state to be prepared. If there are $s \le t$ faults during a state-preparation protocol satisfying the criteria in Definition 1, then the output state will have the form $E|\psi\rangle$ with $\mathrm{wt}(E) \le s$ (the output state will encode the correct state with no more than $s$ errors).

For CSS codes, this definition can be applied independently to $X$ and $Z$ errors.

A.1 Error correction circuit

In this section we provide more details on the properties of the EC circuit of Fig. 3, which is used in all of our schemes.

The first half of the circuit measures the $XIXIXIX$ (green CNOTs), $IIIZZZZ$ (blue CNOTs) and $IZZIIZZ$ (red CNOTs) stabilizers of the Steane code. The second half of the circuit measures the remaining stabilizers. Given that the CNOT gates are not transversal, it is possible for a single fault to result in a weight-two $X$ or $Z$ data-qubit error. However, the ancilla qubits also act as flag qubits which can be used to detect such events.

Let us assume that a single fault occurs. For the first half of the EC circuit, the possible weight-two errors that can arise from a single fault are $Z_4Z_6$, $Z_3Z_7$, and $X_3X_7$. In the case of $Z_4Z_6$ and $Z_3Z_7$, the measurement in the $X$-basis will flag (with the two $Z$-basis measurements being $+1$), at which point the entire syndrome measurement is repeated. Since $Z_4Z_6$ and $Z_3Z_7$ are errors that have different syndromes than all other errors arising from a single fault leading to a $-++$ measurement outcome of the first three ancilla qubits, after the syndrome measurement is repeated these errors can be distinguished and corrected.

The same argument applies to the case where a single fault leads to the $X_3X_7$ data-qubit error. But this time the $X_4$ error has the same syndrome as $X_3X_7$ ($s(X_4) = s(X_3X_7) = 010000$). However, the faults causing an $X_3X_7$ error result in a $+--$ measurement outcome of the first three ancillas whereas $X_4$ results in $+-+$. Thus when the syndrome measurement is repeated and the flag outcomes are taken into account, the errors can be distinguished and corrected.

An analogous analysis can be applied to the second half of the EC circuit. See [25] for more details.

A.2 Fault-tolerance proof for the error detection scheme

Recall that the error detection scheme can be decomposed into three components as shown in Fig. 5b. Since $t = 1$ for the $[[7, 1, 3]]$ Steane code, we must show that if a single fault occurs in any of the three components, both criteria in Definition 1 will be satisfied.

Case 1: fault in $|H\rangle_{\mathrm{nf}}$ (see Fig. 4). Since the Steane code is a perfect CSS code and the $|H\rangle_{\mathrm{nf}}$ circuit is not fault-tolerant, the output state of the first component will have the form

$$|\psi^{(1)}\rangle = E_i^{(x)} E_j^{(z)} \bar{E}_i^{(x)} \bar{E}_j^{(z)} |H\rangle, \tag{10}$$

where $E_i^{(x)} \in \{I, X_i\}$ and $E_j^{(z)} \in \{I, Z_j\}$ are single-qubit errors on qubits $i$ and $j$, and $\bar{E}_i^{(x)} \in \{I, \bar{X}\}$, $\bar{E}_j^{(z)} \in \{I, \bar{Z}\}$ are logical operators of the Steane code.

Now the output state of the second component of Fig. 5b, including the contribution from the first ancilla qubit (which will subsequently be measured in the $X$-basis), is given by

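$$|\psi^{(2)}\rangle = \frac{1}{2}\Big\{ \big( E_i^{(x)} E_j^{(z)} \bar{E}_i^{(x)} \bar{E}_j^{(z)} + E_i^{(z)} E_j^{(x)} \bar{E}_i^{(z)} \bar{E}_j^{(x)} \big) |H\rangle|+\rangle + \big( E_i^{(x)} E_j^{(z)} \bar{E}_i^{(x)} \bar{E}_j^{(z)} - E_i^{(z)} E_j^{(x)} \bar{E}_i^{(z)} \bar{E}_j^{(x)} \big) |H\rangle|-\rangle \Big\}. \tag{11}$$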
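The structure of Eq. (11) can be checked numerically in a single-qubit toy model. This is only a sketch under an assumption that is not part of the proof: the logical operators are replaced by physical Paulis and the logical Hadamard by a single Hadamard gate, so that $|H\rangle$ is the $+1$ eigenvector of $H$. The helper names (`apply`, `branches`) are hypothetical.

```python
import math

C, S = math.cos(math.pi / 8), math.sin(math.pi / 8)
R = 1 / math.sqrt(2)

H_GATE = [[R, R], [R, -R]]
PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}
# +1 eigenvector of the Hadamard gate; stands in for |H> in this toy model.
H_PLUS = [C, S]


def apply(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def branches(error):
    """For ancilla outcomes '+' and '-', return (probability, fidelity with |H>).

    The data state after the controlled-Hadamard with control |+> is
    (|0> E|H> + |1> H E|H>)/sqrt(2); projecting the control onto |+-> leaves
    (E|H> +- H E|H>)/2 on the data, i.e. (E +- HEH)|H>/2 since H|H> = |H>.
    """
    v = apply(PAULIS[error], H_PLUS)  # E|H>
    w = apply(H_GATE, v)              # H E|H>
    out = {}
    for sign, label in ((1, "+"), (-1, "-")):
        branch = [(a + sign * b) / 2 for a, b in zip(v, w)]
        prob = sum(abs(c) ** 2 for c in branch)
        overlap = abs(sum(h * c for h, c in zip(H_PLUS, branch))) ** 2
        fidelity = overlap / prob if prob > 1e-12 else 0.0
        out[label] = (prob, fidelity)
    return out
```

In this toy model an $X$ or $Z$ error gives the `+` outcome with probability 1/2 and leaves the data exactly in $|H\rangle$, whereas a $Y$ error never gives the `+` outcome.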

If $E_i^{(x)} \neq I$ or $E_j^{(z)} \neq I$, then the single-qubit error will be detected by the subsequent EC and the state will be rejected. Thus let us assume that $E_i^{(x)} = E_j^{(z)} = I$ so that

$$|\psi^{(2)}\rangle = \frac{1}{2}\Big\{ \big( \bar{E}_i^{(x)} \bar{E}_j^{(z)} + \bar{E}_i^{(z)} \bar{E}_j^{(x)} \big) |H\rangle|+\rangle + \big( \bar{E}_i^{(x)} \bar{E}_j^{(z)} - \bar{E}_i^{(z)} \bar{E}_j^{(x)} \big) |H\rangle|-\rangle \Big\}. \tag{12}$$

If $\bar{E}_i^{(x)} \bar{E}_j^{(z)} = \bar{Y}$, then $|\psi^{(2)}\rangle = |-H\rangle|-\rangle$ and the ancilla measurement will be $-1$, resulting in rejection. Thus suppose that $\bar{E}_i^{(x)} \bar{E}_j^{(z)}$ is $\bar{X}$ or $\bar{Z}$ so that

$$|\psi^{(2)}\rangle = \frac{1}{\sqrt{2}}\Big\{ |H\rangle|+\rangle \mp i|-H\rangle|-\rangle \Big\}, \tag{13}$$

where we used the identities $H = \frac{1}{\sqrt{2}}(X + Z)$, $XHX = \frac{1}{\sqrt{2}}(X - Z)$ and $ZHZ = \frac{1}{\sqrt{2}}(Z - X)$. From Eq. (13), we see that $|\psi^{(2)}\rangle$ will be accepted with probability 1/2, resulting in the state $|H\rangle$. Thus if a single fault occurs in the first component of Fig. 5b, an accepted state will be $|H\rangle$.

Case 2: fault in $H_m^{f_1}$ (see Fig. 5a). Since there are no faults in $|H\rangle_{\mathrm{nf}}$, the input to the circuit will be $|H\rangle$. If the fault results in any measurement outcome being $-1$, the state will be rejected. Thus we only consider faults such that the measurement outcome of all ancillas is $+1$. Since the circuit will flag if a single fault results in a data-qubit error of weight greater than one, the output state of the circuit will be $|\psi^{(2)}_{\mathrm{out}}\rangle = E_i^{(x)} E_j^{(z)} |H\rangle$ with $\mathrm{wt}(E_i^{(x)}) \le 1$ and $\mathrm{wt}(E_j^{(z)}) \le 1$. If $E_i^{(x)} E_j^{(z)}$ is non-trivial, it will be detected by the subsequent EC circuit and the state will be rejected. Hence an accepted state will be $|H\rangle$.

Case 3: fault in the EC circuit (see Fig. 3). Since there are no faults in the first two components of Fig. 5b, the input state to the EC will be $|\psi^{(3)}\rangle = |H\rangle$. If a fault causes a non-trivial measurement, the state will be rejected. Hence let us consider the case where a single fault results in an error which goes undetected. A single fault in the EC can result in a data-qubit error of the form $E_i^{(x)} E_j^{(z)}$ where $\mathrm{wt}(E_i^{(x)}) \le 1$ and $\mathrm{wt}(E_j^{(z)}) \le 1$ without any flags.^7 However, due to the fault-tolerant properties of the EC, a fault resulting in a data error with $\mathrm{wt}(E_i^{(x)}) \ge 2$ or $\mathrm{wt}(E_j^{(z)}) \ge 2$ will cause one of the flag qubits to flag. Hence if the state is accepted (all measurements are trivial), the output state of the EC will have the form

$$|\psi_{\mathrm{out}}\rangle = E_i^{(x)} E_j^{(z)} |H\rangle, \tag{14}$$

with $\mathrm{wt}(E_i^{(x)}) \le 1$ and $\mathrm{wt}(E_j^{(z)}) \le 1$. Since errors of the form $E = X_i Z_j$ are correctable by the Steane code, fault-free error correction of an accepted state of the protocol in Fig. 5b will always result in the state $|H\rangle$.

We conclude that if there is at most one fault, both criteria in Definition 1 will be satisfied.

^7 As an example, consider the error $Z \otimes X$ on the fourth black CNOT of Fig. 3, which results in the data-qubit error $X_6Z_7$.

^8 Instead of using a three-bit syndrome for correcting $X$ errors and separately using the other three-bit syndrome to correct $Z$ errors, one would use the full six-bit syndrome to correct the data errors. Note that we only correct using the full six-bit syndrome if there are flags. Otherwise, we perform the standard CSS correction of $X$ and $Z$ errors separately.

Figure 12: (a) Propagation of $X$ errors through the control-qubit of a controlled-Hadamard gate. This induces a Hadamard error on the target qubit. (b) Propagation of a $Z$ error through the control-qubit of a controlled-Hadamard gate.

A.3 Fault-tolerance proof for the error correction scheme

A.3.1 Error distinguishability

Recall that for the error correction scheme, the circuit for measuring the logical Hadamard operator is given in Fig. 6a. The first thing to notice is that the controlled-Hadamard gates are implemented in a different order compared to the gates in Fig. 5a. If a fault occurs at one of the controlled-Hadamard gates resulting in an $X$ or $Y$ error on the control-qubit, the resulting data qubit error can be expressed as a product of Hadamard errors and Pauli errors (see Fig. 12). For the second to sixth controlled-Hadamard gates, all errors arising from a single fault at one of these locations which propagate to the data qubits (causing at least one of the flag qubits to flag) must be distinguishable (since $\bar{H} = H^{\otimes 7}$, a fault occurring at the first controlled-Hadamard gate will result in an $X$ and $Z$ data-qubit error of weight at most one). This is only possible if errors are corrected based on their full syndrome^8 and with a carefully chosen ordering of the controlled-Hadamard gates. Performing a numerical search of all $7! = 5040$ permutations of the controlled-Hadamard gates, there was

no permutation that allowed all possible errors (arising from a single fault) to be distinguished. However, ignoring a fault on the fourth controlled-Hadamard gate, a numerical search showed that the ordering found in Fig. 6a allows errors arising from a single fault to be distinguished. Consequently, we need to have the ability to isolate errors arising from a fault at the fourth controlled-Hadamard location. This can be achieved using the two extra flag qubits shown in Fig. 6a ($|0\rangle_{f_1}$ and $|0\rangle_{f_3}$ can be measured using the same qubit).

Suppose for example that an $X$ error arises on the control-qubit of the fourth controlled-Hadamard gate resulting in the data qubit error $P_2H_1H_3H_4$, where $P_2 \in \{I, X_2, Y_2, Z_2\}$. Then $|0\rangle_{f_0}$, $|0\rangle_{f_2}$ and $|0\rangle_{f_3}$ will flag. We could then apply $H_1H_3H_4$ to the data qubits and be left with the single-qubit Pauli $P_2$. Of course, if a fault occurs on the first CNOT connecting $|0\rangle_{f_3}$ to the $|+\rangle$ state resulting in an $X \otimes X$ error, $|0\rangle_{f_3}$ will not flag but $|0\rangle_{f_2}$ will flag (in addition to $|0\rangle_{f_0}$). After applying $H_1H_3H_4$, the resulting data qubit error would be $H_2$. Going through all possible cases of a single fault at one of the CNOT gates interacting the ancillas $|0\rangle_{f_1}$, $|0\rangle_{f_2}$ and $|0\rangle_{f_3}$ with the $|+\rangle$ state, we can guarantee error distinguishability by applying $H_1H_3H_4$ to the data if the following combinations of flag qubits flag:

1. $|0\rangle_{f_0}$ and $|0\rangle_{f_2}$ flag.

2. $|0\rangle_{f_0}$ and $|0\rangle_{f_3}$ flag.

3. $|0\rangle_{f_0}$, $|0\rangle_{f_2}$ and $|0\rangle_{f_3}$ flag.

Note that $|0\rangle_{f_0}$ would not flag if a single measurement error of the flag qubits $|0\rangle_{f_1}$, $|0\rangle_{f_2}$ or $|0\rangle_{f_3}$ occurred.

When following the circuit in Fig. 6b by an EC circuit, we will use a lookup-table containing all errors arising from a single fault resulting in a flag (which are all distinguishable) in order to correct.

A.3.2 Fault-tolerance proof

Given the above for applying Hadamard corrections to the data qubits depending on the flag outcomes (thus guaranteeing error distinguishability if a single fault occurs), we now show that our error correcting scheme shown in Fig. 6b satisfies both criteria in Definition 1. In what follows, $|\psi_{\mathrm{final}}\rangle$ will correspond to the output state of the circuit in Fig. 6b. Also, the state at the output of the $i$'th circuit in Fig. 6b will be labeled as $|\psi^{(i)}\rangle$. For instance, the state at the output of $|H\rangle_{\mathrm{nf}}$ will be $|\psi^{(1)}\rangle$.

Case 1: fault in $|H\rangle_{\mathrm{nf}}$.
The output of $|H\rangle_{\mathrm{nf}}$ is given by Eq. (10). To simplify the notation, let $E = \bar{E}_i^{(x)} \bar{E}_j^{(z)}$, $\tilde{E} = \bar{H} E \bar{H}$, $E_p = E_i^{(x)} E_j^{(z)}$, and $\tilde{E}_p = \bar{H} E_p \bar{H}$. With this notation, the output of the first $H_m^{f_2}$ circuit (where we only write the first ancilla since the flag qubits have no effect) can be written as

$$|\psi^{(2)}\rangle = \frac{E_p E + \tilde{E}_p \tilde{E}}{2} |H\rangle|+\rangle + \frac{E_p E - \tilde{E}_p \tilde{E}}{2} |H\rangle|-\rangle. \tag{15}$$

There are several cases to consider.

1. $E_p = I$.

If $E \in \{\bar{X}, \bar{Z}\}$,

$$|\psi^{(2)}\rangle = \frac{1}{\sqrt{2}} |H\rangle|+\rangle \pm \frac{i}{\sqrt{2}} |-H\rangle|-\rangle. \tag{16}$$

Thus a $\pm 1$ measurement outcome gives $|\pm H\rangle$ and all three $H_m^{f_2}$ circuits will result in a $\pm 1$ outcome.

If $E = \bar{Y}$, $|\psi^{(2)}\rangle = |-H\rangle|-\rangle$ and all three $H_m^{f_2}$ measurement outcomes will be $-1$.

For the case where all three measurement outcomes are $-1$, applying $\bar{Y}$ to $|\psi_{\mathrm{final}}\rangle$, we will have $|\psi_{\mathrm{final}}\rangle = |H\rangle$.

2. $E_p = Y_i$. In this case

$$|\psi^{(2)}\rangle = \frac{1}{2} Y_i (E - \tilde{E}) |H\rangle|+\rangle + \frac{1}{2} Y_i (E + \tilde{E}) |H\rangle|-\rangle. \tag{17}$$

If $E \in \{\bar{X}, \bar{Z}\}$, $|\psi^{(2)}\rangle = \pm \frac{i}{\sqrt{2}} Y_i |-H\rangle|+\rangle + \frac{1}{\sqrt{2}} Y_i |H\rangle|-\rangle$. The $Y_i$ error will be corrected by the following EC round. For a $\pm 1$ measurement outcome, $|\psi^{(2)}\rangle = Y_i|\mp H\rangle$ and the next two $H_m^{f_2}$ measurements will be $\mp 1$ with $|\psi_{\mathrm{final}}\rangle = |\mp H\rangle$. Thus if the three $H_m^{f_2}$ measurement outcomes are $+--$, applying $\bar{Y}$ to $|\psi_{\mathrm{final}}\rangle$ will give $|\psi_{\mathrm{final}}\rangle = |H\rangle$. If the measurement outcome is $-++$, $|\psi_{\mathrm{final}}\rangle = |H\rangle$.

If $E = \bar{Y}$, $|\psi^{(2)}\rangle = Y_i|-H\rangle|+\rangle$ and $Y_i$ will be corrected by the next EC. Hence the three $H_m^{f_2}$ measurement outcomes will be $+--$. Applying $\bar{Y}$ to $|\psi_{\mathrm{final}}\rangle$ will result in $|\psi_{\mathrm{final}}\rangle = |H\rangle$.

Lastly, if $E = I$, $|\psi^{(2)}\rangle = Y_i|H\rangle|-\rangle$. The $Y_i$ error will be corrected by the next EC and the three $H_m^{f_2}$ measurements will be $-++$.

3. $E_p \in \{X_i, Z_j\}$.

If $E = \bar{Y}$ then

$$|\psi^{(2)}\rangle = \pm \frac{1}{2} (X_i - Z_i) |-H\rangle|+\rangle + \frac{1}{2} (X_i + Z_i) |-H\rangle|-\rangle. \tag{18}$$

The three $H_m^{f_2}$ measurements will be $\pm--$ with $|\psi_{\mathrm{final}}\rangle = |-H\rangle$. Applying $\bar{Y}$ will give the correct state.

If $E \in \{\bar{X}, \bar{Z}\}$, we have

$$|\psi^{(2)}\rangle = \frac{1}{2} (X_i \bar{Z} + Z_i \bar{X}) |H\rangle|+\rangle \pm \frac{1}{2} (Z_i \bar{X} - X_i \bar{Z}) |H\rangle|-\rangle, \tag{19}$$

or

$$|\psi^{(2)}\rangle = \frac{1}{2} (X_i \bar{X} + Z_i \bar{Z}) |H\rangle|+\rangle \pm \frac{1}{2} (X_i \bar{X} - Z_i \bar{Z}) |H\rangle|-\rangle. \tag{20}$$

Since in both Eqs. (19) and (20) the states are linear combinations of products of single-qubit Paulis and logical operators, after the first EC circuit the output state will collapse to $|\psi^{(3)}\rangle = E|H\rangle$ where $E \in \{\bar{X}, \bar{Z}\}$ (regardless of whether the first $H_m^{f_2}$ measurement outcome was $\pm 1$). During the second $H_m^{f_2}$ measurement, the output state will become

$$|\psi^{(4)}\rangle = \frac{E + \tilde{E}}{2} |H\rangle|+\rangle + \frac{E - \tilde{E}}{2} |H\rangle|-\rangle = \frac{1}{\sqrt{2}} \Big( |H\rangle|+\rangle \pm i|-H\rangle|-\rangle \Big). \tag{21}$$

From Eq. (21) we conclude that the three $H_m^{f_2}$ measurement outcomes will either be $\pm++$ with $|\psi_{\mathrm{final}}\rangle = |H\rangle$ or $\pm--$ with $|\psi_{\mathrm{final}}\rangle = |-H\rangle$ (in which case we apply $\bar{Y}$).

4. $E_p = X_iZ_j$ with $i \neq j$.

$E_p$ will be corrected in the next EC round. A calculation similar to the examples above shows that the possible measurement outcomes of the three $H_m^{f_2}$ circuits are $+++$, $-++$ and $+--$. If the last two measurements are $-1$, then apply $\bar{Y}$ to the data. In all cases (and after applying the necessary logical operations) $|\psi_{\mathrm{final}}\rangle = |H\rangle$.

Case 2: fault in the first $H_m^{f_2}$ circuit.
If the fault results in a data-qubit error of weight greater than one, as was shown in Appendix A.3.1, the possible errors can be distinguished when considering the full six-bit syndrome as well as applying $H_1H_3H_4$ or $I$ to the data depending on the flag outcomes. Thus the output state of the first $H_m^{f_2}$ measurement can be expressed as $|\psi^{(2)}\rangle = E_f|H\rangle|\pm\rangle$ where $E_f$ will be corrected by the following EC. Hence the possible measurement outcomes of the three $H_m^{f_2}$ circuits are $\pm++$ and the final output state will be $|\psi_{\mathrm{final}}\rangle = |H\rangle$.

Case 3: fault in the first EC.
The input state to the EC will be $|H\rangle$. If a fault results in a data-qubit error of weight greater than one, there will be a flag and the error will be corrected. The output of the EC will be $|H\rangle$ and all three $H_m^{f_2}$ measurements will be $+1$ with $|\psi_{\mathrm{final}}\rangle = |H\rangle$.

If there are no flags, the output state of the EC can be written as $|\psi^{(3)}\rangle = E_p|H\rangle$ where $E_p = E_i^{(x)} E_j^{(z)}$. The output state of the second $H_m^{f_2}$ measurement will be given by

$$|\psi^{(4)}\rangle = \frac{E_p + \tilde{E}_p}{2} |H\rangle|+\rangle + \frac{E_p - \tilde{E}_p}{2} |H\rangle|-\rangle. \tag{22}$$

Note that the error will be corrected in the next EC round. However, the type of error can affect the $H_m^{f_2}$ measurement outcomes.

1. $E_p = Y_i$.
In this case $|\psi^{(4)}\rangle = Y_i|H\rangle|-\rangle$ and the second $H_m^{f_2}$ measurement will be $-1$ (with all three measurement outcomes being $+-+$) and $|\psi_{\mathrm{final}}\rangle = |H\rangle$.^9

2. $E_p \in \{X_i, Z_i\}$.
In this case $|\psi^{(4)}\rangle = \frac{X_i + Z_i}{2} |H\rangle|+\rangle \pm \frac{X_i - Z_i}{2} |H\rangle|-\rangle$. Thus the second $H_m^{f_2}$ measurement will be $\pm 1$ and all three $H_m^{f_2}$ measurements will be $+\pm+$ with $|\psi_{\mathrm{final}}\rangle = |H\rangle$.

3. $E_p = X_iZ_j$ with $i \neq j$.
The analysis of this case is analogous to the case where $E_p = Y_i$.

Case 4: fault in the second $H_m^{f_2}$ circuit.
The analysis is identical to Case 2. The possible measurement outcomes of the $H_m^{f_2}$ circuits are $+\pm+$ with $|\psi_{\mathrm{final}}\rangle = |H\rangle$.

^9 This case shows why it is important to measure the logical Hadamard operator three times, since if the first two measurement outcomes are $+-$, then the data qubit state can either be $|H\rangle$ as in this case or $|-H\rangle$ as shown in Case 1 (see the text following Eq. (21)).

Table 2: Logical correction to apply at the end of the circuit in Fig. 6b given the three measurement outcomes of the $H_m^{f_2}$ circuit. The comparison between rows three and five of this table shows why three logical Hadamard measurements are required instead of two.

    H_m^{f_2} measurement outcomes    Logical gate correction
    + + +                             I
    - - -                             Y
    + - -                             Y
    - + +                             I
    + - +                             I
    + + -                             I
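The decision rule summarized in Table 2 can be sketched as a small lookup. This is a minimal illustration rather than the authors' decoder: the outcome encoding (`"+"` for $+1$, `"-"` for $-1$) and the function names are assumptions introduced here.

```python
# Table 2 as a lookup keyed by the three H_m^{f2} measurement outcomes,
# written as a string with "+" for +1 and "-" for -1 (assumed encoding).
TABLE_2 = {
    "+++": "I",
    "---": "Y",
    "+--": "Y",
    "-++": "I",
    "+-+": "I",
    "++-": "I",
}


def logical_correction(outcomes):
    """Return the logical correction ("I" or "Y") listed in Table 2.

    Raises KeyError for outcome patterns that Table 2 does not list.
    """
    return TABLE_2[outcomes]


def last_two_rule(outcomes):
    """Shortcut that agrees with Table 2 on its six rows:
    apply Y only when the last two outcomes are both -1."""
    return "Y" if outcomes[1:] == "--" else "I"
```

On every row of Table 2 the shortcut reproduces the table, matching the statement in Case 1 that a logical $Y$ is applied when the last two measurements are $-1$. Rows three and five share the first two outcomes ($+-$) but require different corrections, which is why the third measurement is needed.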

Case 5: fault in the second EC circuit.
The analysis of this case is analogous to Case 3. The possibilities for the three $H_m^{f_2}$ measurement outcomes are $++\pm$ with $|\psi_{\mathrm{final}}\rangle = E_p|H\rangle$ where $E_p = E_i^{(x)} E_j^{(z)}$. Note however that $E_p$ is a correctable error so both criteria in Definition 1 are still satisfied.

Case 6: fault in the third $H_m^{f_2}$ circuit.
The analysis is analogous to Case 4. However, if there is a flag, an extra EC round should be performed following the third $H_m^{f_2}$ circuit. The three $H_m^{f_2}$ measurement outcomes will be $++\pm$ with $|\psi_{\mathrm{final}}\rangle = E_p|H\rangle$ where $E_p = E_i^{(x)} E_j^{(z)}$ as in Case 5.

A summary of the logical operations to apply to $|\psi_{\mathrm{final}}\rangle$ based on the three $H_m^{f_2}$ measurement outcomes is given in Table 2.

B Teleporting into code blocks

In this section we review a general method introduced by Knill for preparing an encoded state $|\overline{\psi}\rangle$ from a physical qubit state $|\psi\rangle$ using teleportation [43].

The circuit used to implement the teleportation protocol is illustrated in Fig. 13. After preparing a logical Bell state, one of the code blocks is decoded. A CNOT is applied between $|\psi\rangle$ and the decoded state. After measuring both qubits in the $X$ and $Z$ basis, a logical Pauli operator is applied to the code block in order to complete the teleportation protocol.

Figure 13: Circuit for teleporting the single-qubit state $|\psi\rangle$ into the codeblock. The operation $\mathcal{D}$ is used to decode one of the code blocks, and $P_L$ is a logical Pauli which is applied to complete the teleportation protocol.

In [44], Aliferis showed that the probability of a logical fault occurring during the teleportation protocol can be bounded by

$$p_L \le 3p^{(k)} + p_{\mathrm{dec}}^{(k)} + 4p. \tag{23}$$

Here we are assuming that the code block is encoded with $k$ levels of concatenation. Assuming a stochastic noise model where encoded gates fail with probability at most $p^{(k)}$, a fault in the encoded Bell state is upper bounded by $3p^{(k)}$. Since there are four locations in the physical part of the teleportation circuit, and with $p^{(0)} = p$, the probability of this part is bounded by $4p$. Lastly, $p_{\mathrm{dec}}^{(k)}$ is a bound on the probability of failure of the decoding circuit $\mathcal{D}$. The level-$k$ block is decoded recursively as follows. The level-$k$ circuit comprises level-$(k-1)$ gates which, when applied to the code block, result in a level-$(k-1)$ encoded state. Then $\mathcal{D}$ is applied again using level-$(k-2)$ gates and so on. Assuming that $\mathcal{D}$ has $D$ locations, we can bound $p_{\mathrm{dec}}^{(k)}$ by

$$p_{\mathrm{dec}}^{(k)} \le D \sum_{j=0}^{k-1} p^{(j)}, \tag{24}$$

where $p^{(j)}$ is a bound on the failure probability of level-$j$ gates.

In this paper, instead of using the bounds in Eqs. (23) and (24), we perform a direct simulation of the teleportation circuit using the methods in Section 2 in order to obtain smaller constant pre-factors (since not all fault locations will lead to a logical fault).

C Overhead analysis of the $|H\rangle$ state preparation scheme

In this section we provide a detailed description of the qubit and gate overhead analysis for preparing an encoded $|H\rangle$ state using our error detection scheme.

C.1 Qubit overhead analysis

We begin with a few definitions. Let $p_A^{H(l)}$ be the probability that an encoded $|H\rangle$ state at level-$l$ passes the verification test of the circuit in Fig. 5b and let $n_T^{H(l)}$ be the total number of qubits used to prepare an encoded $|H\rangle$ state at level-$l$. At the first concatenation level, the largest component of the circuit in Fig. 5b is the EC, which requires 10 qubits in its implementation. Hence we have

$$\langle n_T^{H(1)}(p) \rangle = \frac{10}{p_A^{H(1)}(p)}, \tag{25}$$

for a physical error rate $p$ of the depolarizing noise model.

At higher levels, a few considerations are necessary when considering different contributions to the qubit overhead. First, for the level-$l$ circuit in Fig. 4, it is important to take into account the fault-tolerant preparation of the level-$(l-1)$ $|0\rangle$ and $|+\rangle$ states. The level-$(l-1)$ $|+\rangle$ state is obtained by first preparing the state $|+^{(l-2)}\rangle^{\otimes 7}$ (which is a $+1$ eigenstate of all the $X$-stabilizers), followed by measuring the $Z$-stabilizers

using the circuit of Fig. 14 (which was shown to be fault-tolerant in [21]). Depending on the measurement and flag outcomes, it might be necessary to repeat the measurement of $X$ or $Z$ stabilizers without the flag qubit. Note that at the first level, the circuit in Fig. 14 requires 11 qubits instead of 10 since there needs to be at least one ancilla qubit prepared in the $|+\rangle$ state in order to detect errors of weight greater than two arising from a single fault. A similar protocol is used to fault-tolerantly prepare a level-$(l-1)$ $|0\rangle$ state.

Figure 14: Flag fault-tolerant circuit for measuring the $Z$ stabilizers of the $[[7, 1, 3]]$ Steane code obtained from [21]. The dashed vertical lines are used to separate different time steps.

Next, the details of the implementation of the controlled-Hadamard gates are considered as follows. Since the controlled-Hadamard gates are decomposed as shown in Fig. 1, at level $l \ge 2$, the logical Hadamard gate is measured using the circuit in Fig. 15. Due to the way in which we parallelize the circuit in Fig. 15, only two level-$(l-1)$ resource states are required at each time step, apart from the first and last time step where we only need one resource state. In addition, a level-$(l-1)$ resource state is required for the circuit in Fig. 4.

Figure 15: Circuit used to measure the logical Hadamard operator for concatenation levels $l \ge 2$. The $T$ gates are implemented using the circuit in Fig. 1. At level-$l$, two level-$(l-1)$ $|H\rangle$ resource states are required for the entire circuit. We assume that the two resource states can be reused for each parallel implementation of $T$ and $T^\dagger$.

In order to minimize the qubit overhead, we consider preparing in parallel $m_1^{(l)} \ge 1$ level-$(l-1)$ resource states at a time step where one resource state is required, and $m_2^{(l)} \ge 2$ level-$(l-1)$ resource states at a time step where two resource states are required. At a time step where one level-$(l-1)$ resource state is required, if none of the $m_1^{(l)}$ resource states pass the verification test, the protocol is aborted and begins anew. Similarly, at a time step where two resource states are required, we need at least two of the $m_2^{(l)}$ resource states to pass the verification test; otherwise, the protocol is aborted.

Let $p_{AP1}^{(l)}$ be the probability that at least one of the $m_1^{(l)}$ level-$(l-1)$ $|H\rangle$ states passes the verification test and $p_{AP2}^{(l)}$ be the probability that at least two of the $m_2^{(l)}$ level-$(l-1)$ $|H\rangle$ states pass the verification test. We have that

$$p_{AP1}^{(l)}(p) = \sum_{k=1}^{m_1^{(l)}} \binom{m_1^{(l)}}{k} \left( p_A^{H(l-1)}(p) \right)^k \left( 1 - p_A^{H(l-1)}(p) \right)^{m_1^{(l)} - k}, \tag{26}$$

and

$$p_{AP2}^{(l)}(p) = \sum_{k=2}^{m_2^{(l)}} \binom{m_2^{(l)}}{k} \left( p_A^{H(l-1)}(p) \right)^k \left( 1 - p_A^{H(l-1)}(p) \right)^{m_2^{(l)} - k}. \tag{27}$$

At each level, we set the values of $p_{AP1}^{(l)}(p)$ and $p_{AP2}^{(l)}(p)$ from $1 - 10^{-3}$ to $1 - 4 \times 10^{-2}$ in increments of $10^{-2}$ and obtain the corresponding values of $m_1^{(l)}$ and $m_2^{(l)}$ by solving Eqs. (26) and (27). The final values of $p_{AP1}^{(l)}(p)$ and $p_{AP2}^{(l)}(p)$ are chosen to minimize $\langle n_T^{H(l)}(p) \rangle$ (see Eq. (28) below and Table 3). Now, the circuits for preparing level-1 $|0\rangle$ and $|+\rangle$ states require 11 qubits. Additionally, three extra $|0\rangle$ and $|+\rangle$ ancilla states are required, two in the circuit of Fig. 15 and three for the EC (we assume that the ancilla qubits used to measure the Hadamard operator can be reused for the EC). Apart from the resource state used to prepare the circuit in Fig. 4, the qubits used in the resource states for implementing $T$ and $T^\dagger$ gates can be reused at each time
2b with the level-(l−1) T and T † gates step of the circuit in Fig. 15. Since there are three time implemented using level-(l − 1) |Hi states as shown in steps where a single resource state is required and six

time steps where two resource states are required, the average number of qubits required to implement our error detection scheme at level-$l$ is given by

$$\langle n_T^{H(l)}(p)\rangle = \frac{\left(\lceil m_1^{(l)}\rceil + 2\lceil m_2^{(l)}\rceil\right)\langle n_T^{H(l-1)}(p)\rangle + 9(11^{l-1})}{p_A^{H(l)}(p)\left(p_{AP1}^{(l)}(p)\right)^3\left(p_{AP2}^{(l)}(p)\right)^6}. \tag{28}$$

Acceptance probabilities for preparing $m_1^{(l)}$ and $m_2^{(l)}$ resource states:

$p_{AP1}^{(2)}(p) \approx 0.999887 - 73.8p$
$p_{AP2}^{(2)}(p) \approx 0.99998 - 149p$
$p_{AP1}^{(3)}(p) \approx 0.999897 - 86.7p - 3.71 \times 10^{5}p^2$
$p_{AP2}^{(3)}(p) \approx 0.99994 - 177p - 6.97 \times 10^{5}p^2$

Table 3: List of the acceptance probabilities $p_{AP1}^{(l)}(p)$ and $p_{AP2}^{(l)}(p)$ at the second and third concatenation levels. The expressions were obtained by solving Eqs. (26) and (27) numerically for values of $p_{AP1}^{(l)}(p), p_{AP2}^{(l)}(p) \in [1 - 4 \times 10^{-2}, 1 - 10^{-3}]$ increasing in increments of $10^{-2}$ and choosing the pair of values which minimize the total average qubit overhead $\langle n_T^{H(l)}(p)\rangle$. We round the coefficients of $p$ and $p^2$ to three digits.

C.2 Gate overhead analysis

Since the circuit in Fig. 5b can be decomposed into three parts, the number of gates required to implement our error detection scheme at level-$l$ can be written as$^{10}$

$$n_{|H\rangle}^{(l)} = \frac{n_{H_{nf}}^{(l)} + n_{H_m^{f_1}}^{(l)} + n_{ED}^{(l)}}{p_A^{H(l)}(p)}, \tag{29}$$

where $n_{H_{nf}}^{(l)}$ is the number of gates used for preparing the state $|H\rangle_{nf}$ at level-$l$, $n_{H_m^{f_1}}^{(l)}$ is the number of gates required to measure the logical Hadamard operator at level-$l$ and $n_{ED}^{(l)}$ is the number of gates used in the EC circuit of Fig. 3 at level-$l$. We use ED instead of EC since the circuit is used in an error detection scheme. It will also be important to analyze the gate overhead of the circuit in Fig. 3 when used in an error correction scheme, since all level-$l$ ($l \ge 2$) Clifford gates in our circuits consist of extended rectangles where the EC's are used for error correction instead of error detection.

10 Note that in this section, all quantities $n_G^{(l)}$ should be written as $\langle n_G^{(l)}\rangle$ since we are computing average quantities. However, to avoid cluttering the notation, we omit the brackets.

Performing a gate count of the level-1 circuits in Figs. 3, 4 and 15, we find that

$$n_{H_{nf}}^{(1)} = 3\left(n_{|+\rangle}^{(0)} + n_{|0\rangle}^{(0)}\right) + n_{|H\rangle}^{(0)} + 11n_{\mathrm{CNOT}}^{(0)} + 6n_{\mathrm{idle}}^{(0)}, \tag{30}$$

$$n_{H_m^{f_1}}^{(1)} = 2n_{\mathrm{CNOT}}^{(0)} + 7\left(n_{CZ}^{(0)} + n_{T}^{(0)} + n_{T^\dagger}^{(0)}\right) + n_{|+\rangle}^{(0)} + n_{|0\rangle}^{(0)} + 191n_{\mathrm{idle}}^{(0)} + n_{Xm}^{(0)} + n_{Zm}^{(0)}, \tag{31}$$

$$n_{ED}^{(1)} = n_{EC1}^{(1)} + \langle n_{ED2}^{(1)}\rangle n_{EC2}^{(1)}, \tag{32}$$

and

$$n_{EC}^{(1)} = \langle n_{REC1}^{(1)}\rangle n_{EC1}^{(1)} + \langle n_{REC2}^{(1)}\rangle n_{EC2}^{(1)}, \tag{33}$$

where $n_{Zm}^{(l)} = n_{Xm}^{(l)} = 7^l$ are the number of Z and X-basis measurement locations.

In the above, for a gate $G \notin \{T, T^\dagger\}$, $n_G^{(l)} = 7^l$ since, as we will show below, we treat the EC's separately from the gates for $l \ge 2$. We split the EC circuit into the two components shown in Fig. 3, which we call EC1 and EC2. As we explained in Appendix A.1, depending on the syndrome measurement outcome, the full syndrome measurement can be repeated. Thus $\langle n_{REC1}^{(l)}\rangle$ and $\langle n_{REC2}^{(l)}\rangle$ correspond to the average number of times that the circuits EC1 and EC2 are implemented at level-$l$. Similarly, $\langle n_{ED2}^{(l)}\rangle$ is the average number of times that EC2 is implemented when the EC circuit is used for error detection. These values were obtained through Monte-Carlo simulations with $10^6$ trials (see Fig. 16).

Performing a gate count of the circuits EC1 and EC2, we have that

$$n_{EC1}^{(1)} = 46n_{\mathrm{idle}}^{(0)} + 14n_{\mathrm{CNOT}}^{(0)} + 2\left(n_{|0\rangle}^{(0)} + n_{Zm}^{(0)}\right) + n_{|+\rangle}^{(0)} + n_{Xm}^{(0)} \tag{34}$$

and

$$n_{EC2}^{(1)} = 46n_{\mathrm{idle}}^{(0)} + 14n_{\mathrm{CNOT}}^{(0)} + 2\left(n_{|+\rangle}^{(0)} + n_{Xm}^{(0)}\right) + n_{|0\rangle}^{(0)} + n_{Zm}^{(0)}. \tag{35}$$

Lastly, at concatenation levels $l \ge 2$, the $|0\rangle$ and $|+\rangle$ states are prepared fault-tolerantly using the circuit in Fig. 14 to measure the three Z stabilizers of the Steane code, and a similar circuit to measure the three X stabilizers (see the discussion in Appendix C.1). When preparing a logical $|+\rangle$ state, if there is a flag in the circuit of Fig. 14, then there could be a Z error of weight greater than one. Thus one must measure the X stabilizers of the Steane code (without using a flag qubit) to correct the Z errors. If there are no flags but the Z stabilizer measurement outcome is nontrivial, then


Figure 16: (a) Plot showing the values obtained for $\langle n_{REC1}^{(l)}\rangle$ and $\langle n_{REC2}^{(l)}\rangle$ at the first three concatenation levels of the Steane code. (b) Plots showing the values obtained for $\langle n_{XS0}^{(l)}\rangle$ and $\langle n_{ZS0}^{(l)}\rangle$ when preparing the $|0\rangle$ state at the first three concatenation levels. (c) Same as in (b) but for the $|+\rangle$ state. (d) Plot showing the values of $\langle n_{ED2}^{(l)}\rangle$ at the first three levels. All plots were obtained by performing a Monte-Carlo simulation with $10^6$ trials.

one must repeat the Z syndrome measurement without the flag qubit. A similar analysis holds for preparing a logical $|0\rangle$ state. Hence, as in Eqs. (32) and (33), we must also take into account the average number of times the non-flagged X and Z stabilizers are measured when counting the number of gates required to prepare level-$l$ $|0\rangle$ and $|+\rangle$ states. The averages are obtained by performing a Monte-Carlo simulation (see Fig. 16). In what follows we define $n_{XS}^{(l)}$ and $n_{ZS}^{(l)}$ to be the number of locations in the level-$l$ X and Z stabilizer measurement circuit without the flag qubits, and $\langle n_{XS+/0}^{(l)}\rangle$ and $\langle n_{ZS+/0}^{(l)}\rangle$ to be the average number of times these circuits are used when preparing a level-$l$ $|+/0\rangle$ state.

Performing the level-1 gate count, we have that

$$n_{|+\rangle}^{(1)} = 73n_{\mathrm{idle}}^{(0)} + 15n_{\mathrm{CNOT}}^{(0)} + 8n_{|+\rangle}^{(0)} + 3\left(n_{|0\rangle}^{(0)} + n_{Zm}^{(0)}\right) + n_{Xm}^{(0)} + \langle n_{ZS+}^{(1)}\rangle n_{ZS}^{(1)} + \langle n_{XS+}^{(1)}\rangle n_{XS}^{(1)}, \tag{36}$$

and

$$n_{|0\rangle}^{(1)} = 73n_{\mathrm{idle}}^{(0)} + 15n_{\mathrm{CNOT}}^{(0)} + 8n_{|0\rangle}^{(0)} + 3\left(n_{|+\rangle}^{(0)} + n_{Xm}^{(0)}\right) + n_{Zm}^{(0)} + \langle n_{ZS0}^{(1)}\rangle n_{ZS}^{(1)} + \langle n_{XS0}^{(1)}\rangle n_{XS}^{(1)}, \tag{37}$$

with

$$n_{ZS}^{(1)} = 48n_{\mathrm{idle}}^{(0)} + 11n_{\mathrm{CNOT}}^{(0)} + 3\left(n_{|0\rangle}^{(0)} + n_{Zm}^{(0)}\right), \tag{38}$$

and

$$n_{XS}^{(1)} = 48n_{\mathrm{idle}}^{(0)} + 11n_{\mathrm{CNOT}}^{(0)} + 3\left(n_{|+\rangle}^{(0)} + n_{Xm}^{(0)}\right). \tag{39}$$

We now have all the tools to obtain the gate overhead at arbitrary concatenation levels. Recall that at concatenation levels $l \ge 2$, all physical gates $G$ in the circuits of Fig. 5b are represented by extended rectangles, which consist of the logical gate $G$ preceded and followed by EC units (in our case, the circuit in Fig. 3). More details on extended rectangles can be found in [30]. When computing the gate overhead of the error detection scheme (Eq. (29) with $l \ge 2$), we must be careful not to double-count overlapping EC's since consecutive gates will share an EC (see Fig. 17 for an example).

Figure 17: Circuit illustrating shared EC units between two logical CNOT gates and an idle qubit location. It is important not to double-count overlapping EC's when computing the gate overhead at concatenation levels $l \ge 2$.

Lastly, for $l \ge 2$, the $T$ and $T^\dagger$ gates are implemented as shown in Fig. 1. We thus see that the overhead for these gates can be computed recursively using

$$n_T^{(l)} = n_{T^\dagger}^{(l)} = n_{|H\rangle}^{(l-1)} + 3n_{EC}^{(l-1)} + 4(7^l). \tag{40}$$

With the above considerations and using Eq. (40), the gate overhead can be computed recursively using the following relations. First, the recursive relations for the EC unit are given by

$$n_{EC}^{(l)} = \langle n_{REC1}^{(l)}\rangle n_{EC1}^{(l)} + \langle n_{REC2}^{(l)}\rangle n_{EC2}^{(l)}, \tag{41}$$

where

$$n_{EC1}^{(l)} = 2n_{|0\rangle}^{(l-1)} + n_{|+\rangle}^{(l-1)} + 73n_{EC}^{(l-1)} + 63(7^l), \tag{42}$$

and

$$n_{EC2}^{(l)} = 2n_{|+\rangle}^{(l-1)} + n_{|0\rangle}^{(l-1)} + 66n_{EC}^{(l-1)} + 63(7^l). \tag{43}$$

Next, the recursive relations for the $|0\rangle$ and $|+\rangle$ states are

$$n_{|0\rangle}^{(l)} = 8n_{|0\rangle}^{(l-1)} + 3n_{|+\rangle}^{(l-1)} + 100n_{EC}^{(l-1)} + 92(7^l) + \langle n_{ZS0}^{(l)}\rangle n_{ZS}^{(l)} + \langle n_{XS0}^{(l)}\rangle n_{XS}^{(l)}, \tag{44}$$

$$n_{|+\rangle}^{(l)} = 8n_{|+\rangle}^{(l-1)} + 3n_{|0\rangle}^{(l-1)} + 100n_{EC}^{(l-1)} + 92(7^l) + \langle n_{ZS+}^{(l)}\rangle n_{ZS}^{(l)} + \langle n_{XS+}^{(l)}\rangle n_{XS}^{(l)}, \tag{45}$$

where

$$n_{ZS}^{(l)} = 3n_{|0\rangle}^{(l-1)} + 62(7^l) + 70n_{EC}^{(l-1)}, \tag{46}$$

and

$$n_{XS}^{(l)} = 3n_{|+\rangle}^{(l-1)} + 62(7^l) + 70n_{EC}^{(l-1)}. \tag{47}$$

Using Eqs. (41) to (47) and the results in Fig. 16, we can compute the following expressions:

$$n_{H_{nf}}^{(l)} = 3\left(n_{|+\rangle}^{(l-1)} + n_{|0\rangle}^{(l-1)}\right) + n_{|H\rangle}^{(l-1)} + 28n_{EC}^{(l-1)} + 17(7^{l-1}), \tag{48}$$

$$n_{H_m^{f_1}}^{(l)} = n_{|+\rangle}^{(l-1)} + n_{|0\rangle}^{(l-1)} + 202(7^l) + 14n_T^{(l)} + 129n_{EC}^{(l-1)}, \tag{49}$$

and

$$n_{ED}^{(l)} = n_{EC1}^{(l)} + \langle n_{ED2}^{(l)}\rangle n_{EC2}^{(l)}. \tag{50}$$

Note that $n_{EC}^{(0)} = 0$ and $n_T^{(0)} = n_{T^\dagger}^{(0)} = n_{|H\rangle}^{(0)} = n_{|0\rangle}^{(0)} = n_{|+\rangle}^{(0)} = 1$.
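As a rough numerical illustration of the level-1 gate counts in Eqs. (30) to (39) and of Eq. (29), the sketch below sets every level-0 count to 1 and treats the Monte-Carlo averages of Fig. 16 as inputs. The function names and the default average values are hypothetical placeholders chosen for this example; they are not values reported in this work.

```python
def level1_gate_counts(avg_rec1=1.0, avg_rec2=1.0, avg_ed2=1.0,
                       avg_zs=0.0, avg_xs=0.0):
    """Level-1 location counts with all level-0 counts equal to 1.

    The avg_* arguments stand in for the Monte-Carlo averages of Fig. 16;
    the defaults are placeholders, not numbers from the paper.
    """
    n0 = 1  # n^(0) for every gate, preparation, idle and measurement location
    h_nf = 3 * (n0 + n0) + n0 + 11 * n0 + 6 * n0                  # Eq. (30)
    h_m = 2 * n0 + 7 * (3 * n0) + n0 + n0 + 191 * n0 + n0 + n0    # Eq. (31)
    ec1 = 46 * n0 + 14 * n0 + 2 * (n0 + n0) + n0 + n0             # Eq. (34)
    ec2 = 46 * n0 + 14 * n0 + 2 * (n0 + n0) + n0 + n0             # Eq. (35)
    zs = 48 * n0 + 11 * n0 + 3 * (n0 + n0)                        # Eq. (38)
    xs = 48 * n0 + 11 * n0 + 3 * (n0 + n0)                        # Eq. (39)
    # Flagged part of Eqs. (36) and (37); identical for |0> and |+> here.
    prep = 73 * n0 + 15 * n0 + 8 * n0 + 3 * (n0 + n0) + n0
    return {
        "H_nf": h_nf, "H_m": h_m, "EC1": ec1, "EC2": ec2, "ZS": zs, "XS": xs,
        "ED": ec1 + avg_ed2 * ec2,                     # Eq. (32)
        "EC": avg_rec1 * ec1 + avg_rec2 * ec2,         # Eq. (33)
        "prep": prep + avg_zs * zs + avg_xs * xs,      # Eqs. (36), (37)
    }


def gates_per_accepted_state(counts, p_accept):
    """Eq. (29): average gate count per accepted |H> state."""
    return (counts["H_nf"] + counts["H_m"] + counts["ED"]) / p_accept


counts = level1_gate_counts()
print(counts["H_nf"], counts["H_m"], counts["EC1"], counts["ZS"])
print(gates_per_accepted_state(counts, p_accept=(1 - 1e-4) ** 75))
```

With unit level-0 counts, Eq. (30) evaluates to 24, which agrees with Eq. (48) at $l = 1$ since $n_{EC}^{(0)} = 0$.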

D Overhead analysis for the MEK scheme

In this section we provide a detailed description of the qubit and gate overhead analysis required to implement the MEK scheme.

D.1 Qubit overhead analysis

We first compute the overhead cost of teleporting a physical $|H\rangle$ state to a level-2 $|H\rangle$ state and then performing one round of level-2 MEK.

Let $n_{qij}^{(T)}$ be the qubit overhead cost of a level-$i$ $|H\rangle$ state teleported to a level-$j$ $|H\rangle$ state. From the teleportation circuit of Fig. 13, a level-2 $|0\rangle$ and $|+\rangle$ state must be prepared, each requiring $11^2$ qubits. We assume that these qubits can be reused when performing the logical gates and EC's that follow (since an EC requires only 10 qubits). Including the qubit for the physical $|H\rangle$ state, we have

$$n_{q02}^{(T)} = 2(11^2) + 1 = 243. \tag{51}$$

Next we define $n_{q\mathrm{MEK}}^{(l)}$ to be the qubit overhead cost associated with performing a level-$l$ round of MEK. We also define $a_{\mathrm{MEK}}^{(l)}$ to be the probability that a pair of level-$l$ $|H\rangle$ states are accepted in a level-$l$ round of MEK. When implementing a level-$l$ MEK circuit, we must prepare two level-$l$ $|+\rangle$ states and one level-$l$ $|0\rangle$ state. As was explained in Appendix C.1, we require two additional level-$l$ $|H\rangle$ states in order to perform the $C_H$ gates. Thus the level-2 MEK circuit qubit overhead with level-2 $|H\rangle$ states teleported from physical $|H\rangle$ states is given by

$$n_{q\mathrm{MEK}}^{(2)}(p) = \frac{3(11^2) + 4(243)}{a_{\mathrm{MEK}}^{(2)}(p)} = \frac{1335}{a_{\mathrm{MEK}}^{(2)}(p)}. \tag{52}$$

Since the MEK circuit produces two distilled $|H\rangle$ states, when we compare the qubit overhead cost to our $|H\rangle$ state preparation error detection scheme, we will divide Eq. (52) by two. We note that several additional optimizations are possible. For instance, some of the qubits used for teleporting the $|H\rangle$ states that are used to perform the $C_H$ gates could be reused. Additionally, the extra qubits used for preparing the level-$l$ $|0\rangle$ and $|+\rangle$ states for the MEK circuit could also be reused in various parts of the protocol. However, these additional optimizations are likely to depend strongly on the particular architecture being used and its associated constraints. Therefore, to be as general as possible, these will be omitted.

We now consider the overhead cost for performing a round of level-3 MEK. We first teleport the distilled level-2 $|H\rangle$ states to level-3 $|H\rangle$ states. We have that

$$n_{q23}^{(T)}(p) = 2(11^3) + \frac{n_{q\mathrm{MEK}}^{(2)}(p)}{2} = 2662 + \frac{1335}{2a_{\mathrm{MEK}}^{(2)}(p)}. \tag{53}$$

Using the level-3 $|H\rangle$ states, the qubit overhead for a level-3 MEK simulation is given by

$$n_{q\mathrm{MEK}}^{(3)}(p) = \frac{3(11^3) + 4n_{q23}^{(T)}(p)}{a_{\mathrm{MEK}}^{(3)}(p)}, \tag{54}$$

where we will divide Eq. (54) by two when comparing with the qubit overhead of our error detection scheme.

For the case when a level-2 $|H\rangle$ state is prepared using our error detection scheme, we must first teleport the level-2 $|H\rangle$ state to a level-3 $|H\rangle$ state. Defining $n_{2ED3}^{(T)}$ to be the overhead cost for the teleportation step, we have

$$n_{2ED3}^{(T)}(p) = 2(11^3) + \langle n_T^{H(2)}(p)\rangle, \tag{55}$$

where $\langle n_T^{H(2)}(p)\rangle$ is obtained from Eq. (28). Hence the overhead cost for performing a round of level-3 MEK is given by

$$n_{q\mathrm{MEKED}}^{(3)}(p) = \frac{3(11^3) + 4n_{2ED3}^{(T)}(p)}{2a_{\mathrm{MEKED}}^{(3)}(p)}, \tag{56}$$

where $a_{\mathrm{MEKED}}^{(3)}$ is the acceptance probability for the level-3 MEK scheme when the level-2 $|H\rangle$ states (teleported to level-3) were obtained from our error detection scheme.

D.2 Gate overhead analysis

We now consider the gate overhead analysis for the various rounds of MEK considered in Appendix D.1.

Let $n_{gij}^{(T)}$ be the gate overhead cost of a level-$i$ $|H\rangle$ state teleported to a level-$j$ $|H\rangle$ state. Two EC's must be performed after applying the logical CNOT gate. We must also take into account the gate cost for the decoding circuit. The decoding circuit requires the preparation of four $|+\rangle$ states and three $|0\rangle$ states. There are eight CNOT gates and the circuit requires a total of 16 EC's. Let $n_{ij}^{(D)}$ be the number of gates in the decoding circuit for the level-$i$ to level-$j$ teleportation scheme. We have that

$$n_{02}^{(D)} = 4\left(n_{|+\rangle}^{(2)} + n_{|+\rangle}^{(1)}\right) + 3\left(n_{|0\rangle}^{(2)} + n_{|0\rangle}^{(1)}\right) + 16\left(n_{EC}^{(1)} + n_{EC}^{(2)}\right) + 8(7^2 + 7), \tag{57}$$

where $n_{EC}^{(l)}$, $n_{|0\rangle}^{(l)}$ and $n_{|+\rangle}^{(l)}$ are given by Eqs. (41), (44) and (45). We then have

$$n_{g02}^{(T)} = n_{|+\rangle}^{(2)} + n_{|0\rangle}^{(2)} + 2n_{EC}^{(2)} + n_{02}^{(D)} + 7^2 + 4. \tag{58}$$

The last term in Eq. (58) is due to the four physical operations of the decoding circuit.

Next we compute the gate overhead of the level-2 MEK scheme. The circuit has 48 logical gates, 177 EC's and requires the preparation of two level-2 $|+\rangle$ states and one level-2 $|0\rangle$ state. Defining $n_{g\mathrm{MEK}}^{(l)}$ to be the gate overhead of a level-$l$ MEK circuit, we have

$$n_{g\mathrm{MEK}}^{(2)} = \frac{4n_{g02}^{(T)} + 2n_{|+\rangle}^{(2)} + n_{|0\rangle}^{(2)} + 48(7^2) + 177n_{EC}^{(2)}}{a_{\mathrm{MEK}}^{(2)}(p)}. \tag{59}$$

Again, when comparing the gate overhead of a level-2 MEK circuit to the gate overhead required to prepare a level-2 $|H\rangle$ state using our error detection scheme, we will divide Eq. (59) by two since an MEK circuit produces two distilled $|H\rangle$ states.

Following the same steps that lead to Eq. (59), it is straightforward to compute the gate overhead for a level-3 MEK circuit. Using

$$n_{23}^{(D)} = 4n_{|+\rangle}^{(3)} + 3n_{|0\rangle}^{(3)} + 16n_{EC}^{(3)} + 8(7^3), \tag{60}$$

and

$$n_{g23}^{(T)} = n_{|+\rangle}^{(3)} + n_{|0\rangle}^{(3)} + 2n_{EC}^{(3)} + 3n_{EC}^{(2)} + 7^3 + 7^2 + n_{23}^{(D)} + \frac{n_{g\mathrm{MEK}}^{(2)}}{2}, \tag{61}$$

we get

$$n_{g\mathrm{MEK}}^{(3)} = \frac{4n_{g23}^{(T)} + 2n_{|+\rangle}^{(3)} + n_{|0\rangle}^{(3)} + 48(7^3) + 177n_{EC}^{(3)}}{a_{\mathrm{MEK}}^{(3)}(p)}. \tag{62}$$

The gate overhead for a level-3 MEK scheme where a level-2 $|H\rangle$ state was prepared from our error detection scheme is obtained as follows. First we compute the gate overhead to teleport the level-2 $|H\rangle$ state to a level-3 $|H\rangle$ state. It is given by

$$n_{g23ED}^{(T)}(p) = n_{|H\rangle}^{(2)} + n_{|+\rangle}^{(3)} + n_{|0\rangle}^{(3)} + 2n_{EC}^{(3)} + 3n_{EC}^{(2)} + 7^3 + 7^2 + n_{23}^{(D)}, \tag{63}$$

where $n_{|H\rangle}^{(2)}$ is obtained from Eq. (29). Using Eq. (63) we obtain

$$n_{g\mathrm{MEKED}}^{(3)} = \frac{4n_{g23ED}^{(T)}(p) + 2n_{|+\rangle}^{(3)} + n_{|0\rangle}^{(3)} + 48(7^3) + 177n_{EC}^{(3)}}{a_{\mathrm{MEKED}}^{(3)}(p)}, \tag{64}$$

where $a_{\mathrm{MEKED}}^{(3)}$ is the acceptance probability of the level-3 MEK circuit when the level-2 $|H\rangle$ states (teleported to level-3) are obtained from our error detection scheme.

E State vector simulations

Here we describe how we simulate each $|H\rangle$ state preparation and MEK magic state distillation circuit. A common method for simulating fault-tolerant error correction is to apply the Gottesman-Knill theorem to track how Pauli errors propagate through stabilizer circuits. However, unlike error correction, circuits for magic state preparation and distillation contain non-Clifford gates that map Pauli errors outside the Pauli group, complicating the analysis. Since all of the circuits we consider act on relatively few qubits, we use the Qiskit state vector simulator [45] to accurately track error propagation through non-Clifford gates.

Each gate, preparation, idle and measurement location is subject to Pauli channel errors whose probabilities are functions of the physical error rate $p$ that are determined by the noise model at level-1 and the logical error probabilities of fault-tolerant gates at level-2 and above (see Section 2). For each value of $p$, we run between $10^6$ and $10^8$ Monte-Carlo samples, where each sample draws an error at each fault location. The circuits are composed with an ideal decoder so that the output state is one or two physical qubits for state preparation or distillation, respectively. The output for each sample $i$ is a pure state vector $|\psi_i\rangle$ and a measurement record $m_i$ on which we post-select.

For simulations of $|H\rangle$ state preparation, we compute overlaps $\langle\psi_i|P|\psi_i\rangle$ for each Pauli $P$ to determine the logical error class, since in this case the logical error is always of this form after ideal decoding. However, for the MEK distillation protocol there are additional failure modes. For example, the two output qubits can be in a maximally entangled state. This can be seen by placing Y errors on the fourth and seventh $|H\rangle$ states, which corresponds to failures of the first $T$ gate and

third $T^\dagger$ gate. Therefore, we solve for Pauli channel parameters using an estimate of the output density matrix $\tilde{\rho} \approx \sum_i |\psi_i\rangle\langle\psi_i|$. First, we compute the reduced state $\rho = \mathrm{Tr}_2\,\tilde{\rho}$ of one output qubit. Next, we model the output state as a Pauli channel applied to the state $\rho_H = |H\rangle\langle H|$,

$$\mathcal{E}(\rho_H) = (1 - p_x - p_z)\rho_H + p_x X\rho_H X + p_z Z\rho_H Z \tag{65}$$

$$= \frac{1}{2\sqrt{2}}\begin{pmatrix} 1 + \sqrt{2} - 2p_x & 1 - 2p_z \\ 1 - 2p_z & -1 + \sqrt{2} + 2p_x \end{pmatrix}. \tag{66}$$

For the parameter ranges we considered, we can solve for the Pauli channel parameters in terms of the matrix elements of $\rho$,

$$p_x = \frac{1 + \sqrt{2} - 2\sqrt{2}\rho_{00}}{2}, \tag{67}$$

$$p_z = \frac{1 - 2\sqrt{2}\rho_{01}}{2}. \tag{68}$$

We substitute these parameters back into the model and verify the density matrices are equal to machine precision.

The logical error probabilities and rejection probabilities are fit to functions of the physical error rate $p$. State preparation results are summarized in Table 4 and distillation results in Table 5. For the second round of distillation, the $O(p^4)$ contribution to the logical error is too small to observe with $10^8$ Monte-Carlo trials, so we do two such simulations. The first case uses ideal $|H\rangle$ states and noisy stabilizer operations, while the second case uses noisy $|H\rangle$ states and ideal stabilizer operations. We simulate the second case at higher physical error rates to estimate the coefficient of the $O(p^4)$ term. Finally, we approximate the logical error rate and acceptance probability of the noisy circuit as a piecewise function (see the caption of Table 5).
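The inversion in Eqs. (67) and (68) can be checked numerically with a short NumPy sketch. It assumes the convention $|H\rangle = \cos(\pi/8)|0\rangle + \sin(\pi/8)|1\rangle$ (the $+1$ eigenstate of the Hadamard gate), which reproduces the matrix of Eq. (66) at $p_x = p_z = 0$. The function names are illustrative and are not taken from the simulation code used for this work.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def h_state_density():
    """Density matrix of |H> = cos(pi/8)|0> + sin(pi/8)|1> (assumed convention)."""
    psi = np.array([np.cos(np.pi / 8), np.sin(np.pi / 8)], dtype=complex)
    return np.outer(psi, psi.conj())


def pauli_channel(rho, px, pz):
    """Eq. (65): X error with probability px, Z error with probability pz."""
    return (1 - px - pz) * rho + px * (X @ rho @ X) + pz * (Z @ rho @ Z)


def solve_channel_parameters(rho):
    """Eqs. (67) and (68): recover (px, pz) from the matrix elements of rho."""
    s = np.sqrt(2)
    px = (1 + s - 2 * s * rho[0, 0].real) / 2
    pz = (1 - 2 * s * rho[0, 1].real) / 2
    return px, pz


# Round trip: apply the channel with known parameters, then solve for them.
rho = pauli_channel(h_state_density(), px=0.01, pz=0.02)
px, pz = solve_channel_parameters(rho)
print(px, pz)
# Substitute back into the model and compare, as described in the text.
print(np.allclose(pauli_channel(h_state_density(), px, pz), rho))
```

In a simulation, `rho` would instead be the normalized reduced state estimated from the post-selected samples.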

| Concatenation | $\Pr[\text{accept}]$ | $\Pr[\mathrm{mal}_E^{(l)} \mid G, p]$ |
|---|---|---|
| $l = 1$ | $(1 - p)^{75}$ | $(9.95, 4.41, 7.87)\,p^2$ |
| $l = 2$ | $(1 - 3000p^2)^{200}$ | $(1.26, 0.0627, 1.09) \times 10^{9}p^4$ |
| $l = 2^*$ | $1 - 3.67p - 2.67 \times 10^{3}p^2$ | $(5.14, 0.33, 2.26) \times 10^{4}p^4$ |
| $l = 3$ | $1 - 84.3p + 1.60 \times 10^{6}p^2 - 9.41 \times 10^{9}p^3$ | $(1.23, 0.0555, 1.16) \times 10^{24}p^8$ |
| $l = 3^*$ | $1 - 84.3p + 1.60 \times 10^{6}p^2 - 9.41 \times 10^{9}p^3$ | $(1.24, 0.0683, 1.17) \times 10^{24}p^8$ |

Table 4: State vector simulation results for $|H\rangle$ state preparation circuits using error detection. The error probabilities $(p_x, p_y, p_z)$ are given as functions of $p$ for each error type $E \in \{X, Y, Z\}$. The entries marked with $(*)$ use error-detection circuits for level-1 stabilizer operations (only in the $|H\rangle$ state preparation circuit). The acceptance probabilities are for the level-$l$ circuit and do not include rejections at lower levels of concatenation. The expressions are all approximations computed using a standard non-linear fitting method, and coefficients are rounded to three digits. The level-3 expressions are valid for $p \le 4 \times 10^{-4}$. Both level-3 error polynomials are similar because errors arising from the level-2 $|H\rangle$ states contribute much less to the overall logical error rate than the Clifford operations.
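As an illustration of how the level-1 acceptance fit of Table 4 enters the qubit overhead analysis of Appendix C.1, the following sketch evaluates Eq. (25) and the binomial tail sums of Eqs. (26) and (27). The helper `smallest_m`, its `max_m` cutoff, and the restriction to integer numbers of parallel preparations are assumptions made for this example; they are not a description of the numerical procedure used to produce Table 3.

```python
from math import comb


def accept_level1(p):
    """Level-1 acceptance probability fit from Table 4."""
    return (1 - p) ** 75


def qubits_level1(p):
    """Eq. (25): average qubit count for a level-1 |H> state."""
    return 10 / accept_level1(p)


def at_least(k_min, m, p_a):
    """Eqs. (26), (27): probability that at least k_min of m states are accepted."""
    return sum(comb(m, k) * p_a**k * (1 - p_a) ** (m - k)
               for k in range(k_min, m + 1))


def smallest_m(k_min, p_a, target, max_m=1000):
    """Smallest integer m with at_least(k_min, m, p_a) >= target (hypothetical helper)."""
    for m in range(k_min, max_m + 1):
        if at_least(k_min, m, p_a) >= target:
            return m
    raise ValueError("target not reached within max_m preparations")


p = 1e-4
p_a = accept_level1(p)
print(qubits_level1(p))
print(at_least(1, 1, p_a), at_least(2, 2, p_a))
print(smallest_m(1, p_a, 1 - 1e-3), smallest_m(2, p_a, 1 - 1e-3))
```

For a single preparation, `at_least(1, 1, p_a)` reduces to $(1-p)^{75} \approx 1 - 75p$, and `at_least(2, 2, p_a)` to $(1-p)^{150} \approx 1 - 150p$.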

| Circuit | $\Pr[\text{accept}]$ | $\Pr[\mathrm{mal}_E^{(l)} \mid G, p]$ |
|---|---|---|
| hybrid, $l = 3$ | $1 - 3.45 \times 10^{20}p^6$ | $(0.812, 1.68) \times 10^{26}p^8$ |
| round-1, $l = 2$ | $1 - 32.2p + 2.24 \times 10^{4}p^2 - 2.37 \times 10^{9}p^3$ | $(150, 152)\,p^2 + (1.01, 2.00) \times 10^{11}p^4$ |
| round-2, $l = 3$, ideal $\lvert H\rangle$ | $1 - 6.50 \times 10^{27}p^8$ | $(1.43, 2.43) \times 10^{26}p^8$ |
| round-2, $l = 3$, ideal stabilizer$^*$ | $1 - 1.47 \times 10^{3}p^2$ | $(1.97, 1.84) \times 10^{5}p^4$ |
| round-3, $l = 3$ | $1 - 8.00 \times 10^{27}p^8$ | $(2.00, 3.04) \times 10^{26}p^8$ |

Table 5: State vector simulation results for Meier-Eastin-Knill distillation circuits. The error probabilities $(p_x, p_z)$ are given as functions of $p$ for each error type $E \in \{X, Z\}$. The expressions are all approximations computed using a standard non-linear fitting method, and coefficients are rounded to three digits. The expressions are valid for $p \le 2 \times 10^{-4}$, except for $(*)$ which is valid for $p \le 5 \times 10^{-3}$. For round-2, we combine the two results in the table into the following optimistic piecewise approximation: $\Pr[\text{accept}] \approx 1 - \max(1.47 \times 10^{3}p^2,\ 6.50 \times 10^{27}p^8)$ and $\Pr[\mathrm{mal}_E^{(l)} \mid G, p] \approx (1.97, 1.84) \times 10^{5}p^4 + (1.43, 2.43) \times 10^{26}p^8$.
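To show how the fits of Table 5 could be combined with the qubit overhead expressions of Eqs. (51) to (54), the sketch below evaluates the round-2 piecewise approximation from the caption and feeds the acceptance fits into the MEK qubit counts. Identifying $a_{\mathrm{MEK}}^{(2)}$ with the round-1, $l = 2$ fit and $a_{\mathrm{MEK}}^{(3)}$ with the round-2 piecewise fit is an assumption made here for illustration, as are all function names.

```python
def round1_accept(p):
    """Round-1, l = 2 acceptance fit from Table 5."""
    return 1 - 32.2 * p + 2.24e4 * p**2 - 2.37e9 * p**3


def round2_accept(p):
    """Optimistic piecewise round-2 acceptance from the Table 5 caption."""
    return 1 - max(1.47e3 * p**2, 6.50e27 * p**8)


def round2_logical_error(p):
    """Round-2 logical (X, Z) error probabilities from the Table 5 caption."""
    px = 1.97e5 * p**4 + 1.43e26 * p**8
    pz = 1.84e5 * p**4 + 2.43e26 * p**8
    return px, pz


def mek_qubits_level2(a2):
    """Eq. (52), using n_q02 = 2 * 11^2 + 1 = 243 from Eq. (51)."""
    n_q02 = 2 * 11**2 + 1
    return (3 * 11**2 + 4 * n_q02) / a2


def mek_qubits_level3(a2, a3):
    """Eqs. (53) and (54)."""
    n_q23 = 2 * 11**3 + mek_qubits_level2(a2) / 2
    return (3 * 11**3 + 4 * n_q23) / a3


p = 1e-5
a2 = round1_accept(p)   # assumed to play the role of a_MEK^(2)
a3 = round2_accept(p)   # assumed to play the role of a_MEK^(3)
# Both MEK overheads are divided by two when compared per output |H> state.
print(mek_qubits_level2(a2) / 2, mek_qubits_level3(a2, a3) / 2)
print(round2_logical_error(p))
```

The two branches of the round-2 acceptance fit cross where $1.47 \times 10^{3}p^2 = 6.50 \times 10^{27}p^8$, that is near $p \approx 8 \times 10^{-5}$, which lies inside the stated validity range.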
