arXiv:1908.05154v1 [quant-ph] 14 Aug 2019

A Software Simulator for Noisy Quantum Circuits

Himanshu Chaudhary, Biplab Mahato, Lakshya Priyadarshi, Naman Roshan, Utkarsh, and Apoorva D. Patel

Undergraduate Physics, Indian Institute of Technology, Kanpur 208016
Undergraduate Physics, Indian Institute of Science, Bangalore 560012
Undergraduate Computer Science, Institute of Engineering and Technology, Lucknow 226021
Undergraduate Computer Science, International Institute of Information Technology, Hyderabad 500032
Centre for High Energy Physics, Indian Institute of Science, Bangalore 560012

(Dated: August 15, 2019)

We have developed a software library that simulates noisy quantum logic circuits. We represent quantum states by their density matrices, and incorporate possible errors in initialisation, logic gates, memory and measurement using simple models. Our quantum simulator is implemented as a new backend on IBM's open-source Qiskit platform. In this document, we provide its description, and illustrate it with some simple examples.

I. MOTIVATION

The field of quantum technologies has made rapid strides in recent years, and is poised for significant breakthroughs in the coming years. Practical applications are expected to appear first in sensing and metrology, then in communications and simulations, then as feedback to foundations of quantum theory, and ultimately in computation. The essential features that contribute to these technologies are superposition, entanglement, squeezing and tunneling of quantum states. The theoretical foundation of the field is clear; the laws of quantum mechanics are precisely known, and elementary hardware components work as predicted [1, 2]. The challenge is a large scale integration, say of 10 or more components.

Quantum systems are highly sensitive to disturbances from the environment; even necessary controls and observations perturb them. The available, and upcoming, quantum devices are noisy, and techniques to bring down the environmental error rate are being intensively pursued. At the same time, it is necessary to come up with error-resilient system designs, as well as techniques that validate and verify the results. This era of noisy intermediate scale quantum systems has been labeled NISQ [3]. Such systems are often special purpose platforms, with limited capabilities. They roughly span devices with 10-100 qubits, 10-1000 operations, limited interactions between qubits, and with no error correction since the fault-tolerance threshold is orders of magnitude away.

Worldwide, many universities and companies have developed software quantum simulators for help in investigations of noisy processors [4]. They are programs running on classical parallel computer platforms, and can model and benchmark 10-50 qubit systems. (It is not possible to classically simulate larger systems, due to exponential growth in the Hilbert space size.) For a user accessing a computer on the cloud and obtaining the output of a program, it makes little difference whether there is a genuine quantum processor or just a suitable software simulator at the other end. For this purpose, instead of using exact algebra, the simulator has to be designed to mimic a noisy quantum processor.

A quantum computation may suffer from many sources of error: due to imperfect initial state preparation, due to imprecise logic gate implementation, due to disturbances to the data in memory, and due to error-prone measurements. (It is safe to assume that the program instructions, which are classical, are essentially error-free.) So a realistic quantum simulator would have to include all of them with appropriate probability distributions. Additional features that can be included are restrictions on possible logic operations and connectivity between the components, which would imitate what may be the structure of a real quantum processor. With such improvisations, the simulation results would look close to what a noisy quantum processor would deliver, and one can test how well various algorithms work with imperfect quantum components. More importantly, one can vary the imperfections and the connectivity in the software simulator to figure out what design for the noisy quantum processor would produce the best results.

Quantum simulators serve an important educational purpose as well. They are portable, and can be easily distributed over existing computational facilities worldwide. They are therefore an excellent way to attract students to the field of quantum technology, providing a platform to acquire the skills of 'programming' as well as 'designing' quantum processors. Programming a quantum processor is qualitatively different from the experience of classical computer programming, and so the opportunity and exposure provided by quantum simulators would be of vital importance in developing future expertise in the field. It is with such an aim that we have constructed a software library for simulating noisy quantum logic circuits. We provide the details in what follows.

II. IMPLEMENTATION

In the standard formulation of quantum mechanics, states are vectors in a Hilbert space and evolve by unitary transformations, $|\psi\rangle \rightarrow U|\psi\rangle$. This evolution is deterministic, continuous and reversible. It is appropriate for describing the pure states of a closed quantum system, but is insufficient for describing the mixed states that result from interactions of an open quantum system with its environment.

The most general description of a quantum system is in terms of its density matrix $\rho$, which evolves according to a linear completely positive trace preserving map known as the superoperator [1, 2]. For generic mixed states, $\rho$ is a Hermitian positive semi-definite matrix, with $Tr(\rho)=1$ and $\rho^2 \le \rho$, while for pure states, $\rho = |\psi\rangle\langle\psi|$ and $\rho^2 = \rho$. The density matrix provides an ensemble description of the quantum system, and so is inherently probabilistic, in contrast to the state vector description that can describe individual experimental system evolution. Nonetheless, it allows determination of the expectation value of any physical observable, $\langle O \rangle = Tr(O\rho)$, which is defined as the average result over many experimental realisations.

In its discrete form, a superoperator can be specified by its Kraus operator-sum representation:

$$\rho \rightarrow \sum_\mu M_\mu \rho M_\mu^\dagger , \qquad \sum_\mu M_\mu^\dagger M_\mu = I . \tag{1}$$

Unitary evolution, $\rho \rightarrow U \rho U^\dagger$, and orthogonal projective measurement, $\rho \rightarrow \sum_k P_k \rho P_k$, are special cases of this representation. (Note that the projection operators satisfy $P_k = P_k^\dagger$, $\sum_k P_k = I$ and $P_k P_l = P_k \delta_{kl}$.) Also, various environmental disturbances to the quantum system can be modeled by suitable choices of $\{M_\mu\}$.

In going from a description based on $|\psi\rangle$ to the one based on $\rho$, the degrees of freedom get squared. This property is fully consistent with the Schmidt decomposition, which implies that any correlation between the system and the environment can be specified by modeling the environment using a set of degrees of freedom as large as that for the system. The squaring of the degrees of freedom is the price to be paid for the flexibility to include all possible environmental effects on the quantum system, and it slows down the performance of our quantum simulator.

We consider computational problems whose algorithms have already been converted to discrete quantum logic circuits acting on a set of qubits. We also assume that all logic gate instructions can be executed with a fixed clock step. In this framework, the computational complexity of the program is specified by the number of qubits and the total number of clock steps. Since the quantum state deteriorates with time due to environmental disturbances, we reduce the total execution time by identifying non-overlapping logic operations at every clock step and then implementing them in parallel.

Our quantum simulator is an open-source software written in Python, which is added as a new backend to IBM's Qiskit platform [5]. That extends the existing Qiskit capability, while retaining the convenience (e.g. portability, documentation, graphical interface) of the Qiskit format. Being an open-source software platform, Qiskit is popular, and a variety of quantum algorithms have been implemented using it [6]. Its comparison with other software packages, in terms of features and performance, is also available [7]. Our simulator, with a user guide, is available as a "derivative work" of Qiskit at https://github.com/indian-institute-of-science-qc/qiskit-aakash.

A. The Density Matrix

We express the density matrix of an n-qubit quantum register in the orthogonal Pauli basis:

$$\rho = \sum_{i_1,i_2,\ldots,i_n} a_{i_1 i_2 \ldots i_n} \, (\sigma_{i_1} \otimes \sigma_{i_2} \otimes \ldots \otimes \sigma_{i_n}) . \tag{2}$$

Here $i_1,i_2,\ldots,i_n \in \{0,1,2,3\}$, $\sigma_0 \equiv I$, and $a_{i_1 i_2 \ldots i_n}$ are $4^n$ real coefficients encoded as an array. The normalisation $Tr(\rho) = 1$ implies $a_{00\ldots0} = 2^{-n}$, and the constraint $Tr(\rho^2) \le 1$ implies $\sum_{i_1,i_2,\ldots,i_n} a_{i_1 i_2 \ldots i_n}^2 \le 2^{-n}$. We find this expression for the density matrix easier to work with, compared to expressing it as a $2^n \times 2^n$ complex matrix. The evolution is linear in terms of $\rho$. Moreover, we consider problems where all operations (logic gates, errors and measurements) are local, i.e. act on only a few qubits. Then during a single operation, only a few subscripts of $a_{i_1 i_2 \ldots i_n}$ change while all the rest remain unaltered. Such operations are efficiently implemented in the software using linear algebra vector instructions, with explicit evaluation of Eq. (1). We list density matrix transformations for some commonly used logic gates in Appendix A.

We allow several options to initialise the density matrix: The all-zero state is $2^{-n}(I + \sigma_3)^{\otimes n}$, the uniform superposition state is $2^{-n}(I + \sigma_1)^{\otimes n}$, a state specified by a binary string of 0's and 1's is mapped to a matching tensor product of $\frac{1}{2}(1 + \sigma_3)$ and $\frac{1}{2}(1 - \sigma_3)$ factors, and a custom density matrix can be read from a file as the $4^n$ coefficients. We also use the overlap, $Tr(\rho_1 \rho_2)$, as a convenient measure of closeness of two density matrices.

B. The Logic Gates

Generic unitary transformations acting on the density matrix belong to the group $SU(2^n)$, since $\rho$ does not contain the overall unobservable phase that $|\psi\rangle$ has. It is well-known that any such transformation can be decomposed into a sequence of one-qubit and two-qubit logic gates.
The optimal choice for these elementary gates depends on what operations are convenient to execute on the quantum hardware, but it is a small set in any case. We choose this select set to be the one-qubit rotations about the fixed Cartesian axes (i.e. x, y and z) and the two-qubit C-NOT gate, which is suitable for most hardware implementations.

We assume that the program to be executed is available as a time-ordered sequence of logic gate operations. If that is not so, then a compiler would be needed to convert instructions in a high-level language to a sequence of logic gate operations. We also assume that the C-NOT gate can be applied between any two qubits of a register. If there are restrictions on qubit connectivity, then again it would be the task of a compiler to express the C-NOT gate as a sequence of operations along an available qubit interaction route.

We follow the Qiskit convention in describing the logic gates. A single qubit rotation by angle $\theta$ about the axis $\hat{n}$ is $R_n(\theta) = e^{-i \hat{n}\cdot\vec{\sigma}\,\theta/2}$. Then using Euler decomposition, any single qubit rotation is decomposed as:

$$u3(\theta,\phi,\lambda) = e^{-i(\phi+\lambda)/2} \begin{pmatrix} \cos\frac{\theta}{2} & -e^{i\lambda}\sin\frac{\theta}{2} \\ e^{i\phi}\sin\frac{\theta}{2} & e^{i(\phi+\lambda)}\cos\frac{\theta}{2} \end{pmatrix} = R_z(\phi) R_y(\theta) R_z(\lambda) . \tag{3}$$

We also use the Qiskit preprocessor and transpiler to simplify the quantum logic circuit. First, the preprocessor converts several of the commonly used quantum logic gates to the select set of gates, e.g. the phase gates are expressed in terms of $u1(\theta) = R_z(\theta)$, the Hadamard gate becomes $H = i\,u3(\frac{\pi}{2}, 0, \pi)$, and the Toffoli gate ($C^2$-NOT) becomes a combination of C-NOT and phase gates. Then the transpiler optimises the quantum logic circuit, wherever possible, by collapsing adjacent gates and by cancelling gates using commutation rules.

C. Projective Measurements

Given the density matrix $\rho$, the expectation value of any Hermitian operator $O$ is easily obtained by expressing it in the Pauli basis,

$$O = \sum_{i_1,i_2,\ldots,i_n} b_{i_1 i_2 \ldots i_n} \, (\sigma_{i_1} \otimes \sigma_{i_2} \otimes \ldots \otimes \sigma_{i_n}) , \tag{4}$$

and then evaluating the inner product,

$$\langle O \rangle = Tr(O\rho) = 2^n \sum_{i_1,i_2,\ldots,i_n} a_{i_1 i_2 \ldots i_n} b_{i_1 i_2 \ldots i_n} . \tag{5}$$

Furthermore, the reduced density matrix with the degrees of freedom of the kth qubit summed over, $Tr_k(\rho)$, is specified by the $4^{n-1}$ coefficients $2a_{i_1 \ldots i_{k-1} 0 i_{k+1} \ldots i_n}$. This prescription can be repeated to reduce the density matrix over as many qubits as desired.

Quantum measurement modifies the quantum state as well; components orthogonal to the direction of measurement vanish upon a projective measurement. So when the kth qubit is measured along direction $\hat{n}$, the coefficients $a_{\ldots i_k \ldots}$ are set to zero for $i_k \perp \hat{n}$, while those for $i_k \parallel \hat{n}$ and $i_k = 0$ remain unchanged.

With these ingredients, we have implemented several projective measurement options:

• Expectation value of a Pauli operator string: This is just $\langle \sigma_{i_1} \otimes \sigma_{i_2} \otimes \ldots \otimes \sigma_{i_n} \rangle = 2^n a_{i_1 i_2 \ldots i_n}$. The density matrix is updated for each $i_k$, by projecting the coefficients with subscripts $i_k \in \{1,2,3\}$ and leaving those with subscripts $i_k = 0$ unaltered.

• Single qubit measurement: When the kth qubit is measured along direction $\hat{n}$, the two possible results have probabilities $\langle \frac{1}{2}(I \pm \hat{n}\cdot\vec{\sigma})_k \rangle$. In terms of the coefficients $\{c_0, \vec{c}\} = a_{0 \ldots i_k \ldots 0}$, the two probabilities are $\frac{1}{2}(1 \pm 2^n \hat{n}\cdot\vec{c})$. The density matrix is updated by projecting coefficients with subscript $i_k$.

• Ensemble measurement: For simultaneous binary measurement of all the qubits in the computational basis, the probabilities of the $2^n$ possible results are: $\langle 2^{-n} \prod_k (I \pm \sigma_3)_k \rangle = \sum_{j_1,\ldots,j_n \in \{0,3\}} (\prod_k s_k) \, a_{j_1 \ldots j_n}$, with the sign $s_k = \delta_{j_k,0} \pm \delta_{j_k,3}$. This measurement is typically performed at the end of the computation.

• Bell-basis measurement of a pair of qubits: For this joint measurement of two qubits, the projection operators for the four orthonormal state vectors are given by:

$$\begin{aligned}
\frac{|00\rangle + |11\rangle}{\sqrt{2}} &\rightarrow \tfrac{1}{4}(I \otimes I + \sigma_1 \otimes \sigma_1 - \sigma_2 \otimes \sigma_2 + \sigma_3 \otimes \sigma_3) , \\
\frac{|00\rangle - |11\rangle}{\sqrt{2}} &\rightarrow \tfrac{1}{4}(I \otimes I - \sigma_1 \otimes \sigma_1 + \sigma_2 \otimes \sigma_2 + \sigma_3 \otimes \sigma_3) , \\
\frac{|01\rangle + |10\rangle}{\sqrt{2}} &\rightarrow \tfrac{1}{4}(I \otimes I + \sigma_1 \otimes \sigma_1 + \sigma_2 \otimes \sigma_2 - \sigma_3 \otimes \sigma_3) , \\
\frac{|01\rangle - |10\rangle}{\sqrt{2}} &\rightarrow \tfrac{1}{4}(I \otimes I - \sigma_1 \otimes \sigma_1 - \sigma_2 \otimes \sigma_2 - \sigma_3 \otimes \sigma_3) .
\end{aligned}$$

So for the Bell-basis measurement of the kth and lth qubits, the probabilities of the four outcomes are determined by the coefficients with subscripts $i_k = i_l$ and all other subscripts set to zero. These four probabilities can be used to quantify entanglement between the two qubits. The post-measurement density matrix is obtained by setting the coefficients with $i_k \neq i_l$ to zero, while not changing those with $i_k = i_l$.

• Qubit reset: Although not a measurement, this instruction permits reuse of a qubit. The transformation to reset a quantum state to $|0\rangle$ is: $\rho \rightarrow P_0 \rho P_0 + \sigma_1 P_1 \rho P_1 \sigma_1$. When the kth qubit is reset, the coefficients with $i_k \in \{1,2\}$ are made zero, and the coefficient with $i_k = 3$ is made equal to the one with $i_k = 0$.

• Complete tomography: The $4^n$ coefficients of the density matrix determine expectation values of all the physical operators that can be measured in principle, although only a set of mutually commuting operators can be measured in a single quantum experiment. In our classical simulation, we can store the full density matrix at any stage of a program, and use it later as the initial state of another program.
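The Pauli-basis bookkeeping of Sections II.A and II.C can be illustrated with a minimal sketch in plain Python. It is not the simulator's own code: the dictionary storage, the function names and the choice that the leftmost character of the bit string is the first tensor factor are all assumptions made for this illustration.

```python
import itertools
import math

# Per-qubit Pauli coefficients (a0, a1, a2, a3) of rho = a0*I + a.sigma.
ZERO = (0.5, 0.0, 0.0, 0.5)    # |0><0| = (I + sigma_3)/2
ONE = (0.5, 0.0, 0.0, -0.5)    # |1><1| = (I - sigma_3)/2
PLUS = (0.5, 0.5, 0.0, 0.0)    # |+><+| = (I + sigma_1)/2

def product_state(factors):
    """Coefficients a[i1..in] of a tensor product of single-qubit states."""
    n = len(factors)
    return {
        idx: math.prod(factors[k][idx[k]] for k in range(n))
        for idx in itertools.product(range(4), repeat=n)
    }

def from_bitstring(bits):
    """State for a binary string; leftmost character = first factor (assumed)."""
    return product_state([ONE if b == "1" else ZERO for b in bits])

def pauli_expectation(a, idx):
    """<sigma_i1 x ... x sigma_in> = 2^n * a[i1..in]."""
    return 2 ** len(idx) * a[idx]

def ensemble_probabilities(a, n):
    """Probabilities of all 2^n computational-basis outcomes.

    Outcome bit 0 takes the '+' sign and bit 1 the '-' sign in (I +/- sigma_3).
    """
    probs = {}
    for bits in itertools.product((0, 1), repeat=n):
        total = 0.0
        for j in itertools.product((0, 3), repeat=n):
            sign = math.prod(-1 if (jk == 3 and b == 1) else 1
                             for jk, b in zip(j, bits))
            total += sign * a[j]
        probs[bits] = total
    return probs

a = from_bitstring("10")
probs = ensemble_probabilities(a, 2)
```

For the state built from the string "10", all the probability sits on the outcome (1, 0), and the sum of squared coefficients equals $2^{-n}$, as expected for a pure state.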

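The coefficient updates for measurement, reset and partial trace described in Section II.C can be sketched in the same spirit. This is a hedged illustration for a two-qubit Bell state, with measurement restricted to the z direction; the helper names and the dictionary representation are hypothetical and not taken from the simulator.

```python
import itertools

def bell_phi_plus():
    """Coefficients of (|00> + |11>)/sqrt(2): (II + XX - YY + ZZ)/4."""
    a = {idx: 0.0 for idx in itertools.product(range(4), repeat=2)}
    a[(0, 0)] = a[(1, 1)] = a[(3, 3)] = 0.25
    a[(2, 2)] = -0.25
    return a

def z_outcome_probabilities(a, k, n):
    """Probabilities (+1, -1) for measuring qubit k along z: (1 +/- 2^n c_3)/2."""
    idx = tuple(3 if q == k else 0 for q in range(n))
    c3 = a[idx]
    return 0.5 * (1 + 2**n * c3), 0.5 * (1 - 2**n * c3)

def measure_z(a, k):
    """Unselective z measurement of qubit k: zero coefficients with i_k in {1, 2}."""
    return {idx: (0.0 if idx[k] in (1, 2) else v) for idx, v in a.items()}

def reset(a, k):
    """Reset qubit k to |0>: zero i_k in {1, 2}, copy the i_k = 0 value to i_k = 3."""
    out = {}
    for idx in a:
        if idx[k] in (1, 2):
            out[idx] = 0.0
        else:
            source = idx[:k] + (0,) + idx[k + 1:]
            out[idx] = a[source]
    return out

def trace_out(a, k):
    """Reduced state over qubit k: coefficients 2 * a[.., i_k = 0, ..]."""
    return {idx[:k] + idx[k + 1:]: 2 * v for idx, v in a.items() if idx[k] == 0}

bell = bell_phi_plus()
p_plus, p_minus = z_outcome_probabilities(bell, 0, 2)
measured = measure_z(bell, 0)
after_reset = reset(bell, 0)
reduced = trace_out(bell, 0)
```

In this example the two outcomes are equally likely, the measured state keeps only the $I \otimes I$ and $\sigma_3 \otimes \sigma_3$ terms, and both the reset and the partial trace leave the second qubit maximally mixed.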
D. The Partitioned Logic Circuit

An open quantum system continuously deteriorates in time. To mitigate that, it is useful to reduce the total execution time of a quantum program as much as possible. Towards this end, we restructure the circuit produced by the transpiler as follows.

The preprocessor decomposes the logic gates provided by the user to the select set $\{u1, u3, \text{C-NOT}\}$. This decomposition adds to the number of logic gates in the circuit, increasing its depth. As a countermeasure, we go through the sequence of operations on each qubit, and merge consecutive single-qubit rotations that we find into a single one (e.g. $u3 * u3 \rightarrow u3$), using SU(2) group composition rules.

Next we arrange the complete list of instructions into a set of partitions, such that all operations in a single partition can be executed as parallel threads during a single clock step. To accomplish this, the clock step has to be longer than the execution times of individual operations (i.e. u3, C-NOT and various measurements). We partition the circuit by organising the list of instructions as a stack of sequential operations for every qubit, introducing barriers such that each qubit can have at most one operation in a partition, and combining non-overlapping qubit operations into a single partition wherever possible. In particular, this procedure puts logic gate operations and measurement operations in separate partitions (note that a single qubit measurement may affect the whole quantum register in case of entangled quantum states). Thus a partition may have either multiple quantum logic gates operating on different qubits, or multiple single qubit measurements on distinct qubits. We let expectation value calculations, ensemble measurements and Bell-basis measurements form partitions on their own.

We provide details of the merging and the partitioning logic circuit operations in Appendices A and B, with simple illustrations.

III. THE NOISY EVOLUTION

The manipulations of circuit operations described in the previous section are carried out at the classical level; even when quantum hardware is available, they would be implemented by a classical compiler. So we safely assume that they are error-free. It is the execution of the partitioned circuit on a quantum backend that is influenced by the environment. Assuming that the environment disturbs each qubit independently, we now present simple models that include the environmental noise in the simulator at various stages of the program execution.

A. Initialisation Error

The initial state of the program is often an equilibrium state. So we allow a fully-factorised thermal state as one of the initial state options:

$$\rho_{th} = \begin{pmatrix} p & 0 \\ 0 & 1-p \end{pmatrix}^{\otimes n} , \qquad \frac{p}{1-p} = \exp\left(\frac{E_1 - E_0}{kT}\right) . \tag{6}$$

Here the parameter $p$ is provided by the user.

B. Logic Gate Execution Error

The single qubit rotations in our select set have fixed rotation axes, and we assume that errors arise from inaccuracies in their rotation angles. Let $\alpha$ denote the inaccuracy in the angle, with the mean $\langle\langle \alpha \rangle\rangle = \bar{\alpha}$ and the fluctuations symmetric about $\bar{\alpha}$. Then the replacement $\theta \rightarrow \theta + \alpha$ in $R_n(\theta)$ modifies the density matrix transformation according to the substitutions:

$$\cos\theta \rightarrow r \cos(\theta + \bar{\alpha}) , \qquad \sin\theta \rightarrow r \sin(\theta + \bar{\alpha}) , \tag{7}$$

where $\bar{\alpha}$ and $r = \langle\langle \cos(\alpha - \bar{\alpha}) \rangle\rangle$ are the parameters provided by the user. They may depend on the rotation axis (i.e. x, y or z).

To model the error in the C-NOT gate, we assume that C-NOT is implemented as a transition selective pulse that exchanges amplitudes of the two target qubit levels when the control qubit state is $|1\rangle$. Then the error is in the duration of the transition selective pulse, and alters only the second half of the unitary operator, $U_{cx} = |0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes \sigma_1$. It can be included in the same manner as the error in single qubit rotation angle (i.e. as a disturbance to the rotation operator $\sigma_1$). The corresponding two parameters, analogous to $\bar{\alpha}$ and $r$, are provided by the user.

C. Measurement Error

Projective measurements of quantum systems are not perfect in practice. We model a single qubit measurement error as depolarisation, which is equivalent to a bit-flip error in a binary measurement. Then when the kth qubit is measured along direction $\hat{n}$, the coefficients $a_{i_1 \ldots i_k \ldots i_n}$ in the post-measurement state are set to zero for $i_k \perp \hat{n}$, reduced by a multiplicative factor $d_1$ for $i_k \parallel \hat{n}$, and left unaffected for $i_k = 0$. Also, the probabilities of the two outcomes become $\frac{1}{2}(1 \pm 2^n d_1 \hat{n}\cdot\vec{c})$, in the notation of Section II.C. Here the parameter $d_1$ is provided by the user. In case of a measurement of a multi-qubit Pauli operator string, the above procedure is applied to every qubit whose measurement operator has $i_k \neq 0$.

In the case of a Bell-basis measurement, the post-measurement coefficients with $i_k \neq i_l$ are set to zero, those with $i_k = i_l \in \{1,2,3\}$ are reduced by a multiplicative factor $d_2$, and those with $i_k = i_l = 0$ are left the same. Also, the probabilities of the four outcomes are obtained by reducing the $i_k = i_l \in \{1,2,3\}$ contributions by the factor $d_2$ that is provided by the user.

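As a hedged illustration of the single qubit measurement error model of Section III.C, the sketch below evaluates the modified outcome probabilities. The function names are hypothetical, and the relation between $d_1$ and a readout flip probability $\epsilon = (1 - d_1)/2$ is our own reading of the stated bit-flip equivalence rather than a formula given in the text.

```python
def noisy_outcome_probabilities(c, n_hat, n_qubits, d1):
    """Return the (+, -) outcome probabilities (1 +/- 2^n * d1 * n_hat.c) / 2."""
    dot = sum(ci * ni for ci, ni in zip(c, n_hat))
    bias = 2**n_qubits * d1 * dot
    return 0.5 * (1 + bias), 0.5 * (1 - bias)

def bit_flip_equivalent(d1):
    """Flip probability of a binary readout giving the same statistics (assumed)."""
    return 0.5 * (1 - d1)

# One qubit prepared in |0>: c = (a1, a2, a3) = (0, 0, 1/2), measured along z.
p_plus, p_minus = noisy_outcome_probabilities(
    c=(0.0, 0.0, 0.5), n_hat=(0.0, 0.0, 1.0), n_qubits=1, d1=0.98
)
```

For a qubit prepared in $|0\rangle$ and $d_1 = 0.98$, the wrong outcome appears with probability 0.01, which matches the equivalent flip probability.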
D. Memory Errors

An open quantum system undergoes decoherence and decay, irrespective of whether it is being manipulated by some instruction or not. These effects cause maximum damage to a quantum signal, because they act on all the qubits all the time, while operational errors are confined to particular qubits at specific times. We assume that these memory errors are small during a clock step, and implement them by modifying the density matrix at the end of every clock step, in the spirit of the Trotter expansion. Such an implementation is actually the motivation behind our partitioning of the quantum circuit.

Taking the $\sigma_3$ basis as the computational basis, the decoherence effect is to suppress the off-diagonal coefficients with $i_k \in \{1,2\}$ for every qubit by a multiplicative factor $f$. It can be represented by the Kraus operators:

$$M_0 = \sqrt{\frac{1+f}{2}}\, I , \qquad M_1 = \sqrt{\frac{1-f}{2}}\, \sigma_3 . \tag{8}$$

In terms of the clock step $\Delta t$ and the decoherence time $T_2$, the parameter $f = \exp(-\Delta t/T_2)$, and it is provided by the user.

We consider the decay of the quantum state towards the thermal state, $\rho_{th}$, defined in Section III.A. This evolution is represented by the Kraus operators:

$$\begin{aligned}
M_0 &= \sqrt{p} \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{g} \end{pmatrix} , &
M_1 &= \sqrt{p} \begin{pmatrix} 0 & \sqrt{1-g} \\ 0 & 0 \end{pmatrix} , \\
M_2 &= \sqrt{1-p} \begin{pmatrix} \sqrt{g} & 0 \\ 0 & 1 \end{pmatrix} , &
M_3 &= \sqrt{1-p} \begin{pmatrix} 0 & 0 \\ \sqrt{1-g} & 0 \end{pmatrix} .
\end{aligned} \tag{9}$$

Its effect on every qubit is to suppress the off-diagonal coefficients with $i_k \in \{1,2\}$ by $\sqrt{g}$, and change the diagonal coefficients according to:

$$a_{\ldots 3 \ldots} \rightarrow g\, a_{\ldots 3 \ldots} + (2p-1)(1-g)\, a_{\ldots 0 \ldots} . \tag{10}$$

In terms of the clock step $\Delta t$ and the relaxation time $T_1$, the parameter $g = \exp(-\Delta t/T_1)$, and it is provided by the user. (Note that our Kraus representation automatically ensures the physical constraint $T_2 \le 2T_1$.)

We execute both the decoherence and the decay operations at the end of every partition.

IV. TESTS AND EXAMPLES

We have tested our density matrix simulator against Qiskit's state vector version, using circuits of randomly generated quantum logic operations. Both give identical results, when all the errors are absent. Since our density matrix simulator works with $4^n$ coefficients, it is slower than the state vector simulator that works with $2^n$ coefficients. On the other hand, it produces the complete output in one run, while the state vector simulators require multiple runs of the program for the same purpose. We can simulate circuits with 10 qubits and 100 operations in a few minutes on a laptop; it would be practical to handle larger quantum systems, say up to 15 qubits and 1000 operations, on more powerful dedicated computers.

The main achievement of our simulator is the ability to simulate noisy quantum systems, using simple error models. In such simulations, the final results are probability distributions over the possible outcomes of the algorithm, and their stability against variations of the error parameters can be explicitly checked. The distributions can be easily visualised using various types of plots, and we expect exponential deterioration of the quantum signal with increasing error rates.

[Figure 1: success probability (0.4 to 1.0) plotted against thermal and depolarisation factors (1.0 to 0.91), rotation error (1.0 to 0.991), decoherence factor (1.0 to 0.9991) and decay factor (1.0 to 0.99991).]

FIG. 1: Success probability of the binary addition program, 110+11=1001, as a function of different types of errors. For easy comparison, multiple parameters are plotted along the X-axis with different scales: thermal factor p (green), depolarisation factor d1 (orange), rotation error parameter r (red), decoherence factor f (blue) and decay factor g (black). In all cases, the success probability is observed to decrease exponentially.

As a straightforward example, we simulated the binary addition algorithm. That requires three quantum sub-registers, two for the two numbers and one for the carry bit. We varied the error parameters one at a time, and observed the probability distributions of the final sum. (To interpret the results correctly, we needed to invert Qiskit's convention of the least significant bit first and the most significant bit last, to the standard numerical convention of the most significant bit first and the least significant bit last.) Our results for the probability of the correct answer, for the addition 110 + 11 = 1001, are shown in Fig. 1. Although the success probability decreases exponentially in all the cases, we see a wide variation in sensitivity of the calculation to the different types of errors. Thermal (p) and depolarisation (d1) factors act only at the ends of the program, and produce the smallest errors. Rotation angle fluctuations (r) in the logic gates give rise to intermediate size errors. Decoherence (f) and decay (g) factors that act throughout the
program cause the largest errors, with decay dominating over decoherence. This pattern, together with the actual error parameter values, gives us an estimate of how accurately we need to control various errors in quantum hardware in order to get meaningful results. We also observed that decreasing p, d1 and r more or less kept the probability distribution centred around the correct answer, but decreasing f tended to make the distribution flat and decreasing g drove the distribution towards the all-zero state.

As a second example, we simulated the Quantum Fourier Transform algorithm for various numbers of qubits, again varying the error parameters one at a time. Our results for the final state fidelity, with reference to the exact result, are displayed in Fig. 2. We find that the fidelity deviates from 1 quadratically for very small errors, but subsequently drops exponentially, both as a function of the number of qubits and the error parameters. This is the expected behaviour, and the hierarchy of sensitivity to different errors is the same as in case of the addition algorithm.

[Figure 2: three surface plots of fidelity against the number of qubits (2 to 8) and, respectively, the rotation error (1.000 to 0.992), the decoherence factor (1.0000 to 0.9992) and the decay factor (1.0000 to 0.9992).]

FIG. 2: Fidelity of the Quantum Fourier Transform program, for different number of qubits n and as a function of different error parameters: (Top) The rotation error parameter r, with the same value for Rx, Ry, Rz and C-NOT operations; (Middle) The decoherence parameter f; (Bottom) The decay parameter g. The fidelity decreases first quadratically and then exponentially, both as a function of n and the error parameters.

Appendix A: Some Logic Gate Transformations for the Density Matrix

The single qubit density matrix is $\rho = a_0 I + \vec{a}\cdot\vec{\sigma}$, with $a_0 = \frac{1}{2}$. It is straightforward to apply commonly used one-qubit logic gates to it:

$$\begin{aligned}
\sigma_1 \rho \sigma_1 &= a_0 I + a_1\sigma_1 - a_2\sigma_2 - a_3\sigma_3 , \\
\sigma_2 \rho \sigma_2 &= a_0 I - a_1\sigma_1 + a_2\sigma_2 - a_3\sigma_3 , \\
\sigma_3 \rho \sigma_3 &= a_0 I - a_1\sigma_1 - a_2\sigma_2 + a_3\sigma_3 , \\
H \rho H &= a_0 I + a_3\sigma_1 - a_2\sigma_2 + a_1\sigma_3 , \\
S \rho S^\dagger &= a_0 I - a_2\sigma_1 + a_1\sigma_2 + a_3\sigma_3 , \\
S^\dagger \rho S &= a_0 I + a_2\sigma_1 - a_1\sigma_2 + a_3\sigma_3 , \\
T \rho T^\dagger &= a_0 I + \frac{a_1 - a_2}{\sqrt{2}}\sigma_1 + \frac{a_1 + a_2}{\sqrt{2}}\sigma_2 + a_3\sigma_3 , \\
T^\dagger \rho T &= a_0 I + \frac{a_1 + a_2}{\sqrt{2}}\sigma_1 - \frac{a_1 - a_2}{\sqrt{2}}\sigma_2 + a_3\sigma_3 .
\end{aligned} \tag{11}$$

Rotation errors in these one-qubit transformations are incorporated by changing them from $R_n(\theta)\rho R_n^\dagger(\theta)$ to $R_n(\theta + \alpha)\rho R_n^\dagger(\theta + \alpha)$.

The two-qubit C-NOT transformation is $U_{cx}\rho U_{cx}^\dagger$, with $U_{cx} = |0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes \sigma_1$. Transition selective pulse error in the C-NOT gate is included by changing $\sigma_1$ to $R_x(\alpha)\sigma_1$ in $U_{cx}$.

Consecutive rotations of a single qubit can be merged into a single one using SU(2) group composition rules. We rewrite Qiskit's $u2(\phi,\lambda)$ logic gate as $u3(\frac{\pi}{2},\phi,\lambda)$, which reduces merging possibilities to only four cases: $u1 * u1$, $u1 * u3$, $u3 * u1$ and $u3 * u3$. The first three are easily taken care of by adding the $R_z$ rotation angles, e.g. $u1(\theta_1) * u1(\theta_2) = u1(\theta_1 + \theta_2)$. To take care of the last one, we express:

$$\begin{aligned}
u3(\theta_1,\phi_1,\lambda_1) * u3(\theta_2,\phi_2,\lambda_2)
&= R_z(\phi_2) R_y(\theta_2) R_z(\lambda_2) R_z(\phi_1) R_y(\theta_1) R_z(\lambda_1) \\
&= R_z(\phi_2) R_y(\theta_2) R_z(\lambda_2 + \phi_1) R_y(\theta_1) R_z(\lambda_1) \\
&= R_z(\phi_2) R_z(\alpha) R_y(\beta) R_z(\gamma) R_z(\lambda_1) \\
&= R_z(\phi_2 + \alpha) R_y(\beta) R_z(\gamma + \lambda_1) \\
&= u3(\beta, \phi_2 + \alpha, \gamma + \lambda_1) .
\end{aligned} \tag{12}$$

Here the YZY Euler decomposition on the third line is converted to the ZYZ Euler decomposition on the fourth
line. This conversion is conveniently performed by explicitly matching the product matrices and using the arctan2 math-library function to extract the angles.

An illustration of how this merging can simplify a logic circuit is presented in Fig. 3. Note that the reversal of the operator order is due to the convention of the leftmost gate acting first in a circuit and the rightmost factor acting first in a matrix operation.

[Figure 3: circuit diagrams on qubits q0 to q3, before and after merging.]

FIG. 3: (Top) A quantum logic circuit with commonly used gates. (Bottom) The same logic circuit after merging several one-qubit gates.

Appendix B: Logic Circuit Rearrangement

To minimise decay and decoherence errors, we need to reduce the logic circuit depth as much as possible. For this purpose, we look for maximum parallelisation of the program provided as a time-ordered instruction set. We rearrange instructions into a set of partitions, preserving their temporal order, such that all instructions in a given partition commute with each other and can be executed simultaneously while the partitions are executed in succession. Then each partition is assigned a clock step, and overall decay and decoherence errors depend on the total number of partitions.

We note that all our logic gates including their errors involve only one or two qubits, while any projective measurement operation may affect the density matrix globally. So the partitions fall into two categories; they have either only unitary logic gates (u1, u3 or C-NOT) or only projective measurement operations. These two categories can have different clock step duration if required. To begin with, we therefore go through the whole instruction set and insert barriers between sets of consecutive logic gates and sets of consecutive measurement operations. We also separate out expectation value calculations, ensemble measurements and Bell-basis measurements from the rest of the instructions by inserting barriers. The instructions between successive barriers are then inspected to check if their further partitioning is necessary.

[Figure 4: circuit diagram on qubits q0 to q4, the corresponding qubit operation stack, and the partitioned circuit.]

FIG. 4: (Top) A quantum logic circuit specified by sequential instructions. (Middle) The qubit operation stack generated from the logic circuit. (Bottom) The partitioned logic circuit constructed from the qubit stack.

We implement a simple partitioning procedure, which may not be optimal, but works well in practice. Our first step is to construct a qubit stack from the instruction set, which lists the temporal sequence of instructions that act on every qubit. A multi-qubit instruction (e.g. C-NOT) is listed in the column of each participating qubit. In case of a single qubit measurement or reset, we add a dummy instruction to the rest of the qubit columns as a barrier. An example of this construction is shown in Fig. 4, and the pseudocode of our algorithm is presented in Fig. 5.

Our next step is to sequentially inspect the bottom instruction for every qubit column in the stack, and pop it into a new partition under certain conditions. In case of a logic gate partition, at most only one instruction from a column can go into a partition, and a multi-qubit instruction can go into the partition only if it is

Algorithm 1: Algorithm for constructing qubit stack from instruction set Algorithm 2: Algorithm for partitioning the logic circuit Input : Instruction Set: iSet, Input : Instruction Set: iSet, Number of qubits: numQubits Number of qubits: numQ Output: stacks Qubit stacks: , Output: Partitioned instruction set: piSet, Stacks maximum depth: depth levels 1 def qubitStacks (iSet, numQubits): Number of partitions: 1 2 Initialize numQ empty stacks: [ [ ], [ ] ... []] ← qubitStack def partitionInstructions (iSet, numQ): 3 for instruction in iSet: 2 iStack, depth ← qubitStacks(iSet, numQ) 4 if not isMeasure(instruction) and not isReset(instruction): 3 Initialize empty partitions: [ [ ], [ ] ... []] ← sequence 5 for qubit in instruction.qubits: 4 Initialize level: level 6 qubitStack[qubit].append(instruction) 5 while iSet: 7 elif isMeasure(instruction): 6 if level == len(sequence): 8 ← qubit instruction.qubits[0] 7 sequence.append([ ]) 9 if not isMeasureDummy(qubitStack[qubit][-1]): 8 for qubit in range(numQ): 10 qubitStack[qubit].append(instruction) 9 if : 11 for qb in iStack[qubit] 10 ← set(range(numQ)).difference(set(instruction.qubits)): gate iStack[qubit][0] 12 qubitStack[qb].append(dummymeasureinstruction) 11 else: 13 else: 12 continue 14 qubitStack[qubit][-1] ← instruction 13 if isDummy(gate): 15 elif isReset(instruction): 14 continue 16 qubit ← instruction.qubits[0] 15 elif isSingle(gate): 17 if not isMeasureDummy(qubitStack[qubit][-1]): 16 sequence[level].append(gate) 18 qubitStack[qubit].append(instruction) 17 iSet.remove(gate) 19 for qb in 18 iStack[qubit].pop(0) set(range(numQ)).difference(set(instruction.qubits)): 19 20 qubitStack[qb].append(dummyresetinstruction) elif isCX(gate): 20 21 else: firstQb, secondQb = gate.qubits 22 qubitStack[qubit][-1] ← instruction 21 currGate ← iStack[firstQb][0] 23 depth ← max([len(stack) for stack in qubitStack]) 22 buffGate ← iStack[secondQb][0] 24 return qubitStack, depth 23 if currGate == buffGate: 24 
sequence[level].append(gate) 25 iSet.remove(gate) FIG. 5: Pseudocode for the algorithm that constructs the 26 iStack[firstQb].pop(0) qubit operation stack from the instruction set. 27 iStack[secondQb].pop(0) 28 else: 29 continue 30 elif isMeasure(gate): present at the bottom of columns of each participating 31 allDummy ← True qubit. In case of a measurement partition, dummy in- 32 for x in numQ: structions are ignored, which allows simultaneous single 33 if not isMeasure(iStack[x][0] and not qubit measurement or reset on distinct qubits to be in 34 isMeasureDummy(iStack[x][0])): 35 allDummy ← False the same partition (we have called that partial measure- 36 break ment) while preventing successive measurements on the 37 if allDummy: same qubit to do so. This process of creating new parti- 38 for x in range(numQ): 39 ← tions is repeated until all the columns in the qubit stack instruction iStack[x][0] 40 if isMeasure(instruction): become empty. An example of this procedure is shown in 41 sequence[level].append(instruction) Fig. 4, and the pseudocode of our algorithm is presented 42 iSet.remove(instruction) in Fig. 6. 43 iStack[x].pop(0) At the end, we point out that an efficient software 44 break 45 elif isReset(gate): rescheduling of the program instructions is desirable even 46 allDummy ← True when the algorithm is to be implemented on a quantum 47 for x in numQ: hardware. 48 if not isReset(iStack[x][0] and not 49 isResetDummy(iStack[x][0])): 50 allDummy ← False 51 break 52 if allDummy: 53 for x in range(numQ): 54 instruction ← iStack[x][0] 55 if isReset(instruction): 56 sequence[level].append(instruction) 57 iSet.remove(instruction) 58 iStack[x].pop(0) 59 break 60 if not iSet: 61 break 62 level ← level + 1 63 return sequence, level

FIG. 6: Pseudocode for the algorithm that partitions the qubit operation stack in to sequential levels. 9
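The two-step procedure described in this appendix can be illustrated with a short runnable sketch. It is not the library's implementation and does not use the Qiskit API. Instructions are assumed to be plain (name, qubits) tuples, and the names DUMMY, is_projective, qubit_stacks and partition are hypothetical. As simplifying assumptions, measurement and reset share a single barrier marker, and each partition is selected from a snapshot of the column fronts, which enforces the rule of at most one instruction per column.

```python
from collections import deque

DUMMY = -1  # hypothetical barrier marker placed in idle qubit columns


def is_projective(instr):
    """Assumption: measure and reset are the only non-unitary operations."""
    return instr[0] in ("measure", "reset")


def qubit_stacks(instructions, num_qubits):
    """Build one time-ordered column of instruction indices per qubit."""
    stacks = [deque() for _ in range(num_qubits)]
    for idx, instr in enumerate(instructions):
        qubits = instr[1]
        if not is_projective(instr):
            for q in qubits:  # a two-qubit gate is listed in both columns
                stacks[q].append(idx)
            continue
        q = qubits[0]
        if stacks[q] and stacks[q][-1] == DUMMY:
            stacks[q][-1] = idx  # join the pending measurement row
        else:
            stacks[q].append(idx)
            for other in range(num_qubits):
                if other != q:
                    stacks[other].append(DUMMY)  # barrier in the other columns
    return stacks


def partition(instructions, num_qubits):
    """Split a time-ordered program into sequential partitions."""
    stacks = qubit_stacks(instructions, num_qubits)
    layers = []
    while any(stacks):
        # Snapshot of the front of every column (None for an empty column).
        fronts = [s[0] if s else None for s in stacks]
        chosen = []
        for front in fronts:
            if front is None or front == DUMMY or front in chosen:
                continue
            instr = instructions[front]
            if is_projective(instr):
                continue
            # A gate is eligible only if it fronts all of its columns.
            if all(fronts[q] == front for q in instr[1]):
                chosen.append(front)
        if chosen:  # logic gate partition
            for idx in chosen:
                for q in instructions[idx][1]:
                    stacks[q].popleft()
        else:  # measurement partition: pop one row, ignoring the dummies
            for s in stacks:
                if s:
                    idx = s.popleft()
                    if idx != DUMMY:
                        chosen.append(idx)
        layers.append([instructions[i] for i in chosen])
    return layers


program = [
    ("u3", (0,)), ("u3", (1,)), ("cx", (0, 1)), ("u1", (2,)),
    ("measure", (0,)), ("measure", (1,)), ("u3", (2,)), ("measure", (2,)),
]
for step, layer in enumerate(partition(program, 3)):
    print(step, layer)
```

For this example program, the eight instructions are scheduled in five clock steps: the measurements on qubits 0 and 1 share one partition, while the later measurement on qubit 2 is placed in a partition of its own.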

[1] J. Preskill, Lecture Notes for the Course on Quantum Computation, http://www.theory.caltech.edu/people/preskill/ph219/
[2] M.A. Nielsen and I.L. Chuang, Quantum Computation and Quantum Information, (Cambridge University Press, 2000).
[3] J. Preskill, Quantum Computing in the NISQ Era and Beyond, Quantum 2 (2018) 79.
[4] See for instance, http://quantiki.org/wiki/list-qc-simulators
[5] See https://qiskit.org/ and https://github.com/Qiskit/qiskit-terra Qiskit has copyright under Apache License 2.0.
[6] See for instance, P.J. Coles et al., Quantum Algorithm Implementations for Beginners, arXiv:1804.03719.
[7] See for instance, R. LaRose, Overview and Comparison of Gate Level Quantum Software Platforms, arXiv:1807.02500.