Dance with Noise in NISQ Era — A NASA Perspective on

Zhihui Wang1,2 Eleanor Rieffel1

[email protected] [email protected]

1. NASA Quantum Artificial Intelligence Lab (QuAIL) 2. Universities Space Research Association

ICCAD 2019, West Minster, CO Quantum Computing: Explore “bizarre” features of quantum mechanics

Superposition

Entanglement

Quantum Tunneling Article developed fast, high-fidelity gates that can be executed simultaneously a across a two-dimensional array. We calibrated and benchmarked the processor at both the component and system level using a powerful new tool: cross-entropy benchmarking11. Finally, we used component- level fidelities to accurately predict the performance of the whole sys- tem, further showing that quantum information behaves as expected when scaling to large systems.

A suitable computational task To demonstrate , we compare our quantum proces- sor against state-of-the-art classical computers in the task of sampling the output of a pseudo-random quantum circuit11,13,14. Random circuits are a suitable choice for benchmarking because they do not possess The Noisy Intermediate-Scale Quantum (NISQ) Era structure and therefore allow for limited guarantees of computational hardness10–12. We design the circuits to entangle a set of quantum bits () by repeated application of single-qubit and two-qubit logi- cal operations. Sampling the quantum circuit’s output produces a set Qubit Adjustable coupler of bitstrings, for example {0000101, 1011100, …}. Owing to quantum interference, the probability distribution of the bitstrings resembles b a speckled intensity pattern produced by light interference in laser Google scatter, such that some bitstrings are much more likely to occur than others. Classically computing this probability distribution becomes exponentially more difficult as the number of qubits (width) and number of gate cycles (depth) grow. Hardware Limitations: We verify that the quantum processor is working properly using a method called cross-entropy benchmarking11,12,14, which compares how often each bitstring is observed experimentally with its corresponding ideal probability computed via simulation on a classical computer. For a given circuit, we collect the measured bitstrings {xi} and compute the linear cross-entropy benchmarking fidelity11,13,14 (see also Supplementary Information), which is the mean of the simulated probabilities of the bitstrings we measured: 10 mm Intel - up to 100 physical qubits n FXEB =2 "Px( i)#i − 1 (1) Fig. 1 | The Sycamore processor. a, Layout of processor, showing a rectangular array of 54 qubits (grey), each connected to its four nearest neighbours with where n is the number of qubits, P(x ) is the probability of bitstring x i i couplers (blue). The inoperable qubit is outlined. b, Photograph of the computed for the ideal quantum circuit, and the average is over the Sycamore chip. - (some) planar connectivity observed bitstrings. Intuitively, FXEB is correlated with how often we sample high-probability bitstrings. When there are no errors in the quantum circuit, the distribution of probabilities is exponential (see connectivity was chosen to be forward-compatible with error correc- Supplementary Information), and sampling from this distribution will tion using the surface code26. A key systems engineering advance of this produce FXEB =1. On the other hand, sampling from the uniform device is achieving high-fidelity single- and two-qubit operations, not n - Limited in circuit depth distribution will give "P(xi)#i = 1/2 and produce FXEB =0. Values of FXEB just in isolation but also while performing a realistic computation with between 0 and 1 correspond to the probability that no error has occurred simultaneous gate operations on many qubits. We discuss the highlights while running the circuit. The probabilities P(xi) must be obtained from IBMbelow; see also the Supplementary Information. classically simulating the quantum circuit, and thus computing FXEB is In a superconducting circuit, conduction electrons condense into a intractable in the regime of quantum supremacy. However, with certain macroscopic quantum state, such that currents and voltages behave circuit simplifications, we can obtain quantitative fidelity estimates of quantum mechanically2,30. Our processor uses transmon qubits6, which - Little to no error correction a fully operating processor running wide and deep quantum circuits. can be thought of as nonlinear superconducting resonators at 5–7 GHz.

Our goal is to achieve a high enough FXEB for a circuit with sufficient The qubit is encoded as the two lowest quantum eigenstates of the width and depth such that the classical computing cost is prohibitively resonant circuit. Each transmon has two controls: a microwave drive large. This is a difficult task because our logic gates are imperfect and to excite the qubit, and a magnetic flux control to tune the frequency. the quantum states we intend to create are sensitive to errors. A single Each qubit is connected to a linear resonator used to read out the qubit bit or phase flip over the course of the algorithm will completely shuffle state5. As shown in Fig. 1, each qubit is also connected to its neighbouring the speckle pattern and result in close to zero fidelity11 (see also Sup- qubits using a new adjustable coupler31,32. Our coupler design allows us Rigetti plementary Information). Therefore, in order to claim quantum suprem- to quickly tune the qubit–qubit coupling from completely off to 40 MHz. acy we need a quantum processor that executes the program with One qubit did not function properly, so the device uses 53 qubits and sufficiently low error rates. 86 couplers. Are NISQ devices useful? The processor is fabricated using aluminium for metallization and Josephson junctions, and indium for bump-bonds between two silicon Building a high-fidelity processor wafers. The chip is wire-bonded to a superconducting circuit board We designed a quantum processor named ‘Sycamore’ which consists and cooled to below 20 mK in a dilution refrigerator to reduce ambient of a two-dimensional array of 54 transmon qubits, where each qubit is thermal energy to well below the qubit energy. The processor is con- tunably coupled to four nearest neighbours, in a rectangular lattice. The nected through filters and attenuators to room-temperature electronics, - Quantum supremacy 506 | Nature | Vol 574 | 24 OCTOBER 2019 - Optimization Honeywell - Quantum chemistry - Quantum machine learning IonQ - …

… Quantum supremacyArticle through random circuit developed fast, high-fidelity gates that can be executed simultaneously a across a two-dimensional qubit array. We calibrated and benchmarked the processor at both the component and system level using a powerful • Advantage Sampling the output of a new tool: cross-entropy benchmarking11. Finally, we used component- level fidelities to accurately predict the performance of the whole sys- tem, further showing that quantum information behaves as expected pseudo-random quantum circuit when scaling to large systems.

A suitable computational task • qFlex: state of art classical simulator To demonstrate quantum supremacy, we compare our quantum proces- sor against state-of-the-art classical computers in the task of sampling h4ps://github.com/ngnrsaa/qflex the output of a pseudo-random quantum circuit11,13,14. Random circuits are a suitable choice for benchmarking because they do not possess structure and therefore allow for limited guarantees of computational hardness10–12. We design the circuits to entangle a set of quantum bits [Villalonga et. al., accepted on NPJ QIP 2019(qubits) by repeated application of single-qubit and two-qubit logi- cal operations. Sampling the quantum circuit’s output produces a set Qubit Adjustable coupler of bitstrings, for example {0000101, 1011100, …}. Owing to quantum interference, the probability distribution of the bitstrings resembles b a speckled intensity pattern produced by light interference in laser • Application random-number generatorscatter, such that some bitstrings are much more likely to occur than others. Classically computing this probability distribution becomes exponentially more difficult as the number of qubits (width) and number of gate cycles (depth) grow. • Why achievable in NISQ era We verify that the quantum processor is working properly using a method called cross-entropy benchmarking11,12,14, which compares how often each bitstring is observed experimentally with its corresponding “Random circuits are a suitable choice forideal probability computed via simulation on a classical computer. For a given circuit, we collect the measured bitstrings {xi} and compute the linear cross-entropy benchmarking fidelity11,13,14 (see also Supplementary benchmarking because they do not Information), which is the mean of the simulated probabilities of the bitstrings we measured: 10 mm

n FXEa B =2 "Px( i)#i − 1 Classically veri(1fable) b Supremacy regime possess structure and therefore allow for 10 0 Fig. 1 | The Sycamore processor. a, Layout of processor, showing a rectangular array of 54 qubits (grey), each connected to its four nearest neighbours with where n is the number of qubits, P(x ) is the probability of bitstring x i i couplers (blue). The inoperable qubit is outlined. b, Photograph of the limited guarantees of computationalcomputed for the ideal quantum circuit, and the average is over the Sycamore chip. E F G H E F G H observed bitstrings. Intuitively, FXEB is correlated with how often we A B C D C D A B sample high-probability bitstrings.XEB When there are no errors in the 6 Sycamore sampling (Ns = 10 ): 200 s quantum circuit, the distribution–1 of probabilities is exponential (see connectivity was chosen to be forward-compatible with error correc- hardness” 10 26 delity, 2h ClassicalCl sil samplingli g at Supplementary Information),f and sampling from this distribution will tion using the surface code . A key systems engineeringSycamoree advance of this

produce FXEB =1. On the other hand, sampling from the uniform device is achieving high-fidelity single-22k weeke ands two-qubit operations, not — Randomness is desired n Classical verifcation 1 week distribution will give "P(xi)#i = 1/2 and produce FXEB =0. Values of FXEB just in isolation but also while performing a realistic4 yry computation with 4 yr between 0 and 1 correspond to the probability that no error has occurred simultaneous gate operations5 h on many qubits. We discuss1000 yry the highlights 600 yr while running the circuit. The probabilities P(xi) must be obtained from below; see also the Supplementary Information. 10,0000 yr –2 m = 14 cycles classically simulating the quantum10 circuit, and thus computing FXEB is In a superconducting circuit, conduction electrons condense into a & intractable in the regime of quantum supremacy.Prediction from However, gate and measurement with certain errors macroscopic quantum state, such that currents and voltages behave Full circuit Elided circuit Patch circuit 2,30 6

circuit simplifications, we canCross-entropy benchmarking obtain quantitative fidelity estimates of quantum mechanically . Our processor uses transmon qubits , which n = 53 qubits a fully operating processor running wide and deep quantum circuits. can be thought of as nonlinear superconductingPrediction resonators at 5–7 GHz. Signature is resilient against noise. Our goal is to achieve a high enough FXEB for a circuit with sufficient The qubit is encoded as the twoElided lowest (±5 error quantum bars) eigenstates of the width and depth such that the classical computing cost is prohibitively resonant circuit. Each transmonPatch has two controls: a microwave drive 10–3 large. This is a difficult task because10 our15 logic 20gates are25 imperfect30 and35 to 40excite the45 qubit,50 and a 55magnetic12 flux14 control16 to tune18 the frequency.20 the quantum states we intend to create are sensitive to errors.Number A ofsingle qubits, n Each qubit is connected to a linear resonatorNumber usedof cycles, to m read out the qubit 5 bit or phase flip over the courseFig. 4 | Demonstrating of the algorithm quantum will supremacy. completely a, Verification shuffle of benchmarking state . As showncomplexity, in Fig. justifies 1, each the use qubit of elided is also circuits connected to estimate fidelity to its in neighbouring the 11 31,32 the speckle pattern and resultmethods. in F XEcloseB values to for zero patch, fidelityelided and full (see verification also Sup circuits- arequbits usingsupremacy a new regime.adjustable b, Estimating coupler FXEB in the. quantumOur coupler supremacy design regime. allows Here, us plementary Information).calculated Therefore, from inmeasured order bitstrings to claim and quantum the corresponding suprem probabilities- to quickly tunethe two-qubit the qubit–qubit gates are applied coupling in a non-simplifiable from completely tiling and sequence off to for 40 MHz. predicted by classical simulation. Here, the two-qubit gates are applied in a which it is much harder to simulate. For the largest elided data (n = 53, m = 20,

acy we need a quantum simplifiableprocessor tiling that and sequence executes such thatthe the program full circuits canwith be simulated One out qubit totaldid Nnots = 30 function million), we findproperly, an average so FXE theB > 0.1% device with 5σ confidence,uses 53 qubits where σ and sufficiently low error rates.to n = 53, m = 14 in a reasonable amount of time. Each data point is an average86 couplers. over includes both systematic and statistical uncertainties. The corresponding full ten distinct quantum circuit instances that differ in their single-qubit gates (for n circuit data, not simulated but archived, is expected to show similarly = 39, 42 and 43 only two instances were simulated). For each n, each instanceThe is processorstatistically is significant fabricated fidelity. using For m =aluminium 20, obtaining a formillion metallization samples on the and

sampled with Ns of 0.5–2.5 million. The black line shows the predictedJosephson FXEB based quantum junctions, processor and takes indium 200 seconds, for bump-bonds whereas an equal-fidelity between classical two silicon Building a high-fidelityon single- processor and two-qubit gate and measurement errors. The close wafers. Thesampling chip wouldis wire-bonded take 10,000 years to on a a millionsuperconducting cores, and verifying circuit the fidelity board correspondence between all four curves, despite their vast differences in would take millions of years. We designed a quantum processor named ‘Sycamore’ which consists and cooled to below 20 mK in a dilution refrigerator to reduce ambient of a two-dimensional array of 54 transmon qubits, where each qubit is thermal energy to well below the qubit energy. The processor is con- quantum processor time is only about 30 seconds. The bitstring samples choosing circuits that randomize and decorrelate errors, by optimizing tunably coupled to four nearestfrom all neighbours,circuits have been in archiveda rectangular online (see lattice. ‘Data availability’ The nected section) through control filters to minimize and attenuators systematic errors to room-temperature and leakage, and by designingelectronics, to encourage development and testing of more advanced verification gates that operate much faster than correlated noise sources, such as algorithms. 1/f flux noise37. Demonstrating a predictive uncorrelated error model 506 | Nature | Vol 574 | 24One OCTOBER may wonder 2019 to what extent algorithmic innovation can enhance up to a Hilbert space of size 253 shows that we can build a system where classical simulations. Our assumption, based on insights from complex- quantum resources, such as entanglement, are not prohibitively fragile. ity theory11–13, is that the cost of this algorithmic task is exponential in circuit size. Indeed, simulation methods have improved steadily over the past few years42–50. We expect that lower simulation costs than reported The future here will eventually be achieved, but we also expect that they will be Quantum processors based on superconducting qubits can now perform consistently outpaced by hardware improvements on larger quantum computations in a Hilbert space of dimension 253 ≈ 9 × 1015, beyond the processors. reach of the fastest classical supercomputers available today. To our knowledge, this experiment marks the first computation that can be performed only on a quantum processor. Quantum processors have Verifying the digital error model thus reached the regime of quantum supremacy. We expect that their A key assumption underlying the theory of computational power will continue to grow at a double-exponential is that quantum state errors may be considered digitized and local- rate: the classical cost of simulating a quantum circuit increases expo- ized38,51. Under such a digital model, all errors in the evolving quantum nentially with computational volume, and hardware improvements will state may be characterized by a set of localized Pauli errors (bit-flips or probably follow a quantum-processor equivalent of Moore’s law52,53, phase-flips) interspersed into the circuit. Since continuous amplitudes doubling this computational volume every few years. To sustain the are fundamental to quantum mechanics, it needs to be tested whether double-exponential growth rate and to eventually offer the computa- errors in a quantum system could be treated as discrete and probabil- tional volume needed to run well known quantum algorithms, such as istic. Indeed, our experimental observations support the validity of the Shor or Grover algorithms25,54, the engineering of quantum error this model for our processor. Our system fidelity is well predicted by a correction will need to become a focus of attention. simple model in which the individually characterized fidelities of each The extended Church–Turing thesis formulated by Bernstein and gate are multiplied together (Fig. 4). Vazirani55 asserts that any ‘reasonable’ model of computation can be To be successfully described by a digitized error model, a system efficiently simulated by a Turing machine. Our experiment suggests should be low in correlated errors. We achieve this in our experiment by that a model of computation may now be available that violates this

Nature | Vol 574 | 24 OCTOBER 2019 | 509 Where to look next?

NISQ era: Challenging for fault tolerate quantum computing Need to find ways to deal / live with noise qFlex: state of art classical simulator √ Quantum Supremacy: random circuit h4ps://github.com/ngnrsaa/qflex [Villalonga et. al., accepted on NPJ QIP 2019 Two other (out of many) NASA QuAIL efforts: • Advanced qubit-routing (quantum circuit compilation) • Variational quantum heuristics for optimization and quantum chemistry — like Quantum Alternating Operator Ansatz (QAOA), Variational Quantum Eigensolver (VQE) Where to look next?

NISQ era: Challenging for fault tolerate quantum computing Need to find ways to deal / live with noise

√ Quantum Supremacy: random circuit

Two other (out of many) NASA QuAIL efforts: • Advanced qubit-routing (quantum circuit compilation) • Variational quantum heuristics for optimization and quantum chemistry — like Quantum Alternating Operator Ansatz (QAOA), Variational Quantum Eigensolver (VQE) Noise-aware qubit routing / circuit compilation Circuit for an algorithm is usually written at a hardware-agnostic level — Non-commuting gates should obey the ordering dictated by the algorithm (circuit) — When a set of 2-qubit gates commute but cannot be executed at the same time, E.g., [ZZ(q1, q2), ZZ(q2, q3)] = 0 Does not matter ideally, ordering can affect circuit length/depth, and therefore the effect of noise.

— On hardware with restricted qubit connectivity, SWAPs are needed. Noise-aware qubit routing / circuit compilation Circuit for an algorithm is usually written at a hardware-agnostic level — Non-commuting gates should obey the ordering dictated by the algorithm (circuit) — When a set of 2-qubit gates commute but cannot be executed at the same time, E.g., [ZZ(q1, q2), ZZ(q2, q3)] = 0 Does not matter ideally, ordering can affect circuit length/depth, and therefore the effect of noise.

— On hardware with restricted qubit connectivity, SWAPs are needed. Pioneered temporal planning & for compilation for NISQ devices - Collaborating with domain experts at NASA, utilizing state of the art temporal planners

For superconducting qubits — Taking Rigetti and Google h/w constraints — Minimizing makespan (gate-duration aware) — Including Crosstalk — Logical-to-Physical mapping/allocation

Venturelli, Do, Rieffel, Frank., Quantum Science and Technology (2018) Noise-aware qubit routing / circuit compilation

Figure 7 Compiled graph coloring. (LeF) Qubit alloca.on for 3-coloring of a square graph on Rigem Aspen 16Q. The grey qubits are Do et. al., Planning for quantum circuit compilation for ini.ally not assigned with logical qubits. (center) and (right) Illustra.on of a compiled quantum circuits found by temporal planning tool TFD. graph coloring, In preparation Ver.cal line between two qubits indicate two-qubit gates: with arrows as end points for SWAPs, filled circles as end points for UZZ gates, while a PioneeredUXY gate is indicated with color transi.ons. Legend in each box along with the color (edge color in case of a SWAP) encodes the logical qubit at temporal planning & for compilation for NISQ devices - the moment of opera.on. Collaborating withFor example in (center) at .me 0, Q2 represents V2B and Q3 is V2G, aFer a swap from t=0 to 1 Q2 becomes V2G and domain experts at NASA, utilizing state of the art temporal planners Q3 becomes V2B. (b) Using conven.onal synthesis, UZZ gate has been approximated to take 3 units of .me. The total makespan of the circuit is 31. (right) Using na.ve UZZ and XY (unitary dura.on) the makespan for the same circuit is only 16. For superconducting qubits — Taking Rigetti and Google h/w constraints — Minimizing makespan (gate-duration aware) — Including Crosstalk — Logical-to-Physical mapping/allocation

Venturelli, Do, Rieffel, Frank., Quantum GATEScience and Technology (2018)UZZ COMPILATION UXY COMPOSITION

SWAP

Figure 8 SWAP gate construcCon using UZZ or UXY. The SWAP gate can be implemented (leF) as 3 CZ gates and 6 H gates, or (right) as 3 iSWAP gates U1001 (π) and 3 single qubit rota.ons about the X-axis. However, if the SWAP acts on a set of qubits which belong to the same “color” then it can be simplified with a UXY(π/2) opera.on since they are equivalent in the symmetric subspace – effec.vely reducing the SWAP dura.on to that of a single two-qubit gate. These preliminary results represent the baseline for our investigations, that will be considering the empirical performance of multiple different schedules, employing realistic gate durations, with additional weight related to their fidelities (action costs in the temporal planning formalism) and will consider the initial allocation as part of the compilation [Booth18]. Fine-tuning the compilers to serve the specific application of interest for SAAM is essential for the benchmarking aspect of the proposal, described next.

Benchmarking — Performance success during both phases will be evaluated on the following three different criteria:

▪ Reliability of the hardware in terms of fidelity, precision and operational timings. The last 5 years have seen a proliferation of ideas for metrics of performance for QPUs during the NISQ eras (see [Cross18] for quantum volume). As thoroughly discussed in the recent paper

HR001119S0052 Volume 1 14 Noise-aware qubit routing / circuit compilation

Figure 7 Compiled graph coloring. (LeF) Qubit alloca.on for 3-coloring of a square graph on Rigem Aspen 16Q. The grey qubits are Do et. al., Planning for quantum circuit compilation for ini.ally not assigned with logical qubits. (center) and (right) Illustra.on of a compiled quantum circuits found by temporal planning tool TFD. graph coloring, In preparation Ver.cal line between two qubits indicate two-qubit gates: with arrows as end points for SWAPs, filled circles as end points for UZZ gates, while a PioneeredUXY gate is indicated with color transi.ons. Legend in each box along with the color (edge color in case of a SWAP) encodes the logical qubit at temporal planning & for compilation for NISQ devices - the moment of opera.on. Collaborating withFor example in (center) at .me 0, Q2 represents V2B and Q3 is V2G, aFer a swap from t=0 to 1 Q2 becomes V2G and domain experts at NASA, utilizing state of the art temporal planners Q3 becomes V2B. (b) Using conven.onal synthesis, UZZ gate has been approximated to take 3 units of .me. The total makespan of the circuit is 31. (right) Using na.ve UZZ and XY (unitary dura.on) the makespan for the same circuit is only 16. For superconducting qubits — Taking Rigetti and Google h/w constraints — Benchmarking against analytical bounds — Minimizing makespan (gate-duration aware) — Demonstrating effect of native gate sets — Including Crosstalk — Raising challenges and inspiring new — Logical-to-Physical mapping/allocation planning/scheduling algorithms

Venturelli, Do, Rieffel, Frank., Quantum Generalized swap networks for near-term GATEScience and Technology (2018)UZZ COMPILATION quantumUXY computing, COMPOSITION B O’Gorman 2019]

SWAP

Figure 8 SWAP gate construcCon using UZZ or UXY. The SWAP gate can be implemented (leF) as 3 CZ gates and 6 H gates, or (right) as 3 iSWAP gates U1001 (π) and 3 single qubit rota.ons about the X-axis. However, if the SWAP acts on a set of qubits which belong to the same “color” then it can be simplified with a UXY(π/2) opera.on since they are equivalent in the symmetric subspace – effec.vely reducing the SWAP dura.on to that of a single two-qubit gate. These preliminary results represent the baseline for our investigations, that will be considering the empirical performance of multiple different schedules, employing realistic gate durations, with additional weight related to their fidelities (action costs in the temporal planning formalism) and will consider the initial allocation as part of the compilation [Booth18]. Fine-tuning the compilers to serve the specific application of interest for SAAM is essential for the benchmarking aspect of the proposal, described next.

Benchmarking — Performance success during both phases will be evaluated on the following three different criteria:

▪ Reliability of the hardware in terms of fidelity, precision and operational timings. The last 5 years have seen a proliferation of ideas for metrics of performance for QPUs during the NISQ eras (see [Cross18] for quantum volume). As thoroughly discussed in the recent paper

HR001119S0052 Volume 1 14 Noise-aware qubit routing / circuit compilation

Figure 7 Compiled graph coloring. (LeF) Qubit alloca.on for 3-coloring of a square graph on Rigem Aspen 16Q. The grey qubits are ini.ally not assigned with logical qubits. (center) and (right) Illustra.on of a compiled quantum circuits found by temporal planning tool TFD. Ver.cal line between two qubits indicate two-qubit gates: with arrows as end points for SWAPs, filled circles as end points for UZZ gates, while a PioneeredUXY gate is indicated with color transi.ons. Legend in each box along with the color (edge color in case of a SWAP) encodes the logical qubit at temporal planning & for compilation for NISQ devices - the moment of opera.on. Collaborating withFor example in (center) at .me 0, Q2 represents V2B and Q3 is V2G, aFer a swap from t=0 to 1 Q2 becomes V2G and domain experts at NASA, utilizing state of the art temporal planners Q3 becomes V2B. (b) Using conven.onal synthesis, UZZ gate has been approximated to take 3 units of .me. The total makespan of the circuit is 31. (right) Using na.ve UZZ and XY (unitary dura.on) the makespan for the same circuit is only 16. For superconducting qubits

GATE UZZ COMPILATION UXY COMPOSITION

SWAP

Figure 8 SWAP gate construcCon using UZZ or UXY. The SWAP gate can be implemented (leF) as 3 CZ gates and 6 H gates, or (right) as 3 iSWAP gates U1001 (π) and 3 single qubit rota.ons about the X-axis. However, if the SWAP acts on a set of qubits which belong to the same “color” then it can be simplified with a UXY(π/2) opera.on since they are equivalent in the symmetric subspace – effec.vely reducing the SWAP dura.on to that of a single two-qubit gate. These preliminary results represent the baseline for our investigations, that will be considering the empirical performance of multiple different schedules, employing realistic gate durations, with additional weight related to their fidelities (action costs in the temporal planning formalism) and will consider the initial allocation as part of the compilation [Booth18]. Fine-tuning the compilers to serve the specific application of interest for SAAM is essential for the benchmarking aspect of the proposal, described next.

Benchmarking — Performance success during both phases will be evaluated on the following three different criteria:

▪ Reliability of the hardware in terms of fidelity, precision and operational timings. The last 5 years have seen a proliferation of ideas for metrics of performance for QPUs during the NISQ eras (see [Cross18] for quantum volume). As thoroughly discussed in the recent paper

HR001119S0052 Volume 1 14 2 QCC for NISQ Devices q1 In the circuit model of quantum computation, a quantum q2 Idealized Quantum q3 algorithm is expressed conceptually as a logical quantum Circuit: sequential gates, q4 Noise-aware qubit routing / circuit compilation circuit, consisting of a series of quantum operations called no hardware … constraints quantum logic gates. Quantum processors are physical de- q8 vices that implement these quantum logic gates so that the q2 q Beyond superconducting qubit platform, desired quantum operations can be carried out on the quan- 1 q3 tum states stored in the qubits. In simple cases, the quantum Idealized Quantum noise and gate infidelity are posing other q logic gates directly correspond to physical quantum gates Hardware: commuting q8 4 on the quantum processor, but more typically the processor gates, no hardware challenges. We are looking to constraints has physical constraints that prevent a quantum logic circuit, q q — Develop hardware specific models and goals describing the desired algorithm, from being directly imple- 7 5 q6 — Further exploiting planners for compilation mented. n2 NISQ Chip: nearest n At a high level, these constraints can be classified into 1 X n3 neighbor constraints, two types: (1) gate set constraints (i.e., those that specify different gate durations (red/blue arcs), the set of logic gates the processor is capable of applying), n n swap gates (curved arcs), 8 X 4 and (2) geometric constraints (i.e., those that specify upon crosstalk constraints (Xs), which sets of qubits the available logic gates can be applied, initialization n n limited by, for example, processor connectivity). Although qi -> ni (dashed arc) 7 5 n these constraints differ among quantum processors, quan- Pioneered temporal6 planning & for compilation for NISQ devices tum algorithms can be re-expressed respecting the processor constraints with polynomial overhead in the number of gates - Collaborating with domain experts at NASA, utilizing state of the art temporal planners (Brierley 2017). As such, for theoretical algorithmic work, Figure 1: Pictorial view of QCC-NISQ concepts. At the the design of logical quantum circuits without concern for highest level, an idealized quantum circuit specifies a se- the implementation constraints of physical devices is suf- quence of quantum logical gates over qubit states that ficient. However, to implement a quantum algorithm on an solves a specified problem (top). The idealized quantum cir- actual device, these constraints must be efficiently addressed cuit could conceptually be implemented on fully-connected to take full advantage of NISQ processors. quantum hardware in which gates can be carried out between In this work, we focus on a particular approach to address- all pairs of physical qubits, which is depicted here by a fully- ing geometric constraints associated with processor connec- connected graph. Qubit states in an idealized quantum cir- tivity. The approach maps logical qubits to physical qubits cuit are mapped onto physical qubits in the fully-connected on the processor and iteratively updates the mapping through architecture. At this level, gates are specified between phys- the insertion of additional gates in the course of the com- ical qubits and can be executed in parallel if they do not in- putation so as to enable the logical operations to be imple- volve the same qubit (middle). In an actual NISQ chip, phys- mented respecting the physical contraints. This problem is ical gates can only be carried out between a subset of pairs often referred to as “quantum compilation,” though quan- of qubits, usually nearest neighbors in a 1D or 2D array. To tum compilation usually involves addressing gate set con- carry out 2-qubit gates specified in idealized quantum cir- straints as well as geometric (e.g., connectivity) constraints. cuits between qubits that are not connected, swap gates are Another simple constraint is that gates involving the same added to route logical qubit states to physical qubits that are qubit cannot be executed in parallel. A generalization of this connected so that the desired gates can be applied (bottom). constraint is a “cross-talk” constraint that may prevent gates in physical proximity from being executed at the same time (Booth et al. 2018). critical to minimize the duration of compiled circuits. Thus, QCC-NISQ frequently requires adding supplementary compilation is challenging due to: the parallel execution of operations supported by the hardware to those specified gates with different durations, the planar or quasi-planar in the idealized circuit. Current superconducting quantum topology of the qubit locations on the chip, the ordering processors have planar architectures with connections only constraints from the original idealized circuit, as well as between nearest-neighbor locations (qubits), resulting in additional constraints such as cross-talk. restrictions as to where gates can be applied. Specifically, a gate can operate only on qubit states located on adjacent Example: Figure 1 shows a concrete QCC example requir- qubits on the chip. To compensate for the nearest-neighbor ing gate operations gA = (q1,q2) and gB = (q3,q8). At limitation, swap gates can move qubit states between the top, the algorithm is specified,G as is typicallyG done in the connected qubits to reach a configuration where the desired gate-model quantum computing literature, as an idealized gate, specified in the idealized circuit, can be applied. Cur- quantum circuit, with sequential specifications of 2-qubit rent quantum computational hardware suffers greatly from gates over qubit states. There are no hardware constraints; decoherence (akin to noise), which degrades the fidelity of further, some ordered pairs of gates in the idealized circuit the computation (Bishop 2017). In NISQ processors, deco- may commute (i.e., could execute in arbitrary order, even herence is intimately linked to the duration of the executed simultaneously, and still produce correct results). A fully- circuit that carries out the quantum computation, so it is connected quantum hardware, with no hardware constraints Where to look next?

NISQ era: Challenging for fault tolerate quantum computing Need to find ways to deal / live with noise

√ Quantum Supremacy: random circuit

Two other (out of many) NASA QuAIL efforts: • Advanced qubit-routing (quantum circuit compilation) • Variational quantum heuristics for optimization and quantum chemistry — like Quantum Alternating Operator Ansatz (QAOA), Variational Quantum Eigensolver (VQE) Quantum Computing for NASA Applications

Objective: Find “better” solution • Faster • More precise • Not found by classical algorithm Anomaly Detection Data Analysis and Decision and Data Fusion Making

V&V and Air Traffic Optimal Management Sensor Placement

Robust Network Design for UAV

Topologically- aware Parallel Computing

Mission Planning, Scheduling, and Coordination

Image from P. Kopardekar et. al., Unmanned Aircraft System Traffic Management (UTM) Concept of Operations, DASC 2016

Common Feature: Intractable problems on classical supercomputers Biswas, SMC-IT, 28 Sept 2017 3

W () symmetry implies that if ↵ is an eigenvalue of W (), then its complex conjugate ↵⇤ is also an eigenvalue + e i⇡X/n e i⇡X/n | i ··· of W (); the corresponding eigenstates are denoted by w and w ,respectively.Whenrestrictedtothe i⇡X/n i⇡X/n ↵ ↵⇤ + e e | i | i | i iC iC ··· 2-dimensional subspace ↵ spanned by w↵ , w↵⇤ , e e S {| i | i} . . . . and written in the basis w+ , w ,where . . . . {| i | i} 1 i⇡X/n i⇡X/n w = w w , (6) + e e ↵ ↵⇤ | i ··· | ± i p2 | i ±| i ⇣ ⌘ W () has the matrix representation FIG. 1. To map the input state to a state having large overlap 1/2 Quantum Approximate Optimizationwith the target, Algorithms the unitary W () is repeated (QAOA) for (N ) 0 arg(↵) times. O W ()=exp i . (7) ↵ arg(↵)0 S  ✓ ◆ The unitary W () thus drives a closed transition between is that B acts only on individual spins, so it is easier w with the transition rate arg(↵). To drive a full tran- and more ecient to implement. The input state of our | ± i n sition, one needs to repeat W () for roughly ⇡/(2 arg(↵)) algorithm is the usual one, the tensor product + ⌦ , | i times. the joint +1 eigenstate of all the Xj operators, and the 1 n n Let b = + ⌦ ⌦ .WeshowinSec.VII even superposition of all bit strings, | ± i p2 | i ±|i that for the eigenvalues ↵ and ↵ exponentially close to ⇤ n 1 1 but not equal to 1, w↵ and w↵⇤ have large over- in = + ⌦ = s . (3) laps with 1 0 i b| ,i respectively.| i In other words, | i | i pN | i p2 + s 0,1 n | i ± | i 2{ } w and w have large overlaps with 0 and i b , X + + respectively,| i | soi the algorithm drives b | iclose to| thei We can simplify the analysis, following Farhi et al. [14], | + i by working in a basis in which the target state is 0 = target state 0 . The value of arg(↵) has to be exponen- | i tially small| in in, otherwise, our algorithm would have 0 00 .SincethedriverB and the initial state in remain| ··· thei same when any subset of the n qubits| arei beaten the optimal query complexity of Grover’s algo- flipped, the problem can be converted to finding the bit rithm. Hereafter, ↵ will refer to this specific eigenvalue. The initial state (3) can be written as string 0 using the oracle C0 with the same driver B.

Doing so drastically simplifies our analysis: the state n 1 n = + ⌦ = b + b ; (8) Farhi, Goldstone,0 , and and the initialGutmann, state + ⌦ , arXiv:1411.4028 are in the (n + 1)- in + | i | i | i | i p2 | i | i dimensional symmetric subspace (under permutations of ⇣ ⌘ qubits), and the evolution under both B and C0 preserves note that b is a dark state, i.e., W () b = b . | i | i | i this subspace, so we only need to consider that (n + 1)- For 0 w+ b+ w 1, the output state is |h | i| ' |h | i| ' dimensional subspace instead of the whole Hilbert space approximately n Cost Hamiltonian Mixer Hamiltonian of dimension 2 . To simplify notation, we will omit the 1 subscript in C0 hereafter, i.e., C C0. out 0 + b , (9) ⌘ | i'p2 | i | i The building block of our algorithm is a simple product ⇣ ⌘ encoding cost function generating transitions of unitaries generated by B and C, and the probability of finding the target state 0 is ap- proximately 1/2. | i i⇡B/n iC i⇡B/n iC 5 between different states W ()=e e e e , (4) In Sec. VII, we derive approximate results for our al- gorithm in the large-n limit.fully determines For the= state⇡, we , find since the that spin coherent Z Z where (0, ⇡] is a free parameter. The intuition that +pN/n 3 i✓B/2 | i n 4 2 states1/4 e 0 for ✓ [0, 2⇡) are over-complete for 2 i⇡B/n pN/n 0 w+ (1 ⇡ /2n) in Eq. (58)(thefidelityis2 we choose the angle of the rotation e can be found the symmetric subspace; the advantage of this represen- smaller|h | i for| ' = ⇡). See Fig. 2(a) ↵ for a comparison in Sec. V. The algorithm repeatedly applies the unitary tation is that both B and C can be expressed concisely. Variational parameters: γ , γ , ⋯, β , β , ⋯ of analytical and6 numericalFor even results.n,the function We also satisfies find the periodic that boundary ( 1 2 1 2 ) W () for ⇥(pN) times (see Fig. 1). The relevant eigen- 1 condition Y b+ w 1 NY in Eq. (67). See Fig. 2(b) for values of the unitary W () determine the query complex- |h | i| ' , 2⇡ = 0 ei⇡B a comparison of analytical and| i numerical results. Fur- n (24) ity of our algorithm, while the corresponding eigenvectors =(⌦ 1) 0 ↵ =1/2 , 0 . thermore, we calculate that arg(↵) 4 ph 2|Ni |(1i determine the probability of success. We willX show that X ' Work principle: constructive interference ⇡2/2n)1/4 in Eq. (66). FigureFor the initial3(a) stateshows in Eq. a (3 comparison), we have the relevant eigenvalues are the ones closest to 1, but(a) not (b) of analytic and numerical results. Considering that thein✓/2 i✓B/2 e equal to 1. in , ✓ = 0 e in = . (25) FIG. 4. Spin coherent statesuccess representation: probabilities (a) for b+ , of our algorithm| i is about 1/2 andpN The unitary W () has a time-reversal-like symmetry | i where pN/n is the order ofeach the expansion iteration coeWcients() in calls the oracle twice,⌦ the ↵ average 3/4 For the target state 0 ,wehave Eq. (12); (b) for b0 , where n is the order of the ex- | i ⇤W ()⇤ = W () , pansion coecients(5)| ini Eq. (17query). complexity of our algorithm is Case shown speedup: reproduces quantum speedup in Grover’s† † 0 , ✓ = 0 ei✓B/2 0 = cos(✓/2)n . (26) 2⇡ | ⇡i i⇡B/n iB/n/⌦2 2 iC ↵ where ⇤ = e Z Z Z with Z being the Pauli- n/2 T (n) The unitaries e 2 and, e take simple(10) forms, 1 2 n j Thus, the unitary W () approximately drives a tran-' arg(↵) ' 2p2 algorithm for needle-in-a-haystack search ··· sition between b+ and b0 with the rate ⌘. Applying iB/2 Z operator of the j-th qubit. Equation (5) holds gener- e , ✓ = , ✓ , (27) the unitary W|()i for order| i n/⌘ times, one can drive | i | i which di↵ers from the optimali valueC presented in Ref. [12] ally for Hamiltonians based on classicalthe cost state functions,b+ to a state close to b0 . The probability e , ✓ | i | i | i (28) Hamiltonians diagonal in the computationalof finding basis. the target This state withby ab0 factoris only of polynomiallyp2. i n | i = ,✓ +(e 1) , 0 cos(✓/2) . small in n as opposed to the exponentially small value | i | i with b+ , achieving the quadratic speedup in Grover’s | i For the discrete angles ✓k =2k⇡/n,weintroducethe algorithm up to a logarithmic factor. notation Zhang, Rieffel,Although and the case Wang,1 is illustrative, PRA it requires 2017 ⌧ ik⇡B/n a logarithmically many more calls to the oracle than k = 0 e . (29) Grover’s original algorithm, and the probability of find- | i ⌦ ↵ ing the target state is small. Since ⌘ is exponentially The function of 0 will be used frequently, and we denote it as | i small in n, both b+ and b0 are close to eigenvec- tors for eigenvalues| exponentiallyi | i close to 1. This anal- ⇠ 0 = cos(k⇡/n)n . (30) ysis suggests concentrating on the subspace spanned by k ⌘ k | i

w↵ , w↵⇤ ,where↵ and ↵⇤ are the eigenvalues clos- We will used the following identity repeatedly, est{| toi 1.| Indeed,i} we show in Sec. VII that one can in- crease the success probability and reduce the number of n 1 k 2 2n calls to the oracle by setting = ⇡. In Fig. 3(b), arg(↵) ( 1) ⇠k = n 0 b+ = . (31) h | i N is plotted as a function of . The reason behind why kX=0 = ⇡ performs the best (or why it even works) seems For discrete angles, Eqs. (27) and (28) becomes unclear without a tedious calculation. We give this cal- culation in Sec. VII, after introducing a “phase space” i⇡B/n k e = k 1 , (32) representation that will be useful in that analysis. | i | i iC i e = +(e 1) ⇠ . (33) k | i k | i 0 | i k VI. PHASE SPACE REPRESENTATIONS For the eigenstates ofB with eigenvalues n,wehave ± k 1/2 k bn = k b n =( 1) N , (34) We introduce a novel phase space representation in | i | i n n this section, which is essential in the following section where bn = in = + ⌦ and b n = ⌦ . | i | i | i | i | i to the analytical solution of the success probability and Since the discrete function of the states bn and b n the query complexity of our algorithm. The phase space are the same, it does not uniquely determine| i a state| ini representation is based on the inner products of a quan- the symmetric subspace with dimension n + 1. The dis- tum state with the spin coherent states we introduced in crete function, however, is unique in the orthogonal Sec. V. 1 space of b = p bn b n . We will restrict our Any state can be uniquely determined by | i 2 | i| i | i2HS discussions in that subspace, and b is a dark state the inner products 0 ei✓B/2 .The function 1 | i anyway. For b+ = bn + b n ,wehave | i p2 | i | i ⌦ i↵✓B/2 , ✓ = 0 e . (23) k 1/2 b = p2( 1) N . (35) | i k | + i ⌦ ↵ —> Symmetry Preserving Quantum Alternating Operator Ansatz (QAOA) Exploit local or global symmetries in the problem E.g., QAOA for constrained optimization Hadfield, Wang, O'Gorman, Rieffel, Venturelli, Biswas, arXiv 1709.03489 Design a mixer that contains the quantum evolution in the subspace that satisfies the constraints.

One-hot encoding:

Symmetry (constraints): preserve Hamming weight of subset 000 Standard mixer: X 110 ✓ 101 symmetry-preserving 100 011 010 ✓ ✓ mixer: XX+YY 001 111

Wang, Rubin, Dominy, Rieffel, arXiv:1904.09314 6 only quadratic fermionic couplings: We also point out that the Givens rotation network is a powerful tool for state preparation for general quadratic  Hamiltonians. For pairing models, the linear depth net- H = xx + yy XY c c+1 c c+1 work was used to prepare ground states [30]. This initial c=1 X state can be used in the context where the hard con- # straint is of the form that qubits must appear paired up.  We point this out as an example of how di↵erent flavor HXY =2 ac†ac+1 +h.c. , (20) constraints can correspond to evolving a wide variety of c=1 X constraint-preserving Hamiltonians. wherea ˆ anda ˆ† are fermionic operators, and we assumed  is even for simpler demonstration. The quadratic Hamiltonian in Eq. (20) can be diago- B. Linear depth simultaneous complete-graph mixer nalized by a basis rotation on the operators. For nearest- 6 neighbor, one-body coupling, the fermionic Fourier trans- form We consider the simultaneous mixer for a node, only quadratic fermionic couplings: We also point out that the Givens rotation network is a iH ,v  1 complete e ,withHcomplete,v = HXY,v,c,c0 , powerful tool for state preparation for general quadratic c

Small problems solved at low-depth with XY XY wins over X creases. Furthermore, for each level, we computed the 10 probability ofOptimal getting results for the Prism actual graph optimal-chromatic solution graphs of (a size validn as the benchmarking sets 1 for the XY mixers under consideration. See Table II for 3-coloring) upon measurement. At levelthe number one, of instances this proba- in each benchmarking set. bility is slightly0.9 lower than 0.2, and quickly goesn num. above graphs 0.6 3 5 12 at level-3, which implies that repeating the3 6 64 experimentn  num. graphs 0.8 3 7 475 4 4 6 3 times, one will find a valid coloring with4 6 26 probability4 6 6 4 7 282 4 8 6 > 0.9. 0.7 5 7 46

Approxim. ratio 6 7 5

0.6 Table II. Left: Benchmarking graph sets: each row indicates classical initial states all -chromatic graphs of size n, and we solve the problem W-state of -coloring of such graphs choosing  = .Right:Bench- marking graph sets II for examining the simultaneous vs par- 0.5 0 2 4 6 8 10 12 titioned ring mixers on di↵erent ring sizes: Each row indicates QAOA level all connected graphs of size n, and we solve the problem of (a) (b) -coloring of such graphs. Because the total number of qubits is n, which is the limiting factor to the simulation, we limit to small n to see  varying up to 8. Figure 6. The Prism graph, the expected value of QAOA op- Figure 3. Numerical results for level 1 QAOA on the problem timized over the angle sets. Triangles show the results with W-state as initial states. Circles show the results with a fea- 000 of 3-coloring of a triangle graph. (a) using X mixer along sible classical initial state, averaged over the set of all feasible classical states, the error bar is the standard deviation. For 1 1 with phase-separating Hamiltonian, Eq. (8) where the penalty each initial state, optimization over angles are derived from a 110 basin-hopping search. 0.99 weight is taken to be the numerically determined optimal 0.95 What graphs to color on NISQ era hardware? 1010.98

value ↵⇤ =1.7. (b) using the XY mixer with W-state be- 0.97

100complete complete initial state. Even at level 10, rclassical is still lower than 0.9 011 ing the• initialSize state. of search space rW for level-1. Moreover, the approximation ratio with 0.96 classical initial state shows a tendency toward satura- 010 0.85 0.95 Small & Hard graphs tion around level 10 – this could either be the nature 0.85 0.9 0.95 1 0.95 0.96 0.97 0.98 0.99 1 of the algorithm, or due to increasing diculty in find- 001 ring ring Penalty + X-mixer Full Hilbert space ing the global optimum in the parameter subspace as the (a) QAOA level-2 (b) QAOA level-8 level increases, which poses another practical considera- 111 the Envelope graph is the smallest hard-to-color graph XY mixer Feasible subspacetion for application. (Note that due to the optimization Figure 7. QAOA with simultaneous mixers. Performance for the largest-first(LF) sequential method. Note that over parameter space for each initial state, the average comparison between ring and complete-graph mixers applied • For a classical algorithm, there is a concept of over classical initial state is not equivalent to prepare the to the same graph coloring problems. The axes show approx- initial state in a mixed state for the ensemble). imation ratio achieved using the labeled mixer type. Scat- these classicalRatio: algorithms are aiming to The compute feasible the space Becauseshrinks our simulation exponentially is noise-free, due to ergodicity, ter plotwith shows then. results for 4-coloring of all connected smallest slightly-hard-to-color graph: applying the algorithmin the limitwill of p sometimesthe optimal performance should chromatic-4 graphs of size n =7.In(b),forbettervisi- chromatic number, while in this paper we focus on find- be independent of!1 the initial state. But for practical bility, an outlier data point at (ring = 0.95, complete = 0.9) ing the maximal colorable subgraph. Although finding implementation on a near-term hardware where noises is not shown in the plot. yield the optimal solution accumulates fast with circuit depth, such medium-level the max-colorable subgraph could serve as a subroutine QAOA behavior is of high relevance. In Appendix C we survey methods to generate quantum circuits for prepar- & [Wang,for determining Rubin, Dominy, chromatic Rieffel, numbers, arXiv:1904.09314] the chromatic num- ing W-states. It is shown that with certain methods it 1. Approximation ratio and probability-to-optimal-solution Figurecan 5. be generatedThe Prism with O() graph. CNOT gates. Dots The over- are approximation ratios ber can also be directly attacked by QAOA using a much and crossesall performance are of the QAOA expected will be a tradeo probability↵ between the Using of gettingW -state as the initial opti- state, for simultaneous smallest hard-to-color graph: applying the algorithm never extrayields e↵ort in preparing the W-stateoptimal and the damage that ring and complete-graph mixers, the mean and median more complex mixer.[2] Nevertheless we are not aiming mal coloring.comes with circuit For depth. each QAOA level, resultsof the approximation are shown ratio at as well the as the probability-to- optimal-solution is evaluated across problem sets. solution at doing side-by-side comparison of quantum and classi- (sub)optimal angles resulted from a basin-hoppingThe following observations search. have been made on the typ- C. Benchmarking graph sets ical performance for each problem set. cal algorithms, and will use these small graphs only as a a. Consistent performance over instances. For all To better understand the behavior of these QAOA problem sets, the approximation ratio and the proof-of-principle demonstration of the QAOA with XY graph-coloring algorithms, we make use of the sets of all probability-of-optimal-solution curves as a function of • Examples mixers. 2. E↵ect of initial states

The W-state – as both an even superposition of all fea- sible classical states, and the ground state of the simul- taneous ring mixer – is a natural candidate for the initial state for QAOA. It involves multiple two-qubit gates to prepare. An easier-to-prepare state for each vertex can be defined via a randomly-assigned coloring (feasible but not necessarily optimal), , i.e., a randomly drawn bit Figure 4. TheEnvelope two small and hard-to-colorPrism graphs: Envelope C string of Hamming weight| one.i Preparing such a state and Prism. A valid 3-coloring is shown on each graph. involves only n single-qubit gates. We study both initial states for the prism graph with simultaneous ring mixer. For level-1 QAOA, the best achievable optimization ratio (optimized over all angle 1. Performance of QAOA with the simultaneous ring mixer sets (, )) for W-state is higher than the classical Ham- ming weight 1 state C . Notice that for C ,the With the simultaneous ring mixer, Figure. 5 shows phase-separating unitary| i commutes with the density| i ma- the results for QAOA levels 1 to 6. For each level, the trix of the state, hence has no e↵ect to the state evolu- W-state is used as initial state, and stochastic search tion. As a result, the whole circuit for level-1 QAOA (basin-hopping with BFGS) is performed to optimize the is equivalent to applying the mixing unitary followed by expected value of the cost Hamiltonian over the angle measurement. We further simulated higher levels, and in sets. The approximation ratio corresponding to the op- Figure. 6 show the performance of QAOA with the W- timal expectation value is plotted as filled circles. Even state versus a classical state as initial state. We found at level one, the approximation ratio takes a high value that with the classical initial state, the performance of 0.8, and this value quickly approaches 1 as the level in- QAOA is significantly lower than using the W-state as Symmetry-preserving QAOA circuits for noise characterization

100 010 110 101 001 X Thanks to the feature Eq. (6), and for any Hamming weight h and any QAOA level l,[Ul, H=h] = 0, 011 (this is reflecting the fact that the noiseless QAOA circuit preserves Hamming weight). WeP thus have i i i U † ( )U = , that is, ( ) stays invariant under QAOA unitaries. l E Pfea l Pfea E Pfea i i fea =Tr[⇢0 ( fea)] (8) 000 111 hP i E P For higher level QAOA, ( ( i ) ) = 3 c(l) again commute with the unitaries, we thus E E ···E Pfea ··· h=0 h PH=h have (a) pnoise =0.01 (b) pnoise =0.05 P i Figure 1: Probability of finding a feasiblei state when using XY-QAOA for various system sizes and number =Tr[⇢0U † (U † (U † (U † ( )Ul) U3)U2)U1] (9) Subspace for one vertex hPfeaQAOAi blocks.1 E We2 hereE 3 added···E a locall E P noisefea channel··· after each QAOA layer for each qubit. =Tr[⇢ ( ( ( i )) )] (10) Noise exponentially shrinks the 0E E ···E Pfea ··· 2.3 Back to the feasible subspace (l) probability of staying in the desiredwhich is independent of the QAOA unitaries, and the exact coecients ch are determined soly by the initial state and the noise level,Probability andBy can using be a exactly circuit of withstaying computed two ancillary asin Eq. the qubits (4). desired (for the 3-colorable subspace problem) for each logical qubit, see Fig. 2, we subspace with problem size and QAOA can make sure that all non-feasible states are mapped back to the feasible subspace. This circuit however does is exactlynot localize thecomputable qubit on which a errorunder occurred noise— but replaces can an non-feasible be state with ANY feasible state. level 2.1.3 More general noiseSo models we might end up in a di↵erent state than we had before the error occurred. The probability of staying usedin the for feasible experimental subspace is pretty muchnoise independent characterization of the number of QAOA blocks and just determined In the previous section, we usedby the the noise depolarizing probability on channel the ancillary for analyzing qubits. ZW: theThis probability is in flavor of of error staying correction in up the to the symmetry of feasible subspace. The reasonpermutation. that we were We’ll able further to look do this into that analysis analytically was related to the fact that the depolarizing channel perseveres the symmetry in the weights of the projection operator and that it is self-dual. However, this is not a feature of the depolarizing channel only, but holds for most local Kraus Symmetry-preserving QAOA circuitsmaps. For for an noise arbitrary characterization, local2.3.1 noise channel Circuit Streif, design Rieffel, principle Wang, in preparation We can start with identifying the minimum number of correcting operators that are sucient to bring all states to the feasible subspace. In k = 3 case, only 4 operations are needed, for example, I, X ,X,XX , see syndromes ✏(⇢):⇢ Ki⇢Ki† with KiKi† 1 and Ki† = Ki, 1 3 1 3 and! operations shown in Table 4. The 4 syndroms can be encoded into 2 ancilla qubits, a and a . Note that certain i i 1 2 statesX fit in multiple syndromX sets, for example, 000 can be fixed by either X3 or X1, as shown in Figure. 4. Such | i we can do the same calculation.syndrom For grouping the example leads to di of↵erent 3 colors, circuits. one Figure term 2 shows of the the projector former, and onto Figure the 5 shows feasible the latter. Note that in states reads any grouping, the action of X1 and X3 corresponds to ancilla states a2 =1anda1 = 1, respectively, hence can be applied sequentially, and one can apply only one ancilla and recycle it after correcting for one syndrome.

K K K 001 001 K†K†K† i j l | ih | i j l 2.3.2Xijk Beyond k =3 = K 0 0 K† K 0 0 K† K 1 1 K† Similar to ik|=ih 3 case,| i for⌦ anyj |k ihand| targetedj ⌦ subspacel | ih | of anyl Hamming weight d,bycountingminimalnumberof correctingijk operation needed and choose a syndrom grouping, we can determine the number of ancilla qubits needed, andX design a circuit to bring the system to the feasible subspace. For k =4,ifHam = 1 is targeted, there are 7 syndromes= cijk andijk thereforeijk 3 ancillawith qubitscijk = arecjik sucient; if Ham = 2 is the feasible subspace, then only 4 syndromes and hence 2 ancila| ih qubits| are needed. Xijk For P =(100 100 + 010 010 + 001 001 ), this reads | ih | | ih2.4| Syndrome| ih | measurements The above error correcting circuit bring the non-feasible states back to the feasible subspace, but not necessarily to K K K PK†K†K† the statei beforej l thei error.j Insteadl of using this approach, we also can also use a adaption of a stabilizer codes, which canXijk localize a single bit flip error in the circuit and correct for it. The strategy is the following: We use an additional = c kij kij + c ikj ikj + c ijk ijk . kij | ih | ikj | ih | ijk | ih | Xijk (Note: This is not worked out yet, but it seems that the calculation in the previous section holds for most Kraus maps, even for time-dependent ones. The combinatorics however change.) 4 2.2 Pen-QAOA For pen-QAOA, the above arguments do not hold anymore, since the mixing Hamiltonian and the projection operator on the feasible subspace do not commute. We therefore simulate the quantum circuit. We therefore first find optimal angles (, ) (1-level-QAOA) in a noise-free situation. Instead of optimizing the energy expectation value of the cost Hamiltonian, we here directly optimize the success probability of finding a feasible state, i.e.

2 min i QAOA(, ) . , 0 | h | i | 1 i | i2XHfeas @ A Afterwards we use the found parameters (⇤, ⇤) for the simulation with noise (pnoise > 0) to calculate pfeas.

3 Summary

Things we study to live with noise in NISQ era

• Using planning tools for qubit routing / quantum circuit compilation

• Symmetry-preserving QAOA circuits for more efficient optimization

• Symmetry-preserving QAOA circuits for noise characterization QuAIL NASA