Strategic Quantum Ancilla Reuse for Modular Quantum Programs Via Cost-Effective Uncomputation

SQUARE: Strategic Quantum Ancilla Reuse for Modular Quantum Programs via Cost-Effective Uncomputation Yongshan Ding1, Xin-Chuan Wu1, Adam Holmes1,2, Ash Wiseth1, Diana Franklin1, Margaret Martonosi3, and Frederic T. Chong1 1Department of Computer Science, University of Chicago, Chicago, IL 60615, USA 2Intel Labs, Intel Corporation, Hillsboro, OR 97124, USA 3Department of Computer Science, Princeton University, Princeton, NJ 08544, USA Abstract—Compiling high-level quantum programs to ma- implemented to ensure operation fidelity is met for arbitrarily chines that are size constrained (i.e. limited number of quantum large computations. bits) and time constrained (i.e. limited number of quantum One major challenge, however, facing the QC community, is operations) is challenging. In this paper, we present SQUARE (Strategic QUantum Ancilla REuse), a compilation infrastructure the substantial resource gap between what quantum computer that tackles allocation and reclamation of scratch qubits (called hardware can offer and what quantum algorithms (for classi- ancilla) in modular quantum programs. At its core, SQUARE cally intractable problems) typically require. Space and time strategically performs uncomputation to create opportunities for resources in a quantum computer are extremely constrained. qubit reuse. Space is constrained in the sense that there will be a limited Current Noisy Intermediate-Scale Quantum (NISQ) computers and forward-looking Fault-Tolerant (FT) quantum computers number of qubits available, often further complicated by poor have fundamentally different constraints such as data local- connectivity between qubits. Time is also constrained because ity, instruction parallelism, and communication overhead. Our qubits suffer from decoherence noise and gate noise. Too heuristic-based ancilla-reuse algorithm balances these consider- many successive operations on qubits results in lower program ations and fits computations into resource-constrained NISQ or success rates. FT quantum machines, throttling parallelism when necessary. To precisely capture the workload of a program, we propose Due to space and time constraints, it is critical to find effi- an improved metric, the “active quantum volume,” and use cient ways to compile large programs into programs (circuits) this metric to evaluate the effectiveness of our algorithm. Our that minimize the number of qubits and sequential operations results show that SQUARE improves the average success rate of (circuit depth). Several options have been proposed [8]–[14]. NISQ applications by 1.47X. Surprisingly, the additional gates Among the options, one approach not yet well studied is to for uncomputation create ancilla with better locality, and result in substantially fewer swap gates and less gate noise overall. coordinate allocation and reclamation of qubits for optimal SQUARE also achieves an average reduction of 1.5X (and up to reuse and load balancing [15]. Reclaiming qubits, however, 9.6X) in active quantum volume for FT machines. comes with a substantial operational cost. In particular, to Index Terms—quantum computing, compiler optimization, re- obey the rules of quantum computation, before recycling a versible logic synthesis used qubit, additional gate operations need to be applied to “undo” part of its computation. I. INTRODUCTION In this paper, we propose the first automated compila- Thanks to recent rapid advances in physical implementation tion framework for such strategic quantum ancilla reuse arXiv:2004.08539v2 [quant-ph] 25 Jun 2020 technologies, quantum computing (QC) is seeing an exciting (SQUARE) in modular quantum programs that could be read- surge of hardware prototypes from both academia and in- ily applied to both NISQ and FT machines. SQUARE is a dustry [1]–[4]. This phase of QC development is commonly compiler that automatically determines places in a program to referred as the Noisy Intermediate-Scale Quantum (NISQ) perform such uncomputation in order to manage the trade-offs era [5]. Current quantum computers are able to perform on in qubit savings and gate costs and to optimize for the overall the order of hundreds of quantum operations (gates) using resource consumption. tens to hundreds of quantum bits (qubits). While modest in Optimally choosing reclamation points in a program is scale, these NISQ machines are large and reliable enough to crucial in minimizing resource consumption. This is because perform some computational tasks. Looking beyond the NISQ reclaiming too often can result in significant time cost (due to era, quantum computers will ultimately arrive at the Fault- more gates dedicated to uncomputation). Likewise, reclaiming Tolerant (FT) era [6], [7], where quantum error correction is too seldom may require too many qubits (e.g. fail to fit the program in the machine). For example, Figure 1 shows how Corresponding author: [email protected] qubit usage changes over time for the modular exponentiation step in Shor’s algorithm [16]. Unfortunately, finding the 300 optimal points in a program for reclaiming qubit could get Max qubits of machine 250 Coherence timeof qubits extremely complex [17], [18]. An efficient qubit reuse strategy Too many qubits Eager: Always reclaim will play a pivotal role in enabling the execution of programs 200 on resource-constrained machines. Lazy: Never reclaim SQUARE: Selectively reclaim 150 To precisely estimate the workload of a computational Qubits Too many gates task, we propose a resource metric called “active quantum 100 volume” (AQV) that evaluates the “liveness” of qubit during the lifetime of the program, which we will formally introduce 50 in Section III-B. This is inspired by the concept of “quantum 0 volume” introduced by IBM [19], a common measure for 0 200 000 400 000 600 000 800 000 1.0×106 1.2×106 1.4×106 the computational capability of a quantum hardware device, Time based on parameters such as number of qubits, number of Fig. 1: Qubit usage over time for Modular Exponentiation. The gates, and error probability. AQV is a metric that measures the shaded area under the curve corresponds to the active quantum volume of resource required by a program when executing on a volume of this application. The blue curve, representing a target hardware, which can therefore serve as an minimization balance between qubit reclamation and uncomputation, has the objective for the allocation and reclamation strategies. lowest area and is the best option. The contributions of our work are: II. BACKGROUND AND RELATED WORK • We present a heuristic-based compilation framework, A. Basics of Quantum Computing called SQUARE, for optimizing qubit allocation and reclamation in modular reversible programs. It leverages Quantum computers are devices that harness quantum me- the knowledge of qubit locality as well as program chanics to store and process information. For this paper, we modularity and parallelism. highlight three of the basic rules derived from the principles • We introduce a resource metric, active quantum volume of quantum mechanics: (AQV), that calculates the “liveness” of qubits over the • Superposition rule: A quantum bit (qubit) can be in a lifetime of a program. This new metric allows us to quan- quantum state of a linear combination of 0 and 1: j i = tify the effectiveness of various optimization strategies, as α j0i + β j1i, where α and β are complex amplitudes well as to characterize the volume of resources consumed satisfying jαj2 + jβj2 = 1. by different computational tasks. • Transformation rule: Computation on qubits is accom- • Our approach fits computations into resource-constrained plished by applying a unitary quantum logic gate that NISQ machines by strategically reusing qubits. Surpris- maps from one quantum state to another. This process is ingly, adding gates for uncomputation can improve the reversible and deterministic. fidelity of a program rather than impair it, as it creates • Measurement rule: Measurement or readout of a qubit ancilla with better locality, leading to substantially fewer j i = α j0i + β j1i collapses the quantum state to swap gates and thus less gate noise. SQUARE improves classical outcomes: j 0i = j0i with probability jαj2 and the success rate of NISQ applications by 1.47X on j 0i = j1i with probability jβj2. This is irreversible and average. probabilistic. • Our approach has broad applicability from NISQ to FT 1) Reversibility constraints.: The above three rules give machines. SQUARE achieves an average reduction of rise to the potential computing power that quantum computers 1.5X (and up to 9.6X) in active quantum volume for FT possess, but at the same time, they impose strict constraints systems. on what we can do in quantum computation. For example, the transformation rule implies that any quantum logic gate The rest of the paper is organized as follows: Section II we apply to a qubit has to be reversible. The classical AND briefly discusses the basics of quantum computation and com- gate in Figure 2 is not reversible because we cannot recover pilation of reversible arithmetic to quantum circuits, as well the two input bits based solely on one output bit. To make it as related work in both classical and quantum compilation. reversible, we could introduce a scratch bit, called ancilla, to Section III illustrate the central problem of allocation and store the result out-of-place, as in controlled-controlled-NOT reclamation of ancilla tackled in this paper and the general idea gate (or Toffoli gate) in Figure 2. Note that we use the of our solution. Section

Load more